pager: exit without error on SIGPIPE

Message ID	bc88492979fee215d5be06ccbc246ae0171a9ced.1611910122.git.liu.denton@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> From: Denton Liu <liu.denton@gmail.com> To: Git Mailing List <git@vger.kernel.org> Cc: Vincent Lefevre <vincent@vinc17.net> Subject: [PATCH] pager: exit without error on SIGPIPE Date: Fri, 29 Jan 2021 15:48:54 -0800 Message-Id: <bc88492979fee215d5be06ccbc246ae0171a9ced.1611910122.git.liu.denton@gmail.com> In-Reply-To: <YAG/vzctP4JwSp5x@zira.vinc17.org> References: <YAG/vzctP4JwSp5x@zira.vinc17.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	pager: exit without error on SIGPIPE \| expand pager: exit without error on SIGPIPE

Denton Liu Jan. 29, 2021, 11:48 p.m. UTC

If the pager closes before the git command feeding the pager finishes,
git is killed by a SIGPIPE and the corresponding exit code is 141.
Since the pipe is just an implementation detail, it does not make sense
for this error code to be user-facing.

Handle SIGPIPEs by simply calling exit(0) in wait_for_pager_signal().

Introduce `test-tool pager` which infinitely prints `y` to the pager in
order to test the new behavior. This cannot be tested with any existing
git command because there are no other commands which produce infinite
output. Without the change to pager.c, the newly introduced test fails.

Reported-by: Vincent Lefevre <vincent@vinc17.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
Sorry for the resend, it seems like vger has dropped the first patch.

 Makefile              |  1 +
 pager.c               |  2 ++
 t/helper/test-pager.c | 12 ++++++++++++
 t/helper/test-tool.c  |  1 +
 t/helper/test-tool.h  |  1 +
 t/t7006-pager.sh      |  4 ++++
 6 files changed, 21 insertions(+)
 create mode 100644 t/helper/test-pager.c

Johannes Sixt Jan. 30, 2021, 8:29 a.m. UTC | #1

Am 30.01.21 um 00:48 schrieb Denton Liu:
> If the pager closes before the git command feeding the pager finishes,
> git is killed by a SIGPIPE and the corresponding exit code is 141.
> Since the pipe is just an implementation detail, it does not make sense
> for this error code to be user-facing.
> 
> Handle SIGPIPEs by simply calling exit(0) in wait_for_pager_signal().
> 
> Introduce `test-tool pager` which infinitely prints `y` to the pager in
> order to test the new behavior. This cannot be tested with any existing
> git command because there are no other commands which produce infinite
> output. Without the change to pager.c, the newly introduced test fails.
> 
> Reported-by: Vincent Lefevre <vincent@vinc17.net>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>

...

> diff --git a/pager.c b/pager.c
> index ee435de675..5922d99dc8 100644
> --- a/pager.c
> +++ b/pager.c
> @@ -34,6 +34,8 @@ static void wait_for_pager_atexit(void)
>  static void wait_for_pager_signal(int signo)
>  {
>  	wait_for_pager(1);
> +	if (signo == SIGPIPE)
> +		exit(0);
>  	sigchain_pop(signo);
>  	raise(signo);
>  }
> diff --git a/t/helper/test-pager.c b/t/helper/test-pager.c
> new file mode 100644
> index 0000000000..feb68b8643
> --- /dev/null
> +++ b/t/helper/test-pager.c
> @@ -0,0 +1,12 @@
> +#include "test-tool.h"
> +#include "cache.h"
> +
> +int cmd__pager(int argc, const char **argv)
> +{
> +	if (argc > 1)
> +		usage("\ttest-tool pager");
> +
> +	setup_pager();
> +	for (;;)
> +		puts("y");
> +}

My gut feeling tells that this will end in an infinite loop on Windows.
There are no signals on Windows that would kill the upstream of a pipe.
This call site will only notice that the downstream of the pipe was
closed, when it checks for write errors.

Let me test it.

-- Hannes

Johannes Sixt Jan. 30, 2021, 12:52 p.m. UTC | #2

Am 30.01.21 um 09:29 schrieb Johannes Sixt:
> Am 30.01.21 um 00:48 schrieb Denton Liu:
>> +++ b/t/helper/test-pager.c
>> @@ -0,0 +1,12 @@
>> +#include "test-tool.h"
>> +#include "cache.h"
>> +
>> +int cmd__pager(int argc, const char **argv)
>> +{
>> +	if (argc > 1)
>> +		usage("\ttest-tool pager");
>> +
>> +	setup_pager();
>> +	for (;;)
>> +		puts("y");
>> +}
> 
> My gut feeling tells that this will end in an infinite loop on Windows.
> There are no signals on Windows that would kill the upstream of a pipe.
> This call site will only notice that the downstream of the pipe was
> closed, when it checks for write errors.
> 
> Let me test it.

The test case is protected by a TTY prerequisite; that is not satisfied
on Windows, and the test is skipped. No harm done so far.

But when I run `test-tool pager` manually and quit out of the pager, the
tool does spin in the endless loop. The following fixup helps.


diff --git a/t/helper/test-pager.c b/t/helper/test-pager.c
index feb68b8643..5f1982411f 100644
--- a/t/helper/test-pager.c
+++ b/t/helper/test-pager.c
@@ -7,6 +7,8 @@ int cmd__pager(int argc, const char **argv)
 		usage("\ttest-tool pager");
 
 	setup_pager();
-	for (;;)
-		puts("y");
+	while (write_in_full(1, "y\n", 2) > 0)
+		;
+
+	return 0;
 }

Ævar Arnfjörð Bjarmason Feb. 1, 2021, 3:03 p.m. UTC | #3

On Sat, Jan 30 2021, Denton Liu wrote:

> [...]

The thread at large has enough about whether this approach even makes
sense. I won't repeat that here. Just small notes on the patch itself:

> diff --git a/Makefile b/Makefile
> index 4edfda3e00..38a1a20f31 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -719,6 +719,7 @@ TEST_BUILTINS_OBJS += test-mktemp.o
>  TEST_BUILTINS_OBJS += test-oid-array.o
>  TEST_BUILTINS_OBJS += test-oidmap.o
>  TEST_BUILTINS_OBJS += test-online-cpus.o
> +TEST_BUILTINS_OBJS += test-pager.o
>  TEST_BUILTINS_OBJS += test-parse-options.o
>  TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
>  TEST_BUILTINS_OBJS += test-path-utils.o
> diff --git a/pager.c b/pager.c
> index ee435de675..5922d99dc8 100644
> --- a/pager.c
> +++ b/pager.c
> @@ -34,6 +34,8 @@ static void wait_for_pager_atexit(void)
>  static void wait_for_pager_signal(int signo)
>  {
>  	wait_for_pager(1);
> +	if (signo == SIGPIPE)
> +		exit(0);

As shown in
https://lore.kernel.org/git/20210201144921.8664-1-avarab@gmail.com/ this
leaves us without guard rails where the pager dies/segfaults or
whatever.

That's an existing bug, but by not carrying the SIGPIPE forward it
changes from "most of the time we'd exit with SIGPIPE anyway" to "we'll
never notice".

> [...]
> +test_expect_success TTY 'SIGPIPE from pager returns success' '
> +	test_terminal env PAGER=true test-tool pager
> +'
> +
>  test_done

As noted in
https://lore.kernel.org/git/20210201144921.8664-1-avarab@gmail.com/ I
think this whole "test-tool pager" isn't needed. We can just use git
itself with some trickery.

Junio C Hamano Feb. 1, 2021, 5:47 p.m. UTC | #4

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> diff --git a/pager.c b/pager.c
>> index ee435de675..5922d99dc8 100644
>> --- a/pager.c
>> +++ b/pager.c
>> @@ -34,6 +34,8 @@ static void wait_for_pager_atexit(void)
>>  static void wait_for_pager_signal(int signo)
>>  {
>>  	wait_for_pager(1);
>> +	if (signo == SIGPIPE)
>> +		exit(0);
>
> As shown in
> https://lore.kernel.org/git/20210201144921.8664-1-avarab@gmail.com/ this
> leaves us without guard rails where the pager dies/segfaults or
> whatever.
>
> That's an existing bug, but by not carrying the SIGPIPE forward it
> changes from "most of the time we'd exit with SIGPIPE anyway" to "we'll
> never notice".

Would it be the matter of propagating the exit status of the pager
noticed by wait_or_white() down thru finish_command_in_signal() and
wait_for_pager(1) to here, so

 - If we know pager exited with non-zero status, we would report,
   perhaps with warning(_("..."));

 - If we notice we got a SIGPIPE, we ignore it---it is nothing of
   interest to the end-user;

 - Otherwise we do not do anything differently.

would be sufficient?  Implementors of "git -p" may know that "git"
happens to implement its paging by piping its output to an external
pager, but the end-users do not care.  Implementors may say they are
giving 'q' to their pager "less", but to the end-users, who report
"I ran 'git log' and after reading a pageful, I told it to 'q'uit",
the distinction does not have any importance.

Or are there more to it, in that the exit status we get from the
pager, combined with the kind of signal we are getting, is not
sufficient for us to tell what is going on?

Ævar Arnfjörð Bjarmason Feb. 1, 2021, 7:52 p.m. UTC | #5

On Mon, Feb 01 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> diff --git a/pager.c b/pager.c
>>> index ee435de675..5922d99dc8 100644
>>> --- a/pager.c
>>> +++ b/pager.c
>>> @@ -34,6 +34,8 @@ static void wait_for_pager_atexit(void)
>>>  static void wait_for_pager_signal(int signo)
>>>  {
>>>  	wait_for_pager(1);
>>> +	if (signo == SIGPIPE)
>>> +		exit(0);
>>
>> As shown in
>> https://lore.kernel.org/git/20210201144921.8664-1-avarab@gmail.com/ this
>> leaves us without guard rails where the pager dies/segfaults or
>> whatever.
>>
>> That's an existing bug, but by not carrying the SIGPIPE forward it
>> changes from "most of the time we'd exit with SIGPIPE anyway" to "we'll
>> never notice".
>
> Would it be the matter of propagating the exit status of the pager
> noticed by wait_or_white() down thru finish_command_in_signal() and
> wait_for_pager(1) to here, so
>
>  - If we know pager exited with non-zero status, we would report,
>    perhaps with warning(_("..."));
>
>  - If we notice we got a SIGPIPE, we ignore it---it is nothing of
>    interest to the end-user;
>
>  - Otherwise we do not do anything differently.
>
> would be sufficient?  Implementors of "git -p" may know that "git"
> happens to implement its paging by piping its output to an external
> pager, but the end-users do not care.  Implementors may say they are
> giving 'q' to their pager "less", but to the end-users, who report
> "I ran 'git log' and after reading a pageful, I told it to 'q'uit",
> the distinction does not have any importance.
>
> Or are there more to it, in that the exit status we get from the
> pager, combined with the kind of signal we are getting, is not
> sufficient for us to tell what is going on?

It is, I just wonder if ignoring the exit code is a practical issue as
long as we're not clobbering SIGPIPE, particularly with my trace2
logging patch in this thread.

But yeah, we could patch git to handle this in the general case. I think
it's probably a bit of a PITA to do, since for the general case we need
to munge the exit code in an atexit() handler.

Which means calling _exit() (if that's even portable), and presumably
changing from the atexit() API to our own registry of how many times we
called atexit(), which would introduce logic bugs if we ever use a
library that wants to have atexit(). I.e. if we _exit() before its
atexit() handler runs because we wanted to munge the exit code.

Junio C Hamano Feb. 1, 2021, 8:55 p.m. UTC | #6

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Would it be the matter of propagating the exit status of the pager
>> noticed by wait_or_white() down thru finish_command_in_signal() and
>> wait_for_pager(1) to here, so
>>
>>  - If we know pager exited with non-zero status, we would report,
>>    perhaps with warning(_("..."));
>>
>>  - If we notice we got a SIGPIPE, we ignore it---it is nothing of
>>    interest to the end-user;
>>
>>  - Otherwise we do not do anything differently.
>>
>> would be sufficient?  Implementors of "git -p" may know that "git"
>> happens to implement its paging by piping its output to an external
>> pager, but the end-users do not care.  Implementors may say they are
>> giving 'q' to their pager "less", but to the end-users, who report
>> "I ran 'git log' and after reading a pageful, I told it to 'q'uit",
>> the distinction does not have any importance.
>>
>> Or are there more to it, in that the exit status we get from the
>> pager, combined with the kind of signal we are getting, is not
>> sufficient for us to tell what is going on?
>
> It is, I just wonder if ignoring the exit code is a practical issue as
> long as we're not clobbering SIGPIPE, particularly with my trace2
> logging patch in this thread.
>
> But yeah, we could patch git to handle this in the general case....

Sorry, but now you lost me.

I was merely wondering if Denton's patch can become a small update
on top of these, if we just made sure that the exit code of the
pager noticed by wait_or_whine() is reported to the code where
Denton makes the decision to say "let's not re-raise but simply exit
with 0 return as what we got is SIGPIPE".  I guess we could even
make git exit with the pager's return code in that case, as the
end-user observable result would be similar to "git log | less"
where 'less' may be segfaulting or exiting cleanly.

IOW, something like this on top of your three-patch series?

 pager.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git c/pager.c w/pager.c
index 3d37dd7ada..73bc5fc0e4 100644
--- c/pager.c
+++ w/pager.c
@@ -28,8 +28,14 @@ static void wait_for_pager_atexit(void)
 
 static void wait_for_pager_signal(int signo)
 {
+	int status;
+
 	close_pager_fds();
-	finish_command_in_signal(&pager_process);
+	status = finish_command_in_signal(&pager_process);
+
+	if (signo == SIGPIPE)
+		exit(status);
+
 	sigchain_pop(signo);
 	raise(signo);
 }

Ævar Arnfjörð Bjarmason Feb. 2, 2021, 2:05 a.m. UTC | #7

On Mon, Feb 01 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> Would it be the matter of propagating the exit status of the pager
>>> noticed by wait_or_white() down thru finish_command_in_signal() and
>>> wait_for_pager(1) to here, so
>>>
>>>  - If we know pager exited with non-zero status, we would report,
>>>    perhaps with warning(_("..."));
>>>
>>>  - If we notice we got a SIGPIPE, we ignore it---it is nothing of
>>>    interest to the end-user;
>>>
>>>  - Otherwise we do not do anything differently.
>>>
>>> would be sufficient?  Implementors of "git -p" may know that "git"
>>> happens to implement its paging by piping its output to an external
>>> pager, but the end-users do not care.  Implementors may say they are
>>> giving 'q' to their pager "less", but to the end-users, who report
>>> "I ran 'git log' and after reading a pageful, I told it to 'q'uit",
>>> the distinction does not have any importance.
>>>
>>> Or are there more to it, in that the exit status we get from the
>>> pager, combined with the kind of signal we are getting, is not
>>> sufficient for us to tell what is going on?
>>
>> It is, I just wonder if ignoring the exit code is a practical issue as
>> long as we're not clobbering SIGPIPE, particularly with my trace2
>> logging patch in this thread.
>>
>> But yeah, we could patch git to handle this in the general case....
>
> Sorry, but now you lost me.
>
> I was merely wondering if Denton's patch can become a small update
> on top of these, if we just made sure that the exit code of the
> pager noticed by wait_or_whine() is reported to the code where
> Denton makes the decision to say "let's not re-raise but simply exit
> with 0 return as what we got is SIGPIPE".  I guess we could even
> make git exit with the pager's return code in that case, as the
> end-user observable result would be similar to "git log | less"
> where 'less' may be segfaulting or exiting cleanly.
>
> IOW, something like this on top of your three-patch series?
>
>  pager.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git c/pager.c w/pager.c
> index 3d37dd7ada..73bc5fc0e4 100644
> --- c/pager.c
> +++ w/pager.c
> @@ -28,8 +28,14 @@ static void wait_for_pager_atexit(void)
>  
>  static void wait_for_pager_signal(int signo)
>  {
> +	int status;
> +
>  	close_pager_fds();
> -	finish_command_in_signal(&pager_process);
> +	status = finish_command_in_signal(&pager_process);
> +
> +	if (signo == SIGPIPE)
> +		exit(status);
> +
>  	sigchain_pop(signo);
>  	raise(signo);
>  }

I sent a WIP start at something like this at the end of my v2, please
discard it when picking up the rest:
https://lore.kernel.org/git/20210202020001.31601-6-avarab@gmail.com/

As noted there it's going to be a lot more complex than this.

Junio C Hamano Feb. 2, 2021, 4:45 a.m. UTC | #8

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>> But yeah, we could patch git to handle this in the general case....
>>
>> Sorry, but now you lost me.
>>
>> I was merely wondering if Denton's patch can become a small update
>> on top of these, if we just made sure that the exit code of the
>> pager noticed by wait_or_whine() is reported to the code where
>> Denton makes the decision to say "let's not re-raise but simply exit
>> with 0 return as what we got is SIGPIPE".  I guess we could even
>> make git exit with the pager's return code in that case, as the
>> end-user observable result would be similar to "git log | less"
>> where 'less' may be segfaulting or exiting cleanly.
>>
>> IOW, something like this on top of your three-patch series?
>>
>>  pager.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>> ...
> I sent a WIP start at something like this at the end of my v2, please
> discard it when picking up the rest:
> https://lore.kernel.org/git/20210202020001.31601-6-avarab@gmail.com/
>
> As noted there it's going to be a lot more complex than this.

Sorry, but you still have lost me---I do not see if/why we even care
about atexit codepath.  As far as the end users are concered, they
are running "git" and observing the exit code from "git".  There,
reporting that "git" was killed by SIGPIPE, instead of exiting
normally, is not something they want to hear about after quitting
their pager, and that is why the signal reception codepath matters.

Yes, I can see you are making it "a lot more complex" in your patch,
but what I do not see is why we even need to.

Thanks.

Junio C Hamano Feb. 2, 2021, 5:25 a.m. UTC | #9

Junio C Hamano <gitster@pobox.com> writes:

> Sorry, but you still have lost me---I do not see if/why we even care
> about atexit codepath.  As far as the end users are concered, they
> are running "git" and observing the exit code from "git".  There,
> reporting that "git" was killed by SIGPIPE, instead of exiting
> normally, is not something they want to hear about after quitting
> their pager, and that is why the signal reception codepath matters.

(something I noticed that I left unsaid...)

On the other hand, "git" spawns not just pager but other
subprocesses (e.g. "hooks"), and it is entirely up to us what to do
with the exit code from them.  When we care about making an external
effect (e.g. post-$action hooks that are run for their side effects),
we can ignore their exit status just fine.

And I do not see why the "we waited before leaving, and noticed the
pager exited with non-zero status" that we could notice in the
atexit codepath has to be so special.  We _could_ (modulo the "exit
there is not portable" you noted) make our exit status reflect that,
but I do not think it is all that important a "failure" (as opposed
to, say, we tried to show a commit message but failed to recode it
into utf-8, or we tried to spawn the pager but failed to start a
process) to clobber _our_ exit status with pager's exit status.

So...

Johannes Sixt Feb. 2, 2021, 7:45 a.m. UTC | #10

Am 02.02.21 um 06:25 schrieb Junio C Hamano:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Sorry, but you still have lost me---I do not see if/why we even care
>> about atexit codepath.  As far as the end users are concered, they
>> are running "git" and observing the exit code from "git".  There,
>> reporting that "git" was killed by SIGPIPE, instead of exiting
>> normally, is not something they want to hear about after quitting
>> their pager, and that is why the signal reception codepath matters.
> 
> (something I noticed that I left unsaid...)
> 
> On the other hand, "git" spawns not just pager but other
> subprocesses (e.g. "hooks"), and it is entirely up to us what to do
> with the exit code from them.  When we care about making an external
> effect (e.g. post-$action hooks that are run for their side effects),
> we can ignore their exit status just fine.
> 
> And I do not see why the "we waited before leaving, and noticed the
> pager exited with non-zero status" that we could notice in the
> atexit codepath has to be so special.  We _could_ (modulo the "exit
> there is not portable" you noted) make our exit status reflect that,
> but I do not think it is all that important a "failure" (as opposed
> to, say, we tried to show a commit message but failed to recode it
> into utf-8, or we tried to spawn the pager but failed to start a
> process) to clobber _our_ exit status with pager's exit status.
> 
> So...

The pager is a special case of a sub-process spawned, as it really only
a courtesy for the user. Without the pager facility, the user would have
to use

    git log | less

In that situation, the exit code of the pager *does* override git's, and
it is also irrelevant for the user that git was killed by SIGPIPE and is
not worth a visible notice.

-- Hannes

Junio C Hamano Feb. 2, 2021, 8:13 p.m. UTC | #11

Johannes Sixt <j6t@kdbg.org> writes:

> Am 02.02.21 um 06:25 schrieb Junio C Hamano:
>> Junio C Hamano <gitster@pobox.com> writes:
>> 
>>> Sorry, but you still have lost me---I do not see if/why we even care
>>> about atexit codepath.  As far as the end users are concered, they
>>> are running "git" and observing the exit code from "git".  There,
>>> reporting that "git" was killed by SIGPIPE, instead of exiting
>>> normally, is not something they want to hear about after quitting
>>> their pager, and that is why the signal reception codepath matters.
>> 
>> (something I noticed that I left unsaid...)
>> 
>> On the other hand, "git" spawns not just pager but other
>> subprocesses (e.g. "hooks"), and it is entirely up to us what to do
>> with the exit code from them.  When we care about making an external
>> effect (e.g. post-$action hooks that are run for their side effects),
>> we can ignore their exit status just fine.
>> 
>> And I do not see why the "we waited before leaving, and noticed the
>> pager exited with non-zero status" that we could notice in the
>> atexit codepath has to be so special.  We _could_ (modulo the "exit
>> there is not portable" you noted) make our exit status reflect that,
>> but I do not think it is all that important a "failure" (as opposed
>> to, say, we tried to show a commit message but failed to recode it
>> into utf-8, or we tried to spawn the pager but failed to start a
>> process) to clobber _our_ exit status with pager's exit status.
>> 
>> So...
>
> The pager is a special case of a sub-process spawned, as it really only
> a courtesy for the user. Without the pager facility, the user would have
> to use
>
>     git log | less
>
> In that situation, the exit code of the pager *does* override git's, and
> it is also irrelevant for the user that git was killed by SIGPIPE and is
> not worth a visible notice.

All true, except that "GIT_PAGER=less git -p log" reports the exit
status of "git" and not "less" when the entire command finishes
(regardless of how it happens, like user typing 'q', output of log
is shorter than one page and "less" automatically exiting at the
end, etc.), unlike "git log | less", where the exit status of "git"
is hidden.  But from the end-user's point of view, I do think it
is not a good idea to report an abnormal exit of "git" with SIGPIPE;
it is an irrelevant implementation detail.

Anyway, my opinion in the message you are responding to was that the
exit status of the pager subprocess wait_for_pager_atexit() finds
does not matter, and there is no reason to overly complicate the
implementation like the comments in Ævar's [v2 5/5] implies, and it
is sufficient to just hide the fact in wait_for_pager_signal() that
we got SIGPIPE.  I am not sure if you are agreeing with me, or are
showing me where/why I was wrong.

Thanks.

Johannes Sixt Feb. 2, 2021, 10:15 p.m. UTC | #12

Am 02.02.21 um 21:13 schrieb Junio C Hamano:
> Johannes Sixt <j6t@kdbg.org> writes:
> 
>> Am 02.02.21 um 06:25 schrieb Junio C Hamano:
>>> Junio C Hamano <gitster@pobox.com> writes:
>>>
>>>> Sorry, but you still have lost me---I do not see if/why we even care
>>>> about atexit codepath.  As far as the end users are concered, they
>>>> are running "git" and observing the exit code from "git".  There,
>>>> reporting that "git" was killed by SIGPIPE, instead of exiting
>>>> normally, is not something they want to hear about after quitting
>>>> their pager, and that is why the signal reception codepath matters.
>>>
>>> (something I noticed that I left unsaid...)
>>>
>>> On the other hand, "git" spawns not just pager but other
>>> subprocesses (e.g. "hooks"), and it is entirely up to us what to do
>>> with the exit code from them.  When we care about making an external
>>> effect (e.g. post-$action hooks that are run for their side effects),
>>> we can ignore their exit status just fine.
>>>
>>> And I do not see why the "we waited before leaving, and noticed the
>>> pager exited with non-zero status" that we could notice in the
>>> atexit codepath has to be so special.  We _could_ (modulo the "exit
>>> there is not portable" you noted) make our exit status reflect that,
>>> but I do not think it is all that important a "failure" (as opposed
>>> to, say, we tried to show a commit message but failed to recode it
>>> into utf-8, or we tried to spawn the pager but failed to start a
>>> process) to clobber _our_ exit status with pager's exit status.
>>>
>>> So...
>>
>> The pager is a special case of a sub-process spawned, as it really only
>> a courtesy for the user. Without the pager facility, the user would have
>> to use
>>
>>     git log | less
>>
>> In that situation, the exit code of the pager *does* override git's, and
>> it is also irrelevant for the user that git was killed by SIGPIPE and is
>> not worth a visible notice.
> 
> All true, except that "GIT_PAGER=less git -p log" reports the exit
> status of "git" and not "less" when the entire command finishes
> (regardless of how it happens, like user typing 'q', output of log
> is shorter than one page and "less" automatically exiting at the
> end, etc.), unlike "git log | less", where the exit status of "git"
> is hidden.  But from the end-user's point of view, I do think it
> is not a good idea to report an abnormal exit of "git" with SIGPIPE;
> it is an irrelevant implementation detail.
> 
> Anyway, my opinion in the message you are responding to was that the
> exit status of the pager subprocess wait_for_pager_atexit() finds
> does not matter, and there is no reason to overly complicate the
> implementation like the comments in Ævar's [v2 5/5] implies, and it
> is sufficient to just hide the fact in wait_for_pager_signal() that
> we got SIGPIPE.  I am not sure if you are agreeing with me, or are
> showing me where/why I was wrong.

We are agreeing that the SIGPIPE should not be reported.

We may be disagreeing whether it is good or bad that git's exit code is
overridden by the pager's exit code, which Ævar wanted to implement,
IIUC. I think that is reasonable and I base my opinion on the comparison
with the pipeline `git log | less`, where git's exit code is ignored.

-- Hannes

Junio C Hamano Feb. 2, 2021, 10:21 p.m. UTC | #13

Johannes Sixt <j6t@kdbg.org> writes:

>> Anyway, my opinion in the message you are responding to was that the
>> exit status of the pager subprocess wait_for_pager_atexit() finds
>> does not matter, and there is no reason to overly complicate the
>> implementation like the comments in Ævar's [v2 5/5] implies, and it
>> is sufficient to just hide the fact in wait_for_pager_signal() that
>> we got SIGPIPE.  I am not sure if you are agreeing with me, or are
>> showing me where/why I was wrong.
>
> We are agreeing that the SIGPIPE should not be reported.
>
> We may be disagreeing whether it is good or bad that git's exit code is
> overridden by the pager's exit code, which Ævar wanted to implement,
> IIUC. I think that is reasonable and I base my opinion on the comparison
> with the pipeline `git log | less`, where git's exit code is ignored.

I guess we are then in agreement---I do think it makes sense to send
the true exit code from the pager as the exit code from the pager to
the trace output, which is what the early part of Ævar's patch does,
but I do not think the exit code of the pager should affect the exit
code from "git log" as a whole.

Ævar Arnfjörð Bjarmason Feb. 3, 2021, 2:45 a.m. UTC | #14

On Tue, Feb 02 2021, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Sorry, but you still have lost me---I do not see if/why we even care
>> about atexit codepath.  As far as the end users are concered, they
>> are running "git" and observing the exit code from "git".  There,
>> reporting that "git" was killed by SIGPIPE, instead of exiting
>> normally, is not something they want to hear about after quitting
>> their pager, and that is why the signal reception codepath matters.
>
> (something I noticed that I left unsaid...)
>
> On the other hand, "git" spawns not just pager but other
> subprocesses (e.g. "hooks"), and it is entirely up to us what to do
> with the exit code from them.  When we care about making an external
> effect (e.g. post-$action hooks that are run for their side effects),
> we can ignore their exit status just fine.
>
> And I do not see why the "we waited before leaving, and noticed the
> pager exited with non-zero status" that we could notice in the
> atexit codepath has to be so special.  We _could_ (modulo the "exit
> there is not portable" you noted) make our exit status reflect that,
> but I do not think it is all that important a "failure" (as opposed
> to, say, we tried to show a commit message but failed to recode it
> into utf-8, or we tried to spawn the pager but failed to start a
> process) to clobber _our_ exit status with pager's exit status.

Because your patch upthread makes git's exit code on pager failure a
function of how your PIPE_BUF happens to be consumed on your OS.

You can see this if you do this:

    diff --git a/pager.c b/pager.c
    index ee435de6756..5124124ac36 100644
    --- a/pager.c
    +++ b/pager.c
    @@ -30,0 +31 @@ static void wait_for_pager_atexit(void)
    +       trace2_region_enter_printf("pager", "wait_for_pager_atexit", the_repository, "%d", 0);
    @@ -35,0 +37 @@ static void wait_for_pager_signal(int signo)
    +       trace2_region_enter_printf("pager", "wait_for_pager_signal", the_repository, "%d", 1);

And then e.g.:

    GIT_TRACE2_EVENT=/tmp/trace2.json ~/g/git/git -c core.pager="less; exit 1" log -100; echo $?

On my laptop & screen size I get around ~20 page lenghts with less
before I get to the end.

If I quit on the first page I get an exit code of 141, ditto the second,
on everything from the 3rd forward I get an exit code of 0. Because at
that point git's/the OS/pipe buffers etc. have flushed the rest to the
pager.

So:

 1. With something like your patch we'd get an exit code of !0 on pager
    failure only if the user won the PIPE_BUF roulette.

 2. With finishing up my "WIP/PATCH v2 5/5" we'd get consistent exit
    codes carried down, but that patch is insanity already, and
    finishing it would be even crazier.

So I haven't been advocating for #2, just the #0 of "I don't really see
the problem with the current behavior of SIGHUP, maybe configure your
shell?".

B.t.w. to <5772995f-c887-7f13-6b5f-dc44f4477dcb@kdbg.org> in the
side-thread: Having a smarter pager than just less isn't really all that
unusual, e.g. it's very handy on a remote system to type commands
interactively but sloooowly, but then configure a pager with some
combination of an ssh tunnel + nc + remote system's screen so you can
browse around without every search/page up/down taking 1-2 seconds.

It's also nice when a thing like that can quit as fast as possible when
it gets SIGHUP, not wait on the exit code of the spawned program, which
may involve tearing down an ssh session or whatever.

But I digress.

Getting back to the point, whatever anyone thinks of returning SIGHUP as
we do now or not, I think it's bonkers to ignore SIGHUP and *then*
return a non-zero *only in the non-atexit case*.

That just means that if you do have a broken pager you're going to get
flaky exits depending on the state of our flushed buffers, who's going
to be helped by such a fickle exit code?

So if we're going to change the behavior to not return SIGHUP, and don't
want to refactor our entire atexit() handling in #2 to be guaranteed to
pass down the pager's exit code, I don't see how anything except the
approach of just exit(0) in that case makes sense, which is what Denton
Liu's patch initially suggested doing.

Junio C Hamano Feb. 3, 2021, 2:54 a.m. UTC | #15

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Getting back to the point, whatever anyone thinks of returning SIGHUP as
> we do now or not, I think it's bonkers to ignore SIGHUP and *then*
> return a non-zero *only in the non-atexit case*.
>
> That just means that if you do have a broken pager you're going to get
> flaky exits depending on the state of our flushed buffers, who's going
> to be helped by such a fickle exit code?
>
> So if we're going to change the behavior to not return SIGHUP, and don't
> want to refactor our entire atexit() handling in #2 to be guaranteed to
> pass down the pager's exit code, I don't see how anything except the
> approach of just exit(0) in that case makes sense, which is what Denton
> Liu's patch initially suggested doing.

Then we are on the same page (assuming that all your HUPs are
PIPEs), as the "perhaps we could even exit with pager's error" was
my mistaken reaction to your "we have been losing pager's exit
status" message.

I do want to ignore, as I said in the message you are responding to,
the exit status of the pager, just like we ignore exit status of
some of the hooks that we spawn primarily for their side effects (as
opposed to the pre-* hooks whose status we do use to base our
decision on).

I guess we just want to take just a half of your [WIP/PATCH v2 5/5],
ignoring the return values from finish_command*() and exiting with 0
when we got SIGPIPE (that would mean that there will be no change on
the atexit codepath).  Unlike Denton's change directly on the current
codebase, the resulting code would clearly show that we only care about
the signal codepath, thanks to the refactoring [PATCH v2 1/5] has
done.

Ævar Arnfjörð Bjarmason Feb. 3, 2021, 3:36 a.m. UTC | #16

On Wed, Feb 03 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> Getting back to the point, whatever anyone thinks of returning SIGHUP as
>> we do now or not, I think it's bonkers to ignore SIGHUP and *then*
>> return a non-zero *only in the non-atexit case*.
>>
>> That just means that if you do have a broken pager you're going to get
>> flaky exits depending on the state of our flushed buffers, who's going
>> to be helped by such a fickle exit code?
>>
>> So if we're going to change the behavior to not return SIGHUP, and don't
>> want to refactor our entire atexit() handling in #2 to be guaranteed to
>> pass down the pager's exit code, I don't see how anything except the
>> approach of just exit(0) in that case makes sense, which is what Denton
>> Liu's patch initially suggested doing.
>
> Then we are on the same page (assuming that all your HUPs are
> PIPEs)

Yes, sorry, PBCAK :)

Johannes Sixt Feb. 3, 2021, 5:07 p.m. UTC | #17

Am 02.02.21 um 23:21 schrieb Junio C Hamano:
> Johannes Sixt <j6t@kdbg.org> writes:
> 
>>> Anyway, my opinion in the message you are responding to was that the
>>> exit status of the pager subprocess wait_for_pager_atexit() finds
>>> does not matter, and there is no reason to overly complicate the
>>> implementation like the comments in Ævar's [v2 5/5] implies, and it
>>> is sufficient to just hide the fact in wait_for_pager_signal() that
>>> we got SIGPIPE.  I am not sure if you are agreeing with me, or are
>>> showing me where/why I was wrong.
>>
>> We are agreeing that the SIGPIPE should not be reported.
>>
>> We may be disagreeing whether it is good or bad that git's exit code is
>> overridden by the pager's exit code, which Ævar wanted to implement,
>> IIUC. I think that is reasonable and I base my opinion on the comparison
>> with the pipeline `git log | less`, where git's exit code is ignored.
> 
> I guess we are then in agreement---I do think it makes sense to send
> the true exit code from the pager as the exit code from the pager to
> the trace output, which is what the early part of Ævar's patch does,
> but I do not think the exit code of the pager should affect the exit
> code from "git log" as a whole.

Then we do not agree. The exit code of `git log | less` is ignored, and
I regard `git -p log` just as a short-hand for that.

That said, I don't care a lot about the exit code. When a pager is in
the game, we are talking about an interactive command invocation, and
what the exit code of that is, is irrelevant in practice.

The only thing I care is that git does not die due to a SIGPIPE when the
pager is closed early.

-- Hannes

Johannes Sixt Feb. 3, 2021, 5:19 p.m. UTC | #18

Am 03.02.21 um 03:54 schrieb Junio C Hamano:
> I guess we just want to take just a half of your [WIP/PATCH v2 5/5],
> ignoring the return values from finish_command*() and exiting with 0
> when we got SIGPIPE (that would mean that there will be no change on
> the atexit codepath).  Unlike Denton's change directly on the current
> codebase, the resulting code would clearly show that we only care about
> the signal codepath, thanks to the refactoring [PATCH v2 1/5] has
> done.

Note though, that we cannot call exit() from a signal handler: it is not
async-signal safe.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

-- Hannes

Junio C Hamano Feb. 3, 2021, 6:12 p.m. UTC | #19

Johannes Sixt <j6t@kdbg.org> writes:

>> I guess we are then in agreement---I do think it makes sense to send
>> the true exit code from the pager as the exit code from the pager to
>> the trace output, which is what the early part of Ævar's patch does,
>> but I do not think the exit code of the pager should affect the exit
>> code from "git log" as a whole.
>
> Then we do not agree. The exit code of `git log | less` is ignored, and
> I regard `git -p log` just as a short-hand for that.

I think you skipped "not" while reading the "but I do not think"
part of the last sentence.

> The only thing I care is that git does not die due to a SIGPIPE when the
> pager is closed early.

Makes two of us ;-)

Vincent Lefevre Feb. 4, 2021, 3:10 p.m. UTC | #20

On 2021-02-03 18:07:02 +0100, Johannes Sixt wrote:
> Then we do not agree. The exit code of `git log | less` is ignored,
> and I regard `git -p log` just as a short-hand for that.

If git didn't have the -p feature, "git log | less" would just be
one way to get a pager. For instance, I use an alias that does

  pager-wrapper grep --color=always --line-buffered -E

which sends the grep output to a pager, and returns the exit status
of the pager if non-zero, otherwise the exit status of grep, except
when this is SIGPIPE. An unavoidable issue is that if there has been
an error in grep, I could still get the exit status 0. But as a user,
this is a choice I have done by quitting the pager before letting
grep terminate in the normal way (which could have been costly) so
that it could report the error, say, about unreadable files with a
recursive grep (grep -r).

So Git could do the same thing, and even better, because it controls
its own exit status: if there has been an error in the generation of
the Git output (for instance, I can see that there is a --check option
of "git log" that can trigger an error), then this error should be
reported (with a non-zero exit status) after the pager is quit, as if
a pager were not used. Otherwise, terminate with the exit status of
the pager.

pager: exit without error on SIGPIPE

Commit Message

Comments

Patch