rebase -i: ignore signals when forking subprocesses

Message ID	pull.1581.git.1694080982621.gitgitgadget@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@vger.kernel.org> Message-ID: <pull.1581.git.1694080982621.gitgitgadget@gmail.com> Date: Thu, 07 Sep 2023 10:03:02 +0000 Subject: [PATCH] rebase -i: ignore signals when forking subprocesses Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, Jeff King <peff@peff.net>, Phillip Wood <phillip.wood@dunelm.org.uk>, Phillip Wood <phillip.wood@dunelm.org.uk> Precedence: bulk From: Phillip Wood <phillip.wood@dunelm.org.uk>
Series	rebase -i: ignore signals when forking subprocesses \| expand rebase -i: ignore signals when forking subprocesses

Phillip Wood Sept. 7, 2023, 10:03 a.m. UTC

From: Phillip Wood <phillip.wood@dunelm.org.uk>

If the user presses Ctrl-C to interrupt a program run by a rebase "exec"
command then SIGINT will also be sent to the git process running the
rebase resulting in it being killed. Fortunately the consequences of
this are not severe as all the state necessary to continue the rebase is
saved to disc but it would be better to avoid killing git and instead
report that the command failed. A similar situation occurs when the
sequencer runs "git commit" or "git merge". If the user generates SIGINT
while editing the commit message then the git processes creating the
commit will ignore it but the git process running the rebase will be
killed.

Fix this by ignoring SIGINT and SIGQUIT when forking "exec" commands,
"git commit" and "git merge". This matches what git already does when
running the user's editor and matches the behavior of the standard
library's system() function.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    rebase -i: ignore signals when forking subprocesses
    
    Having written this I started thinking about what happens when we fork
    hooks, merge strategies and merge drivers. I now wonder if it would be
    better to change run_command() instead - are there any cases where we
    actually want git to be killed when the user interrupts a child process?

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1581%2Fphillipwood%2Fsequencer-subprocesses-ignore-sigint-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1581/phillipwood/sequencer-subprocesses-ignore-sigint-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1581

 sequencer.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)


base-commit: 1fc548b2d6a3596f3e1c1f8b1930d8dbd1e30bf3

Johannes Schindelin Sept. 7, 2023, 12:57 p.m. UTC | #1

Hi Phillip,

On Thu, 7 Sep 2023, Phillip Wood via GitGitGadget wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> If the user presses Ctrl-C to interrupt a program run by a rebase "exec"
> command then SIGINT will also be sent to the git process running the
> rebase resulting in it being killed. Fortunately the consequences of
> this are not severe as all the state necessary to continue the rebase is
> saved to disc but it would be better to avoid killing git and instead
> report that the command failed. A similar situation occurs when the
> sequencer runs "git commit" or "git merge". If the user generates SIGINT
> while editing the commit message then the git processes creating the
> commit will ignore it but the git process running the rebase will be
> killed.
>
> Fix this by ignoring SIGINT and SIGQUIT when forking "exec" commands,
> "git commit" and "git merge". This matches what git already does when
> running the user's editor and matches the behavior of the standard
> library's system() function.

ACK

>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>     rebase -i: ignore signals when forking subprocesses
>
>     Having written this I started thinking about what happens when we fork
>     hooks, merge strategies and merge drivers. I now wonder if it would be
>     better to change run_command() instead - are there any cases where we
>     actually want git to be killed when the user interrupts a child process?

I am not sure that we can rely on arbitrary hooks to do the right thing
upon Ctrl+C, which is to wrap up and leave. So I _guess_ that we will have
to leave it an opt-in.

However, we could easily make it an option that `run_command()` handles,
much like `no_stdin`.

Ciao,
Johannes

>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1581%2Fphillipwood%2Fsequencer-subprocesses-ignore-sigint-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1581/phillipwood/sequencer-subprocesses-ignore-sigint-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1581
>
>  sequencer.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index a66dcf8ab26..26d70f68454 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1059,6 +1059,7 @@ static int run_git_commit(const char *defmsg,
>  			  unsigned int flags)
>  {
>  	struct child_process cmd = CHILD_PROCESS_INIT;
> +	int res;
>
>  	if ((flags & CLEANUP_MSG) && (flags & VERBATIM_MSG))
>  		BUG("CLEANUP_MSG and VERBATIM_MSG are mutually exclusive");
> @@ -1116,10 +1117,16 @@ static int run_git_commit(const char *defmsg,
>  	if (!(flags & EDIT_MSG))
>  		strvec_push(&cmd.args, "--allow-empty-message");
>
> +	sigchain_push(SIGINT, SIG_IGN);
> +	sigchain_push(SIGQUIT, SIG_IGN);
>  	if (is_rebase_i(opts) && !(flags & EDIT_MSG))
> -		return run_command_silent_on_success(&cmd);
> +		res = run_command_silent_on_success(&cmd);
>  	else
> -		return run_command(&cmd);
> +		res = run_command(&cmd);
> +	sigchain_pop(SIGINT);
> +	sigchain_pop(SIGQUIT);
> +
> +	return res;
>  }
>
>  static int rest_is_empty(const struct strbuf *sb, int start)
> @@ -3628,10 +3635,14 @@ static int do_exec(struct repository *r, const char *command_line)
>  	struct child_process cmd = CHILD_PROCESS_INIT;
>  	int dirty, status;
>
> +	sigchain_push(SIGINT, SIG_IGN);
> +	sigchain_push(SIGQUIT, SIG_IGN);
>  	fprintf(stderr, _("Executing: %s\n"), command_line);
>  	cmd.use_shell = 1;
>  	strvec_push(&cmd.args, command_line);
>  	status = run_command(&cmd);
> +	sigchain_pop(SIGINT);
> +	sigchain_pop(SIGQUIT);
>
>  	/* force re-reading of the cache */
>  	discard_index(r->index);
> @@ -4111,7 +4122,11 @@ static int do_merge(struct repository *r,
>  				NULL, 0);
>  		rollback_lock_file(&lock);
>
> +		sigchain_push(SIGINT, SIG_IGN);
> +		sigchain_push(SIGQUIT, SIG_IGN);
>  		ret = run_command(&cmd);
> +		sigchain_pop(SIGINT);
> +		sigchain_pop(SIGQUIT);
>
>  		/* force re-reading of the cache */
>  		if (!ret) {
>
> base-commit: 1fc548b2d6a3596f3e1c1f8b1930d8dbd1e30bf3
> --
> gitgitgadget
>

Jeff King Sept. 7, 2023, 9:06 p.m. UTC | #2

On Thu, Sep 07, 2023 at 10:03:02AM +0000, Phillip Wood via GitGitGadget wrote:

>     rebase -i: ignore signals when forking subprocesses
>     
>     Having written this I started thinking about what happens when we fork
>     hooks, merge strategies and merge drivers. I now wonder if it would be
>     better to change run_command() instead - are there any cases where we
>     actually want git to be killed when the user interrupts a child process?

I think that would be quite surprising in most cases, as the user may
not be aware when or if a sub-program is in use.

Imagine that you're running a script which runs git-commit in a loop,
which in turn runs "gc --auto", which in turn runs a pre-auto-gc hook.
You want to stop it, so you hit ^C, which sends SIGINT to all of the
processes.

I think most people would expect the whole process chain to stop
immediately. But in your proposal, we'd kill the hook only. That kind-of
propagates up to "gc --auto" (which says "OK, the hook says don't run").
And then that doesn't really propagate to git-commit, which ignores the
exit code of gc and continues on, running the post-commit hook and so
on. And then outer loop of the script continues, invoking the next
commit, and so on. To actually quit you have to hit ^C several times
with the right timing (the exact number of which is unknown to you, as
it depends on the depth of the process chain).

I think this really comes down to: does the user perceive the child
process as the current "main" process running in the foreground? For
most run-command invocations, I would say no. For something like running
an editor, the answer is more clearly yes.

For something like sequencer "exec" commands, I think it really depends
on what the command is doing. If it is "git commit --amend", then that
is going to open an editor and take over the terminal. If it is "make",
then probably not. But it may be OK to do here, just because we know
that a signal exit from the child will be immediately read and
propagated by the sequencer machinery (assuming the child dies; if it
blocks SIGINT, too, then now you cannot interrupt it at all!).

In the classic Unix world, I think the solution here is setsid(),
process groups, and controlling terminals. One example in your commit
message is the sequencer kicking off git-commit, which kicks off an
editor. The user hits ^C then, and the sequencer is killed. But I think
your patch is papering over the deeper bug. In 913ef36093
(launch_editor: ignore terminal signals while editor has control,
2012-11-30), we did this same "ignore INT" trick. But I think the right
solution is actually to start the editor in its own process group, and
let it be the foreground of the terminal. And then a ^C while the editor
is running would not only not hit git-commit, but it would not hit any
sequencer or other intermediate processes above it.

I've never done it before, but from my reading we basically want to do
(in the forked process before we exec):

  setsid();
  open("/dev/tty");

But of course none of that probably has any meaning on Windows. I'm not
sure if there are analogous concepts there we could access with
alternate code, or if it would need to be a totally different strategy.

-Peff

Junio C Hamano Sept. 7, 2023, 9:16 p.m. UTC | #3

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> If the user presses Ctrl-C to interrupt a program run by a rebase "exec"
> command then SIGINT will also be sent to the git process running the
> rebase resulting in it being killed. Fortunately the consequences of
> this are not severe as all the state necessary to continue the rebase is
> saved to disc but it would be better to avoid killing git and instead
> report that the command failed.

The above made me wonder if we can guarantee that the intention of
the end-user is always to kill only the external program and not
"git".  But with or without this change, "git" will stop making
progress after such a signal (in other words, it is not like killing
"exec sleep 20" will make "git rebase -i -x 'sleep 20'" to move to
the next step without complaining), so "ignore signals" is not all
that harmful as the phrasing makes it sound like.  With the patch,
we just handle signals that will kill the external programs, and
their consequences, a bit more gracefully.

But that makes me wonder what happens if the external program has
good reasons to ignore the signal (that is, the program does not die
when signaled, without misbehaving).  If "git" dies in such a case,
would it help the overall end-user experience, or would it even
hurt?  If the latter, then "git" that ignores these interactive
interrupts would be improvement in both cases (i.e. external
programs that dies with signals, and those that ignores them).  If
the former, however, "git" that ignores the signals would be a
regression.

Other than that, the change is well reasoned, I would think.

Thanks.

Phillip Wood Sept. 8, 2023, 9:59 a.m. UTC | #4

Hi Peff

Thanks for your thoughts, I hoped I get a interesting response if I cc'd 
you and I wasn't disappointed!

On 07/09/2023 22:06, Jeff King wrote:
> On Thu, Sep 07, 2023 at 10:03:02AM +0000, Phillip Wood via GitGitGadget wrote:
> 
>>      rebase -i: ignore signals when forking subprocesses
>>      
>>      Having written this I started thinking about what happens when we fork
>>      hooks, merge strategies and merge drivers. I now wonder if it would be
>>      better to change run_command() instead - are there any cases where we
>>      actually want git to be killed when the user interrupts a child process?
> 
> I think that would be quite surprising in most cases, as the user may
> not be aware when or if a sub-program is in use.
> 
> Imagine that you're running a script which runs git-commit in a loop,
> which in turn runs "gc --auto", which in turn runs a pre-auto-gc hook.
> You want to stop it, so you hit ^C, which sends SIGINT to all of the
> processes.
> 
> I think most people would expect the whole process chain to stop
> immediately. But in your proposal, we'd kill the hook only. That kind-of
> propagates up to "gc --auto" (which says "OK, the hook says don't run").
> And then that doesn't really propagate to git-commit, which ignores the
> exit code of gc and continues on, running the post-commit hook and so
> on. And then outer loop of the script continues, invoking the next
> commit, and so on. To actually quit you have to hit ^C several times
> with the right timing (the exact number of which is unknown to you, as
> it depends on the depth of the process chain).

Ah, I hadn't thought about "gc --auto" I was assuming that the calling 
code would see the child had been killed and exit but that's not always 
the case.

> I think this really comes down to: does the user perceive the child
> process as the current "main" process running in the foreground? For
> most run-command invocations, I would say no. For something like running
> an editor, the answer is more clearly yes.
> 
> For something like sequencer "exec" commands, I think it really depends
> on what the command is doing. If it is "git commit --amend", then that
> is going to open an editor and take over the terminal. If it is "make",
> then probably not. But it may be OK to do here, just because we know
> that a signal exit from the child will be immediately read and
> propagated by the sequencer machinery (assuming the child dies; if it
> blocks SIGINT, too, then now you cannot interrupt it at all!).

The child not dying is tricky, if it is in the same process group as git 
then even if git dies the I think the shell will wait for the child to 
exit before showing the prompt again so it is not clear to me that the 
user is disadvantaged by git ignoring SIGINT in that case. Part of the 
motivation for this patch is that I'd like to move the sequencer to a 
model where it only saves its state to disk when it is stopping for the 
user to fix conflicts. To do that safely it cannot die if the user 
interprets a child with SIGINT.

> In the classic Unix world, I think the solution here is setsid(),
> process groups, and controlling terminals. One example in your commit
> message is the sequencer kicking off git-commit, which kicks off an
> editor. The user hits ^C then, and the sequencer is killed. But I think
> your patch is papering over the deeper bug. In 913ef36093
> (launch_editor: ignore terminal signals while editor has control,
> 2012-11-30), we did this same "ignore INT" trick.

Yes, that was the inspiration for this patch

> But I think the right
> solution is actually to start the editor in its own process group, and
> let it be the foreground of the terminal. And then a ^C while the editor
> is running would not only not hit git-commit, but it would not hit any
> sequencer or other intermediate processes above it.
> 
> I've never done it before, but from my reading we basically want to do
> (in the forked process before we exec):
> 
>    setsid();
>    open("/dev/tty");

Do we want a whole new session? As I understand it to launch a 
foreground job shells put the child in its own process group and then 
call tcsetpgrp() to change the foreground process group of the 
controlling terminal to that of the child. I agree that would be a 
better way of doing things on unix.

> But of course none of that probably has any meaning on Windows. I'm not
> sure if there are analogous concepts there we could access with
> alternate code, or if it would need to be a totally different strategy.

Lets see if Johannes has any comments about that.

Best Wishes

Phillip

Phillip Wood Sept. 8, 2023, 10:02 a.m. UTC | #5

Hi Johannes

On 07/09/2023 13:57, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Thu, 7 Sep 2023, Phillip Wood via GitGitGadget wrote:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> If the user presses Ctrl-C to interrupt a program run by a rebase "exec"
>> command then SIGINT will also be sent to the git process running the
>> rebase resulting in it being killed. Fortunately the consequences of
>> this are not severe as all the state necessary to continue the rebase is
>> saved to disc but it would be better to avoid killing git and instead
>> report that the command failed. A similar situation occurs when the
>> sequencer runs "git commit" or "git merge". If the user generates SIGINT
>> while editing the commit message then the git processes creating the
>> commit will ignore it but the git process running the rebase will be
>> killed.
>>
>> Fix this by ignoring SIGINT and SIGQUIT when forking "exec" commands,
>> "git commit" and "git merge". This matches what git already does when
>> running the user's editor and matches the behavior of the standard
>> library's system() function.
> 
> ACK

Thanks

>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>>      rebase -i: ignore signals when forking subprocesses
>>
>>      Having written this I started thinking about what happens when we fork
>>      hooks, merge strategies and merge drivers. I now wonder if it would be
>>      better to change run_command() instead - are there any cases where we
>>      actually want git to be killed when the user interrupts a child process?
> 
> I am not sure that we can rely on arbitrary hooks to do the right thing
> upon Ctrl+C, which is to wrap up and leave. So I _guess_ that we will have
> to leave it an opt-in.

Peff pointed out it doesn't play well with "gc --auto" either. Do you 
have any thoughts (particularly about the implications for Windows) on 
his suggestion to put the child in it's own session, or putting the 
child in its own process group and making that the foreground process 
group of the controlling terminal?

> However, we could easily make it an option that `run_command()` handles,
> much like `no_stdin`.

That's an interesting idea.

Best Wishes

Phillip

Phillip Wood Sept. 8, 2023, 1:11 p.m. UTC | #6

On 08/09/2023 10:59, Phillip Wood wrote:
>> But I think the right
>> solution is actually to start the editor in its own process group, and
>> let it be the foreground of the terminal. And then a ^C while the editor
>> is running would not only not hit git-commit, but it would not hit any
>> sequencer or other intermediate processes above it.
>>
>> I've never done it before, but from my reading we basically want to do
>> (in the forked process before we exec):
>>
>>    setsid();
>>    open("/dev/tty");
> 
> Do we want a whole new session? As I understand it to launch a 
> foreground job shells put the child in its own process group and then 
> call tcsetpgrp() to change the foreground process group of the 
> controlling terminal to that of the child. I agree that would be a 
> better way of doing things on unix.

It is better for handling SIGINT and SIGQUIT when we don't want git to 
be killed but in complicates the handling of SIGTSTP and friends. We'd 
need to pass WUNTRACED to waitpid() and then do 
"raise(WSTOPSIG(wstatus))" to propagate the signal up to the shell. When 
resuming we'd need to call tcsetpgrp() again if git is resumed in the 
foreground before sending SIGCONT to the child.

>> But of course none of that probably has any meaning on Windows. I'm not
>> sure if there are analogous concepts there we could access with
>> alternate code, or if it would need to be a totally different strategy.
> 
> Lets see if Johannes has any comments about that.

I had a quick google and it looks like cygwin somehow manages to 
implement tcsetpgrp() but the windows terminal does not have any concept 
of a foreground process group

Best Wishes

Phillip

Johannes Schindelin Sept. 10, 2023, 8:24 a.m. UTC | #7

Hi Phillip,

On Fri, 8 Sep 2023, Phillip Wood wrote:

> On 07/09/2023 13:57, Johannes Schindelin wrote:
> >
> > On Thu, 7 Sep 2023, Phillip Wood via GitGitGadget wrote:
> >
> > >      Having written this I started thinking about what happens when
> > >      we fork hooks, merge strategies and merge drivers. I now wonder
> > >      if it would be better to change run_command() instead - are
> > >      there any cases where we actually want git to be killed when
> > >      the user interrupts a child process?
> >
> > I am not sure that we can rely on arbitrary hooks to do the right
> > thing upon Ctrl+C, which is to wrap up and leave. So I _guess_ that we
> > will have to leave it an opt-in.
>
> Peff pointed out it doesn't play well with "gc --auto" either. Do you have any
> thoughts (particularly about the implications for Windows) on his suggestion
> to put the child in it's own session, or putting the child in its own process
> group and making that the foreground process group of the controlling
> terminal?

The concept of "sessions" does not really translate well into the Windows
world. Neither does the concept of a "process group".

Ciao,
Johannes

Jeff King Sept. 14, 2023, 9:56 a.m. UTC | #8

On Fri, Sep 08, 2023 at 10:59:06AM +0100, Phillip Wood wrote:

> Do we want a whole new session? As I understand it to launch a foreground
> job shells put the child in its own process group and then call tcsetpgrp()
> to change the foreground process group of the controlling terminal to that
> of the child. I agree that would be a better way of doing things on unix.

One thing I am not clear on is the convention on who does the process
group and controlling terminal setup. Should Git do it to say "I am
handing off control of the terminal to the editor that I am spawning"?
Or should the editor say "I am an editor which has a user interface that
takes over the terminal; I will control it while I am running".

The latter makes much more sense to me, as Git cannot know how the
editor plans to behave. But as I understand it, this kind of job control
stuff is implemented by the calling shell, which does the tcsetpgrp()
call.

So I dunno. It sounds to me like the "right" thing here is making Git
more shell-like in handing control to a program (like the editor) that
we expect to be in the foreground of the terminal. As opposed to the
"ignore SIGINT temporarily" thing which feels more like band-aid. But
I'm wary of getting into a rabbit hole of portability headaches and
weird corners of Unix terminal-handling conventions.

-Peff

Phillip Wood Sept. 14, 2023, 1:50 p.m. UTC | #9

On 14/09/2023 10:56, Jeff King wrote:
> On Fri, Sep 08, 2023 at 10:59:06AM +0100, Phillip Wood wrote:
> 
>> Do we want a whole new session? As I understand it to launch a foreground
>> job shells put the child in its own process group and then call tcsetpgrp()
>> to change the foreground process group of the controlling terminal to that
>> of the child. I agree that would be a better way of doing things on unix.
> 
> One thing I am not clear on is the convention on who does the process
> group and controlling terminal setup. Should Git do it to say "I am
> handing off control of the terminal to the editor that I am spawning"?
> Or should the editor say "I am an editor which has a user interface that
> takes over the terminal; I will control it while I am running".

As I understand it the editor has a controlling terminal (assuming there 
is a controlling terminal associated with the editor's session id), not 
the other way round. If the editor is launched in the background then it 
will receive SIGTTIN when it tries to read from it's controlling 
terminal which stops the process unless the process installs a signal 
handler.

> The latter makes much more sense to me, as Git cannot know how the
> editor plans to behave. But as I understand it, this kind of job control
> stuff is implemented by the calling shell, which does the tcsetpgrp()
> call.

Yes, my understanding is that the shell puts all the processes in a 
pipeline in the same process group and calls tcsetpgrp() if it wants 
that job to be run in the foreground.

> So I dunno. It sounds to me like the "right" thing here is making Git
> more shell-like in handing control to a program (like the editor) that
> we expect to be in the foreground of the terminal. As opposed to the
> "ignore SIGINT temporarily" thing which feels more like band-aid. But
> I'm wary of getting into a rabbit hole of portability headaches and
> weird corners of Unix terminal-handling conventions.

I'm wary of that too, it has the potential to end up adding a lot of 
fiddly code checking if git is in the foreground or background and 
propagating SIGTSTP and friends up to the shell. In any case we'd need 
the "ignore SIGINT" thing for windows anyway. Ignoring SIGINT and 
SIGQUIT in the parent is what system(3) does. As long as git exits 
promptly when the interrupted child is killed I think that is 
reasonable. Would you be happier if we re-raised the signal once we have 
cleaned up any state that needs to be written before exiting?

Best Wishes

Phillip

rebase -i: ignore signals when forking subprocesses

Commit Message

Comments

Patch