Message ID | 6f9c8775817117c2b36539eb048e2462a650ab8f.1550508544.git.msuchanek@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | worktree add race fix | expand |
On Mon, Feb 18, 2019 at 12:05 PM Michal Suchanek <msuchanek@suse.de> wrote: > When adding wotktrees git can die in get_common_dir_noenv while > examining existing worktrees because the commondir file does not exist. > Rather than testing if the file exists before reading it handle ENOENT. This commit message leaves the reader wondering under what conditions "commondir file" might not exist. For instance, the reader might wonder "iIs it simply a condition of normal operation or does it arise under odd circumstances?". Without this information, it is difficult for someone reading the explanation to understand if or how this code might validly be changed in the future. Your cover letter contained explanation which likely ought to be duplicated here as an aid to future readers. > Signed-off-by: Michal Suchanek <msuchanek@suse.de>
On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > > When adding wotktrees git can die in get_common_dir_noenv while > examining existing worktrees because the commondir file does not exist. > Rather than testing if the file exists before reading it handle ENOENT. I don't think we could go around fixing every access to incomplete worktrees like this. If this is because of racy 'worktree add', then perhaps a better solution is make it absolutely clear it's not ready for anybody to access. For example, we can suffix the worktree directory name with ".lock" and make sure get_worktrees() ignores entries ending with ".lock". That should protect other commands while 'worktree add' is still running. Only when the worktree is complete that 'worktree add' should rename the directory to lose ".lock" and run external commands like git-checkout to populate the worktree. > Signed-off-by: Michal Suchanek <msuchanek@suse.de> > --- > v2: > - do not test file existence first, just read it and handle ENOENT. > - handle zero size file correctly > --- > setup.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/setup.c b/setup.c > index ca9e8a949ed8..dd865f280d34 100644 > --- a/setup.c > +++ b/setup.c > @@ -270,12 +270,20 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) > { > struct strbuf data = STRBUF_INIT; > struct strbuf path = STRBUF_INIT; > - int ret = 0; > + int ret; > > strbuf_addf(&path, "%s/commondir", gitdir); > - if (file_exists(path.buf)) { > - if (strbuf_read_file(&data, path.buf, 0) <= 0) > + ret = strbuf_read_file(&data, path.buf, 0); > + if (ret <= 0) { > + /* > + * if file is missing or zero size (just being written) > + * assume default, bail otherwise > + */ > + if (ret && errno != ENOENT) > die_errno(_("failed to read %s"), path.buf); > + strbuf_addstr(sb, gitdir); > + ret = 0; > + } else { > while (data.len && (data.buf[data.len - 1] == '\n' || > data.buf[data.len - 1] == '\r')) > data.len--; > @@ -286,8 +294,6 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) > strbuf_addbuf(&path, &data); > strbuf_add_real_path(sb, path.buf); > ret = 1; > - } else { > - strbuf_addstr(sb, gitdir); > } > > strbuf_release(&data); > -- > 2.20.1 >
On Thu, 21 Feb 2019 17:50:38 +0700 Duy Nguyen <pclouds@gmail.com> wrote: > On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > > > > When adding wotktrees git can die in get_common_dir_noenv while > > examining existing worktrees because the commondir file does not exist. > > Rather than testing if the file exists before reading it handle ENOENT. > > I don't think we could go around fixing every access to incomplete > worktrees like this. If this is because of racy 'worktree add', then > perhaps a better solution is make it absolutely clear it's not ready > for anybody to access. > > For example, we can suffix the worktree directory name with ".lock" > and make sure get_worktrees() ignores entries ending with ".lock". > That should protect other commands while 'worktree add' is still > running. Only when the worktree is complete that 'worktree add' should > rename the directory to lose ".lock" and run external commands like > git-checkout to populate the worktree. The problem is we don't forbid worktree names ending with ".lock". Which means that if we start to forbid them now existing worktrees might become inaccessible. Thanks Michal
Hi Michal/Duy On 21/02/2019 13:50, Michal Suchánek wrote: > On Thu, 21 Feb 2019 17:50:38 +0700 > Duy Nguyen <pclouds@gmail.com> wrote: > >> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: >>> >>> When adding wotktrees git can die in get_common_dir_noenv while >>> examining existing worktrees because the commondir file does not exist. >>> Rather than testing if the file exists before reading it handle ENOENT. >> >> I don't think we could go around fixing every access to incomplete >> worktrees like this. If this is because of racy 'worktree add', then >> perhaps a better solution is make it absolutely clear it's not ready >> for anybody to access. >> >> For example, we can suffix the worktree directory name with ".lock" >> and make sure get_worktrees() ignores entries ending with ".lock". >> That should protect other commands while 'worktree add' is still >> running. Only when the worktree is complete that 'worktree add' should >> rename the directory to lose ".lock" and run external commands like >> git-checkout to populate the worktree. > > The problem is we don't forbid worktree names ending with ".lock". > Which means that if we start to forbid them now existing worktrees > might become inaccessible. I think it is also racy as the renaming breaks the use of mkdir erroring out if the directory already exists. One solution is to have a lock entry in $GIT_COMMON_DIR/worktree-locks and make sure the code that iterates over the entries in $GIT_COMMON_DIR/worktrees skips any that have a corresponding ignores in $GIT_COMMON_DIR/worktree-locks. If the worktree-locks/<dir> is created before worktree/<dir> then it should be race free (you will have to remove the lock if the real entry cannot be created and then increment the counter and try again). Entries could also be locked on removal to prevent a race there. Best Wishes Phillip > Thanks > > Michal >
On Thu, Feb 21, 2019 at 12:07 PM Phillip Wood <phillip.wood@talktalk.net> wrote: > On 21/02/2019 13:50, Michal Suchánek wrote: > >> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > > The problem is we don't forbid worktree names ending with ".lock". > > Which means that if we start to forbid them now existing worktrees > > might become inaccessible. > > I think it is also racy as the renaming breaks the use of mkdir erroring > out if the directory already exists. One solution is to have a lock > entry in $GIT_COMMON_DIR/worktree-locks and make sure the code that > iterates over the entries in $GIT_COMMON_DIR/worktrees skips any that > have a corresponding ignores in $GIT_COMMON_DIR/worktree-locks. If the > worktree-locks/<dir> is created before worktree/<dir> then it should be > race free (you will have to remove the lock if the real entry cannot be > created and then increment the counter and try again). Entries could > also be locked on removal to prevent a race there. I wonder, though, how much this helps or hinders the use-case which prompted this patch series in the first place; to wit, creating hundreds or thousands of worktrees. Doing so serially was too slow, so the many "git worktree add" invocations were instead run in parallel (which led to "discovery" of race conditions). Using a global worktree lock would serialize worktree creation, thus slowing it down once again.
Hi Eric On 21/02/2019 17:12, Eric Sunshine wrote: > On Thu, Feb 21, 2019 at 12:07 PM Phillip Wood <phillip.wood@talktalk.net> wrote: >> On 21/02/2019 13:50, Michal Suchánek wrote: >>>> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: >>> The problem is we don't forbid worktree names ending with ".lock". >>> Which means that if we start to forbid them now existing worktrees >>> might become inaccessible. >> >> I think it is also racy as the renaming breaks the use of mkdir erroring >> out if the directory already exists. One solution is to have a lock >> entry in $GIT_COMMON_DIR/worktree-locks and make sure the code that >> iterates over the entries in $GIT_COMMON_DIR/worktrees skips any that >> have a corresponding ignores in $GIT_COMMON_DIR/worktree-locks. If the >> worktree-locks/<dir> is created before worktree/<dir> then it should be >> race free (you will have to remove the lock if the real entry cannot be >> created and then increment the counter and try again). Entries could >> also be locked on removal to prevent a race there. > > I wonder, though, how much this helps or hinders the use-case which > prompted this patch series in the first place; to wit, creating > hundreds or thousands of worktrees. Doing so serially was too slow, so > the many "git worktree add" invocations were instead run in parallel > (which led to "discovery" of race conditions). Using a global worktree > lock would serialize worktree creation, thus slowing it down once > again. The idea is that there are per-worktree lock stored under worktree-locks (hence the plural name). Using a separate directory for the locks gets round the problems of name clashes between the lock for a worktree called foo and one called foo.lock and means we can rely on mkdir erroring out if the worktree name already exists as there is no renaming. Best Wishes Phillip
On Thu, 21 Feb 2019 12:12:28 -0500 Eric Sunshine <sunshine@sunshineco.com> wrote: > On Thu, Feb 21, 2019 at 12:07 PM Phillip Wood <phillip.wood@talktalk.net> wrote: > > On 21/02/2019 13:50, Michal Suchánek wrote: > > >> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > > > The problem is we don't forbid worktree names ending with ".lock". > > > Which means that if we start to forbid them now existing worktrees > > > might become inaccessible. > > > > I think it is also racy as the renaming breaks the use of mkdir erroring > > out if the directory already exists. One solution is to have a lock > > entry in $GIT_COMMON_DIR/worktree-locks and make sure the code that > > iterates over the entries in $GIT_COMMON_DIR/worktrees skips any that > > have a corresponding ignores in $GIT_COMMON_DIR/worktree-locks. If the > > worktree-locks/<dir> is created before worktree/<dir> then it should be > > race free (you will have to remove the lock if the real entry cannot be > > created and then increment the counter and try again). Entries could > > also be locked on removal to prevent a race there. > > I wonder, though, how much this helps or hinders the use-case which > prompted this patch series in the first place; to wit, creating > hundreds or thousands of worktrees. Doing so serially was too slow, so > the many "git worktree add" invocations were instead run in parallel > (which led to "discovery" of race conditions). Using a global worktree > lock would serialize worktree creation, thus slowing it down once > again. I created thousands of worktrees only for stress-testing. The real workload needs only a dozen of them. That still leads to hitting a race condition occasionally and automation failure. Creating a separate lock directory will probably work. The question is when do you need to take the lock. Before adding a worktree, sure. Before deleting it as well. The problem is that deleting a worktree successfully without creating some broken state needs to exclude processes that might add stuff in the worktree directory. How many operations then do *not* need to take the lock? Thanks Michal
On Thu, Feb 21, 2019 at 8:50 PM Michal Suchánek <msuchanek@suse.de> wrote: > > On Thu, 21 Feb 2019 17:50:38 +0700 > Duy Nguyen <pclouds@gmail.com> wrote: > > > On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > > > > > > When adding wotktrees git can die in get_common_dir_noenv while > > > examining existing worktrees because the commondir file does not exist. > > > Rather than testing if the file exists before reading it handle ENOENT. > > > > I don't think we could go around fixing every access to incomplete > > worktrees like this. If this is because of racy 'worktree add', then > > perhaps a better solution is make it absolutely clear it's not ready > > for anybody to access. > > > > For example, we can suffix the worktree directory name with ".lock" > > and make sure get_worktrees() ignores entries ending with ".lock". > > That should protect other commands while 'worktree add' is still > > running. Only when the worktree is complete that 'worktree add' should > > rename the directory to lose ".lock" and run external commands like > > git-checkout to populate the worktree. > > The problem is we don't forbid worktree names ending with ".lock". > Which means that if we start to forbid them now existing worktrees > might become inaccessible. Worktrees ending with .lock will not work well now anyway. While [1] reports the problem with worktree names having a whitespace, ".lock" is in the same class (not a valid refname) and will result the same error. So if you have "*.lock" worktrees now you're already in trouble. [1] https://public-inbox.org/git/1550673274.30738.0@yandex.ru/T/#m9d86e0a388fd4961bc102c2c69e8bc3b2db07a42
On Fri, Feb 22, 2019 at 12:07 AM Phillip Wood <phillip.wood@talktalk.net> wrote: > > Hi Michal/Duy > > On 21/02/2019 13:50, Michal Suchánek wrote: > > On Thu, 21 Feb 2019 17:50:38 +0700 > > Duy Nguyen <pclouds@gmail.com> wrote: > > > >> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > >>> > >>> When adding wotktrees git can die in get_common_dir_noenv while > >>> examining existing worktrees because the commondir file does not exist. > >>> Rather than testing if the file exists before reading it handle ENOENT. > >> > >> I don't think we could go around fixing every access to incomplete > >> worktrees like this. If this is because of racy 'worktree add', then > >> perhaps a better solution is make it absolutely clear it's not ready > >> for anybody to access. > >> > >> For example, we can suffix the worktree directory name with ".lock" > >> and make sure get_worktrees() ignores entries ending with ".lock". > >> That should protect other commands while 'worktree add' is still > >> running. Only when the worktree is complete that 'worktree add' should > >> rename the directory to lose ".lock" and run external commands like > >> git-checkout to populate the worktree. > > > > The problem is we don't forbid worktree names ending with ".lock". > > Which means that if we start to forbid them now existing worktrees > > might become inaccessible. > > I think it is also racy as the renaming breaks the use of mkdir erroring > out if the directory already exists. You mean the part where we see "fred" exists and decide to try the name "fred1" instead (i.e. patch 1/2)? I don't think it's the problem if that's the case. We mkdir "fred.lock" _then_ check if "fred" exists. If it does, remove fred.lock and move on to fred1.lock. Then we rename fred1.lock to fred1 and error out if rename fails.
Hi Duy On 22/02/2019 09:32, Duy Nguyen wrote: > On Fri, Feb 22, 2019 at 12:07 AM Phillip Wood <phillip.wood@talktalk.net> wrote: >> >> Hi Michal/Duy >> >> On 21/02/2019 13:50, Michal Suchánek wrote: >>> On Thu, 21 Feb 2019 17:50:38 +0700 >>> Duy Nguyen <pclouds@gmail.com> wrote: >>> >>>> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: >>>>> >>>>> When adding wotktrees git can die in get_common_dir_noenv while >>>>> examining existing worktrees because the commondir file does not exist. >>>>> Rather than testing if the file exists before reading it handle ENOENT. >>>> >>>> I don't think we could go around fixing every access to incomplete >>>> worktrees like this. If this is because of racy 'worktree add', then >>>> perhaps a better solution is make it absolutely clear it's not ready >>>> for anybody to access. >>>> >>>> For example, we can suffix the worktree directory name with ".lock" >>>> and make sure get_worktrees() ignores entries ending with ".lock". >>>> That should protect other commands while 'worktree add' is still >>>> running. Only when the worktree is complete that 'worktree add' should >>>> rename the directory to lose ".lock" and run external commands like >>>> git-checkout to populate the worktree. >>> >>> The problem is we don't forbid worktree names ending with ".lock". >>> Which means that if we start to forbid them now existing worktrees >>> might become inaccessible. >> >> I think it is also racy as the renaming breaks the use of mkdir erroring >> out if the directory already exists. > > You mean the part where we see "fred" exists and decide to try the > name "fred1" instead (i.e. patch 1/2)? > > I don't think it's the problem if that's the case. We mkdir > "fred.lock" _then_ check if "fred" exists. If it does, remove > fred.lock and move on to fred1.lock. Then we rename fred1.lock to > fred1 and error out if rename fails. Ah you're right, if another process tries to create fred.lock as we're renaming it either their mkdir fred.lock will fail or they'll see fred once they've made fred.lock Sorry for the confusion Phillip
Hello, On Thu, 21 Feb 2019 17:27:04 +0000 Phillip Wood <phillip.wood@talktalk.net> wrote: > Hi Eric > > On 21/02/2019 17:12, Eric Sunshine wrote: > > On Thu, Feb 21, 2019 at 12:07 PM Phillip Wood <phillip.wood@talktalk.net> wrote: > >> On 21/02/2019 13:50, Michal Suchánek wrote: > >>>> On Tue, Feb 19, 2019 at 12:05 AM Michal Suchanek <msuchanek@suse.de> wrote: > >>> The problem is we don't forbid worktree names ending with ".lock". > >>> Which means that if we start to forbid them now existing worktrees > >>> might become inaccessible. > >> > >> I think it is also racy as the renaming breaks the use of mkdir erroring > >> out if the directory already exists. One solution is to have a lock > >> entry in $GIT_COMMON_DIR/worktree-locks and make sure the code that > >> iterates over the entries in $GIT_COMMON_DIR/worktrees skips any that > >> have a corresponding ignores in $GIT_COMMON_DIR/worktree-locks. If the > >> worktree-locks/<dir> is created before worktree/<dir> then it should be > >> race free (you will have to remove the lock if the real entry cannot be > >> created and then increment the counter and try again). Entries could > >> also be locked on removal to prevent a race there. > > > > I wonder, though, how much this helps or hinders the use-case which > > prompted this patch series in the first place; to wit, creating > > hundreds or thousands of worktrees. Doing so serially was too slow, so > > the many "git worktree add" invocations were instead run in parallel > > (which led to "discovery" of race conditions). Using a global worktree > > lock would serialize worktree creation, thus slowing it down once > > again. > > The idea is that there are per-worktree lock stored under worktree-locks > (hence the plural name). Using a separate directory for the locks gets > round the problems of name clashes between the lock for a worktree > called foo and one called foo.lock and means we can rely on mkdir > erroring out if the worktree name already exists as there is no renaming. I suppose this separate directory would work. When are you supposed to take the lock, though? When adding worktree, sure. When managing worktrees, sure. Otherwise you would see the incomplete worktrees. When doing anything in git? Probably. Because otherwise you could accidentally use the incomplete worktree. Or somebody deleting worktree would fail removing it because you would keep adding files to it. Isn't git supposed to allow parallel access to the repository? As things stand if you wanted to implement worktree locking you would need to lock the worktree for *every* operation that touches it, and for many operations you would have to lock/unlock *all* worktrees one by one to find the worktree you are supposed to work on. I don't feel like adding locking to all of git to fix this problem. Sure, adding enough locking to ensure repository consistency at all times would be nice but it also needs to be granular enough to not harm performance. I can't say I understand the git repository layout and usage well enough to design that. Thanks Michal
diff --git a/setup.c b/setup.c index ca9e8a949ed8..dd865f280d34 100644 --- a/setup.c +++ b/setup.c @@ -270,12 +270,20 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) { struct strbuf data = STRBUF_INIT; struct strbuf path = STRBUF_INIT; - int ret = 0; + int ret; strbuf_addf(&path, "%s/commondir", gitdir); - if (file_exists(path.buf)) { - if (strbuf_read_file(&data, path.buf, 0) <= 0) + ret = strbuf_read_file(&data, path.buf, 0); + if (ret <= 0) { + /* + * if file is missing or zero size (just being written) + * assume default, bail otherwise + */ + if (ret && errno != ENOENT) die_errno(_("failed to read %s"), path.buf); + strbuf_addstr(sb, gitdir); + ret = 0; + } else { while (data.len && (data.buf[data.len - 1] == '\n' || data.buf[data.len - 1] == '\r')) data.len--; @@ -286,8 +294,6 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) strbuf_addbuf(&path, &data); strbuf_add_real_path(sb, path.buf); ret = 1; - } else { - strbuf_addstr(sb, gitdir); } strbuf_release(&data);
When adding wotktrees git can die in get_common_dir_noenv while examining existing worktrees because the commondir file does not exist. Rather than testing if the file exists before reading it handle ENOENT. Signed-off-by: Michal Suchanek <msuchanek@suse.de> --- v2: - do not test file existence first, just read it and handle ENOENT. - handle zero size file correctly --- setup.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)