Message ID | 20181113003911.GB170017@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization) | expand |
Jonathan Nieder <jrnieder@gmail.com> writes: > Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension, > 2018-10-10) Git defaults to writing the new EOIE section when writing > out an index file. Usually that is a good thing because it improves > threaded performance, but when a Git repository is shared with older > versions of Git, it produces a confusing warning: > > $ git status > ignoring EOIE extension > HEAD detached at 371ed0defa > nothing to commit, working tree clean > > Let's introduce the new index extension more gently. First we'll roll > out the new version of Git that understands it, and then once > sufficiently many users are using such a version, we can flip the > default to writing it by default. > > Introduce a '[index] recordEndOfIndexEntries' configuration variable > to allow interested users to benefit from this index extension early. Thanks. I am in principle OK with this approach. In fact, I suspect that the default may want to be dynamically determined, and we give this knob to let the users further force their preference. When no extension that benefits from multi-threading is written, the default can stay "no" in future versions of Git, for example. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 41a9ff2b6a..d702379db4 100644 The timing is a bit unfortunate for any topic to touch this file, and contrib/cocci would not help us in this case X-<. > diff --git a/read-cache.c b/read-cache.c > index f3a848d61c..4bfe93c4c2 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -2698,6 +2698,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile > rollback_lock_file(lockfile); > } > > +static int record_eoie(void) > +{ > + int val; > + > + if (!git_config_get_bool("index.recordendofindexentries", &val)) > + return val; > + return 0; > +} Unconditionally defaulting to no in this round is perfectly fine. Let's make a mental note that this is the place to decide dynamic default in the future when we want to. It would probably have to ask around various "extension writing" helpers if they want to have a say in the outcome (e.g. if there are very many cache entries in the istate, the entry offset table may want to be written and otherwise not). > @@ -2945,7 +2954,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, > * read. Write it out regardless of the strip_extensions parameter as we need it > * when loading the shared index. > */ > - if (offset) { > + if (offset && record_eoie()) { > struct strbuf sb = STRBUF_INIT; > > write_eoie_extension(&sb, &eoie_c, offset); > diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh > index 2ac47aa0e4..0cbac64e28 100755 > --- a/t/t1700-split-index.sh > +++ b/t/t1700-split-index.sh > @@ -25,14 +25,17 @@ test_expect_success 'enable split index' ' > git update-index --split-index && > test-tool dump-split-index .git/index >actual && > indexversion=$(test-tool index-version <.git/index) && > + > + # NEEDSWORK: Stop hard-coding checksums. Also let's stop hard-coding the assumption that the new knob is off by default. Ideally, you'd want to test both cases, right? Perhaps you'd call "git update-index --split-index" we see in the precontext twice, with "-c VAR=false" and "-c VAR=true", to prepare "actual.without-eoie" and "actual.with-eoie", or something like that? Thanks. > if test "$indexversion" = "4" > then > - own=3527df833c6c100d3d1d921a9a782d62a8be4b58 > - base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb > + own=432ef4b63f32193984f339431fd50ca796493569 > + base=508851a7f0dfa8691e9f69c7f055865389012491 > else > - own=5e9b60117ece18da410ddecc8b8d43766a0e4204 > - base=4370042739b31cd17a5c5cd6043a77c9a00df113 > + own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339 > + base=39d890139ee5356c7ef572216cebcd27aa41f9df > fi && > + > cat >expect <<-EOF && > own $own > base $base
On 11/12/2018 8:05 PM, Junio C Hamano wrote: > Jonathan Nieder <jrnieder@gmail.com> writes: > >> Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension, >> 2018-10-10) Git defaults to writing the new EOIE section when writing >> out an index file. Usually that is a good thing because it improves >> threaded performance, but when a Git repository is shared with older >> versions of Git, it produces a confusing warning: >> >> $ git status >> ignoring EOIE extension >> HEAD detached at 371ed0defa >> nothing to commit, working tree clean >> >> Let's introduce the new index extension more gently. First we'll roll >> out the new version of Git that understands it, and then once >> sufficiently many users are using such a version, we can flip the >> default to writing it by default. >> >> Introduce a '[index] recordEndOfIndexEntries' configuration variable >> to allow interested users to benefit from this index extension early. > > Thanks. I am in principle OK with this approach. In fact, I > suspect that the default may want to be dynamically determined, and > we give this knob to let the users further force their preference. > When no extension that benefits from multi-threading is written, the > default can stay "no" in future versions of Git, for example. > While I can understand the user confusion the warning about ignoring an extension could cause I guess I'm a little surprised that people would see it that often. To see the warning means they are running a new version of git in the same repo as they are running an old version of git. I just haven't ever experienced that (I only ever have one copy of git installed) so am surprised it comes up often enough to warrant this change. That said, if it _is_ that much of an issue, this patch makes sense and provides a way to more gracefully transition into this feature. Even if we had some logic to dynamically determine whether to write it or not, we'd still want to avoid confusing users when it did get written out. >> diff --git a/Documentation/config.txt b/Documentation/config.txt >> index 41a9ff2b6a..d702379db4 100644 > > The timing is a bit unfortunate for any topic to touch this file, > and contrib/cocci would not help us in this case X-<. > >> diff --git a/read-cache.c b/read-cache.c >> index f3a848d61c..4bfe93c4c2 100644 >> --- a/read-cache.c >> +++ b/read-cache.c >> @@ -2698,6 +2698,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile >> rollback_lock_file(lockfile); >> } >> >> +static int record_eoie(void) >> +{ >> + int val; >> + >> + if (!git_config_get_bool("index.recordendofindexentries", &val)) >> + return val; >> + return 0; >> +} > > Unconditionally defaulting to no in this round is perfectly fine. > Let's make a mental note that this is the place to decide dynamic > default in the future when we want to. It would probably have to > ask around various "extension writing" helpers if they want to have > a say in the outcome (e.g. if there are very many cache entries in > the istate, the entry offset table may want to be written and > otherwise not). > >> @@ -2945,7 +2954,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, >> * read. Write it out regardless of the strip_extensions parameter as we need it >> * when loading the shared index. >> */ >> - if (offset) { >> + if (offset && record_eoie()) { >> struct strbuf sb = STRBUF_INIT; >> >> write_eoie_extension(&sb, &eoie_c, offset); >> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh >> index 2ac47aa0e4..0cbac64e28 100755 >> --- a/t/t1700-split-index.sh >> +++ b/t/t1700-split-index.sh >> @@ -25,14 +25,17 @@ test_expect_success 'enable split index' ' >> git update-index --split-index && >> test-tool dump-split-index .git/index >actual && >> indexversion=$(test-tool index-version <.git/index) && >> + >> + # NEEDSWORK: Stop hard-coding checksums. > > Also let's stop hard-coding the assumption that the new knob is off > by default. Ideally, you'd want to test both cases, right? > > Perhaps you'd call "git update-index --split-index" we see in the > precontext twice, with "-c VAR=false" and "-c VAR=true", to prepare > "actual.without-eoie" and "actual.with-eoie", or something like > that? > > Thanks. > >> if test "$indexversion" = "4" >> then >> - own=3527df833c6c100d3d1d921a9a782d62a8be4b58 >> - base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb >> + own=432ef4b63f32193984f339431fd50ca796493569 >> + base=508851a7f0dfa8691e9f69c7f055865389012491 >> else >> - own=5e9b60117ece18da410ddecc8b8d43766a0e4204 >> - base=4370042739b31cd17a5c5cd6043a77c9a00df113 >> + own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339 >> + base=39d890139ee5356c7ef572216cebcd27aa41f9df >> fi && >> + >> cat >expect <<-EOF && >> own $own >> base $base
Hi, Ben Peart wrote: > While I can understand the user confusion the warning about ignoring an > extension could cause I guess I'm a little surprised that people would see > it that often. To see the warning means they are running a new version of > git in the same repo as they are running an old version of git. I just > haven't ever experienced that (I only ever have one copy of git installed) > so am surprised it comes up often enough to warrant this change. Great question. There are a few contexts where it comes up: 1. Using multiple versions of Git on a single machine. For example, some IDEs bundle a particular version of Git, which can be a different version from the system copy, or on a Mac, /usr/bin/git quickly goes out of sync with the Homebrew git in /usr/local/bin/git. 2. Sharing a single Git repository between multiple machines. This is not unusual, using NFS or sshfs, for example. 3. Downgrading after trying a new version of Git. To support these, Git is generally careful to avoid writing repositories that older versions of Git do not understand. The EOIE extension was almost perfect in this respect: it works fine with older versions of Git, except for the alarming error message. > That said, if it _is_ that much of an issue, this patch makes sense and > provides a way to more gracefully transition into this feature. Even if we > had some logic to dynamically determine whether to write it or not, we'd > still want to avoid confusing users when it did get written out. Yes. An earlier version of this patch defaulted to writing EOIE if and only if the .git/index file already has an EOIE extension. There were enough holes in that (commands like "git reset" that do not read the existing index file) and enough complexity that it didn't seem worth it. Really in this series, patch 3/3 is the one I care most about. I wish we had had it years ago. :) It would make patches 1 and 2 unnecessary. Thanks, Jonathan
Jonathan Nieder <jrnieder@gmail.com> writes: > 1. Using multiple versions of Git on a single machine. For example, > some IDEs bundle a particular version of Git, which can be a > different version from the system copy, or on a Mac, /usr/bin/git > quickly goes out of sync with the Homebrew git in > /usr/local/bin/git. Exactly this, especially the latter, is the answer to your question in an earlier message: >> Am I understanding correclty? Can you give an example of when a user >> would *want* to see this message and what they would do in response? The user may not be even aware of using another version of Git that does not know how to take advantage of the version of Git you have used in the repository, and it can be a mistake the user may want to fix (e.g. by futzing with PATH). The message would help the user notice the situation and take corrective action. Users of IDEs that bundle stale version of Git cannot even bug the supplier of the IDE to make them more up-to-date if they aren't aware of it.
Junio C Hamano wrote: > Jonathan Nieder <jrnieder@gmail.com> writes: >> 1. Using multiple versions of Git on a single machine. For example, >> some IDEs bundle a particular version of Git, which can be a >> different version from the system copy, or on a Mac, /usr/bin/git >> quickly goes out of sync with the Homebrew git in >> /usr/local/bin/git. > > Exactly this, especially the latter, is the answer to your > question in an earlier message: > >>> Am I understanding correctly? Can you give an example of when a user >>> would *want* to see this message and what they would do in response? > > The user may not be even aware of using another version of Git that > does not know how to take advantage of the version of Git you have > used in the repository, and it can be a mistake the user may want to > fix (e.g. by futzing with PATH). Ah, thanks much. I'll add a hint along those lines (e.g. warning: ignoring optional IEOT index extension hint: This is likely due to the file having been written by a newer hint: version of Git than is reading it. You can upgrade Git to hint: take advantage of performance improvements from the updated hint: file format. hint: hint: You can run "git config advice.unknownIndexExtension true" to hint: suppress this message. I am still vaguely uncomfortable with this since it seems analogous to warning that the server is advertising an unrecognized capability, but I can live with it. :) Patch coming in a few moments. Jonathan
diff --git a/Documentation/config.txt b/Documentation/config.txt index 41a9ff2b6a..d702379db4 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2188,6 +2188,13 @@ imap:: The configuration variables in the 'imap' section are described in linkgit:git-imap-send[1]. +index.recordEndOfIndexEntries:: + Specifies whether the index file should include an "End Of Index + Entry" section. This reduces index load time on multiprocessor + machines but produces a message "ignoring EOIE extension" when + reading the index using Git versions before 2.20. Defaults to + 'false'. + index.threads:: Specifies the number of threads to spawn when loading the index. This is meant to reduce index load time on multiprocessor machines. diff --git a/read-cache.c b/read-cache.c index f3a848d61c..4bfe93c4c2 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2698,6 +2698,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile rollback_lock_file(lockfile); } +static int record_eoie(void) +{ + int val; + + if (!git_config_get_bool("index.recordendofindexentries", &val)) + return val; + return 0; +} + /* * On success, `tempfile` is closed. If it is the temporary file * of a `struct lock_file`, we will therefore effectively perform @@ -2945,7 +2954,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, * read. Write it out regardless of the strip_extensions parameter as we need it * when loading the shared index. */ - if (offset) { + if (offset && record_eoie()) { struct strbuf sb = STRBUF_INIT; write_eoie_extension(&sb, &eoie_c, offset); diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh index 2ac47aa0e4..0cbac64e28 100755 --- a/t/t1700-split-index.sh +++ b/t/t1700-split-index.sh @@ -25,14 +25,17 @@ test_expect_success 'enable split index' ' git update-index --split-index && test-tool dump-split-index .git/index >actual && indexversion=$(test-tool index-version <.git/index) && + + # NEEDSWORK: Stop hard-coding checksums. if test "$indexversion" = "4" then - own=3527df833c6c100d3d1d921a9a782d62a8be4b58 - base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb + own=432ef4b63f32193984f339431fd50ca796493569 + base=508851a7f0dfa8691e9f69c7f055865389012491 else - own=5e9b60117ece18da410ddecc8b8d43766a0e4204 - base=4370042739b31cd17a5c5cd6043a77c9a00df113 + own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339 + base=39d890139ee5356c7ef572216cebcd27aa41f9df fi && + cat >expect <<-EOF && own $own base $base
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension, 2018-10-10) Git defaults to writing the new EOIE section when writing out an index file. Usually that is a good thing because it improves threaded performance, but when a Git repository is shared with older versions of Git, it produces a confusing warning: $ git status ignoring EOIE extension HEAD detached at 371ed0defa nothing to commit, working tree clean Let's introduce the new index extension more gently. First we'll roll out the new version of Git that understands it, and then once sufficiently many users are using such a version, we can flip the default to writing it by default. Introduce a '[index] recordEndOfIndexEntries' configuration variable to allow interested users to benefit from this index extension early. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Documentation/config.txt | 7 +++++++ read-cache.c | 11 ++++++++++- t/t1700-split-index.sh | 11 +++++++---- 3 files changed, 24 insertions(+), 5 deletions(-)