Message ID | cover.1685574402.git.jonathantanmy@google.com (mailing list archive) |
---|---|
Headers | show |
Series | Changed path filter hash fix and version bump | expand |
Jonathan Tan <jonathantanmy@google.com> writes: > Here's a new version. With this, Git can function with both version > 1 (incorrect murmur3) and version 2 (correct murmur3) changed path > filters, but not at the same time: the user can set a config variable to > choose which one, and Git will ignore existing changed path filters of > the wrong version (and always write the version that the config variable > says). Hmph. On a system with unsigned char, we should be able to keep using version 1 without losing correctness, I suspect, but probably they are in the minority we do not have to care about? I can see the desire to simplify the migration plan (i.e. essentially have no migration---this will give us just a flag day per repository), but I'll let others to comment. > In patch 1, the test assumes that char is signed. I'm not sure if it's > worth asserting on the contents of the filter, since it depends on > whether char is signed, but I've included it anyway (since it's easy > to remove). So, on a system with unsigned char, would these tests fail? Do we need a prereq to skip them? Thanks.
Junio C Hamano <gitster@pobox.com> writes: > Jonathan Tan <jonathantanmy@google.com> writes: > >> Here's a new version. With this, Git can function with both version >> 1 (incorrect murmur3) and version 2 (correct murmur3) changed path >> filters, but not at the same time: the user can set a config variable to >> choose which one, and Git will ignore existing changed path filters of >> the wrong version (and always write the version that the config variable >> says). Seems to break t4216 when merged to 'seen' to replace the previous round. Could you take a look?
Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > > > Jonathan Tan <jonathantanmy@google.com> writes: > > > >> Here's a new version. With this, Git can function with both version > >> 1 (incorrect murmur3) and version 2 (correct murmur3) changed path > >> filters, but not at the same time: the user can set a config variable to > >> choose which one, and Git will ignore existing changed path filters of > >> the wrong version (and always write the version that the config variable > >> says). > > Seems to break t4216 when merged to 'seen' to replace the previous > round. Could you take a look? OK - I see that it fails CI when I upload the merge to GitHub (although it passes locally). I'll take a look.
Jonathan Tan <jonathantanmy@google.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > > Seems to break t4216 when merged to 'seen' to replace the previous > > round. Could you take a look? > > OK - I see that it fails CI when I upload the merge to GitHub (although > it passes locally). I'll take a look. Hmm...so when the test writes a file with a high bit filename, the filename written is different. Here's the output from my machine (Linux): ++ git -C highbit1/ commit -m c1 [main (root-commit) 808e481] c1 Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) create mode 100644 "\302\242" ++ case "$tag" in ++ git -C highbit1/ tag c1 ++ touch $'\302\242\302\242' ++ ls A expect file5_renamed limits log_wo_bloom trace2-auto.txt ''$'\302\242\302\242' actual file4 highbit1 log_w_bloom trace.perf trace2.txt ++ git -C highbit1 commit-graph write --reachable --changed-paths bloom calculating -62,-94 version 1 ok 141 - set up repo with high bit path, version 1 changed-path and here's the output from the CI on GitHub on linux-gcc-default: [main (root-commit) 3f37a43] c1 Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) create mode 100644 "\\xc2\\xa2" + git -C highbit1/ tag c1 + touch \xc2\xa2\xc2\xa2 + ls A \xc2\xa2\xc2\xa2 actual expect file4 file5_renamed highbit1 limits log_w_bloom log_wo_bloom trace.perf trace2-auto.txt trace2.txt + git -C highbit1 commit-graph write --reachable --changed-paths bloom calculating 92,120 version 1 ok 141 - set up repo with high bit path, version 1 changed-path These were run with some extra code, here for completeness: diff --git a/bloom.c b/bloom.c index dea883d8d6..ccc3e0ce80 100644 --- a/bloom.c +++ b/bloom.c @@ -192,6 +192,8 @@ void fill_bloom_key(const char *data, ? murmur3_seeded_v2 : murmur3_seeded_v1)(seed0, data, len); const uint32_t hash1 = (settings->hash_version == 2 ? murmur3_seeded_v2 : murmur3_seeded_v1)(seed1, data, len); + if (len > 0) + fprintf(stderr, "bloom calculating %d,%d version %d\n", data[0], data[1], settings->hash_version); key->hashes = (uint32_t *)xcalloc(settings->num_hashes, sizeof(uint32_t)); for (i = 0; i < settings->num_hashes; i++) diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index 764c6dee0f..1121e7d4cd 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -423,6 +423,8 @@ CENT=$(printf "\xc2\xa2") test_expect_success 'set up repo with high bit path, version 1 changed-path' ' git init highbit1 && test_commit -C highbit1 c1 "$CENT" && + touch $CENT$CENT && + ls && git -C highbit1 commit-graph write --reachable --changed-paths ' Notice the different filename after "create mode", and the different "ls" output. I'll investigate some more, but if anyone offhand knows what's going on (and/or knows how to write a high-bit filename portably, even across Linux), please let me know. An alternative is to drop the tests from these patches - so, leave them in during the review period and reviewers would see that the tests pass the CI jobs for Windows and Mac OS X pass, and then before we merge it, delete the tests from the patches. This also solves needing to prevent unsigned char platforms from running the version 1 tests - there's no prereq for that yet and we would have to investigate making one.