mbox series

[v4,0/3] Use default values from settings instead of config

Message ID 20211012174208.95161-1-chooglen@google.com (mailing list archive)
Headers show
Series Use default values from settings instead of config | expand

Message

Glen Choo Oct. 12, 2021, 5:42 p.m. UTC
Hi everyone! This patch was created in response to something we observed in
Google, where fsck failed to detect that the commit graph was invalid. We
initially assumed that fsck never checked the commit graph, but it turns out
that it does so only when core.commitgraph is set, even though we set defaults
for "whether to use the commit graph" in the repository settings.

Instead of using the config, let's use repository settings where
available. Replace core.commitGraph and core.multiPackIndex with their
equivalent repository settings in fsck and gc.

This re-roll is primarily motivated by the CI failures noted by Junio in [1].
The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
job) causes commands like "git commit" to write commit-graphs, but certain tests
like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
fixes a bug where fsck did not check commit-graphs by default, which means that
these tests will now fail because they have invalid commit-graphs. The easiest
solution I found is to disable this confusing and noisy behavior with
GIT_TEST_COMMIT_GRAPH=0.

And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
separate.

This series has also seen a healthy amount of interest in test style and
coverage. We haven't converged on a path for the future, but I'd like to think
that the tests here are still a step in (approximately) the right direction.
Hopefully this version LGT us while we figure out what to do with tests :)

[1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/

Changes since v3
* Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
  way that is incompatible with commit-graphs.
* Make patch 1 and 2's commit messages more concise (thanks Ævar!).

Changes since v2:
* Various small test fixes (thanks Eric!). Most notably, I've used -c instead of
  test_config because test_config can affect the values in subsequent tests.
* Rewording fix in patch 3 commit message
* Refactor tests in patch 3 so that we use a single helper function instead of
  copy-pasted code

Changes since v1:
* clean up typo in patch 1 commit message 
* document the commits that patches 1 and 2 address
* use test helpers in patch 1
* rewrite patch 2's tests so that it's easier to tell that each test
  does something different
* reword patch 3 commit message to explain the bug
* add tests to patch 3

Glen Choo (3):
  fsck: verify commit graph when implicitly enabled
  fsck: verify multi-pack-index when implictly enabled
  gc: perform incremental repack when implictly enabled

 builtin/fsck.c              |  5 +++--
 builtin/gc.c                |  5 ++---
 t/t0410-partial-clone.sh    |  6 +++++-
 t/t3800-mktag.sh            |  5 +++++
 t/t5318-commit-graph.sh     | 23 ++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh |  5 ++++-
 t/t7900-maintenance.sh      | 28 ++++++++++++++++++++++++----
 7 files changed, 65 insertions(+), 12 deletions(-)

Range-diff against v3:
1:  2f9ff949e6 ! 1:  aac1253e7b fsck: verify commit graph when implicitly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify commit graph when implicitly enabled
     
    -    the_repository->settings is the preferred way to get certain
    -    settings (such as "core.commitGraph") because it gets default values
    -    from prepare_repo_settings(). However, cmd_fsck() reads the config
    -    directly via git_config_get_bool(), which bypasses these default values.
    -    This causes fsck to ignore the commit graph if "core.commitgraph" is not
    -    explicitly set in the config. This worked fine until commit-graph was
    -    enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
    -    default, 2019-08-13).
    +    Change fsck to check the "core_commit_graph" variable set in
    +    "repo-settings.c" instead of reading the "core.commitGraph" variable.
    +    This fixes a bug where we wouldn't verify the commit-graph if the
    +    config key was missing. This bug was introduced in
    +    31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13),
    +    where core.commitGraph was turned on by default.
     
    -    Replace git_config_get_bool("core.commitgraph") in fsck with the
    -    equivalent call to the_repository->settings.core_commit_graph.
    -
    -    The expected behavior is that fsck respects the config value when it is
    -    set, but uses the default value when it is unset. For example, for
    -    core.commitGraph, there are three cases:
    -
    -    - Config value is set to true -> enabled
    -    - Config value is set to false -> disabled
    -    - Config value is unset -> enabled
    -
    -    As such, tests cover all three cases.
    +    Add tests to "t5318-commit-graph.sh" to verify that fsck checks the
    +    commit-graph as expected for the 3 values of core.commitGraph. Also,
    +    disable GIT_TEST_COMMIT_GRAPH for tests that use fsck in ways that
    +    assume that commit-graph checking is disabled (t/t3800-mktag.sh,
    +    t/t0410-partial-clone.sh).
     
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Glen Choo <chooglen@google.com>
     
      ## builtin/fsck.c ##
    @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
      		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
      
     
    + ## t/t0410-partial-clone.sh ##
    +@@ t/t0410-partial-clone.sh: test_description='partial clone'
    + 
    + # missing promisor objects cause repacks which write bitmaps to fail
    + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
    ++# When enabled, some commands will write commit-graphs. This causes fsck
    ++# to fail when delete_object() is called because fsck will attempt to
    ++# verify the out-of-sync commit graph.
    ++GIT_TEST_COMMIT_GRAPH=0
    + 
    + delete_object () {
    + 	rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
    +@@ t/t0410-partial-clone.sh: test_expect_success 'rev-list stops traversal at missing and promised commit' '
    + 
    + 	git -C repo config core.repositoryformatversion 1 &&
    + 	git -C repo config extensions.partialclone "arbitrary string" &&
    +-	GIT_TEST_COMMIT_GRAPH=0 git -C repo -c core.commitGraph=false rev-list --exclude-promisor-objects --objects bar >out &&
    ++	git -C repo rev-list --exclude-promisor-objects --objects bar >out &&
    + 	grep $(git -C repo rev-parse bar) out &&
    + 	! grep $FOO out
    + '
    +
    + ## t/t3800-mktag.sh ##
    +@@ t/t3800-mktag.sh: test_description='git mktag: tag object verify test'
    + 
    + . ./test-lib.sh
    + 
    ++# When enabled, some commands will automatically write commit-graphs.
    ++# This will cause the mktag tests to fail because fsck will attempt to
    ++# verify the out-of-sync commit graph.
    ++GIT_TEST_COMMIT_GRAPH=0
    ++
    + ###########################################################
    + # check the tag.sig file, expecting verify_tag() to fail,
    + # and checking that the error message matches the pattern
    +
      ## t/t5318-commit-graph.sh ##
     @@ t/t5318-commit-graph.sh: test_expect_success 'detect incorrect chunk count' '
      		$GRAPH_CHUNK_LOOKUP_OFFSET
2:  b13ca2a695 ! 2:  ed64983430 fsck: verify multi-pack-index when implictly enabled
    @@ Metadata
      ## Commit message ##
         fsck: verify multi-pack-index when implictly enabled
     
    -    Like the previous commit, the_repository->settings.core_multi_pack_index
    -    is preferable to reading "core.multiPackIndex" from the config because
    -    prepare_repo_settings() sets a default value for
    -    the_repository->settings. This worked fine until core.multiPackIndex was
    -    enabled by default in 18e449f86b (midx: enable core.multiPackIndex by
    -    default, 2020-09-25).
    -
    -    Replace git_config_get_bool("core.multiPackIndex") in fsck with the
    -    equivalent call to the_repository->settings.multi_pack_index.
    +    Like the previous commit, change fsck to check the
    +    "core_multi_pack_index" variable set in "repo-settings.c" instead of
    +    reading the "core.multiPackIndex" config variable. This fixes a bug
    +    where we wouldn't verify midx if the config key was missing. This bug
    +    was introduced in 18e449f86b (midx: enable core.multiPackIndex by
    +    default, 2020-09-25) where core.multiPackIndex was turned on by default.
     
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Glen Choo <chooglen@google.com>
     
      ## builtin/fsck.c ##
3:  76943aff80 = 3:  821b492d8b gc: perform incremental repack when implictly enabled

Comments

Junio C Hamano Oct. 12, 2021, 8:23 p.m. UTC | #1
Glen Choo <chooglen@google.com> writes:

> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.

OK.  Will replace.  Thanks.
Ævar Arnfjörð Bjarmason Oct. 12, 2021, 8:34 p.m. UTC | #2
On Tue, Oct 12 2021, Glen Choo wrote:

> Hi everyone! This patch was created in response to something we observed in
> Google, where fsck failed to detect that the commit graph was invalid. We
> initially assumed that fsck never checked the commit graph, but it turns out
> that it does so only when core.commitgraph is set, even though we set defaults
> for "whether to use the commit graph" in the repository settings.
>
> Instead of using the config, let's use repository settings where
> available. Replace core.commitGraph and core.multiPackIndex with their
> equivalent repository settings in fsck and gc.
>
> This re-roll is primarily motivated by the CI failures noted by Junio in [1].
> The underlying cause is that GIT_TEST_COMMIT_GRAPH=1 (enabled in the linux-gcc
> job) causes commands like "git commit" to write commit-graphs, but certain tests
> like t/t0410-partial-clone.sh and t/t3800-mktag.sh intentionally create
> corruptions that cause commit-graphs to become out-of-sync/invalid. Patch 1
> fixes a bug where fsck did not check commit-graphs by default, which means that
> these tests will now fail because they have invalid commit-graphs. The easiest
> solution I found is to disable this confusing and noisy behavior with
> GIT_TEST_COMMIT_GRAPH=0.

That's easy, but...

> And since I am re-rolling, I incorporated Ævar's feedback regarding the commit
> messages (thanks!). I considered combining patches 1 and 2, but patch 1 has a
> grown a little to fix the CI issues, so I've decided to keep patches 1 and 2
> separate.
>
> This series has also seen a healthy amount of interest in test style and
> coverage. We haven't converged on a path for the future, but I'd like to think
> that the tests here are still a step in (approximately) the right direction.
> Hopefully this version LGT us while we figure out what to do with tests :)
>
> [1] https://lore.kernel.org/git/xmqqfstafyox.fsf@gitster.g/
>
> Changes since v3
> * Disable GIT_TEST_COMMIT_GRAPH in tests that intentionally corrupt things in a
>   way that is incompatible with commit-graphs.
> * Make patch 1 and 2's commit messages more concise (thanks Ævar!).

...how isn't disabling those t3800-mktag.sh tests just plasting over
corruption that we're noticing because of your changes to (rightly) fix
the bug where "fsck" wasn't checking the graph at all?

IOW haven't we just found exactly the sort of bug that
"GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
of fixing it we're hiding it?

If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
I see that we fail N number of tests, but all of them are actually
fallout of just this test:

        git replace $head_parent $head && 
        git replace -f $tree $blob 

I.e. we've created a replacement object replacing a tree with a blob, as
part of tests I added to test how mktag handles those sorts of weird
edge cases.

This then causes the graph verify code to throw a hissy fit with:

    root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
    commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
    0fbca9850869684085d654f9e1380c9780802570

I.e. when we wrote the graph we somehow didn't notice that the root tree
node we wrote is to an object that's not actually a tree? Isn't this a
bug where some part of the commit graph writing should be doing its own
extended OID lookup that's replacement-object aware, it didn't, and we
wrote a corrupt graph as a result?

If there is a legitimate reason why we're not just hiding a bug we've
turned up with these fixes let's disable that one test, not the entire
test file.

If you don't run the one test that fails (which is split up into 3
individual pieces) there's still 143 other tests that are run, all of
those presumably benefit from finding future bugs with
GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
just have turned up one just now...
Glen Choo Oct. 12, 2021, 10:29 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> ...how isn't disabling those t3800-mktag.sh tests just plasting over
> corruption that we're noticing because of your changes to (rightly) fix
> the bug where "fsck" wasn't checking the graph at all?
>
> IOW haven't we just found exactly the sort of bug that
> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
> of fixing it we're hiding it?
>
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
>
>         git replace $head_parent $head && 
>         git replace -f $tree $blob 
>
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
>
> This then causes the graph verify code to throw a hissy fit with:
>
>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>     0fbca9850869684085d654f9e1380c9780802570
>
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
>
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.
>
> If you don't run the one test that fails (which is split up into 3
> individual pieces) there's still 143 other tests that are run, all of
> those presumably benefit from finding future bugs with
> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
> just have turned up one just now...

I think this falls on my shoulders. I assumed that the failures were
expected behavior, not bugs. You are right that we shouldn't be
plastering over bugs.

I'll have to ask for help here because I don't know enough about mktag
to distinguish between 'expected' and 'unexpected' failures. The best I
can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
tests. But if that's good enough for now, I'll just do that :)
Derrick Stolee Oct. 13, 2021, 1:12 p.m. UTC | #4
On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
> I see that we fail N number of tests, but all of them are actually
> fallout of just this test:
> 
>         git replace $head_parent $head && 
>         git replace -f $tree $blob 
> 
> I.e. we've created a replacement object replacing a tree with a blob, as
> part of tests I added to test how mktag handles those sorts of weird
> edge cases.
> 
> This then causes the graph verify code to throw a hissy fit with:
> 
>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>     0fbca9850869684085d654f9e1380c9780802570
> 
> I.e. when we wrote the graph we somehow didn't notice that the root tree
> node we wrote is to an object that's not actually a tree? Isn't this a
> bug where some part of the commit graph writing should be doing its own
> extended OID lookup that's replacement-object aware, it didn't, and we
> wrote a corrupt graph as a result?
> 
> If there is a legitimate reason why we're not just hiding a bug we've
> turned up with these fixes let's disable that one test, not the entire
> test file.

The commit-graph should be disabled if replace-objects are enabled. If
there is a bug being introduced here it is because the commit-graph is
being checked during fsck even though it would never be read when the
replace-objects exist.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason Oct. 13, 2021, 3:57 p.m. UTC | #5
On Wed, Oct 13 2021, Derrick Stolee wrote:

> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>> 
>>         git replace $head_parent $head && 
>>         git replace -f $tree $blob 
>> 
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>> 
>> This then causes the graph verify code to throw a hissy fit with:
>> 
>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>     0fbca9850869684085d654f9e1380c9780802570
>> 
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>> 
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>
> The commit-graph should be disabled if replace-objects are enabled. If
> there is a bug being introduced here it is because the commit-graph is
> being checked during fsck even though it would never be read when the
> replace-objects exist.
>
> Thanks,
> -Stolee

Thanks, isn't the obvious fix for this to extend your d6538246d3d
(commit-graph: not compatible with replace objects, 2018-08-20) to do
"read_replace_refs = 0;" in graph_verify()? That works for me on this
case.

I.e. if we ignore replace refs when writing the graph, and we don't use
it if there are any due to commit_graph_compatible(), why would we
consider them when verifying it?
Ævar Arnfjörð Bjarmason Oct. 14, 2021, 3:53 p.m. UTC | #6
On Tue, Oct 12 2021, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> ...how isn't disabling those t3800-mktag.sh tests just plasting over
>> corruption that we're noticing because of your changes to (rightly) fix
>> the bug where "fsck" wasn't checking the graph at all?
>>
>> IOW haven't we just found exactly the sort of bug that
>> "GIT_TEST_COMMIT_GRAPH" is put in place to find for us, but now instead
>> of fixing it we're hiding it?
>>
>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>> I see that we fail N number of tests, but all of them are actually
>> fallout of just this test:
>>
>>         git replace $head_parent $head && 
>>         git replace -f $tree $blob 
>>
>> I.e. we've created a replacement object replacing a tree with a blob, as
>> part of tests I added to test how mktag handles those sorts of weird
>> edge cases.
>>
>> This then causes the graph verify code to throw a hissy fit with:
>>
>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>     0fbca9850869684085d654f9e1380c9780802570
>>
>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>> node we wrote is to an object that's not actually a tree? Isn't this a
>> bug where some part of the commit graph writing should be doing its own
>> extended OID lookup that's replacement-object aware, it didn't, and we
>> wrote a corrupt graph as a result?
>>
>> If there is a legitimate reason why we're not just hiding a bug we've
>> turned up with these fixes let's disable that one test, not the entire
>> test file.
>>
>> If you don't run the one test that fails (which is split up into 3
>> individual pieces) there's still 143 other tests that are run, all of
>> those presumably benefit from finding future bugs with
>> GIT_TEST_COMMIT_GRAPH=true, particularly since the test file seems to
>> just have turned up one just now...
>
> I think this falls on my shoulders. I assumed that the failures were
> expected behavior, not bugs. You are right that we shouldn't be
> plastering over bugs.
>
> I'll have to ask for help here because I don't know enough about mktag
> to distinguish between 'expected' and 'unexpected' failures. The best I
> can do is to add GIT_TEST_COMMIT_GRAPH=0 + NEEDSWORK for the failing
> tests. But if that's good enough for now, I'll just do that :)

I don't think your series needs to be tasked with fixing a generic issue
in the commit-graph you stumbled across.

So for your series I'd think the only required fix would be to narrowlry
skip *just* the broken tests, not the entire test file.

But in this case (see
http://lore.kernel.org/git/87pms8hq4s.fsf@evledraar.gmail.com) the fix
seems rather trivial to me.

I.e. just adding one line to builtin/commit-graph.c, so perhaps you can
see if that works for you and such a "this bug was missed because this
mode didn't work" fix could be part of a re-roll of yours?
Derrick Stolee Oct. 14, 2021, 4:53 p.m. UTC | #7
On 10/13/2021 11:57 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Oct 13 2021, Derrick Stolee wrote:
> 
>> On 10/12/2021 4:34 PM, Ævar Arnfjörð Bjarmason wrote:...
>>> If I comment yout your addition of GIT_TEST_COMMIT_GRAPH=0 in that file
>>> I see that we fail N number of tests, but all of them are actually
>>> fallout of just this test:
>>>
>>>         git replace $head_parent $head && 
>>>         git replace -f $tree $blob 
>>>
>>> I.e. we've created a replacement object replacing a tree with a blob, as
>>> part of tests I added to test how mktag handles those sorts of weird
>>> edge cases.
>>>
>>> This then causes the graph verify code to throw a hissy fit with:
>>>
>>>     root tree OID for commit 0ddfaf193ff13d6ab39b7cbd9eed645e3ee2f050 in
>>>     commit-graph is da5497437fd67ca928333aab79c4b4b55036ea66 !=
>>>     0fbca9850869684085d654f9e1380c9780802570
>>>
>>> I.e. when we wrote the graph we somehow didn't notice that the root tree
>>> node we wrote is to an object that's not actually a tree? Isn't this a
>>> bug where some part of the commit graph writing should be doing its own
>>> extended OID lookup that's replacement-object aware, it didn't, and we
>>> wrote a corrupt graph as a result?
>>>
>>> If there is a legitimate reason why we're not just hiding a bug we've
>>> turned up with these fixes let's disable that one test, not the entire
>>> test file.
>>
>> The commit-graph should be disabled if replace-objects are enabled. If
>> there is a bug being introduced here it is because the commit-graph is
>> being checked during fsck even though it would never be read when the
>> replace-objects exist.
>>
>> Thanks,
>> -Stolee
> 
> Thanks, isn't the obvious fix for this to extend your d6538246d3d
> (commit-graph: not compatible with replace objects, 2018-08-20) to do
> "read_replace_refs = 0;" in graph_verify()? That works for me on this
> case.

Ignoring the replace refs while verifying will allow you to verify the
on-disk commit-graph file without issue.
 
> I.e. if we ignore replace refs when writing the graph, and we don't use
> it if there are any due to commit_graph_compatible(), why would we
> consider them when verifying it?

I generally consider any interaction with the commit-graph while
replace refs are enabled to be the bug. We don't read the commit-graph
when replace refs are on, so why are we verifying it?

But, I understand the desire to check the on-disk content even though
it is not being used, so disabling replace refs to do the verify should
be sufficient.

Thanks,
-Stolee
Glen Choo Oct. 14, 2021, 10:21 p.m. UTC | #8
Derrick Stolee <stolee@gmail.com> writes:

>>> The commit-graph should be disabled if replace-objects are enabled. If
>>> there is a bug being introduced here it is because the commit-graph is
>>> being checked during fsck even though it would never be read when the
>>> replace-objects exist.
>>>
>>> Thanks,
>>> -Stolee
>> 
>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>> case.
>
> Ignoring the replace refs while verifying will allow you to verify the
> on-disk commit-graph file without issue.

It seems like we've converged on doing read_replace_refs = 0 \o/

If we are going to do this twice in graph_verify() and graph_write(), is
there any reason why I shouldn't just do "read_replace_refs = 0" once in
cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
should just ignore replace refs because they're incompatible.
Junio C Hamano Oct. 14, 2021, 10:25 p.m. UTC | #9
Derrick Stolee <stolee@gmail.com> writes:

> I generally consider any interaction with the commit-graph while
> replace refs are enabled to be the bug. We don't read the commit-graph
> when replace refs are on, so why are we verifying it?
>
> But, I understand the desire to check the on-disk content even though
> it is not being used, so disabling replace refs to do the verify should
> be sufficient.

That's quite a sensible argument.  Sounds like we have a way to
solve this that everybody agrees with.

Thanks.
Ævar Arnfjörð Bjarmason Oct. 14, 2021, 11:38 p.m. UTC | #10
On Thu, Oct 14 2021, Glen Choo wrote:

> Derrick Stolee <stolee@gmail.com> writes:
>
>>>> The commit-graph should be disabled if replace-objects are enabled. If
>>>> there is a bug being introduced here it is because the commit-graph is
>>>> being checked during fsck even though it would never be read when the
>>>> replace-objects exist.
>>>>
>>>> Thanks,
>>>> -Stolee
>>> 
>>> Thanks, isn't the obvious fix for this to extend your d6538246d3d
>>> (commit-graph: not compatible with replace objects, 2018-08-20) to do
>>> "read_replace_refs = 0;" in graph_verify()? That works for me on this
>>> case.
>>
>> Ignoring the replace refs while verifying will allow you to verify the
>> on-disk commit-graph file without issue.
>
> It seems like we've converged on doing read_replace_refs = 0 \o/
>
> If we are going to do this twice in graph_verify() and graph_write(), is
> there any reason why I shouldn't just do "read_replace_refs = 0" once in
> cmd_commit_graph()? IOW any time we do anything with commit-graphs, we
> should just ignore replace refs because they're incompatible.

No reason, I think that's the best way to do this.

I've submitted a series to fix that verify issue, as noted in the CL
these patches on top of it without that disabling of the mktag tests
will pass with GIT_TEST_COMMIT_GRAPH=true):
https://lore.kernel.org/git/cover-0.3-00000000000-20211014T233343Z-avarab@gmail.com
Junio C Hamano Oct. 15, 2021, 3:57 p.m. UTC | #11
Junio C Hamano <gitster@pobox.com> writes:

> Derrick Stolee <stolee@gmail.com> writes:
>
>> I generally consider any interaction with the commit-graph while
>> replace refs are enabled to be the bug. We don't read the commit-graph
>> when replace refs are on, so why are we verifying it?
>>
>> But, I understand the desire to check the on-disk content even though
>> it is not being used, so disabling replace refs to do the verify should
>> be sufficient.
>
> That's quite a sensible argument.  Sounds like we have a way to
> solve this that everybody agrees with.

Would it be sensible to silently ignore, though?  Would a warning,
when we see replacement is in use and an explicit request by an end
user is made to make use of or update the commit-graph, be too loud?