Message ID | f5779e19-813c-cda9-2f84-9fe58f745e89@web.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | pack-objects: fix and simplify --filter handling | expand |
René Scharfe <l.s.r@web.de> writes: > This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. > > 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, > 2022-03-28) avoided leaking rev_info allocations in many cases by > calling repo_init_revisions() only when the .filter member was actually > needed, but then still leaking it. That was fixed later by 2108fe4a19 > (revisions API users: add straightforward release_revisions(), > 2022-04-13), making the reverted commit unnecessary. Hmph, with this merged, 'seen' breaks linux-leaks job in a strange way. https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 Does anybody want to help looking into it? Thanks.
On Mon, Nov 28 2022, Junio C Hamano wrote: > René Scharfe <l.s.r@web.de> writes: > >> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >> >> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >> 2022-03-28) avoided leaking rev_info allocations in many cases by >> calling repo_init_revisions() only when the .filter member was actually >> needed, but then still leaking it. That was fixed later by 2108fe4a19 >> (revisions API users: add straightforward release_revisions(), >> 2022-04-13), making the reverted commit unnecessary. > > Hmph, with this merged, 'seen' breaks linux-leaks job in a strange > way. > > https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 > > Does anybody want to help looking into it? It's because the tip of that branch introduces a new leak, but it's only revealed in our test suite by a "git" invocation that we don't check the exit code of. So testing with "GIT_TEST_SANITIZE_LEAK_LOG=true" (which is there to check exactly those edge cases) catches it. René, testing locally with e.g.: GIT_TEST_PASSING_SANITIZE_LEAK=check \ GIT_TEST_SANITIZE_LEAK_LOG=true \ make test Should catch it, but note that we might still have new leaks in tests that are already failing due to other leaks. So diff-ing the resulting t/test-results/*.leak/* would be a good sanity check to see if there's any other leak we've missed. As to the bug itself: This exact edge case is why I went for the lazy-setup part of 5cb28270a1f (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28), and indeed the example mentioned in its commit message now leaks again, but not on master: echo e83c5163316f89bfbde7d9ab23ca2e25604af290 | ./git pack-objects initial Now, maybe I'm missing something, but it the tip of that branch points out a legitimate bug in my 5cb28270a1f. But I don't see why the fix for it needs to be a full revert of 5cb28270a1f. If I replace all the C code changes in that commit with this the test it flipped also passes: diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 573d0b20b76..0de53c5a2df 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4158,8 +4158,10 @@ static struct list_objects_filter_options *po_filter_revs_init(void *value) { struct po_filter_data *data = value; - repo_init_revisions(the_repository, &data->revs, NULL); - data->have_revs = 1; + if (!data->have_revs) { + repo_init_revisions(the_repository, &data->revs, NULL); + data->have_revs = 1; + } return &data->revs.filter; } I.e. the commit is correct that we shouldn't be clobbering the "rev_info" when we see --filter again, but then let's just stop doing that. Well, I know why, it's because René initially started poking at this to rid the codebase of the reliance of C99's J.5.7. I think our two opinions on that are quite clear, so I won't repeat that here (I don't per-se mind, I just think it's a nothingburger). But in any case, if we're going to also refactor this code to get rd of the J.5.7 reliance I'd think we could agree that it really belongs in its own change. That change could then be a pure refactoring change, and wouldn't need to also explain how it's fixing a bug while-at-it.
Am 28.11.2022 um 11:03 schrieb Junio C Hamano: > René Scharfe <l.s.r@web.de> writes: > >> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >> >> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >> 2022-03-28) avoided leaking rev_info allocations in many cases by >> calling repo_init_revisions() only when the .filter member was actually >> needed, but then still leaking it. That was fixed later by 2108fe4a19 >> (revisions API users: add straightforward release_revisions(), >> 2022-04-13), making the reverted commit unnecessary. > > Hmph, with this merged, 'seen' breaks linux-leaks job in a strange > way. > > https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 > > Does anybody want to help looking into it? The patch exposes that release_revisions() leaks the diffopt allocations as we're yet to address the TODO added by 54c8a7c379 (revisions API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). The patch below plugs it locally. --- >8 --- Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() Signed-off-by: René Scharfe <l.s.r@web.de> --- builtin/pack-objects.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 3e74fbb0cd..a47a3f0fba 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) } else { get_object_list(&revs, rp.nr, rp.v); } + diff_free(&revs.diffopt); release_revisions(&revs); cleanup_preferred_base(); if (include_tag && nr_result) -- 2.30.2
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >> René Scharfe <l.s.r@web.de> writes: >> >>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>> >>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>> calling repo_init_revisions() only when the .filter member was actually >>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>> (revisions API users: add straightforward release_revisions(), >>> 2022-04-13), making the reverted commit unnecessary. >> >> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >> way. >> >> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >> >> Does anybody want to help looking into it? [I see we crossed E-Mails]: https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ > The patch exposes that release_revisions() leaks the diffopt allocations > as we're yet to address the TODO added by 54c8a7c379 (revisions API: add > a TODO for diff_free(&revs->diffopt), 2022-04-14). That's correct, and we have that leak in various places in our codebase, but per the above side-thread I think this is primarily exposing that we're setting up the "struct rev_info" with your change when we don't need to. Why can't we just skip it? Yeah, if we do set it up we'll run into an outstanding leak, and that should also be fixed (I have some local patches...), but the other cases I know of where we'll leak that data is where we're actually using the "struct rev_info". I haven't tried tearing your change apart to poke at it myself, and maybe there's some really good reason for why you can't separate getting rid of the J.5.7 dependency and removing the lazy-init. > The patch below plugs it locally. > > --- >8 --- > Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() > > Signed-off-by: René Scharfe <l.s.r@web.de> > --- > builtin/pack-objects.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 3e74fbb0cd..a47a3f0fba 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > } else { > get_object_list(&revs, rp.nr, rp.v); > } > + diff_free(&revs.diffopt); > release_revisions(&revs); > cleanup_preferred_base(); > if (include_tag && nr_result) So, the main motivation for the change was paranoia that a compiler or platform might show up without J.5.7 support and that would bite us, but we're now adding a double-free-in-waiting? I think we're both a bit paranoid, but clearly have different paranoia-priorities :) If we do end up with some hack like this instead of fixing the underlying problem I'd much prefer that such a hack just be an UNLEAK() here. I.e. we have a destructor for "revs.*" already, let's not bypass it and start freeing things from under it, which will result in a double-free if we forget this callsite once the TODO in 54c8a7c379 is addressed. As you'd see if you made release_revisions() simply call diff_free(&revs.diffopt) doing so would reveal some really gnarly edge cases. I haven't dug into this one, but offhand I'm not confident in saying that this isn't exposing us to some aspect of that gnarlyness (maybe not, it's been a while since I looked). (IIRC some of the most gnarly edge cases will only show up as CI failures on Windows, to do with the ordering of when we'll fclose() files hanging off that "diffopt").
On Mon, Nov 28 2022, Ævar Arnfjörð Bjarmason wrote: René: > On Mon, Nov 28 2022, René Scharfe wrote: > >> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>> René Scharfe <l.s.r@web.de> writes: >>> >>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>> >>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>> calling repo_init_revisions() only when the .filter member was actually >>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>> (revisions API users: add straightforward release_revisions(), >>>> 2022-04-13), making the reverted commit unnecessary. >>> >>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>> way. >>> >>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>> >>> Does anybody want to help looking into it? > > [I see we crossed E-Mails]: > https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ > >> The patch exposes that release_revisions() leaks the diffopt allocations >> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >> a TODO for diff_free(&revs->diffopt), 2022-04-14). > > That's correct, and we have that leak in various places in our codebase, > but per the above side-thread I think this is primarily exposing that > we're setting up the "struct rev_info" with your change when we don't > need to. Why can't we just skip it? > > Yeah, if we do set it up we'll run into an outstanding leak, and that > should also be fixed (I have some local patches...), but the other cases > I know of where we'll leak that data is where we're actually using the > "struct rev_info". > > I haven't tried tearing your change apart to poke at it myself, and > maybe there's some really good reason for why you can't separate getting > rid of the J.5.7 dependency and removing the lazy-init. > >> The patch below plugs it locally. >> >> --- >8 --- >> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >> >> Signed-off-by: René Scharfe <l.s.r@web.de> >> --- >> builtin/pack-objects.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >> index 3e74fbb0cd..a47a3f0fba 100644 >> --- a/builtin/pack-objects.c >> +++ b/builtin/pack-objects.c >> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >> } else { >> get_object_list(&revs, rp.nr, rp.v); >> } >> + diff_free(&revs.diffopt); >> release_revisions(&revs); >> cleanup_preferred_base(); >> if (include_tag && nr_result) > > So, the main motivation for the change was paranoia that a compiler or > platform might show up without J.5.7 support and that would bite us, but > we're now adding a double-free-in-waiting? > > I think we're both a bit paranoid, but clearly have different > paranoia-priorities :) > > If we do end up with some hack like this instead of fixing the > underlying problem I'd much prefer that such a hack just be an UNLEAK() > here. > > I.e. we have a destructor for "revs.*" already, let's not bypass it and > start freeing things from under it, which will result in a double-free > if we forget this callsite once the TODO in 54c8a7c379 is addressed. > > As you'd see if you made release_revisions() simply call > diff_free(&revs.diffopt) doing so would reveal some really gnarly edge > cases. > > I haven't dug into this one, but offhand I'm not confident in saying > that this isn't exposing us to some aspect of that gnarlyness (maybe > not, it's been a while since I looked). > > (IIRC some of the most gnarly edge cases will only show up as CI > failures on Windows, to do with the ordering of when we'll fclose() > files hanging off that "diffopt"). This squashed into 3/3 seems to me to be a proper fix to a change that wants to refactor the code for non-J.5.7 compatibility. I.e. this just does the data<->fp casting part of the change, without refactoring the "lazy init". But I think you should check this a bit more carefully. Your 3/3 says that your change "mak[es] the reverted commit unnecessary", but as I noted if you'd run the command that commit shows, you'd have seen you're re-introducing the leak it fixed. So I wonder what else has been missed here. I vaguely recall that one reason I ended up with that J.5.7 dependency was because there was an objection to mocking up the "struct option" as I'm doing here. I.e. here we assume that the opt_parse_list_objects_filter() is only ever going to care about the "value" member. I think that's probably fine, but I may be misrecalling, or missing some crucial detail. I'll leave digging that up & convincing us that it's fine to the person pushing for refactoring all of this :) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 3e74fbb0cd5..faf210bfe8c 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4149,6 +4149,27 @@ static int option_parse_cruft_expiration(const struct option *opt, return 0; } +struct po_filter_data { + unsigned have_revs:1; + struct rev_info revs; +}; + +static int opt_parse_list_objects_filter_init(const struct option *opt, + const char *arg, int unset) +{ + struct po_filter_data *data = opt->value; + struct rev_info *revs = &data->revs; + const struct option opt_rev = { + .value = (void *)&revs->filter, + }; + + if (!data->have_revs) + repo_init_revisions(the_repository, revs, NULL); + data->have_revs = 1; + + return opt_parse_list_objects_filter(&opt_rev, arg, unset); +} + int cmd_pack_objects(int argc, const char **argv, const char *prefix) { int use_internal_rev_list = 0; @@ -4159,7 +4180,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) int rev_list_index = 0; int stdin_packs = 0; struct string_list keep_pack_list = STRING_LIST_INIT_NODUP; - struct rev_info revs; + struct po_filter_data pfd = { .have_revs = 0 }; struct option pack_objects_options[] = { OPT_SET_INT('q', "quiet", &progress, @@ -4250,7 +4271,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) &write_bitmap_index, N_("write a bitmap index if possible"), WRITE_BITMAP_QUIET, PARSE_OPT_HIDDEN), - OPT_PARSE_LIST_OBJECTS_FILTER(&revs.filter), + OPT_CALLBACK(0, "filter", &pfd, N_("args"), N_("object filtering"), + opt_parse_list_objects_filter_init), OPT_CALLBACK_F(0, "missing", NULL, N_("action"), N_("handling for missing objects"), PARSE_OPT_NONEG, option_parse_missing_action), @@ -4269,8 +4291,6 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) read_replace_refs = 0; - repo_init_revisions(the_repository, &revs, NULL); - sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1); if (the_repository->gitdir) { prepare_repo_settings(the_repository); @@ -4372,7 +4392,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) if (!rev_list_all || !rev_list_reflog || !rev_list_index) unpack_unreachable_expiration = 0; - if (revs.filter.choice) { + if (pfd.have_revs && pfd.revs.filter.choice) { if (!pack_to_stdout) die(_("cannot use --filter without --stdout")); if (stdin_packs) @@ -4459,10 +4479,16 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) read_cruft_objects(); } else if (!use_internal_rev_list) { read_object_list_from_stdin(); + } else if (pfd.have_revs) { + get_object_list(&pfd.revs, rp.nr, rp.v); + release_revisions(&pfd.revs); } else { + struct rev_info revs; + + repo_init_revisions(the_repository, &revs, NULL); get_object_list(&revs, rp.nr, rp.v); + release_revisions(&revs); } - release_revisions(&revs); cleanup_preferred_base(); if (include_tag && nr_result) for_each_tag_ref(add_ref_tag, NULL);
Am 28.11.2022 um 12:31 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Nov 28 2022, René Scharfe wrote: > >> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>> René Scharfe <l.s.r@web.de> writes: >>> >>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>> >>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>> calling repo_init_revisions() only when the .filter member was actually >>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>> (revisions API users: add straightforward release_revisions(), >>>> 2022-04-13), making the reverted commit unnecessary. >>> >>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>> way. >>> >>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>> >>> Does anybody want to help looking into it? > > [I see we crossed E-Mails]: > https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ > >> The patch exposes that release_revisions() leaks the diffopt allocations >> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >> a TODO for diff_free(&revs->diffopt), 2022-04-14). > > That's correct, and we have that leak in various places in our codebase, > but per the above side-thread I think this is primarily exposing that > we're setting up the "struct rev_info" with your change when we don't > need to. Why can't we just skip it? I have no idea how to stop get_object_list() from using struct rev_info. We could let it take a struct list_objects_filter_options pointer instead and have it build a struct rev_info internally, but that would just move the problem, not solve it. > Yeah, if we do set it up we'll run into an outstanding leak, and that > should also be fixed (I have some local patches...), but the other cases > I know of where we'll leak that data is where we're actually using the > "struct rev_info". > > I haven't tried tearing your change apart to poke at it myself, and > maybe there's some really good reason for why you can't separate getting > rid of the J.5.7 dependency and removing the lazy-init. > >> The patch below plugs it locally. >> >> --- >8 --- >> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >> >> Signed-off-by: René Scharfe <l.s.r@web.de> >> --- >> builtin/pack-objects.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >> index 3e74fbb0cd..a47a3f0fba 100644 >> --- a/builtin/pack-objects.c >> +++ b/builtin/pack-objects.c >> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >> } else { >> get_object_list(&revs, rp.nr, rp.v); >> } >> + diff_free(&revs.diffopt); >> release_revisions(&revs); >> cleanup_preferred_base(); >> if (include_tag && nr_result) > > So, the main motivation for the change was paranoia that a compiler or > platform might show up without J.5.7 support and that would bite us, but > we're now adding a double-free-in-waiting? > > I think we're both a bit paranoid, but clearly have different > paranoia-priorities :) > > If we do end up with some hack like this instead of fixing the > underlying problem I'd much prefer that such a hack just be an UNLEAK() > here. > > I.e. we have a destructor for "revs.*" already, let's not bypass it and > start freeing things from under it, which will result in a double-free > if we forget this callsite once the TODO in 54c8a7c379 is addressed. Well, that TODO fix should remove this new diff_free() call, but I agree that this is fragile. Removing the "TEST_PASSES_SANITIZE_LEAK=true" line from affected tests is probably better. > As you'd see if you made release_revisions() simply call > diff_free(&revs.diffopt) doing so would reveal some really gnarly edge > cases. That was my first attempt; it breaks lots of tests due to double frees. > I haven't dug into this one, but offhand I'm not confident in saying > that this isn't exposing us to some aspect of that gnarlyness (maybe > not, it's been a while since I looked). I saw it as the way towards a release_revisions() that calls diff_free() itself: Add such calls to each of them, fix the "gnarlyness" individually, finally move them all into release_revisions(). The only problem is that there are 60+ callsites. > (IIRC some of the most gnarly edge cases will only show up as CI > failures on Windows, to do with the ordering of when we'll fclose() > files hanging off that "diffopt"). Fun. René
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 12:31 schrieb Ævar Arnfjörð Bjarmason: >> >> On Mon, Nov 28 2022, René Scharfe wrote: >> >>> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>>> René Scharfe <l.s.r@web.de> writes: >>>> >>>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>>> >>>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>>> calling repo_init_revisions() only when the .filter member was actually >>>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>>> (revisions API users: add straightforward release_revisions(), >>>>> 2022-04-13), making the reverted commit unnecessary. >>>> >>>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>>> way. >>>> >>>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>>> >>>> Does anybody want to help looking into it? >> >> [I see we crossed E-Mails]: >> https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ >> >>> The patch exposes that release_revisions() leaks the diffopt allocations >>> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >>> a TODO for diff_free(&revs->diffopt), 2022-04-14). >> >> That's correct, and we have that leak in various places in our codebase, >> but per the above side-thread I think this is primarily exposing that >> we're setting up the "struct rev_info" with your change when we don't >> need to. Why can't we just skip it? > > I have no idea how to stop get_object_list() from using struct rev_info. > We could let it take a struct list_objects_filter_options pointer > instead and have it build a struct rev_info internally, but that would > just move the problem, not solve it. I mean skip it when it's not needed, it's needed when we call get_object_list(). But what "problem" is being caused by get_object_list()? That there's some other case(s) where it'll leak still? I haven't checked, I think we should leave that for some other time if there's such leaks, and just not introduce any new leaks in this topic. >> Yeah, if we do set it up we'll run into an outstanding leak, and that >> should also be fixed (I have some local patches...), but the other cases >> I know of where we'll leak that data is where we're actually using the >> "struct rev_info". >> >> I haven't tried tearing your change apart to poke at it myself, and >> maybe there's some really good reason for why you can't separate getting >> rid of the J.5.7 dependency and removing the lazy-init. >> >>> The patch below plugs it locally. >>> >>> --- >8 --- >>> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >>> >>> Signed-off-by: René Scharfe <l.s.r@web.de> >>> --- >>> builtin/pack-objects.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >>> index 3e74fbb0cd..a47a3f0fba 100644 >>> --- a/builtin/pack-objects.c >>> +++ b/builtin/pack-objects.c >>> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >>> } else { >>> get_object_list(&revs, rp.nr, rp.v); >>> } >>> + diff_free(&revs.diffopt); >>> release_revisions(&revs); >>> cleanup_preferred_base(); >>> if (include_tag && nr_result) >> >> So, the main motivation for the change was paranoia that a compiler or >> platform might show up without J.5.7 support and that would bite us, but >> we're now adding a double-free-in-waiting? >> >> I think we're both a bit paranoid, but clearly have different >> paranoia-priorities :) >> >> If we do end up with some hack like this instead of fixing the >> underlying problem I'd much prefer that such a hack just be an UNLEAK() >> here. >> >> I.e. we have a destructor for "revs.*" already, let's not bypass it and >> start freeing things from under it, which will result in a double-free >> if we forget this callsite once the TODO in 54c8a7c379 is addressed. > > Well, that TODO fix should remove this new diff_free() call, but I > agree that this is fragile. > > Removing the "TEST_PASSES_SANITIZE_LEAK=true" line from affected tests > is probably better. Or just not introduce new leaks, per my suggested fix-up at https://lore.kernel.org/git/221128.86zgcbl0pe.gmgdl@evledraar.gmail.com/ (which it looks like you haven't seen when this E-Mail is composed...). >> As you'd see if you made release_revisions() simply call >> diff_free(&revs.diffopt) doing so would reveal some really gnarly edge >> cases. > > That was my first attempt; it breaks lots of tests due to double frees. Right, to be clear I'm saying that none of this is needed right now, i.e. I don't get why we'd want the scope-creep past the hunk I noted in https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ for the --filter bug fix (plus the tests you're adding). >> I haven't dug into this one, but offhand I'm not confident in saying >> that this isn't exposing us to some aspect of that gnarlyness (maybe >> not, it's been a while since I looked). > > I saw it as the way towards a release_revisions() that calls diff_free() > itself: Add such calls to each of them, fix the "gnarlyness" > individually, finally move them all into release_revisions(). The only > problem is that there are 60+ callsites. I think this is a really bad approach in general. Yes, it may happen to work to free() some data from under an API, but it's just as likely that we'll miss that this one caller is screwing with its internal state, and e.g. when some new revision.c code is used it'll go boom. If we wanted to phase in such a free() of "foo" I think the right way would be to add some "revs.free_foo = 1" flag, giving the API a chance to treat that sanely, not to start poking at members of the struct, and assuming that its release() won't be free()-ing them. But as noted above & in the linked I think we can defer all of that. The only reason we're discussing this is because you're changing the lazy-init to be not-lazy, and introducing new leaks as a result. I've shown a couple of approaches in this thread of fixing the issue(s) at hand without introducing such leaks, so ...
Am 28.11.2022 um 13:24 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Nov 28 2022, Ævar Arnfjörð Bjarmason wrote: > > René: > >> On Mon, Nov 28 2022, René Scharfe wrote: >> >>> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>>> René Scharfe <l.s.r@web.de> writes: >>>> >>>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>>> >>>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>>> calling repo_init_revisions() only when the .filter member was actually >>>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>>> (revisions API users: add straightforward release_revisions(), >>>>> 2022-04-13), making the reverted commit unnecessary. >>>> >>>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>>> way. >>>> >>>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>>> >>>> Does anybody want to help looking into it? >> >> [I see we crossed E-Mails]: >> https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ >> >>> The patch exposes that release_revisions() leaks the diffopt allocations >>> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >>> a TODO for diff_free(&revs->diffopt), 2022-04-14). >> >> That's correct, and we have that leak in various places in our codebase, >> but per the above side-thread I think this is primarily exposing that >> we're setting up the "struct rev_info" with your change when we don't >> need to. Why can't we just skip it? >> >> Yeah, if we do set it up we'll run into an outstanding leak, and that >> should also be fixed (I have some local patches...), but the other cases >> I know of where we'll leak that data is where we're actually using the >> "struct rev_info". >> >> I haven't tried tearing your change apart to poke at it myself, and >> maybe there's some really good reason for why you can't separate getting >> rid of the J.5.7 dependency and removing the lazy-init. >> >>> The patch below plugs it locally. >>> >>> --- >8 --- >>> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >>> >>> Signed-off-by: René Scharfe <l.s.r@web.de> >>> --- >>> builtin/pack-objects.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >>> index 3e74fbb0cd..a47a3f0fba 100644 >>> --- a/builtin/pack-objects.c >>> +++ b/builtin/pack-objects.c >>> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >>> } else { >>> get_object_list(&revs, rp.nr, rp.v); >>> } >>> + diff_free(&revs.diffopt); >>> release_revisions(&revs); >>> cleanup_preferred_base(); >>> if (include_tag && nr_result) >> >> So, the main motivation for the change was paranoia that a compiler or >> platform might show up without J.5.7 support and that would bite us, but >> we're now adding a double-free-in-waiting? >> >> I think we're both a bit paranoid, but clearly have different >> paranoia-priorities :) >> >> If we do end up with some hack like this instead of fixing the >> underlying problem I'd much prefer that such a hack just be an UNLEAK() >> here. >> >> I.e. we have a destructor for "revs.*" already, let's not bypass it and >> start freeing things from under it, which will result in a double-free >> if we forget this callsite once the TODO in 54c8a7c379 is addressed. >> >> As you'd see if you made release_revisions() simply call >> diff_free(&revs.diffopt) doing so would reveal some really gnarly edge >> cases. >> >> I haven't dug into this one, but offhand I'm not confident in saying >> that this isn't exposing us to some aspect of that gnarlyness (maybe >> not, it's been a while since I looked). >> >> (IIRC some of the most gnarly edge cases will only show up as CI >> failures on Windows, to do with the ordering of when we'll fclose() >> files hanging off that "diffopt"). > > This squashed into 3/3 seems to me to be a proper fix to a change that > wants to refactor the code for non-J.5.7 compatibility. I.e. this just > does the data<->fp casting part of the change, without refactoring the > "lazy init". That works, but lazy code is more complicated and there is no benefit here -- eager allocations are not noticably slow or big. Laziness hides leaks in corners, i.e. requiring invocations with uncommon options to trigger them. > > But I think you should check this a bit more carefully. Your 3/3 says > that your change "mak[es] the reverted commit unnecessary" No, it says that _your_ change 2108fe4a19 (revisions API users: add straightforward release_revisions(), 2022-04-13) made it unnecessary. > , but as I > noted if you'd run the command that commit shows, you'd have seen you're > re-introducing the leak it fixed. So I wonder what else has been missed > here. 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28) did not plug the leak. It only moved it to the corner that handles the --filter option. That leak is only interesting to Git developers and harmless for users. But if the goal is to become free of trivial leaks in order to allow using tools like LeakSanitizer to find real ones then pushing them into the shadows not yet reached by our test coverage won't help for long. > I vaguely recall that one reason I ended up with that J.5.7 dependency > was because there was an objection to mocking up the "struct option" as > I'm doing here. I.e. here we assume that the > opt_parse_list_objects_filter() is only ever going to care about the > "value" member. It's probably fine, but unnecessarily complicated compared to calling repo_init_revisions() eagerly. > > I think that's probably fine, but I may be misrecalling, or missing some > crucial detail. I'll leave digging that up & convincing us that it's > fine to the person pushing for refactoring all of this :) > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 3e74fbb0cd5..faf210bfe8c 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -4149,6 +4149,27 @@ static int option_parse_cruft_expiration(const struct option *opt, > return 0; > } > > +struct po_filter_data { > + unsigned have_revs:1; > + struct rev_info revs; > +}; > + > +static int opt_parse_list_objects_filter_init(const struct option *opt, > + const char *arg, int unset) > +{ > + struct po_filter_data *data = opt->value; > + struct rev_info *revs = &data->revs; > + const struct option opt_rev = { > + .value = (void *)&revs->filter, > + }; > + > + if (!data->have_revs) > + repo_init_revisions(the_repository, revs, NULL); > + data->have_revs = 1; > + > + return opt_parse_list_objects_filter(&opt_rev, arg, unset); > +} > + > int cmd_pack_objects(int argc, const char **argv, const char *prefix) > { > int use_internal_rev_list = 0; > @@ -4159,7 +4180,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > int rev_list_index = 0; > int stdin_packs = 0; > struct string_list keep_pack_list = STRING_LIST_INIT_NODUP; > - struct rev_info revs; > + struct po_filter_data pfd = { .have_revs = 0 }; > > struct option pack_objects_options[] = { > OPT_SET_INT('q', "quiet", &progress, > @@ -4250,7 +4271,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > &write_bitmap_index, > N_("write a bitmap index if possible"), > WRITE_BITMAP_QUIET, PARSE_OPT_HIDDEN), > - OPT_PARSE_LIST_OBJECTS_FILTER(&revs.filter), > + OPT_CALLBACK(0, "filter", &pfd, N_("args"), N_("object filtering"), > + opt_parse_list_objects_filter_init), > OPT_CALLBACK_F(0, "missing", NULL, N_("action"), > N_("handling for missing objects"), PARSE_OPT_NONEG, > option_parse_missing_action), > @@ -4269,8 +4291,6 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > > read_replace_refs = 0; > > - repo_init_revisions(the_repository, &revs, NULL); > - > sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1); > if (the_repository->gitdir) { > prepare_repo_settings(the_repository); > @@ -4372,7 +4392,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > if (!rev_list_all || !rev_list_reflog || !rev_list_index) > unpack_unreachable_expiration = 0; > > - if (revs.filter.choice) { > + if (pfd.have_revs && pfd.revs.filter.choice) { > if (!pack_to_stdout) > die(_("cannot use --filter without --stdout")); > if (stdin_packs) > @@ -4459,10 +4479,16 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) > read_cruft_objects(); > } else if (!use_internal_rev_list) { > read_object_list_from_stdin(); > + } else if (pfd.have_revs) { > + get_object_list(&pfd.revs, rp.nr, rp.v); > + release_revisions(&pfd.revs); > } else { > + struct rev_info revs; > + > + repo_init_revisions(the_repository, &revs, NULL); > get_object_list(&revs, rp.nr, rp.v); > + release_revisions(&revs); > } > - release_revisions(&revs); > cleanup_preferred_base(); > if (include_tag && nr_result) > for_each_tag_ref(add_ref_tag, NULL);
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 13:24 schrieb Ævar Arnfjörð Bjarmason: >> >> On Mon, Nov 28 2022, Ævar Arnfjörð Bjarmason wrote: >> >> René: >> >>> On Mon, Nov 28 2022, René Scharfe wrote: >>> >>>> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>>>> René Scharfe <l.s.r@web.de> writes: >>>>> >>>>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>>>> >>>>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>>>> calling repo_init_revisions() only when the .filter member was actually >>>>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>>>> (revisions API users: add straightforward release_revisions(), >>>>>> 2022-04-13), making the reverted commit unnecessary. >>>>> >>>>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>>>> way. >>>>> >>>>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>>>> >>>>> Does anybody want to help looking into it? >>> >>> [I see we crossed E-Mails]: >>> https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ >>> >>>> The patch exposes that release_revisions() leaks the diffopt allocations >>>> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >>>> a TODO for diff_free(&revs->diffopt), 2022-04-14). >>> >>> That's correct, and we have that leak in various places in our codebase, >>> but per the above side-thread I think this is primarily exposing that >>> we're setting up the "struct rev_info" with your change when we don't >>> need to. Why can't we just skip it? >>> >>> Yeah, if we do set it up we'll run into an outstanding leak, and that >>> should also be fixed (I have some local patches...), but the other cases >>> I know of where we'll leak that data is where we're actually using the >>> "struct rev_info". >>> >>> I haven't tried tearing your change apart to poke at it myself, and >>> maybe there's some really good reason for why you can't separate getting >>> rid of the J.5.7 dependency and removing the lazy-init. >>> >>>> The patch below plugs it locally. >>>> >>>> --- >8 --- >>>> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >>>> >>>> Signed-off-by: René Scharfe <l.s.r@web.de> >>>> --- >>>> builtin/pack-objects.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >>>> index 3e74fbb0cd..a47a3f0fba 100644 >>>> --- a/builtin/pack-objects.c >>>> +++ b/builtin/pack-objects.c >>>> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >>>> } else { >>>> get_object_list(&revs, rp.nr, rp.v); >>>> } >>>> + diff_free(&revs.diffopt); >>>> release_revisions(&revs); >>>> cleanup_preferred_base(); >>>> if (include_tag && nr_result) >>> >>> So, the main motivation for the change was paranoia that a compiler or >>> platform might show up without J.5.7 support and that would bite us, but >>> we're now adding a double-free-in-waiting? >>> >>> I think we're both a bit paranoid, but clearly have different >>> paranoia-priorities :) >>> >>> If we do end up with some hack like this instead of fixing the >>> underlying problem I'd much prefer that such a hack just be an UNLEAK() >>> here. >>> >>> I.e. we have a destructor for "revs.*" already, let's not bypass it and >>> start freeing things from under it, which will result in a double-free >>> if we forget this callsite once the TODO in 54c8a7c379 is addressed. >>> >>> As you'd see if you made release_revisions() simply call >>> diff_free(&revs.diffopt) doing so would reveal some really gnarly edge >>> cases. >>> >>> I haven't dug into this one, but offhand I'm not confident in saying >>> that this isn't exposing us to some aspect of that gnarlyness (maybe >>> not, it's been a while since I looked). >>> >>> (IIRC some of the most gnarly edge cases will only show up as CI >>> failures on Windows, to do with the ordering of when we'll fclose() >>> files hanging off that "diffopt"). >> >> This squashed into 3/3 seems to me to be a proper fix to a change that >> wants to refactor the code for non-J.5.7 compatibility. I.e. this just >> does the data<->fp casting part of the change, without refactoring the >> "lazy init". > > That works, but lazy code is more complicated and there is no benefit > here -- eager allocations are not noticably slow or big. Laziness > hides leaks in corners, i.e. requiring invocations with uncommon > options to trigger them. Yes, sometimes it's easier to just set everything up at the beginning. As for hiding leaks I think the empirical data here is going against that, i.e. your change introduced a leak. I don't think it's realistic that we'll have the side that assigns to "have_revs" drift from the corresponding code in cmd_pack_objects(). >> But I think you should check this a bit more carefully. Your 3/3 says >> that your change "mak[es] the reverted commit unnecessary" > > No, it says that _your_ change 2108fe4a19 (revisions API users: add > straightforward release_revisions(), 2022-04-13) made it unnecessary. Yes, I'm saying that's not correct, because if you run the command that 5cb28270a1 prominently notes we'll now leak with this revert: echo e83c5163316f89bfbde7d9ab23ca2e25604af290 | ./git pack-objects initial But yes with just 5cb28270a1 didn't add release_revisions(), that came shortly afterwards in 2108fe4a19. >> , but as I >> noted if you'd run the command that commit shows, you'd have seen you're >> re-introducing the leak it fixed. So I wonder what else has been missed >> here. > > 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, > 2022-03-28) did not plug the leak. It only moved it to the corner that > handles the --filter option. I think we're using "the leak" here differently. I mean callstacks that LeakSanitizer emits & tests we have that do & don't pass with SANITIZE=leak. But yes, there may be multiple paths through a function, some of which leak, some of which don't. I'm not saying that the entire set of API features that builtin/pack-objects.c uses in the revision API is leak-free. > That leak is only interesting to Git developers and harmless for users. > But if the goal is to become free of trivial leaks in order to allow > using tools like LeakSanitizer to find real ones then pushing them into > the shadows not yet reached by our test coverage won't help for long. It's clearly helping in this case, as our CI had multiple failing tests. >> I vaguely recall that one reason I ended up with that J.5.7 dependency >> was because there was an objection to mocking up the "struct option" as >> I'm doing here. I.e. here we assume that the >> opt_parse_list_objects_filter() is only ever going to care about the >> "value" member. > > It's probably fine, but unnecessarily complicated compared to calling > repo_init_revisions() eagerly. I'm leaving aside the question of whether we should go for some version of the refactoring in your 3/3. What I am saying is that such refactoring should be split up from the more narrow bug fix to the existing code. I.e. this as a replacement for your 3/3 is all that's needed to pass the test you're adding in 2/3. -- >8 -- From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= <l.s.r@web.de> Subject: [PATCH] pack-objects: support multiple --filter options again 5cb28270a1f (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28) broke support for multiple --filter options by calling repo_init_revisions() every time "--filter" was seen. Instead we should only do so the first time, and subsequently append to the existing filter data. Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/pack-objects.c | 3 ++- t/t5317-pack-objects-filter-objects.sh | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 573d0b20b76..c702c09dd45 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4158,7 +4158,8 @@ static struct list_objects_filter_options *po_filter_revs_init(void *value) { struct po_filter_data *data = value; - repo_init_revisions(the_repository, &data->revs, NULL); + if (!data->have_revs) + repo_init_revisions(the_repository, &data->revs, NULL); data->have_revs = 1; return &data->revs.filter; diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 25faebaada8..5b707d911b5 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -265,7 +265,7 @@ test_expect_success 'verify normal and blob:limit packfiles have same commits/tr test_cmp expected observed ' -test_expect_failure 'verify small limit and big limit results in small limit' ' +test_expect_success 'verify small limit and big limit results in small limit' ' git -C r2 ls-files -s large.1000 >ls_files_result && test_parse_ls_files_stage_oids <ls_files_result | sort >expected &&
Am 28.11.2022 um 15:34 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Nov 28 2022, René Scharfe wrote: > >> Am 28.11.2022 um 12:31 schrieb Ævar Arnfjörð Bjarmason: >>> >>> On Mon, Nov 28 2022, René Scharfe wrote: >>> >>>> Am 28.11.2022 um 11:03 schrieb Junio C Hamano: >>>>> René Scharfe <l.s.r@web.de> writes: >>>>> >>>>>> This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. >>>>>> >>>>>> 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, >>>>>> 2022-03-28) avoided leaking rev_info allocations in many cases by >>>>>> calling repo_init_revisions() only when the .filter member was actually >>>>>> needed, but then still leaking it. That was fixed later by 2108fe4a19 >>>>>> (revisions API users: add straightforward release_revisions(), >>>>>> 2022-04-13), making the reverted commit unnecessary. >>>>> >>>>> Hmph, with this merged, 'seen' breaks linux-leaks job in a strange >>>>> way. >>>>> >>>>> https://github.com/git/git/actions/runs/3563546608/jobs/5986458300#step:5:3917 >>>>> >>>>> Does anybody want to help looking into it? >>> >>> [I see we crossed E-Mails]: >>> https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ >>> >>>> The patch exposes that release_revisions() leaks the diffopt allocations >>>> as we're yet to address the TODO added by 54c8a7c379 (revisions API: add >>>> a TODO for diff_free(&revs->diffopt), 2022-04-14). >>> >>> That's correct, and we have that leak in various places in our codebase, >>> but per the above side-thread I think this is primarily exposing that >>> we're setting up the "struct rev_info" with your change when we don't >>> need to. Why can't we just skip it? >> >> I have no idea how to stop get_object_list() from using struct rev_info. >> We could let it take a struct list_objects_filter_options pointer >> instead and have it build a struct rev_info internally, but that would >> just move the problem, not solve it. > > I mean skip it when it's not needed, it's needed when we call > get_object_list(). > > But what "problem" is being caused by get_object_list()? That there's > some other case(s) where it'll leak still? I haven't checked, I think we > should leave that for some other time if there's such leaks, and just > not introduce any new leaks in this topic. The problem is "How to use struct rev_info without leaks?". No matter where you move it, the leak will be present until the TODO in release_revisions() is done. > >>> Yeah, if we do set it up we'll run into an outstanding leak, and that >>> should also be fixed (I have some local patches...), but the other cases >>> I know of where we'll leak that data is where we're actually using the >>> "struct rev_info". >>> >>> I haven't tried tearing your change apart to poke at it myself, and >>> maybe there's some really good reason for why you can't separate getting >>> rid of the J.5.7 dependency and removing the lazy-init. >>> >>>> The patch below plugs it locally. >>>> >>>> --- >8 --- >>>> Subject: [PATCH 4/3] fixup! revision: free diffopt in release_revisions() >>>> >>>> Signed-off-by: René Scharfe <l.s.r@web.de> >>>> --- >>>> builtin/pack-objects.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >>>> index 3e74fbb0cd..a47a3f0fba 100644 >>>> --- a/builtin/pack-objects.c >>>> +++ b/builtin/pack-objects.c >>>> @@ -4462,6 +4462,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) >>>> } else { >>>> get_object_list(&revs, rp.nr, rp.v); >>>> } >>>> + diff_free(&revs.diffopt); >>>> release_revisions(&revs); >>>> cleanup_preferred_base(); >>>> if (include_tag && nr_result) >>> >>> So, the main motivation for the change was paranoia that a compiler or >>> platform might show up without J.5.7 support and that would bite us, but >>> we're now adding a double-free-in-waiting? >>> >>> I think we're both a bit paranoid, but clearly have different >>> paranoia-priorities :) >>> >>> If we do end up with some hack like this instead of fixing the >>> underlying problem I'd much prefer that such a hack just be an UNLEAK() >>> here. >>> >>> I.e. we have a destructor for "revs.*" already, let's not bypass it and >>> start freeing things from under it, which will result in a double-free >>> if we forget this callsite once the TODO in 54c8a7c379 is addressed. >> >> Well, that TODO fix should remove this new diff_free() call, but I >> agree that this is fragile. >> >> Removing the "TEST_PASSES_SANITIZE_LEAK=true" line from affected tests >> is probably better. > > Or just not introduce new leaks, per my suggested fix-up at > https://lore.kernel.org/git/221128.86zgcbl0pe.gmgdl@evledraar.gmail.com/ > (which it looks like you haven't seen when this E-Mail is composed...). Not adding leaks is a good idea. AFAICS none of my patches so far add any. Patch 3 of v2 exposes an existing one that was only triggered by the --filter option before. Which is also not ideal, of course, but giving it more visibility hopefully will motivate a proper fix. >>> As you'd see if you made release_revisions() simply call >>> diff_free(&revs.diffopt) doing so would reveal some really gnarly edge >>> cases. >> >> That was my first attempt; it breaks lots of tests due to double frees. > > Right, to be clear I'm saying that none of this is needed right now, > i.e. I don't get why we'd want the scope-creep past the hunk I noted in > https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ > for the --filter bug fix (plus the tests you're adding). Well, you asked to squash the minimal fix into the laziness removal in https://lore.kernel.org/git/221112.86bkpcmm6i.gmgdl@evledraar.gmail.com/ Reverting 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28) wholesale is the simplest way to reach the goals of regression fix, simplification and standard compliance. Except that the leak check tests have come to depend on the leak being hidden in the --filter corner. So going back to v1 sure seems attractive. > >>> I haven't dug into this one, but offhand I'm not confident in saying >>> that this isn't exposing us to some aspect of that gnarlyness (maybe >>> not, it's been a while since I looked). >> >> I saw it as the way towards a release_revisions() that calls diff_free() >> itself: Add such calls to each of them, fix the "gnarlyness" >> individually, finally move them all into release_revisions(). The only >> problem is that there are 60+ callsites. > > I think this is a really bad approach in general. > > Yes, it may happen to work to free() some data from under an API, but > it's just as likely that we'll miss that this one caller is screwing > with its internal state, and e.g. when some new revision.c code is used > it'll go boom. > > If we wanted to phase in such a free() of "foo" I think the right way > would be to add some "revs.free_foo = 1" flag, giving the API a chance > to treat that sanely, not to start poking at members of the struct, and > assuming that its release() won't be free()-ing them. And that's why you added no_free to struct diff_options. We could use it here by setting it in repo_init_revisions() and unsetting in cmd_pack_objects() and elsewhere, until it is set everywhere. > But as noted above & in the linked I think we can defer all of that. The > only reason we're discussing this is because you're changing the > lazy-init to be not-lazy, and introducing new leaks as a result.> > I've shown a couple of approaches in this thread of fixing the issue(s) > at hand without introducing such leaks, so ... As noted above: These leaks are not new, they are just moved into test coverage. René
Am 28.11.2022 um 16:56 schrieb René Scharfe:> > The problem is "How to use struct rev_info without leaks?". No matter > where you move it, the leak will be present until the TODO in > release_revisions() is done. Wrong. The following sequence leaks: struct rev_info revs; repo_init_revisions(the_repository, &revs, NULL); release_revisions(&revs); ... and this here doesn't: struct rev_info revs; repo_init_revisions(the_repository, &revs, NULL); setup_revisions(0, NULL, &revs, NULL); // leak plugger release_revisions(&revs); That's because setup_revisions() calls diff_setup_done(), which frees revs->diffopt.parseopts, and release_revisions() doesn't. And since builtin/pack-objects.c::get_object_list() calls setup_revisions(), it really frees that memory, as you claimed from the start. Sorry, I was somehow assuming that a setup function wouldn't clean up. D'oh! The first sequence is used in some other places. e.g. builtin/prune.c. But there LeakSanitizer doesn't complain for some reason; Valgrind reports the parseopts allocation as "possibly lost". I still think the assumption that "init_x(x); release_x(x);" doesn't leak is reasonable. Let's make it true. How about this? It's safe in the sense that we don't risk double frees and it's close to the TODO comment so we probably won't forget removing it once diff_free() becomes used. --- revision.c | 1 + 1 file changed, 1 insertion(+) diff --git a/revision.c b/revision.c index 439e34a7c5..6a51ef9418 100644 --- a/revision.c +++ b/revision.c @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) release_revisions_mailmap(revs->mailmap); free_grep_patterns(&revs->grep_filter); /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ + FREE_AND_NULL(revs->diffopt.parseopts); diff_free(&revs->pruning); reflog_walk_info_release(revs->reflog_info); release_revisions_topo_walk_info(revs->topo_walk_info); -- 2.30.2
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 15:34 schrieb Ævar Arnfjörð Bjarmason: > [...] >> I mean skip it when it's not needed, it's needed when we call >> get_object_list(). >> >> But what "problem" is being caused by get_object_list()? That there's >> some other case(s) where it'll leak still? I haven't checked, I think we >> should leave that for some other time if there's such leaks, and just >> not introduce any new leaks in this topic. > > The problem is "How to use struct rev_info without leaks?". No matter > where you move it, the leak will be present until the TODO in > release_revisions() is done. No, the problem at hand is that "--filter" doesn't accept N arguments, which is easily fixed. Junio then tested the patch, and noted CI failing. Yes we have various outstanding issues with the revisions API etc. leaking, but the scope-creep of conflating those questions with a regression fix doesn't seem necessary. >>> [...] >>> Well, that TODO fix should remove this new diff_free() call, but I >>> agree that this is fragile. >>> >>> Removing the "TEST_PASSES_SANITIZE_LEAK=true" line from affected tests >>> is probably better. >> >> Or just not introduce new leaks, per my suggested fix-up at >> https://lore.kernel.org/git/221128.86zgcbl0pe.gmgdl@evledraar.gmail.com/ >> (which it looks like you haven't seen when this E-Mail is composed...). > > Not adding leaks is a good idea. AFAICS none of my patches so far add > any. I don't see where debating what constitutes a "new" or "added" memory leak is going to get us. But to clarify I'm not saying you're responsible for the revisions/filter APIs you're running into, and some of those are existing leaks. I'm also saying that if you run: make SANITIZE=leak test T=... For some values of T=... (and the whole test suite with "GIT_TEST_PASSING_SANITIZE_LEAK=true") your 2/3 passes certain tests, and your 3/3 doesn't. > Patch 3 of v2 exposes an existing one that was only triggered by > the --filter option before. Which is also not ideal, of course, but > giving it more visibility hopefully will motivate a proper fix. We try not to break CI. Only finding out that existing CI breaks once a patch is queued & Junio & others start testing it to hunt down issues isn't a good use of anyone's time. So, maybe you think this area of CI is useless etc., but again, if you're submitting a change to advocate for that can we peel that away from a regression we can fix relatively easily? >>>> As you'd see if you made release_revisions() simply call >>>> diff_free(&revs.diffopt) doing so would reveal some really gnarly edge >>>> cases. >>> >>> That was my first attempt; it breaks lots of tests due to double frees. >> >> Right, to be clear I'm saying that none of this is needed right now, >> i.e. I don't get why we'd want the scope-creep past the hunk I noted in >> https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ >> for the --filter bug fix (plus the tests you're adding). > > Well, you asked to squash the minimal fix into the laziness removal in > https://lore.kernel.org/git/221112.86bkpcmm6i.gmgdl@evledraar.gmail.com/ The second part of the relevant request being so that we could regression test the memory leak(s). So, thanks for trying to address that in some way, but proceeding to break linux-leaks, and then arguing that that's OK seems like an odd way to go about that... > Reverting 5cb28270a1 (pack-objects: lazily set up "struct rev_info", > don't leak, 2022-03-28) wholesale is the simplest way to reach the goals > of regression fix, simplification and standard compliance. Except that > the leak check tests have come to depend on the leak being hidden in the > --filter corner. So going back to v1 sure seems attractive. Yeah. Honestly I'd mostly forgotten about the v1 review, and only skimmed things then. So, I've just noticed that the "squash" I'm suggesting is basically equivalent to your v1 now. But yeah I think it's fine to just skip testing the regression fix with SANITIZE=leak for now, we can do it some other time. I just brought it up in the v1 feedback because that patch prominently notes the leak fix, so if we can get regression test it relatively easily that seemed like a good change to make. >>> [...] >>> I saw it as the way towards a release_revisions() that calls diff_free() >>> itself: Add such calls to each of them, fix the "gnarlyness" >>> individually, finally move them all into release_revisions(). The only >>> problem is that there are 60+ callsites. >> >> I think this is a really bad approach in general. >> >> Yes, it may happen to work to free() some data from under an API, but >> it's just as likely that we'll miss that this one caller is screwing >> with its internal state, and e.g. when some new revision.c code is used >> it'll go boom. >> >> If we wanted to phase in such a free() of "foo" I think the right way >> would be to add some "revs.free_foo = 1" flag, giving the API a chance >> to treat that sanely, not to start poking at members of the struct, and >> assuming that its release() won't be free()-ing them. > > And that's why you added no_free to struct diff_options. We could use > it here by setting it in repo_init_revisions() and unsetting in > cmd_pack_objects() and elsewhere, until it is set everywhere. FWIW I think the "no_free" isn't all that good of an approach in retrospect either, but at least it's (mostly) self-contained in the struct that owns it. But it's rather messy, because some of the time it's embedded in a "struct rev_info" that should really own it, and sometimes now. At the time release_revisions() didn't exist yet, so making some forward progress with the diff leaks was easiest with the "no_free". >> But as noted above & in the linked I think we can defer all of that. The >> only reason we're discussing this is because you're changing the >> lazy-init to be not-lazy, and introducing new leaks as a result.> >> I've shown a couple of approaches in this thread of fixing the issue(s) >> at hand without introducing such leaks, so ... > > As noted above: These leaks are not new, they are just moved into > test coverage. Right, and in some cases we should definitely just un-mark a test as being tested under linux-leaks to make forward progress. But in this case I just don't see how that's a good trade-off. We can fix the --filter bug without that, and yes we'll have some new/existing leaks, but those are all contained in tests that are not tested by linux-leaks now. Whereas the larger rewrite to make the init non-lazy would lose us test coverage.
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 16:56 schrieb René Scharfe:> >> The problem is "How to use struct rev_info without leaks?". No matter >> where you move it, the leak will be present until the TODO in >> release_revisions() is done. > > Wrong. The following sequence leaks: > > struct rev_info revs; > repo_init_revisions(the_repository, &revs, NULL); > release_revisions(&revs); > > ... and this here doesn't: > > struct rev_info revs; > repo_init_revisions(the_repository, &revs, NULL); > setup_revisions(0, NULL, &revs, NULL); // leak plugger > release_revisions(&revs); > > That's because setup_revisions() calls diff_setup_done(), which frees > revs->diffopt.parseopts, and release_revisions() doesn't. > > And since builtin/pack-objects.c::get_object_list() calls > setup_revisions(), it really frees that memory, as you claimed from the > start. Sorry, I was somehow assuming that a setup function wouldn't > clean up. D'oh! > > The first sequence is used in some other places. e.g. builtin/prune.c. > But there LeakSanitizer doesn't complain for some reason; Valgrind > reports the parseopts allocation as "possibly lost". Yes, some of the interactions are tricky. It's really useful to run the tests with GIT_TEST_PASSING_SANITIZE_LEAK=[true|check] (see t/README) to check these sorts of assumptions for sanity. > I still think the assumption that "init_x(x); release_x(x);" doesn't > leak is reasonable. Let's make it true. How about this? It's safe > in the sense that we don't risk double frees and it's close to the > TODO comment so we probably won't forget removing it once diff_free() > becomes used. > > --- > revision.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/revision.c b/revision.c > index 439e34a7c5..6a51ef9418 100644 > --- a/revision.c > +++ b/revision.c > @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) > release_revisions_mailmap(revs->mailmap); > free_grep_patterns(&revs->grep_filter); > /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ > + FREE_AND_NULL(revs->diffopt.parseopts); > diff_free(&revs->pruning); > reflog_walk_info_release(revs->reflog_info); > release_revisions_topo_walk_info(revs->topo_walk_info); At this point I'm unclear on what & why this is needed? I.e. once we narrowly fix the >1 "--filter" options what still fails? But in general: I don't really think this sort of thing is worth it. Here we're reaching into a member of "revs->diffopt" behind its back rather than calling diff_free(). I think we should just focus on being able to do do that safely. WIP patches I have in that direction, partially based on your previous "is_dead" suggestion: https://github.com/avar/git/commit/e02a15f6206 https://github.com/avar/git/commit/c718f36566a I haven't poked at that in a while, I think the only outstanding issue with it is that fclose() interaction. I think for this particular thing there aren't going to be any bad side-effects in practice, but I also think convincing oneself of that basically means putting the same amount of work in as just fixing some of these properly.
Am 28.11.2022 um 19:32 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Nov 28 2022, René Scharfe wrote: > >> Am 28.11.2022 um 16:56 schrieb René Scharfe:> >>> The problem is "How to use struct rev_info without leaks?". No matter >>> where you move it, the leak will be present until the TODO in >>> release_revisions() is done. >> >> Wrong. The following sequence leaks: >> >> struct rev_info revs; >> repo_init_revisions(the_repository, &revs, NULL); >> release_revisions(&revs); >> >> ... and this here doesn't: >> >> struct rev_info revs; >> repo_init_revisions(the_repository, &revs, NULL); >> setup_revisions(0, NULL, &revs, NULL); // leak plugger >> release_revisions(&revs); >> >> That's because setup_revisions() calls diff_setup_done(), which frees >> revs->diffopt.parseopts, and release_revisions() doesn't. >> >> And since builtin/pack-objects.c::get_object_list() calls >> setup_revisions(), it really frees that memory, as you claimed from the >> start. Sorry, I was somehow assuming that a setup function wouldn't >> clean up. D'oh! >> >> The first sequence is used in some other places. e.g. builtin/prune.c. >> But there LeakSanitizer doesn't complain for some reason; Valgrind >> reports the parseopts allocation as "possibly lost". > > Yes, some of the interactions are tricky. It's really useful to run the > tests with GIT_TEST_PASSING_SANITIZE_LEAK=[true|check] (see t/README) to > check these sorts of assumptions for sanity. That may be true, and looks even useful -- I didn't know the check value. I only get a strange error message, though: $ GIT_TEST_PASSING_SANITIZE_LEAK=check ./t0001-init.sh Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak Same with make test and prove, of course. And of course I compiled with SANITIZE=leak beforehand. But I don't see a connection between my comment and yours. I was not running any tests, just the above sequences of function calls, e.g. in git prune. > >> I still think the assumption that "init_x(x); release_x(x);" doesn't >> leak is reasonable. Let's make it true. How about this? It's safe >> in the sense that we don't risk double frees and it's close to the >> TODO comment so we probably won't forget removing it once diff_free() >> becomes used. >> >> --- >> revision.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/revision.c b/revision.c >> index 439e34a7c5..6a51ef9418 100644 >> --- a/revision.c >> +++ b/revision.c >> @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) >> release_revisions_mailmap(revs->mailmap); >> free_grep_patterns(&revs->grep_filter); >> /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ >> + FREE_AND_NULL(revs->diffopt.parseopts); >> diff_free(&revs->pruning); >> reflog_walk_info_release(revs->reflog_info); >> release_revisions_topo_walk_info(revs->topo_walk_info); > > At this point I'm unclear on what & why this is needed? I.e. once we > narrowly fix the >1 "--filter" options what still fails? As I wrote: A call to an initialization function followed by a call to a cleanup function and nothing else shouldn't leak. There are examples of repo_init_revisions()+release_revisions() without setup_revisions() or diff_setup_done() beyond pack-objects. I mentioned prune, but there are more, e.g. in sequencer.c. > But in general: I don't really think this sort of thing is worth > it. Here we're reaching into a member of "revs->diffopt" behind its back > rather than calling diff_free(). I think we should just focus on being > able to do do that safely. Sure, but the FREE_AND_NULL call is simple and safe, while diff_free() is complicated and calling it one time too many can hurt. > WIP patches I have in that direction, partially based on your previous > "is_dead" suggestion: > > https://github.com/avar/git/commit/e02a15f6206 > https://github.com/avar/git/commit/c718f36566a Copy-typed the interesting parts of the first patch like a medieval monk because there doesn't seem to be a download option. :-| > I haven't poked at that in a while, I think the only outstanding issue > with it is that fclose() interaction. You mean the t3702-add-edit.sh failure on Windows mentioned in the commit message of e02a15f6206? That's caused by the file being kept open and thus locked during the call of the editor. Moving the release_revisions() call in builtin/add.c::edit_patch() before the launch_editor() call fixes that by closing the file. > I think for this particular thing there aren't going to be any bad > side-effects in practice, but I also think convincing oneself of that > basically means putting the same amount of work in as just fixing some > of these properly. Not to me, but perhaps that TODO is easier solved that I expected. In any case, with the mentioned edit_patch() change described above e02a15f6206 passes the test suite on Windows for me. René
On Mon, Nov 28, 2022 at 10:57:02PM +0100, René Scharfe wrote: > >> diff --git a/revision.c b/revision.c > >> index 439e34a7c5..6a51ef9418 100644 > >> --- a/revision.c > >> +++ b/revision.c > >> @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) > >> release_revisions_mailmap(revs->mailmap); > >> free_grep_patterns(&revs->grep_filter); > >> /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ > >> + FREE_AND_NULL(revs->diffopt.parseopts); > >> diff_free(&revs->pruning); > >> reflog_walk_info_release(revs->reflog_info); > >> release_revisions_topo_walk_info(revs->topo_walk_info); > > > > At this point I'm unclear on what & why this is needed? I.e. once we > > narrowly fix the >1 "--filter" options what still fails? > > As I wrote: A call to an initialization function followed by a call to a > cleanup function and nothing else shouldn't leak. There are examples of > repo_init_revisions()+release_revisions() without setup_revisions() or > diff_setup_done() beyond pack-objects. I mentioned prune, but there are > more, e.g. in sequencer.c. I tend to agree with you; an init+release combo should release all memory. We _can_ work around it in the caller here, but I think we are better to correct the root cause. I do think what Ævar is saying is: once we have fixed the bug, why are more changes needed? I.e., why would we get rid of the lazy-init. IMHO the answer is that the lazy-init is a more complex pattern, and requires more boilerplate code (which can lead to more bugs, as we saw). So once the bug is fixed, this is purely about cleanup/simplification (if one ignores the C-standard issues), but I tend to think it is one worth doing. (Apologies to Ævar if I'm mis-stating your position). > > But in general: I don't really think this sort of thing is worth > > it. Here we're reaching into a member of "revs->diffopt" behind its back > > rather than calling diff_free(). I think we should just focus on being > > able to do do that safely. > > Sure, but the FREE_AND_NULL call is simple and safe, while diff_free() > is complicated and calling it one time too many can hurt. Again, I'd agree with you here. In the long run, yes, we want diff_free(). But if that is not ready yet, the FREE_AND_NULL() is a stepping stone that takes us in the right direction and which we can do immediately. > > WIP patches I have in that direction, partially based on your previous > > "is_dead" suggestion: > > > > https://github.com/avar/git/commit/e02a15f6206 > > https://github.com/avar/git/commit/c718f36566a > > Copy-typed the interesting parts of the first patch like a medieval monk > because there doesn't seem to be a download option. :-| This is the actual reason I responded to your message. ;) Try: https://github.com/avar/git/commit/e02a15f6206.patch etc. I don't think there's a "raw patch" link or similar in the interface, though. You have to just know about it. -Peff
Jeff King <peff@peff.net> writes: > This is the actual reason I responded to your message. ;) Try: > > https://github.com/avar/git/commit/e02a15f6206.patch > > etc. I don't think there's a "raw patch" link or similar in the > interface, though. You have to just know about it. ;-)
On Mon, Nov 28 2022, René Scharfe wrote: > Am 28.11.2022 um 19:32 schrieb Ævar Arnfjörð Bjarmason: >> >> On Mon, Nov 28 2022, René Scharfe wrote: >> >>> Am 28.11.2022 um 16:56 schrieb René Scharfe:> >>>> The problem is "How to use struct rev_info without leaks?". No matter >>>> where you move it, the leak will be present until the TODO in >>>> release_revisions() is done. >>> >>> Wrong. The following sequence leaks: >>> >>> struct rev_info revs; >>> repo_init_revisions(the_repository, &revs, NULL); >>> release_revisions(&revs); >>> >>> ... and this here doesn't: >>> >>> struct rev_info revs; >>> repo_init_revisions(the_repository, &revs, NULL); >>> setup_revisions(0, NULL, &revs, NULL); // leak plugger >>> release_revisions(&revs); >>> >>> That's because setup_revisions() calls diff_setup_done(), which frees >>> revs->diffopt.parseopts, and release_revisions() doesn't. >>> >>> And since builtin/pack-objects.c::get_object_list() calls >>> setup_revisions(), it really frees that memory, as you claimed from the >>> start. Sorry, I was somehow assuming that a setup function wouldn't >>> clean up. D'oh! >>> >>> The first sequence is used in some other places. e.g. builtin/prune.c. >>> But there LeakSanitizer doesn't complain for some reason; Valgrind >>> reports the parseopts allocation as "possibly lost". >> >> Yes, some of the interactions are tricky. It's really useful to run the >> tests with GIT_TEST_PASSING_SANITIZE_LEAK=[true|check] (see t/README) to >> check these sorts of assumptions for sanity. > > That may be true, and looks even useful -- I didn't know the check > value. I only get a strange error message, though: > > $ GIT_TEST_PASSING_SANITIZE_LEAK=check ./t0001-init.sh > Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak > > Same with make test and prove, of course. And of course I compiled > with SANITIZE=leak beforehand. The "=true" part of the message is unfortunately incorrect (it pre-dates "check" being a possible value), but I don't see how you could have compiled with "SANITIZE=leak" and get that message. It's unreachable if 'test -n "$SANITIZE_LEAK"', and that'll be non-empty in GIT-BUILD-OPTIONS if compiled with it. Perhaps you gave SANITIZE=leak to t/Makefile, not the top-level Makefile? Try this at the top-level: GIT_TEST_PASSING_SANITIZE_LEAK=check make SANITIZE= test T=t0001-init.sh > But I don't see a connection between my comment and yours. I was > not running any tests, just the above sequences of function calls, > e.g. in git prune. The connection is that you submitted patches that fail the CI. I don't think you use GitHub, so in particular if you're submitting patches that claim to do something with leak checking it's useful to run those modes against them to check your assumptions. >> >>> I still think the assumption that "init_x(x); release_x(x);" doesn't >>> leak is reasonable. Let's make it true. How about this? It's safe >>> in the sense that we don't risk double frees and it's close to the >>> TODO comment so we probably won't forget removing it once diff_free() >>> becomes used. >>> >>> --- >>> revision.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/revision.c b/revision.c >>> index 439e34a7c5..6a51ef9418 100644 >>> --- a/revision.c >>> +++ b/revision.c >>> @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) >>> release_revisions_mailmap(revs->mailmap); >>> free_grep_patterns(&revs->grep_filter); >>> /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ >>> + FREE_AND_NULL(revs->diffopt.parseopts); >>> diff_free(&revs->pruning); >>> reflog_walk_info_release(revs->reflog_info); >>> release_revisions_topo_walk_info(revs->topo_walk_info); >> >> At this point I'm unclear on what & why this is needed? I.e. once we >> narrowly fix the >1 "--filter" options what still fails? > > As I wrote: A call to an initialization function followed by a call to a > cleanup function and nothing else shouldn't leak. There are examples of > repo_init_revisions()+release_revisions() without setup_revisions() or > diff_setup_done() beyond pack-objects. I mentioned prune, but there are > more, e.g. in sequencer.c. Yes, I agree it shouldn't leak. And we should definitely fix those leaks. I just don't see why a series fixing bugs in --filter needs to expand the scope to fix those. >> But in general: I don't really think this sort of thing is worth >> it. Here we're reaching into a member of "revs->diffopt" behind its back >> rather than calling diff_free(). I think we should just focus on being >> able to do do that safely. > > Sure, but the FREE_AND_NULL call is simple and safe, while diff_free() > is complicated and calling it one time too many can hurt. It's "safe" because you've read the internals of it, and know that it isn't assuming a non-NULL there once it's past initialization? Or is it like the revisions init()+release() in this thread, where you're assuming it works one way based on the function names etc., only for the CI to fail? In either case, I'm saying that if someone's confident enough to reach into the internals of a structure and tweak it they should be confident enough to just patch diff_free() or the like. >> WIP patches I have in that direction, partially based on your previous >> "is_dead" suggestion: >> >> https://github.com/avar/git/commit/e02a15f6206 >> https://github.com/avar/git/commit/c718f36566a > > Copy-typed the interesting parts of the first patch like a medieval monk > because there doesn't seem to be a download option. :-| Jeff pointed out the ".patch" (there's also ".diff"), but also: Git has this well-known transport protocol it uses, which typically maps to the web URL on public hosting sites ... :) git remote add avar https://github.com/avar/git.git git fetch avar git show <OID> >> I haven't poked at that in a while, I think the only outstanding issue >> with it is that fclose() interaction. > > You mean the t3702-add-edit.sh failure on Windows mentioned in the > commit message of e02a15f6206? That's caused by the file being kept > open and thus locked during the call of the editor. Moving the > release_revisions() call in builtin/add.c::edit_patch() before the > launch_editor() call fixes that by closing the file. Yeah, I haven't had time to poke at it in a while, and I had it queued behind various other leak fixes. I'm sure it's not that complicated in the end, just that that interaction needs to be dealt with. >> I think for this particular thing there aren't going to be any bad >> side-effects in practice, but I also think convincing oneself of that >> basically means putting the same amount of work in as just fixing some >> of these properly. > > Not to me, but perhaps that TODO is easier solved that I expected. > In any case, with the mentioned edit_patch() change described above > e02a15f6206 passes the test suite on Windows for me. Nice! I'll try that out.
On Mon, Nov 28 2022, Jeff King wrote: > On Mon, Nov 28, 2022 at 10:57:02PM +0100, René Scharfe wrote: > >> >> diff --git a/revision.c b/revision.c >> >> index 439e34a7c5..6a51ef9418 100644 >> >> --- a/revision.c >> >> +++ b/revision.c >> >> @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs) >> >> release_revisions_mailmap(revs->mailmap); >> >> free_grep_patterns(&revs->grep_filter); >> >> /* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */ >> >> + FREE_AND_NULL(revs->diffopt.parseopts); >> >> diff_free(&revs->pruning); >> >> reflog_walk_info_release(revs->reflog_info); >> >> release_revisions_topo_walk_info(revs->topo_walk_info); >> > >> > At this point I'm unclear on what & why this is needed? I.e. once we >> > narrowly fix the >1 "--filter" options what still fails? >> >> As I wrote: A call to an initialization function followed by a call to a >> cleanup function and nothing else shouldn't leak. There are examples of >> repo_init_revisions()+release_revisions() without setup_revisions() or >> diff_setup_done() beyond pack-objects. I mentioned prune, but there are >> more, e.g. in sequencer.c. > > I tend to agree with you; an init+release combo should release all > memory. We _can_ work around it in the caller here, but I think we are > better to correct the root cause. > > I do think what Ævar is saying is: once we have fixed the bug, why are > more changes needed? I.e., why would we get rid of the lazy-init. IMHO > the answer is that the lazy-init is a more complex pattern, and requires > more boilerplate code (which can lead to more bugs, as we saw). So once > the bug is fixed, this is purely about cleanup/simplification (if one > ignores the C-standard issues), but I tend to think it is one worth > doing. > > (Apologies to Ævar if I'm mis-stating your position). My position isn't that it isn't worth doing, but that we can clearly proceed in smaller steps here, and changing one thing at a time is better. So far in this thread we've had, in relation to 3/3 (and I may have lost track of some points made): 1. It's fixing the bug where you provide --filter N times 2. It's refactoring the "lazy init" to "non-lazy init" 3. It's refactoring the code to avoid relying on the J.5.7 extension in C99. 4. Due to #2 we run into more leaks 5. A dozen or so test files fail due to #2. Are we digressing to fix the root cause of the leak(s), or un-marking those tests as passing leak-free & losing test coverage? As https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/ shows we can do #1 as a stand-alone change. I'm not saying the rest of this isn't worth pursuing, except to point out in https://lore.kernel.org/git/221128.86zgcbl0pe.gmgdl@evledraar.gmail.com/ that #2 can be split off from #3.
Am 29.11.22 um 08:12 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Nov 28 2022, René Scharfe wrote: > >> That may be true, and looks even useful -- I didn't know the check >> value. I only get a strange error message, though: >> >> $ GIT_TEST_PASSING_SANITIZE_LEAK=check ./t0001-init.sh >> Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak >> >> Same with make test and prove, of course. And of course I compiled >> with SANITIZE=leak beforehand. > > The "=true" part of the message is unfortunately incorrect (it pre-dates > "check" being a possible value), but I don't see how you could have > compiled with "SANITIZE=leak" and get that message. > > It's unreachable if 'test -n "$SANITIZE_LEAK"', and that'll be non-empty > in GIT-BUILD-OPTIONS if compiled with it. Perhaps you gave SANITIZE=leak > to t/Makefile, not the top-level Makefile? > > Try this at the top-level: > > GIT_TEST_PASSING_SANITIZE_LEAK=check make SANITIZE= test T=t0001-init.sh It works today, no idea what I did yesterday. Did reboot and make clean in the meantime, shell history is inconclusive. *shrug* >> As I wrote: A call to an initialization function followed by a call to a >> cleanup function and nothing else shouldn't leak. There are examples of >> repo_init_revisions()+release_revisions() without setup_revisions() or >> diff_setup_done() beyond pack-objects. I mentioned prune, but there are >> more, e.g. in sequencer.c. > > Yes, I agree it shouldn't leak. And we should definitely fix those > leaks. I just don't see why a series fixing bugs in --filter needs to > expand the scope to fix those. The connection is that this is the very leak that 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28) plugged locally. Took me a while to see that. Anyway, I'm also not keen on scope creep. >>> But in general: I don't really think this sort of thing is worth >>> it. Here we're reaching into a member of "revs->diffopt" behind its back >>> rather than calling diff_free(). I think we should just focus on being >>> able to do do that safely. >> >> Sure, but the FREE_AND_NULL call is simple and safe, while diff_free() >> is complicated and calling it one time too many can hurt. > > It's "safe" because you've read the internals of it, and know that it > isn't assuming a non-NULL there once it's past initialization? > > Or is it like the revisions init()+release() in this thread, where > you're assuming it works one way based on the function names etc., only > for the CI to fail? Ouch. > In either case, I'm saying that if someone's confident enough to reach > into the internals of a structure and tweak it they should be confident > enough to just patch diff_free() or the like. diff_free() is more complicated; it does that FREE_AND_NULL plus several things that are not idempotent. >>> WIP patches I have in that direction, partially based on your previous >>> "is_dead" suggestion: >>> >>> https://github.com/avar/git/commit/e02a15f6206 >>> https://github.com/avar/git/commit/c718f36566a >> >> Copy-typed the interesting parts of the first patch like a medieval monk >> because there doesn't seem to be a download option. :-| > > Jeff pointed out the ".patch" (there's also ".diff"), but also: Git has > this well-known transport protocol it uses, which typically maps to the > web URL on public hosting sites ... :) > > git remote add avar https://github.com/avar/git.git > git fetch avar > git show <OID> Yes, I probably should have downloaded everything like that. Just did it for the heck of it and got 43.37 MiB at 598.00 KiB/s. Typing wasn't that much slower. René
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 573d0b20b7..3e74fbb0cd 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4149,21 +4149,6 @@ static int option_parse_cruft_expiration(const struct option *opt, return 0; } -struct po_filter_data { - unsigned have_revs:1; - struct rev_info revs; -}; - -static struct list_objects_filter_options *po_filter_revs_init(void *value) -{ - struct po_filter_data *data = value; - - repo_init_revisions(the_repository, &data->revs, NULL); - data->have_revs = 1; - - return &data->revs.filter; -} - int cmd_pack_objects(int argc, const char **argv, const char *prefix) { int use_internal_rev_list = 0; @@ -4174,7 +4159,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) int rev_list_index = 0; int stdin_packs = 0; struct string_list keep_pack_list = STRING_LIST_INIT_NODUP; - struct po_filter_data pfd = { .have_revs = 0 }; + struct rev_info revs; struct option pack_objects_options[] = { OPT_SET_INT('q', "quiet", &progress, @@ -4265,7 +4250,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) &write_bitmap_index, N_("write a bitmap index if possible"), WRITE_BITMAP_QUIET, PARSE_OPT_HIDDEN), - OPT_PARSE_LIST_OBJECTS_FILTER_INIT(&pfd, po_filter_revs_init), + OPT_PARSE_LIST_OBJECTS_FILTER(&revs.filter), OPT_CALLBACK_F(0, "missing", NULL, N_("action"), N_("handling for missing objects"), PARSE_OPT_NONEG, option_parse_missing_action), @@ -4284,6 +4269,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) read_replace_refs = 0; + repo_init_revisions(the_repository, &revs, NULL); + sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1); if (the_repository->gitdir) { prepare_repo_settings(the_repository); @@ -4385,7 +4372,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) if (!rev_list_all || !rev_list_reflog || !rev_list_index) unpack_unreachable_expiration = 0; - if (pfd.have_revs && pfd.revs.filter.choice) { + if (revs.filter.choice) { if (!pack_to_stdout) die(_("cannot use --filter without --stdout")); if (stdin_packs) @@ -4472,16 +4459,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) read_cruft_objects(); } else if (!use_internal_rev_list) { read_object_list_from_stdin(); - } else if (pfd.have_revs) { - get_object_list(&pfd.revs, rp.nr, rp.v); - release_revisions(&pfd.revs); } else { - struct rev_info revs; - - repo_init_revisions(the_repository, &revs, NULL); get_object_list(&revs, rp.nr, rp.v); - release_revisions(&revs); } + release_revisions(&revs); cleanup_preferred_base(); if (include_tag && nr_result) for_each_tag_ref(add_ref_tag, NULL); diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 5339660238..ee01bcd2cc 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -290,10 +290,6 @@ int opt_parse_list_objects_filter(const struct option *opt, const char *arg, int unset) { struct list_objects_filter_options *filter_options = opt->value; - opt_lof_init init = (opt_lof_init)opt->defval; - - if (init) - filter_options = init(opt->value); if (unset || !arg) list_objects_filter_set_no_filter(filter_options); diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index 7eeadab2dd..64af2e29e2 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -107,31 +107,13 @@ void parse_list_objects_filter( struct list_objects_filter_options *filter_options, const char *arg); -/** - * The opt->value to opt_parse_list_objects_filter() is either a - * "struct list_objects_filter_option *" when using - * OPT_PARSE_LIST_OBJECTS_FILTER(). - * - * Or, if using no "struct option" field is used by the callback, - * except the "defval" which is expected to be an "opt_lof_init" - * function, which is called with the "opt->value" and must return a - * pointer to the ""struct list_objects_filter_option *" to be used. - * - * The OPT_PARSE_LIST_OBJECTS_FILTER_INIT() can be used e.g. the - * "struct list_objects_filter_option" is embedded in a "struct - * rev_info", which the "defval" could be tasked with lazily - * initializing. See cmd_pack_objects() for an example. - */ int opt_parse_list_objects_filter(const struct option *opt, const char *arg, int unset); -typedef struct list_objects_filter_options *(*opt_lof_init)(void *); -#define OPT_PARSE_LIST_OBJECTS_FILTER_INIT(fo, init) \ - { OPTION_CALLBACK, 0, "filter", (fo), N_("args"), \ - N_("object filtering"), 0, opt_parse_list_objects_filter, \ - (intptr_t)(init) } #define OPT_PARSE_LIST_OBJECTS_FILTER(fo) \ - OPT_PARSE_LIST_OBJECTS_FILTER_INIT((fo), NULL) + OPT_CALLBACK(0, "filter", fo, N_("args"), \ + N_("object filtering"), \ + opt_parse_list_objects_filter) /* * Translates abbreviated numbers in the filter's filter_spec into their diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 25faebaada..5b707d911b 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -265,7 +265,7 @@ test_expect_success 'verify normal and blob:limit packfiles have same commits/tr test_cmp expected observed ' -test_expect_failure 'verify small limit and big limit results in small limit' ' +test_expect_success 'verify small limit and big limit results in small limit' ' git -C r2 ls-files -s large.1000 >ls_files_result && test_parse_ls_files_stage_oids <ls_files_result | sort >expected &&
This reverts commit 5cb28270a1ff94a0a23e67b479bbbec3bc993518. 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28) avoided leaking rev_info allocations in many cases by calling repo_init_revisions() only when the .filter member was actually needed, but then still leaking it. That was fixed later by 2108fe4a19 (revisions API users: add straightforward release_revisions(), 2022-04-13), making the reverted commit unnecessary. 5cb28270a1 broke support for multiple --filter options by calling repo_init_revisions() for each one, clobbering earlier state and leaking rev_info allocations. Reverting it fixes that regression. Luckily it only affects users that run pack-objects directly. Internal callers in bundle.c and upload-pack.c combine multiple filters into a single option using "+". 5cb28270a1 introduced OPT_PARSE_LIST_OBJECTS_FILTER_INIT. It relies on being able to faithfully convert function pointers to object pointers and back, which is undefined in C99. The standard mentions this ability as a common extension in its annex J (J.5.7 Function pointer casts), but avoiding that dependency is more portable. The macro hasn't been used since, OPT_PARSE_LIST_OBJECTS_FILTER is enough so far. So revert 5cb28270a1 fully to fix support for multiple --filter options and avoid reliance on undefined behavior. Its intent is better served by the release_revisions() call added by 2108fe4a19 alone -- we just need to do it unconditionally, mirroring the unconditional call of repo_init_revisions(). Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> --- builtin/pack-objects.c | 31 +++++--------------------- list-objects-filter-options.c | 4 ---- list-objects-filter-options.h | 24 +++----------------- t/t5317-pack-objects-filter-objects.sh | 2 +- 4 files changed, 10 insertions(+), 51 deletions(-) -- 2.38.1