Message ID | 20220415231342.35980-3-carenas@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9c539d1027df2a0f6e45b24d518e90d3baf1e7a5 |
Series | ci: avoid failures for pedantic job with fedora 36 |
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes:

> This provides a "no code change needed" option to the "fix" currently
> queued as part of ab/http-gcc-12-workaround and therefore should be
> reverted once that gets merged.
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  config.mak.dev | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/config.mak.dev b/config.mak.dev
> index 335efd46203..c3104f400b2 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -68,6 +68,7 @@ endif
>  # https://bugzilla.redhat.com/show_bug.cgi?id=2075786
>  ifneq ($(filter gcc12,$(COMPILER_FEATURES)),)
>  DEVELOPER_CFLAGS += -Wno-error=stringop-overread
> +DEVELOPER_CFLAGS += -Wno-error=dangling-pointer
>  endif
>
>  GIT_TEST_PERL_FATAL_WARNINGS = YesPlease

Hmph, this might be an acceptable workaround to squelch the compiler
that complains "you stored an onstack pointer in a structure and
then you are leaving the scope".

We should do something more like the attached patch, with or without
the gcc warning, I think.  We may have smuggled the pointer to
finished in slot->finished pointer that survives the current
stackframe out of the function in the original code, so that is what
we rectify by clearing the member when it has the value we stored.

 http.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git c/http.c w/http.c
index 229da4d148..85437b1980 100644
--- c/http.c
+++ w/http.c
@@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
+
+	if (slot->finished == &finished)
+		slot->finished = NULL;
 }
 
 static void release_active_slot(struct active_request_slot *slot)

On Fri, Apr 15, 2022 at 4:34 PM Junio C Hamano <gitster@pobox.com> wrote:
> diff --git c/http.c w/http.c
> index 229da4d148..85437b1980 100644
> --- c/http.c
> +++ w/http.c
> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}
> +
> +	if (slot->finished == &finished)
> +		slot->finished = NULL;
>  }
>
>  static void release_active_slot(struct active_request_slot *slot)

this would be IMHO a better fix than the one currently queued in
ab/http-gcc-12-workaround and indeed squashes the warning with the gcc
12 version that is likely to be released with Fedora 36, but notice
that it was proposed before[1] and apparently didn't work with the
version of the compiler that Ævar was using at that time, as
documented in the commit message.

Either way, my hope is (assuming this series will go earlier than the
other one), that a revert to this commit is included at the end of
ab/http-gcc-12-workaround, instead of adding any code changes to this
series.

Carlo

[1] https://lore.kernel.org/git/xmqq8rv2nggn.fsf@gitster.g/

Carlo Arenas <carenas@gmail.com> writes:

> this would be IMHO a better fix than the one currently queued in
> ab/http-gcc-12-workaround and indeed squashes the warning with the gcc
> 12 version that is likely to be released with Fedora 36,

That is an excellent news, as ...

> but notice
> that it was proposed before[1] and apparently didn't work with the
> version of the compiler that Ævar was using at that time, as
> documented in the commit message.

... I was the one who suggested it, and I remember that it didn't
work for Ævar back then.  If the problem with the version Ævar had
is no longer there, that does sound like a good thing.  We can take
the patch you posted and then post release we can apply the "clear
the .finished member as we are done with the slot" fix, which is a
good hygiene regardless of any compiler warning issue.

> Either way, my hope is (assuming this series will go earlier than the
> other one), that a revert to this commit is included at the end of
> ab/http-gcc-12-workaround, instead of adding any code changes to this
> series.

At this point, my inclination is to merge these two DEVELOPER_CFLAGS
changes before the 2.36 final gets tagged.

Thanks.

On Fri, Apr 15, 2022 at 5:28 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Carlo Arenas <carenas@gmail.com> writes:
>
> > this would be IMHO a better fix than the one currently queued in
> > ab/http-gcc-12-workaround and indeed squashes the warning with the gcc
> > 12 version that is likely to be released with Fedora 36,
>
> That is an excellent news, as ...

Sadly I botched the test, and was doubly confused because Ævar might
have botched it too the same way as he originally reported it
worked[1] for him, only to say the opposite in the commit message for
the reroll.

The warning is not squashed even if we do the more aggressive BUG if
not NULL first, and now I am even doubting there was a real bug to
begin with.

Neither this one nor the previous one had bugs reported to gcc AFAIK,
until I raised[2] the previous one with Fedora, so there is no
confirmation either from their side that they are indeed bugs yet.

Carlo

[1] https://lore.kernel.org/git/220127.86mtjhdeme.gmgdl@evledraar.gmail.com/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2075786

Junio C Hamano <gitster@pobox.com> writes:

> ...  We can take
> the patch you posted and then post release we can apply the "clear
> the .finished member as we are done with the slot" fix, which is a
> good hygiene regardless of any compiler warning issue.
> ...
> At this point, my inclination is to merge these two DEVELOPER_CFLAGS
> changes before the 2.36 final gets tagged.

So, the post release longer term clean-up with log message may look
like this.

----- >8 --------- >8 --------- >8 --------- >8 -----
Subject: [PATCH] http.c: clear the 'finished' member once we are done with it

In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.

Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request).  The loop terminating condition mistakenly
thought that the original request has yet to be completed.

Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed.  It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.

One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns.

In finish_active_slot(), clear the finished member after it is used
to signal the run_active_slot() caller, because we know we are done
with the pointer at that point.

Also, because compilers may not be able to follow the callchain that
deep from run_active_slot() down to finish_active_slot(), clear the
finished member but make sure to limit it to the case where the
pointer still points at the on-stack variable of ours (the pointer
may be set to point at the on-stack variable of somebody else after
the slot gets reused, in which case we do not want to touch it).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 229da4d148..626b4051e1 100644
--- a/http.c
+++ b/http.c
@@ -197,8 +197,10 @@ static void finish_active_slot(struct active_request_slot *slot)
 	closedown_active_slot(slot);
 	curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);
 
-	if (slot->finished != NULL)
-		(*slot->finished) = 1;
+	if (slot->finished != NULL) {
+		*slot->finished = 1;
+		slot->finished = NULL;
+	}
 
 	/* Store slot results so they can be read after the slot is reused */
 	if (slot->results != NULL) {
@@ -1367,6 +1369,9 @@ void run_active_slot(struct active_request_slot *slot)
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
+
+	if (slot->finished == &finished)
+		slot->finished = NULL;
 }
 
 static void release_active_slot(struct active_request_slot *slot)

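As a rough illustration of the history recounted in the commit message
above, the following sketch contrasts a wait loop keyed on in_use with
one keyed on a per-call "finished" flag.  It is not code from git:
struct demo_slot, demo_step(), demo_wait_on_in_use() and
demo_wait_on_finished() are hypothetical stand-ins, and demo_step()
simply assumes that every completed request is immediately followed by
a reuse of the same slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for git's struct active_request_slot. */
struct demo_slot {
	int in_use;
	int *finished; /* points at the waiter's on-stack flag, or NULL */
};

/*
 * Stand-in for one round of step_active_slots(): the current request
 * completes, and the freed slot is immediately claimed again for an
 * unrelated request (assumption made for the sake of the example).
 */
static void demo_step(struct demo_slot *slot)
{
	assert(slot->in_use); /* only an in-flight request can complete */

	/* Completion: signal the waiter, then drop the pointer. */
	slot->in_use = 0;
	if (slot->finished != NULL) {
		*slot->finished = 1;
		slot->finished = NULL;
	}

	/* Reuse: the slot is busy again, but for somebody else's request. */
	slot->in_use = 1;
}

/* Old-style wait on in_use; returns the step count, or -1 if it gave up. */
int demo_wait_on_in_use(struct demo_slot *slot, int max_steps)
{
	int steps = 0;

	slot->in_use = 1;
	while (slot->in_use) {
		if (steps == max_steps)
			return -1; /* reuse hid the completion from us */
		demo_step(slot);
		steps++;
	}
	return steps;
}

/* Wait on an on-stack flag instead; returns the step count, or -1. */
int demo_wait_on_finished(struct demo_slot *slot, int max_steps)
{
	int finished = 0;
	int steps = 0;

	slot->in_use = 1;
	slot->finished = &finished;
	while (!finished && steps < max_steps) {
		demo_step(slot);
		steps++;
	}

	/* Never leave a pointer into this frame behind. */
	if (slot->finished == &finished)
		slot->finished = NULL;
	return finished ? steps : -1;
}
```

With a slot that is always reused right after completion, the in_use
loop can only stop by hitting its step limit, while the flag-based loop
sees its own request complete after one step and returns with
slot->finished no longer pointing into its frame.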
On Fri, Apr 15 2022, Carlo Arenas wrote:

> On Fri, Apr 15, 2022 at 4:34 PM Junio C Hamano <gitster@pobox.com> wrote:
>> diff --git c/http.c w/http.c
>> index 229da4d148..85437b1980 100644
>> --- c/http.c
>> +++ w/http.c
>> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>>  		}
>>  	}
>> +
>> +	if (slot->finished == &finished)
>> +		slot->finished = NULL;
>>  }
>>
>>  static void release_active_slot(struct active_request_slot *slot)
>
> this would be IMHO a better fix than the one currently queued in
> ab/http-gcc-12-workaround and indeed squashes the warning with the gcc
> 12 version that is likely to be released with Fedora 36, but notice
> that it was proposed before[1] and apparently didn't work with the
> version of the compiler that Ævar was using at that time, as
> documented in the commit message.

Does it suppress the warning on your GCCv12? It doesn't on mine; as
noted later in related threads, my "yes, that does quiet it" in [1] is
a misreport. That doesn't suppress the warning (the [2] patch notes
it).

1. https://lore.kernel.org/git/220127.86mtjhdeme.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/patch-v3-1.1-69190804c67-20220325T143322Z-avarab@gmail.com/

On Fri, Apr 15 2022, Junio C Hamano wrote:

> Carlo Arenas <carenas@gmail.com> writes:
>
>> this would be IMHO a better fix than the one currently queued in
>> ab/http-gcc-12-workaround and indeed squashes the warning with the gcc
>> 12 version that is likely to be released with Fedora 36,
>
> That is an excellent news, as ...
>
>> but notice
>> that it was proposed before[1] and apparently didn't work with the
>> version of the compiler that Ævar was using at that time, as
>> documented in the commit message.
>
> ... I was the one who suggested it, and I remember that it didn't
> work for Ævar back then.  If the problem with the version Ævar had
> is no longer there, that does sound like a good thing.  We can take
> the patch you posted and then post release we can apply the "clear
> the .finished member as we are done with the slot" fix, which is a
> good hygiene regardless of any compiler warning issue.

I don't know what version of GCC 12 Carlo is using, but I'm using it
built from its main branch this week, and it warns on that "if
(slot->finished == &finished)" suggestion.

I don't understand what you think you're gaining from:

	if (slot->finished == &finished)
		slot->finished = NULL;

That you don't get from:

	slot->finished = NULL;

I.e. the assignment earlier in the function is unconditional, why
wouldn't the clearing of the data correspond to that assignment and
clear it unconditionally?

If it's imagined that we have parallel assignments of the "finished"
member wouldn't that check/assignment be racy?

On Fri, Apr 15 2022, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> ...  We can take
>> the patch you posted and then post release we can apply the "clear
>> the .finished member as we are done with the slot" fix, which is a
>> good hygiene regardless of any compiler warning issue.
>> ...
>> At this point, my inclination is to merge these two DEVELOPER_CFLAGS
>> changes before the 2.36 final gets tagged.
>
> So, the post release longer term clean-up with log message may look
> like this.
>
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> [...] clear the
> finished member but make sure to limit it to the case where the
> pointer still points at the on-stack variable of ours (the pointer
> may be set to point at the on-stack variable of somebody else after
> the slot gets reused, in which case we do not want to touch it).

I'm still not sure I get this. So while we're in run_active_slot(),
will we have a nested or concurrent invocation of another
run_active_slot() (driven by the curl API?)?

My reading of this code in get_active_slot() is that the "in_use"
member is a guard in "struct active_request_slot" against any such
potential shenanigans, even if this was racily running multi-threaded
(although then the slot selection loop itself would need some
mutexing).

Then in finish_active_slot() we have since baa7b67d091 (HTTP slot
reuse fixes, 2006-03-10) unconditionally clobbered "slot->finished" if
it's non-NULL, without any "is it ours?" check.

Which has been my analysis of this, i.e. that at this point we "own"
that member, and we won't race with anyone. We were even doing the
unconditional clearing already, it was just across a function
boundary. But GCC started warning about /how/ we did the clearing, so.

If it's just general paranoia OK, quite confusing though, because
we're doing this in 2 places, and this puts a guard on 1/2.

But I'm asking in case you see some path through this where "the slot
gets reused [by someone else, it's not our value, and] we do not want
to touch it".

I don't see how that could happen *within* run_active_slot() but not
between the existing code between run_active_slot() and
finish_active_slot(), which is doing that clearing unconditionally...

> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  http.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/http.c b/http.c
> index 229da4d148..626b4051e1 100644
> --- a/http.c
> +++ b/http.c
> @@ -197,8 +197,10 @@ static void finish_active_slot(struct active_request_slot *slot)
>  	closedown_active_slot(slot);
>  	curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);
>
> -	if (slot->finished != NULL)
> -		(*slot->finished) = 1;
> +	if (slot->finished != NULL) {
> +		*slot->finished = 1;
> +		slot->finished = NULL;
> +	}
>
>  	/* Store slot results so they can be read after the slot is reused */
>  	if (slot->results != NULL) {

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I.e. the assignment earlier in the function is unconditional, why
> wouldn't the clearing of the data correspond to that assignment and
> clear it unconditionally?

The original problem description that introduced the .finished member
indicates that inside the while() loop, the same slot object can be
completed (by feeding it to finish_active_slot(), which would also
clear its in_use, thus making it reusable) and then later be reused
(by using it for a different request).  The dispatching is done by
calling step_active_slots() repeatedly inside the loop and I do not
think there is any multi-threaded concurrency to worry about here.

The protection is against a case where such a slot, which was
originally ours and pointed at our on-stack finished variable with
its finished member, is reused for a different request, and its
finished member is used in a similar way to point at the on-stack
finished variable in somebody else's stackframe in the future code.

If the slot instance we were using as ours upon the entry of this
function is being used for another request already (the fix that
required the .finished member is enough of an explanation that it is
a real concern), after we leave the loop, the slot instance is no
longer ours, so we need to be careful when we clear it.

At the entry of this function, the story is vastly different.  The
slot instance belongs to us---the caller chose the slot and decided
to use it to service a particular request and threw the slot
instance at us, so there is nothing wrong to unconditionally use the
.finished member of the slot and point it at a variable in our
stackframe.

But after we leave the loop, the slot may or may not be already
reused to hold another request.  If .finished is set and it is the
value that points at the variable in our stackframe, then we are the
only one who could have set that and it is safe to clear.  For any
other value other than NULL, we do not know at that point who set
it, and it is being used for a request that we have nothing to do
with.  That is why we want to refrain from touching it when it is
not clearly ours.

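The ownership argument above can be sketched in isolation.  This is a
minimal model under stated assumptions, not git's http.c: struct
demo_slot, demo_finish(), demo_reuse() and demo_run() are hypothetical
names, and the "somebody else" who reuses the slot is modelled by a
flag that lives in the caller's frame, which is the kind of future
code path the explanation has in mind rather than something known to
happen today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for struct active_request_slot. */
struct demo_slot {
	int in_use;
	int *finished;
};

/* Models finish_active_slot() with the proposed clean-up applied. */
static void demo_finish(struct demo_slot *slot)
{
	slot->in_use = 0;
	if (slot->finished != NULL) {
		*slot->finished = 1;
		slot->finished = NULL;
	}
}

/* Models some other code path claiming the freed slot for a new request. */
static void demo_reuse(struct demo_slot *slot, int *other_flag)
{
	assert(!slot->in_use); /* only a free slot may be claimed */
	slot->in_use = 1;
	slot->finished = other_flag;
}

/*
 * Models run_active_slot().  If reuse_flag is non-NULL, the slot is
 * handed to a different request (tracked by *reuse_flag) before our
 * loop gets a chance to exit.
 */
void demo_run(struct demo_slot *slot, int *reuse_flag)
{
	int finished = 0;

	/* On entry the slot is ours, so this assignment is unconditional. */
	slot->in_use = 1;
	slot->finished = &finished;

	while (!finished) {
		demo_finish(slot);
		if (reuse_flag)
			demo_reuse(slot, reuse_flag);
	}

	/*
	 * On exit the slot may already belong to somebody else.  Clear
	 * the member only if it still points at our own flag; an
	 * unconditional "slot->finished = NULL" here would wipe out
	 * the new owner's pointer.
	 */
	if (slot->finished == &finished)
		slot->finished = NULL;
}
```

Calling demo_run(&slot, NULL) leaves slot.finished as NULL, while
demo_run(&slot, &other) leaves it pointing at other, still unsignalled,
which is the state the guarded clearing is meant to preserve.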
diff --git a/config.mak.dev b/config.mak.dev
index 335efd46203..c3104f400b2 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -68,6 +68,7 @@ endif
 # https://bugzilla.redhat.com/show_bug.cgi?id=2075786
 ifneq ($(filter gcc12,$(COMPILER_FEATURES)),)
 DEVELOPER_CFLAGS += -Wno-error=stringop-overread
+DEVELOPER_CFLAGS += -Wno-error=dangling-pointer
 endif
 
 GIT_TEST_PERL_FATAL_WARNINGS = YesPlease

This provides a "no code change needed" option to the "fix" currently
queued as part of ab/http-gcc-12-workaround and therefore should be
reverted once that gets merged.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 config.mak.dev | 1 +
 1 file changed, 1 insertion(+)