Message ID | 000a01d4c0b1$9ef9ea70$dcedbf50$@nexbridge.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [Proposed,Fix] daemon.c: not initializing revents | expand |
"Randall S. Becker" <rsbecker@nexbridge.com> writes: > Hi All, > > I found this while trying to track down a hang in t5562 - this isn't the > fix, but here it is something that could be considered a code-inspection. If > there have been random unexplained hangs when git runs as a daemon, this > might be the cause. > > According to many systems (other than Linux), the revents field is supposed > to be 0 on return to poll(). This was the cause of some heart-ache a while > back in compat/poll/poll.c. I am having a hard time grokking "supposed to be 0 on return to", but do you mean "the caller must clear .revents field before calling poll()"? http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html has this In each pollfd structure, poll() shall clear the revents member, except that where the application requested a report on a condition by setting one of the bits of events listed above, poll() shall set the corresponding bit in revents if the requested condition is true. In addition, poll() shall set the POLLHUP, POLLERR, and POLLNVAL flag in revents if the condition is true, even if the application did not set the corresponding bit in events. and I am also having a hard time interpreting the "except that". If we asked to report (e.g. we set POLLIN in the .events field), poll() does not have to clear .revents but just set whatever bits it needs to set to report the condition in the field? If that is the case, it makes it a bug not to clear .revents before calling poll; the sample code snippet on the same page in EXAMPLES section does not, though, so I am puzzled. In any case, no matter what POSIX says, if clearing .revents before calling poll() helps on platforms in the real world, the patch is worth taking as a fix, I would think. > I am not certain whether that copy of poll() is > used in daemon, but I wanted to point out that the value is being returned > to poll, outside of compat/poll/poll.c and may present a potential for poll > returning an error on that FD due to random values that might be in revents. > > Please see 61b2a1acaae for a related change/justification. > > diff --git a/daemon.c b/daemon.c > index 9d2e0d20ef..1e275fc8b3 100644 > --- a/daemon.c > +++ b/daemon.c > @@ -1194,6 +1194,7 @@ static int service_loop(struct socketlist *socklist) > } > handle(incoming, &ss.sa, sslen); > } > + pfd[i].revents = 0; > } > } > } > > Regards, > Randall
On February 11, 2019 15:57, Junio C Hamano wrote: > "Randall S. Becker" <rsbecker@nexbridge.com> writes: > > > Hi All, > > > > I found this while trying to track down a hang in t5562 - this isn't > > the fix, but here it is something that could be considered a > > code-inspection. If there have been random unexplained hangs when git > > runs as a daemon, this might be the cause. > > > > According to many systems (other than Linux), the revents field is > > supposed to be 0 on return to poll(). This was the cause of some > > heart-ache a while back in compat/poll/poll.c. > > I am having a hard time grokking "supposed to be 0 on return to", but do you > mean "the caller must clear .revents field before calling poll()"? > > http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html > has this > > In each pollfd structure, poll() shall clear the revents member, > except that where the application requested a report on a condition > by setting one of the bits of events listed above, poll() shall set > the corresponding bit in revents if the requested condition is > true. In addition, poll() shall set the POLLHUP, POLLERR, and > POLLNVAL flag in revents if the condition is true, even if the > application did not set the corresponding bit in events. > > and I am also having a hard time interpreting the "except that". If we asked > to report (e.g. we set POLLIN in the .events field), poll() does not have to > clear .revents but just set whatever bits it needs to set to report the > condition in the field? > > If that is the case, it makes it a bug not to clear .revents before calling poll; > the sample code snippet on the same page in EXAMPLES section does not, > though, so I am puzzled. > > In any case, no matter what POSIX says, if clearing .revents before calling > poll() helps on platforms in the real world, the patch is worth taking as a fix, I > would think. That's what my intent was - my explanations are suffering from a little work-induced sleep deprivation. Would you like this as a formal patch? > > > I am not certain whether that copy of poll() is used in daemon, but I > > wanted to point out that the value is being returned to poll, outside > > of compat/poll/poll.c and may present a potential for poll returning > > an error on that FD due to random values that might be in revents. > > > > Please see 61b2a1acaae for a related change/justification. > > > > diff --git a/daemon.c b/daemon.c > > index 9d2e0d20ef..1e275fc8b3 100644 > > --- a/daemon.c > > +++ b/daemon.c > > @@ -1194,6 +1194,7 @@ static int service_loop(struct socketlist *socklist) > > } > > handle(incoming, &ss.sa, sslen); > > } > > + pfd[i].revents = 0; > > } > > } > > } Regards, Randall
"Randall S. Becker" <rsbecker@nexbridge.com> writes: >> In any case, no matter what POSIX says, if clearing .revents before > calling >> poll() helps on platforms in the real world, the patch is worth taking as > a fix, I >> would think. > > That's what my intent was - my explanations are suffering from a little > work-induced sleep deprivation. Would you like this as a formal patch? That depends ;-) At this late in the cycle, I do not see much urgency for this patch to be in the upcoming release (after all, this code survived real world for quite a long time, so it's only minority platforms like NonStop that haven't seen serious porting effort until recently that would see improvement---and they have survived without reliably working daemon for so long that they can wait for one more release). Now, the knowledge that we will have long enough time before the final version of the formal patch becomes necessary makes me wonder what the best use of that time to polish the patch would be. Ideally we'd like to see "this definitely fixed (or 'worked around') such and such breakages on platform X, Y and Z" instead of my "Well, we could read POSIX that way, so there may be some platforms that would require applications to do this, and an extra assignment here would certainly not hurt", which was the hand-waving I just did. I dunno.
On February 11, 2019 21:59, Junio C Hamano wrote: > "Randall S. Becker" <rsbecker@nexbridge.com> writes: > > >> In any case, no matter what POSIX says, if clearing .revents before > > calling > >> poll() helps on platforms in the real world, the patch is worth > >> taking as > > a fix, I > >> would think. > > > > That's what my intent was - my explanations are suffering from a > > little work-induced sleep deprivation. Would you like this as a formal > patch? > > That depends ;-) > > At this late in the cycle, I do not see much urgency for this patch to be in the > upcoming release (after all, this code survived real world for quite a long > time, so it's only minority platforms like NonStop that haven't seen serious > porting effort until recently that would see improvement---and they have > survived without reliably working daemon for so long that they can wait for > one more release). > > Now, the knowledge that we will have long enough time before the final > version of the formal patch becomes necessary makes me wonder what the > best use of that time to polish the patch would be. Ideally we'd like to see > "this definitely fixed (or 'worked around') such and such breakages on > platform X, Y and Z" instead of my "Well, we could read POSIX that way, so > there may be some platforms that would require applications to do this, and > an extra assignment here would certainly not hurt", which was the hand- > waving I just did. > > I dunno. I hear that. I'd rather (not) be working on debugging breakages from other authors that impact my platform. Honestly, I'd rather work at my $DAYJOB, although, some days... Since this topic isn't a breakage per-se (no tests seem to be impacted on way or another), I agree that this can wait and get through the normal cycle of events at some point in the future. Now, if I could only get some help on t5562 ;) that would be time well spent for rc0. Cheers, Randall
diff --git a/daemon.c b/daemon.c index 9d2e0d20ef..1e275fc8b3 100644 --- a/daemon.c +++ b/daemon.c @@ -1194,6 +1194,7 @@ static int service_loop(struct socketlist *socklist) } handle(incoming, &ss.sa, sslen); } + pfd[i].revents = 0; } } }