Message ID | 20250107070752.GB584668@coredump.intra.peff.net (mailing list archive) |
---|---|
State | New |
Series | lsan test-lib readability |
On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> We want to know if there are any leaks logged by LSan in the results
> directory, so we run "find" on the containing directory and pipe it to
> xargs. We can accomplish the same thing by just globbing in the shell
> and passing the result to grep, which has a few advantages:
>
>   - it's one fewer process to run
>
>   - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
>     checked at the beginning of the function, and is the same glob use

s/use/used

I'm always a bit thrown off by your style of bulleted lists, where they
feel like sentences but start with a lower-case letter, and sometimes
they do and sometimes they don't end with punctuation. Maybe it's just
me not being a native speaker and it's a natural thing to do in English.
In any case, it's nothing that really matters in the end, but would be
happy to learn if this is indeed something you tend to do in English.

>     to show the logs in check_test_results_san_file_
>
>   - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
>     space in it. For example doing:
>
>        mkdir "/tmp/foo bar"
>        TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test
>
>     would yield a lot of:
>
>       grep: /tmp/foo: No such file or directory
>       grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory
>
>     when there are leaks. We could do the same thing with "xargs
>     --null", but that isn't portable.
>
> We are now subject to command-line length limits, but that is also true
> of the globbing cat used to show the logs themselves. This hasn't been a
> problem in practice.

Yup, this also came to my mind immediately. But I agree that it
shouldn't be an issue in general.

> We do need to use "grep -s" for the case that the glob does not expand
> (i.e., there are not any log files at all). This option is in POSIX, and
> has been used in t7407 for several years without anybody complaining.
> This also also naturally handles the case where the surrounding
> directory has already been removed (in which case there are likewise no
> files!), dropping the need to comment about it.

Okay. So in case there are no matching files we don't expand the
globbing string, and "--no-messages" makes us ignore that case. A bit
funny, but I don't see any issue with it.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I was surprised by the use of "grep -s" in t7407, since it is totally
> pointless there. But I think we can take its presence as a positive sign
> for portability.

Good to know.

>  t/test-lib.sh | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index be3553e40e..898c2267b8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1172,12 +1172,7 @@ test_atexit_handler () {
>  check_test_results_san_file_has_entries_ () {
>  	test -z "$TEST_RESULTS_SAN_FILE" && return 1
>
> -	# stderr piped to /dev/null because the directory may have
> -	# been "rmdir"'d already.
> -	find "$TEST_RESULTS_SAN_DIR" \
> -		-type f \
> -		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> -	xargs grep ^DEDUP_TOKEN |
> +	grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
>  	grep -qv sanitizer::GetThreadStackTopAndBottom

And this nicely simplifies things indeed.

Patrick
Jeff King <peff@peff.net> writes:

> We want to know if there are any leaks logged by LSan in the results
> directory, so we run "find" on the containing directory and pipe it to
> xargs. We can accomplish the same thing by just globbing in the shell
> and passing the result to grep, which has a few advantages:
>
>   - it's one fewer process to run
> ...
> We are now subject to command-line length limits, but that is also true
> of the globbing cat used to show the logs themselves. This hasn't been a
> problem in practice.

Nice to see it mentioned here. And the resulting code does become
simpler to reason about.

> We do need to use "grep -s" for the case that the glob does not expand
> (i.e., there are not any log files at all). This option is in POSIX, and
> has been used in t7407 for several years without anybody complaining.

Also since c625bf0e (git-p4: git-p4 tests with p4 triggers,
2017-07-13) t9831 has also been using it. It is not like a stray
error message about unmatched glob would really matter here, though.
We are not doing 2>&1 to let the downstream of the pipe see it, and
unless the test is run under "-v" option, it wouldn't even be seen.

> This also also naturally handles the case where the surrounding
> directory has already been removed (in which case there are likewise no
> files!), dropping the need to comment about it.

Nice. Thanks.
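The point about the stray error message can be checked in isolation. The sketch below is illustrative only and is not taken from the thread: the path /nonexistent-dir/trace and all variable names are made up, and it assumes a POSIX sh in which an unmatched glob is passed to the command literally (and that mktemp is available).

```shell
#!/bin/sh
# Hypothetical prefix whose glob matches nothing (illustrative path).
prefix=/nonexistent-dir/trace
err_plain=$(mktemp)
err_quiet=$(mktemp)

# Without -s: grep complains on stderr. stderr is not part of the pipe,
# so the downstream command (cat here) still reads empty input.
piped_plain=$(grep ^DEDUP_TOKEN "$prefix".* 2>"$err_plain" | cat)

# With -s: downstream input is just as empty, and stderr stays quiet.
piped_quiet=$(grep -s ^DEDUP_TOKEN "$prefix".* 2>"$err_quiet" | cat)

# Record whether anything was written to stderr in each case.
if test -s "$err_plain"; then plain_msg=yes; else plain_msg=no; fi
if test -s "$err_quiet"; then quiet_msg=yes; else quiet_msg=no; fi
rm -f "$err_plain" "$err_quiet"

echo "without -s: piped='$piped_plain' stderr message: $plain_msg"
echo "with -s:    piped='$piped_quiet' stderr message: $quiet_msg"
```

In both cases the downstream side of the pipe sees nothing, so "-s" only affects what a person running the test verbosely would see on stderr.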
On Tue, Jan 07, 2025 at 08:37:33AM +0100, Patrick Steinhardt wrote:

> On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> > We want to know if there are any leaks logged by LSan in the results
> > directory, so we run "find" on the containing directory and pipe it to
> > xargs. We can accomplish the same thing by just globbing in the shell
> > and passing the result to grep, which has a few advantages:
> >
> >   - it's one fewer process to run
> >
> >   - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
> >     checked at the beginning of the function, and is the same glob use
>
> s/use/used
>
> I'm always a bit thrown off by your style of bulleted lists, where they
> feel like sentences but start with a lower-case letter, and sometimes
> they do and sometimes they don't end with punctuation. Maybe it's just
> me not being a native speaker and it's a natural thing to do in English.
> In any case, it's nothing that really matters in the end, but would be
> happy to learn if this is indeed something you tend to do in English.

Heh. Yeah, I've seen you mention them before and I've been tempted to
start a big discussion. But I never felt like it was worth it. But
tonight's your lucky night. ;)

In short: I think it's a style question. I perceive them as
continuations of the sentence that has the ":". Though admittedly I do
not always grammatically continue that sentence. So for example I could:

  - have one bullet item that completes the sentence.

  - and then another that likewise completes it.

;) I think many style guides would frown on that. Especially with the
periods at the end (you might argue that they should be semicolons).

In the example you quoted above they don't grammatically continue the
sentence, so arguably what I'm saying doesn't even apply. But I also
kind of think of the list items as sentence fragments. That sometimes
happen to make a full sentence. Or need punctuation because that
fragments gets so long it contains multiple sentences.

I dunno. You asked if it is something you tend to do in English. It is
something _I_ tend to do in English, but I think most style guides would
suggest against it (but then, most also suggest against bulleted lists
in the first place). (They probably also suggest against lots of
parentheses). So I wouldn't necessarily copy me.

My general feeling is that unless a commit message is inaccurate or hard
to understand, we should mostly let it pass (even typos). Yes, they are
an artifact that is enshrined in the history. But at some point they are
also just a written communication between developers, and we all have
our own voices and styles. And make mistakes. Polishing them is
something we _can_ do collaboratively, but there are diminishing
returns.

In case it is not clear, I would not say the same for documentation,
error messages, etc. Those are artifacts that hits a wider audience, and
we have a tool for polishing them together: git.

And people should still proofread and correct their own messages before
sending. Believe it or not, I do always take a final pass when sending
out my commits and still manage to have errors. ;) A lot of times I end
up improving clarity and wording on the final pass, but end up
introducing a typo (I'm pretty sure that the use/used above was me
switching last-minute between "the same glob we use" and "the same glob
used").

Bringing it back to the example at hand, my assumption is that the
bullet list capitalization and punctuation is mostly a question of
style, and isn't making the result hard to understand. But if it is, I
can try to adjust. I actually wrote a bulleted list in a commit message
earlier today and capitalized it just for you. :)

-Peff
On Tue, Jan 07, 2025 at 08:23:34AM -0800, Junio C Hamano wrote:

> > We do need to use "grep -s" for the case that the glob does not expand
> > (i.e., there are not any log files at all). This option is in POSIX, and
> > has been used in t7407 for several years without anybody complaining.
>
> Also since c625bf0e (git-p4: git-p4 tests with p4 triggers,
> 2017-07-13) t9831 has also been using it. It is not like a stray
> error message about unmatched glob would really matter here, though.
> We are not doing 2>&1 to let the downstream of the pipe see it, and
> unless the test is run under "-v" option, it wouldn't even be seen.

Yeah, I saw those. But I don't think they count since hardly anybody
runs the p4 tests. They do run in CI, but on a rather limited set of
platforms. Though come to think of it, this one would only kick in for
LSan, which may also run on a pretty limited set of platforms. :)

-Peff
On Thu, Jan 09, 2025 at 02:57:50AM -0500, Jeff King wrote:
> On Tue, Jan 07, 2025 at 08:37:33AM +0100, Patrick Steinhardt wrote:
>
> > On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> > > We want to know if there are any leaks logged by LSan in the results
> > > directory, so we run "find" on the containing directory and pipe it to
> > > xargs. We can accomplish the same thing by just globbing in the shell
> > > and passing the result to grep, which has a few advantages:
> > >
> > >   - it's one fewer process to run
> > >
> > >   - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
> > >     checked at the beginning of the function, and is the same glob use
> >
> > s/use/used
> >
> > I'm always a bit thrown off by your style of bulleted lists, where they
> > feel like sentences but start with a lower-case letter, and sometimes
> > they do and sometimes they don't end with punctuation. Maybe it's just
> > me not being a native speaker and it's a natural thing to do in English.
> > In any case, it's nothing that really matters in the end, but would be
> > happy to learn if this is indeed something you tend to do in English.
>
> Heh. Yeah, I've seen you mention them before and I've been tempted to
> start a big discussion. But I never felt like it was worth it. But
> tonight's your lucky night. ;)
>
> In short: I think it's a style question. I perceive them as
> continuations of the sentence that has the ":". Though admittedly I do
> not always grammatically continue that sentence. So for example I could:
>
>   - have one bullet item that completes the sentence.
>
>   - and then another that likewise completes it.
>
> ;) I think many style guides would frown on that. Especially with the
> periods at the end (you might argue that they should be semicolons).
>
> In the example you quoted above they don't grammatically continue the
> sentence, so arguably what I'm saying doesn't even apply. But I also
> kind of think of the list items as sentence fragments. That sometimes
> happen to make a full sentence. Or need punctuation because that
> fragments gets so long it contains multiple sentences.
>
> I dunno. You asked if it is something you tend to do in English. It is
> something _I_ tend to do in English, but I think most style guides would
> suggest against it (but then, most also suggest against bulleted lists
> in the first place). (They probably also suggest against lots of
> parentheses). So I wouldn't necessarily copy me.
>
> My general feeling is that unless a commit message is inaccurate or hard
> to understand, we should mostly let it pass (even typos). Yes, they are
> an artifact that is enshrined in the history. But at some point they are
> also just a written communication between developers, and we all have
> our own voices and styles. And make mistakes. Polishing them is
> something we _can_ do collaboratively, but there are diminishing
> returns.

Yup, agreed. It's a minor detail and I'm happy to gloss over it in the
future.

> In case it is not clear, I would not say the same for documentation,
> error messages, etc. Those are artifacts that hits a wider audience, and
> we have a tool for polishing them together: git.
>
> And people should still proofread and correct their own messages before
> sending. Believe it or not, I do always take a final pass when sending
> out my commits and still manage to have errors. ;) A lot of times I end
> up improving clarity and wording on the final pass, but end up
> introducing a typo (I'm pretty sure that the use/used above was me
> switching last-minute between "the same glob we use" and "the same glob
> used").
>
> Bringing it back to the example at hand, my assumption is that the
> bullet list capitalization and punctuation is mostly a question of
> style, and isn't making the result hard to understand. But if it is, I
> can try to adjust. I actually wrote a bulleted list in a commit message
> earlier today and capitalized it just for you. :)

Thanks for explaining!

Patrick
diff --git a/t/test-lib.sh b/t/test-lib.sh
index be3553e40e..898c2267b8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1172,12 +1172,7 @@ test_atexit_handler () {
 check_test_results_san_file_has_entries_ () {
 	test -z "$TEST_RESULTS_SAN_FILE" && return 1
 
-	# stderr piped to /dev/null because the directory may have
-	# been "rmdir"'d already.
-	find "$TEST_RESULTS_SAN_DIR" \
-		-type f \
-		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
-	xargs grep ^DEDUP_TOKEN |
+	grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
 	grep -qv sanitizer::GetThreadStackTopAndBottom
 }
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:

  - it's one fewer process to run

  - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
    checked at the beginning of the function, and is the same glob use
    to show the logs in check_test_results_san_file_

  - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
    space in it. For example doing:

       mkdir "/tmp/foo bar"
       TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test

    would yield a lot of:

      grep: /tmp/foo: No such file or directory
      grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory

    when there are leaks. We could do the same thing with "xargs
    --null", but that isn't portable.

We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.

We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.

Signed-off-by: Jeff King <peff@peff.net>
---
I was surprised by the use of "grep -s" in t7407, since it is totally
pointless there. But I think we can take its presence as a positive sign
for portability.

 t/test-lib.sh | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)
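The behavior described in the commit message can be reproduced outside the test suite. What follows is a minimal sketch, not part of the patch: the scratch directory, the "trace" file prefix, the log contents, and the has_entries helper are hypothetical stand-ins for TEST_RESULTS_SAN_FILE and check_test_results_san_file_has_entries_, and it assumes mktemp is available and that the temporary directory path itself contains no spaces or quotes.

```shell
#!/bin/sh
# Hypothetical scratch area with a space in its path, mimicking a
# TEST_OUTPUT_DIRECTORY such as "/tmp/foo bar".
dir=$(mktemp -d)/"foo bar"
mkdir -p "$dir"
prefix="$dir/trace"	# stand-in for $TEST_RESULTS_SAN_FILE

# Stand-in for check_test_results_san_file_has_entries_: succeed only
# if some log has a DEDUP_TOKEN line other than the ignored one.
has_entries () {
	grep -s ^DEDUP_TOKEN "$1".* |
	grep -qv sanitizer::GetThreadStackTopAndBottom
}

# No logs yet: the glob stays unexpanded, "grep -s" stays quiet about
# the nonexistent name, and the check fails.
if has_entries "$prefix"; then no_logs=leak; else no_logs=clean; fi

# One log with a made-up entry: the quoted prefix keeps the space intact.
echo "DEDUP_TOKEN: some_leaky_function" >"$prefix.test-tool.1"
if has_entries "$prefix"; then with_log=leak; else with_log=clean; fi

# The old find | xargs form splits the path at the space, so grep is
# handed two nonexistent names and prints nothing on stdout.
# xargs exits non-zero here, hence the "|| true".
old=$(find "$dir" -type f -name 'trace.*' |
	xargs grep ^DEDUP_TOKEN 2>/dev/null) || true

echo "no logs: $no_logs, with log: $with_log, old pipeline output: '$old'"
```

Under those assumptions the glob-based check reports "clean" with no logs and "leak" once the log exists, while the find/xargs pipeline produces no output for the same file because of the space in the path.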