Message ID | 20200907171639.766547-1-eantoranz@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Commit | 1302badd16ad36bc9441367b240e053130d15f7a |
Series | blame.c: replace instance of !oidcmp for oideq |
On Mon, Sep 7, 2020 at 11:16 AM Edmundo Carmona Antoranz <eantoranz@gmail.com> wrote:

Blamed the wrong branch. I should have looped Derrick instead of Jeff.
Sorry about that.
On Mon, Sep 7, 2020 at 11:21 AM Edmundo Carmona Antoranz <eantoranz@gmail.com> wrote:
> Blamed the wrong branch. I should have looped Derrick instead of Jeff.
> Sorry about that.

I realized I didn't sign it off. Should I send it again? Or given that
it's an almost 1-liner, it's ok? If I send it again, I will provide
just a little more context about having the !oidcmp calls replaced for
oideq in previous versions.
On 9/7/2020 1:16 PM, Edmundo Carmona Antoranz wrote:
> ---

Please include sign-off. I saw you reported your intention there in
another message, but it's probably best to just send it again.

This message could also mention 14438c4 (introduce hasheq() and
oideq(), 2018-08-28) which introduced oideq().

This use of !oidcmp() was introduced by 0906ac2b (blame: use
changed-path Bloom filters, 2020-04-16). My bad. There is no
good reason to introduce this use since it is well after the
oideq() method was introduced.

> @@ -1353,8 +1353,8 @@ static struct blame_origin *find_origin(struct repository *r,
>  		else {
>  			int compute_diff = 1;
>  			if (origin->commit->parents &&
> -			    !oidcmp(&parent->object.oid,
> -				    &origin->commit->parents->item->object.oid))
> +			    oideq(&parent->object.oid,
> +				  &origin->commit->parents->item->object.oid))
>  				compute_diff = maybe_changed_path(r, origin, bd);

The code itself looks correct.

Thanks,
-Stolee
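For readers following along, here is a minimal sketch of why the swap reviewed above is behavior-preserving. The struct layout, the fixed RAWSZ, and the helper bodies are simplified stand-ins assumed for illustration, not Git's actual definitions; same_parent_old() and same_parent_new() are hypothetical names for the two spellings of the condition.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Git's real definitions. */
#define RAWSZ 32

struct object_id {
	unsigned char hash[RAWSZ];
};

/* Three-way comparison in the style of memcmp(): <0, 0 or >0. */
static inline int oidcmp(const struct object_id *a, const struct object_id *b)
{
	return memcmp(a->hash, b->hash, RAWSZ);
}

/* Boolean equality: 1 when the two IDs match, 0 otherwise. */
static inline int oideq(const struct object_id *a, const struct object_id *b)
{
	return !memcmp(a->hash, b->hash, RAWSZ);
}

/* Old spelling of the condition, as it appeared before the patch. */
static int same_parent_old(const struct object_id *a, const struct object_id *b)
{
	return !oidcmp(a, b);
}

/* New spelling of the condition, as it appears after the patch. */
static int same_parent_new(const struct object_id *a, const struct object_id *b)
{
	return oideq(a, b);
}
```

Both helpers return 1 for equal IDs and 0 otherwise, so the if condition in find_origin() takes the same branch either way; the new spelling just states the intent directly.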
On Tue, Sep 08, 2020 at 03:07:34PM -0400, Derrick Stolee wrote:

> This message could also mention 14438c4 (introduce hasheq() and
> oideq(), 2018-08-28) which introduced oideq().
>
> This use of !oidcmp() was introduced by 0906ac2b (blame: use
> changed-path Bloom filters, 2020-04-16). My bad. There is no
> good reason to introduce this use since it is well after the
> oideq() method was introduced.
>
> > @@ -1353,8 +1353,8 @@ static struct blame_origin *find_origin(struct repository *r,
> >  		else {
> >  			int compute_diff = 1;
> >  			if (origin->commit->parents &&
> > -			    !oidcmp(&parent->object.oid,
> > -				    &origin->commit->parents->item->object.oid))
> > +			    oideq(&parent->object.oid,
> > +				  &origin->commit->parents->item->object.oid))
> >  				compute_diff = maybe_changed_path(r, origin, bd);
>
> The code itself looks correct.

Yeah, it looks obviously correct. I am puzzled why "make coccicheck"
doesn't find this, though. +cc René, as my favorite target for
coccinelle nerd-snipes. :)

(But clearly we should make the change with or without figuring out the
coccinelle part).

-Peff
On Wed, Sep 9, 2020 at 3:11 AM Jeff King <peff@peff.net> wrote:
>
> Yeah, it looks obviously correct. I am puzzled why "make coccicheck"
> doesn't find this, though. +cc René, as my favorite target for
> coccinelle nerd-snipes. :)
>

I added this to contrib/coccinelle/object_id.cocci in v2.27.0

@@
identifier f != oideq;
expression E1, E2;
@@
- !oidcmp(E1, E2)
+ oideq(E1, E2)

And it found it:

$ cat contrib/coccinelle/object_id.cocci.patch
diff -u -p a/blame.c b/blame.c
--- a/blame.c
+++ b/blame.c
@@ -1352,8 +1352,7 @@ static struct blame_origin *find_origin(
 		else {
 			int compute_diff = 1;
 			if (origin->commit->parents &&
-			    !oidcmp(&parent->object.oid,
-				    &origin->commit->parents->item->object.oid))
+			    oideq(&parent->object.oid, &origin->commit->parents->item->object.oid))
 				compute_diff = maybe_changed_path(r, origin, bd);
 
 			if (compute_diff)

Do I need to add more things into the coccinelle definition so that it
is more restrictive in terms of the expression we are hunting down?
I haven't had a chance to look at the cocci script, but I did have one
thought... Derrick pointed out, 14438c4 added both oideq and hasheq.
It might be good to have a similar check for hasheq, if there is not
one already.

On Wed, Sep 9, 2020 at 9:01 AM Edmundo Carmona Antoranz <eantoranz@gmail.com> wrote:
>
> On Wed, Sep 9, 2020 at 3:11 AM Jeff King <peff@peff.net> wrote:
> >
> > Yeah, it looks obviously correct. I am puzzled why "make coccicheck"
> > doesn't find this, though. +cc René, as my favorite target for
> > coccinelle nerd-snipes. :)
> >
>
> I added this to contrib/coccinelle/object_id.cocci in v2.27.0
>
> @@
> identifier f != oideq;
> expression E1, E2;
> @@
> - !oidcmp(E1, E2)
> + oideq(E1, E2)
>
> And it found it:
>
> $ cat contrib/coccinelle/object_id.cocci.patch
> diff -u -p a/blame.c b/blame.c
> --- a/blame.c
> +++ b/blame.c
> @@ -1352,8 +1352,7 @@ static struct blame_origin *find_origin(
>  		else {
>  			int compute_diff = 1;
>  			if (origin->commit->parents &&
> -			    !oidcmp(&parent->object.oid,
> -				    &origin->commit->parents->item->object.oid))
> +			    oideq(&parent->object.oid,
> &origin->commit->parents->item->object.oid))
>  				compute_diff = maybe_changed_path(r, origin, bd);
>
>  			if (compute_diff)
>
>
> Do I need to add more things into the coccinelle definition so that it
> is more restrictive in terms of the
> expression we are hunting down?
On Wed, Sep 09, 2020 at 08:00:57AM -0600, Edmundo Carmona Antoranz wrote:

> On Wed, Sep 9, 2020 at 3:11 AM Jeff King <peff@peff.net> wrote:
> >
> > Yeah, it looks obviously correct. I am puzzled why "make coccicheck"
> > doesn't find this, though. +cc René, as my favorite target for
> > coccinelle nerd-snipes. :)
> >
>
> I added this to contrib/coccinelle/object_id.cocci in v2.27.0
>
> @@
> identifier f != oideq;
> expression E1, E2;
> @@
> - !oidcmp(E1, E2)
> + oideq(E1, E2)
>
> And it found it:

Interesting. The existing rule is:

  struct object_id *OIDPTR1;
  struct object_id *OIDPTR2;
  @@
  - oidcmp(OIDPTR1, OIDPTR2) == 0
  + oideq(OIDPTR1, OIDPTR2)

The "== 0" part looks like it might be significant, but it's not.
Coccinelle knows that "!foo" is the same as "foo == 0" (and you can
confirm by tweaking it).

The addition of "identifier f != oideq" here isn't necessary (we don't
even define an "f" in the semantic patch part). And anyway, we use
hasheq() inside oideq(), so no need to override the rule there.

So the relevant part is probably that our existing rule specifies the
exact type, whereas your rule allows any expression.

And indeed, if I do this, it works:

diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
index ddf4f22bd7..62a6cee0eb 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -55,8 +55,8 @@ struct object_id OID;
 + oidcmp(&OID, OIDPTR)
 
 @@
-struct object_id *OIDPTR1;
-struct object_id *OIDPTR2;
+expression OIDPTR1;
+expression OIDPTR2;
 @@
 - oidcmp(OIDPTR1, OIDPTR2) == 0
 + oideq(OIDPTR1, OIDPTR2)

Which really _seems_ like a bug in coccinelle, unless I am missing
something. Because both of those parameters look like object_id pointers
(and the compiler would be complaining if it were not the case). But I
also wonder if giving the specific types in the coccinelle rule is
buying us anything. If you passed two void pointers or ints or whatever
to !oidcmp(), we'd still want to rewrite it as oideq().

-Peff
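To illustrate the point about the "identifier f != ..." guard, here is a rough sketch of how the helpers layer on each other. It assumes a fixed RAWSZ and memcmp()-based bodies purely for illustration; the real helpers are not reproduced here and may differ in detail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not Git's real definitions. */
#define RAWSZ 32

struct object_id {
	unsigned char hash[RAWSZ];
};

static inline int hashcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, RAWSZ);
}

static inline int hasheq(const unsigned char *a, const unsigned char *b)
{
	return !memcmp(a, b, RAWSZ);
}

static inline int oidcmp(const struct object_id *a, const struct object_id *b)
{
	/*
	 * A rule that rewrites hashcmp(P->hash, Q->hash) into
	 * oidcmp(P, Q) has to skip this body (that is the job of
	 * "identifier f != oidcmp"); otherwise the function would be
	 * turned into one that calls itself forever.
	 */
	return hashcmp(a->hash, b->hash);
}

static inline int oideq(const struct object_id *a, const struct object_id *b)
{
	/*
	 * Nothing of the form "!oidcmp(...)" appears in this body, so
	 * a rule turning !oidcmp() into oideq() has nothing to rewrite
	 * here and does not need a guard.
	 */
	return hasheq(a->hash, b->hash);
}
```

Under these assumptions the guard matters only for rules whose replacement names the very function whose body contains the matched pattern, which is not the case for the oideq() rule discussed above.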
On Wed, Sep 09, 2020 at 03:13:46PM -0400, Jeff King wrote:

> Which really _seems_ like a bug in coccinelle, unless I am missing
> something. Because both of those parameters look like object_id pointers
> (and the compiler would be complaining if it were not the case). But I
> also wonder if giving the specific types in the coccinelle rule is
> buying us anything. If you passed two void pointers or ints or whatever
> to !oidcmp(), we'd still want to rewrite it as oideq().

And indeed, just blindly swapping out "struct object_id" for
"expression" in the coccinelle file (patch below), shows another spot
that was missed:

diff -u -p a/packfile.c b/packfile.c
--- a/packfile.c
+++ b/packfile.c
@@ -735,7 +735,7 @@ struct packed_git *add_packed_git(const
 	p->mtime = st.st_mtime;
 	if (path_len < the_hash_algo->hexsz ||
 	    get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->hash))
-		hashclr(p->hash);
+		oidclr(p);
 	return p;
 }

Maybe it's worth being looser in our cocci patch definitions. I'm having
trouble thinking of a downside...

-Peff

-- >8 --
Here's the patch to loosen object_id.cocci. Perhaps we'd want to do the
same in other files.

diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
index ddf4f22bd7..738c60923e 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -1,62 +1,62 @@
 @@
-struct object_id OID;
+expression OID;
 @@
 - is_null_sha1(OID.hash)
 + is_null_oid(&OID)
 
 @@
-struct object_id *OIDPTR;
+expression *OIDPTR;
 @@
 - is_null_sha1(OIDPTR->hash)
 + is_null_oid(OIDPTR)
 
 @@
-struct object_id OID;
+expression OID;
 @@
 - hashclr(OID.hash)
 + oidclr(&OID)
 
 @@
 identifier f != oidclr;
-struct object_id *OIDPTR;
+expression *OIDPTR;
 @@
   f(...) {<...
 - hashclr(OIDPTR->hash)
 + oidclr(OIDPTR)
   ...>}
 
 @@
-struct object_id OID1, OID2;
+expression OID1, OID2;
 @@
 - hashcmp(OID1.hash, OID2.hash)
 + oidcmp(&OID1, &OID2)
 
 @@
 identifier f != oidcmp;
-struct object_id *OIDPTR1, OIDPTR2;
+expression *OIDPTR1, OIDPTR2;
 @@
   f(...) {<...
 - hashcmp(OIDPTR1->hash, OIDPTR2->hash)
 + oidcmp(OIDPTR1, OIDPTR2)
   ...>}
 
 @@
-struct object_id *OIDPTR;
-struct object_id OID;
+expression *OIDPTR;
+expression OID;
 @@
 - hashcmp(OIDPTR->hash, OID.hash)
 + oidcmp(OIDPTR, &OID)
 
 @@
-struct object_id *OIDPTR;
-struct object_id OID;
+expression *OIDPTR;
+expression OID;
 @@
 - hashcmp(OID.hash, OIDPTR->hash)
 + oidcmp(&OID, OIDPTR)
 
 @@
-struct object_id *OIDPTR1;
-struct object_id *OIDPTR2;
+expression OIDPTR1;
+expression OIDPTR2;
 @@
 - oidcmp(OIDPTR1, OIDPTR2) == 0
 + oideq(OIDPTR1, OIDPTR2)
 
@@ -71,8 +71,8 @@ expression E1, E2;
   ...>}
 
 @@
-struct object_id *OIDPTR1;
-struct object_id *OIDPTR2;
+expression *OIDPTR1;
+expression *OIDPTR2;
 @@
 - oidcmp(OIDPTR1, OIDPTR2) != 0
 + !oideq(OIDPTR1, OIDPTR2)
On 09.09.20 at 21:17, Jeff King wrote:
> On Wed, Sep 09, 2020 at 03:13:46PM -0400, Jeff King wrote:
>
>> Which really _seems_ like a bug in coccinelle, unless I am missing
>> something. Because both of those parameters look like object_id pointers
>> (and the compiler would be complaining if it were not the case). But I
>> also wonder if giving the specific types in the coccinelle rule is
>> buying us anything. If you passed two void pointers or ints or whatever
>> to !oidcmp(), we'd still want to rewrite it as oideq().

Right, using expressions for such a like-for-like transformation is safe
and practical in the sense that it won't break correct code, and broken
code will be flagged by the compiler.

>
> And indeed, just blindly swapping out "struct object_id" for
> "expression" in the coccinelle file (patch below), shows another spot
> that was missed:
>
> diff -u -p a/packfile.c b/packfile.c
> --- a/packfile.c
> +++ b/packfile.c
> @@ -735,7 +735,7 @@ struct packed_git *add_packed_git(const
>  	p->mtime = st.st_mtime;
>  	if (path_len < the_hash_algo->hexsz ||
>  	    get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->hash))
> -		hashclr(p->hash);
> +		oidclr(p);
>  	return p;
>  }
>
>
> Maybe it's worth being looser in our cocci patch definitions. I'm having
> trouble thinking of a downside...

For transformations that change the type as in the example above we
should insist on getting the right one, otherwise we might introduce
bugs -- like in the example above. p points to a struct packed_git and
not to a struct object_id, so this introduces a type mismatch.

We better make sure our semantic patches are safe, otherwise we have to
check all conversions very carefully, and then we might be better off
doing them manually..

René
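A small sketch of the type mismatch described above. The struct packed_git here is a cut-down, hypothetical stand-in (only an mtime field and a raw hash array, with an assumed fixed RAWSZ), and reset_pack_hash() is an invented name for the relevant few lines of add_packed_git().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Git's real definitions. */
#define RAWSZ 32

struct object_id {
	unsigned char hash[RAWSZ];
};

/* Cut-down stand-in: the pack's hash is a raw byte array, not an object_id. */
struct packed_git {
	long mtime;
	unsigned char hash[RAWSZ];
};

/* Clears a raw hash buffer. */
static inline void hashclr(unsigned char *hash)
{
	memset(hash, 0, RAWSZ);
}

/* Clears an object ID; only valid for a real struct object_id. */
static inline void oidclr(struct object_id *oid)
{
	memset(oid->hash, 0, RAWSZ);
}

static void reset_pack_hash(struct packed_git *p)
{
	/* Correct: operates on the byte array inside the pack struct. */
	hashclr(p->hash);

	/*
	 * The loosened rule would suggest "oidclr(p);" here. That passes
	 * a struct packed_git * where a struct object_id * is expected:
	 * the compiler flags the incompatible pointer type, and in this
	 * layout the call would zero bytes starting at mtime instead of
	 * at hash.
	 */
}
```

With a typed metavariable (struct object_id *OIDPTR) the rule cannot match p->hash in the first place, which is the safety property argued for above.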
On Wed, Sep 09, 2020 at 09:54:55PM +0200, René Scharfe wrote:

> > diff -u -p a/packfile.c b/packfile.c
> > --- a/packfile.c
> > +++ b/packfile.c
> > @@ -735,7 +735,7 @@ struct packed_git *add_packed_git(const
> >  	p->mtime = st.st_mtime;
> >  	if (path_len < the_hash_algo->hexsz ||
> >  	    get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->hash))
> > -		hashclr(p->hash);
> > +		oidclr(p);
> >  	return p;
> >  }
> >
> >
> > Maybe it's worth being looser in our cocci patch definitions. I'm having
> > trouble thinking of a downside...
>
> For transformations that change the type as in the example above we
> should insist on getting the right one, otherwise we might introduce
> bugs -- like in the example above. p points to a struct packed_git and
> not to a struct object_id, so this introduces a type mismatch.

Heh. You'd think that I would have applied that patch and run "make". Or
even read it carefully. Thanks for pointing that out.

I guess now we have a real example of a downside (the compiler _would_
still catch it, but it means "make coccicheck" is useless if it's
repeatedly suggesting a bad transformation).

-Peff
René Scharfe <l.s.r@web.de> writes:

>> diff -u -p a/packfile.c b/packfile.c
>> --- a/packfile.c
>> +++ b/packfile.c
>> @@ -735,7 +735,7 @@ struct packed_git *add_packed_git(const
>>  	p->mtime = st.st_mtime;
>>  	if (path_len < the_hash_algo->hexsz ||
>>  	    get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->hash))
>> -		hashclr(p->hash);
>> +		oidclr(p);
>>  	return p;
>>  }
>>
>>
>> Maybe it's worth being looser in our cocci patch definitions. I'm having
>> trouble thinking of a downside...
>
> For transformations that change the type as in the example above we
> should insist on getting the right one, otherwise we might introduce
> bugs -- like in the example above. p points to a struct packed_git and
> not to a struct object_id, so this introduces a type mismatch.

;-) A good counter-example.

> We better make sure our semantic patches are safe, otherwise we have to
> check all conversions very carefully, and then we might be better off
> doing them manually..

Yes, that is a sensible suggestion.
Jeff King <peff@peff.net> writes:

>  @@
> -struct object_id *OIDPTR1;
> -struct object_id *OIDPTR2;
> +expression OIDPTR1;
> +expression OIDPTR2;
>  @@
>  - oidcmp(OIDPTR1, OIDPTR2) == 0
>  + oideq(OIDPTR1, OIDPTR2)
> @@ -71,8 +71,8 @@ expression E1, E2;
>    ...>}
>
>  @@
> -struct object_id *OIDPTR1;
> -struct object_id *OIDPTR2;
> +expression *OIDPTR1;
> +expression *OIDPTR2;
>  @@
>  - oidcmp(OIDPTR1, OIDPTR2) != 0
>  + !oideq(OIDPTR1, OIDPTR2)

With an extra insight from the counter-example René pointed out in your
message, I think the above two are safe but all the others are unsafe.
On 09.09.20 at 21:13, Jeff King wrote:
> On Wed, Sep 09, 2020 at 08:00:57AM -0600, Edmundo Carmona Antoranz wrote:
>
>> On Wed, Sep 9, 2020 at 3:11 AM Jeff King <peff@peff.net> wrote:
>>>
>>> Yeah, it looks obviously correct. I am puzzled why "make coccicheck"
>>> doesn't find this, though. +cc René, as my favorite target for
>>> coccinelle nerd-snipes. :)
>>>
>>
>> I added this to contrib/coccinelle/object_id.cocci in v2.27.0
>>
>> @@
>> identifier f != oideq;
>> expression E1, E2;
>> @@
>> - !oidcmp(E1, E2)
>> + oideq(E1, E2)
>>
>> And it found it:
>
> Interesting. The existing rule is:
>
>   struct object_id *OIDPTR1;
>   struct object_id *OIDPTR2;
>   @@
>   - oidcmp(OIDPTR1, OIDPTR2) == 0
>   + oideq(OIDPTR1, OIDPTR2)
>
> The "== 0" part looks like it might be significant, but it's not.
> Coccinelle knows that "!foo" is the same as "foo == 0" (and you can
> confirm by tweaking it).

It is significant in the sense that "x == 0" in the semantic patch also
matches "!x" in the code, but "!x" in the semantic patch doesn't match
"x == 0". That's because coccinelle has this isomorphism built in (in
/usr/lib/coccinelle/standard.iso on my machine):

	Expression
	@ not_int1 @
	int X;
	@@
	!X => X == 0

It's a one-way isomorphism (i.e. a rule that says that certain
expressions have the same meaning). So we should use "x == 0" over "!x"
in semantic patches to cover both cases.

> So the relevant part is probably that our existing rule specifies the
> exact type, whereas your rule allows any expression.
>
> And indeed, if I do this, it works:
>
> diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
> index ddf4f22bd7..62a6cee0eb 100644
> --- a/contrib/coccinelle/object_id.cocci
> +++ b/contrib/coccinelle/object_id.cocci
> @@ -55,8 +55,8 @@ struct object_id OID;
>  + oidcmp(&OID, OIDPTR)
>
>  @@
> -struct object_id *OIDPTR1;
> -struct object_id *OIDPTR2;
> +expression OIDPTR1;
> +expression OIDPTR2;
>  @@
>  - oidcmp(OIDPTR1, OIDPTR2) == 0
>  + oideq(OIDPTR1, OIDPTR2)
>
> Which really _seems_ like a bug in coccinelle, unless I am missing
> something. Because both of those parameters look like object_id pointers
> (and the compiler would be complaining if it were not the case).

Yes, it looks like coccinelle gives up trying to determine the type of
these things.

And while this one here matches the example in blame.c:

	@@
	expression A, B;
	@@
	- 0 == oidcmp(A, B)
	+ oideq(A, B)

... and this one does as well:

	@@
	expression A, B;
	@@
	- !oidcmp
	+ oideq
	  (A, B)

... the following one doesn't:

	@@
	expression A, B;
	@@
	- 0 == oidcmp
	+ oideq
	  (A, B)

... and neither does this one:

	@@
	expression A, B;
	@@
	- oidcmp
	+ oideq
	  (A, B)
	- == 0

So it helps to try some variants in the hope to bypass some of the
restrictions/bugs/misunderstandings. O_o

René
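As a side note for readers, the spellings tried above ("!x", "x == 0", "0 == x") are interchangeable as far as the C compiler is concerned; they differ only in how Coccinelle's matcher treats them. A tiny sketch, with hypothetical helper names, that checks the three forms against each other on a plain int such as a comparison result:

```c
#include <assert.h>

/* Three spellings of "the comparison result is zero". */
static int eq_not(int cmp)        { return !cmp; }
static int eq_zero_right(int cmp) { return cmp == 0; }
static int eq_zero_left(int cmp)  { return 0 == cmp; }

/* Returns 1 when all three spellings give the same answer for cmp. */
static int all_spellings_agree(int cmp)
{
	return eq_not(cmp) == eq_zero_right(cmp) &&
	       eq_zero_right(cmp) == eq_zero_left(cmp);
}
```

Since the forms are equivalent in C, a difference in which semantic patch variant matches says something about the tool's pattern matching and isomorphisms, not about the code being matched.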
diff --git a/blame.c b/blame.c
index 1be1cd82a2..b475bfa1c0 100644
--- a/blame.c
+++ b/blame.c
@@ -1353,8 +1353,8 @@ static struct blame_origin *find_origin(struct repository *r,
 		else {
 			int compute_diff = 1;
 			if (origin->commit->parents &&
-			    !oidcmp(&parent->object.oid,
-				    &origin->commit->parents->item->object.oid))
+			    oideq(&parent->object.oid,
+				  &origin->commit->parents->item->object.oid))
 				compute_diff = maybe_changed_path(r, origin, bd);
 
 			if (compute_diff)