Message ID | 75feb18dfd8af03f5e7ba02403a16a0ed4c2edaa.1681329955.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | cocci: codify authoring and reviewing practices | expand |
On Wed, Apr 12, 2023 at 08:05:55PM +0000, Glen Choo via GitGitGadget wrote: > From: Glen Choo <chooglen@google.com> > > This isn't set in stone; we expect this to be updated as the project > evolves. > > Signed-off-by: Glen Choo <chooglen@google.com> > --- > contrib/coccinelle/README | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README > index 9b28ba1c57a..055e3622e5c 100644 > --- a/contrib/coccinelle/README > +++ b/contrib/coccinelle/README > @@ -92,3 +92,26 @@ that might be useful to developers. > > The absolute times will differ for you, but the relative speedup > from caching should be on that order. > + > +== Authoring and reviewing coccinelle changes > + > +* When introducing and applying a new .cocci file, both the Git changes and > + .cocci file should be reviewed. > + > +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is > + enough for the reviewer to get a rough understanding of the proposed rules by > + comparing the .cocci and Git changes, then checking that understanding > + with the author. > + > +* Conversely, authors should consider that reviewers may not be coccinelle > + experts. The primary aim should be to make .cocci files easy to understand, > + e.g. by adding comments or by using rules that are easier to understand even > + if they are less elegant. > + > +* .cocci rules should target only the problem it is trying to solve; "collateral > + damage" is not allowed. > + > +* .cocci files used for refactoring should be temporarily kept in-tree to aid How should such semantic patches be kept in-tree? As .pending.cocci? Then I think it would be better to point this out here. Or as a "regular" semantic patch? Then I'm not sure I agree with this recommendation, but perhaps a commit message explaining the reasoning behind this would help me make up my mind :) It might also be worth mentioning that before submitting a new semantic patch developers should consider its cost-benefit ratio, in particular its effect on the runtime of 'make coccicheck', in the hope that we can avoid another 'unused.cocci' fiasco. > + the refactoring of out-of-tree code (e.g. in-flight topics). They should be > + removed when enough time has been given for others to refactor their code, > + i.e. ~1 release cycle. > -- > gitgitgadget
On Wed, Apr 12 2023, Glen Choo via GitGitGadget wrote: > From: Glen Choo <chooglen@google.com> > > This isn't set in stone; we expect this to be updated as the project > evolves. > > Signed-off-by: Glen Choo <chooglen@google.com> > --- > contrib/coccinelle/README | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README > index 9b28ba1c57a..055e3622e5c 100644 > --- a/contrib/coccinelle/README > +++ b/contrib/coccinelle/README > @@ -92,3 +92,26 @@ that might be useful to developers. > > The absolute times will differ for you, but the relative speedup > from caching should be on that order. > + > +== Authoring and reviewing coccinelle changes > + > +* When introducing and applying a new .cocci file, both the Git changes and > + .cocci file should be reviewed. > + > +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is > + enough for the reviewer to get a rough understanding of the proposed rules by > + comparing the .cocci and Git changes, then checking that understanding > + with the author. Maybe it would be useful here to add something about how you can reproduce the application of the coccinelle rule(s). I sometimes do this on an ad-hoc basis, something like (untested): git checkout HEAD^ -- ':!contrib/coccinelle' '*.[ch]' make coccicheck <apply any suggested patches> git add -A Then see if I ended up with a no-op, or if there's suggested changes. With changes that modify both the header & source files this can be tricky with the default of SPATCH_USE_O_DEPENDENCIES=Y, but disabling it will take care of any potential circular dependency issues. I.e. when the header doesn't contain a required construct that we're replacing. > +* Conversely, authors should consider that reviewers may not be coccinelle > + experts. The primary aim should be to make .cocci files easy to understand, > + e.g. by adding comments or by using rules that are easier to understand even > + if they are less elegant. I agree that simple things should be kept simple, but this seems to come quite close (or perhaps past the line of) suggesting that we use only the simpler features of the language when a more elegant solution would be available with something less well-known. I think we should clarify that that's not the intent. Just as with C, shellscript, Perl etc. we should aim for simplicity, but ultimately we should expect that we can target the full available language available to us. > +* .cocci rules should target only the problem it is trying to solve; "collateral > + damage" is not allowed. I think what you mean here is that you should be able to apply the rule and still build the project. I think that's correct, but I also think that rather than define this in prose, how about we just modify the current CI job to apply the result of non-pending rules, and do a build at the end? Wouldn't that assert this going forward. > +* .cocci files used for refactoring should be temporarily kept in-tree to aid > + the refactoring of out-of-tree code (e.g. in-flight topics). They should be > + removed when enough time has been given for others to refactor their code, > + i.e. ~1 release cycle. Maybe s/should/can/? E.g. for my recent "index" and "the_repository" patches I think they can, but we often keep unused code in-repo for longer than that. If e.g. that code stayed in for more than one release until someone cared to remove it we'd also be fine. I also don't know if some long-running forks (e.g. GfW?) would benefit from the rules for longer than that...
SZEDER Gábor <szeder.dev@gmail.com> writes: >> +* .cocci rules should target only the problem it is trying to solve; "collateral >> + damage" is not allowed. >> + >> +* .cocci files used for refactoring should be temporarily kept in-tree to aid > > How should such semantic patches be kept in-tree? > As .pending.cocci? Then I think it would be better to point this out > here. Or as a "regular" semantic patch? Then I'm not sure I agree > with this recommendation, but perhaps a commit message explaining the > reasoning behind this would help me make up my mind :) I don't feel strongly about this, but I was envisioning keeping them as a "regular" patch, e.g. what Ævar proposed in: https://lore.kernel.org/git/230326.86ileow1fu.gmgdl@evledraar.gmail.com/ In theory, this means that a long running fork (that didn't get updated during the initial refactor) can run coccicheck, notice the failure, and then automatically fix themselves with the included semantic patch. In practice, I don't know how many forks run coccicheck, or whether these refactors are just easy enough to do by hand. For refactors, I suspect that the impact on the 'make coccicheck' runtime will be low, since we're typically targeting just a few tokens and cocci can skip whatever files don't have those tokens, so keeping it as a "regular" patch might be okay. > It might also be worth mentioning that before submitting a new > semantic patch developers should consider its cost-benefit ratio, in > particular its effect on the runtime of 'make coccicheck', Makes sense, though I'm not sure what practical advice to give in order to evaluate the impact on runtime (besides just running it themselves). > in the hope > that we can avoid another 'unused.cocci' fiasco. Maybe this is a good starting point for discussing cost-benefit analysis. I'm not familiar with this fiasco, though. Was an early version of 'unused.cocci' too broad, resulting in a massive hit to runtime?
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Wed, Apr 12 2023, Glen Choo via GitGitGadget wrote: > >> +== Authoring and reviewing coccinelle changes >> + >> +* When introducing and applying a new .cocci file, both the Git changes and >> + .cocci file should be reviewed. >> + >> +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is >> + enough for the reviewer to get a rough understanding of the proposed rules by >> + comparing the .cocci and Git changes, then checking that understanding >> + with the author. > > Maybe it would be useful here to add something about how you can > reproduce the application of the coccinelle rule(s). > > I sometimes do this on an ad-hoc basis, something like (untested): > > git checkout HEAD^ -- ':!contrib/coccinelle' '*.[ch]' > make coccicheck > <apply any suggested patches> > git add -A > > Then see if I ended up with a no-op, or if there's suggested changes. > > With changes that modify both the header & source files this can be > tricky with the default of SPATCH_USE_O_DEPENDENCIES=Y, but disabling it > will take care of any potential circular dependency issues. I.e. when > the header doesn't contain a required construct that we're replacing. Makes sense. I've been thinking about sending a "MyFirstCocci" guide as a follow up, and this sounds like the kind of "Tips & Tricks" content that would belong there. >> +* Conversely, authors should consider that reviewers may not be coccinelle >> + experts. The primary aim should be to make .cocci files easy to understand, >> + e.g. by adding comments or by using rules that are easier to understand even >> + if they are less elegant. > > I agree that simple things should be kept simple, but this seems to come > quite close (or perhaps past the line of) suggesting that we use only > the simpler features of the language when a more elegant solution would > be available with something less well-known. > > I think we should clarify that that's not the intent. Just as with C, > shellscript, Perl etc. we should aim for simplicity, but ultimately we > should expect that we can target the full available language available > to us. Makes sense too. I think I'll adjust this to something to the effect of "When using more esoteric parts of the language, be prepared to explain what the .cocci is doing.". >> +* .cocci rules should target only the problem it is trying to solve; "collateral >> + damage" is not allowed. > > I think what you mean here is that you should be able to apply the rule > and still build the project. Yes and no. Yes in that if the project doesn't build, the rule is obviously overly broad, but no in that I think it's possible to write a rule that affects something you didn't mean to, but still builds. I can't think of a way to automatedly check for the latter case, so I categorized it as something to catch at review time. >> +* .cocci files used for refactoring should be temporarily kept in-tree to aid >> + the refactoring of out-of-tree code (e.g. in-flight topics). They should be >> + removed when enough time has been given for others to refactor their code, >> + i.e. ~1 release cycle. > > Maybe s/should/can/? E.g. for my recent "index" and "the_repository" > patches I think they can, but we often keep unused code in-repo for > longer than that. If e.g. that code stayed in for more than one release > until someone cared to remove it we'd also be fine. > > I also don't know if some long-running forks (e.g. GfW?) would benefit > from the rules for longer than that... Yeah, this is the most iffy to me too, which means it would be extra helpful to decide on as many details as we can now instead of deciding ad-hoc. Post-refactor, the .cocci file is always obsolete in-tree, so I think we can either say "always keep the patch" or "never keep the patch". If I understand you correctly, _how long_ to keep the patch is probably a case-by-case matter, though (makes sense to me). I think this comes down to the cost-benefit tradeoff mentioned by others elsewhere in the thread. Maybe I'll just mention that it depends on the cost-benefit analysis and drop the "~1 release cycle" recommendation.
diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README index 9b28ba1c57a..055e3622e5c 100644 --- a/contrib/coccinelle/README +++ b/contrib/coccinelle/README @@ -92,3 +92,26 @@ that might be useful to developers. The absolute times will differ for you, but the relative speedup from caching should be on that order. + +== Authoring and reviewing coccinelle changes + +* When introducing and applying a new .cocci file, both the Git changes and + .cocci file should be reviewed. + +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is + enough for the reviewer to get a rough understanding of the proposed rules by + comparing the .cocci and Git changes, then checking that understanding + with the author. + +* Conversely, authors should consider that reviewers may not be coccinelle + experts. The primary aim should be to make .cocci files easy to understand, + e.g. by adding comments or by using rules that are easier to understand even + if they are less elegant. + +* .cocci rules should target only the problem it is trying to solve; "collateral + damage" is not allowed. + +* .cocci files used for refactoring should be temporarily kept in-tree to aid + the refactoring of out-of-tree code (e.g. in-flight topics). They should be + removed when enough time has been given for others to refactor their code, + i.e. ~1 release cycle.