diff mbox series

[2/2] cocci: codify authoring and reviewing practices

Message ID 75feb18dfd8af03f5e7ba02403a16a0ed4c2edaa.1681329955.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series cocci: codify authoring and reviewing practices | expand

Commit Message

Glen Choo April 12, 2023, 8:05 p.m. UTC
From: Glen Choo <chooglen@google.com>

This isn't set in stone; we expect this to be updated as the project
evolves.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 contrib/coccinelle/README | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Comments

SZEDER Gábor April 16, 2023, 7:42 a.m. UTC | #1
On Wed, Apr 12, 2023 at 08:05:55PM +0000, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
> 
> This isn't set in stone; we expect this to be updated as the project
> evolves.
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  contrib/coccinelle/README | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
> index 9b28ba1c57a..055e3622e5c 100644
> --- a/contrib/coccinelle/README
> +++ b/contrib/coccinelle/README
> @@ -92,3 +92,26 @@ that might be useful to developers.
>  
>     The absolute times will differ for you, but the relative speedup
>     from caching should be on that order.
> +
> +== Authoring and reviewing coccinelle changes
> +
> +* When introducing and applying a new .cocci file, both the Git changes and
> +  .cocci file should be reviewed.
> +
> +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is
> +  enough for the reviewer to get a rough understanding of the proposed rules by
> +  comparing the .cocci and Git changes, then checking that understanding
> +  with the author.
> +
> +* Conversely, authors should consider that reviewers may not be coccinelle
> +  experts. The primary aim should be to make .cocci files easy to understand,
> +  e.g. by adding comments or by using rules that are easier to understand even
> +  if they are less elegant.
> +
> +* .cocci rules should target only the problem it is trying to solve; "collateral
> +  damage" is not allowed.
> +
> +* .cocci files used for refactoring should be temporarily kept in-tree to aid

How should such semantic patches be kept in-tree?
As .pending.cocci?  Then I think it would be better to point this out
here.  Or as a "regular" semantic patch?  Then I'm not sure I agree
with this recommendation, but perhaps a commit message explaining the
reasoning behind this would help me make up my mind :)

It might also be worth mentioning that before submitting a new
semantic patch developers should consider its cost-benefit ratio, in
particular its effect on the runtime of 'make coccicheck', in the hope
that we can avoid another 'unused.cocci' fiasco.

> +  the refactoring of out-of-tree code (e.g. in-flight topics). They should be
> +  removed when enough time has been given for others to refactor their code,
> +  i.e. ~1 release cycle.
> -- 
> gitgitgadget
Ævar Arnfjörð Bjarmason April 16, 2023, 1:37 p.m. UTC | #2
On Wed, Apr 12 2023, Glen Choo via GitGitGadget wrote:

> From: Glen Choo <chooglen@google.com>
>
> This isn't set in stone; we expect this to be updated as the project
> evolves.
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  contrib/coccinelle/README | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
> index 9b28ba1c57a..055e3622e5c 100644
> --- a/contrib/coccinelle/README
> +++ b/contrib/coccinelle/README
> @@ -92,3 +92,26 @@ that might be useful to developers.
>  
>     The absolute times will differ for you, but the relative speedup
>     from caching should be on that order.
> +
> +== Authoring and reviewing coccinelle changes
> +
> +* When introducing and applying a new .cocci file, both the Git changes and
> +  .cocci file should be reviewed.
> +
> +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is
> +  enough for the reviewer to get a rough understanding of the proposed rules by
> +  comparing the .cocci and Git changes, then checking that understanding
> +  with the author.

Maybe it would be useful here to add something about how you can
reproduce the application of the coccinelle rule(s).

I sometimes do this on an ad-hoc basis, something like (untested):

	git checkout HEAD^ -- ':!contrib/coccinelle' '*.[ch]'
	make coccicheck
	<apply any suggested patches>
	git add -A

Then see if I ended up with a no-op, or if there's suggested changes.

With changes that modify both the header & source files this can be
tricky with the default of SPATCH_USE_O_DEPENDENCIES=Y, but disabling it
will take care of any potential circular dependency issues. I.e. when
the header doesn't contain a required construct that we're replacing.

> +* Conversely, authors should consider that reviewers may not be coccinelle
> +  experts. The primary aim should be to make .cocci files easy to understand,
> +  e.g. by adding comments or by using rules that are easier to understand even
> +  if they are less elegant.

I agree that simple things should be kept simple, but this seems to come
quite close (or perhaps past the line of) suggesting that we use only
the simpler features of the language when a more elegant solution would
be available with something less well-known.

I think we should clarify that that's not the intent. Just as with C,
shellscript, Perl etc. we should aim for simplicity, but ultimately we
should expect that we can target the full available language available
to us.

> +* .cocci rules should target only the problem it is trying to solve; "collateral
> +  damage" is not allowed.

I think what you mean here is that you should be able to apply the rule
and still build the project.

I think that's correct, but I also think that rather than define this in
prose, how about we just modify the current CI job to apply the result
of non-pending rules, and do a build at the end? Wouldn't that assert
this going forward.

> +* .cocci files used for refactoring should be temporarily kept in-tree to aid
> +  the refactoring of out-of-tree code (e.g. in-flight topics). They should be
> +  removed when enough time has been given for others to refactor their code,
> +  i.e. ~1 release cycle.

Maybe s/should/can/? E.g. for my recent "index" and "the_repository"
patches I think they can, but we often keep unused code in-repo for
longer than that. If e.g. that code stayed in for more than one release
until someone cared to remove it we'd also be fine.

I also don't know if some long-running forks (e.g. GfW?) would benefit
from the rules for longer than that...
Glen Choo April 19, 2023, 7:29 p.m. UTC | #3
SZEDER Gábor <szeder.dev@gmail.com> writes:

>> +* .cocci rules should target only the problem it is trying to solve; "collateral
>> +  damage" is not allowed.
>> +
>> +* .cocci files used for refactoring should be temporarily kept in-tree to aid
>
> How should such semantic patches be kept in-tree?
> As .pending.cocci?  Then I think it would be better to point this out
> here.  Or as a "regular" semantic patch?  Then I'm not sure I agree
> with this recommendation, but perhaps a commit message explaining the
> reasoning behind this would help me make up my mind :)

I don't feel strongly about this, but I was envisioning keeping them as
a "regular" patch, e.g. what Ævar proposed in:

  https://lore.kernel.org/git/230326.86ileow1fu.gmgdl@evledraar.gmail.com/

In theory, this means that a long running fork (that didn't get updated
during the initial refactor) can run coccicheck, notice the failure, and
then automatically fix themselves with the included semantic patch. In
practice, I don't know how many forks run coccicheck, or whether these
refactors are just easy enough to do by hand.

For refactors, I suspect that the impact on the 'make coccicheck'
runtime will be low, since we're typically targeting just a few tokens
and cocci can skip whatever files don't have those tokens, so keeping it
as a "regular" patch might be okay.

> It might also be worth mentioning that before submitting a new
> semantic patch developers should consider its cost-benefit ratio, in
> particular its effect on the runtime of 'make coccicheck',

Makes sense, though I'm not sure what practical advice to give in order
to evaluate the impact on runtime (besides just running it themselves).

> in the hope
> that we can avoid another 'unused.cocci' fiasco.

Maybe this is a good starting point for discussing cost-benefit
analysis. I'm not familiar with this fiasco, though. Was an early
version of 'unused.cocci' too broad, resulting in a massive hit to
runtime?
Glen Choo April 19, 2023, 10:30 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Wed, Apr 12 2023, Glen Choo via GitGitGadget wrote:
>
>> +== Authoring and reviewing coccinelle changes
>> +
>> +* When introducing and applying a new .cocci file, both the Git changes and
>> +  .cocci file should be reviewed.
>> +
>> +* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is
>> +  enough for the reviewer to get a rough understanding of the proposed rules by
>> +  comparing the .cocci and Git changes, then checking that understanding
>> +  with the author.
>
> Maybe it would be useful here to add something about how you can
> reproduce the application of the coccinelle rule(s).
>
> I sometimes do this on an ad-hoc basis, something like (untested):
>
> 	git checkout HEAD^ -- ':!contrib/coccinelle' '*.[ch]'
> 	make coccicheck
> 	<apply any suggested patches>
> 	git add -A
>
> Then see if I ended up with a no-op, or if there's suggested changes.
>
> With changes that modify both the header & source files this can be
> tricky with the default of SPATCH_USE_O_DEPENDENCIES=Y, but disabling it
> will take care of any potential circular dependency issues. I.e. when
> the header doesn't contain a required construct that we're replacing.

Makes sense. I've been thinking about sending a "MyFirstCocci" guide as
a follow up, and this sounds like the kind of "Tips & Tricks" content
that would belong there.

>> +* Conversely, authors should consider that reviewers may not be coccinelle
>> +  experts. The primary aim should be to make .cocci files easy to understand,
>> +  e.g. by adding comments or by using rules that are easier to understand even
>> +  if they are less elegant.
>
> I agree that simple things should be kept simple, but this seems to come
> quite close (or perhaps past the line of) suggesting that we use only
> the simpler features of the language when a more elegant solution would
> be available with something less well-known.
>
> I think we should clarify that that's not the intent. Just as with C,
> shellscript, Perl etc. we should aim for simplicity, but ultimately we
> should expect that we can target the full available language available
> to us.

Makes sense too. I think I'll adjust this to something to the effect of
"When using more esoteric parts of the language, be prepared to explain
what the .cocci is doing.".

>> +* .cocci rules should target only the problem it is trying to solve; "collateral
>> +  damage" is not allowed.
>
> I think what you mean here is that you should be able to apply the rule
> and still build the project.

Yes and no. Yes in that if the project doesn't build, the rule is
obviously overly broad, but no in that I think it's possible to write a
rule that affects something you didn't mean to, but still builds. I
can't think of a way to automatedly check for the latter case, so I
categorized it as something to catch at review time.

>> +* .cocci files used for refactoring should be temporarily kept in-tree to aid
>> +  the refactoring of out-of-tree code (e.g. in-flight topics). They should be
>> +  removed when enough time has been given for others to refactor their code,
>> +  i.e. ~1 release cycle.
>
> Maybe s/should/can/? E.g. for my recent "index" and "the_repository"
> patches I think they can, but we often keep unused code in-repo for
> longer than that. If e.g. that code stayed in for more than one release
> until someone cared to remove it we'd also be fine.
>
> I also don't know if some long-running forks (e.g. GfW?) would benefit
> from the rules for longer than that...

Yeah, this is the most iffy to me too, which means it would be extra
helpful to decide on as many details as we can now instead of deciding
ad-hoc.

Post-refactor, the .cocci file is always obsolete in-tree, so I think we
can either say "always keep the patch" or "never keep the patch".

If I understand you correctly, _how long_ to keep the patch is probably
a case-by-case matter, though (makes sense to me). I think this comes
down to the cost-benefit tradeoff mentioned by others elsewhere in the
thread. Maybe I'll just mention that it depends on the cost-benefit
analysis and drop the "~1 release cycle" recommendation.
diff mbox series

Patch

diff --git a/contrib/coccinelle/README b/contrib/coccinelle/README
index 9b28ba1c57a..055e3622e5c 100644
--- a/contrib/coccinelle/README
+++ b/contrib/coccinelle/README
@@ -92,3 +92,26 @@  that might be useful to developers.
 
    The absolute times will differ for you, but the relative speedup
    from caching should be on that order.
+
+== Authoring and reviewing coccinelle changes
+
+* When introducing and applying a new .cocci file, both the Git changes and
+  .cocci file should be reviewed.
+
+* Reviewers do not need to be coccinelle experts. To give a Reviewed-By, it is
+  enough for the reviewer to get a rough understanding of the proposed rules by
+  comparing the .cocci and Git changes, then checking that understanding
+  with the author.
+
+* Conversely, authors should consider that reviewers may not be coccinelle
+  experts. The primary aim should be to make .cocci files easy to understand,
+  e.g. by adding comments or by using rules that are easier to understand even
+  if they are less elegant.
+
+* .cocci rules should target only the problem it is trying to solve; "collateral
+  damage" is not allowed.
+
+* .cocci files used for refactoring should be temporarily kept in-tree to aid
+  the refactoring of out-of-tree code (e.g. in-flight topics). They should be
+  removed when enough time has been given for others to refactor their code,
+  i.e. ~1 release cycle.