Message ID | 20220127190259.2470753-4-gitster@pobox.com (mailing list archive) |
---|---|
State | Accepted |
Commit | cdba0295b0a70cf7a8113dd74be90d11ceddd5f7 |
Headers | show |
Series | None | expand |
On Thu, Jan 27, 2022 at 11:02:59AM -0800, Junio C Hamano wrote: > > Extend the "describe your changes well" section to cover whom we are > trying to help by doing so in the first place. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > Documentation/SubmittingPatches | 29 +++++++++++++++++++++++++++++ > 1 file changed, 29 insertions(+) > > diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches > index 7225a0fb52..a6121d1d42 100644 > --- a/Documentation/SubmittingPatches > +++ b/Documentation/SubmittingPatches > @@ -110,6 +110,35 @@ run `git diff --check` on your changes before you commit. > [[describe-changes]] > === Describe your changes well. > > +The log message that explains your changes is just as important as the > +changes themselves. Your code may be clearly written with in-code > +comment to sufficiently explain how it works with the surrounding > +code, but those who need to fix or enhance your code in the future > +will need to know _why_ your code does what it does, for a few > +reasons: > + > +. Your code may be doing something differently from what you wanted it > + to do. Writing down what you actually wanted to achieve will help > + them fix your code and make it do what it should have been doing > + (also, you often discover your own bugs yourself, while writing the > + log message to summarize the thought behind it). > + > +. Your code may be doing things that were only necessary for your > + immediate needs (e.g. "do X to directories" without implementing or > + even designing what is to be done on files). Writing down why you > + excluded what the code does not do will help guide future developers. > + Writing down "we do X to directories, because directories have > + characteristic Y" would help them infer "oh, files also have the same > + characteristic Y, so perhaps doing X to them would also make sense?". > + Saying "we don't do the same X to files, because ..." will help them > + decide if the reasoning is sound (in which case they do not waste > + time extending your code to cover files), or reason differently (in > + which case, they can explain why they extend your code to cover > + files, too). > + > +The goal of your log message is to convey the _why_ behind your > +change to help future developers. > + This is pretty compelling. Is it clear enough why we care about this in the commit message, as opposed to in the patch description (cover letter or post-"---" blurb)? Is it too obvious to explicitly mention that the commit message is the first thing we try to make sense of during a 'git blame' or 'git bisect'? Either way, I think this is an improvement on the doc as it was before. Reviewed-by: Emily Shaffer <emilyshaffer@google.com> > The first line of the commit message should be a short description (50 > characters is the soft limit, see DISCUSSION in linkgit:git-commit[1]), > and should skip the full stop. It is also conventional in most cases to > -- > 2.35.0-177-g7d269f5170 >
Emily Shaffer <emilyshaffer@google.com> writes: >> +The goal of your log message is to convey the _why_ behind your >> +change to help future developers. >> + > > This is pretty compelling. Is it clear enough why we care about this in > the commit message, as opposed to in the patch description (cover letter > or post-"---" blurb)? Is it too obvious to explicitly mention that the > commit message is the first thing we try to make sense of during a 'git > blame' or 'git bisect'? Having to say "this may be better in the in-code comment rather than the log message" to some patch recently (I do not remember), I tend to agree that some guidance would help people decide between the two (or writing both). Again, patches welcome ;-)
On Thu, Mar 03 2022, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > >>> +The goal of your log message is to convey the _why_ behind your >>> +change to help future developers. >>> + >> >> This is pretty compelling. Is it clear enough why we care about this in >> the commit message, as opposed to in the patch description (cover letter >> or post-"---" blurb)? Is it too obvious to explicitly mention that the >> commit message is the first thing we try to make sense of during a 'git >> blame' or 'git bisect'? > > Again, patches welcome ;-) I think for something that's a stylistic preference I'd see why Emily would try to see how you feel about it first. I.e. in this project it's ultimately up to you to decide on those things, so for coding guidelines you've just updated I'd probably do the same. Don't take that as a complaint on the end result b.t.w., I think the overall excellent state of the codebase in this project speaks for itself on that front. I just see why other contributors would be gun shy about pulling the trigger on a patch submission on this particular front :) > Having to say "this may be better in the in-code comment rather than > the log message" to some patch recently (I do not remember), I tend > to agree that some guidance would help people decide between the two > (or writing both). [A somewhat sneaky $subject change :)] I think you're referring to this comment on (some admittedly tricky) code I wrote[1]. First, given the above I'll adjust that to your preferences on a re-submission. So don't take this as some argument on *that* specific point. I'll add a comment in a re-roll. But on reflection I still wouldn't put a comment on that code if it were purely up to me. Why? First, I think all programmers go through a phase of learning where they feel more compelled to comment on the "how" v.s. "why" early on. At the most extremes beginner programmers explaining how say a common standard library function works in code that's relatively straightforward. But that's really not the case in [1], that code really is doing something odd and worth explaining. So why not add a comment? Because access to ubiquitous and *local* source control from git changed a lot of people's habits on this front, including mine. For this code I *would* definitely add a comment there if it was the pre-DVCS[2] days. But I'd say that today my criteria for adding a comment is closer to: Is this so essential to note for the understanding of the rest of this code that nobody who's skimming past this line or reading this part *wouldn't* want this information? Or rather, they might not absolutely want to know, but it might be useful, *and* there's nothing odd about the pattern itself that makes you go "hrm?" enough to run "git blame/log" on it. In this case it's clearly pretty weird that we run the exact "test-tool regex" command twice. But I think that "hrm?" applies. To elaborate: A. It's setting up a self-contained prereq, so it's not essential to understand that implementation detail to read the rest of the code. B. Anyone who does want to see why that odd case is the way it is can run some version "git blame" or say: git log -p -L14,21:t/t7812-grep-icase-non-ascii.sh C. Each comment you add, even within a function or other scope dilutes the value of other more important comments. It trains people not to read them, as they're probably not that important. D. Comments that are "frozen in time" by adding them to the code itself are almost always in danger of drifting in accuracy from the rest of the implementation and its assumptions. Even in well-curated codebases like git.git they're *much more* likely to drift away from the "ground truth" than code is. E. Even if "D" isn't true, commit messages (in this case my [3]) are almost always at an advantage over comments in that they accompany a change to the pre-image. So e.g. that commit message doesn't need to waste time explaining what pattern we'd prefer not to have here instead of the post-image, you get that context for free. Due to a combination of "D" and "E" I almost never read comments in their current context, unless it's painfully obvious that they *must* still apply to the current code. I'll usually run some variant of "git blame" or say: git log -p -L:relevant_function:file And then see how the code looked when that comment was added, and page through what's changed since then. 1. https://lore.kernel.org/git/xmqqsfryah42.fsf@gitster.g/ 2. To those who'd nitpick DVCS v.s. VCS: Yes nothing changed in theory from CVS/SVN etc. on this front, but in practice it did. Access to version control didn't tend to be ubiquitous, and even if it was accessible many people browsing or contributing to your code would probably do so via a downloaded tarball than trying to get CVS or whatever to work. So the "D" in DVCS really did change this. 3. https://lore.kernel.org/git/patch-12.15-f3cc5bc7eb9-20220302T171755Z-avarab@gmail.com/
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Thu, Mar 03 2022, Junio C Hamano wrote: > >> Emily Shaffer <emilyshaffer@google.com> writes: >> >>>> +The goal of your log message is to convey the _why_ behind your >>>> +change to help future developers. >>>> + >>> >>> This is pretty compelling. Is it clear enough why we care about this in >>> the commit message, as opposed to in the patch description (cover letter >>> or post-"---" blurb)? Is it too obvious to explicitly mention that the >>> commit message is the first thing we try to make sense of during a 'git >>> blame' or 'git bisect'? >> >> Again, patches welcome ;-) > > I think for something that's a stylistic preference I'd see why Emily > would try to see how you feel about it first. I do not think there is a need to write down stylistic preference. I may show my preference during my reviews, of course, but I won't take preference-only things as a blocker. I also do not think things like "'We used to do X here but we do Y because ...' does not belong to in-code comment, but to log message" is "stylistic preference", and if people are unclear about, I agree that we should spell it out. An example, I can think of offhand, of what should be in comment, whether we also write in the log, is "We do X here because that other code expects us to", when it is tricky to figure out by reading the code by itself without going back to "git blame". "git blame" certainly can be used to figure out which commit touched the line that does X (which is hard to figure out why), and the log message can refer to "that other code expects us to", but that is an extra operation. Also, when we really need to figure out, it is wonderful that we can ask "blame" to give us the commit, and can look at "that other code" in the same commit by checking it out to the working tree, especially when "that other code" may have drifted and the original reasoning no longer applies (iow, what we find out from "git blame" may become stale, and it will stay stale forever because we cannot rewrite the history that old). Now, it is certainly not black/white decision to say what is and what is not tricky to figure out in the code. We shouldn't be commenting obvious things. Two yardsticks I use are (1) if reviewers raise questions during a review, it may indicate that it is worth commenting. (2) if an earlier round of the same series had a bug around the area, it may indicate that the fixed code is worth commenting. but the way I use them is more to say "I found this uncommented code somewhat tricky---but nobody asked a question and it has stayed the same from the initial round, so it may be clear enough for other people, and after all I managed to figure out myself, so it probably is OK to leave it uncommented".
Junio C Hamano <gitster@pobox.com> writes: > I also do not think things like "'We used to do X here but we do Y > because ...' does not belong to in-code comment, but to log message" Sorry for too many nagations. "I also do not" -> "I also do".
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 7225a0fb52..a6121d1d42 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -110,6 +110,35 @@ run `git diff --check` on your changes before you commit. [[describe-changes]] === Describe your changes well. +The log message that explains your changes is just as important as the +changes themselves. Your code may be clearly written with in-code +comment to sufficiently explain how it works with the surrounding +code, but those who need to fix or enhance your code in the future +will need to know _why_ your code does what it does, for a few +reasons: + +. Your code may be doing something differently from what you wanted it + to do. Writing down what you actually wanted to achieve will help + them fix your code and make it do what it should have been doing + (also, you often discover your own bugs yourself, while writing the + log message to summarize the thought behind it). + +. Your code may be doing things that were only necessary for your + immediate needs (e.g. "do X to directories" without implementing or + even designing what is to be done on files). Writing down why you + excluded what the code does not do will help guide future developers. + Writing down "we do X to directories, because directories have + characteristic Y" would help them infer "oh, files also have the same + characteristic Y, so perhaps doing X to them would also make sense?". + Saying "we don't do the same X to files, because ..." will help them + decide if the reasoning is sound (in which case they do not waste + time extending your code to cover files), or reason differently (in + which case, they can explain why they extend your code to cover + files, too). + +The goal of your log message is to convey the _why_ behind your +change to help future developers. + The first line of the commit message should be a short description (50 characters is the soft limit, see DISCUSSION in linkgit:git-commit[1]), and should skip the full stop. It is also conventional in most cases to
Extend the "describe your changes well" section to cover whom we are trying to help by doing so in the first place. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/SubmittingPatches | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)