Message ID | 20241021124145.636561-1-karthik.188@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | CodingGuidelines: discourage arbitrary suffixes in function names | expand |
On Mon, Oct 21, 2024 at 02:41:45PM +0200, Karthik Nayak wrote: > We often name functions with arbitrary suffixes like `_1` as an > extension of another existing function. This created confusion and Micro-nit: s/created/creates/ [snip] > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > index 30fda4142c..b8e2d30567 100644 > --- a/Documentation/CodingGuidelines > +++ b/Documentation/CodingGuidelines > @@ -621,6 +621,11 @@ For C programs: > - `S_free()` releases a structure's contents and frees the > structure. > > + - Function names should be self-explanatory, clearly reflecting their > + purpose or behavior. To maintain clarity and avoid confusion, > + arbitrary suffixes such as _1 are discouraged, as they provide no > + meaningful insight into the function's role. Names being self-explanatory is certainly a worthwhile goal, even though I guess that it's a bit more on the idealized side of things. Functions will often not be fully self-explanatory, which is likely just a matter of reality. I mostly just don't want us to end on the other side of the spectrum where we go militant on "Every function must be no longer than 2 lines of code such that it can be self-explanatory". And yes, I'm of course stretching what you are saying quite a bit, I know that this is not what you want to say. I'm merely writing down my own thoughts while thinking it through. So, that being said, I agree that we shouldn't use arbitrary suffixes, as these are quite hard to understand indeed and typically don't really provide any context. So as long as we interpret this rule leniently I'm happy with it. Patrick
On Mon, Oct 21, 2024, at 14:41, Karthik Nayak wrote: > We often name functions with arbitrary suffixes like `_1` as an > extension of another existing function. This created confusion and > doesn't provide good clarity into the functions purpose. Let's document > good function naming etiquette in our CodingGuidelines. > > Signed-off-by: Karthik Nayak <karthik.188@gmail.com> > --- > > This is mostly in response to an ongoing thread [1] where I ran into one of > these functions and it really took me a while to wrap my head around what the > function does. > > [1]: > https://lore.kernel.org/git/CAOLa=ZREg3xuaT6mbM8+Havn3regZDhK45kGy0+Fw8t56c7Mpg@mail.gmail.com/#R I was wondering whether it would make sense to use that link in this document for the context. But I see that there is only one instance of that in the current document.
On Mon, Oct 21, 2024 at 02:59:00PM +0200, Patrick Steinhardt wrote: > On Mon, Oct 21, 2024 at 02:41:45PM +0200, Karthik Nayak wrote: > > We often name functions with arbitrary suffixes like `_1` as an > > extension of another existing function. This created confusion and > > Micro-nit: s/created/creates/ > > [snip] > > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > > index 30fda4142c..b8e2d30567 100644 > > --- a/Documentation/CodingGuidelines > > +++ b/Documentation/CodingGuidelines > > @@ -621,6 +621,11 @@ For C programs: > > - `S_free()` releases a structure's contents and frees the > > structure. > > > > + - Function names should be self-explanatory, clearly reflecting their > > + purpose or behavior. To maintain clarity and avoid confusion, > > + arbitrary suffixes such as _1 are discouraged, as they provide no > > + meaningful insight into the function's role. > > Names being self-explanatory is certainly a worthwhile goal, even though > I guess that it's a bit more on the idealized side of things. Functions > will often not be fully self-explanatory, which is likely just a matter > of reality. I mostly just don't want us to end on the other side of the > spectrum where we go militant on "Every function must be no longer than > 2 lines of code such that it can be self-explanatory". > > And yes, I'm of course stretching what you are saying quite a bit, I > know that this is not what you want to say. I'm merely writing down my > own thoughts while thinking it through. > > So, that being said, I agree that we shouldn't use arbitrary suffixes, > as these are quite hard to understand indeed and typically don't really > provide any context. So as long as we interpret this rule leniently I'm > happy with it. I am not sure here... I think that using a "_1()" suffix means that the function is processing one of a number of elements that all need to be handled similarly, but where both the processing of an individual element, and the handling of a group of elements are both complicated enough to be placed in their own function. It's also a helpful convention when you have a recursive function that does some setup during its initial call, but subsequent layers of recursion don't need or want to repeat that setup. So I'm not sure I agree that "_1()" is always a bad idea as this changes suggests (i.e. by writing that "they provide no meaningful insight into the function's role"). Perhaps we could rephrase what is written here to suggest a couple of instances where we wouldn't want to apply this rule, and the two that I have written above could perhaps be a useful starting point. But I lean more towards not adjusting our CodingGuidelines at all here. Thanks, Taylor
Patrick Steinhardt <ps@pks.im> writes: > On Mon, Oct 21, 2024 at 02:41:45PM +0200, Karthik Nayak wrote: >> We often name functions with arbitrary suffixes like `_1` as an >> extension of another existing function. This created confusion and > > Micro-nit: s/created/creates/ > > [snip] >> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines >> index 30fda4142c..b8e2d30567 100644 >> --- a/Documentation/CodingGuidelines >> +++ b/Documentation/CodingGuidelines >> @@ -621,6 +621,11 @@ For C programs: >> - `S_free()` releases a structure's contents and frees the >> structure. >> >> + - Function names should be self-explanatory, clearly reflecting their >> + purpose or behavior. To maintain clarity and avoid confusion, >> + arbitrary suffixes such as _1 are discouraged, as they provide no >> + meaningful insight into the function's role. > > Names being self-explanatory is certainly a worthwhile goal, even though > I guess that it's a bit more on the idealized side of things. Functions > will often not be fully self-explanatory, which is likely just a matter > of reality. I mostly just don't want us to end on the other side of the > spectrum where we go militant on "Every function must be no longer than > 2 lines of code such that it can be self-explanatory". > > And yes, I'm of course stretching what you are saying quite a bit, I > know that this is not what you want to say. I'm merely writing down my > own thoughts while thinking it through. > I agree, going the other way doesn't help either. > So, that being said, I agree that we shouldn't use arbitrary suffixes, > as these are quite hard to understand indeed and typically don't really > provide any context. So as long as we interpret this rule leniently I'm > happy with it. > > Patrick I tried to keep the wording to also not just say "it is not allowed to use such suffixes", but mostly to discourage it. I guess we can also iterate on this as needed with time. Karthik
Taylor Blau <me@ttaylorr.com> writes: > On Mon, Oct 21, 2024 at 02:59:00PM +0200, Patrick Steinhardt wrote: >> On Mon, Oct 21, 2024 at 02:41:45PM +0200, Karthik Nayak wrote: >> > We often name functions with arbitrary suffixes like `_1` as an >> > extension of another existing function. This created confusion and >> >> Micro-nit: s/created/creates/ >> >> [snip] >> > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines >> > index 30fda4142c..b8e2d30567 100644 >> > --- a/Documentation/CodingGuidelines >> > +++ b/Documentation/CodingGuidelines >> > @@ -621,6 +621,11 @@ For C programs: >> > - `S_free()` releases a structure's contents and frees the >> > structure. >> > >> > + - Function names should be self-explanatory, clearly reflecting their >> > + purpose or behavior. To maintain clarity and avoid confusion, >> > + arbitrary suffixes such as _1 are discouraged, as they provide no >> > + meaningful insight into the function's role. >> >> Names being self-explanatory is certainly a worthwhile goal, even though >> I guess that it's a bit more on the idealized side of things. Functions >> will often not be fully self-explanatory, which is likely just a matter >> of reality. I mostly just don't want us to end on the other side of the >> spectrum where we go militant on "Every function must be no longer than >> 2 lines of code such that it can be self-explanatory". >> >> And yes, I'm of course stretching what you are saying quite a bit, I >> know that this is not what you want to say. I'm merely writing down my >> own thoughts while thinking it through. >> >> So, that being said, I agree that we shouldn't use arbitrary suffixes, >> as these are quite hard to understand indeed and typically don't really >> provide any context. So as long as we interpret this rule leniently I'm >> happy with it. > > I am not sure here... I think that using a "_1()" suffix means that the > function is processing one of a number of elements that all need to be > handled similarly, but where both the processing of an individual > element, and the handling of a group of elements are both complicated > enough to be placed in their own function. > The crux here is that this meaning is internalized by people who know the code through in and out. But for anyone new to the project, this is not evident from the naming. > It's also a helpful convention when you have a recursive function that > does some setup during its initial call, but subsequent layers of > recursion don't need or want to repeat that setup. > I get the usecase of having such functions. I totally see the need, it's mostly the naming that I'm trying to change. Let's consider two of such functions: 1. mark_remote_island_1: This function is called to do _some_ work on a single remote island when iterating over a list. 2. find_longest_prefixes_1: This is a recursive function which is used to find the longest prefix. If you notice, both use the "_1" suffix and do different things (operate on a single instance from a list vs provide recursive behavior). So the user has to read the code to understand, which makes the "_1" a bit confusing. Second, this could have instead been named: 1. mark_single_remote_island: Which reads better, giving the idea that we are really working on a single remote island. Whereas having a "_1" doesn't easily imply the same. 2. find_longest_prefixes_recursively: Which also reads better, and gives a hint on how the function operates and why it is split out from the base function. > So I'm not sure I agree that "_1()" is always a bad idea as this changes > suggests (i.e. by writing that "they provide no meaningful insight into > the function's role"). > > Perhaps we could rephrase what is written here to suggest a couple of > instances where we wouldn't want to apply this rule, and the two that I > have written above could perhaps be a useful starting point. But I lean > more towards not adjusting our CodingGuidelines at all here. > I hope my reasoning convinces you of the benefits, this is mostly coming from me spending more time than I'd like to admit on one of these functions, trying to understand what they do :D > Thanks, > Taylor Thanks, Karthik
"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes: > On Mon, Oct 21, 2024, at 14:41, Karthik Nayak wrote: >> We often name functions with arbitrary suffixes like `_1` as an >> extension of another existing function. This created confusion and >> doesn't provide good clarity into the functions purpose. Let's document >> good function naming etiquette in our CodingGuidelines. >> >> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> >> --- >> >> This is mostly in response to an ongoing thread [1] where I ran into one of >> these functions and it really took me a while to wrap my head around what the >> function does. >> >> [1]: >> https://lore.kernel.org/git/CAOLa=ZREg3xuaT6mbM8+Havn3regZDhK45kGy0+Fw8t56c7Mpg@mail.gmail.com/#R > > I was wondering whether it would make sense to use that link in this > document for the context. But I see that there is only one instance > of that in the current document. Yeah, there are a few such functions in our codebase. I'd be happy to make any improvements, but also think this is simple and clear at the moment.
On Tue, Oct 22, 2024 at 03:45:38AM -0500, karthik nayak wrote: > >> So, that being said, I agree that we shouldn't use arbitrary suffixes, > >> as these are quite hard to understand indeed and typically don't really > >> provide any context. So as long as we interpret this rule leniently I'm > >> happy with it. > > > > I am not sure here... I think that using a "_1()" suffix means that the > > function is processing one of a number of elements that all need to be > > handled similarly, but where both the processing of an individual > > element, and the handling of a group of elements are both complicated > > enough to be placed in their own function. > > The crux here is that this meaning is internalized by people who know > the code through in and out. But for anyone new to the project, this is > not evident from the naming. Right. I think that with that in mind, a good goal might be to document that convention to make sure that newcomers to the project can easily interpret what foo() and foo_1() mean. Another approach is the one you pursue here, which is to change the existing convention in the name of making the code more approachable for newcomers. Both approaches meet the same end, but the former does not involve changing existing conventions, but instead documenting them. That seems like a more reasonable path to me. > > It's also a helpful convention when you have a recursive function that > > does some setup during its initial call, but subsequent layers of > > recursion don't need or want to repeat that setup. > > > > I get the usecase of having such functions. I totally see the need, it's > mostly the naming that I'm trying to change. > > Let's consider two of such functions: > > 1. mark_remote_island_1: This function is called to do _some_ work on a > single remote island when iterating over a list. > 2. find_longest_prefixes_1: This is a recursive function which is used > to find the longest prefix. > > If you notice, both use the "_1" suffix and do different things (operate > on a single instance from a list vs provide recursive behavior). So the > user has to read the code to understand, which makes the "_1" a bit > confusing. > > Second, this could have instead been named: > 1. mark_single_remote_island: Which reads better, giving the idea that > we are really working on a single remote island. Whereas having a "_1" > doesn't easily imply the same. > 2. find_longest_prefixes_recursively: Which also reads better, and gives > a hint on how the function operates and why it is split out from the > base function. I don't disagree that writing "single" or "recursively" can be considered clearer. I think that the convention to suffix such functions with "_1()" is more terse, but saves characters and can avoid awkward line wrapping. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > On Tue, Oct 22, 2024 at 03:45:38AM -0500, karthik nayak wrote: >> >> So, that being said, I agree that we shouldn't use arbitrary suffixes, >> >> as these are quite hard to understand indeed and typically don't really >> >> provide any context. So as long as we interpret this rule leniently I'm >> >> happy with it. >> > >> > I am not sure here... I think that using a "_1()" suffix means that the >> > function is processing one of a number of elements that all need to be >> > handled similarly, but where both the processing of an individual >> > element, and the handling of a group of elements are both complicated >> > enough to be placed in their own function. >> >> The crux here is that this meaning is internalized by people who know >> the code through in and out. But for anyone new to the project, this is >> not evident from the naming. > > Right. I think that with that in mind, a good goal might be to document > that convention to make sure that newcomers to the project can easily > interpret what foo() and foo_1() mean. Another approach is the one you > pursue here, which is to change the existing convention in the name of > making the code more approachable for newcomers. > > Both approaches meet the same end, but the former does not involve > changing existing conventions, but instead documenting them. That seems > like a more reasonable path to me. > I would disagree though. I think some conventions even though we've been using them for a while, should be changed. I guess a good middle ground is to document current behavior while also discouraging future use. I'll add that in and push a new version. >> > It's also a helpful convention when you have a recursive function that >> > does some setup during its initial call, but subsequent layers of >> > recursion don't need or want to repeat that setup. >> > >> >> I get the usecase of having such functions. I totally see the need, it's >> mostly the naming that I'm trying to change. >> >> Let's consider two of such functions: >> >> 1. mark_remote_island_1: This function is called to do _some_ work on a >> single remote island when iterating over a list. >> 2. find_longest_prefixes_1: This is a recursive function which is used >> to find the longest prefix. >> >> If you notice, both use the "_1" suffix and do different things (operate >> on a single instance from a list vs provide recursive behavior). So the >> user has to read the code to understand, which makes the "_1" a bit >> confusing. >> >> Second, this could have instead been named: >> 1. mark_single_remote_island: Which reads better, giving the idea that >> we are really working on a single remote island. Whereas having a "_1" >> doesn't easily imply the same. >> 2. find_longest_prefixes_recursively: Which also reads better, and gives >> a hint on how the function operates and why it is split out from the >> base function. > > I don't disagree that writing "single" or "recursively" can be > considered clearer. I think that the convention to suffix such functions > with "_1()" is more terse, but saves characters and can avoid awkward > line wrapping. > But are those pros worth the ambiguity it brings? > Thanks, > Taylor - Karthik
Taylor Blau <me@ttaylorr.com> writes: > I don't disagree that writing "single" or "recursively" can be > considered clearer. I think that the convention to suffix such functions > with "_1()" is more terse, but saves characters and can avoid awkward > line wrapping. I am reasonably sure that I was the first user of the _1() convention, or at least I was one of them. The reason for the choice of suffix was only because there wasn't anything suitable when refactoring an existing function foo() into a set-up part and its recursive body, so I just kept the set-up part and the single call into the new function in the original foo(), and had to give a name to the new function that holds the body of the original logic that was moved from foo(). Neither foo_helper() or foo_recursive() were descriptive enough to warrant such longer suffixes than a simple _1(). They easily can get "help by doing what?" and "recursively doing what?" reaction, which is a sure sign that the suffixes are not descriptive enough. That was the only reason why I picked that "short-and-sweet but cryptic" suffix. Surely all of _1(), _helper(), _recursive() are meaningless. If we were to replace existing uses of them, the replacement has to be 10x better. Having said all that, as an aspirational goal, I think it is good to encourage people to find a name that is descriptive when writing a new function. I'd refrain from judging if it is way too obvious to be worth documenting (as I am officially on vacation and shouldn't be thinking too much about the project). Thanks.
On Wed, Oct 23, 2024 at 05:50:17PM -0700, Junio C Hamano wrote: > Surely all of _1(), _helper(), _recursive() are meaningless. If we > were to replace existing uses of them, the replacement has to be 10x > better. Well put. Each of the three are more or less equally meaningless, but _1() is an accepted (?) project convention and has the fewest characters, so I think is a good choice personally. > Having said all that, as an aspirational goal, I think it is good to > encourage people to find a name that is descriptive when writing a > new function. I'd refrain from judging if it is way too obvious to > be worth documenting (as I am officially on vacation and shouldn't > be thinking too much about the project). Yeah, go back to vacation ;-). Thanks, Taylor
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 30fda4142c..b8e2d30567 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -621,6 +621,11 @@ For C programs: - `S_free()` releases a structure's contents and frees the structure. + - Function names should be self-explanatory, clearly reflecting their + purpose or behavior. To maintain clarity and avoid confusion, + arbitrary suffixes such as _1 are discouraged, as they provide no + meaningful insight into the function's role. + For Perl programs: - Most of the C guidelines above apply.
We often name functions with arbitrary suffixes like `_1` as an extension of another existing function. This created confusion and doesn't provide good clarity into the functions purpose. Let's document good function naming etiquette in our CodingGuidelines. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> --- This is mostly in response to an ongoing thread [1] where I ran into one of these functions and it really took me a while to wrap my head around what the function does. [1]: https://lore.kernel.org/git/CAOLa=ZREg3xuaT6mbM8+Havn3regZDhK45kGy0+Fw8t56c7Mpg@mail.gmail.com/#R Documentation/CodingGuidelines | 5 +++++ 1 file changed, 5 insertions(+)