Message ID | 20240429-snprintf-checkpatch-v6-1-354c62c88290@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v6] checkpatch: add check for snprintf to scnprintf | expand |
On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote: > I am going to quote Lee Jones who has been doing some snprintf -> > scnprintf refactorings: > > "There is a general misunderstanding amongst engineers that > {v}snprintf() returns the length of the data *actually* encoded into the > destination array. However, as per the C99 standard {v}snprintf() > really returns the length of the data that *would have been* written if > there were enough space for it. This misunderstanding has led to > buffer-overruns in the past. It's generally considered safer to use the > {v}scnprintf() variants in their place (or even sprintf() in simple > cases). So let's do that." > > To help prevent new instances of snprintf() from popping up, let's add a > check to checkpatch.pl. > > Suggested-by: Finn Thain <fthain@linux-m68k.org> > Signed-off-by: Justin Stitt <justinstitt@google.com> Thanks! Reviewed-by: Kees Cook <keescook@chromium.org>
On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote: > On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote: > > I am going to quote Lee Jones who has been doing some snprintf -> > > scnprintf refactorings: > > > > "There is a general misunderstanding amongst engineers that > > {v}snprintf() returns the length of the data *actually* encoded into the > > destination array. However, as per the C99 standard {v}snprintf() > > really returns the length of the data that *would have been* written if > > there were enough space for it. This misunderstanding has led to > > buffer-overruns in the past. It's generally considered safer to use the > > {v}scnprintf() variants in their place (or even sprintf() in simple > > cases). So let's do that." > > > > To help prevent new instances of snprintf() from popping up, let's add a > > check to checkpatch.pl. > > > > Suggested-by: Finn Thain <fthain@linux-m68k.org> > > Signed-off-by: Justin Stitt <justinstitt@google.com> > > Thanks! > > Reviewed-by: Kees Cook <keescook@chromium.org> > $ git grep -P '\b((v|)snprintf)\s*\(' | wc -l 7745 $ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l 1626 Given there are ~5000 uses of these that don't care whether or not it's snprintf or scnprintf, I think this is not great. I'd much rather make sure the return value of the call is used before suggesting an alternative. $ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l 515 And about 1/3 of these snprintf calls are for sysfs style output that ideally would be converted to sysfs_emit or sysfs_emit_at instead.
On Mon, 29 Apr 2024, Joe Perches wrote: > On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote: > > On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote: > > > I am going to quote Lee Jones who has been doing some snprintf -> > > > scnprintf refactorings: > > > > > > "There is a general misunderstanding amongst engineers that > > > {v}snprintf() returns the length of the data *actually* encoded into the > > > destination array. However, as per the C99 standard {v}snprintf() > > > really returns the length of the data that *would have been* written if > > > there were enough space for it. This misunderstanding has led to > > > buffer-overruns in the past. It's generally considered safer to use the > > > {v}scnprintf() variants in their place (or even sprintf() in simple > > > cases). So let's do that." > > > > > > To help prevent new instances of snprintf() from popping up, let's add a > > > check to checkpatch.pl. > > > > > > Suggested-by: Finn Thain <fthain@linux-m68k.org> > > > Signed-off-by: Justin Stitt <justinstitt@google.com> > > > > Thanks! > > > > Reviewed-by: Kees Cook <keescook@chromium.org> > > > > $ git grep -P '\b((v|)snprintf)\s*\(' | wc -l > 7745 > $ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l > 1626 > > Given there are ~5000 uses of these that don't care > whether or not it's snprintf or scnprintf, I think this > is not great. > > I'd much rather make sure the return value of the call > is used before suggesting an alternative. > > $ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l > 515 > > And about 1/3 of these snprintf calls are for sysfs style > output that ideally would be converted to sysfs_emit or > sysfs_emit_at instead. I am working on the migration of these (this patch was spun off from that project in fact). Some subsystems are currently prioritising the status quo (a.k.a. "no churn"), but most have been accepting of the changes. Planning to get back to it once the CVE project has calmed a little. Those numbers should diminish over time.
On Mon, 29 Apr 2024, Justin Stitt wrote: > I am going to quote Lee Jones who has been doing some snprintf -> > scnprintf refactorings: > > "There is a general misunderstanding amongst engineers that > {v}snprintf() returns the length of the data *actually* encoded into the > destination array. However, as per the C99 standard {v}snprintf() > really returns the length of the data that *would have been* written if > there were enough space for it. This misunderstanding has led to > buffer-overruns in the past. It's generally considered safer to use the > {v}scnprintf() variants in their place (or even sprintf() in simple > cases). So let's do that." > > To help prevent new instances of snprintf() from popping up, let's add a > check to checkpatch.pl. > > Suggested-by: Finn Thain <fthain@linux-m68k.org> > Signed-off-by: Justin Stitt <justinstitt@google.com> > --- > Changes in v6: > - move capture group to only include symbol name (not spaces or paren) > - Link to v5: https://lore.kernel.org/r/20240422-snprintf-checkpatch-v5-1-f1e90bf7164e@google.com > > Changes in v5: > - use capture groups to let the user know which variation they used > - Link to v4: https://lore.kernel.org/r/20240408-snprintf-checkpatch-v4-1-8697c96ac94b@google.com > > Changes in v4: > - also check for vsnprintf variant (thanks Bill) > - Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com > > Changes in v3: > - fix indentation > - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe) > - Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com > > Changes in v2: > - Had a vim moment and deleted a character before sending the patch. > - Replaced the character :) > - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com > --- > From a discussion here [1]. > > [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/ > --- > scripts/checkpatch.pl | 6 ++++++ > 1 file changed, 6 insertions(+) Reviewed-by: Lee Jones <lee@kernel.org>
On Mon, Apr 29, 2024 at 08:21:49PM -0700, Joe Perches wrote: > On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote: > > On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote: > > > I am going to quote Lee Jones who has been doing some snprintf -> > > > scnprintf refactorings: > > > > > > "There is a general misunderstanding amongst engineers that > > > {v}snprintf() returns the length of the data *actually* encoded into the > > > destination array. However, as per the C99 standard {v}snprintf() > > > really returns the length of the data that *would have been* written if > > > there were enough space for it. This misunderstanding has led to > > > buffer-overruns in the past. It's generally considered safer to use the > > > {v}scnprintf() variants in their place (or even sprintf() in simple > > > cases). So let's do that." > > > > > > To help prevent new instances of snprintf() from popping up, let's add a > > > check to checkpatch.pl. > > > > > > Suggested-by: Finn Thain <fthain@linux-m68k.org> > > > Signed-off-by: Justin Stitt <justinstitt@google.com> > > > > Thanks! > > > > Reviewed-by: Kees Cook <keescook@chromium.org> > > > > $ git grep -P '\b((v|)snprintf)\s*\(' | wc -l > 7745 > $ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l > 1626 > > Given there are ~5000 uses of these that don't care > whether or not it's snprintf or scnprintf, I think this > is not great. But let's not add more of either case. :) > I'd much rather make sure the return value of the call > is used before suggesting an alternative. > > $ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l > 515 > > And about 1/3 of these snprintf calls are for sysfs style > output that ideally would be converted to sysfs_emit or > sysfs_emit_at instead. Detecting that we're in the right place for sysfs_emit seems out of scope for here, but maybe it should be more clearly called out by the contents at the reported URL?
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 9c4c4a61bc83..bdae8a7ae5ed 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -7012,6 +7012,12 @@ sub process { "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr); } +# {v}snprintf uses that should likely be {v}scnprintf + if ($line =~ /\b((v|)snprintf)\s*\(/) { + WARN("SNPRINTF", + "Prefer ${2}scnprintf over $1 - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr); + } + # ethtool_sprintf uses that should likely be ethtool_puts if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) { if (WARN("PREFER_ETHTOOL_PUTS",
I am going to quote Lee Jones who has been doing some snprintf -> scnprintf refactorings: "There is a general misunderstanding amongst engineers that {v}snprintf() returns the length of the data *actually* encoded into the destination array. However, as per the C99 standard {v}snprintf() really returns the length of the data that *would have been* written if there were enough space for it. This misunderstanding has led to buffer-overruns in the past. It's generally considered safer to use the {v}scnprintf() variants in their place (or even sprintf() in simple cases). So let's do that." To help prevent new instances of snprintf() from popping up, let's add a check to checkpatch.pl. Suggested-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Justin Stitt <justinstitt@google.com> --- Changes in v6: - move capture group to only include symbol name (not spaces or paren) - Link to v5: https://lore.kernel.org/r/20240422-snprintf-checkpatch-v5-1-f1e90bf7164e@google.com Changes in v5: - use capture groups to let the user know which variation they used - Link to v4: https://lore.kernel.org/r/20240408-snprintf-checkpatch-v4-1-8697c96ac94b@google.com Changes in v4: - also check for vsnprintf variant (thanks Bill) - Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com Changes in v3: - fix indentation - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe) - Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com Changes in v2: - Had a vim moment and deleted a character before sending the patch. - Replaced the character :) - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com --- From a discussion here [1]. [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/ --- scripts/checkpatch.pl | 6 ++++++ 1 file changed, 6 insertions(+) --- base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d change-id: 20240221-snprintf-checkpatch-a864ed67ebd0 Best regards, -- Justin Stitt <justinstitt@google.com>