[v2] kdb: replace deprecated strncpy

Message ID	20240405-strncpy-kernel-debug-kdb-kdb_io-c-v2-1-d0bf595ab301@google.com (mailing list archive)
State	New, archived
Headers	show Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 59AE37484 for <linux-hardening@vger.kernel.org>; Fri, 5 Apr 2024 02:34:00 +0000 (UTC) Date: Fri, 05 Apr 2024 02:33:58 +0000 Precedence: bulk Mime-Version: 1.0 Message-ID: <20240405-strncpy-kernel-debug-kdb-kdb_io-c-v2-1-d0bf595ab301@google.com> Subject: [PATCH v2] kdb: replace deprecated strncpy From: Justin Stitt <justinstitt@google.com> To: Jason Wessel <jason.wessel@windriver.com>, Daniel Thompson <daniel.thompson@linaro.org>, Douglas Anderson <dianders@chromium.org> Cc: kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org, Justin Stitt <justinstitt@google.com> Content-Type: text/plain; charset="utf-8"
Series	[v2] kdb: replace deprecated strncpy \| expand [v2] kdb: replace deprecated strncpy

Justin Stitt April 5, 2024, 2:33 a.m. UTC

We should move away from using strncpy because it is deprecated [1].

Since these buffers want to be NUL-terminated, let's use strscpy() which
guarantees this behavior.

The code in question enables the visual autocomplete when using kdb tab
completion. After pressing tab a couple times when sitting on a partial
symbol it will attempt to fill it in. In my testing, strscpy() provides
the exact same autocomplete behavior that strncpy() provides here (i.e:
it fills in the same number of characters for the user).

You can confirm this by enabling kdb [3] and booting up the kernel. I
performed my tests with qemu with this incantation (wow these get long):

$ /usr/bin/qemu-system-x86_64 -display none -nodefaults -cpu Nehalem
-append 'console=ttyS0,115200 earlycon=uart8250,io,0x3f8 rdinit=/bin/sh
kgdboc=ttyS0,115200 nokaslr' -kernel $BUILD_DIR/arch/x86/boot/bzImage
-initrd $REPOS/boot-utils/images/x86_64/rootfs.cpio -m 512m -serial
mon:stdio

... then you can type some symbols and see that autocomplete still kicks
in and performs exactly the same.

For example:
tes <tab><tab> gives you "test",
then "test_ap" <tab><tab> gives you "test_aperfmperf"

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://github.com/KSPP/linux/issues/90 [2]
Link: https://www.kernel.org/doc/html/v5.0/dev-tools/kgdb.html#using-kdb [3]
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
---
Changes in v2:
- use strscpy over memcpy (thanks Daniel T.)
- Link to v1: https://lore.kernel.org/r/20240403-strncpy-kernel-debug-kdb-kdb_io-c-v1-1-7f78a08e9ff4@google.com
---
---
 kernel/debug/kdb/kdb_io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


---
base-commit: 026e680b0a08a62b1d948e5a8ca78700bfac0e6e
change-id: 20240402-strncpy-kernel-debug-kdb-kdb_io-c-53e5ed26da3d

Best regards,
--
Justin Stitt <justinstitt@google.com>

Daniel Thompson April 5, 2024, 9:51 a.m. UTC | #1

On Fri, Apr 05, 2024 at 02:33:58AM +0000, Justin Stitt wrote:
> We should move away from using strncpy because it is deprecated [1].
>
> Since these buffers want to be NUL-terminated, let's use strscpy() which
> guarantees this behavior.
>
> The code in question enables the visual autocomplete when using kdb tab
> completion. After pressing tab a couple times when sitting on a partial
> symbol it will attempt to fill it in.

FWIW the code that this patch changes is only executed when tab is
pressed once.


> In my testing, strscpy() provides
> the exact same autocomplete behavior that strncpy() provides here (i.e:
> it fills in the same number of characters for the user).
>
> You can confirm this by enabling kdb [3] and booting up the kernel. I
> performed my tests with qemu with this incantation (wow these get long):
>
> $ /usr/bin/qemu-system-x86_64 -display none -nodefaults -cpu Nehalem
> -append 'console=ttyS0,115200 earlycon=uart8250,io,0x3f8 rdinit=/bin/sh
> kgdboc=ttyS0,115200 nokaslr' -kernel $BUILD_DIR/arch/x86/boot/bzImage
> -initrd $REPOS/boot-utils/images/x86_64/rootfs.cpio -m 512m -serial
> mon:stdio
>
> ... then you can type some symbols and see that autocomplete still kicks
> in and performs exactly the same.
>
> For example:
> tes <tab><tab> gives you "test",
> then "test_ap" <tab><tab> gives you "test_aperfmperf"
>
> Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> Link: https://github.com/KSPP/linux/issues/90 [2]
> Link: https://www.kernel.org/doc/html/v5.0/dev-tools/kgdb.html#using-kdb [3]
> Cc: linux-hardening@vger.kernel.org
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
> Changes in v2:
> - use strscpy over memcpy (thanks Daniel T.)
> - Link to v1: https://lore.kernel.org/r/20240403-strncpy-kernel-debug-kdb-kdb_io-c-v1-1-7f78a08e9ff4@google.com
> ---
> ---
>  kernel/debug/kdb/kdb_io.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 9443bc63c5a2..60be22132020 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -368,9 +368,9 @@ static char *kdb_read(char *buffer, size_t bufsize)
>  			kdb_printf("%s", buffer);
>  		} else if (tab != 2 && count > 0) {
>  			len_tmp = strlen(p_tmp);
> -			strncpy(p_tmp+len_tmp, cp, lastchar-cp+1);
> +			strscpy(p_tmp+len_tmp, cp, lastchar-cp+1);

This still looks like it is reproducing the obvious[1] error in
the original strncpy() call. The third argument does *not* provide the
number of characters in the destination buffer.

Just to be really clear, I think this patch and your original memcpy()
conversion is mechanically correct, in that the new code is equivalent
to the original strncpy(). The problem is that neither patch acts
on the warning signs that the original code is broken.


[1] I know that this code is hard to read so "obvious" is a relative
    term. However just looking at one line tells us that the source
    pointer is part of a two pointer calculation that purports to
    give the length of the destination string! Such code is almost
    always going to be wrong.


>  			len_tmp = strlen(p_tmp);
> -			strncpy(cp, p_tmp+len, len_tmp-len + 1);
> +			strscpy(cp, p_tmp+len, len_tmp-len + 1);

Again, I really don't think the third argument provides the number of
characters in the destination buffer.


Daniel.

Justin Stitt April 9, 2024, 12:46 a.m. UTC | #2

On Fri, Apr 5, 2024 at 2:51 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> >                       len_tmp = strlen(p_tmp);
> > -                     strncpy(cp, p_tmp+len, len_tmp-len + 1);
> > +                     strscpy(cp, p_tmp+len, len_tmp-len + 1);
>
> Again, I really don't think the third argument provides the number of
> characters in the destination buffer.
>

Right, the third argument is the length of the "remaining" characters
from the completion point.

if you type "tes" and press tab then kallsyms_symbol_complete() will
populate p_tmp with "test". Prior to rendering to the user, @cp points
to "s", we need to catch the user up and print the rest of the symbol
name since they've already typed "tes" we only need to print out "t".

len_tmp is the length of the entire symbol part as returned by
kallsyms_symbol_complete() and len is the length of only the
user-typed symbol. Therefore, the amount of remaining characters to
print is given by len_tmp-len (and +1 for a NUL-byte).

So, yeah, you're right. This isn't the length of the destination but I
don't see why we can't use memcpy() (or strscpy()) and have this not
be considered "broken". The pointer arithmetic checks out.

I tested out strcat and other alternatives and they all seem less readable.

>
> Daniel.

Daniel Thompson April 9, 2024, 6:35 p.m. UTC | #3

On Mon, Apr 08, 2024 at 05:46:42PM -0700, Justin Stitt wrote:
> On Fri, Apr 5, 2024 at 2:51 AM Daniel Thompson
> <daniel.thompson@linaro.org> wrote:
> >
> > >                       len_tmp = strlen(p_tmp);
> > > -                     strncpy(cp, p_tmp+len, len_tmp-len + 1);
> > > +                     strscpy(cp, p_tmp+len, len_tmp-len + 1);
> >
> > Again, I really don't think the third argument provides the number of
> > characters in the destination buffer.
> >
>
> Right, the third argument is the length of the "remaining" characters
> from the completion point.

Which is not how strscpy() is designed to be used.

> if you type "tes" and press tab then kallsyms_symbol_complete() will
> populate p_tmp with "test". Prior to rendering to the user, @cp points
> to "s", we need to catch the user up and print the rest of the symbol
> name since they've already typed "tes" we only need to print out "t".

I'm more concerned about the case where you fill the buffer entirely
then move the cursor left until you get to the tes and then press Tab.
I think at the point we write too many bytes to cp.

> len_tmp is the length of the entire symbol part as returned by
> kallsyms_symbol_complete() and len is the length of only the
> user-typed symbol. Therefore, the amount of remaining characters to
> print is given by len_tmp-len (and +1 for a NUL-byte).
>
> So, yeah, you're right. This isn't the length of the destination but I
> don't see why we can't use memcpy() (or strscpy()) and have this not
> be considered "broken". The pointer arithmetic checks out.

The problem with substituting strncpy() with memcpy() is that is *not*
obviously wrong... but it could be subtly wrong.

We can see that the person who originally wrote this code made a pretty
serious mistake with strncpy() and the third argument if garbage. It is
therefore important to figure out what the *correct* value for argument
#3 should have been *before* we attempt to replace strncpy() with
anything.

Transforming something we know to be broken without fixing it first
means it is impossible to know if the transformation is correct or not.
Hence the original question, how do we know there is enough space
after cp to store the string?

Daniel.

Justin Stitt April 9, 2024, 8:48 p.m. UTC | #4

Hi,

On Tue, Apr 9, 2024 at 11:36 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Mon, Apr 08, 2024 at 05:46:42PM -0700, Justin Stitt wrote:
> > On Fri, Apr 5, 2024 at 2:51 AM Daniel Thompson
> > <daniel.thompson@linaro.org> wrote:
> > >
> > > >                       len_tmp = strlen(p_tmp);
> > > > -                     strncpy(cp, p_tmp+len, len_tmp-len + 1);
> > > > +                     strscpy(cp, p_tmp+len, len_tmp-len + 1);
> > >
> > > Again, I really don't think the third argument provides the number of
> > > characters in the destination buffer.
> > >
> >
> > Right, the third argument is the length of the "remaining" characters
> > from the completion point.
>
> Which is not how strscpy() is designed to be used.
>
>
> > if you type "tes" and press tab then kallsyms_symbol_complete() will
> > populate p_tmp with "test". Prior to rendering to the user, @cp points
> > to "s", we need to catch the user up and print the rest of the symbol
> > name since they've already typed "tes" we only need to print out "t".
>
> I'm more concerned about the case where you fill the buffer entirely
> then move the cursor left until you get to the tes and then press Tab.
> I think at the point we write too many bytes to cp.
>
>
> > len_tmp is the length of the entire symbol part as returned by
> > kallsyms_symbol_complete() and len is the length of only the
> > user-typed symbol. Therefore, the amount of remaining characters to
> > print is given by len_tmp-len (and +1 for a NUL-byte).
> >
> > So, yeah, you're right. This isn't the length of the destination but I
> > don't see why we can't use memcpy() (or strscpy()) and have this not
> > be considered "broken". The pointer arithmetic checks out.
>
> The problem with substituting strncpy() with memcpy() is that is *not*
> obviously wrong... but it could be subtly wrong.
>
> We can see that the person who originally wrote this code made a pretty
> serious mistake with strncpy() and the third argument if garbage. It is
> therefore important to figure out what the *correct* value for argument
> #3 should have been *before* we attempt to replace strncpy() with
> anything.
>
> Transforming something we know to be broken without fixing it first
> means it is impossible to know if the transformation is correct or not.
> Hence the original question, how do we know there is enough space
> after cp to store the string?

Gotcha, I will find time to seriously refactor/rewrite this function
(or at the very least the tab handling part of it).

At the end of the day, though, I just want this strncpy() gone.

>
>
> Daniel.

Thanks
Justin

Daniel Thompson April 15, 2024, 1:45 p.m. UTC | #5

On Tue, Apr 09, 2024 at 01:48:38PM -0700, Justin Stitt wrote:
> Hi,
>
> On Tue, Apr 9, 2024 at 11:36 AM Daniel Thompson
> <daniel.thompson@linaro.org> wrote:
> >
> > On Mon, Apr 08, 2024 at 05:46:42PM -0700, Justin Stitt wrote:
> > > On Fri, Apr 5, 2024 at 2:51 AM Daniel Thompson
> > > <daniel.thompson@linaro.org> wrote:
> > > >
> > > > >                       len_tmp = strlen(p_tmp);
> > > > > -                     strncpy(cp, p_tmp+len, len_tmp-len + 1);
> > > > > +                     strscpy(cp, p_tmp+len, len_tmp-len + 1);
> > > >
> > > > Again, I really don't think the third argument provides the number of
> > > > characters in the destination buffer.
> > > >
> > >
> > > Right, the third argument is the length of the "remaining" characters
> > > from the completion point.
> >
> > Which is not how strscpy() is designed to be used.
> >
> >
> > > if you type "tes" and press tab then kallsyms_symbol_complete() will
> > > populate p_tmp with "test". Prior to rendering to the user, @cp points
> > > to "s", we need to catch the user up and print the rest of the symbol
> > > name since they've already typed "tes" we only need to print out "t".
> >
> > I'm more concerned about the case where you fill the buffer entirely
> > then move the cursor left until you get to the tes and then press Tab.
> > I think at the point we write too many bytes to cp.
> >
> >
> > > len_tmp is the length of the entire symbol part as returned by
> > > kallsyms_symbol_complete() and len is the length of only the
> > > user-typed symbol. Therefore, the amount of remaining characters to
> > > print is given by len_tmp-len (and +1 for a NUL-byte).
> > >
> > > So, yeah, you're right. This isn't the length of the destination but I
> > > don't see why we can't use memcpy() (or strscpy()) and have this not
> > > be considered "broken". The pointer arithmetic checks out.
> >
> > The problem with substituting strncpy() with memcpy() is that is *not*
> > obviously wrong... but it could be subtly wrong.
> >
> > We can see that the person who originally wrote this code made a pretty
> > serious mistake with strncpy() and the third argument if garbage. It is
> > therefore important to figure out what the *correct* value for argument
> > #3 should have been *before* we attempt to replace strncpy() with
> > anything.
> >
> > Transforming something we know to be broken without fixing it first
> > means it is impossible to know if the transformation is correct or not.
> > Hence the original question, how do we know there is enough space
> > after cp to store the string?
>
> Gotcha, I will find time to seriously refactor/rewrite this function
> (or at the very least the tab handling part of it).
>
> At the end of the day, though, I just want this strncpy() gone.

So... I starting looking into what it takes to provoke kdb_read() into
overflowing it's buffers. So far I have found two ways (one of which
does affect this strncpy() code as I thought).

I took the view that the strncpy() (or any other copy) into
tmpbuffer/p_tmp is just the wrong thing to do. A memmove() is simply
a better approach!

Short verison, whilst there is other refactoring needed, this change
fixes the overflow. I hope it also meets your ambition with respect
to strncpy().


Daniel.



From aab83fc9d0e97987fdec51613b536fc32a63c544 Mon Sep 17 00:00:00 2001
From: Daniel Thompson <daniel.thompson@linaro.org>
Date: Mon, 15 Apr 2024 14:35:06 +0100
Subject: [PATCH] kdb: Fix buffer overflow during tab-complete

Currently, when the user attempts symbol completion with the Tab key, kdb
will use strncpy() to insert the completed symbol into the command buffer.
Unfortunately it passes the size of the source buffer rather than the
destination to strncpy() with predictably horrible results. Most obviously
if the command buffer is already full but cp, the cursor position, is in
the middle of the buffer, then we will write past the end of the supplied
buffer.

Even when the buffer does not overflow there are other problems when cp is
in the middle of a line. For example the cursor position is not updated
correctly meaning what is shown to the user on the console is not what is
actually present in the command buffer.

Fix this by replacing the dubious strncpy() calls with memmove()/memcpy()
calls plus explicit boundary checks to make sure we have enough space
before we start moving characters around. We also add a second call to
kdb_printf() to fix the console cursor position if needed.

Reported-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 kernel/debug/kdb/kdb_io.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 9443bc63c5a24..3a1b9dd890f45 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -367,14 +367,23 @@ static char *kdb_read(char *buffer, size_t bufsize)
 			kdb_printf(kdb_prompt_str);
 			kdb_printf("%s", buffer);
 		} else if (tab != 2 && count > 0) {
-			len_tmp = strlen(p_tmp);
-			strncpy(p_tmp+len_tmp, cp, lastchar-cp+1);
-			len_tmp = strlen(p_tmp);
-			strncpy(cp, p_tmp+len, len_tmp-len + 1);
-			len = len_tmp - len;
-			kdb_printf("%s", cp);
-			cp += len;
-			lastchar += len;
+			/* How many new characters do we want from tmpbuffer? */
+			len_tmp = strlen(p_tmp) - len;
+			if (lastchar + len_tmp >= bufend)
+				len_tmp = bufend - lastchar - 1;
+
+			if (len_tmp) {
+				/* + 1 ensures the '\0' is memmove'd */
+				memmove(cp+len_tmp, cp, (lastchar-cp) + 1);
+				memcpy(cp, p_tmp+len, len_tmp);
+				lastchar += len_tmp;
+
+				kdb_printf("%s", cp);
+				cp += len_tmp;
+				if (cp != lastchar)
+					kdb_printf("\r%s%.*s", kdb_prompt_str,
+						   (int)(cp - buffer), buffer);
+			}
 		}
 		kdb_nextline = 1; /* reset output line number */
 		break;
--
2.43.0

[v2] kdb: replace deprecated strncpy

Commit Message

Comments

Patch