diff mbox series

[bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

Message ID eb78270e61e4d2e8ece047430d8397e000ef8569.1604456921.git.dxu@dxuuu.xyz (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator | expand

Commit Message

Daniel Xu Nov. 4, 2020, 2:29 a.m. UTC
do_strncpy_from_user() may copy some extra bytes after the NUL
terminator into the destination buffer. This usually does not matter for
normal string operations. However, when BPF programs key BPF maps with
strings, this matters a lot.

A BPF program may read strings from user memory by calling the
bpf_probe_read_user_str() helper which eventually calls
do_strncpy_from_user(). The program can then key a map with the
resulting string. BPF map keys are fixed-width and string-agnostic,
meaning that map keys are treated as a set of bytes.

The issue is when do_strncpy_from_user() overcopies bytes after the NUL
terminator, it can result in seemingly identical strings occupying
multiple slots in a BPF map. This behavior is subtle and totally
unexpected by the user.

This commit uses the proper word-at-a-time APIs to avoid overcopying.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 lib/strncpy_from_user.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Daniel Borkmann Nov. 4, 2020, 4:24 p.m. UTC | #1
On 11/4/20 3:29 AM, Daniel Xu wrote:
> do_strncpy_from_user() may copy some extra bytes after the NUL
> terminator into the destination buffer. This usually does not matter for
> normal string operations. However, when BPF programs key BPF maps with
> strings, this matters a lot.
> 
> A BPF program may read strings from user memory by calling the
> bpf_probe_read_user_str() helper which eventually calls
> do_strncpy_from_user(). The program can then key a map with the
> resulting string. BPF map keys are fixed-width and string-agnostic,
> meaning that map keys are treated as a set of bytes.
> 
> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> terminator, it can result in seemingly identical strings occupying
> multiple slots in a BPF map. This behavior is subtle and totally
> unexpected by the user.
> 
> This commit uses the proper word-at-a-time APIs to avoid overcopying.
> 
> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>

It looks like this is a regression from the recent refactoring of the mem probing
util functions? Could we add a Fixes tag and then we'd also need to target the fix
against bpf tree instead of bpf-next, no?

Moreover, a BPF kselftest would help to make sure it doesn't regress in future again.

Thanks,
Daniel
Daniel Xu Nov. 4, 2020, 8:18 p.m. UTC | #2
Hi Daniel,

On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote:
> On 11/4/20 3:29 AM, Daniel Xu wrote:
> > do_strncpy_from_user() may copy some extra bytes after the NUL
> > terminator into the destination buffer. This usually does not matter for
> > normal string operations. However, when BPF programs key BPF maps with
> > strings, this matters a lot.
> > 
> > A BPF program may read strings from user memory by calling the
> > bpf_probe_read_user_str() helper which eventually calls
> > do_strncpy_from_user(). The program can then key a map with the
> > resulting string. BPF map keys are fixed-width and string-agnostic,
> > meaning that map keys are treated as a set of bytes.
> > 
> > The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> > terminator, it can result in seemingly identical strings occupying
> > multiple slots in a BPF map. This behavior is subtle and totally
> > unexpected by the user.
> > 
> > This commit uses the proper word-at-a-time APIs to avoid overcopying.
> > 
> > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
>
> It looks like this is a regression from the recent refactoring of the
> mem probing
> util functions?

I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add
probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers").
The old bpf_probe_read_str() used the kernel's byte-by-byte copying
routine. bpf_probe_read_user_str() started using strncpy_from_user()
which has been doing the long-sized strides since ~2012 or earlier.

I tried to build and test the kernel at that commit but it seems my
compiler is too new to build that old code. Bunch of build failures.

I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework
the compat kernel probe handling").

> Could we add a Fixes tag and then we'd also need to target the fix
> against bpf tree instead of bpf-next, no?

Sure, will do in v2.

>
> Moreover, a BPF kselftest would help to make sure it doesn't regress in
> future again.

Ditto.

[..]

Thanks,
Daniel
Daniel Borkmann Nov. 4, 2020, 10:36 p.m. UTC | #3
On 11/4/20 9:18 PM, Daniel Xu wrote:
> On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote:
>> On 11/4/20 3:29 AM, Daniel Xu wrote:
>>> do_strncpy_from_user() may copy some extra bytes after the NUL
>>> terminator into the destination buffer. This usually does not matter for
>>> normal string operations. However, when BPF programs key BPF maps with
>>> strings, this matters a lot.
>>>
>>> A BPF program may read strings from user memory by calling the
>>> bpf_probe_read_user_str() helper which eventually calls
>>> do_strncpy_from_user(). The program can then key a map with the
>>> resulting string. BPF map keys are fixed-width and string-agnostic,
>>> meaning that map keys are treated as a set of bytes.
>>>
>>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
>>> terminator, it can result in seemingly identical strings occupying
>>> multiple slots in a BPF map. This behavior is subtle and totally
>>> unexpected by the user.
>>>
>>> This commit uses the proper word-at-a-time APIs to avoid overcopying.
>>>
>>> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
>>
>> It looks like this is a regression from the recent refactoring of the
>> mem probing
>> util functions?
> 
> I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add
> probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers").
> The old bpf_probe_read_str() used the kernel's byte-by-byte copying
> routine. bpf_probe_read_user_str() started using strncpy_from_user()
> which has been doing the long-sized strides since ~2012 or earlier.
> 
> I tried to build and test the kernel at that commit but it seems my
> compiler is too new to build that old code. Bunch of build failures.
> 
> I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework
> the compat kernel probe handling").

Ah I see, it was just reusing 3d7081822f7f ("uaccess: Add non-pagefault user-space
read functions"). Potentially it might be safer choice to just rework the
strncpy_from_user_nofault() to mimic strncpy_from_kernel_nofault() in that
regard?

>> Could we add a Fixes tag and then we'd also need to target the fix
>> against bpf tree instead of bpf-next, no?
> 
> Sure, will do in v2.
> 
>> Moreover, a BPF kselftest would help to make sure it doesn't regress in
>> future again.
> 
> Ditto.
> 
> [..]
> 
> Thanks,
> Daniel
>
Daniel Xu Nov. 5, 2020, 2:21 a.m. UTC | #4
On Wed Nov 4, 2020 at 2:36 PM PST, Daniel Borkmann wrote:
> On 11/4/20 9:18 PM, Daniel Xu wrote:
> > On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote:
> >> On 11/4/20 3:29 AM, Daniel Xu wrote:
> >>> do_strncpy_from_user() may copy some extra bytes after the NUL
> >>> terminator into the destination buffer. This usually does not matter for
> >>> normal string operations. However, when BPF programs key BPF maps with
> >>> strings, this matters a lot.
> >>>
> >>> A BPF program may read strings from user memory by calling the
> >>> bpf_probe_read_user_str() helper which eventually calls
> >>> do_strncpy_from_user(). The program can then key a map with the
> >>> resulting string. BPF map keys are fixed-width and string-agnostic,
> >>> meaning that map keys are treated as a set of bytes.
> >>>
> >>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> >>> terminator, it can result in seemingly identical strings occupying
> >>> multiple slots in a BPF map. This behavior is subtle and totally
> >>> unexpected by the user.
> >>>
> >>> This commit uses the proper word-at-a-time APIs to avoid overcopying.
> >>>
> >>> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
> >>
> >> It looks like this is a regression from the recent refactoring of the
> >> mem probing
> >> util functions?
> > 
> > I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add
> > probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers").
> > The old bpf_probe_read_str() used the kernel's byte-by-byte copying
> > routine. bpf_probe_read_user_str() started using strncpy_from_user()
> > which has been doing the long-sized strides since ~2012 or earlier.
> > 
> > I tried to build and test the kernel at that commit but it seems my
> > compiler is too new to build that old code. Bunch of build failures.
> > 
> > I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework
> > the compat kernel probe handling").
>
> Ah I see, it was just reusing 3d7081822f7f ("uaccess: Add non-pagefault
> user-space
> read functions"). Potentially it might be safer choice to just rework
> the
> strncpy_from_user_nofault() to mimic strncpy_from_kernel_nofault() in
> that
> regard?

I'm a little reluctant to do that b/c it would introduce less efficient,
duplicated code. The word-at-a-time API already has the zero_bytemask()
API so it's clear that it was designed to handle this issue -- we're not
really hacking anything here.

I'll send out a V2 with the selftest shortly. Happy to change things
after that.

Thanks,
Daniel
Alexei Starovoitov Nov. 6, 2020, 4:32 a.m. UTC | #5
Daniel,

the kasan complains about the previous version of your patch,
but your v4 version looks equivalent.
Could you try to repro this issue?
The code looks correct, but kasan complain is concerning.

On Thu, Nov 5, 2020 at 5:56 PM kernel test robot <oliver.sang@intel.com> wrote:
>
> Greeting,
>
> FYI, we noticed the following commit (built with clang-12):
>
> commit: 00a4ef91e8f5af6edceb9bd4bceed2305f038796 ("[PATCH bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator")
> url: https://github.com/0day-ci/linux/commits/Daniel-Xu/lib-strncpy_from_user-c-Don-t-overcopy-bytes-after-NUL-terminator/20201104-103306
> base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master
>
> in testcase: trinity
> version: trinity-x86_64-af355e9-1_2019-12-03
> with following parameters:
>
>         runtime: 300s
>
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +----------------------------------------------------------------------------+------------+------------+
> |                                                                            | f4c3881edb | 00a4ef91e8 |
> +----------------------------------------------------------------------------+------------+------------+
> | boot_successes                                                             | 8          | 4          |
> | boot_failures                                                              | 14         | 15         |
> | Initramfs_unpacking_failed                                                 | 13         | 7          |
> | Kernel_panic-not_syncing:VFS:Unable_to_mount_root_fs_on_unknown-block(#,#) | 13         | 9          |
> | BUG:kernel_hang_in_boot_stage                                              | 1          |            |
> | BUG:KASAN:slab-out-of-bounds_in_s                                          | 0          | 3          |
> | BUG:KASAN:slab-out-of-bounds_in_l                                          | 0          | 3          |
> +----------------------------------------------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <oliver.sang@intel.com>
>
>
> [  324.803835] BUG: KASAN: slab-out-of-bounds in strlen+0x53/0x5d
> [  324.808932] Read of size 1 at addr ffff88813be5f380 by task trinity-c0/7148
> [  324.809979]
> [  324.810240] CPU: 1 PID: 7148 Comm: trinity-c0 Not tainted 5.9.0-13430-g00a4ef91e8f5 #1
> [  324.811397] Call Trace:
> [  324.811797]  dump_stack+0x156/0x194
> [  324.812387]  ? wake_up_klogd+0x49/0x5e
> [  324.813118]  ? vprintk_emit+0x297/0x307
> [  324.813680]  print_address_description+0x25/0x4b7
> [  324.814354]  ? printk+0x54/0x5d
> [  324.814877]  ? kasan_report+0xad/0x187
> [  324.815531]  kasan_report+0x140/0x187
> [  324.816187]  ? strlen+0x53/0x5d
> [  324.820931] [child7:7142] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls.
> [  324.828848]  strlen+0x53/0x5d
> [  324.828864]  getname_kernel+0x19/0x257
> [  324.828874]  kern_path+0x19/0x32
> [  324.828887]  lookup_bdev+0x52/0x182
> [  324.828908]  __x64_sys_quotactl+0x1fe/0x4e97
> [  324.833228]  ? kvm_sched_clock_read+0x14/0x28
> [  324.837181]  ? sched_clock+0x5/0x8
> [  324.837748]  ? sched_clock_cpu+0x18/0x151
> [  324.838396]  ? up_write+0xd7/0x399
> [  324.838944]  ? security_file_mprotect+0x93/0xb0
> [  324.839686]  ? __x64_sys_mprotect+0x31a/0x6a9
> [  324.840405]  ? fpregs_assert_state_consistent+0xae/0xd3
> [  324.841253]  do_syscall_64+0x34/0x6c
> [  324.841808]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  324.842540] RIP: 0033:0x7f77ba3311c9
> [  324.843079] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 dc 2c 00 f7 d8 64 89 01 48
> [  324.845838] RSP: 002b:00007ffe42abbe58 EFLAGS: 00000246 ORIG_RAX: 00000000000000b3
> [  324.846978] RAX: ffffffffffffffda RBX: 00000000000000b3 RCX: 00007f77ba3311c9
> [  324.848039] RDX: 0000000004000000 RSI: 00007f77b8719000 RDI: 0000000012121000
> [  324.849923] RBP: 00007f77baa1d000 R08: ffffffffffffffff R09: 0000000000000000
> [  324.850961] R10: 00007f77b8719000 R11: 0000000000000246 R12: 00007f77baa1d058
> [  324.852032] R13: 00007f77baa246b0 R14: 0000000000000000 R15: 00007f77baa1d000
> [  324.853117]
> [  324.853292]
> [  324.853372] Allocated by task 7148:
> [  324.854203]  kasan_save_stack+0x27/0x47
> [  324.854779]  __kasan_kmalloc+0xed/0x104
> [  324.855365]  kmem_cache_alloc+0xcb/0x135
> [  324.855971]  getname_flags+0x51/0x3a2
> [  324.856536]  __x64_sys_quotactl+0x1c1/0x4e97
> [  324.857205]  do_syscall_64+0x34/0x6c
> [  324.857749]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  324.858495]
> [  324.858656] [child2:7150] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls.
> [  324.858746] The buggy address belongs to the object at ffff88813be5e380
> [  324.858746]  which belongs to the cache names_cache of size 4096
> [  324.858764] The buggy address is located 0 bytes to the right of
> [  324.858764]  4096-byte region [ffff88813be5e380, ffff88813be5f380)
> [  324.860587]
> [  324.862729] The buggy address belongs to the page:
> [  324.862755] page:000000009f9037ac refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88813be5ffff pfn:0x13be5e
> [  324.867328] head:000000009f9037ac order:1 compound_mapcount:0
> [  324.868165] flags: 0x8000000000010200(slab|head)
> [  324.868875] raw: 8000000000010200 ffffea0005a88688 ffffea000459f288 ffff888100252300
> [  324.870009] raw: ffff88813be5ffff ffff88813be5e380 0000000100000001 0000000000000000
> [  324.871126] page dumped because: kasan: bad access detected
> [  324.871945]
> [  324.872192] Memory state around the buggy address:
> [  324.872947]  ffff88813be5f280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  324.873980]  ffff88813be5f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  324.875009] >ffff88813be5f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  324.876036]                    ^
> [  324.876538]  ffff88813be5f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  324.877588]  ffff88813be5f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  324.878621] ==================================================================
> [  324.879657] Disabling lock debugging due to kernel taint
> [  324.882776] [child2:7152] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls.
> [  324.882801]
> [  324.933069] [main] kernel became tainted! (32/0) Last seed was 2498072066
> [  324.933099]
> [  324.969750] trinity: Detected kernel tainting. Last seed was 2498072066
> [  324.969776]
> [  324.976192] [main] exit_reason=7, but 7 children still running.
> [  324.976217]
> [  326.978916] [main] Bailing main loop because kernel became tainted..
> [  326.978943]
> [  327.015587] [main] Ran 32788 syscalls. Successes: 10983  Failures: 20991
> [  327.015610]
>
> Kboot worker: lkp-worker04
> Elapsed time: 360
>
> kvm=(
>         qemu-system-x86_64
>         -enable-kvm
>         -cpu SandyBridge
>         -kernel $kernel
>         -initrd initrd-vm-snb-72.cgz
>         -m 8192
>         -smp 2
>         -device e1000,netdev=net0
>         -netdev user,id=net0,hostfwd=tcp::32032-:22
>         -boot order=nc
>         -no-reboot
>         -watchdog i6300esb
>         -watchdog-action debug
>         -rtc base=localtime
>
>
> To reproduce:
>
>         # build kernel
>         cd linux
>         cp config-5.9.0-13430-g00a4ef91e8f5 .config
>         make HOSTCC=clang-12 CC=clang-12 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage
>
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
>
>
>
> Thanks,
> Oliver Sang
>
Daniel Xu Nov. 6, 2020, 6:54 p.m. UTC | #6
On Thu Nov 5, 2020 at 8:32 PM PST, Alexei Starovoitov wrote:
> Daniel,
>
> the kasan complains about the previous version of your patch,
> but your v4 version looks equivalent.
> Could you try to repro this issue?
> The code looks correct, but kasan complain is concerning.
>
> On Thu, Nov 5, 2020 at 5:56 PM kernel test robot <oliver.sang@intel.com>
> wrote:
> >
> > Greeting,
> >
> > FYI, we noticed the following commit (built with clang-12):
> >
> > commit: 00a4ef91e8f5af6edceb9bd4bceed2305f038796 ("[PATCH bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator")
> > url: https://github.com/0day-ci/linux/commits/Daniel-Xu/lib-strncpy_from_user-c-Don-t-overcopy-bytes-after-NUL-terminator/20201104-103306
> > base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master

[...]

I'll take a look, thanks.

Seems like the original email went into my spam. I'll try to fix my spam
filter.
diff mbox series

Patch

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index e6d5fcc2cdf3..d084189eb05c 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -35,17 +35,22 @@  static inline long do_strncpy_from_user(char *dst, const char __user *src,
 		goto byte_at_a_time;
 
 	while (max >= sizeof(unsigned long)) {
-		unsigned long c, data;
+		unsigned long c, data, mask, *out;
 
 		/* Fall back to byte-at-a-time if we get a page fault */
 		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
 
-		*(unsigned long *)(dst+res) = c;
 		if (has_zero(c, &data, &constants)) {
 			data = prep_zero_mask(c, data, &constants);
 			data = create_zero_mask(data);
+			mask = zero_bytemask(data);
+			out = (unsigned long *)(dst+res);
+			*out = (*out & ~mask) | (c & mask);
 			return res + find_zero(data);
+		} else  {
+			*(unsigned long *)(dst+res) = c;
 		}
+
 		res += sizeof(unsigned long);
 		max -= sizeof(unsigned long);
 	}