Message ID | eb78270e61e4d2e8ece047430d8397e000ef8569.1604456921.git.dxu@dxuuu.xyz (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator | expand |
On 11/4/20 3:29 AM, Daniel Xu wrote: > do_strncpy_from_user() may copy some extra bytes after the NUL > terminator into the destination buffer. This usually does not matter for > normal string operations. However, when BPF programs key BPF maps with > strings, this matters a lot. > > A BPF program may read strings from user memory by calling the > bpf_probe_read_user_str() helper which eventually calls > do_strncpy_from_user(). The program can then key a map with the > resulting string. BPF map keys are fixed-width and string-agnostic, > meaning that map keys are treated as a set of bytes. > > The issue is when do_strncpy_from_user() overcopies bytes after the NUL > terminator, it can result in seemingly identical strings occupying > multiple slots in a BPF map. This behavior is subtle and totally > unexpected by the user. > > This commit uses the proper word-at-a-time APIs to avoid overcopying. > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> It looks like this is a regression from the recent refactoring of the mem probing util functions? Could we add a Fixes tag and then we'd also need to target the fix against bpf tree instead of bpf-next, no? Moreover, a BPF kselftest would help to make sure it doesn't regress in future again. Thanks, Daniel
Hi Daniel, On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote: > On 11/4/20 3:29 AM, Daniel Xu wrote: > > do_strncpy_from_user() may copy some extra bytes after the NUL > > terminator into the destination buffer. This usually does not matter for > > normal string operations. However, when BPF programs key BPF maps with > > strings, this matters a lot. > > > > A BPF program may read strings from user memory by calling the > > bpf_probe_read_user_str() helper which eventually calls > > do_strncpy_from_user(). The program can then key a map with the > > resulting string. BPF map keys are fixed-width and string-agnostic, > > meaning that map keys are treated as a set of bytes. > > > > The issue is when do_strncpy_from_user() overcopies bytes after the NUL > > terminator, it can result in seemingly identical strings occupying > > multiple slots in a BPF map. This behavior is subtle and totally > > unexpected by the user. > > > > This commit uses the proper word-at-a-time APIs to avoid overcopying. > > > > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> > > It looks like this is a regression from the recent refactoring of the > mem probing > util functions? I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"). The old bpf_probe_read_str() used the kernel's byte-by-byte copying routine. bpf_probe_read_user_str() started using strncpy_from_user() which has been doing the long-sized strides since ~2012 or earlier. I tried to build and test the kernel at that commit but it seems my compiler is too new to build that old code. Bunch of build failures. I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework the compat kernel probe handling"). > Could we add a Fixes tag and then we'd also need to target the fix > against bpf tree instead of bpf-next, no? Sure, will do in v2. > > Moreover, a BPF kselftest would help to make sure it doesn't regress in > future again. Ditto. [..] Thanks, Daniel
On 11/4/20 9:18 PM, Daniel Xu wrote: > On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote: >> On 11/4/20 3:29 AM, Daniel Xu wrote: >>> do_strncpy_from_user() may copy some extra bytes after the NUL >>> terminator into the destination buffer. This usually does not matter for >>> normal string operations. However, when BPF programs key BPF maps with >>> strings, this matters a lot. >>> >>> A BPF program may read strings from user memory by calling the >>> bpf_probe_read_user_str() helper which eventually calls >>> do_strncpy_from_user(). The program can then key a map with the >>> resulting string. BPF map keys are fixed-width and string-agnostic, >>> meaning that map keys are treated as a set of bytes. >>> >>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL >>> terminator, it can result in seemingly identical strings occupying >>> multiple slots in a BPF map. This behavior is subtle and totally >>> unexpected by the user. >>> >>> This commit uses the proper word-at-a-time APIs to avoid overcopying. >>> >>> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> >> >> It looks like this is a regression from the recent refactoring of the >> mem probing >> util functions? > > I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add > probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"). > The old bpf_probe_read_str() used the kernel's byte-by-byte copying > routine. bpf_probe_read_user_str() started using strncpy_from_user() > which has been doing the long-sized strides since ~2012 or earlier. > > I tried to build and test the kernel at that commit but it seems my > compiler is too new to build that old code. Bunch of build failures. > > I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework > the compat kernel probe handling"). Ah I see, it was just reusing 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions"). Potentially it might be safer choice to just rework the strncpy_from_user_nofault() to mimic strncpy_from_kernel_nofault() in that regard? >> Could we add a Fixes tag and then we'd also need to target the fix >> against bpf tree instead of bpf-next, no? > > Sure, will do in v2. > >> Moreover, a BPF kselftest would help to make sure it doesn't regress in >> future again. > > Ditto. > > [..] > > Thanks, > Daniel >
On Wed Nov 4, 2020 at 2:36 PM PST, Daniel Borkmann wrote: > On 11/4/20 9:18 PM, Daniel Xu wrote: > > On Wed Nov 4, 2020 at 8:24 AM PST, Daniel Borkmann wrote: > >> On 11/4/20 3:29 AM, Daniel Xu wrote: > >>> do_strncpy_from_user() may copy some extra bytes after the NUL > >>> terminator into the destination buffer. This usually does not matter for > >>> normal string operations. However, when BPF programs key BPF maps with > >>> strings, this matters a lot. > >>> > >>> A BPF program may read strings from user memory by calling the > >>> bpf_probe_read_user_str() helper which eventually calls > >>> do_strncpy_from_user(). The program can then key a map with the > >>> resulting string. BPF map keys are fixed-width and string-agnostic, > >>> meaning that map keys are treated as a set of bytes. > >>> > >>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL > >>> terminator, it can result in seemingly identical strings occupying > >>> multiple slots in a BPF map. This behavior is subtle and totally > >>> unexpected by the user. > >>> > >>> This commit uses the proper word-at-a-time APIs to avoid overcopying. > >>> > >>> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> > >> > >> It looks like this is a regression from the recent refactoring of the > >> mem probing > >> util functions? > > > > I think it was like this from the beginning, at 6ae08ae3dea2 ("bpf: Add > > probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"). > > The old bpf_probe_read_str() used the kernel's byte-by-byte copying > > routine. bpf_probe_read_user_str() started using strncpy_from_user() > > which has been doing the long-sized strides since ~2012 or earlier. > > > > I tried to build and test the kernel at that commit but it seems my > > compiler is too new to build that old code. Bunch of build failures. > > > > I assume the refactor you're referring to is 8d92db5c04d1 ("bpf: rework > > the compat kernel probe handling"). > > Ah I see, it was just reusing 3d7081822f7f ("uaccess: Add non-pagefault > user-space > read functions"). Potentially it might be safer choice to just rework > the > strncpy_from_user_nofault() to mimic strncpy_from_kernel_nofault() in > that > regard? I'm a little reluctant to do that b/c it would introduce less efficient, duplicated code. The word-at-a-time API already has the zero_bytemask() API so it's clear that it was designed to handle this issue -- we're not really hacking anything here. I'll send out a V2 with the selftest shortly. Happy to change things after that. Thanks, Daniel
Daniel, the kasan complains about the previous version of your patch, but your v4 version looks equivalent. Could you try to repro this issue? The code looks correct, but kasan complain is concerning. On Thu, Nov 5, 2020 at 5:56 PM kernel test robot <oliver.sang@intel.com> wrote: > > Greeting, > > FYI, we noticed the following commit (built with clang-12): > > commit: 00a4ef91e8f5af6edceb9bd4bceed2305f038796 ("[PATCH bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator") > url: https://github.com/0day-ci/linux/commits/Daniel-Xu/lib-strncpy_from_user-c-Don-t-overcopy-bytes-after-NUL-terminator/20201104-103306 > base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master > > in testcase: trinity > version: trinity-x86_64-af355e9-1_2019-12-03 > with following parameters: > > runtime: 300s > > test-description: Trinity is a linux system call fuzz tester. > test-url: http://codemonkey.org.uk/projects/trinity/ > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): > > > +----------------------------------------------------------------------------+------------+------------+ > | | f4c3881edb | 00a4ef91e8 | > +----------------------------------------------------------------------------+------------+------------+ > | boot_successes | 8 | 4 | > | boot_failures | 14 | 15 | > | Initramfs_unpacking_failed | 13 | 7 | > | Kernel_panic-not_syncing:VFS:Unable_to_mount_root_fs_on_unknown-block(#,#) | 13 | 9 | > | BUG:kernel_hang_in_boot_stage | 1 | | > | BUG:KASAN:slab-out-of-bounds_in_s | 0 | 3 | > | BUG:KASAN:slab-out-of-bounds_in_l | 0 | 3 | > +----------------------------------------------------------------------------+------------+------------+ > > > If you fix the issue, kindly add following tag > Reported-by: kernel test robot <oliver.sang@intel.com> > > > [ 324.803835] BUG: KASAN: slab-out-of-bounds in strlen+0x53/0x5d > [ 324.808932] Read of size 1 at addr ffff88813be5f380 by task trinity-c0/7148 > [ 324.809979] > [ 324.810240] CPU: 1 PID: 7148 Comm: trinity-c0 Not tainted 5.9.0-13430-g00a4ef91e8f5 #1 > [ 324.811397] Call Trace: > [ 324.811797] dump_stack+0x156/0x194 > [ 324.812387] ? wake_up_klogd+0x49/0x5e > [ 324.813118] ? vprintk_emit+0x297/0x307 > [ 324.813680] print_address_description+0x25/0x4b7 > [ 324.814354] ? printk+0x54/0x5d > [ 324.814877] ? kasan_report+0xad/0x187 > [ 324.815531] kasan_report+0x140/0x187 > [ 324.816187] ? strlen+0x53/0x5d > [ 324.820931] [child7:7142] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls. > [ 324.828848] strlen+0x53/0x5d > [ 324.828864] getname_kernel+0x19/0x257 > [ 324.828874] kern_path+0x19/0x32 > [ 324.828887] lookup_bdev+0x52/0x182 > [ 324.828908] __x64_sys_quotactl+0x1fe/0x4e97 > [ 324.833228] ? kvm_sched_clock_read+0x14/0x28 > [ 324.837181] ? sched_clock+0x5/0x8 > [ 324.837748] ? sched_clock_cpu+0x18/0x151 > [ 324.838396] ? up_write+0xd7/0x399 > [ 324.838944] ? security_file_mprotect+0x93/0xb0 > [ 324.839686] ? __x64_sys_mprotect+0x31a/0x6a9 > [ 324.840405] ? fpregs_assert_state_consistent+0xae/0xd3 > [ 324.841253] do_syscall_64+0x34/0x6c > [ 324.841808] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 324.842540] RIP: 0033:0x7f77ba3311c9 > [ 324.843079] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 dc 2c 00 f7 d8 64 89 01 48 > [ 324.845838] RSP: 002b:00007ffe42abbe58 EFLAGS: 00000246 ORIG_RAX: 00000000000000b3 > [ 324.846978] RAX: ffffffffffffffda RBX: 00000000000000b3 RCX: 00007f77ba3311c9 > [ 324.848039] RDX: 0000000004000000 RSI: 00007f77b8719000 RDI: 0000000012121000 > [ 324.849923] RBP: 00007f77baa1d000 R08: ffffffffffffffff R09: 0000000000000000 > [ 324.850961] R10: 00007f77b8719000 R11: 0000000000000246 R12: 00007f77baa1d058 > [ 324.852032] R13: 00007f77baa246b0 R14: 0000000000000000 R15: 00007f77baa1d000 > [ 324.853117] > [ 324.853292] > [ 324.853372] Allocated by task 7148: > [ 324.854203] kasan_save_stack+0x27/0x47 > [ 324.854779] __kasan_kmalloc+0xed/0x104 > [ 324.855365] kmem_cache_alloc+0xcb/0x135 > [ 324.855971] getname_flags+0x51/0x3a2 > [ 324.856536] __x64_sys_quotactl+0x1c1/0x4e97 > [ 324.857205] do_syscall_64+0x34/0x6c > [ 324.857749] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 324.858495] > [ 324.858656] [child2:7150] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls. > [ 324.858746] The buggy address belongs to the object at ffff88813be5e380 > [ 324.858746] which belongs to the cache names_cache of size 4096 > [ 324.858764] The buggy address is located 0 bytes to the right of > [ 324.858764] 4096-byte region [ffff88813be5e380, ffff88813be5f380) > [ 324.860587] > [ 324.862729] The buggy address belongs to the page: > [ 324.862755] page:000000009f9037ac refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88813be5ffff pfn:0x13be5e > [ 324.867328] head:000000009f9037ac order:1 compound_mapcount:0 > [ 324.868165] flags: 0x8000000000010200(slab|head) > [ 324.868875] raw: 8000000000010200 ffffea0005a88688 ffffea000459f288 ffff888100252300 > [ 324.870009] raw: ffff88813be5ffff ffff88813be5e380 0000000100000001 0000000000000000 > [ 324.871126] page dumped because: kasan: bad access detected > [ 324.871945] > [ 324.872192] Memory state around the buggy address: > [ 324.872947] ffff88813be5f280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 324.873980] ffff88813be5f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 324.875009] >ffff88813be5f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > [ 324.876036] ^ > [ 324.876538] ffff88813be5f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > [ 324.877588] ffff88813be5f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > [ 324.878621] ================================================================== > [ 324.879657] Disabling lock debugging due to kernel taint > [ 324.882776] [child2:7152] Tried 16 32-bit syscalls unsuccessfully. Disabling all 32-bit syscalls. > [ 324.882801] > [ 324.933069] [main] kernel became tainted! (32/0) Last seed was 2498072066 > [ 324.933099] > [ 324.969750] trinity: Detected kernel tainting. Last seed was 2498072066 > [ 324.969776] > [ 324.976192] [main] exit_reason=7, but 7 children still running. > [ 324.976217] > [ 326.978916] [main] Bailing main loop because kernel became tainted.. > [ 326.978943] > [ 327.015587] [main] Ran 32788 syscalls. Successes: 10983 Failures: 20991 > [ 327.015610] > > Kboot worker: lkp-worker04 > Elapsed time: 360 > > kvm=( > qemu-system-x86_64 > -enable-kvm > -cpu SandyBridge > -kernel $kernel > -initrd initrd-vm-snb-72.cgz > -m 8192 > -smp 2 > -device e1000,netdev=net0 > -netdev user,id=net0,hostfwd=tcp::32032-:22 > -boot order=nc > -no-reboot > -watchdog i6300esb > -watchdog-action debug > -rtc base=localtime > > > To reproduce: > > # build kernel > cd linux > cp config-5.9.0-13430-g00a4ef91e8f5 .config > make HOSTCC=clang-12 CC=clang-12 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage > > git clone https://github.com/intel/lkp-tests.git > cd lkp-tests > bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email > > > > Thanks, > Oliver Sang >
On Thu Nov 5, 2020 at 8:32 PM PST, Alexei Starovoitov wrote: > Daniel, > > the kasan complains about the previous version of your patch, > but your v4 version looks equivalent. > Could you try to repro this issue? > The code looks correct, but kasan complain is concerning. > > On Thu, Nov 5, 2020 at 5:56 PM kernel test robot <oliver.sang@intel.com> > wrote: > > > > Greeting, > > > > FYI, we noticed the following commit (built with clang-12): > > > > commit: 00a4ef91e8f5af6edceb9bd4bceed2305f038796 ("[PATCH bpf-next] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator") > > url: https://github.com/0day-ci/linux/commits/Daniel-Xu/lib-strncpy_from_user-c-Don-t-overcopy-bytes-after-NUL-terminator/20201104-103306 > > base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master [...] I'll take a look, thanks. Seems like the original email went into my spam. I'll try to fix my spam filter.
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index e6d5fcc2cdf3..d084189eb05c 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, goto byte_at_a_time; while (max >= sizeof(unsigned long)) { - unsigned long c, data; + unsigned long c, data, mask, *out; /* Fall back to byte-at-a-time if we get a page fault */ unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time); - *(unsigned long *)(dst+res) = c; if (has_zero(c, &data, &constants)) { data = prep_zero_mask(c, data, &constants); data = create_zero_mask(data); + mask = zero_bytemask(data); + out = (unsigned long *)(dst+res); + *out = (*out & ~mask) | (c & mask); return res + find_zero(data); + } else { + *(unsigned long *)(dst+res) = c; } + res += sizeof(unsigned long); max -= sizeof(unsigned long); }
do_strncpy_from_user() may copy some extra bytes after the NUL terminator into the destination buffer. This usually does not matter for normal string operations. However, when BPF programs key BPF maps with strings, this matters a lot. A BPF program may read strings from user memory by calling the bpf_probe_read_user_str() helper which eventually calls do_strncpy_from_user(). The program can then key a map with the resulting string. BPF map keys are fixed-width and string-agnostic, meaning that map keys are treated as a set of bytes. The issue is when do_strncpy_from_user() overcopies bytes after the NUL terminator, it can result in seemingly identical strings occupying multiple slots in a BPF map. This behavior is subtle and totally unexpected by the user. This commit uses the proper word-at-a-time APIs to avoid overcopying. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> --- lib/strncpy_from_user.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)