Message ID | 36b09a00-9f72-4ef2-8f73-79b2ba99b11c@kernel.dk (mailing list archive) |
---|---|
State | New |
Series | io_uring/sqpoll: retain test for whether the CPU is valid |
On Mon, 2024-09-16 at 03:07 -0600, Jens Axboe wrote:
> A recent commit ensured that SQPOLL cannot be setup with a CPU that
> isn't in the current tasks cpuset, but it also dropped testing whether
> the CPU is valid in the first place. Without that, if a task passes in
> a CPU value that is too high, the following KASAN splat can get
> triggered:
>
> BUG: KASAN: stack-out-of-bounds in io_sq_offload_create+0x858/0xaa4
> Read of size 8 at addr ffff800089bc7b90 by task wq-aff.t/1391
>
> CPU: 4 UID: 1000 PID: 1391 Comm: wq-aff.t Not tainted 6.11.0-rc7-00227-g371c468f4db6 #7080
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace.part.0+0xcc/0xe0
>  show_stack+0x14/0x1c
>  dump_stack_lvl+0x58/0x74
>  print_report+0x16c/0x4c8
>  kasan_report+0x9c/0xe4
>  __asan_report_load8_noabort+0x1c/0x24
>  io_sq_offload_create+0x858/0xaa4
>  io_uring_setup+0x1394/0x17c4
>  __arm64_sys_io_uring_setup+0x6c/0x180
>  invoke_syscall+0x6c/0x260
>  el0_svc_common.constprop.0+0x158/0x224
>  do_el0_svc+0x3c/0x5c
>  el0_svc+0x34/0x70
>  el0t_64_sync_handler+0x118/0x124
>  el0t_64_sync+0x168/0x16c
>
> The buggy address belongs to stack of task wq-aff.t/1391
>  and is located at offset 48 in frame:
>  io_sq_offload_create+0x0/0xaa4
>
> This frame has 1 object:
>  [32, 40) 'allowed_mask'
>
> The buggy address belongs to the virtual mapping at
>  [ffff800089bc0000, ffff800089bc9000) created by:
>  kernel_clone+0x124/0x7e0
>
> The buggy address belongs to the physical page:
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff0000d740af80 pfn:0x11740a
> memcg:ffff0000c2706f02
> flags: 0xbffe00000000000(node=0|zone=2|lastcpupid=0x1fff)
> raw: 0bffe00000000000 0000000000000000 dead000000000122 0000000000000000
> raw: ffff0000d740af80 0000000000000000 00000001ffffffff ffff0000c2706f02
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff800089bc7a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff800089bc7b00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> >ffff800089bc7b80: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>                          ^
>  ffff800089bc7c00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>  ffff800089bc7c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202409161632.cbeeca0d-lkp@intel.com
> Fixes: f011c9cf04c0 ("io_uring/sqpoll: do not allow pinning outside of cpuset")
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index 272df9d00f45..7adfcf6818ff 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -465,6 +465,8 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
>  			int cpu = p->sq_thread_cpu;
>
>  			ret = -EINVAL;
> +			if (cpu >= nr_cpu_ids || !cpu_online(cpu))
> +				goto err_sqpoll;

Thanks for fixing. I'm just wondering if cpu_online is really needed,
as offline CPUs are in the mask as well, aren't they?

Anyways:

Tested-by: Felix Moessbauer <felix.moessbauer@siemens.com>

Felix

>  			cpuset_cpus_allowed(current, &allowed_mask);
>  			if (!cpumask_test_cpu(cpu, &allowed_mask))
>  				goto err_sqpoll;

On 9/16/24 3:11 AM, MOESSBAUER, Felix wrote:
> On Mon, 2024-09-16 at 03:07 -0600, Jens Axboe wrote:
>> A recent commit ensured that SQPOLL cannot be setup with a CPU that
>> isn't in the current tasks cpuset, but it also dropped testing whether
>> the CPU is valid in the first place. Without that, if a task passes in
>> a CPU value that is too high, the following KASAN splat can get
>> triggered:
>>
>> BUG: KASAN: stack-out-of-bounds in io_sq_offload_create+0x858/0xaa4
>> Read of size 8 at addr ffff800089bc7b90 by task wq-aff.t/1391
>>
>> CPU: 4 UID: 1000 PID: 1391 Comm: wq-aff.t Not tainted 6.11.0-rc7-00227-g371c468f4db6 #7080
>> Hardware name: linux,dummy-virt (DT)
>> Call trace:
>>  dump_backtrace.part.0+0xcc/0xe0
>>  show_stack+0x14/0x1c
>>  dump_stack_lvl+0x58/0x74
>>  print_report+0x16c/0x4c8
>>  kasan_report+0x9c/0xe4
>>  __asan_report_load8_noabort+0x1c/0x24
>>  io_sq_offload_create+0x858/0xaa4
>>  io_uring_setup+0x1394/0x17c4
>>  __arm64_sys_io_uring_setup+0x6c/0x180
>>  invoke_syscall+0x6c/0x260
>>  el0_svc_common.constprop.0+0x158/0x224
>>  do_el0_svc+0x3c/0x5c
>>  el0_svc+0x34/0x70
>>  el0t_64_sync_handler+0x118/0x124
>>  el0t_64_sync+0x168/0x16c
>>
>> The buggy address belongs to stack of task wq-aff.t/1391
>>  and is located at offset 48 in frame:
>>  io_sq_offload_create+0x0/0xaa4
>>
>> This frame has 1 object:
>>  [32, 40) 'allowed_mask'
>>
>> The buggy address belongs to the virtual mapping at
>>  [ffff800089bc0000, ffff800089bc9000) created by:
>>  kernel_clone+0x124/0x7e0
>>
>> The buggy address belongs to the physical page:
>> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff0000d740af80 pfn:0x11740a
>> memcg:ffff0000c2706f02
>> flags: 0xbffe00000000000(node=0|zone=2|lastcpupid=0x1fff)
>> raw: 0bffe00000000000 0000000000000000 dead000000000122 0000000000000000
>> raw: ffff0000d740af80 0000000000000000 00000001ffffffff ffff0000c2706f02
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>  ffff800089bc7a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>  ffff800089bc7b00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>> >ffff800089bc7b80: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>                          ^
>>  ffff800089bc7c00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>>  ffff800089bc7c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
>>
>> Reported-by: kernel test robot <oliver.sang@intel.com>
>> Closes: https://lore.kernel.org/oe-lkp/202409161632.cbeeca0d-lkp@intel.com
>> Fixes: f011c9cf04c0 ("io_uring/sqpoll: do not allow pinning outside of cpuset")
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
>> index 272df9d00f45..7adfcf6818ff 100644
>> --- a/io_uring/sqpoll.c
>> +++ b/io_uring/sqpoll.c
>> @@ -465,6 +465,8 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
>>  			int cpu = p->sq_thread_cpu;
>>
>>  			ret = -EINVAL;
>> +			if (cpu >= nr_cpu_ids || !cpu_online(cpu))
>> +				goto err_sqpoll;
>
> Thanks for fixing. I'm just wondering if cpu_online is really needed,
> as offline CPUs are in the mask as well, aren't they?

Probably not, but I felt saner just returning the old check so we don't
access the cpumask variable beyond the end.

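For readers following the thread, the ordering issue discussed above (range check first, bit test second) can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: MODEL_NR_CPU_IDS, struct model_cpumask, model_mask_test() and model_validate_cpu() are hypothetical names standing in for nr_cpu_ids, struct cpumask, cpumask_test_cpu() and the checks in io_sq_offload_create(). The explicit "cpu < 0" test is also a choice made for this model.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's nr_cpu_ids and cpumask storage. */
#define MODEL_NR_CPU_IDS 8
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_MASK_LONGS \
	((MODEL_NR_CPU_IDS + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

struct model_cpumask {
	unsigned long bits[MODEL_MASK_LONGS];
};

/*
 * Unchecked bit test. If cpu >= MODEL_NR_CPU_IDS the word index can fall
 * outside bits[], which is the kind of out-of-bounds read the splat shows
 * for the on-stack 'allowed_mask'. Callers must range-check first.
 */
static bool model_mask_test(const struct model_cpumask *mask, unsigned int cpu)
{
	return (mask->bits[cpu / MODEL_BITS_PER_LONG] >>
		(cpu % MODEL_BITS_PER_LONG)) & 1UL;
}

/*
 * Returns 0 if the requested CPU is acceptable, -1 otherwise.
 * The range check comes before any mask lookup, so neither mask is
 * ever indexed with an out-of-range value.
 */
static int model_validate_cpu(int cpu,
			      const struct model_cpumask *online,
			      const struct model_cpumask *allowed)
{
	/* Model-only: reject negative values explicitly. */
	if (cpu < 0 || cpu >= MODEL_NR_CPU_IDS)
		return -1;
	if (!model_mask_test(online, (unsigned int)cpu))
		return -1;
	if (!model_mask_test(allowed, (unsigned int)cpu))
		return -1;
	return 0;
}
```

With an online mask covering CPUs 0-3 and an allowed mask covering CPUs 0 and 2, the model accepts 0 and 2 and rejects 1, 5, 8 and any larger value without touching memory outside either mask.
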
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index 272df9d00f45..7adfcf6818ff 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -465,6 +465,8 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
 			int cpu = p->sq_thread_cpu;
 
 			ret = -EINVAL;
+			if (cpu >= nr_cpu_ids || !cpu_online(cpu))
+				goto err_sqpoll;
 			cpuset_cpus_allowed(current, &allowed_mask);
 			if (!cpumask_test_cpu(cpu, &allowed_mask))
 				goto err_sqpoll;