Message ID | 20241202093943.227786-1-dmantipov@yandex.ru (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] netfs: fix kernel BUG in iov_iter_revert() | expand |
Interesting... The test program also causes fuse to oops (see attached) over without even getting to netfslib. The BUG is in iov_iter_revert(): if (iov_iter_is_xarray(i) || iter_is_ubuf(i)) { BUG(); /* We should never go beyond the start of the specified * range since we might then be straying into pages that * aren't pinned. */ I was trying the C reproducer from here: https://syzkaller.appspot.com/bug?extid=404b4b745080b6210c6c David --- FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 1 CPU: 3 UID: 0 PID: 5926 Comm: repro-12bc200f9 Not tainted 6.13.0-rc1-build3+ #5189 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 should_fail_ex+0x12e/0x160 should_failslab+0x50/0x60 __kmalloc_noprof+0x9f/0x290 ? fuse_get_user_pages+0xad/0x290 ? kmem_cache_debug_flags+0xc/0x20 fuse_get_user_pages+0xad/0x290 ? fuse_folios_alloc+0x13/0x30 fuse_direct_io+0x220/0x460 fuse_direct_write_iter+0x10f/0x180 vfs_write+0x142/0x1e0 ? __pfx_fuse_file_write_iter+0x10/0x10 ksys_write+0x6d/0xc0 do_syscall_64+0x80/0xe0 entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7fdd71f0f85d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007fdd71df7e68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fdd71df8cdc RCX: 00007fdd71f0f85d RDX: 000000000000000f RSI: 0000000020000280 RDI: 0000000000000005 RBP: 00007fdd71df7e90 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdd71df86c0 R13: ffffffffffffff88 R14: 0000000000000002 R15: 00007fffe6090bd0 </TASK> ------------[ cut here ]------------ kernel BUG at lib/iov_iter.c:626! Oops: invalid opcode: 0000 [#1] SMP PTI CPU: 3 UID: 0 PID: 5926 Comm: repro-12bc200f9 Not tainted 6.13.0-rc1-build3+ #5189 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 RIP: 0010:iov_iter_revert+0x4c/0xf0 Code: 80 f9 06 0f 84 bf 00 00 00 48 8b 47 08 48 39 f0 72 0c 48 29 f0 48 89 47 08 c3 cc cc cc cc 48 29 c6 80 f9 05 74 04 84 c9 75 02 <0f> 0b 80 f9 02 48 8b 47 10 75 15 8b 48 f8 48 83 e8 10 48 ff 42 20 RSP: 0018:ffff88811f16fd00 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888107504900 RCX: 0000000000001600 RDX: ffff88811f16fe60 RSI: 000000000000000f RDI: ffff88811f16fe60 RBP: 0000000000000000 R08: ffff88840fb9fa40 R09: ffff88840fb9fa60 R10: 0000000000000006 R11: 0000000000000020 R12: ffff88810b481400 R13: 00000000fffffff4 R14: ffff88811ce26cc0 R15: ffff88811f16fdb8 FS: 00007fdd71df86c0(0000) GS:ffff88840fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560d9a217710 CR3: 000000010760e003 CR4: 00000000001706f0 Call Trace: <TASK> ? __die_body+0x1a/0x60 ? die+0x30/0x50 ? do_trap+0x7a/0x100 ? iov_iter_revert+0x4c/0xf0 ? iov_iter_revert+0x4c/0xf0 ? do_error_trap+0x6e/0xa0 ? iov_iter_revert+0x4c/0xf0 ? exc_invalid_op+0x49/0x60 ? iov_iter_revert+0x4c/0xf0 ? asm_exc_invalid_op+0x16/0x20 ? iov_iter_revert+0x4c/0xf0 fuse_direct_io+0x398/0x460 fuse_direct_write_iter+0x10f/0x180 vfs_write+0x142/0x1e0 ? __pfx_fuse_file_write_iter+0x10/0x10 ksys_write+0x6d/0xc0 do_syscall_64+0x80/0xe0 entry_SYSCALL_64_after_hwframe+0x71/0x79
On Wed, 4 Dec 2024 at 10:56, David Howells <dhowells@redhat.com> wrote: > > Interesting... The test program also causes fuse to oops (see attached) over > without even getting to netfslib. The BUG is in iov_iter_revert(): > > if (iov_iter_is_xarray(i) || iter_is_ubuf(i)) { > BUG(); /* We should never go beyond the start of the specified > * range since we might then be straying into pages that > * aren't pinned. > */ Can you please test this? --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1541,8 +1541,10 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii, */ struct page **pages = kzalloc(max_pages * sizeof(struct page *), GFP_KERNEL); - if (!pages) + if (!pages) { + *nbytesp = 0; return -ENOMEM; + } while (nbytes < *nbytesp && nr_pages < max_pages) { unsigned nfolios, i; (Also attaching patch without whitespace damage.) Thanks, Miklos
On 12/4/24 3:41 PM, Miklos Szeredi wrote: > Can you please test this? Tested-by: Dmitry Antipov <dmantipov@yandex.ru> Dmitry
I seems to work, so you can add:
Tested-by: David Howells <dhowells@redhat.com>
Please don't mark it as closing the syzbot report as that pertains to
netfslib, not fuse. syzbot attached both reproducers to the same overall
report when it really ought to have split them.
Note that a reproducer zombie is left behind and the filesystem can't be
unmounted as it is busy (which may be due to the zombie). However, the zombie
goes away and is replaced with another if the reproducer is run again. I
wonder if I'm seeing a systemd bug as systemd should be reaping the zombies.
David
Hi Dmitry, > So wrap 'netfs_prepare_read_iterator()' to handle all possible > -ENOMEM cases and mark the corresponding subrequest as cancelled. I think there's a slightly better way to do it. There are three users of netfs_prepare_read_iterator() but there's also the call to ->prepare_read() which I think can all be combined. (And I've just realised that the patch I asked syzbot to test has a bug in it) David
On 12/4/24 13:41, Miklos Szeredi wrote: > On Wed, 4 Dec 2024 at 10:56, David Howells <dhowells@redhat.com> wrote: >> >> Interesting... The test program also causes fuse to oops (see attached) over >> without even getting to netfslib. The BUG is in iov_iter_revert(): >> >> if (iov_iter_is_xarray(i) || iter_is_ubuf(i)) { >> BUG(); /* We should never go beyond the start of the specified >> * range since we might then be straying into pages that >> * aren't pinned. >> */ > > Can you please test this? > > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -1541,8 +1541,10 @@ static int fuse_get_user_pages(struct > fuse_args_pages *ap, struct iov_iter *ii, > */ > struct page **pages = kzalloc(max_pages * sizeof(struct page *), > GFP_KERNEL); > - if (!pages) > + if (!pages) { > + *nbytesp = 0; > return -ENOMEM; > + } > > while (nbytes < *nbytesp && nr_pages < max_pages) { > unsigned nfolios, i; > > (Also attaching patch without whitespace damage.) I had already posted a patch on Monday. https://lore.kernel.org/r/20241203-fix-fuse_get_user_pages-v2-1-acce8a29d06b@ddn.com @David, is that the same sysbot report or another one? Thanks, Bernd
On Fri, 6 Dec 2024 at 13:41, Bernd Schubert <bernd.schubert@fastmail.fm> wrote: > I had already posted a patch on Monday. > > https://lore.kernel.org/r/20241203-fix-fuse_get_user_pages-v2-1-acce8a29d06b@ddn.com Sorry, missed that. Applied your version with the above test-by's added. > @David, is that the same sysbot report or another one? It's a different one, assigned to netfs, not fuse. Thanks, Miklos
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c index 7ac34550c403..4404f3c6ec3e 100644 --- a/fs/netfs/buffered_read.c +++ b/fs/netfs/buffered_read.c @@ -95,7 +95,7 @@ static size_t netfs_load_buffer_from_ra(struct netfs_io_request *rreq, } /* - * netfs_prepare_read_iterator - Prepare the subreq iterator for I/O + * __netfs_prepare_read_iterator - Prepare the subreq iterator for I/O * @subreq: The subrequest to be set up * * Prepare the I/O iterator representing the read buffer on a subrequest for @@ -109,7 +109,7 @@ static size_t netfs_load_buffer_from_ra(struct netfs_io_request *rreq, * [!] NOTE: This must be run in the same thread as ->issue_read() was called * in as we access the readahead_control struct. */ -static ssize_t netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) +static ssize_t __netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) { struct netfs_io_request *rreq = subreq->rreq; size_t rsize = subreq->len; @@ -174,6 +174,21 @@ static ssize_t netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) return subreq->len; } +/* Wrap the above by handling possible -ENOMEM and + * marking the corresponding subrequest as cancelled. + */ +static inline ssize_t netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) +{ + struct netfs_io_request *rreq = subreq->rreq; + ssize_t slice = __netfs_prepare_read_iterator(subreq); + + if (unlikely(slice < 0)) { + atomic_dec(&rreq->nr_outstanding); + netfs_put_subrequest(subreq, false, netfs_sreq_trace_put_cancel); + } + return slice; +} + static enum netfs_io_source netfs_cache_prepare_read(struct netfs_io_request *rreq, struct netfs_io_subrequest *subreq, loff_t i_size) @@ -285,9 +300,7 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq) } slice = netfs_prepare_read_iterator(subreq); - if (slice < 0) { - atomic_dec(&rreq->nr_outstanding); - netfs_put_subrequest(subreq, false, netfs_sreq_trace_put_cancel); + if (unlikely(slice < 0)) { ret = slice; break; } @@ -302,6 +315,10 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq) trace_netfs_sreq(subreq, netfs_sreq_trace_submit); netfs_stat(&netfs_n_rh_zero); slice = netfs_prepare_read_iterator(subreq); + if (unlikely(slice < 0)) { + ret = slice; + break; + } __set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags); netfs_read_subreq_terminated(subreq, 0, false); goto done; @@ -310,6 +327,10 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq) if (source == NETFS_READ_FROM_CACHE) { trace_netfs_sreq(subreq, netfs_sreq_trace_submit); slice = netfs_prepare_read_iterator(subreq); + if (unlikely(slice < 0)) { + ret = slice; + break; + } netfs_read_cache_to_pagecache(rreq, subreq); goto done; }