Message ID | 20241022122141.2030194-1-yangerkun@huaweicloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [RFC] NFSv4: fix rpc_task use-after-free when open concurrently | expand |
在 2024/10/22 20:21, Yang Erkun 写道: > From: Yang Erkun <yangerkun@huawei.com> > > Two threads that work with the same cred try to open different files > concurrently, they will utilize the same nfs4_state_owner. And in order > to sequential open request send to server, the second task will fall > into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one > work doing the open operation. Furthermore, the second task will wait > until the first task completes its work, call rpc_wake_up_queued_task in > nfs_release_seqid to wake up the second task, allowing it to complete > the remaining open operation. > > The preceding logic does not cause any problems under normal > circumstances. However, when once we force an unmount using `umount -f`, > the function nfs_umount_begin attempts to kill all tasks by calling > rpc_signal_task. This help wake up the second task, but it sets the > status to -ERESTARTSYS. This status prevents `nfs4_open_release` from > calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second > task will be freed, the original tasks will still exist in > sequence->list(see nfs_release_seqid). Latter, when the first thread > calls nfs_release_seqid and attempts to wake up the second task, it will > trigger the uaf. > > To resolve this issue, ensure rpc_task will remove it from > sequence->list in nfs4_open_release when open failed, besides, we can > only wakeup the next rpc_task, or use-after-free will happen too since > privious rpc_task may be released too. > > ================================================================== > BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0 > Read of size 8 at addr ff11000007639930 by task bash/792 > > CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G B W > 6.11.0-09960-gd10b58fe53dc-dirty #10 > Tainted: [B]=BAD_PAGE, [W]=WARN > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.16.1-2.fc37 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0xa3/0x120 > print_address_description.constprop.0+0x63/0x510 > print_report+0xf5/0x360 > kasan_report+0xd9/0x140 > __asan_report_load8_noabort+0x24/0x40 > rpc_wake_up_queued_task+0xbb/0xc0 > nfs_release_seqid+0x1e1/0x2f0 > nfs_free_seqid+0x1a/0x40 > nfs4_opendata_free+0xc6/0x3e0 > _nfs4_do_open.isra.0+0xbe3/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > ... > > Allocated by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_alloc_info+0x44/0x70 > __kasan_slab_alloc+0xaf/0xc0 > kmem_cache_alloc_noprof+0x1e0/0x4f0 > rpc_new_task+0xe7/0x220 > rpc_run_task+0x27/0x7d0 > nfs4_run_open_task+0x477/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_free_info+0x43/0x80 > __kasan_slab_free+0x4f/0x90 > kmem_cache_free+0x199/0x4f0 > mempool_free_slab+0x1f/0x30 > mempool_free+0xdf/0x3d0 > rpc_free_task+0x12d/0x180 > rpc_final_put_task+0x10e/0x150 > rpc_do_put_task+0x63/0x80 > rpc_put_task+0x18/0x30 > nfs4_run_open_task+0x4f4/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call") > Signed-off-by: Yang Erkun <yangerkun@huawei.com> > --- > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 2 ++ > fs/nfs/nfs4state.c | 18 ++++++++++++++++++ > 3 files changed, 21 insertions(+) > > diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h > index 7d383d29a995..3bbd945a78ca 100644 > --- a/fs/nfs/nfs4_fs.h > +++ b/fs/nfs/nfs4_fs.h > @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_ > extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task); > extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid); > extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid); > +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); > extern void nfs_release_seqid(struct nfs_seqid *seqid); > extern void nfs_free_seqid(struct nfs_seqid *seqid); > extern int nfs4_setup_sequence(struct nfs_client *client, > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index cd2fbde2e6d7..86e093ffb39c 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata) > struct nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > + if (data->rpc_status != 0 || !data->rpc_done) > + nfs_release_seqid_inorder(data->o_arg.seqid); > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index dafd61186557..df5e7a0b6528 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m > return new; > } > > +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) > +{ > + struct nfs_seqid_counter *sequence; > + > + if (seqid == NULL || list_empty(&seqid->list)) > + return; > + sequence = seqid->sequence; > + spin_lock(&sequence->lock); > + if (!list_is_last(&seqid->list, &sequence->list)) { > + struct nfs_seqid *next; > + > + next = list_next_entry(seqid, list); > + rpc_wake_up_queued_task(&sequence->wait, next->task); > + } > + list_del_init(&seqid->list); > + spin_unlock(&sequence->lock); > +} > + Hi, For this patch, no failed test case is added to the local nfsv4.0 nfsv4.1 test suite. But I'm not sure whether to change nfs_release_seqid to wake-up in sequence, so I changed the patch title to rfc. Hope we can discuss it together. Thanks, Erkun. > void nfs_release_seqid(struct nfs_seqid *seqid) > { > struct nfs_seqid_counter *sequence;
Hi, Gently ping for this patch! 在 2024/10/22 20:21, Yang Erkun 写道: > From: Yang Erkun <yangerkun@huawei.com> > > Two threads that work with the same cred try to open different files > concurrently, they will utilize the same nfs4_state_owner. And in order > to sequential open request send to server, the second task will fall > into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one > work doing the open operation. Furthermore, the second task will wait > until the first task completes its work, call rpc_wake_up_queued_task in > nfs_release_seqid to wake up the second task, allowing it to complete > the remaining open operation. > > The preceding logic does not cause any problems under normal > circumstances. However, when once we force an unmount using `umount -f`, > the function nfs_umount_begin attempts to kill all tasks by calling > rpc_signal_task. This help wake up the second task, but it sets the > status to -ERESTARTSYS. This status prevents `nfs4_open_release` from > calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second > task will be freed, the original tasks will still exist in > sequence->list(see nfs_release_seqid). Latter, when the first thread > calls nfs_release_seqid and attempts to wake up the second task, it will > trigger the uaf. > > To resolve this issue, ensure rpc_task will remove it from > sequence->list in nfs4_open_release when open failed, besides, we can > only wakeup the next rpc_task, or use-after-free will happen too since > privious rpc_task may be released too. > > ================================================================== > BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0 > Read of size 8 at addr ff11000007639930 by task bash/792 > > CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G B W > 6.11.0-09960-gd10b58fe53dc-dirty #10 > Tainted: [B]=BAD_PAGE, [W]=WARN > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.16.1-2.fc37 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0xa3/0x120 > print_address_description.constprop.0+0x63/0x510 > print_report+0xf5/0x360 > kasan_report+0xd9/0x140 > __asan_report_load8_noabort+0x24/0x40 > rpc_wake_up_queued_task+0xbb/0xc0 > nfs_release_seqid+0x1e1/0x2f0 > nfs_free_seqid+0x1a/0x40 > nfs4_opendata_free+0xc6/0x3e0 > _nfs4_do_open.isra.0+0xbe3/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > ... > > Allocated by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_alloc_info+0x44/0x70 > __kasan_slab_alloc+0xaf/0xc0 > kmem_cache_alloc_noprof+0x1e0/0x4f0 > rpc_new_task+0xe7/0x220 > rpc_run_task+0x27/0x7d0 > nfs4_run_open_task+0x477/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_free_info+0x43/0x80 > __kasan_slab_free+0x4f/0x90 > kmem_cache_free+0x199/0x4f0 > mempool_free_slab+0x1f/0x30 > mempool_free+0xdf/0x3d0 > rpc_free_task+0x12d/0x180 > rpc_final_put_task+0x10e/0x150 > rpc_do_put_task+0x63/0x80 > rpc_put_task+0x18/0x30 > nfs4_run_open_task+0x4f4/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call") > Signed-off-by: Yang Erkun <yangerkun@huawei.com> > --- > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 2 ++ > fs/nfs/nfs4state.c | 18 ++++++++++++++++++ > 3 files changed, 21 insertions(+) > > diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h > index 7d383d29a995..3bbd945a78ca 100644 > --- a/fs/nfs/nfs4_fs.h > +++ b/fs/nfs/nfs4_fs.h > @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_ > extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task); > extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid); > extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid); > +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); > extern void nfs_release_seqid(struct nfs_seqid *seqid); > extern void nfs_free_seqid(struct nfs_seqid *seqid); > extern int nfs4_setup_sequence(struct nfs_client *client, > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index cd2fbde2e6d7..86e093ffb39c 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata) > struct nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > + if (data->rpc_status != 0 || !data->rpc_done) > + nfs_release_seqid_inorder(data->o_arg.seqid); > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index dafd61186557..df5e7a0b6528 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m > return new; > } > > +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) > +{ > + struct nfs_seqid_counter *sequence; > + > + if (seqid == NULL || list_empty(&seqid->list)) > + return; > + sequence = seqid->sequence; > + spin_lock(&sequence->lock); > + if (!list_is_last(&seqid->list, &sequence->list)) { > + struct nfs_seqid *next; > + > + next = list_next_entry(seqid, list); > + rpc_wake_up_queued_task(&sequence->wait, next->task); > + } > + list_del_init(&seqid->list); > + spin_unlock(&sequence->lock); > +} > + > void nfs_release_seqid(struct nfs_seqid *seqid) > { > struct nfs_seqid_counter *sequence;
Hi, we have discovered another similar issue where performing concurrent operations of opening file B, closing file A, and unmounting while file A is open can trigger a UAF of rpc_task. T1: OPEN fileB T2: CLOSE fileA T3: UMOUNT nfs4_do_open ... .rpc_call_prepare nfs_wait_on_sequence // insert to sequence list(T1) // This is the first entry, // and continue. nfs4_do_close .rpc_call_prepare nfs_wait_on_sequence // insert to sequence list(T2) // This is the second entry, // and wait. nfs_umount_begin rpc_killall_tasks rpc_signal_task // get T2 from clnt->cl_tasks rpc_wake_up_queued_task_set_status rpc_wake_up_task_queue_set_status_locked rpc_wake_up_task_on_wq_queue_action_locked __rpc_do_wake_up_task_on_wq rpc_make_runnable wake_up_bit // wake up T2 ... rpc_put_task // finish rpc_task and free it .rpc_call_done nfs_release_seqid list_first_entry // get T1 from list rpc_wake_up_queued_task // wake up T1, UAF _nfs4_open_and_get_state _nfs4_open_and_get_state _nfs4_opendata_to_nfs4_state nfs_release_seqid // remove T1 from list Although the rpc_task corresponding to T1 has been freed, its seqid has not been removed from the sequence list. Additionally, even though T2 is the last task in the list, the current logic of nfs_release_seqid allows T2 to access tasks inserted before it, which can trigger UAF. Using nfs_release_seqid_inorder instead of nfs_release_seqid can solve this problem. Considering that tasks in the sequence list should not access tasks inserted before them, would it be better to use nfs_release_seqid_inorder globally? 在 2024/10/22 20:21, Yang Erkun 写道: > From: Yang Erkun <yangerkun@huawei.com> > > Two threads that work with the same cred try to open different files > concurrently, they will utilize the same nfs4_state_owner. And in order > to sequential open request send to server, the second task will fall > into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one > work doing the open operation. Furthermore, the second task will wait > until the first task completes its work, call rpc_wake_up_queued_task in > nfs_release_seqid to wake up the second task, allowing it to complete > the remaining open operation. > > The preceding logic does not cause any problems under normal > circumstances. However, when once we force an unmount using `umount -f`, > the function nfs_umount_begin attempts to kill all tasks by calling > rpc_signal_task. This help wake up the second task, but it sets the > status to -ERESTARTSYS. This status prevents `nfs4_open_release` from > calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second > task will be freed, the original tasks will still exist in > sequence->list(see nfs_release_seqid). Latter, when the first thread > calls nfs_release_seqid and attempts to wake up the second task, it will > trigger the uaf. > > To resolve this issue, ensure rpc_task will remove it from > sequence->list in nfs4_open_release when open failed, besides, we can > only wakeup the next rpc_task, or use-after-free will happen too since > privious rpc_task may be released too. > > ================================================================== > BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0 > Read of size 8 at addr ff11000007639930 by task bash/792 > > CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G B W > 6.11.0-09960-gd10b58fe53dc-dirty #10 > Tainted: [B]=BAD_PAGE, [W]=WARN > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.16.1-2.fc37 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0xa3/0x120 > print_address_description.constprop.0+0x63/0x510 > print_report+0xf5/0x360 > kasan_report+0xd9/0x140 > __asan_report_load8_noabort+0x24/0x40 > rpc_wake_up_queued_task+0xbb/0xc0 > nfs_release_seqid+0x1e1/0x2f0 > nfs_free_seqid+0x1a/0x40 > nfs4_opendata_free+0xc6/0x3e0 > _nfs4_do_open.isra.0+0xbe3/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > ... > > Allocated by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_alloc_info+0x44/0x70 > __kasan_slab_alloc+0xaf/0xc0 > kmem_cache_alloc_noprof+0x1e0/0x4f0 > rpc_new_task+0xe7/0x220 > rpc_run_task+0x27/0x7d0 > nfs4_run_open_task+0x477/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_free_info+0x43/0x80 > __kasan_slab_free+0x4f/0x90 > kmem_cache_free+0x199/0x4f0 > mempool_free_slab+0x1f/0x30 > mempool_free+0xdf/0x3d0 > rpc_free_task+0x12d/0x180 > rpc_final_put_task+0x10e/0x150 > rpc_do_put_task+0x63/0x80 > rpc_put_task+0x18/0x30 > nfs4_run_open_task+0x4f4/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call") > Signed-off-by: Yang Erkun <yangerkun@huawei.com> > --- > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 2 ++ > fs/nfs/nfs4state.c | 18 ++++++++++++++++++ > 3 files changed, 21 insertions(+) > > diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h > index 7d383d29a995..3bbd945a78ca 100644 > --- a/fs/nfs/nfs4_fs.h > +++ b/fs/nfs/nfs4_fs.h > @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_ > extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task); > extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid); > extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid); > +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); > extern void nfs_release_seqid(struct nfs_seqid *seqid); > extern void nfs_free_seqid(struct nfs_seqid *seqid); > extern int nfs4_setup_sequence(struct nfs_client *client, > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index cd2fbde2e6d7..86e093ffb39c 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata) > struct nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > + if (data->rpc_status != 0 || !data->rpc_done) > + nfs_release_seqid_inorder(data->o_arg.seqid); > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index dafd61186557..df5e7a0b6528 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m > return new; > } > > +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) > +{ > + struct nfs_seqid_counter *sequence; > + > + if (seqid == NULL || list_empty(&seqid->list)) > + return; > + sequence = seqid->sequence; > + spin_lock(&sequence->lock); > + if (!list_is_last(&seqid->list, &sequence->list)) { > + struct nfs_seqid *next; > + > + next = list_next_entry(seqid, list); > + rpc_wake_up_queued_task(&sequence->wait, next->task); > + } > + list_del_init(&seqid->list); > + spin_unlock(&sequence->lock); > +} > + > void nfs_release_seqid(struct nfs_seqid *seqid) > { > struct nfs_seqid_counter *sequence;
On Tue, 2024-10-22 at 20:21 +0800, Yang Erkun wrote: > From: Yang Erkun <yangerkun@huawei.com> > > Two threads that work with the same cred try to open different files > concurrently, they will utilize the same nfs4_state_owner. And in > order > to sequential open request send to server, the second task will fall > into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already > one > work doing the open operation. Furthermore, the second task will wait > until the first task completes its work, call rpc_wake_up_queued_task > in > nfs_release_seqid to wake up the second task, allowing it to complete > the remaining open operation. > > The preceding logic does not cause any problems under normal > circumstances. However, when once we force an unmount using `umount - > f`, > the function nfs_umount_begin attempts to kill all tasks by calling > rpc_signal_task. This help wake up the second task, but it sets the > status to -ERESTARTSYS. This status prevents `nfs4_open_release` from > calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second > task will be freed, the original tasks will still exist in > sequence->list(see nfs_release_seqid). Latter, when the first thread > calls nfs_release_seqid and attempts to wake up the second task, it > will > trigger the uaf. > > To resolve this issue, ensure rpc_task will remove it from > sequence->list in nfs4_open_release when open failed, besides, we can > only wakeup the next rpc_task, or use-after-free will happen too > since > privious rpc_task may be released too. > > ================================================================== > BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0 > Read of size 8 at addr ff11000007639930 by task bash/792 > > CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G B W > 6.11.0-09960-gd10b58fe53dc-dirty #10 > Tainted: [B]=BAD_PAGE, [W]=WARN > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.16.1-2.fc37 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0xa3/0x120 > print_address_description.constprop.0+0x63/0x510 > print_report+0xf5/0x360 > kasan_report+0xd9/0x140 > __asan_report_load8_noabort+0x24/0x40 > rpc_wake_up_queued_task+0xbb/0xc0 > nfs_release_seqid+0x1e1/0x2f0 > nfs_free_seqid+0x1a/0x40 > nfs4_opendata_free+0xc6/0x3e0 > _nfs4_do_open.isra.0+0xbe3/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > ... > > Allocated by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_alloc_info+0x44/0x70 > __kasan_slab_alloc+0xaf/0xc0 > kmem_cache_alloc_noprof+0x1e0/0x4f0 > rpc_new_task+0xe7/0x220 > rpc_run_task+0x27/0x7d0 > nfs4_run_open_task+0x477/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 767: > kasan_save_stack+0x3b/0x70 > kasan_save_track+0x1c/0x40 > kasan_save_free_info+0x43/0x80 > __kasan_slab_free+0x4f/0x90 > kmem_cache_free+0x199/0x4f0 > mempool_free_slab+0x1f/0x30 > mempool_free+0xdf/0x3d0 > rpc_free_task+0x12d/0x180 > rpc_final_put_task+0x10e/0x150 > rpc_do_put_task+0x63/0x80 > rpc_put_task+0x18/0x30 > nfs4_run_open_task+0x4f4/0x810 > _nfs4_proc_open+0xc0/0x6d0 > _nfs4_open_and_get_state+0x178/0xc50 > _nfs4_do_open.isra.0+0x47f/0x1380 > nfs4_do_open+0x28b/0x620 > nfs4_atomic_open+0x2c6/0x3a0 > nfs_atomic_open+0x4f8/0x1180 > atomic_open+0x186/0x4e0 > lookup_open.isra.0+0x3e7/0x15b0 > open_last_lookups+0x85d/0x1260 > path_openat+0x151/0x7b0 > do_filp_open+0x1e0/0x310 > do_sys_openat2+0x178/0x1f0 > do_sys_open+0xa2/0x100 > __x64_sys_openat+0xa8/0x120 > x64_sys_call+0x2507/0x4540 > do_syscall_64+0xa7/0x240 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC > call") > Signed-off-by: Yang Erkun <yangerkun@huawei.com> > --- > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 2 ++ > fs/nfs/nfs4state.c | 18 ++++++++++++++++++ > 3 files changed, 21 insertions(+) > > diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h > index 7d383d29a995..3bbd945a78ca 100644 > --- a/fs/nfs/nfs4_fs.h > +++ b/fs/nfs/nfs4_fs.h > @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct > nfs_seqid_counter *counter, gfp_ > extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct > rpc_task *task); > extern void nfs_increment_open_seqid(int status, struct nfs_seqid > *seqid); > extern void nfs_increment_lock_seqid(int status, struct nfs_seqid > *seqid); > +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); > extern void nfs_release_seqid(struct nfs_seqid *seqid); > extern void nfs_free_seqid(struct nfs_seqid *seqid); > extern int nfs4_setup_sequence(struct nfs_client *client, > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index cd2fbde2e6d7..86e093ffb39c 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void > *calldata) > struct nfs4_opendata *data = calldata; > struct nfs4_state *state = NULL; > > + if (data->rpc_status != 0 || !data->rpc_done) > + nfs_release_seqid_inorder(data->o_arg.seqid); > /* If this request hasn't been cancelled, do nothing */ > if (!data->cancelled) > goto out_free; > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index dafd61186557..df5e7a0b6528 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct > nfs_seqid_counter *counter, gfp_t gfp_m > return new; > } > > +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) > +{ > + struct nfs_seqid_counter *sequence; > + > + if (seqid == NULL || list_empty(&seqid->list)) > + return; > + sequence = seqid->sequence; > + spin_lock(&sequence->lock); > + if (!list_is_last(&seqid->list, &sequence->list)) { > + struct nfs_seqid *next; > + > + next = list_next_entry(seqid, list); > + rpc_wake_up_queued_task(&sequence->wait, next- > >task); > + } > + list_del_init(&seqid->list); > + spin_unlock(&sequence->lock); > +} > + > void nfs_release_seqid(struct nfs_seqid *seqid) > { > struct nfs_seqid_counter *sequence; I'm not really seeing why we need a new function nfs_release_seqid_inorder(). Yes, there is an optimisation that can be made for nfs_release_seqid(), but that's not relevant to the problem you are seeing. The other issue is that you're applying the fix to nfs4_open_confirm_release(), which doesn't need fixing. The only function that needs to change is nfs4_open_release(), because it is the only one that can be called without nfs_wait_on_sequence() having succeeded, and so is the only function that can destroy the rpc_task before the seqid has become the first entry in the sequence->list. I've sent a couple of patches that address both the optimisation and a proposed final fix for the problem to the list (with you on the Cc list). Please check if they fix your problem. Thanks!
在 2024/11/9 6:21, Trond Myklebust 写道: > On Tue, 2024-10-22 at 20:21 +0800, Yang Erkun wrote: >> From: Yang Erkun <yangerkun@huawei.com> >> >> Two threads that work with the same cred try to open different files >> concurrently, they will utilize the same nfs4_state_owner. And in >> order >> to sequential open request send to server, the second task will fall >> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already >> one >> work doing the open operation. Furthermore, the second task will wait >> until the first task completes its work, call rpc_wake_up_queued_task >> in >> nfs_release_seqid to wake up the second task, allowing it to complete >> the remaining open operation. >> >> The preceding logic does not cause any problems under normal >> circumstances. However, when once we force an unmount using `umount - >> f`, >> the function nfs_umount_begin attempts to kill all tasks by calling >> rpc_signal_task. This help wake up the second task, but it sets the >> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from >> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second >> task will be freed, the original tasks will still exist in >> sequence->list(see nfs_release_seqid). Latter, when the first thread >> calls nfs_release_seqid and attempts to wake up the second task, it >> will >> trigger the uaf. >> >> To resolve this issue, ensure rpc_task will remove it from >> sequence->list in nfs4_open_release when open failed, besides, we can >> only wakeup the next rpc_task, or use-after-free will happen too >> since >> privious rpc_task may be released too. >> >> ================================================================== >> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0 >> Read of size 8 at addr ff11000007639930 by task bash/792 >> >> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G B W >> 6.11.0-09960-gd10b58fe53dc-dirty #10 >> Tainted: [B]=BAD_PAGE, [W]=WARN >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >> 1.16.1-2.fc37 04/01/2014 >> Call Trace: >> <TASK> >> dump_stack_lvl+0xa3/0x120 >> print_address_description.constprop.0+0x63/0x510 >> print_report+0xf5/0x360 >> kasan_report+0xd9/0x140 >> __asan_report_load8_noabort+0x24/0x40 >> rpc_wake_up_queued_task+0xbb/0xc0 >> nfs_release_seqid+0x1e1/0x2f0 >> nfs_free_seqid+0x1a/0x40 >> nfs4_opendata_free+0xc6/0x3e0 >> _nfs4_do_open.isra.0+0xbe3/0x1380 >> nfs4_do_open+0x28b/0x620 >> nfs4_atomic_open+0x2c6/0x3a0 >> nfs_atomic_open+0x4f8/0x1180 >> atomic_open+0x186/0x4e0 >> lookup_open.isra.0+0x3e7/0x15b0 >> open_last_lookups+0x85d/0x1260 >> path_openat+0x151/0x7b0 >> do_filp_open+0x1e0/0x310 >> do_sys_openat2+0x178/0x1f0 >> do_sys_open+0xa2/0x100 >> __x64_sys_openat+0xa8/0x120 >> x64_sys_call+0x2507/0x4540 >> do_syscall_64+0xa7/0x240 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> ... >> >> Allocated by task 767: >> kasan_save_stack+0x3b/0x70 >> kasan_save_track+0x1c/0x40 >> kasan_save_alloc_info+0x44/0x70 >> __kasan_slab_alloc+0xaf/0xc0 >> kmem_cache_alloc_noprof+0x1e0/0x4f0 >> rpc_new_task+0xe7/0x220 >> rpc_run_task+0x27/0x7d0 >> nfs4_run_open_task+0x477/0x810 >> _nfs4_proc_open+0xc0/0x6d0 >> _nfs4_open_and_get_state+0x178/0xc50 >> _nfs4_do_open.isra.0+0x47f/0x1380 >> nfs4_do_open+0x28b/0x620 >> nfs4_atomic_open+0x2c6/0x3a0 >> nfs_atomic_open+0x4f8/0x1180 >> atomic_open+0x186/0x4e0 >> lookup_open.isra.0+0x3e7/0x15b0 >> open_last_lookups+0x85d/0x1260 >> path_openat+0x151/0x7b0 >> do_filp_open+0x1e0/0x310 >> do_sys_openat2+0x178/0x1f0 >> do_sys_open+0xa2/0x100 >> __x64_sys_openat+0xa8/0x120 >> x64_sys_call+0x2507/0x4540 >> do_syscall_64+0xa7/0x240 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> Freed by task 767: >> kasan_save_stack+0x3b/0x70 >> kasan_save_track+0x1c/0x40 >> kasan_save_free_info+0x43/0x80 >> __kasan_slab_free+0x4f/0x90 >> kmem_cache_free+0x199/0x4f0 >> mempool_free_slab+0x1f/0x30 >> mempool_free+0xdf/0x3d0 >> rpc_free_task+0x12d/0x180 >> rpc_final_put_task+0x10e/0x150 >> rpc_do_put_task+0x63/0x80 >> rpc_put_task+0x18/0x30 >> nfs4_run_open_task+0x4f4/0x810 >> _nfs4_proc_open+0xc0/0x6d0 >> _nfs4_open_and_get_state+0x178/0xc50 >> _nfs4_do_open.isra.0+0x47f/0x1380 >> nfs4_do_open+0x28b/0x620 >> nfs4_atomic_open+0x2c6/0x3a0 >> nfs_atomic_open+0x4f8/0x1180 >> atomic_open+0x186/0x4e0 >> lookup_open.isra.0+0x3e7/0x15b0 >> open_last_lookups+0x85d/0x1260 >> path_openat+0x151/0x7b0 >> do_filp_open+0x1e0/0x310 >> do_sys_openat2+0x178/0x1f0 >> do_sys_open+0xa2/0x100 >> __x64_sys_openat+0xa8/0x120 >> x64_sys_call+0x2507/0x4540 >> do_syscall_64+0xa7/0x240 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC >> call") >> Signed-off-by: Yang Erkun <yangerkun@huawei.com> >> --- >> fs/nfs/nfs4_fs.h | 1 + >> fs/nfs/nfs4proc.c | 2 ++ >> fs/nfs/nfs4state.c | 18 ++++++++++++++++++ >> 3 files changed, 21 insertions(+) >> >> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h >> index 7d383d29a995..3bbd945a78ca 100644 >> --- a/fs/nfs/nfs4_fs.h >> +++ b/fs/nfs/nfs4_fs.h >> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct >> nfs_seqid_counter *counter, gfp_ >> extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct >> rpc_task *task); >> extern void nfs_increment_open_seqid(int status, struct nfs_seqid >> *seqid); >> extern void nfs_increment_lock_seqid(int status, struct nfs_seqid >> *seqid); >> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); >> extern void nfs_release_seqid(struct nfs_seqid *seqid); >> extern void nfs_free_seqid(struct nfs_seqid *seqid); >> extern int nfs4_setup_sequence(struct nfs_client *client, >> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c >> index cd2fbde2e6d7..86e093ffb39c 100644 >> --- a/fs/nfs/nfs4proc.c >> +++ b/fs/nfs/nfs4proc.c >> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void >> *calldata) >> struct nfs4_opendata *data = calldata; >> struct nfs4_state *state = NULL; >> >> + if (data->rpc_status != 0 || !data->rpc_done) >> + nfs_release_seqid_inorder(data->o_arg.seqid); >> /* If this request hasn't been cancelled, do nothing */ >> if (!data->cancelled) >> goto out_free; >> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c >> index dafd61186557..df5e7a0b6528 100644 >> --- a/fs/nfs/nfs4state.c >> +++ b/fs/nfs/nfs4state.c >> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct >> nfs_seqid_counter *counter, gfp_t gfp_m >> return new; >> } >> >> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) >> +{ >> + struct nfs_seqid_counter *sequence; >> + >> + if (seqid == NULL || list_empty(&seqid->list)) >> + return; >> + sequence = seqid->sequence; >> + spin_lock(&sequence->lock); >> + if (!list_is_last(&seqid->list, &sequence->list)) { >> + struct nfs_seqid *next; >> + >> + next = list_next_entry(seqid, list); >> + rpc_wake_up_queued_task(&sequence->wait, next- >>> task); >> + } >> + list_del_init(&seqid->list); >> + spin_unlock(&sequence->lock); >> +} >> + >> void nfs_release_seqid(struct nfs_seqid *seqid) >> { >> struct nfs_seqid_counter *sequence; > > > I'm not really seeing why we need a new function > nfs_release_seqid_inorder(). Yes, there is an optimisation that can be > made for nfs_release_seqid(), but that's not relevant to the problem > you are seeing. Hi, you can see the second uaf analysis with this link: https://lore.kernel.org/all/ae13ac55-a90f-6cfd-f23b-2ae5999d2f8c@huaweicloud.com/ And for why I do this, the reply from me for this patch. It's a rfc to talk does we should change nfs_release_seqid all to do this order wakeup. " Hi, For this patch, no failed test case is added to the local nfsv4.0 nfsv4.1 test suite. But I'm not sure whether to change nfs_release_seqid to wake-up in sequence, so I changed the patch title to rfc. Hope we can discuss it together. Thanks, Erkun. " > > The other issue is that you're applying the fix to > nfs4_open_confirm_release(), which doesn't need fixing. The only Oh, sorry, this is some mistake when I write this rfc. > function that needs to change is nfs4_open_release(), because it is the > only one that can be called without nfs_wait_on_sequence() having > succeeded, and so is the only function that can destroy the rpc_task > before the seqid has become the first entry in the sequence->list. > > I've sent a couple of patches that address both the optimisation and a > proposed final fix for the problem to the list (with you on the Cc > list). Please check if they fix your problem. Yeah, your patchset pass my testcase. > > Thanks! >
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 7d383d29a995..3bbd945a78ca 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_ extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task); extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid); extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid); +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid); extern void nfs_release_seqid(struct nfs_seqid *seqid); extern void nfs_free_seqid(struct nfs_seqid *seqid); extern int nfs4_setup_sequence(struct nfs_client *client, diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index cd2fbde2e6d7..86e093ffb39c 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata) struct nfs4_opendata *data = calldata; struct nfs4_state *state = NULL; + if (data->rpc_status != 0 || !data->rpc_done) + nfs_release_seqid_inorder(data->o_arg.seqid); /* If this request hasn't been cancelled, do nothing */ if (!data->cancelled) goto out_free; diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index dafd61186557..df5e7a0b6528 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m return new; } +void nfs_release_seqid_inorder(struct nfs_seqid *seqid) +{ + struct nfs_seqid_counter *sequence; + + if (seqid == NULL || list_empty(&seqid->list)) + return; + sequence = seqid->sequence; + spin_lock(&sequence->lock); + if (!list_is_last(&seqid->list, &sequence->list)) { + struct nfs_seqid *next; + + next = list_next_entry(seqid, list); + rpc_wake_up_queued_task(&sequence->wait, next->task); + } + list_del_init(&seqid->list); + spin_unlock(&sequence->lock); +} + void nfs_release_seqid(struct nfs_seqid *seqid) { struct nfs_seqid_counter *sequence;