diff mbox series

[RFC] NFSv4: fix rpc_task use-after-free when open concurrently

Message ID 20241022122141.2030194-1-yangerkun@huaweicloud.com (mailing list archive)
State New
Headers show
Series [RFC] NFSv4: fix rpc_task use-after-free when open concurrently | expand

Commit Message

yangerkun Oct. 22, 2024, 12:21 p.m. UTC
From: Yang Erkun <yangerkun@huawei.com>

Two threads that work with the same cred try to open different files
concurrently, they will utilize the same nfs4_state_owner. And in order
to sequential open request send to server, the second task will fall
into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one
work doing the open operation. Furthermore, the second task will wait
until the first task completes its work, call rpc_wake_up_queued_task in
nfs_release_seqid to wake up the second task, allowing it to complete
the remaining open operation.

The preceding logic does not cause any problems under normal
circumstances. However, when once we force an unmount using `umount -f`,
the function nfs_umount_begin attempts to kill all tasks by calling
rpc_signal_task. This help wake up the second task, but it sets the
status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
task will be freed, the original tasks will still exist in
sequence->list(see nfs_release_seqid). Latter, when the first thread
calls nfs_release_seqid and attempts to wake up the second task, it will
trigger the uaf.

To resolve this issue, ensure rpc_task will remove it from
sequence->list in nfs4_open_release when open failed, besides, we can
only wakeup the next rpc_task, or use-after-free will happen too since
privious rpc_task may be released too.

==================================================================
BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
Read of size 8 at addr ff11000007639930 by task bash/792

CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
6.11.0-09960-gd10b58fe53dc-dirty #10
Tainted: [B]=BAD_PAGE, [W]=WARN
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xa3/0x120
 print_address_description.constprop.0+0x63/0x510
 print_report+0xf5/0x360
 kasan_report+0xd9/0x140
 __asan_report_load8_noabort+0x24/0x40
 rpc_wake_up_queued_task+0xbb/0xc0
 nfs_release_seqid+0x1e1/0x2f0
 nfs_free_seqid+0x1a/0x40
 nfs4_opendata_free+0xc6/0x3e0
 _nfs4_do_open.isra.0+0xbe3/0x1380
 nfs4_do_open+0x28b/0x620
 nfs4_atomic_open+0x2c6/0x3a0
 nfs_atomic_open+0x4f8/0x1180
 atomic_open+0x186/0x4e0
 lookup_open.isra.0+0x3e7/0x15b0
 open_last_lookups+0x85d/0x1260
 path_openat+0x151/0x7b0
 do_filp_open+0x1e0/0x310
 do_sys_openat2+0x178/0x1f0
 do_sys_open+0xa2/0x100
 __x64_sys_openat+0xa8/0x120
 x64_sys_call+0x2507/0x4540
 do_syscall_64+0xa7/0x240
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

...

Allocated by task 767:
 kasan_save_stack+0x3b/0x70
 kasan_save_track+0x1c/0x40
 kasan_save_alloc_info+0x44/0x70
 __kasan_slab_alloc+0xaf/0xc0
 kmem_cache_alloc_noprof+0x1e0/0x4f0
 rpc_new_task+0xe7/0x220
 rpc_run_task+0x27/0x7d0
 nfs4_run_open_task+0x477/0x810
 _nfs4_proc_open+0xc0/0x6d0
 _nfs4_open_and_get_state+0x178/0xc50
 _nfs4_do_open.isra.0+0x47f/0x1380
 nfs4_do_open+0x28b/0x620
 nfs4_atomic_open+0x2c6/0x3a0
 nfs_atomic_open+0x4f8/0x1180
 atomic_open+0x186/0x4e0
 lookup_open.isra.0+0x3e7/0x15b0
 open_last_lookups+0x85d/0x1260
 path_openat+0x151/0x7b0
 do_filp_open+0x1e0/0x310
 do_sys_openat2+0x178/0x1f0
 do_sys_open+0xa2/0x100
 __x64_sys_openat+0xa8/0x120
 x64_sys_call+0x2507/0x4540
 do_syscall_64+0xa7/0x240
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 767:
 kasan_save_stack+0x3b/0x70
 kasan_save_track+0x1c/0x40
 kasan_save_free_info+0x43/0x80
 __kasan_slab_free+0x4f/0x90
 kmem_cache_free+0x199/0x4f0
 mempool_free_slab+0x1f/0x30
 mempool_free+0xdf/0x3d0
 rpc_free_task+0x12d/0x180
 rpc_final_put_task+0x10e/0x150
 rpc_do_put_task+0x63/0x80
 rpc_put_task+0x18/0x30
 nfs4_run_open_task+0x4f4/0x810
 _nfs4_proc_open+0xc0/0x6d0
 _nfs4_open_and_get_state+0x178/0xc50
 _nfs4_do_open.isra.0+0x47f/0x1380
 nfs4_do_open+0x28b/0x620
 nfs4_atomic_open+0x2c6/0x3a0
 nfs_atomic_open+0x4f8/0x1180
 atomic_open+0x186/0x4e0
 lookup_open.isra.0+0x3e7/0x15b0
 open_last_lookups+0x85d/0x1260
 path_openat+0x151/0x7b0
 do_filp_open+0x1e0/0x310
 do_sys_openat2+0x178/0x1f0
 do_sys_open+0xa2/0x100
 __x64_sys_openat+0xa8/0x120
 x64_sys_call+0x2507/0x4540
 do_syscall_64+0xa7/0x240
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call")
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
---
 fs/nfs/nfs4_fs.h   |  1 +
 fs/nfs/nfs4proc.c  |  2 ++
 fs/nfs/nfs4state.c | 18 ++++++++++++++++++
 3 files changed, 21 insertions(+)

Comments

yangerkun Oct. 22, 2024, 12:29 p.m. UTC | #1
在 2024/10/22 20:21, Yang Erkun 写道:
> From: Yang Erkun <yangerkun@huawei.com>
> 
> Two threads that work with the same cred try to open different files
> concurrently, they will utilize the same nfs4_state_owner. And in order
> to sequential open request send to server, the second task will fall
> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one
> work doing the open operation. Furthermore, the second task will wait
> until the first task completes its work, call rpc_wake_up_queued_task in
> nfs_release_seqid to wake up the second task, allowing it to complete
> the remaining open operation.
> 
> The preceding logic does not cause any problems under normal
> circumstances. However, when once we force an unmount using `umount -f`,
> the function nfs_umount_begin attempts to kill all tasks by calling
> rpc_signal_task. This help wake up the second task, but it sets the
> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
> task will be freed, the original tasks will still exist in
> sequence->list(see nfs_release_seqid). Latter, when the first thread
> calls nfs_release_seqid and attempts to wake up the second task, it will
> trigger the uaf.
> 
> To resolve this issue, ensure rpc_task will remove it from
> sequence->list in nfs4_open_release when open failed, besides, we can
> only wakeup the next rpc_task, or use-after-free will happen too since
> privious rpc_task may be released too.
> 
> ==================================================================
> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
> Read of size 8 at addr ff11000007639930 by task bash/792
> 
> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
> 6.11.0-09960-gd10b58fe53dc-dirty #10
> Tainted: [B]=BAD_PAGE, [W]=WARN
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>   <TASK>
>   dump_stack_lvl+0xa3/0x120
>   print_address_description.constprop.0+0x63/0x510
>   print_report+0xf5/0x360
>   kasan_report+0xd9/0x140
>   __asan_report_load8_noabort+0x24/0x40
>   rpc_wake_up_queued_task+0xbb/0xc0
>   nfs_release_seqid+0x1e1/0x2f0
>   nfs_free_seqid+0x1a/0x40
>   nfs4_opendata_free+0xc6/0x3e0
>   _nfs4_do_open.isra.0+0xbe3/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ...
> 
> Allocated by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_alloc_info+0x44/0x70
>   __kasan_slab_alloc+0xaf/0xc0
>   kmem_cache_alloc_noprof+0x1e0/0x4f0
>   rpc_new_task+0xe7/0x220
>   rpc_run_task+0x27/0x7d0
>   nfs4_run_open_task+0x477/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Freed by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_free_info+0x43/0x80
>   __kasan_slab_free+0x4f/0x90
>   kmem_cache_free+0x199/0x4f0
>   mempool_free_slab+0x1f/0x30
>   mempool_free+0xdf/0x3d0
>   rpc_free_task+0x12d/0x180
>   rpc_final_put_task+0x10e/0x150
>   rpc_do_put_task+0x63/0x80
>   rpc_put_task+0x18/0x30
>   nfs4_run_open_task+0x4f4/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call")
> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
> ---
>   fs/nfs/nfs4_fs.h   |  1 +
>   fs/nfs/nfs4proc.c  |  2 ++
>   fs/nfs/nfs4state.c | 18 ++++++++++++++++++
>   3 files changed, 21 insertions(+)
> 
> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> index 7d383d29a995..3bbd945a78ca 100644
> --- a/fs/nfs/nfs4_fs.h
> +++ b/fs/nfs/nfs4_fs.h
> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_
>   extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
>   extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid);
>   extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid);
> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
>   extern void nfs_release_seqid(struct nfs_seqid *seqid);
>   extern void nfs_free_seqid(struct nfs_seqid *seqid);
>   extern int nfs4_setup_sequence(struct nfs_client *client,
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd2fbde2e6d7..86e093ffb39c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata)
>   	struct nfs4_opendata *data = calldata;
>   	struct nfs4_state *state = NULL;
>   
> +	if (data->rpc_status != 0 || !data->rpc_done)
> +		nfs_release_seqid_inorder(data->o_arg.seqid);
>   	/* If this request hasn't been cancelled, do nothing */
>   	if (!data->cancelled)
>   		goto out_free;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index dafd61186557..df5e7a0b6528 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m
>   	return new;
>   }
>   
> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
> +{
> +	struct nfs_seqid_counter *sequence;
> +
> +	if (seqid == NULL || list_empty(&seqid->list))
> +		return;
> +	sequence = seqid->sequence;
> +	spin_lock(&sequence->lock);
> +	if (!list_is_last(&seqid->list, &sequence->list)) {
> +		struct nfs_seqid *next;
> +
> +		next = list_next_entry(seqid, list);
> +		rpc_wake_up_queued_task(&sequence->wait, next->task);
> +	}
> +	list_del_init(&seqid->list);
> +	spin_unlock(&sequence->lock);
> +}
> +

Hi,

For this patch, no failed test case is added to the local nfsv4.0
nfsv4.1 test suite. But I'm not sure whether to change nfs_release_seqid
to wake-up in sequence, so I changed the patch title to rfc. Hope we can
discuss it together.

Thanks,
Erkun.

>   void nfs_release_seqid(struct nfs_seqid *seqid)
>   {
>   	struct nfs_seqid_counter *sequence;
yangerkun Oct. 28, 2024, 12:12 p.m. UTC | #2
Hi,

Gently ping for this patch!

在 2024/10/22 20:21, Yang Erkun 写道:
> From: Yang Erkun <yangerkun@huawei.com>
> 
> Two threads that work with the same cred try to open different files
> concurrently, they will utilize the same nfs4_state_owner. And in order
> to sequential open request send to server, the second task will fall
> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one
> work doing the open operation. Furthermore, the second task will wait
> until the first task completes its work, call rpc_wake_up_queued_task in
> nfs_release_seqid to wake up the second task, allowing it to complete
> the remaining open operation.
> 
> The preceding logic does not cause any problems under normal
> circumstances. However, when once we force an unmount using `umount -f`,
> the function nfs_umount_begin attempts to kill all tasks by calling
> rpc_signal_task. This help wake up the second task, but it sets the
> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
> task will be freed, the original tasks will still exist in
> sequence->list(see nfs_release_seqid). Latter, when the first thread
> calls nfs_release_seqid and attempts to wake up the second task, it will
> trigger the uaf.
> 
> To resolve this issue, ensure rpc_task will remove it from
> sequence->list in nfs4_open_release when open failed, besides, we can
> only wakeup the next rpc_task, or use-after-free will happen too since
> privious rpc_task may be released too.
> 
> ==================================================================
> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
> Read of size 8 at addr ff11000007639930 by task bash/792
> 
> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
> 6.11.0-09960-gd10b58fe53dc-dirty #10
> Tainted: [B]=BAD_PAGE, [W]=WARN
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>   <TASK>
>   dump_stack_lvl+0xa3/0x120
>   print_address_description.constprop.0+0x63/0x510
>   print_report+0xf5/0x360
>   kasan_report+0xd9/0x140
>   __asan_report_load8_noabort+0x24/0x40
>   rpc_wake_up_queued_task+0xbb/0xc0
>   nfs_release_seqid+0x1e1/0x2f0
>   nfs_free_seqid+0x1a/0x40
>   nfs4_opendata_free+0xc6/0x3e0
>   _nfs4_do_open.isra.0+0xbe3/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ...
> 
> Allocated by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_alloc_info+0x44/0x70
>   __kasan_slab_alloc+0xaf/0xc0
>   kmem_cache_alloc_noprof+0x1e0/0x4f0
>   rpc_new_task+0xe7/0x220
>   rpc_run_task+0x27/0x7d0
>   nfs4_run_open_task+0x477/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Freed by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_free_info+0x43/0x80
>   __kasan_slab_free+0x4f/0x90
>   kmem_cache_free+0x199/0x4f0
>   mempool_free_slab+0x1f/0x30
>   mempool_free+0xdf/0x3d0
>   rpc_free_task+0x12d/0x180
>   rpc_final_put_task+0x10e/0x150
>   rpc_do_put_task+0x63/0x80
>   rpc_put_task+0x18/0x30
>   nfs4_run_open_task+0x4f4/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call")
> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
> ---
>   fs/nfs/nfs4_fs.h   |  1 +
>   fs/nfs/nfs4proc.c  |  2 ++
>   fs/nfs/nfs4state.c | 18 ++++++++++++++++++
>   3 files changed, 21 insertions(+)
> 
> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> index 7d383d29a995..3bbd945a78ca 100644
> --- a/fs/nfs/nfs4_fs.h
> +++ b/fs/nfs/nfs4_fs.h
> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_
>   extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
>   extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid);
>   extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid);
> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
>   extern void nfs_release_seqid(struct nfs_seqid *seqid);
>   extern void nfs_free_seqid(struct nfs_seqid *seqid);
>   extern int nfs4_setup_sequence(struct nfs_client *client,
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd2fbde2e6d7..86e093ffb39c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata)
>   	struct nfs4_opendata *data = calldata;
>   	struct nfs4_state *state = NULL;
>   
> +	if (data->rpc_status != 0 || !data->rpc_done)
> +		nfs_release_seqid_inorder(data->o_arg.seqid);
>   	/* If this request hasn't been cancelled, do nothing */
>   	if (!data->cancelled)
>   		goto out_free;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index dafd61186557..df5e7a0b6528 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m
>   	return new;
>   }
>   
> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
> +{
> +	struct nfs_seqid_counter *sequence;
> +
> +	if (seqid == NULL || list_empty(&seqid->list))
> +		return;
> +	sequence = seqid->sequence;
> +	spin_lock(&sequence->lock);
> +	if (!list_is_last(&seqid->list, &sequence->list)) {
> +		struct nfs_seqid *next;
> +
> +		next = list_next_entry(seqid, list);
> +		rpc_wake_up_queued_task(&sequence->wait, next->task);
> +	}
> +	list_del_init(&seqid->list);
> +	spin_unlock(&sequence->lock);
> +}
> +
>   void nfs_release_seqid(struct nfs_seqid *seqid)
>   {
>   	struct nfs_seqid_counter *sequence;
Li Lingfeng Nov. 4, 2024, 8:51 a.m. UTC | #3
Hi, we have discovered another similar issue where performing concurrent
operations of opening file B, closing file A, and unmounting while file A
is open can trigger a UAF of rpc_task.

T1: OPEN fileB                     T2: CLOSE fileA T3: UMOUNT
nfs4_do_open
...
  .rpc_call_prepare
  nfs_wait_on_sequence
    // insert to sequence list(T1)
    // This is the first entry,
    // and continue.
                            nfs4_do_close
                             .rpc_call_prepare
                             nfs_wait_on_sequence
                                // insert to sequence list(T2)
                                // This is the second entry,
                                // and wait.
nfs_umount_begin
rpc_killall_tasks
rpc_signal_task // get T2 from clnt->cl_tasks
rpc_wake_up_queued_task_set_status
rpc_wake_up_task_queue_set_status_locked
rpc_wake_up_task_on_wq_queue_action_locked
__rpc_do_wake_up_task_on_wq
rpc_make_runnable
wake_up_bit // wake up T2
...
  rpc_put_task
   // finish rpc_task and free it
                             .rpc_call_done
                             nfs_release_seqid
                              list_first_entry // get T1 from list
                              rpc_wake_up_queued_task // wake up T1, UAF
  _nfs4_open_and_get_state
   _nfs4_open_and_get_state
    _nfs4_opendata_to_nfs4_state
     nfs_release_seqid
      // remove T1 from list

Although the rpc_task corresponding to T1 has been freed, its seqid has
not been removed from the sequence list.
Additionally, even though T2 is the last task in the list, the current
logic of nfs_release_seqid allows T2 to access tasks inserted before it,
which can trigger UAF.
Using nfs_release_seqid_inorder instead of nfs_release_seqid can solve
this problem.
Considering that tasks in the sequence list should not access tasks
inserted before them, would it be better to use nfs_release_seqid_inorder
globally?

在 2024/10/22 20:21, Yang Erkun 写道:
> From: Yang Erkun <yangerkun@huawei.com>
>
> Two threads that work with the same cred try to open different files
> concurrently, they will utilize the same nfs4_state_owner. And in order
> to sequential open request send to server, the second task will fall
> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already one
> work doing the open operation. Furthermore, the second task will wait
> until the first task completes its work, call rpc_wake_up_queued_task in
> nfs_release_seqid to wake up the second task, allowing it to complete
> the remaining open operation.
>
> The preceding logic does not cause any problems under normal
> circumstances. However, when once we force an unmount using `umount -f`,
> the function nfs_umount_begin attempts to kill all tasks by calling
> rpc_signal_task. This help wake up the second task, but it sets the
> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
> task will be freed, the original tasks will still exist in
> sequence->list(see nfs_release_seqid). Latter, when the first thread
> calls nfs_release_seqid and attempts to wake up the second task, it will
> trigger the uaf.
>
> To resolve this issue, ensure rpc_task will remove it from
> sequence->list in nfs4_open_release when open failed, besides, we can
> only wakeup the next rpc_task, or use-after-free will happen too since
> privious rpc_task may be released too.
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
> Read of size 8 at addr ff11000007639930 by task bash/792
>
> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
> 6.11.0-09960-gd10b58fe53dc-dirty #10
> Tainted: [B]=BAD_PAGE, [W]=WARN
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>   <TASK>
>   dump_stack_lvl+0xa3/0x120
>   print_address_description.constprop.0+0x63/0x510
>   print_report+0xf5/0x360
>   kasan_report+0xd9/0x140
>   __asan_report_load8_noabort+0x24/0x40
>   rpc_wake_up_queued_task+0xbb/0xc0
>   nfs_release_seqid+0x1e1/0x2f0
>   nfs_free_seqid+0x1a/0x40
>   nfs4_opendata_free+0xc6/0x3e0
>   _nfs4_do_open.isra.0+0xbe3/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> ...
>
> Allocated by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_alloc_info+0x44/0x70
>   __kasan_slab_alloc+0xaf/0xc0
>   kmem_cache_alloc_noprof+0x1e0/0x4f0
>   rpc_new_task+0xe7/0x220
>   rpc_run_task+0x27/0x7d0
>   nfs4_run_open_task+0x477/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Freed by task 767:
>   kasan_save_stack+0x3b/0x70
>   kasan_save_track+0x1c/0x40
>   kasan_save_free_info+0x43/0x80
>   __kasan_slab_free+0x4f/0x90
>   kmem_cache_free+0x199/0x4f0
>   mempool_free_slab+0x1f/0x30
>   mempool_free+0xdf/0x3d0
>   rpc_free_task+0x12d/0x180
>   rpc_final_put_task+0x10e/0x150
>   rpc_do_put_task+0x63/0x80
>   rpc_put_task+0x18/0x30
>   nfs4_run_open_task+0x4f4/0x810
>   _nfs4_proc_open+0xc0/0x6d0
>   _nfs4_open_and_get_state+0x178/0xc50
>   _nfs4_do_open.isra.0+0x47f/0x1380
>   nfs4_do_open+0x28b/0x620
>   nfs4_atomic_open+0x2c6/0x3a0
>   nfs_atomic_open+0x4f8/0x1180
>   atomic_open+0x186/0x4e0
>   lookup_open.isra.0+0x3e7/0x15b0
>   open_last_lookups+0x85d/0x1260
>   path_openat+0x151/0x7b0
>   do_filp_open+0x1e0/0x310
>   do_sys_openat2+0x178/0x1f0
>   do_sys_open+0xa2/0x100
>   __x64_sys_openat+0xa8/0x120
>   x64_sys_call+0x2507/0x4540
>   do_syscall_64+0xa7/0x240
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC call")
> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
> ---
>   fs/nfs/nfs4_fs.h   |  1 +
>   fs/nfs/nfs4proc.c  |  2 ++
>   fs/nfs/nfs4state.c | 18 ++++++++++++++++++
>   3 files changed, 21 insertions(+)
>
> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> index 7d383d29a995..3bbd945a78ca 100644
> --- a/fs/nfs/nfs4_fs.h
> +++ b/fs/nfs/nfs4_fs.h
> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_
>   extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
>   extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid);
>   extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid);
> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
>   extern void nfs_release_seqid(struct nfs_seqid *seqid);
>   extern void nfs_free_seqid(struct nfs_seqid *seqid);
>   extern int nfs4_setup_sequence(struct nfs_client *client,
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd2fbde2e6d7..86e093ffb39c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void *calldata)
>   	struct nfs4_opendata *data = calldata;
>   	struct nfs4_state *state = NULL;
>   
> +	if (data->rpc_status != 0 || !data->rpc_done)
> +		nfs_release_seqid_inorder(data->o_arg.seqid);
>   	/* If this request hasn't been cancelled, do nothing */
>   	if (!data->cancelled)
>   		goto out_free;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index dafd61186557..df5e7a0b6528 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m
>   	return new;
>   }
>   
> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
> +{
> +	struct nfs_seqid_counter *sequence;
> +
> +	if (seqid == NULL || list_empty(&seqid->list))
> +		return;
> +	sequence = seqid->sequence;
> +	spin_lock(&sequence->lock);
> +	if (!list_is_last(&seqid->list, &sequence->list)) {
> +		struct nfs_seqid *next;
> +
> +		next = list_next_entry(seqid, list);
> +		rpc_wake_up_queued_task(&sequence->wait, next->task);
> +	}
> +	list_del_init(&seqid->list);
> +	spin_unlock(&sequence->lock);
> +}
> +
>   void nfs_release_seqid(struct nfs_seqid *seqid)
>   {
>   	struct nfs_seqid_counter *sequence;
Trond Myklebust Nov. 8, 2024, 10:21 p.m. UTC | #4
On Tue, 2024-10-22 at 20:21 +0800, Yang Erkun wrote:
> From: Yang Erkun <yangerkun@huawei.com>
> 
> Two threads that work with the same cred try to open different files
> concurrently, they will utilize the same nfs4_state_owner. And in
> order
> to sequential open request send to server, the second task will fall
> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already
> one
> work doing the open operation. Furthermore, the second task will wait
> until the first task completes its work, call rpc_wake_up_queued_task
> in
> nfs_release_seqid to wake up the second task, allowing it to complete
> the remaining open operation.
> 
> The preceding logic does not cause any problems under normal
> circumstances. However, when once we force an unmount using `umount -
> f`,
> the function nfs_umount_begin attempts to kill all tasks by calling
> rpc_signal_task. This help wake up the second task, but it sets the
> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
> task will be freed, the original tasks will still exist in
> sequence->list(see nfs_release_seqid). Latter, when the first thread
> calls nfs_release_seqid and attempts to wake up the second task, it
> will
> trigger the uaf.
> 
> To resolve this issue, ensure rpc_task will remove it from
> sequence->list in nfs4_open_release when open failed, besides, we can
> only wakeup the next rpc_task, or use-after-free will happen too
> since
> privious rpc_task may be released too.
> 
> ==================================================================
> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
> Read of size 8 at addr ff11000007639930 by task bash/792
> 
> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
> 6.11.0-09960-gd10b58fe53dc-dirty #10
> Tainted: [B]=BAD_PAGE, [W]=WARN
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0xa3/0x120
>  print_address_description.constprop.0+0x63/0x510
>  print_report+0xf5/0x360
>  kasan_report+0xd9/0x140
>  __asan_report_load8_noabort+0x24/0x40
>  rpc_wake_up_queued_task+0xbb/0xc0
>  nfs_release_seqid+0x1e1/0x2f0
>  nfs_free_seqid+0x1a/0x40
>  nfs4_opendata_free+0xc6/0x3e0
>  _nfs4_do_open.isra.0+0xbe3/0x1380
>  nfs4_do_open+0x28b/0x620
>  nfs4_atomic_open+0x2c6/0x3a0
>  nfs_atomic_open+0x4f8/0x1180
>  atomic_open+0x186/0x4e0
>  lookup_open.isra.0+0x3e7/0x15b0
>  open_last_lookups+0x85d/0x1260
>  path_openat+0x151/0x7b0
>  do_filp_open+0x1e0/0x310
>  do_sys_openat2+0x178/0x1f0
>  do_sys_open+0xa2/0x100
>  __x64_sys_openat+0xa8/0x120
>  x64_sys_call+0x2507/0x4540
>  do_syscall_64+0xa7/0x240
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ...
> 
> Allocated by task 767:
>  kasan_save_stack+0x3b/0x70
>  kasan_save_track+0x1c/0x40
>  kasan_save_alloc_info+0x44/0x70
>  __kasan_slab_alloc+0xaf/0xc0
>  kmem_cache_alloc_noprof+0x1e0/0x4f0
>  rpc_new_task+0xe7/0x220
>  rpc_run_task+0x27/0x7d0
>  nfs4_run_open_task+0x477/0x810
>  _nfs4_proc_open+0xc0/0x6d0
>  _nfs4_open_and_get_state+0x178/0xc50
>  _nfs4_do_open.isra.0+0x47f/0x1380
>  nfs4_do_open+0x28b/0x620
>  nfs4_atomic_open+0x2c6/0x3a0
>  nfs_atomic_open+0x4f8/0x1180
>  atomic_open+0x186/0x4e0
>  lookup_open.isra.0+0x3e7/0x15b0
>  open_last_lookups+0x85d/0x1260
>  path_openat+0x151/0x7b0
>  do_filp_open+0x1e0/0x310
>  do_sys_openat2+0x178/0x1f0
>  do_sys_open+0xa2/0x100
>  __x64_sys_openat+0xa8/0x120
>  x64_sys_call+0x2507/0x4540
>  do_syscall_64+0xa7/0x240
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Freed by task 767:
>  kasan_save_stack+0x3b/0x70
>  kasan_save_track+0x1c/0x40
>  kasan_save_free_info+0x43/0x80
>  __kasan_slab_free+0x4f/0x90
>  kmem_cache_free+0x199/0x4f0
>  mempool_free_slab+0x1f/0x30
>  mempool_free+0xdf/0x3d0
>  rpc_free_task+0x12d/0x180
>  rpc_final_put_task+0x10e/0x150
>  rpc_do_put_task+0x63/0x80
>  rpc_put_task+0x18/0x30
>  nfs4_run_open_task+0x4f4/0x810
>  _nfs4_proc_open+0xc0/0x6d0
>  _nfs4_open_and_get_state+0x178/0xc50
>  _nfs4_do_open.isra.0+0x47f/0x1380
>  nfs4_do_open+0x28b/0x620
>  nfs4_atomic_open+0x2c6/0x3a0
>  nfs_atomic_open+0x4f8/0x1180
>  atomic_open+0x186/0x4e0
>  lookup_open.isra.0+0x3e7/0x15b0
>  open_last_lookups+0x85d/0x1260
>  path_openat+0x151/0x7b0
>  do_filp_open+0x1e0/0x310
>  do_sys_openat2+0x178/0x1f0
>  do_sys_open+0xa2/0x100
>  __x64_sys_openat+0xa8/0x120
>  x64_sys_call+0x2507/0x4540
>  do_syscall_64+0xa7/0x240
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC
> call")
> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
> ---
>  fs/nfs/nfs4_fs.h   |  1 +
>  fs/nfs/nfs4proc.c  |  2 ++
>  fs/nfs/nfs4state.c | 18 ++++++++++++++++++
>  3 files changed, 21 insertions(+)
> 
> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> index 7d383d29a995..3bbd945a78ca 100644
> --- a/fs/nfs/nfs4_fs.h
> +++ b/fs/nfs/nfs4_fs.h
> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct
> nfs_seqid_counter *counter, gfp_
>  extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct
> rpc_task *task);
>  extern void nfs_increment_open_seqid(int status, struct nfs_seqid
> *seqid);
>  extern void nfs_increment_lock_seqid(int status, struct nfs_seqid
> *seqid);
> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
>  extern void nfs_release_seqid(struct nfs_seqid *seqid);
>  extern void nfs_free_seqid(struct nfs_seqid *seqid);
>  extern int nfs4_setup_sequence(struct nfs_client *client,
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd2fbde2e6d7..86e093ffb39c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void
> *calldata)
>  	struct nfs4_opendata *data = calldata;
>  	struct nfs4_state *state = NULL;
>  
> +	if (data->rpc_status != 0 || !data->rpc_done)
> +		nfs_release_seqid_inorder(data->o_arg.seqid);
>  	/* If this request hasn't been cancelled, do nothing */
>  	if (!data->cancelled)
>  		goto out_free;
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index dafd61186557..df5e7a0b6528 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct
> nfs_seqid_counter *counter, gfp_t gfp_m
>  	return new;
>  }
>  
> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
> +{
> +	struct nfs_seqid_counter *sequence;
> +
> +	if (seqid == NULL || list_empty(&seqid->list))
> +		return;
> +	sequence = seqid->sequence;
> +	spin_lock(&sequence->lock);
> +	if (!list_is_last(&seqid->list, &sequence->list)) {
> +		struct nfs_seqid *next;
> +
> +		next = list_next_entry(seqid, list);
> +		rpc_wake_up_queued_task(&sequence->wait, next-
> >task);
> +	}
> +	list_del_init(&seqid->list);
> +	spin_unlock(&sequence->lock);
> +}
> +
>  void nfs_release_seqid(struct nfs_seqid *seqid)
>  {
>  	struct nfs_seqid_counter *sequence;


I'm not really seeing why we need a new function
nfs_release_seqid_inorder(). Yes, there is an optimisation that can be
made for nfs_release_seqid(), but that's not relevant to the problem
you are seeing.

The other issue is that you're applying the fix to
nfs4_open_confirm_release(), which doesn't need fixing. The only
function that needs to change is nfs4_open_release(), because it is the
only one that can be called without nfs_wait_on_sequence() having
succeeded, and so is the only function that can destroy the rpc_task
before the seqid has become the first entry in the sequence->list.

I've sent a couple of patches that address both the optimisation and a
proposed final fix for the problem to the list (with you on the Cc
list). Please check if they fix your problem.

Thanks!
yangerkun Nov. 9, 2024, 2:17 a.m. UTC | #5
在 2024/11/9 6:21, Trond Myklebust 写道:
> On Tue, 2024-10-22 at 20:21 +0800, Yang Erkun wrote:
>> From: Yang Erkun <yangerkun@huawei.com>
>>
>> Two threads that work with the same cred try to open different files
>> concurrently, they will utilize the same nfs4_state_owner. And in
>> order
>> to sequential open request send to server, the second task will fall
>> into RPC_TASK_QUEUED in nfs_wait_on_sequence since there is already
>> one
>> work doing the open operation. Furthermore, the second task will wait
>> until the first task completes its work, call rpc_wake_up_queued_task
>> in
>> nfs_release_seqid to wake up the second task, allowing it to complete
>> the remaining open operation.
>>
>> The preceding logic does not cause any problems under normal
>> circumstances. However, when once we force an unmount using `umount -
>> f`,
>> the function nfs_umount_begin attempts to kill all tasks by calling
>> rpc_signal_task. This help wake up the second task, but it sets the
>> status to -ERESTARTSYS. This status prevents `nfs4_open_release` from
>> calling `nfs4_opendata_to_nfs4_state`. Consequently, while the second
>> task will be freed, the original tasks will still exist in
>> sequence->list(see nfs_release_seqid). Latter, when the first thread
>> calls nfs_release_seqid and attempts to wake up the second task, it
>> will
>> trigger the uaf.
>>
>> To resolve this issue, ensure rpc_task will remove it from
>> sequence->list in nfs4_open_release when open failed, besides, we can
>> only wakeup the next rpc_task, or use-after-free will happen too
>> since
>> privious rpc_task may be released too.
>>
>> ==================================================================
>> BUG: KASAN: slab-use-after-free in rpc_wake_up_queued_task+0xbb/0xc0
>> Read of size 8 at addr ff11000007639930 by task bash/792
>>
>> CPU: 0 UID: 0 PID: 792 Comm: bash Tainted: G    B   W
>> 6.11.0-09960-gd10b58fe53dc-dirty #10
>> Tainted: [B]=BAD_PAGE, [W]=WARN
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.16.1-2.fc37 04/01/2014
>> Call Trace:
>>   <TASK>
>>   dump_stack_lvl+0xa3/0x120
>>   print_address_description.constprop.0+0x63/0x510
>>   print_report+0xf5/0x360
>>   kasan_report+0xd9/0x140
>>   __asan_report_load8_noabort+0x24/0x40
>>   rpc_wake_up_queued_task+0xbb/0xc0
>>   nfs_release_seqid+0x1e1/0x2f0
>>   nfs_free_seqid+0x1a/0x40
>>   nfs4_opendata_free+0xc6/0x3e0
>>   _nfs4_do_open.isra.0+0xbe3/0x1380
>>   nfs4_do_open+0x28b/0x620
>>   nfs4_atomic_open+0x2c6/0x3a0
>>   nfs_atomic_open+0x4f8/0x1180
>>   atomic_open+0x186/0x4e0
>>   lookup_open.isra.0+0x3e7/0x15b0
>>   open_last_lookups+0x85d/0x1260
>>   path_openat+0x151/0x7b0
>>   do_filp_open+0x1e0/0x310
>>   do_sys_openat2+0x178/0x1f0
>>   do_sys_open+0xa2/0x100
>>   __x64_sys_openat+0xa8/0x120
>>   x64_sys_call+0x2507/0x4540
>>   do_syscall_64+0xa7/0x240
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> ...
>>
>> Allocated by task 767:
>>   kasan_save_stack+0x3b/0x70
>>   kasan_save_track+0x1c/0x40
>>   kasan_save_alloc_info+0x44/0x70
>>   __kasan_slab_alloc+0xaf/0xc0
>>   kmem_cache_alloc_noprof+0x1e0/0x4f0
>>   rpc_new_task+0xe7/0x220
>>   rpc_run_task+0x27/0x7d0
>>   nfs4_run_open_task+0x477/0x810
>>   _nfs4_proc_open+0xc0/0x6d0
>>   _nfs4_open_and_get_state+0x178/0xc50
>>   _nfs4_do_open.isra.0+0x47f/0x1380
>>   nfs4_do_open+0x28b/0x620
>>   nfs4_atomic_open+0x2c6/0x3a0
>>   nfs_atomic_open+0x4f8/0x1180
>>   atomic_open+0x186/0x4e0
>>   lookup_open.isra.0+0x3e7/0x15b0
>>   open_last_lookups+0x85d/0x1260
>>   path_openat+0x151/0x7b0
>>   do_filp_open+0x1e0/0x310
>>   do_sys_openat2+0x178/0x1f0
>>   do_sys_open+0xa2/0x100
>>   __x64_sys_openat+0xa8/0x120
>>   x64_sys_call+0x2507/0x4540
>>   do_syscall_64+0xa7/0x240
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> Freed by task 767:
>>   kasan_save_stack+0x3b/0x70
>>   kasan_save_track+0x1c/0x40
>>   kasan_save_free_info+0x43/0x80
>>   __kasan_slab_free+0x4f/0x90
>>   kmem_cache_free+0x199/0x4f0
>>   mempool_free_slab+0x1f/0x30
>>   mempool_free+0xdf/0x3d0
>>   rpc_free_task+0x12d/0x180
>>   rpc_final_put_task+0x10e/0x150
>>   rpc_do_put_task+0x63/0x80
>>   rpc_put_task+0x18/0x30
>>   nfs4_run_open_task+0x4f4/0x810
>>   _nfs4_proc_open+0xc0/0x6d0
>>   _nfs4_open_and_get_state+0x178/0xc50
>>   _nfs4_do_open.isra.0+0x47f/0x1380
>>   nfs4_do_open+0x28b/0x620
>>   nfs4_atomic_open+0x2c6/0x3a0
>>   nfs_atomic_open+0x4f8/0x1180
>>   atomic_open+0x186/0x4e0
>>   lookup_open.isra.0+0x3e7/0x15b0
>>   open_last_lookups+0x85d/0x1260
>>   path_openat+0x151/0x7b0
>>   do_filp_open+0x1e0/0x310
>>   do_sys_openat2+0x178/0x1f0
>>   do_sys_open+0xa2/0x100
>>   __x64_sys_openat+0xa8/0x120
>>   x64_sys_call+0x2507/0x4540
>>   do_syscall_64+0xa7/0x240
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> Fixes: 24ac23ab88df ("NFSv4: Convert open() into an asynchronous RPC
>> call")
>> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
>> ---
>>   fs/nfs/nfs4_fs.h   |  1 +
>>   fs/nfs/nfs4proc.c  |  2 ++
>>   fs/nfs/nfs4state.c | 18 ++++++++++++++++++
>>   3 files changed, 21 insertions(+)
>>
>> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
>> index 7d383d29a995..3bbd945a78ca 100644
>> --- a/fs/nfs/nfs4_fs.h
>> +++ b/fs/nfs/nfs4_fs.h
>> @@ -525,6 +525,7 @@ extern struct nfs_seqid *nfs_alloc_seqid(struct
>> nfs_seqid_counter *counter, gfp_
>>   extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct
>> rpc_task *task);
>>   extern void nfs_increment_open_seqid(int status, struct nfs_seqid
>> *seqid);
>>   extern void nfs_increment_lock_seqid(int status, struct nfs_seqid
>> *seqid);
>> +extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
>>   extern void nfs_release_seqid(struct nfs_seqid *seqid);
>>   extern void nfs_free_seqid(struct nfs_seqid *seqid);
>>   extern int nfs4_setup_sequence(struct nfs_client *client,
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index cd2fbde2e6d7..86e093ffb39c 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2438,6 +2438,8 @@ static void nfs4_open_confirm_release(void
>> *calldata)
>>   	struct nfs4_opendata *data = calldata;
>>   	struct nfs4_state *state = NULL;
>>   
>> +	if (data->rpc_status != 0 || !data->rpc_done)
>> +		nfs_release_seqid_inorder(data->o_arg.seqid);
>>   	/* If this request hasn't been cancelled, do nothing */
>>   	if (!data->cancelled)
>>   		goto out_free;
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index dafd61186557..df5e7a0b6528 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -1075,6 +1075,24 @@ struct nfs_seqid *nfs_alloc_seqid(struct
>> nfs_seqid_counter *counter, gfp_t gfp_m
>>   	return new;
>>   }
>>   
>> +void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
>> +{
>> +	struct nfs_seqid_counter *sequence;
>> +
>> +	if (seqid == NULL || list_empty(&seqid->list))
>> +		return;
>> +	sequence = seqid->sequence;
>> +	spin_lock(&sequence->lock);
>> +	if (!list_is_last(&seqid->list, &sequence->list)) {
>> +		struct nfs_seqid *next;
>> +
>> +		next = list_next_entry(seqid, list);
>> +		rpc_wake_up_queued_task(&sequence->wait, next-
>>> task);
>> +	}
>> +	list_del_init(&seqid->list);
>> +	spin_unlock(&sequence->lock);
>> +}
>> +
>>   void nfs_release_seqid(struct nfs_seqid *seqid)
>>   {
>>   	struct nfs_seqid_counter *sequence;
> 
> 
> I'm not really seeing why we need a new function
> nfs_release_seqid_inorder(). Yes, there is an optimisation that can be
> made for nfs_release_seqid(), but that's not relevant to the problem
> you are seeing.

Hi, you can see the second uaf analysis with this link: 
https://lore.kernel.org/all/ae13ac55-a90f-6cfd-f23b-2ae5999d2f8c@huaweicloud.com/

And for why I do this, the reply from me for this patch. It's a rfc to 
talk does we should change nfs_release_seqid all to do this order wakeup.

"
Hi,

For this patch, no failed test case is added to the local nfsv4.0
nfsv4.1 test suite. But I'm not sure whether to change nfs_release_seqid
to wake-up in sequence, so I changed the patch title to rfc. Hope we can
discuss it together.

Thanks,
Erkun.
"


> 
> The other issue is that you're applying the fix to
> nfs4_open_confirm_release(), which doesn't need fixing. The only

Oh, sorry, this is some mistake when I write this rfc.

> function that needs to change is nfs4_open_release(), because it is the
> only one that can be called without nfs_wait_on_sequence() having
> succeeded, and so is the only function that can destroy the rpc_task
> before the seqid has become the first entry in the sequence->list.
> 
> I've sent a couple of patches that address both the optimisation and a
> proposed final fix for the problem to the list (with you on the Cc
> list). Please check if they fix your problem.

Yeah, your patchset pass my testcase.

> 
> Thanks!
>
diff mbox series

Patch

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 7d383d29a995..3bbd945a78ca 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -525,6 +525,7 @@  extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_
 extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
 extern void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid);
 extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid);
+extern void nfs_release_seqid_inorder(struct nfs_seqid *seqid);
 extern void nfs_release_seqid(struct nfs_seqid *seqid);
 extern void nfs_free_seqid(struct nfs_seqid *seqid);
 extern int nfs4_setup_sequence(struct nfs_client *client,
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index cd2fbde2e6d7..86e093ffb39c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2438,6 +2438,8 @@  static void nfs4_open_confirm_release(void *calldata)
 	struct nfs4_opendata *data = calldata;
 	struct nfs4_state *state = NULL;
 
+	if (data->rpc_status != 0 || !data->rpc_done)
+		nfs_release_seqid_inorder(data->o_arg.seqid);
 	/* If this request hasn't been cancelled, do nothing */
 	if (!data->cancelled)
 		goto out_free;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index dafd61186557..df5e7a0b6528 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1075,6 +1075,24 @@  struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m
 	return new;
 }
 
+void nfs_release_seqid_inorder(struct nfs_seqid *seqid)
+{
+	struct nfs_seqid_counter *sequence;
+
+	if (seqid == NULL || list_empty(&seqid->list))
+		return;
+	sequence = seqid->sequence;
+	spin_lock(&sequence->lock);
+	if (!list_is_last(&seqid->list, &sequence->list)) {
+		struct nfs_seqid *next;
+
+		next = list_next_entry(seqid, list);
+		rpc_wake_up_queued_task(&sequence->wait, next->task);
+	}
+	list_del_init(&seqid->list);
+	spin_unlock(&sequence->lock);
+}
+
 void nfs_release_seqid(struct nfs_seqid *seqid)
 {
 	struct nfs_seqid_counter *sequence;