| Message ID | 20241104121843.1589284-1-yangerkun@huaweicloud.com (mailing list archive) |
|---|---|
| State | New |
| Series | [v2] nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur |
On Mon, 2024-11-04 at 20:18 +0800, Yang Erkun wrote:
> From: Yang Erkun <yangerkun@huawei.com>
>
> A force umount (umount -f) will attempt to kill all rpc_tasks, even
> though the umount operation may ultimately fail if some files remain
> open. Consequently, if an action attempts to open a file, it can
> potentially send two rpc_tasks to the nfs server.
>
> NFS CLIENT
> thread1                                  thread2
> open("file")
> ...
> nfs4_do_open
>  _nfs4_do_open
>   _nfs4_open_and_get_state
>    _nfs4_proc_open
>     nfs4_run_open_task
>      /* rpc_task1 */
>      rpc_run_task
>      rpc_wait_for_completion_task
>
>                                          umount -f
>                                           nfs_umount_begin
>                                            rpc_killall_tasks
>                                             rpc_signal_task
>      rpc_task1 is woken up
>      and returns -512
>  _nfs4_do_open // while loop
>      ...
>      nfs4_run_open_task
>       /* rpc_task2 */
>       rpc_run_task
>       rpc_wait_for_completion_task
>
> While processing an open request, nfsd will first attempt to find or
> allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
> marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released.
> Since two rpc_tasks may attempt to open the same file simultaneously
> from the client to the server, and because two instances of nfsd can
> run concurrently, this situation can lead to lots of memory leaks.
> Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will
> be triggered.
>
> NFS SERVER
> nfsd1                  nfsd2                  echo 0 > /proc/fs/nfsd/threads
>
> nfsd4_open
>  nfsd4_process_open1
>   find_or_alloc_open_stateowner
>    // alloc oo1, stateid1
>                        nfsd4_open
>                         nfsd4_process_open1
>                          find_or_alloc_open_stateowner
>                          // find oo1, without NFS4_OO_CONFIRMED
>                           release_openowner
>                            unhash_openowner_locked
>                             list_del_init(&oo->oo_perclient)
>                             // cannot find this oo
>                             // from client, LEAK!!!
>                           alloc_stateowner // alloc oo2
>
>  nfsd4_process_open2
>   init_open_stateid
>   // associate oo1
>   // with stateid1, stateid1 LEAK!!!
>   nfs4_get_vfs_file
>   // alloc nfsd_file1 and nfsd_file_mark1
>   // all LEAK!!!
>
>                        nfsd4_process_open2
>                        ...
>
>                                               write_threads
>                                               ...
>                                               nfsd_destroy_serv
>                                                nfsd_shutdown_net
>                                                 nfs4_state_shutdown_net
>                                                  nfs4_state_destroy_net
>                                                   destroy_client
>                                                    __destroy_client
>                                                    // won't find oo1!!!
>                                                nfsd_shutdown_generic
>                                                 nfsd_file_cache_shutdown
>                                                  kmem_cache_destroy
>                                                  for nfsd_file_slab
>                                                  and nfsd_file_mark_slab
>                                                  // bark since nfsd_file1
>                                                  // and nfsd_file_mark1
>                                                  // still alive
>
> =======================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> -----------------------------------------------------------------------
>
> Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
> flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
> CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1ae/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Disabling lock debugging due to kernel taint
> Object 0xff11000110e2ac38 @offset=3128
> Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
> pid=800
>  nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
>  nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
>  nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
>  nfsd4_process_open2+0x713/0x1070 [nfsd]
>  nfsd4_open+0x74b/0x8b0 [nfsd]
>  nfsd4_proc_compound+0x70b/0xc20 [nfsd]
>  nfsd_dispatch+0x1b4/0x3a0 [nfsd]
>  svc_process_common+0x5b8/0xc50 [sunrpc]
>  svc_process+0x2ab/0x3b0 [sunrpc]
>  svc_handle_xprt+0x681/0xa20 [sunrpc]
>  nfsd+0x183/0x220 [nfsd]
>  kthread+0x199/0x1e0
>  ret_from_fork+0x31/0x60
>  ret_from_fork_asm+0x1a/0x30
>
> Add nfs4_openowner_unhashed to help find unhashed nfs4_openowners, and
> break out of the nfsd4_open process to fix this problem.
>
> Cc: stable@vger.kernel.org # 2.6
> Signed-off-by: Yang Erkun <yangerkun@huawei.com>
> ---
>  fs/nfsd/nfs4state.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> v1->v2: fix mistake in nfs4_openowner_unhashed
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 551d2958ec29..bc175bafc4d7 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1660,6 +1660,12 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
>  	free_ol_stateid_reaplist(&reaplist);
>  }
>
> +static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
> +{

Can we add a lockdep_assert_held() for the cl_lock here?

> +	return list_empty(&oo->oo_owner.so_strhash) &&
> +	       list_empty(&oo->oo_perclient);
> +}
> +
>  static void unhash_openowner_locked(struct nfs4_openowner *oo)
>  {
>  	struct nfs4_client *clp = oo->oo_owner.so_client;
> @@ -4975,6 +4981,12 @@ init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
>  	spin_lock(&oo->oo_owner.so_client->cl_lock);
>  	spin_lock(&fp->fi_lock);
>
> +	if (nfs4_openowner_unhashed(oo)) {
> +		mutex_unlock(&stp->st_mutex);
> +		stp = NULL;
> +		goto out_unlock;
> +	}
> +
>  	retstp = nfsd4_find_existing_open(fp, open);
>  	if (retstp)
>  		goto out_unlock;
> @@ -6127,6 +6139,11 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
>
>  	if (!stp) {
>  		stp = init_open_stateid(fp, open);
> +		if (!stp) {
> +			status = nfserr_jukebox;
> +			goto out;
> +		}
> +
>  		if (!open->op_stp)
>  			new_stp = true;
>  	}

Nice analysis!

Reviewed-by: Jeff Layton <jlayton@kernel.org>
On 2024/11/4 22:23, Jeff Layton wrote:
> On Mon, 2024-11-04 at 20:18 +0800, Yang Erkun wrote:
>> From: Yang Erkun <yangerkun@huawei.com>

[...]

>> +static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
>> +{
>
> Can we add a lockdep_assert_held() for the cl_lock here?

Yeah, this function should be protected by cl_lock.

>> +	return list_empty(&oo->oo_owner.so_strhash) &&
>> +	       list_empty(&oo->oo_perclient);
>> +}

[...]

> Nice analysis!
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 551d2958ec29..bc175bafc4d7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1660,6 +1660,12 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 	free_ol_stateid_reaplist(&reaplist);
 }
 
+static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
+{
+	return list_empty(&oo->oo_owner.so_strhash) &&
+	       list_empty(&oo->oo_perclient);
+}
+
 static void unhash_openowner_locked(struct nfs4_openowner *oo)
 {
 	struct nfs4_client *clp = oo->oo_owner.so_client;
@@ -4975,6 +4981,12 @@ init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
 	spin_lock(&oo->oo_owner.so_client->cl_lock);
 	spin_lock(&fp->fi_lock);
 
+	if (nfs4_openowner_unhashed(oo)) {
+		mutex_unlock(&stp->st_mutex);
+		stp = NULL;
+		goto out_unlock;
+	}
+
 	retstp = nfsd4_find_existing_open(fp, open);
 	if (retstp)
 		goto out_unlock;
@@ -6127,6 +6139,11 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 
 	if (!stp) {
 		stp = init_open_stateid(fp, open);
+		if (!stp) {
+			status = nfserr_jukebox;
+			goto out;
+		}
+
 		if (!open->op_stp)
 			new_stp = true;
 	}