diff mbox series

NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop

Message ID 1681849891-29377-1-git-send-email-dai.ngo@oracle.com (mailing list archive)
State New, archived
Headers show
Series NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop | expand

Commit Message

Dai Ngo April 18, 2023, 8:31 p.m. UTC
The following request sequence to the same file causes the NFS client and
server to get into an infinite loop with COMMIT and NFS4ERR_DELAY:

OPEN
REMOVE
WRITE
COMMIT

Problem reported test recall11, recall12, recall14, recall20, recall22,
recall40, recall42, recall48, recall50 of nfstest suite.

This patch restores the handling of race condition in nfsd_file_do_acquire
with unlink to that prior of the regression.

Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfsd/filecache.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Jeff Layton April 18, 2023, 8:58 p.m. UTC | #1
On Tue, 2023-04-18 at 13:31 -0700, Dai Ngo wrote:
> The following request sequence to the same file causes the NFS client and
> server to get into an infinite loop with COMMIT and NFS4ERR_DELAY:
> 
> OPEN
> REMOVE
> WRITE
> COMMIT
> 
> Problem reported test recall11, recall12, recall14, recall20, recall22,
> recall40, recall42, recall48, recall50 of nfstest suite.
> 
> This patch restores the handling of race condition in nfsd_file_do_acquire
> with unlink to that prior of the regression.
> 
> Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/filecache.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 6e8712bd7c99..63f7d9f4ea99 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1170,9 +1170,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	 * If construction failed, or we raced with a call to unlink()
>  	 * then unhash.
>  	 */
> -	if (status == nfs_ok && key.inode->i_nlink == 0)
> -		status = nfserr_jukebox;
> -	if (status != nfs_ok)
> +	if (status != nfs_ok || key.inode->i_nlink == 0)
>  		nfsd_file_unhash(nf);
>  	clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
>  	smp_mb__after_atomic();

It took me a min of staring at it, but I think you're right. If the link
count goes to 0, we still want to allow access to it, but we want to
unhash it to make sure that it gets cleaned up ASAP.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Chuck Lever April 19, 2023, 1:55 p.m. UTC | #2
> On Apr 18, 2023, at 4:31 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
> 
> The following request sequence to the same file causes the NFS client and
> server to get into an infinite loop with COMMIT and NFS4ERR_DELAY:
> 
> OPEN
> REMOVE
> WRITE
> COMMIT
> 
> Problem reported test recall11, recall12, recall14, recall20, recall22,
> recall40, recall42, recall48, recall50 of nfstest suite.
> 
> This patch restores the handling of race condition in nfsd_file_do_acquire
> with unlink to that prior of the regression.
> 
> Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache")
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>

Since we are very late in the -rc cycle, I'd like to apply this
to nfsd-next. The Fixes tag will ensure it gets backported
appropriately.

However, this patch fails to apply to nfsd-next. Can you rebase?


> ---
> fs/nfsd/filecache.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 6e8712bd7c99..63f7d9f4ea99 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -1170,9 +1170,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
> * If construction failed, or we raced with a call to unlink()
> * then unhash.
> */
> - if (status == nfs_ok && key.inode->i_nlink == 0)
> - status = nfserr_jukebox;
> - if (status != nfs_ok)
> + if (status != nfs_ok || key.inode->i_nlink == 0)
> nfsd_file_unhash(nf);
> clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
> smp_mb__after_atomic();
> -- 
> 2.9.5
> 

--
Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 6e8712bd7c99..63f7d9f4ea99 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1170,9 +1170,7 @@  nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	 * If construction failed, or we raced with a call to unlink()
 	 * then unhash.
 	 */
-	if (status == nfs_ok && key.inode->i_nlink == 0)
-		status = nfserr_jukebox;
-	if (status != nfs_ok)
+	if (status != nfs_ok || key.inode->i_nlink == 0)
 		nfsd_file_unhash(nf);
 	clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags);
 	smp_mb__after_atomic();