Message ID | 1681849891-29377-1-git-send-email-dai.ngo@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop | expand |
On Tue, 2023-04-18 at 13:31 -0700, Dai Ngo wrote: > The following request sequence to the same file causes the NFS client and > server to get into an infinite loop with COMMIT and NFS4ERR_DELAY: > > OPEN > REMOVE > WRITE > COMMIT > > Problem reported test recall11, recall12, recall14, recall20, recall22, > recall40, recall42, recall48, recall50 of nfstest suite. > > This patch restores the handling of race condition in nfsd_file_do_acquire > with unlink to that prior of the regression. > > Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") > Signed-off-by: Dai Ngo <dai.ngo@oracle.com> > --- > fs/nfsd/filecache.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 6e8712bd7c99..63f7d9f4ea99 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -1170,9 +1170,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > * If construction failed, or we raced with a call to unlink() > * then unhash. > */ > - if (status == nfs_ok && key.inode->i_nlink == 0) > - status = nfserr_jukebox; > - if (status != nfs_ok) > + if (status != nfs_ok || key.inode->i_nlink == 0) > nfsd_file_unhash(nf); > clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags); > smp_mb__after_atomic(); It took me a min of staring at it, but I think you're right. If the link count goes to 0, we still want to allow access to it, but we want to unhash it to make sure that it gets cleaned up ASAP. Reviewed-by: Jeff Layton <jlayton@kernel.org>
> On Apr 18, 2023, at 4:31 PM, Dai Ngo <dai.ngo@oracle.com> wrote: > > The following request sequence to the same file causes the NFS client and > server to get into an infinite loop with COMMIT and NFS4ERR_DELAY: > > OPEN > REMOVE > WRITE > COMMIT > > Problem reported test recall11, recall12, recall14, recall20, recall22, > recall40, recall42, recall48, recall50 of nfstest suite. > > This patch restores the handling of race condition in nfsd_file_do_acquire > with unlink to that prior of the regression. > > Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") > Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Since we are very late in the -rc cycle, I'd like to apply this to nfsd-next. The Fixes tag will ensure it gets backported appropriately. However, this patch fails to apply to nfsd-next. Can you rebase? > --- > fs/nfsd/filecache.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 6e8712bd7c99..63f7d9f4ea99 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -1170,9 +1170,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > * If construction failed, or we raced with a call to unlink() > * then unhash. > */ > - if (status == nfs_ok && key.inode->i_nlink == 0) > - status = nfserr_jukebox; > - if (status != nfs_ok) > + if (status != nfs_ok || key.inode->i_nlink == 0) > nfsd_file_unhash(nf); > clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags); > smp_mb__after_atomic(); > -- > 2.9.5 > -- Chuck Lever
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 6e8712bd7c99..63f7d9f4ea99 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -1170,9 +1170,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, * If construction failed, or we raced with a call to unlink() * then unhash. */ - if (status == nfs_ok && key.inode->i_nlink == 0) - status = nfserr_jukebox; - if (status != nfs_ok) + if (status != nfs_ok || key.inode->i_nlink == 0) nfsd_file_unhash(nf); clear_bit_unlock(NFSD_FILE_PENDING, &nf->nf_flags); smp_mb__after_atomic();
The following request sequence to the same file causes the NFS client and server to get into an infinite loop with COMMIT and NFS4ERR_DELAY: OPEN REMOVE WRITE COMMIT Problem reported test recall11, recall12, recall14, recall20, recall22, recall40, recall42, recall48, recall50 of nfstest suite. This patch restores the handling of race condition in nfsd_file_do_acquire with unlink to that prior of the regression. Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> --- fs/nfsd/filecache.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)