[v2] nfsd: fix error handling in nfs4_set_delegation()

Message ID	20180509120249.62022-1-aweits@rit.edu (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-nfs-owner@kernel.org> From: Andrew Elble <aweits@rit.edu> To: linux-nfs@vger.kernel.org, bfields@fieldses.org, jlayton@kernel.org Cc: Andrew Elble <aweits@rit.edu> Subject: [PATCH v2] nfsd: fix error handling in nfs4_set_delegation() Date: Wed, 9 May 2018 08:02:49 -0400 Message-Id: <20180509120249.62022-1-aweits@rit.edu> Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk

Message ID

20180509120249.62022-1-aweits@rit.edu (mailing list archive)

State

New, archived

Headers

From: Andrew Elble <aweits@rit.edu>
To: linux-nfs@vger.kernel.org, bfields@fieldses.org, jlayton@kernel.org
Cc: Andrew Elble <aweits@rit.edu>
Subject: [PATCH v2] nfsd: fix error handling in nfs4_set_delegation()
Date: Wed,  9 May 2018 08:02:49 -0400
Message-Id: <20180509120249.62022-1-aweits@rit.edu>
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk

Commit Message

Andrew W Elble May 9, 2018, 12:02 p.m. UTC

I noticed a memory corruption crash in nfsd in
4.17-rc1. This patch corrects the issue.

Fix to return error if the delegation couldn't be hashed or there was
a recall in progress. Use the existing error path instead of
destroy_unhashed_delegation() for readability. Set the fields of the
delegation to indicate that it does not need to be recalled.

Signed-off-by: Andrew Elble <aweits@rit.edu>
Fixes: 353601e7d323c ("nfsd: create a separate lease for each delegation")
---
v2: typo in changelog, set delegation recall-suppression
 fs/nfsd/nfs4state.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

J. Bruce Fields May 11, 2018, 9:30 p.m. UTC | #1

On Wed, May 09, 2018 at 08:02:49AM -0400, Andrew Elble wrote:
> I noticed a memory corruption crash in nfsd in
> 4.17-rc1. This patch corrects the issue.
> 
> Fix to return error if the delegation couldn't be hashed or there was
> a recall in progress. Use the existing error path instead of
> destroy_unhashed_delegation() for readability. Set the fields of the
> delegation to indicate that it does not need to be recalled.
> 
> Signed-off-by: Andrew Elble <aweits@rit.edu>
> Fixes: 353601e7d323c ("nfsd: create a separate lease for each delegation")
> ---
> v2: typo in changelog, set delegation recall-suppression
>  fs/nfsd/nfs4state.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 71b87738c015..20463944cd61 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -4372,12 +4372,26 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp,
>  		status = -EAGAIN;
>  	else
>  		status = hash_delegation_locked(dp, fp);
> +	/*
> +	 * This delegation is doomed, tell the recall logic
> +	 * that it's being destroyed here.
> +	 */
> +
> +	if (status) {
> +		dp->dl_time++;
> +		list_del_init(&dp->dl_recall_lru);
> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> +	}

I'm trying to figure out if this fixes an actual bug.  The code should
be able to deal with a callback on an already unhashed delegation, so I
think you're right it would at worst just be an unnecessary recall.
This won't catch every such case (could be that nfsd4_cb_recall_prepare
already ran and we're too late), so I wonder if this is worth it.

More interesting to me is what exactly it would take to hit this
case....  Another thread would have to have succesfully hashed a
delegation for this client and file to make our hash_delegation_locked
fail.  So there would be two leases for the same file and client, but
with different delegation pointers as the fl_owner.  I *think* we handle
that OK.  But it was likely problematic previously when we were still
using the file pointer as the fl_owner.

--b.

>  	spin_unlock(&fp->fi_lock);
>  	spin_unlock(&state_lock);
>  
>  	if (status)
> -		destroy_unhashed_deleg(dp);
> +		goto out_unlock;
> +
>  	return dp;
> +
> +out_unlock:
> +	vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL, (void **)&dp);
>  out_clnt_odstate:
>  	put_clnt_odstate(dp->dl_clnt_odstate);
>  out_stid:
> -- 
> 1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew W Elble May 14, 2018, 11:31 a.m. UTC | #2

>> +	/*
>> +	 * This delegation is doomed, tell the recall logic
>> +	 * that it's being destroyed here.
>> +	 */
>> +
>> +	if (status) {
>> +		dp->dl_time++;
>> +		list_del_init(&dp->dl_recall_lru);
>> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
>> +	}
>
> I'm trying to figure out if this fixes an actual bug.  The code should
> be able to deal with a callback on an already unhashed delegation, so I
> think you're right it would at worst just be an unnecessary recall.

But an 'normal' unhashed delegation would have a persistent refcount,
this one would not. If the recall code gets a hold of it, it will
place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?

> This won't catch every such case (could be that nfsd4_cb_recall_prepare
> already ran and we're too late), so I wonder if this is worth it.
>
> More interesting to me is what exactly it would take to hit this
> case....  Another thread would have to have succesfully hashed a
> delegation for this client and file to make our hash_delegation_locked
> fail.  So there would be two leases for the same file and client, but
> with different delegation pointers as the fl_owner.  I *think* we handle
> that OK.  But it was likely problematic previously when we were still
> using the file pointer as the fl_owner.

I'm thinking this is more easily hit via fp->fi_had_conflict, if a lease
break comes in at the right time?

Thanks,

Andy

J. Bruce Fields May 14, 2018, 3:45 p.m. UTC | #3

On Mon, May 14, 2018 at 07:31:53AM -0400, Andrew W Elble wrote:
> 
> >> +	/*
> >> +	 * This delegation is doomed, tell the recall logic
> >> +	 * that it's being destroyed here.
> >> +	 */
> >> +
> >> +	if (status) {
> >> +		dp->dl_time++;
> >> +		list_del_init(&dp->dl_recall_lru);
> >> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> >> +	}
> >
> > I'm trying to figure out if this fixes an actual bug.  The code should
> > be able to deal with a callback on an already unhashed delegation, so I
> > think you're right it would at worst just be an unnecessary recall.
> 
> But an 'normal' unhashed delegation would have a persistent refcount,
> this one would not. If the recall code gets a hold of it, it will
> place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?

Sounds right.  Do you see any bug there?

> > This won't catch every such case (could be that nfsd4_cb_recall_prepare
> > already ran and we're too late), so I wonder if this is worth it.
> >
> > More interesting to me is what exactly it would take to hit this
> > case....  Another thread would have to have succesfully hashed a
> > delegation for this client and file to make our hash_delegation_locked
> > fail.  So there would be two leases for the same file and client, but
> > with different delegation pointers as the fl_owner.  I *think* we handle
> > that OK.  But it was likely problematic previously when we were still
> > using the file pointer as the fl_owner.
> 
> I'm thinking this is more easily hit via fp->fi_had_conflict, if a lease
> break comes in at the right time?

Could be.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew W Elble May 23, 2018, 12:31 p.m. UTC | #4

>> But an 'normal' unhashed delegation would have a persistent refcount,
>> this one would not. If the recall code gets a hold of it, it will
>> place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?
>
> Sounds right.  Do you see any bug there?

I'd expect it to crash in the laundromat. Unfortunately the larger test rig
that I'd normally use to try to hit this is occupied doing other things
- I may have to come back to this later.

Thanks,

Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 71b87738c015..20463944cd61 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4372,12 +4372,26 @@  static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp,
 		status = -EAGAIN;
 	else
 		status = hash_delegation_locked(dp, fp);
+	/*
+	 * This delegation is doomed, tell the recall logic
+	 * that it's being destroyed here.
+	 */
+
+	if (status) {
+		dp->dl_time++;
+		list_del_init(&dp->dl_recall_lru);
+		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	}
 	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 
 	if (status)
-		destroy_unhashed_deleg(dp);
+		goto out_unlock;
+
 	return dp;
+
+out_unlock:
+	vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL, (void **)&dp);
 out_clnt_odstate:
 	put_clnt_odstate(dp->dl_clnt_odstate);
 out_stid:

[v2] nfsd: fix error handling in nfs4_set_delegation()

Commit Message

Comments

Patch