diff mbox

[v2] nfsd: fix error handling in nfs4_set_delegation()

Message ID 20180509120249.62022-1-aweits@rit.edu (mailing list archive)
State New, archived
Headers show

Commit Message

Andrew W Elble May 9, 2018, 12:02 p.m. UTC
I noticed a memory corruption crash in nfsd in
4.17-rc1. This patch corrects the issue.

Fix to return error if the delegation couldn't be hashed or there was
a recall in progress. Use the existing error path instead of
destroy_unhashed_delegation() for readability. Set the fields of the
delegation to indicate that it does not need to be recalled.

Signed-off-by: Andrew Elble <aweits@rit.edu>
Fixes: 353601e7d323c ("nfsd: create a separate lease for each delegation")
---
v2: typo in changelog, set delegation recall-suppression
 fs/nfsd/nfs4state.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

J. Bruce Fields May 11, 2018, 9:30 p.m. UTC | #1
On Wed, May 09, 2018 at 08:02:49AM -0400, Andrew Elble wrote:
> I noticed a memory corruption crash in nfsd in
> 4.17-rc1. This patch corrects the issue.
> 
> Fix to return error if the delegation couldn't be hashed or there was
> a recall in progress. Use the existing error path instead of
> destroy_unhashed_delegation() for readability. Set the fields of the
> delegation to indicate that it does not need to be recalled.
> 
> Signed-off-by: Andrew Elble <aweits@rit.edu>
> Fixes: 353601e7d323c ("nfsd: create a separate lease for each delegation")
> ---
> v2: typo in changelog, set delegation recall-suppression
>  fs/nfsd/nfs4state.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 71b87738c015..20463944cd61 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -4372,12 +4372,26 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp,
>  		status = -EAGAIN;
>  	else
>  		status = hash_delegation_locked(dp, fp);
> +	/*
> +	 * This delegation is doomed, tell the recall logic
> +	 * that it's being destroyed here.
> +	 */
> +
> +	if (status) {
> +		dp->dl_time++;
> +		list_del_init(&dp->dl_recall_lru);
> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> +	}

I'm trying to figure out if this fixes an actual bug.  The code should
be able to deal with a callback on an already unhashed delegation, so I
think you're right it would at worst just be an unnecessary recall.
This won't catch every such case (could be that nfsd4_cb_recall_prepare
already ran and we're too late), so I wonder if this is worth it.

More interesting to me is what exactly it would take to hit this
case....  Another thread would have to have succesfully hashed a
delegation for this client and file to make our hash_delegation_locked
fail.  So there would be two leases for the same file and client, but
with different delegation pointers as the fl_owner.  I *think* we handle
that OK.  But it was likely problematic previously when we were still
using the file pointer as the fl_owner.

--b.

>  	spin_unlock(&fp->fi_lock);
>  	spin_unlock(&state_lock);
>  
>  	if (status)
> -		destroy_unhashed_deleg(dp);
> +		goto out_unlock;
> +
>  	return dp;
> +
> +out_unlock:
> +	vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL, (void **)&dp);
>  out_clnt_odstate:
>  	put_clnt_odstate(dp->dl_clnt_odstate);
>  out_stid:
> -- 
> 1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew W Elble May 14, 2018, 11:31 a.m. UTC | #2
>> +	/*
>> +	 * This delegation is doomed, tell the recall logic
>> +	 * that it's being destroyed here.
>> +	 */
>> +
>> +	if (status) {
>> +		dp->dl_time++;
>> +		list_del_init(&dp->dl_recall_lru);
>> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
>> +	}
>
> I'm trying to figure out if this fixes an actual bug.  The code should
> be able to deal with a callback on an already unhashed delegation, so I
> think you're right it would at worst just be an unnecessary recall.

But an 'normal' unhashed delegation would have a persistent refcount,
this one would not. If the recall code gets a hold of it, it will
place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?

> This won't catch every such case (could be that nfsd4_cb_recall_prepare
> already ran and we're too late), so I wonder if this is worth it.
>
> More interesting to me is what exactly it would take to hit this
> case....  Another thread would have to have succesfully hashed a
> delegation for this client and file to make our hash_delegation_locked
> fail.  So there would be two leases for the same file and client, but
> with different delegation pointers as the fl_owner.  I *think* we handle
> that OK.  But it was likely problematic previously when we were still
> using the file pointer as the fl_owner.

I'm thinking this is more easily hit via fp->fi_had_conflict, if a lease
break comes in at the right time?

Thanks,

Andy
J. Bruce Fields May 14, 2018, 3:45 p.m. UTC | #3
On Mon, May 14, 2018 at 07:31:53AM -0400, Andrew W Elble wrote:
> 
> >> +	/*
> >> +	 * This delegation is doomed, tell the recall logic
> >> +	 * that it's being destroyed here.
> >> +	 */
> >> +
> >> +	if (status) {
> >> +		dp->dl_time++;
> >> +		list_del_init(&dp->dl_recall_lru);
> >> +		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> >> +	}
> >
> > I'm trying to figure out if this fixes an actual bug.  The code should
> > be able to deal with a callback on an already unhashed delegation, so I
> > think you're right it would at worst just be an unnecessary recall.
> 
> But an 'normal' unhashed delegation would have a persistent refcount,
> this one would not. If the recall code gets a hold of it, it will
> place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?

Sounds right.  Do you see any bug there?

> > This won't catch every such case (could be that nfsd4_cb_recall_prepare
> > already ran and we're too late), so I wonder if this is worth it.
> >
> > More interesting to me is what exactly it would take to hit this
> > case....  Another thread would have to have succesfully hashed a
> > delegation for this client and file to make our hash_delegation_locked
> > fail.  So there would be two leases for the same file and client, but
> > with different delegation pointers as the fl_owner.  I *think* we handle
> > that OK.  But it was likely problematic previously when we were still
> > using the file pointer as the fl_owner.
> 
> I'm thinking this is more easily hit via fp->fi_had_conflict, if a lease
> break comes in at the right time?

Could be.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew W Elble May 23, 2018, 12:31 p.m. UTC | #4
>> But an 'normal' unhashed delegation would have a persistent refcount,
>> this one would not. If the recall code gets a hold of it, it will
>> place it on nn->del_recall_lru, and then free it in nfsd4_cb_recall_release()?
>
> Sounds right.  Do you see any bug there?

I'd expect it to crash in the laundromat. Unfortunately the larger test rig
that I'd normally use to try to hit this is occupied doing other things
- I may have to come back to this later.

Thanks,

Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 71b87738c015..20463944cd61 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4372,12 +4372,26 @@  static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp,
 		status = -EAGAIN;
 	else
 		status = hash_delegation_locked(dp, fp);
+	/*
+	 * This delegation is doomed, tell the recall logic
+	 * that it's being destroyed here.
+	 */
+
+	if (status) {
+		dp->dl_time++;
+		list_del_init(&dp->dl_recall_lru);
+		dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	}
 	spin_unlock(&fp->fi_lock);
 	spin_unlock(&state_lock);
 
 	if (status)
-		destroy_unhashed_deleg(dp);
+		goto out_unlock;
+
 	return dp;
+
+out_unlock:
+	vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL, (void **)&dp);
 out_clnt_odstate:
 	put_clnt_odstate(dp->dl_clnt_odstate);
 out_stid: