diff mbox series

[RFC] NFSD: Support write delegations for pNFS LAYOUT operations

Message ID 20240610150448.2377-2-cel@kernel.org (mailing list archive)
State New
Headers show
Series [RFC] NFSD: Support write delegations for pNFS LAYOUT operations | expand

Commit Message

Chuck Lever June 10, 2024, 3:04 p.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
unexpectedly. The NFS client had created a file with mode 0444, and
the server had returned a write delegation on the OPEN(CREATE). The
client was requesting a RW layout using the write delegation stateid
so that it could flush file modifications.

This client behavior was permitted for NFSv4.1 without pNFS, so I
began looking at NFSD's implementation of LAYOUTGET.

The failure was because fh_verify() was doing a permission check as
part of verifying the FH. It uses the loga_iomode value to specify
the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode is
0444 fails with -EACCES.

RFC 8881 Section 18.43.3 states:
> The use of the loga_iomode field depends upon the layout type, but
> should reflect the client's data access intent.

Further discussion of iomode values focuses on how the server is
permitted to change returned the iomode when coalescing layouts.
It says nothing about mandating the denial of LAYOUTGET requests
due to file permission settings.

Appropriate permission checking is done when the client acquires the
stateid used in the LAYOUTGET operation, so remove the permission
check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
checking instead.

Cc: Christoph Hellwig <hch@lst.de>
X-Cc: stable@vger.kernel.org # v6.6+
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

Comments

Trond Myklebust June 10, 2024, 4:21 p.m. UTC | #1
On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
> unexpectedly. The NFS client had created a file with mode 0444, and
> the server had returned a write delegation on the OPEN(CREATE). The
> client was requesting a RW layout using the write delegation stateid
> so that it could flush file modifications.
> 
> This client behavior was permitted for NFSv4.1 without pNFS, so I
> began looking at NFSD's implementation of LAYOUTGET.
> 
> The failure was because fh_verify() was doing a permission check as
> part of verifying the FH. It uses the loga_iomode value to specify
> the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode is
> 0444 fails with -EACCES.
> 
> RFC 8881 Section 18.43.3 states:
> > The use of the loga_iomode field depends upon the layout type, but
> > should reflect the client's data access intent.
> 
> Further discussion of iomode values focuses on how the server is
> permitted to change returned the iomode when coalescing layouts.
> It says nothing about mandating the denial of LAYOUTGET requests
> due to file permission settings.
> 
> Appropriate permission checking is done when the client acquires the
> stateid used in the LAYOUTGET operation, so remove the permission
> check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
> checking instead.

Hmm... I'm not seeing any checking or enforcement of the
EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd. Doesn't that mean that
READ and WRITE are required to check access permissions, as per
RFC8881, section 13.9.2.3?

I'm not saying that the return of an ACCESS error is correct here,
since the file was created using this credential, and so the caller
should benefit from having owner privileges.

However the issue is that knfsd should be either checking that the
credential is correct w.r.t. the stateid (if
EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
special stateid) or that it is correct w.r.t. the file permissions (if
EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a special
stateid).

> 
> Cc: Christoph Hellwig <hch@lst.de>
> X-Cc: stable@vger.kernel.org # v6.6+
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs4proc.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 46bd20fe5c0f..c24f45870b28 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2269,23 +2269,17 @@ nfsd4_layoutget(struct svc_rqst *rqstp,
>  	const struct nfsd4_layout_ops *ops;
>  	struct nfs4_layout_stateid *ls;
>  	__be32 nfserr;
> -	int accmode = NFSD_MAY_READ_IF_EXEC;
>  
> +	nfserr = nfserr_badiomode;
>  	switch (lgp->lg_seg.iomode) {
>  	case IOMODE_READ:
> -		accmode |= NFSD_MAY_READ;
> -		break;
>  	case IOMODE_RW:
> -		accmode |= NFSD_MAY_READ | NFSD_MAY_WRITE;
>  		break;
>  	default:
> -		dprintk("%s: invalid iomode %d\n",
> -			__func__, lgp->lg_seg.iomode);
> -		nfserr = nfserr_badiomode;
>  		goto out;
>  	}
>  
> -	nfserr = fh_verify(rqstp, current_fh, 0, accmode);
> +	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_NOP);
>  	if (nfserr)
>  		goto out;
>  
> @@ -2359,7 +2353,7 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
>  	struct nfs4_layout_stateid *ls;
>  	__be32 nfserr;
>  
> -	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_WRITE);
> +	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_NOP);
>  	if (nfserr)
>  		goto out;
>
Chuck Lever June 10, 2024, 5:43 p.m. UTC | #2
On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
> On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
> > unexpectedly. The NFS client had created a file with mode 0444, and
> > the server had returned a write delegation on the OPEN(CREATE). The
> > client was requesting a RW layout using the write delegation stateid
> > so that it could flush file modifications.
> > 
> > This client behavior was permitted for NFSv4.1 without pNFS, so I
> > began looking at NFSD's implementation of LAYOUTGET.
> > 
> > The failure was because fh_verify() was doing a permission check as
> > part of verifying the FH. It uses the loga_iomode value to specify
> > the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode is
> > 0444 fails with -EACCES.
> > 
> > RFC 8881 Section 18.43.3 states:
> > > The use of the loga_iomode field depends upon the layout type, but
> > > should reflect the client's data access intent.
> > 
> > Further discussion of iomode values focuses on how the server is
> > permitted to change returned the iomode when coalescing layouts.
> > It says nothing about mandating the denial of LAYOUTGET requests
> > due to file permission settings.
> > 
> > Appropriate permission checking is done when the client acquires the
> > stateid used in the LAYOUTGET operation, so remove the permission
> > check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
> > checking instead.
> 
> Hmm... I'm not seeing any checking or enforcement of the
> EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.

I appreciate your review!

I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
18.16.4 says:
> Note that if the client ID was not created with the
> EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
> EXCHANGE_ID, then the server MUST NOT impose any requirement that
> READs and WRITEs sent for an open file have the same credentials
> as the OPEN itself, and the server is REQUIRED to perform access
> checking on the READs and WRITEs themselves.


Trond:
> Doesn't that mean that
> READ and WRITE are required to check access permissions, as per
> RFC8881, section 13.9.2.3?

Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
nfs4_preprocess_stateid_op() calls nfsd_permission() (in
nfs4_check_file()). Seems like we're covered.


Trond:
> I'm not saying that the return of an ACCESS error is correct here,
> since the file was created using this credential, and so the caller
> should benefit from having owner privileges.

The alternative is to preserve the accmode logic but instead add the
NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not objectionable.


Trond:
> However the issue is that knfsd should be either checking that the
> credential is correct w.r.t. the stateid (if
> EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
> special stateid) or that it is correct w.r.t. the file permissions (if
> EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a special
> stateid).

But LAYOUTGET is not a READ or WRITE. I don't see language that
requires stateid / credential checking for it, but it's always
possible I might have missed something.

Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID. Further,
I'm not sure how a SCSI READ or WRITE can check credentials as NFS
stateids are not presented to SCSI/iSCSI targets.

Likewise, how would this impact a flexfile layout that targets
an NFSv3 DS?

I think NFSD is checking stateids used for NFSv4 READ and WRITE as
needed, but help me understand how BIND_PRINC_STATEID is applicable
to pNFS block layouts...?
Trond Myklebust June 10, 2024, 11:54 p.m. UTC | #3
On Mon, 2024-06-10 at 13:43 -0400, Chuck Lever wrote:
> On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
> > On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
> > > unexpectedly. The NFS client had created a file with mode 0444,
> > > and
> > > the server had returned a write delegation on the OPEN(CREATE).
> > > The
> > > client was requesting a RW layout using the write delegation
> > > stateid
> > > so that it could flush file modifications.
> > > 
> > > This client behavior was permitted for NFSv4.1 without pNFS, so I
> > > began looking at NFSD's implementation of LAYOUTGET.
> > > 
> > > The failure was because fh_verify() was doing a permission check
> > > as
> > > part of verifying the FH. It uses the loga_iomode value to
> > > specify
> > > the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode
> > > is
> > > 0444 fails with -EACCES.
> > > 
> > > RFC 8881 Section 18.43.3 states:
> > > > The use of the loga_iomode field depends upon the layout type,
> > > > but
> > > > should reflect the client's data access intent.
> > > 
> > > Further discussion of iomode values focuses on how the server is
> > > permitted to change returned the iomode when coalescing layouts.
> > > It says nothing about mandating the denial of LAYOUTGET requests
> > > due to file permission settings.
> > > 
> > > Appropriate permission checking is done when the client acquires
> > > the
> > > stateid used in the LAYOUTGET operation, so remove the permission
> > > check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
> > > checking instead.
> > 
> > Hmm... I'm not seeing any checking or enforcement of the
> > EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.
> 
> I appreciate your review!
> 
> I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
> 18.16.4 says:
> > Note that if the client ID was not created with the
> > EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
> > EXCHANGE_ID, then the server MUST NOT impose any requirement that
> > READs and WRITEs sent for an open file have the same credentials
> > as the OPEN itself, and the server is REQUIRED to perform access
> > checking on the READs and WRITEs themselves.
> 
> 
> Trond:
> > Doesn't that mean that
> > READ and WRITE are required to check access permissions, as per
> > RFC8881, section 13.9.2.3?
> 
> Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
> nfs4_preprocess_stateid_op() calls nfsd_permission() (in
> nfs4_check_file()). Seems like we're covered.
> 
> 
> Trond:
> > I'm not saying that the return of an ACCESS error is correct here,
> > since the file was created using this credential, and so the caller
> > should benefit from having owner privileges.
> 
> The alternative is to preserve the accmode logic but instead add the
> NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not objectionable.
> 
> 
> Trond:
> > However the issue is that knfsd should be either checking that the
> > credential is correct w.r.t. the stateid (if
> > EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
> > special stateid) or that it is correct w.r.t. the file permissions
> > (if
> > EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a
> > special
> > stateid).
> 
> But LAYOUTGET is not a READ or WRITE. I don't see language that
> requires stateid / credential checking for it, but it's always
> possible I might have missed something.
> 
> Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID. Further,
> I'm not sure how a SCSI READ or WRITE can check credentials as NFS
> stateids are not presented to SCSI/iSCSI targets.
> 
> Likewise, how would this impact a flexfile layout that targets
> an NFSv3 DS?
> 
> I think NFSD is checking stateids used for NFSv4 READ and WRITE as
> needed, but help me understand how BIND_PRINC_STATEID is applicable
> to pNFS block layouts...?
> 
> 

If you look at Section 15.2, then NFS4ERR_ACCESS is definitely listed
as a valid error for LAYOUTGET (in fact, the very first in the list).

Furthermore, please see the following blanket statement in RFC8881,
section 12.5.1.:

   "Layouts are provided to NFSv4.1 clients, and user access still
   follows the rules of the protocol as if they did not exist."

While you can argue that the above language is not normative, because
it doesn't use the official IETF "MUST" / "MUST NOT" /..., it clearly
does declare an intention to ensure that pNFS not be allowed to weaken
the protocol.

So my point would be that if LAYOUTGET is the only time where a proper
credential check can occur, then it needs to happen there. Otherwise,
your implementation is responsible for ensuring that it happens at some
other time, in order to ensure that user access follows the rules of
the protocol.
Trond Myklebust June 11, 2024, 12:34 a.m. UTC | #4
On Mon, 2024-06-10 at 23:54 +0000, Trond Myklebust wrote:
> On Mon, 2024-06-10 at 13:43 -0400, Chuck Lever wrote:
> > On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
> > > On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
> > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > 
> > > > I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
> > > > unexpectedly. The NFS client had created a file with mode 0444,
> > > > and
> > > > the server had returned a write delegation on the OPEN(CREATE).
> > > > The
> > > > client was requesting a RW layout using the write delegation
> > > > stateid
> > > > so that it could flush file modifications.
> > > > 
> > > > This client behavior was permitted for NFSv4.1 without pNFS, so
> > > > I
> > > > began looking at NFSD's implementation of LAYOUTGET.
> > > > 
> > > > The failure was because fh_verify() was doing a permission
> > > > check
> > > > as
> > > > part of verifying the FH. It uses the loga_iomode value to
> > > > specify
> > > > the @accmode argument. fh_verify(MAY_WRITE) on a file whose
> > > > mode
> > > > is
> > > > 0444 fails with -EACCES.
> > > > 
> > > > RFC 8881 Section 18.43.3 states:
> > > > > The use of the loga_iomode field depends upon the layout
> > > > > type,
> > > > > but
> > > > > should reflect the client's data access intent.
> > > > 
> > > > Further discussion of iomode values focuses on how the server
> > > > is
> > > > permitted to change returned the iomode when coalescing
> > > > layouts.
> > > > It says nothing about mandating the denial of LAYOUTGET
> > > > requests
> > > > due to file permission settings.
> > > > 
> > > > Appropriate permission checking is done when the client
> > > > acquires
> > > > the
> > > > stateid used in the LAYOUTGET operation, so remove the
> > > > permission
> > > > check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout
> > > > stateid
> > > > checking instead.
> > > 
> > > Hmm... I'm not seeing any checking or enforcement of the
> > > EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.
> > 
> > I appreciate your review!
> > 
> > I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
> > 18.16.4 says:
> > > Note that if the client ID was not created with the
> > > EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
> > > EXCHANGE_ID, then the server MUST NOT impose any requirement that
> > > READs and WRITEs sent for an open file have the same credentials
> > > as the OPEN itself, and the server is REQUIRED to perform access
> > > checking on the READs and WRITEs themselves.
> > 
> > 
> > Trond:
> > > Doesn't that mean that
> > > READ and WRITE are required to check access permissions, as per
> > > RFC8881, section 13.9.2.3?
> > 
> > Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
> > nfs4_preprocess_stateid_op() calls nfsd_permission() (in
> > nfs4_check_file()). Seems like we're covered.
> > 
> > 
> > Trond:
> > > I'm not saying that the return of an ACCESS error is correct
> > > here,
> > > since the file was created using this credential, and so the
> > > caller
> > > should benefit from having owner privileges.
> > 
> > The alternative is to preserve the accmode logic but instead add
> > the
> > NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not
> > objectionable.
> > 
> > 
> > Trond:
> > > However the issue is that knfsd should be either checking that
> > > the
> > > credential is correct w.r.t. the stateid (if
> > > EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
> > > special stateid) or that it is correct w.r.t. the file
> > > permissions
> > > (if
> > > EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a
> > > special
> > > stateid).
> > 
> > But LAYOUTGET is not a READ or WRITE. I don't see language that
> > requires stateid / credential checking for it, but it's always
> > possible I might have missed something.
> > 
> > Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID.
> > Further,
> > I'm not sure how a SCSI READ or WRITE can check credentials as NFS
> > stateids are not presented to SCSI/iSCSI targets.
> > 
> > Likewise, how would this impact a flexfile layout that targets
> > an NFSv3 DS?
> > 
> > I think NFSD is checking stateids used for NFSv4 READ and WRITE as
> > needed, but help me understand how BIND_PRINC_STATEID is applicable
> > to pNFS block layouts...?
> > 
> > 
> 
> If you look at Section 15.2, then NFS4ERR_ACCESS is definitely listed
> as a valid error for LAYOUTGET (in fact, the very first in the list).
> 
> Furthermore, please see the following blanket statement in RFC8881,
> section 12.5.1.:
> 
>    "Layouts are provided to NFSv4.1 clients, and user access still
>    follows the rules of the protocol as if they did not exist."
> 
> While you can argue that the above language is not normative, because
> it doesn't use the official IETF "MUST" / "MUST NOT" /..., it clearly
> does declare an intention to ensure that pNFS not be allowed to
> weaken
> the protocol.
> 
> So my point would be that if LAYOUTGET is the only time where a
> proper
> credential check can occur, then it needs to happen there. Otherwise,
> your implementation is responsible for ensuring that it happens at
> some
> other time, in order to ensure that user access follows the rules of
> the protocol.
> 

Put differently: unless you have an alternative mechanism for ensuring
that the NFSv4.1 rules are followed, then

 * If EXCHGID4_FLAG_BIND_PRINC_STATEID is not set, then LAYOUTGET needs
   to check the supplied credential against the file permissions, and a
   file permissions change while a layout is outstanding should almost
   certainly trigger a layout recall.
 * If EXCHGID4_FLAG_BIND_PRINC_STATEID is set, then LAYOUTGET needs to
   check the credential against the one that was used to create the
   stateid, however it should presumably not be necessary to perform
   the layout recall during a file permissions change.
Tom Talpey June 11, 2024, 1:32 p.m. UTC | #5
On 6/10/2024 7:54 PM, Trond Myklebust wrote:
> On Mon, 2024-06-10 at 13:43 -0400, Chuck Lever wrote:
>> On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
>>> On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
>>>> unexpectedly. The NFS client had created a file with mode 0444,
>>>> and
>>>> the server had returned a write delegation on the OPEN(CREATE).
>>>> The
>>>> client was requesting a RW layout using the write delegation
>>>> stateid
>>>> so that it could flush file modifications.
>>>>
>>>> This client behavior was permitted for NFSv4.1 without pNFS, so I
>>>> began looking at NFSD's implementation of LAYOUTGET.
>>>>
>>>> The failure was because fh_verify() was doing a permission check
>>>> as
>>>> part of verifying the FH. It uses the loga_iomode value to
>>>> specify
>>>> the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode
>>>> is
>>>> 0444 fails with -EACCES.
>>>>
>>>> RFC 8881 Section 18.43.3 states:
>>>>> The use of the loga_iomode field depends upon the layout type,
>>>>> but
>>>>> should reflect the client's data access intent.
>>>>
>>>> Further discussion of iomode values focuses on how the server is
>>>> permitted to change returned the iomode when coalescing layouts.
>>>> It says nothing about mandating the denial of LAYOUTGET requests
>>>> due to file permission settings.
>>>>
>>>> Appropriate permission checking is done when the client acquires
>>>> the
>>>> stateid used in the LAYOUTGET operation, so remove the permission
>>>> check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
>>>> checking instead.
>>>
>>> Hmm... I'm not seeing any checking or enforcement of the
>>> EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.
>>
>> I appreciate your review!
>>
>> I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
>> 18.16.4 says:
>>> Note that if the client ID was not created with the
>>> EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
>>> EXCHANGE_ID, then the server MUST NOT impose any requirement that
>>> READs and WRITEs sent for an open file have the same credentials
>>> as the OPEN itself, and the server is REQUIRED to perform access
>>> checking on the READs and WRITEs themselves.
>>
>>
>> Trond:
>>> Doesn't that mean that
>>> READ and WRITE are required to check access permissions, as per
>>> RFC8881, section 13.9.2.3?
>>
>> Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
>> nfs4_preprocess_stateid_op() calls nfsd_permission() (in
>> nfs4_check_file()). Seems like we're covered.
>>
>>
>> Trond:
>>> I'm not saying that the return of an ACCESS error is correct here,
>>> since the file was created using this credential, and so the caller
>>> should benefit from having owner privileges.
>>
>> The alternative is to preserve the accmode logic but instead add the
>> NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not objectionable.
>>
>>
>> Trond:
>>> However the issue is that knfsd should be either checking that the
>>> credential is correct w.r.t. the stateid (if
>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
>>> special stateid) or that it is correct w.r.t. the file permissions
>>> (if
>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a
>>> special
>>> stateid).
>>
>> But LAYOUTGET is not a READ or WRITE. I don't see language that
>> requires stateid / credential checking for it, but it's always
>> possible I might have missed something.
>>
>> Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID. Further,
>> I'm not sure how a SCSI READ or WRITE can check credentials as NFS
>> stateids are not presented to SCSI/iSCSI targets.
>>
>> Likewise, how would this impact a flexfile layout that targets
>> an NFSv3 DS?
>>
>> I think NFSD is checking stateids used for NFSv4 READ and WRITE as
>> needed, but help me understand how BIND_PRINC_STATEID is applicable
>> to pNFS block layouts...?
>>
>>
> 
> If you look at Section 15.2, then NFS4ERR_ACCESS is definitely listed
> as a valid error for LAYOUTGET (in fact, the very first in the list).
> 
> Furthermore, please see the following blanket statement in RFC8881,
> section 12.5.1.:
> 
>     "Layouts are provided to NFSv4.1 clients, and user access still
>     follows the rules of the protocol as if they did not exist."
> 
> While you can argue that the above language is not normative, because
> it doesn't use the official IETF "MUST" / "MUST NOT" /..., it clearly
> does declare an intention to ensure that pNFS not be allowed to weaken
> the protocol.
> 
> So my point would be that if LAYOUTGET is the only time where a proper
> credential check can occur, then it needs to happen there. Otherwise,
> your implementation is responsible for ensuring that it happens at some
> other time, in order to ensure that user access follows the rules of
> the protocol.

Absolutely agreed. The MDS needs to verify access before returning
the layout, for no other reason that the DS can't perform the same
verification.

Tom.
Chuck Lever June 11, 2024, 1:54 p.m. UTC | #6
> On Jun 11, 2024, at 9:32 AM, Tom Talpey <tom@talpey.com> wrote:
> 
> On 6/10/2024 7:54 PM, Trond Myklebust wrote:
>> On Mon, 2024-06-10 at 13:43 -0400, Chuck Lever wrote:
>>> On Mon, Jun 10, 2024 at 04:21:33PM +0000, Trond Myklebust wrote:
>>>> On Mon, 2024-06-10 at 11:04 -0400, cel@kernel.org wrote:
>>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>> 
>>>>> I noticed LAYOUTGET(LAYOUTIOMODE4_RW) returning NFS4ERR_ACCESS
>>>>> unexpectedly. The NFS client had created a file with mode 0444,
>>>>> and
>>>>> the server had returned a write delegation on the OPEN(CREATE).
>>>>> The
>>>>> client was requesting a RW layout using the write delegation
>>>>> stateid
>>>>> so that it could flush file modifications.
>>>>> 
>>>>> This client behavior was permitted for NFSv4.1 without pNFS, so I
>>>>> began looking at NFSD's implementation of LAYOUTGET.
>>>>> 
>>>>> The failure was because fh_verify() was doing a permission check
>>>>> as
>>>>> part of verifying the FH. It uses the loga_iomode value to
>>>>> specify
>>>>> the @accmode argument. fh_verify(MAY_WRITE) on a file whose mode
>>>>> is
>>>>> 0444 fails with -EACCES.
>>>>> 
>>>>> RFC 8881 Section 18.43.3 states:
>>>>>> The use of the loga_iomode field depends upon the layout type,
>>>>>> but
>>>>>> should reflect the client's data access intent.
>>>>> 
>>>>> Further discussion of iomode values focuses on how the server is
>>>>> permitted to change returned the iomode when coalescing layouts.
>>>>> It says nothing about mandating the denial of LAYOUTGET requests
>>>>> due to file permission settings.
>>>>> 
>>>>> Appropriate permission checking is done when the client acquires
>>>>> the
>>>>> stateid used in the LAYOUTGET operation, so remove the permission
>>>>> check from LAYOUTGET and LAYOUTCOMMIT, and rely on layout stateid
>>>>> checking instead.
>>>> 
>>>> Hmm... I'm not seeing any checking or enforcement of the
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID flag in knfsd.
>>> 
>>> I appreciate your review!
>>> 
>>> I see that BIND_PRINC_STATEID is not set by NFSD. RFC 8881 Section
>>> 18.16.4 says:
>>>> Note that if the client ID was not created with the
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID capability set in the reply to
>>>> EXCHANGE_ID, then the server MUST NOT impose any requirement that
>>>> READs and WRITEs sent for an open file have the same credentials
>>>> as the OPEN itself, and the server is REQUIRED to perform access
>>>> checking on the READs and WRITEs themselves.
>>> 
>>> 
>>> Trond:
>>>> Doesn't that mean that
>>>> READ and WRITE are required to check access permissions, as per
>>>> RFC8881, section 13.9.2.3?
>>> 
>>> Every NFSv4 READ and WRITE calls nfs4_preprocess_stateid_op(), and
>>> nfs4_preprocess_stateid_op() calls nfsd_permission() (in
>>> nfs4_check_file()). Seems like we're covered.
>>> 
>>> 
>>> Trond:
>>>> I'm not saying that the return of an ACCESS error is correct here,
>>>> since the file was created using this credential, and so the caller
>>>> should benefit from having owner privileges.
>>> 
>>> The alternative is to preserve the accmode logic but instead add the
>>> NFSD_MAY_OWNER_OVERRIDE flag, me thinks, which is not objectionable.
>>> 
>>> 
>>> Trond:
>>>> However the issue is that knfsd should be either checking that the
>>>> credential is correct w.r.t. the stateid (if
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is set and the stateid is not a
>>>> special stateid) or that it is correct w.r.t. the file permissions
>>>> (if
>>>> EXCHGID4_FLAG_BIND_PRINC_STATEID is not set or the stateid is a
>>>> special
>>>> stateid).
>>> 
>>> But LAYOUTGET is not a READ or WRITE. I don't see language that
>>> requires stateid / credential checking for it, but it's always
>>> possible I might have missed something.
>>> 
>>> Also, RFC 5663 has nothing to say about BIND_PRINC_STATEID. Further,
>>> I'm not sure how a SCSI READ or WRITE can check credentials as NFS
>>> stateids are not presented to SCSI/iSCSI targets.
>>> 
>>> Likewise, how would this impact a flexfile layout that targets
>>> an NFSv3 DS?
>>> 
>>> I think NFSD is checking stateids used for NFSv4 READ and WRITE as
>>> needed, but help me understand how BIND_PRINC_STATEID is applicable
>>> to pNFS block layouts...?
>>> 
>>> 
>> If you look at Section 15.2, then NFS4ERR_ACCESS is definitely listed
>> as a valid error for LAYOUTGET (in fact, the very first in the list).
>> Furthermore, please see the following blanket statement in RFC8881,
>> section 12.5.1.:
>>    "Layouts are provided to NFSv4.1 clients, and user access still
>>    follows the rules of the protocol as if they did not exist."
>> While you can argue that the above language is not normative, because
>> it doesn't use the official IETF "MUST" / "MUST NOT" /..., it clearly
>> does declare an intention to ensure that pNFS not be allowed to weaken
>> the protocol.
>> So my point would be that if LAYOUTGET is the only time where a proper
>> credential check can occur, then it needs to happen there. Otherwise,
>> your implementation is responsible for ensuring that it happens at some
>> other time, in order to ensure that user access follows the rules of
>> the protocol.
> 
> Absolutely agreed. The MDS needs to verify access before returning
> the layout, for no other reason that the DS can't perform the same
> verification.

I have trouble believing that something that critical has been
left by the spec authors as an implied implementation quality
issue.

The bulk of Sections 12.5.1 and .2 discuss how storage devices
need to handle things like writes outside of a layout and
other corner cases, including permission changes during I/O.
Clearly the spec authors expect that DS targets will deal
properly with these situations without the use of stateids.

BIND_PRINC_STATEID is never explicitly mentioned here.
Instead, the OPEN, LOCK, and ACCESS operations are clearly
mentioned as the points in time when file permission is
inspected by the server (ie, these operations provide the
main access verification check that precedes the successful
retrieval of a layout).

Block layouts then use SCSI persistent reservation and layout
recall to handle I/O-time access checking.

Therefore, IMO, BIND_PRINC_STATEID applies only to NFSv4
READ and WRITE, because the spec as it stands is not sane
otherwise.

However, the objection is noted. I'm working on an alternate
more conservative fix for the write delegation issue that
motivated this patch. Stay tuned for v2.


--
Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 46bd20fe5c0f..c24f45870b28 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2269,23 +2269,17 @@  nfsd4_layoutget(struct svc_rqst *rqstp,
 	const struct nfsd4_layout_ops *ops;
 	struct nfs4_layout_stateid *ls;
 	__be32 nfserr;
-	int accmode = NFSD_MAY_READ_IF_EXEC;
 
+	nfserr = nfserr_badiomode;
 	switch (lgp->lg_seg.iomode) {
 	case IOMODE_READ:
-		accmode |= NFSD_MAY_READ;
-		break;
 	case IOMODE_RW:
-		accmode |= NFSD_MAY_READ | NFSD_MAY_WRITE;
 		break;
 	default:
-		dprintk("%s: invalid iomode %d\n",
-			__func__, lgp->lg_seg.iomode);
-		nfserr = nfserr_badiomode;
 		goto out;
 	}
 
-	nfserr = fh_verify(rqstp, current_fh, 0, accmode);
+	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_NOP);
 	if (nfserr)
 		goto out;
 
@@ -2359,7 +2353,7 @@  nfsd4_layoutcommit(struct svc_rqst *rqstp,
 	struct nfs4_layout_stateid *ls;
 	__be32 nfserr;
 
-	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_WRITE);
+	nfserr = fh_verify(rqstp, current_fh, 0, NFSD_MAY_NOP);
 	if (nfserr)
 		goto out;