Message ID | 1507740502-5151-1-git-send-email-Thomas.Haynes@primarydata.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Bruce, In this test: def testReplayCache006(t, env): """Send two solo sequence compounds with same seqid FLAGS: sequence all CODE: SEQ9f """ c = env.c1.new_client(env.testname(t)) sess = c.create_session() res1 = sess.compound([]) check(res1) res2 = sess.compound([], cache_this=True, seq_delta=0) check(res2) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): fail("Replay results not equal") I don't see why the result should be NFS4_OK. The first compound does not set sa_cachethis and the second does set it. According to RFC5661, the server is not required to cache the first request if the client does not request it. And if the server does not cache the response, when it gets a request to replay it, it can return NFS4ERR_RETRY_UNCACHED_REP: 2.10.6.1.3. Optional Reply Caching On a per-request basis, the requester can choose to direct the replier to cache the reply to all operations after the first operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it would not direct the replier to cache the entire reply is that the request is composed of all idempotent operations [34]. Caching the reply may offer little benefit. If the reply is too large (see Section 2.10.6.4), it may not be cacheable anyway. Even if the reply to idempotent request is small enough to cache, unnecessarily caching the reply slows down the server and increases RPC latency. Whether or not the requester requests the reply to be cached has no effect on the slot processing. If the results of SEQUENCE or CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be incremented by one. If a requester does not direct the replier to cache the reply, the replier MUST do one of following: o The replier can cache the entire original reply. Even though sa_cachethis or csa_cachethis is FALSE, the replier is always free to cache. It may choose this approach in order to simplify implementation. o The replier enters into its reply cache a reply consisting of the original results to the SEQUENCE or CB_SEQUENCE operation, and with the next operation in COMPOUND or CB_COMPOUND having the error NFS4ERR_RETRY_UNCACHED_REP. Thus, if the requester later retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. If a replier receives a retried Sequence operation where the reply to the COMPOUND or CB_COMPOUND was not cached, then the replier, Further, since the parameters to SEQUENCE are different it is a false retry according to this: 2.10.6.1.3.1. False Retry If a requester sent a Sequence operation with a slot ID and sequence ID that are in the reply cache but the replier detected that the retried request is not the same as the original request, including a retry that has different operations or different arguments in the operations from the original and a retry that uses a different principal in the RPC request's credential field that translates to a different user, then this is a false retry. When the replier detects a false retry, it is permitted (but not always obligated) to return NFS4ERR_SEQ_FALSE_RETRY in response to the Sequence operation when it detects a false retry. I'm not sure if the test should be fixed to either 1) allow either NFS4_OK or NFS4ERR_RETRY_UNCACHED_REP 2) just remove the test. Thoughts? -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2017-10-12 at 18:32 +0000, Thomas Haynes wrote: > Hi Bruce, > > In this test: > > def testReplayCache006(t, env): > """Send two solo sequence compounds with same seqid > > FLAGS: sequence all > CODE: SEQ9f > """ > c = env.c1.new_client(env.testname(t)) > sess = c.create_session() > res1 = sess.compound([]) > check(res1) > res2 = sess.compound([], cache_this=True, seq_delta=0) > check(res2) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > fail("Replay results not equal") > > I don't see why the result should be NFS4_OK. > > The first compound does not set sa_cachethis and the second > does set it. According to RFC5661, the server is not required > to cache the first request if the client does not request it. > > And if the server does not cache the response, when it gets > a request to replay it, it can return NFS4ERR_RETRY_UNCACHED_REP: > > 2.10.6.1.3. Optional Reply Caching > > On a per-request basis, the requester can choose to direct the > replier to cache the reply to all operations after the first > operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or > csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE. > The reason it would not direct the replier to cache the entire > reply > is that the request is composed of all idempotent operations [34]. > Caching the reply may offer little benefit. If the reply is too > large (see Section 2.10.6.4), it may not be cacheable anyway. Even > if the reply to idempotent request is small enough to cache, > unnecessarily caching the reply slows down the server and increases > RPC latency. > > Whether or not the requester requests the reply to be cached has no > effect on the slot processing. If the results of SEQUENCE or > CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be > incremented by one. If a requester does not direct the replier to > cache the reply, the replier MUST do one of following: > > o The replier can cache the entire original reply. Even though > sa_cachethis or csa_cachethis is FALSE, the replier is always > free > to cache. It may choose this approach in order to simplify > implementation. > > o The replier enters into its reply cache a reply consisting of > the > original results to the SEQUENCE or CB_SEQUENCE operation, and > with the next operation in COMPOUND or CB_COMPOUND having the > error NFS4ERR_RETRY_UNCACHED_REP. Thus, if the requester later > retries the request, it will get NFS4ERR_RETRY_UNCACHED_REP. If > a > replier receives a retried Sequence operation where the reply to > the COMPOUND or CB_COMPOUND was not cached, then the replier, > > Further, since the parameters to SEQUENCE are different it is a false > retry according to this: > > 2.10.6.1.3.1. False Retry > > If a requester sent a Sequence operation with a slot ID and sequence > ID that are in the reply cache but the replier detected that the > retried request is not the same as the original request, including a > retry that has different operations or different arguments in the > operations from the original and a retry that uses a different > principal in the RPC request's credential field that translates to a > different user, then this is a false retry. When the replier detects > a false retry, it is permitted (but not always obligated) to return > NFS4ERR_SEQ_FALSE_RETRY in response to the Sequence operation when it > detects a false retry. > > I'm not sure if the test should be fixed to either > > 1) allow either NFS4_OK or NFS4ERR_RETRY_UNCACHED_REP > 2) just remove the test. > > Thoughts? Hmmm.... 2.10.6.1.3 ... * MUST NOT return NFS4ERR_RETRY_UNCACHED_REP in reply to a Sequence operation if the Sequence operation is the first operation. .... ...so it looks as if the first sequence in a compound always has to return either NFS_OK, or possibly NFS4ERR_SEQ_FALSE_RETRY. The latter should presumably be allowed, since there is no exception carved out in 2.10.6.1.3.1 for the sequence op w.r.t. behaviour with different operation arguments. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com
On Thu, Oct 12, 2017 at 06:32:09PM +0000, Thomas Haynes wrote: > Hi Bruce, > > In this test: > > def testReplayCache006(t, env): > """Send two solo sequence compounds with same seqid > > FLAGS: sequence all > CODE: SEQ9f > """ > c = env.c1.new_client(env.testname(t)) > sess = c.create_session() > res1 = sess.compound([]) > check(res1) > res2 = sess.compound([], cache_this=True, seq_delta=0) > check(res2) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > fail("Replay results not equal") > > I don't see why the result should be NFS4_OK. > > The first compound does not set sa_cachethis and the second > does set it. According to RFC5661, the server is not required > to cache the first request if the client does not request it. > > And if the server does not cache the response, when it gets > a request to replay it, it can return NFS4ERR_RETRY_UNCACHED_REP: If I'm reading that section correctly, it can't return NFS4ERR_RETRY_UNCACHED_REP, that can only be returned on the next op of the compound (and there is no next op in this case, only the one SEQUENCE). So I *think* the only correct options OK or FALSE_RETRY? > I'm not sure if the test should be fixed to either > > 1) allow either NFS4_OK or NFS4ERR_RETRY_UNCACHED_REP > 2) just remove the test. Maybe just add cache_this to the first? Looks like the following test is off too--I think a server's allowed to return a cached reply instead of RETRY_UNCACHED_REP? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Your mailer's not quoting right, it's a little hard for me to find your replies. Wading in: On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes wrote: > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fieldses.org<mailto:bfields@fieldses.org>> wrote: > > So I *think* the only correct options OK or FALSE_RETRY? > > It can’t be OK if the parameters to SEQUENCE differ. I'm getting that from: "When the replier detects a false retry, it is permitted (but not always obligated) to return NFS4ERR_FALSE_RETRY in response to the Sequence operation when it detects a false retry." If i understand the following language, we're required to return FALSE_RETRY in the case the rpc credentials of the caller map to different principals, but not otherwise. --b -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields wrote: > Your mailer's not quoting right, it's a little hard for me to find your > replies. Wading in: > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes wrote: > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fieldses.org<mailto:bfields@fieldses.org>> wrote: > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > It can’t be OK if the parameters to SEQUENCE differ. > > I'm getting that from: "When the replier detects a false retry, it is > permitted (but not always obligated) to return NFS4ERR_FALSE_RETRY in > response to the Sequence operation when it detects a false retry." I think you are agreeing with me that OK is not appropriate here? > > If i understand the following language, we're required to return > FALSE_RETRY in the case the rpc credentials of the caller map to > different principals, but not otherwise. This one drove me crazy: If a requester sent a Sequence operation with a slot ID and sequence ID that are in the reply cache but the replier detected that the retried request is not the same as the original request, including a retry that has different operations or different arguments in the operations from the original SEQUENCE is not special - both the compounds in this example only have the SEQUENCE op and they differ only in that in the first sa_cachethis is False and in the second it is True. So we have to return FALSE_SEQ_RETRY here... > > --b > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 12, 2017 at 03:00:51PM -0700, Tom Haynes wrote: > On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields wrote: > > Your mailer's not quoting right, it's a little hard for me to find your > > replies. Wading in: > > > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes wrote: > > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fieldses.org<mailto:bfields@fieldses.org>> wrote: > > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > > > It can’t be OK if the parameters to SEQUENCE differ. > > > > I'm getting that from: "When the replier detects a false retry, it is > > permitted (but not always obligated) to return NFS4ERR_FALSE_RETRY in > > response to the Sequence operation when it detects a false retry." > > I think you are agreeing with me that OK is not appropriate here? No, I think OK is OK: > > If i understand the following language, we're required to return > > FALSE_RETRY in the case the rpc credentials of the caller map to > > different principals, but not otherwise. > > This one drove me crazy: > > If a requester sent a Sequence operation with a slot ID and sequence > ID that are in the reply cache but the replier detected that the > retried request is not the same as the original request, including a > retry that has different operations or different arguments in the > operations from the original > > SEQUENCE is not special - both the compounds in this example > only have the SEQUENCE op and they differ only in that in the > first sa_cachethis is False and in the second it is True. > > So we have to return FALSE_SEQ_RETRY here... It says "if the replier detected" a difference, not "if there is" a difference. So the replier is not required to do such detection. This agrees with the "not always obligated" above. So I think it's allowed for the server to just return an old cached response here (with the cached OK). And I can't see any practical problem that would create--a client shouldn't be sending a different request with the same (slot, sequence) anyway. The only potential risk is the malicious client trying to snoop somebody else's reply cache, hence the requirement in the case principals differ. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2017-10-12 at 21:52 -0400, J. Bruce Fields wrote: > On Thu, Oct 12, 2017 at 03:00:51PM -0700, Tom Haynes wrote: > > On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields wrote: > > > Your mailer's not quoting right, it's a little hard for me to > > > find your > > > replies. Wading in: > > > > > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes wrote: > > > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fieldses > > > > .org<mailto:bfields@fieldses.org>> wrote: > > > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > > > > > It can’t be OK if the parameters to SEQUENCE differ. > > > > > > I'm getting that from: "When the replier detects a false retry, > > > it is > > > permitted (but not always obligated) to return > > > NFS4ERR_FALSE_RETRY in > > > response to the Sequence operation when it detects a false > > > retry." > > > > I think you are agreeing with me that OK is not appropriate here? > > No, I think OK is OK: > > > > If i understand the following language, we're required to return > > > FALSE_RETRY in the case the rpc credentials of the caller map to > > > different principals, but not otherwise. > > > > This one drove me crazy: > > > > If a requester sent a Sequence operation with a slot ID and > > sequence > > ID that are in the reply cache but the replier detected that the > > retried request is not the same as the original request, > > including a > > retry that has different operations or different arguments in > > the > > operations from the original > > > > SEQUENCE is not special - both the compounds in this example > > only have the SEQUENCE op and they differ only in that in the > > first sa_cachethis is False and in the second it is True. > > > > So we have to return FALSE_SEQ_RETRY here... > > It says "if the replier detected" a difference, not "if there is" a > difference. So the replier is not required to do such > detection. This > agrees with the "not always obligated" above. > > So I think it's allowed for the server to just return an old cached > response here (with the cached OK). And I can't see any practical > problem that would create--a client shouldn't be sending a different > request with the same (slot, sequence) anyway. The only potential > risk > is the malicious client trying to snoop somebody else's reply cache, > hence the requirement in the case principals differ. Clients may indeed send a different request with the same slot and sequence if they don't know that the server received the last request. This is tbe "user pressed ^C" scenario... Servers MAY ignore that fact, but they'd be really stupid to do so... > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" > in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com
On Fri, Oct 13, 2017 at 01:34:28PM +0000, Trond Myklebust wrote: > On Thu, 2017-10-12 at 21:52 -0400, J. Bruce Fields wrote: > > On Thu, Oct 12, 2017 at 03:00:51PM -0700, Tom Haynes wrote: > > > On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields wrote: > > > > Your mailer's not quoting right, it's a little hard for me to > > > > find your > > > > replies. Wading in: > > > > > > > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes wrote: > > > > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fieldses > > > > > .org<mailto:bfields@fieldses.org>> wrote: > > > > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > > > > > > > It can’t be OK if the parameters to SEQUENCE differ. > > > > > > > > I'm getting that from: "When the replier detects a false retry, > > > > it is > > > > permitted (but not always obligated) to return > > > > NFS4ERR_FALSE_RETRY in > > > > response to the Sequence operation when it detects a false > > > > retry." > > > > > > I think you are agreeing with me that OK is not appropriate here? > > > > No, I think OK is OK: > > > > > > If i understand the following language, we're required to return > > > > FALSE_RETRY in the case the rpc credentials of the caller map to > > > > different principals, but not otherwise. > > > > > > This one drove me crazy: > > > > > > If a requester sent a Sequence operation with a slot ID and > > > sequence > > > ID that are in the reply cache but the replier detected that the > > > retried request is not the same as the original request, > > > including a > > > retry that has different operations or different arguments in > > > the > > > operations from the original > > > > > > SEQUENCE is not special - both the compounds in this example > > > only have the SEQUENCE op and they differ only in that in the > > > first sa_cachethis is False and in the second it is True. > > > > > > So we have to return FALSE_SEQ_RETRY here... > > > > It says "if the replier detected" a difference, not "if there is" a > > difference. So the replier is not required to do such > > detection. This > > agrees with the "not always obligated" above. > > > > So I think it's allowed for the server to just return an old cached > > response here (with the cached OK). And I can't see any practical > > problem that would create--a client shouldn't be sending a different > > request with the same (slot, sequence) anyway. The only potential > > risk > > is the malicious client trying to snoop somebody else's reply cache, > > hence the requirement in the case principals differ. > > Clients may indeed send a different request with the same slot and > sequence if they don't know that the server received the last request. > This is tbe "user pressed ^C" scenario... > > Servers MAY ignore that fact, but they'd be really stupid to do so... OK, OK, I'll look into fixing the server (I'm pretty sure we get this wrong). You've explained the ctrl-C case before and I don't think I understood it. I guess otherwise the only way for the client to sort out the situation would be to retry the original request. And that requires keeping the arguments and credentials around to handle potential retries. And that's impractical if the process is going away? OK. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2017-10-13 at 11:00 -0400, bfields@fieldses.org wrote: > On Fri, Oct 13, 2017 at 01:34:28PM +0000, Trond Myklebust wrote: > > On Thu, 2017-10-12 at 21:52 -0400, J. Bruce Fields wrote: > > > On Thu, Oct 12, 2017 at 03:00:51PM -0700, Tom Haynes wrote: > > > > On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields > > > > wrote: > > > > > Your mailer's not quoting right, it's a little hard for me to > > > > > find your > > > > > replies. Wading in: > > > > > > > > > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes > > > > > wrote: > > > > > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fiel > > > > > > dses > > > > > > .org<mailto:bfields@fieldses.org>> wrote: > > > > > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > > > > > > > > > It can’t be OK if the parameters to SEQUENCE differ. > > > > > > > > > > I'm getting that from: "When the replier detects a false > > > > > retry, > > > > > it is > > > > > permitted (but not always obligated) to return > > > > > NFS4ERR_FALSE_RETRY in > > > > > response to the Sequence operation when it detects a false > > > > > retry." > > > > > > > > I think you are agreeing with me that OK is not appropriate > > > > here? > > > > > > No, I think OK is OK: > > > > > > > > If i understand the following language, we're required to > > > > > return > > > > > FALSE_RETRY in the case the rpc credentials of the caller map > > > > > to > > > > > different principals, but not otherwise. > > > > > > > > This one drove me crazy: > > > > > > > > If a requester sent a Sequence operation with a slot ID and > > > > sequence > > > > ID that are in the reply cache but the replier detected that > > > > the > > > > retried request is not the same as the original request, > > > > including a > > > > retry that has different operations or different arguments > > > > in > > > > the > > > > operations from the original > > > > > > > > SEQUENCE is not special - both the compounds in this example > > > > only have the SEQUENCE op and they differ only in that in the > > > > first sa_cachethis is False and in the second it is True. > > > > > > > > So we have to return FALSE_SEQ_RETRY here... > > > > > > It says "if the replier detected" a difference, not "if there is" > > > a > > > difference. So the replier is not required to do such > > > detection. This > > > agrees with the "not always obligated" above. > > > > > > So I think it's allowed for the server to just return an old > > > cached > > > response here (with the cached OK). And I can't see any > > > practical > > > problem that would create--a client shouldn't be sending a > > > different > > > request with the same (slot, sequence) anyway. The only > > > potential > > > risk > > > is the malicious client trying to snoop somebody else's reply > > > cache, > > > hence the requirement in the case principals differ. > > > > Clients may indeed send a different request with the same slot and > > sequence if they don't know that the server received the last > > request. > > This is tbe "user pressed ^C" scenario... > > > > Servers MAY ignore that fact, but they'd be really stupid to do > > so... > > OK, OK, I'll look into fixing the server (I'm pretty sure we get this > wrong). > > You've explained the ctrl-C case before and I don't think I > understood > it. I guess otherwise the only way for the client to sort out the > situation would be to retry the original request. And that requires > keeping the arguments and credentials around to handle potential > retries. And that's impractical if the process is going away? OK. > Right, we're not going to do that just for data that is just going to be tossed away anyway. We already guarantee that non-idempotent operations (the ones that we actually do ask the server to cache) are guaranteed to complete whether or not the user presses ^C, so this is mainly about what happens when somebody interrupts an operation that we did not want the server to cache. I have a patch out there that just replays a SEQUENCE op if we detect that an RPC call was interrupted. That should be sufficient to deal with servers that cache everything (whether or not the client sets sa_cachethis), but don't want to do NFS4ERR_SEQ_FALSE_RETRY. That particular combination has been seen to be extremely toxic to the current client, because it can get replayed LOOKUP or GETATTR requests after someone presses ^C. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com
On Fri, Oct 13, 2017 at 03:26:51PM +0000, Trond Myklebust wrote: > On Fri, 2017-10-13 at 11:00 -0400, bfields@fieldses.org wrote: > > OK, OK, I'll look into fixing the server (I'm pretty sure we get this > > wrong). > > > > You've explained the ctrl-C case before and I don't think I > > understood > > it. I guess otherwise the only way for the client to sort out the > > situation would be to retry the original request. And that requires > > keeping the arguments and credentials around to handle potential > > retries. And that's impractical if the process is going away? OK. > > > > Right, we're not going to do that just for data that is just going to > be tossed away anyway. We already guarantee that non-idempotent > operations (the ones that we actually do ask the server to cache) are > guaranteed to complete whether or not the user presses ^C, so this is > mainly about what happens when somebody interrupts an operation that we > did not want the server to cache. > > I have a patch out there that just replays a SEQUENCE op if we detect > that an RPC call was interrupted. That should be sufficient to deal > with servers that cache everything (whether or not the client sets > sa_cachethis), but don't want to do NFS4ERR_SEQ_FALSE_RETRY. That > particular combination has been seen to be extremely toxic to the > current client, because it can get replayed LOOKUP or GETATTR requests > after someone presses ^C. Those all involve uncached compounds with more than one op. My reading of knfsd code is that it will return RETRY_UNCACHED_REP in this case, and I think (I might be misunderstanding) that the client will bump the slot seqid and retry in that case. So I *think* you shouldn't be seeing that problem with knfsd? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 13, 2017 at 02:50:15PM -0400, bfields@fieldses.org wrote: > Those all involve uncached compounds with more than one op. My reading > of knfsd code is that it will return RETRY_UNCACHED_REP in this case, > and I think (I might be misunderstanding) that the client will bump the > slot seqid and retry in that case. So I *think* you shouldn't be seeing > that problem with knfsd? So to bring it back to the original question, I'm still not sure it's actually a problem if a server returns NFS_OK when cache_this is set differently betwen two compound calls each containing just a single SEQUENCE. But, maybe it's worth at least warning about. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Fri, 2017-10-13 at 11:00 -0400, bfields@fieldses.org wrote: > > On Fri, Oct 13, 2017 at 01:34:28PM +0000, Trond Myklebust wrote: > > > On Thu, 2017-10-12 at 21:52 -0400, J. Bruce Fields wrote: > > > > On Thu, Oct 12, 2017 at 03:00:51PM -0700, Tom Haynes wrote: > > > > > On Thu, Oct 12, 2017 at 05:44:54PM -0400, J. Bruce Fields > > > > > wrote: > > > > > > Your mailer's not quoting right, it's a little hard for me to > > > > > > find your replies. Wading in: > > > > > > > > > > > > On Thu, Oct 12, 2017 at 09:39:04PM +0000, Thomas Haynes > > > > > > wrote: > > > > > > > On Oct 12, 2017, at 12:49 PM, J. Bruce Fields <bfields@fiel > > > > > > > dses .org<mailto:bfields@fieldses.org>> wrote: > > > > > > > > So I *think* the only correct options OK or FALSE_RETRY? > > > > > > > > > > > > > > It can’t be OK if the parameters to SEQUENCE differ. > > > > > > > > > > > > I'm getting that from: "When the replier detects a false > > > > > > retry, it is permitted (but not always obligated) to return > > > > > > NFS4ERR_FALSE_RETRY in response to the Sequence operation > when > > > > > > it detects a false retry." > > > > > > > > > > I think you are agreeing with me that OK is not appropriate > > > > > here? > > > > > > > > No, I think OK is OK: > > > > > > > > > > If i understand the following language, we're required to > > > > > > return FALSE_RETRY in the case the rpc credentials of the > > > > > > caller map to different principals, but not otherwise. > > > > > > > > > > This one drove me crazy: > > > > > > > > > > If a requester sent a Sequence operation with a slot ID and > > > > > sequence > > > > > ID that are in the reply cache but the replier detected that > > > > > the > > > > > retried request is not the same as the original request, > > > > > including a > > > > > retry that has different operations or different arguments in > > > > > the > > > > > operations from the original > > > > > > > > > > SEQUENCE is not special - both the compounds in this example > > > > > only have the SEQUENCE op and they differ only in that in the > > > > > first sa_cachethis is False and in the second it is True. > > > > > > > > > > So we have to return FALSE_SEQ_RETRY here... > > > > > > > > It says "if the replier detected" a difference, not "if there is" > > > > a > > > > difference. So the replier is not required to do such detection. > > > > This agrees with the "not always obligated" above. > > > > > > > > So I think it's allowed for the server to just return an old > > > > cached response here (with the cached OK). And I can't see any > > > > practical problem that would create--a client shouldn't be sending > > > > a different request with the same (slot, sequence) anyway. The > > > > only potential risk is the malicious client trying to snoop > > > > somebody else's reply cache, hence the requirement in the case > > > > principals differ. > > > > > > Clients may indeed send a different request with the same slot and > > > sequence if they don't know that the server received the last > > > request. > > > This is tbe "user pressed ^C" scenario... > > > > > > Servers MAY ignore that fact, but they'd be really stupid to do > > > so... > > > > OK, OK, I'll look into fixing the server (I'm pretty sure we get this > > wrong). > > > > You've explained the ctrl-C case before and I don't think I understood > > it. I guess otherwise the only way for the client to sort out the > > situation would be to retry the original request. And that requires > > keeping the arguments and credentials around to handle potential > > retries. And that's impractical if the process is going away? OK. > > > > Right, we're not going to do that just for data that is just going to be tossed > away anyway. We already guarantee that non-idempotent operations (the > ones that we actually do ask the server to cache) are guaranteed to complete > whether or not the user presses ^C, so this is mainly about what happens > when somebody interrupts an operation that we did not want the server to > cache. > > I have a patch out there that just replays a SEQUENCE op if we detect that an > RPC call was interrupted. That should be sufficient to deal with servers that > cache everything (whether or not the client sets sa_cachethis), but don't > want to do NFS4ERR_SEQ_FALSE_RETRY. That particular combination has > been seen to be extremely toxic to the current client, because it can get > replayed LOOKUP or GETATTR requests after someone presses ^C. I'm wondering if Ganesha does this right... Bruce, if there's something not covered by the pynfs tests, can we make sure to add one? Frank --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 13, 2017 at 02:50:15PM -0400, bfields@fieldses.org wrote: > On Fri, Oct 13, 2017 at 03:26:51PM +0000, Trond Myklebust wrote: > > On Fri, 2017-10-13 at 11:00 -0400, bfields@fieldses.org wrote: > > > OK, OK, I'll look into fixing the server (I'm pretty sure we get this > > > wrong). > > > > > > You've explained the ctrl-C case before and I don't think I > > > understood > > > it. I guess otherwise the only way for the client to sort out the > > > situation would be to retry the original request. And that requires > > > keeping the arguments and credentials around to handle potential > > > retries. And that's impractical if the process is going away? OK. > > > > > > > Right, we're not going to do that just for data that is just going to > > be tossed away anyway. We already guarantee that non-idempotent > > operations (the ones that we actually do ask the server to cache) are > > guaranteed to complete whether or not the user presses ^C, so this is > > mainly about what happens when somebody interrupts an operation that we > > did not want the server to cache. > > > > I have a patch out there that just replays a SEQUENCE op if we detect > > that an RPC call was interrupted. That should be sufficient to deal > > with servers that cache everything (whether or not the client sets > > sa_cachethis), but don't want to do NFS4ERR_SEQ_FALSE_RETRY. That > > particular combination has been seen to be extremely toxic to the > > current client, because it can get replayed LOOKUP or GETATTR requests > > after someone presses ^C. > > Those all involve uncached compounds with more than one op. My reading > of knfsd code is that it will return RETRY_UNCACHED_REP in this case, > and I think (I might be misunderstanding) that the client will bump the > slot seqid and retry in that case. So I *think* you shouldn't be seeing > that problem with knfsd? Argh, no, you're sending a bare SEQUENCE so of course there's just one op. And looking at Olga's COPY example and the code.... The server gets confused in this case and returns a reply to the SEQUENCE, nothing else, but sets the reply's opcnt to the count taken from the original call, for some reason. So, the server's returning a corrupt reply. It needs to return a reply that's actually legal xdr and SEQUENCE results that match the call. Beyond that it probably doesn't matter exactly what it returns--either it handles it as a replay and doesn't bump the seqid, or a new call and does, but either way the seqid ends up in the same place, which is the goal here. OK. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
This prompted a long discussion on the correct server behavior in the case of false retries of bare SEQUENCE calls, which was kind of irrelevant to your actual pynfs patch. Somebody just reminded me that I never applied that. Applied now, thanks.--b. On Wed, Oct 11, 2017 at 09:48:22AM -0700, Thomas Haynes wrote: > From: Tom Haynes <loghyr@primarydata.com> > > 2.10.6.1.3.1. False Retry > > If a requester sent a Sequence operation with a slot ID and sequence > ID that are in the reply cache but the replier detected that the > retried request is not the same as the original request, including a > retry that has different operations or different arguments in the > operations from the original and a retry that uses a different > principal in the RPC request's credential field that translates to a > different user, then this is a false retry. When the replier detects > a false retry, it is permitted (but not always obligated) to return > NFS4ERR_SEQ_FALSE_RETRY in response to the Sequence operation when it > detects a false retry. > > Or in other words, sa_cachethis needs to be set or a > server can respond with an error. > > Signed-off-by: Tom Haynes <loghyr@primarydata.com> > --- > nfs4.1/server41tests/st_sequence.py | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/nfs4.1/server41tests/st_sequence.py b/nfs4.1/server41tests/st_sequence.py > index d8d460c..e1e5f06 100644 > --- a/nfs4.1/server41tests/st_sequence.py > +++ b/nfs4.1/server41tests/st_sequence.py > @@ -115,7 +115,7 @@ def testReplayCache001(t, env): > sess1 = c1.create_session() > res1 = sess1.compound([op.putrootfh()], cache_this=True) > check(res1) > - res2 = sess1.compound([op.putrootfh()], seq_delta=0) > + res2 = sess1.compound([op.putrootfh()], cache_this=True, seq_delta=0) > check(res2) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > @@ -137,7 +137,7 @@ def testReplayCache002(t, env): > op.rename("%s_1" % env.testname(t), "%s_2" % env.testname(t))] > res1 = sess1.compound(ops, cache_this=True) > check(res1) > - res2 = sess1.compound(ops, seq_delta=0) > + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) > check(res2) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > @@ -158,7 +158,7 @@ def testReplayCache003(t, env): > sess1 = c1.create_session() > res1 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True) > check(res1, NFS4ERR_INVAL) > - res2 = sess1.compound([op.putrootfh(), op.lookup("")], seq_delta=0) > + res2 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True, seq_delta=0) > check(res2, NFS4ERR_INVAL) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > @@ -176,7 +176,7 @@ def testReplayCache004(t, env): > ops += [op.savefh(), op.rename("", "foo")] > res1 = sess1.compound(ops, cache_this=True) > check(res1, NFS4ERR_INVAL) > - res2 = sess1.compound(ops, seq_delta=0) > + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) > check(res2, NFS4ERR_INVAL) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > @@ -192,7 +192,7 @@ def testReplayCache005(t, env): > sess1 = c1.create_session() > res1 = sess1.compound([op.illegal()], cache_this=True) > check(res1, NFS4ERR_OP_ILLEGAL) > - res2 = sess1.compound([op.illegal()], seq_delta=0) > + res2 = sess1.compound([op.illegal()], cache_this=True, seq_delta=0) > check(res2, NFS4ERR_OP_ILLEGAL) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > @@ -208,7 +208,7 @@ def testReplayCache006(t, env): > sess = c.create_session() > res1 = sess.compound([]) > check(res1) > - res2 = sess.compound([], seq_delta=0) > + res2 = sess.compound([], cache_this=True, seq_delta=0) > check(res2) > res1.tag = res2.tag = "" > if not nfs4lib.test_equal(res1, res2): > -- > 2.3.6 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Bruce, Do you by any chance have a reference to this discussion? I would like to reference it. For now I'm hijacking this thread to bring this up. I'm still concerned about the case where client sent a request and slot got interrupted (so by default the client doesn't increment the seq#). Then the client re-used the slot for the same kind of operation (WRITE is very interesting) with same arguments but say different FH. Is the server obligated the cache the whole call to address and check that? You have a patch to check for false retries that checks for different creds but I don't think you have something that would catch this case? Thank you. On Tue, Apr 10, 2018 at 3:49 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > This prompted a long discussion on the correct server behavior in the > case of false retries of bare SEQUENCE calls, which was kind of > irrelevant to your actual pynfs patch. Somebody just reminded me that I > never applied that. > > Applied now, thanks.--b. > > On Wed, Oct 11, 2017 at 09:48:22AM -0700, Thomas Haynes wrote: >> From: Tom Haynes <loghyr@primarydata.com> >> >> 2.10.6.1.3.1. False Retry >> >> If a requester sent a Sequence operation with a slot ID and sequence >> ID that are in the reply cache but the replier detected that the >> retried request is not the same as the original request, including a >> retry that has different operations or different arguments in the >> operations from the original and a retry that uses a different >> principal in the RPC request's credential field that translates to a >> different user, then this is a false retry. When the replier detects >> a false retry, it is permitted (but not always obligated) to return >> NFS4ERR_SEQ_FALSE_RETRY in response to the Sequence operation when it >> detects a false retry. >> >> Or in other words, sa_cachethis needs to be set or a >> server can respond with an error. >> >> Signed-off-by: Tom Haynes <loghyr@primarydata.com> >> --- >> nfs4.1/server41tests/st_sequence.py | 12 ++++++------ >> 1 file changed, 6 insertions(+), 6 deletions(-) >> >> diff --git a/nfs4.1/server41tests/st_sequence.py b/nfs4.1/server41tests/st_sequence.py >> index d8d460c..e1e5f06 100644 >> --- a/nfs4.1/server41tests/st_sequence.py >> +++ b/nfs4.1/server41tests/st_sequence.py >> @@ -115,7 +115,7 @@ def testReplayCache001(t, env): >> sess1 = c1.create_session() >> res1 = sess1.compound([op.putrootfh()], cache_this=True) >> check(res1) >> - res2 = sess1.compound([op.putrootfh()], seq_delta=0) >> + res2 = sess1.compound([op.putrootfh()], cache_this=True, seq_delta=0) >> check(res2) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> @@ -137,7 +137,7 @@ def testReplayCache002(t, env): >> op.rename("%s_1" % env.testname(t), "%s_2" % env.testname(t))] >> res1 = sess1.compound(ops, cache_this=True) >> check(res1) >> - res2 = sess1.compound(ops, seq_delta=0) >> + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) >> check(res2) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> @@ -158,7 +158,7 @@ def testReplayCache003(t, env): >> sess1 = c1.create_session() >> res1 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True) >> check(res1, NFS4ERR_INVAL) >> - res2 = sess1.compound([op.putrootfh(), op.lookup("")], seq_delta=0) >> + res2 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True, seq_delta=0) >> check(res2, NFS4ERR_INVAL) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> @@ -176,7 +176,7 @@ def testReplayCache004(t, env): >> ops += [op.savefh(), op.rename("", "foo")] >> res1 = sess1.compound(ops, cache_this=True) >> check(res1, NFS4ERR_INVAL) >> - res2 = sess1.compound(ops, seq_delta=0) >> + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) >> check(res2, NFS4ERR_INVAL) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> @@ -192,7 +192,7 @@ def testReplayCache005(t, env): >> sess1 = c1.create_session() >> res1 = sess1.compound([op.illegal()], cache_this=True) >> check(res1, NFS4ERR_OP_ILLEGAL) >> - res2 = sess1.compound([op.illegal()], seq_delta=0) >> + res2 = sess1.compound([op.illegal()], cache_this=True, seq_delta=0) >> check(res2, NFS4ERR_OP_ILLEGAL) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> @@ -208,7 +208,7 @@ def testReplayCache006(t, env): >> sess = c.create_session() >> res1 = sess.compound([]) >> check(res1) >> - res2 = sess.compound([], seq_delta=0) >> + res2 = sess.compound([], cache_this=True, seq_delta=0) >> check(res2) >> res1.tag = res2.tag = "" >> if not nfs4lib.test_equal(res1, res2): >> -- >> 2.3.6 > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 24, 2018 at 04:10:29PM -0400, Olga Kornievskaia wrote: > Do you by any chance have a reference to this discussion? This: http://lkml.kernel.org/r/1507740502-5151-1-git-send-email-Thomas.Haynes@primarydata.com and this: http://lkml.kernel.org/r/E0161195-9F4A-4B36-A71D-6A924498C893@primarydata.com and followups. > I would like to reference it. For now I'm hijacking this thread to > bring this up. I'm still concerned about the case where client sent a > request and slot got interrupted (so by default the client doesn't > increment the seq#). Then the client re-used the slot for the same > kind of operation (WRITE is very interesting) with same arguments but > say different FH. Is the server obligated the cache the whole call to > address and check that? You have a patch to check for false retries > that checks for different creds but I don't think you have something > that would catch this case? Right. I don't believe the spec requires us to catch false retries in every possible case. That may mean we return a pretty bizarre reply that doesn't match the request, but that's the client's own fault.... --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/nfs4.1/server41tests/st_sequence.py b/nfs4.1/server41tests/st_sequence.py index d8d460c..e1e5f06 100644 --- a/nfs4.1/server41tests/st_sequence.py +++ b/nfs4.1/server41tests/st_sequence.py @@ -115,7 +115,7 @@ def testReplayCache001(t, env): sess1 = c1.create_session() res1 = sess1.compound([op.putrootfh()], cache_this=True) check(res1) - res2 = sess1.compound([op.putrootfh()], seq_delta=0) + res2 = sess1.compound([op.putrootfh()], cache_this=True, seq_delta=0) check(res2) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): @@ -137,7 +137,7 @@ def testReplayCache002(t, env): op.rename("%s_1" % env.testname(t), "%s_2" % env.testname(t))] res1 = sess1.compound(ops, cache_this=True) check(res1) - res2 = sess1.compound(ops, seq_delta=0) + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) check(res2) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): @@ -158,7 +158,7 @@ def testReplayCache003(t, env): sess1 = c1.create_session() res1 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True) check(res1, NFS4ERR_INVAL) - res2 = sess1.compound([op.putrootfh(), op.lookup("")], seq_delta=0) + res2 = sess1.compound([op.putrootfh(), op.lookup("")], cache_this=True, seq_delta=0) check(res2, NFS4ERR_INVAL) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): @@ -176,7 +176,7 @@ def testReplayCache004(t, env): ops += [op.savefh(), op.rename("", "foo")] res1 = sess1.compound(ops, cache_this=True) check(res1, NFS4ERR_INVAL) - res2 = sess1.compound(ops, seq_delta=0) + res2 = sess1.compound(ops, cache_this=True, seq_delta=0) check(res2, NFS4ERR_INVAL) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): @@ -192,7 +192,7 @@ def testReplayCache005(t, env): sess1 = c1.create_session() res1 = sess1.compound([op.illegal()], cache_this=True) check(res1, NFS4ERR_OP_ILLEGAL) - res2 = sess1.compound([op.illegal()], seq_delta=0) + res2 = sess1.compound([op.illegal()], cache_this=True, seq_delta=0) check(res2, NFS4ERR_OP_ILLEGAL) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2): @@ -208,7 +208,7 @@ def testReplayCache006(t, env): sess = c.create_session() res1 = sess.compound([]) check(res1) - res2 = sess.compound([], seq_delta=0) + res2 = sess.compound([], cache_this=True, seq_delta=0) check(res2) res1.tag = res2.tag = "" if not nfs4lib.test_equal(res1, res2):