nfs4.0: add a retry loop on NFS4ERR_DELAY to compound function

Message ID 20230222131100.26472-1-jlayton@kernel.org (mailing list archive)
State New, archived

Commit Message

Jeff Layton Feb. 22, 2023, 1:11 p.m. UTC
The latest knfsd server is "courteous" in that it will not revoke a
lease held by an expired client until there is competing access for it.
When there is competing access, it can now return NFS4ERR_DELAY until
the old client is expired. I've seen this happen when running pynfs in
a loop against a server with only 4g of memory.

The v4.0 compound handler doesn't retry automatically on NFS4ERR_DELAY
like the v4.1 version does. Add support for it using the same timeouts
as the v4.1 compound handler.

Cc: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 nfs4.0/nfs4lib.py | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

Comments

Jeff Layton Feb. 22, 2023, 1:13 p.m. UTC | #1
Oops. The subject should have read [pynfs PATCH]:

On Wed, 2023-02-22 at 08:11 -0500, Jeff Layton wrote:
> The latest knfsd server is "courteous" in that it will not revoke a
> lease held by an expired client until there is competing access for it.
> When there is competing access, it can now return NFS4ERR_DELAY until
> the old client is expired. I've seen this happen when running pynfs in
> a loop against a server with only 4g of memory.
> 
> The v4.0 compound handler doesn't retry automatically on NFS4ERR_DELAY
> like the v4.1 version does. Add support for it using the same timeouts
> as the v4.1 compound handler.
> 
> Cc: Dai Ngo <dai.ngo@oracle.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  nfs4.0/nfs4lib.py | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/nfs4.0/nfs4lib.py b/nfs4.0/nfs4lib.py
> index 9b074f02b91f..eddcd862bc2f 100644
> --- a/nfs4.0/nfs4lib.py
> +++ b/nfs4.0/nfs4lib.py
> @@ -338,12 +338,21 @@ class NFS4Client(rpc.RPCClient):
>          un_p = self.nfs4unpacker
>          p.reset()
>          p.pack_COMPOUND4args(compoundargs)
> -        res = self.call(NFSPROC4_COMPOUND, p.get_buffer())
> -        un_p.reset(res)
> -        res = un_p.unpack_COMPOUND4res()
> -        if SHOW_TRAFFIC:
> -            print(res)
> -        un_p.done()
> +        res = None
> +
> +        # NFS servers can return NFS4ERR_DELAY at any time for any reason.
> +        # Just delay a second and retry the call in that event. If it is
> +        # still failing after 10 attempts then just give up.
> +        for i in range(10):
> +            res = self.call(NFSPROC4_COMPOUND, p.get_buffer())
> +            un_p.reset(res)
> +            res = un_p.unpack_COMPOUND4res()
> +            if SHOW_TRAFFIC:
> +                print(res)
> +            un_p.done()
> +            if res.status != NFS4ERR_DELAY:
> +                break
> +            time.sleep(1)
>  
>          # Do some error checking
>
Patch

diff --git a/nfs4.0/nfs4lib.py b/nfs4.0/nfs4lib.py
index 9b074f02b91f..eddcd862bc2f 100644
--- a/nfs4.0/nfs4lib.py
+++ b/nfs4.0/nfs4lib.py
@@ -338,12 +338,21 @@  class NFS4Client(rpc.RPCClient):
         un_p = self.nfs4unpacker
         p.reset()
         p.pack_COMPOUND4args(compoundargs)
-        res = self.call(NFSPROC4_COMPOUND, p.get_buffer())
-        un_p.reset(res)
-        res = un_p.unpack_COMPOUND4res()
-        if SHOW_TRAFFIC:
-            print(res)
-        un_p.done()
+        res = None
+
+        # NFS servers can return NFS4ERR_DELAY at any time for any reason.
+        # Just delay a second and retry the call in that event. If it is
+        # still failing after 10 attempts then just give up.
+        for i in range(10):
+            res = self.call(NFSPROC4_COMPOUND, p.get_buffer())
+            un_p.reset(res)
+            res = un_p.unpack_COMPOUND4res()
+            if SHOW_TRAFFIC:
+                print(res)
+            un_p.done()
+            if res.status != NFS4ERR_DELAY:
+                break
+            time.sleep(1)
 
         # Do some error checking
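
For anyone who wants to try the retry pattern outside of pynfs, here is a minimal standalone sketch. It is not the pynfs code: compound_with_retry, FakeResult, fake_send and the injectable sleep parameter are all hypothetical names made up for illustration, and the real call/pack/unpack steps are replaced by a caller-supplied function. The only value taken from the protocol is NFS4ERR_DELAY = 10008 (RFC 7530). Unlike the loop in the patch, this sketch skips the sleep after the final attempt.

```python
import time
from dataclasses import dataclass

NFS4_OK = 0
NFS4ERR_DELAY = 10008  # numeric value from RFC 7530


@dataclass
class FakeResult:
    """Hypothetical stand-in for an unpacked COMPOUND4res."""
    status: int


def compound_with_retry(send, max_attempts=10, delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than NFS4ERR_DELAY.

    send is any zero-argument callable returning an object with a
    .status attribute (an assumption for this sketch). The last result
    is returned, and it may still carry NFS4ERR_DELAY if every attempt
    was delayed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    res = None
    for attempt in range(max_attempts):
        res = send()
        if res.status != NFS4ERR_DELAY:
            break
        # No point sleeping if there is no attempt left to make.
        if attempt < max_attempts - 1:
            sleep(delay)
    return res


# Usage: a fake server that answers NFS4ERR_DELAY twice, then succeeds.
replies = iter([FakeResult(NFS4ERR_DELAY),
                FakeResult(NFS4ERR_DELAY),
                FakeResult(NFS4_OK)])
calls = []


def fake_send():
    calls.append(1)
    return next(replies)


# Pass a no-op sleep so the example runs instantly.
result = compound_with_retry(fake_send, sleep=lambda seconds: None)
```

Passing sleep as a parameter is only a convenience for testing. The patch itself calls time.sleep(1) directly.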