Message ID | 19989.27202.793003.725608@regina.usersys.redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, 2011-07-07 at 18:11 +1000, Max Matveev wrote:
> I've had to look at the way NFS/TCP does its timeouts and backoff
> and it does not make a lot of sense to me: according to the
> following paragraph from nfs(5) on Fedora 14 (I'm using Fedora 14
> because it has more text than the same page in nfs-utils):
>
>    timeo=n    The time (in tenths of a second) the NFS client waits
>               for a response before it retries an NFS request. If this
>               option is not specified, requests are retried every 60
>               seconds for NFS over TCP. The NFS client does not perform
>               any kind of timeout backoff for NFS over TCP.
>
> but if I try the mount with timeo=20,retrans=7 then I'm getting
> retransmits which are 2, 4, 6, 8, 2, 4, 6, 8 seconds apart, i.e.
> there is a) linear backoff and b) the backoff is not long enough to
> let the complete sequence of 7 retransmits run its course.

Sigh... Firstly, 2 second timeouts are complete lunacy when using a
protocol that guarantees reliable delivery, such as TCP does. Anyone who
tries it deserves exactly what they get: poor unreliable performance.

Secondly, the _other_ fix for this problem is to fix the documentation.

Trond
On Jul 7, 2011, at 9:47 AM, Trond Myklebust wrote:
> On Thu, 2011-07-07 at 18:11 +1000, Max Matveev wrote:
>> I've had to look at the way NFS/TCP does its timeouts and backoff
>> and it does not make a lot of sense to me: according to the
>> following paragraph from nfs(5) on Fedora 14 (I'm using Fedora 14
>> because it has more text than the same page in nfs-utils):
>>
>>    timeo=n    The time (in tenths of a second) the NFS client waits
>>               for a response before it retries an NFS request. If this
>>               option is not specified, requests are retried every 60
>>               seconds for NFS over TCP. The NFS client does not perform
>>               any kind of timeout backoff for NFS over TCP.
>>
>> but if I try the mount with timeo=20,retrans=7 then I'm getting
>> retransmits which are 2, 4, 6, 8, 2, 4, 6, 8 seconds apart, i.e.
>> there is a) linear backoff and b) the backoff is not long enough to
>> let the complete sequence of 7 retransmits run its course.
>
> Sigh... Firstly, 2 second timeouts are complete lunacy when using a
> protocol that guarantees reliable delivery, such as TCP does. Anyone who
> tries it deserves exactly what they get: poor unreliable performance.

We shouldn't allow such low settings.

> Secondly, the _other_ fix for this problem is to fix the documentation.

How is the documentation incorrect? We do not want any kind of back-off
for stream transports.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 07 Jul 2011 09:47:19 -0400, Trond Myklebust wrote:

 TM> On Thu, 2011-07-07 at 18:11 +1000, Max Matveev wrote:
 TM> Sigh... Firstly, 2 second timeouts are complete lunacy when using a
 TM> protocol that guarantees reliable delivery, such as TCP does. Anyone who
 TM> tries it deserves exactly what they get: poor unreliable performance.

2 seconds is beside the point - I'm not going to wait for 28 minutes
(timeo=600,retrans=7) when doing timeout testing just to prove that
it does not work.

 TM> Secondly, the _other_ fix for this problem is to fix the documentation.

I can live with that too.

max
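The 28-minute figure above is consistent with summing a linearly growing timeout: with a first wait of 60 seconds (600 tenths of a second) and seven waits in total, the waits would be 60, 120, ..., 420 seconds. Below is a minimal sketch of that arithmetic, assuming each successive wait is one initial timeout longer than the previous one. `linear_backoff_total` is a hypothetical helper written for illustration, not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): total time in seconds spent
 * waiting if the first wait is timeo_ds tenths of a second and each
 * later wait grows by that same amount, over `waits` waits in total.
 */
long linear_backoff_total(long timeo_ds, int waits)
{
	long total_ds = 0;
	int k;

	assert(timeo_ds >= 0 && waits >= 0);
	for (k = 1; k <= waits; k++)
		total_ds += timeo_ds * k;	/* k-th wait is k * timeo */
	return total_ds / 10;			/* tenths of a second -> seconds */
}
```

Calling `linear_backoff_total(600, 7)` gives 1680 seconds, i.e. 28 minutes, which is why a short `timeo` is attractive for testing even if it is a poor production setting.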
On Thu, 07 Jul 2011 10:16:53 -0400, Trond Myklebust wrote:
> Anyway, why shouldn't we back off if the server is failing to respond?
Wasn't the no-backoff/drop-connection approach what Mike Eisler was
advocating back in '06 during Connectathon?
http://www.connectathon.org/talks06/eisler.pdf
I think this was the trigger to go from exponential backoff to linear.
max
Resuming a conversation from last month - it fizzled out and died
without any resolution...

On Thu, 07 Jul 2011 10:59:12 -0400, Trond Myklebust wrote:

 trond> Looking at the code:
 trond>   v2.6.0: exponential back off
 trond>   v2.6.4: exponential back off
 trond>   v2.6.9: exponential back off
 trond>   v2.6.16: linear back off
 trond>   v2.6.18: linear back off
 trond>   v2.6.24: linear back off
 trond>   v2.6.32: linear back off
 trond> ....
 trond> So I've no idea what you were testing.

I'm going to assume that we're keeping linear backoff and send a patch
which corrects the calculation of to_maxval for this case.

 >> So it seems to me the kernel has diverged (perhaps long ago) from
 >> the documentation, not the other way around.

 trond> Nope. The documentation has simply always been inaccurate afaics from
 trond> the above inspection.

And another one to update the documentation.

max
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -606,7 +606,8 @@ static void nfs_init_timeout_values(struct rpc_timeout *to,
 		if (to->to_initval > NFS_MAX_TCP_TIMEOUT)
 			to->to_initval = NFS_MAX_TCP_TIMEOUT;
 		to->to_increment = to->to_initval;
-		to->to_maxval = to->to_initval + (to->to_increment * to->to_retries);
+		to->to_maxval = to->to_increment * (to->to_retries + 1)
+				* (to->to_retries + 2) / 2;
 		if (to->to_maxval > NFS_MAX_TCP_TIMEOUT)
 			to->to_maxval = NFS_MAX_TCP_TIMEOUT;
 		if (to->to_maxval < to->to_initval)

or don't do the linear backoff

--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -546,7 +546,7 @@ static void xprt_reset_majortimeo(struct rpc_rqst *req)
 	if (to->to_exponential)
 		req->rq_majortimeo <<= to->to_retries;
 	else
-		req->rq_majortimeo += to->to_increment * to->to_retries;
+		req->rq_majortimeo += to->to_increment;
 	if (req->rq_majortimeo > to->to_maxval || req->rq_majortimeo == 0)
 		req->rq_majortimeo = to->to_maxval;
 	req->rq_majortimeo += jiffies;
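To see why the old cap cuts the retransmit sequence short, the two `to_maxval` formulas from the first hunk can be compared in a small userspace model. This is only a sketch under stated assumptions: it works in whole seconds rather than jiffies, it assumes each wait grows by one increment, and it assumes the sequence restarts from the initial timeout once the time waited in the current round reaches a window called `major` here. The function names are hypothetical, and the model is not the kernel's actual code path.

```c
#include <assert.h>

/* Cap before the first patch: initval + increment * retries. */
long maxval_old(long initval, long increment, int retries)
{
	return initval + increment * retries;
}

/* Cap after the first patch: increment * (1 + 2 + ... + (retries + 1)). */
long maxval_patched(long increment, int retries)
{
	return increment * (retries + 1) * (retries + 2) / 2;
}

/*
 * Simplified model (an assumption, not kernel code): every wait is one
 * increment longer than the previous one, and the sequence restarts
 * from initval once the time waited in the current round reaches
 * `major`. Fills out[0..n-1] with the successive waits.
 */
void simulate_waits(long initval, long increment, long major,
		    long *out, int n)
{
	long wait = initval, elapsed = 0;
	int i;

	assert(initval > 0 && increment >= 0 && n >= 0);
	for (i = 0; i < n; i++) {
		out[i] = wait;
		elapsed += wait;
		if (elapsed < major) {
			wait += increment;	/* linear backoff */
		} else {
			wait = initval;		/* window used up: start over */
			elapsed = 0;
		}
	}
}
```

For timeo=20,retrans=7 (2-second initial timeout and increment), `maxval_old(2, 2, 7)` is 16 and `maxval_patched(2, 7)` is 72. In this model a 16-second window yields waits of 2, 4, 6, 8, 2, 4, 6, 8, matching the pattern reported at the start of the thread, while a 72-second window leaves room for eight waits of 2 through 16 seconds. Whether changing the cap alone has that effect in the kernel depends on code not shown in this thread.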