Message ID | 20130716140021.312b5b07@notabene.brown (mailing list archive)
---|---
State | New, archived
Adding Ben Myers to Cc: in case he has any testing help or advice (see
below):

On Tue, Jul 16, 2013 at 02:00:21PM +1000, NeilBrown wrote:
> That would be because I only allowed extra requests up to half the number of
> idle threads, and there would be zero idle threads.
>
> So yes - I see your point.
>
> I now think that it is sufficient to just allow one request through per
> socket.  While there is high memory pressure, that is sufficient.  When the
> memory pressure drops, that should be enough to cause sndbuf to grow.
>
> We should be able to use xpt_reserved to check if this is the only request
> or not:

A remaining possible bad case is if the number of "bad" sockets is more
than the number of threads, and if the problem is something that won't
resolve itself by just waiting a little while.

I don't know if that's likely in practice.  Maybe a network failure cuts
you off from a large swath (but not all) of your clients?  Or maybe some
large proportion of your clients are just slow?

But this is two lines and looks likely to solve the immediate problem:

> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 80a6640..5b832ef 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
>  	if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
>  		return true;
>  	if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
> -		return xprt->xpt_ops->xpo_has_wspace(xprt);
> +		return xprt->xpt_ops->xpo_has_wspace(xprt) ||
> +			atomic_read(&xprt->xpt_reserved) == 0;
>  	return false;
>  }
>
> @@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
>  		newxpt = xprt->xpt_ops->xpo_accept(xprt);
>  		if (newxpt)
>  			svc_add_new_temp_xprt(serv, newxpt);
> -	} else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
> +	} else if (xprt->xpt_ops->xpo_has_wspace(xprt) ||
> +		   atomic_read(&xprt->xpt_reserved) == 0) {
>  		/* XPT_DATA|XPT_DEFERRED case: */
>  		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
>  			rqstp, rqstp->rq_pool->sp_id, xprt,
>
>
> Thoughts?
>
> I tried testing this, but xpo_has_wspace never fails for me, even if I remove
> the calls to svc_sock_setbufsize (which probably aren't wanted for TCP any
> more).

Ben, do you still have a setup that can reproduce the problem?  Or did you
ever find an easier way to reproduce?

--b.
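To spell out the condition the two hunks above open-code twice, here is an
annotated sketch.  The helper name svc_xprt_may_enqueue is invented for
illustration and does not exist in the tree; only the body corresponds to the
posted patch.

/*
 * Illustration only: admit work when the transport still reports write
 * space, or when xpt_reserved is zero.  xpt_reserved counts the reply
 * space reserved by requests already admitted on this transport (each
 * request reserves sv_max_mesg until its reply is sent), so a value of
 * zero means nothing is in flight and exactly one request is let
 * through even while xpo_has_wspace() keeps failing.
 */
static bool svc_xprt_may_enqueue(struct svc_xprt *xprt)
{
        return xprt->xpt_ops->xpo_has_wspace(xprt) ||
               atomic_read(&xprt->xpt_reserved) == 0;
}

The point of the "== 0" fallback is that it should bound the extra reply
memory at roughly one sv_max_mesg per socket, instead of deadlocking when
every socket's sndbuf has collapsed under memory pressure.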
Hey Bruce,

On Tue, Jul 16, 2013 at 10:24:30AM -0400, J.Bruce Fields wrote:
> Adding Ben Myers to Cc: in case he has any testing help or advice (see
> below):
>
> On Tue, Jul 16, 2013 at 02:00:21PM +1000, NeilBrown wrote:
> > That would be because I only allowed extra requests up to half the number of
> > idle threads, and there would be zero idle threads.
> >
> > So yes - I see your point.
> >
> > I now think that it is sufficient to just allow one request through per
> > socket.  While there is high memory pressure, that is sufficient.  When the
> > memory pressure drops, that should be enough to cause sndbuf to grow.
> >
> > We should be able to use xpt_reserved to check if this is the only request
> > or not:
>
> A remaining possible bad case is if the number of "bad" sockets is more
> than the number of threads, and if the problem is something that won't
> resolve itself by just waiting a little while.
>
> I don't know if that's likely in practice.  Maybe a network failure cuts
> you off from a large swath (but not all) of your clients?  Or maybe some
> large proportion of your clients are just slow?
>
> But this is two lines and looks likely to solve the immediate problem:
>
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index 80a6640..5b832ef 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
> >  	if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
> >  		return true;
> >  	if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
> > -		return xprt->xpt_ops->xpo_has_wspace(xprt);
> > +		return xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > +			atomic_read(&xprt->xpt_reserved) == 0;
> >  	return false;
> >  }
> >
> > @@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
> >  		newxpt = xprt->xpt_ops->xpo_accept(xprt);
> >  		if (newxpt)
> >  			svc_add_new_temp_xprt(serv, newxpt);
> > -	} else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
> > +	} else if (xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > +		   atomic_read(&xprt->xpt_reserved) == 0) {
> >  		/* XPT_DATA|XPT_DEFERRED case: */
> >  		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> >  			rqstp, rqstp->rq_pool->sp_id, xprt,
> >
> >
> > Thoughts?
> >
> > I tried testing this, but xpo_has_wspace never fails for me, even if I remove
> > the calls to svc_sock_setbufsize (which probably aren't wanted for TCP any
> > more).
>
> Ben, do you still have a setup that can reproduce the problem?  Or did you
> ever find an easier way to reproduce?

Unfortunately I don't still have that test rig handy, and I did not find an
easier way to reproduce this.  I do agree that it 'is sufficient to just allow
one request through per socket'... probably that would have solved the problem
in our case.  Maybe it would help to limit the amount of memory in your system
to try and induce additional memory pressure?  IIRC it was a 1mb block size
read workload where we would hit this.

Thanks,
	Ben
On Wed, Jul 17, 2013 at 07:03:19PM -0500, Ben Myers wrote:
> Hey Bruce,
>
> On Tue, Jul 16, 2013 at 10:24:30AM -0400, J.Bruce Fields wrote:
> > Adding Ben Myers to Cc: in case he has any testing help or advice (see
> > below):
> >
> > On Tue, Jul 16, 2013 at 02:00:21PM +1000, NeilBrown wrote:
> > > That would be because I only allowed extra requests up to half the number of
> > > idle threads, and there would be zero idle threads.
> > >
> > > So yes - I see your point.
> > >
> > > I now think that it is sufficient to just allow one request through per
> > > socket.  While there is high memory pressure, that is sufficient.  When the
> > > memory pressure drops, that should be enough to cause sndbuf to grow.
> > >
> > > We should be able to use xpt_reserved to check if this is the only request
> > > or not:
> >
> > A remaining possible bad case is if the number of "bad" sockets is more
> > than the number of threads, and if the problem is something that won't
> > resolve itself by just waiting a little while.
> >
> > I don't know if that's likely in practice.  Maybe a network failure cuts
> > you off from a large swath (but not all) of your clients?  Or maybe some
> > large proportion of your clients are just slow?
> >
> > But this is two lines and looks likely to solve the immediate problem:
> >
> > > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > > index 80a6640..5b832ef 100644
> > > --- a/net/sunrpc/svc_xprt.c
> > > +++ b/net/sunrpc/svc_xprt.c
> > > @@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
> > >  	if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
> > >  		return true;
> > >  	if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
> > > -		return xprt->xpt_ops->xpo_has_wspace(xprt);
> > > +		return xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > > +			atomic_read(&xprt->xpt_reserved) == 0;
> > >  	return false;
> > >  }
> > >
> > > @@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
> > >  		newxpt = xprt->xpt_ops->xpo_accept(xprt);
> > >  		if (newxpt)
> > >  			svc_add_new_temp_xprt(serv, newxpt);
> > > -	} else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
> > > +	} else if (xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > > +		   atomic_read(&xprt->xpt_reserved) == 0) {
> > >  		/* XPT_DATA|XPT_DEFERRED case: */
> > >  		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> > >  			rqstp, rqstp->rq_pool->sp_id, xprt,
> > >
> > >
> > > Thoughts?
> > >
> > > I tried testing this, but xpo_has_wspace never fails for me, even if I remove
> > > the calls to svc_sock_setbufsize (which probably aren't wanted for TCP any
> > > more).
> >
> > Ben, do you still have a setup that can reproduce the problem?  Or did you
> > ever find an easier way to reproduce?
>
> Unfortunately I don't still have that test rig handy, and I did not find an
> easier way to reproduce this.  I do agree that it 'is sufficient to just allow
> one request through per socket'... probably that would have solved the problem
> in our case.  Maybe it would help to limit the amount of memory in your system
> to try and induce additional memory pressure?  IIRC it was a 1mb block size
> read workload where we would hit this.

That makes sense.  Or I wonder if you could cheat and artificially force
the tcp code to behave as if there was memory pressure.  (E.g. by calling
tcp_enter_memory_pressure on some trigger).

Neil, are you still looking at this?  (If you're done with it, could you
resend that patch with a signed-off-by?)

--b.
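For anyone who wants to try Bruce's "cheat", the throwaway module below is one
possible sketch of such a trigger, not something posted or tested in this
thread.  It assumes the 3.10-era sock_create_kern() signature (no struct net
argument), that tcp_enter_memory_pressure() is exported to modules, and that
flipping the global tcp_memory_pressure flag is enough to make TCP start
moderating send buffers; the stack clears the flag again once allocated TCP
memory drops back under the low threshold, so this may only buy a brief
test window.

#include <linux/init.h>
#include <linux/module.h>
#include <linux/net.h>
#include <linux/in.h>
#include <net/sock.h>
#include <net/tcp.h>

/* Create a short-lived kernel TCP socket purely so we have a struct sock
 * to hand to tcp_enter_memory_pressure(), which sets the global
 * tcp_memory_pressure flag and bumps an SNMP counter.  Loading the module
 * is the "trigger"; unloading does nothing. */
static int __init fake_tcp_pressure_init(void)
{
        struct socket *sock;
        int err;

        err = sock_create_kern(PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
        if (err)
                return err;

        tcp_enter_memory_pressure(sock->sk);
        sock_release(sock);
        return 0;
}

static void __exit fake_tcp_pressure_exit(void)
{
}

module_init(fake_tcp_pressure_init);
module_exit(fake_tcp_pressure_exit);
MODULE_LICENSE("GPL");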
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 80a6640..5b832ef 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
 	if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
 		return true;
 	if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
-		return xprt->xpt_ops->xpo_has_wspace(xprt);
+		return xprt->xpt_ops->xpo_has_wspace(xprt) ||
+			atomic_read(&xprt->xpt_reserved) == 0;
 	return false;
 }

@@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 		newxpt = xprt->xpt_ops->xpo_accept(xprt);
 		if (newxpt)
 			svc_add_new_temp_xprt(serv, newxpt);
-	} else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
+	} else if (xprt->xpt_ops->xpo_has_wspace(xprt) ||
+		   atomic_read(&xprt->xpt_reserved) == 0) {
 		/* XPT_DATA|XPT_DEFERRED case: */
 		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
 			rqstp, rqstp->rq_pool->sp_id, xprt,