Message ID | 87y49ylq76.fsf@notabene.neil.brown.name (mailing list archive)
---|---
State | New, archived
On Fri, Mar 04, 2016 at 05:20:13PM +1100, NeilBrown wrote:
>
> sunrpc_cache_pipe_upcall() can detect a race if CACHE_PENDING is no longer
> set.  In this case it aborts the queuing of the upcall.
> However it has already taken a new counted reference on "h" and
> doesn't "put" it, even though it frees the data structure holding the reference.
>
> So let's delay the "cache_get" until we know we need it.
>
> Fixes: f9e1aedc6c79 ("sunrpc/cache: remove races with queuing an upcall.")
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  net/sunrpc/cache.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> I found this when I was looking for something else.  Testing hasn't
> shown a bug, and nor has it shown that this is bug-free.  But it looks
> right.

Sorry for the delay.  I agree, it seems simple enough; applying for
4.6....

--b.

>
> NeilBrown
>
>
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 273bc3a35425..008c25d1b9f9 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -1182,14 +1182,14 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
>          }
>
>          crq->q.reader = 0;
> -        crq->item = cache_get(h);
>          crq->buf = buf;
>          crq->len = 0;
>          crq->readers = 0;
>          spin_lock(&queue_lock);
> -        if (test_bit(CACHE_PENDING, &h->flags))
> +        if (test_bit(CACHE_PENDING, &h->flags)) {
> +                crq->item = cache_get(h);
>                  list_add_tail(&crq->q.list, &detail->queue);
> -        else
> +        } else
>                  /* Lost a race, no longer PENDING, so don't enqueue */
>                  ret = -EAGAIN;
>          spin_unlock(&queue_lock);
> --
> 2.7.0
Hello,

On Mon, Mar 14, 2016 at 9:38 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> Sorry for the delay.  I agree, it seems simple enough; applying for
> 4.6....

Is it something we can consider for stable?

Thanks,
On Tue, Mar 15, 2016 at 01:43:37PM +0100, William Dauchy wrote:
> Hello,
>
> On Mon, Mar 14, 2016 at 9:38 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > Sorry for the delay.  I agree, it seems simple enough; applying for
> > 4.6....
>
> Is it something we can consider for stable?

It's an unlikely-looking race, I haven't seen it reported in the wild,
and the consequences are just a reference-count leak (though admittedly
those can sometimes cause crashes later--I haven't checked for that
here).

So I wasn't planning to, but any additional information's welcomed.

--b.
On Wed, Mar 16 2016, J. Bruce Fields wrote:

> On Tue, Mar 15, 2016 at 01:43:37PM +0100, William Dauchy wrote:
>> Hello,
>>
>> On Mon, Mar 14, 2016 at 9:38 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>> > Sorry for the delay.  I agree, it seems simple enough; applying for
>> > 4.6....
>>
>> Is it something we can consider for stable?
>
> It's an unlikely-looking race, I haven't seen it reported in the wild,
> and the consequences are just a reference-count leak (though admittedly
> those can sometimes cause crashes later--I haven't checked for that
> here).
>
> So I wasn't planning to, but any additional information's welcomed.

I agree that it isn't needed for stable - the race is tiny and the
consequence of losing the race is that entries get stuck in the cache
and possibly an exported filesystem cannot be unmounted.  I don't think
that is a problem severe enough to meet the rules for -stable - though
I know the rules aren't always closely followed.

I wouldn't exactly object to it going to stable if someone wanted to
push, but I would rather it didn't appear in -stable before appearing
in a released (not -rc) kernel from Linus.

NeilBrown
Hi Bruce,

Thank you for your reply.

On Tue, Mar 15, 2016 at 9:37 PM, NeilBrown <neilb@suse.com> wrote:
> I agree that it isn't needed for stable - the race is tiny and the
> consequence of losing the race is that entries get stuck in the cache
> and possibly an exported filesystem cannot be unmounted.

I believe that in industrial usage this race will become more common;
moreover, the consequence of an exported filesystem that can't be
unmounted is not something I can accept in some environments.

I accept your judgment about including it or not in -stable, but I am
just complaining about the lack of further consideration for these
types of fixes in nfs generally speaking.  In industrial usage, we hit
races far more easily than in normal usage; lots of them are already
fixed and we do the backports ourselves.  We are at a point where it
becomes almost impossible to come up with proof and say "we are hitting
this race, and this patch fixes the issue, please backport to -stable".
On Wed, Mar 16, 2016 at 08:31:03AM +0100, William Dauchy wrote:
> Hi Bruce,
>
> Thank you for your reply.
>
> On Tue, Mar 15, 2016 at 9:37 PM, NeilBrown <neilb@suse.com> wrote:
> > I agree that it isn't needed for stable - the race is tiny and the
> > consequence of losing the race is that entries get stuck in the cache
> > and possibly an exported filesystem cannot be unmounted.
>
> I believe that in industrial usage this race will become more common;

Why is that?

> moreover, the consequence of an exported filesystem that can't be
> unmounted is not something I can accept in some environments.
>
> I accept your judgment about including it or not in -stable, but I am
> just complaining about the lack of further consideration for these
> types of fixes in nfs generally speaking.  In industrial usage, we hit
> races far more easily than in normal usage; lots of them are already
> fixed and we do the backports ourselves.

Do you have a list?

> We are at a point where it becomes almost impossible to come up with
> proof and say "we are hitting this race, and this patch fixes the
> issue, please backport to -stable".

I don't expect *proof*, necessarily, but some sort of argument would be
helpful.

--b.
Hi Bruce,

Thank you for your reply.

On Wed, Mar 16, 2016 at 4:05 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> Why is that?

It's indeed not really based on actual proof but on the experience we
have had.  We faced several bugs, and it appears we had fewer issues
after backporting a few commits, e.g. ones related to races, leaks, and
performance regressions.  We did not spend enough time on it to
actually show that this commit was fixing an issue we hit.

> Do you have a list?

I am currently trying to switch from v3.14 to v4.1 (nfs client).  So
even though it is off topic, these are the ones I have in mind:

0d2a970 SUNRPC: Fix a backchannel race
6851447 SUNRPC: Fix a backchannel deadlock
0993920 SUNRPC: Prevent SYN+SYNACK+RST storms
03c78827 SUNRPC: Fix races between socket connection and destroy code
a41cbe8 Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
39d0d3b NFS: Fix a tracepoint NULL-pointer dereference
5e99b53 nfs4: reset states to use open_stateid when returning delegation voluntarily
e92c1e0 NFSv4: Fix a nograce recovery hang
6b55970 nfs: Fix a memory leak when meeting an unsupported state protect

I am probably wrong on most of them, but we had better stability after
applying them, and I probably forgot some.

> I don't expect *proof*, necessarily, but some sort of argument would
> be helpful.

In my environment, if I hit this race, not being able to unmount the
filesystem is not acceptable.

Thanks,
On Wed, Mar 16, 2016 at 04:38:55PM +0100, William Dauchy wrote:
> Hi Bruce,
>
> Thank you for your reply.
>
> On Wed, Mar 16, 2016 at 4:05 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > Why is that?
>
> It's indeed not really based on actual proof but on the experience we
> have had.  We faced several bugs, and it appears we had fewer issues
> after backporting a few commits, e.g. ones related to races, leaks, and
> performance regressions.  We did not spend enough time on it to
> actually show that this commit was fixing an issue we hit.
>
> > Do you have a list?
>
> I am currently trying to switch from v3.14 to v4.1 (nfs client).  So
> even though it is off topic, these are the ones I have in mind:
>
> 0d2a970 SUNRPC: Fix a backchannel race
> 6851447 SUNRPC: Fix a backchannel deadlock
> 0993920 SUNRPC: Prevent SYN+SYNACK+RST storms
> 03c78827 SUNRPC: Fix races between socket connection and destroy code
> a41cbe8 Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
> 39d0d3b NFS: Fix a tracepoint NULL-pointer dereference
> 5e99b53 nfs4: reset states to use open_stateid when returning delegation voluntarily
> e92c1e0 NFSv4: Fix a nograce recovery hang
> 6b55970 nfs: Fix a memory leak when meeting an unsupported state protect
>
> I am probably wrong on most of them, but we had better stability after
> applying them, and I probably forgot some.

Note options 2 and 3 in Documentation/stable_kernel_rules.txt.  You can
send a backport to stable@vger.kernel.org yourself.  When you do so,
please cc: the relevant maintainers (looks like Trond and Anna for the
above) and linux-nfs@vger.kernel.org, so they can ACK/NACK.

> > I don't expect *proof*, necessarily, but some sort of argument would
> > be helpful.
>
> In my environment, if I hit this race, not being able to unmount the
> filesystem is not acceptable.

OK.  I'd be interested in your use case if you get the chance.

--b.
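For reference, and as an illustration only (the wording below is mine, not a verbatim template from that file): option 1 of those rules is to tag the patch itself so the stable team picks it up automatically, by adding a line in the sign-off area such as

    Cc: stable@vger.kernel.org # <first affected kernel version>+
    Signed-off-by: ...

while options 2 and 3, which Bruce points to here, amount to mailing stable@vger.kernel.org after the fix has landed in Linus' tree, naming the upstream commit ID, the stable series it should be applied to, and a short justification.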
On Wed, Mar 16, 2016 at 5:00 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> OK. I'd be interested in your use case if you get the chance.
Well, in our case, we have temporary instances which mount the needed
filesystem in order to do temporary work and clean up before leaving.
In other cases, there is only one connection per nfs server and all the
filesystems are bound to it.
Thanks,
On Thu, Mar 17 2016, William Dauchy wrote:

> Hi Bruce,
>
> Thank you for your reply.
>
> On Wed, Mar 16, 2016 at 4:05 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>> Why is that?
>
> It's indeed not really based on actual proof but on the experience we
> have had.  We faced several bugs, and it appears we had fewer issues
> after backporting a few commits, e.g. ones related to races, leaks, and
> performance regressions.  We did not spend enough time on it to
> actually show that this commit was fixing an issue we hit.
>
>> Do you have a list?
>
> I am currently trying to switch from v3.14 to v4.1 (nfs client).  So
> even though it is off topic, these are the ones I have in mind:
>
> 0d2a970 SUNRPC: Fix a backchannel race
> 6851447 SUNRPC: Fix a backchannel deadlock
> 0993920 SUNRPC: Prevent SYN+SYNACK+RST storms
> 03c78827 SUNRPC: Fix races between socket connection and destroy code
> a41cbe8 Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
> 39d0d3b NFS: Fix a tracepoint NULL-pointer dereference
> 5e99b53 nfs4: reset states to use open_stateid when returning delegation voluntarily
> e92c1e0 NFSv4: Fix a nograce recovery hang
> 6b55970 nfs: Fix a memory leak when meeting an unsupported state protect

Commit: 7632e465feb1 ("dcache: use IS_ROOT to decide where dentry is
hashed") is my most recent complaint - it really should have been tagged
for -stable.  It is now in 3.12-stable, over 2 years later :-(

NeilBrown

> I am probably wrong on most of them, but we had better stability after
> applying them, and I probably forgot some.
>
>> I don't expect *proof*, necessarily, but some sort of argument would
>> be helpful.
>
> In my environment, if I hit this race, not being able to unmount the
> filesystem is not acceptable.
>
> Thanks,
> --
> William
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 273bc3a35425..008c25d1b9f9 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1182,14 +1182,14 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h)
         }

         crq->q.reader = 0;
-        crq->item = cache_get(h);
         crq->buf = buf;
         crq->len = 0;
         crq->readers = 0;
         spin_lock(&queue_lock);
-        if (test_bit(CACHE_PENDING, &h->flags))
+        if (test_bit(CACHE_PENDING, &h->flags)) {
+                crq->item = cache_get(h);
                 list_add_tail(&crq->q.list, &detail->queue);
-        else
+        } else
                 /* Lost a race, no longer PENDING, so don't enqueue */
                 ret = -EAGAIN;
         spin_unlock(&queue_lock);
sunrpc_cache_pipe_upcall() can detect a race if CACHE_PENDING is no longer
set.  In this case it aborts the queuing of the upcall.
However it has already taken a new counted reference on "h" and
doesn't "put" it, even though it frees the data structure holding the reference.

So let's delay the "cache_get" until we know we need it.

Fixes: f9e1aedc6c79 ("sunrpc/cache: remove races with queuing an upcall.")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 net/sunrpc/cache.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

I found this when I was looking for something else.  Testing hasn't
shown a bug, and nor has it shown that this is bug-free.  But it looks
right.

NeilBrown
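For readers less familiar with the sunrpc cache code, here is a minimal userspace sketch of the pattern the patch establishes.  It is not the kernel code: the names (struct item, struct upcall, try_queue_upcall, item_get) are invented for illustration, a pthread mutex stands in for queue_lock, and a plain flag stands in for the CACHE_PENDING bit.  The point is only the ordering the diff above introduces: take the counted reference after the check under the lock has confirmed the entry will really be queued, so the lost-race path has nothing it could leak.

/*
 * Simplified userspace sketch of the ordering in the fix above.
 * Names are made up for illustration; this is not the kernel code.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct item {
        int refcount;          /* stands in for the cache_head refcount */
        bool pending;          /* stands in for the CACHE_PENDING bit */
};

struct upcall {
        struct item *item;     /* counted reference, like crq->item */
        struct upcall *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct upcall *queue_head;

static struct item *item_get(struct item *it)
{
        it->refcount++;        /* simplified; assume refcount is otherwise protected */
        return it;
}

/* Returns 0 if queued, -1 if we lost the race and nothing was queued. */
static int try_queue_upcall(struct item *it)
{
        struct upcall *u = calloc(1, sizeof(*u));
        int ret = 0;

        if (!u)
                return -1;

        pthread_mutex_lock(&queue_lock);
        if (it->pending) {
                /* Only now do we know the upcall will really be queued,
                 * so only now take the counted reference. */
                u->item = item_get(it);
                u->next = queue_head;
                queue_head = u;
        } else {
                /* Lost the race: no reference was taken, so freeing the
                 * request below leaks nothing. */
                ret = -1;
        }
        pthread_mutex_unlock(&queue_lock);

        if (ret)
                free(u);
        return ret;
}

int main(void)
{
        struct item it = { .refcount = 1, .pending = false };

        /* pending is false, so this models losing the race:
         * the refcount must still be 1 afterwards. */
        try_queue_upcall(&it);
        printf("refcount after lost race: %d (expect 1)\n", it.refcount);
        return 0;
}

Built with "cc -pthread", this should print a refcount of 1, showing that the lost-race path takes no reference -- the same property the one-line move of cache_get() gives the kernel code.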