Message ID | 20121130020047.GA4939@ZenIV.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
On 29/11/12 06:00 PM, Al Viro wrote:
> On Thu, Nov 29, 2012 at 05:54:02PM -0800, Patrick McLean wrote:
>>> Very interesting.  Do you have anything mounted on the corresponding
>>> directories on server?  The picture looks like you are getting empty
>>> fhandles in readdir+ response for exactly the same directories that happen
>>> to be mountpoints on client.  In any case, we shouldn't do that blind
>>> d_drop() - empty fhandles can happen.  The only remaining question is
>>> why do they happen on that set of entries.  From my reading of
>>> encode_entryplus_baggage() it looks like we have compose_entry_fh()
>>> failing for those entries and those entries alone.  One possible cause
>>> would be d_mountpoint(dchild) being true on server.  If it is true, we
>>> can declare the case closed; if not, I really wonder what's going on.
>>
>> Those directories do have the server's own copies of the said directories bind mounted at the moment in a separate mount namespace.
>>
>> Unmounting those directories on the server does appear to stop the WARN_ON from triggering.
>
> OK, that settles it.  WARN_ON() and printks in the area can be dropped;
> the right fix is below.  However, there's a similar place in cifs that
> also needs to be dealt with and I really, really wonder why the hell do
> we do d_drop() in nfs_revalidate_lookup().  It's not relevant in this
> bug, but I would like to understand what's wrong with simply returning
> 0 from ->d_revalidate() and letting the caller (in fs/namei.c) take care
> of unhashing, etc. itself.  Would make have_submounts() in there pointless
> as well - we could just return 0 and let d_invalidate() take care of the
> checks...  Trond?
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -450,7 +450,8 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
>  			nfs_refresh_inode(dentry->d_inode, entry->fattr);
>  			goto out;
>  		} else {
> -			d_drop(dentry);
> +			if (d_invalidate(dentry) != 0)
> +				goto out;
>  			dput(dentry);
>  		}
>  	}

Excellent, thanks. Is there any chance this will make it to 3.7? Also we might want to cc stable@ on this as well since it is a regression in 3.6.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 29, 2012 at 06:33:53PM -0800, Patrick McLean wrote:
> Excellent, thanks. Is there any chance this will make it to 3.7? Also we might want to cc stable@ on this as well since it is a regression in 3.6.
Definitely. I've dropped that into vfs.git#for-linus and vfs.git#for-next
and tomorrow to Linus it goes...
On Fri, 2012-11-30 at 02:00 +0000, Al Viro wrote:
> On Thu, Nov 29, 2012 at 05:54:02PM -0800, Patrick McLean wrote:
> > > Very interesting.  Do you have anything mounted on the corresponding
> > > directories on server?  The picture looks like you are getting empty
> > > fhandles in readdir+ response for exactly the same directories that happen
> > > to be mountpoints on client.  In any case, we shouldn't do that blind
> > > d_drop() - empty fhandles can happen.  The only remaining question is
> > > why do they happen on that set of entries.  From my reading of
> > > encode_entryplus_baggage() it looks like we have compose_entry_fh()
> > > failing for those entries and those entries alone.  One possible cause
> > > would be d_mountpoint(dchild) being true on server.  If it is true, we
> > > can declare the case closed; if not, I really wonder what's going on.
> >
> > Those directories do have the server's own copies of the said directories bind mounted at the moment in a separate mount namespace.
> >
> > Unmounting those directories on the server does appear to stop the WARN_ON from triggering.
>
> OK, that settles it.  WARN_ON() and printks in the area can be dropped;
> the right fix is below.  However, there's a similar place in cifs that
> also needs to be dealt with and I really, really wonder why the hell do
> we do d_drop() in nfs_revalidate_lookup().  It's not relevant in this
> bug, but I would like to understand what's wrong with simply returning
> 0 from ->d_revalidate() and letting the caller (in fs/namei.c) take care
> of unhashing, etc. itself.  Would make have_submounts() in there pointless
> as well - we could just return 0 and let d_invalidate() take care of the
> checks...  Trond?

The reason for the choice of d_drop over d_invalidate() is the d_count
checks. It really doesn't matter whether or not the client thinks it has
users for a directory if the server is telling you that it is ESTALE. So
we force a d_drop to prevent further lookups from finding it.

IOW: It is there in order to fix the case where the user does
'rmdir("foo"); mkdir("foo")' on the server.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
On Fri, Nov 30, 2012 at 02:00:48AM +0000, Al Viro wrote:
> OK, that settles it.  WARN_ON() and printks in the area can be dropped;
> the right fix is below.  However, there's a similar place in cifs that
> also needs to be dealt with and I really, really wonder why the hell do
> we do d_drop() in nfs_revalidate_lookup().  It's not relevant in this
> bug, but I would like to understand what's wrong with simply returning
> 0 from ->d_revalidate() and letting the caller (in fs/namei.c) take care
> of unhashing, etc. itself.  Would make have_submounts() in there pointless
> as well - we could just return 0 and let d_invalidate() take care of the
> checks...  Trond?
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -450,7 +450,8 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
>  			nfs_refresh_inode(dentry->d_inode, entry->fattr);
>  			goto out;
>  		} else {
> -			d_drop(dentry);
> +			if (d_invalidate(dentry) != 0)
> +				goto out;
>  			dput(dentry);
>  		}
>  	}

Hello,

With your previous patch (with the WARN_ON), I hit the WARN_ON() in the
test case described here: https://patchwork.kernel.org/patch/1446851/ .

The __d_move()ing mountpoint case no longer hits, and there is no longer
an EBUSY, so this seems to work for me (in 3.6, where it broke).

Simon-
On Fri, Nov 30, 2012 at 01:58:18PM +0000, Myklebust, Trond wrote:
> The reason for the choice of d_drop over d_invalidate() is the d_count
> checks. It really doesn't matter whether or not the client thinks it has
> users for a directory if the server is telling you that it is ESTALE. So
> we force a d_drop to prevent further lookups from finding it.
>
> IOW: It is there in order to fix the case where the user does
> 'rmdir("foo"); mkdir("foo")' on the server.

You do realize that your have_submounts() check in there is inherently
racy, right?
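
For readers following along, the "racy" remark can be read as a check-then-act problem: nothing ties the submount test to the unhash that follows it. Below is a minimal userspace sketch of that window, not kernel code. Everything in it (struct toy_dentry, toy_mount_on(), toy_check_then_drop() and the "between" hook) is a hypothetical stand-in invented for illustration, and the interleaving is simulated with a callback rather than real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	bool hashed;		/* still findable by name lookups */
	bool has_submount;	/* something is mounted at or below it */
};

/* Hypothetical hook that models "another task runs right here". */
typedef void (*toy_interleave_fn)(struct toy_dentry *d);

/* Models a concurrent mount landing on the dentry. */
void toy_mount_on(struct toy_dentry *d)
{
	assert(d != NULL);
	d->has_submount = true;
}

/*
 * Check-then-act: test for submounts, then unhash unconditionally.
 * The state may change between the two steps.
 * Returns true if a dentry with a submount ended up unhashed,
 * i.e. the race was lost.
 */
bool toy_check_then_drop(struct toy_dentry *d, toy_interleave_fn between)
{
	assert(d != NULL);
	if (d->has_submount)
		return false;	/* check says: leave it hashed */
	if (between)
		between(d);	/* window between check and act */
	d->hashed = false;	/* act: unconditional unhash */
	return d->has_submount;
}
```

Passing toy_mount_on as the hook plays the part of a mount that arrives after the check has passed; the function then reports that a mountpoint was unhashed even though the check was there to prevent exactly that. Folding check and unhash into one step under a lock, which is what letting d_invalidate() do the checks amounts to in the thread, closes the window in this model.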
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -450,7 +450,8 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
 			nfs_refresh_inode(dentry->d_inode, entry->fattr);
 			goto out;
 		} else {
-			d_drop(dentry);
+			if (d_invalidate(dentry) != 0)
+				goto out;
 			dput(dentry);
 		}
 	}
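
To make the behavioral change in the hunk easier to see, here is a small userspace sketch of the two code paths. It is a toy model and not the kernel implementation: struct toy_dentry and every toy_* function are hypothetical names, and the -EBUSY rule (refuse when the dentry is in use and is a directory or a mountpoint) is an assumption based on the "d_count checks" and EBUSY mentioned in the thread. The real d_invalidate() does more, such as pruning children, and that is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	int refcount;		/* models d_count */
	bool is_dir;
	bool mountpoint;	/* models d_mountpoint() being true */
	bool hashed;		/* still findable by name lookups */
};

/* Blind unhash, as in the line the patch removes. */
void toy_d_drop(struct toy_dentry *d)
{
	assert(d != NULL);
	d->hashed = false;
}

/*
 * Assumed rule: refuse to unhash an in-use directory or mountpoint.
 * Returns 0 on success, -EBUSY when the dentry is left alone.
 */
int toy_d_invalidate(struct toy_dentry *d)
{
	assert(d != NULL);
	if (!d->hashed)
		return 0;
	if (d->refcount > 1 && (d->is_dir || d->mountpoint))
		return -EBUSY;
	d->hashed = false;
	return 0;
}

/* Pre-patch path: always drop, then carry on with a fresh dentry. */
bool toy_prime_old(struct toy_dentry *d)
{
	toy_d_drop(d);
	return true;
}

/* Post-patch path: bail out ("goto out") if invalidation is refused. */
bool toy_prime_new(struct toy_dentry *d)
{
	if (toy_d_invalidate(d) != 0)
		return false;
	return true;
}
```

With a dentry set up as an in-use mountpoint, toy_prime_old() unhashes it, while toy_prime_new() returns false and leaves it hashed, which mirrors the client-side mountpoints surviving a readdir+ reply with an empty fhandle. A dentry that is neither busy nor a mountpoint is unhashed by both paths.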