Message ID | 20150316044320.23648.90827.stgit@notabene.brown (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Mar 16, 2015 at 03:43:20PM +1100, NeilBrown wrote: > + char *kaddr; > + struct page *page; > + struct address_space *mapping = dentry->d_inode->i_mapping; Who said that dentry->d_inode hasn't gone NULL by that point? > + nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1); ... or changed here. Again, dentry->d_inode is stable only if you are holding a reference to dentry. That's why we have those dances around nd->inode, for example. Doing unlazy_walk() is enough to stabilize the damn thing, so currently ->follow_link() doesn't have to worry about it. With your changes, though... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 16 Mar 2015 22:50:40 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote: > On Mon, Mar 16, 2015 at 03:43:20PM +1100, NeilBrown wrote: > > + char *kaddr; > > + struct page *page; > > + struct address_space *mapping = dentry->d_inode->i_mapping; > > Who said that dentry->d_inode hasn't gone NULL by that point? > > > + nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1); > > ... or changed here. Again, dentry->d_inode is stable only if you are > holding a reference to dentry. That's why we have those dances around > nd->inode, for example. Doing unlazy_walk() is enough to stabilize the > damn thing, so currently ->follow_link() doesn't have to worry about it. > With your changes, though... Ahhh - that's what nd->inode is for. I wondered. Am I correct in thinking that dentry->d_inode can only become NULL - it cannot then become some other inode? In that case the various follow_link methods that are sufficiently atomic for rcu-walk just need something like: struct inode *inode = dentry->d_inode; if (!inode) return -ECHILD; If ->d_inode can become another inode, then I suspect we need to pass the inode as well as the dentry to ->follow_link. Thanks, NeilBrown
On Fri, Mar 20, 2015 at 09:38:33AM +1100, NeilBrown wrote: > Ahhh - that's what nd->inode is for. I wondered. > > Am I correct in thinking that dentry->d_inode can only become NULL - it cannot > then become some other inode? It can - consider somebody doing mkdir on that name right under you. _All_ we are guaranteed is that at some moment nd->inode matched the pathname this far and so was (at the same moment) path->dentry. We are not promised that these inode and dentry will remain associated with each other, etc. We ought to check ->d_seq after checking ->d_flags, BTW. _That_ will confirm that inode remained corresponding to that dentry until the time we'd observed d_is_symlink(dentry), i.e. make sure that inode *is* a symlink one. And yes, we probably would have to pass dentry and inode separately, more's the pity. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/namei.c b/fs/namei.c index c9c58cd1af2a..2602d31ecc99 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -4499,6 +4499,28 @@ static char *page_getlink(struct dentry * dentry, struct page **ppage) return kaddr; } +/* get the link contents from pagecache under RCU */ +static char *page_getlink_rcu(struct dentry * dentry, struct page **ppage) +{ + char *kaddr; + struct page *page; + struct address_space *mapping = dentry->d_inode->i_mapping; + page = find_get_page(mapping, 0); + if (page && + (!PageUptodate(page) || PageHighMem(page))) { + put_page(page); + page = NULL; + } + if (!page) { + *ppage = ERR_PTR(-ECHILD); + return NULL; + } + *ppage = page; + kaddr = page_address(page); + nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1); + return kaddr; +} + int page_readlink(struct dentry *dentry, char __user *buffer, int buflen) { struct page *page = NULL; @@ -4514,19 +4536,22 @@ EXPORT_SYMBOL(page_readlink); void *page_follow_link_light(struct dentry *dentry, int flags) { struct page *page = NULL; - if (flags & LOOKUP_RCU) - return ERR_PTR(-ECHILD); - nd_set_link(page_getlink(dentry, &page)); + if (flags & LOOKUP_RCU) { + nd_set_link(page_getlink_rcu(dentry, &page)); + page = (void*)((unsigned long)page | 1); + } else + nd_set_link(page_getlink(dentry, &page)); return page; } EXPORT_SYMBOL(page_follow_link_light); void page_put_link(struct dentry *dentry, void *cookie) { - struct page *page = cookie; + struct page *page = (void*)((unsigned long)cookie & ~1UL) ; if (page) { - kunmap(page); + if (page == cookie) + kunmap(page); page_cache_release(page); } }
If the symlink has already be been read-in, then page_follow_link_light can succeed in RCU-walk mode. page_getlink_rcu() is added to support this. With this many filesystems can follow links in RCU-walk mode when everything is cached. This includes ext?fs and others. If the page is a HighMem page we do *not* try to kmap_atomic, but simply give up - only page_address() is used. This is because we need to be able to sleep while holding the address of the page, particularly over calls to do_last() which can be quite slow and in particular takes a mutex. If this were a problem, then copying into a GFP_ATOMIC allocation might be a workable solution. This selective calling of kmap requires us to know, in page_put_link, whether or not kunmap() need to be called. Pass this information in the lsb of the cookie. Signed-off-by: NeilBrown <neilb@suse.de> --- fs/namei.c | 35 ++++++++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html