Message ID | 1429674624-25922-2-git-send-email-boqun.feng@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Nak on exporting symbols for broken staging code. Please get rid of the ioctls looking up path names in horrible ways in the lustre code. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Apr 22, 2015, at 1:53 AM, Christoph Hellwig wrote:

> Nak on exporting symbols for broken staging code. Please get rid of
> the ioctls looking up path names in horrible ways in the lustre code.

For a reference, is there a good example of a non-horrible way to look up a pathname?

Thanks.

Bye,
    Oleg
On Wed, Apr 22, 2015 at 06:27:11AM +0000, Drokin, Oleg wrote:
> > Nak on exporting symbols for broken staging code. Please get rid of
> > the ioctls looking up path names in horrible ways in the lustre code.
>
> For a reference, is there a good example of a non-horrible way to look up a pathname?

Just don't do it from an ioctl, it's got an fd parameter for a reason.
On Tue, Apr 21, 2015 at 10:53:11PM -0700, Christoph Hellwig wrote:
> Nak on exporting symbols for broken staging code. Please get rid of
> the ioctls looking up path names in horrible ways in the lustre code.

I agree with Christoph, we shouldn't be doing this. Let's fix lustre "properly", which should be possible, right?

thanks,

greg k-h
On Apr 22, 2015, at 2:31 AM, Greg Kroah-Hartman wrote:

> On Tue, Apr 21, 2015 at 10:53:11PM -0700, Christoph Hellwig wrote:
>> Nak on exporting symbols for broken staging code. Please get rid of
>> the ioctls looking up path names in horrible ways in the lustre code.
>
> I agree with Christoph, we shouldn't be doing this. Let's fix lustre
> "properly", which should be possible, right?

Well, sort of.

For the first ioctl we clearly can do an fd pass and get info from there (need to also teach the tools about the new ioctl, so I cannot submit a patch right away, but I'll get on it).

In the second case the use case is a bit more involved. It deals with the problem of having a directory entry pointing to a non-existing directory (so basically a name only, but we do know it's supposed to be a directory, just on another server), so I doubt we can even open it to get the fd.

Now, perhaps we can just allow regular rmdir to ignore the error and kill the bogus entry anyway, but there was this thought that we might want to alert the user that there's something more fundamentally broken going on here, and so they would need to call this certain ioctl if they would just like to kill the entry anyway, as opposed to calling say fsck to make sure there's nothing else broken. I am not entirely sure of this idea myself, but it probably made sense at some point.

I see there's quite a dislike for this approach, so we can remove it if there's no better option.

Bye,
    Oleg
On Apr 22, 2015, at 2:31 AM, Christoph Hellwig wrote:

> On Wed, Apr 22, 2015 at 06:27:11AM +0000, Drokin, Oleg wrote:
>>> Nak on exporting symbols for broken staging code. Please get rid of
>>> the ioctls looking up path names in horrible ways in the lustre code.
>>
>> For a reference, is there a good example of a non-horrible way to look up a pathname?
>
> Just dont do it from an ioctl, it's got an fd parameter for a reason.

I know this is not going to be a popular opinion with you, but sometimes opening a file is just too expensive: 1 RPC roundtrip to open a file and then another one to close it. Also, some files cannot be opened (fs corruption).

Anyway, I got your point and there will be a solution. I was just hoping there was a way to do it, because what if, e.g., I need to create something new rather than do something with already existing stuff? Certainly there's no way to get an fd from a not yet existing fs object.

Bye,
    Oleg
On Wed, Apr 22, 2015 at 06:49:08AM +0000, Drokin, Oleg wrote:
> I know this is not going to be a popular opinion with you, but sometimes opening a file
> is just too expensive. 1 RPC roundtrip to open a file and then another one to close it.

Use O_PATH to avoid this.
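A caveat on the O_PATH suggestion: per open(2), a descriptor opened with O_PATH is cheap to obtain but cannot be used with ioctl(2), which fails with EBADF before any filesystem handler runs. The following is a minimal userspace sketch of that behaviour, not code from the patch or from Lustre. The helper name `ioctl_errno_for` is hypothetical, and FIONREAD is used only as a convenient generic request.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for O_PATH in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Open `path` with `flags`, issue a generic ioctl (FIONREAD) on the
 * resulting descriptor, and report what happened.
 *
 * Returns 0 if the ioctl succeeded, the errno value if the ioctl
 * failed, or -1 if the open itself failed.
 * (Hypothetical helper, for illustration only.)
 */
int ioctl_errno_for(const char *path, int flags)
{
	int fd = open(path, flags);
	if (fd < 0)
		return -1;

	int unread = 0;
	int err = 0;
	if (ioctl(fd, FIONREAD, &unread) < 0)
		err = errno;

	close(fd);
	return err;
}
```

On Linux, calling `ioctl_errno_for(path, O_RDONLY)` on a regular file is expected to return 0, while `ioctl_errno_for(path, O_PATH)` on the same file is expected to return EBADF.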
On Apr 22, 2015, at 3:34 AM, Christoph Hellwig wrote:

> On Wed, Apr 22, 2015 at 06:49:08AM +0000, Drokin, Oleg wrote:
>> I know this is not going to be a popular opinion with you, but sometimes opening a file
>> is just too expensive. 1 RPC roundtrip to open a file and then another one to close it.
> Use O_PATH to avoid this.

Hm, I guess I can open with O_PATH, but…

        if (unlikely(f->f_flags & O_PATH)) {
                f->f_mode = FMODE_PATH;
                f->f_op = &empty_fops;
                return 0;
        }

so with such an fd I am never getting into my ioctl handler, you see…

Let's suppose I overcome this somehow; that still does not completely solve my problem, which has more dimensions.

So, imagine I have this huge file tree (common with really big filesystems), and I want to do a one-off find on it for whatever reason. I do not want to pollute my dentry and inode cache with all the extra entries because (I think) I know I will never need them again.

So with our broken ioctl from the past that was somewhat easy: I just open a dir, I do getdents, I get a bunch of names, and I proceed to call my ioctl on this bunch of names and get all the info I need (one RPC per entry, which is not all that great, but oh well), and my dentry cache is only getting directories, but not the files.

Now, if I convert to O_PATH (or to some other single-call thing that does not need it, like say getxattr, which might work for some subset of intended usage), I get pretty much the same thing, but I also get dcache pollution, and in order to guard my dcache I am getting a bunch of lustre locks (the expensive kind of lustre locks issued by the server so that the cache stays coherent cluster-wide). Even if I somehow do uncached dentries so I can avoid the lock, there would still be that pesky LOOKUP RPC (that I would need to somehow teach to not just do lookup, but to bring me other interesting things, kind of like with open intents). This looks like it's getting out of hand rather fast.

Now, I probably can create some sort of an RPC that is "extended getdents with attributes", so my extended_getdents_rpc would return me the name and a bunch of other data like file striping, stat information and the like. This also saves me some more RPCs, but I imagine if I try to expose that over an ioctl, you would not be very happy with it either, and I don't think we have this sort of an extended getdents interface in the kernel either, do we (though I think internally nfs is capable of such a thing)?

Do you think any of this makes sense, or do you think I should just convert this ioctl from our broken getname version to something like user_path_at() (an exported symbol) to do the lookup, fetch whatever info I need, immediately unhash the resultant dentry/inode, and be done with it (at least then I do not need any tools changes)?

Do you think there's something else I might be doing, but not yet realizing this?

Thanks.

Bye,
    Oleg
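The one-off scan described in the message above (open a directory, read the names, then query each name) can be sketched from userspace roughly as follows. This is only an illustration of the access pattern, under stated assumptions: `scan_dir_by_name` is a hypothetical helper, and fstatat(2) stands in for the Lustre-specific per-name ioctl, which is not shown in the thread. Unlike that ioctl, fstatat() performs an ordinary lookup, so it does populate the dentry cache, which is exactly the cost the message objects to.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Walk a single directory and query every entry by name, relative to
 * the directory's own file descriptor.
 *
 * fstatat() is a stand-in for the per-name ioctl discussed in the
 * thread (assumption: any "one call per name" query has this shape).
 *
 * Returns the number of entries queried successfully, or -1 if the
 * directory could not be opened. The sizes of all regular files are
 * summed into *total_size.
 */
long scan_dir_by_name(const char *dirpath, off_t *total_size)
{
	*total_size = 0;

	DIR *dir = opendir(dirpath);
	if (!dir)
		return -1;

	int dfd = dirfd(dir);	/* owned by `dir`, released by closedir() */
	long queried = 0;
	struct dirent *de;

	while ((de = readdir(dir)) != NULL) {
		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;

		struct stat st;
		/* One query per name; symlinks are not followed. */
		if (fstatat(dfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0) {
			queried++;
			if (S_ISREG(st.st_mode))
				*total_size += st.st_size;
		}
	}

	closedir(dir);
	return queried;
}
```

An "extended getdents with attributes" interface, as floated in the message, would collapse the readdir() and per-name query steps into a single call per batch of entries.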
diff --git a/fs/namei.c b/fs/namei.c
index c83145a..472911c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -199,11 +199,23 @@ error:
 	return err;
 }
 
+/**
+ * getname() - Get a file name copy from userland
+ * @filename: userland pointer to the file name
+ *
+ * If successful, return a 'struct filename' pointer and ->name is the pointer
+ * to the kernel copy of the file name, otherwise an ERR_PTR.
+ *
+ * getname() should only be called in a system call context, and for each
+ * getname() that returns a successful value, callers must ensure exactly one
+ * corresponding putname() is called before returning to userland.
+ */
 struct filename *
 getname(const char __user * filename)
 {
 	return getname_flags(filename, 0, NULL);
 }
+EXPORT_SYMBOL(getname);
 
 struct filename *
 getname_kernel(const char * filename)
@@ -242,6 +254,11 @@ getname_kernel(const char * filename)
 	return result;
 }
 
+/* putname() - Release a 'struct filename' structure
+ * @name: the 'struct filename' structure to be release
+ *
+ * See more at getname()
+ */
 void putname(struct filename *name)
 {
 	BUG_ON(name->refcnt <= 0);
@@ -255,6 +272,7 @@ void putname(struct filename *name)
 	} else
 		__putname(name);
 }
+EXPORT_SYMBOL(putname);
 
 static int check_acl(struct inode *inode, int mask)
 {
getname/putname in fs/namei.c is a well-implemented way to copy a file name from userland. Other ways, such as directly calling __getname() and strncpy_from_user(), may lack features (e.g. auditing and reusing), introduce errors, or at least reinvent wheels. Therefore, for places that need a kernel copy of a file name from userland, it's better to use getname and putname if possible.

To be able to use these functions all over the kernel, the symbols 'getname' and 'putname' are exported, and comments on their behavior and constraints are added.

Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 fs/namei.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)