Message ID | 20200118120800.16358-1-cyphar@cyphar.com (mailing list archive) |
---|---|
Series | openat2: minor uapi cleanups |
On Sat, Jan 18, 2020 at 11:07:58PM +1100, Aleksa Sarai wrote:
> Patch changelog:
>   v3:
>    * Merge changes into the original patches to make Al's life easier.
>      [Al Viro]
>   v2:
>    * Add include <linux/types.h> to openat2.h. [Florian Weimer]
>    * Move OPEN_HOW_SIZE_* constants out of UAPI. [Florian Weimer]
>    * Switch from __aligned_u64 to __u64 since it isn't necessary.
>      [David Laight]
>   v1: <https://lore.kernel.org/lkml/20191219105533.12508-1-cyphar@cyphar.com/>
>
> While openat2(2) is still not yet in Linus's tree, we can take this
> opportunity to iron out some small warts that weren't noticed earlier:
>
>   * A fix was suggested by Florian Weimer, to separate the openat2
>     definitions so glibc can use the header directly. I've put the
>     maintainership under VFS but let me know if you'd prefer it belong
>     to the fcntl folks.
>
>   * Having heterogeneous field sizes in an extensible struct results in
>     "padding hole" problems when adding new fields (in addition the
>     correct error to use for non-zero padding isn't entirely clear).
>     The simplest solution is to just copy clone(3)'s model -- always use
>     u64s. It will waste a little more space in the struct, but it
>     removes a possible future headache.
>
> This patch is intended to replace the corresponding patches in Al's
> #work.openat2 tree (and *will not* apply on Linus' tree).
>
> @Al: I will send some additional patches later, but they will require
>      proper design review since they're ABI-related features (namely,
>      adding a way to check what features a syscall supports as I
>      outlined in my talk here[1]).

#work.openat2 updated, #for-next rebuilt and force-pushed.  There's
a massive update of #work.namei as well, also pushed out; not in
#for-next yet, will post the patch series for review later today.
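The "padding hole" point in the cover letter can be illustrated with a small userspace sketch. The two structs below are hypothetical and are NOT the real struct open_how; the field names and widths are assumptions chosen only to show where implicit padding appears when field sizes are mixed, and why an all-u64 layout has none on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mixed-width layout (assumption, not the real open_how). */
struct mixed_how {
	uint32_t flags;
	uint16_t mode;
	/* The compiler inserts unnamed padding here so 'resolve' is aligned. */
	uint64_t resolve;
};

/* Hypothetical all-u64 layout, in the style the cover letter proposes. */
struct uniform_how {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
};

/* Bytes of the struct that belong to no named field. */
static size_t mixed_hole_bytes(void)
{
	return sizeof(struct mixed_how)
		- sizeof(uint32_t) - sizeof(uint16_t) - sizeof(uint64_t);
}

static size_t uniform_hole_bytes(void)
{
	return sizeof(struct uniform_how) - 3 * sizeof(uint64_t);
}

/* With only u64 fields there is nowhere for a hole to hide. */
static void check_uniform_layout(void)
{
	assert(sizeof(struct uniform_how) == 3 * sizeof(uint64_t));
}
```

In the mixed layout the unnamed bytes are invisible to userspace, so a later field added into that gap could pick up whatever garbage old callers left there. With uniform fields every byte has a name and can be checked explicitly.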
On Sat, Jan 18, 2020 at 03:28:33PM +0000, Al Viro wrote:

> #work.openat2 updated, #for-next rebuilt and force-pushed.  There's
> a massive update of #work.namei as well, also pushed out; not in
> #for-next yet, will post the patch series for review later today.

BTW, looking through that code again, how could this
static bool legitimize_root(struct nameidata *nd)
{
	/*
	 * For scoped-lookups (where nd->root has been zeroed), we need to
	 * restart the whole lookup from scratch -- because set_root() is wrong
	 * for these lookups (nd->dfd is the root, not the filesystem root).
	 */
	if (!nd->root.mnt && (nd->flags & LOOKUP_IS_SCOPED))
		return false;

possibly trigger?  The only things that ever clean ->root.mnt are

1) failing legitimize_path(nd, &nd->root, nd->root_seq) in
legitimize_root() itself.  If *ANY* legitimize_path() has failed,
we are through - RCU pathwalk is given up.  In particular, if you
look at the call chains leading to legitimize_root(), you'll see
that it's called by unlazy_walk() or unlazy_child() and failure
has either of those bugger off immediately.  The same goes for
their callers; fail any of those and we are done; the very next
thing that will be done with that nameidata is going to be
terminate_walk().  We don't look at its fields, etc. - just return
to the top level ASAP and call terminate_walk() on it.  Which is where
we run into
	if (nd->flags & LOOKUP_ROOT_GRABBED) {
		path_put(&nd->root);
		nd->flags &= ~LOOKUP_ROOT_GRABBED;
	}
paired with setting LOOKUP_ROOT_GRABBED just before the attempt
to legitimize in legitimize_root().  The next thing *after*
terminate_walk() is either path_init() or the end of life for
that struct nameidata instance.
	This is really, really fundamental for understanding the whole
thing - a failure of unlazy_walk/unlazy_child means that we are through
with that attempt.

2) complete_walk() doing
	if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
		nd->root.mnt = NULL;
Can't happen with LOOKUP_IS_SCOPED in flags, obviously.

3) path_init().  Where it's followed either by leaving through
	if (*s == '/' && !(flags & LOOKUP_IN_ROOT)) {
	....
	}
(and LOOKUP_IS_SCOPED includes LOOKUP_IN_ROOT) or with a failure exit
(no calls of *anything* but terminate_walk() after that) or with
	if (flags & LOOKUP_IS_SCOPED) {
		nd->root = nd->path;
		...
and that makes damn sure nd->root.mnt is not NULL.

And neither of the LOOKUP_IS_SCOPED bits ever gets changed in nd->flags -
they remain as path_init() has set them.

The same, BTW, goes for the check you've added in the beginning of
set_root() - set_root() is called only with NULL nd->root.mnt (trivial to
prove) and that is incompatible with LOOKUP_IS_SCOPED.  I'm kinda-sorta
OK with having WARN_ON() there for a while, but IMO the check in the
beginning of legitimize_root() should go away - this kind of defensive
programming only makes it harder to reason about the behaviour of the
entire thing.  And fs/namei.c is too convoluted as it is...
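The three-case argument above can be replayed in a toy userspace model. This is only a sketch: every toy_* name, the struct layout and the flag values are invented for illustration and are not the kernel's definitions. Case 1 (a failed legitimize_path()) is left out because the walk ends there, and the LOOKUP_ROOT setup in path_init() is not modelled. The sketch shows that once path_init() has run with a scoped flag, the condition tested by the disputed legitimize_root() check cannot become true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define TOY_LOOKUP_ROOT      0x1u
#define TOY_LOOKUP_BENEATH   0x2u
#define TOY_LOOKUP_IN_ROOT   0x4u
#define TOY_LOOKUP_IS_SCOPED (TOY_LOOKUP_BENEATH | TOY_LOOKUP_IN_ROOT)

struct toy_mount { int id; };

struct toy_nameidata {
	unsigned int flags;
	struct toy_mount *root_mnt;	/* stands in for nd->root.mnt */
	struct toy_mount *path_mnt;	/* stands in for nd->path.mnt */
};

/* Case 3: for scoped lookups the starting path becomes the root. */
static void toy_path_init(struct toy_nameidata *nd, unsigned int flags,
			  struct toy_mount *start)
{
	nd->flags = flags;
	nd->path_mnt = start;
	nd->root_mnt = NULL;
	if (flags & TOY_LOOKUP_IS_SCOPED)
		nd->root_mnt = nd->path_mnt;
}

/* Case 2: the root is cleared only when neither flag group is set. */
static void toy_complete_walk(struct toy_nameidata *nd)
{
	if (!(nd->flags & (TOY_LOOKUP_ROOT | TOY_LOOKUP_IS_SCOPED)))
		nd->root_mnt = NULL;
}

/* The condition the disputed legitimize_root() check tests for. */
static bool toy_scoped_with_null_root(const struct toy_nameidata *nd)
{
	return !nd->root_mnt && (nd->flags & TOY_LOOKUP_IS_SCOPED);
}

/* Stand-in for the WARN_ON() in set_root(): unreachable when scoped. */
static void toy_set_root(struct toy_nameidata *nd, struct toy_mount *fs_root)
{
	assert(!(nd->flags & TOY_LOOKUP_IS_SCOPED));
	nd->root_mnt = fs_root;
}
```

Running toy_path_init() followed by toy_complete_walk() with either scoped flag leaves root_mnt pointing at the starting mount, while an unscoped walk ends with it cleared and may then call toy_set_root() safely.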
On 2020-01-18, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sat, Jan 18, 2020 at 03:28:33PM +0000, Al Viro wrote:
>
> > #work.openat2 updated, #for-next rebuilt and force-pushed.  There's
> > a massive update of #work.namei as well, also pushed out; not in
> > #for-next yet, will post the patch series for review later today.
>
> BTW, looking through that code again, how could this
> static bool legitimize_root(struct nameidata *nd)
> {
> 	/*
> 	 * For scoped-lookups (where nd->root has been zeroed), we need to
> 	 * restart the whole lookup from scratch -- because set_root() is wrong
> 	 * for these lookups (nd->dfd is the root, not the filesystem root).
> 	 */
> 	if (!nd->root.mnt && (nd->flags & LOOKUP_IS_SCOPED))
> 		return false;
>
> possibly trigger?  The only things that ever clean ->root.mnt are

You're quite right -- the codepath I was worried about was pick_link()
failing (which *does* clear nd->path.mnt, and I must've misread it at
the time as nd->root.mnt).

We can drop this check, though now complete_walk()'s main defence
against a NULL nd->root.mnt is that path_is_under() will fail and
trigger -EXDEV (or set_root() will fail at some point in the future).
However, as you pointed out, a NULL nd->root.mnt won't happen with
things as they stand today -- I might be a little too paranoid. :P

> 	This is really, really fundamental for understanding the whole
> thing - a failure of unlazy_walk/unlazy_child means that we are through
> with that attempt.

Yup -- see above, the worry was about pick_link() not about how the
RCU-walk and REF-walk dances operate.

> The same, BTW, goes for the check you've added in the beginning of
> set_root() - set_root() is called only with NULL nd->root.mnt (trivial to
> prove) and that is incompatible with LOOKUP_IS_SCOPED.  I'm kinda-sorta
> OK with having WARN_ON() there for a while, but IMO the check in the
> beginning of legitimize_root() should go away -

You're quite right about dropping the legitimize_root() check, but I'd
like to keep the WARN_ON() in set_root(). The main reason being that it
makes us very damn sure that a future change won't accidentally break
the nd->root contract which all of the LOOKUP_IS_SCOPED changes rely on.
Then again, this might be my paranoia popping up again.

> this kind of defensive programming only makes it harder to reason about
> the behaviour of the entire thing.  And fs/namei.c is too convoluted
> as it is...

If you feel that dropping some of these more defensive checks is better
for the codebase as a whole, then I defer to your judgement. I
completely agree that namei is a pretty complicated chunk of code.
On Sun, Jan 19, 2020 at 10:03:13AM +1100, Aleksa Sarai wrote:

> > possibly trigger?  The only things that ever clean ->root.mnt are
>
> You're quite right -- the codepath I was worried about was pick_link()
> failing (which *does* clear nd->path.mnt, and I must've misread it at
> the time as nd->root.mnt).

pick_link() (allocation failure of external stack in RCU case, followed
by failure to legitimize the link) is, unfortunately, subtle and nasty.
We *must* path_put() the link; if we'd managed to legitimize the mount
and failed on dentry, the mount needs to be dropped.  No way around it.
And while everything else there can be left for soon-to-be-reached
terminate_walk(), this cannot.  We have no good way to pass what we need
to drop to the place where that eventual terminate_walk() drops
rcu_read_lock().  So we end up having to do what terminate_walk() would've
done and do it right there, so we could do that path_put(link) before we
bugger off.

I'm not happy about that, but I don't see cleaner solutions, more's the
pity.  However, it doesn't mess with ->root - nor should it, since
we don't have LOOKUP_ROOT_GRABBED (not in RCU mode), so it can and
should be left alone.

> We can drop this check, though now complete_walk()'s main defence
> against a NULL nd->root.mnt is that path_is_under() will fail and
> trigger -EXDEV (or set_root() will fail at some point in the future).
> However, as you pointed out, a NULL nd->root.mnt won't happen with
> things as they stand today -- I might be a little too paranoid. :P

The only reason why complete_walk() zeroes nd->root in some cases is
microoptimization - we *know* we won't be using it later, so we don't
care whether it's stale or not and can spare unlazy_walk() a bit of
work.  All there is to that one.

I don't see any reason for adding code that would clear nd->root in later
work; if such a thing does get added (again, I don't see what purpose could
that possibly serve), we'll need to watch out for a lot of things.
Starting with LOOKUP_ROOT case...  It's not something likely to slip in
unnoticed.
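The pick_link() remark earlier in this message (a half-legitimized link must be dropped on the spot) can be sketched as a generic two-step reference pattern. Everything below is a toy model with hypothetical toy_* names and plain integer counters. It makes no claim to match the kernel's mount or dentry refcounting, and the -1 return is a placeholder rather than the real error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_obj { int refs; bool dying; };

struct toy_path {
	struct toy_obj *mnt;
	struct toy_obj *dentry;
};

/* Try to take a reference; fails if the object is already going away. */
static bool toy_get(struct toy_obj *o)
{
	if (o->dying)
		return false;
	o->refs++;
	return true;
}

/* Drop a reference; NULL is accepted and ignored. */
static void toy_put(struct toy_obj *o)
{
	if (o) {
		assert(o->refs > 0);
		o->refs--;
	}
}

/*
 * Two-step legitimization: mount first, then dentry. On failure, any
 * pointer that was NOT pinned is set to NULL, so a later toy_path_put()
 * drops exactly the references that were actually taken.
 */
static bool toy_legitimize_path(struct toy_path *p)
{
	if (!toy_get(p->mnt)) {
		p->mnt = NULL;
		p->dentry = NULL;
		return false;
	}
	if (!toy_get(p->dentry)) {
		p->dentry = NULL;	/* mount stays pinned; caller must drop it */
		return false;
	}
	return true;
}

static void toy_path_put(struct toy_path *p)
{
	toy_put(p->dentry);
	toy_put(p->mnt);
	p->dentry = NULL;
	p->mnt = NULL;
}

/* The caller cleans up right here, since nobody later knows about 'link'. */
static int toy_pick_link(struct toy_path *link)
{
	if (!toy_legitimize_path(link)) {
		toy_path_put(link);
		return -1;	/* placeholder error value */
	}
	return 0;
}
```

If the dentry step fails after the mount step succeeded, toy_pick_link() returns with the mount's counter back where it started, which is the property the message says cannot be deferred to a later cleanup.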