Message ID | 158376245699.344135.7522994074747336376.stgit@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | VFS: Filesystem information [ver #18] | expand |
Hi David, > Add additional RESOLVE_* flags to correspond to AT_* flags that aren't > currently implemented: > > RESOLVE_NO_TRAILING_SYMLINKS for AT_SYMLINK_NOFOLLOW > RESOLVE_NO_TRAILING_AUTOMOUNTS for AT_NO_AUTOMOUNT > RESOLVE_EMPTY_PATH for AT_EMPTY_PATH Thanks for changing the names! > This is necessary for fsinfo() to use RESOLVE_* flags instead of AT_* flags > if the latter are to be considered deprecated for new system calls. > > Also make openat2() handle RESOLVE_NO_TRAILING_SYMLINKS. > > Automounting is currently forced by doing an open(), so adding support to > openat2() for RESOLVE_NO_TRAILING_AUTOMOUNTS is not trivial. lookup_flags &= ~LOOKUP_AUTOMOUNT won't work? At least it should cause EINVAL (or something similar) instead of being silently ignored. The same applies to RESOLVE_EMPTY_PATH, it should be handled with some logic or also cause EINVAL. vfs_statx()/vfs_stat_set_lookup_flags() seems to have a similar logic using the AT_* flags. metze
Stefan Metzmacher <metze@samba.org> wrote: > > Automounting is currently forced by doing an open(), so adding support to > > openat2() for RESOLVE_NO_TRAILING_AUTOMOUNTS is not trivial. > > lookup_flags &= ~LOOKUP_AUTOMOUNT won't work? No. LOOKUP_OPEN overrides that. David
On Mon, Mar 9, 2020 at 5:56 PM Aleksa Sarai <cyphar@cyphar.com> wrote: > > On 2020-03-09, David Howells <dhowells@redhat.com> wrote: > > This is necessary for fsinfo() to use RESOLVE_* flags instead of AT_* flags > > if the latter are to be considered deprecated for new system calls. > > > > Also make openat2() handle RESOLVE_NO_TRAILING_SYMLINKS. No, please let's not do this. We have O_NOFOLLOW, and we can't get rid of it. So adding RESOLVE_NO_TRAILING_SYMLINKS isn't a cleanup. It's just extra complexity for absolutely zero gain. > After thinking about what Christian said some more, I reckon we > shouldn't support both O_NOFOLLOW and RESOLVE_NO_TRAILING_SYMLINKS. But > that means we'll need to cherry-pick this patch and get it into mainline > before v5.6. No. It simply means that we shouldn't have RESOLVE_NO_TRAILING_SYMLINKS at all. Adding that flag is a mistake. It causes problems like this, where subtlenly people say "what if both flags are set". Just don't do it. There's no way in hell we can ever get rid of O_NOFOLLOW anyway, since people will continue to use plain open() and openat(). So adding RESOLVE_NO_TRAILING_SYMLINKS is entirely redundant. Don't deprecate the old flags that are going to always stay around, don't add stupid new flags that add no value. It's that easy. Linus
Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > Also make openat2() handle RESOLVE_NO_TRAILING_SYMLINKS. > > No, please let's not do this. > > We have O_NOFOLLOW, and we can't get rid of it. > > So adding RESOLVE_NO_TRAILING_SYMLINKS isn't a cleanup. It's just > extra complexity for absolutely zero gain. Okay. So what's the equivalent of AT_SYMLINK_NOFOLLOW in RESOLVE_* flag terms? RESOLVE_NO_SYMLINKS is not equivalent, though O_NOFOLLOW is. The reason I ask is that RESOLVE_* flags can't be easily extended to non-open syscalls that don't take O_* flags without it. Would you prefer that new non-open syscalls continue to take AT_* and ignore RESOLVE_* flags? That would be fine by me. David
On Tue, Mar 10, 2020 at 12:25 AM David Howells <dhowells@redhat.com> wrote: ? > Okay. So what's the equivalent of AT_SYMLINK_NOFOLLOW in RESOLVE_* flag > terms? Nothing. openat2() takes two sets of flags. We'll never get rid of AT_SYMLINK_NOFOLLOW / O_NOFOLLOW, and we've added RESOLVE_NO_SYMLINKS to the new set of flags. It's just a separate namespace. We will _not_ be adding a RESOLVE_XYZ flag for O_NOFOLLOW or AT_SYMLINK_NOFOLLOW. At least not visible to user space - because as people already figured out, that just causes problems with consistency issues. And yes, the fact that we then have three different user-visible namespaces (O_xyz flags for open(), AT_xyz flags for linkat(), and now RESOLVE_xyz flags for openat2()) is sad and messy. But it's an inherent messiness from just how the world works. We can't get rid of it. If we need linkat2() and friends, so be it. Do we? Could we have a _fourth_ set of flags that are simply for internal use that is a superset of them all? Sure. But no, it's almost certainly not worth it. Four is not better than three. Now, some type-safety in the kernel to make sure that we can't mix AT_xyz with O_xyz or RESOLVE_xyz - that might be worth it. Although judging by past experience, not enough people run sparse for it to really be worth it. Linus PS. Yeah, we also have that LOOKUP_xyz namespace, and the access mode namespace, so we already have those internal format versions too.
Hi Linus, >> Okay. So what's the equivalent of AT_SYMLINK_NOFOLLOW in RESOLVE_* flag >> terms? > > Nothing. > > openat2() takes two sets of flags. We'll never get rid of > AT_SYMLINK_NOFOLLOW / O_NOFOLLOW, and we've added RESOLVE_NO_SYMLINKS > to the new set of flags. It's just a separate namespace. > > We will _not_ be adding a RESOLVE_XYZ flag for O_NOFOLLOW or > AT_SYMLINK_NOFOLLOW. At least not visible to user space - because as > people already figured out, that just causes problems with consistency > issues. > > And yes, the fact that we then have three different user-visible > namespaces (O_xyz flags for open(), AT_xyz flags for linkat(), and now > RESOLVE_xyz flags for openat2()) is sad and messy. But it's an > inherent messiness from just how the world works. We can't get rid of > it. For openat2() and other existing syscalls I agree, that it's good to have just a single bit to control a feature. The whole discussion was triggered by the introduction of a completely new fsinfo() call: >> The new system call looks like: >> >> int ret = fsinfo(int dfd, >> const char *pathname, >> const struct fsinfo_params *params, >> size_t params_size, >> void *result_buffer, >> size_t result_buf_size); >> >> The params parameter optionally points to a block of parameters: >> >> struct fsinfo_params { >> __u32 resolve_flags; If I remember correctly with was named at_flags initially. And I thought it would be great to also have the new RESOLVE_YXZ feature available for that new path based syscall. Would you propose to have 'at_flags' and 'resolve_flags' passed in here? Or is there something even better you would propose for new syscalls? >> __u32 flags; >> __u32 request; >> __u32 Nth; >> __u32 Mth; >> }; > If we need linkat2() and friends, so be it. Do we? Yes, I'm going to propose something like this, as it would make the life much easier for Samba to have the new features available on all path based syscalls. In addition I'll propose to have a way to specify the source of removeat and unlinkat also by fd in addition to the the source parent fd and relative path, the reason are also to detect races of path recycling. pidfd_open() solved a similar problem for pid recycling. > Could we have a _fourth_ set of flags that are simply for internal use > that is a superset of them all? Sure. But no, it's almost certainly > not worth it. Four is not better than three. As you pointed our below the LOOKUP_yxz namespace is already in place... And the discussion was more about an possible single namespace for completely new syscalls. > Now, some type-safety in the kernel to make sure that we can't mix > AT_xyz with O_xyz or RESOLVE_xyz - that might be worth it. Although > judging by past experience, not enough people run sparse for it to > really be worth it. I'm new to all this and maybe too naive, but would a build bot running sparse on linux-next be able to catch this early enough? metze
On Thu, Mar 12, 2020 at 2:08 AM Stefan Metzmacher <metze@samba.org> wrote: > > The whole discussion was triggered by the introduction of a completely > new fsinfo() call: > > Would you propose to have 'at_flags' and 'resolve_flags' passed in here? Yes, I think that would be the way to go. > > If we need linkat2() and friends, so be it. Do we? > > Yes, I'm going to propose something like this, as it would make the life > much easier for Samba to have the new features available on all path > based syscalls. Will samba actually use them? I think we've had extensions before that weren't worth the non-portability pain? But yes, if we have a major package like samba use it, then by all means let's add linkat2(). How many things are we talking about? We have a number of system calls that do *not* take flags, but do do pathname walking. I'm thinking things like "mkdirat()"?) > In addition I'll propose to have a way to specify the source of > removeat and unlinkat also by fd in addition to the the source parent fd > and relative path, the reason are also to detect races of path > recycling. Would that be basically just an AT_EMPTY_PATH kind of thing? IOW, you'd be able to remove a file by doing fd = open(path.., O_PATH); unlinkat(fd, "", AT_EMPTY_PATH); Hmm. We have _not_ allowed filesystem changes without that last component lookup. Of course, with our dentry model, we *can* do it, but this smells fairly fundamental to me. It might avoid some of the extra system calls (ie you could use openat2() to do the path walking part, and then unlinkat(AT_EMPTY_PATH) to remove it, and have a "fstat()" etc in between the verify that it's the right type of file or whatever - and you'd not need an unlinkat2() with resolve flags). I think Al needs to ok this kind of change. Maybe you've already discussed it with him and I just missed it. Linus
Linus Torvalds <torvalds@linux-foundation.org> wrote: > > The whole discussion was triggered by the introduction of a completely > > new fsinfo() call: > > > > Would you propose to have 'at_flags' and 'resolve_flags' passed in here? > > Yes, I think that would be the way to go. Okay, I can do that. Any thoughts on which set of flags should override the other? If we're making RESOLVE_* flags the new definitive interface, then I feel they should probably override the AT_* flags where there's a conflict, ie. RESOLVE_NO_SYMLINKS should override AT_SYMLINK_FOLLOW for example. David
Am 12.03.20 um 17:24 schrieb Linus Torvalds: > On Thu, Mar 12, 2020 at 2:08 AM Stefan Metzmacher <metze@samba.org> wrote: >> >> The whole discussion was triggered by the introduction of a completely >> new fsinfo() call: >> >> Would you propose to have 'at_flags' and 'resolve_flags' passed in here? > > Yes, I think that would be the way to go. Ok, that's also fine for me:-) >>> If we need linkat2() and friends, so be it. Do we? >> >> Yes, I'm going to propose something like this, as it would make the life >> much easier for Samba to have the new features available on all path >> based syscalls. > > Will samba actually use them? I think we've had extensions before that > weren't worth the non-portability pain? Yes, we're currently moving to the portable *at() calls as a start. And we already make use of Linux only feature for performance reasons in other places. Having the new resolve flags will make it possible to move some of the performance intensive work into non-linux specific modules as fallback. I hope that we'll use most of this through io_uring in the end, that's the reason Jens added the IORING_REGISTER_PERSONALITY feature used for IORING_OP_OPENAT2. > But yes, if we have a major package like samba use it, then by all > means let's add linkat2(). How many things are we talking about? We > have a number of system calls that do *not* take flags, but do do > pathname walking. I'm thinking things like "mkdirat()"?) I haven't looked them up in detail yet. Jeremy can you provide a list? Do you think we could route some of them like mkdirat() and mknodat() via openat2() instead of creating new syscalls? >> In addition I'll propose to have a way to specify the source of >> removeat and unlinkat also by fd in addition to the the source parent fd >> and relative path, the reason are also to detect races of path >> recycling. > > Would that be basically just an AT_EMPTY_PATH kind of thing? IOW, > you'd be able to remove a file by doing > > fd = open(path.., O_PATH); > unlinkat(fd, "", AT_EMPTY_PATH); > > Hmm. We have _not_ allowed filesystem changes without that last > component lookup. Of course, with our dentry model, we *can* do it, > but this smells fairly fundamental to me. > > It might avoid some of the extra system calls (ie you could use > openat2() to do the path walking part, and then > unlinkat(AT_EMPTY_PATH) to remove it, and have a "fstat()" etc in > between the verify that it's the right type of file or whatever - and > you'd not need an unlinkat2() with resolve flags). If that works safely for hardlinks and having another process doing a rename between openat2() and unlinkat(), we could try that. My initial naive idea was to have one syscall instead of linkat2/renameat3/unlinkat2. int xlinkat(int src_dfd, const char *src_path, int dst_dfd, const char *dst_path, const struct xlinkat_how *how, size_t how_size); struct xlinkat_how { __u64 src_at_flags; __u64 src_resolve_flags; __u64 dst_at_flags; __u64 dst_resolve_flags; __u64 rename_flags; __s32 src_fd; }; With src_dfd=-1, src_path=NULL, how.src_fd = -1, this would be like linkat(). With dst_dfd=-1, dst_path=NULL, it would be like unlinkat(). Otherwise a renameat2(). If how.src_fd is not -1, it would be checked to be the same path as specified by src_dfd and src_path. > I think Al needs to ok this kind of change. Maybe you've already > discussed it with him and I just missed it. This is the first time I'm discussing this. Thanks for the useful feedback! metze
On Thu, Mar 12, 2020 at 9:56 AM David Howells <dhowells@redhat.com> wrote: > > Any thoughts on which set of flags should override the other? Do we need to care? I don't think we actually have conflicts, because the semantics aren't the same, and they are about independent issues. > If we're making > RESOLVE_* flags the new definitive interface, then I feel they should probably > override the AT_* flags where there's a conflict, ie. RESOLVE_NO_SYMLINKS > should override AT_SYMLINK_FOLLOW for example. That's just for a linkat2() system call? I think the natural semantic is the one that falls out directly: RESOLVE_NO_SYMLINKS will cause it to fail with -ELOOP if it is a symlink. NOTE! This isn't really a "conflict". It's actually two different and independent things: - without AT_SYMLINK_FOLLOW, a linkat() simply won't even try to follow the symlink, and will link to the symlink itself instead. - RESOLVE_NO_SYMLINKS says "never follow symlinks". Note how one does *NOT* override the other, quite the reverse. They are about different things. One is about the _behavior_ when the last component is a symlink, and the other is about permission to follow any symlinks. So all combinations make sense: - no AT_SYMLINK_FOLLOW, no RESOLVE_NO_SYMLINKS: just link to the target, whether it's a symlink or not This is obviously our historical link() behavior. - no AT_SYMLINK_FOLLOW, yes RESOLVE_NO_SYMLINKS: just link to the target, whether it's a symlink or not, but if there's a symlink in the middle, return -ELOOP Note how this case doesn't follow the last one, so RESOLVE_NO_SYMLINKS isn't an issue for the last component, but _is_ an issue for the components in the middle. - AT_SYMLINK_FOLLOW, no RESOLVE_NO_SYMLINKS: just link to the target, following the symlink if it exists This is obviously the historical AT_SYMLINK_FOLLOW behavior - AT_SYMLINK_FOLLOW | RESOLVE_NO_SYMLINKS: just link to the target, return -ELOOP if it's a symlink (of if there's a symlink on the way). This is the natural behavior for "refuse to follow any symlinks anywhere". note how they are all completely sane versions, and in no case does one flag really override the other. If anything, we actually miss a third flag: the "don't allow linking to a final symlink at all" (but allow intermediate symlinks). We've never had that behavior, although I think POSIX makes that case undefined (ie you're not guaranteed to be able to link to a symlink in the first place in POSIX). I guess that third case could be emulated with open(O_PATH) + fstat to check it's not a symlink + linkat(fd,AT_EMPTY_PATH) if it turns somebody would want something like that (and we decided that AT_EMPTY_PATH is ok for linkat()). I doubt anybody cares. Linus
On Thu, Mar 12, 2020 at 09:24:49AM -0700, Linus Torvalds wrote: > Would that be basically just an AT_EMPTY_PATH kind of thing? IOW, > you'd be able to remove a file by doing > > fd = open(path.., O_PATH); > unlinkat(fd, "", AT_EMPTY_PATH); > > Hmm. We have _not_ allowed filesystem changes without that last > component lookup. Of course, with our dentry model, we *can* do it, > but this smells fairly fundamental to me. That's a bloody bad idea. It breeds fuckloads of corner cases, it does not match the locking model at all and I don't want to even think of e.g. the interplay with open-by-fhandle ("Parent? What parent?"), etc. Fundamentally, there are operations on objects and there are operations on links to objects. Mixing those is the recipe for massive headache. > It might avoid some of the extra system calls (ie you could use > openat2() to do the path walking part, and then > unlinkat(AT_EMPTY_PATH) to remove it, and have a "fstat()" etc in > between the verify that it's the right type of file or whatever - and > you'd not need an unlinkat2() with resolve flags). > > I think Al needs to ok this kind of change. Maybe you've already > discussed it with him and I just missed it. They have not. And IME samba folks tend to present the set of primitives they want without bothering to explain what do they want to factorize that way, let alone why it should be factorized that way...
On Thu, Mar 12, 2020 at 06:11:09PM +0100, Stefan Metzmacher wrote: > If that works safely for hardlinks and having another process doing a > rename between openat2() and unlinkat(), we could try that. > > My initial naive idea was to have one syscall instead of > linkat2/renameat3/unlinkat2. > > int xlinkat(int src_dfd, const char *src_path, > int dst_dfd, const char *dst_path, > const struct xlinkat_how *how, size_t how_size); > > struct xlinkat_how { > __u64 src_at_flags; > __u64 src_resolve_flags; > __u64 dst_at_flags; > __u64 dst_resolve_flags; > __u64 rename_flags; > __s32 src_fd; > }; > > With src_dfd=-1, src_path=NULL, how.src_fd = -1, this would be like > linkat(). > With dst_dfd=-1, dst_path=NULL, it would be like unlinkat(). > Otherwise a renameat2(). > > If how.src_fd is not -1, it would be checked to be the same path as > specified by src_dfd and src_path. "Checked" as in...? And is that the same path or another link to the same object, or...? The idea of dumping all 3 into the same syscall looks wrong - compare the effects of link() and rename() on the opened files, for starters, and try to come up with documentation for all of that. Multiplexors tend to be very bad, in large part because they have so bloody convoluted semantics...
On Thu, Mar 12, 2020 at 06:11:09PM +0100, Stefan Metzmacher wrote: > Am 12.03.20 um 17:24 schrieb Linus Torvalds: > > On Thu, Mar 12, 2020 at 2:08 AM Stefan Metzmacher <metze@samba.org> wrote: > >> > >> The whole discussion was triggered by the introduction of a completely > >> new fsinfo() call: > >> > >> Would you propose to have 'at_flags' and 'resolve_flags' passed in here? > > > > Yes, I think that would be the way to go. > > Ok, that's also fine for me:-) > > >>> If we need linkat2() and friends, so be it. Do we? > >> > >> Yes, I'm going to propose something like this, as it would make the life > >> much easier for Samba to have the new features available on all path > >> based syscalls. > > > > Will samba actually use them? I think we've had extensions before that > > weren't worth the non-portability pain? > > Yes, we're currently moving to the portable *at() calls as a start. > And we already make use of Linux only feature for performance reasons > in other places. Having the new resolve flags will make it possible to > move some of the performance intensive work into non-linux specific > modules as fallback. > > I hope that we'll use most of this through io_uring in the end, > that's the reason Jens added the IORING_REGISTER_PERSONALITY feature > used for IORING_OP_OPENAT2. > > > But yes, if we have a major package like samba use it, then by all > > means let's add linkat2(). How many things are we talking about? We > > have a number of system calls that do *not* take flags, but do do > > pathname walking. I'm thinking things like "mkdirat()"?) > > I haven't looked them up in detail yet. > Jeremy can you provide a list? Fixing the flags argument on fchmodat() to actually *implement* AT_SYMLINK_NOFOLLOW would be a good start :-). As for the syscalls that don't have flags I'm thinking of the things like: getxattr/setxattr/removexattr just off the top of my head. Jeremy.
On Fri, Mar 13, 2020 at 08:59:01PM +1100, Aleksa Sarai wrote: > > I have heard some folks asking for a way to create a directory and get a > handle to it atomically -- so arguably this is something that could be > inside openat2()'s feature set (O_MKDIR?). But I'm not sure how popular > of an idea this is. This would be very useful to prevent race conditions between making a directory and EA's on it, as are needed by Samba for DOS attributes and Windows/NFSv4 ACLS.
On Fri, Mar 13, 2020 at 08:59:01PM +1100, Aleksa Sarai wrote: > On 2020-03-12, Stefan Metzmacher <metze@samba.org> wrote: > > Am 12.03.20 um 17:24 schrieb Linus Torvalds: > > > But yes, if we have a major package like samba use it, then by all > > > means let's add linkat2(). How many things are we talking about? We > > > have a number of system calls that do *not* take flags, but do do > > > pathname walking. I'm thinking things like "mkdirat()"?) > > > > I haven't looked them up in detail yet. > > Jeremy can you provide a list? > > > > Do you think we could route some of them like mkdirat() and mknodat() > > via openat2() instead of creating new syscalls? > > I have heard some folks asking for a way to create a directory and get a > handle to it atomically -- so arguably this is something that could be > inside openat2()'s feature set (O_MKDIR?). But I'm not sure how popular > of an idea this is. For fuck sake, *NO*! We don't need any more multiplexors from hell. mkdir() and open() have deeply different interpretation of pathnames (and anyone who asks for e.g. traversals of dangling symlinks on mkdir() is insane). Don't try to mix those; even O_TMPFILE had been a mistake. Folks, we'd paid very dearly for the atomic_open() merge. We are _still_ paying for it - and keep finding bugs induced by the convoluted horrors in that thing (see yesterday pull from vfs.git#fixes for the latest crop). I hope to get into more or less sane shape (part - this cycle, with followups in the next one), but the last thing we need is more complexity in the area. Keep the semantics simple and regular; corner cases _suck_. "Infinitely extensible (without review)" is no virtue. And having nowhere to hide very special flags for very special kludges is a bloody good thing. Every fucking time we had a multiplexed syscall, it had been a massive source of trouble. IF it has a uniform semantics - fine; we don't need arseloads of read_this(2)/read_that(2). But when you need pages upon pages to describe the subtle differences in the interpretation of its arguments, you have already lost. It will be full of corner cases, they will get zero testing and they will rot. Inevitably. All the faster for the lack of people who would be able to keep all of that in head. We do have a mechanism for multiplexing; on amd64 it lives in do_syscall_64(). We really don't need openat2() turning into another one. Syscall table slots are not in a short supply, and the level of review one gets from "new syscall added" is higher than from "make fubar(2) recognize a new member in options->union_full_of_crap if it has RESOLVE_TO_WANK_WITH_RIGHT_HAND set in options->flags, affecting its behaviour in some odd ways". Which is a good thing, damnit.
On Fri, Mar 13, 2020 at 06:28:44PM +0000, Al Viro wrote: > On Fri, Mar 13, 2020 at 08:59:01PM +1100, Aleksa Sarai wrote: > > On 2020-03-12, Stefan Metzmacher <metze@samba.org> wrote: > > > Am 12.03.20 um 17:24 schrieb Linus Torvalds: > > > > But yes, if we have a major package like samba use it, then by all > > > > means let's add linkat2(). How many things are we talking about? We > > > > have a number of system calls that do *not* take flags, but do do > > > > pathname walking. I'm thinking things like "mkdirat()"?) > > > > > > I haven't looked them up in detail yet. > > > Jeremy can you provide a list? > > > > > > Do you think we could route some of them like mkdirat() and mknodat() > > > via openat2() instead of creating new syscalls? > > > > I have heard some folks asking for a way to create a directory and get a > > handle to it atomically -- so arguably this is something that could be > > inside openat2()'s feature set (O_MKDIR?). But I'm not sure how popular > > of an idea this is. > > For fuck sake, *NO*! > > We don't need any more multiplexors from hell. mkdir() and open() have > deeply different interpretation of pathnames (and anyone who asks for > e.g. traversals of dangling symlinks on mkdir() is insane). Don't try to > mix those; even O_TMPFILE had been a mistake. > > Folks, we'd paid very dearly for the atomic_open() merge. We are _still_ > paying for it - and keep finding bugs induced by the convoluted horrors > in that thing (see yesterday pull from vfs.git#fixes for the latest crop). > I hope to get into more or less sane shape (part - this cycle, with > followups in the next one), but the last thing we need is more complexity > in the area. Can we disentangle the laudable desire to keep kernel internals simple (which I completely agree with :-) from the desire to keep user-space interfaces simple ? Having some way of doing a mkdir() that returns an open fd on the new directory *is* a very useful thing for many applications, but I really don't care how the kernel implements it. We have so much Linux-specific code already that one more thing won't matter :-).
On 2020-03-13, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Fri, Mar 13, 2020 at 08:59:01PM +1100, Aleksa Sarai wrote: > > On 2020-03-12, Stefan Metzmacher <metze@samba.org> wrote: > > > Am 12.03.20 um 17:24 schrieb Linus Torvalds: > > > > But yes, if we have a major package like samba use it, then by all > > > > means let's add linkat2(). How many things are we talking about? We > > > > have a number of system calls that do *not* take flags, but do do > > > > pathname walking. I'm thinking things like "mkdirat()"?) > > > > > > I haven't looked them up in detail yet. > > > Jeremy can you provide a list? > > > > > > Do you think we could route some of them like mkdirat() and mknodat() > > > via openat2() instead of creating new syscalls? > > > > I have heard some folks asking for a way to create a directory and get a > > handle to it atomically -- so arguably this is something that could be > > inside openat2()'s feature set (O_MKDIR?). But I'm not sure how popular > > of an idea this is. > > For fuck sake, *NO*! > > We don't need any more multiplexors from hell. mkdir() and open() have > deeply different interpretation of pathnames (and anyone who asks for > e.g. traversals of dangling symlinks on mkdir() is insane). Don't try to > mix those; even O_TMPFILE had been a mistake. I agree that O_TMPFILE is a mess, and you're right that it wouldn't be a good idea to fold it into open*(). But what is your opinion on a hypothetical mkdirat2() which would let you get an fd to the directory that was just created? > We really don't need openat2() turning into another one. Syscall table > slots are not in a short supply, and the level of review one gets from > "new syscall added" is higher than from "make fubar(2) recognize a new > member in options->union_full_of_crap if it has RESOLVE_TO_WANK_WITH_RIGHT_HAND > set in options->flags, affecting its behaviour in some odd ways". > Which is a good thing, damnit. You're quite right, and I don't intend openat2() to become another ioctl-but-with-even-more-fun-semantics.
diff --git a/fs/open.c b/fs/open.c index 0788b3715731..7c38a7605c21 100644 --- a/fs/open.c +++ b/fs/open.c @@ -977,7 +977,7 @@ inline struct open_how build_open_how(int flags, umode_t mode) inline int build_open_flags(const struct open_how *how, struct open_flags *op) { int flags = how->flags; - int lookup_flags = 0; + int lookup_flags = LOOKUP_FOLLOW | LOOKUP_AUTOMOUNT; int acc_mode = ACC_MODE(flags); /* Must never be set by userspace */ @@ -1055,8 +1055,8 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op) if (flags & O_DIRECTORY) lookup_flags |= LOOKUP_DIRECTORY; - if (!(flags & O_NOFOLLOW)) - lookup_flags |= LOOKUP_FOLLOW; + if (flags & O_NOFOLLOW) + lookup_flags &= ~LOOKUP_FOLLOW; if (how->resolve & RESOLVE_NO_XDEV) lookup_flags |= LOOKUP_NO_XDEV; @@ -1068,6 +1068,8 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op) lookup_flags |= LOOKUP_BENEATH; if (how->resolve & RESOLVE_IN_ROOT) lookup_flags |= LOOKUP_IN_ROOT; + if (how->resolve & RESOLVE_NO_TRAILING_SYMLINKS) + lookup_flags &= ~LOOKUP_FOLLOW; op->lookup_flags = lookup_flags; return 0; diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h index 7bcdcf4f6ab2..eacf17a8ca34 100644 --- a/include/linux/fcntl.h +++ b/include/linux/fcntl.h @@ -19,7 +19,8 @@ /* List of all valid flags for the how->resolve argument: */ #define VALID_RESOLVE_FLAGS \ (RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \ - RESOLVE_BENEATH | RESOLVE_IN_ROOT) + RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_NO_TRAILING_SYMLINKS | \ + RESOLVE_NO_TRAILING_AUTOMOUNTS | RESOLVE_EMPTY_PATH) /* List of all open_how "versions". */ #define OPEN_HOW_SIZE_VER0 24 /* sizeof first published struct */ diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h index 58b1eb711360..2647a108f116 100644 --- a/include/uapi/linux/openat2.h +++ b/include/uapi/linux/openat2.h @@ -22,7 +22,10 @@ struct open_how { __u64 resolve; }; -/* how->resolve flags for openat2(2). */ +/* + * Path resolution paths to replace AT_* paths in all new syscalls that would + * use them. + */ #define RESOLVE_NO_XDEV 0x01 /* Block mount-point crossings (includes bind-mounts). */ #define RESOLVE_NO_MAGICLINKS 0x02 /* Block traversal through procfs-style @@ -35,5 +38,8 @@ struct open_how { #define RESOLVE_IN_ROOT 0x10 /* Make all jumps to "/" and ".." be scoped inside the dirfd (similar to chroot(2)). */ +#define RESOLVE_NO_TRAILING_SYMLINKS 0x20 /* Don't follow trailing symlinks in the path */ +#define RESOLVE_NO_TRAILING_AUTOMOUNTS 0x40 /* Don't follow trailing automounts in the path */ +#define RESOLVE_EMPTY_PATH 0x80 /* Permit a path of "" to indicate the dfd exactly */ #endif /* _UAPI_LINUX_OPENAT2_H */
Add additional RESOLVE_* flags to correspond to AT_* flags that aren't currently implemented: RESOLVE_NO_TRAILING_SYMLINKS for AT_SYMLINK_NOFOLLOW RESOLVE_NO_TRAILING_AUTOMOUNTS for AT_NO_AUTOMOUNT RESOLVE_EMPTY_PATH for AT_EMPTY_PATH This is necessary for fsinfo() to use RESOLVE_* flags instead of AT_* flags if the latter are to be considered deprecated for new system calls. Also make openat2() handle RESOLVE_NO_TRAILING_SYMLINKS. Automounting is currently forced by doing an open(), so adding support to openat2() for RESOLVE_NO_TRAILING_AUTOMOUNTS is not trivial. Reported-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Aleksa Sarai <cyphar@cyphar.com> --- fs/open.c | 8 +++++--- include/linux/fcntl.h | 3 ++- include/uapi/linux/openat2.h | 8 +++++++- 3 files changed, 14 insertions(+), 5 deletions(-)