Message ID | 20230721105743.819362688@infradead.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | futex: More futex2 bits | expand |
On Fri, Jul 21, 2023, at 12:22, Peter Zijlstra wrote: > * futex_parse_waitv - Parse a waitv array from userspace > @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute > if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) > return -EINVAL; > > - if (!(aux.flags & FUTEX2_32)) > + if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { > + if ((aux.flags & FUTEX2_64) == FUTEX2_64) > + return -EINVAL; > + } > + > + if ((aux.flags & FUTEX2_64) != FUTEX2_32) > return -EINVAL; This looks slightly confusing, how about defining another FUTEX2_SIZEMASK (or similar) macro to clarify that "aux.flags & FUTEX2_64" is a mask operation that can match the FUTEX2_{8,16,32,64} values? Arnd
On Fri, Jul 21, 2023 at 05:47:31PM +0200, Arnd Bergmann wrote: > On Fri, Jul 21, 2023, at 12:22, Peter Zijlstra wrote: > > * futex_parse_waitv - Parse a waitv array from userspace > > @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute > > if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) > > return -EINVAL; > > > > - if (!(aux.flags & FUTEX2_32)) > > + if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { > > + if ((aux.flags & FUTEX2_64) == FUTEX2_64) > > + return -EINVAL; > > + } > > + > > + if ((aux.flags & FUTEX2_64) != FUTEX2_32) > > return -EINVAL; > > This looks slightly confusing, how about defining another > FUTEX2_SIZEMASK (or similar) macro to clarify that > "aux.flags & FUTEX2_64" is a mask operation that can > match the FUTEX2_{8,16,32,64} values? Yeah, I had that in an earlier version, but then reconsidered as I didn't want to clutter the uabi with that. But perhaps I over-throught this.
On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: > +#define FUTEX2_8 0x00 > +#define FUTEX2_16 0x01 > #define FUTEX2_32 0x02 > - /* 0x04 */ > +#define FUTEX2_64 0x03 > +#define FUTEX2_NUMA 0x04 > /* 0x08 */ > /* 0x10 */ > /* 0x20 */ > --- a/kernel/futex/syscalls.c > +++ b/kernel/futex/syscalls.c > @@ -183,7 +183,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad > return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); > } > > -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) > +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) > > /** > * futex_parse_waitv - Parse a waitv array from userspace > @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute > if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) > return -EINVAL; With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I don't think that's intentional. Thanks, tglx
On Mon, Jul 31, 2023 at 06:11:27PM +0200, Thomas Gleixner wrote: > On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: > > +#define FUTEX2_8 0x00 > > +#define FUTEX2_16 0x01 > > #define FUTEX2_32 0x02 > > - /* 0x04 */ > > +#define FUTEX2_64 0x03 > > +#define FUTEX2_NUMA 0x04 > > /* 0x08 */ > > /* 0x10 */ > > /* 0x20 */ > > --- a/kernel/futex/syscalls.c > > +++ b/kernel/futex/syscalls.c > > @@ -183,7 +183,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad > > return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); > > } > > > > -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) > > +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) > > > > /** > > * futex_parse_waitv - Parse a waitv array from userspace > > @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute > > if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) > > return -EINVAL; > > With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I > don't think that's intentional. FUTEX2_8 0 FUTEX2_16 1 FUTEX2_32 2 FUTEX2_64 3 Therefore FUTEX2_64, when used as a mask, includes then all. Arnd suggested adding FUTEX2_SIZE_MASK or something. And I initially had something like that, but pulled it for not wanting to pullute the uabi. Can easily add back.
On Mon, Jul 31 2023 at 18:11, Thomas Gleixner wrote: > On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: >> -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) >> +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) >> >> /** >> * futex_parse_waitv - Parse a waitv array from userspace >> @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute >> if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) >> return -EINVAL; > > With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I > don't think that's intentional. Also if you allow 64bit wide futexes, how is that supposed to work with the existing code, which clearly expects a 32bit uval throughout the place? Thanks, tglx
On Mon, Jul 31, 2023 at 07:16:29PM +0200, Thomas Gleixner wrote: > On Mon, Jul 31 2023 at 18:11, Thomas Gleixner wrote: > > On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: > >> -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) > >> +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) > >> > >> /** > >> * futex_parse_waitv - Parse a waitv array from userspace > >> @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute > >> if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) > >> return -EINVAL; > > > > With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I > > don't think that's intentional. > > Also if you allow 64bit wide futexes, how is that supposed to work with > the existing code, which clearly expects a 32bit uval throughout the > place? Not allowed yet, these patches only allow 8,16,32. I still need to audit the whole futex core and do 'u32 -> unsigned long' (and everything else that follows from that), and only when that's done can the futex2 syscalls allow FUTEX2_64 on 64bit archs. So for now, these patches: - add the FUTEX2_64 flag, - add 'unsigned long' interface such that 64bit can potentiall use it, - explicitly disallow having it set.
On Mon, Jul 31 2023 at 18:11, Thomas Gleixner wrote: > On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: >> +#define FUTEX2_8 0x00 >> +#define FUTEX2_16 0x01 >> #define FUTEX2_32 0x02 >> - /* 0x04 */ >> +#define FUTEX2_64 0x03 >> +#define FUTEX2_NUMA 0x04 >> /* 0x08 */ >> /* 0x10 */ >> /* 0x20 */ >> --- a/kernel/futex/syscalls.c >> +++ b/kernel/futex/syscalls.c >> @@ -183,7 +183,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad >> return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); >> } >> >> -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) >> +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) >> >> /** >> * futex_parse_waitv - Parse a waitv array from userspace >> @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute >> if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) >> return -EINVAL; > > With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I > don't think that's intentional. Aargh. This is really nasty to make FUTEX2_64 0x3 and abuse it to test the flags for validity. Intuitive and obvious is something else.
On Mon, Jul 31, 2023 at 07:42:11PM +0200, Thomas Gleixner wrote: > Aargh. This is really nasty to make FUTEX2_64 0x3 and abuse it to test > the flags for validity. Intuitive and obvious is something else. Like so then? --- a/include/uapi/linux/futex.h +++ b/include/uapi/linux/futex.h @@ -57,6 +57,8 @@ /* 0x40 */ #define FUTEX2_PRIVATE FUTEX_PRIVATE_FLAG +#define FUTEX2_SIZE_MASK 0x03 + /* do not use */ #define FUTEX_32 FUTEX2_32 /* historical accident :-( */ --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -183,7 +183,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); } -#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) +#define FUTEX2_MASK (FUTEX2_SIZE_MASK | FUTEX2_PRIVATE) /** * futex_parse_waitv - Parse a waitv array from userspace @@ -208,11 +208,11 @@ static int futex_parse_waitv(struct fute return -EINVAL; if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { - if ((aux.flags & FUTEX2_64) == FUTEX2_64) + if ((aux.flags & FUTEX2_SIZE_MASK) == FUTEX2_64) return -EINVAL; } - if ((aux.flags & FUTEX2_64) != FUTEX2_32) + if ((aux.flags & FUTEX2_SIZE_MASK) != FUTEX2_32) return -EINVAL; futexv[i].w.flags = aux.flags;
On Mon, Jul 31 2023 at 19:35, Peter Zijlstra wrote: > On Mon, Jul 31, 2023 at 07:16:29PM +0200, Thomas Gleixner wrote: >> On Mon, Jul 31 2023 at 18:11, Thomas Gleixner wrote: >> > On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote: >> >> -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) >> >> +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) >> >> >> >> /** >> >> * futex_parse_waitv - Parse a waitv array from userspace >> >> @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute >> >> if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) >> >> return -EINVAL; >> > >> > With the above aux.flags with FUTEX2_32 set will result in -EINVAL. I >> > don't think that's intentional. >> >> Also if you allow 64bit wide futexes, how is that supposed to work with >> the existing code, which clearly expects a 32bit uval throughout the >> place? > > Not allowed yet, these patches only allow 8,16,32. I still need to audit > the whole futex core and do 'u32 -> unsigned long' (and everything else > that follows from that), and only when that's done can the futex2 > syscalls allow FUTEX2_64 on 64bit archs. > > So for now, these patches: > > - add the FUTEX2_64 flag, > - add 'unsigned long' interface such that > 64bit can potentiall use it, > - explicitly disallow having it set. I figured that out very late. This flags having a size fields which claims to be flags had confused the hell out of me.
On Mon, Jul 31 2023 at 21:20, Peter Zijlstra wrote: > -#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) > +#define FUTEX2_MASK (FUTEX2_SIZE_MASK | FUTEX2_PRIVATE) Along with some comment which documents that the size "flags" constitute a number field and not flags in the sense of binary flags. And please name these size constants so it really becomes obvious: #define FUTEX2_SIZE_U32 2 > /** > * futex_parse_waitv - Parse a waitv array from userspace > @@ -208,11 +208,11 @@ static int futex_parse_waitv(struct fute > return -EINVAL; > > if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { > - if ((aux.flags & FUTEX2_64) == FUTEX2_64) > + if ((aux.flags & FUTEX2_SIZE_MASK) == FUTEX2_64) > return -EINVAL; > } That should be part of the actual 64bit futex enablement, no? > - if ((aux.flags & FUTEX2_64) != FUTEX2_32) > + if ((aux.flags & FUTEX2_SIZE_MASK) != FUTEX2_32) > return -EINVAL; In hindsight I think it was as mistake just to have this __u32 flags field in the new interface. Soemthing like the incomplete below might be retrofittable, no? --- a/include/uapi/linux/futex.h +++ b/include/uapi/linux/futex.h @@ -74,7 +74,12 @@ struct futex_waitv { __u64 val; __u64 uaddr; - __u32 flags; + union { + __u32 flags; + __u32 size : 2, + : 5, + private : 1; + }; __u32 __reserved; }; --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -204,10 +204,10 @@ static int futex_parse_waitv(struct fute if (copy_from_user(&aux, &uwaitv[i], sizeof(aux))) return -EFAULT; - if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) + if ((aux.flags & ~FUTEX2_VALID_FLAGBITS) || aux.__reserved) return -EINVAL; - if (!(aux.flags & FUTEX2_32)) + if (aux.size != FUTEX2_SIZE_U32) return -EINVAL; futexv[i].w.flags = aux.flags; If this muck already confused me right now, then I don't want to experience the confusion factor 6 month down the road :) Thanks, tglx
On Mon, Jul 31, 2023 at 11:14:11PM +0200, Thomas Gleixner wrote: > On Mon, Jul 31 2023 at 21:20, Peter Zijlstra wrote: > > -#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) > > +#define FUTEX2_MASK (FUTEX2_SIZE_MASK | FUTEX2_PRIVATE) > > Along with some comment which documents that the size "flags" constitute > a number field and not flags in the sense of binary flags. > > And please name these size constants so it really becomes obvious: > > #define FUTEX2_SIZE_U32 2 So you want them named: #define FUTEX2_SIZE_U8 0x00 #define FUTEX2_SIZE_U16 0x01 #define FUTEX2_SIZE_U32 0x02 #define FUTEX2_SIZE_U64 0x03 #define FUTEX2_SIZE_MASK 0x03 Sure, can do. > > /** > > * futex_parse_waitv - Parse a waitv array from userspace > > @@ -208,11 +208,11 @@ static int futex_parse_waitv(struct fute > > return -EINVAL; > > > > if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { > > - if ((aux.flags & FUTEX2_64) == FUTEX2_64) > > + if ((aux.flags & FUTEX2_SIZE_MASK) == FUTEX2_64) > > return -EINVAL; > > } > > That should be part of the actual 64bit futex enablement, no? The 'unsigned long' thing is part of the syscalls, which is why I had it now. > > > - if ((aux.flags & FUTEX2_64) != FUTEX2_32) > > + if ((aux.flags & FUTEX2_SIZE_MASK) != FUTEX2_32) > > return -EINVAL; > > In hindsight I think it was as mistake just to have this __u32 flags > field in the new interface. Soemthing like the incomplete below might be > retrofittable, no? > > --- a/include/uapi/linux/futex.h > +++ b/include/uapi/linux/futex.h > @@ -74,7 +74,12 @@ > struct futex_waitv { > __u64 val; > __u64 uaddr; > - __u32 flags; > + union { > + __u32 flags; > + __u32 size : 2, > + : 5, > + private : 1; > + }; > __u32 __reserved; > }; Durr, I'm not sure I remember if that does the right thing across architectures -- might just work. But I'm fairly sure this isn't the only case of a field in a flags thing in our APIs. Although obviously I can't find another case in a hurry :/ Also, sys_futex_{wake,wait}() have this thing as a syscall argument, surely you don't want to put this union there as well? I'd much prefer to just keep the 'unsigned int flags' thing and perhaps put a comment on-top of the '#define FUTEX2_*' thingies. Note that having it a field instead of a bunch of flags makes sense, since you can only have a single size, not a combination of sizes.
On Mon, Jul 31 2023 at 23:33, Peter Zijlstra wrote: > On Mon, Jul 31, 2023 at 11:14:11PM +0200, Thomas Gleixner wrote: >> --- a/include/uapi/linux/futex.h >> +++ b/include/uapi/linux/futex.h >> @@ -74,7 +74,12 @@ >> struct futex_waitv { >> __u64 val; >> __u64 uaddr; >> - __u32 flags; >> + union { >> + __u32 flags; >> + __u32 size : 2, >> + : 5, >> + private : 1; >> + }; >> __u32 __reserved; >> }; > > Durr, I'm not sure I remember if that does the right thing across > architectures -- might just work. But I'm fairly sure this isn't the > only case of a field in a flags thing in our APIs. Although obviously > I can't find another case in a hurry :/ I know, but that doesn't make these things more readable and neither an argument against doing it for futex2 :) > Also, sys_futex_{wake,wait}() have this thing as a syscall argument, > surely you don't want to put this union there as well? Why not? The anon union does not break the ABI unless I'm missing something. Existing user space can still use 'flags' and people who care about readability can use the bitfield, no? Its inside struct futex_waitv and not an explicit syscall argument, right? > I'd much prefer to just keep the 'unsigned int flags' thing and perhaps > put a comment on-top of the '#define FUTEX2_*' thingies. Note that > having it a field instead of a bunch of flags makes sense, since you can > only have a single size, not a combination of sizes. I'm aware of that by now :) Still that explicit bitfield does neither need comments nor does it leave room for interpretation. Thanks, tglx
On Tue, Aug 01, 2023 at 12:43:24AM +0200, Thomas Gleixner wrote: > > Also, sys_futex_{wake,wait}() have this thing as a syscall argument, > > surely you don't want to put this union there as well? > > Why not? The anon union does not break the ABI unless I'm missing > something. Existing user space can still use 'flags' and people who care > about readability can use the bitfield, no? > > Its inside struct futex_waitv and not an explicit syscall argument, right? Nope, see patches 5 and 6, they introduce: sys_futex_wake(void __user *uaddr, unsigned long mask, int nr, unsigned int flags); sys_futex_wait(void __user *uaddr, unsigned long val, unsigned long mask, unsigned int flags, struct __kernel_timespec __user *timeout, clockid_t clockid); Using a union, would turn that into: sys_futex_wake(void __user *uaddr, unsigned long mask, int nr, union futex_flags flags); sys_futex_wait(void __user *uaddr, unsigned long val, unsigned long mask, union futex_flags flags, struct __kernel_timespec __user *timeout, clockid_t clockid); Which then gets people to write garbage like: futex_wake(add, 0xFFFF, 1, (union futex_flags){ .flags = FUTEX2_SIZE_U16 | FUTEX2_PRIVATE)); or futex_wake(add, 0xFFFF, 1, (union futex_flags){ .size = FUTEX2_SIZE_U16, private = true, )); You really want that ?
On Tue, Aug 1, 2023, at 00:43, Thomas Gleixner wrote: > On Mon, Jul 31 2023 at 23:33, Peter Zijlstra wrote: >> On Mon, Jul 31, 2023 at 11:14:11PM +0200, Thomas Gleixner wrote: >>> --- a/include/uapi/linux/futex.h >>> +++ b/include/uapi/linux/futex.h >>> @@ -74,7 +74,12 @@ >>> struct futex_waitv { >>> __u64 val; >>> __u64 uaddr; >>> - __u32 flags; >>> + union { >>> + __u32 flags; >>> + __u32 size : 2, >>> + : 5, >>> + private : 1; >>> + }; >>> __u32 __reserved; >>> }; >> >> Durr, I'm not sure I remember if that does the right thing across >> architectures -- might just work. But I'm fairly sure this isn't the >> only case of a field in a flags thing in our APIs. Although obviously >> I can't find another case in a hurry :/ > > I know, but that doesn't make these things more readable and neither an > argument against doing it for futex2 :) ... > > Still that explicit bitfield does neither need comments nor does it > leave room for interpretation. It may be clear to the compiler, but without comments or looking up psABI documentation I certainly wouldn't know immediately which bits of the flags word overlay the bitfields for a given combination of __BIG_ENDIAN/__LITTLE_ENDIAN and __BIG_ENDIAN_BITFIELD/__LITTLE_ENDIAN_BITFIELD or architectures with unusual struct alignment requirements (m68k or arm-oabi). I'd prefer to completely avoid the bitfield here. Maybe having exclusive flags for each width would be less confusing, at the cost of needing two more flag bits and a slightly more complicated sanity check, or we could take an extra byte out of the __reserved field to store the length. Arnd
On Tue, Aug 01 2023 at 00:59, Peter Zijlstra wrote: > On Tue, Aug 01, 2023 at 12:43:24AM +0200, Thomas Gleixner wrote: > Which then gets people to write garbage like: > > futex_wake(add, 0xFFFF, 1, (union futex_flags){ .flags = FUTEX2_SIZE_U16 | FUTEX2_PRIVATE)); > or > futex_wake(add, 0xFFFF, 1, (union futex_flags){ .size = FUTEX2_SIZE_U16, private = true, )); > > You really want that ? Well, people write garbage no matter what. So just keep the flags and make the names explicit. Note to myself: /me shouldn't look at futex patches when tired
--- a/include/uapi/linux/futex.h +++ b/include/uapi/linux/futex.h @@ -46,10 +46,11 @@ /* * Flags for futex2 syscalls. */ - /* 0x00 */ - /* 0x01 */ +#define FUTEX2_8 0x00 +#define FUTEX2_16 0x01 #define FUTEX2_32 0x02 - /* 0x04 */ +#define FUTEX2_64 0x03 +#define FUTEX2_NUMA 0x04 /* 0x08 */ /* 0x10 */ /* 0x20 */ --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -183,7 +183,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uad return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3); } -#define FUTEX2_MASK (FUTEX2_32 | FUTEX2_PRIVATE) +#define FUTEX2_MASK (FUTEX2_64 | FUTEX2_PRIVATE) /** * futex_parse_waitv - Parse a waitv array from userspace @@ -207,7 +207,12 @@ static int futex_parse_waitv(struct fute if ((aux.flags & ~FUTEX2_MASK) || aux.__reserved) return -EINVAL; - if (!(aux.flags & FUTEX2_32)) + if (!IS_ENABLED(CONFIG_64BIT) || in_compat_syscall()) { + if ((aux.flags & FUTEX2_64) == FUTEX2_64) + return -EINVAL; + } + + if ((aux.flags & FUTEX2_64) != FUTEX2_32) return -EINVAL; futexv[i].w.flags = aux.flags;
Add the definition for the missing but always intended extra sizes, and add a NUMA flag for the planned numa extention. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- include/uapi/linux/futex.h | 7 ++++--- kernel/futex/syscalls.c | 9 +++++++-- 2 files changed, 11 insertions(+), 5 deletions(-)