Message ID | AS8PR02MB7237F5BFDAA793E15692B3998BFD2@AS8PR02MB7237.eurprd02.prod.outlook.com (mailing list archive) |
---|---|
Headers | show |
Series | Hardening perf subsystem | expand |
Hi Andrew, On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > Hi everyone, > > This is an effort to get rid of all multiplications from allocation > functions in order to prevent integer overflows [1][2]. > > In the first patch, the "struct amd_uncore_ctx" can be refactored to > use a flex array for the "events" member. This way, the allocation/ > freeing of the memory can be simplified. Then, the struct_size() > helper can be used to do the arithmetic calculation for the memory > to be allocated. > > In the second patch, as the "struct intel_uncore_box" ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() function. > > In the third patch, as the "struct perf_buffer" also ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() functions. At the same > time, prepare for the coming implementation by GCC and Clang of the > __counted_by attribute. > > This time, I have decided to send these three patches in the same serie > since all of them has been rejected by the maintainers. I have used > the v4 tag since it is the latest iteration in one of the patches. > > The reason these patches were rejected is that Peter Zijlstra detest > the struct_size() helper [3][4]. However, Kees Cook and I consider that > the use of this helper improves readability. But we can all say that it > is a matter of preference. > > Anyway, leaving aside personal preferences, these patches has the > following pros: > > - Code robustness improvement (__counted_by coverage). > - Code robustness improvement (use of struct_size() to do calculations > on memory allocator functions). > - Fewer lines of code. > - Follow the recommendations made in "Deprecated Interfaces, Language > Features, Attributes, and Conventions" [5] as the preferred method > of doing things in the kernel. > - Widely used in the kernel. > - Widely accepted in the kernel. > > There are also patches in this subsystem that use the struct_size() > helper: > > 6566f907bf31 ("Convert intel uncore to struct_size") by Matthew Wilcox > dfbc411e0a5e ("perf/x86/rapl: Prefer struct_size() over open coded arithmetic") by me > > Therefore, I would like these patches to be applied this time. This is my last attemp to get these patches applied. I have decided to send this mail to try to unjam this situation. I have folowed all the reviewers comments and have no response from the maintainers other than "I detest the struct_size() helper". Therefore, I would like to know your opinion and that of other people about these patches. If the final consensus is that the code has no real benefit, I will stop insisting on it ;) Regards, Erick > > Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] > Link: https://github.com/KSPP/linux/issues/160 [2] > Link: https://lore.kernel.org/linux-hardening/20240430091833.GD40213@noisy.programming.kicks-ass.net [3] > Link: https://lore.kernel.org/linux-hardening/20240430091504.GC40213@noisy.programming.kicks-ass.net [4] > Link: https://docs.kernel.org/process/deprecated.html [5] > --- > Changes in v4: > > - Add the "Reviewed-by:" tag to the three patches. > - Rebase against linux next (tag next-20240531). > > Previous versions: > > perf/x86/amd/uncore: Add flex array to struct amd_uncore_ctx > v1 -> https://lore.kernel.org/linux-hardening/AS8PR02MB7237E4848B44A5226BD3551C8BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/ > > perf/x86/intel/uncore: Prefer struct_size over open coded arithmetic > v1 -> https://lore.kernel.org/linux-hardening/AS8PR02MB7237F4D39BF6AA0FF40E91638B392@AS8PR02MB7237.eurprd02.prod.outlook.com/ > > perf/ring_buffer: Prefer struct_size over open coded arithmetic > v3 -> https://lore.kernel.org/linux-hardening/AS8PR02MB72379B4807F3951A1B926BA58BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/ > v2 -> https://lore.kernel.org/linux-hardening/AS8PR02MB7237569E4FBE0B26F62FDFDB8B1D2@AS8PR02MB7237.eurprd02.prod.outlook.com/ > v1 -> https://lore.kernel.org/linux-hardening/AS8PR02MB72372AB065EA8340D960CCC48B1B2@AS8PR02MB7237.eurprd02.prod.outlook.com/ > > Changes in v3: > - Refactor the logic, compared to the previous version, of the second > rb_alloc() function to gain __counted_by() coverage (Kees Cook and > Christophe JAILLET). > > Changes in v2: > - Annotate "struct perf_buffer" with __counted_by() attribute (Kees Cook). > - Refactor the logic to gain __counted_by() coverage (Kees Cook). > --- > Erick Archer (3): > perf/x86/amd/uncore: Add flex array to struct amd_uncore_ctx > perf/x86/intel/uncore: Prefer struct_size over open coded arithmetic > perf/ring_buffer: Prefer struct_size over open coded arithmetic > > arch/x86/events/amd/uncore.c | 18 +++++------------- > arch/x86/events/intel/uncore.c | 7 +++---- > kernel/events/internal.h | 2 +- > kernel/events/ring_buffer.c | 15 ++++----------- > 4 files changed, 13 insertions(+), 29 deletions(-) > > -- > 2.25.1 >
On Sat, Jun 08, 2024 at 10:50:44AM +0200, Erick Archer wrote: > Hi Andrew, > > On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > > Hi everyone, > > > > This is an effort to get rid of all multiplications from allocation > > functions in order to prevent integer overflows [1][2]. > > > > In the first patch, the "struct amd_uncore_ctx" can be refactored to > > use a flex array for the "events" member. This way, the allocation/ > > freeing of the memory can be simplified. Then, the struct_size() > > helper can be used to do the arithmetic calculation for the memory > > to be allocated. > > > > In the second patch, as the "struct intel_uncore_box" ends in a > > flexible array, the preferred way in the kernel is to use the > > struct_size() helper to do the arithmetic instead of the calculation > > "size + count * size" in the kzalloc_node() function. > > > > In the third patch, as the "struct perf_buffer" also ends in a > > flexible array, the preferred way in the kernel is to use the > > struct_size() helper to do the arithmetic instead of the calculation > > "size + count * size" in the kzalloc_node() functions. At the same > > time, prepare for the coming implementation by GCC and Clang of the > > __counted_by attribute. > > > > This time, I have decided to send these three patches in the same serie > > since all of them has been rejected by the maintainers. I have used > > the v4 tag since it is the latest iteration in one of the patches. > > > > The reason these patches were rejected is that Peter Zijlstra detest > > the struct_size() helper [3][4]. However, Kees Cook and I consider that > > the use of this helper improves readability. But we can all say that it > > is a matter of preference. > > > > Anyway, leaving aside personal preferences, these patches has the > > following pros: > > > > - Code robustness improvement (__counted_by coverage). > > - Code robustness improvement (use of struct_size() to do calculations > > on memory allocator functions). > > - Fewer lines of code. > > - Follow the recommendations made in "Deprecated Interfaces, Language > > Features, Attributes, and Conventions" [5] as the preferred method > > of doing things in the kernel. > > - Widely used in the kernel. > > - Widely accepted in the kernel. > > > > There are also patches in this subsystem that use the struct_size() > > helper: > > > > 6566f907bf31 ("Convert intel uncore to struct_size") by Matthew Wilcox > > dfbc411e0a5e ("perf/x86/rapl: Prefer struct_size() over open coded arithmetic") by me > > > > Therefore, I would like these patches to be applied this time. > > This is my last attemp to get these patches applied. I have decided to > send this mail to try to unjam this situation. I have folowed all the > reviewers comments and have no response from the maintainers other than > "I detest the struct_size() helper". > > Therefore, I would like to know your opinion and that of other people > about these patches. If the final consensus is that the code has no real > benefit, I will stop insisting on it ;) Seriously, I've got plenty patches to look at that actually do something. This falls well within the 'random changes for changes sake' and goes waaaaay down the priority list. If you're addressing an actual issue, state so. Otherwise, go play somewhere else.
On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > Hi everyone, > > This is an effort to get rid of all multiplications from allocation > functions in order to prevent integer overflows [1][2]. I didn't actually see these 3 patches in this thread nor via lore. > In the first patch, the "struct amd_uncore_ctx" can be refactored to > use a flex array for the "events" member. This way, the allocation/ > freeing of the memory can be simplified. Then, the struct_size() > helper can be used to do the arithmetic calculation for the memory > to be allocated. I like this patch because it reduces the allocation from 2 to 1. This isn't what Peter might see as "churn": this is an improvement in resource utilization. I think it's here: https://lore.kernel.org/linux-hardening/AS8PR02MB7237E4848B44A5226BD3551C8BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/ > In the second patch, as the "struct intel_uncore_box" ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() function. This is https://lore.kernel.org/linux-hardening/AS8PR02MB7237F4D39BF6AA0FF40E91638B392@AS8PR02MB7237.eurprd02.prod.outlook.com/ I prefer this style, as it makes things unambiguous ("this will never wrap around") without having to check the associated types and doesn't make the resulting binary code different in the "can never overflow" case. In this particular case: int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg); "int numshared" comes from struct intel_uncore_type::num_shared_regs, which is: unsigned num_shared_regs:8; And the struct sizes are: $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size: /* size: 488, cachelines: 8, members: 19 */ $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size: /* size: 96, cachelines: 2, members: 5 */ So we have: s32 size = 488 + u8 * 96 Max size here is 24968 so it can never overflow an s32, so I can see why Peter views this as "churn". I still think the patch is a coding style improvement, but okay. > In the third patch, as the "struct perf_buffer" also ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() functions. At the same > time, prepare for the coming implementation by GCC and Clang of the > __counted_by attribute. This is https://lore.kernel.org/linux-hardening/AS8PR02MB72379B4807F3951A1B926BA58BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/ This provides __counted_by coverage, and I think this is important to gain in ever place we can. Given that this is part of a ring buffer implementation that is arbitrarily sized, this is exactly the kind of place I'd like to see __counted_by used. This is a runtime robustness improvement, so I don't see this a "churn" at all. Peter, for patches 1 and 3, if you'd prefer not to carry them, I could put them in the hardening tree to keep them out of your way. It seems clear you don't want patch 2 at all. -Kees
On Mon, Jun 10, 2024 at 10:28:52AM -0700, Kees Cook wrote: > On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > > Hi everyone, > > > > This is an effort to get rid of all multiplications from allocation > > functions in order to prevent integer overflows [1][2]. > > I didn't actually see these 3 patches in this thread nor via lore. He managed to break threading between 0/n and the rest. > > In the first patch, the "struct amd_uncore_ctx" can be refactored to > > use a flex array for the "events" member. This way, the allocation/ > > freeing of the memory can be simplified. Then, the struct_size() > > helper can be used to do the arithmetic calculation for the memory > > to be allocated. > > I like this patch because it reduces the allocation from 2 to 1. This > isn't what Peter might see as "churn": this is an improvement in resource > utilization. But then he went and used that struct_size() abomination :/ > I prefer this style, as it makes things unambiguous ("this will never > wrap around") without having to check the associated types and doesn't make > the resulting binary code different in the "can never overflow" case. > > In this particular case: > > int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg); > > "int numshared" comes from struct intel_uncore_type::num_shared_regs, > which is: > > unsigned num_shared_regs:8; > > And the struct sizes are: > > $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size: > /* size: 488, cachelines: 8, members: 19 */ > $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size: > /* size: 96, cachelines: 2, members: 5 */ > > So we have: > > s32 size = 488 + u8 * 96 > > Max size here is 24968 so it can never overflow an s32, so I can see > why Peter views this as "churn". > > I still think the patch is a coding style improvement, but okay. I really detest this thing because it makes what was trivially readable into something opaque. Get me that type qualifier that traps on overflow and write plain C. All this __builtin_overflow garbage is just that, unreadable nonsense. > This provides __counted_by coverage, and I think this is important to > gain in ever place we can. Given that this is part of a ring buffer > implementation that is arbitrarily sized, this is exactly the kind of > place I'd like to see __counted_by used. This is a runtime robustness > improvement, so I don't see this a "churn" at all. Again, mixed in with that other crap. Anyway, remind me wth this __counted_by thing actually does? > Peter, for patches 1 and 3, if you'd prefer not to carry them, I could > put them in the hardening tree to keep them out of your way. It seems > clear you don't want patch 2 at all. I prefer to not have struct_size() anywhere at all. Please just write readable code. Again, if you really care about that overflow muck, get a useable C type qualifier or something so we can write readable code.
On Mon, Jun 10, 2024 at 10:05:44PM +0200, Peter Zijlstra wrote: > On Mon, Jun 10, 2024 at 10:28:52AM -0700, Kees Cook wrote: > > On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > > > Hi everyone, > > > > > > This is an effort to get rid of all multiplications from allocation > > > functions in order to prevent integer overflows [1][2]. > > > > I didn't actually see these 3 patches in this thread nor via lore. > > He managed to break threading between 0/n and the rest. > > > > In the first patch, the "struct amd_uncore_ctx" can be refactored to > > > use a flex array for the "events" member. This way, the allocation/ > > > freeing of the memory can be simplified. Then, the struct_size() > > > helper can be used to do the arithmetic calculation for the memory > > > to be allocated. > > > > I like this patch because it reduces the allocation from 2 to 1. This > > isn't what Peter might see as "churn": this is an improvement in resource > > utilization. > > But then he went and used that struct_size() abomination :/ > > > I prefer this style, as it makes things unambiguous ("this will never > > wrap around") without having to check the associated types and doesn't make > > the resulting binary code different in the "can never overflow" case. > > > > In this particular case: > > > > int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg); > > > > "int numshared" comes from struct intel_uncore_type::num_shared_regs, > > which is: > > > > unsigned num_shared_regs:8; > > > > And the struct sizes are: > > > > $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size: > > /* size: 488, cachelines: 8, members: 19 */ > > $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size: > > /* size: 96, cachelines: 2, members: 5 */ > > > > So we have: > > > > s32 size = 488 + u8 * 96 > > > > Max size here is 24968 so it can never overflow an s32, so I can see > > why Peter views this as "churn". > > > > I still think the patch is a coding style improvement, but okay. > > I really detest this thing because it makes what was trivially readable > into something opaque. Get me that type qualifier that traps on overflow > and write plain C. All this __builtin_overflow garbage is just that, > unreadable nonsense. It's more readable than container_of(), IMO. "give me the struct size for variable VAR, which has a flexible array MEMBER, when we have COUNT many of them": struct_size(VAR, MEMBER, COUNT). It's more readable, more robust, and provides saturation in the face of potential wrap-around. > > This provides __counted_by coverage, and I think this is important to > > gain in ever place we can. Given that this is part of a ring buffer > > implementation that is arbitrarily sized, this is exactly the kind of > > place I'd like to see __counted_by used. This is a runtime robustness > > improvement, so I don't see this a "churn" at all. > > Again, mixed in with that other crap. Anyway, remind me wth this > __counted_by thing actually does? It provides annotation for the compiler to perform run-time bounds checking on dynamically sized arrays. i.e. CONFIG_FORTIFY_SOURCE and CONFIG_UBSAN_BOUNDS can actually reason about annotated flexible arrays instead of just saying "oh no a flexible array, I give up". > > Peter, for patches 1 and 3, if you'd prefer not to carry them, I could > > put them in the hardening tree to keep them out of your way. It seems > > clear you don't want patch 2 at all. > > I prefer to not have struct_size() anywhere at all. Please just write > readable code. That ship has sailed, and it has been keeping things at bay for a while now. As we make progress on making the compiler able to do this more naturally, we can work on replacing struct_size(), but it's in use globally and it's useful both for catching runtime mistakes and for catching compile-time mistakes (the flexible array has to match the variable's struct). -Kees
On Mon, Jun 10, 2024 at 02:46:09PM -0700, Kees Cook wrote: > > I really detest this thing because it makes what was trivially readable > > into something opaque. Get me that type qualifier that traps on overflow > > and write plain C. All this __builtin_overflow garbage is just that, > > unreadable nonsense. > > It's more readable than container_of(), Yeah, no. container_of() is absolutely trivial and very readable. container_of_const() a lot less so. (one static_assert() removed) #define container_of(ptr, type, member) ({ \ void *__mptr = (void *)(ptr); \ ((type *)(__mptr - offsetof(type, member))); }) Which is very clear indeed in what it does. Compare with: #define struct_size(p, member, count) \ __builtin_choose_expr(__is_constexpr(count), \ sizeof(*(p)) + flex_array_size(p, member, count), \ size_add(sizeof(*(p)), flex_array_size(p, member, count))) And I still have no idea :-( > IMO. "give me the struct size > for variable VAR, which has a flexible array MEMBER, when we have COUNT > many of them": struct_size(VAR, MEMBER, COUNT). It's more readable, more > robust, and provides saturation in the face of potential wrap-around. I'm sure you know what it does. Thing is, I don't care because I can trivially write it myself and not have to care and I'll have forgotten all about it the moment I sent this email. It just doesn't make sense to wrap something as utterly trivial as: size = sizeof(*p) + num*sizeof(p->foo); We're going to have to agree to disagree on this. Note how I naturally get the order wrong? [[ There is the whole FMA angle to this, that is, fundamentally this is a multiply-accumulate, but the problem there is the same that I noted, there is no fixed order, a+b*c and a*b+c are both very common definitions -- although I lean towards the latter being the correct one, given the order in the naming. I suppose this is a long winded way of saying that: #define struct_size(p, member, num) \ mult_add_no_overflow(num, sizeof(p->member), sizeof(*p)) would be *FAR* more readable. And then I still think struct_size() is less readable than its expansion. ]] > > > This provides __counted_by coverage, and I think this is important to > > > gain in ever place we can. Given that this is part of a ring buffer > > > implementation that is arbitrarily sized, this is exactly the kind of > > > place I'd like to see __counted_by used. This is a runtime robustness > > > improvement, so I don't see this a "churn" at all. > > > > Again, mixed in with that other crap. Anyway, remind me wth this > > __counted_by thing actually does? > > It provides annotation for the compiler to perform run-time bounds > checking on dynamically sized arrays. i.e. CONFIG_FORTIFY_SOURCE and > CONFIG_UBSAN_BOUNDS can actually reason about annotated flexible arrays > instead of just saying "oh no a flexible array, I give up". Some day I'll have to look at this FORTIFY_SOURCE and see what it actually does I suppose :/ > > > Peter, for patches 1 and 3, if you'd prefer not to carry them, I could > > > put them in the hardening tree to keep them out of your way. It seems > > > clear you don't want patch 2 at all. > > > > I prefer to not have struct_size() anywhere at all. Please just write > > readable code. > > That ship has sailed, and it has been keeping things at bay for a while > now. As we make progress on making the compiler able to do this more > naturally, we can work on replacing struct_size(), but it's in use > globally and it's useful both for catching runtime mistakes and for > catching compile-time mistakes (the flexible array has to match the > variable's struct). I coulnd't quickly find a single instance in the code I care about. So nothing is sailing afaict.
On Tue, Jun 11, 2024 at 09:55:42AM +0200, Peter Zijlstra wrote: > On Mon, Jun 10, 2024 at 02:46:09PM -0700, Kees Cook wrote: > > > > I really detest this thing because it makes what was trivially readable > > > into something opaque. Get me that type qualifier that traps on overflow > > > and write plain C. All this __builtin_overflow garbage is just that, > > > unreadable nonsense. > > > > It's more readable than container_of(), > > Yeah, no. container_of() is absolutely trivial and very readable. > container_of_const() a lot less so. I mean, we have complex macros in the kernel. This isn't uncommon. Look at cleanup.h. ;) But that's why we write kern-doc: * struct_size() - Calculate size of structure with trailing flexible * array. * @p: Pointer to the structure. * @member: Name of the array member. * @count: Number of elements in the array. > #define struct_size(p, member, num) \ > mult_add_no_overflow(num, sizeof(p->member), sizeof(*p)) > > would be *FAR* more readable. And then I still think struct_size() is > less readable than its expansion. ]] I'm happy to take patches. And for this bikeshed, this would be better named under the size_*() helpers which are trying to keep size_t calculations from overflowing (by saturating). i.e.: size_add_mult(sizeof(*p), sizeof(*p->member), num) Generalized overflow handing (check_add/sub/mul_overflow()) helpers already exist and requires a destination variable to determine type sizes. It is not safe to allow C semantics to perform promotion/truncation, so we have to be explicit. > Some day I'll have to look at this FORTIFY_SOURCE and see what it > actually does I suppose :/ LOL. It's basically doing compile-time (__builtin_object_size) and run-time (__builtin_dynamic_object_size) bounds checking on destination (and source) object sizes, mainly driven by the mentioned builtins: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html > I coulnd't quickly find a single instance in the code I care about. So > nothing is sailing afaict. I have to care about all of Linux's code. :P We generally don't have to defend the kernel from perf, so the hardening changes tend to end up in core APIs along with compiler improvements. Anyway! What about the patch that takes the 2 allocations down to 1? That seems like an obvious improvement. -Kees
On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > On Tue, Jun 11, 2024 at 09:55:42AM +0200, Peter Zijlstra wrote: > > On Mon, Jun 10, 2024 at 02:46:09PM -0700, Kees Cook wrote: > > > > > > I really detest this thing because it makes what was trivially readable > > > > into something opaque. Get me that type qualifier that traps on overflow > > > > and write plain C. All this __builtin_overflow garbage is just that, > > > > unreadable nonsense. > > > > > > It's more readable than container_of(), > > > > Yeah, no. container_of() is absolutely trivial and very readable. > > container_of_const() a lot less so. > > I mean, we have complex macros in the kernel. This isn't uncommon. Look > at cleanup.h. ;) Yeah, but in those cases the macro actually saves a lot of typing. A mult and add is something you shouldn't be needing a macro for. > But that's why we write kern-doc: Right, but reading comments is asking for trouble. That is, I really don't even see comments -- I have a built-in pre-processor that filters them out. I've been burned too many times by misleading or flat out wrong comments. > > #define struct_size(p, member, num) \ > > mult_add_no_overflow(num, sizeof(p->member), sizeof(*p)) > > > > would be *FAR* more readable. And then I still think struct_size() is > > less readable than its expansion. ]] > > I'm happy to take patches. And for this bikeshed, this would be better > named under the size_*() helpers which are trying to keep size_t > calculations from overflowing (by saturating). i.e.: > > size_add_mult(sizeof(*p), sizeof(*p->member), num) Fine I suppose, but what if we want something not size_t? Are we waiting for the type system extension? Anyway, I'll take the patch with the above. That at least is readable. The saturating thing is relying in the allocators never granting INT_MAX (or whatever size_t actually is) bytes? > Generalized overflow handing (check_add/sub/mul_overflow()) helpers > already exist and requires a destination variable to determine type > sizes. It is not safe to allow C semantics to perform > promotion/truncation, so we have to be explicit. Yeah, those are a sodding trainwreck, much like the __builtin_*_overflow() garbage. Can't we simply have them trap on overflow and do away with that most terrible calling convention? If we want to be really fancy we can have a #UD instruction that encodes a destination register to clear/saturate/whatever. > > Some day I'll have to look at this FORTIFY_SOURCE and see what it > > actually does I suppose :/ > > LOL. It's basically doing compile-time (__builtin_object_size) and > run-time (__builtin_dynamic_object_size) bounds checking on destination > (and source) object sizes, mainly driven by the mentioned builtins: > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html Right, I got that far. I also read most of: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854 But none of that is showing me generated asm for the various cases. As such, I don't consider myself informed enough. > Anyway! What about the patch that takes the 2 allocations down to 1? > That seems like an obvious improvement. Separate it from the struct_size() nonsense and Cc the author of that code (Sandipan IIRC) and I might just apply it.
On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > I'm happy to take patches. And for this bikeshed, this would be better > > named under the size_*() helpers which are trying to keep size_t > > calculations from overflowing (by saturating). i.e.: > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > Fine I suppose, but what if we want something not size_t? Are we waiting > for the type system extension? Because of C's implicit promotion/truncation, we can't do anything sanely with return values of arbitrary type size; we have to capture the lvalue type somehow so the checking can happen without C doing silent garbage. > The saturating thing is relying in the allocators never granting INT_MAX > (or whatever size_t actually is) bytes? The max of size_t is ULONG_MAX, but yes, most of the allocators will refuse >INT_MAX, but I think vmalloc() is higher, but certainly not SIZE_MAX, which is the entire virtual memory space. ;) The saturating thing is two-fold: that we never wrap around SIZE_MAX, and that the allocator will refuse a SIZE_MAX allocation. > > LOL. It's basically doing compile-time (__builtin_object_size) and > > run-time (__builtin_dynamic_object_size) bounds checking on destination > > (and source) object sizes, mainly driven by the mentioned builtins: > > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html > > Right, I got that far. I also read most of: > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854 Oh wow, that's serious extra credit. :) It'll also probably be a while before most of that stuff is even landed in Clang, much less implemented in GCC. What we _do_ have is the "counted_by" attribute. This was added to Clang a little while ago and just landed last week in GCC for GCC 15. > But none of that is showing me generated asm for the various cases. As > such, I don't consider myself informed enough. Gotcha. For the compile-time stuff it's all just looking at known-at-compile-time sizes. So for something like this, we get a __compiletime_warning() emitted: const char src[] = "Hello there"; char dst[10]; strscpy(dst, src); /* Compiler yells since src is bigger than dst. */ For run-time checks it's basically just using the regular WARN() infrastructure with __builtin_dynamic_object_size(). Here's a simplified userspace example with assert(): https://godbolt.org/z/zMrKnMxn5 The kernel's FORTIFY_SOURCE is much more complex in how it does the checking, how it does the reporting (for helping people figure out what's gone weird), etc. > > Anyway! What about the patch that takes the 2 allocations down to 1? > > That seems like an obvious improvement. > > Separate it from the struct_size() nonsense and Cc the author of that > code (Sandipan IIRC) and I might just apply it. Okay, thanks! -Kees
On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote: > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > > I'm happy to take patches. And for this bikeshed, this would be better > > > named under the size_*() helpers which are trying to keep size_t > > > calculations from overflowing (by saturating). i.e.: > > > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > > > Fine I suppose, but what if we want something not size_t? Are we waiting > > for the type system extension? > > Because of C's implicit promotion/truncation, we can't do anything > sanely with return values of arbitrary type size; we have to capture the > lvalue type somehow so the checking can happen without C doing silent > garbage. So sizeof() returns the native (built-in) size_t, right? If that type the nooverflow qualifier on, then: sizeof(*p) + num*sizeof(p->foo[0]) should all get the nooverflow semantics right? Because size_t is effectively 'nooverflow unsigned long' the multiplication should promote 'num' to some 'long'. Now, I've re-read the rules and I don't see qualifiers mentioned, so can't we state that the overflow/nooverflow qualifiers are to be preserved on (implicit) promotion and when nooverflow and overflow are combined the 'safe' nooverflow takes precedence? I mean, when we're adding qualifiers we can make up rules about them too, right? If 'people' don't want to adorn the built-in size_t, we can always do something like: #define sizeof(x) ((nooverflow unsigned long)(sizeof(x))) and 'fix' it ourselves. > > But none of that is showing me generated asm for the various cases. As > > such, I don't consider myself informed enough. > > Gotcha. For the compile-time stuff it's all just looking at > known-at-compile-time sizes. So for something like this, we get a > __compiletime_warning() emitted: > > const char src[] = "Hello there"; > char dst[10]; > > strscpy(dst, src); /* Compiler yells since src is bigger than dst. */ > > For run-time checks it's basically just using the regular WARN() > infrastructure with __builtin_dynamic_object_size(). Here's a simplified > userspace example with assert(): > > https://godbolt.org/z/zMrKnMxn5 > > The kernel's FORTIFY_SOURCE is much more complex in how it does the > checking, how it does the reporting (for helping people figure out what's > gone weird), etc. Thanks, I'll go have a look at that.
Am Freitag, dem 14.06.2024 um 12:17 +0200 schrieb Peter Zijlstra: > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote: > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > > > I'm happy to take patches. And for this bikeshed, this would be better > > > > named under the size_*() helpers which are trying to keep size_t > > > > calculations from overflowing (by saturating). i.e.: > > > > > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > > > > > Fine I suppose, but what if we want something not size_t? Are we waiting > > > for the type system extension? > > > > Because of C's implicit promotion/truncation, we can't do anything > > sanely with return values of arbitrary type size; we have to capture the > > lvalue type somehow so the checking can happen without C doing silent > > garbage. What is the specific problem here? > > So sizeof() returns the native (built-in) size_t, right? If that type > the nooverflow qualifier on, then: > > sizeof(*p) + num*sizeof(p->foo[0]) > > should all get the nooverflow semantics right? Because size_t is > effectively 'nooverflow unsigned long' the multiplication should promote > 'num' to some 'long'. > > Now, I've re-read the rules and I don't see qualifiers mentioned, so > can't we state that the overflow/nooverflow qualifiers are to be > preserved on (implicit) promotion and when nooverflow and overflow are > combined the 'safe' nooverflow takes precedence? > > I mean, when we're adding qualifiers we can make up rules about them > too, right? It should probably be a type attribute. > > If 'people' don't want to adorn the built-in size_t, we can always do > something like: > > #define sizeof(x) ((nooverflow unsigned long)(sizeof(x))) > > and 'fix' it ourselves. This is likely a stupid question, but making it signed wouldn't work? Or is a signed size_t too small or some architectures? Or would this change break too much? Martin > > > > But none of that is showing me generated asm for the various cases. As > > > such, I don't consider myself informed enough. > > > > Gotcha. For the compile-time stuff it's all just looking at > > known-at-compile-time sizes. So for something like this, we get a > > __compiletime_warning() emitted: > > > > const char src[] = "Hello there"; > > char dst[10]; > > > > strscpy(dst, src); /* Compiler yells since src is bigger than dst. */ > > > > For run-time checks it's basically just using the regular WARN() > > infrastructure with __builtin_dynamic_object_size(). Here's a simplified > > userspace example with assert(): > > > > https://godbolt.org/z/zMrKnMxn5 > > > > The kernel's FORTIFY_SOURCE is much more complex in how it does the > > checking, how it does the reporting (for helping people figure out what's > > gone weird), etc. > > Thanks, I'll go have a look at that.
On Fri, Jun 14, 2024 at 12:17:08PM +0200, Peter Zijlstra wrote: > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote: > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > > > I'm happy to take patches. And for this bikeshed, this would be better > > > > named under the size_*() helpers which are trying to keep size_t > > > > calculations from overflowing (by saturating). i.e.: > > > > > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > > > > > Fine I suppose, but what if we want something not size_t? Are we waiting > > > for the type system extension? > > > > Because of C's implicit promotion/truncation, we can't do anything > > sanely with return values of arbitrary type size; we have to capture the > > lvalue type somehow so the checking can happen without C doing silent > > garbage. > > So sizeof() returns the native (built-in) size_t, right? If that type > the nooverflow qualifier on, then: > > sizeof(*p) + num*sizeof(p->foo[0]) > > should all get the nooverflow semantics right? Because size_t is > effectively 'nooverflow unsigned long' the multiplication should promote > 'num' to some 'long'. Hmmm. This is an interesting point. I'll see what Justin has found as he's been working on limiting the overflow sanitizer to specific types. It doesn't help this (unfortunately common) code pattern, though: int size; size = sizeof(*p) + num*sizeof(p->foo[0]); p = kmalloc(size, GFP_KERNEL); But that was going to be a problem either way. > Now, I've re-read the rules and I don't see qualifiers mentioned, so > can't we state that the overflow/nooverflow qualifiers are to be > preserved on (implicit) promotion and when nooverflow and overflow are > combined the 'safe' nooverflow takes precedence? > > I mean, when we're adding qualifiers we can make up rules about them > too, right? Yup, that is the design of the "wraps" attribute (though it is the reverse: it _allows_ for wrap-around, since we want to the default state to be mitigation). > If 'people' don't want to adorn the built-in size_t, we can always do > something like: > > #define sizeof(x) ((nooverflow unsigned long)(sizeof(x))) > > and 'fix' it ourselves. Right, though my hope is still we get the result of "nooverflow" by marking that which was expected to overflow.
On Sat, Jun 15, 2024 at 06:09:07PM +0200, Martin Uecker wrote: > Am Freitag, dem 14.06.2024 um 12:17 +0200 schrieb Peter Zijlstra: > > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote: > > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > > > > I'm happy to take patches. And for this bikeshed, this would be better > > > > > named under the size_*() helpers which are trying to keep size_t > > > > > calculations from overflowing (by saturating). i.e.: > > > > > > > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > > > > > > > Fine I suppose, but what if we want something not size_t? Are we waiting > > > > for the type system extension? > > > > > > Because of C's implicit promotion/truncation, we can't do anything > > > sanely with return values of arbitrary type size; we have to capture the > > > lvalue type somehow so the checking can happen without C doing silent > > > garbage. > > What is the specific problem here? This particular complaint is about not being able to "discover" the lvalue type during an assignment, which means we can't create helper macros that Just Work. For example, in: void foo(some_type arg); foo(add_mul(base, size, count)); There is no way to write "add_mul" so that it is checking for overflow of the "some_type" type. The behavior of such a macro changes if it's doing u8, int, size_t, etc. The existing solution to this is to use macros (and builtins) that cannot be composed: add_mul(base, size, count, &result); foo(result); Here, the macro can examine the type of "result" and perform the correct overflow handling for that type. But lots of developers (rightly) dislike this style of "assignment side-effect". Additionally, it can't be composed: add(base, mul(size, count)); But, using type attributes we have much more flexibility. Hence, the proposed "wraps" attribute: https://github.com/llvm/llvm-project/pull/86618 > This is likely a stupid question, but making it signed > wouldn't work? Or is a signed size_t too small > or some architectures? Or would this change break too much? The ssize_t gets used in some places already, but since size_t is used for address calculations too, I don't think we can universally switch it.
On Mon, Jun 17, 2024 at 10:28:20AM -0700, Kees Cook wrote: > But, using type attributes we have much more flexibility. Hence, the > proposed "wraps" attribute: > https://github.com/llvm/llvm-project/pull/86618 So I still think that's going about things backwards. unsigned explicitly wraps. signed is UB. When using -fwrapv signed is well defined as 2s complement, which takes away the UB and makes it implicitly wrap. When extending the language, it is important to not break existing code, so the default must remain wrap. This in turn means you need to add a 'nowrap' thingy. Also, I would very much not make this an attribute, but a full type qualifier. Furthermore, add type promotion rules to ensure nowrap takes precedence and is preserved throughout expressions. Such that: 'long' + 'nowrap int' -> 'nowrap long' Then, once you have this, you can go do things like: typedef nowrap unsigned long size_t; #define sizeof(x) ((size_t)sizeof(x)) and things will just work. Hmm?
On Mon, Jun 17, 2024 at 10:19:15AM -0700, Kees Cook wrote: > On Fri, Jun 14, 2024 at 12:17:08PM +0200, Peter Zijlstra wrote: > > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote: > > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote: > > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote: > > > > > I'm happy to take patches. And for this bikeshed, this would be better > > > > > named under the size_*() helpers which are trying to keep size_t > > > > > calculations from overflowing (by saturating). i.e.: > > > > > > > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num) > > > > > > > > Fine I suppose, but what if we want something not size_t? Are we waiting > > > > for the type system extension? > > > > > > Because of C's implicit promotion/truncation, we can't do anything > > > sanely with return values of arbitrary type size; we have to capture the > > > lvalue type somehow so the checking can happen without C doing silent > > > garbage. > > > > So sizeof() returns the native (built-in) size_t, right? If that type > > the nooverflow qualifier on, then: > > > > sizeof(*p) + num*sizeof(p->foo[0]) > > > > should all get the nooverflow semantics right? Because size_t is > > effectively 'nooverflow unsigned long' the multiplication should promote > > 'num' to some 'long'. > > Hmmm. This is an interesting point. I'll see what Justin has found as > he's been working on limiting the overflow sanitizer to specific types. > > It doesn't help this (unfortunately common) code pattern, though: > > int size; > > size = sizeof(*p) + num*sizeof(p->foo[0]); > p = kmalloc(size, GFP_KERNEL); > > But that was going to be a problem either way. Well, you can add a warning on implicitly casting away nooverflow. New qualifier, we get to make up the rules etc.. it probably means we need to change the signature of the allocator functions to take a 'nooverflow' type, otherwise those will trigger this new warning, but that should not be a problem. > > Now, I've re-read the rules and I don't see qualifiers mentioned, so > > can't we state that the overflow/nooverflow qualifiers are to be > > preserved on (implicit) promotion and when nooverflow and overflow are > > combined the 'safe' nooverflow takes precedence? > > > > I mean, when we're adding qualifiers we can make up rules about them > > too, right? > > Yup, that is the design of the "wraps" attribute (though it is the > reverse: it _allows_ for wrap-around, since we want to the default state > to be mitigation). Yeah, I feel strongly about that (just mailed you in the other sub-thread) that this is the wrong way around. > > If 'people' don't want to adorn the built-in size_t, we can always do > > something like: > > > > #define sizeof(x) ((nooverflow unsigned long)(sizeof(x))) > > > > and 'fix' it ourselves. > > Right, though my hope is still we get the result of "nooverflow" by > marking that which was expected to overflow. You cannot sell that as a proper language extension because it will break world+dog.
On Tue, Jun 18, 2024 at 10:22:42AM +0200, Peter Zijlstra wrote: > On Mon, Jun 17, 2024 at 10:28:20AM -0700, Kees Cook wrote: > > > But, using type attributes we have much more flexibility. Hence, the > > proposed "wraps" attribute: > > https://github.com/llvm/llvm-project/pull/86618 > > So I still think that's going about things backwards. > > unsigned explicitly wraps. signed is UB. > > When using -fwrapv signed is well defined as 2s complement, which takes > away the UB and makes it implicitly wrap. > > When extending the language, it is important to not break existing code, > so the default must remain wrap. This in turn means you need to add a > 'nowrap' thingy. Well, we still disagree about this, but I guess we'll see how the size_t work comes along. Maybe I will come to agree with you. :) > Also, I would very much not make this an attribute, but a full type > qualifier. Furthermore, add type promotion rules to ensure nowrap takes > precedence and is preserved throughout expressions. Such that: > > 'long' + 'nowrap int' -> 'nowrap long' > > Then, once you have this, you can go do things like: > > typedef nowrap unsigned long size_t; > #define sizeof(x) ((size_t)sizeof(x)) > > and things will just work. Hmm? Right. Currently this is being plumbed through the sanitizers, so the type selection will happen there (for this version of it). The current experiment is to basically tell the sanitizers to ignore all types except a given list, and our initial list will be just size_t. Before this, Justin is finishing up the initial set of idiom exclusions. .e.g.: size_t var, offset; ... if (var + offset < var) { ... } This will not freak out if "var + offset" wraps. And "negative unsigned constants" will be ignored: #define MASK -2ULL ... size_t mask = MASK; And loop post decrements wrapping will get ignored too: size_t count = ...; ... while (count--) { ... } After we get it all nailed down, we'll see if anything new pops up. :)