Message ID | 87f60512-9242-49d1-eae1-394eb7a34760@kernel.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [GIT,PULL] Block driver changes for 5.20-rc1 | expand |
On Sun, Jul 31, 2022 at 8:03 AM Jens Axboe <axboe@kernel.dk> wrote: > > On top of the core block changes, here are the block driver changes for > 5.20-rc1. In detail: No. I pulled, and I ended up immediately unpulling again. Why? This is pure garbage that doesn't even compile: > - NVMe pull request via Christoph > - add support for In-Band authentication (Hannes Reinecke) because it is testing the address of an array member (NOT a pointer!) being NULL. Lookie here: static struct nvme_auth_dhgroup_map { const char name[16]; const char kpp[16]; } dhgroup_map[] = { ... const char *nvme_auth_dhgroup_name(u8 dhgroup_id) { if ((dhgroup_id > ARRAY_SIZE(dhgroup_map)) || !dhgroup_map[dhgroup_id].name || !strlen(dhgroup_map[dhgroup_id].name)) return NULL; return dhgroup_map[dhgroup_id].name; } That test of "name" being NULL is complete garbage, because "name[]" is not a pointer, it's a member of the struct, so the address is simply *within* the struct, and testing for NULL is nonsensical. As a result, gcc quite reasonably complains drivers/nvme/common/auth.c: In function ‘nvme_auth_dhgroup_name’: drivers/nvme/common/auth.c:59:13: error: the comparison will always evaluate as ‘true’ for the address of ‘name’ will never be NULL [-Werror=address] 59 | !dhgroup_map[dhgroup_id].name || | ^ and the exact same completely broken pattern ends up existing about five more times in that same source file with other structures and other structure members (ie there another case of exactly the same thing, except with 'kpp[]', and then there are other cases of the same thing except with the 'hash_map[]' structure etc. This code cannot have gotten much testing at all. Sure, it's possible that the warnings are compiler version dependent, but I have two completely different compilers that both complain about this thing. Clang just has a slightly different error string, and says drivers/nvme/common/auth.c:59:31: error: address of array 'dhgroup_map[dhgroup_id].name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] !dhgroup_map[dhgroup_id].name || ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ instead. And no, I don't want some "fix up broken code after the fact" commit on top. I want that code excised, and I don't want to see another pull request before it's (a) gone and (b) somebody has looked at where the testing of this COMPLETELY failed. Linus
On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote: > And no, I don't want some "fix up broken code after the fact" commit > on top. I want that code excised, and I don't want to see another pull > request before it's (a) gone and (b) somebody has looked at where the > testing of this COMPLETELY failed. This issue was fixed more than 2 weeks ago, but wasn't included in this pull request. It's in the current block drivers-post tree, though: https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.20/drivers-post&id=4dfbd5418763018f33acded0871fbbc22c8e4695 Do you want us to rebase the nvme pull request with the above squashed into the original commit instead?
On 8/2/22 3:33 PM, Keith Busch wrote: > On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote: >> And no, I don't want some "fix up broken code after the fact" commit >> on top. I want that code excised, and I don't want to see another pull >> request before it's (a) gone and (b) somebody has looked at where the >> testing of this COMPLETELY failed. > > This issue was fixed more than 2 weeks ago, but wasn't included in this pull > request. It's in the current block drivers-post tree, though: > > https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.20/drivers-post&id=4dfbd5418763018f33acded0871fbbc22c8e4695 > > Do you want us to rebase the nvme pull request with the above squashed > into the original commit instead? I can just rebase drivers and drivers-post into one now that the core bits are in. It was a bit of a pain since later driver changes ended up needing more core changes (and has made me re-think the split approach, I want to make this one block branch going forward since this isn't the first time). But if I just rebase drivers + drivers-post into one branch, then I can squash the commit. As to testing, I'm going to punt that question to Hannes and Christoph, as I have no way of testing that particular NVMe feature.
On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote: > > > As to testing, I'm going to punt that question to Hannes and Christoph, > as I have no way of testing that particular NVMe feature. I can't test the *feature* either. But dammit, I test two very different build configurations, and both of them failed miserably on this file. Don't you get it? That file DOES NOT EVEN COMPILE. I refuse to have anything to do with a pull request that doesn't even pass some very fundamental build requirements for me. That implies a level of lack of testing that just makes me go "No way am I touching that tree". Linus
On 8/2/22 3:50 PM, Linus Torvalds wrote: > On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote: >> >> >> As to testing, I'm going to punt that question to Hannes and Christoph, >> as I have no way of testing that particular NVMe feature. > > I can't test the *feature* either. > > But dammit, I test two very different build configurations, and both > of them failed miserably on this file. > > Don't you get it? That file DOES NOT EVEN COMPILE. > > I refuse to have anything to do with a pull request that doesn't even > pass some very fundamental build requirements for me. That implies a > level of lack of testing that just makes me go "No way am I touching > that tree". I can tell you that I always compile the whole damn thing, and this one is no exception. The tree is also in for-next and has been for a long time, both the drivers and drivers-post branch. The build bot has also vetted both branches, individually, not just as the merged for-next. I take it this is only happening on clang, which is why I haven't seen it as I don't compile with clang. We can certainly add that to the usual pre-flight/post-merge list, but I'm a bit surprised that clang isn't being done by the build bots too. If you want to make a clang build a hard requirement for any pull request, then that should be explicit and not illicit outbursts if that's just an implied assumption that it is being done. Really.
On 8/2/22 4:24 PM, Jens Axboe wrote: > On 8/2/22 3:50 PM, Linus Torvalds wrote: >> On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote: >>> >>> >>> As to testing, I'm going to punt that question to Hannes and Christoph, >>> as I have no way of testing that particular NVMe feature. >> >> I can't test the *feature* either. >> >> But dammit, I test two very different build configurations, and both >> of them failed miserably on this file. >> >> Don't you get it? That file DOES NOT EVEN COMPILE. >> >> I refuse to have anything to do with a pull request that doesn't even >> pass some very fundamental build requirements for me. That implies a >> level of lack of testing that just makes me go "No way am I touching >> that tree". > > I can tell you that I always compile the whole damn thing, and this one > is no exception. The tree is also in for-next and has been for a long > time, both the drivers and drivers-post branch. The build bot has also > vetted both branches, individually, not just as the merged for-next. > > I take it this is only happening on clang, which is why I haven't seen > it as I don't compile with clang. We can certainly add that to the usual > pre-flight/post-merge list, but I'm a bit surprised that clang isn't > being done by the build bots too. Side note - it is possible this all happened on the nvme branch side and hence I didn't see it. In any case, I've got the branch prepared and we'll send it out later this merge window when it's all vetted.
On Tue, Aug 2, 2022 at 3:24 PM Jens Axboe <axboe@kernel.dk> wrote: > > I take it this is only happening on clang, which is why I haven't seen > it as I don't compile with clang. It happens both with clang-14.0 and with gcc-12.1.1. And Keith says that the issue was apparently know about and fixed over two weeks ago. So *somebody* knew about it, and fixed it, but apparently the people involved didn't bother informing upstream. Regardless of what happened, it's not even remotely acceptable. Linus
On 8/2/22 4:27 PM, Linus Torvalds wrote: > On Tue, Aug 2, 2022 at 3:24 PM Jens Axboe <axboe@kernel.dk> wrote: >> >> I take it this is only happening on clang, which is why I haven't seen >> it as I don't compile with clang. > > It happens both with clang-14.0 and with gcc-12.1.1. gcc-12.1 is what I use, fwiw. > And Keith says that the issue was apparently know about and fixed over > two weeks ago. > > So *somebody* knew about it, and fixed it, but apparently the people > involved didn't bother informing upstream. > > Regardless of what happened, it's not even remotely acceptable. Agree, nobody told me about it, and that would've been nice to know.
On Tue, Aug 2, 2022 at 3:33 PM Jens Axboe <axboe@kernel.dk> wrote: > > On 8/2/22 4:27 PM, Linus Torvalds wrote: > > > > It happens both with clang-14.0 and with gcc-12.1.1. > > gcc-12.1 is what I use, fwiw. Hmm. I think the last ".1" is actually purely a Fedora artifact. The gcc people themselves have only released 12.1.0, and looking around for the srpm info, it looks like all the final F36 ".1" is for some unrelated patches. So we may actually be essentially on the same compiler version, and I don't know why you wouldn't see the error. It showed up with a plain "make allmodconfig" build here. Linus
On 8/2/22 4:48 PM, Linus Torvalds wrote: > On Tue, Aug 2, 2022 at 3:33 PM Jens Axboe <axboe@kernel.dk> wrote: >> >> On 8/2/22 4:27 PM, Linus Torvalds wrote: >>> >>> It happens both with clang-14.0 and with gcc-12.1.1. >> >> gcc-12.1 is what I use, fwiw. > > Hmm. I think the last ".1" is actually purely a Fedora artifact. The > gcc people themselves have only released 12.1.0, and looking around > for the srpm info, it looks like all the final F36 ".1" is for some > unrelated patches. > > So we may actually be essentially on the same compiler version, and I > don't know why you wouldn't see the error. It showed up with a plain > "make allmodconfig" build here. Actually, I'm mistaken, on the build box it's running 11.3. So that might explain it? I use 12.1 elsewhere.
On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote: > > Actually, I'm mistaken, on the build box it's running 11.3. So that > might explain it? I use 12.1 elsewhere. It's possible this -Waddress error is new to gcc-12. I try to keep most of my machines in sync just to avoid the pain of different distro details, so I don't have gcc-11 around any more. Linus
On 8/2/22 5:03 PM, Linus Torvalds wrote: > On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote: >> >> Actually, I'm mistaken, on the build box it's running 11.3. So that >> might explain it? I use 12.1 elsewhere. > > It's possible this -Waddress error is new to gcc-12. > > I try to keep most of my machines in sync just to avoid the pain of > different distro details, so I don't have gcc-11 around any more. I'll get gcc-12 back on it - I originally swapped back to 11 for building kernels to avoid spurious warnings with the new release.
On 8/2/22 5:08 PM, Jens Axboe wrote: > On 8/2/22 5:03 PM, Linus Torvalds wrote: >> On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote: >>> >>> Actually, I'm mistaken, on the build box it's running 11.3. So that >>> might explain it? I use 12.1 elsewhere. >> >> It's possible this -Waddress error is new to gcc-12. >> >> I try to keep most of my machines in sync just to avoid the pain of >> different distro details, so I don't have gcc-11 around any more. > > I'll get gcc-12 back on it - I originally swapped back to 11 for > building kernels to avoid spurious warnings with the new release. On the topic of warnings, on my new build box I get a lot of these: ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker which ends up polluting the output quite a bit. axboe@r7525 ~> ld --version GNU ld (GNU Binutils for Debian) 2.38.90.20220713
On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote: > > On the topic of warnings, on my new build box I get a lot of these: > > ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack > ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker > > which ends up polluting the output quite a bit. > > axboe@r7525 ~> ld --version > GNU ld (GNU Binutils for Debian) 2.38.90.20220713 Ok, I have binutils 2.37, so it may be new to 2.38. Some googling around seems to imply that we'd need to so something like this .section .note.GNU-stack,"",%progbits in all our *.S files. We do have some signs of that in our tooling, because apparently it has hit user-space, but I wonder what has triggered the need on the kernel side for you. I'd hate to add that pointless line to every asm file, but maybe we could so something like this #ifdef __ASSEMBLY_ #ifdef OUTPUT_PROGBITS .section .note.GNU-stack,"",%progbits #undef OUTPUT_PROGBITS #endif #endif and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our makefiles. *Most* asm files should include <linux/linkage.h> just for all the macros that declare variables externally, so that might catch the bulk of it. Somebody who knows the rules better than I would be a good idea. I've added random people who have touched those linkage things in the past to the participants, in the hope that somebody goes, "No, Linus, just add flag XYZ to the linker script" or other and there's a clear and obvious solution. Linus
On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote: > > > > On the topic of warnings, on my new build box I get a lot of these: > > > > ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack > > ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker > > > > which ends up polluting the output quite a bit. > > > > axboe@r7525 ~> ld --version > > GNU ld (GNU Binutils for Debian) 2.38.90.20220713 > > Ok, I have binutils 2.37, so it may be new to 2.38. > > Some googling around seems to imply that we'd need to so something like this > > .section .note.GNU-stack,"",%progbits > > in all our *.S files. > > We do have some signs of that in our tooling, because apparently it > has hit user-space, but I wonder what has triggered the need on the > kernel side for you. > > I'd hate to add that pointless line to every asm file, but maybe we > could so something like this > > #ifdef __ASSEMBLY_ > #ifdef OUTPUT_PROGBITS > .section .note.GNU-stack,"",%progbits > #undef OUTPUT_PROGBITS > #endif > #endif > > and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our > makefiles. > > *Most* asm files should include <linux/linkage.h> just for all the > macros that declare variables externally, so that might catch the bulk > of it. > > Somebody who knows the rules better than I would be a good idea. $ as --help | grep exec --execstack require executable stack for this object --noexecstack don't require executable stack for this object --statistics print various measured statistics from execution Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures help, rather than modifying every assembler source? > > I've added random people who have touched those linkage things in the > past to the participants, in the hope that somebody goes, "No, Linus, > just add flag XYZ to the linker script" or other and there's a clear > and obvious solution. > > Linus
On 8/3/22 10:51 AM, Nick Desaulniers wrote: > On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote: >>> >>> On the topic of warnings, on my new build box I get a lot of these: >>> >>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack >>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker >>> >>> which ends up polluting the output quite a bit. >>> >>> axboe@r7525 ~> ld --version >>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713 >> >> Ok, I have binutils 2.37, so it may be new to 2.38. >> >> Some googling around seems to imply that we'd need to so something like this >> >> .section .note.GNU-stack,"",%progbits >> >> in all our *.S files. >> >> We do have some signs of that in our tooling, because apparently it >> has hit user-space, but I wonder what has triggered the need on the >> kernel side for you. >> >> I'd hate to add that pointless line to every asm file, but maybe we >> could so something like this >> >> #ifdef __ASSEMBLY_ >> #ifdef OUTPUT_PROGBITS >> .section .note.GNU-stack,"",%progbits >> #undef OUTPUT_PROGBITS >> #endif >> #endif >> >> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our >> makefiles. >> >> *Most* asm files should include <linux/linkage.h> just for all the >> macros that declare variables externally, so that might catch the bulk >> of it. >> >> Somebody who knows the rules better than I would be a good idea. > > $ as --help | grep exec > --execstack require executable stack for this object > --noexecstack don't require executable stack for this object > --statistics print various measured statistics from execution > > Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures > help, rather than modifying every assembler source? I can try whatever here, but a quick grep doesn't find anything for KBUILD_ASFLAGS or anything close to it. What am I missing?
On Wed, Aug 3, 2022 at 9:51 AM Nick Desaulniers <ndesaulniers@google.com> wrote: > > Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures > help, rather than modifying every assembler source? See, this is why you cc the experts: > > I've added random people who have touched those linkage things in the > > past to the participants, in the hope that somebody goes, "No, Linus, > > just add flag XYZ to the linker script" or other and there's a clear > > and obvious solution. Thanks, Linus
On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote: > > I can try whatever here, but a quick grep doesn't find anything for > KBUILD_ASFLAGS or anything close to it. What am I missing? I think we only have 'aflags-y' (so you can use config variables etc to pick them) and EXTRA_AFLAGS. And we always go through the compiler rather than invoke 'as' directly, so I think you have to use '-Wa,option' to pass 'option' to the assembler. Linus
On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote: > This code cannot have gotten much testing at all. [...] > And no, I don't want some "fix up broken code after the fact" commit > on top. I want that code excised, and I don't want to see another pull > request before it's (a) gone and (b) somebody has looked at where the > testing of this COMPLETELY failed. > Umm. The warning is as you said totally reasonable, and we fixed it as soon as we got the report. But it turns out my compiler certainly did not report it (gcc version 10.2.1 20210110 (Debian 10.2.1-6)), Jens's apparently also did not, and the regular build bot that is running on tons of architectures did not report it until 9 days after the patch was commited and pushed out, and until after Jens pulled it. So while the complaint that we failed to get it into the same pull request is entirely reasonable, the statement that it cannot have gotten much testing at all is a bit ridiculous. It's also not that "I does not compile at all", but rather that -Werror makes a useful but mostly harmless warning fatal.
On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote: > > On 8/3/22 10:51 AM, Nick Desaulniers wrote: > > On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > >> > >> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote: > >>> > >>> On the topic of warnings, on my new build box I get a lot of these: > >>> > >>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack > >>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker > >>> > >>> which ends up polluting the output quite a bit. > >>> > >>> axboe@r7525 ~> ld --version > >>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713 > >> > >> Ok, I have binutils 2.37, so it may be new to 2.38. > >> > >> Some googling around seems to imply that we'd need to so something like this > >> > >> .section .note.GNU-stack,"",%progbits > >> > >> in all our *.S files. > >> > >> We do have some signs of that in our tooling, because apparently it > >> has hit user-space, but I wonder what has triggered the need on the > >> kernel side for you. > >> > >> I'd hate to add that pointless line to every asm file, but maybe we > >> could so something like this > >> > >> #ifdef __ASSEMBLY_ > >> #ifdef OUTPUT_PROGBITS > >> .section .note.GNU-stack,"",%progbits > >> #undef OUTPUT_PROGBITS > >> #endif > >> #endif > >> > >> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our > >> makefiles. > >> > >> *Most* asm files should include <linux/linkage.h> just for all the > >> macros that declare variables externally, so that might catch the bulk > >> of it. > >> > >> Somebody who knows the rules better than I would be a good idea. > > > > $ as --help | grep exec > > --execstack require executable stack for this object > > --noexecstack don't require executable stack for this object > > --statistics print various measured statistics from execution > > > > Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures > > help, rather than modifying every assembler source? > > I can try whatever here, but a quick grep doesn't find anything for > KBUILD_ASFLAGS or anything close to it. What am I missing? Sorry, I'm running between many meetings today...so suggestions aren't tested and may not be fully coherent... KBUILD_AFLAGS += -Wa,--noexecstack There's also as-option Make macro in case older binutils doesn't support that flag outright. tools/perf/Makefile.config also uses noexecstack as a linker flag. That might be an option, too. It is the linker producing the warnings, not the assembler, but the assembler flag is probably the way to go to fix the warnings since it sounds like these are assembler sources exclusively causing the issue.
On Wed, Aug 3, 2022 at 10:30 AM Christoph Hellwig <hch@lst.de> wrote: > > So while the complaint that we failed to get it into the same pull > request is entirely reasonable, the statement that it cannot have > gotten much testing at all is a bit ridiculous. It's also not that > "I does not compile at all", but rather that -Werror makes a useful > but mostly harmless warning fatal. Stop this idiocy. Warnings are fatal. Deal with it. If you think it's ok to have warnings in your code, go do another project. The -Werror addition is not new, it's a policy that has been in place for a decade or so. I just got fed up with people not noticing warnings and using that as an excuse for their broken code. The fact is, apparently -Werror _did_ find it, and the people involved just never bothered to even tell upstream about it, so that two weeks afterwards I got a TERMINALLY BROKEN pull, request. Don't make excuses about "it wasn't that broken". It was broken. Own it. And something went very very wrong for that breakage to have stayed around and then being pushed to me. I want whatever went wrong in that process fixed. Why did you make a pull request to Jens, and then never notified him about the problems in it? Linus
On 8/3/22 11:38 AM, Nick Desaulniers wrote: > On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote: >> >> On 8/3/22 10:51 AM, Nick Desaulniers wrote: >>> On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds >>> <torvalds@linux-foundation.org> wrote: >>>> >>>> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote: >>>>> >>>>> On the topic of warnings, on my new build box I get a lot of these: >>>>> >>>>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack >>>>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker >>>>> >>>>> which ends up polluting the output quite a bit. >>>>> >>>>> axboe@r7525 ~> ld --version >>>>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713 >>>> >>>> Ok, I have binutils 2.37, so it may be new to 2.38. >>>> >>>> Some googling around seems to imply that we'd need to so something like this >>>> >>>> .section .note.GNU-stack,"",%progbits >>>> >>>> in all our *.S files. >>>> >>>> We do have some signs of that in our tooling, because apparently it >>>> has hit user-space, but I wonder what has triggered the need on the >>>> kernel side for you. >>>> >>>> I'd hate to add that pointless line to every asm file, but maybe we >>>> could so something like this >>>> >>>> #ifdef __ASSEMBLY_ >>>> #ifdef OUTPUT_PROGBITS >>>> .section .note.GNU-stack,"",%progbits >>>> #undef OUTPUT_PROGBITS >>>> #endif >>>> #endif >>>> >>>> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our >>>> makefiles. >>>> >>>> *Most* asm files should include <linux/linkage.h> just for all the >>>> macros that declare variables externally, so that might catch the bulk >>>> of it. >>>> >>>> Somebody who knows the rules better than I would be a good idea. >>> >>> $ as --help | grep exec >>> --execstack require executable stack for this object >>> --noexecstack don't require executable stack for this object >>> --statistics print various measured statistics from execution >>> >>> Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures >>> help, rather than modifying every assembler source? >> >> I can try whatever here, but a quick grep doesn't find anything for >> KBUILD_ASFLAGS or anything close to it. What am I missing? > > Sorry, I'm running between many meetings today...so suggestions aren't > tested and may not be fully coherent... > > KBUILD_AFLAGS += -Wa,--noexecstack > > There's also as-option Make macro in case older binutils doesn't > support that flag outright. > > tools/perf/Makefile.config also uses noexecstack as a linker flag. > That might be an option, too. It is the linker producing the > warnings, not the assembler, but the assembler flag is probably the > way to go to fix the warnings since it sounds like these are assembler > sources exclusively causing the issue. I ran with the below and it silences the linker warning mentioned. I do also see the below with my build (which aren't new with the option added obviously, but not visible since I don't get the other noise): axboe@r7525 ~/gi/linux-block (test)> time make -j256 -s 1.886s ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions ld: warning: vmlinux has a LOAD segment with RWX permissions ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: arch/x86/boot/setup.elf has a LOAD segment with RWX permissions diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 7854685c5f25..51824528a026 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -123,6 +123,7 @@ else CHECKFLAGS += -D__x86_64__ KBUILD_AFLAGS += -m64 + KBUILD_AFLAGS += -Wa,--noexecstack KBUILD_CFLAGS += -m64 # Align jump targets to 1 byte, not the default 16 bytes:
On Wed, Aug 03, 2022 at 10:42:32AM -0700, Linus Torvalds wrote: > Warnings are fatal. Deal with it. If you think it's ok to have > warnings in your code, go do another project. Please stop this BS. I'm the first one to fix every single warning reported, because I absoutely want warning free code to see problems. But that does not mean it is "fatal". It is bad and should go away by either fixing the code, or if the warning is too broken disable it in the compiler. But that does not in any way mean code that creates warning on some compilers is completely untested and by definition broken. We've fixed the warning the day it was reported because of just that, and we'd always do. But that doesn't make it whole giant drama. > The fact is, apparently -Werror _did_ find it, and the people involved There is nothing about Werror breaking it. The buildbot reported it, and that usually reports warnings and errors, so it did not make any difference. > I want whatever went wrong in that process fixed. Why did you make a > pull request to Jens, and then never notified him about the problems > in it? Because so far we've never made a big fuzz about a configuation so obscrube that it takes build bot more than a week to find it. We just fix it and go on with life. But now that I see that you somehow see it a personal insult I'll be more careful about it. But to be honest, while I agree with you 100% that code should be warning free I'm really amazed about all the drama this created.
On Wed, Aug 3, 2022 at 10:50 AM Christoph Hellwig <hch@lst.de> wrote: > Because so far we've never made a big fuzz about a configuation > so obscrube that it takes build bot more than a week to find it. That "obscure" config was a bog-standard "build the code". It literally happens with "allmodconfig". I found it immediately after pulling. This is literally why I do full builds after every single pull request I do - I don't want to have build warnings in my tree. And I do expect people to make sure that their code has been sufficiently tested that my smoke-testing doesn't find immediate problems. Linus
+ Linux Toolchains, Nick Clifton (author of https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107) https://lore.kernel.org/linux-block/CAKwvOdkpNRvD0kDe-j8N0Gkq+1Fdhd6=z-9ROm3gc12Sf0k-Kg@mail.gmail.com/ is the thread for context. On Wed, Aug 3, 2022 at 10:45 AM Jens Axboe <axboe@kernel.dk> wrote: > > I ran with the below and it silences the linker warning mentioned. I do > also see the below with my build (which aren't new with the option added > obviously, but not visible since I don't get the other noise): > > axboe@r7525 ~/gi/linux-block (test)> time make -j256 -s 1.886s > ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions > ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions > ld: warning: vmlinux has a LOAD segment with RWX permissions Not sure yet about these. Looks like there's linker flags to disable warnings...but I don't recommend those. https://github.com/systemd/systemd/commit/b0e5bf0451a6bc94e6e7b2a1de668b75c63f38c8 I wonder if they go away by fixing the boot issues described below? Otherwise, I think we need to find which section is the problematic one; I suspect the segment flagged as LOAD is composed of many sections, with possibly only one or a few that needs permissions adjusted. > ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack > ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker Right, arch/x86/boot/compressed (and arch/x86/boot/) discard/reset KBUILD_AFLAGS (via the := assignment operator). So arch/x86/boot/Makefile and arch/x86/boot/compressed/Makefile also will need changes similar to the one you did below. Finally, if those parts of the code actually expect the stack to be executable (probably depends on some inline asm), I suspect we might get some kind of fault at runtime (though I don't know how the kernel handles segment permissions or even uses ELF segments). Point being that -Wa,--noexecstack is somewhat a promise that we wont try to execute data on the stack as if it were code. -Wa,--execstack is useful if we need to be able to do so, but we should be careful to limit that to individual translation units if they really truly need that. The linker warning is more so about the current ambiguity since if the implicit default changes in the future, that could break code. Better for us to be explicit here, and disable executable stack unless strictly necessary and only as necessary IMO. Personally, I think the current implicit default is wrong, but pragmatically I recognize that people have been used to the status quo for years, and changing it could break existing codebases. If you send the below diff with the two other additions I suggest to the x86 ML, the x86 maintainers might be able to comment if they're familiar with any possible cases where the stack is expected to be executable. > ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions > ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack > ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ^ See above. > ld: warning: arch/x86/boot/setup.elf has a LOAD segment with RWX permissions > > > diff --git a/arch/x86/Makefile b/arch/x86/Makefile > index 7854685c5f25..51824528a026 100644 > --- a/arch/x86/Makefile > +++ b/arch/x86/Makefile > @@ -123,6 +123,7 @@ else > CHECKFLAGS += -D__x86_64__ > > KBUILD_AFLAGS += -m64 > + KBUILD_AFLAGS += -Wa,--noexecstack > KBUILD_CFLAGS += -m64 > > # Align jump targets to 1 byte, not the default 16 bytes: > > -- > Jens Axboe >
On 8/3/22 12:06 PM, Nick Desaulniers wrote: >> ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack >> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker > > Right, arch/x86/boot/compressed (and arch/x86/boot/) discard/reset > KBUILD_AFLAGS (via the := assignment operator). > > So arch/x86/boot/Makefile and arch/x86/boot/compressed/Makefile also > will need changes similar to the one you did below. > > Finally, if those parts of the code actually expect the stack to be > executable (probably depends on some inline asm), I suspect we might > get some kind of fault at runtime (though I don't know how the kernel > handles segment permissions or even uses ELF segments). Point being > that -Wa,--noexecstack is somewhat a promise that we wont try to > execute data on the stack as if it were code. -Wa,--execstack is > useful if we need to be able to do so, but we should be careful to > limit that to individual translation units if they really truly need > that. The linker warning is more so about the current ambiguity since > if the implicit default changes in the future, that could break code. > Better for us to be explicit here, and disable executable stack unless > strictly necessary and only as necessary IMO. Personally, I think the > current implicit default is wrong, but pragmatically I recognize that > people have been used to the status quo for years, and changing it > could break existing codebases. > > If you send the below diff with the two other additions I suggest to > the x86 ML, the x86 maintainers might be able to comment if they're > familiar with any possible cases where the stack is expected to be > executable. I'd be happy for you to take this over, I'm really just the reporter here...
On Wed, Aug 03, 2022 at 10:54:50AM -0700, Linus Torvalds wrote: > On Wed, Aug 3, 2022 at 10:50 AM Christoph Hellwig <hch@lst.de> wrote: > > Because so far we've never made a big fuzz about a configuation > > so obscrube that it takes build bot more than a week to find it. > > That "obscure" config was a bog-standard "build the code". With a new enough compiler that apparently very few other people, and most importantly CI tools have. Again, I'm not arguing this wasn't a problem, but if there is a mismatch between what compiler you use and most of the developers and CI infrastructure it isn't a case of "code that isn't tested at all".
diff --cc drivers/block/drbd/drbd_bitmap.c index 603f6828dd79,bd2133ef6e0a..7d9db33363de --- a/drivers/block/drbd/drbd_bitmap.c