diff mbox series

[GIT,PULL] Block driver changes for 5.20-rc1

Message ID 87f60512-9242-49d1-eae1-394eb7a34760@kernel.dk (mailing list archive)
State New, archived
Headers show
Series [GIT,PULL] Block driver changes for 5.20-rc1 | expand

Pull-request

git://git.kernel.dk/linux-block.git tags/for-5.20/drivers-2022-07-29

Commit Message

Jens Axboe July 31, 2022, 3:03 p.m. UTC
Hi Linus,

On top of the core block changes, here are the block driver changes for
5.20-rc1. In detail:

- NVMe pull request via Christoph
	- add support for In-Band authentication (Hannes Reinecke)
	- handle the persistent internal error AER (Michael Kelley)
	- use in-capsule data for TCP I/O queue connect (Caleb Sander)
	- remove timeout for getting RDMA-CM established event
	  (Israel Rukshin)
	- misc cleanups (Joel Granados, Sagi Grimberg, Chaitanya Kulkarni,
	  Guixin Liu, Xiang wangx)

- MD pull request via Song
	- Improve raid5 lock contention, by Logan Gunthorpe.
	- Misc fixes to raid5, by Logan Gunthorpe.
	- Fix race condition with md_reap_sync_thread(), by Guoqing Jiang.

- Work on unifying the null_blk module parameters and configfs API
  (Vincent)

- drbd bitmap IO error fix (Lars)

- Set of rnbd fixes (Guoqing, Md Haris)

- Remove experimental marker on bcache async device registration (Coly)

- Misc fixes and cleanups (Ming, Yu, Dan, Christophe

Note that this will throw a merge conflict in
drivers/block/drbd/drbd_bitmap.c due to the request flag changes from
Bart. The resolution is trivial, just keep the new enum req_op:

+++ b/drivers/block/drbd/drbd_bitmap.c
@@@ -977,16 -990,17 +990,17 @@@ static inline sector_t drbd_md_last_bit
  static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_hold(local)
  {
  	struct drbd_device *device = ctx->device;
 -	unsigned int op = (ctx->flags & BM_AIO_READ) ? REQ_OP_READ : REQ_OP_WRITE;
 +	enum req_op op = ctx->flags & BM_AIO_READ ? REQ_OP_READ : REQ_OP_WRITE;
- 	struct bio *bio = bio_alloc_bioset(device->ldev->md_bdev, 1, op,
- 					   GFP_NOIO, &drbd_md_io_bio_set);
  	struct drbd_bitmap *b = device->bitmap;
+ 	struct bio *bio;
  	struct page *page;
+ 	sector_t last_bm_sect;
+ 	sector_t first_bm_sect;
+ 	sector_t on_disk_sector;
  	unsigned int len;
  
- 	sector_t on_disk_sector =
- 		device->ldev->md.md_offset + device->ldev->md.bm_offset;
- 	on_disk_sector += ((sector_t)page_nr) << (PAGE_SHIFT-9);
+ 	first_bm_sect = device->ldev->md.md_offset + device->ldev->md.bm_offset;
+ 	on_disk_sector = first_bm_sect + (((sector_t)page_nr) << (PAGE_SHIFT-SECTOR_SHIFT));
  
  	/* this might happen with very small
  	 * flexible external meta data device,

Please pull!


The following changes since commit ee78ec1077d37d1a4a0860589a65df8ae6d2255c:

  blk-mq: blk_mq_tag_busy is no need to return a value (2022-06-27 06:29:12 -0600)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/for-5.20/drivers-2022-07-29

for you to fetch changes up to 508e357579f07d43cb3feabe93b46bc1648ca5d9:

  bcache: remove EXPERIMENTAL for Kconfig option 'Asynchronous device registration' (2022-07-28 11:13:11 -0600)

----------------------------------------------------------------
for-5.20/drivers-2022-07-29

----------------------------------------------------------------
Caleb Sander (1):
      nvme-tcp: use in-capsule data for I/O connect

Chaitanya Kulkarni (2):
      nvme: remove unused timeout parameter
      nvme: fix qid param blk_mq_alloc_request_hctx

Chris Webb (1):
      md: Explicitly create command-line configured devices

Christophe JAILLET (1):
      block: null_blk: Use the bitmap API to allocate bitmaps

Coly Li (1):
      bcache: remove EXPERIMENTAL for Kconfig option 'Asynchronous device registration'

Dan Carpenter (1):
      null_blk: fix ida error handling in null_add_dev()

Guixin Liu (2):
      nvme-pci: use nvme core helper to cancel requests in tagset
      nvme-apple: use nvme core helper to cancel requests in tagset

Guoqing Jiang (9):
      md: unlock mddev before reap sync_thread in action_store
      rnbd-clt: open code send_msg_open in rnbd_clt_map_device
      rnbd-clt: don't free rsp in msg_open_conf for map scenario
      rnbd-clt: kill read_only from struct rnbd_clt_dev
      rnbd-clt: reduce the size of struct rnbd_clt_dev
      rnbd-clt: adjust the layout of struct rnbd_clt_dev
      rnbd-clt: check capacity inside rnbd_clt_change_capacity
      rnbd-clt: pass sector_t type for resize capacity
      rnbd-clt: make rnbd_clt_change_capacity return void

Hannes Reinecke (11):
      crypto: add crypto_has_shash()
      crypto: add crypto_has_kpp()
      lib/base64: RFC4648-compliant base64 encoding
      nvme: add definitions for NVMe In-Band authentication
      nvme-fabrics: decode 'authentication required' connect error
      nvme: implement In-Band authentication
      nvme-auth: Diffie-Hellman key exchange support
      nvmet: parse fabrics commands on io queues
      nvmet: implement basic In-Band Authentication
      nvmet-auth: Diffie-Hellman key exchange support
      nvmet-auth: expire authentication sessions

Israel Rukshin (1):
      nvme-rdma: remove timeout for getting RDMA-CM established event

Jens Axboe (2):
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.20/drivers
      Merge tag 'nvme-5.20-2022-07-14' of git://git.infradead.org/nvme into for-5.20/drivers

Joel Granados (1):
      nvme-multipath: refactor nvme_mpath_add_disk

Lars Ellenberg (1):
      drbd: bm_page_async_io: fix spurious bitmap "IO error" on large volumes

Logan Gunthorpe (25):
      md/raid5-log: Drop extern decorators for function prototypes
      md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
      md/raid5: suspend the array for calls to log_exit()
      md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
      md/raid5-cache: Drop RCU usage of conf->log
      md/raid5-cache: Clear conf->log after finishing work
      md/raid5-cache: Annotate pslot with __rcu notation
      md: Use enum for overloaded magic numbers used by mddev->curr_resync
      md: Ensure resync is reported after it starts
      md: Notify sysfs sync_completed in md_reap_sync_thread()
      md/raid5: Make logic blocking check consistent with logic that blocks
      md/raid5: Factor out ahead_of_reshape() function
      md/raid5: Refactor raid5_make_request loop
      md/raid5: Move stripe_add_to_batch_list() call out of add_stripe_bio()
      md/raid5: Move common stripe get code into new find_get_stripe() helper
      md/raid5: Factor out helper from raid5_make_request() loop
      md/raid5: Drop the do_prepare flag in raid5_make_request()
      md/raid5: Move read_seqcount_begin() into make_stripe_request()
      md/raid5: Refactor for loop in raid5_make_request() into while loop
      md/raid5: Keep a reference to last stripe_head for batch
      md/raid5: Refactor add_stripe_bio()
      md/raid5: Check all disks in a stripe_head for reshape progress
      md/raid5: Pivot raid5_make_request()
      md/raid5: Improve debug prints
      md/raid5: Increase restriction on max segments per request

Md Haris Iqbal (2):
      block/rnbd-srv: Set keep_id to true after mutex_trylock
      block/rnbd-srv: Replace sess_dev_list with index_idr

Michael Kelley (1):
      nvme: handle the persistent internal error AER

Ming Lei (1):
      null_blk: cleanup null_init_tag_set

Sagi Grimberg (1):
      nvme-loop: use nvme core helpers to cancel all requests in a tagset

Song Liu (1):
      MAINTAINERS: add patchwork link to linux-raid project

Vincent Fu (2):
      null_blk: add module parameters for 4 options
      null_blk: add configfs variables for 2 options

Xiang wangx (1):
      nvme: remove a double word in a comment

Yu Kuai (1):
      nbd: add missing definition of pr_fmt

Zhang Jiaming (1):
      md: Fix spelling mistake in comments

 Documentation/block/null_blk.rst       |   22 +
 MAINTAINERS                            |    1 +
 crypto/kpp.c                           |    6 +
 crypto/shash.c                         |    6 +
 drivers/block/drbd/drbd_bitmap.c       |   49 +-
 drivers/block/nbd.c                    |    6 +-
 drivers/block/null_blk/main.c          |  108 +++-
 drivers/block/null_blk/null_blk.h      |    2 +
 drivers/block/rnbd/rnbd-clt-sysfs.c    |    2 +-
 drivers/block/rnbd/rnbd-clt.c          |  201 ++++---
 drivers/block/rnbd/rnbd-clt.h          |   18 +-
 drivers/block/rnbd/rnbd-srv.c          |   20 +-
 drivers/block/rnbd/rnbd-srv.h          |    4 -
 drivers/md/bcache/Kconfig              |    2 +-
 drivers/md/dm-raid.c                   |    1 +
 drivers/md/md-autodetect.c             |    1 +
 drivers/md/md-cluster.c                |    4 +-
 drivers/md/md.c                        |   76 ++-
 drivers/md/md.h                        |   16 +
 drivers/md/raid5-cache.c               |   40 +-
 drivers/md/raid5-log.h                 |   77 ++-
 drivers/md/raid5-ppl.c                 |    2 +-
 drivers/md/raid5.c                     |  646 +++++++++++++-------
 drivers/nvme/Kconfig                   |    1 +
 drivers/nvme/Makefile                  |    1 +
 drivers/nvme/common/Kconfig            |    4 +
 drivers/nvme/common/Makefile           |    7 +
 drivers/nvme/common/auth.c             |  482 +++++++++++++++
 drivers/nvme/host/Kconfig              |   15 +
 drivers/nvme/host/Makefile             |    1 +
 drivers/nvme/host/apple.c              |    7 +-
 drivers/nvme/host/auth.c               | 1017 ++++++++++++++++++++++++++++++++
 drivers/nvme/host/core.c               |  190 +++++-
 drivers/nvme/host/fabrics.c            |   94 ++-
 drivers/nvme/host/fabrics.h            |    7 +
 drivers/nvme/host/multipath.c          |    6 +-
 drivers/nvme/host/nvme.h               |   39 +-
 drivers/nvme/host/pci.c                |    6 +-
 drivers/nvme/host/rdma.c               |   14 +-
 drivers/nvme/host/tcp.c                |   13 +-
 drivers/nvme/host/trace.c              |   32 +
 drivers/nvme/target/Kconfig            |   15 +
 drivers/nvme/target/Makefile           |    1 +
 drivers/nvme/target/admin-cmd.c        |    4 +-
 drivers/nvme/target/auth.c             |  525 +++++++++++++++++
 drivers/nvme/target/configfs.c         |  138 ++++-
 drivers/nvme/target/core.c             |   15 +
 drivers/nvme/target/fabrics-cmd-auth.c |  545 +++++++++++++++++
 drivers/nvme/target/fabrics-cmd.c      |   55 +-
 drivers/nvme/target/loop.c             |    8 +-
 drivers/nvme/target/nvmet.h            |   75 ++-
 include/crypto/hash.h                  |    2 +
 include/crypto/kpp.h                   |    2 +
 include/linux/base64.h                 |   16 +
 include/linux/nvme-auth.h              |   41 ++
 include/linux/nvme.h                   |  213 ++++++-
 lib/Makefile                           |    2 +-
 lib/base64.c                           |  103 ++++
 58 files changed, 4474 insertions(+), 532 deletions(-)
 create mode 100644 drivers/nvme/common/Kconfig
 create mode 100644 drivers/nvme/common/Makefile
 create mode 100644 drivers/nvme/common/auth.c
 create mode 100644 drivers/nvme/host/auth.c
 create mode 100644 drivers/nvme/target/auth.c
 create mode 100644 drivers/nvme/target/fabrics-cmd-auth.c
 create mode 100644 include/linux/base64.h
 create mode 100644 include/linux/nvme-auth.h
 create mode 100644 lib/base64.c

Comments

Linus Torvalds Aug. 2, 2022, 9:18 p.m. UTC | #1
On Sun, Jul 31, 2022 at 8:03 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On top of the core block changes, here are the block driver changes for
> 5.20-rc1. In detail:

No.

I pulled, and I ended up immediately unpulling again.

Why?

This is pure garbage that doesn't even compile:

> - NVMe pull request via Christoph
>         - add support for In-Band authentication (Hannes Reinecke)

because it is  testing the address of an array member (NOT a pointer!)
being NULL.

Lookie here:

  static struct nvme_auth_dhgroup_map {
          const char name[16];
          const char kpp[16];
  } dhgroup_map[] = {
        ...

  const char *nvme_auth_dhgroup_name(u8 dhgroup_id)
  {
        if ((dhgroup_id > ARRAY_SIZE(dhgroup_map)) ||
            !dhgroup_map[dhgroup_id].name ||
            !strlen(dhgroup_map[dhgroup_id].name))
                return NULL;
        return dhgroup_map[dhgroup_id].name;
  }


That test of "name" being NULL is complete garbage, because "name[]"
is not a pointer, it's a member of the struct, so the address is
simply *within* the struct, and testing for NULL is nonsensical.

As a result, gcc quite reasonably complains

    drivers/nvme/common/auth.c: In function ‘nvme_auth_dhgroup_name’:
    drivers/nvme/common/auth.c:59:13: error: the comparison will
always evaluate as ‘true’ for the address of ‘name’ will never be NULL
[-Werror=address]
       59 |             !dhgroup_map[dhgroup_id].name ||
          |             ^

and the exact same completely broken pattern ends up existing about
five more times in that same source file with other structures and
other structure members (ie there another case of exactly the same
thing, except with 'kpp[]', and then there are other cases of the same
thing except with the 'hash_map[]' structure etc.

This code cannot have gotten much testing at all.

Sure, it's possible that the warnings are compiler version dependent,
but I have two completely different compilers that both complain about
this thing.

Clang just has a slightly different error string, and says

    drivers/nvme/common/auth.c:59:31: error: address of array
'dhgroup_map[dhgroup_id].name' will always evaluate to 'true'
[-Werror,-Wpointer-bool-conversion]
                !dhgroup_map[dhgroup_id].name ||
                ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~

instead.

And no, I don't want some "fix up broken code after the fact" commit
on top. I want that code excised, and I don't want to see another pull
request before it's (a) gone and (b) somebody has looked at where the
testing of this COMPLETELY failed.

                   Linus
Keith Busch Aug. 2, 2022, 9:33 p.m. UTC | #2
On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote:
> And no, I don't want some "fix up broken code after the fact" commit
> on top. I want that code excised, and I don't want to see another pull
> request before it's (a) gone and (b) somebody has looked at where the
> testing of this COMPLETELY failed.

This issue was fixed more than 2 weeks ago, but wasn't included in this pull
request. It's in the current block drivers-post tree, though:

  https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.20/drivers-post&id=4dfbd5418763018f33acded0871fbbc22c8e4695

Do you want us to rebase the nvme pull request with the above squashed into the
original commit instead?
Jens Axboe Aug. 2, 2022, 9:35 p.m. UTC | #3
On 8/2/22 3:33 PM, Keith Busch wrote:
> On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote:
>> And no, I don't want some "fix up broken code after the fact" commit
>> on top. I want that code excised, and I don't want to see another pull
>> request before it's (a) gone and (b) somebody has looked at where the
>> testing of this COMPLETELY failed.
> 
> This issue was fixed more than 2 weeks ago, but wasn't included in this pull
> request. It's in the current block drivers-post tree, though:
> 
>   https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.20/drivers-post&id=4dfbd5418763018f33acded0871fbbc22c8e4695
> 
> Do you want us to rebase the nvme pull request with the above squashed
> into the original commit instead?

I can just rebase drivers and drivers-post into one now that the core
bits are in. It was a bit of a pain since later driver changes ended up
needing more core changes (and has made me re-think the split approach,
I want to make this one block branch going forward since this isn't the
first time).

But if I just rebase drivers + drivers-post into one branch, then I can
squash the commit.

As to testing, I'm going to punt that question to Hannes and Christoph,
as I have no way of testing that particular NVMe feature.
Linus Torvalds Aug. 2, 2022, 9:50 p.m. UTC | #4
On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote:
>
>
> As to testing, I'm going to punt that question to Hannes and Christoph,
> as I have no way of testing that particular NVMe feature.

I can't test the *feature* either.

But dammit, I test two very different build configurations, and both
of them failed miserably on this file.

Don't you get it? That file DOES NOT EVEN COMPILE.

I refuse to have anything to do with a pull request that doesn't even
pass some very fundamental build requirements for me. That implies a
level of lack of testing that just makes me go "No way am I touching
that tree".

                 Linus
Jens Axboe Aug. 2, 2022, 10:24 p.m. UTC | #5
On 8/2/22 3:50 PM, Linus Torvalds wrote:
> On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>>
>> As to testing, I'm going to punt that question to Hannes and Christoph,
>> as I have no way of testing that particular NVMe feature.
> 
> I can't test the *feature* either.
> 
> But dammit, I test two very different build configurations, and both
> of them failed miserably on this file.
> 
> Don't you get it? That file DOES NOT EVEN COMPILE.
> 
> I refuse to have anything to do with a pull request that doesn't even
> pass some very fundamental build requirements for me. That implies a
> level of lack of testing that just makes me go "No way am I touching
> that tree".

I can tell you that I always compile the whole damn thing, and this one
is no exception. The tree is also in for-next and has been for a long
time, both the drivers and drivers-post branch. The build bot has also
vetted both branches, individually, not just as the merged for-next.

I take it this is only happening on clang, which is why I haven't seen
it as I don't compile with clang. We can certainly add that to the usual
pre-flight/post-merge list, but I'm a bit surprised that clang isn't
being done by the build bots too.

If you want to make a clang build a hard requirement for any pull
request, then that should be explicit and not illicit outbursts if
that's just an implied assumption that it is being done. Really.
Jens Axboe Aug. 2, 2022, 10:26 p.m. UTC | #6
On 8/2/22 4:24 PM, Jens Axboe wrote:
> On 8/2/22 3:50 PM, Linus Torvalds wrote:
>> On Tue, Aug 2, 2022 at 2:35 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>
>>> As to testing, I'm going to punt that question to Hannes and Christoph,
>>> as I have no way of testing that particular NVMe feature.
>>
>> I can't test the *feature* either.
>>
>> But dammit, I test two very different build configurations, and both
>> of them failed miserably on this file.
>>
>> Don't you get it? That file DOES NOT EVEN COMPILE.
>>
>> I refuse to have anything to do with a pull request that doesn't even
>> pass some very fundamental build requirements for me. That implies a
>> level of lack of testing that just makes me go "No way am I touching
>> that tree".
> 
> I can tell you that I always compile the whole damn thing, and this one
> is no exception. The tree is also in for-next and has been for a long
> time, both the drivers and drivers-post branch. The build bot has also
> vetted both branches, individually, not just as the merged for-next.
> 
> I take it this is only happening on clang, which is why I haven't seen
> it as I don't compile with clang. We can certainly add that to the usual
> pre-flight/post-merge list, but I'm a bit surprised that clang isn't
> being done by the build bots too.

Side note - it is possible this all happened on the nvme branch
side and hence I didn't see it.

In any case, I've got the branch prepared and we'll send it out
later this merge window when it's all vetted.
Linus Torvalds Aug. 2, 2022, 10:27 p.m. UTC | #7
On Tue, Aug 2, 2022 at 3:24 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> I take it this is only happening on clang, which is why I haven't seen
> it as I don't compile with clang.

It happens both with clang-14.0 and with gcc-12.1.1.

And Keith says that the issue was apparently know about and fixed over
two weeks ago.

So *somebody* knew about it, and fixed it, but apparently the people
involved didn't bother informing upstream.

Regardless of what happened, it's not even remotely acceptable.

                 Linus
Jens Axboe Aug. 2, 2022, 10:33 p.m. UTC | #8
On 8/2/22 4:27 PM, Linus Torvalds wrote:
> On Tue, Aug 2, 2022 at 3:24 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> I take it this is only happening on clang, which is why I haven't seen
>> it as I don't compile with clang.
> 
> It happens both with clang-14.0 and with gcc-12.1.1.

gcc-12.1 is what I use, fwiw.

> And Keith says that the issue was apparently know about and fixed over
> two weeks ago.
> 
> So *somebody* knew about it, and fixed it, but apparently the people
> involved didn't bother informing upstream.
> 
> Regardless of what happened, it's not even remotely acceptable.

Agree, nobody told me about it, and that would've been nice to know.
Linus Torvalds Aug. 2, 2022, 10:48 p.m. UTC | #9
On Tue, Aug 2, 2022 at 3:33 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/2/22 4:27 PM, Linus Torvalds wrote:
> >
> > It happens both with clang-14.0 and with gcc-12.1.1.
>
> gcc-12.1 is what I use, fwiw.

Hmm. I think the last ".1" is actually purely a Fedora artifact. The
gcc people themselves have only released 12.1.0, and looking around
for the srpm info, it looks like all the final F36 ".1" is for some
unrelated patches.

So we may actually be essentially on the same compiler version, and I
don't know why you wouldn't see the error. It showed up with a plain
"make allmodconfig" build here.

                Linus
Jens Axboe Aug. 2, 2022, 10:59 p.m. UTC | #10
On 8/2/22 4:48 PM, Linus Torvalds wrote:
> On Tue, Aug 2, 2022 at 3:33 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 8/2/22 4:27 PM, Linus Torvalds wrote:
>>>
>>> It happens both with clang-14.0 and with gcc-12.1.1.
>>
>> gcc-12.1 is what I use, fwiw.
> 
> Hmm. I think the last ".1" is actually purely a Fedora artifact. The
> gcc people themselves have only released 12.1.0, and looking around
> for the srpm info, it looks like all the final F36 ".1" is for some
> unrelated patches.
> 
> So we may actually be essentially on the same compiler version, and I
> don't know why you wouldn't see the error. It showed up with a plain
> "make allmodconfig" build here.

Actually, I'm mistaken, on the build box it's running 11.3. So that
might explain it? I use 12.1 elsewhere.
Linus Torvalds Aug. 2, 2022, 11:03 p.m. UTC | #11
On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> Actually, I'm mistaken, on the build box it's running 11.3. So that
> might explain it? I use 12.1 elsewhere.

It's possible this -Waddress error is new to gcc-12.

I try to keep most of my machines in sync just to avoid the pain of
different distro details, so I don't have gcc-11 around any more.

             Linus
Jens Axboe Aug. 2, 2022, 11:08 p.m. UTC | #12
On 8/2/22 5:03 PM, Linus Torvalds wrote:
> On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Actually, I'm mistaken, on the build box it's running 11.3. So that
>> might explain it? I use 12.1 elsewhere.
> 
> It's possible this -Waddress error is new to gcc-12.
> 
> I try to keep most of my machines in sync just to avoid the pain of
> different distro details, so I don't have gcc-11 around any more.

I'll get gcc-12 back on it - I originally swapped back to 11 for
building kernels to avoid spurious warnings with the new release.
Jens Axboe Aug. 3, 2022, 3:16 p.m. UTC | #13
On 8/2/22 5:08 PM, Jens Axboe wrote:
> On 8/2/22 5:03 PM, Linus Torvalds wrote:
>> On Tue, Aug 2, 2022 at 3:59 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> Actually, I'm mistaken, on the build box it's running 11.3. So that
>>> might explain it? I use 12.1 elsewhere.
>>
>> It's possible this -Waddress error is new to gcc-12.
>>
>> I try to keep most of my machines in sync just to avoid the pain of
>> different distro details, so I don't have gcc-11 around any more.
> 
> I'll get gcc-12 back on it - I originally swapped back to 11 for
> building kernels to avoid spurious warnings with the new release.

On the topic of warnings, on my new build box I get a lot of these:

ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

which ends up polluting the output quite a bit.

axboe@r7525 ~> ld --version
GNU ld (GNU Binutils for Debian) 2.38.90.20220713
Linus Torvalds Aug. 3, 2022, 4:26 p.m. UTC | #14
On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On the topic of warnings, on my new build box I get a lot of these:
>
> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
>
> which ends up polluting the output quite a bit.
>
> axboe@r7525 ~> ld --version
> GNU ld (GNU Binutils for Debian) 2.38.90.20220713

Ok, I have binutils 2.37, so it may be new to 2.38.

Some googling around seems to imply that we'd need to so something like this

   .section .note.GNU-stack,"",%progbits

in all our *.S files.

We do have some signs of that in our tooling, because apparently it
has hit user-space, but I wonder what has triggered the need on the
kernel side for you.

I'd hate to add that pointless line to every asm file, but maybe we
could so something like this

   #ifdef __ASSEMBLY_
   #ifdef OUTPUT_PROGBITS
      .section .note.GNU-stack,"",%progbits
      #undef OUTPUT_PROGBITS
   #endif
   #endif

and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our
makefiles.

*Most* asm files should include <linux/linkage.h> just for all the
macros that declare variables externally, so that might catch the bulk
of it.

Somebody who knows the rules better than I would be a good idea.

I've added random people who have touched those linkage things in the
past to the participants, in the hope that somebody goes, "No, Linus,
just add flag XYZ to the linker script" or other and there's a clear
and obvious solution.

               Linus
Nick Desaulniers Aug. 3, 2022, 4:51 p.m. UTC | #15
On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On the topic of warnings, on my new build box I get a lot of these:
> >
> > ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
> > ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
> >
> > which ends up polluting the output quite a bit.
> >
> > axboe@r7525 ~> ld --version
> > GNU ld (GNU Binutils for Debian) 2.38.90.20220713
>
> Ok, I have binutils 2.37, so it may be new to 2.38.
>
> Some googling around seems to imply that we'd need to so something like this
>
>    .section .note.GNU-stack,"",%progbits
>
> in all our *.S files.
>
> We do have some signs of that in our tooling, because apparently it
> has hit user-space, but I wonder what has triggered the need on the
> kernel side for you.
>
> I'd hate to add that pointless line to every asm file, but maybe we
> could so something like this
>
>    #ifdef __ASSEMBLY_
>    #ifdef OUTPUT_PROGBITS
>       .section .note.GNU-stack,"",%progbits
>       #undef OUTPUT_PROGBITS
>    #endif
>    #endif
>
> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our
> makefiles.
>
> *Most* asm files should include <linux/linkage.h> just for all the
> macros that declare variables externally, so that might catch the bulk
> of it.
>
> Somebody who knows the rules better than I would be a good idea.

$ as --help | grep exec
  --execstack             require executable stack for this object
  --noexecstack           don't require executable stack for this object
  --statistics            print various measured statistics from execution

Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures
help, rather than modifying every assembler source?

>
> I've added random people who have touched those linkage things in the
> past to the participants, in the hope that somebody goes, "No, Linus,
> just add flag XYZ to the linker script" or other and there's a clear
> and obvious solution.
>
>                Linus
Jens Axboe Aug. 3, 2022, 4:53 p.m. UTC | #16
On 8/3/22 10:51 AM, Nick Desaulniers wrote:
> On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On the topic of warnings, on my new build box I get a lot of these:
>>>
>>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
>>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
>>>
>>> which ends up polluting the output quite a bit.
>>>
>>> axboe@r7525 ~> ld --version
>>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713
>>
>> Ok, I have binutils 2.37, so it may be new to 2.38.
>>
>> Some googling around seems to imply that we'd need to so something like this
>>
>>    .section .note.GNU-stack,"",%progbits
>>
>> in all our *.S files.
>>
>> We do have some signs of that in our tooling, because apparently it
>> has hit user-space, but I wonder what has triggered the need on the
>> kernel side for you.
>>
>> I'd hate to add that pointless line to every asm file, but maybe we
>> could so something like this
>>
>>    #ifdef __ASSEMBLY_
>>    #ifdef OUTPUT_PROGBITS
>>       .section .note.GNU-stack,"",%progbits
>>       #undef OUTPUT_PROGBITS
>>    #endif
>>    #endif
>>
>> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our
>> makefiles.
>>
>> *Most* asm files should include <linux/linkage.h> just for all the
>> macros that declare variables externally, so that might catch the bulk
>> of it.
>>
>> Somebody who knows the rules better than I would be a good idea.
> 
> $ as --help | grep exec
>   --execstack             require executable stack for this object
>   --noexecstack           don't require executable stack for this object
>   --statistics            print various measured statistics from execution
> 
> Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures
> help, rather than modifying every assembler source?

I can try whatever here, but a quick grep doesn't find anything for
KBUILD_ASFLAGS or anything close to it. What am I missing?
Linus Torvalds Aug. 3, 2022, 4:56 p.m. UTC | #17
On Wed, Aug 3, 2022 at 9:51 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures
> help, rather than modifying every assembler source?

See, this is why you cc the experts:

> > I've added random people who have touched those linkage things in the
> > past to the participants, in the hope that somebody goes, "No, Linus,
> > just add flag XYZ to the linker script" or other and there's a clear
> > and obvious solution.

Thanks,
                Linus
Linus Torvalds Aug. 3, 2022, 5 p.m. UTC | #18
On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> I can try whatever here, but a quick grep doesn't find anything for
> KBUILD_ASFLAGS or anything close to it. What am I missing?

I think we only have 'aflags-y' (so you can use config variables etc
to pick them) and EXTRA_AFLAGS.

And we always go through the compiler rather than invoke 'as'
directly, so I think you have to use '-Wa,option' to pass 'option' to
the assembler.

              Linus
Christoph Hellwig Aug. 3, 2022, 5:30 p.m. UTC | #19
On Tue, Aug 02, 2022 at 02:18:57PM -0700, Linus Torvalds wrote:
> This code cannot have gotten much testing at all.

[...]

> And no, I don't want some "fix up broken code after the fact" commit
> on top. I want that code excised, and I don't want to see another pull
> request before it's (a) gone and (b) somebody has looked at where the
> testing of this COMPLETELY failed.
> 

Umm.  The warning is as you said totally reasonable, and we fixed it as
soon as we got the report.  But it turns out my compiler certainly did
not report it (gcc version 10.2.1 20210110 (Debian 10.2.1-6)), Jens's
apparently also did not, and the regular build bot that is running on
tons of architectures did not report it until 9 days after the patch
was commited and pushed out, and until after Jens pulled it.

So while the complaint that we failed to get it into the same pull
request is entirely reasonable, the statement that it cannot have
gotten much testing at all is a bit ridiculous.  It's also not that
"I does not compile at all", but rather that -Werror makes a useful
but mostly harmless warning fatal.
Nick Desaulniers Aug. 3, 2022, 5:38 p.m. UTC | #20
On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/3/22 10:51 AM, Nick Desaulniers wrote:
> > On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>
> >>> On the topic of warnings, on my new build box I get a lot of these:
> >>>
> >>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
> >>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
> >>>
> >>> which ends up polluting the output quite a bit.
> >>>
> >>> axboe@r7525 ~> ld --version
> >>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713
> >>
> >> Ok, I have binutils 2.37, so it may be new to 2.38.
> >>
> >> Some googling around seems to imply that we'd need to so something like this
> >>
> >>    .section .note.GNU-stack,"",%progbits
> >>
> >> in all our *.S files.
> >>
> >> We do have some signs of that in our tooling, because apparently it
> >> has hit user-space, but I wonder what has triggered the need on the
> >> kernel side for you.
> >>
> >> I'd hate to add that pointless line to every asm file, but maybe we
> >> could so something like this
> >>
> >>    #ifdef __ASSEMBLY_
> >>    #ifdef OUTPUT_PROGBITS
> >>       .section .note.GNU-stack,"",%progbits
> >>       #undef OUTPUT_PROGBITS
> >>    #endif
> >>    #endif
> >>
> >> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our
> >> makefiles.
> >>
> >> *Most* asm files should include <linux/linkage.h> just for all the
> >> macros that declare variables externally, so that might catch the bulk
> >> of it.
> >>
> >> Somebody who knows the rules better than I would be a good idea.
> >
> > $ as --help | grep exec
> >   --execstack             require executable stack for this object
> >   --noexecstack           don't require executable stack for this object
> >   --statistics            print various measured statistics from execution
> >
> > Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures
> > help, rather than modifying every assembler source?
>
> I can try whatever here, but a quick grep doesn't find anything for
> KBUILD_ASFLAGS or anything close to it. What am I missing?

Sorry, I'm running between many meetings today...so suggestions aren't
tested and may not be fully coherent...

KBUILD_AFLAGS += -Wa,--noexecstack

There's also as-option Make macro in case older binutils doesn't
support that flag outright.

tools/perf/Makefile.config also uses noexecstack as a linker flag.
That might be an option, too.  It is the linker producing the
warnings, not the assembler, but the assembler flag is probably the
way to go to fix the warnings since it sounds like these are assembler
sources exclusively causing the issue.
Linus Torvalds Aug. 3, 2022, 5:42 p.m. UTC | #21
On Wed, Aug 3, 2022 at 10:30 AM Christoph Hellwig <hch@lst.de> wrote:
>
> So while the complaint that we failed to get it into the same pull
> request is entirely reasonable, the statement that it cannot have
> gotten much testing at all is a bit ridiculous.  It's also not that
> "I does not compile at all", but rather that -Werror makes a useful
> but mostly harmless warning fatal.

Stop this idiocy.

Warnings are fatal. Deal with it. If you think it's ok to have
warnings in your code, go do another project.

The -Werror addition is not new, it's a policy that has been in place
for a decade or so. I just got fed up with people not noticing
warnings and using that as an excuse for their broken code.

The fact is, apparently -Werror _did_ find it, and the people involved
just never bothered to even tell upstream about it, so that two weeks
afterwards I got a TERMINALLY BROKEN pull, request.

Don't make excuses about "it wasn't that broken".

It was broken. Own it.

And something went very very wrong for that breakage to have stayed
around and then being pushed to me.

I want whatever went wrong in that process fixed. Why did you make a
pull request to Jens, and then never notified him about the problems
in it?

                 Linus
Jens Axboe Aug. 3, 2022, 5:45 p.m. UTC | #22
On 8/3/22 11:38 AM, Nick Desaulniers wrote:
> On Wed, Aug 3, 2022 at 9:53 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 8/3/22 10:51 AM, Nick Desaulniers wrote:
>>> On Wed, Aug 3, 2022 at 9:26 AM Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>>
>>>> On Wed, Aug 3, 2022 at 8:16 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On the topic of warnings, on my new build box I get a lot of these:
>>>>>
>>>>> ld: warning: arch/x86/lib/putuser.o: missing .note.GNU-stack section implies executable stack
>>>>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
>>>>>
>>>>> which ends up polluting the output quite a bit.
>>>>>
>>>>> axboe@r7525 ~> ld --version
>>>>> GNU ld (GNU Binutils for Debian) 2.38.90.20220713
>>>>
>>>> Ok, I have binutils 2.37, so it may be new to 2.38.
>>>>
>>>> Some googling around seems to imply that we'd need to so something like this
>>>>
>>>>    .section .note.GNU-stack,"",%progbits
>>>>
>>>> in all our *.S files.
>>>>
>>>> We do have some signs of that in our tooling, because apparently it
>>>> has hit user-space, but I wonder what has triggered the need on the
>>>> kernel side for you.
>>>>
>>>> I'd hate to add that pointless line to every asm file, but maybe we
>>>> could so something like this
>>>>
>>>>    #ifdef __ASSEMBLY_
>>>>    #ifdef OUTPUT_PROGBITS
>>>>       .section .note.GNU-stack,"",%progbits
>>>>       #undef OUTPUT_PROGBITS
>>>>    #endif
>>>>    #endif
>>>>
>>>> and then change our 'AS' command line to do '-DOUTPUT_PROGBITS' in our
>>>> makefiles.
>>>>
>>>> *Most* asm files should include <linux/linkage.h> just for all the
>>>> macros that declare variables externally, so that might catch the bulk
>>>> of it.
>>>>
>>>> Somebody who knows the rules better than I would be a good idea.
>>>
>>> $ as --help | grep exec
>>>   --execstack             require executable stack for this object
>>>   --noexecstack           don't require executable stack for this object
>>>   --statistics            print various measured statistics from execution
>>>
>>> Does adding `--noexecstack` to KBUILD_ASFLAGS for these architectures
>>> help, rather than modifying every assembler source?
>>
>> I can try whatever here, but a quick grep doesn't find anything for
>> KBUILD_ASFLAGS or anything close to it. What am I missing?
> 
> Sorry, I'm running between many meetings today...so suggestions aren't
> tested and may not be fully coherent...
> 
> KBUILD_AFLAGS += -Wa,--noexecstack
> 
> There's also as-option Make macro in case older binutils doesn't
> support that flag outright.
> 
> tools/perf/Makefile.config also uses noexecstack as a linker flag.
> That might be an option, too.  It is the linker producing the
> warnings, not the assembler, but the assembler flag is probably the
> way to go to fix the warnings since it sounds like these are assembler
> sources exclusively causing the issue.

I ran with the below and it silences the linker warning mentioned. I do
also see the below with my build (which aren't new with the option added
obviously, but not visible since I don't get the other noise):

axboe@r7525 ~/gi/linux-block (test)> time make -j256 -s                      1.886s
ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions
ld: warning: vmlinux has a LOAD segment with RWX permissions
ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions
ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
ld: warning: arch/x86/boot/setup.elf has a LOAD segment with RWX permissions


diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 7854685c5f25..51824528a026 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -123,6 +123,7 @@ else
         CHECKFLAGS += -D__x86_64__
 
         KBUILD_AFLAGS += -m64
+        KBUILD_AFLAGS += -Wa,--noexecstack
         KBUILD_CFLAGS += -m64
 
         # Align jump targets to 1 byte, not the default 16 bytes:
Christoph Hellwig Aug. 3, 2022, 5:49 p.m. UTC | #23
On Wed, Aug 03, 2022 at 10:42:32AM -0700, Linus Torvalds wrote:
> Warnings are fatal. Deal with it. If you think it's ok to have
> warnings in your code, go do another project.

Please stop this BS.  I'm the first one to fix every single warning
reported, because I absoutely want warning free code to see problems.
But that does not mean it is "fatal".  It is bad and should go away
by either fixing the code, or if the warning is too broken disable
it in the compiler.  But that does not in any way mean code that
creates warning on some compilers is completely untested and by
definition broken.

We've fixed the warning the day it was reported because of just that,
and we'd always do.  But that doesn't make it whole giant drama.

> The fact is, apparently -Werror _did_ find it, and the people involved

There is nothing about Werror breaking it.  The buildbot reported it,
and that usually reports warnings and errors, so it did not make any
difference.

> I want whatever went wrong in that process fixed. Why did you make a
> pull request to Jens, and then never notified him about the problems
> in it?

Because so far we've never made a big fuzz about a configuation
so obscrube that it takes build bot more than a week to find it.  We
just fix it and go on with life.  But now that I see that you somehow
see it a personal insult I'll be more careful about it.  But to be
honest, while I agree with you 100% that code should be warning free
I'm really amazed about all the drama this created.
Linus Torvalds Aug. 3, 2022, 5:54 p.m. UTC | #24
On Wed, Aug 3, 2022 at 10:50 AM Christoph Hellwig <hch@lst.de> wrote:
> Because so far we've never made a big fuzz about a configuation
> so obscrube that it takes build bot more than a week to find it.

That "obscure" config was a bog-standard "build the code".

It literally happens with "allmodconfig". I found it immediately after pulling.

This is literally why I do full builds after every single pull request
I do - I don't want to have build warnings in my tree.

And I do expect people to make sure that their code has been
sufficiently tested that my smoke-testing doesn't find immediate
problems.

           Linus
Nick Desaulniers Aug. 3, 2022, 6:06 p.m. UTC | #25
+ Linux Toolchains, Nick Clifton (author of
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107)
https://lore.kernel.org/linux-block/CAKwvOdkpNRvD0kDe-j8N0Gkq+1Fdhd6=z-9ROm3gc12Sf0k-Kg@mail.gmail.com/
is the thread for context.

On Wed, Aug 3, 2022 at 10:45 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> I ran with the below and it silences the linker warning mentioned. I do
> also see the below with my build (which aren't new with the option added
> obviously, but not visible since I don't get the other noise):
>
> axboe@r7525 ~/gi/linux-block (test)> time make -j256 -s                      1.886s
> ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
> ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions
> ld: warning: vmlinux has a LOAD segment with RWX permissions

Not sure yet about these.  Looks like there's linker flags to disable
warnings...but I don't recommend those.
https://github.com/systemd/systemd/commit/b0e5bf0451a6bc94e6e7b2a1de668b75c63f38c8
I wonder if they go away by fixing the boot issues described below?

Otherwise, I think we need to find which section is the problematic
one; I suspect the segment flagged as LOAD is composed of many
sections, with possibly only one or a few that needs permissions
adjusted.

> ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack
> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

Right, arch/x86/boot/compressed (and arch/x86/boot/) discard/reset
KBUILD_AFLAGS (via the := assignment operator).

So arch/x86/boot/Makefile and arch/x86/boot/compressed/Makefile also
will need changes similar to the one you did below.

Finally, if those parts of the code actually expect the stack to be
executable (probably depends on some inline asm), I suspect we might
get some kind of fault at runtime (though I don't know how the kernel
handles segment permissions or even uses ELF segments).  Point being
that -Wa,--noexecstack is somewhat a promise that we wont try to
execute data on the stack as if it were code.  -Wa,--execstack is
useful if we need to be able to do so, but we should be careful to
limit that to individual translation units if they really truly need
that.  The linker warning is more so about the current ambiguity since
if the implicit default changes in the future, that could break code.
Better for us to be explicit here, and disable executable stack unless
strictly necessary and only as necessary IMO.  Personally, I think the
current implicit default is wrong, but pragmatically I recognize that
people have been used to the status quo for years, and changing it
could break existing codebases.

If you send the below diff with the two other additions I suggest to
the x86 ML, the x86 maintainers might be able to comment if they're
familiar with any possible cases where the stack is expected to be
executable.

> ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions
> ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack
> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

^ See above.

> ld: warning: arch/x86/boot/setup.elf has a LOAD segment with RWX permissions
>
>
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 7854685c5f25..51824528a026 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -123,6 +123,7 @@ else
>          CHECKFLAGS += -D__x86_64__
>
>          KBUILD_AFLAGS += -m64
> +        KBUILD_AFLAGS += -Wa,--noexecstack
>          KBUILD_CFLAGS += -m64
>
>          # Align jump targets to 1 byte, not the default 16 bytes:
>
> --
> Jens Axboe
>
Jens Axboe Aug. 4, 2022, 4:17 p.m. UTC | #26
On 8/3/22 12:06 PM, Nick Desaulniers wrote:
>> ld: warning: arch/x86/boot/compressed/efi_thunk_64.o: missing .note.GNU-stack section implies executable stack
>> ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
> 
> Right, arch/x86/boot/compressed (and arch/x86/boot/) discard/reset
> KBUILD_AFLAGS (via the := assignment operator).
> 
> So arch/x86/boot/Makefile and arch/x86/boot/compressed/Makefile also
> will need changes similar to the one you did below.
> 
> Finally, if those parts of the code actually expect the stack to be
> executable (probably depends on some inline asm), I suspect we might
> get some kind of fault at runtime (though I don't know how the kernel
> handles segment permissions or even uses ELF segments).  Point being
> that -Wa,--noexecstack is somewhat a promise that we wont try to
> execute data on the stack as if it were code.  -Wa,--execstack is
> useful if we need to be able to do so, but we should be careful to
> limit that to individual translation units if they really truly need
> that.  The linker warning is more so about the current ambiguity since
> if the implicit default changes in the future, that could break code.
> Better for us to be explicit here, and disable executable stack unless
> strictly necessary and only as necessary IMO.  Personally, I think the
> current implicit default is wrong, but pragmatically I recognize that
> people have been used to the status quo for years, and changing it
> could break existing codebases.
> 
> If you send the below diff with the two other additions I suggest to
> the x86 ML, the x86 maintainers might be able to comment if they're
> familiar with any possible cases where the stack is expected to be
> executable.

I'd be happy for you to take this over, I'm really just the reporter
here...
Christoph Hellwig Aug. 6, 2022, 7:44 a.m. UTC | #27
On Wed, Aug 03, 2022 at 10:54:50AM -0700, Linus Torvalds wrote:
> On Wed, Aug 3, 2022 at 10:50 AM Christoph Hellwig <hch@lst.de> wrote:
> > Because so far we've never made a big fuzz about a configuation
> > so obscrube that it takes build bot more than a week to find it.
> 
> That "obscure" config was a bog-standard "build the code".

With a new enough compiler that apparently very few other people, and
most importantly CI tools have.  Again, I'm not arguing this wasn't a
problem, but if there is a mismatch between what compiler you use and
most of the developers and CI infrastructure it isn't a case of
"code that isn't tested at all".
diff mbox series

Patch

diff --cc drivers/block/drbd/drbd_bitmap.c
index 603f6828dd79,bd2133ef6e0a..7d9db33363de
--- a/drivers/block/drbd/drbd_bitmap.c