init/main.c: Initialize early LSMs after arch code

Message ID	20240801171747.3155893-1-kpsingh@kernel.org (mailing list archive)
State	Superseded
Delegated to:	Paul Moore
Headers	show Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B5681EB4B1; Thu, 1 Aug 2024 17:17:55 +0000 (UTC) From: KP Singh <kpsingh@kernel.org> To: linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org Cc: paul@paul-moore.com, kpsingh@kernel.org, bp@alien8.de, sfr@canb.auug.org.au, peterz@infradead.org, nathan@kernel.org Subject: [PATCH] init/main.c: Initialize early LSMs after arch code Date: Thu, 1 Aug 2024 19:17:47 +0200 Message-ID: <20240801171747.3155893-1-kpsingh@kernel.org> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	init/main.c: Initialize early LSMs after arch code \| expand init/main.c: Initialize early LSMs after arch code

KP Singh Aug. 1, 2024, 5:17 p.m. UTC

With LSMs using static calls, early_lsm_init needs to wait for setup_arch
for architecture specific functionality which includes jump tables and
static calls to be initialized.

This only affects "early LSMs" i.e. only lockdown when
CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is set.

Fixes: 2732ad5ecd5b ("lsm: replace indirect LSM hook calls with static calls")
Signed-off-by: KP Singh <kpsingh@kernel.org>
---
 init/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Paul Moore Aug. 5, 2024, 7:57 p.m. UTC | #1

On Thu, Aug 1, 2024 at 1:17 PM KP Singh <kpsingh@kernel.org> wrote:
>
> With LSMs using static calls, early_lsm_init needs to wait for setup_arch
> for architecture specific functionality which includes jump tables and
> static calls to be initialized.
>
> This only affects "early LSMs" i.e. only lockdown when
> CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is set.
>
> Fixes: 2732ad5ecd5b ("lsm: replace indirect LSM hook calls with static calls")
> Signed-off-by: KP Singh <kpsingh@kernel.org>
> ---
>  init/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Considering the problems we've had, I'd like to hear more about how
you've tested this and I'd like to see some reviews/ACKs from some
arch people too.

> diff --git a/init/main.c b/init/main.c
> index 206acdde51f5..a0e3f3c720e6 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -922,8 +922,8 @@ void start_kernel(void)
>         boot_cpu_init();
>         page_address_init();
>         pr_notice("%s", linux_banner);
> -       early_security_init();
>         setup_arch(&command_line);
> +       early_security_init();
>         setup_boot_config();
>         setup_command_line(command_line);
>         setup_nr_cpu_ids();
> --
> 2.46.0.rc2.264.g509ed76dc8-goog

KP Singh Aug. 5, 2024, 11:29 p.m. UTC | #2

On Mon, Aug 5, 2024 at 9:58 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Thu, Aug 1, 2024 at 1:17 PM KP Singh <kpsingh@kernel.org> wrote:
> >
> > With LSMs using static calls, early_lsm_init needs to wait for setup_arch
> > for architecture specific functionality which includes jump tables and
> > static calls to be initialized.
> >
> > This only affects "early LSMs" i.e. only lockdown when
> > CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is set.
> >
> > Fixes: 2732ad5ecd5b ("lsm: replace indirect LSM hook calls with static calls")
> > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > ---
> >  init/main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Considering the problems we've had, I'd like to hear more about how

Sure, the first patch I sent while I had enabled
CONFIG_SECURITY_LOCDOWN_EARLY_INIT, I missed setting it in the command
line i.e CONFIG_LSM. Thus I thought the crash was fixed and I did not
check ARM. But then here's what I did and I will paste outputs for
posterity:

kpsingh@kpsingh:~/projects/linux$ cat .config | grep -i LOCKDOWN
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"

I first tested with the commands that Nathan had given me:

kpsingh@kpsingh:~/projects/linux$ make -skj"$(nproc)" ARCH=arm
CROSS_COMPILE=arm-linux-gnueabi- defconfig repro.config zImage

and I was able to reproduce the issue:

kpsingh@kpsingh:~/projects/linux$ qemu-system-arm       -display none
     -nodefaults       -no-reboot       -machine virt       -append
'console=ttyAMA0 earlycon'       -kernel arch/arm/boot/zImage
-initrd rootfs.cpio       -m 512m       -serial mon:stdio
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 6.11.0-rc1-00011-g269d5c03e612
(kpsingh@kpsingh.zrh.corp.google.com) (arm-linux-gnueabi-gcc (Debian
13.2.0-7) 13.2.0, GNU ld (GNU Binutils for Debian) 2.42) #7 SMP Tue
Aug  6 01:20:11 CEST 2024
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199
static_key_enable_cpuslocked+0xb8/0xf4
[    0.000000] static_key_enable_cpuslocked(): static key '0xc1fb4cb0'
used before call to jump_label_init()
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted
6.11.0-rc1-00011-g269d5c03e612 #7
[    0.000000] Call trace:
[    0.000000]  unwind_backtrace from show_stack+0x10/0x14
[    0.000000]  show_stack from dump_stack_lvl+0x54/0x68
[    0.000000]  dump_stack_lvl from __warn+0x78/0x114
[    0.000000]  __warn from warn_slowpath_fmt+0x124/0x18c
[    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb8/0xf4
[    0.000000]  static_key_enable_cpuslocked from static_key_enable+0x14/0x1c
[    0.000000]  static_key_enable from security_add_hooks+0xc4/0xfc
[    0.000000]  security_add_hooks from lockdown_lsm_init+0x18/0x24
[    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x7c
[    0.000000]  initialize_lsm from early_security_init+0x44/0x50
[    0.000000]  early_security_init from start_kernel+0x64/0x6bc
[    0.000000]  start_kernel from 0x0
[    0.000000] ---[ end trace 0000000000000000 ]---

and with my patch:

kpsingh@kpsingh:~/projects/linux$ git lg -1
48ad43fb07be - init/main.c: Initialize early LSMs after arch code
(HEAD -> fix_jump_table_init) (2024-08-01 KP Singh)
kpsingh@kpsingh:~/projects/linux$ qemu-system-arm       -display none
     -nodefaults       -no-reboot       -machine virt       -append
'console=ttyAMA0 earlycon'       -kernel arch/arm/boot/zImage
-initrd rootfs.cpio       -m 512m       -serial mon:stdio
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 6.11.0-rc1-00012-g48ad43fb07be
(kpsingh@kpsingh.zrh.corp.google.com) (arm-linux-gnueabi-gcc (Debian
13.2.0-7) 13.2.0, GNU ld (GNU Binutils for Debian) 2.42) #8 SMP Tue
Aug  6 01:22:00 CEST 2024
[    0.000000] CPU: ARMv7 Processor [414fc0f0] revision 0 (ARMv7), cr=10c5387d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] OF: fdt: Machine model: linux,dummy-virt
[    0.000000] random: crng init done
[    0.000000] earlycon: pl11 at MMIO 0x09000000 (options '')
[    0.000000] printk: legacy bootconsole [pl11] enabled
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 64 MiB at 0x5c000000 on node -1

Then I went ahead and confirmed this on x86 too:

Before the patch, repro:

/usr/bin/qemu-system-x86_64 -nographic -s -bios qboot.rom -machine q35
-enable-kvm -cpu host -net nic,model=virtio-net-pci -net
user,hostfwd=tcp::5555-:22 -virtfs
local,path=/,mount_tag=hostfs,security_model=none,multidevs=remap
-append "console=ttyS0,115200 root=/dev/sda rw nokaslr
init=/lib/systemd/systemd debug systemd.log_level=info
sysctl.vm.dirty_bytes=2147483647" -smp 24 -m 64G -drive
file=/usr/local/google/home/kpsingh/.vmcli/debian-x86_64.img -qmp
tcp:localhost:4444,server,nowait -serial mon:stdio -kernel
/usr/local/google/home/kpsingh/projects/linux/arch/x86_64/boot/bzImage

[    0.000000] Linux version 6.11.0-rc1-00011-g269d5c03e612
(kpsingh@kpsingh.zrh.corp.google.com) (clang version 19.0.0git
(https://github.com/llvm/llvm-project.git
502e77df1fc4aa859db6709e14e93af6207e4dc4), Debian LLD 16.0.6) #9 SMP
PREEMPT_DYNAMIC Tue Aug  6 01:25:06 CEST 2024
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at
kernel/static_call_inline.c:153 __static_call_update+0x29f/0x310
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted
6.11.0-rc1-00011-g269d5c03e612 #9
[    0.000000] RIP: 0010:__static_call_update+0x29f/0x310
[    0.000000] Code: c1 02 75 9d 80 3d 90 3c 89 03 00 75 94 c6 05 87
3c 89 03 01 48 c7 c7 00 11 0e 83 4c 89 ee e8 c8 d6 c9 ff 0f 0b e9 77
ff ff ff <0f> 0b 48 c7 c7 60 e9 6c 84 e8 b3 24 86 01 e8 0e eb c9 ff 48
c7 44
[    0.000000] RSP: 0000:ffffffff84407d60 EFLAGS: 00010046 ORIG_RAX:
0000000000000000
[    0.000000] RAX: 0000000000000000 RBX: ffffffff8196cf30 RCX: ffffffff82caf1c6
[    0.000000] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff84407ce0
[    0.000000] RBP: ffffffff84407e60 R08: ffffffff84407ce7 R09: 1ffffffff0880f9c
[    0.000000] R10: dffffc0000000000 R11: fffffbfff0880f9d R12: 0000000000000010
[    0.000000] R13: ffffffff848157a0 R14: ffffffff83b69cb0 R15: ffffffff82cc05e8
[    0.000000] FS:  0000000000000000(0000) GS:ffffffff85403000(0000)
knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: ffff888000014700 CR3: 0000000005504000 CR4: 00000000000000b0
[    0.000000] Call Trace:
[    0.000000]  <TASK>
[    0.000000]  ? __warn+0xff/0x2d0
[    0.000000]  ? __static_call_update+0x29f/0x310
[    0.000000]  ? report_bug+0x12f/0x1c0
[    0.000000]  ? early_fixup_exception+0x8a/0x100
[    0.000000]  ? __pfx_lockdown_is_locked_down+0x10/0x10
[    0.000000]  ? __SCT__lsm_static_call_bpf_token_capable_5+0x8/0x8
[    0.000000]  ? early_idt_handler_common+0x2f/0x40
[    0.000000]  ? __SCT__lsm_static_call_bpf_token_capable_5+0x8/0x8
[    0.000000]  ? __pfx_lockdown_is_locked_down+0x10/0x10
[    0.000000]  ? __mutex_unlock_slowpath+0x156/0x400
[    0.000000]  ? __static_call_update+0x29f/0x310
[    0.000000]  ? __pfx_lockdown_is_locked_down+0x10/0x10
[    0.000000]  ? __pfx_vprintk_emit+0x10/0x10
[    0.000000]  ? __asan_memset+0x22/0x50
[    0.000000]  ? __pfx___static_call_update+0x10/0x10
[    0.000000]  ? _printk+0xd4/0x120
[    0.000000]  ? __SCT__lsm_static_call_bpf_token_capable_5+0x8/0x8
[    0.000000]  ? lsm_static_call_init+0x99/0xd0
[    0.000000]  ? security_add_hooks+0x86/0xf0
[    0.000000]  ? lockdown_lsm_init+0x21/0x30
[    0.000000]  ? initialize_lsm+0x48/0x90
[    0.000000]  ? early_security_init+0x52/0x70
[    0.000000]  ? start_kernel+0x6b/0x3d0
[    0.000000]  ? x86_64_start_reservations+0x24/0x30
[    0.000000]  ? x86_64_start_kernel+0xa9/0xb0
[    0.000000]  ? common_startup_64+0x12c/0x137
[    0.000000]  </TASK>
[    0.000000] irq event stamp: 0
[    0.000000] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000000] hardirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000000] softirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000000] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] ------------[ cut here ]------------

and with the patch:

/usr/bin/qemu-system-x86_64 -nographic -s -bios qboot.rom -machine q35
-enable-kvm -cpu host -net nic,model=virtio-net-pci -net
user,hostfwd=tcp::5555-:22 -virtfs
local,path=/,mount_tag=hostfs,security_model=none,multidevs=remap
-append "console=ttyS0,115200 root=/dev/sda rw nokaslr
init=/lib/systemd/systemd debug systemd.log_level=info
sysctl.vm.dirty_bytes=2147483647" -smp 24 -m 64G -drive
file=/usr/local/google/home/kpsingh/.vmcli/debian-x86_64.img -qmp
tcp:localhost:4444,server,nowait -serial mon:stdio -kernel
/usr/local/google/home/kpsingh/projects/linux/arch/x86_64/boot/bzImage

[    0.000000] Linux version 6.11.0-rc1-00012-g48ad43fb07be
(kpsingh@kpsingh.zrh.corp.google.com) (clang version 19.0.0git
(https://github.com/llvm/llvm-project.git
502e77df1fc4aa859db6709e14e93af6207e4dc4), Debian LLD 16.0.6) #10 SMP
PREEMPT_DYNAMIC Tue Aug  6 01:27:35 CEST 2024
[    0.000000] Command line: console=ttyS0,115200 root=/dev/sda rw
nokaslr init=/lib/systemd/systemd debug systemd.log_level=info
sysctl.vm.dirty_bytes=2147483647
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000d0000-0x00000000000effff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000107fffffff] usable
[    0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active

I then left this VM (well a previous instance, I had to redo the tests
to prove that I did them and save the logs) running and using it for
some of my dev work and did not see any crashes since.

> you've tested this and I'd like to see some reviews/ACKs from some
> arch people too.

This is not the same patch as the previous one, it does not change
anything for arch, rather the decision is LSM okay waiting for arch to
initialize. arch never depended on the LSM code, early LSM was done as
early as possible and now, because of static calls it needs to wait
for setup_arch, it's mostly an LSM decision here.

I guess it would not harm Boris, Nathan and others to look at it as
well and see if it breaks any of their tests.

- KP

>
> > diff --git a/init/main.c b/init/main.c
> > index 206acdde51f5..a0e3f3c720e6 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -922,8 +922,8 @@ void start_kernel(void)
> >         boot_cpu_init();
> >         page_address_init();
> >         pr_notice("%s", linux_banner);
> > -       early_security_init();
> >         setup_arch(&command_line);
> > +       early_security_init();
> >         setup_boot_config();
> >         setup_command_line(command_line);
> >         setup_nr_cpu_ids();
> > --
> > 2.46.0.rc2.264.g509ed76dc8-goog
>
> --
> paul-moore.com

Nathan Chancellor Aug. 6, 2024, 2:20 a.m. UTC | #3

On Tue, Aug 06, 2024 at 01:29:37AM +0200, KP Singh wrote:
> On Mon, Aug 5, 2024 at 9:58 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Thu, Aug 1, 2024 at 1:17 PM KP Singh <kpsingh@kernel.org> wrote:
> > >
> > > With LSMs using static calls, early_lsm_init needs to wait for setup_arch
> > > for architecture specific functionality which includes jump tables and
> > > static calls to be initialized.
> > >
> > > This only affects "early LSMs" i.e. only lockdown when
> > > CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is set.
> > >
> > > Fixes: 2732ad5ecd5b ("lsm: replace indirect LSM hook calls with static calls")
> > > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > > ---
> > >  init/main.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Considering the problems we've had, I'd like to hear more about how
...
> I guess it would not harm Boris, Nathan and others to look at it as
> well and see if it breaks any of their tests.

For what it's worth, I have not noticed any issues in my -next testing
with this patch applied but I only build architectures that build with
LLVM due to the nature of my work. If exposure to more architectures is
desirable, perhaps Guenter Roeck would not mind testing it with his
matrix?

Cheers,
Nathan

> > > diff --git a/init/main.c b/init/main.c
> > > index 206acdde51f5..a0e3f3c720e6 100644
> > > --- a/init/main.c
> > > +++ b/init/main.c
> > > @@ -922,8 +922,8 @@ void start_kernel(void)
> > >         boot_cpu_init();
> > >         page_address_init();
> > >         pr_notice("%s", linux_banner);
> > > -       early_security_init();
> > >         setup_arch(&command_line);
> > > +       early_security_init();
> > >         setup_boot_config();
> > >         setup_command_line(command_line);
> > >         setup_nr_cpu_ids();
> > > --
> > > 2.46.0.rc2.264.g509ed76dc8-goog
> >
> > --
> > paul-moore.com

Paul Moore Aug. 6, 2024, 9:41 p.m. UTC | #4

On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> On Tue, Aug 06, 2024 at 01:29:37AM +0200, KP Singh wrote:
> > On Mon, Aug 5, 2024 at 9:58 PM Paul Moore <paul@paul-moore.com> wrote:
> > >
> > > On Thu, Aug 1, 2024 at 1:17 PM KP Singh <kpsingh@kernel.org> wrote:
> > > >
> > > > With LSMs using static calls, early_lsm_init needs to wait for setup_arch
> > > > for architecture specific functionality which includes jump tables and
> > > > static calls to be initialized.
> > > >
> > > > This only affects "early LSMs" i.e. only lockdown when
> > > > CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is set.
> > > >
> > > > Fixes: 2732ad5ecd5b ("lsm: replace indirect LSM hook calls with static calls")
> > > > Signed-off-by: KP Singh <kpsingh@kernel.org>
> > > > ---
> > > >  init/main.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Considering the problems we've had, I'd like to hear more about how
> ...
> > I guess it would not harm Boris, Nathan and others to look at it as
> > well and see if it breaks any of their tests.
>
> For what it's worth, I have not noticed any issues in my -next testing
> with this patch applied but I only build architectures that build with
> LLVM due to the nature of my work. If exposure to more architectures is
> desirable, perhaps Guenter Roeck would not mind testing it with his
> matrix?

Thanks Nathan.

I think the additional testing would be great, KP can you please work
with Guenter to set this up?

Paul Moore Aug. 7, 2024, 8:44 p.m. UTC | #5

On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:

...

> > For what it's worth, I have not noticed any issues in my -next testing
> > with this patch applied but I only build architectures that build with
> > LLVM due to the nature of my work. If exposure to more architectures is
> > desirable, perhaps Guenter Roeck would not mind testing it with his
> > matrix?
>
> Thanks Nathan.
>
> I think the additional testing would be great, KP can you please work
> with Guenter to set this up?

Is that something you can do KP?  I'm asking because I'm looking at
merging some other patches into lsm/dev and I need to make a decision
about the static call patches (hold off on merging the other patches
until the static call testing is complete, or yank the static call
patches until testing is complete and then re-merge).  Understanding
your ability to do the additional testing, and a rough idea of how
long it is going to take would be helpful here.

KP Singh Aug. 7, 2024, 10:45 p.m. UTC | #6

On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> ...
>
> > > For what it's worth, I have not noticed any issues in my -next testing
> > > with this patch applied but I only build architectures that build with
> > > LLVM due to the nature of my work. If exposure to more architectures is
> > > desirable, perhaps Guenter Roeck would not mind testing it with his
> > > matrix?
> >
> > Thanks Nathan.
> >
> > I think the additional testing would be great, KP can you please work
> > with Guenter to set this up?
>

Adding Guenter directly to this thread.

> Is that something you can do KP?  I'm asking because I'm looking at
> merging some other patches into lsm/dev and I need to make a decision
> about the static call patches (hold off on merging the other patches
> until the static call testing is complete, or yank the static call
> patches until testing is complete and then re-merge).  Understanding
> your ability to do the additional testing, and a rough idea of how

I have done the best of the testing I could do here. I think we should
let this run its normal course and see if this breaks anything. I am
not sure how testing is done before patches are merged and what else
you expect me to do?


> long it is going to take would be helpful here.
>
> --
> paul-moore.com

KP Singh Aug. 7, 2024, 10:50 p.m. UTC | #7

On Thu, Aug 8, 2024 at 12:45 AM KP Singh <kpsingh@kernel.org> wrote:
>
> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > ...
> >
> > > > For what it's worth, I have not noticed any issues in my -next testing
> > > > with this patch applied but I only build architectures that build with
> > > > LLVM due to the nature of my work. If exposure to more architectures is
> > > > desirable, perhaps Guenter Roeck would not mind testing it with his
> > > > matrix?
> > >
> > > Thanks Nathan.
> > >
> > > I think the additional testing would be great, KP can you please work
> > > with Guenter to set this up?
> >
>
> Adding Guenter directly to this thread.
>
> > Is that something you can do KP?  I'm asking because I'm looking at
> > merging some other patches into lsm/dev and I need to make a decision
> > about the static call patches (hold off on merging the other patches
> > until the static call testing is complete, or yank the static call
> > patches until testing is complete and then re-merge).  Understanding
> > your ability to do the additional testing, and a rough idea of how
>
> I have done the best of the testing I could do here. I think we should
> let this run its normal course and see if this breaks anything. I am
> not sure how testing is done before patches are merged and what else
> you expect me to do?
>
>

I am adding the bpf mailing list to trigger the BPF CI. That should be
another signal, that's how the BPF tree does its testing.

https://github.com/kernel-patches/bpf/pulls

> > long it is going to take would be helpful here.
> >
> > --
> > paul-moore.com

Paul Moore Aug. 7, 2024, 11:43 p.m. UTC | #8

On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > ...
> >
> > > > For what it's worth, I have not noticed any issues in my -next testing
> > > > with this patch applied but I only build architectures that build with
> > > > LLVM due to the nature of my work. If exposure to more architectures is
> > > > desirable, perhaps Guenter Roeck would not mind testing it with his
> > > > matrix?
> > >
> > > Thanks Nathan.
> > >
> > > I think the additional testing would be great, KP can you please work
> > > with Guenter to set this up?
> >
>
> Adding Guenter directly to this thread.
>
> > Is that something you can do KP?  I'm asking because I'm looking at
> > merging some other patches into lsm/dev and I need to make a decision
> > about the static call patches (hold off on merging the other patches
> > until the static call testing is complete, or yank the static call
> > patches until testing is complete and then re-merge).  Understanding
> > your ability to do the additional testing, and a rough idea of how
>
> I have done the best of the testing I could do here. I think we should
> let this run its normal course and see if this breaks anything. I am
> not sure how testing is done before patches are merged and what else
> you expect me to do?

That is why I was asking you to get in touch with Guenter to try and
sort out what needs to be done to test this across different
architectures.

With all due respect, this patchset has a history of not being as
tested as well as I would like; we had the compilation warning on gcc
and then the linux-next breakage.  The gcc problem wasn't a major
problem (although it was disappointing, especially considering the
context around it), but I consider the linux-next breakage fairly
serious and would like to have some assurance beyond your "it's okay,
trust me" this time around.  If there really is no way to practically
test this patchset across multiple arches prior to throwing it into
linux-next, so be it, but I want to see at least some effort towards
trying to make that happen.

KP Singh Aug. 8, 2024, 12:30 a.m. UTC | #9

On Thu, Aug 8, 2024 at 1:44 AM Paul Moore <paul@paul-moore.com> wrote:
>
> On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
> > On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> > > > On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > ...
> > >
> > > > > For what it's worth, I have not noticed any issues in my -next testing
> > > > > with this patch applied but I only build architectures that build with
> > > > > LLVM due to the nature of my work. If exposure to more architectures is
> > > > > desirable, perhaps Guenter Roeck would not mind testing it with his
> > > > > matrix?
> > > >
> > > > Thanks Nathan.
> > > >
> > > > I think the additional testing would be great, KP can you please work
> > > > with Guenter to set this up?
> > >
> >
> > Adding Guenter directly to this thread.
> >
> > > Is that something you can do KP?  I'm asking because I'm looking at
> > > merging some other patches into lsm/dev and I need to make a decision
> > > about the static call patches (hold off on merging the other patches
> > > until the static call testing is complete, or yank the static call
> > > patches until testing is complete and then re-merge).  Understanding
> > > your ability to do the additional testing, and a rough idea of how
> >
> > I have done the best of the testing I could do here. I think we should
> > let this run its normal course and see if this breaks anything. I am
> > not sure how testing is done before patches are merged and what else
> > you expect me to do?
>
> That is why I was asking you to get in touch with Guenter to try and
> sort out what needs to be done to test this across different
> architectures.
>
> With all due respect, this patchset has a history of not being as
> tested as well as I would like; we had the compilation warning on gcc
> and then the linux-next breakage.  The gcc problem wasn't a major
> problem (although it was disappointing, especially considering the
> context around it), but I consider the linux-next breakage fairly
> serious and would like to have some assurance beyond your "it's okay,
> trust me" this time around.  If there really is no way to practically
> test this patchset across multiple arches prior to throwing it into
> linux-next, so be it, but I want to see at least some effort towards
> trying to make that happen.
>

I did add Guenter to the thread, but really, I cannot offer more
testing than the configs we use in production. I don't use GCC as we
mostly use clang, and we don't use early LSMs which is such a special
case with two extra configs with lockdown. Calling it having a
"history of not being tested is unfair".

If there is a general process / tests you follow before merging
patches, I am happy to run them. In the absence of that, it's not easy
to spot corner cases.

- KP

> --
> paul-moore.com

Guenter Roeck Aug. 8, 2024, 12:34 a.m. UTC | #10

On 8/7/24 16:43, Paul Moore wrote:
> On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
>> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
>>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
>>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
>>>
>>> ...
>>>
>>>>> For what it's worth, I have not noticed any issues in my -next testing
>>>>> with this patch applied but I only build architectures that build with
>>>>> LLVM due to the nature of my work. If exposure to more architectures is
>>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
>>>>> matrix?
>>>>
>>>> Thanks Nathan.
>>>>
>>>> I think the additional testing would be great, KP can you please work
>>>> with Guenter to set this up?
>>>
>>
>> Adding Guenter directly to this thread.
>>
>>> Is that something you can do KP?  I'm asking because I'm looking at
>>> merging some other patches into lsm/dev and I need to make a decision
>>> about the static call patches (hold off on merging the other patches
>>> until the static call testing is complete, or yank the static call
>>> patches until testing is complete and then re-merge).  Understanding
>>> your ability to do the additional testing, and a rough idea of how
>>
>> I have done the best of the testing I could do here. I think we should
>> let this run its normal course and see if this breaks anything. I am
>> not sure how testing is done before patches are merged and what else
>> you expect me to do?
> 
> That is why I was asking you to get in touch with Guenter to try and
> sort out what needs to be done to test this across different
> architectures.
> 
> With all due respect, this patchset has a history of not being as
> tested as well as I would like; we had the compilation warning on gcc
> and then the linux-next breakage.  The gcc problem wasn't a major
> problem (although it was disappointing, especially considering the
> context around it), but I consider the linux-next breakage fairly
> serious and would like to have some assurance beyond your "it's okay,
> trust me" this time around.  If there really is no way to practically
> test this patchset across multiple arches prior to throwing it into
> linux-next, so be it, but I want to see at least some effort towards
> trying to make that happen.
> 

Happy to run whatever patchset there is through my testbed. Just send me
a pointer to it.

Note that it should be based on mainline; linux-next is typically too broken
to provide any useful signals. I can handle a patchset either on top of v6.10
or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
revert patches to/from 6.11-rc2 to get it to pass).

Question of course is if that really helps: I don't specifically test features
such as LSM or BPF.

Thanks,
Guenter

KP Singh Aug. 8, 2024, 12:40 a.m. UTC | #11

On Thu, Aug 8, 2024 at 2:34 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 8/7/24 16:43, Paul Moore wrote:
> > On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
> >> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
> >>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> >>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >>>
> >>> ...
> >>>
> >>>>> For what it's worth, I have not noticed any issues in my -next testing
> >>>>> with this patch applied but I only build architectures that build with
> >>>>> LLVM due to the nature of my work. If exposure to more architectures is
> >>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
> >>>>> matrix?
> >>>>
> >>>> Thanks Nathan.
> >>>>
> >>>> I think the additional testing would be great, KP can you please work
> >>>> with Guenter to set this up?
> >>>
> >>
> >> Adding Guenter directly to this thread.
> >>
> >>> Is that something you can do KP?  I'm asking because I'm looking at
> >>> merging some other patches into lsm/dev and I need to make a decision
> >>> about the static call patches (hold off on merging the other patches
> >>> until the static call testing is complete, or yank the static call
> >>> patches until testing is complete and then re-merge).  Understanding
> >>> your ability to do the additional testing, and a rough idea of how
> >>
> >> I have done the best of the testing I could do here. I think we should
> >> let this run its normal course and see if this breaks anything. I am
> >> not sure how testing is done before patches are merged and what else
> >> you expect me to do?
> >
> > That is why I was asking you to get in touch with Guenter to try and
> > sort out what needs to be done to test this across different
> > architectures.
> >
> > With all due respect, this patchset has a history of not being as
> > tested as well as I would like; we had the compilation warning on gcc
> > and then the linux-next breakage.  The gcc problem wasn't a major
> > problem (although it was disappointing, especially considering the
> > context around it), but I consider the linux-next breakage fairly
> > serious and would like to have some assurance beyond your "it's okay,
> > trust me" this time around.  If there really is no way to practically
> > test this patchset across multiple arches prior to throwing it into
> > linux-next, so be it, but I want to see at least some effort towards
> > trying to make that happen.
> >
>
> Happy to run whatever patchset there is through my testbed. Just send me
> a pointer to it.
>
> Note that it should be based on mainline; linux-next is typically too broken
> to provide any useful signals. I can handle a patchset either on top of v6.10
> or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
> revert patches to/from 6.11-rc2 to get it to pass).
>
> Question of course is if that really helps: I don't specifically test features
> such as LSM or BPF.

The changes would not be specific to BPF LSM, we just want to check if
our init/main.c refactoring breaks some arch stuff.

I rebased my patches and pushed a branch based on v6.11-rc2:

https://git.kernel.org/pub/scm/linux/kernel/git/kpsingh/linux.git/log/?h=static_calls

- KP

>
> Thanks,
> Guenter
>

Paul Moore Aug. 8, 2024, 1:18 a.m. UTC | #12

On Wed, Aug 7, 2024 at 8:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
> On 8/7/24 16:43, Paul Moore wrote:
> > On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
> >> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
> >>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
> >>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >>>
> >>> ...
> >>>
> >>>>> For what it's worth, I have not noticed any issues in my -next testing
> >>>>> with this patch applied but I only build architectures that build with
> >>>>> LLVM due to the nature of my work. If exposure to more architectures is
> >>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
> >>>>> matrix?
> >>>>
> >>>> Thanks Nathan.
> >>>>
> >>>> I think the additional testing would be great, KP can you please work
> >>>> with Guenter to set this up?
> >>>
> >>
> >> Adding Guenter directly to this thread.
> >>
> >>> Is that something you can do KP?  I'm asking because I'm looking at
> >>> merging some other patches into lsm/dev and I need to make a decision
> >>> about the static call patches (hold off on merging the other patches
> >>> until the static call testing is complete, or yank the static call
> >>> patches until testing is complete and then re-merge).  Understanding
> >>> your ability to do the additional testing, and a rough idea of how
> >>
> >> I have done the best of the testing I could do here. I think we should
> >> let this run its normal course and see if this breaks anything. I am
> >> not sure how testing is done before patches are merged and what else
> >> you expect me to do?
> >
> > That is why I was asking you to get in touch with Guenter to try and
> > sort out what needs to be done to test this across different
> > architectures.
> >
> > With all due respect, this patchset has a history of not being as
> > tested as well as I would like; we had the compilation warning on gcc
> > and then the linux-next breakage.  The gcc problem wasn't a major
> > problem (although it was disappointing, especially considering the
> > context around it), but I consider the linux-next breakage fairly
> > serious and would like to have some assurance beyond your "it's okay,
> > trust me" this time around.  If there really is no way to practically
> > test this patchset across multiple arches prior to throwing it into
> > linux-next, so be it, but I want to see at least some effort towards
> > trying to make that happen.
> >
>
> Happy to run whatever patchset there is through my testbed. Just send me
> a pointer to it.
>
> Note that it should be based on mainline; linux-next is typically too broken
> to provide any useful signals. I can handle a patchset either on top of v6.10
> or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
> revert patches to/from 6.11-rc2 to get it to pass).

Thanks Guenter, it looks like KP already make up a branch for you to
pull, but if you have any problems or need something different let us
know.

> Question of course is if that really helps: I don't specifically test features
> such as LSM or BPF.

In this particular case we are most interested in testing the LSM
initializing code so I don't believe you need to worry much about
LSM/BPF configuration, it's a matter of ensuring the different arches
are able to boot without any panics/warnings/etc.

There is some Kconfig needed, KP provided a good snippet earlier in
this thread, the relevant portion is copied below:

% cat .config | grep -i LOCKDOWN
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"

... and here is the full message:

https://lore.kernel.org/linux-security-module/CACYkzJ6486mzW97LF+QrHhM9-pZt0QPWFH+oCrTmubGkJVvGhw@mail.gmail.com/

Guenter Roeck Aug. 8, 2024, 1:20 a.m. UTC | #13

On 8/7/24 17:40, KP Singh wrote:
> On Thu, Aug 8, 2024 at 2:34 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 8/7/24 16:43, Paul Moore wrote:
>>> On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
>>>> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>>> For what it's worth, I have not noticed any issues in my -next testing
>>>>>>> with this patch applied but I only build architectures that build with
>>>>>>> LLVM due to the nature of my work. If exposure to more architectures is
>>>>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
>>>>>>> matrix?
>>>>>>
>>>>>> Thanks Nathan.
>>>>>>
>>>>>> I think the additional testing would be great, KP can you please work
>>>>>> with Guenter to set this up?
>>>>>
>>>>
>>>> Adding Guenter directly to this thread.
>>>>
>>>>> Is that something you can do KP?  I'm asking because I'm looking at
>>>>> merging some other patches into lsm/dev and I need to make a decision
>>>>> about the static call patches (hold off on merging the other patches
>>>>> until the static call testing is complete, or yank the static call
>>>>> patches until testing is complete and then re-merge).  Understanding
>>>>> your ability to do the additional testing, and a rough idea of how
>>>>
>>>> I have done the best of the testing I could do here. I think we should
>>>> let this run its normal course and see if this breaks anything. I am
>>>> not sure how testing is done before patches are merged and what else
>>>> you expect me to do?
>>>
>>> That is why I was asking you to get in touch with Guenter to try and
>>> sort out what needs to be done to test this across different
>>> architectures.
>>>
>>> With all due respect, this patchset has a history of not being as
>>> tested as well as I would like; we had the compilation warning on gcc
>>> and then the linux-next breakage.  The gcc problem wasn't a major
>>> problem (although it was disappointing, especially considering the
>>> context around it), but I consider the linux-next breakage fairly
>>> serious and would like to have some assurance beyond your "it's okay,
>>> trust me" this time around.  If there really is no way to practically
>>> test this patchset across multiple arches prior to throwing it into
>>> linux-next, so be it, but I want to see at least some effort towards
>>> trying to make that happen.
>>>
>>
>> Happy to run whatever patchset there is through my testbed. Just send me
>> a pointer to it.
>>
>> Note that it should be based on mainline; linux-next is typically too broken
>> to provide any useful signals. I can handle a patchset either on top of v6.10
>> or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
>> revert patches to/from 6.11-rc2 to get it to pass).
>>
>> Question of course is if that really helps: I don't specifically test features
>> such as LSM or BPF.
> 
> The changes would not be specific to BPF LSM, we just want to check if
> our init/main.c refactoring breaks some arch stuff.
> 

Ok.

> I rebased my patches and pushed a branch based on v6.11-rc2:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kpsingh/linux.git/log/?h=static_calls
> 

I merged your branch into my testing branch and pushed it into my testbed.
It will run tonight. I'll send you the results tomorrow morning.

Thanks,
Guenter

Guenter Roeck Aug. 8, 2024, 1:40 a.m. UTC | #14

On 8/7/24 18:18, Paul Moore wrote:
> On Wed, Aug 7, 2024 at 8:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
>> On 8/7/24 16:43, Paul Moore wrote:
>>> On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
>>>> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>>> For what it's worth, I have not noticed any issues in my -next testing
>>>>>>> with this patch applied but I only build architectures that build with
>>>>>>> LLVM due to the nature of my work. If exposure to more architectures is
>>>>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
>>>>>>> matrix?
>>>>>>
>>>>>> Thanks Nathan.
>>>>>>
>>>>>> I think the additional testing would be great, KP can you please work
>>>>>> with Guenter to set this up?
>>>>>
>>>>
>>>> Adding Guenter directly to this thread.
>>>>
>>>>> Is that something you can do KP?  I'm asking because I'm looking at
>>>>> merging some other patches into lsm/dev and I need to make a decision
>>>>> about the static call patches (hold off on merging the other patches
>>>>> until the static call testing is complete, or yank the static call
>>>>> patches until testing is complete and then re-merge).  Understanding
>>>>> your ability to do the additional testing, and a rough idea of how
>>>>
>>>> I have done the best of the testing I could do here. I think we should
>>>> let this run its normal course and see if this breaks anything. I am
>>>> not sure how testing is done before patches are merged and what else
>>>> you expect me to do?
>>>
>>> That is why I was asking you to get in touch with Guenter to try and
>>> sort out what needs to be done to test this across different
>>> architectures.
>>>
>>> With all due respect, this patchset has a history of not being as
>>> tested as well as I would like; we had the compilation warning on gcc
>>> and then the linux-next breakage.  The gcc problem wasn't a major
>>> problem (although it was disappointing, especially considering the
>>> context around it), but I consider the linux-next breakage fairly
>>> serious and would like to have some assurance beyond your "it's okay,
>>> trust me" this time around.  If there really is no way to practically
>>> test this patchset across multiple arches prior to throwing it into
>>> linux-next, so be it, but I want to see at least some effort towards
>>> trying to make that happen.
>>>
>>
>> Happy to run whatever patchset there is through my testbed. Just send me
>> a pointer to it.
>>
>> Note that it should be based on mainline; linux-next is typically too broken
>> to provide any useful signals. I can handle a patchset either on top of v6.10
>> or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
>> revert patches to/from 6.11-rc2 to get it to pass).
> 
> Thanks Guenter, it looks like KP already make up a branch for you to
> pull, but if you have any problems or need something different let us
> know.
> 
>> Question of course is if that really helps: I don't specifically test features
>> such as LSM or BPF.
> 
> In this particular case we are most interested in testing the LSM
> initializing code so I don't believe you need to worry much about
> LSM/BPF configuration, it's a matter of ensuring the different arches
> are able to boot without any panics/warnings/etc.
> 
> There is some Kconfig needed, KP provided a good snippet earlier in
> this thread, the relevant portion is copied below:
> 
> % cat .config | grep -i LOCKDOWN
> CONFIG_SECURITY_LOCKDOWN_LSM=y
> CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
> CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"
> 

Ah, that might be a bit more challenging. I don't explicitly enable
security LSMs, so just building and loading them on various architectures
might be problematic even without the changes discussed here.

I'll enable all security modules referenced above in my test builds.
If we are lucky everything will pass; otherwise I may need a couple
of runs to establish a baseline.

Thanks,
Guenter

Guenter Roeck Aug. 8, 2024, 2:13 a.m. UTC | #15

On 8/7/24 18:18, Paul Moore wrote:
> On Wed, Aug 7, 2024 at 8:34 PM Guenter Roeck <linux@roeck-us.net> wrote:
>> On 8/7/24 16:43, Paul Moore wrote:
>>> On Wed, Aug 7, 2024 at 6:45 PM KP Singh <kpsingh@kernel.org> wrote:
>>>> On Wed, Aug 7, 2024 at 10:45 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>> On Tue, Aug 6, 2024 at 5:41 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>>> On Mon, Aug 5, 2024 at 10:20 PM Nathan Chancellor <nathan@kernel.org> wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>>> For what it's worth, I have not noticed any issues in my -next testing
>>>>>>> with this patch applied but I only build architectures that build with
>>>>>>> LLVM due to the nature of my work. If exposure to more architectures is
>>>>>>> desirable, perhaps Guenter Roeck would not mind testing it with his
>>>>>>> matrix?
>>>>>>
>>>>>> Thanks Nathan.
>>>>>>
>>>>>> I think the additional testing would be great, KP can you please work
>>>>>> with Guenter to set this up?
>>>>>
>>>>
>>>> Adding Guenter directly to this thread.
>>>>
>>>>> Is that something you can do KP?  I'm asking because I'm looking at
>>>>> merging some other patches into lsm/dev and I need to make a decision
>>>>> about the static call patches (hold off on merging the other patches
>>>>> until the static call testing is complete, or yank the static call
>>>>> patches until testing is complete and then re-merge).  Understanding
>>>>> your ability to do the additional testing, and a rough idea of how
>>>>
>>>> I have done the best of the testing I could do here. I think we should
>>>> let this run its normal course and see if this breaks anything. I am
>>>> not sure how testing is done before patches are merged and what else
>>>> you expect me to do?
>>>
>>> That is why I was asking you to get in touch with Guenter to try and
>>> sort out what needs to be done to test this across different
>>> architectures.
>>>
>>> With all due respect, this patchset has a history of not being as
>>> tested as well as I would like; we had the compilation warning on gcc
>>> and then the linux-next breakage.  The gcc problem wasn't a major
>>> problem (although it was disappointing, especially considering the
>>> context around it), but I consider the linux-next breakage fairly
>>> serious and would like to have some assurance beyond your "it's okay,
>>> trust me" this time around.  If there really is no way to practically
>>> test this patchset across multiple arches prior to throwing it into
>>> linux-next, so be it, but I want to see at least some effort towards
>>> trying to make that happen.
>>>
>>
>> Happy to run whatever patchset there is through my testbed. Just send me
>> a pointer to it.
>>
>> Note that it should be based on mainline; linux-next is typically too broken
>> to provide any useful signals. I can handle a patchset either on top of v6.10
>> or v6.11-rc2 (meaning 6.10 passes through all my tests, and I can apply and
>> revert patches to/from 6.11-rc2 to get it to pass).
> 
> Thanks Guenter, it looks like KP already make up a branch for you to
> pull, but if you have any problems or need something different let us
> know.
> 
>> Question of course is if that really helps: I don't specifically test features
>> such as LSM or BPF.
> 
> In this particular case we are most interested in testing the LSM
> initializing code so I don't believe you need to worry much about
> LSM/BPF configuration, it's a matter of ensuring the different arches
> are able to boot without any panics/warnings/etc.
> 
> There is some Kconfig needed, KP provided a good snippet earlier in
> this thread, the relevant portion is copied below:
> 
> % cat .config | grep -i LOCKDOWN
> CONFIG_SECURITY_LOCKDOWN_LSM=y
> CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
> CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"
> 
> ... and here is the full message:
> 
> https://lore.kernel.org/linux-security-module/CACYkzJ6486mzW97LF+QrHhM9-pZt0QPWFH+oCrTmubGkJVvGhw@mail.gmail.com/
> 

I'll need to establish a baseline first to determine if the failures
are caused by newly enabled configuration options or by this patch set.
Below are just early test results.

[ Though if those are all upstream there seems to be be something seriously
   wrong with the lockdown lsm.
]

Guenter

----
arm:

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199 static_key_enable_cpuslocked+0xb0/0xfc
[    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x8' used before call to jump_label_init()
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
[    0.000000] Hardware name: Generic DT based system
[    0.000000] Call trace:
[    0.000000]  unwind_backtrace from show_stack+0x18/0x1c
[    0.000000]  show_stack from dump_stack_lvl+0x48/0x74
[    0.000000]  dump_stack_lvl from __warn+0x7c/0x134
[    0.000000]  __warn from warn_slowpath_fmt+0x9c/0xdc
[    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb0/0xfc
[    0.000000]  static_key_enable_cpuslocked from security_add_hooks+0xa0/0x104
[    0.000000]  security_add_hooks from lockdown_lsm_init+0x1c/0x2c
[    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x84
[    0.000000]  initialize_lsm from early_security_init+0x3c/0x58
[    0.000000]  early_security_init from start_kernel+0x78/0x748
[    0.000000]  start_kernel from 0x0
[    0.000000] irq event stamp: 0
[    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
[    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
[    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
[    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
[    0.000000] ---[ end trace 0000000000000000 ]---

m68k:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xc4/0x12c
static_key_enable(): static key '0x6e5860' used before call to jump_label_init()
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-mac-00134-g679d51771510 #1
Stack from 0065df00:
         0065df00 005ff98d 005ff98d 00000000 00000009 00000009 004aa710 005ff98d
         0049f87a 005c9849 00000142 0063f5ec 004cbd3e 0049f8f8 005c9849 00000142
         0075ac3e 00000009 00000000 0065df60 00000000 00000040 00000000 00000000
         005c980c 0065df7c 0075ac3e 005c9849 00000142 00000009 005c980c 004c9f98
         006e5860 00000000 00782b50 00000000 00000000 0075b7ba 0063f5ec 00000001
         004cbd3e 0075a62e 00782b50 0075a79e 00782b50 00782b50 0049feb6 00749d4c
Call Trace: [<004aa710>] dump_stack+0xc/0x10
  [<0049f87a>] __warn+0x7e/0xb4
  [<0049f8f8>] warn_slowpath_fmt+0x48/0x66
  [<0075ac3e>] security_add_hooks+0xc4/0x12c
  [<0075ac3e>] security_add_hooks+0xc4/0x12c
  [<0075b7ba>] lockdown_lsm_init+0x16/0x1e
  [<0075a62e>] initialize_lsm+0x32/0x5c
  [<0075a79e>] early_security_init+0x30/0x38
  [<0049feb6>] _printk+0x0/0x18
  [<00749d4c>] start_kernel+0x60/0x600
  [<00748414>] _sinittext+0x414/0xae0
---[ end trace 0000000000000000 ]---

Microblaze:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0x124/0x21c
static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
Kernel Stack:
(ptrval): c0999390 c0f4c9ec 00000000 00000000 ffffffff a589f3a9 c0984c20 00000000
(ptrval): c0c51ef8 00000009 c0984c30 00000000 00000000 c0c51ef8 00000000 c0c51ef8
(ptrval): 00000009 c0984cf8 c09bad94 00000000 00000000 c0a30c10 00000142 c0d19e10
(ptrval): c0a30bd0 c0a30c10 00000000 c0d19e10 c09bade4 00000142 00000009 c0a30bd0
(ptrval): c0a30ca0 c0f58820 c0a30bd0 c0c51f28 00000142 00000009 c0d19e10 c0a37340
(ptrval): c0c190c0 c0d1b1d0 00000000 00000000 00000000 c0a30bd0 c0a30ca0 c0f58820
(ptrval): c0d42b20 c0d35464 c0d42b38 00000000 00000000 00000000 00000000 00000000
(ptrval): 00100000 00000280 c0d196e8 c0d04ed0 00000000 c098465c 00000000 00000000
(ptrval): c0d19778 c0d19784 00000000 00000000 c0d0488c c09b8e40 c09b9b24 c0d42b20
(ptrval): c0d42b38 c0d00898 4883e4b3 00000000 c0d0088c 00000280 00000000 00000000
(ptrval): 00000000 00000000 00000000 c0984194 c09b7208 c0b125f8 c0f5d59c 00000000
(ptrval): 00000002 00000000 c00002e0 91a86e08 c0d33f7c 00000000 00000000 00000000
(ptrval): 00000000 00000000 00000000 00000000
Call Trace:
[<c0003168>] microblaze_unwind+0x64/0x80
[<c0984548>] show_stack+0x128/0x180
[<c0999330>] dump_stack_lvl+0x44/0x94
[<c099938c>] dump_stack+0xc/0x24
[<c0984c2c>] __warn+0xac/0xfc
[<c0984cf4>] warn_slowpath_fmt+0x78/0x98
[<c0d19e0c>] security_add_hooks+0x120/0x21c
[<c0d1b1cc>] lockdown_lsm_init+0x18/0x34
[<c0d196e4>] initialize_lsm+0x44/0x94
[<c0d19780>] early_security_init+0x4c/0x74
[<c0d00894>] start_kernel+0x90/0x8ac
[<c0984190>] machine_shutdown+0x1c/0x20
no locks held by swapper/0.
---[ end trace 0000000000000000 ]---

mips:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xf8/0x1bc
static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
Hardware name: mti,malta
Stack : 00000000 811eedd8 00000000 00000000 00000000 00000000 00000000 00000000
         00000000 00000000 00000000 00000000 00000000 00000001 81257cd8 00000000
         81257d70 00000000 00000000 00000000 00000038 80e549c4 00000000 ffffffff
         00000000 00000000 00000000 00040000 00000000 00000000 81174584 81280000
         00000000 00000142 00000000 00000000 00000000 00000000 0a0a0b0b bbe00cfc
         ...
Call Trace:
[<8010a0a8>] show_stack+0x60/0x154
[<80e731d8>] dump_stack_lvl+0xbc/0x138
[<8012f908>] __warn+0x9c/0x1f8
[<8012fc20>] warn_slowpath_fmt+0x1bc/0x1cc
[<8138a184>] security_add_hooks+0xf8/0x1bc
[<8138a5fc>] lockdown_lsm_init+0x20/0x30
[<813899e8>] initialize_lsm+0x44/0x80
[<81389be0>] early_security_init+0x50/0x6c
[<8136c82c>] start_kernel+0xa8/0x7dc
irq event stamp: 0
hardirqs last  enabled at (0): [<00000000>] 0x0
hardirqs last disabled at (0): [<00000000>] 0x0
softirqs last  enabled at (0): [<00000000>] 0x0
softirqs last disabled at (0): [<00000000>] 0x0
---[ end trace 0000000000000000 ]---

Loongarch (crash):

[    0.000000] ------------[ cut here ]------------
[    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x10' used before call to jump_label_init()
[    0.000000] ------------[ cut here ]------------
[    0.000000] DEBUG_LOCKS_WARN_ON(early_boot_irqs_disabled)
[    0.000000] Caught reserved exception 12 on pid:0 [swapper] - should not happen
[    0.000000] do_reserved exception[#1]:
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2+ #1
[    0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[    0.000000] pc 9000000004cf9334 ra 9000000004cf9334 tp 9000000006cc8000 sp 9000000006ccbc10
[    0.000000] a0 000000000000002d a1 9000000006df7830 a2 0000000000000000 a3 9000000006ccba28
[    0.000000] a4 0000000000000001 a5 0000000000000000 a6 9000000006175570 a7 0000000000000005
[    0.000000] t0 0000000000000000 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001
[    0.000000] t4 0000000000000004 t5 0000000000000094 t6 0000000000000023 t7 0000000000000030
[    0.000000] t8 ffffffff8dcb3998 u0 9000000006a45388 s9 000000000f5ea330 s0 9000000006230788
[    0.000000] s1 9000000006265c70 s2 0000000000000001 s3 0000000000000001 s4 9000000006cfaa80
[    0.000000] s5 000000000f75dad8 s6 000000000a5b0000 s7 000000000f75db30 s8 000000000eee5b18
[    0.000000]    ra: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
[    0.000000]   ERA: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
[    0.000000]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[    0.000000]  PRMD: 00000000 (PPLV0 -PIE -PWE)
[    0.000000]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[    0.000000]  ECFG: 00070800 (LIE=11 VS=7)
[    0.000000] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[    0.000000]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[    0.000000] Modules linked in:
[    0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____))
[    0.000000] Stack : 0000000000000001 9000000006265c70 9000000006169c58 9000000004dd9ba8
[    0.000000]         9000000006ccbc70 0000000000000000 9000000006ccbc70 9000000006169c58
[    0.000000]         00000000000000b0 90000000074f08b8 9000000008616478 9000000007ad1924
[    0.000000]         0000000000000000 9000000004e95fa8 9000000006cc8000 9000000006ccbdb0
[    0.000000]         000000000000007e 9000000006df7830 0000000000000000 9000000006ccbbc8
[    0.000000]         0000000000000001 0000000000000001 90000000073f6e58 9000000006175570
[    0.000000]         0000000000000000 0000000000000000 0000000000000001 0000000000000001
[    0.000000]         0000000000000000 0000000000000092 0000000000000001 0000000000006000
[    0.000000]         ffffffff8dcb3998 9000000006a6bed8 000000000f5ea330 9000000008616478
[    0.000000]         90000000074f08b8 0000000000000001 0000000000000001 9000000006cfaa80
[    0.000000]         ...
[    0.000000] Call Trace:
[    0.000000] [<9000000004cf9334>] lockdep_hardirqs_on_prepare+0x200/0x208
[    0.000000] [<9000000004dd9ba4>] trace_hardirqs_on+0x54/0x70
[    0.000000] [<9000000006169c54>] do_reserved+0x1c/0xcc
[    0.000000] [<9000000004c52560>] handle_bp+0x120/0x1c0
[    0.000000] [<9000000004e95fa8>] static_key_enable_cpuslocked+0xdc/0xec
[    0.000000] [<9000000004e960b8>] static_key_enable+0x18/0x2c
[    0.000000] [<90000000061a9154>] security_add_hooks+0xbc/0x12c
[    0.000000] [<90000000061aa880>] lockdown_lsm_init+0x20/0x34
[    0.000000] [<90000000061a8a80>] initialize_lsm+0x3c/0x6c
[    0.000000] [<90000000061a8c34>] early_security_init+0x44/0x68
[    0.000000] [<9000000006180830>] start_kernel+0xa0/0x84c
[    0.000000] [<900000000616d0f0>] kernel_entry+0xf0/0xf8

Guenter Roeck Aug. 8, 2024, 4:07 a.m. UTC | #16

On 8/7/24 19:13, Guenter Roeck wrote:
...
> 
> I'll need to establish a baseline first to determine if the failures
> are caused by newly enabled configuration options or by this patch set.
> Below are just early test results.
> 
> [ Though if those are all upstream there seems to be be something seriously
>    wrong with the lockdown lsm.
> ]
> 

Verdict is that all the messages below are from this patch set.

On top of the reports below, alpha images fail completely, and the
backtraces are seen with several architectures. Please see the
"testing" column at https://kerneltests.org/builders for details.

The only unrelated problems are the apparmor unit test failures;
those apparently fail on all big endian systems.

Guenter

> Guenter
> 
> ----
> arm:
> 
> [    0.000000] ------------[ cut here ]------------
> [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199 static_key_enable_cpuslocked+0xb0/0xfc
> [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x8' used before call to jump_label_init()
> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> [    0.000000] Hardware name: Generic DT based system
> [    0.000000] Call trace:
> [    0.000000]  unwind_backtrace from show_stack+0x18/0x1c
> [    0.000000]  show_stack from dump_stack_lvl+0x48/0x74
> [    0.000000]  dump_stack_lvl from __warn+0x7c/0x134
> [    0.000000]  __warn from warn_slowpath_fmt+0x9c/0xdc
> [    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb0/0xfc
> [    0.000000]  static_key_enable_cpuslocked from security_add_hooks+0xa0/0x104
> [    0.000000]  security_add_hooks from lockdown_lsm_init+0x1c/0x2c
> [    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x84
> [    0.000000]  initialize_lsm from early_security_init+0x3c/0x58
> [    0.000000]  early_security_init from start_kernel+0x78/0x748
> [    0.000000]  start_kernel from 0x0
> [    0.000000] irq event stamp: 0
> [    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
> [    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
> [    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
> [    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
> [    0.000000] ---[ end trace 0000000000000000 ]---
> 
> m68k:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xc4/0x12c
> static_key_enable(): static key '0x6e5860' used before call to jump_label_init()
> Modules linked in:
> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-mac-00134-g679d51771510 #1
> Stack from 0065df00:
>          0065df00 005ff98d 005ff98d 00000000 00000009 00000009 004aa710 005ff98d
>          0049f87a 005c9849 00000142 0063f5ec 004cbd3e 0049f8f8 005c9849 00000142
>          0075ac3e 00000009 00000000 0065df60 00000000 00000040 00000000 00000000
>          005c980c 0065df7c 0075ac3e 005c9849 00000142 00000009 005c980c 004c9f98
>          006e5860 00000000 00782b50 00000000 00000000 0075b7ba 0063f5ec 00000001
>          004cbd3e 0075a62e 00782b50 0075a79e 00782b50 00782b50 0049feb6 00749d4c
> Call Trace: [<004aa710>] dump_stack+0xc/0x10
>   [<0049f87a>] __warn+0x7e/0xb4
>   [<0049f8f8>] warn_slowpath_fmt+0x48/0x66
>   [<0075ac3e>] security_add_hooks+0xc4/0x12c
>   [<0075ac3e>] security_add_hooks+0xc4/0x12c
>   [<0075b7ba>] lockdown_lsm_init+0x16/0x1e
>   [<0075a62e>] initialize_lsm+0x32/0x5c
>   [<0075a79e>] early_security_init+0x30/0x38
>   [<0049feb6>] _printk+0x0/0x18
>   [<00749d4c>] start_kernel+0x60/0x600
>   [<00748414>] _sinittext+0x414/0xae0
> ---[ end trace 0000000000000000 ]---
> 
> Microblaze:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0x124/0x21c
> static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
> Modules linked in:
> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> Kernel Stack:
> (ptrval): c0999390 c0f4c9ec 00000000 00000000 ffffffff a589f3a9 c0984c20 00000000
> (ptrval): c0c51ef8 00000009 c0984c30 00000000 00000000 c0c51ef8 00000000 c0c51ef8
> (ptrval): 00000009 c0984cf8 c09bad94 00000000 00000000 c0a30c10 00000142 c0d19e10
> (ptrval): c0a30bd0 c0a30c10 00000000 c0d19e10 c09bade4 00000142 00000009 c0a30bd0
> (ptrval): c0a30ca0 c0f58820 c0a30bd0 c0c51f28 00000142 00000009 c0d19e10 c0a37340
> (ptrval): c0c190c0 c0d1b1d0 00000000 00000000 00000000 c0a30bd0 c0a30ca0 c0f58820
> (ptrval): c0d42b20 c0d35464 c0d42b38 00000000 00000000 00000000 00000000 00000000
> (ptrval): 00100000 00000280 c0d196e8 c0d04ed0 00000000 c098465c 00000000 00000000
> (ptrval): c0d19778 c0d19784 00000000 00000000 c0d0488c c09b8e40 c09b9b24 c0d42b20
> (ptrval): c0d42b38 c0d00898 4883e4b3 00000000 c0d0088c 00000280 00000000 00000000
> (ptrval): 00000000 00000000 00000000 c0984194 c09b7208 c0b125f8 c0f5d59c 00000000
> (ptrval): 00000002 00000000 c00002e0 91a86e08 c0d33f7c 00000000 00000000 00000000
> (ptrval): 00000000 00000000 00000000 00000000
> Call Trace:
> [<c0003168>] microblaze_unwind+0x64/0x80
> [<c0984548>] show_stack+0x128/0x180
> [<c0999330>] dump_stack_lvl+0x44/0x94
> [<c099938c>] dump_stack+0xc/0x24
> [<c0984c2c>] __warn+0xac/0xfc
> [<c0984cf4>] warn_slowpath_fmt+0x78/0x98
> [<c0d19e0c>] security_add_hooks+0x120/0x21c
> [<c0d1b1cc>] lockdown_lsm_init+0x18/0x34
> [<c0d196e4>] initialize_lsm+0x44/0x94
> [<c0d19780>] early_security_init+0x4c/0x74
> [<c0d00894>] start_kernel+0x90/0x8ac
> [<c0984190>] machine_shutdown+0x1c/0x20
> no locks held by swapper/0.
> ---[ end trace 0000000000000000 ]---
> 
> mips:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xf8/0x1bc
> static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
> Modules linked in:
> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> Hardware name: mti,malta
> Stack : 00000000 811eedd8 00000000 00000000 00000000 00000000 00000000 00000000
>          00000000 00000000 00000000 00000000 00000000 00000001 81257cd8 00000000
>          81257d70 00000000 00000000 00000000 00000038 80e549c4 00000000 ffffffff
>          00000000 00000000 00000000 00040000 00000000 00000000 81174584 81280000
>          00000000 00000142 00000000 00000000 00000000 00000000 0a0a0b0b bbe00cfc
>          ...
> Call Trace:
> [<8010a0a8>] show_stack+0x60/0x154
> [<80e731d8>] dump_stack_lvl+0xbc/0x138
> [<8012f908>] __warn+0x9c/0x1f8
> [<8012fc20>] warn_slowpath_fmt+0x1bc/0x1cc
> [<8138a184>] security_add_hooks+0xf8/0x1bc
> [<8138a5fc>] lockdown_lsm_init+0x20/0x30
> [<813899e8>] initialize_lsm+0x44/0x80
> [<81389be0>] early_security_init+0x50/0x6c
> [<8136c82c>] start_kernel+0xa8/0x7dc
> irq event stamp: 0
> hardirqs last  enabled at (0): [<00000000>] 0x0
> hardirqs last disabled at (0): [<00000000>] 0x0
> softirqs last  enabled at (0): [<00000000>] 0x0
> softirqs last disabled at (0): [<00000000>] 0x0
> ---[ end trace 0000000000000000 ]---
> 
> Loongarch (crash):
> 
> [    0.000000] ------------[ cut here ]------------
> [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x10' used before call to jump_label_init()
> [    0.000000] ------------[ cut here ]------------
> [    0.000000] DEBUG_LOCKS_WARN_ON(early_boot_irqs_disabled)
> [    0.000000] Caught reserved exception 12 on pid:0 [swapper] - should not happen
> [    0.000000] do_reserved exception[#1]:
> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2+ #1
> [    0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> [    0.000000] pc 9000000004cf9334 ra 9000000004cf9334 tp 9000000006cc8000 sp 9000000006ccbc10
> [    0.000000] a0 000000000000002d a1 9000000006df7830 a2 0000000000000000 a3 9000000006ccba28
> [    0.000000] a4 0000000000000001 a5 0000000000000000 a6 9000000006175570 a7 0000000000000005
> [    0.000000] t0 0000000000000000 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001
> [    0.000000] t4 0000000000000004 t5 0000000000000094 t6 0000000000000023 t7 0000000000000030
> [    0.000000] t8 ffffffff8dcb3998 u0 9000000006a45388 s9 000000000f5ea330 s0 9000000006230788
> [    0.000000] s1 9000000006265c70 s2 0000000000000001 s3 0000000000000001 s4 9000000006cfaa80
> [    0.000000] s5 000000000f75dad8 s6 000000000a5b0000 s7 000000000f75db30 s8 000000000eee5b18
> [    0.000000]    ra: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
> [    0.000000]   ERA: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
> [    0.000000]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> [    0.000000]  PRMD: 00000000 (PPLV0 -PIE -PWE)
> [    0.000000]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
> [    0.000000]  ECFG: 00070800 (LIE=11 VS=7)
> [    0.000000] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
> [    0.000000]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
> [    0.000000] Modules linked in:
> [    0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____))
> [    0.000000] Stack : 0000000000000001 9000000006265c70 9000000006169c58 9000000004dd9ba8
> [    0.000000]         9000000006ccbc70 0000000000000000 9000000006ccbc70 9000000006169c58
> [    0.000000]         00000000000000b0 90000000074f08b8 9000000008616478 9000000007ad1924
> [    0.000000]         0000000000000000 9000000004e95fa8 9000000006cc8000 9000000006ccbdb0
> [    0.000000]         000000000000007e 9000000006df7830 0000000000000000 9000000006ccbbc8
> [    0.000000]         0000000000000001 0000000000000001 90000000073f6e58 9000000006175570
> [    0.000000]         0000000000000000 0000000000000000 0000000000000001 0000000000000001
> [    0.000000]         0000000000000000 0000000000000092 0000000000000001 0000000000006000
> [    0.000000]         ffffffff8dcb3998 9000000006a6bed8 000000000f5ea330 9000000008616478
> [    0.000000]         90000000074f08b8 0000000000000001 0000000000000001 9000000006cfaa80
> [    0.000000]         ...
> [    0.000000] Call Trace:
> [    0.000000] [<9000000004cf9334>] lockdep_hardirqs_on_prepare+0x200/0x208
> [    0.000000] [<9000000004dd9ba4>] trace_hardirqs_on+0x54/0x70
> [    0.000000] [<9000000006169c54>] do_reserved+0x1c/0xcc
> [    0.000000] [<9000000004c52560>] handle_bp+0x120/0x1c0
> [    0.000000] [<9000000004e95fa8>] static_key_enable_cpuslocked+0xdc/0xec
> [    0.000000] [<9000000004e960b8>] static_key_enable+0x18/0x2c
> [    0.000000] [<90000000061a9154>] security_add_hooks+0xbc/0x12c
> [    0.000000] [<90000000061aa880>] lockdown_lsm_init+0x20/0x34
> [    0.000000] [<90000000061a8a80>] initialize_lsm+0x3c/0x6c
> [    0.000000] [<90000000061a8c34>] early_security_init+0x44/0x68
> [    0.000000] [<9000000006180830>] start_kernel+0xa0/0x84c
> [    0.000000] [<900000000616d0f0>] kernel_entry+0xf0/0xf8
> 
>

KP Singh Aug. 8, 2024, 9:57 a.m. UTC | #17

On Thu, Aug 8, 2024 at 6:07 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 8/7/24 19:13, Guenter Roeck wrote:
> ...
> >
> > I'll need to establish a baseline first to determine if the failures
> > are caused by newly enabled configuration options or by this patch set.
> > Below are just early test results.
> >
> > [ Though if those are all upstream there seems to be be something seriously
> >    wrong with the lockdown lsm.
> > ]
> >
>
> Verdict is that all the messages below are from this patch set.
>
> On top of the reports below, alpha images fail completely, and the
> backtraces are seen with several architectures. Please see the
> "testing" column at https://kerneltests.org/builders for details.
>
> The only unrelated problems are the apparmor unit test failures;
> those apparently fail on all big endian systems.
>
> Guenter
>
> > Guenter
> >
> > ----
> > arm:
> >
> > [    0.000000] ------------[ cut here ]------------
> > [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199 static_key_enable_cpuslocked+0xb0/0xfc
> > [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x8' used before call to jump_label_init()
> > [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> > [    0.000000] Hardware name: Generic DT based system
> > [    0.000000] Call trace:
> > [    0.000000]  unwind_backtrace from show_stack+0x18/0x1c
> > [    0.000000]  show_stack from dump_stack_lvl+0x48/0x74
> > [    0.000000]  dump_stack_lvl from __warn+0x7c/0x134
> > [    0.000000]  __warn from warn_slowpath_fmt+0x9c/0xdc
> > [    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb0/0xfc
> > [    0.000000]  static_key_enable_cpuslocked from security_add_hooks+0xa0/0x104
> > [    0.000000]  security_add_hooks from lockdown_lsm_init+0x1c/0x2c
> > [    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x84
> > [    0.000000]  initialize_lsm from early_security_init+0x3c/0x58
> > [    0.000000]  early_security_init from start_kernel+0x78/0x748
> > [    0.000000]  start_kernel from 0x0
> > [    0.000000] irq event stamp: 0
> > [    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
> > [    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
> > [    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
> > [    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
> > [    0.000000] ---[ end trace 0000000000000000 ]---
> >

This seems very odd for especially ARM as I don't see this error when
I do it on the next branch. Possibly something in setup_arch is
initializing jump_tables indirectly between v6.11-rc2 and linux-next
and/or this is a warning that does not immediately splash up on the
dmesg.

Both ARM64 and x86 (the architectures I really have access to)
initializes jump_tables and x86 is the only architecture that does an
explicit static_call_init is x86 and it's already in the setup_arch
code.

https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/arm64/kernel/setup.c#L295
https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/x86/kernel/setup.c#L783

Guenter, I have updated my tree, could you give it another run please?

Thanks!
- KP

> > m68k:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xc4/0x12c
> > static_key_enable(): static key '0x6e5860' used before call to jump_label_init()
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-mac-00134-g679d51771510 #1
> > Stack from 0065df00:
> >          0065df00 005ff98d 005ff98d 00000000 00000009 00000009 004aa710 005ff98d
> >          0049f87a 005c9849 00000142 0063f5ec 004cbd3e 0049f8f8 005c9849 00000142
> >          0075ac3e 00000009 00000000 0065df60 00000000 00000040 00000000 00000000
> >          005c980c 0065df7c 0075ac3e 005c9849 00000142 00000009 005c980c 004c9f98
> >          006e5860 00000000 00782b50 00000000 00000000 0075b7ba 0063f5ec 00000001
> >          004cbd3e 0075a62e 00782b50 0075a79e 00782b50 00782b50 0049feb6 00749d4c
> > Call Trace: [<004aa710>] dump_stack+0xc/0x10
> >   [<0049f87a>] __warn+0x7e/0xb4
> >   [<0049f8f8>] warn_slowpath_fmt+0x48/0x66
> >   [<0075ac3e>] security_add_hooks+0xc4/0x12c
> >   [<0075ac3e>] security_add_hooks+0xc4/0x12c
> >   [<0075b7ba>] lockdown_lsm_init+0x16/0x1e
> >   [<0075a62e>] initialize_lsm+0x32/0x5c
> >   [<0075a79e>] early_security_init+0x30/0x38
> >   [<0049feb6>] _printk+0x0/0x18
> >   [<00749d4c>] start_kernel+0x60/0x600
> >   [<00748414>] _sinittext+0x414/0xae0
> > ---[ end trace 0000000000000000 ]---
> >
> > Microblaze:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0x124/0x21c
> > static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> > Kernel Stack:
> > (ptrval): c0999390 c0f4c9ec 00000000 00000000 ffffffff a589f3a9 c0984c20 00000000
> > (ptrval): c0c51ef8 00000009 c0984c30 00000000 00000000 c0c51ef8 00000000 c0c51ef8
> > (ptrval): 00000009 c0984cf8 c09bad94 00000000 00000000 c0a30c10 00000142 c0d19e10
> > (ptrval): c0a30bd0 c0a30c10 00000000 c0d19e10 c09bade4 00000142 00000009 c0a30bd0
> > (ptrval): c0a30ca0 c0f58820 c0a30bd0 c0c51f28 00000142 00000009 c0d19e10 c0a37340
> > (ptrval): c0c190c0 c0d1b1d0 00000000 00000000 00000000 c0a30bd0 c0a30ca0 c0f58820
> > (ptrval): c0d42b20 c0d35464 c0d42b38 00000000 00000000 00000000 00000000 00000000
> > (ptrval): 00100000 00000280 c0d196e8 c0d04ed0 00000000 c098465c 00000000 00000000
> > (ptrval): c0d19778 c0d19784 00000000 00000000 c0d0488c c09b8e40 c09b9b24 c0d42b20
> > (ptrval): c0d42b38 c0d00898 4883e4b3 00000000 c0d0088c 00000280 00000000 00000000
> > (ptrval): 00000000 00000000 00000000 c0984194 c09b7208 c0b125f8 c0f5d59c 00000000
> > (ptrval): 00000002 00000000 c00002e0 91a86e08 c0d33f7c 00000000 00000000 00000000
> > (ptrval): 00000000 00000000 00000000 00000000
> > Call Trace:
> > [<c0003168>] microblaze_unwind+0x64/0x80
> > [<c0984548>] show_stack+0x128/0x180
> > [<c0999330>] dump_stack_lvl+0x44/0x94
> > [<c099938c>] dump_stack+0xc/0x24
> > [<c0984c2c>] __warn+0xac/0xfc
> > [<c0984cf4>] warn_slowpath_fmt+0x78/0x98
> > [<c0d19e0c>] security_add_hooks+0x120/0x21c
> > [<c0d1b1cc>] lockdown_lsm_init+0x18/0x34
> > [<c0d196e4>] initialize_lsm+0x44/0x94
> > [<c0d19780>] early_security_init+0x4c/0x74
> > [<c0d00894>] start_kernel+0x90/0x8ac
> > [<c0984190>] machine_shutdown+0x1c/0x20
> > no locks held by swapper/0.
> > ---[ end trace 0000000000000000 ]---
> >
> > mips:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 0 at include/linux/jump_label.h:322 security_add_hooks+0xf8/0x1bc
> > static_key_enable(): static key 'security_hook_active_locked_down_0+0x0/0x4' used before call to jump_label_init()
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
> > Hardware name: mti,malta
> > Stack : 00000000 811eedd8 00000000 00000000 00000000 00000000 00000000 00000000
> >          00000000 00000000 00000000 00000000 00000000 00000001 81257cd8 00000000
> >          81257d70 00000000 00000000 00000000 00000038 80e549c4 00000000 ffffffff
> >          00000000 00000000 00000000 00040000 00000000 00000000 81174584 81280000
> >          00000000 00000142 00000000 00000000 00000000 00000000 0a0a0b0b bbe00cfc
> >          ...
> > Call Trace:
> > [<8010a0a8>] show_stack+0x60/0x154
> > [<80e731d8>] dump_stack_lvl+0xbc/0x138
> > [<8012f908>] __warn+0x9c/0x1f8
> > [<8012fc20>] warn_slowpath_fmt+0x1bc/0x1cc
> > [<8138a184>] security_add_hooks+0xf8/0x1bc
> > [<8138a5fc>] lockdown_lsm_init+0x20/0x30
> > [<813899e8>] initialize_lsm+0x44/0x80
> > [<81389be0>] early_security_init+0x50/0x6c
> > [<8136c82c>] start_kernel+0xa8/0x7dc
> > irq event stamp: 0
> > hardirqs last  enabled at (0): [<00000000>] 0x0
> > hardirqs last disabled at (0): [<00000000>] 0x0
> > softirqs last  enabled at (0): [<00000000>] 0x0
> > softirqs last disabled at (0): [<00000000>] 0x0
> > ---[ end trace 0000000000000000 ]---
> >
> > Loongarch (crash):
> >
> > [    0.000000] ------------[ cut here ]------------
> > [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x10' used before call to jump_label_init()
> > [    0.000000] ------------[ cut here ]------------
> > [    0.000000] DEBUG_LOCKS_WARN_ON(early_boot_irqs_disabled)
> > [    0.000000] Caught reserved exception 12 on pid:0 [swapper] - should not happen
> > [    0.000000] do_reserved exception[#1]:
> > [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2+ #1
> > [    0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > [    0.000000] pc 9000000004cf9334 ra 9000000004cf9334 tp 9000000006cc8000 sp 9000000006ccbc10
> > [    0.000000] a0 000000000000002d a1 9000000006df7830 a2 0000000000000000 a3 9000000006ccba28
> > [    0.000000] a4 0000000000000001 a5 0000000000000000 a6 9000000006175570 a7 0000000000000005
> > [    0.000000] t0 0000000000000000 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001
> > [    0.000000] t4 0000000000000004 t5 0000000000000094 t6 0000000000000023 t7 0000000000000030
> > [    0.000000] t8 ffffffff8dcb3998 u0 9000000006a45388 s9 000000000f5ea330 s0 9000000006230788
> > [    0.000000] s1 9000000006265c70 s2 0000000000000001 s3 0000000000000001 s4 9000000006cfaa80
> > [    0.000000] s5 000000000f75dad8 s6 000000000a5b0000 s7 000000000f75db30 s8 000000000eee5b18
> > [    0.000000]    ra: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
> > [    0.000000]   ERA: 9000000004cf9334 lockdep_hardirqs_on_prepare+0x200/0x208
> > [    0.000000]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > [    0.000000]  PRMD: 00000000 (PPLV0 -PIE -PWE)
> > [    0.000000]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
> > [    0.000000]  ECFG: 00070800 (LIE=11 VS=7)
> > [    0.000000] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
> > [    0.000000]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
> > [    0.000000] Modules linked in:
> > [    0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____))
> > [    0.000000] Stack : 0000000000000001 9000000006265c70 9000000006169c58 9000000004dd9ba8
> > [    0.000000]         9000000006ccbc70 0000000000000000 9000000006ccbc70 9000000006169c58
> > [    0.000000]         00000000000000b0 90000000074f08b8 9000000008616478 9000000007ad1924
> > [    0.000000]         0000000000000000 9000000004e95fa8 9000000006cc8000 9000000006ccbdb0
> > [    0.000000]         000000000000007e 9000000006df7830 0000000000000000 9000000006ccbbc8
> > [    0.000000]         0000000000000001 0000000000000001 90000000073f6e58 9000000006175570
> > [    0.000000]         0000000000000000 0000000000000000 0000000000000001 0000000000000001
> > [    0.000000]         0000000000000000 0000000000000092 0000000000000001 0000000000006000
> > [    0.000000]         ffffffff8dcb3998 9000000006a6bed8 000000000f5ea330 9000000008616478
> > [    0.000000]         90000000074f08b8 0000000000000001 0000000000000001 9000000006cfaa80
> > [    0.000000]         ...
> > [    0.000000] Call Trace:
> > [    0.000000] [<9000000004cf9334>] lockdep_hardirqs_on_prepare+0x200/0x208
> > [    0.000000] [<9000000004dd9ba4>] trace_hardirqs_on+0x54/0x70
> > [    0.000000] [<9000000006169c54>] do_reserved+0x1c/0xcc
> > [    0.000000] [<9000000004c52560>] handle_bp+0x120/0x1c0
> > [    0.000000] [<9000000004e95fa8>] static_key_enable_cpuslocked+0xdc/0xec
> > [    0.000000] [<9000000004e960b8>] static_key_enable+0x18/0x2c
> > [    0.000000] [<90000000061a9154>] security_add_hooks+0xbc/0x12c
> > [    0.000000] [<90000000061aa880>] lockdown_lsm_init+0x20/0x34
> > [    0.000000] [<90000000061a8a80>] initialize_lsm+0x3c/0x6c
> > [    0.000000] [<90000000061a8c34>] early_security_init+0x44/0x68
> > [    0.000000] [<9000000006180830>] start_kernel+0xa0/0x84c
> > [    0.000000] [<900000000616d0f0>] kernel_entry+0xf0/0xf8
> >
> >
>

Guenter Roeck Aug. 8, 2024, 3:20 p.m. UTC | #18

On 8/8/24 02:57, KP Singh wrote:
> On Thu, Aug 8, 2024 at 6:07 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 8/7/24 19:13, Guenter Roeck wrote:
>> ...
>>>
>>> I'll need to establish a baseline first to determine if the failures
>>> are caused by newly enabled configuration options or by this patch set.
>>> Below are just early test results.
>>>
>>> [ Though if those are all upstream there seems to be be something seriously
>>>     wrong with the lockdown lsm.
>>> ]
>>>
>>
>> Verdict is that all the messages below are from this patch set.
>>
>> On top of the reports below, alpha images fail completely, and the
>> backtraces are seen with several architectures. Please see the
>> "testing" column at https://kerneltests.org/builders for details.
>>
>> The only unrelated problems are the apparmor unit test failures;
>> those apparently fail on all big endian systems.
>>
>> Guenter
>>
>>> Guenter
>>>
>>> ----
>>> arm:
>>>
>>> [    0.000000] ------------[ cut here ]------------
>>> [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199 static_key_enable_cpuslocked+0xb0/0xfc
>>> [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x8' used before call to jump_label_init()
>>> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
>>> [    0.000000] Hardware name: Generic DT based system
>>> [    0.000000] Call trace:
>>> [    0.000000]  unwind_backtrace from show_stack+0x18/0x1c
>>> [    0.000000]  show_stack from dump_stack_lvl+0x48/0x74
>>> [    0.000000]  dump_stack_lvl from __warn+0x7c/0x134
>>> [    0.000000]  __warn from warn_slowpath_fmt+0x9c/0xdc
>>> [    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb0/0xfc
>>> [    0.000000]  static_key_enable_cpuslocked from security_add_hooks+0xa0/0x104
>>> [    0.000000]  security_add_hooks from lockdown_lsm_init+0x1c/0x2c
>>> [    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x84
>>> [    0.000000]  initialize_lsm from early_security_init+0x3c/0x58
>>> [    0.000000]  early_security_init from start_kernel+0x78/0x748
>>> [    0.000000]  start_kernel from 0x0
>>> [    0.000000] irq event stamp: 0
>>> [    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
>>> [    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
>>> [    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
>>> [    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
>>> [    0.000000] ---[ end trace 0000000000000000 ]---
>>>
> 
> This seems very odd for especially ARM as I don't see this error when
> I do it on the next branch. Possibly something in setup_arch is
> initializing jump_tables indirectly between v6.11-rc2 and linux-next
> and/or this is a warning that does not immediately splash up on the
> dmesg.
> 

I suspect it is more likely because I have lots of debug options enabled in my tests.

> Both ARM64 and x86 (the architectures I really have access to)
> initializes jump_tables and x86 is the only architecture that does an
> explicit static_call_init is x86 and it's already in the setup_arch
> code.
> 
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/arm64/kernel/setup.c#L295
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/x86/kernel/setup.c#L783
> 
> Guenter, I have updated my tree, could you give it another run please?
> 

Sure, underway.

Guenter

Guenter Roeck Aug. 8, 2024, 4:43 p.m. UTC | #19

On 8/8/24 02:57, KP Singh wrote:
> On Thu, Aug 8, 2024 at 6:07 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 8/7/24 19:13, Guenter Roeck wrote:
>> ...
>>>
>>> I'll need to establish a baseline first to determine if the failures
>>> are caused by newly enabled configuration options or by this patch set.
>>> Below are just early test results.
>>>
>>> [ Though if those are all upstream there seems to be be something seriously
>>>     wrong with the lockdown lsm.
>>> ]
>>>
>>
>> Verdict is that all the messages below are from this patch set.
>>
>> On top of the reports below, alpha images fail completely, and the
>> backtraces are seen with several architectures. Please see the
>> "testing" column at https://kerneltests.org/builders for details.
>>
>> The only unrelated problems are the apparmor unit test failures;
>> those apparently fail on all big endian systems.
>>
>> Guenter
>>
>>> Guenter
>>>
>>> ----
>>> arm:
>>>
>>> [    0.000000] ------------[ cut here ]------------
>>> [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:199 static_key_enable_cpuslocked+0xb0/0xfc
>>> [    0.000000] static_key_enable_cpuslocked(): static key 'security_hook_active_locked_down_0+0x0/0x8' used before call to jump_label_init()
>>> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2-00134-g679d51771510 #1
>>> [    0.000000] Hardware name: Generic DT based system
>>> [    0.000000] Call trace:
>>> [    0.000000]  unwind_backtrace from show_stack+0x18/0x1c
>>> [    0.000000]  show_stack from dump_stack_lvl+0x48/0x74
>>> [    0.000000]  dump_stack_lvl from __warn+0x7c/0x134
>>> [    0.000000]  __warn from warn_slowpath_fmt+0x9c/0xdc
>>> [    0.000000]  warn_slowpath_fmt from static_key_enable_cpuslocked+0xb0/0xfc
>>> [    0.000000]  static_key_enable_cpuslocked from security_add_hooks+0xa0/0x104
>>> [    0.000000]  security_add_hooks from lockdown_lsm_init+0x1c/0x2c
>>> [    0.000000]  lockdown_lsm_init from initialize_lsm+0x44/0x84
>>> [    0.000000]  initialize_lsm from early_security_init+0x3c/0x58
>>> [    0.000000]  early_security_init from start_kernel+0x78/0x748
>>> [    0.000000]  start_kernel from 0x0
>>> [    0.000000] irq event stamp: 0
>>> [    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
>>> [    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
>>> [    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
>>> [    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
>>> [    0.000000] ---[ end trace 0000000000000000 ]---
>>>
> 
> This seems very odd for especially ARM as I don't see this error when
> I do it on the next branch. Possibly something in setup_arch is
> initializing jump_tables indirectly between v6.11-rc2 and linux-next
> and/or this is a warning that does not immediately splash up on the
> dmesg.
> 
> Both ARM64 and x86 (the architectures I really have access to)
> initializes jump_tables and x86 is the only architecture that does an
> explicit static_call_init is x86 and it's already in the setup_arch
> code.
> 
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/arm64/kernel/setup.c#L295
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/x86/kernel/setup.c#L783
> 
> Guenter, I have updated my tree, could you give it another run please?
> 

This version is much better, except for alpha which still crashes hard
with no log output. It bisects to one of your patches (results below).

Also, there is a backtrace on ppc (also see below), but that is unrelated
to your patches and only seen now because I enabled the security modules
on that architecture. I'll bring that up with ppc maintainers.

Thanks,
Guenter

---
bisect:

# bad: [b92c86ad4f4311706fe436a1545d9a97e6aebcf8] lsm: replace indirect LSM hook calls with static calls
# good: [de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed] Linux 6.11-rc2
git bisect start 'HEAD' 'v6.11-rc2'
# good: [bd2c890317b2d60b4afd89a374a56a7c9a0275bd] kernel: Add helper macros for loop unrolling
git bisect good bd2c890317b2d60b4afd89a374a56a7c9a0275bd
# good: [6a1e94163fc53a4f1b47a8689f416a1a3d0a154a] lsm: count the LSMs enabled at compile time
git bisect good 6a1e94163fc53a4f1b47a8689f416a1a3d0a154a
# first bad commit: [b92c86ad4f4311706fe436a1545d9a97e6aebcf8] lsm: replace indirect LSM hook calls with static calls

---
ppc backtrace:

LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,safesetid
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/smp.c:779 smp_call_function_many_cond+0x518/0x9d4
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc2-00127-g2e3e7093e9c8 #1
Hardware name: MPC8544DS e500v2 0x80210030 MPC8544 DS
NIP:  c0172ca8 LR: c01731b0 CTR: 00000000
REGS: c2669d60 TRAP: 0700   Not tainted  (6.11.0-rc2-00127-g2e3e7093e9c8)
MSR:  00021000 <CE,ME>  CR: 24004288  XER: 20000000
GPR00: c002255c c2669e50 c253b5c0 c267b484 00000000 00000000 00000001 c2680000
GPR08: 00000000 00000003 c2680000 00000000 44004288 020a1e18 00000000 00000000
GPR16: 00000000 00000000 00000001 00000000 c0000000 c01731b0 00000000 c267b484
GPR24: c00224fc c0773760 c0770b50 00000000 00000000 00029000 00000000 00000000
NIP [c0172ca8] smp_call_function_many_cond+0x518/0x9d4
LR [c01731b0] smp_call_function+0x3c/0x58
Call Trace:
[c2669eb0] [84000282] 0x84000282
[c2669ec0] [c002255c] flush_tlb_kernel_range+0x2c/0x50
[c2669ed0] [c0023b8c] patch_instruction+0x108/0x1b0
[c2669ef0] [c00188a4] arch_static_call_transform+0x104/0x148
[c2669f10] [c2033ebc] security_add_hooks+0x138/0x24c
[c2669f40] [c2032e24] capability_init+0x24/0x38
[c2669f50] [c203322c] initialize_lsm+0x48/0x90
[c2669f70] [c2033b68] security_init+0x31c/0x538
[c2669fa0] [c2001154] start_kernel+0x5d4/0x81c
[c2669ff0] [c0000478] set_ivor+0x150/0x18c
Code: 91220000 81620004 3d20c209 3929e478 556b103a 7c84582e 7c89202e 81220000 2c040000 3929ffff 91220000 40a2fbb8 <0fe00000> 4bfffbb0 80e20000 2c070000
irq event stamp: 1204
hardirqs last  enabled at (1203): [<c11d85f8>] _raw_spin_unlock_irqrestore+0x70/0xa8
hardirqs last disabled at (1204): [<c0023bcc>] patch_instruction+0x148/0x1b0
softirqs last  enabled at (50): [<c0064b4c>] handle_softirqs+0x348/0x508
softirqs last disabled at (43): [<c0006fd0>] do_softirq_own_stack+0x34/0x4c
---[ end trace 0000000000000000 ]---
landlock: Up and running.
Yama: becoming mindful.
LoadPin: ready to pin (currently not enforcing)

Paul Moore Aug. 8, 2024, 5:19 p.m. UTC | #20

On Thu, Aug 8, 2024 at 5:57 AM KP Singh <kpsingh@kernel.org> wrote:
>
> This seems very odd for especially ARM as I don't see this error when
> I do it on the next branch. Possibly something in setup_arch is
> initializing jump_tables indirectly between v6.11-rc2 and linux-next
> and/or this is a warning that does not immediately splash up on the
> dmesg.
>
> Both ARM64 and x86 (the architectures I really have access to)
> initializes jump_tables and x86 is the only architecture that does an
> explicit static_call_init is x86 and it's already in the setup_arch
> code.
>
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/arm64/kernel/setup.c#L295
> https://elixir.bootlin.com/linux/v6.11-rc2/source/arch/x86/kernel/setup.c#L783
>
> Guenter, I have updated my tree, could you give it another run please?

KP, once we've got the ordering/fix sorted out, can you make sure the
reasoning is well documented in the commit description?  From what I'm
reading here it doesn't seem like it's something obvious, so providing
a detailed description will help us several years down the road when
someone changes something and it breaks again :)

Paul Moore Aug. 8, 2024, 5:32 p.m. UTC | #21

On Thu, Aug 8, 2024 at 12:43 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Also, there is a backtrace on ppc (also see below), but that is unrelated
> to your patches and only seen now because I enabled the security modules
> on that architecture. I'll bring that up with ppc maintainers.

Thanks for all your help testing this Guenter.  I see you've also
already submitted an AppArmor fix for the endian issue, that's very
helpful and I'm sure John will be happy to see it.

Beyond this work testing the static call patches from KP, would you be
willing to add a LSM configuration to your normal testing?  While most
of the LSM subsystem should be architecture agnostic, there are
definitely bits and pieces that can vary (as you've seen), and I think
it would be great to get more testing across a broad range of
supported arches, even if it is just some simple "does it boot" tests.

Out of curiosity, do you have your test setup documented anywhere?  It
sounds fairly impressive and I'd be curious to learn more about it.

Guenter Roeck Aug. 8, 2024, 6 p.m. UTC | #22

On Thu, Aug 08, 2024 at 01:32:37PM -0400, Paul Moore wrote:
> On Thu, Aug 8, 2024 at 12:43 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Also, there is a backtrace on ppc (also see below), but that is unrelated
> > to your patches and only seen now because I enabled the security modules
> > on that architecture. I'll bring that up with ppc maintainers.
> 
> Thanks for all your help testing this Guenter.  I see you've also
> already submitted an AppArmor fix for the endian issue, that's very
> helpful and I'm sure John will be happy to see it.
> 
> Beyond this work testing the static call patches from KP, would you be
> willing to add a LSM configuration to your normal testing?  While most
> of the LSM subsystem should be architecture agnostic, there are
> definitely bits and pieces that can vary (as you've seen), and I think
> it would be great to get more testing across a broad range of
> supported arches, even if it is just some simple "does it boot" tests.
> 

That really depends. I already enabled some of the kernel security modules.

CONFIG_SECURITY=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_APPARMOR_KUNIT_TEST=y
CONFIG_SECURITY_LANDLOCK=y
CONFIG_SECURITY_LANDLOCK_KUNIT_TEST=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
CONFIG_SECURITY_YAMA=y
CONFIG_SECURITY_LOADPIN=y
CONFIG_SECURITY_SAFESETID=y
CONFIG_BPF_LSM=y
CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"

I can easily add more if you tell me what else I should enable.

Userspace is more difficult. My root file systems are generated using
buildroot. I run some basic tests, such as network interface tests
and TPM tests, but those are just simple scripts utilizing packages
provided by buildroot. I can add more, but I would need to know what
exactly to add and how to execute it.

In general my tests are intended to cover a large number of different
configurations; they are intended to be broad, not deep. That means an
individual test should not take longer than a couple of seconds. If you
can provide something that would run in the buildroot environment and
not take long to execute (example: tpm2 selftests), I'd be happy to add
it.

For more comprehensive tests, it might make sense to discuss adding
security tests to KernelCI; they have much more resources available
and target deeper testing. That would make sense if you have, for example,
test suites to run on upstream kernels or stable release candidates. 

> Out of curiosity, do you have your test setup documented anywhere?  It
> sounds fairly impressive and I'd be curious to learn more about it.
> 

Not really. The code is at https://github.com/groeck/linux-build-test.
My clone of buildroot is at https://github.com/groeck/buildroot (look
for local- branches to see my changes). Please feel free to have a look,
but documentation is seriously lacking (and README is completely out
of date).

Guenter

Paul Moore Aug. 8, 2024, 8:49 p.m. UTC | #23

On Thu, Aug 8, 2024 at 2:00 PM Guenter Roeck <linux@roeck-us.net> wrote:
> On Thu, Aug 08, 2024 at 01:32:37PM -0400, Paul Moore wrote:
> > On Thu, Aug 8, 2024 at 12:43 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > Also, there is a backtrace on ppc (also see below), but that is unrelated
> > > to your patches and only seen now because I enabled the security modules
> > > on that architecture. I'll bring that up with ppc maintainers.
> >
> > Thanks for all your help testing this Guenter.  I see you've also
> > already submitted an AppArmor fix for the endian issue, that's very
> > helpful and I'm sure John will be happy to see it.
> >
> > Beyond this work testing the static call patches from KP, would you be
> > willing to add a LSM configuration to your normal testing?  While most
> > of the LSM subsystem should be architecture agnostic, there are
> > definitely bits and pieces that can vary (as you've seen), and I think
> > it would be great to get more testing across a broad range of
> > supported arches, even if it is just some simple "does it boot" tests.
> >
>
> That really depends. I already enabled some of the kernel security modules.
>
> CONFIG_SECURITY=y
> CONFIG_SECURITY_APPARMOR=y
> CONFIG_SECURITY_APPARMOR_KUNIT_TEST=y
> CONFIG_SECURITY_LANDLOCK=y
> CONFIG_SECURITY_LANDLOCK_KUNIT_TEST=y
> CONFIG_SECURITY_LOCKDOWN_LSM=y
> CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
> CONFIG_SECURITY_YAMA=y
> CONFIG_SECURITY_LOADPIN=y
> CONFIG_SECURITY_SAFESETID=y
> CONFIG_BPF_LSM=y
> CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"
>
> I can easily add more if you tell me what else I should enable.

Thanks, I just added a todo item to send you a list.  I appreciate the help.

> Userspace is more difficult. My root file systems are generated using
> buildroot. I run some basic tests, such as network interface tests
> and TPM tests, but those are just simple scripts utilizing packages
> provided by buildroot. I can add more, but I would need to know what
> exactly to add and how to execute it.

Of course.  As far as I'm concerned, simply enabling the LSMs and
making sure the various arches boot without additional faults would be
a nice boost in testing.

> > Out of curiosity, do you have your test setup documented anywhere?  It
> > sounds fairly impressive and I'd be curious to learn more about it.
>
> Not really. The code is at https://github.com/groeck/linux-build-test.
> My clone of buildroot is at https://github.com/groeck/buildroot (look
> for local- branches to see my changes). Please feel free to have a look,
> but documentation is seriously lacking (and README is completely out
> of date).

Thanks for the pointers.

KP Singh Aug. 12, 2024, 5:12 p.m. UTC | #24

On Thu, Aug 8, 2024 at 10:49 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Thu, Aug 8, 2024 at 2:00 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > On Thu, Aug 08, 2024 at 01:32:37PM -0400, Paul Moore wrote:
> > > On Thu, Aug 8, 2024 at 12:43 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > Also, there is a backtrace on ppc (also see below), but that is unrelated
> > > > to your patches and only seen now because I enabled the security modules
> > > > on that architecture. I'll bring that up with ppc maintainers.
> > >
> > > Thanks for all your help testing this Guenter.  I see you've also
> > > already submitted an AppArmor fix for the endian issue, that's very
> > > helpful and I'm sure John will be happy to see it.
> > >
> > > Beyond this work testing the static call patches from KP, would you be
> > > willing to add a LSM configuration to your normal testing?  While most
> > > of the LSM subsystem should be architecture agnostic, there are
> > > definitely bits and pieces that can vary (as you've seen), and I think
> > > it would be great to get more testing across a broad range of
> > > supported arches, even if it is just some simple "does it boot" tests.
> > >
> >
> > That really depends. I already enabled some of the kernel security modules.
> >
> > CONFIG_SECURITY=y
> > CONFIG_SECURITY_APPARMOR=y
> > CONFIG_SECURITY_APPARMOR_KUNIT_TEST=y
> > CONFIG_SECURITY_LANDLOCK=y
> > CONFIG_SECURITY_LANDLOCK_KUNIT_TEST=y
> > CONFIG_SECURITY_LOCKDOWN_LSM=y
> > CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
> > CONFIG_SECURITY_YAMA=y
> > CONFIG_SECURITY_LOADPIN=y
> > CONFIG_SECURITY_SAFESETID=y
> > CONFIG_BPF_LSM=y
> > CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,bpf"
> >
> > I can easily add more if you tell me what else I should enable.
>
> Thanks, I just added a todo item to send you a list.  I appreciate the help.
>
> > Userspace is more difficult. My root file systems are generated using
> > buildroot. I run some basic tests, such as network interface tests
> > and TPM tests, but those are just simple scripts utilizing packages
> > provided by buildroot. I can add more, but I would need to know what
> > exactly to add and how to execute it.
>
> Of course.  As far as I'm concerned, simply enabling the LSMs and
> making sure the various arches boot without additional faults would be
> a nice boost in testing.
>
> > > Out of curiosity, do you have your test setup documented anywhere?  It
> > > sounds fairly impressive and I'd be curious to learn more about it.
> >
> > Not really. The code is at https://github.com/groeck/linux-build-test.
> > My clone of buildroot is at https://github.com/groeck/buildroot (look
> > for local- branches to see my changes). Please feel free to have a look,
> > but documentation is seriously lacking (and README is completely out
> > of date).
>

JFYI, I synced with Guenter and all arch seem to pass and alpha does
not work due to a reason that I am unable to debug. I will try doing
more debugging but I will need more alpha help here (Added the
maintainers to this thread).



> Thanks for the pointers.
>
> --
> paul-moore.com

Paul Moore Aug. 12, 2024, 7:33 p.m. UTC | #25

On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
>
> JFYI, I synced with Guenter and all arch seem to pass and alpha does
> not work due to a reason that I am unable to debug. I will try doing
> more debugging but I will need more alpha help here (Added the
> maintainers to this thread).

Thanks for the update; I was hoping that we might have a resolution
for the Alpha failure by now but it doesn't look like we're that
lucky.  Hopefully the Alpha devs will be able to help resolve this
without too much trouble.

Unfortunately, this does mean that I'm going to drop the static call
patches from the lsm/dev branch so that we can continue merging other
things.  Of course this doesn't mean the static call patches can't
come back in later during this dev cycle once everything is solved if
there is still time, and worst case there is always the next dev
cycle.

KP Singh Aug. 12, 2024, 9:14 p.m. UTC | #26

On Mon, Aug 12, 2024 at 9:33 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
> >
> > JFYI, I synced with Guenter and all arch seem to pass and alpha does
> > not work due to a reason that I am unable to debug. I will try doing
> > more debugging but I will need more alpha help here (Added the
> > maintainers to this thread).
>
> Thanks for the update; I was hoping that we might have a resolution
> for the Alpha failure by now but it doesn't look like we're that
> lucky.  Hopefully the Alpha devs will be able to help resolve this
> without too much trouble.
>
> Unfortunately, this does mean that I'm going to drop the static call
> patches from the lsm/dev branch so that we can continue merging other
> things.  Of course this doesn't mean the static call patches can't
> come back in later during this dev cycle once everything is solved if
> there is still time, and worst case there is always the next dev
> cycle.
>

Do we really want to drop them for alpha? I would rather disable
CONFIG_SECURITY for alpha and if people really care for alpha we can
enable it. Alpha folks, what do you think?



> --
> paul-moore.com

Paul Moore Aug. 12, 2024, 9:32 p.m. UTC | #27

On Mon, Aug 12, 2024 at 5:14 PM KP Singh <kpsingh@kernel.org> wrote:
> On Mon, Aug 12, 2024 at 9:33 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
> > >
> > > JFYI, I synced with Guenter and all arch seem to pass and alpha does
> > > not work due to a reason that I am unable to debug. I will try doing
> > > more debugging but I will need more alpha help here (Added the
> > > maintainers to this thread).
> >
> > Thanks for the update; I was hoping that we might have a resolution
> > for the Alpha failure by now but it doesn't look like we're that
> > lucky.  Hopefully the Alpha devs will be able to help resolve this
> > without too much trouble.
> >
> > Unfortunately, this does mean that I'm going to drop the static call
> > patches from the lsm/dev branch so that we can continue merging other
> > things.  Of course this doesn't mean the static call patches can't
> > come back in later during this dev cycle once everything is solved if
> > there is still time, and worst case there is always the next dev
> > cycle.
> >
>
> Do we really want to drop them for alpha? I would rather disable
> CONFIG_SECURITY for alpha and if people really care for alpha we can
> enable it. Alpha folks, what do you think?

Seriously?  I realize Alpha is an older, lesser used arch, but it is
still a supported arch and we are not going to cause a regression for
the sake of a new feature.  As I mentioned earlier, once the problem
is resolved we can bring the patchset back into lsm/dev; if it gets
resolved soon enough we can even do it during this dev cycle.

KP Singh Aug. 12, 2024, 10:02 p.m. UTC | #28

On Mon, Aug 12, 2024 at 11:33 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 12, 2024 at 5:14 PM KP Singh <kpsingh@kernel.org> wrote:
> > On Mon, Aug 12, 2024 at 9:33 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
> > > >
> > > > JFYI, I synced with Guenter and all arch seem to pass and alpha does
> > > > not work due to a reason that I am unable to debug. I will try doing
> > > > more debugging but I will need more alpha help here (Added the
> > > > maintainers to this thread).
> > >
> > > Thanks for the update; I was hoping that we might have a resolution
> > > for the Alpha failure by now but it doesn't look like we're that
> > > lucky.  Hopefully the Alpha devs will be able to help resolve this
> > > without too much trouble.
> > >
> > > Unfortunately, this does mean that I'm going to drop the static call
> > > patches from the lsm/dev branch so that we can continue merging other
> > > things.  Of course this doesn't mean the static call patches can't
> > > come back in later during this dev cycle once everything is solved if
> > > there is still time, and worst case there is always the next dev
> > > cycle.
> > >
> >
> > Do we really want to drop them for alpha? I would rather disable
> > CONFIG_SECURITY for alpha and if people really care for alpha we can
> > enable it. Alpha folks, what do you think?
>
> Seriously?  I realize Alpha is an older, lesser used arch, but it is
> still a supported arch and we are not going to cause a regression for
> the sake of a new feature.  As I mentioned earlier, once the problem
> is resolved we can bring the patchset back into lsm/dev; if it gets
> resolved soon enough we can even do it during this dev cycle.
>

Okay, more data for the alpha folks, when I moved trap_init() before
early_security_init() everything seemed to work, I think we might need
to call trap_init() from setup_arch and this would fix the issue. As
to why? I don't know :)

Would alpha folks be okay with this patch:

kpsingh@kpsingh:~/projects/linux$ git diff
diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index bebdffafaee8..53909c1be4cf 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -657,6 +657,7 @@ setup_arch(char **cmdline_p)
        setup_smp();
 #endif
        paging_init();
+       trap_init();
 }


and provide me some reason as to why this works, it would be great for
a patch description

- KP





> --
> paul-moore.com

Guenter Roeck Aug. 13, 2024, 4:07 a.m. UTC | #29

On 8/12/24 15:02, KP Singh wrote:
> On Mon, Aug 12, 2024 at 11:33 PM Paul Moore <paul@paul-moore.com> wrote:
>>
>> On Mon, Aug 12, 2024 at 5:14 PM KP Singh <kpsingh@kernel.org> wrote:
>>> On Mon, Aug 12, 2024 at 9:33 PM Paul Moore <paul@paul-moore.com> wrote:
>>>> On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
>>>>>
>>>>> JFYI, I synced with Guenter and all arch seem to pass and alpha does
>>>>> not work due to a reason that I am unable to debug. I will try doing
>>>>> more debugging but I will need more alpha help here (Added the
>>>>> maintainers to this thread).
>>>>
>>>> Thanks for the update; I was hoping that we might have a resolution
>>>> for the Alpha failure by now but it doesn't look like we're that
>>>> lucky.  Hopefully the Alpha devs will be able to help resolve this
>>>> without too much trouble.
>>>>
>>>> Unfortunately, this does mean that I'm going to drop the static call
>>>> patches from the lsm/dev branch so that we can continue merging other
>>>> things.  Of course this doesn't mean the static call patches can't
>>>> come back in later during this dev cycle once everything is solved if
>>>> there is still time, and worst case there is always the next dev
>>>> cycle.
>>>>
>>>
>>> Do we really want to drop them for alpha? I would rather disable
>>> CONFIG_SECURITY for alpha and if people really care for alpha we can
>>> enable it. Alpha folks, what do you think?
>>
>> Seriously?  I realize Alpha is an older, lesser used arch, but it is
>> still a supported arch and we are not going to cause a regression for
>> the sake of a new feature.  As I mentioned earlier, once the problem
>> is resolved we can bring the patchset back into lsm/dev; if it gets
>> resolved soon enough we can even do it during this dev cycle.
>>
> 
> Okay, more data for the alpha folks, when I moved trap_init() before
> early_security_init() everything seemed to work, I think we might need
> to call trap_init() from setup_arch and this would fix the issue. As
> to why? I don't know :)
> 
> Would alpha folks be okay with this patch:
> 
> kpsingh@kpsingh:~/projects/linux$ git diff
> diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
> index bebdffafaee8..53909c1be4cf 100644
> --- a/arch/alpha/kernel/setup.c
> +++ b/arch/alpha/kernel/setup.c
> @@ -657,6 +657,7 @@ setup_arch(char **cmdline_p)
>          setup_smp();
>   #endif
>          paging_init();
> +       trap_init();
>   }
> 
> 
> and provide me some reason as to why this works, it would be great for
> a patch description
> 

Your code triggers a trap (do_entUna, unaligned access) which isn't handled unless
trap_init() has been called before.

Reason is that static_calls_table is not 8-byte aligned, causing the unaligned
access in:

static void __init lsm_static_call_init(struct security_hook_list *hl)
{
         struct lsm_static_call *scall = hl->scalls;
         int i;

         for (i = 0; i < MAX_LSM_COUNT; i++) {
                 /* Update the first static call that is not used yet */
                 if (!scall->hl) {						<-- here
                         __static_call_update(scall->key, scall->trampoline,
                                              hl->hook.lsm_func_addr);
                         scall->hl = hl;
                         static_branch_enable(scall->active);
                         return;
                 }
                 scall++;
         }
         panic("%s - Ran out of static slots.\n", __func__);
}

A somewhat primitive alternate fix is:

diff --git a/security/security.c b/security/security.c
index aa059d0cfc29..dea9736b2014 100644
--- a/security/security.c
+++ b/security/security.c
@@ -156,7 +156,7 @@ static __initdata struct lsm_info *exclusive;
   * and a trampoline (STATIC_CALL_TRAMP) which are used to call
   * __static_call_update when updating the static call.
   */
-struct lsm_static_calls_table static_calls_table __ro_after_init = {
+struct lsm_static_calls_table static_calls_table __ro_after_init __attribute__((aligned(8))) = {
  #define INIT_LSM_STATIC_CALL(NUM, NAME)                                        \
         (struct lsm_static_call) {                                      \
                 .key = &STATIC_CALL_KEY(LSM_STATIC_CALL(NAME, NUM)),    \

Guenter

KP Singh Aug. 13, 2024, 3:56 p.m. UTC | #30

On Tue, Aug 13, 2024 at 6:08 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 8/12/24 15:02, KP Singh wrote:
> > On Mon, Aug 12, 2024 at 11:33 PM Paul Moore <paul@paul-moore.com> wrote:
> >>
> >> On Mon, Aug 12, 2024 at 5:14 PM KP Singh <kpsingh@kernel.org> wrote:
> >>> On Mon, Aug 12, 2024 at 9:33 PM Paul Moore <paul@paul-moore.com> wrote:
> >>>> On Mon, Aug 12, 2024 at 1:12 PM KP Singh <kpsingh@kernel.org> wrote:
> >>>>>
> >>>>> JFYI, I synced with Guenter and all arch seem to pass and alpha does
> >>>>> not work due to a reason that I am unable to debug. I will try doing
> >>>>> more debugging but I will need more alpha help here (Added the
> >>>>> maintainers to this thread).
> >>>>
> >>>> Thanks for the update; I was hoping that we might have a resolution
> >>>> for the Alpha failure by now but it doesn't look like we're that
> >>>> lucky.  Hopefully the Alpha devs will be able to help resolve this
> >>>> without too much trouble.
> >>>>
> >>>> Unfortunately, this does mean that I'm going to drop the static call
> >>>> patches from the lsm/dev branch so that we can continue merging other
> >>>> things.  Of course this doesn't mean the static call patches can't
> >>>> come back in later during this dev cycle once everything is solved if
> >>>> there is still time, and worst case there is always the next dev
> >>>> cycle.
> >>>>
> >>>
> >>> Do we really want to drop them for alpha? I would rather disable
> >>> CONFIG_SECURITY for alpha and if people really care for alpha we can
> >>> enable it. Alpha folks, what do you think?
> >>
> >> Seriously?  I realize Alpha is an older, lesser used arch, but it is
> >> still a supported arch and we are not going to cause a regression for
> >> the sake of a new feature.  As I mentioned earlier, once the problem
> >> is resolved we can bring the patchset back into lsm/dev; if it gets
> >> resolved soon enough we can even do it during this dev cycle.
> >>
> >
> > Okay, more data for the alpha folks, when I moved trap_init() before
> > early_security_init() everything seemed to work, I think we might need
> > to call trap_init() from setup_arch and this would fix the issue. As
> > to why? I don't know :)
> >
> > Would alpha folks be okay with this patch:
> >
> > kpsingh@kpsingh:~/projects/linux$ git diff
> > diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
> > index bebdffafaee8..53909c1be4cf 100644
> > --- a/arch/alpha/kernel/setup.c
> > +++ b/arch/alpha/kernel/setup.c
> > @@ -657,6 +657,7 @@ setup_arch(char **cmdline_p)
> >          setup_smp();
> >   #endif
> >          paging_init();
> > +       trap_init();
> >   }
> >
> >
> > and provide me some reason as to why this works, it would be great for
> > a patch description
> >
>
> Your code triggers a trap (do_entUna, unaligned access) which isn't handled unless
> trap_init() has been called before.
>
> Reason is that static_calls_table is not 8-byte aligned, causing the unaligned
> access in:
>
> static void __init lsm_static_call_init(struct security_hook_list *hl)
> {
>          struct lsm_static_call *scall = hl->scalls;
>          int i;
>
>          for (i = 0; i < MAX_LSM_COUNT; i++) {
>                  /* Update the first static call that is not used yet */
>                  if (!scall->hl) {                                              <-- here
>                          __static_call_update(scall->key, scall->trampoline,
>                                               hl->hook.lsm_func_addr);
>                          scall->hl = hl;
>                          static_branch_enable(scall->active);
>                          return;
>                  }
>                  scall++;
>          }
>          panic("%s - Ran out of static slots.\n", __func__);
> }
>
> A somewhat primitive alternate fix is:
>
> diff --git a/security/security.c b/security/security.c
> index aa059d0cfc29..dea9736b2014 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -156,7 +156,7 @@ static __initdata struct lsm_info *exclusive;
>    * and a trampoline (STATIC_CALL_TRAMP) which are used to call
>    * __static_call_update when updating the static call.
>    */
> -struct lsm_static_calls_table static_calls_table __ro_after_init = {
> +struct lsm_static_calls_table static_calls_table __ro_after_init __attribute__((aligned(8))) = {
>   #define INIT_LSM_STATIC_CALL(NUM, NAME)                                        \

I think it's worth making it aligned at 8 byte, a much simpler fix
than the arch change. Paul, I will rebase my series with these
patches, better descriptions and post them later today.

- KP

>          (struct lsm_static_call) {                                      \
>                  .key = &STATIC_CALL_KEY(LSM_STATIC_CALL(NAME, NUM)),    \
>
> Guenter
>

Guenter Roeck Aug. 13, 2024, 4:26 p.m. UTC | #31

On 8/13/24 08:56, KP Singh wrote:
> On Tue, Aug 13, 2024 at 6:08 AM Guenter Roeck <linux@roeck-us.net> wrote:
[ ... ]
>> A somewhat primitive alternate fix is:
>>
>> diff --git a/security/security.c b/security/security.c
>> index aa059d0cfc29..dea9736b2014 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -156,7 +156,7 @@ static __initdata struct lsm_info *exclusive;
>>     * and a trampoline (STATIC_CALL_TRAMP) which are used to call
>>     * __static_call_update when updating the static call.
>>     */
>> -struct lsm_static_calls_table static_calls_table __ro_after_init = {
>> +struct lsm_static_calls_table static_calls_table __ro_after_init __attribute__((aligned(8))) = {
>>    #define INIT_LSM_STATIC_CALL(NUM, NAME)                                        \
> 
> I think it's worth making it aligned at 8 byte, a much simpler fix
> than the arch change. Paul, I will rebase my series with these
> patches, better descriptions and post them later today.
> 

I think that would make sense, since it might well be that other architectures
have similar problems but it isn't seen because static_calls_table just happens
to be aligned properly there. Also, even if unaligned accesses that early were
supported, it is probably undesirable to have those in the running code if it
can be avoided due to the overhead involved.

With the above change:

Build results:
	total: 158 pass: 158 fail: 0
Qemu test results:
	total: 539 pass: 539 fail: 0
Unit test results:
	pass: 370886 fail: 3

There are a couple of caveats, unrelated to your patch series:
- I can not enable (and thus not test) security configurations
   on sh4 and sh4eb; doing so results in immediate qemu crashes.
- I can not enable / test security configurations on riscv32
   since it results in kernel images getting too large to fit
   into the available memory. I'll try to work around it,
   but for now I can only test with riscv64.

Guenter

Paul Moore Aug. 13, 2024, 6:21 p.m. UTC | #32

On Tue, Aug 13, 2024 at 11:56 AM KP Singh <kpsingh@kernel.org> wrote:
> On Tue, Aug 13, 2024 at 6:08 AM Guenter Roeck <linux@roeck-us.net> wrote:

...

> > A somewhat primitive alternate fix is:
> >
> > diff --git a/security/security.c b/security/security.c
> > index aa059d0cfc29..dea9736b2014 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -156,7 +156,7 @@ static __initdata struct lsm_info *exclusive;
> >    * and a trampoline (STATIC_CALL_TRAMP) which are used to call
> >    * __static_call_update when updating the static call.
> >    */
> > -struct lsm_static_calls_table static_calls_table __ro_after_init = {
> > +struct lsm_static_calls_table static_calls_table __ro_after_init __attribute__((aligned(8))) = {
> >   #define INIT_LSM_STATIC_CALL(NUM, NAME)                                        \
>
> I think it's worth making it aligned at 8 byte, a much simpler fix
> than the arch change.

Agreed, although please make sure it is well commented about why the
alignment is important.  It sounds like that's already your plan, but
I just want to make sure we're clear on this :)

I'd also suggest using the __aligned() macro from
compiler_attributes.h instead of the long form
__attribute__((aligned(x))).

Further, while an alignment value of "8" is generally easy enough to
guess at, especially when Alpha is concerned, it might help to further
hint at the reason by using sizeof(u64), e.g.
`__aligned(sizeof(u64))`.

> Paul, I will rebase my series with these
> patches, better descriptions and post them later today.

Great, thanks.

init/main.c: Initialize early LSMs after arch code

Commit Message

Comments

Patch