Message ID | 20230905190828.790400-1-masahiroy@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | linux/export: fix reference to exported functions for parisc64 |
The patch gets us slightly further but boot still fails in a similar way:
[...]
Run /init as init process
process '/usr/bin/sh' started with executable stack
Loading, please wait...
Starting systemd-udevd version 254.1-3
e1000 alternatives: applied 0 out of 569 patches
usbcore alternatives: applied 0 out of 17 patches
e1000: Intel(R) PRO/1000 Network Driver
e1000: Copyright (c) 1999-2006 Intel Corporation.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
scsi_mod alternatives: applied 0 out of 7 patches
usbcore: registered new device driver usb
SCSI subsystem initialized
ehci_hcd alternatives: applied 0 out of 114 patches
mptbase alternatives: applied 0 out of 73 patches
libata alternatives: applied 0 out of 3 patches
ehci_pci alternatives: applied 0 out of 2 patches
Fusion MPT base driver 3.04.20
Copyright (c) 1999-2008 LSI Corporation
ohci_hcd alternatives: applied 0 out of 144 patches
ehci-pci 0000:60:01.2: EHCI Host Controller
ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
Backtrace:
[<000000001071d5d8>] usb_hcd_pci_probe+0x330/0x4a0 [usbcore]
[<000000001025e620>] ehci_pci_probe+0x50/0x70 [ehci_pci]
[<00000000407d4004>] pci_device_probe+0x144/0x2a8
[<000000004091df6c>] really_probe+0x12c/0x5a8
[<000000004091e46c>] __driver_probe_device+0x84/0x1a0
[<000000004091e66c>] driver_probe_device+0xe4/0x2c8
[<000000004091eddc>] __driver_attach_async_helper+0x8c/0x160
[<0000000040284ecc>] async_run_entry_fn+0x64/0x210
[<000000004026b5e0>] process_one_work+0x268/0x478
[<000000004026ba88>] worker_thread+0x298/0x740
[<000000004027c3f4>] kthread+0x274/0x280
[<0000000040202020>] ret_from_kernel_thread+0x20/0x28
Page fault: no context: Code=6 (Instruction TLB miss fault) at addr 0b3a029a8348
CPU: 3 PID: 57 Comm: kworker/u64:4 Not tainted 6.5.0+ #1
Hardware name: 9000/785/C8000
Workqueue: events_unbound async_run_entry_fn
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001111 Not tainted
r00-03 000000ff0804ff0f 0b3a029a83406038 000000001025e770 0000000052008c30
r04-07 000000001025e000 0000000053501000 000000005156a0a0 0000000000000000
r08-11 000000005156a000 0000000053501800 0000000053501120 000000005156a0a0
r12-15 0000000000000002 0000000040d63640 00000000516b7328 0000000000000001
r16-19 0000000050d54580 0000000000000008 0000000050d54540 0000000000010000
r20-23 0000000000001a46 000000000000000f 0002000000000002 0000000000045b38
r24-27 0000000000000000 0000000000000003 0000000000000002 000000001025e000
r28-31 0000000000002395 0000000052008d40 0000000052008ce0 0000000000001033
sr00-03 00000000000cbc00 0000000000000000 0000000000000000 00000000000cbc00
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 000000000b3a029a 000000000b3a029a IAOQ: 0b3a029a83406038 0b3a029a8340603c
IIR: 43ffff80 ISR: 0000000000000002 IOR: 0000000052008ea0
CPU: 3 CR30: 0000000051794c20 CR31: ffffffffffffffff
ORIG_R28: 0000000000000000
IAOQ[0]: 0xb3a029a83406038
IAOQ[1]: 0xb3a029a8340603c
RP(r2): ehci_pci_setup+0x100/0x780 [ehci_pci]
Backtrace:
[<000000001071d5d8>] usb_hcd_pci_probe+0x330/0x4a0 [usbcore]
[<000000001025e620>] ehci_pci_probe+0x50/0x70 [ehci_pci]
[<00000000407d4004>] pci_device_probe+0x144/0x2a8
[<000000004091df6c>] really_probe+0x12c/0x5a8
[<000000004091e46c>] __driver_probe_device+0x84/0x1a0
[<000000004091e66c>] driver_probe_device+0xe4/0x2c8
[<000000004091eddc>] __driver_attach_async_helper+0x8c/0x160
[<0000000040284ecc>] async_run_entry_fn+0x64/0x210
[<000000004026b5e0>] process_one_work+0x268/0x478
[<000000004026ba88>] worker_thread+0x298/0x740
[<000000004027c3f4>] kthread+0x274/0x280
[<0000000040202020>] ret_from_kernel_thread+0x20/0x28
Kernel panic - not syncing: Page fault: no context
This was with master. I'll check ddb5cdbafaaa.
Dave
On 2023-09-05 3:08 p.m., Masahiro Yamada wrote:
> John David Anglin reported parisc has been broken since commit
> ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost").
>
> I checked the assembler output, and noticed function references are
> prefixed with P%, so the situation in parisc64 is similar to ia64.
>
> Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
> Reported-by: John David Anglin <dave.anglin@bell.net>
> Closes: https://lore.kernel.org/linux-parisc/1901598a-e11d-f7dd-a5d9-9a69d06e6b6e@bell.net/T/#u
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
> I just checked the assembler output, and I created this patch
> based on my best guess. Only compile-tested.
> I hope somebody will run-test this patch.
>
>
> include/linux/export-internal.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
> index 1c849db953a5..45fca09b2319 100644
> --- a/include/linux/export-internal.h
> +++ b/include/linux/export-internal.h
> @@ -52,6 +52,8 @@
>
>  #ifdef CONFIG_IA64
>  #define KSYM_FUNC(name) @fptr(name)
> +#elif defined(CONFIG_PARISC) && defined(CONFIG_64BIT)
> +#define KSYM_FUNC(name) P%name
>  #else
>  #define KSYM_FUNC(name) name
>  #endif
On 2023-09-05 3:08 p.m., Masahiro Yamada wrote:
> I checked the assembler output, and noticed function references are
> prefixed with P%, so the situation in parisc64 is similar to ia64.
Function references are prefixed with P% when they occur in data like the following:
.dword P%func
This causes the generation of a function descriptor. The assembler and linker can't handle the P% prefix in other situations.
Dave
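For readers not familiar with function descriptors, here is a rough C model of the idea described above. It is only a sketch: the struct layout and every name in it (demo_func_desc, add_one, call_via_descriptor) are hypothetical, and it does not reproduce the actual hppa64 ABI layout. The point is that an indirect call goes through a small data record rather than jumping to the code address directly.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified model of a function descriptor. The real
 * hppa64 layout is defined by the ABI and is not reproduced here.
 */
struct demo_func_desc {
	int (*entry)(int);	/* address of the function's code */
	uintptr_t gp;		/* data pointer the callee expects; unused in this model */
};

static int add_one(int x)
{
	return x + 1;
}

/*
 * Stand-in for what a data reference such as ".dword P%add_one" is
 * meant to resolve to: a descriptor, not the code address itself.
 */
static const struct demo_func_desc add_one_desc = { add_one, 0 };

/* An indirect call first loads the entry point from the descriptor. */
static int call_via_descriptor(const struct demo_func_desc *desc, int arg)
{
	return desc->entry(arg);
}
```

In this model, call_via_descriptor(&add_one_desc, 41) yields 42. If a table slot held a plain code address where a descriptor pointer was expected, the caller would read its entry point from the wrong place.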
On 2023-09-05 5:57 p.m., John David Anglin wrote:
> I'll check ddb5cdbafaaa.
Similar fault with ddb5cdbafaaa:
sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
Backtrace:
scsi host2: sata_sil24
[<0000000040bf2c00>] mutex_lock+0x48/0xc8
[<000000004023d370>] cpu_hotplug_disable+0x80/0x98
scsi host3: sata_sil24
[<0000000040792314>] pci_device_probe+0x144/0x2a8
[<00000000408af87c>] really_probe+0x12c/0x5a8
scsi host4: sata_sil24
[<00000000408afd7c>] __driver_probe_device+0x84/0x1a0
[<00000000408aff44>] driver_probe_device+0xac/0x260
scsi host5: sata_sil24
[<00000000408b0684>] __driver_attach_async_helper+0x8c/0x160
[<000000004028043c>] async_run_entry_fn+0x64/0x1d0
ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6
[<0000000040269c88>] process_one_work+0x238/0x520
[<000000004026a184>] worker_thread+0x214/0x770
ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6
[<00000000402788d4>] kthread+0x274/0x280
ata5: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6
[<0000000040202020>] ret_from_kernel_thread+0x20/0x28
ata6: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6
Page fault: no context: Code=6 (Instruction TLB miss fault) at addr 0b3a029a8348
CPU: 0 PID: 10 Comm: kworker/u64:0 Not tainted 6.4.0-rc2+ #1
Hardware name: 9000/785/C8000
Workqueue: events_unbound async_run_entry_fn
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001111 Not tainted
r00-03 000000ff0804ff0f 0b3a029a83406038 0000000000010770 0000000050d94c50
r04-07 0000000000010000 0000000053a97000 00000000515398b0 0000000000000000
r08-11 0000000051539800 0000000053a97800 0000000053a97120 00000000515398b0
r12-15 0000000050c10000 0000000000000002 0000000040d54d60 0000000000000001
r16-19 0000000040ca1d20 0000000050ce46c0 0000000050d56150 0000000000020000
r20-23 000000007f41c000 000000000000000f 0002000000000002 0000000000044b38
r24-27 0000000000000000 0000000000000003 0000000000000002 0000000000010000
r28-31 0000000000002395 0000000050d94d60 0000000050d94d00 0000000000001033
sr00-03 00000000000c7000 0000000000000000 0000000000000000 00000000000c5c00
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 000000000b3a029a 000000000b3a029a IAOQ: 0b3a029a83406038 0b3a029a8340603c
IIR: 43ffff80 ISR: 0000000000000dc0 IOR: 00000000402849ac
CPU: 0 CR30: 0000000050d56150 CR31: ffffffffffffffff
ORIG_R28: 0000000000000080
IAOQ[0]: 0xb3a029a83406038
IAOQ[1]: 0xb3a029a8340603c
RP(r2): ehci_pci_setup+0x100/0x780 [ehci_pci]
Backtrace:
[<0000000040bf2c00>] mutex_lock+0x48/0xc8
[<000000004023d370>] cpu_hotplug_disable+0x80/0x98
[<0000000040792314>] pci_device_probe+0x144/0x2a8
[<00000000408af87c>] really_probe+0x12c/0x5a8
[<00000000408afd7c>] __driver_probe_device+0x84/0x1a0
[<00000000408aff44>] driver_probe_device+0xac/0x260
[<00000000408b0684>] __driver_attach_async_helper+0x8c/0x160
[<000000004028043c>] async_run_entry_fn+0x64/0x1d0
[<0000000040269c88>] process_one_work+0x238/0x520
[<000000004026a184>] worker_thread+0x214/0x770
[<00000000402788d4>] kthread+0x274/0x280
[<0000000040202020>] ret_from_kernel_thread+0x20/0x28
Kernel panic - not syncing: Page fault: no context
Dave
On 2023-09-05 7:59 p.m., John David Anglin wrote:
> On 2023-09-05 5:57 p.m., John David Anglin wrote:
>> I'll check ddb5cdbafaaa.
> Similar fault with ddb5cdbafaaa:
The alignment of the __kstrtab_ symbols in vmlinux seems wrong. I'm fairly certain that function references prefixed with P% on hppa64 need 8 byte alignment.
81662: 0000000040ea4358 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_system[...]
81663: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_syst[...]
81664: 0000000040e8e830 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_system[...]
81665: 0000000040ea4365 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_static[...]
81666: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_stat[...]
81667: 0000000040ea1640 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_static[...]
81668: 0000000040ea437c 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_reset_[...]
81669: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_rese[...]
81670: 0000000040e8bbc0 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_reset_[...]
81671: 0000000040ea438a 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_loops_[...]
81672: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_loop[...]
81673: 0000000040e86610 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_loops_[...]
81674: 0000000040ea439a 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_init_uts_ns
81675: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_init[...]
81676: 0000000040e99180 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_init_uts_ns
81677: 0000000040ea43a6 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_name_t[...]
81678: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_name[...]
81679: 0000000040e9b340 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_name_t[...]
81680: 0000000040ea43b4 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_wait_f[...]
81681: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_wait[...]
81682: 0000000040ea3638 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_wait_f[...]
81683: 0000000040ea43c7 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_init_task
[...]
I'm not sure how we get symbols that aren't 8 byte aligned. The ".balign 4" directive in __KSYMTAB doesn't seem correct but it's not the whole problem.
Dave
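The alignment question can be checked mechanically against the addresses in the dump above. A minimal sketch follows; is_aligned is a helper name made up for this illustration (it is not a kernel function), and it assumes the alignment is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: true if addr is a multiple
 * of align. align must be a power of two for the mask trick to work.
 */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

Applied to the dump, the __ksymtab_ addresses shown (for example 0x40e8e830 and 0x40ea3638) pass an 8-byte check, while several __kstrtab_ addresses do not (0x40ea4365 is odd, and 0x40ea437c is only 4-byte aligned).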
On Wed, Sep 6, 2023 at 4:26 AM Helge Deller <deller@gmx.de> wrote:
>
> I think ppc64 is affected too.
I tested ppc64 ABI v1, but did not see a breakage.
> Search for dereference_function_descriptor() in kernel sources, e.g.
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1494564.html
> Helge
>
> -------- Original message --------
> From: Masahiro Yamada <masahiroy@kernel.org>
> Date: 05.09.23 21:08 (GMT+01:00)
> To: linux-parisc@vger.kernel.org, Helge Deller <deller@gmx.de>, John David Anglin <dave.anglin@bell.net>
> Cc: linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org, Masahiro Yamada <masahiroy@kernel.org>, Nick Desaulniers <ndesaulniers@google.com>
> Subject: [PATCH] linux/export: fix reference to exported functions for parisc64
On Fri, Sep 8, 2023 at 7:02 AM John David Anglin <dave.anglin@bell.net> wrote:
>
> On 2023-09-05 7:59 p.m., John David Anglin wrote:
> > On 2023-09-05 5:57 p.m., John David Anglin wrote:
> >> I'll check ddb5cdbafaaa.
> > Similar fault with ddb5cdbafaaa:
> The alignment of the __kstrtab_ symbols in vmlinux seems wrong.
__kstrtab_ symbols do not need alignment.
They were not aligned at all before ddb5cdbafaaa^.
> I'm fairly certain that function
> references prefixed with P% on hppa64 need 8 byte alignment.
Yeah.
In the following dump, all of __ksymtab_* are correctly 8-byte aligned.
>
> 81662: 0000000040ea4358 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_system[...]
> 81663: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_syst[...]
> 81664: 0000000040e8e830 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_system[...]
> 81665: 0000000040ea4365 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_static[...]
> 81666: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_stat[...]
> 81667: 0000000040ea1640 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_static[...]
> 81668: 0000000040ea437c 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_reset_[...]
> 81669: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_rese[...]
> 81670: 0000000040e8bbc0 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_reset_[...]
> 81671: 0000000040ea438a 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_loops_[...]
> 81672: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_loop[...]
> 81673: 0000000040e86610 0 NOTYPE LOCAL DEFAULT 14 __ksymtab_loops_[...]
> 81674: 0000000040ea439a 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_init_uts_ns
> 81675: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_init[...]
> 81676: 0000000040e99180 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_init_uts_ns
> 81677: 0000000040ea43a6 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_name_t[...]
> 81678: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_name[...]
> 81679: 0000000040e9b340 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_name_t[...]
> 81680: 0000000040ea43b4 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_wait_f[...]
> 81681: 0000000040ea4748 0 NOTYPE LOCAL DEFAULT 16 __kstrtabns_wait[...]
> 81682: 0000000040ea3638 0 NOTYPE LOCAL DEFAULT 15 __ksymtab_wait_f[...]
> 81683: 0000000040ea43c7 0 NOTYPE LOCAL DEFAULT 16 __kstrtab_init_task
> [...]
>
> I'm not sure how we get symbols that aren't 8 byte aligned. The ".balign 4" directive
> in __KSYMTAB doesn't seem correct but it's not the whole problem.
>
> Dave
>
> --
> John David Anglin dave.anglin@bell.net
With a little more investigation, I found arch/parisc/kernel/parisc_ksyms.c is causing the issue.
That file is a collection of EXPORT_SYMBOL of assembly code.
I will take a closer look tomorrow.
Hi John, Helge,
Could you test the attached patch please?
Again, I only tested compilation for this. I do not have parisc64 hardware.
In my understanding, QEMU does not support hppa64. I have not found a way to test parisc64.
Masahiro Yamada
Hi Masahiro, The attached change fixed boot at ddb5cdbafaaa
Hi Masahiro,
I can confirm as well that your patch
linux/export: fix reference to exported functions for parisc64
does indeed fix the boot issue on parisc64.
I tested it on a C3000 workstation on top of Linus' v6.6-rc1 git tree.
You may add:
Tested-by: Helge Deller <deller@gmx.de>
Dave, I don't see the issue you mention below...
Helge
On 9/10/23 23:30, John David Anglin wrote:
> Hi Masahiro,
>
> The attached change fixed boot at ddb5cdbafaaa
Hi Helge,
It occurs consistently on my c8000 but I'm having difficulty bisecting it. Trying a bisect with --first-parent.
Note I had to pull the ATI graphics card from the machine as it started to malfunction, causing crashes. However, v6.1.52 boots fine.
Dave
Hi Dave,
On 9/12/23 15:20, John David Anglin wrote:
> It occurs consistently on my c8000 but I'm having difficulty bisecting it. Trying a bisect
> with --first-parent.
I just tried to boot v6.6-rc1 with Masahiro's patch on a c8000, and it succeeds as well.
I've copied my pre-built kernel here:
http://backup.parisc-linux.org/kernel/linux-image-6.6.0-rc1-dirty_6.6.0-rc1-250_hppa.deb
So, I think Masahiro's patch is basically ok and probably isn't the root cause for your udev issues below.
Did you check whether the initramfs included all necessary filesystem modules?
Maybe try updating your machine to the latest ramfstools and re-installing your kernel?
Helge
> Note I had to pull ATI graphics card from the machine as it started to malfunction causing crashes.
> However, v6.1.52 boots fine.
On 2023-09-12 10:05 a.m., Helge Deller wrote:
> On 9/12/23 15:20, John David Anglin wrote:
>> It occurs consistently on my c8000 but I'm having difficulty bisecting it. Trying a bisect
>> with --first-parent.
>
> I just tried to boot the v6.6-rc1 with Masahiro's patch on c8000, and it succeeds as well.
> I've copied my pre-built kernel here:
> http://backup.parisc-linux.org/kernel/linux-image-6.6.0-rc1-dirty_6.6.0-rc1-250_hppa.deb
>
> So, I think Masahiro's patch is basically ok and probably isn't the root cause
> for your udev issues below.
I agree. I see the udev issue with the above kernel. Continuing to bisect mainline.
Dave
On 9/14/23 09:29, John David Anglin wrote:
>>> dave@atlas:~/linux/linux$ git diff drivers/scsi/scsi.c
>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>>> index d0911bc28663..dc3a283ebd75 100644
>>> --- a/drivers/scsi/scsi.c
>>> +++ b/drivers/scsi/scsi.c
>>> @@ -578,6 +578,8 @@ static bool scsi_cdl_check_cmd(struct scsi_device *sdev, u8 opcode, u16 sa,
>>> int ret;
>>> u8 cdlp;
>>>
>>> + return false;
>>> +
>>> /* Check operation code */
>>> ret = scsi_report_opcode(sdev, buf, SCSI_CDL_CHECK_BUF_LEN, opcode, sa);
>>> if (ret <= 0)
>> It is weird that this solves anything... the MAINTENANCE_IN command issued by
>> scsi_report_opcode() ends up being emulated in libata with
>> ata_scsiop_maint_in(). There are no actual commands issued to the drive, so
>> nothing that could actually fail/cause issues. By the time this is issued, the
>> ATA drive is also fully probed...
>>
>> Or is the drive connected to the Broadcom HBA you have ? In that case, libata is
>> not used and the HBA FW SAT (scsi-ata-translation) is likely to blame.
> /boot, / and swap partitions reside on a ST373207LW drive connected to a Broadcom HBA. A
> ST4000VN008-2DR1 drive is connected to the Silicon Image, Inc. SiI 3124 PCI-X Serial
> ATA Controller. It mounts on /home. There's also a cdrom connected to the Silicon
> Image, Inc. PCI0680 Ultra ATA-133 Host Controller and another ST4000VN008-2DR1 drive
> connected to a Broadcom HBA. There are two Broadcom HBAs.
>
> I think the issue is with the root ST373207LW drive. The console output indicates that the
> ROOT drive doesn't exist when the boot fails.
>
> Your change only appeared to affect actual SCSI drives. That's why I tried disabling CDL.
OK. I can see from the dmesg snippets you sent that the drives on the ATA ports seem OK.
A quick search tells me that the ST373207LW drive is an Ultra320 SCSI drive, not ATA. So the MAINTENANCE_IN command issued by scsi_report_opcode() will go straight to the drive as-is.
This command has been issued to devices for a long time, and given that your system was working, the drive is probably fine with it in its simplest form (one command format). The CDL changes, however, added command support probing with the service action field (one command format with service action). What may be happening is that the drive does not like or does not support that format and chokes on it.
Let me check the specs to see which SCSI level supports this format. What is sure is that Ultra320 SCSI disks will definitely *not* support CDL, so we could exit early in scsi_cdl_check_cmd(), returning false for drives that only support an old SCSI level. Let me send something along these lines.
>>
>> Could you send a full dmesg output for a clean boot and for a failed one so that
>> I can compare ?
> I'll try to get this together tomorrow.
>
> Dave
>
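The early-exit idea proposed above could look roughly like the sketch below. This is not the patch that was eventually posted; it is a self-contained model under stated assumptions. The struct, the helper names and the threshold DEMO_MIN_CDL_SCSI_LEVEL are all hypothetical, and the thread does not say which SCSI level is the real cutoff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct scsi_device. */
struct demo_scsi_device {
	int scsi_level;	/* SCSI level the drive reported */
};

/* Placeholder value: the thread does not state the real cutoff. */
#define DEMO_MIN_CDL_SCSI_LEVEL 8

/* Counts how many probe commands this model would have sent. */
static int demo_probe_calls;

/* Stand-in for the MAINTENANCE_IN probe; pretends the drive answered. */
static bool demo_issue_probe(const struct demo_scsi_device *sdev)
{
	(void)sdev;
	demo_probe_calls++;
	return true;
}

/* Model of the check: skip the probe entirely on old drives. */
static bool demo_cdl_check_cmd(const struct demo_scsi_device *sdev)
{
	if (sdev->scsi_level < DEMO_MIN_CDL_SCSI_LEVEL)
		return false;	/* old drive: never send the probe */

	return demo_issue_probe(sdev);
}
```

The design point is that the guard runs before any command is built, so a drive that chokes on the newer command format never sees it.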
On 9/14/23 09:29, John David Anglin wrote:
> I think the issue is with the root ST373207LW drive. The console output indicates that the
> ROOT drive doesn't exist when the boot fails.
>
> Your change only appeared to affect actual SCSI drives. That's why I tried disabling CDL.
>>
>> Could you send a full dmesg output for a clean boot and for a failed one so that
>> I can compare ?
> I'll try to get this together tomorrow.
Please try the attached patch. That should address the issue with your drive.
On 2023-09-13 10:24 p.m., Damien Le Moal wrote:
> Please try the attached patch. That should address the issue with your drive.
Mainline and v6.5.3 both booted successfully with the attached patch.
Thanks,
Dave
On 9/15/23 00:07, John David Anglin wrote:
> Mainline and v6.5.3 both booted successfully with the attached patch.
Great ! Thanks for testing. Posting the patch.
>
> Thanks,
> Dave
>
diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
index 1c849db953a5..45fca09b2319 100644
--- a/include/linux/export-internal.h
+++ b/include/linux/export-internal.h
@@ -52,6 +52,8 @@
 #ifdef CONFIG_IA64
 #define KSYM_FUNC(name) @fptr(name)
+#elif defined(CONFIG_PARISC) && defined(CONFIG_64BIT)
+#define KSYM_FUNC(name) P%name
 #else
 #define KSYM_FUNC(name) name
 #endif
John David Anglin reported parisc has been broken since commit
ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost").
I checked the assembler output, and noticed function references are
prefixed with P%, so the situation in parisc64 is similar to ia64.
Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
Reported-by: John David Anglin <dave.anglin@bell.net>
Closes: https://lore.kernel.org/linux-parisc/1901598a-e11d-f7dd-a5d9-9a69d06e6b6e@bell.net/T/#u
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
I just checked the assembler output, and I created this patch
based on my best guess. Only compile-tested.
I hope somebody will run-test this patch.
include/linux/export-internal.h | 2 ++
1 file changed, 2 insertions(+)