Message ID | 20210526080741.GW30378@techsingularity.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/page_alloc: Work around a pahole limitation with zero-sized struct pagesets | expand |
On 26.05.21 10:07, Mel Gorman wrote: > Michal Suchanek reported the following problem with linux-next > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 > ... > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs > [ 26.147816] Freeing unused decrypted memory: 2036K > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K > [ 26.165776] Write protecting the kernel read-only data: 26624k > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K > [ 26.187031] Run /init as init process > [ 26.190693] with arguments: > [ 26.193661] /init > [ 26.195933] with environment: > [ 26.199079] HOME=/ > [ 26.201444] TERM=linux > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla > [ 26.254154] BPF: type_id=35503 offset=178440 size=4 > [ 26.259125] BPF: > [ 26.261054] BPF:Invalid offset > [ 26.264119] BPF: > [ 26.264119] > [ 26.267437] failed to validate module [efivarfs] BTF: -22 > > Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert > per-cpu list protection to local_lock" currently staged in mmotm. In his > own words > > The immediate problem is two different definitions of numa_node per-cpu > variable. They both are at the same offset within .data..percpu ELF > section, they both have the same name, but one of them is marked as > static and another as global. And one is int variable, while another > is struct pagesets. I'll look some more tomorrow, but adding Jiri and > Arnaldo for visibility. > > [110907] DATASEC '.data..percpu' size=178904 vlen=303 > ... > type_id=27753 offset=163976 size=4 (VAR 'numa_node') > type_id=27754 offset=163976 size=4 (VAR 'numa_node') > > [27753] VAR 'numa_node' type_id=27556, linkage=static > [27754] VAR 'numa_node' type_id=20, linkage=global > > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > > [27556] STRUCT 'pagesets' size=0 vlen=1 > 'lock' type_id=507 bits_offset=0 > > [506] STRUCT '(anon)' size=0 vlen=0 > [507] TYPEDEF 'local_lock_t' type_id=506 > > The patch in question introduces a zero-sized per-cpu struct and while > this is not wrong, versions of pahole prior to 1.22 (unreleased) get > confused during BTF generation with two separate variables occupying the > same address. > > This patch checks for older versions of pahole and forces struct pagesets > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > warning is omitted so that distributions can update pahole when 1.22 > is released. > > Reported-by: Michal Suchanek <msuchanek@suse.de> > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > --- > lib/Kconfig.debug | 3 +++ > mm/page_alloc.c | 11 +++++++++++ > 2 files changed, 14 insertions(+) > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 678c13967580..f88a155b80a9 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > config PAHOLE_HAS_SPLIT_BTF > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > + > config DEBUG_INFO_BTF_MODULES > def_bool y > depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ff8f706839ea..1d56d3de8e08 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -124,6 +124,17 @@ static DEFINE_MUTEX(pcp_batch_high_lock); > > struct pagesets { > local_lock_t lock; > +#if defined(CONFIG_DEBUG_INFO_BTF) && \ > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > + /* > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > + * variables and produces invalid BTF. Ensure that > + * sizeof(struct pagesets) != 0 for older versions of pahole. > + */ > + char __pahole_hack; > + #warning "pahole too old to support zero-sized struct pagesets" > +#endif > }; > static DEFINE_PER_CPU(struct pagesets, pagesets) = { > .lock = INIT_LOCAL_LOCK(lock), > Looks sane to me.
On Wed, May 26, 2021 at 09:07:41AM +0100, Mel Gorman wrote: > Michal Suchanek reported the following problem with linux-next > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 > ... > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs > [ 26.147816] Freeing unused decrypted memory: 2036K > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K > [ 26.165776] Write protecting the kernel read-only data: 26624k > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K > [ 26.187031] Run /init as init process > [ 26.190693] with arguments: > [ 26.193661] /init > [ 26.195933] with environment: > [ 26.199079] HOME=/ > [ 26.201444] TERM=linux > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla > [ 26.254154] BPF: type_id=35503 offset=178440 size=4 > [ 26.259125] BPF: > [ 26.261054] BPF:Invalid offset > [ 26.264119] BPF: > [ 26.264119] > [ 26.267437] failed to validate module [efivarfs] BTF: -22 > > Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert > per-cpu list protection to local_lock" currently staged in mmotm. In his > own words > > The immediate problem is two different definitions of numa_node per-cpu > variable. They both are at the same offset within .data..percpu ELF > section, they both have the same name, but one of them is marked as > static and another as global. And one is int variable, while another > is struct pagesets. I'll look some more tomorrow, but adding Jiri and > Arnaldo for visibility. > > [110907] DATASEC '.data..percpu' size=178904 vlen=303 > ... > type_id=27753 offset=163976 size=4 (VAR 'numa_node') > type_id=27754 offset=163976 size=4 (VAR 'numa_node') > > [27753] VAR 'numa_node' type_id=27556, linkage=static > [27754] VAR 'numa_node' type_id=20, linkage=global > > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > > [27556] STRUCT 'pagesets' size=0 vlen=1 > 'lock' type_id=507 bits_offset=0 > > [506] STRUCT '(anon)' size=0 vlen=0 > [507] TYPEDEF 'local_lock_t' type_id=506 > > The patch in question introduces a zero-sized per-cpu struct and while > this is not wrong, versions of pahole prior to 1.22 (unreleased) get > confused during BTF generation with two separate variables occupying the > same address. > > This patch checks for older versions of pahole and forces struct pagesets > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > warning is omitted so that distributions can update pahole when 1.22 > is released. > > Reported-by: Michal Suchanek <msuchanek@suse.de> > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > --- > lib/Kconfig.debug | 3 +++ > mm/page_alloc.c | 11 +++++++++++ > 2 files changed, 14 insertions(+) > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 678c13967580..f88a155b80a9 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > config PAHOLE_HAS_SPLIT_BTF > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > + This does not seem workable with dummy-tools. Do we even have dummy pahole? Thanks Michal > config DEBUG_INFO_BTF_MODULES > def_bool y > depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ff8f706839ea..1d56d3de8e08 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -124,6 +124,17 @@ static DEFINE_MUTEX(pcp_batch_high_lock); > > struct pagesets { > local_lock_t lock; > +#if defined(CONFIG_DEBUG_INFO_BTF) && \ > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > + /* > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > + * variables and produces invalid BTF. Ensure that > + * sizeof(struct pagesets) != 0 for older versions of pahole. > + */ > + char __pahole_hack; > + #warning "pahole too old to support zero-sized struct pagesets" > +#endif > }; > static DEFINE_PER_CPU(struct pagesets, pagesets) = { > .lock = INIT_LOCAL_LOCK(lock),
On Wed, May 26, 2021 at 10:33:42AM +0200, Michal Such?nek wrote: > > lib/Kconfig.debug | 3 +++ > > mm/page_alloc.c | 11 +++++++++++ > > 2 files changed, 14 insertions(+) > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > > index 678c13967580..f88a155b80a9 100644 > > --- a/lib/Kconfig.debug > > +++ b/lib/Kconfig.debug > > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > > config PAHOLE_HAS_SPLIT_BTF > > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > > + > > This does not seem workable with dummy-tools. > > Do we even have dummy pahole? > I don't think so but if PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT is broken for you then the same problem should have happened for the PAHOLE_HAS_SPLIT_BTF check.
On Wed, May 26, 2021 at 1:07 AM Mel Gorman <mgorman@techsingularity.net> wrote: > > Michal Suchanek reported the following problem with linux-next > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 > ... > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs > [ 26.147816] Freeing unused decrypted memory: 2036K > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K > [ 26.165776] Write protecting the kernel read-only data: 26624k > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K > [ 26.187031] Run /init as init process > [ 26.190693] with arguments: > [ 26.193661] /init > [ 26.195933] with environment: > [ 26.199079] HOME=/ > [ 26.201444] TERM=linux > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla > [ 26.254154] BPF: type_id=35503 offset=178440 size=4 > [ 26.259125] BPF: > [ 26.261054] BPF:Invalid offset > [ 26.264119] BPF: > [ 26.264119] > [ 26.267437] failed to validate module [efivarfs] BTF: -22 > > Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert > per-cpu list protection to local_lock" currently staged in mmotm. In his > own words > > The immediate problem is two different definitions of numa_node per-cpu > variable. They both are at the same offset within .data..percpu ELF > section, they both have the same name, but one of them is marked as > static and another as global. And one is int variable, while another > is struct pagesets. I'll look some more tomorrow, but adding Jiri and > Arnaldo for visibility. > > [110907] DATASEC '.data..percpu' size=178904 vlen=303 > ... > type_id=27753 offset=163976 size=4 (VAR 'numa_node') > type_id=27754 offset=163976 size=4 (VAR 'numa_node') > > [27753] VAR 'numa_node' type_id=27556, linkage=static > [27754] VAR 'numa_node' type_id=20, linkage=global > > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > > [27556] STRUCT 'pagesets' size=0 vlen=1 > 'lock' type_id=507 bits_offset=0 > > [506] STRUCT '(anon)' size=0 vlen=0 > [507] TYPEDEF 'local_lock_t' type_id=506 > > The patch in question introduces a zero-sized per-cpu struct and while > this is not wrong, versions of pahole prior to 1.22 (unreleased) get > confused during BTF generation with two separate variables occupying the > same address. > > This patch checks for older versions of pahole and forces struct pagesets > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > warning is omitted so that distributions can update pahole when 1.22 s/omitted/emitted/ ? > is released. > > Reported-by: Michal Suchanek <msuchanek@suse.de> > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > --- Looks good! I verified that this does fix the issue on the latest linux-next tree, thanks! One question, should Fixes: 5716a627517d ("mm/page_alloc: convert per-cpu list protection to local_lock") be added to facilitate backporting? Either way: Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Andrii Nakryiko <andrii@kernel.org> > lib/Kconfig.debug | 3 +++ > mm/page_alloc.c | 11 +++++++++++ > 2 files changed, 14 insertions(+) > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 678c13967580..f88a155b80a9 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > config PAHOLE_HAS_SPLIT_BTF > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > + > config DEBUG_INFO_BTF_MODULES > def_bool y > depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ff8f706839ea..1d56d3de8e08 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -124,6 +124,17 @@ static DEFINE_MUTEX(pcp_batch_high_lock); > > struct pagesets { > local_lock_t lock; > +#if defined(CONFIG_DEBUG_INFO_BTF) && \ > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > + /* > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > + * variables and produces invalid BTF. Ensure that > + * sizeof(struct pagesets) != 0 for older versions of pahole. > + */ > + char __pahole_hack; > + #warning "pahole too old to support zero-sized struct pagesets" > +#endif > }; > static DEFINE_PER_CPU(struct pagesets, pagesets) = { > .lock = INIT_LOCAL_LOCK(lock),
On Wed, May 26, 2021 at 1:33 AM Michal Suchánek <msuchanek@suse.de> wrote: > > On Wed, May 26, 2021 at 09:07:41AM +0100, Mel Gorman wrote: > > Michal Suchanek reported the following problem with linux-next > > > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) > > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 > > ... > > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 > > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs > > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 > > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs > > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 > > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs > > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 > > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs > > [ 26.147816] Freeing unused decrypted memory: 2036K > > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K > > [ 26.165776] Write protecting the kernel read-only data: 26624k > > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K > > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K > > [ 26.187031] Run /init as init process > > [ 26.190693] with arguments: > > [ 26.193661] /init > > [ 26.195933] with environment: > > [ 26.199079] HOME=/ > > [ 26.201444] TERM=linux > > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla > > [ 26.254154] BPF: type_id=35503 offset=178440 size=4 > > [ 26.259125] BPF: > > [ 26.261054] BPF:Invalid offset > > [ 26.264119] BPF: > > [ 26.264119] > > [ 26.267437] failed to validate module [efivarfs] BTF: -22 > > > > Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert > > per-cpu list protection to local_lock" currently staged in mmotm. In his > > own words > > > > The immediate problem is two different definitions of numa_node per-cpu > > variable. They both are at the same offset within .data..percpu ELF > > section, they both have the same name, but one of them is marked as > > static and another as global. And one is int variable, while another > > is struct pagesets. I'll look some more tomorrow, but adding Jiri and > > Arnaldo for visibility. > > > > [110907] DATASEC '.data..percpu' size=178904 vlen=303 > > ... > > type_id=27753 offset=163976 size=4 (VAR 'numa_node') > > type_id=27754 offset=163976 size=4 (VAR 'numa_node') > > > > [27753] VAR 'numa_node' type_id=27556, linkage=static > > [27754] VAR 'numa_node' type_id=20, linkage=global > > > > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > > > > [27556] STRUCT 'pagesets' size=0 vlen=1 > > 'lock' type_id=507 bits_offset=0 > > > > [506] STRUCT '(anon)' size=0 vlen=0 > > [507] TYPEDEF 'local_lock_t' type_id=506 > > > > The patch in question introduces a zero-sized per-cpu struct and while > > this is not wrong, versions of pahole prior to 1.22 (unreleased) get > > confused during BTF generation with two separate variables occupying the > > same address. > > > > This patch checks for older versions of pahole and forces struct pagesets > > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > > warning is omitted so that distributions can update pahole when 1.22 > > is released. > > > > Reported-by: Michal Suchanek <msuchanek@suse.de> > > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > --- > > lib/Kconfig.debug | 3 +++ > > mm/page_alloc.c | 11 +++++++++++ > > 2 files changed, 14 insertions(+) > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > > index 678c13967580..f88a155b80a9 100644 > > --- a/lib/Kconfig.debug > > +++ b/lib/Kconfig.debug > > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > > config PAHOLE_HAS_SPLIT_BTF > > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > > + > > This does not seem workable with dummy-tools. > > Do we even have dummy pahole? > I don't know what dummy-tools is, so probably no. But if you don't have pahole on the build host, you can't have DEBUG_INFO_BTF=y anyways. As in, your build will fail because it will be impossible to generate BTF information. So you'll have to disable DEBUG_INFO_BTF if you can't get pahole onto your build host for some reason. > Thanks > > Michal > > > config DEBUG_INFO_BTF_MODULES > > def_bool y > > depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index ff8f706839ea..1d56d3de8e08 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -124,6 +124,17 @@ static DEFINE_MUTEX(pcp_batch_high_lock); > > > > struct pagesets { > > local_lock_t lock; > > +#if defined(CONFIG_DEBUG_INFO_BTF) && \ > > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > > + /* > > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > > + * variables and produces invalid BTF. Ensure that > > + * sizeof(struct pagesets) != 0 for older versions of pahole. > > + */ > > + char __pahole_hack; > > + #warning "pahole too old to support zero-sized struct pagesets" > > +#endif > > }; > > static DEFINE_PER_CPU(struct pagesets, pagesets) = { > > .lock = INIT_LOCAL_LOCK(lock),
On Wed, May 26, 2021 at 10:00:34AM -0700, Andrii Nakryiko wrote: > On Wed, May 26, 2021 at 1:33 AM Michal Suchánek <msuchanek@suse.de> wrote: > > > > On Wed, May 26, 2021 at 09:07:41AM +0100, Mel Gorman wrote: > > > Michal Suchanek reported the following problem with linux-next > > > > > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) > > > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 > > > ... > > > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 > > > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs > > > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 > > > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs > > > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 > > > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs > > > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 > > > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs > > > [ 26.147816] Freeing unused decrypted memory: 2036K > > > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K > > > [ 26.165776] Write protecting the kernel read-only data: 26624k > > > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K > > > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K > > > [ 26.187031] Run /init as init process > > > [ 26.190693] with arguments: > > > [ 26.193661] /init > > > [ 26.195933] with environment: > > > [ 26.199079] HOME=/ > > > [ 26.201444] TERM=linux > > > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla > > > [ 26.254154] BPF: type_id=35503 offset=178440 size=4 > > > [ 26.259125] BPF: > > > [ 26.261054] BPF:Invalid offset > > > [ 26.264119] BPF: > > > [ 26.264119] > > > [ 26.267437] failed to validate module [efivarfs] BTF: -22 > > > > > > Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert > > > per-cpu list protection to local_lock" currently staged in mmotm. In his > > > own words > > > > > > The immediate problem is two different definitions of numa_node per-cpu > > > variable. They both are at the same offset within .data..percpu ELF > > > section, they both have the same name, but one of them is marked as > > > static and another as global. And one is int variable, while another > > > is struct pagesets. I'll look some more tomorrow, but adding Jiri and > > > Arnaldo for visibility. > > > > > > [110907] DATASEC '.data..percpu' size=178904 vlen=303 > > > ... > > > type_id=27753 offset=163976 size=4 (VAR 'numa_node') > > > type_id=27754 offset=163976 size=4 (VAR 'numa_node') > > > > > > [27753] VAR 'numa_node' type_id=27556, linkage=static > > > [27754] VAR 'numa_node' type_id=20, linkage=global > > > > > > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > > > > > > [27556] STRUCT 'pagesets' size=0 vlen=1 > > > 'lock' type_id=507 bits_offset=0 > > > > > > [506] STRUCT '(anon)' size=0 vlen=0 > > > [507] TYPEDEF 'local_lock_t' type_id=506 > > > > > > The patch in question introduces a zero-sized per-cpu struct and while > > > this is not wrong, versions of pahole prior to 1.22 (unreleased) get > > > confused during BTF generation with two separate variables occupying the > > > same address. > > > > > > This patch checks for older versions of pahole and forces struct pagesets > > > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > > > warning is omitted so that distributions can update pahole when 1.22 > > > is released. > > > > > > Reported-by: Michal Suchanek <msuchanek@suse.de> > > > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > > > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > > --- > > > lib/Kconfig.debug | 3 +++ > > > mm/page_alloc.c | 11 +++++++++++ > > > 2 files changed, 14 insertions(+) > > > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > > > index 678c13967580..f88a155b80a9 100644 > > > --- a/lib/Kconfig.debug > > > +++ b/lib/Kconfig.debug > > > @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF > > > config PAHOLE_HAS_SPLIT_BTF > > > def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") > > > > > > +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT > > > + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") > > > + > > > > This does not seem workable with dummy-tools. > > > > Do we even have dummy pahole? > > > > I don't know what dummy-tools is, so probably no. But if you don't > have pahole on the build host, you can't have DEBUG_INFO_BTF=y > anyways. As in, your build will fail because it will be impossible to > generate BTF information. So you'll have to disable DEBUG_INFO_BTF if > you can't get pahole onto your build host for some reason. dummy-tools is used to maintain configuration files outside of build the environment. It is not easy to have all tools with all bells and whistles for all architectures on one machine. That is you should be able to enable DEBUG_INFO_BTF without pahole, and then build the config on a build host that has a compiler and pohole for the target architecture. Thanks Michal
On Wed, May 26, 2021 at 09:57:31AM -0700, Andrii Nakryiko wrote: > > This patch checks for older versions of pahole and forces struct pagesets > > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > > warning is omitted so that distributions can update pahole when 1.22 > > s/omitted/emitted/ ? > Yes. > > is released. > > > > Reported-by: Michal Suchanek <msuchanek@suse.de> > > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > --- > > Looks good! I verified that this does fix the issue on the latest > linux-next tree, thanks! > Excellent > One question, should > > Fixes: 5716a627517d ("mm/page_alloc: convert per-cpu list protection > to local_lock") > > be added to facilitate backporting? > The git commit is not stable because the patch "mm/page_alloc: convert per-cpu list protection to local_lock" is in Andrew's mmotm tree which is quilt based. I decided not to treat the patch as a fix because the patch is not wrong as such, it's a limitation of an external tool. However, I would expect both the problematic patch and the BTF workaround to go in during the same merge window so backports to -stable should not be required. > Either way: > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > Tested-by: Andrii Nakryiko <andrii@kernel.org> > Thanks!
On Wed, May 26, 2021 at 11:13 AM Mel Gorman <mgorman@techsingularity.net> wrote: > > On Wed, May 26, 2021 at 09:57:31AM -0700, Andrii Nakryiko wrote: > > > This patch checks for older versions of pahole and forces struct pagesets > > > to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A > > > warning is omitted so that distributions can update pahole when 1.22 > > > > s/omitted/emitted/ ? > > > > Yes. > > > > is released. > > > > > > Reported-by: Michal Suchanek <msuchanek@suse.de> > > > Reported-by: Hritik Vijay <hritikxx8@gmail.com> > > > Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > Signed-off-by: Mel Gorman <mgorman@techsingularity.net> > > > --- > > > > Looks good! I verified that this does fix the issue on the latest > > linux-next tree, thanks! > > > > Excellent > > > One question, should > > > > Fixes: 5716a627517d ("mm/page_alloc: convert per-cpu list protection > > to local_lock") > > > > be added to facilitate backporting? > > > > The git commit is not stable because the patch "mm/page_alloc: convert > per-cpu list protection to local_lock" is in Andrew's mmotm tree which is Oh, I didn't know about this instability. > quilt based. I decided not to treat the patch as a fix because the patch is > not wrong as such, it's a limitation of an external tool. However, I would > expect both the problematic patch and the BTF workaround to go in during > the same merge window so backports to -stable should not be required. Yep, makes sense. > > > Either way: > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Tested-by: Andrii Nakryiko <andrii@kernel.org> > > > > Thanks! > > -- > Mel Gorman > SUSE Labs
On Wed, May 26, 2021 at 09:07:41AM +0100, Mel Gorman wrote: > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > + /* > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > + * variables and produces invalid BTF. Ensure that > + * sizeof(struct pagesets) != 0 for older versions of pahole. > + */ > + char __pahole_hack; > + #warning "pahole too old to support zero-sized struct pagesets" > +#endif Err, hell no. We should not mess up the kernel for broken tools that are not relevant to the kernel build itself ever.
On Thu, May 27, 2021 at 09:04:24AM +0100, Christoph Hellwig wrote: > On Wed, May 26, 2021 at 09:07:41AM +0100, Mel Gorman wrote: > > + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ > > + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) > > + /* > > + * pahole 1.21 and earlier gets confused by zero-sized per-CPU > > + * variables and produces invalid BTF. Ensure that > > + * sizeof(struct pagesets) != 0 for older versions of pahole. > > + */ > > + char __pahole_hack; > > + #warning "pahole too old to support zero-sized struct pagesets" > > +#endif > > Err, hell no. We should not mess up the kernel for broken tools that > are not relevant to the kernel build itself ever. What do you suggest as an alternative? I added Arnaldo to the cc as he tagged the last released version of pahole (1.21) and may be able to tag a 1.22 with Andrii's fix for pahole included. The most obvious alternative fix for this issue is to require pahole 1.22 to set CONFIG_DEBUG_INFO_BTF but obviously a version 1.22 that works needs to exist first and right now it does not. I'd be ok with this but users of DEBUG_INFO_BTF may object given that it'll be impossible to set the option until there is a release. The second alternative fix is to embed the local_lock within struct per_cpu_pages. It was shown this was possible in https://lore.kernel.org/linux-rt-users/20210419141341.26047-1-mgorman@techsingularity.net/T/#md1001d7af52ac0d6d214b95e98fe051f9399de64 but I dropped it because it makes the locking protocol complex e.g. config-specific lock-switchin in free_unref_page_list. The last one is wrapping local_lock behind #defines and only defining the per-cpu structures when local_lock_t is a non-zero size. That is simply too ugly for words, the locking patterns should always be the same.
On Thu, May 27, 2021 at 10:04:22AM +0100, Mel Gorman wrote: > What do you suggest as an alternative? > > I added Arnaldo to the cc as he tagged the last released version of > pahole (1.21) and may be able to tag a 1.22 with Andrii's fix for pahole > included. > > The most obvious alternative fix for this issue is to require pahole > 1.22 to set CONFIG_DEBUG_INFO_BTF but obviously a version 1.22 that works > needs to exist first and right now it does not. I'd be ok with this but > users of DEBUG_INFO_BTF may object given that it'll be impossible to set > the option until there is a release. Yes, disable BTF. Empty structs are a very useful feature that we use in various places in the kernel. We can't just keep piling hacks over hacks to make that work with a recent fringe feature.
On Thu, May 27, 2021 at 2:19 AM Christoph Hellwig <hch@infradead.org> wrote: > > On Thu, May 27, 2021 at 10:04:22AM +0100, Mel Gorman wrote: > > What do you suggest as an alternative? > > > > I added Arnaldo to the cc as he tagged the last released version of > > pahole (1.21) and may be able to tag a 1.22 with Andrii's fix for pahole > > included. > > > > The most obvious alternative fix for this issue is to require pahole > > 1.22 to set CONFIG_DEBUG_INFO_BTF but obviously a version 1.22 that works > > needs to exist first and right now it does not. I'd be ok with this but > > users of DEBUG_INFO_BTF may object given that it'll be impossible to set > > the option until there is a release. > > Yes, disable BTF. Empty structs are a very useful feature that we use > in various places in the kernel. We can't just keep piling hacks over > hacks to make that work with a recent fringe feature.
On Thu, May 27, 2021 at 7:37 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Thu, May 27, 2021 at 2:19 AM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Thu, May 27, 2021 at 10:04:22AM +0100, Mel Gorman wrote: > > > What do you suggest as an alternative? > > > > > > I added Arnaldo to the cc as he tagged the last released version of > > > pahole (1.21) and may be able to tag a 1.22 with Andrii's fix for pahole > > > included. > > > > > > The most obvious alternative fix for this issue is to require pahole > > > 1.22 to set CONFIG_DEBUG_INFO_BTF but obviously a version 1.22 that works > > > needs to exist first and right now it does not. I'd be ok with this but > > > users of DEBUG_INFO_BTF may object given that it'll be impossible to set > > > the option until there is a release. > > > > Yes, disable BTF. Empty structs are a very useful feature that we use > > in various places in the kernel. We can't just keep piling hacks over > > hacks to make that work with a recent fringe feature. Sorry, I accidentally send out empty response. CONFIG_DEBUG_INFO_BTF is a crucial piece of modern BPF ecosystem. It is enabled by default by most popular Linux distros. So it's hardly a fringe feature and is something that many people and applications depend on. I agree that empty structs are useful, but here we are talking about per-CPU variables only, which is the first use case so far, as far as I can see. If we had pahole 1.22 released and widely packaged it could have been a viable option to force it on everyone. But right now that's not the case. So while ugly, making sure pagesets is non-zero-sized is going to avoid a lot of pain for a lot of people. By the time we need another zero-sized per-CPU var, we might be able to force pahole to 1.22.
From: Andrii Nakryiko > Sent: 27 May 2021 15:42 ... > I agree that empty structs are useful, but here we are talking about > per-CPU variables only, which is the first use case so far, as far as > I can see. If we had pahole 1.22 released and widely packaged it could > have been a viable option to force it on everyone. ... Would it be feasible to put the sources for pahole into the kernel repository and build it at the same time as objtool? That would remove any issues about the latest version not being available. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, May 28, 2021 at 08:09:39AM +0000, David Laight wrote: > From: Andrii Nakryiko > > Sent: 27 May 2021 15:42 > ... > > I agree that empty structs are useful, but here we are talking about > > per-CPU variables only, which is the first use case so far, as far as > > I can see. If we had pahole 1.22 released and widely packaged it could > > have been a viable option to force it on everyone. > ... > > Would it be feasible to put the sources for pahole into the > kernel repository and build it at the same time as objtool? > We don't store other build dependencies like compilers, binutils etc in the kernel repository even though minimum versions are mandated. Obviously tools/ exists but for the most part, they are tools that do not exist in other repositories and are kernel-specific. I don't know if pahole would be accepted and it introduces the possibility that upstream pahole and the kernel fork of it would diverge.
From: Mel Gorman > Sent: 28 May 2021 10:04 > > On Fri, May 28, 2021 at 08:09:39AM +0000, David Laight wrote: > > From: Andrii Nakryiko > > > Sent: 27 May 2021 15:42 > > ... > > > I agree that empty structs are useful, but here we are talking about > > > per-CPU variables only, which is the first use case so far, as far as > > > I can see. If we had pahole 1.22 released and widely packaged it could > > > have been a viable option to force it on everyone. > > ... > > > > Would it be feasible to put the sources for pahole into the > > kernel repository and build it at the same time as objtool? > > We don't store other build dependencies like compilers, binutils etc in > the kernel repository even though minimum versions are mandated. > Obviously tools/ exists but for the most part, they are tools that do > not exist in other repositories and are kernel-specific. I don't know if > pahole would be accepted and it introduces the possibility that upstream > pahole and the kernel fork of it would diverge. The other side of the coin is that is you want reproducible builds the smaller the number of variables that need to match the better. I can see there might be similar issues with the version of libelf-devel (needed by objtool). If I compile anything with gcc 10 (I'm doing build-root builds) I get object files that the hosts 2.30 binutils complain about. I can easily see that updating gcc and binutils might leave a broken objtool unless the required updated libelf-devel package can be found. Statically linking the required parts of libelf into objtool would save any such problems. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, May 28, 2021 at 09:49:28AM +0000, David Laight wrote: > From: Mel Gorman > > Sent: 28 May 2021 10:04 > > > > On Fri, May 28, 2021 at 08:09:39AM +0000, David Laight wrote: > > > From: Andrii Nakryiko > > > > Sent: 27 May 2021 15:42 > > > ... > > > > I agree that empty structs are useful, but here we are talking about > > > > per-CPU variables only, which is the first use case so far, as far as > > > > I can see. If we had pahole 1.22 released and widely packaged it could > > > > have been a viable option to force it on everyone. > > > ... > > > > > > Would it be feasible to put the sources for pahole into the > > > kernel repository and build it at the same time as objtool? > > > > We don't store other build dependencies like compilers, binutils etc in > > the kernel repository even though minimum versions are mandated. > > Obviously tools/ exists but for the most part, they are tools that do > > not exist in other repositories and are kernel-specific. I don't know if > > pahole would be accepted and it introduces the possibility that upstream > > pahole and the kernel fork of it would diverge. > > The other side of the coin is that is you want reproducible builds > the smaller the number of variables that need to match the better. > > I can see there might be similar issues with the version of libelf-devel > (needed by objtool). > If I compile anything with gcc 10 (I'm doing build-root builds) > I get object files that the hosts 2.30 binutils complain about. > I can easily see that updating gcc and binutils might leave a > broken objtool unless the required updated libelf-devel package > can be found. > Statically linking the required parts of libelf into objtool > would save any such problems. Static libraries are not always available. Especially for core toolchain libraries the developers often have some ideas about which of the static and dynamic libraris is the 'correct' one that they like to enforce. Thanks Michal
From: Michal Suchánek > Sent: 28 May 2021 10:57 > > On Fri, May 28, 2021 at 09:49:28AM +0000, David Laight wrote: ... > > I can see there might be similar issues with the version of libelf-devel > > (needed by objtool). > > If I compile anything with gcc 10 (I'm doing build-root builds) > > I get object files that the hosts 2.30 binutils complain about. > > I can easily see that updating gcc and binutils might leave a > > broken objtool unless the required updated libelf-devel package > > can be found. > > Statically linking the required parts of libelf into objtool > > would save any such problems. > > Static libraries are not always available. Especially for core toolchain > libraries the developers often have some ideas about which of the static > and dynamic libraris is the 'correct' one that they like to enforce. The issue is that you want a version of libelf that works with objtool and the versions of binutils/gcc/clang that the kernel build supports. If libelf was part of the binutils package this might be ok. But it isn't and it may end up with people scrambling around to find a working version to build a kernel (or out of tree module). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, May 28, 2021 at 1:09 AM David Laight <David.Laight@aculab.com> wrote: > > From: Andrii Nakryiko > > Sent: 27 May 2021 15:42 > ... > > I agree that empty structs are useful, but here we are talking about > > per-CPU variables only, which is the first use case so far, as far as > > I can see. If we had pahole 1.22 released and widely packaged it could > > have been a viable option to force it on everyone. > ... > > Would it be feasible to put the sources for pahole into the > kernel repository and build it at the same time as objtool? > > That would remove any issues about the latest version > not being available. That would be great for the kernel build, but pahole is more than just a DWARF-to-BTF converter and it has a substantial amount of logic for loading and processing DWARF before it gets converted to BTF. All BTF-related pieces are provided by libbpf, which is already part of kernel sources, so that's not a problem. DWARF processing is a problem and would add a new dependency on libdw-devel, at least. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 678c13967580..f88a155b80a9 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -313,6 +313,9 @@ config DEBUG_INFO_BTF config PAHOLE_HAS_SPLIT_BTF def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119") +config PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT + def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "122") + config DEBUG_INFO_BTF_MODULES def_bool y depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ff8f706839ea..1d56d3de8e08 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -124,6 +124,17 @@ static DEFINE_MUTEX(pcp_batch_high_lock); struct pagesets { local_lock_t lock; +#if defined(CONFIG_DEBUG_INFO_BTF) && \ + !defined(CONFIG_DEBUG_LOCK_ALLOC) && \ + !defined(CONFIG_PAHOLE_HAS_ZEROSIZE_PERCPU_SUPPORT) + /* + * pahole 1.21 and earlier gets confused by zero-sized per-CPU + * variables and produces invalid BTF. Ensure that + * sizeof(struct pagesets) != 0 for older versions of pahole. + */ + char __pahole_hack; + #warning "pahole too old to support zero-sized struct pagesets" +#endif }; static DEFINE_PER_CPU(struct pagesets, pagesets) = { .lock = INIT_LOCAL_LOCK(lock),
Michal Suchanek reported the following problem with linux-next [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8) [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200 ... [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1 [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1 [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1 [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1 [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs [ 26.147816] Freeing unused decrypted memory: 2036K [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K [ 26.165776] Write protecting the kernel read-only data: 26624k [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K [ 26.187031] Run /init as init process [ 26.190693] with arguments: [ 26.193661] /init [ 26.195933] with environment: [ 26.199079] HOME=/ [ 26.201444] TERM=linux [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla [ 26.254154] BPF: type_id=35503 offset=178440 size=4 [ 26.259125] BPF: [ 26.261054] BPF:Invalid offset [ 26.264119] BPF: [ 26.264119] [ 26.267437] failed to validate module [efivarfs] BTF: -22 Andrii Nakryiko bisected the problem to the commit "mm/page_alloc: convert per-cpu list protection to local_lock" currently staged in mmotm. In his own words The immediate problem is two different definitions of numa_node per-cpu variable. They both are at the same offset within .data..percpu ELF section, they both have the same name, but one of them is marked as static and another as global. And one is int variable, while another is struct pagesets. I'll look some more tomorrow, but adding Jiri and Arnaldo for visibility. [110907] DATASEC '.data..percpu' size=178904 vlen=303 ... type_id=27753 offset=163976 size=4 (VAR 'numa_node') type_id=27754 offset=163976 size=4 (VAR 'numa_node') [27753] VAR 'numa_node' type_id=27556, linkage=static [27754] VAR 'numa_node' type_id=20, linkage=global [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [27556] STRUCT 'pagesets' size=0 vlen=1 'lock' type_id=507 bits_offset=0 [506] STRUCT '(anon)' size=0 vlen=0 [507] TYPEDEF 'local_lock_t' type_id=506 The patch in question introduces a zero-sized per-cpu struct and while this is not wrong, versions of pahole prior to 1.22 (unreleased) get confused during BTF generation with two separate variables occupying the same address. This patch checks for older versions of pahole and forces struct pagesets to be non-zero sized as a workaround when CONFIG_DEBUG_INFO_BTF is set. A warning is omitted so that distributions can update pahole when 1.22 is released. Reported-by: Michal Suchanek <msuchanek@suse.de> Reported-by: Hritik Vijay <hritikxx8@gmail.com> Debugged-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> --- lib/Kconfig.debug | 3 +++ mm/page_alloc.c | 11 +++++++++++ 2 files changed, 14 insertions(+)