[v3,0/5] arch_topology: Build cacheinfo from primary CPU

Message ID 20221209103130.572196-1-pierre.gondois@arm.com (mailing list archive)

Message

Pierre Gondois Dec. 9, 2022, 10:31 a.m. UTC
v2:
 - Applied renaming/formatting comments from v1.
 - Check CACHE_TYPE_VALID flag in pptt.c.
v3:
 - Applied Sudeep's suggestions (for patch 5/5):
   - Renaming allocate_cache_info() -> fetch_cache_info()
   - Updated error message
   - Extract an inline allocate_cache_info() function
 - Re-run checkpatch with --strict option

Note:
This patchset requires the following patch to be applied first in
order to avoid the same bug described in the commit message:
https://lore.kernel.org/all/20221116094958.2141072-1-pierre.gondois@arm.com/

[1] and [2] build the CPU topology from the cacheinfo information for
both DT/ACPI based systems and remove (struct cpu_topology).llc_id
which was used by ACPI only.

Creating the cacheinfo for secondary CPUs is done during early boot.
Preemption and interrupts are disabled at this stage. On PREEMPT_RT
kernels, allocating memory (and parsing the PPTT table for ACPI based
systems) triggers a:
  'BUG: sleeping function called from invalid context' [4]

To prevent this bug, allocate the cacheinfo from the primary CPU when
preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on the arm64 CLIDR_EL1 register.
If no cache information is found in the DT/ACPI PPTT, then fall back
to the current state, which still triggers [4] on PREEMPT_RT kernels.
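
The ordering described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the code of the
series: every name in it (model_alloc(), prealloc_cacheinfo(),
detect_cacheinfo(), boot_model()) is hypothetical, calloc() stands in for a
kernel allocation that may sleep, and a counter replaces the BUG splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4	/* arbitrary CPU count for the model */

struct cache_leaf {
	unsigned int level;
};

struct cpu_cacheinfo_model {
	unsigned int num_leaves;
	struct cache_leaf *leaves;	/* NULL until allocated */
};

static struct cpu_cacheinfo_model ci[NR_CPUS];
static bool atomic_context;		/* models preemption/IRQs disabled */
static unsigned int atomic_allocs;	/* allocations made while atomic */

/* Stand-in for an allocation that may sleep: count misuse, do not BUG. */
static void *model_alloc(size_t n, size_t size)
{
	if (atomic_context)
		atomic_allocs++;
	return calloc(n, size);
}

/*
 * Runs on the primary CPU in preemptible context. fw_leaves models the
 * leaf count read from DT/PPTT; 0 models "no cache information found".
 */
static int prealloc_cacheinfo(unsigned int cpu, unsigned int fw_leaves)
{
	assert(cpu < NR_CPUS);
	if (fw_leaves == 0)
		return -1;
	ci[cpu].leaves = model_alloc(fw_leaves, sizeof(*ci[cpu].leaves));
	if (!ci[cpu].leaves)
		return -1;
	ci[cpu].num_leaves = fw_leaves;
	return 0;
}

/* Per-CPU detection: allocates only if nothing was preallocated. */
static int detect_cacheinfo(unsigned int cpu, unsigned int hw_leaves,
			    bool atomic)
{
	assert(cpu < NR_CPUS);
	atomic_context = atomic;
	if (!ci[cpu].leaves) {
		ci[cpu].leaves = model_alloc(hw_leaves, sizeof(*ci[cpu].leaves));
		ci[cpu].num_leaves = ci[cpu].leaves ? hw_leaves : 0;
	}
	atomic_context = false;
	return ci[cpu].leaves ? 0 : -1;
}

/* Model one boot; returns the number of allocations made while atomic. */
static unsigned int boot_model(unsigned int fw_leaves, unsigned int hw_leaves)
{
	unsigned int cpu;

	atomic_allocs = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		free(ci[cpu].leaves);
		ci[cpu].leaves = NULL;
		ci[cpu].num_leaves = 0;
	}

	/* Primary CPU allocates for every CPU before secondaries boot. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		prealloc_cacheinfo(cpu, fw_leaves);

	detect_cacheinfo(0, hw_leaves, false);		/* primary CPU */
	for (cpu = 1; cpu < NR_CPUS; cpu++)
		detect_cacheinfo(cpu, hw_leaves, true);	/* secondary CPUs */

	return atomic_allocs;
}
```

In this model, boot_model(3, 3) returns 0 because every allocation happens
from the primary CPU, while boot_model(0, 3) returns NR_CPUS - 1: with no
firmware cache description, each secondary CPU allocates in atomic context,
which corresponds to the fallback case that still hits [4].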

Patches to update the arm64 device trees that have incomplete cacheinfo
(mostly for missing the 'cache-level' or 'cache-unified' property)
have been sent at [3].

Tested platforms:
- ACPI + PPTT: Ampere Altra, Ampere eMAG, Cavium ThunderX2,
  Kunpeng 920, Juno-r2
- DT: rb5, db845c, Juno-r2

[1] https://lore.kernel.org/all/20220704101605.1318280-1-sudeep.holla@arm.com/
[2] https://lore.kernel.org/all/20220720-arch_topo_fixes-v3-0-43d696288e84@arm.com/
[3] https://lore.kernel.org/all/20221107155825.1644604-1-pierre.gondois@arm.com/
[4] On an Ampere Altra, with PREEMPT_RT kernel based on v6.0.0-rc4:


[    7.560791] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
[    7.560794] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
[    7.560796] preempt_count: 1, expected: 0
[    7.560797] RCU nest depth: 1, expected: 1
[    7.560799] 3 locks held by swapper/111/0:
[    7.560800]  #0: ffff403e406cae98 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
[    7.560811]  #1: ffffc5f8ed09f8e8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
[    7.560820]  #2: ffff403f400b4fd8 (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
[    7.560824] irq event stamp: 0
[    7.560825] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    7.560827] hardirqs last disabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[    7.560830] softirqs last  enabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[    7.560833] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    7.560834] Preemption disabled at:
[    7.560835] [<ffffc5f8e9fd3c28>] migrate_enable+0x30/0x130
[    7.560838] CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-[...]
[    7.560841] Call trace:
[...]
[    7.560870]  __kmalloc+0xbc/0x1e8
[    7.560873]  detect_cache_attributes+0x2d4/0x5f0
[    7.560876]  update_siblings_masks+0x30/0x368
[    7.560880]  store_cpu_topology+0x78/0xb8
[    7.560883]  secondary_start_kernel+0xd0/0x198
[    7.560885]  __secondary_switched+0xb0/0xb4

Pierre Gondois (5):
  cacheinfo: Use RISC-V's init_cache_level() as generic OF
    implementation
  cacheinfo: Return error code in init_of_cache_level()
  ACPI: PPTT: Remove acpi_find_cache_levels()
  ACPI: PPTT: Update acpi_find_last_cache_level() to
    acpi_get_cache_info()
  arch_topology: Build cacheinfo from primary CPU

 arch/arm64/kernel/cacheinfo.c |  11 ++--
 arch/riscv/kernel/cacheinfo.c |  39 +----------
 drivers/acpi/pptt.c           |  93 +++++++++++++++-----------
 drivers/base/arch_topology.c  |  12 +++-
 drivers/base/cacheinfo.c      | 119 +++++++++++++++++++++++++++++-----
 include/linux/cacheinfo.h     |  11 +++-
 6 files changed, 182 insertions(+), 103 deletions(-)

Comments

Krzysztof Kozlowski Dec. 29, 2022, 4:47 p.m. UTC | #1
On 09/12/2022 11:31, Pierre Gondois wrote:
> v2:
>  - Applied renaming/formatting comments from v1.
>  - Check CACHE_TYPE_VALID flag in pptt.c.
> v3:
>  - Applied Sudeep's suggestions (for patch 5/5):
>    - Renaming allocate_cache_info() -> fetch_cache_info()
>    - Updated error message
>    - Extract an inline allocate_cache_info() function
>  - Re-run checkpatch with --strict option
> 
> Note:
> This patchset requires the following patch to be applied first in
> order to avoid the same bug described in the commit message:
> https://lore.kernel.org/all/20221116094958.2141072-1-pierre.gondois@arm.com/
> 
> [1] and [2] build the CPU topology from the cacheinfo information for
> both DT/ACPI based systems and remove (struct cpu_topology).llc_id
> which was used by ACPI only.
> 
> Creating the cacheinfo for secondary CPUs is done during early boot.
> Preemption and interrupts are disabled at this stage. On PREEMPT_RT
> kernels, allocating memory (and parsing the PPTT table for ACPI based
> systems) triggers a:
>   'BUG: sleeping function called from invalid context' [4]
> 
> To prevent this bug, allocate the cacheinfo from the primary CPU when
> preemption and interrupts are enabled and before booting secondary
> CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
> only, without relying on the arm64 CLIDR_EL1 register.
> If no cache information is found in the DT/ACPI PPTT, then fall back
> to the current state, which still triggers [4] on PREEMPT_RT kernels.
> 
> Patches to update the arm64 device trees that have incomplete cacheinfo
> (mostly for missing the 'cache-level' or 'cache-unified' property)
> have been sent at [3].
> 
> Tested platforms:
> - ACPI + PPTT: Ampere Altra, Ampere eMAG, Cavium ThunderX2,
>   Kunpeng 920, Juno-r2
> - DT: rb5, db845c, Juno-r2
> 

I gave the patchset a try with DTS fixes for cache topology on Qualcomm
RB5 board (SM8250 SoC) and with KASAN it produces:

BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x84/0x15c
[    0.633014]  dump_backtrace.part.0+0xe0/0xf0
[    0.633035]  show_stack+0x18/0x40
[    0.633050]  dump_stack_lvl+0x8c/0xb8
[    0.633085]  print_report+0x188/0x488
[    0.633106]  kasan_report+0xac/0xf0
[    0.633136]  __asan_store4+0x80/0xa4
[    0.633158]  populate_cache_leaves+0x84/0x15c
[    0.633181]  detect_cache_attributes+0xc0/0x8c4
[    0.633213]  update_siblings_masks+0x28/0x43c
[    0.633235]  store_cpu_topology+0x98/0xc0
[    0.633251]  smp_prepare_cpus+0x2c/0x15c
[    0.633281]  kernel_init_freeable+0x22c/0x424
[    0.633310]  kernel_init+0x24/0x13c
[    0.633328]  ret_from_fork+0x10/0x20
[    0.633388]
[    0.708729] Allocated by task 1:
[    0.712078]  kasan_save_stack+0x2c/0x60
[    0.716066]  kasan_set_track+0x2c/0x40
[    0.719959]  kasan_save_alloc_info+0x24/0x3c
[    0.724387]  __kasan_kmalloc+0xa0/0xbc
[    0.728278]  __kmalloc+0x74/0x110
[    0.731740]  fetch_cache_info+0x170/0x210
[    0.735902]  init_cpu_topology+0x254/0x2bc
[    0.740171]  smp_prepare_cpus+0x20/0x15c
[    0.744272]  kernel_init_freeable+0x22c/0x424
[    0.748791]  kernel_init+0x24/0x13c
[    0.752420]  ret_from_fork+0x10/0x20

Best regards,
Krzysztof

[    0.000000] arch_timer: cp15 and mmio timer(s) running at 19.20MHz (virt/virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
[    0.000001] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
[    0.005051] Console: colour dummy device 80x25
[    0.478980] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.478992] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.479002] ... MAX_LOCK_DEPTH:          48
[    0.479011] ... MAX_LOCKDEP_KEYS:        8192
[    0.479019] ... CLASSHASH_SIZE:          4096
[    0.479027] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.479035] ... MAX_LOCKDEP_CHAINS:      65536
[    0.479043] ... CHAINHASH_SIZE:          32768
[    0.479052]  memory used by lock dependency info: 6365 kB
[    0.479061]  memory used for stack traces: 4224 kB
[    0.479069]  per task-struct memory footprint: 1920 bytes
[    0.479976] Calibrating delay loop (skipped), value calculated using timer frequency.. 38.40 BogoMIPS (lpj=19200)
[    0.480007] pid_max: default: 32768 minimum: 301
[    0.482256] LSM: Security Framework initializing
[    0.484629] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.484692] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.518295] ==================================================================
[    0.617001] BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x84/0x15c
[    0.624489] Write of size 4 at addr ffff0d83400366c8 by task swapper/0/1
[    0.631400]
[    0.632973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rt5-00372-ga6339d0b4e8e #45
[    0.632995] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[    0.633006] Call trace:
[    0.633014]  dump_backtrace.part.0+0xe0/0xf0
[    0.633035]  show_stack+0x18/0x40
[    0.633050]  dump_stack_lvl+0x8c/0xb8
[    0.633085]  print_report+0x188/0x488
[    0.633106]  kasan_report+0xac/0xf0
[    0.633136]  __asan_store4+0x80/0xa4
[    0.633158]  populate_cache_leaves+0x84/0x15c
[    0.633181]  detect_cache_attributes+0xc0/0x8c4
[    0.633213]  update_siblings_masks+0x28/0x43c
[    0.633235]  store_cpu_topology+0x98/0xc0
[    0.633251]  smp_prepare_cpus+0x2c/0x15c
[    0.633281]  kernel_init_freeable+0x22c/0x424
[    0.633310]  kernel_init+0x24/0x13c
[    0.633328]  ret_from_fork+0x10/0x20
[    0.633388]
[    0.708729] Allocated by task 1:
[    0.712078]  kasan_save_stack+0x2c/0x60
[    0.716066]  kasan_set_track+0x2c/0x40
[    0.719959]  kasan_save_alloc_info+0x24/0x3c
[    0.724387]  __kasan_kmalloc+0xa0/0xbc
[    0.728278]  __kmalloc+0x74/0x110
[    0.731740]  fetch_cache_info+0x170/0x210
[    0.735902]  init_cpu_topology+0x254/0x2bc
[    0.740171]  smp_prepare_cpus+0x20/0x15c
[    0.744272]  kernel_init_freeable+0x22c/0x424
[    0.748791]  kernel_init+0x24/0x13c
[    0.752420]  ret_from_fork+0x10/0x20
[    0.756131]
[    0.757726] The buggy address belongs to the object at ffff0d8340036600
[    0.757726]  which belongs to the cache kmalloc-256 of size 256
[    0.770607] The buggy address is located 200 bytes inside of
[    0.770607]  256-byte region [ffff0d8340036600, ffff0d8340036700)
[    0.782690]
[    0.784256] The buggy address belongs to the physical page:
[    0.790008] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100034
[    0.799686] head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
[    0.807405] flags: 0x800000000010200(slab|head|node=0|zone=2)
[    0.813365] raw: 0800000000010200 0000000000000000 dead000000000122 ffff0d8340002480
[    0.821349] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[    0.829364] page dumped because: kasan: bad access detected
[    0.835117]
[    0.836679] Memory state around the buggy address:
[    0.841639]  ffff0d8340036580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.849085]  ffff0d8340036600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.856562] >ffff0d8340036680: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[    0.864005]                                               ^
[    0.869760]  ffff0d8340036700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.877206]  ffff0d8340036780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.884650] ==================================================================
[    0.892100] Disabling lock debugging due to kernel taint
[    0.904200] cblist_init_generic: Setting adjustable number of callback queues.
[    0.904241] cblist_init_generic: Setting shift to 3 and lim to 1.
[    0.918718] cblist_init_generic: Setting shift to 3 and lim to 1.
[    0.925721] Running RCU-tasks wait API self tests
[    1.036108] rcu: Hierarchical SRCU implementation.
[    1.036118] rcu:     Max phase no-delay instances is 400.
[    1.036756] printk: bootconsole [qcom_geni0] printing thread started
[    1.048808] Callback from call_rcu_tasks_trace() invoked.
[    1.081711] EFI services will not be available.
[    1.091529] smp: Bringing up secondary CPUs ...
[    1.103585] Detected VIPT I-cache on CPU1
[    1.103747] GICv3: CPU1: found redistributor 100 region 0:0x0000000017a80000
[    1.103830] CPU1: Booted secondary processor 0x0000000100 [0x51df805e]
[    1.129767] Detected VIPT I-cache on CPU2
[    1.129911] GICv3: CPU2: found redistributor 200 region 0:0x0000000017aa0000
[    1.129979] CPU2: Booted secondary processor 0x0000000200 [0x51df805e]
[    1.155742] Detected VIPT I-cache on CPU3
[    1.155876] GICv3: CPU3: found redistributor 300 region 0:0x0000000017ac0000
[    1.155936] CPU3: Booted secondary processor 0x0000000300 [0x51df805e]
[    1.182791] CPU features: detected: Spectre-v4
[    1.182824] CPU features: detected: Spectre-BHB
[    1.182854] CPU features: detected: ARM erratum 1508412 (kernel portion)
[    1.182900] Detected PIPT I-cache on CPU4
[    1.183187] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000
[    1.183294] CPU4: Booted secondary processor 0x0000000400 [0x411fd0d0]
[    1.225640] Detected PIPT I-cache on CPU5
[    1.226000] GICv3: CPU5: found redistributor 500 region 0:0x0000000017b00000
[    1.226099] CPU5: Booted secondary processor 0x0000000500 [0x411fd0d0]
[    1.252358] Detected PIPT I-cache on CPU6
[    1.252722] GICv3: CPU6: found redistributor 600 region 0:0x0000000017b20000
[    1.252821] CPU6: Booted secondary processor 0x0000000600 [0x411fd0d0]
[    1.266024] Callback from call_rcu_tasks() invoked.
[    1.284303] Detected PIPT I-cache on CPU7
[    1.284475] GICv3: CPU7: found redistributor 700 region 0:0x0000000017b40000
[    1.284525] CPU7: Booted secondary processor 0x0000000700 [0x411fd0d0]
[    1.284926] smp: Brought up 1 node, 8 CPUs
[    1.284943] SMP: Total of 8 processors activated.
[    1.284954] CPU features: detected: 32-bit EL0 Support
[    1.284963] CPU features: detected: Data cache clean to the PoU not required for I/D coherence
[    1.284975] CPU features: detected: Common not Private translations
[    1.284985] CPU features: detected: CRC32 instructions
[    1.284999] CPU features: detected: RCpc load-acquire (LDAPR)
[    1.285009] CPU features: detected: LSE atomic instructions
[    1.285019] CPU features: detected: Privileged Access Never
[    1.285029] CPU features: detected: RAS Extension Support
[    1.285044] CPU features: detected: Speculative Store Bypassing Safe (SSBS)
[    1.296826] CPU: All CPU(s) started at EL1
[    1.296891] alternatives: applying system-wide alternatives
[    1.315641] devtmpfs: initialized
[    1.670123] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    1.670327] futex hash table entries: 2048 (order: 6, 393216 bytes, linear)
[    1.673537] pinctrl core: initialized pinctrl subsystem
[    1.684021] DMI not present or invalid.
[    1.687165] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    1.695595] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[    1.696373] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    1.698379] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    1.698742] audit: initializing netlink subsys (disabled)
[    1.699628] audit: type=2000 audit(1.628:1): state=initialized audit_enabled=0 res=1
[    1.708543] thermal_sys: Registered thermal governor 'step_wise'
[    1.708570] thermal_sys: Registered thermal governor 'power_allocator'
[    1.708961] cpuidle: using governor ladder
[    1.709035] cpuidle: using governor menu
[    1.709849] NET: Registered PF_QIPCRTR protocol family
[    1.711345] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    1.719297] ASID allocator initialised with 32768 entries
[    1.733197] Serial: AMBA PL011 UART driver
[    1.855339] platform 1d87000.phy: Fixing up cyclic dependency with 1d84000.ufshc
[    2.041279] KASLR enabled
[    2.174612] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[    2.174626] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
[    2.174634] HugeTLB: registered 32.0 MiB page size, pre-allocated 0 pages
[    2.174640] HugeTLB: 508 KiB vmemmap can be freed for a 32.0 MiB page
[    2.174647] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[    2.174652] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[    2.174660] HugeTLB: registered 64.0 KiB page size, pre-allocated 0 pages
[    2.174665] HugeTLB: 0 KiB vmemmap can be freed for a 64.0 KiB page
[    2.188583] ACPI: Interpreter disabled.
[    2.306645] iommu: Default domain type: Translated
[    2.306660] iommu: DMA domain TLB invalidation policy: strict mode
[    2.310343] SCSI subsystem initialized
[    2.313611] usbcore: registered new interface driver usbfs
[    2.313987] usbcore: registered new interface driver hub
[    2.314314] usbcore: registered new device driver usb
[    2.321138] pps_core: LinuxPPS API ver. 1 registered
[    2.321144] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    2.321246] PTP clock support registered
[    2.321452] EDAC MC: Ver: 3.0.0
[    2.324853] CPUidle PSCI: Initialized CPU PM domain topology
[    2.327564] qcom_scm: convention: smc arm 64
[    2.335862] FPGA manager framework
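
The numbers in the KASAN report above can be cross-checked with a short
calculation. The sketch below assumes the generic KASAN shadow encoding as
commonly documented: one shadow byte per 8 bytes of memory, 00 for a fully
addressable granule, 01..07 for a partially addressable one, and larger
values such as fc for a redzone. The helper names and the report_shadow[]
array are hypothetical. The shadow bytes are copied from the rows starting
at ffff0d8340036600 and ffff0d8340036680.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KASAN_GRANULE 8u	/* assumed: 1 shadow byte covers 8 bytes */

/* Shadow bytes of the 256-byte slab object, as printed in the report. */
static const uint8_t report_shadow[32] = {
	/* row ffff0d8340036600 */
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	/* row ffff0d8340036680 */
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc,
};

/* Bytes addressable from the object start up to the first redzone. */
static size_t addressable_bytes(const uint8_t *shadow, size_t n)
{
	size_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (shadow[i] == 0x00) {
			bytes += KASAN_GRANULE;		/* full granule */
		} else if (shadow[i] < KASAN_GRANULE) {
			bytes += shadow[i];		/* partial granule */
			break;
		} else {
			break;				/* redzone */
		}
	}
	return bytes;
}

/* True if an access of 'size' bytes at 'offset' stays in bounds. */
static int access_in_bounds(size_t offset, size_t size, size_t addressable)
{
	return offset + size <= addressable;
}

/* Offset of the faulting write inside the object, from the report. */
static size_t report_offset(void)
{
	return (size_t)(0xffff0d83400366c8ULL - 0xffff0d8340036600ULL);
}
```

Under these assumptions, addressable_bytes(report_shadow, 32) gives 192 and
report_offset() gives 200, so the 4-byte write covers offsets 200..203 and
lies past the 192 addressable bytes of the kmalloc-256 object. This agrees
with the "located 200 bytes inside of 256-byte region" line.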

Pierre Gondois Jan. 4, 2023, 6:35 p.m. UTC | #2
On 12/29/22 17:47, Krzysztof Kozlowski wrote:
> On 09/12/2022 11:31, Pierre Gondois wrote:
>> v2:
>>   - Applied renaming/formatting comments from v1.
>>   - Check CACHE_TYPE_VALID flag in pptt.c.
>> v3:
>>   - Applied Sudeep's suggestions (for patch 5/5):
>>     - Renaming allocate_cache_info() -> fetch_cache_info()
>>     - Updated error message
>>     - Extract an inline allocate_cache_info() function
>>   - Re-run checkpatch with --strict option
>>
>> Note:
>> This patchset requires the following patch to be applied first in
>> order to avoid the same bug described in the commit message:
>> https://lore.kernel.org/all/20221116094958.2141072-1-pierre.gondois@arm.com/
>>
>> [1] and [2] build the CPU topology from the cacheinfo information for
>> both DT/ACPI based systems and remove (struct cpu_topology).llc_id
>> which was used by ACPI only.
>>
>> Creating the cacheinfo for secondary CPUs is done during early boot.
>> Preemption and interrupts are disabled at this stage. On PREEMPT_RT
>> kernels, allocating memory (and parsing the PPTT table for ACPI based
>> systems) triggers a:
>>    'BUG: sleeping function called from invalid context' [4]
>>
>> To prevent this bug, allocate the cacheinfo from the primary CPU when
>> preemption and interrupts are enabled and before booting secondary
>> CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
>> only, without relying on the arm64 CLIDR_EL1 register.
>> If no cache information is found in the DT/ACPI PPTT, then fall back
>> to the current state, which still triggers [4] on PREEMPT_RT kernels.
>>
>> Patches to update the arm64 device trees that have incomplete cacheinfo
>> (mostly for missing the 'cache-level' or 'cache-unified' property)
>> have been sent at [3].
>>
>> Tested platforms:
>> - ACPI + PPTT: Ampere Altra, Ampere eMAG, Cavium ThunderX2,
>>    Kunpeng 920, Juno-r2
>> - DT: rb5, db845c, Juno-r2
>>
> 
> I gave the patchset a try with DTS fixes for cache topology on Qualcomm
> RB5 board (SM8250 SoC) and with KASAN it produces:
> 
> BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x84/0x15c
> [    0.633014]  dump_backtrace.part.0+0xe0/0xf0
> [    0.633035]  show_stack+0x18/0x40
> [    0.633050]  dump_stack_lvl+0x8c/0xb8
> [    0.633085]  print_report+0x188/0x488
> [    0.633106]  kasan_report+0xac/0xf0
> [    0.633136]  __asan_store4+0x80/0xa4
> [    0.633158]  populate_cache_leaves+0x84/0x15c
> [    0.633181]  detect_cache_attributes+0xc0/0x8c4
> [    0.633213]  update_siblings_masks+0x28/0x43c
> [    0.633235]  store_cpu_topology+0x98/0xc0
> [    0.633251]  smp_prepare_cpus+0x2c/0x15c
> [    0.633281]  kernel_init_freeable+0x22c/0x424
> [    0.633310]  kernel_init+0x24/0x13c
> [    0.633328]  ret_from_fork+0x10/0x20
> [    0.633388]
> [    0.708729] Allocated by task 1:
> [    0.712078]  kasan_save_stack+0x2c/0x60
> [    0.716066]  kasan_set_track+0x2c/0x40
> [    0.719959]  kasan_save_alloc_info+0x24/0x3c
> [    0.724387]  __kasan_kmalloc+0xa0/0xbc
> [    0.728278]  __kmalloc+0x74/0x110
> [    0.731740]  fetch_cache_info+0x170/0x210
> [    0.735902]  init_cpu_topology+0x254/0x2bc
> [    0.740171]  smp_prepare_cpus+0x20/0x15c
> [    0.744272]  kernel_init_freeable+0x22c/0x424
> [    0.748791]  kernel_init+0x24/0x13c
> [    0.752420]  ret_from_fork+0x10/0x20
> 
> Best regards,
> Krzysztof

Hello Krzysztof,
Thanks for trying the patchset and reporting the issue. v4 should
hopefully solve this:
https://lore.kernel.org/all/20230104183033.755668-1-pierre.gondois@arm.com/

I will also try to follow the corresponding DTS changes.

Regards,
Pierre