Message ID | 20240305151336.144707-1-tp@osasysteme.de (mailing list archive) |
---|---|
State | Under Review |
Delegated to: | Geert Uytterhoeven |
Series | arm64: dts: r9a07g044: Add complete CPU cache information |
Hi Tim Pambor,

Thanks for the patch.

> -----Original Message-----
> From: Tim Pambor <tp@osasysteme.de>
> Sent: Tuesday, March 5, 2024 3:14 PM
> Subject: [PATCH] arm64: dts: r9a07g044: Add complete CPU cache information
>
> Based on ARM Cortex-A55 TRM and RZG2/L user's manual, each Cortex-A55 has

RZ/G2L

> - 32 KB of L1 4-way, set-associative instruction cache
> - 32 KB of L1 4-way, set-associative data cache
>
> Each cache has a cache line length of 64B and therefore there are
> 32768B/(4 * 64B)=128 sets for each cache.
>
> RZG2/L are not configured with the optional per-core L2 cache but only have an L3 cache shared among all

RZ/G2L

> cores. In this case, the L3 cache appears as an L2 cache to the system. Therefore, specify
> "cache-level = <2>" for the L3 cache.

You mean that the L3 cache gets cache-level = <2> if there is no L2 cache on the system?
Does this need an update to the dt-bindings to make it clear?

Cheers,
Biju

>
> Signed-off-by: Tim Pambor <tp@osasysteme.de>
> ---
>  arch/arm64/boot/dts/renesas/r9a07g044.dtsi | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/boot/dts/renesas/r9a07g044.dtsi b/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
> index 9f00b75d2bd0..6379c850526a 100644
> --- a/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
> +++ b/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
> @@ -89,6 +89,12 @@ cpu0: cpu@0 {
>  			reg = <0>;
>  			device_type = "cpu";
>  			#cooling-cells = <2>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <128>;
>  			next-level-cache = <&L3_CA55>;
>  			enable-method = "psci";
>  			clocks = <&cpg CPG_CORE R9A07G044_CLK_I>;
> @@ -99,6 +105,12 @@ cpu1: cpu@100 {
>  			compatible = "arm,cortex-a55";
>  			reg = <0x100>;
>  			device_type = "cpu";
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <128>;
>  			next-level-cache = <&L3_CA55>;
>  			enable-method = "psci";
>  			clocks = <&cpg CPG_CORE R9A07G044_CLK_I>;
> @@ -109,7 +121,7 @@ L3_CA55: cache-controller-0 {
>  			compatible = "cache";
>  			cache-unified;
>  			cache-size = <0x40000>;
> -			cache-level = <3>;
> +			cache-level = <2>;
>  		};
>  	};
>
> --
> 2.43.0
>
Hi Biju,

thanks for the review.

> Hi Tim Pambor,
>
> Thanks for the patch.
>
> > -----Original Message-----
> > From: Tim Pambor <tp@osasysteme.de>
> > Sent: Tuesday, March 5, 2024 3:14 PM
> > Subject: [PATCH] arm64: dts: r9a07g044: Add complete CPU cache
> > information
> >
> > Based on ARM Cortex-A55 TRM and RZG2/L user's manual, each Cortex-A55 has
>
> RZ/G2L
>
> > - 32 KB of L1 4-way, set-associative instruction cache
> > - 32 KB of L1 4-way, set-associative data cache
> >
> > Each cache has a cache line length of 64B and therefore there are
> > 32768B/(4 * 64B)=128 sets for each cache.
> >
> > RZG2/L are not configured with the optional per-core L2 cache but
> > only have an L3 cache shared among all
>
> RZ/G2L
>
> > cores. In this case, the L3 cache appears as an L2 cache to the
> > system. Therefore, specify "cache-level = <2>" for the L3 cache.

I will send a v2 with the commit message corrected.

> You mean for L3 Cache, cache-level = <2> if there is no L2 Cache on
> the system? Does it need any update on dt-bindings to make this clear?

I followed the approach chosen for the Rockchip RK356x, which also has a
Cortex-A55 with an L3 cache but no L2 cache [1]. I can add a comment to the
device tree explaining that there is no L2 cache and that the L3 cache
therefore appears as an L2 cache to the system. Do you consider that
sufficient?

Currently, having cache-level = <3> also causes an out-of-bounds access in
populate_cache_leaves.

[    0.066217] ==================================================================
[    0.066369] BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x25c/0x2d0
[    0.066495] Write of size 4 at addr ffff0000082370dc by task swapper/0/1
[    0.066580]
[    0.066619] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc2-00016-g30d5a685c65d #6
[    0.066719] Hardware name: MYC-YG2LX (DT)
[    0.066793] Call trace:
[    0.066836]  dump_backtrace+0x98/0x118
[    0.066900]  show_stack+0x18/0x24
[    0.066959]  dump_stack_lvl+0x60/0xac
[    0.067029]  print_report+0xf8/0x5d8
[    0.067096]  kasan_report+0xc0/0x100
[    0.067159]  __asan_report_store4_noabort+0x20/0x2c
[    0.067235]  populate_cache_leaves+0x25c/0x2d0
[    0.067308]  detect_cache_attributes+0x34c/0x1998
[    0.067384]  update_siblings_masks+0x30/0x554
[    0.067460]  store_cpu_topology+0xe8/0x188
[    0.067528]  smp_prepare_cpus+0x5c/0x238
[    0.067602]  kernel_init_freeable+0x258/0xb18
[    0.067673]  kernel_init+0x30/0x208
[    0.067736]  ret_from_fork+0x10/0x20
[    0.067802]
[    0.067835] Allocated by task 1:
[    0.067889]  kasan_save_stack+0x3c/0x64
[    0.067956]  kasan_save_track+0x20/0x3c
[    0.068020]  kasan_save_alloc_info+0x68/0x78
[    0.068090]  __kasan_kmalloc+0xd4/0xd8
[    0.068154]  __kmalloc+0x1c0/0x430
[    0.068215]  allocate_cache_info+0xa8/0x204
[    0.068284]  fetch_cache_info+0xc4/0x200
[    0.068349]  init_cpu_topology+0x348/0x45c
[    0.068423]  smp_prepare_cpus+0x1c/0x238
[    0.068492]  kernel_init_freeable+0x258/0xb18
[    0.068561]  kernel_init+0x30/0x208
[    0.068622]  ret_from_fork+0x10/0x20
[    0.068685]
[    0.068719] The buggy address belongs to the object at ffff000008237000
[    0.068719]  which belongs to the cache kmalloc-256 of size 256
[    0.068849] The buggy address is located 4 bytes to the right of
[    0.068849]  allocated 216-byte region [ffff000008237000, ffff0000082370d8)
[    0.068984]
[    0.069018] The buggy address belongs to the physical page:
[    0.069089] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x48236
[    0.069201] head:(____ptrval____) order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[    0.069297] flags: 0x840(slab|head|zone=0)
[    0.069366] page_type: 0xffffffff()
[    0.069430] raw: 0000000000000840 ffff000008001b40 dead000000000122 0000000000000000
[    0.069526] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[    0.069616] page dumped because: kasan: bad access detected
[    0.069684]
[    0.069717] Memory state around the buggy address:
[    0.069781]  ffff000008236f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.069870]  ffff000008237000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.069958] >ffff000008237080: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
[    0.070042]                                                     ^
[    0.070116]  ffff000008237100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.070204]  ffff000008237180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[    0.070290] ==================================================================

[1] https://lore.kernel.org/linux-rockchip/2285ee41e165813011220f9469e28697923aa6e0.1709491108.git.dsimic@manjaro.org/

>
> Cheers,
> Biju
>
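As a reading aid for the KASAN report in Tim's message: the "4 bytes to the right of allocated 216-byte region" figure can be recomputed from the object start, the allocation size, and the faulting address printed in the log. The C sketch below only redoes that arithmetic; the function name `bytes_past_end` is hypothetical and is not a kernel or KASAN API.

```c
#include <stdint.h>

/*
 * Distance in bytes from the end of an allocated region to a faulting
 * address that lies past that end. Illustrative helper only; the name
 * is an assumption made for this example.
 */
uint64_t bytes_past_end(uint64_t object_start, uint64_t alloc_size,
                        uint64_t bad_addr)
{
    uint64_t region_end = object_start + alloc_size; /* exclusive end */

    return bad_addr - region_end;
}
```

With the values from the report (object at ffff000008237000, 216 bytes allocated, write at ffff0000082370dc), the region ends at ffff0000082370d8 and the write starts 4 bytes beyond it, which matches the log.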
Hi Tim Pambor,

> -----Original Message-----
> From: Tim Pambor <tp@osasysteme.de>
> Sent: Wednesday, March 6, 2024 10:11 AM
> Subject: Re: [PATCH] arm64: dts: r9a07g044: Add complete CPU cache information
>
> Hi Biju,
>
> thanks for the review.
>
> > Hi Tim Pambor,
> >
> > Thanks for the patch.
> >
> > > -----Original Message-----
> > > From: Tim Pambor <tp@osasysteme.de>
> > > Sent: Tuesday, March 5, 2024 3:14 PM
> > > Subject: [PATCH] arm64: dts: r9a07g044: Add complete CPU cache
> > > information
> > >
> > > Based on ARM Cortex-A55 TRM and RZG2/L user's manual, each Cortex-A55 has
> >
> > RZ/G2L
> >
> > > - 32 KB of L1 4-way, set-associative instruction cache
> > > - 32 KB of L1 4-way, set-associative data cache
> > >
> > > Each cache has a cache line length of 64B and therefore there are
> > > 32768B/(4 * 64B)=128 sets for each cache.
> > >
> > > RZG2/L are not configured with the optional per-core L2 cache but
> > > only have an L3 cache shared among all
> >
> > RZ/G2L
> >
> > > cores. In this case, the L3 cache appears as an L2 cache to the
> > > system. Therefore, specify "cache-level = <2>" for the L3 cache.
>
> I will send a v2 with the commit message corrected.
>
> > You mean for L3 Cache, cache-level = <2> if there is no L2 Cache on
> > the system? Does it need any update on dt-bindings to make this clear?
>
> I followed the approach chosen for the Rockchip RK356x, which also has a Cortex-A55 with an L3 cache
> but no L2 cache [1]. I can add a comment to the device tree explaining that there is no L2 cache and
> that the L3 cache therefore appears as an L2 cache to the system. Do you consider that sufficient?

I am leaving this to Geert and the other DT maintainers for their view on this topic.

Cheers,
Biju

>
> Currently, having cache-level = <3> also causes an out-of-bounds access in populate_cache_leaves.
>
> [...]
>
> [1] https://lore.kernel.org/linux-rockchip/2285ee41e165813011220f9469e28697923aa6e0.1709491108.git.dsimic@manjaro.org/
Hello Tim and Biju,

On 2024-03-06 11:10, Tim Pambor wrote:
> thanks for the review.
>
>> Thanks for the patch.
>>
>> > -----Original Message-----
>> > From: Tim Pambor <tp@osasysteme.de>
>> > Sent: Tuesday, March 5, 2024 3:14 PM
>> > Subject: [PATCH] arm64: dts: r9a07g044: Add complete CPU cache
>> > information
>> >
>> > Based on ARM Cortex-A55 TRM and RZG2/L user's manual, each Cortex-A55 has
>>
>> RZ/G2L
>>
>> > - 32 KB of L1 4-way, set-associative instruction cache
>> > - 32 KB of L1 4-way, set-associative data cache

Just a small suggestion: moving "L1" after "set-associative" would make
these two lines a bit more readable.

>> > Each cache has a cache line length of 64B and therefore there are
>> > 32768B/(4 * 64B)=128 sets for each cache.
>> >
>> > RZG2/L are not configured with the optional per-core L2 cache but
>> > only have an L3 cache shared among all
>>
>> RZ/G2L
>>
>> > cores. In this case, the L3 cache appears as an L2 cache to the
>> > system. Therefore, specify "cache-level = <2>" for the L3 cache.
>
> I will send a v2 with the commit message corrected.
>
>> You mean for L3 Cache, cache-level = <2> if there is no L2 Cache on
>> the system? Does it need any update on dt-bindings to make this clear?
>
> I followed the approach chosen for the Rockchip RK356x, which also has
> a Cortex-A55 with an L3 cache but no L2 cache [1]. I can add a comment
> to the device tree explaining that there is no L2 cache and that the
> L3 cache therefore appears as an L2 cache to the system. Do you
> consider that sufficient?

As I noted in the description of the Rockchip RK356x patch linked above,
the documentation from ARM states that the CPU cores see the shared L3
cache as their L2 cache when a particular SoC that employs the DynamIQ
architecture has no private, per-core L2 caches.

Thus, it's best to follow the same approach in the kernel and present the
same logical cache hierarchy to userspace. That avoids confusing users,
who would almost surely wonder why there is an L3 cache but no L2 caches.
It should also avoid unexpected behavior and bugs such as the one visible
below, because having an L3 cache with no L2 caches isn't very common.

> Currently, having cache-level = <3> also causes an out-of-bounds access
> in populate_cache_leaves.
>
> [    0.066369] BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x25c/0x2d0
> [    0.066495] Write of size 4 at addr ffff0000082370dc by task swapper/0/1
> [...]
>
> [1] https://lore.kernel.org/linux-rockchip/2285ee41e165813011220f9469e28697923aa6e0.1709491108.git.dsimic@manjaro.org/
diff --git a/arch/arm64/boot/dts/renesas/r9a07g044.dtsi b/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
index 9f00b75d2bd0..6379c850526a 100644
--- a/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
+++ b/arch/arm64/boot/dts/renesas/r9a07g044.dtsi
@@ -89,6 +89,12 @@ cpu0: cpu@0 {
 			reg = <0>;
 			device_type = "cpu";
 			#cooling-cells = <2>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <128>;
 			next-level-cache = <&L3_CA55>;
 			enable-method = "psci";
 			clocks = <&cpg CPG_CORE R9A07G044_CLK_I>;
@@ -99,6 +105,12 @@ cpu1: cpu@100 {
 			compatible = "arm,cortex-a55";
 			reg = <0x100>;
 			device_type = "cpu";
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <128>;
 			next-level-cache = <&L3_CA55>;
 			enable-method = "psci";
 			clocks = <&cpg CPG_CORE R9A07G044_CLK_I>;
@@ -109,7 +121,7 @@ L3_CA55: cache-controller-0 {
 			compatible = "cache";
 			cache-unified;
 			cache-size = <0x40000>;
-			cache-level = <3>;
+			cache-level = <2>;
 		};
 	};
 
Based on ARM Cortex-A55 TRM and RZG2/L user's manual, each Cortex-A55 has
- 32 KB of L1 4-way, set-associative instruction cache
- 32 KB of L1 4-way, set-associative data cache

Each cache has a cache line length of 64B and therefore there are
32768B/(4 * 64B)=128 sets for each cache.

RZG2/L are not configured with the optional per-core L2 cache but only have
an L3 cache shared among all cores. In this case, the L3 cache appears as an
L2 cache to the system. Therefore, specify "cache-level = <2>" for the L3
cache.

Signed-off-by: Tim Pambor <tp@osasysteme.de>
---
 arch/arm64/boot/dts/renesas/r9a07g044.dtsi | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
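
The set count in the commit message follows from the usual relation for a set-associative cache: sets = size / (ways * line size). The C sketch below is only an illustration of that arithmetic using the figures stated in the commit message (32 KB, 4-way, 64-byte lines); the function names `cache_sets` and `check_l1_sets` are made up for this example and are not part of the patch or of any kernel interface.

```c
#include <assert.h>

/*
 * Number of sets in a set-associative cache.
 * Returns 0 for degenerate input instead of dividing by zero.
 */
unsigned int cache_sets(unsigned int size_bytes, unsigned int ways,
                        unsigned int line_bytes)
{
    if (ways == 0 || line_bytes == 0)
        return 0;

    return size_bytes / (ways * line_bytes);
}

/* Self-check against the L1 figures quoted in the commit message. */
void check_l1_sets(void)
{
    /* 0x8000 B / (4 ways * 64 B) = 128, matching d-cache-sets / i-cache-sets */
    assert(cache_sets(0x8000, 4, 64) == 128);
}
```

The same helper applies to both the instruction and the data cache, since the commit message gives them identical size, associativity, and line length.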