Message ID | 9d52e6d338a059618d894abb0764015043330c2b.1714727227.git.dsimic@manjaro.org (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: dts: allwinner: Add cache information to the SoC dtsi for H616 |
On Fri, 3 May 2024 11:09:41 +0200
Dragan Simic <dsimic@manjaro.org> wrote:

> Add missing cache information to the Allwinner H616 SoC dtsi, to allow
> the userspace, which includes lscpu(1) that uses the virtual files provided
> by the kernel under the /sys/devices/system/cpu directory, to display the
> proper H616 cache information.
>
> Adding the cache information to the H616 SoC dtsi also makes the following
> warning message in the kernel log go away:
>
>   cacheinfo: Unable to detect cache hierarchy for CPU 0
>
> Rather conspicuously, almost no cache-related information is available in
> the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
> manual (version 1.0). Thus, the cache parameters for the H616 SoC dtsi were
> obtained and derived by hand from the cache size and layout specifications
> found in the following technical reference manual, and from the cache size
> and die revision hints available from the following community-provided data
> and memory subsystem benchmarks:
>
>   - ARM Cortex-A53 revision r0p4 TRM, version J
>   - Summary of the two available H616 die revisions and their differences
>     in cache sizes observed from the CCSIDR_EL1 register readouts, provided
>     by Andre Przywara [1][2]
>   - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
>     provided by Thomas Kaiser [3]
>
> For future reference, here's a brief summary of the available documentation
> and the community-provided data and memory subsystem benchmarks:
>
>   - All caches employ the 64-byte cache line length
>   - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
>     cache and 32 KB of L1 4-way, set-associative data cache
>   - The size of the L2 cache depends on the actual H616 die revision (there
>     are two die revisions), so the entire SoC can have either 256 KB or 1 MB
>     of unified L2 16-way, set-associative cache [1]
>
> Also for future reference, here's the relevant excerpt from the
> community-provided H616 memory subsystem benchmark, [3] which confirms
> that 32 KB and 256 KB are the L1 data and L2 cache sizes, respectively:
>
>     block size : single random read / dual random read
>           1024 :    0.0 ns          /     0.0 ns
>           2048 :    0.0 ns          /     0.0 ns
>           4096 :    0.0 ns          /     0.0 ns
>           8192 :    0.0 ns          /     0.0 ns
>          16384 :    0.0 ns          /     0.0 ns
>          32768 :    0.0 ns          /     0.0 ns
>          65536 :    4.3 ns          /     7.3 ns
>         131072 :    6.6 ns          /    10.5 ns
>         262144 :    9.8 ns          /    15.2 ns
>         524288 :   91.8 ns          /   142.9 ns
>        1048576 :  138.6 ns          /   188.3 ns
>        2097152 :  163.0 ns          /   204.8 ns
>        4194304 :  178.8 ns          /   213.5 ns
>        8388608 :  187.1 ns          /   217.9 ns
>       16777216 :  192.2 ns          /   220.9 ns
>       33554432 :  196.5 ns          /   224.0 ns
>       67108864 :  215.7 ns          /   259.5 ns

Thanks for dumping the elaborate information here!

> The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
> the L2 cache size. As outlined by Andre Przywara, [2] a follow-up TF-A patch
> will perform runtime adjustment of the device tree data, making the correct
> L2 cache size of 1 MB present in the device tree for the boards based on the
> revision of H616 that actually provides 1 MB of L2 cache.

I pushed that TF-A patch for review now:
https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/28694/1
On my OrangePi Zero3 (with a 1 MB H618 SoC) the size and number of sets
get adjusted to describe 1 MB:
=> fdt list /cpus/l2-cache
l2-cache {
	compatible = "cache";
	cache-level = <0x00000002>;
	cache-unified;
	cache-size = <0x00100000>;
	cache-line-size = <0x00000040>;
	cache-sets = <0x00000400>;
	phandle = <0x00000003>;
};

> [1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
> [2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
> [3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt
>
> Suggested-by: Andre Przywara <andre.przywara@arm.com>
> Helped-by: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>

So I can confirm that the information above is correct, and also matches
the DT properties added below.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks!
Andre

> ---
>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> index b2e85e52d1a1..4faed88d8909 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> @@ -26,30 +26,67 @@ cpu0: cpu@0 {
>  			reg = <0>;
>  			enable-method = "psci";
>  			clocks = <&ccu CLK_CPUX>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <256>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			next-level-cache = <&l2_cache>;
>  		};
>
>  		cpu1: cpu@1 {
>  			compatible = "arm,cortex-a53";
>  			device_type = "cpu";
>  			reg = <1>;
>  			enable-method = "psci";
>  			clocks = <&ccu CLK_CPUX>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <256>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			next-level-cache = <&l2_cache>;
>  		};
>
>  		cpu2: cpu@2 {
>  			compatible = "arm,cortex-a53";
>  			device_type = "cpu";
>  			reg = <2>;
>  			enable-method = "psci";
>  			clocks = <&ccu CLK_CPUX>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <256>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			next-level-cache = <&l2_cache>;
>  		};
>
>  		cpu3: cpu@3 {
>  			compatible = "arm,cortex-a53";
>  			device_type = "cpu";
>  			reg = <3>;
>  			enable-method = "psci";
>  			clocks = <&ccu CLK_CPUX>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <256>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			next-level-cache = <&l2_cache>;
> +		};
> +
> +		l2_cache: l2-cache {
> +			compatible = "cache";
> +			cache-level = <2>;
> +			cache-unified;
> +			cache-size = <0x40000>;
> +			cache-line-size = <64>;
> +			cache-sets = <256>;
>  		};
>  	};
>
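
The die-revision difference discussed in this message comes from CCSIDR_EL1 register readouts. As a minimal sketch of how such a readout maps to the cache-size and cache-sets values shown above, the following assumes the classic 32-bit CCSIDR_EL1 field layout (LineSize in bits [2:0], Associativity in bits [12:3], NumSets in bits [27:13], that is, without FEAT_CCIDX). The function names are illustrative, and the register values are constructed from the geometry rather than taken from real hardware readouts.

```python
def decode_ccsidr(value: int) -> dict:
    """Decode the geometry fields of a 32-bit CCSIDR_EL1 value.

    Assumes the layout without FEAT_CCIDX; bits [31:28] are ignored.
    """
    line_size = 1 << ((value & 0x7) + 4)   # LineSize: log2(bytes per line) - 4
    ways = ((value >> 3) & 0x3FF) + 1      # Associativity: ways - 1
    sets = ((value >> 13) & 0x7FFF) + 1    # NumSets: sets - 1
    return {
        "line_size": line_size,
        "ways": ways,
        "sets": sets,
        "size": line_size * ways * sets,
    }


def encode_geometry(line_size: int, ways: int, sets: int) -> int:
    """Build the geometry fields of a CCSIDR_EL1 value (hypothetical helper)."""
    log2_line = line_size.bit_length() - 1  # line_size must be a power of two
    return ((sets - 1) << 13) | ((ways - 1) << 3) | (log2_line - 4)


if __name__ == "__main__":
    # Constructed values for the two L2 configurations named in the thread.
    for sets in (256, 1024):
        raw = encode_geometry(line_size=64, ways=16, sets=sets)
        info = decode_ccsidr(raw)
        print(f"0x{raw:08x}: size=0x{info['size']:x} sets=0x{info['sets']:x}")
```

With 16 ways and 64-byte lines, 1024 sets decode to 0x100000 bytes, which agrees with the cache-size and cache-sets values in the `fdt list` output above.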
Hello Andre,

On 2024-05-08 13:05, Andre Przywara wrote:
> On Fri, 3 May 2024 11:09:41 +0200
> Dragan Simic <dsimic@manjaro.org> wrote:
>
>> Add missing cache information to the Allwinner H616 SoC dtsi, to allow
>> the userspace, which includes lscpu(1) that uses the virtual files provided
>> by the kernel under the /sys/devices/system/cpu directory, to display the
>> proper H616 cache information.
>>
>> Adding the cache information to the H616 SoC dtsi also makes the following
>> warning message in the kernel log go away:
>>
>>   cacheinfo: Unable to detect cache hierarchy for CPU 0
>>
>> Rather conspicuously, almost no cache-related information is available in
>> the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
>> manual (version 1.0). Thus, the cache parameters for the H616 SoC dtsi were
>> obtained and derived by hand from the cache size and layout specifications
>> found in the following technical reference manual, and from the cache size
>> and die revision hints available from the following community-provided data
>> and memory subsystem benchmarks:
>>
>>   - ARM Cortex-A53 revision r0p4 TRM, version J
>>   - Summary of the two available H616 die revisions and their differences
>>     in cache sizes observed from the CCSIDR_EL1 register readouts, provided
>>     by Andre Przywara [1][2]
>>   - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
>>     provided by Thomas Kaiser [3]
>>
>> For future reference, here's a brief summary of the available documentation
>> and the community-provided data and memory subsystem benchmarks:
>>
>>   - All caches employ the 64-byte cache line length
>>   - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
>>     cache and 32 KB of L1 4-way, set-associative data cache
>>   - The size of the L2 cache depends on the actual H616 die revision (there
>>     are two die revisions), so the entire SoC can have either 256 KB or 1 MB
>>     of unified L2 16-way, set-associative cache [1]
>>
>> Also for future reference, here's the relevant excerpt from the
>> community-provided H616 memory subsystem benchmark, [3] which confirms
>> that 32 KB and 256 KB are the L1 data and L2 cache sizes, respectively:
>>
>>     block size : single random read / dual random read
>>           1024 :    0.0 ns          /     0.0 ns
>>           2048 :    0.0 ns          /     0.0 ns
>>           4096 :    0.0 ns          /     0.0 ns
>>           8192 :    0.0 ns          /     0.0 ns
>>          16384 :    0.0 ns          /     0.0 ns
>>          32768 :    0.0 ns          /     0.0 ns
>>          65536 :    4.3 ns          /     7.3 ns
>>         131072 :    6.6 ns          /    10.5 ns
>>         262144 :    9.8 ns          /    15.2 ns
>>         524288 :   91.8 ns          /   142.9 ns
>>        1048576 :  138.6 ns          /   188.3 ns
>>        2097152 :  163.0 ns          /   204.8 ns
>>        4194304 :  178.8 ns          /   213.5 ns
>>        8388608 :  187.1 ns          /   217.9 ns
>>       16777216 :  192.2 ns          /   220.9 ns
>>       33554432 :  196.5 ns          /   224.0 ns
>>       67108864 :  215.7 ns          /   259.5 ns
>
> Thanks for dumping the elaborate information here!

You're welcome! :)  I like it when patch descriptions provide as much
relevant information as possible, so I always try to do that myself.

>> The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
>> the L2 cache size. As outlined by Andre Przywara, [2] a follow-up TF-A patch
>> will perform runtime adjustment of the device tree data, making the correct
>> L2 cache size of 1 MB present in the device tree for the boards based on the
>> revision of H616 that actually provides 1 MB of L2 cache.
>
> I pushed that TF-A patch for review now:
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/28694/1
> On my OrangePi Zero3 (with a 1 MB H618 SoC) the size and number of sets
> get adjusted to describe 1 MB:
> => fdt list /cpus/l2-cache
> l2-cache {
> 	compatible = "cache";
> 	cache-level = <0x00000002>;
> 	cache-unified;
> 	cache-size = <0x00100000>;
> 	cache-line-size = <0x00000040>;
> 	cache-sets = <0x00000400>;
> 	phandle = <0x00000003>;
> };

Awesome, thanks for the follow-up TF-A patch!  I'll keep an eye on
your TF-A patch submission.

>> [1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
>> [2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
>> [3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt
>>
>> Suggested-by: Andre Przywara <andre.przywara@arm.com>
>> Helped-by: Andre Przywara <andre.przywara@arm.com>
>> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
>
> So I can confirm that the information above is correct, and also matches
> the DT properties added below.
>
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks!

>> ---
>>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
>>  1 file changed, 37 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> index b2e85e52d1a1..4faed88d8909 100644
>> --- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> @@ -26,30 +26,67 @@ cpu0: cpu@0 {
>>  			reg = <0>;
>>  			enable-method = "psci";
>>  			clocks = <&ccu CLK_CPUX>;
>> +			i-cache-size = <0x8000>;
>> +			i-cache-line-size = <64>;
>> +			i-cache-sets = <256>;
>> +			d-cache-size = <0x8000>;
>> +			d-cache-line-size = <64>;
>> +			d-cache-sets = <128>;
>> +			next-level-cache = <&l2_cache>;
>>  		};
>>
>>  		cpu1: cpu@1 {
>>  			compatible = "arm,cortex-a53";
>>  			device_type = "cpu";
>>  			reg = <1>;
>>  			enable-method = "psci";
>>  			clocks = <&ccu CLK_CPUX>;
>> +			i-cache-size = <0x8000>;
>> +			i-cache-line-size = <64>;
>> +			i-cache-sets = <256>;
>> +			d-cache-size = <0x8000>;
>> +			d-cache-line-size = <64>;
>> +			d-cache-sets = <128>;
>> +			next-level-cache = <&l2_cache>;
>>  		};
>>
>>  		cpu2: cpu@2 {
>>  			compatible = "arm,cortex-a53";
>>  			device_type = "cpu";
>>  			reg = <2>;
>>  			enable-method = "psci";
>>  			clocks = <&ccu CLK_CPUX>;
>> +			i-cache-size = <0x8000>;
>> +			i-cache-line-size = <64>;
>> +			i-cache-sets = <256>;
>> +			d-cache-size = <0x8000>;
>> +			d-cache-line-size = <64>;
>> +			d-cache-sets = <128>;
>> +			next-level-cache = <&l2_cache>;
>>  		};
>>
>>  		cpu3: cpu@3 {
>>  			compatible = "arm,cortex-a53";
>>  			device_type = "cpu";
>>  			reg = <3>;
>>  			enable-method = "psci";
>>  			clocks = <&ccu CLK_CPUX>;
>> +			i-cache-size = <0x8000>;
>> +			i-cache-line-size = <64>;
>> +			i-cache-sets = <256>;
>> +			d-cache-size = <0x8000>;
>> +			d-cache-line-size = <64>;
>> +			d-cache-sets = <128>;
>> +			next-level-cache = <&l2_cache>;
>> +		};
>> +
>> +		l2_cache: l2-cache {
>> +			compatible = "cache";
>> +			cache-level = <2>;
>> +			cache-unified;
>> +			cache-size = <0x40000>;
>> +			cache-line-size = <64>;
>> +			cache-sets = <256>;
>>  		};
>>  	};
>>
On Fri, May 3, 2024 at 5:09 PM Dragan Simic <dsimic@manjaro.org> wrote:
>
> Add missing cache information to the Allwinner H616 SoC dtsi, to allow
> the userspace, which includes lscpu(1) that uses the virtual files provided
> by the kernel under the /sys/devices/system/cpu directory, to display the
> proper H616 cache information.
>
> Adding the cache information to the H616 SoC dtsi also makes the following
> warning message in the kernel log go away:
>
>   cacheinfo: Unable to detect cache hierarchy for CPU 0
>
> Rather conspicuously, almost no cache-related information is available in
> the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
> manual (version 1.0). Thus, the cache parameters for the H616 SoC dtsi were
> obtained and derived by hand from the cache size and layout specifications
> found in the following technical reference manual, and from the cache size
> and die revision hints available from the following community-provided data
> and memory subsystem benchmarks:
>
>   - ARM Cortex-A53 revision r0p4 TRM, version J
>   - Summary of the two available H616 die revisions and their differences
>     in cache sizes observed from the CCSIDR_EL1 register readouts, provided
>     by Andre Przywara [1][2]
>   - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
>     provided by Thomas Kaiser [3]
>
> For future reference, here's a brief summary of the available documentation
> and the community-provided data and memory subsystem benchmarks:
>
>   - All caches employ the 64-byte cache line length
>   - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
>     cache and 32 KB of L1 4-way, set-associative data cache
>   - The size of the L2 cache depends on the actual H616 die revision (there
>     are two die revisions), so the entire SoC can have either 256 KB or 1 MB
>     of unified L2 16-way, set-associative cache [1]
>
> Also for future reference, here's the relevant excerpt from the
> community-provided H616 memory subsystem benchmark, [3] which confirms
> that 32 KB and 256 KB are the L1 data and L2 cache sizes, respectively:
>
>     block size : single random read / dual random read
>           1024 :    0.0 ns          /     0.0 ns
>           2048 :    0.0 ns          /     0.0 ns
>           4096 :    0.0 ns          /     0.0 ns
>           8192 :    0.0 ns          /     0.0 ns
>          16384 :    0.0 ns          /     0.0 ns
>          32768 :    0.0 ns          /     0.0 ns
>          65536 :    4.3 ns          /     7.3 ns
>         131072 :    6.6 ns          /    10.5 ns
>         262144 :    9.8 ns          /    15.2 ns
>         524288 :   91.8 ns          /   142.9 ns
>        1048576 :  138.6 ns          /   188.3 ns
>        2097152 :  163.0 ns          /   204.8 ns
>        4194304 :  178.8 ns          /   213.5 ns
>        8388608 :  187.1 ns          /   217.9 ns
>       16777216 :  192.2 ns          /   220.9 ns
>       33554432 :  196.5 ns          /   224.0 ns
>       67108864 :  215.7 ns          /   259.5 ns
>
> The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
> the L2 cache size. As outlined by Andre Przywara, [2] a follow-up TF-A patch
> will perform runtime adjustment of the device tree data, making the correct
> L2 cache size of 1 MB present in the device tree for the boards based on the
> revision of H616 that actually provides 1 MB of L2 cache.
>
> [1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
> [2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
> [3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt
>
> Suggested-by: Andre Przywara <andre.przywara@arm.com>
> Helped-by: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> index b2e85e52d1a1..4faed88d8909 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
> @@ -26,30 +26,67 @@ cpu0: cpu@0 {
>  			reg = <0>;
>  			enable-method = "psci";
>  			clocks = <&ccu CLK_CPUX>;
> +			i-cache-size = <0x8000>;
> +			i-cache-line-size = <64>;
> +			i-cache-sets = <256>;
> +			d-cache-size = <0x8000>;
> +			d-cache-line-size = <64>;
> +			d-cache-sets = <128>;
> +			next-level-cache = <&l2_cache>;

This no longer applies due to the CPU DVFS stuff getting merged.
Can you rebase and resend?

Thanks
ChenYu
On 2024-05-28 18:08, Chen-Yu Tsai wrote:
> On Fri, May 3, 2024 at 5:09 PM Dragan Simic <dsimic@manjaro.org> wrote:
>>
>> Add missing cache information to the Allwinner H616 SoC dtsi, to allow
>> the userspace, which includes lscpu(1) that uses the virtual files provided
>> by the kernel under the /sys/devices/system/cpu directory, to display the
>> proper H616 cache information.
>>
>> Adding the cache information to the H616 SoC dtsi also makes the following
>> warning message in the kernel log go away:
>>
>>   cacheinfo: Unable to detect cache hierarchy for CPU 0
>>
>> Rather conspicuously, almost no cache-related information is available in
>> the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
>> manual (version 1.0). Thus, the cache parameters for the H616 SoC dtsi were
>> obtained and derived by hand from the cache size and layout specifications
>> found in the following technical reference manual, and from the cache size
>> and die revision hints available from the following community-provided data
>> and memory subsystem benchmarks:
>>
>>   - ARM Cortex-A53 revision r0p4 TRM, version J
>>   - Summary of the two available H616 die revisions and their differences
>>     in cache sizes observed from the CCSIDR_EL1 register readouts, provided
>>     by Andre Przywara [1][2]
>>   - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
>>     provided by Thomas Kaiser [3]
>>
>> For future reference, here's a brief summary of the available documentation
>> and the community-provided data and memory subsystem benchmarks:
>>
>>   - All caches employ the 64-byte cache line length
>>   - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
>>     cache and 32 KB of L1 4-way, set-associative data cache
>>   - The size of the L2 cache depends on the actual H616 die revision (there
>>     are two die revisions), so the entire SoC can have either 256 KB or 1 MB
>>     of unified L2 16-way, set-associative cache [1]
>>
>> Also for future reference, here's the relevant excerpt from the
>> community-provided H616 memory subsystem benchmark, [3] which confirms
>> that 32 KB and 256 KB are the L1 data and L2 cache sizes, respectively:
>>
>>     block size : single random read / dual random read
>>           1024 :    0.0 ns          /     0.0 ns
>>           2048 :    0.0 ns          /     0.0 ns
>>           4096 :    0.0 ns          /     0.0 ns
>>           8192 :    0.0 ns          /     0.0 ns
>>          16384 :    0.0 ns          /     0.0 ns
>>          32768 :    0.0 ns          /     0.0 ns
>>          65536 :    4.3 ns          /     7.3 ns
>>         131072 :    6.6 ns          /    10.5 ns
>>         262144 :    9.8 ns          /    15.2 ns
>>         524288 :   91.8 ns          /   142.9 ns
>>        1048576 :  138.6 ns          /   188.3 ns
>>        2097152 :  163.0 ns          /   204.8 ns
>>        4194304 :  178.8 ns          /   213.5 ns
>>        8388608 :  187.1 ns          /   217.9 ns
>>       16777216 :  192.2 ns          /   220.9 ns
>>       33554432 :  196.5 ns          /   224.0 ns
>>       67108864 :  215.7 ns          /   259.5 ns
>>
>> The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
>> the L2 cache size. As outlined by Andre Przywara, [2] a follow-up TF-A patch
>> will perform runtime adjustment of the device tree data, making the correct
>> L2 cache size of 1 MB present in the device tree for the boards based on the
>> revision of H616 that actually provides 1 MB of L2 cache.
>>
>> [1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
>> [2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
>> [3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt
>>
>> Suggested-by: Andre Przywara <andre.przywara@arm.com>
>> Helped-by: Andre Przywara <andre.przywara@arm.com>
>> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
>> ---
>>  .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
>>  1 file changed, 37 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> index b2e85e52d1a1..4faed88d8909 100644
>> --- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
>> @@ -26,30 +26,67 @@ cpu0: cpu@0 {
>>  			reg = <0>;
>>  			enable-method = "psci";
>>  			clocks = <&ccu CLK_CPUX>;
>> +			i-cache-size = <0x8000>;
>> +			i-cache-line-size = <64>;
>> +			i-cache-sets = <256>;
>> +			d-cache-size = <0x8000>;
>> +			d-cache-line-size = <64>;
>> +			d-cache-sets = <128>;
>> +			next-level-cache = <&l2_cache>;
>
> This no longer applies due to the CPU DVFS stuff getting merged.
> Can you rebase and resend?

Sure, just sent the rebased version as the v2. [1]

[1] https://lore.kernel.org/linux-sunxi/e4b9cc3e3d366a571e552c31dafa5de847bc1c12.1716914537.git.dsimic@manjaro.org/
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
index b2e85e52d1a1..4faed88d8909 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
@@ -26,30 +26,67 @@ cpu0: cpu@0 {
 			reg = <0>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};

 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <1>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};

 		cpu2: cpu@2 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <2>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};

 		cpu3: cpu@3 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <3>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
+		};
+
+		l2_cache: l2-cache {
+			compatible = "cache";
+			cache-level = <2>;
+			cache-unified;
+			cache-size = <0x40000>;
+			cache-line-size = <64>;
+			cache-sets = <256>;
 		};
 	};
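
The `*-cache-sets` values in the diff above follow from the sizes, associativities and the 64-byte line length given in the patch description, since a set-associative cache holds sets × ways × line size bytes. A minimal sketch of that arithmetic, with an illustrative helper name that is not part of the patch or the kernel:

```python
def cache_sets(size_bytes: int, ways: int, line_size: int = 64) -> int:
    """Number of sets in a set-associative cache: size / (ways * line size)."""
    sets, remainder = divmod(size_bytes, ways * line_size)
    if remainder:
        raise ValueError("cache size is not a multiple of ways * line size")
    return sets


if __name__ == "__main__":
    # (label, size in bytes, associativity) as stated in the patch description
    configs = [
        ("L1 instruction (32 KB, 2-way)", 0x8000, 2),
        ("L1 data (32 KB, 4-way)", 0x8000, 4),
        ("L2 unified (256 KB, 16-way)", 0x40000, 16),
        ("L2 unified (1 MB, 16-way)", 0x100000, 16),
    ]
    for label, size, ways in configs:
        print(f"{label}: {cache_sets(size, ways)} sets")
```

The first three results match `i-cache-sets`, `d-cache-sets` and `cache-sets` in the diff; the last one matches the value of 0x400 that the TF-A adjustment produces for the 1 MB die revision.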
Add missing cache information to the Allwinner H616 SoC dtsi, to allow
the userspace, which includes lscpu(1) that uses the virtual files provided
by the kernel under the /sys/devices/system/cpu directory, to display the
proper H616 cache information.

Adding the cache information to the H616 SoC dtsi also makes the following
warning message in the kernel log go away:

  cacheinfo: Unable to detect cache hierarchy for CPU 0

Rather conspicuously, almost no cache-related information is available in
the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
manual (version 1.0). Thus, the cache parameters for the H616 SoC dtsi were
obtained and derived by hand from the cache size and layout specifications
found in the following technical reference manual, and from the cache size
and die revision hints available from the following community-provided data
and memory subsystem benchmarks:

  - ARM Cortex-A53 revision r0p4 TRM, version J
  - Summary of the two available H616 die revisions and their differences
    in cache sizes observed from the CCSIDR_EL1 register readouts, provided
    by Andre Przywara [1][2]
  - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
    provided by Thomas Kaiser [3]

For future reference, here's a brief summary of the available documentation
and the community-provided data and memory subsystem benchmarks:

  - All caches employ the 64-byte cache line length
  - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
    cache and 32 KB of L1 4-way, set-associative data cache
  - The size of the L2 cache depends on the actual H616 die revision (there
    are two die revisions), so the entire SoC can have either 256 KB or 1 MB
    of unified L2 16-way, set-associative cache [1]

Also for future reference, here's the relevant excerpt from the
community-provided H616 memory subsystem benchmark, [3] which confirms
that 32 KB and 256 KB are the L1 data and L2 cache sizes, respectively:

    block size : single random read / dual random read
          1024 :    0.0 ns          /     0.0 ns
          2048 :    0.0 ns          /     0.0 ns
          4096 :    0.0 ns          /     0.0 ns
          8192 :    0.0 ns          /     0.0 ns
         16384 :    0.0 ns          /     0.0 ns
         32768 :    0.0 ns          /     0.0 ns
         65536 :    4.3 ns          /     7.3 ns
        131072 :    6.6 ns          /    10.5 ns
        262144 :    9.8 ns          /    15.2 ns
        524288 :   91.8 ns          /   142.9 ns
       1048576 :  138.6 ns          /   188.3 ns
       2097152 :  163.0 ns          /   204.8 ns
       4194304 :  178.8 ns          /   213.5 ns
       8388608 :  187.1 ns          /   217.9 ns
      16777216 :  192.2 ns          /   220.9 ns
      33554432 :  196.5 ns          /   224.0 ns
      67108864 :  215.7 ns          /   259.5 ns

The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
the L2 cache size. As outlined by Andre Przywara, [2] a follow-up TF-A patch
will perform runtime adjustment of the device tree data, making the correct
L2 cache size of 1 MB present in the device tree for the boards based on the
revision of H616 that actually provides 1 MB of L2 cache.

[1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
[2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
[3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt

Suggested-by: Andre Przywara <andre.przywara@arm.com>
Helped-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---
 .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
 1 file changed, 37 insertions(+)
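
The patch description reads the L1 data and L2 sizes off the benchmark excerpt by eye. One way to make that reading explicit is sketched below, using the single random read column from the excerpt. The heuristic is an assumption of this sketch, not part of tinymembench: the L1 boundary is taken as the largest block size that still reports 0.0 ns, and the L2 boundary as the block size just before the largest relative latency jump.

```python
# (block size in bytes, single random read latency in ns), from the excerpt
SAMPLES = [
    (1024, 0.0), (2048, 0.0), (4096, 0.0), (8192, 0.0),
    (16384, 0.0), (32768, 0.0), (65536, 4.3), (131072, 6.6),
    (262144, 9.8), (524288, 91.8), (1048576, 138.6), (2097152, 163.0),
    (4194304, 178.8), (8388608, 187.1), (16777216, 192.2),
    (33554432, 196.5), (67108864, 215.7),
]


def l1_boundary(samples):
    """Largest block size that still reports no measurable latency."""
    return max(size for size, ns in samples if ns == 0.0)


def l2_boundary(samples):
    """Block size just before the largest relative latency jump.

    Only non-zero entries are compared, and the samples are assumed to be
    sorted by ascending block size.
    """
    nonzero = [(size, ns) for size, ns in samples if ns > 0.0]
    best_size, best_ratio = None, 0.0
    for (size, ns), (_, next_ns) in zip(nonzero, nonzero[1:]):
        ratio = next_ns / ns
        if ratio > best_ratio:
            best_size, best_ratio = size, ratio
    return best_size


if __name__ == "__main__":
    print("L1 data boundary:", l1_boundary(SAMPLES))  # 32768
    print("L2 boundary:", l2_boundary(SAMPLES))       # 262144
```

The largest jump is from 9.8 ns at 262144 bytes to 91.8 ns at 524288 bytes, which is why the excerpt points to 256 KB of L2 cache on the benchmarked board.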