Message ID | 20230524083153.2046084-1-s.hauer@pengutronix.de (mailing list archive) |
---|---|
Headers | show |
Series | Add perf support to the rockchip-dfi driver | expand |
Hi, On Wed, May 24, 2023 at 10:31:28AM +0200, Sascha Hauer wrote: > This is v5 of the series adding perf support to the rockchip DFI driver. > > A lot has changed in the perf driver since v4. First of all the review > feedback from Robin and Jonathan has been integrated. The perf driver > now not only supports monitoring the total DDR utilization, but also the > individual channels. I also reworked the way the raw 32bit counter > values are summed up to 64bit perf values, so hopefully the code is > easier to follow now. > > lockdep found out that that locking in the perf driver was broken, so I > reworked that as well. None of the perf hooks allows locking with > mutexes or spinlocks, so in perf it's not possible to enable the DFI > controller when needed. Instead I now unconditionally enable the DFI > controller during probe when perf is enabled. > > Furthermore the hrtimer I use for reading out the hardware counter > values before they overflow race with perf. Now a seqlock is used to > prevent that. > > The RK3588 device tree changes for the DFI were not part of v4. As > Vincent Legoll showed interest in testing this series the necessary > device tree changes are now part of this series. I tested the series on RK3588 EVB1. The read/write byts looks sensible. Sometimes cycles reads unrealistic values, though: Performance counter stats for 'system wide': 18446744070475110400 rockchip_ddr/cycles/ 828.63 MB rockchip_ddr/read-bytes/ 207.19 MB rockchip_ddr/read-bytes0/ 207.15 MB rockchip_ddr/read-bytes1/ 207.14 MB rockchip_ddr/read-bytes2/ 207.15 MB rockchip_ddr/read-bytes3/ 1.48 MB rockchip_ddr/write-bytes/ 0.37 MB rockchip_ddr/write-bytes0/ 0.37 MB rockchip_ddr/write-bytes1/ 0.37 MB rockchip_ddr/write-bytes2/ 0.38 MB rockchip_ddr/write-bytes3/ 830.12 MB rockchip_ddr/bytes/ 1.004239766 seconds time elapsed (This is with memdump running in parallel) Otherwise the series is Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com> -- Sebastian > Changes since v4: > - Add device tree changes for RK3588 > - Use seqlock to protect perf counter values from hrtimer > - Unconditionally enable DFI when perf is enabled > - Bring back changes to dts/binding patches that were lost in v4 > > Changes since v3: > - Add RK3588 support > > Changes since v2: > - Fix broken reference to binding > - Add Reviewed-by from Rob > > Changes since v1: > - Fix example to actually match the binding and fix the warnings resulted thereof > - Make addition of rockchip,rk3568-dfi an extra patch > > Sascha Hauer (25): > PM / devfreq: rockchip-dfi: Make pmu regmap mandatory > PM / devfreq: rockchip-dfi: Embed desc into private data struct > PM / devfreq: rockchip-dfi: use consistent name for private data > struct > PM / devfreq: rockchip-dfi: Add SoC specific init function > PM / devfreq: rockchip-dfi: dfi store raw values in counter struct > PM / devfreq: rockchip-dfi: Use free running counter > PM / devfreq: rockchip-dfi: introduce channel mask > PM / devfreq: rk3399_dmc,dfi: generalize DDRTYPE defines > PM / devfreq: rockchip-dfi: Clean up DDR type register defines > PM / devfreq: rockchip-dfi: Add RK3568 support > PM / devfreq: rockchip-dfi: Handle LPDDR2 correctly > PM / devfreq: rockchip-dfi: Handle LPDDR4X > PM / devfreq: rockchip-dfi: Pass private data struct to internal > functions > PM / devfreq: rockchip-dfi: Prepare for multiple users > PM / devfreq: rockchip-dfi: give variable a better name > PM / devfreq: rockchip-dfi: Add perf support > PM / devfreq: rockchip-dfi: make register stride SoC specific > PM / devfreq: rockchip-dfi: account for multiple DDRMON_CTRL registers > PM / devfreq: rockchip-dfi: add support for RK3588 > dt-bindings: devfreq: event: convert Rockchip DFI binding to yaml > dt-bindings: devfreq: event: rockchip,dfi: Add rk3568 support > dt-bindings: devfreq: event: rockchip,dfi: Add rk3588 support > arm64: dts: rockchip: rk3399: Enable DFI > arm64: dts: rockchip: rk356x: Add DFI > arm64: dts: rockchip: rk3588s: Add DFI > > .../bindings/devfreq/event/rockchip,dfi.yaml | 84 ++ > .../bindings/devfreq/event/rockchip-dfi.txt | 18 - > .../rockchip,rk3399-dmc.yaml | 2 +- > arch/arm64/boot/dts/rockchip/rk3399.dtsi | 1 - > arch/arm64/boot/dts/rockchip/rk356x.dtsi | 7 + > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 16 + > drivers/devfreq/event/rockchip-dfi.c | 796 +++++++++++++++--- > drivers/devfreq/rk3399_dmc.c | 10 +- > include/soc/rockchip/rk3399_grf.h | 9 +- > include/soc/rockchip/rk3568_grf.h | 13 + > include/soc/rockchip/rk3588_grf.h | 18 + > include/soc/rockchip/rockchip_grf.h | 18 + > 12 files changed, 854 insertions(+), 138 deletions(-) > create mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml > delete mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt > create mode 100644 include/soc/rockchip/rk3568_grf.h > create mode 100644 include/soc/rockchip/rk3588_grf.h > create mode 100644 include/soc/rockchip/rockchip_grf.h > > -- > 2.39.2 >
Hello Sascha, Sebastian, On Wed, Jun 14, 2023 at 1:40 PM Sebastian Reichel <sebastian.reichel@collabora.com> wrote: > On Wed, May 24, 2023 at 10:31:28AM +0200, Sascha Hauer wrote: > > This is v5 of the series adding perf support to the rockchip DFI driver. > > [...] > > The RK3588 device tree changes for the DFI were not part of v4. As > > Vincent Legoll showed interest in testing this series the necessary > > device tree changes are now part of this series. > > I tested the series on RK3588 EVB1. The read/write byts looks > sensible. Sometimes cycles reads unrealistic values, though: > [...] > Otherwise the series is > > Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com> > > -- Sebastian I also tested this new version of the series on a Pine64 QuartzPro64 dev board. I applied the series on top of my local branch, which is based on Collabora's rockchip-3588 plus some QP64 DTS patches, and your V5 patch series. Looks like this is still working properly: -bash-5.1# uname -a Linux qp64 6.4.0-rc1-00140-g658dd2200e2a #24 SMP PREEMPT Wed Jun 14 15:50:34 CEST 2023 aarch64 GNU/Linux -bash-5.1# zgrep -i _dfi /proc/config.gz CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI=y -bash-5.1# perf list | grep rockchip_ddr rockchip_ddr/bytes/ [Kernel PMU event] rockchip_ddr/cycles/ [Kernel PMU event] rockchip_ddr/read-bytes/ [Kernel PMU event] rockchip_ddr/read-bytes0/ [Kernel PMU event] rockchip_ddr/read-bytes1/ [Kernel PMU event] rockchip_ddr/read-bytes2/ [Kernel PMU event] rockchip_ddr/read-bytes3/ [Kernel PMU event] rockchip_ddr/write-bytes/ [Kernel PMU event] rockchip_ddr/write-bytes0/ [Kernel PMU event] rockchip_ddr/write-bytes1/ [Kernel PMU event] rockchip_ddr/write-bytes2/ [Kernel PMU event] rockchip_ddr/write-bytes3/ [Kernel PMU event] # With no memory load -bash-5.1# perf stat -a -e rockchip_ddr/cycles/,rockchip_ddr/read-bytes/,rockchip_ddr/write-bytes/,rockchip_ddr/bytes/ sleep 1 Performance counter stats for 'system wide': 1058691047 rockchip_ddr/cycles/ 9.35 MB rockchip_ddr/read-bytes/ 0.57 MB rockchip_ddr/write-bytes/ 9.90 MB rockchip_ddr/bytes/ 1.002616498 seconds time elapsed # With a hog -bash-5.1# memtester 4G > /dev/null 2>&1 & -bash-5.1# perf stat -a -e rockchip_ddr/cycles/,rockchip_ddr/read-bytes/,rockchip_ddr/write-bytes/,rockchip_ddr/bytes/ sleep 10 Performance counter stats for 'system wide': 10561540038 rockchip_ddr/cycles/ 60212.59 MB rockchip_ddr/read-bytes/ 31313.03 MB rockchip_ddr/write-bytes/ 91525.60 MB rockchip_ddr/bytes/ 10.001651886 seconds time elapsed You can add my T-B, for the whole series: Tested-by: Vincent Legoll <vincent.legoll@gmail.com> Or is there something else you want me to test ? Thanks for your work Regards -- Vincent Legoll
Hi, On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote: > I tested the series on RK3588 EVB1. The read/write byts looks > sensible. Sometimes cycles reads unrealistic values, though: > > 18446744070475110400 rockchip_ddr/cycles/ I have seen this going off a few times with and without memory pressure. If it's way off, it always seems to follow the same pattern: The upper 32 bits are 0xffffffff instead of 0x00000000 with the lower 32 bits containing sensible data. -- Sebastian
Hi, On Wed, Jun 14, 2023 at 3:27 PM Sebastian Reichel <sebastian.reichel@collabora.com> wrote: > On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote: > > I tested the series on RK3588 EVB1. The read/write byts looks > > sensible. Sometimes cycles reads unrealistic values, though: > > > > 18446744070475110400 rockchip_ddr/cycles/ > > I have seen this going off a few times with and without memory > pressure. If it's way off, it always seems to follow the same > pattern: The upper 32 bits are 0xffffffff instead of 0x00000000 > with the lower 32 bits containing sensible data. How often is that happening ? I have been running this in a loop with varying sleep duration for the last few hours without hitting it, with and without memtester. REPEATS=1000 MAX_DURATION=100 OUT="/tmp/perf-dfi-rk3588-${REPEATS}x${MAX_DURATION}s.txt" echo -n > $OUT for i in $(seq $REPEATS) ; do DURATION=$(shuf -i "0-${MAX_DURATION}" -n 1) echo -n "${DURATION} : " >> $OUT perf stat -a -e rockchip_ddr/cycles/ sleep $DURATION 2>&1 | awk '/rockchip_ddr/ {printf("%X\n", int($1))}' >> $OUT done BTW, it's on a musl-libc arm64 void linux userspace.
Hi, On Wed, Jun 14, 2023 at 07:51:21PM +0000, Vincent Legoll wrote: > On Wed, Jun 14, 2023 at 3:27 PM Sebastian Reichel > <sebastian.reichel@collabora.com> wrote: > > On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote: > > > I tested the series on RK3588 EVB1. The read/write byts looks > > > sensible. Sometimes cycles reads unrealistic values, though: > > > > > > 18446744070475110400 rockchip_ddr/cycles/ > > > > I have seen this going off a few times with and without memory > > pressure. If it's way off, it always seems to follow the same > > pattern: The upper 32 bits are 0xffffffff instead of 0x00000000 > > with the lower 32 bits containing sensible data. > > How often is that happening ? I saw it multiple times (4 or 5) within a few tries (I guess around 20). I could see it with and without applying load on the memory. Tests have been running globally for a second using 'sleep 1' (just like the example from Sascha Hauer in the perf patch) > BTW, it's on a musl-libc arm64 void linux userspace. In my case it's Debian unstable. -- Sebastian
Hi Sebastian, On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote: > Hi, > > On Wed, May 24, 2023 at 10:31:28AM +0200, Sascha Hauer wrote: > > This is v5 of the series adding perf support to the rockchip DFI driver. > > > > A lot has changed in the perf driver since v4. First of all the review > > feedback from Robin and Jonathan has been integrated. The perf driver > > now not only supports monitoring the total DDR utilization, but also the > > individual channels. I also reworked the way the raw 32bit counter > > values are summed up to 64bit perf values, so hopefully the code is > > easier to follow now. > > > > lockdep found out that that locking in the perf driver was broken, so I > > reworked that as well. None of the perf hooks allows locking with > > mutexes or spinlocks, so in perf it's not possible to enable the DFI > > controller when needed. Instead I now unconditionally enable the DFI > > controller during probe when perf is enabled. > > > > Furthermore the hrtimer I use for reading out the hardware counter > > values before they overflow race with perf. Now a seqlock is used to > > prevent that. > > > > The RK3588 device tree changes for the DFI were not part of v4. As > > Vincent Legoll showed interest in testing this series the necessary > > device tree changes are now part of this series. > > I tested the series on RK3588 EVB1. The read/write byts looks > sensible. Sometimes cycles reads unrealistic values, though: > > Performance counter stats for 'system wide': > > 18446744070475110400 rockchip_ddr/cycles/ I'll have a look later this day. I remember seeing this, but I thought this had been resolved already. Thanks for your feedback so far. Sascha
On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote: > Hi, > > On Wed, May 24, 2023 at 10:31:28AM +0200, Sascha Hauer wrote: > > This is v5 of the series adding perf support to the rockchip DFI driver. > > > > A lot has changed in the perf driver since v4. First of all the review > > feedback from Robin and Jonathan has been integrated. The perf driver > > now not only supports monitoring the total DDR utilization, but also the > > individual channels. I also reworked the way the raw 32bit counter > > values are summed up to 64bit perf values, so hopefully the code is > > easier to follow now. > > > > lockdep found out that that locking in the perf driver was broken, so I > > reworked that as well. None of the perf hooks allows locking with > > mutexes or spinlocks, so in perf it's not possible to enable the DFI > > controller when needed. Instead I now unconditionally enable the DFI > > controller during probe when perf is enabled. > > > > Furthermore the hrtimer I use for reading out the hardware counter > > values before they overflow race with perf. Now a seqlock is used to > > prevent that. > > > > The RK3588 device tree changes for the DFI were not part of v4. As > > Vincent Legoll showed interest in testing this series the necessary > > device tree changes are now part of this series. > > I tested the series on RK3588 EVB1. The read/write byts looks > sensible. Sometimes cycles reads unrealistic values, though: > > Performance counter stats for 'system wide': > > 18446744070475110400 rockchip_ddr/cycles/ This goes down to missing initialization of &dfi->last_perf_count, see my other mail. Will fix this in the next round. Sascha