mbox series

[v2,00/14] Add regulator devfreq support to Panfrost

Message ID 20200704102535.189647-1-peron.clem@gmail.com (mailing list archive)
Headers show
Series Add regulator devfreq support to Panfrost | expand

Message

Clément Péron July 4, 2020, 10:25 a.m. UTC
Hi,

This serie cleans and adds regulator support to Panfrost devfreq.
This is mostly based on comment for the freshly introduced lima
devfreq.

We need to add regulator support because on Allwinner the GPU OPP
table defines both frequencies and voltages.

First patches [01-07] should not change the actual behavior
and introduce a proper panfrost_devfreq struct.

Regards,
Clément

Clément Péron (14):
  drm/panfrost: avoid static declaration
  drm/panfrost: clean headers in devfreq
  drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idle
  drm/panfrost: introduce panfrost_devfreq struct
  drm/panfrost: use spinlock instead of atomic
  drm/panfrost: properly handle error in probe
  drm/panfrost: rename error labels in device_init
  drm/panfrost: move devfreq_init()/fini() in device
  drm/panfrost: dynamically alloc regulators
  drm/panfrost: add regulators to devfreq
  arm64: defconfig: Enable devfreq cooling device
  arm64: dts: allwinner: h6: Add cooling map for GPU
  [DO NOT MERGE] arm64: dts: allwinner: h6: Add GPU OPP table
  [DO NOT MERGE] arm64: dts: allwinner: force GPU regulator to be always

 .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |   1 +
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 102 +++++++++++
 arch/arm64/configs/defconfig                  |   1 +
 drivers/gpu/drm/panfrost/panfrost_devfreq.c   | 165 ++++++++++++------
 drivers/gpu/drm/panfrost/panfrost_devfreq.h   |  30 +++-
 drivers/gpu/drm/panfrost/panfrost_device.c    |  61 ++++---
 drivers/gpu/drm/panfrost/panfrost_device.h    |  14 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  15 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |  10 +-
 9 files changed, 290 insertions(+), 109 deletions(-)

Comments

Maxime Ripard July 4, 2020, 12:13 p.m. UTC | #1
Hi,

On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> Add an Operating Performance Points table for the GPU to
> enable Dynamic Voltage & Frequency Scaling on the H6.
> 
> The voltage range is set with minival voltage set to the target
> and the maximal voltage set to 1.2V. This allow DVFS framework to
> work properly on board with fixed regulator.
> 
> Signed-off-by: Clément Péron <peron.clem@gmail.com>

That patch seems reasonable, why shouldn't we merge it?

> ---
>  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 80 ++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> index 8f514a2169aa..a69f9e09a829 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> @@ -174,6 +174,7 @@ gpu: gpu@1800000 {
>  			clocks = <&ccu CLK_GPU>, <&ccu CLK_BUS_GPU>;
>  			clock-names = "core", "bus";
>  			resets = <&ccu RST_BUS_GPU>;
> +			operating-points-v2 = <&gpu_opp_table>;
>  			#cooling-cells = <2>;
>  			status = "disabled";
>  		};
> @@ -1036,4 +1037,83 @@ map0 {
>  			};
>  		};
>  	};
> +
> +	gpu_opp_table: gpu-opp-table {
> +		compatible = "operating-points-v2";
> +
> +		opp@216000000 {
> +			opp-hz = /bits/ 64 <216000000>;
> +			opp-microvolt = <810000 810000 1200000>;
> +		};

All those nodes will create DTC warnings though.

Maxime
Clément Péron July 4, 2020, 2:56 p.m. UTC | #2
Hi Maxime,

On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> wrote:
>
> Hi,
>
> On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > Add an Operating Performance Points table for the GPU to
> > enable Dynamic Voltage & Frequency Scaling on the H6.
> >
> > The voltage range is set with minival voltage set to the target
> > and the maximal voltage set to 1.2V. This allow DVFS framework to
> > work properly on board with fixed regulator.
> >
> > Signed-off-by: Clément Péron <peron.clem@gmail.com>
>
> That patch seems reasonable, why shouldn't we merge it?

I didn't test it a lot and last time I did, some frequencies looked unstable.
https://lore.kernel.org/patchwork/cover/1239739/

This series adds regulator support to Panfrost devfreq, I will send a
new one if DVFS on the H6 GPU is stable.

I got this running glmark2 last time
# glmark2-es2-drm
=======================================================
    glmark2 2017.07
=======================================================
    OpenGL Information
    GL_VENDOR:     Panfrost
    GL_RENDERER:   Mali T720 (Panfrost)
    GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
=======================================================

[   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at
0x0000000080117100
[   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
sched_job=00000000e3c2132f

[  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
0x0000000000000000
[  328.871070] Reason: TODO
[  328.871070] raw fault status: 0xAA0003C2
[  328.871070] decoded fault status: SLAVE FAULT
[  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
[  328.871070] access type 0x3: WRITE
[  328.871070] source id 0xAA00
[  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
sched_job=000000007ac31097
[  329.386527] panfrost 1800000.gpu: js fault, js=0,
status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
[  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
sched_job=0000000004c90381
[  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
0x0000000000000000
[  329.411521] Reason: TODO
[  329.411521] raw fault status: 0xAA0003C2
[  329.411521] decoded fault status: SLAVE FAULT
[  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
[  329.411521] access type 0x3: WRITE
[  329.411521] source id 0xAA00

Regards,
Clement

>
> > ---
> >  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 80 ++++++++++++++++++++
> >  1 file changed, 80 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > index 8f514a2169aa..a69f9e09a829 100644
> > --- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > @@ -174,6 +174,7 @@ gpu: gpu@1800000 {
> >                       clocks = <&ccu CLK_GPU>, <&ccu CLK_BUS_GPU>;
> >                       clock-names = "core", "bus";
> >                       resets = <&ccu RST_BUS_GPU>;
> > +                     operating-points-v2 = <&gpu_opp_table>;
> >                       #cooling-cells = <2>;
> >                       status = "disabled";
> >               };
> > @@ -1036,4 +1037,83 @@ map0 {
> >                       };
> >               };
> >       };
> > +
> > +     gpu_opp_table: gpu-opp-table {
> > +             compatible = "operating-points-v2";
> > +
> > +             opp@216000000 {
> > +                     opp-hz = /bits/ 64 <216000000>;
> > +                     opp-microvolt = <810000 810000 1200000>;
> > +             };
>
> All those nodes will create DTC warnings though.
>
> Maxime
Alyssa Rosenzweig July 6, 2020, 1:26 p.m. UTC | #3
Patches 1-12 are 

	Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Thank you!

On Sat, Jul 04, 2020 at 12:25:21PM +0200, Clément Péron wrote:
> Hi,
> 
> This serie cleans and adds regulator support to Panfrost devfreq.
> This is mostly based on comment for the freshly introduced lima
> devfreq.
> 
> We need to add regulator support because on Allwinner the GPU OPP
> table defines both frequencies and voltages.
> 
> First patches [01-07] should not change the actual behavior
> and introduce a proper panfrost_devfreq struct.
> 
> Regards,
> Clément
> 
> Clément Péron (14):
>   drm/panfrost: avoid static declaration
>   drm/panfrost: clean headers in devfreq
>   drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idle
>   drm/panfrost: introduce panfrost_devfreq struct
>   drm/panfrost: use spinlock instead of atomic
>   drm/panfrost: properly handle error in probe
>   drm/panfrost: rename error labels in device_init
>   drm/panfrost: move devfreq_init()/fini() in device
>   drm/panfrost: dynamically alloc regulators
>   drm/panfrost: add regulators to devfreq
>   arm64: defconfig: Enable devfreq cooling device
>   arm64: dts: allwinner: h6: Add cooling map for GPU
>   [DO NOT MERGE] arm64: dts: allwinner: h6: Add GPU OPP table
>   [DO NOT MERGE] arm64: dts: allwinner: force GPU regulator to be always
> 
>  .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |   1 +
>  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 102 +++++++++++
>  arch/arm64/configs/defconfig                  |   1 +
>  drivers/gpu/drm/panfrost/panfrost_devfreq.c   | 165 ++++++++++++------
>  drivers/gpu/drm/panfrost/panfrost_devfreq.h   |  30 +++-
>  drivers/gpu/drm/panfrost/panfrost_device.c    |  61 ++++---
>  drivers/gpu/drm/panfrost/panfrost_device.h    |  14 +-
>  drivers/gpu/drm/panfrost/panfrost_drv.c       |  15 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c       |  10 +-
>  9 files changed, 290 insertions(+), 109 deletions(-)
> 
> -- 
> 2.25.1
>
Ondřej Jirman July 7, 2020, 12:16 a.m. UTC | #4
Hello Clément,

On Sat, Jul 04, 2020 at 12:25:21PM +0200, Clément Péron wrote:
> Hi,
> 
> This serie cleans and adds regulator support to Panfrost devfreq.
> This is mostly based on comment for the freshly introduced lima
> devfreq.

I tried to test the series, but I'm unsure what it's meant to
be based on.

It doesn't appply on top of linux-next and while it applies on
top of 5.8-rc3, it fails to run due to ordering of 

  dev_pm_opp_set_regulators and dev_pm_opp_of_add_table

where this patch series places

  dev_pm_opp_of_add_table after dev_pm_opp_of_add_table

which fails with this warning:

https://elixir.bootlin.com/linux/v5.8-rc3/source/drivers/opp/core.c#L1696

[    0.155455] ------------[ cut here ]------------
[    0.155473] WARNING: CPU: 2 PID: 1 at drivers/opp/core.c:1696 dev_pm_opp_set_regulators+0x134/0x1f0
[    0.155476] Modules linked in:
[    0.155487] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc4-00328-gf89269f4a65c #12
[    0.155489] Hardware name: OrangePi 3 (DT)
[    0.155496] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
[    0.155502] pc : dev_pm_opp_set_regulators+0x134/0x1f0
[    0.155507] lr : dev_pm_opp_set_regulators+0x28/0x1f0
[    0.155510] sp : ffffffc01002bb00
[    0.155512] x29: ffffffc01002bb00 x28: 0000000000000000
[    0.155518] x27: ffffffc0113b03b0 x26: ffffffc011431960
[    0.155523] x25: ffffffc011397a70 x24: ffffff807b6a2410
[    0.155528] x23: 0000000000000001 x22: ffffffc010f290a0
[    0.155533] x21: ffffff80789e3880 x20: ffffff80789e3ac8
[    0.155538] x19: ffffff80789e4400 x18: 00000000fffffffe
[    0.155543] x17: 0000000000000001 x16: 0000000000000019
[    0.155548] x15: 0000000000000001 x14: ffffffffffffffff
[    0.155553] x13: ffffffc01169fe00 x12: 0000000000000005
[    0.155558] x11: 0000000000000007 x10: 0101010101010101
[    0.155563] x9 : ffffffffffffffff x8 : 7f7f7f7f7f7f7f7f
[    0.155568] x7 : fefefeff646c606d x6 : 01111d48f3f5f3f0
[    0.155573] x5 : 70737573481d1101 x4 : 0000000000000000
[    0.155577] x3 : ffffff80789e4450 x2 : 0000000000000000
[    0.155582] x1 : ffffff807b490000 x0 : ffffff8078c2fe00
[    0.155587] Call trace:
[    0.155595]  dev_pm_opp_set_regulators+0x134/0x1f0
[    0.155603]  panfrost_devfreq_init+0x70/0x178
[    0.155608]  panfrost_device_init+0x108/0x5d8
[    0.155613]  panfrost_probe+0xa4/0x178
[    0.155619]  platform_drv_probe+0x50/0xa0
[    0.155626]  really_probe+0xd4/0x318
[    0.155631]  driver_probe_device+0x54/0xb0
[    0.155638]  device_driver_attach+0x6c/0x78
[    0.155643]  __driver_attach+0x54/0xd0
[    0.155649]  bus_for_each_dev+0x5c/0x98
[    0.155654]  driver_attach+0x20/0x28
[    0.155660]  bus_add_driver+0x140/0x1e8
[    0.155666]  driver_register+0x60/0x110
[    0.155670]  __platform_driver_register+0x44/0x50
[    0.155678]  panfrost_driver_init+0x18/0x20
[    0.155685]  do_one_initcall+0x3c/0x160
[    0.155691]  kernel_init_freeable+0x20c/0x2b0
[    0.155698]  kernel_init+0x10/0x104
[    0.155703]  ret_from_fork+0x10/0x1c
[    0.155712] ---[ end trace ed26920b0484a95e ]---
[    0.155725] panfrost 1800000.gpu: [drm:panfrost_devfreq_init] *ERROR* Couldn't set OPP regulators
[    0.156710] panfrost 1800000.gpu: devfreq init failed -16
[    0.156725] panfrost 1800000.gpu: Fatal error during GPU init
[    0.156795] panfrost: probe of 1800000.gpu failed with error -16
[    0.157158] cacheinfo: Unable to detect cache hierarchy for CPU 0


thank you and regards,
	o.

> We need to add regulator support because on Allwinner the GPU OPP
> table defines both frequencies and voltages.
> 
> First patches [01-07] should not change the actual behavior
> and introduce a proper panfrost_devfreq struct.
> 
> Regards,
> Clément
> 
> Clément Péron (14):
>   drm/panfrost: avoid static declaration
>   drm/panfrost: clean headers in devfreq
>   drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idle
>   drm/panfrost: introduce panfrost_devfreq struct
>   drm/panfrost: use spinlock instead of atomic
>   drm/panfrost: properly handle error in probe
>   drm/panfrost: rename error labels in device_init
>   drm/panfrost: move devfreq_init()/fini() in device
>   drm/panfrost: dynamically alloc regulators
>   drm/panfrost: add regulators to devfreq
>   arm64: defconfig: Enable devfreq cooling device
>   arm64: dts: allwinner: h6: Add cooling map for GPU
>   [DO NOT MERGE] arm64: dts: allwinner: h6: Add GPU OPP table
>   [DO NOT MERGE] arm64: dts: allwinner: force GPU regulator to be always
> 
>  .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |   1 +
>  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 102 +++++++++++
>  arch/arm64/configs/defconfig                  |   1 +
>  drivers/gpu/drm/panfrost/panfrost_devfreq.c   | 165 ++++++++++++------
>  drivers/gpu/drm/panfrost/panfrost_devfreq.h   |  30 +++-
>  drivers/gpu/drm/panfrost/panfrost_device.c    |  61 ++++---
>  drivers/gpu/drm/panfrost/panfrost_device.h    |  14 +-
>  drivers/gpu/drm/panfrost/panfrost_drv.c       |  15 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c       |  10 +-
>  9 files changed, 290 insertions(+), 109 deletions(-)
> 
> -- 
> 2.25.1
>
Clément Péron Aug. 3, 2020, 7:54 a.m. UTC | #5
Hi Maxime and All,

On Sat, 4 Jul 2020 at 16:56, Clément Péron <peron.clem@gmail.com> wrote:
>
> Hi Maxime,
>
> On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> wrote:
> >
> > Hi,
> >
> > On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > > Add an Operating Performance Points table for the GPU to
> > > enable Dynamic Voltage & Frequency Scaling on the H6.
> > >
> > > The voltage range is set with minival voltage set to the target
> > > and the maximal voltage set to 1.2V. This allow DVFS framework to
> > > work properly on board with fixed regulator.
> > >
> > > Signed-off-by: Clément Péron <peron.clem@gmail.com>
> >
> > That patch seems reasonable, why shouldn't we merge it?
>
> I didn't test it a lot and last time I did, some frequencies looked unstable.
> https://lore.kernel.org/patchwork/cover/1239739/
>
> This series adds regulator support to Panfrost devfreq, I will send a
> new one if DVFS on the H6 GPU is stable.
>
> I got this running glmark2 last time
> # glmark2-es2-drm
> =======================================================
>     glmark2 2017.07
> =======================================================
>     OpenGL Information
>     GL_VENDOR:     Panfrost
>     GL_RENDERER:   Mali T720 (Panfrost)
>     GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> =======================================================
>
> [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at
> 0x0000000080117100
> [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
> config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
> sched_job=00000000e3c2132f
>
> [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000000000000
> [  328.871070] Reason: TODO
> [  328.871070] raw fault status: 0xAA0003C2
> [  328.871070] decoded fault status: SLAVE FAULT
> [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> [  328.871070] access type 0x3: WRITE
> [  328.871070] source id 0xAA00
> [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
> config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
> sched_job=000000007ac31097
> [  329.386527] panfrost 1800000.gpu: js fault, js=0,
> status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
> config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
> sched_job=0000000004c90381
> [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000000000000
> [  329.411521] Reason: TODO
> [  329.411521] raw fault status: 0xAA0003C2
> [  329.411521] decoded fault status: SLAVE FAULT
> [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> [  329.411521] access type 0x3: WRITE
> [  329.411521] source id 0xAA00

Just to keep a track of this issue.

Piotr Oniszczuk give more test and seems to be software related:
https://www.spinics.net/lists/dri-devel/msg264279.html

Ondrej gave a great explanation about a possible origin of this issue:
https://freenode.irclog.whitequark.org/linux-sunxi/2020-07-11

20:12 <megi> looks like gpu pll on H6 is NKMP clock, and those are
implemented in such a way in mainline that they are prone to
overshooting the frequency during output divider reduction
20:13 <megi> so disabling P divider may help
20:13 <megi> or fixing the dividers
20:14 <megi> and just allowing N to change
20:22 <megi> hmm, I haven't looked at this for quite some time, but H6
BSP way of setting PLL factors actually makes the most sense out of
everything I've seen/tested so far
20:23 <megi> it waits for lock not after setting NK factors, but after
reducing the M factor (pre-divider)
20:24 <megi> I might as well re-run my CPU PLL tester with this
algorithm, to see if it fixes the lockups
20:26 <megi> it makes sense to wait for PLL to stabilize "after"
changing all the factors that actually affect the VCO, and not just
some of them
20:27 <megi> warpme_: ^
20:28 <megi> it may be the same thing that plagues the CPU PLL rate
changes at runtime

Regards,
Clement

>
> Regards,
> Clement
>
> >
> > > ---
> > >  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 80 ++++++++++++++++++++
> > >  1 file changed, 80 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > > index 8f514a2169aa..a69f9e09a829 100644
> > > --- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > > +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > > @@ -174,6 +174,7 @@ gpu: gpu@1800000 {
> > >                       clocks = <&ccu CLK_GPU>, <&ccu CLK_BUS_GPU>;
> > >                       clock-names = "core", "bus";
> > >                       resets = <&ccu RST_BUS_GPU>;
> > > +                     operating-points-v2 = <&gpu_opp_table>;
> > >                       #cooling-cells = <2>;
> > >                       status = "disabled";
> > >               };
> > > @@ -1036,4 +1037,83 @@ map0 {
> > >                       };
> > >               };
> > >       };
> > > +
> > > +     gpu_opp_table: gpu-opp-table {
> > > +             compatible = "operating-points-v2";
> > > +
> > > +             opp@216000000 {
> > > +                     opp-hz = /bits/ 64 <216000000>;
> > > +                     opp-microvolt = <810000 810000 1200000>;
> > > +             };
> >
> > All those nodes will create DTC warnings though.
> >
> > Maxime
Maxime Ripard Aug. 24, 2020, 1:11 p.m. UTC | #6
Hi Clement,

On Mon, Aug 03, 2020 at 09:54:05AM +0200, Clément Péron wrote:
> Hi Maxime and All,
> 
> On Sat, 4 Jul 2020 at 16:56, Clément Péron <peron.clem@gmail.com> wrote:
> >
> > Hi Maxime,
> >
> > On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> wrote:
> > >
> > > Hi,
> > >
> > > On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > > > Add an Operating Performance Points table for the GPU to
> > > > enable Dynamic Voltage & Frequency Scaling on the H6.
> > > >
> > > > The voltage range is set with minival voltage set to the target
> > > > and the maximal voltage set to 1.2V. This allow DVFS framework to
> > > > work properly on board with fixed regulator.
> > > >
> > > > Signed-off-by: Clément Péron <peron.clem@gmail.com>
> > >
> > > That patch seems reasonable, why shouldn't we merge it?
> >
> > I didn't test it a lot and last time I did, some frequencies looked unstable.
> > https://lore.kernel.org/patchwork/cover/1239739/
> >
> > This series adds regulator support to Panfrost devfreq, I will send a
> > new one if DVFS on the H6 GPU is stable.
> >
> > I got this running glmark2 last time
> > # glmark2-es2-drm
> > =======================================================
> >     glmark2 2017.07
> > =======================================================
> >     OpenGL Information
> >     GL_VENDOR:     Panfrost
> >     GL_RENDERER:   Mali T720 (Panfrost)
> >     GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> > =======================================================
> >
> > [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at
> > 0x0000000080117100
> > [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
> > sched_job=00000000e3c2132f
> >
> > [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000000000000
> > [  328.871070] Reason: TODO
> > [  328.871070] raw fault status: 0xAA0003C2
> > [  328.871070] decoded fault status: SLAVE FAULT
> > [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > [  328.871070] access type 0x3: WRITE
> > [  328.871070] source id 0xAA00
> > [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
> > sched_job=000000007ac31097
> > [  329.386527] panfrost 1800000.gpu: js fault, js=0,
> > status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> > [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
> > sched_job=0000000004c90381
> > [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000000000000
> > [  329.411521] Reason: TODO
> > [  329.411521] raw fault status: 0xAA0003C2
> > [  329.411521] decoded fault status: SLAVE FAULT
> > [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > [  329.411521] access type 0x3: WRITE
> > [  329.411521] source id 0xAA00
> 
> Just to keep a track of this issue.
> 
> Piotr Oniszczuk give more test and seems to be software related:
> https://www.spinics.net/lists/dri-devel/msg264279.html
> 
> Ondrej gave a great explanation about a possible origin of this issue:
> https://freenode.irclog.whitequark.org/linux-sunxi/2020-07-11
> 
> 20:12 <megi> looks like gpu pll on H6 is NKMP clock, and those are
> implemented in such a way in mainline that they are prone to
> overshooting the frequency during output divider reduction
> 20:13 <megi> so disabling P divider may help
> 20:13 <megi> or fixing the dividers
> 20:14 <megi> and just allowing N to change
> 20:22 <megi> hmm, I haven't looked at this for quite some time, but H6
> BSP way of setting PLL factors actually makes the most sense out of
> everything I've seen/tested so far
> 20:23 <megi> it waits for lock not after setting NK factors, but after
> reducing the M factor (pre-divider)
> 20:24 <megi> I might as well re-run my CPU PLL tester with this
> algorithm, to see if it fixes the lockups
> 20:26 <megi> it makes sense to wait for PLL to stabilize "after"
> changing all the factors that actually affect the VCO, and not just
> some of them
> 20:27 <megi> warpme_: ^
> 20:28 <megi> it may be the same thing that plagues the CPU PLL rate
> changes at runtime

I guess it's one of the bugs we never heard of...

It would be a good idea to test it on another platform (like Rockchip?)
to rule out any driver issue?

What do you think?

Maxime
Clément Péron Aug. 28, 2020, 12:16 p.m. UTC | #7
Hi Maxime,

On Tue, 25 Aug 2020 at 15:35, Maxime Ripard <maxime@cerno.tech> wrote:
>
> Hi Clement,
>
> On Mon, Aug 03, 2020 at 09:54:05AM +0200, Clément Péron wrote:
> > Hi Maxime and All,
> >
> > On Sat, 4 Jul 2020 at 16:56, Clément Péron <peron.clem@gmail.com> wrote:
> > >
> > > Hi Maxime,
> > >
> > > On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > > > > Add an Operating Performance Points table for the GPU to
> > > > > enable Dynamic Voltage & Frequency Scaling on the H6.
> > > > >
> > > > > The voltage range is set with minival voltage set to the target
> > > > > and the maximal voltage set to 1.2V. This allow DVFS framework to
> > > > > work properly on board with fixed regulator.
> > > > >
> > > > > Signed-off-by: Clément Péron <peron.clem@gmail.com>
> > > >
> > > > That patch seems reasonable, why shouldn't we merge it?
> > >
> > > I didn't test it a lot and last time I did, some frequencies looked unstable.
> > > https://lore.kernel.org/patchwork/cover/1239739/
> > >
> > > This series adds regulator support to Panfrost devfreq, I will send a
> > > new one if DVFS on the H6 GPU is stable.
> > >
> > > I got this running glmark2 last time
> > > # glmark2-es2-drm
> > > =======================================================
> > >     glmark2 2017.07
> > > =======================================================
> > >     OpenGL Information
> > >     GL_VENDOR:     Panfrost
> > >     GL_RENDERER:   Mali T720 (Panfrost)
> > >     GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> > > =======================================================
> > >
> > > [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at
> > > 0x0000000080117100
> > > [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
> > > sched_job=00000000e3c2132f
> > >
> > > [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > 0x0000000000000000
> > > [  328.871070] Reason: TODO
> > > [  328.871070] raw fault status: 0xAA0003C2
> > > [  328.871070] decoded fault status: SLAVE FAULT
> > > [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > [  328.871070] access type 0x3: WRITE
> > > [  328.871070] source id 0xAA00
> > > [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > > config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
> > > sched_job=000000007ac31097
> > > [  329.386527] panfrost 1800000.gpu: js fault, js=0,
> > > status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> > > [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
> > > sched_job=0000000004c90381
> > > [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > 0x0000000000000000
> > > [  329.411521] Reason: TODO
> > > [  329.411521] raw fault status: 0xAA0003C2
> > > [  329.411521] decoded fault status: SLAVE FAULT
> > > [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > [  329.411521] access type 0x3: WRITE
> > > [  329.411521] source id 0xAA00
> >
> > Just to keep a track of this issue.
> >
> > Piotr Oniszczuk give more test and seems to be software related:
> > https://www.spinics.net/lists/dri-devel/msg264279.html
> >
> > Ondrej gave a great explanation about a possible origin of this issue:
> > https://freenode.irclog.whitequark.org/linux-sunxi/2020-07-11
> >
> > 20:12 <megi> looks like gpu pll on H6 is NKMP clock, and those are
> > implemented in such a way in mainline that they are prone to
> > overshooting the frequency during output divider reduction
> > 20:13 <megi> so disabling P divider may help
> > 20:13 <megi> or fixing the dividers
> > 20:14 <megi> and just allowing N to change
> > 20:22 <megi> hmm, I haven't looked at this for quite some time, but H6
> > BSP way of setting PLL factors actually makes the most sense out of
> > everything I've seen/tested so far
> > 20:23 <megi> it waits for lock not after setting NK factors, but after
> > reducing the M factor (pre-divider)
> > 20:24 <megi> I might as well re-run my CPU PLL tester with this
> > algorithm, to see if it fixes the lockups
> > 20:26 <megi> it makes sense to wait for PLL to stabilize "after"
> > changing all the factors that actually affect the VCO, and not just
> > some of them
> > 20:27 <megi> warpme_: ^
> > 20:28 <megi> it may be the same thing that plagues the CPU PLL rate
> > changes at runtime
>
> I guess it's one of the bugs we never heard of...
>
> It would be a good idea to test it on another platform (like Rockchip?)
> to rule out any driver issue?
>
> What do you think?

I can't exclude a bug in the driver, but if it was the case LE
community or Panfrost maintainer would have heard of that.

Megi's explanations match what I observed.
NKMP drivers seem the perfect guilty here or maybe it's a combination of both...

Jernej sent me this patch to test:
https://github.com/clementperon/linux/commit/56bde359beaf8e827ce53ede1fe4a0ad233cb79b
But it didn't fix the issue, If someone want to have a look at it :)

Regards,
Clement

>
> Maxime
Ondřej Jirman Aug. 28, 2020, 12:21 p.m. UTC | #8
On Fri, Aug 28, 2020 at 02:16:36PM +0200, Clément Péron wrote:
> Hi Maxime,
> 
> On Tue, 25 Aug 2020 at 15:35, Maxime Ripard <maxime@cerno.tech> wrote:
> >
> > Hi Clement,
> >
> > On Mon, Aug 03, 2020 at 09:54:05AM +0200, Clément Péron wrote:
> > > Hi Maxime and All,
> > >
> > > On Sat, 4 Jul 2020 at 16:56, Clément Péron <peron.clem@gmail.com> wrote:
> > > >
> > > > Hi Maxime,
> > > >
> > > > On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > > > > > Add an Operating Performance Points table for the GPU to
> > > > > > enable Dynamic Voltage & Frequency Scaling on the H6.
> > > > > >
> > > > > > The voltage range is set with minival voltage set to the target
> > > > > > and the maximal voltage set to 1.2V. This allow DVFS framework to
> > > > > > work properly on board with fixed regulator.
> > > > > >
> > > > > > Signed-off-by: Clément Péron <peron.clem@gmail.com>
> > > > >
> > > > > That patch seems reasonable, why shouldn't we merge it?
> > > >
> > > > I didn't test it a lot and last time I did, some frequencies looked unstable.
> > > > https://lore.kernel.org/patchwork/cover/1239739/
> > > >
> > > > This series adds regulator support to Panfrost devfreq, I will send a
> > > > new one if DVFS on the H6 GPU is stable.
> > > >
> > > > I got this running glmark2 last time
> > > > # glmark2-es2-drm
> > > > =======================================================
> > > >     glmark2 2017.07
> > > > =======================================================
> > > >     OpenGL Information
> > > >     GL_VENDOR:     Panfrost
> > > >     GL_RENDERER:   Mali T720 (Panfrost)
> > > >     GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> > > > =======================================================
> > > >
> > > > [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at
> > > > 0x0000000080117100
> > > > [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > > config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
> > > > sched_job=00000000e3c2132f
> > > >
> > > > [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > > 0x0000000000000000
> > > > [  328.871070] Reason: TODO
> > > > [  328.871070] raw fault status: 0xAA0003C2
> > > > [  328.871070] decoded fault status: SLAVE FAULT
> > > > [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > > [  328.871070] access type 0x3: WRITE
> > > > [  328.871070] source id 0xAA00
> > > > [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > > > config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
> > > > sched_job=000000007ac31097
> > > > [  329.386527] panfrost 1800000.gpu: js fault, js=0,
> > > > status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> > > > [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > > config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
> > > > sched_job=0000000004c90381
> > > > [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > > 0x0000000000000000
> > > > [  329.411521] Reason: TODO
> > > > [  329.411521] raw fault status: 0xAA0003C2
> > > > [  329.411521] decoded fault status: SLAVE FAULT
> > > > [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > > [  329.411521] access type 0x3: WRITE
> > > > [  329.411521] source id 0xAA00
> > >
> > > Just to keep a track of this issue.
> > >
> > > Piotr Oniszczuk give more test and seems to be software related:
> > > https://www.spinics.net/lists/dri-devel/msg264279.html
> > >
> > > Ondrej gave a great explanation about a possible origin of this issue:
> > > https://freenode.irclog.whitequark.org/linux-sunxi/2020-07-11
> > >
> > > 20:12 <megi> looks like gpu pll on H6 is NKMP clock, and those are
> > > implemented in such a way in mainline that they are prone to
> > > overshooting the frequency during output divider reduction
> > > 20:13 <megi> so disabling P divider may help
> > > 20:13 <megi> or fixing the dividers
> > > 20:14 <megi> and just allowing N to change
> > > 20:22 <megi> hmm, I haven't looked at this for quite some time, but H6
> > > BSP way of setting PLL factors actually makes the most sense out of
> > > everything I've seen/tested so far
> > > 20:23 <megi> it waits for lock not after setting NK factors, but after
> > > reducing the M factor (pre-divider)
> > > 20:24 <megi> I might as well re-run my CPU PLL tester with this
> > > algorithm, to see if it fixes the lockups
> > > 20:26 <megi> it makes sense to wait for PLL to stabilize "after"
> > > changing all the factors that actually affect the VCO, and not just
> > > some of them
> > > 20:27 <megi> warpme_: ^
> > > 20:28 <megi> it may be the same thing that plagues the CPU PLL rate
> > > changes at runtime
> >
> > I guess it's one of the bugs we never heard of...
> >
> > It would be a good idea to test it on another platform (like Rockchip?)
> > to rule out any driver issue?
> >
> > What do you think?
> 
> I can't exclude a bug in the driver, but if it was the case LE
> community or Panfrost maintainer would have heard of that.
> 
> Megi's explanations match what I observed.
> NKMP drivers seem the perfect guilty here or maybe it's a combination of both...
> 
> Jernej sent me this patch to test:
> https://github.com/clementperon/linux/commit/56bde359beaf8e827ce53ede1fe4a0ad233cb79b
> But it didn't fix the issue, If someone want to have a look at it :)

Not sure how that patch is supposed to work, but it seems to apply
all factors at once to me.

regards,
	o.

> Regards,
> Clement
> 
> >
> > Maxime
Jernej Škrabec Aug. 28, 2020, 1:01 p.m. UTC | #9
Dne petek, 28. avgust 2020 ob 14:21:19 CEST je Ondřej Jirman napisal(a):
> On Fri, Aug 28, 2020 at 02:16:36PM +0200, Clément Péron wrote:
> > Hi Maxime,
> > 
> > On Tue, 25 Aug 2020 at 15:35, Maxime Ripard <maxime@cerno.tech> wrote:
> > > Hi Clement,
> > > 
> > > On Mon, Aug 03, 2020 at 09:54:05AM +0200, Clément Péron wrote:
> > > > Hi Maxime and All,
> > > > 
> > > > On Sat, 4 Jul 2020 at 16:56, Clément Péron <peron.clem@gmail.com> 
wrote:
> > > > > Hi Maxime,
> > > > > 
> > > > > On Sat, 4 Jul 2020 at 14:13, Maxime Ripard <maxime@cerno.tech> 
wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On Sat, Jul 04, 2020 at 12:25:34PM +0200, Clément Péron wrote:
> > > > > > > Add an Operating Performance Points table for the GPU to
> > > > > > > enable Dynamic Voltage & Frequency Scaling on the H6.
> > > > > > > 
> > > > > > > The voltage range is set with minival voltage set to the target
> > > > > > > and the maximal voltage set to 1.2V. This allow DVFS framework
> > > > > > > to
> > > > > > > work properly on board with fixed regulator.
> > > > > > > 
> > > > > > > Signed-off-by: Clément Péron <peron.clem@gmail.com>
> > > > > > 
> > > > > > That patch seems reasonable, why shouldn't we merge it?
> > > > > 
> > > > > I didn't test it a lot and last time I did, some frequencies looked
> > > > > unstable. https://lore.kernel.org/patchwork/cover/1239739/
> > > > > 
> > > > > This series adds regulator support to Panfrost devfreq, I will send
> > > > > a
> > > > > new one if DVFS on the H6 GPU is stable.
> > > > > 
> > > > > I got this running glmark2 last time
> > > > > # glmark2-es2-drm
> > > > > =======================================================
> > > > > 
> > > > >     glmark2 2017.07
> > > > > 
> > > > > =======================================================
> > > > > 
> > > > >     OpenGL Information
> > > > >     GL_VENDOR:     Panfrost
> > > > >     GL_RENDERER:   Mali T720 (Panfrost)
> > > > >     GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> > > > > 
> > > > > =======================================================
> > > > > 
> > > > > [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN)
> > > > > at
> > > > > 0x0000000080117100
> > > > > [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > > > config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00,
> > > > > sched_job=00000000e3c2132f
> > > > > 
> > > > > [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at
> > > > > VA
> > > > > 0x0000000000000000
> > > > > [  328.871070] Reason: TODO
> > > > > [  328.871070] raw fault status: 0xAA0003C2
> > > > > [  328.871070] decoded fault status: SLAVE FAULT
> > > > > [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > > > [  328.871070] access type 0x3: WRITE
> > > > > [  328.871070] source id 0xAA00
> > > > > [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > > > > config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900,
> > > > > sched_job=000000007ac31097
> > > > > [  329.386527] panfrost 1800000.gpu: js fault, js=0,
> > > > > status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> > > > > [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > > > config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00,
> > > > > sched_job=0000000004c90381
> > > > > [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at
> > > > > VA
> > > > > 0x0000000000000000
> > > > > [  329.411521] Reason: TODO
> > > > > [  329.411521] raw fault status: 0xAA0003C2
> > > > > [  329.411521] decoded fault status: SLAVE FAULT
> > > > > [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> > > > > [  329.411521] access type 0x3: WRITE
> > > > > [  329.411521] source id 0xAA00
> > > > 
> > > > Just to keep a track of this issue.
> > > > 
> > > > Piotr Oniszczuk give more test and seems to be software related:
> > > > https://www.spinics.net/lists/dri-devel/msg264279.html
> > > > 
> > > > Ondrej gave a great explanation about a possible origin of this issue:
> > > > https://freenode.irclog.whitequark.org/linux-sunxi/2020-07-11
> > > > 
> > > > 20:12 <megi> looks like gpu pll on H6 is NKMP clock, and those are
> > > > implemented in such a way in mainline that they are prone to
> > > > overshooting the frequency during output divider reduction
> > > > 20:13 <megi> so disabling P divider may help
> > > > 20:13 <megi> or fixing the dividers
> > > > 20:14 <megi> and just allowing N to change
> > > > 20:22 <megi> hmm, I haven't looked at this for quite some time, but H6
> > > > BSP way of setting PLL factors actually makes the most sense out of
> > > > everything I've seen/tested so far
> > > > 20:23 <megi> it waits for lock not after setting NK factors, but after
> > > > reducing the M factor (pre-divider)
> > > > 20:24 <megi> I might as well re-run my CPU PLL tester with this
> > > > algorithm, to see if it fixes the lockups
> > > > 20:26 <megi> it makes sense to wait for PLL to stabilize "after"
> > > > changing all the factors that actually affect the VCO, and not just
> > > > some of them
> > > > 20:27 <megi> warpme_: ^
> > > > 20:28 <megi> it may be the same thing that plagues the CPU PLL rate
> > > > changes at runtime
> > > 
> > > I guess it's one of the bugs we never heard of...
> > > 
> > > It would be a good idea to test it on another platform (like Rockchip?)
> > > to rule out any driver issue?
> > > 
> > > What do you think?
> > 
> > I can't exclude a bug in the driver, but if it was the case LE
> > community or Panfrost maintainer would have heard of that.
> > 
> > Megi's explanations match what I observed.
> > NKMP drivers seem the perfect guilty here or maybe it's a combination of
> > both...
> > 
> > Jernej sent me this patch to test:
> > https://github.com/clementperon/linux/commit/56bde359beaf8e827ce53ede1fe4a
> > 0ad233cb79b But it didn't fix the issue, If someone want to have a look at
> > it :)
> Not sure how that patch is supposed to work, but it seems to apply
> all factors at once to me.

It's hackish but in essence, it is what vendor clock driver does. From A83T or 
so onwards, vendor driver uses "lock_mode = PLL_LOCK_NEW_MODE" and this hack 
tries to mimick that locking behaviour (good enough for tests).

And yes, this code still applies all factors at once, because vendor driver 
does that too on H6. Only case where vendor driver applies factors one by one 
is for V3s SoC.

Btw, only recently I noticed following block in h6 clk driver:
https://elixir.bootlin.com/linux/latest/source/drivers/clk/sunxi-ng/ccu-sun50i-h6.c#L1196

Other clk drivers don't have it. Maybe it has some influence in this matter.

Best regards,
Jernej

> 
> regards,
> 	o.
> 
> > Regards,
> > Clement
> > 
> > > Maxime