diff mbox series

cpufreq: sun50i: Fix CPU speed bin detection

Message ID 20191031181359.282617-1-megous@megous.com (mailing list archive)
State New, archived
Headers show
Series cpufreq: sun50i: Fix CPU speed bin detection | expand

Commit Message

Ondřej Jirman Oct. 31, 2019, 6:13 p.m. UTC
I have failures to boot on Orange Pi 3, because this driver determined
that my SoC is from the normal bin, but my SoC only works reliably with
the OPP values for the slowest bin.

Looking at BSP code, I found that efuse values have following meanings
on H6:

- 0b000 invalid (interpreted in vendor's BSP as normal bin)
- 0b001 slowest bin
- 0b011 normal bin
- 0b111 fastest bin

Let's play it safe and interpret 0 as the slowest bin, but fix detection
of other bins to match vendor code.

Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
Signed-off-by: Ondrej Jirman <megous@megous.com>
---

See https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n484
and https://megous.com/git/linux/tree/drivers/cpufreq/sunxi-cpufreq.c?h=h6-4.9-bsp#n428
(1 is substracted from soc_bin number here!)

 drivers/cpufreq/sun50i-cpufreq-nvmem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Clément Péron Oct. 31, 2019, 6:55 p.m. UTC | #1
Hi Ondrej,

On Thu, 31 Oct 2019 at 19:14, Ondrej Jirman <megous@megous.com> wrote:
>
> I have failures to boot on Orange Pi 3, because this driver determined
> that my SoC is from the normal bin, but my SoC only works reliably with
> the OPP values for the slowest bin.
>
> Looking at BSP code, I found that efuse values have following meanings
> on H6:
>
> - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> - 0b001 slowest bin
> - 0b011 normal bin
> - 0b111 fastest bin

Maybe have some defines will be more readable no ?
https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n213

#define SUN50I_NVEM_INVALID_CPU_OPP (0b000)
#define SUN50I_NVEM_LOW_CPU_OPP (0b001)
#define SUN50I_NVEM_NORMAL_CPU_OPP (0b011)
#define SUN50I_NVEM_HIGH_CPU_OPP (0b111)

Regards,
Clément

>
> Let's play it safe and interpret 0 as the slowest bin, but fix detection
> of other bins to match vendor code.
>
> Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> Signed-off-by: Ondrej Jirman <megous@megous.com>
> ---
>
> See https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n484
> and https://megous.com/git/linux/tree/drivers/cpufreq/sunxi-cpufreq.c?h=h6-4.9-bsp#n428
> (1 is substracted from soc_bin number here!)
>
>  drivers/cpufreq/sun50i-cpufreq-nvmem.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/sun50i-cpufreq-nvmem.c b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> index df35ef3ef567..41dad03e245c 100644
> --- a/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> +++ b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> @@ -71,9 +71,12 @@ static int sun50i_cpufreq_get_efuse(u32 *versions)
>         efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
>         switch (efuse_value) {
>         case 0b0001:
> -               *versions = 1;
> +               *versions = 0;
>                 break;
>         case 0b0011:
> +               *versions = 1;
> +               break;
> +       case 0b0111:
>                 *versions = 2;
>                 break;
>         default:
> --
> 2.23.0
>
> --
> You received this message because you are subscribed to the Google Groups "linux-sunxi" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscribe@googlegroups.com.
> To view this discussion on the web, visit https://groups.google.com/d/msgid/linux-sunxi/20191031181359.282617-1-megous%40megous.com.
Ondřej Jirman Oct. 31, 2019, 7:12 p.m. UTC | #2
Hi,

On Thu, Oct 31, 2019 at 07:55:43PM +0100, Clément Péron wrote:
> Hi Ondrej,
> 
> On Thu, 31 Oct 2019 at 19:14, Ondrej Jirman <megous@megous.com> wrote:
> >
> > I have failures to boot on Orange Pi 3, because this driver determined
> > that my SoC is from the normal bin, but my SoC only works reliably with
> > the OPP values for the slowest bin.
> >
> > Looking at BSP code, I found that efuse values have following meanings
> > on H6:
> >
> > - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> > - 0b001 slowest bin
> > - 0b011 normal bin
> > - 0b111 fastest bin
> 
> Maybe have some defines will be more readable no ?
> https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n213

Hmm, Alwwinner is really funny. Unused macros that just confuse things.

#if defined(CONFIG_ARCH_SUN50IW6)
#define TYPE_SB (0b001)
#define TYPE_NB (0b010)
#define TYPE_FB (0b011)
#else
#define TYPE_SB (0b001)
#define TYPE_NB (0b011)
#define TYPE_FB (0b111)
#endif

So for H6 they define special bin values and actually use different ones
in code. Fun.

I've sent out some testing program to Armbian forums, so hopefully, we'll
collect some real efuse_values from real SoCs, to see what's really being
used in the wild. If we see value 0b010, the BSP code is probably just
wrong.

Interestingly, TYPE_NB 0b010 would be interpreted as normal bin even with
the current BSP code, and TYPE_FB would be misdetected as TYPE_NB.

> #define SUN50I_NVEM_INVALID_CPU_OPP (0b000)
> #define SUN50I_NVEM_LOW_CPU_OPP (0b001)
> #define SUN50I_NVEM_NORMAL_CPU_OPP (0b011)
> #define SUN50I_NVEM_HIGH_CPU_OPP (0b111)

I'd rather not describe meanings just yet, until we get some real-world
data from H6 owners.

https://forum.armbian.com/topic/9368-orangepi-3-h6-allwiner-chip/?do=findComment&comment=88439

regards,
	o.

> Regards,
> Clément
> 
> >
> > Let's play it safe and interpret 0 as the slowest bin, but fix detection
> > of other bins to match vendor code.
> >
> > Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> > Signed-off-by: Ondrej Jirman <megous@megous.com>
> > ---
> >
> > See https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n484
> > and https://megous.com/git/linux/tree/drivers/cpufreq/sunxi-cpufreq.c?h=h6-4.9-bsp#n428
> > (1 is substracted from soc_bin number here!)
> >
> >  drivers/cpufreq/sun50i-cpufreq-nvmem.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/sun50i-cpufreq-nvmem.c b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > index df35ef3ef567..41dad03e245c 100644
> > --- a/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > +++ b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > @@ -71,9 +71,12 @@ static int sun50i_cpufreq_get_efuse(u32 *versions)
> >         efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
> >         switch (efuse_value) {
> >         case 0b0001:
> > -               *versions = 1;
> > +               *versions = 0;
> >                 break;
> >         case 0b0011:
> > +               *versions = 1;
> > +               break;
> > +       case 0b0111:
> >                 *versions = 2;
> >                 break;
> >         default:
> > --
> > 2.23.0
> >
> > --
> > You received this message because you are subscribed to the Google Groups "linux-sunxi" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscribe@googlegroups.com.
> > To view this discussion on the web, visit https://groups.google.com/d/msgid/linux-sunxi/20191031181359.282617-1-megous%40megous.com.
Maxime Ripard Nov. 1, 2019, 3:07 p.m. UTC | #3
On Thu, Oct 31, 2019 at 07:13:58PM +0100, Ondrej Jirman wrote:
> I have failures to boot on Orange Pi 3, because this driver determined
> that my SoC is from the normal bin, but my SoC only works reliably with
> the OPP values for the slowest bin.
>
> Looking at BSP code, I found that efuse values have following meanings
> on H6:
>
> - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> - 0b001 slowest bin
> - 0b011 normal bin
> - 0b111 fastest bin
>
> Let's play it safe and interpret 0 as the slowest bin, but fix detection
> of other bins to match vendor code.
>
> Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> Signed-off-by: Ondrej Jirman <megous@megous.com>

Acked-by: Maxime Ripard <mripard@kernel.org>

Out of curiosity, which OPP table is being used? I guess it's one of
the dozens of patches sitting there...

Maxime
Ondřej Jirman Nov. 1, 2019, 4:01 p.m. UTC | #4
On Fri, Nov 01, 2019 at 04:07:01PM +0100, Maxime Ripard wrote:
> On Thu, Oct 31, 2019 at 07:13:58PM +0100, Ondrej Jirman wrote:
> > I have failures to boot on Orange Pi 3, because this driver determined
> > that my SoC is from the normal bin, but my SoC only works reliably with
> > the OPP values for the slowest bin.
> >
> > Looking at BSP code, I found that efuse values have following meanings
> > on H6:
> >
> > - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> > - 0b001 slowest bin
> > - 0b011 normal bin
> > - 0b111 fastest bin
> >
> > Let's play it safe and interpret 0 as the slowest bin, but fix detection
> > of other bins to match vendor code.
> >
> > Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> > Signed-off-by: Ondrej Jirman <megous@megous.com>
> 
> Acked-by: Maxime Ripard <mripard@kernel.org>
> 
> Out of curiosity, which OPP table is being used? I guess it's one of
> the dozens of patches sitting there...

I'm using one from Orange Pi 3's fex file from Xunlong:

https://github.com/orangepi-xunlong/OrangePiH6_external/blob/master/sys_config/OrangePiH6_3_sys_config.fex
https://megous.com/git/linux/commit/?h=ths-5.4&id=7b409e83b4ac70f3435886da6a26cecf9af77213

This one doesn't really differentiate between normal/fast bins.

There's also another one in the Allwinner DTS files:

https://github.com/orangepi-xunlong/OrangePiH6_Linux4_9/blob/master/arch/arm64/boot/dts/sunxi/sun50iw6p1.dtsi#L349

which seems to be the one used by Yangtao Li in the dt-bindings
docummentation.
	
regards,
	o.

> Maxime
Ondřej Jirman Nov. 1, 2019, 4:14 p.m. UTC | #5
Hi,

On Thu, Oct 31, 2019 at 08:12:57PM +0100, megous hlavni wrote:
> Hi,
> 
> On Thu, Oct 31, 2019 at 07:55:43PM +0100, Clément Péron wrote:
> > Hi Ondrej,
> > 
> > On Thu, 31 Oct 2019 at 19:14, Ondrej Jirman <megous@megous.com> wrote:
> > >
> > > I have failures to boot on Orange Pi 3, because this driver determined
> > > that my SoC is from the normal bin, but my SoC only works reliably with
> > > the OPP values for the slowest bin.
> > >
> > > Looking at BSP code, I found that efuse values have following meanings
> > > on H6:
> > >
> > > - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> > > - 0b001 slowest bin
> > > - 0b011 normal bin
> > > - 0b111 fastest bin
> > 
> > Maybe have some defines will be more readable no ?
> > https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n213
> 
> Hmm, Alwwinner is really funny. Unused macros that just confuse things.
> 
> #if defined(CONFIG_ARCH_SUN50IW6)
> #define TYPE_SB (0b001)
> #define TYPE_NB (0b010)
> #define TYPE_FB (0b011)

So this table is likely used on H6, from my research I was able to find
no owners of H6 with efuse value of 0b111 and one owner with efuse value
of 0b010, and one with 0b011.

So the bins map directly to decimal numbers efuse=1 (slow bin),
efuse=2 (normal bin), efuse=3 (fast bin).

So it looks like vendor code is wrong and works accidentally, due to
fast bin being interpretted as normal bin, and normal bin being interpretted
as having a wrong efuse value, which is then interpretted alter as normal bin.

https://forum.armbian.com/topic/9368-orangepi-3-h6-allwiner-chip/page/24/#comments
https://forum.armbian.com/topic/9368-orangepi-3-h6-allwiner-chip/page/25/#comments

This will still need to be verified, by respective owners using the optimized
OPP tables for their supposed SoC bins successfully, but meanwhile I think
we should base the efuse->speed grade mapping based on values observed in the
wild. That seems most prudent at the moment.

I'll send v2 with speed grade selection matching these observations, so
please don't merge this yet.

regards,
	o.

> #else
> #define TYPE_SB (0b001)
> #define TYPE_NB (0b011)
> #define TYPE_FB (0b111)
> #endif
> 
> So for H6 they define special bin values and actually use different ones
> in code. Fun.
> 
> I've sent out some testing program to Armbian forums, so hopefully, we'll
> collect some real efuse_values from real SoCs, to see what's really being
> used in the wild. If we see value 0b010, the BSP code is probably just
> wrong.
> 
> Interestingly, TYPE_NB 0b010 would be interpreted as normal bin even with
> the current BSP code, and TYPE_FB would be misdetected as TYPE_NB.
> 
> > #define SUN50I_NVEM_INVALID_CPU_OPP (0b000)
> > #define SUN50I_NVEM_LOW_CPU_OPP (0b001)
> > #define SUN50I_NVEM_NORMAL_CPU_OPP (0b011)
> > #define SUN50I_NVEM_HIGH_CPU_OPP (0b111)
> 
> I'd rather not describe meanings just yet, until we get some real-world
> data from H6 owners.
> 
> https://forum.armbian.com/topic/9368-orangepi-3-h6-allwiner-chip/?do=findComment&comment=88439
> 
> regards,
> 	o.
> 
> > Regards,
> > Clément
> > 
> > >
> > > Let's play it safe and interpret 0 as the slowest bin, but fix detection
> > > of other bins to match vendor code.
> > >
> > > Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> > > Signed-off-by: Ondrej Jirman <megous@megous.com>
> > > ---
> > >
> > > See https://megous.com/git/linux/tree/drivers/soc/sunxi/sunxi-sid.c?h=h6-4.9-bsp#n484
> > > and https://megous.com/git/linux/tree/drivers/cpufreq/sunxi-cpufreq.c?h=h6-4.9-bsp#n428
> > > (1 is substracted from soc_bin number here!)
> > >
> > >  drivers/cpufreq/sun50i-cpufreq-nvmem.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpufreq/sun50i-cpufreq-nvmem.c b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > > index df35ef3ef567..41dad03e245c 100644
> > > --- a/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > > +++ b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
> > > @@ -71,9 +71,12 @@ static int sun50i_cpufreq_get_efuse(u32 *versions)
> > >         efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
> > >         switch (efuse_value) {
> > >         case 0b0001:
> > > -               *versions = 1;
> > > +               *versions = 0;
> > >                 break;
> > >         case 0b0011:
> > > +               *versions = 1;
> > > +               break;
> > > +       case 0b0111:
> > >                 *versions = 2;
> > >                 break;
> > >         default:
> > > --
> > > 2.23.0
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups "linux-sunxi" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscribe@googlegroups.com.
> > > To view this discussion on the web, visit https://groups.google.com/d/msgid/linux-sunxi/20191031181359.282617-1-megous%40megous.com.
Ondřej Jirman Nov. 1, 2019, 4:15 p.m. UTC | #6
On Fri, Nov 01, 2019 at 04:07:01PM +0100, Maxime Ripard wrote:
> On Thu, Oct 31, 2019 at 07:13:58PM +0100, Ondrej Jirman wrote:
> > I have failures to boot on Orange Pi 3, because this driver determined
> > that my SoC is from the normal bin, but my SoC only works reliably with
> > the OPP values for the slowest bin.
> >
> > Looking at BSP code, I found that efuse values have following meanings
> > on H6:
> >
> > - 0b000 invalid (interpreted in vendor's BSP as normal bin)
> > - 0b001 slowest bin
> > - 0b011 normal bin
> > - 0b111 fastest bin
> >
> > Let's play it safe and interpret 0 as the slowest bin, but fix detection
> > of other bins to match vendor code.
> >
> > Fixes: f328584f7bff ("cpufreq: Add sun50i nvmem based CPU scaling driver")
> > Signed-off-by: Ondrej Jirman <megous@megous.com>
> 
> Acked-by: Maxime Ripard <mripard@kernel.org>

Self-NACK :) Please don't merge.

Please see the other e-mail for why.

thank you and regards,
	o.

> Out of curiosity, which OPP table is being used? I guess it's one of
> the dozens of patches sitting there...
> 
> Maxime
diff mbox series

Patch

diff --git a/drivers/cpufreq/sun50i-cpufreq-nvmem.c b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
index df35ef3ef567..41dad03e245c 100644
--- a/drivers/cpufreq/sun50i-cpufreq-nvmem.c
+++ b/drivers/cpufreq/sun50i-cpufreq-nvmem.c
@@ -71,9 +71,12 @@  static int sun50i_cpufreq_get_efuse(u32 *versions)
 	efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
 	switch (efuse_value) {
 	case 0b0001:
-		*versions = 1;
+		*versions = 0;
 		break;
 	case 0b0011:
+		*versions = 1;
+		break;
+	case 0b0111:
 		*versions = 2;
 		break;
 	default: