diff mbox series

[v2,5/6] thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed

Message ID 20230504004852.627049-6-nfraprado@collabora.com (mailing list archive)
State New, archived
Headers show
Series thermal/drivers/mediatek/lvts_thermal: Fixes to the interrupt handling | expand

Commit Message

Nícolas F. R. A. Prado May 4, 2023, 12:48 a.m. UTC
The thermal framework might leave the low threshold unset if there
aren't any lower trip points. This leaves the register zeroed, which
translates to a very high temperature for the low threshold. The
interrupt for this threshold is then immediately triggered, and the
state machine gets stuck, preventing any other temperature monitoring
interrupts to ever trigger.

(The same happens by not setting the Cold or Hot to Normal thresholds
when using those)

Set the unused threshold to a valid low value. This value was chosen so
that for any valid golden temperature read from the efuse, when the
value is converted to raw and back again to milliCelsius, the result
doesn't underflow.

Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

---

Changes in v2:
- Added this commit

 drivers/thermal/mediatek/lvts_thermal.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Daniel Lezcano June 2, 2023, 9:17 a.m. UTC | #1
On 04/05/2023 02:48, Nícolas F. R. A. Prado wrote:
> The thermal framework might leave the low threshold unset if there
> aren't any lower trip points. This leaves the register zeroed, which
> translates to a very high temperature for the low threshold. The
> interrupt for this threshold is then immediately triggered, and the
> state machine gets stuck, preventing any other temperature monitoring
> interrupts to ever trigger.
> 
> (The same happens by not setting the Cold or Hot to Normal thresholds
> when using those)
> 
> Set the unused threshold to a valid low value. This value was chosen so
> that for any valid golden temperature read from the efuse, when the
> value is converted to raw and back again to milliCelsius, the result
> doesn't underflow.
> 
> Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> 
> ---
> 
> Changes in v2:
> - Added this commit
> 
>   drivers/thermal/mediatek/lvts_thermal.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> index efd1e938e1c2..951a4cb75ef6 100644
> --- a/drivers/thermal/mediatek/lvts_thermal.c
> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> @@ -82,6 +82,8 @@
>   #define LVTS_HW_SHUTDOWN_MT8195		105000
>   #define LVTS_HW_SHUTDOWN_MT8192		105000
>   
> +#define LVTS_MINIUM_THRESHOLD		20000

MINIMUM

So if the thermal zone reaches 20°C, the interrupt fires, the set_trips 
sets again 20°C but the interrupt won't fire until the temperature goes 
above 20°C and then crosses the temperature low threshold the way down 
again?

>   static int golden_temp = LVTS_GOLDEN_TEMP_DEFAULT;
>   static int coeff_b = LVTS_COEFF_B;
>   
> @@ -309,6 +311,11 @@ static int lvts_set_trips(struct thermal_zone_device *tz, int low, int high)
>   		pr_debug("%s: Setting low limit temperature interrupt: %d\n",
>   			 thermal_zone_device_type(tz), low);
>   		writel(raw_low, LVTS_OFFSETL(base));
> +	} else {
> +		pr_debug("%s: Setting low limit temperature to minimum\n",
> +			 thermal_zone_device_type(tz));
> +		raw_low = lvts_temp_to_raw(LVTS_MINIUM_THRESHOLD);
> +		writel(raw_low, LVTS_OFFSETL(base));

That's duplicate code:

u32 raw_low = lvts_temp_to_raw(low != -INT_MAX ? low : 
LVTS_MINIMUM_THRESHOLD);

And then the condition in the code goes away:
         if (low != -INT_MAX) {
	}

>   	}
>   
>   	/*
Nícolas F. R. A. Prado June 29, 2023, 9:10 p.m. UTC | #2
On Fri, Jun 02, 2023 at 11:17:27AM +0200, Daniel Lezcano wrote:
> On 04/05/2023 02:48, Nícolas F. R. A. Prado wrote:
> > The thermal framework might leave the low threshold unset if there
> > aren't any lower trip points. This leaves the register zeroed, which
> > translates to a very high temperature for the low threshold. The
> > interrupt for this threshold is then immediately triggered, and the
> > state machine gets stuck, preventing any other temperature monitoring
> > interrupts to ever trigger.
> > 
> > (The same happens by not setting the Cold or Hot to Normal thresholds
> > when using those)
> > 
> > Set the unused threshold to a valid low value. This value was chosen so
> > that for any valid golden temperature read from the efuse, when the
> > value is converted to raw and back again to milliCelsius, the result
> > doesn't underflow.
> > 
> > Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
> > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> > 
> > ---
> > 
> > Changes in v2:
> > - Added this commit
> > 
> >   drivers/thermal/mediatek/lvts_thermal.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> > index efd1e938e1c2..951a4cb75ef6 100644
> > --- a/drivers/thermal/mediatek/lvts_thermal.c
> > +++ b/drivers/thermal/mediatek/lvts_thermal.c
> > @@ -82,6 +82,8 @@
> >   #define LVTS_HW_SHUTDOWN_MT8195		105000
> >   #define LVTS_HW_SHUTDOWN_MT8192		105000
> > +#define LVTS_MINIUM_THRESHOLD		20000
> 
> MINIMUM
> 
> So if the thermal zone reaches 20°C, the interrupt fires, the set_trips sets
> again 20°C but the interrupt won't fire until the temperature goes above
> 20°C and then crosses the temperature low threshold the way down again?

Well, actually, set_trips won't even be called since from the thermal
framework's perspective we haven't crossed trip points, ie in
__thermal_zone_set_trips():

	/* No need to change trip points */
	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
		return;

But in any case, yes, the interrupt will fire, the temperature will get updated
in the framework, and that's it. It will only fire again when a threshold is
crossed again (either by the temperature rising and falling again below this
minimum, or rising beyond the high treshold).

So basically at most this will cause a spurious interrupt when the temperature
gets low enough. I do get 34-36C on all sensors when idling though, so I doubt
that temperature is even reachable. Besides, we don't really have another option
here if we want working interrupts, the threshold needs to be set to a valid
value, and this is the lowest I've found.

And thanks for all the feedback! I'll prepare a v3 based on your comments.

Thanks,
Nícolas
diff mbox series

Patch

diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index efd1e938e1c2..951a4cb75ef6 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -82,6 +82,8 @@ 
 #define LVTS_HW_SHUTDOWN_MT8195		105000
 #define LVTS_HW_SHUTDOWN_MT8192		105000
 
+#define LVTS_MINIUM_THRESHOLD		20000
+
 static int golden_temp = LVTS_GOLDEN_TEMP_DEFAULT;
 static int coeff_b = LVTS_COEFF_B;
 
@@ -309,6 +311,11 @@  static int lvts_set_trips(struct thermal_zone_device *tz, int low, int high)
 		pr_debug("%s: Setting low limit temperature interrupt: %d\n",
 			 thermal_zone_device_type(tz), low);
 		writel(raw_low, LVTS_OFFSETL(base));
+	} else {
+		pr_debug("%s: Setting low limit temperature to minimum\n",
+			 thermal_zone_device_type(tz));
+		raw_low = lvts_temp_to_raw(LVTS_MINIUM_THRESHOLD);
+		writel(raw_low, LVTS_OFFSETL(base));
 	}
 
 	/*