Message ID | 20250113-mt8192-lvts-filtered-suspend-fix-v2-3-07a25200c7c6@collabora.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Daniel Lezcano |
Series | thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups |
On 13/01/2025 14:27, Nícolas F. R. A. Prado wrote:
> In order to get working interrupts, a low offset value needs to be
> configured. The minimum value for it is 20 Celsius, which is what is
> configured when there's no lower thermal trip (ie the thermal core
> passes -INT_MAX as low trip temperature). However, when the temperature
> gets that low and fluctuates around that value it causes an interrupt
> storm.

Is it really about an irq storm, or about having a temperature threshold set close to the ambient temperature, leading to unnecessary wakeups as there is no need for mitigation?

> Prevent that interrupt storm by not enabling the low offset interrupt if
> the low threshold is the minimum one.

The case where the high threshold is INT_MAX should be handled too. The system may have configured a thermal zone without critical trip points, so setting the next upper threshold will program the register with INT_MAX. I guess it is undefined behavior in this case, right?

> Cc: stable@vger.kernel.org

[ ... ]
On Tue, Jan 14, 2025 at 07:30:31PM +0100, Daniel Lezcano wrote:
> On 13/01/2025 14:27, Nícolas F. R. A. Prado wrote:
> > In order to get working interrupts, a low offset value needs to be
> > configured. The minimum value for it is 20 Celsius, which is what is
> > configured when there's no lower thermal trip (ie the thermal core
> > passes -INT_MAX as low trip temperature). However, when the temperature
> > gets that low and fluctuates around that value it causes an interrupt
> > storm.
>
> Is it really about an irq storm, or about having a temperature threshold
> set close to the ambient temperature, leading to unnecessary wakeups as
> there is no need for mitigation?

Yes, that's what I mean. The irq threshold gets configured to 20C, so whenever the temperature drops below that value, the IRQ gets triggered. But this usually does not happen just once. From the thermal framework's perspective there's no thermal threshold configured for 20C, since that's done from the driver; the framework thinks it's -INT_MAX. So the threshold doesn't get moved after the trigger, and it ends up triggering hundreds or thousands of times in a short span of time, which is why I call it an interrupt storm.

> > Prevent that interrupt storm by not enabling the low offset interrupt if
> > the low threshold is the minimum one.
>
> The case where the high threshold is INT_MAX should be handled too. The
> system may have configured a thermal zone without critical trip points, so
> setting the next upper threshold will program the register with INT_MAX. I
> guess it is undefined behavior in this case, right?

Ah, yes, I don't think I've tested that before... I'll test it and send a fix if needed.

Thanks,
Nícolas
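To make the storm mechanism described in the reply concrete, here is a toy user-space model. It is not driver code. It assumes that each downward crossing of a fixed threshold by a sampled temperature stands in for one low offset interrupt; the function name `count_low_crossings` and the millidegree units are illustrative assumptions, and the real trigger semantics of the LVTS hardware may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (assumption, not LVTS behavior): count how many times a
 * sampled temperature series goes from at-or-above a fixed threshold
 * to below it. Each such crossing stands in for one interrupt.
 * Temperatures and threshold are in millidegrees Celsius.
 */
static unsigned int count_low_crossings(const int *temps_mc, size_t n,
					int thresh_mc)
{
	unsigned int irqs = 0;
	size_t i;

	assert(temps_mc != NULL || n == 0);

	for (i = 1; i < n; i++) {
		/* A crossing needs the previous sample on the other side. */
		if (temps_mc[i - 1] >= thresh_mc && temps_mc[i] < thresh_mc)
			irqs++;
	}

	return irqs;
}
```

With samples hovering around 20 C, every dip below the fixed threshold counts again, because nothing moves the threshold after the first trigger. A threshold well below the samples never fires in this model.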
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index 0aaa44b734ca43e6abfd97b2ca4ce34dc6f15826..04bfbfe93a71ee9e3428bfd7f8bd359fe9446e88 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -67,10 +67,14 @@
 #define LVTS_CALSCALE_CONF			0x300
 #define LVTS_MONINT_CONF			0x0300318C
 
-#define LVTS_MONINT_OFFSET_SENSOR0		0xC
-#define LVTS_MONINT_OFFSET_SENSOR1		0x180
-#define LVTS_MONINT_OFFSET_SENSOR2		0x3000
-#define LVTS_MONINT_OFFSET_SENSOR3		0x3000000
+#define LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR0	BIT(3)
+#define LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR1	BIT(8)
+#define LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR2	BIT(13)
+#define LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR3	BIT(25)
+#define LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR0	BIT(2)
+#define LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR1	BIT(7)
+#define LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR2	BIT(12)
+#define LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR3	BIT(24)
 
 #define LVTS_INT_SENSOR0			0x0009001F
 #define LVTS_INT_SENSOR1			0x001203E0
@@ -326,11 +330,17 @@ static int lvts_get_temp(struct thermal_zone_device *tz, int *temp)
 
 static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
 {
-	static const u32 masks[] = {
-		LVTS_MONINT_OFFSET_SENSOR0,
-		LVTS_MONINT_OFFSET_SENSOR1,
-		LVTS_MONINT_OFFSET_SENSOR2,
-		LVTS_MONINT_OFFSET_SENSOR3,
+	static const u32 high_offset_inten_masks[] = {
+		LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR0,
+		LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR1,
+		LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR2,
+		LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR3,
+	};
+	static const u32 low_offset_inten_masks[] = {
+		LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR0,
+		LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR1,
+		LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR2,
+		LVTS_MONINT_OFFSET_LOW_INTEN_SENSOR3,
 	};
 	u32 value = 0;
 	int i;
@@ -339,10 +349,22 @@ static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
 
 	for (i = 0; i < ARRAY_SIZE(masks); i++) {
 		if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
-		    && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh)
-			value |= masks[i];
-		else
-			value &= ~masks[i];
+		    && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh) {
+			/*
+			 * The minimum threshold needs to be configured in the
+			 * OFFSETL register to get working interrupts, but we
+			 * don't actually want to generate interrupts when
+			 * crossing it.
+			 */
+			if (lvts_ctrl->low_thresh == -INT_MAX) {
+				value &= ~low_offset_inten_masks[i];
+				value |= high_offset_inten_masks[i];
+			} else {
+				value |= low_offset_inten_masks[i] | high_offset_inten_masks[i];
+			}
+		} else {
+			value &= ~(low_offset_inten_masks[i] | high_offset_inten_masks[i]);
+		}
 	}
 
 	writel(value, LVTS_MONINT(lvts_ctrl->base));
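
The per-sensor bit selection in the patch can be checked outside the kernel with a small user-space sketch. The bit positions are copied from the diff; the helper name `update_sensor_bits`, the `thresholds_match` flag (standing in for the comparison against the controller's thresholds), and the local `BIT()` definition are assumptions made for illustration and are not part of the driver.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (UINT32_C(1) << (n))

/* Bit positions copied from the patch. */
static const uint32_t high_offset_inten_masks[] = {
	BIT(3), BIT(8), BIT(13), BIT(25),
};
static const uint32_t low_offset_inten_masks[] = {
	BIT(2), BIT(7), BIT(12), BIT(24),
};

#define NUM_SENSORS \
	(sizeof(high_offset_inten_masks) / sizeof(high_offset_inten_masks[0]))

/*
 * Hypothetical helper mirroring the loop body of lvts_update_irq_mask():
 * returns the interrupt-enable value after updating the bits of sensor i.
 */
static uint32_t update_sensor_bits(uint32_t value, unsigned int i,
				   bool thresholds_match, int low_thresh)
{
	assert(i < NUM_SENSORS);

	/* Sensor does not own the active thresholds: disable both bits. */
	if (!thresholds_match)
		return value & ~(low_offset_inten_masks[i] |
				 high_offset_inten_masks[i]);

	/* No lower trip: keep the high interrupt, mask the low one. */
	if (low_thresh == -INT_MAX)
		return (value & ~low_offset_inten_masks[i]) |
		       high_offset_inten_masks[i];

	return value | low_offset_inten_masks[i] | high_offset_inten_masks[i];
}
```

For each sensor, the low and high bits together equal the combined mask the patch removes (0xC, 0x180, 0x3000, 0x3000000), so the only behavioral change in this model is the `-INT_MAX` case, where the low bit stays cleared.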