Message ID | 20201221172345.36976-1-kai.heng.feng@canonical.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Zhang Rui |
Series | [v2,1/2] thermal: int340x: Fix unexpected shutdown at critical temperature |
On Tue, Dec 22, 2020 at 1:23 AM Kai-Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
> We are seeing thermal shutdown on Intel based mobile workstations, the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
>
> However, we shouldn't do a thermal shutdown here, since
> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
> to handle thermal shutdown.
>
> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> "... If this object it present under a device, the device’s driver
> evaluates this object to determine the device’s critical cooling
> temperature trip point. This value may then be used by the device’s
> driver to program an internal device temperature sensor trip point."
>
> So a "critical trip" here merely means we should take a more aggressive
> cooling method.
>
> As int340x device isn't present under ACPI ThermalZone, override the
> default .critical callback to prevent surprising thermal shutdown.
>
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

A gentle ping...

> ---
> v2:
>  - Amend subject.
>  - Remove int3400 device.
>
>  .../thermal/intel/int340x_thermal/int340x_thermal_zone.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> index 6e479deff76b..d1248ba943a4 100644
> --- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> +++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> @@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
>  	return 0;
>  }
>
> +static void int340x_thermal_critical(struct thermal_zone_device *zone)
> +{
> +	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
> +}
> +
>  static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
>  	.get_temp = int340x_thermal_get_zone_temp,
>  	.get_trip_temp = int340x_thermal_get_trip_temp,
>  	.get_trip_type = int340x_thermal_get_trip_type,
>  	.set_trip_temp = int340x_thermal_set_trip_temp,
>  	.get_trip_hyst = int340x_thermal_get_trip_hyst,
> +	.critical = int340x_thermal_critical,
>  };
>
>  static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
> --
> 2.29.2
>
On 11/01/2021 17:18, Kai-Heng Feng wrote:
> On Tue, Dec 22, 2020 at 1:23 AM Kai-Heng Feng
> <kai.heng.feng@canonical.com> wrote:
>>
>> [...]
>>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>
> A gentle ping...

Applied, they are in the testing branch now. They will be in linux-next
in a couple of days.

Thanks

  -- Daniel
diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
index 6e479deff76b..d1248ba943a4 100644
--- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
@@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
 	return 0;
 }
 
+static void int340x_thermal_critical(struct thermal_zone_device *zone)
+{
+	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
+}
+
 static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
 	.get_temp = int340x_thermal_get_zone_temp,
 	.get_trip_temp = int340x_thermal_get_trip_temp,
 	.get_trip_type = int340x_thermal_get_trip_type,
 	.set_trip_temp = int340x_thermal_set_trip_temp,
 	.get_trip_hyst = int340x_thermal_get_trip_hyst,
+	.critical = int340x_thermal_critical,
 };
 
 static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
We are seeing thermal shutdown on Intel based mobile workstations, the
shutdown happens during the first trip handle in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.

2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."

So a "critical trip" here merely means we should take a more aggressive
cooling method.

As int340x device isn't present under ACPI ThermalZone, override the
default .critical callback to prevent surprising thermal shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v2:
 - Amend subject.
 - Remove int3400 device.

 .../thermal/intel/int340x_thermal/int340x_thermal_zone.c | 6 ++++++
 1 file changed, 6 insertions(+)
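
For readers unfamiliar with the pattern, the effect of overriding the
default .critical callback can be modelled in plain userspace C. The
sketch below is not kernel code and does not reproduce the thermal core:
struct zone, struct zone_ops, default_critical(), log_only_critical(),
handle_critical_trip() and trip_requests_shutdown() are all hypothetical
names made up for illustration. It assumes only what the commit message
states, namely that a default handler shuts the system down unless the
driver supplies its own .critical hook.

```c
/*
 * Userspace model of an ops table with an optional .critical hook.
 * Every name in this file is hypothetical; it only illustrates the
 * "override the default callback" idea from the commit message.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct zone;

struct zone_ops {
	/* Optional hook; NULL means "fall back to the default policy". */
	void (*critical)(struct zone *z);
};

struct zone {
	const char *type;
	const struct zone_ops *ops;
	bool shutdown_requested;	/* stands in for a real power-off */
};

/* Default policy: a critical trip requests a shutdown. */
static void default_critical(struct zone *z)
{
	printf("%s: critical temperature reached, shutting down\n", z->type);
	z->shutdown_requested = true;
}

/* Override in the spirit of int340x_thermal_critical(): log only. */
static void log_only_critical(struct zone *z)
{
	printf("%s: critical temperature reached\n", z->type);
}

/* Dispatch: use the driver's hook if it set one, else the default. */
static void handle_critical_trip(struct zone *z)
{
	if (z->ops && z->ops->critical)
		z->ops->critical(z);
	else
		default_critical(z);
}

/* One ops table without a hook, one with the log-only override. */
const struct zone_ops default_ops = { .critical = NULL };
const struct zone_ops log_only_ops = { .critical = log_only_critical };

/* Report whether a critical trip on a zone using 'ops' asks for shutdown. */
bool trip_requests_shutdown(const struct zone_ops *ops)
{
	struct zone z = { .type = "demo_zone", .ops = ops };

	assert(ops != NULL);
	handle_critical_trip(&z);
	return z.shutdown_requested;
}
```

In this model, trip_requests_shutdown(&default_ops) returns true, while
trip_requests_shutdown(&log_only_ops) returns false: the zone still sees
the critical trip, but what to do about it is left to something else,
such as a userspace daemon.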