Message ID | 3b62226e320ab412357e102baf6d628e354a0b61.1618832466.git.matti.vaittinen@fi.rohmeurope.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Geert Uytterhoeven |
Headers | show |
Series | Extend regulator notification support | expand |
On 19/04/2021 13:49, Matti Vaittinen wrote: > The hardware shutdown function was exported from kernel/reboot for > other subsystems to use. Logic is copied from the thermal_core. The > protection mutex is replaced by an atomic_t to allow calls also from > an IRQ context. > > Use the exported API instead of implementing own just for the > thermal_core. Can you update the documentation: Documentation/driver-api/thermal/sysfs-api.rst 5. thermal_emergency_poweroff Thanks -- Daniel > Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > > --- > Changelog: > v8: > - new patch (change added in v7, splitted in own patch at v8) > > Use the exported API instead > --- > drivers/thermal/thermal_core.c | 63 +++------------------------------- > 1 file changed, 4 insertions(+), 59 deletions(-) > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index 996c038f83a4..b1444845af38 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -36,10 +36,8 @@ static LIST_HEAD(thermal_governor_list); > > static DEFINE_MUTEX(thermal_list_lock); > static DEFINE_MUTEX(thermal_governor_lock); > -static DEFINE_MUTEX(poweroff_lock); > > static atomic_t in_suspend; > -static bool power_off_triggered; > > static struct thermal_governor *def_governor; > > @@ -327,70 +325,18 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip) > def_governor->throttle(tz, trip); > } > > -/** > - * thermal_emergency_poweroff_func - emergency poweroff work after a known delay > - * @work: work_struct associated with the emergency poweroff function > - * > - * This function is called in very critical situations to force > - * a kernel poweroff after a configurable timeout value. > - */ > -static void thermal_emergency_poweroff_func(struct work_struct *work) > -{ > - /* > - * We have reached here after the emergency thermal shutdown > - * Waiting period has expired. This means orderly_poweroff has > - * not been able to shut off the system for some reason. > - * Try to shut down the system immediately using kernel_power_off > - * if populated > - */ > - WARN(1, "Attempting kernel_power_off: Temperature too high\n"); > - kernel_power_off(); > - > - /* > - * Worst of the worst case trigger emergency restart > - */ > - WARN(1, "Attempting emergency_restart: Temperature too high\n"); > - emergency_restart(); > -} > - > -static DECLARE_DELAYED_WORK(thermal_emergency_poweroff_work, > - thermal_emergency_poweroff_func); > - > -/** > - * thermal_emergency_poweroff - Trigger an emergency system poweroff > - * > - * This may be called from any critical situation to trigger a system shutdown > - * after a known period of time. By default this is not scheduled. > - */ > -static void thermal_emergency_poweroff(void) > +void thermal_zone_device_critical(struct thermal_zone_device *tz) > { > - int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS; > /* > * poweroff_delay_ms must be a carefully profiled positive value. > - * Its a must for thermal_emergency_poweroff_work to be scheduled > + * Its a must for forced_emergency_poweroff_work to be scheduled. > */ > - if (poweroff_delay_ms <= 0) > - return; > - schedule_delayed_work(&thermal_emergency_poweroff_work, > - msecs_to_jiffies(poweroff_delay_ms)); > -} > + int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS; > > -void thermal_zone_device_critical(struct thermal_zone_device *tz) > -{ > dev_emerg(&tz->device, "%s: critical temperature reached, " > "shutting down\n", tz->type); > > - mutex_lock(&poweroff_lock); > - if (!power_off_triggered) { > - /* > - * Queue a backup emergency shutdown in the event of > - * orderly_poweroff failure > - */ > - thermal_emergency_poweroff(); > - orderly_poweroff(true); > - power_off_triggered = true; > - } > - mutex_unlock(&poweroff_lock); > + hw_protection_shutdown("Temperature too high", poweroff_delay_ms); > } > EXPORT_SYMBOL(thermal_zone_device_critical); > > @@ -1549,7 +1495,6 @@ static int __init thermal_init(void) > ida_destroy(&thermal_cdev_ida); > mutex_destroy(&thermal_list_lock); > mutex_destroy(&thermal_governor_lock); > - mutex_destroy(&poweroff_lock); > return result; > } > postcore_initcall(thermal_init); >
Hi Daniel, all, On Thu, 2021-04-22 at 10:10 +0200, Daniel Lezcano wrote: > On 19/04/2021 13:49, Matti Vaittinen wrote: > > The hardware shutdown function was exported from kernel/reboot for > > other subsystems to use. Logic is copied from the thermal_core. The > > protection mutex is replaced by an atomic_t to allow calls also > > from > > an IRQ context. > > > > Use the exported API instead of implementing own just for the > > thermal_core. > > Can you update the documentation: > > Documentation/driver-api/thermal/sysfs-api.rst > > 5. thermal_emergency_poweroff > I can. Problem is what to put there. I like the fact that logic of an emergency shut-down is described. Yet, describing in thermal_core documentation what an API hosted in kernel/reboot does sounds like a call for documentation which may not match implementation in the long run. I drafted following: diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst index 29fdd817ddb0..a10bfe6e7293 100644 --- a/Documentation/driver-api/thermal/sysfs-api.rst +++ b/Documentation/driver-api/thermal/sysfs-api.rst @@ -751,20 +751,14 @@ possible. ============================= On an event of critical trip temperature crossing. Thermal framework -allows the system to shutdown gracefully by calling orderly_poweroff(). -In the event of a failure of orderly_poweroff() to shut down the system -we are in danger of keeping the system alive at undesirably high -temperatures. To mitigate this high risk scenario we program a work -queue to fire after a pre-determined number of seconds to start -an emergency shutdown of the device using the kernel_power_off() -function. In case kernel_power_off() fails then finally -emergency_restart() is called in the worst case. +shuts down the system by calling hw_protection_shutdown(). The +hw_protection_shutdown() first attempts to perform an orderly shutdown +but accepts a delay after which it proceeds doing a forced power-off +or an emergency_restart. The delay should be carefully profiled so as to give adequate time for -orderly_poweroff(). In case of failure of an orderly_poweroff() the -emergency poweroff kicks in after the delay has elapsed and shuts down -the system. +orderly poweroff. -If set to 0 emergency poweroff will not be supported. So a carefully -profiled non-zero positive value is a must for emergency poweroff to be -triggered. +If the delay is set to 0 emergency poweroff will not be supported. So a +carefully profiled non-zero positive value is a must for emergency +poweroff to be triggered. but I'm not sure what to think about it. Opinions/suggestions? Best Regards Matti Vaittinen
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 996c038f83a4..b1444845af38 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -36,10 +36,8 @@ static LIST_HEAD(thermal_governor_list); static DEFINE_MUTEX(thermal_list_lock); static DEFINE_MUTEX(thermal_governor_lock); -static DEFINE_MUTEX(poweroff_lock); static atomic_t in_suspend; -static bool power_off_triggered; static struct thermal_governor *def_governor; @@ -327,70 +325,18 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip) def_governor->throttle(tz, trip); } -/** - * thermal_emergency_poweroff_func - emergency poweroff work after a known delay - * @work: work_struct associated with the emergency poweroff function - * - * This function is called in very critical situations to force - * a kernel poweroff after a configurable timeout value. - */ -static void thermal_emergency_poweroff_func(struct work_struct *work) -{ - /* - * We have reached here after the emergency thermal shutdown - * Waiting period has expired. This means orderly_poweroff has - * not been able to shut off the system for some reason. - * Try to shut down the system immediately using kernel_power_off - * if populated - */ - WARN(1, "Attempting kernel_power_off: Temperature too high\n"); - kernel_power_off(); - - /* - * Worst of the worst case trigger emergency restart - */ - WARN(1, "Attempting emergency_restart: Temperature too high\n"); - emergency_restart(); -} - -static DECLARE_DELAYED_WORK(thermal_emergency_poweroff_work, - thermal_emergency_poweroff_func); - -/** - * thermal_emergency_poweroff - Trigger an emergency system poweroff - * - * This may be called from any critical situation to trigger a system shutdown - * after a known period of time. By default this is not scheduled. - */ -static void thermal_emergency_poweroff(void) +void thermal_zone_device_critical(struct thermal_zone_device *tz) { - int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS; /* * poweroff_delay_ms must be a carefully profiled positive value. - * Its a must for thermal_emergency_poweroff_work to be scheduled + * Its a must for forced_emergency_poweroff_work to be scheduled. */ - if (poweroff_delay_ms <= 0) - return; - schedule_delayed_work(&thermal_emergency_poweroff_work, - msecs_to_jiffies(poweroff_delay_ms)); -} + int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS; -void thermal_zone_device_critical(struct thermal_zone_device *tz) -{ dev_emerg(&tz->device, "%s: critical temperature reached, " "shutting down\n", tz->type); - mutex_lock(&poweroff_lock); - if (!power_off_triggered) { - /* - * Queue a backup emergency shutdown in the event of - * orderly_poweroff failure - */ - thermal_emergency_poweroff(); - orderly_poweroff(true); - power_off_triggered = true; - } - mutex_unlock(&poweroff_lock); + hw_protection_shutdown("Temperature too high", poweroff_delay_ms); } EXPORT_SYMBOL(thermal_zone_device_critical); @@ -1549,7 +1495,6 @@ static int __init thermal_init(void) ida_destroy(&thermal_cdev_ida); mutex_destroy(&thermal_list_lock); mutex_destroy(&thermal_governor_lock); - mutex_destroy(&poweroff_lock); return result; } postcore_initcall(thermal_init);
The hardware shutdown function was exported from kernel/reboot for other subsystems to use. Logic is copied from the thermal_core. The protection mutex is replaced by an atomic_t to allow calls also from an IRQ context. Use the exported API instead of implementing own just for the thermal_core. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> --- Changelog: v8: - new patch (change added in v7, splitted in own patch at v8) Use the exported API instead --- drivers/thermal/thermal_core.c | 63 +++------------------------------- 1 file changed, 4 insertions(+), 59 deletions(-)