[PATCH RESEND v10 03/11] thermal: Use generic HW-protection shutdown API

From: Matti Vaittinen
Date: Thu Jun 03 2021 - 01:41:31 EST


The hardware shutdown function was exported from kernel/reboot for
other subsystems to use. Logic is copied from the thermal_core. The
protection mutex is replaced by an atomic_t to allow calls also from
an IRQ context. Also the WARN() was replaced by pr_emerg() based on
discussions here:
https://lore.kernel.org/lkml/YJuPwAZroVZ%2Fw633@alley/
and here:
https://lore.kernel.org/linux-iommu/20210331093104.383705-4-geert+renesas@xxxxxxxxx/

Use the exported API instead of implementing own just for the
thermal_core.

Signed-off-by: Matti Vaittinen <matti.vaittinen@xxxxxxxxxxxxxxxxx>

---
Changelog:
v10:
- update commit message to mention changing WARN() to pr_emerg()
v9:
- Update the thermal documentation
v8:
- new patch (change added in v7, splitted in own patch at v8)

Use the exported API instead
---
.../driver-api/thermal/sysfs-api.rst | 24 +++----
drivers/thermal/thermal_core.c | 63 ++-----------------
2 files changed, 13 insertions(+), 74 deletions(-)

diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst
index 4b638c14bc16..c93fa5e961a0 100644
--- a/Documentation/driver-api/thermal/sysfs-api.rst
+++ b/Documentation/driver-api/thermal/sysfs-api.rst
@@ -740,21 +740,15 @@ possible.
5. thermal_emergency_poweroff
=============================

-On an event of critical trip temperature crossing. Thermal framework
-allows the system to shutdown gracefully by calling orderly_poweroff().
-In the event of a failure of orderly_poweroff() to shut down the system
-we are in danger of keeping the system alive at undesirably high
-temperatures. To mitigate this high risk scenario we program a work
-queue to fire after a pre-determined number of seconds to start
-an emergency shutdown of the device using the kernel_power_off()
-function. In case kernel_power_off() fails then finally
-emergency_restart() is called in the worst case.
+On an event of critical trip temperature crossing the thermal framework
+shuts down the system by calling hw_protection_shutdown(). The
+hw_protection_shutdown() first attempts to perform an orderly shutdown
+but accepts a delay after which it proceeds doing a forced power-off
+or as last resort an emergency_restart.

The delay should be carefully profiled so as to give adequate time for
-orderly_poweroff(). In case of failure of an orderly_poweroff() the
-emergency poweroff kicks in after the delay has elapsed and shuts down
-the system.
+orderly poweroff.

-If set to 0 emergency poweroff will not be supported. So a carefully
-profiled non-zero positive value is a must for emergency poweroff to be
-triggered.
+If the delay is set to 0 emergency poweroff will not be supported. So a
+carefully profiled non-zero positive value is a must for emergency
+poweroff to be triggered.
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d20b25f40d19..10a2d8e1cacf 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -36,10 +36,8 @@ static LIST_HEAD(thermal_governor_list);

static DEFINE_MUTEX(thermal_list_lock);
static DEFINE_MUTEX(thermal_governor_lock);
-static DEFINE_MUTEX(poweroff_lock);

static atomic_t in_suspend;
-static bool power_off_triggered;

static struct thermal_governor *def_governor;

@@ -327,70 +325,18 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
def_governor->throttle(tz, trip);
}

-/**
- * thermal_emergency_poweroff_func - emergency poweroff work after a known delay
- * @work: work_struct associated with the emergency poweroff function
- *
- * This function is called in very critical situations to force
- * a kernel poweroff after a configurable timeout value.
- */
-static void thermal_emergency_poweroff_func(struct work_struct *work)
-{
- /*
- * We have reached here after the emergency thermal shutdown
- * Waiting period has expired. This means orderly_poweroff has
- * not been able to shut off the system for some reason.
- * Try to shut down the system immediately using kernel_power_off
- * if populated
- */
- WARN(1, "Attempting kernel_power_off: Temperature too high\n");
- kernel_power_off();
-
- /*
- * Worst of the worst case trigger emergency restart
- */
- WARN(1, "Attempting emergency_restart: Temperature too high\n");
- emergency_restart();
-}
-
-static DECLARE_DELAYED_WORK(thermal_emergency_poweroff_work,
- thermal_emergency_poweroff_func);
-
-/**
- * thermal_emergency_poweroff - Trigger an emergency system poweroff
- *
- * This may be called from any critical situation to trigger a system shutdown
- * after a known period of time. By default this is not scheduled.
- */
-static void thermal_emergency_poweroff(void)
+void thermal_zone_device_critical(struct thermal_zone_device *tz)
{
- int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
/*
* poweroff_delay_ms must be a carefully profiled positive value.
- * Its a must for thermal_emergency_poweroff_work to be scheduled
+ * Its a must for forced_emergency_poweroff_work to be scheduled.
*/
- if (poweroff_delay_ms <= 0)
- return;
- schedule_delayed_work(&thermal_emergency_poweroff_work,
- msecs_to_jiffies(poweroff_delay_ms));
-}
+ int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;

-void thermal_zone_device_critical(struct thermal_zone_device *tz)
-{
dev_emerg(&tz->device, "%s: critical temperature reached, "
"shutting down\n", tz->type);

- mutex_lock(&poweroff_lock);
- if (!power_off_triggered) {
- /*
- * Queue a backup emergency shutdown in the event of
- * orderly_poweroff failure
- */
- thermal_emergency_poweroff();
- orderly_poweroff(true);
- power_off_triggered = true;
- }
- mutex_unlock(&poweroff_lock);
+ hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
}
EXPORT_SYMBOL(thermal_zone_device_critical);

@@ -1538,7 +1484,6 @@ static int __init thermal_init(void)
ida_destroy(&thermal_cdev_ida);
mutex_destroy(&thermal_list_lock);
mutex_destroy(&thermal_governor_lock);
- mutex_destroy(&poweroff_lock);
return result;
}
postcore_initcall(thermal_init);
--
2.25.4


--
Matti Vaittinen, Linux device drivers
ROHM Semiconductors, Finland SWDC
Kiviharjunlenkki 1E
90220 OULU
FINLAND

~~~ "I don't think so," said Rene Descartes. Just then he vanished ~~~
Simon says - in Latin please.
~~~ "non cogito me" dixit Rene Descarte, deinde evanescavit ~~~
Thanks to Simon Glass for the translation =]

Attachment: signature.asc
Description: PGP signature