
[RFC,2/2] arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate

Message ID 20240312135958.727765-3-dwmw2@infradead.org (mailing list archive)
State RFC, archived
Series Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

Commit Message

David Woodhouse March 12, 2024, 1:51 p.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk>

The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
function which is analogous to ACPI S4 state. This will allow hosting
environments to determine that a guest is hibernated rather than just
powered off, and handle that state appropriately on subsequent launches.

Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
poweroff") the EFI shutdown method is deliberately preferred over PSCI
or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
*only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
a last resort via the legacy pm_power_off function pointer.

The hibernation code already exports a system_entering_hibernation()
function which can be used by the higher-priority handler to check for
hibernation. That existing function just returns the value of a static
boolean variable from hibernate.c, which was previously only set in the
hibernation_platform_enter() code path. Set the same flag in the simpler
code path around the call to kernel_power_off() too.

An alternative way to hook SYSTEM_OFF2 into the hibernation code would
be to register a platform_hibernation_ops structure with an ->enter()
method which makes the new SYSTEM_OFF2 call. But that would have the
unwanted side-effect of making hibernation take a completely different
code path in hibernation_platform_enter(), invoking a lot of special dpm
callbacks.

Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
indicate whether the power off is for hibernation.

But this version works and is relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
 kernel/power/hibernate.c     |  5 ++++-
 2 files changed, 39 insertions(+), 1 deletion(-)
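
Reader's note: the priority argument in the commit message can be sketched
as a small userspace model. This is not kernel code. Every MODEL_* and
model_* name is hypothetical, the numeric priority is only illustrative, and
placing the EFI handler at "firmware + 1" is an assumption inferred from the
patch choosing "+ 2". The model shows a hibernate-only handler sitting above
the EFI one and declining unless the hibernation flag is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
enum model_action { ACT_NONE, ACT_PSCI_SYSTEM_OFF2, ACT_EFI_POWEROFF };

#define MODEL_NOTIFY_DONE   0    /* handler declined, try the next one */
#define MODEL_NOTIFY_STOP   1    /* handler "powered off" the machine */
#define MODEL_PRIO_FIRMWARE 224  /* illustrative value, an assumption */

/* Stands in for the flag behind system_entering_hibernation(). */
static bool entering_hibernation;
static enum model_action last_action;

/* Hibernate-only handler: acts only while the flag is set. */
static int model_psci_hibernate(void)
{
	if (!entering_hibernation)
		return MODEL_NOTIFY_DONE;
	last_action = ACT_PSCI_SYSTEM_OFF2;
	return MODEL_NOTIFY_STOP;
}

/* Stand-in for the EFI power-off handler: always acts. */
static int model_efi_poweroff(void)
{
	last_action = ACT_EFI_POWEROFF;
	return MODEL_NOTIFY_STOP;
}

struct model_handler {
	int priority;
	int (*cb)(void);
};

/* Listed in descending priority, the order in which handlers are tried. */
static const struct model_handler chain[] = {
	{ MODEL_PRIO_FIRMWARE + 2, model_psci_hibernate },
	{ MODEL_PRIO_FIRMWARE + 1, model_efi_poweroff },
};

/* Mirrors the hibernate.c hunk: set the flag only around the power-off. */
static enum model_action model_power_off(bool hibernating)
{
	last_action = ACT_NONE;
	entering_hibernation = hibernating;
	for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++) {
		if (chain[i].cb() == MODEL_NOTIFY_STOP)
			break;
	}
	entering_hibernation = false;
	return last_action;
}
```

In this model, model_power_off(true) ends in the SYSTEM_OFF2 stand-in, while
model_power_off(false) falls through to the EFI stand-in, which is the
behaviour the commit message describes for an ordinary shutdown.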

Comments

Sudeep Holla March 12, 2024, 3:57 p.m. UTC | #1
On Tue, Mar 12, 2024 at 01:51:29PM +0000, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
> function which is analogous to ACPI S4 state. This will allow hosting
> environments to determine that a guest is hibernated rather than just
> powered off, and handle that state appropriately on subsequent launches.
> 
> Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
> poweroff") the EFI shutdown method is deliberately preferred over PSCI
> or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
> *only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
> a last resort via the legacy pm_power_off function pointer.
> 
> The hibernation code already exports a system_entering_hibernation()
> function which can be used by the higher-priority handler to check for
> hibernation. That existing function just returns the value of a static
> boolean variable from hibernate.c, which was previously only set in the
> hibernation_platform_enter() code path. Set the same flag in the simpler
> code path around the call to kernel_power_off() too.
> 
> An alternative way to hook SYSTEM_OFF2 into the hibernation code would
> be to register a platform_hibernation_ops structure with an ->enter()
> method which makes the new SYSTEM_OFF2 call. But that would have the
> unwanted side-effect of making hibernation take a completely different
> code path in hibernation_platform_enter(), invoking a lot of special dpm
> callbacks.
> 
> Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
> fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
> indicate whether the power off is for hibernation.
> 
> But this version works and is relatively simple.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
>  kernel/power/hibernate.c     |  5 ++++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index d9629ff87861..69d2f6969438 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -78,6 +78,7 @@ struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
>  
>  static u32 psci_cpu_suspend_feature;
>  static bool psci_system_reset2_supported;
> +static bool psci_system_off2_supported;
>  
>  static inline bool psci_has_ext_power_state(void)
>  {
> @@ -333,6 +334,28 @@ static void psci_sys_poweroff(void)
>  	invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
>  }
>  
> +#ifdef CONFIG_HIBERNATION
> +static int psci_sys_hibernate(struct sys_off_data *data)
> +{
> +	if (system_entering_hibernation())
> +		invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
> +			       PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
> +	return NOTIFY_DONE;
> +}
> +
> +static int __init psci_hibernate_init(void)
> +{
> +	if (psci_system_off2_supported) {
> +		/* Higher priority than EFI shutdown, but only for hibernate */
> +		register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
> +					 SYS_OFF_PRIO_FIRMWARE + 2,
> +					 psci_sys_hibernate, NULL);
> +	}
> +	return 0;
> +}
> +subsys_initcall(psci_hibernate_init);

Looked briefly at register_sys_off_handler and it should be OK to call
it from psci_init_system_off2() below. Any particular reason for having
separate initcall to do this ? We can even eliminate the need for
psci_init_system_off2 if it can be called from there. What am I missing ?

--
Regards,
Sudeep
David Woodhouse March 12, 2024, 4:36 p.m. UTC | #2
On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> Looked briefly at register_sys_off_handler and it should be OK to call
> it from psci_init_system_off2() below. Any particular reason for having
> separate initcall to do this ? We can even eliminate the need for
> psci_init_system_off2 if it can be called from there. What am I missing ?

My first attempt did that. I don't think we can kmalloc that early:

[    0.000000] psci: SMC Calling Convention v1.1
[    0.000000] Unable to handle kernel read from unreadable memory at virtual address 0000000000000018
[    0.000000] Mem abort info:
[    0.000000]   ESR = 0x0000000096000004
[    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.000000]   SET = 0, FnV = 0
[    0.000000]   EA = 0, S1PTW = 0
[    0.000000]   FSC = 0x04: level 0 translation fault
[    0.000000] Data abort info:
[    0.000000]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    0.000000]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.000000]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.000000] [0000000000000018] user address but active_mm is swapper
[    0.000000] Internal error: Oops: 0000000096000004 [#1] SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc3+ #30
[    0.000000] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.000000] pc : kmalloc_trace+0x138/0x340
[    0.000000] lr : register_sys_off_handler+0x60/0x258
[    0.000000] sp : ffff8000827d3d10
[    0.000000] x29: ffff8000827d3d20 x28: 000000005cd7e0ac x27: 0000000000001f3f
[    0.000000] x26: 0000000000000000 x25: ffff8000802bd890 x24: ffff8000802bd890
[    0.000000] x23: 0000000000000040 x22: 0000000000000dc0 x21: 0000000000000001
[    0.000000] x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000006
[    0.000000] x17: 000000000036fd40 x16: 000000005ec902c0 x15: ffff8000827d37c0
[    0.000000] x14: 0000000000000000 x13: 312e3176206e6f69 x12: 746e65766e6f4320
[    0.000000] x11: 00000000ffffdfff x10: ffff8000828cebe0 x9 : ffff80008281ea10
[    0.000000] x8 : ffff8000827d3d78 x7 : 0000000000000000 x6 : 0000000000000000
[    0.000000] x5 : 0000000000000000 x4 : ffff8000827e0000 x3 : ffff8000827f41c0
[    0.000000] x2 : 0000000000000040 x1 : 0000000000000dc0 x0 : 0000000000000000
[    0.000000] Call trace:
[    0.000000]  kmalloc_trace+0x138/0x340
[    0.000000]  register_sys_off_handler+0x60/0x258
[    0.000000]  psci_probe+0x2cc/0x350
[    0.000000]  psci_acpi_init+0x50/0x88
[    0.000000]  setup_arch+0x194/0x278
[    0.000000]  start_kernel+0x7c/0x410
[    0.000000]  __primary_switched+0xb8/0xc8
[    0.000000] Code: b5000f7a f94003f4 aa1803fe d50320ff (b9401a64) 
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
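
Reader's note: the call trace shows register_sys_off_handler() reaching
kmalloc_trace() from psci_probe() under setup_arch(). The patch avoids this
by splitting the work in two: the early probe only records a flag, and a
later initcall does the registration. A minimal userspace sketch of that
two-phase pattern follows. All early_* names are hypothetical, and the model
fails cleanly with NULL where the real early call crashed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names exist in the kernel. */
static bool early_allocator_ready;  /* false during early boot */
static bool early_off2_supported;   /* recorded by the probe phase */
static void *early_handler;         /* non-NULL once registered */

/* Stand-in for a registration function that needs to allocate memory. */
static void *early_register(void)
{
	if (!early_allocator_ready)
		return NULL;    /* model: fail cleanly instead of oopsing */
	return malloc(64);
}

/* Phase 1, early probe: remember the capability, allocate nothing. */
static void early_probe(bool firmware_has_off2)
{
	early_off2_supported = firmware_has_off2;
}

/* Phase 2, initcall time: the allocator is up, so registering is safe. */
static int early_hibernate_init(void)
{
	if (early_off2_supported)
		early_handler = early_register();
	return 0;
}
```

The design cost is one extra static flag and one extra initcall, in exchange
for never allocating before the allocator exists.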
Sudeep Holla March 13, 2024, 3:34 p.m. UTC | #3
On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
> On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> > Looked briefly at register_sys_off_handler and it should be OK to call
> > it from psci_init_system_off2() below. Any particular reason for having
> > separate initcall to do this ? We can even eliminate the need for
> > psci_init_system_off2 if it can be called from there. What am I missing ?
>
> My first attempt did that. I don't think we can kmalloc that early:
>

That was my initial guess. But a quick hack on my setup, running on the
FVP model, didn't complain. I think either I messed up or something else
is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
for the response.
Sudeep Holla March 14, 2024, 11:09 a.m. UTC | #4
On Wed, Mar 13, 2024 at 03:34:44PM +0000, Sudeep Holla wrote:
> On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
> > On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> > > Looked briefly at register_sys_off_handler and it should be OK to call
> > > it from psci_init_system_off2() below. Any particular reason for having
> > > separate initcall to do this ? We can even eliminate the need for
> > > psci_init_system_off2 if it can be called from there. What am I missing ?
> >
> > My first attempt did that. I don't think we can kmalloc that early:
> >
>
> That was my initial guess. But a quick hack on my setup, running on the
> FVP model, didn't complain. I think either I messed up or something else
> is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
> for the response.
>

OK, it was indeed giving -ENOMEM, which in my hack didn't get propagated
properly.
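
Reader's note: the thread does not say how the hack lost the error, so the
following is only one hypothetical way it can happen. The kernel reports
this kind of failure as an error pointer (the ERR_PTR() convention), which
is non-NULL, so a plain NULL check lets it through. The helpers below are
userspace re-implementations for illustration, and every errp_* name is made
up. Note that the patch as posted also discards the return value of
register_sys_off_handler().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the error-pointer convention; names are made up. */
#define ERRP_MAX_ERRNO 4095

static void *errp_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long errp_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top ERRP_MAX_ERRNO values of the address range. */
static int errp_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-ERRP_MAX_ERRNO;
}

static int errp_alloc_fails = 1;  /* simulate the too-early allocation */
static int errp_slot;             /* dummy object for the success case */

/* Stand-in for a registration that reports failure as an error pointer. */
static void *errp_register_handler(void)
{
	if (errp_alloc_fails)
		return errp_err_ptr(-ENOMEM);
	return &errp_slot;
}

/* Loses the error: an error pointer is not NULL. */
static int errp_init_unchecked(void)
{
	void *h = errp_register_handler();

	return h ? 0 : -ENOMEM;
}

/* Propagates the error to the caller. */
static int errp_init_checked(void)
{
	void *h = errp_register_handler();

	return errp_is_err(h) ? (int)errp_ptr_err(h) : 0;
}
```

With the simulated failure enabled, errp_init_unchecked() reports success
while errp_init_checked() returns -ENOMEM.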
David Woodhouse March 14, 2024, 11:27 a.m. UTC | #5
On 14 March 2024 12:09:11 CET, Sudeep Holla <sudeep.holla@arm.com> wrote:
>On Wed, Mar 13, 2024 at 03:34:44PM +0000, Sudeep Holla wrote:
>> On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
>> > On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
>> > > Looked briefly at register_sys_off_handler and it should be OK to call
>> > > it from psci_init_system_off2() below. Any particular reason for having
>> > > separate initcall to do this ? We can even eliminate the need for
>> > > psci_init_system_off2 if it can be called from there. What am I missing ?
>> >
>> > My first attempt did that. I don't think we can kmalloc that early:
>> >
>>
>> That was my initial guess. But a quick hack on my setup, running on the
>> FVP model, didn't complain. I think either I messed up or something else
>> is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
>> for the response.
>>
>
>OK, it was indeed giving -ENOMEM, which in my hack didn't get propagated
>properly.

Patch

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index d9629ff87861..69d2f6969438 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -78,6 +78,7 @@  struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
 
 static u32 psci_cpu_suspend_feature;
 static bool psci_system_reset2_supported;
+static bool psci_system_off2_supported;
 
 static inline bool psci_has_ext_power_state(void)
 {
@@ -333,6 +334,28 @@  static void psci_sys_poweroff(void)
 	invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 }
 
+#ifdef CONFIG_HIBERNATION
+static int psci_sys_hibernate(struct sys_off_data *data)
+{
+	if (system_entering_hibernation())
+		invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
+			       PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
+	return NOTIFY_DONE;
+}
+
+static int __init psci_hibernate_init(void)
+{
+	if (psci_system_off2_supported) {
+		/* Higher priority than EFI shutdown, but only for hibernate */
+		register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
+					 SYS_OFF_PRIO_FIRMWARE + 2,
+					 psci_sys_hibernate, NULL);
+	}
+	return 0;
+}
+subsys_initcall(psci_hibernate_init);
+#endif
+
 static int psci_features(u32 psci_func_id)
 {
 	return invoke_psci_fn(PSCI_1_0_FN_PSCI_FEATURES,
@@ -364,6 +387,7 @@  static const struct {
 	PSCI_ID_NATIVE(1_1, SYSTEM_RESET2),
 	PSCI_ID(1_1, MEM_PROTECT),
 	PSCI_ID_NATIVE(1_1, MEM_PROTECT_CHECK_RANGE),
+	PSCI_ID_NATIVE(1_3, SYSTEM_OFF2),
 };
 
 static int psci_debugfs_read(struct seq_file *s, void *data)
@@ -523,6 +547,16 @@  static void __init psci_init_system_reset2(void)
 		psci_system_reset2_supported = true;
 }
 
+static void __init psci_init_system_off2(void)
+{
+	int ret;
+
+	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
+
+	if (ret != PSCI_RET_NOT_SUPPORTED)
+		psci_system_off2_supported = true;
+}
+
 static void __init psci_init_system_suspend(void)
 {
 	int ret;
@@ -653,6 +687,7 @@  static int __init psci_probe(void)
 		psci_init_cpu_suspend();
 		psci_init_system_suspend();
 		psci_init_system_reset2();
+		psci_init_system_off2();
 		kvm_init_hyp_services();
 	}
 
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 4b0b7cf2e019..ac87b3cb670c 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -676,8 +676,11 @@  static void power_down(void)
 		}
 		fallthrough;
 	case HIBERNATION_SHUTDOWN:
-		if (kernel_can_power_off())
+		if (kernel_can_power_off()) {
+			entering_platform_hibernation = true;
 			kernel_power_off();
+			entering_platform_hibernation = false;
+		}
 		break;
 	}
 	kernel_halt();