[v2] x86/shutdown: change default reboot method preference

Message ID 20230915074347.94712-1-roger.pau@citrix.com (mailing list archive)
State New, archived
Series [v2] x86/shutdown: change default reboot method preference

Commit Message

Roger Pau Monné Sept. 15, 2023, 7:43 a.m. UTC
The current logic to choose the preferred reboot method is based on the mode Xen
has been booted into, so if the box is booted from UEFI, the preferred reboot
method will be to use the ResetSystem() run time service call.

However, that method seems to be widely untested, and quite often leads to a
result similar to:

Hardware Dom0 shutdown: rebooting machine
----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
CPU:    0
RIP:    e008:[<0000000000000017>] 0000000000000017
RFLAGS: 0000000000010202   CONTEXT: hypervisor
[...]
Xen call trace:
   [<0000000000000017>] R 0000000000000017
   [<ffff83207eff7b50>] S ffff83207eff7b50
   [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
   [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
   [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
   [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
   [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
   [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
   [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
   [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
   [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee

****************************************
Panic on CPU 0:
FATAL TRAP: vector = 6 (invalid opcode)
****************************************

Which in most cases does lead to a reboot, however that's unreliable.

Change the default reboot preference to prefer ACPI over UEFI if available and
not in reduced hardware mode.

This is in line with what Linux does, so it's unlikely to cause issues on current
and future hardware, since there's a much higher chance of vendors testing
hardware with Linux rather than Xen.

Add a special case for one Acer model that does require being rebooted using
ResetSystem().  See Linux commit 0082517fa4bce for rationale.

I'm not aware of using ACPI reboot causing issues on boxes that do have
properly implemented ResetSystem() methods.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Add special case for Acer model to use UEFI reboot.
 - Adjust commit message.
---
 xen/arch/x86/shutdown.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)
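
A minimal, self-contained sketch of the ordering change described above, modelled on the default_reboot_type() hunk quoted in the comments below. The struct platform type and the old_default()/new_default() helpers are hypothetical stand-ins for Xen's global state (xen_guest, acpi_disabled, acpi_gbl_reduced_hardware, efi_enabled(EFI_RS)); this is not the code in xen/arch/x86/shutdown.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Reboot methods named in the patch; the enum values are illustrative. */
enum reboot_type { BOOT_XEN, BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* Hypothetical snapshot of the state the real function reads from globals. */
struct platform {
    bool xen_guest;        /* running as a Xen guest */
    bool acpi_disabled;    /* ACPI not in use */
    bool acpi_reduced_hw;  /* ACPI hardware-reduced mode */
    bool efi_rs;           /* EFI runtime services usable */
};

/* Ordering before the patch: EFI wins whenever runtime services exist. */
enum reboot_type old_default(const struct platform *p)
{
    if ( p->xen_guest )
        return BOOT_XEN;
    if ( p->efi_rs )
        return BOOT_EFI;
    if ( p->acpi_disabled )
        return BOOT_KBD;
    return BOOT_ACPI;
}

/* Ordering after the patch: ACPI first, unless disabled or hw-reduced. */
enum reboot_type new_default(const struct platform *p)
{
    if ( p->xen_guest )
        return BOOT_XEN;
    if ( !p->acpi_disabled && !p->acpi_reduced_hw )
        return BOOT_ACPI;
    if ( p->efi_rs )
        return BOOT_EFI;
    return BOOT_KBD;
}

/* A UEFI-booted box with regular ACPI: the default moves from EFI to ACPI. */
void example_uefi_box(void)
{
    struct platform p = { .efi_rs = true };

    assert(old_default(&p) == BOOT_EFI);
    assert(new_default(&p) == BOOT_ACPI);
}
```

On a hardware-reduced ACPI system with EFI runtime services the sketch still selects BOOT_EFI, which matches the "not in reduced hardware mode" condition in the commit message.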

Comments

Jan Beulich Sept. 18, 2023, 12:26 p.m. UTC | #1
On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to choose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
> 
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
> 
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> CPU:    0
> RIP:    e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> [...]
> Xen call trace:
>    [<0000000000000017>] R 0000000000000017
>    [<ffff83207eff7b50>] S ffff83207eff7b50
>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> 
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
> 
> Which in most cases does lead to a reboot, however that's unreliable.
> 
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
> 
> This is in line with what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.

I certainly appreciate this as a goal. However, ...

> Add a special case for one Acer model that does require being rebooted using
> ResetSystem().  See Linux commit 0082517fa4bce for rationale.

... this is precisely what I'd like to avoid: Needing workarounds on spec-
conforming systems.

> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.

I'm also puzzled by this statement: That Acer aspect is a clear indication
of there being an issue. Plus it's quite easy to see that hooks may be put
in place by various firmware components that would then be used to make
certain adjustments to the platform, ahead of an orderly reboot / shutdown.

> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>  
>      if ( xen_guest )
>          reboot_type = BOOT_XEN;
> +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> +        reboot_type = BOOT_ACPI;
>      else if ( efi_enabled(EFI_RS) )
>          reboot_type = BOOT_EFI;
> -    else if ( acpi_disabled )
> -        reboot_type = BOOT_KBD;
>      else
> -        reboot_type = BOOT_ACPI;
> +        reboot_type = BOOT_KBD;
>  }
>  
>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>  {
>      enum reboot_type type = (long)d->driver_data;
>  
> -    if ( type == BOOT_ACPI && acpi_disabled )
> +    if ( (type == BOOT_ACPI && acpi_disabled) ||
> +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>          type = BOOT_KBD;

I guess I don't follow this adjustment: Why would we fall back to KBD
first thing? Wouldn't it make sense to try ACPI first if EFI cannot
be used? And go further to KBD only if ACPI then also turns out
disabled (a mode that Xen quite likely won't correctly operate in
anymore anyway, due to bitrot)?

As an aside, KBD likely is unusable on hw-reduced systems, for there
simply not being a legacy keyboard controller. Instead we may need to
fall back to CF9 in such a case.

Jan
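
To make the two fallback orders under discussion concrete, here is a small sketch comparing the DMI override as written in the patch with the alternative suggested in the review (try ACPI before KBD when EFI cannot be used). Both functions are simplified, hypothetical models that take the relevant state as parameters; the CF9 fallback for hw-reduced systems mentioned above is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the methods relevant to the override discussion. */
enum reboot_type { BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* As in the patch: any unusable requested method falls straight back to KBD. */
enum reboot_type override_as_patched(enum reboot_type type,
                                     bool acpi_disabled, bool efi_rs)
{
    if ( (type == BOOT_ACPI && acpi_disabled) ||
         (type == BOOT_EFI && !efi_rs) )
        type = BOOT_KBD;

    return type;
}

/* As suggested in review: EFI falls back to ACPI, and only then to KBD. */
enum reboot_type override_as_suggested(enum reboot_type type,
                                       bool acpi_disabled, bool efi_rs)
{
    if ( type == BOOT_EFI && !efi_rs )
        type = BOOT_ACPI;
    if ( type == BOOT_ACPI && acpi_disabled )
        type = BOOT_KBD;

    return type;
}

/* A quirk requests EFI, runtime services are unavailable, ACPI is enabled. */
void example_efi_quirk_without_rs(void)
{
    assert(override_as_patched(BOOT_EFI, false, false) == BOOT_KBD);
    assert(override_as_suggested(BOOT_EFI, false, false) == BOOT_ACPI);
}
```

The two variants only differ when a quirk requests BOOT_EFI, runtime services are unavailable, and ACPI is enabled; in every other combination they pick the same method.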
Roger Pau Monné Sept. 18, 2023, 3:09 p.m. UTC | #2
On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> On 15.09.2023 09:43, Roger Pau Monne wrote:
> > The current logic to choose the preferred reboot method is based on the mode Xen
> > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > method will be to use the ResetSystem() run time service call.
> > 
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> > 
> > Hardware Dom0 shutdown: rebooting machine
> > ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> > CPU:    0
> > RIP:    e008:[<0000000000000017>] 0000000000000017
> > RFLAGS: 0000000000010202   CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> >    [<0000000000000017>] R 0000000000000017
> >    [<ffff83207eff7b50>] S ffff83207eff7b50
> >    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> > 
> > ****************************************
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > ****************************************
> > 
> > Which in most cases does lead to a reboot, however that's unreliable.
> > 
> > Change the default reboot preference to prefer ACPI over UEFI if available and
> > not in reduced hardware mode.
> > 
> > This is in line with what Linux does, so it's unlikely to cause issues on current
> > and future hardware, since there's a much higher chance of vendors testing
> > hardware with Linux rather than Xen.
> 
> I certainly appreciate this as a goal. However, ...
> 
> > Add a special case for one Acer model that does require being rebooted using
> > ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> 
> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> conforming systems.

I wouldn't call that platform spec-conforming when ACPI reboot doesn't
work reliably on it either.  I haven't been able to find a wording on
the UEFI specification that mandates using ResetSystem() in order to
reset the platform.  I've only found this wording:

"... then the UEFI OS Loader has taken control of the platform, and
EFI will not regain control of the system until the platform is reset.
One method of resetting the platform is through the EFI Runtime
Service ResetSystem()."

And this reads to me as a mere indication that one option is to use
ResetSystem(), but that there are likely other platform specific reset
methods that are suitable to be used for OSes and still be compliant
with the UEFI spec.

> 
> > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > properly implemented ResetSystem() methods.
> 
> I'm also puzzled by this statement: That Acer aspect is a clear indication
> of there being an issue.

Hm yes, I had that sentence from v1, before realizing the Acer quirk.
So there's one known issue with using ACPI as the default reboot
method vs many issues when using the UEFI one.

> Plus it's quite easy to see that hooks may be put
> in place by various firmware components that would then be used to make
> certain adjustments to the platform, ahead of an orderly reboot / shutdown.

Well, I very much doubt any vendor would rely on this, seeing as
Linux and Windows both default to ACPI reboot, and the UEFI spec does
not mandate the use of ResetSystem() anyway.

> > --- a/xen/arch/x86/shutdown.c
> > +++ b/xen/arch/x86/shutdown.c
> > @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >  
> >      if ( xen_guest )
> >          reboot_type = BOOT_XEN;
> > +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> > +        reboot_type = BOOT_ACPI;
> >      else if ( efi_enabled(EFI_RS) )
> >          reboot_type = BOOT_EFI;
> > -    else if ( acpi_disabled )
> > -        reboot_type = BOOT_KBD;
> >      else
> > -        reboot_type = BOOT_ACPI;
> > +        reboot_type = BOOT_KBD;
> >  }
> >  
> >  static int __init cf_check override_reboot(const struct dmi_system_id *d)
> >  {
> >      enum reboot_type type = (long)d->driver_data;
> >  
> > -    if ( type == BOOT_ACPI && acpi_disabled )
> > +    if ( (type == BOOT_ACPI && acpi_disabled) ||
> > +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> >          type = BOOT_KBD;
> 
> I guess I don't follow this adjustment: Why would we fall back to KBD
> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
> be used?

This is IMO a weird corner case, we have an explicit request to use one
reboot method, but we cannot do so because the component is disabled.
I've assumed that falling back to KBD was the safest option.

For example if we have to explicitly reboot using UEFI it's likely
because ACPI (the proposed default method) is not suitable, and hence
falling back to ACPI here won't help.

> And go further to KBD only if ACPI then also turns out
> disabled (a mode that Xen quite likely won't correctly operate in
> anymore anyway, due to bitrot)?
> 
> As an aside, KBD likely is unusable on hw-reduced systems, for there
> simply not being a legacy keyboard controller. Instead we may need to
> fall back to CF9 in such a case.

Hm, I can send a followup patch for that, but not part of this
change.

Thanks, Roger.
Jan Beulich Sept. 18, 2023, 3:44 p.m. UTC | #3
On 18.09.2023 17:09, Roger Pau Monné wrote:
> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
>> On 15.09.2023 09:43, Roger Pau Monne wrote:
>>> The current logic to choose the preferred reboot method is based on the mode Xen
>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
>>> method will be to use the ResetSystem() run time service call.
>>>
>>> However, that method seems to be widely untested, and quite often leads to a
>>> result similar to:
>>>
>>> Hardware Dom0 shutdown: rebooting machine
>>> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
>>> CPU:    0
>>> RIP:    e008:[<0000000000000017>] 0000000000000017
>>> RFLAGS: 0000000000010202   CONTEXT: hypervisor
>>> [...]
>>> Xen call trace:
>>>    [<0000000000000017>] R 0000000000000017
>>>    [<ffff83207eff7b50>] S ffff83207eff7b50
>>>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>>>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>>>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>>>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>>>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>>>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>>>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>>>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>>>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>>>
>>> ****************************************
>>> Panic on CPU 0:
>>> FATAL TRAP: vector = 6 (invalid opcode)
>>> ****************************************
>>>
>>> Which in most cases does lead to a reboot, however that's unreliable.
>>>
>>> Change the default reboot preference to prefer ACPI over UEFI if available and
>>> not in reduced hardware mode.
>>>
>>> This is in line with what Linux does, so it's unlikely to cause issues on current
>>> and future hardware, since there's a much higher chance of vendors testing
>>> hardware with Linux rather than Xen.
>>
>> I certainly appreciate this as a goal. However, ...
>>
>>> Add a special case for one Acer model that does require being rebooted using
>>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
>>
>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
>> conforming systems.
> 
> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> work reliably on it either.  I haven't been able to find a wording on
> the UEFI specification that mandates using ResetSystem() in order to
> reset the platform.  I've only found this wording:
> 
> "... then the UEFI OS Loader has taken control of the platform, and
> EFI will not regain control of the system until the platform is reset.
> One method of resetting the platform is through the EFI Runtime
> Service ResetSystem()."
> 
> And this reads to me as a mere indication that one option is to use
> ResetSystem(), but that there are likely other platform specific reset
> methods that are suitable to be used for OSes and still be compliant
> with the UEFI spec.

See my reference to ia64. With ACPI_FADT_RESET_REGISTER not set, I don't
think there would have been any other non-custom reboot method there. So
while perhaps not mandated, it's still the designated abstraction layer.

>>> --- a/xen/arch/x86/shutdown.c
>>> +++ b/xen/arch/x86/shutdown.c
>>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>>>  
>>>      if ( xen_guest )
>>>          reboot_type = BOOT_XEN;
>>> +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
>>> +        reboot_type = BOOT_ACPI;
>>>      else if ( efi_enabled(EFI_RS) )
>>>          reboot_type = BOOT_EFI;
>>> -    else if ( acpi_disabled )
>>> -        reboot_type = BOOT_KBD;
>>>      else
>>> -        reboot_type = BOOT_ACPI;
>>> +        reboot_type = BOOT_KBD;
>>>  }
>>>  
>>>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>>>  {
>>>      enum reboot_type type = (long)d->driver_data;
>>>  
>>> -    if ( type == BOOT_ACPI && acpi_disabled )
>>> +    if ( (type == BOOT_ACPI && acpi_disabled) ||
>>> +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>>>          type = BOOT_KBD;
>>
>> I guess I don't follow this adjustment: Why would we fall back to KBD
>> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
>> be used?
> 
> This is IMO a weird corner case, we have an explicit request to use one
> reboot method, but we cannot do so because the component is disabled.
> I've assumed that falling back to KBD was the safest option.
> 
> For example if we have to explicitly reboot using UEFI it's likely
> because ACPI (the proposed default method) is not suitable, and hence
> falling back to ACPI here won't help.

Perhaps, but falling back to KBD isn't necessarily going to work either.
And it might well be that on said Acer no reboot method would actually
yield consistent behavior, except for ResetSystem(). The fallback logic
here as well as that in machine_restart() is all based on guesswork
anyway.

Jan
Roger Pau Monné Sept. 18, 2023, 4 p.m. UTC | #4
On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
> On 18.09.2023 17:09, Roger Pau Monné wrote:
> > On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> >> On 15.09.2023 09:43, Roger Pau Monne wrote:
> >>> The current logic to choose the preferred reboot method is based on the mode Xen
> >>> has been booted into, so if the box is booted from UEFI, the preferred reboot
> >>> method will be to use the ResetSystem() run time service call.
> >>>
> >>> However, that method seems to be widely untested, and quite often leads to a
> >>> result similar to:
> >>>
> >>> Hardware Dom0 shutdown: rebooting machine
> >>> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> >>> CPU:    0
> >>> RIP:    e008:[<0000000000000017>] 0000000000000017
> >>> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> >>> [...]
> >>> Xen call trace:
> >>>    [<0000000000000017>] R 0000000000000017
> >>>    [<ffff83207eff7b50>] S ffff83207eff7b50
> >>>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >>>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >>>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >>>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >>>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >>>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >>>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >>>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >>>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >>>
> >>> ****************************************
> >>> Panic on CPU 0:
> >>> FATAL TRAP: vector = 6 (invalid opcode)
> >>> ****************************************
> >>>
> >>> Which in most cases does lead to a reboot, however that's unreliable.
> >>>
> >>> Change the default reboot preference to prefer ACPI over UEFI if available and
> >>> not in reduced hardware mode.
> >>>
> >>> This is in line with what Linux does, so it's unlikely to cause issues on current
> >>> and future hardware, since there's a much higher chance of vendors testing
> >>> hardware with Linux rather than Xen.
> >>
> >> I certainly appreciate this as a goal. However, ...
> >>
> >>> Add a special case for one Acer model that does require being rebooted using
> >>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> >>
> >> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> >> conforming systems.
> > 
> > I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> > work reliably on it either.  I haven't been able to find a wording on
> > the UEFI specification that mandates using ResetSystem() in order to
> > reset the platform.  I've only found this wording:
> > 
> > "... then the UEFI OS Loader has taken control of the platform, and
> > EFI will not regain control of the system until the platform is reset.
> > One method of resetting the platform is through the EFI Runtime
> > Service ResetSystem()."
> > 
> > And this reads to me as a mere indication that one option is to use
> > ResetSystem(), but that there are likely other platform specific reset
> > methods that are suitable to be used for OSes and still be compliant
> > with the UEFI spec.
> 
> See my reference to ia64.

Right, I understand that on ia64 things might have been different, due
to the platform lacking any other reboot method, but I don't see how
this applies to x86 where there are other reboot methods.

> With ACPI_FADT_RESET_REGISTER not set, I don't
> think there would have been any other non-custom reboot method there. So
> while perhaps not mandated, it's still the designated abstraction layer.

Again the spec doesn't mention that ResetSystem() must be used, so
while it would make sense if it was reliable, it clearly isn't.  In
which case resorting to the more reliable method should always be
preferred, especially if the spec is so lax as to call ResetSystem()
"One method of resetting the platform".

We should also take into account that vendors are much more likely to
test new hardware with Linux rather than Xen, and hence it's low
probability that the default Linux reboot method doesn't work on a
platform, because that would hurt the vendor.

> >>> --- a/xen/arch/x86/shutdown.c
> >>> +++ b/xen/arch/x86/shutdown.c
> >>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >>>  
> >>>      if ( xen_guest )
> >>>          reboot_type = BOOT_XEN;
> >>> +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> >>> +        reboot_type = BOOT_ACPI;
> >>>      else if ( efi_enabled(EFI_RS) )
> >>>          reboot_type = BOOT_EFI;
> >>> -    else if ( acpi_disabled )
> >>> -        reboot_type = BOOT_KBD;
> >>>      else
> >>> -        reboot_type = BOOT_ACPI;
> >>> +        reboot_type = BOOT_KBD;
> >>>  }
> >>>  
> >>>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
> >>>  {
> >>>      enum reboot_type type = (long)d->driver_data;
> >>>  
> >>> -    if ( type == BOOT_ACPI && acpi_disabled )
> >>> +    if ( (type == BOOT_ACPI && acpi_disabled) ||
> >>> +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> >>>          type = BOOT_KBD;
> >>
> >> I guess I don't follow this adjustment: Why would we fall back to KBD
> >> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
> >> be used?
> > 
> > This is IMO a weird corner case, we have an explicit request to use one
> > reboot method, but we cannot do so because the component is disabled.
> > I've assumed that falling back to KBD was the safest option.
> > 
> > For example if we have to explicitly reboot using UEFI it's likely
> > because ACPI (the proposed default method) is not suitable, and hence
> > falling back to ACPI here won't help.
> 
> Perhaps, but falling back to KBD isn't necessarily going to work either.
> And it might well be that on said Acer no reboot method would actually
> yield consistent behavior, except for ResetSystem(). The fallback logic
> here as well as that in machine_restart() is all based on guesswork
> anyway.

Indeed, hence it seemed a suitable and less risky option to fall back
to KBD in both cases.

Thanks, Roger.
Jan Beulich Sept. 19, 2023, 9:31 a.m. UTC | #5
On 18.09.2023 18:00, Roger Pau Monné wrote:
> On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
>> On 18.09.2023 17:09, Roger Pau Monné wrote:
>>> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
>>>> On 15.09.2023 09:43, Roger Pau Monne wrote:
>>>>> The current logic to choose the preferred reboot method is based on the mode Xen
>>>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
>>>>> method will be to use the ResetSystem() run time service call.
>>>>>
>>>>> However, that method seems to be widely untested, and quite often leads to a
>>>>> result similar to:
>>>>>
>>>>> Hardware Dom0 shutdown: rebooting machine
>>>>> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
>>>>> CPU:    0
>>>>> RIP:    e008:[<0000000000000017>] 0000000000000017
>>>>> RFLAGS: 0000000000010202   CONTEXT: hypervisor
>>>>> [...]
>>>>> Xen call trace:
>>>>>    [<0000000000000017>] R 0000000000000017
>>>>>    [<ffff83207eff7b50>] S ffff83207eff7b50
>>>>>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>>>>>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>>>>>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>>>>>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>>>>>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>>>>>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>>>>>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>>>>>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>>>>>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>>>>>
>>>>> ****************************************
>>>>> Panic on CPU 0:
>>>>> FATAL TRAP: vector = 6 (invalid opcode)
>>>>> ****************************************
>>>>>
>>>>> Which in most cases does lead to a reboot, however that's unreliable.
>>>>>
>>>>> Change the default reboot preference to prefer ACPI over UEFI if available and
>>>>> not in reduced hardware mode.
>>>>>
>>>>> This is in line with what Linux does, so it's unlikely to cause issues on current
>>>>> and future hardware, since there's a much higher chance of vendors testing
>>>>> hardware with Linux rather than Xen.
>>>>
>>>> I certainly appreciate this as a goal. However, ...
>>>>
>>>>> Add a special case for one Acer model that does require being rebooted using
>>>>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
>>>>
>>>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
>>>> conforming systems.
>>>
>>> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
>>> work reliably on it either.  I haven't been able to find a wording on
>>> the UEFI specification that mandates using ResetSystem() in order to
>>> reset the platform.  I've only found this wording:
>>>
>>> "... then the UEFI OS Loader has taken control of the platform, and
>>> EFI will not regain control of the system until the platform is reset.
>>> One method of resetting the platform is through the EFI Runtime
>>> Service ResetSystem()."
>>>
>>> And this reads to me as a mere indication that one option is to use
>>> ResetSystem(), but that there are likely other platform specific reset
>>> methods that are suitable to be used for OSes and still be compliant
>>> with the UEFI spec.
>>
>> See my reference to ia64.
> 
> Right, I understand that on ia64 things might have been different, due
> to the platform lacking any other reboot method, but I don't see how
> this applies to x86 where there are other reboot methods.
> 
>> With ACPI_FADT_RESET_REGISTER not set, I don't
>> think there would have been any other non-custom reboot method there. So
>> while perhaps not mandated, it's still the designated abstraction layer.
> 
> Again the spec doesn't mention that ResetSystem() must be used, so
> while it would make sense if it was reliable, it clearly isn't.  In
> which case resorting to the more reliable method should always be
> preferred, especially if the spec is so lax as to call ResetSystem()
> "One method of resetting the platform".

That wording wasn't there in 1.02, but I can see it all the way back to
at least 2.1. So yes, you have a point. Yet - adding onto an earlier
remark of mine - EFI_RESET_NOTIFICATION_PROTOCOL is pretty useless if
use of ResetSystem() was optional.

Jan
Roger Pau Monné Sept. 19, 2023, 10:29 a.m. UTC | #6
On Tue, Sep 19, 2023 at 11:31:07AM +0200, Jan Beulich wrote:
> On 18.09.2023 18:00, Roger Pau Monné wrote:
> > On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
> >> On 18.09.2023 17:09, Roger Pau Monné wrote:
> >>> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> >>>> On 15.09.2023 09:43, Roger Pau Monne wrote:
> >>>>> The current logic to choose the preferred reboot method is based on the mode Xen
> >>>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
> >>>>> method will be to use the ResetSystem() run time service call.
> >>>>>
> >>>>> However, that method seems to be widely untested, and quite often leads to a
> >>>>> result similar to:
> >>>>>
> >>>>> Hardware Dom0 shutdown: rebooting machine
> >>>>> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> >>>>> CPU:    0
> >>>>> RIP:    e008:[<0000000000000017>] 0000000000000017
> >>>>> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> >>>>> [...]
> >>>>> Xen call trace:
> >>>>>    [<0000000000000017>] R 0000000000000017
> >>>>>    [<ffff83207eff7b50>] S ffff83207eff7b50
> >>>>>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >>>>>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >>>>>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >>>>>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >>>>>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >>>>>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >>>>>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >>>>>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >>>>>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >>>>>
> >>>>> ****************************************
> >>>>> Panic on CPU 0:
> >>>>> FATAL TRAP: vector = 6 (invalid opcode)
> >>>>> ****************************************
> >>>>>
> >>>>> Which in most cases does lead to a reboot, however that's unreliable.
> >>>>>
> >>>>> Change the default reboot preference to prefer ACPI over UEFI if available and
> >>>>> not in reduced hardware mode.
> >>>>>
> >>>>> This is in line with what Linux does, so it's unlikely to cause issues on current
> >>>>> and future hardware, since there's a much higher chance of vendors testing
> >>>>> hardware with Linux rather than Xen.
> >>>>
> >>>> I certainly appreciate this as a goal. However, ...
> >>>>
> >>>>> Add a special case for one Acer model that does require being rebooted using
> >>>>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> >>>>
> >>>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> >>>> conforming systems.
> >>>
> >>> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> >>> work reliably on it either.  I haven't been able to find a wording on
> >>> the UEFI specification that mandates using ResetSystem() in order to
> >>> reset the platform.  I've only found this wording:
> >>>
> >>> "... then the UEFI OS Loader has taken control of the platform, and
> >>> EFI will not regain control of the system until the platform is reset.
> >>> One method of resetting the platform is through the EFI Runtime
> >>> Service ResetSystem()."
> >>>
> >>> And this reads to me as a mere indication that one option is to use
> >>> ResetSystem(), but that there are likely other platform specific reset
> >>> methods that are suitable to be used for OSes and still be compliant
> >>> with the UEFI spec.
> >>
> >> See my reference to ia64.
> > 
> > Right, I understand that on ia64 things might have been different, due
> > to the platform lacking any other reboot method, but I don't see how
> > this applies to x86 where there are other reboot methods.
> > 
> >> With ACPI_FADT_RESET_REGISTER not set, I don't
> >> think there would have been any other non-custom reboot method there. So
> >> while perhaps not mandated, it's still the designated abstraction layer.
> > 
> > Again the spec doesn't mention that ResetSystem() must be used, so
> > while it would make sense if it was reliable, it clearly isn't.  In
> > which case resorting to the more reliable method should always be
> > preferred, specially if the spec is so lax as to call ResetSystem()
> > "One method of resetting the platform".
> 
> That wording wasn't there in 1.02, but I can see it all the way back to
> at least 2.1. So yes, you have a point. Yet - adding onto an earlier
> remark of mine - EFI_RESET_NOTIFICATION_PROTOCOL is pretty useless if
> use of ResetSystem() was optional.

See the note in
EFI_RESET_NOTIFICATION_PROTOCOL.RegisterResetNotify():

"The list of registered reset notification functions are processed if
ResetSystem() is called before ExitBootServices(). The list of
registered reset notification functions is ignored if ResetSystem() is
called after ExitBootServices()."

Those handlers are only called when ResetSystem() is invoked before
ExitBootServices(), so for our use-case it doesn't make a difference, as
we call ResetSystem() after having exited boot services.
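
As a rough illustration of the quoted note, here is a toy model in C (not
firmware code; every name in it is hypothetical and only stands in for the
behaviour the spec text describes). Registered handlers run only if the
modelled reset happens before boot services have been exited:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from the UEFI spec or from Xen. */
#define MAX_NOTIFY 8

typedef void (*reset_notify_fn)(void);

struct firmware_model {
    reset_notify_fn handlers[MAX_NOTIFY];
    size_t nr_handlers;
    bool boot_services_exited;   /* set once ExitBootServices() has run */
};

/* Sample handler that just counts how often it has been invoked. */
static unsigned int notify_calls;

static void count_notify(void)
{
    notify_calls++;
}

/* Models RegisterResetNotify(): append a handler to the list. */
static void register_reset_notify(struct firmware_model *fw, reset_notify_fn fn)
{
    assert(fw->nr_handlers < MAX_NOTIFY);
    fw->handlers[fw->nr_handlers++] = fn;
}

/*
 * Models ResetSystem(): the handler list is processed before boot services
 * are exited and ignored afterwards.  Returns the number of handlers run.
 */
static size_t reset_system(const struct firmware_model *fw)
{
    size_t i, ran = 0;

    if ( fw->boot_services_exited )
        return 0;

    for ( i = 0; i < fw->nr_handlers; i++ )
    {
        fw->handlers[i]();
        ran++;
    }

    return ran;
}
```

A hypervisor that only calls the reset service at run time is always in the
second case, so in this model the registered handlers never fire for it.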

Thanks, Roger.
Jan Beulich Sept. 27, 2023, 8:21 a.m. UTC | #7
On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to chose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
> 
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
> 
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> CPU:    0
> RIP:    e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> [...]
> Xen call trace:
>    [<0000000000000017>] R 0000000000000017
>    [<ffff83207eff7b50>] S ffff83207eff7b50
>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> 
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
> 
> Which in most cases does lead to a reboot, however that's unreliable.
> 
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
> 
> This is in line to what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.
> 
> Add a special case for one Acer model that does require being rebooted using
> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> 
> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.

A data point from a new system I'm still in the process of setting up: The
ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
The EFI method, otoh, properly distinguishes "reboot=warm" from our default
of explicitly requesting cold reboot. (Without taking the EFI path, I
assume our write to the relevant BDA location simply has no effect, for
this being a legacy BIOS thing, and the system apparently defaults to warm
reboot when using the ACPI method.)

Clearly, as a secondary effect, this system adds to my personal experience
of so far EFI reboot consistently working on all x86 hardware I have (had)
direct access to. (That said, this is the first non-Intel system, which
likely biases my overall experience.)

Jan
Roger Pau Monné Oct. 3, 2023, 11:35 a.m. UTC | #8
On Wed, Sep 27, 2023 at 10:21:44AM +0200, Jan Beulich wrote:
> On 15.09.2023 09:43, Roger Pau Monne wrote:
> > The current logic to chose the preferred reboot method is based on the mode Xen
> > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > method will be to use the ResetSystem() run time service call.
> > 
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> > 
> > Hardware Dom0 shutdown: rebooting machine
> > ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> > CPU:    0
> > RIP:    e008:[<0000000000000017>] 0000000000000017
> > RFLAGS: 0000000000010202   CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> >    [<0000000000000017>] R 0000000000000017
> >    [<ffff83207eff7b50>] S ffff83207eff7b50
> >    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> > 
> > ****************************************
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > ****************************************
> > 
> > Which in most cases does lead to a reboot, however that's unreliable.
> > 
> > Change the default reboot preference to prefer ACPI over UEFI if available and
> > not in reduced hardware mode.
> > 
> > This is in line to what Linux does, so it's unlikely to cause issues on current
> > and future hardware, since there's a much higher chance of vendors testing
> > hardware with Linux rather than Xen.
> > 
> > Add a special case for one Acer model that does require being rebooted using
> > ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> > 
> > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > properly implemented ResetSystem() methods.
> 
> A data point from a new system I'm still in the process of setting up: The
> ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
> The EFI method, otoh, properly distinguishes "reboot=warm" from our default
> of explicitly requesting cold reboot. (Without taking the EFI path, I
> assume our write to the relevant BDA location simply has no effect, for
> this being a legacy BIOS thing, and the system apparently defaults to warm
> reboot when using the ACPI method.)

This is unfortunate, but IMO not as bad as getting a #UD or any
other fault while attempting a reboot.  We can always force this
system to use UEFI reboot, if that does work better than ACPI.

> Clearly, as a secondary effect, this system adds to my personal experience
> of so far EFI reboot consistently working on all x86 hardware I have (had)
> direct access to. (That said, this is the first non-Intel system, which
> likely biases my overall experience.)

I can try to gather some data; I can at least tell you that the Intel
NUC11TNHi7 TGL also hits a fault when attempting UEFI reboot.
The above crash was from a Dell PowerEdge R6625.  I do recall seeing
this with other boxes in the Citrix lab, but don't know the exact
models.  I'm quite sure other downstreams can provide similar
feedback.

I think it's clear now that using ResetSystem() when booted from UEFI
is not mandated by the UEFI specification, so I still stand by this
patch and think we should select the default reboot method that has
the highest chance of succeeding.

Thanks, Roger.
Roger Pau Monné Oct. 23, 2023, 11:02 a.m. UTC | #9
On Tue, Oct 03, 2023 at 01:35:25PM +0200, Roger Pau Monné wrote:
> On Wed, Sep 27, 2023 at 10:21:44AM +0200, Jan Beulich wrote:
> > On 15.09.2023 09:43, Roger Pau Monne wrote:
> > > The current logic to chose the preferred reboot method is based on the mode Xen
> > > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > > method will be to use the ResetSystem() run time service call.
> > > 
> > > However, that method seems to be widely untested, and quite often leads to a
> > > result similar to:
> > > 
> > > Hardware Dom0 shutdown: rebooting machine
> > > ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> > > CPU:    0
> > > RIP:    e008:[<0000000000000017>] 0000000000000017
> > > RFLAGS: 0000000000010202   CONTEXT: hypervisor
> > > [...]
> > > Xen call trace:
> > >    [<0000000000000017>] R 0000000000000017
> > >    [<ffff83207eff7b50>] S ffff83207eff7b50
> > >    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > >    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > >    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > >    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > >    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > >    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > >    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > >    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > >    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> > > 
> > > ****************************************
> > > Panic on CPU 0:
> > > FATAL TRAP: vector = 6 (invalid opcode)
> > > ****************************************
> > > 
> > > Which in most cases does lead to a reboot, however that's unreliable.
> > > 
> > > Change the default reboot preference to prefer ACPI over UEFI if available and
> > > not in reduced hardware mode.
> > > 
> > > This is in line to what Linux does, so it's unlikely to cause issues on current
> > > and future hardware, since there's a much higher chance of vendors testing
> > > hardware with Linux rather than Xen.
> > > 
> > > Add a special case for one Acer model that does require being rebooted using
> > > ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> > > 
> > > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > > properly implemented ResetSystem() methods.
> > 
> > A data point from a new system I'm still in the process of setting up: The
> > ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
> > The EFI method, otoh, properly distinguishes "reboot=warm" from our default
> > of explicitly requesting cold reboot. (Without taking the EFI path, I
> > assume our write to the relevant BDA location simply has no effect, for
> > this being a legacy BIOS thing, and the system apparently defaults to warm
> > reboot when using the ACPI method.)
> 
> This is unfortunate, but IMO not as bad as getting a #UD or any
> other fault while attempting a reboot.  We can always force this
> system to use UEFI reboot, if that does work better than ACPI.
> 
> > Clearly, as a secondary effect, this system adds to my personal experience
> > of so far EFI reboot consistently working on all x86 hardware I have (had)
> > direct access to. (That said, this is the first non-Intel system, which
> > likely biases my overall experience.)
> 
> I can try to gather some data; I can at least tell you that the Intel
> NUC11TNHi7 TGL also hits a fault when attempting UEFI reboot.
> The above crash was from a Dell PowerEdge R6625.  I do recall seeing
> this with other boxes in the Citrix lab, but don't know the exact
> models.  I'm quite sure other downstreams can provide similar
> feedback.

As a further data point, Dasharo [0], a coreboot downstream, was also
shipping firmware with a broken ResetSystem() method, and they
didn't notice until someone reported errors on Xen reboot:

https://github.com/Dasharo/edk2/pull/99/commits/dee75be10ac9387168bd3a8cad0f1ec6e372129a

It's quite clear no one is testing ResetSystem(), the UEFI spec
doesn't mandate using it, and we are just hurting ourselves by forcing
its usage.

Regards, Roger.

[0] https://github.com/Dasharo
Marek Marczykowski-Górecki July 29, 2024, 10:08 p.m. UTC | #10
On Fri, Sep 15, 2023 at 09:43:47AM +0200, Roger Pau Monne wrote:
> The current logic to chose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
> 
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
> 
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> CPU:    0
> RIP:    e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> [...]
> Xen call trace:
>    [<0000000000000017>] R 0000000000000017
>    [<ffff83207eff7b50>] S ffff83207eff7b50
>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> 
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
> 
> Which in most cases does lead to a reboot, however that's unreliable.
> 
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
> 
> This is in line to what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.
> 
> Add a special case for one Acer model that does require being rebooted using
> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> 
> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.

With the Acer quirk, and the info Jan posted in the thread, this
sentence is technically not true. I don't think it warrants any code
change in this patch (it's clearly a less common and less problematic
issue than a crash during ResetSystem(), and can still be worked around
with a cmdline option). But it might warrant adjusting the commit message.

> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Other points still stand, and I think this is generally an improvement,
so, preferably with an adjusted commit message:

Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

> ---
> Changes since v1:
>  - Add special case for Acer model to use UEFI reboot.
>  - Adjust commit message.
> ---
>  xen/arch/x86/shutdown.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c
> index 7619544d14da..3816ede1afe5 100644
> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>  
>      if ( xen_guest )
>          reboot_type = BOOT_XEN;
> +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> +        reboot_type = BOOT_ACPI;
>      else if ( efi_enabled(EFI_RS) )
>          reboot_type = BOOT_EFI;
> -    else if ( acpi_disabled )
> -        reboot_type = BOOT_KBD;
>      else
> -        reboot_type = BOOT_ACPI;
> +        reboot_type = BOOT_KBD;
>  }
>  
>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>  {
>      enum reboot_type type = (long)d->driver_data;
>  
> -    if ( type == BOOT_ACPI && acpi_disabled )
> +    if ( (type == BOOT_ACPI && acpi_disabled) ||
> +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>          type = BOOT_KBD;
>  
>      if ( reboot_type != type )
> @@ -172,6 +173,7 @@ static int __init cf_check override_reboot(const struct dmi_system_id *d)
>              [BOOT_KBD]  = "keyboard controller",
>              [BOOT_ACPI] = "ACPI",
>              [BOOT_CF9]  = "PCI",
> +            [BOOT_EFI]  = "UEFI",
>          };
>  
>          reboot_type = type;
> @@ -530,6 +532,15 @@ static const struct dmi_system_id __initconstrel reboot_dmi_table[] = {
>              DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),
>          },
>      },
> +    {    /* Handle problems with rebooting on Acer TravelMate X514-51T. */
> +        .callback = override_reboot,
> +        .driver_data = (void *)(long)BOOT_EFI,
> +        .ident = "Acer TravelMate X514-51T",
> +        .matches = {
> +            DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
> +            DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate X514-51T"),
> +        },
> +    },
>      { }
>  };
>  
> -- 
> 2.42.0
> 
>

Patch

diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c
index 7619544d14da..3816ede1afe5 100644
--- a/xen/arch/x86/shutdown.c
+++ b/xen/arch/x86/shutdown.c
@@ -150,19 +150,20 @@  static void default_reboot_type(void)
 
     if ( xen_guest )
         reboot_type = BOOT_XEN;
+    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
+        reboot_type = BOOT_ACPI;
     else if ( efi_enabled(EFI_RS) )
         reboot_type = BOOT_EFI;
-    else if ( acpi_disabled )
-        reboot_type = BOOT_KBD;
     else
-        reboot_type = BOOT_ACPI;
+        reboot_type = BOOT_KBD;
 }
 
 static int __init cf_check override_reboot(const struct dmi_system_id *d)
 {
     enum reboot_type type = (long)d->driver_data;
 
-    if ( type == BOOT_ACPI && acpi_disabled )
+    if ( (type == BOOT_ACPI && acpi_disabled) ||
+         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
         type = BOOT_KBD;
 
     if ( reboot_type != type )
@@ -172,6 +173,7 @@  static int __init cf_check override_reboot(const struct dmi_system_id *d)
             [BOOT_KBD]  = "keyboard controller",
             [BOOT_ACPI] = "ACPI",
             [BOOT_CF9]  = "PCI",
+            [BOOT_EFI]  = "UEFI",
         };
 
         reboot_type = type;
@@ -530,6 +532,15 @@  static const struct dmi_system_id __initconstrel reboot_dmi_table[] = {
             DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),
         },
     },
+    {    /* Handle problems with rebooting on Acer TravelMate X514-51T. */
+        .callback = override_reboot,
+        .driver_data = (void *)(long)BOOT_EFI,
+        .ident = "Acer TravelMate X514-51T",
+        .matches = {
+            DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+            DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate X514-51T"),
+        },
+    },
     { }
 };
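
The selection order after this patch can be modelled outside of Xen as a
minimal standalone sketch. It assumes only what the hunks above show: the
`struct platform` fields, the reduced `enum reboot_type` and the return-value
style are stand-ins invented for illustration, not Xen's actual globals or
signatures, and any logic outside the quoted hunks is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for Xen's enum; the real one has more members. */
enum reboot_type { BOOT_XEN, BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* Hypothetical snapshot of the conditions the patch tests. */
struct platform {
    bool xen_guest;               /* running as a Xen guest */
    bool acpi_disabled;           /* ACPI not usable */
    bool acpi_reduced_hardware;   /* ACPI reduced hardware mode */
    bool efi_runtime_services;    /* models efi_enabled(EFI_RS) */
};

/* New default order: Xen guest, then ACPI, then UEFI, then keyboard. */
static enum reboot_type default_reboot_type(const struct platform *p)
{
    if ( p->xen_guest )
        return BOOT_XEN;
    if ( !p->acpi_disabled && !p->acpi_reduced_hardware )
        return BOOT_ACPI;
    if ( p->efi_runtime_services )
        return BOOT_EFI;
    return BOOT_KBD;
}

/* DMI quirk handling: fall back to the keyboard controller if unusable. */
static enum reboot_type override_reboot(const struct platform *p,
                                        enum reboot_type wanted)
{
    assert(wanted <= BOOT_KBD);

    if ( (wanted == BOOT_ACPI && p->acpi_disabled) ||
         (wanted == BOOT_EFI && !p->efi_runtime_services) )
        return BOOT_KBD;

    return wanted;
}
```

In this model a UEFI-booted box with full ACPI now defaults to ACPI, and UEFI
is only chosen when ACPI is disabled or in reduced hardware mode, or when a
quirk entry such as the Acer one requests it and runtime services are
available.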