diff mbox series

[XEN] x86/ACPI: Ignore entries marked as unusable when parsing MADT

Message ID 20230911101238.18005-1-simon@invisiblethingslab.com (mailing list archive)
State New, archived
Headers show
Series [XEN] x86/ACPI: Ignore entries marked as unusable when parsing MADT | expand

Commit Message

Simon Gaiser Sept. 11, 2023, 10:12 a.m. UTC
Up to version 6.2 Errata B [2] the ACPI spec only defines
ACPI_MADT_ENABLE as:

    If zero, this processor is unusable, and the operating system
    support will not attempt to use it.

The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
"Must be zero".

Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
meaning of ACPI_MADT_ENABLE:

    Enabled
        If this bit is set the processor is ready for use. If this bit
        is clear and the Online Capable bit is set, system hardware
        supports enabling this processor during OS runtime. If this bit
        is clear and the Online Capable bit is also clear, this
        processor is unusable, and OSPM shall ignore the contents of the
        Processor Local APIC Structure.

    Online Capbable
        The information conveyed by this bit depends on the value of the
        Enabled bit. If the Enabled bit is set, this bit is reserved and
        must be zero. Otherwise, if this this bit is set, system
        hardware supports enabling this processor during OS runtime.

So with conforming firmwares it should be safe to simply ignore the
entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE

As a precaution against buggy firmwares this change, like Linux [4],
ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
that the MADT revision was already increased to 5 with spec version 6.2
Errata A [1], so before introducing the online capable flag. But it
wasn't changed for the new flag, so this is the best we can do here.

For previous discussion see thread [5].

Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
---
 xen/arch/x86/acpi/boot.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

Comments

Jan Beulich Sept. 11, 2023, 10:48 a.m. UTC | #1
On 11.09.2023 12:12, Simon Gaiser wrote:
> Up to version 6.2 Errata B [2] the ACPI spec only defines
> ACPI_MADT_ENABLE as:
> 
>     If zero, this processor is unusable, and the operating system
>     support will not attempt to use it.
> 
> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
> "Must be zero".
> 
> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
> meaning of ACPI_MADT_ENABLE:
> 
>     Enabled
>         If this bit is set the processor is ready for use. If this bit
>         is clear and the Online Capable bit is set, system hardware
>         supports enabling this processor during OS runtime. If this bit
>         is clear and the Online Capable bit is also clear, this
>         processor is unusable, and OSPM shall ignore the contents of the
>         Processor Local APIC Structure.
> 
>     Online Capbable
>         The information conveyed by this bit depends on the value of the
>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>         must be zero. Otherwise, if this this bit is set, system
>         hardware supports enabling this processor during OS runtime.
> 
> So with conforming firmwares it should be safe to simply ignore the
> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
> 
> As a precaution against buggy firmwares this change, like Linux [4],
> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
> that the MADT revision was already increased to 5 with spec version 6.2
> Errata A [1], so before introducing the online capable flag. But it
> wasn't changed for the new flag, so this is the best we can do here.
> 
> For previous discussion see thread [5].
> 
> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>

Just to avoid misunderstandings: This change addresses a comment from
Andrew. I does not attempt to address my feedback on the earlier change,
which - as indicated - I intend to revert (timeline extended until mid
of the week), unless by then my earlier comments are addressed or the
suggested possible alternative is carried out.

I think it goes without saying that this patch can't sensibly go in
until the earlier one was properly settled upon. In particular, in case
of reverting aiui this patch won't even apply anymore.

Jan
Simon Gaiser Sept. 11, 2023, 5:51 p.m. UTC | #2
Jan Beulich:
> On 11.09.2023 12:12, Simon Gaiser wrote:
>> Up to version 6.2 Errata B [2] the ACPI spec only defines
>> ACPI_MADT_ENABLE as:
>>
>>     If zero, this processor is unusable, and the operating system
>>     support will not attempt to use it.
>>
>> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
>> "Must be zero".
>>
>> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
>> meaning of ACPI_MADT_ENABLE:
>>
>>     Enabled
>>         If this bit is set the processor is ready for use. If this bit
>>         is clear and the Online Capable bit is set, system hardware
>>         supports enabling this processor during OS runtime. If this bit
>>         is clear and the Online Capable bit is also clear, this
>>         processor is unusable, and OSPM shall ignore the contents of the
>>         Processor Local APIC Structure.
>>
>>     Online Capbable
>>         The information conveyed by this bit depends on the value of the
>>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>>         must be zero. Otherwise, if this this bit is set, system
>>         hardware supports enabling this processor during OS runtime.
>>
>> So with conforming firmwares it should be safe to simply ignore the
>> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
>>
>> As a precaution against buggy firmwares this change, like Linux [4],
>> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
>> that the MADT revision was already increased to 5 with spec version 6.2
>> Errata A [1], so before introducing the online capable flag. But it
>> wasn't changed for the new flag, so this is the best we can do here.
>>
>> For previous discussion see thread [5].
>>
>> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
>> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
>> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> 
> Just to avoid misunderstandings: This change addresses a comment from
> Andrew. I does not attempt to address my feedback on the earlier change,
> which - as indicated - I intend to revert (timeline extended until mid
> of the week), unless by then my earlier comments are addressed or the
> suggested possible alternative is carried out.
> 
> I think it goes without saying that this patch can't sensibly go in
> until the earlier one was properly settled upon. In particular, in case
> of reverting aiui this patch won't even apply anymore.

It textually conflicts with the other patch [6], and obviously was
triggered by that discussion, but addresses a slightly different aspect.
The textual conflict is trivial to resolve. I wasn't sure if it also
conflicts with the concern you raised there, see below.

The other patch is about ignoring entries with invalid APIC IDs. As the
discussion there showed the spec isn't very clear on that and there are
different opinions how they should be handled. Regarding the flags the
spec is much more concrete although given the response above I assume
you also interpret "is unusable" of the old version of the
ACPI_MADT_ENABLE flag as such that Xen should still allocate resources
for those processors?

If I understood your main concern with the other patch correctly you
have seen firmwares that later update those invalid APIC IDs with valid
ones. Do those firmwares make use of the online capable flag? (Given
above response probably not)

If not, then it indeed doesn't address your concern, and I see no way,
at least by parsing MADT, how to distinguish those cases from firmwares
that have dummy entries (the motivation for these patches).

Simon

[6]: https://lore.kernel.org/xen-devel/7f158a54548456daba9f2e105d099d2e5e2c2f38.1691399031.git.simon@invisiblethingslab.com/
Jan Beulich Sept. 12, 2023, 8:19 a.m. UTC | #3
On 11.09.2023 19:51, Simon Gaiser wrote:
> Jan Beulich:
>> On 11.09.2023 12:12, Simon Gaiser wrote:
>>> Up to version 6.2 Errata B [2] the ACPI spec only defines
>>> ACPI_MADT_ENABLE as:
>>>
>>>     If zero, this processor is unusable, and the operating system
>>>     support will not attempt to use it.
>>>
>>> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
>>> "Must be zero".
>>>
>>> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
>>> meaning of ACPI_MADT_ENABLE:
>>>
>>>     Enabled
>>>         If this bit is set the processor is ready for use. If this bit
>>>         is clear and the Online Capable bit is set, system hardware
>>>         supports enabling this processor during OS runtime. If this bit
>>>         is clear and the Online Capable bit is also clear, this
>>>         processor is unusable, and OSPM shall ignore the contents of the
>>>         Processor Local APIC Structure.
>>>
>>>     Online Capbable
>>>         The information conveyed by this bit depends on the value of the
>>>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>>>         must be zero. Otherwise, if this this bit is set, system
>>>         hardware supports enabling this processor during OS runtime.
>>>
>>> So with conforming firmwares it should be safe to simply ignore the
>>> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
>>>
>>> As a precaution against buggy firmwares this change, like Linux [4],
>>> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
>>> that the MADT revision was already increased to 5 with spec version 6.2
>>> Errata A [1], so before introducing the online capable flag. But it
>>> wasn't changed for the new flag, so this is the best we can do here.
>>>
>>> For previous discussion see thread [5].
>>>
>>> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
>>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
>>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
>>> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
>>> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
>>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
>>
>> Just to avoid misunderstandings: This change addresses a comment from
>> Andrew. I does not attempt to address my feedback on the earlier change,
>> which - as indicated - I intend to revert (timeline extended until mid
>> of the week), unless by then my earlier comments are addressed or the
>> suggested possible alternative is carried out.
>>
>> I think it goes without saying that this patch can't sensibly go in
>> until the earlier one was properly settled upon. In particular, in case
>> of reverting aiui this patch won't even apply anymore.
> 
> It textually conflicts with the other patch [6], and obviously was
> triggered by that discussion, but addresses a slightly different aspect.
> The textual conflict is trivial to resolve. I wasn't sure if it also
> conflicts with the concern you raised there, see below.
> 
> The other patch is about ignoring entries with invalid APIC IDs. As the
> discussion there showed the spec isn't very clear on that and there are
> different opinions how they should be handled. Regarding the flags the
> spec is much more concrete although given the response above I assume
> you also interpret "is unusable" of the old version of the
> ACPI_MADT_ENABLE flag as such that Xen should still allocate resources
> for those processors?
> 
> If I understood your main concern with the other patch correctly you
> have seen firmwares that later update those invalid APIC IDs with valid
> ones. Do those firmwares make use of the online capable flag? (Given
> above response probably not)

Being an older ACPI version, they don't.

> If not, then it indeed doesn't address your concern, and I see no way,
> at least by parsing MADT, how to distinguish those cases from firmwares
> that have dummy entries (the motivation for these patches).

Right. And while this _may_ be kind of acceptable when accompanied by a
downgrade of CPU hotplug support status, I haven't seen any patch to this
effect up to now. Without such a downgrade, my review comments would need
addressing, to avoid a perceived regression. Same goes for the tightening
done in the patch here: It _may_ be kind of acceptable, but only under
that same condition.

Jan
Roger Pau Monne Sept. 12, 2023, 12:43 p.m. UTC | #4
On Mon, Sep 11, 2023 at 12:12:38PM +0200, Simon Gaiser wrote:
> Up to version 6.2 Errata B [2] the ACPI spec only defines
> ACPI_MADT_ENABLE as:
> 
>     If zero, this processor is unusable, and the operating system
>     support will not attempt to use it.
> 
> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
> "Must be zero".
> 
> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
> meaning of ACPI_MADT_ENABLE:
> 
>     Enabled
>         If this bit is set the processor is ready for use. If this bit
>         is clear and the Online Capable bit is set, system hardware
>         supports enabling this processor during OS runtime. If this bit
>         is clear and the Online Capable bit is also clear, this
>         processor is unusable, and OSPM shall ignore the contents of the
>         Processor Local APIC Structure.
> 
>     Online Capbable
>         The information conveyed by this bit depends on the value of the
>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>         must be zero. Otherwise, if this this bit is set, system
>         hardware supports enabling this processor during OS runtime.
> 
> So with conforming firmwares it should be safe to simply ignore the
> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
> 
> As a precaution against buggy firmwares this change, like Linux [4],
> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
> that the MADT revision was already increased to 5 with spec version 6.2
> Errata A [1], so before introducing the online capable flag. But it
> wasn't changed for the new flag, so this is the best we can do here.
> 
> For previous discussion see thread [5].
> 
> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> ---
>  xen/arch/x86/acpi/boot.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
> index 4a62822fa9..2d0b8a9afc 100644
> --- a/xen/arch/x86/acpi/boot.c
> +++ b/xen/arch/x86/acpi/boot.c
> @@ -77,6 +77,17 @@ static int __init cf_check acpi_parse_madt(struct acpi_table_header *table)
>  	return 0;
>  }
>  
> +static bool __init acpi_is_processor_usable(uint32_t lapic_flags)
> +{
> +	if (lapic_flags & ACPI_MADT_ENABLED)
> +		return true;
> +
> +	if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
> +		return true;
> +
> +	return false;

So this means that Xen would only support ACPI CPU Hotplug with
versions of the MADT >= 5?  Because with the proposed code non enabled
entries on MADT versions < 5 will be reported as unusable.

Will this work with QEMU?  (ie: does QEMU expose a MADT table with
version >= 5)  Otherwise we will loose all possible ways of testing
ACPI CPU Hotplug.

Regards, Roger.
Simon Gaiser Sept. 13, 2023, 11:01 a.m. UTC | #5
Roger Pau Monné:
> On Mon, Sep 11, 2023 at 12:12:38PM +0200, Simon Gaiser wrote:
>> Up to version 6.2 Errata B [2] the ACPI spec only defines
>> ACPI_MADT_ENABLE as:
>>
>>     If zero, this processor is unusable, and the operating system
>>     support will not attempt to use it.
>>
>> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
>> "Must be zero".
>>
>> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
>> meaning of ACPI_MADT_ENABLE:
>>
>>     Enabled
>>         If this bit is set the processor is ready for use. If this bit
>>         is clear and the Online Capable bit is set, system hardware
>>         supports enabling this processor during OS runtime. If this bit
>>         is clear and the Online Capable bit is also clear, this
>>         processor is unusable, and OSPM shall ignore the contents of the
>>         Processor Local APIC Structure.
>>
>>     Online Capbable
>>         The information conveyed by this bit depends on the value of the
>>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>>         must be zero. Otherwise, if this this bit is set, system
>>         hardware supports enabling this processor during OS runtime.
>>
>> So with conforming firmwares it should be safe to simply ignore the
>> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
>>
>> As a precaution against buggy firmwares this change, like Linux [4],
>> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
>> that the MADT revision was already increased to 5 with spec version 6.2
>> Errata A [1], so before introducing the online capable flag. But it
>> wasn't changed for the new flag, so this is the best we can do here.
>>
>> For previous discussion see thread [5].
>>
>> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1]
>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2]
>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3]
>> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4]
>> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5]
>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
>> ---
>>  xen/arch/x86/acpi/boot.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
>> index 4a62822fa9..2d0b8a9afc 100644
>> --- a/xen/arch/x86/acpi/boot.c
>> +++ b/xen/arch/x86/acpi/boot.c
>> @@ -77,6 +77,17 @@ static int __init cf_check acpi_parse_madt(struct acpi_table_header *table)
>>  	return 0;
>>  }
>>  
>> +static bool __init acpi_is_processor_usable(uint32_t lapic_flags)
>> +{
>> +	if (lapic_flags & ACPI_MADT_ENABLED)
>> +		return true;
>> +
>> +	if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
>> +		return true;
>> +
>> +	return false;
> 
> So this means that Xen would only support ACPI CPU Hotplug with
> versions of the MADT >= 5?  Because with the proposed code non enabled
> entries on MADT versions < 5 will be reported as unusable.
> 
> Will this work with QEMU?  (ie: does QEMU expose a MADT table with
> version >= 5)  Otherwise we will loose all possible ways of testing
> ACPI CPU Hotplug.

That's a good question. I just had a look and it looks like QEMU doesn't
use the new revision (and the new flag) (yet?). (But QEMU assigns a
valid APIC ID from the start.)

So why does hotplugging works in Linux then? Turns out that I missed a
later change:

https://git.kernel.org/torvalds/c/fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c

So they went back to (mostly) the old logic. So then the above change
isn't a good idea. It's only "mostly" the old logic because in the
meantime Linux also improved the version check, to accurately match the
version that introduced the new flag:

https://git.kernel.org/torvalds/c/a74fabfbd1b7013045afc8cc541e6cab3360ccb5

Simon
diff mbox series

Patch

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 4a62822fa9..2d0b8a9afc 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -77,6 +77,17 @@  static int __init cf_check acpi_parse_madt(struct acpi_table_header *table)
 	return 0;
 }
 
+static bool __init acpi_is_processor_usable(uint32_t lapic_flags)
+{
+	if (lapic_flags & ACPI_MADT_ENABLED)
+		return true;
+
+	if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+		return true;
+
+	return false;
+}
+
 static int __init cf_check
 acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end)
 {
@@ -92,9 +103,7 @@  acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end)
 		return 0;
 
 	/* Don't register processors that cannot be onlined. */
-	if (madt_revision >= 5 &&
-	    !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
-	    !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+	if (!acpi_is_processor_usable(processor->lapic_flags))
 		return 0;
 
 	if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) {
@@ -151,9 +160,7 @@  acpi_parse_lapic(struct acpi_subtable_header * header, const unsigned long end)
 		return 0;
 
 	/* Don't register processors that cannot be onlined. */
-	if (madt_revision >= 5 &&
-	    !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
-	    !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+	if (!acpi_is_processor_usable(processor->lapic_flags))
 		return 0;
 
 	if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info)