Message ID | 20230911101238.18005-1-simon@invisiblethingslab.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [XEN] x86/ACPI: Ignore entries marked as unusable when parsing MADT | expand |
On 11.09.2023 12:12, Simon Gaiser wrote: > Up to version 6.2 Errata B [2] the ACPI spec only defines > ACPI_MADT_ENABLE as: > > If zero, this processor is unusable, and the operating system > support will not attempt to use it. > > The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with > "Must be zero". > > Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the > meaning of ACPI_MADT_ENABLE: > > Enabled > If this bit is set the processor is ready for use. If this bit > is clear and the Online Capable bit is set, system hardware > supports enabling this processor during OS runtime. If this bit > is clear and the Online Capable bit is also clear, this > processor is unusable, and OSPM shall ignore the contents of the > Processor Local APIC Structure. > > Online Capbable > The information conveyed by this bit depends on the value of the > Enabled bit. If the Enabled bit is set, this bit is reserved and > must be zero. Otherwise, if this this bit is set, system > hardware supports enabling this processor during OS runtime. > > So with conforming firmwares it should be safe to simply ignore the > entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE > > As a precaution against buggy firmwares this change, like Linux [4], > ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note > that the MADT revision was already increased to 5 with spec version 6.2 > Errata A [1], so before introducing the online capable flag. But it > wasn't changed for the new flag, so this is the best we can do here. > > For previous discussion see thread [5]. > > Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] > Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] > Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] > Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] > Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Just to avoid misunderstandings: This change addresses a comment from Andrew. I does not attempt to address my feedback on the earlier change, which - as indicated - I intend to revert (timeline extended until mid of the week), unless by then my earlier comments are addressed or the suggested possible alternative is carried out. I think it goes without saying that this patch can't sensibly go in until the earlier one was properly settled upon. In particular, in case of reverting aiui this patch won't even apply anymore. Jan
Jan Beulich: > On 11.09.2023 12:12, Simon Gaiser wrote: >> Up to version 6.2 Errata B [2] the ACPI spec only defines >> ACPI_MADT_ENABLE as: >> >> If zero, this processor is unusable, and the operating system >> support will not attempt to use it. >> >> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with >> "Must be zero". >> >> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the >> meaning of ACPI_MADT_ENABLE: >> >> Enabled >> If this bit is set the processor is ready for use. If this bit >> is clear and the Online Capable bit is set, system hardware >> supports enabling this processor during OS runtime. If this bit >> is clear and the Online Capable bit is also clear, this >> processor is unusable, and OSPM shall ignore the contents of the >> Processor Local APIC Structure. >> >> Online Capbable >> The information conveyed by this bit depends on the value of the >> Enabled bit. If the Enabled bit is set, this bit is reserved and >> must be zero. Otherwise, if this this bit is set, system >> hardware supports enabling this processor during OS runtime. >> >> So with conforming firmwares it should be safe to simply ignore the >> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE >> >> As a precaution against buggy firmwares this change, like Linux [4], >> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note >> that the MADT revision was already increased to 5 with spec version 6.2 >> Errata A [1], so before introducing the online capable flag. But it >> wasn't changed for the new flag, so this is the best we can do here. >> >> For previous discussion see thread [5]. >> >> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] >> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] >> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] >> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] >> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] >> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > > Just to avoid misunderstandings: This change addresses a comment from > Andrew. I does not attempt to address my feedback on the earlier change, > which - as indicated - I intend to revert (timeline extended until mid > of the week), unless by then my earlier comments are addressed or the > suggested possible alternative is carried out. > > I think it goes without saying that this patch can't sensibly go in > until the earlier one was properly settled upon. In particular, in case > of reverting aiui this patch won't even apply anymore. It textually conflicts with the other patch [6], and obviously was triggered by that discussion, but addresses a slightly different aspect. The textual conflict is trivial to resolve. I wasn't sure if it also conflicts with the concern you raised there, see below. The other patch is about ignoring entries with invalid APIC IDs. As the discussion there showed the spec isn't very clear on that and there are different opinions how they should be handled. Regarding the flags the spec is much more concrete although given the response above I assume you also interpret "is unusable" of the old version of the ACPI_MADT_ENABLE flag as such that Xen should still allocate resources for those processors? If I understood your main concern with the other patch correctly you have seen firmwares that later update those invalid APIC IDs with valid ones. Do those firmwares make use of the online capable flag? (Given above response probably not) If not, then it indeed doesn't address your concern, and I see no way, at least by parsing MADT, how to distinguish those cases from firmwares that have dummy entries (the motivation for these patches). Simon [6]: https://lore.kernel.org/xen-devel/7f158a54548456daba9f2e105d099d2e5e2c2f38.1691399031.git.simon@invisiblethingslab.com/
On 11.09.2023 19:51, Simon Gaiser wrote: > Jan Beulich: >> On 11.09.2023 12:12, Simon Gaiser wrote: >>> Up to version 6.2 Errata B [2] the ACPI spec only defines >>> ACPI_MADT_ENABLE as: >>> >>> If zero, this processor is unusable, and the operating system >>> support will not attempt to use it. >>> >>> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with >>> "Must be zero". >>> >>> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the >>> meaning of ACPI_MADT_ENABLE: >>> >>> Enabled >>> If this bit is set the processor is ready for use. If this bit >>> is clear and the Online Capable bit is set, system hardware >>> supports enabling this processor during OS runtime. If this bit >>> is clear and the Online Capable bit is also clear, this >>> processor is unusable, and OSPM shall ignore the contents of the >>> Processor Local APIC Structure. >>> >>> Online Capbable >>> The information conveyed by this bit depends on the value of the >>> Enabled bit. If the Enabled bit is set, this bit is reserved and >>> must be zero. Otherwise, if this this bit is set, system >>> hardware supports enabling this processor during OS runtime. >>> >>> So with conforming firmwares it should be safe to simply ignore the >>> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE >>> >>> As a precaution against buggy firmwares this change, like Linux [4], >>> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note >>> that the MADT revision was already increased to 5 with spec version 6.2 >>> Errata A [1], so before introducing the online capable flag. But it >>> wasn't changed for the new flag, so this is the best we can do here. >>> >>> For previous discussion see thread [5]. >>> >>> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] >>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] >>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] >>> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] >>> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> >> Just to avoid misunderstandings: This change addresses a comment from >> Andrew. I does not attempt to address my feedback on the earlier change, >> which - as indicated - I intend to revert (timeline extended until mid >> of the week), unless by then my earlier comments are addressed or the >> suggested possible alternative is carried out. >> >> I think it goes without saying that this patch can't sensibly go in >> until the earlier one was properly settled upon. In particular, in case >> of reverting aiui this patch won't even apply anymore. > > It textually conflicts with the other patch [6], and obviously was > triggered by that discussion, but addresses a slightly different aspect. > The textual conflict is trivial to resolve. I wasn't sure if it also > conflicts with the concern you raised there, see below. > > The other patch is about ignoring entries with invalid APIC IDs. As the > discussion there showed the spec isn't very clear on that and there are > different opinions how they should be handled. Regarding the flags the > spec is much more concrete although given the response above I assume > you also interpret "is unusable" of the old version of the > ACPI_MADT_ENABLE flag as such that Xen should still allocate resources > for those processors? > > If I understood your main concern with the other patch correctly you > have seen firmwares that later update those invalid APIC IDs with valid > ones. Do those firmwares make use of the online capable flag? (Given > above response probably not) Being an older ACPI version, they don't. > If not, then it indeed doesn't address your concern, and I see no way, > at least by parsing MADT, how to distinguish those cases from firmwares > that have dummy entries (the motivation for these patches). Right. And while this _may_ be kind of acceptable when accompanied by a downgrade of CPU hotplug support status, I haven't seen any patch to this effect up to now. Without such a downgrade, my review comments would need addressing, to avoid a perceived regression. Same goes for the tightening done in the patch here: It _may_ be kind of acceptable, but only under that same condition. Jan
On Mon, Sep 11, 2023 at 12:12:38PM +0200, Simon Gaiser wrote: > Up to version 6.2 Errata B [2] the ACPI spec only defines > ACPI_MADT_ENABLE as: > > If zero, this processor is unusable, and the operating system > support will not attempt to use it. > > The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with > "Must be zero". > > Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the > meaning of ACPI_MADT_ENABLE: > > Enabled > If this bit is set the processor is ready for use. If this bit > is clear and the Online Capable bit is set, system hardware > supports enabling this processor during OS runtime. If this bit > is clear and the Online Capable bit is also clear, this > processor is unusable, and OSPM shall ignore the contents of the > Processor Local APIC Structure. > > Online Capbable > The information conveyed by this bit depends on the value of the > Enabled bit. If the Enabled bit is set, this bit is reserved and > must be zero. Otherwise, if this this bit is set, system > hardware supports enabling this processor during OS runtime. > > So with conforming firmwares it should be safe to simply ignore the > entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE > > As a precaution against buggy firmwares this change, like Linux [4], > ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note > that the MADT revision was already increased to 5 with spec version 6.2 > Errata A [1], so before introducing the online capable flag. But it > wasn't changed for the new flag, so this is the best we can do here. > > For previous discussion see thread [5]. > > Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] > Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] > Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] > Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] > Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > --- > xen/arch/x86/acpi/boot.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c > index 4a62822fa9..2d0b8a9afc 100644 > --- a/xen/arch/x86/acpi/boot.c > +++ b/xen/arch/x86/acpi/boot.c > @@ -77,6 +77,17 @@ static int __init cf_check acpi_parse_madt(struct acpi_table_header *table) > return 0; > } > > +static bool __init acpi_is_processor_usable(uint32_t lapic_flags) > +{ > + if (lapic_flags & ACPI_MADT_ENABLED) > + return true; > + > + if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) > + return true; > + > + return false; So this means that Xen would only support ACPI CPU Hotplug with versions of the MADT >= 5? Because with the proposed code non enabled entries on MADT versions < 5 will be reported as unusable. Will this work with QEMU? (ie: does QEMU expose a MADT table with version >= 5) Otherwise we will loose all possible ways of testing ACPI CPU Hotplug. Regards, Roger.
Roger Pau Monné: > On Mon, Sep 11, 2023 at 12:12:38PM +0200, Simon Gaiser wrote: >> Up to version 6.2 Errata B [2] the ACPI spec only defines >> ACPI_MADT_ENABLE as: >> >> If zero, this processor is unusable, and the operating system >> support will not attempt to use it. >> >> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with >> "Must be zero". >> >> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the >> meaning of ACPI_MADT_ENABLE: >> >> Enabled >> If this bit is set the processor is ready for use. If this bit >> is clear and the Online Capable bit is set, system hardware >> supports enabling this processor during OS runtime. If this bit >> is clear and the Online Capable bit is also clear, this >> processor is unusable, and OSPM shall ignore the contents of the >> Processor Local APIC Structure. >> >> Online Capbable >> The information conveyed by this bit depends on the value of the >> Enabled bit. If the Enabled bit is set, this bit is reserved and >> must be zero. Otherwise, if this this bit is set, system >> hardware supports enabling this processor during OS runtime. >> >> So with conforming firmwares it should be safe to simply ignore the >> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE >> >> As a precaution against buggy firmwares this change, like Linux [4], >> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note >> that the MADT revision was already increased to 5 with spec version 6.2 >> Errata A [1], so before introducing the online capable flag. But it >> wasn't changed for the new flag, so this is the best we can do here. >> >> For previous discussion see thread [5]. >> >> Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] >> Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] >> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] >> Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] >> Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] >> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> --- >> xen/arch/x86/acpi/boot.c | 19 +++++++++++++------ >> 1 file changed, 13 insertions(+), 6 deletions(-) >> >> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c >> index 4a62822fa9..2d0b8a9afc 100644 >> --- a/xen/arch/x86/acpi/boot.c >> +++ b/xen/arch/x86/acpi/boot.c >> @@ -77,6 +77,17 @@ static int __init cf_check acpi_parse_madt(struct acpi_table_header *table) >> return 0; >> } >> >> +static bool __init acpi_is_processor_usable(uint32_t lapic_flags) >> +{ >> + if (lapic_flags & ACPI_MADT_ENABLED) >> + return true; >> + >> + if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) >> + return true; >> + >> + return false; > > So this means that Xen would only support ACPI CPU Hotplug with > versions of the MADT >= 5? Because with the proposed code non enabled > entries on MADT versions < 5 will be reported as unusable. > > Will this work with QEMU? (ie: does QEMU expose a MADT table with > version >= 5) Otherwise we will loose all possible ways of testing > ACPI CPU Hotplug. That's a good question. I just had a look and it looks like QEMU doesn't use the new revision (and the new flag) (yet?). (But QEMU assigns a valid APIC ID from the start.) So why does hotplugging works in Linux then? Turns out that I missed a later change: https://git.kernel.org/torvalds/c/fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c So they went back to (mostly) the old logic. So then the above change isn't a good idea. It's only "mostly" the old logic because in the meantime Linux also improved the version check, to accurately match the version that introduced the new flag: https://git.kernel.org/torvalds/c/a74fabfbd1b7013045afc8cc541e6cab3360ccb5 Simon
diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c index 4a62822fa9..2d0b8a9afc 100644 --- a/xen/arch/x86/acpi/boot.c +++ b/xen/arch/x86/acpi/boot.c @@ -77,6 +77,17 @@ static int __init cf_check acpi_parse_madt(struct acpi_table_header *table) return 0; } +static bool __init acpi_is_processor_usable(uint32_t lapic_flags) +{ + if (lapic_flags & ACPI_MADT_ENABLED) + return true; + + if (madt_revision >= 5 && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) + return true; + + return false; +} + static int __init cf_check acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) { @@ -92,9 +103,7 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) return 0; /* Don't register processors that cannot be onlined. */ - if (madt_revision >= 5 && - !(processor->lapic_flags & ACPI_MADT_ENABLED) && - !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) + if (!acpi_is_processor_usable(processor->lapic_flags)) return 0; if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { @@ -151,9 +160,7 @@ acpi_parse_lapic(struct acpi_subtable_header * header, const unsigned long end) return 0; /* Don't register processors that cannot be onlined. */ - if (madt_revision >= 5 && - !(processor->lapic_flags & ACPI_MADT_ENABLED) && - !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) + if (!acpi_is_processor_usable(processor->lapic_flags)) return 0; if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info)
Up to version 6.2 Errata B [2] the ACPI spec only defines ACPI_MADT_ENABLE as: If zero, this processor is unusable, and the operating system support will not attempt to use it. The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with "Must be zero". Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the meaning of ACPI_MADT_ENABLE: Enabled If this bit is set the processor is ready for use. If this bit is clear and the Online Capable bit is set, system hardware supports enabling this processor during OS runtime. If this bit is clear and the Online Capable bit is also clear, this processor is unusable, and OSPM shall ignore the contents of the Processor Local APIC Structure. Online Capbable The information conveyed by this bit depends on the value of the Enabled bit. If the Enabled bit is set, this bit is reserved and must be zero. Otherwise, if this this bit is set, system hardware supports enabling this processor during OS runtime. So with conforming firmwares it should be safe to simply ignore the entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE As a precaution against buggy firmwares this change, like Linux [4], ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note that the MADT revision was already increased to 5 with spec version 6.2 Errata A [1], so before introducing the online capable flag. But it wasn't changed for the new flag, so this is the best we can do here. For previous discussion see thread [5]. Link: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # [1] Link: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # [2] Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # [3] Link: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f # [4] Link: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ # [5] Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> --- xen/arch/x86/acpi/boot.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)