Message ID | 7f158a54548456daba9f2e105d099d2e5e2c2f38.1691399031.git.simon@invisiblethingslab.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [XEN] x86/ACPI: Ignore entries with invalid APIC IDs when parsing MADT | expand |
On 07/08/2023 10:38 am, Simon Gaiser wrote: > It seems some firmwares put dummy entries in the ACPI MADT table for non > existing processors. On my NUC11TNHi5 those have the invalid APIC ID > 0xff. Linux already has code to handle those cases both in > acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > same check to Xen. > > Note that on some older (2nd gen Core i) laptop of mine I also saw dummy > entries with a valid APIC ID. Linux would still ignore those because > they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen > this check is only active for madt_revision >= 5. But since this version > check seems to be intentionally I leave that alone. I recall there being a discussion over this, ultimately with the version check being removed. IIRC it was a defacto standard used for a long time prior to actually getting written into the ACPI spec, so does exist in practice in older MADTs. Otherwise LGTM. I'd suggest dropping this paragraph as it's not related to the change. It will also help if we do decide to follow up and drop the MADT version check. > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] https://git.kernel.org/torvalds/c/$SHA Somewhat less verbose. https://korg.docs.kernel.org/git-url-shorteners.html > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > --- > xen/arch/x86/acpi/boot.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c > index 54b72d716b..4a62822fa9 100644 > --- a/xen/arch/x86/acpi/boot.c > +++ b/xen/arch/x86/acpi/boot.c > @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) > if (BAD_MADT_ENTRY(processor, end)) > return -EINVAL; > > + /* Ignore entries with invalid apicid */ x2apic ID. ~Andrew
On 07/08/2023 11:01 am, Andrew Cooper wrote: > On 07/08/2023 10:38 am, Simon Gaiser wrote: >> It seems some firmwares put dummy entries in the ACPI MADT table for non >> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >> 0xff. Linux already has code to handle those cases both in >> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >> same check to Xen. >> >> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >> entries with a valid APIC ID. Linux would still ignore those because >> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >> this check is only active for madt_revision >= 5. But since this version >> check seems to be intentionally I leave that alone. > I recall there being a discussion over this, ultimately with the version > check being removed. IIRC it was a defacto standard used for a long > time prior to actually getting written into the ACPI spec, so does exist > in practice in older MADTs. > > Otherwise LGTM. I'd suggest dropping this paragraph as it's not related > to the change. It will also help if we do decide to follow up and drop > the MADT version check. > >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] > https://git.kernel.org/torvalds/c/$SHA > > Somewhat less verbose. https://korg.docs.kernel.org/git-url-shorteners.html > >> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> --- >> xen/arch/x86/acpi/boot.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c >> index 54b72d716b..4a62822fa9 100644 >> --- a/xen/arch/x86/acpi/boot.c >> +++ b/xen/arch/x86/acpi/boot.c >> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >> if (BAD_MADT_ENTRY(processor, end)) >> return -EINVAL; >> >> + /* Ignore entries with invalid apicid */ > x2apic ID. Oh, and I forgot to say. I'm happy to fix all of this up on commit if you're happy. ~Andrew
On 07.08.2023 11:38, Simon Gaiser wrote: > It seems some firmwares put dummy entries in the ACPI MADT table for non > existing processors. On my NUC11TNHi5 those have the invalid APIC ID > 0xff. Linux already has code to handle those cases both in > acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > same check to Xen. I'm afraid it doesn't become clear to me what problem you're trying to solve. > --- a/xen/arch/x86/acpi/boot.c > +++ b/xen/arch/x86/acpi/boot.c > @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) > if (BAD_MADT_ENTRY(processor, end)) > return -EINVAL; > > + /* Ignore entries with invalid apicid */ > + if (processor->local_apic_id == 0xffffffff) > + return 0; > + > /* Don't register processors that cannot be onlined. */ > if (madt_revision >= 5 && > !(processor->lapic_flags & ACPI_MADT_ENABLED) && > !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) > return 0; > > - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || > - processor->local_apic_id != 0xffffffff || opt_cpu_info) { > + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { > acpi_table_print_madt_entry(header); > log = true; > } In particular you're now suppressing log messages which may be relevant. The one issue that I'm aware of (and that I use a local hack to deal with; see bottom) is excess verbosity. Jan --- unstable.orig/xen/arch/x86/mpparse.c +++ unstable/xen/arch/x86/mpparse.c @@ -809,8 +809,13 @@ int mp_register_lapic(u32 id, bool enabl }; if (MAX_APICS <= id) { - printk(KERN_WARNING "Processor #%d invalid (max %d)\n", - id, MAX_APICS); + static u32 max_warn = -1; + + if (id <= max_warn) { + printk(KERN_WARNING "Processor #%d invalid (max %d)\n", + id, MAX_APICS); + max_warn = id - 1; + } return -EINVAL; }
Andrew Cooper: > On 07/08/2023 11:01 am, Andrew Cooper wrote: >> On 07/08/2023 10:38 am, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >>> >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>> entries with a valid APIC ID. Linux would still ignore those because >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>> this check is only active for madt_revision >= 5. But since this version >>> check seems to be intentionally I leave that alone. >> I recall there being a discussion over this, ultimately with the version >> check being removed. IIRC it was a defacto standard used for a long >> time prior to actually getting written into the ACPI spec, so does exist >> in practice in older MADTs. >> >> Otherwise LGTM. I'd suggest dropping this paragraph as it's not related >> to the change. It will also help if we do decide to follow up and drop >> the MADT version check. It's in so far related as that I know this doesn't cover all cases I actually wanted to address and I want to mention this. But I can see why you might not want to have this in this commit message. Feel free to drop. >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >> https://git.kernel.org/torvalds/c/$SHA >> >> Somewhat less verbose. https://korg.docs.kernel.org/git-url-shorteners.html Ah, handy. >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >>> --- >>> xen/arch/x86/acpi/boot.c | 14 ++++++++++---- >>> 1 file changed, 10 insertions(+), 4 deletions(-) >>> >>> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c >>> index 54b72d716b..4a62822fa9 100644 >>> --- a/xen/arch/x86/acpi/boot.c >>> +++ b/xen/arch/x86/acpi/boot.c >>> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >>> if (BAD_MADT_ENTRY(processor, end)) >>> return -EINVAL; >>> >>> + /* Ignore entries with invalid apicid */ >> x2apic ID. > > Oh, and I forgot to say. I'm happy to fix all of this up on commit if > you're happy. Sure, go ahead. Thanks!
Jan Beulich: > On 07.08.2023 11:38, Simon Gaiser wrote: >> It seems some firmwares put dummy entries in the ACPI MADT table for non >> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >> 0xff. Linux already has code to handle those cases both in >> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >> same check to Xen. > > I'm afraid it doesn't become clear to me what problem you're trying to > solve. I want Xen to not think there are possible CPUs that actually never can be there. Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and that there are 8 logical CPUs that are currently disabled but could be hotplugged. I'm moderately sure that soldering in another CPU is not supported, even less so at runtime ;] >> --- a/xen/arch/x86/acpi/boot.c >> +++ b/xen/arch/x86/acpi/boot.c >> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >> if (BAD_MADT_ENTRY(processor, end)) >> return -EINVAL; >> >> + /* Ignore entries with invalid apicid */ >> + if (processor->local_apic_id == 0xffffffff) >> + return 0; >> + >> /* Don't register processors that cannot be onlined. */ >> if (madt_revision >= 5 && >> !(processor->lapic_flags & ACPI_MADT_ENABLED) && >> !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) >> return 0; >> >> - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || >> - processor->local_apic_id != 0xffffffff || opt_cpu_info) { >> + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { >> acpi_table_print_madt_entry(header); >> log = true; >> } > > In particular you're now suppressing log messages which may be relevant. I intentionally mirrored the behavior of the check directly below. Unlike the the version in Linux the existing code didn't log ignored entries. So I did the same for the entries with an invalid ID. > The one issue that I'm aware of (and that I use a local hack to deal > with; see bottom) is excess verbosity. > > Jan > > --- unstable.orig/xen/arch/x86/mpparse.c > +++ unstable/xen/arch/x86/mpparse.c > @@ -809,8 +809,13 @@ int mp_register_lapic(u32 id, bool enabl > }; > > if (MAX_APICS <= id) { > - printk(KERN_WARNING "Processor #%d invalid (max %d)\n", > - id, MAX_APICS); > + static u32 max_warn = -1; > + > + if (id <= max_warn) { > + printk(KERN_WARNING "Processor #%d invalid (max %d)\n", > + id, MAX_APICS); > + max_warn = id - 1; > + } > return -EINVAL; > } >
On 07.08.2023 14:55, Simon Gaiser wrote: > Jan Beulich: >> On 07.08.2023 11:38, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >> >> I'm afraid it doesn't become clear to me what problem you're trying to >> solve. > > I want Xen to not think there are possible CPUs that actually never can > be there. Did you try using "maxcpus=" on the command line? If that doesn't work well enough (perhaps because of causing undesirable log messages), maybe we need some way to say "no CPU hotplug" on the command line. > Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and > that there are 8 logical CPUs that are currently disabled but could be > hotplugged. Yet it's exactly this which ACPI is telling us here (with some vagueness, which isn't easy to get around; see below). > I'm moderately sure that soldering in another CPU is not supported, even > less so at runtime ;] On your system. What about others, which are hotplug-capable? >>> --- a/xen/arch/x86/acpi/boot.c >>> +++ b/xen/arch/x86/acpi/boot.c >>> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >>> if (BAD_MADT_ENTRY(processor, end)) >>> return -EINVAL; >>> >>> + /* Ignore entries with invalid apicid */ >>> + if (processor->local_apic_id == 0xffffffff) >>> + return 0; >>> + >>> /* Don't register processors that cannot be onlined. */ >>> if (madt_revision >= 5 && >>> !(processor->lapic_flags & ACPI_MADT_ENABLED) && >>> !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) >>> return 0; >>> >>> - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || >>> - processor->local_apic_id != 0xffffffff || opt_cpu_info) { >>> + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { >>> acpi_table_print_madt_entry(header); >>> log = true; >>> } >> >> In particular you're now suppressing log messages which may be relevant. > > I intentionally mirrored the behavior of the check directly below. > Unlike the the version in Linux the existing code didn't log ignored > entries. So I did the same for the entries with an invalid ID. I'm afraid I can't bring in line the check you add early in the function with what is "directly below" [here]. I'm only inferring the "here" from the placement of your reply. Maybe instead you meant the rev >= 5 one, in which case I'm afraid the two are too different to compare: That rev >= 5 one tells us that the entry isn't going to be hotpluggable. The APIC ID check you add makes no such guarantees. That's why the new flag was actually added in v5. Jan
On 07/08/2023 2:17 pm, Jan Beulich wrote: > On 07.08.2023 14:55, Simon Gaiser wrote: >> Jan Beulich: >>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>> 0xff. Linux already has code to handle those cases both in >>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>> same check to Xen. >>> I'm afraid it doesn't become clear to me what problem you're trying to >>> solve. >> I want Xen to not think there are possible CPUs that actually never can >> be there. > Did you try using "maxcpus=" on the command line? If that doesn't work > well enough (perhaps because of causing undesirable log messages), maybe > we need some way to say "no CPU hotplug" on the command line. The ACPI tables are broken, and Xen's parsing of them is wrong. And irrespective - it's unreasonable to have users work around bugs in Xen by adding more command line. > >> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and >> that there are 8 logical CPUs that are currently disabled but could be >> hotplugged. > Yet it's exactly this which ACPI is telling us here (with some vagueness, > which isn't easy to get around; see below). > >> I'm moderately sure that soldering in another CPU is not supported, even >> less so at runtime ;] > On your system. What about others, which are hotplug-capable? It is required that all APIC ID are stated *ahead of time*. Entries with 0xFF and 0xFFFFFFFF at boot are entirely invalid. Furthermore, given debugging that I just did with Thomas, it's very clear that noone has tried really hotplugging CPUs under Xen. The layering violation is rather hilarious to watch. ~Andrew
On 07.08.2023 16:04, Andrew Cooper wrote: > On 07/08/2023 2:17 pm, Jan Beulich wrote: >> On 07.08.2023 14:55, Simon Gaiser wrote: >>> Jan Beulich: >>>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>>> 0xff. Linux already has code to handle those cases both in >>>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>>> same check to Xen. >>>> I'm afraid it doesn't become clear to me what problem you're trying to >>>> solve. >>> I want Xen to not think there are possible CPUs that actually never can >>> be there. >> Did you try using "maxcpus=" on the command line? If that doesn't work >> well enough (perhaps because of causing undesirable log messages), maybe >> we need some way to say "no CPU hotplug" on the command line. > > The ACPI tables are broken, and Xen's parsing of them is wrong. > > And irrespective - it's unreasonable to have users work around bugs in > Xen by adding more command line. Well, considering how rare CPU hotplug appears to be, such a new option could default to "no hotplug". >>> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and >>> that there are 8 logical CPUs that are currently disabled but could be >>> hotplugged. >> Yet it's exactly this which ACPI is telling us here (with some vagueness, >> which isn't easy to get around; see below). >> >>> I'm moderately sure that soldering in another CPU is not supported, even >>> less so at runtime ;] >> On your system. What about others, which are hotplug-capable? > > It is required that all APIC ID are stated *ahead of time*. Would you mind pointing me at where this is stated? Aiui MADT entries can be modified dynamically (we do so ourselves in the DSDT we supply HVM guests with), and as I further understand this may be required also for the APIC ID fields because firmware may not know how APIC IDs are to be arranged for a new socket before being populated (by perhaps a not entirely identical CPU, and this may not even need to be mixed steppings). Jan
Andrew Cooper: > On 07/08/2023 10:38 am, Simon Gaiser wrote: >> It seems some firmwares put dummy entries in the ACPI MADT table for non >> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >> 0xff. Linux already has code to handle those cases both in >> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >> same check to Xen. >> >> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >> entries with a valid APIC ID. Linux would still ignore those because >> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >> this check is only active for madt_revision >= 5. But since this version >> check seems to be intentionally I leave that alone. > > I recall there being a discussion over this, ultimately with the version > check being removed. IIRC it was a defacto standard used for a long > time prior to actually getting written into the ACPI spec, so does exist > in practice in older MADTs. So I noticed that the check in Linux is actually slightly different than I thought. Since [3] it always considers the CPU usable if ACPI_MADT_ENABLED is set. Otherwise it consider it only usable if MADT revision >= 5 and ACPI_MADT_ONLINE_CAPABLE is set. So I checked what the ACPI spec says: Up to 6.2 Errata B [6] it only defines ACPI_MADT_ENABLE as: If zero, this processor is unusable, and the operating system support will not attempt to use it. And the bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with "Must be zero". 6.3 [7] Then adds ACPI_MADT_ONLINE_CAPABLE and changes the meaning of ACPI_MADT_ENABLE: Enabled If this bit is set the processor is ready for use. If this bit is clear and the Online Capable bit is set, system hardware supports enabling this processor during OS runtime. If this bit is clear and the Online Capable bit is also clear, this processor is unusable, and OSPM shall ignore the contents of the Processor Local APIC Structure. Online Capbable The information conveyed by this bit depends on the value of the Enabled bit. If the Enabled bit is set, this bit is reserved and must be zero. Otherwise, if this this bit is set, system hardware supports enabling this processor during OS runtime. So with confirming firmwares it should be safe change the simply ignore the entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE We can also do it like Linux and ignore ACPI_MADT_ONLINE_CAPABLE completely if revision < 5. Note that the revision was already increased to 5 before 6.3. ACPI spec version MADT revision 6.2 [4] 4 6.2 Errata A [5] 45 (typo I guess) 6.2 Errata B 5 6.3 5 [3]: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f [4]: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf [5]: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf [6]: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf [7]: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf > Otherwise LGTM. I'd suggest dropping this paragraph as it's not related > to the change. It will also help if we do decide to follow up and drop > the MADT version check. > >> >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] > > https://git.kernel.org/torvalds/c/$SHA > > Somewhat less verbose. https://korg.docs.kernel.org/git-url-shorteners.html > >> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> --- >> xen/arch/x86/acpi/boot.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c >> index 54b72d716b..4a62822fa9 100644 >> --- a/xen/arch/x86/acpi/boot.c >> +++ b/xen/arch/x86/acpi/boot.c >> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >> if (BAD_MADT_ENTRY(processor, end)) >> return -EINVAL; >> >> + /* Ignore entries with invalid apicid */ > > x2apic ID. > > ~Andrew
Jan Beulich: > On 07.08.2023 14:55, Simon Gaiser wrote: >> Jan Beulich: >>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>> 0xff. Linux already has code to handle those cases both in >>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>> same check to Xen. >>> >>> I'm afraid it doesn't become clear to me what problem you're trying to >>> solve. >> >> I want Xen to not think there are possible CPUs that actually never can >> be there. > > Did you try using "maxcpus=" on the command line? If that doesn't work > well enough Yes. Then Xen says, as expected "SMP: Allowing 8 CPUs (0 hotplug CPUs)", but disabled_cpus is still 8 and nr_sockets therefore calculated as 2. > (perhaps because of causing undesirable log messages), I don't mind some verbose log messages. > maybe we need some way to say "no CPU hotplug" on the command line. That indeed might make sense, but I'm not sure this is what I want here, see below. >> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and >> that there are 8 logical CPUs that are currently disabled but could be >> hotplugged. > > Yet it's exactly this which ACPI is telling us here (with some vagueness, > which isn't easy to get around; see below). > >> I'm moderately sure that soldering in another CPU is not supported, even >> less so at runtime ;] > > On your system. What about others, which are hotplug-capable? Would those be listed with an invalid APIC ID in their entries? Then I understand your complaint. PS: based on your reply to Andrew, you think that this might be the case. >>>> --- a/xen/arch/x86/acpi/boot.c >>>> +++ b/xen/arch/x86/acpi/boot.c >>>> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) >>>> if (BAD_MADT_ENTRY(processor, end)) >>>> return -EINVAL; >>>> >>>> + /* Ignore entries with invalid apicid */ >>>> + if (processor->local_apic_id == 0xffffffff) >>>> + return 0; >>>> + >>>> /* Don't register processors that cannot be onlined. */ >>>> if (madt_revision >= 5 && >>>> !(processor->lapic_flags & ACPI_MADT_ENABLED) && >>>> !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) >>>> return 0; >>>> >>>> - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || >>>> - processor->local_apic_id != 0xffffffff || opt_cpu_info) { >>>> + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { >>>> acpi_table_print_madt_entry(header); >>>> log = true; >>>> } >>> >>> In particular you're now suppressing log messages which may be relevant. >> >> I intentionally mirrored the behavior of the check directly below. >> Unlike the the version in Linux the existing code didn't log ignored >> entries. So I did the same for the entries with an invalid ID. > > I'm afraid I can't bring in line the check you add early in the function > with what is "directly below" [here]. I'm only inferring the "here" from > the placement of your reply. Maybe instead you meant the rev >= 5 one, Yes exactly. "directly below" was meant relative to the if I added. > in which case I'm afraid the two are too different to compare: That > rev >= 5 one tells us that the entry isn't going to be hotpluggable. The > APIC ID check you add makes no such guarantees. As mentioned above, if that's the case I see the problem. > That's why the new flag was actually added in v5. See other branch of this thread: https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a5cd@invisiblethingslab.com/ So we might cover at least my case regardless how invalid APIC IDs should be interpreted.
On Mon, Aug 07, 2023 at 03:17:18PM +0200, Jan Beulich wrote: > On 07.08.2023 14:55, Simon Gaiser wrote: > > Jan Beulich: > >> On 07.08.2023 11:38, Simon Gaiser wrote: > >>> It seems some firmwares put dummy entries in the ACPI MADT table for non > >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID > >>> 0xff. Linux already has code to handle those cases both in > >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > >>> same check to Xen. > >> > >> I'm afraid it doesn't become clear to me what problem you're trying to > >> solve. > > > > I want Xen to not think there are possible CPUs that actually never can > > be there. > > Did you try using "maxcpus=" on the command line? If that doesn't work > well enough (perhaps because of causing undesirable log messages), maybe > we need some way to say "no CPU hotplug" on the command line. A Kconfig option to disable CPU hotplug at build time would also be interesting IMO. Thanks, Roger.
On 07/08/2023 3:18 pm, Jan Beulich wrote: > On 07.08.2023 16:04, Andrew Cooper wrote: >> On 07/08/2023 2:17 pm, Jan Beulich wrote: >>> On 07.08.2023 14:55, Simon Gaiser wrote: >>>> Jan Beulich: >>>>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>>>> 0xff. Linux already has code to handle those cases both in >>>>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>>>> same check to Xen. >>>>> I'm afraid it doesn't become clear to me what problem you're trying to >>>>> solve. >>>> I want Xen to not think there are possible CPUs that actually never can >>>> be there. >>> Did you try using "maxcpus=" on the command line? If that doesn't work >>> well enough (perhaps because of causing undesirable log messages), maybe >>> we need some way to say "no CPU hotplug" on the command line. >> The ACPI tables are broken, and Xen's parsing of them is wrong. >> >> And irrespective - it's unreasonable to have users work around bugs in >> Xen by adding more command line. > Well, considering how rare CPU hotplug appears to be, such a new option > could default to "no hotplug". ... or Xen could not be buggy and think there's a chance of hotplug on the 99% of servers where there isn't. > >>>> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and >>>> that there are 8 logical CPUs that are currently disabled but could be >>>> hotplugged. >>> Yet it's exactly this which ACPI is telling us here (with some vagueness, >>> which isn't easy to get around; see below). >>> >>>> I'm moderately sure that soldering in another CPU is not supported, even >>>> less so at runtime ;] >>> On your system. What about others, which are hotplug-capable? >> It is required that all APIC ID are stated *ahead of time*. > Would you mind pointing me at where this is stated? In the spec, exactly where you'd expect to find them... "OSPM does not expect the information provided in this table to be updated if the processor information changes during the lifespan of an OS boot." Which is wordsmithing for "Some firmware was found to be modifying them and this was deemed to be illegal under the spec". ~Andrew
On 07/08/2023 3:45 pm, Simon Gaiser wrote: > Andrew Cooper: >> On 07/08/2023 10:38 am, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >>> >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>> entries with a valid APIC ID. Linux would still ignore those because >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>> this check is only active for madt_revision >= 5. But since this version >>> check seems to be intentionally I leave that alone. >> I recall there being a discussion over this, ultimately with the version >> check being removed. IIRC it was a defacto standard used for a long >> time prior to actually getting written into the ACPI spec, so does exist >> in practice in older MADTs. > So I noticed that the check in Linux is actually slightly different than > I thought. Since [3] it always considers the CPU usable if > ACPI_MADT_ENABLED is set. Otherwise it consider it only usable if > MADT revision >= 5 and ACPI_MADT_ONLINE_CAPABLE is set. > > So I checked what the ACPI spec says: > > Up to 6.2 Errata B [6] it only defines ACPI_MADT_ENABLE as: > > If zero, this processor is unusable, and the operating system > support will not attempt to use it. > > And the bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with > "Must be zero". > > 6.3 [7] Then adds ACPI_MADT_ONLINE_CAPABLE and changes the meaning of > ACPI_MADT_ENABLE: > > Enabled > If this bit is set the processor is ready for use. If this bit > is clear and the Online Capable bit is set, system hardware > supports enabling this processor during OS runtime. If this bit > is clear and the Online Capable bit is also clear, this > processor is unusable, and OSPM shall ignore the contents of the > Processor Local APIC Structure. > > Online Capbable > The information conveyed by this bit depends on the value of the > Enabled bit. If the Enabled bit is set, this bit is reserved and > must be zero. Otherwise, if this this bit is set, system > hardware supports enabling this processor during OS runtime. > > So with confirming firmwares it should be safe change the simply ignore > the entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE > > We can also do it like Linux and ignore ACPI_MADT_ONLINE_CAPABLE > completely if revision < 5. > > Note that the revision was already increased to 5 before 6.3. > > ACPI spec version MADT revision > > 6.2 [4] 4 > 6.2 Errata A [5] 45 (typo I guess) > 6.2 Errata B 5 > 6.3 5 > > [3]: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f > [4]: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf > [5]: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf > [6]: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf > [7]: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf Honestly, the reserved must be zero means there's no need for a version check at all. That bit will not be set even in older MADT revisions. That said, it's likely easier to retain the version check than to set up a quirks infrastructure for buggy older MADTs. ~Andrew
On 23.08.2023 11:21, Andrew Cooper wrote: > On 07/08/2023 3:18 pm, Jan Beulich wrote: >> On 07.08.2023 16:04, Andrew Cooper wrote: >>> On 07/08/2023 2:17 pm, Jan Beulich wrote: >>>> On 07.08.2023 14:55, Simon Gaiser wrote: >>>>> Jan Beulich: >>>>>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>>>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>>>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>>>>> 0xff. Linux already has code to handle those cases both in >>>>>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>>>>> same check to Xen. >>>>>> I'm afraid it doesn't become clear to me what problem you're trying to >>>>>> solve. >>>>> I want Xen to not think there are possible CPUs that actually never can >>>>> be there. >>>> Did you try using "maxcpus=" on the command line? If that doesn't work >>>> well enough (perhaps because of causing undesirable log messages), maybe >>>> we need some way to say "no CPU hotplug" on the command line. >>> The ACPI tables are broken, and Xen's parsing of them is wrong. >>> >>> And irrespective - it's unreasonable to have users work around bugs in >>> Xen by adding more command line. >> Well, considering how rare CPU hotplug appears to be, such a new option >> could default to "no hotplug". > > ... or Xen could not be buggy and think there's a chance of hotplug on > the 99% of servers where there isn't. Why do you say "buggy" when there's no clear cut indication of whether hotplug is possible, even not all so old ACPI versions? >>>>> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and >>>>> that there are 8 logical CPUs that are currently disabled but could be >>>>> hotplugged. >>>> Yet it's exactly this which ACPI is telling us here (with some vagueness, >>>> which isn't easy to get around; see below). >>>> >>>>> I'm moderately sure that soldering in another CPU is not supported, even >>>>> less so at runtime ;] >>>> On your system. What about others, which are hotplug-capable? >>> It is required that all APIC ID are stated *ahead of time*. >> Would you mind pointing me at where this is stated? > > In the spec, exactly where you'd expect to find them... > > "OSPM does not expect the information provided in this table to be > updated if the processor information changes during the lifespan of an > OS boot." I don't think this tells us anything about the ID not possibly changing. It merely tells us that OSPM is not expected to parse this table again (IOW firmware updating just this table isn't going to be enough). IDs possibly changing is expressed by (a) the "if the processor information changes", and (b) the next sentence, forbidding them to change while the system is asleep: "While in the sleeping state, logical processors must not be added or removed, nor can their ... ID or ... Flags change." > Which is wordsmithing for "Some firmware was found to be modifying them > and this was deemed to be illegal under the spec". That's your reading of it; I certainly don't infer such from that sentence. Jan
On Wed, Aug 23 2023 at 14:56, Jan Beulich wrote: > On 23.08.2023 11:21, Andrew Cooper wrote: >> In the spec, exactly where you'd expect to find them... >> >> "OSPM does not expect the information provided in this table to be >> updated if the processor information changes during the lifespan of an >> OS boot." > > I don't think this tells us anything about the ID not possibly changing. > It merely tells us that OSPM is not expected to parse this table again > (IOW firmware updating just this table isn't going to be enough). IDs > possibly changing is expressed by (a) the "if the processor information > changes", and (b) the next sentence, forbidding them to change while the > system is asleep: "While in the sleeping state, logical processors must > not be added or removed, nor can their ... ID or ... Flags change." > >> Which is wordsmithing for "Some firmware was found to be modifying them >> and this was deemed to be illegal under the spec". > > That's your reading of it; I certainly don't infer such from that > sentence. The APIC/X2APIC description of MADT specifies flags: Enabled If this bit is set the processor is ready for use. If this bit is clear and the Online Capable bit is set, system hardware supports enabling this processor during OS runtime. If this bit is clear and the Online Capable bit is also clear, this processor is unusable, and OSPM shall ignore the contents of the Processor Local APIC Structure. Online Capable The information conveyed by this bit depends on the value of the Enabled bit. If the Enabled bit is set, this bit is reserved and must be zero. Otherwise, if this this bit is set, system hardware supports enabling this processor during OS runtime. This is also related to SRAT which defines the proximity of memory to processors at boot time with a similar set of flags. Also 8.4 says: Each processor in the system must be declared in the ACPI namespace in the \_SB scope. .... A Device definition for a processor is declared using the ACPI0007 hardware identifier (HID). Processor configuration information is provided exclusively by objects in the processor device’s object list. When the platform uses the APIC interrupt model, UID object values under a processor device are used to associate processor devices with entries in the MADT. MADT is the authoritative table for processor enumeration, whether present or not. This is required because that's the only way to size resources, which depend on the possible maximum topology. Otherwise you'd end up with a CPU hotplugged which is outside of the resource space allocated during init. Thanks, tglx
On Sun, Aug 27, 2023 at 05:44:15PM +0200, Thomas Gleixner wrote: > On Wed, Aug 23 2023 at 14:56, Jan Beulich wrote: > > On 23.08.2023 11:21, Andrew Cooper wrote: > >> In the spec, exactly where you'd expect to find them... > >> > >> "OSPM does not expect the information provided in this table to be > >> updated if the processor information changes during the lifespan of an > >> OS boot." > > > > I don't think this tells us anything about the ID not possibly changing. > > It merely tells us that OSPM is not expected to parse this table again > > (IOW firmware updating just this table isn't going to be enough). IDs > > possibly changing is expressed by (a) the "if the processor information > > changes", and (b) the next sentence, forbidding them to change while the > > system is asleep: "While in the sleeping state, logical processors must > > not be added or removed, nor can their ... ID or ... Flags change." > > > >> Which is wordsmithing for "Some firmware was found to be modifying them > >> and this was deemed to be illegal under the spec". > > > > That's your reading of it; I certainly don't infer such from that > > sentence. > > The APIC/X2APIC description of MADT specifies flags: > > Enabled If this bit is set the processor is ready for use. If > this bit is clear and the Online Capable bit is set, > system hardware supports enabling this processor during > OS runtime. If this bit is clear and the Online Capable > bit is also clear, this processor is unusable, and OSPM > shall ignore the contents of the Processor Local APIC > Structure. > > Online Capable The information conveyed by this bit depends on the > value of the Enabled bit. If the Enabled bit is set, > this bit is reserved and must be zero. Otherwise, if > this this bit is set, system hardware supports enabling > this processor during OS runtime. Sadly this flag is only present starting with MADT v5. > This is also related to SRAT which defines the proximity of memory to > processors at boot time with a similar set of flags. > > Also 8.4 says: > > Each processor in the system must be declared in the ACPI namespace in > the \_SB scope. .... A Device definition for a processor is declared > using the ACPI0007 hardware identifier (HID). Processor configuration > information is provided exclusively by objects in the processor > device’s object list. > > When the platform uses the APIC interrupt model, UID object values > under a processor device are used to associate processor devices with > entries in the MADT. > > > MADT is the authoritative table for processor enumeration, whether > present or not. This is required because that's the only way to size > resources, which depend on the possible maximum topology. > > Otherwise you'd end up with a CPU hotplugged which is outside of the > resource space allocated during init. It's my understating that ACPI UIDs 0xff and 0xfffffff for local ACPI and x2APIC structures respectively are invalid, as that's the broadcast value used by the local (x2)APIC NMI Structures. I think Jan's point (if I understood correctly) is that Processor or Device objects can have a _MAT method that returns updated MADT entries, and some ACPI implementations might modify the original entries on the MADT and return them from that method when CPU hotplug takes place. The spec notes that "OSPM does not expect the information provided in this table to be updated if the processor information changes during the lifespan of an OS boot." so that the MADT doesn't need to be updated when CPU hotplug happens, but I don't see that sentence as preventing the MADT to be updated if CPU hotplug takes place, it's just not required. I don't see anywhere in the spec that states that APIC IDs 0xff and 0xffffffff are invalid and entries using those IDs should be ignored, but I do think that any system that supports CPU hotplug better has those IDs defined since boot. Also it seems vendors have started relying on using 0xff and 0xffffffff APIC IDs to signal non-present CPUs, and Linux has been ignoring such entries for quite some time already without reported issues. Overall I'm inclined to say we should take this change, as it avoids flagging a lot of systems as CPU hotplug capable when they are not. Acked-by: Roger Pau Monné <roger.pau@citrix.com> I think Andrew had some minor comments. I wonder whether the check should be done after the entry has been printed if `opt_cpu_info` is set, but seeing as the ONLINE_CAPABLE is also done before possibly printing the entry I guess it's fine. Thanks, Roger.
On Tue, Aug 29 2023 at 16:25, Roger Pau Monné wrote: > On Sun, Aug 27, 2023 at 05:44:15PM +0200, Thomas Gleixner wrote: >> The APIC/X2APIC description of MADT specifies flags: >> >> Enabled If this bit is set the processor is ready for use. If >> this bit is clear and the Online Capable bit is set, >> system hardware supports enabling this processor during >> OS runtime. If this bit is clear and the Online Capable >> bit is also clear, this processor is unusable, and OSPM >> shall ignore the contents of the Processor Local APIC >> Structure. >> >> Online Capable The information conveyed by this bit depends on the >> value of the Enabled bit. If the Enabled bit is set, >> this bit is reserved and must be zero. Otherwise, if >> this this bit is set, system hardware supports enabling >> this processor during OS runtime. > > Sadly this flag is only present starting with MADT v5. Correct. The difference between pre v5 MADT and v5+ is that the latter allows the OS to make more informed decisions, but the lack of this flag does not make the claim that randomly assigning APIC IDs after the initial parsing is a valid approach any more correct. Why? Simply because the other relationships vs. processor UIDs and SRAT/SLIT are not magically going away due to the lack of that flag. >> Otherwise you'd end up with a CPU hotplugged which is outside of the >> resource space allocated during init. > > It's my understating that ACPI UIDs 0xff and 0xfffffff for local ACPI > and x2APIC structures respectively are invalid, as that's the > broadcast value used by the local (x2)APIC NMI Structures. Correct. These IDs are invalid independent of any flag value. > I think Jan's point (if I understood correctly) is that Processor or > Device objects can have a _MAT method that returns updated MADT > entries, and some ACPI implementations might modify the original > entries on the MADT and return them from that method when CPU > hotplug takes place. The spec notes that "OSPM does not expect the > information provided in this table to be updated if the processor > information changes during the lifespan of an OS boot." so that the > MADT doesn't need to be updated when CPU hotplug happens, but I don't > see that sentence as preventing the MADT to be updated if CPU hotplug > takes place, it's just not required. Right. But if you read carefully what I wrote then you will figure out that any randomly made up APIC ID post MADT enumeration cannot work. > I don't see anywhere in the spec that states that APIC IDs 0xff and > 0xffffffff are invalid and entries using those IDs should be ignored, > but I do think that any system that supports CPU hotplug better has > those IDs defined since boot. Also it seems vendors have started > relying on using 0xff and 0xffffffff APIC IDs to signal non-present > CPUs, and Linux has been ignoring such entries for quite some time > already without reported issues. There is no requirement for the ACPI spec to state this simply because these APIC IDs are invalid to address a processor at the architectural level. ACPI does not care about architectural restrictions unless really required, e.g. like the LAPIC vs. X2APIC exclusiveness. Thanks, tglx
On 30.08.2023 00:54, Thomas Gleixner wrote: > On Tue, Aug 29 2023 at 16:25, Roger Pau Monné wrote: >> On Sun, Aug 27, 2023 at 05:44:15PM +0200, Thomas Gleixner wrote: >>> The APIC/X2APIC description of MADT specifies flags: >>> >>> Enabled If this bit is set the processor is ready for use. If >>> this bit is clear and the Online Capable bit is set, >>> system hardware supports enabling this processor during >>> OS runtime. If this bit is clear and the Online Capable >>> bit is also clear, this processor is unusable, and OSPM >>> shall ignore the contents of the Processor Local APIC >>> Structure. >>> >>> Online Capable The information conveyed by this bit depends on the >>> value of the Enabled bit. If the Enabled bit is set, >>> this bit is reserved and must be zero. Otherwise, if >>> this this bit is set, system hardware supports enabling >>> this processor during OS runtime. >> >> Sadly this flag is only present starting with MADT v5. > > Correct. The difference between pre v5 MADT and v5+ is that the latter > allows the OS to make more informed decisions, but the lack of this flag > does not make the claim that randomly assigning APIC IDs after the > initial parsing is a valid approach any more correct. Why? > > Simply because the other relationships vs. processor UIDs and SRAT/SLIT > are not magically going away due to the lack of that flag. > >>> Otherwise you'd end up with a CPU hotplugged which is outside of the >>> resource space allocated during init. >> >> It's my understating that ACPI UIDs 0xff and 0xfffffff for local ACPI >> and x2APIC structures respectively are invalid, as that's the >> broadcast value used by the local (x2)APIC NMI Structures. > > Correct. These IDs are invalid independent of any flag value. What we apparently agree on is these special UID values to be invalid, and that UIDs can't change. But that's not the same for the APIC IDs; see below. (As a side note, Xen range-checks UID against its implementation limit MAX_MADT_ENTRIES, so it's more than just the all-ones values which we'd reject. Not really correct, I know. Looks like Linux has done away with the simple x86_acpiid_to_apicid[] translation mechanism. This being a statically sized array requires this restriction in Xen, for the time being.) >> I think Jan's point (if I understood correctly) is that Processor or >> Device objects can have a _MAT method that returns updated MADT >> entries, and some ACPI implementations might modify the original >> entries on the MADT and return them from that method when CPU >> hotplug takes place. Just to mention it: It's not just "might". I've seen DSDT code doing so on more than one occasion. >> The spec notes that "OSPM does not expect the >> information provided in this table to be updated if the processor >> information changes during the lifespan of an OS boot." so that the >> MADT doesn't need to be updated when CPU hotplug happens, but I don't >> see that sentence as preventing the MADT to be updated if CPU hotplug >> takes place, it's just not required. > > Right. But if you read carefully what I wrote then you will figure out > that any randomly made up APIC ID post MADT enumeration cannot work. I just went back and re-read your earlier reply. I can't see such following from what you wrote. >> I don't see anywhere in the spec that states that APIC IDs 0xff and >> 0xffffffff are invalid and entries using those IDs should be ignored, >> but I do think that any system that supports CPU hotplug better has >> those IDs defined since boot. Also it seems vendors have started >> relying on using 0xff and 0xffffffff APIC IDs to signal non-present >> CPUs, and Linux has been ignoring such entries for quite some time >> already without reported issues. > > There is no requirement for the ACPI spec to state this simply because > these APIC IDs are invalid to address a processor at the architectural > level. ACPI does not care about architectural restrictions unless really > required, e.g. like the LAPIC vs. X2APIC exclusiveness. Correct, active processors can't use such APIC IDs. But placeholder MADT entries can, so long as valid APIC IDs are put in place by firmware (and announced via _MAT) at the time a socket is newly populated. They'll be uniquely identified by their UID, so the OS very well knows which originally parsed (from MADT) entries are affected. As stated before, unless putting in place extra restrictions, I can't even see how firmware would be able to up front determine APIC IDs for unpopulated sockets: It simply can't know the topology of a package that's not there yet. Requiring all packages to have identical topology might be a restriction OSes put in place, but I'd be inclined to call firmware buggy if it did (short of me being aware of there being anything in the spec putting in place such a restriction). Jan
Jan! On Wed, Aug 30 2023 at 09:20, Jan Beulich wrote: > On 30.08.2023 00:54, Thomas Gleixner wrote: >> On Tue, Aug 29 2023 at 16:25, Roger Pau Monné wrote: >> >> Correct. These IDs are invalid independent of any flag value. > > What we apparently agree on is these special UID values to be invalid, > and that UIDs can't change. But that's not the same for the APIC IDs; > see below. (As a side note, Xen range-checks UID against its > implementation limit MAX_MADT_ENTRIES, so it's more than just the > all-ones values which we'd reject. Not really correct, I know. Looks > like Linux has done away with the simple x86_acpiid_to_apicid[] > translation mechanism. This being a statically sized array requires > this restriction in Xen, for the time being.) Linux ignores entries too once the the maximum number of CPUs is reached. >>> I think Jan's point (if I understood correctly) is that Processor or >>> Device objects can have a _MAT method that returns updated MADT >>> entries, and some ACPI implementations might modify the original >>> entries on the MADT and return them from that method when CPU >>> hotplug takes place. > > Just to mention it: It's not just "might". I've seen DSDT code doing > so on more than one occasion. That does not make it more correct or better. > As stated before, unless putting in place extra restrictions, I can't > even see how firmware would be able to up front determine APIC IDs for > unpopulated sockets: It simply can't know the topology of a package > that's not there yet. Requiring all packages to have identical > topology might be a restriction OSes put in place, but I'd be inclined > to call firmware buggy if it did (short of me being aware of there > being anything in the spec putting in place such a restriction). The ACPI specification does not care about restrictions which are in the realm of hardware. You simply cannot mix random CPUs in a system just as you see fit. But that aside. ACPI based hotplug is purely used by virtualization. The efforts to support real physical hotplug have never advanced beyond the proof of concept state as the required complexity turned out to be just not worth the potential benefit. Of course virtualization people might think that everything which is imaginable and not explicitly forbidden by some specification is something which should be supported. That's just a recipe for disaster as it needlessly expands the complexity space for absolutely zero value. In reality most OSes will require that all possible APIC IDs are enumerated during initialization in order to size things correctly and to make system wide decisions correctly during boot or they will either fail to accept hot-added CPUs later on or end up in a situation where after accepting a hot-added CPU it turns out that system wide boot time decisions are wrong or data is sized incorrectly, which means it is pretty much up a creek without a paddle. So in order to avoid these hard to handle, hard to debug and diagnose failure cases, it's sensible when OSes mandate enumeration requirements which have been omitted from the specification for whatever reason. Linux expects this today and in case the expectation is not met it has issues due to the non-enforcement, which cause hard to diagnose malfunction. Those issues might be fixable by some definition of fixable, but the value of doing that is close to zero. In fact it'd be a net negative because the increased complexity will just put a maintainability burden on the code base which is completely unjustifiable. Coming back to the specification issues. As of v6.3 the Online Capable flag was added with the following rationale (paraphrased): Operating systems need to size resources at boot time and therefore they count the APIC IDs which have the enabled flag cleared to size correctly for potential hotplug operations. But that has diametral effects on bare metal because the OS is not able to distinguish between hot-plugable APIC ID and truly disabled entries. That results in overallocation or suboptimal distribution of multi-queue devices. The benefit (verbatim): The proposed “Online Capable” flag will allow OSPM to unequivocally discern the platform’s intention regarding processors that are not enabled at OS boot-time. Now look at the outcome: Enabled: If this bit is set the processor is ready for use. If this bit is clear and the Online Capable bit is set, system hardware supports enabling this processor during OS runtime. If this bit is clear and the Online Capable bit is also clear, this processor is unusable, and OSPM shall ignore the contents of the Processor Local APIC Structure. So while I conceed that this does not expressis verbis mandate that hot-pluggable CPUs must be enumerated it's not far fetched to interpret it this way. The key is 'OSPM shall ignore the contents...' As MADT is the only source of information from which an OS can deduce sizing and also topology information during early boot, ignoring the contents boils down to not allocating resources and if such an APIC ID magically surfaces later on via hot-add, then the OS can rightfully refuse to add it. Having that information is simply a correctness requirement and there is absolutely no justification to support made up ivory tower configurations which try to explore the gaps and bluryness of the ACPI specification. Of course we can't rely on that flag yet if the table is not implementing v6.3+, but from my testing there is no fallout from refusing to hot-add CPUs which have not been enumerated in MADT during early boot. The issue addressed by the Online Capable bit vs. overallocation is obviosly still there when the enumerated "disabled" APIC IDs can never be hot-added. Feel free to disagree, but Linux will enforce that _all_ possible APIC IDs are enumerated in MADT during early boot in the near future. This makes a lot of things correct, simpler and more robust. Thanks, tglx
On 07.08.2023 11:38, Simon Gaiser wrote: > It seems some firmwares put dummy entries in the ACPI MADT table for non > existing processors. On my NUC11TNHi5 those have the invalid APIC ID > 0xff. Linux already has code to handle those cases both in > acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > same check to Xen. > > Note that on some older (2nd gen Core i) laptop of mine I also saw dummy > entries with a valid APIC ID. Linux would still ignore those because > they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen > this check is only active for madt_revision >= 5. But since this version > check seems to be intentionally I leave that alone. > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> This patch was committed with unaddressed review comments. The normal action in such a case would be to revert, expecting a v2 to arrive. One alternative here would be a timely incremental patch submission. Another alternative, considering in particular Thomas's most recent reply, would be to properly downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in CHANGELOG.md). I'll give it until the end of next week for either of the alternatives to be carried out. If nothing appears by then, I'll assume I may rightfully revert. (This timeline also allows putting this topic on the Community Call agenda, should anyone think this is necessary.) Regards, Jan
On Fri, 1 Sep 2023, Jan Beulich wrote: > On 07.08.2023 11:38, Simon Gaiser wrote: > > It seems some firmwares put dummy entries in the ACPI MADT table for non > > existing processors. On my NUC11TNHi5 those have the invalid APIC ID > > 0xff. Linux already has code to handle those cases both in > > acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > > same check to Xen. > > > > Note that on some older (2nd gen Core i) laptop of mine I also saw dummy > > entries with a valid APIC ID. Linux would still ignore those because > > they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen > > this check is only active for madt_revision >= 5. But since this version > > check seems to be intentionally I leave that alone. > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] > > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > > This patch was committed with unaddressed review comments. The normal action > in such a case would be to revert, expecting a v2 to arrive. One alternative > here would be a timely incremental patch submission. Another alternative, > considering in particular Thomas's most recent reply, would be to properly > downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in > CHANGELOG.md). I am in favor of downgrading physical CPU hotplug support in SUPPORT.md. I noticed that there is no entry for physical CPU hotplug support in SUPPORT.md today. Should we assume that it is not supported already as it is not listed as supported? Specifically, would it be a good idea to add a sentence at the top of the file saying that anything not explicitly listed is not supported?
On 06.09.2023 22:49, Stefano Stabellini wrote: > On Fri, 1 Sep 2023, Jan Beulich wrote: >> On 07.08.2023 11:38, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >>> >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>> entries with a valid APIC ID. Linux would still ignore those because >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>> this check is only active for madt_revision >= 5. But since this version >>> check seems to be intentionally I leave that alone. >>> >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> >> This patch was committed with unaddressed review comments. The normal action >> in such a case would be to revert, expecting a v2 to arrive. One alternative >> here would be a timely incremental patch submission. Another alternative, >> considering in particular Thomas's most recent reply, would be to properly >> downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in >> CHANGELOG.md). > > I am in favor of downgrading physical CPU hotplug support in > SUPPORT.md. > > I noticed that there is no entry for physical CPU hotplug support in > SUPPORT.md today. Should we assume that it is not supported already as > it is not listed as supported? Hmm, I see ## Host hardware support ### Physical CPU Hotplug Status, x86: Supported pretty close to the top of the file. > Specifically, would it be a good idea to add a sentence at the top of > the file saying that anything not explicitly listed is not supported? Iirc that was the plan to do for 4.18, but then we need to be sure that things don't unintentionally become unsupported. I've no clear idea how this plan was meant to be carried out, though. Jan
On Thu, 7 Sep 2023, Jan Beulich wrote: > On 06.09.2023 22:49, Stefano Stabellini wrote: > > On Fri, 1 Sep 2023, Jan Beulich wrote: > >> On 07.08.2023 11:38, Simon Gaiser wrote: > >>> It seems some firmwares put dummy entries in the ACPI MADT table for non > >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID > >>> 0xff. Linux already has code to handle those cases both in > >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the > >>> same check to Xen. > >>> > >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy > >>> entries with a valid APIC ID. Linux would still ignore those because > >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen > >>> this check is only active for madt_revision >= 5. But since this version > >>> check seems to be intentionally I leave that alone. > >>> > >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] > >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] > >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > >> > >> This patch was committed with unaddressed review comments. The normal action > >> in such a case would be to revert, expecting a v2 to arrive. One alternative > >> here would be a timely incremental patch submission. Another alternative, > >> considering in particular Thomas's most recent reply, would be to properly > >> downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in > >> CHANGELOG.md). > > > > I am in favor of downgrading physical CPU hotplug support in > > SUPPORT.md. > > > > I noticed that there is no entry for physical CPU hotplug support in > > SUPPORT.md today. Should we assume that it is not supported already as > > it is not listed as supported? > > Hmm, I see > > ## Host hardware support > > ### Physical CPU Hotplug > > Status, x86: Supported > > pretty close to the top of the file. Ops, it must have been the case-sensitive search that failed me > > Specifically, would it be a good idea to add a sentence at the top of > > the file saying that anything not explicitly listed is not supported? > > Iirc that was the plan to do for 4.18, but then we need to be sure that > things don't unintentionally become unsupported. I've no clear idea how > this plan was meant to be carried out, though. it would be interesting to discuss it again
Andrew Cooper: > On 07/08/2023 3:45 pm, Simon Gaiser wrote: >> Andrew Cooper: >>> On 07/08/2023 10:38 am, Simon Gaiser wrote: >>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>> 0xff. Linux already has code to handle those cases both in >>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>> same check to Xen. >>>> >>>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>>> entries with a valid APIC ID. Linux would still ignore those because >>>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>>> this check is only active for madt_revision >= 5. But since this version >>>> check seems to be intentionally I leave that alone. >>> I recall there being a discussion over this, ultimately with the version >>> check being removed. IIRC it was a defacto standard used for a long >>> time prior to actually getting written into the ACPI spec, so does exist >>> in practice in older MADTs. >> So I noticed that the check in Linux is actually slightly different than >> I thought. Since [3] it always considers the CPU usable if >> ACPI_MADT_ENABLED is set. Otherwise it consider it only usable if >> MADT revision >= 5 and ACPI_MADT_ONLINE_CAPABLE is set. >> >> So I checked what the ACPI spec says: >> >> Up to 6.2 Errata B [6] it only defines ACPI_MADT_ENABLE as: >> >> If zero, this processor is unusable, and the operating system >> support will not attempt to use it. >> >> And the bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with >> "Must be zero". >> >> 6.3 [7] Then adds ACPI_MADT_ONLINE_CAPABLE and changes the meaning of >> ACPI_MADT_ENABLE: >> >> Enabled >> If this bit is set the processor is ready for use. If this bit >> is clear and the Online Capable bit is set, system hardware >> supports enabling this processor during OS runtime. If this bit >> is clear and the Online Capable bit is also clear, this >> processor is unusable, and OSPM shall ignore the contents of the >> Processor Local APIC Structure. >> >> Online Capbable >> The information conveyed by this bit depends on the value of the >> Enabled bit. If the Enabled bit is set, this bit is reserved and >> must be zero. Otherwise, if this this bit is set, system >> hardware supports enabling this processor during OS runtime. >> >> So with confirming firmwares it should be safe change the simply ignore >> the entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE >> >> We can also do it like Linux and ignore ACPI_MADT_ONLINE_CAPABLE >> completely if revision < 5. >> >> Note that the revision was already increased to 5 before 6.3. >> >> ACPI spec version MADT revision >> >> 6.2 [4] 4 >> 6.2 Errata A [5] 45 (typo I guess) >> 6.2 Errata B 5 >> 6.3 5 >> >> [3]: https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f >> [4]: http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf >> [5]: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf >> [6]: https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf >> [7]: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf > > Honestly, the reserved must be zero means there's no need for a version > check at all. That bit will not be set even in older MADT revisions. > > That said, it's likely easier to retain the version check than to set up > a quirks infrastructure for buggy older MADTs. Yes, that's what I was thinking too. Send a patch: https://lore.kernel.org/xen-devel/20230911101238.18005-1-simon@invisiblethingslab.com/ Simon
Jan Beulich: > On 07.08.2023 11:38, Simon Gaiser wrote: >> It seems some firmwares put dummy entries in the ACPI MADT table for non >> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >> 0xff. Linux already has code to handle those cases both in >> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >> same check to Xen. >> >> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >> entries with a valid APIC ID. Linux would still ignore those because >> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >> this check is only active for madt_revision >= 5. But since this version >> check seems to be intentionally I leave that alone. >> >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> > > This patch was committed with unaddressed review comments. The normal action > in such a case would be to revert, expecting a v2 to arrive. One alternative > here would be a timely incremental patch submission. Another alternative, > considering in particular Thomas's most recent reply, would be to properly > downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in > CHANGELOG.md). > > I'll give it until the end of next week for either of the alternatives to be > carried out. If nothing appears by then, I'll assume I may rightfully revert. > (This timeline also allows putting this topic on the Community Call agenda, > should anyone think this is necessary.) To avoid misunderstandings, since you mentioned the above in your response to the related patch I submitted today [3]: My understanding is that your main concern is that those entries with invalid APIC IDs shouldn't be ignored, since some firmwares later update them with valid IDs (on hotplug). This fully conflicts with the intention of the patch. Resolving the disagreement that followed is outside of my control, as being only the submitter of the patch. You also commented about not logging the ignored entries. Until it's clear if the change in general is accepted in the end, I considered it pointless to address that detail. If you disagree and want a follow up for that, just let me know. Simon [3]: https://lore.kernel.org/xen-devel/20230911101238.18005-1-simon@invisiblethingslab.com/
On 06/09/2023 9:49 pm, Stefano Stabellini wrote: > On Fri, 1 Sep 2023, Jan Beulich wrote: >> On 07.08.2023 11:38, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >>> >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>> entries with a valid APIC ID. Linux would still ignore those because >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>> this check is only active for madt_revision >= 5. But since this version >>> check seems to be intentionally I leave that alone. >>> >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> This patch was committed with unaddressed review comments. Count the number of x86 maintainer tags on the patch, and see that they were given after discussion, not before. You're outvoted 2:1 by Xen x86 maintainers alone, and more than 2:1 if you include the x86 maintainers from other projects who participated in the discussion. I'm not willing to let Xen keep on malfunctioning on millions of systems just because you think an unfinished spec is more correct that practical reality which invalidates it. >> The normal action >> in such a case would be to revert, expecting a v2 to arrive. One alternative >> here would be a timely incremental patch submission. Another alternative, >> considering in particular Thomas's most recent reply, would be to properly >> downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in >> CHANGELOG.md). > I am in favor of downgrading physical CPU hotplug support in > SUPPORT.md. SUPPORT.md does look bogus and wants correcting, but it is laughable to claim that this ever worked, let alone that it's supported. Intel got half way through specifying ACPI CPU Hotplug, and abandoned it as a technology because they couldn't get it to work. Hence the feature has never been tested. Furthermore, cursory testing that Thomas did for the Linux topology work demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. Even furthermore, it's an area of the Xen / dom0 boundary which is fundamentally broken for non-PV cases, and undocumented for the PV case, hence why it's broken in Linux. Physical CPU Hotplug does not pass the bar for being anything more than experimental. It's absolutely not tech-preview level because the only demo it has had in an environment (admittedly virtual) which does implement the spec in a usable way demonstrates that it doesn't function. The fact no-one has noticed until now shows that the feature isn't used, which comes back around full circle to the fact that Intel never made it work and never shipped it. ~Andrew
On Mon, 11 Sep 2023, Andrew Cooper wrote: > Physical CPU Hotplug does not pass the bar for being anything more than > experimental. It's absolutely not tech-preview level because the only > demo it has had in an environment (admittedly virtual) which does > implement the spec in a usable way demonstrates that it doesn't function. > > The fact no-one has noticed until now shows that the feature isn't used, > which comes back around full circle to the fact that Intel never made it > work and never shipped it. So we actually have agreement on how to move forward --- SUPPORT: downgrade Physical CPU Hotplug to Experimental The feature is not commonly used, and we don't have hardware to test it, not in OSSTest, not in Gitlab, and not even ad-hoc manually by community members. Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> diff --git a/SUPPORT.md b/SUPPORT.md index 3461f5cf2f..02e2f6eaa8 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -48,7 +48,7 @@ For the Cortex A77 r0p0 - r1p0, see Errata 1508412. ### Physical CPU Hotplug - Status, x86: Supported + Status, x86: Experimental ### Physical Memory
On Mon, Sep 11 2023 at 19:24, Andrew Cooper wrote: > Furthermore, cursory testing that Thomas did for the Linux topology work > demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. > > Even furthermore, it's an area of the Xen / dom0 boundary which is > fundamentally broken for non-PV cases, and undocumented for the PV case, > hence why it's broken in Linux. > > Physical CPU Hotplug does not pass the bar for being anything more than > experimental. It's absolutely not tech-preview level because the only > demo it has had in an environment (admittedly virtual) which does > implement the spec in a usable way demonstrates that it doesn't function. > > The fact no-one has noticed until now shows that the feature isn't used, > which comes back around full circle to the fact that Intel never made it > work and never shipped it. OTOH it _is_ used in virtualization. KVM supports it and it just works. That's how I found out that XEN explodes in colourful ways :) Thanks, tglx
On 11.09.2023 20:24, Andrew Cooper wrote: > On 06/09/2023 9:49 pm, Stefano Stabellini wrote: >> On Fri, 1 Sep 2023, Jan Beulich wrote: >>> On 07.08.2023 11:38, Simon Gaiser wrote: >>>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>>> 0xff. Linux already has code to handle those cases both in >>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>>> same check to Xen. >>>> >>>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>>> entries with a valid APIC ID. Linux would still ignore those because >>>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>>> this check is only active for madt_revision >= 5. But since this version >>>> check seems to be intentionally I leave that alone. >>>> >>>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >>>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >>>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >>> This patch was committed with unaddressed review comments. > > Count the number of x86 maintainer tags on the patch, and see that they > were given after discussion, not before. > > You're outvoted 2:1 by Xen x86 maintainers alone, and more than 2:1 if > you include the x86 maintainers from other projects who participated in > the discussion. I wasn't aware that ./MAINTAINERS having 4. There must be no "open" objections. was possible to overrule by any number of acks. >>> The normal action >>> in such a case would be to revert, expecting a v2 to arrive. One alternative >>> here would be a timely incremental patch submission. Another alternative, >>> considering in particular Thomas's most recent reply, would be to properly >>> downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in >>> CHANGELOG.md). >> I am in favor of downgrading physical CPU hotplug support in >> SUPPORT.md. > > SUPPORT.md does look bogus and wants correcting, but it is laughable to > claim that this ever worked, let alone that it's supported. > > Intel got half way through specifying ACPI CPU Hotplug, and abandoned it > as a technology because they couldn't get it to work. Hence the feature > has never been tested. > > Furthermore, cursory testing that Thomas did for the Linux topology work > demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. > > Even furthermore, it's an area of the Xen / dom0 boundary which is > fundamentally broken for non-PV cases, and undocumented for the PV case, > hence why it's broken in Linux. > > > Physical CPU Hotplug does not pass the bar for being anything more than > experimental. It's absolutely not tech-preview level because the only > demo it has had in an environment (admittedly virtual) which does > implement the spec in a usable way demonstrates that it doesn't function. > > The fact no-one has noticed until now shows that the feature isn't used, > which comes back around full circle to the fact that Intel never made it > work and never shipped it. And note how I offered a compromise by someone (Simon, you, Roger, or yet someone else) sending a patch against SUPPORT.md. Yet that hasn't happened. I therefore continue to be of the opinion that I can rightfully revert the patch, based on process not having been followed and alternative actions not having been taken. Jan
On 12.09.2023 10:33, Jan Beulich wrote: > And note how I offered a compromise by someone (Simon, you, Roger, or yet > someone else) sending a patch against SUPPORT.md. Yet that hasn't happened. > I therefore continue to be of the opinion that I can rightfully revert the > patch, based on process not having been followed and alternative actions > not having been taken. Hmm, I now see that Stefano has actually sent such a patch, just not on a fresh thread. With that acked and committed, the primary reason to revert will disappear. I'll mention the other aspect in a reply to Simon. Jan
On 11.09.2023 20:05, Simon Gaiser wrote: > Jan Beulich: >> On 07.08.2023 11:38, Simon Gaiser wrote: >>> It seems some firmwares put dummy entries in the ACPI MADT table for non >>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID >>> 0xff. Linux already has code to handle those cases both in >>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the >>> same check to Xen. >>> >>> Note that on some older (2nd gen Core i) laptop of mine I also saw dummy >>> entries with a valid APIC ID. Linux would still ignore those because >>> they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen >>> this check is only active for madt_revision >= 5. But since this version >>> check seems to be intentionally I leave that alone. >>> >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] >>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] >>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> >> >> This patch was committed with unaddressed review comments. The normal action >> in such a case would be to revert, expecting a v2 to arrive. One alternative >> here would be a timely incremental patch submission. Another alternative, >> considering in particular Thomas's most recent reply, would be to properly >> downgrade CPU hotplug support in SUPPORT.md (with a corresponding entry in >> CHANGELOG.md). >> >> I'll give it until the end of next week for either of the alternatives to be >> carried out. If nothing appears by then, I'll assume I may rightfully revert. >> (This timeline also allows putting this topic on the Community Call agenda, >> should anyone think this is necessary.) > > To avoid misunderstandings, since you mentioned the above in your > response to the related patch I submitted today [3]: > > My understanding is that your main concern is that those entries with > invalid APIC IDs shouldn't be ignored, since some firmwares later update > them with valid IDs (on hotplug). This fully conflicts with the > intention of the patch. Resolving the disagreement that followed is > outside of my control, as being only the submitter of the patch. Why would that be outside of your control? Your change broke a feature officially stated as supported. Now that there is a patch to downgrade support, that aspect will be resolved as soon as that patch gets ack-ed and committed. > You also commented about not logging the ignored entries. Until it's > clear if the change in general is accepted in the end, I considered it > pointless to address that detail. If you disagree and want a follow up > for that, just let me know. I take a different perspective here: The patch shouldn't have been committed without this aspect addressed, either verbally or by sending a v2. I continue to think that an incremental change is warranted to make sure logging of entries, at least with "cpuinfo" in use, remains consistent with what we had before. Otherwise debugging of possible issues becomes yet more difficult. Jan
On 12.09.2023 10:41, Jan Beulich wrote: > On 11.09.2023 20:05, Simon Gaiser wrote: >> You also commented about not logging the ignored entries. Until it's >> clear if the change in general is accepted in the end, I considered it >> pointless to address that detail. If you disagree and want a follow up >> for that, just let me know. > > I take a different perspective here: The patch shouldn't have been > committed without this aspect addressed, either verbally or by sending > a v2. I continue to think that an incremental change is warranted to > make sure logging of entries, at least with "cpuinfo" in use, remains > consistent with what we had before. Otherwise debugging of possible > issues becomes yet more difficult. Oh, and just to add to this: Because of excess verbosity resulting from such ambiguous MADT entries, for many years I have been carrying a private patch (reproduced below, just fyi). But since those entries are still in line with the spec, it didn't seem appropriate to me to propose this change for upstream inclusion. Jan --- unstable.orig/xen/arch/x86/mpparse.c +++ unstable/xen/arch/x86/mpparse.c @@ -809,8 +809,13 @@ int mp_register_lapic(u32 id, bool enabl }; if (MAX_APICS <= id) { - printk(KERN_WARNING "Processor #%d invalid (max %d)\n", - id, MAX_APICS); + static uint32_t max_warn = -1; + + if (id <= max_warn) { + printk(KERN_WARNING "Processor #%d invalid (max %d)\n", + id, MAX_APICS); + max_warn = id - 1; + } return -EINVAL; }
On Mon, Sep 11, 2023 at 03:38:05PM -0700, Stefano Stabellini wrote: > On Mon, 11 Sep 2023, Andrew Cooper wrote: > > Physical CPU Hotplug does not pass the bar for being anything more than > > experimental. It's absolutely not tech-preview level because the only > > demo it has had in an environment (admittedly virtual) which does > > implement the spec in a usable way demonstrates that it doesn't function. > > > > The fact no-one has noticed until now shows that the feature isn't used, > > which comes back around full circle to the fact that Intel never made it > > work and never shipped it. > > So we actually have agreement on how to move forward > > --- > SUPPORT: downgrade Physical CPU Hotplug to Experimental > > The feature is not commonly used, and we don't have hardware to test it, > not in OSSTest, not in Gitlab, and not even ad-hoc manually by community > members. We could use QEMU to test, so it's not impossible to test, just that AFAICT there aren't (that m)any users of it. > Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> > > diff --git a/SUPPORT.md b/SUPPORT.md > index 3461f5cf2f..02e2f6eaa8 100644 > --- a/SUPPORT.md > +++ b/SUPPORT.md > @@ -48,7 +48,7 @@ For the Cortex A77 r0p0 - r1p0, see Errata 1508412. > > ### Physical CPU Hotplug I think it would be clearer to rename this to "ACPI CPU Hotplug", as to not be confused with the late CPU bringup done by the pvshim. Thanks, Roger.
On Tue, Sep 12, 2023 at 8:57 AM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Mon, Sep 11 2023 at 19:24, Andrew Cooper wrote: > > Furthermore, cursory testing that Thomas did for the Linux topology work > > demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. > > > > Even furthermore, it's an area of the Xen / dom0 boundary which is > > fundamentally broken for non-PV cases, and undocumented for the PV case, > > hence why it's broken in Linux. > > > > Physical CPU Hotplug does not pass the bar for being anything more than > > experimental. It's absolutely not tech-preview level because the only > > demo it has had in an environment (admittedly virtual) which does > > implement the spec in a usable way demonstrates that it doesn't function. > > > > The fact no-one has noticed until now shows that the feature isn't used, > > which comes back around full circle to the fact that Intel never made it > > work and never shipped it. > > OTOH it _is_ used in virtualization. KVM supports it and it just > works. That's how I found out that XEN explodes in colourful ways :) It should be pointed out that there's currently a start-up selling a product that specifically runs Xen under cloud providers -- Exostellar (was Exotanium) [1]. If cloud providers do use ACPI hotplug to allow on-the-fly adjustments of the number of vcpus, Exostellar will probably want support at some point. (Perhaps it would be good to rope them into testing / maintaining it.) -George [1] https://exostellar.io/
On Wed, 13 Sep 2023, George Dunlap wrote: > On Tue, Sep 12, 2023 at 8:57 AM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > On Mon, Sep 11 2023 at 19:24, Andrew Cooper wrote: > > > Furthermore, cursory testing that Thomas did for the Linux topology work > > > demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. > > > > > > Even furthermore, it's an area of the Xen / dom0 boundary which is > > > fundamentally broken for non-PV cases, and undocumented for the PV case, > > > hence why it's broken in Linux. > > > > > > Physical CPU Hotplug does not pass the bar for being anything more than > > > experimental. It's absolutely not tech-preview level because the only > > > demo it has had in an environment (admittedly virtual) which does > > > implement the spec in a usable way demonstrates that it doesn't function. > > > > > > The fact no-one has noticed until now shows that the feature isn't used, > > > which comes back around full circle to the fact that Intel never made it > > > work and never shipped it. > > > > OTOH it _is_ used in virtualization. KVM supports it and it just > > works. That's how I found out that XEN explodes in colourful ways :) > > It should be pointed out that there's currently a start-up selling a > product that specifically runs Xen under cloud providers -- Exostellar > (was Exotanium) [1]. If cloud providers do use ACPI hotplug to allow > on-the-fly adjustments of the number of vcpus, Exostellar will > probably want support at some point. (Perhaps it would be good to > rope them into testing / maintaining it.) Supporting CPU hotplug in a nested virtualization setting is a different proposition compared to supporting Physical CPU Hotplug. Typically virtual firmware (hypervisor-provided firmware) has less unexpected behaviors compared to baremetal firmware. Could you make the distinction in SUPPORT.md?
On Thu, Sep 14, 2023 at 12:18 AM Stefano Stabellini <sstabellini@kernel.org> wrote: > > On Wed, 13 Sep 2023, George Dunlap wrote: > > On Tue, Sep 12, 2023 at 8:57 AM Thomas Gleixner <tglx@linutronix.de> wrote: > > > > > > On Mon, Sep 11 2023 at 19:24, Andrew Cooper wrote: > > > > Furthermore, cursory testing that Thomas did for the Linux topology work > > > > demonstrates that it is broken anyway for reasons unrelated to ACPI parsing. > > > > > > > > Even furthermore, it's an area of the Xen / dom0 boundary which is > > > > fundamentally broken for non-PV cases, and undocumented for the PV case, > > > > hence why it's broken in Linux. > > > > > > > > Physical CPU Hotplug does not pass the bar for being anything more than > > > > experimental. It's absolutely not tech-preview level because the only > > > > demo it has had in an environment (admittedly virtual) which does > > > > implement the spec in a usable way demonstrates that it doesn't function. > > > > > > > > The fact no-one has noticed until now shows that the feature isn't used, > > > > which comes back around full circle to the fact that Intel never made it > > > > work and never shipped it. > > > > > > OTOH it _is_ used in virtualization. KVM supports it and it just > > > works. That's how I found out that XEN explodes in colourful ways :) > > > > It should be pointed out that there's currently a start-up selling a > > product that specifically runs Xen under cloud providers -- Exostellar > > (was Exotanium) [1]. If cloud providers do use ACPI hotplug to allow > > on-the-fly adjustments of the number of vcpus, Exostellar will > > probably want support at some point. (Perhaps it would be good to > > rope them into testing / maintaining it.) > > Supporting CPU hotplug in a nested virtualization setting is a different > proposition compared to supporting Physical CPU Hotplug. Typically > virtual firmware (hypervisor-provided firmware) has less unexpected > behaviors compared to baremetal firmware. > > Could you make the distinction in SUPPORT.md? People say at the moment it doesn't actually work, even under QEMU; so "Experimental" is probably the right status. But if someone got it working, we might add "supported, with caveats" and specify things in a comment. -George
diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c index 54b72d716b..4a62822fa9 100644 --- a/xen/arch/x86/acpi/boot.c +++ b/xen/arch/x86/acpi/boot.c @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, const unsigned long end) if (BAD_MADT_ENTRY(processor, end)) return -EINVAL; + /* Ignore entries with invalid apicid */ + if (processor->local_apic_id == 0xffffffff) + return 0; + /* Don't register processors that cannot be onlined. */ if (madt_revision >= 5 && !(processor->lapic_flags & ACPI_MADT_ENABLED) && !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) return 0; - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || - processor->local_apic_id != 0xffffffff || opt_cpu_info) { + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) { acpi_table_print_madt_entry(header); log = true; } @@ -143,14 +146,17 @@ acpi_parse_lapic(struct acpi_subtable_header * header, const unsigned long end) if (BAD_MADT_ENTRY(processor, end)) return -EINVAL; + /* Ignore entries with invalid apicid */ + if (processor->id == 0xff) + return 0; + /* Don't register processors that cannot be onlined. */ if (madt_revision >= 5 && !(processor->lapic_flags & ACPI_MADT_ENABLED) && !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) return 0; - if ((processor->lapic_flags & ACPI_MADT_ENABLED) || - processor->id != 0xff || opt_cpu_info) + if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) acpi_table_print_madt_entry(header); /* Record local apic id only when enabled */
It seems some firmwares put dummy entries in the ACPI MADT table for non existing processors. On my NUC11TNHi5 those have the invalid APIC ID 0xff. Linux already has code to handle those cases both in acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the same check to Xen. Note that on some older (2nd gen Core i) laptop of mine I also saw dummy entries with a valid APIC ID. Linux would still ignore those because they have !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE. But in Xen this check is only active for madt_revision >= 5. But since this version check seems to be intentionally I leave that alone. Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f3bf1dbe64b62a2058dd1944c00990df203e8e7a # [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10daf10ab154e31237a8c07242be3063fb6a9bf4 # [2] Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> --- xen/arch/x86/acpi/boot.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)