diff mbox series

[1/3] acpi/processor: fix evaluating _PDC method when running as Xen dom0

Message ID 20221121102113.41893-2-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series xen: ACPI processor related fixes | expand

Commit Message

Roger Pau Monné Nov. 21, 2022, 10:21 a.m. UTC
When running as a Xen dom0 the number of CPUs available to Linux can
be different from the number of CPUs present on the system, but in
order to properly fetch processor performance related data _PDC must
be executed on all the physical CPUs online on the system.

The current checks in processor_physically_present() result in some
processor objects not getting their _PDC methods evaluated when Linux
is running as Xen dom0.  Fix this by introducing a custom function to
use when running as Xen dom0 in order to check whether a processor
object matches a CPU that's online.

Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
 arch/x86/xen/enlighten.c              | 27 +++++++++++++++++++++++++++
 drivers/acpi/processor_pdc.c          | 11 +++++++++++
 3 files changed, 48 insertions(+)

Comments

Jan Beulich Nov. 21, 2022, 2:02 p.m. UTC | #1
On 21.11.2022 11:21, Roger Pau Monne wrote:
> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
>  		return false;
>  	}
>  
> +	if (xen_initial_domain())
> +		/*
> +		 * When running as a Xen dom0 the number of processors Linux
> +		 * sees can be different from the real number of processors on
> +		 * the system, and we still need to execute _PDC for all of
> +		 * them.
> +		 */
> +		return xen_processor_present(acpi_id);
> +
>  	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
>  	cpuid = acpi_get_cpuid(handle, type, acpi_id);

We had to deal with this in our XenoLinux forward ports as well, but at
the time it appeared upstream I decided to make use of acpi_get_apicid()
(which meanwhile was renamed to acpi_get_phys_id()). Wouldn't than be an
option, eliminating the need for a Xen-specific new function?

Jan
Roger Pau Monné Nov. 21, 2022, 2:29 p.m. UTC | #2
On Mon, Nov 21, 2022 at 03:02:30PM +0100, Jan Beulich wrote:
> On 21.11.2022 11:21, Roger Pau Monne wrote:
> > @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
> >  		return false;
> >  	}
> >  
> > +	if (xen_initial_domain())
> > +		/*
> > +		 * When running as a Xen dom0 the number of processors Linux
> > +		 * sees can be different from the real number of processors on
> > +		 * the system, and we still need to execute _PDC for all of
> > +		 * them.
> > +		 */
> > +		return xen_processor_present(acpi_id);
> > +
> >  	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
> >  	cpuid = acpi_get_cpuid(handle, type, acpi_id);
> 
> We had to deal with this in our XenoLinux forward ports as well, but at
> the time it appeared upstream I decided to make use of acpi_get_apicid()
> (which meanwhile was renamed to acpi_get_phys_id()). Wouldn't than be an
> option, eliminating the need for a Xen-specific new function?

While this would work for PV, it won't work on a PVH dom0, since the
ACPI MADT table is not the native one in that case, and thus the
Processor UIDs in the MADT don't match the ones in the Processor
objects/devices.

Thanks, Roger.
Jan Beulich Nov. 21, 2022, 2:51 p.m. UTC | #3
On 21.11.2022 15:29, Roger Pau Monné wrote:
> On Mon, Nov 21, 2022 at 03:02:30PM +0100, Jan Beulich wrote:
>> On 21.11.2022 11:21, Roger Pau Monne wrote:
>>> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
>>>  		return false;
>>>  	}
>>>  
>>> +	if (xen_initial_domain())
>>> +		/*
>>> +		 * When running as a Xen dom0 the number of processors Linux
>>> +		 * sees can be different from the real number of processors on
>>> +		 * the system, and we still need to execute _PDC for all of
>>> +		 * them.
>>> +		 */
>>> +		return xen_processor_present(acpi_id);
>>> +
>>>  	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
>>>  	cpuid = acpi_get_cpuid(handle, type, acpi_id);
>>
>> We had to deal with this in our XenoLinux forward ports as well, but at
>> the time it appeared upstream I decided to make use of acpi_get_apicid()
>> (which meanwhile was renamed to acpi_get_phys_id()). Wouldn't than be an
>> option, eliminating the need for a Xen-specific new function?
> 
> While this would work for PV, it won't work on a PVH dom0, since the
> ACPI MADT table is not the native one in that case, and thus the
> Processor UIDs in the MADT don't match the ones in the Processor
> objects/devices.

I wonder whether we can actually get away with this difference long term.
I've gone back and looked at the commit introducing the code to build the
replacement MADT, but there's no mention of either the reason for the
changed numbering or the reason for limiting MADT entries to just the
number of CPUs Dom0 will have. (Clearly we need distinct APIC IDs, at
least until Xen becomes more flexible / correct in this regard. And
clearly we'd need to "invent" ACPI IDs in case Dom0 had more vCPU-s than
there are pCPU-s.)

Jan
Roger Pau Monné Nov. 21, 2022, 3:09 p.m. UTC | #4
On Mon, Nov 21, 2022 at 03:51:58PM +0100, Jan Beulich wrote:
> On 21.11.2022 15:29, Roger Pau Monné wrote:
> > On Mon, Nov 21, 2022 at 03:02:30PM +0100, Jan Beulich wrote:
> >> On 21.11.2022 11:21, Roger Pau Monne wrote:
> >>> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
> >>>  		return false;
> >>>  	}
> >>>  
> >>> +	if (xen_initial_domain())
> >>> +		/*
> >>> +		 * When running as a Xen dom0 the number of processors Linux
> >>> +		 * sees can be different from the real number of processors on
> >>> +		 * the system, and we still need to execute _PDC for all of
> >>> +		 * them.
> >>> +		 */
> >>> +		return xen_processor_present(acpi_id);
> >>> +
> >>>  	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
> >>>  	cpuid = acpi_get_cpuid(handle, type, acpi_id);
> >>
> >> We had to deal with this in our XenoLinux forward ports as well, but at
> >> the time it appeared upstream I decided to make use of acpi_get_apicid()
> >> (which meanwhile was renamed to acpi_get_phys_id()). Wouldn't than be an
> >> option, eliminating the need for a Xen-specific new function?
> > 
> > While this would work for PV, it won't work on a PVH dom0, since the
> > ACPI MADT table is not the native one in that case, and thus the
> > Processor UIDs in the MADT don't match the ones in the Processor
> > objects/devices.
> 
> I wonder whether we can actually get away with this difference long term.
> I've gone back and looked at the commit introducing the code to build the
> replacement MADT, but there's no mention of either the reason for the
> changed numbering or the reason for limiting MADT entries to just the
> number of CPUs Dom0 will have. (Clearly we need distinct APIC IDs, at
> least until Xen becomes more flexible / correct in this regard. And
> clearly we'd need to "invent" ACPI IDs in case Dom0 had more vCPU-s than
> there are pCPU-s.)

Linux when running in PVH/HVM mode uses the ACPI Processor UID in the
MADT as the vCPU ID, so attempting to re-use the native UIDs doesn't
work.

We could expand the dom0 crafted MADT to make sure all the native ACPI
Processor UIDs are present in the crafted MADT, by adding them as not
present entries, but that seems more like a bodge than a proper
solution.  Even then those X2APIC entries would appear as offline by
the current checks, and thus won't get _PDC evaluated either.

Thanks, Roger.
Roger Pau Monné Nov. 29, 2022, 4:01 p.m. UTC | #5
Ping?

So far I've got some feedback from Jan which I've replied to, but I
haven't got any more feedback.

Both patches 1 and 2 are required in order for Xen dom0s to properly
handle ACPI Processor related data to the hypervisor. Patch 3 can be
deal with later.

Thanks, Roger.

On Mon, Nov 21, 2022 at 11:21:10AM +0100, Roger Pau Monne wrote:
> When running as a Xen dom0 the number of CPUs available to Linux can
> be different from the number of CPUs present on the system, but in
> order to properly fetch processor performance related data _PDC must
> be executed on all the physical CPUs online on the system.
> 
> The current checks in processor_physically_present() result in some
> processor objects not getting their _PDC methods evaluated when Linux
> is running as Xen dom0.  Fix this by introducing a custom function to
> use when running as Xen dom0 in order to check whether a processor
> object matches a CPU that's online.
> 
> Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
>  arch/x86/xen/enlighten.c              | 27 +++++++++++++++++++++++++++
>  drivers/acpi/processor_pdc.c          | 11 +++++++++++
>  3 files changed, 48 insertions(+)
> 
> diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
> index 16f548a661cf..b9f512138043 100644
> --- a/arch/x86/include/asm/xen/hypervisor.h
> +++ b/arch/x86/include/asm/xen/hypervisor.h
> @@ -61,4 +61,14 @@ void __init xen_pvh_init(struct boot_params *boot_params);
>  void __init mem_map_via_hcall(struct boot_params *boot_params_p);
>  #endif
>  
> +#ifdef CONFIG_XEN_DOM0
> +bool __init xen_processor_present(uint32_t acpi_id);
> +#else
> +static inline bool xen_processor_present(uint32_t acpi_id)
> +{
> +	BUG();
> +	return false;
> +}
> +#endif
> +
>  #endif /* _ASM_X86_XEN_HYPERVISOR_H */
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index b8db2148c07d..d4c44361a26c 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -346,3 +346,30 @@ void xen_arch_unregister_cpu(int num)
>  }
>  EXPORT_SYMBOL(xen_arch_unregister_cpu);
>  #endif
> +
> +#ifdef CONFIG_XEN_DOM0
> +bool __init xen_processor_present(uint32_t acpi_id)
> +{
> +	unsigned int i, maxid;
> +	struct xen_platform_op op = {
> +		.cmd = XENPF_get_cpuinfo,
> +		.interface_version = XENPF_INTERFACE_VERSION,
> +	};
> +	int ret = HYPERVISOR_platform_op(&op);
> +
> +	if (ret)
> +		return false;
> +
> +	maxid = op.u.pcpu_info.max_present;
> +	for (i = 0; i <= maxid; i++) {
> +		op.u.pcpu_info.xen_cpuid = i;
> +		ret = HYPERVISOR_platform_op(&op);
> +		if (ret)
> +			continue;
> +		if (op.u.pcpu_info.acpi_id == acpi_id)
> +			return op.u.pcpu_info.flags & XEN_PCPU_FLAGS_ONLINE;
> +	}
> +
> +	return false;
> +}
> +#endif
> diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
> index 8c3f82c9fff3..18fb04523f93 100644
> --- a/drivers/acpi/processor_pdc.c
> +++ b/drivers/acpi/processor_pdc.c
> @@ -14,6 +14,8 @@
>  #include <linux/acpi.h>
>  #include <acpi/processor.h>
>  
> +#include <xen/xen.h>
> +
>  #include "internal.h"
>  
>  static bool __init processor_physically_present(acpi_handle handle)
> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
>  		return false;
>  	}
>  
> +	if (xen_initial_domain())
> +		/*
> +		 * When running as a Xen dom0 the number of processors Linux
> +		 * sees can be different from the real number of processors on
> +		 * the system, and we still need to execute _PDC for all of
> +		 * them.
> +		 */
> +		return xen_processor_present(acpi_id);
> +
>  	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
>  	cpuid = acpi_get_cpuid(handle, type, acpi_id);
>  
> -- 
> 2.37.3
>
Dave Hansen Nov. 29, 2022, 5:43 p.m. UTC | #6
On 11/21/22 02:21, Roger Pau Monne wrote:
> When running as a Xen dom0 the number of CPUs available to Linux can
> be different from the number of CPUs present on the system, but in
> order to properly fetch processor performance related data _PDC must
> be executed on all the physical CPUs online on the system.

How is the number of CPUs available to Linux different?

Is this a result of the ACPI tables that dom0 sees being "wrong"?

> The current checks in processor_physically_present() result in some
> processor objects not getting their _PDC methods evaluated when Linux
> is running as Xen dom0.  Fix this by introducing a custom function to
> use when running as Xen dom0 in order to check whether a processor
> object matches a CPU that's online.

What is the end user visible effect of this problem and of the solution?
Roger Pau Monné Nov. 30, 2022, 3:53 p.m. UTC | #7
On Tue, Nov 29, 2022 at 09:43:53AM -0800, Dave Hansen wrote:
> On 11/21/22 02:21, Roger Pau Monne wrote:
> > When running as a Xen dom0 the number of CPUs available to Linux can
> > be different from the number of CPUs present on the system, but in
> > order to properly fetch processor performance related data _PDC must
> > be executed on all the physical CPUs online on the system.
> 
> How is the number of CPUs available to Linux different?
> 
> Is this a result of the ACPI tables that dom0 sees being "wrong"?

Depends on the mode.  This is all specific to Linux running as a Xen
dom0.

For PV dom0 the ACPI tables that dom0 sees are the native ones,
however available CPUs are not detected based on the MADT, but using
hypercalls, see xen_smp_ops struct and the
x86_init.mpparse.get_smp_config hook used in smp_pv.c
(_get_smp_config()).

For a PVH dom0 Xen provides dom0 with a crafted MADT table that does
only contain the CPUs available to dom0, and hence is likely different
from the native one present on the hardware.

In any case, the dynamic tables dom0 sees where the Processor
objects/devices reside are not modified by Xen in any way, so the ACPI
Processors are always exposed to dom0 as present on the native
tables.

Xen cannot parse the dynamic ACPI tables (neither should it, since
then it would act as OSPM), so it relies on dom0 to provide same data
present on those tables for Xen to properly manage the frequency and
idle states of the CPUs on the system.

> > The current checks in processor_physically_present() result in some
> > processor objects not getting their _PDC methods evaluated when Linux
> > is running as Xen dom0.  Fix this by introducing a custom function to
> > use when running as Xen dom0 in order to check whether a processor
> > object matches a CPU that's online.
> 
> What is the end user visible effect of this problem and of the solution?

Without this fix _PDC is only evaluated for the CPUs online from dom0
point of view, which means that if dom0 is limited to 8 CPUs but the
system has 24 CPUs, _PDC will only get evaluated for 8 CPUs, and that
can have the side effect of the data then returned by _PSD method or
other methods being different between CPUs where _PDC was evaluated vs
CPUs where the method wasn't evaluated.  Such mismatches can
ultimately lead to for example the CPU frequency driver in Xen not
initializing properly because the coordination methods between CPUs on
the same domain don't match.

Also not evaluating _PDC prevents the OS (or Xen in this case)
from notifying ACPI of the features it supports.

IOW this fix attempts to make sure all physically online CPUs get _PDC
evaluated, and in order to to that we need to ask the hypervisor if a
Processor ACPI ID matches an online CPU or not, because Linux doesn't
have that information when running as dom0.

Hope the above makes sense and allows to make some progress on the
issue, sometimes it's hard to summarize without getting too
specific,

Thanks, Roger.
Dave Hansen Nov. 30, 2022, 4:48 p.m. UTC | #8
On 11/30/22 07:53, Roger Pau Monné wrote:
> On Tue, Nov 29, 2022 at 09:43:53AM -0800, Dave Hansen wrote:
>> On 11/21/22 02:21, Roger Pau Monne wrote:
>>> When running as a Xen dom0 the number of CPUs available to Linux can
>>> be different from the number of CPUs present on the system, but in
>>> order to properly fetch processor performance related data _PDC must
>>> be executed on all the physical CPUs online on the system.
>>
>> How is the number of CPUs available to Linux different?
>>
>> Is this a result of the ACPI tables that dom0 sees being "wrong"?
> 
> Depends on the mode.  This is all specific to Linux running as a Xen
> dom0.
> 
> For PV dom0 the ACPI tables that dom0 sees are the native ones,
> however available CPUs are not detected based on the MADT, but using
> hypercalls, see xen_smp_ops struct and the
> x86_init.mpparse.get_smp_config hook used in smp_pv.c
> (_get_smp_config()).
> 
> For a PVH dom0 Xen provides dom0 with a crafted MADT table that does
> only contain the CPUs available to dom0, and hence is likely different
> from the native one present on the hardware.
> 
> In any case, the dynamic tables dom0 sees where the Processor
> objects/devices reside are not modified by Xen in any way, so the ACPI
> Processors are always exposed to dom0 as present on the native
> tables.
> 
> Xen cannot parse the dynamic ACPI tables (neither should it, since
> then it would act as OSPM), so it relies on dom0 to provide same data
> present on those tables for Xen to properly manage the frequency and
> idle states of the CPUs on the system.
> 
>>> The current checks in processor_physically_present() result in some
>>> processor objects not getting their _PDC methods evaluated when Linux
>>> is running as Xen dom0.  Fix this by introducing a custom function to
>>> use when running as Xen dom0 in order to check whether a processor
>>> object matches a CPU that's online.
>>
>> What is the end user visible effect of this problem and of the solution?
> 
> Without this fix _PDC is only evaluated for the CPUs online from dom0
> point of view, which means that if dom0 is limited to 8 CPUs but the
> system has 24 CPUs, _PDC will only get evaluated for 8 CPUs, and that
> can have the side effect of the data then returned by _PSD method or
> other methods being different between CPUs where _PDC was evaluated vs
> CPUs where the method wasn't evaluated.  Such mismatches can
> ultimately lead to for example the CPU frequency driver in Xen not
> initializing properly because the coordination methods between CPUs on
> the same domain don't match.
> 
> Also not evaluating _PDC prevents the OS (or Xen in this case)
> from notifying ACPI of the features it supports.
> 
> IOW this fix attempts to make sure all physically online CPUs get _PDC
> evaluated, and in order to to that we need to ask the hypervisor if a
> Processor ACPI ID matches an online CPU or not, because Linux doesn't
> have that information when running as dom0.
> 
> Hope the above makes sense and allows to make some progress on the
> issue, sometimes it's hard to summarize without getting too
> specific,

Yes, writing changelogs is hard. :)

Let's try though.  I was missing some key pieces of background here.
Believe it or not, I had no idea off the top of my head what _PDC was or
why it's important.

the information about _PDC being required on all processors was missing,
as was the information about the dom0's incomplete concept of the
available physical processors.

== Background ==

In ACPI systems, the OS can direct power management, as opposed to the
firmware.  This OS-directed Power Management is called OSPM.  Part of
telling the firmware that the OS going to direct power management is
making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
calls must be made on every processor.  If these _PDC calls are not
completed on every processor it can lead to inconsistency and later
failures in things like the CPU frequency driver.

In a Xen system, the dom0 kernel is responsible for system-wide power
management.  The dom0 kernel is in charge of OSPM.  However, the Xen
hypervisor hides some processors information from the dom0 kernel.  This
is presumably done to ensure that the dom0 system has less interference
with guests that want to use the other processors.

== Problem ==

But, this leads to a problem: the dom0 kernel needs to run _PDC on all
the processors, but it can't always see them.

== Solution ==

In dom0 kernels, ignore the existing ACPI method for determining if a
processor is physically present because it might not be accurate.
Instead, ask the hypervisor for this information.

This ensures that ...

----

Is that about right?
Roger Pau Monné Dec. 2, 2022, 12:24 p.m. UTC | #9
On Wed, Nov 30, 2022 at 08:48:14AM -0800, Dave Hansen wrote:
> On 11/30/22 07:53, Roger Pau Monné wrote:
> > On Tue, Nov 29, 2022 at 09:43:53AM -0800, Dave Hansen wrote:
> >> On 11/21/22 02:21, Roger Pau Monne wrote:
> >>> When running as a Xen dom0 the number of CPUs available to Linux can
> >>> be different from the number of CPUs present on the system, but in
> >>> order to properly fetch processor performance related data _PDC must
> >>> be executed on all the physical CPUs online on the system.
> >>
> >> How is the number of CPUs available to Linux different?
> >>
> >> Is this a result of the ACPI tables that dom0 sees being "wrong"?
> > 
> > Depends on the mode.  This is all specific to Linux running as a Xen
> > dom0.
> > 
> > For PV dom0 the ACPI tables that dom0 sees are the native ones,
> > however available CPUs are not detected based on the MADT, but using
> > hypercalls, see xen_smp_ops struct and the
> > x86_init.mpparse.get_smp_config hook used in smp_pv.c
> > (_get_smp_config()).
> > 
> > For a PVH dom0 Xen provides dom0 with a crafted MADT table that does
> > only contain the CPUs available to dom0, and hence is likely different
> > from the native one present on the hardware.
> > 
> > In any case, the dynamic tables dom0 sees where the Processor
> > objects/devices reside are not modified by Xen in any way, so the ACPI
> > Processors are always exposed to dom0 as present on the native
> > tables.
> > 
> > Xen cannot parse the dynamic ACPI tables (neither should it, since
> > then it would act as OSPM), so it relies on dom0 to provide same data
> > present on those tables for Xen to properly manage the frequency and
> > idle states of the CPUs on the system.
> > 
> >>> The current checks in processor_physically_present() result in some
> >>> processor objects not getting their _PDC methods evaluated when Linux
> >>> is running as Xen dom0.  Fix this by introducing a custom function to
> >>> use when running as Xen dom0 in order to check whether a processor
> >>> object matches a CPU that's online.
> >>
> >> What is the end user visible effect of this problem and of the solution?
> > 
> > Without this fix _PDC is only evaluated for the CPUs online from dom0
> > point of view, which means that if dom0 is limited to 8 CPUs but the
> > system has 24 CPUs, _PDC will only get evaluated for 8 CPUs, and that
> > can have the side effect of the data then returned by _PSD method or
> > other methods being different between CPUs where _PDC was evaluated vs
> > CPUs where the method wasn't evaluated.  Such mismatches can
> > ultimately lead to for example the CPU frequency driver in Xen not
> > initializing properly because the coordination methods between CPUs on
> > the same domain don't match.
> > 
> > Also not evaluating _PDC prevents the OS (or Xen in this case)
> > from notifying ACPI of the features it supports.
> > 
> > IOW this fix attempts to make sure all physically online CPUs get _PDC
> > evaluated, and in order to to that we need to ask the hypervisor if a
> > Processor ACPI ID matches an online CPU or not, because Linux doesn't
> > have that information when running as dom0.
> > 
> > Hope the above makes sense and allows to make some progress on the
> > issue, sometimes it's hard to summarize without getting too
> > specific,
> 
> Yes, writing changelogs is hard. :)
> 
> Let's try though.  I was missing some key pieces of background here.
> Believe it or not, I had no idea off the top of my head what _PDC was or
> why it's important.
> 
> the information about _PDC being required on all processors was missing,
> as was the information about the dom0's incomplete concept of the
> available physical processors.
> 
> == Background ==
> 
> In ACPI systems, the OS can direct power management, as opposed to the
> firmware.  This OS-directed Power Management is called OSPM.  Part of
> telling the firmware that the OS going to direct power management is
> making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> calls must be made on every processor.  If these _PDC calls are not
> completed on every processor it can lead to inconsistency and later
> failures in things like the CPU frequency driver.

I think the "on every processor" is not fully accurate.  _PDC methods
need to be evaluated for every Processor object.  Whether that
evaluation is executed on the physical processor that matches the ACPI
UID of the object/device is not mandatory (iow: you can evaluate
the _PDC methods of all Processor objects from the BSP).  The usage of
'on' seems to me to note that the methods are executed on the matching
physical processors.

I would instead use: "... must be made for every processor.  If these
_PDC calls are not completed for every processor..."

But I'm not a native English speaker, so this might all be irrelevant.

> 
> In a Xen system, the dom0 kernel is responsible for system-wide power
> management.  The dom0 kernel is in charge of OSPM.  However, the Xen
> hypervisor hides some processors information from the dom0 kernel.  This
> is presumably done to ensure that the dom0 system has less interference
> with guests that want to use the other processors.

dom0 on a Xen system is just another guest, so the admin can limit the
number of CPUs available to dom0, that's why we get into this weird
situation.

> == Problem ==
> 
> But, this leads to a problem: the dom0 kernel needs to run _PDC on all
> the processors, but it can't always see them.
> 
> == Solution ==
> 
> In dom0 kernels, ignore the existing ACPI method for determining if a
> processor is physically present because it might not be accurate.
> Instead, ask the hypervisor for this information.
> 
> This ensures that ...
> 
> ----
> 
> Is that about right?

Yes, I think it's accurate.  I will add to my commit log, thanks!

On the implementation side, is the proposed approach acceptable?
Mostly asking because it adds Xen conditionals to otherwise generic
ACPI code.

Thanks, Roger.
Dave Hansen Dec. 2, 2022, 4:17 p.m. UTC | #10
On 12/2/22 04:24, Roger Pau Monné wrote:
> On the implementation side, is the proposed approach acceptable?
> Mostly asking because it adds Xen conditionals to otherwise generic
> ACPI code.

That's a good Rafael question.

But, how do other places in the ACPI code handle things like this?
Roger Pau Monné Dec. 2, 2022, 4:37 p.m. UTC | #11
On Fri, Dec 02, 2022 at 08:17:56AM -0800, Dave Hansen wrote:
> On 12/2/22 04:24, Roger Pau Monné wrote:
> > On the implementation side, is the proposed approach acceptable?
> > Mostly asking because it adds Xen conditionals to otherwise generic
> > ACPI code.
> 
> That's a good Rafael question.
> 
> But, how do other places in the ACPI code handle things like this?

Hm, I don't know of other places in the Xen case, the only resource
in ACPI AML tables managed by Xen are Processor objects/devices AFAIK.
The rest of devices are fully managed by the dom0 guest.

I think such special handling is very specific to Xen, but maybe I'm
wrong and there are similar existing cases in ACPI code already.

We could add some kind of hook (iow: a function pointer in some struct
that could be filled on a implementation basis?) but I didn't want
overengineering this if adding a conditional was deemed OK.

Thanks, Roger.
Rafael J. Wysocki Dec. 2, 2022, 5:06 p.m. UTC | #12
On Fri, Dec 2, 2022 at 5:37 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Fri, Dec 02, 2022 at 08:17:56AM -0800, Dave Hansen wrote:
> > On 12/2/22 04:24, Roger Pau Monné wrote:
> > > On the implementation side, is the proposed approach acceptable?
> > > Mostly asking because it adds Xen conditionals to otherwise generic
> > > ACPI code.
> >
> > That's a good Rafael question.

Sorry for joining late, but first off _PDC has been deprecated since
ACPI 3.0 (2004) and it is not even present in ACPI 6.5 any more.

It follows from your description that _PDC is still used in the field,
though, after 18 years of deprecation.  Who uses it, if I may know?

> > But, how do other places in the ACPI code handle things like this?
>
> Hm, I don't know of other places in the Xen case, the only resource
> in ACPI AML tables managed by Xen are Processor objects/devices AFAIK.
> The rest of devices are fully managed by the dom0 guest.
>
> I think such special handling is very specific to Xen, but maybe I'm
> wrong and there are similar existing cases in ACPI code already.
>
> We could add some kind of hook (iow: a function pointer in some struct
> that could be filled on a implementation basis?) but I didn't want
> overengineering this if adding a conditional was deemed OK.

What _PDC capabilities specifically do you need to pass to the
firmware for things to work correctly?

What platforms are affected?
Roger Pau Monné Dec. 9, 2022, 10:12 a.m. UTC | #13
On Fri, Dec 02, 2022 at 06:06:26PM +0100, Rafael J. Wysocki wrote:
> On Fri, Dec 2, 2022 at 5:37 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Fri, Dec 02, 2022 at 08:17:56AM -0800, Dave Hansen wrote:
> > > On 12/2/22 04:24, Roger Pau Monné wrote:
> > > > On the implementation side, is the proposed approach acceptable?
> > > > Mostly asking because it adds Xen conditionals to otherwise generic
> > > > ACPI code.
> > >
> > > That's a good Rafael question.
> 
> Sorry for joining late, but first off _PDC has been deprecated since
> ACPI 3.0 (2004) and it is not even present in ACPI 6.5 any more.
> 
> It follows from your description that _PDC is still used in the field,
> though, after 18 years of deprecation.  Who uses it, if I may know?

I saw this issue on a Sapphire Rapids SDP from Intel, but I would
think there are other platforms affected.

> > > But, how do other places in the ACPI code handle things like this?
> >
> > Hm, I don't know of other places in the Xen case, the only resource
> > in ACPI AML tables managed by Xen are Processor objects/devices AFAIK.
> > The rest of devices are fully managed by the dom0 guest.
> >
> > I think such special handling is very specific to Xen, but maybe I'm
> > wrong and there are similar existing cases in ACPI code already.
> >
> > We could add some kind of hook (iow: a function pointer in some struct
> > that could be filled on a implementation basis?) but I didn't want
> > overengineering this if adding a conditional was deemed OK.
> 
> What _PDC capabilities specifically do you need to pass to the
> firmware for things to work correctly?

I'm not sure what capabilities do I need to pass explicitly to _PSD,
but the call to _PDC as done by Linux currently changes the reported
_PSD Coordination Field (vs not doing the call).

Thanks, Roger.
Josef Johansson Jan. 30, 2023, 9:21 a.m. UTC | #14
On 11/21/22 11:21, Roger Pau Monne wrote:
> When running as a Xen dom0 the number of CPUs available to Linux can
> be different from the number of CPUs present on the system, but in
> order to properly fetch processor performance related data _PDC must
> be executed on all the physical CPUs online on the system.
>
> The current checks in processor_physically_present() result in some
> processor objects not getting their _PDC methods evaluated when Linux
> is running as Xen dom0.  Fix this by introducing a custom function to
> use when running as Xen dom0 in order to check whether a processor
> object matches a CPU that's online.
>
> Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>   arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
>   arch/x86/xen/enlighten.c              | 27 +++++++++++++++++++++++++++
>   drivers/acpi/processor_pdc.c          | 11 +++++++++++
>   3 files changed, 48 insertions(+)
>
> diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
> index 16f548a661cf..b9f512138043 100644
> --- a/arch/x86/include/asm/xen/hypervisor.h
> +++ b/arch/x86/include/asm/xen/hypervisor.h
> @@ -61,4 +61,14 @@ void __init xen_pvh_init(struct boot_params *boot_params);
>   void __init mem_map_via_hcall(struct boot_params *boot_params_p);
>   #endif
>   
> +#ifdef CONFIG_XEN_DOM0
> +bool __init xen_processor_present(uint32_t acpi_id);
> +#else
> +static inline bool xen_processor_present(uint32_t acpi_id)
> +{
> +	BUG();
> +	return false;
> +}
> +#endif
> +
>   #endif /* _ASM_X86_XEN_HYPERVISOR_H */
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index b8db2148c07d..d4c44361a26c 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -346,3 +346,30 @@ void xen_arch_unregister_cpu(int num)
>   }
>   EXPORT_SYMBOL(xen_arch_unregister_cpu);
>   #endif
> +
> +#ifdef CONFIG_XEN_DOM0
> +bool __init xen_processor_present(uint32_t acpi_id)
> +{
> +	unsigned int i, maxid;
> +	struct xen_platform_op op = {
> +		.cmd = XENPF_get_cpuinfo,
> +		.interface_version = XENPF_INTERFACE_VERSION,
> +	};
> +	int ret = HYPERVISOR_platform_op(&op);
> +
> +	if (ret)
> +		return false;
> +
> +	maxid = op.u.pcpu_info.max_present;
> +	for (i = 0; i <= maxid; i++) {
> +		op.u.pcpu_info.xen_cpuid = i;
> +		ret = HYPERVISOR_platform_op(&op);
> +		if (ret)
> +			continue;
> +		if (op.u.pcpu_info.acpi_id == acpi_id)
> +			return op.u.pcpu_info.flags & XEN_PCPU_FLAGS_ONLINE;
> +	}
> +
> +	return false;
> +}
My compiler (Default GCC on Fedora 32, compiling for Qubes) complain 
loudly that the below was missing.

+}
+EXPORT_SYMBOL(xen_processor_present);

`ERROR: MODPOST xen_processor_present 
[drivers/xen/xen-acpi-processor.ko] undefined!`

Same thing with xen_sanitize_pdc in the next patch.

+}
+EXPORT_SYMBOL(xen_sanitize_pdc);

Everything compiled fine after those changes.
> +#endif
> diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
> index 8c3f82c9fff3..18fb04523f93 100644
> --- a/drivers/acpi/processor_pdc.c
> +++ b/drivers/acpi/processor_pdc.c
> @@ -14,6 +14,8 @@
>   #include <linux/acpi.h>
>   #include <acpi/processor.h>
>   
> +#include <xen/xen.h>
> +
>   #include "internal.h"
>   
>   static bool __init processor_physically_present(acpi_handle handle)
> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
>   		return false;
>   	}
>   
> +	if (xen_initial_domain())
> +		/*
> +		 * When running as a Xen dom0 the number of processors Linux
> +		 * sees can be different from the real number of processors on
> +		 * the system, and we still need to execute _PDC for all of
> +		 * them.
> +		 */
> +		return xen_processor_present(acpi_id);
> +
>   	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
>   	cpuid = acpi_get_cpuid(handle, type, acpi_id);
>
Jan Beulich Feb. 3, 2023, 7:05 a.m. UTC | #15
On 30.01.2023 10:21, Josef Johansson wrote:
> On 11/21/22 11:21, Roger Pau Monne wrote:
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -346,3 +346,30 @@ void xen_arch_unregister_cpu(int num)
>>   }
>>   EXPORT_SYMBOL(xen_arch_unregister_cpu);
>>   #endif
>> +
>> +#ifdef CONFIG_XEN_DOM0
>> +bool __init xen_processor_present(uint32_t acpi_id)
>> +{
>> +	unsigned int i, maxid;
>> +	struct xen_platform_op op = {
>> +		.cmd = XENPF_get_cpuinfo,
>> +		.interface_version = XENPF_INTERFACE_VERSION,
>> +	};
>> +	int ret = HYPERVISOR_platform_op(&op);
>> +
>> +	if (ret)
>> +		return false;
>> +
>> +	maxid = op.u.pcpu_info.max_present;
>> +	for (i = 0; i <= maxid; i++) {
>> +		op.u.pcpu_info.xen_cpuid = i;
>> +		ret = HYPERVISOR_platform_op(&op);
>> +		if (ret)
>> +			continue;
>> +		if (op.u.pcpu_info.acpi_id == acpi_id)
>> +			return op.u.pcpu_info.flags & XEN_PCPU_FLAGS_ONLINE;
>> +	}
>> +
>> +	return false;
>> +}
> My compiler (Default GCC on Fedora 32, compiling for Qubes) complain 
> loudly that the below was missing.
> 
> +}
> +EXPORT_SYMBOL(xen_processor_present);
> 
> `ERROR: MODPOST xen_processor_present 
> [drivers/xen/xen-acpi-processor.ko] undefined!`
> 
> Same thing with xen_sanitize_pdc in the next patch.
> 
> +}
> +EXPORT_SYMBOL(xen_sanitize_pdc);
> 
> Everything compiled fine after those changes.

Except that you may not export __init symbols. The section mismatch checker
should actually complain about that.

Jan
Josef Johansson Feb. 3, 2023, 1:58 p.m. UTC | #16
On 2/3/23 08:05, Jan Beulich wrote:
> On 30.01.2023 10:21, Josef Johansson wrote:
>> On 11/21/22 11:21, Roger Pau Monne wrote:
>>> --- a/arch/x86/xen/enlighten.c
>>> +++ b/arch/x86/xen/enlighten.c
>>> @@ -346,3 +346,30 @@ void xen_arch_unregister_cpu(int num)
>>>    }
>>>    EXPORT_SYMBOL(xen_arch_unregister_cpu);
>>>    #endif
>>> +
>>> +#ifdef CONFIG_XEN_DOM0
>>> +bool __init xen_processor_present(uint32_t acpi_id)
>>> +{
>>> +	unsigned int i, maxid;
>>> +	struct xen_platform_op op = {
>>> +		.cmd = XENPF_get_cpuinfo,
>>> +		.interface_version = XENPF_INTERFACE_VERSION,
>>> +	};
>>> +	int ret = HYPERVISOR_platform_op(&op);
>>> +
>>> +	if (ret)
>>> +		return false;
>>> +
>>> +	maxid = op.u.pcpu_info.max_present;
>>> +	for (i = 0; i <= maxid; i++) {
>>> +		op.u.pcpu_info.xen_cpuid = i;
>>> +		ret = HYPERVISOR_platform_op(&op);
>>> +		if (ret)
>>> +			continue;
>>> +		if (op.u.pcpu_info.acpi_id == acpi_id)
>>> +			return op.u.pcpu_info.flags & XEN_PCPU_FLAGS_ONLINE;
>>> +	}
>>> +
>>> +	return false;
>>> +}
>> My compiler (Default GCC on Fedora 32, compiling for Qubes) complain
>> loudly that the below was missing.
>>
>> +}
>> +EXPORT_SYMBOL(xen_processor_present);
>>
>> `ERROR: MODPOST xen_processor_present
>> [drivers/xen/xen-acpi-processor.ko] undefined!`
>>
>> Same thing with xen_sanitize_pdc in the next patch.
>>
>> +}
>> +EXPORT_SYMBOL(xen_sanitize_pdc);
>>
>> Everything compiled fine after those changes.
> Except that you may not export __init symbols. The section mismatch checker
> should actually complain about that.
>
> Jan

That makes sense. Patch 3 does change it from an __init though.

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 394dd6675113..a7b41103d3e5 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -348,7 +348,7 @@ EXPORT_SYMBOL(xen_arch_unregister_cpu);
  #endif

  #ifdef CONFIG_XEN_DOM0
-bool __init xen_processor_present(uint32_t acpi_id)
+bool xen_processor_present(uint32_t acpi_id)
  {


So the change should be in Patch 3 I guess.

Regards
- Josef
diff mbox series

Patch

diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
index 16f548a661cf..b9f512138043 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -61,4 +61,14 @@  void __init xen_pvh_init(struct boot_params *boot_params);
 void __init mem_map_via_hcall(struct boot_params *boot_params_p);
 #endif
 
+#ifdef CONFIG_XEN_DOM0
+bool __init xen_processor_present(uint32_t acpi_id);
+#else
+static inline bool xen_processor_present(uint32_t acpi_id)
+{
+	BUG();
+	return false;
+}
+#endif
+
 #endif /* _ASM_X86_XEN_HYPERVISOR_H */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index b8db2148c07d..d4c44361a26c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -346,3 +346,30 @@  void xen_arch_unregister_cpu(int num)
 }
 EXPORT_SYMBOL(xen_arch_unregister_cpu);
 #endif
+
+#ifdef CONFIG_XEN_DOM0
+bool __init xen_processor_present(uint32_t acpi_id)
+{
+	unsigned int i, maxid;
+	struct xen_platform_op op = {
+		.cmd = XENPF_get_cpuinfo,
+		.interface_version = XENPF_INTERFACE_VERSION,
+	};
+	int ret = HYPERVISOR_platform_op(&op);
+
+	if (ret)
+		return false;
+
+	maxid = op.u.pcpu_info.max_present;
+	for (i = 0; i <= maxid; i++) {
+		op.u.pcpu_info.xen_cpuid = i;
+		ret = HYPERVISOR_platform_op(&op);
+		if (ret)
+			continue;
+		if (op.u.pcpu_info.acpi_id == acpi_id)
+			return op.u.pcpu_info.flags & XEN_PCPU_FLAGS_ONLINE;
+	}
+
+	return false;
+}
+#endif
diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
index 8c3f82c9fff3..18fb04523f93 100644
--- a/drivers/acpi/processor_pdc.c
+++ b/drivers/acpi/processor_pdc.c
@@ -14,6 +14,8 @@ 
 #include <linux/acpi.h>
 #include <acpi/processor.h>
 
+#include <xen/xen.h>
+
 #include "internal.h"
 
 static bool __init processor_physically_present(acpi_handle handle)
@@ -47,6 +49,15 @@  static bool __init processor_physically_present(acpi_handle handle)
 		return false;
 	}
 
+	if (xen_initial_domain())
+		/*
+		 * When running as a Xen dom0 the number of processors Linux
+		 * sees can be different from the real number of processors on
+		 * the system, and we still need to execute _PDC for all of
+		 * them.
+		 */
+		return xen_processor_present(acpi_id);
+
 	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
 	cpuid = acpi_get_cpuid(handle, type, acpi_id);