diff mbox series

[v4] acpi/processor: fix evaluating _PDC method when running as Xen dom0

Message ID 20230316164257.42590-1-roger.pau@citrix.com (mailing list archive)
State Superseded, archived
Headers show
Series [v4] acpi/processor: fix evaluating _PDC method when running as Xen dom0 | expand

Commit Message

Roger Pau Monne March 16, 2023, 4:42 p.m. UTC
In ACPI systems, the OS can direct power management, as opposed to the
firmware.  This OS-directed Power Management is called OSPM.  Part of
telling the firmware that the OS going to direct power management is
making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
methods must be evaluated for every processor object.  If these _PDC
calls are not completed for every processor it can lead to
inconsistency and later failures in things like the CPU frequency
driver.

In a Xen system, the dom0 kernel is responsible for system-wide power
management.  The dom0 kernel is in charge of OSPM.  However, the
number of CPUs available to dom0 can be different than the number of
CPUs physically present on the system.

This leads to a problem: the dom0 kernel needs to evaluate _PDC for
all the processors, but it can't always see them.

In dom0 kernels, ignore the existing ACPI method for determining if a
processor is physically present because it might not be accurate.
Instead, ask the hypervisor for this information.

Fix this by introducing a custom function to use when running as Xen
dom0 in order to check whether a processor object matches a CPU that's
online.  Such checking is done using the existing information fetched
by the Xen pCPU subsystem, extending it to also store the ACPI ID.

This ensures that _PDC method gets evaluated for all physically online
CPUs, regardless of the number of CPUs made available to dom0.

Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Protect xen_processor_present() definition with CONFIG_ACPI.

Changes since v2:
 - Extend and use the existing pcpu functionality.

Changes since v1:
 - Reword commit message.
---
 arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
 drivers/acpi/processor_pdc.c          | 11 +++++++++++
 drivers/xen/pcpu.c                    | 21 +++++++++++++++++++++
 3 files changed, 42 insertions(+)

Comments

Jürgen Groß March 17, 2023, 12:42 p.m. UTC | #1
On 16.03.23 17:42, Roger Pau Monne wrote:
> In ACPI systems, the OS can direct power management, as opposed to the
> firmware.  This OS-directed Power Management is called OSPM.  Part of
> telling the firmware that the OS going to direct power management is
> making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> methods must be evaluated for every processor object.  If these _PDC
> calls are not completed for every processor it can lead to
> inconsistency and later failures in things like the CPU frequency
> driver.
> 
> In a Xen system, the dom0 kernel is responsible for system-wide power
> management.  The dom0 kernel is in charge of OSPM.  However, the
> number of CPUs available to dom0 can be different than the number of
> CPUs physically present on the system.
> 
> This leads to a problem: the dom0 kernel needs to evaluate _PDC for
> all the processors, but it can't always see them.
> 
> In dom0 kernels, ignore the existing ACPI method for determining if a
> processor is physically present because it might not be accurate.
> Instead, ask the hypervisor for this information.
> 
> Fix this by introducing a custom function to use when running as Xen
> dom0 in order to check whether a processor object matches a CPU that's
> online.  Such checking is done using the existing information fetched
> by the Xen pCPU subsystem, extending it to also store the ACPI ID.
> 
> This ensures that _PDC method gets evaluated for all physically online
> CPUs, regardless of the number of CPUs made available to dom0.
> 
> Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen
Rafael J. Wysocki March 20, 2023, 5:59 p.m. UTC | #2
On Fri, Mar 17, 2023 at 1:42 PM Juergen Gross <jgross@suse.com> wrote:
>
> On 16.03.23 17:42, Roger Pau Monne wrote:
> > In ACPI systems, the OS can direct power management, as opposed to the
> > firmware.  This OS-directed Power Management is called OSPM.  Part of
> > telling the firmware that the OS going to direct power management is
> > making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> > methods must be evaluated for every processor object.  If these _PDC
> > calls are not completed for every processor it can lead to
> > inconsistency and later failures in things like the CPU frequency
> > driver.
> >
> > In a Xen system, the dom0 kernel is responsible for system-wide power
> > management.  The dom0 kernel is in charge of OSPM.  However, the
> > number of CPUs available to dom0 can be different than the number of
> > CPUs physically present on the system.
> >
> > This leads to a problem: the dom0 kernel needs to evaluate _PDC for
> > all the processors, but it can't always see them.
> >
> > In dom0 kernels, ignore the existing ACPI method for determining if a
> > processor is physically present because it might not be accurate.
> > Instead, ask the hypervisor for this information.
> >
> > Fix this by introducing a custom function to use when running as Xen
> > dom0 in order to check whether a processor object matches a CPU that's
> > online.  Such checking is done using the existing information fetched
> > by the Xen pCPU subsystem, extending it to also store the ACPI ID.
> >
> > This ensures that _PDC method gets evaluated for all physically online
> > CPUs, regardless of the number of CPUs made available to dom0.
> >
> > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Juergen Gross <jgross@suse.com>

Applied as 6.4 material under a slightly edited subject, thanks!
Rafael J. Wysocki March 21, 2023, 1:47 p.m. UTC | #3
On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote:
>
> In ACPI systems, the OS can direct power management, as opposed to the
> firmware.  This OS-directed Power Management is called OSPM.  Part of
> telling the firmware that the OS going to direct power management is
> making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> methods must be evaluated for every processor object.  If these _PDC
> calls are not completed for every processor it can lead to
> inconsistency and later failures in things like the CPU frequency
> driver.
>
> In a Xen system, the dom0 kernel is responsible for system-wide power
> management.  The dom0 kernel is in charge of OSPM.  However, the
> number of CPUs available to dom0 can be different than the number of
> CPUs physically present on the system.
>
> This leads to a problem: the dom0 kernel needs to evaluate _PDC for
> all the processors, but it can't always see them.
>
> In dom0 kernels, ignore the existing ACPI method for determining if a
> processor is physically present because it might not be accurate.
> Instead, ask the hypervisor for this information.
>
> Fix this by introducing a custom function to use when running as Xen
> dom0 in order to check whether a processor object matches a CPU that's
> online.  Such checking is done using the existing information fetched
> by the Xen pCPU subsystem, extending it to also store the ACPI ID.
>
> This ensures that _PDC method gets evaluated for all physically online
> CPUs, regardless of the number of CPUs made available to dom0.
>
> Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v3:
>  - Protect xen_processor_present() definition with CONFIG_ACPI.
>
> Changes since v2:
>  - Extend and use the existing pcpu functionality.
>
> Changes since v1:
>  - Reword commit message.
> ---
>  arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
>  drivers/acpi/processor_pdc.c          | 11 +++++++++++
>  drivers/xen/pcpu.c                    | 21 +++++++++++++++++++++
>  3 files changed, 42 insertions(+)
>
> diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
> index 5fc35f889cd1..990a1609677e 100644
> --- a/arch/x86/include/asm/xen/hypervisor.h
> +++ b/arch/x86/include/asm/xen/hypervisor.h
> @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params);
>  void __init mem_map_via_hcall(struct boot_params *boot_params_p);
>  #endif
>
> +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI)
> +bool __init xen_processor_present(uint32_t acpi_id);
> +#else
> +static inline bool xen_processor_present(uint32_t acpi_id)
> +{
> +       BUG();
> +       return false;
> +}
> +#endif
> +
>  #endif /* _ASM_X86_XEN_HYPERVISOR_H */
> diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
> index 8c3f82c9fff3..18fb04523f93 100644
> --- a/drivers/acpi/processor_pdc.c
> +++ b/drivers/acpi/processor_pdc.c
> @@ -14,6 +14,8 @@
>  #include <linux/acpi.h>
>  #include <acpi/processor.h>
>
> +#include <xen/xen.h>

This along with the definition above is evidently insufficient for
xen_processor_present() to always be defined.  See
https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u
for example.

I'm dropping the patch now, please fix and resend.

> +
>  #include "internal.h"
>
>  static bool __init processor_physically_present(acpi_handle handle)
> @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle)
>                 return false;
>         }
>
> +       if (xen_initial_domain())
> +               /*
> +                * When running as a Xen dom0 the number of processors Linux
> +                * sees can be different from the real number of processors on
> +                * the system, and we still need to execute _PDC for all of
> +                * them.
> +                */
> +               return xen_processor_present(acpi_id);
> +
>         type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
>         cpuid = acpi_get_cpuid(handle, type, acpi_id);
>
> diff --git a/drivers/xen/pcpu.c b/drivers/xen/pcpu.c
> index fd3a644b0855..034d05e56507 100644
> --- a/drivers/xen/pcpu.c
> +++ b/drivers/xen/pcpu.c
> @@ -58,6 +58,7 @@ struct pcpu {
>         struct list_head list;
>         struct device dev;
>         uint32_t cpu_id;
> +       uint32_t acpi_id;
>         uint32_t flags;
>  };
>
> @@ -249,6 +250,7 @@ static struct pcpu *create_and_register_pcpu(struct xenpf_pcpuinfo *info)
>
>         INIT_LIST_HEAD(&pcpu->list);
>         pcpu->cpu_id = info->xen_cpuid;
> +       pcpu->acpi_id = info->acpi_id;
>         pcpu->flags = info->flags;
>
>         /* Need hold on xen_pcpu_lock before pcpu list manipulations */
> @@ -381,3 +383,22 @@ static int __init xen_pcpu_init(void)
>         return ret;
>  }
>  arch_initcall(xen_pcpu_init);
> +
> +#ifdef CONFIG_ACPI
> +bool __init xen_processor_present(uint32_t acpi_id)
> +{
> +       struct pcpu *pcpu;
> +       bool online = false;
> +
> +       mutex_lock(&xen_pcpu_lock);
> +       list_for_each_entry(pcpu, &xen_pcpus, list)
> +               if (pcpu->acpi_id == acpi_id) {
> +                       online = pcpu->flags & XEN_PCPU_FLAGS_ONLINE;
> +                       break;
> +               }
> +
> +       mutex_unlock(&xen_pcpu_lock);
> +
> +       return online;
> +}
> +#endif
> --
> 2.39.0
>
Roger Pau Monne March 21, 2023, 2:02 p.m. UTC | #4
On Tue, Mar 21, 2023 at 02:47:46PM +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote:
> >
> > In ACPI systems, the OS can direct power management, as opposed to the
> > firmware.  This OS-directed Power Management is called OSPM.  Part of
> > telling the firmware that the OS going to direct power management is
> > making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> > methods must be evaluated for every processor object.  If these _PDC
> > calls are not completed for every processor it can lead to
> > inconsistency and later failures in things like the CPU frequency
> > driver.
> >
> > In a Xen system, the dom0 kernel is responsible for system-wide power
> > management.  The dom0 kernel is in charge of OSPM.  However, the
> > number of CPUs available to dom0 can be different than the number of
> > CPUs physically present on the system.
> >
> > This leads to a problem: the dom0 kernel needs to evaluate _PDC for
> > all the processors, but it can't always see them.
> >
> > In dom0 kernels, ignore the existing ACPI method for determining if a
> > processor is physically present because it might not be accurate.
> > Instead, ask the hypervisor for this information.
> >
> > Fix this by introducing a custom function to use when running as Xen
> > dom0 in order to check whether a processor object matches a CPU that's
> > online.  Such checking is done using the existing information fetched
> > by the Xen pCPU subsystem, extending it to also store the ACPI ID.
> >
> > This ensures that _PDC method gets evaluated for all physically online
> > CPUs, regardless of the number of CPUs made available to dom0.
> >
> > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v3:
> >  - Protect xen_processor_present() definition with CONFIG_ACPI.
> >
> > Changes since v2:
> >  - Extend and use the existing pcpu functionality.
> >
> > Changes since v1:
> >  - Reword commit message.
> > ---
> >  arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
> >  drivers/acpi/processor_pdc.c          | 11 +++++++++++
> >  drivers/xen/pcpu.c                    | 21 +++++++++++++++++++++
> >  3 files changed, 42 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
> > index 5fc35f889cd1..990a1609677e 100644
> > --- a/arch/x86/include/asm/xen/hypervisor.h
> > +++ b/arch/x86/include/asm/xen/hypervisor.h
> > @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params);
> >  void __init mem_map_via_hcall(struct boot_params *boot_params_p);
> >  #endif
> >
> > +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI)
> > +bool __init xen_processor_present(uint32_t acpi_id);
> > +#else
> > +static inline bool xen_processor_present(uint32_t acpi_id)
> > +{
> > +       BUG();
> > +       return false;
> > +}
> > +#endif
> > +
> >  #endif /* _ASM_X86_XEN_HYPERVISOR_H */
> > diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
> > index 8c3f82c9fff3..18fb04523f93 100644
> > --- a/drivers/acpi/processor_pdc.c
> > +++ b/drivers/acpi/processor_pdc.c
> > @@ -14,6 +14,8 @@
> >  #include <linux/acpi.h>
> >  #include <acpi/processor.h>
> >
> > +#include <xen/xen.h>
> 
> This along with the definition above is evidently insufficient for
> xen_processor_present() to always be defined.  See
> https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u
> for example.
> 
> I'm dropping the patch now, please fix and resend.

Hello,

Sorry.  I've sent a followup fix:

https://lore.kernel.org/xen-devel/20230321112522.46806-1-roger.pau@citrix.com/T/#u

Would you be fine with taking such followup, or would rather prefer
for me to send the original fixed patch as v5?

Thanks, Roger.
Rafael J. Wysocki March 21, 2023, 2:04 p.m. UTC | #5
On Tue, Mar 21, 2023 at 3:02 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Mar 21, 2023 at 02:47:46PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote:
> > >
> > > In ACPI systems, the OS can direct power management, as opposed to the
> > > firmware.  This OS-directed Power Management is called OSPM.  Part of
> > > telling the firmware that the OS going to direct power management is
> > > making ACPI "_PDC" (Processor Driver Capabilities) calls.  These _PDC
> > > methods must be evaluated for every processor object.  If these _PDC
> > > calls are not completed for every processor it can lead to
> > > inconsistency and later failures in things like the CPU frequency
> > > driver.
> > >
> > > In a Xen system, the dom0 kernel is responsible for system-wide power
> > > management.  The dom0 kernel is in charge of OSPM.  However, the
> > > number of CPUs available to dom0 can be different than the number of
> > > CPUs physically present on the system.
> > >
> > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for
> > > all the processors, but it can't always see them.
> > >
> > > In dom0 kernels, ignore the existing ACPI method for determining if a
> > > processor is physically present because it might not be accurate.
> > > Instead, ask the hypervisor for this information.
> > >
> > > Fix this by introducing a custom function to use when running as Xen
> > > dom0 in order to check whether a processor object matches a CPU that's
> > > online.  Such checking is done using the existing information fetched
> > > by the Xen pCPU subsystem, extending it to also store the ACPI ID.
> > >
> > > This ensures that _PDC method gets evaluated for all physically online
> > > CPUs, regardless of the number of CPUs made available to dom0.
> > >
> > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()')
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > ---
> > > Changes since v3:
> > >  - Protect xen_processor_present() definition with CONFIG_ACPI.
> > >
> > > Changes since v2:
> > >  - Extend and use the existing pcpu functionality.
> > >
> > > Changes since v1:
> > >  - Reword commit message.
> > > ---
> > >  arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++
> > >  drivers/acpi/processor_pdc.c          | 11 +++++++++++
> > >  drivers/xen/pcpu.c                    | 21 +++++++++++++++++++++
> > >  3 files changed, 42 insertions(+)
> > >
> > > diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
> > > index 5fc35f889cd1..990a1609677e 100644
> > > --- a/arch/x86/include/asm/xen/hypervisor.h
> > > +++ b/arch/x86/include/asm/xen/hypervisor.h
> > > @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params);
> > >  void __init mem_map_via_hcall(struct boot_params *boot_params_p);
> > >  #endif
> > >
> > > +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI)
> > > +bool __init xen_processor_present(uint32_t acpi_id);
> > > +#else
> > > +static inline bool xen_processor_present(uint32_t acpi_id)
> > > +{
> > > +       BUG();
> > > +       return false;
> > > +}
> > > +#endif
> > > +
> > >  #endif /* _ASM_X86_XEN_HYPERVISOR_H */
> > > diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
> > > index 8c3f82c9fff3..18fb04523f93 100644
> > > --- a/drivers/acpi/processor_pdc.c
> > > +++ b/drivers/acpi/processor_pdc.c
> > > @@ -14,6 +14,8 @@
> > >  #include <linux/acpi.h>
> > >  #include <acpi/processor.h>
> > >
> > > +#include <xen/xen.h>
> >
> > This along with the definition above is evidently insufficient for
> > xen_processor_present() to always be defined.  See
> > https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u
> > for example.
> >
> > I'm dropping the patch now, please fix and resend.
>
> Hello,
>
> Sorry.  I've sent a followup fix:
>
> https://lore.kernel.org/xen-devel/20230321112522.46806-1-roger.pau@citrix.com/T/#u
>
> Would you be fine with taking such followup, or would rather prefer
> for me to send the original fixed patch as v5?

Please fold the fix into the original patch and resend.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
index 5fc35f889cd1..990a1609677e 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -63,4 +63,14 @@  void __init xen_pvh_init(struct boot_params *boot_params);
 void __init mem_map_via_hcall(struct boot_params *boot_params_p);
 #endif
 
+#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI)
+bool __init xen_processor_present(uint32_t acpi_id);
+#else
+static inline bool xen_processor_present(uint32_t acpi_id)
+{
+	BUG();
+	return false;
+}
+#endif
+
 #endif /* _ASM_X86_XEN_HYPERVISOR_H */
diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
index 8c3f82c9fff3..18fb04523f93 100644
--- a/drivers/acpi/processor_pdc.c
+++ b/drivers/acpi/processor_pdc.c
@@ -14,6 +14,8 @@ 
 #include <linux/acpi.h>
 #include <acpi/processor.h>
 
+#include <xen/xen.h>
+
 #include "internal.h"
 
 static bool __init processor_physically_present(acpi_handle handle)
@@ -47,6 +49,15 @@  static bool __init processor_physically_present(acpi_handle handle)
 		return false;
 	}
 
+	if (xen_initial_domain())
+		/*
+		 * When running as a Xen dom0 the number of processors Linux
+		 * sees can be different from the real number of processors on
+		 * the system, and we still need to execute _PDC for all of
+		 * them.
+		 */
+		return xen_processor_present(acpi_id);
+
 	type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
 	cpuid = acpi_get_cpuid(handle, type, acpi_id);
 
diff --git a/drivers/xen/pcpu.c b/drivers/xen/pcpu.c
index fd3a644b0855..034d05e56507 100644
--- a/drivers/xen/pcpu.c
+++ b/drivers/xen/pcpu.c
@@ -58,6 +58,7 @@  struct pcpu {
 	struct list_head list;
 	struct device dev;
 	uint32_t cpu_id;
+	uint32_t acpi_id;
 	uint32_t flags;
 };
 
@@ -249,6 +250,7 @@  static struct pcpu *create_and_register_pcpu(struct xenpf_pcpuinfo *info)
 
 	INIT_LIST_HEAD(&pcpu->list);
 	pcpu->cpu_id = info->xen_cpuid;
+	pcpu->acpi_id = info->acpi_id;
 	pcpu->flags = info->flags;
 
 	/* Need hold on xen_pcpu_lock before pcpu list manipulations */
@@ -381,3 +383,22 @@  static int __init xen_pcpu_init(void)
 	return ret;
 }
 arch_initcall(xen_pcpu_init);
+
+#ifdef CONFIG_ACPI
+bool __init xen_processor_present(uint32_t acpi_id)
+{
+	struct pcpu *pcpu;
+	bool online = false;
+
+	mutex_lock(&xen_pcpu_lock);
+	list_for_each_entry(pcpu, &xen_pcpus, list)
+		if (pcpu->acpi_id == acpi_id) {
+			online = pcpu->flags & XEN_PCPU_FLAGS_ONLINE;
+			break;
+		}
+
+	mutex_unlock(&xen_pcpu_lock);
+
+	return online;
+}
+#endif