diff mbox series

[2/2] powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs

Message ID 20240719092545.50441-3-Dhananjay.Ugwekar@amd.com (mailing list archive)
State Superseded, archived
Headers show
Series RAPL driver fixes for AMD CPUs | expand

Commit Message

Dhananjay Ugwekar July 19, 2024, 9:25 a.m. UTC
After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
on AMD processors that support extended CPUID leaf 0x80000026, the
topology_logical_die_id() macros, no longer returns package id, instead it
returns the CCD (Core Complex Die) id. This leads to the energy-pkg
event scope to be modified to CCD instead of package.

For more historical context, please refer to commit 32fb480e0a2c
("powercap/intel_rapl: Support multi-die/package"), which initially changed
the RAPL scope from package to die for all systems, as Intel systems
with Die enumeration have RAPL scope as die, and those without die
enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
correctly with topology_logical_die_id() until recently, but this changed
after the "0x80000026 leaf" commit mentioned above.

Replacing topology_logical_die_id() with topology_physical_package_id()
conditionally only for AMD and Hygon fixes the energy-pkg event.

On an AMD 2 socket 8 CCD Zen5 server:

Before:

linux$ ls /sys/class/powercap/
intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
intel-rapl:f

After:

linux$ ls /sys/class/powercap/
intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0

Only one sysfs entry per-event per-package is created after this change.

Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
Reported-by: Michael Larabel <michael@michaellarabel.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
---
 drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

Comments

Zhang, Rui July 21, 2024, 2:17 p.m. UTC | #1
On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026
> leaf"),
> on AMD processors that support extended CPUID leaf 0x80000026, the
> topology_logical_die_id() macros, no longer returns package id,
> instead it
> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
> event scope to be modified to CCD instead of package.
> 
> For more historical context, please refer to commit 32fb480e0a2c
> ("powercap/intel_rapl: Support multi-die/package"), which initially
> changed
> the RAPL scope from package to die for all systems, as Intel systems
> with Die enumeration have RAPL scope as die, and those without die
> enumeration are not affected. So, all systems(Intel, AMD, Hygon),
> worked
> correctly with topology_logical_die_id() until recently, but this
> changed
> after the "0x80000026 leaf" commit mentioned above.
> 
> Replacing topology_logical_die_id() with
> topology_physical_package_id()
> conditionally only for AMD and Hygon fixes the energy-pkg event.
> 
> On an AMD 2 socket 8 CCD Zen5 server:
> 
> Before:
> 
> linux$ ls /sys/class/powercap/
> intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
> intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
> intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
> intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
> intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
> intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
> intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
> intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
> intel-rapl:f
> 
> After:
> 
> linux$ ls /sys/class/powercap/
> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-
> rapl:1:0
> 
> Only one sysfs entry per-event per-package is created after this
> change.
> 
> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
> 0x80000026 leaf")
> Reported-by: Michael Larabel <michael@michaellarabel.com>
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>

For the future Intel multi-die system that I know, it still has
package-scope RAPL, but this is done with TPMI RAPL interface.

The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
"id_is_cpu == false", so no need to make rapl_pmu_is_pkg_scope()
returns true for those Intel systems.

The patch LGTM.

Reviewed-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui
> ---
>  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/powercap/intel_rapl_common.c
> b/drivers/powercap/intel_rapl_common.c
> index 3cffa6c79538..2f24ca764408 100644
> --- a/drivers/powercap/intel_rapl_common.c
> +++ b/drivers/powercap/intel_rapl_common.c
> @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct rapl_package
> *rp)
>  }
>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>  
> +/*
> + * Intel systems that enumerate DIE domain have RAPL domains
> implemented
> + * per-die, however, the same is not true for AMD and Hygon
> processors
> + * where RAPL domains for PKG energy are in-fact per-PKG. Since
> + * logical_die_id is same as logical_package_id in absence of DIE
> + * enumeration, use topology_logical_die_id() on Intel systems and
> + * topology_logical_package_id() on AMD and Hygon systems.
> + */
> +#define rapl_pmu_is_pkg_scope()                                \
> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> +
>  /* caller to ensure CPU hotplug lock is held */
>  struct rapl_package *rapl_find_package_domain_cpuslocked(int id,
> struct rapl_if_priv *priv,
>                                                          bool
> id_is_cpu)
> @@ -2136,7 +2148,8 @@ struct rapl_package
> *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>         int uid;
>  
>         if (id_is_cpu)
> -               uid = topology_logical_die_id(id);
> +               uid = rapl_pmu_is_pkg_scope() ?
> +                     topology_physical_package_id(id) :
> topology_logical_die_id(id);
>         else
>                 uid = id;
>  
> @@ -2168,9 +2181,10 @@ struct rapl_package
> *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>                 return ERR_PTR(-ENOMEM);
>  
>         if (id_is_cpu) {
> -               rp->id = topology_logical_die_id(id);
> +               rp->id = rapl_pmu_is_pkg_scope() ?
> +                        topology_physical_package_id(id) :
> topology_logical_die_id(id);
>                 rp->lead_cpu = id;
> -               if (topology_max_dies_per_package() > 1)
> +               if (!rapl_pmu_is_pkg_scope() &&
> topology_max_dies_per_package() > 1)
>                         snprintf(rp->name,
> PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>                                  topology_physical_package_id(id),
> topology_die_id(id));
>                 else
Dhananjay Ugwekar July 22, 2024, 8:24 a.m. UTC | #2
Hi Rui,

On 7/21/2024 7:47 PM, Zhang, Rui wrote:
> On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
>> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026
>> leaf"),
>> on AMD processors that support extended CPUID leaf 0x80000026, the
>> topology_logical_die_id() macros, no longer returns package id,
>> instead it
>> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
>> event scope to be modified to CCD instead of package.
>>
>> For more historical context, please refer to commit 32fb480e0a2c
>> ("powercap/intel_rapl: Support multi-die/package"), which initially
>> changed
>> the RAPL scope from package to die for all systems, as Intel systems
>> with Die enumeration have RAPL scope as die, and those without die
>> enumeration are not affected. So, all systems(Intel, AMD, Hygon),
>> worked
>> correctly with topology_logical_die_id() until recently, but this
>> changed
>> after the "0x80000026 leaf" commit mentioned above.
>>
>> Replacing topology_logical_die_id() with
>> topology_physical_package_id()
>> conditionally only for AMD and Hygon fixes the energy-pkg event.
>>
>> On an AMD 2 socket 8 CCD Zen5 server:
>>
>> Before:
>>
>> linux$ ls /sys/class/powercap/
>> intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
>> intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
>> intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
>> intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
>> intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
>> intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
>> intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
>> intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
>> intel-rapl:f
>>
>> After:
>>
>> linux$ ls /sys/class/powercap/
>> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-
>> rapl:1:0
>>
>> Only one sysfs entry per-event per-package is created after this
>> change.
>>
>> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
>> 0x80000026 leaf")
>> Reported-by: Michael Larabel <michael@michaellarabel.com>
>> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> 
> For the future Intel multi-die system that I know, it still has
> package-scope RAPL, but this is done with TPMI RAPL interface.
> 
> The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
> "id_is_cpu == false", so no need to make rapl_pmu_is_pkg_scope()
> returns true for those Intel systems.

This seems like an important point, would you be okay with it, if I include
this info in the commit log in v2 along with you rb tag?

Thanks for the review.

Regards,
Dhananjay

> 
> The patch LGTM.
> 
> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> 
> thanks,
> rui
>> ---
>>  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
>>  1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/powercap/intel_rapl_common.c
>> b/drivers/powercap/intel_rapl_common.c
>> index 3cffa6c79538..2f24ca764408 100644
>> --- a/drivers/powercap/intel_rapl_common.c
>> +++ b/drivers/powercap/intel_rapl_common.c
>> @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct rapl_package
>> *rp)
>>  }
>>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>>  
>> +/*
>> + * Intel systems that enumerate DIE domain have RAPL domains
>> implemented
>> + * per-die, however, the same is not true for AMD and Hygon
>> processors
>> + * where RAPL domains for PKG energy are in-fact per-PKG. Since
>> + * logical_die_id is same as logical_package_id in absence of DIE
>> + * enumeration, use topology_logical_die_id() on Intel systems and
>> + * topology_logical_package_id() on AMD and Hygon systems.
>> + */
>> +#define rapl_pmu_is_pkg_scope()                                \
>> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
>> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
>> +
>>  /* caller to ensure CPU hotplug lock is held */
>>  struct rapl_package *rapl_find_package_domain_cpuslocked(int id,
>> struct rapl_if_priv *priv,
>>                                                          bool
>> id_is_cpu)
>> @@ -2136,7 +2148,8 @@ struct rapl_package
>> *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>>         int uid;
>>  
>>         if (id_is_cpu)
>> -               uid = topology_logical_die_id(id);
>> +               uid = rapl_pmu_is_pkg_scope() ?
>> +                     topology_physical_package_id(id) :
>> topology_logical_die_id(id);
>>         else
>>                 uid = id;
>>  
>> @@ -2168,9 +2181,10 @@ struct rapl_package
>> *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>>                 return ERR_PTR(-ENOMEM);
>>  
>>         if (id_is_cpu) {
>> -               rp->id = topology_logical_die_id(id);
>> +               rp->id = rapl_pmu_is_pkg_scope() ?
>> +                        topology_physical_package_id(id) :
>> topology_logical_die_id(id);
>>                 rp->lead_cpu = id;
>> -               if (topology_max_dies_per_package() > 1)
>> +               if (!rapl_pmu_is_pkg_scope() &&
>> topology_max_dies_per_package() > 1)
>>                         snprintf(rp->name,
>> PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>>                                  topology_physical_package_id(id),
>> topology_die_id(id));
>>                 else
>
Zhang, Rui July 22, 2024, 1:52 p.m. UTC | #3
On Mon, 2024-07-22 at 13:54 +0530, Dhananjay Ugwekar wrote:
> Hi Rui,
> 
> On 7/21/2024 7:47 PM, Zhang, Rui wrote:
> > On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
> > > After commit ("x86/cpu/topology: Add support for the AMD
> > > 0x80000026
> > > leaf"),
> > > on AMD processors that support extended CPUID leaf 0x80000026,
> > > the
> > > topology_logical_die_id() macros, no longer returns package id,
> > > instead it
> > > returns the CCD (Core Complex Die) id. This leads to the energy-
> > > pkg
> > > event scope to be modified to CCD instead of package.
> > > 
> > > For more historical context, please refer to commit 32fb480e0a2c
> > > ("powercap/intel_rapl: Support multi-die/package"), which
> > > initially
> > > changed
> > > the RAPL scope from package to die for all systems, as Intel
> > > systems
> > > with Die enumeration have RAPL scope as die, and those without
> > > die
> > > enumeration are not affected. So, all systems(Intel, AMD, Hygon),
> > > worked
> > > correctly with topology_logical_die_id() until recently, but this
> > > changed
> > > after the "0x80000026 leaf" commit mentioned above.
> > > 
> > > Replacing topology_logical_die_id() with
> > > topology_physical_package_id()
> > > conditionally only for AMD and Hygon fixes the energy-pkg event.
> > > 
> > > On an AMD 2 socket 8 CCD Zen5 server:
> > > 
> > > Before:
> > > 
> > > linux$ ls /sys/class/powercap/
> > > intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
> > > intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
> > > intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
> > > intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
> > > intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
> > > intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
> > > intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
> > > intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
> > > intel-rapl:f
> > > 
> > > After:
> > > 
> > > linux$ ls /sys/class/powercap/
> > > intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-
> > > rapl:1:0
> > > 
> > > Only one sysfs entry per-event per-package is created after this
> > > change.
> > > 
> > > Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
> > > 0x80000026 leaf")
> > > Reported-by: Michael Larabel <michael@michaellarabel.com>
> > > Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> > 
> > For the future Intel multi-die system that I know, it still has
> > package-scope RAPL, but this is done with TPMI RAPL interface.
> > 
> > The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
> > "id_is_cpu == false", so no need to make rapl_pmu_is_pkg_scope()
> > returns true for those Intel systems.
> 
> This seems like an important point, would you be okay with it, if I
> include
> this info in the commit log in v2 along with you rb tag?

Yes.

This reminds me that we can rephrase the comment for
rapl_pmu_is_pkg_scope() a bit, something including below points,
1. AMD/HYGON platforms use per-PKG Package energy counter
2. For Intel platforms
   2.1 CLX-AP platform has per-DIE Package energy counter
   2.2 other platforms that uses MSR RAPL are single die systems so the
Package energy counter are per-PKG/per-DIE
   2.3 new platforms that use TPMI RAPL doesn't care about the scope
because they are not MSR/CPU based.

what do you think?

thanks,
rui
> 
> Thanks for the review.
> 
> Regards,
> Dhananjay
> 
> > 
> > The patch LGTM.
> > 
> > Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> > 
> > thanks,
> > rui
> > > ---
> > >  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
> > >  1 file changed, 17 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/powercap/intel_rapl_common.c
> > > b/drivers/powercap/intel_rapl_common.c
> > > index 3cffa6c79538..2f24ca764408 100644
> > > --- a/drivers/powercap/intel_rapl_common.c
> > > +++ b/drivers/powercap/intel_rapl_common.c
> > > @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct
> > > rapl_package
> > > *rp)
> > >  }
> > >  EXPORT_SYMBOL_GPL(rapl_remove_package);
> > >  
> > > +/*
> > > + * Intel systems that enumerate DIE domain have RAPL domains
> > > implemented
> > > + * per-die, however, the same is not true for AMD and Hygon
> > > processors
> > > + * where RAPL domains for PKG energy are in-fact per-PKG. Since
> > > + * logical_die_id is same as logical_package_id in absence of
> > > DIE
> > > + * enumeration, use topology_logical_die_id() on Intel systems
> > > and
> > > + * topology_logical_package_id() on AMD and Hygon systems.
> > > + */
> > > +#define rapl_pmu_is_pkg_scope()                                \
> > > +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
> > > +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> > > +
> > >  /* caller to ensure CPU hotplug lock is held */
> > >  struct rapl_package *rapl_find_package_domain_cpuslocked(int id,
> > > struct rapl_if_priv *priv,
> > >                                                          bool
> > > id_is_cpu)
> > > @@ -2136,7 +2148,8 @@ struct rapl_package
> > > *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
> > >         int uid;
> > >  
> > >         if (id_is_cpu)
> > > -               uid = topology_logical_die_id(id);
> > > +               uid = rapl_pmu_is_pkg_scope() ?
> > > +                     topology_physical_package_id(id) :
> > > topology_logical_die_id(id);
> > >         else
> > >                 uid = id;
> > >  
> > > @@ -2168,9 +2181,10 @@ struct rapl_package
> > > *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
> > >                 return ERR_PTR(-ENOMEM);
> > >  
> > >         if (id_is_cpu) {
> > > -               rp->id = topology_logical_die_id(id);
> > > +               rp->id = rapl_pmu_is_pkg_scope() ?
> > > +                        topology_physical_package_id(id) :
> > > topology_logical_die_id(id);
> > >                 rp->lead_cpu = id;
> > > -               if (topology_max_dies_per_package() > 1)
> > > +               if (!rapl_pmu_is_pkg_scope() &&
> > > topology_max_dies_per_package() > 1)
> > >                         snprintf(rp->name,
> > > PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
> > >                                 
> > > topology_physical_package_id(id),
> > > topology_die_id(id));
> > >                 else
> >
Dhananjay Ugwekar July 22, 2024, 2:01 p.m. UTC | #4
On 7/22/2024 7:22 PM, Zhang, Rui wrote:
> On Mon, 2024-07-22 at 13:54 +0530, Dhananjay Ugwekar wrote:
>> Hi Rui,
>>
>> On 7/21/2024 7:47 PM, Zhang, Rui wrote:
>>> On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
>>>> After commit ("x86/cpu/topology: Add support for the AMD
>>>> 0x80000026
>>>> leaf"),
>>>> on AMD processors that support extended CPUID leaf 0x80000026,
>>>> the
>>>> topology_logical_die_id() macros, no longer returns package id,
>>>> instead it
>>>> returns the CCD (Core Complex Die) id. This leads to the energy-
>>>> pkg
>>>> event scope to be modified to CCD instead of package.
>>>>
>>>> For more historical context, please refer to commit 32fb480e0a2c
>>>> ("powercap/intel_rapl: Support multi-die/package"), which
>>>> initially
>>>> changed
>>>> the RAPL scope from package to die for all systems, as Intel
>>>> systems
>>>> with Die enumeration have RAPL scope as die, and those without
>>>> die
>>>> enumeration are not affected. So, all systems(Intel, AMD, Hygon),
>>>> worked
>>>> correctly with topology_logical_die_id() until recently, but this
>>>> changed
>>>> after the "0x80000026 leaf" commit mentioned above.
>>>>
>>>> Replacing topology_logical_die_id() with
>>>> topology_physical_package_id()
>>>> conditionally only for AMD and Hygon fixes the energy-pkg event.
>>>>
>>>> On an AMD 2 socket 8 CCD Zen5 server:
>>>>
>>>> Before:
>>>>
>>>> linux$ ls /sys/class/powercap/
>>>> intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-rapl:5:0
>>>> intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-rapl:d:0
>>>> intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
>>>> intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
>>>> intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-rapl:4:0
>>>> intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-rapl:c:0
>>>> intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
>>>> intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
>>>> intel-rapl:f
>>>>
>>>> After:
>>>>
>>>> linux$ ls /sys/class/powercap/
>>>> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-
>>>> rapl:1:0
>>>>
>>>> Only one sysfs entry per-event per-package is created after this
>>>> change.
>>>>
>>>> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
>>>> 0x80000026 leaf")
>>>> Reported-by: Michael Larabel <michael@michaellarabel.com>
>>>> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
>>>
>>> For the future Intel multi-die system that I know, it still has
>>> package-scope RAPL, but this is done with TPMI RAPL interface.
>>>
>>> The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
>>> "id_is_cpu == false", so no need to make rapl_pmu_is_pkg_scope()
>>> returns true for those Intel systems.
>>
>> This seems like an important point, would you be okay with it, if I
>> include
>> this info in the commit log in v2 along with you rb tag?
> 
> Yes.
> 
> This reminds me that we can rephrase the comment for
> rapl_pmu_is_pkg_scope() a bit, something including below points,
> 1. AMD/HYGON platforms use per-PKG Package energy counter
> 2. For Intel platforms
>    2.1 CLX-AP platform has per-DIE Package energy counter
>    2.2 other platforms that uses MSR RAPL are single die systems so the
> Package energy counter are per-PKG/per-DIE
>    2.3 new platforms that use TPMI RAPL doesn't care about the scope
> because they are not MSR/CPU based.
> 
> what do you think?

Agreed, this gives a more clear picture of the all the RAPL scopes.

We will need the above comment in the first patch as well, apart from the 2.3 point.

Also, regarding perf/x86/rapl driver(patch 1), will you be sending a patch
to conditionally set the rapl scope to die for CLK-AP platform(on top of this fix),
or should I fix it in this patch 1 itself?

Thanks,
Dhananjay

> 
> thanks,
> rui
>>
>> Thanks for the review.
>>
>> Regards,
>> Dhananjay
>>
>>>
>>> The patch LGTM.
>>>
>>> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
>>>
>>> thanks,
>>> rui
>>>> ---
>>>>  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++---
>>>>  1 file changed, 17 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/powercap/intel_rapl_common.c
>>>> b/drivers/powercap/intel_rapl_common.c
>>>> index 3cffa6c79538..2f24ca764408 100644
>>>> --- a/drivers/powercap/intel_rapl_common.c
>>>> +++ b/drivers/powercap/intel_rapl_common.c
>>>> @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct
>>>> rapl_package
>>>> *rp)
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>>>>  
>>>> +/*
>>>> + * Intel systems that enumerate DIE domain have RAPL domains
>>>> implemented
>>>> + * per-die, however, the same is not true for AMD and Hygon
>>>> processors
>>>> + * where RAPL domains for PKG energy are in-fact per-PKG. Since
>>>> + * logical_die_id is same as logical_package_id in absence of
>>>> DIE
>>>> + * enumeration, use topology_logical_die_id() on Intel systems
>>>> and
>>>> + * topology_logical_package_id() on AMD and Hygon systems.
>>>> + */
>>>> +#define rapl_pmu_is_pkg_scope()                                \
>>>> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
>>>> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
>>>> +
>>>>  /* caller to ensure CPU hotplug lock is held */
>>>>  struct rapl_package *rapl_find_package_domain_cpuslocked(int id,
>>>> struct rapl_if_priv *priv,
>>>>                                                          bool
>>>> id_is_cpu)
>>>> @@ -2136,7 +2148,8 @@ struct rapl_package
>>>> *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>>>>         int uid;
>>>>  
>>>>         if (id_is_cpu)
>>>> -               uid = topology_logical_die_id(id);
>>>> +               uid = rapl_pmu_is_pkg_scope() ?
>>>> +                     topology_physical_package_id(id) :
>>>> topology_logical_die_id(id);
>>>>         else
>>>>                 uid = id;
>>>>  
>>>> @@ -2168,9 +2181,10 @@ struct rapl_package
>>>> *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>>>>                 return ERR_PTR(-ENOMEM);
>>>>  
>>>>         if (id_is_cpu) {
>>>> -               rp->id = topology_logical_die_id(id);
>>>> +               rp->id = rapl_pmu_is_pkg_scope() ?
>>>> +                        topology_physical_package_id(id) :
>>>> topology_logical_die_id(id);
>>>>                 rp->lead_cpu = id;
>>>> -               if (topology_max_dies_per_package() > 1)
>>>> +               if (!rapl_pmu_is_pkg_scope() &&
>>>> topology_max_dies_per_package() > 1)
>>>>                         snprintf(rp->name,
>>>> PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>>>>                                 
>>>> topology_physical_package_id(id),
>>>> topology_die_id(id));
>>>>                 else
>>>
>
Zhang, Rui July 22, 2024, 3:21 p.m. UTC | #5
On Mon, 2024-07-22 at 19:31 +0530, Dhananjay Ugwekar wrote:
> 
> 
> On 7/22/2024 7:22 PM, Zhang, Rui wrote:
> > On Mon, 2024-07-22 at 13:54 +0530, Dhananjay Ugwekar wrote:
> > > Hi Rui,
> > > 
> > > On 7/21/2024 7:47 PM, Zhang, Rui wrote:
> > > > On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
> > > > > After commit ("x86/cpu/topology: Add support for the AMD
> > > > > 0x80000026
> > > > > leaf"),
> > > > > on AMD processors that support extended CPUID leaf
> > > > > 0x80000026,
> > > > > the
> > > > > topology_logical_die_id() macros, no longer returns package
> > > > > id,
> > > > > instead it
> > > > > returns the CCD (Core Complex Die) id. This leads to the
> > > > > energy-
> > > > > pkg
> > > > > event scope to be modified to CCD instead of package.
> > > > > 
> > > > > For more historical context, please refer to commit
> > > > > 32fb480e0a2c
> > > > > ("powercap/intel_rapl: Support multi-die/package"), which
> > > > > initially
> > > > > changed
> > > > > the RAPL scope from package to die for all systems, as Intel
> > > > > systems
> > > > > with Die enumeration have RAPL scope as die, and those
> > > > > without
> > > > > die
> > > > > enumeration are not affected. So, all systems(Intel, AMD,
> > > > > Hygon),
> > > > > worked
> > > > > correctly with topology_logical_die_id() until recently, but
> > > > > this
> > > > > changed
> > > > > after the "0x80000026 leaf" commit mentioned above.
> > > > > 
> > > > > Replacing topology_logical_die_id() with
> > > > > topology_physical_package_id()
> > > > > conditionally only for AMD and Hygon fixes the energy-pkg
> > > > > event.
> > > > > 
> > > > > On an AMD 2 socket 8 CCD Zen5 server:
> > > > > 
> > > > > Before:
> > > > > 
> > > > > linux$ ls /sys/class/powercap/
> > > > > intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-
> > > > > rapl:5:0
> > > > > intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-
> > > > > rapl:d:0
> > > > > intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
> > > > > intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
> > > > > intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-
> > > > > rapl:4:0
> > > > > intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-
> > > > > rapl:c:0
> > > > > intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
> > > > > intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
> > > > > intel-rapl:f
> > > > > 
> > > > > After:
> > > > > 
> > > > > linux$ ls /sys/class/powercap/
> > > > > intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1 
> > > > > intel-
> > > > > rapl:1:0
> > > > > 
> > > > > Only one sysfs entry per-event per-package is created after
> > > > > this
> > > > > change.
> > > > > 
> > > > > Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the
> > > > > AMD
> > > > > 0x80000026 leaf")
> > > > > Reported-by: Michael Larabel <michael@michaellarabel.com>
> > > > > Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> > > > 
> > > > For the future Intel multi-die system that I know, it still has
> > > > package-scope RAPL, but this is done with TPMI RAPL interface.
> > > > 
> > > > The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
> > > > "id_is_cpu == false", so no need to make
> > > > rapl_pmu_is_pkg_scope()
> > > > returns true for those Intel systems.
> > > 
> > > This seems like an important point, would you be okay with it, if
> > > I
> > > include
> > > this info in the commit log in v2 along with you rb tag?
> > 
> > Yes.
> > 
> > This reminds me that we can rephrase the comment for
> > rapl_pmu_is_pkg_scope() a bit, something including below points,
> > 1. AMD/HYGON platforms use per-PKG Package energy counter
> > 2. For Intel platforms
> >    2.1 CLX-AP platform has per-DIE Package energy counter
> >    2.2 other platforms that uses MSR RAPL are single die systems so
> > the
> > Package energy counter are per-PKG/per-DIE
> >    2.3 new platforms that use TPMI RAPL doesn't care about the
> > scope
> > because they are not MSR/CPU based.
> > 
> > what do you think?
> 
> Agreed, this gives a more clear picture of the all the RAPL scopes.
> 
> We will need the above comment in the first patch as well, apart from
> the 2.3 point.

Sounds good to me.
> 
> Also, regarding perf/x86/rapl driver(patch 1), will you be sending a
> patch
> to conditionally set the rapl scope to die for CLK-AP platform(on top
> of this fix),
> or should I fix it in this patch 1 itself?

patch 1 is a fix patch.
optimization for CLX-AP should be a separate patch and that is not
urgent (the new logic is still correct for current existing Intel
platforms), I will submit it later.

I think the fix patch is good enough as long as we have below
information
1. CLX-AP is multi-die and its RAPL MSRs are die scope
2. other Intel platforms are single die systems so the scope can be
considered as either pkg-scope or die-scope.
This info will make the future optimization easier.

thanks,
rui

> 
> Thanks,
> Dhananjay
> 
> > 
> > thanks,
> > rui
> > > 
> > > Thanks for the review.
> > > 
> > > Regards,
> > > Dhananjay
> > > 
> > > > 
> > > > The patch LGTM.
> > > > 
> > > > Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> > > > 
> > > > thanks,
> > > > rui
> > > > > ---
> > > > >  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++-
> > > > > --
> > > > >  1 file changed, 17 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/powercap/intel_rapl_common.c
> > > > > b/drivers/powercap/intel_rapl_common.c
> > > > > index 3cffa6c79538..2f24ca764408 100644
> > > > > --- a/drivers/powercap/intel_rapl_common.c
> > > > > +++ b/drivers/powercap/intel_rapl_common.c
> > > > > @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct
> > > > > rapl_package
> > > > > *rp)
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(rapl_remove_package);
> > > > >  
> > > > > +/*
> > > > > + * Intel systems that enumerate DIE domain have RAPL domains
> > > > > implemented
> > > > > + * per-die, however, the same is not true for AMD and Hygon
> > > > > processors
> > > > > + * where RAPL domains for PKG energy are in-fact per-PKG.
> > > > > Since
> > > > > + * logical_die_id is same as logical_package_id in absence
> > > > > of
> > > > > DIE
> > > > > + * enumeration, use topology_logical_die_id() on Intel
> > > > > systems
> > > > > and
> > > > > + * topology_logical_package_id() on AMD and Hygon systems.
> > > > > + */
> > > > > +#define
> > > > > rapl_pmu_is_pkg_scope()                                \
> > > > > +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
> > > > > +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> > > > > +
> > > > >  /* caller to ensure CPU hotplug lock is held */
> > > > >  struct rapl_package *rapl_find_package_domain_cpuslocked(int
> > > > > id,
> > > > > struct rapl_if_priv *priv,
> > > > >                                                          bool
> > > > > id_is_cpu)
> > > > > @@ -2136,7 +2148,8 @@ struct rapl_package
> > > > > *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
> > > > >         int uid;
> > > > >  
> > > > >         if (id_is_cpu)
> > > > > -               uid = topology_logical_die_id(id);
> > > > > +               uid = rapl_pmu_is_pkg_scope() ?
> > > > > +                     topology_physical_package_id(id) :
> > > > > topology_logical_die_id(id);
> > > > >         else
> > > > >                 uid = id;
> > > > >  
> > > > > @@ -2168,9 +2181,10 @@ struct rapl_package
> > > > > *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
> > > > >                 return ERR_PTR(-ENOMEM);
> > > > >  
> > > > >         if (id_is_cpu) {
> > > > > -               rp->id = topology_logical_die_id(id);
> > > > > +               rp->id = rapl_pmu_is_pkg_scope() ?
> > > > > +                        topology_physical_package_id(id) :
> > > > > topology_logical_die_id(id);
> > > > >                 rp->lead_cpu = id;
> > > > > -               if (topology_max_dies_per_package() > 1)
> > > > > +               if (!rapl_pmu_is_pkg_scope() &&
> > > > > topology_max_dies_per_package() > 1)
> > > > >                         snprintf(rp->name,
> > > > > PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
> > > > >                                 
> > > > > topology_physical_package_id(id),
> > > > > topology_die_id(id));
> > > > >                 else
> > > > 
> > 
>
Dhananjay Ugwekar July 23, 2024, 4:12 a.m. UTC | #6
On 7/22/2024 8:51 PM, Zhang, Rui wrote:
> On Mon, 2024-07-22 at 19:31 +0530, Dhananjay Ugwekar wrote:
>>
>>
>> On 7/22/2024 7:22 PM, Zhang, Rui wrote:
>>> On Mon, 2024-07-22 at 13:54 +0530, Dhananjay Ugwekar wrote:
>>>> Hi Rui,
>>>>
>>>> On 7/21/2024 7:47 PM, Zhang, Rui wrote:
>>>>> On Fri, 2024-07-19 at 09:25 +0000, Dhananjay Ugwekar wrote:
>>>>>> After commit ("x86/cpu/topology: Add support for the AMD
>>>>>> 0x80000026
>>>>>> leaf"),
>>>>>> on AMD processors that support extended CPUID leaf
>>>>>> 0x80000026,
>>>>>> the
>>>>>> topology_logical_die_id() macros, no longer returns package
>>>>>> id,
>>>>>> instead it
>>>>>> returns the CCD (Core Complex Die) id. This leads to the
>>>>>> energy-
>>>>>> pkg
>>>>>> event scope to be modified to CCD instead of package.
>>>>>>
>>>>>> For more historical context, please refer to commit
>>>>>> 32fb480e0a2c
>>>>>> ("powercap/intel_rapl: Support multi-die/package"), which
>>>>>> initially
>>>>>> changed
>>>>>> the RAPL scope from package to die for all systems, as Intel
>>>>>> systems
>>>>>> with Die enumeration have RAPL scope as die, and those
>>>>>> without
>>>>>> die
>>>>>> enumeration are not affected. So, all systems(Intel, AMD,
>>>>>> Hygon),
>>>>>> worked
>>>>>> correctly with topology_logical_die_id() until recently, but
>>>>>> this
>>>>>> changed
>>>>>> after the "0x80000026 leaf" commit mentioned above.
>>>>>>
>>>>>> Replacing topology_logical_die_id() with
>>>>>> topology_physical_package_id()
>>>>>> conditionally only for AMD and Hygon fixes the energy-pkg
>>>>>> event.
>>>>>>
>>>>>> On an AMD 2 socket 8 CCD Zen5 server:
>>>>>>
>>>>>> Before:
>>>>>>
>>>>>> linux$ ls /sys/class/powercap/
>>>>>> intel-rapl      intel-rapl:1:0  intel-rapl:3:0  intel-
>>>>>> rapl:5:0
>>>>>> intel-rapl:7:0  intel-rapl:9:0  intel-rapl:b:0  intel-
>>>>>> rapl:d:0
>>>>>> intel-rapl:f:0  intel-rapl:0    intel-rapl:2    intel-rapl:4
>>>>>> intel-rapl:6    intel-rapl:8    intel-rapl:a    intel-rapl:c
>>>>>> intel-rapl:e    intel-rapl:0:0  intel-rapl:2:0  intel-
>>>>>> rapl:4:0
>>>>>> intel-rapl:6:0  intel-rapl:8:0  intel-rapl:a:0  intel-
>>>>>> rapl:c:0
>>>>>> intel-rapl:e:0  intel-rapl:1    intel-rapl:3    intel-rapl:5
>>>>>> intel-rapl:7    intel-rapl:9    intel-rapl:b    intel-rapl:d
>>>>>> intel-rapl:f
>>>>>>
>>>>>> After:
>>>>>>
>>>>>> linux$ ls /sys/class/powercap/
>>>>>> intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1 
>>>>>> intel-
>>>>>> rapl:1:0
>>>>>>
>>>>>> Only one sysfs entry per-event per-package is created after
>>>>>> this
>>>>>> change.
>>>>>>
>>>>>> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the
>>>>>> AMD
>>>>>> 0x80000026 leaf")
>>>>>> Reported-by: Michael Larabel <michael@michaellarabel.com>
>>>>>> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
>>>>>
>>>>> For the future Intel multi-die system that I know, it still has
>>>>> package-scope RAPL, but this is done with TPMI RAPL interface.
>>>>>
>>>>> The TPMI RAPL driver invokes these APIs with "id == pkg_id" and
>>>>> "id_is_cpu == false", so no need to make
>>>>> rapl_pmu_is_pkg_scope()
>>>>> returns true for those Intel systems.
>>>>
>>>> This seems like an important point, would you be okay with it, if
>>>> I
>>>> include
>>>> this info in the commit log in v2 along with you rb tag?
>>>
>>> Yes.
>>>
>>> This reminds me that we can rephrase the comment for
>>> rapl_pmu_is_pkg_scope() a bit, something including below points,
>>> 1. AMD/HYGON platforms use per-PKG Package energy counter
>>> 2. For Intel platforms
>>>    2.1 CLX-AP platform has per-DIE Package energy counter
>>>    2.2 other platforms that uses MSR RAPL are single die systems so
>>> the
>>> Package energy counter are per-PKG/per-DIE
>>>    2.3 new platforms that use TPMI RAPL doesn't care about the
>>> scope
>>> because they are not MSR/CPU based.
>>>
>>> what do you think?
>>
>> Agreed, this gives a more clear picture of the all the RAPL scopes.
>>
>> We will need the above comment in the first patch as well, apart from
>> the 2.3 point.
> 
> Sounds good to me.
>>
>> Also, regarding perf/x86/rapl driver(patch 1), will you be sending a
>> patch
>> to conditionally set the rapl scope to die for CLK-AP platform(on top
>> of this fix),
>> or should I fix it in this patch 1 itself?
> 
> patch 1 is a fix patch.
> optimization for CLX-AP should be a separate patch and that is not
> urgent (the new logic is still correct for current existing Intel
> platforms), I will submit it later.

Makes sense

> 
> I think the fix patch is good enough as long as we have below
> information
> 1. CLX-AP is multi-die and its RAPL MSRs are die scope
> 2. other Intel platforms are single die systems so the scope can be
> considered as either pkg-scope or die-scope.
> This info will make the future optimization easier.

Yes, this seems good

Thanks, 
Dhananjay

> 
> thanks,
> rui
> 
>>
>> Thanks,
>> Dhananjay
>>
>>>
>>> thanks,
>>> rui
>>>>
>>>> Thanks for the review.
>>>>
>>>> Regards,
>>>> Dhananjay
>>>>
>>>>>
>>>>> The patch LGTM.
>>>>>
>>>>> Reviewed-by: Zhang Rui <rui.zhang@intel.com>
>>>>>
>>>>> thanks,
>>>>> rui
>>>>>> ---
>>>>>>  drivers/powercap/intel_rapl_common.c | 20 +++++++++++++++++-
>>>>>> --
>>>>>>  1 file changed, 17 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/powercap/intel_rapl_common.c
>>>>>> b/drivers/powercap/intel_rapl_common.c
>>>>>> index 3cffa6c79538..2f24ca764408 100644
>>>>>> --- a/drivers/powercap/intel_rapl_common.c
>>>>>> +++ b/drivers/powercap/intel_rapl_common.c
>>>>>> @@ -2128,6 +2128,18 @@ void rapl_remove_package(struct
>>>>>> rapl_package
>>>>>> *rp)
>>>>>>  }
>>>>>>  EXPORT_SYMBOL_GPL(rapl_remove_package);
>>>>>>  
>>>>>> +/*
>>>>>> + * Intel systems that enumerate DIE domain have RAPL domains
>>>>>> implemented
>>>>>> + * per-die, however, the same is not true for AMD and Hygon
>>>>>> processors
>>>>>> + * where RAPL domains for PKG energy are in-fact per-PKG.
>>>>>> Since
>>>>>> + * logical_die_id is same as logical_package_id in absence
>>>>>> of
>>>>>> DIE
>>>>>> + * enumeration, use topology_logical_die_id() on Intel
>>>>>> systems
>>>>>> and
>>>>>> + * topology_logical_package_id() on AMD and Hygon systems.
>>>>>> + */
>>>>>> +#define
>>>>>> rapl_pmu_is_pkg_scope()                                \
>>>>>> +       (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||  \
>>>>>> +        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
>>>>>> +
>>>>>>  /* caller to ensure CPU hotplug lock is held */
>>>>>>  struct rapl_package *rapl_find_package_domain_cpuslocked(int
>>>>>> id,
>>>>>> struct rapl_if_priv *priv,
>>>>>>                                                          bool
>>>>>> id_is_cpu)
>>>>>> @@ -2136,7 +2148,8 @@ struct rapl_package
>>>>>> *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
>>>>>>         int uid;
>>>>>>  
>>>>>>         if (id_is_cpu)
>>>>>> -               uid = topology_logical_die_id(id);
>>>>>> +               uid = rapl_pmu_is_pkg_scope() ?
>>>>>> +                     topology_physical_package_id(id) :
>>>>>> topology_logical_die_id(id);
>>>>>>         else
>>>>>>                 uid = id;
>>>>>>  
>>>>>> @@ -2168,9 +2181,10 @@ struct rapl_package
>>>>>> *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
>>>>>>                 return ERR_PTR(-ENOMEM);
>>>>>>  
>>>>>>         if (id_is_cpu) {
>>>>>> -               rp->id = topology_logical_die_id(id);
>>>>>> +               rp->id = rapl_pmu_is_pkg_scope() ?
>>>>>> +                        topology_physical_package_id(id) :
>>>>>> topology_logical_die_id(id);
>>>>>>                 rp->lead_cpu = id;
>>>>>> -               if (topology_max_dies_per_package() > 1)
>>>>>> +               if (!rapl_pmu_is_pkg_scope() &&
>>>>>> topology_max_dies_per_package() > 1)
>>>>>>                         snprintf(rp->name,
>>>>>> PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
>>>>>>                                 
>>>>>> topology_physical_package_id(id),
>>>>>> topology_die_id(id));
>>>>>>                 else
>>>>>
>>>
>>
>
diff mbox series

Patch

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 3cffa6c79538..2f24ca764408 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -2128,6 +2128,18 @@  void rapl_remove_package(struct rapl_package *rp)
 }
 EXPORT_SYMBOL_GPL(rapl_remove_package);
 
+/*
+ * Intel systems that enumerate DIE domain have RAPL domains implemented
+ * per-die, however, the same is not true for AMD and Hygon processors
+ * where RAPL domains for PKG energy are in-fact per-PKG. Since
+ * logical_die_id is same as logical_package_id in absence of DIE
+ * enumeration, use topology_logical_die_id() on Intel systems and
+ * topology_logical_package_id() on AMD and Hygon systems.
+ */
+#define rapl_pmu_is_pkg_scope()				\
+	(boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||	\
+	 boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+
 /* caller to ensure CPU hotplug lock is held */
 struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
 							 bool id_is_cpu)
@@ -2136,7 +2148,8 @@  struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
 	int uid;
 
 	if (id_is_cpu)
-		uid = topology_logical_die_id(id);
+		uid = rapl_pmu_is_pkg_scope() ?
+		      topology_physical_package_id(id) : topology_logical_die_id(id);
 	else
 		uid = id;
 
@@ -2168,9 +2181,10 @@  struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
 		return ERR_PTR(-ENOMEM);
 
 	if (id_is_cpu) {
-		rp->id = topology_logical_die_id(id);
+		rp->id = rapl_pmu_is_pkg_scope() ?
+			 topology_physical_package_id(id) : topology_logical_die_id(id);
 		rp->lead_cpu = id;
-		if (topology_max_dies_per_package() > 1)
+		if (!rapl_pmu_is_pkg_scope() && topology_max_dies_per_package() > 1)
 			snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
 				 topology_physical_package_id(id), topology_die_id(id));
 		else