[0/7] x86/topology: Improve CPUID.1F handling

Message ID 20220812164144.30829-1-rui.zhang@intel.com (mailing list archive)

Message

Zhang Rui Aug. 12, 2022, 4:41 p.m. UTC
On Intel Alder Lake-N platforms, which have only Ecores, the Ecore Module
topology is enumerated via the CPUID.1F Module level, which the Linux
kernel does not support yet.

This exposes two issues in the current CPUID.1F handling code:
1. Linux interprets the Module id bits as package id and erroneously
   reports a multi-module system as a multi-package system.
2. Linux excludes the unknown Module id bits from the core_id, which results
   in duplicate core_ids within a package once the first issue is fixed.
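
To make the two issues concrete, here is a minimal sketch of how an x2APIC
ID splits into topology IDs. It is not the kernel code: the struct, the
function names and the shift values used with it are assumptions chosen for
illustration. The only hardware fact relied on is that each CPUID.1F
sub-leaf reports in EAX[4:0] how far to shift the x2APIC ID right to reach
the next level's ID.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Right-shift counts as reported in EAX[4:0] of each CPUID.1F sub-leaf.
 * The values a caller passes in here are made up, not read from hardware.
 */
struct topo_shifts {
	unsigned int smt;	/* shift past the SMT (thread) bits */
	unsigned int core;	/* shift past the Core bits */
	unsigned int module;	/* shift past the Module bits */
};

/* Extract APIC ID bits [lo, hi). */
uint32_t apicid_bits(uint32_t apicid, unsigned int lo, unsigned int hi)
{
	assert(lo <= hi && hi < 32);
	return (apicid >> lo) & ((UINT32_C(1) << (hi - lo)) - 1);
}

/* Issue 1: Module level unknown, so all bits above Core count as package. */
uint32_t pkg_id_module_unaware(uint32_t apicid, const struct topo_shifts *s)
{
	return apicid >> s->core;
}

/* Package id taken above the highest enumerated level (Module here). */
uint32_t pkg_id_module_aware(uint32_t apicid, const struct topo_shifts *s)
{
	return apicid >> s->module;
}

/* Issue 2: Module bits dropped from core_id, so ids repeat per module. */
uint32_t core_id_without_module(uint32_t apicid, const struct topo_shifts *s)
{
	return apicid_bits(apicid, s->smt, s->core);
}

/* core_id keeps every bit between the SMT level and the package. */
uint32_t core_id_with_module(uint32_t apicid, const struct topo_shifts *s)
{
	return apicid_bits(apicid, s->smt, s->module);
}
```

With hypothetical shifts smt=0, core=2, module=3 (one package, two modules
of four cores), APIC ID 5 lands in "package 1" in the module-unaware
variant although only one package exists, and APIC IDs 1 and 5 both get
core_id 1 when the Module bits are dropped.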

In addition, a third problem is observed on Intel hybrid ADL-S/P platforms.
The value returned in EBX by the CPUID.1F SMT level (number of siblings)
differs between Pcore CPUs and Ecore CPUs, so smp_num_siblings ends up with
an inconsistent value that depends on the Pcore/Ecore CPU enumeration
order. This could cause problems, although we have not observed any
functional issues so far.
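
The order dependence can be sketched as follows. This is only an
illustration under the assumption that each enumerated CPU overwrites one
global value; the function names and input arrays are hypothetical, and the
"maximum" variant is shown purely for contrast, not as a description of
what patches 4/7 to 7/7 do.

```c
#include <assert.h>
#include <stddef.h>

/*
 * ebx[i] stands for the sibling count CPU i reads from the CPUID.1F SMT
 * level, listed in enumeration order. The arrays are made-up inputs.
 */

/* Every CPU overwrites the value: the result depends on who comes last. */
unsigned int siblings_last_writer(const unsigned int *ebx, size_t ncpus)
{
	unsigned int smp_num_siblings = 1;

	assert(ebx != NULL);
	for (size_t i = 0; i < ncpus; i++)
		smp_num_siblings = ebx[i];
	return smp_num_siblings;
}

/* Order-independent alternative: keep the largest count seen. */
unsigned int siblings_max(const unsigned int *ebx, size_t ncpus)
{
	unsigned int smp_num_siblings = 1;

	assert(ebx != NULL);
	for (size_t i = 0; i < ncpus; i++)
		if (ebx[i] > smp_num_siblings)
			smp_num_siblings = ebx[i];
	return smp_num_siblings;
}
```

For example, with Pcores reporting 2 and Ecores reporting 1, the inputs
{2, 2, 1, 1} and {1, 1, 2, 2} give different results in the first function
and the same result in the second.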

Patches 1/7 and 2/7 fix the first two issues. At the same time, they
reveal that core_id can be sparse on platforms with CPUID.1F support.
Patch 3/7 improves the coretemp driver so that it can handle sparse core
ids; it is the only driver that uses core_id as an array index and runs
on platforms with CPUID.1F support.
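
One generic way to stop indexing an array by a sparse core_id is to map
each id to a dense slot on first use. The sketch below shows that idea
only; it is not the coretemp change in patch 3/7, and MAX_SLOTS, struct
core_slots and both functions are hypothetical names.

```c
#include <assert.h>

#define MAX_SLOTS 8
#define SLOT_FREE (-1)

/* Dense table: slot i is owned by core_id[i], or is free. */
struct core_slots {
	int core_id[MAX_SLOTS];
};

void core_slots_init(struct core_slots *t)
{
	for (int i = 0; i < MAX_SLOTS; i++)
		t->core_id[i] = SLOT_FREE;
}

/*
 * Return the dense slot index for core_id, allocating the first free
 * slot if the id has not been seen before. Returns -1 if the table is full.
 */
int core_slot_get(struct core_slots *t, int core_id)
{
	int free_slot = -1;

	assert(core_id >= 0);
	for (int i = 0; i < MAX_SLOTS; i++) {
		if (t->core_id[i] == core_id)
			return i;
		if (t->core_id[i] == SLOT_FREE && free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0)
		t->core_id[free_slot] = core_id;
	return free_slot;
}
```

Sparse ids such as 0, 8 and 40 then occupy slots 0, 1 and 2, so the array
size depends on the number of cores rather than on the largest core_id.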

Patches 4/7 to 7/7 propose a fix for the third problem and update the
related documentation.

The patch series has been tested on Intel ICL/CLX servers and SKL/KBL/ADL
clients.

thanks,
-rui

Comments

Ingo Molnar Aug. 13, 2022, 10:44 a.m. UTC | #1
* Zhang Rui <rui.zhang@intel.com> wrote:

> Patches 1/7 and 2/7 fix the first two issues. At the same time, they
> reveal that core_id can be sparse on platforms with CPUID.1F support.
> Patch 3/7 improves the coretemp driver so that it can handle sparse core
> ids; it is the only driver that uses core_id as an array index and runs
> on platforms with CPUID.1F support.
> 
> Patches 4/7 to 7/7 propose a fix for the third problem and update the
> related documentation.

Yeah, so patch 3/7 probably needs to come first - otherwise there's a 
window for bisection breakage.

Thanks,

	Ingo

Zhang Rui Aug. 13, 2022, 5:10 p.m. UTC | #2
Hi, Ingo,

Thanks for reviewing this patch series.

On Sat, 2022-08-13 at 12:44 +0200, Ingo Molnar wrote:
> Yeah, so patch 3/7 probably needs to come first - otherwise there's a
> window for bisection breakage.

Sure, I will re-arrange this.


thanks,
rui