| Message ID | 20171012194856.13844-7-jeremy.linton@arm.com (mailing list archive) |
|---|---|
| State | Not Applicable, archived |
On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
> Propagate the topology information from the PPTT tree to the
> cpu_topology array. We can get the thread id, core_id and
> cluster_id by assuming certain levels of the PPTT tree correspond
> to those concepts. The package_id is flagged in the tree and can be
> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
> which terminates its search when it finds an ACPI node flagged
> as the physical package. If the tree doesn't contain enough
> levels to represent all of thread/core/cod/package, then the package
> id will be used for the missing levels.
>
> Since server/ACPI machines are more likely to be multisocket and NUMA,

I think this stuff is vague enough already, so to start with I would drop
patches 4 and 5 and stop assuming which machines are more likely to ship
with ACPI than DT.

I am just saying, for the umpteenth time, that these levels have no
architectural meaning _whatsoever_; "level" is a hierarchy concept
with no architectural meaning attached.

The only consistent thing PPTT brings about is the hierarchy
levels/grouping (and _possibly_ what a package boundary is); let's
stick to that for the time being.

> this patch also modifies the default clusters=sockets behavior
> for ACPI machines to sockets=sockets. DT machines continue to
> represent sockets as clusters. For ACPI machines, this results in a
> more normalized view of the topology. Cluster-level scheduler decisions
> are still being made due to the "MC" level in the scheduler, which has
> knowledge of cache sharing domains.
>
> This code is loosely based on a combination of code from:
> Xiongfeng Wang <wangxiongfeng2@huawei.com>
> John Garry <john.garry@huawei.com>
> Jeffrey Hugo <jhugo@codeaurora.org>
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/kernel/topology.c | 54 +++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/topology.h | 1 +
>  2 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 9147e5b6326d..42f3e7f28b2b 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -11,6 +11,7 @@
>   * for more details.
>   */
>
> +#include <linux/acpi.h>
>  #include <linux/arch_topology.h>
>  #include <linux/cpu.h>
>  #include <linux/cpumask.h>
> @@ -22,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/sched/topology.h>
>  #include <linux/slab.h>
> +#include <linux/smp.h>
>  #include <linux/string.h>
>
>  #include <asm/cpu.h>
> @@ -304,6 +306,54 @@ static void __init reset_cpu_topology(void)
>  	}
>  }
>
> +#ifdef CONFIG_ACPI
> +/*
> + * Propagate the topology information of the processor_topology_node tree
> + * to the cpu_topology array.
> + */
> +static int __init parse_acpi_topology(void)
> +{
> +	u64 is_threaded;
> +	int cpu;
> +	int topology_id;
> +	/* set a large depth, to hit ACPI_PPTT_PHYSICAL_PACKAGE if one exists */
> +	const int max_topo = 0xFF;
> +
> +	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
> +
> +	for_each_possible_cpu(cpu) {
> +		topology_id = setup_acpi_cpu_topology(cpu, 0);
> +		if (topology_id < 0)
> +			return topology_id;
> +
> +		if (is_threaded) {
> +			cpu_topology[cpu].thread_id = topology_id;
> +			topology_id = setup_acpi_cpu_topology(cpu, 1);

Nit: you can move setup_acpi_cpu_topology() to include/linux/acpi.h,
provide an empty inline function for the !ACPI case and remove this
function's ACPI ifdeffery.

> +			cpu_topology[cpu].core_id = topology_id;
> +			topology_id = setup_acpi_cpu_topology(cpu, 2);
> +			cpu_topology[cpu].cluster_id = topology_id;
> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);

If you want a package id (that's just a package tag to group cores), you
should not use a large level, because you know how setup_acpi_cpu_topology()
works; you should add an API that allows you to retrieve the package id
(so that you can use the ACPI_PPTT_PHYSICAL_PACKAGE flag consistently,
whatever it represents).

Lorenzo

> +			cpu_topology[cpu].package_id = topology_id;
> +		} else {
> +			cpu_topology[cpu].thread_id = -1;
> +			cpu_topology[cpu].core_id = topology_id;
> +			topology_id = setup_acpi_cpu_topology(cpu, 1);
> +			cpu_topology[cpu].cluster_id = topology_id;
> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
> +			cpu_topology[cpu].package_id = topology_id;
> +		}
> +	}
> +	return 0;
> +}
> +
> +#else
> +static int __init parse_acpi_topology(void)
> +{
> +	/* ACPI kernels should be built with PPTT support */
> +	return -EINVAL;
> +}
> +#endif
> +
>  void __init init_cpu_topology(void)
>  {
>  	reset_cpu_topology();
> @@ -312,6 +362,8 @@ void __init init_cpu_topology(void)
>  	 * Discard anything that was parsed if we hit an error so we
>  	 * don't use partial information.
>  	 */
> -	if (of_have_populated_dt() && parse_dt_topology())
> +	if ((!acpi_disabled) && parse_acpi_topology())
> +		reset_cpu_topology();
> +	else if (of_have_populated_dt() && parse_dt_topology())
>  		reset_cpu_topology();
>  }
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 4660749a7303..cbf2fb13bf92 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -43,6 +43,7 @@
>  		if (nr_cpus_node(node))
>
>  int arch_update_cpu_topology(void);
> +int setup_acpi_cpu_topology(unsigned int cpu, int level);
>
>  /* Conform to ACPI 2.0 SLIT distance definitions */
>  #define LOCAL_DISTANCE	10
> --
> 2.13.5
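[For reference, the header arrangement suggested in the nit above would look
roughly like the following sketch; the stub's return value is an assumption,
chosen so that !ACPI callers see the same error as the existing
!CONFIG_ACPI parse_acpi_topology() fallback:

/* include/linux/acpi.h (sketch) */
#ifdef CONFIG_ACPI
int setup_acpi_cpu_topology(unsigned int cpu, int level);
#else
static inline int setup_acpi_cpu_topology(unsigned int cpu, int level)
{
	/* No ACPI/PPTT support built in; callers fall back to DT parsing */
	return -EINVAL;
}
#endif

With that in place, arch code can call setup_acpi_cpu_topology()
unconditionally and drop the #ifdef CONFIG_ACPI block around
parse_acpi_topology().]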
On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>> Propagate the topology information from the PPTT tree to the
>> cpu_topology array. We can get the thread id, core_id and
>> cluster_id by assuming certain levels of the PPTT tree correspond
>> to those concepts. The package_id is flagged in the tree and can be
>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>> which terminates its search when it finds an ACPI node flagged
>> as the physical package. If the tree doesn't contain enough
>> levels to represent all of thread/core/cod/package, then the package
>> id will be used for the missing levels.
>>
>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>
> I think this stuff is vague enough already, so to start with I would drop
> patches 4 and 5 and stop assuming which machines are more likely to ship
> with ACPI than DT.
>
> I am just saying, for the umpteenth time, that these levels have no
> architectural meaning _whatsoever_; "level" is a hierarchy concept
> with no architectural meaning attached.

?

Did anyone say anything about that? No, I think the only thing being
guaranteed here is that the kernel's physical_id maps to an ACPI
defined socket. Which seems to be the mindset of pretty much the
entire !arm64 community, meaning they are optimizing their software
and the kernel with that concept in mind.

Are you denying the existence of non-uniformity between threads
running on different physical sockets?
Hi,

I missed the rest of the comment below...

On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>> Propagate the topology information from the PPTT tree to the
>> cpu_topology array. We can get the thread id, core_id and
>> cluster_id by assuming certain levels of the PPTT tree correspond
>> to those concepts. The package_id is flagged in the tree and can be
>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>> which terminates its search when it finds an ACPI node flagged
>> as the physical package. If the tree doesn't contain enough
>> levels to represent all of thread/core/cod/package, then the package
>> id will be used for the missing levels.
>>
>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>
> I think this stuff is vague enough already, so to start with I would drop
> patches 4 and 5 and stop assuming which machines are more likely to ship
> with ACPI than DT.
>
> I am just saying, for the umpteenth time, that these levels have no
> architectural meaning _whatsoever_; "level" is a hierarchy concept
> with no architectural meaning attached.
>
> The only consistent thing PPTT brings about is the hierarchy
> levels/grouping (and _possibly_ what a package boundary is); let's
> stick to that for the time being.
>
>> this patch also modifies the default clusters=sockets behavior
>> for ACPI machines to sockets=sockets. DT machines continue to
>> represent sockets as clusters. For ACPI machines, this results in a
>> more normalized view of the topology. Cluster-level scheduler decisions
>> are still being made due to the "MC" level in the scheduler, which has
>> knowledge of cache sharing domains.
>>
>> This code is loosely based on a combination of code from:
>> Xiongfeng Wang <wangxiongfeng2@huawei.com>
>> John Garry <john.garry@huawei.com>
>> Jeffrey Hugo <jhugo@codeaurora.org>
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>  arch/arm64/kernel/topology.c | 54 +++++++++++++++++++++++++++++++++++++++++++-
>>  include/linux/topology.h | 1 +
>>  2 files changed, 54 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 9147e5b6326d..42f3e7f28b2b 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -11,6 +11,7 @@
>>   * for more details.
>>   */
>>
>> +#include <linux/acpi.h>
>>  #include <linux/arch_topology.h>
>>  #include <linux/cpu.h>
>>  #include <linux/cpumask.h>
>> @@ -22,6 +23,7 @@
>>  #include <linux/sched.h>
>>  #include <linux/sched/topology.h>
>>  #include <linux/slab.h>
>> +#include <linux/smp.h>
>>  #include <linux/string.h>
>>
>>  #include <asm/cpu.h>
>> @@ -304,6 +306,54 @@ static void __init reset_cpu_topology(void)
>>  	}
>>  }
>>
>> +#ifdef CONFIG_ACPI
>> +/*
>> + * Propagate the topology information of the processor_topology_node tree
>> + * to the cpu_topology array.
>> + */
>> +static int __init parse_acpi_topology(void)
>> +{
>> +	u64 is_threaded;
>> +	int cpu;
>> +	int topology_id;
>> +	/* set a large depth, to hit ACPI_PPTT_PHYSICAL_PACKAGE if one exists */
>> +	const int max_topo = 0xFF;
>> +
>> +	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		topology_id = setup_acpi_cpu_topology(cpu, 0);
>> +		if (topology_id < 0)
>> +			return topology_id;
>> +
>> +		if (is_threaded) {
>> +			cpu_topology[cpu].thread_id = topology_id;
>> +			topology_id = setup_acpi_cpu_topology(cpu, 1);
>
> Nit: you can move setup_acpi_cpu_topology() to include/linux/acpi.h,
> provide an empty inline function for the !ACPI case and remove this
> function's ACPI ifdeffery.

Yah, sure..

>
>> +			cpu_topology[cpu].core_id = topology_id;
>> +			topology_id = setup_acpi_cpu_topology(cpu, 2);
>> +			cpu_topology[cpu].cluster_id = topology_id;
>> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
>
> If you want a package id (that's just a package tag to group cores), you
> should not use a large level, because you know how setup_acpi_cpu_topology()
> works; you should add an API that allows you to retrieve the package id
> (so that you can use the ACPI_PPTT_PHYSICAL_PACKAGE flag consistently,
> whatever it represents).

I don't think the spec requires the use of PHYSICAL_PACKAGE... Am I
misreading it? Which means we need to "pick" a node level to represent
the physical package if one doesn't exist...

>
> Lorenzo
>
>> +			cpu_topology[cpu].package_id = topology_id;
>> +		} else {
>> +			cpu_topology[cpu].thread_id = -1;
>> +			cpu_topology[cpu].core_id = topology_id;
>> +			topology_id = setup_acpi_cpu_topology(cpu, 1);
>> +			cpu_topology[cpu].cluster_id = topology_id;
>> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
>> +			cpu_topology[cpu].package_id = topology_id;
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>> +#else
>> +static int __init parse_acpi_topology(void)
>> +{
>> +	/* ACPI kernels should be built with PPTT support */
>> +	return -EINVAL;
>> +}
>> +#endif
>> +
>>  void __init init_cpu_topology(void)
>>  {
>>  	reset_cpu_topology();
>> @@ -312,6 +362,8 @@ void __init init_cpu_topology(void)
>>  	 * Discard anything that was parsed if we hit an error so we
>>  	 * don't use partial information.
>>  	 */
>> -	if (of_have_populated_dt() && parse_dt_topology())
>> +	if ((!acpi_disabled) && parse_acpi_topology())
>> +		reset_cpu_topology();
>> +	else if (of_have_populated_dt() && parse_dt_topology())
>>  		reset_cpu_topology();
>>  }
>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>> index 4660749a7303..cbf2fb13bf92 100644
>> --- a/include/linux/topology.h
>> +++ b/include/linux/topology.h
>> @@ -43,6 +43,7 @@
>>  		if (nr_cpus_node(node))
>>
>>  int arch_update_cpu_topology(void);
>> +int setup_acpi_cpu_topology(unsigned int cpu, int level);
>>
>>  /* Conform to ACPI 2.0 SLIT distance definitions */
>>  #define LOCAL_DISTANCE	10
>> --
>> 2.13.5
>>
On Thu, Oct 19, 2017 at 11:13:27AM -0500, Jeremy Linton wrote:
> On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
>> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>>> Propagate the topology information from the PPTT tree to the
>>> cpu_topology array. We can get the thread id, core_id and
>>> cluster_id by assuming certain levels of the PPTT tree correspond
>>> to those concepts. The package_id is flagged in the tree and can be
>>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>>> which terminates its search when it finds an ACPI node flagged
>>> as the physical package. If the tree doesn't contain enough
>>> levels to represent all of thread/core/cod/package, then the package
>>> id will be used for the missing levels.
>>>
>>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>>
>> I think this stuff is vague enough already, so to start with I would drop
>> patches 4 and 5 and stop assuming which machines are more likely to ship
>> with ACPI than DT.
>>
>> I am just saying, for the umpteenth time, that these levels have no
>> architectural meaning _whatsoever_; "level" is a hierarchy concept
>> with no architectural meaning attached.
>
> ?
>
> Did anyone say anything about that? No, I think the only thing being
> guaranteed here is that the kernel's physical_id maps to an ACPI
> defined socket. Which seems to be the mindset of pretty much the
> entire !arm64 community, meaning they are optimizing their software
> and the kernel with that concept in mind.
>
> Are you denying the existence of non-uniformity between threads
> running on different physical sockets?

No, I have not explained my POV clearly, apologies.

AFAIK, the kernel currently deals with 2 (3, if SMT) topology layers:

1) thread
2) core
3) package

What I wanted to say is that, to simplify this series, you do not need
to introduce the COD topology level, since it is just another arbitrary
topology level (ie there is no way you can pinpoint which level
corresponds to COD with PPTT - or DT, for the sake of this discussion)
that would not be used in the kernel (apart from the big.LITTLE cpufreq
driver and the PSCI checker, whose usage of topology_physical_package_id()
is questionable anyway).

PPTT allows you to define what level corresponds to a package; use it
to initialize the package topology level (that on ARM internal
variables we call cluster) and be done with it.

I do not think that adding another topology level improves anything as
far as ACPI topology detection is concerned; you are not able to use it
in the scheduler or from userspace to group CPUs anyway.

Does this answer your question?

Thanks,
Lorenzo
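[For context, the structure being populated here is roughly the following,
an abridged sketch of the arm64 struct cpu_topology as this series would
leave it; the package_id field is what the series adds, the rest predates
it:

struct cpu_topology {
	int thread_id;
	int core_id;
	int cluster_id;		/* the extra "COD" level under discussion */
	int package_id;		/* added by this series: the physical socket */
	cpumask_t thread_sibling;
	cpumask_t core_sibling;
};

The point above is that only the thread, core and package IDs feed the
generic thread/core/package layers; cluster_id has no generic consumer.]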
On Thu, Oct 19, 2017 at 11:54:22AM -0500, Jeremy Linton wrote:

[...]

>>> +			cpu_topology[cpu].core_id = topology_id;
>>> +			topology_id = setup_acpi_cpu_topology(cpu, 2);
>>> +			cpu_topology[cpu].cluster_id = topology_id;
>>> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
>>
>> If you want a package id (that's just a package tag to group cores), you
>> should not use a large level, because you know how
>> setup_acpi_cpu_topology() works; you should add an API that allows you
>> to retrieve the package id (so that you can use the
>> ACPI_PPTT_PHYSICAL_PACKAGE flag consistently, whatever it represents).
>
> I don't think the spec requires the use of PHYSICAL_PACKAGE... Am I
> misreading it? Which means we need to "pick" a node level to
> represent the physical package if one doesn't exist...

The specs define a means to detect if a given PPTT node corresponds to a
package (I am refraining from stating again that to me it's not clear cut
what a package is _architecturally_; I think you know my POV by now), and
that's what you need to use to retrieve a package id for a given cpu, if
I understand the aim of the physical package flag.

Either that or that flag is completely useless.

Lorenzo

ACPI 6.2 - Table 5-151 (page 248)

Physical package
----------------
Set to 1 if this node of the processor topology represents the boundary
of a physical package, whether socketed or surface mounted. Set to 0 if
this instance of the processor topology does not represent the boundary
of a physical package.
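[The dedicated accessor being asked for here might look like the following
sketch. The helper names are hypothetical; only the ACPI_PPTT_PHYSICAL_PACKAGE
flag and the struct acpi_pptt_processor type come from the ACPICA headers,
but the shape of the walk is what the spec wording implies:

/* Hypothetical: walk from the CPU's leaf node toward the root and stop
 * at the first node flagged as a physical package boundary.
 */
static int acpi_find_cpu_package_id(unsigned int cpu)
{
	struct acpi_pptt_processor *node = acpi_cpu_to_pptt_node(cpu);	/* hypothetical lookup */

	while (node) {
		if (node->flags & ACPI_PPTT_PHYSICAL_PACKAGE)
			return acpi_pptt_node_id(node);		/* hypothetical: e.g. table offset */
		node = acpi_pptt_node_parent(node);		/* hypothetical: follow the Parent field */
	}
	return -ENOENT;	/* no package boundary flagged in the tree */
}

This keeps the "package" decision tied to the flag rather than to an
arbitrarily large level number.]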
Hi,

On 10/20/2017 04:14 AM, Lorenzo Pieralisi wrote:
> On Thu, Oct 19, 2017 at 11:13:27AM -0500, Jeremy Linton wrote:
>> On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
>>> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>>>> Propagate the topology information from the PPTT tree to the
>>>> cpu_topology array. We can get the thread id, core_id and
>>>> cluster_id by assuming certain levels of the PPTT tree correspond
>>>> to those concepts. The package_id is flagged in the tree and can be
>>>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>>>> which terminates its search when it finds an ACPI node flagged
>>>> as the physical package. If the tree doesn't contain enough
>>>> levels to represent all of thread/core/cod/package, then the package
>>>> id will be used for the missing levels.
>>>>
>>>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>>>
>>> I think this stuff is vague enough already, so to start with I would drop
>>> patches 4 and 5 and stop assuming which machines are more likely to ship
>>> with ACPI than DT.
>>>
>>> I am just saying, for the umpteenth time, that these levels have no
>>> architectural meaning _whatsoever_; "level" is a hierarchy concept
>>> with no architectural meaning attached.
>>
>> ?
>>
>> Did anyone say anything about that? No, I think the only thing being
>> guaranteed here is that the kernel's physical_id maps to an ACPI
>> defined socket. Which seems to be the mindset of pretty much the
>> entire !arm64 community, meaning they are optimizing their software
>> and the kernel with that concept in mind.
>>
>> Are you denying the existence of non-uniformity between threads
>> running on different physical sockets?
>
> No, I have not explained my POV clearly, apologies.
>
> AFAIK, the kernel currently deals with 2 (3, if SMT) topology layers:
>
> 1) thread
> 2) core
> 3) package
>
> What I wanted to say is that, to simplify this series, you do not need
> to introduce the COD topology level, since it is just another arbitrary
> topology level (ie there is no way you can pinpoint which level
> corresponds to COD with PPTT - or DT, for the sake of this discussion)
> that would not be used in the kernel (apart from the big.LITTLE cpufreq
> driver and the PSCI checker, whose usage of topology_physical_package_id()
> is questionable anyway).

Oh! But I'm at a loss as to what to do with those two users if I set
the node which has the physical socket flag set as the "cluster_id" in
the topology.

Granted, this being ACPI, I don't expect the cpufreq driver to be active
(given CPPC), and the psci checker might be ignored? Even so, it's a bit
of a misnomer what is actually happening. Are we good with this?

> PPTT allows you to define what level corresponds to a package; use it
> to initialize the package topology level (that on ARM internal
> variables we call cluster) and be done with it.
>
> I do not think that adding another topology level improves anything as
> far as ACPI topology detection is concerned; you are not able to use it
> in the scheduler or from userspace to group CPUs anyway.

Correct, and AFAIK after having poked a bit at the scheduler it's sort of
redundant, as the generic cache sharing levels are more useful anyway.

> Does this answer your question?

Yes, other than what to do with the two drivers.

> Thanks,
> Lorenzo
On 20/10/17 17:14, Jeremy Linton wrote:
> Hi,
>
> On 10/20/2017 04:14 AM, Lorenzo Pieralisi wrote:
>> On Thu, Oct 19, 2017 at 11:13:27AM -0500, Jeremy Linton wrote:
>>> On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
>>>> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>>>>> Propagate the topology information from the PPTT tree to the
>>>>> cpu_topology array. We can get the thread id, core_id and
>>>>> cluster_id by assuming certain levels of the PPTT tree correspond
>>>>> to those concepts. The package_id is flagged in the tree and can be
>>>>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>>>>> which terminates its search when it finds an ACPI node flagged
>>>>> as the physical package. If the tree doesn't contain enough
>>>>> levels to represent all of thread/core/cod/package, then the package
>>>>> id will be used for the missing levels.
>>>>>
>>>>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>>>>
>>>> I think this stuff is vague enough already, so to start with I would
>>>> drop patches 4 and 5 and stop assuming which machines are more likely
>>>> to ship with ACPI than DT.
>>>>
>>>> I am just saying, for the umpteenth time, that these levels have no
>>>> architectural meaning _whatsoever_; "level" is a hierarchy concept
>>>> with no architectural meaning attached.
>>>
>>> ?
>>>
>>> Did anyone say anything about that? No, I think the only thing being
>>> guaranteed here is that the kernel's physical_id maps to an ACPI
>>> defined socket. Which seems to be the mindset of pretty much the
>>> entire !arm64 community, meaning they are optimizing their software
>>> and the kernel with that concept in mind.
>>>
>>> Are you denying the existence of non-uniformity between threads
>>> running on different physical sockets?
>>
>> No, I have not explained my POV clearly, apologies.
>>
>> AFAIK, the kernel currently deals with 2 (3, if SMT) topology layers:
>>
>> 1) thread
>> 2) core
>> 3) package
>>
>> What I wanted to say is that, to simplify this series, you do not need
>> to introduce the COD topology level, since it is just another arbitrary
>> topology level (ie there is no way you can pinpoint which level
>> corresponds to COD with PPTT - or DT, for the sake of this discussion)
>> that would not be used in the kernel (apart from the big.LITTLE cpufreq
>> driver and the PSCI checker, whose usage of topology_physical_package_id()
>> is questionable anyway).

Just thinking out loud here:

1. psci_checker.c: it's just used to get groups of CPUs to achieve deeper
   idle states. It should be easy to get rid of that.

2. big.LITTLE cpufreq: 2 users. For scpi, I should be able to do what I
   did for SCMI; for spc, I am thinking we can hard-code it, since it's
   just used on TC2.
On 10/20/2017 10:14 AM, Jeremy Linton wrote:
> Hi,
>
> On 10/20/2017 04:14 AM, Lorenzo Pieralisi wrote:
>> On Thu, Oct 19, 2017 at 11:13:27AM -0500, Jeremy Linton wrote:
>>> On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
>>>> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>>>>> Propagate the topology information from the PPTT tree to the
>>>>> cpu_topology array. We can get the thread id, core_id and
>>>>> cluster_id by assuming certain levels of the PPTT tree correspond
>>>>> to those concepts. The package_id is flagged in the tree and can be
>>>>> found by passing an arbitrarily large level to setup_acpi_cpu_topology(),
>>>>> which terminates its search when it finds an ACPI node flagged
>>>>> as the physical package. If the tree doesn't contain enough
>>>>> levels to represent all of thread/core/cod/package, then the package
>>>>> id will be used for the missing levels.
>>>>>
>>>>> Since server/ACPI machines are more likely to be multisocket and NUMA,
>>>>
>>>> I think this stuff is vague enough already, so to start with I would
>>>> drop patches 4 and 5 and stop assuming which machines are more likely
>>>> to ship with ACPI than DT.
>>>>
>>>> I am just saying, for the umpteenth time, that these levels have no
>>>> architectural meaning _whatsoever_; "level" is a hierarchy concept
>>>> with no architectural meaning attached.
>>>
>>> ?
>>>
>>> Did anyone say anything about that? No, I think the only thing being
>>> guaranteed here is that the kernel's physical_id maps to an ACPI
>>> defined socket. Which seems to be the mindset of pretty much the
>>> entire !arm64 community, meaning they are optimizing their software
>>> and the kernel with that concept in mind.
>>>
>>> Are you denying the existence of non-uniformity between threads
>>> running on different physical sockets?
>>
>> No, I have not explained my POV clearly, apologies.
>>
>> AFAIK, the kernel currently deals with 2 (3, if SMT) topology layers:
>>
>> 1) thread
>> 2) core
>> 3) package
>>
>> What I wanted to say is that, to simplify this series, you do not need
>> to introduce the COD topology level, since it is just another arbitrary
>> topology level (ie there is no way you can pinpoint which level
>> corresponds to COD with PPTT - or DT, for the sake of this discussion)
>> that would not be used in the kernel (apart from the big.LITTLE cpufreq
>> driver and the PSCI checker, whose usage of topology_physical_package_id()
>> is questionable anyway).
>
> Oh! But I'm at a loss as to what to do with those two users if I set
> the node which has the physical socket flag set as the "cluster_id" in
> the topology.
>
> Granted, this being ACPI, I don't expect the cpufreq driver to be active
> (given CPPC), and the psci checker might be ignored? Even so, it's a bit
> of a misnomer what is actually happening. Are we good with this?
>
>> PPTT allows you to define what level corresponds to a package; use it
>> to initialize the package topology level (that on ARM internal
>> variables we call cluster) and be done with it.
>>
>> I do not think that adding another topology level improves anything as
>> far as ACPI topology detection is concerned; you are not able to use it
>> in the scheduler or from userspace to group CPUs anyway.
>
> Correct, and AFAIK after having poked a bit at the scheduler it's sort of
> redundant, as the generic cache sharing levels are more useful anyway.

What do you mean, it can't be used? We expect a followup series which
uses PPTT to define scheduling domains/groups.

The scheduler supports 4 types of levels, with an arbitrary number of
instances of each: NUMA, DIE (package, usually not used with NUMA), MC
(multicore, typically cores which share resources like cache), and SMT
(threads).

Our particular platform has a single socket/package, with multiple
"clusters", each cluster consisting of multiple cores that share caches.
We represent all of this in PPTT, and expect it to be used. Leaf nodes
are cores. The level above is the cluster. The top level is the package.
We expect eventually (and understand that Jeremy is not tackling this
with his current series) that clusters get represented by MC, so that
migrated processes prefer their cache-shared siblings, and the entire
package is represented by DIE.

This will have to come from PPTT, since you can't use core_siblings to
derive this. Additionally, if we had multiple layers of clustering, we
would expect each layer to be represented by MC. Topology.c has none of
this support today.

PPTT can refer to SLIT/SRAT to determine if a hierarchy level
corresponds to the "Cluster-on-Die" concept of other architectures
(which end up as NUMA nodes in NUMA scheduling domains).

What PPTT will have to do is parse the tree(s), determine what each
level is (SMT, MC, NUMA, DIE) and then use set_sched_topology() so
that the scheduler can build up groups/domains appropriately.

Jeremy, we've tested v3 on our platform. The topology part works as
expected; we no longer see lstopo reporting sockets where there are
none, but the scheduling groups are broken (expected). Caches still
don't work right (no sizes reported, and the sched caches are not
attributed to the cores). We will likely have additional comments as we
delve into it.

>>
>> Does this answer your question?
> Yes, other than what to do with the two drivers.
>
>> Thanks,
>> Lorenzo
Hi,

On 10/20/2017 02:55 PM, Jeffrey Hugo wrote:
> On 10/20/2017 10:14 AM, Jeremy Linton wrote:
>> Hi,
>>
>> On 10/20/2017 04:14 AM, Lorenzo Pieralisi wrote:
>>> On Thu, Oct 19, 2017 at 11:13:27AM -0500, Jeremy Linton wrote:
>>>> On 10/19/2017 10:56 AM, Lorenzo Pieralisi wrote:
>>>>> On Thu, Oct 12, 2017 at 02:48:55PM -0500, Jeremy Linton wrote:
>>>>>> Propagate the topology information from the PPTT tree to the
>>>>>> cpu_topology array. We can get the thread id, core_id and
>>>>>> cluster_id by assuming certain levels of the PPTT tree correspond
>>>>>> to those concepts. The package_id is flagged in the tree and can be
>>>>>> found by passing an arbitrarily large level to
>>>>>> setup_acpi_cpu_topology(), which terminates its search when it
>>>>>> finds an ACPI node flagged as the physical package. If the tree
>>>>>> doesn't contain enough levels to represent all of
>>>>>> thread/core/cod/package, then the package id will be used for the
>>>>>> missing levels.
>>>>>>
>>>>>> Since server/ACPI machines are more likely to be multisocket and
>>>>>> NUMA,
>>>>>
>>>>> I think this stuff is vague enough already, so to start with I would
>>>>> drop patches 4 and 5 and stop assuming which machines are more likely
>>>>> to ship with ACPI than DT.
>>>>>
>>>>> I am just saying, for the umpteenth time, that these levels have no
>>>>> architectural meaning _whatsoever_; "level" is a hierarchy concept
>>>>> with no architectural meaning attached.
>>>>
>>>> ?
>>>>
>>>> Did anyone say anything about that? No, I think the only thing being
>>>> guaranteed here is that the kernel's physical_id maps to an ACPI
>>>> defined socket. Which seems to be the mindset of pretty much the
>>>> entire !arm64 community, meaning they are optimizing their software
>>>> and the kernel with that concept in mind.
>>>>
>>>> Are you denying the existence of non-uniformity between threads
>>>> running on different physical sockets?
>>>
>>> No, I have not explained my POV clearly, apologies.
>>>
>>> AFAIK, the kernel currently deals with 2 (3, if SMT) topology layers:
>>>
>>> 1) thread
>>> 2) core
>>> 3) package
>>>
>>> What I wanted to say is that, to simplify this series, you do not need
>>> to introduce the COD topology level, since it is just another arbitrary
>>> topology level (ie there is no way you can pinpoint which level
>>> corresponds to COD with PPTT - or DT, for the sake of this discussion)
>>> that would not be used in the kernel (apart from the big.LITTLE cpufreq
>>> driver and the PSCI checker, whose usage of
>>> topology_physical_package_id() is questionable anyway).
>>
>> Oh! But I'm at a loss as to what to do with those two users if I set
>> the node which has the physical socket flag set as the "cluster_id"
>> in the topology.
>>
>> Granted, this being ACPI, I don't expect the cpufreq driver to be
>> active (given CPPC), and the psci checker might be ignored? Even so,
>> it's a bit of a misnomer what is actually happening. Are we good with
>> this?
>>
>>> PPTT allows you to define what level corresponds to a package; use it
>>> to initialize the package topology level (that on ARM internal
>>> variables we call cluster) and be done with it.
>>>
>>> I do not think that adding another topology level improves anything as
>>> far as ACPI topology detection is concerned; you are not able to use it
>>> in the scheduler or from userspace to group CPUs anyway.
>>
>> Correct, and AFAIK after having poked a bit at the scheduler it's sort
>> of redundant, as the generic cache sharing levels are more useful anyway.
>
> What do you mean, it can't be used? We expect a followup series which
> uses PPTT to define scheduling domains/groups.
>
> The scheduler supports 4 types of levels, with an arbitrary number of
> instances of each: NUMA, DIE (package, usually not used with NUMA), MC
> (multicore, typically cores which share resources like cache), and SMT
> (threads).

It turns out to be pretty easy to map individual PPTT "levels" to MC
layers, simply by creating a custom sched_domain_topology_level and
populating it with an equal number of MC layers. The only thing that
changes is the "mask" portion of each entry. Whether that is good/bad
vs just using a topology like:

static struct sched_domain_topology_level arm64_topology[] = {
#ifdef CONFIG_SCHED_SMT
	{ cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
#endif
	{ cpu_cluster_mask, cpu_core_flags, SD_INIT_NAME(CLU) },
#ifdef CONFIG_SCHED_MC
	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
#endif
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};

and using it on successful ACPI/PPTT parse, along with a new
cpu_cluster_mask, isn't clear to me either. Particularly if one goes in
and starts changing the "cpu_core_flags", for starters to the
cpu_smt_flags. But as mentioned, I think this is a follow-on patch which
meshes with patches 4/5 here.

> Our particular platform has a single socket/package, with multiple
> "clusters", each cluster consisting of multiple cores that share caches.
> We represent all of this in PPTT, and expect it to be used. Leaf nodes
> are cores. The level above is the cluster. The top level is the package.
> We expect eventually (and understand that Jeremy is not tackling this
> with his current series) that clusters get represented by MC, so that
> migrated processes prefer their cache-shared siblings, and the entire
> package is represented by DIE.
>
> This will have to come from PPTT, since you can't use core_siblings to
> derive this. Additionally, if we had multiple layers of clustering, we
> would expect each layer to be represented by MC. Topology.c has none of
> this support today.
>
> PPTT can refer to SLIT/SRAT to determine if a hierarchy level
> corresponds to the "Cluster-on-Die" concept of other architectures
> (which end up as NUMA nodes in NUMA scheduling domains).
>
> What PPTT will have to do is parse the tree(s), determine what each
> level is (SMT, MC, NUMA, DIE) and then use set_sched_topology() so
> that the scheduler can build up groups/domains appropriately.
>
> Jeremy, we've tested v3 on our platform. The topology part works as
> expected; we no longer see lstopo reporting sockets where there are
> none, but the scheduling groups are broken (expected). Caches still
> don't work right (no sizes reported, and the sched caches are not
> attributed to the cores). We will likely have additional comments as
> we delve into it.
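[The cpu_cluster_mask referenced in the table above does not exist in the
kernel at this point; a minimal sketch of what it might look like, assuming
a cluster_sibling cpumask were maintained alongside the existing
core_sibling (both the field and the helper are assumptions here):

/* Hypothetical: CPUs sharing the same PPTT cluster node. */
static inline const struct cpumask *cpu_cluster_mask(int cpu)
{
	return &cpu_topology[cpu].cluster_sibling;
}

The table would then be handed to the scheduler through the existing
set_sched_topology() hook once PPTT parsing succeeds, e.g.
set_sched_topology(arm64_topology).]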
On 10/20/2017 03:22 AM, Lorenzo Pieralisi wrote:
> On Thu, Oct 19, 2017 at 11:54:22AM -0500, Jeremy Linton wrote:
>
> [...]
>
>>>> +			cpu_topology[cpu].core_id = topology_id;
>>>> +			topology_id = setup_acpi_cpu_topology(cpu, 2);
>>>> +			cpu_topology[cpu].cluster_id = topology_id;
>>>> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
>>>
>>> If you want a package id (that's just a package tag to group cores), you
>>> should not use a large level, because you know how
>>> setup_acpi_cpu_topology() works; you should add an API that allows you
>>> to retrieve the package id (so that you can use the
>>> ACPI_PPTT_PHYSICAL_PACKAGE flag consistently, whatever it represents).
>>
>> I don't think the spec requires the use of PHYSICAL_PACKAGE... Am I
>> misreading it? Which means we need to "pick" a node level to
>> represent the physical package if one doesn't exist...
>
> The specs define a means to detect if a given PPTT node corresponds to a
> package (I am refraining from stating again that to me it's not clear cut
> what a package is _architecturally_; I think you know my POV by now), and
> that's what you need to use to retrieve a package id for a given cpu, if
> I understand the aim of the physical package flag.
>
> Either that or that flag is completely useless.
>
> Lorenzo
>
> ACPI 6.2 - Table 5-151 (page 248)
>
> Physical package
> ----------------
> Set to 1 if this node of the processor topology represents the boundary
> of a physical package, whether socketed or surface mounted. Set to 0 if
> this instance of the processor topology does not represent the boundary
> of a physical package.

I've been following the discussion and I'm not sure I understand what the
confusion is around having a physical package ID. Since I'm the one that
insisted it be in the spec, I'd be glad to clarify anything. My apologies
for not saying anything sooner, but things IRL have been very complicated
of late.

What was intended was a simple flag that was meant to tell me if a CPU ID
(this could be a CPU, a cluster, a processor container -- I don't really
care which) is *also* an actual physical device on a motherboard. That is
the only intent; there was no architectural meaning intended at all -- that
is what the PPTT structures are for, in conjunction with any DSDT
information uncovered later in the boot process.

However, in the broader server ecosystem, this can be incredibly useful.
There are a significant number of software products sold or supported that
base their fees on the number of physical sockets in use. There have been
in the past (and may be in the near future) machines where the cost of the
lease on the machine is determined by how many physical sockets (or even
CPUs) are in use, even if the machine has many more available.

Some vendors also include FRU (Field Replaceable Unit) location info in
their ACPI tables. So, for example, one or more CPUs or caches might fail
in one physical package, which is then reported to a maintenance system of
some sort that tells some human which of the physical sockets on what
motherboard needs a replacement device, or it's simply noted and shut off
until it's time to replace the entire server, or perhaps it's logged and
used in an algorithm to predict when the server might fail completely.

So, that's why the flag exists in the spec.

It seems to make sense to me to have a package ID as part of struct
cpu_topology -- it might even be really handy for CPU hotplug. If you
don't, it seems to me a whole separate struct would be needed with more
cpumasks to show who belongs to what physical package; that might be okay
but seems unnecessarily complicated to me.

You can also tell me that I have completely missed the point of the
discussion so far :-). But if you do, you have to tell me what I missed.

Hope this helps clarify...
On Wed, Nov 01, 2017 at 02:29:26PM -0600, Al Stone wrote:
> On 10/20/2017 03:22 AM, Lorenzo Pieralisi wrote:
>> On Thu, Oct 19, 2017 at 11:54:22AM -0500, Jeremy Linton wrote:
>>
>> [...]
>>
>>>>> +			cpu_topology[cpu].core_id = topology_id;
>>>>> +			topology_id = setup_acpi_cpu_topology(cpu, 2);
>>>>> +			cpu_topology[cpu].cluster_id = topology_id;
>>>>> +			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
>>>>
>>>> If you want a package id (that's just a package tag to group cores), you
>>>> should not use a large level, because you know how
>>>> setup_acpi_cpu_topology() works; you should add an API that allows you
>>>> to retrieve the package id (so that you can use the
>>>> ACPI_PPTT_PHYSICAL_PACKAGE flag consistently, whatever it represents).
>>>
>>> I don't think the spec requires the use of PHYSICAL_PACKAGE... Am I
>>> misreading it? Which means we need to "pick" a node level to
>>> represent the physical package if one doesn't exist...
>>
>> The specs define a means to detect if a given PPTT node corresponds to a
>> package (I am refraining from stating again that to me it's not clear cut
>> what a package is _architecturally_; I think you know my POV by now), and
>> that's what you need to use to retrieve a package id for a given cpu, if
>> I understand the aim of the physical package flag.
>>
>> Either that or that flag is completely useless.
>>
>> Lorenzo
>>
>> ACPI 6.2 - Table 5-151 (page 248)
>>
>> Physical package
>> ----------------
>> Set to 1 if this node of the processor topology represents the boundary
>> of a physical package, whether socketed or surface mounted. Set to 0 if
>> this instance of the processor topology does not represent the boundary
>> of a physical package.
>
> I've been following the discussion and I'm not sure I understand what the
> confusion is around having a physical package ID. Since I'm the one that
> insisted it be in the spec, I'd be glad to clarify anything. My apologies
> for not saying anything sooner, but things IRL have been very complicated
> of late.
>
> What was intended was a simple flag that was meant to tell me if a CPU ID
> (this could be a CPU, a cluster, a processor container -- I don't really
> care which) is *also* an actual physical device on a motherboard. That is
> the only intent; there was no architectural meaning intended at all -- that
> is what the PPTT structures are for, in conjunction with any DSDT
> information uncovered later in the boot process.
>
> However, in the broader server ecosystem, this can be incredibly useful.
> There are a significant number of software products sold or supported that
> base their fees on the number of physical sockets in use. There have been
> in the past (and may be in the near future) machines where the cost of the
> lease on the machine is determined by how many physical sockets (or even
> CPUs) are in use, even if the machine has many more available.
>
> Some vendors also include FRU (Field Replaceable Unit) location info in
> their ACPI tables. So, for example, one or more CPUs or caches might fail
> in one physical package, which is then reported to a maintenance system of
> some sort that tells some human which of the physical sockets on what
> motherboard needs a replacement device, or it's simply noted and shut off
> until it's time to replace the entire server, or perhaps it's logged and
> used in an algorithm to predict when the server might fail completely.
>
> So, that's why the flag exists in the spec.
>
> It seems to make sense to me to have a package ID as part of struct
> cpu_topology -- it might even be really handy for CPU hotplug. If you
> don't, it seems to me a whole separate struct would be needed with more
> cpumasks to show who belongs to what physical package; that might be okay
> but seems unnecessarily complicated to me.
>
> You can also tell me that I have completely missed the point of the
> discussion so far :-). But if you do, you have to tell me what I missed.
>
> Hope this helps clarify...

Hi Al,

Yes, it does. I think we agree that package ID has a HW/architectural
meaning on x86; it has none in PPTT (ie it totally depends on how PPTT
is enumerated). That's the first remark.

So, if the package flag is used to group CPUs and provide the topology
package hierarchy to the kernel/userspace, fine; if it is to be used to
provide the scheduler/userspace with an ID that can identify a HW
"component" of sorts, it is not fine, because the topology package ID
is a SW construction on ARM systems relying on PPTT (and DT, by the way).

So, to group CPUs and call them a package, fine by me (in the hope that
FW developers won't play too much with that package flag to make things
work, but will use it consistently instead).

Having said that, all I asked is that, given that we _know_ (thanks to
the PPTT flag) where the package boundary is, we use it to initialize
the topology package level, and that's where this patch series should
stop, IMHO.

For the time being, I see no point in adding another arbitrary topology
level (ie COD) with no architectural meaning; as I said, this is vague
enough already and there is legacy (and DT systems) to take into account
too.

Thanks,
Lorenzo
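[To make the spec language concrete: in the ACPICA definitions the flag is
a bit in the PPTT processor node's flags word, so the boundary test
described above reduces to something like the following; the wrapper
function is illustrative, only the structure and flag names come from the
ACPICA headers:

#include <linux/acpi.h>	/* struct acpi_pptt_processor, ACPI_PPTT_PHYSICAL_PACKAGE */

/* Illustrative: does this PPTT processor node mark a physical package? */
static bool pptt_node_is_physical_package(struct acpi_pptt_processor *node)
{
	return !!(node->flags & ACPI_PPTT_PHYSICAL_PACKAGE);
}]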
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9147e5b6326d..42f3e7f28b2b 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -11,6 +11,7 @@
  * for more details.
  */
 
+#include <linux/acpi.h>
 #include <linux/arch_topology.h>
 #include <linux/cpu.h>
 #include <linux/cpumask.h>
@@ -22,6 +23,7 @@
 #include <linux/sched.h>
 #include <linux/sched/topology.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
 #include <linux/string.h>
 
 #include <asm/cpu.h>
@@ -304,6 +306,54 @@ static void __init reset_cpu_topology(void)
 	}
 }
 
+#ifdef CONFIG_ACPI
+/*
+ * Propagate the topology information of the processor_topology_node tree
+ * to the cpu_topology array.
+ */
+static int __init parse_acpi_topology(void)
+{
+	u64 is_threaded;
+	int cpu;
+	int topology_id;
+	/* set a large depth, to hit ACPI_PPTT_PHYSICAL_PACKAGE if one exists */
+	const int max_topo = 0xFF;
+
+	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
+
+	for_each_possible_cpu(cpu) {
+		topology_id = setup_acpi_cpu_topology(cpu, 0);
+		if (topology_id < 0)
+			return topology_id;
+
+		if (is_threaded) {
+			cpu_topology[cpu].thread_id = topology_id;
+			topology_id = setup_acpi_cpu_topology(cpu, 1);
+			cpu_topology[cpu].core_id = topology_id;
+			topology_id = setup_acpi_cpu_topology(cpu, 2);
+			cpu_topology[cpu].cluster_id = topology_id;
+			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
+			cpu_topology[cpu].package_id = topology_id;
+		} else {
+			cpu_topology[cpu].thread_id = -1;
+			cpu_topology[cpu].core_id = topology_id;
+			topology_id = setup_acpi_cpu_topology(cpu, 1);
+			cpu_topology[cpu].cluster_id = topology_id;
+			topology_id = setup_acpi_cpu_topology(cpu, max_topo);
+			cpu_topology[cpu].package_id = topology_id;
+		}
+	}
+	return 0;
+}
+
+#else
+static int __init parse_acpi_topology(void)
+{
+	/* ACPI kernels should be built with PPTT support */
+	return -EINVAL;
+}
+#endif
+
 void __init init_cpu_topology(void)
 {
 	reset_cpu_topology();
@@ -312,6 +362,8 @@ void __init init_cpu_topology(void)
 	 * Discard anything that was parsed if we hit an error so we
 	 * don't use partial information.
 	 */
-	if (of_have_populated_dt() && parse_dt_topology())
+	if ((!acpi_disabled) && parse_acpi_topology())
+		reset_cpu_topology();
+	else if (of_have_populated_dt() && parse_dt_topology())
 		reset_cpu_topology();
 }
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 4660749a7303..cbf2fb13bf92 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -43,6 +43,7 @@
 		if (nr_cpus_node(node))
 
 int arch_update_cpu_topology(void);
+int setup_acpi_cpu_topology(unsigned int cpu, int level);
 
 /* Conform to ACPI 2.0 SLIT distance definitions */
 #define LOCAL_DISTANCE	10
Propagate the topology information from the PPTT tree to the cpu_topology
array. We can get the thread id, core_id and cluster_id by assuming
certain levels of the PPTT tree correspond to those concepts. The
package_id is flagged in the tree and can be found by passing an
arbitrarily large level to setup_acpi_cpu_topology(), which terminates its
search when it finds an ACPI node flagged as the physical package. If the
tree doesn't contain enough levels to represent all of
thread/core/cod/package, then the package id will be used for the missing
levels.

Since server/ACPI machines are more likely to be multisocket and NUMA,
this patch also modifies the default clusters=sockets behavior for ACPI
machines to sockets=sockets. DT machines continue to represent sockets as
clusters. For ACPI machines, this results in a more normalized view of the
topology. Cluster-level scheduler decisions are still being made due to
the "MC" level in the scheduler, which has knowledge of cache sharing
domains.

This code is loosely based on a combination of code from:
Xiongfeng Wang <wangxiongfeng2@huawei.com>
John Garry <john.garry@huawei.com>
Jeffrey Hugo <jhugo@codeaurora.org>

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/kernel/topology.c | 54 +++++++++++++++++++++++++++++++++++++++++++-
 include/linux/topology.h | 1 +
 2 files changed, 54 insertions(+), 1 deletion(-)