Message ID | 20220525081416.3306043-8-sudeep.holla@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arch_topology: Updates to add socket support and fix cluster ids | expand |
On 5/25/22 4:14 PM, Sudeep Holla wrote: > The cacheinfo is now initialised early along with the CPU topology > initialisation. Instead of relying on the LLC ID information parsed > separately only with ACPI PPTT elsewhere, migrate to use the similar > information from the cacheinfo. > > This is generic for both DT and ACPI systems. The ACPI LLC ID information > parsed separately can now be removed from arch specific code. > > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> > --- > drivers/base/arch_topology.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c > index 765723448b10..4c486e4e6f2f 100644 > --- a/drivers/base/arch_topology.c > +++ b/drivers/base/arch_topology.c > @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu) > /* not numa in package, lets use the package siblings */ > core_mask = &cpu_topology[cpu].core_sibling; > } > - if (cpu_topology[cpu].llc_id != -1) { > + > + if (last_level_cache_is_valid(cpu)) { > if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) > core_mask = &cpu_topology[cpu].llc_sibling; > } > @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid) > for_each_online_cpu(cpu) { > cpu_topo = &cpu_topology[cpu]; > > - if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) { > + if (last_level_cache_is_shared(cpu, cpuid)) { > cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); > cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); > } >
On 25/05/2022 10:14, Sudeep Holla wrote: > The cacheinfo is now initialised early along with the CPU topology > initialisation. Instead of relying on the LLC ID information parsed > separately only with ACPI PPTT elsewhere, migrate to use the similar > information from the cacheinfo. > > This is generic for both DT and ACPI systems. The ACPI LLC ID information > parsed separately can now be removed from arch specific code. > > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> > --- > drivers/base/arch_topology.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c > index 765723448b10..4c486e4e6f2f 100644 > --- a/drivers/base/arch_topology.c > +++ b/drivers/base/arch_topology.c > @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu) > /* not numa in package, lets use the package siblings */ > core_mask = &cpu_topology[cpu].core_sibling; > } > - if (cpu_topology[cpu].llc_id != -1) { > + > + if (last_level_cache_is_valid(cpu)) { > if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) > core_mask = &cpu_topology[cpu].llc_sibling; > } > @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid) > for_each_online_cpu(cpu) { > cpu_topo = &cpu_topology[cpu]; > > - if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) { > + if (last_level_cache_is_shared(cpu, cpuid)) { > cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); > cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); > } I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks like that last_level_cache_is_shared() isn't working as expected. I instrumented cpu_coregroup_mask() like: const struct cpumask *cpu_coregroup_mask(int cpu) { const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu)); if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) { core_mask = &cpu_topology[cpu].core_sibling; (1) } (2) if (last_level_cache_is_valid(cpu)) { if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) core_mask = &cpu_topology[cpu].llc_sibling; (3) } if (IS_ENABLED(CONFIG_SCHED_CLUSTER) && cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling)) core_mask = &cpu_topology[cpu].cluster_sibling; (4) (5) return core_mask; } and got: (A) v3 patch-set: [ 11.561133] (1) cpu_coregroup_mask[0]=0-47 [ 11.565670] (2) last_level_cache_is_valid(0)=1 [ 11.570587] (3) cpu_coregroup_mask[0]=0 <-- llc_sibling=0 (should be 0-23) [ 11.574833] (4) cpu_coregroup_mask[0]=0-3 <-- Altra hack kicks in! [ 11.579275] (5) cpu_coregroup_mask[0]=0-3 # cat /sys/kernel/debug/sched/domains/cpu0/domain*/name CLS DIE # cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -3 cpu0 0 domain0 00000000,00000000,0000000f domain1 ffffffff,ffffffff,ffffffff So the MC domain is missing. (B) mainline as reference (cpu_coregroup_mask() slightly different): [ 11.585008] (1) cpu_coregroup_mask[0]=0-47 [ 11.589544] (3) cpu_coregroup_mask[0]=0-23 <-- !!! [ 11.594079] (5) cpu_coregroup_mask[0]=0-23 # cat /sys/kernel/debug/sched/domains/cpu0/domain*/name CLS MC <-- !!! DIE # cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -4 cpu0 0 domain0 00000000,00000000,0000000f domain1 00000000,00000000,00ffffff <-- !!! domain2 ffffffff,ffffffff,ffffffff
On Thu, Jun 02, 2022 at 04:26:00PM +0200, Dietmar Eggemann wrote: > On 25/05/2022 10:14, Sudeep Holla wrote: > > The cacheinfo is now initialised early along with the CPU topology > > initialisation. Instead of relying on the LLC ID information parsed > > separately only with ACPI PPTT elsewhere, migrate to use the similar > > information from the cacheinfo. > > > > This is generic for both DT and ACPI systems. The ACPI LLC ID information > > parsed separately can now be removed from arch specific code. > > > > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> > > --- > > drivers/base/arch_topology.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c > > index 765723448b10..4c486e4e6f2f 100644 > > --- a/drivers/base/arch_topology.c > > +++ b/drivers/base/arch_topology.c > > @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu) > > /* not numa in package, lets use the package siblings */ > > core_mask = &cpu_topology[cpu].core_sibling; > > } > > - if (cpu_topology[cpu].llc_id != -1) { > > + > > + if (last_level_cache_is_valid(cpu)) { > > if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) > > core_mask = &cpu_topology[cpu].llc_sibling; > > } > > @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid) > > for_each_online_cpu(cpu) { > > cpu_topo = &cpu_topology[cpu]; > > > > - if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) { > > + if (last_level_cache_is_shared(cpu, cpuid)) { > > cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); > > cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); > > } > > I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks > like that last_level_cache_is_shared() isn't working as > expected. > Thanks a lot for detailed instrumentation, I am unable to identify why it is not working though. I will take a deeper look later.
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 765723448b10..4c486e4e6f2f 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu) /* not numa in package, lets use the package siblings */ core_mask = &cpu_topology[cpu].core_sibling; } - if (cpu_topology[cpu].llc_id != -1) { + + if (last_level_cache_is_valid(cpu)) { if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) core_mask = &cpu_topology[cpu].llc_sibling; } @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid) for_each_online_cpu(cpu) { cpu_topo = &cpu_topology[cpu]; - if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) { + if (last_level_cache_is_shared(cpu, cpuid)) { cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); }
The cacheinfo is now initialised early along with the CPU topology initialisation. Instead of relying on the LLC ID information parsed separately only with ACPI PPTT elsewhere, migrate to use the similar information from the cacheinfo. This is generic for both DT and ACPI systems. The ACPI LLC ID information parsed separately can now be removed from arch specific code. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> --- drivers/base/arch_topology.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)