[v3,1/4] hw/arm/virt: Consider SMP configuration in CPU topology

Message ID 20220323072438.71815-2-gshan@redhat.com (mailing list archive)
State New, archived
Series hw/arm/virt: Fix CPU's default NUMA node ID

Commit Message

Gavin Shan March 23, 2022, 7:24 a.m. UTC
Currently, the SMP configuration isn't considered when the CPU
topology is populated. In this case, it's impossible to provide
the default CPU-to-NUMA mapping or association based on the socket
ID of the given CPU.

This takes account of SMP configuration when the CPU topology
is populated. The die ID for the given CPU isn't assigned since
it's not supported on arm/virt machine yet. Besides, the cluster
ID for the given CPU is assigned because it has been supported
on arm/virt machine.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt.c     | 11 +++++++++++
 qapi/machine.json |  6 ++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

Comments

Igor Mammedov March 25, 2022, 1:19 p.m. UTC | #1
On Wed, 23 Mar 2022 15:24:35 +0800
Gavin Shan <gshan@redhat.com> wrote:

> Currently, the SMP configuration isn't considered when the CPU
> topology is populated. In this case, it's impossible to provide
> the default CPU-to-NUMA mapping or association based on the socket
> ID of the given CPU.
> 
> This takes account of SMP configuration when the CPU topology
> is populated. The die ID for the given CPU isn't assigned since
> it's not supported on arm/virt machine yet. Besides, the cluster
> ID for the given CPU is assigned because it has been supported
> on arm/virt machine.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/arm/virt.c     | 11 +++++++++++
>  qapi/machine.json |  6 ++++--
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index d2e5ecd234..064eac42f7 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>      int n;
>      unsigned int max_cpus = ms->smp.max_cpus;
>      VirtMachineState *vms = VIRT_MACHINE(ms);
> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>  
>      if (ms->possible_cpus) {
>          assert(ms->possible_cpus->len == max_cpus);
> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[n].type = ms->cpu_type;
>          ms->possible_cpus->cpus[n].arch_id =
>              virt_cpu_mp_affinity(vms, n);
> +
> +        assert(!mc->smp_props.dies_supported);
> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[n].props.socket_id =
> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
> +        ms->possible_cpus->cpus[n].props.cluster_id =
> +            n / (ms->smp.cores * ms->smp.threads);

Is there any relation between the cluster values and number of clusters
here and what virt_cpu_mp_affinity() calculates?

> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;

>          ms->possible_cpus->cpus[n].props.has_thread_id = true;
>          ms->possible_cpus->cpus[n].props.thread_id = n;
Of course, the target has the right to decide how to allocate IDs, and mgmt
is supposed to query these IDs before using them.
But:
 * IDs within 'props' are supposed to be arch defined.
   (on x86, IDs are in range [0-smp.foo_id); on ppc it's something different)
   The question is what real hardware does here in the ARM case (i.e.
   how .../cores/threads are described on bare-metal)?
   
 * maybe related: looks like build_pptt() and build_madt() diverge on
   the meaning of 'ACPI Processor ID' and how it's generated.
   My understanding of 'ACPI Processor ID' is that it should match
   across all tables. So UIDs generated in build_pptt() look wrong to me.

 * maybe related: build_pptt() looks broken wrt core/thread, where it
   may create at the same time a leaf core with a leaf thread underneath it.
   Is such a description actually valid?


>      }
> diff --git a/qapi/machine.json b/qapi/machine.json
> index 42fc68403d..99c945f258 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -868,10 +868,11 @@
>  # @node-id: NUMA node ID the CPU belongs to
>  # @socket-id: socket number within node/board the CPU belongs to
>  # @die-id: die number within socket the CPU belongs to (since 4.1)
> -# @core-id: core number within die the CPU belongs to
> +# @cluster-id: cluster number within die the CPU belongs to
> +# @core-id: core number within cluster the CPU belongs to

s:cluster:cluster/die:

>  # @thread-id: thread number within core the CPU belongs to
>  #
> -# Note: currently there are 5 properties that could be present
> +# Note: currently there are 6 properties that could be present
>  #       but management should be prepared to pass through other
>  #       properties with device_add command to allow for future
>  #       interface extension. This also requires the filed names to be kept in
> @@ -883,6 +884,7 @@
>    'data': { '*node-id': 'int',
>              '*socket-id': 'int',
>              '*die-id': 'int',
> +            '*cluster-id': 'int',
>              '*core-id': 'int',
>              '*thread-id': 'int'
>    }
Gavin Shan March 25, 2022, 6:49 p.m. UTC | #2
Hi Igor,

On 3/25/22 9:19 PM, Igor Mammedov wrote:
> On Wed, 23 Mar 2022 15:24:35 +0800
> Gavin Shan <gshan@redhat.com> wrote:
>> Currently, the SMP configuration isn't considered when the CPU
>> topology is populated. In this case, it's impossible to provide
>> the default CPU-to-NUMA mapping or association based on the socket
>> ID of the given CPU.
>>
>> This takes account of SMP configuration when the CPU topology
>> is populated. The die ID for the given CPU isn't assigned since
>> it's not supported on arm/virt machine yet. Besides, the cluster
>> ID for the given CPU is assigned because it has been supported
>> on arm/virt machine.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/arm/virt.c     | 11 +++++++++++
>>   qapi/machine.json |  6 ++++--
>>   2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index d2e5ecd234..064eac42f7 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>       int n;
>>       unsigned int max_cpus = ms->smp.max_cpus;
>>       VirtMachineState *vms = VIRT_MACHINE(ms);
>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>   
>>       if (ms->possible_cpus) {
>>           assert(ms->possible_cpus->len == max_cpus);
>> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>           ms->possible_cpus->cpus[n].arch_id =
>>               virt_cpu_mp_affinity(vms, n);
>> +
>> +        assert(!mc->smp_props.dies_supported);
>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>> +        ms->possible_cpus->cpus[n].props.socket_id =
>> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
>> +        ms->possible_cpus->cpus[n].props.cluster_id =
>> +            n / (ms->smp.cores * ms->smp.threads);
> 
> are there any relation cluster values here and number of clusters with
> what virt_cpu_mp_affinity() calculates?
> 

They're different clusters. The cluster returned by virt_cpu_mp_affinity()
is reflected in the MPIDR_EL1 system register, which is mainly used by the
VGIC2/3 interrupt controller to send group interrupts to the CPU cluster.
Note that the value returned from virt_cpu_mp_affinity() is always
overridden by KVM, which means this value is only used by TCG for the
emulated GIC2/GIC3.

The cluster in 'ms->possible_cpus' is passed to the ACPI PPTT table to
populate the CPU topology.
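
To make the difference concrete, here is a minimal sketch (not the QEMU
code itself) that contrasts the two notions of "cluster". It assumes the
architectural MPIDR_EL1 layout with Aff0 in bits [7:0] and Aff1 in bits
[15:8]. The helper names and the cluster size parameter are hypothetical;
the cluster sizes of 8 and 16 used in the note below are only assumptions
for a GICv2-like and a GICv3-like setup.

```c
#include <assert.h>
#include <stdint.h>

/* Aff1 sits directly above the 8-bit Aff0 field in MPIDR_EL1. */
#define AFF1_SHIFT 8

/*
 * Hypothetical helper: pack a linear CPU index into Aff1/Aff0.
 * cluster_size is the number of CPUs sharing one Aff1 value and is
 * assumed to depend on the interrupt controller, not on -smp.
 */
static uint64_t mpidr_affinity(unsigned int idx, unsigned int cluster_size)
{
    assert(cluster_size > 0 && cluster_size <= 256);

    uint64_t aff1 = idx / cluster_size;
    uint64_t aff0 = idx % cluster_size;

    return (aff1 << AFF1_SHIFT) | aff0;
}

/*
 * Topology cluster as computed in this patch: it depends only on the
 * -smp cores/threads values and is what ends up in 'props.cluster_id'.
 */
static unsigned int topo_cluster_id(unsigned int idx,
                                    unsigned int cores,
                                    unsigned int threads)
{
    assert(cores > 0 && threads > 0);

    return idx / (cores * threads);
}
```

With an assumed cluster size of 8, CPU 9 gets Aff1 = 1 and Aff0 = 1, while
with cores = 2 and threads = 2 the same CPU lands in topology cluster 2.
The two values are derived independently of each other.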


>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
> 
>>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>           ms->possible_cpus->cpus[n].props.thread_id = n;
> of cause target has the right to decide how to allocate IDs, and mgmt
> is supposed to query these IDs before using them.
> But:
>   * IDs within 'props' are supposed to be arch defined.
>     (on x86 IDs in range [0-smp.foo_id), on ppc it something different)
>     Question is what real hardware does here in ARM case (i.e.
>     how .../cores/threads are described on bare-metal)?
>  

On an ARM64 bare-metal machine, the core/cluster ID assignment is pretty
arbitrary. I checked the CPU topology on my bare-metal machine, which has
the following SMP configuration.

     # lscpu
       :
     Thread(s) per core: 4
     Core(s) per socket: 28
     Socket(s):          2

     smp.sockets  = 2
     smp.clusters = 1
     smp.cores    = 56   (28 per socket)
     smp.threads  = 4

     // CPU0-111 belongs to socket0 or package0
     // CPU112-223 belongs to socket1 or package1
     # cat /sys/devices/system/cpu/cpu0/topology/package_cpus
     00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
     # cat /sys/devices/system/cpu/cpu111/topology/package_cpus
     00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
     # cat /sys/devices/system/cpu/cpu112/topology/package_cpus
     ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
     # cat /sys/devices/system/cpu/cpu223/topology/package_cpus
     ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000

     // core/cluster ID spans from 0 to 27 on socket0
     # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
     # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
     # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
     # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
     
     // However, core/cluster ID starts from 256 on socket1
     # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
     256 257 258 259 260 261 262 263 264 265 266 267 268 269
     270 271 272 273 274 275 276 277 278 279 280 281 282 283
     # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
     256 257 258 259 260 261 262 263 264 265 266 267 268 269
     270 271 272 273 274 275 276 277 278 279 280 281 282 283
     # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
     256 257 258 259 260 261 262 263 264 265 266 267 268 269
     270 271 272 273 274 275 276 277 278 279 280 281 282 283
     # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
     256 257 258 259 260 261 262 263 264 265 266 267 268 269
     270 271 272 273 274 275 276 277 278 279 280 281 282 283
    
>   * maybe related: looks like build_pptt() and build_madt() diverge on
>     the meaning of 'ACPI Processor ID' and how it's generated.
>     My understanding of 'ACPI Processor ID' is that it should match
>     across all tables. So UIDs generated in build_pptt() look wrong to me.
> 
>   * maybe related: build_pptt() looks broken wrt core/thread where it
>     may create at the same time a  leaf core with a leaf thread underneath it,
>     is such description actually valid?
> 

Yes, the UIDs in MADT/PPTT should match. I'm not sure if I missed anything
here, but I don't see how the UIDs in the MADT and PPTT tables diverge. In
both functions, 'thread_id' is taken as the UID.

In build_pptt(), when the entries for the cores become leaves, nothing is
pushed into @list, so @length is zero for the loop that creates entries for
the threads. In this case, no entries are created for threads.

> 
>>       }
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index 42fc68403d..99c945f258 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -868,10 +868,11 @@
>>   # @node-id: NUMA node ID the CPU belongs to
>>   # @socket-id: socket number within node/board the CPU belongs to
>>   # @die-id: die number within socket the CPU belongs to (since 4.1)
>> -# @core-id: core number within die the CPU belongs to
>> +# @cluster-id: cluster number within die the CPU belongs to
>> +# @core-id: core number within cluster the CPU belongs to
> 
> s:cluster:cluster/die:
> 

Ok. I will amend it as below in the next respin:

     # @core-id: core number within cluster/die the CPU belongs to

I'm not sure whether we need to make a similar change for 'cluster_id', like below?

    # @cluster-id: cluster number within die/socket the CPU belongs to
                                         ^^^^^^^^^^

>>   # @thread-id: thread number within core the CPU belongs to
>>   #
>> -# Note: currently there are 5 properties that could be present
>> +# Note: currently there are 6 properties that could be present
>>   #       but management should be prepared to pass through other
>>   #       properties with device_add command to allow for future
>>   #       interface extension. This also requires the filed names to be kept in
>> @@ -883,6 +884,7 @@
>>     'data': { '*node-id': 'int',
>>               '*socket-id': 'int',
>>               '*die-id': 'int',
>> +            '*cluster-id': 'int',
>>               '*core-id': 'int',
>>               '*thread-id': 'int'
>>     }

Thanks,
Gavin
Igor Mammedov March 30, 2022, 12:50 p.m. UTC | #3
On Sat, 26 Mar 2022 02:49:59 +0800
Gavin Shan <gshan@redhat.com> wrote:

> Hi Igor,
> 
> On 3/25/22 9:19 PM, Igor Mammedov wrote:
> > On Wed, 23 Mar 2022 15:24:35 +0800
> > Gavin Shan <gshan@redhat.com> wrote:  
> >> Currently, the SMP configuration isn't considered when the CPU
> >> topology is populated. In this case, it's impossible to provide
> >> the default CPU-to-NUMA mapping or association based on the socket
> >> ID of the given CPU.
> >>
> >> This takes account of SMP configuration when the CPU topology
> >> is populated. The die ID for the given CPU isn't assigned since
> >> it's not supported on arm/virt machine yet. Besides, the cluster
> >> ID for the given CPU is assigned because it has been supported
> >> on arm/virt machine.
> >>
> >> Signed-off-by: Gavin Shan <gshan@redhat.com>
> >> ---
> >>   hw/arm/virt.c     | 11 +++++++++++
> >>   qapi/machine.json |  6 ++++--
> >>   2 files changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >> index d2e5ecd234..064eac42f7 100644
> >> --- a/hw/arm/virt.c
> >> +++ b/hw/arm/virt.c
> >> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >>       int n;
> >>       unsigned int max_cpus = ms->smp.max_cpus;
> >>       VirtMachineState *vms = VIRT_MACHINE(ms);
> >> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
> >>   
> >>       if (ms->possible_cpus) {
> >>           assert(ms->possible_cpus->len == max_cpus);
> >> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >>           ms->possible_cpus->cpus[n].arch_id =
> >>               virt_cpu_mp_affinity(vms, n);
> >> +
> >> +        assert(!mc->smp_props.dies_supported);
> >> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> >> +        ms->possible_cpus->cpus[n].props.socket_id =
> >> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> >> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
> >> +        ms->possible_cpus->cpus[n].props.cluster_id =
> >> +            n / (ms->smp.cores * ms->smp.threads);  
> > 
> > are there any relation cluster values here and number of clusters with
> > what virt_cpu_mp_affinity() calculates?
> >   
> 
> They're different clusters. The cluster returned by virt_cpu_mp_affinity()
> is reflected to MPIDR_EL1 system register, which is mainly used by VGIC2/3
> interrupt controller to send send group interrupts to the CPU cluster. It's
> notable that the value returned from virt_cpu_mp_affinity() is always
> overrided by KVM. It means this value is only used by TCG for the emulated
> GIC2/GIC3.
> 
> The cluster in 'ms->possible_cpus' is passed to ACPI PPTT table to populate
> the CPU topology.
> 
> 
> >> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> >> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;  
> >   
> >>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >>           ms->possible_cpus->cpus[n].props.thread_id = n;  
> > of cause target has the right to decide how to allocate IDs, and mgmt
> > is supposed to query these IDs before using them.
> > But:
> >   * IDs within 'props' are supposed to be arch defined.
> >     (on x86 IDs in range [0-smp.foo_id), on ppc it something different)
> >     Question is what real hardware does here in ARM case (i.e.
> >     how .../cores/threads are described on bare-metal)?
> >    
> 
> On ARM64 bare-metal machine, the core/cluster ID assignment is pretty arbitrary.
> I checked the CPU topology on my bare-metal machine, which has following SMP
> configurations.
> 
>      # lscpu
>        :
>      Thread(s) per core: 4
>      Core(s) per socket: 28
>      Socket(s):          2
> 
>      smp.sockets  = 2
>      smp.clusters = 1
>      smp.cores    = 56   (28 per socket)
>      smp.threads  = 4
> 
>      // CPU0-111 belongs to socket0 or package0
>      // CPU112-223 belongs to socket1 or package1
>      # cat /sys/devices/system/cpu/cpu0/topology/package_cpus
>      00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>      # cat /sys/devices/system/cpu/cpu111/topology/package_cpus
>      00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>      # cat /sys/devices/system/cpu/cpu112/topology/package_cpus
>      ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
>      # cat /sys/devices/system/cpu/cpu223/topology/package_cpus
>      ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
> 
>      // core/cluster ID spans from 0 to 27 on socket0
>      # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>      # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>      # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>      # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>      
>      // However, core/cluster ID starts from 256 on socket1
>      # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>      256 257 258 259 260 261 262 263 264 265 266 267 268 269
>      270 271 272 273 274 275 276 277 278 279 280 281 282 283
>      # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>      256 257 258 259 260 261 262 263 264 265 266 267 268 269
>      270 271 272 273 274 275 276 277 278 279 280 281 282 283
>      # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>      256 257 258 259 260 261 262 263 264 265 266 267 268 269
>      270 271 272 273 274 275 276 277 278 279 280 281 282 283
>      # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>      256 257 258 259 260 261 262 263 264 265 266 267 268 269
>      270 271 272 273 274 275 276 277 278 279 280 281 282 283

So it seems that IDs can repeat within a socket.
If there is no arch-defined way and there are no other objections, it might
be better to stick to what x86 does for consistency (i.e. socket/die/
cluster/core/thread IDs are in range [0..x), including thread-id being
in range [0..threads)) instead of inventing an arm/virt-specific scheme.

>     
> >   * maybe related: looks like build_pptt() and build_madt() diverge on
> >     the meaning of 'ACPI Processor ID' and how it's generated.
> >     My understanding of 'ACPI Processor ID' is that it should match
> >     across all tables. So UIDs generated in build_pptt() look wrong to me.
> > 
> >   * maybe related: build_pptt() looks broken wrt core/thread where it
> >     may create at the same time a  leaf core with a leaf thread underneath it,
> >     is such description actually valid?
> >   
> 
> Yes, the UIDs in MADT/PPTT should match. I'm not sure if I missed anything here.
> I don't see how the UID in MADT and PPTT table are diverged. In both functions,
> 'thread_id' is taken as UID.
> 
> In build_pptt(), when the entries for the cores becomes leaf, nothing will be
> pushed into @list, @length becomes zero for the loop to create entries for
> the threads. In this case, we won't have any entries created for threads.
> 
> >   
> >>       }
> >> diff --git a/qapi/machine.json b/qapi/machine.json
> >> index 42fc68403d..99c945f258 100644
> >> --- a/qapi/machine.json
> >> +++ b/qapi/machine.json
> >> @@ -868,10 +868,11 @@
> >>   # @node-id: NUMA node ID the CPU belongs to
> >>   # @socket-id: socket number within node/board the CPU belongs to
> >>   # @die-id: die number within socket the CPU belongs to (since 4.1)
> >> -# @core-id: core number within die the CPU belongs to
> >> +# @cluster-id: cluster number within die the CPU belongs to
> >> +# @core-id: core number within cluster the CPU belongs to  
> > 
> > s:cluster:cluster/die:
> >   
> 
> Ok. I will amend it like below in next respin:
> 
>      # @core-id: core number within cluster/die the CPU belongs to
> 
> I'm not sure if we need make similar changes for 'cluster_id' like below?
> 
>     # @cluster-id: cluster number within die/socket the CPU belongs to
>                                           ^^^^^^^^^^

maybe postpone it till die is supported?

> 
> >>   # @thread-id: thread number within core the CPU belongs to
> >>   #
> >> -# Note: currently there are 5 properties that could be present
> >> +# Note: currently there are 6 properties that could be present
> >>   #       but management should be prepared to pass through other
> >>   #       properties with device_add command to allow for future
> >>   #       interface extension. This also requires the filed names to be kept in
> >> @@ -883,6 +884,7 @@
> >>     'data': { '*node-id': 'int',
> >>               '*socket-id': 'int',
> >>               '*die-id': 'int',
> >> +            '*cluster-id': 'int',
> >>               '*core-id': 'int',
> >>               '*thread-id': 'int'
> >>     }  
> 
> Thanks,
> Gavin
>
Igor Mammedov March 30, 2022, 1:18 p.m. UTC | #4
On Wed, 23 Mar 2022 15:24:35 +0800
Gavin Shan <gshan@redhat.com> wrote:

> Currently, the SMP configuration isn't considered when the CPU
> topology is populated. In this case, it's impossible to provide
> the default CPU-to-NUMA mapping or association based on the socket
> ID of the given CPU.
> 
> This takes account of SMP configuration when the CPU topology
> is populated. The die ID for the given CPU isn't assigned since
> it's not supported on arm/virt machine yet. Besides, the cluster
> ID for the given CPU is assigned because it has been supported
> on arm/virt machine.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/arm/virt.c     | 11 +++++++++++
>  qapi/machine.json |  6 ++++--
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index d2e5ecd234..064eac42f7 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>      int n;
>      unsigned int max_cpus = ms->smp.max_cpus;
>      VirtMachineState *vms = VIRT_MACHINE(ms);
> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>  
>      if (ms->possible_cpus) {
>          assert(ms->possible_cpus->len == max_cpus);
> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[n].type = ms->cpu_type;
>          ms->possible_cpus->cpus[n].arch_id =
>              virt_cpu_mp_affinity(vms, n);
> +
> +        assert(!mc->smp_props.dies_supported);
> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[n].props.socket_id =
> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
> +        ms->possible_cpus->cpus[n].props.cluster_id =
> +            n / (ms->smp.cores * ms->smp.threads);
> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>          ms->possible_cpus->cpus[n].props.has_thread_id = true;
>          ms->possible_cpus->cpus[n].props.thread_id = n;

Shouldn't the above values be calculated similarly to the way they are
calculated in x86_topo_ids_from_idx()? (Note the '% foo' part.)
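
A minimal sketch of what per-level IDs with the '% foo' part could look
like, assuming no die level (as on arm/virt today). The struct and the
function name are hypothetical and only illustrate the arithmetic; this is
not the x86 helper itself.

```c
#include <assert.h>

/* Hypothetical container for the per-level topology IDs of one CPU. */
struct topo_ids {
    unsigned int socket_id;
    unsigned int cluster_id;
    unsigned int core_id;
    unsigned int thread_id;
};

/*
 * Split a linear CPU index n into IDs that are each relative to the
 * parent level, so every ID stays in [0..count_of_that_level).
 */
static struct topo_ids topo_ids_from_idx(unsigned int n,
                                         unsigned int clusters,
                                         unsigned int cores,
                                         unsigned int threads)
{
    struct topo_ids ids;

    assert(clusters > 0 && cores > 0 && threads > 0);

    ids.thread_id  = n % threads;
    ids.core_id    = (n / threads) % cores;
    ids.cluster_id = (n / (cores * threads)) % clusters;
    ids.socket_id  = n / (clusters * cores * threads);

    return ids;
}
```

For example, with clusters = 2, cores = 2, threads = 2 and n = 13, the
formulas in the patch give cluster_id = 3, core_id = 6 and thread_id = 13,
whereas the sketch gives socket_id = 1, cluster_id = 1, core_id = 0 and
thread_id = 1.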

>      }
> diff --git a/qapi/machine.json b/qapi/machine.json
> index 42fc68403d..99c945f258 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -868,10 +868,11 @@
>  # @node-id: NUMA node ID the CPU belongs to
>  # @socket-id: socket number within node/board the CPU belongs to
>  # @die-id: die number within socket the CPU belongs to (since 4.1)
> -# @core-id: core number within die the CPU belongs to
> +# @cluster-id: cluster number within die the CPU belongs to
> +# @core-id: core number within cluster the CPU belongs to
>  # @thread-id: thread number within core the CPU belongs to
>  #
> -# Note: currently there are 5 properties that could be present
> +# Note: currently there are 6 properties that could be present
>  #       but management should be prepared to pass through other
>  #       properties with device_add command to allow for future
>  #       interface extension. This also requires the filed names to be kept in
> @@ -883,6 +884,7 @@
>    'data': { '*node-id': 'int',
>              '*socket-id': 'int',
>              '*die-id': 'int',
> +            '*cluster-id': 'int',
>              '*core-id': 'int',
>              '*thread-id': 'int'
>    }
Yanan April 2, 2022, 2:17 a.m. UTC | #5
Hi Gavin,

On 2022/3/23 15:24, Gavin Shan wrote:
> Currently, the SMP configuration isn't considered when the CPU
> topology is populated. In this case, it's impossible to provide
> the default CPU-to-NUMA mapping or association based on the socket
> ID of the given CPU.
>
> This takes account of SMP configuration when the CPU topology
> is populated. The die ID for the given CPU isn't assigned since
> it's not supported on arm/virt machine yet. Besides, the cluster
> ID for the given CPU is assigned because it has been supported
> on arm/virt machine.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>   hw/arm/virt.c     | 11 +++++++++++
>   qapi/machine.json |  6 ++++--
>   2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index d2e5ecd234..064eac42f7 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>       int n;
>       unsigned int max_cpus = ms->smp.max_cpus;
>       VirtMachineState *vms = VIRT_MACHINE(ms);
> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>   
>       if (ms->possible_cpus) {
>           assert(ms->possible_cpus->len == max_cpus);
> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>           ms->possible_cpus->cpus[n].arch_id =
>               virt_cpu_mp_affinity(vms, n);
> +
> +        assert(!mc->smp_props.dies_supported);
> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[n].props.socket_id =
> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
> +        ms->possible_cpus->cpus[n].props.cluster_id =
> +            n / (ms->smp.cores * ms->smp.threads);
> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
>           ms->possible_cpus->cpus[n].props.thread_id = n;
>       }
> diff --git a/qapi/machine.json b/qapi/machine.json
> index 42fc68403d..99c945f258 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -868,10 +868,11 @@
>   # @node-id: NUMA node ID the CPU belongs to
>   # @socket-id: socket number within node/board the CPU belongs to
>   # @die-id: die number within socket the CPU belongs to (since 4.1)
> -# @core-id: core number within die the CPU belongs to
> +# @cluster-id: cluster number within die the CPU belongs to
> +# @core-id: core number within cluster the CPU belongs to
>   # @thread-id: thread number within core the CPU belongs to
>   #
> -# Note: currently there are 5 properties that could be present
> +# Note: currently there are 6 properties that could be present
>   #       but management should be prepared to pass through other
>   #       properties with device_add command to allow for future
>   #       interface extension. This also requires the filed names to be kept in
> @@ -883,6 +884,7 @@
>     'data': { '*node-id': 'int',
>               '*socket-id': 'int',
>               '*die-id': 'int',
> +            '*cluster-id': 'int',
>               '*core-id': 'int',
>               '*thread-id': 'int'
>     }
Since a new cluster-id is introduced, you may want to check whether
machine_set_cpu_numa_node() and hmp_hotpluggable_cpus(), which both deal
with topo-ids, need to be updated accordingly. If they do, it would be
easier to review if the whole cluster-id introduction were a separate
patch.
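
As a rough illustration of the kind of update meant here, the sketch below
matches a requested set of topo-ids against a CPU slot and includes the
cluster level. The struct and function are hypothetical stand-ins, not the
actual QEMU types or the body of machine_set_cpu_numa_node().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the CPU instance properties. */
struct cpu_props {
    bool has_socket_id, has_cluster_id, has_core_id, has_thread_id;
    int socket_id, cluster_id, core_id, thread_id;
};

/*
 * Return true if every ID specified in 'want' equals the one in 'slot'.
 * IDs that 'want' leaves unspecified act as wildcards.
 */
static bool cpu_props_match(const struct cpu_props *want,
                            const struct cpu_props *slot)
{
    assert(want && slot);

    if (want->has_socket_id && want->socket_id != slot->socket_id) {
        return false;
    }
    /* The new level: without this check, cluster-id would be ignored. */
    if (want->has_cluster_id && want->cluster_id != slot->cluster_id) {
        return false;
    }
    if (want->has_core_id && want->core_id != slot->core_id) {
        return false;
    }
    if (want->has_thread_id && want->thread_id != slot->thread_id) {
        return false;
    }
    return true;
}
```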

Thanks,
Yanan
Denis V. Lunev" via April 2, 2022, 2:27 a.m. UTC | #6
On 2022/3/30 20:50, Igor Mammedov wrote:
> On Sat, 26 Mar 2022 02:49:59 +0800
> Gavin Shan <gshan@redhat.com> wrote:
>
>> Hi Igor,
>>
>> On 3/25/22 9:19 PM, Igor Mammedov wrote:
>>> On Wed, 23 Mar 2022 15:24:35 +0800
>>> Gavin Shan <gshan@redhat.com> wrote:
>>>> Currently, the SMP configuration isn't considered when the CPU
>>>> topology is populated. In this case, it's impossible to provide
>>>> the default CPU-to-NUMA mapping or association based on the socket
>>>> ID of the given CPU.
>>>>
>>>> This takes account of SMP configuration when the CPU topology
>>>> is populated. The die ID for the given CPU isn't assigned since
>>>> it's not supported on arm/virt machine yet. Besides, the cluster
>>>> ID for the given CPU is assigned because it has been supported
>>>> on arm/virt machine.
>>>>
>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>> ---
>>>>    hw/arm/virt.c     | 11 +++++++++++
>>>>    qapi/machine.json |  6 ++++--
>>>>    2 files changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index d2e5ecd234..064eac42f7 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>        int n;
>>>>        unsigned int max_cpus = ms->smp.max_cpus;
>>>>        VirtMachineState *vms = VIRT_MACHINE(ms);
>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>    
>>>>        if (ms->possible_cpus) {
>>>>            assert(ms->possible_cpus->len == max_cpus);
>>>> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>            ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>            ms->possible_cpus->cpus[n].arch_id =
>>>>                virt_cpu_mp_affinity(vms, n);
>>>> +
>>>> +        assert(!mc->smp_props.dies_supported);
>>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>>> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.cluster_id =
>>>> +            n / (ms->smp.cores * ms->smp.threads);
>>> are there any relation cluster values here and number of clusters with
>>> what virt_cpu_mp_affinity() calculates?
>>>    
>> They're different clusters. The cluster returned by virt_cpu_mp_affinity()
>> is reflected to MPIDR_EL1 system register, which is mainly used by VGIC2/3
>> interrupt controller to send group interrupts to the CPU cluster. It's
>> notable that the value returned from virt_cpu_mp_affinity() is always
>> overridden by KVM. It means this value is only used by TCG for the emulated
>> GIC2/GIC3.
>>
>> The cluster in 'ms->possible_cpus' is passed to ACPI PPTT table to populate
>> the CPU topology.
>>
>>
>>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>    
>>>>            ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>            ms->possible_cpus->cpus[n].props.thread_id = n;
>>> of course the target has the right to decide how to allocate IDs, and mgmt
>>> is supposed to query these IDs before using them.
>>> But:
>>>    * IDs within 'props' are supposed to be arch defined.
>>>      (on x86 IDs in range [0-smp.foo_id), on ppc it something different)
>>>      Question is what real hardware does here in ARM case (i.e.
>>>      how .../cores/threads are described on bare-metal)?
>>>     
>> On ARM64 bare-metal machine, the core/cluster ID assignment is pretty arbitrary.
>> I checked the CPU topology on my bare-metal machine, which has following SMP
>> configurations.
>>
>>       # lscpu
>>         :
>>       Thread(s) per core: 4
>>       Core(s) per socket: 28
>>       Socket(s):          2
>>
>>       smp.sockets  = 2
>>       smp.clusters = 1
>>       smp.cores    = 56   (28 per socket)
>>       smp.threads  = 4
>>
>>       // CPU0-111 belongs to socket0 or package0
>>       // CPU112-223 belongs to socket1 or package1
>>       # cat /sys/devices/system/cpu/cpu0/topology/package_cpus
>>       00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>>       # cat /sys/devices/system/cpu/cpu111/topology/package_cpus
>>       00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>>       # cat /sys/devices/system/cpu/cpu112/topology/package_cpus
>>       ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
>>       # cat /sys/devices/system/cpu/cpu223/topology/package_cpus
>>       ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
>>
>>       // core/cluster ID spans from 0 to 27 on socket0
>>       # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       
>>       // However, core/cluster ID starts from 256 on socket1
>>       # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
> so it seems that IDs are repeatable within a socket.
> If there is no arch-defined way or other objections it might be better
> to stick to what x86 does for consistency reasons  (i.e. socket/die/
> cluster/core/thread are in range [0..x) including thread-id being
> in range [0..threads) ) instead of inventing arm/virt specific scheme.
Agreed.
>>      
>>>    * maybe related: looks like build_pptt() and build_madt() diverge on
>>>      the meaning of 'ACPI Processor ID' and how it's generated.
>>>      My understanding of 'ACPI Processor ID' is that it should match
>>>      across all tables. So UIDs generated in build_pptt() look wrong to me.
>>>
>>>    * maybe related: build_pptt() looks broken wrt core/thread where it
>>>      may create at the same time a  leaf core with a leaf thread underneath it,
>>>      is such description actually valid?
>>>    
>> Yes, the UIDs in MADT/PPTT should match. I'm not sure if I missed anything here.
>> I don't see how the UID in MADT and PPTT table are diverged. In both functions,
>> 'thread_id' is taken as UID.
>>
>> In build_pptt(), when the entries for the cores becomes leaf, nothing will be
>> pushed into @list, @length becomes zero for the loop to create entries for
>> the threads. In this case, we won't have any entries created for threads.
>>
>>>    
>>>>        }
>>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>>> index 42fc68403d..99c945f258 100644
>>>> --- a/qapi/machine.json
>>>> +++ b/qapi/machine.json
>>>> @@ -868,10 +868,11 @@
>>>>    # @node-id: NUMA node ID the CPU belongs to
>>>>    # @socket-id: socket number within node/board the CPU belongs to
>>>>    # @die-id: die number within socket the CPU belongs to (since 4.1)
>>>> -# @core-id: core number within die the CPU belongs to
>>>> +# @cluster-id: cluster number within die the CPU belongs to
>>>> +# @core-id: core number within cluster the CPU belongs to
>>> s:cluster:cluster/die:
>>>    
>> Ok. I will amend it like below in next respin:
>>
>>       # @core-id: core number within cluster/die the CPU belongs to
>>
>> I'm not sure if we need make similar changes for 'cluster_id' like below?
>>
>>      # @cluster-id: cluster number within die/socket the CPU belongs to
>>                                            ^^^^^^^^^^
> maybe postpone it till die is supported?
>
>>>>    # @thread-id: thread number within core the CPU belongs to
>>>>    #
>>>> -# Note: currently there are 5 properties that could be present
>>>> +# Note: currently there are 6 properties that could be present
>>>>    #       but management should be prepared to pass through other
>>>>    #       properties with device_add command to allow for future
>>>>    #       interface extension. This also requires the filed names to be kept in
>>>> @@ -883,6 +884,7 @@
>>>>      'data': { '*node-id': 'int',
>>>>                '*socket-id': 'int',
>>>>                '*die-id': 'int',
>>>> +            '*cluster-id': 'int',
>>>>                '*core-id': 'int',
>>>>                '*thread-id': 'int'
>>>>      }
>> Thanks,
>> Gavin
>>
> .
Gavin Shan April 3, 2022, 10:46 a.m. UTC | #7
Hi Igor,

On 3/30/22 8:50 PM, Igor Mammedov wrote:
> On Sat, 26 Mar 2022 02:49:59 +0800
> Gavin Shan <gshan@redhat.com> wrote:
>> On 3/25/22 9:19 PM, Igor Mammedov wrote:
>>> On Wed, 23 Mar 2022 15:24:35 +0800
>>> Gavin Shan <gshan@redhat.com> wrote:
>>>> Currently, the SMP configuration isn't considered when the CPU
>>>> topology is populated. In this case, it's impossible to provide
>>>> the default CPU-to-NUMA mapping or association based on the socket
>>>> ID of the given CPU.
>>>>
>>>> This takes account of SMP configuration when the CPU topology
>>>> is populated. The die ID for the given CPU isn't assigned since
>>>> it's not supported on arm/virt machine yet. Besides, the cluster
>>>> ID for the given CPU is assigned because it has been supported
>>>> on arm/virt machine.
>>>>
>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>> ---
>>>>    hw/arm/virt.c     | 11 +++++++++++
>>>>    qapi/machine.json |  6 ++++--
>>>>    2 files changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index d2e5ecd234..064eac42f7 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>        int n;
>>>>        unsigned int max_cpus = ms->smp.max_cpus;
>>>>        VirtMachineState *vms = VIRT_MACHINE(ms);
>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>    
>>>>        if (ms->possible_cpus) {
>>>>            assert(ms->possible_cpus->len == max_cpus);
>>>> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>            ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>            ms->possible_cpus->cpus[n].arch_id =
>>>>                virt_cpu_mp_affinity(vms, n);
>>>> +
>>>> +        assert(!mc->smp_props.dies_supported);
>>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>>> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.cluster_id =
>>>> +            n / (ms->smp.cores * ms->smp.threads);
>>>
>>> are there any relation cluster values here and number of clusters with
>>> what virt_cpu_mp_affinity() calculates?
>>>    
>>
>> They're different clusters. The cluster returned by virt_cpu_mp_affinity()
>> is reflected to MPIDR_EL1 system register, which is mainly used by VGIC2/3
>> interrupt controller to send group interrupts to the CPU cluster. It's
>> notable that the value returned from virt_cpu_mp_affinity() is always
>> overridden by KVM. It means this value is only used by TCG for the emulated
>> GIC2/GIC3.
>>
>> The cluster in 'ms->possible_cpus' is passed to ACPI PPTT table to populate
>> the CPU topology.
>>
>>
>>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>    
>>>>            ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>            ms->possible_cpus->cpus[n].props.thread_id = n;
>>> of course the target has the right to decide how to allocate IDs, and mgmt
>>> is supposed to query these IDs before using them.
>>> But:
>>>    * IDs within 'props' are supposed to be arch defined.
>>>      (on x86 IDs in range [0-smp.foo_id), on ppc it something different)
>>>      Question is what real hardware does here in ARM case (i.e.
>>>      how .../cores/threads are described on bare-metal)?
>>>     
>>
>> On ARM64 bare-metal machine, the core/cluster ID assignment is pretty arbitrary.
>> I checked the CPU topology on my bare-metal machine, which has following SMP
>> configurations.
>>
>>       # lscpu
>>         :
>>       Thread(s) per core: 4
>>       Core(s) per socket: 28
>>       Socket(s):          2
>>
>>       smp.sockets  = 2
>>       smp.clusters = 1
>>       smp.cores    = 56   (28 per socket)
>>       smp.threads  = 4
>>
>>       // CPU0-111 belongs to socket0 or package0
>>       // CPU112-223 belongs to socket1 or package1
>>       # cat /sys/devices/system/cpu/cpu0/topology/package_cpus
>>       00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>>       # cat /sys/devices/system/cpu/cpu111/topology/package_cpus
>>       00000000,00000000,00000000,0000ffff,ffffffff,ffffffff,ffffffff
>>       # cat /sys/devices/system/cpu/cpu112/topology/package_cpus
>>       ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
>>       # cat /sys/devices/system/cpu/cpu223/topology/package_cpus
>>       ffffffff,ffffffff,ffffffff,ffff0000,00000000,00000000,00000000
>>
>>       // core/cluster ID spans from 0 to 27 on socket0
>>       # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 0 27`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       # for i in `seq 28 55`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>>       
>>       // However, core/cluster ID starts from 256 on socket1
>>       # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/core_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 112 139`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
>>       # for i in `seq 140 167`; do cat /sys/devices/system/cpu/cpu$i/topology/cluster_id; done
>>       256 257 258 259 260 261 262 263 264 265 266 267 268 269
>>       270 271 272 273 274 275 276 277 278 279 280 281 282 283
> 
> so it seems that IDs are repeatable within a socket.
> If there is no arch-defined way or other objections it might be better
> to stick to what x86 does for consistency reasons  (i.e. socket/die/
> cluster/core/thread are in range [0..x) including thread-id being
> in range [0..threads) ) instead of inventing arm/virt specific scheme.
> 

Agreed.

>>      
>>>    * maybe related: looks like build_pptt() and build_madt() diverge on
>>>      the meaning of 'ACPI Processor ID' and how it's generated.
>>>      My understanding of 'ACPI Processor ID' is that it should match
>>>      across all tables. So UIDs generated in build_pptt() look wrong to me.
>>>
>>>    * maybe related: build_pptt() looks broken wrt core/thread where it
>>>      may create at the same time a  leaf core with a leaf thread underneath it,
>>>      is such description actually valid?
>>>    
>>
>> Yes, the UIDs in MADT/PPTT should match. I'm not sure if I missed anything here.
>> I don't see how the UID in MADT and PPTT table are diverged. In both functions,
>> 'thread_id' is taken as UID.
>>
>> In build_pptt(), when the entries for the cores becomes leaf, nothing will be
>> pushed into @list, @length becomes zero for the loop to create entries for
>> the threads. In this case, we won't have any entries created for threads.
>>
>>>    
>>>>        }
>>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>>> index 42fc68403d..99c945f258 100644
>>>> --- a/qapi/machine.json
>>>> +++ b/qapi/machine.json
>>>> @@ -868,10 +868,11 @@
>>>>    # @node-id: NUMA node ID the CPU belongs to
>>>>    # @socket-id: socket number within node/board the CPU belongs to
>>>>    # @die-id: die number within socket the CPU belongs to (since 4.1)
>>>> -# @core-id: core number within die the CPU belongs to
>>>> +# @cluster-id: cluster number within die the CPU belongs to
>>>> +# @core-id: core number within cluster the CPU belongs to
>>>
>>> s:cluster:cluster/die:
>>>    
>>
>> Ok. I will amend it like below in next respin:
>>
>>       # @core-id: core number within cluster/die the CPU belongs to
>>
>> I'm not sure if we need make similar changes for 'cluster_id' like below?
>>
>>      # @cluster-id: cluster number within die/socket the CPU belongs to
>>                                            ^^^^^^^^^^
> 
> maybe postpone it till die is supported?
> 

Ok. Let's postpone changing the description of 'cluster-id' until
die is supported. So only the description of 'core-id' will
be amended in v4, as you suggested.

>>
>>>>    # @thread-id: thread number within core the CPU belongs to
>>>>    #
>>>> -# Note: currently there are 5 properties that could be present
>>>> +# Note: currently there are 6 properties that could be present
>>>>    #       but management should be prepared to pass through other
>>>>    #       properties with device_add command to allow for future
>>>>    #       interface extension. This also requires the filed names to be kept in
>>>> @@ -883,6 +884,7 @@
>>>>      'data': { '*node-id': 'int',
>>>>                '*socket-id': 'int',
>>>>                '*die-id': 'int',
>>>> +            '*cluster-id': 'int',
>>>>                '*core-id': 'int',
>>>>                '*thread-id': 'int'
>>>>      }

Thanks,
Gavin
Gavin Shan April 3, 2022, 10:48 a.m. UTC | #8
Hi Igor,

On 3/30/22 9:18 PM, Igor Mammedov wrote:
> On Wed, 23 Mar 2022 15:24:35 +0800
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> Currently, the SMP configuration isn't considered when the CPU
>> topology is populated. In this case, it's impossible to provide
>> the default CPU-to-NUMA mapping or association based on the socket
>> ID of the given CPU.
>>
>> This takes account of SMP configuration when the CPU topology
>> is populated. The die ID for the given CPU isn't assigned since
>> it's not supported on arm/virt machine yet. Besides, the cluster
>> ID for the given CPU is assigned because it has been supported
>> on arm/virt machine.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/arm/virt.c     | 11 +++++++++++
>>   qapi/machine.json |  6 ++++--
>>   2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index d2e5ecd234..064eac42f7 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>       int n;
>>       unsigned int max_cpus = ms->smp.max_cpus;
>>       VirtMachineState *vms = VIRT_MACHINE(ms);
>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>   
>>       if (ms->possible_cpus) {
>>           assert(ms->possible_cpus->len == max_cpus);
>> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>           ms->possible_cpus->cpus[n].arch_id =
>>               virt_cpu_mp_affinity(vms, n);
>> +
>> +        assert(!mc->smp_props.dies_supported);
>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>> +        ms->possible_cpus->cpus[n].props.socket_id =
>> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
>> +        ms->possible_cpus->cpus[n].props.cluster_id =
>> +            n / (ms->smp.cores * ms->smp.threads);
>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>           ms->possible_cpus->cpus[n].props.thread_id = n;
> 
> shouldn't be above values calculated similar to the way they are
> calculated in x86_topo_ids_from_idx()? /note '% foo' part/
> 

I think it's fine not to have '% foo' here. However, I think it'd
be better to have a similar '% foo' as x86 does. I will add this part
in v4.
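
To make the '% foo' form concrete, here is a minimal sketch of how a flat
CPU index could be split into per-level IDs so that each ID stays within
its own level (thread-id in [0..threads), core-id in [0..cores), and so on).
The struct and function names are hypothetical and only for illustration;
this is not the code of x86_topo_ids_from_idx() and not necessarily what v4
will contain. Dies are left out because arm/virt doesn't support them yet.

```c
#include <assert.h>

/* Hypothetical container for the per-level IDs; not a QEMU type. */
struct topo_ids {
    unsigned int socket_id;
    unsigned int cluster_id;
    unsigned int core_id;
    unsigned int thread_id;
};

/*
 * Split the flat CPU index 'n' into per-level IDs. Dividing by the number
 * of CPUs below a level selects that level, and the modulo keeps the ID
 * inside the range of its parent: thread_id in [0, threads),
 * core_id in [0, cores), cluster_id in [0, clusters).
 */
static struct topo_ids topo_ids_from_idx(unsigned int n,
                                         unsigned int clusters,
                                         unsigned int cores,
                                         unsigned int threads)
{
    struct topo_ids ids;

    /* Every level must hold at least one member to avoid dividing by zero. */
    assert(clusters > 0 && cores > 0 && threads > 0);

    ids.thread_id = n % threads;
    ids.core_id = (n / threads) % cores;
    ids.cluster_id = (n / (threads * cores)) % clusters;
    /* The socket is the top level here, so it needs no modulo. */
    ids.socket_id = n / (threads * cores * clusters);

    return ids;
}
```

With clusters=2, cores=4 and threads=2, CPU index 21 maps to socket 1,
cluster 0, core 2, thread 1, whereas the plain divisions in v3 would give
cluster 2, core 10 and thread 21 for the same CPU.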

>>       }
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index 42fc68403d..99c945f258 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -868,10 +868,11 @@
>>   # @node-id: NUMA node ID the CPU belongs to
>>   # @socket-id: socket number within node/board the CPU belongs to
>>   # @die-id: die number within socket the CPU belongs to (since 4.1)
>> -# @core-id: core number within die the CPU belongs to
>> +# @cluster-id: cluster number within die the CPU belongs to
>> +# @core-id: core number within cluster the CPU belongs to
>>   # @thread-id: thread number within core the CPU belongs to
>>   #
>> -# Note: currently there are 5 properties that could be present
>> +# Note: currently there are 6 properties that could be present
>>   #       but management should be prepared to pass through other
>>   #       properties with device_add command to allow for future
>>   #       interface extension. This also requires the filed names to be kept in
>> @@ -883,6 +884,7 @@
>>     'data': { '*node-id': 'int',
>>               '*socket-id': 'int',
>>               '*die-id': 'int',
>> +            '*cluster-id': 'int',
>>               '*core-id': 'int',
>>               '*thread-id': 'int'
>>     }

Thanks,
Gavin
Gavin Shan April 3, 2022, 11:55 a.m. UTC | #9
Hi Yanan,

On 4/2/22 10:17 AM, wangyanan (Y) wrote:
> On 2022/3/23 15:24, Gavin Shan wrote:
>> Currently, the SMP configuration isn't considered when the CPU
>> topology is populated. In this case, it's impossible to provide
>> the default CPU-to-NUMA mapping or association based on the socket
>> ID of the given CPU.
>>
>> This takes account of SMP configuration when the CPU topology
>> is populated. The die ID for the given CPU isn't assigned since
>> it's not supported on arm/virt machine yet. Besides, the cluster
>> ID for the given CPU is assigned because it has been supported
>> on arm/virt machine.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/arm/virt.c     | 11 +++++++++++
>>   qapi/machine.json |  6 ++++--
>>   2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index d2e5ecd234..064eac42f7 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>       int n;
>>       unsigned int max_cpus = ms->smp.max_cpus;
>>       VirtMachineState *vms = VIRT_MACHINE(ms);
>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>       if (ms->possible_cpus) {
>>           assert(ms->possible_cpus->len == max_cpus);
>> @@ -2518,6 +2519,16 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>           ms->possible_cpus->cpus[n].arch_id =
>>               virt_cpu_mp_affinity(vms, n);
>> +
>> +        assert(!mc->smp_props.dies_supported);
>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>> +        ms->possible_cpus->cpus[n].props.socket_id =
>> +            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>> +        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
>> +        ms->possible_cpus->cpus[n].props.cluster_id =
>> +            n / (ms->smp.cores * ms->smp.threads);
>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>           ms->possible_cpus->cpus[n].props.thread_id = n;
>>       }
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index 42fc68403d..99c945f258 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -868,10 +868,11 @@
>>   # @node-id: NUMA node ID the CPU belongs to
>>   # @socket-id: socket number within node/board the CPU belongs to
>>   # @die-id: die number within socket the CPU belongs to (since 4.1)
>> -# @core-id: core number within die the CPU belongs to
>> +# @cluster-id: cluster number within die the CPU belongs to
>> +# @core-id: core number within cluster the CPU belongs to
>>   # @thread-id: thread number within core the CPU belongs to
>>   #
>> -# Note: currently there are 5 properties that could be present
>> +# Note: currently there are 6 properties that could be present
>>   #       but management should be prepared to pass through other
>>   #       properties with device_add command to allow for future
>>   #       interface extension. This also requires the filed names to be kept in
>> @@ -883,6 +884,7 @@
>>     'data': { '*node-id': 'int',
>>               '*socket-id': 'int',
>>               '*die-id': 'int',
>> +            '*cluster-id': 'int',
>>               '*core-id': 'int',
>>               '*thread-id': 'int'
>>     }
> Since new cluster-id is introduced, you may want to check whether to
> update machine_set_cpu_numa_node() and hmp_hotpluggable_cpus(),
> accordingly, which both deal with topo-ids. If we need to update them,
> it's easier to review to make the whole cluster-id introduction part
> a separate patch.
> 

Yes, I agree. hw/core/machine-hmp-cmds.c::hmp_hotpluggable_cpus() also
needs 'cluster-id'. I will include the changes in v5 because I didn't
catch your comments before posting v4. Please go ahead and review v5
directly.

Thanks,
Gavin

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d2e5ecd234..064eac42f7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2505,6 +2505,7 @@  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
     VirtMachineState *vms = VIRT_MACHINE(ms);
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
 
     if (ms->possible_cpus) {
         assert(ms->possible_cpus->len == max_cpus);
@@ -2518,6 +2519,16 @@  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
+
+        assert(!mc->smp_props.dies_supported);
+        ms->possible_cpus->cpus[n].props.has_socket_id = true;
+        ms->possible_cpus->cpus[n].props.socket_id =
+            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
+        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
+        ms->possible_cpus->cpus[n].props.cluster_id =
+            n / (ms->smp.cores * ms->smp.threads);
+        ms->possible_cpus->cpus[n].props.has_core_id = true;
+        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
         ms->possible_cpus->cpus[n].props.has_thread_id = true;
         ms->possible_cpus->cpus[n].props.thread_id = n;
     }
diff --git a/qapi/machine.json b/qapi/machine.json
index 42fc68403d..99c945f258 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -868,10 +868,11 @@ 
 # @node-id: NUMA node ID the CPU belongs to
 # @socket-id: socket number within node/board the CPU belongs to
 # @die-id: die number within socket the CPU belongs to (since 4.1)
-# @core-id: core number within die the CPU belongs to
+# @cluster-id: cluster number within die the CPU belongs to
+# @core-id: core number within cluster the CPU belongs to
 # @thread-id: thread number within core the CPU belongs to
 #
-# Note: currently there are 5 properties that could be present
+# Note: currently there are 6 properties that could be present
 #       but management should be prepared to pass through other
 #       properties with device_add command to allow for future
 #       interface extension. This also requires the filed names to be kept in
@@ -883,6 +884,7 @@ 
   'data': { '*node-id': 'int',
             '*socket-id': 'int',
             '*die-id': 'int',
+            '*cluster-id': 'int',
             '*core-id': 'int',
             '*thread-id': 'int'
   }