Message ID | 1626975764-22131-1-git-send-email-pmorel@linux.ibm.com (mailing list archive) |
---|---|
Series | s390x: CPU Topology |
a gentle ping :)

I would like, if you have time, comments on the architecture I propose,
and on whether the handling is done at the right level, KVM vs QEMU.

Therefore I added Viktor as CC.

This series is independent of the changes on -smp from Yanan Wang
currently under review in mainline.

Thanks,

Regards,
Pierre


On 7/22/21 7:42 PM, Pierre Morel wrote:
> Hi,
>
> This series is the first part of the implementation of CPU topology
> for S390, greatly reduced from the first spin.
>
> In particular, we reduced the scope to the S390x specifics, removing
> all code touching SMP or NUMA, with the goals to:
> - facilitate review and acceptance
> - leave for later the SMP part currently actively discussed in mainline
> - still be able, despite the reduction of code, to handle CPU topology
>   for S390 using the current S390 topology provided by QEMU with cores
>   and sockets only.
>
> To use these patches you will need the Linux series.
> You can find it here:
> https://marc.info/?l=kvm&m=162697338719109&w=3
>
> Currently this code is for KVM only; I have no idea if it is interesting
> to provide a TCG patch. If so, it will be done in another series.
>
> ====================
> A short introduction
> ====================
>
> CPU Topology is described in the S390 POP, essentially through the
> description of two instructions:
>
> PTF  Perform Topology Function, used to poll for topology changes
>      and used to set the polarization, but that part is not part of
>      this item.
>
> STSI Store System Information and the SYSIB 15.1.x providing the
>      Topology configuration.
>
> S390 Topology is a 6-level hierarchical topology with up to 5 levels
> of containers. The last topology level specifies the CPU cores.
>
> This patch series only uses the two lower levels, sockets and cores.
>
> To get the information on the topology, S390 provides the STSI
> instruction, which stores a structure providing the list of the
> containers used in the machine topology: the SYSIB.
> A selector within the STSI instruction allows choosing how many topology
> levels will be provided in the SYSIB.
>
> Using the Topology List Entries (TLE) provided inside the SYSIB,
> the Linux kernel is able to compute the information about the cache
> distance between two cores and can use this information to make
> scheduling decisions.
>
> Note:
> -----
> Z15 reports 3 levels of containers (drawers, books, sockets) as
> Container-TLEs above the core description inside CPU-TLEs.
>
> The Topology can be seen at several places inside zLinux:
> - sysfs: /sys/devices/system/cpu/cpuX/topology
> - procfs: /proc/sysinfo and /proc/cpuinfo
> - lscpu -e : gives topology information
>
> The different Topology levels have names:
> - Node
> - Drawer
> - Book
> - sockets or physical package
> - core
>
> Threads:
> Multithreading is not part of the topology as described by the
> SYSIB 15.1.x
>
> The interest of the guest in knowing the CPU topology is obviously to be
> able to optimise the load balancing and the migration of threads.
> KVM will have the same interest concerning vCPU scheduling and cache
> optimisation.
>
>
> ==========
> The design
> ==========
>
> 1) To be ready for hotplug, I chose an object-oriented design
> of the topology containers:
> - A node is a bridge on the SYSBUS and defines a "node bus"
> - A drawer is hotplugged on the "node bus"
> - A book on the "drawer bus"
> - A socket on the "book bus"
> - And the CPU Topology List Entry (CPU-TLE) sits on the socket bus.
> These objects will be enhanced with the cache information when
> NUMA is implemented.
>
> This also allows for easy retrieval when building the different SYSIB
> for Store Topology System Information (STSI)
>
> 2) The Perform Topology Function (PTF) instruction is made available to
> the guest with a new KVM capability and intercepted in QEMU, allowing the
> guest to poll for topology changes.
>
>
> =====================
> Features and TBD list
> =====================
>
> - There is no direct match between IDs shown by:
>   - lscpu (unrelated numbered list),
>   - SYSIB 15.1.x (topology ID)
>
> - The CPU number, left column of lscpu, is used to reference a CPU
>   by Linux tools, while the CPU address is used by QEMU for hotplug.
>
> - Effect of -smp parsing on the topology with an example:
>   -smp 9,sockets=4,cores=4,maxcpus=16
>
>   We have 4 sockets each holding 4 cores so that we have a maximum
>   of 16 CPUs, 9 of them active on boot. (Should be obvious)
>
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> #
>
>
> - To plug a new CPU inside the topology one can simply use the CPU
>   address like in:
>
> (qemu) device_add host-s390x-cpu,core-id=12
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> 9   -    -      -    -      -    :::             no     yes        horizontal   12
> # chcpu -e 9
> CPU 9 enabled
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> 9   0    0      0    3      9    9:9:9:9         yes    yes        horizontal   12
> #
>
> It is up to the admin level, Libvirt for example, to pin the right CPU to
> the right vCPU, but as we can see, without NUMA, choosing separate sockets
> for CPUs is not easy without hotplug, because without information the code
> will assign the vCPUs and fill the sockets one after the other.
> Note that this is also the default behavior on the LPAR.
>
>
> Regards,
> Pierre
>
> Pierre Morel (5):
>   s390x: kvm: topology: Linux header update
>   s390x: kvm: topology: interception of PTF instruction
>   s390x: topology: CPU topology objects and structures
>   s390x: topology: Topology list entries and SYSIB 15.x.x
>   s390x: topology: implementating Store Topology System Information
>
>  hw/s390x/cpu-topology.c            | 334 +++++++++++++++++++++++++++++
>  hw/s390x/meson.build               |   1 +
>  hw/s390x/s390-virtio-ccw.c         |  49 +++++
>  include/hw/s390x/cpu-topology.h    |  67 ++++++
>  include/hw/s390x/s390-virtio-ccw.h |   7 +
>  linux-headers/linux/kvm.h          |   1 +
>  target/s390x/cpu.h                 |  44 ++++
>  target/s390x/kvm/kvm.c             | 122 +++++++++++
>  8 files changed, 625 insertions(+)
>  create mode 100644 hw/s390x/cpu-topology.c
>  create mode 100644 include/hw/s390x/cpu-topology.h
>
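
For readers following the -smp example in the cover letter, here is a minimal sketch of the socket assignment it shows. It is not taken from the patches: the function name `socket_of` and its parameters are hypothetical, and the rule (CPU address divided by cores per socket) is an assumption that merely matches the quoted lscpu output, where addresses 0-3 land in socket 0, 4-7 in socket 1, 8 in socket 2, and the hotplugged core-id=12 in socket 3.

```python
# Illustrative only: not the QEMU implementation.
# Assumption: sockets are filled in order, so a core's socket is its CPU
# address divided by the number of cores per socket. This matches the lscpu
# output quoted for -smp 9,sockets=4,cores=4,maxcpus=16.


def socket_of(core_id: int, cores_per_socket: int, sockets: int) -> int:
    """Return the index of the socket holding the core with this address."""
    max_cpus = cores_per_socket * sockets
    if not 0 <= core_id < max_cpus:
        raise ValueError(f"core-id {core_id} outside 0..{max_cpus - 1}")
    return core_id // cores_per_socket


if __name__ == "__main__":
    # The nine boot CPUs (addresses 0..8), then the hotplugged core-id=12.
    for address in list(range(9)) + [12]:
        print(address, socket_of(address, cores_per_socket=4, sockets=4))
```

Under this assumption, spreading vCPUs over separate sockets means choosing core-id values in different multiples of the cores-per-socket count, which is what the hotplug of core-id=12 does in the example.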
On 26.08.21 11:22, Pierre Morel wrote:
>
> a gentle ping :)
>
> I would like if you have time, comments on the architecture I propose,
> if the handling is done at the right level, KVM vs QEMU.

Do we expect changes in this series due to the discussed changes of PTF interpretation?

[...]
On 8/30/21 11:54 AM, Christian Borntraeger wrote:
>
>
> On 26.08.21 11:22, Pierre Morel wrote:
>>
>> a gentle ping :)
>>
>> I would like if you have time, comments on the architecture I propose,
>> if the handling is done at the right level, KVM vs QEMU.
>
> Do we expect changes in this series due to the discussed changes of PTF
> interpretation?

No, we do not expect any change.

The configuration topology feature is enabled in QEMU if KVM provides
the KVM_CAP_S390_CPU_TOPOLOGY capability.
Interpretation is set in KVM if QEMU activated the feature and if the host
supports the configuration topology feature.
If the host does not support the feature, the instruction is intercepted
and QEMU emulates the PTF instruction.

The feature can be fenced with qemu -cpu XX,ctop=off for CPUs that already
have the feature activated by default in QEMU (newer GEN10_GA1)

Regards,
Pierre
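
The decision described in this reply can be summarized as a small sketch. It only restates the three conditions from the text; the function name `ptf_handling`, its parameters, and the returned strings are hypothetical and do not correspond to any QEMU or KVM code.

```python
# Illustrative only: a restatement of the reply above, not QEMU or KVM code.
# All names here are hypothetical.


def ptf_handling(kvm_has_cap: bool, ctop: bool, host_has_feature: bool) -> str:
    """Describe how a guest PTF instruction would be handled.

    kvm_has_cap      -- KVM provides KVM_CAP_S390_CPU_TOPOLOGY
    ctop             -- the feature is not fenced with -cpu XX,ctop=off
    host_has_feature -- the host supports the configuration topology feature
    """
    if not (kvm_has_cap and ctop):
        # The feature is not enabled in QEMU at all.
        return "feature disabled"
    if host_has_feature:
        # KVM sets up interpretation.
        return "interpreted"
    # The instruction is intercepted and QEMU emulates PTF.
    return "intercepted, emulated by QEMU"


if __name__ == "__main__":
    print(ptf_handling(kvm_has_cap=True, ctop=True, host_has_feature=False))
```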