Message ID | 20211117164848.310952-1-pmorel@linux.ibm.com |
---|---|
Series | s390x: CPU Topology |
Hi,

This series is superseded by a v5 series with documentation and NUMA
extensions. Some changes have also been made to some of the patches
contained in this series.

Regards,
Pierre

On 11/17/21 17:48, Pierre Morel wrote:
> Hi,
>
> This series is the first part of the implementation of CPU topology
> for s390x, greatly reduced from the first spin.
>
> In particular, we reduced the scope to the s390x specifics, removing
> all code touching SMP or NUMA, with the goals to:
> - facilitate review and acceptance
> - leave for later the SMP part currently under active discussion in
>   mainline
> - be able, despite the reduction of code, to handle CPU topology for
>   s390x using the current s390x topology provided by QEMU, with cores
>   and sockets only
>
> To use these patches, you will need version 4 of the Linux series.
> You can find it here:
> https://lkml.org/lkml/2021/9/16/576
>
> Currently this code is for KVM only; I have no idea whether it is
> interesting to provide a TCG implementation. If it ever is, it will
> be done in another series.
>
> A short introduction
> ====================
>
> CPU topology is described in the s390 Principles of Operation (PoP)
> essentially through two instructions:
>
> PTF, Perform Topology Function, used to poll for topology changes.
> It is also used to set the polarization, but that part is not within
> the scope of this item.
>
> STSI, Store System Information, whose SYSIB 15.1.x provides the
> topology configuration.
>
> The s390 topology is a 6-level hierarchical topology with up to 5
> levels of containers. The last topology level specifies the CPU
> cores.
>
> This patch series only uses the two lowest levels: sockets and cores.
>
> To get the information on the topology, s390 provides the STSI
> instruction, which stores a structure providing the list of the
> containers used in the machine topology: the SYSIB.
> A selector within the STSI instruction allows choosing how many
> topology levels will be provided in the SYSIB.
>
> Using the Topology List Entries (TLEs) provided inside the SYSIB,
> the Linux kernel is able to compute the cache distance between two
> cores and can use this information to make scheduling decisions.
>
> Note:
> -----
> A z15 reports 3 levels of containers (drawers, books, sockets) as
> container-TLEs above the core descriptions inside CPU-TLEs.
>
> The topology can be seen at several places inside zLinux:
> - sysfs: /sys/devices/system/cpu/cpuX/topology
> - procfs: /proc/sysinfo and /proc/cpuinfo
> - lscpu -e: gives topology information
>
> The different topology levels have names:
> - node - drawer - book - socket (or physical package) - core
>
> Threads:
> Multithreading is not part of the topology as described by the
> SYSIB 15.1.x.
>
> The guest's interest in knowing the CPU topology is obviously to be
> able to optimize load balancing and the migration of threads.
> KVM has the same interest concerning vCPU scheduling and cache
> optimization.
>
>
> The design
> ==========
>
> 1) To be ready for hotplug, I chose an object-oriented design
> of the topology containers:
> - A node is a bridge on the SYSBUS and defines a "node bus"
> - A drawer is hotplugged on the "node bus"
> - A book on the "drawer bus"
> - A socket on the "book bus"
> - And the CPU Topology List Entry (CPU-TLE) sits on the socket bus.
> These objects will be enhanced with the cache information when NUMA
> is implemented.
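As an aside for readers new to this area, the SYSIB 15.1.x entries
described above can be pictured with a minimal C sketch. This is
illustrative only: the field layout follows the Linux kernel's reading
of the PoP (asm/sysinfo.h), and the struct names here are made up, not
taken from the patches:

    #include <stdint.h>

    /* Container-TLE: one entry per container (socket, book, drawer...) */
    struct sysib_container_tle {
        uint8_t nl;            /* nesting level of this container, 1..5 */
        uint8_t reserved[6];
        uint8_t id;            /* container id, unique among siblings */
    } __attribute__((packed));

    /* CPU-TLE: terminal entry describing up to 64 cores */
    struct sysib_cpu_tle {
        uint8_t  nl;           /* nesting level, 0 for a CPU-TLE */
        uint8_t  reserved[3];
        uint8_t  flags;        /* dedication and polarization bits */
        uint8_t  type;         /* CPU type */
        uint16_t origin;       /* CPU address of mask bit 0, a multiple of 64 */
        uint64_t mask;         /* bit n set => CPU "origin + n" is present */
    } __attribute__((packed));

A SYSIB 15.1.x is then essentially a header followed by a flat,
depth-first list of such entries, which is why the bus hierarchy above
makes building it a simple walk.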
> This also allows for easy retrieval when building the different
> SYSIBs for Store Topology System Information (STSI).
>
> 2) The Perform Topology Function (PTF) instruction is made available
> to the guest with a new KVM capability and is intercepted in QEMU,
> allowing the guest to poll for topology changes.
>
>
> Features and TBD list
> =====================
>
> - There is no direct match between the IDs shown by:
>   - lscpu (unrelated numbered list),
>   - SYSIB 15.1.x (topology ID).
>
> - The CPU number, in the left column of lscpu, is used to reference
>   a CPU by Linux tools, while the CPU address is used by QEMU for
>   hotplug.
>
> - Effect of -smp parsing on the topology, with an example:
>   -smp 9,sockets=4,cores=4,maxcpus=16
>
>   We have 4 sockets, each holding 4 cores, so that we have a maximum
>   of 16 CPUs, 9 of them active at boot. (This should be obvious.)
>
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> #
>
> - To plug a new CPU into the topology, one can simply use the CPU
>   address, as in:
>
> (qemu) device_add host-s390x-cpu,core-id=12
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> 9   -    -      -    -      -    :::             no     yes        horizontal   12
> # chcpu -e 9
> CPU 9 enabled
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
> 0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
> 1   0    0      0    0      1    1:1:1:1         yes    yes        horizontal   1
> 2   0    0      0    0      2    2:2:2:2         yes    yes        horizontal   2
> 3   0    0      0    0      3    3:3:3:3         yes    yes        horizontal   3
> 4   0    0      0    1      4    4:4:4:4         yes    yes        horizontal   4
> 5   0    0      0    1      5    5:5:5:5         yes    yes        horizontal   5
> 6   0    0      0    1      6    6:6:6:6         yes    yes        horizontal   6
> 7   0    0      0    1      7    7:7:7:7         yes    yes        horizontal   7
> 8   0    0      0    2      8    8:8:8:8         yes    yes        horizontal   8
> 9   0    0      0    3      9    9:9:9:9         yes    yes        horizontal   12
> #
>
> It is up to the admin level, libvirt for example, to pin the right
> CPU to the right vCPU. But as we can see, without NUMA, choosing
> separate sockets for CPUs is not easy without hotplug, because
> without that information the code assigns the vCPUs and fills the
> sockets one after the other.
> Note that this is also the default behavior on the LPAR.
>
> Conclusion
> ==========
>
> This patch series, together with the associated KVM patch, allows
> providing CPU topology information to the guest.
> Currently, only dedicated vCPUs and CPUs are supported, and a NUMA
> topology can only be handled using CPU hotplug inside the guest.
>
> The next extensions are to provide:
> - adding the book and drawer levels
> - NUMA using the -numa QEMU parameter
> - topology information change for shared CPUs
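To make the PTF polling described in point 2 of the design concrete,
here is a hypothetical sketch of the condition-code logic such an
intercept handler could apply. The function-code values follow the
PoP; the names and the change-report latch are illustrative, not the
actual code from the series:

    #include <stdbool.h>
    #include <stdint.h>

    enum {
        PTF_REQ_HORIZONTAL = 0,  /* request horizontal polarization */
        PTF_REQ_VERTICAL   = 1,  /* request vertical polarization */
        PTF_CHECK          = 2,  /* poll for a topology change report */
    };

    /* Illustrative stand-in for a machine-wide latch that would be
     * set whenever a CPU is plugged into the topology. */
    static bool topology_change_report;

    /* Returns the condition code the guest sees, or -1 when the
     * handler should inject a specification exception (unassigned
     * function code). */
    static int ptf_condition_code(uint64_t r1)
    {
        switch (r1 & 0xff) {     /* function code: low byte of R1 */
        case PTF_CHECK:
            if (topology_change_report) {
                topology_change_report = false; /* checking clears it */
                return 1;        /* cc 1: a topology change is pending */
            }
            return 0;            /* cc 0: no change since last check */
        case PTF_REQ_HORIZONTAL:
        case PTF_REQ_VERTICAL:
            return 2;            /* cc 2: rejected; polarization changes
                                    are out of scope for this series */
        default:
            return -1;
        }
    }

A guest like Linux simply issues PTF with function code 2 on a timer
and schedules a topology update whenever the condition code is
non-zero.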
> Regards,
> Pierre
>
> Pierre Morel (5):
>   linux-headers update
>   s390x: topology: CPU topology objects and structures
>   s390x: topology: implementing Store Topology System Information
>   s390x: CPU topology: CPU topology migration
>   s390x: kvm: topology: interception of PTF instruction
>
>  hw/s390x/cpu-topology.c             | 361 ++++++++++++++++++++++++++++
>  hw/s390x/meson.build                |   1 +
>  hw/s390x/s390-virtio-ccw.c          |  54 +++++
>  include/hw/s390x/cpu-topology.h     |  74 ++++++
>  include/hw/s390x/s390-virtio-ccw.h  |   6 +
>  linux-headers/linux/kvm.h           |   1 +
>  target/s390x/cpu.h                  |  50 ++++
>  target/s390x/cpu_features_def.h.inc |   1 +
>  target/s390x/cpu_models.c           |   2 +
>  target/s390x/cpu_topology.c         | 113 +++++++++
>  target/s390x/gen-features.c         |   3 +
>  target/s390x/kvm/kvm.c              |  26 ++
>  target/s390x/machine.c              |  48 ++++
>  target/s390x/meson.build            |   1 +
>  14 files changed, 741 insertions(+)
>  create mode 100644 hw/s390x/cpu-topology.c
>  create mode 100644 include/hw/s390x/cpu-topology.h
>  create mode 100644 target/s390x/cpu_topology.c