Message ID | 20230531012313.19891-1-npiggin@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | target/ppc: initial SMT support in TCG | expand |
Hello Nick, On 5/31/23 03:23, Nicholas Piggin wrote: > Hi, > > I'm posting this now just to get some first thoughts. I wouldn't say > it's ready but it does actually work with some basic tests including > pseries booting a Linux distro. I have powernv booting too, it just > requires some more SPRs converted, nothing fundamentally different so > for the purpose of this RFC I leave it out. > > A couple of things, I don't know the object model well enough to do > something nice with topology. Iterating siblings I would have thought > should be going to parent core then iterating its children CPUs. Should > that be done with the object model, or is it better to add direct > pointers in CPUs to core and core to CPUs? It is (semi) important for> performance so maybe that is better than object iterators. If we go that > way, the PnvCore and SpaprCore have pointers to the SMT threads already, > should those be abstracted go in the CPUCore? You should be able to move the thread array into the CPUCore. If you do that, please check that migration compat is not impacted by the state change. However, I am not sure you can use the CPUCore model under the insn modeling. Something to check. Anyhow, the way you implemented the loop on the siblings is sufficiently fast for a small numbers of CPU and safe, w.r.t to CPU hotplug. So I would leave that part for now, if it runs decently with 4*4 vCPUs in TCG it should be fine. Thanks, C. > The other thing is the serialisation of access. It's using the atomic > single stepping for this which... I guess should be sufficient? Is it > the best way to do it though? Can a lock be used somehow instead? > > Thanks, > Nick > > Nicholas Piggin (5): > target/ppc: gdbstub init spr gdb_id for all CPUs > target/ppc: Add initial flags and helpers for SMT support > target/ppc: Add support for SMT CTRL register > target/ppc: Add msgsnd/p and DPDES SMT support > spapr: Allow up to 8 threads SMT configuration > > hw/ppc/ppc.c | 6 ++ > hw/ppc/spapr.c | 4 +- > hw/ppc/spapr_cpu_core.c | 7 +- > include/hw/ppc/ppc.h | 1 + > target/ppc/cpu.h | 16 +++- > target/ppc/cpu_init.c | 5 + > target/ppc/excp_helper.c | 86 ++++++++++++----- > target/ppc/gdbstub.c | 32 ++++--- > target/ppc/helper.h | 4 +- > target/ppc/misc_helper.c | 93 +++++++++++++++++-- > target/ppc/translate.c | 46 ++++++++- > .../ppc/translate/processor-ctrl-impl.c.inc | 2 +- > 12 files changed, 252 insertions(+), 50 deletions(-) >
On Thu Jun 1, 2023 at 5:56 PM AEST, Cédric Le Goater wrote: > Hello Nick, > > On 5/31/23 03:23, Nicholas Piggin wrote: > > Hi, > > > > I'm posting this now just to get some first thoughts. I wouldn't say > > it's ready but it does actually work with some basic tests including > > pseries booting a Linux distro. I have powernv booting too, it just > > requires some more SPRs converted, nothing fundamentally different so > > for the purpose of this RFC I leave it out. > > > > A couple of things, I don't know the object model well enough to do > > something nice with topology. Iterating siblings I would have thought > > should be going to parent core then iterating its children CPUs. Should > > that be done with the object model, or is it better to add direct > > pointers in CPUs to core and core to CPUs? It is (semi) important for> performance so maybe that is better than object iterators. If we go that > > way, the PnvCore and SpaprCore have pointers to the SMT threads already, > > should those be abstracted go in the CPUCore? > > You should be able to move the thread array into the CPUCore. If you do > that, please check that migration compat is not impacted by the state > change. However, I am not sure you can use the CPUCore model under the > insn modeling. Something to check. Okay. > Anyhow, the way you implemented the loop on the siblings is sufficiently > fast for a small numbers of CPU and safe, w.r.t to CPU hotplug. So > I would leave that part for now, if it runs decently with 4*4 vCPUs in > TCG it should be fine. Yeah you're right I'm overly paranoid about it but we don't do hundreds of CPUs in TCG so it should be fine. Maybe I will defer it for now then and just do the CPU iteration. Thanks, Nick
On 6/2/23 09:01, Nicholas Piggin wrote: > On Thu Jun 1, 2023 at 5:56 PM AEST, Cédric Le Goater wrote: >> Hello Nick, >> >> On 5/31/23 03:23, Nicholas Piggin wrote: >>> Hi, >>> >>> I'm posting this now just to get some first thoughts. I wouldn't say >>> it's ready but it does actually work with some basic tests including >>> pseries booting a Linux distro. I have powernv booting too, it just >>> requires some more SPRs converted, nothing fundamentally different so >>> for the purpose of this RFC I leave it out. >>> >>> A couple of things, I don't know the object model well enough to do >>> something nice with topology. Iterating siblings I would have thought >>> should be going to parent core then iterating its children CPUs. Should >>> that be done with the object model, or is it better to add direct >>> pointers in CPUs to core and core to CPUs? It is (semi) important for> performance so maybe that is better than object iterators. If we go that >>> way, the PnvCore and SpaprCore have pointers to the SMT threads already, >>> should those be abstracted go in the CPUCore? >> >> You should be able to move the thread array into the CPUCore. If you do >> that, please check that migration compat is not impacted by the state >> change. However, I am not sure you can use the CPUCore model under the >> insn modeling. Something to check. > > Okay. > >> Anyhow, the way you implemented the loop on the siblings is sufficiently >> fast for a small numbers of CPU and safe, w.r.t to CPU hotplug. So >> I would leave that part for now, if it runs decently with 4*4 vCPUs in >> TCG it should be fine. > > Yeah you're right I'm overly paranoid about it but we don't do hundreds > of CPUs in TCG so it should be fine. The PowerNV did run with 64 CPUs at some point. Boot was slow bc of contention in some areas when starting the secondaries. When stabilized, perf was decent. I think that a realistic goal for book3s is to support 2 sockets * 2 cores * 4 threads on PowerNV and 16 vCPUs on pseries, this to exercise the various ways to IPI on the different HV implementations. > Maybe I will defer it for now then and just do the CPU iteration. May be there is some value to store a CPU siblings under the PowerPC CPU descriptor. IT could be useful for instructions that apply to the current CPU but for others requiring a PIR value, you will have to find a starting point in the CPU list, so it won't make much difference I think. Thanks, C. > > Thanks, > Nick