Message ID | 20211117095711.26596-1-luca.fancellu@arm.com |
---|---|
Series | Boot time cpupools |
Hi Luca,

On 17/11/2021 09:57, Luca Fancellu wrote:
> Currently Xen creates a default cpupool0 that contains all the CPUs
> brought up during boot, and it assumes that the platform has only one
> kind of CPU. This assumption does not hold on big.LITTLE platforms,
> but putting different types of CPUs in the same cpupool can result in
> instability and security issues for the domains running on the pool.

I agree that you can't move a LITTLE vCPU to a big pCPU. However...

>
> For this reason this series introduces an architecture-specific way to
> create different cpupools at boot time. This is particularly useful on
> ARM big.LITTLE platforms, where there might be the need to have
> different cpupools for each type of core, but also on systems using
> NUMA, which can have a different cpupool for each node.

... from my understanding, all the vCPUs of a domain have to be in the
same cpupool. So with this approach it is not possible:
 1) to have a mix of LITTLE and big vCPUs in the domain
 2) to create a domain spanning across two NUMA nodes

So I think we need to make sure that any solutions we go through will
not prevent us from implementing those setups.

I can see two options here:
 1) Allowing a domain's vCPUs to be in different cpupools
 2) Introducing CPU classes (see [1])

I can't remember why Dario suggested 2) rather than 1) in the past.
@Dario, do you remember it?

Cheers,

[1] https://lore.kernel.org/xen-devel/1481135379.3445.142.camel@citrix.com/

> To test this series, the hmp-unsafe Xen boot argument is passed to
> allow Xen to start CPUs different from the boot core.
>
> Luca Fancellu (2):
>   xen/cpupool: Create different cpupools at boot time
>   tools/cpupools: Give a name to unnamed cpupools
>
>  tools/libs/light/libxl_utils.c | 13 ++++--
>  xen/arch/arm/Kconfig           | 15 ++++++
>  xen/arch/arm/Makefile          |  1 +
>  xen/arch/arm/cpupool.c         | 84 ++++++++++++++++++++++++++++++++++
>  xen/common/sched/cpupool.c     |  5 +-
>  xen/include/xen/cpupool.h      | 30 ++++++++++++
>  6 files changed, 142 insertions(+), 6 deletions(-)
>  create mode 100644 xen/arch/arm/cpupool.c
>  create mode 100644 xen/include/xen/cpupool.h
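To make the approach under discussion concrete, here is a minimal sketch
of MIDR-based pool creation at boot. The helpers cpupool_create_pool()
and cpupool_assign_cpu(), as well as the cpu_data[] MIDR field, are
illustrative assumptions, not necessarily the series' actual interface:

/* Sketch only: create one cpupool per distinct MIDR value at boot.
 * cpupool_create_pool() and cpupool_assign_cpu() are hypothetical
 * helpers; pool 0 is assumed to exist and hold the boot CPU's type. */
static int __init create_cpupools_by_midr(void)
{
    uint32_t pool_midr[NR_CPUS];
    unsigned int cpu, pools = 1;

    pool_midr[0] = cpu_data[0].midr.bits;

    for_each_online_cpu ( cpu )
    {
        unsigned int i;

        /* Look for a pool already holding this CPU type. */
        for ( i = 0; i < pools; i++ )
            if ( pool_midr[i] == cpu_data[cpu].midr.bits )
                break;

        /* Unseen CPU type: create a new pool for it. */
        if ( i == pools )
        {
            pool_midr[pools] = cpu_data[cpu].midr.bits;
            if ( cpupool_create_pool(pools++) != 0 )
                return -ENOMEM;
        }

        if ( i != 0 )    /* CPUs start out in pool 0 */
            cpupool_assign_cpu(i, cpu);
    }

    return 0;
}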
On 17.11.21 11:26, Julien Grall wrote:
> Hi Luca,
>
> On 17/11/2021 09:57, Luca Fancellu wrote:
>> [...]
>
> ... from my understanding, all the vCPUs of a domain have to be in the
> same cpupool. So with this approach it is not possible:
>  1) to have a mix of LITTLE and big vCPUs in the domain
>  2) to create a domain spanning across two NUMA nodes
>
> So I think we need to make sure that any solutions we go through will
> not prevent us from implementing those setups.
>
> I can see two options here:
>  1) Allowing a domain's vCPUs to be in different cpupools
>  2) Introducing CPU classes (see [1])
>
> I can't remember why Dario suggested 2) rather than 1) in the past.
> @Dario, do you remember it?

Having the vCPUs of a domain in different cpupools would probably
require a major scheduling rework, as several pieces of accounting
information are per-cpupool today, with some of that data additionally
being per-domain. Not to mention the per-domain scheduling parameters,
which would suddenly need to cope with different schedulers (imagine one
cpupool using credit2 and the other rt).

I'd rather use implicit pinning, e.g. via a CPU class.

Juergen
Hi Julien,

> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>
> On 17/11/2021 09:57, Luca Fancellu wrote:
>> [...]
>
> ... from my understanding, all the vCPUs of a domain have to be in the
> same cpupool. So with this approach it is not possible:
> 1) to have a mix of LITTLE and big vCPUs in the domain
> 2) to create a domain spanning across two NUMA nodes
>
> So I think we need to make sure that any solutions we go through will
> not prevent us from implementing those setups.

The point of this patch is to make all cores available without breaking
the current behaviour of existing systems.

Someone not using cpupools will keep running on the same cores as
before. Someone wanting to use the other cores could assign a guest to
the other cpupool(s) (big.LITTLE is just an example with 2 types, but
there are now SoCs with 3 types of cores). Someone wanting to build
something different can now create new cpupools in Dom0 and assign the
cores they want to them, to build a guest having access to different
types of cores.

The point here is just to make the "other" cores accessible and park
them in cpupools so that the current behaviour is not changed.

> I can see two options here:
> 1) Allowing a domain's vCPUs to be in different cpupools
> 2) Introducing CPU classes (see [1])
>
> I can't remember why Dario suggested 2) rather than 1) in the past.
> @Dario, do you remember it?

I think 1) is definitely interesting and something that could be looked
at in the future. This series just aims at making all cores available
without breaking backward compatibility, which is a good improvement
and does not contradict the two options you are suggesting.

Cheers
Bertrand
On 17.11.21 12:16, Bertrand Marquis wrote:
> Hi Julien,
>
>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>> [...]
>>
>> So I think we need to make sure that any solutions we go through will
>> not prevent us from implementing those setups.
>
> The point of this patch is to make all cores available without
> breaking the current behaviour of existing systems.

May I suggest adding a boot parameter, so this behavior can be
controlled by other means than compile-time configuration?

> Someone not using cpupools will keep running on the same cores as
> before.
> [...]
> The point here is just to make the "other" cores accessible and park
> them in cpupools so that the current behaviour is not changed.
>
>> I can see two options here:
>> 1) Allowing a domain's vCPUs to be in different cpupools
>> 2) Introducing CPU classes (see [1])
>>
>> I can't remember why Dario suggested 2) rather than 1) in the past.
>> @Dario, do you remember it?
>
> I think 1) is definitely interesting and something that could be
> looked at in the future.

From the scheduler's point of view this is IMO a nightmare.

Juergen
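For what it's worth, such a knob would only be a couple of lines in
Xen's boot-parameter machinery; the option name "boot-cpupools" below is
made up purely for illustration:

/* Sketch: a boolean boot parameter in the usual Xen style (see
 * xen/param.h); the name "boot-cpupools" is hypothetical. */
static bool __initdata opt_boot_cpupools;
boolean_param("boot-cpupools", opt_boot_cpupools);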
On 17/11/2021 11:16, Bertrand Marquis wrote:
> Hi Julien,

Hi,

>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>> [...]
>>
>> So I think we need to make sure that any solutions we go through will
>> not prevent us from implementing those setups.
>
> The point of this patch is to make all cores available without
> breaking the current behaviour of existing systems.

I might be missing some context here. By breaking current behavior, do
you mean users that may want to add "hmp-unsafe" on the command line?

Cheers,
Hi Julien,

> On 17 Nov 2021, at 11:48, Julien Grall <julien@xen.org> wrote:
>
> On 17/11/2021 11:16, Bertrand Marquis wrote:
>> [...]
>> The point of this patch is to make all cores available without
>> breaking the current behaviour of existing systems.
>
> I might be missing some context here. By breaking current behavior, do
> you mean users that may want to add "hmp-unsafe" on the command line?

Right, with hmp-unsafe the behaviour is now the same as without: only
the extra cores are parked in other cpupools.

So you have a point: the behaviour is in fact changed for someone who
was using hmp-unsafe before, if this is activated. The command line
argument suggested by Juergen definitely makes sense here.

We could instead do the following:
- when this is activated in the configuration, boot all cores and park
  them in different pools (depending on a command line argument). The
  current behaviour does not change if the other pools are not used
  (but more cores will be on in Xen);
- when hmp-unsafe is on, this feature will be turned off (even if
  activated in the configuration) and all cores will be added to the
  same pool.

What do you think?

Cheers
Bertrand
(Pruning some CC to just leave Arm and sched folks)

On 17/11/2021 12:07, Bertrand Marquis wrote:
> Hi Julien,

Hi Bertrand,

>> On 17 Nov 2021, at 11:48, Julien Grall <julien@xen.org> wrote:
>> [...]
>>
>> I might be missing some context here. By breaking current behavior,
>> do you mean users that may want to add "hmp-unsafe" on the command
>> line?
>
> Right, with hmp-unsafe the behaviour is now the same as without: only
> the extra cores are parked in other cpupools.
>
> So you have a point: the behaviour is in fact changed for someone who
> was using hmp-unsafe before, if this is activated. The command line
> argument suggested by Juergen definitely makes sense here.
>
> We could instead do the following:
> - when this is activated in the configuration, boot all cores and park
>   them in different pools (depending on a command line argument). The
>   current behaviour does not change if the other pools are not used
>   (but more cores will be on in Xen);

From my understanding, it is possible to move a pCPU in/out of a pool
afterwards. So the security concern with big.LITTLE is still present,
even though it would be difficult to hit.

I am also concerned that it would be more difficult to detect any
misconfiguration. So I think this option would still need to be turned
on only if hmp-unsafe and the new command line option are both on.

If we want to enable it without hmp-unsafe on, we would need to at
least lock the pools.

> - when hmp-unsafe is on, this feature will be turned off (even if
>   activated in the configuration) and all cores will be added to the
>   same pool.
>
> What do you think?

I am split. On one hand, this makes it easier for someone to try
big.LITTLE, as you don't have to manually pin vCPUs. On the other hand,
this handles a single use-case, and you would need to use hmp-unsafe
and pinning if you want to get a more exotic setup (e.g. a domain with
big.LITTLE).

Another possible solution is to pin dom0 vCPUs (AFAIK they are just
sticky by default) and then create the pools from dom0 userspace. My
assumption is that for dom0less we would want to use pinning instead.

That said, I would like to hear from Xilinx and EPAM as, IIRC, they are
already using hmp-unsafe in production.

Cheers,
On Wed, 17 Nov 2021, Julien Grall wrote:
> > [...]
> > We could instead do the following:
> > - when this is activated in the configuration, boot all cores and
> >   park them in different pools (depending on a command line
> >   argument). The current behaviour does not change if the other
> >   pools are not used (but more cores will be on in Xen);
>
> From my understanding, it is possible to move a pCPU in/out of a pool
> afterwards. So the security concern with big.LITTLE is still present,
> even though it would be difficult to hit.

As far as I know, moving a pCPU in/out of a pool is something that
cannot happen automatically: it requires manual intervention by the
user, and it is uncommon. We could print a warning or simply return an
error to prevent the action from happening. Or something like:

"This action might result in memory corruptions and invalid behavior.
Do you want to continue? [Y/N]"

> I am also concerned that it would be more difficult to detect any
> misconfiguration. So I think this option would still need to be turned
> on only if hmp-unsafe and the new command line option are both on.
>
> If we want to enable it without hmp-unsafe on, we would need to at
> least lock the pools.

Locking the pools is a good idea.

My preference is not to tie this feature to the hmp-unsafe command
line, more on this below.

> > - when hmp-unsafe is on, this feature will be turned off (even if
> >   activated in the configuration) and all cores will be added to
> >   the same pool.
> >
> > What do you think?
>
> I am split. On one hand, this makes it easier for someone to try
> big.LITTLE, as you don't have to manually pin vCPUs. On the other
> hand, this handles a single use-case, and you would need to use
> hmp-unsafe and pinning if you want to get a more exotic setup (e.g. a
> domain with big.LITTLE).
>
> Another possible solution is to pin dom0 vCPUs (AFAIK they are just
> sticky by default) and then create the pools from dom0 userspace. My
> assumption is that for dom0less we would want to use pinning instead.
>
> That said, I would like to hear from Xilinx and EPAM as, IIRC, they
> are already using hmp-unsafe in production.

This discussion has been very interesting; it is cool to hear new ideas
like this one. I have a couple of thoughts to share.

First, I think that the ability to create cpupools at boot time is
super important. It goes way beyond big.LITTLE. It would be incredibly
useful to separate real-time (sched=null) and non-real-time
(sched=credit2) workloads. I think it will only become more important
going forward, so I'd love to see an option to configure cpupools that
works for dom0less. It could be based on device tree properties rather
than Kconfig options.

It is true that if we had the devicetree-based cpupool configuration I
mentioned, then somebody could use it to create cpupools matching
big.LITTLE. So "in theory" it solves the problem. However, I think that
for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
if Xen configured the cpupools automatically rather than based on the
device tree configuration. That way, it is going to work automatically,
without extra steps, even in the simplest Xen setups.

So I think that it is a good idea to have a command line option (better
than a Kconfig option) to trigger the MIDR-based cpupool creation at
boot time. The option could be called midr-cpupools=on/off or
hw-cpupools=on/off, for example.

In terms of whether it should be the default or not, I don't feel
strongly about it. Unfortunately we (Xilinx) rely on a number of
non-default options already, so we are already in the situation where
we have to be extra careful with the options passed. I don't think that
adding one more would make a significant difference either way.

But my preference is *not* to tie the new command line option to
hmp-unsafe, because if you use midr-cpupools and don't mess with the
pools then it is actually safe. We could even lock the cpupools like
Julien suggested, or warn/return an error on changing the cpupools. In
this scenario, it would be detrimental to also pass hmp-unsafe: it
would allow actually unsafe configurations that the user wanted to
avoid by using midr-cpupools. It would end up disabling checks we could
put in place to make midr-cpupools safer.

So in short I think it should be:

- midr-cpupools alone:
  cpupools created at boot, warning/errors on changing cpupools

- midr-cpupools + hmp-unsafe:
  cpupools created at boot, changing cpupools is allowed (we could
  still warn, but no errors)

- hmp-unsafe alone:
  same as today with hmp-unsafe

For the default, as I said, I don't have a strong preference. I think
midr-cpupools could be "on" by default.
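A guard of the kind described above (error out on an unsafe pool change
unless the user opted into hmp-unsafe) could be sketched as follows.
Here arch_cpu_compatible_with_pool() is a hypothetical hook, while
opt_hmp_unsafe models the existing Arm command line flag:

/* Sketch: refuse to add a pCPU to a pool holding a different CPU
 * type, unless the user explicitly accepted the risk (hmp-unsafe).
 * arch_cpu_compatible_with_pool() is a hypothetical arch hook. */
static int check_pool_cpu_compat(const struct cpupool *c, unsigned int cpu)
{
    if ( opt_hmp_unsafe )
        return 0;

    if ( !arch_cpu_compatible_with_pool(c, cpu) )
    {
        printk(XENLOG_WARNING
               "CPU%u has a different type than pool %u: denied\n",
               cpu, c->cpupool_id);
        return -EINVAL;
    }

    return 0;
}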
On 18.11.21 03:19, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Julien Grall wrote:
>> [...]
>> From my understanding, it is possible to move a pCPU in/out of a pool
>> afterwards. So the security concern with big.LITTLE is still present,
>> even though it would be difficult to hit.
>
> As far as I know, moving a pCPU in/out of a pool is something that
> cannot happen automatically: it requires manual intervention by the
> user, and it is uncommon. We could print a warning or simply return an
> error to prevent the action from happening. Or something like:
>
> "This action might result in memory corruptions and invalid behavior.
> Do you want to continue? [Y/N]"

This should only be rejected if the source and target pools are not
compatible. So a cpupool could be attributed to allow only specific
cpus (and maybe domains?) in it.

Otherwise it would be impossible to create new cpupools after boot on
such a system and to populate them with cpus.

>> I am also concerned that it would be more difficult to detect any
>> misconfiguration. So I think this option would still need to be
>> turned on only if hmp-unsafe and the new command line option are
>> both on.
>>
>> If we want to enable it without hmp-unsafe on, we would need to at
>> least lock the pools.
>
> Locking the pools is a good idea.

This would be another option, yes.

> My preference is not to tie this feature to the hmp-unsafe command
> line, more on this below.

I agree.

>> [...]
>
> This discussion has been very interesting; it is cool to hear new
> ideas like this one. I have a couple of thoughts to share.
>
> First, I think that the ability to create cpupools at boot time is
> super important. It goes way beyond big.LITTLE. It would be incredibly
> useful to separate real-time (sched=null) and non-real-time
> (sched=credit2) workloads. I think it will only become more important
> going forward, so I'd love to see an option to configure cpupools that
> works for dom0less. It could be based on device tree properties rather
> than Kconfig options.

I think device tree AND a command line option should be possible (think
of x86 here).

> It is true that if we had the devicetree-based cpupool configuration I
> mentioned, then somebody could use it to create cpupools matching
> big.LITTLE. So "in theory" it solves the problem. However, I think
> that for big.LITTLE it would be suboptimal. For big.LITTLE it would be
> best if Xen configured the cpupools automatically rather than based on
> the device tree configuration. That way, it is going to work
> automatically, without extra steps, even in the simplest Xen setups.
>
> So I think that it is a good idea to have a command line option
> (better than a Kconfig option) to trigger the MIDR-based cpupool
> creation at boot time. The option could be called midr-cpupools=on/off
> or hw-cpupools=on/off, for example.

I'd rather go for:

cpupools=<options>

with e.g. <options>:

- "auto-midr": split the system into cpupools based on MIDR
- "auto-numa": split the system into cpupools based on NUMA nodes
- "cpus=<list of cpus>[,sched=<scheduler>]"

This would be rather flexible without adding more and more options
doing similar things. Other sub-options could be added rather easily.

> In terms of whether it should be the default or not, I don't feel
> strongly about it. [...]
>
> But my preference is *not* to tie the new command line option to
> hmp-unsafe, because if you use midr-cpupools and don't mess with the
> pools then it is actually safe. We could even lock the cpupools like
> Julien suggested, or warn/return an error on changing the cpupools.
> In this scenario, it would be detrimental to also pass hmp-unsafe: it
> would allow actually unsafe configurations that the user wanted to
> avoid by using midr-cpupools. It would end up disabling checks we
> could put in place to make midr-cpupools safer.
>
> So in short I think it should be:
>
> - midr-cpupools alone:
>   cpupools created at boot, warning/errors on changing cpupools
>
> - midr-cpupools + hmp-unsafe:
>   cpupools created at boot, changing cpupools is allowed (we could
>   still warn, but no errors)

I'd rather add an explicit ",locked" option to the above cpupools
parameter.

> - hmp-unsafe alone:
>   same as today with hmp-unsafe
>
> For the default, as I said, I don't have a strong preference. I think
> midr-cpupools could be "on" by default.

What about making this a Kconfig option?

Juergen
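A rough sketch of how the proposed syntax could be wired up with Xen's
custom_param(); the sub-option handling and the parse_pool_cpu_list()
helper are assumptions made for illustration:

/* Sketch: parser for the proposed "cpupools=" option. */
static enum {
    CPUPOOLS_OFF,
    CPUPOOLS_AUTO_MIDR,
    CPUPOOLS_AUTO_NUMA,
} __initdata opt_cpupools = CPUPOOLS_OFF;

static int __init parse_cpupools(const char *s)
{
    if ( !strcmp(s, "auto-midr") )
        opt_cpupools = CPUPOOLS_AUTO_MIDR;
    else if ( !strcmp(s, "auto-numa") )
        opt_cpupools = CPUPOOLS_AUTO_NUMA;
    else if ( !strncmp(s, "cpus=", 5) )
        return parse_pool_cpu_list(s + 5);   /* hypothetical helper */
    else
        return -EINVAL;

    return 0;
}
custom_param("cpupools", parse_cpupools);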
On Wed, Nov 17, 2021 at 9:11 PM Julien Grall <julien@xen.org> wrote:

Hi Julien, all

[Sorry for the possible format issues]

> (Pruning some CC to just leave Arm and sched folks)
>
> On 17/11/2021 12:07, Bertrand Marquis wrote:
> > [...]
>
> I am split. On one hand, this makes it easier for someone to try
> big.LITTLE, as you don't have to manually pin vCPUs. On the other
> hand, this handles a single use-case, and you would need to use
> hmp-unsafe and pinning if you want to get a more exotic setup (e.g. a
> domain with big.LITTLE).
>
> Another possible solution is to pin dom0 vCPUs (AFAIK they are just
> sticky by default) and then create the pools from dom0 userspace. My
> assumption is that for dom0less we would want to use pinning instead.
>
> That said, I would like to hear from Xilinx and EPAM as, IIRC, they
> are already using hmp-unsafe in production.

We have been using hmp-unsafe since its introduction. Yes, we are aware
of the possible consequences of enabling that option (as different CPU
types might have different errata, cache lines, PA bits (?), etc.), so
we are trying to deal with it carefully. In our target system we pin
Dom0's vCPUs to the pCPUs of the same type from userspace via the
"xl vcpu-pin" command; for other domains a more exotic configuration
can be used.

I share Stefano's opinion not to tie the new feature (boot time
MIDR-based cpupools) to the existing hmp-unsafe option, and to create a
new command line option to control the new feature. The proposed
algorithm (copied here for convenience) looks fine to me:

"So in short I think it should be:

- midr-cpupools alone:
  cpupools created at boot, warning/errors on changing cpupools

- midr-cpupools + hmp-unsafe:
  cpupools created at boot, changing cpupools is allowed (we could
  still warn, but no errors)

- hmp-unsafe alone:
  same as today with hmp-unsafe"

For the default behaviour I also don't have a strong preference. One
thing is clear: the default behaviour should be safe.

I think the command line option is preferred over the config option, as
the new feature can then be enabled/disabled without a need to rebuild
Xen. Moreover, if we follow the proposed algorithm route, the behaviour
of the new feature at runtime (whether changing cpupools is allowed or
not) is going to depend on the hmp-unsafe state, which is under command
line control currently. But there is no strong preference here either.

> Cheers,
>
> --
> Julien Grall
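For completeness, the dom0 pinning EPAM describes is done with plain xl
commands, e.g. (the CPU numbering is board-specific and purely
illustrative):

# Assuming pCPUs 4-7 form the cluster dom0 should run on:
xl vcpu-pin Domain-0 0 4
xl vcpu-pin Domain-0 1 5
xl vcpu-pin Domain-0 2 6
xl vcpu-pin Domain-0 3 7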
I like all your suggestions below!

On Thu, 18 Nov 2021, Juergen Gross wrote:
> On 18.11.21 03:19, Stefano Stabellini wrote:
> > On Wed, 17 Nov 2021, Julien Grall wrote:
> > > [...]
> >
> > As far as I know, moving a pCPU in/out of a pool is something that
> > cannot happen automatically: it requires manual intervention by the
> > user, and it is uncommon. We could print a warning or simply return
> > an error to prevent the action from happening. Or something like:
> >
> > "This action might result in memory corruptions and invalid
> > behavior. Do you want to continue? [Y/N]"
>
> This should only be rejected if the source and target pools are not
> compatible. So a cpupool could be attributed to allow only specific
> cpus (and maybe domains?) in it.

Yes, you are right.

> Otherwise it would be impossible to create new cpupools after boot on
> such a system and to populate them with cpus.
>
> > [...]
> > First, I think that the ability to create cpupools at boot time is
> > super important. [...] It could be based on device tree properties
> > rather than Kconfig options.
>
> I think device tree AND a command line option should be possible
> (think of x86 here).

Sure

> > So I think that it is a good idea to have a command line option
> > (better than a Kconfig option) to trigger the MIDR-based cpupool
> > creation at boot time. The option could be called
> > midr-cpupools=on/off or hw-cpupools=on/off, for example.
>
> I'd rather go for:
>
> cpupools=<options>
>
> with e.g. <options>:
>
> - "auto-midr": split the system into cpupools based on MIDR
> - "auto-numa": split the system into cpupools based on NUMA nodes
> - "cpus=<list of cpus>[,sched=<scheduler>]"
>
> This would be rather flexible without adding more and more options
> doing similar things. Other sub-options could be added rather easily.

I like this

> > [...]
> > - midr-cpupools + hmp-unsafe:
> >   cpupools created at boot, changing cpupools is allowed (we could
> >   still warn, but no errors)
>
> I'd rather add an explicit ",locked" option to the above cpupools
> parameter.

yeah that's better

> > - hmp-unsafe alone:
> >   same as today with hmp-unsafe
> >
> > For the default, as I said, I don't have a strong preference. I
> > think midr-cpupools could be "on" by default.
>
> What about making this a Kconfig option?

Could also be a good idea
Hi Stefano,

On 18/11/2021 02:19, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Julien Grall wrote:
>> [...]
>> From my understanding, it is possible to move a pCPU in/out of a pool
>> afterwards. So the security concern with big.LITTLE is still present,
>> even though it would be difficult to hit.
>
> As far as I know, moving a pCPU in/out of a pool is something that
> cannot happen automatically: it requires manual intervention by the
> user, and it is uncommon. We could print a warning or simply return an
> error to prevent the action from happening. Or something like:
>
> "This action might result in memory corruptions and invalid behavior.
> Do you want to continue? [Y/N]"
>
>> I am also concerned that it would be more difficult to detect any
>> misconfiguration. So I think this option would still need to be
>> turned on only if hmp-unsafe and the new command line option are
>> both on.
>>
>> If we want to enable it without hmp-unsafe on, we would need to at
>> least lock the pools.
>
> Locking the pools is a good idea.
>
> My preference is not to tie this feature to the hmp-unsafe command
> line, more on this below.

The only reason I suggested tying it to hmp-unsafe is if you (or anyone
else) really wanted to use a version where the pools are unlocked.

If we are going to implement the lock, then I think hmp-unsafe would
not be necessary for such a configuration.

> [...]
>
> It is true that if we had the devicetree-based cpupool configuration I
> mentioned, then somebody could use it to create cpupools matching
> big.LITTLE. So "in theory" it solves the problem. However, I think
> that for big.LITTLE it would be suboptimal. For big.LITTLE it would be
> best if Xen configured the cpupools automatically rather than based on
> the device tree configuration.

This brings one question: how does Linux detect and use big.LITTLE? Is
there a Device-Tree binding?

[...]

> So I think that it is a good idea to have a command line option
> (better than a Kconfig option) to trigger the MIDR-based cpupool
> creation at boot time. The option could be called midr-cpupools=on/off
> or hw-cpupools=on/off, for example.
> In terms of whether it should be the default or not, I don't feel
> strongly about it.

On by default means this will be security supported, and we need to be
reasonably confident it cannot be broken.

In this case, I am not only referring to moving a pCPU between pools,
but also to Xen doing the right thing (e.g. finding the minimum cache
line size).

[...]

> - midr-cpupools alone:
>   cpupools created at boot, warning/errors on changing cpupools
>
> - midr-cpupools + hmp-unsafe:
>   cpupools created at boot, changing cpupools is allowed (we could
>   still warn, but no errors)
>
> - hmp-unsafe alone:
>   same as today with hmp-unsafe

I like Juergen's version better. But before agreeing on the command
line, I would like to understand how Linux is dealing with big.LITTLE
today (see my question above).

> For the default, as I said, I don't have a strong preference. I think
> midr-cpupools could be "on" by default.

I think this should be off at the beginning, until the feature is
mature enough. We are soon opening the tree again for the next
development cycle, so I think we have enough time to make sure
everything works properly and to have it turned on by default before
the next release.

Cheers,
On Fri, 19 Nov 2021, Julien Grall wrote: > Hi Stefano, > > On 18/11/2021 02:19, Stefano Stabellini wrote: > > On Wed, 17 Nov 2021, Julien Grall wrote: > > > > > > > On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote: > > > > > > > > > > > > > > Hi Luca, > > > > > > > > > > > > > > On 17/11/2021 09:57, Luca Fancellu wrote: > > > > > > > > Currently Xen creates a default cpupool0 that contains all the > > > > > > > > cpu > > > > > > > > brought up > > > > > > > > during boot and it assumes that the platform has only one kind > > > > > > > > of > > > > > > > > CPU. > > > > > > > > This assumption does not hold on big.LITTLE platform, but > > > > > > > > putting > > > > > > > > different > > > > > > > > type of CPU in the same cpupool can result in instability and > > > > > > > > security issues > > > > > > > > for the domains running on the pool. > > > > > > > > > > > > > > I agree that you can't move a LITTLE vCPU to a big pCPU. > > > > > > > However... > > > > > > > > > > > > > > > For this reason this serie introduces an architecture specific > > > > > > > > way > > > > > > > > to create > > > > > > > > different cpupool at boot time, this is particularly useful on > > > > > > > > ARM > > > > > > > > big.LITTLE > > > > > > > > platform where there might be the need to have different > > > > > > > > cpupools > > > > > > > > for each type > > > > > > > > of core, but also systems using NUMA can have different cpu pool > > > > > > > > for > > > > > > > > each node. > > > > > > > > > > > > > > ... from my understanding, all the vCPUs of a domain have to be in > > > > > > > the > > > > > > > same cpupool. So with this approach it is not possible: > > > > > > > 1) to have a mix of LITTLE and big vCPUs in the domain > > > > > > > 2) to create a domain spanning across two NUMA nodes > > > > > > > > > > > > > > So I think we need to make sure that any solutions we go through > > > > > > > will > > > > > > > not prevent us to implement those setups. > > > > > > The point of this patch is to make all cores available without > > > > > > breaking > > > > > > the current behaviour of existing system. > > > > > > > > > > I might be missing some context here. By breaking current behavior, do > > > > > you > > > > > mean user that may want to add "hmp-unsafe" on the command line? > > > > > > > > Right, with hmp-unsafe the behaviour is now the same as without, only > > > > extra > > > > cores are parked in other cpupools. > > > > > > > > So you have a point in fact that behaviour is changed for someone who > > > > was > > > > using hmp-unsafe before if this is activated. > > > > The command line argument suggested by Jurgen definitely makes sense > > > > here. > > > > > > > > We could instead do the following: > > > > - when this is activated in the configuration, boot all cores and park > > > > them > > > > in different pools (depending on command line argument). Current > > > > behaviour > > > > not change if other pools are not used (but more cores will be on in > > > > xen) > > > > > > From my understanding, it is possible to move a pCPU in/out a pool > > > afterwards. > > > So the security concern with big.LITTLE is still present, even though it > > > would > > > be difficult to hit it. > > > > As far as I know moving a pCPU in/out of a pool is something that cannot > > happen automatically: it requires manual intervention to the user and it > > is uncommon. We could print a warning or simply return error to prevent > > the action from happening. 
> > Or something like:
> >
> > "This action might result in memory corruptions and invalid behavior. Do
> > you want to continue? [Y/N]"
> >
> > > I am also concerned that it would be more difficult to detect any
> > > misconfiguration. So I think this option would still need to be turned on
> > > only if hmp-unsafe and the new command line one are both on.
> > >
> > > If we want to enable it without hmp-unsafe on, we would need to at least
> > > lock the pools.
> >
> > Locking the pools is a good idea.
> >
> > My preference is not to tie this feature to the hmp-unsafe command line,
> > more on this below.
>
> The only reason I suggested to tie with hmp-unsafe is if you (or anyone else)
> really wanted to use a version where the pools are unlocked.
>
> If we are going to implement the lock, then I think hmp-unsafe would not
> be necessary for such a configuration.
>
> > > > - when hmp-unsafe is on, this feature will be turned off (if activated
> > > > in the configuration) and all cores would be added in the same pool.
> > > >
> > > > What do you think ?
> > >
> > > I am split. On one hand, this is making it easier for someone to try
> > > big.LITTLE as you don't have to manually pin vCPUs. On the other hand,
> > > this is handling a single use-case and you would need to use hmp-unsafe
> > > and pinning if you want to get a more exotic setup (e.g. a domain with
> > > big.LITTLE).
> > >
> > > Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky
> > > by default) and then create the pools from dom0 userspace. My assumption
> > > is for dom0less we would want to use pinning instead.
> > >
> > > That said I would like to hear from Xilinx and EPAM as, IIRC, they are
> > > already using hmp-unsafe in production.
> >
> > This discussion has been very interesting, it is cool to hear new ideas
> > like this one. I have a couple of thoughts to share.
> >
> > First I think that the ability of creating cpupools at boot time is
> > super important. It goes way beyond big.LITTLE. It would be incredibly
> > useful to separate real-time (sched=null) and non-real-time
> > (sched=credit2) workloads. I think it will only become more important
> > going forward so I'd love to see an option to configure cpupools that
> > works for dom0less. It could be based on device tree properties rather
> > than kconfig options.
> >
> > It is true that if we had the devicetree-based cpupool configuration I
> > mentioned, then somebody could use it to create cpupools matching
> > big.LITTLE. So "in theory" it solves the problem. However, I think that
> > for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> > if Xen configured the cpupools automatically rather than based on the
> > device tree configuration.
>
> This brings one question. How does Linux detect and use big.LITTLE? Is there
> a Device-Tree binding?
>
> [...]
>
> > So I think that it is a good idea to have a command line option (better
> > than a kconfig option) to trigger the MIDR-based cpupool creation at
> > boot time. The option could be called midr-cpupools=on/off or
> > hw-cpupools=on/off for example.
> >
> > In terms of whether it should be the default or not, I don't feel
> > strongly about it.
>
> On by default means this will be security supported and we need to be
> reasonably confident this cannot be broken.
>
> In this case, I am not only referring to moving a pCPU between pools but
> also Xen doing the right thing (e.g. finding the minimum cache line size).
>
> [...]
> > - midr-cpupools alone
> >   cpupools created at boot, warning/errors on changing cpupools
> >
> > - midr-cpupools + hmp-unsafe
> >   cpupools created at boot, changing cpupools is allowed (we could still
> >   warn but no errors)
> >
> > - hmp-unsafe alone
> >   same as today with hmp-unsafe
>
> I like better Juergen's version. But before agreeing on the command line, I
> would like to understand how Linux is dealing with big.LITTLE today (see my
> question above).

I also like Juergen's version better :-)

The device tree binding that covers big.LITTLE is the same one that covers
cpus: Documentation/devicetree/bindings/arm/cpus.yaml

So basically, you can tell it is a big.LITTLE system because the
compatible strings differ between certain cpus, e.g.:

cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x0>;
};

cpu@1 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x1>;
};

cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x100>;
};

cpu@101 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x101>;
};

> > For the default as I said I don't have a strong preference. I think
> > midr-cpupools could be "on" by default.
>
> I think this should be off at the beginning until the feature is matured
> enough. We are soon opening the tree again for the next development cycle.
> So I think we have enough time to make sure everything works properly to
> have it turned on by default before the next release.

That's fine.
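[For context on what wiring up such a switch involves: Xen declares boot options with *_param() helpers, the same pattern it already uses for "hmp-unsafe" in xen/arch/arm/smpboot.c. Below is a minimal sketch only; the option name, variable, and default are assumptions taken from this discussion, not code from the series.]

    /*
     * Hedged sketch: declaring the "midr-cpupools" boot option discussed
     * above, following Xen's existing boolean_param() pattern. Not code
     * from the actual series.
     */
    #include <xen/init.h>
    #include <xen/param.h>
    #include <xen/types.h>

    /* Off by default, matching the suggestion to let the feature mature first. */
    static bool __initdata opt_midr_cpupools;
    boolean_param("midr-cpupools", opt_midr_cpupools);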
Hi Stefano,

On 19/11/2021 18:55, Stefano Stabellini wrote:
> On Fri, 19 Nov 2021, Julien Grall wrote:
>> I like better Juergen's version. But before agreeing on the command line, I
>> would like to understand how Linux is dealing with big.LITTLE today (see my
>> question above).
>
> I also like Juergen's version better :-)
>
> The device tree binding that covers big.LITTLE is the same one that covers
> cpus: Documentation/devicetree/bindings/arm/cpus.yaml

Are you sure? I found Documentation/devicetree/bindings/arm/cpu-capacity.txt.

> So basically, you can tell it is a big.LITTLE system because the
> compatible strings differ between certain cpus, e.g.:
>
> cpu@0 {
>         device_type = "cpu";
>         compatible = "arm,cortex-a15";
>         reg = <0x0>;
> };
>
> cpu@1 {
>         device_type = "cpu";
>         compatible = "arm,cortex-a15";
>         reg = <0x1>;
> };
>
> cpu@100 {
>         device_type = "cpu";
>         compatible = "arm,cortex-a7";
>         reg = <0x100>;
> };
>
> cpu@101 {
>         device_type = "cpu";
>         compatible = "arm,cortex-a7";
>         reg = <0x101>;
> };

I have not found any code in Linux using the compatible. Instead, they all
seem to use the cpu-map (see drivers/base/arch_topology.c).

Anyway, to me it would be better to parse the Device-Tree over using the
MIDR. The advantage is we can cover a wide range of cases (you may have
processors with the same MIDR but different frequency). For now, we could
create a cpupool per cluster.

Cheers,
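[As a concrete aside on what grouping cores "by MIDR" means: the standalone sketch below reads MIDR_EL1 and splits out its identification fields. It is an illustration only, not code from the series; it runs from userspace on AArch64 Linux only because the kernel emulates EL0 reads of MIDR_EL1. Julien's caveat stands: two cores can share a MIDR value yet still differ in, e.g., maximum frequency, which this register cannot reveal.]

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t midr;

        /* MIDR_EL1 identifies the core type, e.g. Cortex-A53 vs Cortex-A72. */
        __asm__ volatile("mrs %0, midr_el1" : "=r" (midr));

        /* Field layout per the Arm ARM. */
        printf("implementer 0x%02x, part 0x%03x, variant %u, revision %u\n",
               (unsigned int)((midr >> 24) & 0xff),  /* e.g. 0x41 = Arm Ltd. */
               (unsigned int)((midr >> 4) & 0xfff),  /* e.g. 0xd08 = Cortex-A72 */
               (unsigned int)((midr >> 20) & 0xf),
               (unsigned int)(midr & 0xf));

        return 0;
    }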
Hi Julien,

> On 23 Nov 2021, at 13:54, Julien Grall <julien@xen.org> wrote:
>
> Hi Stefano,
>
> On 19/11/2021 18:55, Stefano Stabellini wrote:
>> On Fri, 19 Nov 2021, Julien Grall wrote:
>>> I like better Juergen's version. But before agreeing on the command line, I
>>> would like to understand how Linux is dealing with big.LITTLE today (see my
>>> question above).
>>
>> I also like Juergen's version better :-)
>>
>> The device tree binding that covers big.LITTLE is the same one that covers
>> cpus: Documentation/devicetree/bindings/arm/cpus.yaml
>
> Are you sure? I found Documentation/devicetree/bindings/arm/cpu-capacity.txt.
>
>> So basically, you can tell it is a big.LITTLE system because the
>> compatible strings differ between certain cpus, e.g.:
>>
>> cpu@0 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a15";
>>         reg = <0x0>;
>> };
>>
>> cpu@1 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a15";
>>         reg = <0x1>;
>> };
>>
>> cpu@100 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a7";
>>         reg = <0x100>;
>> };
>>
>> cpu@101 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a7";
>>         reg = <0x101>;
>> };
>
> I have not found any code in Linux using the compatible. Instead, they all
> seem to use the cpu-map (see drivers/base/arch_topology.c).
>
> Anyway, to me it would be better to parse the Device-Tree over using the
> MIDR. The advantage is we can cover a wide range of cases (you may have
> processors with the same MIDR but different frequency). For now, we could
> create a cpupool per cluster.

This is a really good idea as this could also work for NUMA systems.

So reg & ~0xff would give the cluster number?

We will probably end up splitting cores into different cpupools even if they
are all the same, as there are several CPUs having several clusters but the
same cores (I recall some NXP boards being like that).

Some device trees also have a cpu-map definition:

cpu-map {
        cluster0 {
                core0 {
                        cpu = <&A57_0>;
                };
                core1 {
                        cpu = <&A57_1>;
                };
        };

        cluster1 {
                core0 {
                        cpu = <&A53_0>;
                };
                core1 {
                        cpu = <&A53_1>;
                };
                core2 {
                        cpu = <&A53_2>;
                };
                core3 {
                        cpu = <&A53_3>;
                };
        };
};

@stefano: is the cpu-map something standard? Lots of device trees do have it
in Linux now but I do not recall seeing it on older device trees.
Maybe using cpu-map could be a solution; we could say that device trees
without a cpu-map are not supported.

Cheers
Bertrand

> Cheers,
>
> --
> Julien Grall
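[To make the reg arithmetic above concrete: a standalone sketch, assuming, as in the examples in this thread, that a cpu node's reg carries the MPIDR affinity fields with Aff0 in bits [7:0]. Under that assumption, reg & ~0xff does act as a cluster identifier; note it breaks on multi-threaded cores, where the affinity levels shift by one, which is one more argument for relying on cpu-map instead.]

    #include <stdint.h>
    #include <stdio.h>

    /*
     * Assumption: reg holds the MPIDR affinity fields (Aff0 = core within
     * cluster, Aff1 = cluster), as in the cpu@0/cpu@100 examples above.
     * Masking off Aff0 then yields a per-cluster identifier.
     */
    static uint64_t cluster_id(uint64_t reg)
    {
        return reg & ~(uint64_t)0xff;
    }

    int main(void)
    {
        /* The big.LITTLE layout from the thread: two A57s, four A53s. */
        const uint64_t regs[] = { 0x0, 0x1, 0x100, 0x101, 0x102, 0x103 };

        for (unsigned int i = 0; i < sizeof(regs) / sizeof(regs[0]); i++)
            printf("reg 0x%03llx -> cluster 0x%03llx\n",
                   (unsigned long long)regs[i],
                   (unsigned long long)cluster_id(regs[i]));

        return 0;
    }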
On Tue, 23 Nov 2021, Bertrand Marquis wrote:
> Hi Julien,
>
> > On 23 Nov 2021, at 13:54, Julien Grall <julien@xen.org> wrote:
> >
> > Hi Stefano,
> >
> > On 19/11/2021 18:55, Stefano Stabellini wrote:
> >> On Fri, 19 Nov 2021, Julien Grall wrote:
> >>> I like better Juergen's version. But before agreeing on the command line, I
> >>> would like to understand how Linux is dealing with big.LITTLE today (see my
> >>> question above).
> >>
> >> I also like Juergen's version better :-)
> >>
> >> The device tree binding that covers big.LITTLE is the same one that covers
> >> cpus: Documentation/devicetree/bindings/arm/cpus.yaml
> >
> > Are you sure? I found Documentation/devicetree/bindings/arm/cpu-capacity.txt.
> >
> >> So basically, you can tell it is a big.LITTLE system because the
> >> compatible strings differ between certain cpus, e.g.:
> >>
> >> cpu@0 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a15";
> >>         reg = <0x0>;
> >> };
> >>
> >> cpu@1 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a15";
> >>         reg = <0x1>;
> >> };
> >>
> >> cpu@100 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a7";
> >>         reg = <0x100>;
> >> };
> >>
> >> cpu@101 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a7";
> >>         reg = <0x101>;
> >> };
> >
> > I have not found any code in Linux using the compatible. Instead, they all
> > seem to use the cpu-map (see drivers/base/arch_topology.c).
> >
> > Anyway, to me it would be better to parse the Device-Tree over using the
> > MIDR. The advantage is we can cover a wide range of cases (you may have
> > processors with the same MIDR but different frequency). For now, we could
> > create a cpupool per cluster.
>
> This is a really good idea as this could also work for NUMA systems.
>
> So reg & ~0xff would give the cluster number?
>
> We will probably end up splitting cores into different cpupools even if
> they are all the same, as there are several CPUs having several clusters
> but the same cores (I recall some NXP boards being like that).
>
> Some device trees also have a cpu-map definition:
>
> cpu-map {
>         cluster0 {
>                 core0 {
>                         cpu = <&A57_0>;
>                 };
>                 core1 {
>                         cpu = <&A57_1>;
>                 };
>         };
>
>         cluster1 {
>                 core0 {
>                         cpu = <&A53_0>;
>                 };
>                 core1 {
>                         cpu = <&A53_1>;
>                 };
>                 core2 {
>                         cpu = <&A53_2>;
>                 };
>                 core3 {
>                         cpu = <&A53_3>;
>                 };
>         };
> };
>
> @stefano: is the cpu-map something standard? Lots of device trees do have
> it in Linux now but I do not recall seeing it on older device trees.
> Maybe using cpu-map could be a solution; we could say that device trees
> without a cpu-map are not supported.

cpu-map is newer than big.LITTLE support in Linux. See for instance commit
4ab328f06. Before cpu-map was introduced, Linux used mostly the MPIDR or
sometimes the *machine* compatible string.

I did find one example of a board where the cpu compatible string is the
same for both big and LITTLE CPUs: arch/arm64/boot/dts/rockchip/rk3368.dtsi.
That could be the reason why the cpu compatible string is not used for
detecting big.LITTLE. Sorry about that, my earlier suggestion was wrong.

Yes, I think using cpu-map would be fine. It seems to be widely used by
different vendors. (Even if something as generic as cpu-map should really be
under the devicetree.org spec, not under Linux, but anyway.)

Only one note: you mentioned NUMA. As you can see from
Documentation/devicetree/bindings/numa.txt, NUMA doesn't use cpu-map.
Instead, it uses numa-node-id and distance-map.
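[As an illustration of what consuming cpu-map could look like, here is a hedged sketch written against libfdt so it stays self-contained: one candidate cpupool per cluster node, as Julien proposed. It assumes the flat cluster/core layout shown in the example above; the binding also permits nested clusters, which this ignores, and the real series would presumably use Xen's internal device tree helpers instead.]

    #include <stdio.h>
    #include <libfdt.h>

    /*
     * Sketch: list the clusters under /cpus/cpu-map in a flattened device
     * tree blob and count the cores in each -- i.e. one candidate
     * boot-time cpupool per cluster.
     */
    static void clusters_to_cpupools(const void *fdt)
    {
        int map = fdt_path_offset(fdt, "/cpus/cpu-map");
        int cluster;

        if (map < 0) {
            fprintf(stderr, "no /cpus/cpu-map in this device tree\n");
            return;
        }

        fdt_for_each_subnode(cluster, fdt, map) {
            int core, ncores = 0;

            fdt_for_each_subnode(core, fdt, cluster)
                ncores++;

            printf("%s: %d core(s) -> one cpupool\n",
                   fdt_get_name(fdt, cluster, NULL), ncores);
        }
    }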
Hi all,

Thank you for all your feedback, and sorry for the late response.

Given the amount of suggestions, I've been working on a proposal for the
boot time cpupools that I hope could be good for everyone.

The feature will be enabled by CONFIG_BOOT_TIME_CPUPOOLS, so without it
everything behaves as it does now. When the feature is enabled, the code
will check the device tree and use information from there to create the
cpupools:

a72_1: cpu@0 {
        compatible = "arm,cortex-a72";
        reg = <0x0 0x0>;
        device_type = "cpu";
        [...]
};

a72_2: cpu@1 {
        compatible = "arm,cortex-a72";
        reg = <0x0 0x1>;
        device_type = "cpu";
        [...]
};

cpu@2 {
        compatible = "arm,cortex-a72";
        reg = <0x0 0x2>;
        device_type = "cpu";
        [...]
};

a53_1: cpu@100 {
        compatible = "arm,cortex-a53";
        reg = <0x0 0x100>;
        device_type = "cpu";
        [...]
};

a53_2: cpu@101 {
        compatible = "arm,cortex-a53";
        reg = <0x0 0x101>;
        device_type = "cpu";
        [...]
};

chosen {
        cpupool_a {
                compatible = "xen,cpupool";
                xen,cpupool-id = <0>;
                xen,cpupool-cpus = <&a72_1 &a72_2>;
        };
        cpupool_b {
                compatible = "xen,cpupool";
                xen,cpupool-id = <1>;
                xen,cpupool-cpus = <&a53_1 &a53_2>;
                xen,cpupool-sched = "credit2";
        };
        [...]
};

So for every node under chosen with the compatible "xen,cpupool", a cpupool
is created (if it doesn't already exist). Mandatory properties of that node
are:

- "xen,cpupool-id", which identifies the id of the pool
- "xen,cpupool-cpus", which lists the handles of the cpus

An optional property is "xen,cpupool-sched", which is a string that
identifies the scheduler. A cpupool with identifier 0 (zero) can't have that
property; it will get the default scheduler from Xen.

A set of rules is applied:

1) The cpupool with id 0 is always created, whether or not it is listed in
   the DT.
2) The cpupool with id 0 must have at least one cpu; if it doesn't, the
   system will panic.
3) Every cpu that is not assigned to any cpupool will be automatically
   assigned to the cpupool with id 0 (only cpus that are brought up by Xen).
4) When a cpu is assigned to a cpupool in the DT, but the cpu is not up, the
   system will panic.

So, given this explanation, the above example will create a system with two
cpupools:

- cpupool with id 0, containing three A72 cpus (two are explicitly listed,
  one was not assigned to any other cpupool)
- cpupool with id 1, containing two A53 cpus (cpus explicitly listed)

Clearly the above example works only if Xen is started using the
hmp-unsafe=1 parameter, otherwise some cpus won't be started.

Given the above example, we might be able to have an option like
"xen,domain-cpupool-id" to assign dom0less guests to cpupools:

chosen {
        cpupool_a {
                compatible = "xen,cpupool";
                xen,cpupool-id = <0>;
                xen,cpupool-cpus = <&a72_1 &a72_2>;
        };
        cpupool_b {
                compatible = "xen,cpupool";
                xen,cpupool-id = <1>;
                xen,cpupool-cpus = <&a53_1 &a53_2>;
                xen,cpupool-sched = "credit2";
        };

        domU1 {
                #size-cells = <0x1>;
                #address-cells = <0x1>;
                compatible = "xen,domain";
                cpus = <0x1>;
                memory = <0x0 0xc0000>;
                xen,domain-cpupool-id = <1>; /* Optional */
                vpl011;

                module@0 {
                        compatible = "multiboot,kernel", "multiboot,module";
                        [...]
                };
        };
};

Any thoughts on this?

Cheers,
Luca
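[To make the proposed binding concrete, here is a hedged sketch of the /chosen scan, written against libfdt so it stays self-contained; the actual series will use Xen's own device tree and cpupool APIs, which are not shown, and the panic/fallback rules 1-4 above are only hinted at in comments. Resolving the xen,cpupool-cpus phandles to CPU nodes is elided for brevity.]

    #include <stdio.h>
    #include <libfdt.h>

    /*
     * Sketch of the proposed binding: for every subnode of /chosen that is
     * compatible with "xen,cpupool", read the mandatory xen,cpupool-id and
     * the optional xen,cpupool-sched. Per the proposal, pool 0 always
     * exists, keeps Xen's default scheduler, and collects any unassigned
     * cpus that Xen brought up.
     */
    static void scan_boot_cpupools(const void *fdt)
    {
        int chosen = fdt_path_offset(fdt, "/chosen");
        int node;

        if (chosen < 0)
            return; /* no /chosen: only the default pool 0 would exist */

        fdt_for_each_subnode(node, fdt, chosen) {
            const fdt32_t *id;
            const char *sched;
            int len;

            if (fdt_node_check_compatible(fdt, node, "xen,cpupool"))
                continue; /* not a cpupool node (e.g. a "xen,domain" node) */

            id = fdt_getprop(fdt, node, "xen,cpupool-id", &len);
            if (!id || len != sizeof(*id)) {
                /* Mandatory property missing: the proposal implies an error. */
                fprintf(stderr, "%s: missing xen,cpupool-id\n",
                        fdt_get_name(fdt, node, NULL));
                continue;
            }

            sched = fdt_getprop(fdt, node, "xen,cpupool-sched", NULL);

            printf("cpupool %u, scheduler %s\n",
                   fdt32_to_cpu(*id), sched ? sched : "(Xen default)");
        }
    }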