mbox series

[v2,0/7] xen/arm: Sanitize cpuinfo

Message ID cover.1629713932.git.bertrand.marquis@arm.com (mailing list archive)
Headers show
Series xen/arm: Sanitize cpuinfo | expand

Message

Bertrand Marquis Aug. 23, 2021, 10:32 a.m. UTC
On arm architecture we might have heterogeneous platforms with different
types of cores. As a guest can potentialy run on any of those cores we
have to present them cpu features which are compatible with all cores
and discard the features which are only available on some cores.

As the features can be fairly complex, the way to deduce from 2
different features, what should be the acceptable minimal feature can be
complex (and sometime impossible).

To reduce the implementation effort in Xen, this serie is importing the
structures and filtering system used by Linux in order to build a
cpuinfo containing the best values compatible with all cores on the
platform.

The serie start by importing the necessary code and structure from Linux
and then use it to sanitize the boot cpuinfo.
It is also simplifying some Xen code which was doing the same in p2m
and allows to use heterogeneous platforms on arm64.

It is also adding DCZID and CTR registers in cpuinfo in order to check
for incoherent values between cores for those 2 registers. Xen is
tainted if different DCZID registers are found and CTR register is
sanitized when possible and in this case CTR register (and other
registers catched by HCR.TID2) is emulated.

With this serie, Xen was tested on a Juno board running on all 6 cores.

Changes in v2:
- Sanitize DCZID register
- Sanitize CTR_EL0 and add emulation of registers catched by TID2
- rename cpu_boot_data to system_cpuinfo

Bertrand Marquis (7):
  xen/arm: Import ID registers definitions from Linux
  xen/arm: Import ID features sanitize from linux
  xen/arm: Rename cpu_boot_data to system_cpuinfo
  xen/arm: Sanitize cpuinfo ID registers fields
  xen/arm: Use sanitize values for p2m
  xen/arm: Taint Xen on incompatible DCZID values
  xen/arm: Sanitize CTR_EL0 and emulate it if needed

 xen/arch/arm/arm64/Makefile            |   1 +
 xen/arch/arm/arm64/cpufeature.c        | 644 +++++++++++++++++++++++++
 xen/arch/arm/arm64/vsysreg.c           |  40 ++
 xen/arch/arm/cpufeature.c              |  12 +-
 xen/arch/arm/domain.c                  |   8 +
 xen/arch/arm/p2m.c                     |  30 +-
 xen/arch/arm/setup.c                   |  36 +-
 xen/arch/arm/smpboot.c                 |  11 +-
 xen/arch/arm/vcpreg.c                  |  45 ++
 xen/common/kernel.c                    |   6 +-
 xen/include/asm-arm/arm64/cpufeature.h | 104 ++++
 xen/include/asm-arm/arm64/hsr.h        |   6 +
 xen/include/asm-arm/arm64/sysregs.h    | 312 ++++++++++++
 xen/include/asm-arm/cpufeature.h       |  34 +-
 xen/include/asm-arm/processor.h        |  18 +-
 xen/include/xen/lib.h                  |   1 +
 16 files changed, 1254 insertions(+), 54 deletions(-)
 create mode 100644 xen/arch/arm/arm64/cpufeature.c
 create mode 100644 xen/include/asm-arm/arm64/cpufeature.h

Comments

Julien Grall Aug. 23, 2021, 12:10 p.m. UTC | #1
Hi Bertrand,

On 23/08/2021 11:32, Bertrand Marquis wrote:
> On arm architecture we might have heterogeneous platforms with different
> types of cores. As a guest can potentialy run on any of those cores we
> have to present them cpu features which are compatible with all cores
> and discard the features which are only available on some cores.

Sanitizing the CPU info is important for Xen so it can correctly size 
the P2M, flush the cache... However, I don't think this is going to be 
sufficient to be able to move a vCPU between different type of pCPU.

The main hurdle I can see so far is errata handling. Not all the errata 
can be fully handled in Xen so some of them are left to the guest to 
mitigate.

The errata are usually detected based on the MIDR while the OS is 
booting. IOW, a guest will not be able to know that it needs to handle 
an errata for pCPU B if it only runs on pCPU A.

I don't know yet how this can be solved, but errata are not that 
uncommon on Arm. So until this addressed, we will still need to make 
sure that vCPUs are not migrated between pCPUs with at least a different 
MIDR.

This prevention can be either done manually by pinning the vCPUs or 
implementing the proposal that Dario sent a few years ago (see [1]).

Cheers,

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00826.html
Bertrand Marquis Aug. 23, 2021, 3:48 p.m. UTC | #2
Hi Julien,

[Keep only arm maintainers in the CC list]

> On 23 Aug 2021, at 13:10, Julien Grall <julien@xen.org> wrote:
> 
> Hi Bertrand,
> 
> On 23/08/2021 11:32, Bertrand Marquis wrote:
>> On arm architecture we might have heterogeneous platforms with different
>> types of cores. As a guest can potentialy run on any of those cores we
>> have to present them cpu features which are compatible with all cores
>> and discard the features which are only available on some cores.
> 
> Sanitizing the CPU info is important for Xen so it can correctly size the P2M, flush the cache... However, I don't think this is going to be sufficient to be able to move a vCPU between different type of pCPU.
> 
> The main hurdle I can see so far is errata handling. Not all the errata can be fully handled in Xen so some of them are left to the guest to mitigate.

I agree this is something to work on and a problem with the current serie.

> 
> The errata are usually detected based on the MIDR while the OS is booting. IOW, a guest will not be able to know that it needs to handle an errata for pCPU B if it only runs on pCPU A.

Ack.

> 
> I don't know yet how this can be solved, but errata are not that uncommon on Arm. So until this addressed, we will still need to make sure that vCPUs are not migrated between pCPUs with at least a different MIDR.
> 
> This prevention can be either done manually by pinning the vCPUs or implementing the proposal that Dario sent a few years ago (see [1]).

My current proposal would be the following:
- add a command line option to allow to use all cores on a heterogeneous platform (different MIDR)
- taint Xen on this case
- keep the feature sanitize as it is as on this case it will create a safer setup (apart from the errata potential problem)
- keep current behaviour if command line option is not passed

Having a solution to enable all cores (even if it is unsafe) could still be a good improvement for development on big.LITTLE
platforms or for people knowing how to properly configure the system to prevent the problems by using properly cpupools so
I still think this serie with the proposed changes is still making a lot of sense.

I will start looking at a long term solution, maybe automatically create a cpupools on boot or investigate on the design you provided.

Please give me your view on this.

Kind regards
Bertrand

> 
> Cheers,
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00826.html
> 
> -- 
> Julien Grall
Julien Grall Aug. 24, 2021, 10:01 a.m. UTC | #3
On 23/08/2021 16:48, Bertrand Marquis wrote:
> Hi Julien,

Hi Bertrand,

> [Keep only arm maintainers in the CC list]
> 
>> On 23 Aug 2021, at 13:10, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Bertrand,
>>
>> On 23/08/2021 11:32, Bertrand Marquis wrote:
>>> On arm architecture we might have heterogeneous platforms with different
>>> types of cores. As a guest can potentialy run on any of those cores we
>>> have to present them cpu features which are compatible with all cores
>>> and discard the features which are only available on some cores.
>>
>> Sanitizing the CPU info is important for Xen so it can correctly size the P2M, flush the cache... However, I don't think this is going to be sufficient to be able to move a vCPU between different type of pCPU.
>>
>> The main hurdle I can see so far is errata handling. Not all the errata can be fully handled in Xen so some of them are left to the guest to mitigate.
> 
> I agree this is something to work on and a problem with the current serie.
> 
>>
>> The errata are usually detected based on the MIDR while the OS is booting. IOW, a guest will not be able to know that it needs to handle an errata for pCPU B if it only runs on pCPU A.
> 
> Ack.
> 
>>
>> I don't know yet how this can be solved, but errata are not that uncommon on Arm. So until this addressed, we will still need to make sure that vCPUs are not migrated between pCPUs with at least a different MIDR.
>>
>> This prevention can be either done manually by pinning the vCPUs or implementing the proposal that Dario sent a few years ago (see [1]).
> 
> My current proposal would be the following:
> - add a command line option to allow to use all cores on a heterogeneous platform (different MIDR)

We already have a command line for heterogenous platform. How about 
re-using it?

> - taint Xen on this case
> - keep the feature sanitize as it is as on this case it will create a safer setup (apart from the errata potential problem)
> - keep current behaviour if command line option is not passed
> 
> Having a solution to enable all cores (even if it is unsafe) could still be a good improvement for development on big.LITTLE
> platforms or for people knowing how to properly configure the system to prevent the problems by using properly cpupools so
> I still think this serie with the proposed changes is still making a lot of sense.

+1.

> 
> I will start looking at a long term solution, maybe automatically create a cpupools on boot or investigate on the design you provided.

IIRC, all vCPUs of a domain needs to be in the same cpupool. If your 
plan is to expose big.LITTLE to the guest, then you would need to look 
at the design proposal from Dario.

Cheers,
Bertrand Marquis Aug. 24, 2021, 10:41 a.m. UTC | #4
Hi Julien

> On 24 Aug 2021, at 11:01, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 23/08/2021 16:48, Bertrand Marquis wrote:
>> Hi Julien,
> 
> Hi Bertrand,
> 
>> [Keep only arm maintainers in the CC list]
>>> On 23 Aug 2021, at 13:10, Julien Grall <julien@xen.org> wrote:
>>> 
>>> Hi Bertrand,
>>> 
>>> On 23/08/2021 11:32, Bertrand Marquis wrote:
>>>> On arm architecture we might have heterogeneous platforms with different
>>>> types of cores. As a guest can potentialy run on any of those cores we
>>>> have to present them cpu features which are compatible with all cores
>>>> and discard the features which are only available on some cores.
>>> 
>>> Sanitizing the CPU info is important for Xen so it can correctly size the P2M, flush the cache... However, I don't think this is going to be sufficient to be able to move a vCPU between different type of pCPU.
>>> 
>>> The main hurdle I can see so far is errata handling. Not all the errata can be fully handled in Xen so some of them are left to the guest to mitigate.
>> I agree this is something to work on and a problem with the current serie.
>>> 
>>> The errata are usually detected based on the MIDR while the OS is booting. IOW, a guest will not be able to know that it needs to handle an errata for pCPU B if it only runs on pCPU A.
>> Ack.
>>> 
>>> I don't know yet how this can be solved, but errata are not that uncommon on Arm. So until this addressed, we will still need to make sure that vCPUs are not migrated between pCPUs with at least a different MIDR.
>>> 
>>> This prevention can be either done manually by pinning the vCPUs or implementing the proposal that Dario sent a few years ago (see [1]).
>> My current proposal would be the following:
>> - add a command line option to allow to use all cores on a heterogeneous platform (different MIDR)
> 
> We already have a command line for heterogenous platform. How about re-using it?

I guess you mean hmp-unsafe which is in fact already mentioning big.LITTLE.
That is what I was planning to use (I should have mentioned it)

> 
>> - taint Xen on this case
>> - keep the feature sanitize as it is as on this case it will create a safer setup (apart from the errata potential problem)
>> - keep current behaviour if command line option is not passed
>> Having a solution to enable all cores (even if it is unsafe) could still be a good improvement for development on big.LITTLE
>> platforms or for people knowing how to properly configure the system to prevent the problems by using properly cpupools so
>> I still think this serie with the proposed changes is still making a lot of sense.
> 
> +1.
> 
>> I will start looking at a long term solution, maybe automatically create a cpupools on boot or investigate on the design you provided.
> 
> IIRC, all vCPUs of a domain needs to be in the same cpupool. If your plan is to expose big.LITTLE to the guest, then you would need to look at the design proposal from Dario.

I started to look at Dario idea and I was also thinking we could in fact pre-create cpupools for each kind of cores.
I will continue to dig and come up with a design for that because we have some use cases which might be more
complex like allowing a guest to actually have both big and little cores in which case we need to provide them with
the right cpuinfo and make sure the vcpus are always assigned to the same kind of core.

I will work on a v3 with the agreed changes and then write a proposal for the next step.

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall