Message ID | 20230919175017.538312-3-jingzhangos@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Get writable masks for feature ID from userspace | expand |
Hi, On 9/19/23 19:50, Jing Zhang wrote: > Add some basic documentation on how to get feature ID register writable > masks from userspace. > > Signed-off-by: Jing Zhang <jingzhangos@google.com> > --- > Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++ > 1 file changed, 42 insertions(+) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index 21a7578142a1..2defb5e198ce 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -6070,6 +6070,48 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG > interface. No error will be returned, but the resulting offset will not be > applied. > > +4.139 KVM_ARM_GET_REG_WRITABLE_MASKS > +------------------------------------------- > + > +:Capability: KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES > +:Architectures: arm64 > +:Type: vm ioctl > +:Parameters: struct reg_mask_range (in/out) > +:Returns: 0 on success, < 0 on error > + > + > +:: > + > + #define ARM64_FEATURE_ID_SPACE_SIZE (3 * 8 * 8) > + #define ARM64_FEATURE_ID_RANGE_IDREGS BIT(0) > + > + struct reg_mask_range { > + __u64 addr; /* Pointer to mask array */ > + __u32 range; /* Requested range */ > + __u32 reserved[13]; > + }; > + > +This ioctl copies the writable masks for Feature ID registers to userspace. > +The Feature ID space is defined as the AArch64 System register space with > +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. when attempting a migration between Ampere Altra and ThunderXv2 the first hurdle is to handle a failure when writing ICC_CTLR_EL1 (3.0.12.12.4) on dest. This reg is outside of the scope of the above single range (BIT(0)). This may be questionable if we want to migrate between those types of machines but the goal is to exercise different scenarios to have a gloval view of the problems. This reg exposes some RO capabilities such as ExtRange, A3V, SEIS, IDBits, ... So to get the migration going further I would need to tweek this on the source - for instance I guess SEIS could be reset despite the host HW cap - without making too much trouble. What would you recommend, adding a new range? But I guess we need to design ranges carefully otherwise we may be quickly limited by the number of flag bits. Any suggestion? Thanks Eric > + > +The mask array pointed to by ``addr`` is indexed by the macro > +``ARM64_FEATURE_ID_SPACE_IDX(op0, op1, crn, crm, op2)``, allowing userspace > +to know what bits can be changed for the system register described by ``op0, > +op1, crn, crm, op2``. > + > +The ``range`` field describes the requested range of registers. The valid > +ranges can be retrieved by checking the return value of > +KVM_CAP_CHECK_EXTENSION_VM for the KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES > +capability, which will return a bitmask of the supported ranges. Each bit > +set in the return value represents a possible value for the ``range`` > +field. At the time of writing, only bit 0 is returned set by the > +capability, meaning that only the value ``ARM64_FEATURE_ID_RANGE_IDREGS`` > +is valid for ``range``. > + > +The ``reserved[13]`` array is reserved for future use and should be 0, or > +KVM may return an error. > + > 5. The kvm_run structure > ======================== >
Hey Eric, On Tue, 13 Feb 2024 13:59:31 +0000, Eric Auger <eauger@redhat.com> wrote: > > Hi, > > On 9/19/23 19:50, Jing Zhang wrote: > > Add some basic documentation on how to get feature ID register writable > > masks from userspace. > > > > Signed-off-by: Jing Zhang <jingzhangos@google.com> > > --- > > Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++ > > 1 file changed, 42 insertions(+) > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > > index 21a7578142a1..2defb5e198ce 100644 > > --- a/Documentation/virt/kvm/api.rst > > +++ b/Documentation/virt/kvm/api.rst > > @@ -6070,6 +6070,48 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG > > interface. No error will be returned, but the resulting offset will not be > > applied. > > > > +4.139 KVM_ARM_GET_REG_WRITABLE_MASKS > > +------------------------------------------- > > + > > +:Capability: KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES > > +:Architectures: arm64 > > +:Type: vm ioctl > > +:Parameters: struct reg_mask_range (in/out) > > +:Returns: 0 on success, < 0 on error > > + > > + > > +:: > > + > > + #define ARM64_FEATURE_ID_SPACE_SIZE (3 * 8 * 8) > > + #define ARM64_FEATURE_ID_RANGE_IDREGS BIT(0) > > + > > + struct reg_mask_range { > > + __u64 addr; /* Pointer to mask array */ > > + __u32 range; /* Requested range */ > > + __u32 reserved[13]; > > + }; > > + > > +This ioctl copies the writable masks for Feature ID registers to userspace. > > +The Feature ID space is defined as the AArch64 System register space with > > +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. > when attempting a migration between Ampere Altra and ThunderXv2 the > first hurdle is to handle a failure when writing ICC_CTLR_EL1 > (3.0.12.12.4) on dest. This reg is outside of the scope of the above > single range (BIT(0)). Indeed. But more importantly, this isn't really an ID register. Plenty of variable bits in there. > This may be questionable if we want to migrate between those types of > machines but the goal is to exercise different scenarios to have a > gloval view of the problems. I think this is a valuable experiment, and we should definitely explore this sort of things (as I cannot see the diversity of ARM system slowing down any time soon). > > This reg exposes some RO capabilities such as ExtRange, A3V, SEIS, > IDBits, ... > So to get the migration going further I would need to tweek this on the > source - for instance I guess SEIS could be reset despite the host HW > cap - without making too much trouble. I'm not sure SEIS is such an easy one: if you promised the guest that it would never get an SError doing the most stupid things (SEIS=0), it really shouldn't get one after migration. If you advertised it on the source HW (Altra), a migration to TX2 would be fine. The other bits are possible to change depending on the requirements of the VM (aff3, IDBits), and ExtRange should always be set to 0 (because our GIC implementation doesn't support EPPI/ESPI). The really ugly part here is that if you want to affect these bits, you will need to trap and emulate the access. Not a big deal, but in the absence of FGT, you will need to handle the full Common trap group, which is going to slow things down (you will have to trap ICV_PMR_EL1, for example). > What would you recommend, adding a new range? But I guess we need to > design ranges carefully otherwise we may be quickly limited by the > number of flag bits. I can see a need to adding a range that would cover non-ID registers that have RO fields. But we also need to consider the case of EL2 registers that take part in this. For example, ICV_CTLR_EL1 and ICH_VTR_EL2 and deeply linked, and share some fields. Without NV, no need to expose HCR_VTR_EL2. With NV, this register actually drives ICV_CTLR_EL1. So careful planning is required here, and the impact of this measured. Thanks, M.
Hi Marc, On 2/13/24 15:53, Marc Zyngier wrote: > Hey Eric, > > On Tue, 13 Feb 2024 13:59:31 +0000, > Eric Auger <eauger@redhat.com> wrote: >> >> Hi, >> >> On 9/19/23 19:50, Jing Zhang wrote: >>> Add some basic documentation on how to get feature ID register writable >>> masks from userspace. >>> >>> Signed-off-by: Jing Zhang <jingzhangos@google.com> >>> --- >>> Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++ >>> 1 file changed, 42 insertions(+) >>> >>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst >>> index 21a7578142a1..2defb5e198ce 100644 >>> --- a/Documentation/virt/kvm/api.rst >>> +++ b/Documentation/virt/kvm/api.rst >>> @@ -6070,6 +6070,48 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG >>> interface. No error will be returned, but the resulting offset will not be >>> applied. >>> >>> +4.139 KVM_ARM_GET_REG_WRITABLE_MASKS >>> +------------------------------------------- >>> + >>> +:Capability: KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES >>> +:Architectures: arm64 >>> +:Type: vm ioctl >>> +:Parameters: struct reg_mask_range (in/out) >>> +:Returns: 0 on success, < 0 on error >>> + >>> + >>> +:: >>> + >>> + #define ARM64_FEATURE_ID_SPACE_SIZE (3 * 8 * 8) >>> + #define ARM64_FEATURE_ID_RANGE_IDREGS BIT(0) >>> + >>> + struct reg_mask_range { >>> + __u64 addr; /* Pointer to mask array */ >>> + __u32 range; /* Requested range */ >>> + __u32 reserved[13]; >>> + }; >>> + >>> +This ioctl copies the writable masks for Feature ID registers to userspace. >>> +The Feature ID space is defined as the AArch64 System register space with >>> +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. >> when attempting a migration between Ampere Altra and ThunderXv2 the >> first hurdle is to handle a failure when writing ICC_CTLR_EL1 >> (3.0.12.12.4) on dest. This reg is outside of the scope of the above >> single range (BIT(0)). > > Indeed. But more importantly, this isn't really an ID register. Plenty > of variable bits in there. > >> This may be questionable if we want to migrate between those types of >> machines but the goal is to exercise different scenarios to have a >> gloval view of the problems. > > I think this is a valuable experiment, and we should definitely > explore this sort of things (as I cannot see the diversity of ARM > system slowing down any time soon). > >> >> This reg exposes some RO capabilities such as ExtRange, A3V, SEIS,hyp/nvhe/hyp-main.c >> IDBits, ... >> So to get the migration going further I would need to tweek this on the >> source - for instance I guess SEIS could be reset despite the host HW >> cap - without making too much trouble. > > I'm not sure SEIS is such an easy one: if you promised the guest that > it would never get an SError doing the most stupid things (SEIS=0), it > really shouldn't get one after migration. If you advertised it on the > source HW (Altra), a migration to TX2 would be fine. I see. Indeed for sure we must make sure KVM cannot inject SEIs in the guest if SEIS is not advertised on guest side. In this case SEIS is 0 on src Altra. On dst TX2 ich_vtr_el2 advertises it and host seis =1 causing set_gic_ctlr to fail (vgic-sys-reg-v3.c). > > The other bits are possible to change depending on the requirements of > the VM (aff3, IDBits), and ExtRange should always be set to 0 (because > our GIC implementation doesn't support EPPI/ESPI). > > The really ugly part here is that if you want to affect these bits, > you will need to trap and emulate the access. Not a big deal, but in > the absence of FGT, you will need to handle the full Common trap > group, which is going to slow things down (you will have to trap > ICV_PMR_EL1, for example). Would it be sensible to simplify things and only support the new range if FGT is supported? > >> What would you recommend, adding a new range? But I guess we need to >> design ranges carefully otherwise we may be quickly limited by the >> number of flag bits. > > I can see a need to adding a range that would cover non-ID registers > that have RO fields. But we also need to consider the case of EL2 > registers that take part in this. Yes I need to better understand the consistency with ICH_VTR_EL2 > > For example, ICV_CTLR_EL1 and ICH_VTR_EL2 and deeply linked, and share > some fields. Without NV, no need to expose HCR_VTR_EL2. With NV, this > register actually drives ICV_CTLR_EL1. understood. > > So careful planning is required here, and the impact of this measured. Understood. More time needed to study the code ;-) Thanks! Eric > > Thanks, > > M. >
Hi Eric, On Wed, 14 Feb 2024 18:07:58 +0000, Eric Auger <eauger@redhat.com> wrote: > > > I'm not sure SEIS is such an easy one: if you promised the guest that > > it would never get an SError doing the most stupid things (SEIS=0), it > > really shouldn't get one after migration. If you advertised it on the > > source HW (Altra), a migration to TX2 would be fine. > I see. Indeed for sure we must make sure KVM cannot inject SEIs in the > guest if SEIS is not advertised on guest side. The problem isn't KVM injecting an SError, but the HW doing it (that's what SEIS indicates). On some HW, it only takes an old enough EFI build to trigger those. > In this case SEIS is 0 on src Altra. On dst TX2 ich_vtr_el2 advertises > it and host seis =1 causing set_gic_ctlr to fail (vgic-sys-reg-v3.c). > > > > > The other bits are possible to change depending on the requirements of > > the VM (aff3, IDBits), and ExtRange should always be set to 0 (because > > our GIC implementation doesn't support EPPI/ESPI). > > > > The really ugly part here is that if you want to affect these bits, > > you will need to trap and emulate the access. Not a big deal, but in > > the absence of FGT, you will need to handle the full Common trap > > group, which is going to slow things down (you will have to trap > > ICV_PMR_EL1, for example). > Would it be sensible to simplify things and only support the new range > if FGT is supported? I'm not sure that helps. The only FGT that covers the GIC is for ICC_IGRPENn_EL1, and we don't care much about that guy. So ICH_VMCR_EL2.TC it is, and that sucks a bit. But if that's what you want to show, you don't have a choice. Thanks, M.
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 21a7578142a1..2defb5e198ce 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6070,6 +6070,48 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG interface. No error will be returned, but the resulting offset will not be applied. +4.139 KVM_ARM_GET_REG_WRITABLE_MASKS +------------------------------------------- + +:Capability: KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES +:Architectures: arm64 +:Type: vm ioctl +:Parameters: struct reg_mask_range (in/out) +:Returns: 0 on success, < 0 on error + + +:: + + #define ARM64_FEATURE_ID_SPACE_SIZE (3 * 8 * 8) + #define ARM64_FEATURE_ID_RANGE_IDREGS BIT(0) + + struct reg_mask_range { + __u64 addr; /* Pointer to mask array */ + __u32 range; /* Requested range */ + __u32 reserved[13]; + }; + +This ioctl copies the writable masks for Feature ID registers to userspace. +The Feature ID space is defined as the AArch64 System register space with +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. + +The mask array pointed to by ``addr`` is indexed by the macro +``ARM64_FEATURE_ID_SPACE_IDX(op0, op1, crn, crm, op2)``, allowing userspace +to know what bits can be changed for the system register described by ``op0, +op1, crn, crm, op2``. + +The ``range`` field describes the requested range of registers. The valid +ranges can be retrieved by checking the return value of +KVM_CAP_CHECK_EXTENSION_VM for the KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES +capability, which will return a bitmask of the supported ranges. Each bit +set in the return value represents a possible value for the ``range`` +field. At the time of writing, only bit 0 is returned set by the +capability, meaning that only the value ``ARM64_FEATURE_ID_RANGE_IDREGS`` +is valid for ``range``. + +The ``reserved[13]`` array is reserved for future use and should be 0, or +KVM may return an error. + 5. The kvm_run structure ========================
Add some basic documentation on how to get feature ID register writable masks from userspace. Signed-off-by: Jing Zhang <jingzhangos@google.com> --- Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+)