Message ID | 20230523074326.3035745-6-luca.fancellu@arm.com (mailing list archive)
---|---
State | Superseded
Series | SVE feature for arm guests
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu <Luca.Fancellu@arm.com> wrote:
>
> Save/restore context switch for SVE, allocate memory to contain
> the Z0-31 registers whose length is maximum 2048 bits each and
> FFR who can be maximum 256 bits, the allocated memory depends on
> how many bits is the vector length for the domain and how many bits
> are supported by the platform.
>
> Save P0-15 whose length is maximum 256 bits each, in this case the
> memory used is from the fpregs field in struct vfp_state,
> because V0-31 are part of Z0-31 and this space would have been
> unused for SVE domain otherwise.
>
> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
> creation given the requested vector length and restore it on
> context switch, save/restore ZCR_EL1 value as well.
>
> List import macros from Linux in README.LinuxPrimitives.
>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>

Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

Just ...

> ---
> Changes from v6:
>  - Add comment for explain why sve_save/sve_load are different from
>    Linux, add macros in xen/arch/arm/README.LinuxPrimitives (Julien)
>  - Add comments in sve_context_init and sve_context_free, handle the
>    case where sve_zreg_ctx_end is NULL, move setting of v->arch.zcr_el2
>    in sve_context_init (Julien)
>  - remove stubs for sve_context_* and sve_save_* and rely on compiler
>    DCE (Jan)
>  - Add comments for sve_save_ctx/sve_load_ctx (Julien)
> Changes from v5:
>  - use XFREE instead of xfree, keep the headers (Julien)
>  - Avoid math computation for every save/restore, store the computation
>    in struct vfp_state once (Bertrand)
>  - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>    that sve_vl is available only on arm64
> Changes from v4:
>  - No changes
> Changes from v3:
>  - don't use fixed len types when not needed (Jan)
>  - now VL is an encoded value, decode it before using.
> Changes from v2:
>  - No changes
> Changes from v1:
>  - No changes
> Changes from RFC:
>  - Moved zcr_el2 field introduction in this patch, restore its
>    content inside sve_restore_state function. (Julien)
>
> fix patch 5
>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> Change-Id: Ief65b2ff14fd579afa4fd110ce08a19980e64fa9

You have a signed off and a change-id that should not be here.
They are in the comment section so should be removed during push so might be ok :-)

Cheers
Bertrand
> On 24 May 2023, at 10:47, Bertrand Marquis <Bertrand.Marquis@arm.com> wrote:
>
> Hi Luca,
>
>> On 23 May 2023, at 09:43, Luca Fancellu <Luca.Fancellu@arm.com> wrote:
>>
>> Save/restore context switch for SVE, allocate memory to contain
>> the Z0-31 registers whose length is maximum 2048 bits each and
>> FFR who can be maximum 256 bits, the allocated memory depends on
>> how many bits is the vector length for the domain and how many bits
>> are supported by the platform.
>>
>> Save P0-15 whose length is maximum 256 bits each, in this case the
>> memory used is from the fpregs field in struct vfp_state,
>> because V0-31 are part of Z0-31 and this space would have been
>> unused for SVE domain otherwise.
>>
>> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
>> creation given the requested vector length and restore it on
>> context switch, save/restore ZCR_EL1 value as well.
>>
>> List import macros from Linux in README.LinuxPrimitives.
>>
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
>
> Just ...
>
>> ---
>> Changes from v6:
>>  - Add comment for explain why sve_save/sve_load are different from
>>    Linux, add macros in xen/arch/arm/README.LinuxPrimitives (Julien)
>>  - Add comments in sve_context_init and sve_context_free, handle the
>>    case where sve_zreg_ctx_end is NULL, move setting of v->arch.zcr_el2
>>    in sve_context_init (Julien)
>>  - remove stubs for sve_context_* and sve_save_* and rely on compiler
>>    DCE (Jan)
>>  - Add comments for sve_save_ctx/sve_load_ctx (Julien)
>> Changes from v5:
>>  - use XFREE instead of xfree, keep the headers (Julien)
>>  - Avoid math computation for every save/restore, store the computation
>>    in struct vfp_state once (Bertrand)
>>  - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>>    that sve_vl is available only on arm64
>> Changes from v4:
>>  - No changes
>> Changes from v3:
>>  - don't use fixed len types when not needed (Jan)
>>  - now VL is an encoded value, decode it before using.
>> Changes from v2:
>>  - No changes
>> Changes from v1:
>>  - No changes
>> Changes from RFC:
>>  - Moved zcr_el2 field introduction in this patch, restore its
>>    content inside sve_restore_state function. (Julien)
>>
>> fix patch 5
>>
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>> Change-Id: Ief65b2ff14fd579afa4fd110ce08a19980e64fa9
>
> You have a signed off and a change-id that should not be here.
> They are in the comment section so should be removed during push so might be ok :-)

Ohh yeah I missed that, probably it’s from a squash!

>
> Cheers
> Bertrand
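The sizing described in the commit message above is easy to sanity-check: each of the 32 Z registers needs VL/8 bytes, FFR needs VL/64 bytes, and the 16 P registers fit exactly into the existing 512-byte fpregs array at the maximum VL. A standalone sketch of that arithmetic (the formulas mirror the patch's sve_zreg_ctx_size()/sve_ffrreg_ctx_size() helpers; the main() framing is purely illustrative):

#include <stdint.h>
#include <stdio.h>

/* 32 Z registers of VL/8 bytes each (VL is the vector length in bits) */
static unsigned int zreg_ctx_size(unsigned int vl)
{
    return (vl / 8U) * 32U;
}

/* FFR holds one bit per vector byte: VL/8 bits, i.e. VL/64 bytes */
static unsigned int ffrreg_ctx_size(unsigned int vl)
{
    return vl / 64U;
}

int main(void)
{
    unsigned int vl = 2048U; /* maximum architectural vector length */

    /* 8192 bytes of Z registers plus 32 bytes (256 bits) of FFR */
    printf("Z0-Z31: %u bytes, FFR: %u bytes\n",
           zreg_ctx_size(vl), ffrreg_ctx_size(vl));

    /*
     * P0-P15 are VL/64 bytes each: 16 * 32 = 512 bytes at the maximum VL,
     * which is exactly sizeof(uint64_t[64]), the existing fpregs field.
     */
    printf("P0-P15: %u bytes\n", 16U * (vl / 64U));

    return 0;
}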
Hi Luca,

On 23/05/2023 08:43, Luca Fancellu wrote:
> +int sve_context_init(struct vcpu *v)
> +{
> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
> +    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
> +                             sve_ffrreg_ctx_size(sve_vl_bits),
> +                             L1_CACHE_BYTES);
> +
> +    if ( !ctx )
> +        return -ENOMEM;
> +
> +    /*
> +     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept in
> +     * sync with sve_context_free()

Nit: Missing a full stop.

> +     */
> +    v->arch.vfp.sve_zreg_ctx_end = ctx +
> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
> +
> +    v->arch.zcr_el2 = vl_to_zcr(sve_vl_bits);
> +
> +    return 0;
> +}
> +
> +void sve_context_free(struct vcpu *v)
> +{
> +    unsigned int sve_vl_bits;
> +
> +    if ( !v->arch.vfp.sve_zreg_ctx_end )
> +        return;
> +
> +    sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
> +
> +    /*
> +     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept
> +     * in sync with sve_context_init()
> +     */

The spacing looks a bit odd in this comment. Did you miss an extra space?

Also, I notice this comment is the exact same as the one on top of
sve_context_init(). I think this is a bit misleading because the logic is
different. I would suggest the following:

"Currently points to the end of Z0-Z31 memory which is not the start of the
buffer. To be kept in sync with the sve_context_init()."

Lastly, nit: Missing a full stop.

> +    v->arch.vfp.sve_zreg_ctx_end -=
> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
> +
> +    XFREE(v->arch.vfp.sve_zreg_ctx_end);
> +}
> +

[...]

> diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
> index e6e8c363bc16..4aa371e85d26 100644
> --- a/xen/arch/arm/include/asm/arm64/vfp.h
> +++ b/xen/arch/arm/include/asm/arm64/vfp.h
> @@ -6,7 +6,19 @@
> 
>  struct vfp_state
>  {
> +    /*
> +     * When SVE is enabled for the guest, fpregs memory will be used to
> +     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
> +     * registers.
> +     */
>      uint64_t fpregs[64] __vfp_aligned;
> +    /*
> +     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
> +     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
> +     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
> +     * like that to ease the save/restore assembly operations.
> +     */
> +    uint64_t *sve_zreg_ctx_end;

Sorry I only noticed now. But shouldn't this be protected with #ifdef CONFIG_SVE? Same...

>      register_t fpcr;
>      register_t fpexc32_el2;
>      register_t fpsr;
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 331da0f3bcc3..814652d92568 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -195,6 +195,8 @@ struct arch_vcpu
>      register_t tpidrro_el0;
> 
>      /* HYP configuration */
> +    register_t zcr_el1;
> +    register_t zcr_el2;

... here.

>      register_t cptr_el2;
>      register_t hcr_el2;
>      register_t mdcr_el2;

Cheers,
> On 25 May 2023, at 10:09, Julien Grall <julien@xen.org> wrote:
>
> Hi Luca,
>
> On 23/05/2023 08:43, Luca Fancellu wrote:
>> +int sve_context_init(struct vcpu *v)
>> +{
>> +    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>> +    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
>> +                             sve_ffrreg_ctx_size(sve_vl_bits),
>> +                             L1_CACHE_BYTES);
>> +
>> +    if ( !ctx )
>> +        return -ENOMEM;
>> +
>> +    /*
>> +     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept in
>> +     * sync with sve_context_free()
>
> Nit: Missing a full stop.

I’ll fix

>
>> +     */
>> +    v->arch.vfp.sve_zreg_ctx_end = ctx +
>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
>> +
>> +    v->arch.zcr_el2 = vl_to_zcr(sve_vl_bits);
>> +
>> +    return 0;
>> +}
>> +
>> +void sve_context_free(struct vcpu *v)
>> +{
>> +    unsigned int sve_vl_bits;
>> +
>> +    if ( !v->arch.vfp.sve_zreg_ctx_end )
>> +        return;
>> +
>> +    sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
>> +
>> +    /*
>> +     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept
>> +     * in sync with sve_context_init()
>> +     */
>
> The spacing looks a bit odd in this comment. Did you miss an extra space?
>
> Also, I notice this comment is the exact same as the one on top of
> sve_context_init(). I think this is a bit misleading because the logic is
> different. I would suggest the following:
>
> "Currently points to the end of Z0-Z31 memory which is not the start of the
> buffer. To be kept in sync with the sve_context_init()."
>
> Lastly, nit: Missing a full stop.

Ok I’ll change it

>
>> +    v->arch.vfp.sve_zreg_ctx_end -=
>> +        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
>> +
>> +    XFREE(v->arch.vfp.sve_zreg_ctx_end);
>> +}
>> +
>
> [...]
>
>> diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
>> index e6e8c363bc16..4aa371e85d26 100644
>> --- a/xen/arch/arm/include/asm/arm64/vfp.h
>> +++ b/xen/arch/arm/include/asm/arm64/vfp.h
>> @@ -6,7 +6,19 @@
>> 
>>  struct vfp_state
>>  {
>> +    /*
>> +     * When SVE is enabled for the guest, fpregs memory will be used to
>> +     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
>> +     * registers.
>> +     */
>>      uint64_t fpregs[64] __vfp_aligned;
>> +    /*
>> +     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
>> +     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
>> +     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
>> +     * like that to ease the save/restore assembly operations.
>> +     */
>> +    uint64_t *sve_zreg_ctx_end;
>
> Sorry I only noticed now. But shouldn't this be protected with #ifdef CONFIG_SVE? Same...
>
>>      register_t fpcr;
>>      register_t fpexc32_el2;
>>      register_t fpsr;
>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>> index 331da0f3bcc3..814652d92568 100644
>> --- a/xen/arch/arm/include/asm/domain.h
>> +++ b/xen/arch/arm/include/asm/domain.h
>> @@ -195,6 +195,8 @@ struct arch_vcpu
>>      register_t tpidrro_el0;
>> 
>>      /* HYP configuration */
>> +    register_t zcr_el1;
>> +    register_t zcr_el2;
>
> ... here.

Sure I can protect them. It was done on purpose before to avoid ifdefs but I think saving space
is better here and also there won’t be any use of them when the config is off.

>
>>      register_t cptr_el2;
>>      register_t hcr_el2;
>>      register_t mdcr_el2;
>
> Cheers,
>
> --
> Julien Grall
Hi Luca,

On 25/05/2023 11:01, Luca Fancellu wrote:
>> On 25 May 2023, at 10:09, Julien Grall <julien@xen.org> wrote:
>>> diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
>>> index e6e8c363bc16..4aa371e85d26 100644
>>> --- a/xen/arch/arm/include/asm/arm64/vfp.h
>>> +++ b/xen/arch/arm/include/asm/arm64/vfp.h
>>> @@ -6,7 +6,19 @@
>>> 
>>>  struct vfp_state
>>>  {
>>> +    /*
>>> +     * When SVE is enabled for the guest, fpregs memory will be used to
>>> +     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
>>> +     * registers.
>>> +     */
>>>      uint64_t fpregs[64] __vfp_aligned;
>>> +    /*
>>> +     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
>>> +     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
>>> +     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
>>> +     * like that to ease the save/restore assembly operations.
>>> +     */
>>> +    uint64_t *sve_zreg_ctx_end;
>>
>> Sorry I only noticed now. But shouldn't this be protected with #ifdef CONFIG_SVE? Same...
>>
>>>      register_t fpcr;
>>>      register_t fpexc32_el2;
>>>      register_t fpsr;
>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>> index 331da0f3bcc3..814652d92568 100644
>>> --- a/xen/arch/arm/include/asm/domain.h
>>> +++ b/xen/arch/arm/include/asm/domain.h
>>> @@ -195,6 +195,8 @@ struct arch_vcpu
>>>      register_t tpidrro_el0;
>>> 
>>>      /* HYP configuration */
>>> +    register_t zcr_el1;
>>> +    register_t zcr_el2;
>>
>> ... here.
>
> Sure I can protect them. It was done on purpose before to avoid ifdefs but I think saving space
> is better here and also there won’t be any use of them when the config is off.

I wasn't thinking about saving space. I was more thinking about catching any
(mis)use of the fields in common code. With the #ifdef, the compilation would
fail.

Cheers,
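For illustration, a self-contained sketch of the guarding Julien is suggesting, using the series' existing CONFIG_ARM64_SVE Kconfig symbol (Julien abbreviates it as CONFIG_SVE; the typedef and alignment macro below are stand-ins for Xen's own definitions so the fragment compiles on its own):

#include <stdint.h>

typedef uint64_t register_t;                 /* stand-in for Xen's register_t */
#define __vfp_aligned __attribute__((aligned(16)))
#define CONFIG_ARM64_SVE                     /* normally set by Kconfig */

struct vfp_state
{
    uint64_t fpregs[64] __vfp_aligned;
#ifdef CONFIG_ARM64_SVE
    /*
     * Only present in SVE-capable builds: any common code touching this
     * field without its own ifdef now fails to compile when SVE is off,
     * which is the compile-time (mis)use check Julien describes.
     */
    uint64_t *sve_zreg_ctx_end;
#endif
    register_t fpcr;
    register_t fpexc32_el2;
    register_t fpsr;
};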
diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives
index 76c8df29e416..301c0271bbe4 100644
--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -69,7 +69,9 @@ SVE assembly macro: last sync @ v6.3.0 (last commit: 457391b03803)
 linux/arch/arm64/include/asm/fpsimdmacros.h xen/arch/arm/include/asm/arm64/sve-asm.S
 
 The following macros were taken from Linux:
-_check_general_reg, _check_num, _sve_rdvl
+_check_general_reg, _check_num, _sve_rdvl, __for, _for, _sve_check_zreg,
+_sve_check_preg, _sve_str_v, _sve_ldr_v, _sve_str_p, _sve_ldr_p, _sve_rdffr,
+_sve_wrffr
 
 =====================================================================
 arm32
diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
index 4d1549344733..59dbefbbb252 100644
--- a/xen/arch/arm/arm64/sve-asm.S
+++ b/xen/arch/arm/arm64/sve-asm.S
@@ -17,6 +17,18 @@
     .endif
 .endm
 
+.macro _sve_check_zreg znr
+    .if (\znr) < 0 || (\znr) > 31
+        .error "Bad Scalable Vector Extension vector register number \znr."
+    .endif
+.endm
+
+.macro _sve_check_preg pnr
+    .if (\pnr) < 0 || (\pnr) > 15
+        .error "Bad Scalable Vector Extension predicate register number \pnr."
+    .endif
+.endm
+
 .macro _check_num n, min, max
     .if (\n) < (\min) || (\n) > (\max)
         .error "Number \n out of range [\min,\max]"
@@ -26,6 +38,54 @@
 /* SVE instruction encodings for non-SVE-capable assemblers */
 /* (pre binutils 2.28, all kernel capable clang versions support SVE) */
 
+/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5804000                  \
+          | (\nz)                     \
+          | ((\nxbase) << 5)          \
+          | (((\offset) & 7) << 10)   \
+          | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_v nz, nxbase, offset=0
+    _sve_check_zreg \nz
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85804000                  \
+          | (\nz)                     \
+          | ((\nxbase) << 5)          \
+          | (((\offset) & 7) << 10)   \
+          | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_str_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0xe5800000                  \
+          | (\np)                     \
+          | ((\nxbase) << 5)          \
+          | (((\offset) & 7) << 10)   \
+          | (((\offset) & 0x1f8) << 13)
+.endm
+
+/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
+.macro _sve_ldr_p np, nxbase, offset=0
+    _sve_check_preg \np
+    _check_general_reg \nxbase
+    _check_num (\offset), -0x100, 0xff
+    .inst 0x85800000                  \
+          | (\np)                     \
+          | ((\nxbase) << 5)          \
+          | (((\offset) & 7) << 10)   \
+          | (((\offset) & 0x1f8) << 13)
+.endm
+
 /* RDVL X\nx, #\imm */
 .macro _sve_rdvl nx, imm
     _check_general_reg \nx
@@ -35,11 +95,98 @@
           | (((\imm) & 0x3f) << 5)
 .endm
 
+/* RDFFR (unpredicated): RDFFR P\np.B */
+.macro _sve_rdffr np
+    _sve_check_preg \np
+    .inst 0x2519f000 \
+          | (\np)
+.endm
+
+/* WRFFR P\np.B */
+.macro _sve_wrffr np
+    _sve_check_preg \np
+    .inst 0x25289000 \
+          | ((\np) << 5)
+.endm
+
+.macro __for from:req, to:req
+    .if (\from) == (\to)
+        _for__body %\from
+    .else
+        __for %\from, %((\from) + ((\to) - (\from)) / 2)
+        __for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
+    .endif
+.endm
+
+.macro _for var:req, from:req, to:req, insn:vararg
+    .macro _for__body \var:req
+        .noaltmacro
+        \insn
+        .altmacro
+    .endm
+
+    .altmacro
+    __for \from, \to
+    .noaltmacro
+
+    .purgem _for__body
+.endm
+
+/*
+ * sve_save and sve_load are different from the Linux version because the
+ * buffers to save the context are different from Xen and for example Linux
+ * is using this macro to save/restore also fpsr and fpcr while we do it in C
+ */
+.macro sve_save nxzffrctx, nxpctx, save_ffr
+    _for n, 0, 31, _sve_str_v \n, \nxzffrctx, \n - 32
+    _for n, 0, 15, _sve_str_p \n, \nxpctx, \n
+    cbz \save_ffr, 1f
+    _sve_rdffr 0
+    _sve_str_p 0, \nxzffrctx
+    _sve_ldr_p 0, \nxpctx
+    b 2f
+1:
+    str xzr, [x\nxzffrctx]      // Zero out FFR
+2:
+.endm
+
+.macro sve_load nxzffrctx, nxpctx, restore_ffr
+    _for n, 0, 31, _sve_ldr_v \n, \nxzffrctx, \n - 32
+    cbz \restore_ffr, 1f
+    _sve_ldr_p 0, \nxzffrctx
+    _sve_wrffr 0
+1:
+    _for n, 0, 15, _sve_ldr_p \n, \nxpctx, \n
+.endm
+
 /* Gets the current vector register size in bytes */
 GLOBAL(sve_get_hw_vl)
     _sve_rdvl 0, 1
     ret
 
+/*
+ * Save the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Save FFR if non-zero
+ */
+GLOBAL(sve_save_ctx)
+    sve_save 0, 1, x2
+    ret
+
+/*
+ * Load the SVE context
+ *
+ * x0 - pointer to buffer for Z0-31 + FFR
+ * x1 - pointer to buffer for P0-15
+ * x2 - Restore FFR if non-zero
+ */
+GLOBAL(sve_load_ctx)
+    sve_load 0, 1, x2
+    ret
+
 /*
  * Local variables:
  * mode: ASM
diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
index a9144e48ef6b..84a6dedc1fd7 100644
--- a/xen/arch/arm/arm64/sve.c
+++ b/xen/arch/arm/arm64/sve.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2022 ARM Ltd.
  */
 
+#include <xen/sizes.h>
 #include <xen/types.h>
 #include <asm/arm64/sve.h>
 #include <asm/arm64/sysregs.h>
@@ -14,6 +15,25 @@
 
 extern unsigned int sve_get_hw_vl(void);
 
+/*
+ * Save the SVE context
+ *
+ * sve_ctx - pointer to buffer for Z0-31 + FFR
+ * pregs - pointer to buffer for P0-15
+ * save_ffr - Save FFR if non-zero
+ */
+extern void sve_save_ctx(uint64_t *sve_ctx, uint64_t *pregs, int save_ffr);
+
+/*
+ * Load the SVE context
+ *
+ * sve_ctx - pointer to buffer for Z0-31 + FFR
+ * pregs - pointer to buffer for P0-15
+ * restore_ffr - Restore FFR if non-zero
+ */
+extern void sve_load_ctx(uint64_t const *sve_ctx, uint64_t const *pregs,
+                         int restore_ffr);
+
 /* Takes a vector length in bits and returns the ZCR_ELx encoding */
 static inline register_t vl_to_zcr(unsigned int vl)
 {
@@ -21,6 +41,21 @@ static inline register_t vl_to_zcr(unsigned int vl)
     return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
 }
 
+static inline unsigned int sve_zreg_ctx_size(unsigned int vl)
+{
+    /*
+     * Z0-31 registers size in bytes is computed from VL that is in bits, so VL
+     * in bytes is VL/8.
+     */
+    return (vl / 8U) * 32U;
+}
+
+static inline unsigned int sve_ffrreg_ctx_size(unsigned int vl)
+{
+    /* FFR register size is VL/8, which is in bytes (VL/8)/8 */
+    return (vl / 64U);
+}
+
 register_t compute_max_zcr(void)
 {
     register_t cptr_bits = get_default_cptr_flags();
@@ -61,6 +96,62 @@ unsigned int get_sys_vl_len(void)
            SVE_VL_MULTIPLE_VAL;
 }
 
+int sve_context_init(struct vcpu *v)
+{
+    unsigned int sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
+    uint64_t *ctx = _xzalloc(sve_zreg_ctx_size(sve_vl_bits) +
+                             sve_ffrreg_ctx_size(sve_vl_bits),
+                             L1_CACHE_BYTES);
+
+    if ( !ctx )
+        return -ENOMEM;
+
+    /*
+     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept in
+     * sync with sve_context_free()
+     */
+    v->arch.vfp.sve_zreg_ctx_end = ctx +
+        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
+
+    v->arch.zcr_el2 = vl_to_zcr(sve_vl_bits);
+
+    return 0;
+}
+
+void sve_context_free(struct vcpu *v)
+{
+    unsigned int sve_vl_bits;
+
+    if ( !v->arch.vfp.sve_zreg_ctx_end )
+        return;
+
+    sve_vl_bits = sve_decode_vl(v->domain->arch.sve_vl);
+
+    /*
+     * Point to the end of Z0-Z31 memory, just before FFR memory, to be kept
+     * in sync with sve_context_init()
+     */
+    v->arch.vfp.sve_zreg_ctx_end -=
+        (sve_zreg_ctx_size(sve_vl_bits) / sizeof(uint64_t));
+
+    XFREE(v->arch.vfp.sve_zreg_ctx_end);
+}
+
+void sve_save_state(struct vcpu *v)
+{
+    v->arch.zcr_el1 = READ_SYSREG(ZCR_EL1);
+
+    sve_save_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
+}
+
+void sve_restore_state(struct vcpu *v)
+{
+    WRITE_SYSREG(v->arch.zcr_el1, ZCR_EL1);
+    WRITE_SYSREG(v->arch.zcr_el2, ZCR_EL2);
+
+    sve_load_ctx(v->arch.vfp.sve_zreg_ctx_end, v->arch.vfp.fpregs, 1);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/arm64/vfp.c b/xen/arch/arm/arm64/vfp.c
index 47885e76baae..2d0d7c2e6ddb 100644
--- a/xen/arch/arm/arm64/vfp.c
+++ b/xen/arch/arm/arm64/vfp.c
@@ -2,29 +2,35 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 #include <asm/vfp.h>
+#include <asm/arm64/sve.h>
 
 void vfp_save_state(struct vcpu *v)
 {
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
-                 "stp q2, q3, [%1, #16 * 2]\n\t"
-                 "stp q4, q5, [%1, #16 * 4]\n\t"
-                 "stp q6, q7, [%1, #16 * 6]\n\t"
-                 "stp q8, q9, [%1, #16 * 8]\n\t"
-                 "stp q10, q11, [%1, #16 * 10]\n\t"
-                 "stp q12, q13, [%1, #16 * 12]\n\t"
-                 "stp q14, q15, [%1, #16 * 14]\n\t"
-                 "stp q16, q17, [%1, #16 * 16]\n\t"
-                 "stp q18, q19, [%1, #16 * 18]\n\t"
-                 "stp q20, q21, [%1, #16 * 20]\n\t"
-                 "stp q22, q23, [%1, #16 * 22]\n\t"
-                 "stp q24, q25, [%1, #16 * 24]\n\t"
-                 "stp q26, q27, [%1, #16 * 26]\n\t"
-                 "stp q28, q29, [%1, #16 * 28]\n\t"
-                 "stp q30, q31, [%1, #16 * 30]\n\t"
-                 : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_save_state(v);
+    else
+    {
+        asm volatile("stp q0, q1, [%1, #16 * 0]\n\t"
+                     "stp q2, q3, [%1, #16 * 2]\n\t"
+                     "stp q4, q5, [%1, #16 * 4]\n\t"
+                     "stp q6, q7, [%1, #16 * 6]\n\t"
+                     "stp q8, q9, [%1, #16 * 8]\n\t"
+                     "stp q10, q11, [%1, #16 * 10]\n\t"
+                     "stp q12, q13, [%1, #16 * 12]\n\t"
+                     "stp q14, q15, [%1, #16 * 14]\n\t"
+                     "stp q16, q17, [%1, #16 * 16]\n\t"
+                     "stp q18, q19, [%1, #16 * 18]\n\t"
+                     "stp q20, q21, [%1, #16 * 20]\n\t"
+                     "stp q22, q23, [%1, #16 * 22]\n\t"
+                     "stp q24, q25, [%1, #16 * 24]\n\t"
+                     "stp q26, q27, [%1, #16 * 26]\n\t"
+                     "stp q28, q29, [%1, #16 * 28]\n\t"
+                     "stp q30, q31, [%1, #16 * 30]\n\t"
+                     : "=Q" (*v->arch.vfp.fpregs) : "r" (v->arch.vfp.fpregs));
+    }
 
     v->arch.vfp.fpsr = READ_SYSREG(FPSR);
     v->arch.vfp.fpcr = READ_SYSREG(FPCR);
@@ -37,23 +43,28 @@ void vfp_restore_state(struct vcpu *v)
     if ( !cpu_has_fp )
         return;
 
-    asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
-                 "ldp q2, q3, [%1, #16 * 2]\n\t"
-                 "ldp q4, q5, [%1, #16 * 4]\n\t"
-                 "ldp q6, q7, [%1, #16 * 6]\n\t"
-                 "ldp q8, q9, [%1, #16 * 8]\n\t"
-                 "ldp q10, q11, [%1, #16 * 10]\n\t"
-                 "ldp q12, q13, [%1, #16 * 12]\n\t"
-                 "ldp q14, q15, [%1, #16 * 14]\n\t"
-                 "ldp q16, q17, [%1, #16 * 16]\n\t"
-                 "ldp q18, q19, [%1, #16 * 18]\n\t"
-                 "ldp q20, q21, [%1, #16 * 20]\n\t"
-                 "ldp q22, q23, [%1, #16 * 22]\n\t"
-                 "ldp q24, q25, [%1, #16 * 24]\n\t"
-                 "ldp q26, q27, [%1, #16 * 26]\n\t"
-                 "ldp q28, q29, [%1, #16 * 28]\n\t"
-                 "ldp q30, q31, [%1, #16 * 30]\n\t"
-                 : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    if ( is_sve_domain(v->domain) )
+        sve_restore_state(v);
+    else
+    {
+        asm volatile("ldp q0, q1, [%1, #16 * 0]\n\t"
+                     "ldp q2, q3, [%1, #16 * 2]\n\t"
+                     "ldp q4, q5, [%1, #16 * 4]\n\t"
+                     "ldp q6, q7, [%1, #16 * 6]\n\t"
+                     "ldp q8, q9, [%1, #16 * 8]\n\t"
+                     "ldp q10, q11, [%1, #16 * 10]\n\t"
+                     "ldp q12, q13, [%1, #16 * 12]\n\t"
+                     "ldp q14, q15, [%1, #16 * 14]\n\t"
+                     "ldp q16, q17, [%1, #16 * 16]\n\t"
+                     "ldp q18, q19, [%1, #16 * 18]\n\t"
+                     "ldp q20, q21, [%1, #16 * 20]\n\t"
+                     "ldp q22, q23, [%1, #16 * 22]\n\t"
+                     "ldp q24, q25, [%1, #16 * 24]\n\t"
+                     "ldp q26, q27, [%1, #16 * 26]\n\t"
+                     "ldp q28, q29, [%1, #16 * 28]\n\t"
+                     "ldp q30, q31, [%1, #16 * 30]\n\t"
+                     : : "Q" (*v->arch.vfp.fpregs), "r" (v->arch.vfp.fpregs));
+    }
 
     WRITE_SYSREG(v->arch.vfp.fpsr, FPSR);
     WRITE_SYSREG(v->arch.vfp.fpcr, FPCR);
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 6c22551b0ed2..add9929b7943 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -557,7 +557,11 @@ int arch_vcpu_create(struct vcpu *v)
     v->arch.cptr_el2 = get_default_cptr_flags();
     if ( is_sve_domain(v->domain) )
+    {
+        if ( (rc = sve_context_init(v)) != 0 )
+            goto fail;
         v->arch.cptr_el2 &= ~HCPTR_CP(8);
+    }
 
     v->arch.hcr_el2 = get_default_hcr_flags();
 
@@ -587,6 +591,8 @@ fail:
 
 void arch_vcpu_destroy(struct vcpu *v)
 {
+    if ( is_sve_domain(v->domain) )
+        sve_context_free(v);
     vcpu_timer_destroy(v);
     vcpu_vgic_free(v);
     free_xenheap_pages(v->arch.stack, STACK_ORDER);
diff --git a/xen/arch/arm/include/asm/arm64/sve.h b/xen/arch/arm/include/asm/arm64/sve.h
index 4b63412727fc..65b46685d263 100644
--- a/xen/arch/arm/include/asm/arm64/sve.h
+++ b/xen/arch/arm/include/asm/arm64/sve.h
@@ -22,6 +22,10 @@ static inline unsigned int sve_decode_vl(unsigned int sve_vl)
 }
 
 register_t compute_max_zcr(void);
+int sve_context_init(struct vcpu *v);
+void sve_context_free(struct vcpu *v);
+void sve_save_state(struct vcpu *v);
+void sve_restore_state(struct vcpu *v);
 
 #ifdef CONFIG_ARM64_SVE
diff --git a/xen/arch/arm/include/asm/arm64/sysregs.h b/xen/arch/arm/include/asm/arm64/sysregs.h
index 4cabb9eb4d5e..3fdeb9d8cdef 100644
--- a/xen/arch/arm/include/asm/arm64/sysregs.h
+++ b/xen/arch/arm/include/asm/arm64/sysregs.h
@@ -88,6 +88,9 @@
 #ifndef ID_AA64ISAR2_EL1
 #define ID_AA64ISAR2_EL1            S3_0_C0_C6_2
 #endif
+#ifndef ZCR_EL1
+#define ZCR_EL1                     S3_0_C1_C2_0
+#endif
 
 /* ID registers (imported from arm64/include/asm/sysreg.h in Linux) */
diff --git a/xen/arch/arm/include/asm/arm64/vfp.h b/xen/arch/arm/include/asm/arm64/vfp.h
index e6e8c363bc16..4aa371e85d26 100644
--- a/xen/arch/arm/include/asm/arm64/vfp.h
+++ b/xen/arch/arm/include/asm/arm64/vfp.h
@@ -6,7 +6,19 @@
 
 struct vfp_state
 {
+    /*
+     * When SVE is enabled for the guest, fpregs memory will be used to
+     * save/restore P0-P15 registers, otherwise it will be used for the V0-V31
+     * registers.
+     */
     uint64_t fpregs[64] __vfp_aligned;
+    /*
+     * When SVE is enabled for the guest, sve_zreg_ctx_end points to memory
+     * where Z0-Z31 registers and FFR can be saved/restored, it points at the
+     * end of the Z0-Z31 space and at the beginning of the FFR space, it's done
+     * like that to ease the save/restore assembly operations.
+     */
+    uint64_t *sve_zreg_ctx_end;
     register_t fpcr;
     register_t fpexc32_el2;
     register_t fpsr;
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 331da0f3bcc3..814652d92568 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -195,6 +195,8 @@ struct arch_vcpu
     register_t tpidrro_el0;
 
     /* HYP configuration */
+    register_t zcr_el1;
+    register_t zcr_el2;
     register_t cptr_el2;
     register_t hcr_el2;
     register_t mdcr_el2;
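As a footnote to the allocation scheme above, a minimal userspace sketch of the end-pointer bookkeeping in sve_context_init()/sve_context_free() (calloc/free stand in for Xen's _xzalloc/XFREE; sizes shown for the maximum 2048-bit VL):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned int vl = 2048U;                 /* vector length in bits */
    size_t zbytes = (vl / 8U) * 32U;         /* Z0-Z31 area: 8192 bytes */
    size_t ffrbytes = vl / 64U;              /* FFR area: 32 bytes */
    uint64_t *ctx = calloc(1, zbytes + ffrbytes);
    uint64_t *zreg_ctx_end;

    if ( !ctx )
        return 1;

    /*
     * The stored pointer marks the Z/FFR boundary, not the start of the
     * buffer: the assembly addresses Z registers at negative offsets and
     * FFR at offset 0 from this single base.
     */
    zreg_ctx_end = ctx + zbytes / sizeof(uint64_t);
    printf("Z area: %zu bytes, FFR at offset %td\n",
           zbytes, (char *)zreg_ctx_end - (char *)ctx);

    /* Free path: rewind by the Z area size to recover the allocation */
    zreg_ctx_end -= zbytes / sizeof(uint64_t);
    free(zreg_ctx_end);

    return 0;
}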