Message ID | 20201021104611.2744565-2-qais.yousef@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add support for Asymmetric AArch32 systems |
On 2020-10-21 11:46, Qais Yousef wrote:
> On a system without uniform support for AArch32 at EL0, it is possible
> for the guest to force run AArch32 at EL0 and potentially cause an
> illegal exception if running on the wrong core.

s/the wrong core/a core without AArch32/

>
> Add an extra check to catch if the guest ever does that and prevent it

Not "if the guest ever does that". Rather "let's hope we are lucky enough
to catch the guest doing that".

> from running again by resetting vcpu->arch.target and return
> ARM_EXCEPTION_IL.
>
> We try to catch this misbehavior as early as possible and not rely on
> PSTATE.IL to occur.
>
> Tested on Juno by instrumenting the host to:
>
>   * Fake asym aarch32.
>   * Instrument KVM to make the asymmetry visible to the guest.
>
> Any attempt to run 32bit app in the guest will produce such error on
> qemu:

Not *any* attempt. Only the ones where the guest exits whilst in
AArch32 EL0. It is perfectly possible for the guest to use AArch32
undetected for quite a while.

>
>   # ./test
>   error: kvm run failed Invalid argument
>    PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
>   X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
>   X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
>   X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
>   X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
>   X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
>   X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
>   X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
>   X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
>   X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
>   X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
>   PSTATE=60000005 -ZC- EL1h
>   qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT

It'd be worth working out:
- why does this show an AArch64 mode if we caught the vcpu in AArch32?
- why does QEMU shout about the timer register?

>   Aborted
>
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> ---
>  arch/arm64/kvm/arm.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index b588c3b5c2f0..c2fa57f56a94 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -804,6 +804,19 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>
>  		preempt_enable();
>
> +		/*
> +		 * The ARMv8 architecture doesn't give the hypervisor
> +		 * a mechanism to prevent a guest from dropping to AArch32 EL0
> +		 * if implemented by the CPU. If we spot the guest in such
> +		 * state and that we decided it wasn't supposed to do so (like
> +		 * with the asymmetric AArch32 case), return to userspace with
> +		 * a fatal error.
> +		 */

Please add a comment explaining the effects of setting target to -1.
Something like:

"As we have caught the guest red-handed, decide that it isn't fit for
purpose anymore by making the vcpu invalid. The VMM can try and fix it
by issuing a KVM_ARM_VCPU_INIT if it really wants to."

> +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> +			vcpu->arch.target = -1;
> +			ret = ARM_EXCEPTION_IL;
> +		}
> +
>  		ret = handle_exit(vcpu, ret);
>  	}

         M.

On 10/21/20 13:02, Marc Zyngier wrote:
> On 2020-10-21 11:46, Qais Yousef wrote:
> > On a system without uniform support for AArch32 at EL0, it is possible
> > for the guest to force run AArch32 at EL0 and potentially cause an
> > illegal exception if running on the wrong core.
>
> s/the wrong core/a core without AArch32/
>
> >
> > Add an extra check to catch if the guest ever does that and prevent it
>
> Not "if the guest ever does that". Rather "let's hope we are lucky enough
> to catch the guest doing that".
>
> > from running again by resetting vcpu->arch.target and return
> > ARM_EXCEPTION_IL.
> >
> > We try to catch this misbehavior as early as possible and not rely on
> > PSTATE.IL to occur.
> >
> > Tested on Juno by instrumenting the host to:
> >
> >   * Fake asym aarch32.
> >   * Instrument KVM to make the asymmetry visible to the guest.
> >
> > Any attempt to run 32bit app in the guest will produce such error on
> > qemu:
>
> Not *any* attempt. Only the ones where the guest exits whilst in
> AArch32 EL0. It is perfectly possible for the guest to use AArch32
> undetected for quite a while.

Thanks Marc! I'll change them all.

> >
> >   # ./test
> >   error: kvm run failed Invalid argument
> >    PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
> >   X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
> >   X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
> >   X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
> >   X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
> >   X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
> >   X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
> >   X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
> >   X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
> >   X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
> >   X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
> >   PSTATE=60000005 -ZC- EL1h
> >   qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
>
> It'd be worth working out:
> - why does this show an AArch64 mode if we caught the vcpu in AArch32?
> - why does QEMU shout about the timer register?

/me puts a monocular on

Which bit is the AArch64?

It did surprise me that it is shouting about the timer. My guess was that
a timer interrupt at the guest between exit/reentry caused the state change
and the failure to read the timer register? Since the target is no longer
valid it falls over, hopefully as expected. I could have been naive of
course. That explanation made sense to my mind so I didn't dig further.

> >   Aborted
> >
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > ---
> >  arch/arm64/kvm/arm.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index b588c3b5c2f0..c2fa57f56a94 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -804,6 +804,19 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >
> >  		preempt_enable();
> >
> > +		/*
> > +		 * The ARMv8 architecture doesn't give the hypervisor
> > +		 * a mechanism to prevent a guest from dropping to AArch32 EL0
> > +		 * if implemented by the CPU. If we spot the guest in such
> > +		 * state and that we decided it wasn't supposed to do so (like
> > +		 * with the asymmetric AArch32 case), return to userspace with
> > +		 * a fatal error.
> > +		 */
>
> Please add a comment explaining the effects of setting target to -1.
> Something like:
>
> "As we have caught the guest red-handed, decide that it isn't fit for
> purpose anymore by making the vcpu invalid. The VMM can try and fix it
> by issuing a KVM_ARM_VCPU_INIT if it really wants to."

Will do.

Thanks

--
Qais Yousef

>
> > +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > +			vcpu->arch.target = -1;
> > +			ret = ARM_EXCEPTION_IL;
> > +		}
> > +
> >  		ret = handle_exit(vcpu, ret);
> >  	}
>
>         M.
> --
> Jazz is not dead. It just smells funny...

On 2020-10-21 14:35, Qais Yousef wrote:
> On 10/21/20 13:02, Marc Zyngier wrote:
>> On 2020-10-21 11:46, Qais Yousef wrote:
>> > On a system without uniform support for AArch32 at EL0, it is possible
>> > for the guest to force run AArch32 at EL0 and potentially cause an
>> > illegal exception if running on the wrong core.
>>
>> s/the wrong core/a core without AArch32/
>>
>> >
>> > Add an extra check to catch if the guest ever does that and prevent it
>>
>> Not "if the guest ever does that". Rather "let's hope we are lucky enough
>> to catch the guest doing that".
>>
>> > from running again by resetting vcpu->arch.target and return
>> > ARM_EXCEPTION_IL.
>> >
>> > We try to catch this misbehavior as early as possible and not rely on
>> > PSTATE.IL to occur.
>> >
>> > Tested on Juno by instrumenting the host to:
>> >
>> >   * Fake asym aarch32.
>> >   * Instrument KVM to make the asymmetry visible to the guest.
>> >
>> > Any attempt to run 32bit app in the guest will produce such error on
>> > qemu:
>>
>> Not *any* attempt. Only the ones where the guest exits whilst in
>> AArch32 EL0. It is perfectly possible for the guest to use AArch32
>> undetected for quite a while.
>
> Thanks Marc! I'll change them all.
>
>> >
>> >   # ./test
>> >   error: kvm run failed Invalid argument
>> >    PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
>> >   X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
>> >   X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
>> >   X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
>> >   X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
>> >   X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
>> >   X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
>> >   X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
>> >   X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
>> >   X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
>> >   X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
>> >   PSTATE=60000005 -ZC- EL1h
>> >   qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
>>
>> It'd be worth working out:
>> - why does this show an AArch64 mode if we caught the vcpu in AArch32?
>> - why does QEMU shout about the timer register?
>
> /me puts a monocular on
>
> Which bit is the AArch64?

It clearly spits out "EL1h", and PSTATE.M is 5, also consistent with EL1h.

> It did surprise me that it is shouting about the timer. My guess was that
> a timer interrupt at the guest between exit/reentry caused the state
> change and the failure to read the timer register? Since the target is no
> longer valid it falls over, hopefully as expected. I could have been naive
> of course. That explanation made sense to my mind so I didn't dig further.

Userspace is never involved with the timer interrupt, unless you've elected
to have the interrupt controller in userspace, which I seriously doubt.

As we are introducing a change to the userspace ABI, it'd be interesting
to find out what is happening here.

         M.

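The PSTATE reading above can be checked by hand. The following is a minimal sketch, not code from the kernel or QEMU: the helper names (`pstate_is_aarch32`, `pstate_describe`, and so on) are made up for illustration. It assumes the architectural layout of the saved PSTATE, where bits 31..28 are NZCV, bit 4 (M[4]) selects the execution state (0 = AArch64, 1 = AArch32), and for AArch64 M[3:2] is the exception level and M[0] selects SP_ELx ('h') versus SP_EL0 ('t').

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit 4 (M[4]) of the saved PSTATE: 0 = AArch64, 1 = AArch32. */
#define PSTATE_NRW_BIT (UINT32_C(1) << 4)

/* True if the saved PSTATE describes an AArch32 context. */
static bool pstate_is_aarch32(uint32_t pstate)
{
    return (pstate & PSTATE_NRW_BIT) != 0;
}

/* AArch64 only: exception level, taken from M[3:2]. */
static unsigned pstate_el(uint32_t pstate)
{
    return (pstate >> 2) & 0x3;
}

/* AArch64 only: M[0] set means SP_ELx ('h'), clear means SP_EL0 ('t'). */
static char pstate_sp_suffix(uint32_t pstate)
{
    return (pstate & 0x1) ? 'h' : 't';
}

/* Format NZCV (bits 31..28) in the style of the dumps, e.g. "-ZC-". */
static void pstate_flags(uint32_t pstate, char out[5])
{
    out[0] = (pstate & (UINT32_C(1) << 31)) ? 'N' : '-';
    out[1] = (pstate & (UINT32_C(1) << 30)) ? 'Z' : '-';
    out[2] = (pstate & (UINT32_C(1) << 29)) ? 'C' : '-';
    out[3] = (pstate & (UINT32_C(1) << 28)) ? 'V' : '-';
    out[4] = '\0';
}

/* Write a short description such as "-ZC- EL1h" into buf. */
static void pstate_describe(uint32_t pstate, char *buf, size_t len)
{
    char flags[5];

    pstate_flags(pstate, flags);
    if (pstate_is_aarch32(pstate))
        snprintf(buf, len, "%s AArch32 mode=0x%x", flags,
                 (unsigned)(pstate & 0xf));
    else
        snprintf(buf, len, "%s EL%u%c", flags, pstate_el(pstate),
                 pstate_sp_suffix(pstate));
}
```

Calling `pstate_describe(0x60000005, buf, sizeof(buf))` yields "-ZC- EL1h", matching the annotation in the dump: M[4] is clear, so the dumped state is AArch64, which is the inconsistency being questioned.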
On 10/21/20 14:51, Marc Zyngier wrote:
> On 2020-10-21 14:35, Qais Yousef wrote:
> > On 10/21/20 13:02, Marc Zyngier wrote:
> > > On 2020-10-21 11:46, Qais Yousef wrote:
> > > > On a system without uniform support for AArch32 at EL0, it is possible
> > > > for the guest to force run AArch32 at EL0 and potentially cause an
> > > > illegal exception if running on the wrong core.
> > >
> > > s/the wrong core/a core without AArch32/
> > >
> > > >
> > > > Add an extra check to catch if the guest ever does that and prevent it
> > >
> > > Not "if the guest ever does that". Rather "let's hope we are lucky enough
> > > to catch the guest doing that".
> > >
> > > > from running again by resetting vcpu->arch.target and return
> > > > ARM_EXCEPTION_IL.
> > > >
> > > > We try to catch this misbehavior as early as possible and not rely on
> > > > PSTATE.IL to occur.
> > > >
> > > > Tested on Juno by instrumenting the host to:
> > > >
> > > >   * Fake asym aarch32.
> > > >   * Instrument KVM to make the asymmetry visible to the guest.
> > > >
> > > > Any attempt to run 32bit app in the guest will produce such error on
> > > > qemu:
> > >
> > > Not *any* attempt. Only the ones where the guest exits whilst in
> > > AArch32 EL0. It is perfectly possible for the guest to use AArch32
> > > undetected for quite a while.
> >
> > Thanks Marc! I'll change them all.
> >
> > > >
> > > >   # ./test
> > > >   error: kvm run failed Invalid argument
> > > >    PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
> > > >   X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
> > > >   X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
> > > >   X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
> > > >   X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
> > > >   X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
> > > >   X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
> > > >   X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
> > > >   X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
> > > >   X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
> > > >   X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
> > > >   PSTATE=60000005 -ZC- EL1h
> > > >   qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
> > >
> > > It'd be worth working out:
> > > - why does this show an AArch64 mode if we caught the vcpu in AArch32?
> > > - why does QEMU shout about the timer register?
> >
> > /me puts a monocular on
> >
> > Which bit is the AArch64?
>
> It clearly spits out "EL1h", and PSTATE.M is 5, also consistent with EL1h.
>
> > It did surprise me that it is shouting about the timer. My guess was that
> > a timer interrupt at the guest between exit/reentry caused the state
> > change and the failure to read the timer register? Since the target is no
> > longer valid it falls over, hopefully as expected. I could have been naive
> > of course. That explanation made sense to my mind so I didn't dig further.
>
> Userspace is never involved with the timer interrupt, unless you've elected
> to have the interrupt controller in userspace, which I seriously doubt.
>
> As we are introducing a change to the userspace ABI, it'd be interesting
> to find out what is happening here.

Sure. Let me educate myself more about this and find a way to interrogate qemu
and KVM.

Thanks

--
Qais Yousef

Hi Marc

On 10/21/20 14:51, Marc Zyngier wrote:
> > > >
> > > >   # ./test
> > > >   error: kvm run failed Invalid argument
> > > >    PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
> > > >   X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
> > > >   X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
> > > >   X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
> > > >   X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
> > > >   X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
> > > >   X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
> > > >   X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
> > > >   X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
> > > >   X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
> > > >   X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
> > > >   PSTATE=60000005 -ZC- EL1h
> > > >   qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
> > >
> > > It'd be worth working out:
> > > - why does this show an AArch64 mode if we caught the vcpu in AArch32?
> > > - why does QEMU shout about the timer register?
> >
> > /me puts a monocular on
> >
> > Which bit is the AArch64?
>
> It clearly spits out "EL1h", and PSTATE.M is 5, also consistent with EL1h.

Apologies for the delay in looking into why reading the timer register fails.

Digging into the qemu 5.0.0 code, the error message is printed from
kvm_arm_get_virtual_time(), which in turn is called from
kvm_arm_vm_state_change(). The latter is a callback function that is called
when a vm starts/stops.

So the sequence of events is:

	VM runs 32bit apps
	host resets vcpu->arch.target to -1
	qemu::kvm_cpu_exec() hits -EINVAL error (somewhere I didn't trace)
	kvm_cpu_exec()::cpu_dump_state()
	kvm_cpu_exec()::vm_stop()
	  ..
	    kvm_arm_vm_state_change()
	      kvm_arm_get_virtual_time()
	        host returns -ENOEXEC
	        above error message is printed
	        abort()

I admit I didn't trace qemu to see what's going on inside it. I only analysed
the code statically.

To verify the theory I applied the following hack to hide the timer register
error:

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 5d2a1caf55a0..1c8fdf6566ea 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1113,7 +1113,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_GET_ONE_REG: {
 		struct kvm_one_reg reg;

-		r = -ENOEXEC;
+		r = 0;
 		if (unlikely(!kvm_vcpu_initialized(vcpu)))
 			break;

With that I see the following error from qemu, after which it seems to have
'hanged'. I can terminate qemu with the usual <ctrl-a>x. So it's not dead,
just the vm has exited I suppose and qemu went into monitor mode or something.

	error: kvm run failed Invalid argument
	 PC=ffff8000109ca100 X00=ffff800016ff5014 X01=ffff8000109ca0d8
	X02=ffff800016daae80 X03=0000000000000000 X04=0000000000000003
	X05=0000000000000000 X06=0000000000000000 X07=ffff800016e2bae0
	X08=00000000ffffffff X09=ffff8000109c4410 X10=0000000000000000
	X11=ffff8000164fb9c8 X12=ffff0000458ad186 X13=ffff0000458ad188
	X14=ffffffffffffffff X15=ffff000040268560 X16=0000000000000000
	X17=0000000000000001 X18=0000000000000000 X19=ffff0000458c0000
	X20=0000000000000000 X21=ffff0000458c0048 X22=ffff0000458c0008
	X23=ffff800016103a38 X24=0000000000000000 X25=ffff800016150a38
	X26=ffff800012a510d8 X27=ffff8000129504e0 X28=0000000000000000
	X29=ffff800016e2bac0 X30=ffff8000109c4410  SP=ffff000040268000
	PSTATE=834853a0 N--- EL0t

Which hopefully is what you expected to see in the first place.

Note that the qemu v4.1.0 code didn't have this kvm_arm_get_virtual_time()
function. It seems to be a relatively new addition.

Also note that kvm_cpu_exec() in qemu completely ignores ARM_EXCEPTION_IL; the
kvm_arch_handle_exit() for arm only catches KVM_EXIT_DEBUG and returns 0 for
everything else. So kvm_cpu_exec() will jump back into the loop to reenter the
guest. I haven't traced it but it seems to fail before calling:

	run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);

which, in this path, would return -ENOEXEC rather than -EINVAL for a vcpu
that fails kvm_vcpu_initialized().

So all in all, there is a lot of qemu-specific handling, and I'm not sure
there's any guarantee about how things will fail for different virtualization
software. But I think there's a guarantee that they will fail.

Thanks

--
Qais Yousef

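The error codes described in this message can be summarised with a small toy model. This is only a sketch under stated assumptions, not kernel or QEMU code: every `toy_*` name is hypothetical, the `-EINVAL` on first detection is inferred from the "Invalid argument" text in the log, and the `-ENOEXEC` returns follow the `kvm_vcpu_initialized()` checks quoted in the thread. It assumes, as the patch implies, that a negative target marks the vcpu as uninitialized.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the vcpu state discussed in this thread. */
struct toy_vcpu {
    int  target;          /* -1 means "not initialized", as in the patch */
    bool in_aarch32_el0;  /* what the guest was doing at its last exit */
};

/* Assumption: a vcpu counts as initialized when its target is >= 0. */
static bool toy_vcpu_initialized(const struct toy_vcpu *v)
{
    return v->target >= 0;
}

/* Models KVM_ARM_VCPU_INIT: gives the vcpu a valid target again. */
static int toy_vcpu_init(struct toy_vcpu *v, int target)
{
    if (target < 0)
        return -EINVAL;
    v->target = target;
    v->in_aarch32_el0 = false;
    return 0;
}

/*
 * Models one KVM_RUN call. An uninitialized vcpu is refused with
 * -ENOEXEC. If the host lacks uniform 32-bit EL0 support and the guest
 * exited in AArch32 EL0, the vcpu is invalidated and the run fails with
 * -EINVAL (the "Invalid argument" seen in the QEMU output).
 */
static int toy_vcpu_run(struct toy_vcpu *v, bool host_has_32bit_el0)
{
    if (!toy_vcpu_initialized(v))
        return -ENOEXEC;
    if (!host_has_32bit_el0 && v->in_aarch32_el0) {
        v->target = -1;
        return -EINVAL;
    }
    return 0;
}

/* Models KVM_GET_ONE_REG: fails with -ENOEXEC on an invalid vcpu. */
static int toy_get_one_reg(const struct toy_vcpu *v)
{
    return toy_vcpu_initialized(v) ? 0 : -ENOEXEC;
}
```

In this model, the first `toy_vcpu_run()` after the guest is caught returns `-EINVAL`, and every later run or register read returns `-ENOEXEC` until `toy_vcpu_init()` is called again. That ordering corresponds to the "kvm run failed Invalid argument" line followed by the failed KVM_REG_ARM_TIMER_CNT read.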
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index b588c3b5c2f0..c2fa57f56a94 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -804,6 +804,19 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)

 		preempt_enable();

+		/*
+		 * The ARMv8 architecture doesn't give the hypervisor
+		 * a mechanism to prevent a guest from dropping to AArch32 EL0
+		 * if implemented by the CPU. If we spot the guest in such
+		 * state and that we decided it wasn't supposed to do so (like
+		 * with the asymmetric AArch32 case), return to userspace with
+		 * a fatal error.
+		 */
+		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+			vcpu->arch.target = -1;
+			ret = ARM_EXCEPTION_IL;
+		}
+
 		ret = handle_exit(vcpu, ret);
 	}

On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on the wrong core.

Add an extra check to catch if the guest ever does that and prevent it
from running again by resetting vcpu->arch.target and return
ARM_EXCEPTION_IL.

We try to catch this misbehavior as early as possible and not rely on
PSTATE.IL to occur.

Tested on Juno by instrumenting the host to:

	* Fake asym aarch32.
	* Instrument KVM to make the asymmetry visible to the guest.

Any attempt to run 32bit app in the guest will produce such error on
qemu:

	# ./test
	error: kvm run failed Invalid argument
	 PC=ffff800010945080 X00=ffff800016a45014 X01=ffff800010945058
	X02=ffff800016917190 X03=0000000000000000 X04=0000000000000000
	X05=00000000fffffffb X06=0000000000000000 X07=ffff80001000bab0
	X08=0000000000000000 X09=0000000092ec5193 X10=0000000000000000
	X11=ffff80001608ff40 X12=ffff000075fcde86 X13=ffff000075fcde88
	X14=ffffffffffffffff X15=ffff00007b2105a8 X16=ffff00007b006d50
	X17=0000000000000000 X18=0000000000000000 X19=ffff00007a82b000
	X20=0000000000000000 X21=ffff800015ccd158 X22=ffff00007a82b040
	X23=ffff00007a82b008 X24=0000000000000000 X25=ffff800015d169b0
	X26=ffff8000126d05bc X27=0000000000000000 X28=0000000000000000
	X29=ffff80001000ba90 X30=ffff80001093f3dc  SP=ffff80001000ba90
	PSTATE=60000005 -ZC- EL1h
	qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
	Aborted

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 arch/arm64/kvm/arm.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)