Message ID | 20201229160059.64135-1-dbrazdil@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return | expand |
On Tue, Dec 29, 2020 at 04:00:59PM +0000, David Brazdil wrote: > The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should > not return, as dictated by the PSCI spec. However, there is firmware out > there which breaks this assumption, leading to a hyp panic. Make KVM > more robust to broken firmware by allowing these to return. Are you sure you should just return? We've had issues in the past with Linux reboot(2) that returns to userspace, allowing on 32-bit ARM for example watchdogs to unexpectedly continue being serviced.
Hi David, On 2020-12-29 16:00, David Brazdil wrote: > The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET > should > not return, as dictated by the PSCI spec. However, there is firmware > out > there which breaks this assumption, leading to a hyp panic. Make KVM > more robust to broken firmware by allowing these to return. > > Signed-off-by: David Brazdil <dbrazdil@google.com> > --- > arch/arm64/kvm/hyp/nvhe/psci-relay.c | 13 +++++-------- > 1 file changed, 5 insertions(+), 8 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c > b/arch/arm64/kvm/hyp/nvhe/psci-relay.c > index e3947846ffcb..8e7128cb7667 100644 > --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c > +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c > @@ -77,12 +77,6 @@ static unsigned long psci_forward(struct > kvm_cpu_context *host_ctxt) > cpu_reg(host_ctxt, 2), cpu_reg(host_ctxt, 3)); > } > > -static __noreturn unsigned long psci_forward_noreturn(struct > kvm_cpu_context *host_ctxt) > -{ > - psci_forward(host_ctxt); > - hyp_panic(); /* unreachable */ > -} > - > static unsigned int find_cpu_id(u64 mpidr) > { > unsigned int i; > @@ -251,10 +245,13 @@ static unsigned long psci_0_2_handler(u64 > func_id, struct kvm_cpu_context *host_ > case PSCI_0_2_FN_MIGRATE_INFO_TYPE: > case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU: > return psci_forward(host_ctxt); > + /* > + * SYSTEM_OFF/RESET should not return according to the spec. > + * Allow it so as to stay robust to broken firmware. > + */ > case PSCI_0_2_FN_SYSTEM_OFF: > case PSCI_0_2_FN_SYSTEM_RESET: > - psci_forward_noreturn(host_ctxt); > - unreachable(); > + return psci_forward(host_ctxt); > case PSCI_0_2_FN64_CPU_SUSPEND: > return psci_cpu_suspend(func_id, host_ctxt); > case PSCI_0_2_FN64_CPU_ON: Thanks for having tracked this. I wonder whether we should also taint the kernel in this case, because this is completely unexpected, and a major spec violation. Ideally, we'd be able to detect this case and prevent pKVM from getting initialised at all, but I guess there is no way to detect the sucker without ... calling SYSTEM_RESET? M.
On 2020-12-29 17:04, Russell King - ARM Linux admin wrote: > On Tue, Dec 29, 2020 at 04:00:59PM +0000, David Brazdil wrote: >> The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET >> should >> not return, as dictated by the PSCI spec. However, there is firmware >> out >> there which breaks this assumption, leading to a hyp panic. Make KVM >> more robust to broken firmware by allowing these to return. > > Are you sure you should just return? > > We've had issues in the past with Linux reboot(2) that returns > to userspace, allowing on 32-bit ARM for example watchdogs to > unexpectedly continue being serviced. I don't think this changes anything compared to the case where the PSCI relay isn't enabled. The EL1 part of the kernel would see the SYSTEM_RESET call return, and handle it accordingly (stay in a while(1) loop). This is consistent with the PSCI relay design goal of being invisible to the EL1 kernel. Thanks, M.
On Tue, Dec 29, 2020 at 05:16:41PM +0000, Marc Zyngier wrote: > Hi David, > > On 2020-12-29 16:00, David Brazdil wrote: > > The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should > > not return, as dictated by the PSCI spec. However, there is firmware out > > there which breaks this assumption, leading to a hyp panic. Make KVM > > more robust to broken firmware by allowing these to return. > > > > Signed-off-by: David Brazdil <dbrazdil@google.com> > > --- > > arch/arm64/kvm/hyp/nvhe/psci-relay.c | 13 +++++-------- > > 1 file changed, 5 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c > > b/arch/arm64/kvm/hyp/nvhe/psci-relay.c > > index e3947846ffcb..8e7128cb7667 100644 > > --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c > > +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c > > @@ -77,12 +77,6 @@ static unsigned long psci_forward(struct > > kvm_cpu_context *host_ctxt) > > cpu_reg(host_ctxt, 2), cpu_reg(host_ctxt, 3)); > > } > > > > -static __noreturn unsigned long psci_forward_noreturn(struct > > kvm_cpu_context *host_ctxt) > > -{ > > - psci_forward(host_ctxt); > > - hyp_panic(); /* unreachable */ > > -} > > - > > static unsigned int find_cpu_id(u64 mpidr) > > { > > unsigned int i; > > @@ -251,10 +245,13 @@ static unsigned long psci_0_2_handler(u64 > > func_id, struct kvm_cpu_context *host_ > > case PSCI_0_2_FN_MIGRATE_INFO_TYPE: > > case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU: > > return psci_forward(host_ctxt); > > + /* > > + * SYSTEM_OFF/RESET should not return according to the spec. > > + * Allow it so as to stay robust to broken firmware. > > + */ > > case PSCI_0_2_FN_SYSTEM_OFF: > > case PSCI_0_2_FN_SYSTEM_RESET: > > - psci_forward_noreturn(host_ctxt); > > - unreachable(); > > + return psci_forward(host_ctxt); > > case PSCI_0_2_FN64_CPU_SUSPEND: > > return psci_cpu_suspend(func_id, host_ctxt); > > case PSCI_0_2_FN64_CPU_ON: > > Thanks for having tracked this. > > I wonder whether we should also taint the kernel in this case, > because this is completely unexpected, and a major spec violation. > > Ideally, we'd be able to detect this case and prevent pKVM from > getting initialised at all, but I guess there is no way to detect > the sucker without ... calling SYSTEM_RESET? Yeah, there are no bits to check, unfortunately. :( David
On Tue, 29 Dec 2020 16:00:59 +0000, David Brazdil wrote: > The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should > not return, as dictated by the PSCI spec. However, there is firmware out > there which breaks this assumption, leading to a hyp panic. Make KVM > more robust to broken firmware by allowing these to return. Applied to next, thanks! [1/1] KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return commit: 2c91ef39216149df6703c3fa6a47dd9a1e6091c1 Cheers, M.
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c index e3947846ffcb..8e7128cb7667 100644 --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c @@ -77,12 +77,6 @@ static unsigned long psci_forward(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 2), cpu_reg(host_ctxt, 3)); } -static __noreturn unsigned long psci_forward_noreturn(struct kvm_cpu_context *host_ctxt) -{ - psci_forward(host_ctxt); - hyp_panic(); /* unreachable */ -} - static unsigned int find_cpu_id(u64 mpidr) { unsigned int i; @@ -251,10 +245,13 @@ static unsigned long psci_0_2_handler(u64 func_id, struct kvm_cpu_context *host_ case PSCI_0_2_FN_MIGRATE_INFO_TYPE: case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU: return psci_forward(host_ctxt); + /* + * SYSTEM_OFF/RESET should not return according to the spec. + * Allow it so as to stay robust to broken firmware. + */ case PSCI_0_2_FN_SYSTEM_OFF: case PSCI_0_2_FN_SYSTEM_RESET: - psci_forward_noreturn(host_ctxt); - unreachable(); + return psci_forward(host_ctxt); case PSCI_0_2_FN64_CPU_SUSPEND: return psci_cpu_suspend(func_id, host_ctxt); case PSCI_0_2_FN64_CPU_ON:
The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should not return, as dictated by the PSCI spec. However, there is firmware out there which breaks this assumption, leading to a hyp panic. Make KVM more robust to broken firmware by allowing these to return. Signed-off-by: David Brazdil <dbrazdil@google.com> --- arch/arm64/kvm/hyp/nvhe/psci-relay.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)