Message ID | 20230320225159.92771-1-graf@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: Allow restore of some sregs with protected state | expand |
+Peter On Mon, Mar 20, 2023, Alexander Graf wrote: > With protected state (like SEV-ES and SEV-SNP), KVM does not have direct > access to guest registers. However, we deflect modifications to CR0, Please avoid pronouns in changelogs and comments. > CR4 and EFER to the host. We also carry the apic_base register and learn > about CR8 directly from a VMCB field. > > That means these bits of information do exist in the host's KVM data > structures. If we ever want to resume consumption of an already > initialized VMSA (guest state), we will need to also restore these > additional bits of KVM state. For some definitions of "need". I've looked at this code multiple times in the past, and even posted patches[1], but I'm still unconvinced that trapping CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix this stalled out. : If KVM chugs along happily without these patches, I'd love to pivot and yank out : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop : as well. [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com > Prepare ourselves for such a world by allowing set_sregs to set the > relevant fields. This way, it mirrors get_sregs properly that already > exposes them to user space. > > Signed-off-by: Alexander Graf <graf@amazon.com> > --- > arch/x86/kvm/x86.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 7713420abab0..88fa8b657a2f 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > int idx; > struct desc_ptr dt; > > - if (!kvm_is_valid_sregs(vcpu, sregs)) > + if (!vcpu->arch.guest_state_protected && > + !kvm_is_valid_sregs(vcpu, sregs)) This is broken, userspace shouldn't be allowed to stuff complete garbage just because guest state is protected. > return -EINVAL; > > apic_base_msr.data = sregs->apic_base; > @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > if (kvm_set_apic_base(vcpu, &apic_base_msr)) > return -EINVAL; > > - if (vcpu->arch.guest_state_protected) > + if (vcpu->arch.guest_state_protected) { > + /* > + * While most actual guest state is hidden from us, > + * CR{0,4,8}, efer and apic_base still hold KVM state > + * with protection enabled, so let's allow restoring > + */ > + kvm_set_cr8(vcpu, sregs->cr8); > + kvm_x86_ops.set_efer(vcpu, sregs->efer); > + kvm_x86_ops.set_cr0(vcpu, sregs->cr0); Use static_call(). This code is also broken in that it doesn't set mmu_reset_needed. This patch also misses the problematic behavior in svm_set_cr0(). And I don't like having a completely separate one-off flow for protected guests. It's more code and arguably uglier, but I would prefer to explicitly skip each chunk as needed so that we aren't effectively maintaining multiple flows. Easiest thing is probably to just resurrect my aforementioned series, pending the result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER writes in the first place.
On Tue, Mar 21, 2023 at 11:03 AM Sean Christopherson <seanjc@google.com> wrote: > > +Peter > > On Mon, Mar 20, 2023, Alexander Graf wrote: > > With protected state (like SEV-ES and SEV-SNP), KVM does not have direct > > access to guest registers. However, we deflect modifications to CR0, > > Please avoid pronouns in changelogs and comments. > > > CR4 and EFER to the host. We also carry the apic_base register and learn > > about CR8 directly from a VMCB field. > > > > That means these bits of information do exist in the host's KVM data > > structures. If we ever want to resume consumption of an already > > initialized VMSA (guest state), we will need to also restore these > > additional bits of KVM state. > > For some definitions of "need". I've looked at this code multiple times in the > past, and even posted patches[1], but I'm still unconvinced that trapping > CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix > this stalled out. > > : If KVM chugs along happily without these patches, I'd love to pivot and yank out > : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop > : as well. > > [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com > [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com Yea we are using similar patches to do intra-host migration for SNP VMs. I have dropped the ball on my AI from that thread. Let me look/test this patch. > > > Prepare ourselves for such a world by allowing set_sregs to set the > > relevant fields. This way, it mirrors get_sregs properly that already > > exposes them to user space. > > > > Signed-off-by: Alexander Graf <graf@amazon.com> > > --- > > arch/x86/kvm/x86.c | 16 ++++++++++++++-- > > 1 file changed, 14 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 7713420abab0..88fa8b657a2f 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > > int idx; > > struct desc_ptr dt; > > > > - if (!kvm_is_valid_sregs(vcpu, sregs)) > > + if (!vcpu->arch.guest_state_protected && > > + !kvm_is_valid_sregs(vcpu, sregs)) > > This is broken, userspace shouldn't be allowed to stuff complete garbage just > because guest state is protected. Agreed, we've been using something like this: kvm_valid_sregs() - if (sregs->fer & EFER_LME && !vcpu->arch.guest_state_protected) { + if (sregs->efer & EFER_LME) { > > > return -EINVAL; > > > > apic_base_msr.data = sregs->apic_base; > > @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, > > if (kvm_set_apic_base(vcpu, &apic_base_msr)) > > return -EINVAL; > > > > - if (vcpu->arch.guest_state_protected) > > + if (vcpu->arch.guest_state_protected) { > > + /* > > + * While most actual guest state is hidden from us, > > + * CR{0,4,8}, efer and apic_base still hold KVM state > > + * with protection enabled, so let's allow restoring > > + */ > > + kvm_set_cr8(vcpu, sregs->cr8); > > + kvm_x86_ops.set_efer(vcpu, sregs->efer); > > + kvm_x86_ops.set_cr0(vcpu, sregs->cr0); > > Use static_call(). This code is also broken in that it doesn't set mmu_reset_needed. > This patch also misses the problematic behavior in svm_set_cr0(). > > And I don't like having a completely separate one-off flow for protected guests. > It's more code and arguably uglier, but I would prefer to explicitly skip each > chunk as needed so that we aren't effectively maintaining multiple flows. > > Easiest thing is probably to just resurrect my aforementioned series, pending the > result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER > writes in the first place.
On 21.03.23 19:15, Peter Gonda wrote: > On Tue, Mar 21, 2023 at 11:03 AM Sean Christopherson <seanjc@google.com> wrote: >> +Peter >> >> On Mon, Mar 20, 2023, Alexander Graf wrote: >>> With protected state (like SEV-ES and SEV-SNP), KVM does not have direct >>> access to guest registers. However, we deflect modifications to CR0, >> Please avoid pronouns in changelogs and comments. >> >>> CR4 and EFER to the host. We also carry the apic_base register and learn >>> about CR8 directly from a VMCB field. >>> >>> That means these bits of information do exist in the host's KVM data >>> structures. If we ever want to resume consumption of an already >>> initialized VMSA (guest state), we will need to also restore these >>> additional bits of KVM state. >> For some definitions of "need". I've looked at this code multiple times in the >> past, and even posted patches[1], but I'm still unconvinced that trapping >> CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix >> this stalled out. >> >> : If KVM chugs along happily without these patches, I'd love to pivot and yank out >> : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop >> : as well. >> >> [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com >> [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com > Yea we are using similar patches to do intra-host migration for SNP VMs. > > I have dropped the ball on my AI from that thread. Let me look/test this patch. Awesome, thanks. If we can get away without any of the above states and make sregs completely useless for protected state, I'd be even happier :) Alex Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7713420abab0..88fa8b657a2f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, int idx; struct desc_ptr dt; - if (!kvm_is_valid_sregs(vcpu, sregs)) + if (!vcpu->arch.guest_state_protected && + !kvm_is_valid_sregs(vcpu, sregs)) return -EINVAL; apic_base_msr.data = sregs->apic_base; @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, if (kvm_set_apic_base(vcpu, &apic_base_msr)) return -EINVAL; - if (vcpu->arch.guest_state_protected) + if (vcpu->arch.guest_state_protected) { + /* + * While most actual guest state is hidden from us, + * CR{0,4,8}, efer and apic_base still hold KVM state + * with protection enabled, so let's allow restoring + */ + kvm_set_cr8(vcpu, sregs->cr8); + kvm_x86_ops.set_efer(vcpu, sregs->efer); + kvm_x86_ops.set_cr0(vcpu, sregs->cr0); + vcpu->arch.cr0 = sregs->cr0; + kvm_x86_ops.set_cr4(vcpu, sregs->cr4); return 0; + } dt.size = sregs->idt.limit; dt.address = sregs->idt.base;
With protected state (like SEV-ES and SEV-SNP), KVM does not have direct access to guest registers. However, we deflect modifications to CR0, CR4 and EFER to the host. We also carry the apic_base register and learn about CR8 directly from a VMCB field. That means these bits of information do exist in the host's KVM data structures. If we ever want to resume consumption of an already initialized VMSA (guest state), we will need to also restore these additional bits of KVM state. Prepare ourselves for such a world by allowing set_sregs to set the relevant fields. This way, it mirrors get_sregs properly that already exposes them to user space. Signed-off-by: Alexander Graf <graf@amazon.com> --- arch/x86/kvm/x86.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)