Message ID | 20231218140543.870234-3-tao1.su@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: KVM: Limit guest physical bits when 5-level EPT is unsupported | expand |
On Mon, Dec 18, 2023, Tao Su wrote: > With 4-level EPT, bits 51:48 of the guest physical address must all > be zero; otherwise, an EPT violation always occurs, which is an unexpected > VM exit in KVM currently. > > Even though KVM advertises the max physical bits to guest, guest may > ignore MAXPHYADDR in CPUID and set a bigger physical bits to KVM. > Rejecting invalid guest physical bits on KVM side is a choice, but it will > break current KVM ABI, e.g., current QEMU ignores the physical bits > advertised by KVM and uses host physical bits as guest physical bits by > default when using '-cpu host', although we would like to send a patch to > QEMU, it will still cause backward compatibility issues. > > For GPA that can't be translated by EPT but within host.MAXPHYADDR, > emulation should be the best choice since KVM will inject #PF for the > invalid GPA in guest's perspective and try to emulate the instructions > which minimizes the impact on guests as much as possible. NAK. allow_smaller_maxphyaddr is a bit of a mess and in IMO was a mistake, but at least there was reasonable motivation for trying to support guests with a small MAXPHYADDR. Fudging around a QEMU bug is not good enough justification, especially since the odds of a hack in KVM fully working are slim to none.
On Mon, Dec 18, 2023 at 07:23:01AM -0800, Sean Christopherson wrote: >On Mon, Dec 18, 2023, Tao Su wrote: >> With 4-level EPT, bits 51:48 of the guest physical address must all >> be zero; otherwise, an EPT violation always occurs, which is an unexpected >> VM exit in KVM currently. >> >> Even though KVM advertises the max physical bits to guest, guest may >> ignore MAXPHYADDR in CPUID and set a bigger physical bits to KVM. >> Rejecting invalid guest physical bits on KVM side is a choice, but it will >> break current KVM ABI, e.g., current QEMU ignores the physical bits >> advertised by KVM and uses host physical bits as guest physical bits by >> default when using '-cpu host', although we would like to send a patch to >> QEMU, it will still cause backward compatibility issues. >> >> For GPA that can't be translated by EPT but within host.MAXPHYADDR, >> emulation should be the best choice since KVM will inject #PF for the >> invalid GPA in guest's perspective and try to emulate the instructions >> which minimizes the impact on guests as much as possible. > >NAK. allow_smaller_maxphyaddr is a bit of a mess and in IMO was a mistake, but >at least there was reasonable motivation for trying to support guests with a small >MAXPHYADDR. Fudging around a QEMU bug is not good enough justification, especially >since the odds of a hack in KVM fully working are slim to none. The changelog is a little misleading. Even if there is no QEMU "bug" and QEMU sets guest.MAXPHYADDR to 48 correctly, guest __can__ set up CR3 page table to access GPAs with bits of 51:48 set. In this case, KVM still gets EPT violation and needs to inject #PF to the guest (by emulating the instruction).
On Mon, Dec 18, 2023 at 6:08 AM Tao Su <tao1.su@linux.intel.com> wrote: > > With 4-level EPT, bits 51:48 of the guest physical address must all > be zero; otherwise, an EPT violation always occurs, which is an unexpected > VM exit in KVM currently. > > Even though KVM advertises the max physical bits to guest, guest may > ignore MAXPHYADDR in CPUID and set a bigger physical bits to KVM. > Rejecting invalid guest physical bits on KVM side is a choice, but it will > break current KVM ABI, e.g., current QEMU ignores the physical bits > advertised by KVM and uses host physical bits as guest physical bits by > default when using '-cpu host', although we would like to send a patch to > QEMU, it will still cause backward compatibility issues. > > For GPA that can't be translated by EPT but within host.MAXPHYADDR, > emulation should be the best choice since KVM will inject #PF for the > invalid GPA in guest's perspective and try to emulate the instructions > which minimizes the impact on guests as much as possible. > > Signed-off-by: Tao Su <tao1.su@linux.intel.com> > Tested-by: Yi Lai <yi1.lai@intel.com> > --- > arch/x86/kvm/vmx/vmx.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index be20a60047b1..a8aa2cfa2f5d 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -5774,6 +5774,13 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) > > vcpu->arch.exit_qualification = exit_qualification; > > + /* > + * Emulate the instruction when accessing a GPA which is set any bits > + * beyond guest-physical bits that EPT can translate. > + */ > + if (unlikely(gpa & rsvd_bits(kvm_mmu_tdp_maxphyaddr(), 63))) > + return kvm_emulate_instruction(vcpu, 0); > + This doesn't really work, since the KVM instruction emulator is woefully incomplete. To make this work, first you have to teach the KVM instruction emulator how to emulate *all* memory-accessing instructions. > /* > * Check that the GPA doesn't exceed physical memory limits, as that is > * a guest page fault. We have to emulate the instruction here, because > -- > 2.34.1 > >
On Wed, Dec 20, 2023 at 05:42:56AM -0800, Jim Mattson wrote: > On Mon, Dec 18, 2023 at 6:08 AM Tao Su <tao1.su@linux.intel.com> wrote: > > > > With 4-level EPT, bits 51:48 of the guest physical address must all > > be zero; otherwise, an EPT violation always occurs, which is an unexpected > > VM exit in KVM currently. > > > > Even though KVM advertises the max physical bits to guest, guest may > > ignore MAXPHYADDR in CPUID and set a bigger physical bits to KVM. > > Rejecting invalid guest physical bits on KVM side is a choice, but it will > > break current KVM ABI, e.g., current QEMU ignores the physical bits > > advertised by KVM and uses host physical bits as guest physical bits by > > default when using '-cpu host', although we would like to send a patch to > > QEMU, it will still cause backward compatibility issues. > > > > For GPA that can't be translated by EPT but within host.MAXPHYADDR, > > emulation should be the best choice since KVM will inject #PF for the > > invalid GPA in guest's perspective and try to emulate the instructions > > which minimizes the impact on guests as much as possible. > > > > Signed-off-by: Tao Su <tao1.su@linux.intel.com> > > Tested-by: Yi Lai <yi1.lai@intel.com> > > --- > > arch/x86/kvm/vmx/vmx.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index be20a60047b1..a8aa2cfa2f5d 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -5774,6 +5774,13 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) > > > > vcpu->arch.exit_qualification = exit_qualification; > > > > + /* > > + * Emulate the instruction when accessing a GPA which is set any bits > > + * beyond guest-physical bits that EPT can translate. > > + */ > > + if (unlikely(gpa & rsvd_bits(kvm_mmu_tdp_maxphyaddr(), 63))) > > + return kvm_emulate_instruction(vcpu, 0); > > + > > This doesn't really work, since the KVM instruction emulator is > woefully incomplete. To make this work, first you have to teach the > KVM instruction emulator how to emulate *all* memory-accessing > instructions. Please forget allow_smaller_maxphyaddr and #PF for a while. I agree KVM instruction emulator is incomplete. However, hardware can’t execute instructions with GPA>48-bit and exits to KVM, KVM just repeatedly builds SPTE, I.e., current KVM is buggy. In this case, emulation may be a choice, I.e., KVM can emulate most common instructions which reflects KVM's best efforts to maintain an environment for continued execution. Even if some instructions can’t be emulated, it can terminate the guest and will not make KVM silently hang there. Thanks, Tao > > > /* > > * Check that the GPA doesn't exceed physical memory limits, as that is > > * a guest page fault. We have to emulate the instruction here, because > > -- > > 2.34.1 > > > >
On Mon, Jan 08, 2024, Tao Su wrote: > On Wed, Dec 20, 2023 at 05:42:56AM -0800, Jim Mattson wrote: > > On Mon, Dec 18, 2023 at 6:08 AM Tao Su <tao1.su@linux.intel.com> wrote: > > > > > > With 4-level EPT, bits 51:48 of the guest physical address must all > > > be zero; otherwise, an EPT violation always occurs, which is an unexpected > > > VM exit in KVM currently. > > > > > > Even though KVM advertises the max physical bits to guest, guest may > > > ignore MAXPHYADDR in CPUID and set a bigger physical bits to KVM. > > > Rejecting invalid guest physical bits on KVM side is a choice, but it will > > > break current KVM ABI, e.g., current QEMU ignores the physical bits > > > advertised by KVM and uses host physical bits as guest physical bits by > > > default when using '-cpu host', although we would like to send a patch to > > > QEMU, it will still cause backward compatibility issues. > > > > > > For GPA that can't be translated by EPT but within host.MAXPHYADDR, > > > emulation should be the best choice since KVM will inject #PF for the > > > invalid GPA in guest's perspective and try to emulate the instructions > > > which minimizes the impact on guests as much as possible. > > > > > > Signed-off-by: Tao Su <tao1.su@linux.intel.com> > > > Tested-by: Yi Lai <yi1.lai@intel.com> > > > --- > > > arch/x86/kvm/vmx/vmx.c | 7 +++++++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > > index be20a60047b1..a8aa2cfa2f5d 100644 > > > --- a/arch/x86/kvm/vmx/vmx.c > > > +++ b/arch/x86/kvm/vmx/vmx.c > > > @@ -5774,6 +5774,13 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) > > > > > > vcpu->arch.exit_qualification = exit_qualification; > > > > > > + /* > > > + * Emulate the instruction when accessing a GPA which is set any bits > > > + * beyond guest-physical bits that EPT can translate. > > > + */ > > > + if (unlikely(gpa & rsvd_bits(kvm_mmu_tdp_maxphyaddr(), 63))) > > > + return kvm_emulate_instruction(vcpu, 0); > > > + > > > > This doesn't really work, since the KVM instruction emulator is > > woefully incomplete. To make this work, first you have to teach the > > KVM instruction emulator how to emulate *all* memory-accessing > > instructions. > > Please forget allow_smaller_maxphyaddr and #PF for a while. > > I agree KVM instruction emulator is incomplete. However, hardware can’t > execute instructions with GPA>48-bit and exits to KVM, KVM just repeatedly > builds SPTE, I.e., current KVM is buggy. Eh, hardware is just as much to blame as KVM. Garbage in, garbage out. > In this case, emulation may be a choice, Yes, it can be a choice, but that choice needs to be made conciously by userspace, not silently by KVM.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index be20a60047b1..a8aa2cfa2f5d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5774,6 +5774,13 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) vcpu->arch.exit_qualification = exit_qualification; + /* + * Emulate the instruction when accessing a GPA which is set any bits + * beyond guest-physical bits that EPT can translate. + */ + if (unlikely(gpa & rsvd_bits(kvm_mmu_tdp_maxphyaddr(), 63))) + return kvm_emulate_instruction(vcpu, 0); + /* * Check that the GPA doesn't exceed physical memory limits, as that is * a guest page fault. We have to emulate the instruction here, because