Message ID | 20210204000117.3303214-8-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: Legal GPA fixes and cleanups | expand |
On Wed, 2021-02-03 at 16:01 -0800, Sean Christopherson wrote: > > - unsigned long cr3_lm_rsvd_bits; > + u64 reserved_gpa_bits; LAM defines bits above the GFN in CR3: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html KVM doesn't support this today of course, but it might be confusing to try to combine the two concepts.
On Thu, Feb 04, 2021, Edgecombe, Rick P wrote: > On Wed, 2021-02-03 at 16:01 -0800, Sean Christopherson wrote: > > > > - unsigned long cr3_lm_rsvd_bits; > > + u64 reserved_gpa_bits; > > LAM defines bits above the GFN in CR3: > https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html > > KVM doesn't support this today of course, but it might be confusing to > try to combine the two concepts. Ah, took me a few minutes, but I see what you're saying. LAM will introduce bits that are repurposed for CR3, but not generic GPAs. And, the behavior is based on CPU support, so it'd make sense to have a mask cached in vcpu->arch as opposed to constantly generating it on the fly. Definitely agree that having a separate cr3_lm_rsvd_bits or whatever is the right way to go when LAM comes along. Not sure it's worth keeping a duplicate field in the meantime, though it would avoid a small amount of thrash. Paolo, any thoughts?
On 04/02/21 03:19, Sean Christopherson wrote: > Ah, took me a few minutes, but I see what you're saying. LAM will introduce > bits that are repurposed for CR3, but not generic GPAs. And, the behavior is > based on CPU support, so it'd make sense to have a mask cached in vcpu->arch > as opposed to constantly generating it on the fly. > > Definitely agree that having a separate cr3_lm_rsvd_bits or whatever is the > right way to go when LAM comes along. Not sure it's worth keeping a duplicate > field in the meantime, though it would avoid a small amount of thrash. We don't even know if the cr3_lm_rsvd_bits would be a field in vcpu->arch, or rather computed on the fly. So renaming the field in vcpu->arch seems like the simplest thing to do now. Paolo
On Thu, 2021-02-04 at 11:34 +0100, Paolo Bonzini wrote: > On 04/02/21 03:19, Sean Christopherson wrote: > > Ah, took me a few minutes, but I see what you're saying. LAM will > > introduce > > bits that are repurposed for CR3, but not generic GPAs. And, the > > behavior is > > based on CPU support, so it'd make sense to have a mask cached in > > vcpu->arch > > as opposed to constantly generating it on the fly. > > > > Definitely agree that having a separate cr3_lm_rsvd_bits or > > whatever is the > > right way to go when LAM comes along. Not sure it's worth keeping > > a duplicate > > field in the meantime, though it would avoid a small amount of > > thrash. > > We don't even know if the cr3_lm_rsvd_bits would be a field in > vcpu->arch, or rather computed on the fly. So renaming the field in > vcpu->arch seems like the simplest thing to do now. Fair enough. But just to clarify, I meant that I thought the code would be more confusing to use illegal gpa bit checks for checking cr3. It seems they are only incidentally the same value. Alternatively there could be something like a is_rsvd_cr3_bits() helper that just uses reserved_gpa_bits for now. Probably put the comment in the wrong place. It's a minor point in any case.
On Thu, Feb 04, 2021, Edgecombe, Rick P wrote: > On Thu, 2021-02-04 at 11:34 +0100, Paolo Bonzini wrote: > > On 04/02/21 03:19, Sean Christopherson wrote: > > > Ah, took me a few minutes, but I see what you're saying. LAM will > > > introduce > > > bits that are repurposed for CR3, but not generic GPAs. And, the > > > behavior is > > > based on CPU support, so it'd make sense to have a mask cached in > > > vcpu->arch > > > as opposed to constantly generating it on the fly. > > > > > > Definitely agree that having a separate cr3_lm_rsvd_bits or > > > whatever is the > > > right way to go when LAM comes along. Not sure it's worth keeping > > > a duplicate > > > field in the meantime, though it would avoid a small amount of > > > thrash. > > > > We don't even know if the cr3_lm_rsvd_bits would be a field in > > vcpu->arch, or rather computed on the fly. So renaming the field in > > vcpu->arch seems like the simplest thing to do now. > > Fair enough. But just to clarify, I meant that I thought the code would > be more confusing to use illegal gpa bit checks for checking cr3. It > seems they are only incidentally the same value. Hmm, yeah, bits 63:52 are incidental. Bits 52:M are not, though. If/when we need to special case CR3, I would like to take a similar approach to __reset_rsvds_bits_mask(), where the high reserved bits start from reserved_gpa_bits and mask off the bits that can't be encoded into a PxE. > Alternatively there could be something like a is_rsvd_cr3_bits() helper that > just uses reserved_gpa_bits for now. Probably put the comment in the wrong > place. It's a minor point in any case. That thought crossed my mind, too. Maybe kvm_vcpu_is_illegal_cr3() to match the gpa helpers?
On 04/02/21 18:52, Sean Christopherson wrote: >> Alternatively there could be something like a is_rsvd_cr3_bits() helper that >> just uses reserved_gpa_bits for now. Probably put the comment in the wrong >> place. It's a minor point in any case. > That thought crossed my mind, too. Maybe kvm_vcpu_is_illegal_cr3() to match > the gpa helpers? Yes, that's certainly a good name but it doesn't have to be done now. Or at least, if you do it, somebody is guaranteed to send a patch after one month because the wrapper is useless. :) Paolo
On Thu, Feb 04, 2021, Paolo Bonzini wrote: > On 04/02/21 18:52, Sean Christopherson wrote: > > > Alternatively there could be something like a is_rsvd_cr3_bits() helper that > > > just uses reserved_gpa_bits for now. Probably put the comment in the wrong > > > place. It's a minor point in any case. > > That thought crossed my mind, too. Maybe kvm_vcpu_is_illegal_cr3() to match > > the gpa helpers? > > Yes, that's certainly a good name but it doesn't have to be done now. Or at > least, if you do it, somebody is guaranteed to send a patch after one month > because the wrapper is useless. :) LOL, I can see myself doing that...
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 915f716e78e6..1653d49a66ff 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -654,7 +654,7 @@ struct kvm_vcpu_arch { int cpuid_nent; struct kvm_cpuid_entry2 *cpuid_entries; - unsigned long cr3_lm_rsvd_bits; + u64 reserved_gpa_bits; int maxphyaddr; int max_tdp_level; diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 944f518ca91b..7bd1331c1bbc 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -194,7 +194,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(guest_cpuid_has, vcpu); - vcpu->arch.cr3_lm_rsvd_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63); + vcpu->arch.reserved_gpa_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63); /* Invoke the vendor callback only after the above state is updated. */ static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu); diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index a9d55ab51e3c..f673f45bdf52 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -38,7 +38,7 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) { - return !(gpa >> cpuid_maxphyaddr(vcpu)); + return !(gpa & vcpu->arch.reserved_gpa_bits); } static inline bool kvm_vcpu_is_illegal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index ac662964cee5..add3cd4295e1 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -241,7 +241,7 @@ static bool nested_vmcb_check_cr3_cr4(struct vcpu_svm *svm, */ if ((save->efer & EFER_LME) && (save->cr0 & X86_CR0_PG)) { if (!(save->cr4 & X86_CR4_PAE) || !(save->cr0 & X86_CR0_PE) || - (save->cr3 & vcpu->arch.cr3_lm_rsvd_bits)) + kvm_vcpu_is_illegal_gpa(vcpu, save->cr3)) return false; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index f53e6377a933..50ad5a3bf880 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4079,7 +4079,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) if (sev_guest(vcpu->kvm)) { best = kvm_find_cpuid_entry(vcpu, 0x8000001F, 0); if (best) - vcpu->arch.cr3_lm_rsvd_bits &= ~(1UL << (best->ebx & 0x3f)); + vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f)); } if (!kvm_vcpu_apicv_active(vcpu)) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e6fbf2f574a6..1da7ed093650 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1082,8 +1082,7 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) return 0; } - if (is_long_mode(vcpu) && - (cr3 & vcpu->arch.cr3_lm_rsvd_bits)) + if (is_long_mode(vcpu) && kvm_vcpu_is_illegal_gpa(vcpu, cr3)) return 1; else if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) @@ -9712,7 +9711,7 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) */ if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA)) return false; - if (sregs->cr3 & vcpu->arch.cr3_lm_rsvd_bits) + if (kvm_vcpu_is_illegal_gpa(vcpu, sregs->cr3)) return false; } else { /* @@ -10091,7 +10090,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) fx_init(vcpu); vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); - vcpu->arch.cr3_lm_rsvd_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63); + vcpu->arch.reserved_gpa_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63); vcpu->arch.pat = MSR_IA32_CR_PAT_DEFAULT;
Rename cr3_lm_rsvd_bits to reserved_gpa_bits, and use it for all GPA legality checks. AMD's APM states: If the C-bit is an address bit, this bit is masked from the guest physical address when it is translated through the nested page tables. Thus, any access that can conceivably be run through NPT should ignore the C-bit when checking for validity. For features that KVM emulates in software, e.g. MTRRs, there is no clear direction in the APM for how the C-bit should be handled. For such cases, follow the SME behavior inasmuch as possible, since SEV is is essentially a VM-specific variant of SME. For SME, the APM states: In this case the upper physical address bits are treated as reserved when the feature is enabled except where otherwise indicated. Collecting the various relavant SME snippets in the APM and cross- referencing the omissions with Linux kernel code, this leaves MTTRs and APIC_BASE as the only flows that KVM emulates that should _not_ ignore the C-bit. Note, this means the reserved bit checks in the page tables are technically broken. This will be remedied in a future patch. Although the page table checks are technically broken, in practice, it's all but guaranteed to be irrelevant. NPT is required for SEV, i.e. shadowing page tables isn't needed in the common case. Theoretically, the checks could be in play for nested NPT, but it's extremely unlikely that anyone is running nested VMs on SEV, as doing so would require L1 to expose sensitive data to L0, e.g. the entire VMCB. And if anyone is running nested VMs, L0 can't read the guest's encrypted memory, i.e. L1 would need to put its NPT in shared memory, in which case the C-bit will never be set. Or, L1 could use shadow paging, but again, if L0 needs to read page tables, e.g. to load PDPTRs, the memory can't be encrypted if L1 has any expectation of L0 doing the right thing. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/cpuid.h | 2 +- arch/x86/kvm/svm/nested.c | 2 +- arch/x86/kvm/svm/svm.c | 2 +- arch/x86/kvm/x86.c | 7 +++---- 6 files changed, 8 insertions(+), 9 deletions(-)