Message ID | 20211005234459.430873-4-michael.roth@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | KVM: selftests: Add tests for SEV, SEV-ES, and SEV-SNP guests | expand |
On 06/10/21 01:44, Michael Roth wrote: > SEV guests rely on an encyption bit which resides within the range that > current code treats as address bits. Guest code will expect these bits > to be set appropriately in their page tables, whereas helpers like > addr_gpa2hva() will expect these bits to be masked away prior to > translation. Add proper handling for these cases. This is not what you're doing below in addr_gpa2hva, though---or did I misunderstand? I may be wrong due to not actually having written the code, but I'd prefer if most of these APIs worked only if the C bit has already been stripped. In general it's quite unlikely for host code to deal with C=1 pages, so it's worth pointing out explicitly the cases where it does. Paolo > @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, > * address providing the memory to the vm physical address is returned. > * A TEST_ASSERT failure occurs if no region containing gpa exists. > */ > -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa) > +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw) > { > struct userspace_mem_region *region;
On Thu, Oct 21, 2021 at 05:26:26PM +0200, Paolo Bonzini wrote: > On 06/10/21 01:44, Michael Roth wrote: > > SEV guests rely on an encyption bit which resides within the range that > > current code treats as address bits. Guest code will expect these bits > > to be set appropriately in their page tables, whereas helpers like > > addr_gpa2hva() will expect these bits to be masked away prior to > > translation. Add proper handling for these cases. > > This is not what you're doing below in addr_gpa2hva, though---or did I > misunderstand? The confusion is warranted, addr_gpa2hva() *doesn't* expect the C bit to be masked in advance so the wording is pretty confusing. I think I was referring the fact that internally it doesn't need/want the C-bit, in this case it just masks it away as a convenience to callers, as opposed to the other functions modified in the patch that actually make use of it. It's convenient because page table walkers/mappers make use of addr_gpa2hva() to do things like silently mask away C-bits via when translating PTEs to host addresses. We easily convert those callers from: addr_gpa2hva(paddr) to this: addr_gpa2hva(addr_raw2gpa(paddr)) but now all new code needs to consider whether it might be dealing with C-bits or not prior to deciding to pass it to addr_gpa2hva() (or not really think about it, and add addr_gpa2raw() "just in case"). So since it's always harmless to mask it away silently addr_gpa2hva(), the logic/code seems to benefit a good deal if we indicate clearly that addr_gpa2hva() can accept a 'raw' GPA, and will ignore it completely. But not a big deal either way if you prefer to keep that explicit. And commit message still needs to be clarified. > > I may be wrong due to not actually having written the code, but I'd prefer > if most of these APIs worked only if the C bit has already been stripped. > In general it's quite unlikely for host code to deal with C=1 pages, so it's > worth pointing out explicitly the cases where it does. I've tried to indicate functions that expect the C-bit by adding the 'raw_' prefix to the gpa/paddr parameters, but as you pointed out with addr_gpa2hva() it's already a bit inconsistent in that regard, and there's a couple cases like virt_map() where I should use the 'raw_' prefix as well that I've missed here. So that should be addressed, and maybe some additional comments/assertions might be warranted to guard against cases where the C-bit is passed in unexpectedly. But I should probably re-assess why the C-bit is being passed around in the first place: - vm_phy_page[s]_alloc() is the main 'source' for 'raw' GPAs with the C-bit set. it determines this based on vm_memcrypt encryption policy, and updates the encryption bitmask as well. - vm_phy_page[s]_alloc() is callable both in kvm_util lib as well as individual tests. - in theory, encoding the C-bit in the returned vm_paddr_t means that vm_phy_page[s]_alloc() callers can pass that directly into virt_map/virt_pg_map() and this will "just work" for both encrypted/non-encrypted guests. - by masking it away in addr_gpa2hva(), existing tests/code flow mostly "just works" as well. But taking a closer look, in cases where vm_phy_page[s]_alloc() is called directly by tests, like set_memory_region_test, emulator_error_test, and smm_test, that raw GPA is compared to hardcoded non-raw GPAs, so they'd still end up needing fixups to work with the proposed transparent-SEV-mode stuff. And future code would need to be written to account for this, so it doesn't really "just work" after all.. So it's worth considering the alternative approach of *not* encoding the C-bit into GPAs returned by vm_phy_page[s]_alloc(). That would likely involve introducing something like addr_gpa2raw(), which adds in the C-bit according to the encryption bitmap as-needed. If we do that: - virt_map()/virt_pg_map() still need to accept 'raw' GPAs, since they need to deal with cases where pages are being mapping that weren't allocated by vm_phy_page[s]_alloc(), and so aren't recorded in the bitmap. in those cases it is up to test code to provide the C-bit when needed (e.g. things like separate linear mappings for pa()-like stuff in guest code). - for cases where vm_phy_page[s]_alloc() determines whether the page is encrypted, addr_gpa2raw() needs to be used to add back the C-bit prior to passing it to virt_map()/virt_pg_map(), both in the library and the test code. vm_vaddr_* allocations would handle all this under the covers as they do now. So test code would need to consider cases where addr_gpa2raw() needs to be used to set the C-bit (which is basically only when they want to mix usage of the vm_phy_page[s]_alloc with their own mapping of the guest page tables, which doesn't seem to be done in any existing tests anyway). The library code would need these addr_gpa2raw() hooks in places where it calls virt_*map() internally. Probably just a handful of places though. Assuming there's no issues with this alternative approach that I may be missing, I'll look at doing it this way for the next spin. Even in this alternative approach though, having addr_gpa2hva() silently mask away C-bit still seems useful for the reasons above, but again, no strong feelings one way or the other on that. > > Paolo > > > @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, > > * address providing the memory to the vm physical address is returned. > > * A TEST_ASSERT failure occurs if no region containing gpa exists. > > */ > > -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa) > > +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw) > > { > > struct userspace_mem_region *region; >
On 24/10/21 18:49, Michael Roth wrote: > So test code would need to consider cases where addr_gpa2raw() needs to be > used to set the C-bit (which is basically only when they want to mix usage > of the vm_phy_page[s]_alloc with their own mapping of the guest page tables, > which doesn't seem to be done in any existing tests anyway). Yes, and it seems like a more rare case in general. > The library code would need these addr_gpa2raw() hooks in places where > it calls virt_*map() internally. Probably just a handful of places > though. Either that, or you can have virt_*map that consults the encryption bitmap, and virt_*map_enc (or _raw, doesn't matter) that takes the encryption state explicitly as a bool. > Even in this alternative approach though, having addr_gpa2hva() silently > mask away C-bit still seems useful for the reasons above, but again, no > strong feelings one way or the other on that. Ok, keep it the way you find more useful. Paolo
On Mon, Oct 25, 2021 at 09:34:10AM +0200, Paolo Bonzini wrote: > On 24/10/21 18:49, Michael Roth wrote: > > So test code would need to consider cases where addr_gpa2raw() needs to be > > used to set the C-bit (which is basically only when they want to mix usage > > of the vm_phy_page[s]_alloc with their own mapping of the guest page tables, > > which doesn't seem to be done in any existing tests anyway). > > Yes, and it seems like a more rare case in general. > > > The library code would need these addr_gpa2raw() hooks in places where > > it calls virt_*map() internally. Probably just a handful of places > > though. > > Either that, or you can have virt_*map that consults the encryption bitmap, > and virt_*map_enc (or _raw, doesn't matter) that takes the encryption state > explicitly as a bool. That sounds promising. Will give that a shot. Thanks!
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index f417de80596c..4bf686d664cc 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -152,6 +152,7 @@ void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa); void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva); vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva); void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa); +vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_vaddr_t gpa_raw); /* * Address Guest Virtual to Guest Physical diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index c58f930dedd2..ef88fdc7e46b 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1443,6 +1443,26 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, } } +/* + * Mask off any special bits from raw GPA + * + * Input Args: + * vm - Virtual Machine + * gpa_raw - Raw VM physical address + * + * Output Args: None + * + * Return: + * GPA with special bits (e.g. shared/encrypted) masked off. + */ +vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_paddr_t gpa_raw) +{ + if (!vm->memcrypt.has_enc_bit) + return gpa_raw; + + return gpa_raw & ~(1ULL << vm->memcrypt.enc_bit); +} + /* * Address VM Physical to Host Virtual * @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, * address providing the memory to the vm physical address is returned. * A TEST_ASSERT failure occurs if no region containing gpa exists. */ -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa) +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw) { struct userspace_mem_region *region; + vm_paddr_t gpa = addr_raw2gpa(vm, gpa_raw); region = userspace_mem_region_find(vm, gpa, gpa); if (!region) { diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 28cb881f440d..0bbd88fe1127 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -198,7 +198,7 @@ static void *virt_get_pte(struct kvm_vm *vm, uint64_t pt_pfn, uint64_t vaddr, static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, uint64_t pt_pfn, uint64_t vaddr, - uint64_t paddr, + uint64_t paddr_raw, int level, enum x86_page_size page_size) { @@ -208,10 +208,9 @@ static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, pte->writable = true; pte->present = true; pte->page_size = (level == page_size); - if (pte->page_size) - pte->pfn = paddr >> vm->page_shift; - else - pte->pfn = vm_alloc_page_table(vm) >> vm->page_shift; + if (!pte->page_size) + paddr_raw = vm_alloc_page_table(vm); + pte->pfn = paddr_raw >> vm->page_shift; } else { /* * Entry already present. Assert that the caller doesn't want @@ -228,12 +227,13 @@ static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm, return pte; } -void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, +void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw, enum x86_page_size page_size) { const uint64_t pg_size = 1ull << ((page_size * 9) + 12); struct pageUpperEntry *pml4e, *pdpe, *pde; struct pageTableEntry *pte; + uint64_t paddr = addr_raw2gpa(vm, paddr_raw); TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K, "Unknown or unsupported guest mode, mode: 0x%x", vm->mode); @@ -256,15 +256,15 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, * early if a hugepage was created. */ pml4e = virt_create_upper_pte(vm, vm->pgd >> vm->page_shift, - vaddr, paddr, 3, page_size); + vaddr, paddr_raw, 3, page_size); if (pml4e->page_size) return; - pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr, 2, page_size); + pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr_raw, 2, page_size); if (pdpe->page_size) return; - pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr, 1, page_size); + pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr_raw, 1, page_size); if (pde->page_size) return; @@ -272,14 +272,14 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, pte = virt_get_pte(vm, pde->pfn, vaddr, 0); TEST_ASSERT(!pte->present, "PTE already present for 4k page at vaddr: 0x%lx\n", vaddr); - pte->pfn = paddr >> vm->page_shift; + pte->pfn = paddr_raw >> vm->page_shift; pte->writable = true; pte->present = 1; } -void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr) +void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw) { - __virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K); + __virt_pg_map(vm, vaddr, paddr_raw, X86_PAGE_SIZE_4K); } static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid, @@ -587,7 +587,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva) if (!pte[index[0]].present) goto unmapped_gva; - return (pte[index[0]].pfn * vm->page_size) + (gva & 0xfffu); + return addr_raw2gpa(vm, ((uint64_t)pte[index[0]].pfn * vm->page_size)) + (gva & 0xfffu); unmapped_gva: TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);
SEV guests rely on an encyption bit which resides within the range that current code treats as address bits. Guest code will expect these bits to be set appropriately in their page tables, whereas helpers like addr_gpa2hva() will expect these bits to be masked away prior to translation. Add proper handling for these cases. Signed-off-by: Michael Roth <michael.roth@amd.com> --- .../testing/selftests/kvm/include/kvm_util.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c | 23 +++++++++++++++- .../selftests/kvm/lib/x86_64/processor.c | 26 +++++++++---------- 3 files changed, 36 insertions(+), 14 deletions(-)