
[RFC,03/16] KVM: selftests: handle encryption bits in page tables

Message ID 20211005234459.430873-4-michael.roth@amd.com (mailing list archive)
State New
Series KVM: selftests: Add tests for SEV, SEV-ES, and SEV-SNP guests

Commit Message

Michael Roth Oct. 5, 2021, 11:44 p.m. UTC
SEV guests rely on an encryption bit which resides within the range that
current code treats as address bits. Guest code will expect these bits
to be set appropriately in their page tables, whereas helpers like
addr_gpa2hva() will expect these bits to be masked away prior to
translation. Add proper handling for these cases.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 23 +++++++++++++++-
 .../selftests/kvm/lib/x86_64/processor.c      | 26 +++++++++----------
 3 files changed, 36 insertions(+), 14 deletions(-)

Comments

Paolo Bonzini Oct. 21, 2021, 3:26 p.m. UTC | #1
On 06/10/21 01:44, Michael Roth wrote:
> SEV guests rely on an encryption bit which resides within the range that
> current code treats as address bits. Guest code will expect these bits
> to be set appropriately in their page tables, whereas helpers like
> addr_gpa2hva() will expect these bits to be masked away prior to
> translation. Add proper handling for these cases.

This is not what you're doing below in addr_gpa2hva, though---or did I 
misunderstand?

I may be wrong due to not actually having written the code, but I'd 
prefer if most of these APIs worked only if the C bit has already been 
stripped.  In general it's quite unlikely for host code to deal with C=1 
pages, so it's worth pointing out explicitly the cases where it does.

Paolo

> @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
>    * address providing the memory to the vm physical address is returned.
>    * A TEST_ASSERT failure occurs if no region containing gpa exists.
>    */
> -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa)
> +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw)
>   {
>   	struct userspace_mem_region *region;
Michael Roth Oct. 24, 2021, 4:49 p.m. UTC | #2
On Thu, Oct 21, 2021 at 05:26:26PM +0200, Paolo Bonzini wrote:
> On 06/10/21 01:44, Michael Roth wrote:
> > SEV guests rely on an encryption bit which resides within the range that
> > current code treats as address bits. Guest code will expect these bits
> > to be set appropriately in their page tables, whereas helpers like
> > addr_gpa2hva() will expect these bits to be masked away prior to
> > translation. Add proper handling for these cases.
> 
> This is not what you're doing below in addr_gpa2hva, though---or did I
> misunderstand?

The confusion is warranted: addr_gpa2hva() *doesn't* expect the C-bit to
be masked in advance, so the wording is pretty confusing.

I think I was referring to the fact that internally it doesn't need/want
the C-bit; in this case it just masks it away as a convenience to callers,
as opposed to the other functions modified in the patch, which actually
make use of it.

It's convenient because page table walkers/mappers use addr_gpa2hva() to
silently mask away C-bits when translating PTEs to host addresses. We can
easily convert those callers from:

  addr_gpa2hva(paddr)

to this:

  addr_gpa2hva(addr_raw2gpa(paddr))

but then all new code needs to consider whether it might be dealing with
C-bits prior to passing an address to addr_gpa2hva() (or not really think
about it, and add addr_raw2gpa() "just in case"). Since it's always
harmless for addr_gpa2hva() to mask the C-bit away silently, the
logic/code seems to benefit a good deal if we indicate clearly that
addr_gpa2hva() can accept a 'raw' GPA, and will ignore the C-bit completely.
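The masking convenience described above can be sketched as follows. This is a hypothetical, simplified model (the struct names and the `gpa2hva_offset()` helper are stand-ins, not the actual kvm_util structures or APIs):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the selftest's vm->memcrypt state. */
struct memcrypt {
	bool has_enc_bit;
	uint8_t enc_bit;	/* e.g. bit 51 for SEV's C-bit */
};

/* Strip the encryption bit, if any, from a raw GPA. */
static uint64_t raw2gpa(const struct memcrypt *mc, uint64_t gpa_raw)
{
	if (!mc->has_enc_bit)
		return gpa_raw;
	return gpa_raw & ~(1ULL << mc->enc_bit);
}

/*
 * A gpa2hva-style helper can then accept either a plain or a 'raw'
 * GPA transparently: the C-bit is masked off before the region lookup,
 * so callers never need to strip it first.
 */
static uint64_t gpa2hva_offset(const struct memcrypt *mc, uint64_t gpa_raw,
			       uint64_t region_base)
{
	return raw2gpa(mc, gpa_raw) - region_base;
}
```

With this, page-table walkers can pass PTE-derived addresses (which carry the C-bit) straight through without a separate stripping step.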

But it's not a big deal either way if you prefer to keep that explicit.
And the commit message still needs to be clarified.

> 
> I may be wrong due to not actually having written the code, but I'd prefer
> if most of these APIs worked only if the C bit has already been stripped.
> In general it's quite unlikely for host code to deal with C=1 pages, so it's
> worth pointing out explicitly the cases where it does.

I've tried to indicate functions that expect the C-bit by adding the '_raw'
suffix to the gpa/paddr parameters, but as you pointed out with
addr_gpa2hva() it's already a bit inconsistent in that regard, and there
are a couple of cases like virt_map() where I should have used the '_raw'
suffix as well but missed it here.

So that should be addressed, and maybe some additional comments/assertions
might be warranted to guard against cases where the C-bit is passed in
unexpectedly.

But I should probably re-assess why the C-bit is being passed around in
the first place:

 - vm_phy_page[s]_alloc() is the main 'source' for 'raw' GPAs with the
   C-bit set. it determines this based on vm_memcrypt encryption policy,
   and updates the encryption bitmask as well.
 - vm_phy_page[s]_alloc() is callable both in kvm_util lib as well as
   individual tests.
 - in theory, encoding the C-bit in the returned vm_paddr_t means that
   vm_phy_page[s]_alloc() callers can pass that directly into
   virt_map/virt_pg_map() and this will "just work" for both
   encrypted/non-encrypted guests.
 - by masking it away in addr_gpa2hva(), existing tests/code flow mostly
   "just works" as well.
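The allocation-side tagging in the first bullet can be modeled roughly as below. This is a hypothetical sketch (toy fixed-size bitmap, caller-supplied pfn), not the actual vm_phy_page[s]_alloc() implementation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the VM's encryption policy and tracking. */
struct vm_policy {
	bool enc_by_default;	/* encrypt newly allocated pages? */
	uint8_t enc_bit;	/* position of the C-bit */
	uint64_t enc_bitmap;	/* toy bitmap: one bit per pfn */
};

/*
 * Return a 'raw' GPA: the C-bit is encoded in the returned address when
 * the policy says the page is encrypted, and the encryption state is
 * recorded in the bitmap at the same time.
 */
static uint64_t phy_page_alloc_raw(struct vm_policy *vm, uint64_t pfn,
				   uint64_t page_shift)
{
	uint64_t gpa = pfn << page_shift;

	if (vm->enc_by_default) {
		vm->enc_bitmap |= 1ULL << pfn;	/* record encryption state */
		gpa |= 1ULL << vm->enc_bit;	/* encode C-bit in return */
	}
	return gpa;
}
```

The "just works" property comes from the return value: passing it directly to a mapping helper carries the C-bit along, while addr_gpa2hva()-style helpers simply mask it off again.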

But taking a closer look, in cases where vm_phy_page[s]_alloc() is called
directly by tests, like set_memory_region_test, emulator_error_test, and
smm_test, the raw GPA is compared against hardcoded non-raw GPAs, so they'd
still end up needing fixups to work with the proposed transparent-SEV-mode
handling. And future code would need to account for this, so it doesn't
really "just work" after all.

So it's worth considering the alternative approach of *not* encoding the
C-bit into GPAs returned by vm_phy_page[s]_alloc(). That would likely
involve introducing something like addr_gpa2raw(), which adds in the
C-bit according to the encryption bitmap as-needed. If we do that:

  - virt_map()/virt_pg_map() still need to accept 'raw' GPAs, since they
    need to handle cases where pages being mapped weren't allocated by
    vm_phy_page[s]_alloc(), and so aren't recorded in the bitmap. In those
    cases it is up to test code to provide the C-bit when needed (e.g. for
    things like separate linear mappings for pa()-like stuff in guest
    code).

  - for cases where vm_phy_page[s]_alloc() determines whether the page
    is encrypted, addr_gpa2raw() needs to be used to add back the C-bit
    prior to passing it to virt_map()/virt_pg_map(), both in the library and
    the test code. vm_vaddr_* allocations would handle all this under the
    covers as they do now.
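An addr_gpa2raw()-style helper for this alternative could look roughly like the following. This is a hypothetical sketch (toy bitmap, simplified struct), not a proposed final implementation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the VM's encryption tracking. */
struct vm_enc {
	bool has_enc_bit;
	uint8_t enc_bit;	/* position of the C-bit */
	uint64_t enc_bitmap;	/* toy bitmap: one bit per pfn */
};

/*
 * Inverse of addr_raw2gpa(): re-derive the 'raw' GPA by consulting the
 * encryption bitmap and adding the C-bit back for pages recorded as
 * encrypted.
 */
static uint64_t gpa2raw(const struct vm_enc *vm, uint64_t gpa,
			uint64_t page_shift)
{
	uint64_t pfn = gpa >> page_shift;

	if (!vm->has_enc_bit || !(vm->enc_bitmap & (1ULL << pfn)))
		return gpa;	/* page not recorded as encrypted */
	return gpa | (1ULL << vm->enc_bit);
}
```

The trade-off is visible here: the helper only works for pages whose encryption state is in the bitmap, which is why pages mapped outside vm_phy_page[s]_alloc() would still need the test code to supply the C-bit itself.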

So test code would need to consider cases where addr_gpa2raw() needs to be
used to set the C-bit (which is basically only when they want to mix usage
of the vm_phy_page[s]_alloc with their own mapping of the guest page tables,
which doesn't seem to be done in any existing tests anyway).

The library code would need these addr_gpa2raw() hooks in places where
it calls virt_*map() internally. Probably just a handful of places
though.

Assuming there's no issues with this alternative approach that I may be
missing, I'll look at doing it this way for the next spin.

Even with this alternative approach though, having addr_gpa2hva() silently
mask away the C-bit still seems useful for the reasons above, but again, no
strong feelings one way or the other on that.

> 
> Paolo
> 
> > @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
> >    * address providing the memory to the vm physical address is returned.
> >    * A TEST_ASSERT failure occurs if no region containing gpa exists.
> >    */
> > -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa)
> > +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw)
> >   {
> >   	struct userspace_mem_region *region;
>
Paolo Bonzini Oct. 25, 2021, 7:34 a.m. UTC | #3
On 24/10/21 18:49, Michael Roth wrote:
> So test code would need to consider cases where addr_gpa2raw() needs to be
> used to set the C-bit (which is basically only when they want to mix usage
> of the vm_phy_page[s]_alloc with their own mapping of the guest page tables,
> which doesn't seem to be done in any existing tests anyway).

Yes, and it seems like a more rare case in general.

> The library code would need these addr_gpa2raw() hooks in places where
> it calls virt_*map() internally. Probably just a handful of places
> though.

Either that, or you can have virt_*map that consults the encryption 
bitmap, and virt_*map_enc (or _raw, doesn't matter) that takes the 
encryption state explicitly as a bool.

> Even in this alternative approach though, having addr_gpa2hva() silently
> mask away C-bit still seems useful for the reasons above, but again, no
> strong feelings one way or the other on that.

Ok, keep it the way you find more useful.

Paolo
Michael Roth Oct. 25, 2021, 2:14 p.m. UTC | #4
On Mon, Oct 25, 2021 at 09:34:10AM +0200, Paolo Bonzini wrote:
> On 24/10/21 18:49, Michael Roth wrote:
> > So test code would need to consider cases where addr_gpa2raw() needs to be
> > used to set the C-bit (which is basically only when they want to mix usage
> > of the vm_phy_page[s]_alloc with their own mapping of the guest page tables,
> > which doesn't seem to be done in any existing tests anyway).
> 
> Yes, and it seems like a more rare case in general.
> 
> > The library code would need these addr_gpa2raw() hooks in places where
> > it calls virt_*map() internally. Probably just a handful of places
> > though.
> 
> Either that, or you can have virt_*map that consults the encryption bitmap,
> and virt_*map_enc (or _raw, doesn't matter) that takes the encryption state
> explicitly as a bool.

That sounds promising. Will give that a shot. Thanks!
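A minimal sketch of the split Paolo suggests, with an explicit-bool `_enc` variant and a plain variant that consults recorded state. Everything here is a hypothetical stand-in (simplified struct, a single-page lookup instead of a real bitmap, hardcoded 4K pages), not the actual selftest API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the VM state. */
struct vm_state {
	uint8_t enc_bit;	/* position of the C-bit, e.g. 51 */
	bool page_encrypted;	/* stand-in for an encryption-bitmap lookup */
};

/*
 * _enc variant: the caller states explicitly whether the mapping should
 * be encrypted. Computes the PTE pfn field with the C-bit applied; with
 * 4K pages the C-bit lands at (enc_bit - 12) within the pfn.
 */
static uint64_t pg_map_enc(const struct vm_state *vm, uint64_t paddr,
			   bool encrypt)
{
	uint64_t pte_pfn = paddr >> 12;	/* 4K pages */

	if (encrypt)
		pte_pfn |= 1ULL << (vm->enc_bit - 12);
	return pte_pfn;
}

/* Plain variant: look up the encryption state instead of taking it. */
static uint64_t pg_map(const struct vm_state *vm, uint64_t paddr)
{
	return pg_map_enc(vm, paddr, vm->page_encrypted);
}
```

With this shape, library code and ordinary tests call the plain variant and get the recorded default, while tests that manage their own page tables use the `_enc` variant to state the encryption intent explicitly.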

Patch

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index f417de80596c..4bf686d664cc 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -152,6 +152,7 @@  void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa);
 void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva);
 vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva);
 void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa);
+vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_vaddr_t gpa_raw);
 
 /*
  * Address Guest Virtual to Guest Physical
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index c58f930dedd2..ef88fdc7e46b 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1443,6 +1443,26 @@  void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 	}
 }
 
+/*
+ * Mask off any special bits from raw GPA
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   gpa_raw - Raw VM physical address
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   GPA with special bits (e.g. shared/encrypted) masked off.
+ */
+vm_paddr_t addr_raw2gpa(struct kvm_vm *vm, vm_paddr_t gpa_raw)
+{
+	if (!vm->memcrypt.has_enc_bit)
+		return gpa_raw;
+
+	return gpa_raw & ~(1ULL << vm->memcrypt.enc_bit);
+}
+
 /*
  * Address VM Physical to Host Virtual
  *
@@ -1460,9 +1480,10 @@  void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
  * address providing the memory to the vm physical address is returned.
  * A TEST_ASSERT failure occurs if no region containing gpa exists.
  */
-void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa)
+void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw)
 {
 	struct userspace_mem_region *region;
+	vm_paddr_t gpa = addr_raw2gpa(vm, gpa_raw);
 
 	region = userspace_mem_region_find(vm, gpa, gpa);
 	if (!region) {
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 28cb881f440d..0bbd88fe1127 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -198,7 +198,7 @@  static void *virt_get_pte(struct kvm_vm *vm, uint64_t pt_pfn, uint64_t vaddr,
 static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm,
 						    uint64_t pt_pfn,
 						    uint64_t vaddr,
-						    uint64_t paddr,
+						    uint64_t paddr_raw,
 						    int level,
 						    enum x86_page_size page_size)
 {
@@ -208,10 +208,9 @@  static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm,
 		pte->writable = true;
 		pte->present = true;
 		pte->page_size = (level == page_size);
-		if (pte->page_size)
-			pte->pfn = paddr >> vm->page_shift;
-		else
-			pte->pfn = vm_alloc_page_table(vm) >> vm->page_shift;
+		if (!pte->page_size)
+			paddr_raw = vm_alloc_page_table(vm);
+		pte->pfn = paddr_raw >> vm->page_shift;
 	} else {
 		/*
 		 * Entry already present.  Assert that the caller doesn't want
@@ -228,12 +227,13 @@  static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm,
 	return pte;
 }
 
-void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
+void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw,
 		   enum x86_page_size page_size)
 {
 	const uint64_t pg_size = 1ull << ((page_size * 9) + 12);
 	struct pageUpperEntry *pml4e, *pdpe, *pde;
 	struct pageTableEntry *pte;
+	uint64_t paddr = addr_raw2gpa(vm, paddr_raw);
 
 	TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K,
 		    "Unknown or unsupported guest mode, mode: 0x%x", vm->mode);
@@ -256,15 +256,15 @@  void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 	 * early if a hugepage was created.
 	 */
 	pml4e = virt_create_upper_pte(vm, vm->pgd >> vm->page_shift,
-				      vaddr, paddr, 3, page_size);
+				      vaddr, paddr_raw, 3, page_size);
 	if (pml4e->page_size)
 		return;
 
-	pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr, 2, page_size);
+	pdpe = virt_create_upper_pte(vm, pml4e->pfn, vaddr, paddr_raw, 2, page_size);
 	if (pdpe->page_size)
 		return;
 
-	pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr, 1, page_size);
+	pde = virt_create_upper_pte(vm, pdpe->pfn, vaddr, paddr_raw, 1, page_size);
 	if (pde->page_size)
 		return;
 
@@ -272,14 +272,14 @@  void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 	pte = virt_get_pte(vm, pde->pfn, vaddr, 0);
 	TEST_ASSERT(!pte->present,
 		    "PTE already present for 4k page at vaddr: 0x%lx\n", vaddr);
-	pte->pfn = paddr >> vm->page_shift;
+	pte->pfn = paddr_raw >> vm->page_shift;
 	pte->writable = true;
 	pte->present = 1;
 }
 
-void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
+void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr_raw)
 {
-	__virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K);
+	__virt_pg_map(vm, vaddr, paddr_raw, X86_PAGE_SIZE_4K);
 }
 
 static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid,
@@ -587,7 +587,7 @@  vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva)
 	if (!pte[index[0]].present)
 		goto unmapped_gva;
 
-	return (pte[index[0]].pfn * vm->page_size) + (gva & 0xfffu);
+	return addr_raw2gpa(vm, ((uint64_t)pte[index[0]].pfn * vm->page_size)) + (gva & 0xfffu);
 
 unmapped_gva:
 	TEST_FAIL("No mapping for vm virtual address, gva: 0x%lx", gva);