Message ID | 20240105183025.225972-2-mhklinux@outlook.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | x86/hyperv: Mark CoCo VM pages not present when changing encrypted state | expand |
On Fri, Jan 05, 2024 at 10:30:23AM -0800, mhkelley58@gmail.com wrote: > From: Michael Kelley <mhklinux@outlook.com> > > In preparation for temporarily marking pages not present during a > transition between encrypted and decrypted, use slow_virt_to_phys() > in the hypervisor callback. As long as the PFN is correct, > slow_virt_to_phys() works even if the leaf PTE is not present. > The existing functions that depend on vmalloc_to_page() all > require that the leaf PTE be marked present, so they don't work. > > Update the comments for slow_virt_to_phys() to note this broader usage > and the requirement to work even if the PTE is not marked present. > > Signed-off-by: Michael Kelley <mhklinux@outlook.com> > --- > arch/x86/hyperv/ivm.c | 9 ++++++++- > arch/x86/mm/pat/set_memory.c | 13 +++++++++---- > 2 files changed, 17 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c > index 02e55237d919..8ba18635e338 100644 > --- a/arch/x86/hyperv/ivm.c > +++ b/arch/x86/hyperv/ivm.c > @@ -524,7 +524,14 @@ static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo > return false; > > for (i = 0, pfn = 0; i < pagecount; i++) { > - pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE); > + /* > + * Use slow_virt_to_phys() because the PRESENT bit has been > + * temporarily cleared in the PTEs. slow_virt_to_phys() works > + * without the PRESENT bit while virt_to_hvpfn() or similar > + * does not. > + */ > + pfn_array[pfn] = slow_virt_to_phys((void *)kbuffer + > + i * HV_HYP_PAGE_SIZE) >> HV_HYP_PAGE_SHIFT; I think you can make it much more readable by introducing few variables: virt = (void *)kbuffer + i * HV_HYPPAGE_SIZE; phys = slow_virt_to_phys(virt); pfn_array[pfn] = phys >> HV_HYP_PAGE_SHIFT; > pfn++; > > if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) { > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c > index bda9f129835e..8e19796e7ce5 100644 > --- a/arch/x86/mm/pat/set_memory.c > +++ b/arch/x86/mm/pat/set_memory.c > @@ -755,10 +755,15 @@ pmd_t *lookup_pmd_address(unsigned long address) > * areas on 32-bit NUMA systems. The percpu areas can > * end up in this kind of memory, for instance. > * > - * This could be optimized, but it is only intended to be > - * used at initialization time, and keeping it > - * unoptimized should increase the testing coverage for > - * the more obscure platforms. > + * It is also used in callbacks for CoCo VM page transitions between private > + * and shared because it works when the PRESENT bit is not set in the leaf > + * PTE. In such cases, the state of the PTEs, including the PFN, is otherwise > + * known to be valid, so the returned physical address is correct. The similar > + * function vmalloc_to_pfn() can't be used because it requires the PRESENT bit. > + * > + * This could be optimized, but it is only used in paths that are not perf > + * sensitive, and keeping it unoptimized should increase the testing coverage > + * for the more obscure platforms. > */ > phys_addr_t slow_virt_to_phys(void *__virt_addr) > { > -- > 2.25.1 >
On Fri, 2024-01-05 at 10:30 -0800, mhkelley58@gmail.com wrote: > + * It is also used in callbacks for CoCo VM page transitions between > private > + * and shared because it works when the PRESENT bit is not set in > the leaf > + * PTE. In such cases, the state of the PTEs, including the PFN, is > otherwise > + * known to be valid, so the returned physical address is correct. > The similar > + * function vmalloc_to_pfn() can't be used because it requires the > PRESENT bit. I'm not sure about this comment. It is mostly about callers far away and other functions in vmalloc. Probably a decent chance to get stale. It also kind of begs the question of why vmalloc_to_pfn() requires the present bit in the leaf. It seems the first part of the comment is about why this is needed when __pa() exists. One reason given is that __pa() doesn't work with vmalloc memory. Then the next bit talks about another similar function that works with vmalloc memory. So the comment is a risk to get stale, and leaves me a little confused why this function exists. I think the reason is because vmalloc_to_pfn() *only* works with vmalloc memory and this is needed to work on other alias mappings.
From: Edgecombe, Rick P <rick.p.edgecombe@intel.com> Sent: Thursday, January 11, 2024 5:20 PM > > On Fri, 2024-01-05 at 10:30 -0800, mhkelley58@gmail.com wrote: > > + * It is also used in callbacks for CoCo VM page transitions between private > > + * and shared because it works when the PRESENT bit is not set in the leaf > > + * PTE. In such cases, the state of the PTEs, including the PFN, is otherwise > > + * known to be valid, so the returned physical address is correct. The similar > > + * function vmalloc_to_pfn() can't be used because it requires the PRESENT bit. > > I'm not sure about this comment. It is mostly about callers far away > and other functions in vmalloc. Probably a decent chance to get stale. > It also kind of begs the question of why vmalloc_to_pfn() requires the > present bit in the leaf. The comment is Kirill Shutemov's suggestion based on comments in an earlier version of the patch series. See [1]. The intent is to prevent some later revision to slow_virt_to_phys() from adding a check for the present bit and breaking the CoCo VM hypervisor callback. Yes, the comment could get stale, but I'm not sure how else to call out the implicit dependency. The idea of creating a private version of slow_virt_to_phys() for use only in the CoCo VM hypervisor callback is also discussed in the thread, but that seems worse overall. As for why vmalloc_to_page() checks the present bit, I don't know. But vmalloc_to_page() is very widely used, so trying to change it doesn't seem viable. > > It seems the first part of the comment is about why this is needed when > __pa() exists. One reason given is that __pa() doesn't work with > vmalloc memory. Then the next bit talks about another similar function > that works with vmalloc memory. > > So the comment is a risk to get stale, and leaves me a little confused > why this function exists. > > I think the reason is because vmalloc_to_pfn() *only* works with > vmalloc memory and this is needed to work on other alias mappings. Presumably so. The first paragraph of the existing comment also calls out "alloc_remap() areas on 32-bit NUMA systems" as needing slow_virt_to_phys(). Michael [1] https://lore.kernel.org/lkml/20230828221333.5blshosyqafbgwlc@box.shutemov.name/
On Fri, 2024-01-12 at 15:07 +0000, Michael Kelley wrote: > The comment is Kirill Shutemov's suggestion based on comments in > an earlier version of the patch series. See [1]. The intent is to > prevent > some later revision to slow_virt_to_phys() from adding a check for > the > present bit and breaking the CoCo VM hypervisor callback. Yes, the > comment could get stale, but I'm not sure how else to call out the > implicit dependency. The idea of creating a private version of > slow_virt_to_phys() for use only in the CoCo VM hypervisor callback > is also discussed in the thread, but that seems worse overall. Well, it's not a huge deal, but I would have just put a comment at the caller like: /* * Use slow_virt_to_phys() instead of vmalloc_to_page(), because it * returns the PFN even for NP PTEs. */ If someone is changing slow_virt_to_phys() they should check the callers to make sure they won't break anything. They can see the comment then and have an easy time. An optional comment at slow_virt_to_phys() could explain how the function works in regards to the present bit, but not include details about CoCoO VM page transition's usage of the present bit. The proposed comment text looks like something more appropriate for a commit log.
From: Edgecombe, Rick P <rick.p.edgecombe@intel.com> Sent: Friday, January 12, 2024 9:17 AM > > On Fri, 2024-01-12 at 15:07 +0000, Michael Kelley wrote: > > The comment is Kirill Shutemov's suggestion based on comments in > > an earlier version of the patch series. See [1]. The intent is to > > prevent > > some later revision to slow_virt_to_phys() from adding a check for > > the > > present bit and breaking the CoCo VM hypervisor callback. Yes, the > > comment could get stale, but I'm not sure how else to call out the > > implicit dependency. The idea of creating a private version of > > slow_virt_to_phys() for use only in the CoCo VM hypervisor callback > > is also discussed in the thread, but that seems worse overall. > > Well, it's not a huge deal, but I would have just put a comment at the > caller like: > > /* > * Use slow_virt_to_phys() instead of vmalloc_to_page(), because it > * returns the PFN even for NP PTEs. > */ Yes, that comment is added in this patch. > > If someone is changing slow_virt_to_phys() they should check the > callers to make sure they won't break anything. They can see the > comment then and have an easy time. > > An optional comment at slow_virt_to_phys() could explain how the > function works in regards to the present bit, but not include details > about CoCoO VM page transition's usage of the present bit. The proposed > comment text looks like something more appropriate for a commit log. Kirill -- you originally asked for a comment in slow_virt_to_phys(). [1] Are you OK with Rick's proposal? Michael [1] https://lore.kernel.org/lkml/20230828221333.5blshosyqafbgwlc@box.shutemov.name/
On Fri, Jan 12, 2024 at 07:24:35PM +0000, Michael Kelley wrote: > From: Edgecombe, Rick P <rick.p.edgecombe@intel.com> Sent: Friday, January 12, 2024 9:17 AM > > > > On Fri, 2024-01-12 at 15:07 +0000, Michael Kelley wrote: > > > The comment is Kirill Shutemov's suggestion based on comments in > > > an earlier version of the patch series. See [1]. The intent is to > > > prevent > > > some later revision to slow_virt_to_phys() from adding a check for > > > the > > > present bit and breaking the CoCo VM hypervisor callback. Yes, the > > > comment could get stale, but I'm not sure how else to call out the > > > implicit dependency. The idea of creating a private version of > > > slow_virt_to_phys() for use only in the CoCo VM hypervisor callback > > > is also discussed in the thread, but that seems worse overall. > > > > Well, it's not a huge deal, but I would have just put a comment at the > > caller like: > > > > /* > > * Use slow_virt_to_phys() instead of vmalloc_to_page(), because it > > * returns the PFN even for NP PTEs. > > */ > > Yes, that comment is added in this patch. > > > > > If someone is changing slow_virt_to_phys() they should check the > > callers to make sure they won't break anything. They can see the > > comment then and have an easy time. > > > > An optional comment at slow_virt_to_phys() could explain how the > > function works in regards to the present bit, but not include details > > about CoCoO VM page transition's usage of the present bit. The proposed > > comment text looks like something more appropriate for a commit log. > > Kirill -- you originally asked for a comment in slow_virt_to_phys(). [1] > Are you OK with Rick's proposal? Yes, sounds sensible.
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c index 02e55237d919..8ba18635e338 100644 --- a/arch/x86/hyperv/ivm.c +++ b/arch/x86/hyperv/ivm.c @@ -524,7 +524,14 @@ static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo return false; for (i = 0, pfn = 0; i < pagecount; i++) { - pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE); + /* + * Use slow_virt_to_phys() because the PRESENT bit has been + * temporarily cleared in the PTEs. slow_virt_to_phys() works + * without the PRESENT bit while virt_to_hvpfn() or similar + * does not. + */ + pfn_array[pfn] = slow_virt_to_phys((void *)kbuffer + + i * HV_HYP_PAGE_SIZE) >> HV_HYP_PAGE_SHIFT; pfn++; if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) { diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index bda9f129835e..8e19796e7ce5 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -755,10 +755,15 @@ pmd_t *lookup_pmd_address(unsigned long address) * areas on 32-bit NUMA systems. The percpu areas can * end up in this kind of memory, for instance. * - * This could be optimized, but it is only intended to be - * used at initialization time, and keeping it - * unoptimized should increase the testing coverage for - * the more obscure platforms. + * It is also used in callbacks for CoCo VM page transitions between private + * and shared because it works when the PRESENT bit is not set in the leaf + * PTE. In such cases, the state of the PTEs, including the PFN, is otherwise + * known to be valid, so the returned physical address is correct. The similar + * function vmalloc_to_pfn() can't be used because it requires the PRESENT bit. + * + * This could be optimized, but it is only used in paths that are not perf + * sensitive, and keeping it unoptimized should increase the testing coverage + * for the more obscure platforms. */ phys_addr_t slow_virt_to_phys(void *__virt_addr) {