diff mbox series

[v9,2/3] x86/sgx: Introduce union with vepc_vaddr field for virtualization case

Message ID 20220920063948.3556917-3-zhiquan1.li@intel.com (mailing list archive)
State New, archived
Headers show
Series x86/sgx: fine grained SGX MCA behavior | expand

Commit Message

Zhiquan Li Sept. 20, 2022, 6:39 a.m. UTC
When a page triggers a machine check, it only reports the PFN.  But in
order to inject #MC into hypervisor, the virtual address is required.
The 'encl_owner' field is useless in virtualization case, then
repurpose it as 'vepc_vaddr' - the virtual address of the virtual EPC
page for such case so that arch_memory_failure() can easily retrieve it.

Introduce a union to prevent adding a new dedicated structure to
track the virtual address of virtual EPC page.  And it can also prevent
playing the casting games while using it.

Add a new EPC page flag - SGX_EPC_PAGE_KVM_GUEST to interpret the
meaning of the field.

Co-developed-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

---
Changes since V8:
- Remove excess Acked-by.

Changes since V7:
- Add Acked-by from Jarkko.

No changes since V6.

Changes since V5:
- To prevent casting the 'encl_owner' field, introduce a union with
  another field - 'vepc_vaddr', sugguested by Dave Hansen.
- Add Reviewed-by from Jarkko.
  Link: https://lore.kernel.org/linux-sgx/Yrf27fugD7lkyaek@kernel.org/T/#m379d00fc7f1d43726a42b3884637532061a8c0d1

Changes since V4:
- Add Co-developed-by and Signed-off-by from Cathy Zhang, as she had
  fully discussed the flag name with Jarkko.
  Link: https://lore.kernel.org/all/df92395ade424401ac3c6322de568720@intel.com/
- Add Acked-by from Kai Huang
  Link: https://lore.kernel.org/linux-sgx/0676cd4e-d94b-e904-81ae-ca1c05d37070@intel.com/T/#mccfb11df30698dbd060f2b6f06383cda7f154ef3

Changes since V3:
- Take the definition of EPC page flag SGX_EPC_PAGE_KVM_GUEST from
  Cathy Zhang's third patch of SGX rebootless recovery patch set but
  discard irrelevant portion, since it might need some time to
  re-forge and these are two different features.
  Link: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@kernel.org/T/#m9782d23496cacecb7da07a67daa79f4b322ae170

Changes since V2:
- Remove struct sgx_vepc_page and relevant code.
- Rework the patch suggested by Jarkko.
- Remove new EPC page flag SGX_EPC_PAGE_IS_VEPC definition as it is
  duplicated to SGX_EPC_PAGE_KVM_GUEST.
  Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@intel.com/T/#u

Changes since V1:
- Add documentation suggested by Jarkko.
---
 arch/x86/kernel/cpu/sgx/main.c | 4 ++++
 arch/x86/kernel/cpu/sgx/sgx.h  | 8 +++++++-
 arch/x86/kernel/cpu/sgx/virt.c | 4 +++-
 3 files changed, 14 insertions(+), 2 deletions(-)

Comments

Dave Hansen Oct. 10, 2022, 11:10 p.m. UTC | #1
On 9/19/22 23:39, Zhiquan Li wrote:
> --- a/arch/x86/kernel/cpu/sgx/virt.c
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
>  	if (epc_page)
>  		return 0;
>  
> -	epc_page = sgx_alloc_epc_page(vepc, false);
> +	epc_page = sgx_alloc_epc_page((void *)addr, false);
>  	if (IS_ERR(epc_page))
>  		return PTR_ERR(epc_page);

One thing not clear from the changelog: This actually changes the value
getting passed into sgx_alloc_epc_page() and set in the page->owner field.

What effect does this have?  If I apply these and run the tree at this
commit, what happens?  What behavior changes?

Was this 'vepc' value simply not used before?
Zhiquan Li Oct. 11, 2022, 5:49 a.m. UTC | #2
On 2022/10/11 07:10, Dave Hansen wrote:
> On 9/19/22 23:39, Zhiquan Li wrote:
>> --- a/arch/x86/kernel/cpu/sgx/virt.c
>> +++ b/arch/x86/kernel/cpu/sgx/virt.c
>> @@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
>>  	if (epc_page)
>>  		return 0;
>>  
>> -	epc_page = sgx_alloc_epc_page(vepc, false);
>> +	epc_page = sgx_alloc_epc_page((void *)addr, false);
>>  	if (IS_ERR(epc_page))
>>  		return PTR_ERR(epc_page);
> 
> One thing not clear from the changelog: This actually changes the value
> getting passed into sgx_alloc_epc_page() and set in the page->owner field.
> 
> What effect does this have?  If I apply these and run the tree at this
> commit, what happens?  What behavior changes?
> 
> Was this 'vepc' value simply not used before?

Yes, it was not used before. Kai had confirmed this point:
https://lore.kernel.org/all/fa93057f417b1f630d8199381589c415a0ec710b.camel@intel.com/

The initial idea is to add a new struct sgx_vepc_page to hold 'vaddr'
and the reversed relationship from EPC page to struct sgx_vepc:

  struct sgx_vepc_page {
  	unsigned long vaddr;
  	struct sgx_vepc *vepc;
  };

But which means there will be additional 16 bytes memory consumption on
host for each EPC page assigned into guest.

Jarkko suggested us repurpose the 'owner' field:
https://lore.kernel.org/linux-sgx/Yoa90l89OTQX0NYk@iki.fi/

1. It can save memory.
2. We don't have any scenario need the reversed relationship from EPC
page to struct sgx_vepc, keeping the relationship from VEPC to EPC pages
is enough.

May I add below description to explain behavior changes in changelog?

The behavior is changed when allocating an EPC page to a virtual EPC,
the virtual address of the virtual EPC will be passed as the first
argument of sgx_alloc_epc_page() which be assigned to 'encl_owner' field
of struct sgx_epc_page.  After that, the reversed relationship from an
EPC page to virtual EPC doesn't exist, in practice, such relationship is
useless.

Thanks & Best Regards,
Zhiquan
Dave Hansen Oct. 11, 2022, 1:57 p.m. UTC | #3
On 10/10/22 22:49, Zhiquan Li wrote:
> 
> May I add below description to explain behavior changes in changelog?
> 
> The behavior is changed when allocating an EPC page to a virtual EPC,
> the virtual address of the virtual EPC will be passed as the first
> argument of sgx_alloc_epc_page() which be assigned to 'encl_owner' field
> of struct sgx_epc_page.  After that, the reversed relationship from an
> EPC page to virtual EPC doesn't exist, in practice, such relationship is
> useless.

That honestly doesn't help me understand what's going on here.

The changelog and subject are spending far too much time explaining the
low level details of what is being done (like "add a union") and far too
little time explaining what it _means_.
Zhiquan Li Oct. 12, 2022, 4:42 a.m. UTC | #4
On 2022/10/11 21:57, Dave Hansen wrote:
> That honestly doesn't help me understand what's going on here.
> 
> The changelog and subject are spending far too much time explaining the
> low level details of what is being done (like "add a union") and far too
> little time explaining what it _means_.

Thanks for your comments, Dave.

I added more descriptions for motivation and reduced explanations
for details in changelog as below, could you help me review if it is
easier to understand now?

    When a page triggers a machine check, it only reports the PFN.  But
    in order to inject #MC into VM through hypervisor, the virtual
    address of virtual EPC page is required.  To avoid introduce a new
    dedicated structure to track it, repurpose the 'encl_owner' field as
    it is useless in virtualization case, so that arch_memory_failure()
    can retrieve the virtual address easily and attach it to the extra
    info of SIGBUS which will be sent to hypervisor.

    The union can prevent playing the casting games while using it.  And
    add a new EPC page flag - SGX_EPC_PAGE_KVM_GUEST to interpret the
    meaning of the field.

    The changes take place while allocating an EPC page to a KVM guests.
    After applying it, the reversed relationship from an EPC page to
    virtual EPC doesn't exist, in practice, such relationship is
    useless. Beyond that, other behaviors have not been changed.

Furthermore, may I change the subject as below to highlight the
motivation?

    x86/sgx: track virtual address of virtual EPC page for injecting #MC
    into VM

Best Regards,
Zhiquan
Huang, Kai Oct. 12, 2022, 11:17 a.m. UTC | #5
On Wed, 2022-10-12 at 12:42 +0800, Zhiquan Li wrote:
> I added more descriptions for motivation and reduced explanations
> for details in changelog as below, could you help me review if it is
> easier to understand now?
> 
>     When a page triggers a machine check, it only reports the PFN.  But
>     in order to inject #MC into VM through hypervisor, the virtual
>     address of virtual EPC page is required.  To avoid introduce a new
>     dedicated structure to track it, repurpose the 'encl_owner' field as
>     it is useless in virtualization case, so that arch_memory_failure()
>     can retrieve the virtual address easily and attach it to the extra
>     info of SIGBUS which will be sent to hypervisor.
> 
>     The union can prevent playing the casting games while using it.  And
>     add a new EPC page flag - SGX_EPC_PAGE_KVM_GUEST to interpret the
>     meaning of the field.
> 
>     The changes take place while allocating an EPC page to a KVM guests.
>     After applying it, the reversed relationship from an EPC page to
>     virtual EPC doesn't exist, in practice, such relationship is
>     useless. Beyond that, other behaviors have not been changed.

To me the above last paragraph is very hard to comprehend.  I would just point
out the fact:

Virtual EPC pages's 'encl_owner' is not used so it's safe to repurpose it to
carry virtual EPC page's virtual address. 

(and perhaps somehow to merge to your first paragraph where why the virtual
address is needed is explained.)
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 1315c69a733e..b319bedcaf1e 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -549,6 +549,10 @@  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
  * Finally, wake up ksgxd when the number of pages goes below the watermark
  * before returning back to the caller.
  *
+ * When an EPC page is assigned to KVM guest, repurpose the 'encl_owner' field
+ * as the virtual address of virtual EPC page, since it is useless in such
+ * scenario, so 'owner' is assigned to 'vepc_vaddr'.
+ *
  * Return:
  *   an EPC page,
  *   -errno on error
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 4d88abccd12e..d16a8baa28d4 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -28,12 +28,18 @@ 
 
 /* Pages on free list */
 #define SGX_EPC_PAGE_IS_FREE		BIT(1)
+/* Pages allocated for KVM guest */
+#define SGX_EPC_PAGE_KVM_GUEST		BIT(2)
 
 struct sgx_epc_page {
 	unsigned int section;
 	u16 flags;
 	u16 poison;
-	struct sgx_encl_page *encl_owner;
+	union {
+		struct sgx_encl_page *encl_owner;
+		/* Use when SGX_EPC_PAGE_KVM_GUEST set in ->flags: */
+		void __user *vepc_vaddr;
+	};
 	struct list_head list;
 };
 
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 6a77a14eee38..776ae5c1c032 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -46,10 +46,12 @@  static int __sgx_vepc_fault(struct sgx_vepc *vepc,
 	if (epc_page)
 		return 0;
 
-	epc_page = sgx_alloc_epc_page(vepc, false);
+	epc_page = sgx_alloc_epc_page((void *)addr, false);
 	if (IS_ERR(epc_page))
 		return PTR_ERR(epc_page);
 
+	epc_page->flags |= SGX_EPC_PAGE_KVM_GUEST;
+
 	ret = xa_err(xa_store(&vepc->page_array, index, epc_page, GFP_KERNEL));
 	if (ret)
 		goto err_free;