Message ID | 20230728051024.33063-1-haitao.huang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v6] x86/sgx: Resolves SECS reclaim vs. page fault for EAUG race | expand |
Hi Everybody, I'd like to check in on the status of this patch. This two month old patch looks to be a needed fix and has Jarkko and Kai's review tags, but I am not able to find it queued or merged in tip or upstream. Apologies if I did not look in the right spot, I just want to make sure it did not fall through the cracks if deemed needed. Reinette On 7/27/2023 10:10 PM, Haitao Huang wrote: > The SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC page for an > enclave and set secs.epc_page to NULL. The SECS page is used for EAUG > and ELDU in the SGX page fault handler. However, the NULL check for > secs.epc_page is only done for ELDU, not EAUG before being used. > > Fix this by doing the same NULL check and reloading of the SECS page as > needed for both EAUG and ELDU. > > The SECS page holds global enclave metadata. It can only be reclaimed > when there are no other enclave pages remaining. At that point, > virtually nothing can be done with the enclave until the SECS page is > paged back in. > > An enclave can not run nor generate page faults without a resident SECS > page. But it is still possible for a #PF for a non-SECS page to race > with paging out the SECS page: when the last resident non-SECS page A > triggers a #PF in a non-resident page B, and then page A and the SECS > both are paged out before the #PF on B is handled. > > Hitting this bug requires that race triggered with a #PF for EAUG. > Following is a trace when it happens. > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > RIP: 0010:sgx_encl_eaug_page+0xc7/0x210 > Call Trace: > ? __kmem_cache_alloc_node+0x16a/0x440 > ? xa_load+0x6e/0xa0 > sgx_vma_fault+0x119/0x230 > __do_fault+0x36/0x140 > do_fault+0x12f/0x400 > __handle_mm_fault+0x728/0x1110 > handle_mm_fault+0x105/0x310 > do_user_addr_fault+0x1ee/0x750 > ? __this_cpu_preempt_check+0x13/0x20 > exc_page_fault+0x76/0x180 > asm_exc_page_fault+0x27/0x30 > > Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") > Cc: stable@vger.kernel.org # v6.0+ > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > Reviewed-by: Kai Huang <kai.huang@intel.com> > Acked-by: Reinette Chatre <reinette.chatre@intel.com> > --- > v6: > - Removed 'Under heavy load' as it is not the required condition though > it makes the bug more likely happen. (Kai) > - Added mentioning of the NULL check and reloading already done for ELDU (Kai) > - Added Reviewed-by (Kai) > > v5: > - Trimmed trace and added Acked-by (Reinette) > > v4: > - Refined the title (Kai, Dave) > - Added a trace to commit meesage (Kai) > - Added a few details for the race. > > v3: > - Added comments on sgx_encl_load_secs(). (Dave) > - Added theory of the race condition to hit the bug. (Dave) > - Added Reviewed-by, and applicable stable release. (Jarkko) > > v2: > - Fixes for style, commit message (Jarkko, Kai) > - Removed unneeded WARN_ON (Kai) > --- > arch/x86/kernel/cpu/sgx/encl.c | 30 +++++++++++++++++++++++++----- > 1 file changed, 25 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > index 91fa70e51004..279148e72459 100644 > --- a/arch/x86/kernel/cpu/sgx/encl.c > +++ b/arch/x86/kernel/cpu/sgx/encl.c > @@ -235,6 +235,21 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, > return epc_page; > } > > +/* > + * Ensure the SECS page is not swapped out. Must be called with encl->lock > + * to protect the enclave states including SECS and ensure the SECS page is > + * not swapped out again while being used. > + */ > +static struct sgx_epc_page *sgx_encl_load_secs(struct sgx_encl *encl) > +{ > + struct sgx_epc_page *epc_page = encl->secs.epc_page; > + > + if (!epc_page) > + epc_page = sgx_encl_eldu(&encl->secs, NULL); > + > + return epc_page; > +} > + > static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, > struct sgx_encl_page *entry) > { > @@ -248,11 +263,9 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, > return entry; > } > > - if (!(encl->secs.epc_page)) { > - epc_page = sgx_encl_eldu(&encl->secs, NULL); > - if (IS_ERR(epc_page)) > - return ERR_CAST(epc_page); > - } > + epc_page = sgx_encl_load_secs(encl); > + if (IS_ERR(epc_page)) > + return ERR_CAST(epc_page); > > epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); > if (IS_ERR(epc_page)) > @@ -339,6 +352,13 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, > > mutex_lock(&encl->lock); > > + epc_page = sgx_encl_load_secs(encl); > + if (IS_ERR(epc_page)) { > + if (PTR_ERR(epc_page) == -EBUSY) > + vmret = VM_FAULT_NOPAGE; > + goto err_out_unlock; > + } > + > epc_page = sgx_alloc_epc_page(encl_page, false); > if (IS_ERR(epc_page)) { > if (PTR_ERR(epc_page) == -EBUSY) > > base-commit: 6eaae198076080886b9e7d57f4ae06fa782f90ef
On 9/28/23 16:08, Reinette Chatre wrote: > I'd like to check in on the status of this patch. This two month old > patch looks to be a needed fix and has Jarkko and Kai's review tags, > but I am not able to find it queued or merged in tip or upstream. > Apologies if I did not look in the right spot, I just want to make > sure it did not fall through the cracks if deemed needed. It fell through the cracks. Sorry about that. It's in x86/urgent now.
Hi Dave, On 9/28/2023 5:11 PM, Dave Hansen wrote: > On 9/28/23 16:08, Reinette Chatre wrote: >> I'd like to check in on the status of this patch. This two month old >> patch looks to be a needed fix and has Jarkko and Kai's review tags, >> but I am not able to find it queued or merged in tip or upstream. >> Apologies if I did not look in the right spot, I just want to make >> sure it did not fall through the cracks if deemed needed. > > It fell through the cracks. Sorry about that. It's in x86/urgent now. No problem. Thank you very much for including it. Reinette
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 91fa70e51004..279148e72459 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -235,6 +235,21 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, return epc_page; } +/* + * Ensure the SECS page is not swapped out. Must be called with encl->lock + * to protect the enclave states including SECS and ensure the SECS page is + * not swapped out again while being used. + */ +static struct sgx_epc_page *sgx_encl_load_secs(struct sgx_encl *encl) +{ + struct sgx_epc_page *epc_page = encl->secs.epc_page; + + if (!epc_page) + epc_page = sgx_encl_eldu(&encl->secs, NULL); + + return epc_page; +} + static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, struct sgx_encl_page *entry) { @@ -248,11 +263,9 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl, return entry; } - if (!(encl->secs.epc_page)) { - epc_page = sgx_encl_eldu(&encl->secs, NULL); - if (IS_ERR(epc_page)) - return ERR_CAST(epc_page); - } + epc_page = sgx_encl_load_secs(encl); + if (IS_ERR(epc_page)) + return ERR_CAST(epc_page); epc_page = sgx_encl_eldu(entry, encl->secs.epc_page); if (IS_ERR(epc_page)) @@ -339,6 +352,13 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma, mutex_lock(&encl->lock); + epc_page = sgx_encl_load_secs(encl); + if (IS_ERR(epc_page)) { + if (PTR_ERR(epc_page) == -EBUSY) + vmret = VM_FAULT_NOPAGE; + goto err_out_unlock; + } + epc_page = sgx_alloc_epc_page(encl_page, false); if (IS_ERR(epc_page)) { if (PTR_ERR(epc_page) == -EBUSY)