Message ID | 7e622156315c9c22c3ef84a7c0aeb01b5c001ff9.1638381245.git.reinette.chatre@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/sgx and selftests/sgx: Support SGX2 | expand |
On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote: > === Summary === > > An SGX VMA can only be created if its permissions are the same or > weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > creation this rule continues to be enforced by the page fault handler. > > With SGX2 the EPCM permissions of a page can change after VMA > creation resulting in the VMA exceeding the EPCM permissions and the > page fault handler incorrectly blocking access. > > Enable the VMA's pages to remain accessible while ensuring that > the page table entries are installed to match the EPCM permissions > without exceeding the VMA perms issions. I don't understand what the short summary means in English, and the commit message is way too bloated to make any conclusions. It really needs a rewrite. These were the questions I could not find answer for: 1. Why it would be by any means safe to remove a permission check? 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing VMA's and mmap() new ones. /Jarkko
On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote: > On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote: > > === Summary === > > > > An SGX VMA can only be created if its permissions are the same or > > weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > > creation this rule continues to be enforced by the page fault handler. > > > > With SGX2 the EPCM permissions of a page can change after VMA > > creation resulting in the VMA exceeding the EPCM permissions and the > > page fault handler incorrectly blocking access. > > > > Enable the VMA's pages to remain accessible while ensuring that > > the page table entries are installed to match the EPCM permissions > > without exceeding the VMA perms issions. > > I don't understand what the short summary means in English, and the > commit message is way too bloated to make any conclusions. It really > needs a rewrite. > > These were the questions I could not find answer for: > > 1. Why it would be by any means safe to remove a permission check? > 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing > VMA's and mmap() new ones. 3. Isn't this an API/ABI break? /Jarkko
Hi Jarkko, On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote: > On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote: >> On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote: >>> === Summary === >>> >>> An SGX VMA can only be created if its permissions are the same or >>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA >>> creation this rule continues to be enforced by the page fault handler. >>> >>> With SGX2 the EPCM permissions of a page can change after VMA >>> creation resulting in the VMA exceeding the EPCM permissions and the >>> page fault handler incorrectly blocking access. >>> >>> Enable the VMA's pages to remain accessible while ensuring that >>> the page table entries are installed to match the EPCM permissions >>> without exceeding the VMA perms issions. >> >> I don't understand what the short summary means in English, and the >> commit message is way too bloated to make any conclusions. It really >> needs a rewrite. >> >> These were the questions I could not find answer for: >> >> 1. Why it would be by any means safe to remove a permission check? The permission check is redundant for SGX1 and incorrect for SGX2. In the current SGX1 implementation the permission check in sgx_encl_load_page() is redundant because an SGX VMA can only be created if its permissions are the same or weaker than the EPCM permissions. In SGX2 a user is able to change EPCM permissions during runtime (while VMA has the memory mapped). A RW VMA may thus originally have mapped an enclave page with RW EPCM permissions but since then the enclave page may have its permissions changed to read-only. The VMA should still be able to read those enclave pages but the check in sgx_encl_load_page() will prevent that. >> 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing >> VMA's and mmap() new ones. User is not prevented from closing existing VMAs and creating new ones. > 3. Isn't this an API/ABI break? Could you please elaborate where you see the API/ABI break? The rule that new VMAs cannot exceed EPCM permissions is untouched. Reinette
On Mon, 2021-12-06 at 13:16 -0800, Reinette Chatre wrote: > Hi Jarkko, > > On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote: > > On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote: > > > On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote: > > > > === Summary === > > > > > > > > An SGX VMA can only be created if its permissions are the same or > > > > weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA > > > > creation this rule continues to be enforced by the page fault handler. > > > > > > > > With SGX2 the EPCM permissions of a page can change after VMA > > > > creation resulting in the VMA exceeding the EPCM permissions and the > > > > page fault handler incorrectly blocking access. > > > > > > > > Enable the VMA's pages to remain accessible while ensuring that > > > > the page table entries are installed to match the EPCM permissions > > > > without exceeding the VMA perms issions. > > > > > > I don't understand what the short summary means in English, and the > > > commit message is way too bloated to make any conclusions. It really > > > needs a rewrite. > > > > > > These were the questions I could not find answer for: > > > > > > 1. Why it would be by any means safe to remove a permission check? > > The permission check is redundant for SGX1 and incorrect for SGX2. > > In the current SGX1 implementation the permission check in > sgx_encl_load_page() is redundant because an SGX VMA can only be created > if its permissions are the same or weaker than the EPCM permissions. > > In SGX2 a user is able to change EPCM permissions during runtime (while > VMA has the memory mapped). A RW VMA may thus originally have mapped an > enclave page with RW EPCM permissions but since then the enclave page > may have its permissions changed to read-only. The VMA should still be > able to read those enclave pages but the check in sgx_encl_load_page() > will prevent that. > > > > 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing > > > VMA's and mmap() new ones. > > User is not prevented from closing existing VMAs and creating new ones. > > > 3. Isn't this an API/ABI break? > > Could you please elaborate where you see the API/ABI break? The rule > that new VMAs cannot exceed EPCM permissions is untouched. > > Reinette I just don't understand the description. There's a whole bunch of text but It does not discuss what the patch does in low-level detail what the patch does, e.g. the use of vm_insert_pfn_prot(). I honestly do not get the story here... /Jarkko
Hi Jarkko, On 12/10/2021 9:39 PM, Jarkko Sakkinen wrote: > On Mon, 2021-12-06 at 13:16 -0800, Reinette Chatre wrote: >> On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote: >>> On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote: >>>> On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote: >>>>> === Summary === >>>>> >>>>> An SGX VMA can only be created if its permissions are the same or >>>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA >>>>> creation this rule continues to be enforced by the page fault handler. >>>>> >>>>> With SGX2 the EPCM permissions of a page can change after VMA >>>>> creation resulting in the VMA exceeding the EPCM permissions and the >>>>> page fault handler incorrectly blocking access. >>>>> >>>>> Enable the VMA's pages to remain accessible while ensuring that >>>>> the page table entries are installed to match the EPCM permissions >>>>> without exceeding the VMA perms issions. >>>> >>>> I don't understand what the short summary means in English, and the >>>> commit message is way too bloated to make any conclusions. It really >>>> needs a rewrite. >>>> >>>> These were the questions I could not find answer for: >>>> >>>> 1. Why it would be by any means safe to remove a permission check? >> >> The permission check is redundant for SGX1 and incorrect for SGX2. >> >> In the current SGX1 implementation the permission check in >> sgx_encl_load_page() is redundant because an SGX VMA can only be created >> if its permissions are the same or weaker than the EPCM permissions. >> >> In SGX2 a user is able to change EPCM permissions during runtime (while >> VMA has the memory mapped). A RW VMA may thus originally have mapped an >> enclave page with RW EPCM permissions but since then the enclave page >> may have its permissions changed to read-only. The VMA should still be >> able to read those enclave pages but the check in sgx_encl_load_page() >> will prevent that. >> >>>> 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing >>>> VMA's and mmap() new ones. >> >> User is not prevented from closing existing VMAs and creating new ones. >> >>> 3. Isn't this an API/ABI break? >> >> Could you please elaborate where you see the API/ABI break? The rule >> that new VMAs cannot exceed EPCM permissions is untouched. >> >> Reinette > > I just don't understand the description. There's a whole bunch of text > but > > It does not discuss what the patch does in low-level detail what the > patch does, e.g. the use of vm_insert_pfn_prot(). I honestly do not > get the story here... vmf_insert_pfn_prot() replaces the existing call to vmf_insert_pfn(). Notice how: vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn) { return vmf_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); } vmf_insert_pfn() is replaced with the function it would call anyway. It is done because the PTE being installed should no longer blindly inherit the VMA permission as is done in the current code, but it should also take the EPCM permissions into account. This is because the EPCM permissions can change after the VMA is created. For example, consider a RW VMA created to map pages with RW EPCM pages. Since SGX1 does not allow EPCM permission changes it is ok to always install RW PTEs to access those pages and thus vmf_insert_pfn() is sufficient. In SGX2 the EPCM pages may become read-only and the PTEs should no longer be RW. This is made possible with the call to vmf_insert_pfn_prot() where the protection bits for the PTE can be provided (so that the PTE permissions do not exceed the EPCM permissions). Reinette
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 001808e3901c..20e97d3abdce 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page, } static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, - unsigned long addr, - unsigned long vm_flags) + unsigned long addr) { - unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC); struct sgx_epc_page *epc_page; struct sgx_encl_page *entry; @@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, if (!entry) return ERR_PTR(-EFAULT); - /* - * Verify that the faulted page has equal or higher build time - * permissions than the VMA permissions (i.e. the subset of {VM_READ, - * VM_WRITE, VM_EXECUTE} in vma->vm_flags). - */ - if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits) - return ERR_PTR(-EFAULT); - /* Entry successfully located. */ if (entry->epc_page) { if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED) @@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) { unsigned long addr = (unsigned long)vmf->address; struct vm_area_struct *vma = vmf->vma; + unsigned long page_prot_bits; struct sgx_encl_page *entry; + unsigned long vm_prot_bits; unsigned long phys_addr; struct sgx_encl *encl; vm_fault_t ret; @@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) mutex_lock(&encl->lock); - entry = sgx_encl_load_page(encl, addr, vma->vm_flags); + entry = sgx_encl_load_page(encl, addr); if (IS_ERR(entry)) { mutex_unlock(&encl->lock); @@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) phys_addr = sgx_get_epc_phys_addr(entry->epc_page); - ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); + /* + * Insert PTE to match the EPCM page permissions ensured to not + * exceed the VMA permissions. + */ + vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC); + page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits; + /* + * Add VM_SHARED so that PTE is made writable right away if VMA + * and EPCM are writable (no COW in SGX). + */ + page_prot_bits |= (vma->vm_flags & VM_SHARED); + ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr), + vm_get_page_prot(page_prot_bits)); if (ret != VM_FAULT_NOPAGE) { mutex_unlock(&encl->lock); @@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag * Load an enclave page to EPC if required, and take encl->lock. */ static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl, - unsigned long addr, - unsigned long vm_flags) + unsigned long addr) { struct sgx_encl_page *entry; for ( ; ; ) { mutex_lock(&encl->lock); - entry = sgx_encl_load_page(encl, addr, vm_flags); + entry = sgx_encl_load_page(encl, addr); if (PTR_ERR(entry) != -EBUSY) break; @@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr, return -EFAULT; for (i = 0; i < len; i += cnt) { - entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK, - vma->vm_flags); + entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK); if (IS_ERR(entry)) { ret = PTR_ERR(entry); break;
=== Summary === An SGX VMA can only be created if its permissions are the same or weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA creation this rule continues to be enforced by the page fault handler. With SGX2 the EPCM permissions of a page can change after VMA creation resulting in the VMA exceeding the EPCM permissions and the page fault handler incorrectly blocking access. Enable the VMA's pages to remain accessible while ensuring that the page table entries are installed to match the EPCM permissions without exceeding the VMA permissions. === Full Changelog === An SGX enclave is an area of memory where parts of an application can reside. First an enclave is created and loaded (from non-enclave memory) with the code and data of an application, then user space can map (mmap()) the enclave memory to be able to enter the enclave at its defined entry points for execution within it. The hardware maintains a secure structure, the Enclave Page Cache Map (EPCM), that tracks the contents of the enclave. Of interest here is its tracking of the enclave page permissions. When a page is loaded into the enclave its permissions are specified and recorded in the EPCM. In parallel the OS maintains the page table permissions and the rule is that page table permissions are never allowed to exceed EPCM permissions. A new mapping (mmap()) of enclave memory can only succeed if the mapping has the same or weaker permissions than the permissions that were vetted during enclave creation. This is enforced by sgx_encl_may_map() that is called on the mmap() as well as mprotect() paths. This permission verification remains. One feature of SGX2 is to support the modification of enclave page permissions after enclave creation. Enclave pages may thus already be mapped at the time their enclave permissions are changed resulting in the VMA's permissions potentially exceeding the enclave page permissions. Enable permissions of existing VMAs to exceed enclave page permissions in preparation for dynamic enclave page permission changes. New VMAs that attempt to exceed enclave page permissions continue to be unsupported. Reasons why permissions of existing VMAs are allowed to exceed enclave page permissions instead of dynamically changing VMA permissions when enclave page permissions change are: 1) Changing VMA permissions involve splitting VMAs which is an operation that can fail. Additionally the actual changing of page permissions of a range of pages could also fail on any of the pages involved. Handling these error cases causes problems. For example, if an enclave page permission change fails and the VMA has already been split then it is not possible to undo the VMA split nor possible to undo the enclave page permission changes that did succeed before the failure. 2) The OS has little insight into the user space where EPCM permissions are controlled from. For example, a RW page may be made RO just before it is made RX and splitting the VMAs while the VMAs may change soon is unnecessary. Remove the extra permission check called on a page fault (vm_operations_struct->fault) or during debugging (vm_operations_struct->access) when loading the enclave page from swap that ensures that the VMA permissions do not exceed the enclave permissions. Since a VMA could only exist if it passed the original permission checks during mmap() and a VMA may indeed exceed the page permissions this extra permission check is no longer appropriate. With the permission check removed, ensure that page table entries do not blindly inherit the VMA permissions but instead the permissions that the VMA and enclave agree on. PTEs for writable pages (from VMA and enclave perspective) are installed with the writable bit set, reducing the need for this additional flow to the permission mismatch cases handled next. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> --- arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 18 deletions(-)