Message ID | 3c1d84724ecc7c94131ba1d94dc4c5de5aafc58f.1643393473.git.reinette.chatre@intel.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2d03861e0d1d1ee81efc59338101cdd86a7474f6 |
Headers | show |
Series | selftests/sgx: Early enclave loading error path fixes | expand |
On 1/28/22 10:23, Reinette Chatre wrote: > A segfault is encountered if there happens to be an > early failure of any of the SGX tests. One way to > reproduce this is to remove the enclave binary > "test_encl.elf" that will trigger early enclave loading > failure followed by a segfault. > > The segfault occurs within encl_delete() that cleans up > after an enclave by umapping its mapped regions and closing > the file descriptor to the SGX driver. As integrated with > the kselftest harness encl_delete() is called upon exit > from every test, irrespective of test success. encl_delete() > is also called to clean up if an error is encountered during > enclave loading. > > encl_delete() is thus responsible for cleaning any amount of > enclave state - including state that has already been cleaned. > > encl_delete() starts by accessing encl->segment_tbl that may > not have been created yet due to a very early failure or may > already be cleaned up because of a failure encountered after > encl->segment_tbl was created. > > Ensure encl->segment_tbl is valid before attempting to access > memory offset from it. The offset with which it is accessed, > encl->nr_segments, is initialized after encl->segment_tbl and > thus considered valid to use after the encl->segment_tbl check > succeeds. I'm thinking we can be a bit more concise about the problem: == Background == The SGX selftests track parts of the enclave binaries in an array: encl->segment_tbl[]. That array is dynamically allocated early (but not first) in the test's lifetime. The array is referenced at the end of the test in encl_delete(). == Problem == encl->segment_tbl[] can be NULL if the test fails before its allocation. That leads to a NULL-pointer-dereference in encl_delete(). This is triggered during early failures of the selftest like if the enclave binary ("test_encl.elf") is deleted. -- I think it's also best to refer to this as a NULL-pointer problem rather than a segfault. The segfault is really just the fallout from the NULL pointer, *not* the primary problem. > Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave") > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> > --- > tools/testing/selftests/sgx/load.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c > index 9d4322c946e2..006b464c8fc9 100644 > --- a/tools/testing/selftests/sgx/load.c > +++ b/tools/testing/selftests/sgx/load.c > @@ -21,7 +21,7 @@ > > void encl_delete(struct encl *encl) > { > - struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; > + struct encl_segment *heap_seg; > > if (encl->encl_base) > munmap((void *)encl->encl_base, encl->encl_size); > @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl) > if (encl->fd) > close(encl->fd); > > - munmap(heap_seg->src, heap_seg->size); > - > - if (encl->segment_tbl) > + if (encl->segment_tbl) { > + heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; > + munmap(heap_seg->src, heap_seg->size); This probably deserves a comment linking heap_seg->src and encl->segment_tbl together. They _look_ independent here. > free(encl->segment_tbl); > + } > > memset(encl, 0, sizeof(*encl)); > }
Hi Dave, On 1/28/2022 10:43 AM, Dave Hansen wrote: > On 1/28/22 10:23, Reinette Chatre wrote: >> A segfault is encountered if there happens to be an >> early failure of any of the SGX tests. One way to >> reproduce this is to remove the enclave binary >> "test_encl.elf" that will trigger early enclave loading >> failure followed by a segfault. >> >> The segfault occurs within encl_delete() that cleans up >> after an enclave by umapping its mapped regions and closing >> the file descriptor to the SGX driver. As integrated with >> the kselftest harness encl_delete() is called upon exit >> from every test, irrespective of test success. encl_delete() >> is also called to clean up if an error is encountered during >> enclave loading. >> >> encl_delete() is thus responsible for cleaning any amount of >> enclave state - including state that has already been cleaned. >> >> encl_delete() starts by accessing encl->segment_tbl that may >> not have been created yet due to a very early failure or may >> already be cleaned up because of a failure encountered after >> encl->segment_tbl was created. >> >> Ensure encl->segment_tbl is valid before attempting to access >> memory offset from it. The offset with which it is accessed, >> encl->nr_segments, is initialized after encl->segment_tbl and >> thus considered valid to use after the encl->segment_tbl check >> succeeds. > > I'm thinking we can be a bit more concise about the problem: > > == Background == > > The SGX selftests track parts of the enclave binaries in an array: > encl->segment_tbl[]. That array is dynamically allocated early (but not > first) in the test's lifetime. The array is referenced at the end of > the test in encl_delete(). > > == Problem == > > encl->segment_tbl[] can be NULL if the test fails before its allocation. > That leads to a NULL-pointer-dereference in encl_delete(). This is > triggered during early failures of the selftest like if the enclave > binary ("test_encl.elf") is deleted. > > -- > > I think it's also best to refer to this as a NULL-pointer problem rather > than a segfault. The segfault is really just the fallout from the NULL > pointer, *not* the primary problem. Will do. I plan to resubmit with your changes and just also append the paragraph that documents the fix. > >> Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave") >> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> >> --- >> tools/testing/selftests/sgx/load.c | 9 +++++---- >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c >> index 9d4322c946e2..006b464c8fc9 100644 >> --- a/tools/testing/selftests/sgx/load.c >> +++ b/tools/testing/selftests/sgx/load.c >> @@ -21,7 +21,7 @@ >> >> void encl_delete(struct encl *encl) >> { >> - struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; >> + struct encl_segment *heap_seg; >> >> if (encl->encl_base) >> munmap((void *)encl->encl_base, encl->encl_size); >> @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl) >> if (encl->fd) >> close(encl->fd); >> >> - munmap(heap_seg->src, heap_seg->size); >> - >> - if (encl->segment_tbl) >> + if (encl->segment_tbl) { >> + heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; >> + munmap(heap_seg->src, heap_seg->size); > > This probably deserves a comment linking heap_seg->src and > encl->segment_tbl together. They _look_ independent here. How about: diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 006b464c8fc9..946eb7c2a253 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -33,6 +33,14 @@ void encl_delete(struct encl *encl) close(encl->fd); if (encl->segment_tbl) { + /* + * Most segments form part of the enclave binary + * and have their mappings deleted with earlier + * munmap() of encl->bin. + * As a mapping of anonymous memory the heap + * segment is separate from the enclave + * binary and needs its mapping deleted separately. + */ heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; munmap(heap_seg->src, heap_seg->size); free(encl->segment_tbl); Reinette
On 1/28/22 11:22, Reinette Chatre wrote: > if (encl->segment_tbl) { > + /* > + * Most segments form part of the enclave binary > + * and have their mappings deleted with earlier > + * munmap() of encl->bin. > + * As a mapping of anonymous memory the heap > + * segment is separate from the enclave > + * binary and needs its mapping deleted separately. > + */ > heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; > munmap(heap_seg->src, heap_seg->size); I was more wondering why the status of heap_seg->src is tied to encl->segment_tbl.
Hi Dave, On 1/28/2022 11:26 AM, Dave Hansen wrote: > On 1/28/22 11:22, Reinette Chatre wrote: >> if (encl->segment_tbl) { >> + /* >> + * Most segments form part of the enclave binary >> + * and have their mappings deleted with earlier >> + * munmap() of encl->bin. >> + * As a mapping of anonymous memory the heap >> + * segment is separate from the enclave >> + * binary and needs its mapping deleted separately. >> + */ >> heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; >> munmap(heap_seg->src, heap_seg->size); > > I was more wondering why the status of heap_seg->src is tied to > encl->segment_tbl. Apologies but it is not clear to me what the concern is. Please bear with me as I first try to create some context and then I hope to navigate to the issue. The test creates an SGX enclave from a binary, in the test it is named test_encl.elf. To create the enclave the test loads the data from the binary and populates the enclave with it. In order to create the enclave correctly the binary needs to be loaded via its distinct segments, initially this is: TCS, TEXT, DATA. This is done to ensure when the enclave is created it is done with the correct page types and permissions. For example, the pages from the TCS segment needs to be loaded in to enclave pages of type TCS with RW permission, the pages from the TEXT segment needs to be loaded into regular enclave pages with RX permission, etc. To ensure the enclave is created correctly the test thus initializes the data for each segment that will be loaded into the enclave into that enclave's "segment_tbl". struct encl { ... struct encl_segment *segment_tbl; ... }; /* * struct encl_segment - parameters that needed for * SGX_IOC_ENCLAVE_ADD_PAGES ioctl() */ struct encl_segment { void *src; off_t offset; size_t size; unsigned int prot; unsigned int flags; bool measure; }; Commit 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave") introduced a new segment, the heap, in support of more testing. While not loaded from the original binary it is considered a segment of the enclave and needs all fields in struct encl_segment in order to add its pages to the enclave with appropriate page permissions and flags. This is thus how heap_seg->src ended up connected to encl->segment_tbl. Apologies for being long winded here but I hope with this we could narrow down where your concerns are. Reinette
On Fri, Jan 28, 2022 at 10:23:56AM -0800, Reinette Chatre wrote: > A segfault is encountered if there happens to be an > early failure of any of the SGX tests. One way to > reproduce this is to remove the enclave binary > "test_encl.elf" that will trigger early enclave loading > failure followed by a segfault. > > The segfault occurs within encl_delete() that cleans up > after an enclave by umapping its mapped regions and closing > the file descriptor to the SGX driver. As integrated with > the kselftest harness encl_delete() is called upon exit > from every test, irrespective of test success. encl_delete() > is also called to clean up if an error is encountered during > enclave loading. > > encl_delete() is thus responsible for cleaning any amount of > enclave state - including state that has already been cleaned. > > encl_delete() starts by accessing encl->segment_tbl that may > not have been created yet due to a very early failure or may > already be cleaned up because of a failure encountered after > encl->segment_tbl was created. > > Ensure encl->segment_tbl is valid before attempting to access > memory offset from it. The offset with which it is accessed, > encl->nr_segments, is initialized after encl->segment_tbl and > thus considered valid to use after the encl->segment_tbl check > succeeds. Nit: textwidth=75 Not something I would NAK but just saying. > > Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave") > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> > --- > tools/testing/selftests/sgx/load.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c > index 9d4322c946e2..006b464c8fc9 100644 > --- a/tools/testing/selftests/sgx/load.c > +++ b/tools/testing/selftests/sgx/load.c > @@ -21,7 +21,7 @@ > > void encl_delete(struct encl *encl) > { > - struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; > + struct encl_segment *heap_seg; > > if (encl->encl_base) > munmap((void *)encl->encl_base, encl->encl_size); > @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl) > if (encl->fd) > close(encl->fd); > > - munmap(heap_seg->src, heap_seg->size); > - > - if (encl->segment_tbl) > + if (encl->segment_tbl) { > + heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; > + munmap(heap_seg->src, heap_seg->size); > free(encl->segment_tbl); > + } > > memset(encl, 0, sizeof(*encl)); > } > -- > 2.25.1 > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> I tested this by a minor code mod to trigger to failure case. /Jarkko
diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 9d4322c946e2..006b464c8fc9 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -21,7 +21,7 @@ void encl_delete(struct encl *encl) { - struct encl_segment *heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; + struct encl_segment *heap_seg; if (encl->encl_base) munmap((void *)encl->encl_base, encl->encl_size); @@ -32,10 +32,11 @@ void encl_delete(struct encl *encl) if (encl->fd) close(encl->fd); - munmap(heap_seg->src, heap_seg->size); - - if (encl->segment_tbl) + if (encl->segment_tbl) { + heap_seg = &encl->segment_tbl[encl->nr_segments - 1]; + munmap(heap_seg->src, heap_seg->size); free(encl->segment_tbl); + } memset(encl, 0, sizeof(*encl)); }
A segfault is encountered if there happens to be an early failure of any of the SGX tests. One way to reproduce this is to remove the enclave binary "test_encl.elf" that will trigger early enclave loading failure followed by a segfault. The segfault occurs within encl_delete() that cleans up after an enclave by umapping its mapped regions and closing the file descriptor to the SGX driver. As integrated with the kselftest harness encl_delete() is called upon exit from every test, irrespective of test success. encl_delete() is also called to clean up if an error is encountered during enclave loading. encl_delete() is thus responsible for cleaning any amount of enclave state - including state that has already been cleaned. encl_delete() starts by accessing encl->segment_tbl that may not have been created yet due to a very early failure or may already be cleaned up because of a failure encountered after encl->segment_tbl was created. Ensure encl->segment_tbl is valid before attempting to access memory offset from it. The offset with which it is accessed, encl->nr_segments, is initialized after encl->segment_tbl and thus considered valid to use after the encl->segment_tbl check succeeds. Fixes: 3200505d4de6 ("selftests/sgx: Create a heap for the test enclave") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> --- tools/testing/selftests/sgx/load.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)