Message ID | 20220905020411.17290-2-jarkko@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2,1/5] selftests/sgx: Retry the ioctl()'s returned with EAGAIN | expand |
Hi Jarkko and Haitao, On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: > From: Haitao Huang <haitao.huang@linux.intel.com> > > For EMODT and EREMOVE ioctl()'s with a large range, kernel > may not finish in one shot and return EAGAIN error code > and count of bytes of EPC pages on that operations are > finished successfully. > > Change the unclobbered_vdso_oversubscribed_remove test > to rerun the ioctl()'s in a loop, updating offset and length > using the byte count returned in each iteration. > > Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") Should this patch be moved to the "critical fixes for v6.0" series? > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > Tested-by: Jarkko Sakkinen <jarkko@kernel.org> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> > --- > v3: > * Added a fixes tag. The bug is in v6.0 patches. > * Added my tested-by (the bug reproduced in my NUC often). > v2: > * Changed branching in EAGAIN condition so that else branch > is not required. > * Addressed Reinette's feedback: > --- > tools/testing/selftests/sgx/main.c | 42 ++++++++++++++++++++++++------ > 1 file changed, 34 insertions(+), 8 deletions(-) > > diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c > index 9820b3809c69..59cca806eda1 100644 > --- a/tools/testing/selftests/sgx/main.c > +++ b/tools/testing/selftests/sgx/main.c > @@ -390,6 +390,7 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > struct encl_segment *heap; > unsigned long total_mem; > int ret, errno_save; > + unsigned long count; > unsigned long addr; > unsigned long i; > > @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > modt_ioc.offset = heap->offset; > modt_ioc.length = heap->size; > modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; > - > + count = 0; > TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", > heap->size); > - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > - errno_save = ret == -1 ? errno : 0; > + do { > + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > + > + errno_save = ret == -1 ? errno : 0; > + if (errno_save != EAGAIN) > + break; > + > + EXPECT_EQ(modt_ioc.result, 0); If this check triggers then there is something seriously wrong and in that case it may also be that this loop may be unable to terminate or the error condition would keep appearing until the loop terminates (which may be many iterations). Considering the severity and risk I do think that ASSERT_EQ() would be more appropriate, similar to how ASSERT_EQ() is used in patch 5/5. Apart from that I think that this looks good. Thank you very much for adding this. Reinette
On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote: > Hi Jarkko and Haitao, > > On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: > > From: Haitao Huang <haitao.huang@linux.intel.com> > > > > For EMODT and EREMOVE ioctl()'s with a large range, kernel > > may not finish in one shot and return EAGAIN error code > > and count of bytes of EPC pages on that operations are > > finished successfully. > > > > Change the unclobbered_vdso_oversubscribed_remove test > > to rerun the ioctl()'s in a loop, updating offset and length > > using the byte count returned in each iteration. > > > > Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") > > Should this patch be moved to the "critical fixes for v6.0" series? I think not because it does not risk stability of the kernel itself. It's "nice to have" but not mandatory. > > > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > > Tested-by: Jarkko Sakkinen <jarkko@kernel.org> > > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> > > --- > > v3: > > * Added a fixes tag. The bug is in v6.0 patches. > > * Added my tested-by (the bug reproduced in my NUC often). > > v2: > > * Changed branching in EAGAIN condition so that else branch > > is not required. > > * Addressed Reinette's feedback: > > --- > > tools/testing/selftests/sgx/main.c | 42 ++++++++++++++++++++++++------ > > 1 file changed, 34 insertions(+), 8 deletions(-) > > > > diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c > > index 9820b3809c69..59cca806eda1 100644 > > --- a/tools/testing/selftests/sgx/main.c > > +++ b/tools/testing/selftests/sgx/main.c > > @@ -390,6 +390,7 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > > struct encl_segment *heap; > > unsigned long total_mem; > > int ret, errno_save; > > + unsigned long count; > > unsigned long addr; > > unsigned long i; > > > > @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > > modt_ioc.offset = heap->offset; > > modt_ioc.length = heap->size; > > modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; > > - > > + count = 0; > > TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", > > heap->size); > > - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > - errno_save = ret == -1 ? errno : 0; > > + do { > > + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > + > > + errno_save = ret == -1 ? errno : 0; > > + if (errno_save != EAGAIN) > > + break; > > + > > + EXPECT_EQ(modt_ioc.result, 0); > > If this check triggers then there is something seriously wrong and in that case > it may also be that this loop may be unable to terminate or the error condition would > keep appearing until the loop terminates (which may be many iterations). Considering > the severity and risk I do think that ASSERT_EQ() would be more appropriate, > similar to how ASSERT_EQ() is used in patch 5/5. > > Apart from that I think that this looks good. > > Thank you very much for adding this. > > Reinette Hmm... I could along the lines: /* * Get time since Epoch is milliseconds. */ unsigned long get_time(void) { struct timeval start; gettimeofday(&start, NULL); return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L; } and #define IOCTL_RETRY_TIMEOUT 100 In the test function: unsigned long start_time; /* ... */ start_time = get_time(); do { EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); /* ... */ } /* ... */ What do you think? BR, Jarkko
On Fri, Sep 09, 2022 at 02:19:40AM +0300, Jarkko Sakkinen wrote:
> EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT);
~~
typo
BR, Jarkko
On Fri, Sep 09, 2022 at 02:20:22AM +0300, Jarkko Sakkinen wrote: > On Fri, Sep 09, 2022 at 02:19:40AM +0300, Jarkko Sakkinen wrote: > > EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); > ~~ > typo > > BR, Jarkko Also, the timeout can probably be like one second. BR, Jarkko
Hi Jarkko, On 9/8/2022 4:19 PM, Jarkko Sakkinen wrote: > On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote: >> Hi Jarkko and Haitao, >> >> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: >>> From: Haitao Huang <haitao.huang@linux.intel.com> >>> >>> For EMODT and EREMOVE ioctl()'s with a large range, kernel >>> may not finish in one shot and return EAGAIN error code >>> and count of bytes of EPC pages on that operations are >>> finished successfully. >>> >>> Change the unclobbered_vdso_oversubscribed_remove test >>> to rerun the ioctl()'s in a loop, updating offset and length >>> using the byte count returned in each iteration. >>> >>> Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") >> >> Should this patch be moved to the "critical fixes for v6.0" series? > > I think not because it does not risk stability of the > kernel itself. It's "nice to have" but not mandatory. ok, thank you for considering it. ... >>> @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) >>> modt_ioc.offset = heap->offset; >>> modt_ioc.length = heap->size; >>> modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; >>> - >>> + count = 0; >>> TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", >>> heap->size); >>> - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); >>> - errno_save = ret == -1 ? errno : 0; >>> + do { >>> + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); >>> + >>> + errno_save = ret == -1 ? errno : 0; >>> + if (errno_save != EAGAIN) >>> + break; >>> + >>> + EXPECT_EQ(modt_ioc.result, 0); >> >> If this check triggers then there is something seriously wrong and in that case >> it may also be that this loop may be unable to terminate or the error condition would >> keep appearing until the loop terminates (which may be many iterations). Considering >> the severity and risk I do think that ASSERT_EQ() would be more appropriate, >> similar to how ASSERT_EQ() is used in patch 5/5. >> >> Apart from that I think that this looks good. >> >> Thank you very much for adding this. >> >> Reinette > > Hmm... I could along the lines: > > /* > * Get time since Epoch is milliseconds. > */ > unsigned long get_time(void) > { > struct timeval start; > > gettimeofday(&start, NULL); > > return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L; > } > > and > > #define IOCTL_RETRY_TIMEOUT 100 > > In the test function: > > unsigned long start_time; > > /* ... */ > > start_time = get_time(); > do { > EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); > > /* ... */ > } > > /* ... */ > > What do you think? I do think that your proposal can be considered for an additional check in this test but the way I understand it it does not address my feedback. In this patch the flow is: do { ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); errno_save = ret == -1 ? errno : 0; if (errno_save != EAGAIN) break; EXPECT_EQ(modt_ioc.result, 0); ... } while ... If this EXPECT_EQ() check fails then it means that errno_save is EAGAIN and modt_ioc.result != 0. This should never happen because in the kernel (sgx_enclave_modify_types()) the only time modt_ioc.result can be set is when the ioctl() returns EFAULT. In my opinion this check should be changed to: ASSERT_EQ(modt_ioc.result, 0); This is my opinion because this check indicates a kernel bug and I do not see value in continuing the test after a kernel bug is encountered. My expectation is that this test is of value to folks modifying the kernel code. As for the new check you are proposing - it seems to me that kselftest with TIMEOUT_DEFAULT already covers this. Reinette
On Thu, Sep 08, 2022 at 05:06:58PM -0700, Reinette Chatre wrote: > Hi Jarkko, > > On 9/8/2022 4:19 PM, Jarkko Sakkinen wrote: > > On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote: > >> Hi Jarkko and Haitao, > >> > >> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: > >>> From: Haitao Huang <haitao.huang@linux.intel.com> > >>> > >>> For EMODT and EREMOVE ioctl()'s with a large range, kernel > >>> may not finish in one shot and return EAGAIN error code > >>> and count of bytes of EPC pages on that operations are > >>> finished successfully. > >>> > >>> Change the unclobbered_vdso_oversubscribed_remove test > >>> to rerun the ioctl()'s in a loop, updating offset and length > >>> using the byte count returned in each iteration. > >>> > >>> Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") > >> > >> Should this patch be moved to the "critical fixes for v6.0" series? > > > > I think not because it does not risk stability of the > > kernel itself. It's "nice to have" but not mandatory. > > ok, thank you for considering it. > > ... > > >>> @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > >>> modt_ioc.offset = heap->offset; > >>> modt_ioc.length = heap->size; > >>> modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; > >>> - > >>> + count = 0; > >>> TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", > >>> heap->size); > >>> - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > >>> - errno_save = ret == -1 ? errno : 0; > >>> + do { > >>> + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > >>> + > >>> + errno_save = ret == -1 ? errno : 0; > >>> + if (errno_save != EAGAIN) > >>> + break; > >>> + > >>> + EXPECT_EQ(modt_ioc.result, 0); > >> > >> If this check triggers then there is something seriously wrong and in that case > >> it may also be that this loop may be unable to terminate or the error condition would > >> keep appearing until the loop terminates (which may be many iterations). Considering > >> the severity and risk I do think that ASSERT_EQ() would be more appropriate, > >> similar to how ASSERT_EQ() is used in patch 5/5. > >> > >> Apart from that I think that this looks good. > >> > >> Thank you very much for adding this. > >> > >> Reinette > > > > Hmm... I could along the lines: > > > > /* > > * Get time since Epoch is milliseconds. > > */ > > unsigned long get_time(void) > > { > > struct timeval start; > > > > gettimeofday(&start, NULL); > > > > return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L; > > } > > > > and > > > > #define IOCTL_RETRY_TIMEOUT 100 > > > > In the test function: > > > > unsigned long start_time; > > > > /* ... */ > > > > start_time = get_time(); > > do { > > EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); > > > > /* ... */ > > } > > > > /* ... */ > > > > What do you think? > > I do think that your proposal can be considered for an additional check in this > test but the way I understand it it does not address my feedback. > > In this patch the flow is: > > do { > ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > errno_save = ret == -1 ? errno : 0; > if (errno_save != EAGAIN) > break; > > EXPECT_EQ(modt_ioc.result, 0); > ... > } while ... > > > If this EXPECT_EQ() check fails then it means that errno_save is EAGAIN > and modt_ioc.result != 0. This should never happen because in the kernel > (sgx_enclave_modify_types()) the only time modt_ioc.result can be set is > when the ioctl() returns EFAULT. > > In my opinion this check should be changed to: > ASSERT_EQ(modt_ioc.result, 0); Right, I missed this. It should be definitely ASSERT_EQ((). BR, Jarkko
On Fri, Sep 09, 2022 at 07:01:36AM +0300, Jarkko Sakkinen wrote: > On Thu, Sep 08, 2022 at 05:06:58PM -0700, Reinette Chatre wrote: > > Hi Jarkko, > > > > On 9/8/2022 4:19 PM, Jarkko Sakkinen wrote: > > > On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote: > > >> Hi Jarkko and Haitao, > > >> > > >> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote: > > >>> From: Haitao Huang <haitao.huang@linux.intel.com> > > >>> > > >>> For EMODT and EREMOVE ioctl()'s with a large range, kernel > > >>> may not finish in one shot and return EAGAIN error code > > >>> and count of bytes of EPC pages on that operations are > > >>> finished successfully. > > >>> > > >>> Change the unclobbered_vdso_oversubscribed_remove test > > >>> to rerun the ioctl()'s in a loop, updating offset and length > > >>> using the byte count returned in each iteration. > > >>> > > >>> Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test") > > >> > > >> Should this patch be moved to the "critical fixes for v6.0" series? > > > > > > I think not because it does not risk stability of the > > > kernel itself. It's "nice to have" but not mandatory. > > > > ok, thank you for considering it. > > > > ... > > > > >>> @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) > > >>> modt_ioc.offset = heap->offset; > > >>> modt_ioc.length = heap->size; > > >>> modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; > > >>> - > > >>> + count = 0; > > >>> TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", > > >>> heap->size); > > >>> - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > >>> - errno_save = ret == -1 ? errno : 0; > > >>> + do { > > >>> + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > >>> + > > >>> + errno_save = ret == -1 ? errno : 0; > > >>> + if (errno_save != EAGAIN) > > >>> + break; > > >>> + > > >>> + EXPECT_EQ(modt_ioc.result, 0); > > >> > > >> If this check triggers then there is something seriously wrong and in that case > > >> it may also be that this loop may be unable to terminate or the error condition would > > >> keep appearing until the loop terminates (which may be many iterations). Considering > > >> the severity and risk I do think that ASSERT_EQ() would be more appropriate, > > >> similar to how ASSERT_EQ() is used in patch 5/5. > > >> > > >> Apart from that I think that this looks good. > > >> > > >> Thank you very much for adding this. > > >> > > >> Reinette > > > > > > Hmm... I could along the lines: > > > > > > /* > > > * Get time since Epoch is milliseconds. > > > */ > > > unsigned long get_time(void) > > > { > > > struct timeval start; > > > > > > gettimeofday(&start, NULL); > > > > > > return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L; > > > } > > > > > > and > > > > > > #define IOCTL_RETRY_TIMEOUT 100 > > > > > > In the test function: > > > > > > unsigned long start_time; > > > > > > /* ... */ > > > > > > start_time = get_time(); > > > do { > > > EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT); > > > > > > /* ... */ > > > } > > > > > > /* ... */ > > > > > > What do you think? > > > > I do think that your proposal can be considered for an additional check in this > > test but the way I understand it it does not address my feedback. > > > > In this patch the flow is: > > > > do { > > ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); > > > > errno_save = ret == -1 ? errno : 0; > > if (errno_save != EAGAIN) > > break; > > > > EXPECT_EQ(modt_ioc.result, 0); > > ... > > } while ... > > > > > > If this EXPECT_EQ() check fails then it means that errno_save is EAGAIN > > and modt_ioc.result != 0. This should never happen because in the kernel > > (sgx_enclave_modify_types()) the only time modt_ioc.result can be set is > > when the ioctl() returns EFAULT. > > > > In my opinion this check should be changed to: > > ASSERT_EQ(modt_ioc.result, 0); > > Right, I missed this. It should be definitely ASSERT_EQ((). I was thinking to add patch, which adds helper to calculate static content length from the last item of the segment table (offset + size) and replace total_length calculations in various tests. I won't send a new version this week because I'm at Open Source Summit EU and Linux Security Summit EU. BR, Jarkko
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 9820b3809c69..59cca806eda1 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -390,6 +390,7 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) struct encl_segment *heap; unsigned long total_mem; int ret, errno_save; + unsigned long count; unsigned long addr; unsigned long i; @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) modt_ioc.offset = heap->offset; modt_ioc.length = heap->size; modt_ioc.page_type = SGX_PAGE_TYPE_TRIM; - + count = 0; TH_LOG("Changing type of %zd bytes to trimmed may take a while ...", heap->size); - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); - errno_save = ret == -1 ? errno : 0; + do { + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc); + + errno_save = ret == -1 ? errno : 0; + if (errno_save != EAGAIN) + break; + + EXPECT_EQ(modt_ioc.result, 0); + + count += modt_ioc.count; + modt_ioc.offset += modt_ioc.count; + modt_ioc.length -= modt_ioc.count; + modt_ioc.result = 0; + modt_ioc.count = 0; + } while (modt_ioc.length != 0); EXPECT_EQ(ret, 0); EXPECT_EQ(errno_save, 0); EXPECT_EQ(modt_ioc.result, 0); - EXPECT_EQ(modt_ioc.count, heap->size); + count += modt_ioc.count; + EXPECT_EQ(count, heap->size); /* EACCEPT all removed pages. */ addr = self->encl.encl_base + heap->offset; @@ -490,15 +505,26 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900) remove_ioc.offset = heap->offset; remove_ioc.length = heap->size; - + count = 0; TH_LOG("Removing %zd bytes from enclave may take a while ...", heap->size); - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc); - errno_save = ret == -1 ? errno : 0; + do { + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc); + + errno_save = ret == -1 ? errno : 0; + if (errno_save != EAGAIN) + break; + + count += remove_ioc.count; + remove_ioc.offset += remove_ioc.count; + remove_ioc.length -= remove_ioc.count; + remove_ioc.count = 0; + } while (remove_ioc.length != 0); EXPECT_EQ(ret, 0); EXPECT_EQ(errno_save, 0); - EXPECT_EQ(remove_ioc.count, heap->size); + count += remove_ioc.count; + EXPECT_EQ(count, heap->size); } TEST_F(enclave, clobbered_vdso)