Message ID | 20210920125401.2389105-2-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86: sgx_vepc: implement ioctl to EREMOVE all pages |
On Mon, 2021-09-20 at 08:54 -0400, Paolo Bonzini wrote:
> For bare-metal SGX on real hardware, the hardware provides guarantees
> SGX state at reboot. For instance, all pages start out uninitialized.
> The vepc driver provides a similar guarantee today for freshly-opened
> vepc instances, but guests such as Windows expect all pages to be in
> uninitialized state on startup, including after every guest reboot.

I would consider replacing

"For bare-metal SGX on real hardware, the hardware provides guarantees
SGX state at reboot. For instance, all pages start out uninitialized."

something like

"On bare-metal SGX, start of a power cycle zeros all of its reserved
memory. This happens after every reboot, but in addition to that
happens after waking up from any of the sleep states."

I can speculate and imagine where this might useful, but no matter
how trivial or complex it is, this patch needs to nail a concrete
usage example. I'd presume you know well the exact changes needed for
QEMU, so from that knowledge it should be easy to write the
motivational part.

For instance, point out where it is needed in QEMU and why. I.e. why
you end up in the first place having to re-use vepc buffers (or
whatever they should be called) in QEMU.

When that is taken care of, then there is a red line to eventually
ack these patches.

About the motivation.

In Linux we do have a mechanism to take care of this in a guest, for
which motivation was actually first and foremost kexec. It was not
done to let VMM to give a corrupted memory state to a guest.

Even to a Linux guest, since EPC should stil be represented in the
state that matches the hardware. It'd be essentially a corrupted
state, even if there was measures to resist this. Windows guests
failing is essentially a side-effect of an issue, not an issue in the
Windows guests.

Since QEMU needs to reinitialize VEPC buffers for guests, it should
be as efficient as we ever can make it. Just fill the gap of
understanding why QEMU needs to do this for guest. This is exactly
kind of stuff that you want have documented in the commit log for
future :-)

/Jarkko

On Tue, 2021-09-21 at 22:44 +0300, Jarkko Sakkinen wrote:
> Even to a Linux guest, since EPC should stil be represented in the state that
> matches the hardware. It'd be essentially a corrupted state, even if there was
> measures to resist this. Windows guests failing is essentially a side-effect
> of an issue, not an issue in the Windows guests.

Ugh, typos, sorry. Even to a Linux guest it would be illegit what I
was meaning to say...

/Jarkko

On 21/09/21 21:44, Jarkko Sakkinen wrote:
> "On bare-metal SGX, start of a power cycle zeros all of its reserved
> memory. This happens after every reboot, but in addition to that
> happens after waking up from any of the sleep states."
>
> I can speculate and imagine where this might useful, but no matter
> how trivial or complex it is, this patch needs to nail a concrete
> usage example. I'd presume you know well the exact changes needed for
> QEMU, so from that knowledge it should be easy to write the
> motivational part.

Assuming that it's obvious that QEMU knows how to reset a machine (which
includes writes to the ACPI reset register, or wakeup from sleep
states), the question of "why does userspace reuse vEPC" should be
answered by this paragraph:

"One way to do this is to simply close and reopen the /dev/sgx_vepc file
descriptor and re-mmap the virtual EPC. However, this is problematic
because it prevents sandboxing the userspace (for example forbidding
open() after the guest starts, or running in a mount namespace that
does not have access to /dev; both are doable with pre-opened file
descriptors and/or SCM_RIGHTS file descriptor passing)."

> Even to a Linux guest, since EPC should stil be represented in the
> state that matches the hardware. It'd be essentially a corrupted
> state, even if there was measures to resist this. Windows guests
> failing is essentially a side-effect of an issue, not an issue in the
> Windows guests.

Right, Linux is more liberal than it needs to be and ksgxd does the
EREMOVE itself at the beginning (__sgx_sanitize_pages). Windows has
stronger expectations of what can and cannot happen before it boots,
which are entirely justified.

Paolo
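
The sandboxing argument above relies on a privileged helper opening a device once and handing the open descriptor to a process that cannot call open() itself. A minimal sketch of SCM_RIGHTS passing over an AF_UNIX socket follows. The names `send_fd`, `recv_fd` and `fd_passing_selftest` are hypothetical, a pipe stands in for /dev/sgx_vepc, and nothing here is taken from QEMU; it only illustrates the mechanism the paragraph mentions.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send one open descriptor over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
	char dummy = 'F';	/* at least one byte of ordinary data must accompany it */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1 on failure. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Round trip inside one process; the pipe read end stands in for a device fd. */
static int fd_passing_selftest(void)
{
	int sv[2], p[2], got;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) || pipe(p))
		return -1;
	if (send_fd(sv[0], p[0]))
		return -1;
	got = recv_fd(sv[1]);
	if (got < 0)
		return -1;
	/* The receiver gets a new descriptor number for the same open file. */
	assert(got != p[0]);

	/* Data written to the pipe must be readable through the received fd. */
	if (write(p[1], "x", 1) != 1 || read(got, &c, 1) != 1 || c != 'x')
		return -1;

	close(got); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
	return 0;
}
```

Because the receiver ends up with a descriptor for the same open file, a process that is forbidden from calling open(), or that has no /dev in its mount namespace, can keep using the device across guest resets only if the existing descriptor can be reset in place, which is the gap the proposed ioctl is meant to fill.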

On Thu, 2021-09-23 at 14:08 +0200, Paolo Bonzini wrote:
> On 21/09/21 21:44, Jarkko Sakkinen wrote:
> > "On bare-metal SGX, start of a power cycle zeros all of its reserved
> > memory. This happens after every reboot, but in addition to that
> > happens after waking up from any of the sleep states."
> >
> > I can speculate and imagine where this might useful, but no matter
> > how trivial or complex it is, this patch needs to nail a concrete
> > usage example. I'd presume you know well the exact changes needed for
> > QEMU, so from that knowledge it should be easy to write the
> > motivational part.
>
> Assuming that it's obvious that QEMU knows how to reset a machine (which
> includes writes to the ACPI reset register, or wakeup from sleep
> states), the question of "why does userspace reuse vEPC" should be
> answered by this paragraph:
>
> "One way to do this is to simply close and reopen the /dev/sgx_vepc file
> descriptor and re-mmap the virtual EPC. However, this is problematic
> because it prevents sandboxing the userspace (for example forbidding
> open() after the guest starts, or running in a mount namespace that
> does not have access to /dev; both are doable with pre-opened file
> descriptors and/or SCM_RIGHTS file descriptor passing)."

Right, this makes sense.

>
> > Even to a Linux guest, since EPC should stil be represented in the
> > state that matches the hardware. It'd be essentially a corrupted
> > state, even if there was measures to resist this. Windows guests
> > failing is essentially a side-effect of an issue, not an issue in the
> > Windows guests.
>
> Right, Linux is more liberal than it needs to be and ksgxd does the
> EREMOVE itself at the beginning (__sgx_sanitize_pages). Windows has
> stronger expectations of what can and cannot happen before it boots,
> which are entirely justified.
>
> Paolo

Yep. We do it for kexec(). Alternative would be to zero at the time
of kexec() but this way things are just way more simpler, e.g. the
whole behaviour is local to the driver...

/Jarkko

diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 64511c4a5200..59b9c13121cd 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -111,7 +111,7 @@ static int sgx_vepc_mmap(struct file *file, struct vm_area_struct *vma)
 	return 0;
 }
 
-static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
+static int sgx_vepc_remove_page(struct sgx_epc_page *epc_page)
 {
 	int ret;
 
@@ -140,11 +140,17 @@ static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
 		 */
 		WARN_ONCE(ret != SGX_CHILD_PRESENT, EREMOVE_ERROR_MESSAGE,
 			  ret, ret);
-		return ret;
 	}
+	return ret;
+}
 
-	sgx_free_epc_page(epc_page);
+static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
+{
+	int ret = sgx_vepc_remove_page(epc_page);
+	if (ret)
+		return ret;
 
+	sgx_free_epc_page(epc_page);
 	return 0;
 }
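
The effect of the split can be modelled in userspace. The sketch below is not kernel code: `struct mock_epc_page`, `MOCK_CHILD_PRESENT`, `mock_remove_page`, `mock_free_page` and `mock_remove_all` are all hypothetical stand-ins, the error value is arbitrary, and the "remove all" loop is only a guess at how a later patch in the series might reuse the extracted helper. It shows the one property the diff establishes: removing a page resets its state without releasing it, while freeing is removal followed by release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary nonzero stand-in for an EREMOVE failure such as SGX_CHILD_PRESENT. */
#define MOCK_CHILD_PRESENT 1

/* Hypothetical model of one virtual EPC page. */
struct mock_epc_page {
	bool initialized;	/* page currently holds enclave state */
	bool has_children;	/* models an SECS page with child pages present */
	bool allocated;		/* page is still owned by the vEPC instance */
};

/* Models sgx_vepc_remove_page(): reset the page but keep it allocated. */
static int mock_remove_page(struct mock_epc_page *page)
{
	if (page->has_children)
		return MOCK_CHILD_PRESENT;
	page->initialized = false;
	return 0;
}

/* Models sgx_vepc_free_page(): remove first, release only on success. */
static int mock_free_page(struct mock_epc_page *page)
{
	int ret = mock_remove_page(page);

	if (ret)
		return ret;
	page->allocated = false;
	return 0;
}

/*
 * Assumed use of the extracted helper: reset every allocated page in
 * place and report how many could not be removed.
 */
static size_t mock_remove_all(struct mock_epc_page *pages, size_t n)
{
	size_t i, failed = 0;

	assert(pages != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (pages[i].allocated && mock_remove_page(&pages[i]))
			failed++;
	}
	return failed;
}
```

In this model a page whose removal fails stays both initialized and allocated, which mirrors the early return in the refactored sgx_vepc_free_page().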