Message ID | 20220510031646.3181306-1-zhiquan1.li@intel.com
---|---
Series | x86/sgx: fine grained SGX MCA behavior
On Tue, May 10, 2022 at 11:16:46AM +0800, Zhiquan Li wrote:
> Hi everyone,
>
> This series contains a few patches to fine-grain the SGX MCA behavior.
>
> When a VM guest accesses an SGX EPC page that has a memory failure, the
> current behavior kills the whole guest, while the expectation is to kill
> only the SGX application inside it.
>
> To fix this we send SIGBUS with code BUS_MCEERR_AR and some extra
> information so that the hypervisor can inject the #MC information into
> the guest, which is helpful in the SGX virtualization case.
>
> However, the current SGX data structures are insufficient to track the
> EPC pages used by vepc, so we introduce a new struct sgx_vepc_page which
> can be the owner of EPC pages for vepc and saves their useful
> information, in the same way struct sgx_encl_page does for host enclaves.
>
> Moreover, canonical memory failure handling collects victim tasks by
> iterating over all tasks one by one and uses reverse mapping to get each
> victim task's virtual address. This is not necessary for SGX, as one EPC
> page can be mapped to ONE enclave only. This 1:1 mapping enforcement
> allows us to find the task's virtual address from the physical address
> directly.

Hmm... An enclave can be shared by multiple processes. The virtual
address is the same but there can be a variable number of processes
having it mapped.

> Then we extend the solution to the normal SGX case, so that the task has
> the opportunity to make a further decision when an EPC page has a memory
> failure.
>
> Tests:
> 1. MCE injection test for SGX in a VM.
>    As expected, the application was killed and the VM stayed alive.
> 2. MCE injection test for SGX on the host.
>    As expected, the application received SIGBUS with extra info.
> 3. Kernel selftest/sgx: PASS
> 4. Internal SGX stress test: PASS
> 5. kmemleak test: no memory leakage detected.
>
> Zhiquan Li (4):
>   x86/sgx: Move struct sgx_vepc definition to sgx.h
>   x86/sgx: add struct sgx_vepc_page to manage EPC pages for vepc
>   x86/sgx: Fine grained SGX MCA behavior for virtualization
>   x86/sgx: Fine grained SGX MCA behavior for normal case
>
>  arch/x86/kernel/cpu/sgx/main.c | 24 ++++++++++++++++++++++--
>  arch/x86/kernel/cpu/sgx/sgx.h  | 12 ++++++++++++
>  arch/x86/kernel/cpu/sgx/virt.c | 29 +++++++++++++++++++----------
>  3 files changed, 53 insertions(+), 12 deletions(-)
>
> --
> 2.25.1

BR, Jarkko
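To make the idea above concrete, an owner record per vepc EPC page plus action-required SIGBUS delivery could be sketched roughly as below. The struct layout and the helper name are illustrative assumptions, not the actual patches; only force_sig_mceerr() and BUS_MCEERR_AR are existing kernel interfaces (the same ones mm/memory-failure.c uses).

```c
/*
 * Illustrative sketch only -- not the actual patches.  The idea: give each
 * EPC page handed to a virtual EPC instance an owner record (analogous to
 * struct sgx_encl_page for host enclaves) that remembers the host virtual
 * address, so the memory-failure path can go from the poisoned physical
 * page straight to a SIGBUS for the right task.
 */
#include <linux/mm.h>			/* PAGE_SHIFT */
#include <linux/sched/signal.h>		/* force_sig_mceerr(), BUS_MCEERR_AR */
#include "sgx.h"			/* struct sgx_vepc (per patch 01) */

struct sgx_vepc_page {				/* hypothetical layout */
	unsigned long vaddr;			/* host VA of the mapping */
	struct sgx_vepc *vepc;			/* owning vepc instance */
};

/* Called from the EPC memory-failure path for a page owned by a vepc. */
static void sgx_vepc_report_poison(struct sgx_vepc_page *owner)
{
	/*
	 * BUS_MCEERR_AR tells userspace (here: the hypervisor) that this is
	 * an action-required machine check; the hypervisor can then inject
	 * #MC into the guest instead of the whole guest being killed.
	 */
	force_sig_mceerr(BUS_MCEERR_AR, (void __user *)owner->vaddr,
			 PAGE_SHIFT);
}
```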
On 2022/5/11 18:29, Jarkko Sakkinen wrote:
> On Tue, May 10, 2022 at 11:16:46AM +0800, Zhiquan Li wrote:
>> [...]
>> Moreover, canonical memory failure handling collects victim tasks by
>> iterating over all tasks one by one and uses reverse mapping to get each
>> victim task's virtual address. This is not necessary for SGX, as one EPC
>> page can be mapped to ONE enclave only. This 1:1 mapping enforcement
>> allows us to find the task's virtual address from the physical address
>> directly.
>
> Hmm... An enclave can be shared by multiple processes. The virtual
> address is the same but there can be a variable number of processes
> having it mapped.

Thanks for your review, Jarkko.
You're right, an enclave can be shared.

Actually, we had discussed this issue internally. Assume the following
scenario: an enclave provides multiple ecalls and serves several tasks.
If one task invokes an ecall and hits an MCE, but the other tasks never
use that ecall, shall we kill all the sharing tasks immediately? That
looks a little abrupt. Maybe it's better to kill them only when they
actually touch the HW-poisoned page.

Furthermore, once an EPC page has been poisoned it will not be allocated
anymore, so the error will not propagate.

Therefore, we minimized the changes: we only fine-grained the SIGBUS
behavior and kept the other behavior as before.

Do you think the processes sharing the same enclave need to be killed,
even if they have not touched the EPC page with the hardware error?
Any ideas are welcome.

Best Regards,
Zhiquan
On Thu, May 12, 2022 at 08:03:30PM +0800, Zhiquan Li wrote:
> [...]
> Do you think the processes sharing the same enclave need to be killed,
> even if they have not touched the EPC page with the hardware error?
> Any ideas are welcome.

I do not think the patch set is going in the wrong direction. This
discussion was just missing from the cover letter.

BR, Jarkko
>> Do you think the processes sharing the same enclave need to be killed,
>> even if they have not touched the EPC page with the hardware error?
>> Any ideas are welcome.
>
> I do not think the patch set is going in the wrong direction. This
> discussion was just missing from the cover letter.

I was under the impression that when an enclave page triggers a machine
check the whole enclave is (somehow) marked bad, so that it couldn't be
entered again.

Killing other processes with the same enclave mapped would perhaps be
overkill, but they are going to find that the enclave is "dead" next time
they try to use it.

-Tony
On 2022/5/14 00:35, Luck, Tony wrote:
>>> Do you think the processes sharing the same enclave need to be killed,
>>> even if they have not touched the EPC page with the hardware error?
>>> Any ideas are welcome.
>>
>> I do not think the patch set is going in the wrong direction. This
>> discussion was just missing from the cover letter.

OK, I will add this point to v2 of the cover letter and patch 03.

> I was under the impression that when an enclave page triggers a machine
> check the whole enclave is (somehow) marked bad, so that it couldn't be
> entered again.
>
> Killing other processes with the same enclave mapped would perhaps be
> overkill, but they are going to find that the enclave is "dead" next
> time they try to use it.

Thanks for your clarification, Tony.

You reminded me to check Intel SDM vol. 3, section 38.15.1 "Interactions
with MCA Events":

  "All architecturally visible machine check events (#MC and CMCI) that
  are detected while inside an enclave cause an asynchronous enclave
  exit. Any machine check exception (#MC) that occurs after Intel SGX is
  first enabled causes Intel SGX to be disabled
  (CPUID.SGX_Leaf.0:EAX[SGX1] == 0). It cannot be enabled until after
  the next reset."

So I suppose the current behavior is gentle enough; other processes with
the same enclave mapped should get rid of it if they really need to use
the enclave again. If we expect those processes to be killed early, it is
worth another patch set to achieve that.

Best Regards,
Zhiquan
> "Any machine check exception (#MC) that occurs after Intel SGX is first
> enabled causes Intel SGX to be disabled (CPUID.SGX_Leaf.0:EAX[SGX1] == 0).
> It cannot be enabled until after the next reset."

That part is out of date. A machine check used to disable SGX
system-wide. It now just disables the enclave that triggered the machine
check.

Is that text still in the latest SDM (version 077, April 2022)?

-Tony
On 2022/5/15 11:35, Luck, Tony wrote:
>> "Any machine check exception (#MC) that occurs after Intel SGX is first
>> enabled causes Intel SGX to be disabled (CPUID.SGX_Leaf.0:EAX[SGX1] == 0).
>> It cannot be enabled until after the next reset."
>
> That part is out of date. A machine check used to disable SGX
> system-wide. It now just disables the enclave that triggered the machine
> check.
>
> Is that text still in the latest SDM (version 077, April 2022)?

Yes, I've double-checked: it is still there in version 077, April 2022.
Looks like the SDM needs to follow up :-)

Best Regards,
Zhiquan
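As a side note, whether SGX1 is still enumerated on a given machine can be checked from user space with a small sketch like the one below (not part of the patch set). CPUID leaf 0x12, sub-leaf 0, EAX bit 0 is the SGX1 bit that the quoted SDM text refers to.

```c
/*
 * User-space sketch (not part of the patch set): report whether SGX1 is
 * enumerated, i.e. whether CPUID.(EAX=12H, ECX=0):EAX[bit 0] reads as 1.
 */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() fails if the max supported leaf is below 0x12. */
	if (!__get_cpuid_count(0x12, 0, &eax, &ebx, &ecx, &edx)) {
		puts("CPUID leaf 0x12 not available");
		return 1;
	}

	printf("SGX1 %s\n", (eax & 1) ? "enumerated" : "not enumerated");
	return 0;
}
```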
On Sat, 2022-05-14 at 13:39 +0800, Zhiquan Li wrote:
> [...]
> OK, I will add this point to v2 of the cover letter and patch 03.

I don't think you should add it to patch 03. "The same enclave can be
shared by multiple processes" is only true for host enclaves, not for a
virtual EPC instance. A virtual EPC instance is just a raw EPC resource
which is accessible to the guest, but how enclaves are created on this
EPC resource is completely up to the guest. Therefore, one virtual EPC
instance cannot be shared by two guests.
On 2022/5/16 10:29, Kai Huang wrote:
> I don't think you should add it to patch 03. "The same enclave can be
> shared by multiple processes" is only true for host enclaves, not for a
> virtual EPC instance. A virtual EPC instance is just a raw EPC resource
> which is accessible to the guest, but how enclaves are created on this
> EPC resource is completely up to the guest. Therefore, one virtual EPC
> instance cannot be shared by two guests.

Thanks for your clarification, Kai.
Patch 04 is for the host case; we can put it there.

May I add your comments to patch 03 and add you as "Acked-by" in v2?
I suppose it is a good answer if someone has the same question.

Best Regards,
Zhiquan
On Mon, 2022-05-16 at 16:40 +0800, Zhiquan Li wrote:
> [...]
> May I add your comments to patch 03 and add you as "Acked-by" in v2?
> I suppose it is a good answer if someone has the same question.

Yes, you can add the comments to the patch.

One problem is that currently we don't explicitly prevent a vepc from
being shared via fork(). For instance, an application can open
/dev/sgx_vepc, mmap(), and fork(). Then, if the child does mmap() against
the fd opened by the parent, the child will share the same vepc with the
parent.

We can either explicitly enforce that only one process can mmap() a given
vepc, in the vepc mmap() handler, or we can put in a comment saying
something like:

    Unlike host enclaves, a virtual EPC instance cannot be shared by
    multiple VMs, because how enclaves are created on it is entirely up
    to the guest. Sharing a virtual EPC instance would very likely break
    the enclaves in all of those VMs unexpectedly.

    The SGX virtual EPC driver doesn't explicitly prevent a virtual EPC
    instance from being shared by multiple VMs via fork(). However, KVM
    doesn't support running a VM across multiple mm structures, and the
    de facto userspace hypervisor (QEMU) doesn't use fork() to create a
    new VM, so in practice this should not happen.

I'd like to hear from others whether simply adding such a comment is
fine.
On 2022/5/17 08:43, Kai Huang wrote:
> [...]
> We can either explicitly enforce that only one process can mmap() a
> given vepc, in the vepc mmap() handler, or we can put in a comment
> saying something like:
> [...]

Excellent explanation! I will put it into patch 03.
Thanks a lot, Kai.

Best Regards,
Zhiquan

> I'd like to hear from others whether simply adding such a comment is
> fine.
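For illustration, the "only one process can mmap() a given vepc" enforcement that Kai mentions might look roughly like the sketch below. This is not the existing driver code; the mm field on struct sgx_vepc and the helper name are invented here purely for illustration.

```c
#include <linux/atomic.h>	/* cmpxchg() */
#include <linux/errno.h>
#include <linux/mm_types.h>	/* struct mm_struct */
#include <linux/sched.h>	/* current */
#include "sgx.h"		/* struct sgx_vepc; the 'mm' field used below is hypothetical */

/*
 * Hypothetical helper, called from the vepc mmap() handler: the first
 * process to mmap() the instance claims it; mmap() from any other mm
 * is rejected.
 */
static int sgx_vepc_claim_mm(struct sgx_vepc *vepc)
{
	struct mm_struct *old = cmpxchg(&vepc->mm, NULL, current->mm);

	if (old && old != current->mm)
		return -EBUSY;	/* already mapped by another process */

	return 0;
}
```

Note that, as Kai points out, such a check would only cover fresh mmap() calls; a child inheriting the parent's existing mapping across fork() is a separate case, which is why the comment-only approach is also on the table.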