Message ID | cover.1562542383.git.cedric.xing@intel.com (mailing list archive) |
---|---|
Headers | show |
Series | security/x86/sgx: SGX specific LSM hooks | expand |
On Sun, Jul 07, 2019 at 04:41:30PM -0700, Cedric Xing wrote: ... > different FSMs to govern page protection transitions. Implementation wise, his > model also imposes unwanted restrictions specifically to SGX2, such as: > · Complicated/Restricted UAPI – Enclave loaders are required to provide I don't think "complicated" is a fair assessment. For SGX1 enclaves it's literally a direct propagation of the SECINFO RWX flags. > “maximal protection” at page load time, but such information is NOT always > available. For example, Graphene containers may run different applications > comprised of different set of executables and/or shared objects. Some of > them may contain self-modifying code (or text relocation) while others > don’t. The generic enclave loader usually doesn’t have such information so > wouldn’t be able to provide it ahead of time. I'm unconvinced that it would be remotely difficult to teach an enclave loader that an enclave or hosted application employs SMC, relocation or any other behavior that would require declaring RWX on all pages. > · Inefficient Auditing – Audit logs are supposed to help system > administrators to determine the set of minimally needed permissions and to > detect abnormal behaviors. But consider the “maximal protection” model, if > “maximal protection” is set to be too permissive, then audit log wouldn’t > be able to detect anomalies; Huh? Declaring overly permissive protections is only problematic if an LSM denies the permission, in which case it will generate an accurate audit log. If the enclave/loader "requires" a permission it doesn't actually need, e.g. EXECDIRTY, then it's a software bug that should be fixed. I don't see how this scenario is any different than an application that uses assembly code without 'noexecstack' and inadvertantly "requires" EXECSTACK due to triggering "read implies exec". In both cases the denied permission is unnecessary due to a userspace application bug. > or if “maximal protection” is too restrictive, > then audit log cannot identify the file violating the policy. Maximal protections that are too restrictive are completely orthogonal to LSMs as the enclave would fail to run irrespective of LSMs. This is no different than specifying the wrong RWX flags in SECINFO, or opening a file as RO instead of RW. > In either case the audit log cannot fulfill its purposes. > · Inability to support #PF driven EPC allocation in SGX2 – For those > unfamiliar with SGX2 software flows, an SGX2 enclave requests a page by > issuing EACCEPT on the address that a new page is wanted, and the resulted > #PF is expected to be handled by the kernel by EAUG’ing an EPC page at the > fault address, and then the enclave would be resumed and the faulting > EACCEPT retried, and succeed. The key requirement is to allow mmap()’ing > non-existing enclave pages so that the SGX module/subsystem could respond > to #PFs by EAUG’ing new pages. Sean’s implementation doesn’t allow > mmap()’ing non-existing pages for variety of reasons and therefore blocks > this major SGX2 usage. This is simply wrong. The key requirement in the theoretical EAUG scheme is to mmap() pages that have not been added to the _hardware_ maintained enclave. The pages (or some optimized representation of a range of pages) would exist in the kernel's software mode of the enclave.
On 7/8/2019 8:55 AM, Sean Christopherson wrote: > On Sun, Jul 07, 2019 at 04:41:30PM -0700, Cedric Xing wrote: > ... > >> different FSMs to govern page protection transitions. Implementation wise, his >> model also imposes unwanted restrictions specifically to SGX2, such as: >> · Complicated/Restricted UAPI – Enclave loaders are required to provide > > I don't think "complicated" is a fair assessment. For SGX1 enclaves it's > literally a direct propagation of the SECINFO RWX flags. True only for SGX1. >> “maximal protection” at page load time, but such information is NOT always >> available. For example, Graphene containers may run different applications >> comprised of different set of executables and/or shared objects. Some of >> them may contain self-modifying code (or text relocation) while others >> don’t. The generic enclave loader usually doesn’t have such information so >> wouldn’t be able to provide it ahead of time. > > I'm unconvinced that it would be remotely difficult to teach an enclave > loader that an enclave or hosted application employs SMC, relocation or > any other behavior that would require declaring RWX on all pages. You've been talking as if "enclave loader" is tailored to the enclave it is loading. But in reality "enclave loader" is usually a library knowing usually nothing about the enclave. How could it know if an enclave contains self-modifying code? >> · Inefficient Auditing – Audit logs are supposed to help system >> administrators to determine the set of minimally needed permissions and to >> detect abnormal behaviors. But consider the “maximal protection” model, if >> “maximal protection” is set to be too permissive, then audit log wouldn’t >> be able to detect anomalies; > > Huh? Declaring overly permissive protections is only problematic if an > LSM denies the permission, in which case it will generate an accurate > audit log. > > If the enclave/loader "requires" a permission it doesn't actually need, > e.g. EXECDIRTY, then it's a software bug that should be fixed. I don't > see how this scenario is any different than an application that uses > assembly code without 'noexecstack' and inadvertantly "requires" > EXECSTACK due to triggering "read implies exec". In both cases the > denied permission is unnecessary due to a userspace application bug. You see, you've been assuming "enclave loader" knows everything and tailored to what it loads in a particular application. But the reality is the loader is generic and probably shared by multiple applications. It needs some generic way to figure out the "maximal protection". An implementation could use information embedded in the enclave file, or could just be "configurable". In the former case, you put extra burdens on the build tools, while in the latter case, your audit logs cannot help generating an appropriate configuration. >> or if “maximal protection” is too restrictive, >> then audit log cannot identify the file violating the policy. > > Maximal protections that are too restrictive are completely orthogonal to > LSMs as the enclave would fail to run irrespective of LSMs. This is no > different than specifying the wrong RWX flags in SECINFO, or opening a > file as RO instead of RW. Say loader is configurable. By looking at the log, can an administrator tell which file has too restrictive "maximal protection"? >> In either case the audit log cannot fulfill its purposes. >> · Inability to support #PF driven EPC allocation in SGX2 – For those >> unfamiliar with SGX2 software flows, an SGX2 enclave requests a page by >> issuing EACCEPT on the address that a new page is wanted, and the resulted >> #PF is expected to be handled by the kernel by EAUG’ing an EPC page at the >> fault address, and then the enclave would be resumed and the faulting >> EACCEPT retried, and succeed. The key requirement is to allow mmap()’ing >> non-existing enclave pages so that the SGX module/subsystem could respond >> to #PFs by EAUG’ing new pages. Sean’s implementation doesn’t allow >> mmap()’ing non-existing pages for variety of reasons and therefore blocks >> this major SGX2 usage. > > This is simply wrong. The key requirement in the theoretical EAUG scheme > is to mmap() pages that have not been added to the _hardware_ maintained > enclave. The pages (or some optimized representation of a range of pages) > would exist in the kernel's software mode of the enclave. You are right. Code can always change. My assessment was made on what's in your patch. The key point here is your patch is more complicated than it seems because you've been hand-waving on imminent requirements.
On Mon, Jul 08, 2019 at 10:49:59AM -0700, Xing, Cedric wrote: > On 7/8/2019 8:55 AM, Sean Christopherson wrote: > >On Sun, Jul 07, 2019 at 04:41:30PM -0700, Cedric Xing wrote: > True only for SGX1. > >> “maximal protection” at page load time, but such information is NOT always > >> available. For example, Graphene containers may run different applications > >> comprised of different set of executables and/or shared objects. Some of > >> them may contain self-modifying code (or text relocation) while others > >> don’t. The generic enclave loader usually doesn’t have such information so > >> wouldn’t be able to provide it ahead of time. > > > >I'm unconvinced that it would be remotely difficult to teach an enclave > >loader that an enclave or hosted application employs SMC, relocation or > >any other behavior that would require declaring RWX on all pages. > > You've been talking as if "enclave loader" is tailored to the enclave it is > loading. But in reality "enclave loader" is usually a library knowing > usually nothing about the enclave. How could it know if an enclave contains > self-modifying code? Given the rarity of SMC, require enclaves to declare "I do SMC"... The Intel SDK already requires the enclave developer to declare heap size, stack size, thread affinity, etc... I have a very hard time believing that it can't support SMC and relocation flags. > >> · Inefficient Auditing – Audit logs are supposed to help system > >> administrators to determine the set of minimally needed permissions and to > >> detect abnormal behaviors. But consider the “maximal protection” model, if > >> “maximal protection” is set to be too permissive, then audit log wouldn’t > >> be able to detect anomalies; > > > >Huh? Declaring overly permissive protections is only problematic if an > >LSM denies the permission, in which case it will generate an accurate > >audit log. > > > >If the enclave/loader "requires" a permission it doesn't actually need, > >e.g. EXECDIRTY, then it's a software bug that should be fixed. I don't > >see how this scenario is any different than an application that uses > >assembly code without 'noexecstack' and inadvertantly "requires" > >EXECSTACK due to triggering "read implies exec". In both cases the > >denied permission is unnecessary due to a userspace application bug. > > You see, you've been assuming "enclave loader" knows everything and tailored > to what it loads in a particular application. But the reality is the loader > is generic and probably shared by multiple applications. No, I'm assuming that an enclave can communicate its basic needs without undue pain. > It needs some generic way to figure out the "maximal protection". An > implementation could use information embedded in the enclave file, or could > just be "configurable". In the former case, you put extra burdens on the build > tools, while in the latter case, your audit logs cannot help generating an > appropriate configuration. I'm contending the "extra burdens" is minimal. if (do_smc || do_relocation) max_prot = RWX; else max_prot = SECINFO.FLAGS; > >> or if “maximal protection” is too restrictive, > >> then audit log cannot identify the file violating the policy. > > > >Maximal protections that are too restrictive are completely orthogonal to > >LSMs as the enclave would fail to run irrespective of LSMs. This is no > >different than specifying the wrong RWX flags in SECINFO, or opening a > >file as RO instead of RW. > > Say loader is configurable. By looking at the log, can an administrator tell > which file has too restrictive "maximal protection"? Again, this fails irrespective of LSMs. So the answer is "no", because there is no log. But the admin will never have to deal with this issue because the enclave will *never* run, i.e. would unconditionally fail to run during initial development. And the developer has bigger problems if they can't debug their own code. > >>In either case the audit log cannot fulfill its purposes.
Hi Sean, What's in my cover letter is my assessment on what's in your series. You may disagree. But I don't think it productive until you can prove your points in code. The key points I'm making are: (1) The impact to user mode code due to UAPI change is more significant than you have envisioned. (2) Your series has implemented less than required in practice. For #1, regular shared objects don't carry info like whether it contains self-modifying code or generates code on the fly. So your requirement of "maximal protection" is new, and you should at least put together a story to show everyone how it could be met, especially without changing build tools. For #2, SGX2 is imminent, and the upcoming ICX server will support 512GB of EPC. So the problems in mprotect() performance and EAUG-on-#PF must be solved, let alone other problems. I guess you have to code them up so everyone will be able to evaluate whether your approach is really as simple as you have claimed. -Cedric On 7/8/2019 11:49 AM, Sean Christopherson wrote: > On Mon, Jul 08, 2019 at 10:49:59AM -0700, Xing, Cedric wrote: >> On 7/8/2019 8:55 AM, Sean Christopherson wrote: >>> On Sun, Jul 07, 2019 at 04:41:30PM -0700, Cedric Xing wrote: >> True only for SGX1. >>>> “maximal protection” at page load time, but such information is NOT always >>>> available. For example, Graphene containers may run different applications >>>> comprised of different set of executables and/or shared objects. Some of >>>> them may contain self-modifying code (or text relocation) while others >>>> don’t. The generic enclave loader usually doesn’t have such information so >>>> wouldn’t be able to provide it ahead of time. >>> >>> I'm unconvinced that it would be remotely difficult to teach an enclave >>> loader that an enclave or hosted application employs SMC, relocation or >>> any other behavior that would require declaring RWX on all pages. >> >> You've been talking as if "enclave loader" is tailored to the enclave it is >> loading. But in reality "enclave loader" is usually a library knowing >> usually nothing about the enclave. How could it know if an enclave contains >> self-modifying code? > > Given the rarity of SMC, require enclaves to declare "I do SMC"... The > Intel SDK already requires the enclave developer to declare heap size, > stack size, thread affinity, etc... I have a very hard time believing > that it can't support SMC and relocation flags. > >>>> · Inefficient Auditing – Audit logs are supposed to help system >>>> administrators to determine the set of minimally needed permissions and to >>>> detect abnormal behaviors. But consider the “maximal protection” model, if >>>> “maximal protection” is set to be too permissive, then audit log wouldn’t >>>> be able to detect anomalies; >>> >>> Huh? Declaring overly permissive protections is only problematic if an >>> LSM denies the permission, in which case it will generate an accurate >>> audit log. >>> >>> If the enclave/loader "requires" a permission it doesn't actually need, >>> e.g. EXECDIRTY, then it's a software bug that should be fixed. I don't >>> see how this scenario is any different than an application that uses >>> assembly code without 'noexecstack' and inadvertantly "requires" >>> EXECSTACK due to triggering "read implies exec". In both cases the >>> denied permission is unnecessary due to a userspace application bug. >> >> You see, you've been assuming "enclave loader" knows everything and tailored >> to what it loads in a particular application. But the reality is the loader >> is generic and probably shared by multiple applications. > > No, I'm assuming that an enclave can communicate its basic needs without > undue pain. > >> It needs some generic way to figure out the "maximal protection". An >> implementation could use information embedded in the enclave file, or could >> just be "configurable". In the former case, you put extra burdens on the build >> tools, while in the latter case, your audit logs cannot help generating an >> appropriate configuration. > > I'm contending the "extra burdens" is minimal. > > if (do_smc || do_relocation) > max_prot = RWX; > else > max_prot = SECINFO.FLAGS; > >>>> or if “maximal protection” is too restrictive, >>>> then audit log cannot identify the file violating the policy. >>> >>> Maximal protections that are too restrictive are completely orthogonal to >>> LSMs as the enclave would fail to run irrespective of LSMs. This is no >>> different than specifying the wrong RWX flags in SECINFO, or opening a >>> file as RO instead of RW. >> >> Say loader is configurable. By looking at the log, can an administrator tell >> which file has too restrictive "maximal protection"? > > Again, this fails irrespective of LSMs. So the answer is "no", because > there is no log. But the admin will never have to deal with this issue > because the enclave will *never* run, i.e. would unconditionally fail to > run during initial development. And the developer has bigger problems if > they can't debug their own code. > >>>> In either case the audit log cannot fulfill its purposes.