Message ID | 20201104145430.300542-11-jarkko.sakkinen@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | None | expand |
On Wed, Nov 04, 2020 at 04:54:16PM +0200, Jarkko Sakkinen wrote: > From: Sean Christopherson <sean.j.christopherson@intel.com> > > Background > ========== > > 1. SGX enclave pages are populated with data by copying from normal memory > via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in > this series. > 2. It is desirable to be able to restrict those normal memory data sources. > For instance, to ensure that the source data is executable before > copying data to an executable enclave page. > 3. Enclave page permissions are dynamic (just like normal permissions) and > can be adjusted at runtime with mprotect(). > > This creates a problem because the original data source may have long since > vanished at the time when enclave page permissions are established (mmap() > or mprotect()). > > The solution (elsewhere in this series) is to force enclaves creators to > declare their paging permission *intent* up front to the ioctl(). This > intent can me immediately compared to the source data’s mapping and > rejected if necessary. > > The “intent” is also stashed off for later comparison with enclave > PTEs. This ensures that any future mmap()/mprotect() operations > performed by the enclave creator or done on behalf of the enclave > can be compared with the earlier declared permissions. > > Problem > ======= > > There is an existing mmap() hook which allows SGX to perform this > permission comparison at mmap() time. However, there is no corresponding > ->mprotect() hook. > > Solution > ======== > > Add a vm_ops->mprotect() hook so that mprotect() operations which are > inconsistent with any page's stashed intent can be rejected by the driver. > > Cc: linux-mm@kvack.org > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Matthew Wilcox <willy@infradead.org> > Acked-by: Jethro Beekman <jethro@fortanix.com> > Reviewed-by: Darren Kenny <darren.kenny@oracle.com> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > Co-developed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > --- > include/linux/mm.h | 3 +++ > mm/mprotect.c | 5 ++++- > 2 files changed, 7 insertions(+), 1 deletion(-) This needs an ACK from an mm person.
On 11/5/20 8:04 AM, Borislav Petkov wrote: ... >> Add a vm_ops->mprotect() hook so that mprotect() operations which are >> inconsistent with any page's stashed intent can be rejected by the driver. >> >> Cc: linux-mm@kvack.org >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Matthew Wilcox <willy@infradead.org> >> Acked-by: Jethro Beekman <jethro@fortanix.com> >> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >> Co-developed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> >> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> >> --- >> include/linux/mm.h | 3 +++ >> mm/mprotect.c | 5 ++++- >> 2 files changed, 7 insertions(+), 1 deletion(-) > This needs an ACK from an mm person. For what it's worth: Acked-by: Dave Hansen <dave.hansen@intel.com> The top 5 git-blamees in terms of mprotect.c at the moment are: 45 Andi Kleen 50 Peter Xu 81 Dave Hansen 90 Mel Gorman 209 Linus Torvalds
On Wed, Nov 04, 2020 at 04:54:16PM +0200, Jarkko Sakkinen wrote: > From: Sean Christopherson <sean.j.christopherson@intel.com> > > Background > ========== > > 1. SGX enclave pages are populated with data by copying from normal memory > via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in > this series. > 2. It is desirable to be able to restrict those normal memory data sources. > For instance, to ensure that the source data is executable before > copying data to an executable enclave page. > 3. Enclave page permissions are dynamic (just like normal permissions) and > can be adjusted at runtime with mprotect(). > > This creates a problem because the original data source may have long since > vanished at the time when enclave page permissions are established (mmap() > or mprotect()). > > The solution (elsewhere in this series) is to force enclaves creators to > declare their paging permission *intent* up front to the ioctl(). This > intent can me immediately compared to the source data???s mapping and > rejected if necessary. > > The ???intent??? is also stashed off for later comparison with enclave > PTEs. This ensures that any future mmap()/mprotect() operations > performed by the enclave creator or done on behalf of the enclave > can be compared with the earlier declared permissions. > > Problem > ======= > > There is an existing mmap() hook which allows SGX to perform this > permission comparison at mmap() time. However, there is no corresponding > ->mprotect() hook. > > Solution > ======== > > Add a vm_ops->mprotect() hook so that mprotect() operations which are > inconsistent with any page's stashed intent can be rejected by the driver. > I have not read the series so this is superficial only. That said... > Cc: linux-mm@kvack.org > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Matthew Wilcox <willy@infradead.org> > Acked-by: Jethro Beekman <jethro@fortanix.com> > Reviewed-by: Darren Kenny <darren.kenny@oracle.com> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > Co-developed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > --- > include/linux/mm.h | 3 +++ > mm/mprotect.c | 5 ++++- > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index ef360fe70aaf..eb38eabc5039 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -559,6 +559,9 @@ struct vm_operations_struct { > void (*close)(struct vm_area_struct * area); > int (*split)(struct vm_area_struct * area, unsigned long addr); > int (*mremap)(struct vm_area_struct * area); > + int (*mprotect)(struct vm_area_struct *vma, > + struct vm_area_struct **pprev, unsigned long start, > + unsigned long end, unsigned long newflags); The first user of this uses the following information ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); It only needs start, end and newflags. The pprev is passed in so the hook can call mprotect_fixup() which is redundant as the caller knows it should do that. I don't think an arbitrary driver should be responsible for poking too much into the mm internals to do the fixup because we do not know what other users of this hook might require in the future. Hence, I would suggest that the hook receive the minimum possible information to do the permissions check for the first in-tree user. If it returns without failure then mm/mprotect.c would always do the fixup. > vm_fault_t (*fault)(struct vm_fault *vmf); > vm_fault_t (*huge_fault)(struct vm_fault *vmf, > enum page_entry_size pe_size); > diff --git a/mm/mprotect.c b/mm/mprotect.c > index 56c02beb6041..1fd4fa71ce16 100644 > --- a/mm/mprotect.c > +++ b/mm/mprotect.c > @@ -616,7 +616,10 @@ static int do_mprotect_pkey(unsigned long start, size_t len, > tmp = vma->vm_end; > if (tmp > end) > tmp = end; > - error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > + if (vma->vm_ops && vma->vm_ops->mprotect) > + error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); > + else > + error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); That would then become if (vma->vm_ops && vma->vm_ops->mprotect) error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); if (!error) error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); and mprotect_fixup would be removed from the driver. While vm_operations_struct has borderline zero documentation, a hook for one in-kernel user should have a comment explaining what the semantics of the hook is -- what is it responsible for (permission check), what can it change (nothing), etc. Maybe something like /* * Called by mprotect in the event driver-specific permission * checks need to be made before the mprotect is finalised. * No modifications should be done to the VMA, returns 0 * if the mprotect is permitted. */ int (*mprotect)(struct vm_area_struct *vma, unsigned long start, unsigned long end, unsigned long newflags); If a future driver *does* need to poke deeper into the VM for mprotect then at least they'll have to explain why that's a good idea.
On Fri, Nov 06, 2020 at 10:04:09AM +0000, Mel Gorman wrote: > On Wed, Nov 04, 2020 at 04:54:16PM +0200, Jarkko Sakkinen wrote: > > From: Sean Christopherson <sean.j.christopherson@intel.com> > > > > Background > > ========== > > > > 1. SGX enclave pages are populated with data by copying from normal memory > > via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in > > this series. > > 2. It is desirable to be able to restrict those normal memory data sources. > > For instance, to ensure that the source data is executable before > > copying data to an executable enclave page. > > 3. Enclave page permissions are dynamic (just like normal permissions) and > > can be adjusted at runtime with mprotect(). > > > > This creates a problem because the original data source may have long since > > vanished at the time when enclave page permissions are established (mmap() > > or mprotect()). > > > > The solution (elsewhere in this series) is to force enclaves creators to > > declare their paging permission *intent* up front to the ioctl(). This > > intent can me immediately compared to the source data???s mapping and > > rejected if necessary. > > > > The ???intent??? is also stashed off for later comparison with enclave > > PTEs. This ensures that any future mmap()/mprotect() operations > > performed by the enclave creator or done on behalf of the enclave > > can be compared with the earlier declared permissions. > > > > Problem > > ======= > > > > There is an existing mmap() hook which allows SGX to perform this > > permission comparison at mmap() time. However, there is no corresponding > > ->mprotect() hook. > > > > Solution > > ======== > > > > Add a vm_ops->mprotect() hook so that mprotect() operations which are > > inconsistent with any page's stashed intent can be rejected by the driver. > > > > I have not read the series so this is superficial only. That said... > > > Cc: linux-mm@kvack.org > > Cc: Andrew Morton <akpm@linux-foundation.org> > > Cc: Matthew Wilcox <willy@infradead.org> > > Acked-by: Jethro Beekman <jethro@fortanix.com> > > Reviewed-by: Darren Kenny <darren.kenny@oracle.com> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > Co-developed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> > > --- > > include/linux/mm.h | 3 +++ > > mm/mprotect.c | 5 ++++- > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index ef360fe70aaf..eb38eabc5039 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -559,6 +559,9 @@ struct vm_operations_struct { > > void (*close)(struct vm_area_struct * area); > > int (*split)(struct vm_area_struct * area, unsigned long addr); > > int (*mremap)(struct vm_area_struct * area); > > + int (*mprotect)(struct vm_area_struct *vma, > > + struct vm_area_struct **pprev, unsigned long start, > > + unsigned long end, unsigned long newflags); > > The first user of this uses the following information > > ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > > It only needs start, end and newflags. The pprev is passed in so the > hook can call mprotect_fixup() which is redundant as the caller knows it > should do that. I don't think an arbitrary driver should be responsible > for poking too much into the mm internals to do the fixup because we do > not know what other users of this hook might require in the future. > > Hence, I would suggest that the hook receive the minimum possible > information to do the permissions check for the first in-tree user. If > it returns without failure then mm/mprotect.c would always do the fixup. > > > vm_fault_t (*fault)(struct vm_fault *vmf); > > vm_fault_t (*huge_fault)(struct vm_fault *vmf, > > enum page_entry_size pe_size); > > diff --git a/mm/mprotect.c b/mm/mprotect.c > > index 56c02beb6041..1fd4fa71ce16 100644 > > --- a/mm/mprotect.c > > +++ b/mm/mprotect.c > > @@ -616,7 +616,10 @@ static int do_mprotect_pkey(unsigned long start, size_t len, > > tmp = vma->vm_end; > > if (tmp > end) > > tmp = end; > > - error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > + if (vma->vm_ops && vma->vm_ops->mprotect) > > + error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); > > + else > > + error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > That would then become > > if (vma->vm_ops && vma->vm_ops->mprotect) > error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); > if (!error) > error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); > > and mprotect_fixup would be removed from the driver. > > While vm_operations_struct has borderline zero documentation, a hook for > one in-kernel user should have a comment explaining what the semantics > of the hook is -- what is it responsible for (permission check), what > can it change (nothing), etc. Maybe something like > > /* > * Called by mprotect in the event driver-specific permission > * checks need to be made before the mprotect is finalised. > * No modifications should be done to the VMA, returns 0 > * if the mprotect is permitted. > */ > int (*mprotect)(struct vm_area_struct *vma, > unsigned long start, unsigned long end, > unsigned long newflags); > > If a future driver *does* need to poke deeper into the VM for mprotect > then at least they'll have to explain why that's a good idea. Both comments make sense to me. I'll refine this patch on Monday and also "x86/sgx: Add SGX misc driver interface", which uses this callback. Thanks a lot for valuable feedback! > -- > Mel Gorman > SUSE Labs /Jarkko
On Wed, Nov 04, 2020 at 04:54:16PM +0200, Jarkko Sakkinen wrote: Good morning, I hope the week has gone well for everyone. > From: Sean Christopherson <sean.j.christopherson@intel.com> > > Background > ========== > > 1. SGX enclave pages are populated with data by copying from normal memory > via ioctl() (SGX_IOC_ENCLAVE_ADD_PAGES), which will be added later in > this series. > 2. It is desirable to be able to restrict those normal memory data sources. > For instance, to ensure that the source data is executable before > copying data to an executable enclave page. > 3. Enclave page permissions are dynamic (just like normal permissions) and > can be adjusted at runtime with mprotect(). Only relevant on SGX2 hardware, see discussion below. > This creates a problem because the original data source may have > long since vanished at the time when enclave page permissions are > established (mmap() or mprotect()). > > The solution (elsewhere in this series) is to force enclaves I don't believe that enclaves should be plural in this context. > creators to declare their paging permission *intent* up front to the > ioctl(). This intent can me immediately compared to the source The 'me' should be 'be' in the above line. > data???s mapping and rejected if necessary. > > The ???intent??? is also stashed off for later comparison with > enclave PTEs. This ensures that any future mmap()/mprotect() > operations performed by the enclave creator or done on behalf of the > enclave can be compared with the earlier declared permissions. Just some further clarifications that should, arguably, be included in the kernel documentation given their security implications. The 900 pound primate in the room, that no one is acknowledging, is that this technology was designed to not allow the operating system to have any control over what it is doing. In the mindset of kernel developers, the operating system is the absolute authority on security, so we find ourselves in a situation where the kernel needs to try and work around this fact so any solutions will be imperfect at best. As I've noted before, this is actually a primary objective of enclave authors, since one of the desires for 'Confidential Computing' is to hide things like proprietary algorithms from the platform owners. I think the driver needs to acknowledge this fact and equip platform owners with the simplest and most effective security solutions that are available. The only reason that mprotect protections are needed in this driver are to close a security loophole on SGX2 hardware, ie. hardware that supports the ENCLU[EMODPE] instruction. This instruction allows an enclave to modify page permissions that are encoded in the Enclave Page Cache Metadata (EPCM) at initialization time. In all likelihood, this is going to be the only relevant hardware that this driver runs on. On SGX2 hardware, enclave based code can conspire with its untrusted runtime to allow executable regions to have write permissions. This would allow the enclave to load and execute whatever code that it pleases and that the operating system would have no visibility into whatsoever. The non-SGX2 platforms don't need mprotect protections since even if they were to modify at the OS level their page permissions, any attempts to access a page with modified permissions would be trapped by the EPCM protections that are unmodifiable after the enclave has been initialized. In light of this, given the decision by the driver authors to not fully equip the driver with EDMM support, the mprotect protection requirements are straight forward and minimalistic. All that is needed is a binary valued variable, set on the command-line, that either allows or denies anonymous code execution by an enclave, ie. access to page protection changes after initialization. The enclave page mapping callback is elegant but has little use if the objective of all this is to allow the driver to enforce SGX1 semantics on a platform that has SGX2 instruction support. Save the elegant solution until a reasoned arguement can be made as to what anyone would actually be able to do with the per page permissions checks, even on an EDMM capable driver. I could go into detail on that issue as well but I hesitate to be an insufferable bore. I hope all of this is helpful. Have a good weekend. Dr. Greg As always, Dr. Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "In the future, company names will be a 32-character hex string." -- Bruce Schneier
On 11/6/20 9:43 AM, Dr. Greg wrote: > In light of this, given the decision by the driver authors to not > fully equip the driver with EDMM support, the mprotect protection > requirements are straight forward and minimalistic. All that is > needed is a binary valued variable, set on the command-line, that > either allows or denies anonymous code execution by an enclave, > ie. access to page protection changes after initialization. To that, I say NAK. We need more flexibility than a boot-time-fixed, system-wide switch.
On Fri, Nov 06, 2020 at 06:51:07PM +0200, Jarkko Sakkinen wrote:
> Both comments make sense to me. I'll refine this patch on Monday and
And while you're at it, I'd suggest you refine the whole patchset and
send a full v41 instead:
- please audit all your Reviewed-by, Acked-by tags as to for what
versions of the patches they were given. If you've changed those patches
in the meantime, then all those tags are invalid and need to go.
- work in all the change requests
- fix the order of the patches so that each one builds
so that they can be taken cleanly into tip.
Thx.
On Fri, Nov 06, 2020 at 11:43:59AM -0600, Dr. Greg wrote: > The 900 pound primate in the room, that no one is acknowledging, is > that this technology was designed to not allow the operating system to > have any control over what it is doing. In the mindset of kernel > developers, the operating system is the absolute authority on > security, so we find ourselves in a situation where the kernel needs > to try and work around this fact so any solutions will be imperfect at > best. > > As I've noted before, this is actually a primary objective of enclave > authors, since one of the desires for 'Confidential Computing' is to > hide things like proprietary algorithms from the platform owners. I > think the driver needs to acknowledge this fact and equip platform > owners with the simplest and most effective security solutions that > are available. Or we need to not merge "technology" that subverts the owner of the hardware. Remember: root kit authors are inventive buggers.
On 11/6/20 1:13 PM, Matthew Wilcox wrote: > On Fri, Nov 06, 2020 at 11:43:59AM -0600, Dr. Greg wrote: >> The 900 pound primate in the room, that no one is acknowledging, is >> that this technology was designed to not allow the operating system to >> have any control over what it is doing. In the mindset of kernel >> developers, the operating system is the absolute authority on >> security, so we find ourselves in a situation where the kernel needs >> to try and work around this fact so any solutions will be imperfect at >> best. >> >> As I've noted before, this is actually a primary objective of enclave >> authors, since one of the desires for 'Confidential Computing' is to >> hide things like proprietary algorithms from the platform owners. I >> think the driver needs to acknowledge this fact and equip platform >> owners with the simplest and most effective security solutions that >> are available. > Or we need to not merge "technology" that subverts the owner of > the hardware. Remember: root kit authors are inventive buggers. Machine owners have lots of ways to yield their own control of the hardware. One is: wget http://what-could-go-wrong.com -O foo.sh; sudo foo.sh Another is to enable SGX in the BIOS. You're giving up some level of control and yielding it to the hardware. If you don't trust the hardware (aka. Intel), I'd stay far, far away from that BIOS option. This mprotect() hook is trying to have the kernel enforce some rules that yield *less* to enclave authors. Once a rootkit is in play, the kernel isn't going to be providing meaningful protection and I'd expect that this hook is rather useless.
On Fri, Nov 06, 2020 at 09:37:25PM +0100, Borislav Petkov wrote: > On Fri, Nov 06, 2020 at 06:51:07PM +0200, Jarkko Sakkinen wrote: > > Both comments make sense to me. I'll refine this patch on Monday and > > And while you're at it, I'd suggest you refine the whole patchset and > send a full v41 instead: > > - please audit all your Reviewed-by, Acked-by tags as to for what > versions of the patches they were given. If you've changed those patches > in the meantime, then all those tags are invalid and need to go. > > - work in all the change requests > > - fix the order of the patches so that each one builds > > so that they can be taken cleanly into tip. > > Thx. OK, everything else is clear except change requests part I want to check. There has been a change request to update callback that made perfect sense to me. Is there something else that I might have missed? Just checking. > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette /Jarkko
On Sat, Nov 07, 2020 at 12:04:02AM +0200, Jarkko Sakkinen wrote: > There has been a change request to update callback that made perfect > sense to me. Is there something else that I might have missed? Just > checking. With "change requests" I mean the usual going through the replies to a patchset and working in the changes people requested. Nothing special - just the usual working in review feedback, that's it. Thx.
On Fri, Nov 06, 2020 at 09:54:19AM -0800, Dave Hansen wrote: Good morning, I hope the weekend is going well for everyone, beautiful weather out here in West-Cental Minnesota. > On 11/6/20 9:43 AM, Dr. Greg wrote: > > In light of this, given the decision by the driver authors to not > > fully equip the driver with EDMM support, the mprotect protection > > requirements are straight forward and minimalistic. All that is > > needed is a binary valued variable, set on the command-line, that > > either allows or denies anonymous code execution by an enclave, > > ie. access to page protection changes after initialization. > To that, I say NAK. We need more flexibility than a boot-time-fixed, > system-wide switch. To be clear, I wasn't referring to a global yes/no option in the code that implements the mprotect callout method in the vm_operations_struct. I was referring to the implementation of the hook in the SGX driver code. In all of these discussions there hasn't been a refutation of my point that the only reason this hook is needed is to stop the potential for anonymous code execution on SGX2 capable hardware. So we will assume, that while unspoken, this is the rationale for the hook. If you are NAK'ing a global enable/disable in the driver code, I think there needs to be a discussion of why the driver, in its current state, needs anything other then a yes/no decision on mprotect after enclave initialization is complete. At this point in time the driver has no intention of supporting EDMM, so the simple belt-and-suspenders approach is to deny mprotect on enclave virtual memory after initialization. Absent mprotect, the hardware is perfectly capable of enforcing page permissions that are only consistent with the initial mapping of the enclave. If and when EDMM is implemented there might be a rationale for per page mprotect interrogation. I won't waste people's time here but I believe a cogent arguement can be made that there is little utility, even under EDMM, of making per page permission decisions rather then a 'yes/no' decision by the platform owner as to whether or not they want to allow anonymous code execution. Hopefully all of this is a useful clarification. Have a good weekend. Dr. Greg As always, Dr. Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "If you ever teach a yodeling class, probably the hardest thing is to keep the students from just trying to yodel right off. You see, we build to that." -- Jack Handey Deep Thoughts
On Fri, Nov 06, 2020 at 09:13:11PM +0000, Matthew Wilcox wrote: > On Fri, Nov 06, 2020 at 11:43:59AM -0600, Dr. Greg wrote: > > The 900 pound primate in the room, that no one is acknowledging, is > > that this technology was designed to not allow the operating system to > > have any control over what it is doing. In the mindset of kernel > > developers, the operating system is the absolute authority on > > security, so we find ourselves in a situation where the kernel needs > > to try and work around this fact so any solutions will be imperfect at > > best. > > > > As I've noted before, this is actually a primary objective of enclave > > authors, since one of the desires for 'Confidential Computing' is to > > hide things like proprietary algorithms from the platform owners. I > > think the driver needs to acknowledge this fact and equip platform > > owners with the simplest and most effective security solutions that > > are available. > Or we need to not merge "technology" that subverts the owner of the > hardware. Remember: root kit authors are inventive buggers. That will be an interesting philosophical argument for Linux moving forward. I've often stated that there is going to be a natural political tension between the objectives of open-source and advances in platform security. By definition, advancing the latter will result in technology that contrains what can be done with a platform. It may have made more sense for the SGX driver to be mainline when the technology was going to be ubiquitous. Given the decision by Intel to monetize the platform, by limiting its implementation to high end server platforms, the case could be made that it is a driver best supported by the distributions or cloud providers. I'm neither for or against inclusion, I'm simply advocating that we make informed decisions on the driver implementation that benefits the user community. FWIW, based on knowledge that has come from building application stacks on top of the technology for a half decade now. Have a good weekend. Dr. Greg As always, Dr. Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "Atilla The Hun's Maxim: If you're going to rape, pillage and burn, be sure to do things in that order." -- P.J. Plauger Programming On Purpose
On 11/7/20 7:09 AM, Dr. Greg wrote: > In all of these discussions there hasn't been a refutation of my point > that the only reason this hook is needed is to stop the potential for > anonymous code execution on SGX2 capable hardware. So we will assume, > that while unspoken, this is the rationale for the hook. Unspoken? See from the cover letter: > 3. Enclave page permissions are dynamic (just like normal permissions) and > can be adjusted at runtime with mprotect(). I explicitly chose not to name the instructions, nor the exact version of the SGX ISA that introduces them. They're supported in the series, and that's all that matters. If you want to advocate for something different to be done, patches are accepted.
On Sat, Nov 07, 2020 at 11:16:25AM -0800, Dave Hansen wrote: Good afternoon, I hope the week is going well for everyone. > On 11/7/20 7:09 AM, Dr. Greg wrote: > > In all of these discussions there hasn't been a refutation of my point > > that the only reason this hook is needed is to stop the potential for > > anonymous code execution on SGX2 capable hardware. So we will assume, > > that while unspoken, this is the rationale for the hook. > Unspoken? See from the cover letter: The complexity of the issues involved and the almost complete lack of understanding of the technology in the Linux user community would suggest that everyone would benefit from a more detailed explanation of the issues at hand. > > 3. Enclave page permissions are dynamic (just like normal permissions) and > > can be adjusted at runtime with mprotect(). > I explicitly chose not to name the instructions, nor the exact > version of the SGX ISA that introduces them. They're supported in > the series, and that's all that matters. When it comes to security issues and risk assessment it always seems that more information is better, but of course opinions vary, wildly it would seem in the case of this technology. I've been told countless times in my career: "What happens if you get hit by a bus". I've tried to address those issues by generating copious amounts of documentation on everything I do. Having the relevant issues with respect to the security considerations and implications of this technology clearly documented would seem to address the 'hit by a bus' issue for other developers that may need to look at and understand the code down the road. > If you want to advocate for something different to be done, patches > are accepted. I'm including a patch below that addresses the mprotect vulnerability as simplistically as I believe it can be addressed and provides some background information on the issues at hand. I will let people more wise then I am decide whether or not the world at large is worse off for having the information available. I tested this with our runtime, which is of a significantly different design then Intel's. After testing multiple adversarial approaches to changing page permissions, I'm left struggling to understand what the page walking code accomplishes, even in the case of mmap. The ultimate decision with respect to allowed page permissions is cryptographically encoded in the enclave measurement. The enclave won't initialize if changes are made to the desired EPCM permissions. If an attempt is made to use mmap to alter those permissions at the OS level they will be inhibited by the EPCM permission verifications. If one reads the EDMM papers by the Intel engineering team that designed the technology, they were very concerned about an enclave having to agree to any virtual memory changes, hence the need for ENCLU[EACCEPT] and ENCLU[EACCEPTCOPY]. In that respect the behavior of ENCLU[EMODPE] is somewhat interesting in that it gives untrusted userspace the ability to thwart the intentions of enclave code. They may not, however, have had any other choice given that SGX was designed as a virtual memory play in order to make it an 'easy' add-on. Given all of this, it seems to be the case that the only thing needed to block anonymous code execution is to block mprotect on an initialized enclave, which the attached patch does. If and when the driver supports EDMM there is, I believe, a very open question as to what type of introspection capability the kernel should have. More on that in a subsequent post/patch. Have a good evening. Dr. Greg Cut here. ----------------------------------------------------------------- From 68cba86b0cb3c5917e8c42d83edd5220e2890bb1 Mon Sep 17 00:00:00 2001 From: "Dr. Greg" <greg@enjellic.com> Date: Thu, 12 Nov 2020 04:48:57 -0600 Subject: [PATCH] Unconditionally block mprotect of enclave memory. In SGX there are two levels of memory protection, the classic page table mechanism and SGX hardware based page protections that are codified in the Enclave Page Cache Metadata. A successful memory access requires that both mechanisms agree that the access is permitted. In the initial implementation of SGX (SGX1), the page permissions are immutable after the enclave is initialized. Even if classic page protections are modified via mprotect, any attempt to access enclave memory with alternative permissions will be blocked. One of the architectural changes implemented in the second generation of SGX (SGX2) is the ability for page access permissions to be dynamically manipulated after the enclave is initialized. This requires coordination between trusted code running in the enclave and untrusted code using mprotect and special ring-0 instructions. One of the security threats associated with SGX2 hardware is that enclave based code can conspire with its untrusted runtime to make executable enclave memory writable. This provides the opportunity for completely anonymous code execution that the operating system has no visibility into. All that is needed to, simply, close this vulnerability is to eliminate the ability to call mprotect against the virtual memory range of an enclave after it is initialized. Any mprotect changes made prior to initialization that are inconsistent with the permissions codified in the enclave will cause initialization and/or access to fail. Tested-by: Dr. Greg <greg@enjellic.com> Signed-off-by: Dr. Greg <greg@enjellic.com> --- arch/x86/kernel/cpu/sgx/encl.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 139f8c398685..c613482ebb56 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, struct vm_area_struct **pprev, unsigned long start, unsigned long end, unsigned long newflags) { - int ret; + struct sgx_encl *encl = vma->vm_private_data; - ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); - if (ret) - return ret; + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) + return -EACCES; return mprotect_fixup(vma, pprev, start, end, newflags); }
On 11/12/20 12:58 PM, Dr. Greg wrote: > @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, > struct vm_area_struct **pprev, unsigned long start, > unsigned long end, unsigned long newflags) > { > - int ret; > + struct sgx_encl *encl = vma->vm_private_data; > > - ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > - if (ret) > - return ret; > + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) > + return -EACCES; > > return mprotect_fixup(vma, pprev, start, end, newflags); > } This rules out mprotect() on running enclaves. Does that break any expectations from enclave authors, or take away capabilities that folks need?
On Thu, Nov 12, 2020 at 1:31 PM Dave Hansen <dave.hansen@intel.com> wrote: > > On 11/12/20 12:58 PM, Dr. Greg wrote: > > @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, > > struct vm_area_struct **pprev, unsigned long start, > > unsigned long end, unsigned long newflags) > > { > > - int ret; > > + struct sgx_encl *encl = vma->vm_private_data; > > > > - ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > > - if (ret) > > - return ret; > > + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) > > + return -EACCES; > > > > return mprotect_fixup(vma, pprev, start, end, newflags); > > } > > This rules out mprotect() on running enclaves. Does that break any > expectations from enclave authors, or take away capabilities that folks > need? It certainly prevents any scheme in which an enclave coordinates with the outside world to do W-and-then-X JIT inside. I'm also not convinced it has any real effect at all unless there's some magic I missed to prevent someone from using mmap(2) to effectively change permissions. Everyone, IMO this SGX1 - vs - SGX2 - vs - EDMM discussion is entirely missing the point and is a waste of everyone's time. Let's pretend we're building a system that has nothing to do with SGX and requires no special hardware support at all. It works like this: A user program opens /dev/xyz and gets back an fd that represents 16 MB of memory. The user program copies some data from disk (or network or whatever) into fd (using write(2) or ioctl(2) or mmap(2) and memcpy) and then mmaps some of the fd as R and some as RW and some as RX, and then the user program jumps into the RX mapping. If we replace /dev/xyz with /dev/zero, then this simply does not work under a reasonably strict W^X policy -- a lot of people think it's quite reasonable for an OS to prevent a user program from obtaining an X mapping containing anything other than a mapping from a file on disk. To solve this, we can do one of at least three things: a) You can't use /dev/xyz unless you have permission to create WX memory or to at least create W memory and then change it to X. b) You can do whatever you want with /dev/xyz, and LSM policies are blatantly violated as a result. c) The /dev/xyz API is clever and tracks, page-by-page, whether the user intends to ever write and/or execute that page, and behaves accordingly. This patchset takes the approach (c). The actual clever policy isn't here yet, and we don't really know whether it will ever appear, but the API is set up to enable such a policy to be written. This appears to be a win for everyone, since the code is pretty clean and the API is straightforward. Now, back to SGX. There are only two things that are remotely SGX-specific here. First, SGX requires this unusual memory model in which there is an executable mapping of (part of) a device node. [0] Second, early SGX hardware had this oddity that the kernel could set a per-backing-page (as opposed to per-PTE) bit to permanently disable X on a given /dev/sgx page. Building a security model around that would have been a hack, and it DOES NOT WORK on new hardware. So can we please stop discussing it? None of the actual interesting parts of this have much to do with SGX per se and have nothing whatsoever to do with EDMM or any other Intel buzzword. Heck, if anyone actually cared to do so, something with essentially the same semantics could probably be built using SEV hardware instead of SGX, and it would have exactly the same issue if we wanted it to work for tasks that didn't have access to /dev/kvm. [0] SGX doesn't *really* require this. We could set things up so that you do mmap(..., MAP_ANONYMOUS, fd, ...) and then somehow introduce that mapping to SGX. I think the result would be too disgusting to seriously consider.
On Thu, Nov 12, 2020 at 01:31:19PM -0800, Dave Hansen wrote: Good afternoon to everyone. > On 11/12/20 12:58 PM, Dr. Greg wrote: > > @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, > > struct vm_area_struct **pprev, unsigned long start, > > unsigned long end, unsigned long newflags) > > { > > - int ret; > > + struct sgx_encl *encl = vma->vm_private_data; > > > > - ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > > - if (ret) > > - return ret; > > + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) > > + return -EACCES; > > > > return mprotect_fixup(vma, pprev, start, end, newflags); > > } > This rules out mprotect() on running enclaves. Does that break any > expectations from enclave authors, or take away capabilities that > folks need? As I mentioned an hour or so ago when I posted our updated patch, Sean and Jarkko have specifically indicated that there is no intention to support Enclave Dynamic Memory Management (EDMM) at this stage of the driver. I believe the intent is to open that can of worms after the driver is mainlined. Since the stated intent of the driver is to only implement SGX1 semantics there is no need to allow page permission changes of any type after the enclave is initialized. If mmap/mprotect are taken off the table for an initialized enclave, there is no need to walk the enclave page permissions since the hardware itself will enforce those intents. Runtime support is critical to implementing EDMM. It seems premature to place code into the kernel until there is agreement from the runtime developers as to how page permission intent should be communicated into the kernel. Current EDMM implementations simply allocate a sparse aperture which can be further extended, for example, to increase heap space or the number of Task Control Structures. As I've stated previously, there is an open question at this point as to how useful a mainline driver will be without EDMM support, unless the distributions or cloud providers are going to patch it in on top of the mainline driver. Those players have been copied on all of these e-mails so I would assume they could/would pipe up with comments on what type of security architecture should be implemented. As I've stated before, I believe in the final analysis that the only relevant question is yes or no with respect to dynamic enclaves. Have a good remainder of the weekend. Dr. Greg As always, Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "If you think nobody cares if you're alive, try missing a couple of car payments." -- Earl Wilson
On Thu, Nov 12, 2020 at 02:41:00PM -0800, Andy Lutomirski wrote: Good morning, I hope the week is starting well for everyone. > On Thu, Nov 12, 2020 at 1:31 PM Dave Hansen <dave.hansen@intel.com> wrote: > > > > On 11/12/20 12:58 PM, Dr. Greg wrote: > > > @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct vm_area_struct *vma, > > > struct vm_area_struct **pprev, unsigned long start, > > > unsigned long end, unsigned long newflags) > > > { > > > - int ret; > > > + struct sgx_encl *encl = vma->vm_private_data; > > > > > > - ret = sgx_encl_may_map(vma->vm_private_data, start, end, newflags); > > > - if (ret) > > > - return ret; > > > + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) > > > + return -EACCES; > > > > > > return mprotect_fixup(vma, pprev, start, end, newflags); > > > } > > > > This rules out mprotect() on running enclaves. Does that break any > > expectations from enclave authors, or take away capabilities that folks > > need? > It certainly prevents any scheme in which an enclave coordinates > with the outside world to do W-and-then-X JIT inside. I'm also not > convinced it has any real effect at all unless there's some magic I > missed to prevent someone from using mmap(2) to effectively change > permissions. The patch that I posted yesterday addresses the security issue for both mmap and mprotect by trapping the permission change request at the level of the sgx_encl_may_map() function. With respect to the W-and-then-X JIT issue, the stated purpose of the driver is to implement basic SGX functionality, which is SGX1 semantics, it has been stated formally for a year by the developers themselves that they are not entertaining a driver that addresses any of the issues associated with non-static memory permissions. As I've noted previously, the hardware itself is capable of enforcing that after initialization, if mmap/mprotect is blocked on hardware that supports SGX2 instructions. > Everyone, IMO this SGX1 - vs - SGX2 - vs - EDMM discussion is > entirely missing the point and is a waste of everyone's time. Let's > pretend we're building a system that has nothing to do with SGX and > requires no special hardware support at all. It works like this: I don't doubt there is a potential bigger vision here, quite frankly it is probably an open question whether or not SGX is going to be a part of this future, for a variety of reasons. I also do not doubt that you have the skills to define that vision. Right now, however, the issue is not about pretending but rather one of getting a driver into the kernel that provides a framework for building whatever future SGX may have. Given GKH's comments on LWN last week in response to the RAPL vulnerability, I'm not sure if it is a politically done deal that the driver will go in. SGX has specific hardware characteristics that impact the driver, I don't see how fitting it into a generic trusted execution model advances the agenda immediately at hand. Particularly given the fact that I'm not even sure people understand the questions that need to be answered about such a generic model. > A user program opens /dev/xyz and gets back an fd that represents 16 > MB of memory. The user program copies some data from disk (or > network or whatever) into fd (using write(2) or ioctl(2) or mmap(2) > and memcpy) and then mmaps some of the fd as R and some as RW and > some as RX, and then the user program jumps into the RX mapping. This is basically the SGX model in the new driver. The important defining characteristic of the driver, that we can't wave away, is that the hardware requires a specific set of initial page permissions to be implemented in order for initialization of the memory range (enclave) to succeed. This is inherent to the way SGX hardware was designed to work. The only difference between SGX1 and SGX2 is that the latter offers a small number of additional instructions that allow the page permissions to be dynamically manipulated after initialization is complete. From a security perspective, the issue at hand is that the executable material is not going to come in through the fd, it is going to be loaded by the enclave over the network. This isn't fear mongering, it is the stated intent of what people want to do with the technology as a integral part of confidential computing. I've had the opportunity to brief DOD and other entities concerned with intelligence issues, about these type of potential capabilities. It isn't hard to envision scenarios of where having potentially sensitive code and data only ever handled and executed by a trusted entity, in an environment that is inherently ephemeral with respect to its persistence, is an important design characteristic. Thermite has also been known to play a role in some of these designs prior to the greater elegance of trusted execution environments. Ultimately, if you believe the Confidential Computing Consortium, it is also what people want for their sensitive cloud workloads. Absent the thermite of course. > If we replace /dev/xyz with /dev/zero, then this simply does not work > under a reasonably strict W^X policy -- a lot of people think it's > quite reasonable for an OS to prevent a user program from obtaining an > X mapping containing anything other than a mapping from a file on > disk. To solve this, we can do one of at least three things: > > a) You can't use /dev/xyz unless you have permission to create WX > memory or to at least create W memory and then change it to X. > > b) You can do whatever you want with /dev/xyz, and LSM policies are > blatantly violated as a result. I think the important issue at hand is that classic LSM policies simply are not relevant with respect to how this technology was designed to operate, and perhaps more importantly, how people want to use it. That is why I have consistently stated that I think the only relevant knob is a binary decision as to whether or not a platform owner wants to entertain completely anonymous code execution. > c) The /dev/xyz API is clever and tracks, page-by-page, whether the > user intends to ever write and/or execute that page, and behaves > accordingly. > > This patchset takes the approach (c). The actual clever policy > isn't here yet, and we don't really know whether it will ever > appear, but the API is set up to enable such a policy to be written. > This appears to be a win for everyone, since the code is pretty > clean and the API is straightforward. I believe I have been clear in stating that I have never doubted the cleverness of the approach or its potential utility for the future. The issue at hand is that it simply isn't relevant at this stage of the driver. Getting this new vision right is something that would benefit from a lot of conversations between runtime and kernel developers. Arguably, the case can be made that it should have a second type of implementation to ensure that the approach is generic, extensible and most importantly secure. The 'cleverness' of the policy needs to be evaluated in the context of what does it mean with respect to the risk arbitration decisions that we are trying to support. The open question comes down to, in essence, asking ourselves whether or not we believe that it makes sense to say that 15 pages of RWX memory is a security threat but 5 are not. > Now, back to SGX. There are only two things that are remotely > SGX-specific here. First, SGX requires this unusual memory model in > which there is an executable mapping of (part of) a device node. [0] > Second, early SGX hardware had this oddity that the kernel could set > a per-backing-page (as opposed to per-PTE) bit to permanently > disable X on a given /dev/sgx page. Building a security model > around that would have been a hack, and it DOES NOT WORK on new > hardware. So can we please stop discussing it? None of the actual > interesting parts of this have much to do with SGX per se and have > nothing whatsoever to do with EDMM or any other Intel buzzword. Just a clarification for everyone sitting in their recliners eating popcorn and following along at home. As I've stated before, I don't argue the potential utility of some new model, SGX however, has hardware characteristics that cannot be waved away in this discussion. The technology was designed to have a cryptographic measurement that controls whether or not the memory image is suitable for execution. The description of that image is defined by the enclave author when the enclave is signed. This is why the current EDMM implementation requires that a maximum aperture range be defined for dynamic memory regions. Since the linear address of each page address in the enclave is encoded into the measurement, enclave initialization will fail unless the loaded memory image is consistent with the wishes of the enclave signer. Having 40 pages of potential heap memory when the author called for 39 would be considered an initialization defect that would be enforced by the hardware. The desired page permissions are also encoded into the enclave measurement. Since the current implementation takes the maximum scoped permissions from the security information encoded in the enclave, it would require that the enclave encode for RWX permissions if the intent was to dynmically load executable or JIT code after the enclave was initialized. If I understand your above analysis correctly, this would be problematic for current security models/practices. Obviously an API could be proposed that allowed this permissable memory map to be defined independently from the enclave. This notion, based on my read of the security risk considerations that went into the design of SGX, would be problematic, since it would allow an untrusted party to modify the characteristics that were desired by the enclave author for the executable image. > Heck, if anyone actually cared to do so, something with essentially > the same semantics could probably be built using SEV hardware > instead of SGX, and it would have exactly the same issue if we > wanted it to work for tasks that didn't have access to /dev/kvm. As I noted above, perhaps whatever this driver may become in the future would benefit from the community creating something like this as a second reference to build an API on top of. > [0] SGX doesn't *really* require this. We could set things up so that > you do mmap(..., MAP_ANONYMOUS, fd, ...) and then somehow introduce > that mapping to SGX. I think the result would be too disgusting to > seriously consider. Let me be clear, I certainly would not advocate doing anything too disgusting to consider. Hopefully our proposal for simplifying the security model for the driver, while still allowing the framework for a still unspecified future pathway, doesn't fit this description. Best wishes for a productive week to everyone. Dr. Greg As always, Dr. Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "Boy, it must not take much to make a phone work. Looking at everthing else here it must be the same way with the INTERNET." -- Francis 'Fritz' Wettstein
On Mon, 16 Nov 2020 12:00:23 -0600, Dr. Greg <greg@enjellic.com> wrote: > On Thu, Nov 12, 2020 at 02:41:00PM -0800, Andy Lutomirski wrote: > > Good morning, I hope the week is starting well for everyone. > >> On Thu, Nov 12, 2020 at 1:31 PM Dave Hansen <dave.hansen@intel.com> >> wrote: >> > >> > On 11/12/20 12:58 PM, Dr. Greg wrote: >> > > @@ -270,11 +270,10 @@ static int sgx_vma_mprotect(struct >> vm_area_struct *vma, >> > > struct vm_area_struct **pprev, unsigned >> long start, >> > > unsigned long end, unsigned long newflags) >> > > { >> > > - int ret; >> > > + struct sgx_encl *encl = vma->vm_private_data; >> > > >> > > - ret = sgx_encl_may_map(vma->vm_private_data, start, end, >> newflags); >> > > - if (ret) >> > > - return ret; >> > > + if ( test_bit(SGX_ENCL_INITIALIZED, &encl->flags) ) >> > > + return -EACCES; >> > > >> > > return mprotect_fixup(vma, pprev, start, end, newflags); >> > > } >> > >> > This rules out mprotect() on running enclaves. Does that break any >> > expectations from enclave authors, or take away capabilities that >> folks >> > need? > >> It certainly prevents any scheme in which an enclave coordinates >> with the outside world to do W-and-then-X JIT inside. I'm also not >> convinced it has any real effect at all unless there's some magic I >> missed to prevent someone from using mmap(2) to effectively change >> permissions. > > The patch that I posted yesterday addresses the security issue for > both mmap and mprotect by trapping the permission change request at > the level of the sgx_encl_may_map() function. > > With respect to the W-and-then-X JIT issue, the stated purpose of the > driver is to implement basic SGX functionality, which is SGX1 > semantics, it has been stated formally for a year by the developers > themselves that they are not entertaining a driver that addresses any > of the issues associated with non-static memory permissions. > The JIT issue is applicable even to SGX1 platforms. We can do EADD with EPCM.RWX in sec_info and with PTE.RW, EINIT, then mprotect to set PTE.RX when JIT is done. Haitao
On Wed, Nov 18, 2020 at 07:39:50PM -0600, Haitao Huang wrote: Good morning, I hope the week is ending well for everyone. > On Mon, 16 Nov 2020 12:00:23 -0600, Dr. Greg <greg@enjellic.com> wrote: > > >On Thu, Nov 12, 2020 at 02:41:00PM -0800, Andy Lutomirski wrote: > >>It certainly prevents any scheme in which an enclave coordinates > >>with the outside world to do W-and-then-X JIT inside. I'm also not > >>convinced it has any real effect at all unless there's some magic I > >>missed to prevent someone from using mmap(2) to effectively change > >>permissions. > > > >The patch that I posted yesterday addresses the security issue for > >both mmap and mprotect by trapping the permission change request at > >the level of the sgx_encl_may_map() function. > > > >With respect to the W-and-then-X JIT issue, the stated purpose of the > >driver is to implement basic SGX functionality, which is SGX1 > >semantics, it has been stated formally for a year by the developers > >themselves that they are not entertaining a driver that addresses any > >of the issues associated with non-static memory permissions. > The JIT issue is applicable even to SGX1 platforms. We can do EADD > with EPCM.RWX in sec_info and with PTE.RW, EINIT, then mprotect to > set PTE.RX when JIT is done. Correct, which further underscores the point that I am trying make, it is unclear what the current mmap/mprotect protection architecture is accomplishing, since it only enforces EPCM permissions. The hardware is perfectly capable of doing so without assistance from the operating system, in fact, the very reason it was designed was to provide protections in the face of an adversarial operating system. For precisely one year, as of next week, we have engaged in a re-design of this driver, driven by the concern that the previous driver allowed execution of code that bypassed the LSM. So I'm assuming the community has concerns about the possibility of anonymous code execution. If this is the case, the mmap/mprotect protection code in the driver should be implementing some type of control over the execution of anonymous memory, which our patch implements, very simply and very understandably. The other straight forward alternative is a knob that tells the driver to disallow the addition of any page that attempts to set EPCM.XW permissions. As always, corrections gladly accepted if our analysis of the driver or how the hardware itself works is incorrect. > Haitao Have a good weekend. Dr. Greg As always, Dr. Greg Wettstein, Ph.D, Worker Autonomously self-defensive Enjellic Systems Development, LLC IOT platforms and edge devices. 4206 N. 19th Ave. Fargo, ND 58102 PH: 701-281-1686 EMAIL: greg@enjellic.com ------------------------------------------------------------------------------ "My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed." -- Edsger W. Dijkstra
diff --git a/include/linux/mm.h b/include/linux/mm.h index ef360fe70aaf..eb38eabc5039 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -559,6 +559,9 @@ struct vm_operations_struct { void (*close)(struct vm_area_struct * area); int (*split)(struct vm_area_struct * area, unsigned long addr); int (*mremap)(struct vm_area_struct * area); + int (*mprotect)(struct vm_area_struct *vma, + struct vm_area_struct **pprev, unsigned long start, + unsigned long end, unsigned long newflags); vm_fault_t (*fault)(struct vm_fault *vmf); vm_fault_t (*huge_fault)(struct vm_fault *vmf, enum page_entry_size pe_size); diff --git a/mm/mprotect.c b/mm/mprotect.c index 56c02beb6041..1fd4fa71ce16 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -616,7 +616,10 @@ static int do_mprotect_pkey(unsigned long start, size_t len, tmp = vma->vm_end; if (tmp > end) tmp = end; - error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); + if (vma->vm_ops && vma->vm_ops->mprotect) + error = vma->vm_ops->mprotect(vma, &prev, nstart, tmp, newflags); + else + error = mprotect_fixup(vma, &prev, nstart, tmp, newflags); if (error) goto out; nstart = tmp;