diff mbox series

[v34,10/24] mm: Add vm_ops->mprotect()

Message ID 20200707030204.126021-11-jarkko.sakkinen@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Jarkko Sakkinen July 7, 2020, 3:01 a.m. UTC
From: Sean Christopherson <sean.j.christopherson@intel.com>

Add vm_ops()->mprotect() for additional constraints for a VMA.

Intel Software Guard eXtensions (SGX) will use this callback to add two
constraints:

1. Verify that the address range does not have holes: each page address
   must be filled with an enclave page.
2. Verify that VMA permissions won't surpass the permissions of any enclave
   page within the address range. Enclave cryptographically sealed
   permissions for each page address that set the upper limit for possible
   VMA permissions. Not respecting this can cause #GP's to be emitted.

Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Acked-by: Jethro Beekman <jethro@fortanix.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
---
 include/linux/mm.h |  2 ++
 mm/mprotect.c      | 13 ++++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

Comments

Matthew Wilcox (Oracle) July 7, 2020, 3:14 a.m. UTC | #1
On Tue, Jul 07, 2020 at 06:01:50AM +0300, Jarkko Sakkinen wrote:
> +++ b/mm/mprotect.c
> @@ -603,13 +603,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
>  			goto out;
>  		}
>  
> +		tmp = vma->vm_end;
> +		if (tmp > end)
> +			tmp = end;
> +
>  		error = security_file_mprotect(vma, reqprot, prot);
>  		if (error)
>  			goto out;
>  
> -		tmp = vma->vm_end;
> -		if (tmp > end)
> -			tmp = end;

You don't need to move this any more, right?

> +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> +			if (error)
> +				goto out;
> +		}
> +
>  		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
>  		if (error)
>  			goto out;
> -- 
> 2.25.1
>
Sean Christopherson July 7, 2020, 3:22 a.m. UTC | #2
On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> On Tue, Jul 07, 2020 at 06:01:50AM +0300, Jarkko Sakkinen wrote:
> > +++ b/mm/mprotect.c
> > @@ -603,13 +603,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
> >  			goto out;
> >  		}
> >  
> > +		tmp = vma->vm_end;
> > +		if (tmp > end)
> > +			tmp = end;
> > +
> >  		error = security_file_mprotect(vma, reqprot, prot);
> >  		if (error)
> >  			goto out;
> >  
> > -		tmp = vma->vm_end;
> > -		if (tmp > end)
> > -			tmp = end;
> 
> You don't need to move this any more, right?

Ya, I was typing up a longer version, you beat me to the punch...

The calculation of 'tmp' doesn't need to be moved.  The previous incarnation
only moved it so that that 'tmp would be available for .may_mprotect().

The only reason  .may_mprotect() was placed before security_file_mprotect()
was to avoid triggering LSM errors that would have been blocked by
.may_mprotect(), but that justification is no longer relevant as the
proposed SGX LSM hooks obviously didn't worm their way into this series.

> > +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> > +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > +			if (error)
> > +				goto out;
> > +		}

Based on "... and then the vma owner can do whatever it needs to before
calling mprotect_fixup(), which is already not static", my interpretation
is that Matthew's intent was to do:

		if (vma->vm_ops && vma->vm_ops->mprotect)
			error =  = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
		else
			error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
		if (error)
			goto out;

i.e. make .mprotect() a full replacement as opposed to a prereq hook.
		
> > +
> >  		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
> >  		if (error)
> >  			goto out;
> > -- 
> > 2.25.1
> >
Matthew Wilcox (Oracle) July 7, 2020, 3:24 a.m. UTC | #3
On Mon, Jul 06, 2020 at 08:22:54PM -0700, Sean Christopherson wrote:
> On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> > > +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> > > +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > +			if (error)
> > > +				goto out;
> > > +		}
> 
> Based on "... and then the vma owner can do whatever it needs to before
> calling mprotect_fixup(), which is already not static", my interpretation
> is that Matthew's intent was to do:
> 
> 		if (vma->vm_ops && vma->vm_ops->mprotect)
> 			error =  = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> 		else
> 			error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
> 		if (error)
> 			goto out;
> 
> i.e. make .mprotect() a full replacement as opposed to a prereq hook.

Yes, it was.  I was just looking at the next patch to be sure this was
how I'd been misunderstood.
Jarkko Sakkinen July 7, 2020, 4:01 a.m. UTC | #4
On Tue, Jul 07, 2020 at 04:24:08AM +0100, Matthew Wilcox wrote:
> On Mon, Jul 06, 2020 at 08:22:54PM -0700, Sean Christopherson wrote:
> > On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> > > > +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> > > > +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > > +			if (error)
> > > > +				goto out;
> > > > +		}
> > 
> > Based on "... and then the vma owner can do whatever it needs to before
> > calling mprotect_fixup(), which is already not static", my interpretation
> > is that Matthew's intent was to do:
> > 
> > 		if (vma->vm_ops && vma->vm_ops->mprotect)
> > 			error =  = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > 		else
> > 			error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
> > 		if (error)
> > 			goto out;
> > 
> > i.e. make .mprotect() a full replacement as opposed to a prereq hook.
> 
> Yes, it was.  I was just looking at the next patch to be sure this was
> how I'd been misunderstood.

I'm don't get this part. If mprotect_fixup is called in the tail of the
callback, why it has to be called inside the callback and not be called
after the callback?

The reason I only part did what you requested was to do only the part of
the change that I get. Not to oppose it.

/Jarkko
Jarkko Sakkinen July 7, 2020, 4:03 a.m. UTC | #5
On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> On Tue, Jul 07, 2020 at 06:01:50AM +0300, Jarkko Sakkinen wrote:
> > +++ b/mm/mprotect.c
> > @@ -603,13 +603,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
> >  			goto out;
> >  		}
> >  
> > +		tmp = vma->vm_end;
> > +		if (tmp > end)
> > +			tmp = end;
> > +
> >  		error = security_file_mprotect(vma, reqprot, prot);
> >  		if (error)
> >  			goto out;
> >  
> > -		tmp = vma->vm_end;
> > -		if (tmp > end)
> > -			tmp = end;
> 
> You don't need to move this any more, right?

My bad.

/Jarkko
Matthew Wilcox (Oracle) July 7, 2020, 4:10 a.m. UTC | #6
On Tue, Jul 07, 2020 at 07:01:51AM +0300, Jarkko Sakkinen wrote:
> On Tue, Jul 07, 2020 at 04:24:08AM +0100, Matthew Wilcox wrote:
> > On Mon, Jul 06, 2020 at 08:22:54PM -0700, Sean Christopherson wrote:
> > > On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> > > > > +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> > > > > +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > > > +			if (error)
> > > > > +				goto out;
> > > > > +		}
> > > 
> > > Based on "... and then the vma owner can do whatever it needs to before
> > > calling mprotect_fixup(), which is already not static", my interpretation
> > > is that Matthew's intent was to do:
> > > 
> > > 		if (vma->vm_ops && vma->vm_ops->mprotect)
> > > 			error =  = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > 		else
> > > 			error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
> > > 		if (error)
> > > 			goto out;
> > > 
> > > i.e. make .mprotect() a full replacement as opposed to a prereq hook.
> > 
> > Yes, it was.  I was just looking at the next patch to be sure this was
> > how I'd been misunderstood.
> 
> I'm don't get this part. If mprotect_fixup is called in the tail of the
> callback, why it has to be called inside the callback and not be called
> after the callback?

Because that's how every other VM operation works.  Look at your
implementation of get_unmapped_area() for example.
Jarkko Sakkinen July 8, 2020, 2:33 p.m. UTC | #7
On Tue, Jul 07, 2020 at 05:10:46AM +0100, Matthew Wilcox wrote:
> On Tue, Jul 07, 2020 at 07:01:51AM +0300, Jarkko Sakkinen wrote:
> > On Tue, Jul 07, 2020 at 04:24:08AM +0100, Matthew Wilcox wrote:
> > > On Mon, Jul 06, 2020 at 08:22:54PM -0700, Sean Christopherson wrote:
> > > > On Tue, Jul 07, 2020 at 04:14:24AM +0100, Matthew Wilcox wrote:
> > > > > > +		if (vma->vm_ops && vma->vm_ops->mprotect) {
> > > > > > +			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > > > > +			if (error)
> > > > > > +				goto out;
> > > > > > +		}
> > > > 
> > > > Based on "... and then the vma owner can do whatever it needs to before
> > > > calling mprotect_fixup(), which is already not static", my interpretation
> > > > is that Matthew's intent was to do:
> > > > 
> > > > 		if (vma->vm_ops && vma->vm_ops->mprotect)
> > > > 			error =  = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> > > > 		else
> > > > 			error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
> > > > 		if (error)
> > > > 			goto out;
> > > > 
> > > > i.e. make .mprotect() a full replacement as opposed to a prereq hook.
> > > 
> > > Yes, it was.  I was just looking at the next patch to be sure this was
> > > how I'd been misunderstood.
> > 
> > I'm don't get this part. If mprotect_fixup is called in the tail of the
> > callback, why it has to be called inside the callback and not be called
> > after the callback?
> 
> Because that's how every other VM operation works.  Look at your
> implementation of get_unmapped_area() for example.

I get the point but I don't think that your proposal could work given
that mprotect-callback takes neither 'prev' nor 'newflags' as its
parameters. The current callback has no means to call mprotect_fixup()
properly.

It would have to be extended

	int (*mprotect)(struct vm_area_struct *vma,
			struct vm_area_struct **pprev, unsigned long start,
			unsigned long end, unsigned long prot,
			unsigned long newflags);

Is this what you want?

/Jarkko
Matthew Wilcox (Oracle) July 8, 2020, 2:37 p.m. UTC | #8
On Wed, Jul 08, 2020 at 05:33:20PM +0300, Jarkko Sakkinen wrote:
> I get the point but I don't think that your proposal could work given
> that mprotect-callback takes neither 'prev' nor 'newflags' as its
> parameters. The current callback has no means to call mprotect_fixup()
> properly.
> 
> It would have to be extended
> 
> 	int (*mprotect)(struct vm_area_struct *vma,
> 			struct vm_area_struct **pprev, unsigned long start,
> 			unsigned long end, unsigned long prot,
> 			unsigned long newflags);
> 
> Is this what you want?

https://lore.kernel.org/linux-mm/20200625173050.GF7703@casper.infradead.org/
Jarkko Sakkinen July 8, 2020, 4:10 p.m. UTC | #9
On Wed, Jul 08, 2020 at 03:37:08PM +0100, Matthew Wilcox wrote:
> On Wed, Jul 08, 2020 at 05:33:20PM +0300, Jarkko Sakkinen wrote:
> > I get the point but I don't think that your proposal could work given
> > that mprotect-callback takes neither 'prev' nor 'newflags' as its
> > parameters. The current callback has no means to call mprotect_fixup()
> > properly.
> > 
> > It would have to be extended
> > 
> > 	int (*mprotect)(struct vm_area_struct *vma,
> > 			struct vm_area_struct **pprev, unsigned long start,
> > 			unsigned long end, unsigned long prot,
> > 			unsigned long newflags);
> > 
> > Is this what you want?
> 
> https://lore.kernel.org/linux-mm/20200625173050.GF7703@casper.infradead.org/

Ugh, it's there as it should be. I'm sorry - I just misread the code.

I think that should work, and we do not have to do extra conversion
inside the callback.

There is still one thing that I'm wondering. 'page->vm_max_prot_bits'
contains some an union of subset of {VM_READ, VM_WRITE, VM_EXEC}, and
'newflags' can contain other bits set too.

The old implementation of sgx_vma_mprotect() is like this:

static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start,
			    unsigned long end, unsigned long prot)
{
	return sgx_encl_may_map(vma->vm_private_data, start, end,
				calc_vm_prot_bits(prot, 0));
}

The new one should be probably the implemented along the lines of

static int sgx_vma_mprotect(struct vm_area_struct *vma,
			    struct vm_area_struct **pprev, unsigned long start,
			    unsigned long end, unsigned long newflags)
{
	unsigned long masked_newflags = newflags &
					(VM_READ | VM_WRITE | VM_EXEC);
	int ret;

	ret = sgx_encl_may_map(vma->vm_private_data, start, end,
				   masked_newflags);
	if (ret)
		return ret;

	return mprotect_fixup(vma, pprev, start, end, newflags);
}

Alternatively the filtering can be inside sgx_encl_may_map(). Perhaps
that is a better place for it. This was just easier function to
represent the idea.

/Jarkko
Jarkko Sakkinen July 8, 2020, 10:56 p.m. UTC | #10
On Wed, Jul 08, 2020 at 07:10:27PM +0300, Jarkko Sakkinen wrote:
> On Wed, Jul 08, 2020 at 03:37:08PM +0100, Matthew Wilcox wrote:
> > On Wed, Jul 08, 2020 at 05:33:20PM +0300, Jarkko Sakkinen wrote:
> > > I get the point but I don't think that your proposal could work given
> > > that mprotect-callback takes neither 'prev' nor 'newflags' as its
> > > parameters. The current callback has no means to call mprotect_fixup()
> > > properly.
> > > 
> > > It would have to be extended
> > > 
> > > 	int (*mprotect)(struct vm_area_struct *vma,
> > > 			struct vm_area_struct **pprev, unsigned long start,
> > > 			unsigned long end, unsigned long prot,
> > > 			unsigned long newflags);
> > > 
> > > Is this what you want?
> > 
> > https://lore.kernel.org/linux-mm/20200625173050.GF7703@casper.infradead.org/
> 
> Ugh, it's there as it should be. I'm sorry - I just misread the code.
> 
> I think that should work, and we do not have to do extra conversion
> inside the callback.
> 
> There is still one thing that I'm wondering. 'page->vm_max_prot_bits'
> contains some an union of subset of {VM_READ, VM_WRITE, VM_EXEC}, and
> 'newflags' can contain other bits set too.
> 
> The old implementation of sgx_vma_mprotect() is like this:
> 
> static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start,
> 			    unsigned long end, unsigned long prot)
> {
> 	return sgx_encl_may_map(vma->vm_private_data, start, end,
> 				calc_vm_prot_bits(prot, 0));
> }
> 
> The new one should be probably the implemented along the lines of
> 
> static int sgx_vma_mprotect(struct vm_area_struct *vma,
> 			    struct vm_area_struct **pprev, unsigned long start,
> 			    unsigned long end, unsigned long newflags)
> {
> 	unsigned long masked_newflags = newflags &
> 					(VM_READ | VM_WRITE | VM_EXEC);
> 	int ret;
> 
> 	ret = sgx_encl_may_map(vma->vm_private_data, start, end,
> 				   masked_newflags);
> 	if (ret)
> 		return ret;
> 
> 	return mprotect_fixup(vma, pprev, start, end, newflags);
> }
> 
> Alternatively the filtering can be inside sgx_encl_may_map(). Perhaps
> that is a better place for it. This was just easier function to
> represent the idea.

Turned out that mmap() handler was already masking with RWX. So I
removed masking from there and do it as the first step in
sgx_encl_may_map(), which is called by the both handlers:

int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
		     unsigned long end, unsigned long vm_flags)
{
	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
	unsigned long idx, idx_start, idx_end;
	struct sgx_encl_page *page;

Also renamed the last function parameter from vm_flags to vm_port_bits.
Kind of makes the flow more understandable (i.e. vm_prot_bits is purely
internal representation, not something caller needs to be concerned of).

/Jarkko
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dc7b87310c10..fc0e3ef28873 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -542,6 +542,8 @@  struct vm_operations_struct {
 	void (*close)(struct vm_area_struct * area);
 	int (*split)(struct vm_area_struct * area, unsigned long addr);
 	int (*mremap)(struct vm_area_struct * area);
+	int (*mprotect)(struct vm_area_struct *vma, unsigned long start,
+			unsigned long end, unsigned long prot);
 	vm_fault_t (*fault)(struct vm_fault *vmf);
 	vm_fault_t (*huge_fault)(struct vm_fault *vmf,
 			enum page_entry_size pe_size);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ce8b8a5eacbb..e23dfd8d18bc 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -603,13 +603,20 @@  static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
+		tmp = vma->vm_end;
+		if (tmp > end)
+			tmp = end;
+
 		error = security_file_mprotect(vma, reqprot, prot);
 		if (error)
 			goto out;
 
-		tmp = vma->vm_end;
-		if (tmp > end)
-			tmp = end;
+		if (vma->vm_ops && vma->vm_ops->mprotect) {
+			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
+			if (error)
+				goto out;
+		}
+
 		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
 		if (error)
 			goto out;