[RFC,6/9] s390/pci_mmio: fully validate the VMA before calling follow_pte()

Message ID 20210909145945.12192-7-david@redhat.com (mailing list archive)
State New, archived
Series s390: fixes, cleanups and optimizations for page table walkers

Commit Message

David Hildenbrand Sept. 9, 2021, 2:59 p.m. UTC
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
with read mmap_sem in munmap").

find_vma() does not check if the address is >= the VMA start address;
use vma_lookup() instead.

Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/pci/pci_mmio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
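
Illustration only, not part of the patch: the difference between the two
lookups can be sketched in plain userspace C. The struct and the toy_*
helpers below are hypothetical stand-ins (a sorted array instead of the
kernel's real VMA data structure); only the comparison logic is meant to
mirror what the commit message describes, namely that find_vma() returns
the first VMA that ends above the address, while vma_lookup() additionally
rejects a VMA that starts above it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct toy_vma {
	unsigned long vm_start;	/* first address inside the area */
	unsigned long vm_end;	/* first address past the area */
};

/*
 * Models find_vma(): return the first area with addr < vm_end.
 * The areas are assumed to be sorted by address and non-overlapping.
 * The result may start ABOVE addr, i.e. addr can lie in a hole.
 */
static const struct toy_vma *toy_find_vma(const struct toy_vma *vmas,
					  size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/*
 * Models vma_lookup(): the same search plus the vm_start check, so
 * a non-NULL result always contains addr.
 */
static const struct toy_vma *toy_vma_lookup(const struct toy_vma *vmas,
					    size_t n, unsigned long addr)
{
	const struct toy_vma *vma = toy_find_vma(vmas, n, addr);

	if (vma && addr < vma->vm_start)
		vma = NULL;
	return vma;
}
```

With two areas [0x1000, 0x2000) and [0x5000, 0x6000), looking up 0x3000
makes toy_find_vma() return the second area even though the address lies in
the hole between them, while toy_vma_lookup() returns NULL. In the syscalls
touched by this patch, that NULL is what turns an address outside any VMA
into -EINVAL before the page tables are walked.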

Comments

Niklas Schnelle Sept. 10, 2021, 8:22 a.m. UTC | #1
On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> We should not walk/touch page tables outside of VMA boundaries when
> holding only the mmap sem in read mode. Evil user space can modify the
> VMA layout just before this function runs and e.g., trigger races with
> page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> with read mmap_sem in munmap").
> 
> find_vma() does not check if the address is >= the VMA start address;
> use vma_lookup() instead.
> 
> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/pci/pci_mmio.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> index ae683aa623ac..c5b35ea129cf 100644
> --- a/arch/s390/pci/pci_mmio.c
> +++ b/arch/s390/pci/pci_mmio.c
> @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
>  
>  	mmap_read_lock(current->mm);
>  	ret = -EINVAL;
> -	vma = find_vma(current->mm, mmio_addr);
> +	vma = vma_lookup(current->mm, mmio_addr);
>  	if (!vma)
>  		goto out_unlock_mmap;
>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
>  
>  	mmap_read_lock(current->mm);
>  	ret = -EINVAL;
> -	vma = find_vma(current->mm, mmio_addr);
> +	vma = vma_lookup(current->mm, mmio_addr);
>  	if (!vma)
>  		goto out_unlock_mmap;
>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))

Oh wow great find thanks! If I may say so these are not great function
names. Looking at the code vma_lookup() is indeed find_vma() plus the
check that the looked up address is indeed inside the vma.

I think this is pretty independent of the rest of the patches, so do
you want me to apply this patch independently or do you want to wait
for the others?

In any case:

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
David Hildenbrand Sept. 10, 2021, 9:23 a.m. UTC | #2
On 10.09.21 10:22, Niklas Schnelle wrote:
> On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
>> We should not walk/touch page tables outside of VMA boundaries when
>> holding only the mmap sem in read mode. Evil user space can modify the
>> VMA layout just before this function runs and e.g., trigger races with
>> page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
>> with read mmap_sem in munmap").
>>
>> find_vma() does not check if the address is >= the VMA start address;
>> use vma_lookup() instead.
>>
>> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   arch/s390/pci/pci_mmio.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
>> index ae683aa623ac..c5b35ea129cf 100644
>> --- a/arch/s390/pci/pci_mmio.c
>> +++ b/arch/s390/pci/pci_mmio.c
>> @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
>>   
>>   	mmap_read_lock(current->mm);
>>   	ret = -EINVAL;
>> -	vma = find_vma(current->mm, mmio_addr);
>> +	vma = vma_lookup(current->mm, mmio_addr);
>>   	if (!vma)
>>   		goto out_unlock_mmap;
>>   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
>> @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
>>   
>>   	mmap_read_lock(current->mm);
>>   	ret = -EINVAL;
>> -	vma = find_vma(current->mm, mmio_addr);
>> +	vma = vma_lookup(current->mm, mmio_addr);
>>   	if (!vma)
>>   		goto out_unlock_mmap;
>>   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> 
> Oh wow great find thanks! If I may say so these are not great function
> names. Looking at the code vma_lookup() is indeed find_vma() plus the
> check that the looked up address is indeed inside the vma.
> 

IIRC, vma_lookup() was introduced fairly recently. Before that, this 
additional check was open coded (and still is in some instances). It's
confusing, I agree.

> I think this is pretty independent of the rest of the patches, so do
> you want me to apply this patch independently or do you want to wait
> for the others?

Sure, please go ahead and apply independently. It'd be great if you 
could give it a quick sanity test, although I don't expect surprises -- 
unfortunately, the environment I have easily at hand is not very well 
suited (#cpu, #mem, #disk ...) for anything that exceeds basic compile 
tests (and even cross-compiling is significantly faster ...).

> 
> In any case:
> 
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> 

Thanks!
Niklas Schnelle Sept. 10, 2021, 12:48 p.m. UTC | #3
On Fri, 2021-09-10 at 11:23 +0200, David Hildenbrand wrote:
> On 10.09.21 10:22, Niklas Schnelle wrote:
> > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > We should not walk/touch page tables outside of VMA boundaries when
> > > holding only the mmap sem in read mode. Evil user space can modify the
> > > VMA layout just before this function runs and e.g., trigger races with
> > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > with read mmap_sem in munmap").
> > > 
> > > find_vma() does not check if the address is >= the VMA start address;
> > > use vma_lookup() instead.
> > > 
> > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >   arch/s390/pci/pci_mmio.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > index ae683aa623ac..c5b35ea129cf 100644
> > > --- a/arch/s390/pci/pci_mmio.c
> > > +++ b/arch/s390/pci/pci_mmio.c
> > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > >   
> > >   	mmap_read_lock(current->mm);
> > >   	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >   	if (!vma)
> > >   		goto out_unlock_mmap;
> > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > >   
> > >   	mmap_read_lock(current->mm);
> > >   	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >   	if (!vma)
> > >   		goto out_unlock_mmap;
> > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > 
> > Oh wow great find thanks! If I may say so these are not great function
> > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > check that the looked up address is indeed inside the vma.
> > 
> 
> IIRC, vma_lookup() was introduced fairly recently. Before that, this 
> additional check was open coded (and still is in some instances). It's
> confusing, I agree.
> 
> > I think this is pretty independent of the rest of the patches, so do
> > you want me to apply this patch independently or do you want to wait
> > for the others?
> 
> Sure, please go ahead and apply independently. It'd be great if you 
> could give it a quick sanity test, although I don't expect surprises -- 
> unfortunately, the environment I have easily at hand is not very well 
> suited (#cpu, #mem, #disk ...) for anything that exceeds basic compile 
> tests (and even cross-compiling is significantly faster ...).

Yes and even if you had more hardware this code path is only hit by
very specialized workloads doing MMIO access of PCI devices from
userspace. I did test with such a workload (ib_send_bw test utility)
and all looks good.

Applied and will be sent out by Heiko or Vasily as part of the s390
tree.


> 
> > In any case:
> > 
> > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> > 
> 
> Thanks!
>
Liam R. Howlett Sept. 10, 2021, 2:12 p.m. UTC | #4
* David Hildenbrand <david@redhat.com> [210910 05:23]:
> On 10.09.21 10:22, Niklas Schnelle wrote:
> > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > We should not walk/touch page tables outside of VMA boundaries when
> > > holding only the mmap sem in read mode. Evil user space can modify the
> > > VMA layout just before this function runs and e.g., trigger races with
> > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > with read mmap_sem in munmap").
> > > 
> > > find_vma() does not check if the address is >= the VMA start address;
> > > use vma_lookup() instead.
> > > 
> > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >   arch/s390/pci/pci_mmio.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > index ae683aa623ac..c5b35ea129cf 100644
> > > --- a/arch/s390/pci/pci_mmio.c
> > > +++ b/arch/s390/pci/pci_mmio.c
> > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > >   	mmap_read_lock(current->mm);
> > >   	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >   	if (!vma)
> > >   		goto out_unlock_mmap;
> > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > >   	mmap_read_lock(current->mm);
> > >   	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >   	if (!vma)
> > >   		goto out_unlock_mmap;
> > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > 
> > Oh wow great find thanks! If I may say so these are not great function
> > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > check that the looked up address is indeed inside the vma.
> > 
> 
> IIRC, vma_lookup() was introduced fairly recently. Before that, this
> additional check was open coded (and still is in some instances). It's
> confusing, I agree.

This confusion is why I introduced vma_lookup().  My hope is to reduce
the users of find_vma() to only those that actually need the added
functionality, which are mostly in the mm code.

> 
> > I think this is pretty independent of the rest of the patches, so do
> > you want me to apply this patch independently or do you want to wait
> > for the others?
> 
> Sure, please go ahead and apply independently. It'd be great if you could
> give it a quick sanity test, although I don't expect surprises --
> unfortunately, the environment I have easily at hand is not very well suited
> (#cpu, #mem, #disk ...) for anything that exceeds basic compile tests (and
> even cross-compiling is significantly faster ...).
> 
> > 
> > In any case:
> > 
> > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> > 
> 
> Thanks!
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
> 


Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Niklas Schnelle Sept. 10, 2021, 2:31 p.m. UTC | #5
On Fri, 2021-09-10 at 14:12 +0000, Liam Howlett wrote:
> * David Hildenbrand <david@redhat.com> [210910 05:23]:
> > On 10.09.21 10:22, Niklas Schnelle wrote:
> > > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > > We should not walk/touch page tables outside of VMA boundaries when
> > > > holding only the mmap sem in read mode. Evil user space can modify the
> > > > VMA layout just before this function runs and e.g., trigger races with
> > > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > > with read mmap_sem in munmap").
> > > > 
> > > > find_vma() does not check if the address is >= the VMA start address;
> > > > use vma_lookup() instead.
> > > > 
> > > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > ---
> > > >   arch/s390/pci/pci_mmio.c | 4 ++--
> > > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > > index ae683aa623ac..c5b35ea129cf 100644
> > > > --- a/arch/s390/pci/pci_mmio.c
> > > > +++ b/arch/s390/pci/pci_mmio.c
> > > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > > >   	mmap_read_lock(current->mm);
> > > >   	ret = -EINVAL;
> > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > >   	if (!vma)
> > > >   		goto out_unlock_mmap;
> > > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > > >   	mmap_read_lock(current->mm);
> > > >   	ret = -EINVAL;
> > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > >   	if (!vma)
> > > >   		goto out_unlock_mmap;
> > > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > 
> > > Oh wow great find thanks! If I may say so these are not great function
> > > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > > check that the looked up address is indeed inside the vma.
> > > 
> > 
> > IIRC, vma_lookup() was introduced fairly recently. Before that, this
> > additional check was open coded (and still is in some instances). It's
> > confusing, I agree.
> 
> This confusion is why I introduced vma_lookup().  My hope is to reduce
> the users of find_vma() to only those that actually need the added
> functionality, which are mostly in the mm code.

Ah I see, so the confusingly similar names are in the hope of one day
making find_vma() only visible or at least used in the mm code. That
does make more sense then. Thanks for the explanation! Maybe this would
be a good candidate for a treewide change/coccinelle script? Then again
I guess sometimes one really wants find_vma() and it's hard to tell
apart.

> 

..snip..
Liam R. Howlett Sept. 10, 2021, 2:52 p.m. UTC | #6
* Niklas Schnelle <schnelle@linux.ibm.com> [210910 10:31]:
> On Fri, 2021-09-10 at 14:12 +0000, Liam Howlett wrote:
> > * David Hildenbrand <david@redhat.com> [210910 05:23]:
> > > On 10.09.21 10:22, Niklas Schnelle wrote:
> > > > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > > > We should not walk/touch page tables outside of VMA boundaries when
> > > > > holding only the mmap sem in read mode. Evil user space can modify the
> > > > > VMA layout just before this function runs and e.g., trigger races with
> > > > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > > > with read mmap_sem in munmap").
> > > > > 
> > > > > find_vma() does not check if the address is >= the VMA start address;
> > > > > use vma_lookup() instead.
> > > > > 
> > > > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > > ---
> > > > >   arch/s390/pci/pci_mmio.c | 4 ++--
> > > > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > > > index ae683aa623ac..c5b35ea129cf 100644
> > > > > --- a/arch/s390/pci/pci_mmio.c
> > > > > +++ b/arch/s390/pci/pci_mmio.c
> > > > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > > > >   	mmap_read_lock(current->mm);
> > > > >   	ret = -EINVAL;
> > > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > > >   	if (!vma)
> > > > >   		goto out_unlock_mmap;
> > > > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > > > >   	mmap_read_lock(current->mm);
> > > > >   	ret = -EINVAL;
> > > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > > >   	if (!vma)
> > > > >   		goto out_unlock_mmap;
> > > > >   	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > > 
> > > > Oh wow great find thanks! If I may say so these are not great function
> > > > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > > > check that the looked up address is indeed inside the vma.
> > > > 
> > > 
> > > IIRC, vma_lookup() was introduced fairly recently. Before that, this
> > > additional check was open coded (and still is in some instances). It's
> > > confusing, I agree.
> > 
> > This confusion is why I introduced vma_lookup().  My hope is to reduce
> > the users of find_vma() to only those that actually need the added
> > functionality, which are mostly in the mm code.
> 
> Ah I see, so the confusingly similar names are in the hope of one day
> making find_vma() only visible or at least used in the mm code. That
> does make more sense then. Thanks for the explanation! Maybe this would
> be a good candidate for a treewide change/coccinelle script? Then again
> I guess sometimes one really wants find_vma() and it's hard to tell
> apart.
> 

find_vma() does not describe what the code actually does, so I think it
is a good candidate for a tree-wide change.  I'm not sure it would be
popular though.  I couldn't come up with a name that would be worth the
effort.  If the name does change, then find_vma_intersection() should
change as well, and the nommu code also has a find_vma_exact().
Given how far a rename would unravel, I thought it'd be best to try and
clean up the current code and make it less error-prone with a new mm
API.

Patch

diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
index ae683aa623ac..c5b35ea129cf 100644
--- a/arch/s390/pci/pci_mmio.c
+++ b/arch/s390/pci/pci_mmio.c
@@ -159,7 +159,7 @@  SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
 
 	mmap_read_lock(current->mm);
 	ret = -EINVAL;
-	vma = find_vma(current->mm, mmio_addr);
+	vma = vma_lookup(current->mm, mmio_addr);
 	if (!vma)
 		goto out_unlock_mmap;
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
@@ -298,7 +298,7 @@  SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
 
 	mmap_read_lock(current->mm);
 	ret = -EINVAL;
-	vma = find_vma(current->mm, mmio_addr);
+	vma = vma_lookup(current->mm, mmio_addr);
 	if (!vma)
 		goto out_unlock_mmap;
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))