Message ID | 20210909145945.12192-7-david@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | s390: fixes, cleanups and optimizations for page table walkers |
On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> We should not walk/touch page tables outside of VMA boundaries when
> holding only the mmap sem in read mode. Evil user space can modify the
> VMA layout just before this function runs and e.g., trigger races with
> page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> with read mmap_sem in munmap").
>
> find_vma() does not check if the address is >= the VMA start address;
> use vma_lookup() instead.
>
> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/pci/pci_mmio.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> index ae683aa623ac..c5b35ea129cf 100644
> --- a/arch/s390/pci/pci_mmio.c
> +++ b/arch/s390/pci/pci_mmio.c
> @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
>
>  	mmap_read_lock(current->mm);
>  	ret = -EINVAL;
> -	vma = find_vma(current->mm, mmio_addr);
> +	vma = vma_lookup(current->mm, mmio_addr);
>  	if (!vma)
>  		goto out_unlock_mmap;
>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
>
>  	mmap_read_lock(current->mm);
>  	ret = -EINVAL;
> -	vma = find_vma(current->mm, mmio_addr);
> +	vma = vma_lookup(current->mm, mmio_addr);
>  	if (!vma)
>  		goto out_unlock_mmap;
>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))

Oh wow, great find, thanks! If I may say so, these are not great function
names. Looking at the code, vma_lookup() is indeed find_vma() plus the
check that the looked up address is actually inside the vma.

I think this is pretty independent of the rest of the patches, so do
you want me to apply this patch independently or do you want to wait
for the others?

In any case:

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
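For readers following along, the difference discussed above can be sketched as a small userspace model. This is only an illustration, not kernel code: `struct model_vma`, the `model_vmas` table and both `model_*` helpers are hypothetical stand-ins, and the only assumption is the behaviour described in this thread (find_vma() returns the first VMA ending above the address, vma_lookup() additionally rejects a VMA that starts above it).

```c
#include <stddef.h>

/* Userspace stand-in for a VMA covering [start, end); not the kernel struct. */
struct model_vma {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical address space: two mappings with an unmapped gap in between. */
static const struct model_vma model_vmas[] = {
	{ 0x1000, 0x2000 },
	{ 0x5000, 0x6000 },
};
#define MODEL_NR_VMAS (sizeof(model_vmas) / sizeof(model_vmas[0]))

/* find_vma()-like: first VMA with addr < end; it may start above addr. */
static const struct model_vma *model_find_vma(unsigned long addr)
{
	for (size_t i = 0; i < MODEL_NR_VMAS; i++)
		if (addr < model_vmas[i].end)
			return &model_vmas[i];
	return NULL;
}

/* vma_lookup()-like: same search, but additionally require start <= addr. */
static const struct model_vma *model_vma_lookup(unsigned long addr)
{
	const struct model_vma *vma = model_find_vma(addr);

	if (vma && addr < vma->start)
		return NULL;
	return vma;
}
```

In this model, an address in the gap such as 0x3000 makes model_find_vma() return the second mapping even though the address is not inside it, while model_vma_lookup() returns NULL. That is the case the patch closes for the two MMIO syscalls.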
On 10.09.21 10:22, Niklas Schnelle wrote:
> On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
>> We should not walk/touch page tables outside of VMA boundaries when
>> holding only the mmap sem in read mode. Evil user space can modify the
>> VMA layout just before this function runs and e.g., trigger races with
>> page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
>> with read mmap_sem in munmap").
>>
>> find_vma() does not check if the address is >= the VMA start address;
>> use vma_lookup() instead.
>>
>> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  arch/s390/pci/pci_mmio.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
>> index ae683aa623ac..c5b35ea129cf 100644
>> --- a/arch/s390/pci/pci_mmio.c
>> +++ b/arch/s390/pci/pci_mmio.c
>> @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
>>
>>  	mmap_read_lock(current->mm);
>>  	ret = -EINVAL;
>> -	vma = find_vma(current->mm, mmio_addr);
>> +	vma = vma_lookup(current->mm, mmio_addr);
>>  	if (!vma)
>>  		goto out_unlock_mmap;
>>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
>> @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
>>
>>  	mmap_read_lock(current->mm);
>>  	ret = -EINVAL;
>> -	vma = find_vma(current->mm, mmio_addr);
>> +	vma = vma_lookup(current->mm, mmio_addr);
>>  	if (!vma)
>>  		goto out_unlock_mmap;
>>  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
>
> Oh wow great find thanks! If I may say so these are not great function
> names. Looking at the code vma_lookup() is indeed find_vma() plus the
> check that the looked up address is indeed inside the vma.
>

IIRC, vma_lookup() was introduced fairly recently. Before that, this
additional check was open coded (and still is in some instances). It's
confusing, I agree.

> I think this is pretty independent of the rest of the patches, so do
> you want me to apply this patch independently or do you want to wait
> for the others?

Sure, please go ahead and apply independently. It'd be great if you
could give it a quick sanity test, although I don't expect surprises --
unfortunately, the environment I have easily at hand is not very well
suited (#cpu, #mem, #disk ...) for anything that exceeds basic compile
tests (and even cross-compiling is significantly faster ...).

>
> In any case:
>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
>

Thanks!
On Fri, 2021-09-10 at 11:23 +0200, David Hildenbrand wrote:
> On 10.09.21 10:22, Niklas Schnelle wrote:
> > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > We should not walk/touch page tables outside of VMA boundaries when
> > > holding only the mmap sem in read mode. Evil user space can modify the
> > > VMA layout just before this function runs and e.g., trigger races with
> > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > with read mmap_sem in munmap").
> > >
> > > find_vma() does not check if the address is >= the VMA start address;
> > > use vma_lookup() instead.
> > >
> > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >  arch/s390/pci/pci_mmio.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > index ae683aa623ac..c5b35ea129cf 100644
> > > --- a/arch/s390/pci/pci_mmio.c
> > > +++ b/arch/s390/pci/pci_mmio.c
> > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > >
> > >  	mmap_read_lock(current->mm);
> > >  	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >  	if (!vma)
> > >  		goto out_unlock_mmap;
> > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > >
> > >  	mmap_read_lock(current->mm);
> > >  	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >  	if (!vma)
> > >  		goto out_unlock_mmap;
> > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> >
> > Oh wow great find thanks! If I may say so these are not great function
> > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > check that the looked up address is indeed inside the vma.
> >
>
> IIRC, vma_lookup() was introduced fairly recently. Before that, this
> additional check was open coded (and still are in some instances). It's
> confusing, I agree.
>
> > I think this is pretty independent of the rest of the patches, so do
> > you want me to apply this patch independently or do you want to wait
> > for the others?
>
> Sure, please go ahead and apply independently. It'd be great if you
> could give it a quick sanity test, although I don't expect surprises --
> unfortunately, the environment I have easily at hand is not very well
> suited (#cpu, #mem, #disk ...) for anything that exceeds basic compile
> tests (and even cross-compiling is significantly faster ...).

Yes, and even if you had more hardware, this code path is only hit by
very specialized workloads doing MMIO access of PCI devices from
userspace. I did test with such a workload (ib_send_bw test utility)
and all looks good.

Applied and will be sent out by Heiko or Vasily as part of the s390
tree.

> >
> > In any case:
> >
> > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> >
>
> Thanks!
>
* David Hildenbrand <david@redhat.com> [210910 05:23]:
> On 10.09.21 10:22, Niklas Schnelle wrote:
> > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > We should not walk/touch page tables outside of VMA boundaries when
> > > holding only the mmap sem in read mode. Evil user space can modify the
> > > VMA layout just before this function runs and e.g., trigger races with
> > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > with read mmap_sem in munmap").
> > >
> > > find_vma() does not check if the address is >= the VMA start address;
> > > use vma_lookup() instead.
> > >
> > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >  arch/s390/pci/pci_mmio.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > index ae683aa623ac..c5b35ea129cf 100644
> > > --- a/arch/s390/pci/pci_mmio.c
> > > +++ b/arch/s390/pci/pci_mmio.c
> > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > >
> > >  	mmap_read_lock(current->mm);
> > >  	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >  	if (!vma)
> > >  		goto out_unlock_mmap;
> > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > >
> > >  	mmap_read_lock(current->mm);
> > >  	ret = -EINVAL;
> > > -	vma = find_vma(current->mm, mmio_addr);
> > > +	vma = vma_lookup(current->mm, mmio_addr);
> > >  	if (!vma)
> > >  		goto out_unlock_mmap;
> > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> >
> > Oh wow great find thanks! If I may say so these are not great function
> > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > check that the looked up address is indeed inside the vma.
> >
>
> IIRC, vma_lookup() was introduced fairly recently. Before that, this
> additional check was open coded (and still are in some instances). It's
> confusing, I agree.

This confusion is why I introduced vma_lookup(). My hope is to reduce
the users of find_vma() to only those that actually need the added
functionality, which are mostly in the mm code.

>
> > I think this is pretty independent of the rest of the patches, so do
> > you want me to apply this patch independently or do you want to wait
> > for the others?
>
> Sure, please go ahead and apply independently. It'd be great if you could
> give it a quick sanity test, although I don't expect surprises --
> unfortunately, the environment I have easily at hand is not very well suited
> (#cpu, #mem, #disk ...) for anything that exceeds basic compile tests (and
> even cross-compiling is significantly faster ...).
>
> >
> > In any case:
> >
> > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> >
>
> Thanks!
>
> --
> Thanks,
>
> David / dhildenb
>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
On Fri, 2021-09-10 at 14:12 +0000, Liam Howlett wrote:
> * David Hildenbrand <david@redhat.com> [210910 05:23]:
> > On 10.09.21 10:22, Niklas Schnelle wrote:
> > > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > > We should not walk/touch page tables outside of VMA boundaries when
> > > > holding only the mmap sem in read mode. Evil user space can modify the
> > > > VMA layout just before this function runs and e.g., trigger races with
> > > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > > with read mmap_sem in munmap").
> > > >
> > > > find_vma() does not check if the address is >= the VMA start address;
> > > > use vma_lookup() instead.
> > > >
> > > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > ---
> > > >  arch/s390/pci/pci_mmio.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > > index ae683aa623ac..c5b35ea129cf 100644
> > > > --- a/arch/s390/pci/pci_mmio.c
> > > > +++ b/arch/s390/pci/pci_mmio.c
> > > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > > >
> > > >  	mmap_read_lock(current->mm);
> > > >  	ret = -EINVAL;
> > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > >  	if (!vma)
> > > >  		goto out_unlock_mmap;
> > > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > > >
> > > >  	mmap_read_lock(current->mm);
> > > >  	ret = -EINVAL;
> > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > >  	if (!vma)
> > > >  		goto out_unlock_mmap;
> > > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > >
> > > Oh wow great find thanks! If I may say so these are not great function
> > > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > > check that the looked up address is indeed inside the vma.
> > >
> >
> > IIRC, vma_lookup() was introduced fairly recently. Before that, this
> > additional check was open coded (and still are in some instances). It's
> > confusing, I agree.
>
> This confusion is why I introduced vma_lookup(). My hope is to reduce
> the users of find_vma() to only those that actually need the added
> functionality, which are mostly in the mm code.

Ah I see, so the confusingly similar names are in the hope of one day
making find_vma() only visible, or at least only used, in the mm code.
That does make more sense then. Thanks for the explanation! Maybe this
would be a good candidate for a treewide change/coccinelle script? Then
again, I guess sometimes one really wants find_vma() and the two cases
are hard to tell apart.

> ..snip..
* Niklas Schnelle <schnelle@linux.ibm.com> [210910 10:31]:
> On Fri, 2021-09-10 at 14:12 +0000, Liam Howlett wrote:
> > * David Hildenbrand <david@redhat.com> [210910 05:23]:
> > > On 10.09.21 10:22, Niklas Schnelle wrote:
> > > > On Thu, 2021-09-09 at 16:59 +0200, David Hildenbrand wrote:
> > > > > We should not walk/touch page tables outside of VMA boundaries when
> > > > > holding only the mmap sem in read mode. Evil user space can modify the
> > > > > VMA layout just before this function runs and e.g., trigger races with
> > > > > page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
> > > > > with read mmap_sem in munmap").
> > > > >
> > > > > find_vma() does not check if the address is >= the VMA start address;
> > > > > use vma_lookup() instead.
> > > > >
> > > > > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> > > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > > ---
> > > > >  arch/s390/pci/pci_mmio.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
> > > > > index ae683aa623ac..c5b35ea129cf 100644
> > > > > --- a/arch/s390/pci/pci_mmio.c
> > > > > +++ b/arch/s390/pci/pci_mmio.c
> > > > > @@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
> > > > >
> > > > >  	mmap_read_lock(current->mm);
> > > > >  	ret = -EINVAL;
> > > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > > >  	if (!vma)
> > > > >  		goto out_unlock_mmap;
> > > > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > > > @@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
> > > > >
> > > > >  	mmap_read_lock(current->mm);
> > > > >  	ret = -EINVAL;
> > > > > -	vma = find_vma(current->mm, mmio_addr);
> > > > > +	vma = vma_lookup(current->mm, mmio_addr);
> > > > >  	if (!vma)
> > > > >  		goto out_unlock_mmap;
> > > > >  	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> > > >
> > > > Oh wow great find thanks! If I may say so these are not great function
> > > > names. Looking at the code vma_lookup() is indeed find_vma() plus the
> > > > check that the looked up address is indeed inside the vma.
> > > >
> > >
> > > IIRC, vma_lookup() was introduced fairly recently. Before that, this
> > > additional check was open coded (and still are in some instances). It's
> > > confusing, I agree.
> >
> > This confusion is why I introduced vma_lookup(). My hope is to reduce
> > the users of find_vma() to only those that actually need the added
> > functionality, which are mostly in the mm code.
>
> Ah I see, so the confusingly similar names are in hope of one day
> making find_vma() only visible or at least used in the mm code. That
> does make more sense then. Thanks for the explanation! Maybe this would
> be a good candidate for a treewide change/coccinelle script? Then again
> I guess sometimes one really wants find_vma() and it's hard to tell
> apart.
>

find_vma() does not describe what the code actually does, so I think it
is a good candidate for a tree wide change. I'm not sure it would be
popular though. I couldn't come up with a name that would be worth the
effort. If the name does change, then find_vma_intersection() should
change as well; nommu code also has a find_vma_exact().

Given the unraveling of a rename, I thought it'd be best to try and
clean up the current code and make it less error-prone with a new mm
API.
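As a side note on the find_vma_intersection() helper mentioned above, it can be modelled as another thin wrapper around the same search, which may help explain why a rename would have to touch it too. The sketch below is a userspace illustration only: `struct model_vma`, `model_vmas` and the `model_*` functions are hypothetical names, and it assumes the helper's range semantics are "first VMA overlapping [start, end)".

```c
#include <stddef.h>

/* Userspace stand-in for a VMA covering [start, end); not the kernel struct. */
struct model_vma {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical address space: two mappings with an unmapped gap in between. */
static const struct model_vma model_vmas[] = {
	{ 0x1000, 0x2000 },
	{ 0x5000, 0x6000 },
};
#define MODEL_NR_VMAS (sizeof(model_vmas) / sizeof(model_vmas[0]))

/* find_vma()-like: first VMA with addr < end; it may start above addr. */
static const struct model_vma *model_find_vma(unsigned long addr)
{
	for (size_t i = 0; i < MODEL_NR_VMAS; i++)
		if (addr < model_vmas[i].end)
			return &model_vmas[i];
	return NULL;
}

/*
 * find_vma_intersection()-like: reuse the same search, then reject a VMA
 * that only begins at or after the end of the requested range.
 */
static const struct model_vma *
model_find_vma_intersection(unsigned long start, unsigned long end)
{
	const struct model_vma *vma = model_find_vma(start);

	if (vma && end <= vma->start)
		return NULL;
	return vma;
}
```

In this model, the range [0x2000, 0x5000) lies entirely in the gap and yields NULL, while [0x2000, 0x5001) touches the second mapping and returns it. Both wrappers differ from the plain search only in the extra bounds check applied to its result.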
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
with read mmap_sem in munmap").

find_vma() does not check if the address is >= the VMA start address;
use vma_lookup() instead.

Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/s390/pci/pci_mmio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
index ae683aa623ac..c5b35ea129cf 100644
--- a/arch/s390/pci/pci_mmio.c
+++ b/arch/s390/pci/pci_mmio.c
@@ -159,7 +159,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
 
 	mmap_read_lock(current->mm);
 	ret = -EINVAL;
-	vma = find_vma(current->mm, mmio_addr);
+	vma = vma_lookup(current->mm, mmio_addr);
 	if (!vma)
 		goto out_unlock_mmap;
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
@@ -298,7 +298,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
 
 	mmap_read_lock(current->mm);
 	ret = -EINVAL;
-	vma = find_vma(current->mm, mmio_addr);
+	vma = vma_lookup(current->mm, mmio_addr);
 	if (!vma)
 		goto out_unlock_mmap;
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))