[32/41] mm: prevent userfaults to be handled under per-vma lock

Message ID: 20230109205336.3665937-33-surenb@google.com
State: New
Series: Per-VMA locks

Commit Message

Suren Baghdasaryan Jan. 9, 2023, 8:53 p.m. UTC
Due to the possibility of handle_userfault dropping mmap_lock, avoid fault
handling under VMA lock and retry holding mmap_lock. This can be handled
more gracefully in the future.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Peter Xu <peterx@redhat.com>
---
 mm/memory.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Jann Horn Jan. 17, 2023, 7:51 p.m. UTC | #1
On Mon, Jan 9, 2023 at 9:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
> Due to the possibility of handle_userfault dropping mmap_lock, avoid fault
> handling under VMA lock and retry holding mmap_lock. This can be handled
> more gracefully in the future.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Suggested-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/memory.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 20806bc8b4eb..12508f4d845a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5273,6 +5273,13 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>         if (!vma->anon_vma)
>                 goto inval;
>
> +       /*
> +        * Due to the possibility of userfault handler dropping mmap_lock, avoid
> +        * it for now and fall back to page fault handling under mmap_lock.
> +        */
> +       if (userfaultfd_armed(vma))
> +               goto inval;

This looks racy wrt concurrent userfaultfd_register(). I think you'll
want to do the userfaultfd_armed(vma) check _after_ locking the VMA,
and ensure that the userfaultfd code write-locks the VMA before
changing the __VM_UFFD_FLAGS in vma->vm_flags.

>         if (!vma_read_trylock(vma))
>                 goto inval;
>
> --
> 2.39.0
>
Jann Horn Jan. 17, 2023, 8:36 p.m. UTC | #2
On Tue, Jan 17, 2023 at 8:51 PM Jann Horn <jannh@google.com> wrote:
> On Mon, Jan 9, 2023 at 9:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > Due to the possibility of handle_userfault dropping mmap_lock, avoid fault
> > handling under VMA lock and retry holding mmap_lock. This can be handled
> > more gracefully in the future.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Suggested-by: Peter Xu <peterx@redhat.com>
> > ---
> >  mm/memory.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 20806bc8b4eb..12508f4d845a 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5273,6 +5273,13 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> >         if (!vma->anon_vma)
> >                 goto inval;
> >
> > +       /*
> > +        * Due to the possibility of userfault handler dropping mmap_lock, avoid
> > +        * it for now and fall back to page fault handling under mmap_lock.
> > +        */
> > +       if (userfaultfd_armed(vma))
> > +               goto inval;
>
> This looks racy wrt concurrent userfaultfd_register(). I think you'll
> want to do the userfaultfd_armed(vma) check _after_ locking the VMA,

I still think this change is needed...

> and ensure that the userfaultfd code write-locks the VMA before
> changing the __VM_UFFD_FLAGS in vma->vm_flags.

Ah, but now I see you already took care of this half of the issue with
the reset_vm_flags() change in
https://lore.kernel.org/linux-mm/20230109205336.3665937-16-surenb@google.com/
.


> >         if (!vma_read_trylock(vma))
> >                 goto inval;
> >
> > --
> > 2.39.0
> >
Suren Baghdasaryan Jan. 17, 2023, 8:57 p.m. UTC | #3
On Tue, Jan 17, 2023 at 12:36 PM Jann Horn <jannh@google.com> wrote:
>
> On Tue, Jan 17, 2023 at 8:51 PM Jann Horn <jannh@google.com> wrote:
> > On Mon, Jan 9, 2023 at 9:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > Due to the possibility of handle_userfault dropping mmap_lock, avoid fault
> > > handling under VMA lock and retry holding mmap_lock. This can be handled
> > > more gracefully in the future.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > Suggested-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  mm/memory.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 20806bc8b4eb..12508f4d845a 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -5273,6 +5273,13 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > >         if (!vma->anon_vma)
> > >                 goto inval;
> > >
> > > +       /*
> > > +        * Due to the possibility of userfault handler dropping mmap_lock, avoid
> > > +        * it for now and fall back to page fault handling under mmap_lock.
> > > +        */
> > > +       if (userfaultfd_armed(vma))
> > > +               goto inval;
> >
> > This looks racy wrt concurrent userfaultfd_register(). I think you'll
> > want to do the userfaultfd_armed(vma) check _after_ locking the VMA,
>
> I still think this change is needed...

Yes, I think you are right. I'll move the check after locking the VMA. Thanks!

>
> > and ensure that the userfaultfd code write-locks the VMA before
> > changing the __VM_UFFD_FLAGS in vma->vm_flags.
>
> Ah, but now I see you already took care of this half of the issue with
> the reset_vm_flags() change in
> https://lore.kernel.org/linux-mm/20230109205336.3665937-16-surenb@google.com/
> .
>
>
> > >         if (!vma_read_trylock(vma))
> > >                 goto inval;
> > >
> > > --
> > > 2.39.0
> > >

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 20806bc8b4eb..12508f4d845a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5273,6 +5273,13 @@  struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma->anon_vma)
 		goto inval;
 
+	/*
+	 * Due to the possibility of userfault handler dropping mmap_lock, avoid
+	 * it for now and fall back to page fault handling under mmap_lock.
+	 */
+	if (userfaultfd_armed(vma))
+		goto inval;
+
 	if (!vma_read_trylock(vma))
 		goto inval;