[v2,4/6] mm: drop VMA lock before waiting for migration

Message ID	20230609005158.2421285-5-surenb@google.com (mailing list archive)
State	Handled Elsewhere
Headers	show Return-Path: <linux-fsdevel-owner@vger.kernel.org> Date: Thu, 8 Jun 2023 17:51:56 -0700 In-Reply-To: <20230609005158.2421285-1-surenb@google.com> Mime-Version: 1.0 References: <20230609005158.2421285-1-surenb@google.com> Message-ID: <20230609005158.2421285-5-surenb@google.com> Subject: [PATCH v2 4/6] mm: drop VMA lock before waiting for migration From: Suren Baghdasaryan <surenb@google.com> To: akpm@linux-foundation.org Cc: willy@infradead.org, hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com, lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com, peterx@redhat.com, ying.huang@intel.com, david@redhat.com, yuzhao@google.com, dhowells@redhat.com, hughd@google.com, viro@zeniv.linux.org.uk, brauner@kernel.org, pasha.tatashin@soleen.com, surenb@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Per-vma lock support for swap and userfaults \| expand [v2,0/6] Per-vma lock support for swap and userfaults [v2,1/6] swap: remove remnants of polling from read_swap_cache_async [v2,2/6] mm: handle swap page faults under VMA lock if page is uncontended [v2,3/6] mm: add missing VM_FAULT_RESULT_TRACE name for VM_FAULT_COMPLETED [v2,4/6] mm: drop VMA lock before waiting for migration [v2,5/6] mm: implement folio wait under VMA lock [v2,6/6] mm: handle userfaults under VMA lock

Suren Baghdasaryan June 9, 2023, 12:51 a.m. UTC

migration_entry_wait does not need VMA lock, therefore it can be dropped
before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
lock was dropped while in handle_mm_fault().
Note that once VMA lock is dropped, the VMA reference can't be used as
there are no guarantees it was not freed.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 arch/arm64/mm/fault.c    |  3 ++-
 arch/powerpc/mm/fault.c  |  3 ++-
 arch/s390/mm/fault.c     |  3 ++-
 arch/x86/mm/fault.c      |  3 ++-
 include/linux/mm_types.h |  6 +++++-
 mm/memory.c              | 12 ++++++++++--
 6 files changed, 23 insertions(+), 7 deletions(-)

Peter Xu June 9, 2023, 8:42 p.m. UTC | #1

On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> migration_entry_wait does not need VMA lock, therefore it can be dropped
> before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> lock was dropped while in handle_mm_fault().
> Note that once VMA lock is dropped, the VMA reference can't be used as
> there are no guarantees it was not freed.

Then vma lock behaves differently from mmap read lock, am I right?  Can we
still make them match on behaviors, or there's reason not to do so?

One reason is if they match they can reuse existing flags and there'll be
less confusing, e.g. this:

  (fault->flags & FAULT_FLAG_VMA_LOCK) &&
    (vm_fault_ret && (VM_FAULT_RETRY || VM_FAULT_COMPLETE))

can replace the new flag, iiuc.

Thanks,

Suren Baghdasaryan June 9, 2023, 10:30 p.m. UTC | #2

On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > lock was dropped while in handle_mm_fault().
> > Note that once VMA lock is dropped, the VMA reference can't be used as
> > there are no guarantees it was not freed.
>
> Then vma lock behaves differently from mmap read lock, am I right?  Can we
> still make them match on behaviors, or there's reason not to do so?

I think we could match their behavior by also dropping mmap_lock here
when fault is handled under mmap_lock (!(fault->flags &
FAULT_FLAG_VMA_LOCK)).
I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
mmap_lock in do_page_fault(), so indeed, I might be able to use
VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
of reusing existing flags?

>
> One reason is if they match they can reuse existing flags and there'll be
> less confusing, e.g. this:
>
>   (fault->flags & FAULT_FLAG_VMA_LOCK) &&
>     (vm_fault_ret && (VM_FAULT_RETRY || VM_FAULT_COMPLETE))
>
> can replace the new flag, iiuc.
>
> Thanks,
>
> --
> Peter Xu
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

Suren Baghdasaryan June 10, 2023, 1:29 a.m. UTC | #3

On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > lock was dropped while in handle_mm_fault().
> > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > there are no guarantees it was not freed.
> >
> > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > still make them match on behaviors, or there's reason not to do so?
>
> I think we could match their behavior by also dropping mmap_lock here
> when fault is handled under mmap_lock (!(fault->flags &
> FAULT_FLAG_VMA_LOCK)).
> I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> mmap_lock in do_page_fault(), so indeed, I might be able to use
> VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> of reusing existing flags?
Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
above reply.

I took a closer look into using VM_FAULT_COMPLETED instead of
VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
mmap_lock we need to retry with an indication that the per-vma lock
was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
indicate such state seems strange to me ("retry" and "complete" seem
like contradicting concepts to be used in a single result). I could
use VM_FAULT_COMPLETE when releasing mmap_lock since we don't use it
in combination with VM_FAULT_RETRY and (VM_FAULT_RETRY |
VM_FAULT_VMA_UNLOCKED) when dropping per-vma lock and falling back to
mmap_lock. It still requires the new VM_FAULT_VMA_UNLOCKED flag but I
think logically that makes more sense. WDYT?

>
> >
> > One reason is if they match they can reuse existing flags and there'll be
> > less confusing, e.g. this:
> >
> >   (fault->flags & FAULT_FLAG_VMA_LOCK) &&
> >     (vm_fault_ret && (VM_FAULT_RETRY || VM_FAULT_COMPLETE))
> >
> > can replace the new flag, iiuc.
> >
> > Thanks,
> >
> > --
> > Peter Xu
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >

Peter Xu June 12, 2023, 1:28 p.m. UTC | #4

On Fri, Jun 09, 2023 at 03:30:10PM -0700, Suren Baghdasaryan wrote:
> On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > lock was dropped while in handle_mm_fault().
> > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > there are no guarantees it was not freed.
> >
> > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > still make them match on behaviors, or there's reason not to do so?
> 
> I think we could match their behavior by also dropping mmap_lock here
> when fault is handled under mmap_lock (!(fault->flags &
> FAULT_FLAG_VMA_LOCK)).
> I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> mmap_lock in do_page_fault(), so indeed, I might be able to use
> VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> of reusing existing flags?

Yes.

I'd suggest we move this patch out of the series as it's not really part of
it on enabling swap + uffd.  It can be a separate patch and hopefully it'll
always change both vma+mmap lock cases, and with proper reasonings.

Thanks,

Peter Xu June 12, 2023, 1:36 p.m. UTC | #5

On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > lock was dropped while in handle_mm_fault().
> > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > there are no guarantees it was not freed.
> > >
> > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > still make them match on behaviors, or there's reason not to do so?
> >
> > I think we could match their behavior by also dropping mmap_lock here
> > when fault is handled under mmap_lock (!(fault->flags &
> > FAULT_FLAG_VMA_LOCK)).
> > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > of reusing existing flags?
> Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> above reply.
> 
> I took a closer look into using VM_FAULT_COMPLETED instead of
> VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> mmap_lock we need to retry with an indication that the per-vma lock
> was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> indicate such state seems strange to me ("retry" and "complete" seem

Not relevant to this migration patch, but for the whole idea I was thinking
whether it should just work if we simply:

        fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
-       vma_end_read(vma);
+       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+               vma_end_read(vma);

?

GUP may need more caution on NOWAIT, but vma lock is only in fault paths so
IIUC it's fine?

Suren Baghdasaryan June 12, 2023, 3:41 p.m. UTC | #6

On Mon, Jun 12, 2023 at 6:28 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Jun 09, 2023 at 03:30:10PM -0700, Suren Baghdasaryan wrote:
> > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > lock was dropped while in handle_mm_fault().
> > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > there are no guarantees it was not freed.
> > >
> > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > still make them match on behaviors, or there's reason not to do so?
> >
> > I think we could match their behavior by also dropping mmap_lock here
> > when fault is handled under mmap_lock (!(fault->flags &
> > FAULT_FLAG_VMA_LOCK)).
> > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > of reusing existing flags?
>
> Yes.
>
> I'd suggest we move this patch out of the series as it's not really part of
> it on enabling swap + uffd.  It can be a separate patch and hopefully it'll
> always change both vma+mmap lock cases, and with proper reasonings.

Ok, I can move it out with mmap_lock support only and then add per-vma
lock support in my patchset (because this path is still part of
do_swap_page and my patchset enables swap support for per-vma locks).

>
> Thanks,
>
> --
> Peter Xu
>

Suren Baghdasaryan June 12, 2023, 4:07 p.m. UTC | #7

On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > lock was dropped while in handle_mm_fault().
> > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > there are no guarantees it was not freed.
> > > >
> > > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > > still make them match on behaviors, or there's reason not to do so?
> > >
> > > I think we could match their behavior by also dropping mmap_lock here
> > > when fault is handled under mmap_lock (!(fault->flags &
> > > FAULT_FLAG_VMA_LOCK)).
> > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > of reusing existing flags?
> > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > above reply.
> >
> > I took a closer look into using VM_FAULT_COMPLETED instead of
> > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > mmap_lock we need to retry with an indication that the per-vma lock
> > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> > indicate such state seems strange to me ("retry" and "complete" seem
>
> Not relevant to this migration patch, but for the whole idea I was thinking
> whether it should just work if we simply:
>
>         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> -       vma_end_read(vma);
> +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +               vma_end_read(vma);
>
> ?

Today when we can't handle a page fault under per-vma locks we return
VM_FAULT_RETRY, in which case per-vma lock is dropped and the fault is
retried under mmap_lock. The condition you suggest above would not
drop per-vma lock for VM_FAULT_RETRY case and would break the current
fallback mechanism.
However your suggestion gave me an idea. I could indicate that per-vma
lock got dropped using vmf structure (like Matthew suggested before)
and once handle_pte_fault(vmf) returns I could check if it returned
VM_FAULT_RETRY but per-vma lock is still held. If that happens I can
call vma_end_read() before returning from __handle_mm_fault(). That
way any time handle_mm_fault() returns VM_FAULT_RETRY per-vma lock
will be already released, so your condition in do_page_fault() will
work correctly. That would eliminate the need for a new
VM_FAULT_VMA_UNLOCKED flag. WDYT?

>
> GUP may need more caution on NOWAIT, but vma lock is only in fault paths so
> IIUC it's fine?
>
> --
> Peter Xu
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

Peter Xu June 12, 2023, 6:34 p.m. UTC | #8

On Mon, Jun 12, 2023 at 09:07:38AM -0700, Suren Baghdasaryan wrote:
> On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > > lock was dropped while in handle_mm_fault().
> > > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > > there are no guarantees it was not freed.
> > > > >
> > > > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > > > still make them match on behaviors, or there's reason not to do so?
> > > >
> > > > I think we could match their behavior by also dropping mmap_lock here
> > > > when fault is handled under mmap_lock (!(fault->flags &
> > > > FAULT_FLAG_VMA_LOCK)).
> > > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > > of reusing existing flags?
> > > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > > above reply.
> > >
> > > I took a closer look into using VM_FAULT_COMPLETED instead of
> > > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > > mmap_lock we need to retry with an indication that the per-vma lock
> > > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> > > indicate such state seems strange to me ("retry" and "complete" seem
> >
> > Not relevant to this migration patch, but for the whole idea I was thinking
> > whether it should just work if we simply:
> >
> >         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> > -       vma_end_read(vma);
> > +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > +               vma_end_read(vma);
> >
> > ?
> 
> Today when we can't handle a page fault under per-vma locks we return
> VM_FAULT_RETRY, in which case per-vma lock is dropped and the fault is

Oh I see what I missed.  I think it may not be a good idea to reuse
VM_FAULT_RETRY just for that, because it makes VM_FAULT_RETRY even more
complicated - now it adds one more case where the lock is not released,
that's when PER_VMA even if !NOWAIT.

> retried under mmap_lock. The condition you suggest above would not
> drop per-vma lock for VM_FAULT_RETRY case and would break the current
> fallback mechanism.
> However your suggestion gave me an idea. I could indicate that per-vma
> lock got dropped using vmf structure (like Matthew suggested before)
> and once handle_pte_fault(vmf) returns I could check if it returned
> VM_FAULT_RETRY but per-vma lock is still held.
> If that happens I can
> call vma_end_read() before returning from __handle_mm_fault(). That
> way any time handle_mm_fault() returns VM_FAULT_RETRY per-vma lock
> will be already released, so your condition in do_page_fault() will
> work correctly. That would eliminate the need for a new
> VM_FAULT_VMA_UNLOCKED flag. WDYT?

Sounds good.

So probably that's the major pain for now with the legacy fallback (I'll
have commented if I noticed it with the initial vma lock support..).  I
assume that'll go away as long as we recover the VM_FAULT_RETRY semantics
to before.

Suren Baghdasaryan June 12, 2023, 6:44 p.m. UTC | #9

On Mon, Jun 12, 2023 at 11:34 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Jun 12, 2023 at 09:07:38AM -0700, Suren Baghdasaryan wrote:
> > On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > > > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > > >
> > > > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > > > lock was dropped while in handle_mm_fault().
> > > > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > > > there are no guarantees it was not freed.
> > > > > >
> > > > > > Then vma lock behaves differently from mmap read lock, am I right?  Can we
> > > > > > still make them match on behaviors, or there's reason not to do so?
> > > > >
> > > > > I think we could match their behavior by also dropping mmap_lock here
> > > > > when fault is handled under mmap_lock (!(fault->flags &
> > > > > FAULT_FLAG_VMA_LOCK)).
> > > > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > > > of reusing existing flags?
> > > > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > > > above reply.
> > > >
> > > > I took a closer look into using VM_FAULT_COMPLETED instead of
> > > > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > > > mmap_lock we need to retry with an indication that the per-vma lock
> > > > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> > > > indicate such state seems strange to me ("retry" and "complete" seem
> > >
> > > Not relevant to this migration patch, but for the whole idea I was thinking
> > > whether it should just work if we simply:
> > >
> > >         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> > > -       vma_end_read(vma);
> > > +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > > +               vma_end_read(vma);
> > >
> > > ?
> >
> > Today when we can't handle a page fault under per-vma locks we return
> > VM_FAULT_RETRY, in which case per-vma lock is dropped and the fault is
>
> Oh I see what I missed.  I think it may not be a good idea to reuse
> VM_FAULT_RETRY just for that, because it makes VM_FAULT_RETRY even more
> complicated - now it adds one more case where the lock is not released,
> that's when PER_VMA even if !NOWAIT.
>
> > retried under mmap_lock. The condition you suggest above would not
> > drop per-vma lock for VM_FAULT_RETRY case and would break the current
> > fallback mechanism.
> > However your suggestion gave me an idea. I could indicate that per-vma
> > lock got dropped using vmf structure (like Matthew suggested before)
> > and once handle_pte_fault(vmf) returns I could check if it returned
> > VM_FAULT_RETRY but per-vma lock is still held.
> > If that happens I can
> > call vma_end_read() before returning from __handle_mm_fault(). That
> > way any time handle_mm_fault() returns VM_FAULT_RETRY per-vma lock
> > will be already released, so your condition in do_page_fault() will
> > work correctly. That would eliminate the need for a new
> > VM_FAULT_VMA_UNLOCKED flag. WDYT?
>
> Sounds good.
>
> So probably that's the major pain for now with the legacy fallback (I'll
> have commented if I noticed it with the initial vma lock support..).  I
> assume that'll go away as long as we recover the VM_FAULT_RETRY semantics
> to before.

I think so. With that change getting VM_FAULT_RETRY in do_page_fault()
will guarantee that per-vma lock was dropped. Is that what you mean?

>
> --
> Peter Xu
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

Peter Xu June 12, 2023, 6:57 p.m. UTC | #10

On Mon, Jun 12, 2023 at 11:44:33AM -0700, Suren Baghdasaryan wrote:
> I think so. With that change getting VM_FAULT_RETRY in do_page_fault()
> will guarantee that per-vma lock was dropped. Is that what you mean?

Yes, with the newly added "return VM_FAULT_RETRY" in do_swap_page() for
per-vma lock removed.  AFAICT all the rest paths guaranteed that as long as
in the fault paths.

[v2,4/6] mm: drop VMA lock before waiting for migration

Commit Message

Comments

Patch