diff mbox series

[v6,6/7] mm/madvise: employ mmget_still_valid for write lock

Message ID 20200219014433.88424-7-minchan@kernel.org (mailing list archive)
State New, archived
Series introduce memory hinting API for external process

Commit Message

Minchan Kim Feb. 19, 2020, 1:44 a.m. UTC
From: Oleksandr Natalenko <oleksandr@redhat.com>

Do the very same trick as we already do since 04f5866e41fb. KSM hints
will require locking mmap_sem for write since they modify vm_flags, so
for remote KSM hinting this additional check is needed.

Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/madvise.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Suren Baghdasaryan Feb. 28, 2020, 11:19 p.m. UTC | #1
On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
>
> From: Oleksandr Natalenko <oleksandr@redhat.com>
>
> Do the very same trick as we already do since 04f5866e41fb. KSM hints
> will require locking mmap_sem for write since they modify vm_flags, so
> for remote KSM hinting this additional check is needed.
>
> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/madvise.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index f6d9b9e66243..c55a18fe71f9 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
>         if (write) {
>                 if (down_write_killable(&mm->mmap_sem))
>                         return -EINTR;
> +               if (current->mm != mm && !mmget_still_valid(mm))

mmget_still_valid() seems pretty light-weight, so why not just use
that without checking that the mm belongs to the current process
first?

> +                       goto skip_mm;
>         } else {
>                 down_read(&mm->mmap_sem);
>         }
> @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
>         }
>  out:
>         blk_finish_plug(&plug);
> +skip_mm:
>         if (write)
>                 up_write(&mm->mmap_sem);
>         else
> --
> 2.25.0.265.gbab2e86ba0-goog
>
Oleksandr Natalenko March 2, 2020, 7:33 a.m. UTC | #2
Hello.

On Fri, Feb 28, 2020 at 03:19:55PM -0800, Suren Baghdasaryan wrote:
> On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
> >
> > From: Oleksandr Natalenko <oleksandr@redhat.com>
> >
> > Do the very same trick as we already do since 04f5866e41fb. KSM hints
> > will require locking mmap_sem for write since they modify vm_flags, so
> > for remote KSM hinting this additional check is needed.
> >
> > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/madvise.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index f6d9b9e66243..c55a18fe71f9 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> >         if (write) {
> >                 if (down_write_killable(&mm->mmap_sem))
> >                         return -EINTR;
> > +               if (current->mm != mm && !mmget_still_valid(mm))
> 
> mmget_still_valid() seems pretty light-weight, so why not just use
> that without checking that the mm belongs to the current process
> first?

I'd keep the checks separate in order to a) not functionally change the
current->mm == mm case; b) clearly express the intention to call
mmget_still_valid() only for remote access (using mmget_still_valid()
for current->mm == mm does not make any sense here, IMO, since there's
no possibility of a core dump at this point); c) ease the job for the
reviewer once mmget_still_valid() is scheduled to be removed (I hope it
eventually goes away indeed).
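
To make the three points above concrete, here is a minimal userspace sketch of the guard in C. It is not kernel code: the structs are cut-down stand-ins, must_skip_mm() is a hypothetical name used only for illustration, and mmget_still_valid() is modelled on the assumption that the kernel helper simply tests whether mm->core_state is unset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct core_state { int placeholder; };
struct mm_struct { struct core_state *core_state; };
struct task_struct { struct mm_struct *mm; };

/* Assumed model of the kernel helper: valid while no core dump has started. */
bool mmget_still_valid(const struct mm_struct *mm)
{
	return mm->core_state == NULL;
}

/*
 * Hypothetical helper naming the condition added by the patch:
 * skip only when the mm is remote AND a core dump is in progress.
 * The local (current->mm == mm) case never consults mmget_still_valid().
 */
bool must_skip_mm(const struct task_struct *cur, const struct mm_struct *mm)
{
	return cur->mm != mm && !mmget_still_valid(mm);
}
```

With this shape the local path behaves exactly as before the patch, which is point a), and dropping mmget_still_valid() later only touches the remote half of the condition, which is point c).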

> 
> > +                       goto skip_mm;
> >         } else {
> >                 down_read(&mm->mmap_sem);
> >         }
> > @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> >         }
> >  out:
> >         blk_finish_plug(&plug);
> > +skip_mm:
> >         if (write)
> >                 up_write(&mm->mmap_sem);
> >         else
> > --
> > 2.25.0.265.gbab2e86ba0-goog
> >
>
Suren Baghdasaryan March 2, 2020, 4:32 p.m. UTC | #3
On Sun, Mar 1, 2020 at 11:33 PM Oleksandr Natalenko
<oleksandr@redhat.com> wrote:
>
> Hello.
>
> On Fri, Feb 28, 2020 at 03:19:55PM -0800, Suren Baghdasaryan wrote:
> > On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
> > >
> > > From: Oleksandr Natalenko <oleksandr@redhat.com>
> > >
> > > Do the very same trick as we already do since 04f5866e41fb. KSM hints
> > > will require locking mmap_sem for write since they modify vm_flags, so
> > > for remote KSM hinting this additional check is needed.
> > >
> > > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > >  mm/madvise.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > index f6d9b9e66243..c55a18fe71f9 100644
> > > --- a/mm/madvise.c
> > > +++ b/mm/madvise.c
> > > @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> > >         if (write) {
> > >                 if (down_write_killable(&mm->mmap_sem))
> > >                         return -EINTR;
> > > +               if (current->mm != mm && !mmget_still_valid(mm))
> >
> > mmget_still_valid() seems pretty light-weight, so why not just use
> > that without checking that the mm belongs to the current process
> > first?
>
> I'd keep the checks separate in order to a) not functionally change the
> current->mm == mm case; b) clearly express the intention to call
> mmget_still_valid() only for remote access (using mmget_still_valid()
> for current->mm == mm does not make any sense here, IMO, since there's
> no possibility of a core dump at this point); c) ease the job for the
> reviewer once mmget_still_valid() is scheduled to be removed (I hope it
> eventually goes away indeed).
>

Makes sense. Thanks!

> >
> > > +                       goto skip_mm;
> > >         } else {
> > >                 down_read(&mm->mmap_sem);
> > >         }
> > > @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> > >         }
> > >  out:
> > >         blk_finish_plug(&plug);
> > > +skip_mm:
> > >         if (write)
> > >                 up_write(&mm->mmap_sem);
> > >         else
> > > --
> > > 2.25.0.265.gbab2e86ba0-goog
> > >
> >
>
> --
>   Best regards,
>     Oleksandr Natalenko (post-factum)
>     Principal Software Maintenance Engineer
>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Patch

diff --git a/mm/madvise.c b/mm/madvise.c
index f6d9b9e66243..c55a18fe71f9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1118,6 +1118,8 @@  int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 	if (write) {
 		if (down_write_killable(&mm->mmap_sem))
 			return -EINTR;
+		if (current->mm != mm && !mmget_still_valid(mm))
+			goto skip_mm;
 	} else {
 		down_read(&mm->mmap_sem);
 	}
@@ -1169,6 +1171,7 @@  int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 	}
 out:
 	blk_finish_plug(&plug);
+skip_mm:
 	if (write)
 		up_write(&mm->mmap_sem);
 	else
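
The placement of the new skip_mm label can be illustrated with a small userspace model in C. This is only a sketch under stated assumptions: struct mm_model, its counters and do_madvise_model() are hypothetical names, down_write_killable() is assumed to succeed, the block plug is assumed to be started only after the lock is taken, and the return value of do_madvise() is not modelled.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping model of an mm; not the kernel's mm_struct. */
struct mm_model {
	bool write_locked;   /* models mmap_sem held for write */
	int readers;         /* models mmap_sem held for read */
	bool core_dumping;   /* models !mmget_still_valid(mm) */
	int advice_applied;  /* how many times the advice body ran */
	int plugs_finished;  /* how many times blk_finish_plug() would run */
};

void do_madvise_model(struct mm_model *current_mm, struct mm_model *mm,
		      bool write)
{
	if (write) {
		mm->write_locked = true;  /* down_write_killable() assumed OK */
		if (current_mm != mm && mm->core_dumping)
			goto skip_mm;     /* remote mm is dumping core */
	} else {
		mm->readers++;            /* down_read() */
	}

	/* blk_start_plug(), the VMA walk and the advice itself go here. */
	mm->advice_applied++;

	/* out: */
	mm->plugs_finished++;             /* blk_finish_plug() */
skip_mm:
	/* Reached on both paths, so the lock is always released. */
	if (write)
		mm->write_locked = false; /* up_write() */
	else
		mm->readers--;            /* up_read() */
}
```

In this model the skip path bypasses both the advice body and the plug teardown but still drops the write lock, which is why the label sits after blk_finish_plug() and before the unlock.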