Message ID | 20200219014433.88424-7-minchan@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | introduce memory hinting API for external process |
On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
>
> From: Oleksandr Natalenko <oleksandr@redhat.com>
>
> Do the very same trick as we already do since 04f5866e41fb. KSM hints
> will require locking mmap_sem for write since they modify vm_flags, so
> for remote KSM hinting this additional check is needed.
>
> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/madvise.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index f6d9b9e66243..c55a18fe71f9 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
>  	if (write) {
>  		if (down_write_killable(&mm->mmap_sem))
>  			return -EINTR;
> +		if (current->mm != mm && !mmget_still_valid(mm))

mmget_still_valid() seems pretty light-weight, so why not just use
that without checking that the mm belongs to the current process
first?

> +			goto skip_mm;
>  	} else {
>  		down_read(&mm->mmap_sem);
>  	}
> @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
>  	}
>  out:
>  	blk_finish_plug(&plug);
> +skip_mm:
>  	if (write)
>  		up_write(&mm->mmap_sem);
>  	else
> --
> 2.25.0.265.gbab2e86ba0-goog
>
Hello.

On Fri, Feb 28, 2020 at 03:19:55PM -0800, Suren Baghdasaryan wrote:
> On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
> >
> > From: Oleksandr Natalenko <oleksandr@redhat.com>
> >
> > Do the very same trick as we already do since 04f5866e41fb. KSM hints
> > will require locking mmap_sem for write since they modify vm_flags, so
> > for remote KSM hinting this additional check is needed.
> >
> > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/madvise.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index f6d9b9e66243..c55a18fe71f9 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> >  	if (write) {
> >  		if (down_write_killable(&mm->mmap_sem))
> >  			return -EINTR;
> > +		if (current->mm != mm && !mmget_still_valid(mm))
>
> mmget_still_valid() seems pretty light-weight, so why not just use
> that without checking that the mm belongs to the current process
> first?

I'd keep the checks separate to a) do not functionally change current->mm
== mm case; b) clearly separate the intention to call
mmget_still_valid() only for remote access (using mmget_still_valid()
for current->mm == mm does not make any sense here, IMO, since there's
no possibility of expecting a core dump at this point); c) ease the job for
reviewer once mmget_still_valid() is scheduled to be removed (I hope it
eventually goes away indeed).

>
> > +			goto skip_mm;
> >  	} else {
> >  		down_read(&mm->mmap_sem);
> >  	}
> > @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> >  	}
> >  out:
> >  	blk_finish_plug(&plug);
> > +skip_mm:
> >  	if (write)
> >  		up_write(&mm->mmap_sem);
> >  	else
> > --
> > 2.25.0.265.gbab2e86ba0-goog
> >
>
On Sun, Mar 1, 2020 at 11:33 PM Oleksandr Natalenko <oleksandr@redhat.com> wrote:
>
> Hello.
>
> On Fri, Feb 28, 2020 at 03:19:55PM -0800, Suren Baghdasaryan wrote:
> > On Tue, Feb 18, 2020 at 5:44 PM Minchan Kim <minchan@kernel.org> wrote:
> > >
> > > From: Oleksandr Natalenko <oleksandr@redhat.com>
> > >
> > > Do the very same trick as we already do since 04f5866e41fb. KSM hints
> > > will require locking mmap_sem for write since they modify vm_flags, so
> > > for remote KSM hinting this additional check is needed.
> > >
> > > Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > >  mm/madvise.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > index f6d9b9e66243..c55a18fe71f9 100644
> > > --- a/mm/madvise.c
> > > +++ b/mm/madvise.c
> > > @@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> > >  	if (write) {
> > >  		if (down_write_killable(&mm->mmap_sem))
> > >  			return -EINTR;
> > > +		if (current->mm != mm && !mmget_still_valid(mm))
> >
> > mmget_still_valid() seems pretty light-weight, so why not just use
> > that without checking that the mm belongs to the current process
> > first?
>
> I'd keep the checks separate to a) do not functionally change current->mm
> == mm case; b) clearly separate the intention to call
> mmget_still_valid() only for remote access (using mmget_still_valid()
> for current->mm == mm does not make any sense here, IMO, since there's
> no possibility of expecting a core dump at this point); c) ease the job for
> reviewer once mmget_still_valid() is scheduled to be removed (I hope it
> eventually goes away indeed).
>

Makes sense. Thanks!

> >
> > > +			goto skip_mm;
> > >  	} else {
> > >  		down_read(&mm->mmap_sem);
> > >  	}
> > > @@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
> > >  	}
> > >  out:
> > >  	blk_finish_plug(&plug);
> > > +skip_mm:
> > >  	if (write)
> > >  		up_write(&mm->mmap_sem);
> > >  	else
> > > --
> > > 2.25.0.265.gbab2e86ba0-goog
> > >
> >
>
> --
> Best regards,
>   Oleksandr Natalenko (post-factum)
>   Principal Software Maintenance Engineer
>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
diff --git a/mm/madvise.c b/mm/madvise.c
index f6d9b9e66243..c55a18fe71f9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1118,6 +1118,8 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 	if (write) {
 		if (down_write_killable(&mm->mmap_sem))
 			return -EINTR;
+		if (current->mm != mm && !mmget_still_valid(mm))
+			goto skip_mm;
 	} else {
 		down_read(&mm->mmap_sem);
 	}
@@ -1169,6 +1171,7 @@ int do_madvise(struct task_struct *target_task, struct mm_struct *mm,
 	}
 out:
 	blk_finish_plug(&plug);
+skip_mm:
 	if (write)
 		up_write(&mm->mmap_sem);
 	else
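
The control flow discussed in the thread can be modeled in user space to show why the new label sits after blk_finish_plug() but before the unlock. This is only a rough sketch, not kernel code: struct mm_model, do_madvise_model() and all of their fields are hypothetical stand-ins, the locks and the plug are plain counters, and still_valid simply stands in for whatever mmget_still_valid(mm) would report. Returning 0 on the skip path is a choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct mm_struct; not kernel code. */
struct mm_model {
	int write_locks;   /* models mmap_sem held for write */
	int read_locks;    /* models mmap_sem held for read */
	int plugs;         /* models an active block plug */
	bool still_valid;  /* models the result of mmget_still_valid(mm) */
	int hints_applied; /* how many times the hint body ran */
};

/* `current_mm` plays the role of current->mm. */
static int do_madvise_model(struct mm_model *current_mm,
			    struct mm_model *mm, bool write)
{
	assert(mm != NULL);

	if (write) {
		mm->write_locks++; /* down_write_killable() succeeded */
		/* The validity check applies to a remote mm only. */
		if (current_mm != mm && !mm->still_valid)
			goto skip_mm;
	} else {
		mm->read_locks++; /* down_read() */
	}

	mm->plugs++;         /* blk_start_plug() */
	mm->hints_applied++; /* the per-VMA hint work */
	mm->plugs--;         /* blk_finish_plug() */
skip_mm:
	/* Reached on both paths, so the lock is always released. */
	if (write)
		mm->write_locks--;
	else
		mm->read_locks--;
	return 0; /* skip-path return value is a choice of this sketch */
}
```

In this model a local call (current_mm == mm) never consults still_valid, which mirrors point a) in the reply above: the current->mm == mm case keeps its existing behavior. A remote call with still_valid == false leaves with the write lock released and without ever having started a plug, which is why the label does not need to sit before blk_finish_plug().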