mm: Check for SIGKILL inside dup_mmap() loop.

Message ID 201804071938.CDE04681.SOFVQJFtMHOOLF@I-love.SAKURA.ne.jp (mailing list archive)
State New, archived

Commit Message

Tetsuo Handa April 7, 2018, 10:38 a.m. UTC
From 31c863e57a4ab7dfb491b2860fe3653e1e8f593b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sat, 7 Apr 2018 19:29:30 +0900
Subject: [PATCH] mm: Check for SIGKILL inside dup_mmap() loop.

As a theoretical problem, an mm_struct with 60000+ vmas can loop while
potentially allocating memory, with mm->mmap_sem held for write by the
current thread. This is bad if the current thread was selected as an OOM
victim, because it will continue allocating from memory reserves while the
OOM reaper is unable to reclaim memory.

As an actually observable problem, it is not difficult to make the OOM
reaper unable to reclaim memory if the OOM victim is blocked at
i_mmap_lock_write() in this loop. Unfortunately, since nobody can explain
whether it is safe to use a killable wait there, let's check for SIGKILL
before trying to allocate memory. Even without an OOM event, there is no
point in continuing the loop from the beginning if the current thread has
been killed.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 kernel/fork.c | 4 ++++
 1 file changed, 4 insertions(+)
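
For readers who want to see the pattern from the commit message in isolation, here is a minimal userspace sketch in C. It is not the kernel code: `struct region`, `kill_requested`, `dup_regions()` and `free_regions()` are hypothetical names standing in for a vma, `fatal_signal_pending(current)`, `dup_mmap()` and the later teardown, and all locking and memory accounting is left out.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma: one node of a singly linked list. */
struct region {
	unsigned long start, end;
	struct region *next;
};

/* Hypothetical stand-in for fatal_signal_pending(current). */
static volatile sig_atomic_t kill_requested;

/* Free a (possibly partial, possibly empty) list of regions. */
static void free_regions(struct region *head)
{
	while (head) {
		struct region *next = head->next;

		free(head);
		head = next;
	}
}

/*
 * Copy the list at src into *dst. Returns 0 on success, -EINTR if a kill
 * was requested, or -ENOMEM if an allocation failed. On failure *dst holds
 * whatever was copied so far and the caller is expected to free it.
 */
static int dup_regions(const struct region *src, struct region **dst)
{
	struct region **tail = dst;
	int retval = 0;

	*dst = NULL;
	for (; src; src = src->next) {
		struct region *copy;

		/* Check before allocating: do not keep copying for a dying task. */
		if (kill_requested) {
			retval = -EINTR;
			goto out;
		}
		copy = malloc(sizeof(*copy));
		if (!copy) {
			retval = -ENOMEM;
			goto out;
		}
		*copy = *src;
		copy->next = NULL;
		*tail = copy;
		tail = &copy->next;
	}
out:
	return retval;
}
```

The check sits at the top of each iteration, before the allocation, so a kill request stops the loop without consuming more memory, and the early exit reuses the same `out` path as an allocation failure.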

Comments

Andrew Morton April 18, 2018, 9:44 p.m. UTC | #1
On Sat, 7 Apr 2018 19:38:28 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> From 31c863e57a4ab7dfb491b2860fe3653e1e8f593b Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sat, 7 Apr 2018 19:29:30 +0900
> Subject: [PATCH] mm: Check for SIGKILL inside dup_mmap() loop.
> 
> As a theoretical problem, an mm_struct with 60000+ vmas can loop with
> potentially allocating memory, with mm->mmap_sem held for write by current
> thread. This is bad if current thread was selected as an OOM victim, for
> current thread will continue allocations using memory reserves while OOM
> reaper is unable to reclaim memory.
> 
> As an actually observable problem, it is not difficult to make OOM reaper
> unable to reclaim memory if the OOM victim is blocked at
> i_mmap_lock_write() in this loop. Unfortunately, since nobody can explain
> whether it is safe to use killable wait there, let's check for SIGKILL
> before trying to allocate memory. Even without an OOM event, there is no
> point with continuing the loop from the beginning if current thread is
> killed.
> 
> ...
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -441,6 +441,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  			continue;
>  		}
>  		charge = 0;
> +		if (fatal_signal_pending(current)) {
> +			retval = -EINTR;
> +			goto out;
> +		}
>  		if (mpnt->vm_flags & VM_ACCOUNT) {
>  			unsigned long len = vma_pages(mpnt);

Seems sane.  Has this been runtime tested?

I would like to see a comment here explaining why we're testing for
this at this particular place.

Andrew Morton April 19, 2018, 2:32 a.m. UTC | #2
On Thu, 19 Apr 2018 10:54:44 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:

> Andrew Morton wrote:
> > On Sat, 7 Apr 2018 19:38:28 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
> > 
> > > From 31c863e57a4ab7dfb491b2860fe3653e1e8f593b Mon Sep 17 00:00:00 2001
> > > From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > > Date: Sat, 7 Apr 2018 19:29:30 +0900
> > > Subject: [PATCH] mm: Check for SIGKILL inside dup_mmap() loop.
> > > 
> > > As a theoretical problem, an mm_struct with 60000+ vmas can loop with
> > > potentially allocating memory, with mm->mmap_sem held for write by current
> > > thread. This is bad if current thread was selected as an OOM victim, for
> > > current thread will continue allocations using memory reserves while OOM
> > > reaper is unable to reclaim memory.
> > > 
> > > As an actually observable problem, it is not difficult to make OOM reaper
> > > unable to reclaim memory if the OOM victim is blocked at
> > > i_mmap_lock_write() in this loop. Unfortunately, since nobody can explain
> > > whether it is safe to use killable wait there, let's check for SIGKILL
> > > before trying to allocate memory. Even without an OOM event, there is no
> > > point with continuing the loop from the beginning if current thread is
> > > killed.
> > > 
> > > ...
> > >
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -441,6 +441,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> > >  			continue;
> > >  		}
> > >  		charge = 0;
> > > +		if (fatal_signal_pending(current)) {
> > > +			retval = -EINTR;
> > > +			goto out;
> > > +		}
> > >  		if (mpnt->vm_flags & VM_ACCOUNT) {
> > >  			unsigned long len = vma_pages(mpnt);
> > 
> > Seems sane.  Has this been runtime tested?
> > 
> 
> Yes, I tested with debug printk(). This patch should be safe
> because we already fail if security_vm_enough_memory_mm() or
> kmem_cache_alloc(GFP_KERNEL) fails and exit_mmap() handles it.
> 
> [  417.030691] ***** Aborting dup_mmap() due to SIGKILL *****
> [  417.036129] ***** Aborting dup_mmap() due to SIGKILL *****
> [  417.044544] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.116445] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.118401] ***** Aborting exit_mmap() due to NULL mmap *****
> [  419.168917] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.169064] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.170913] ***** Aborting exit_mmap() due to NULL mmap *****
> [  419.171411] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.171417] ***** Aborting exit_mmap() due to NULL mmap *****
> [  419.172804] ***** Aborting exit_mmap() due to NULL mmap *****
> [  419.176253] ***** Aborting dup_mmap() due to SIGKILL *****
> [  419.182676] ***** Aborting exit_mmap() due to NULL mmap *****

OK, thanks.

> > I would like to see a comment here explaining why we're testing for
> > this at this particular place.
> > 
> Such comment goes to patch description.

I know.  I'm suggesting that it be in the code itself.  Making people
putz around with git to understand the code is unfriendly.

I'll add something in there tomorrow.
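
The safety argument in the reply above (the loop can already fail partway through, and teardown copes with a partially built or empty mapping list) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct node`, `live_nodes`, `build_list()` and `destroy_list()` are hypothetical names, and the `abort_at` parameter merely simulates a fatal signal arriving at a chosen iteration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
	int value;
	struct node *next;
};

/* Hypothetical counter of live allocations, used only to show nothing leaks. */
static int live_nodes;

/* Tear down whatever was built; an empty (NULL) list is valid input. */
static void destroy_list(struct node *head)
{
	while (head) {
		struct node *next = head->next;

		free(head);
		live_nodes--;
		head = next;
	}
}

/*
 * Build a list of n nodes, aborting with -EINTR once abort_at nodes exist
 * (a negative abort_at means never abort). On any failure the partial list
 * is left in *out for the caller to destroy.
 */
static int build_list(int n, int abort_at, struct node **out)
{
	struct node **tail = out;
	int i;

	*out = NULL;
	for (i = 0; i < n; i++) {
		struct node *nd;

		if (i == abort_at)
			return -EINTR;	/* simulated fatal signal */
		nd = malloc(sizeof(*nd));
		if (!nd)
			return -ENOMEM;	/* failure path that existed anyway */
		live_nodes++;
		nd->value = i;
		nd->next = NULL;
		*tail = nd;
		tail = &nd->next;
	}
	return 0;
}
```

Aborting at iteration 0 leaves an empty list, which corresponds to the "NULL mmap" lines in the debug output quoted above. Aborting later leaves a partial list that the same teardown routine frees.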

Patch

diff --git a/kernel/fork.c b/kernel/fork.c
index 242c8c9..8831bae 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -441,6 +441,10 @@  static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			continue;
 		}
 		charge = 0;
+		if (fatal_signal_pending(current)) {
+			retval = -EINTR;
+			goto out;
+		}
 		if (mpnt->vm_flags & VM_ACCOUNT) {
 			unsigned long len = vma_pages(mpnt);