[3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

Message ID: 20141106130543.GE7202@dhcp22.suse.cz (mailing list archive)
State: Not Applicable, archived

Commit Message

Michal Hocko Nov. 6, 2014, 1:05 p.m. UTC
On Wed 05-11-14 12:01:11, Tejun Heo wrote:
> On Wed, Nov 05, 2014 at 11:54:28AM -0500, Tejun Heo wrote:
> > > Still not following. How do you want to detect an on-going OOM without
> > > any interface around out_of_memory?
> > 
> > I thought you were using oom_killer_allowed_start() outside OOM path.
> > Ugh.... why is everything weirdly structured?  oom_killer_disabled
> > implies that oom killer may fail, right?  Why is
> > __alloc_pages_slowpath() checking it directly?  If whether oom killing
> > failed or not is relevant to its users, make out_of_memory() return an
> > error code.  There's no reason for the exclusion detail to leak out of
> > the oom killer proper.  The only interface should be disable/enable
> > and whether oom killing failed or not.
> 
> And what's implemented is wrong.  What happens if oom killing is
> already in progress and then a task blocks trying to write-lock the
> rwsem and then that task is selected as the OOM victim?

But this is nothing new. Suspend hasn't been checking for fatal signals
nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose
even before.

This is not harmful though. The previous OOM kill attempt would kick the
current task, mark it with TIF_MEMDIE and retry the allocation. After
OOM is disabled the allocation simply fails. The current task will then
die on its way out of the kernel. Definitely worth fixing, but in a
separate patch.
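
A rough sketch of the control flow being described (hedged: simplified
signatures, oom_kill_and_retry() is a made-up stand-in for the
__alloc_pages_may_oom() path, and the loop only loosely follows
mm/page_alloc.c):

/*
 * Slowpath retry loop, schematically: an OOM kill marks the victim
 * (possibly current) with TIF_MEMDIE and the allocation is retried;
 * once the OOM killer is disabled the loop fails the allocation
 * instead, and an already killed task dies on its way out of the
 * kernel.
 */
static struct page *slowpath_sketch(gfp_t gfp_mask, unsigned int order)
{
	struct page *page;

	for (;;) {
		page = get_page_from_freelist(gfp_mask, order);
		if (page)
			return page;
		if (oom_killer_disabled)
			return NULL;	/* fail rather than OOM kill */
		oom_kill_and_retry(gfp_mask, order);	/* may pick current */
	}
}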

> disable() call must be able to fail.

This would be a way to do it without requiring the caller to check for
TIF_MEMDIE explicitly. The fewer such checks we have, the better.
---
From 3a7e18144a369bfc537c1cda4c7c2c548e9114b8 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Thu, 6 Nov 2014 11:51:34 +0100
Subject: [PATCH] OOM, PM: handle pm freezer as an OOM victim correctly

The PM freezer doesn't check whether it has been killed by the OOM
killer after it disables the OOM killer, which means that it continues
with the suspend even though it should die as soon as possible. This
has been the case ever since PM suspend started disabling the OOM
killer, and I suppose OOM was ignored even before.

This is not harmful though. The allocation which triggered the OOM
will be retried after a process is killed, and the next attempt will
fail because the OOM killer is disabled by then, so there is no risk
of an endless loop even though the OOM victim doesn't die.

But this is a correctness issue because no task should ignore OOM.
As suggested by Tejun, oom_killer_disable now returns a success status.
If the current task has fatal signals pending or TIF_MEMDIE set after
oom_sem is taken, the caller should bail out, and that is what
freeze_processes does with this patch.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/oom.h    |  4 +++-
 kernel/power/process.c | 16 ++++++++++------
 mm/oom_kill.c          | 12 +++++++++++-
 3 files changed, 24 insertions(+), 8 deletions(-)

Comments

Tejun Heo Nov. 6, 2014, 3:09 p.m. UTC | #1
On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote:
> But this is nothing new. Suspend hasn't been checking for fatal signals
> nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose
> even before.
> 
> This is not harmful though. The previous OOM kill attempt would kick the
> current task, mark it with TIF_MEMDIE and retry the allocation. After
> OOM is disabled the allocation simply fails. The current task will then
> die on its way out of the kernel. Definitely worth fixing, but in a
> separate patch.

Hah?  Isn't this a new outright A-B B-A deadlock involving the rwsem
you added?

> > disable() call must be able to fail.
> 
> This would be a way to do it without requiring the caller to check for
> TIF_MEMDIE explicitly. The fewer such checks we have, the better.

Why the hell would the caller ever even KNOW about this?  This is
something which must be encapsulated in the OOM killer disable/enable
interface.

> +bool oom_killer_disable(void)
>  {
> +	bool ret = true;
> +
>  	down_write(&oom_sem);

How would this task pass the above down_write() if the OOM killer is
already read locking oom_sem?  Or is the OOM killer guaranteed to make
forward progress even if the killed task can't make forward progress?
But, if so, what are we talking about in this thread?

> +
> +	/* We might have been killed while waiting for the oom_sem. */
> +	if (fatal_signal_pending(current) || test_thread_flag(TIF_MEMDIE)) {
> +		up_write(&oom_sem);
> +		ret = false;
> +	}

This is pointless.  What does the above do?
Michal Hocko Nov. 6, 2014, 4:01 p.m. UTC | #2
On Thu 06-11-14 10:09:27, Tejun Heo wrote:
> On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote:
> > But this is nothing new. Suspend hasn't been checking for fatal signals
> > nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose
> > even before.
> > 
> > This is not harmful though. The previous OOM kill attempt would kick the
> > current task, mark it with TIF_MEMDIE and retry the allocation. After
> > OOM is disabled the allocation simply fails. The current task will then
> > die on its way out of the kernel. Definitely worth fixing, but in a
> > separate patch.
> 
> Hah?  Isn't this a new outright A-B B-A deadlock involving the rwsem
> you added?

No, see below.
 
> > > disable() call must be able to fail.
> > 
> > This would be a way to do it without requiring the caller to check for
> > TIF_MEMDIE explicitly. The fewer such checks we have, the better.
> 
> Why the hell would the caller ever even KNOW about this?  This is
> something which must be encapsulated in the OOM killer disable/enable
> interface.
> 
> > +bool oom_killer_disable(void)
> >  {
> > +	bool ret = true;
> > +
> >  	down_write(&oom_sem);
> 
> How would this task pass the above down_write() if the OOM killer is
> already read locking oom_sem?  Or is the OOM killer guaranteed to make
> forward progress even if the killed task can't make forward progress?
> But, if so, what are we talking about in this thread?

Yes, the OOM killer simply kicks the process, sets TIF_MEMDIE and
terminates. That releases the read lock, allows this path to take the
write lock and check whether the current task has been killed, all
without races. The OOM killer doesn't wait for the killed task. The
allocation is retried.
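
Schematically, the two sides of oom_sem pair up like this (a simplified
sketch of this series' locking, not the exact or complete code):

/* OOM killer path (read side): kill, mark, return - never waits
 * for the victim to exit. */
down_read(&oom_sem);
if (!oom_killer_disabled)
	oom_kill_process(victim);	/* SIGKILL + TIF_MEMDIE */
up_read(&oom_sem);			/* lets the write locker in */

/* freeze_processes path (write side): may itself be the victim */
if (!oom_killer_disable())		/* takes down_write(&oom_sem) */
	return -EBUSY;			/* killed while waiting */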

Does this explain your concern?

[...]
Tejun Heo Nov. 6, 2014, 4:12 p.m. UTC | #3
On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote:
> Yes, the OOM killer simply kicks the process, sets TIF_MEMDIE and
> terminates. That releases the read lock, allows this path to take the
> write lock and check whether the current task has been killed, all
> without races. The OOM killer doesn't wait for the killed task. The
> allocation is retried.
> 
> Does this explain your concern?

Draining oom killer then doesn't mean anything, no?  OOM killer may
have been disabled and drained but the killed tasks might wake up
after the PM freezer considers them to be frozen, right?  What am I
missing?
Michal Hocko Nov. 6, 2014, 4:31 p.m. UTC | #4
On Thu 06-11-14 11:12:11, Tejun Heo wrote:
> On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote:
> > Yes, the OOM killer simply kicks the process, sets TIF_MEMDIE and
> > terminates. That releases the read lock, allows this path to take the
> > write lock and check whether the current task has been killed, all
> > without races. The OOM killer doesn't wait for the killed task. The
> > allocation is retried.
> > 
> > Does this explain your concern?
> 
> Draining oom killer then doesn't mean anything, no?  OOM killer may
> have been disabled and drained but the killed tasks might wake up
> after the PM freezer considers them to be frozen, right?  What am I
> missing?

The mutual exclusion between OOM and the freezer ensures that the
victim will already have TIF_MEMDIE set when try_to_freeze_tasks even
starts. Then freezing_slow_path won't allow the task to enter the
fridge, so the wake-up moment is not really that important.
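
For reference, the check being referred to is the TIF_MEMDIE test in
freezing_slow_path() (kernel/freezer.c; shown roughly as of this time
frame, some versions tested current via test_thread_flag() instead):

bool freezing_slow_path(struct task_struct *p)
{
	if (p->flags & (PF_NOFREEZE | PF_SUSPEND_TASK))
		return false;

	/* an OOM victim is never considered freezable */
	if (test_tsk_thread_flag(p, TIF_MEMDIE))
		return false;

	if (pm_nosig_freezing || cgroup_freezing(p))
		return true;

	if (pm_freezing && !(p->flags & PF_KTHREAD))
		return true;

	return false;
}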
Tejun Heo Nov. 6, 2014, 4:33 p.m. UTC | #5
On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote:
> On Thu 06-11-14 11:12:11, Tejun Heo wrote:
> > On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote:
> > > Yes, the OOM killer simply kicks the process, sets TIF_MEMDIE and
> > > terminates. That releases the read lock, allows this path to take the
> > > write lock and check whether the current task has been killed, all
> > > without races. The OOM killer doesn't wait for the killed task. The
> > > allocation is retried.
> > > 
> > > Does this explain your concern?
> > 
> > Draining oom killer then doesn't mean anything, no?  OOM killer may
> > have been disabled and drained but the killed tasks might wake up
> > after the PM freezer considers them to be frozen, right?  What am I
> > missing?
> 
> The mutual exclusion between OOM and the freezer ensures that the
> victim will already have TIF_MEMDIE set when try_to_freeze_tasks even
> starts. Then freezing_slow_path won't allow the task to enter the
> fridge, so the wake-up moment is not really that important.

What if it was already in the freezer?
Michal Hocko Nov. 6, 2014, 4:58 p.m. UTC | #6
On Thu 06-11-14 11:33:04, Tejun Heo wrote:
> On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote:
> > On Thu 06-11-14 11:12:11, Tejun Heo wrote:
[...]
> > > Draining oom killer then doesn't mean anything, no?  OOM killer may
> > > have been disabled and drained but the killed tasks might wake up
> > > after the PM freezer considers them to be frozen, right?  What am I
> > > missing?
> > 
> > The mutual exclusion between OOM and the freezer ensures that the
> > victim will already have TIF_MEMDIE set when try_to_freeze_tasks even
> > starts. Then freezing_slow_path won't allow the task to enter the
> > fridge, so the wake-up moment is not really that important.
> 
> What if it was already in the freezer?

Good question! You are right that there is a race window until the wake
up in that case. I will think about it some more. There is simply no
control over when the task wakes up, and the freezer will see it as
frozen until then. An immediate way around would be to check for
TIF_MEMDIE in try_to_freeze_tasks, along the lines of the sketch below.
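
(A hypothetical sketch only, nothing that has been posted; the helper
name is made up and it would be called after the freezing loop in
try_to_freeze_tasks():)

/*
 * Fail the freeze if any task has been OOM-killed and may still
 * wake up later only to exit and free its memory.
 */
static bool memdie_victim_exists(void)
{
	struct task_struct *g, *p;
	bool ret = false;

	read_lock(&tasklist_lock);
	for_each_process_thread(g, p) {
		if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
			ret = true;
			goto out;
		}
	}
out:
	read_unlock(&tasklist_lock);
	return ret;
}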

I have to call it a day unfortunately and will be back on Monday.

Patch

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 4af99a9b543b..a978bf2b02a1 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -77,8 +77,10 @@  extern int unregister_oom_notifier(struct notifier_block *nb);
  * oom_killer_disable - disable OOM killer in page allocator
  *
  * Forces all page allocations to fail rather than trigger OOM killer.
+ * Returns true on success and false if the OOM killer couldn't be
+ * disabled (e.g. because the current task has been killed meanwhile)
  */
-extern void oom_killer_disable(void);
+extern bool oom_killer_disable(void);
 
 /**
  * oom_killer_enable - enable OOM killer
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 7d08d56cbf3f..0f8b782f9215 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -123,6 +123,16 @@  int freeze_processes(void)
 	if (error)
 		return error;
 
+	/*
+	 * Need to exclude OOM killer from triggering while tasks are
+	 * getting frozen to make sure none of them gets killed after
+	 * try_to_freeze_tasks is done.
+	 */
+	if (!oom_killer_disable()) {
+		usermodehelper_enable();
+		return -EBUSY;
+	}
+
 	/* Make sure this task doesn't get frozen */
 	current->flags |= PF_SUSPEND_TASK;
 
@@ -133,12 +143,6 @@  int freeze_processes(void)
 	printk("Freezing user space processes ... ");
 	pm_freezing = true;
 
-	/*
-	 * Need to exclude OOM killer from triggering while tasks are
-	 * getting frozen to make sure none of them gets killed after
-	 * try_to_freeze_tasks is done.
-	 */
-	oom_killer_disable();
 	error = try_to_freeze_tasks(true);
 	if (!error) {
 		__usermodehelper_set_disable_depth(UMH_DISABLED);
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f80c5b777f05..58ade54ee421 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -600,9 +600,19 @@  void oom_zonelist_unlock(struct zonelist *zonelist, gfp_t gfp_mask)
 
 static DECLARE_RWSEM(oom_sem);
 
-void oom_killer_disable(void)
+bool oom_killer_disable(void)
 {
+	bool ret = true;
+
 	down_write(&oom_sem);
+
+	/* We might have been killed while waiting for the oom_sem. */
+	if (fatal_signal_pending(current) || test_thread_flag(TIF_MEMDIE)) {
+		up_write(&oom_sem);
+		ret = false;
+	}
+
+	return ret;
 }
 
 void oom_killer_enable(void)