Message ID | 20141106130543.GE7202@dhcp22.suse.cz (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote: > But this is nothing new. Suspend hasn't been checking for fatal signals > nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose > even before. > > This is not harmful though. The previous OOM kill attempt would kick the > current TASK and mark it with TIF_MEMDIE and retry the allocation. After > OOM is disabled the allocation simply fails. The current will die on its > way out of the kernel. Definitely worth fixing. In a separate patch. Hah? Isn't this a new outright A-B B-A deadlock involving the rwsem you added? > > disable() call must be able to fail. > > This would be a way to do it without requiring caller to check for > TIF_MEMDIE explicitly. The fewer of them we have the better. Why the hell would the caller ever even KNOW about this? This is something which must be encapsulated in the OOM killer disable/enable interface. > +bool oom_killer_disable(void) > { > + bool ret = true; > + > down_write(&oom_sem); How would this task pass the above down_write() if the OOM killer is already read locking oom_sem? Or is the OOM killer guaranteed to make forward progress even if the killed task can't make forward progress? But, if so, what are we talking about in this thread? > + > + /* We might have been killed while waiting for the oom_sem. */ > + if (fatal_signal_pending(current) || test_thread_flag(TIF_MEMDIE)) { > + up_write(&oom_sem); > + ret = false; > + } This is pointless. What does the above do?
On Thu 06-11-14 10:09:27, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote: > > But this is nothing new. Suspend hasn't been checking for fatal signals > > nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose > > even before. > > > > This is not harmful though. The previous OOM kill attempt would kick the > > current TASK and mark it with TIF_MEMDIE and retry the allocation. After > > OOM is disabled the allocation simply fails. The current will die on its > > way out of the kernel. Definitely worth fixing. In a separate patch. > > Hah? Isn't this a new outright A-B B-A deadlock involving the rwsem > you added? No, see below. > > > disable() call must be able to fail. > > > > This would be a way to do it without requiring caller to check for > > TIF_MEMDIE explicitly. The fewer of them we have the better. > > Why the hell would the caller ever even KNOW about this? This is > something which must be encapsulated in the OOM killer disable/enable > interface. > > > +bool oom_killer_disable(void) > > { > > + bool ret = true; > > + > > down_write(&oom_sem); > > How would this task pass the above down_write() if the OOM killer is > already read locking oom_sem? Or is the OOM killer guaranteed to make > forward progress even if the killed task can't make forward progress? > But, if so, what are we talking about in this thread? Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. That will release the read_lock, allow this to take the write lock and check whether it the current has been killed without any races. OOM killer doesn't wait for the killed task. The allocation is retried. Does this explain your concern? [...]
On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > That will release the read_lock, allow this to take the write lock and > check whether it the current has been killed without any races. > OOM killer doesn't wait for the killed task. The allocation is retried. > > Does this explain your concern? Draining oom killer then doesn't mean anything, no? OOM killer may have been disabled and drained but the killed tasks might wake up after the PM freezer considers them to be frozen, right? What am I missing?
On Thu 06-11-14 11:12:11, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > > That will release the read_lock, allow this to take the write lock and > > check whether it the current has been killed without any races. > > OOM killer doesn't wait for the killed task. The allocation is retried. > > > > Does this explain your concern? > > Draining oom killer then doesn't mean anything, no? OOM killer may > have been disabled and drained but the killed tasks might wake up > after the PM freezer considers them to be frozen, right? What am I > missing? The mutual exclusion between OOM and the freezer will cause that the victim will have TIF_MEMDIE already set when try_to_freeze_tasks even starts. Then freezing_slow_path wouldn't allow the task to enter the fridge so the wake up moment is not really that important.
On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote: > On Thu 06-11-14 11:12:11, Tejun Heo wrote: > > On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > > > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > > > That will release the read_lock, allow this to take the write lock and > > > check whether it the current has been killed without any races. > > > OOM killer doesn't wait for the killed task. The allocation is retried. > > > > > > Does this explain your concern? > > > > Draining oom killer then doesn't mean anything, no? OOM killer may > > have been disabled and drained but the killed tasks might wake up > > after the PM freezer considers them to be frozen, right? What am I > > missing? > > The mutual exclusion between OOM and the freezer will cause that the > victim will have TIF_MEMDIE already set when try_to_freeze_tasks even > starts. Then freezing_slow_path wouldn't allow the task to enter the > fridge so the wake up moment is not really that important. What if it was already in the freezer?
On Thu 06-11-14 11:33:04, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote: > > On Thu 06-11-14 11:12:11, Tejun Heo wrote: [...] > > > Draining oom killer then doesn't mean anything, no? OOM killer may > > > have been disabled and drained but the killed tasks might wake up > > > after the PM freezer considers them to be frozen, right? What am I > > > missing? > > > > The mutual exclusion between OOM and the freezer will cause that the > > victim will have TIF_MEMDIE already set when try_to_freeze_tasks even > > starts. Then freezing_slow_path wouldn't allow the task to enter the > > fridge so the wake up moment is not really that important. > > What if it was already in the freezer? Good question! You are right that there is a race window until the wake up then. I will think about this case some more. There is simply no control on when the task wakes up and freezer will see it as frozen until then. An immediate way around would be to check for TIF_MEMDIE in try_to_freeze_tasks. I have to call it end of the day unfortunately and will be back on Monday.
diff --git a/include/linux/oom.h b/include/linux/oom.h index 4af99a9b543b..a978bf2b02a1 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -77,8 +77,10 @@ extern int unregister_oom_notifier(struct notifier_block *nb); * oom_killer_disable - disable OOM killer in page allocator * * Forces all page allocations to fail rather than trigger OOM killer. + * Returns true on success and fails if the OOM killer couldn't be + * disabled (e.g. because the current task has been killed before) */ -extern void oom_killer_disable(void); +extern bool oom_killer_disable(void); /** * oom_killer_enable - enable OOM killer diff --git a/kernel/power/process.c b/kernel/power/process.c index 7d08d56cbf3f..0f8b782f9215 100644 --- a/kernel/power/process.c +++ b/kernel/power/process.c @@ -123,6 +123,16 @@ int freeze_processes(void) if (error) return error; + /* + * Need to exlude OOM killer from triggering while tasks are + * getting frozen to make sure none of them gets killed after + * try_to_freeze_tasks is done. + */ + if (!oom_killer_disable()) { + usermodehelper_enable(); + return -EBUSY; + } + /* Make sure this task doesn't get frozen */ current->flags |= PF_SUSPEND_TASK; @@ -133,12 +143,6 @@ int freeze_processes(void) printk("Freezing user space processes ... "); pm_freezing = true; - /* - * Need to exlude OOM killer from triggering while tasks are - * getting frozen to make sure none of them gets killed after - * try_to_freeze_tasks is done. - */ - oom_killer_disable(); error = try_to_freeze_tasks(true); if (!error) { __usermodehelper_set_disable_depth(UMH_DISABLED); diff --git a/mm/oom_kill.c b/mm/oom_kill.c index f80c5b777f05..58ade54ee421 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -600,9 +600,19 @@ void oom_zonelist_unlock(struct zonelist *zonelist, gfp_t gfp_mask) static DECLARE_RWSEM(oom_sem); -void oom_killer_disable(void) +bool oom_killer_disable(void) { + bool ret = true; + down_write(&oom_sem); + + /* We might have been killed while waiting for the oom_sem. */ + if (fatal_signal_pending(current) || test_thread_flag(TIF_MEMDIE)) { + up_write(&oom_sem); + ret = false; + } + + return ret; } void oom_killer_enable(void)