Message ID | 20180525083118.GI11881@dhcp22.suse.cz (mailing list archive) |
---|---|
State | New, archived |
Michal Hocko wrote:
> On Fri 25-05-18 10:17:42, Tetsuo Handa wrote:
> > Then, please show me (by writing a patch yourself) how to tell whether
> > we should sleep there. What I can come up is shown below.
> >
> > -@@ -4241,6 +4240,12 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> > - 	/* Retry as long as the OOM killer is making progress */
> > - 	if (did_some_progress) {
> > - 		no_progress_loops = 0;
> > -+		/*
> > -+		 * This schedule_timeout_*() serves as a guaranteed sleep for
> > -+		 * PF_WQ_WORKER threads when __zone_watermark_ok() == false.
> > -+		 */
> > -+		if (!tsk_is_oom_victim(current))
> > -+			schedule_timeout_uninterruptible(1);
> > - 		goto retry;
> > - 	}
> > +@@ -3927,6 +3926,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> > + 		(*no_progress_loops)++;
> >
> > + 	/*
> > ++	 * We do a short sleep here if the OOM killer/reaper/victims are
> > ++	 * holding oom_lock, in order to try to give them some CPU resources
> > ++	 * for releasing memory.
> > ++	 */
> > ++	if (mutex_is_locked(&oom_lock) && !tsk_is_oom_victim(current))
> > ++		schedule_timeout_uninterruptible(1);
> > ++
> > ++	/*
> > + 	 * Make sure we converge to OOM if we cannot make any progress
> > + 	 * several times in the row.
> > + 	 */
> >
> > As far as I know, whether a domain which the current thread belongs to is
> > already OOM is not known as of should_reclaim_retry(). Therefore, sleeping
> > there can become a pointless delay if the domain which the current thread
> > belongs to and the domain which the owner of oom_lock (it can be a random
> > thread inside out_of_memory() or exit_mmap()) belongs to differs.
> >
> > But you insist sleeping there means that you don't care about such
> > pointless delay?
>
> What is wrong with the following? should_reclaim_retry should be a
> natural reschedule point. PF_WQ_WORKER is a special case which needs a
> stronger rescheduling policy. Doing that unconditionally seems more
> straightforward than depending on a zone being a good candidate for a
> further reclaim.

Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads?
My concern is that cond_resched() might be a too short sleep to give
enough CPU resources to the owner of the oom_lock.

#ifndef CONFIG_PREEMPT
extern int _cond_resched(void);
#else
static inline int _cond_resched(void) { return 0; }
#endif

#ifndef CONFIG_PREEMPT
int __sched _cond_resched(void)
{
	if (should_resched(0)) {
		preempt_schedule_common();
		return 1;
	}
	rcu_all_qs();
	return 0;
}
EXPORT_SYMBOL(_cond_resched);
#endif

#define cond_resched() ({			\
	___might_sleep(__FILE__, __LINE__, 0);	\
	_cond_resched();			\
})

How do you prove that cond_resched() is an appropriate replacement for
schedule_timeout_killable(1) in out_of_memory() and
schedule_timeout_uninterruptible(1) in __alloc_pages_may_oom() ?
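The concern raised above is that cond_resched() gives up the CPU only when a reschedule is already pending, while schedule_timeout_uninterruptible(1) always sleeps. A minimal userspace sketch of that difference follows. It is not kernel code: `struct model_task` and the `model_*` functions are hypothetical names invented for illustration, and the model assumes the !CONFIG_PREEMPT variant of _cond_resched() quoted in the message.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one task; not a kernel structure. */
struct model_task {
	bool need_resched;         /* stands in for should_resched(0) */
	unsigned long yields;      /* times the CPU was actually given up */
	unsigned long slept_ticks; /* ticks spent in a guaranteed sleep */
};

/*
 * Models _cond_resched() on !CONFIG_PREEMPT: the task yields only
 * if a reschedule is pending, otherwise it returns immediately.
 */
static int model_cond_resched(struct model_task *t)
{
	if (t->need_resched) {
		t->need_resched = false;
		t->yields++;
		return 1;
	}
	return 0;
}

/*
 * Models schedule_timeout_uninterruptible(): the task always
 * sleeps for the requested number of ticks.
 */
static void model_schedule_timeout(struct model_task *t, unsigned long ticks)
{
	t->slept_ticks += ticks;
}
```

In this model, a task with `need_resched == false` that loops over `model_cond_resched()` never records a yield, whereas a single `model_schedule_timeout(&t, 1)` always records one tick of sleep. Whether that gap matters for the owner of oom_lock is the open question in the thread; the sketch only illustrates the conditional nature of the call.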
On Fri 25-05-18 19:57:32, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Fri 25-05-18 10:17:42, Tetsuo Handa wrote:
> > > Then, please show me (by writing a patch yourself) how to tell whether
> > > we should sleep there. What I can come up is shown below.
> > >
> > > -@@ -4241,6 +4240,12 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> > > - 	/* Retry as long as the OOM killer is making progress */
> > > - 	if (did_some_progress) {
> > > - 		no_progress_loops = 0;
> > > -+		/*
> > > -+		 * This schedule_timeout_*() serves as a guaranteed sleep for
> > > -+		 * PF_WQ_WORKER threads when __zone_watermark_ok() == false.
> > > -+		 */
> > > -+		if (!tsk_is_oom_victim(current))
> > > -+			schedule_timeout_uninterruptible(1);
> > > - 		goto retry;
> > > - 	}
> > > +@@ -3927,6 +3926,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> > > + 		(*no_progress_loops)++;
> > >
> > > + 	/*
> > > ++	 * We do a short sleep here if the OOM killer/reaper/victims are
> > > ++	 * holding oom_lock, in order to try to give them some CPU resources
> > > ++	 * for releasing memory.
> > > ++	 */
> > > ++	if (mutex_is_locked(&oom_lock) && !tsk_is_oom_victim(current))
> > > ++		schedule_timeout_uninterruptible(1);
> > > ++
> > > ++	/*
> > > + 	 * Make sure we converge to OOM if we cannot make any progress
> > > + 	 * several times in the row.
> > > + 	 */
> > >
> > > As far as I know, whether a domain which the current thread belongs to is
> > > already OOM is not known as of should_reclaim_retry(). Therefore, sleeping
> > > there can become a pointless delay if the domain which the current thread
> > > belongs to and the domain which the owner of oom_lock (it can be a random
> > > thread inside out_of_memory() or exit_mmap()) belongs to differs.
> > >
> > > But you insist sleeping there means that you don't care about such
> > > pointless delay?
> >
> > What is wrong with the following? should_reclaim_retry should be a
> > natural reschedule point. PF_WQ_WORKER is a special case which needs a
> > stronger rescheduling policy. Doing that unconditionally seems more
> > straightforward than depending on a zone being a good candidate for a
> > further reclaim.
>
> Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads?

Re-read what I've said.
Michal Hocko wrote:
> On Fri 25-05-18 19:57:32, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > What is wrong with the following? should_reclaim_retry should be a
> > > natural reschedule point. PF_WQ_WORKER is a special case which needs a
> > > stronger rescheduling policy. Doing that unconditionally seems more
> > > straightforward than depending on a zone being a good candidate for a
> > > further reclaim.
> >
> > Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads?
>
> Re-read what I've said.

Please show me as a complete patch. Then, I will test your patch.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3c6f4008ea55..b01b19d3d596 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3925,6 +3925,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 {
 	struct zone *zone;
 	struct zoneref *z;
+	bool ret = false;
 
 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3988,25 +3989,26 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 				}
 			}
 
-			/*
-			 * Memory allocation/reclaim might be called from a WQ
-			 * context and the current implementation of the WQ
-			 * concurrency control doesn't recognize that
-			 * a particular WQ is congested if the worker thread is
-			 * looping without ever sleeping. Therefore we have to
-			 * do a short sleep here rather than calling
-			 * cond_resched().
-			 */
-			if (current->flags & PF_WQ_WORKER)
-				schedule_timeout_uninterruptible(1);
-			else
-				cond_resched();
-
-			return true;
+			ret = true;
+			goto out;
 		}
 	}
 
-	return false;
+out:
+	/*
+	 * Memory allocation/reclaim might be called from a WQ
+	 * context and the current implementation of the WQ
+	 * concurrency control doesn't recognize that
+	 * a particular WQ is congested if the worker thread is
+	 * looping without ever sleeping. Therefore we have to
+	 * do a short sleep here rather than calling
+	 * cond_resched().
+	 */
+	if (current->flags & PF_WQ_WORKER)
+		schedule_timeout_uninterruptible(1);
+	else
+		cond_resched();
+	return ret;
 }
 
 static inline bool
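
The effect of the diff is that the reschedule step moves from the "retry" branch to a single exit path, so it runs whether or not a zone looks reclaimable. A simplified userspace sketch of that control flow is shown here. It is a model under stated assumptions, not the kernel function: `model_should_reclaim_retry`, `struct model_result`, the `zone_ok` array and the `MODEL_*` constants are hypothetical stand-ins, and the real watermark checks are reduced to a boolean per zone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: which reschedule primitive the exit path would call. */
enum model_resched {
	MODEL_COND_RESCHED,   /* stands in for cond_resched() */
	MODEL_SLEEP_ONE_TICK  /* stands in for schedule_timeout_uninterruptible(1) */
};

struct model_result {
	bool retry;                /* value the function would return */
	enum model_resched action; /* reschedule step taken on the way out */
};

/*
 * Models the patched flow: every return goes through "out", so the
 * reschedule step no longer depends on finding a reclaimable zone.
 * zone_ok[i] is a simplification of the per-zone watermark check.
 */
static struct model_result
model_should_reclaim_retry(const bool *zone_ok, size_t nr_zones,
			   bool is_wq_worker)
{
	struct model_result res = { false, MODEL_COND_RESCHED };
	size_t i;

	for (i = 0; i < nr_zones; i++) {
		if (zone_ok[i]) {
			res.retry = true;
			goto out;
		}
	}
out:
	/* WQ workers get a guaranteed sleep, everyone else a conditional yield. */
	res.action = is_wq_worker ? MODEL_SLEEP_ONE_TICK : MODEL_COND_RESCHED;
	return res;
}
```

With no reclaimable zone and `is_wq_worker` set, the model still reports `MODEL_SLEEP_ONE_TICK` even though `retry` is false, which is the behavioural change the diff introduces. For non-worker threads the action stays `MODEL_COND_RESCHED` in both cases, which is the point Tetsuo Handa questions in the thread.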