Message ID | 201803301934.DHF12420.SOFFJQMLVtHOOF@I-love.SAKURA.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Mar 30, 2018 at 07:34:59PM +0900, Tetsuo Handa wrote: > Maybe we can make "give up by default upon SIGKILL" and let callers > explicitly say "do not give up upon SIGKILL". I really strongly disapprove of this patch. This GFP flag will be abused like every other GFP flag. > +++ b/mm/page_alloc.c > @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > if (current->flags & PF_MEMALLOC) > goto nopage; > > + /* Can give up if caller is willing to give up upon fatal signals */ > + if (fatal_signal_pending(current) && > + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { > + gfp_mask |= __GFP_NOWARN; > + goto nopage; > + } > + > /* Try direct reclaim and then allocating */ This part is superficially tempting, although without the UNKILLABLE. ie: + if (fatal_signal_pending(current) && !(gfp_mask & __GFP_NOFAIL)) { + gfp_mask |= __GFP_NOWARN; + goto nopage; + } It makes some sense to me to prevent tasks with a fatal signal pending from being able to trigger reclaim. But I'm worried about what memory allocation failures it might trigger on paths that aren't accustomed to seeing failures.
On Tue 03-04-18 05:14:14, Matthew Wilcox wrote: > On Fri, Mar 30, 2018 at 07:34:59PM +0900, Tetsuo Handa wrote: > > Maybe we can make "give up by default upon SIGKILL" and let callers > > explicitly say "do not give up upon SIGKILL". > > I really strongly disapprove of this patch. This GFP flag will be abused > like every other GFP flag. > > > +++ b/mm/page_alloc.c > > @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > if (current->flags & PF_MEMALLOC) > > goto nopage; > > > > + /* Can give up if caller is willing to give up upon fatal signals */ > > + if (fatal_signal_pending(current) && > > + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { > > + gfp_mask |= __GFP_NOWARN; > > + goto nopage; > > + } > > + > > /* Try direct reclaim and then allocating */ > > This part is superficially tempting, although without the UNKILLABLE. ie: > > + if (fatal_signal_pending(current) && !(gfp_mask & __GFP_NOFAIL)) { > + gfp_mask |= __GFP_NOWARN; > + goto nopage; > + } > > It makes some sense to me to prevent tasks with a fatal signal pending > from being able to trigger reclaim. But I'm worried about what memory > allocation failures it might trigger on paths that aren't accustomed to > seeing failures. Please be aware that we _do_ allocate in the exit path. I have a strong suspicion that even while fatal signal is pending. Do we really want fail those really easily.
On Tue, Apr 03, 2018 at 02:19:50PM +0200, Michal Hocko wrote: > On Tue 03-04-18 05:14:14, Matthew Wilcox wrote: > > On Fri, Mar 30, 2018 at 07:34:59PM +0900, Tetsuo Handa wrote: > > > Maybe we can make "give up by default upon SIGKILL" and let callers > > > explicitly say "do not give up upon SIGKILL". > > > > I really strongly disapprove of this patch. This GFP flag will be abused > > like every other GFP flag. > > > > > +++ b/mm/page_alloc.c > > > @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > > if (current->flags & PF_MEMALLOC) > > > goto nopage; > > > > > > + /* Can give up if caller is willing to give up upon fatal signals */ > > > + if (fatal_signal_pending(current) && > > > + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { > > > + gfp_mask |= __GFP_NOWARN; > > > + goto nopage; > > > + } > > > + > > > /* Try direct reclaim and then allocating */ > > > > This part is superficially tempting, although without the UNKILLABLE. ie: > > > > + if (fatal_signal_pending(current) && !(gfp_mask & __GFP_NOFAIL)) { > > + gfp_mask |= __GFP_NOWARN; > > + goto nopage; > > + } > > > > It makes some sense to me to prevent tasks with a fatal signal pending > > from being able to trigger reclaim. But I'm worried about what memory > > allocation failures it might trigger on paths that aren't accustomed to > > seeing failures. > > Please be aware that we _do_ allocate in the exit path. I have a strong > suspicion that even while fatal signal is pending. Do we really want > fail those really easily. I agree. The allocations I'm thinking about are NFS wanting to send I/Os in order to fsync each file that gets closed. We probably don't want those to fail. And we definitely don't want to chase around the kernel adding __GFP_KILLABLE to each place that we discover needs to allocate on the exit path.
Michal Hocko wrote: > On Tue 03-04-18 05:14:14, Matthew Wilcox wrote: > > On Fri, Mar 30, 2018 at 07:34:59PM +0900, Tetsuo Handa wrote: > > > Maybe we can make "give up by default upon SIGKILL" and let callers > > > explicitly say "do not give up upon SIGKILL". > > > > I really strongly disapprove of this patch. This GFP flag will be abused > > like every other GFP flag. > > > > > +++ b/mm/page_alloc.c > > > @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > > if (current->flags & PF_MEMALLOC) > > > goto nopage; > > > > > > + /* Can give up if caller is willing to give up upon fatal signals */ > > > + if (fatal_signal_pending(current) && > > > + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { > > > + gfp_mask |= __GFP_NOWARN; > > > + goto nopage; > > > + } > > > + > > > /* Try direct reclaim and then allocating */ > > > > This part is superficially tempting, although without the UNKILLABLE. ie: > > > > + if (fatal_signal_pending(current) && !(gfp_mask & __GFP_NOFAIL)) { > > + gfp_mask |= __GFP_NOWARN; > > + goto nopage; > > + } > > > > It makes some sense to me to prevent tasks with a fatal signal pending > > from being able to trigger reclaim. But I'm worried about what memory > > allocation failures it might trigger on paths that aren't accustomed to > > seeing failures. Userspace tasks might call routines which need GFP_FS or GFP_NOIO (for direct reclaim), and giving up upon fatal signals leads to more problems like FS error or I/O error. Thus, we can't unconditionally give up upon fatal signals. > > Please be aware that we _do_ allocate in the exit path. I have a strong > suspicion that even while fatal signal is pending. Do we really want > fail those really easily. Does the exit path mean inside do_exit() ? If yes, fatal signals are already cleared before reaching do_exit().
On Tue 03-04-18 21:29:52, Tetsuo Handa wrote: > Michal Hocko wrote: [...] > > Please be aware that we _do_ allocate in the exit path. I have a strong > > suspicion that even while fatal signal is pending. Do we really want > > fail those really easily. > > Does the exit path mean inside do_exit() ? If yes, fatal signals are already > cleared before reaching do_exit(). They usually are. But we can send a SIGKILL on an already killed task after it removed the previously deadly signal already AFAIR. Maybe I mis-remember of course. Signal handling code always makes my head explode and I tend to forget all the details. Anyway relying on fatal_signal_pending for some allocator semantic is just too subtle.
Matthew Wilcox wrote: > On Tue, Apr 03, 2018 at 02:19:50PM +0200, Michal Hocko wrote: > > On Tue 03-04-18 05:14:14, Matthew Wilcox wrote: > > > On Fri, Mar 30, 2018 at 07:34:59PM +0900, Tetsuo Handa wrote: > > > > Maybe we can make "give up by default upon SIGKILL" and let callers > > > > explicitly say "do not give up upon SIGKILL". > > > > > > I really strongly disapprove of this patch. This GFP flag will be abused > > > like every other GFP flag. > > > > > > > +++ b/mm/page_alloc.c > > > > @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > > > if (current->flags & PF_MEMALLOC) > > > > goto nopage; > > > > > > > > + /* Can give up if caller is willing to give up upon fatal signals */ > > > > + if (fatal_signal_pending(current) && > > > > + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { > > > > + gfp_mask |= __GFP_NOWARN; > > > > + goto nopage; > > > > + } > > > > + > > > > /* Try direct reclaim and then allocating */ > > > > > > This part is superficially tempting, although without the UNKILLABLE. ie: > > > > > > + if (fatal_signal_pending(current) && !(gfp_mask & __GFP_NOFAIL)) { > > > + gfp_mask |= __GFP_NOWARN; > > > + goto nopage; > > > + } > > > > > > It makes some sense to me to prevent tasks with a fatal signal pending > > > from being able to trigger reclaim. But I'm worried about what memory > > > allocation failures it might trigger on paths that aren't accustomed to > > > seeing failures. > > > > Please be aware that we _do_ allocate in the exit path. I have a strong > > suspicion that even while fatal signal is pending. Do we really want > > fail those really easily. > > I agree. The allocations I'm thinking about are NFS wanting to send > I/Os in order to fsync each file that gets closed. We probably don't > want those to fail. And we definitely don't want to chase around the > kernel adding __GFP_KILLABLE to each place that we discover needs to > allocate on the exit path. > But memory allocations for syscalls are willing to give up upon SIGKILL regardless of OOM. If we worry the exit/nofs/noio paths, we can use scoped masking like memalloc_nofs_save()/memalloc_nofs_restore() for ignoring __GFP_KILLABLE.
diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 1a4582b..a0e8a9c 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -24,6 +24,7 @@ #define ___GFP_HIGH 0x20u #define ___GFP_IO 0x40u #define ___GFP_FS 0x80u +#define ___GFP_UNKILLABLE 0x100u #define ___GFP_NOWARN 0x200u #define ___GFP_RETRY_MAYFAIL 0x400u #define ___GFP_NOFAIL 0x800u @@ -122,6 +123,14 @@ * allocator recursing into the filesystem which might already be holding * locks. * + * __GFP_UNKILLABLE: The VM implementation does not fail by simply because + * fatal_signal_pending(current) is true when the current thread in task + * context is doing memory allocations. Those allocations which do not want + * to be disturbed by SIGKILL can add this flag. But note that those + * allocations which must not fail have to add __GFP_NOFAIL, for + * __GFP_UNKILLABLE allocations can still fail by other reasons such as + * __GFP_NORETRY, __GFP_RETRY_MAYFAIL, being selected as an OOM victim. + * * __GFP_DIRECT_RECLAIM indicates that the caller may enter direct reclaim. * This flag can be cleared to avoid unnecessary delays when a fallback * option is available. @@ -181,6 +190,7 @@ */ #define __GFP_IO ((__force gfp_t)___GFP_IO) #define __GFP_FS ((__force gfp_t)___GFP_FS) +#define __GFP_UNKILLABLE ((__force gfp_t)___GFP_UNKILLABLE) #define __GFP_DIRECT_RECLAIM ((__force gfp_t)___GFP_DIRECT_RECLAIM) /* Caller can reclaim */ #define __GFP_KSWAPD_RECLAIM ((__force gfp_t)___GFP_KSWAPD_RECLAIM) /* kswapd can wake */ #define __GFP_RECLAIM ((__force gfp_t)(___GFP_DIRECT_RECLAIM|___GFP_KSWAPD_RECLAIM)) diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index a81cffb..6a21654 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -32,6 +32,7 @@ {(unsigned long)__GFP_ATOMIC, "__GFP_ATOMIC"}, \ {(unsigned long)__GFP_IO, "__GFP_IO"}, \ {(unsigned long)__GFP_FS, "__GFP_FS"}, \ + {(unsigned long)__GFP_UNKILLABLE, "__GFP_UNKILLABLE"}, \ {(unsigned long)__GFP_NOWARN, "__GFP_NOWARN"}, \ {(unsigned long)__GFP_RETRY_MAYFAIL, "__GFP_RETRY_MAYFAIL"}, \ {(unsigned long)__GFP_NOFAIL, "__GFP_NOFAIL"}, \ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 905db9d..c8af32e 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4183,6 +4183,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) if (current->flags & PF_MEMALLOC) goto nopage; + /* Can give up if caller is willing to give up upon fatal signals */ + if (fatal_signal_pending(current) && + !(gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) { + gfp_mask |= __GFP_NOWARN; + goto nopage; + } + /* Try direct reclaim and then allocating */ page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac, &did_some_progress); @@ -4301,6 +4308,14 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, struct alloc_context *ac, gfp_t *alloc_mask, unsigned int *alloc_flags) { + /* + * Can give up if caller in task context is willing to give up upon + * fatal signals + */ + if (in_task() && fatal_signal_pending(current) && + (gfp_mask & (__GFP_UNKILLABLE | __GFP_NOFAIL))) + return false; + ac->high_zoneidx = gfp_zone(gfp_mask); ac->zonelist = node_zonelist(preferred_nid, gfp_mask); ac->nodemask = nodemask; diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c index ae11e4c..b36d945 100644 --- a/tools/perf/builtin-kmem.c +++ b/tools/perf/builtin-kmem.c @@ -641,6 +641,7 @@ static int gfpcmp(const void *a, const void *b) { "__GFP_ATOMIC", "_A" }, { "__GFP_IO", "I" }, { "__GFP_FS", "F" }, + { "__GFP_UNKILLABLE", "UK" }, { "__GFP_NOWARN", "NWR" }, { "__GFP_RETRY_MAYFAIL", "R" }, { "__GFP_NOFAIL", "NF" },