Message ID | 20240731000155.109583-4-21cnbao@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: clarify nofail memory allocation | expand |
On Wed 31-07-24 12:01:54, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > We have cases we still fail though callers might have __GFP_NOFAIL. > Since they don't check the return, we are exposed to the security > risks for NULL deference. > > Though BUG_ON() is not encouraged by Linus, this is an unrecoverable > situation. > > Christoph Hellwig: > The whole freaking point of __GFP_NOFAIL is that callers don't handle > allocation failures. So in fact a straight BUG is the right thing > here. > > Vlastimil Babka: > It's just not a recoverable situation (WARN_ON is for recoverable > situations). The caller cannot handle allocation failure and at the same > time asked for an impossible allocation. BUG_ON() is a guaranteed oops > with stracktrace etc. We don't need to hope for the later NULL pointer > dereference (which might if really unlucky happen from a different > context where it's no longer obvious what lead to the allocation failing). > > Michal Hocko: > Linus tends to be against adding new BUG() calls unless the failure is > absolutely unrecoverable (e.g. corrupted data structures etc.). I am > not sure how he would look at simply incorrect memory allocator usage to > blow up the kernel. Now the argument could be made that those failures > could cause subtle memory corruptions or even be exploitable which might > be a sufficient reason to stop them early. > > Cc: Michal Hocko <mhocko@suse.com> > Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Lorenzo Stoakes <lstoakes@gmail.com> > Cc: Christoph Lameter <cl@linux.com> > Cc: Pekka Enberg <penberg@kernel.org> > Cc: David Rientjes <rientjes@google.com> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Roman Gushchin <roman.gushchin@linux.dev> > Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Kees Cook <kees@kernel.org> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> Thanks for separating overflow/out of bounds checks. Acked-by: Michal Hocko <mhocko@suse.com> > --- > include/linux/slab.h | 4 +++- > mm/page_alloc.c | 4 +++- > mm/util.c | 1 + > 3 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index c9cb42203183..4a4d1fdc2afe 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) > { > size_t bytes; > > - if (unlikely(check_mul_overflow(n, size, &bytes))) > + if (unlikely(check_mul_overflow(n, size, &bytes))) { > + BUG_ON(flags & __GFP_NOFAIL); > return NULL; > + } > > return kvmalloc_node_noprof(bytes, flags, node); > } > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index c700d2598a26..cc179c3e68df 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -4708,8 +4708,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, > * There are several places where we assume that the order value is sane > * so bail out early if the request is out of bound. > */ > - if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) > + if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) { > + BUG_ON(gfp & __GFP_NOFAIL); > return NULL; > + } > > gfp &= gfp_allowed_mask; > /* > diff --git a/mm/util.c b/mm/util.c > index 0ff5898cc6de..bad3258523b6 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -667,6 +667,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) > > /* Don't even allow crazy sizes */ > if (unlikely(size > INT_MAX)) { > + BUG_ON(flags & __GFP_NOFAIL); > WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > return NULL; > } > -- > 2.34.1
On 7/31/24 2:01 AM, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > We have cases we still fail though callers might have __GFP_NOFAIL. > Since they don't check the return, we are exposed to the security > risks for NULL deference. > > Though BUG_ON() is not encouraged by Linus, this is an unrecoverable > situation. > > Christoph Hellwig: > The whole freaking point of __GFP_NOFAIL is that callers don't handle > allocation failures. So in fact a straight BUG is the right thing > here. > > Vlastimil Babka: > It's just not a recoverable situation (WARN_ON is for recoverable > situations). The caller cannot handle allocation failure and at the same > time asked for an impossible allocation. BUG_ON() is a guaranteed oops > with stracktrace etc. We don't need to hope for the later NULL pointer > dereference (which might if really unlucky happen from a different > context where it's no longer obvious what lead to the allocation failing). > > Michal Hocko: > Linus tends to be against adding new BUG() calls unless the failure is > absolutely unrecoverable (e.g. corrupted data structures etc.). I am > not sure how he would look at simply incorrect memory allocator usage to > blow up the kernel. Now the argument could be made that those failures > could cause subtle memory corruptions or even be exploitable which might > be a sufficient reason to stop them early. > > Cc: Michal Hocko <mhocko@suse.com> > Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Lorenzo Stoakes <lstoakes@gmail.com> > Cc: Christoph Lameter <cl@linux.com> > Cc: Pekka Enberg <penberg@kernel.org> > Cc: David Rientjes <rientjes@google.com> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Roman Gushchin <roman.gushchin@linux.dev> > Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Kees Cook <kees@kernel.org> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> > --- > include/linux/slab.h | 4 +++- > mm/page_alloc.c | 4 +++- > mm/util.c | 1 + > 3 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index c9cb42203183..4a4d1fdc2afe 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) > { > size_t bytes; > > - if (unlikely(check_mul_overflow(n, size, &bytes))) > + if (unlikely(check_mul_overflow(n, size, &bytes))) { > + BUG_ON(flags & __GFP_NOFAIL); Shouldn't we produce some kind of warning also in the no-NOFAIL case? Returning a NULL completely silently feels wrong. > return NULL; > + } > > return kvmalloc_node_noprof(bytes, flags, node); > } > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index c700d2598a26..cc179c3e68df 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -4708,8 +4708,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, > * There are several places where we assume that the order value is sane > * so bail out early if the request is out of bound. > */ > - if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) > + if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) { Because here we do warn (although I'd argue even __GFP_NOWARN shouldn't suppress a warning for such a reason). > + BUG_ON(gfp & __GFP_NOFAIL); > return NULL; > + } > > gfp &= gfp_allowed_mask; > /* > diff --git a/mm/util.c b/mm/util.c > index 0ff5898cc6de..bad3258523b6 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -667,6 +667,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) > > /* Don't even allow crazy sizes */ > if (unlikely(size > INT_MAX)) { > + BUG_ON(flags & __GFP_NOFAIL); > WARN_ON_ONCE(!(flags & __GFP_NOWARN)); And here, would argue the same. But maybe there's code that relies on this so dunno. In that case we could at least convert to WARN_ON_ONCE_GFP. > return NULL; > }
On 2024/07/31 19:29, Vlastimil Babka wrote: >> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) >> { >> size_t bytes; >> >> - if (unlikely(check_mul_overflow(n, size, &bytes))) >> + if (unlikely(check_mul_overflow(n, size, &bytes))) { >> + BUG_ON(flags & __GFP_NOFAIL); > > Shouldn't we produce some kind of warning also in the no-NOFAIL case? > Returning a NULL completely silently feels wrong. Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat the kernel size? I think we can call a non-inlined function (that is marked as EXPORT_SYMBOL()) for emitting warning.
On 7/31/24 12:44 PM, Tetsuo Handa wrote: > On 2024/07/31 19:29, Vlastimil Babka wrote: >>> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) >>> { >>> size_t bytes; >>> >>> - if (unlikely(check_mul_overflow(n, size, &bytes))) >>> + if (unlikely(check_mul_overflow(n, size, &bytes))) { >>> + BUG_ON(flags & __GFP_NOFAIL); >> >> Shouldn't we produce some kind of warning also in the no-NOFAIL case? >> Returning a NULL completely silently feels wrong. > > Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat > the kernel size? I think we can call a non-inlined function (that is marked > as EXPORT_SYMBOL()) for emitting warning. Hm yeah we could probably make the whole function non-inline as it results in a call anyway.
On Wed, Jul 31, 2024 at 6:48 PM Vlastimil Babka <vbabka@suse.cz> wrote: > > On 7/31/24 12:44 PM, Tetsuo Handa wrote: > > On 2024/07/31 19:29, Vlastimil Babka wrote: > >>> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) > >>> { > >>> size_t bytes; > >>> > >>> - if (unlikely(check_mul_overflow(n, size, &bytes))) > >>> + if (unlikely(check_mul_overflow(n, size, &bytes))) { > >>> + BUG_ON(flags & __GFP_NOFAIL); > >> > >> Shouldn't we produce some kind of warning also in the no-NOFAIL case? > >> Returning a NULL completely silently feels wrong. > > > > Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat > > the kernel size? I think we can call a non-inlined function (that is marked > > as EXPORT_SYMBOL()) for emitting warning. > > Hm yeah we could probably make the whole function non-inline as it results > in a call anyway. Although this might save some code footprint, we will result in two calls for users of kvmalloc_array_node_noprof() with both Tetsuo's and your proposal?
Looks good (inline or not):
Reviewed-by: Christoph Hellwig <hch@lst.de>
diff --git a/include/linux/slab.h b/include/linux/slab.h index c9cb42203183..4a4d1fdc2afe 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) { size_t bytes; - if (unlikely(check_mul_overflow(n, size, &bytes))) + if (unlikely(check_mul_overflow(n, size, &bytes))) { + BUG_ON(flags & __GFP_NOFAIL); return NULL; + } return kvmalloc_node_noprof(bytes, flags, node); } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c700d2598a26..cc179c3e68df 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4708,8 +4708,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order, * There are several places where we assume that the order value is sane * so bail out early if the request is out of bound. */ - if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) + if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) { + BUG_ON(gfp & __GFP_NOFAIL); return NULL; + } gfp &= gfp_allowed_mask; /* diff --git a/mm/util.c b/mm/util.c index 0ff5898cc6de..bad3258523b6 100644 --- a/mm/util.c +++ b/mm/util.c @@ -667,6 +667,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) /* Don't even allow crazy sizes */ if (unlikely(size > INT_MAX)) { + BUG_ON(flags & __GFP_NOFAIL); WARN_ON_ONCE(!(flags & __GFP_NOWARN)); return NULL; }