diff mbox series

[v2,3/4] mm: BUG_ON to avoid NULL deference while __GFP_NOFAIL fails

Message ID 20240731000155.109583-4-21cnbao@gmail.com (mailing list archive)
State New
Headers show
Series mm: clarify nofail memory allocation | expand

Commit Message

Barry Song July 31, 2024, 12:01 a.m. UTC
From: Barry Song <v-songbaohua@oppo.com>

We have cases we still fail though callers might have __GFP_NOFAIL.
Since they don't check the return, we are exposed to the security
risks for NULL deference.

Though BUG_ON() is not encouraged by Linus, this is an unrecoverable
situation.

Christoph Hellwig:
The whole freaking point of __GFP_NOFAIL is that callers don't handle
allocation failures.  So in fact a straight BUG is the right thing
here.

Vlastimil Babka:
It's just not a recoverable situation (WARN_ON is for recoverable
situations). The caller cannot handle allocation failure and at the same
time asked for an impossible allocation. BUG_ON() is a guaranteed oops
with stracktrace etc. We don't need to hope for the later NULL pointer
dereference (which might if really unlucky happen from a different
context where it's no longer obvious what lead to the allocation failing).

Michal Hocko:
Linus tends to be against adding new BUG() calls unless the failure is
absolutely unrecoverable (e.g. corrupted data structures etc.). I am
not sure how he would look at simply incorrect memory allocator usage to
blow up the kernel. Now the argument could be made that those failures
could cause subtle memory corruptions or even be exploitable which might
be a sufficient reason to stop them early.

Cc: Michal Hocko <mhocko@suse.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <kees@kernel.org>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 include/linux/slab.h | 4 +++-
 mm/page_alloc.c      | 4 +++-
 mm/util.c            | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

Comments

Michal Hocko July 31, 2024, 7:11 a.m. UTC | #1
On Wed 31-07-24 12:01:54, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> We have cases we still fail though callers might have __GFP_NOFAIL.
> Since they don't check the return, we are exposed to the security
> risks for NULL deference.
> 
> Though BUG_ON() is not encouraged by Linus, this is an unrecoverable
> situation.
> 
> Christoph Hellwig:
> The whole freaking point of __GFP_NOFAIL is that callers don't handle
> allocation failures.  So in fact a straight BUG is the right thing
> here.
> 
> Vlastimil Babka:
> It's just not a recoverable situation (WARN_ON is for recoverable
> situations). The caller cannot handle allocation failure and at the same
> time asked for an impossible allocation. BUG_ON() is a guaranteed oops
> with stracktrace etc. We don't need to hope for the later NULL pointer
> dereference (which might if really unlucky happen from a different
> context where it's no longer obvious what lead to the allocation failing).
> 
> Michal Hocko:
> Linus tends to be against adding new BUG() calls unless the failure is
> absolutely unrecoverable (e.g. corrupted data structures etc.). I am
> not sure how he would look at simply incorrect memory allocator usage to
> blow up the kernel. Now the argument could be made that those failures
> could cause subtle memory corruptions or even be exploitable which might
> be a sufficient reason to stop them early.
> 
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Kees Cook <kees@kernel.org>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Thanks for separating overflow/out of bounds checks.
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/slab.h | 4 +++-
>  mm/page_alloc.c      | 4 +++-
>  mm/util.c            | 1 +
>  3 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index c9cb42203183..4a4d1fdc2afe 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
>  {
>  	size_t bytes;
>  
> -	if (unlikely(check_mul_overflow(n, size, &bytes)))
> +	if (unlikely(check_mul_overflow(n, size, &bytes))) {
> +		BUG_ON(flags & __GFP_NOFAIL);
>  		return NULL;
> +	}
>  
>  	return kvmalloc_node_noprof(bytes, flags, node);
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c700d2598a26..cc179c3e68df 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4708,8 +4708,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
>  	 * There are several places where we assume that the order value is sane
>  	 * so bail out early if the request is out of bound.
>  	 */
> -	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> +	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) {
> +		BUG_ON(gfp & __GFP_NOFAIL);
>  		return NULL;
> +	}
>  
>  	gfp &= gfp_allowed_mask;
>  	/*
> diff --git a/mm/util.c b/mm/util.c
> index 0ff5898cc6de..bad3258523b6 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -667,6 +667,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
>  
>  	/* Don't even allow crazy sizes */
>  	if (unlikely(size > INT_MAX)) {
> +		BUG_ON(flags & __GFP_NOFAIL);
>  		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
>  		return NULL;
>  	}
> -- 
> 2.34.1
Vlastimil Babka July 31, 2024, 10:29 a.m. UTC | #2
On 7/31/24 2:01 AM, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> We have cases we still fail though callers might have __GFP_NOFAIL.
> Since they don't check the return, we are exposed to the security
> risks for NULL deference.
> 
> Though BUG_ON() is not encouraged by Linus, this is an unrecoverable
> situation.
> 
> Christoph Hellwig:
> The whole freaking point of __GFP_NOFAIL is that callers don't handle
> allocation failures.  So in fact a straight BUG is the right thing
> here.
> 
> Vlastimil Babka:
> It's just not a recoverable situation (WARN_ON is for recoverable
> situations). The caller cannot handle allocation failure and at the same
> time asked for an impossible allocation. BUG_ON() is a guaranteed oops
> with stracktrace etc. We don't need to hope for the later NULL pointer
> dereference (which might if really unlucky happen from a different
> context where it's no longer obvious what lead to the allocation failing).
> 
> Michal Hocko:
> Linus tends to be against adding new BUG() calls unless the failure is
> absolutely unrecoverable (e.g. corrupted data structures etc.). I am
> not sure how he would look at simply incorrect memory allocator usage to
> blow up the kernel. Now the argument could be made that those failures
> could cause subtle memory corruptions or even be exploitable which might
> be a sufficient reason to stop them early.
> 
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Kees Cook <kees@kernel.org>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  include/linux/slab.h | 4 +++-
>  mm/page_alloc.c      | 4 +++-
>  mm/util.c            | 1 +
>  3 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index c9cb42203183..4a4d1fdc2afe 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
>  {
>  	size_t bytes;
>  
> -	if (unlikely(check_mul_overflow(n, size, &bytes)))
> +	if (unlikely(check_mul_overflow(n, size, &bytes))) {
> +		BUG_ON(flags & __GFP_NOFAIL);

Shouldn't we produce some kind of warning also in the no-NOFAIL case?
Returning a NULL completely silently feels wrong.

>  		return NULL;
> +	}
>  
>  	return kvmalloc_node_noprof(bytes, flags, node);
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c700d2598a26..cc179c3e68df 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4708,8 +4708,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
>  	 * There are several places where we assume that the order value is sane
>  	 * so bail out early if the request is out of bound.
>  	 */
> -	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> +	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) {

Because here we do warn (although I'd argue even __GFP_NOWARN shouldn't
suppress a warning for such a reason).

> +		BUG_ON(gfp & __GFP_NOFAIL);
>  		return NULL;
> +	}
>  
>  	gfp &= gfp_allowed_mask;
>  	/*
> diff --git a/mm/util.c b/mm/util.c
> index 0ff5898cc6de..bad3258523b6 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -667,6 +667,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
>  
>  	/* Don't even allow crazy sizes */
>  	if (unlikely(size > INT_MAX)) {
> +		BUG_ON(flags & __GFP_NOFAIL);
>  		WARN_ON_ONCE(!(flags & __GFP_NOWARN));

And here, would argue the same. But maybe there's code that relies on this
so dunno. In that case we could at least convert to WARN_ON_ONCE_GFP.


>  		return NULL;
>  	}
Tetsuo Handa July 31, 2024, 10:44 a.m. UTC | #3
On 2024/07/31 19:29, Vlastimil Babka wrote:
>> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
>>  {
>>  	size_t bytes;
>>  
>> -	if (unlikely(check_mul_overflow(n, size, &bytes)))
>> +	if (unlikely(check_mul_overflow(n, size, &bytes))) {
>> +		BUG_ON(flags & __GFP_NOFAIL);
> 
> Shouldn't we produce some kind of warning also in the no-NOFAIL case?
> Returning a NULL completely silently feels wrong.

Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat
the kernel size? I think we can call a non-inlined function (that is marked
as EXPORT_SYMBOL()) for emitting warning.
Vlastimil Babka July 31, 2024, 10:48 a.m. UTC | #4
On 7/31/24 12:44 PM, Tetsuo Handa wrote:
> On 2024/07/31 19:29, Vlastimil Babka wrote:
>>> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
>>>  {
>>>  	size_t bytes;
>>>  
>>> -	if (unlikely(check_mul_overflow(n, size, &bytes)))
>>> +	if (unlikely(check_mul_overflow(n, size, &bytes))) {
>>> +		BUG_ON(flags & __GFP_NOFAIL);
>> 
>> Shouldn't we produce some kind of warning also in the no-NOFAIL case?
>> Returning a NULL completely silently feels wrong.
> 
> Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat
> the kernel size? I think we can call a non-inlined function (that is marked
> as EXPORT_SYMBOL()) for emitting warning.

Hm yeah we could probably make the whole function non-inline as it results
in a call anyway.
Barry Song July 31, 2024, 10:57 a.m. UTC | #5
On Wed, Jul 31, 2024 at 6:48 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/31/24 12:44 PM, Tetsuo Handa wrote:
> > On 2024/07/31 19:29, Vlastimil Babka wrote:
> >>> @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
> >>>  {
> >>>     size_t bytes;
> >>>
> >>> -   if (unlikely(check_mul_overflow(n, size, &bytes)))
> >>> +   if (unlikely(check_mul_overflow(n, size, &bytes))) {
> >>> +           BUG_ON(flags & __GFP_NOFAIL);
> >>
> >> Shouldn't we produce some kind of warning also in the no-NOFAIL case?
> >> Returning a NULL completely silently feels wrong.
> >
> > Doesn't embedding macros like BUG_ON() into inline functions needlessly bloat
> > the kernel size? I think we can call a non-inlined function (that is marked
> > as EXPORT_SYMBOL()) for emitting warning.
>
> Hm yeah we could probably make the whole function non-inline as it results
> in a call anyway.

Although this might save some code footprint,  we will result in two
calls for users of
kvmalloc_array_node_noprof() with both Tetsuo's and your proposal?
Christoph Hellwig July 31, 2024, 4:28 p.m. UTC | #6
Looks good (inline or not):

Reviewed-by: Christoph Hellwig <hch@lst.de>
diff mbox series

Patch

diff --git a/include/linux/slab.h b/include/linux/slab.h
index c9cb42203183..4a4d1fdc2afe 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -827,8 +827,10 @@  kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
 {
 	size_t bytes;
 
-	if (unlikely(check_mul_overflow(n, size, &bytes)))
+	if (unlikely(check_mul_overflow(n, size, &bytes))) {
+		BUG_ON(flags & __GFP_NOFAIL);
 		return NULL;
+	}
 
 	return kvmalloc_node_noprof(bytes, flags, node);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c700d2598a26..cc179c3e68df 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4708,8 +4708,10 @@  struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 	 * There are several places where we assume that the order value is sane
 	 * so bail out early if the request is out of bound.
 	 */
-	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
+	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)) {
+		BUG_ON(gfp & __GFP_NOFAIL);
 		return NULL;
+	}
 
 	gfp &= gfp_allowed_mask;
 	/*
diff --git a/mm/util.c b/mm/util.c
index 0ff5898cc6de..bad3258523b6 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -667,6 +667,7 @@  void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 
 	/* Don't even allow crazy sizes */
 	if (unlikely(size > INT_MAX)) {
+		BUG_ON(flags & __GFP_NOFAIL);
 		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
 		return NULL;
 	}