diff mbox series

[1/5] sgl_alloc_order: remove 4 GiB limit

Message ID 20221024010244.9522-2-dgilbert@interlog.com (mailing list archive)
State Superseded
Headers show
Series scatterlist: add operations for scsi_debug | expand

Commit Message

Douglas Gilbert Oct. 24, 2022, 1:02 a.m. UTC
This patch fixes a check done by sgl_alloc_order() before it starts
any allocations. The comment in the original said: "Check for integer
overflow" but the right hand side of the expression in the condition
is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
means 'length' can not exceed that value.

This function may be used to replace vmalloc(unsigned long) for a
large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
it seems unreasonable that sgl_alloc_order() whose length type is
unsigned long long should be limited to 4 GB.

Solutions to this issue were discussed by Jason Gunthorpe
<jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
version is base on a linux-scsi post by Jason titled: "Re:
[PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.

An earlier patch fixed a memory leak in sg_alloc_order() due to the
misuse of sgl_free(). Take the opportunity to put a one line comment
above sgl_free()'s declaration warning that it is not suitable when
order > 0 .

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 include/linux/scatterlist.h |  1 +
 lib/scatterlist.c           | 21 ++++++++++++---------
 2 files changed, 13 insertions(+), 9 deletions(-)

Comments

Jason Gunthorpe Oct. 24, 2022, 12:21 p.m. UTC | #1
On Sun, Oct 23, 2022 at 09:02:40PM -0400, Douglas Gilbert wrote:
> This patch fixes a check done by sgl_alloc_order() before it starts
> any allocations. The comment in the original said: "Check for integer
> overflow" but the right hand side of the expression in the condition
> is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
> means 'length' can not exceed that value.
> 
> This function may be used to replace vmalloc(unsigned long) for a
> large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
> it seems unreasonable that sgl_alloc_order() whose length type is
> unsigned long long should be limited to 4 GB.
> 
> Solutions to this issue were discussed by Jason Gunthorpe
> <jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
> version is base on a linux-scsi post by Jason titled: "Re:
> [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.

You should link to lore here..

I think I prefer we just fix this so it doesn't have a problem in the
first place - nothing needs the weird unsigned long long argument type:

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c8c3d675845c37..c39e19fa174bca 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -595,19 +595,17 @@ EXPORT_SYMBOL(sg_alloc_table_from_pages_segment);
  *
  * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
  */
-struct scatterlist *sgl_alloc_order(unsigned long long length,
-				    unsigned int order, bool chainable,
-				    gfp_t gfp, unsigned int *nent_p)
+struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
+				    bool chainable, gfp_t gfp, size_t *nent_p)
 {
 	struct scatterlist *sgl, *sg;
 	struct page *page;
-	unsigned int nent, nalloc;
+	size_t nent, nalloc;
 	u32 elem_len;
 
-	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
-	/* Check for integer overflow */
-	if (length > (nent << (PAGE_SHIFT + order)))
-		return NULL;
+	nent = length >> (PAGE_SHIFT + order);
+	if (length % (1 << (PAGE_SHIFT + order)))
+		nent++;
 	nalloc = nent;
 	if (chainable) {
 		/* Check for integer overflow */
@@ -649,8 +647,7 @@ EXPORT_SYMBOL(sgl_alloc_order);
  *
  * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
  */
-struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
-			      unsigned int *nent_p)
+struct scatterlist *sgl_alloc(size_t length, gfp_t gfp, size_t *nent_p)
 {
 	return sgl_alloc_order(length, 0, false, gfp, nent_p);
 }

Jason
Bodo Stroesser Oct. 24, 2022, 2:32 p.m. UTC | #2
On 24.10.22 14:21, Jason Gunthorpe wrote:
> On Sun, Oct 23, 2022 at 09:02:40PM -0400, Douglas Gilbert wrote:
>> This patch fixes a check done by sgl_alloc_order() before it starts
>> any allocations. The comment in the original said: "Check for integer
>> overflow" but the right hand side of the expression in the condition
>> is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
>> means 'length' can not exceed that value.
>>
>> This function may be used to replace vmalloc(unsigned long) for a
>> large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
>> it seems unreasonable that sgl_alloc_order() whose length type is
>> unsigned long long should be limited to 4 GB.
>>
>> Solutions to this issue were discussed by Jason Gunthorpe
>> <jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
>> version is base on a linux-scsi post by Jason titled: "Re:
>> [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.
> 
> You should link to lore here..
> 
> I think I prefer we just fix this so it doesn't have a problem in the
> first place - nothing needs the weird unsigned long long argument type:
> 
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index c8c3d675845c37..c39e19fa174bca 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -595,19 +595,17 @@ EXPORT_SYMBOL(sg_alloc_table_from_pages_segment);
>    *
>    * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
>    */
> -struct scatterlist *sgl_alloc_order(unsigned long long length,
> -				    unsigned int order, bool chainable,
> -				    gfp_t gfp, unsigned int *nent_p)
> +struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
> +				    bool chainable, gfp_t gfp, size_t *nent_p)
>   {
>   	struct scatterlist *sgl, *sg;
>   	struct page *page;
> -	unsigned int nent, nalloc;
> +	size_t nent, nalloc;
>   	u32 elem_len;
>   
> -	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
> -	/* Check for integer overflow */
> -	if (length > (nent << (PAGE_SHIFT + order)))
> -		return NULL;
> +	nent = length >> (PAGE_SHIFT + order);
> +	if (length % (1 << (PAGE_SHIFT + order)))

This might end up doing a modulo operation for divisor 0, if caller
specifies a too high order parameter, right?

To be on the safe side I'd prefer to use the following code instead:
     if (length & ((1 << (PAGE_SHIFT + order)) - 1))

Thank you
Bodo

> +		nent++;
>   	nalloc = nent;
>   	if (chainable) {
>   		/* Check for integer overflow */
> @@ -649,8 +647,7 @@ EXPORT_SYMBOL(sgl_alloc_order);
>    *
>    * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
>    */
> -struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
> -			      unsigned int *nent_p)
> +struct scatterlist *sgl_alloc(size_t length, gfp_t gfp, size_t *nent_p)
>   {
>   	return sgl_alloc_order(length, 0, false, gfp, nent_p);
>   }
> 
> Jason
Jason Gunthorpe Oct. 24, 2022, 5:32 p.m. UTC | #3
On Mon, Oct 24, 2022 at 04:32:30PM +0200, Bodo Stroesser wrote:
> > +struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
> > +				    bool chainable, gfp_t gfp, size_t *nent_p)
> >   {
> >   	struct scatterlist *sgl, *sg;
> >   	struct page *page;
> > -	unsigned int nent, nalloc;
> > +	size_t nent, nalloc;
> >   	u32 elem_len;
> > -	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
> > -	/* Check for integer overflow */
> > -	if (length > (nent << (PAGE_SHIFT + order)))
> > -		return NULL;
> > +	nent = length >> (PAGE_SHIFT + order);
> > +	if (length % (1 << (PAGE_SHIFT + order)))
> 
> This might end up doing a modulo operation for divisor 0, if caller
> specifies a too high order parameter, right?

If that happens then the first >> will be busted too and this is all
broken..

We assume the caller will pass a valid order paramter it seems, it is
not userspace controlled.

Jason
Douglas Gilbert Oct. 24, 2022, 7:58 p.m. UTC | #4
On 2022-10-24 13:32, Jason Gunthorpe wrote:
> On Mon, Oct 24, 2022 at 04:32:30PM +0200, Bodo Stroesser wrote:
>>> +struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
>>> +				    bool chainable, gfp_t gfp, size_t *nent_p)
>>>    {
>>>    	struct scatterlist *sgl, *sg;
>>>    	struct page *page;
>>> -	unsigned int nent, nalloc;
>>> +	size_t nent, nalloc;
>>>    	u32 elem_len;
>>> -	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
>>> -	/* Check for integer overflow */
>>> -	if (length > (nent << (PAGE_SHIFT + order)))
>>> -		return NULL;
>>> +	nent = length >> (PAGE_SHIFT + order);
>>> +	if (length % (1 << (PAGE_SHIFT + order)))
>>
>> This might end up doing a modulo operation for divisor 0, if caller
>> specifies a too high order parameter, right?
> 
> If that happens then the first >> will be busted too and this is all
> broken..
> 
> We assume the caller will pass a valid order paramter it seems, it is
> not userspace controlled.

Hi,
I have been trying to remove the unstated 4 GB limit on this existing
function, _not_ change its interface. In practice, I don't believe limiting
'nent' to 2^32-1 (i.e. 'unsigned int') is an unreasonable restriction and
that is checked at runtime in my patch. Since each 'entity' is at least
PAGE_SIZE bytes, that is at least 16 TB (with order==0).

The order_too_large issue I have tried to address with a pre-condition
(i.e. words in the description of the 'order' argument). That said, I did
notice:

include/linux/mmzone.h :
   #ifndef CONFIG_ARCH_FORCE_MAX_ORDER
   #define MAX_ORDER 11
   #else
   #define MAX_ORDER CONFIG_ARCH_FORCE_MAX_ORDER
   #endif


Changing the interface of sgl_alloc_order() will break these callers:
    drivers/scsi/ipr.c
    drivers/target/target_core_transport.c

due to change the type (and size on 64 bit machines) of the fifth argument
(i.e. 'size_t *nent_p').

If you want to change sgl_alloc_order()'s interface, shouldn't that be done
in a separate patch that also fixes the breakages it would otherwise cause?

Doug Gilbert
Bodo Stroesser Oct. 25, 2022, 10:46 a.m. UTC | #5
On 24.10.22 19:32, Jason Gunthorpe wrote:
> On Mon, Oct 24, 2022 at 04:32:30PM +0200, Bodo Stroesser wrote:
>>> +struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
>>> +				    bool chainable, gfp_t gfp, size_t *nent_p)
>>>    {
>>>    	struct scatterlist *sgl, *sg;
>>>    	struct page *page;
>>> -	unsigned int nent, nalloc;
>>> +	size_t nent, nalloc;
>>>    	u32 elem_len;
>>> -	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
>>> -	/* Check for integer overflow */
>>> -	if (length > (nent << (PAGE_SHIFT + order)))
>>> -		return NULL;
>>> +	nent = length >> (PAGE_SHIFT + order);
>>> +	if (length % (1 << (PAGE_SHIFT + order)))
>>
>> This might end up doing a modulo operation for divisor 0, if caller
>> specifies a too high order parameter, right?
> 
> If that happens then the first >> will be busted too and this is all
> broken..
> 
> We assume the caller will pass a valid order paramter it seems, it is
> not userspace controlled.
> 

If a too high order is passed, alloc_pages will just return NULL, so
in the old code sgl_alloc_order simply returns NULL. Using modulo op
changes it to possibly crashing the system.

Bodo
Jason Gunthorpe Oct. 25, 2022, 12:18 p.m. UTC | #6
On Tue, Oct 25, 2022 at 12:46:55PM +0200, Bodo Stroesser wrote:
> On 24.10.22 19:32, Jason Gunthorpe wrote:
> > On Mon, Oct 24, 2022 at 04:32:30PM +0200, Bodo Stroesser wrote:
> > > > +struct scatterlist *sgl_alloc_order(size_t length, unsigned int order,
> > > > +				    bool chainable, gfp_t gfp, size_t *nent_p)
> > > >    {
> > > >    	struct scatterlist *sgl, *sg;
> > > >    	struct page *page;
> > > > -	unsigned int nent, nalloc;
> > > > +	size_t nent, nalloc;
> > > >    	u32 elem_len;
> > > > -	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
> > > > -	/* Check for integer overflow */
> > > > -	if (length > (nent << (PAGE_SHIFT + order)))
> > > > -		return NULL;
> > > > +	nent = length >> (PAGE_SHIFT + order);
> > > > +	if (length % (1 << (PAGE_SHIFT + order)))
> > > 
> > > This might end up doing a modulo operation for divisor 0, if caller
> > > specifies a too high order parameter, right?
> > 
> > If that happens then the first >> will be busted too and this is all
> > broken..
> > 
> > We assume the caller will pass a valid order paramter it seems, it is
> > not userspace controlled.
> > 
> 
> If a too high order is passed, alloc_pages will just return NULL, so
> in the old code sgl_alloc_order simply returns NULL. Using modulo op
> changes it to possibly crashing the system.

That is fine, we are not trying to protect against buggy callers
passing a weird order 

We are trying to protect against correct callers passing a large
length because length is user controlled or something.

Jason
Jason Gunthorpe Oct. 25, 2022, 12:19 p.m. UTC | #7
On Mon, Oct 24, 2022 at 03:58:22PM -0400, Douglas Gilbert wrote:

> Changing the interface of sgl_alloc_order() will break these callers:
>    drivers/scsi/ipr.c
>    drivers/target/target_core_transport.c
> 
> due to change the type (and size on 64 bit machines) of the fifth argument
> (i.e. 'size_t *nent_p').

Yes, you'd fold those small fixes into this patch.

Jason
diff mbox series

Patch

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..0930755a756e 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -426,6 +426,7 @@  struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
 			      unsigned int *nent_p);
 void sgl_free_n_order(struct scatterlist *sgl, int nents, int order);
 void sgl_free_order(struct scatterlist *sgl, int order);
+/* Only use sgl_free() when order is 0 */
 void sgl_free(struct scatterlist *sgl);
 #endif /* CONFIG_SGL_ALLOC */
 
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c8c3d675845c..f633e2d669fe 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -585,13 +585,16 @@  EXPORT_SYMBOL(sg_alloc_table_from_pages_segment);
 #ifdef CONFIG_SGL_ALLOC
 
 /**
- * sgl_alloc_order - allocate a scatterlist and its pages
+ * sgl_alloc_order - allocate a scatterlist with equally sized elements each
+ *		     of which has 2^@order continuous pages
  * @length: Length in bytes of the scatterlist. Must be at least one
- * @order: Second argument for alloc_pages()
+ * @order:  Second argument for alloc_pages(). Each sgl element size will
+ *	    be (PAGE_SIZE*2^@order) bytes. @order must not exceed 16.
  * @chainable: Whether or not to allocate an extra element in the scatterlist
- *	for scatterlist chaining purposes
+ *	       for scatterlist chaining purposes
  * @gfp: Memory allocation flags
- * @nent_p: [out] Number of entries in the scatterlist that have pages
+ * @nent_p: [out] Number of entries in the scatterlist that have pages.
+ *		  Ignored if @nent_p is NULL.
  *
  * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
  */
@@ -601,14 +604,14 @@  struct scatterlist *sgl_alloc_order(unsigned long long length,
 {
 	struct scatterlist *sgl, *sg;
 	struct page *page;
-	unsigned int nent, nalloc;
+	uint64_t nent;
+	unsigned int nalloc;
 	u32 elem_len;
 
 	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
-	/* Check for integer overflow */
-	if (length > (nent << (PAGE_SHIFT + order)))
+	if (nent > UINT_MAX)
 		return NULL;
-	nalloc = nent;
+	nalloc = (unsigned int)nent;
 	if (chainable) {
 		/* Check for integer overflow */
 		if (nalloc + 1 < nalloc)
@@ -636,7 +639,7 @@  struct scatterlist *sgl_alloc_order(unsigned long long length,
 	}
 	WARN_ONCE(length, "length = %lld\n", length);
 	if (nent_p)
-		*nent_p = nent;
+		*nent_p = (unsigned int)nent;
 	return sgl;
 }
 EXPORT_SYMBOL(sgl_alloc_order);