Message ID | 162497449506.16614.7781489905877008435.stgit@klimt.1015granger.net (mailing list archive)
---|---
State | New
Series | [v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value
On Tue, Jun 29, 2021 at 09:48:15AM -0400, Chuck Lever wrote:
> The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
> bounds check after checking populated elements") was possibly
> confused by the mixture of return values throughout the function.
>
> The API contract is clear that the function "Returns the number of
> pages on the list or array." It does not list zero as a unique
> return value with a special meaning. Therefore zero is a plausible
> return value only if @nr_pages is zero or less.
>
> Clean up the return logic to make it clear that the returned value
> is always the total number of pages in the array/list, not the
> number of pages that were allocated during this call.
>
> The only change in behavior with this patch is the value returned
> if prepare_alloc_pages() fails. To match the API contract, the
> number of pages currently in the array/list is returned in this
> case.
>
> The call site in __page_pool_alloc_pages_slow() also seems to be
> confused on this matter. It should be attended to by someone who
> is familiar with that code.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>
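[Editor's sketch: a minimal caller illustrating the contract under
discussion. refill_pages() and its logic are illustrative only, not
part of the patch or of any kernel API; alloc_pages_bulk_array() is
the real bulk-allocator entry point.]

/* Hypothetical caller: under the documented contract, the return
 * value of alloc_pages_bulk_array() is the total number of populated
 * entries in @array, including entries populated before the call --
 * not the number of pages allocated by this call alone.
 */
static bool refill_pages(struct page **array, unsigned long want)
{
	unsigned long total;

	/* NULL entries mark the holes the allocator may fill. */
	total = alloc_pages_bulk_array(GFP_KERNEL, want, array);

	/* Zero therefore means the array holds no pages at all;
	 * a partial refill still returns a non-zero total.
	 */
	return total == want;
}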
On 29/06/2021 15.48, Chuck Lever wrote:
> The call site in __page_pool_alloc_pages_slow() also seems to be
> confused on this matter. It should be attended to by someone who
> is familiar with that code.

I don't think we need a fix for __page_pool_alloc_pages_slow(), as the
array is guaranteed to be empty.

But a fix would look like this:

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c137ce308c27..1b04538a3da3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (unlikely(pp_order))
 		return __page_pool_alloc_page_order(pool, gfp);

 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
-	if (unlikely(pool->alloc.count > 0))
+	if (unlikely(pool->alloc.count > 0)) // ^^^^^^^^^^^^^^^^^^^^^^
 		return pool->alloc.cache[--pool->alloc.count];

 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);

+	/* bulk API ret value also count existing pages, but array is empty */
 	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;

 	/* Pages have been filled into alloc.cache array, but count is zero and
 	 * page element have not been (possibly) DMA mapped.
 	 */
-	for (i = 0; i < nr_pages; i++) {
+	for (i = pool->alloc.count; i < nr_pages; i++) {
 		page = pool->alloc.cache[i];
 		if ((pp_flags & PP_FLAG_DMA_MAP) &&
 		    unlikely(!page_pool_dma_map(pool, page))) {
 			put_page(page);
On 29/06/2021 17.33, Mel Gorman wrote:
> On Tue, Jun 29, 2021 at 09:48:15AM -0400, Chuck Lever wrote:
>> The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
>> bounds check after checking populated elements") was possibly
>> confused by the mixture of return values throughout the function.
>>
>> The API contract is clear that the function "Returns the number of
>> pages on the list or array." It does not list zero as a unique
>> return value with a special meaning. Therefore zero is a plausible
>> return value only if @nr_pages is zero or less.
>>
>> Clean up the return logic to make it clear that the returned value
>> is always the total number of pages in the array/list, not the
>> number of pages that were allocated during this call.
>>
>> The only change in behavior with this patch is the value returned
>> if prepare_alloc_pages() fails. To match the API contract, the
>> number of pages currently in the array/list is returned in this
>> case.
>>
>> The call site in __page_pool_alloc_pages_slow() also seems to be
>> confused on this matter. It should be attended to by someone who
>> is familiar with that code.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> Acked-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
> On Jun 29, 2021, at 12:01 PM, Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>
> On 29/06/2021 15.48, Chuck Lever wrote:
>
>> The call site in __page_pool_alloc_pages_slow() also seems to be
>> confused on this matter. It should be attended to by someone who
>> is familiar with that code.
>
> I don't think we need a fix for __page_pool_alloc_pages_slow(), as
> the array is guaranteed to be empty.
>
> But a fix would look like this:
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index c137ce308c27..1b04538a3da3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> 	if (unlikely(pp_order))
> 		return __page_pool_alloc_page_order(pool, gfp);
>
> 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
> -	if (unlikely(pool->alloc.count > 0))
> +	if (unlikely(pool->alloc.count > 0)) // ^^^^^^^^^^^^^^^^^^^^^^
> 		return pool->alloc.cache[--pool->alloc.count];
>
> 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
> 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> +	/* bulk API ret value also count existing pages, but array is empty */
> 	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
> 	if (unlikely(!nr_pages))
> 		return NULL;

Thanks, this check makes sense to me now.

> 	/* Pages have been filled into alloc.cache array, but count is zero and
> 	 * page element have not been (possibly) DMA mapped.
> 	 */
> -	for (i = 0; i < nr_pages; i++) {
> +	for (i = pool->alloc.count; i < nr_pages; i++) {
> 		page = pool->alloc.cache[i];
> 		if ((pp_flags & PP_FLAG_DMA_MAP) &&
> 		    unlikely(!page_pool_dma_map(pool, page))) {
> 			put_page(page);

--
Chuck Lever
On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
> On 29/06/2021 15.48, Chuck Lever wrote:
>
> > The call site in __page_pool_alloc_pages_slow() also seems to be
> > confused on this matter. It should be attended to by someone who
> > is familiar with that code.
>
> I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array
> is guaranteed to be empty.
>
> But a fix would look like this:
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index c137ce308c27..1b04538a3da3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> 	if (unlikely(pp_order))
> 		return __page_pool_alloc_page_order(pool, gfp);
>
> 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
> -	if (unlikely(pool->alloc.count > 0))
> +	if (unlikely(pool->alloc.count > 0)) // ^^^^^^^^^^^^^^^^^^^^^^
> 		return pool->alloc.cache[--pool->alloc.count];
>
> 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
> 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> +	/* bulk API ret value also count existing pages, but array is empty */
> 	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
> 	if (unlikely(!nr_pages))
> 		return NULL;
>
> 	/* Pages have been filled into alloc.cache array, but count is zero and
> 	 * page element have not been (possibly) DMA mapped.
> 	 */
> -	for (i = 0; i < nr_pages; i++) {
> +	for (i = pool->alloc.count; i < nr_pages; i++) {

That last part would break as the loop is updating pool->alloc.count.
Just setting pool->alloc.count = nr_pages would break if PP_FLAG_DMA_MAP
was set and page_pool_dma_map failed. Right?
On 30/06/2021 08.58, Mel Gorman wrote:
> On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
>> On 29/06/2021 15.48, Chuck Lever wrote:
>>
>>> The call site in __page_pool_alloc_pages_slow() also seems to be
>>> confused on this matter. It should be attended to by someone who
>>> is familiar with that code.
>>
>> I don't think we need a fix for __page_pool_alloc_pages_slow(), as the
>> array is guaranteed to be empty.
>>
>> But a fix would look like this:
>>
>> [...]
>>
>> -	for (i = 0; i < nr_pages; i++) {
>> +	for (i = pool->alloc.count; i < nr_pages; i++) {
>
> That last part would break as the loop is updating pool->alloc.count.

The last part "i = pool->alloc.count" probably is a bad idea.

> Just setting pool->alloc.count = nr_pages would break if PP_FLAG_DMA_MAP
> was set and page_pool_dma_map failed. Right?

Yes, this loop is calling page_pool_dma_map(), and if that fails we
don't advance pool->alloc.count.

--Jesper
On Wed, Jun 30, 2021 at 01:22:24PM +0200, Jesper Dangaard Brouer wrote:
>
> On 30/06/2021 08.58, Mel Gorman wrote:
> > On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
> > > On 29/06/2021 15.48, Chuck Lever wrote:
> > >
> > > > The call site in __page_pool_alloc_pages_slow() also seems to be
> > > > confused on this matter. It should be attended to by someone who
> > > > is familiar with that code.
> > >
> > > I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array
> > > is guaranteed to be empty.
> > >
> > > But a fix would look like this:
> > >
> > > [...]
> > >
> > > -	for (i = 0; i < nr_pages; i++) {
> > > +	for (i = pool->alloc.count; i < nr_pages; i++) {
> >
> > That last part would break as the loop is updating pool->alloc.count.
>
> The last part "i = pool->alloc.count" probably is a bad idea.

It is because it confuses the context: either alloc.count is guaranteed
zero or it's not. Initialised as zero, it's fairly clear that a) it's
always zero and b) the way i and alloc.count work means that the array
element pointing to a freed page is either ignored or overwritten by a
valid page pointer in a later loop iteration.
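[Editor's sketch: a stand-alone model of the compaction behaviour Mel
describes. compact_mapped(), dma_map_ok() and drop_page() are
hypothetical stand-ins for the real page_pool loop,
page_pool_dma_map() and put_page().]

/* Stand-alone model, not the kernel code. The read index i and the
 * write index count share the same array: survivors are compacted to
 * cache[count++], failed entries are dropped without advancing count,
 * so i must start at 0 for every filled entry to be inspected exactly
 * once. A slot holding a dropped page is either ignored (i never
 * revisits it) or overwritten by a later surviving page.
 */
static size_t compact_mapped(void **cache, size_t nr_pages,
			     int (*dma_map_ok)(void *),
			     void (*drop_page)(void *))
{
	size_t i, count = 0;

	for (i = 0; i < nr_pages; i++) {
		void *page = cache[i];

		if (!dma_map_ok(page)) {
			drop_page(page);	/* count stays put */
			continue;
		}
		cache[count++] = page;		/* compact survivors */
	}
	return count;	/* plays the role of pool->alloc.count */
}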
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e7af86e1a312..49eb2e134f9d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5059,7 +5059,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	int nr_populated = 0;

 	if (unlikely(nr_pages <= 0))
-		return 0;
+		goto out;

 	/*
 	 * Skip populated array elements to determine if any pages need
@@ -5070,7 +5070,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,

 	/* Already populated array? */
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return nr_populated;
+		goto out;

 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
@@ -5080,7 +5080,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
 	if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask, &ac, &alloc_gfp, &alloc_flags))
-		return 0;
+		goto out;
 	gfp = alloc_gfp;

 	/* Find an allowed local zone that meets the low watermark. */
@@ -5153,6 +5153,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,

 	local_irq_restore(flags);

+out:
 	return nr_populated;

 failed_irq:
@@ -5168,7 +5169,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		nr_populated++;
 	}

-	return nr_populated;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(__alloc_pages_bulk);
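[Editor's sketch: the single-exit shape the patch converges on, as a
minimal self-contained function. bulk_fill() and its body are
illustrative only, not the kernel function.]

/* Illustrative single-exit sketch, not the kernel code: every
 * bail-out funnels through "out" and returns the running total,
 * so zero can only mean the array holds no pages at all.
 */
static long bulk_fill(void **array, long nr_pages)
{
	long nr_populated = 0;

	if (nr_pages <= 0)
		goto out;		/* total is still zero */

	/* Skip slots that are already populated. */
	while (nr_populated < nr_pages && array[nr_populated])
		nr_populated++;

	if (nr_populated == nr_pages)
		goto out;		/* array already full */

	/* ... allocation work advances nr_populated ... */
out:
	return nr_populated;		/* always the total in the array */
}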
The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
bounds check after checking populated elements") was possibly
confused by the mixture of return values throughout the function.

The API contract is clear that the function "Returns the number of
pages on the list or array." It does not list zero as a unique
return value with a special meaning. Therefore zero is a plausible
return value only if @nr_pages is zero or less.

Clean up the return logic to make it clear that the returned value
is always the total number of pages in the array/list, not the
number of pages that were allocated during this call.

The only change in behavior with this patch is the value returned
if prepare_alloc_pages() fails. To match the API contract, the
number of pages currently in the array/list is returned in this
case.

The call site in __page_pool_alloc_pages_slow() also seems to be
confused on this matter. It should be attended to by someone who
is familiar with that code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 mm/page_alloc.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)