Message ID | 163184741776.29351.3565418361661850328.stgit@noble.brown (mailing list archive)
State      | New
Series     | congestion_wait() and GFP_NOFAIL
I'm top-posting to cc Jesper with full context of the patch. I don't have a
problem with this patch other than the Fixes: being a bit marginal. I should
have acked as Mel Gorman <mgorman@suse.de>, and the @gfp in the comment
should have been @gfp_mask.

However, an assumption the API design made was that it should fail fast if
memory is not quickly available but have at least one page in the array. I
don't think the network use case cares about the situation where the array
is already populated, but I'd like Jesper to have the opportunity to think
about it. It's possible he would prefer it to be explicit and the check to
become

	(!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account))

to state that __GFP_NOFAIL users are willing to take a potential latency
penalty if the array is already partially populated, but !__GFP_NOFAIL
users would prefer fail-fast behaviour. I'm on the fence because, while I
wrote the implementation, it was based on other people's requirements.

On Fri, Sep 17, 2021 at 12:56:57PM +1000, NeilBrown wrote:
> When alloc_pages_bulk_array() is called on an array that is partially
> allocated, the level of effort to get a single page is less than when
> the array was completely unallocated. This behaviour is inconsistent,
> but now fixed. One effect of this is that __GFP_NOFAIL will not ensure
> at least one page is allocated.
>
> Also clarify the expected success rate. __alloc_pages_bulk() will
> allocate one page according to @gfp, and may allocate more if that can
> be done cheaply. It is assumed that the caller values cheap allocation
> where possible and may decide to use what it has got, or to call again
> to get more.
>
> Acked-by: Mel Gorman <mgorman@suse.com>
> Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  mm/page_alloc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b37435c274cf..aa51016e49c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5191,6 +5191,11 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
>   * is the maximum number of pages that will be stored in the array.
>   *
>   * Returns the number of pages on the list or array.
> + *
> + * At least one page will be allocated if that is possible while
> + * remaining consistent with @gfp. Extra pages up to the requested
> + * total will be allocated opportunistically when doing so is
> + * significantly cheaper than having the caller repeat the request.
>   */
>  unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  			nodemask_t *nodemask, int nr_pages,
> @@ -5292,7 +5297,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  						pcp, pcp_list);
>  		if (unlikely(!page)) {
>  			/* Try and get at least one page */
> -			if (!nr_populated)
> +			if (!nr_account)
>  				goto failed_irq;
>  			break;
>  		}
On Sat, 18 Sep 2021, Mel Gorman wrote:
> I'm top-posting to cc Jesper with full context of the patch. I don't
> have a problem with this patch other than the Fixes: being a bit
> marginal. I should have acked as Mel Gorman <mgorman@suse.de>, and the
> @gfp in the comment should have been @gfp_mask.
>
> However, an assumption the API design made was that it should fail fast
> if memory is not quickly available but have at least one page in the
> array. I don't think the network use case cares about the situation where
> the array is already populated, but I'd like Jesper to have the opportunity
> to think about it. It's possible he would prefer it to be explicit and the
> check to become
> 	(!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account))
> to state that __GFP_NOFAIL users are willing to take a potential latency
> penalty if the array is already partially populated, but !__GFP_NOFAIL
> users would prefer fail-fast behaviour. I'm on the fence because, while
> I wrote the implementation, it was based on other people's requirements.

I can see that it could be desirable to not try too hard when we already
have pages allocated, but maybe the best way to achieve that is for the
caller to clear __GFP_RECLAIM in that case.

Alternately, callers that really want the __GFP_RECLAIM and __GFP_NOFAIL
flags to be honoured could ensure that the array passed in is empty. That
wouldn't be difficult (for current callers).

In either case, the documentation should make it clear which flags are
honoured when.

Let's see what Jesper has to say.

Thanks,
NeilBrown

>
> On Fri, Sep 17, 2021 at 12:56:57PM +1000, NeilBrown wrote:
> > When alloc_pages_bulk_array() is called on an array that is partially
> > allocated, the level of effort to get a single page is less than when
> > the array was completely unallocated. This behaviour is inconsistent,
> > but now fixed. One effect of this is that __GFP_NOFAIL will not ensure
> > at least one page is allocated.
> >
> > Also clarify the expected success rate. __alloc_pages_bulk() will
> > allocate one page according to @gfp, and may allocate more if that can
> > be done cheaply. It is assumed that the caller values cheap allocation
> > where possible and may decide to use what it has got, or to call again
> > to get more.
> >
> > Acked-by: Mel Gorman <mgorman@suse.com>
> > Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  mm/page_alloc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index b37435c274cf..aa51016e49c5 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5191,6 +5191,11 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
> >   * is the maximum number of pages that will be stored in the array.
> >   *
> >   * Returns the number of pages on the list or array.
> > + *
> > + * At least one page will be allocated if that is possible while
> > + * remaining consistent with @gfp. Extra pages up to the requested
> > + * total will be allocated opportunistically when doing so is
> > + * significantly cheaper than having the caller repeat the request.
> >   */
> >  unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
> >  			nodemask_t *nodemask, int nr_pages,
> > @@ -5292,7 +5297,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
> >  						pcp, pcp_list);
> >  		if (unlikely(!page)) {
> >  			/* Try and get at least one page */
> > -			if (!nr_populated)
> > +			if (!nr_account)
> >  				goto failed_irq;
> >  			break;
> >  		}
On 9/17/21 16:42, Mel Gorman wrote:
> I'm top-posting to cc Jesper with full context of the patch. I don't
> have a problem with this patch other than the Fixes: being a bit
> marginal. I should have acked as Mel Gorman <mgorman@suse.de>, and the
> @gfp in the comment should have been @gfp_mask.
>
> However, an assumption the API design made was that it should fail fast
> if memory is not quickly available but have at least one page in the
> array. I don't think the network use case cares about the situation where
> the array is already populated, but I'd like Jesper to have the opportunity
> to think about it. It's possible he would prefer it to be explicit and the
> check to become
> 	(!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account)) to

Note that AFAICS nr_populated is an incomplete piece of information, as we
initially only count pages in the page_array as nr_populated up to the
first NULL pointer. So even before Neil's patch we could decide to allocate
even if there are pre-existing pages, but placed later in the array. Which
could be rather common if the array consumer starts from index 0?

So with Neil's patch this at least becomes consistent, while the check
suggested by Mel leaves in place the weird dependency on where the
pre-existing pages appear in the page_array.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..aa51016e49c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5191,6 +5191,11 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
  * is the maximum number of pages that will be stored in the array.
  *
  * Returns the number of pages on the list or array.
+ *
+ * At least one page will be allocated if that is possible while
+ * remaining consistent with @gfp. Extra pages up to the requested
+ * total will be allocated opportunistically when doing so is
+ * significantly cheaper than having the caller repeat the request.
  */
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 			nodemask_t *nodemask, int nr_pages,
@@ -5292,7 +5297,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 						pcp, pcp_list);
 		if (unlikely(!page)) {
 			/* Try and get at least one page */
-			if (!nr_populated)
+			if (!nr_account)
 				goto failed_irq;
 			break;
 		}