Message ID | 20190524040633.16854-3-nicoleotsuka@gmail.com (mailing list archive)
---|---
State | New, archived
Series | Optimize dma_*_from_contiguous calls
On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> The addresses within a single page are always contiguous, so it's
> not really necessary to allocate one single page from the CMA area.
> Since the CMA area has a limited predefined size, it may run out
> of space in heavy use cases, where quite a lot of CMA pages may
> end up being allocated for single pages.
>
> However, there is also a concern that a device might care where a
> page comes from -- it might expect the page from the CMA area and
> act differently if the page doesn't.

How does a device know, after this call, if a CMA area was used?  From the
patches I figured a device should not care.

> This patch uses the fallback alloc_pages path, instead of one-page
> size allocations from the global CMA area, in case a device does
> not have its own CMA area. This'd save space in the global CMA area
> for larger CMA allocations, and also reduce the CMA fragmentation
> that results from trivial allocations.
>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
>  kernel/dma/contiguous.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 21f39a6cb04f..6914b92d5c88 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
>   * This function allocates contiguous memory buffer for specified device. It
>   * first tries to use device specific contiguous memory area if available or
>   * the default global one, then tries a fallback allocation of normal pages.
> + *
> + * Note that it bypasses one-page allocations from the global area, as
> + * the addresses within one page are always contiguous, so there is no
> + * need to waste CMA pages on them; this also helps reduce fragmentation.
>   */
>  struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
>  {
>  	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
>  	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>  	size_t align = get_order(PAGE_ALIGN(size));
> -	struct cma *cma = dev_get_cma_area(dev);
>  	struct page *page = NULL;
> +	struct cma *cma = NULL;
> +
> +	if (dev && dev->cma_area)
> +		cma = dev->cma_area;
> +	else if (count > 1)
> +		cma = dma_contiguous_default_area;

Doesn't dev_get_cma_area() already do this?

Ira

>
>  	/* CMA can be used only in the context which permits sleeping */
>  	if (cma && gfpflags_allow_blocking(gfp)) {
> --
> 2.17.1
>
Hi Ira,

On Fri, May 24, 2019 at 09:16:19AM -0700, Ira Weiny wrote:
> On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> > The addresses within a single page are always contiguous, so it's
> > not really necessary to allocate one single page from the CMA area.
> > Since the CMA area has a limited predefined size, it may run out
> > of space in heavy use cases, where quite a lot of CMA pages may
> > end up being allocated for single pages.
> >
> > However, there is also a concern that a device might care where a
> > page comes from -- it might expect the page from the CMA area and
> > act differently if the page doesn't.
>
> How does a device know, after this call, if a CMA area was used?  From the
> patches I figured a device should not care.

A device doesn't know. But that doesn't mean a device won't care at
all. There was a concern from Robin and Christoph about a corner case
where a device might act differently if the memory isn't in its own
CMA region. That's why we still let it use its device-specific CMA
area.

> > +	if (dev && dev->cma_area)
> > +		cma = dev->cma_area;
> > +	else if (count > 1)
> > +		cma = dma_contiguous_default_area;
>
> Doesn't dev_get_cma_area() already do this?

Partially, yes. But unwrapping it makes the program flow clearer, in
my opinion. Actually, I should have mentioned that this approach was
also suggested by Christoph. Otherwise, it would need an override
like:

	cma = dev_get_cma_area(dev);
	if (count <= 1 && cma == dma_contiguous_default_area)
		cma = NULL;

Which doesn't look that bad though..

Thanks
Nicolin
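[The two variants Nicolin compares can be sketched in plain userspace C. The struct and area names below are illustrative stand-ins for the kernel's `struct cma`, `dev->cma_area`, and `dma_contiguous_default_area`, not kernel code; the point is only that both forms select the same area.]

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct cma and the global default area. */
struct cma { const char *name; };
static struct cma default_area = { "default" };

/* Variant 1: the unwrapped checks from the patch. A device-specific
 * area is always used; the global area only for multi-page requests. */
static struct cma *pick_unwrapped(struct cma *dev_area, size_t count)
{
	if (dev_area)
		return dev_area;
	if (count > 1)
		return &default_area;
	return NULL;
}

/* Variant 2: a dev_get_cma_area()-style lookup, then an override that
 * nulls the global area for single-page requests. */
static struct cma *pick_override(struct cma *dev_area, size_t count)
{
	struct cma *cma = dev_area ? dev_area : &default_area;

	if (count <= 1 && cma == &default_area)
		cma = NULL;
	return cma;
}
```

Both variants agree for every combination of device area and page count; the difference is purely which form reads more clearly in the kernel source.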
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 21f39a6cb04f..6914b92d5c88 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
  * This function allocates contiguous memory buffer for specified device. It
  * first tries to use device specific contiguous memory area if available or
  * the default global one, then tries a fallback allocation of normal pages.
+ *
+ * Note that it bypasses one-page allocations from the global area, as
+ * the addresses within one page are always contiguous, so there is no
+ * need to waste CMA pages on them; this also helps reduce fragmentation.
  */
 struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 {
 	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
 	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	size_t align = get_order(PAGE_ALIGN(size));
-	struct cma *cma = dev_get_cma_area(dev);
 	struct page *page = NULL;
+	struct cma *cma = NULL;
+
+	if (dev && dev->cma_area)
+		cma = dev->cma_area;
+	else if (count > 1)
+		cma = dma_contiguous_default_area;
 
 	/* CMA can be used only in the context which permits sleeping */
 	if (cma && gfpflags_allow_blocking(gfp)) {
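[In userspace terms, the allocation flow after this patch can be modeled as the sketch below. `alloc_source` is a made-up name; it stands in for the combined behavior of dma_alloc_contiguous() and its caller's normal-page fallback, under the simplifying assumption that CMA is only tried when the gfp flags permit sleeping.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum source { SRC_DEV_CMA, SRC_GLOBAL_CMA, SRC_NORMAL_PAGES };

/* Models where memory would come from after the patch: the device's
 * own CMA area if it has one, the global CMA area only for multi-page
 * sleepable requests, and the normal-page fallback otherwise. */
static enum source alloc_source(bool has_dev_cma, size_t count, bool may_sleep)
{
	if (!may_sleep)
		return SRC_NORMAL_PAGES;   /* CMA needs a sleepable context */
	if (has_dev_cma)
		return SRC_DEV_CMA;        /* device-specific area still honored */
	if (count > 1)
		return SRC_GLOBAL_CMA;     /* multi-page: worth a CMA allocation */
	return SRC_NORMAL_PAGES;           /* single page: skip the global area */
}
```

The key change is the last two branches: before the patch, a single-page request with no device area would have gone to `SRC_GLOBAL_CMA` as well.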
The addresses within a single page are always contiguous, so it's
not really necessary to allocate one single page from the CMA area.
Since the CMA area has a limited predefined size, it may run out
of space in heavy use cases, where quite a lot of CMA pages may
end up being allocated for single pages.

However, there is also a concern that a device might care where a
page comes from -- it might expect the page from the CMA area and
act differently if the page doesn't.

This patch uses the fallback alloc_pages path, instead of one-page
size allocations from the global CMA area, in case a device does
not have its own CMA area. This'd save space in the global CMA area
for larger CMA allocations, and also reduce the CMA fragmentation
that results from trivial allocations.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
 kernel/dma/contiguous.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
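[The "single page" condition above corresponds to count == 1, where count is derived from the request size exactly as in the function. A userspace sketch of that arithmetic, assuming 4 KiB pages (the kernel's PAGE_SHIFT varies by architecture):]

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12UL                 /* assumed: 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Mirrors: size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT; */
static size_t page_count(size_t size)
{
	return PAGE_ALIGN(size) >> PAGE_SHIFT;
}
```

Any request up to PAGE_SIZE bytes yields count == 1 and therefore, after the patch, skips the global CMA area; 4097 bytes already needs two pages and still goes through CMA.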