Message ID | 20200723131344.41472-2-song.bao.hua@hisilicon.com (mailing list archive)
---|---
State | New, archived
Series | make dma_alloc_coherent NUMA-aware by per-NUMA CMA
On Fri, Jul 24, 2020 at 01:13:43AM +1200, Barry Song wrote:
> +config CMA_PERNUMA_SIZE_MBYTES
> +	int "Size in Mega Bytes for per-numa CMA areas"
> +	depends on NUMA
> +	default 16 if ARM64
> +	default 0
> +	help
> +	  Defines the size (in MiB) of the per-numa memory area for Contiguous
> +	  Memory Allocator. Every numa node will get a separate CMA with this
> +	  size. If the size of 0 is selected, per-numa CMA is disabled.

I'm still not a fan of the config option.  You can just hardcode the
value in CONFIG_CMDLINE based on the kernel parameter.  Also I wonder
if a way to expose this in the device tree might be useful, but people
more familiar with the device tree and the arm code will have to chime
in on that.

>  struct cma *dma_contiguous_default_area;
> +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
>
>  /*
>   * Default global CMA area size can be defined in kernel's .config.
> @@ -44,6 +51,8 @@ struct cma *dma_contiguous_default_area;
>   */
>  static const phys_addr_t size_bytes __initconst =
>  			(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
> +static phys_addr_t pernuma_size_bytes __initdata =
> +			(phys_addr_t)CMA_SIZE_PERNUMA_MBYTES * SZ_1M;
>  static phys_addr_t size_cmdline __initdata = -1;
>  static phys_addr_t base_cmdline __initdata;
>  static phys_addr_t limit_cmdline __initdata;
> @@ -69,6 +78,13 @@ static int __init early_cma(char *p)
>  }
>  early_param("cma", early_cma);
>
> +static int __init early_pernuma_cma(char *p)
> +{
> +	pernuma_size_bytes = memparse(p, &p);
> +	return 0;
> +}
> +early_param("pernuma_cma", early_pernuma_cma);
> +
>  #ifdef CONFIG_CMA_SIZE_PERCENTAGE
>
>  static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
> @@ -96,6 +112,33 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
>
>  #endif
>
> +void __init dma_pernuma_cma_reserve(void)
> +{
> +	int nid;
> +
> +	if (!pernuma_size_bytes)
> +		return;
> +
> +	for_each_node_state(nid, N_MEMORY) {
> +		int ret;
> +		char name[20];
> +
> +		snprintf(name, sizeof(name), "pernuma%d", nid);
> +		ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
> +						 0, false, name,
> +						 &dma_contiguous_pernuma_area[nid],
> +						 nid);

This adds a > 80 char line.

>  struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
>  {
> +	int nid = dev_to_node(dev);
> +
>  	/* CMA can be used only in the context which permits sleeping */
>  	if (!gfpflags_allow_blocking(gfp))
>  		return NULL;
>  	if (dev->cma_area)
>  		return cma_alloc_aligned(dev->cma_area, size, gfp);
> -	if (size <= PAGE_SIZE || !dma_contiguous_default_area)
> +	if (size <= PAGE_SIZE)
>  		return NULL;
> +
> +	if ((nid != NUMA_NO_NODE) && !(gfp & (GFP_DMA | GFP_DMA32))) {

No need for the braces around the nid check.

> +		struct cma *cma = dma_contiguous_pernuma_area[nid];
> +		struct page *page;
> +
> +		if (cma) {
> +			page = cma_alloc_aligned(cma, size, gfp);
> +			if (page)
> +				return page;
> +		}
> +	}
> +
>  	return cma_alloc_aligned(dma_contiguous_default_area, size, gfp);

This seems to have lost the dma_contiguous_default_area NULL check.

> +	/* if dev has its own cma, free page from there */
> +	if (dev->cma_area) {
> +		if (cma_release(dev->cma_area, page, PAGE_ALIGN(size) >> PAGE_SHIFT))
> +			return;

Another overly long line.
> +	} else {
> +		/*
> +		 * otherwise, page is from either per-numa cma or default cma
> +		 */
> +		if (cma_release(dma_contiguous_pernuma_area[page_to_nid(page)],
> +				page, PAGE_ALIGN(size) >> PAGE_SHIFT))
> +			return;
> +
> +		if (cma_release(dma_contiguous_default_area, page,
> +				PAGE_ALIGN(size) >> PAGE_SHIFT))
> +			return;
> +	}

I'd introduce a count variable for the value of "PAGE_ALIGN(size) >>
PAGE_SHIFT" to clean all this up a bit.

Also please add a CONFIG_PERCPU_DMA_CMA config variable so that we
don't build this code for the vast majority of users that don't need it.
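For illustration, the cleanup being suggested might look like the sketch
below (untested; the local variable name "count" is only a placeholder):

void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
{
	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;

	/* if dev has its own cma, free page from there */
	if (dev->cma_area) {
		if (cma_release(dev->cma_area, page, count))
			return;
	} else {
		/*
		 * otherwise, page is from either per-numa cma or default cma
		 */
		if (cma_release(dma_contiguous_pernuma_area[page_to_nid(page)],
				page, count))
			return;
		if (cma_release(dma_contiguous_default_area, page, count))
			return;
	}

	/* not in any cma, free from buddy */
	__free_pages(page, get_order(size));
}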
> -----Original Message-----
> From: Christoph Hellwig [mailto:hch@lst.de]
> Sent: Tuesday, July 28, 2020 11:53 PM
> Subject: Re: [PATCH v4 1/2] dma-direct: provide the ability to reserve
> per-numa CMA
>
> On Fri, Jul 24, 2020 at 01:13:43AM +1200, Barry Song wrote:
> > +config CMA_PERNUMA_SIZE_MBYTES
> > +	int "Size in Mega Bytes for per-numa CMA areas"
> > +	depends on NUMA
> > +	default 16 if ARM64
> > +	default 0
> > +	help
> > +	  Defines the size (in MiB) of the per-numa memory area for Contiguous
> > +	  Memory Allocator. Every numa node will get a separate CMA with this
> > +	  size. If the size of 0 is selected, per-numa CMA is disabled.
>
> I'm still not a fan of the config option.  You can just hardcode the
> value in CONFIG_CMDLINE based on the kernel parameter.  Also I wonder

I am sorry I haven't got your point yet. Do you mean something like the
below?

arch/arm64/Kconfig:
 config CMDLINE
	string "Default kernel command string"
-	default ""
+	default "pernuma_cma=16M"
	help
	  Provide a set of default command-line options at build time by
	  entering them here. As a minimum, you should specify the
	  root device (e.g. root=/dev/nfs).

Some background on the current code: Linux distributions can usually use
arch/arm64/configs/defconfig directly to build the kernel, and the cmdline
can easily be ignored when Linux distributions are generated.

> if a way to expose this in the device tree might be useful, but people
> more familiar with the device tree and the arm code will have to chime
> in on that.

Not sure if it is a useful use case, as we are using ACPI rather than
device tree since it is an ARM64 server with NUMA.

> > struct cma *dma_contiguous_default_area;
> > +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
> >
> > /*
> >  * Default global CMA area size can be defined in kernel's .config.
> > @@ -44,6 +51,8 @@ struct cma *dma_contiguous_default_area;
> >  */
> > static const phys_addr_t size_bytes __initconst =
> > 		(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
> > +static phys_addr_t pernuma_size_bytes __initdata =
> > +		(phys_addr_t)CMA_SIZE_PERNUMA_MBYTES * SZ_1M;
> > static phys_addr_t size_cmdline __initdata = -1;
> > static phys_addr_t base_cmdline __initdata;
> > static phys_addr_t limit_cmdline __initdata;
> > @@ -69,6 +78,13 @@ static int __init early_cma(char *p)
> > }
> > early_param("cma", early_cma);
> >
> > +static int __init early_pernuma_cma(char *p)
> > +{
> > +	pernuma_size_bytes = memparse(p, &p);
> > +	return 0;
> > +}
> > +early_param("pernuma_cma", early_pernuma_cma);
> > +
> > #ifdef CONFIG_CMA_SIZE_PERCENTAGE
> >
> > static phys_addr_t __init __maybe_unused
> cma_early_percent_memory(void)
> > @@ -96,6 +112,33 @@ static inline __maybe_unused phys_addr_t
> cma_early_percent_memory(void)
> >
> > #endif
> >
> > +void __init dma_pernuma_cma_reserve(void)
> > +{
> > +	int nid;
> > +
> > +	if (!pernuma_size_bytes)
> > +		return;
> > +
> > +	for_each_node_state(nid, N_MEMORY) {
> > +		int ret;
> > +		char name[20];
> > +
> > +		snprintf(name, sizeof(name), "pernuma%d", nid);
> > +		ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
> > +					0, false, name,
> > +					&dma_contiguous_pernuma_area[nid],
> > +					nid);
>
> This adds a > 80 char line.

Will refine.

> > struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t
> gfp)
> > {
> > +	int nid = dev_to_node(dev);
> > +
> > 	/* CMA can be used only in the context which permits sleeping */
> > 	if (!gfpflags_allow_blocking(gfp))
> > 		return NULL;
> > 	if (dev->cma_area)
> > 		return cma_alloc_aligned(dev->cma_area, size, gfp);
> > -	if (size <= PAGE_SIZE || !dma_contiguous_default_area)
> > +	if (size <= PAGE_SIZE)
> > 		return NULL;
> > +
> > +	if ((nid != NUMA_NO_NODE) && !(gfp & (GFP_DMA | GFP_DMA32))) {
>
> No need for the braces around the nid check.

Will refine.

> > +		struct cma *cma = dma_contiguous_pernuma_area[nid];
> > +		struct page *page;
> > +
> > +		if (cma) {
> > +			page = cma_alloc_aligned(cma, size, gfp);
> > +			if (page)
> > +				return page;
> > +		}
> > +	}
> > +
> > 	return cma_alloc_aligned(dma_contiguous_default_area, size, gfp);
>
> This seems to have lost the dma_contiguous_default_area NULL check.

cma_alloc() is doing the check by returning NULL if cma is NULL:

struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
		       bool no_warn)
{
	...
	if (!cma || !cma->count)
		return NULL;
}

But I agree the code here could check before calling cma_alloc_aligned().

> > +	/* if dev has its own cma, free page from there */
> > +	if (dev->cma_area) {
> > +		if (cma_release(dev->cma_area, page, PAGE_ALIGN(size) >>
> PAGE_SHIFT))
> > +			return;
>
> Another overly long line.

Will refine.

> > +	} else {
> > +		/*
> > +		 * otherwise, page is from either per-numa cma or default cma
> > +		 */
> > +		if (cma_release(dma_contiguous_pernuma_area[page_to_nid(page)],
> > +				page, PAGE_ALIGN(size) >> PAGE_SHIFT))
> > +			return;
> > +
> > +		if (cma_release(dma_contiguous_default_area, page,
> > +				PAGE_ALIGN(size) >> PAGE_SHIFT))
> > +			return;
> > +	}
>
> I'd introduce a count variable for the value of "PAGE_ALIGN(size) >>
> PAGE_SHIFT" to clean all this up a bit.

Good idea.

> Also please add a CONFIG_PERCPU_DMA_CMA config variable so that we
> don't build this code for the vast majority of users that don't need it.

Agreed.

Thanks
Barry
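A sketch of the build gating agreed on just above, using the symbol name
from the review comment (which may not be the final name), could look
like this in include/linux/dma-contiguous.h:

/*
 * Illustrative only: compile out the per-numa reservation for kernels
 * that do not enable it; the config symbol name is an assumption here.
 */
#ifdef CONFIG_PERCPU_DMA_CMA
void dma_pernuma_cma_reserve(void);
#else
static inline void dma_pernuma_cma_reserve(void) { }
#endif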
On Tue, Jul 28, 2020 at 12:19:03PM +0000, Song Bao Hua (Barry Song) wrote:
> I am sorry I haven't got your point yet. Do you mean something like the
> below?
>
> arch/arm64/Kconfig:
>  config CMDLINE
> 	string "Default kernel command string"
> -	default ""
> +	default "pernuma_cma=16M"
> 	help
> 	  Provide a set of default command-line options at build time by
> 	  entering them here. As a minimum, you should specify the
> 	  root device (e.g. root=/dev/nfs).

Yes.

> Some background on the current code: Linux distributions can usually use
> arch/arm64/configs/defconfig directly to build the kernel, and the cmdline
> can easily be ignored when Linux distributions are generated.

I've not actually heard of a distro shipping defconfig yet...

> > if a way to expose this in the device tree might be useful, but people
> > more familiar with the device tree and the arm code will have to chime
> > in on that.
>
> Not sure if it is a useful use case, as we are using ACPI rather than
> device tree since it is an ARM64 server with NUMA.

Well, then maybe ACPI experts need to chime in on this.

> > This seems to have lost the dma_contiguous_default_area NULL check.
>
> cma_alloc() is doing the check by returning NULL if cma is NULL:
>
> struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> 		       bool no_warn)
> {
> 	...
> 	if (!cma || !cma->count)
> 		return NULL;
> }
>
> But I agree the code here could check before calling cma_alloc_aligned().

Oh, indeed.  Please split the removal of the NULL check into a prep
patch then.
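For illustration, an untested sketch of what such a prep patch would
leave behind in dma_alloc_contiguous() — dropping the caller-side check
and relying on cma_alloc() bailing out on a NULL area — might be:

struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
{
	/* CMA can be used only in the context which permits sleeping */
	if (!gfpflags_allow_blocking(gfp))
		return NULL;
	if (dev->cma_area)
		return cma_alloc_aligned(dev->cma_area, size, gfp);
	if (size <= PAGE_SIZE)
		return NULL;
	/* cma_alloc() itself returns NULL when passed a NULL cma */
	return cma_alloc_aligned(dma_contiguous_default_area, size, gfp);
}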
> -----Original Message-----
> From: Christoph Hellwig [mailto:hch@lst.de]
> Sent: Wednesday, July 29, 2020 12:23 AM
> Subject: Re: [PATCH v4 1/2] dma-direct: provide the ability to reserve
> per-numa CMA
>
> On Tue, Jul 28, 2020 at 12:19:03PM +0000, Song Bao Hua (Barry Song) wrote:
> > I am sorry I haven't got your point yet. Do you mean something like the
> > below?
> >
> > arch/arm64/Kconfig:
> >  config CMDLINE
> > 	string "Default kernel command string"
> > -	default ""
> > +	default "pernuma_cma=16M"
> > 	help
> > 	  Provide a set of default command-line options at build time by
> > 	  entering them here. As a minimum, you should specify the
> > 	  root device (e.g. root=/dev/nfs).
>
> Yes.
>
> > Some background on the current code: Linux distributions can usually use
> > arch/arm64/configs/defconfig directly to build the kernel, and the cmdline
> > can easily be ignored when Linux distributions are generated.
>
> I've not actually heard of a distro shipping defconfig yet...
>
> > > if a way to expose this in the device tree might be useful, but people
> > > more familiar with the device tree and the arm code will have to chime
> > > in on that.
> >
> > Not sure if it is a useful use case, as we are using ACPI rather than
> > device tree since it is an ARM64 server with NUMA.
>
> Well, then maybe ACPI experts need to chime in on this.
>
> > > This seems to have lost the dma_contiguous_default_area NULL check.
> >
> > cma_alloc() is doing the check by returning NULL if cma is NULL:
> >
> > struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> > 		       bool no_warn)
> > {
> > 	...
> > 	if (!cma || !cma->count)
> > 		return NULL;
> > }
> >
> > But I agree the code here could check before calling cma_alloc_aligned().
>
> Oh, indeed.  Please split the removal of the NULL check into a prep
> patch then.

Do you mean removing the NULL check in cma_alloc()? If so, it seems a
lot of places would need to be changed:

struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
				       unsigned int align, bool no_warn)
{
	if (align > CONFIG_CMA_ALIGNMENT)
		align = CONFIG_CMA_ALIGNMENT;

	+ code to check dev_get_cma_area(dev) is not NULL

	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
}

bool dma_release_from_contiguous(struct device *dev, struct page *pages,
				 int count)
{
	+ code to check dev_get_cma_area(dev) is not NULL

	return cma_release(dev_get_cma_area(dev), pages, count);
}

bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
{
	unsigned long pfn;

	+ do we need to remove this !cma too if we remove it in cma_alloc()?
	if (!cma || !pages)
		return false;

	...
}

And some other places where cma_alloc() and cma_release() are called:

arch/powerpc/kvm/book3s_hv_builtin.c
drivers/dma-buf/heaps/cma_heap.c
drivers/s390/char/vmcp.c
drivers/staging/android/ion/ion_cma_heap.c
mm/hugetlb.c

It seems much of this code was written with the assumption that
cma_alloc()/cma_release() check whether cma is NULL, so the callers do
not check it before calling cma_alloc(). And I am not sure whether the
kernel test robot would report an error such as "pointer dereferenced
before checking" if the !cma check were removed from cma_alloc().

Thanks
Barry
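To make that burden concrete, here is an illustrative sketch (not merged
code) of the extra guard a single caller such as
dma_release_from_contiguous() would need if cma_release() no longer
tolerated a NULL cma pointer:

/*
 * Illustrative only: caller-side NULL check that would be required in
 * every caller if the internal check in cma_release() were removed.
 */
bool dma_release_from_contiguous(struct device *dev, struct page *pages,
				 int count)
{
	struct cma *cma = dev_get_cma_area(dev);

	if (!cma)
		return false;
	return cma_release(cma, pages, count);
}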
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b8d6cde06463..6d963484bbdc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -599,6 +599,15 @@
 			altogether. For more information, see
 			include/linux/dma-contiguous.h
 
+	pernuma_cma=nn[MG]@[start[MG][-end[MG]]]
+			[ARM,X86,KNL]
+			Sets the size of kernel per-numa memory area for
+			contiguous memory allocations. A value of 0 disables
+			per-numa CMA altogether. DMA users on node nid will
+			first try to allocate buffer from the pernuma area
+			which is located in node nid, if the allocation fails,
+			they will fallback to the global default memory area.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed. This is used in CMO environments
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index 03f8e98e3bcc..278a80a40456 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -79,6 +79,8 @@ static inline void dma_contiguous_set_default(struct cma *cma)
 
 void dma_contiguous_reserve(phys_addr_t addr_limit);
 
+void dma_pernuma_cma_reserve(void);
+
 int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
 				       phys_addr_t limit, struct cma **res_cma,
 				       bool fixed);
@@ -128,6 +130,8 @@ static inline void dma_contiguous_set_default(struct cma *cma) { }
 
 static inline void dma_contiguous_reserve(phys_addr_t limit) { }
 
+static inline void dma_pernuma_cma_reserve(void) { }
+
 static inline int dma_contiguous_reserve_area(phys_addr_t size,
 					      phys_addr_t base, phys_addr_t limit,
 					      struct cma **res_cma, bool fixed)
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 5d18456b5f01..37e0e3559624 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -120,6 +120,16 @@ config DMA_CMA
 if DMA_CMA
 comment "Default contiguous memory area size:"
 
+config CMA_PERNUMA_SIZE_MBYTES
+	int "Size in Mega Bytes for per-numa CMA areas"
+	depends on NUMA
+	default 16 if ARM64
+	default 0
+	help
+	  Defines the size (in MiB) of the per-numa memory area for Contiguous
+	  Memory Allocator. Every numa node will get a separate CMA with this
+	  size. If the size of 0 is selected, per-numa CMA is disabled.
+
 config CMA_SIZE_MBYTES
 	int "Size in Mega Bytes"
 	depends on !CMA_SIZE_SEL_PERCENTAGE
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index cff7e60968b9..c0482231c0d7 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -30,7 +30,14 @@
 #define CMA_SIZE_MBYTES 0
 #endif
 
+#ifdef CONFIG_CMA_PERNUMA_SIZE_MBYTES
+#define CMA_SIZE_PERNUMA_MBYTES CONFIG_CMA_PERNUMA_SIZE_MBYTES
+#else
+#define CMA_SIZE_PERNUMA_MBYTES 0
+#endif
+
 struct cma *dma_contiguous_default_area;
+static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
 
 /*
  * Default global CMA area size can be defined in kernel's .config.
@@ -44,6 +51,8 @@ struct cma *dma_contiguous_default_area;
  */
 static const phys_addr_t size_bytes __initconst =
 			(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
+static phys_addr_t pernuma_size_bytes __initdata =
+			(phys_addr_t)CMA_SIZE_PERNUMA_MBYTES * SZ_1M;
 static phys_addr_t size_cmdline __initdata = -1;
 static phys_addr_t base_cmdline __initdata;
 static phys_addr_t limit_cmdline __initdata;
@@ -69,6 +78,13 @@ static int __init early_cma(char *p)
 }
 early_param("cma", early_cma);
 
+static int __init early_pernuma_cma(char *p)
+{
+	pernuma_size_bytes = memparse(p, &p);
+	return 0;
+}
+early_param("pernuma_cma", early_pernuma_cma);
+
 #ifdef CONFIG_CMA_SIZE_PERCENTAGE
 
 static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
@@ -96,6 +112,33 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
 
 #endif
 
+void __init dma_pernuma_cma_reserve(void)
+{
+	int nid;
+
+	if (!pernuma_size_bytes)
+		return;
+
+	for_each_node_state(nid, N_MEMORY) {
+		int ret;
+		char name[20];
+
+		snprintf(name, sizeof(name), "pernuma%d", nid);
+		ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
+						 0, false, name,
+						 &dma_contiguous_pernuma_area[nid],
+						 nid);
+		if (ret) {
+			pr_warn("%s: reservation failed: err %d, node %d", __func__,
+				ret, nid);
+			continue;
+		}
+
+		pr_debug("%s: reserved %llu MiB on node %d\n", __func__,
+			 (unsigned long long)pernuma_size_bytes / SZ_1M, nid);
+	}
+}
+
 /**
  * dma_contiguous_reserve() - reserve area(s) for contiguous memory handling
  * @limit: End address of the reserved memory (optional, 0 for any).
@@ -228,23 +271,38 @@ static struct page *cma_alloc_aligned(struct cma *cma, size_t size, gfp_t gfp)
  * @size:  Requested allocation size.
  * @gfp:   Allocation flags.
  *
- * This function allocates contiguous memory buffer for specified device. It
- * tries to use device specific contiguous memory area if available, or the
- * default global one.
+ * tries to use device specific contiguous memory area if available, or it
+ * tries to use per-numa cma, if the allocation fails, it will fallback to
+ * try default global one.
  *
- * Note that it byapss one-page size of allocations from the global area as
- * the addresses within one page are always contiguous, so there is no need
- * to waste CMA pages for that kind; it also helps reduce fragmentations.
+ * Note that it bypass one-page size of allocations from the per-numa and
+ * global area as the addresses within one page are always contiguous, so
+ * there is no need to waste CMA pages for that kind; it also helps reduce
+ * fragmentations.
  */
 struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 {
+	int nid = dev_to_node(dev);
+
 	/* CMA can be used only in the context which permits sleeping */
 	if (!gfpflags_allow_blocking(gfp))
 		return NULL;
 	if (dev->cma_area)
 		return cma_alloc_aligned(dev->cma_area, size, gfp);
-	if (size <= PAGE_SIZE || !dma_contiguous_default_area)
+	if (size <= PAGE_SIZE)
 		return NULL;
+
+	if ((nid != NUMA_NO_NODE) && !(gfp & (GFP_DMA | GFP_DMA32))) {
+		struct cma *cma = dma_contiguous_pernuma_area[nid];
+		struct page *page;
+
+		if (cma) {
+			page = cma_alloc_aligned(cma, size, gfp);
+			if (page)
+				return page;
+		}
+	}
+
 	return cma_alloc_aligned(dma_contiguous_default_area, size, gfp);
 }
 
@@ -261,9 +319,25 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
  */
 void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
 {
-	if (!cma_release(dev_get_cma_area(dev), page,
-			 PAGE_ALIGN(size) >> PAGE_SHIFT))
-		__free_pages(page, get_order(size));
+	/* if dev has its own cma, free page from there */
+	if (dev->cma_area) {
+		if (cma_release(dev->cma_area, page, PAGE_ALIGN(size) >> PAGE_SHIFT))
+			return;
+	} else {
+		/*
+		 * otherwise, page is from either per-numa cma or default cma
+		 */
+		if (cma_release(dma_contiguous_pernuma_area[page_to_nid(page)],
+				page, PAGE_ALIGN(size) >> PAGE_SHIFT))
+			return;
+
+		if (cma_release(dma_contiguous_default_area, page,
+				PAGE_ALIGN(size) >> PAGE_SHIFT))
+			return;
+	}
+
+	/* not in any cma, free from buddy */
+	__free_pages(page, get_order(size));
 }
 
 /*
Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get
coherent DMA buffers to save their command queues and page tables. As
there is only one default CMA in the whole system, SMMUs on nodes other
than node0 will get remote memory. This leads to significant latency.

This patch provides per-numa CMA so that drivers like SMMU can get local
memory. Tests show that localizing CMA decreases dma_unmap latency
considerably. For instance, before this patch, the SMMU on node2 had to
wait more than 560ns for the completion of CMD_SYNC in an empty command
queue; with this patch, it needs only 240ns.

A positive side effect of this patch is improving performance even
further for those users who care about performance more than DMA
security and use iommu.passthrough=1 to skip the IOMMU. With local CMA,
all drivers can get local coherent DMA buffers.

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
-v4:
 * rebase on top of Christoph Hellwig's patch:
   [PATCH v2] dma-contiguous: cleanup dma_alloc_contiguous
   https://lore.kernel.org/linux-iommu/20200723120133.94105-1-hch@lst.de/
 * cleanup according to Christoph's comment
 * reserve cma by checking N_MEMORY rather than N_ONLINE

 .../admin-guide/kernel-parameters.txt |  9 ++
 include/linux/dma-contiguous.h        |  4 +
 kernel/dma/Kconfig                    | 10 ++
 kernel/dma/contiguous.c               | 94 +++++++++++++++++--
 4 files changed, 107 insertions(+), 10 deletions(-)
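For illustration, here is a hypothetical driver-side view of the feature
(the probe function, device, and buffer size are made up; the point is
that dma_alloc_coherent() is the normal, unchanged API, and the per-numa
CMA kicks in transparently for >PAGE_SIZE allocations):

#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/*
 * Hypothetical example, not part of the patch: with "pernuma_cma=16M"
 * on the kernel command line, a coherent allocation for a device whose
 * dev_to_node() reports node 2 is served from the node-2 CMA area when
 * possible, and falls back to the default global area otherwise.
 */
static int example_probe(struct device *dev)
{
	dma_addr_t dma_handle;
	void *cmdq;

	/* 64 KiB command queue, ideally backed by node-local memory */
	cmdq = dma_alloc_coherent(dev, SZ_64K, &dma_handle, GFP_KERNEL);
	if (!cmdq)
		return -ENOMEM;

	/* ... hardware setup would go here ... */

	dma_free_coherent(dev, SZ_64K, cmdq, dma_handle);
	return 0;
}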