| Message ID | 20201217201214.3414100-1-guro@fb.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [v2,1/2] mm: cma: allocate cma areas bottom-up |
On Thu, Dec 17, 2020 at 12:12:13PM -0800, Roman Gushchin wrote:
> Currently cma areas without a fixed base are allocated close to the
> end of the node. This placement is sub-optimal because of compaction:
> it brings pages into the cma area. In particular, it can bring in hot
> executable pages, even if there is plenty of free memory on the
> machine. This results in cma allocation failures.
>
> Instead let's place cma areas close to the beginning of a node.
> In this case compaction will help to free cma areas, resulting
> in better cma allocation success rates.
>
> If there is enough memory let's try to allocate bottom-up starting
> with 4GB to exclude any possible interference with DMA32. On smaller
> machines or in case of a failure, stick with the old behavior.
>
> 16GB vm, 2GB cma area:
> With this patch:
> [    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
> [    0.002928] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
> [    0.002930] cma: Reserved 2048 MiB at 0x0000000100000000
> [    0.002931] hugetlb_cma: reserved 2048 MiB on node 0
>
> Without this patch:
> [    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
> [    0.002930] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
> [    0.002933] cma: Reserved 2048 MiB at 0x00000003c0000000
> [    0.002934] hugetlb_cma: reserved 2048 MiB on node 0
>
> v2:
>   - switched to memblock_set_bottom_up(true), by Mike
>   - start with 4GB, by Mike
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

With one nit below

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  mm/cma.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/mm/cma.c b/mm/cma.c
> index 7f415d7cda9f..21fd40c092f0 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -337,6 +337,22 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
>  		limit = highmem_start;
>  	}
>
> +	/*
> +	 * If there is enough memory, try a bottom-up allocation first.
> +	 * It will place the new cma area close to the start of the node
> +	 * and guarantee that the compaction is moving pages out of the
> +	 * cma area and not into it.
> +	 * Avoid using first 4GB to not interfere with constrained zones
> +	 * like DMA/DMA32.
> +	 */
> +	if (!memblock_bottom_up() &&
> +	    memblock_end >= SZ_4G + size) {

This seems short enough to fit a single line

> +		memblock_set_bottom_up(true);
> +		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
> +						limit, nid, true);
> +		memblock_set_bottom_up(false);
> +	}
> +
>  	if (!addr) {
>  		addr = memblock_alloc_range_nid(size, alignment, base,
>  						limit, nid, true);
> --
> 2.26.2
On Sun, Dec 20, 2020 at 08:48:48AM +0200, Mike Rapoport wrote:
> On Thu, Dec 17, 2020 at 12:12:13PM -0800, Roman Gushchin wrote:
> > Currently cma areas without a fixed base are allocated close to the
> > end of the node. This placement is sub-optimal because of compaction:
> > it brings pages into the cma area. In particular, it can bring in hot
> > executable pages, even if there is plenty of free memory on the
> > machine. This results in cma allocation failures.
> >
> > Instead let's place cma areas close to the beginning of a node.
> > In this case compaction will help to free cma areas, resulting
> > in better cma allocation success rates.
> >
> > If there is enough memory let's try to allocate bottom-up starting
> > with 4GB to exclude any possible interference with DMA32. On smaller
> > machines or in case of a failure, stick with the old behavior.
> >
> > 16GB vm, 2GB cma area:
> > With this patch:
> > [    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
> > [    0.002928] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
> > [    0.002930] cma: Reserved 2048 MiB at 0x0000000100000000
> > [    0.002931] hugetlb_cma: reserved 2048 MiB on node 0
> >
> > Without this patch:
> > [    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
> > [    0.002930] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
> > [    0.002933] cma: Reserved 2048 MiB at 0x00000003c0000000
> > [    0.002934] hugetlb_cma: reserved 2048 MiB on node 0
> >
> > v2:
> >   - switched to memblock_set_bottom_up(true), by Mike
> >   - start with 4GB, by Mike
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
>
> With one nit below
>
> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
>
> > ---
> >  mm/cma.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 7f415d7cda9f..21fd40c092f0 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -337,6 +337,22 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
> >  		limit = highmem_start;
> >  	}
> >
> > +	/*
> > +	 * If there is enough memory, try a bottom-up allocation first.
> > +	 * It will place the new cma area close to the start of the node
> > +	 * and guarantee that the compaction is moving pages out of the
> > +	 * cma area and not into it.
> > +	 * Avoid using first 4GB to not interfere with constrained zones
> > +	 * like DMA/DMA32.
> > +	 */
> > +	if (!memblock_bottom_up() &&
> > +	    memblock_end >= SZ_4G + size) {

Hi Mike!

> This seems short enough to fit a single line

Indeed. An updated version below.

Thank you for the review of the series! I assume it's simpler to route
both patches through the mm tree. What do you think?

Thanks!

--

From f88bd0a425c7181bd26a4cf900e6924a7b521419 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@fb.com>
Date: Mon, 14 Dec 2020 20:20:52 -0800
Subject: [PATCH v3 1/2] mm: cma: allocate cma areas bottom-up

Currently cma areas without a fixed base are allocated close to the
end of the node. This placement is sub-optimal because of compaction:
it brings pages into the cma area. In particular, it can bring in hot
executable pages, even if there is plenty of free memory on the
machine. This results in cma allocation failures.

Instead let's place cma areas close to the beginning of a node.
In this case compaction will help to free cma areas, resulting
in better cma allocation success rates.

If there is enough memory let's try to allocate bottom-up starting
with 4GB to exclude any possible interference with DMA32. On smaller
machines or in case of a failure, stick with the old behavior.

16GB vm, 2GB cma area:
With this patch:
[    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
[    0.002928] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
[    0.002930] cma: Reserved 2048 MiB at 0x0000000100000000
[    0.002931] hugetlb_cma: reserved 2048 MiB on node 0

Without this patch:
[    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
[    0.002930] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
[    0.002933] cma: Reserved 2048 MiB at 0x00000003c0000000
[    0.002934] hugetlb_cma: reserved 2048 MiB on node 0

v3:
  - code alignment fix, by Mike

v2:
  - switched to memblock_set_bottom_up(true), by Mike
  - start with 4GB, by Mike

Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/cma.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index 20c4f6f40037..4fe74c9d83b0 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -336,6 +336,21 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		limit = highmem_start;
 	}
 
+	/*
+	 * If there is enough memory, try a bottom-up allocation first.
+	 * It will place the new cma area close to the start of the node
+	 * and guarantee that the compaction is moving pages out of the
+	 * cma area and not into it.
+	 * Avoid using first 4GB to not interfere with constrained zones
+	 * like DMA/DMA32.
+	 */
+	if (!memblock_bottom_up() && memblock_end >= SZ_4G + size) {
+		memblock_set_bottom_up(true);
+		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
+						limit, nid, true);
+		memblock_set_bottom_up(false);
+	}
+
 	if (!addr) {
 		addr = memblock_alloc_range_nid(size, alignment, base,
 						limit, nid, true);
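For scale: 0x0000000100000000 is exactly 4 GiB, i.e. the bottom-up path places
the area immediately above the SZ_4G boundary, while 0x00000003c0000000 is
15 GiB, near the top of the 16 GiB VM's memory. As a rough sketch of how such
an area is consumed later — assuming the cma_alloc() interface of this kernel
era; the alloc_gigantic_from_cma() wrapper is illustrative, modeled loosely on
hugetlb_cma, and not part of the patch:

    /*
     * Illustrative sketch, not part of the patch: allocating a gigantic
     * page from a per-node CMA area. cma_alloc() has to empty a whole
     * contiguous range by migrating or freeing every page in it, so it
     * matters whether compaction steadily moves pages out of the cma
     * area (bottom-up placement) or into it (top-down placement).
     */
    static struct page *alloc_gigantic_from_cma(int nid, unsigned int order)
    {
            /* hugetlb_cma[nid] was reserved at boot via hugetlb_cma= */
            return cma_alloc(hugetlb_cma[nid], 1UL << order, order, true);
    }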
On Mon, 21 Dec 2020 09:05:51 -0800 Roman Gushchin <guro@fb.com> wrote:
> Subject: [PATCH v3 1/2] mm: cma: allocate cma areas bottom-up
i386 allmodconfig:
In file included from ./include/vdso/const.h:5,
                 from ./include/linux/const.h:4,
                 from ./include/linux/bits.h:5,
                 from ./include/linux/bitops.h:6,
                 from ./include/linux/kernel.h:11,
                 from ./include/asm-generic/bug.h:20,
                 from ./arch/x86/include/asm/bug.h:93,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/mmdebug.h:5,
                 from ./include/linux/mm.h:9,
                 from ./include/linux/memblock.h:13,
                 from mm/cma.c:24:
mm/cma.c: In function ‘cma_declare_contiguous_nid’:
./include/uapi/linux/const.h:20:19: warning: conversion from ‘long long unsigned int’ to ‘phys_addr_t’ {aka ‘unsigned int’} changes value from ‘4294967296’ to ‘0’ [-Woverflow]
 #define __AC(X,Y) (X##Y)
                   ^~~~~~
./include/uapi/linux/const.h:21:18: note: in expansion of macro ‘__AC’
 #define _AC(X,Y) __AC(X,Y)
                  ^~~~
./include/linux/sizes.h:46:18: note: in expansion of macro ‘_AC’
 #define SZ_4G    _AC(0x100000000, ULL)
                  ^~~
mm/cma.c:349:53: note: in expansion of macro ‘SZ_4G’
   addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
                                                    ^~~~~
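The warning above is a plain integer narrowing. With CONFIG_PHYS_ADDR_T_64BIT
unset, phys_addr_t on i386 is a 32-bit unsigned int, so the 33-bit constant
SZ_4G is truncated to 0 on its way into memblock_alloc_range_nid(). A minimal
standalone illustration of the same narrowing (hypothetical userspace program,
not kernel code):

    #include <stdio.h>
    #include <stdint.h>

    typedef uint32_t phys_addr_t;   /* i386 without CONFIG_PHYS_ADDR_T_64BIT */
    #define SZ_4G 0x100000000ULL    /* as in include/linux/sizes.h */

    int main(void)
    {
            phys_addr_t start = SZ_4G;      /* 4294967296 does not fit in 32 bits */
            printf("start = %u\n", start);  /* prints 0 -- the same -Woverflow */
            return 0;
    }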
On Tue, Dec 22, 2020 at 08:06:06PM -0800, Andrew Morton wrote:
> On Mon, 21 Dec 2020 09:05:51 -0800 Roman Gushchin <guro@fb.com> wrote:
>
> > Subject: [PATCH v3 1/2] mm: cma: allocate cma areas bottom-up
>
> i386 allmodconfig:
>
> In file included from ./include/vdso/const.h:5,
>                  from ./include/linux/const.h:4,
>                  from ./include/linux/bits.h:5,
>                  from ./include/linux/bitops.h:6,
>                  from ./include/linux/kernel.h:11,
>                  from ./include/asm-generic/bug.h:20,
>                  from ./arch/x86/include/asm/bug.h:93,
>                  from ./include/linux/bug.h:5,
>                  from ./include/linux/mmdebug.h:5,
>                  from ./include/linux/mm.h:9,
>                  from ./include/linux/memblock.h:13,
>                  from mm/cma.c:24:
> mm/cma.c: In function ‘cma_declare_contiguous_nid’:
> ./include/uapi/linux/const.h:20:19: warning: conversion from ‘long long unsigned int’ to ‘phys_addr_t’ {aka ‘unsigned int’} changes value from ‘4294967296’ to ‘0’ [-Woverflow]
>  #define __AC(X,Y) (X##Y)
>                    ^~~~~~
> ./include/uapi/linux/const.h:21:18: note: in expansion of macro ‘__AC’
>  #define _AC(X,Y) __AC(X,Y)
>                   ^~~~
> ./include/linux/sizes.h:46:18: note: in expansion of macro ‘_AC’
>  #define SZ_4G    _AC(0x100000000, ULL)
>                   ^~~
> mm/cma.c:349:53: note: in expansion of macro ‘SZ_4G’
>    addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
>                                                     ^~~~~

I thought that (!memblock_bottom_up() && memblock_end >= SZ_4G + size)
can't be true on a 32-bit platform, so the whole if clause can be
compiled out. Maybe it's because memblock_end can be equal to SZ_4G
and the size == 0...

I have no better idea than wrapping everything into
#if BITS_PER_LONG > 32
#endif.

Thanks!

--

diff --git a/mm/cma.c b/mm/cma.c
index 4fe74c9d83b0..5d69b498603a 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -344,12 +344,14 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 	 * Avoid using first 4GB to not interfere with constrained zones
 	 * like DMA/DMA32.
 	 */
+#if BITS_PER_LONG > 32
 	if (!memblock_bottom_up() && memblock_end >= SZ_4G + size) {
 		memblock_set_bottom_up(true);
 		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
 						limit, nid, true);
 		memblock_set_bottom_up(false);
 	}
+#endif
 
 	if (!addr) {
 		addr = memblock_alloc_range_nid(size, alignment, base,
On Wed, Dec 23, 2020 at 08:35:37AM -0800, Roman Gushchin wrote:
> On Tue, Dec 22, 2020 at 08:06:06PM -0800, Andrew Morton wrote:
> > On Mon, 21 Dec 2020 09:05:51 -0800 Roman Gushchin <guro@fb.com> wrote:
> >
> > > Subject: [PATCH v3 1/2] mm: cma: allocate cma areas bottom-up
> >
> > i386 allmodconfig:
> >
> > In file included from ./include/vdso/const.h:5,
> >                  from ./include/linux/const.h:4,
> >                  from ./include/linux/bits.h:5,
> >                  from ./include/linux/bitops.h:6,
> >                  from ./include/linux/kernel.h:11,
> >                  from ./include/asm-generic/bug.h:20,
> >                  from ./arch/x86/include/asm/bug.h:93,
> >                  from ./include/linux/bug.h:5,
> >                  from ./include/linux/mmdebug.h:5,
> >                  from ./include/linux/mm.h:9,
> >                  from ./include/linux/memblock.h:13,
> >                  from mm/cma.c:24:
> > mm/cma.c: In function ‘cma_declare_contiguous_nid’:
> > ./include/uapi/linux/const.h:20:19: warning: conversion from ‘long long unsigned int’ to ‘phys_addr_t’ {aka ‘unsigned int’} changes value from ‘4294967296’ to ‘0’ [-Woverflow]
> >  #define __AC(X,Y) (X##Y)
> >                    ^~~~~~
> > ./include/uapi/linux/const.h:21:18: note: in expansion of macro ‘__AC’
> >  #define _AC(X,Y) __AC(X,Y)
> >                   ^~~~
> > ./include/linux/sizes.h:46:18: note: in expansion of macro ‘_AC’
> >  #define SZ_4G    _AC(0x100000000, ULL)
> >                   ^~~
> > mm/cma.c:349:53: note: in expansion of macro ‘SZ_4G’
> >    addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
> >                                                     ^~~~~
>
> I thought that (!memblock_bottom_up() && memblock_end >= SZ_4G + size)
> can't be true on a 32-bit platform, so the whole if clause can be
> compiled out. Maybe it's because memblock_end can be equal to SZ_4G
> and the size == 0...
>
> I have no better idea than wrapping everything into
> #if BITS_PER_LONG > 32
> #endif.

32-bit systems can have more than 32 bits in the physical address.
I think a better option would be to use CONFIG_PHYS_ADDR_T_64BIT.

> Thanks!
>
> --
>
> diff --git a/mm/cma.c b/mm/cma.c
> index 4fe74c9d83b0..5d69b498603a 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -344,12 +344,14 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
>  	 * Avoid using first 4GB to not interfere with constrained zones
>  	 * like DMA/DMA32.
>  	 */
> +#if BITS_PER_LONG > 32
>  	if (!memblock_bottom_up() && memblock_end >= SZ_4G + size) {
>  		memblock_set_bottom_up(true);
>  		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
>  						limit, nid, true);
>  		memblock_set_bottom_up(false);
>  	}
> +#endif
>  
>  	if (!addr) {
>  		addr = memblock_alloc_range_nid(size, alignment, base,
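The PAE configuration is the concrete case here: on i386, CONFIG_X86_PAE
selects CONFIG_PHYS_ADDR_T_64BIT, so physical addresses are 64 bits wide even
though BITS_PER_LONG is 32, and a BITS_PER_LONG guard would wrongly compile
the bottom-up path out on such systems. phys_addr_t itself is typed off the
config symbol, not the word size (from include/linux/types.h):

    #ifdef CONFIG_PHYS_ADDR_T_64BIT
    typedef u64 phys_addr_t;
    #else
    typedef u32 phys_addr_t;
    #endif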
On Thu, Dec 24, 2020 at 12:10:39AM +0200, Mike Rapoport wrote:
> On Wed, Dec 23, 2020 at 08:35:37AM -0800, Roman Gushchin wrote:
> > On Tue, Dec 22, 2020 at 08:06:06PM -0800, Andrew Morton wrote:
> > > On Mon, 21 Dec 2020 09:05:51 -0800 Roman Gushchin <guro@fb.com> wrote:
> > >
> > > > Subject: [PATCH v3 1/2] mm: cma: allocate cma areas bottom-up
> > >
> > > i386 allmodconfig:
> > >
> > > In file included from ./include/vdso/const.h:5,
> > >                  from ./include/linux/const.h:4,
> > >                  from ./include/linux/bits.h:5,
> > >                  from ./include/linux/bitops.h:6,
> > >                  from ./include/linux/kernel.h:11,
> > >                  from ./include/asm-generic/bug.h:20,
> > >                  from ./arch/x86/include/asm/bug.h:93,
> > >                  from ./include/linux/bug.h:5,
> > >                  from ./include/linux/mmdebug.h:5,
> > >                  from ./include/linux/mm.h:9,
> > >                  from ./include/linux/memblock.h:13,
> > >                  from mm/cma.c:24:
> > > mm/cma.c: In function ‘cma_declare_contiguous_nid’:
> > > ./include/uapi/linux/const.h:20:19: warning: conversion from ‘long long unsigned int’ to ‘phys_addr_t’ {aka ‘unsigned int’} changes value from ‘4294967296’ to ‘0’ [-Woverflow]
> > >  #define __AC(X,Y) (X##Y)
> > >                    ^~~~~~
> > > ./include/uapi/linux/const.h:21:18: note: in expansion of macro ‘__AC’
> > >  #define _AC(X,Y) __AC(X,Y)
> > >                   ^~~~
> > > ./include/linux/sizes.h:46:18: note: in expansion of macro ‘_AC’
> > >  #define SZ_4G    _AC(0x100000000, ULL)
> > >                   ^~~
> > > mm/cma.c:349:53: note: in expansion of macro ‘SZ_4G’
> > >    addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
> > >                                                     ^~~~~
> >
> > I thought that (!memblock_bottom_up() && memblock_end >= SZ_4G + size)
> > can't be true on a 32-bit platform, so the whole if clause can be
> > compiled out. Maybe it's because memblock_end can be equal to SZ_4G
> > and the size == 0...
> >
> > I have no better idea than wrapping everything into
> > #if BITS_PER_LONG > 32
> > #endif.
>
> 32-bit systems can have more than 32 bits in the physical address.
> I think a better option would be to use CONFIG_PHYS_ADDR_T_64BIT.

I agree. An updated fixup below.

Andrew, can you please replace the previous fixup with this one?

Thanks!

--

diff --git a/mm/cma.c b/mm/cma.c
index 4fe74c9d83b0..0ba69cd16aeb 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -344,12 +344,14 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 	 * Avoid using first 4GB to not interfere with constrained zones
 	 * like DMA/DMA32.
 	 */
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
 	if (!memblock_bottom_up() && memblock_end >= SZ_4G + size) {
 		memblock_set_bottom_up(true);
 		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
 						limit, nid, true);
 		memblock_set_bottom_up(false);
 	}
+#endif
 
 	if (!addr) {
 		addr = memblock_alloc_range_nid(size, alignment, base,
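A side note on the shape of the guard: the usual IS_ENABLED() idiom would not
silence this warning, because the branch would still be compiled (and only
dead-code-eliminated afterwards), so the implicit SZ_4G-to-phys_addr_t
conversion would presumably still trip -Woverflow on i386. A sketch of that
rejected alternative, for comparison:

    /* Hypothetical IS_ENABLED() variant -- still compiled on i386, so
     * the SZ_4G argument would still be narrowed and still warn: */
    if (IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT) &&
        !memblock_bottom_up() && memblock_end >= SZ_4G + size) {
            memblock_set_bottom_up(true);
            addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
                                            limit, nid, true);
            memblock_set_bottom_up(false);
    }

Hence the #ifdef, which removes the expression from the translation unit
entirely.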
Currently cma areas without a fixed base are allocated close to the
end of the node. This placement is sub-optimal because of compaction:
it brings pages into the cma area. In particular, it can bring in hot
executable pages, even if there is plenty of free memory on the
machine. This results in cma allocation failures.

Instead let's place cma areas close to the beginning of a node.
In this case compaction will help to free cma areas, resulting
in better cma allocation success rates.

If there is enough memory let's try to allocate bottom-up starting
with 4GB to exclude any possible interference with DMA32. On smaller
machines or in case of a failure, stick with the old behavior.

16GB vm, 2GB cma area:
With this patch:
[    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
[    0.002928] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
[    0.002930] cma: Reserved 2048 MiB at 0x0000000100000000
[    0.002931] hugetlb_cma: reserved 2048 MiB on node 0

Without this patch:
[    0.000000] Command line: root=/dev/vda3 rootflags=subvol=/root systemd.unified_cgroup_hierarchy=1 enforcing=0 console=ttyS0,115200 hugetlb_cma=2G
[    0.002930] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
[    0.002933] cma: Reserved 2048 MiB at 0x00000003c0000000
[    0.002934] hugetlb_cma: reserved 2048 MiB on node 0

v2:
  - switched to memblock_set_bottom_up(true), by Mike
  - start with 4GB, by Mike

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/cma.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7cda9f..21fd40c092f0 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -337,6 +337,22 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		limit = highmem_start;
 	}
 
+	/*
+	 * If there is enough memory, try a bottom-up allocation first.
+	 * It will place the new cma area close to the start of the node
+	 * and guarantee that the compaction is moving pages out of the
+	 * cma area and not into it.
+	 * Avoid using first 4GB to not interfere with constrained zones
+	 * like DMA/DMA32.
+	 */
+	if (!memblock_bottom_up() &&
+	    memblock_end >= SZ_4G + size) {
+		memblock_set_bottom_up(true);
+		addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
+						limit, nid, true);
+		memblock_set_bottom_up(false);
+	}
+
 	if (!addr) {
 		addr = memblock_alloc_range_nid(size, alignment, base,
 						limit, nid, true);