Message ID | 20200707040204.30132-1-song.bao.hua@hisilicon.com (mailing list archive)
---|---
State | New, archived
Series | [v3] mm/hugetlb: avoid hardcoding while checking if cma is enable
On Tue, Jul 07, 2020 at 04:02:04PM +1200, Barry Song wrote:
> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> enabled. gigantic pages might have been reserved on other nodes.
>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Acked-by: Roman Gushchin <guro@fb.com>
> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
> -v3: add acked-by; make code more canonical
>
>  mm/hugetlb.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 57ece74e3aae..d293c823121e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2546,6 +2546,20 @@ static void __init gather_bootmem_prealloc(void)
>  	}
>  }
>
> +bool __init hugetlb_cma_enabled(void)
> +{
> +#ifdef CONFIG_CMA
> +	int node;
> +
> +	for_each_online_node(node) {
> +		if (hugetlb_cma[node])
> +			return true;
> +	}
> +#endif
> +
> +	return false;
> +}
> +
>  static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>  {
>  	unsigned long i;
> @@ -2571,7 +2585,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>
>  	for (i = 0; i < h->max_huge_pages; ++i) {
>  		if (hstate_is_gigantic(h)) {
> -			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
> +			if (hugetlb_cma_enabled()) {
>  				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
>  				break;
>  			}
> --
> 2.27.0
On Tue, 7 Jul 2020 16:02:04 +1200 Barry Song <song.bao.hua@hisilicon.com> wrote:

> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> enabled. gigantic pages might have been reserved on other nodes.

I'm trying to figure out whether this should be backported into 5.7.1,
but the changelog doesn't describe any known user-visible effects of
the bug.  Are there any?
On 7/7/20 12:56 PM, Andrew Morton wrote:
> On Tue, 7 Jul 2020 16:02:04 +1200 Barry Song <song.bao.hua@hisilicon.com> wrote:
>
>> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
>> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
>> enabled. gigantic pages might have been reserved on other nodes.
>
> I'm trying to figure out whether this should be backported into 5.7.1,
> but the changelog doesn't describe any known user-visible effects of
> the bug.  Are there any?

Barry must have missed this email.  He reported the issue so I was hoping
he would reply.

Based on the code changes, I believe the following could happen:
- Someone uses the 'hugetlb_cma=' kernel command line parameter to reserve
  CMA for gigantic pages.
- The system topology is such that no memory is on node 0.  Therefore,
  no CMA can be reserved for gigantic pages on node 0.  CMA is reserved
  on other nodes.
- The user also specifies a number of gigantic pages to pre-allocate on
  the command line with hugepagesz=<gigantic_page_size> hugepages=<N>.
- The routine which allocates gigantic pages from the bootmem allocator
  will not detect that CMA has been reserved, as there is no memory on
  node 0.  Therefore, pages will be pre-allocated from the bootmem
  allocator as well as reserved in CMA.

This double allocation (bootmem and CMA) is the worst case scenario.  Not
sure if this is what Barry saw, and I suspect this would rarely happen.

After writing this, I started to think that perhaps command line parsing
should be changed.  If hugetlb_cma= is specified, it makes no sense to
pre-allocate gigantic pages.  Therefore, the hugepages=<N> parameter
should be ignored and flagged with a warning if hugetlb_cma= is specified.
This could be checked at parsing time and there would be no need for such
a check in the allocation code (except for sanity checking).

Thoughts?  I just cleaned up the parsing code and could make such a change
quite easily.
On Wed, Jul 08, 2020 at 10:45:16AM -0700, Mike Kravetz wrote:
> On 7/7/20 12:56 PM, Andrew Morton wrote:
> > On Tue, 7 Jul 2020 16:02:04 +1200 Barry Song <song.bao.hua@hisilicon.com> wrote:
> >
> >> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> >> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> >> enabled. gigantic pages might have been reserved on other nodes.
> >
> > I'm trying to figure out whether this should be backported into 5.7.1,
> > but the changelog doesn't describe any known user-visible effects of
> > the bug.  Are there any?
>
> Barry must have missed this email.  He reported the issue so I was hoping
> he would reply.
>
> Based on the code changes, I believe the following could happen:
> - Someone uses the 'hugetlb_cma=' kernel command line parameter to reserve
>   CMA for gigantic pages.
> - The system topology is such that no memory is on node 0.  Therefore,
>   no CMA can be reserved for gigantic pages on node 0.  CMA is reserved
>   on other nodes.
> - The user also specifies a number of gigantic pages to pre-allocate on
>   the command line with hugepagesz=<gigantic_page_size> hugepages=<N>.
> - The routine which allocates gigantic pages from the bootmem allocator
>   will not detect that CMA has been reserved, as there is no memory on
>   node 0.  Therefore, pages will be pre-allocated from the bootmem
>   allocator as well as reserved in CMA.
>
> This double allocation (bootmem and CMA) is the worst case scenario.  Not
> sure if this is what Barry saw, and I suspect this would rarely happen.
>
> After writing this, I started to think that perhaps command line parsing
> should be changed.  If hugetlb_cma= is specified, it makes no sense to
> pre-allocate gigantic pages.  Therefore, the hugepages=<N> parameter
> should be ignored and flagged with a warning if hugetlb_cma= is specified.
> This could be checked at parsing time and there would be no need for such
> a check in the allocation code (except for sanity checking).
>
> Thoughts?  I just cleaned up the parsing code and could make such a change
> quite easily.

I agree.  Basically, if hugetlb_cma_size > 0, we should not pre-allocate
gigantic pages.  It would be much simpler and more reliable than the
existing code.

Thank you!
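To make the idea discussed above concrete, here is a rough sketch of what a parse-time check could look like; it is not a patch from this thread.  It assumes a hugetlb_cma_size value recorded while parsing the hugetlb_cma= parameter is visible at this point, and the helper name and warning text are made up for illustration.

/*
 * Illustrative sketch only -- not a patch from this thread.  Assumes
 * hugetlb_cma_size is set while parsing hugetlb_cma= and is visible
 * here; the helper name is hypothetical.
 */
static unsigned long hugetlb_cma_size __initdata;	/* set by hugetlb_cma= */

static bool __init gigantic_prealloc_allowed(struct hstate *h)
{
	/* hugetlb_cma= takes precedence: ignore hugepages= for gigantic sizes */
	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
		pr_warn("HugeTLB: hugepages= pre-allocation ignored for gigantic pages because hugetlb_cma= is specified\n");
		return false;
	}
	return true;
}

A check like this at command-line parsing time would keep the allocation path free of CMA-related special cases, which is the simplification both Mike and Roman are pointing at.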
> -----Original Message-----
> From: Roman Gushchin [mailto:guro@fb.com]
> Sent: Thursday, July 9, 2020 6:46 AM
> To: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>; Song Bao Hua (Barry Song)
> <song.bao.hua@hisilicon.com>; linux-mm@kvack.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Jonathan
> Cameron <jonathan.cameron@huawei.com>
> Subject: Re: [PATCH v3] mm/hugetlb: avoid hardcoding while checking if cma
> is enable
>
> On Wed, Jul 08, 2020 at 10:45:16AM -0700, Mike Kravetz wrote:
> > On 7/7/20 12:56 PM, Andrew Morton wrote:
> > > On Tue, 7 Jul 2020 16:02:04 +1200 Barry Song <song.bao.hua@hisilicon.com> wrote:
> > >
> > >> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> > >> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> > >> enabled. gigantic pages might have been reserved on other nodes.
> > >
> > > I'm trying to figure out whether this should be backported into 5.7.1,
> > > but the changelog doesn't describe any known user-visible effects of
> > > the bug.  Are there any?
> >
> > Barry must have missed this email.  He reported the issue so I was hoping
> > he would reply.

Yep.  It would be better to backport it into 5.7.  It doesn't cause a serious
crash or failure, but it could cause a double reservation or a CMA leak.

> >
> > Based on the code changes, I believe the following could happen:
> > - Someone uses the 'hugetlb_cma=' kernel command line parameter to reserve
> >   CMA for gigantic pages.
> > - The system topology is such that no memory is on node 0.  Therefore,
> >   no CMA can be reserved for gigantic pages on node 0.  CMA is reserved
> >   on other nodes.
> > - The user also specifies a number of gigantic pages to pre-allocate on
> >   the command line with hugepagesz=<gigantic_page_size> hugepages=<N>.
> > - The routine which allocates gigantic pages from the bootmem allocator
> >   will not detect that CMA has been reserved, as there is no memory on
> >   node 0.  Therefore, pages will be pre-allocated from the bootmem
> >   allocator as well as reserved in CMA.
> >
> > This double allocation (bootmem and CMA) is the worst case scenario.  Not
> > sure if this is what Barry saw, and I suspect this would rarely happen.
> >
> > After writing this, I started to think that perhaps command line parsing
> > should be changed.  If hugetlb_cma= is specified, it makes no sense to
> > pre-allocate gigantic pages.  Therefore, the hugepages=<N> parameter
> > should be ignored and flagged with a warning if hugetlb_cma= is specified.
> > This could be checked at parsing time and there would be no need for such
> > a check in the allocation code (except for sanity checking).
> >
> > Thoughts?  I just cleaned up the parsing code and could make such a change
> > quite easily.
>
> I agree.  Basically, if hugetlb_cma_size > 0, we should not pre-allocate
> gigantic pages.  It would be much simpler and more reliable than the
> existing code.

I agree this is a better solution.  If hugetlb_cma has higher priority than
bootmem gigantic pages, we should document it.

>
> Thank you!

Thanks
Barry
Looks like this produced a warning in linux-next. I suspect it is due to the combination CONFIG_HUGETLB_PAGE && !CONFIG_CMA. Instead of adding the routine hugetlb_cma_enabled() to scan the hugetlb_cma array, could we just use a boolean as follows? It can simply be set in hugetlb_cma_reserve when we reserve CMA.
> -----Original Message-----
> From: Mike Kravetz [mailto:mike.kravetz@oracle.com]
> Sent: Friday, July 10, 2020 6:58 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>; Roman
> Gushchin <guro@fb.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>; linux-mm@kvack.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Jonathan
> Cameron <jonathan.cameron@huawei.com>
> Subject: Re: [PATCH v3] mm/hugetlb: avoid hardcoding while checking if cma
> is enable
>
> Looks like this produced a warning in linux-next.  I suspect it is due to the
> combination CONFIG_HUGETLB_PAGE && !CONFIG_CMA.
>
> Instead of adding the routine hugetlb_cma_enabled() to scan the hugetlb_cma
> array, could we just use a boolean as follows?  It can simply be set in
> hugetlb_cma_reserve when we reserve CMA.

Maybe just use hugetlb_cma_size?  If hugetlb_cma_size is not 0, someone is
trying to use CMA, and then bootmem pre-allocation of gigantic pages will be
totally ignored, according to the discussion here:
https://lkml.org/lkml/2020/7/8/1288

What if somebody sets a wrong hugetlb_cma_size that causes CMA not to be
reserved?  Is that the fault of the user?  We would just need to document
that hugetlb_cma overwrites bootmem reservations.

> --
> Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index fab4485b9e52..92cb882cf287 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -46,6 +46,7 @@ unsigned int default_hstate_idx;
>  struct hstate hstates[HUGE_MAX_HSTATE];
>
>  static struct cma *hugetlb_cma[MAX_NUMNODES];
> +static bool hugetlb_cma_enabled = false;
>
>  /*
>   * Minimum page order among possible hugepage sizes, set to a proper value
> @@ -2571,7 +2572,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>
>  	for (i = 0; i < h->max_huge_pages; ++i) {
>  		if (hstate_is_gigantic(h)) {
> -			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
> +			if (hugetlb_cma_enabled) {
>  				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
>  				break;
>  			}
> @@ -5708,6 +5709,7 @@ void __init hugetlb_cma_reserve(int order)
>  		reserved += size;
>  		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
>  			size / SZ_1M, nid);
> +		hugetlb_cma_enabled = true;
>
>  		if (reserved >= hugetlb_cma_size)
>  			break;

Thanks
Barry
On 7/9/20 4:45 PM, Song Bao Hua (Barry Song) wrote:
>> -----Original Message-----
>> From: Mike Kravetz [mailto:mike.kravetz@oracle.com]
>> Sent: Friday, July 10, 2020 6:58 AM
>> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>; Roman
>> Gushchin <guro@fb.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>; linux-mm@kvack.org;
>> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Jonathan
>> Cameron <jonathan.cameron@huawei.com>
>> Subject: Re: [PATCH v3] mm/hugetlb: avoid hardcoding while checking if cma
>> is enable
>>
>> Looks like this produced a warning in linux-next.  I suspect it is due to the
>> combination CONFIG_HUGETLB_PAGE && !CONFIG_CMA.
>>
>> Instead of adding the routine hugetlb_cma_enabled() to scan the hugetlb_cma
>> array, could we just use a boolean as follows?  It can simply be set in
>> hugetlb_cma_reserve when we reserve CMA.
>
> Maybe just use hugetlb_cma_size?  If hugetlb_cma_size is not 0, someone is
> trying to use CMA, and then bootmem pre-allocation of gigantic pages will be
> totally ignored, according to the discussion here:
> https://lkml.org/lkml/2020/7/8/1288
>
> What if somebody sets a wrong hugetlb_cma_size that causes CMA not to be
> reserved?  Is that the fault of the user?  We would just need to document
> that hugetlb_cma overwrites bootmem reservations.

Yes, I think using hugetlb_cma_size would be sufficient.

If someone specifies hugetlb_cma=<N> and hugepagesz=<gigantic_page_size>
hugepages=<X>, that is wrong.  I don't think we need to worry about 'falling
back' to pre-allocating X gigantic pages if N is a bad value, or even if the
arch does not support CMA allocation.

I am working on a patch to check this earlier in command line processing.
That will make this check unnecessary.  However, that patch is based on new
command line processing code only in 5.8.  So, I think we still need to do
this so that it can be backported to stable.
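For reference, a minimal sketch of the variant being agreed on here, keyed off hugetlb_cma_size instead of a separate boolean or the hugetlb_cma[] array.  Whether hugetlb_cma_size is already visible as a file-scope variable at this point in mm/hugetlb.c is an assumption, and the surrounding function is abbreviated.

/*
 * Sketch of the hugetlb_cma_size-based check discussed above; the
 * placement of hugetlb_cma_size is an assumption and the rest of the
 * function is elided.
 */
static unsigned long hugetlb_cma_size __initdata;	/* set by hugetlb_cma= parsing */

static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
{
	unsigned long i;

	for (i = 0; i < h->max_huge_pages; ++i) {
		if (hstate_is_gigantic(h)) {
			/* a non-zero hugetlb_cma= means CMA is (or should be) in use */
			if (hugetlb_cma_size) {
				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
				break;
			}
			/* ... otherwise fall through to bootmem allocation as before ... */
		}
		/* ... non-gigantic allocation path unchanged ... */
	}
}

Because hugetlb_cma_size is only non-zero when the user passed hugetlb_cma=, this version also builds cleanly with CONFIG_HUGETLB_PAGE && !CONFIG_CMA, which is the linux-next warning Mike mentions above.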
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 57ece74e3aae..d293c823121e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2546,6 +2546,20 @@ static void __init gather_bootmem_prealloc(void)
 	}
 }
 
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+	int node;
+
+	for_each_online_node(node) {
+		if (hugetlb_cma[node])
+			return true;
+	}
+#endif
+
+	return false;
+}
+
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
@@ -2571,7 +2585,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_enabled()) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}