Message ID | 20220623235153.2623702-2-naoya.horiguchi@linux.dev (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm, hwpoison: enable 1GB hugepage support (v2) | expand |
On 2022/6/24 7:51, Naoya Horiguchi wrote: > From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > I found a weird state of 1GB hugepage pool, caused by the following > procedure: > > - run a process reserving all free 1GB hugepages, > - shrink free 1GB hugepage pool to zero (i.e. writing 0 to > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then > - kill the reserving process. > > , then all the hugepages are free *and* surplus at the same time. > > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages > 3 > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages > 3 > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages > 0 > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages > 3 > > This state is resolved by reserving and allocating the pages then > freeing them again, so this seems not to result in serious problem. > But it's a little surprizing (shrinking pool suddenly fails). > > This behavior is caused by hstate_is_gigantic() check in > return_unused_surplus_pages(). This was introduced so long ago in 2008 > by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and > it seems to me that this check is no longer unnecessary. Let's remove it. s/unnecessary/necessary/ > > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > --- > mm/hugetlb.c | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index a57e1be41401..c538278170a2 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, > /* Uncommit the reservation */ > h->resv_huge_pages -= unused_resv_pages; > > - /* Cannot return gigantic pages currently */ > - if (hstate_is_gigantic(h)) > - goto out; > - IIUC it might be better to do the below check: /* * Cannot return gigantic pages currently if runtime gigantic page * allocation is not supported. */ if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) goto out; But I might be miss something. Thanks. > /* > * Part (or even all) of the reservation could have been backed > * by pre-allocated pages. Only free surplus pages. >
On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > On 2022/6/24 7:51, Naoya Horiguchi wrote: > > From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > I found a weird state of 1GB hugepage pool, caused by the following > > procedure: > > > > - run a process reserving all free 1GB hugepages, > > - shrink free 1GB hugepage pool to zero (i.e. writing 0 to > > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then > > - kill the reserving process. > > > > , then all the hugepages are free *and* surplus at the same time. > > > > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages > > 3 > > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages > > 3 > > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages > > 0 > > $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages > > 3 > > > > This state is resolved by reserving and allocating the pages then > > freeing them again, so this seems not to result in serious problem. > > But it's a little surprizing (shrinking pool suddenly fails). > > > > This behavior is caused by hstate_is_gigantic() check in > > return_unused_surplus_pages(). This was introduced so long ago in 2008 > > by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and > > it seems to me that this check is no longer unnecessary. Let's remove it. > > s/unnecessary/necessary/ > > > > > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > --- > > mm/hugetlb.c | 4 ---- > > 1 file changed, 4 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index a57e1be41401..c538278170a2 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, > > /* Uncommit the reservation */ > > h->resv_huge_pages -= unused_resv_pages; > > > > - /* Cannot return gigantic pages currently */ > > - if (hstate_is_gigantic(h)) > > - goto out; > > - > > IIUC it might be better to do the below check: > /* > * Cannot return gigantic pages currently if runtime gigantic page > * allocation is not supported. > */ > if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > goto out; > The change looks good to me. However, the comments above is unnecessary since gigantic_page_runtime_supported() is straightforward. Thanks. > But I might be miss something. > > Thanks. > > > /* > > * Part (or even all) of the reservation could have been backed > > * by pre-allocated pages. Only free surplus pages. > > > >
On 2022/6/24 16:03, Muchun Song wrote: > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: >> On 2022/6/24 7:51, Naoya Horiguchi wrote: >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> >>> >>> I found a weird state of 1GB hugepage pool, caused by the following >>> procedure: >>> >>> - run a process reserving all free 1GB hugepages, >>> - shrink free 1GB hugepage pool to zero (i.e. writing 0 to >>> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then >>> - kill the reserving process. >>> >>> , then all the hugepages are free *and* surplus at the same time. >>> >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages >>> 3 >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages >>> 3 >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages >>> 0 >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages >>> 3 >>> >>> This state is resolved by reserving and allocating the pages then >>> freeing them again, so this seems not to result in serious problem. >>> But it's a little surprizing (shrinking pool suddenly fails). >>> >>> This behavior is caused by hstate_is_gigantic() check in >>> return_unused_surplus_pages(). This was introduced so long ago in 2008 >>> by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and >>> it seems to me that this check is no longer unnecessary. Let's remove it. >> >> s/unnecessary/necessary/ >> >>> >>> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> >>> --- >>> mm/hugetlb.c | 4 ---- >>> 1 file changed, 4 deletions(-) >>> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>> index a57e1be41401..c538278170a2 100644 >>> --- a/mm/hugetlb.c >>> +++ b/mm/hugetlb.c >>> @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, >>> /* Uncommit the reservation */ >>> h->resv_huge_pages -= unused_resv_pages; >>> >>> - /* Cannot return gigantic pages currently */ >>> - if (hstate_is_gigantic(h)) >>> - goto out; >>> - >> >> IIUC it might be better to do the below check: >> /* >> * Cannot return gigantic pages currently if runtime gigantic page >> * allocation is not supported. >> */ >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) >> goto out; >> > > The change looks good to me. However, the comments above is unnecessary > since gigantic_page_runtime_supported() is straightforward. Agree. The comments can be removed. > > Thanks. Thanks for reviewing. > >> But I might be miss something. >> >> Thanks. >> >>> /* >>> * Part (or even all) of the reservation could have been backed >>> * by pre-allocated pages. Only free surplus pages. >>> >> >> > . >
On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > On 2022/6/24 16:03, Muchun Song wrote: > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > >>> > >>> I found a weird state of 1GB hugepage pool, caused by the following > >>> procedure: > >>> > >>> - run a process reserving all free 1GB hugepages, > >>> - shrink free 1GB hugepage pool to zero (i.e. writing 0 to > >>> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then > >>> - kill the reserving process. > >>> > >>> , then all the hugepages are free *and* surplus at the same time. > >>> > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages > >>> 3 > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages > >>> 3 > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages > >>> 0 > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages > >>> 3 > >>> > >>> This state is resolved by reserving and allocating the pages then > >>> freeing them again, so this seems not to result in serious problem. > >>> But it's a little surprizing (shrinking pool suddenly fails). > >>> > >>> This behavior is caused by hstate_is_gigantic() check in > >>> return_unused_surplus_pages(). This was introduced so long ago in 2008 > >>> by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and > >>> it seems to me that this check is no longer unnecessary. Let's remove it. > >> > >> s/unnecessary/necessary/ Thanks, I fixed it. > >> > >>> > >>> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > >>> --- > >>> mm/hugetlb.c | 4 ---- > >>> 1 file changed, 4 deletions(-) > >>> > >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > >>> index a57e1be41401..c538278170a2 100644 > >>> --- a/mm/hugetlb.c > >>> +++ b/mm/hugetlb.c > >>> @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, > >>> /* Uncommit the reservation */ > >>> h->resv_huge_pages -= unused_resv_pages; > >>> > >>> - /* Cannot return gigantic pages currently */ > >>> - if (hstate_is_gigantic(h)) > >>> - goto out; > >>> - > >> > >> IIUC it might be better to do the below check: > >> /* > >> * Cannot return gigantic pages currently if runtime gigantic page > >> * allocation is not supported. > >> */ > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > >> goto out; > >> > > > > The change looks good to me. However, the comments above is unnecessary > > since gigantic_page_runtime_supported() is straightforward. > > Agree. The comments can be removed. Thank you, both. Adding !gigantic_page_runtime_supported without comment makes sense to me. Thanks, Naoya Horiguchi
On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: > On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > > On 2022/6/24 16:03, Muchun Song wrote: > > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > >>> > > >>> I found a weird state of 1GB hugepage pool, caused by the following > > >>> procedure: > > >>> > > >>> - run a process reserving all free 1GB hugepages, > > >>> - shrink free 1GB hugepage pool to zero (i.e. writing 0 to > > >>> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then > > >>> - kill the reserving process. > > >>> > > >>> , then all the hugepages are free *and* surplus at the same time. > > >>> > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages > > >>> 3 > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages > > >>> 3 > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages > > >>> 0 > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages > > >>> 3 > > >>> > > >>> This state is resolved by reserving and allocating the pages then > > >>> freeing them again, so this seems not to result in serious problem. > > >>> But it's a little surprizing (shrinking pool suddenly fails). > > >>> > > >>> This behavior is caused by hstate_is_gigantic() check in > > >>> return_unused_surplus_pages(). This was introduced so long ago in 2008 > > >>> by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and > > >>> it seems to me that this check is no longer unnecessary. Let's remove it. Thank you for finding this!!! > > >>> +++ b/mm/hugetlb.c > > >>> @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, > > >>> /* Uncommit the reservation */ > > >>> h->resv_huge_pages -= unused_resv_pages; > > >>> > > >>> - /* Cannot return gigantic pages currently */ > > >>> - if (hstate_is_gigantic(h)) > > >>> - goto out; > > >>> - > > >> > > >> IIUC it might be better to do the below check: > > >> /* > > >> * Cannot return gigantic pages currently if runtime gigantic page > > >> * allocation is not supported. > > >> */ > > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > > >> goto out; > > >> > > > > > > The change looks good to me. However, the comments above is unnecessary > > > since gigantic_page_runtime_supported() is straightforward. > > > > Agree. The comments can be removed. > > Thank you, both. Adding !gigantic_page_runtime_supported without comment > makes sense to me. The change above makes sense to me. However, ... If we make the change above, will we have the same strange situation described in the commit message when !gigantic_page_runtime_supported() is true? IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb pages can not be allocated or freed at run time. They can only be allocated at boot time. So, there should NEVER be surplus gigantic pages if !gigantic_page_runtime_supported(). To avoid this situation, perhaps we should change set_max_huge_pages as follows (not tested)?
On Fri, Jun 24, 2022 at 12:11:07PM -0700, Mike Kravetz wrote: > On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: > > On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > > > On 2022/6/24 16:03, Muchun Song wrote: > > > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > > > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > > > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > >>> > > > >>> I found a weird state of 1GB hugepage pool, caused by the following > > > >>> procedure: > > > >>> > > > >>> - run a process reserving all free 1GB hugepages, > > > >>> - shrink free 1GB hugepage pool to zero (i.e. writing 0 to > > > >>> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then > > > >>> - kill the reserving process. > > > >>> > > > >>> , then all the hugepages are free *and* surplus at the same time. > > > >>> > > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages > > > >>> 3 > > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages > > > >>> 3 > > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages > > > >>> 0 > > > >>> $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages > > > >>> 3 > > > >>> > > > >>> This state is resolved by reserving and allocating the pages then > > > >>> freeing them again, so this seems not to result in serious problem. > > > >>> But it's a little surprizing (shrinking pool suddenly fails). > > > >>> > > > >>> This behavior is caused by hstate_is_gigantic() check in > > > >>> return_unused_surplus_pages(). This was introduced so long ago in 2008 > > > >>> by commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and > > > >>> it seems to me that this check is no longer unnecessary. Let's remove it. > > Thank you for finding this!!! > > > > >>> +++ b/mm/hugetlb.c > > > >>> @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, > > > >>> /* Uncommit the reservation */ > > > >>> h->resv_huge_pages -= unused_resv_pages; > > > >>> > > > >>> - /* Cannot return gigantic pages currently */ > > > >>> - if (hstate_is_gigantic(h)) > > > >>> - goto out; > > > >>> - > > > >> > > > >> IIUC it might be better to do the below check: > > > >> /* > > > >> * Cannot return gigantic pages currently if runtime gigantic page > > > >> * allocation is not supported. > > > >> */ > > > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > > > >> goto out; > > > >> > > > > > > > > The change looks good to me. However, the comments above is unnecessary > > > > since gigantic_page_runtime_supported() is straightforward. > > > > > > Agree. The comments can be removed. > > > > Thank you, both. Adding !gigantic_page_runtime_supported without comment > > makes sense to me. > > The change above makes sense to me. However, ... > > If we make the change above, will we have the same strange situation described > in the commit message when !gigantic_page_runtime_supported() is true? > > IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb > pages can not be allocated or freed at run time. They can only be > allocated at boot time. So, there should NEVER be surplus gigantic > pages if !gigantic_page_runtime_supported(). I have the same understanding as the above. > To avoid this situation, > perhaps we should change set_max_huge_pages as follows (not tested)? The suggested diff looks clearer about what it does, so I'd like to take it in the next version. Then, what do we do on the "if (hstate_if_gigantic())" check in return_unused_surplus_pages in the original suggestion? Should it be kept as is, or removed, or checked with !gigantic_page_runtime_supported()? I guess that the new checks prevent calling return_unused_surplus_pages() during pool shrinking, so the check seems not necessary any more. Thanks, Naoya Horiguchi > > -- > Mike Kravetz > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 5eabb8009d8a..c78d6c20e6b0 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -3416,7 +3416,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid, > * the user tries to allocate gigantic pages but let the user free the > * boottime allocated gigantic pages. > */ > - if (hstate_is_gigantic(h) && !IS_ENABLED(CONFIG_CONTIG_ALLOC)) { > + if (hstate_is_gigantic(h) && (!IS_ENABLED(CONFIG_CONTIG_ALLOC) || > + !gigantic_page_runtime_supported())) { > if (count > persistent_huge_pages(h)) { > spin_unlock_irq(&hugetlb_lock); > mutex_unlock(&h->resize_lock); > @@ -3464,6 +3465,19 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid, > goto out; > } > > + /* > + * We can not decrease gigantic pool size if runtime modification > + * is not supported. > + */ > + if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) { > + if (count < persistent_huge_pages(h)) { > + spin_unlock_irq(&hugetlb_lock); > + mutex_unlock(&h->resize_lock); > + NODEMASK_FREE(node_alloc_noretry); > + return -EINVAL; > + } > + } > + > /* > * Decrease the pool size > * First return free pages to the buddy allocator (being careful
On 06/27/22 06:02, HORIGUCHI NAOYA(堀口 直也) wrote: > On Fri, Jun 24, 2022 at 12:11:07PM -0700, Mike Kravetz wrote: > > On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: > > > On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > > > > On 2022/6/24 16:03, Muchun Song wrote: > > > > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > > > > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > > > > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > >> > > > > >> IIUC it might be better to do the below check: > > > > >> /* > > > > >> * Cannot return gigantic pages currently if runtime gigantic page > > > > >> * allocation is not supported. > > > > >> */ > > > > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > > > > >> goto out; > > > > >> > > > > > > > > > > The change looks good to me. However, the comments above is unnecessary > > > > > since gigantic_page_runtime_supported() is straightforward. > > > > > > > > Agree. The comments can be removed. > > > > > > Thank you, both. Adding !gigantic_page_runtime_supported without comment > > > makes sense to me. > > > > The change above makes sense to me. However, ... > > > > If we make the change above, will we have the same strange situation described > > in the commit message when !gigantic_page_runtime_supported() is true? > > > > IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb > > pages can not be allocated or freed at run time. They can only be > > allocated at boot time. So, there should NEVER be surplus gigantic > > pages if !gigantic_page_runtime_supported(). > > I have the same understanding as the above. > > > To avoid this situation, > > perhaps we should change set_max_huge_pages as follows (not tested)? > > The suggested diff looks clearer about what it does, so I'd like to take it > in the next version. Then, what do we do on the "if (hstate_if_gigantic())" > check in return_unused_surplus_pages in the original suggestion? Should it > be kept as is, or removed, or checked with !gigantic_page_runtime_supported()? > I guess that the new checks prevent calling return_unused_surplus_pages() > during pool shrinking, so the check seems not necessary any more. My first thought was to keep the check in return_unused_surplus_pages() as it is called in other code paths. However, it SHOULD only try to free surplus hugetlb pages. With the modifications to set_max_huge_pages we will not have any surplus gigantic pages if !gigantic_page_runtime_supported, so the check can be removed. Also note that we never try to dynamically allocate surplus gigantic pages. This also is left over from the time when we could not easily allocate a gigantic page at runtime. It would not surprise me if someone found a use case to ease this restriction in the future. Especially so if 1G THP support is ever added. If this happens, the check would be necessary and I would guess that it would not be added. Sorry for thinking our loud!!! Although not necessary, it 'might' be a good idea to leave the check because it would be overlooked if dynamic allocation of gigantic surplus pages is ever added. I do not have a strong opinion. P.S. This also reminds me that a similar check should be added to the demote hugetlb code path. It would be bad if !gigantic_page_runtime_supported and we demoted a gigantic page into numerous non-gigantic pages. I will send a patch.
On 2022/6/28 1:25, Mike Kravetz wrote: > On 06/27/22 06:02, HORIGUCHI NAOYA(堀口 直也) wrote: >> On Fri, Jun 24, 2022 at 12:11:07PM -0700, Mike Kravetz wrote: >>> On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: >>>> On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: >>>>> On 2022/6/24 16:03, Muchun Song wrote: >>>>>> On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: >>>>>>> On 2022/6/24 7:51, Naoya Horiguchi wrote: >>>>>>>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> >>>>>>> >>>>>>> IIUC it might be better to do the below check: >>>>>>> /* >>>>>>> * Cannot return gigantic pages currently if runtime gigantic page >>>>>>> * allocation is not supported. >>>>>>> */ >>>>>>> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) >>>>>>> goto out; >>>>>>> >>>>>> >>>>>> The change looks good to me. However, the comments above is unnecessary >>>>>> since gigantic_page_runtime_supported() is straightforward. >>>>> >>>>> Agree. The comments can be removed. >>>> >>>> Thank you, both. Adding !gigantic_page_runtime_supported without comment >>>> makes sense to me. >>> >>> The change above makes sense to me. However, ... >>> >>> If we make the change above, will we have the same strange situation described >>> in the commit message when !gigantic_page_runtime_supported() is true? >>> >>> IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb >>> pages can not be allocated or freed at run time. They can only be >>> allocated at boot time. So, there should NEVER be surplus gigantic >>> pages if !gigantic_page_runtime_supported(). >> >> I have the same understanding as the above. >> >>> To avoid this situation, >>> perhaps we should change set_max_huge_pages as follows (not tested)? >> >> The suggested diff looks clearer about what it does, so I'd like to take it >> in the next version. Then, what do we do on the "if (hstate_if_gigantic())" >> check in return_unused_surplus_pages in the original suggestion? Should it >> be kept as is, or removed, or checked with !gigantic_page_runtime_supported()? >> I guess that the new checks prevent calling return_unused_surplus_pages() >> during pool shrinking, so the check seems not necessary any more. > > My first thought was to keep the check in return_unused_surplus_pages() as it > is called in other code paths. However, it SHOULD only try to free surplus > hugetlb pages. With the modifications to set_max_huge_pages we will not > have any surplus gigantic pages if !gigantic_page_runtime_supported, so > the check can be removed. > > Also note that we never try to dynamically allocate surplus gigantic pages. > This also is left over from the time when we could not easily allocate a > gigantic page at runtime. It would not surprise me if someone found a use > case to ease this restriction in the future. Especially so if 1G THP support > is ever added. If this happens, the check would be necessary and I would > guess that it would not be added. > > Sorry for thinking our loud!!! Although not necessary, it 'might' be a good > idea to leave the check because it would be overlooked if dynamic allocation > of gigantic surplus pages is ever added. I do not have a strong opinion. > > P.S. This also reminds me that a similar check should be added to the > demote hugetlb code path. It would be bad if !gigantic_page_runtime_supported > and we demoted a gigantic page into numerous non-gigantic pages. I will > send a patch. Out-of-topic There're some places check "if (hstate_is_gigantic(h))" while others check "if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())". Should we make it explicit in some manner when gigantic_page_runtime_supported is needed to make code easier to follow? Just a trivial suggestion. Thanks! >
On Mon, Jun 27, 2022 at 10:25:13AM -0700, Mike Kravetz wrote: > On 06/27/22 06:02, HORIGUCHI NAOYA(堀口 直也) wrote: > > On Fri, Jun 24, 2022 at 12:11:07PM -0700, Mike Kravetz wrote: > > > On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: > > > > On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > > > > > On 2022/6/24 16:03, Muchun Song wrote: > > > > > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > > > > > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > > > > > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > > >> > > > > > >> IIUC it might be better to do the below check: > > > > > >> /* > > > > > >> * Cannot return gigantic pages currently if runtime gigantic page > > > > > >> * allocation is not supported. > > > > > >> */ > > > > > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > > > > > >> goto out; > > > > > >> > > > > > > > > > > > > The change looks good to me. However, the comments above is unnecessary > > > > > > since gigantic_page_runtime_supported() is straightforward. > > > > > > > > > > Agree. The comments can be removed. > > > > > > > > Thank you, both. Adding !gigantic_page_runtime_supported without comment > > > > makes sense to me. > > > > > > The change above makes sense to me. However, ... > > > > > > If we make the change above, will we have the same strange situation described > > > in the commit message when !gigantic_page_runtime_supported() is true? > > > > > > IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb > > > pages can not be allocated or freed at run time. They can only be > > > allocated at boot time. So, there should NEVER be surplus gigantic > > > pages if !gigantic_page_runtime_supported(). > > > > I have the same understanding as the above. > > > > > To avoid this situation, > > > perhaps we should change set_max_huge_pages as follows (not tested)? > > > > The suggested diff looks clearer about what it does, so I'd like to take it > > in the next version. Then, what do we do on the "if (hstate_if_gigantic())" > > check in return_unused_surplus_pages in the original suggestion? Should it > > be kept as is, or removed, or checked with !gigantic_page_runtime_supported()? > > I guess that the new checks prevent calling return_unused_surplus_pages() > > during pool shrinking, so the check seems not necessary any more. > > My first thought was to keep the check in return_unused_surplus_pages() as it > is called in other code paths. However, it SHOULD only try to free surplus > hugetlb pages. With the modifications to set_max_huge_pages we will not > have any surplus gigantic pages if !gigantic_page_runtime_supported, so > the check can be removed. > > Also note that we never try to dynamically allocate surplus gigantic pages. > This also is left over from the time when we could not easily allocate a > gigantic page at runtime. It would not surprise me if someone found a use > case to ease this restriction in the future. Especially so if 1G THP support > is ever added. If this happens, the check would be necessary and I would > guess that it would not be added. > > Sorry for thinking our loud!!! Although not necessary, it 'might' be a good > idea to leave the check because it would be overlooked if dynamic allocation > of gigantic surplus pages is ever added. I do not have a strong opinion. OK, so let's keep the check. > > P.S. This also reminds me that a similar check should be added to the > demote hugetlb code path. It would be bad if !gigantic_page_runtime_supported > and we demoted a gigantic page into numerous non-gigantic pages. I will > send a patch. Sounds nice. Thanks, Naoya Horiguchi
On Tue, Jun 28, 2022 at 08:38:08AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote: > On Mon, Jun 27, 2022 at 10:25:13AM -0700, Mike Kravetz wrote: > > On 06/27/22 06:02, HORIGUCHI NAOYA(堀口 直也) wrote: > > > On Fri, Jun 24, 2022 at 12:11:07PM -0700, Mike Kravetz wrote: > > > > On 06/24/22 08:34, HORIGUCHI NAOYA(堀口 直也) wrote: > > > > > On Fri, Jun 24, 2022 at 04:15:19PM +0800, Miaohe Lin wrote: > > > > > > On 2022/6/24 16:03, Muchun Song wrote: > > > > > > > On Fri, Jun 24, 2022 at 10:25:48AM +0800, Miaohe Lin wrote: > > > > > > >> On 2022/6/24 7:51, Naoya Horiguchi wrote: > > > > > > >>> From: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > > > >> > > > > > > >> IIUC it might be better to do the below check: > > > > > > >> /* > > > > > > >> * Cannot return gigantic pages currently if runtime gigantic page > > > > > > >> * allocation is not supported. > > > > > > >> */ > > > > > > >> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) > > > > > > >> goto out; > > > > > > >> > > > > > > > > > > > > > > The change looks good to me. However, the comments above is unnecessary > > > > > > > since gigantic_page_runtime_supported() is straightforward. > > > > > > > > > > > > Agree. The comments can be removed. > > > > > > > > > > Thank you, both. Adding !gigantic_page_runtime_supported without comment > > > > > makes sense to me. > > > > > > > > The change above makes sense to me. However, ... > > > > > > > > If we make the change above, will we have the same strange situation described > > > > in the commit message when !gigantic_page_runtime_supported() is true? > > > > > > > > IIUC, !gigantic_page_runtime_supported implies that gigantic hugetlb > > > > pages can not be allocated or freed at run time. They can only be > > > > allocated at boot time. So, there should NEVER be surplus gigantic > > > > pages if !gigantic_page_runtime_supported(). > > > > > > I have the same understanding as the above. > > > > > > > To avoid this situation, > > > > perhaps we should change set_max_huge_pages as follows (not tested)? > > > > > > The suggested diff looks clearer about what it does, so I'd like to take it > > > in the next version. Then, what do we do on the "if (hstate_if_gigantic())" > > > check in return_unused_surplus_pages in the original suggestion? Should it > > > be kept as is, or removed, or checked with !gigantic_page_runtime_supported()? > > > I guess that the new checks prevent calling return_unused_surplus_pages() > > > during pool shrinking, so the check seems not necessary any more. > > > > My first thought was to keep the check in return_unused_surplus_pages() as it > > is called in other code paths. However, it SHOULD only try to free surplus > > hugetlb pages. With the modifications to set_max_huge_pages we will not > > have any surplus gigantic pages if !gigantic_page_runtime_supported, so > > the check can be removed. > > > > Also note that we never try to dynamically allocate surplus gigantic pages. > > This also is left over from the time when we could not easily allocate a > > gigantic page at runtime. It would not surprise me if someone found a use > > case to ease this restriction in the future. Especially so if 1G THP support > > is ever added. If this happens, the check would be necessary and I would > > guess that it would not be added. > > > > Sorry for thinking our loud!!! Although not necessary, it 'might' be a good > > idea to leave the check because it would be overlooked if dynamic allocation > > of gigantic surplus pages is ever added. I do not have a strong opinion. > > OK, so let's keep the check. Sorry, I found that keeping the check doesn't fix the problem. I'll update the check with !gigantic_page_runtime_supported(). - Naoya Horiguchi
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a57e1be41401..c538278170a2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2432,10 +2432,6 @@ static void return_unused_surplus_pages(struct hstate *h, /* Uncommit the reservation */ h->resv_huge_pages -= unused_resv_pages; - /* Cannot return gigantic pages currently */ - if (hstate_is_gigantic(h)) - goto out; - /* * Part (or even all) of the reservation could have been backed * by pre-allocated pages. Only free surplus pages.