Message ID | 1538482531-26883-2-git-send-email-anshuman.khandual@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | arm64/mm: Enable HugeTLB migration |
Hi Anshuman

On 02/10/18 13:15, Anshuman Khandual wrote:
> Architectures like arm64 have PUD level HugeTLB pages for certain configs
> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be
> enabled for migration. It can be achieved through checking for PUD_SHIFT
> order based HugeTLB pages during migration.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  include/linux/hugetlb.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 6b68e34..9c1b77f 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h)
>  {
>  #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
>  	if ((huge_page_shift(h) == PMD_SHIFT) ||
> -		(huge_page_shift(h) == PGDIR_SHIFT))
> +		(huge_page_shift(h) == PUD_SHIFT) ||
> +			(huge_page_shift(h) == PGDIR_SHIFT))

nit: Extra Tab ^^.

Also, if only arm64 supports PUD_SHIFT, should this be added only in the
arm64 specific backend, which we introduce later ?

Suzuki
On Tue 02-10-18 17:45:28, Anshuman Khandual wrote:
> Architectures like arm64 have PUD level HugeTLB pages for certain configs
> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be
> enabled for migration. It can be achieved through checking for PUD_SHIFT
> order based HugeTLB pages during migration.

Well a long term problem with hugepage_migration_supported is that it is
used in two different context 1) to bail out from the migration early
because the arch doesn't support migration at all and 2) to use movable
zone for hugetlb pages allocation. I am especially concerned about the
later because the mere support for migration is not really good enough.
Are you really able to find a different giga page during the runtime to
move an existing giga page out of the movable zone?

So I guess we want to split this into two functions
arch_hugepage_migration_supported and hugepage_movable. The later would
be a reasonably migrateable subset of the former. Without that this
patch migth introduce subtle regressions when somebody relies on movable
zone to be really movable.

> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  include/linux/hugetlb.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 6b68e34..9c1b77f 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h)
>  {
>  #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
>  	if ((huge_page_shift(h) == PMD_SHIFT) ||
> -		(huge_page_shift(h) == PGDIR_SHIFT))
> +		(huge_page_shift(h) == PUD_SHIFT) ||
> +			(huge_page_shift(h) == PGDIR_SHIFT))
>  		return true;
>  	else
>  		return false;
> --
> 2.7.4
On 10/02/2018 06:08 PM, Suzuki K Poulose wrote: > Hi Anshuman > > On 02/10/18 13:15, Anshuman Khandual wrote: >> Architectures like arm64 have PUD level HugeTLB pages for certain configs >> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >> enabled for migration. It can be achieved through checking for PUD_SHIFT >> order based HugeTLB pages during migration. >> >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> include/linux/hugetlb.h | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >> index 6b68e34..9c1b77f 100644 >> --- a/include/linux/hugetlb.h >> +++ b/include/linux/hugetlb.h >> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h) >> { >> #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION >> if ((huge_page_shift(h) == PMD_SHIFT) || >> - (huge_page_shift(h) == PGDIR_SHIFT)) >> + (huge_page_shift(h) == PUD_SHIFT) || > > >> + (huge_page_shift(h) == PGDIR_SHIFT)) > > nit: Extra Tab ^^. The tab is in there when you apply this patch and all three checks are tab separated in a newline. > Also, if only arm64 supports PUD_SHIFT, should this be added only in the arm64 specific backend, which we introduce later ? Even if with the platform can add this up in the back end, I would think having this on for default fall back function makes it complete.
On 10/02/2018 06:09 PM, Michal Hocko wrote: > On Tue 02-10-18 17:45:28, Anshuman Khandual wrote: >> Architectures like arm64 have PUD level HugeTLB pages for certain configs >> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >> enabled for migration. It can be achieved through checking for PUD_SHIFT >> order based HugeTLB pages during migration. > > Well a long term problem with hugepage_migration_supported is that it is > used in two different context 1) to bail out from the migration early > because the arch doesn't support migration at all and 2) to use movable > zone for hugetlb pages allocation. I am especially concerned about the > later because the mere support for migration is not really good enough. > Are you really able to find a different giga page during the runtime to > move an existing giga page out of the movable zone? I pre-allocate them before trying to initiate the migration (soft offline in my experiments). Hence it should come from the pre-allocated HugeTLB pool instead from the buddy. I might be missing something here but do we ever allocate HugeTLB on the go when trying to migrate ? IIUC it always came from the pool (unless its something related to ovecommit/surplus). Could you please kindly explain regarding how migration target HugeTLB pages are allocated on the fly from movable zone. But even if there are some chances of run time allocation failure from movable zone (as in point 2) that should not block the very initiation of migration itself. IIUC thats not the semantics for either THP or normal pages. Why should it be different here. If the allocation fails we should report and abort as always. Its the caller of migration taking the chances. why should we prevent that. > > So I guess we want to split this into two functions > arch_hugepage_migration_supported and hugepage_movable. The later would So the set difference between arch_hugepage_migration_supported and hugepage_movable still remains un-migratable ? Then what is the purpose for arch_hugepage_migration_supported page size set in the first place. Does it mean we allow the migration at the beginning and the abort later when the page size does not fall within the subset for hugepage_movable. Could you please kindly explain this in more detail. > be a reasonably migrateable subset of the former. Without that this > patch migth introduce subtle regressions when somebody relies on movable > zone to be really movable. PUD based HugeTLB pages were never migratable, then how can there be any regression here ? At present we even support PGD based HugeTLB pages for migration. Wondering how PUD based ones are going to be any different.
On Wed 03-10-18 07:46:27, Anshuman Khandual wrote: > > > On 10/02/2018 06:09 PM, Michal Hocko wrote: > > On Tue 02-10-18 17:45:28, Anshuman Khandual wrote: > >> Architectures like arm64 have PUD level HugeTLB pages for certain configs > >> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be > >> enabled for migration. It can be achieved through checking for PUD_SHIFT > >> order based HugeTLB pages during migration. > > > > Well a long term problem with hugepage_migration_supported is that it is > > used in two different context 1) to bail out from the migration early > > because the arch doesn't support migration at all and 2) to use movable > > zone for hugetlb pages allocation. I am especially concerned about the > > later because the mere support for migration is not really good enough. > > Are you really able to find a different giga page during the runtime to > > move an existing giga page out of the movable zone? > > I pre-allocate them before trying to initiate the migration (soft offline > in my experiments). Hence it should come from the pre-allocated HugeTLB > pool instead from the buddy. I might be missing something here but do we > ever allocate HugeTLB on the go when trying to migrate ? IIUC it always > came from the pool (unless its something related to ovecommit/surplus). > Could you please kindly explain regarding how migration target HugeTLB > pages are allocated on the fly from movable zone. Hotplug comes to mind. You usually do not pre-allocate to cover full node going offline. And people would like to do that. Another example is CMA. You would really like to move pages out of the way. > But even if there are some chances of run time allocation failure from > movable zone (as in point 2) that should not block the very initiation > of migration itself. IIUC thats not the semantics for either THP or > normal pages. Why should it be different here. If the allocation fails > we should report and abort as always. Its the caller of migration taking > the chances. why should we prevent that. Yes I agree, hence the distinction between the arch support for migrateability and the criterion on the movable zone placement. > > > > So I guess we want to split this into two functions > > arch_hugepage_migration_supported and hugepage_movable. The later would > > So the set difference between arch_hugepage_migration_supported and > hugepage_movable still remains un-migratable ? Then what is the purpose > for arch_hugepage_migration_supported page size set in the first place. > Does it mean we allow the migration at the beginning and the abort later > when the page size does not fall within the subset for hugepage_movable. > Could you please kindly explain this in more detail. The purpose of arch_hugepage_migration_supported is to tell whether it makes any sense to even try to migration. The allocation placement is completely independent on this choice. The later just says whether it is feasible to place a hugepage to the zone movable. Sure regular 2MB pages do not guarantee movability as well because of the memory fragmentation. But allocating a 2MB is a completely different storage from 1G or even larger huge pages, isn't it? > > be a reasonably migrateable subset of the former. Without that this > > patch migth introduce subtle regressions when somebody relies on movable > > zone to be really movable. > PUD based HugeTLB pages were never migratable, then how can there be any > regression here ? 
That means that they haven't been allocated from the movable zone before. Which is going to change by this patch. > At present we even support PGD based HugeTLB pages for > migration. And that is already wrong but nobody probably cares because those are rarely used. > Wondering how PUD based ones are going to be any different. It is not different, PGD is dubious already.
On 10/03/2018 12:28 PM, Michal Hocko wrote:
> On Wed 03-10-18 07:46:27, Anshuman Khandual wrote:
>>
>>
>> On 10/02/2018 06:09 PM, Michal Hocko wrote:
>>> On Tue 02-10-18 17:45:28, Anshuman Khandual wrote:
>>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs
>>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be
>>>> enabled for migration. It can be achieved through checking for PUD_SHIFT
>>>> order based HugeTLB pages during migration.
>>>
>>> Well a long term problem with hugepage_migration_supported is that it is
>>> used in two different context 1) to bail out from the migration early
>>> because the arch doesn't support migration at all and 2) to use movable
>>> zone for hugetlb pages allocation. I am especially concerned about the
>>> later because the mere support for migration is not really good enough.
>>> Are you really able to find a different giga page during the runtime to
>>> move an existing giga page out of the movable zone?
>>
>> I pre-allocate them before trying to initiate the migration (soft offline
>> in my experiments). Hence it should come from the pre-allocated HugeTLB
>> pool instead from the buddy. I might be missing something here but do we
>> ever allocate HugeTLB on the go when trying to migrate ? IIUC it always
>> came from the pool (unless its something related to ovecommit/surplus).
>> Could you please kindly explain regarding how migration target HugeTLB
>> pages are allocated on the fly from movable zone.
>
> Hotplug comes to mind. You usually do not pre-allocate to cover full
> node going offline. And people would like to do that. Another example is
> CMA. You would really like to move pages out of the way.

You are right.

Hotplug migration:

__offline_pages
	do_migrate_range
		migrate_pages(...new_node_page...)

new_node_page
	new_page_nodemask
		alloc_huge_page_nodemask
			dequeue_huge_page_nodemask	(Getting from pool)
			or
			alloc_migrate_huge_page		(Getting from buddy - non-gigantic)
				alloc_fresh_huge_page
					alloc_buddy_huge_page
						__alloc_pages_nodemask ----> goes into buddy

CMA allocation:

cma_alloc
	alloc_contig_range
		__alloc_contig_migrate_range
			migrate_pages(...alloc_migrate_target...)

alloc_migrate_target
	new_page_nodemask -> __alloc_pages_nodemask ---> goes into buddy

But this is not applicable for gigantic pages for which it backs off way
before going into buddy. With MAX_ORDER as 11 its anything beyond 64MB for
64K pages, 16MB for 16K pages, 4MB for 4K pages etc. So all those bigger
huge pages like 512MB/1GB/16GB will not be part of the HugeTLB/CMA
initiated migrations.

I will look into migration details during auto NUMA, compaction,
memory-failure etc to see if gigantic huge page is allocated from the
buddy with ___alloc_pages_nodemask or with alloc_contig_range().

>
>> But even if there are some chances of run time allocation failure from
>> movable zone (as in point 2) that should not block the very initiation
>> of migration itself. IIUC thats not the semantics for either THP or
>> normal pages. Why should it be different here. If the allocation fails
>> we should report and abort as always. Its the caller of migration taking
>> the chances. why should we prevent that.
>
> Yes I agree, hence the distinction between the arch support for
> migrateability and the criterion on the movable zone placement.

movable zone placement sounds very tricky here.
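[Editorial note: the "backs off way before going into buddy" above corresponds to an explicit guard in the hugetlb allocation helpers. The fragment below is a rough, abridged paraphrase of mm/hugetlb.c from that era, included only as a reference and not quoted from this thread; exact naming and surrounding accounting may differ.]

static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
					    int nid, nodemask_t *nmask)
{
	struct page *page;

	/* gigantic pages exceed MAX_ORDER, the buddy allocator cannot serve them */
	if (hstate_is_gigantic(h))
		return NULL;

	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask);
	if (!page)
		return NULL;

	/* ... surplus accounting trimmed for brevity ... */
	return page;
}

So a migration target for a gigantic page can only ever come from a pre-allocated pool, never from a runtime buddy allocation.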
How can the platform (through the hook huge_movable) before hand say whether destination page could be allocated from the ZONE_MOVABLE without looking into the state of buddy at migration (any sort attempt to do this is going to be expensive) or it merely indicates the desire to live with possible consequence (unable to hot unplug/CMA going forward) for a migration which might end up in an unmovable area. > >>> >>> So I guess we want to split this into two functions >>> arch_hugepage_migration_supported and hugepage_movable. The later would >> >> So the set difference between arch_hugepage_migration_supported and >> hugepage_movable still remains un-migratable ? Then what is the purpose >> for arch_hugepage_migration_supported page size set in the first place. >> Does it mean we allow the migration at the beginning and the abort later >> when the page size does not fall within the subset for hugepage_movable. >> Could you please kindly explain this in more detail. > > The purpose of arch_hugepage_migration_supported is to tell whether it > makes any sense to even try to migration. The allocation placement is Which kind of matches what we have right now and being continued with this proposal in the series. > completely independent on this choice. The later just says whether it is > feasible to place a hugepage to the zone movable. Sure regular 2MB pages What do you exactly mean by feasible ? Wont it depend on the state of the buddy allocator (ZONE_MOVABLE in particular) and it's ability to accommodate a given huge page. How can the platform decide on it ? Or as I mentioned before it's platform's willingness to live with unmovable huge pages (of certain sizes) as a consequence of migration. > do not guarantee movability as well because of the memory fragmentation. > But allocating a 2MB is a completely different storage from 1G or even > larger huge pages, isn't it? Right I understand that. Hotplug/CMA capability goes down more with bigger huge pages being unmovable on the system. > >>> be a reasonably migrateable subset of the former. Without that this >>> patch migth introduce subtle regressions when somebody relies on movable >>> zone to be really movable. >> PUD based HugeTLB pages were never migratable, then how can there be any >> regression here ? > > That means that they haven't been allocated from the movable zone > before. Which is going to change by this patch. The source PUD huge page might have been allocated from movable zone. The denial for migration is explicit and because we dont check for PUD_SHIFT in there and nothing to do with the zone type where the source page belongs. But are you referring to regression caused by something like this. Before the patch: - PUD huge page allocated on ZONE_MOVABLE - Huge page is movable (Hotplug/CMA works) After the patch: - PUD huge page allocated on ZONE_MOVABLE - Migration is successful without checking for destination page's zone - PUD huge page (new) is not on ZONE_MOVABLE - Huge page is unmovable (Hotplug/CMA does not work anymore) -> Regression! > >> At present we even support PGD based HugeTLB pages for >> migration. > > And that is already wrong but nobody probably cares because those are > rarely used. > >> Wondering how PUD based ones are going to be any different. > > It is not different, PGD is dubious already. > Got it.
On 02/10/18 13:56, Anshuman Khandual wrote: > > > On 10/02/2018 06:08 PM, Suzuki K Poulose wrote: >> Hi Anshuman >> >> On 02/10/18 13:15, Anshuman Khandual wrote: >>> Architectures like arm64 have PUD level HugeTLB pages for certain configs >>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >>> enabled for migration. It can be achieved through checking for PUD_SHIFT >>> order based HugeTLB pages during migration. >>> >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>> --- >>> include/linux/hugetlb.h | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >>> index 6b68e34..9c1b77f 100644 >>> --- a/include/linux/hugetlb.h >>> +++ b/include/linux/hugetlb.h >>> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h) >>> { >>> #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION >>> if ((huge_page_shift(h) == PMD_SHIFT) || >>> - (huge_page_shift(h) == PGDIR_SHIFT)) >>> + (huge_page_shift(h) == PUD_SHIFT) || >> >> >>> + (huge_page_shift(h) == PGDIR_SHIFT)) >> >> nit: Extra Tab ^^. > > The tab is in there when you apply this patch and all three checks are tab separated > in a newline. Well, with the patch applied, at least I can see 2 tabs for the PUD_SHIFT check and 3 tabs for PGDIR_SHIFT check. Which seems inconsistent. Is it just me (my mail client) ? Suzuki
On Wed 03-10-18 15:28:23, Anshuman Khandual wrote: > > > On 10/03/2018 12:28 PM, Michal Hocko wrote: > > On Wed 03-10-18 07:46:27, Anshuman Khandual wrote: > >> > >> > >> On 10/02/2018 06:09 PM, Michal Hocko wrote: > >>> On Tue 02-10-18 17:45:28, Anshuman Khandual wrote: > >>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs > >>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be > >>>> enabled for migration. It can be achieved through checking for PUD_SHIFT > >>>> order based HugeTLB pages during migration. > >>> > >>> Well a long term problem with hugepage_migration_supported is that it is > >>> used in two different context 1) to bail out from the migration early > >>> because the arch doesn't support migration at all and 2) to use movable > >>> zone for hugetlb pages allocation. I am especially concerned about the > >>> later because the mere support for migration is not really good enough. > >>> Are you really able to find a different giga page during the runtime to > >>> move an existing giga page out of the movable zone? > >> > >> I pre-allocate them before trying to initiate the migration (soft offline > >> in my experiments). Hence it should come from the pre-allocated HugeTLB > >> pool instead from the buddy. I might be missing something here but do we > >> ever allocate HugeTLB on the go when trying to migrate ? IIUC it always > >> came from the pool (unless its something related to ovecommit/surplus). > >> Could you please kindly explain regarding how migration target HugeTLB > >> pages are allocated on the fly from movable zone. > > > > Hotplug comes to mind. You usually do not pre-allocate to cover full > > node going offline. And people would like to do that. Another example is > > CMA. You would really like to move pages out of the way. > > You are right. > > Hotplug migration: > > __offline_pages > do_migrate_range > migrate_pages(...new_node_page...) > > new_node_page > new_page_nodemask > alloc_huge_page_nodemask > dequeue_huge_page_nodemask (Getting from pool) > or > alloc_migrate_huge_page (Getting from buddy - non-gigantic) > alloc_fresh_huge_page > alloc_buddy_huge_page > __alloc_pages_nodemask ----> goes into buddy > > CMA allocation: > > cma_alloc > alloc_contig_range > __alloc_contig_migrate_range > migrate_pages(...alloc_migrate_target...) > > alloc_migrate_target > new_page_nodemask -> __alloc_pages_nodemask ---> goes into buddy > > But this is not applicable for gigantic pages for which it backs off way > before going into buddy. This is an implementation detail - mostly a missing or an incomplete hugetlb overcommit implementation IIRC. The primary point remains the same. Being able to migrate in principle and feasible enough to migrate to be placed in zone movable are two distinct things. [...] > >> But even if there are some chances of run time allocation failure from > >> movable zone (as in point 2) that should not block the very initiation > >> of migration itself. IIUC thats not the semantics for either THP or > >> normal pages. Why should it be different here. If the allocation fails > >> we should report and abort as always. Its the caller of migration taking > >> the chances. why should we prevent that. > > > > Yes I agree, hence the distinction between the arch support for > > migrateability and the criterion on the movable zone placement. > movable zone placement sounds very tricky here. 
How can the platform > (through the hook huge_movable) before hand say whether destination > page could be allocated from the ZONE_MOVABLE without looking into the > state of buddy at migration (any sort attempt to do this is going to > be expensive) or it merely indicates the desire to live with possible > consequence (unable to hot unplug/CMA going forward) for a migration > which might end up in an unmovable area. I do not follow. The whole point of zone_movable is to provide a physical memory range which is more or less movable. That means that pages allocated from this zone can be migrated away should there be a reason for that. > >>> So I guess we want to split this into two functions > >>> arch_hugepage_migration_supported and hugepage_movable. The later would > >> > >> So the set difference between arch_hugepage_migration_supported and > >> hugepage_movable still remains un-migratable ? Then what is the purpose > >> for arch_hugepage_migration_supported page size set in the first place. > >> Does it mean we allow the migration at the beginning and the abort later > >> when the page size does not fall within the subset for hugepage_movable. > >> Could you please kindly explain this in more detail. > > > > The purpose of arch_hugepage_migration_supported is to tell whether it > > makes any sense to even try to migration. The allocation placement is > > Which kind of matches what we have right now and being continued with this > proposal in the series. Except you only go half way there. Because you still consider "able to migrate" and "feasible to migrate" as the same thing. > > > completely independent on this choice. The later just says whether it is > > feasible to place a hugepage to the zone movable. Sure regular 2MB pages > > What do you exactly mean by feasible ? Wont it depend on the state of the > buddy allocator (ZONE_MOVABLE in particular) and it's ability to accommodate > a given huge page. How can the platform decide on it ? It is not the platform that decides. That is the whole point of the distinction. It is us to say what is feasible and what we want to support. Do we want to support giga pages in zone_movable? Under which conditions? See my point? > Or as I mentioned > before it's platform's willingness to live with unmovable huge pages (of > certain sizes) as a consequence of migration. No, the platform has no saying in that. The platform only says that it supports migrating those pages in principle.
On 10/03/2018 03:52 PM, Suzuki K Poulose wrote: > > > On 02/10/18 13:56, Anshuman Khandual wrote: >> >> >> On 10/02/2018 06:08 PM, Suzuki K Poulose wrote: >>> Hi Anshuman >>> >>> On 02/10/18 13:15, Anshuman Khandual wrote: >>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs >>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >>>> enabled for migration. It can be achieved through checking for PUD_SHIFT >>>> order based HugeTLB pages during migration. >>>> >>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>>> --- >>>> include/linux/hugetlb.h | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >>>> index 6b68e34..9c1b77f 100644 >>>> --- a/include/linux/hugetlb.h >>>> +++ b/include/linux/hugetlb.h >>>> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h) >>>> { >>>> #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION >>>> if ((huge_page_shift(h) == PMD_SHIFT) || >>>> - (huge_page_shift(h) == PGDIR_SHIFT)) >>>> + (huge_page_shift(h) == PUD_SHIFT) || >>> >>> >>>> + (huge_page_shift(h) == PGDIR_SHIFT)) >>> >>> nit: Extra Tab ^^. >> >> The tab is in there when you apply this patch and all three checks are tab separated >> in a newline. > > Well, with the patch applied, at least I can see 2 tabs for the > PUD_SHIFT check and 3 tabs for PGDIR_SHIFT check. Which seems > inconsistent. Is it just me (my mail client) ? I am sorry, you are right. Did not understand your point earlier. Yeah there is increasing number of tabs for each new line with a conditional check. Is there a problem with this style of indentation ? Though I will be happy to change.
On 03/10/18 12:10, Anshuman Khandual wrote: > > > On 10/03/2018 03:52 PM, Suzuki K Poulose wrote: >> >> >> On 02/10/18 13:56, Anshuman Khandual wrote: >>> >>> >>> On 10/02/2018 06:08 PM, Suzuki K Poulose wrote: >>>> Hi Anshuman >>>> >>>> On 02/10/18 13:15, Anshuman Khandual wrote: >>>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs >>>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >>>>> enabled for migration. It can be achieved through checking for PUD_SHIFT >>>>> order based HugeTLB pages during migration. >>>>> >>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>>>> --- >>>>> include/linux/hugetlb.h | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >>>>> index 6b68e34..9c1b77f 100644 >>>>> --- a/include/linux/hugetlb.h >>>>> +++ b/include/linux/hugetlb.h >>>>> @@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h) >>>>> { >>>>> #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION >>>>> if ((huge_page_shift(h) == PMD_SHIFT) || >>>>> - (huge_page_shift(h) == PGDIR_SHIFT)) >>>>> + (huge_page_shift(h) == PUD_SHIFT) || >>>> >>>> >>>>> + (huge_page_shift(h) == PGDIR_SHIFT)) >>>> >>>> nit: Extra Tab ^^. >>> >>> The tab is in there when you apply this patch and all three checks are tab separated >>> in a newline. >> >> Well, with the patch applied, at least I can see 2 tabs for the >> PUD_SHIFT check and 3 tabs for PGDIR_SHIFT check. Which seems >> inconsistent. Is it just me (my mail client) ? > > I am sorry, you are right. Did not understand your point earlier. Yeah there is > increasing number of tabs for each new line with a conditional check. Is there > a problem with this style of indentation ? Though I will be happy to change. I have been under the idea that all the checks at the same level could have the same indentation. (i.e, 2 tabs in this case for each). Looks like there is no rule about it. How about replacing it with a switch..case ? Cheers Suzuki
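[Editorial note: purely for illustration, a sketch of the switch based form being suggested; it was not posted in this thread. One caveat, an observation rather than anything stated by the participants: with folded page table levels (for example arm64 with three levels, where PUD_SHIFT equals PGDIR_SHIFT) two case labels collapse to the same constant and the build fails with a duplicate case value, whereas the existing if () chain merely becomes redundant.]

static inline bool hugepage_migration_supported(struct hstate *h)
{
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
	/* assumes PMD_SHIFT, PUD_SHIFT and PGDIR_SHIFT are all distinct */
	switch (huge_page_shift(h)) {
	case PMD_SHIFT:
	case PUD_SHIFT:
	case PGDIR_SHIFT:
		return true;
	default:
		return false;
	}
#else
	return false;
#endif
}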
On Wed 03-10-18 12:17:52, Suzuki K Poulose wrote: [...] > I have been under the idea that all the checks at the same level could > have the same indentation. (i.e, 2 tabs in this case for each). Looks > like there is no rule about it. How about replacing it with a > switch..case ? I would simply follow the existing indentation style in that function. Is this really worth discussig, anyway? Does the proposed change makes the code harder to read?
On 10/03/2018 04:29 PM, Michal Hocko wrote: > On Wed 03-10-18 15:28:23, Anshuman Khandual wrote: >> >> >> On 10/03/2018 12:28 PM, Michal Hocko wrote: >>> On Wed 03-10-18 07:46:27, Anshuman Khandual wrote: >>>> >>>> >>>> On 10/02/2018 06:09 PM, Michal Hocko wrote: >>>>> On Tue 02-10-18 17:45:28, Anshuman Khandual wrote: >>>>>> Architectures like arm64 have PUD level HugeTLB pages for certain configs >>>>>> (1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be >>>>>> enabled for migration. It can be achieved through checking for PUD_SHIFT >>>>>> order based HugeTLB pages during migration. >>>>> >>>>> Well a long term problem with hugepage_migration_supported is that it is >>>>> used in two different context 1) to bail out from the migration early >>>>> because the arch doesn't support migration at all and 2) to use movable >>>>> zone for hugetlb pages allocation. I am especially concerned about the >>>>> later because the mere support for migration is not really good enough. >>>>> Are you really able to find a different giga page during the runtime to >>>>> move an existing giga page out of the movable zone? >>>> >>>> I pre-allocate them before trying to initiate the migration (soft offline >>>> in my experiments). Hence it should come from the pre-allocated HugeTLB >>>> pool instead from the buddy. I might be missing something here but do we >>>> ever allocate HugeTLB on the go when trying to migrate ? IIUC it always >>>> came from the pool (unless its something related to ovecommit/surplus). >>>> Could you please kindly explain regarding how migration target HugeTLB >>>> pages are allocated on the fly from movable zone. >>> >>> Hotplug comes to mind. You usually do not pre-allocate to cover full >>> node going offline. And people would like to do that. Another example is >>> CMA. You would really like to move pages out of the way. >> >> You are right. >> >> Hotplug migration: >> >> __offline_pages >> do_migrate_range >> migrate_pages(...new_node_page...) >> >> new_node_page >> new_page_nodemask >> alloc_huge_page_nodemask >> dequeue_huge_page_nodemask (Getting from pool) >> or >> alloc_migrate_huge_page (Getting from buddy - non-gigantic) >> alloc_fresh_huge_page >> alloc_buddy_huge_page >> __alloc_pages_nodemask ----> goes into buddy >> >> CMA allocation: >> >> cma_alloc >> alloc_contig_range >> __alloc_contig_migrate_range >> migrate_pages(...alloc_migrate_target...) >> >> alloc_migrate_target >> new_page_nodemask -> __alloc_pages_nodemask ---> goes into buddy >> >> But this is not applicable for gigantic pages for which it backs off way >> before going into buddy. > > This is an implementation detail - mostly a missing or an incomplete > hugetlb overcommit implementation IIRC. The primary point remains the > same. Being able to migrate in principle and feasible enough to migrate > to be placed in zone movable are two distinct things. I agree. They are two distinct things. > [...] >>>> But even if there are some chances of run time allocation failure from >>>> movable zone (as in point 2) that should not block the very initiation >>>> of migration itself. IIUC thats not the semantics for either THP or >>>> normal pages. Why should it be different here. If the allocation fails >>>> we should report and abort as always. Its the caller of migration taking >>>> the chances. why should we prevent that. >>> >>> Yes I agree, hence the distinction between the arch support for >>> migrateability and the criterion on the movable zone placement. 
>> movable zone placement sounds very tricky here. How can the platform >> (through the hook huge_movable) before hand say whether destination >> page could be allocated from the ZONE_MOVABLE without looking into the >> state of buddy at migration (any sort attempt to do this is going to >> be expensive) or it merely indicates the desire to live with possible >> consequence (unable to hot unplug/CMA going forward) for a migration >> which might end up in an unmovable area. > > I do not follow. The whole point of zone_movable is to provide a > physical memory range which is more or less movable. That means that > pages allocated from this zone can be migrated away should there be a > reason for that. I understand this. > >>>>> So I guess we want to split this into two functions >>>>> arch_hugepage_migration_supported and hugepage_movable. The later would >>>> >>>> So the set difference between arch_hugepage_migration_supported and >>>> hugepage_movable still remains un-migratable ? Then what is the purpose >>>> for arch_hugepage_migration_supported page size set in the first place. >>>> Does it mean we allow the migration at the beginning and the abort later >>>> when the page size does not fall within the subset for hugepage_movable. >>>> Could you please kindly explain this in more detail. >>> >>> The purpose of arch_hugepage_migration_supported is to tell whether it >>> makes any sense to even try to migration. The allocation placement is >> >> Which kind of matches what we have right now and being continued with this >> proposal in the series. > > Except you only go half way there. Because you still consider "able to > migrate" and "feasible to migrate" as the same thing. Okay. > >> >>> completely independent on this choice. The later just says whether it is >>> feasible to place a hugepage to the zone movable. Sure regular 2MB pages >> >> What do you exactly mean by feasible ? Wont it depend on the state of the >> buddy allocator (ZONE_MOVABLE in particular) and it's ability to accommodate >> a given huge page. How can the platform decide on it ? > > It is not the platform that decides. That is the whole point of the > distinction. It is us to say what is feasible and what we want to > support. Do we want to support giga pages in zone_movable? Under which > conditions? See my point? So huge_movable() is going to be a generic MM function deciding on the feasibility for allocating a huge page of 'size' from movable zone during migration. If the feasibility turns out to be negative, then migration process is aborted there. huge_movable() will do something like these: - Return positive right away on smaller size huge pages - Measure movable allocation feasibility for bigger huge pages - Look out for free_pages in the huge page order in movable areas - if (order > (MAX_ORDER - 1)) - Scan the PFN ranges in movable zone for possible allocation - etc - etc Did I get this right ? > >> Or as I mentioned >> before it's platform's willingness to live with unmovable huge pages (of >> certain sizes) as a consequence of migration. > > No, the platform has no saying in that. The platform only says that it > supports migrating those pages in principle. I understand this now.
On Wed 03-10-18 17:07:13, Anshuman Khandual wrote:
>
>
> On 10/03/2018 04:29 PM, Michal Hocko wrote:
[...]
> > It is not the platform that decides. That is the whole point of the
> > distinction. It is us to say what is feasible and what we want to
> > support. Do we want to support giga pages in zone_movable? Under which
> > conditions? See my point?
>
> So huge_movable() is going to be a generic MM function deciding on the
> feasibility for allocating a huge page of 'size' from movable zone during
> migration.

Yeah, this might be a more complex logic than just the size check. If
there is a sufficient pre-allocated pool to migrate the page off it
might be pre-reserved for future migration etc... Nothing to be done
right now of course.

> If the feasibility turns out to be negative, then migration
> process is aborted there.

You are still confusing allocation and migration here I am afraid. The
whole "feasible to migrate" is for the _allocation_ time when we decide
whether the new page should be placed in zone_movable or not.

> huge_movable() will do something like these:
>
> - Return positive right away on smaller size huge pages
> - Measure movable allocation feasibility for bigger huge pages
> 	- Look out for free_pages in the huge page order in movable areas
> 	- if (order > (MAX_ORDER - 1))
> 		- Scan the PFN ranges in movable zone for possible allocation
> 	- etc
> 	- etc
>
> Did I get this right ?

Well, not really. I was thinking of something like this for the
beginning

	if (!arch_hugepage_migration_supporte())
		return false;
	if (hstate_is_gigantic(h))
		return false;
	return true;

further changes might be done on top of this.
On 10/03/2018 05:18 PM, Michal Hocko wrote: > On Wed 03-10-18 17:07:13, Anshuman Khandual wrote: >> >> >> On 10/03/2018 04:29 PM, Michal Hocko wrote: > [...] >>> It is not the platform that decides. That is the whole point of the >>> distinction. It is us to say what is feasible and what we want to >>> support. Do we want to support giga pages in zone_movable? Under which >>> conditions? See my point? >> >> So huge_movable() is going to be a generic MM function deciding on the >> feasibility for allocating a huge page of 'size' from movable zone during >> migration. > > Yeah, this might be a more complex logic than just the size check. If > there is a sufficient pre-allocated pool to migrate the page off it > might be pre-reserved for future migration etc... Nothing to be done > right now of course. If the huge page has a pre-allocated pool, then it gets consumed first through the current allocator logic (new_page_nodemask). Hence testing for feasibility by looking into pool and (buddy / zone) together is not going to change the policy unless there is also a new allocator which goes and consumes (from reserved pool or buddy/zone) huge pages as envisioned by the feasibility checker. But I understand your point. That path can be explored as well. > >> If the feasibility turns out to be negative, then migration >> process is aborted there. > > You are still confusing allocation and migration here I am afraid. The > whole "feasible to migrate" is for the _allocation_ time when we decide > whether the new page should be placed in zone_movable or not. migrate_pages() -> platform specific arch_hugetlb_migration(in principle) -> generic huge_movable() feasibility check while trying to allocate the destination page -> move source to destination -> complete ! So we have two checks here 1) platform specific arch_hugetlb_migration -> In principle go ahead 2) huge_movable() during allocation - If huge page does not have to be placed on movable zone - Allocate any where successfully and done ! - If huge page *should* be placed on a movable zone - Try allocating on movable zone - Successfull and done ! - If the new page could not be allocated on movable zone - Abort the migration completely OR - Warn and fall back to non-movable There is an important point to note here. - Whether a huge size should be on movable zone can be determined looking into size and other parameters during feasibility test - But whether a huge size can be allocated in actual on movable zone might not be determined without really allocating it which will further delay the decision to successfully complete the migration, warning about it or aborting it at this allocation phase itself > >> huge_movable() will do something like these: >> >> - Return positive right away on smaller size huge pages >> - Measure movable allocation feasibility for bigger huge pages >> - Look out for free_pages in the huge page order in movable areas >> - if (order > (MAX_ORDER - 1)) >> - Scan the PFN ranges in movable zone for possible allocation >> - etc >> - etc >> >> Did I get this right ? > > Well, not really. I was thinking of something like this for the > beginning > if (!arch_hugepage_migration_supporte()) > return false; First check: Platform check in principle as you mentioned before > if (hstate_is_gigantic(h)) > return false; Second check: Simplistic generic allocation feasibility check looking just at size > return true; > > further changes might be done on top of this. Right.
On Wed 03-10-18 18:36:39, Anshuman Khandual wrote: [...] > So we have two checks here > > 1) platform specific arch_hugetlb_migration -> In principle go ahead > > 2) huge_movable() during allocation > > - If huge page does not have to be placed on movable zone > > - Allocate any where successfully and done ! > > - If huge page *should* be placed on a movable zone > > - Try allocating on movable zone > > - Successfull and done ! > > - If the new page could not be allocated on movable zone > > - Abort the migration completely > > OR > > - Warn and fall back to non-movable I guess you are still making it more complicated than necessary. The later is really only about __GFP_MOVABLE at this stage. I would just make it simple for now. We do not have to implement any dynamic heuristic right now. All that I am asking for is to split the migrate possible part from movable part. I should have been more clear about that I guess from my very first reply. I do like how you moved the current coarse grained hugepage_migration_supported to be more arch specific but I merely wanted to point out that we need to do some other changes before we can go that route and that thing is to distinguish movable from migration supported. See my point?
On 10/03/2018 07:06 PM, Michal Hocko wrote:
> On Wed 03-10-18 18:36:39, Anshuman Khandual wrote:
> [...]
>> So we have two checks here
>>
>> 1) platform specific arch_hugetlb_migration -> In principle go ahead
>>
>> 2) huge_movable() during allocation
>>
>> 	- If huge page does not have to be placed on movable zone
>>
>> 		- Allocate any where successfully and done !
>>
>> 	- If huge page *should* be placed on a movable zone
>>
>> 		- Try allocating on movable zone
>>
>> 			- Successfull and done !
>>
>> 		- If the new page could not be allocated on movable zone
>>
>> 			- Abort the migration completely
>>
>> 			OR
>>
>> 			- Warn and fall back to non-movable
>
> I guess you are still making it more complicated than necessary. The
> later is really only about __GFP_MOVABLE at this stage. I would just
> make it simple for now. We do not have to implement any dynamic
> heuristic right now. All that I am asking for is to split the migrate
> possible part from movable part.
>
> I should have been more clear about that I guess from my very first
> reply. I do like how you moved the current coarse grained
> hugepage_migration_supported to be more arch specific but I merely
> wanted to point out that we need to do some other changes before we can
> go that route and that thing is to distinguish movable from migration
> supported.
>
> See my point?

Does the following sound close enough to what you are looking for ?

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9df1d59..070c419 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -504,6 +504,13 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 	return arch_hugetlb_migration_supported(h);
 }
 
+static inline bool hugepage_movable_required(struct hstate *h)
+{
+	if (hstate_is_gigantic(h))
+		return true;
+	return false;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
@@ -600,6 +607,11 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 	return false;
 }
 
+static inline bool hugepage_movable_required(struct hstate *h)
+{
+	return false;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3c21775..8b0afdc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1635,6 +1635,9 @@ struct page *alloc_huge_page_node(struct hstate *h, int nid)
 	if (nid != NUMA_NO_NODE)
 		gfp_mask |= __GFP_THISNODE;
 
+	if (hugepage_movable_required(h))
+		gfp_mask |= __GFP_MOVABLE;
+
 	spin_lock(&hugetlb_lock);
 	if (h->free_huge_pages - h->resv_huge_pages > 0)
 		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
@@ -1652,6 +1655,9 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h);
 
+	if (hugepage_movable_required(h))
+		gfp_mask |= __GFP_MOVABLE;
+
 	spin_lock(&hugetlb_lock);
 	if (h->free_huge_pages - h->resv_huge_pages > 0) {
 		struct page *page;
On Fri 05-10-18 13:04:43, Anshuman Khandual wrote: > Does the following sound close enough to what you are looking for ? I do not think so > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index 9df1d59..070c419 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -504,6 +504,13 @@ static inline bool hugepage_migration_supported(struct hstate *h) > return arch_hugetlb_migration_supported(h); > } > > +static inline bool hugepage_movable_required(struct hstate *h) > +{ > + if (hstate_is_gigantic(h)) > + return true; > + return false; > +} > + Apart from naming (hugepage_movable_supported?) the above doesn't do the most essential thing to query whether the hugepage migration is supported at all. Apart from that i would expect the logic to be revers. We do not really support giga pages migration enough to support them in movable zone. > @@ -1652,6 +1655,9 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, > { > gfp_t gfp_mask = htlb_alloc_mask(h); > > + if (hugepage_movable_required(h)) > + gfp_mask |= __GFP_MOVABLE; > + And besides that this really want to live in htlb_alloc_mask because this is really an allocation policy. It would be unmap_and_move_huge_page to call hugepage_migration_supported. The later is the one to allow for an arch specific override. Makes sense?
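[Editorial note: for context on where the migration-side check lives, the fragment below is a paraphrase of mm/migrate.c:unmap_and_move_huge_page() from the kernel of that era, not text from this thread. Keeping this check at migration time and moving the placement decision into htlb_alloc_mask() is the split being asked for.]

	/*
	 * "Can this huge page size be migrated at all" is answered here,
	 * independently of whether the page was placed in ZONE_MOVABLE
	 * at allocation time.
	 */
	if (!hugepage_migration_supported(page_hstate(hpage))) {
		putback_active_hugepage(hpage);
		return -ENOSYS;
	}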
On 10/09/2018 07:44 PM, Michal Hocko wrote:
> On Fri 05-10-18 13:04:43, Anshuman Khandual wrote:
>> Does the following sound close enough to what you are looking for ?
>
> I do not think so

Okay.

>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index 9df1d59..070c419 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -504,6 +504,13 @@ static inline bool hugepage_migration_supported(struct hstate *h)
>>  	return arch_hugetlb_migration_supported(h);
>>  }
>>
>> +static inline bool hugepage_movable_required(struct hstate *h)
>> +{
>> +	if (hstate_is_gigantic(h))
>> +		return true;
>> +	return false;
>> +}
>> +
>
> Apart from naming (hugepage_movable_supported?) the above doesn't do the
> most essential thing to query whether the hugepage migration is
> supported at all. Apart from that i would expect the logic to be revers.

My assumption was hugepage_migration_supported() should only be called
from unmap_and_move_huge_page() but we can add that here as well to limit
the set further.

> We do not really support giga pages migration enough to support them in
> movable zone.

Reversing the logic here would change gfp_t for a large number of huge
pages. Current implementation is very liberal in assigning __GFP_MOVABLE
for multiple huge page sizes (all most all of them when migration is
enabled). But I guess it should be okay because after all we are tying to
control which all sizes should be movable and which all should not be.

static inline bool hugepage_migration_supported(struct hstate *h)
{
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
	if ((huge_page_shift(h) == PMD_SHIFT) ||
			(huge_page_shift(h) == PGDIR_SHIFT))
		return true;
	else
		return false;
#else
	return false;
#endif
}

static inline gfp_t htlb_alloc_mask(struct hstate *h)
{
	if (hugepage_migration_supported(h))
		return GFP_HIGHUSER_MOVABLE;
	else
		return GFP_HIGHUSER;
}

>> @@ -1652,6 +1655,9 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
>>  {
>>  	gfp_t gfp_mask = htlb_alloc_mask(h);
>>
>> +	if (hugepage_movable_required(h))
>> +		gfp_mask |= __GFP_MOVABLE;
>> +
>
> And besides that this really want to live in htlb_alloc_mask because
> this is really an allocation policy. It would be unmap_and_move_huge_page
> to call hugepage_migration_supported. The later is the one to allow for
> an arch specific override.
>
> Makes sense?
>

hugepage_migration_supported() will be checked inside
hugepage_movable_supported(). A changed version ....
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9df1d59..4bcbf1e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -504,6 +504,16 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 	return arch_hugetlb_migration_supported(h);
 }
 
+static inline bool hugepage_movable_supported(struct hstate *h)
+{
+	if (!hugepage_migration_supported(h))	--> calls arch override restricting the set
+		return false;
+
+	if (hstate_is_gigantic(h)		--------> restricts the set further
+		return false;
+	return true;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
@@ -600,6 +610,11 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 	return false;
 }
 
+static inline bool hugepage_movable_supported(struct hstate *h)
+{
+	return false;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3c21775..a5a111d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -919,7 +919,7 @@ static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask,
 /* Movability of hugepages depends on migration support. */
 static inline gfp_t htlb_alloc_mask(struct hstate *h)
 {
-	if (hugepage_migration_supported(h))
+	if (hugepage_movable_supported(h))
 		return GFP_HIGHUSER_MOVABLE;
 	else
 		return GFP_HIGHUSER;

The above patch is in addition to the following later patch in the series.

mm/hugetlb: Enable arch specific huge page size support for migration

Architectures like arm64 have HugeTLB page sizes which are different
than generic sizes at PMD, PUD, PGD level and implemented via contiguous
bits. At present these special size HugeTLB pages cannot be identified
through macros like (PMD|PUD|PGDIR)_SHIFT and hence chosen not be
migrated.

Enabling migration support for these special HugeTLB page sizes along
with the generic ones (PMD|PUD|PGD) would require identifying all of
them on a given platform. A platform specific hook can precisely
enumerate all huge page sizes supported for migration. Instead of
comparing against standard huge page orders let
hugetlb_migration_support() function call a platform hook
arch_hugetlb_migration_support(). Default definition for the platform
hook maintains existing semantics which checks standard huge page order.
But an architecture can choose to override the default and provide
support for a comprehensive set of huge page sizes.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9c1b77f..9df1d59 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -479,18 +479,29 @@ static inline pgoff_t basepage_index(struct page *page)
 extern int dissolve_free_huge_page(struct page *page);
 extern int dissolve_free_huge_pages(unsigned long start_pfn,
 				unsigned long end_pfn);
-static inline bool hugepage_migration_supported(struct hstate *h)
-{
+
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
+#ifndef arch_hugetlb_migration_supported
+static inline bool arch_hugetlb_migration_supported(struct hstate *h)
+{
 	if ((huge_page_shift(h) == PMD_SHIFT) ||
 		(huge_page_shift(h) == PUD_SHIFT) ||
 			(huge_page_shift(h) == PGDIR_SHIFT))
 		return true;
 	else
 		return false;
+}
+#endif
 #else
+static inline bool arch_hugetlb_migration_supported(struct hstate *h)
+{
 	return false;
+}
 #endif
+
+static inline bool hugepage_migration_supported(struct hstate *h)
+{
+	return arch_hugetlb_migration_supported(h);
 }
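[Editorial note: for the arm64 side this later patch is preparing for (Suzuki's earlier question about an arm64 specific backend), an override might look roughly like the sketch below. This is illustrative only - the actual arm64 patch comes later in the series and is not shown here - and it assumes arm64's existing CONT_PTE_SIZE/CONT_PMD_SIZE contiguous hugepage macros.]

/* arch/arm64/include/asm/hugetlb.h (sketch, assumed placement) */
#define arch_hugetlb_migration_supported arch_hugetlb_migration_supported
extern bool arch_hugetlb_migration_supported(struct hstate *h);

/* arch/arm64/mm/hugetlbpage.c (sketch) */
bool arch_hugetlb_migration_supported(struct hstate *h)
{
	size_t pagesize = huge_page_size(h);

	switch (pagesize) {
#ifdef CONFIG_ARM64_4K_PAGES
	case PUD_SIZE:		/* 1GB with 4K base pages */
#endif
	case PMD_SIZE:		/* block mapping at PMD level */
	case CONT_PMD_SIZE:	/* contiguous PMD hugepages */
	case CONT_PTE_SIZE:	/* contiguous PTE hugepages */
		return true;
	}

	pr_warn("%s: unrecognized huge page size 0x%lx\n", __func__, pagesize);
	return false;
}

Defining the arch_hugetlb_migration_supported macro in the arch header is what disables the #ifndef-guarded default in the generic header.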
On Wed 10-10-18 08:39:22, Anshuman Khandual wrote: [...] > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index 9df1d59..4bcbf1e 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -504,6 +504,16 @@ static inline bool hugepage_migration_supported(struct hstate *h) > return arch_hugetlb_migration_supported(h); > } > > +static inline bool hugepage_movable_supported(struct hstate *h) > +{ > + if (!hugepage_migration_supported(h)) --> calls arch override restricting the set > + return false; > + > + if (hstate_is_gigantic(h) --------> restricts the set further > + return false; > + return true; > +} > + > static inline spinlock_t *huge_pte_lockptr(struct hstate *h, > struct mm_struct *mm, pte_t *pte) > { > @@ -600,6 +610,11 @@ static inline bool hugepage_migration_supported(struct hstate *h) > return false; > } > > +static inline bool hugepage_movable_supported(struct hstate *h) > +{ > + return false; > +} > + > static inline spinlock_t *huge_pte_lockptr(struct hstate *h, > struct mm_struct *mm, pte_t *pte) > { > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 3c21775..a5a111d 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -919,7 +919,7 @@ static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, > /* Movability of hugepages depends on migration support. */ > static inline gfp_t htlb_alloc_mask(struct hstate *h) > { > - if (hugepage_migration_supported(h)) > + if (hugepage_movable_supported(h)) > return GFP_HIGHUSER_MOVABLE; > else > return GFP_HIGHUSER; Exactly what I've had in mind. It would be great to have a comment in hugepage_movable_supported to explain why we are not supporting giga pages even though they are migrateable and why we need that distinction. > The above patch is in addition to the following later patch in the series. [...] > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index 9c1b77f..9df1d59 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -479,18 +479,29 @@ static inline pgoff_t basepage_index(struct page *page) > extern int dissolve_free_huge_page(struct page *page); > extern int dissolve_free_huge_pages(unsigned long start_pfn, > unsigned long end_pfn); > -static inline bool hugepage_migration_supported(struct hstate *h) > -{ > + > #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION > +#ifndef arch_hugetlb_migration_supported > +static inline bool arch_hugetlb_migration_supported(struct hstate *h) > +{ > if ((huge_page_shift(h) == PMD_SHIFT) || > (huge_page_shift(h) == PUD_SHIFT) || > (huge_page_shift(h) == PGDIR_SHIFT)) > return true; > else > return false; > +} > +#endif > #else > +static inline bool arch_hugetlb_migration_supported(struct hstate *h) > +{ > return false; > +} > #endif > + > +static inline bool hugepage_migration_supported(struct hstate *h) > +{ > + return arch_hugetlb_migration_supported(h); > } Yes making hugepage_migration_supported to have an arch override is definitely the right thing to do. Whether the above approach rather than a weak symbol is better is a matter of taste and I do not feel strongly about that.
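[Editorial note: the weak-symbol route mentioned above would look roughly like the sketch below; this is an assumption for illustration, not code from the series.]

/* include/linux/hugetlb.h */
bool arch_hugetlb_migration_supported(struct hstate *h);

/* mm/hugetlb.c -- weak default, overridden at link time by a strong arch definition */
bool __weak arch_hugetlb_migration_supported(struct hstate *h)
{
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
	return (huge_page_shift(h) == PMD_SHIFT) ||
	       (huge_page_shift(h) == PUD_SHIFT) ||
	       (huge_page_shift(h) == PGDIR_SHIFT);
#else
	return false;
#endif
}

The trade-off is that the default moves out of line into mm/hugetlb.c and can no longer be a static inline in the header, which is presumably one reason the #ifndef/#define override pattern was kept.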
On 10/10/2018 03:09 PM, Michal Hocko wrote: > On Wed 10-10-18 08:39:22, Anshuman Khandual wrote: > [...] >> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >> index 9df1d59..4bcbf1e 100644 >> --- a/include/linux/hugetlb.h >> +++ b/include/linux/hugetlb.h >> @@ -504,6 +504,16 @@ static inline bool hugepage_migration_supported(struct hstate *h) >> return arch_hugetlb_migration_supported(h); >> } >> >> +static inline bool hugepage_movable_supported(struct hstate *h) >> +{ >> + if (!hugepage_migration_supported(h)) --> calls arch override restricting the set >> + return false; >> + >> + if (hstate_is_gigantic(h) --------> restricts the set further >> + return false; >> + return true; >> +} >> + >> static inline spinlock_t *huge_pte_lockptr(struct hstate *h, >> struct mm_struct *mm, pte_t *pte) >> { >> @@ -600,6 +610,11 @@ static inline bool hugepage_migration_supported(struct hstate *h) >> return false; >> } >> >> +static inline bool hugepage_movable_supported(struct hstate *h) >> +{ >> + return false; >> +} >> + >> static inline spinlock_t *huge_pte_lockptr(struct hstate *h, >> struct mm_struct *mm, pte_t *pte) >> { >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index 3c21775..a5a111d 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -919,7 +919,7 @@ static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, >> /* Movability of hugepages depends on migration support. */ >> static inline gfp_t htlb_alloc_mask(struct hstate *h) >> { >> - if (hugepage_migration_supported(h)) >> + if (hugepage_movable_supported(h)) >> return GFP_HIGHUSER_MOVABLE; >> else >> return GFP_HIGHUSER; > > Exactly what I've had in mind. It would be great to have a comment in > hugepage_movable_supported to explain why we are not supporting giga > pages even though they are migrateable and why we need that distinction. sure, will do. > >> The above patch is in addition to the following later patch in the series. > [...] >> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >> index 9c1b77f..9df1d59 100644 >> --- a/include/linux/hugetlb.h >> +++ b/include/linux/hugetlb.h >> @@ -479,18 +479,29 @@ static inline pgoff_t basepage_index(struct page *page) >> extern int dissolve_free_huge_page(struct page *page); >> extern int dissolve_free_huge_pages(unsigned long start_pfn, >> unsigned long end_pfn); >> -static inline bool hugepage_migration_supported(struct hstate *h) >> -{ >> + >> #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION >> +#ifndef arch_hugetlb_migration_supported >> +static inline bool arch_hugetlb_migration_supported(struct hstate *h) >> +{ >> if ((huge_page_shift(h) == PMD_SHIFT) || >> (huge_page_shift(h) == PUD_SHIFT) || >> (huge_page_shift(h) == PGDIR_SHIFT)) >> return true; >> else >> return false; >> +} >> +#endif >> #else >> +static inline bool arch_hugetlb_migration_supported(struct hstate *h) >> +{ >> return false; >> +} >> #endif >> + >> +static inline bool hugepage_migration_supported(struct hstate *h) >> +{ >> + return arch_hugetlb_migration_supported(h); >> } > > Yes making hugepage_migration_supported to have an arch override is > definitely the right thing to do. Whether the above approach rather than > a weak symbol is better is a matter of taste and I do not feel strongly > about that. Okay then, will carry this forward and re-spin the patch series. Thank you for your detailed review till now.
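[Editorial note: as a closing reference, the commented helper promised above might end up looking something like this sketch; the comment wording is illustrative and not taken from the respun series.]

static inline bool hugepage_movable_supported(struct hstate *h)
{
	if (!hugepage_migration_supported(h))
		return false;

	/*
	 * Being migratable in principle is not the same as being a safe
	 * fit for ZONE_MOVABLE. Gigantic pages exceed MAX_ORDER, so a
	 * replacement page cannot be allocated from the buddy allocator
	 * at migration time and would have to come from a pre-allocated
	 * pool. Until that is reliable, keep them out of the movable zone
	 * so that memory hotremove and CMA are not blocked by them.
	 */
	if (hstate_is_gigantic(h))
		return false;

	return true;
}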
Architectures like arm64 have PUD level HugeTLB pages for certain configs
(1GB huge page is PUD based on ARM64_4K_PAGES base page size) that can be
enabled for migration. It can be achieved through checking for PUD_SHIFT
order based HugeTLB pages during migration.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/hugetlb.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6b68e34..9c1b77f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -483,7 +483,8 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
 	if ((huge_page_shift(h) == PMD_SHIFT) ||
-		(huge_page_shift(h) == PGDIR_SHIFT))
+		(huge_page_shift(h) == PUD_SHIFT) ||
+			(huge_page_shift(h) == PGDIR_SHIFT))
 		return true;
 	else
 		return false;