Message ID | 20201213154534.54826-10-songmuchun@bytedance.com (mailing list archive)
---|---
State | New, archived
Series | Free some vmemmap pages of HugeTLB page
On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> field in the hstate to indicate how many vmemmap pages associated with
> a HugeTLB page that we can free to buddy allocator. And initialize it

"can be freed to buddy allocator"

> in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> feature.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>

With below nits addressed you can add:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> static int __init early_hugetlb_free_vmemmap_param(char *buf)
> {
> +	/* We cannot optimize if a "struct page" crosses page boundaries. */
> +	if (!is_power_of_2(sizeof(struct page)))
> +		return 0;
> +

I wonder if we should report a warning in case someone wants to enable this
feature and struct page size is not a power of 2, in case someone wonders
why it does not work for him/her.

> +void __init hugetlb_vmemmap_init(struct hstate *h)
> +{
> +	unsigned int nr_pages = pages_per_huge_page(h);
> +	unsigned int vmemmap_pages;
> +
> +	if (!hugetlb_free_vmemmap_enabled)
> +		return;
> +
> +	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> +	/*
> +	 * The head page and the first tail page are not to be freed to buddy
> +	 * system, the others page will map to the first tail page. So there
> +	 * are the remaining pages that can be freed.

"the other pages will map to the first tail page, so they can be freed."

> +	 *
> +	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> +	 * on some architectures (e.g. aarch64). See Documentation/arm64/
> +	 * hugetlbpage.rst for more details.
> +	 */
> +	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> +		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> +
> +	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> +		h->name);

Maybe specify this is hugetlb code:

pr_info("%s: blabla", __func__, ...)
or
pr_info("hugetlb: blalala", ...);

although I am not sure whether we need that at all, or maybe just use
pr_debug().
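For illustration, the prefixed variants Oscar suggests might look like the
sketch below; the "hugetlb:" prefix and the pr_debug() alternative come from
his review, and the exact strings are hypothetical:

	/* Prefix the message so its origin is obvious in dmesg. */
	pr_info("hugetlb: can free %d vmemmap pages for %s\n",
		h->nr_free_vmemmap_pages, h->name);

	/* Or, if the boot log should stay quiet by default: */
	pr_debug("hugetlb: can free %d vmemmap pages for %s\n",
		 h->nr_free_vmemmap_pages, h->name);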
On Wed, Dec 16, 2020 at 9:44 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> > All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> > field in the hstate to indicate how many vmemmap pages associated with
> > a HugeTLB page that we can free to buddy allocator. And initialize it
>
> "can be freed to buddy allocator"
>
> > in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> > feature.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> With below nits addressed you can add:
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks.

> > static int __init early_hugetlb_free_vmemmap_param(char *buf)
> > {
> > +	/* We cannot optimize if a "struct page" crosses page boundaries. */
> > +	if (!is_power_of_2(sizeof(struct page)))
> > +		return 0;
> > +
>
> I wonder if we should report a warning in case someone wants to enable this
> feature and struct page size is not a power of 2, in case someone wonders
> why it does not work for him/her.
>
> > +void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > +	unsigned int nr_pages = pages_per_huge_page(h);
> > +	unsigned int vmemmap_pages;
> > +
> > +	if (!hugetlb_free_vmemmap_enabled)
> > +		return;
> > +
> > +	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> > +	/*
> > +	 * The head page and the first tail page are not to be freed to buddy
> > +	 * system, the others page will map to the first tail page. So there
> > +	 * are the remaining pages that can be freed.
>
> "the other pages will map to the first tail page, so they can be freed."
>
> > +	 *
> > +	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> > +	 * on some architectures (e.g. aarch64). See Documentation/arm64/
> > +	 * hugetlbpage.rst for more details.
> > +	 */
> > +	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> > +		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> > +
> > +	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> > +		h->name);
>
> Maybe specify this is hugetlb code:
>
> pr_info("%s: blabla", __func__, ...)
> or
> pr_info("hugetlb: blalala", ...);
>
> although I am not sure whether we need that at all, or maybe just use
> pr_debug().

The pr_info can tell the user whether the feature is enabled. From this
point of view, it makes sense. Right?

Thanks.

>
> --
> Oscar Salvador
> SUSE L3
On Wed, Dec 16, 2020 at 09:56:47PM +0800, Muchun Song wrote:
> The pr_info can tell the user whether the feature is enabled. From this
> point of view, it makes sense. Right?

Well, I guess so. Anyway, it is not that we are going to flood the logs,
so it is ok.
On Wed, Dec 16, 2020 at 9:44 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> > All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> > field in the hstate to indicate how many vmemmap pages associated with
> > a HugeTLB page that we can free to buddy allocator. And initialize it
>
> "can be freed to buddy allocator"
>
> > in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> > feature.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> With below nits addressed you can add:
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>
> > static int __init early_hugetlb_free_vmemmap_param(char *buf)
> > {
> > +	/* We cannot optimize if a "struct page" crosses page boundaries. */
> > +	if (!is_power_of_2(sizeof(struct page)))
> > +		return 0;
> > +
>
> I wonder if we should report a warning in case someone wants to enable this
> feature and struct page size is not a power of 2, in case someone wonders
> why it does not work for him/her.

Agree. I think that we should add a warning message here.

> > +void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > +	unsigned int nr_pages = pages_per_huge_page(h);
> > +	unsigned int vmemmap_pages;
> > +
> > +	if (!hugetlb_free_vmemmap_enabled)
> > +		return;
> > +
> > +	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> > +	/*
> > +	 * The head page and the first tail page are not to be freed to buddy
> > +	 * system, the others page will map to the first tail page. So there
> > +	 * are the remaining pages that can be freed.
>
> "the other pages will map to the first tail page, so they can be freed."
>
> > +	 *
> > +	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> > +	 * on some architectures (e.g. aarch64). See Documentation/arm64/
> > +	 * hugetlbpage.rst for more details.
> > +	 */
> > +	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> > +		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> > +
> > +	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> > +		h->name);
>
> Maybe specify this is hugetlb code:
>
> pr_info("%s: blabla", __func__, ...)
> or
> pr_info("hugetlb: blalala", ...);
>
> although I am not sure whether we need that at all, or maybe just use
> pr_debug().
>
> --
> Oscar Salvador
> SUSE L3
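A minimal sketch of what the agreed-on warning might look like, assuming it
lands in early_hugetlb_free_vmemmap_param(); the pr_warn() text is
hypothetical, the rest mirrors the hunk in the patch below:

static int __init early_hugetlb_free_vmemmap_param(char *buf)
{
	/* We cannot optimize if a "struct page" crosses page boundaries. */
	if (!is_power_of_2(sizeof(struct page))) {
		/* Hypothetical message: say why the option is ignored
		 * instead of failing silently. */
		pr_warn("hugetlb: cannot free vmemmap pages, struct page size is not power of 2\n");
		return 0;
	}

	if (!buf)
		return -EINVAL;

	/* ... parameter parsing continues as in the patch below ... */
	return 0;
}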
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7f47f0eeca3b..66d82ae7b712 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -492,6 +492,9 @@ struct hstate {
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+	unsigned int nr_free_vmemmap_pages;
+#endif
 #ifdef CONFIG_CGROUP_HUGETLB
 	/* cgroup control files */
 	struct cftype cgroup_files_dfl[7];
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b0847b2ce01d..2b45235a70e9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3323,6 +3323,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	h->next_nid_to_free = first_memory_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
+	hugetlb_vmemmap_init(h);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 64ad929cac61..d3b4c39f67c0 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -184,6 +184,10 @@ bool hugetlb_free_vmemmap_enabled;
 
 static int __init early_hugetlb_free_vmemmap_param(char *buf)
 {
+	/* We cannot optimize if a "struct page" crosses page boundaries. */
+	if (!is_power_of_2(sizeof(struct page)))
+		return 0;
+
 	if (!buf)
 		return -EINVAL;
 
@@ -222,3 +226,28 @@ void free_huge_page_vmemmap(struct hstate *h, struct page *head)
 	vmemmap_remap_reuse(vmemmap_addr + RESERVE_VMEMMAP_SIZE,
 			    free_vmemmap_pages_size_per_hpage(h));
 }
+
+void __init hugetlb_vmemmap_init(struct hstate *h)
+{
+	unsigned int nr_pages = pages_per_huge_page(h);
+	unsigned int vmemmap_pages;
+
+	if (!hugetlb_free_vmemmap_enabled)
+		return;
+
+	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
+	/*
+	 * The head page and the first tail page are not to be freed to buddy
+	 * system, the others page will map to the first tail page. So there
+	 * are the remaining pages that can be freed.
+	 *
+	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
+	 * on some architectures (e.g. aarch64). See Documentation/arm64/
+	 * hugetlbpage.rst for more details.
+	 */
+	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
+		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
+
+	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
+		h->name);
+}
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index b2c8d2f11d48..8fd9ae113dbd 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -13,17 +13,15 @@
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 void alloc_huge_page_vmemmap(struct hstate *h, struct page *head);
 void free_huge_page_vmemmap(struct hstate *h, struct page *head);
+void hugetlb_vmemmap_init(struct hstate *h);
 
 /*
  * How many vmemmap pages associated with a HugeTLB page that can be freed
  * to the buddy allocator.
- *
- * Todo: Returns zero for now, which means the feature is disabled. We will
- * enable it once all the infrastructure is there.
  */
 static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
-	return 0;
+	return h->nr_free_vmemmap_pages;
 }
 #else
 static inline void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
@@ -38,5 +36,9 @@ static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
 	return 0;
 }
+
+static inline void hugetlb_vmemmap_init(struct hstate *h)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */
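To make the accounting concrete, here is the arithmetic hugetlb_vmemmap_init()
performs for a common x86-64 configuration. This is a worked example assuming
4 KiB base pages, a 64-byte struct page, and RESERVE_VMEMMAP_NR == 2 (the head
page plus the first tail page, as the comment implies); none of these values
is spelled out in the hunks above:

	2 MiB HugeTLB page:
		nr_pages              = pages_per_huge_page(h)      = 512
		vmemmap_pages         = (512 * 64) >> PAGE_SHIFT    = 8
		nr_free_vmemmap_pages = 8 - RESERVE_VMEMMAP_NR      = 6    (24 KiB freed)

	1 GiB HugeTLB page:
		nr_pages              = pages_per_huge_page(h)      = 262144
		vmemmap_pages         = (262144 * 64) >> PAGE_SHIFT = 4096
		nr_free_vmemmap_pages = 4096 - RESERVE_VMEMMAP_NR   = 4094 (~16 MiB freed)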