Message ID | 20201026145114.59424-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
Series | Free some vmemmap pages of hugetlb page |
On Mon, Oct 26, 2020 at 10:50:55PM +0800, Muchun Song wrote:
> For tail pages, the value of compound_dtor is the same. So we can reuse
compound_dtor is only set on the first tail page. compound_head is
what you mean here, I think.
On Mon, Oct 26, 2020 at 11:53 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Oct 26, 2020 at 10:50:55PM +0800, Muchun Song wrote:
> > For tail pages, the value of compound_dtor is the same. So we can reuse
>
> compound_dtor is only set on the first tail page. compound_head is
> what you mean here, I think.
>

Yes, that's right. Sorry for the confusion. Thanks.
On Mon 26-10-20 22:50:55, Muchun Song wrote:
> If we uses the 1G hugetlbpage, we can save 4095 pages. This is a very
> substantial gain. On our server, run some SPDK/QEMU applications which
> will use 1000GB hugetlbpage. With this feature enabled, we can save
> ~16GB(1G hugepage)/~11GB(2MB hugepage) memory.
[...]
> 15 files changed, 1091 insertions(+), 165 deletions(-)
> create mode 100644 include/linux/bootmem_info.h
> create mode 100644 mm/bootmem_info.c

This is a neat idea, but the code footprint is really non-trivial, and it
lands in very tricky code, which hugetlb unfortunately is.

Saving 1.6% of memory is definitely interesting, especially for 1GB pages,
which tend to be more static and where the savings are more visible.

Anyway, I haven't seen any runtime overhead analysis here. What is the
price of modifying the vmemmap page tables and making them pte rather than
pmd based (especially for 2MB hugetlb)? Also, how expensive is the
vmemmap page table reconstruction on the freeing path?

Thanks!
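The figures quoted above can be roughly cross-checked with a little arithmetic. The sketch below is not taken from the patch series. It assumes a 4 KiB base page and a 64-byte `struct page` (typical for x86-64, but an assumption here), and the helper names are made up for illustration. The 4095 freed pages per 1G huge page is the number given in the quoted cover letter.

```python
# Back-of-the-envelope check of the vmemmap savings discussed in the thread.
# Assumptions (not stated in the thread): 4 KiB base pages, 64-byte struct page.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
MIB = 1 << 20
GIB = 1 << 30


def vmemmap_pages(hugepage_bytes):
    """Base pages needed to hold the struct pages describing one huge page."""
    nr_struct_pages = hugepage_bytes // PAGE_SIZE
    return nr_struct_pages * STRUCT_PAGE_SIZE // PAGE_SIZE


def saved_gib(pool_bytes, hugepage_bytes, freed_per_hugepage):
    """GiB returned to the system if each huge page frees the given number of vmemmap pages."""
    nr_hugepages = pool_bytes // hugepage_bytes
    return nr_hugepages * freed_per_hugepage * PAGE_SIZE / GIB


# Fraction of memory spent on struct pages: 64 / 4096, i.e. about 1.6%.
overhead = STRUCT_PAGE_SIZE / PAGE_SIZE

pages_1g = vmemmap_pages(1 * GIB)   # vmemmap pages behind one 1G huge page
pages_2m = vmemmap_pages(2 * MIB)   # vmemmap pages behind one 2MB huge page

# 1000GB pool of 1G huge pages, 4095 pages freed per huge page (from the cover letter).
saved_1g = saved_gib(1000 * GIB, 1 * GIB, 4095)

print(f"struct page overhead: {overhead:.4%}")
print(f"vmemmap pages per 1G huge page: {pages_1g}")
print(f"vmemmap pages per 2MB huge page: {pages_2m}")
print(f"saved with a 1000GB pool of 1G pages: {saved_1g:.2f} GiB")
```

Under these assumptions the 1G case comes out just under 16 GiB, in line with the "~16GB" estimate, and the per-page overhead matches the 1.6% figure mentioned in the reply.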
On Fri, Oct 30, 2020 at 5:14 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Mon 26-10-20 22:50:55, Muchun Song wrote:
> > If we uses the 1G hugetlbpage, we can save 4095 pages. This is a very
> > substantial gain. On our server, run some SPDK/QEMU applications which
> > will use 1000GB hugetlbpage. With this feature enabled, we can save
> > ~16GB(1G hugepage)/~11GB(2MB hugepage) memory.
> [...]
> > 15 files changed, 1091 insertions(+), 165 deletions(-)
> > create mode 100644 include/linux/bootmem_info.h
> > create mode 100644 mm/bootmem_info.c
>
> This is a neat idea but the code footprint is really non trivial. To a
> very tricky code which hugetlb is unfortunately.
>
> Saving 1,6% of memory is definitely interesting especially for 1GB pages
> which tend to be more static and where the savings are more visible.
>
> Anyway, I haven't seen any runtime overhead analysis here. What is the
> price to modify the vmemmap page tables and make them pte rather than
> pmd based (especially for 2MB hugetlb). Also, how expensive is the
> vmemmap page tables reconstruction on the freeing path?

Yeah, I haven't tested the remapping overhead of reserving a hugetlb
page. I can do that. But the overhead is not on the allocation/freeing of
each hugetlb page; it is incurred only once, when we reserve some hugetlb
pages through /proc/sys/vm/nr_hugepages. Once the reservation is successful,
the subsequent allocation, freeing and use are the same as before
(not patched). So I think that the overhead is acceptable.

Thanks.

>
> Thanks!
> --
> Michal Hocko
> SUSE Labs
On Fri 30-10-20 18:24:25, Muchun Song wrote:
> On Fri, Oct 30, 2020 at 5:14 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Mon 26-10-20 22:50:55, Muchun Song wrote:
> > > If we uses the 1G hugetlbpage, we can save 4095 pages. This is a very
> > > substantial gain. On our server, run some SPDK/QEMU applications which
> > > will use 1000GB hugetlbpage. With this feature enabled, we can save
> > > ~16GB(1G hugepage)/~11GB(2MB hugepage) memory.
> > [...]
> > > 15 files changed, 1091 insertions(+), 165 deletions(-)
> > > create mode 100644 include/linux/bootmem_info.h
> > > create mode 100644 mm/bootmem_info.c
> >
> > This is a neat idea but the code footprint is really non trivial. To a
> > very tricky code which hugetlb is unfortunately.
> >
> > Saving 1,6% of memory is definitely interesting especially for 1GB pages
> > which tend to be more static and where the savings are more visible.
> >
> > Anyway, I haven't seen any runtime overhead analysis here. What is the
> > price to modify the vmemmap page tables and make them pte rather than
> > pmd based (especially for 2MB hugetlb). Also, how expensive is the
> > vmemmap page tables reconstruction on the freeing path?
>
> Yeah, I haven't tested the remapping overhead of reserving a hugetlb
> page. I can do that. But the overhead is not on the allocation/freeing of
> each hugetlb page, it is only once when we reserve some hugetlb pages
> through /proc/sys/vm/nr_hugepages. Once the reservation is successful,
> the subsequent allocation, freeing and using are the same as before
> (not patched).

Yes, that is quite clear. Except for hugetlb overcommit, and migration
if the pool is depleted. Maybe a few other cases.

> So I think that the overhead is acceptable.

Having some numbers for such a large feature is really needed.
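One simple way to get a first number for the reservation path discussed above is to time the write to /proc/sys/vm/nr_hugepages on a patched and an unpatched kernel. The sketch below is only an illustration of that idea, not a benchmark used by anyone in the thread. The `time_write` helper is a hypothetical name, the real run needs root, and a single wall-clock sample is noisy.

```python
# Minimal timing sketch: how long does one write to a sysctl-style file take?
# time_write() is a hypothetical helper, not part of any kernel tooling.
import sys
import time


def time_write(path, value):
    """Write value to path and return the elapsed wall-clock time in seconds.

    The file is closed inside the timed region, so the time includes the
    flush that actually delivers the data to the kernel.
    """
    start = time.perf_counter()
    with open(path, "w") as f:
        f.write(str(value))
    return time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (as root): python3 time_reserve.py <number of huge pages>
    path = "/proc/sys/vm/nr_hugepages"
    elapsed = time_write(path, int(sys.argv[1]))
    with open(path) as f:
        reserved = f.read().strip()
    print(f"requested {sys.argv[1]}, reserved {reserved}, took {elapsed:.3f}s")
```

Reading the value back matters because the kernel may reserve fewer pages than requested. This only covers the reservation through nr_hugepages. The overcommit and migration cases raised in the reply would need separate measurements.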