[v2,0/5] Support large folios for tmpfs

Message ID cover.1731397290.git.baolin.wang@linux.alibaba.com (mailing list archive)

Message

Baolin Wang Nov. 12, 2024, 7:45 a.m. UTC
Traditionally, tmpfs only supported PMD-sized huge folios. However, now that
other file systems support large folios of any size and anonymous memory has
been extended to support mTHP, we should not restrict tmpfs to allocating only
PMD-sized huge folios, which makes it a special case. Instead, we should allow
tmpfs to allocate large folios of any size.

Considering that tmpfs already has the 'huge=' option to control huge folio
allocation, we can extend the 'huge=' option to allow huge folios of any
size. The semantics of the 'huge=' mount option are:

huge=never: no huge folios of any size
huge=always: huge folios of any size
huge=within_size: like 'always', but respect the i_size
huge=advise: like 'always', if requested with fadvise()/madvise()

Note: tmpfs mmap() faults have no write size hint, so they still allocate
PMD-sized huge folios if huge=always/within_size/advise is set.
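
As an illustration only, the policy described above can be modelled in user
space roughly as follows. This is a sketch, not the code from the series: it
assumes 4 KiB base pages and an order-9 PMD, and the names pick_order(),
order_for_len() and fits_i_size() are hypothetical. It also assumes that a
write picks the largest naturally aligned order that does not exceed the
write length, and that 'within_size' means the folio must end within i_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumptions of this sketch: 4 KiB base pages and an order-9 (2 MiB) PMD. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PMD_ORDER  9u

enum huge_policy { HUGE_NEVER, HUGE_ALWAYS, HUGE_WITHIN_SIZE, HUGE_ADVISE };

/* Largest order, capped at PMD order, whose folio is no bigger than len. */
static unsigned int order_for_len(size_t len)
{
	size_t pages = len >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (order < SKETCH_PMD_ORDER && (pages >> (order + 1)) != 0)
		order++;
	return order;
}

/* True if the naturally aligned folio containing index ends within i_size. */
static bool fits_i_size(size_t index, unsigned int order, size_t i_size)
{
	size_t nr = (size_t)1 << order;
	size_t start = index & ~(nr - 1);
	size_t page_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t size_pages = (i_size + page_size - 1) >> SKETCH_PAGE_SHIFT;

	return start + nr <= size_pages;
}

/*
 * index:     page index being populated
 * write_len: bytes written from index onward; 0 models an mmap() fault
 * i_size:    file size in bytes (after the write, for the write path)
 * advised:   huge pages were requested with fadvise()/madvise()
 */
static unsigned int pick_order(enum huge_policy policy, size_t index,
			       size_t write_len, size_t i_size, bool advised)
{
	unsigned int order;

	if (policy == HUGE_NEVER || (policy == HUGE_ADVISE && !advised))
		return 0;

	if (write_len == 0) {
		/* No write size hint: only the PMD order is considered. */
		if (policy == HUGE_WITHIN_SIZE &&
		    !fits_i_size(index, SKETCH_PMD_ORDER, i_size))
			return 0;
		return SKETCH_PMD_ORDER;
	}

	/* Write path: size the folio from the write, then keep index aligned. */
	order = order_for_len(write_len);
	while (order > 0 && (index & (((size_t)1 << order) - 1)) != 0)
		order--;

	/* A large folio chosen this way ends at or before the write end. */
	assert(order <= SKETCH_PMD_ORDER);
	return order;
}
```

Under these assumptions, a 64 KiB write at index 0 with huge=always yields
order 4, while an mmap() fault with huge=always yields order 9.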

Moreover, the 'deny' and 'force' testing options controlled by
'/sys/kernel/mm/transparent_hugepage/shmem_enabled' retain the same
semantics: 'deny' disables large folios of any size for tmpfs, while
'force' enables PMD-sized large folios for tmpfs.
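
For anyone wanting to check which setting is active while testing, a small
sketch follows. It assumes the usual THP sysfs convention in which the file
lists every value and brackets the active one (for example
"always within_size advise [never] deny force"). The helper names
active_setting() and read_active_setting() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SHMEM_ENABLED "/sys/kernel/mm/transparent_hugepage/shmem_enabled"

/*
 * Copy the bracketed token of line into out (capacity outsz).
 * Returns 0 on success, -1 if there is no non-empty "[token]" that fits.
 */
static int active_setting(const char *line, char *out, size_t outsz)
{
	const char *lb, *rb;
	size_t len;

	assert(line != NULL && out != NULL);
	lb = strchr(line, '[');
	rb = lb ? strchr(lb, ']') : NULL;
	if (rb == NULL)
		return -1;
	len = (size_t)(rb - lb) - 1;
	if (len == 0 || len >= outsz)
		return -1;
	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}

/* Read the sysfs file and extract the active value; -1 on any failure. */
static int read_active_setting(char *out, size_t outsz)
{
	char line[128];
	FILE *f = fopen(SHMEM_ENABLED, "r");
	int ret = -1;

	if (f == NULL)
		return -1;
	if (fgets(line, sizeof(line), f) != NULL)
		ret = active_setting(line, out, outsz);
	fclose(f);
	return ret;
}
```

The parsing is kept separate from the file access so that it can be exercised
on a plain string, without a kernel that exposes the file.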

Any comments and suggestions are appreciated. Thanks.

Changes from v1:
 - Add Reviewed-by tags from Barry and David. Thanks.
 - Fix build warnings reported by the kernel test robot.
 - Add a new patch to control the default huge policy for tmpfs.

Changes from RFC v3:
 - Drop the huge=write_size option.
 - Allow any sized huge folios for the 'huge' option.
 - Update the documentation, per David.

Changes from RFC v2:
 - Drop mTHP interfaces to control huge page allocation, per Matthew.
 - Add a new helper to calculate the order, suggested by Matthew.
 - Add a new huge=write_size option to allocate large folios based on
   the write size.
 - Add a new patch to update the documentation.

Changes from RFC v1:
 - Drop patch 1.
 - Use 'write_end' to calculate the length in shmem_allowable_huge_orders().
 - Update shmem_mapping_size_order() per Daniel.

Baolin Wang (4):
  mm: factor out the order calculation into a new helper
  mm: shmem: change shmem_huge_global_enabled() to return huge order
    bitmap
  mm: shmem: add large folio support for tmpfs
  mm: shmem: add a kernel command line to change the default huge policy
    for tmpfs

David Hildenbrand (1):
  docs: tmpfs: update the huge folios policy for tmpfs and shmem

 .../admin-guide/kernel-parameters.txt         |   7 +
 Documentation/admin-guide/mm/transhuge.rst    |  64 ++++++--
 include/linux/pagemap.h                       |  16 +-
 mm/shmem.c                                    | 148 ++++++++++++++----
 4 files changed, 183 insertions(+), 52 deletions(-)

Comments

Daniel Gomez Nov. 15, 2024, 1:16 p.m. UTC | #1
On Tue Nov 12, 2024 at 8:45 AM CET, Baolin Wang wrote:
> Traditionally, tmpfs only supported PMD-sized huge folios. However nowadays

Nitpick:
We are mixing folios/pages and PMD-size huge here. For anyone not aware of
the memory folios conversion in the kernel, I think this is confusing.
Tmpfs has never supported folios, so this is not true. Can we rephrase
it?

Below you are also mixing the terms huge/large folios, etc. Can we be
consistent? I'd stick with folios (for order-0) and large folios (for
order > 0). I'd use the term huge only when referring to PMD-size pages.

> with other file systems supporting any sized large folios, and extending
> anonymous to support mTHP, we should not restrict tmpfs to allocating only
> PMD-sized huge folios, making it more special. Instead, we should allow

Again here.

> tmpfs to allocate any sized large folios.
>
> Considering that tmpfs already has the 'huge=' option to control the huge
> folios allocation, we can extend the 'huge=' option to allow any sized huge

'huge=' has never controlled folios.

> folios. [...]
David Hildenbrand Nov. 15, 2024, 1:35 p.m. UTC | #2
On 15.11.24 14:16, Daniel Gomez wrote:
> On Tue Nov 12, 2024 at 8:45 AM CET, Baolin Wang wrote:
>> Traditionally, tmpfs only supported PMD-sized huge folios. However nowadays
> 
> Nitpick:
> We are mixing here folios/page, PMD-size huge. For anyone not aware of
> Memory Folios conversion in the kernel I think this makes it confusing.
> Tmpfs has never supported folios so, this is not true. Can we rephrase
> it?

We had the exact same discussion when we added mTHP support to anonymous 
memory.

I suggest you read:

https://lkml.kernel.org/r/65dbdf2a-9281-a3c3-b7e3-a79c5b60b357@redhat.com

Folios are an implementation detail on how we manage metadata. Nobody in 
user space should even have to be aware of how we manage metadata for 
larger chunks of memory ("huge pages") in the kernel.
Daniel Gomez Nov. 15, 2024, 3:35 p.m. UTC | #3
On Fri Nov 15, 2024 at 2:35 PM CET, David Hildenbrand wrote:
> On 15.11.24 14:16, Daniel Gomez wrote:
>> On Tue Nov 12, 2024 at 8:45 AM CET, Baolin Wang wrote:
>>> Traditionally, tmpfs only supported PMD-sized huge folios. However nowadays
>> 
>> Nitpick:
>> We are mixing here folios/page, PMD-size huge. For anyone not aware of
>> Memory Folios conversion in the kernel I think this makes it confusing.
>> Tmpfs has never supported folios so, this is not true. Can we rephrase
>> it?
>
> We had the exact same discussion when we added mTHP support to anonymous 
> memory.
>
> I suggest you read:
>
> https://lkml.kernel.org/r/65dbdf2a-9281-a3c3-b7e3-a79c5b60b357@redhat.com
>
> Folios are an implementation detail on how we manage metadata. Nobody in 
> user space should even have to be aware of how we manage metadata for 
> larger chunks of memory ("huge pages") in the kernel.

I read it and I can't find where the use of "PMD-size huge folios" could
be a valid term. Tmpfs has never supported "folios", so I think using
"PMD-size huge pages" is more appropriate.
David Hildenbrand Nov. 15, 2024, 3:44 p.m. UTC | #4
On 15.11.24 16:35, Daniel Gomez wrote:
> On Fri Nov 15, 2024 at 2:35 PM CET, David Hildenbrand wrote:
>> On 15.11.24 14:16, Daniel Gomez wrote:
>>> On Tue Nov 12, 2024 at 8:45 AM CET, Baolin Wang wrote:
>>>> Traditionally, tmpfs only supported PMD-sized huge folios. However nowadays
>>>
>>> Nitpick:
>>> We are mixing here folios/page, PMD-size huge. For anyone not aware of
>>> Memory Folios conversion in the kernel I think this makes it confusing.
>>> Tmpfs has never supported folios so, this is not true. Can we rephrase
>>> it?
>>
>> We had the exact same discussion when we added mTHP support to anonymous
>> memory.
>>
>> I suggest you read:
>>
>> https://lkml.kernel.org/r/65dbdf2a-9281-a3c3-b7e3-a79c5b60b357@redhat.com
>>
>> Folios are an implementation detail on how we manage metadata. Nobody in
>> user space should even have to be aware of how we manage metadata for
>> larger chunks of memory ("huge pages") in the kernel.
> 
> I read it and I can't find where the use of "PMD-size huge folios" could
> be a valid term. Tmpfs has never supported "folios", so I think using
> "PMD-size huge pages" is more appropriate.

Oh sorry, I completely agree. Yes, we should use that.