diff mbox series

[v3,1/6] mm/thp: add prep_transhuge_device_private_page()

Message ID 20201106005147.20113-2-rcampbell@nvidia.com (mailing list archive)
State New, archived
Headers show
Series mm/hmm/nouveau: add THP migration to migrate_vma_* | expand

Commit Message

Ralph Campbell Nov. 6, 2020, 12:51 a.m. UTC
Add a helper function to allow device drivers to create device private
transparent huge pages. This is intended to help support device private
THP migrations.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
---
 include/linux/huge_mm.h | 5 +++++
 mm/huge_memory.c        | 9 +++++++++
 2 files changed, 14 insertions(+)

Comments

Christoph Hellwig Nov. 6, 2020, 7:55 a.m. UTC | #1
On Thu, Nov 05, 2020 at 04:51:42PM -0800, Ralph Campbell wrote:
> +extern void prep_transhuge_device_private_page(struct page *page);

No need for the extern.

> +static inline void prep_transhuge_device_private_page(struct page *page)
> +{
> +}

Is the code to call this even reachable if THP support is configured
out?  If not just declaring it unconditionally and letting dead code
elimination do its job might be a tad cleaner.

> +void prep_transhuge_device_private_page(struct page *page)

I think a kerneldoc comment explaining what this function is useful for
would be helpful.
Matthew Wilcox (Oracle) Nov. 6, 2020, 12:14 p.m. UTC | #2
On Thu, Nov 05, 2020 at 04:51:42PM -0800, Ralph Campbell wrote:
> Add a helper function to allow device drivers to create device private
> transparent huge pages. This is intended to help support device private
> THP migrations.

I think you'd be better off with these calling conventions:

-void prep_transhuge_page(struct page *page)
+struct page *thp_prep(struct page *page)
 {
+       if (!page || compound_order(page) == 0)
+               return page;
        /*
-        * we use page->mapping and page->indexlru in second tail page
+        * we use page->mapping and page->index in second tail page
         * as list_head: assuming THP order >= 2
         */
+       BUG_ON(compound_order(page) == 1);
 
        INIT_LIST_HEAD(page_deferred_list(page));
        set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
+
+       return page;
 }

It simplifies the users.

> +void prep_transhuge_device_private_page(struct page *page)
> +{
> +	prep_compound_page(page, HPAGE_PMD_ORDER);
> +	prep_transhuge_page(page);
> +	/* Only the head page has a reference to the pgmap. */
> +	percpu_ref_put_many(page->pgmap->ref, HPAGE_PMD_NR - 1);
> +}
> +EXPORT_SYMBOL_GPL(prep_transhuge_device_private_page);

Something else that may interest you from my patch series is support
for page sizes other than PMD_SIZE.  I don't know what page sizes your
hardware supports.  There's no support for page sizes other than PMD
for anonymous memory, so this might not be too useful for you yet.
Ralph Campbell Nov. 6, 2020, 8:34 p.m. UTC | #3
On 11/6/20 4:14 AM, Matthew Wilcox wrote:
> On Thu, Nov 05, 2020 at 04:51:42PM -0800, Ralph Campbell wrote:
>> Add a helper function to allow device drivers to create device private
>> transparent huge pages. This is intended to help support device private
>> THP migrations.
> 
> I think you'd be better off with these calling conventions:
> 
> -void prep_transhuge_page(struct page *page)
> +struct page *thp_prep(struct page *page)
>   {
> +       if (!page || compound_order(page) == 0)
> +               return page;
>          /*
> -        * we use page->mapping and page->indexlru in second tail page
> +        * we use page->mapping and page->index in second tail page
>           * as list_head: assuming THP order >= 2
>           */
> +       BUG_ON(compound_order(page) == 1);
>   
>          INIT_LIST_HEAD(page_deferred_list(page));
>          set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
> +
> +       return page;
>   }
> 
> It simplifies the users.

I'm not sure what the simplification is.
If you mean the name change from prep_transhuge_page() to thp_prep(),
that seems fine to me. The following could also be renamed to
thp_prep_device_private_page() or similar.

>> +void prep_transhuge_device_private_page(struct page *page)
>> +{
>> +	prep_compound_page(page, HPAGE_PMD_ORDER);
>> +	prep_transhuge_page(page);
>> +	/* Only the head page has a reference to the pgmap. */
>> +	percpu_ref_put_many(page->pgmap->ref, HPAGE_PMD_NR - 1);
>> +}
>> +EXPORT_SYMBOL_GPL(prep_transhuge_device_private_page);
> 
> Something else that may interest you from my patch series is support
> for page sizes other than PMD_SIZE.  I don't know what page sizes 
> hardware supports.  There's no support for page sizes other than PMD
> for anonymous memory, so this might not be too useful for you yet.

I did see those changes. It might help some device drivers to do DMA in
larger than PAGE_SIZE blocks but less than PMD_SIZE. It might help
reduce page table sizes since 2MB, 64K, and 4K are commonly supported
GPU page sizes. The MIGRATE_PFN_COMPOUND flag is intended to indicate
that the page size is determined by page_size() so I was thinking ahead
to other than PMD sized pages. However, when migrating a pte_none() or
pmd_none() page, there is no source page to determine the size.
Maybe I need to encode the page order in the migrate PFN entry like
hmm_range_fault().

Anyway, I agree that thinking about page sizes other than PMD is good.
Ralph Campbell Nov. 6, 2020, 8:56 p.m. UTC | #4
On 11/5/20 11:55 PM, Christoph Hellwig wrote:
> On Thu, Nov 05, 2020 at 04:51:42PM -0800, Ralph Campbell wrote:
>> +extern void prep_transhuge_device_private_page(struct page *page);
> 
> No need for the extern.

Right, I was just copying the style.
Would you like to see a preparatory patch that removes extern for the other
declarations in huge_mm.h?

>> +static inline void prep_transhuge_device_private_page(struct page *page)
>> +{
>> +}
> 
> Is the code to call this even reachable if THP support is configured
> out?  If not just declaring it unconditionally and letting dead code
> elimination do its job might be a tad cleaner.

The HMM test driver depends on TRANSPARENT_HUGEPAGE but the nouveau SVM
option doesn't and SVM is still useful if TRANSPARENT_HUGEPAGE is not configured.

The problem with defining prep_transhuge_device_private_page() in huge_mm.h
as a static inline function is that prep_compound_page() and prep_transhuge_page()
would have to be EXPORT_SYMBOL_GPL which are currently mm internal only.
The intent is to make this helper callable by separate device driver modules
using struct pages created with memremap_pages().

>> +void prep_transhuge_device_private_page(struct page *page)
> 
> I think a kerneldoc comment explaining what this function is useful for
> would be helpful.

That is a good idea. I'll add it to v4.
diff mbox series

Patch

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 0365aa97f8e7..3ec26ef27a93 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -184,6 +184,7 @@  extern unsigned long thp_get_unmapped_area(struct file *filp,
 		unsigned long flags);
 
 extern void prep_transhuge_page(struct page *page);
+extern void prep_transhuge_device_private_page(struct page *page);
 extern void free_transhuge_page(struct page *page);
 bool is_transparent_hugepage(struct page *page);
 
@@ -377,6 +378,10 @@  static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
 
 static inline void prep_transhuge_page(struct page *page) {}
 
+static inline void prep_transhuge_device_private_page(struct page *page)
+{
+}
+
 static inline bool is_transparent_hugepage(struct page *page)
 {
 	return false;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 08a183f6c3ab..b4141f12ff31 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -498,6 +498,15 @@  void prep_transhuge_page(struct page *page)
 	set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
 }
 
+void prep_transhuge_device_private_page(struct page *page)
+{
+	prep_compound_page(page, HPAGE_PMD_ORDER);
+	prep_transhuge_page(page);
+	/* Only the head page has a reference to the pgmap. */
+	percpu_ref_put_many(page->pgmap->ref, HPAGE_PMD_NR - 1);
+}
+EXPORT_SYMBOL_GPL(prep_transhuge_device_private_page);
+
 bool is_transparent_hugepage(struct page *page)
 {
 	if (!PageCompound(page))