diff mbox series

[net-next,v4,1/4] mm: add a signature in struct page

Message ID 20210511133118.15012-2-mcroce@linux.microsoft.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series page_pool: recycle buffers | expand

Checks

Context Check Description
netdev/apply fail Patch does not apply to net-next
netdev/tree_selection success Clearly marked for net-next

Commit Message

Matteo Croce May 11, 2021, 1:31 p.m. UTC
From: Matteo Croce <mcroce@microsoft.com>

This is needed by the page_pool to avoid recycling a page not allocated
via page_pool.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
---
 include/linux/mm_types.h | 1 +
 include/net/page_pool.h  | 2 ++
 net/core/page_pool.c     | 4 ++++
 3 files changed, 7 insertions(+)

Comments

Matthew Wilcox (Oracle) May 11, 2021, 1:45 p.m. UTC | #1
On Tue, May 11, 2021 at 03:31:15PM +0200, Matteo Croce wrote:
> @@ -101,6 +101,7 @@ struct page {
>  			 * 32-bit architectures.
>  			 */
>  			unsigned long dma_addr[2];
> +			unsigned long signature;
>  		};
>  		struct {	/* slab, slob and slub */
>  			union {

No.  Signature now aliases with page->mapping, which is going to go
badly wrong for drivers which map this page into userspace.

I had this as:

+                       unsigned long pp_magic;
+                       unsigned long xmi;
+                       unsigned long _pp_mapping_pad;
                        unsigned long dma_addr[2];

and pp_magic needs to be set to something with bits 0&1 clear and
clearly isn't a pointer.  I went with POISON_POINTER_DELTA + 0x40.
Ilias Apalodimas May 11, 2021, 2:11 p.m. UTC | #2
Hi Matthew,

On Tue, May 11, 2021 at 02:45:32PM +0100, Matthew Wilcox wrote:
> On Tue, May 11, 2021 at 03:31:15PM +0200, Matteo Croce wrote:
> > @@ -101,6 +101,7 @@ struct page {
> >  			 * 32-bit architectures.
> >  			 */
> >  			unsigned long dma_addr[2];
> > +			unsigned long signature;
> >  		};
> >  		struct {	/* slab, slob and slub */
> >  			union {
> 
> No.  Signature now aliases with page->mapping, which is going to go
> badly wrong for drivers which map this page into userspace.
> 
> I had this as:
> 
> +                       unsigned long pp_magic;
> +                       unsigned long xmi;
> +                       unsigned long _pp_mapping_pad;
>                         unsigned long dma_addr[2];
> 
> and pp_magic needs to be set to something with bits 0&1 clear and
> clearly isn't a pointer.  I went with POISON_POINTER_DELTA + 0x40.

Regardless to the changes required, there's another thing we'd like your
opinion on.
There was a change wrt to the previous patchset. We used to store the
struct xdp_mem_info into page->private.  On the new version we store the
page_pool ptr address in page->private (there's an explanation why on the
mail thread, but the tl;dr is that we can get some more speed and keeping
xdp_mem_info is not that crucial). So since we can just store the page_pool
address directly, should we keep using page->private or it's better to
do: 

+                       unsigned long pp_magic;
+                       unsigned long pp_ptr;
+                       unsigned long _pp_mapping_pad;
                        unsigned long dma_addr[2];
and use pp_ptr?

Thanks
/Ilias
Matthew Wilcox (Oracle) May 11, 2021, 2:18 p.m. UTC | #3
On Tue, May 11, 2021 at 05:11:13PM +0300, Ilias Apalodimas wrote:
> Hi Matthew,
> 
> On Tue, May 11, 2021 at 02:45:32PM +0100, Matthew Wilcox wrote:
> > On Tue, May 11, 2021 at 03:31:15PM +0200, Matteo Croce wrote:
> > > @@ -101,6 +101,7 @@ struct page {
> > >  			 * 32-bit architectures.
> > >  			 */
> > >  			unsigned long dma_addr[2];
> > > +			unsigned long signature;
> > >  		};
> > >  		struct {	/* slab, slob and slub */
> > >  			union {
> > 
> > No.  Signature now aliases with page->mapping, which is going to go
> > badly wrong for drivers which map this page into userspace.
> > 
> > I had this as:
> > 
> > +                       unsigned long pp_magic;
> > +                       unsigned long xmi;
> > +                       unsigned long _pp_mapping_pad;
> >                         unsigned long dma_addr[2];
> > 
> > and pp_magic needs to be set to something with bits 0&1 clear and
> > clearly isn't a pointer.  I went with POISON_POINTER_DELTA + 0x40.
> 
> Regardless to the changes required, there's another thing we'd like your
> opinion on.
> There was a change wrt to the previous patchset. We used to store the
> struct xdp_mem_info into page->private.  On the new version we store the
> page_pool ptr address in page->private (there's an explanation why on the
> mail thread, but the tl;dr is that we can get some more speed and keeping
> xdp_mem_info is not that crucial). So since we can just store the page_pool
> address directly, should we keep using page->private or it's better to
> do: 
> 
> +                       unsigned long pp_magic;
> +                       unsigned long pp_ptr;
> +                       unsigned long _pp_mapping_pad;
>                         unsigned long dma_addr[2];
> and use pp_ptr?

I'd rather you didn't use page_private ... Any reason not to use:

			unsigned long pp_magic;
			struct page_pool *pp;
			unsigned long _pp_mapping_pad;
			unsigned long dma_addr[2];

?
Ilias Apalodimas May 11, 2021, 2:25 p.m. UTC | #4
On Tue, 11 May 2021 at 17:19, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, May 11, 2021 at 05:11:13PM +0300, Ilias Apalodimas wrote:
> > Hi Matthew,
> >
> > On Tue, May 11, 2021 at 02:45:32PM +0100, Matthew Wilcox wrote:
> > > On Tue, May 11, 2021 at 03:31:15PM +0200, Matteo Croce wrote:
> > > > @@ -101,6 +101,7 @@ struct page {
> > > >                    * 32-bit architectures.
> > > >                    */
> > > >                   unsigned long dma_addr[2];
> > > > +                 unsigned long signature;
> > > >           };
> > > >           struct {        /* slab, slob and slub */
> > > >                   union {
> > >
> > > No.  Signature now aliases with page->mapping, which is going to go
> > > badly wrong for drivers which map this page into userspace.
> > >
> > > I had this as:
> > >
> > > +                       unsigned long pp_magic;
> > > +                       unsigned long xmi;
> > > +                       unsigned long _pp_mapping_pad;
> > >                         unsigned long dma_addr[2];
> > >
> > > and pp_magic needs to be set to something with bits 0&1 clear and
> > > clearly isn't a pointer.  I went with POISON_POINTER_DELTA + 0x40.
> >
> > Regardless to the changes required, there's another thing we'd like your
> > opinion on.
> > There was a change wrt to the previous patchset. We used to store the
> > struct xdp_mem_info into page->private.  On the new version we store the
> > page_pool ptr address in page->private (there's an explanation why on the
> > mail thread, but the tl;dr is that we can get some more speed and keeping
> > xdp_mem_info is not that crucial). So since we can just store the page_pool
> > address directly, should we keep using page->private or it's better to
> > do:
> >
> > +                       unsigned long pp_magic;
> > +                       unsigned long pp_ptr;
> > +                       unsigned long _pp_mapping_pad;
> >                         unsigned long dma_addr[2];
> > and use pp_ptr?
>
> I'd rather you didn't use page_private ... Any reason not to use:
>
>                         unsigned long pp_magic;
>                         struct page_pool *pp;
>                         unsigned long _pp_mapping_pad;
>                         unsigned long dma_addr[2];
>
> ?

Nope not at all, either would work. we'll switch to that
Matthew Wilcox (Oracle) May 12, 2021, 3:57 p.m. UTC | #5
On Tue, May 11, 2021 at 05:25:36PM +0300, Ilias Apalodimas wrote:
> Nope not at all, either would work. we'll switch to that

You'll need something like this because of the current use of
page->index to mean "pfmemalloc".

From ecd6d912056a21bbe55d997c01f96b0b8b9fbc31 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Date: Fri, 16 Apr 2021 18:12:33 -0400
Subject: [PATCH] mm: Indicate pfmemalloc pages in compound_head

The net page_pool wants to use a magic value to identify page pool pages.
The best place to put it is in the first word where it can be clearly a
non-pointer value.  That means shifting dma_addr up to alias with ->index,
which means we need to find another way to indicate page_is_pfmemalloc().
Since page_pool doesn't want to set its magic value on pages which are
pfmemalloc, we can use bit 1 of compound_head to indicate that the page
came from the memory reserves.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h       | 12 +++++++-----
 include/linux/mm_types.h |  7 +++----
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bd21864449bf..4f9b2007efad 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1670,10 +1670,12 @@ struct address_space *page_mapping(struct page *page);
 static inline bool page_is_pfmemalloc(const struct page *page)
 {
 	/*
-	 * Page index cannot be this large so this must be
-	 * a pfmemalloc page.
+	 * This is not a tail page; compound_head of a head page is unused
+	 * at return from the page allocator, and will be overwritten
+	 * by callers who do not care whether the page came from the
+	 * reserves.
 	 */
-	return page->index == -1UL;
+	return page->compound_head & 2;
 }
 
 /*
@@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
  */
 static inline void set_page_pfmemalloc(struct page *page)
 {
-	page->index = -1UL;
+	page->compound_head = 2;
 }
 
 static inline void clear_page_pfmemalloc(struct page *page)
 {
-	page->index = 0;
+	page->compound_head = 0;
 }
 
 /*
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..1352e278939b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -96,10 +96,9 @@ struct page {
 			unsigned long private;
 		};
 		struct {	/* page_pool used by netstack */
-			/**
-			 * @dma_addr: might require a 64-bit value on
-			 * 32-bit architectures.
-			 */
+			unsigned long pp_magic;
+			struct page_pool *pp;
+			unsigned long _pp_mapping_pad;
 			unsigned long dma_addr[2];
 		};
 		struct {	/* slab, slob and slub */
Eric Dumazet May 12, 2021, 4:09 p.m. UTC | #6
On Wed, May 12, 2021 at 6:03 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, May 11, 2021 at 05:25:36PM +0300, Ilias Apalodimas wrote:
> > Nope not at all, either would work. we'll switch to that
>
> You'll need something like this because of the current use of
> page->index to mean "pfmemalloc".
>
> From ecd6d912056a21bbe55d997c01f96b0b8b9fbc31 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Fri, 16 Apr 2021 18:12:33 -0400
> Subject: [PATCH] mm: Indicate pfmemalloc pages in compound_head
>
> The net page_pool wants to use a magic value to identify page pool pages.
> The best place to put it is in the first word where it can be clearly a
> non-pointer value.  That means shifting dma_addr up to alias with ->index,
> which means we need to find another way to indicate page_is_pfmemalloc().
> Since page_pool doesn't want to set its magic value on pages which are
> pfmemalloc, we can use bit 1 of compound_head to indicate that the page
> came from the memory reserves.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 12 +++++++-----
>  include/linux/mm_types.h |  7 +++----
>  2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bd21864449bf..4f9b2007efad 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1670,10 +1670,12 @@ struct address_space *page_mapping(struct page *page);
>  static inline bool page_is_pfmemalloc(const struct page *page)
>  {
>         /*
> -        * Page index cannot be this large so this must be
> -        * a pfmemalloc page.
> +        * This is not a tail page; compound_head of a head page is unused
> +        * at return from the page allocator, and will be overwritten
> +        * by callers who do not care whether the page came from the
> +        * reserves.
>          */
> -       return page->index == -1UL;
> +       return page->compound_head & 2;
>  }
>
>  /*
> @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
>   */
>  static inline void set_page_pfmemalloc(struct page *page)
>  {
> -       page->index = -1UL;
> +       page->compound_head = 2;
>  }
>
>  static inline void clear_page_pfmemalloc(struct page *page)
>  {
> -       page->index = 0;
> +       page->compound_head = 0;
>  }
>
>  /*
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5aacc1c10a45..1352e278939b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -96,10 +96,9 @@ struct page {
>                         unsigned long private;
>                 };
>                 struct {        /* page_pool used by netstack */
> -                       /**
> -                        * @dma_addr: might require a 64-bit value on
> -                        * 32-bit architectures.
> -                        */
> +                       unsigned long pp_magic;
> +                       struct page_pool *pp;
> +                       unsigned long _pp_mapping_pad;
>                         unsigned long dma_addr[2];
>                 };
>                 struct {        /* slab, slob and slub */
> --
> 2.30.2
>

This would break compound_head() ?
Matthew Wilcox (Oracle) May 12, 2021, 4:26 p.m. UTC | #7
On Wed, May 12, 2021 at 06:09:21PM +0200, Eric Dumazet wrote:
> On Wed, May 12, 2021 at 6:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, May 11, 2021 at 05:25:36PM +0300, Ilias Apalodimas wrote:
> > > Nope not at all, either would work. we'll switch to that
> >
> > You'll need something like this because of the current use of
> > page->index to mean "pfmemalloc".
> >
> > From ecd6d912056a21bbe55d997c01f96b0b8b9fbc31 Mon Sep 17 00:00:00 2001
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > Date: Fri, 16 Apr 2021 18:12:33 -0400
> > Subject: [PATCH] mm: Indicate pfmemalloc pages in compound_head
> >
> > The net page_pool wants to use a magic value to identify page pool pages.
> > The best place to put it is in the first word where it can be clearly a
> > non-pointer value.  That means shifting dma_addr up to alias with ->index,
> > which means we need to find another way to indicate page_is_pfmemalloc().
> > Since page_pool doesn't want to set its magic value on pages which are
> > pfmemalloc, we can use bit 1 of compound_head to indicate that the page
> > came from the memory reserves.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  include/linux/mm.h       | 12 +++++++-----
> >  include/linux/mm_types.h |  7 +++----
> >  2 files changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index bd21864449bf..4f9b2007efad 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1670,10 +1670,12 @@ struct address_space *page_mapping(struct page *page);
> >  static inline bool page_is_pfmemalloc(const struct page *page)
> >  {
> >         /*
> > -        * Page index cannot be this large so this must be
> > -        * a pfmemalloc page.
> > +        * This is not a tail page; compound_head of a head page is unused
> > +        * at return from the page allocator, and will be overwritten
> > +        * by callers who do not care whether the page came from the
> > +        * reserves.
> >          */
> > -       return page->index == -1UL;
> > +       return page->compound_head & 2;
> >  }
> >
> >  /*
> > @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
> >   */
> >  static inline void set_page_pfmemalloc(struct page *page)
> >  {
> > -       page->index = -1UL;
> > +       page->compound_head = 2;
> >  }
> >
> >  static inline void clear_page_pfmemalloc(struct page *page)
> >  {
> > -       page->index = 0;
> > +       page->compound_head = 0;
> >  }
> >
> >  /*
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 5aacc1c10a45..1352e278939b 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -96,10 +96,9 @@ struct page {
> >                         unsigned long private;
> >                 };
> >                 struct {        /* page_pool used by netstack */
> > -                       /**
> > -                        * @dma_addr: might require a 64-bit value on
> > -                        * 32-bit architectures.
> > -                        */
> > +                       unsigned long pp_magic;
> > +                       struct page_pool *pp;
> > +                       unsigned long _pp_mapping_pad;
> >                         unsigned long dma_addr[2];
> >                 };
> >                 struct {        /* slab, slob and slub */
> 
> This would break compound_head() ?

No, compound_head() only checks bit 0.  If bit 0 is clear, then this is
not a tail page.
Ilias Apalodimas May 12, 2021, 4:47 p.m. UTC | #8
On Wed, May 12, 2021 at 04:57:25PM +0100, Matthew Wilcox wrote:
> On Tue, May 11, 2021 at 05:25:36PM +0300, Ilias Apalodimas wrote:
> > Nope not at all, either would work. we'll switch to that
> 
> You'll need something like this because of the current use of
> page->index to mean "pfmemalloc".
> 

Yes, I was somehow under the impression that was already merged.
We'll include it in the series, with your Co-developed-by tag.

Thanks
/Ilias

> From ecd6d912056a21bbe55d997c01f96b0b8b9fbc31 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Fri, 16 Apr 2021 18:12:33 -0400
> Subject: [PATCH] mm: Indicate pfmemalloc pages in compound_head
> 
> The net page_pool wants to use a magic value to identify page pool pages.
> The best place to put it is in the first word where it can be clearly a
> non-pointer value.  That means shifting dma_addr up to alias with ->index,
> which means we need to find another way to indicate page_is_pfmemalloc().
> Since page_pool doesn't want to set its magic value on pages which are
> pfmemalloc, we can use bit 1 of compound_head to indicate that the page
> came from the memory reserves.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 12 +++++++-----
>  include/linux/mm_types.h |  7 +++----
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bd21864449bf..4f9b2007efad 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1670,10 +1670,12 @@ struct address_space *page_mapping(struct page *page);
>  static inline bool page_is_pfmemalloc(const struct page *page)
>  {
>  	/*
> -	 * Page index cannot be this large so this must be
> -	 * a pfmemalloc page.
> +	 * This is not a tail page; compound_head of a head page is unused
> +	 * at return from the page allocator, and will be overwritten
> +	 * by callers who do not care whether the page came from the
> +	 * reserves.
>  	 */
> -	return page->index == -1UL;
> +	return page->compound_head & 2;
>  }
>  
>  /*
> @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
>   */
>  static inline void set_page_pfmemalloc(struct page *page)
>  {
> -	page->index = -1UL;
> +	page->compound_head = 2;
>  }
>  
>  static inline void clear_page_pfmemalloc(struct page *page)
>  {
> -	page->index = 0;
> +	page->compound_head = 0;
>  }
>  
>  /*
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5aacc1c10a45..1352e278939b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -96,10 +96,9 @@ struct page {
>  			unsigned long private;
>  		};
>  		struct {	/* page_pool used by netstack */
> -			/**
> -			 * @dma_addr: might require a 64-bit value on
> -			 * 32-bit architectures.
> -			 */
> +			unsigned long pp_magic;
> +			struct page_pool *pp;
> +			unsigned long _pp_mapping_pad;
>  			unsigned long dma_addr[2];
>  		};
>  		struct {	/* slab, slob and slub */
> -- 
> 2.30.2
>
Yunsheng Lin May 13, 2021, 2:15 a.m. UTC | #9
On 2021/5/12 23:57, Matthew Wilcox wrote:
> On Tue, May 11, 2021 at 05:25:36PM +0300, Ilias Apalodimas wrote:
>> Nope not at all, either would work. we'll switch to that
> 
> You'll need something like this because of the current use of
> page->index to mean "pfmemalloc".
> 
>>From ecd6d912056a21bbe55d997c01f96b0b8b9fbc31 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Fri, 16 Apr 2021 18:12:33 -0400
> Subject: [PATCH] mm: Indicate pfmemalloc pages in compound_head
> 
> The net page_pool wants to use a magic value to identify page pool pages.
> The best place to put it is in the first word where it can be clearly a
> non-pointer value.  That means shifting dma_addr up to alias with ->index,
> which means we need to find another way to indicate page_is_pfmemalloc().
> Since page_pool doesn't want to set its magic value on pages which are
> pfmemalloc, we can use bit 1 of compound_head to indicate that the page
> came from the memory reserves.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 12 +++++++-----
>  include/linux/mm_types.h |  7 +++----
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bd21864449bf..4f9b2007efad 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1670,10 +1670,12 @@ struct address_space *page_mapping(struct page *page);
>  static inline bool page_is_pfmemalloc(const struct page *page)
>  {
>  	/*
> -	 * Page index cannot be this large so this must be
> -	 * a pfmemalloc page.
> +	 * This is not a tail page; compound_head of a head page is unused
> +	 * at return from the page allocator, and will be overwritten
> +	 * by callers who do not care whether the page came from the
> +	 * reserves.
>  	 */
> -	return page->index == -1UL;
> +	return page->compound_head & 2;
>  }
>  
>  /*
> @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
>   */
>  static inline void set_page_pfmemalloc(struct page *page)
>  {
> -	page->index = -1UL;
> +	page->compound_head = 2;

Is there any reason why not use "page->compound_head |= 2"? as
corresponding to the "page->compound_head & 2" in the above
page_is_pfmemalloc()?

Also, this may mean we need to make sure to pass head page or
base page to set_page_pfmemalloc() if using
"page->compound_head = 2", because it clears the bit 0 and head
page ptr for tail page too, right?

>  }
>  
>  static inline void clear_page_pfmemalloc(struct page *page)
>  {
> -	page->index = 0;
> +	page->compound_head = 0;
>  }
>  
>  /*
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5aacc1c10a45..1352e278939b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -96,10 +96,9 @@ struct page {
>  			unsigned long private;
>  		};
>  		struct {	/* page_pool used by netstack */
> -			/**
> -			 * @dma_addr: might require a 64-bit value on
> -			 * 32-bit architectures.
> -			 */
> +			unsigned long pp_magic;
> +			struct page_pool *pp;
> +			unsigned long _pp_mapping_pad;
>  			unsigned long dma_addr[2];

It seems the dma_addr[1] aliases with page->private, and
page_private() is used in skb_copy_ubufs()?

It seems we can avoid using page_private() in skb_copy_ubufs()
by using a dynamic allocated array to store the page ptr?

>  		};
>  		struct {	/* slab, slob and slub */
>
Matthew Wilcox (Oracle) May 13, 2021, 2:35 a.m. UTC | #10
On Thu, May 13, 2021 at 10:15:26AM +0800, Yunsheng Lin wrote:
> On 2021/5/12 23:57, Matthew Wilcox wrote:
> > You'll need something like this because of the current use of
> > page->index to mean "pfmemalloc".
> >
> > @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
> >   */
> >  static inline void set_page_pfmemalloc(struct page *page)
> >  {
> > -	page->index = -1UL;
> > +	page->compound_head = 2;
> 
> Is there any reason why not use "page->compound_head |= 2"? as
> corresponding to the "page->compound_head & 2" in the above
> page_is_pfmemalloc()?
> 
> Also, this may mean we need to make sure to pass head page or
> base page to set_page_pfmemalloc() if using
> "page->compound_head = 2", because it clears the bit 0 and head
> page ptr for tail page too, right?

I think what you're missing here is that this page is freshly allocated.
This is information being passed from the page allocator to any user
who cares to look at it.  By definition, it's set on the head/base page, and
there is nothing else present in the page->compound_head.  Doing an OR
is more expensive than just setting it to 2.

I'm not really sure why set/clear page_pfmemalloc are defined in mm.h.
They should probably be in mm/page_alloc.c where nobody else would ever
think that they could or should be calling them.

> >  		struct {	/* page_pool used by netstack */
> > -			/**
> > -			 * @dma_addr: might require a 64-bit value on
> > -			 * 32-bit architectures.
> > -			 */
> > +			unsigned long pp_magic;
> > +			struct page_pool *pp;
> > +			unsigned long _pp_mapping_pad;
> >  			unsigned long dma_addr[2];
> 
> It seems the dma_addr[1] aliases with page->private, and
> page_private() is used in skb_copy_ubufs()?
> 
> It seems we can avoid using page_private() in skb_copy_ubufs()
> by using a dynamic allocated array to store the page ptr?

This is why I hate it when people use page_private() instead of
documenting what they're doing in struct page.  There is no way to know
(as an outsider to networking) whether the page in skb_copy_ubufs()
comes from page_pool.  I looked at it, and thought it didn't:

                page = alloc_page(gfp_mask);

but if you say those pages can come from page_pool, I believe you.
Yunsheng Lin May 13, 2021, 3:25 a.m. UTC | #11
On 2021/5/13 10:35, Matthew Wilcox wrote:
> On Thu, May 13, 2021 at 10:15:26AM +0800, Yunsheng Lin wrote:
>> On 2021/5/12 23:57, Matthew Wilcox wrote:
>>> You'll need something like this because of the current use of
>>> page->index to mean "pfmemalloc".
>>>
>>> @@ -1682,12 +1684,12 @@ static inline bool page_is_pfmemalloc(const struct page *page)
>>>   */
>>>  static inline void set_page_pfmemalloc(struct page *page)
>>>  {
>>> -	page->index = -1UL;
>>> +	page->compound_head = 2;
>>
>> Is there any reason why not use "page->compound_head |= 2"? as
>> corresponding to the "page->compound_head & 2" in the above
>> page_is_pfmemalloc()?
>>
>> Also, this may mean we need to make sure to pass head page or
>> base page to set_page_pfmemalloc() if using
>> "page->compound_head = 2", because it clears the bit 0 and head
>> page ptr for tail page too, right?
> 
> I think what you're missing here is that this page is freshly allocated.
> This is information being passed from the page allocator to any user
> who cares to look at it.  By definition, it's set on the head/base page, and
> there is nothing else present in the page->compound_head.  Doing an OR
> is more expensive than just setting it to 2.

Thanks for clarifying.

> 
> I'm not really sure why set/clear page_pfmemalloc are defined in mm.h.
> They should probably be in mm/page_alloc.c where nobody else would ever
> think that they could or should be calling them.>
>>>  		struct {	/* page_pool used by netstack */
>>> -			/**
>>> -			 * @dma_addr: might require a 64-bit value on
>>> -			 * 32-bit architectures.
>>> -			 */
>>> +			unsigned long pp_magic;
>>> +			struct page_pool *pp;
>>> +			unsigned long _pp_mapping_pad;
>>>  			unsigned long dma_addr[2];
>>
>> It seems the dma_addr[1] aliases with page->private, and
>> page_private() is used in skb_copy_ubufs()?
>>
>> It seems we can avoid using page_private() in skb_copy_ubufs()
>> by using a dynamic allocated array to store the page ptr?
> 
> This is why I hate it when people use page_private() instead of
> documenting what they're doing in struct page.  There is no way to know
> (as an outsider to networking) whether the page in skb_copy_ubufs()
> comes from page_pool.  I looked at it, and thought it didn't:
> 
>                 page = alloc_page(gfp_mask);
> 
> but if you say those pages can come from page_pool, I believe you.

page_private() using in skb_copy_ubufs() does indeed seem ok here.
the page_private() is used on the page which is freshly allocated
from alloc_page().

Sorry for the confusion.

> 
> .
>
diff mbox series

Patch

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..77c04210e474 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -101,6 +101,7 @@  struct page {
 			 * 32-bit architectures.
 			 */
 			unsigned long dma_addr[2];
+			unsigned long signature;
 		};
 		struct {	/* slab, slob and slub */
 			union {
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index b4b6de909c93..9814e36becc1 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -63,6 +63,8 @@ 
  */
 #define PP_ALLOC_CACHE_SIZE	128
 #define PP_ALLOC_CACHE_REFILL	64
+#define PP_SIGNATURE		0x20210303
+
 struct pp_alloc_cache {
 	u32 count;
 	struct page *cache[PP_ALLOC_CACHE_SIZE];
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 3c4c4c7a0402..2e5e2b8c3a02 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -221,6 +221,8 @@  static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 		return NULL;
 	}
 
+	page->signature = PP_SIGNATURE;
+
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
 	trace_page_pool_state_hold(pool, page, pool->pages_state_hold_cnt);
@@ -341,6 +343,8 @@  void page_pool_release_page(struct page_pool *pool, struct page *page)
 			     DMA_ATTR_SKIP_CPU_SYNC);
 	page_pool_set_dma_addr(page, 0);
 skip_dma_unmap:
+	page->signature = 0;
+
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
 	 */