diff mbox series

[8/9] mm: Rearrange page flags

Message ID 20230815032645.1393700-9-willy@infradead.org (mailing list archive)
State New
Headers show
Series Remove _folio_dtor and _folio_order | expand

Commit Message

Matthew Wilcox Aug. 15, 2023, 3:26 a.m. UTC
Move PG_writeback into bottom byte so that it can use PG_waiters in a
later patch.  Move PG_head into bottom byte as well to match with where
'order' is moving next.  PG_active and PG_workingset move into the second
byte to make room for them.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page-flags.h | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

Comments

Yosry Ahmed Aug. 15, 2023, 4:30 a.m. UTC | #1
On Mon, Aug 14, 2023 at 8:27 PM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Move PG_writeback into bottom byte so that it can use PG_waiters in a
> later patch.  Move PG_head into bottom byte as well to match with where
> 'order' is moving next.

Sorry if that's obvious, but why do we want to move PG_head to where
'order' is moving? I thought we would want to avoid an overlap to
avoid mistaking the first tail page of a high-order compound page as a
head page?
Peter Xu Aug. 15, 2023, 7:24 p.m. UTC | #2
On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote:
> Move PG_writeback into bottom byte so that it can use PG_waiters in a
> later patch.  Move PG_head into bottom byte as well to match with where
> 'order' is moving next.  PG_active and PG_workingset move into the second
> byte to make room for them.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/page-flags.h | 24 +++++++++++-------------
>  1 file changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index aabf50dc71a3..6a0dd94b2460 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -99,13 +99,15 @@
>   */
>  enum pageflags {
>  	PG_locked,		/* Page is locked. Don't touch. */
> +	PG_writeback,		/* Page is under writeback */
>  	PG_referenced,
>  	PG_uptodate,
>  	PG_dirty,
>  	PG_lru,
> +	PG_head,		/* Must be in bit 6 */

Could there be some explanation on "must be in bit 6" here?

Thanks,
Matthew Wilcox Aug. 15, 2023, 8:07 p.m. UTC | #3
On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote:
> On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote:
> > +++ b/include/linux/page-flags.h
> > @@ -99,13 +99,15 @@
> >   */
> >  enum pageflags {
> >  	PG_locked,		/* Page is locked. Don't touch. */
> > +	PG_writeback,		/* Page is under writeback */
> >  	PG_referenced,
> >  	PG_uptodate,
> >  	PG_dirty,
> >  	PG_lru,
> > +	PG_head,		/* Must be in bit 6 */
> 
> Could there be some explanation on "must be in bit 6" here?

Not on this line, no.  You get 40-50 characters to say something useful.
Happy to elaborate further in some other comment or in the commit log,
but not on this line.

The idea behind all of this is that _folio_order moves into the bottom
byte of _flags_1.  Because the order can never be greater than 63 (and
in practice I think the largest we're going to see is about 30 -- a 16GB
hugetlb page on some architectures), we know that PG_head and PG_waiters
will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler
will just load a byte; it won't actually load the word and mask it.

We can't move PG_head any lower, or we'll risk having a tail page with
PG_head set (which can happen with the vmemmmap optimisation, but eugh).
And we don't want to move it any higher because then we'll have a flag
that cannot be reused in a tail page.  Bit 6 is the perfect spot for it.
Yosry Ahmed Aug. 15, 2023, 10:31 p.m. UTC | #4
On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote:
> > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote:
> > > +++ b/include/linux/page-flags.h
> > > @@ -99,13 +99,15 @@
> > >   */
> > >  enum pageflags {
> > >     PG_locked,              /* Page is locked. Don't touch. */
> > > +   PG_writeback,           /* Page is under writeback */
> > >     PG_referenced,
> > >     PG_uptodate,
> > >     PG_dirty,
> > >     PG_lru,
> > > +   PG_head,                /* Must be in bit 6 */
> >
> > Could there be some explanation on "must be in bit 6" here?
>
> Not on this line, no.  You get 40-50 characters to say something useful.
> Happy to elaborate further in some other comment or in the commit log,
> but not on this line.
>
> The idea behind all of this is that _folio_order moves into the bottom
> byte of _flags_1.  Because the order can never be greater than 63 (and
> in practice I think the largest we're going to see is about 30 -- a 16GB
> hugetlb page on some architectures), we know that PG_head and PG_waiters
> will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler
> will just load a byte; it won't actually load the word and mask it.
>
> We can't move PG_head any lower, or we'll risk having a tail page with
> PG_head set (which can happen with the vmemmmap optimisation, but eugh).
> And we don't want to move it any higher because then we'll have a flag
> that cannot be reused in a tail page.  Bit 6 is the perfect spot for it.

Is there some compile time assertion that the order cannot overflow into bit 6?
Matthew Wilcox Aug. 15, 2023, 11:01 p.m. UTC | #5
On Tue, Aug 15, 2023 at 03:31:42PM -0700, Yosry Ahmed wrote:
> On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote:
> > > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote:
> > > > +++ b/include/linux/page-flags.h
> > > > @@ -99,13 +99,15 @@
> > > >   */
> > > >  enum pageflags {
> > > >     PG_locked,              /* Page is locked. Don't touch. */
> > > > +   PG_writeback,           /* Page is under writeback */
> > > >     PG_referenced,
> > > >     PG_uptodate,
> > > >     PG_dirty,
> > > >     PG_lru,
> > > > +   PG_head,                /* Must be in bit 6 */
> > >
> > > Could there be some explanation on "must be in bit 6" here?
> >
> > Not on this line, no.  You get 40-50 characters to say something useful.
> > Happy to elaborate further in some other comment or in the commit log,
> > but not on this line.
> >
> > The idea behind all of this is that _folio_order moves into the bottom
> > byte of _flags_1.  Because the order can never be greater than 63 (and
> > in practice I think the largest we're going to see is about 30 -- a 16GB
> > hugetlb page on some architectures), we know that PG_head and PG_waiters
> > will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler
> > will just load a byte; it won't actually load the word and mask it.
> >
> > We can't move PG_head any lower, or we'll risk having a tail page with
> > PG_head set (which can happen with the vmemmmap optimisation, but eugh).
> > And we don't want to move it any higher because then we'll have a flag
> > that cannot be reused in a tail page.  Bit 6 is the perfect spot for it.
> 
> Is there some compile time assertion that the order cannot overflow into bit 6?

An order-64 folio on a machine with a 4kB page size would be 64
zettabytes.  I intend to retire before we're managing memory in chunks
that large.
Yosry Ahmed Aug. 15, 2023, 11:33 p.m. UTC | #6
On Tue, Aug 15, 2023 at 4:01 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Aug 15, 2023 at 03:31:42PM -0700, Yosry Ahmed wrote:
> > On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote:
> > > > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote:
> > > > > +++ b/include/linux/page-flags.h
> > > > > @@ -99,13 +99,15 @@
> > > > >   */
> > > > >  enum pageflags {
> > > > >     PG_locked,              /* Page is locked. Don't touch. */
> > > > > +   PG_writeback,           /* Page is under writeback */
> > > > >     PG_referenced,
> > > > >     PG_uptodate,
> > > > >     PG_dirty,
> > > > >     PG_lru,
> > > > > +   PG_head,                /* Must be in bit 6 */
> > > >
> > > > Could there be some explanation on "must be in bit 6" here?
> > >
> > > Not on this line, no.  You get 40-50 characters to say something useful.
> > > Happy to elaborate further in some other comment or in the commit log,
> > > but not on this line.
> > >
> > > The idea behind all of this is that _folio_order moves into the bottom
> > > byte of _flags_1.  Because the order can never be greater than 63 (and
> > > in practice I think the largest we're going to see is about 30 -- a 16GB
> > > hugetlb page on some architectures), we know that PG_head and PG_waiters
> > > will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler
> > > will just load a byte; it won't actually load the word and mask it.
> > >
> > > We can't move PG_head any lower, or we'll risk having a tail page with
> > > PG_head set (which can happen with the vmemmmap optimisation, but eugh).
> > > And we don't want to move it any higher because then we'll have a flag
> > > that cannot be reused in a tail page.  Bit 6 is the perfect spot for it.
> >
> > Is there some compile time assertion that the order cannot overflow into bit 6?
>
> An order-64 folio on a machine with a 4kB page size would be 64
> zettabytes.  I intend to retire before we're managing memory in chunks
> that large.

I should have done the math :)
Never say never though :)
diff mbox series

Patch

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index aabf50dc71a3..6a0dd94b2460 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -99,13 +99,15 @@ 
  */
 enum pageflags {
 	PG_locked,		/* Page is locked. Don't touch. */
+	PG_writeback,		/* Page is under writeback */
 	PG_referenced,
 	PG_uptodate,
 	PG_dirty,
 	PG_lru,
+	PG_head,		/* Must be in bit 6 */
+	PG_waiters,		/* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */
 	PG_active,
 	PG_workingset,
-	PG_waiters,		/* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */
 	PG_error,
 	PG_slab,
 	PG_owner_priv_1,	/* Owner use. If pagecache, fs may use*/
@@ -113,8 +115,6 @@  enum pageflags {
 	PG_reserved,
 	PG_private,		/* If pagecache, has fs-private data */
 	PG_private_2,		/* If pagecache, has fs aux data */
-	PG_writeback,		/* Page is under writeback */
-	PG_head,		/* A head page */
 	PG_mappedtodisk,	/* Has blocks allocated on-disk */
 	PG_reclaim,		/* To be reclaimed asap */
 	PG_swapbacked,		/* Page is backed by RAM/swap */
@@ -171,21 +171,19 @@  enum pageflags {
 	/* Remapped by swiotlb-xen. */
 	PG_xen_remapped = PG_owner_priv_1,
 
-#ifdef CONFIG_MEMORY_FAILURE
 	/*
-	 * Compound pages. Stored in first tail page's flags.
-	 * Indicates that at least one subpage is hwpoisoned in the
-	 * THP.
+	 * Flags only valid for compound pages.  Stored in first tail page's
+	 * flags word.
 	 */
-	PG_has_hwpoisoned = PG_error,
-#endif
-
-	/* Is a hugetlb page.  Stored in first tail page. */
-	PG_hugetlb = PG_writeback,
 
-	/* Has a deferred list (may be empty).  First tail page. */
+	/* At least one page is hwpoisoned in the folio.  */
+	PG_has_hwpoisoned = PG_error,
+	/* Belongs to hugetlb */
+	PG_hugetlb = PG_active,
+	/* Has a deferred list (does not indicate whether it is active) */
 	PG_deferred_list = PG_reclaim,
 
+
 	/* non-lru isolated movable page */
 	PG_isolated = PG_reclaim,