Message ID | 20230815032645.1393700-9-willy@infradead.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Remove _folio_dtor and _folio_order | expand |
On Mon, Aug 14, 2023 at 8:27 PM Matthew Wilcox (Oracle) <willy@infradead.org> wrote: > > Move PG_writeback into bottom byte so that it can use PG_waiters in a > later patch. Move PG_head into bottom byte as well to match with where > 'order' is moving next. Sorry if that's obvious, but why do we want to move PG_head to where 'order' is moving? I thought we would want to avoid an overlap to avoid mistaking the first tail page of a high-order compound page as a head page?
On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote: > Move PG_writeback into bottom byte so that it can use PG_waiters in a > later patch. Move PG_head into bottom byte as well to match with where > 'order' is moving next. PG_active and PG_workingset move into the second > byte to make room for them. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > include/linux/page-flags.h | 24 +++++++++++------------- > 1 file changed, 11 insertions(+), 13 deletions(-) > > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h > index aabf50dc71a3..6a0dd94b2460 100644 > --- a/include/linux/page-flags.h > +++ b/include/linux/page-flags.h > @@ -99,13 +99,15 @@ > */ > enum pageflags { > PG_locked, /* Page is locked. Don't touch. */ > + PG_writeback, /* Page is under writeback */ > PG_referenced, > PG_uptodate, > PG_dirty, > PG_lru, > + PG_head, /* Must be in bit 6 */ Could there be some explanation on "must be in bit 6" here? Thanks,
On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote: > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote: > > +++ b/include/linux/page-flags.h > > @@ -99,13 +99,15 @@ > > */ > > enum pageflags { > > PG_locked, /* Page is locked. Don't touch. */ > > + PG_writeback, /* Page is under writeback */ > > PG_referenced, > > PG_uptodate, > > PG_dirty, > > PG_lru, > > + PG_head, /* Must be in bit 6 */ > > Could there be some explanation on "must be in bit 6" here? Not on this line, no. You get 40-50 characters to say something useful. Happy to elaborate further in some other comment or in the commit log, but not on this line. The idea behind all of this is that _folio_order moves into the bottom byte of _flags_1. Because the order can never be greater than 63 (and in practice I think the largest we're going to see is about 30 -- a 16GB hugetlb page on some architectures), we know that PG_head and PG_waiters will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler will just load a byte; it won't actually load the word and mask it. We can't move PG_head any lower, or we'll risk having a tail page with PG_head set (which can happen with the vmemmmap optimisation, but eugh). And we don't want to move it any higher because then we'll have a flag that cannot be reused in a tail page. Bit 6 is the perfect spot for it.
On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote: > > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote: > > > +++ b/include/linux/page-flags.h > > > @@ -99,13 +99,15 @@ > > > */ > > > enum pageflags { > > > PG_locked, /* Page is locked. Don't touch. */ > > > + PG_writeback, /* Page is under writeback */ > > > PG_referenced, > > > PG_uptodate, > > > PG_dirty, > > > PG_lru, > > > + PG_head, /* Must be in bit 6 */ > > > > Could there be some explanation on "must be in bit 6" here? > > Not on this line, no. You get 40-50 characters to say something useful. > Happy to elaborate further in some other comment or in the commit log, > but not on this line. > > The idea behind all of this is that _folio_order moves into the bottom > byte of _flags_1. Because the order can never be greater than 63 (and > in practice I think the largest we're going to see is about 30 -- a 16GB > hugetlb page on some architectures), we know that PG_head and PG_waiters > will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler > will just load a byte; it won't actually load the word and mask it. > > We can't move PG_head any lower, or we'll risk having a tail page with > PG_head set (which can happen with the vmemmmap optimisation, but eugh). > And we don't want to move it any higher because then we'll have a flag > that cannot be reused in a tail page. Bit 6 is the perfect spot for it. Is there some compile time assertion that the order cannot overflow into bit 6?
On Tue, Aug 15, 2023 at 03:31:42PM -0700, Yosry Ahmed wrote: > On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote: > > > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote: > > > > +++ b/include/linux/page-flags.h > > > > @@ -99,13 +99,15 @@ > > > > */ > > > > enum pageflags { > > > > PG_locked, /* Page is locked. Don't touch. */ > > > > + PG_writeback, /* Page is under writeback */ > > > > PG_referenced, > > > > PG_uptodate, > > > > PG_dirty, > > > > PG_lru, > > > > + PG_head, /* Must be in bit 6 */ > > > > > > Could there be some explanation on "must be in bit 6" here? > > > > Not on this line, no. You get 40-50 characters to say something useful. > > Happy to elaborate further in some other comment or in the commit log, > > but not on this line. > > > > The idea behind all of this is that _folio_order moves into the bottom > > byte of _flags_1. Because the order can never be greater than 63 (and > > in practice I think the largest we're going to see is about 30 -- a 16GB > > hugetlb page on some architectures), we know that PG_head and PG_waiters > > will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler > > will just load a byte; it won't actually load the word and mask it. > > > > We can't move PG_head any lower, or we'll risk having a tail page with > > PG_head set (which can happen with the vmemmmap optimisation, but eugh). > > And we don't want to move it any higher because then we'll have a flag > > that cannot be reused in a tail page. Bit 6 is the perfect spot for it. > > Is there some compile time assertion that the order cannot overflow into bit 6? An order-64 folio on a machine with a 4kB page size would be 64 zettabytes. I intend to retire before we're managing memory in chunks that large.
On Tue, Aug 15, 2023 at 4:01 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Aug 15, 2023 at 03:31:42PM -0700, Yosry Ahmed wrote: > > On Tue, Aug 15, 2023 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Tue, Aug 15, 2023 at 03:24:16PM -0400, Peter Xu wrote: > > > > On Tue, Aug 15, 2023 at 04:26:44AM +0100, Matthew Wilcox (Oracle) wrote: > > > > > +++ b/include/linux/page-flags.h > > > > > @@ -99,13 +99,15 @@ > > > > > */ > > > > > enum pageflags { > > > > > PG_locked, /* Page is locked. Don't touch. */ > > > > > + PG_writeback, /* Page is under writeback */ > > > > > PG_referenced, > > > > > PG_uptodate, > > > > > PG_dirty, > > > > > PG_lru, > > > > > + PG_head, /* Must be in bit 6 */ > > > > > > > > Could there be some explanation on "must be in bit 6" here? > > > > > > Not on this line, no. You get 40-50 characters to say something useful. > > > Happy to elaborate further in some other comment or in the commit log, > > > but not on this line. > > > > > > The idea behind all of this is that _folio_order moves into the bottom > > > byte of _flags_1. Because the order can never be greater than 63 (and > > > in practice I think the largest we're going to see is about 30 -- a 16GB > > > hugetlb page on some architectures), we know that PG_head and PG_waiters > > > will be clear, so we can write (folio->_flags_1 & 0xff) and the compiler > > > will just load a byte; it won't actually load the word and mask it. > > > > > > We can't move PG_head any lower, or we'll risk having a tail page with > > > PG_head set (which can happen with the vmemmmap optimisation, but eugh). > > > And we don't want to move it any higher because then we'll have a flag > > > that cannot be reused in a tail page. Bit 6 is the perfect spot for it. > > > > Is there some compile time assertion that the order cannot overflow into bit 6? > > An order-64 folio on a machine with a 4kB page size would be 64 > zettabytes. I intend to retire before we're managing memory in chunks > that large. I should have done the math :) Never say never though :)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index aabf50dc71a3..6a0dd94b2460 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -99,13 +99,15 @@ */ enum pageflags { PG_locked, /* Page is locked. Don't touch. */ + PG_writeback, /* Page is under writeback */ PG_referenced, PG_uptodate, PG_dirty, PG_lru, + PG_head, /* Must be in bit 6 */ + PG_waiters, /* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */ PG_active, PG_workingset, - PG_waiters, /* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */ PG_error, PG_slab, PG_owner_priv_1, /* Owner use. If pagecache, fs may use*/ @@ -113,8 +115,6 @@ enum pageflags { PG_reserved, PG_private, /* If pagecache, has fs-private data */ PG_private_2, /* If pagecache, has fs aux data */ - PG_writeback, /* Page is under writeback */ - PG_head, /* A head page */ PG_mappedtodisk, /* Has blocks allocated on-disk */ PG_reclaim, /* To be reclaimed asap */ PG_swapbacked, /* Page is backed by RAM/swap */ @@ -171,21 +171,19 @@ enum pageflags { /* Remapped by swiotlb-xen. */ PG_xen_remapped = PG_owner_priv_1, -#ifdef CONFIG_MEMORY_FAILURE /* - * Compound pages. Stored in first tail page's flags. - * Indicates that at least one subpage is hwpoisoned in the - * THP. + * Flags only valid for compound pages. Stored in first tail page's + * flags word. */ - PG_has_hwpoisoned = PG_error, -#endif - - /* Is a hugetlb page. Stored in first tail page. */ - PG_hugetlb = PG_writeback, - /* Has a deferred list (may be empty). First tail page. */ + /* At least one page is hwpoisoned in the folio. */ + PG_has_hwpoisoned = PG_error, + /* Belongs to hugetlb */ + PG_hugetlb = PG_active, + /* Has a deferred list (does not indicate whether it is active) */ PG_deferred_list = PG_reclaim, + /* non-lru isolated movable page */ PG_isolated = PG_reclaim,
Move PG_writeback into bottom byte so that it can use PG_waiters in a later patch. Move PG_head into bottom byte as well to match with where 'order' is moving next. PG_active and PG_workingset move into the second byte to make room for them. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- include/linux/page-flags.h | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-)