Message ID | 20240527141454.113132-3-david@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: page_type, zsmalloc and page_mapcount_reset() | expand |
On Mon, May 27, 2024 at 04:14:50PM +0200, David Hildenbrand wrote: > As long as the owner sets a page type first, we can allow reuse of the > lower 18 bit: sufficient to store an offset into a 64 KiB page, which You say 18 here and 16 below. > is the maximum base page size in *common* configurations (ignoring the > 256 KiB variant). Restrict it to the head page. > > We'll use that for zsmalloc next, to set a proper type while still > reusing that field to store information (offset into a base page) that > cannot go elsewhere for now. > > Fear of running out of bits for storing the actual type? Actually, we > don't need one bit per type, we could store a single value instead. > Further, we could likely limit PAGE_TYPE_BASE to a single (highest) bit. We could, but it's more instructions to check. > +++ b/include/linux/page-flags.h > @@ -945,14 +945,18 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned) > */ > > #define PAGE_TYPE_BASE 0xf0000000 > -/* Reserve 0x0000007f to catch underflows of _mapcount */ > -#define PAGE_MAPCOUNT_RESERVE -128 > -#define PG_buddy 0x00000080 > -#define PG_offline 0x00000100 > -#define PG_table 0x00000200 > -#define PG_guard 0x00000400 > -#define PG_hugetlb 0x00000800 > -#define PG_slab 0x00001000 > +/* > + * Reserve 0x0000ffff to catch underflows of _mapcount and > + * allow owners that set a type to reuse the lower 16 bit for their own > + * purposes. > + */ > +#define PAGE_MAPCOUNT_RESERVE -65536 I think my original comment was misleading. This should be: * Reserve 0xffff0000 - 0xfffffffe to catch _mapcount underflow. How about we start at the top end and let people extend down? ie: #define PAGE_TYPE_BASE 0x80000000 #define PG_buddy 0x40000000 #define PG_offline 0x20000000 #define PG_table 0x10000000 #define PG_guard 0x08000000 #define PG_hugetlb 0x04000000 #define PG_slab 0x02000000 #define PAGE_MAPCOUNT_RESERVE (~0x0000ffff) Now we can see that we have 9 flags remaining, which should last until we can have proper memdesc typing.
Am 27.05.24 um 17:26 schrieb Matthew Wilcox: > On Mon, May 27, 2024 at 04:14:50PM +0200, David Hildenbrand wrote: >> As long as the owner sets a page type first, we can allow reuse of the >> lower 18 bit: sufficient to store an offset into a 64 KiB page, which > > You say 18 here and 16 below. Thanks, missed to fixup one instance after going back and forth. > >> is the maximum base page size in *common* configurations (ignoring the >> 256 KiB variant). Restrict it to the head page. >> >> We'll use that for zsmalloc next, to set a proper type while still >> reusing that field to store information (offset into a base page) that >> cannot go elsewhere for now. >> >> Fear of running out of bits for storing the actual type? Actually, we >> don't need one bit per type, we could store a single value instead. >> Further, we could likely limit PAGE_TYPE_BASE to a single (highest) bit. > > We could, but it's more instructions to check. Maybe, and maybe not sufficient more that we care. I was thinking of something like the following (probably broken but you should get the idea): /* * If the _mapcount is negative, we might store a page type. The * page_type field corresponds to the most significant byte of the * _mapcount field. As the mapcount is initialized to -1, we have no * type as defaults. We have plenty of room to underflow the mapcount * before we would end up indicating a valid page_type. */ #define PAGE_TYPE_BASE 0x80 enum page_type { PT_buddy = PAGE_TYPE_BASE, PT_offline, PT_table, PT_guard, PT_hugetlb, PT_slab, /* we must forbid page_type == -1 */ PT_unusable = 0xff }; In struct page: union { atomic_t _mapcount; #if __BYTE_ORDER == __BIG_ENDIAN struct { uint16_t page_type_data; uint8_t page_type_reserved; uint8_t page_type; }; #else struct { uint8_t page_type; uint8_t page_type_reserved; uint16_t page_type_data; }; #end }; #define PageType(page, type) (page->page_type == type) Once could maybe also change page_has_type to simply work on the fact that the highest bit must be set and any other bit of the type must be clear: static inline int page_has_type(const struct page *page) { return (page->page_type & PAGE_TYPE_BASE) && page->page_type != 0xffff; } But just some thought. > >> +++ b/include/linux/page-flags.h >> @@ -945,14 +945,18 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned) >> */ >> >> #define PAGE_TYPE_BASE 0xf0000000 >> -/* Reserve 0x0000007f to catch underflows of _mapcount */ >> -#define PAGE_MAPCOUNT_RESERVE -128 >> -#define PG_buddy 0x00000080 >> -#define PG_offline 0x00000100 >> -#define PG_table 0x00000200 >> -#define PG_guard 0x00000400 >> -#define PG_hugetlb 0x00000800 >> -#define PG_slab 0x00001000 >> +/* >> + * Reserve 0x0000ffff to catch underflows of _mapcount and >> + * allow owners that set a type to reuse the lower 16 bit for their own >> + * purposes. >> + */ >> +#define PAGE_MAPCOUNT_RESERVE -65536 > > I think my original comment was misleading. This should be: > > * Reserve 0xffff0000 - 0xfffffffe to catch _mapcount underflow. Makes sense. > > How about we start at the top end and let people extend down? ie: > > #define PAGE_TYPE_BASE 0x80000000 > #define PG_buddy 0x40000000 > #define PG_offline 0x20000000 > #define PG_table 0x10000000 > #define PG_guard 0x08000000 > #define PG_hugetlb 0x04000000 > #define PG_slab 0x02000000 > #define PAGE_MAPCOUNT_RESERVE (~0x0000ffff) > > Now we can see that we have 9 flags remaining, which should last until > we can have proper memdesc typing. Also works for me. Thanks!
--- > As long as the owner sets a page type first, we can allow reuse of the > lower 18 bit: sufficient to store an offset into a 64 KiB page, which > is the maximum base page size in *common* configurations (ignoring the > 256 KiB variant). Restrict it to the head page. > > We'll use that for zsmalloc next, to set a proper type while still > reusing that field to store information (offset into a base page) that > cannot go elsewhere for now. > > Fear of running out of bits for storing the actual type? Actually, we > don't need one bit per type, we could store a single value instead. > Further, we could likely limit PAGE_TYPE_BASE to a single (highest) bit. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > include/linux/mm_types.h | 5 +++++ > include/linux/page-flags.h | 20 ++++++++++++-------- > 2 files changed, 17 insertions(+), 8 deletions(-) > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 6b2aeba792c4..598cfedbbfa0 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -157,6 +157,11 @@ struct page { > * > * See page-flags.h for a list of page types which are currently > * stored here. > + * > + * Owners of typed folios may reuse the lower 16 bit of the > + * head page page_type field after setting the page type, > + * but must reset these 16 bit to -1 before clearing the > + * page type. > */ > unsigned int page_type; > > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h > index 104078afe0b1..b43e380ffa0b 100644 > --- a/include/linux/page-flags.h > +++ b/include/linux/page-flags.h > @@ -945,14 +945,18 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned) > */ > > #define PAGE_TYPE_BASE 0xf0000000 > -/* Reserve 0x0000007f to catch underflows of _mapcount */ > -#define PAGE_MAPCOUNT_RESERVE -128 > -#define PG_buddy 0x00000080 > -#define PG_offline 0x00000100 > -#define PG_table 0x00000200 > -#define PG_guard 0x00000400 > -#define PG_hugetlb 0x00000800 > -#define PG_slab 0x00001000 > +/* > + * Reserve 0x0000ffff to catch underflows of _mapcount and > + * allow owners that set a type to reuse the lower 16 bit for their own > + * purposes. > + */ > +#define PAGE_MAPCOUNT_RESERVE -65536 > +#define PG_buddy 0x00010000 > +#define PG_offline 0x00020000 > +#define PG_table 0x00040000 > +#define PG_guard 0x00080000 > +#define PG_hugetlb 0x00100800 Every PG_XX occupies one bit in my understanding. But why PG_hugetlb occupies two bits? > +#define PG_slab 0x00200000 > > #define PageType(page, flag) \ > ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
On 29.05.24 17:55, wang wei wrote: > --- >> As long as the owner sets a page type first, we can allow reuse of the >> lower 18 bit: sufficient to store an offset into a 64 KiB page, which >> is the maximum base page size in *common* configurations (ignoring the >> 256 KiB variant). Restrict it to the head page. >> >> We'll use that for zsmalloc next, to set a proper type while still >> reusing that field to store information (offset into a base page) that >> cannot go elsewhere for now. >> >> Fear of running out of bits for storing the actual type? Actually, we >> don't need one bit per type, we could store a single value instead. >> Further, we could likely limit PAGE_TYPE_BASE to a single (highest) bit. >> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> include/linux/mm_types.h | 5 +++++ >> include/linux/page-flags.h | 20 ++++++++++++-------- >> 2 files changed, 17 insertions(+), 8 deletions(-) >> >> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h >> index 6b2aeba792c4..598cfedbbfa0 100644 >> --- a/include/linux/mm_types.h >> +++ b/include/linux/mm_types.h >> @@ -157,6 +157,11 @@ struct page { >> * >> * See page-flags.h for a list of page types which are currently >> * stored here. >> + * >> + * Owners of typed folios may reuse the lower 16 bit of the >> + * head page page_type field after setting the page type, >> + * but must reset these 16 bit to -1 before clearing the >> + * page type. >> */ >> unsigned int page_type; >> >> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h >> index 104078afe0b1..b43e380ffa0b 100644 >> --- a/include/linux/page-flags.h >> +++ b/include/linux/page-flags.h >> @@ -945,14 +945,18 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned) >> */ >> >> #define PAGE_TYPE_BASE 0xf0000000 >> -/* Reserve 0x0000007f to catch underflows of _mapcount */ >> -#define PAGE_MAPCOUNT_RESERVE -128 >> -#define PG_buddy 0x00000080 >> -#define PG_offline 0x00000100 >> -#define PG_table 0x00000200 >> -#define PG_guard 0x00000400 >> -#define PG_hugetlb 0x00000800 >> -#define PG_slab 0x00001000 >> +/* >> + * Reserve 0x0000ffff to catch underflows of _mapcount and >> + * allow owners that set a type to reuse the lower 16 bit for their own >> + * purposes. >> + */ >> +#define PAGE_MAPCOUNT_RESERVE -65536 >> +#define PG_buddy 0x00010000 >> +#define PG_offline 0x00020000 >> +#define PG_table 0x00040000 >> +#define PG_guard 0x00080000 >> +#define PG_hugetlb 0x00100800 > > Every PG_XX occupies one bit in my understanding. But why PG_hugetlb occupies two bits? Because it's wrong (although not harmful). Same issue in v2, fat fingers. Thanks for pointing that out!
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6b2aeba792c4..598cfedbbfa0 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -157,6 +157,11 @@ struct page { * * See page-flags.h for a list of page types which are currently * stored here. + * + * Owners of typed folios may reuse the lower 16 bit of the + * head page page_type field after setting the page type, + * but must reset these 16 bit to -1 before clearing the + * page type. */ unsigned int page_type; diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 104078afe0b1..b43e380ffa0b 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -945,14 +945,18 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned) */ #define PAGE_TYPE_BASE 0xf0000000 -/* Reserve 0x0000007f to catch underflows of _mapcount */ -#define PAGE_MAPCOUNT_RESERVE -128 -#define PG_buddy 0x00000080 -#define PG_offline 0x00000100 -#define PG_table 0x00000200 -#define PG_guard 0x00000400 -#define PG_hugetlb 0x00000800 -#define PG_slab 0x00001000 +/* + * Reserve 0x0000ffff to catch underflows of _mapcount and + * allow owners that set a type to reuse the lower 16 bit for their own + * purposes. + */ +#define PAGE_MAPCOUNT_RESERVE -65536 +#define PG_buddy 0x00010000 +#define PG_offline 0x00020000 +#define PG_table 0x00040000 +#define PG_guard 0x00080000 +#define PG_hugetlb 0x00100800 +#define PG_slab 0x00200000 #define PageType(page, flag) \ ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
As long as the owner sets a page type first, we can allow reuse of the lower 18 bit: sufficient to store an offset into a 64 KiB page, which is the maximum base page size in *common* configurations (ignoring the 256 KiB variant). Restrict it to the head page. We'll use that for zsmalloc next, to set a proper type while still reusing that field to store information (offset into a base page) that cannot go elsewhere for now. Fear of running out of bits for storing the actual type? Actually, we don't need one bit per type, we could store a single value instead. Further, we could likely limit PAGE_TYPE_BASE to a single (highest) bit. Signed-off-by: David Hildenbrand <david@redhat.com> --- include/linux/mm_types.h | 5 +++++ include/linux/page-flags.h | 20 ++++++++++++-------- 2 files changed, 17 insertions(+), 8 deletions(-)