diff mbox series

[RFC,2/6] mm: convert PG_balloon to PG_offline

Message ID 20181114211704.6381-3-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series mm/kdump: allow to exclude pages that are logically offline | expand

Commit Message

David Hildenbrand Nov. 14, 2018, 9:17 p.m. UTC
PG_balloon was introduced to implement page migration/compaction for pages
inflated in virtio-balloon. Nowadays, it is only a marker that a page is
part of virtio-balloon and therefore logically offline.

We also want to make use of this flag in other balloon drivers - for
inflated pages or when onlining a section but keeping some pages offline
(e.g. used right now by XEN and Hyper-V via set_online_page_callback()).

We are going to expose this flag to dump tools like makedumpfile. But
instead of exposing PG_balloon, let's generalize the concept of marking
pages as logically offline, so it can be reused for other purposes
later on.

Rename PG_balloon to PG_offline. This is an indicator that the page is
logically offline, the content stale and that it should not be touched
(e.g. a hypervisor would have to allocate backing storage in order for the
guest to dump an unused page).  We can then e.g. exclude such pages from
dumps.

In following patches, we will make use of this bit also in other balloon
drivers.  While at it, document PGTABLE.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/admin-guide/mm/pagemap.rst |  6 ++++++
 fs/proc/page.c                           |  4 ++--
 include/linux/balloon_compaction.h       |  8 ++++----
 include/linux/page-flags.h               | 11 +++++++----
 include/uapi/linux/kernel-page-flags.h   |  1 +
 tools/vm/page-types.c                    |  1 +
 6 files changed, 21 insertions(+), 10 deletions(-)

Comments

Matthew Wilcox (Oracle) Nov. 14, 2018, 10:23 p.m. UTC | #1
On Wed, Nov 14, 2018 at 10:17:00PM +0100, David Hildenbrand wrote:
> Rename PG_balloon to PG_offline. This is an indicator that the page is
> logically offline, the content stale and that it should not be touched
> (e.g. a hypervisor would have to allocate backing storage in order for the
> guest to dump an unused page).  We can then e.g. exclude such pages from
> dumps.
> 
> In following patches, we will make use of this bit also in other balloon
> drivers.  While at it, document PGTABLE.

Thank you for documenting PGTABLE.  I didn't realise I also had this
document to update when I added PGTABLE.

> +++ b/Documentation/admin-guide/mm/pagemap.rst
> @@ -78,6 +78,8 @@ number of times a page is mapped.
>      23. BALLOON
>      24. ZERO_PAGE
>      25. IDLE
> +    26. PGTABLE
> +    27. OFFLINE

So the offline *user* bit is new ... even though the *kernel* bit
just renames the balloon bit.  I'm not sure how I feel about this.
I'm going to think about it some more.  Could you share your decision
process with us?

I have no objection to renaming the balloon bit inside the kernel; I
think that's a wise idea.  I'm just not sure whether we should rename
the user balloon bit rather than adding a new bit.
David Hildenbrand Nov. 14, 2018, 10:49 p.m. UTC | #2
On 14.11.18 23:23, Matthew Wilcox wrote:
> On Wed, Nov 14, 2018 at 10:17:00PM +0100, David Hildenbrand wrote:
>> Rename PG_balloon to PG_offline. This is an indicator that the page is
>> logically offline, the content stale and that it should not be touched
>> (e.g. a hypervisor would have to allocate backing storage in order for the
>> guest to dump an unused page).  We can then e.g. exclude such pages from
>> dumps.
>>
>> In following patches, we will make use of this bit also in other balloon
>> drivers.  While at it, document PGTABLE.
> 
> Thank you for documenting PGTABLE.  I didn't realise I also had this
> document to update when I added PGTABLE.

Thank you for looking into this :)

> 
>> +++ b/Documentation/admin-guide/mm/pagemap.rst
>> @@ -78,6 +78,8 @@ number of times a page is mapped.
>>      23. BALLOON
>>      24. ZERO_PAGE
>>      25. IDLE
>> +    26. PGTABLE
>> +    27. OFFLINE
> 
> So the offline *user* bit is new ... even though the *kernel* bit
> just renames the balloon bit.  I'm not sure how I feel about this.
> I'm going to think about it some more.  Could you share your decision
> process with us?

BALLOON was/is documented as

"23 - BALLOON
    balloon compaction page
"

and only includes all virtio-ballon pages after the non-lru migration
feature has been implemented for ballooned pages. Since then, this flag
does basically no longer stands for what it actually was supposed to do.

To not break uapi I decided to not rename it but instead to add a new flag.

> 
> I have no objection to renaming the balloon bit inside the kernel; I
> think that's a wise idea.  I'm just not sure whether we should rename
> the user balloon bit rather than adding a new bit.
> 

Can we rename without breaking uapi?
Mike Rapoport Nov. 15, 2018, 2:07 a.m. UTC | #3
On Wed, Nov 14, 2018 at 11:49:15PM +0100, David Hildenbrand wrote:
> On 14.11.18 23:23, Matthew Wilcox wrote:
> > On Wed, Nov 14, 2018 at 10:17:00PM +0100, David Hildenbrand wrote:
> >> Rename PG_balloon to PG_offline. This is an indicator that the page is
> >> logically offline, the content stale and that it should not be touched
> >> (e.g. a hypervisor would have to allocate backing storage in order for the
> >> guest to dump an unused page).  We can then e.g. exclude such pages from
> >> dumps.
> >>
> >> In following patches, we will make use of this bit also in other balloon
> >> drivers.  While at it, document PGTABLE.
> > 
> > Thank you for documenting PGTABLE.  I didn't realise I also had this
> > document to update when I added PGTABLE.
> 
> Thank you for looking into this :)
> 
> > 
> >> +++ b/Documentation/admin-guide/mm/pagemap.rst
> >> @@ -78,6 +78,8 @@ number of times a page is mapped.
> >>      23. BALLOON
> >>      24. ZERO_PAGE
> >>      25. IDLE
> >> +    26. PGTABLE
> >> +    27. OFFLINE
> > 
> > So the offline *user* bit is new ... even though the *kernel* bit
> > just renames the balloon bit.  I'm not sure how I feel about this.
> > I'm going to think about it some more.  Could you share your decision
> > process with us?
> 
> BALLOON was/is documented as
> 
> "23 - BALLOON
>     balloon compaction page
> "
> 
> and only includes all virtio-ballon pages after the non-lru migration
> feature has been implemented for ballooned pages. Since then, this flag
> does basically no longer stands for what it actually was supposed to do.

Perhaps I missing something, but how the user should interpret "23" when he
reads /proc/kpageflags?

> To not break uapi I decided to not rename it but instead to add a new flag.

I've got a feeling that uapi was anyway changed for the BALLON flag
meaning.
 
> > 
> > I have no objection to renaming the balloon bit inside the kernel; I
> > think that's a wise idea.  I'm just not sure whether we should rename
> > the user balloon bit rather than adding a new bit.
> > 
> 
> Can we rename without breaking uapi?
> 
> -- 
> 
> Thanks,
> 
> David / dhildenb
>
David Hildenbrand Nov. 15, 2018, 9:21 a.m. UTC | #4
On 15.11.18 03:07, Mike Rapoport wrote:
> On Wed, Nov 14, 2018 at 11:49:15PM +0100, David Hildenbrand wrote:
>> On 14.11.18 23:23, Matthew Wilcox wrote:
>>> On Wed, Nov 14, 2018 at 10:17:00PM +0100, David Hildenbrand wrote:
>>>> Rename PG_balloon to PG_offline. This is an indicator that the page is
>>>> logically offline, the content stale and that it should not be touched
>>>> (e.g. a hypervisor would have to allocate backing storage in order for the
>>>> guest to dump an unused page).  We can then e.g. exclude such pages from
>>>> dumps.
>>>>
>>>> In following patches, we will make use of this bit also in other balloon
>>>> drivers.  While at it, document PGTABLE.
>>>
>>> Thank you for documenting PGTABLE.  I didn't realise I also had this
>>> document to update when I added PGTABLE.
>>
>> Thank you for looking into this :)
>>
>>>
>>>> +++ b/Documentation/admin-guide/mm/pagemap.rst
>>>> @@ -78,6 +78,8 @@ number of times a page is mapped.
>>>>      23. BALLOON
>>>>      24. ZERO_PAGE
>>>>      25. IDLE
>>>> +    26. PGTABLE
>>>> +    27. OFFLINE
>>>
>>> So the offline *user* bit is new ... even though the *kernel* bit
>>> just renames the balloon bit.  I'm not sure how I feel about this.
>>> I'm going to think about it some more.  Could you share your decision
>>> process with us?
>>
>> BALLOON was/is documented as
>>
>> "23 - BALLOON
>>     balloon compaction page
>> "
>>
>> and only includes all virtio-ballon pages after the non-lru migration
>> feature has been implemented for ballooned pages. Since then, this flag
>> does basically no longer stands for what it actually was supposed to do.
> 
> Perhaps I missing something, but how the user should interpret "23" when he
> reads /proc/kpageflags?

Looking at the history in more detail:

commit 09316c09dde33aae14f34489d9e3d243ec0d5938
Author: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Date:   Thu Oct 9 15:29:32 2014 -0700

    mm/balloon_compaction: add vmstat counters and kpageflags bit

    Always mark pages with PageBalloon even if balloon compaction is
disabled
    and expose this mark in /proc/kpageflags as KPF_BALLOON.


So KPF_BALLOON was exposed when virtio-balloon pages were always marked
with PG_balloon. So the documentation is actually wrong ("balloon page"
vs. "balloon compaction page").

I have no idea who actually used that information. I suspect this was
just some debugging aid.

> 
>> To not break uapi I decided to not rename it but instead to add a new flag.
> 
> I've got a feeling that uapi was anyway changed for the BALLON flag
> meaning.

Yes. If we *replace* KPF_BALLOON by KPF_OFFLINE

a) Some applications might no longer compile (I guess that's ok)
b) Some old applications will treat KPF_OFFLINE like KPF_BALLOON (which
should at least for virtio-balloon usage until now be fine - it is just
more generic)

So I guess it's up to Maintainers/Matthew to decide :)
Michal Hocko Nov. 15, 2018, 12:19 p.m. UTC | #5
[Cc Konstantin - the patch is http://lkml.kernel.org/r/20181114211704.6381-3-david@redhat.com]

On Thu 15-11-18 10:21:13, David Hildenbrand wrote:
> On 15.11.18 03:07, Mike Rapoport wrote:
> > On Wed, Nov 14, 2018 at 11:49:15PM +0100, David Hildenbrand wrote:
> >> On 14.11.18 23:23, Matthew Wilcox wrote:
> >>> On Wed, Nov 14, 2018 at 10:17:00PM +0100, David Hildenbrand wrote:
> >>>> Rename PG_balloon to PG_offline. This is an indicator that the page is
> >>>> logically offline, the content stale and that it should not be touched
> >>>> (e.g. a hypervisor would have to allocate backing storage in order for the
> >>>> guest to dump an unused page).  We can then e.g. exclude such pages from
> >>>> dumps.
> >>>>
> >>>> In following patches, we will make use of this bit also in other balloon
> >>>> drivers.  While at it, document PGTABLE.
> >>>
> >>> Thank you for documenting PGTABLE.  I didn't realise I also had this
> >>> document to update when I added PGTABLE.
> >>
> >> Thank you for looking into this :)
> >>
> >>>
> >>>> +++ b/Documentation/admin-guide/mm/pagemap.rst
> >>>> @@ -78,6 +78,8 @@ number of times a page is mapped.
> >>>>      23. BALLOON
> >>>>      24. ZERO_PAGE
> >>>>      25. IDLE
> >>>> +    26. PGTABLE
> >>>> +    27. OFFLINE
> >>>
> >>> So the offline *user* bit is new ... even though the *kernel* bit
> >>> just renames the balloon bit.  I'm not sure how I feel about this.
> >>> I'm going to think about it some more.  Could you share your decision
> >>> process with us?
> >>
> >> BALLOON was/is documented as
> >>
> >> "23 - BALLOON
> >>     balloon compaction page
> >> "
> >>
> >> and only includes all virtio-ballon pages after the non-lru migration
> >> feature has been implemented for ballooned pages. Since then, this flag
> >> does basically no longer stands for what it actually was supposed to do.
> > 
> > Perhaps I missing something, but how the user should interpret "23" when he
> > reads /proc/kpageflags?
> 
> Looking at the history in more detail:
> 
> commit 09316c09dde33aae14f34489d9e3d243ec0d5938
> Author: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
> Date:   Thu Oct 9 15:29:32 2014 -0700
> 
>     mm/balloon_compaction: add vmstat counters and kpageflags bit
> 
>     Always mark pages with PageBalloon even if balloon compaction is
> disabled
>     and expose this mark in /proc/kpageflags as KPF_BALLOON.
> 
> 
> So KPF_BALLOON was exposed when virtio-balloon pages were always marked
> with PG_balloon. So the documentation is actually wrong ("balloon page"
> vs. "balloon compaction page").
> 
> I have no idea who actually used that information. I suspect this was
> just some debugging aid.
> 
> > 
> >> To not break uapi I decided to not rename it but instead to add a new flag.
> > 
> > I've got a feeling that uapi was anyway changed for the BALLON flag
> > meaning.
> 
> Yes. If we *replace* KPF_BALLOON by KPF_OFFLINE
> 
> a) Some applications might no longer compile (I guess that's ok)
> b) Some old applications will treat KPF_OFFLINE like KPF_BALLOON (which
> should at least for virtio-balloon usage until now be fine - it is just
> more generic)

I do not think any compilation could break. If the semantic of the flag
is preserved then everything should work as expected.
diff mbox series

Patch

diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index 3f7bade2c231..9afd6bdc424b 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -78,6 +78,8 @@  number of times a page is mapped.
     23. BALLOON
     24. ZERO_PAGE
     25. IDLE
+    26. PGTABLE
+    27. OFFLINE
 
  * ``/proc/kpagecgroup``.  This file contains a 64-bit inode number of the
    memory cgroup each page is charged to, indexed by PFN. Only available when
@@ -128,6 +130,10 @@  Short descriptions to the page flags
     Note that this flag may be stale in case the page was accessed via
     a PTE. To make sure the flag is up-to-date one has to read
     ``/sys/kernel/mm/page_idle/bitmap`` first.
+26 - PGTABLE
+    page is in use as a page table
+27 - OFFLINE
+    page is logically offline
 
 IO related page flags
 ---------------------
diff --git a/fs/proc/page.c b/fs/proc/page.c
index 6c517b11acf8..378401af4d9d 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -152,8 +152,8 @@  u64 stable_page_flags(struct page *page)
 	else if (page_count(page) == 0 && is_free_buddy_page(page))
 		u |= 1 << KPF_BUDDY;
 
-	if (PageBalloon(page))
-		u |= 1 << KPF_BALLOON;
+	if (PageOffline(page))
+		u |= 1 << KPF_OFFLINE;
 	if (PageTable(page))
 		u |= 1 << KPF_PGTABLE;
 
diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index cbe50da5a59d..f111c780ef1d 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -95,7 +95,7 @@  extern int balloon_page_migrate(struct address_space *mapping,
 static inline void balloon_page_insert(struct balloon_dev_info *balloon,
 				       struct page *page)
 {
-	__SetPageBalloon(page);
+	__SetPageOffline(page);
 	__SetPageMovable(page, balloon->inode->i_mapping);
 	set_page_private(page, (unsigned long)balloon);
 	list_add(&page->lru, &balloon->pages);
@@ -111,7 +111,7 @@  static inline void balloon_page_insert(struct balloon_dev_info *balloon,
  */
 static inline void balloon_page_delete(struct page *page)
 {
-	__ClearPageBalloon(page);
+	__ClearPageOffline(page);
 	__ClearPageMovable(page);
 	set_page_private(page, 0);
 	/*
@@ -141,13 +141,13 @@  static inline gfp_t balloon_mapping_gfp_mask(void)
 static inline void balloon_page_insert(struct balloon_dev_info *balloon,
 				       struct page *page)
 {
-	__SetPageBalloon(page);
+	__SetPageOffline(page);
 	list_add(&page->lru, &balloon->pages);
 }
 
 static inline void balloon_page_delete(struct page *page)
 {
-	__ClearPageBalloon(page);
+	__ClearPageOffline(page);
 	list_del(&page->lru);
 }
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 50ce1bddaf56..f91da3d0a67e 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -670,7 +670,7 @@  PAGEFLAG_FALSE(DoubleMap)
 #define PAGE_TYPE_BASE	0xf0000000
 /* Reserve		0x0000007f to catch underflows of page_mapcount */
 #define PG_buddy	0x00000080
-#define PG_balloon	0x00000100
+#define PG_offline	0x00000100
 #define PG_kmemcg	0x00000200
 #define PG_table	0x00000400
 
@@ -700,10 +700,13 @@  static __always_inline void __ClearPage##uname(struct page *page)	\
 PAGE_TYPE_OPS(Buddy, buddy)
 
 /*
- * PageBalloon() is true for pages that are on the balloon page list
- * (see mm/balloon_compaction.c).
+ * PageOffline() indicates that the pages is logically offline although the
+ * containing section is online. (e.g. inflated in a balloon driver or
+ * not onlined when onlining the section).
+ * The content of these pages is effectively stale. Such pages should not
+ * be touched (read/write/dump/save) except by their owner.
  */
-PAGE_TYPE_OPS(Balloon, balloon)
+PAGE_TYPE_OPS(Offline, offline)
 
 /*
  * If kmemcg is enabled, the buddy allocator will set PageKmemcg() on
diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index 21b9113c69da..6c662eb0dab8 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -36,5 +36,6 @@ 
 #define KPF_ZERO_PAGE		24
 #define KPF_IDLE		25
 #define KPF_PGTABLE		26
+#define KPF_OFFLINE		27
 
 #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c
index 37908a83ddc2..b219c2eafd6a 100644
--- a/tools/vm/page-types.c
+++ b/tools/vm/page-types.c
@@ -137,6 +137,7 @@  static const char * const page_flag_names[] = {
 	[KPF_PGTABLE]		= "g:pgtable",
 	[KPF_ZERO_PAGE]		= "z:zero_page",
 	[KPF_IDLE]              = "i:idle_page",
+	[KPF_OFFLINE]		= "o:offline",
 
 	[KPF_RESERVED]		= "r:reserved",
 	[KPF_MLOCKED]		= "m:mlocked",