Message ID | 35FD53F367049845BC99AC72306C23D103E688B313EE@CNBJMBX05.corpusers.net (mailing list archive)
---|---
State | New, archived
On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> This patch adds a KPF_ZERO_PAGE flag for zero_page,
> so that a userspace process can notice zero_page from
> /proc/kpageflags, and then do memory analysis more accurately.

It would be nice to also mark huge_zero_page. See the (completely
untested) patch in the attachment.

> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  fs/proc/page.c                         | 3 +++
>  include/uapi/linux/kernel-page-flags.h | 1 +
>  2 files changed, 4 insertions(+)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..120dbf7 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -136,6 +136,9 @@ u64 stable_page_flags(struct page *page)
>  	if (PageBalloon(page))
>  		u |= 1 << KPF_BALLOON;
>
> +	if (is_zero_pfn(page_to_pfn(page)))
> +		u |= 1 << KPF_ZERO_PAGE;
> +
>  	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
>
>  	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> index 2f96d23..a6c4962 100644
> --- a/include/uapi/linux/kernel-page-flags.h
> +++ b/include/uapi/linux/kernel-page-flags.h
> @@ -32,6 +32,7 @@
>  #define KPF_KSM		21
>  #define KPF_THP		22
>  #define KPF_BALLOON		23
> +#define KPF_ZERO_PAGE		24
>
>
>  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> --
> 2.1.3
> -----Original Message-----
> From: Konstantin Khlebnikov [mailto:koct9i@gmail.com]
> Sent: Friday, December 05, 2014 5:21 PM
> To: Wang, Yalin
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> linux-arm-kernel@lists.infradead.org; akpm@linux-foundation.org;
> n-horiguchi@ah.jp.nec.com
> Subject: Re: [RFC] mm: add KPF_ZERO_PAGE flag for /proc/kpageflags
>
> On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com>
> wrote:
> > This patch adds a KPF_ZERO_PAGE flag for zero_page, so that a userspace
> > process can notice zero_page from /proc/kpageflags, and then do memory
> > analysis more accurately.
>
> It would be nice to also mark huge_zero_page. See the (completely
> untested) patch in the attachment.

Got it, thanks for your patch.
On Fri, Dec 05, 2014 at 06:21:17PM +0800, Wang, Yalin wrote:
> This patch adds a KPF_ZERO_PAGE flag for zero_page,
> so that a userspace process can notice zero_page from
> /proc/kpageflags, and then do memory analysis more accurately.
>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  fs/proc/page.c                         | 14 +++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  4 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..dbe5630 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -5,6 +5,7 @@
>  #include <linux/ksm.h>
>  #include <linux/mm.h>
>  #include <linux/mmzone.h>
> +#include <linux/huge_mm.h>
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/hugetlb.h>
> @@ -121,9 +122,16 @@ u64 stable_page_flags(struct page *page)
>  	 * just checks PG_head/PG_tail, so we need to check PageLRU/PageAnon
>  	 * to make sure a given page is a thp, not a non-huge compound page.
>  	 */
> -	else if (PageTransCompound(page) && (PageLRU(compound_head(page)) ||
> -	    PageAnon(compound_head(page))))
> -		u |= 1 << KPF_THP;
> +	else if (PageTransCompound(page)) {
> +		struct page *head = compound_head(page);
> +
> +		if (PageLRU(head) || PageAnon(head))
> +			u |= 1 << KPF_THP;
> +		else if (is_huge_zero_page(head))
> +			u |= 1 << KPF_ZERO_PAGE;

IIUC, the KPF_THP bit should be set for the huge zero page too.
On Fri, 5 Dec 2014 18:22:33 +0800 "Wang, Yalin" <Yalin.Wang@sonymobile.com> wrote:

> > On Fri, Dec 5, 2014 at 11:57 AM, Wang, Yalin <Yalin.Wang@sonymobile.com>
> > wrote:
> > > This patch adds a KPF_ZERO_PAGE flag for zero_page, so that a userspace
> > > process can notice zero_page from /proc/kpageflags, and then do memory
> > > analysis more accurately.
> >
> > It would be nice to also mark huge_zero_page. See the (completely
> > untested) patch in the attachment.
> >
> Got it,
> Thanks for your patch.

Documentation/vm/pagemap.txt will need updating, please.
On Mon, Dec 08, 2014 at 10:00:50AM +0800, Wang, Yalin wrote:
> This patch adds a KPF_ZERO_PAGE flag for zero_page,
> so that a userspace process can notice zero_page from
> /proc/kpageflags, and then do memory analysis more accurately.
>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  Documentation/vm/pagemap.txt           |  5 +++++
>  fs/proc/page.c                         | 16 +++++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  5 files changed, 32 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
> index 5948e45..fdeb06e 100644
> --- a/Documentation/vm/pagemap.txt
> +++ b/Documentation/vm/pagemap.txt
> @@ -62,6 +62,8 @@ There are three components to pagemap:
>      20. NOPAGE
>      21. KSM
>      22. THP
> +    23. BALLOON
> +    24. ZERO_PAGE
>
>  Short descriptions to the page flags:
>
> @@ -102,6 +104,9 @@ Short descriptions to the page flags:
>  22. THP
>      contiguous pages which construct transparent hugepages
>
> +24. ZERO_PAGE
> +    zero page for pfn_zero or huge_zero page
> +
>  [IO related page flags]
>   1. ERROR     IO error occurred
>   3. UPTODATE  page has up-to-date data

It would be nice to document BALLOON while you're there.

Otherwise looks good to me.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
On Wed, Dec 10, 2014 at 03:22:21PM +0800, Wang, Yalin wrote:
> smaps_pte_entry() doesn't ignore zero_huge_page,
> but it ignores zero_page, because vm_normal_page() will
> ignore it. We remove the vm_normal_page() call, because walk_page_range()
> already ignores VM_PFNMAP vma maps, so it's safe to just use pfn_valid(),
> so that we can also consider zero_page to be a valid page.

We fixed huge zero page accounting in smaps recently. See the mm tree.
On Tue, Dec 9, 2014 at 6:24 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> This patch adds a KPF_ZERO_PAGE flag for zero_page,
> so that a userspace process can notice zero_page from
> /proc/kpageflags, and then do memory analysis more accurately.
>
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>

Ack. Looks good.

> ---
>  Documentation/vm/pagemap.txt           |  8 ++++++++
>  fs/proc/page.c                         | 16 +++++++++++++---
>  include/linux/huge_mm.h                | 12 ++++++++++++
>  include/uapi/linux/kernel-page-flags.h |  1 +
>  mm/huge_memory.c                       |  7 +------
>  tools/vm/page-types.c                  |  1 +
>  6 files changed, 36 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt
> index 5948e45..6fbd55e 100644
> --- a/Documentation/vm/pagemap.txt
> +++ b/Documentation/vm/pagemap.txt
> @@ -62,6 +62,8 @@ There are three components to pagemap:
>      20. NOPAGE
>      21. KSM
>      22. THP
> +    23. BALLOON
> +    24. ZERO_PAGE
>
>  Short descriptions to the page flags:
>
> @@ -102,6 +104,12 @@ Short descriptions to the page flags:
>  22. THP
>      contiguous pages which construct transparent hugepages
>
> +23. BALLOON
> +    balloon compaction page
> +
> +24. ZERO_PAGE
> +    zero page for pfn_zero or huge_zero page
> +
>  [IO related page flags]
>   1. ERROR     IO error occurred
>   3. UPTODATE  page has up-to-date data
>
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 1e3187d..7eee2d8 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -5,6 +5,7 @@
>  #include <linux/ksm.h>
>  #include <linux/mm.h>
>  #include <linux/mmzone.h>
> +#include <linux/huge_mm.h>
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/hugetlb.h>
> @@ -121,9 +122,18 @@ u64 stable_page_flags(struct page *page)
>  	 * just checks PG_head/PG_tail, so we need to check PageLRU/PageAnon
>  	 * to make sure a given page is a thp, not a non-huge compound page.
>  	 */
> -	else if (PageTransCompound(page) && (PageLRU(compound_head(page)) ||
> -	    PageAnon(compound_head(page))))
> -		u |= 1 << KPF_THP;
> +	else if (PageTransCompound(page)) {
> +		struct page *head = compound_head(page);
> +
> +		if (PageLRU(head) || PageAnon(head))
> +			u |= 1 << KPF_THP;
> +		else if (is_huge_zero_page(head)) {
> +			u |= 1 << KPF_ZERO_PAGE;
> +			u |= 1 << KPF_THP;
> +		}
> +	} else if (is_zero_pfn(page_to_pfn(page)))
> +		u |= 1 << KPF_ZERO_PAGE;
> +
>
>  	/*
>  	 * Caveats on high order pages: page->_count will only be set
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index ad9051b..f10b20f 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -157,6 +157,13 @@ static inline int hpage_nr_pages(struct page *page)
>  extern int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  			unsigned long addr, pmd_t pmd, pmd_t *pmdp);
>
> +extern struct page *huge_zero_page;
> +
> +static inline bool is_huge_zero_page(struct page *page)
> +{
> +	return ACCESS_ONCE(huge_zero_page) == page;
> +}
> +
>  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
>  #define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
> @@ -206,6 +213,11 @@ static inline int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_str
>  	return 0;
>  }
>
> +static inline bool is_huge_zero_page(struct page *page)
> +{
> +	return false;
> +}
> +
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
>  #endif /* _LINUX_HUGE_MM_H */
> diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> index 2f96d23..a6c4962 100644
> --- a/include/uapi/linux/kernel-page-flags.h
> +++ b/include/uapi/linux/kernel-page-flags.h
> @@ -32,6 +32,7 @@
>  #define KPF_KSM			21
>  #define KPF_THP			22
>  #define KPF_BALLOON		23
> +#define KPF_ZERO_PAGE		24
>
>
>  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index de98415..d7bc7a5 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -171,12 +171,7 @@ static int start_khugepaged(void)
>  }
>
>  static atomic_t huge_zero_refcount;
> -static struct page *huge_zero_page __read_mostly;
> -
> -static inline bool is_huge_zero_page(struct page *page)
> -{
> -	return ACCESS_ONCE(huge_zero_page) == page;
> -}
> +struct page *huge_zero_page __read_mostly;
>
>  static inline bool is_huge_zero_pmd(pmd_t pmd)
>  {
> diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c
> index 264fbc2..8bdf16b 100644
> --- a/tools/vm/page-types.c
> +++ b/tools/vm/page-types.c
> @@ -133,6 +133,7 @@ static const char * const page_flag_names[] = {
>  	[KPF_KSM]		= "x:ksm",
>  	[KPF_THP]		= "t:thp",
>  	[KPF_BALLOON]		= "o:balloon",
> +	[KPF_ZERO_PAGE]		= "z:zero_page",
>
>  	[KPF_RESERVED]		= "r:reserved",
>  	[KPF_MLOCKED]		= "m:mlocked",
> --
> 2.1.3
> -----Original Message-----
> From: Kirill A. Shutemov [mailto:kirill@shutemov.name]
> Sent: Wednesday, December 10, 2014 7:06 PM
> To: Wang, Yalin
> Cc: 'Andrew Morton'; 'Konstantin Khlebnikov'; 'linux-kernel@vger.kernel.org';
> 'linux-mm@kvack.org'; 'linux-arm-kernel@lists.infradead.org';
> 'n-horiguchi@ah.jp.nec.com'; 'oleg@redhat.com'; 'gorcunov@openvz.org';
> 'pfeiner@google.com'
> Subject: Re: [RFC] mm: fix zero_page huge_zero_page rss/pss statistic
>
> On Wed, Dec 10, 2014 at 03:22:21PM +0800, Wang, Yalin wrote:
> > smaps_pte_entry() doesn't ignore zero_huge_page, but it ignores
> > zero_page, because vm_normal_page() will ignore it. We remove the
> > vm_normal_page() call, because walk_page_range() already ignores
> > VM_PFNMAP vma maps, so it's safe to just use pfn_valid(), so that we
> > can also consider zero_page to be a valid page.
>
> We fixed huge zero page accounting in smaps recently. See the mm tree.

Hi,
I can't find the git tree, could you send me a link? Thank you!
On Fri, Dec 12, 2014 at 09:59:15AM +0800, Wang, Yalin wrote:
> > We fixed huge zero page accounting in smaps recently. See the mm tree.
> >
> Hi,
> I can't find the git tree, could you send me a link?

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

or just take linux-next. The fix is already in Linus' tree.
On Thu, Dec 18, 2014 at 11:50:01AM +0800, Wang, Yalin wrote:
> I noticed this commit:
> mm: support madvise(MADV_FREE),
>
> it can free clean anonymous pages directly,
> without paging them out to a swap partition,
>
> but I found it doesn't work on my platform,
> which doesn't enable any swap partitions.

In the current implementation, if there is no empty slot in swap, it does
an instant free instead of a delayed free. Look at madvise_vma.

> I made a change for this.
> Just to explain my issue clearly:
> do we need some other checks to still scan anonymous pages even if we
> don't have a swap partition but do have clean anonymous pages?

There are a few places we should consider if you want to scan anonymous
pages without swap. Refer to commits 69c854817566 and 74e3f3c3391d.

However, it's not simple at the moment. If we re-enable the anonymous scan
without swap, it would cause a big regression in reclaim. So my direction
is to move normal anonymous pages into the unevictable LRU list, because
they're really unevictable without swap, and to put delayed-freeing pages
into the anon LRU list and age them.

> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 5e8772b..8258f3a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1941,7 +1941,7 @@ static void get_scan_count(struct lruvec *lruvec, int swappiness,
>  		force_scan = true;
>
>  	/* If we have no swap space, do not bother scanning anon pages. */
> -	if (!sc->may_swap || (get_nr_swap_pages() <= 0)) {
> +	if (!sc->may_swap) {
>  		scan_balance = SCAN_FILE;
>  		goto out;
>  	}
> -----Original Message-----
> From: Minchan Kim [mailto:minchan@kernel.org]
> Sent: Friday, December 19, 2014 9:05 AM
> To: Wang, Yalin
> Cc: 'Konstantin Khlebnikov'; 'Kirill A. Shutemov'; 'Andrew Morton';
> 'linux-kernel@vger.kernel.org'; 'linux-mm@kvack.org';
> 'linux-arm-kernel@lists.infradead.org'; 'n-horiguchi@ah.jp.nec.com'
> Subject: Re: [RFC] MADV_FREE doesn't work when there is no swap partition
>
> However, it's not simple at the moment. If we re-enable the anonymous scan
> without swap, it would cause a big regression in reclaim. So my direction
> is to move normal anonymous pages into the unevictable LRU list, because
> they're really unevictable without swap, and to put delayed-freeing pages
> into the anon LRU list and age them.

I understand your solution; it sounds like a great idea!
When will this design be merged into mainline? Thanks.
On Fri, Dec 19, 2014 at 4:04 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Thu, Dec 18, 2014 at 11:50:01AM +0800, Wang, Yalin wrote:
>> I noticed this commit:
>> mm: support madvise(MADV_FREE),
>>
>> it can free clean anonymous pages directly,
>> without paging them out to a swap partition,
>>
>> but I found it doesn't work on my platform,
>> which doesn't enable any swap partitions.
>
> In the current implementation, if there is no empty slot in swap, it does
> an instant free instead of a delayed free. Look at madvise_vma.
>
> However, it's not simple at the moment. If we re-enable the anonymous scan
> without swap, it would cause a big regression in reclaim. So my direction
> is to move normal anonymous pages into the unevictable LRU list, because
> they're really unevictable without swap, and to put delayed-freeing pages
> into the anon LRU list and age them.

This sounds reasonable. In this case swapon must either scan unevictable
pages and make some of them evictable again, or just move all unevictable
pages into the active list and postpone this job until the reclaimer is
invoked.
diff --git a/fs/proc/page.c b/fs/proc/page.c
index 1e3187d..120dbf7 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -136,6 +136,9 @@ u64 stable_page_flags(struct page *page)
 	if (PageBalloon(page))
 		u |= 1 << KPF_BALLOON;
 
+	if (is_zero_pfn(page_to_pfn(page)))
+		u |= 1 << KPF_ZERO_PAGE;
+
 	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
 
 	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index 2f96d23..a6c4962 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -32,6 +32,7 @@
 #define KPF_KSM			21
 #define KPF_THP			22
 #define KPF_BALLOON		23
+#define KPF_ZERO_PAGE		24
 
 
 #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
This patch adds a KPF_ZERO_PAGE flag for zero_page, so that a userspace
process can notice zero_page from /proc/kpageflags, and then do memory
analysis more accurately.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
---
 fs/proc/page.c                         | 3 +++
 include/uapi/linux/kernel-page-flags.h | 1 +
 2 files changed, 4 insertions(+)