Message ID | 35FD53F367049845BC99AC72306C23D103E688B313E0@CNBJMBX05.corpusers.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote: > This patch add/dec zero_page's _mapcount to make sure > the mapcount is correct for zero_page, > so that when read from /proc/kpagecount, zero_page's > mapcount is also correct, userspace process like procrank can > calculate PSS correctly. I don't have specific code path to point to, but I would expect zero page with non-zero mapcount would cause a problem with rmap. How do you test the change?
> -----Original Message----- > From: Kirill A. Shutemov [mailto:kirill@shutemov.name] > Sent: Tuesday, December 02, 2014 7:30 PM > To: Wang, Yalin > Cc: 'linux-kernel@vger.kernel.org'; 'linux-mm@kvack.org'; 'linux-arm- > kernel@lists.infradead.org' > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user > space > > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote: > > This patch add/dec zero_page's _mapcount to make sure the mapcount is > > correct for zero_page, so that when read from /proc/kpagecount, > > zero_page's mapcount is also correct, userspace process like procrank > > can calculate PSS correctly. > > I don't have specific code path to point to, but I would expect zero page > with non-zero mapcount would cause a problem with rmap. > > How do you test the change? > I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount , It works well, The problem is that when I see /proc/pid/smaps , The Rss / Pss don't calculate zero_page map, Because smaps_pte_entry() --> vm_normal_page( ), Will return NULL for zero_page, But when userspace process cat /proc/pid/pagemap , It will see zero_page mapped, And will treat as Rss , This is weird, should we also omit zero_page in /proc/pid/pagemap ? Or add zero_page as Rss in /proc/pid/smaps ? I think we should add zero_page into Rss , Because it is really mapped into userspace address space. And will let userspace memory analysis more accurate . Thanks
On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote: > > -----Original Message----- > > From: Kirill A. Shutemov [mailto:kirill@shutemov.name] > > Sent: Tuesday, December 02, 2014 7:30 PM > > To: Wang, Yalin > > Cc: 'linux-kernel@vger.kernel.org'; 'linux-mm@kvack.org'; 'linux-arm- > > kernel@lists.infradead.org' > > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user > > space > > > > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote: > > > This patch add/dec zero_page's _mapcount to make sure the mapcount is > > > correct for zero_page, so that when read from /proc/kpagecount, > > > zero_page's mapcount is also correct, userspace process like procrank > > > can calculate PSS correctly. > > > > I don't have specific code path to point to, but I would expect zero page > > with non-zero mapcount would cause a problem with rmap. > > > > How do you test the change? > > > I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount , > It works well, I took a closer look and your patch is broken in multiple places: - on zap_pte_range() you don't decrement mapcount; - you don't update rss counters for mm; - copy_one_pte() doesn't increase mapcount; - ... Basically, each and every vm_normal_page() call must be audited. As first step. And you totally skip huge zero page. Proper mapcount handling for zero page would require a lot more work and I don't think it worth it. Gain is too small. NAK. > The problem is that when I see /proc/pid/smaps , > The Rss / Pss don't calculate zero_page map, > Because smaps_pte_entry() --> vm_normal_page( ), > Will return NULL for zero_page, > > But when userspace process cat /proc/pid/pagemap , > It will see zero_page mapped, > And will treat as Rss , > This is weird, should we also omit zero_page in /proc/pid/pagemap ? > Or add zero_page as Rss in /proc/pid/smaps ? > > I think we should add zero_page into Rss , > Because it is really mapped into userspace address space. > And will let userspace memory analysis more accurate . It would be easier for userspace to find out pfn of zero page and take it into account. Note: some architectures have multiple zero page due to coloring.
On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote: >> > -----Original Message----- >> > From: Kirill A. Shutemov [mailto:kirill@shutemov.name] >> > Sent: Tuesday, December 02, 2014 7:30 PM >> > To: Wang, Yalin >> > Cc: 'linux-kernel@vger.kernel.org'; 'linux-mm@kvack.org'; 'linux-arm- >> > kernel@lists.infradead.org' >> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user >> > space >> > >> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote: >> > > This patch add/dec zero_page's _mapcount to make sure the mapcount is >> > > correct for zero_page, so that when read from /proc/kpagecount, >> > > zero_page's mapcount is also correct, userspace process like procrank >> > > can calculate PSS correctly. Instead of tweaking mapcount you could mark zero-pages in /proc/kpageflags and handle them accordingly in userspace. Or mark zero pages with special magic _mapcount and detect it in /proc/kpagecount. >> > >> > I don't have specific code path to point to, but I would expect zero page >> > with non-zero mapcount would cause a problem with rmap. >> > >> > How do you test the change? >> > >> I just test it to see the mapcount from /proc/pid/pagemap and /proc/kpagecount , >> It works well, > > I took a closer look and your patch is broken in multiple places: > - on zap_pte_range() you don't decrement mapcount; > - you don't update rss counters for mm; > - copy_one_pte() doesn't increase mapcount; > - ... > > Basically, each and every vm_normal_page() call must be audited. As first > step. And you totally skip huge zero page. > > Proper mapcount handling for zero page would require a lot more work and I > don't think it worth it. Gain is too small. > > NAK. > >> The problem is that when I see /proc/pid/smaps , >> The Rss / Pss don't calculate zero_page map, >> Because smaps_pte_entry() --> vm_normal_page( ), >> Will return NULL for zero_page, >> >> But when userspace process cat /proc/pid/pagemap , >> It will see zero_page mapped, >> And will treat as Rss , >> This is weird, should we also omit zero_page in /proc/pid/pagemap ? >> Or add zero_page as Rss in /proc/pid/smaps ? >> >> I think we should add zero_page into Rss , >> Because it is really mapped into userspace address space. >> And will let userspace memory analysis more accurate . > > It would be easier for userspace to find out pfn of zero page and take it > into account. > > Note: some architectures have multiple zero page due to coloring. > > -- > Kirill A. Shutemov > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> -----Original Message----- > From: Konstantin Khlebnikov [mailto:koct9i@gmail.com] > Sent: Friday, December 05, 2014 2:39 PM > To: Kirill A. Shutemov > Cc: Wang, Yalin; linux-kernel@vger.kernel.org; linux-mm@kvack.org; linux- > arm-kernel@lists.infradead.org > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user > space > > On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <kirill@shutemov.name> > wrote: > > On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote: > >> > -----Original Message----- > >> > From: Kirill A. Shutemov [mailto:kirill@shutemov.name] > >> > Sent: Tuesday, December 02, 2014 7:30 PM > >> > To: Wang, Yalin > >> > Cc: 'linux-kernel@vger.kernel.org'; 'linux-mm@kvack.org'; > >> > 'linux-arm- kernel@lists.infradead.org' > >> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into > >> > user space > >> > > >> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote: > >> > > This patch add/dec zero_page's _mapcount to make sure the > >> > > mapcount is correct for zero_page, so that when read from > >> > > /proc/kpagecount, zero_page's mapcount is also correct, userspace > >> > > process like procrank can calculate PSS correctly. > > Instead of tweaking mapcount you could mark zero-pages in /proc/kpageflags > and handle them accordingly in userspace. Or mark zero pages with special > magic _mapcount and detect it in /proc/kpagecount. > I think add KPF_ZERO_PAGE in kpageflags is better. I will make another patch for review . Thanks
diff --git a/mm/memory.c b/mm/memory.c index 3e50383..7215423 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2198,6 +2198,7 @@ gotten: new_page = alloc_zeroed_user_highpage_movable(vma, address); if (!new_page) goto oom; + atomic_dec(&pfn_to_page(pte_pfn(orig_pte))->_mapcount); } else { new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address); if (!new_page) @@ -2647,6 +2648,7 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma, page_table = pte_offset_map_lock(mm, pmd, address, &ptl); if (!pte_none(*page_table)) goto unlock; + atomic_inc(&pfn_to_page(my_zero_pfn(address))->_mapcount); goto setpte; }
This patch add/dec zero_page's _mapcount to make sure the mapcount is correct for zero_page, so that when read from /proc/kpagecount, zero_page's mapcount is also correct, userspace process like procrank can calculate PSS correctly. Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com> --- mm/memory.c | 2 ++ 1 file changed, 2 insertions(+)