[v6,17/17] mm: Distinguish VMalloc pages

Message ID 20180522201958.GC1237@bombadil.infradead.org (mailing list archive)
State New, archived

Commit Message

Matthew Wilcox May 22, 2018, 8:19 p.m. UTC
On Tue, May 22, 2018 at 10:57:34PM +0300, Andrey Ryabinin wrote:
> On 05/22/2018 08:58 PM, Matthew Wilcox wrote:
> > On Tue, May 22, 2018 at 07:10:52PM +0300, Andrey Ryabinin wrote:
> >> On 05/18/2018 10:45 PM, Matthew Wilcox wrote:
> >>> From: Matthew Wilcox <mawilcox@microsoft.com>
> >>>
> >>> For diagnosing various performance and memory-leak problems, it is helpful
> >>> to be able to distinguish pages which are in use as VMalloc pages.
> >>> Unfortunately, we cannot use the page_type field in struct page, as
> >>> this is in use for mapcount by some drivers which map vmalloced pages
> >>> to userspace.
> >>>
> >>> Use a special page->mapping value to distinguish VMalloc pages from
> >>> other kinds of pages.  Also record a pointer to the vm_struct and the
> >>> offset within the area in struct page to help reconstruct exactly what
> >>> this page is being used for.
> >>
> >> This seems useless. page->vm_area and page->vm_offset are never used.
> >> There are no follow up patches which use this new information 'For diagnosing various performance and memory-leak problems',
> >> and no explanation of how it can be used in its current form.
> > 
> > Right now, it's by-hand.  tools/vm/page-types.c will tell you which pages
> > are allocated to VMalloc.  Many people use kernel debuggers, crashdumps
> > and similar to examine the kernel's memory.  Leaving these breadcrumbs
> > is helpful, and those fields simply weren't in use before.
> > 
> >> Also, this patch breaks code like this:
> >> 	if (mapping = page_mapping(page))
> >> 		// access mapping
> > 
> > Example of broken code, please?  Pages allocated from the page allocator
> > with alloc_page() come with page->mapping == NULL.  This code snippet
> > would not have granted access to vmalloc pages before.
> > 
> 
> Some implementations of flush_dcache_page() do this. Also, set_page_dirty() can be called
> on userspace-mapped vmalloc pages during unmap: zap_pte_range() -> set_page_dirty()

Ah, good catch!  I'm anticipating we'll have other special values for
page->mapping in the future, so how about this?

(no changelog because I assume Andrew will add this as a -fix patch)

Comments

Michal Hocko May 23, 2018, 6:36 a.m. UTC | #1
On Tue 22-05-18 13:19:58, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 10:57:34PM +0300, Andrey Ryabinin wrote:
> > On 05/22/2018 08:58 PM, Matthew Wilcox wrote:
> > > On Tue, May 22, 2018 at 07:10:52PM +0300, Andrey Ryabinin wrote:
> > >> On 05/18/2018 10:45 PM, Matthew Wilcox wrote:
> > >>> From: Matthew Wilcox <mawilcox@microsoft.com>
> > >>>
> > >>> For diagnosing various performance and memory-leak problems, it is helpful
> > >>> to be able to distinguish pages which are in use as VMalloc pages.
> > >>> Unfortunately, we cannot use the page_type field in struct page, as
> > >>> this is in use for mapcount by some drivers which map vmalloced pages
> > >>> to userspace.
> > >>>
> > >>> Use a special page->mapping value to distinguish VMalloc pages from
> > >>> other kinds of pages.  Also record a pointer to the vm_struct and the
> > >>> offset within the area in struct page to help reconstruct exactly what
> > >>> this page is being used for.
> > >>
> > >> This seems useless. page->vm_area and page->vm_offset are never used.
> > >> There are no follow up patches which use this new information 'For diagnosing various performance and memory-leak problems',
> > >> and no explanation of how it can be used in its current form.
> > > 
> > > Right now, it's by-hand.  tools/vm/page-types.c will tell you which pages
> > > are allocated to VMalloc.  Many people use kernel debuggers, crashdumps
> > > and similar to examine the kernel's memory.  Leaving these breadcrumbs
> > > is helpful, and those fields simply weren't in use before.
> > > 
> > >> Also, this patch breaks code like this:
> > >> 	if (mapping = page_mapping(page))
> > >> 		// access mapping
> > > 
> > > Example of broken code, please?  Pages allocated from the page allocator
> > > with alloc_page() come with page->mapping == NULL.  This code snippet
> > > would not have granted access to vmalloc pages before.
> > > 
> > 
> > Some implementations of flush_dcache_page() do this. Also, set_page_dirty() can be called
> > on userspace-mapped vmalloc pages during unmap: zap_pte_range() -> set_page_dirty()
> 
> Ah, good catch!  I'm anticipating we'll have other special values for
> page->mapping in the future, so how about this?
> 
> (no changelog because I assume Andrew will add this as a -fix patch)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 10ca6f1d5c75..be81c9052ef7 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -561,6 +561,8 @@ struct address_space *page_mapping(struct page *page)
>  	mapping = page->mapping;
>  	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
>  		return NULL;
> +	if ((unsigned long)mapping < PAGE_SIZE)
> +		return NULL;
>  
>  	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
>  }

Well, this would be quite unfortunate. We do not want to pay a branch
price for something that doesn't have a _real_ user. Which is kinda sad
because I found the explicit vmalloc page "flag" nice to have (if it was
for free basically).

Patch

diff --git a/mm/util.c b/mm/util.c
index 10ca6f1d5c75..be81c9052ef7 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -561,6 +561,8 @@  struct address_space *page_mapping(struct page *page)
 	mapping = page->mapping;
 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
+	if ((unsigned long)mapping < PAGE_SIZE)
+		return NULL;
 
 	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
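
To make the effect of the added check concrete, here is a minimal userspace sketch of the tail of page_mapping() with the fix applied. It is not the kernel code: every sketch_* / SKETCH_* name is a stand-in invented for this example, the 4096-byte page size is an assumption, and the sentinel value 0x80 is purely hypothetical (the thread does not state which value the series uses to mark VMalloc pages). The flag values follow the kernel's PAGE_MAPPING_ANON (0x1) and PAGE_MAPPING_MOVABLE (0x2) bits. The real function also handles other cases before reaching this point, which are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel constants (assumed values, see lead-in). */
#define SKETCH_PAGE_SIZE        4096UL
#define SKETCH_MAPPING_ANON     0x1UL
#define SKETCH_MAPPING_MOVABLE  0x2UL
#define SKETCH_MAPPING_FLAGS    (SKETCH_MAPPING_ANON | SKETCH_MAPPING_MOVABLE)

/*
 * Hypothetical sentinel: a small non-NULL marker stored in page->mapping.
 * The actual value used by the patch series is not given in this thread.
 */
#define SKETCH_VMALLOC_MAPPING  ((void *)0x80UL)

/* Reduced stand-in for struct page: only the field this sketch needs. */
struct sketch_page {
	void *mapping;
};

/*
 * Mirrors only the final checks of page_mapping() after the proposed fix:
 * anonymous pages and special sentinel values both yield NULL, so callers
 * that test the return value never dereference a fake pointer.
 */
static void *sketch_page_mapping(const struct sketch_page *page)
{
	uintptr_t mapping = (uintptr_t)page->mapping;

	if (mapping & SKETCH_MAPPING_ANON)
		return NULL;
	/* The added check: values below one page are markers, not pointers. */
	if (mapping < SKETCH_PAGE_SIZE)
		return NULL;

	return (void *)(mapping & ~SKETCH_MAPPING_FLAGS);
}
```

Without the second check, a caller written as "if (mapping = page_mapping(page))" would receive the non-NULL sentinel and go on to dereference it, which is the breakage described above for the set_page_dirty() path. With it, any marker smaller than one page is reported as "no mapping", at the cost of the extra branch that the reply objects to.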