
[1/2] mm: add support for discard of unused ptes

Message ID 1374742461-29160-2-git-send-email-schwidefsky@de.ibm.com (mailing list archive)
State New, archived

Commit Message

Martin Schwidefsky July 25, 2013, 8:54 a.m. UTC
From: Konstantin Weitz <konstantin.weitz@gmail.com>

In a virtualized environment, and given an appropriate interface, the guest
can mark pages as unused while they are free (for the s390 implementation
see git commit 45e576b1c3d00206 "guest page hinting light"). For the host,
the unused state is a property of the pte.

This patch adds the primitive 'pte_unused' and code to the host swap-out
handler so that pages marked as unused by all mappers are not swapped out
but discarded instead, saving one I/O for the swap-out and potentially
another for the swap-in.

[ Martin Schwidefsky: patch reordering and simplification ]

Signed-off-by: Konstantin Weitz <konstantin.weitz@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/asm-generic/pgtable.h |   13 +++++++++++++
 mm/rmap.c                     |   10 ++++++++++
 2 files changed, 23 insertions(+)
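
For context: an architecture opts into this by defining __HAVE_ARCH_PTE_UNUSED
and supplying its own pte_unused(). A minimal sketch of such an override,
assuming a hypothetical _PAGE_UNUSED software bit in the pte (the real bit
layout is architecture specific; the actual s390 version arrives with patch
2/2):

	/* e.g. in arch/xxx/include/asm/pgtable.h; _PAGE_UNUSED is an assumed bit */
	#define __HAVE_ARCH_PTE_UNUSED
	static inline int pte_unused(pte_t pte)
	{
		return (pte_val(pte) & _PAGE_UNUSED) != 0;
	}

With such an override in place, try_to_unmap_one() (see the rmap.c hunk in
the patch below) discards the pte without allocating swap; since the rmap
walk visits every mapper, the page itself is only freed once all of its
mappings have been removed this way.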

Comments

Christian Borntraeger July 25, 2013, 10:46 a.m. UTC | #1
On 25/07/13 10:54, Martin Schwidefsky wrote:
> From: Konstantin Weitz <konstantin.weitz@gmail.com>
> 
> [...]
> 
> Signed-off-by: Konstantin Weitz <konstantin.weitz@gmail.com>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>


Andrew Morton July 30, 2013, 8:44 p.m. UTC | #2
On Thu, 25 Jul 2013 10:54:20 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> From: Konstantin Weitz <konstantin.weitz@gmail.com>
> 
> [...]

Obviously harmless.  Please include this in whatever tree carries
"[PATCH 2/2] s390/kvm: support collaborative memory management".
Martin Schwidefsky July 31, 2013, 6:26 a.m. UTC | #3
On Tue, 30 Jul 2013 13:44:22 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Thu, 25 Jul 2013 10:54:20 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > [...]
> 
> Obviously harmless.  Please include this in whatever tree carries
> "[PATCH 2/2] s390/kvm: support collaborative memory management".
 
Cool, thanks. This will go out via the KVM tree then.

Patch

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 2f47ade..ec540c5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -193,6 +193,19 @@ static inline int pte_same(pte_t pte_a, pte_t pte_b)
 }
 #endif
 
+#ifndef __HAVE_ARCH_PTE_UNUSED
+/*
+ * Some architectures provide facilities to virtualization guests
+ * so that they can flag allocated pages as unused. This allows the
+ * host to transparently reclaim unused pages. This function returns
+ * whether the pte's page is unused.
+ */
+static inline int pte_unused(pte_t pte)
+{
+	return 0;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PMD_SAME
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
diff --git a/mm/rmap.c b/mm/rmap.c
index cd356df..2291f25 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1234,6 +1234,16 @@ int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		}
 		set_pte_at(mm, address, pte,
 			   swp_entry_to_pte(make_hwpoison_entry(page)));
+	} else if (pte_unused(pteval)) {
+		/*
+		 * The guest indicated that the page content is of no
+		 * interest anymore. Simply discard the pte, vmscan
+		 * will take care of the rest.
+		 */
+		if (PageAnon(page))
+			dec_mm_counter(mm, MM_ANONPAGES);
+		else
+			dec_mm_counter(mm, MM_FILEPAGES);
 	} else if (PageAnon(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
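
For the guest side that this pairs with, the referenced commit
45e576b1c3d00206 hooks page allocation and freeing. A hedged sketch of that
pattern: arch_free_page()/arch_alloc_page() are the generic hooks, while
set_page_unused()/set_page_stable() are illustrative stand-ins for the
architecture primitive (the ESSA instruction on s390):

	/* guest-side sketch, modeled on s390 "guest page hinting light" */
	void arch_free_page(struct page *page, int order)
	{
		int i;

		/* tell the host these pages' contents are discardable */
		for (i = 0; i < (1 << order); i++)
			set_page_unused(page + i);
	}

	void arch_alloc_page(struct page *page, int order)
	{
		int i;

		/* the pages are in use again, contents must be preserved */
		for (i = 0; i < (1 << order); i++)
			set_page_stable(page + i);
	}

The host then reflects that guest state in the pte bit tested by
pte_unused() above.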