diff mbox

[PATCHv3,11/41] thp: try to free page's buffers before attempt split

Message ID 20160915115523.29737-12-kirill.shutemov@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kirill A . Shutemov Sept. 15, 2016, 11:54 a.m. UTC
We want page to be isolated from the rest of the system before spliting
it. We rely on page count to be 2 for file pages to make sure nobody
uses the page: one pin to caller, one to radix-tree.

Filesystems with backing storage can have page count increased if it has
buffers.

Let's try to free them, before attempt split. And remove one guarding
VM_BUG_ON_PAGE().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 include/linux/buffer_head.h |  1 +
 mm/huge_memory.c            | 19 ++++++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

Comments

Jan Kara Oct. 11, 2016, 3:40 p.m. UTC | #1
On Thu 15-09-16 14:54:53, Kirill A. Shutemov wrote:
> We want page to be isolated from the rest of the system before spliting
> it. We rely on page count to be 2 for file pages to make sure nobody
> uses the page: one pin to caller, one to radix-tree.
> 
> Filesystems with backing storage can have page count increased if it has
> buffers.
> 
> Let's try to free them, before attempt split. And remove one guarding
> VM_BUG_ON_PAGE().
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
...
> @@ -2041,6 +2041,23 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  			goto out;
>  		}
>  
> +		/* Try to free buffers before attempt split */
> +		if (!PageSwapBacked(head) && PagePrivate(page)) {
> +			/*
> +			 * We cannot trigger writeback from here due possible
> +			 * recursion if triggered from vmscan, only wait.
> +			 *
> +			 * Caller can trigger writeback it on its own, if safe.
> +			 */
> +			wait_on_page_writeback(head);
> +
> +			if (page_has_buffers(head) &&
> +					!try_to_free_buffers(head)) {
> +				ret = -EBUSY;
> +				goto out;
> +			}

Shouldn't you rather use try_to_release_page() here? Because filesystems
have their ->releasepage() callbacks for freeing data associated with a
page. It is not guaranteed page private data are buffers although it is
true for ext4...

								Honza
Kirill A. Shutemov Oct. 11, 2016, 9:43 p.m. UTC | #2
On Tue, Oct 11, 2016 at 05:40:31PM +0200, Jan Kara wrote:
> On Thu 15-09-16 14:54:53, Kirill A. Shutemov wrote:
> > We want page to be isolated from the rest of the system before spliting
> > it. We rely on page count to be 2 for file pages to make sure nobody
> > uses the page: one pin to caller, one to radix-tree.
> > 
> > Filesystems with backing storage can have page count increased if it has
> > buffers.
> > 
> > Let's try to free them, before attempt split. And remove one guarding
> > VM_BUG_ON_PAGE().
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ...
> > @@ -2041,6 +2041,23 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> >  			goto out;
> >  		}
> >  
> > +		/* Try to free buffers before attempt split */
> > +		if (!PageSwapBacked(head) && PagePrivate(page)) {
> > +			/*
> > +			 * We cannot trigger writeback from here due possible
> > +			 * recursion if triggered from vmscan, only wait.
> > +			 *
> > +			 * Caller can trigger writeback it on its own, if safe.
> > +			 */
> > +			wait_on_page_writeback(head);
> > +
> > +			if (page_has_buffers(head) &&
> > +					!try_to_free_buffers(head)) {
> > +				ret = -EBUSY;
> > +				goto out;
> > +			}
> 
> Shouldn't you rather use try_to_release_page() here? Because filesystems
> have their ->releasepage() callbacks for freeing data associated with a
> page. It is not guaranteed page private data are buffers although it is
> true for ext4...

Fair enough. Will fix this.
diff mbox

Patch

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index ebbacd14d450..006a8a42acfb 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -395,6 +395,7 @@  extern int __set_page_dirty_buffers(struct page *page);
 #else /* CONFIG_BLOCK */
 
 static inline void buffer_init(void) {}
+static inline int page_has_buffers(struct page *page) { return 0; }
 static inline int try_to_free_buffers(struct page *page) { return 1; }
 static inline int inode_has_buffers(struct inode *inode) { return 0; }
 static inline void invalidate_inode_buffers(struct inode *inode) {}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 020a23d6e7f8..44bf0ba3d10f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -30,6 +30,7 @@ 
 #include <linux/userfaultfd_k.h>
 #include <linux/page_idle.h>
 #include <linux/shmem_fs.h>
+#include <linux/buffer_head.h>
 
 #include <asm/tlb.h>
 #include <asm/pgalloc.h>
@@ -2012,7 +2013,6 @@  int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 	VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
 
 	if (PageAnon(head)) {
@@ -2041,6 +2041,23 @@  int split_huge_page_to_list(struct page *page, struct list_head *list)
 			goto out;
 		}
 
+		/* Try to free buffers before attempt split */
+		if (!PageSwapBacked(head) && PagePrivate(page)) {
+			/*
+			 * We cannot trigger writeback from here due possible
+			 * recursion if triggered from vmscan, only wait.
+			 *
+			 * Caller can trigger writeback it on its own, if safe.
+			 */
+			wait_on_page_writeback(head);
+
+			if (page_has_buffers(head) &&
+					!try_to_free_buffers(head)) {
+				ret = -EBUSY;
+				goto out;
+			}
+		}
+
 		/* Addidional pin from radix tree */
 		extra_pins = 1;
 		anon_vma = NULL;