diff mbox series

[v4,2/3] mm: Provide a function to get an additional pin on a page

Message ID 20230526214142.958751-3-dhowells@redhat.com (mailing list archive)
State New
Headers show
Series block: Make old dio use iov_iter_extract_pages() and page pinning | expand

Commit Message

David Howells May 26, 2023, 9:41 p.m. UTC
Provide a function to get an additional pin on a page that we already have
a pin on.  This will be used in fs/direct-io.c when dispatching multiple
bios to a page we've extracted from a user-backed iter rather than redoing
the extraction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@infradead.org>
cc: David Hildenbrand <david@redhat.com>
cc: Lorenzo Stoakes <lstoakes@gmail.com>
cc: Andrew Morton <akpm@linux-foundation.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jason Gunthorpe <jgg@nvidia.com>
cc: Logan Gunthorpe <logang@deltatee.com>
cc: Hillf Danton <hdanton@sina.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #4)
     - Use _inc rather than _add ops when we're just adding 1.
    
    ver #3)
     - Rename to folio_add_pin().
     - Change to using is_zero_folio()

 include/linux/mm.h |  1 +
 mm/gup.c           | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

Comments

Christoph Hellwig May 31, 2023, 3:57 a.m. UTC | #1
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
David Hildenbrand May 31, 2023, 7:42 a.m. UTC | #2
On 26.05.23 23:41, David Howells wrote:
> Provide a function to get an additional pin on a page that we already have
> a pin on.  This will be used in fs/direct-io.c when dispatching multiple
> bios to a page we've extracted from a user-backed iter rather than redoing
> the extraction.
> 

I guess this function is only used for "replicating" an existing pin, 
and not changing the semantics of an existing pin: something that was 
pinned !FOLL_LONGTERM cannot suddenly become effectively pinned 
FOLL_LONGTERM.

Out of curiosity, could we end up passing in an anonymous page, or is 
this almost exclusively for pagecache+zero pages? (I rememebr John H. 
had a similar patch where he said it would not apply to anon pages)

> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@infradead.org>
> cc: David Hildenbrand <david@redhat.com>
> cc: Lorenzo Stoakes <lstoakes@gmail.com>
> cc: Andrew Morton <akpm@linux-foundation.org>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: Jan Kara <jack@suse.cz>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: Jason Gunthorpe <jgg@nvidia.com>
> cc: Logan Gunthorpe <logang@deltatee.com>
> cc: Hillf Danton <hdanton@sina.com>
> cc: Christian Brauner <brauner@kernel.org>
> cc: Linus Torvalds <torvalds@linux-foundation.org>
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-kernel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---

Acked-by: David Hildenbrand <david@redhat.com>
David Howells May 31, 2023, 8:20 a.m. UTC | #3
David Hildenbrand <david@redhat.com> wrote:

> > Provide a function to get an additional pin on a page that we already have
> > a pin on.  This will be used in fs/direct-io.c when dispatching multiple
> > bios to a page we've extracted from a user-backed iter rather than redoing
> > the extraction.
> > 
> 
> I guess this function is only used for "replicating" an existing pin, and not
> changing the semantics of an existing pin: something that was pinned
> !FOLL_LONGTERM cannot suddenly become effectively pinned FOLL_LONGTERM.
> 
> Out of curiosity, could we end up passing in an anonymous page, or is this
> almost exclusively for pagecache+zero pages? (I rememebr John H. had a similar
> patch where he said it would not apply to anon pages)

It has to be able to handle anything in a process's VM that you can
legitimately use as the buffer for a DIO read/write.  If we can get a pin on
it from pin_user_pages_fast(), then we need to be able to get an additional
pin on it.  For the moment it's just in the old DIO stuff, but it will almost
certainly be necessary in the networking code too to handle splicing into
network packets.

David
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3c2f6b452586..200068d98686 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2405,6 +2405,7 @@  int get_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages);
 int pin_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages);
+void folio_add_pin(struct folio *folio);
 
 int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc);
 int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
diff --git a/mm/gup.c b/mm/gup.c
index ad28261dcafd..0814576b7366 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -275,6 +275,33 @@  void unpin_user_page(struct page *page)
 }
 EXPORT_SYMBOL(unpin_user_page);
 
+/**
+ * folio_add_pin - Try to get an additional pin on a pinned folio
+ * @folio: The folio to be pinned
+ *
+ * Get an additional pin on a folio we already have a pin on.  Makes no change
+ * if the folio is a zero_page.
+ */
+void folio_add_pin(struct folio *folio)
+{
+	if (is_zero_folio(folio))
+		return;
+
+	/*
+	 * Similar to try_grab_folio(): be sure to *also* increment the normal
+	 * page refcount field at least once, so that the page really is
+	 * pinned.
+	 */
+	if (folio_test_large(folio)) {
+		WARN_ON_ONCE(atomic_read(&folio->_pincount) < 1);
+		folio_ref_inc(folio);
+		atomic_inc(&folio->_pincount);
+	} else {
+		WARN_ON_ONCE(folio_ref_count(folio) < GUP_PIN_COUNTING_BIAS);
+		folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
+	}
+}
+
 static inline struct folio *gup_folio_range_next(struct page *start,
 		unsigned long npages, unsigned long i, unsigned int *ntails)
 {