diff mbox series

lib/iov_iter.c: extract virt-contiguous pages in iov_iter_extract_bvec_pages

Message ID 20241004153025.1867858-1-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series lib/iov_iter.c: extract virt-contiguous pages in iov_iter_extract_bvec_pages | expand

Commit Message

Ming Lei Oct. 4, 2024, 3:30 p.m. UTC
All iov_iter_bvec() users only want to extract virt-contiguous pages from
iov_iter_extract_pages() instead physical-contiguous pages.

Change iov_iter_extract_bvec_pages() to extract virt-contiguous pages via
bvec helper.

This way can fill much more pages one time, instead of (often)one page from
iov_iter_extract_pages() each time.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/bvec.h |  6 ++++++
 lib/iov_iter.c       | 47 +++++++++++++++++++++++---------------------
 2 files changed, 31 insertions(+), 22 deletions(-)

Comments

David Howells Oct. 4, 2024, 4:56 p.m. UTC | #1
Ming Lei <ming.lei@redhat.com> wrote:

> All iov_iter_bvec() users only want to extract virt-contiguous pages from
> iov_iter_extract_pages() instead physical-contiguous pages.

What do you mean by "virt-contiguous"?  Virtual according to what mapping?

The reason for physical contiguity is that you can pass a set of physical
contiguous pages as a single DMA descriptor.  Therefore, at some point, you
might end up screwing up skb_splice_from_iter().  Currently, that's limited to
a PAGE_SIZE per fragment, but hopefully that will be fixed at some point.

David
Ming Lei Oct. 6, 2024, 2:55 a.m. UTC | #2
On Fri, Oct 04, 2024 at 05:56:52PM +0100, David Howells wrote:
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > All iov_iter_bvec() users only want to extract virt-contiguous pages from
> > iov_iter_extract_pages() instead physical-contiguous pages.
> 
> What do you mean by "virt-contiguous"?  Virtual according to what mapping?

The term is from comment iov_iter_extract_kvec_pages(), seems it is
invented by you, :-)

Actually iov_iter_extract_pages() requires that there isn't gap in the
extracted pages, so 'offset' only exists in the 1st page, then these
pages can be mapped to one virtual(contiguous) address.

> 
> The reason for physical contiguity is that you can pass a set of physical
> contiguous pages as a single DMA descriptor.  Therefore, at some point, you
> might end up screwing up skb_splice_from_iter().  Currently, that's limited to
> a PAGE_SIZE per fragment, but hopefully that will be fixed at some point.

If any user wants to extract physical pages, new interface can be added for
returning single page instead of page array, cause it is physically contiguous.

Other kind of iterators(UBUF, KVEC, ...) do return non physically-contiguous
pages.

The point is that one bvec often point to one page except for huge page
case, so iov_iter_extract_pages() just returns single page each time
no matter how big maxpages & maxsize is passed in.

It is actually one regression:

Fixes: a7e689dd1c06 ("block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages")


Thanks,
Ming
diff mbox series

Patch

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index f41c7f0ef91e..98e1a4ad09e0 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -184,6 +184,12 @@  static inline void bvec_iter_advance_single(const struct bio_vec *bv,
 		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1);	\
 	     bvec_iter_advance_single((bio_vec), &(iter), (bvl).bv_len))
 
+#define for_each_bvec_max(bvl, bio_vec, iter, start, nr_bvecs)		\
+	for (iter = (start);						\
+	     (iter).bi_size && iter.bi_idx < nr_bvecs &&		\
+		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1);	\
+	     bvec_iter_advance_single((bio_vec), &(iter), (bvl).bv_len))
+
 /* for iterating one bio from start to end */
 #define BVEC_ITER_ALL_INIT (struct bvec_iter)				\
 {									\
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 97003155bfac..6e00f6da5259 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1677,8 +1677,8 @@  static ssize_t iov_iter_extract_xarray_pages(struct iov_iter *i,
 }
 
 /*
- * Extract a list of contiguous pages from an ITER_BVEC iterator.  This does
- * not get references on the pages, nor does it get a pin on them.
+ * Extract a list of virtually contiguous pages from an ITER_BVEC iterator.
+ * This does not get references on the pages, nor does it get a pin on them.
  */
 static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
 					   struct page ***pages, size_t maxsize,
@@ -1686,35 +1686,38 @@  static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
 					   iov_iter_extraction_t extraction_flags,
 					   size_t *offset0)
 {
-	struct page **p, *page;
-	size_t skip = i->iov_offset, offset, size;
-	int k;
+	size_t skip = i->iov_offset, size = 0;
+	struct bvec_iter bi;
+	struct bio_vec bv;
+	int k = 0;
 
-	for (;;) {
-		if (i->nr_segs == 0)
-			return 0;
-		size = min(maxsize, i->bvec->bv_len - skip);
-		if (size)
-			break;
+	if (i->nr_segs == 0)
+		return 0;
+
+	if (i->iov_offset == i->bvec->bv_len) {
 		i->iov_offset = 0;
 		i->nr_segs--;
 		i->bvec++;
 		skip = 0;
 	}
+	bi.bi_size = maxsize + skip;
+	bi.bi_bvec_done = skip;
 
-	skip += i->bvec->bv_offset;
-	page = i->bvec->bv_page + skip / PAGE_SIZE;
-	offset = skip % PAGE_SIZE;
-	*offset0 = offset;
+	maxpages = want_pages_array(pages, maxsize, skip, maxpages);
 
-	maxpages = want_pages_array(pages, size, offset, maxpages);
-	if (!maxpages)
-		return -ENOMEM;
-	p = *pages;
-	for (k = 0; k < maxpages; k++)
-		p[k] = page + k;
+	for_each_bvec_max(bv, i->bvec, bi, bi, i->nr_segs) {
+		if (k >= maxpages)
+			break;
+		if (!k)
+			*offset0 = bv.bv_offset;
+		else if (bv.bv_offset)
+			break;
+		(*pages)[k++] = bv.bv_page;
+		size += bv.bv_len;
+		if (bv.bv_offset + bv.bv_len != PAGE_SIZE)
+			break;
+	}
 
-	size = min_t(size_t, size, maxpages * PAGE_SIZE - offset);
 	iov_iter_advance(i, size);
 	return size;
 }