diff mbox series

[v5,4/6] io_uring: rsrc: delegate VMA file-backed check to GUP

Message ID 642128d50f5423b3331e3108f8faf6b8ac0d957e.1684097002.git.lstoakes@gmail.com (mailing list archive)
State New

Commit Message

Lorenzo Stoakes May 14, 2023, 9:26 p.m. UTC
Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
writing to file-backed mappings", there is no need to explicitly check VMAs
for this condition, so simply remove this logic from io_uring altogether.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 io_uring/rsrc.c | 34 ++++++----------------------------
 1 file changed, 6 insertions(+), 28 deletions(-)

Comments

Christoph Hellwig May 15, 2023, 11:50 a.m. UTC | #1
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe May 15, 2023, 7:55 p.m. UTC | #2
On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
> writing to file-backed mappings", there is no need to explicitly check VMAs
> for this condition, so simply remove this logic from io_uring altogether.

Don't have the prerequisite patch handy (not in mainline yet), but if it
just moves the check, then:

Reviewed-by: Jens Axboe <axboe@kernel.dk>
David Hildenbrand May 16, 2023, 8:25 a.m. UTC | #3
On 15.05.23 21:55, Jens Axboe wrote:
> On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
>> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
>> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
>> writing to file-backed mappings", there is no need to explicitly check VMAs
>> for this condition, so simply remove this logic from io_uring altogether.
> 
> Don't have the prerequisite patch handy (not in mainline yet), but if it
> just moves the check, then:
> 
> Reviewed-by: Jens Axboe <axboe@kernel.dk>
> 

Jens, please see my note regarding io_uring:

https://lore.kernel.org/bpf/6e96358e-bcb5-cc36-18c3-ec5153867b9a@redhat.com/

With this patch, MAP_PRIVATE will work as expected (2), but there will 
be a change in return code handling (1) that we might have to document
in the man page.
David Hildenbrand May 16, 2023, 8:28 a.m. UTC | #4
On 14.05.23 23:26, Lorenzo Stoakes wrote:
> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
> writing to file-backed mappings", there is no need to explicitly check VMAs
> for this condition, so simply remove this logic from io_uring altogether.
> 

Worth adding "Note that this change will make io_uring fixed buffers work 
on MAP_PRIVATE file mappings."

I'll run my test cases with this series and expect no surprises :)


Reviewed-by: David Hildenbrand <david@redhat.com>
Jens Axboe May 16, 2023, 1:19 p.m. UTC | #5
On 5/16/23 2:25 AM, David Hildenbrand wrote:
> On 15.05.23 21:55, Jens Axboe wrote:
>> On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
>>> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
>>> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
>>> writing to file-backed mappings", there is no need to explicitly check VMAs
>>> for this condition, so simply remove this logic from io_uring altogether.
>>
>> Don't have the prerequisite patch handy (not in mainline yet), but if it
>> just moves the check, then:
>>
>> Reviewed-by: Jens Axboe <axboe@kernel.dk>
>>
> 
> Jens, please see my note regarding io_uring:
> 
> https://lore.kernel.org/bpf/6e96358e-bcb5-cc36-18c3-ec5153867b9a@redhat.com/
> 
> With this patch, MAP_PRIVATE will work as expected (2), but there will
> be a change in return code handling (1) that we might have to document
> in the man page.

I think documenting that newer kernels will return -EFAULT rather than
-EOPNOTSUPP should be fine. It's not a new failure case, just a
different error value for an already failing case. Should be fine with
just a doc update. Will do that now.
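
For userspace that has to run on kernels both before and after this change, the return-code difference discussed above could be handled roughly as in the sketch below. The helper name is hypothetical and is not part of io_uring or liburing; it assumes the caller passes in the negative errno returned by a fixed-buffer registration attempt. Note that -EFAULT can also be returned for other reasons (for example an unmapped address), so a match is only a hint that an unsupported file-backed mapping may be the cause.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (an assumption for illustration, not an existing API).
 * `ret` is the negative errno from a fixed-buffer registration attempt.
 * Returns true if the failure may be due to an unsupported file-backed mapping.
 */
static bool buf_register_maybe_file_backed(int ret)
{
	/* Older kernels: the explicit VMA check in io_uring returned -EOPNOTSUPP. */
	if (ret == -EOPNOTSUPP)
		return true;

	/*
	 * Newer kernels: GUP rejects the pin and io_uring reports -EFAULT.
	 * -EFAULT is not specific to this case, so treat it as "possibly".
	 */
	if (ret == -EFAULT)
		return true;

	return false;
}
```

Treating both values the same way keeps the caller independent of the kernel version, at the cost of being less precise about why the registration failed.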

Patch

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index d46f72a5ef73..b6451f8bc5d5 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1030,9 +1030,8 @@  static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages)
 {
 	unsigned long start, end, nr_pages;
-	struct vm_area_struct **vmas = NULL;
 	struct page **pages = NULL;
-	int i, pret, ret = -ENOMEM;
+	int pret, ret = -ENOMEM;
 
 	end = (ubuf + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	start = ubuf >> PAGE_SHIFT;
@@ -1042,45 +1041,24 @@  struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages)
 	if (!pages)
 		goto done;
 
-	vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
-			      GFP_KERNEL);
-	if (!vmas)
-		goto done;
-
 	ret = 0;
 	mmap_read_lock(current->mm);
 	pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
-			      pages, vmas);
-	if (pret == nr_pages) {
-		/* don't support file backed memory */
-		for (i = 0; i < nr_pages; i++) {
-			struct vm_area_struct *vma = vmas[i];
-
-			if (vma_is_shmem(vma))
-				continue;
-			if (vma->vm_file &&
-			    !is_file_hugepages(vma->vm_file)) {
-				ret = -EOPNOTSUPP;
-				break;
-			}
-		}
+			      pages, NULL);
+	if (pret == nr_pages)
 		*npages = nr_pages;
-	} else {
+	else
 		ret = pret < 0 ? pret : -EFAULT;
-	}
+
 	mmap_read_unlock(current->mm);
 	if (ret) {
-		/*
-		 * if we did partial map, or found file backed vmas,
-		 * release any pages we did get
-		 */
+		/* if we did partial map, release any pages we did get */
 		if (pret > 0)
 			unpin_user_pages(pages, pret);
 		goto done;
 	}
 	ret = 0;
 done:
-	kvfree(vmas);
 	if (ret < 0) {
 		kvfree(pages);
 		pages = ERR_PTR(ret);