diff mbox series

iov_iter: fix advancing slot in iter_folioq_get_pages()

Message ID cbaf141ba6c0e2e209717d02746584072844841a.1727722269.git.osandov@fb.com (mailing list archive)
State New
Headers show
Series iov_iter: fix advancing slot in iter_folioq_get_pages() | expand

Commit Message

Omar Sandoval Sept. 30, 2024, 6:55 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

iter_folioq_get_pages() decides to advance to the next folioq slot when
it has reached the end of the current folio. However, it is checking
offset, which is the beginning of the current part, instead of
iov_offset, which is adjusted to the end of the current part, so it
doesn't advance the slot when it's supposed to. As a result, on the next
iteration, we'll use the same folio with an out-of-bounds offset and
return an unrelated page.

This manifested as various crashes and other failures in 9pfs in drgn's
VM testing setup and BPF CI.

Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
Tested-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 lib/iov_iter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eduard Zingerman Sept. 30, 2024, 7:27 p.m. UTC | #1
On Mon, 2024-09-30 at 11:55 -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> iter_folioq_get_pages() decides to advance to the next folioq slot when
> it has reached the end of the current folio. However, it is checking
> offset, which is the beginning of the current part, instead of
> iov_offset, which is adjusted to the end of the current part, so it
> doesn't advance the slot when it's supposed to. As a result, on the next
> iteration, we'll use the same folio with an out-of-bounds offset and
> return an unrelated page.
> 
> This manifested as various crashes and other failures in 9pfs in drgn's
> VM testing setup and BPF CI.
> 
> Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
> Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
> Tested-by: Manu Bretelle <chantr4@gmail.com>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---

Tried this on top of the following commit from net-fs-fixes branch:
e1b0d67c7ae0 ("9p: Don't revert the I/O iterator after reading")

The boot issue is gone, thank you!

Tested-by: Eduard Zingerman <eddyz87@gmail.com>

[...]
David Howells Sept. 30, 2024, 8:10 p.m. UTC | #2
Omar Sandoval <osandov@osandov.com> wrote:

> From: Omar Sandoval <osandov@fb.com>
> 
> iter_folioq_get_pages() decides to advance to the next folioq slot when
> it has reached the end of the current folio. However, it is checking
> offset, which is the beginning of the current part, instead of
> iov_offset, which is adjusted to the end of the current part, so it
> doesn't advance the slot when it's supposed to. As a result, on the next
> iteration, we'll use the same folio with an out-of-bounds offset and
> return an unrelated page.
> 
> This manifested as various crashes and other failures in 9pfs in drgn's
> VM testing setup and BPF CI.
> 
> Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
> Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
> Tested-by: Manu Bretelle <chantr4@gmail.com>
> Signed-off-by: Omar Sandoval <osandov@fb.com>

Thanks for finding that!  That would explain why I didn't see it with afs or
cifs - both of those pass the iterator directly to the socket rather than
pulling the pages out of it.  I'm not sure how I managed to do things like run
xfstests to completion and git clone and build a kernel without encountering
the bug.

Christian: Can you add this to vfs.fixes and tag it:

Acked-by: David Howells <dhowells@redhat.com>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Leon Romanovsky Oct. 1, 2024, 5:52 a.m. UTC | #3
On Mon, Sep 30, 2024 at 09:10:02PM +0100, David Howells wrote:
> Omar Sandoval <osandov@osandov.com> wrote:
> 
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > iter_folioq_get_pages() decides to advance to the next folioq slot when
> > it has reached the end of the current folio. However, it is checking
> > offset, which is the beginning of the current part, instead of
> > iov_offset, which is adjusted to the end of the current part, so it
> > doesn't advance the slot when it's supposed to. As a result, on the next
> > iteration, we'll use the same folio with an out-of-bounds offset and
> > return an unrelated page.
> > 
> > This manifested as various crashes and other failures in 9pfs in drgn's
> > VM testing setup and BPF CI.
> > 
> > Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
> > Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
> > Tested-by: Manu Bretelle <chantr4@gmail.com>
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> 
> Thanks for finding that!  That would explain why I didn't see it with afs or
> cifs - both of those pass the iterator directly to the socket rather than
> pulling the pages out of it.  I'm not sure how I managed to do things like run
> xfstests to completion and git clone and build a kernel without encountering
> the bug.
> 
> Christian: Can you add this to vfs.fixes and tag it:
> 
> Acked-by: David Howells <dhowells@redhat.com>
> Tested-by: Eduard Zingerman <eddyz87@gmail.com>

It worked for me too.

Tested-by: Leon Romanovsky <leon@kernel.org>

Thanks
Joey Gouly Oct. 1, 2024, 9:47 a.m. UTC | #4
On Mon, Sep 30, 2024 at 11:55:00AM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> iter_folioq_get_pages() decides to advance to the next folioq slot when
> it has reached the end of the current folio. However, it is checking
> offset, which is the beginning of the current part, instead of
> iov_offset, which is adjusted to the end of the current part, so it
> doesn't advance the slot when it's supposed to. As a result, on the next
> iteration, we'll use the same folio with an out-of-bounds offset and
> return an unrelated page.
> 
> This manifested as various crashes and other failures in 9pfs in drgn's
> VM testing setup and BPF CI.
> 
> Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
> Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
> Tested-by: Manu Bretelle <chantr4@gmail.com>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>  lib/iov_iter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 97003155bfac..1abb32c0da50 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1033,7 +1033,7 @@ static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
>  		if (maxpages == 0 || extracted >= maxsize)
>  			break;
>  
> -		if (offset >= fsize) {
> +		if (iov_offset >= fsize) {
>  			iov_offset = 0;
>  			slot++;
>  			if (slot == folioq_nr_slots(folioq) && folioq->next) {

This fixes booting for me with my 9pfs rootfs. Tested on next-20241001+this patch.

Tested-by: Joey Gouly <joey.gouly@arm.com>

Thanks!
Christian Brauner Oct. 1, 2024, 9:50 a.m. UTC | #5
On Mon, 30 Sep 2024 11:55:00 -0700, Omar Sandoval wrote:
> iter_folioq_get_pages() decides to advance to the next folioq slot when
> it has reached the end of the current folio. However, it is checking
> offset, which is the beginning of the current part, instead of
> iov_offset, which is adjusted to the end of the current part, so it
> doesn't advance the slot when it's supposed to. As a result, on the next
> iteration, we'll use the same folio with an out-of-bounds offset and
> return an unrelated page.
> 
> [...]

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] iov_iter: fix advancing slot in iter_folioq_get_pages()
      https://git.kernel.org/vfs/vfs/c/0d24852bd71e
diff mbox series

Patch

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 97003155bfac..1abb32c0da50 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1033,7 +1033,7 @@  static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
 		if (maxpages == 0 || extracted >= maxsize)
 			break;
 
-		if (offset >= fsize) {
+		if (iov_offset >= fsize) {
 			iov_offset = 0;
 			slot++;
 			if (slot == folioq_nr_slots(folioq) && folioq->next) {