
[RFC,08/13] cifs: Add a function to read into an iter from a socket

Message ID 20230125214543.2337639-9-dhowells@redhat.com (mailing list archive)
State New, archived
Series smb3: Use iov_iters down to the network transport and fix DIO page pinning

Commit Message

David Howells Jan. 25, 2023, 9:45 p.m. UTC
Add a helper function to read data from a socket into the given iterator.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org

Link: https://lore.kernel.org/r/164928617874.457102.10021662143234315566.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211419563.3154751.18431990381145195050.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348879662.2106726.16881134187242702351.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364826398.3334034.12541600783145647319.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126395495.708021.12328677373159554478.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697258876.61150.3530237818849429372.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031039.3186319.10691316510079412635.stgit@warthog.procyon.org.uk/ # rfc
---
 fs/cifs/cifsproto.h |  3 +++
 fs/cifs/connect.c   | 16 ++++++++++++++++
 2 files changed, 19 insertions(+)

Comments

David Laight Jan. 26, 2023, 9:27 a.m. UTC | #1
From: David Howells
> Sent: 25 January 2023 21:46

> Add a helper function to read data from a socket into the given iterator.
...
> +int
> +cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter,
> +			   unsigned int to_read)
> +{
> +	struct msghdr smb_msg;
> +	int ret;
> +
> +	smb_msg.msg_iter = *iter;
> +	if (smb_msg.msg_iter.count > to_read)
> +		smb_msg.msg_iter.count = to_read;
> +	ret = cifs_readv_from_socket(server, &smb_msg);
> +	if (ret > 0)
> +		iov_iter_advance(iter, ret);
> +	return ret;
> +}

On the face of it that passes a largely uninitialised 'struct msghdr'
to cifs_readv_from_socket() in order to pass an iov_iter.
That seems to be asking for trouble.
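
(A minimal way to avoid that, assuming nothing else in the msghdr needs
setting, would be a designated initialiser, which zeroes all the remaining
fields:

	struct msghdr smb_msg = { .msg_iter = *iter };

rather than declaring it uninitialised and assigning msg_iter afterwards.)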

I'm also not 100% sure that taking a copy of an iov_iter is a good idea.

If cifs_readv_from_socket() only needs the iov_iter then wouldn't
it be better to do the wrapper the other way around?
(Probably as an inline function)
Something like:

int
cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
{
	return cifs_read_iter_from_socket(server, &smb_msg->msg_iter, smb_msg->msg_iter.count);
}

and then changing cifs_readv_from_socket() to just use the iov_iter.
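
The iov_iter-only core might then look something like this - purely a sketch,
with the retry loop and reconnect handling of the real function elided:

int
cifs_readv_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter)
{
	/* Build a fully-initialised msghdr locally, just for sock_recvmsg();
	 * the caller's iterator is advanced by however much was read.
	 */
	struct msghdr smb_msg = { .msg_iter = *iter };
	int length;

	length = sock_recvmsg(server->ssocket, &smb_msg, 0);
	if (length > 0)
		iov_iter_advance(iter, length);
	return length;
}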

	David

David Howells Jan. 26, 2023, 3:44 p.m. UTC | #2
David Laight <David.Laight@ACULAB.COM> wrote:

> On the face of it that passes a largely uninitialised 'struct msghdr'
> to cifs_readv_from_socket() in order to pass an iov_iter.
> That seems to be asking for trouble.
> 
> If cifs_readv_from_socket() only needs the iov_iter then wouldn't
> it be better to do the wrapper the other way around?
> (Probably as an inline function)
> Something like:
> 
> int
> cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
> {
> 	return cifs_read_iter_from_socket(server, &smb_msg->msg_iter, smb_msg->msg_iter.count);
> }
> 
> and then changing cifs_readv_from_socket() to just use the iov_iter.

Yeah.  And smbd_recv() only cares about the iterator too.

> I'm also not 100% sure that taking a copy of an iov_iter is a good idea.

It shouldn't matter as the only problematic iterator is ITER_PIPE (advancing
that has side effects) - and splice_read is handled specially by patch 4.  The
problem with splice_read, given the way cifs works, is that cifs likes to
subdivide its read/write requests across multiple reqs and then sub-subdivide
them if certain types of failure occur.  But you can't do that with ITER_PIPE.

I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
user-backed iterators it has to be done this way because the I/O may get
shuffled off to a different thread.
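
Purely as an illustrative sketch of that extraction step (this is not the
code from patch 4, and it takes plain page refs where the real series has to
distinguish refs from pins):

	struct page *pages[16];
	size_t off, remain, want = iov_iter_count(user_iter);
	ssize_t got;
	unsigned int npages = 0;

	/* Take a ref on each user page; this also advances user_iter. */
	got = iov_iter_get_pages2(user_iter, pages, want, 16, &off);
	if (got < 0)
		return got;

	/* Wrap the pages in a bvec array that can outlive this thread. */
	for (remain = got; remain; npages++) {
		size_t len = min_t(size_t, remain, PAGE_SIZE - off);

		bv[npages].bv_page   = pages[npages];
		bv[npages].bv_offset = off;
		bv[npages].bv_len    = len;
		remain -= len;
		off = 0;
	}
	iov_iter_bvec(&bvec_iter, ITER_DEST, bv, npages, got);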

Reqs can then just copy the BVEC/XARRAY/KVEC and narrow the region, because the
master request at the top holds the vector list and the top cifs level or the
caller above the VFS (e.g. sys_execve) does what is necessary to retain the
pages.
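
For a BVEC, KVEC or XARRAY iterator, narrowing a copy is just (sketch, with
subreq_offset/subreq_len standing in for the subrequest's range):

	/* Struct copy is safe for these iterator types... */
	struct iov_iter sub = master_iter;

	/* ...then clip the copy to the subrequest's byte range. */
	iov_iter_advance(&sub, subreq_offset);
	iov_iter_truncate(&sub, subreq_len);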

David
David Laight Jan. 26, 2023, 4:09 p.m. UTC | #3
From: David Howells
> Sent: 26 January 2023 15:44
...
> > I'm also not 100% sure that taking a copy of an iov_iter is a good idea.
> 
> It shouldn't matter as the only problematic iterator is ITER_PIPE (advancing
> that has side effects) - and splice_read is handled specially by patch 4.  The
> problem with splice_read, given the way cifs works, is that cifs likes to
> subdivide its read/write requests across multiple reqs and then sub-subdivide
> them if certain types of failure occur.  But you can't do that with ITER_PIPE.

I was thinking that even if it's OK at the moment it might be troublesome later.
Somewhere I started writing a patch to put the iov_cache[] for user
requests into the same structure as the iterator.
Copying those might cause oddities.

> I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
> levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
> user-backed iterators it has to be done this way because the I/O may get
> shuffled off to a different thread.

For sub-page sized transfers it is probably worth doing a bounce-buffer
copy of user requests - just to save all the complex page-pinning code.
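
Something like this, say (sketch only; 'len' is the transfer size and the
actual receive step is elided):

	/* Bounce a small read through a kernel buffer rather than
	 * pinning the user pages.
	 */
	void *bounce = kmalloc(len, GFP_KERNEL);
	struct kvec kv = { .iov_base = bounce, .iov_len = len };
	struct iov_iter kiter;

	if (!bounce)
		return -ENOMEM;
	iov_iter_kvec(&kiter, ITER_DEST, &kv, 1, len);
	/* ... receive len bytes into kiter ... */
	if (copy_to_iter(bounce, len, user_iter) != len)
		ret = -EFAULT;
	kfree(bounce);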

	David

David Howells Jan. 26, 2023, 4:41 p.m. UTC | #4
David Laight <David.Laight@ACULAB.COM> wrote:

> > It shouldn't matter as the only problematic iterator is ITER_PIPE
> > (advancing that has side effects) - and splice_read is handled specially
> > by patch 4.  The problem with splice_read, given the way cifs works, is
> > that cifs likes to subdivide its read/write requests across multiple reqs
> > and then sub-subdivide them if certain types of failure occur.  But you
> > can't do that with ITER_PIPE.
> 
> I was thinking that even if it's OK at the moment it might be troublesome later.
> Somewhere I started writing a patch to put the iov_cache[] for user
> requests into the same structure as the iterator.
> Copying those might cause oddities.

Well, there is dup_iter(), but that copies the vector table, which isn't what
we want in a number of cases.  You probably need to come up with a wrapper for
that.

But we copy iters by assignment in a lot of places.  With regard to msghdr,
it might be worth giving it an iterator pointer rather than its own iterator.

I've just had a go at modifying the code.
cifs_read_iter_from_socket() wants to copy the iterator and truncate the copy,
which makes things slightly trickier.  For both of the call sites,
receive_encrypted_read() and cifs_readv_receive(), it can do the truncation
before calling cifs_read_iter_from_socket(), I think - but it may have to undo
the truncation afterwards.
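
That is, something along these lines at each call site (sketch):

	size_t saved = iov_iter_count(iter);

	iov_iter_truncate(iter, to_read);
	ret = cifs_read_iter_from_socket(server, iter, to_read);
	/* Undo the truncation; the iterator has already been advanced
	 * over whatever was actually read.
	 */
	iov_iter_reexpand(iter, saved - (ret > 0 ? ret : 0));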

> > I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
> > levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
> > user-backed iterators it has to be done this way because the I/O may get
> > shuffled off to a different thread.
> 
> For sub-page sized transfers it is probably worth doing a bounce-buffer
> copy of user requests - just to save all the complex page-pinning code.

You can't avoid it for async DIO reads.  But that sort of thing I'm intending
to do in netfslib.

David

Patch

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 1207b39686fb..cb7a3fe89278 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -244,6 +244,9 @@  extern int cifs_read_page_from_socket(struct TCP_Server_Info *server,
 					struct page *page,
 					unsigned int page_offset,
 					unsigned int to_read);
+int cifs_read_iter_from_socket(struct TCP_Server_Info *server,
+			       struct iov_iter *iter,
+			       unsigned int to_read);
 extern int cifs_setup_cifs_sb(struct cifs_sb_info *cifs_sb);
 void cifs_mount_put_conns(struct cifs_mount_ctx *mnt_ctx);
 int cifs_mount_get_session(struct cifs_mount_ctx *mnt_ctx);
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index b2a04b4e89a5..5bdabbb6c45c 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -765,6 +765,22 @@  cifs_read_page_from_socket(struct TCP_Server_Info *server, struct page *page,
 	return cifs_readv_from_socket(server, &smb_msg);
 }
 
+int
+cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter,
+			   unsigned int to_read)
+{
+	struct msghdr smb_msg;
+	int ret;
+
+	smb_msg.msg_iter = *iter;
+	if (smb_msg.msg_iter.count > to_read)
+		smb_msg.msg_iter.count = to_read;
+	ret = cifs_readv_from_socket(server, &smb_msg);
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
 static bool
 is_smb_response(struct TCP_Server_Info *server, unsigned char type)
 {