Message ID: 20230125214543.2337639-9-dhowells@redhat.com (mailing list archive)
State: New, archived
Series: smb3: Use iov_iters down to the network transport and fix DIO page pinning
From: David Howells
> Sent: 25 January 2023 21:46
>
> Add a helper function to read data from a socket into the given iterator.
...
> +int
> +cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter,
> +			   unsigned int to_read)
> +{
> +	struct msghdr smb_msg;
> +	int ret;
> +
> +	smb_msg.msg_iter = *iter;
> +	if (smb_msg.msg_iter.count > to_read)
> +		smb_msg.msg_iter.count = to_read;
> +	ret = cifs_readv_from_socket(server, &smb_msg);
> +	if (ret > 0)
> +		iov_iter_advance(iter, ret);
> +	return ret;
> +}

On the face of it that passes a largely uninitialised 'struct msghdr'
to cifs_readv_from_socket() in order to pass an iov_iter.
That seems to be asking for trouble.

I'm also not 100% sure that taking a copy of an iov_iter is a good idea.

If cifs_readv_from_socket() only needs the iov_iter then wouldn't
it be better to do the wrapper the other way around?
(Probably as an inline function)
Something like:

int
cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
{
	return cifs_read_iter_from_socket(server, &smb_msg->msg_iter, smb_msg->msg_iter.count);
}

and then changing cifs_readv_from_socket() to just use the iov_iter.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
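A minimal sketch of one way to meet the uninitialised-msghdr objection
without restructuring the callers (illustrative only, not the posted patch):
zero-initialise the msghdr with a designated initialiser and let
iov_iter_truncate() do the capping.

int
cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter,
			   unsigned int to_read)
{
	/* Designated initialiser zeroes all the other msghdr fields. */
	struct msghdr smb_msg = { .msg_iter = *iter };
	int ret;

	/* Cap the private copy to the amount wanted from the socket. */
	iov_iter_truncate(&smb_msg.msg_iter, to_read);
	ret = cifs_readv_from_socket(server, &smb_msg);
	if (ret > 0)
		iov_iter_advance(iter, ret);	/* reflect progress in the caller's iterator */
	return ret;
}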
David Laight <David.Laight@ACULAB.COM> wrote:

> On the face of it that passes a largely uninitialised 'struct msghdr'
> to cifs_readv_from_socket() in order to pass an iov_iter.
> That seems to be asking for trouble.
>
> If cifs_readv_from_socket() only needs the iov_iter then wouldn't
> it be better to do the wrapper the other way around?
> (Probably as an inline function)
> Something like:
>
> int
> cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
> {
> 	return cifs_read_iter_from_socket(server, &smb_msg->msg_iter, smb_msg->msg_iter.count);
> }
>
> and then changing cifs_readv_from_socket() to just use the iov_iter.

Yeah.  And smbd_recv() only cares about the iterator too.

> I'm also not 100% sure that taking a copy of an iov_iter is a good idea.

It shouldn't matter as the only problematic iterator is ITER_PIPE (advancing
that has side effects) - and splice_read is handled specially by patch 4.  The
problem with splice_read with the way cifs works is that it likes to subdivide
its read/write requests across multiple reqs and then subsubdivide them if
certain types of failure occur.  But you can't do that with ITER_PIPE.

I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
user-backed iterators it has to be done this way because the I/O may get
shuffled off to a different thread.

Reqs can then just copy the BVEC/XARRAY/KVEC and narrow the region because the
master request at the top holds the vector list and the top cifs level or the
caller above the vfs (e.g. sys_execve) does what is necessary to retain the
pages.

David
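As a rough illustration of that ITER_BVEC construction (the helper name below
is hypothetical; the real series uses netfs/cifs helpers that also deal with
FOLL_PIN versus plain page references, larger iterators and cleanup):

static ssize_t user_iter_to_bvec(struct iov_iter *src, struct bio_vec *bv,
				 unsigned int max_pages,
				 struct iov_iter *bvec_iter)
{
	struct page *pages[16];		/* assume max_pages <= 16 for the sketch */
	size_t start, remaining;
	ssize_t got;
	unsigned int i, npages;

	/* Take references on the user pages and advance the source iterator. */
	got = iov_iter_get_pages2(src, pages, SIZE_MAX, max_pages, &start);
	if (got <= 0)
		return got;

	npages = DIV_ROUND_UP(start + got, PAGE_SIZE);
	remaining = got;
	for (i = 0; i < npages; i++) {
		size_t len = min_t(size_t, PAGE_SIZE - start, remaining);

		bv[i].bv_page   = pages[i];
		bv[i].bv_offset = start;
		bv[i].bv_len    = len;
		remaining -= len;
		start = 0;
	}

	/* The receive path then reads into this ITER_BVEC instead of the
	 * user-backed iterator, so the I/O can safely be handed to another
	 * thread. */
	iov_iter_bvec(bvec_iter, ITER_DEST, bv, npages, got);
	return got;
}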
From: David Howells
> Sent: 26 January 2023 15:44
...
> > I'm also not 100% sure that taking a copy of an iov_iter is a good idea.
>
> It shouldn't matter as the only problematic iterator is ITER_PIPE (advancing
> that has side effects) - and splice_read is handled specially by patch 4.  The
> problem with splice_read with the way cifs works is that it likes to subdivide
> its read/write requests across multiple reqs and then subsubdivide them if
> certain types of failure occur.  But you can't do that with ITER_PIPE.

I was thinking that even if ok at the moment it might be troublesome later.

Somewhere I started writing a patch to put the iov_cache[] for user
requests into the same structure as the iterator.
Copying those might cause oddities.

> I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
> levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
> user-backed iterators it has to be done this way because the I/O may get
> shuffled off to a different thread.

For sub-page sized transfers it is probably worth doing a bounce buffer
copy of user requests - just to save all the complex page pinning code.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
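For reference, the bounce-buffer approach for sub-page sized reads might look
roughly like the sketch below (a hypothetical helper, not part of the series;
it reuses the posted cifs_read_iter_from_socket() and copies out with
copy_to_iter()):

static ssize_t small_read_bounce(struct TCP_Server_Info *server,
				 struct iov_iter *to, size_t len)
{
	struct kvec kv;
	struct iov_iter kiter;
	void *buf;
	ssize_t ret;

	if (len > PAGE_SIZE)
		return -EINVAL;	/* only worthwhile for sub-page transfers */

	buf = kmalloc(len, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/* Receive into the kernel buffer via a KVEC iterator... */
	kv.iov_base = buf;
	kv.iov_len = len;
	iov_iter_kvec(&kiter, ITER_DEST, &kv, 1, len);

	ret = cifs_read_iter_from_socket(server, &kiter, len);

	/* ...then copy out to the (possibly user-backed) destination. */
	if (ret > 0 && copy_to_iter(buf, ret, to) != ret)
		ret = -EFAULT;

	kfree(buf);
	return ret;
}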
David Laight <David.Laight@ACULAB.COM> wrote:

> > It shouldn't matter as the only problematic iterator is ITER_PIPE
> > (advancing that has side effects) - and splice_read is handled specially
> > by patch 4.  The problem with splice_read with the way cifs works is that
> > it likes to subdivide its read/write requests across multiple reqs and
> > then subsubdivide them if certain types of failure occur.  But you can't
> > do that with ITER_PIPE.
>
> I was thinking that even if ok at the moment it might be troublesome later.
> Somewhere I started writing a patch to put the iov_cache[] for user
> requests into the same structure as the iterator.
> Copying those might cause oddities.

Well, there is dup_iter(), but that copies the vector table, which isn't what
we want in a number of cases.  You probably need to come up with a wrapper for
that.  But we copy iters by assignment in a lot of places.

With regards to msg_hdr, it might be worth giving it an iterator pointer
rather than its own iterator.

I've just had a go at modifying the code.  cifs_read_iter_from_socket() wants
to copy the iterator and truncate the copy, which makes things slightly
trickier.  For both of the call sites, receive_encrypted_read() and
cifs_readv_receive(), it can do the truncation before calling
cifs_read_iter_from_socket(), I think - but it may have to undo the truncation
afterwards.

> > I build an ITER_BVEC from ITER_PIPE, ITER_UBUF and ITER_IOVEC in the top
> > levels with pins inserted as appropriate and hand the ITER_BVEC down.  For
> > user-backed iterators it has to be done this way because the I/O may get
> > shuffled off to a different thread.
>
> For sub-page sized transfers it is probably worth doing a bounce buffer
> copy of user requests - just to save all the complex page pinning code.

You can't avoid it for async DIO reads.  But that sort of thing I'm intending
to do in netfslib.

David
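The truncate-then-undo dance being described might look something like this
at a call site (the caller name is made up; it assumes the posted
cifs_read_iter_from_socket(), which advances the caller's iterator by however
much was actually read):

static int read_body_into_iter(struct TCP_Server_Info *server,
			       struct iov_iter *iter, size_t to_read)
{
	size_t old_count = iov_iter_count(iter);
	size_t before, consumed;
	int ret;

	/* Cap the iterator to what this PDU still owes us. */
	iov_iter_truncate(iter, to_read);
	before = iov_iter_count(iter);

	ret = cifs_read_iter_from_socket(server, iter, to_read);

	/* Undo the truncation: restore the original remaining length,
	 * less whatever the helper consumed (it advanced the iterator). */
	consumed = before - iov_iter_count(iter);
	iov_iter_reexpand(iter, old_count - consumed);
	return ret;
}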
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 1207b39686fb..cb7a3fe89278 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -244,6 +244,9 @@ extern int cifs_read_page_from_socket(struct TCP_Server_Info *server,
 					struct page *page,
 					unsigned int page_offset,
 					unsigned int to_read);
+int cifs_read_iter_from_socket(struct TCP_Server_Info *server,
+			       struct iov_iter *iter,
+			       unsigned int to_read);
 extern int cifs_setup_cifs_sb(struct cifs_sb_info *cifs_sb);
 void cifs_mount_put_conns(struct cifs_mount_ctx *mnt_ctx);
 int cifs_mount_get_session(struct cifs_mount_ctx *mnt_ctx);
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index b2a04b4e89a5..5bdabbb6c45c 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -765,6 +765,22 @@ cifs_read_page_from_socket(struct TCP_Server_Info *server, struct page *page,
 	return cifs_readv_from_socket(server, &smb_msg);
 }
 
+int
+cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter,
+			   unsigned int to_read)
+{
+	struct msghdr smb_msg;
+	int ret;
+
+	smb_msg.msg_iter = *iter;
+	if (smb_msg.msg_iter.count > to_read)
+		smb_msg.msg_iter.count = to_read;
+	ret = cifs_readv_from_socket(server, &smb_msg);
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
 static bool
 is_smb_response(struct TCP_Server_Info *server, unsigned char type)
 {
Add a helper function to read data from a socket into the given iterator.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164928617874.457102.10021662143234315566.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211419563.3154751.18431990381145195050.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348879662.2106726.16881134187242702351.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364826398.3334034.12541600783145647319.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126395495.708021.12328677373159554478.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697258876.61150.3530237818849429372.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031039.3186319.10691316510079412635.stgit@warthog.procyon.org.uk/ # rfc
---
 fs/cifs/cifsproto.h |  3 +++
 fs/cifs/connect.c   | 16 ++++++++++++++++
 2 files changed, 19 insertions(+)