Message ID | 2132364.1674655333@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New, archived |
Series | cifs: Fix oops due to uncleared server->smbd_conn in reconnect |
On 1/25/2023 9:02 AM, David Howells wrote:
> Hi Steve,
>
> That attached patch stops the kernel from oopsing, but it still tries
> endlessly to send with softRoCE. I'm having better luck with softIWarp - with
> some other patches, I can run generic/001 to completion with that transport.

Do you have any logging from the softRoCE runs? I'd suspect some
kind of RDMA-specific scatter/gather overflow which might be
server-side as easily as client-side.

On client, try:
  echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class

On server:
  ksmbd.control -d conn
  ksmbd.control -d rdma

> ---
> commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
> Author: David Howells <dhowells@redhat.com>
> Date: Wed Jan 25 12:42:07 2023 +0000
>
>     cifs: Fix oops due to uncleared server->smbd_conn in reconnect
>
>     In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
>     smbd_connection struct that it points to so that reconnection doesn't get
>     confused.
>
>     Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
>     Signed-off-by: David Howells <dhowells@redhat.com>
>     cc: Long Li <longli@microsoft.com>
>     cc: Steve French <smfrench@gmail.com>
>     cc: Pavel Shilovsky <pshilov@microsoft.com>
>     cc: Ronnie Sahlberg <lsahlber@redhat.com>
>     cc: linux-cifs@vger.kernel.org
>
> diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
> index 90789aaa6567..8c816b25ce7c 100644
> --- a/fs/cifs/smbdirect.c
> +++ b/fs/cifs/smbdirect.c
> @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
>  	destroy_workqueue(info->workqueue);
>  	log_rdma_event(INFO, "rdma session destroyed\n");
>  	kfree(info);
> +	server->smbd_conn = NULL;
>  }
>
>  /*
>
>
Re the one-liner...

Acked-by: Tom Talpey <tom@talpey.com>

On 1/25/2023 9:02 AM, David Howells wrote:
> Hi Steve,
>
> That attached patch stops the kernel from oopsing, but it still tries
> endlessly to send with softRoCE. I'm having better luck with softIWarp - with
> some other patches, I can run generic/001 to completion with that transport.
>
> David
>
> ---
> commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
> Author: David Howells <dhowells@redhat.com>
> Date: Wed Jan 25 12:42:07 2023 +0000
>
>     cifs: Fix oops due to uncleared server->smbd_conn in reconnect
>
>     In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
>     smbd_connection struct that it points to so that reconnection doesn't get
>     confused.
>
>     Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
>     Signed-off-by: David Howells <dhowells@redhat.com>
>     cc: Long Li <longli@microsoft.com>
>     cc: Steve French <smfrench@gmail.com>
>     cc: Pavel Shilovsky <pshilov@microsoft.com>
>     cc: Ronnie Sahlberg <lsahlber@redhat.com>
>     cc: linux-cifs@vger.kernel.org
>
> diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
> index 90789aaa6567..8c816b25ce7c 100644
> --- a/fs/cifs/smbdirect.c
> +++ b/fs/cifs/smbdirect.c
> @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
>  	destroy_workqueue(info->workqueue);
>  	log_rdma_event(INFO, "rdma session destroyed\n");
>  	kfree(info);
> +	server->smbd_conn = NULL;
>  }
>
>  /*
>
>
minor cleanup of description and pushed to cifs-2.6.git for-next

On Wed, Jan 25, 2023 at 8:05 AM David Howells <dhowells@redhat.com> wrote:
>
> Hi Steve,
>
> That attached patch stops the kernel from oopsing, but it still tries
> endlessly to send with softRoCE. I'm having better luck with softIWarp - with
> some other patches, I can run generic/001 to completion with that transport.
>
> David
>
> ---
> commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
> Author: David Howells <dhowells@redhat.com>
> Date: Wed Jan 25 12:42:07 2023 +0000
>
>     cifs: Fix oops due to uncleared server->smbd_conn in reconnect
>
>     In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
>     smbd_connection struct that it points to so that reconnection doesn't get
>     confused.
>
>     Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
>     Signed-off-by: David Howells <dhowells@redhat.com>
>     cc: Long Li <longli@microsoft.com>
>     cc: Steve French <smfrench@gmail.com>
>     cc: Pavel Shilovsky <pshilov@microsoft.com>
>     cc: Ronnie Sahlberg <lsahlber@redhat.com>
>     cc: linux-cifs@vger.kernel.org
>
> diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
> index 90789aaa6567..8c816b25ce7c 100644
> --- a/fs/cifs/smbdirect.c
> +++ b/fs/cifs/smbdirect.c
> @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
>  	destroy_workqueue(info->workqueue);
>  	log_rdma_event(INFO, "rdma session destroyed\n");
>  	kfree(info);
> +	server->smbd_conn = NULL;
>  }
>
>  /*
>
Hi Tom,

Steve suggested I should ask you about this.

I have IWarp RDMA mostly working with my iteratorisation patches - certainly
better than without them, but I think that's mostly due to the patch that
Stefan Metzmacher so dislikes ("cifs: Fix problem with encrypted RDMA data
read").

However, fallocate doesn't work:

   # rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp
   # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
   # fallocate -l 1M /xfstest.test/hello
   fallocate: fallocate failed: Resource temporarily unavailable

Because smb3_simple_fallocate_write_range() calls SMB2_write(), which calls
cifs_send_recv() then compound_send_recv() and thence to smb_send_rqst().

smb_send_rqst() encrypts the buffer it is given and smbd_send() attempts to
shovel it to the server using Direct Data Placement - which I think might fail
because the data is encrypted.

In one run of the above commands, the data in the kvec array looked like:

   fe534d42400001000000000009000a0000000000000000001600000000000000a01300000200
   0000000000000000000000000000000000000000000000000000000000000000000000000000

before the smb_send_rqst() gets to ->init_transform_rq() and like:

   98eddc1bc31da7c55c00341e4dc769fa4c8b2b0ecdacbad33eb31855ec162fa2458b8437edc7
   88ee0a033c84aa857b65ab31ce553594d412719cc3daf925e873e80062ec16b97c855721a42d

after. The encrypted data is seen on the wire in DDP/RDMA packets.

Any thoughts as to how to fix this?

Does it need to pass a flag down to suppress the encryption or suppress the
use of direct data placement? Or should it perhaps go through something like
->write_iter()?

Note also that it encrypts the buffer in place and then
smb3_simple_fallocate_write_range() reuses the buffer multiple times without
clearing it.

I've pushed my cifs iteratorisation patches to:

   https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cifs

I can post them by email a bit later.

David
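To illustrate the closing note about in-place encryption of a reused buffer, here is a minimal userspace sketch. It is not the cifs code: `toy_transform_in_place()`, `send_in_place()` and `send_via_copy()` are hypothetical names, and the "transform" is a trivial byte increment standing in for encryption. It only shows that a sender which transforms the caller's buffer changes what the next send of that same buffer starts from, whereas transforming a private copy does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an in-place transform such as encryption: it
 * overwrites the buffer it is given.  NOT real cryptography. */
static void toy_transform_in_place(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (unsigned char)(buf[i] + 1);
}

/* Hypothetical sender that transforms the caller's buffer directly.
 * Returns the first byte "sent" so the effect can be observed. */
static unsigned char send_in_place(unsigned char *buf, size_t len)
{
	assert(len > 0);
	toy_transform_in_place(buf, len);
	return buf[0];
}

/* Hypothetical sender that transforms a private copy instead, leaving
 * the caller's buffer untouched so it can be reused safely. */
static unsigned char send_via_copy(const unsigned char *buf, size_t len)
{
	unsigned char scratch[64];

	assert(len > 0 && len <= sizeof(scratch));
	memcpy(scratch, buf, len);
	toy_transform_in_place(scratch, len);
	return scratch[0];
}
```

With a zero-filled buffer, two calls to `send_in_place()` send different bytes because the second call starts from already-transformed data, while two calls to `send_via_copy()` send the same bytes. Whether the real fix should copy, re-zero the buffer, or avoid the transform is the open question in the message above.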
On 1/25/2023 3:41 PM, David Howells wrote:
> Hi Tom,
>
> Steve suggested I should ask you about this.
>
> I have IWarp RDMA mostly working with my iteratorisation patches - certainly
> better than without them, but I think that's mostly due to the patch that
> Stefan Metzmacher so dislikes ("cifs: Fix problem with encrypted RDMA data
> read").

The encryption problem is real, and Metze is correct. The client
shouldn't be requesting, and the server shouldn't be responding, with
unencrypted messages on encrypted shares.

The problem is, the proper fix is complicated.

- We've reported the issue to Microsoft, but they have not yet said how
  the Windows client and server are intended to behave, and they have not
  yet revealed how the protocol document will be changed. At this time,
  the Linux implementation conforms, dangerously, with the published spec.

- There is some unexplained behavior in the client when the connection is
  lost after failing to decrypt the unencrypted response. In my earlier
  look at the traces, for some reason it reconnects and retries without
  requesting RDMA. This succeeds, because the "inline" requests and
  responses are encrypted and decrypted successfully.

It's interesting that this occurs on a compounded fallocate call. That
might be a clue, too.

What are you trying to test? Since encrypted SMBDirect traffic is known
to have an issue, I guess I'd suggest turning off encryption-by-default
on the share.

I'll poke Microsoft again on the protocol ticket.

Tom.

> However, fallocate doesn't work:
>
> # rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp
> # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
> # fallocate -l 1M /xfstest.test/hello
> fallocate: fallocate failed: Resource temporarily unavailable
>
> Because smb3_simple_fallocate_write_range() calls SMB2_write(), which calls
> cifs_send_recv() then compound_send_recv() and thence to smb_send_rqst().
>
> smb_send_rqst() encrypts the buffer it is given and smbd_send() attempts to
> shovel it to the server using Direct Data Placement - which I think might fail
> because the data is encrypted.
>
> In one run of the above commands, the data in the kvec array looked like:
>
> fe534d42400001000000000009000a0000000000000000001600000000000000a01300000200
> 0000000000000000000000000000000000000000000000000000000000000000000000000000
>
> before the smb_send_rqst() gets to ->init_transform_rq() and like:
>
> 98eddc1bc31da7c55c00341e4dc769fa4c8b2b0ecdacbad33eb31855ec162fa2458b8437edc7
> 88ee0a033c84aa857b65ab31ce553594d412719cc3daf925e873e80062ec16b97c855721a42d
>
> after. The encrypted data is seen on the wire in DDP/RDMA packets.
>
> Any thoughts as to how to fix this?
>
> Does it need to pass a flag down to suppress the encryption or suppress the
> use of direct data placement? Or should it perhaps go through something like
> ->write_iter()?
>
> Note also that it encrypts the buffer in place and then
> smb3_simple_fallocate_write_range() reuses the buffer multiple times without
> clearing it.
>
> I've pushed my cifs iteratorisation patches to:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cifs
>
> I can post them by email a bit later.
>
> David
>
>
Tom Talpey <tom@talpey.com> wrote:

> What are you trying to test?

I'm trying to make sure my iteratorisation patches work, including with RDMA.
I have some functions to decant some data from an iterator either into a
scatterlist or into an RDMA SGE array without the need to get refs on pages.

> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
> suggest turning off encryption-by-default on the share.

How do I do that? In the ksmbd config?

	[global]
		smb3 encryption = yes

David
On 1/25/2023 5:43 PM, David Howells wrote:
> Tom Talpey <tom@talpey.com> wrote:
>
>> What are you trying to test?
>
> I'm trying to make sure my iteratorisation patches work, including with RDMA.
> I have some functions to decant some data from an iterator either into a
> scatterlist or into an RDMA SGE array without the need to get refs on pages.

Most excellent. Great name for the task too. :)

There are going to be a couple of paths to test eventually. In the
non-encrypted case, the data will be coming down with a rather
different set of sges/segments than after it goes through the
scrambler. Since we're not ready to implement the encrypted
SMBDirect traffic yet, it's best to put off the encrypted path
work/testing, agree?

>> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
>> suggest turning off encryption-by-default on the share.
>
> How do I do that? In the ksmbd config?
>
> [global]
>     smb3 encryption = yes

That's definitely needed, but also check that the share stanzas
do not request encryption, as well.

Tom.
2023-01-26 7:43 GMT+09:00, David Howells <dhowells@redhat.com>:
> Tom Talpey <tom@talpey.com> wrote:
>
>> What are you trying to test?
>
> I'm trying to make sure my iteratorisation patches work, including with
> RDMA.
> I have some functions to decant some data from an iterator either into a
> scatterlist or into an RDMA SGE array without the need to get refs on
> pages.
>
>> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
>> suggest turning off encryption-by-default on the share.
>
> How do I do that? In the ksmbd config?
>
> [global]
>     smb3 encryption = yes

I recently changed the input of the smb3 encryption parameters.
It is "auto" by default. Requests/responses will not be encrypted unless
you give the seal option in the mount options. So please update to the
latest ksmbd-tools for your test.

man ksmbd.conf

       smb3 encryption (G)
              Client is disallowed, allowed, or required to use SMB3
              encryption. With smb3 encryption = disabled, SMB3 encryption
              is disallowed even if it is requested by the client. With
              smb3 encryption = auto, SMB3 encryption is allowed if it is
              requested by the client. With smb3 encryption = mandatory,
              SMB3 encryption is required, i.e. clients that do not support
              encryption will be denied access to the share.

              Default: smb3 encryption = auto

Thanks.
>
> David
>
>
Hi Tom, Steve,

Could you take a look at the attached and see if you can tell me why it's
going wrong? It's a server-side packet capture of:

   # rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp
   # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
   # fallocate -l 1M /xfstest.test/hello
   fallocate: fallocate failed: Resource temporarily unavailable
   # dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2
   1+0 records in
   1+0 records out
   16384 bytes (16 kB, 16 KiB) copied, 0.108858 s, 151 kB/s
   # umount /xfstest.test

I altered the code to only send 16K of data at a time during the fallocate so
that each block should fit within one message, but it fails whilst sending the
first write.

The fallocate starts at frame 74. There's an Ioctl exchange and then it
starts using "DDP/RDMA Send" to shovel data across (the data looks right), but
the server sends a Terminate packet in frame 90 before the client's Send is
complete. The Send completes in frame 92 and the wireshark decoder seems to
like it.

For comparison I also did a DIO write with dd. That starts in frame 125 and
uses a different mechanism (DDP/RDMA Read Request and Read Response) to shovel
the data - and that completes successfully.

I've switched the encryption back to auto, so it's not doing transport
encryption.

Thanks,
David
Tom Talpey <tom@talpey.com> wrote:

> Do you have any logging from the softRoCE runs? I'd suspect some
> kind of RDMA-specific scatter/gather overflow which might be
> server-side as easily as client-side.
>
> On client, try:
> echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
>
> On server:
> ksmbd.control -d conn
> ksmbd.control -d rdma

Okay, on -rc5 without my patches, using:

   # rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
   # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
   # dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2

the dd hangs. I've captured the client and server logging you requested plus
a pcap file on the server (see attached).

Note also I tried md5summing a 1MiB file and that produced a different MD5 sum
each time. I couldn't see enough data being transferred in the pcap to
indicate that that was happening.

David
On 1/26/2023 10:20 AM, David Howells wrote:
> Tom Talpey <tom@talpey.com> wrote:
>
>> Do you have any logging from the softRoCE runs? I'd suspect some
>> kind of RDMA-specific scatter/gather overflow which might be
>> server-side as easily as client-side.
>>
>> On client, try:
>> echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
>>
>> On server:
>> ksmbd.control -d conn
>> ksmbd.control -d rdma
>
> Okay, on -rc5 without my patches, using:
>
> # rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
> # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
> # dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2
>
> the dd hangs. I've captured the client and server logging you requested plus
> a pcap file on the server (see attached).
>
> Note also I tried md5summing a 1MiB file and that produced a different MD5 sum
> each time. I couldn't see enough data being transferred in the pcap to
> indicate that that was happening.

It looks like the server is seeing transmit timeouts on its
responses, there are 7 of these in server-log.txt:

[3700697.936899] ksmbd: smb_direct: read/write error. opcode = 0, status = transport retry counter exceeded(12)
[3700697.937043] ksmbd: Failed to send message: -107

Maybe this is a softiWARP issue?
Tom Talpey <tom@talpey.com> wrote:
> Maybe this is a softiWARP issue?
That should be softRoCE.
David
Steve French <smfrench@gmail.com> wrote:

> I am puzzled ... you show the fallocate failing but why do you mention
> it sending data, sending writes

smb3_simple_fallocate_write_range() sends data.

> - when I try the fallocate you pasted above I see what is in the attached
> screenshot go over the network (no writes) - and your example looks like it
> simply doesn't send anything then resets the session at frame 93

Look at frame 92. That's the concluding packet of the write performed by
smb3_simple_fallocate_write_range().

   74 4.568861795 192.168.6.2 -> 192.168.6.1 SMB2 250 Ioctl Request FSCTL_QUERY_ALLOCATED_RANGES File: hello
   75 4.569429926 192.168.6.1 -> 192.168.6.2 SMB2 242 Ioctl Response FSCTL_QUERY_ALLOCATED_RANGES
   77 4.680495774 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   78 4.680496219 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   79 4.680496364 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   80 4.680496552 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   81 4.680496698 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   82 4.680496844 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   83 4.680496989 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   84 4.680497177 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   88 4.680638842 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   89 4.680639016 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   90 4.680704523 192.168.6.1 -> 192.168.6.2 DDP/RDMA 114 5445 > 50018 Terminate [last DDP segment]
   91 4.680735089 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
   92 4.680735359 192.168.6.2 -> 192.168.6.1 SMB2 946 Write Request Len:16384 Off:204800 File: hello

David
On 1/26/2023 2:54 PM, David Howells wrote:
> Steve French <smfrench@gmail.com> wrote:
>
>> I am puzzled ... you show the fallocate failing but why do you mention
>> it sending data, sending writes
>
> smb3_simple_fallocate_write_range() sends data.
>
>> - when I try the fallocate you pasted above I see what is in the attached
>> screenshot go over the network (no writes) - and your example looks like it
>> simply doesn't send anything then resets the session at frame 93
>
> Look at frame 92. That's the concluding packet of the write performed by
> smb3_simple_fallocate_write_range().
>
> 74 4.568861795 192.168.6.2 -> 192.168.6.1 SMB2 250 Ioctl Request FSCTL_QUERY_ALLOCATED_RANGES File: hello
> 75 4.569429926 192.168.6.1 -> 192.168.6.2 SMB2 242 Ioctl Response FSCTL_QUERY_ALLOCATED_RANGES
> 77 4.680495774 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 78 4.680496219 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 79 4.680496364 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 80 4.680496552 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 81 4.680496698 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 82 4.680496844 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 83 4.680496989 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 84 4.680497177 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 88 4.680638842 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 89 4.680639016 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 90 4.680704523 192.168.6.1 -> 192.168.6.2 DDP/RDMA 114 5445 > 50018 Terminate [last DDP segment]
> 91 4.680735089 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 92 4.680735359 192.168.6.2 -> 192.168.6.1 SMB2 946 Write Request Len:16384 Off:204800 File: hello
>

That's a really large SMBDirect Send operation, it looks like it's
trying to send the entire write in one message and it overflows
the receive buffer.

I'm still fighting with wireshark and can't decode the layers
above TCP. Can you look at the SMBDirect negotiation at the
start of the trace, and tell me what the max send/receive
values were set by each side?
Tom Talpey <tom@talpey.com> wrote:

> That's a really large SMBDirect Send operation, it looks like it's
> trying to send the entire write in one message and it overflows
> the receive buffer.
>
> I'm still fighting with wireshark and can't decode the layers
> above TCP. Can you look at the SMBDirect negotiation at the
> start of the trace, and tell me what the max send/receive
> values were set by each side?

Frame 8: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 33, Ack: 33, Len: 44
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    NegotiateRequest
        MinVersion: 0x0100
        MaxVersion: 0x0100
        CreditsRequested: 255
        PreferredSendSize: 1364
        MaxReceiveSize: 1364
        MaxFragmentedSize: 1048576

Frame 9: 122 bytes on wire (976 bits), 122 bytes captured (976 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac), Dst: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30)
Internet Protocol Version 4, Src: 192.168.6.1, Dst: 192.168.6.2
Transmission Control Protocol, Src Port: 5445, Dst Port: 50018, Seq: 33, Ack: 77, Len: 56
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    NegotiateResponse
        MinVersion: 0x0100
        MaxVersion: 0x0100
        NegotiatedVersion: 0x0100
        CreditsRequested: 255
        CreditsGranted: 254
        Status: STATUS_SUCCESS (0x00000000)
        MaxReadWriteSize: 524224
        PreferredSendSize: 1364
        MaxReceiveSize: 1364
        MaxFragmentedSize: 173910

Frame 10: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 77, Ack: 89, Len: 44
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    DataMessage
        CreditsRequested: 255
        CreditsGranted: 255
        Flags: 0x0000
            .... .... .... ...0 = ResponseRequested: False
        RemainingLength: 0
        DataOffset: 0
        DataLength: 0

Frame 11: 346 bytes on wire (2768 bits), 346 bytes captured (2768 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 121, Ack: 89, Len: 280
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    DataMessage
        CreditsRequested: 255
        CreditsGranted: 0
        Flags: 0x0000
            .... .... .... ...0 = ResponseRequested: False
        RemainingLength: 0
        DataOffset: 24
        DataLength: 232
SMB2 (Server Message Block Protocol version 2)
    SMB2 Header
        ProtocolId: 0xfe534d42
        Header Length: 64
        Credit Charge: 0
        Channel Sequence: 0
        Reserved: 0000
        Command: Negotiate Protocol (0)
        Credits requested: 10
        Flags: 0x00000000
        Chain Offset: 0x00000000
        Message ID: 0
        Process Id: 0x000013c5
        Tree Id: 0x00000000
        Session Id: 0x0000000000000000
        Signature: 00000000000000000000000000000000
        [Response in: 13]
    Negotiate Protocol Request (0x00)
        [Preauth Hash: 81cd52dea94ed363a171b7effe222c0003574f5c54f6c7a1cbb041676ea9ddf15245b2a4…]
        StructureSize: 0x0024
        Dialect count: 4
        Security mode: 0x01, Signing enabled
        Reserved: 0000
        Capabilities: 0x00000077, DFS, LEASING, LARGE MTU, PERSISTENT HANDLES, DIRECTORY LEASING, ENCRYPTION
        Client Guid: c494649a-e636-d94c-a55e-be00d5a02a30
        NegotiateContextOffset: 0x00000070
        NegotiateContextCount: 4
        Reserved: 0000
        Dialect: SMB 2.1 (0x0210)
        Dialect: SMB 3.0 (0x0300)
        Dialect: SMB 3.0.2 (0x0302)
        Dialect: SMB 3.1.1 (0x0311)
        Negotiate Context: SMB2_PREAUTH_INTEGRITY_CAPABILITIES
            Type: SMB2_PREAUTH_INTEGRITY_CAPABILITIES (0x0001)
            DataLength: 38
            Reserved: 00000000
            HashAlgorithmCount: 1
            SaltLength: 32
            HashAlgorithm: SHA-512 (0x0001)
            Salt: 1d6e14b44264b6cc1db622478c3826c4cd09df1dc70abf73f13b9261724d4181
        Negotiate Context: SMB2_ENCRYPTION_CAPABILITIES
            Type: SMB2_ENCRYPTION_CAPABILITIES (0x0002)
            DataLength: 8
            Reserved: 00000000
            CipherCount: 3
            CipherId: AES-128-GCM (0x0002)
            CipherId: AES-256-GCM (0x0004)
            CipherId: AES-128-CCM (0x0001)
        Negotiate Context: SMB2_NETNAME_NEGOTIATE_CONTEXT_ID
            Type: SMB2_NETNAME_NEGOTIATE_CONTEXT_ID (0x0005)
            DataLength: 22
            Reserved: 00000000
            Netname: 192.168.6.1
        Negotiate Context: SMB2_POSIX_EXTENSIONS_CAPABILITIES
            Type: SMB2_POSIX_EXTENSIONS_CAPABILITIES (0x0100)
            DataLength: 16
            Reserved: 00000000
            POSIX Reserved: 93ad25509cb411e7b42383de968bcd7c
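To put the negotiated values above into numbers, the following is a rough sketch of the fragmentation arithmetic for a sender that honours the peer's MaxReceiveSize. The function names are hypothetical and this is not the smbdirect.c implementation. It assumes the payload of each data message starts at the 24-byte DataOffset seen in frame 11, and it counts only the 16384 bytes of file data from the failing write, ignoring the SMB2 header and write request structure that precede them.

```c
#include <assert.h>
#include <stddef.h>

/* Payload bytes that fit in one SMB-Direct data message, given the peer's
 * MaxReceiveSize and the offset at which the payload starts. */
static size_t smbd_payload_per_message(size_t max_receive_size,
				       size_t data_offset)
{
	assert(max_receive_size > data_offset);
	return max_receive_size - data_offset;
}

/* Number of data messages needed to carry total_len bytes, rounded up. */
static size_t smbd_messages_needed(size_t total_len,
				   size_t max_receive_size,
				   size_t data_offset)
{
	size_t payload = smbd_payload_per_message(max_receive_size,
						  data_offset);

	return (total_len + payload - 1) / payload;
}

/* Nonzero if an upper-layer message of total_len bytes is within the
 * peer's MaxFragmentedSize and so may be sent as fragments at all. */
static int smbd_fits_fragmented(size_t total_len, size_t max_fragmented_size)
{
	return total_len <= max_fragmented_size;
}
```

With MaxReceiveSize 1364 and a 24-byte offset, each message carries at most 1340 payload bytes, so `smbd_messages_needed(16384, 1364, 24)` gives 13, and 16384 bytes is well inside the server's MaxFragmentedSize of 173910. Under these assumptions a single Send spanning the whole write cannot fit one 1364-byte receive buffer, which is consistent with the overflow suspected in the quoted text.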
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 90789aaa6567..8c816b25ce7c 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
 	destroy_workqueue(info->workqueue);
 	log_rdma_event(INFO, "rdma session destroyed\n");
 	kfree(info);
+	server->smbd_conn = NULL;
 }
 
 /*
Hi Steve,

That attached patch stops the kernel from oopsing, but it still tries
endlessly to send with softRoCE. I'm having better luck with softIWarp - with
some other patches, I can run generic/001 to completion with that transport.

David
---
commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
Author: David Howells <dhowells@redhat.com>
Date: Wed Jan 25 12:42:07 2023 +0000

    cifs: Fix oops due to uncleared server->smbd_conn in reconnect

    In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
    smbd_connection struct that it points to so that reconnection doesn't get
    confused.

    Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Long Li <longli@microsoft.com>
    cc: Steve French <smfrench@gmail.com>
    cc: Pavel Shilovsky <pshilov@microsoft.com>
    cc: Ronnie Sahlberg <lsahlber@redhat.com>
    cc: linux-cifs@vger.kernel.org
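The pattern the commit message describes, clearing the owner's pointer once the object it refers to has been freed, can be sketched in userspace C as follows. The `toy_conn` and `toy_server` structures and both functions are hypothetical stand-ins, not the kernel's `smbd_connection` or `TCP_Server_Info`, and `free()` stands in for `kfree()`. The sketch assumes, for illustration only, that reconnect decides what to do by testing the pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; only the pointer
 * relationship between owner and connection matters here. */
struct toy_conn { int transport_status; };
struct toy_server { struct toy_conn *conn; };

/* Tear down the connection and clear the owner's pointer so that later
 * code can test it safely. */
static void toy_destroy(struct toy_server *server)
{
	struct toy_conn *info;

	assert(server != NULL);
	info = server->conn;
	if (!info)
		return;		/* already destroyed; repeat call is harmless */
	free(info);
	server->conn = NULL;	/* the step the one-line patch adds */
}

/* Reconnect allocates a fresh connection only when none is attached.
 * Returns 1 if a new connection was created, 0 if one was already
 * present, -1 on allocation failure. */
static int toy_reconnect(struct toy_server *server)
{
	assert(server != NULL);
	if (server->conn)
		return 0;	/* would be a stale pointer without the clear */
	server->conn = calloc(1, sizeof(*server->conn));
	return server->conn ? 1 : -1;
}
```

Without the assignment to NULL, `toy_reconnect()` would see a non-NULL pointer to freed memory and skip allocation, and a second `toy_destroy()` would free the same object twice. That is the kind of confusion the patch is meant to prevent, though the exact failure path in the kernel is not shown in this thread.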