Message ID | 20230310153211.10982-8-sprasad@microsoft.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/11] cifs: fix tcon status change after tree connect | expand |
Hi Shyam, everyone, On 10. 03. 23 16:32, Shyam Prasad N wrote: > + if (!ses->iface_count) { > + spin_unlock(&ses->iface_lock); > + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname); > + return 0; > + } May I ask why this is now being logged, and what can be tweaked in the case that a server does not advertise interfaces? After updating the kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of these messages appearing, yet so far only from a single attentive user. I wonder if we are going to see them more often, and it that case a suggestion to users would come handy. If there's not a simple resolution, could the verbosity be lowered to FYI, maybe? Thanks, Jan [1] https://github.com/home-assistant/operating-system/issues/3201
On Tue, Feb 27, 2024 at 4:46 PM Jan Čermák <sairon@sairon.cz> wrote: > > Hi Shyam, everyone, > > On 10. 03. 23 16:32, Shyam Prasad N wrote: > > > + if (!ses->iface_count) { > > + spin_unlock(&ses->iface_lock); > > + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname); > > + return 0; > > + } > > May I ask why this is now being logged, and what can be tweaked in the > case that a server does not advertise interfaces? After updating the > kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of > these messages appearing, yet so far only from a single attentive user. > I wonder if we are going to see them more often, and it that case a > suggestion to users would come handy. If there's not a simple > resolution, could the verbosity be lowered to FYI, maybe? > > Thanks, > Jan > > > [1] https://github.com/home-assistant/operating-system/issues/3201 Hi Jan, These messages (in theory) should not show up if either multichannel or max_channels are not specified mount options. And if multichannel is enabled, then the server should support the ioctl to return the server interface list. These messages are meant to help the user/developer understand why multiple channels are not getting established to the server, even after specifying multichannel as a mount option. The repeating nature of these messages here leads me to also believe that there's something fishy going on here. Either the network health is not good, or that there's some bug at play here.
Hi Shyam, On 27. 02. 24 17:17, Shyam Prasad N wrote: > These messages (in theory) should not show up if either multichannel > or max_channels are not specified mount options. That shouldn't be the case here, I checked with the user and he's not doing anything fishy himself (like interfering with the standard mount utilities), and the userspace tools creating the mounts should not be setting any of these options, which I confirmed by asking for his mounts list: //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) Or am I missing anything here? > The repeating nature of these messages here leads me to also believe > that there's something fishy going on here. > Either the network health is not good, or that there's some bug at play here. Maybe, however I'm not able to reproduce the above behavior yet. But there's been so far one more report of this happening, so it's not a single isolated case. I appreciate any advice what to look at further. Cheers, Jan
On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote: > > Hi Shyam, > Hi Jan, Apologies for the delay. > On 27. 02. 24 17:17, Shyam Prasad N wrote: > > These messages (in theory) should not show up if either multichannel > > or max_channels are not specified mount options. > > That shouldn't be the case here, I checked with the user and he's not > doing anything fishy himself (like interfering with the standard mount > utilities), and the userspace tools creating the mounts should not be > setting any of these options, which I confirmed by asking for his mounts > list: > > //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs > (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) > //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs > (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) Hmmm.. That seems like a bug. Is there any chance that the user is willing to try out if the same bug reproduces with the latest mainline kernel? The other option is for us to try with the 6.6 kernel. But without the steps to repro, it'll just be shots in the dark. Let me try to go through the code and see if I can spot anything here. > > > Or am I missing anything here? > > > The repeating nature of these messages here leads me to also believe > > that there's something fishy going on here. > > Either the network health is not good, or that there's some bug at play here. > > Maybe, however I'm not able to reproduce the above behavior yet. But > there's been so far one more report of this happening, so it's not a > single isolated case. I appreciate any advice what to look at further. If a user has reproduced this issue, the one thing they can send us is the ftrace output of cifs events when the issue is being seen. i.e. something like this: # trace-cmd start -e cifs # <now wait for the issue to reproduce> # trace-cmd stop # trace-cmd extract > /tmp/outputs.txt # uname -r >> /tmp/outputs,txt # cat /proc/fs/cifs/DebugData >> /tmp/outputs,txt # cat /proc/fs/cifs/Stats >> /tmp/outputs,txt And then provide the outputs.txt file to us. Going through that capture can help us understand this better. > > Cheers, > Jan
Shyam Prasad N <nspmangalore@gmail.com> writes: > On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote: >> >> Hi Shyam, >> > > Hi Jan, > Apologies for the delay. > >> On 27. 02. 24 17:17, Shyam Prasad N wrote: >> > These messages (in theory) should not show up if either multichannel >> > or max_channels are not specified mount options. >> >> That shouldn't be the case here, I checked with the user and he's not >> doing anything fishy himself (like interfering with the standard mount >> utilities), and the userspace tools creating the mounts should not be >> setting any of these options, which I confirmed by asking for his mounts >> list: >> >> //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs >> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) >> //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs >> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1) > > Hmmm.. That seems like a bug. Yes. I see a couple of issues here: (1) cifs_chan_update_iface() seems to be called over reconnect for all dialect versions and servers that do not support FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in some cases and then VFS message will start being flooded on dmesg. (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1 mounts, so smb2_reconnect() ends up being called and then call SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed with CAP_LARGE_FILES.
Hi Paulo, Shyam, On 06. 03. 24 16:43, Paulo Alcantara wrote: > > Yes. I see a couple of issues here: > > (1) cifs_chan_update_iface() seems to be called over reconnect for all > dialect versions and servers that do not support > FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in > some cases and then VFS message will start being flooded on dmesg. > > (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1 > mounts, so smb2_reconnect() ends up being called and then call > SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed > with CAP_LARGE_FILES. thanks for looking into this! Is there anything else you'll need from me or to test by the users who are observing the issue, or do you have enough information for a fix? Regards, Jan
On Mon, Mar 11, 2024 at 3:31 PM Jan Čermák <sairon@sairon.cz> wrote: > > Hi Paulo, Shyam, > > On 06. 03. 24 16:43, Paulo Alcantara wrote: > > > > Yes. I see a couple of issues here: > > Thanks for the review, Paulo. > > (1) cifs_chan_update_iface() seems to be called over reconnect for all > > dialect versions and servers that do not support > > FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in > > some cases and then VFS message will start being flooded on dmesg. Valid point. > > > > (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1 > > mounts, so smb2_reconnect() ends up being called and then call > > SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed > > with CAP_LARGE_FILES. Excellent catch. I did not take this into account. It is very likely that this is going on here. Let me submit a patch to check the exact dialect before calling these functions to make sure we only do it for SMB3+. > > thanks for looking into this! Is there anything else you'll need from me > or to test by the users who are observing the issue, or do you have > enough information for a fix? > > Regards, > Jan Thanks for bringing this to our notice, Jan.
On 11. 03. 24 12:14, Shyam Prasad N wrote: > Let me submit a patch to check the exact dialect before calling these > functions to make sure we only do it for SMB3+. > FWIW, most of the reports (except the first one with macOS) are users running their SMB server on QNAP NAS's (more details are in the GH ticket I linked earlier). Please CC me in the e-mail with the patch as well, I can create a build with that patch applied to 6.6.y and ask if someone could try if it resolves the issue in their environment. Cheers, Jan
On Tue, Mar 12, 2024 at 7:50 PM Jan Čermák <sairon@sairon.cz> wrote: > > On 11. 03. 24 12:14, Shyam Prasad N wrote: > > Let me submit a patch to check the exact dialect before calling these > > functions to make sure we only do it for SMB3+. > > > > FWIW, most of the reports (except the first one with macOS) are users > running their SMB server on QNAP NAS's (more details are in the GH > ticket I linked earlier). > > Please CC me in the e-mail with the patch as well, I can create a build > with that patch applied to 6.6.y and ask if someone could try if it > resolves the issue in their environment. > > Cheers, > Jan Just sent two patches for this. One patch is just to change the log level for this log from VFS -> FYI. The other one is for the suggestions made by Paulo. #1 will fix this anyway. I'm curious to know if #2 alone would fix this problem. Also, please ask for the output of the following command while testing this out: # cat /proc/fs/cifs/DebugData
On 13. 03. 24 11:45, Shyam Prasad N wrote: > Just sent two patches for this. > One patch is just to change the log level for this log from VFS -> FYI. > The other one is for the suggestions made by Paulo. > #1 will fix this anyway. I'm curious to know if #2 alone would fix this problem. > Also, please ask for the output of the following command while testing this out: > # cat /proc/fs/cifs/DebugData One of the users tried testing on the same system with only the second patch applied and reports the issue still persists :( He supplied the DebugData output only as a screenshot [1], not sure what to look there for, but definitely the "Server interfaces" section is missing there compared to my setup. Cheers, Jan [1] https://github.com/home-assistant/operating-system/issues/3201#issuecomment-2019357433
diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c index 4391c7aac3cb..cee3af02e2c3 100644 --- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -219,6 +219,8 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v) struct cifs_ses *ses; struct cifs_tcon *tcon; struct cifs_server_iface *iface; + size_t iface_weight = 0, iface_min_speed = 0; + struct cifs_server_iface *last_iface = NULL; int c, i, j; seq_puts(m, @@ -465,11 +467,25 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v) "\tLast updated: %lu seconds ago", ses->iface_count, (jiffies - ses->iface_last_update) / HZ); + + last_iface = list_last_entry(&ses->iface_list, + struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + j = 0; list_for_each_entry(iface, &ses->iface_list, iface_head) { seq_printf(m, "\n\t%d)", ++j); cifs_dump_iface(m, iface); + + iface_weight = iface->speed / iface_min_speed; + seq_printf(m, "\t\tWeight (cur,total): (%zu,%zu)" + "\n\t\tAllocated channels: %u\n", + iface->weight_fulfilled, + iface_weight, + iface->num_channels); + if (is_ses_using_iface(ses, iface)) seq_puts(m, "\t\t[CONNECTED]\n"); } diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index a11e7b10f607..e3ba5c979832 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -948,6 +948,8 @@ struct cifs_server_iface { struct list_head iface_head; struct kref refcount; size_t speed; + size_t weight_fulfilled; + unsigned int num_channels; unsigned int rdma_capable : 1; unsigned int rss_capable : 1; unsigned int is_active : 1; /* unset if non existent */ diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c index b8bfebe4498e..78a7cfa75e91 100644 --- a/fs/cifs/sess.c +++ b/fs/cifs/sess.c @@ -167,7 +167,9 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) int left; int rc = 0; int tries = 0; + size_t iface_weight = 0, iface_min_speed = 0; struct cifs_server_iface *iface = NULL, *niface = NULL; + struct cifs_server_iface *last_iface = NULL; spin_lock(&ses->chan_lock); @@ -196,21 +198,11 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) } spin_unlock(&ses->chan_lock); - /* - * Keep connecting to same, fastest, iface for all channels as - * long as its RSS. Try next fastest one if not RSS or channel - * creation fails. - */ - spin_lock(&ses->iface_lock); - iface = list_first_entry(&ses->iface_list, struct cifs_server_iface, - iface_head); - spin_unlock(&ses->iface_lock); - while (left > 0) { tries++; if (tries > 3*ses->chan_max) { - cifs_dbg(FYI, "too many channel open attempts (%d channels left to open)\n", + cifs_dbg(VFS, "too many channel open attempts (%d channels left to open)\n", left); break; } @@ -218,17 +210,34 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) spin_lock(&ses->iface_lock); if (!ses->iface_count) { spin_unlock(&ses->iface_lock); + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname); break; } + if (!iface) + iface = list_first_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + list_for_each_entry_safe_from(iface, niface, &ses->iface_list, iface_head) { + /* do not mix rdma and non-rdma interfaces */ + if (iface->rdma_capable != ses->server->rdma) + continue; + /* skip ifaces that are unusable */ if (!iface->is_active || (is_ses_using_iface(ses, iface) && - !iface->rss_capable)) { + !iface->rss_capable)) + continue; + + /* check if we already allocated enough channels */ + iface_weight = iface->speed / iface_min_speed; + + if (iface->weight_fulfilled >= iface_weight) continue; - } /* take ref before unlock */ kref_get(&iface->refcount); @@ -245,10 +254,17 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) continue; } - cifs_dbg(FYI, "successfully opened new channel on iface:%pIS\n", + iface->num_channels++; + iface->weight_fulfilled++; + cifs_dbg(VFS, "successfully opened new channel on iface:%pIS\n", &iface->sockaddr); break; } + + /* reached end of list. reset weight_fulfilled */ + if (list_entry_is_head(iface, &ses->iface_list, iface_head)) + list_for_each_entry(iface, &ses->iface_list, iface_head) + iface->weight_fulfilled = 0; spin_unlock(&ses->iface_lock); left--; @@ -267,8 +283,10 @@ int cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index; + size_t iface_weight = 0, iface_min_speed = 0; struct cifs_server_iface *iface = NULL; struct cifs_server_iface *old_iface = NULL; + struct cifs_server_iface *last_iface = NULL; int rc = 0; spin_lock(&ses->chan_lock); @@ -288,13 +306,34 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) spin_unlock(&ses->chan_lock); spin_lock(&ses->iface_lock); + if (!ses->iface_count) { + spin_unlock(&ses->iface_lock); + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname); + return 0; + } + + last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + /* then look for a new one */ list_for_each_entry(iface, &ses->iface_list, iface_head) { + /* do not mix rdma and non-rdma interfaces */ + if (iface->rdma_capable != server->rdma) + continue; + if (!iface->is_active || (is_ses_using_iface(ses, iface) && !iface->rss_capable)) { continue; } + + /* check if we already allocated enough channels */ + iface_weight = iface->speed / iface_min_speed; + + if (iface->weight_fulfilled >= iface_weight) + continue; + kref_get(&iface->refcount); break; }
Today, if the server interfaces RSS capable, we simply choose the fastest interface to setup a channel. This is not a scalable approach, and does not make a lot of attempt to distribute the connections. This change does a weighted distribution of channels across all the available server interfaces, where the weight is a function of the advertised interface speed. Also make sure that we don't mix rdma and non-rdma for channels. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> --- fs/cifs/cifs_debug.c | 16 +++++++++++ fs/cifs/cifsglob.h | 2 ++ fs/cifs/sess.c | 67 +++++++++++++++++++++++++++++++++++--------- 3 files changed, 71 insertions(+), 14 deletions(-)