Message ID | CAH2r5mvxp8OZthKPQGCv82xEkNW+z7SN_QhdRUMnHJ2Fm4pJqA@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | cifs: missing null pointer check in cifs_mount | expand |
Steve French <smfrench@gmail.com> writes: > We weren't checking if tcon is null before setting dfs path, > although we check for null tcon in an earlier assignment statement. If tcon is NULL there is no point in continuing in that function, we should have exited earlier. If tcon is NULL it means mount_get_conns() failed so presumably rc will be != 0 and we would goto error. I don't think this is needed. We could change the existing check after the loop to this you really want to be safe: if (rc || !tcon) goto error; Cheers,
Agreed. On June 23, 2021 8:48:24 AM GMT-03:00, "Aurélien Aptel" <aaptel@suse.com> wrote: >Steve French <smfrench@gmail.com> writes: >> We weren't checking if tcon is null before setting dfs path, >> although we check for null tcon in an earlier assignment statement. > >If tcon is NULL there is no point in continuing in that function, we >should have exited earlier. > >If tcon is NULL it means mount_get_conns() failed so presumably rc will >be != 0 and we would goto error. > >I don't think this is needed. We could change the existing check after >the loop to this you really want to be safe: > > if (rc || !tcon) > goto error; > > >Cheers,
updated patch attached with Aurelien's suggestion. On Wed, Jun 23, 2021 at 7:17 AM Paulo Alcantara <pc@cjr.nz> wrote: > > Agreed. > > On June 23, 2021 8:48:24 AM GMT-03:00, "Aurélien Aptel" <aaptel@suse.com> wrote: > >Steve French <smfrench@gmail.com> writes: > >> We weren't checking if tcon is null before setting dfs path, > >> although we check for null tcon in an earlier assignment statement. > > > >If tcon is NULL there is no point in continuing in that function, we > >should have exited earlier. > > > >If tcon is NULL it means mount_get_conns() failed so presumably rc will > >be != 0 and we would goto error. > > > >I don't think this is needed. We could change the existing check after > >the loop to this you really want to be safe: > > > > if (rc || !tcon) > > goto error; > > > > > >Cheers,
On Wed, 2021-06-23 at 19:34 -0500, Steve French wrote: > updated patch attached with Aurelien's suggestion. > > On Wed, Jun 23, 2021 at 7:17 AM Paulo Alcantara <pc@cjr.nz> wrote: > > > > Agreed. > > > > On June 23, 2021 8:48:24 AM GMT-03:00, "Aurélien Aptel" <aaptel@suse.com> wrote: > > > Steve French <smfrench@gmail.com> writes: > > > > We weren't checking if tcon is null before setting dfs path, > > > > although we check for null tcon in an earlier assignment statement. > > > > > > If tcon is NULL there is no point in continuing in that function, we > > > should have exited earlier. > > > > > > If tcon is NULL it means mount_get_conns() failed so presumably rc will > > > be != 0 and we would goto error. > > > > > > I don't think this is needed. We could change the existing check after > > > the loop to this you really want to be safe: > > > > > > if (rc || !tcon) > > > goto error; > > > > > > > > > Cheers, > > > I know this patch is ancient and the mainline code has marched on, but it seems really suspicious to me. With this, we have cifs_mount returning 0, even though the superblock hasn't been properly initialized. Is that expected? Shouldn't it return an error in that case? The mount handling has morphed considerably since this patch went in, so I can't really tell whether this was later fixed or not.
Jeff Layton <jlayton@kernel.org> writes: > On Wed, 2021-06-23 at 19:34 -0500, Steve French wrote: >> updated patch attached with Aurelien's suggestion. >> >> On Wed, Jun 23, 2021 at 7:17 AM Paulo Alcantara <pc@cjr.nz> wrote: >> > >> > Agreed. >> > >> > On June 23, 2021 8:48:24 AM GMT-03:00, "Aurélien Aptel" <aaptel@suse.com> wrote: >> > > Steve French <smfrench@gmail.com> writes: >> > > > We weren't checking if tcon is null before setting dfs path, >> > > > although we check for null tcon in an earlier assignment statement. >> > > >> > > If tcon is NULL there is no point in continuing in that function, we >> > > should have exited earlier. >> > > >> > > If tcon is NULL it means mount_get_conns() failed so presumably rc will >> > > be != 0 and we would goto error. >> > > >> > > I don't think this is needed. We could change the existing check after >> > > the loop to this you really want to be safe: >> > > >> > > if (rc || !tcon) >> > > goto error; >> > > >> > > >> > > Cheers, >> >> >> > > I know this patch is ancient and the mainline code has marched on, but > it seems really suspicious to me. Yes, it is. > With this, we have cifs_mount returning 0, even though the superblock > hasn't been properly initialized. Is that expected? Shouldn't it return > an error in that case? No, that isn't expected. And yes, if @tcon would ever be NULL at that point, we should be returning an error instead. Otherwise we'd end up dereferencing a NULL @tcon while trying to get an inode for the root dentry later. However, by quickly looking at the old code -- on top of 162004a2f7ef -- I don't see how we'd end up having a NULL @tcon with rc == 0 as mount_get_conns() would return -errno if it couldn't get a tcon. Please correct me if I'm missing something. Whether it is possibile or not, the NULL @tcon check is certainly missing a 'rc = -ENOENT' or some other error before bailing out as you've pointed out. > The mount handling has morphed considerably since this patch went in, so > I can't really tell whether this was later fixed or not. I don't think there was a follow-up patch for that.
On Fri, 2023-08-11 at 12:15 -0300, Paulo Alcantara wrote: > Jeff Layton <jlayton@kernel.org> writes: > > > On Wed, 2021-06-23 at 19:34 -0500, Steve French wrote: > > > updated patch attached with Aurelien's suggestion. > > > > > > On Wed, Jun 23, 2021 at 7:17 AM Paulo Alcantara <pc@cjr.nz> wrote: > > > > > > > > Agreed. > > > > > > > > On June 23, 2021 8:48:24 AM GMT-03:00, "Aurélien Aptel" <aaptel@suse.com> wrote: > > > > > Steve French <smfrench@gmail.com> writes: > > > > > > We weren't checking if tcon is null before setting dfs path, > > > > > > although we check for null tcon in an earlier assignment statement. > > > > > > > > > > If tcon is NULL there is no point in continuing in that function, we > > > > > should have exited earlier. > > > > > > > > > > If tcon is NULL it means mount_get_conns() failed so presumably rc will > > > > > be != 0 and we would goto error. > > > > > > > > > > I don't think this is needed. We could change the existing check after > > > > > the loop to this you really want to be safe: > > > > > > > > > > if (rc || !tcon) > > > > > goto error; > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > I know this patch is ancient and the mainline code has marched on, but > > it seems really suspicious to me. > > Yes, it is. > > > With this, we have cifs_mount returning 0, even though the superblock > > hasn't been properly initialized. Is that expected? Shouldn't it return > > an error in that case? > > No, that isn't expected. And yes, if @tcon would ever be NULL at that > point, we should be returning an error instead. Otherwise we'd end up > dereferencing a NULL @tcon while trying to get an inode for the root > dentry later. > > However, by quickly looking at the old code -- on top of 162004a2f7ef -- > I don't see how we'd end up having a NULL @tcon with rc == 0 as > mount_get_conns() would return -errno if it couldn't get a tcon. Please > correct me if I'm missing something. Whether it is possibile or not, > the NULL @tcon check is certainly missing a 'rc = -ENOENT' or some other > error before bailing out as you've pointed out. Thanks for the confirmation. There were some oopses on some RHEL8 (5.14 based kernels). The stack looked something like this: PID: 2415716 TASK: ffff937139090000 CPU: 3 COMMAND: "ls" #0 [ffff9ef946b23728] machine_kexec at ffffffffac867cfe #1 [ffff9ef946b23780] __crash_kexec at ffffffffac9ad94d #2 [ffff9ef946b23848] crash_kexec at ffffffffac9ae881 #3 [ffff9ef946b23860] oops_end at ffffffffac8274f1 #4 [ffff9ef946b23880] no_context at ffffffffac879a03 #5 [ffff9ef946b238d8] __bad_area_nosemaphore at ffffffffac879d64 #6 [ffff9ef946b23920] do_page_fault at ffffffffac87a617 #7 [ffff9ef946b23950] page_fault at ffffffffad20111e [exception RIP: cifs_mount+1126] RIP: ffffffffc08e8826 RSP: ffff9ef946b23a00 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff936c8221ea00 RCX: ffff936f8018b320 RDX: 0000000000000001 RSI: ffff936f8018b420 RDI: ffff936c8221ea00 RBP: ffff9ef946b23a90 R8: 5346445756535c5c R9: 6765642e50313031 R10: 65622e666f6f7267 R11: 0063696c6275505c R12: ffff9370b192fc00 R13: ffff936f8018b420 R14: 00000000003097ad R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff9ef946b23a98] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] #9 [ffff9ef946b23aa0] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] #10 [ffff9ef946b23b08] smb3_get_tree at ffffffffc0930ae0 [cifs] #11 [ffff9ef946b23b30] vfs_get_tree at ffffffffacb52365 #12 [ffff9ef946b23b50] fc_mount at ffffffffacb7485e #13 [ffff9ef946b23b60] vfs_kern_mount at ffffffffacb748ec #14 [ffff9ef946b23b80] cifs_dfs_do_automount at ffffffffc093552e [cifs] #15 [ffff9ef946b23bc0] cifs_dfs_d_automount at ffffffffc0935880 [cifs] #16 [ffff9ef946b23bd0] follow_managed at ffffffffacb5bdaf #17 [ffff9ef946b23c10] lookup_fast at ffffffffacb5c7e5 #18 [ffff9ef946b23c68] walk_component at ffffffffacb5d258 #19 [ffff9ef946b23cc8] path_lookupat at ffffffffacb5e215 #20 [ffff9ef946b23d28] filename_lookup at ffffffffacb62710 #21 [ffff9ef946b23e40] vfs_statx at ffffffffacb55874 #22 [ffff9ef946b23e98] __do_sys_statx at ffffffffacb5692b #23 [ffff9ef946b23f38] do_syscall_64 at ffffffffac8043ab #24 [ffff9ef946b23f50] entry_SYSCALL_64_after_hwframe at ffffffffad2000a9 RIP: 00007ff6a2637edf RSP: 00007ffe040017d0 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007ffe04001910 RCX: 00007ff6a2637edf RDX: 0000000000000100 RSI: 00007ffe04001910 RDI: 00000000ffffff9c RBP: 0000000000000100 R8: 00007ffe040017f0 R9: 00000000ffffff9c R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffe040017f0 R13: 0000000000000000 R14: 0000000000000003 R15: 000055ce1a3ae1b8 ORIG_RAX: 000000000000014c CS: 0033 SS: 002b Analysis of the vmcore by Roberto showed that we had ended up past that point with tcon==NULL and rc==0. Steve's patch would have fixed the panic there, but I think the host would have ended up with a successful mount, but with a broken superblock. The current code seems a bit less fragile, and I didn't see any similar brokenness there, but I didn't look too hard either. In any case, we'll plan to fix this up with a one-off in RHEL/Centos. Thanks again for the sanity check!
Jeff Layton <jlayton@kernel.org> writes: > Thanks for the confirmation. There were some oopses on some RHEL8 (5.14 > based kernels). The stack looked something like this: > > PID: 2415716 TASK: ffff937139090000 CPU: 3 COMMAND: "ls" > #0 [ffff9ef946b23728] machine_kexec at ffffffffac867cfe > #1 [ffff9ef946b23780] __crash_kexec at ffffffffac9ad94d > #2 [ffff9ef946b23848] crash_kexec at ffffffffac9ae881 > #3 [ffff9ef946b23860] oops_end at ffffffffac8274f1 > #4 [ffff9ef946b23880] no_context at ffffffffac879a03 > #5 [ffff9ef946b238d8] __bad_area_nosemaphore at ffffffffac879d64 > #6 [ffff9ef946b23920] do_page_fault at ffffffffac87a617 > #7 [ffff9ef946b23950] page_fault at ffffffffad20111e > [exception RIP: cifs_mount+1126] > RIP: ffffffffc08e8826 RSP: ffff9ef946b23a00 RFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffff936c8221ea00 RCX: ffff936f8018b320 > RDX: 0000000000000001 RSI: ffff936f8018b420 RDI: ffff936c8221ea00 > RBP: ffff9ef946b23a90 R8: 5346445756535c5c R9: 6765642e50313031 > R10: 65622e666f6f7267 R11: 0063696c6275505c R12: ffff9370b192fc00 > R13: ffff936f8018b420 R14: 00000000003097ad R15: 0000000000000000 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > #8 [ffff9ef946b23a98] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > #9 [ffff9ef946b23aa0] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > #10 [ffff9ef946b23b08] smb3_get_tree at ffffffffc0930ae0 [cifs] > #11 [ffff9ef946b23b30] vfs_get_tree at ffffffffacb52365 > #12 [ffff9ef946b23b50] fc_mount at ffffffffacb7485e > #13 [ffff9ef946b23b60] vfs_kern_mount at ffffffffacb748ec > #14 [ffff9ef946b23b80] cifs_dfs_do_automount at ffffffffc093552e [cifs] > #15 [ffff9ef946b23bc0] cifs_dfs_d_automount at ffffffffc0935880 [cifs] > #16 [ffff9ef946b23bd0] follow_managed at ffffffffacb5bdaf > #17 [ffff9ef946b23c10] lookup_fast at ffffffffacb5c7e5 > #18 [ffff9ef946b23c68] walk_component at ffffffffacb5d258 > #19 [ffff9ef946b23cc8] path_lookupat at ffffffffacb5e215 > #20 [ffff9ef946b23d28] filename_lookup at ffffffffacb62710 > #21 [ffff9ef946b23e40] vfs_statx at ffffffffacb55874 > #22 [ffff9ef946b23e98] __do_sys_statx at ffffffffacb5692b > #23 [ffff9ef946b23f38] do_syscall_64 at ffffffffac8043ab > #24 [ffff9ef946b23f50] entry_SYSCALL_64_after_hwframe at > ffffffffad2000a9 > RIP: 00007ff6a2637edf RSP: 00007ffe040017d0 RFLAGS: 00000246 > RAX: ffffffffffffffda RBX: 00007ffe04001910 RCX: 00007ff6a2637edf > RDX: 0000000000000100 RSI: 00007ffe04001910 RDI: 00000000ffffff9c > RBP: 0000000000000100 R8: 00007ffe040017f0 R9: 00000000ffffff9c > R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffe040017f0 > R13: 0000000000000000 R14: 0000000000000003 R15: 000055ce1a3ae1b8 > ORIG_RAX: 000000000000014c CS: 0033 SS: 002b > > Analysis of the vmcore by Roberto showed that we had ended up past that > point with tcon==NULL and rc==0. Interesting. Thanks for sharing the backtrace! So it actually ended up with NULL @tcon and rc == 0 while mounting a DFS link. Nice catch! > Steve's patch would have fixed the panic there, but I think the host > would have ended up with a successful mount, but with a broken > superblock. The current code seems a bit less fragile, and I didn't see > any similar brokenness there, but I didn't look too hard either. Yeah, makes sense. > In any case, we'll plan to fix this up with a one-off in RHEL/Centos. > Thanks again for the sanity check! Would you mind to propose a patch that fixes the above and mark it for v5.14..v5.15?
On Fri, 2023-08-11 at 13:49 -0300, Paulo Alcantara wrote: > Jeff Layton <jlayton@kernel.org> writes: > > > Thanks for the confirmation. There were some oopses on some RHEL8 (5.14 > > based kernels). The stack looked something like this: > > > > PID: 2415716 TASK: ffff937139090000 CPU: 3 COMMAND: "ls" > > #0 [ffff9ef946b23728] machine_kexec at ffffffffac867cfe > > #1 [ffff9ef946b23780] __crash_kexec at ffffffffac9ad94d > > #2 [ffff9ef946b23848] crash_kexec at ffffffffac9ae881 > > #3 [ffff9ef946b23860] oops_end at ffffffffac8274f1 > > #4 [ffff9ef946b23880] no_context at ffffffffac879a03 > > #5 [ffff9ef946b238d8] __bad_area_nosemaphore at ffffffffac879d64 > > #6 [ffff9ef946b23920] do_page_fault at ffffffffac87a617 > > #7 [ffff9ef946b23950] page_fault at ffffffffad20111e > > [exception RIP: cifs_mount+1126] > > RIP: ffffffffc08e8826 RSP: ffff9ef946b23a00 RFLAGS: 00010246 > > RAX: 0000000000000000 RBX: ffff936c8221ea00 RCX: ffff936f8018b320 > > RDX: 0000000000000001 RSI: ffff936f8018b420 RDI: ffff936c8221ea00 > > RBP: ffff9ef946b23a90 R8: 5346445756535c5c R9: 6765642e50313031 > > R10: 65622e666f6f7267 R11: 0063696c6275505c R12: ffff9370b192fc00 > > R13: ffff936f8018b420 R14: 00000000003097ad R15: 0000000000000000 > > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > > #8 [ffff9ef946b23a98] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > > #9 [ffff9ef946b23aa0] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > > #10 [ffff9ef946b23b08] smb3_get_tree at ffffffffc0930ae0 [cifs] > > #11 [ffff9ef946b23b30] vfs_get_tree at ffffffffacb52365 > > #12 [ffff9ef946b23b50] fc_mount at ffffffffacb7485e > > #13 [ffff9ef946b23b60] vfs_kern_mount at ffffffffacb748ec > > #14 [ffff9ef946b23b80] cifs_dfs_do_automount at ffffffffc093552e [cifs] > > #15 [ffff9ef946b23bc0] cifs_dfs_d_automount at ffffffffc0935880 [cifs] > > #16 [ffff9ef946b23bd0] follow_managed at ffffffffacb5bdaf > > #17 [ffff9ef946b23c10] lookup_fast at ffffffffacb5c7e5 > > #18 [ffff9ef946b23c68] walk_component at ffffffffacb5d258 > > #19 [ffff9ef946b23cc8] path_lookupat at ffffffffacb5e215 > > #20 [ffff9ef946b23d28] filename_lookup at ffffffffacb62710 > > #21 [ffff9ef946b23e40] vfs_statx at ffffffffacb55874 > > #22 [ffff9ef946b23e98] __do_sys_statx at ffffffffacb5692b > > #23 [ffff9ef946b23f38] do_syscall_64 at ffffffffac8043ab > > #24 [ffff9ef946b23f50] entry_SYSCALL_64_after_hwframe at > > ffffffffad2000a9 > > RIP: 00007ff6a2637edf RSP: 00007ffe040017d0 RFLAGS: 00000246 > > RAX: ffffffffffffffda RBX: 00007ffe04001910 RCX: 00007ff6a2637edf > > RDX: 0000000000000100 RSI: 00007ffe04001910 RDI: 00000000ffffff9c > > RBP: 0000000000000100 R8: 00007ffe040017f0 R9: 00000000ffffff9c > > R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffe040017f0 > > R13: 0000000000000000 R14: 0000000000000003 R15: 000055ce1a3ae1b8 > > ORIG_RAX: 000000000000014c CS: 0033 SS: 002b > > > > Analysis of the vmcore by Roberto showed that we had ended up past that > > point with tcon==NULL and rc==0. > > Interesting. Thanks for sharing the backtrace! > > So it actually ended up with NULL @tcon and rc == 0 while mounting a DFS > link. Nice catch! > > > Steve's patch would have fixed the panic there, but I think the host > > would have ended up with a successful mount, but with a broken > > superblock. The current code seems a bit less fragile, and I didn't see > > any similar brokenness there, but I didn't look too hard either. > > Yeah, makes sense. > > > In any case, we'll plan to fix this up with a one-off in RHEL/Centos. > > Thanks again for the sanity check! > > Would you mind to propose a patch that fixes the above and mark it for > v5.14..v5.15? Sounds good. One of us will make sure that happens too. Thanks!
Jeff Layton <jlayton@kernel.org> writes: > On Fri, 2023-08-11 at 13:49 -0300, Paulo Alcantara wrote: >> > In any case, we'll plan to fix this up with a one-off in RHEL/Centos. >> > Thanks again for the sanity check! >> >> Would you mind to propose a patch that fixes the above and mark it for >> v5.14..v5.15? > > Sounds good. One of us will make sure that happens too. Thanks!
On Fri, 2023-08-11 at 12:58 -0400, Jeff Layton wrote: > On Fri, 2023-08-11 at 13:49 -0300, Paulo Alcantara wrote: > > Jeff Layton <jlayton@kernel.org> writes: > > > > > Thanks for the confirmation. There were some oopses on some RHEL8 (5.14 > > > based kernels). The stack looked something like this: > > > > > > PID: 2415716 TASK: ffff937139090000 CPU: 3 COMMAND: "ls" > > > #0 [ffff9ef946b23728] machine_kexec at ffffffffac867cfe > > > #1 [ffff9ef946b23780] __crash_kexec at ffffffffac9ad94d > > > #2 [ffff9ef946b23848] crash_kexec at ffffffffac9ae881 > > > #3 [ffff9ef946b23860] oops_end at ffffffffac8274f1 > > > #4 [ffff9ef946b23880] no_context at ffffffffac879a03 > > > #5 [ffff9ef946b238d8] __bad_area_nosemaphore at ffffffffac879d64 > > > #6 [ffff9ef946b23920] do_page_fault at ffffffffac87a617 > > > #7 [ffff9ef946b23950] page_fault at ffffffffad20111e > > > [exception RIP: cifs_mount+1126] > > > RIP: ffffffffc08e8826 RSP: ffff9ef946b23a00 RFLAGS: 00010246 > > > RAX: 0000000000000000 RBX: ffff936c8221ea00 RCX: ffff936f8018b320 > > > RDX: 0000000000000001 RSI: ffff936f8018b420 RDI: ffff936c8221ea00 > > > RBP: ffff9ef946b23a90 R8: 5346445756535c5c R9: 6765642e50313031 > > > R10: 65622e666f6f7267 R11: 0063696c6275505c R12: ffff9370b192fc00 > > > R13: ffff936f8018b420 R14: 00000000003097ad R15: 0000000000000000 > > > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > > > #8 [ffff9ef946b23a98] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > > > #9 [ffff9ef946b23aa0] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > > > #10 [ffff9ef946b23b08] smb3_get_tree at ffffffffc0930ae0 [cifs] > > > #11 [ffff9ef946b23b30] vfs_get_tree at ffffffffacb52365 > > > #12 [ffff9ef946b23b50] fc_mount at ffffffffacb7485e > > > #13 [ffff9ef946b23b60] vfs_kern_mount at ffffffffacb748ec > > > #14 [ffff9ef946b23b80] cifs_dfs_do_automount at ffffffffc093552e [cifs] > > > #15 [ffff9ef946b23bc0] cifs_dfs_d_automount at ffffffffc0935880 [cifs] > > > #16 [ffff9ef946b23bd0] follow_managed at ffffffffacb5bdaf > > > #17 [ffff9ef946b23c10] lookup_fast at ffffffffacb5c7e5 > > > #18 [ffff9ef946b23c68] walk_component at ffffffffacb5d258 > > > #19 [ffff9ef946b23cc8] path_lookupat at ffffffffacb5e215 > > > #20 [ffff9ef946b23d28] filename_lookup at ffffffffacb62710 > > > #21 [ffff9ef946b23e40] vfs_statx at ffffffffacb55874 > > > #22 [ffff9ef946b23e98] __do_sys_statx at ffffffffacb5692b > > > #23 [ffff9ef946b23f38] do_syscall_64 at ffffffffac8043ab > > > #24 [ffff9ef946b23f50] entry_SYSCALL_64_after_hwframe at > > > ffffffffad2000a9 > > > RIP: 00007ff6a2637edf RSP: 00007ffe040017d0 RFLAGS: 00000246 > > > RAX: ffffffffffffffda RBX: 00007ffe04001910 RCX: 00007ff6a2637edf > > > RDX: 0000000000000100 RSI: 00007ffe04001910 RDI: 00000000ffffff9c > > > RBP: 0000000000000100 R8: 00007ffe040017f0 R9: 00000000ffffff9c > > > R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffe040017f0 > > > R13: 0000000000000000 R14: 0000000000000003 R15: 000055ce1a3ae1b8 > > > ORIG_RAX: 000000000000014c CS: 0033 SS: 002b > > > > > > Analysis of the vmcore by Roberto showed that we had ended up past that > > > point with tcon==NULL and rc==0. > > > > Interesting. Thanks for sharing the backtrace! > > > > So it actually ended up with NULL @tcon and rc == 0 while mounting a DFS > > link. Nice catch! > > > > > Steve's patch would have fixed the panic there, but I think the host > > > would have ended up with a successful mount, but with a broken > > > superblock. The current code seems a bit less fragile, and I didn't see > > > any similar brokenness there, but I didn't look too hard either. > > > > Yeah, makes sense. > > > > > In any case, we'll plan to fix this up with a one-off in RHEL/Centos. > > > Thanks again for the sanity check! > > > > Would you mind to propose a patch that fixes the above and mark it for > > v5.14..v5.15? > > Sounds good. One of us will make sure that happens too. > FWIW, I took a look at v5.15.125 and I don't see the same bug there. It probably got fixed inadvertently with some other backporting. Looks like this is only a problem for older, non-stable-series kernels. The patch I created for RHEL8 is attached though, if you're interested.
Jeff Layton <jlayton@kernel.org> writes: > FWIW, I took a look at v5.15.125 and I don't see the same bug there. It > probably got fixed inadvertently with some other backporting. Looks like > this is only a problem for older, non-stable-series kernels. Thanks for looking into that! Really appreciate it. > The patch I created for RHEL8 is attached though, if you're > interested. LGTM.
From 632096b66b2fa2621e3d2d02c2c2fd436975810b Mon Sep 17 00:00:00 2001 From: Steve French <stfrench@microsoft.com> Date: Tue, 22 Jun 2021 20:13:44 -0500 Subject: [PATCH] cifs: missing null pointer check in cifs_mount We weren't checking if tcon is null before setting dfs path, although we check for null tcon in an earlier assignment statement. Addresses-Coverity: 1476411 ("Dereference after null check") Signed-off-by: Steve French <stfrench@microsoft.com> --- fs/cifs/connect.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 8d95607a9312..196ef9ec69db 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -3480,7 +3480,8 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb3_fs_context *ctx) goto error; } spin_lock(&cifs_tcp_ses_lock); - tcon->dfs_path = ref_path; + if (tcon) + tcon->dfs_path = ref_path; ref_path = NULL; spin_unlock(&cifs_tcp_ses_lock); -- 2.30.2
We weren't checking if tcon is null before setting dfs path, although we check for null tcon in an earlier assignment statement. Addresses-Coverity: 1476411 ("Dereference after null check") Signed-off-by: Steve French <stfrench@microsoft.com> --- fs/cifs/connect.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)