Message ID | 20210825111656.1635954-2-lsahlber@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | cifs: Do not leak EDEADLK to dgetents64 for STATUS_USER_SESSION_DELETED |
lightly updated to add short sleep before retry

On Wed, Aug 25, 2021 at 6:17 AM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> RHBZ: 1994393
>
> If we hit a STATUS_USER_SESSION_DELETED for the Create part in the
> Create/QueryDirectory compound that starts a directory scan
> we will leak EDEADLK back to userspace and surprise glibc and the application.
>
> Pick this up initiate_cifs_search() and retry a small number of tries before we
> return an error to userspace.
>
> Cc: stable@vger.kernel.org
> Reported-by: Xiaoli Feng <xifeng@redhat.com>
> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> ---
>  fs/cifs/readdir.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
> index bfee176b901d..4518e3ca64df 100644
> --- a/fs/cifs/readdir.c
> +++ b/fs/cifs/readdir.c
> @@ -369,7 +369,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
>   */
>
>  static int
> -initiate_cifs_search(const unsigned int xid, struct file *file,
> +_initiate_cifs_search(const unsigned int xid, struct file *file,
>  		     const char *full_path)
>  {
>  	__u16 search_flags;
> @@ -451,6 +451,23 @@ initiate_cifs_search(const unsigned int xid, struct file *file,
>  	return rc;
>  }
>
> +static int
> +initiate_cifs_search(const unsigned int xid, struct file *file,
> +		     const char *full_path)
> +{
> +	int rc, retry_count = 0;
> +
> +	do {
> +		rc = _initiate_cifs_search(xid, file, full_path);
> +		/*
> +		 * We don't have enough credits to start reading the
> +		 * directory so just try again.
> +		 */
> +	} while (rc == -EDEADLK && retry_count++ < 5);
> +
> +	return rc;
> +}
> +
>  /* return length of unicode string in bytes */
>  static int cifs_unicode_bytelen(const char *str)
>  {
> --
> 2.30.2
>
On Thu, Aug 26, 2021 at 2:39 AM Steve French <smfrench@gmail.com> wrote:
>
> lightly updated to add short sleep before retry
>
>
> On Wed, Aug 25, 2021 at 6:17 AM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
> >
> > RHBZ: 1994393
> >
> > If we hit a STATUS_USER_SESSION_DELETED for the Create part in the
> > Create/QueryDirectory compound that starts a directory scan
> > we will leak EDEADLK back to userspace and surprise glibc and the application.
> >
> > Pick this up initiate_cifs_search() and retry a small number of tries before we
> > return an error to userspace.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Xiaoli Feng <xifeng@redhat.com>
> > Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> > ---
> >  fs/cifs/readdir.c | 19 ++++++++++++++++++-
> >  1 file changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
> > index bfee176b901d..4518e3ca64df 100644
> > --- a/fs/cifs/readdir.c
> > +++ b/fs/cifs/readdir.c
> > @@ -369,7 +369,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
> >   */
> >
> >  static int
> > -initiate_cifs_search(const unsigned int xid, struct file *file,
> > +_initiate_cifs_search(const unsigned int xid, struct file *file,
> >  		     const char *full_path)
> >  {
> >  	__u16 search_flags;
> > @@ -451,6 +451,23 @@ initiate_cifs_search(const unsigned int xid, struct file *file,
> >  	return rc;
> >  }
> >
> > +static int
> > +initiate_cifs_search(const unsigned int xid, struct file *file,
> > +		     const char *full_path)
> > +{
> > +	int rc, retry_count = 0;
> > +
> > +	do {
> > +		rc = _initiate_cifs_search(xid, file, full_path);
> > +		/*
> > +		 * We don't have enough credits to start reading the
> > +		 * directory so just try again.
> > +		 */
> > +	} while (rc == -EDEADLK && retry_count++ < 5);
> > +
> > +	return rc;
> > +}
> > +
> >  /* return length of unicode string in bytes */
> >  static int cifs_unicode_bytelen(const char *str)
> >  {
> > --
> > 2.30.2
> >
>
>
> --
> Thanks,
>
> Steve

Hi Ronnie,

EDEADLK is returned in wait_for_compound_request, when num of credits
is 0, but there are no in flight requests to get more credits from.
Why did we reach here in the first place? If we already found
STATUS_USER_SESSION_DELETED, why are we waiting for another request?
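The condition described in this message (no credits left and nothing in flight that could return more) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct credit_state` and `check_compound_credits()` are hypothetical names invented here, the positive return value `1` meaning "caller should wait" is an illustrative convention, and the real wait_for_compound_request() does considerably more than this.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, simplified stand-in for per-connection credit accounting. */
struct credit_state {
	int credits;	/* credits currently available */
	int in_flight;	/* requests sent whose replies are still pending */
};

/*
 * Returns 0 if a compound needing `needed` credits can be sent now,
 * 1 if the caller should wait for pending replies to grant credits,
 * and -EDEADLK if waiting is pointless because nothing is in flight.
 */
static int check_compound_credits(const struct credit_state *st, int needed)
{
	assert(st && st->credits >= 0 && st->in_flight >= 0 && needed > 0);

	if (st->credits >= needed)
		return 0;
	if (st->in_flight == 0)
		return -EDEADLK;	/* no reply can ever top up the credits */
	return 1;
}
```

With `{ .credits = 0, .in_flight = 0 }` the helper reports -EDEADLK, which is the case being asked about here.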
On Fri, Aug 27, 2021 at 3:16 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 2:39 AM Steve French <smfrench@gmail.com> wrote:
> >
> > lightly updated to add short sleep before retry
> >
> >
> > On Wed, Aug 25, 2021 at 6:17 AM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
> > >
> > > RHBZ: 1994393
> > >
> > > If we hit a STATUS_USER_SESSION_DELETED for the Create part in the
> > > Create/QueryDirectory compound that starts a directory scan
> > > we will leak EDEADLK back to userspace and surprise glibc and the application.
> > >
> > > Pick this up initiate_cifs_search() and retry a small number of tries before we
> > > return an error to userspace.
> > >
> > > Cc: stable@vger.kernel.org
> > > Reported-by: Xiaoli Feng <xifeng@redhat.com>
> > > Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
> > > ---
> > >  fs/cifs/readdir.c | 19 ++++++++++++++++++-
> > >  1 file changed, 18 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
> > > index bfee176b901d..4518e3ca64df 100644
> > > --- a/fs/cifs/readdir.c
> > > +++ b/fs/cifs/readdir.c
> > > @@ -369,7 +369,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
> > >   */
> > >
> > >  static int
> > > -initiate_cifs_search(const unsigned int xid, struct file *file,
> > > +_initiate_cifs_search(const unsigned int xid, struct file *file,
> > >  		     const char *full_path)
> > >  {
> > >  	__u16 search_flags;
> > > @@ -451,6 +451,23 @@ initiate_cifs_search(const unsigned int xid, struct file *file,
> > >  	return rc;
> > >  }
> > >
> > > +static int
> > > +initiate_cifs_search(const unsigned int xid, struct file *file,
> > > +		     const char *full_path)
> > > +{
> > > +	int rc, retry_count = 0;
> > > +
> > > +	do {
> > > +		rc = _initiate_cifs_search(xid, file, full_path);
> > > +		/*
> > > +		 * We don't have enough credits to start reading the
> > > +		 * directory so just try again.
> > > +		 */
> > > +	} while (rc == -EDEADLK && retry_count++ < 5);
> > > +
> > > +	return rc;
> > > +}
> > > +
> > >  /* return length of unicode string in bytes */
> > >  static int cifs_unicode_bytelen(const char *str)
> > >  {
> > > --
> > > 2.30.2
> > >
> >
> >
> > --
> > Thanks,
> >
> > Steve
>
> Hi Ronnie,
>
> EDEADLK is returned in wait_for_compound_request, when num of credits
> is 0, but there are no in flight requests to get more credits from.
> Why did we reach here in the first place? If we already found
> STATUS_USER_SESSION_DELETED, why are we waiting for another request?

USER_SESSION_DELETED means the session is bad and needs to be
reconnected, which is why we cannot get any credits. We can't get any
credits until the session has been reconnected.

If this happens from smb2_query_dir_first, the first attempt with
USER_SESSION_DELETED will cause the session to need a reconnect and
return -EAGAIN. While we have a retry on -EAGAIN here in this function,
I don't think it can handle cases where we have both EAGAIN and a
situation where the session is dead and needs a reconnect (which also
means no credits). I think we are too deep in the call stack, with too
many things locked, to reconnect the session right now. Thus the retry
on EAGAIN turns into an EDEADLK.

The EDEADLK is then returned through the stack all the way back to
cifs_readdir(), where we don't have all these things locked, and thus a
retry will actually trigger a reconnect and we recover.

It is easy to reproduce with scrambla and the small error-injection
patch I posted. Every third "ls /mountpoint" will return a
USER_SESSION_DELETED to smb2_query_dir_first, which you can then see
leaking -EDEADLK to the ls command if you strace it.

BTW, I really want to start using scrambla in our buildbot, since it
will allow us to do a lot of error injection from the server side and
test that the client recovers correctly from those errors.
(scrambla is ~5k lines of python3 and is a lot less intimidating to
patch to inject errors than full-blown samba)

>
> --
> Regards,
> Shyam
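As an alternative to watching `ls` under strace, a small userspace check along the following lines could report the error a directory scan ends with. It is a minimal sketch, not part of the patch: `scan_dir()` is a hypothetical helper name, and it assumes that glibc's readdir() leaves the errno from the underlying getdents64 call unchanged, so a leaked EDEADLK would show up as a return value of -EDEADLK.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>

/*
 * Walk every entry of `path` once.
 * Returns 0 on a clean end of directory, or -errno from opendir()/readdir().
 * The number of entries seen is stored in *entries.
 */
static int scan_dir(const char *path, long *entries)
{
	DIR *dir;
	int err = 0;

	assert(path && entries);
	*entries = 0;

	dir = opendir(path);
	if (!dir)
		return -errno;

	for (;;) {
		errno = 0;		/* lets us tell end-of-directory from error */
		if (!readdir(dir)) {
			err = -errno;	/* stays 0 at a normal end of directory */
			break;
		}
		(*entries)++;
	}

	closedir(dir);
	return err;
}
```

Calling `scan_dir("/mountpoint", &n)` in a loop against a server with error injection enabled would then make a leaked error visible as a non-zero return value.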
diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
index bfee176b901d..4518e3ca64df 100644
--- a/fs/cifs/readdir.c
+++ b/fs/cifs/readdir.c
@@ -369,7 +369,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
  */
 
 static int
-initiate_cifs_search(const unsigned int xid, struct file *file,
+_initiate_cifs_search(const unsigned int xid, struct file *file,
 		     const char *full_path)
 {
 	__u16 search_flags;
@@ -451,6 +451,23 @@ initiate_cifs_search(const unsigned int xid, struct file *file,
 	return rc;
 }
 
+static int
+initiate_cifs_search(const unsigned int xid, struct file *file,
+		     const char *full_path)
+{
+	int rc, retry_count = 0;
+
+	do {
+		rc = _initiate_cifs_search(xid, file, full_path);
+		/*
+		 * We don't have enough credits to start reading the
+		 * directory so just try again.
+		 */
+	} while (rc == -EDEADLK && retry_count++ < 5);
+
+	return rc;
+}
+
 /* return length of unicode string in bytes */
 static int cifs_unicode_bytelen(const char *str)
 {
RHBZ: 1994393

If we hit a STATUS_USER_SESSION_DELETED for the Create part in the
Create/QueryDirectory compound that starts a directory scan,
we will leak EDEADLK back to userspace and surprise glibc and the application.

Pick this up in initiate_cifs_search() and retry a small number of times before we
return an error to userspace.

Cc: stable@vger.kernel.org
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
---
 fs/cifs/readdir.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
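
One detail of the loop in the patch is easy to misread: because `retry_count++ < 5` is evaluated only after a failed attempt, the wrapper makes at most six calls (one initial attempt plus five retries), not five. The userspace sketch below reproduces just the loop shape to show this. `struct fake_search`, `fake_initiate()` and `retry_search()` are hypothetical stand-ins invented for illustration, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

#define MAX_RETRIES 5	/* same bound as in the patch */

/* Hypothetical stand-in for _initiate_cifs_search(). */
struct fake_search {
	int fail_first;	/* number of leading calls that return -EDEADLK */
	int calls;	/* number of calls made so far */
};

static int fake_initiate(struct fake_search *s)
{
	s->calls++;
	return s->calls <= s->fail_first ? -EDEADLK : 0;
}

/* Same do/while shape as the patched initiate_cifs_search(). */
static int retry_search(struct fake_search *s)
{
	int rc, retry_count = 0;

	assert(s);
	do {
		rc = fake_initiate(s);
	} while (rc == -EDEADLK && retry_count++ < MAX_RETRIES);

	return rc;
}
```

A search that fails five times in a row still succeeds on the sixth call, while one that keeps failing returns -EDEADLK after six calls. The version mentioned at the top of the thread adds a short sleep between attempts; its duration is not stated here, so it is left out of the sketch.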