Complete oplock break jobs before closing file handle

Message ID	1421324524-2524-1-git-send-email-sprabhu@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-cifs-owner@kernel.org> From: Sachin Prabhu <sprabhu@redhat.com> To: linux-cifs <linux-cifs@vger.kernel.org> Subject: [PATCH] Complete oplock break jobs before closing file handle Date: Thu, 15 Jan 2015 12:22:04 +0000 Message-Id: <1421324524-2524-1-git-send-email-sprabhu@redhat.com> Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk

Sachin Prabhu Jan. 15, 2015, 12:22 p.m. UTC

Commit
c11f1df5003d534fd067f0168bfad7befffb3b5c
requires writers to wait for any pending oplock break handler to
complete before proceeding to write. This is done by waiting on bit
CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
cleared by the oplock break handler job queued on the workqueue once it
has completed handling the oplock break allowing writers to proceed with
writing to the file.

While testing, it was noticed that the filehandle could be closed while
there is a pending oplock break which results in the oplock break
handler on the cifsiod workqueue being cancelled before it has had a
chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
Any subsequent attempt to write to this file hangs waiting for the
CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.

We fix this by ensuring that we also clear the bit
CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
from the workqueue.

The bug was found by Red Hat QA while testing using ltp's fsstress
command.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Jeff Layton <jlayton@samba.org>
---
 fs/cifs/file.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Shirish Pargaonkar Jan. 17, 2015, 1:45 p.m. UTC | #1

On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
> Commit
> c11f1df5003d534fd067f0168bfad7befffb3b5c
> requires writers to wait for any pending oplock break handler to
> complete before proceeding to write. This is done by waiting on bit
> CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
> cleared by the oplock break handler job queued on the workqueue once it
> has completed handling the oplock break allowing writers to proceed with
> writing to the file.
>
> While testing, it was noticed that the filehandle could be closed while
> there is a pending oplock break which results in the oplock break
> handler on the cifsiod workqueue being cancelled before it has had a
> chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
> Any subsequent attempt to write to this file hangs waiting for the
> CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
>
> We fix this by ensuring that we also clear the bit
> CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
> from the workqueue.
>
> The bug was found by Red Hat QA while testing using ltp's fsstress
> command.
>
> Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
> Signed-off-by: Jeff Layton <jlayton@samba.org>
> ---
>  fs/cifs/file.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 96b7e9b..74f1287 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>         struct cifsLockInfo *li, *tmp;
>         struct cifs_fid fid;
>         struct cifs_pending_open open;
> +       bool oplock_break_cancelled;
>
>         spin_lock(&cifs_file_list_lock);
>         if (--cifs_file->count > 0) {
> @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>         }
>         spin_unlock(&cifs_file_list_lock);
>
> -       cancel_work_sync(&cifs_file->oplock_break);
> +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
>
>         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
>                 struct TCP_Server_Info *server = tcon->ses->server;
> @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>                 _free_xid(xid);
>         }
>
> +       if (oplock_break_cancelled)
> +               cifs_done_oplock_break(cifsi);
> +
>         cifs_del_pending_open(&open);
>
>         /*
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Does it matter what cancel_work_sync() returns?
Should cifs_done_oplock_break(cifsi); be called unconditionally?
Also, should this also to go stable?
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Sachin Prabhu Jan. 19, 2015, 9:55 a.m. UTC | #2

On Sat, 2015-01-17 at 07:45 -0600, Shirish Pargaonkar wrote:
> On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
> > Commit
> > c11f1df5003d534fd067f0168bfad7befffb3b5c
> > requires writers to wait for any pending oplock break handler to
> > complete before proceeding to write. This is done by waiting on bit
> > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
> > cleared by the oplock break handler job queued on the workqueue once it
> > has completed handling the oplock break allowing writers to proceed with
> > writing to the file.
> >
> > While testing, it was noticed that the filehandle could be closed while
> > there is a pending oplock break which results in the oplock break
> > handler on the cifsiod workqueue being cancelled before it has had a
> > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
> > Any subsequent attempt to write to this file hangs waiting for the
> > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
> >
> > We fix this by ensuring that we also clear the bit
> > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
> > from the workqueue.
> >
> > The bug was found by Red Hat QA while testing using ltp's fsstress
> > command.
> >
> > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
> > Signed-off-by: Jeff Layton <jlayton@samba.org>
> > ---
> >  fs/cifs/file.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 96b7e9b..74f1287 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >         struct cifsLockInfo *li, *tmp;
> >         struct cifs_fid fid;
> >         struct cifs_pending_open open;
> > +       bool oplock_break_cancelled;
> >
> >         spin_lock(&cifs_file_list_lock);
> >         if (--cifs_file->count > 0) {
> > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >         }
> >         spin_unlock(&cifs_file_list_lock);
> >
> > -       cancel_work_sync(&cifs_file->oplock_break);
> > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
> >
> >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
> >                 struct TCP_Server_Info *server = tcon->ses->server;
> > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >                 _free_xid(xid);
> >         }
> >
> > +       if (oplock_break_cancelled)
> > +               cifs_done_oplock_break(cifsi);
> > +
> >         cifs_del_pending_open(&open);
> >
> >         /*
> > --
> > 2.1.0
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> Does it matter what cancel_work_sync() returns?
> Should cifs_done_oplock_break(cifsi); be called unconditionally?
> Also, should this also to go stable?

Sirish,

This is done so that we delay allowing other writers to write to this
file before we call close on the file handle.

Yes. I think it should also go to stable once it has been accepted by
Steve.

Sachin Prabhu


--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jeff Layton Jan. 19, 2015, 12:02 p.m. UTC | #3

On Sat, 17 Jan 2015 07:45:13 -0600
Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:

> On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
> > Commit
> > c11f1df5003d534fd067f0168bfad7befffb3b5c
> > requires writers to wait for any pending oplock break handler to
> > complete before proceeding to write. This is done by waiting on bit
> > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
> > cleared by the oplock break handler job queued on the workqueue once it
> > has completed handling the oplock break allowing writers to proceed with
> > writing to the file.
> >
> > While testing, it was noticed that the filehandle could be closed while
> > there is a pending oplock break which results in the oplock break
> > handler on the cifsiod workqueue being cancelled before it has had a
> > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
> > Any subsequent attempt to write to this file hangs waiting for the
> > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
> >
> > We fix this by ensuring that we also clear the bit
> > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
> > from the workqueue.
> >
> > The bug was found by Red Hat QA while testing using ltp's fsstress
> > command.
> >
> > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
> > Signed-off-by: Jeff Layton <jlayton@samba.org>
> > ---
> >  fs/cifs/file.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 96b7e9b..74f1287 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >         struct cifsLockInfo *li, *tmp;
> >         struct cifs_fid fid;
> >         struct cifs_pending_open open;
> > +       bool oplock_break_cancelled;
> >
> >         spin_lock(&cifs_file_list_lock);
> >         if (--cifs_file->count > 0) {
> > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >         }
> >         spin_unlock(&cifs_file_list_lock);
> >
> > -       cancel_work_sync(&cifs_file->oplock_break);
> > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
> >
> >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
> >                 struct TCP_Server_Info *server = tcon->ses->server;
> > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> >                 _free_xid(xid);
> >         }
> >
> > +       if (oplock_break_cancelled)
> > +               cifs_done_oplock_break(cifsi);
> > +
> >         cifs_del_pending_open(&open);
> >
> >         /*
> > --
> > 2.1.0
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> Does it matter what cancel_work_sync() returns?
> Should cifs_done_oplock_break(cifsi); be called unconditionally?

I guess the question is whether you can end up with another oplock
break racing in between cancel_work_sync and the
cifs_done_oplock_break. If that can't happen, then calling it
unconditionally is fine.

> Also, should this also to go stable?

Yes, probably. It can cause the client to hang.

Sachin Prabhu Jan. 19, 2015, 4:24 p.m. UTC | #4

On Mon, 2015-01-19 at 07:02 -0500, Jeff Layton wrote:
> On Sat, 17 Jan 2015 07:45:13 -0600
> Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> 
> > On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
> > > Commit
> > > c11f1df5003d534fd067f0168bfad7befffb3b5c
> > > requires writers to wait for any pending oplock break handler to
> > > complete before proceeding to write. This is done by waiting on bit
> > > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
> > > cleared by the oplock break handler job queued on the workqueue once it
> > > has completed handling the oplock break allowing writers to proceed with
> > > writing to the file.
> > >
> > > While testing, it was noticed that the filehandle could be closed while
> > > there is a pending oplock break which results in the oplock break
> > > handler on the cifsiod workqueue being cancelled before it has had a
> > > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
> > > Any subsequent attempt to write to this file hangs waiting for the
> > > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
> > >
> > > We fix this by ensuring that we also clear the bit
> > > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
> > > from the workqueue.
> > >
> > > The bug was found by Red Hat QA while testing using ltp's fsstress
> > > command.
> > >
> > > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
> > > Signed-off-by: Jeff Layton <jlayton@samba.org>
> > > ---
> > >  fs/cifs/file.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > > index 96b7e9b..74f1287 100644
> > > --- a/fs/cifs/file.c
> > > +++ b/fs/cifs/file.c
> > > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > >         struct cifsLockInfo *li, *tmp;
> > >         struct cifs_fid fid;
> > >         struct cifs_pending_open open;
> > > +       bool oplock_break_cancelled;
> > >
> > >         spin_lock(&cifs_file_list_lock);
> > >         if (--cifs_file->count > 0) {
> > > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > >         }
> > >         spin_unlock(&cifs_file_list_lock);
> > >
> > > -       cancel_work_sync(&cifs_file->oplock_break);
> > > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
> > >
> > >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
> > >                 struct TCP_Server_Info *server = tcon->ses->server;
> > > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > >                 _free_xid(xid);
> > >         }
> > >
> > > +       if (oplock_break_cancelled)
> > > +               cifs_done_oplock_break(cifsi);
> > > +
> > >         cifs_del_pending_open(&open);
> > >
> > >         /*
> > > --
> > > 2.1.0
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
> > Does it matter what cancel_work_sync() returns?
> > Should cifs_done_oplock_break(cifsi); be called unconditionally?
> 
> I guess the question is whether you can end up with another oplock
> break racing in between cancel_work_sync and the
> cifs_done_oplock_break. If that can't happen, then calling it
> unconditionally is fine.

By the time we call cancel_work_sync(), we have removed the file from
the list of files opened for the tcon. This is the file list used by
is_valid_oplock_break() to find the file which the oplock break refers
to and to schedule an oplock break work. Since the file is no longer in
the list of open files held for the tcon, we cannot have an oplock break
sneaking in between  cancel_work_sync() and cifs_done_oplock_break(). So
I guess it can be called unconditionally. 

Should I make the change?

> 
> > Also, should this also to go stable?
> 
> Yes, probably. It can cause the client to hang.
> 

Sachin Prabhu


--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jeff Layton Jan. 19, 2015, 4:36 p.m. UTC | #5

On Mon, 19 Jan 2015 16:24:52 +0000
Sachin Prabhu <sprabhu@redhat.com> wrote:

> On Mon, 2015-01-19 at 07:02 -0500, Jeff Layton wrote:
> > On Sat, 17 Jan 2015 07:45:13 -0600
> > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
> > 
> > > On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
> > > > Commit
> > > > c11f1df5003d534fd067f0168bfad7befffb3b5c
> > > > requires writers to wait for any pending oplock break handler to
> > > > complete before proceeding to write. This is done by waiting on bit
> > > > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
> > > > cleared by the oplock break handler job queued on the workqueue once it
> > > > has completed handling the oplock break allowing writers to proceed with
> > > > writing to the file.
> > > >
> > > > While testing, it was noticed that the filehandle could be closed while
> > > > there is a pending oplock break which results in the oplock break
> > > > handler on the cifsiod workqueue being cancelled before it has had a
> > > > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
> > > > Any subsequent attempt to write to this file hangs waiting for the
> > > > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
> > > >
> > > > We fix this by ensuring that we also clear the bit
> > > > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
> > > > from the workqueue.
> > > >
> > > > The bug was found by Red Hat QA while testing using ltp's fsstress
> > > > command.
> > > >
> > > > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
> > > > Signed-off-by: Jeff Layton <jlayton@samba.org>
> > > > ---
> > > >  fs/cifs/file.c | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > > > index 96b7e9b..74f1287 100644
> > > > --- a/fs/cifs/file.c
> > > > +++ b/fs/cifs/file.c
> > > > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > > >         struct cifsLockInfo *li, *tmp;
> > > >         struct cifs_fid fid;
> > > >         struct cifs_pending_open open;
> > > > +       bool oplock_break_cancelled;
> > > >
> > > >         spin_lock(&cifs_file_list_lock);
> > > >         if (--cifs_file->count > 0) {
> > > > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > > >         }
> > > >         spin_unlock(&cifs_file_list_lock);
> > > >
> > > > -       cancel_work_sync(&cifs_file->oplock_break);
> > > > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
> > > >
> > > >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
> > > >                 struct TCP_Server_Info *server = tcon->ses->server;
> > > > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
> > > >                 _free_xid(xid);
> > > >         }
> > > >
> > > > +       if (oplock_break_cancelled)
> > > > +               cifs_done_oplock_break(cifsi);
> > > > +
> > > >         cifs_del_pending_open(&open);
> > > >
> > > >         /*
> > > > --
> > > > 2.1.0
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > 
> > > 
> > > Does it matter what cancel_work_sync() returns?
> > > Should cifs_done_oplock_break(cifsi); be called unconditionally?
> > 
> > I guess the question is whether you can end up with another oplock
> > break racing in between cancel_work_sync and the
> > cifs_done_oplock_break. If that can't happen, then calling it
> > unconditionally is fine.
> 
> By the time we call cancel_work_sync(), we have removed the file from
> the list of files opened for the tcon. This is the file list used by
> is_valid_oplock_break() to find the file which the oplock break refers
> to and to schedule an oplock break work. Since the file is no longer in
> the list of open files held for the tcon, we cannot have an oplock break
> sneaking in between  cancel_work_sync() and cifs_done_oplock_break(). So
> I guess it can be called unconditionally. 
> 
> Should I make the change?
> 

I dunno. I'm not sure I really grok how this is supposed to work...

The oplock break jobs are per-cifsFileInfo, but the
CIFS_INODE_PENDING_OPLOCK_BREAK bit is per-inode. ISTM that you could
end up with oplocks breaks for multiple open files attached to a single
inode. If that happens though, you just end up waiting on the first one
to finish (or be cancelled)?

Probably someone ought to go over this in detail and determine whether
the basic design makes sense...

> > 
> > > Also, should this also to go stable?
> > 
> > Yes, probably. It can cause the client to hang.
> > 
> 
> Sachin Prabhu
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steve French Jan. 19, 2015, 5:13 p.m. UTC | #6

On Mon, Jan 19, 2015 at 10:36 AM, Jeff Layton <jlayton@samba.org> wrote:
> On Mon, 19 Jan 2015 16:24:52 +0000
> Sachin Prabhu <sprabhu@redhat.com> wrote:
>
>> On Mon, 2015-01-19 at 07:02 -0500, Jeff Layton wrote:
>> > On Sat, 17 Jan 2015 07:45:13 -0600
>> > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
>> >
>> > > On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
>> > > > Commit
>> > > > c11f1df5003d534fd067f0168bfad7befffb3b5c
>> > > > requires writers to wait for any pending oplock break handler to
>> > > > complete before proceeding to write. This is done by waiting on bit
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
>> > > > cleared by the oplock break handler job queued on the workqueue once it
>> > > > has completed handling the oplock break allowing writers to proceed with
>> > > > writing to the file.
>> > > >
>> > > > While testing, it was noticed that the filehandle could be closed while
>> > > > there is a pending oplock break which results in the oplock break
>> > > > handler on the cifsiod workqueue being cancelled before it has had a
>> > > > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
>> > > > Any subsequent attempt to write to this file hangs waiting for the
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
>> > > >
>> > > > We fix this by ensuring that we also clear the bit
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
>> > > > from the workqueue.
>> > > >
>> > > > The bug was found by Red Hat QA while testing using ltp's fsstress
>> > > > command.
>> > > >
>> > > > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
>> > > > Signed-off-by: Jeff Layton <jlayton@samba.org>
>> > > > ---
>> > > >  fs/cifs/file.c | 6 +++++-
>> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
>> > > > index 96b7e9b..74f1287 100644
>> > > > --- a/fs/cifs/file.c
>> > > > +++ b/fs/cifs/file.c
>> > > > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >         struct cifsLockInfo *li, *tmp;
>> > > >         struct cifs_fid fid;
>> > > >         struct cifs_pending_open open;
>> > > > +       bool oplock_break_cancelled;
>> > > >
>> > > >         spin_lock(&cifs_file_list_lock);
>> > > >         if (--cifs_file->count > 0) {
>> > > > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >         }
>> > > >         spin_unlock(&cifs_file_list_lock);
>> > > >
>> > > > -       cancel_work_sync(&cifs_file->oplock_break);
>> > > > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
>> > > >
>> > > >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
>> > > >                 struct TCP_Server_Info *server = tcon->ses->server;
>> > > > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >                 _free_xid(xid);
>> > > >         }
>> > > >
>> > > > +       if (oplock_break_cancelled)
>> > > > +               cifs_done_oplock_break(cifsi);
>> > > > +
>> > > >         cifs_del_pending_open(&open);
>> > > >
>> > > >         /*
>> > > > --
>> > > > 2.1.0
>> > > >
>> > > > --
>> > > > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> > > > the body of a message to majordomo@vger.kernel.org
>> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > >
>> > >
>> > > Does it matter what cancel_work_sync() returns?
>> > > Should cifs_done_oplock_break(cifsi); be called unconditionally?
>> >
>> > I guess the question is whether you can end up with another oplock
>> > break racing in between cancel_work_sync and the
>> > cifs_done_oplock_break. If that can't happen, then calling it
>> > unconditionally is fine.
>>
>> By the time we call cancel_work_sync(), we have removed the file from
>> the list of files opened for the tcon. This is the file list used by
>> is_valid_oplock_break() to find the file which the oplock break refers
>> to and to schedule an oplock break work. Since the file is no longer in
>> the list of open files held for the tcon, we cannot have an oplock break
>> sneaking in between  cancel_work_sync() and cifs_done_oplock_break(). So
>> I guess it can be called unconditionally.
>>
>> Should I make the change?
>>
>
> I dunno. I'm not sure I really grok how this is supposed to work...
>
> The oplock break jobs are per-cifsFileInfo, but the
> CIFS_INODE_PENDING_OPLOCK_BREAK bit is per-inode. ISTM that you could
> end up with oplocks breaks for multiple open files attached to a single
> inode. If that happens though, you just end up waiting on the first one
> to finish (or be cancelled)?
>
> Probably someone ought to go over this in detail and determine whether
> the basic design makes sense...

And whether it makes sense for both SMB2.1 (and later) vs. CIFS
(ie SMB2.1 leases vs. SMB/CIFS oplocks)

Thinking about some of the things mentioned in here:

http://blogs.msdn.com/b/openspecification/archive/2009/05/22/client-caching-features-oplock-vs-lease.aspx

>> > > Also, should this also to go stable?
>> >
>> > Yes, probably. It can cause the client to hang.

yes

Steve French Jan. 19, 2015, 10:15 p.m. UTC | #7

On Mon, Jan 19, 2015 at 10:36 AM, Jeff Layton <jlayton@samba.org> wrote:
> On Mon, 19 Jan 2015 16:24:52 +0000
> Sachin Prabhu <sprabhu@redhat.com> wrote:
>
>> On Mon, 2015-01-19 at 07:02 -0500, Jeff Layton wrote:
>> > On Sat, 17 Jan 2015 07:45:13 -0600
>> > Shirish Pargaonkar <shirishpargaonkar@gmail.com> wrote:
>> >
>> > > On Thu, Jan 15, 2015 at 6:22 AM, Sachin Prabhu <sprabhu@redhat.com> wrote:
>> > > > Commit
>> > > > c11f1df5003d534fd067f0168bfad7befffb3b5c
>> > > > requires writers to wait for any pending oplock break handler to
>> > > > complete before proceeding to write. This is done by waiting on bit
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
>> > > > cleared by the oplock break handler job queued on the workqueue once it
>> > > > has completed handling the oplock break allowing writers to proceed with
>> > > > writing to the file.
>> > > >
>> > > > While testing, it was noticed that the filehandle could be closed while
>> > > > there is a pending oplock break which results in the oplock break
>> > > > handler on the cifsiod workqueue being cancelled before it has had a
>> > > > chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
>> > > > Any subsequent attempt to write to this file hangs waiting for the
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
>> > > >
>> > > > We fix this by ensuring that we also clear the bit
>> > > > CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
>> > > > from the workqueue.
>> > > >
>> > > > The bug was found by Red Hat QA while testing using ltp's fsstress
>> > > > command.
>> > > >
>> > > > Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
>> > > > Signed-off-by: Jeff Layton <jlayton@samba.org>
>> > > > ---
>> > > >  fs/cifs/file.c | 6 +++++-
>> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
>> > > > index 96b7e9b..74f1287 100644
>> > > > --- a/fs/cifs/file.c
>> > > > +++ b/fs/cifs/file.c
>> > > > @@ -366,6 +366,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >         struct cifsLockInfo *li, *tmp;
>> > > >         struct cifs_fid fid;
>> > > >         struct cifs_pending_open open;
>> > > > +       bool oplock_break_cancelled;
>> > > >
>> > > >         spin_lock(&cifs_file_list_lock);
>> > > >         if (--cifs_file->count > 0) {
>> > > > @@ -397,7 +398,7 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >         }
>> > > >         spin_unlock(&cifs_file_list_lock);
>> > > >
>> > > > -       cancel_work_sync(&cifs_file->oplock_break);
>> > > > +       oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
>> > > >
>> > > >         if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
>> > > >                 struct TCP_Server_Info *server = tcon->ses->server;
>> > > > @@ -409,6 +410,9 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
>> > > >                 _free_xid(xid);
>> > > >         }
>> > > >
>> > > > +       if (oplock_break_cancelled)
>> > > > +               cifs_done_oplock_break(cifsi);
>> > > > +
>> > > >         cifs_del_pending_open(&open);
>> > > >
>> > > >         /*
>> > > > --
>> > > > 2.1.0
>> > > >
>> > > > --
>> > > > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> > > > the body of a message to majordomo@vger.kernel.org
>> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > >
>> > >
>> > > Does it matter what cancel_work_sync() returns?
>> > > Should cifs_done_oplock_break(cifsi); be called unconditionally?
>> >
>> > I guess the question is whether you can end up with another oplock
>> > break racing in between cancel_work_sync and the
>> > cifs_done_oplock_break. If that can't happen, then calling it
>> > unconditionally is fine.
>>
>> By the time we call cancel_work_sync(), we have removed the file from
>> the list of files opened for the tcon. This is the file list used by
>> is_valid_oplock_break() to find the file which the oplock break refers
>> to and to schedule an oplock break work. Since the file is no longer in
>> the list of open files held for the tcon, we cannot have an oplock break
>> sneaking in between  cancel_work_sync() and cifs_done_oplock_break(). So
>> I guess it can be called unconditionally.
>>
>> Should I make the change?
>>
>
> I dunno. I'm not sure I really grok how this is supposed to work...
>
> The oplock break jobs are per-cifsFileInfo, but the
> CIFS_INODE_PENDING_OPLOCK_BREAK bit is per-inode. ISTM that you could
> end up with oplocks breaks for multiple open files attached to a single
> inode. If that happens though, you just end up waiting on the first one
> to finish (or be cancelled)?

The oplock (in the inode flags) is downgraded immediately.  Getting
multiple overlapping oplock breaks on different opens of the same inode
should be harmless as long as the first break was processed.

Doesn't look like the if (oplock_break_cancelled) ... matters one
way or the other. I am ok with merging as is (will add the CC stable)

Merged into cifs-2.6.git

Complete oplock break jobs before closing file handle

Commit Message

Comments

Patch