[v2] btrfs: qgroup: exit the rescan worker during umount

Message ID 1441242317-16547-1-git-send-email-jmaggard@netgear.com (mailing list archive)
State Superseded

Commit Message

Justin Maggard Sept. 3, 2015, 1:05 a.m. UTC
v2: Fix stupid error while making formatting changes...

I was hitting a consistent NULL pointer dereference during shutdown that
showed the trace running through end_workqueue_bio().  I traced it back to
the endio_meta_workers workqueue being poked after it had already been
destroyed.

Eventually I found that the root cause was a qgroup rescan that was still
in progress while we were stopping all the btrfs workers.

Currently we explicitly pause balance and scrub operations in
close_ctree(), but we do nothing to stop the qgroup rescan.  We should
probably be doing the same for qgroup rescan, but that's a much larger
change.  This small change is good enough to allow me to unmount without
crashing.

Signed-off-by: Justin Maggard <jmaggard@netgear.com>
---
 fs/btrfs/qgroup.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

David Sterba Sept. 22, 2015, 2:45 p.m. UTC | #1
On Wed, Sep 02, 2015 at 06:05:17PM -0700, Justin Maggard wrote:
> v2: Fix stupid error while making formatting changes...

I haven't noticed any difference between the patches, what exactly did
you change?

> I was hitting a consistent NULL pointer dereference during shutdown that
> showed the trace running through end_workqueue_bio().  I traced it back to
> the endio_meta_workers workqueue being poked after it had already been
> destroyed.
> 
> Eventually I found that the root cause was a qgroup rescan that was still
> in progress while we were stopping all the btrfs workers.
> 
> Currently we explicitly pause balance and scrub operations in
> close_ctree(), but we do nothing to stop the qgroup rescan.  We should
> probably be doing the same for qgroup rescan, but that's a much larger
> change.  This small change is good enough to allow me to unmount without
> crashing.
> 
> Signed-off-by: Justin Maggard <jmaggard@netgear.com>

Can you please submit the test you've used to trigger the crash to
fstests?

Reviewed-by: David Sterba <dsterba@suse.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Justin Maggard Sept. 26, 2015, 12:25 a.m. UTC | #2
On Tue, Sep 22, 2015 at 7:45 AM, David Sterba <dsterba@suse.cz> wrote:
> On Wed, Sep 02, 2015 at 06:05:17PM -0700, Justin Maggard wrote:
>> v2: Fix stupid error while making formatting changes...
>
> I haven't noticed any difference between the patches, what exactly did
> you change?
>

I broke compiling while cleaning up some checkpatch.pl feedback.
Here's what changed between v1 and v2:

-       if (!btrfs_fs_closing(fs_info)) {
+       if (!btrfs_fs_closing(fs_info))


>> I was hitting a consistent NULL pointer dereference during shutdown that
>> showed the trace running through end_workqueue_bio().  I traced it back to
>> the endio_meta_workers workqueue being poked after it had already been
>> destroyed.
>>
>> Eventually I found that the root cause was a qgroup rescan that was still
>> in progress while we were stopping all the btrfs workers.
>>
>> Currently we explicitly pause balance and scrub operations in
>> close_ctree(), but we do nothing to stop the qgroup rescan.  We should
>> probably be doing the same for qgroup rescan, but that's a much larger
>> change.  This small change is good enough to allow me to unmount without
>> crashing.
>>
>> Signed-off-by: Justin Maggard <jmaggard@netgear.com>
>
> Can you please submit the test you've used to trigger the crash to
> fstests?
>

Sure, I've got a reproducer coded up for xfstests now.  Should I just
send that to this list, or is there a better place to send it?

-Justin

Filipe Manana Sept. 26, 2015, 11:49 a.m. UTC | #3
On Sat, Sep 26, 2015 at 1:25 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
> On Tue, Sep 22, 2015 at 7:45 AM, David Sterba <dsterba@suse.cz> wrote:
>> On Wed, Sep 02, 2015 at 06:05:17PM -0700, Justin Maggard wrote:
>>> v2: Fix stupid error while making formatting changes...
>>
>> I haven't noticed any difference between the patches, what exactly did
>> you change?
>>
>
> I broke compiling while cleaning up some checkpatch.pl feedback.
> Here's what changed between v1 and v2:
>
> -       if (!btrfs_fs_closing(fs_info)) {
> +       if (!btrfs_fs_closing(fs_info))
>
>
>>> I was hitting a consistent NULL pointer dereference during shutdown that
>>> showed the trace running through end_workqueue_bio().  I traced it back to
>>> the endio_meta_workers workqueue being poked after it had already been
>>> destroyed.
>>>
>>> Eventually I found that the root cause was a qgroup rescan that was still
>>> in progress while we were stopping all the btrfs workers.
>>>
>>> Currently we explicitly pause balance and scrub operations in
>>> close_ctree(), but we do nothing to stop the qgroup rescan.  We should
>>> probably be doing the same for qgroup rescan, but that's a much larger
>>> change.  This small change is good enough to allow me to unmount without
>>> crashing.
>>>
>>> Signed-off-by: Justin Maggard <jmaggard@netgear.com>
>>
>> Can you please submit the test you've used to trigger the crash to
>> fstests?
>>
>
> Sure, I've got a reproducer coded up for xfstests now.  Should I just
> send that to this list, or is there a better place to send it?

Just send it to fstests@vger.kernel.org with the btrfs mailing list on
cc. If you take a look at test submission emails in the btrfs mailing
list, you'll see how it's usually done.

thanks

>
> -Justin

Filipe Manana Oct. 8, 2015, 9:25 a.m. UTC | #4
On Thu, Sep 3, 2015 at 2:05 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
> v2: Fix stupid error while making formatting changes...
>
> I was hitting a consistent NULL pointer dereference during shutdown that
> showed the trace running through end_workqueue_bio().  I traced it back to
> the endio_meta_workers workqueue being poked after it had already been
> destroyed.
>
> Eventually I found that the root cause was a qgroup rescan that was still
> in progress while we were stopping all the btrfs workers.
>
> Currently we explicitly pause balance and scrub operations in
> close_ctree(), but we do nothing to stop the qgroup rescan.  We should
> probably be doing the same for qgroup rescan, but that's a much larger
> change.  This small change is good enough to allow me to unmount without
> crashing.
>
> Signed-off-by: Justin Maggard <jmaggard@netgear.com>
> ---
>  fs/btrfs/qgroup.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d904ee1..5bfcee9 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -2278,7 +2278,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>                 goto out;
>
>         err = 0;
> -       while (!err) {
> +       while (!err && !btrfs_fs_closing(fs_info)) {
>                 trans = btrfs_start_transaction(fs_info->fs_root, 0);
>                 if (IS_ERR(trans)) {
>                         err = PTR_ERR(trans);
> @@ -2301,7 +2301,8 @@ out:
>         btrfs_free_path(path);
>
>         mutex_lock(&fs_info->qgroup_rescan_lock);
> -       fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> +       if (!btrfs_fs_closing(fs_info))
> +               fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
>
>         if (err > 0 &&
>             fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT) {
> @@ -2330,7 +2331,9 @@ out:
>         }
>         btrfs_end_transaction(trans, fs_info->quota_root);
>
> -       if (err >= 0) {
> +       if (btrfs_fs_closing(fs_info)) {
> +               btrfs_info(fs_info, "qgroup scan paused");
> +       } else if (err >= 0) {
>                 btrfs_info(fs_info, "qgroup scan completed%s",
>                         err > 0 ? " (inconsistency flag cleared)" : "");
>         } else {

Justin, this is still racy (though much less so than before).

Once we leave the loop because btrfs_fs_closing(fs_info) is true, we
still start a transaction and write to the quota btree. Before or
during that write, close_ctree() might already have completed, or
reached a point where the write results in another NULL pointer
dereference, an access through a dangling pointer, a leaked transaction
that never gets committed (because close_ctree() already stopped the
transaction kthread), etc.

So in addition to what you did, you need to call
btrfs_qgroup_wait_for_completion(fs_info) at disk-io.c:close_ctree()
right after setting fs_info->closing to 1.

Otherwise it looks good.
Thanks.
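
For readers following the thread, the ordering suggested above (set the closing flag, wait for the rescan worker to finish, and only then tear down shared state) can be sketched in userspace C with pthreads. This is an analogy under stated assumptions, not the kernel change itself: struct demo_fs, rescan_worker(), demo_close() and run_shutdown_demo() are hypothetical names, and a mutex plus condition variable stand in for whatever btrfs_qgroup_wait_for_completion() uses internally.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for btrfs_fs_info; not the kernel structure. */
struct demo_fs {
	atomic_bool closing;        /* plays the role of fs_info->closing */
	pthread_mutex_t lock;       /* plays the role of qgroup_rescan_lock */
	pthread_cond_t done_cond;   /* signalled once the worker has finished */
	bool rescan_running;        /* protected by lock */
	int *workqueue;             /* shared resource freed during shutdown */
	bool touched_after_free;    /* set if the worker sees a freed queue */
};

static void *rescan_worker(void *arg)
{
	struct demo_fs *fs = arg;

	/* Same shape as the patched loop: stop early once closing is set. */
	for (int i = 0; i < 1000000 && !atomic_load(&fs->closing); i++) {
		if (fs->workqueue == NULL) {
			fs->touched_after_free = true;
			break;
		}
		fs->workqueue[0]++;
	}

	/*
	 * Final bookkeeping after the loop. This is the window that is
	 * still racy if shutdown does not wait for the worker.
	 */
	if (fs->workqueue == NULL)
		fs->touched_after_free = true;
	else
		fs->workqueue[0]++;

	pthread_mutex_lock(&fs->lock);
	fs->rescan_running = false;
	pthread_cond_broadcast(&fs->done_cond);
	pthread_mutex_unlock(&fs->lock);
	return NULL;
}

/* Analogue of the suggested close_ctree() ordering. */
static void demo_close(struct demo_fs *fs)
{
	atomic_store(&fs->closing, true);

	/* Wait for the worker before destroying anything it may use. */
	pthread_mutex_lock(&fs->lock);
	while (fs->rescan_running)
		pthread_cond_wait(&fs->done_cond, &fs->lock);
	pthread_mutex_unlock(&fs->lock);

	free(fs->workqueue);
	fs->workqueue = NULL;
}

/* Returns 0 if the worker never touched the queue after it was freed. */
int run_shutdown_demo(void)
{
	struct demo_fs fs = { .rescan_running = true };
	pthread_t tid;

	atomic_init(&fs.closing, false);
	pthread_mutex_init(&fs.lock, NULL);
	pthread_cond_init(&fs.done_cond, NULL);

	fs.workqueue = calloc(1, sizeof(*fs.workqueue));
	if (fs.workqueue == NULL)
		return -1;

	if (pthread_create(&tid, NULL, rescan_worker, &fs) != 0) {
		free(fs.workqueue);
		return -1;
	}

	demo_close(&fs);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&fs.done_cond);
	pthread_mutex_destroy(&fs.lock);
	return fs.touched_after_free ? 1 : 0;
}
```

Calling run_shutdown_demo() from a small main() and checking for a zero return exercises the sequence. If the wait in demo_close() were dropped, the worker's final bookkeeping could run after free(), which is the userspace counterpart of the race described above.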


> --
> 2.5.1
>

Patch

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index d904ee1..5bfcee9 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2278,7 +2278,7 @@  static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 		goto out;
 
 	err = 0;
-	while (!err) {
+	while (!err && !btrfs_fs_closing(fs_info)) {
 		trans = btrfs_start_transaction(fs_info->fs_root, 0);
 		if (IS_ERR(trans)) {
 			err = PTR_ERR(trans);
@@ -2301,7 +2301,8 @@  out:
 	btrfs_free_path(path);
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-	fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
+	if (!btrfs_fs_closing(fs_info))
+		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
 
 	if (err > 0 &&
 	    fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT) {
@@ -2330,7 +2331,9 @@  out:
 	}
 	btrfs_end_transaction(trans, fs_info->quota_root);
 
-	if (err >= 0) {
+	if (btrfs_fs_closing(fs_info)) {
+		btrfs_info(fs_info, "qgroup scan paused");
+	} else if (err >= 0) {
 		btrfs_info(fs_info, "qgroup scan completed%s",
 			err > 0 ? " (inconsistency flag cleared)" : "");
 	} else {