Message ID | 1359579978-2916-1-git-send-email-jbacik@fusionio.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Jan 30, 2013 at 04:06:18PM -0500, Josef Bacik wrote: > I hit this error when reproducing a bug that would end in a transaction > abort. We take the delayed ref head's mutex to keep anybody from processing > it while we're destroying it, but we fail to drop the mutex before we carry > on and free the damned thing. Fix this by doing the remove logic for the > head ourselves and unlock the mutex, that way we can avoid use after free's > or hung tasks waiting on that mutex to come back so they know the delayed > ref completed. Thanks, > > + ref->in_tree = 0; > + rb_erase(&ref->rb_node, &delayed_refs->root); > + delayed_refs->num_entries--; > + mutex_unlock(&head->mutex); > + } else { > + ref->in_tree = 0; > + rb_erase(&ref->rb_node, &delayed_refs->root); > + delayed_refs->num_entries--; Do you really need to duplicate the removal under the mutex? Isn't all that protected by the delayed_refs->lock? Isn't it enough to just add the mutex_unlock()? - z -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 30, 2013 at 03:16:35PM -0700, Zach Brown wrote: > On Wed, Jan 30, 2013 at 04:06:18PM -0500, Josef Bacik wrote: > > I hit this error when reproducing a bug that would end in a transaction > > abort. We take the delayed ref head's mutex to keep anybody from processing > > it while we're destroying it, but we fail to drop the mutex before we carry > > on and free the damned thing. Fix this by doing the remove logic for the > > head ourselves and unlock the mutex, that way we can avoid use after free's > > or hung tasks waiting on that mutex to come back so they know the delayed > > ref completed. Thanks, > > > > > + ref->in_tree = 0; > > + rb_erase(&ref->rb_node, &delayed_refs->root); > > + delayed_refs->num_entries--; > > + mutex_unlock(&head->mutex); > > + } else { > > + ref->in_tree = 0; > > + rb_erase(&ref->rb_node, &delayed_refs->root); > > + delayed_refs->num_entries--; > > Do you really need to duplicate the removal under the mutex? Isn't all > that protected by the delayed_refs->lock? > > Isn't it enough to just add the mutex_unlock()? > Yeah I could re-arrange I guess, either way it's not pretty but I guess not duplicating stuff is better. Thanks, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 12a9547..51bff86 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3640,10 +3640,15 @@ int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, if (list_empty(&head->cluster)) delayed_refs->num_heads_ready--; list_del_init(&head->cluster); + ref->in_tree = 0; + rb_erase(&ref->rb_node, &delayed_refs->root); + delayed_refs->num_entries--; + mutex_unlock(&head->mutex); + } else { + ref->in_tree = 0; + rb_erase(&ref->rb_node, &delayed_refs->root); + delayed_refs->num_entries--; } - ref->in_tree = 0; - rb_erase(&ref->rb_node, &delayed_refs->root); - delayed_refs->num_entries--; spin_unlock(&delayed_refs->lock); btrfs_put_delayed_ref(ref);
I hit this error when reproducing a bug that would end in a transaction abort. We take the delayed ref head's mutex to keep anybody from processing it while we're destroying it, but we fail to drop the mutex before we carry on and free the damned thing. Fix this by doing the remove logic for the head ourselves and unlock the mutex, that way we can avoid use after free's or hung tasks waiting on that mutex to come back so they know the delayed ref completed. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> --- fs/btrfs/disk-io.c | 11 ++++++++--- 1 files changed, 8 insertions(+), 3 deletions(-)