Message ID | 0a4d66f6922f5219c7c8c37d88a919304abdbb55.1717416325.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix leak of qgroup extent records after transaction abort | expand |
在 2024/6/3 21:36, fdmanana@kernel.org 写道: > From: Filipe Manana <fdmanana@suse.com> > > Qgroup extent records are created when delayed ref heads are created and > then released after accounting extents at btrfs_qgroup_account_extents(), > called during the transaction commit path. > > If a transaction is aborted we free the qgroup records by calling > btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(), > unless we don't have delayed references. We are incorrectly assuming > that no delayed references means we don't have qgroup extents records. > > We can currently have no delayed references because we ran them all > during a transaction commit and the transaction was aborted after that > due to some error in the commit path. > > So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at > btrfs_destroy_delayed_refs() even if we don't have any delayed references. Will it cause some underflow for delayed_refs->num_entries? As in the rb tree iteration code, we would try to decrease delayed_refs->num_entries again. Thanks, Qu > > Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com > Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ > Signed-off-by: Filipe Manana <fdmanana@suse.com> > --- > fs/btrfs/disk-io.c | 10 +--------- > 1 file changed, 1 insertion(+), 9 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 8693893744a0..b1daaaec0614 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, > struct btrfs_fs_info *fs_info) > { > struct rb_node *node; > - struct btrfs_delayed_ref_root *delayed_refs; > + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; > struct btrfs_delayed_ref_node *ref; > > - delayed_refs = &trans->delayed_refs; > - > spin_lock(&delayed_refs->lock); > - if (atomic_read(&delayed_refs->num_entries) == 0) { > - spin_unlock(&delayed_refs->lock); > - btrfs_debug(fs_info, "delayed_refs has NO entry"); > - return; > - } > - > while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { > struct btrfs_delayed_ref_head *head; > struct rb_node *n;
On Mon, Jun 3, 2024 at 11:58 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > 在 2024/6/3 21:36, fdmanana@kernel.org 写道: > > From: Filipe Manana <fdmanana@suse.com> > > > > Qgroup extent records are created when delayed ref heads are created and > > then released after accounting extents at btrfs_qgroup_account_extents(), > > called during the transaction commit path. > > > > If a transaction is aborted we free the qgroup records by calling > > btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(), > > unless we don't have delayed references. We are incorrectly assuming > > that no delayed references means we don't have qgroup extents records. > > > > We can currently have no delayed references because we ran them all > > during a transaction commit and the transaction was aborted after that > > due to some error in the commit path. > > > > So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at > > btrfs_destroy_delayed_refs() even if we don't have any delayed references. > > Will it cause some underflow for delayed_refs->num_entries? > > As in the rb tree iteration code, we would try to decrease > delayed_refs->num_entries again. What underflow, where? btrfs_qgroup_destroy_extent_records() doesn't do anything to the counter (or delayed refs). Or are you seeing that delayed_refs->num_entries can be 0 while the delayed_refs->href_root rb tree is not empty? How is that possible? > > Thanks, > Qu > > > > Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com > > Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > --- > > fs/btrfs/disk-io.c | 10 +--------- > > 1 file changed, 1 insertion(+), 9 deletions(-) > > > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > > index 8693893744a0..b1daaaec0614 100644 > > --- a/fs/btrfs/disk-io.c > > +++ b/fs/btrfs/disk-io.c > > @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, > > struct btrfs_fs_info *fs_info) > > { > > struct rb_node *node; > > - struct btrfs_delayed_ref_root *delayed_refs; > > + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; > > struct btrfs_delayed_ref_node *ref; > > > > - delayed_refs = &trans->delayed_refs; > > - > > spin_lock(&delayed_refs->lock); > > - if (atomic_read(&delayed_refs->num_entries) == 0) { > > - spin_unlock(&delayed_refs->lock); > > - btrfs_debug(fs_info, "delayed_refs has NO entry"); > > - return; > > - } > > - > > while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { > > struct btrfs_delayed_ref_head *head; > > struct rb_node *n;
在 2024/6/4 08:39, Filipe Manana 写道: > On Mon, Jun 3, 2024 at 11:58 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: [...] >>> We can currently have no delayed references because we ran them all >>> during a transaction commit and the transaction was aborted after that >>> due to some error in the commit path. >>> >>> So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at >>> btrfs_destroy_delayed_refs() even if we don't have any delayed references. >> >> Will it cause some underflow for delayed_refs->num_entries? >> >> As in the rb tree iteration code, we would try to decrease >> delayed_refs->num_entries again. > > What underflow, where? > > btrfs_qgroup_destroy_extent_records() doesn't do anything to the > counter (or delayed refs). > > Or are you seeing that delayed_refs->num_entries can be 0 while the > delayed_refs->href_root rb tree is not empty? > How is that possible? Never mind, I was originally referring to the "atomic_dec()" call inside "while ((n = rb_first_cached())" loop. But btrfs_run_delayed_refs_for_head() has ensured the entry is properly removed from ref_tree before decreasing the "delayed_refs->num_entries", it should be safe. I'd prefer to call btrfs_qgroup_destory_extent_records() inside the "if (atomic_read() == 0)" branch to be a little more easier to read. But it's only a preference. Thanks, Qu > >> >> Thanks, >> Qu >>> >>> Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com >>> Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ >>> Signed-off-by: Filipe Manana <fdmanana@suse.com> >>> --- >>> fs/btrfs/disk-io.c | 10 +--------- >>> 1 file changed, 1 insertion(+), 9 deletions(-) >>> >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>> index 8693893744a0..b1daaaec0614 100644 >>> --- a/fs/btrfs/disk-io.c >>> +++ b/fs/btrfs/disk-io.c >>> @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, >>> struct btrfs_fs_info *fs_info) >>> { >>> struct rb_node *node; >>> - struct btrfs_delayed_ref_root *delayed_refs; >>> + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; >>> struct btrfs_delayed_ref_node *ref; >>> >>> - delayed_refs = &trans->delayed_refs; >>> - >>> spin_lock(&delayed_refs->lock); >>> - if (atomic_read(&delayed_refs->num_entries) == 0) { >>> - spin_unlock(&delayed_refs->lock); >>> - btrfs_debug(fs_info, "delayed_refs has NO entry"); >>> - return; >>> - } >>> - >>> while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { >>> struct btrfs_delayed_ref_head *head; >>> struct rb_node *n;
On Tue, Jun 4, 2024 at 12:21 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > 在 2024/6/4 08:39, Filipe Manana 写道: > > On Mon, Jun 3, 2024 at 11:58 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > [...] > >>> We can currently have no delayed references because we ran them all > >>> during a transaction commit and the transaction was aborted after that > >>> due to some error in the commit path. > >>> > >>> So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at > >>> btrfs_destroy_delayed_refs() even if we don't have any delayed references. > >> > >> Will it cause some underflow for delayed_refs->num_entries? > >> > >> As in the rb tree iteration code, we would try to decrease > >> delayed_refs->num_entries again. > > > > What underflow, where? > > > > btrfs_qgroup_destroy_extent_records() doesn't do anything to the > > counter (or delayed refs). > > > > Or are you seeing that delayed_refs->num_entries can be 0 while the > > delayed_refs->href_root rb tree is not empty? > > How is that possible? > > Never mind, I was originally referring to the "atomic_dec()" call inside > "while ((n = rb_first_cached())" loop. > > But btrfs_run_delayed_refs_for_head() has ensured the entry is properly > removed from ref_tree before decreasing the "delayed_refs->num_entries", > it should be safe. > > I'd prefer to call btrfs_qgroup_destory_extent_records() inside the "if > (atomic_read() == 0)" branch to be a little more easier to read. > But it's only a preference. Maybe it makes the diff more immediate to understand. But the final code is certainly less verbose and clear enough: while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { (...) } btrfs_qgroup_destroy_extent_records(trans); As opposed to: if (atomic_read(&delayed_refs->num_entries) == 0) { btrfs_qgroup_destroy_extent_records(trans); spin_unlock(&delayed_refs->lock); return; } while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { (...) } btrfs_qgroup_destroy_extent_records(trans); More code for absolutely nothing, not even better readability. > > Thanks, > Qu > > > > >> > >> Thanks, > >> Qu > >>> > >>> Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com > >>> Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ > >>> Signed-off-by: Filipe Manana <fdmanana@suse.com> > >>> --- > >>> fs/btrfs/disk-io.c | 10 +--------- > >>> 1 file changed, 1 insertion(+), 9 deletions(-) > >>> > >>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > >>> index 8693893744a0..b1daaaec0614 100644 > >>> --- a/fs/btrfs/disk-io.c > >>> +++ b/fs/btrfs/disk-io.c > >>> @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, > >>> struct btrfs_fs_info *fs_info) > >>> { > >>> struct rb_node *node; > >>> - struct btrfs_delayed_ref_root *delayed_refs; > >>> + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; > >>> struct btrfs_delayed_ref_node *ref; > >>> > >>> - delayed_refs = &trans->delayed_refs; > >>> - > >>> spin_lock(&delayed_refs->lock); > >>> - if (atomic_read(&delayed_refs->num_entries) == 0) { > >>> - spin_unlock(&delayed_refs->lock); > >>> - btrfs_debug(fs_info, "delayed_refs has NO entry"); > >>> - return; > >>> - } > >>> - > >>> while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { > >>> struct btrfs_delayed_ref_head *head; > >>> struct rb_node *n;
On Mon, Jun 03, 2024 at 01:06:13PM +0100, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > Qgroup extent records are created when delayed ref heads are created and > then released after accounting extents at btrfs_qgroup_account_extents(), > called during the transaction commit path. > > If a transaction is aborted we free the qgroup records by calling > btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(), > unless we don't have delayed references. We are incorrectly assuming > that no delayed references means we don't have qgroup extents records. > > We can currently have no delayed references because we ran them all > during a transaction commit and the transaction was aborted after that > due to some error in the commit path. > > So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at > btrfs_destroy_delayed_refs() even if we don't have any delayed references. > > Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com > Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ > Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks, Josef
在 2024/6/4 08:58, Filipe Manana 写道: > On Tue, Jun 4, 2024 at 12:21 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: [...] >> >> I'd prefer to call btrfs_qgroup_destory_extent_records() inside the "if >> (atomic_read() == 0)" branch to be a little more easier to read. >> But it's only a preference. > > Maybe it makes the diff more immediate to understand. > But the final code is certainly less verbose and clear enough: > > while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { > (...) > } > btrfs_qgroup_destroy_extent_records(trans); > > As opposed to: > > if (atomic_read(&delayed_refs->num_entries) == 0) { > btrfs_qgroup_destroy_extent_records(trans); > spin_unlock(&delayed_refs->lock); > return; > } > > while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { > (...) > } > btrfs_qgroup_destroy_extent_records(trans); > > > More code for absolutely nothing, not even better readability. It looks like I'm still tending to do empty check, other than let the loop to handle empty cases. And indeed the extra empty check is only going to bloat the code without much need, and I should change my tendency. Forgot to give a reviewed by tag: Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > > >> >> Thanks, >> Qu >> >>> >>>> >>>> Thanks, >>>> Qu >>>>> >>>>> Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com >>>>> Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ >>>>> Signed-off-by: Filipe Manana <fdmanana@suse.com> >>>>> --- >>>>> fs/btrfs/disk-io.c | 10 +--------- >>>>> 1 file changed, 1 insertion(+), 9 deletions(-) >>>>> >>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >>>>> index 8693893744a0..b1daaaec0614 100644 >>>>> --- a/fs/btrfs/disk-io.c >>>>> +++ b/fs/btrfs/disk-io.c >>>>> @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, >>>>> struct btrfs_fs_info *fs_info) >>>>> { >>>>> struct rb_node *node; >>>>> - struct btrfs_delayed_ref_root *delayed_refs; >>>>> + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; >>>>> struct btrfs_delayed_ref_node *ref; >>>>> >>>>> - delayed_refs = &trans->delayed_refs; >>>>> - >>>>> spin_lock(&delayed_refs->lock); >>>>> - if (atomic_read(&delayed_refs->num_entries) == 0) { >>>>> - spin_unlock(&delayed_refs->lock); >>>>> - btrfs_debug(fs_info, "delayed_refs has NO entry"); >>>>> - return; >>>>> - } >>>>> - >>>>> while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { >>>>> struct btrfs_delayed_ref_head *head; >>>>> struct rb_node *n; >
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 8693893744a0..b1daaaec0614 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4522,18 +4522,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, struct btrfs_fs_info *fs_info) { struct rb_node *node; - struct btrfs_delayed_ref_root *delayed_refs; + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; struct btrfs_delayed_ref_node *ref; - delayed_refs = &trans->delayed_refs; - spin_lock(&delayed_refs->lock); - if (atomic_read(&delayed_refs->num_entries) == 0) { - spin_unlock(&delayed_refs->lock); - btrfs_debug(fs_info, "delayed_refs has NO entry"); - return; - } - while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { struct btrfs_delayed_ref_head *head; struct rb_node *n;