mbox series

[00/14] More error handling and BUG_ON cleanups

Message ID cover.1707382595.git.dsterba@suse.com (mailing list archive)
Headers show
Series More error handling and BUG_ON cleanups | expand

Message

David Sterba Feb. 8, 2024, 8:59 a.m. UTC
Continuation of https://lore.kernel.org/linux-btrfs/cover.1706130791.git.dsterba@suse.com/
BUG_ON converted to error handling, removed or moved.

David Sterba (14):
  btrfs: push errors up from add_async_extent()
  btrfs: update comment and drop assertion in extent item lookup in
    find_parent_nodes()
  btrfs: handle invalid extent item reference found in
    extent_from_logical()
  btrfs: handle invalid extent item reference found in
    find_first_extent_item()
  btrfs: handle invalid root reference found in may_destroy_subvol()
  btrfs: send: handle unexpected data in header buffer in begin_cmd()
  btrfs: send: handle unexpected inode in header process_recorded_refs()
  btrfs: send: handle path ref underflow in header iterate_inode_ref()
  btrfs: change BUG_ON to assertion in tree_move_down()
  btrfs: change BUG_ONs to assertions in btrfs_qgroup_trace_subtree()
  btrfs: delete pointless BUG_ON check on quota root in
    btrfs_qgroup_account_extent()
  btrfs: delete pointless BUG_ONs on extent item size
  btrfs: handle unexpected parent block offset in
    btrfs_alloc_tree_block()
  btrfs: delete BUG_ON in btrfs_init_locked_inode()

 fs/btrfs/backref.c     | 20 +++++++++++++++-----
 fs/btrfs/extent-tree.c | 12 ++++++++++--
 fs/btrfs/inode.c       | 23 ++++++++++++++++-------
 fs/btrfs/qgroup.c      |  6 ++----
 fs/btrfs/scrub.c       |  9 ++++++++-
 fs/btrfs/send.c        | 27 +++++++++++++++++++++++----
 6 files changed, 74 insertions(+), 23 deletions(-)

Comments

Qu Wenruo Feb. 8, 2024, 9:50 a.m. UTC | #1
On 2024/2/8 19:29, David Sterba wrote:
> Continuation of https://lore.kernel.org/linux-btrfs/cover.1706130791.git.dsterba@suse.com/
> BUG_ON converted to error handling, removed or moved.
>
> David Sterba (14):
>    btrfs: push errors up from add_async_extent()
>    btrfs: update comment and drop assertion in extent item lookup in
>      find_parent_nodes()
>    btrfs: handle invalid extent item reference found in
>      extent_from_logical()

The above ones look good to me.

>    btrfs: handle invalid extent item reference found in
>      find_first_extent_item()

For bad extent item offsets, I'm totally fine with the current -EUCLEAN
even it doesn't have any error message.
As tree-checker has already verified the extent items, thus it's way
much harder to get runtime corruption to create a -1 key.offset.


>    btrfs: handle invalid root reference found in may_destroy_subvol()

But for this, I strongly recommend a new tree-checker entry for
ROOT_REF, before returning -EUCLEAN without an error message.

>    btrfs: send: handle unexpected data in header buffer in begin_cmd()
>    btrfs: send: handle unexpected inode in header process_recorded_refs()
>    btrfs: send: handle path ref underflow in header iterate_inode_ref()
>    btrfs: change BUG_ON to assertion in tree_move_down()
>    btrfs: change BUG_ONs to assertions in btrfs_qgroup_trace_subtree()
>    btrfs: delete pointless BUG_ON check on quota root in
>      btrfs_qgroup_account_extent()
>    btrfs: delete pointless BUG_ONs on extent item size

The above ones look good to me.

>    btrfs: handle unexpected parent block offset in
>      btrfs_alloc_tree_block()

For this one, I'd prefer one error message or at least something noisy
enough (aka, ASSERT()), just to make life a little easier when we
screwed up in our development environment.

Thanks,
Qu
>    btrfs: delete BUG_ON in btrfs_init_locked_inode()
>
>   fs/btrfs/backref.c     | 20 +++++++++++++++-----
>   fs/btrfs/extent-tree.c | 12 ++++++++++--
>   fs/btrfs/inode.c       | 23 ++++++++++++++++-------
>   fs/btrfs/qgroup.c      |  6 ++----
>   fs/btrfs/scrub.c       |  9 ++++++++-
>   fs/btrfs/send.c        | 27 +++++++++++++++++++++++----
>   6 files changed, 74 insertions(+), 23 deletions(-)
>
David Sterba Feb. 9, 2024, 2:03 a.m. UTC | #2
On Thu, Feb 08, 2024 at 08:20:55PM +1030, Qu Wenruo wrote:
> >    btrfs: handle invalid extent item reference found in
> >      find_first_extent_item()
> 
> For bad extent item offsets, I'm totally fine with the current -EUCLEAN
> even it doesn't have any error message.
> As tree-checker has already verified the extent items, thus it's way
> much harder to get runtime corruption to create a -1 key.offset.

Yeah, this is the same as for the previous patchset, tree-checker
catches things early and the new error handling makes sure any
improbable error will not propagate further.

Regarding the messages, I'll add them in the futre, currently I'm
analyzing all cases we might need and prototyping the error/warning
message macros. I've been going over BUG_ONs only but we have too many
random WARN_ONs too, this should be better converted to warnings or
assertions with clear purpose and use case.

> >    btrfs: handle invalid root reference found in may_destroy_subvol()
> 
> But for this, I strongly recommend a new tree-checker entry for
> ROOT_REF, before returning -EUCLEAN without an error message.

Right, we don't seem to have a case for BTRFS_ROOT_REF_KEY in
tree-checker, at least the allowed value range could be there.

> >    btrfs: send: handle unexpected data in header buffer in begin_cmd()
> >    btrfs: send: handle unexpected inode in header process_recorded_refs()
> >    btrfs: send: handle path ref underflow in header iterate_inode_ref()
> >    btrfs: change BUG_ON to assertion in tree_move_down()
> >    btrfs: change BUG_ONs to assertions in btrfs_qgroup_trace_subtree()
> >    btrfs: delete pointless BUG_ON check on quota root in
> >      btrfs_qgroup_account_extent()
> >    btrfs: delete pointless BUG_ONs on extent item size
> 
> The above ones look good to me.
> 
> >    btrfs: handle unexpected parent block offset in
> >      btrfs_alloc_tree_block()
> 
> For this one, I'd prefer one error message or at least something noisy
> enough (aka, ASSERT()), just to make life a little easier when we
> screwed up in our development environment.

Well, there are too many BUG_ONs that somebody just sprinkled over the
code, it still documents the invariants but in a very random way.
Getting rid of them is much harder as it requires reasoning in a much
broader context. I looked at all callers, this looks like a proper API
usage check rather than a random error that would need a message.

Each call site with EUCLEAN can be revisited if we really want to get
noticed about that or not, but I'm not doing that right now unless I'm
convince we do from the context.
David Sterba Feb. 13, 2024, 7:05 p.m. UTC | #3
On Thu, Feb 08, 2024 at 08:20:55PM +1030, Qu Wenruo wrote:
> On 2024/2/8 19:29, David Sterba wrote:
> > Continuation of https://lore.kernel.org/linux-btrfs/cover.1706130791.git.dsterba@suse.com/
> >    btrfs: handle unexpected parent block offset in
> >      btrfs_alloc_tree_block()
> 
> For this one, I'd prefer one error message or at least something noisy
> enough (aka, ASSERT()), just to make life a little easier when we
> screwed up in our development environment.

I'll drop this patch for now and will add it to another series dealing
with the error handling and adding the tree-checker and/or verbose
messages.