Message ID | 1473898977-29406-1-git-send-email-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | Accepted |
On Wed, Sep 14, 2016 at 05:22:57PM -0700, Liu Bo wrote:
> During updating btree, we could push items between sibling
> nodes/leaves, for leaves data sections starts reversely from
> the end of the block while for nodes we only have key pairs
> which are stored one by one from the start of the block.
>
> So we could do try to push key pairs from one node to the next
> node right in the tree, and after that, we update the node's
> nritems to reflect the correct end while leaving the stale
> content in the node. One may intentionally corrupt the fs
> image and access the stale content by bumping the nritems and
> causes various crashes.
>
> This takes the in-memory @nritems as the correct one and
> gets to memset the unused part of a btree node.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: David Sterba <dsterba@suse.com>

> ---
>  fs/btrfs/extent_io.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index c2325c3..56c9dee 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>  	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
>  		bio_flags = EXTENT_BIO_TREE_LOG;
>
> +	/* set btree node beyond nritems with 0 to avoid stale content */
> +	if (btrfs_header_level(eb) > 0) {

We can do the same for leaves.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 20, 2016 at 03:16:36PM +0200, David Sterba wrote:
> On Wed, Sep 14, 2016 at 05:22:57PM -0700, Liu Bo wrote:
> > During updating btree, we could push items between sibling
> > nodes/leaves, for leaves data sections starts reversely from
> > the end of the block while for nodes we only have key pairs
> > which are stored one by one from the start of the block.
> >
> > So we could do try to push key pairs from one node to the next
> > node right in the tree, and after that, we update the node's
> > nritems to reflect the correct end while leaving the stale
> > content in the node. One may intentionally corrupt the fs
> > image and access the stale content by bumping the nritems and
> > causes various crashes.
> >
> > This takes the in-memory @nritems as the correct one and
> > gets to memset the unused part of a btree node.
> >
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
>
> Reviewed-by: David Sterba <dsterba@suse.com>
>
> > ---
> >  fs/btrfs/extent_io.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index c2325c3..56c9dee 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
> >  	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
> >  		bio_flags = EXTENT_BIO_TREE_LOG;
> >
> > +	/* set btree node beyond nritems with 0 to avoid stale content */
> > +	if (btrfs_header_level(eb) > 0) {
>
> We can do the same for leaves.

In theory, the problem also applies for leaves, but I haven't got a
reproducer for leaf case.

So I'll update a v2 with leaf memset, please review that part more
carefully :)

Thanks,

-liubo
On Tue, Sep 20, 2016 at 10:57:41AM -0700, Liu Bo wrote:
> On Tue, Sep 20, 2016 at 03:16:36PM +0200, David Sterba wrote:
> > On Wed, Sep 14, 2016 at 05:22:57PM -0700, Liu Bo wrote:
> > > During updating btree, we could push items between sibling
> > > nodes/leaves, for leaves data sections starts reversely from
> > > the end of the block while for nodes we only have key pairs
> > > which are stored one by one from the start of the block.
> > >
> > > So we could do try to push key pairs from one node to the next
> > > node right in the tree, and after that, we update the node's
> > > nritems to reflect the correct end while leaving the stale
> > > content in the node. One may intentionally corrupt the fs
> > > image and access the stale content by bumping the nritems and
> > > causes various crashes.
> > >
> > > This takes the in-memory @nritems as the correct one and
> > > gets to memset the unused part of a btree node.
> > >
> > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> >
> > Reviewed-by: David Sterba <dsterba@suse.com>
> >
> > > ---
> > >  fs/btrfs/extent_io.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > > index c2325c3..56c9dee 100644
> > > --- a/fs/btrfs/extent_io.c
> > > +++ b/fs/btrfs/extent_io.c
> > > @@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
> > >  	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
> > >  		bio_flags = EXTENT_BIO_TREE_LOG;
> > >
> > > +	/* set btree node beyond nritems with 0 to avoid stale content */
> > > +	if (btrfs_header_level(eb) > 0) {
> >
> > We can do the same for leaves.
>
> In theory, the problem also applies for leaves, but I haven't got a
> reproducer for leaf case.
>
> So I'll update a v2 with leaf memset, please review that part more
> carefully :)

You can keep it a separate patch, this one is fine. I didn't expect to
reproduce a crash with a bogus nritems in a leaf but rather apply the
same on a leaf buffer. The magic formula is (please verify)

start = nr * sizeof(struct btrfs_disk_key);
end = nr ? btrfs_item_offset(eb, btrfs_item_nr(nr - 1)) : eb->len;
On 09/21/2016 04:04 AM, David Sterba wrote:
> On Tue, Sep 20, 2016 at 10:57:41AM -0700, Liu Bo wrote:
>> On Tue, Sep 20, 2016 at 03:16:36PM +0200, David Sterba wrote:
>>> On Wed, Sep 14, 2016 at 05:22:57PM -0700, Liu Bo wrote:
>>>> During updating btree, we could push items between sibling
>>>> nodes/leaves, for leaves data sections starts reversely from
>>>> the end of the block while for nodes we only have key pairs
>>>> which are stored one by one from the start of the block.
>>>>
>>>> So we could do try to push key pairs from one node to the next
>>>> node right in the tree, and after that, we update the node's
>>>> nritems to reflect the correct end while leaving the stale
>>>> content in the node. One may intentionally corrupt the fs
>>>> image and access the stale content by bumping the nritems and
>>>> causes various crashes.
>>>>
>>>> This takes the in-memory @nritems as the correct one and
>>>> gets to memset the unused part of a btree node.
>>>>
>>>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
>>>
>>> Reviewed-by: David Sterba <dsterba@suse.com>
>>>
>>>> ---
>>>>  fs/btrfs/extent_io.c | 11 +++++++++++
>>>>  1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>>> index c2325c3..56c9dee 100644
>>>> --- a/fs/btrfs/extent_io.c
>>>> +++ b/fs/btrfs/extent_io.c
>>>> @@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>>>>  	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
>>>>  		bio_flags = EXTENT_BIO_TREE_LOG;
>>>>
>>>> +	/* set btree node beyond nritems with 0 to avoid stale content */
>>>> +	if (btrfs_header_level(eb) > 0) {
>>>
>>> We can do the same for leaves.
>>
>> In theory, the problem also applies for leaves, but I haven't got a
>> reproducer for leaf case.
>>
>> So I'll update a v2 with leaf memset, please review that part more
>> carefully :)
>
> You can keep it a separate patch, this one is fine. I didn't expect to
> reproduce a crash with a bogus nritems in a leaf but rather apply the
> same on a leaf buffer. The magic formula is (please verify)
>
> start = nr * sizeof(struct btrfs_disk_key);
> end = nr ? btrfs_item_offset(eb, btrfs_item_nr(nr - 1)) : eb->len;
>

This is the start/end of the memset for the leaves? Doesn't look right,
since leaves looks like this:

item headers 0,1,2 .. N -> [ empty space ] <- item data N, ... 2, 1, 0

The empty space is from the end of header N to the start of data N.

-chris
On Wed, Sep 21, 2016 at 09:09:32AM -0400, Chris Mason wrote:
>
>
> On 09/21/2016 04:04 AM, David Sterba wrote:
> > On Tue, Sep 20, 2016 at 10:57:41AM -0700, Liu Bo wrote:
> > > On Tue, Sep 20, 2016 at 03:16:36PM +0200, David Sterba wrote:
> > > > On Wed, Sep 14, 2016 at 05:22:57PM -0700, Liu Bo wrote:
> > > > > During updating btree, we could push items between sibling
> > > > > nodes/leaves, for leaves data sections starts reversely from
> > > > > the end of the block while for nodes we only have key pairs
> > > > > which are stored one by one from the start of the block.
> > > > >
> > > > > So we could do try to push key pairs from one node to the next
> > > > > node right in the tree, and after that, we update the node's
> > > > > nritems to reflect the correct end while leaving the stale
> > > > > content in the node. One may intentionally corrupt the fs
> > > > > image and access the stale content by bumping the nritems and
> > > > > causes various crashes.
> > > > >
> > > > > This takes the in-memory @nritems as the correct one and
> > > > > gets to memset the unused part of a btree node.
> > > > >
> > > > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > > >
> > > > Reviewed-by: David Sterba <dsterba@suse.com>
> > > >
> > > > > ---
> > > > >  fs/btrfs/extent_io.c | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > > > > index c2325c3..56c9dee 100644
> > > > > --- a/fs/btrfs/extent_io.c
> > > > > +++ b/fs/btrfs/extent_io.c
> > > > > @@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
> > > > >  	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
> > > > >  		bio_flags = EXTENT_BIO_TREE_LOG;
> > > > >
> > > > > +	/* set btree node beyond nritems with 0 to avoid stale content */
> > > > > +	if (btrfs_header_level(eb) > 0) {
> > > >
> > > > We can do the same for leaves.
> > >
> > > In theory, the problem also applies for leaves, but I haven't got a
> > > reproducer for leaf case.
> > >
> > > So I'll update a v2 with leaf memset, please review that part more
> > > carefully :)
> >
> > You can keep it a separate patch, this one is fine. I didn't expect to
> > reproduce a crash with a bogus nritems in a leaf but rather apply the
> > same on a leaf buffer. The magic formula is (please verify)
> >
> > start = nr * sizeof(struct btrfs_disk_key);
> > end = nr ? btrfs_item_offset(eb, btrfs_item_nr(nr - 1)) : eb->len;
> >
>
> This is the start/end of the memset for the leaves? Doesn't look right,
> since leaves looks like this:
>
> item headers 0,1,2 .. N -> [ empty space ] <- item data N, ... 2, 1, 0
>
> The empty space is from the end of header N to the start of data N.

Right, we've already got a good helper "leaf_data_end".

Thanks,

-liubo
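The leaf layout in Chris Mason's diagram puts the free space between the end of the last item header and the start of the last item's data. A minimal userspace sketch of that range computation is given here in C. It is only a toy model: the `toy_*` names, the block, header and item-header sizes, and the side array of data offsets are assumptions made for illustration, and none of them are the kernel's structures or the "leaf_data_end" helper mentioned in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy constants: assumptions for this sketch, not values from the thread. */
enum {
	TOY_BLOCK_SIZE  = 256, /* size of one tree block */
	TOY_HEADER_SIZE = 16,  /* stand-in for the block header */
	TOY_ITEM_SIZE   = 8,   /* stand-in for one item header */
	TOY_MAX_ITEMS   = 8
};

struct toy_leaf {
	uint32_t nritems;                 /* in-memory item count, taken as correct */
	uint32_t data_off[TOY_MAX_ITEMS]; /* absolute start of each item's data */
	uint8_t  block[TOY_BLOCK_SIZE];   /* raw block contents */
};

/* First byte after the last item header. */
static uint32_t toy_leaf_gap_start(const struct toy_leaf *leaf)
{
	return TOY_HEADER_SIZE + leaf->nritems * TOY_ITEM_SIZE;
}

/*
 * Start of the last item's data, which sits lowest in the block because
 * data grows backwards from the end. An empty leaf has no data at all,
 * so the gap runs to the end of the block.
 */
static uint32_t toy_leaf_gap_end(const struct toy_leaf *leaf)
{
	if (leaf->nritems == 0)
		return TOY_BLOCK_SIZE;
	return leaf->data_off[leaf->nritems - 1];
}

/* Zero only the gap; returns the number of bytes cleared. */
static uint32_t toy_zero_leaf_gap(struct toy_leaf *leaf)
{
	uint32_t start = toy_leaf_gap_start(leaf);
	uint32_t end = toy_leaf_gap_end(leaf);

	assert(start <= end && end <= TOY_BLOCK_SIZE);
	memset(leaf->block + start, 0, end - start);
	return end - start;
}
```

Unlike the node case, the range to clear does not extend to the end of the block, since valid item data lives there; only the bytes between the two growing regions are unused.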
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c2325c3..56c9dee 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3732,6 +3732,17 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 	if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID)
 		bio_flags = EXTENT_BIO_TREE_LOG;
 
+	/* set btree node beyond nritems with 0 to avoid stale content */
+	if (btrfs_header_level(eb) > 0) {
+		u32 nritems;
+		unsigned long end;
+
+		nritems = btrfs_header_nritems(eb);
+		end = btrfs_node_key_ptr_offset(nritems);
+
+		memset_extent_buffer(eb, 0, end, eb->len - end);
+	}
+
 	for (i = 0; i < num_pages; i++) {
 		struct page *p = eb->pages[i];
While updating the btree, we could push items between sibling
nodes/leaves. For leaves, the data sections start from the end of the
block and grow backwards, while for nodes we only have key pairs,
which are stored one by one from the start of the block.

So we could try to push key pairs from one node to the next node to
its right in the tree, and after that, we update the node's nritems to
reflect the correct end while leaving the stale content in the node.
One may intentionally corrupt the fs image and access the stale
content by bumping the nritems, which causes various crashes.

This takes the in-memory @nritems as the correct one and memsets the
unused part of a btree node.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_io.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
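The scenario in the commit message can be illustrated with a small userspace model in C: pushing key pointers to a sibling lowers nritems but leaves the old bytes in place, and clearing everything past the slot at nritems removes them. This is a sketch under stated assumptions. The `toy_*` names, the sizes, and the use of a bare 64-bit value as a key pointer are all hypothetical and stand in for the kernel's extent buffer helpers rather than reproducing them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy constants: assumptions for this sketch only. */
enum {
	TOY_NODE_SIZE   = 256, /* size of one tree block */
	TOY_NODE_HEADER = 16,  /* stand-in for the block header */
	TOY_PTR_SIZE    = 8    /* one key pointer slot, here a bare uint64_t */
};

struct toy_node {
	uint32_t nritems;              /* in-memory count of valid slots */
	uint8_t  block[TOY_NODE_SIZE]; /* raw block contents */
};

/* Byte offset of a slot; plays the role of the offset helper in the patch. */
static uint32_t toy_key_ptr_offset(uint32_t slot)
{
	return TOY_NODE_HEADER + slot * TOY_PTR_SIZE;
}

static void toy_set_ptr(struct toy_node *n, uint32_t slot, uint64_t val)
{
	memcpy(n->block + toy_key_ptr_offset(slot), &val, sizeof(val));
}

static uint64_t toy_get_ptr(const struct toy_node *n, uint32_t slot)
{
	uint64_t val;

	memcpy(&val, n->block + toy_key_ptr_offset(slot), sizeof(val));
	return val;
}

/* Move the last `count` pointers of src to the front of its right sibling. */
static void toy_push_right(struct toy_node *src, struct toy_node *dst,
			   uint32_t count)
{
	assert(count <= src->nritems);
	assert(toy_key_ptr_offset(dst->nritems + count) <= TOY_NODE_SIZE);

	/* Shift dst's existing pointers up to make room at the front. */
	memmove(dst->block + toy_key_ptr_offset(count),
		dst->block + toy_key_ptr_offset(0),
		dst->nritems * TOY_PTR_SIZE);
	memcpy(dst->block + toy_key_ptr_offset(0),
	       src->block + toy_key_ptr_offset(src->nritems - count),
	       count * TOY_PTR_SIZE);
	dst->nritems += count;
	/* Only the count changes; the moved bytes stay behind in src. */
	src->nritems -= count;
}

/* Clear everything from the first unused slot to the end of the block. */
static void toy_zero_node_tail(struct toy_node *n)
{
	uint32_t end = toy_key_ptr_offset(n->nritems);

	memset(n->block + end, 0, TOY_NODE_SIZE - end);
}
```

In this model, a reader that trusts an inflated nritems after `toy_push_right()` would pick up the stale pointer copies; after `toy_zero_node_tail()` the same slots read back as zero.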