Message ID | 9b53f585503429f5c81eeb222f1e2cb8014807f5.1677722020.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] btrfs: handle missing chunk mapping more gracefully | expand |
On 3/2/23 09:54, Qu Wenruo wrote: > [BUG] > During my scrub rework, I did a stupid thing like this: > > bio->bi_iter.bi_sector = stripe->logical; > btrfs_submit_bio(fs_info, bio, stripe->mirror_num); > > Above bi_sector assignment is using logical address directly, which > lacks ">> SECTOR_SHIFT". > > This results a read on a range which has no chunk mapping. > > This results the following crash: > > BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 > assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 > ------------[ cut here ]------------ > > Sure this is all my fault, but this shows a possible problem in real > world, that some bitflip in file extents/tree block can point to > unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT > is not configured, cause invalid pointer. > > [PROBLEMS] > In above call chain, we just don't handle the possible error from > btrfs_get_chunk_map() inside __btrfs_map_block(). > > [FIX] > The fix is pretty straightforward, just replace the ASSERT() with proper > error handling. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > Changelog: > v2: > - Rebased to latest misc-next > The error path in bio.c is already fixed, thus only need to replace > the ASSERT() in __btrfs_map_block(). > --- > fs/btrfs/volumes.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 4d479ac233a4..93bc45001e68 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, > return -EINVAL; > > em = btrfs_get_chunk_map(fs_info, logical, *length); > - ASSERT(!IS_ERR(em)); > + if (IS_ERR(em)) > + return PTR_ERR(em); Consider adding btrfs_err_rl() here. Except for scrub_find_good_copy(), the other functions do not report such errors. Furthermore, scrub_find_good_copy() lack sufficient details for effective debugging in the event of an issue. > > map = em->map_lookup; > data_stripes = nr_data_stripes(map);
On 2023/3/2 14:12, Anand Jain wrote: > On 3/2/23 09:54, Qu Wenruo wrote: >> [BUG] >> During my scrub rework, I did a stupid thing like this: >> >> bio->bi_iter.bi_sector = stripe->logical; >> btrfs_submit_bio(fs_info, bio, stripe->mirror_num); >> >> Above bi_sector assignment is using logical address directly, which >> lacks ">> SECTOR_SHIFT". >> >> This results a read on a range which has no chunk mapping. >> >> This results the following crash: >> >> BTRFS critical (device dm-1): unable to find logical 11274289152 >> length 65536 >> assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 >> ------------[ cut here ]------------ >> >> Sure this is all my fault, but this shows a possible problem in real >> world, that some bitflip in file extents/tree block can point to >> unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT >> is not configured, cause invalid pointer. >> >> [PROBLEMS] >> In above call chain, we just don't handle the possible error from >> btrfs_get_chunk_map() inside __btrfs_map_block(). >> >> [FIX] >> The fix is pretty straightforward, just replace the ASSERT() with proper >> error handling. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> Changelog: >> v2: >> - Rebased to latest misc-next >> The error path in bio.c is already fixed, thus only need to replace >> the ASSERT() in __btrfs_map_block(). >> --- >> fs/btrfs/volumes.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index 4d479ac233a4..93bc45001e68 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info >> *fs_info, enum btrfs_map_op op, >> return -EINVAL; >> em = btrfs_get_chunk_map(fs_info, logical, *length); >> - ASSERT(!IS_ERR(em)); >> + if (IS_ERR(em)) >> + return PTR_ERR(em); > > > Consider adding btrfs_err_rl() here. > Except for scrub_find_good_copy(), the other functions do not report > such errors. > Furthermore, scrub_find_good_copy() lack sufficient details for > effective debugging in the event of an issue. Function btrfs_get_chunk_map() would already output an error message. Thanks, Qu > > >> map = em->map_lookup; >> data_stripes = nr_data_stripes(map); > > > > > >
On 3/2/23 14:49, Qu Wenruo wrote: > > > On 2023/3/2 14:12, Anand Jain wrote: >> On 3/2/23 09:54, Qu Wenruo wrote: >>> [BUG] >>> During my scrub rework, I did a stupid thing like this: >>> >>> bio->bi_iter.bi_sector = stripe->logical; >>> btrfs_submit_bio(fs_info, bio, stripe->mirror_num); >>> >>> Above bi_sector assignment is using logical address directly, which >>> lacks ">> SECTOR_SHIFT". >>> >>> This results a read on a range which has no chunk mapping. >>> >>> This results the following crash: >>> >>> BTRFS critical (device dm-1): unable to find logical 11274289152 >>> length 65536 >>> assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 >>> ------------[ cut here ]------------ >>> >>> Sure this is all my fault, but this shows a possible problem in real >>> world, that some bitflip in file extents/tree block can point to >>> unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT >>> is not configured, cause invalid pointer. >>> >>> [PROBLEMS] >>> In above call chain, we just don't handle the possible error from >>> btrfs_get_chunk_map() inside __btrfs_map_block(). >>> >>> [FIX] >>> The fix is pretty straightforward, just replace the ASSERT() with proper >>> error handling. >>> >>> Signed-off-by: Qu Wenruo <wqu@suse.com> >>> --- >>> Changelog: >>> v2: >>> - Rebased to latest misc-next >>> The error path in bio.c is already fixed, thus only need to replace >>> the ASSERT() in __btrfs_map_block(). >>> --- >>> fs/btrfs/volumes.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 4d479ac233a4..93bc45001e68 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info >>> *fs_info, enum btrfs_map_op op, >>> return -EINVAL; >>> em = btrfs_get_chunk_map(fs_info, logical, *length); >>> - ASSERT(!IS_ERR(em)); >>> + if (IS_ERR(em)) >>> + return PTR_ERR(em); >> >> >> Consider adding btrfs_err_rl() here. >> Except for scrub_find_good_copy(), the other functions do not report >> such errors. >> Furthermore, scrub_find_good_copy() lack sufficient details for >> effective debugging in the event of an issue. > > Function btrfs_get_chunk_map() would already output an error message. Ah. Right.. Reviewed-by: Anand Jain <anand.jain@oracle.com>
On Thu, Mar 02, 2023 at 09:54:12AM +0800, Qu Wenruo wrote: > [BUG] > During my scrub rework, I did a stupid thing like this: > > bio->bi_iter.bi_sector = stripe->logical; > btrfs_submit_bio(fs_info, bio, stripe->mirror_num); > > Above bi_sector assignment is using logical address directly, which > lacks ">> SECTOR_SHIFT". > > This results a read on a range which has no chunk mapping. > > This results the following crash: > > BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 > assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 > ------------[ cut here ]------------ > > Sure this is all my fault, but this shows a possible problem in real > world, that some bitflip in file extents/tree block can point to > unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT > is not configured, cause invalid pointer. > > [PROBLEMS] > In above call chain, we just don't handle the possible error from > btrfs_get_chunk_map() inside __btrfs_map_block(). > > [FIX] > The fix is pretty straightforward, just replace the ASSERT() with proper > error handling. > > Signed-off-by: Qu Wenruo <wqu@suse.com> Added to misc-next, thanks.
On 3/2/23 09:54, Qu Wenruo wrote: > [BUG] > During my scrub rework, I did a stupid thing like this: > > bio->bi_iter.bi_sector = stripe->logical; > btrfs_submit_bio(fs_info, bio, stripe->mirror_num); > > Above bi_sector assignment is using logical address directly, which > lacks ">> SECTOR_SHIFT". > > This results a read on a range which has no chunk mapping. > > This results the following crash: > > BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 > assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 > ------------[ cut here ]------------ > > Sure this is all my fault, but this shows a possible problem in real > world, that some bitflip in file extents/tree block can point to > unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT > is not configured, cause invalid pointer. > > [PROBLEMS] > In above call chain, we just don't handle the possible error from > btrfs_get_chunk_map() inside __btrfs_map_block(). > > [FIX] > The fix is pretty straightforward, just replace the ASSERT() with proper > error handling. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > Changelog: > v2: > - Rebased to latest misc-next > The error path in bio.c is already fixed, thus only need to replace > the ASSERT() in __btrfs_map_block(). > --- > fs/btrfs/volumes.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 4d479ac233a4..93bc45001e68 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, > return -EINVAL; > > em = btrfs_get_chunk_map(fs_info, logical, *length); > - ASSERT(!IS_ERR(em)); > + if (IS_ERR(em)) > + return PTR_ERR(em); > > map = em->map_lookup; > data_stripes = nr_data_stripes(map); This bug affects 6.1.y LTS branch but no backport commit of this fix in 6.1.y branch. Please include it. Thanks.
On 2024/2/11 01:04, Celeste Liu wrote: > On 3/2/23 09:54, Qu Wenruo wrote: >> [BUG] >> During my scrub rework, I did a stupid thing like this: >> >> bio->bi_iter.bi_sector = stripe->logical; >> btrfs_submit_bio(fs_info, bio, stripe->mirror_num); >> >> Above bi_sector assignment is using logical address directly, which >> lacks ">> SECTOR_SHIFT". >> >> This results a read on a range which has no chunk mapping. >> >> This results the following crash: >> >> BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 >> assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 >> ------------[ cut here ]------------ >> >> Sure this is all my fault, but this shows a possible problem in real >> world, that some bitflip in file extents/tree block can point to >> unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT >> is not configured, cause invalid pointer. >> >> [PROBLEMS] >> In above call chain, we just don't handle the possible error from >> btrfs_get_chunk_map() inside __btrfs_map_block(). >> >> [FIX] >> The fix is pretty straightforward, just replace the ASSERT() with proper >> error handling. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> Changelog: >> v2: >> - Rebased to latest misc-next >> The error path in bio.c is already fixed, thus only need to replace >> the ASSERT() in __btrfs_map_block(). >> --- >> fs/btrfs/volumes.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index 4d479ac233a4..93bc45001e68 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, >> return -EINVAL; >> >> em = btrfs_get_chunk_map(fs_info, logical, *length); >> - ASSERT(!IS_ERR(em)); >> + if (IS_ERR(em)) >> + return PTR_ERR(em); >> >> map = em->map_lookup; >> data_stripes = nr_data_stripes(map); > > This bug affects 6.1.y LTS branch but no backport commit of this fix in > 6.1.y branch. Please include it. Thanks. > Just want to make sure, are you hitting the ASSERT()? If so, please provide the calltrace and btrfs-check output to pin down the cause. Just backporting the patch is only to make it a little more graceful, but won't solve the root problem. Thanks, Qu
On 2024-02-11 03:58, Qu Wenruo wrote: > > > On 2024/2/11 01:04, Celeste Liu wrote: >> On 3/2/23 09:54, Qu Wenruo wrote: >>> [BUG] >>> During my scrub rework, I did a stupid thing like this: >>> >>> bio->bi_iter.bi_sector = stripe->logical; >>> btrfs_submit_bio(fs_info, bio, stripe->mirror_num); >>> >>> Above bi_sector assignment is using logical address directly, which >>> lacks ">> SECTOR_SHIFT". >>> >>> This results a read on a range which has no chunk mapping. >>> >>> This results the following crash: >>> >>> BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 >>> assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 >>> ------------[ cut here ]------------ >>> >>> Sure this is all my fault, but this shows a possible problem in real >>> world, that some bitflip in file extents/tree block can point to >>> unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT >>> is not configured, cause invalid pointer. >>> >>> [PROBLEMS] >>> In above call chain, we just don't handle the possible error from >>> btrfs_get_chunk_map() inside __btrfs_map_block(). >>> >>> [FIX] >>> The fix is pretty straightforward, just replace the ASSERT() with proper >>> error handling. >>> >>> Signed-off-by: Qu Wenruo <wqu@suse.com> >>> --- >>> Changelog: >>> v2: >>> - Rebased to latest misc-next >>> The error path in bio.c is already fixed, thus only need to replace >>> the ASSERT() in __btrfs_map_block(). >>> --- >>> fs/btrfs/volumes.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 4d479ac233a4..93bc45001e68 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, >>> return -EINVAL; >>> >>> em = btrfs_get_chunk_map(fs_info, logical, *length); >>> - ASSERT(!IS_ERR(em)); >>> + if (IS_ERR(em)) >>> + return PTR_ERR(em); >>> >>> map = em->map_lookup; >>> data_stripes = nr_data_stripes(map); >> >> This bug affects 6.1.y LTS branch but no backport commit of this fix in >> 6.1.y branch. Please include it. Thanks. >> > Just want to make sure, are you hitting the ASSERT()? No, in non-debug build, ASSERT() is noop. So em will be a pointer whose value is errno and there is no any error handle for it. And then it trigger kernel NULL Pointer Dereference exception in btrfs_get_io_geometry() with em->map_lookup (Of course, it's not 0 but 0x5a (-EINVAL + offsetof(map_lookup)). But it still is a illegal memory access), this kernel thread was killed but the lock hold by this thread are not released. After that, all fs operations was block by the lock. > > If so, please provide the calltrace and btrfs-check output to pin down > the cause. It doesn't happen on my machine, I've notified the finder to send a backtrace to the mailing list > Just backporting the patch is only to make it a little more graceful, > but won't solve the root problem. No. In non debug build, ASSERT() is noop so it lead that there is no any error handle code. There is no RAII/SBRM, unexpected exit will lead the resources will not be released correctly. If it has unique holder (e.g. mutex), it will break all other code that need this resource! > > Thanks, > Qu
On 2024/2/11 13:43, Celeste Liu wrote: > On 2024-02-11 03:58, Qu Wenruo wrote: > >> >> >> On 2024/2/11 01:04, Celeste Liu wrote: >>> On 3/2/23 09:54, Qu Wenruo wrote: >>>> [BUG] >>>> During my scrub rework, I did a stupid thing like this: >>>> >>>> bio->bi_iter.bi_sector = stripe->logical; >>>> btrfs_submit_bio(fs_info, bio, stripe->mirror_num); >>>> >>>> Above bi_sector assignment is using logical address directly, which >>>> lacks ">> SECTOR_SHIFT". >>>> >>>> This results a read on a range which has no chunk mapping. >>>> >>>> This results the following crash: >>>> >>>> BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 >>>> assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 >>>> ------------[ cut here ]------------ >>>> >>>> Sure this is all my fault, but this shows a possible problem in real >>>> world, that some bitflip in file extents/tree block can point to >>>> unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT >>>> is not configured, cause invalid pointer. >>>> >>>> [PROBLEMS] >>>> In above call chain, we just don't handle the possible error from >>>> btrfs_get_chunk_map() inside __btrfs_map_block(). >>>> >>>> [FIX] >>>> The fix is pretty straightforward, just replace the ASSERT() with proper >>>> error handling. >>>> >>>> Signed-off-by: Qu Wenruo <wqu@suse.com> >>>> --- >>>> Changelog: >>>> v2: >>>> - Rebased to latest misc-next >>>> The error path in bio.c is already fixed, thus only need to replace >>>> the ASSERT() in __btrfs_map_block(). >>>> --- >>>> fs/btrfs/volumes.c | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>>> index 4d479ac233a4..93bc45001e68 100644 >>>> --- a/fs/btrfs/volumes.c >>>> +++ b/fs/btrfs/volumes.c >>>> @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, >>>> return -EINVAL; >>>> >>>> em = btrfs_get_chunk_map(fs_info, logical, *length); >>>> - ASSERT(!IS_ERR(em)); >>>> + if (IS_ERR(em)) >>>> + return PTR_ERR(em); >>>> >>>> map = em->map_lookup; >>>> data_stripes = nr_data_stripes(map); >>> >>> This bug affects 6.1.y LTS branch but no backport commit of this fix in >>> 6.1.y branch. Please include it. Thanks. >>> >> Just want to make sure, are you hitting the ASSERT()? > > No, in non-debug build, ASSERT() is noop. So em will be a pointer whose > value is errno and there is no any error handle for it. And then it > trigger kernel NULL Pointer Dereference exception in btrfs_get_io_geometry() > with em->map_lookup (Of course, it's not 0 but 0x5a > (-EINVAL + offsetof(map_lookup)). But it still is a illegal memory access), > this kernel thread was killed but the lock hold by this thread are not > released. After that, all fs operations was block by the lock. I know, what I mean "hitting the ASSERT()" includes the cases where CONFIG_BTRFS_ASSERT is not enabled. As it would lead to an obvious invalid memory access immediately. And lock/resources thing is the last thing you need to bother, as long as the kernel fs module triggered invalid memory access inside kernel space, it's busted already. > >> >> If so, please provide the calltrace and btrfs-check output to pin down >> the cause. > > It doesn't happen on my machine, I've notified the finder to send a > backtrace to the mailing list It's better to allow the reporter to send needed info directly to the ML. Trust me, working with a man in the middle is only going to slow down diagnosis and fix. And don't forget "btrfs check --readonly", as on-disk corrupted is also a very possible reason to lead IO out of chunk mapped ranges. > >> Just backporting the patch is only to make it a little more graceful, >> but won't solve the root problem. > > No. In non debug build, ASSERT() is noop so it lead that there is no any > error handle code. There is no RAII/SBRM, unexpected exit will lead the > resources will not be released correctly. If it has unique holder (e.g. mutex), > it will break all other code that need this resource! Sure, but even with a backport, it would still cause (although graceful) errors when such IO is triggered. Hitting the ASSERT() is only a symptom, not the root cause. I can definitely do a backport immediately, but I'm more interested in the root cause. What if it's caused by another hidden bug? Thanks, Qu > >> >> Thanks, >> Qu > >
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 4d479ac233a4..93bc45001e68 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6242,7 +6242,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, return -EINVAL; em = btrfs_get_chunk_map(fs_info, logical, *length); - ASSERT(!IS_ERR(em)); + if (IS_ERR(em)) + return PTR_ERR(em); map = em->map_lookup; data_stripes = nr_data_stripes(map);
[BUG] During my scrub rework, I did a stupid thing like this: bio->bi_iter.bi_sector = stripe->logical; btrfs_submit_bio(fs_info, bio, stripe->mirror_num); Above bi_sector assignment is using logical address directly, which lacks ">> SECTOR_SHIFT". This results a read on a range which has no chunk mapping. This results the following crash: BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 ------------[ cut here ]------------ Sure this is all my fault, but this shows a possible problem in real world, that some bitflip in file extents/tree block can point to unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT is not configured, cause invalid pointer. [PROBLEMS] In above call chain, we just don't handle the possible error from btrfs_get_chunk_map() inside __btrfs_map_block(). [FIX] The fix is pretty straightforward, just replace the ASSERT() with proper error handling. Signed-off-by: Qu Wenruo <wqu@suse.com> --- Changelog: v2: - Rebased to latest misc-next The error path in bio.c is already fixed, thus only need to replace the ASSERT() in __btrfs_map_block(). --- fs/btrfs/volumes.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)