Message ID | 981687019fcca4b4f7455b7a755e48888cbd24ff.1440510311.git.zhaolei@cn.fujitsu.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On Tue, Aug 25, 2015 at 09:45:27PM +0800, Zhao Lei wrote: > It is better to show error stats to user when we found tree block > spanning stripes. > > On a btrfs created by old version of btrfs-convert: > Before patch: > # btrfs scrub start -B /dev/vdh > scrub done for 8b342d35-2904-41ab-b3cb-2f929709cf47 > scrub started at Tue Aug 25 21:19:09 2015 and finished after 00:00:00 > total bytes scrubbed: 53.54MiB with 0 errors > # dmesg > ... > [ 128.711434] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27000832 > [ 128.712744] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27066368 > ... > > After patch: > # btrfs scrub start -B /dev/vdh > scrub done for ff7f844b-7a4e-4b1a-88a9-8252ab25be1b > scrub started at Tue Aug 25 21:42:29 2015 and finished after 00:00:00 > total bytes scrubbed: 53.60MiB with 2 errors > error details: > corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 > ERROR: There are uncorrectable errors. I agree that reporting that is a good thing so the syslog is consistent with the status output. Is the error really uncorrectable? I'd have to re-read the reports and root cause analysis again, but can we do it at least via checker? The uncorrectable errors are especially worrisome so I'd like to know what kind of answer should we give. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, David Sterba > -----Original Message----- > From: David Sterba [mailto:dsterba@suse.cz] > Sent: Thursday, August 27, 2015 1:20 AM > To: Zhao Lei <zhaolei@cn.fujitsu.com> > Cc: linux-btrfs@vger.kernel.org > Subject: Re: [PATCH] btrfs: scrub: set error stats when tree block spanning > stripes > > On Tue, Aug 25, 2015 at 09:45:27PM +0800, Zhao Lei wrote: > > It is better to show error stats to user when we found tree block > > spanning stripes. > > > > On a btrfs created by old version of btrfs-convert: > > Before patch: > > # btrfs scrub start -B /dev/vdh > > scrub done for 8b342d35-2904-41ab-b3cb-2f929709cf47 > > scrub started at Tue Aug 25 21:19:09 2015 and finished after > 00:00:00 > > total bytes scrubbed: 53.54MiB with 0 errors > > # dmesg > > ... > > [ 128.711434] BTRFS error (device vdh): scrub: tree block 27054080 > spanning stripes, ignored. logical=27000832 > > [ 128.712744] BTRFS error (device vdh): scrub: tree block 27054080 > spanning stripes, ignored. logical=27066368 > > ... > > > > After patch: > > # btrfs scrub start -B /dev/vdh > > scrub done for ff7f844b-7a4e-4b1a-88a9-8252ab25be1b > > scrub started at Tue Aug 25 21:42:29 2015 and finished after > 00:00:00 > > total bytes scrubbed: 53.60MiB with 2 errors > > error details: > > corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 > > ERROR: There are uncorrectable errors. > > I agree that reporting that is a good thing so the syslog is consistent with the > status output. Is the error really uncorrectable? I'd have to re-read the reports > and root cause analysis again, but can we do it at least via checker? > This type of error-filesystem is created by old btrfs-convert, Reported by: Chris Murphy <lists@colorremedies.com> In maillist, with title: [BUG] kernel BUG at fs/btrfs/scrub.c:1956!, when scrubbing freshly convert Then we get the reason: It is caused by bug in btrfs-convert. Detail is in maillist titled: RE: [BUG] kernel BUG at fs/btrfs/scrub.c:1956!, when scrubbing freshly converted ext In 2015-7-23 And fixed btrfs-progs, to avoid creating such a wrong filesystem with btrfs-convert, and report this type of error in fsck. With patch titled: [PATCH 0/4] Metadata crossing stripe boundary fixes [PATCH 1/4] btrfs: print-tree: print stripe len of a chunk [PATCH 2/4] btrfs: fsck: Check if a metadata tree block crossing stripe boundary [PATCH 3/4] btrfs: extent-tree: Avoid allocating tree block that cross stripe boundar [PATCH 4/4] btrfs: convert: Avoid allocating metadata extent crossing stripe boundary In 2015-7-23 Also fixed kernel code to avoid panic with: [PATCH] btrfs: Fix scrub panic when leaf accross stripes in btrfs maillist. In 2015-7-23 Above patchs are in current btrfs-progs and integration-4.3 branch. But it is still uncorrectable because neither kernel nor btrfsck can fix it(only can report). In short, it is caused by old version of btrfs-convert, and already fixed, and this patch is used to make scrub report it in output. Thanks Zhaolei > The uncorrectable errors are especially worrisome so I'd like to know what kind > of answer should we give. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 27, 2015 at 09:44:05AM +0800, Zhao Lei wrote: [...] Thanks for the links and summary! > But it is still uncorrectable because neither kernel nor btrfsck can fix > it(only can report). I'm curious if it' is possible to write code that would fix the problem, eg. te relocate or shift the blocks. The 'btrfs rescue' command group is meant for this type of targeted fixes, unless it's easy to integrate it to the check & repair itself. > In short, it is caused by old version of btrfs-convert, and already fixed, > and this patch is used to make scrub report it in output. I assume there are people affected by this problem so it would be good to have a repair tool. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, David and Zhao, David Sterba wrote on 2015/09/03 15:17 +0200: > On Thu, Aug 27, 2015 at 09:44:05AM +0800, Zhao Lei wrote: > [...] > > Thanks for the links and summary! > >> But it is still uncorrectable because neither kernel nor btrfsck can fix >> it(only can report). > > I'm curious if it' is possible to write code that would fix the problem, > eg. te relocate or shift the blocks. The 'btrfs rescue' command group is > meant for this type of targeted fixes, unless it's easy to integrate it > to the check & repair itself. I'm happy and willing to add the relocating ability for btrfs-progs. As that will provide a lot of flexibility, while the code should be quite easy if I do it in a NOCOW method. (Find a extent, copy it to other place, follow backref and modify) The only concern is, does it mean that we need to implement all the features of btrfs kernel in user space tool? And will it make the codes duplicated between kernel and progs? Thanks, Qu > >> In short, it is caused by old version of btrfs-convert, and already fixed, >> and this patch is used to make scrub report it in output. > > I assume there are people affected by this problem so it would be good > to have a repair tool. > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 04, 2015 at 11:21:05AM +0800, Qu Wenruo wrote: > I'm happy and willing to add the relocating ability for btrfs-progs. > As that will provide a lot of flexibility, while the code should be > quite easy if I do it in a NOCOW method. > (Find a extent, copy it to other place, follow backref and modify) > > The only concern is, does it mean that we need to implement all the > features of btrfs kernel in user space tool? I don't think we need a 1:1 feature parity. We might end up doing a one chunk relocation, but relocating an extent would be enough for a start. Doing the operations in userspace is easier in some respects as the data are not live, we can do different trade-offs, utilize readahead etc. Some operations might end up significantly faster and might be used as one-time actions eg. when converting multi-terabyte filesystem to another raid type. But this is a long-term goal. > And will it make the codes duplicated between kernel and progs? At this moment the code is not unified and we have to copy the code from kernel where needed. But as far as the code does not diverge, it'll be easier to use exact kernel sources in progs. Current versions of the common sources are not close enough, so we'd have to unify them first. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index c69c75e..81482e4 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3025,6 +3025,9 @@ static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx, logic_start + map->stripe_len)) { btrfs_err(fs_info, "scrub: tree block %llu spanning stripes, ignored. logical=%llu", key.objectid, logic_start); + spin_lock(&sctx->stat_lock); + sctx->stat.uncorrectable_errors++; + spin_unlock(&sctx->stat_lock); goto next; } again: @@ -3374,6 +3377,9 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, "scrub: tree block %llu spanning " "stripes, ignored. logical=%llu", key.objectid, logical); + spin_lock(&sctx->stat_lock); + sctx->stat.uncorrectable_errors++; + spin_unlock(&sctx->stat_lock); goto next; }
It is better to show error stats to user when we found tree block spanning stripes. On a btrfs created by old version of btrfs-convert: Before patch: # btrfs scrub start -B /dev/vdh scrub done for 8b342d35-2904-41ab-b3cb-2f929709cf47 scrub started at Tue Aug 25 21:19:09 2015 and finished after 00:00:00 total bytes scrubbed: 53.54MiB with 0 errors # dmesg ... [ 128.711434] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27000832 [ 128.712744] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27066368 ... After patch: # btrfs scrub start -B /dev/vdh scrub done for ff7f844b-7a4e-4b1a-88a9-8252ab25be1b scrub started at Tue Aug 25 21:42:29 2015 and finished after 00:00:00 total bytes scrubbed: 53.60MiB with 2 errors error details: corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 ERROR: There are uncorrectable errors. # dmesg ...omit... # Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- fs/btrfs/scrub.c | 6 ++++++ 1 file changed, 6 insertions(+)