Message ID | 1449251884-24135-1-git-send-email-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Fri, Dec 04, 2015 at 09:58:04AM -0800, Liu Bo wrote:
> This disables repair process on ro cases as it can cause system
> to be unresponsive on the ASSERT() in repair_io_failure().
>
> This can happen when scrub is running and a hardware error pops up,
> we should fall back to ro mounts gracefully instead of being unresponsive.

So this will also report the error as uncorrectable. This might be a bit
misleading if a device error happens first and then some potentially
correctable errors are detected. This could be accounted as an 'unverified'
error, which has the closest meaning.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Dec 07, 2015 at 03:37:43PM +0100, David Sterba wrote:
> On Fri, Dec 04, 2015 at 09:58:04AM -0800, Liu Bo wrote:
> > This disables repair process on ro cases as it can cause system
> > to be unresponsive on the ASSERT() in repair_io_failure().
> >
> > This can happen when scrub is running and a hardware error pops up,
> > we should fall back to ro mounts gracefully instead of being unresponsive.
>
> So this will also report the error as uncorrectable. This might be a bit
> misleading if a device error happens first and then some potentially
> correctable errors are detected. This could be accounted as an 'unverified'
> error, which has the closest meaning.

Makes sense, we can do

	if (ret < 0 && ret == -EROFS) {
		spin_lock();
		unverified++;
		spin_unlock();
	}

However, in scrub_fixup_nodatasum() all errors, including ENOMEM from path
allocation and transaction failures, are treated as 'uncorrectable'. So I
wonder whether this means 'uncorrectable' is only valid within this scrub
process?

Thanks,

-liubo
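The accounting suggested above could look roughly like the following userspace sketch. It is not the kernel code: `scrub_stats_sketch` and `account_fixup_result()` are hypothetical names, a pthread mutex stands in for the spinlock, and treating a return value of 0 as "corrected" is an assumption made for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the scrub statistics; not the kernel struct. */
struct scrub_stats_sketch {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	uint64_t corrected;
	uint64_t unverified;
	uint64_t uncorrectable;
};

/*
 * Account the result of one fixup attempt under the lock.
 *   0       -> counted as corrected (assumption)
 *   -EROFS  -> repair was skipped on a read-only fs, counted as unverified
 *   other   -> counted as uncorrectable
 */
static void account_fixup_result(struct scrub_stats_sketch *stats, int ret)
{
	pthread_mutex_lock(&stats->lock);
	if (ret == 0)
		stats->corrected++;
	else if (ret == -EROFS)
		stats->unverified++;
	else
		stats->uncorrectable++;
	pthread_mutex_unlock(&stats->lock);
}
```

With this shape, a repair that is skipped because the filesystem went read-only no longer inflates the uncorrectable counter.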
On Mon, Dec 07, 2015 at 10:26:05AM -0800, Liu Bo wrote:
> On Mon, Dec 07, 2015 at 03:37:43PM +0100, David Sterba wrote:
> > On Fri, Dec 04, 2015 at 09:58:04AM -0800, Liu Bo wrote:
> > > This disables repair process on ro cases as it can cause system
> > > to be unresponsive on the ASSERT() in repair_io_failure().
> > >
> > > This can happen when scrub is running and a hardware error pops up,
> > > we should fall back to ro mounts gracefully instead of being unresponsive.
> >
> > So this will also report the error as uncorrectable. This might be a bit
> > misleading if a device error happens first and then some potentially
> > correctable errors are detected. This could be accounted as an 'unverified'
> > error, which has the closest meaning.
>
> Makes sense, we can do
>
> 	if (ret < 0 && ret == -EROFS) {
> 		spin_lock();
> 		unverified++;
> 		spin_unlock();
> 	}
>
> However, in scrub_fixup_nodatasum() all errors, including ENOMEM from path
> allocation and transaction failures, are treated as 'uncorrectable'. So I
> wonder whether this means 'uncorrectable' is only valid within this scrub
> process?

I'm not sure we have a proper definition of the various stats. As a user, my
expectation is that 'uncorrectable' refers to permanent errors, so we should
try to match the type of error everywhere.
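One way to "match the type of error everywhere" would be a single classification helper that every accounting site calls. The sketch below is only an illustration: the enum, the function name, and in particular the choice of which error codes count as non-permanent (-EROFS and -ENOMEM here) are assumptions, not something the thread settles.

```c
#include <errno.h>

/* Hypothetical error classes; the names are not taken from the kernel. */
enum scrub_error_class {
	SCRUB_ERR_NONE,
	SCRUB_ERR_UNVERIFIED,		/* repair could not be attempted */
	SCRUB_ERR_UNCORRECTABLE		/* treated as a permanent error */
};

/*
 * Map a fixup return value to an error class.
 * Assumption: -EROFS and -ENOMEM mean the repair never really ran,
 * so they are not permanent; every other negative value is.
 */
static enum scrub_error_class classify_fixup_error(int ret)
{
	if (ret >= 0)
		return SCRUB_ERR_NONE;

	switch (ret) {
	case -EROFS:	/* filesystem is read-only, repair skipped */
	case -ENOMEM:	/* allocation failed before the repair started */
		return SCRUB_ERR_UNVERIFIED;
	default:
		return SCRUB_ERR_UNCORRECTABLE;
	}
}
```

Keeping the mapping in one function means the definition of 'uncorrectable' only has to be agreed on, and changed, in one place.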
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 2907a77..cb8a4e0 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -682,11 +682,14 @@ static int scrub_fixup_readpage(u64 inum, u64 offset, u64 root, void *fixup_ctx)
 	struct btrfs_root *local_root;
 	int srcu_index;
 
+	fs_info = fixup->root->fs_info;
+	if (fs_info->sb->s_flags & MS_RDONLY)
+		return -EROFS;
+
 	key.objectid = root;
 	key.type = BTRFS_ROOT_ITEM_KEY;
 	key.offset = (u64)-1;
 
-	fs_info = fixup->root->fs_info;
 	srcu_index = srcu_read_lock(&fs_info->subvol_srcu);
 
 	local_root = btrfs_read_fs_root_no_name(fs_info, &key);
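The pattern in the hunk above, checking the read-only flag before any repair work starts, can be tried out in isolation with a small userspace sketch. All types and names here (`sketch_super`, `sketch_fs_info`, `SKETCH_RDONLY`, `sketch_fixup_readpage()`) are hypothetical stand-ins for the kernel structures, and the repair itself is reduced to a placeholder.

```c
#include <errno.h>

#define SKETCH_RDONLY 0x1UL	/* hypothetical flag, stands in for MS_RDONLY */

/* Minimal stand-ins for the superblock and fs_info structures. */
struct sketch_super {
	unsigned long s_flags;
};

struct sketch_fs_info {
	struct sketch_super *sb;
};

/*
 * Bail out with -EROFS before touching anything when the filesystem is
 * read-only; otherwise run the (placeholder) repair and report success.
 */
static int sketch_fixup_readpage(struct sketch_fs_info *fs_info,
				 int *repair_attempted)
{
	if (fs_info->sb->s_flags & SKETCH_RDONLY)
		return -EROFS;

	/* Placeholder for the root lookup and the actual repair. */
	*repair_attempted = 1;
	return 0;
}
```

Returning early keeps the read-only case away from the repair path entirely, which is what prevents the ASSERT() in repair_io_failure() from being reached.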
This disables repair process on ro cases as it can cause system
to be unresponsive on the ASSERT() in repair_io_failure().

This can happen when scrub is running and a hardware error pops up,
we should fall back to ro mounts gracefully instead of being unresponsive.

Reported-by: Codebird <codebird@birds-are-nice.me>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: Get @fs_info from a real pointer instead of a confusing-name u64 root.

 fs/btrfs/scrub.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)