Message ID | 20180713131003.31121-1-billodo@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On Fri, Jul 13, 2018 at 08:10:03AM -0500, Bill O'Donnell wrote: > Current sb verifier doesn't check bounds on sb_fdblocks and sb_ifree. > Add sanity checks for these parameters. > > Signed-off-by: Bill O'Donnell <billodo@redhat.com> > --- > fs/xfs/libxfs/xfs_sb.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c > index 350119eeaecb..cdede769ab88 100644 > --- a/fs/xfs/libxfs/xfs_sb.c > +++ b/fs/xfs/libxfs/xfs_sb.c > @@ -261,7 +261,9 @@ xfs_mount_validate_sb( > sbp->sb_dblocks == 0 || > sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) || > sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp) || > - sbp->sb_shared_vn != 0)) { > + sbp->sb_shared_vn != 0 || > + sbp->sb_fdblocks > sbp->sb_dblocks || > + sbp->sb_ifree > sbp->sb_icount)) { Hmm. On its face this seems reasonable for the superblock verifier, but then I started wondering, since these are /summary/ counters. If the free counts are off by this much, the admin won't be able to mount the fs, and xfs_repair is the only other tool that can fix the summary counts. However, if the log is dirty, the mount won't succeed to recover the fs, which is too bad since we can reinitialize the summary counts after log recovery. xfs_repair -L will be the only way out, which will wreak havoc on the filesystem from discarding the log contents. So, would it be preferable to split this into two parts? For example, have this as a corruption check in _sb_write_verify to prevent us from writing out garbage counters and a clamp in _reinit_percpu_counters so that we never present ridiculous free counts to users? (Does any of this make sense with !haslazysbcount filesystems?) Bonus question: What about checking frextents/rextents? --D > xfs_notice(mp, "SB sanity check failed"); > return -EFSCORRUPTED; > } > -- > 2.17.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 13, 2018 at 09:41:53AM -0700, Darrick J. Wong wrote: > On Fri, Jul 13, 2018 at 08:10:03AM -0500, Bill O'Donnell wrote: > > Current sb verifier doesn't check bounds on sb_fdblocks and sb_ifree. > > Add sanity checks for these parameters. > > > > Signed-off-by: Bill O'Donnell <billodo@redhat.com> > > --- > > fs/xfs/libxfs/xfs_sb.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c > > index 350119eeaecb..cdede769ab88 100644 > > --- a/fs/xfs/libxfs/xfs_sb.c > > +++ b/fs/xfs/libxfs/xfs_sb.c > > @@ -261,7 +261,9 @@ xfs_mount_validate_sb( > > sbp->sb_dblocks == 0 || > > sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) || > > sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp) || > > - sbp->sb_shared_vn != 0)) { > > + sbp->sb_shared_vn != 0 || > > + sbp->sb_fdblocks > sbp->sb_dblocks || > > + sbp->sb_ifree > sbp->sb_icount)) { > > Hmm. On its face this seems reasonable for the superblock verifier, but > then I started wondering, since these are /summary/ counters. FWIW, I'm proposing a rudimentary bounds check to prevent this sort of issue from even happening in the first place: https://www.spinics.net/lists/linux-xfs/msg20592.html > > If the free counts are off by this much, the admin won't be able to > mount the fs, and xfs_repair is the only other tool that can fix the > summary counts. However, if the log is dirty, the mount won't succeed > to recover the fs, which is too bad since we can reinitialize the > summary counts after log recovery. xfs_repair -L will be the only way > out, which will wreak havoc on the filesystem from discarding the log > contents. agreed, but again, I want to prevent the aforementioned use case where corruption gets introduced. > > So, would it be preferable to split this into two parts? For example, > have this as a corruption check in _sb_write_verify to prevent us from > writing out garbage counters and a clamp in _reinit_percpu_counters so > that we never present ridiculous free counts to users? > > (Does any of this make sense with !haslazysbcount filesystems?) > > Bonus question: What about checking frextents/rextents? Hrmm, perhaps. It should definitely be considered. Thanks- Bill > --D > > > xfs_notice(mp, "SB sanity check failed"); > > return -EFSCORRUPTED; > > } > > -- > > 2.17.1 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 13, 2018 at 09:41:53AM -0700, Darrick J. Wong wrote: > On Fri, Jul 13, 2018 at 08:10:03AM -0500, Bill O'Donnell wrote: > > Current sb verifier doesn't check bounds on sb_fdblocks and sb_ifree. > > Add sanity checks for these parameters. > > > > Signed-off-by: Bill O'Donnell <billodo@redhat.com> > > --- > > fs/xfs/libxfs/xfs_sb.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c > > index 350119eeaecb..cdede769ab88 100644 > > --- a/fs/xfs/libxfs/xfs_sb.c > > +++ b/fs/xfs/libxfs/xfs_sb.c > > @@ -261,7 +261,9 @@ xfs_mount_validate_sb( > > sbp->sb_dblocks == 0 || > > sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) || > > sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp) || > > - sbp->sb_shared_vn != 0)) { > > + sbp->sb_shared_vn != 0 || > > + sbp->sb_fdblocks > sbp->sb_dblocks || > > + sbp->sb_ifree > sbp->sb_icount)) { > > Hmm. On its face this seems reasonable for the superblock verifier, but > then I started wondering, since these are /summary/ counters. > > If the free counts are off by this much, the admin won't be able to > mount the fs, and xfs_repair is the only other tool that can fix the > summary counts. However, if the log is dirty, the mount won't succeed > to recover the fs, which is too bad since we can reinitialize the > summary counts after log recovery. xfs_repair -L will be the only way > out, which will wreak havoc on the filesystem from discarding the log > contents. Yup, that's why I said "catch this on /write/", not "always reject bad counter values". i.e. we should never be writing a bad value, but we most definitely need to be able to mount the filesystem to reconstruct them. > So, would it be preferable to split this into two parts? For example, > have this as a corruption check in _sb_write_verify to prevent us from > writing out garbage counters yes. > and a clamp in _reinit_percpu_counters so > that we never present ridiculous free counts to users? percpu_counter_{read,sum}_positive() should be used for anything that is userspace facing. xfs_fs_counts() gets this right, but xfs_fs_statfs() doesn't - it should use percpu_counter_sum_positive(). > (Does any of this make sense with !haslazysbcount filesystems?) Same thing - we can't verify the counters on read until after log recovery as all the changes are journalled. > Bonus question: What about checking frextents/rextents? Same as !lazycount - all changes are journalled. Cheers, Dave.
On Sat, Jul 14, 2018 at 09:43:41AM +1000, Dave Chinner wrote: > On Fri, Jul 13, 2018 at 09:41:53AM -0700, Darrick J. Wong wrote: > > On Fri, Jul 13, 2018 at 08:10:03AM -0500, Bill O'Donnell wrote: > > > Current sb verifier doesn't check bounds on sb_fdblocks and sb_ifree. > > > Add sanity checks for these parameters. > > > > > > Signed-off-by: Bill O'Donnell <billodo@redhat.com> > > > --- > > > fs/xfs/libxfs/xfs_sb.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c > > > index 350119eeaecb..cdede769ab88 100644 > > > --- a/fs/xfs/libxfs/xfs_sb.c > > > +++ b/fs/xfs/libxfs/xfs_sb.c > > > @@ -261,7 +261,9 @@ xfs_mount_validate_sb( > > > sbp->sb_dblocks == 0 || > > > sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) || > > > sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp) || > > > - sbp->sb_shared_vn != 0)) { > > > + sbp->sb_shared_vn != 0 || > > > + sbp->sb_fdblocks > sbp->sb_dblocks || > > > + sbp->sb_ifree > sbp->sb_icount)) { > > > > Hmm. On its face this seems reasonable for the superblock verifier, but > > then I started wondering, since these are /summary/ counters. > > > > If the free counts are off by this much, the admin won't be able to > > mount the fs, and xfs_repair is the only other tool that can fix the > > summary counts. However, if the log is dirty, the mount won't succeed > > to recover the fs, which is too bad since we can reinitialize the > > summary counts after log recovery. xfs_repair -L will be the only way > > out, which will wreak havoc on the filesystem from discarding the log > > contents. > > Yup, that's why I said "catch this on /write/", not "always reject > bad counter values". > > i.e. we should never be writing a bad value, but we most definitely > need to be able to mount the filesystem to reconstruct them. > > > So, would it be preferable to split this into two parts? For example, > > have this as a corruption check in _sb_write_verify to prevent us from > > writing out garbage counters > > yes. > > > and a clamp in _reinit_percpu_counters so > > that we never present ridiculous free counts to users? > > percpu_counter_{read,sum}_positive() should be used for anything that is > userspace facing. xfs_fs_counts() gets this right, but > xfs_fs_statfs() doesn't - it should use > percpu_counter_sum_positive(). I don't think that will solve this problem -- although sb_fdblocks is larger than sb_dblocks, sd_fdblocks is not so insanely large that percpu_counter_{read,sum} return negative values; returning to Eric's analysis of the original complaint: > sb_fdblocks 4461713825, counted 166746529 > - found root inode chunk > > that sb_fdblocks really is ~17T which indicates the problem > really is on disk. > > 4461713825 > 100001001111100000101100110100001 ^-- bit 32 of a signed 64-bit quantity > 166746529 > 1001111100000101100110100001 So we still need a separate sb_fdblocks <= sb_dblocks clamp and/or forced recalculation somewhere. I agree that _fs_statfs should only return positive free blocks to avoid reporting total garbage to userspace, but that's not the problme here. I'll toss that onto my pile for 4.19 stuff. > > (Does any of this make sense with !haslazysbcount filesystems?) > > Same thing - we can't verify the counters on read until after log > recovery as all the changes are journalled. > > > Bonus question: What about checking frextents/rextents? > > Same as !lazycount - all changes are journalled. Ok. --D > Cheers, > > Dave. > > -- > Dave Chinner > david@fromorbit.com > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 350119eeaecb..cdede769ab88 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -261,7 +261,9 @@ xfs_mount_validate_sb( sbp->sb_dblocks == 0 || sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) || sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp) || - sbp->sb_shared_vn != 0)) { + sbp->sb_shared_vn != 0 || + sbp->sb_fdblocks > sbp->sb_dblocks || + sbp->sb_ifree > sbp->sb_icount)) { xfs_notice(mp, "SB sanity check failed"); return -EFSCORRUPTED; }
Current sb verifier doesn't check bounds on sb_fdblocks and sb_ifree. Add sanity checks for these parameters. Signed-off-by: Bill O'Donnell <billodo@redhat.com> --- fs/xfs/libxfs/xfs_sb.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)