Message ID | 160679385127.447856.3129099457617444604.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfs: add the ability to flag a fs for repair | expand |
On Mon, Nov 30, 2020 at 07:37:31PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Define an incompat feature flag to indicate that the filesystem needs to > be repaired. While libxfs will recognize this feature, the kernel will > refuse to mount if the feature flag is set, and only xfs_repair will be > able to clear the flag. The goal here is to force the admin to run > xfs_repair to completion after upgrading the filesystem, or if we > otherwise detect anomalies. > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- IIUC, we're using an incompat bit to intentionally ensure the filesystem cannot mount, even on kernels that predate this particular "needs repair" feature. The only difference is that an older kernel would complain about an unknown feature and return a different error code. Right? That seems reasonable, but out of curiousity is there a need/reason for using an incompat bit over an ro_compat bit? Brian > fs/xfs/libxfs/xfs_format.h | 7 +++++++ > fs/xfs/xfs_mount.c | 6 ++++++ > 2 files changed, 13 insertions(+) > > > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h > index dd764da08f6f..5d8ba609ac0b 100644 > --- a/fs/xfs/libxfs/xfs_format.h > +++ b/fs/xfs/libxfs/xfs_format.h > @@ -468,6 +468,7 @@ xfs_sb_has_ro_compat_feature( > #define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */ > #define XFS_SB_FEAT_INCOMPAT_META_UUID (1 << 2) /* metadata UUID */ > #define XFS_SB_FEAT_INCOMPAT_BIGTIME (1 << 3) /* large timestamps */ > +#define XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR (1 << 4) /* needs xfs_repair */ > #define XFS_SB_FEAT_INCOMPAT_ALL \ > (XFS_SB_FEAT_INCOMPAT_FTYPE| \ > XFS_SB_FEAT_INCOMPAT_SPINODES| \ > @@ -584,6 +585,12 @@ static inline bool xfs_sb_version_hasinobtcounts(struct xfs_sb *sbp) > (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT); > } > > +static inline bool xfs_sb_version_needsrepair(struct xfs_sb *sbp) > +{ > + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 && > + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR); > +} > + > /* > * end of superblock version macros > */ > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index 7bc7901d648d..2853ad49b27d 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -266,6 +266,12 @@ xfs_sb_validate_mount( > struct xfs_buf *bp, > struct xfs_sb *sbp) > { > + /* Filesystem claims it needs repair, so refuse the mount. */ > + if (xfs_sb_version_needsrepair(&mp->m_sb)) { > + xfs_warn(mp, "Filesystem needs repair. Please run xfs_repair."); > + return -EFSCORRUPTED; > + } > + > /* > * Don't touch the filesystem if a user tool thinks it owns the primary > * superblock. mkfs doesn't clear the flag from secondary supers, so >
On Tue, Dec 01, 2020 at 11:18:12AM -0500, Brian Foster wrote: > On Mon, Nov 30, 2020 at 07:37:31PM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > Define an incompat feature flag to indicate that the filesystem needs to > > be repaired. While libxfs will recognize this feature, the kernel will > > refuse to mount if the feature flag is set, and only xfs_repair will be > > able to clear the flag. The goal here is to force the admin to run > > xfs_repair to completion after upgrading the filesystem, or if we > > otherwise detect anomalies. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > IIUC, we're using an incompat bit to intentionally ensure the filesystem > cannot mount, even on kernels that predate this particular "needs > repair" feature. The only difference is that an older kernel would > complain about an unknown feature and return a different error code. > Right? > > That seems reasonable, but out of curiousity is there a need/reason for > using an incompat bit over an ro_compat bit? The general principle is to prevent /any/ mounting of the fs until the admin runs repair, even if it's readonly mounting. The specific reason is so that xfs_db can set some other feature flag as part of an upgrade and then set the incompat bit to force a repair run (which xfs_admin will immediately take care of). Hm. Now that you got me thinking, maybe there should be an exception for a norecovery mount? --D > Brian > > > fs/xfs/libxfs/xfs_format.h | 7 +++++++ > > fs/xfs/xfs_mount.c | 6 ++++++ > > 2 files changed, 13 insertions(+) > > > > > > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h > > index dd764da08f6f..5d8ba609ac0b 100644 > > --- a/fs/xfs/libxfs/xfs_format.h > > +++ b/fs/xfs/libxfs/xfs_format.h > > @@ -468,6 +468,7 @@ xfs_sb_has_ro_compat_feature( > > #define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */ > > #define XFS_SB_FEAT_INCOMPAT_META_UUID (1 << 2) /* metadata UUID */ > > #define XFS_SB_FEAT_INCOMPAT_BIGTIME (1 << 3) /* large timestamps */ > > +#define XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR (1 << 4) /* needs xfs_repair */ > > #define XFS_SB_FEAT_INCOMPAT_ALL \ > > (XFS_SB_FEAT_INCOMPAT_FTYPE| \ > > XFS_SB_FEAT_INCOMPAT_SPINODES| \ > > @@ -584,6 +585,12 @@ static inline bool xfs_sb_version_hasinobtcounts(struct xfs_sb *sbp) > > (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT); > > } > > > > +static inline bool xfs_sb_version_needsrepair(struct xfs_sb *sbp) > > +{ > > + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 && > > + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR); > > +} > > + > > /* > > * end of superblock version macros > > */ > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > > index 7bc7901d648d..2853ad49b27d 100644 > > --- a/fs/xfs/xfs_mount.c > > +++ b/fs/xfs/xfs_mount.c > > @@ -266,6 +266,12 @@ xfs_sb_validate_mount( > > struct xfs_buf *bp, > > struct xfs_sb *sbp) > > { > > + /* Filesystem claims it needs repair, so refuse the mount. */ > > + if (xfs_sb_version_needsrepair(&mp->m_sb)) { > > + xfs_warn(mp, "Filesystem needs repair. Please run xfs_repair."); > > + return -EFSCORRUPTED; > > + } > > + > > /* > > * Don't touch the filesystem if a user tool thinks it owns the primary > > * superblock. mkfs doesn't clear the flag from secondary supers, so > > >
On Tue, Dec 01, 2020 at 08:25:39AM -0800, Darrick J. Wong wrote: > On Tue, Dec 01, 2020 at 11:18:12AM -0500, Brian Foster wrote: > > On Mon, Nov 30, 2020 at 07:37:31PM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > Define an incompat feature flag to indicate that the filesystem needs to > > > be repaired. While libxfs will recognize this feature, the kernel will > > > refuse to mount if the feature flag is set, and only xfs_repair will be > > > able to clear the flag. The goal here is to force the admin to run > > > xfs_repair to completion after upgrading the filesystem, or if we > > > otherwise detect anomalies. > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > --- > > > > IIUC, we're using an incompat bit to intentionally ensure the filesystem > > cannot mount, even on kernels that predate this particular "needs > > repair" feature. The only difference is that an older kernel would > > complain about an unknown feature and return a different error code. > > Right? > > > > That seems reasonable, but out of curiousity is there a need/reason for > > using an incompat bit over an ro_compat bit? > > The general principle is to prevent /any/ mounting of the fs until the > admin runs repair, even if it's readonly mounting. The specific reason > is so that xfs_db can set some other feature flag as part of an upgrade > and then set the incompat bit to force a repair run (which xfs_admin > will immediately take care of). > > Hm. Now that you got me thinking, maybe there should be an exception > for a norecovery mount? > Yeah, I was more thinking about for recovery purposes if something happens to go wrong, so that should imply norecovery as well. Eh, I suppose one could always clear the bit too in that case so it's not that big of a deal. Either way: Reviewed-by: Brian Foster <bfoster@redhat.com> > --D > > > Brian > > > > > fs/xfs/libxfs/xfs_format.h | 7 +++++++ > > > fs/xfs/xfs_mount.c | 6 ++++++ > > > 2 files changed, 13 insertions(+) > > > > > > > > > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h > > > index dd764da08f6f..5d8ba609ac0b 100644 > > > --- a/fs/xfs/libxfs/xfs_format.h > > > +++ b/fs/xfs/libxfs/xfs_format.h > > > @@ -468,6 +468,7 @@ xfs_sb_has_ro_compat_feature( > > > #define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */ > > > #define XFS_SB_FEAT_INCOMPAT_META_UUID (1 << 2) /* metadata UUID */ > > > #define XFS_SB_FEAT_INCOMPAT_BIGTIME (1 << 3) /* large timestamps */ > > > +#define XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR (1 << 4) /* needs xfs_repair */ > > > #define XFS_SB_FEAT_INCOMPAT_ALL \ > > > (XFS_SB_FEAT_INCOMPAT_FTYPE| \ > > > XFS_SB_FEAT_INCOMPAT_SPINODES| \ > > > @@ -584,6 +585,12 @@ static inline bool xfs_sb_version_hasinobtcounts(struct xfs_sb *sbp) > > > (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT); > > > } > > > > > > +static inline bool xfs_sb_version_needsrepair(struct xfs_sb *sbp) > > > +{ > > > + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 && > > > + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR); > > > +} > > > + > > > /* > > > * end of superblock version macros > > > */ > > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > > > index 7bc7901d648d..2853ad49b27d 100644 > > > --- a/fs/xfs/xfs_mount.c > > > +++ b/fs/xfs/xfs_mount.c > > > @@ -266,6 +266,12 @@ xfs_sb_validate_mount( > > > struct xfs_buf *bp, > > > struct xfs_sb *sbp) > > > { > > > + /* Filesystem claims it needs repair, so refuse the mount. */ > > > + if (xfs_sb_version_needsrepair(&mp->m_sb)) { > > > + xfs_warn(mp, "Filesystem needs repair. Please run xfs_repair."); > > > + return -EFSCORRUPTED; > > > + } > > > + > > > /* > > > * Don't touch the filesystem if a user tool thinks it owns the primary > > > * superblock. mkfs doesn't clear the flag from secondary supers, so > > > > > >
On 12/1/20 10:18 AM, Brian Foster wrote: > On Mon, Nov 30, 2020 at 07:37:31PM -0800, Darrick J. Wong wrote: >> From: Darrick J. Wong <darrick.wong@oracle.com> >> >> Define an incompat feature flag to indicate that the filesystem needs to >> be repaired. While libxfs will recognize this feature, the kernel will >> refuse to mount if the feature flag is set, and only xfs_repair will be >> able to clear the flag. The goal here is to force the admin to run >> xfs_repair to completion after upgrading the filesystem, or if we >> otherwise detect anomalies. >> >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> >> --- > IIUC, we're using an incompat bit to intentionally ensure the filesystem > cannot mount, even on kernels that predate this particular "needs > repair" feature. The only difference is that an older kernel would > complain about an unknown feature and return a different error code. > Right? > > That seems reasonable, but out of curiousity is there a need/reason for > using an incompat bit over an ro_compat bit? I'm a fan of a straight-up incompat, because we don't really know what format changes in the future might require this flag to be set; nothing guarantees that future changes will be ro-compat-safe, right? -Eric
On Fri, Dec 04, 2020 at 02:07:49PM -0600, Eric Sandeen wrote: > On 12/1/20 10:18 AM, Brian Foster wrote: > > On Mon, Nov 30, 2020 at 07:37:31PM -0800, Darrick J. Wong wrote: > >> From: Darrick J. Wong <darrick.wong@oracle.com> > >> > >> Define an incompat feature flag to indicate that the filesystem needs to > >> be repaired. While libxfs will recognize this feature, the kernel will > >> refuse to mount if the feature flag is set, and only xfs_repair will be > >> able to clear the flag. The goal here is to force the admin to run > >> xfs_repair to completion after upgrading the filesystem, or if we > >> otherwise detect anomalies. > >> > >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > >> --- > > IIUC, we're using an incompat bit to intentionally ensure the filesystem > > cannot mount, even on kernels that predate this particular "needs > > repair" feature. The only difference is that an older kernel would > > complain about an unknown feature and return a different error code. > > Right? > > > > That seems reasonable, but out of curiousity is there a need/reason for > > using an incompat bit over an ro_compat bit? > > I'm a fan of a straight-up incompat, because we don't really know what > format changes in the future might require this flag to be set; nothing > guarantees that future changes will be ro-compat-safe, right? Correct. In the case of the inobtcount upgrade, we know that the inobt/finobt blockcounts in the AGI are zero (and thus wrong) right after the upgrade. If we made it a rocompat bit then we'd allow ro mounts but we'd also have to be careful to prohibit a ro->rw remount, at which point the admin gets a Big Surprise. Why not just make the admin repair the system right then and there? I mean, xfs_admin is already going to run repair anyway, so in practice there shouldn't be that many people trying to push an "upgraded but needs repair" fs at the kernel anyway. --D > -Eric
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index dd764da08f6f..5d8ba609ac0b 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -468,6 +468,7 @@ xfs_sb_has_ro_compat_feature( #define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */ #define XFS_SB_FEAT_INCOMPAT_META_UUID (1 << 2) /* metadata UUID */ #define XFS_SB_FEAT_INCOMPAT_BIGTIME (1 << 3) /* large timestamps */ +#define XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR (1 << 4) /* needs xfs_repair */ #define XFS_SB_FEAT_INCOMPAT_ALL \ (XFS_SB_FEAT_INCOMPAT_FTYPE| \ XFS_SB_FEAT_INCOMPAT_SPINODES| \ @@ -584,6 +585,12 @@ static inline bool xfs_sb_version_hasinobtcounts(struct xfs_sb *sbp) (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT); } +static inline bool xfs_sb_version_needsrepair(struct xfs_sb *sbp) +{ + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 && + (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR); +} + /* * end of superblock version macros */ diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 7bc7901d648d..2853ad49b27d 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -266,6 +266,12 @@ xfs_sb_validate_mount( struct xfs_buf *bp, struct xfs_sb *sbp) { + /* Filesystem claims it needs repair, so refuse the mount. */ + if (xfs_sb_version_needsrepair(&mp->m_sb)) { + xfs_warn(mp, "Filesystem needs repair. Please run xfs_repair."); + return -EFSCORRUPTED; + } + /* * Don't touch the filesystem if a user tool thinks it owns the primary * superblock. mkfs doesn't clear the flag from secondary supers, so