Message ID | 6f62dbc6-f516-e8b5-1f08-6be227a61219@suse.com (mailing list archive) |
---|---|
State | Accepted, archived |
On Thu, Jul 19, 2018 at 05:23:22PM -0400, Jeff Mahoney wrote:
> Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
> AG alignment code into a separate function. It got rid of
> redundant checks for dswidth != 0 since calc_stripe_factors was
> supposed to guarantee that if dsunit is non-zero dswidth will be
> as well. Unfortunately, there's hardware out there that reports its
> optimal i/o size as larger than the maximum i/o size, which the kernel
> treats as broken and zeros out the optimal i/o size. We'll accept
> the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
> in align_ag_geometry.
>
> To resolve this we can check the topology before consuming it, default
> to using the stripe unit as the stripe width, and warn the user about it.
>

I wonder if this shouldn't go into blkid_get_topology since something is wrong
with the information reported by the storage.

And require a force_overwrite to continue, at this point, something looks quite
wrong in the storage, and I think this is the last 'resource' a sysadmin will
have to notice this before making the FS, and start using it, so, maybe requiring
force_overwrite would bring more attention.
On Fri, Jul 20, 2018 at 05:55:29PM +0200, Carlos Maiolino wrote:
> On Thu, Jul 19, 2018 at 05:23:22PM -0400, Jeff Mahoney wrote:
> > Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
> > AG alignment code into a separate function. It got rid of
> > redundant checks for dswidth != 0 since calc_stripe_factors was
> > supposed to guarantee that if dsunit is non-zero dswidth will be
> > as well. Unfortunately, there's hardware out there that reports its
> > optimal i/o size as larger than the maximum i/o size, which the kernel
> > treats as broken and zeros out the optimal i/o size. We'll accept
> > the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
> > in align_ag_geometry.
> >
> > To resolve this we can check the topology before consuming it, default
> > to using the stripe unit as the stripe width, and warn the user about it.
> >
>
> I wonder if this shouldn't go into blkid_get_topology since something is wrong
> with the information reported by the storage.

If the storage gives us crap geometry information we don't have to use
it. Keep the message that we autodetected nonsense and are dropping it;
the sysadmin can always re-run mkfs with sensible dsunit/dswidth.

> And require a force_overwrite to continue, at this point, something looks quite
> wrong in the storage, and I think this is the last 'resource' a sysadmin will
> have to notice this before making the FS, and start using it, so, maybe requiring
> force_overwrite would bring more attention.

I prefer reserving -f for "This is about to destroy something and can't
be undone", not "This auto-optimization is screwed up, continue? (y/N)"

--D
On 7/20/18 11:55 AM, Carlos Maiolino wrote:
> On Thu, Jul 19, 2018 at 05:23:22PM -0400, Jeff Mahoney wrote:
>> Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
>> AG alignment code into a separate function. It got rid of
>> redundant checks for dswidth != 0 since calc_stripe_factors was
>> supposed to guarantee that if dsunit is non-zero dswidth will be
>> as well. Unfortunately, there's hardware out there that reports its
>> optimal i/o size as larger than the maximum i/o size, which the kernel
>> treats as broken and zeros out the optimal i/o size. We'll accept
>> the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
>> in align_ag_geometry.
>>
>> To resolve this we can check the topology before consuming it, default
>> to using the stripe unit as the stripe width, and warn the user about it.
>>
>
> I wonder if this shouldn't go into blkid_get_topology since something is wrong
> with the information reported by the storage.
> And require a force_overwrite to continue, at this point, something looks quite
> wrong in the storage, and I think this is the last 'resource' a sysadmin will
> have to notice this before making the FS, and start using it, so, maybe requiring
> force_overwrite would bring more attention.

We discussed that initially here:
https://patchwork.kernel.org/patch/10479083/

I worked that up and what ends up happening is that, since we don't have
any context for how the topology will be used, if at all, we print the
error every time. If the user specified stripe parameters manually, the
topology won't be used. They won't care if it's broken and certainly
don't need to force it.

Lastly, this wasn't encountered in the real world on some weird discount
hardware. It's a pretty big product from a major storage vendor. I've
advised them to fix their firmware but we still need to get users
rolling again. Warning about a potential suboptimal result is enough,
IMO. It's not an emergency situation that will result in a completely
broken file system.

-Jeff
On Fri, Jul 20, 2018 at 09:19:23AM -0700, Darrick J. Wong wrote:
> On Fri, Jul 20, 2018 at 05:55:29PM +0200, Carlos Maiolino wrote:
> > On Thu, Jul 19, 2018 at 05:23:22PM -0400, Jeff Mahoney wrote:
> > > Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
> > > AG alignment code into a separate function. It got rid of
> > > redundant checks for dswidth != 0 since calc_stripe_factors was
> > > supposed to guarantee that if dsunit is non-zero dswidth will be
> > > as well. Unfortunately, there's hardware out there that reports its
> > > optimal i/o size as larger than the maximum i/o size, which the kernel
> > > treats as broken and zeros out the optimal i/o size. We'll accept
> > > the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
> > > in align_ag_geometry.
> > >
> > > To resolve this we can check the topology before consuming it, default
> > > to using the stripe unit as the stripe width, and warn the user about it.
> > >
> >
> > I wonder if this shouldn't go into blkid_get_topology since something is wrong
> > with the information reported by the storage.
>
> If the storage gives us crap geometry information we don't have to use
> it. Keep the message that we autodetected nonsense and are dropping it;
> the sysadmin can always re-run mkfs with sensible dsunit/dswidth.
>
> > And require a force_overwrite to continue, at this point, something looks quite
> > wrong in the storage, and I think this is the last 'resource' a sysadmin will
> > have to notice this before making the FS, and start using it, so, maybe requiring
> > force_overwrite would bring more attention.
>
> I prefer reserving -f for "This is about to destroy something and can't
> be undone", not "This auto-optimization is screwed up, continue? (y/N)"

Yeah, I agree, forget everything I said here :P
On 7/19/18 4:23 PM, Jeff Mahoney wrote:
> Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
> AG alignment code into a separate function. It got rid of
> redundant checks for dswidth != 0 since calc_stripe_factors was
> supposed to guarantee that if dsunit is non-zero dswidth will be
> as well. Unfortunately, there's hardware out there that reports its
> optimal i/o size as larger than the maximum i/o size, which the kernel
> treats as broken and zeros out the optimal i/o size. We'll accept
> the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
> in align_ag_geometry.
>
> To resolve this we can check the topology before consuming it, default
> to using the stripe unit as the stripe width, and warn the user about it.
>
> Fixes: 051b4e37f5e (mkfs: factor AG alignment)
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Looks fine to me logically. Sorry for nitpicking a patch again (it's
a character flaw) but I'd like to massage this slightly:

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 1074886..231542f 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -2281,6 +2281,16 @@ _("data stripe width (%d) must be a multiple of the data stripe unit (%d)\n"),
 
 	/* if no stripe config set, use the device default */
 	if (!dsunit) {
+		/* Watch out for nonsense from device */
+		if (ft->dsunit && ft->dswidth == 0) {
+			fprintf(stderr,
+_("%s: Volume reports stripe unit of %d bytes but stripe width of 0.\n"),
+				progname, ft->dsunit << 9);
+			fprintf(stderr,
+_("Using stripe width of %d bytes, which may not be optimal.\n"),
+				ft->dsunit << 9);
+			ft->dswidth = ft->dsunit;
+		}
 		dsunit = ft->dsunit;
 		dswidth = ft->dswidth;
 		use_dev = true;

to make it a bit more clear that we're checking the /device/-reported
topology (by looking at ft before using it) and also to break up
the long warning message into < 80 char lines. OK?

This all seems a little messy yet (an inherited mess) but that's slightly
clearer to me.

Thanks,
-Eric
On 7/30/18 9:10 PM, Eric Sandeen wrote:
> On 7/19/18 4:23 PM, Jeff Mahoney wrote:
>> Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
>> AG alignment code into a separate function. It got rid of
>> redundant checks for dswidth != 0 since calc_stripe_factors was
>> supposed to guarantee that if dsunit is non-zero dswidth will be
>> as well. Unfortunately, there's hardware out there that reports its
>> optimal i/o size as larger than the maximum i/o size, which the kernel
>> treats as broken and zeros out the optimal i/o size. We'll accept
>> the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
>> in align_ag_geometry.
>>
>> To resolve this we can check the topology before consuming it, default
>> to using the stripe unit as the stripe width, and warn the user about it.
>>
>> Fixes: 051b4e37f5e (mkfs: factor AG alignment)
>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>
> Looks fine to me logically. Sorry for nitpicking a patch again (it's
> a character flaw) but I'd like to massage this slightly:
>
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index 1074886..231542f 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -2281,6 +2281,16 @@ _("data stripe width (%d) must be a multiple of the data stripe unit (%d)\n"),
>
> 	/* if no stripe config set, use the device default */
> 	if (!dsunit) {
> +		/* Watch out for nonsense from device */
> +		if (ft->dsunit && ft->dswidth == 0) {
> +			fprintf(stderr,
> +_("%s: Volume reports stripe unit of %d bytes but stripe width of 0.\n"),
> +				progname, ft->dsunit << 9);
> +			fprintf(stderr,
> +_("Using stripe width of %d bytes, which may not be optimal.\n"),
> +				ft->dsunit << 9);
> +			ft->dswidth = ft->dsunit;
> +		}
> 		dsunit = ft->dsunit;
> 		dswidth = ft->dswidth;
> 		use_dev = true;
>
> to make it a bit more clear that we're checking the /device/-reported
> topology (by looking at ft before using it) and also to break up
> the long warning message into < 80 char lines. OK?
>
> This all seems a little messy yet (an inherited mess) but that's slightly
> clearer to me.

Hm, though now I'm half tempted to put all the dswidth-vs-dsunit checks
in a helper, and if it fails on the commandline values, usage(); on
detected values, set to 0 with a warning, as it does here anyway:

	/*
	 * now we have our stripe config, check it's a multiple of block
	 * size.
	 */
	if ((BBTOB(dsunit) % cfg->blocksize) ||
	    (BBTOB(dswidth) % cfg->blocksize)) {
		if (!use_dev) {
...
		}
		dsunit = 0;
		dswidth = 0;
		cfg->sb_feat.nodalign = true;)

and let the user respecify if they wish. *shrug* I may follow up with
another patch if it works out.

-Eric
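To make the helper idea above concrete, here is a minimal standalone sketch. It is not xfsprogs code and not the follow-up patch mentioned in the thread: `check_stripe`, `enum stripe_check` and the `from_cmdline` flag are hypothetical names chosen for illustration. The only rules it encodes are the ones quoted in this thread (a stripe unit needs a non-zero stripe width, and the width must be a multiple of the unit), with values in 512-byte sectors.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result codes; the real mkfs code calls usage() instead. */
enum stripe_check {
	STRIPE_OK,		/* values are usable as-is */
	STRIPE_RESET,		/* detected values were bad and were zeroed */
	STRIPE_USAGE_ERROR	/* user-supplied values were bad */
};

/*
 * Hypothetical helper: validate a stripe unit/width pair given in
 * 512-byte sectors.  Bad command-line values are reported as a usage
 * error and left untouched; bad autodetected values are dropped with a
 * warning so the caller can continue without stripe alignment.
 */
enum stripe_check check_stripe(int *dsunit, int *dswidth, bool from_cmdline)
{
	bool bad;

	/* No striping requested or detected: nothing to check. */
	if (*dsunit == 0 && *dswidth == 0)
		return STRIPE_OK;

	/* One value without the other, or a width that is not a multiple. */
	bad = (*dsunit == 0 || *dswidth == 0 || (*dswidth % *dsunit) != 0);
	if (!bad)
		return STRIPE_OK;

	if (from_cmdline)
		return STRIPE_USAGE_ERROR;

	fprintf(stderr,
"warning: ignoring detected stripe unit %d / width %d (bytes)\n",
		*dsunit << 9, *dswidth << 9);
	*dsunit = 0;
	*dswidth = 0;
	return STRIPE_RESET;
}
```

A caller would pass the command-line values with `from_cmdline` set and print usage on `STRIPE_USAGE_ERROR`, or pass the detected values with it cleared and carry on unaligned after `STRIPE_RESET`. Whether a bad detected pair should be zeroed (as here) or patched up (as in the posted fix) is exactly the open question in the thread.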
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index a135e06e..35542e57 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -2295,6 +2295,12 @@ _("data stripe width (%d) must be a multiple of the data stripe unit (%d)\n"),
 	if (!dsunit) {
 		dsunit = ft->dsunit;
 		dswidth = ft->dswidth;
+		if (dsunit && dswidth == 0) {
+			fprintf(stderr,
+_("%s: Volume reports stripe unit of %d bytes but stripe width of 0. Using stripe width of %d bytes, which may not be optimal.\n"),
+				progname, dsunit << 9, dsunit << 9);
+			dswidth = dsunit;
+		}
 		use_dev = true;
 	} else {
 		/* check and warn is alignment is sub-optimal */
Commit 051b4e37f5e (mkfs: factor AG alignment) factored out the
AG alignment code into a separate function. It got rid of
redundant checks for dswidth != 0 since calc_stripe_factors was
supposed to guarantee that if dsunit is non-zero dswidth will be
as well. Unfortunately, there's hardware out there that reports its
optimal i/o size as larger than the maximum i/o size, which the kernel
treats as broken and zeros out the optimal i/o size. We'll accept
the multi-sector dsunit but have a zero dswidth and hit a divide-by-zero
in align_ag_geometry.

To resolve this we can check the topology before consuming it, default
to using the stripe unit as the stripe width, and warn the user about it.

Fixes: 051b4e37f5e (mkfs: factor AG alignment)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 mkfs/xfs_mkfs.c | 6 ++++++
 1 file changed, 6 insertions(+)
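The failure and the fallback described in the commit message can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not the mkfs.xfs source: `struct topology`, `sanitize_topology` and `ag_is_stripe_aligned` are hypothetical names, and the modulo in `ag_is_stripe_aligned` only stands in for whatever arithmetic on dswidth faults inside align_ag_geometry. As in the patch, values are in 512-byte sectors and `<< 9` converts them to bytes for the message.

```c
#include <stdio.h>

/* Hypothetical stand-in for the device-reported topology (512-byte sectors). */
struct topology {
	int dsunit;	/* stripe unit */
	int dswidth;	/* stripe width; 0 if the kernel discarded the device's value */
};

/*
 * If the device reports a stripe unit but no stripe width, warn and use
 * the stripe unit as the width.  Returns 1 if a fix-up was applied.
 */
int sanitize_topology(struct topology *ft, const char *progname)
{
	if (ft->dsunit && ft->dswidth == 0) {
		fprintf(stderr,
"%s: Volume reports stripe unit of %d bytes but stripe width of 0.\n",
			progname, ft->dsunit << 9);
		fprintf(stderr,
"Using stripe width of %d bytes, which may not be optimal.\n",
			ft->dsunit << 9);
		ft->dswidth = ft->dsunit;
		return 1;
	}
	return 0;
}

/*
 * Illustrative consumer: any division or modulo by dswidth like this one
 * is undefined when dswidth is 0, which is why the topology has to be
 * checked before it is consumed.
 */
int ag_is_stripe_aligned(int agsize, const struct topology *ft)
{
	return (agsize % ft->dswidth) == 0;
}
```

Calling `sanitize_topology()` before `ag_is_stripe_aligned()` guarantees a non-zero divisor whenever a stripe unit is present. A device that reports neither value is left alone, so the consumer must still be skipped in that case.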