diff mbox series

[v2] libnvdimm/region: Allow setting align attribute on regions without mappings

Message ID 20220830054505.1159488-1-tyhicks@linux.microsoft.com (mailing list archive)
State Accepted
Commit 2e5021cc42ba26c98fe83b973d774a999fa4f219
Headers show
Series [v2] libnvdimm/region: Allow setting align attribute on regions without mappings | expand

Commit Message

Tyler Hicks Aug. 30, 2022, 5:45 a.m. UTC
The alignment constraint for namespace creation in a region was
increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
attribute"). The thought behind the change was that region alignment
should be uniform across all architectures and, since PowerPC had the
largest alignment constraint of 16M, all architectures should conform to
that alignment.

The change regressed namespace creation in pre-defined regions that
relied on 2M alignment but a workaround was provided in the form of a
sysfs attribute, named 'align', that could be adjusted to a non-default
alignment value.

However, the sysfs attribute's store function returned an error (-ENXIO)
when userspace attempted to change the alignment of a region that had no
mappings. This affected 2M aligned regions of volatile memory that were
defined in a device tree using "pmem-region" and created by the
of_pmem_region_driver, since those regions do not contain mappings
(ndr_mappings is 0).

Allow userspace to set the align attribute on pre-existing regions that
do not have mappings so that namespaces can still be within those
regions, despite not being aligned to 16M.

Link: https://lore.kernel.org/lkml/CA+CK2bDJ3hrWoE91L2wpAk+Yu0_=GtYw=4gLDDD7mxs321b_aA@mail.gmail.com
Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---

While testing with a recent kernel release (6.0-rc3), I rediscovered
this bug and eventually realized that I never followed through with
fixing it upstream. After a year later, here's the v2 that Aneesh
requested. Sorry about that!

v2:
- Included Aneesh's feedback to ensure the val is a power of 2 and
  greater than PAGE_SIZE even for regions without mappings
- Reused the max_t() trick from default_align() to avoid special
  casing, with an if-else, when regions have mappings and when they
  don't
  + Didn't include Pavel's Reviewed-by since this is a slightly
    different approach than what he reviewed in v1
- Added a Link commit tag to Pavel's initial problem description
v1: https://lore.kernel.org/lkml/20210326152645.85225-1-tyhicks@linux.microsoft.com/

 drivers/nvdimm/region_devs.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Tyler Hicks Sept. 14, 2022, 2:37 p.m. UTC | #1
On 2022-08-30 00:45:05, Tyler Hicks wrote:
> The alignment constraint for namespace creation in a region was
> increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
> commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
> attribute"). The thought behind the change was that region alignment
> should be uniform across all architectures and, since PowerPC had the
> largest alignment constraint of 16M, all architectures should conform to
> that alignment.
> 
> The change regressed namespace creation in pre-defined regions that
> relied on 2M alignment but a workaround was provided in the form of a
> sysfs attribute, named 'align', that could be adjusted to a non-default
> alignment value.
> 
> However, the sysfs attribute's store function returned an error (-ENXIO)
> when userspace attempted to change the alignment of a region that had no
> mappings. This affected 2M aligned regions of volatile memory that were
> defined in a device tree using "pmem-region" and created by the
> of_pmem_region_driver, since those regions do not contain mappings
> (ndr_mappings is 0).
> 
> Allow userspace to set the align attribute on pre-existing regions that
> do not have mappings so that namespaces can still be within those
> regions, despite not being aligned to 16M.
> 
> Link: https://lore.kernel.org/lkml/CA+CK2bDJ3hrWoE91L2wpAk+Yu0_=GtYw=4gLDDD7mxs321b_aA@mail.gmail.com
> Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute")
> Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> ---

Friendly ping. I'm hoping this fix can be considered for v6.1. Thanks!

Tyler

> 
> While testing with a recent kernel release (6.0-rc3), I rediscovered
> this bug and eventually realized that I never followed through with
> fixing it upstream. After a year later, here's the v2 that Aneesh
> requested. Sorry about that!
> 
> v2:
> - Included Aneesh's feedback to ensure the val is a power of 2 and
>   greater than PAGE_SIZE even for regions without mappings
> - Reused the max_t() trick from default_align() to avoid special
>   casing, with an if-else, when regions have mappings and when they
>   don't
>   + Didn't include Pavel's Reviewed-by since this is a slightly
>     different approach than what he reviewed in v1
> - Added a Link commit tag to Pavel's initial problem description
> v1: https://lore.kernel.org/lkml/20210326152645.85225-1-tyhicks@linux.microsoft.com/
> 
>  drivers/nvdimm/region_devs.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index 473a71bbd9c9..550ea0bd6c53 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -509,16 +509,13 @@ static ssize_t align_store(struct device *dev,
>  {
>  	struct nd_region *nd_region = to_nd_region(dev);
>  	unsigned long val, dpa;
> -	u32 remainder;
> +	u32 mappings, remainder;
>  	int rc;
>  
>  	rc = kstrtoul(buf, 0, &val);
>  	if (rc)
>  		return rc;
>  
> -	if (!nd_region->ndr_mappings)
> -		return -ENXIO;
> -
>  	/*
>  	 * Ensure space-align is evenly divisible by the region
>  	 * interleave-width because the kernel typically has no facility
> @@ -526,7 +523,8 @@ static ssize_t align_store(struct device *dev,
>  	 * contribute to the tail capacity in system-physical-address
>  	 * space for the namespace.
>  	 */
> -	dpa = div_u64_rem(val, nd_region->ndr_mappings, &remainder);
> +	mappings = max_t(u32, 1, nd_region->ndr_mappings);
> +	dpa = div_u64_rem(val, mappings, &remainder);
>  	if (!is_power_of_2(dpa) || dpa < PAGE_SIZE
>  			|| val > region_size(nd_region) || remainder)
>  		return -EINVAL;
> -- 
> 2.25.1
>
Jeff Moyer Sept. 26, 2022, 8:18 p.m. UTC | #2
Tyler Hicks <tyhicks@linux.microsoft.com> writes:

> The alignment constraint for namespace creation in a region was
> increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
> commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
> attribute"). The thought behind the change was that region alignment
> should be uniform across all architectures and, since PowerPC had the
> largest alignment constraint of 16M, all architectures should conform to
> that alignment.
>
> The change regressed namespace creation in pre-defined regions that
> relied on 2M alignment but a workaround was provided in the form of a
> sysfs attribute, named 'align', that could be adjusted to a non-default
> alignment value.
>
> However, the sysfs attribute's store function returned an error (-ENXIO)
> when userspace attempted to change the alignment of a region that had no
> mappings. This affected 2M aligned regions of volatile memory that were
> defined in a device tree using "pmem-region" and created by the
> of_pmem_region_driver, since those regions do not contain mappings
> (ndr_mappings is 0).
>
> Allow userspace to set the align attribute on pre-existing regions that
> do not have mappings so that namespaces can still be within those
> regions, despite not being aligned to 16M.
>
> Link: https://lore.kernel.org/lkml/CA+CK2bDJ3hrWoE91L2wpAk+Yu0_=GtYw=4gLDDD7mxs321b_aA@mail.gmail.com
> Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute")
> Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> ---
>
> While testing with a recent kernel release (6.0-rc3), I rediscovered
> this bug and eventually realized that I never followed through with
> fixing it upstream. After a year later, here's the v2 that Aneesh
> requested. Sorry about that!
>
> v2:
> - Included Aneesh's feedback to ensure the val is a power of 2 and
>   greater than PAGE_SIZE even for regions without mappings
> - Reused the max_t() trick from default_align() to avoid special
>   casing, with an if-else, when regions have mappings and when they
>   don't
>   + Didn't include Pavel's Reviewed-by since this is a slightly
>     different approach than what he reviewed in v1
> - Added a Link commit tag to Pavel's initial problem description
> v1: https://lore.kernel.org/lkml/20210326152645.85225-1-tyhicks@linux.microsoft.com/
>
>  drivers/nvdimm/region_devs.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index 473a71bbd9c9..550ea0bd6c53 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -509,16 +509,13 @@ static ssize_t align_store(struct device *dev,
>  {
>  	struct nd_region *nd_region = to_nd_region(dev);
>  	unsigned long val, dpa;
> -	u32 remainder;
> +	u32 mappings, remainder;
>  	int rc;
>  
>  	rc = kstrtoul(buf, 0, &val);
>  	if (rc)
>  		return rc;
>  
> -	if (!nd_region->ndr_mappings)
> -		return -ENXIO;
> -
>  	/*
>  	 * Ensure space-align is evenly divisible by the region
>  	 * interleave-width because the kernel typically has no facility
> @@ -526,7 +523,8 @@ static ssize_t align_store(struct device *dev,
>  	 * contribute to the tail capacity in system-physical-address
>  	 * space for the namespace.
>  	 */
> -	dpa = div_u64_rem(val, nd_region->ndr_mappings, &remainder);
> +	mappings = max_t(u32, 1, nd_region->ndr_mappings);
> +	dpa = div_u64_rem(val, mappings, &remainder);
>  	if (!is_power_of_2(dpa) || dpa < PAGE_SIZE
>  			|| val > region_size(nd_region) || remainder)
>  		return -EINVAL;

The math all looks okay, and this matches what's done in default_align.
Unfortunately, I don't know enough about the power architecture to
understand how you can have a region with no dimms (ndr_mappings == 0).

-Jeff
Tyler Hicks Sept. 26, 2022, 8:29 p.m. UTC | #3
On 2022-09-26 16:18:18, Jeff Moyer wrote:
> Tyler Hicks <tyhicks@linux.microsoft.com> writes:
> 
> > The alignment constraint for namespace creation in a region was
> > increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
> > commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
> > attribute"). The thought behind the change was that region alignment
> > should be uniform across all architectures and, since PowerPC had the
> > largest alignment constraint of 16M, all architectures should conform to
> > that alignment.
> >
> > The change regressed namespace creation in pre-defined regions that
> > relied on 2M alignment but a workaround was provided in the form of a
> > sysfs attribute, named 'align', that could be adjusted to a non-default
> > alignment value.
> >
> > However, the sysfs attribute's store function returned an error (-ENXIO)
> > when userspace attempted to change the alignment of a region that had no
> > mappings. This affected 2M aligned regions of volatile memory that were
> > defined in a device tree using "pmem-region" and created by the
> > of_pmem_region_driver, since those regions do not contain mappings
> > (ndr_mappings is 0).
> >
> > Allow userspace to set the align attribute on pre-existing regions that
> > do not have mappings so that namespaces can still be within those
> > regions, despite not being aligned to 16M.
> >
> > Link: https://lore.kernel.org/lkml/CA+CK2bDJ3hrWoE91L2wpAk+Yu0_=GtYw=4gLDDD7mxs321b_aA@mail.gmail.com
> > Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute")
> > Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> > ---
> >
> > While testing with a recent kernel release (6.0-rc3), I rediscovered
> > this bug and eventually realized that I never followed through with
> > fixing it upstream. After a year later, here's the v2 that Aneesh
> > requested. Sorry about that!
> >
> > v2:
> > - Included Aneesh's feedback to ensure the val is a power of 2 and
> >   greater than PAGE_SIZE even for regions without mappings
> > - Reused the max_t() trick from default_align() to avoid special
> >   casing, with an if-else, when regions have mappings and when they
> >   don't
> >   + Didn't include Pavel's Reviewed-by since this is a slightly
> >     different approach than what he reviewed in v1
> > - Added a Link commit tag to Pavel's initial problem description
> > v1: https://lore.kernel.org/lkml/20210326152645.85225-1-tyhicks@linux.microsoft.com/
> >
> >  drivers/nvdimm/region_devs.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index 473a71bbd9c9..550ea0bd6c53 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -509,16 +509,13 @@ static ssize_t align_store(struct device *dev,
> >  {
> >  	struct nd_region *nd_region = to_nd_region(dev);
> >  	unsigned long val, dpa;
> > -	u32 remainder;
> > +	u32 mappings, remainder;
> >  	int rc;
> >  
> >  	rc = kstrtoul(buf, 0, &val);
> >  	if (rc)
> >  		return rc;
> >  
> > -	if (!nd_region->ndr_mappings)
> > -		return -ENXIO;
> > -
> >  	/*
> >  	 * Ensure space-align is evenly divisible by the region
> >  	 * interleave-width because the kernel typically has no facility
> > @@ -526,7 +523,8 @@ static ssize_t align_store(struct device *dev,
> >  	 * contribute to the tail capacity in system-physical-address
> >  	 * space for the namespace.
> >  	 */
> > -	dpa = div_u64_rem(val, nd_region->ndr_mappings, &remainder);
> > +	mappings = max_t(u32, 1, nd_region->ndr_mappings);
> > +	dpa = div_u64_rem(val, mappings, &remainder);
> >  	if (!is_power_of_2(dpa) || dpa < PAGE_SIZE
> >  			|| val > region_size(nd_region) || remainder)
> >  		return -EINVAL;
> 
> The math all looks okay, and this matches what's done in default_align.
> Unfortunately, I don't know enough about the power architecture to
> understand how you can have a region with no dimms (ndr_mappings == 0).

Thanks for having a look!

FWIW, I need this working on arm64. It previously did before the commit
mentioned in the Fixes line. ndr_mappings is 0 when defining a
pmem-region in the device tree. The region is also marked as 'volatile'
but I don't recall if that contributes to ndr_mappings being 0.

Tyler

> 
> -Jeff
>
Dan Williams Sept. 30, 2022, 12:21 a.m. UTC | #4
Tyler Hicks wrote:
> On 2022-09-26 16:18:18, Jeff Moyer wrote:
> > Tyler Hicks <tyhicks@linux.microsoft.com> writes:
> > 
> > > The alignment constraint for namespace creation in a region was
> > > increased, from 2M to 16M, for non-PowerPC architectures in v5.7 with
> > > commit 2522afb86a8c ("libnvdimm/region: Introduce an 'align'
> > > attribute"). The thought behind the change was that region alignment
> > > should be uniform across all architectures and, since PowerPC had the
> > > largest alignment constraint of 16M, all architectures should conform to
> > > that alignment.
> > >
> > > The change regressed namespace creation in pre-defined regions that
> > > relied on 2M alignment but a workaround was provided in the form of a
> > > sysfs attribute, named 'align', that could be adjusted to a non-default
> > > alignment value.
> > >
> > > However, the sysfs attribute's store function returned an error (-ENXIO)
> > > when userspace attempted to change the alignment of a region that had no
> > > mappings. This affected 2M aligned regions of volatile memory that were
> > > defined in a device tree using "pmem-region" and created by the
> > > of_pmem_region_driver, since those regions do not contain mappings
> > > (ndr_mappings is 0).
> > >
> > > Allow userspace to set the align attribute on pre-existing regions that
> > > do not have mappings so that namespaces can still be within those
> > > regions, despite not being aligned to 16M.
> > >
> > > Link: https://lore.kernel.org/lkml/CA+CK2bDJ3hrWoE91L2wpAk+Yu0_=GtYw=4gLDDD7mxs321b_aA@mail.gmail.com
> > > Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute")
> > > Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> > > ---
> > >
> > > While testing with a recent kernel release (6.0-rc3), I rediscovered
> > > this bug and eventually realized that I never followed through with
> > > fixing it upstream. After a year later, here's the v2 that Aneesh
> > > requested. Sorry about that!
> > >
> > > v2:
> > > - Included Aneesh's feedback to ensure the val is a power of 2 and
> > >   greater than PAGE_SIZE even for regions without mappings
> > > - Reused the max_t() trick from default_align() to avoid special
> > >   casing, with an if-else, when regions have mappings and when they
> > >   don't
> > >   + Didn't include Pavel's Reviewed-by since this is a slightly
> > >     different approach than what he reviewed in v1
> > > - Added a Link commit tag to Pavel's initial problem description
> > > v1: https://lore.kernel.org/lkml/20210326152645.85225-1-tyhicks@linux.microsoft.com/
> > >
> > >  drivers/nvdimm/region_devs.c | 8 +++-----
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > > index 473a71bbd9c9..550ea0bd6c53 100644
> > > --- a/drivers/nvdimm/region_devs.c
> > > +++ b/drivers/nvdimm/region_devs.c
> > > @@ -509,16 +509,13 @@ static ssize_t align_store(struct device *dev,
> > >  {
> > >  	struct nd_region *nd_region = to_nd_region(dev);
> > >  	unsigned long val, dpa;
> > > -	u32 remainder;
> > > +	u32 mappings, remainder;
> > >  	int rc;
> > >  
> > >  	rc = kstrtoul(buf, 0, &val);
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > -	if (!nd_region->ndr_mappings)
> > > -		return -ENXIO;
> > > -
> > >  	/*
> > >  	 * Ensure space-align is evenly divisible by the region
> > >  	 * interleave-width because the kernel typically has no facility
> > > @@ -526,7 +523,8 @@ static ssize_t align_store(struct device *dev,
> > >  	 * contribute to the tail capacity in system-physical-address
> > >  	 * space for the namespace.
> > >  	 */
> > > -	dpa = div_u64_rem(val, nd_region->ndr_mappings, &remainder);
> > > +	mappings = max_t(u32, 1, nd_region->ndr_mappings);
> > > +	dpa = div_u64_rem(val, mappings, &remainder);
> > >  	if (!is_power_of_2(dpa) || dpa < PAGE_SIZE
> > >  			|| val > region_size(nd_region) || remainder)
> > >  		return -EINVAL;
> > 
> > The math all looks okay, and this matches what's done in default_align.
> > Unfortunately, I don't know enough about the power architecture to
> > understand how you can have a region with no dimms (ndr_mappings == 0).
> 
> Thanks for having a look!
> 
> FWIW, I need this working on arm64. It previously did before the commit
> mentioned in the Fixes line. ndr_mappings is 0 when defining a
> pmem-region in the device tree. The region is also marked as 'volatile'
> but I don't recall if that contributes to ndr_mappings being 0.

Oh, ndr_mappings being 0 can also happen for cases where the platform
just does not advertise any DIMMs / nmemX devices that map to a region.
It is not possible for ndr_mappings to change at run time as they are
only established at nd_region_create() time. The change looks good to
me.
diff mbox series

Patch

diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 473a71bbd9c9..550ea0bd6c53 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -509,16 +509,13 @@  static ssize_t align_store(struct device *dev,
 {
 	struct nd_region *nd_region = to_nd_region(dev);
 	unsigned long val, dpa;
-	u32 remainder;
+	u32 mappings, remainder;
 	int rc;
 
 	rc = kstrtoul(buf, 0, &val);
 	if (rc)
 		return rc;
 
-	if (!nd_region->ndr_mappings)
-		return -ENXIO;
-
 	/*
 	 * Ensure space-align is evenly divisible by the region
 	 * interleave-width because the kernel typically has no facility
@@ -526,7 +523,8 @@  static ssize_t align_store(struct device *dev,
 	 * contribute to the tail capacity in system-physical-address
 	 * space for the namespace.
 	 */
-	dpa = div_u64_rem(val, nd_region->ndr_mappings, &remainder);
+	mappings = max_t(u32, 1, nd_region->ndr_mappings);
+	dpa = div_u64_rem(val, mappings, &remainder);
 	if (!is_power_of_2(dpa) || dpa < PAGE_SIZE
 			|| val > region_size(nd_region) || remainder)
 		return -EINVAL;