Message ID | 155977193862.2443951.10284714500308539570.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: Sub-section memory hotplug support | expand |
On Wed, 05 Jun 2019 14:58:58 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > At namespace creation time there is the potential for the "expected to > be zero" fields of a 'pfn' info-block to be filled with indeterminate > data. While the kernel buffer is zeroed on allocation it is immediately > overwritten by nd_pfn_validate() filling it with the current contents of > the on-media info-block location. For fields like, 'flags' and the > 'padding' it potentially means that future implementations can not rely > on those fields being zero. > > In preparation to stop using the 'start_pad' and 'end_trunc' fields for > section alignment, arrange for fields that are not explicitly > initialized to be guaranteed zero. Bump the minor version to indicate it > is safe to assume the 'padding' and 'flags' are zero. Otherwise, this > corruption is expected to benign since all other critical fields are > explicitly initialized. > > Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") > Cc: <stable@vger.kernel.org> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> The cc:stable in [11/12] seems odd. Is this independent of the other patches? If so, shouldn't it be a standalone thing which can be prioritized?
On Thu, Jun 6, 2019 at 2:46 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Wed, 05 Jun 2019 14:58:58 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > > > At namespace creation time there is the potential for the "expected to > > be zero" fields of a 'pfn' info-block to be filled with indeterminate > > data. While the kernel buffer is zeroed on allocation it is immediately > > overwritten by nd_pfn_validate() filling it with the current contents of > > the on-media info-block location. For fields like, 'flags' and the > > 'padding' it potentially means that future implementations can not rely > > on those fields being zero. > > > > In preparation to stop using the 'start_pad' and 'end_trunc' fields for > > section alignment, arrange for fields that are not explicitly > > initialized to be guaranteed zero. Bump the minor version to indicate it > > is safe to assume the 'padding' and 'flags' are zero. Otherwise, this > > corruption is expected to benign since all other critical fields are > > explicitly initialized. > > > > Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") > > Cc: <stable@vger.kernel.org> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > The cc:stable in [11/12] seems odd. Is this independent of the other > patches? If so, shouldn't it be a standalone thing which can be > prioritized? > The cc: stable is about spreading this new policy to as many kernels as possible not fixing an issue in those kernels. It's not until patch 12 "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" as all previous kernel do initialize all fields. I'd be ok to drop that cc: stable, my concern is distros that somehow pickup and backport patch 12 and miss patch 11.
On Thu, 6 Jun 2019 15:06:26 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > On Thu, Jun 6, 2019 at 2:46 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > > > On Wed, 05 Jun 2019 14:58:58 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > > > > > At namespace creation time there is the potential for the "expected to > > > be zero" fields of a 'pfn' info-block to be filled with indeterminate > > > data. While the kernel buffer is zeroed on allocation it is immediately > > > overwritten by nd_pfn_validate() filling it with the current contents of > > > the on-media info-block location. For fields like, 'flags' and the > > > 'padding' it potentially means that future implementations can not rely > > > on those fields being zero. > > > > > > In preparation to stop using the 'start_pad' and 'end_trunc' fields for > > > section alignment, arrange for fields that are not explicitly > > > initialized to be guaranteed zero. Bump the minor version to indicate it > > > is safe to assume the 'padding' and 'flags' are zero. Otherwise, this > > > corruption is expected to benign since all other critical fields are > > > explicitly initialized. > > > > > > Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") > > > Cc: <stable@vger.kernel.org> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > The cc:stable in [11/12] seems odd. Is this independent of the other > > patches? If so, shouldn't it be a standalone thing which can be > > prioritized? > > > > The cc: stable is about spreading this new policy to as many kernels > as possible not fixing an issue in those kernels. It's not until patch > 12 "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" > as all previous kernel do initialize all fields. > > I'd be ok to drop that cc: stable, my concern is distros that somehow > pickup and backport patch 12 and miss patch 11. Could you please propose a changelog paragraph which explains all this to those who will be considering this patch for backports?
On Fri, Jun 7, 2019 at 12:54 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Thu, 6 Jun 2019 15:06:26 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > > > On Thu, Jun 6, 2019 at 2:46 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > > > > > On Wed, 05 Jun 2019 14:58:58 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > > > > > > > At namespace creation time there is the potential for the "expected to > > > > be zero" fields of a 'pfn' info-block to be filled with indeterminate > > > > data. While the kernel buffer is zeroed on allocation it is immediately > > > > overwritten by nd_pfn_validate() filling it with the current contents of > > > > the on-media info-block location. For fields like, 'flags' and the > > > > 'padding' it potentially means that future implementations can not rely > > > > on those fields being zero. > > > > > > > > In preparation to stop using the 'start_pad' and 'end_trunc' fields for > > > > section alignment, arrange for fields that are not explicitly > > > > initialized to be guaranteed zero. Bump the minor version to indicate it > > > > is safe to assume the 'padding' and 'flags' are zero. Otherwise, this > > > > corruption is expected to benign since all other critical fields are > > > > explicitly initialized. > > > > > > > > Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") > > > > Cc: <stable@vger.kernel.org> > > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > > > The cc:stable in [11/12] seems odd. Is this independent of the other > > > patches? If so, shouldn't it be a standalone thing which can be > > > prioritized? > > > > > > > The cc: stable is about spreading this new policy to as many kernels > > as possible not fixing an issue in those kernels. It's not until patch > > 12 "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" > > as all previous kernel do initialize all fields. > > > > I'd be ok to drop that cc: stable, my concern is distros that somehow > > pickup and backport patch 12 and miss patch 11. > > Could you please propose a changelog paragraph which explains all this > to those who will be considering this patch for backports? > Will do. I'll resend the series with this and the fixups proposed by Oscar.
Dan Williams <dan.j.williams@intel.com> writes: > At namespace creation time there is the potential for the "expected to > be zero" fields of a 'pfn' info-block to be filled with indeterminate > data. While the kernel buffer is zeroed on allocation it is immediately > overwritten by nd_pfn_validate() filling it with the current contents of > the on-media info-block location. For fields like, 'flags' and the > 'padding' it potentially means that future implementations can not rely > on those fields being zero. > > In preparation to stop using the 'start_pad' and 'end_trunc' fields for > section alignment, arrange for fields that are not explicitly > initialized to be guaranteed zero. Bump the minor version to indicate it > is safe to assume the 'padding' and 'flags' are zero. Otherwise, this > corruption is expected to benign since all other critical fields are > explicitly initialized. > > Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") > Cc: <stable@vger.kernel.org> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > drivers/nvdimm/dax_devs.c | 2 +- > drivers/nvdimm/pfn.h | 1 + > drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++--- > 3 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c > index 0453f49dc708..326f02ffca81 100644 > --- a/drivers/nvdimm/dax_devs.c > +++ b/drivers/nvdimm/dax_devs.c > @@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns) > nvdimm_bus_unlock(&ndns->dev); > if (!dax_dev) > return -ENOMEM; > - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); > + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); > nd_pfn->pfn_sb = pfn_sb; > rc = nd_pfn_validate(nd_pfn, DAX_SIG); > dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>"); > diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h > index dde9853453d3..e901e3a3b04c 100644 > --- a/drivers/nvdimm/pfn.h > +++ b/drivers/nvdimm/pfn.h > @@ -36,6 +36,7 @@ struct nd_pfn_sb { > __le32 end_trunc; > /* minor-version-2 record the base alignment of the mapping */ > __le32 align; > + /* minor-version-3 guarantee the padding and flags are zero */ > u8 padding[4000]; > __le64 checksum; > }; > diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c > index 01f40672507f..a2406253eb70 100644 > --- a/drivers/nvdimm/pfn_devs.c > +++ b/drivers/nvdimm/pfn_devs.c > @@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn) > return 0; > } > > +/** > + * nd_pfn_validate - read and validate info-block > + * @nd_pfn: fsdax namespace runtime state / properties > + * @sig: 'devdax' or 'fsdax' signature > + * > + * Upon return the info-block buffer contents (->pfn_sb) are > + * indeterminate when validation fails, and a coherent info-block > + * otherwise. > + */ > int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) > { > u64 checksum, offset; > @@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns) > nvdimm_bus_unlock(&ndns->dev); > if (!pfn_dev) > return -ENOMEM; > - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); > + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); > nd_pfn = to_nd_pfn(pfn_dev); > nd_pfn->pfn_sb = pfn_sb; > rc = nd_pfn_validate(nd_pfn, PFN_SIG); > @@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) > u64 checksum; > int rc; > > - pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); > + pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); > if (!pfn_sb) > return -ENOMEM; > > @@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) > sig = DAX_SIG; > else > sig = PFN_SIG; > + > rc = nd_pfn_validate(nd_pfn, sig); > if (rc != -ENODEV) > return rc; > > /* no info block, do init */; > + memset(pfn_sb, 0, sizeof(*pfn_sb)); > + > nd_region = to_nd_region(nd_pfn->dev.parent); > if (nd_region->ro) { > dev_info(&nd_pfn->dev, > @@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) > memcpy(pfn_sb->uuid, nd_pfn->uuid, 16); > memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16); > pfn_sb->version_major = cpu_to_le16(1); > - pfn_sb->version_minor = cpu_to_le16(2); > + pfn_sb->version_minor = cpu_to_le16(3); > pfn_sb->start_pad = cpu_to_le32(start_pad); > pfn_sb->end_trunc = cpu_to_le32(end_trunc); > pfn_sb->align = cpu_to_le32(nd_pfn->align); > How will this minor version 3 be used? If we are not having start_pad/end_trunc updated in pfn_sb, how will the older kernel enable these namesapces? Do we need a patch like https://lore.kernel.org/linux-mm/20190604091357.32213-2-aneesh.kumar@linux.ibm.com -aneesh
diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c index 0453f49dc708..326f02ffca81 100644 --- a/drivers/nvdimm/dax_devs.c +++ b/drivers/nvdimm/dax_devs.c @@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns) nvdimm_bus_unlock(&ndns->dev); if (!dax_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, DAX_SIG); dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>"); diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h index dde9853453d3..e901e3a3b04c 100644 --- a/drivers/nvdimm/pfn.h +++ b/drivers/nvdimm/pfn.h @@ -36,6 +36,7 @@ struct nd_pfn_sb { __le32 end_trunc; /* minor-version-2 record the base alignment of the mapping */ __le32 align; + /* minor-version-3 guarantee the padding and flags are zero */ u8 padding[4000]; __le64 checksum; }; diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 01f40672507f..a2406253eb70 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn) return 0; } +/** + * nd_pfn_validate - read and validate info-block + * @nd_pfn: fsdax namespace runtime state / properties + * @sig: 'devdax' or 'fsdax' signature + * + * Upon return the info-block buffer contents (->pfn_sb) are + * indeterminate when validation fails, and a coherent info-block + * otherwise. + */ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; @@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns) nvdimm_bus_unlock(&ndns->dev); if (!pfn_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn = to_nd_pfn(pfn_dev); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, PFN_SIG); @@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) u64 checksum; int rc; - pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); if (!pfn_sb) return -ENOMEM; @@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) sig = DAX_SIG; else sig = PFN_SIG; + rc = nd_pfn_validate(nd_pfn, sig); if (rc != -ENODEV) return rc; /* no info block, do init */; + memset(pfn_sb, 0, sizeof(*pfn_sb)); + nd_region = to_nd_region(nd_pfn->dev.parent); if (nd_region->ro) { dev_info(&nd_pfn->dev, @@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) memcpy(pfn_sb->uuid, nd_pfn->uuid, 16); memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16); pfn_sb->version_major = cpu_to_le16(1); - pfn_sb->version_minor = cpu_to_le16(2); + pfn_sb->version_minor = cpu_to_le16(3); pfn_sb->start_pad = cpu_to_le32(start_pad); pfn_sb->end_trunc = cpu_to_le32(end_trunc); pfn_sb->align = cpu_to_le32(nd_pfn->align);
At namespace creation time there is the potential for the "expected to be zero" fields of a 'pfn' info-block to be filled with indeterminate data. While the kernel buffer is zeroed on allocation it is immediately overwritten by nd_pfn_validate() filling it with the current contents of the on-media info-block location. For fields like, 'flags' and the 'padding' it potentially means that future implementations can not rely on those fields being zero. In preparation to stop using the 'start_pad' and 'end_trunc' fields for section alignment, arrange for fields that are not explicitly initialized to be guaranteed zero. Bump the minor version to indicate it is safe to assume the 'padding' and 'flags' are zero. Otherwise, this corruption is expected to benign since all other critical fields are explicitly initialized. Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/nvdimm/dax_devs.c | 2 +- drivers/nvdimm/pfn.h | 1 + drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++--- 3 files changed, 17 insertions(+), 4 deletions(-)