Message ID | 20201009120009.243108-1-vaibhav@linux.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9e52527d1b951f21e6ed06ec95693b33a2cb9f51 |
Headers | show |
Series | [ndctl] libndctl: Fix probe of non-nfit nvdimms | expand |
On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote: > commit 107a24ff429f ("ndctl/list: Add firmware activation > enumeration") introduced changes in add_dimm() to enumerate the status > of firmware activation. However a branch added in that commit broke > the probe for non-nfit nvdimms like one provided by papr-scm. This > cause an error reported when listing namespaces like below: > > $ sudo ndctl list > libndctl: add_dimm: nmem0: probe failed: No such device > libndctl: __sysfs_device_parse: nmem0: add_dev() failed > > Do a fix for this by removing the offending branch in the add_dimm() > patch. This continues the flow of add_dimm() probe even if the nfit is > not detected on the associated bus. > > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration") > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> > --- > ndctl/lib/libndctl.c | 3 --- > 1 file changed, 3 deletions(-) Ah apologies - this snuck in when I reflowed Dan's patches on top of the papr work for v70. I expect you'd like a point release with this fix asap? Is there a way for me to incorporate some papr unit tests into my release workflow so I can avoid breaking things like this again? I'll also try to do a better job of pushing things out to the pending branch more frequently so if you're monitoring that branch, hopefully things like this will get caught before a release happens :) > > diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c > index 554696386f48..ad521d365ee4 100644 > --- a/ndctl/lib/libndctl.c > +++ b/ndctl/lib/libndctl.c > @@ -1875,9 +1875,6 @@ static void *add_dimm(void *parent, int id, const char *dimm_base) > else > dimm->fwa_result = fwa_result_to_result(buf); > > - if (!ndctl_bus_has_nfit(bus)) > - goto out; > - > /* Check if the given dimm supports nfit */ > if (ndctl_bus_has_nfit(bus)) { > dimm->formats = formats;
On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L <vishal.l.verma@intel.com> wrote: > > On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote: > > commit 107a24ff429f ("ndctl/list: Add firmware activation > > enumeration") introduced changes in add_dimm() to enumerate the status > > of firmware activation. However a branch added in that commit broke > > the probe for non-nfit nvdimms like one provided by papr-scm. This > > cause an error reported when listing namespaces like below: > > > > $ sudo ndctl list > > libndctl: add_dimm: nmem0: probe failed: No such device > > libndctl: __sysfs_device_parse: nmem0: add_dev() failed > > > > Do a fix for this by removing the offending branch in the add_dimm() > > patch. This continues the flow of add_dimm() probe even if the nfit is > > not detected on the associated bus. > > > > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration") > > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> > > --- > > ndctl/lib/libndctl.c | 3 --- > > 1 file changed, 3 deletions(-) > > Ah apologies - this snuck in when I reflowed Dan's patches on top of the > papr work for v70. > > I expect you'd like a point release with this fix asap? > > Is there a way for me to incorporate some papr unit tests into my > release workflow so I can avoid breaking things like this again? > > I'll also try to do a better job of pushing things out to the pending > branch more frequently so if you're monitoring that branch, hopefully > things like this will get caught before a release happens :) Would be nice to have something like a papr_test next to nfit_test for such regression testing. These kinds of mistakes are really only avoidable with regression tests.
Hi Dan and Vishal, Thanks so much for quick turnaround on this. Dan Williams <dan.j.williams@intel.com> writes: > On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L > <vishal.l.verma@intel.com> wrote: >> >> On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote: >> > commit 107a24ff429f ("ndctl/list: Add firmware activation >> > enumeration") introduced changes in add_dimm() to enumerate the status >> > of firmware activation. However a branch added in that commit broke >> > the probe for non-nfit nvdimms like one provided by papr-scm. This >> > cause an error reported when listing namespaces like below: >> > >> > $ sudo ndctl list >> > libndctl: add_dimm: nmem0: probe failed: No such device >> > libndctl: __sysfs_device_parse: nmem0: add_dev() failed >> > >> > Do a fix for this by removing the offending branch in the add_dimm() >> > patch. This continues the flow of add_dimm() probe even if the nfit is >> > not detected on the associated bus. >> > >> > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration") >> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> >> > --- >> > ndctl/lib/libndctl.c | 3 --- >> > 1 file changed, 3 deletions(-) >> >> Ah apologies - this snuck in when I reflowed Dan's patches on top of the >> papr work for v70. No worries :-) >> >> I expect you'd like a point release with this fix asap? Yes, that will be great. Thanks >> >> Is there a way for me to incorporate some papr unit tests into my >> release workflow so I can avoid breaking things like this again? >> >> I'll also try to do a better job of pushing things out to the pending >> branch more frequently so if you're monitoring that branch, hopefully >> things like this will get caught before a release happens :) Fully agree, if that happens we can incorporate it into our CI system to ensure that such regressions are caught early on before any release is tagged. > > Would be nice to have something like a papr_test next to nfit_test for > such regression testing. These kinds of mistakes are really only > avoidable with regression tests. Yes Agree, fortunatly Santosh has recently posted an RFC patchset implementing such tests at [1]. Once that gets merged, can used to perform regression testing. [1] "testing/nvdimm: Add test module for non-nfit platforms" https://lore.kernel.org/linux-nvdimm/20201006010013.848302-1-santosh@fossix.org/
Vaibhav Jain <vaibhav@linux.ibm.com> writes: > Hi Dan and Vishal, > > Thanks so much for quick turnaround on this. > > Dan Williams <dan.j.williams@intel.com> writes: > >> On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L >> <vishal.l.verma@intel.com> wrote: >>> >>> On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote: >>> > commit 107a24ff429f ("ndctl/list: Add firmware activation >>> > enumeration") introduced changes in add_dimm() to enumerate the status >>> > of firmware activation. However a branch added in that commit broke >>> > the probe for non-nfit nvdimms like one provided by papr-scm. This >>> > cause an error reported when listing namespaces like below: >>> > >>> > $ sudo ndctl list >>> > libndctl: add_dimm: nmem0: probe failed: No such device >>> > libndctl: __sysfs_device_parse: nmem0: add_dev() failed >>> > >>> > Do a fix for this by removing the offending branch in the add_dimm() >>> > patch. This continues the flow of add_dimm() probe even if the nfit is >>> > not detected on the associated bus. >>> > >>> > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration") >>> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> >>> > --- >>> > ndctl/lib/libndctl.c | 3 --- >>> > 1 file changed, 3 deletions(-) >>> >>> Ah apologies - this snuck in when I reflowed Dan's patches on top of the >>> papr work for v70. > No worries :-) > >>> >>> I expect you'd like a point release with this fix asap? > Yes, that will be great. Thanks > >>> >>> Is there a way for me to incorporate some papr unit tests into my >>> release workflow so I can avoid breaking things like this again? >>> >>> I'll also try to do a better job of pushing things out to the pending >>> branch more frequently so if you're monitoring that branch, hopefully >>> things like this will get caught before a release happens :) > > Fully agree, if that happens we can incorporate it into our CI system to > ensure that such regressions are caught early on before any release is > tagged. > >> >> Would be nice to have something like a papr_test next to nfit_test for >> such regression testing. These kinds of mistakes are really only >> avoidable with regression tests. > Yes Agree, fortunatly Santosh has recently posted an RFC patchset > implementing such tests at [1]. Once that gets merged, can used to > perform regression testing. > > [1] "testing/nvdimm: Add test module for non-nfit platforms" > https://lore.kernel.org/linux-nvdimm/20201006010013.848302-1-santosh@fossix.org/ > Thanks Vaibhav to point that out. Dan/Vishal/Ira, If you could provide your comments on the above RFC we could move forward on this. Thanks, Santosh
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c index 554696386f48..ad521d365ee4 100644 --- a/ndctl/lib/libndctl.c +++ b/ndctl/lib/libndctl.c @@ -1875,9 +1875,6 @@ static void *add_dimm(void *parent, int id, const char *dimm_base) else dimm->fwa_result = fwa_result_to_result(buf); - if (!ndctl_bus_has_nfit(bus)) - goto out; - /* Check if the given dimm supports nfit */ if (ndctl_bus_has_nfit(bus)) { dimm->formats = formats;
commit 107a24ff429f ("ndctl/list: Add firmware activation enumeration") introduced changes in add_dimm() to enumerate the status of firmware activation. However a branch added in that commit broke the probe for non-nfit nvdimms like one provided by papr-scm. This cause an error reported when listing namespaces like below: $ sudo ndctl list libndctl: add_dimm: nmem0: probe failed: No such device libndctl: __sysfs_device_parse: nmem0: add_dev() failed Do a fix for this by removing the offending branch in the add_dimm() patch. This continues the flow of add_dimm() probe even if the nfit is not detected on the associated bus. Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> --- ndctl/lib/libndctl.c | 3 --- 1 file changed, 3 deletions(-)