diff mbox series

[ndctl] libndctl: Fix probe of non-nfit nvdimms

Message ID 20201009120009.243108-1-vaibhav@linux.ibm.com (mailing list archive)
State Accepted
Commit 9e52527d1b951f21e6ed06ec95693b33a2cb9f51
Headers show
Series [ndctl] libndctl: Fix probe of non-nfit nvdimms | expand

Commit Message

Vaibhav Jain Oct. 9, 2020, noon UTC
commit 107a24ff429f ("ndctl/list: Add firmware activation
enumeration") introduced changes in add_dimm() to enumerate the status
of firmware activation. However a branch added in that commit broke
the probe for non-nfit nvdimms like one provided by papr-scm. This
cause an error reported when listing namespaces like below:

$ sudo ndctl list
libndctl: add_dimm: nmem0: probe failed: No such device
libndctl: __sysfs_device_parse: nmem0: add_dev() failed

Do a fix for this by removing the offending branch in the add_dimm()
patch. This continues the flow of add_dimm() probe even if the nfit is
not detected on the associated bus.

Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 ndctl/lib/libndctl.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Verma, Vishal L Oct. 9, 2020, 6:36 p.m. UTC | #1
On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote:
> commit 107a24ff429f ("ndctl/list: Add firmware activation
> enumeration") introduced changes in add_dimm() to enumerate the status
> of firmware activation. However a branch added in that commit broke
> the probe for non-nfit nvdimms like one provided by papr-scm. This
> cause an error reported when listing namespaces like below:
> 
> $ sudo ndctl list
> libndctl: add_dimm: nmem0: probe failed: No such device
> libndctl: __sysfs_device_parse: nmem0: add_dev() failed
> 
> Do a fix for this by removing the offending branch in the add_dimm()
> patch. This continues the flow of add_dimm() probe even if the nfit is
> not detected on the associated bus.
> 
> Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  ndctl/lib/libndctl.c | 3 ---
>  1 file changed, 3 deletions(-)

Ah apologies - this snuck in when I reflowed Dan's patches on top of the
papr work for v70.

I expect you'd like a point release with this fix asap?

Is there a way for me to incorporate some papr unit tests into my
release workflow so I can avoid breaking things like this again?

I'll also try to do a better job of pushing things out to the pending
branch more frequently so if you're monitoring that branch, hopefully
things like this will get caught before a release happens :)

> 
> diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
> index 554696386f48..ad521d365ee4 100644
> --- a/ndctl/lib/libndctl.c
> +++ b/ndctl/lib/libndctl.c
> @@ -1875,9 +1875,6 @@ static void *add_dimm(void *parent, int id, const char *dimm_base)
>  	else
>  		dimm->fwa_result = fwa_result_to_result(buf);
>  
> -	if (!ndctl_bus_has_nfit(bus))
> -		goto out;
> -
>  	/* Check if the given dimm supports nfit */
>  	if (ndctl_bus_has_nfit(bus)) {
>  		dimm->formats = formats;
Dan Williams Oct. 9, 2020, 6:49 p.m. UTC | #2
On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L
<vishal.l.verma@intel.com> wrote:
>
> On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote:
> > commit 107a24ff429f ("ndctl/list: Add firmware activation
> > enumeration") introduced changes in add_dimm() to enumerate the status
> > of firmware activation. However a branch added in that commit broke
> > the probe for non-nfit nvdimms like one provided by papr-scm. This
> > cause an error reported when listing namespaces like below:
> >
> > $ sudo ndctl list
> > libndctl: add_dimm: nmem0: probe failed: No such device
> > libndctl: __sysfs_device_parse: nmem0: add_dev() failed
> >
> > Do a fix for this by removing the offending branch in the add_dimm()
> > patch. This continues the flow of add_dimm() probe even if the nfit is
> > not detected on the associated bus.
> >
> > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration")
> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> > ---
> >  ndctl/lib/libndctl.c | 3 ---
> >  1 file changed, 3 deletions(-)
>
> Ah apologies - this snuck in when I reflowed Dan's patches on top of the
> papr work for v70.
>
> I expect you'd like a point release with this fix asap?
>
> Is there a way for me to incorporate some papr unit tests into my
> release workflow so I can avoid breaking things like this again?
>
> I'll also try to do a better job of pushing things out to the pending
> branch more frequently so if you're monitoring that branch, hopefully
> things like this will get caught before a release happens :)

Would be nice to have something like a papr_test next to nfit_test for
such regression testing. These kinds of mistakes are really only
avoidable with regression tests.
Vaibhav Jain Oct. 10, 2020, 4:59 a.m. UTC | #3
Hi Dan and Vishal,

Thanks so much for quick turnaround on this.

Dan Williams <dan.j.williams@intel.com> writes:

> On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L
> <vishal.l.verma@intel.com> wrote:
>>
>> On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote:
>> > commit 107a24ff429f ("ndctl/list: Add firmware activation
>> > enumeration") introduced changes in add_dimm() to enumerate the status
>> > of firmware activation. However a branch added in that commit broke
>> > the probe for non-nfit nvdimms like one provided by papr-scm. This
>> > cause an error reported when listing namespaces like below:
>> >
>> > $ sudo ndctl list
>> > libndctl: add_dimm: nmem0: probe failed: No such device
>> > libndctl: __sysfs_device_parse: nmem0: add_dev() failed
>> >
>> > Do a fix for this by removing the offending branch in the add_dimm()
>> > patch. This continues the flow of add_dimm() probe even if the nfit is
>> > not detected on the associated bus.
>> >
>> > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration")
>> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> > ---
>> >  ndctl/lib/libndctl.c | 3 ---
>> >  1 file changed, 3 deletions(-)
>>
>> Ah apologies - this snuck in when I reflowed Dan's patches on top of the
>> papr work for v70.
No worries :-)

>>
>> I expect you'd like a point release with this fix asap?
Yes, that will be great. Thanks

>>
>> Is there a way for me to incorporate some papr unit tests into my
>> release workflow so I can avoid breaking things like this again?
>>
>> I'll also try to do a better job of pushing things out to the pending
>> branch more frequently so if you're monitoring that branch, hopefully
>> things like this will get caught before a release happens :)

Fully agree, if that happens we can incorporate it into our CI system to
ensure that such regressions are caught early on before any release is
tagged.

>
> Would be nice to have something like a papr_test next to nfit_test for
> such regression testing. These kinds of mistakes are really only
> avoidable with regression tests.
Yes Agree, fortunatly Santosh  has recently posted an RFC patchset
implementing such tests at [1]. Once that gets merged, can used to
perform regression testing.

[1] "testing/nvdimm: Add test module for non-nfit platforms"
https://lore.kernel.org/linux-nvdimm/20201006010013.848302-1-santosh@fossix.org/
Santosh Sivaraj Oct. 12, 2020, 4:51 a.m. UTC | #4
Vaibhav Jain <vaibhav@linux.ibm.com> writes:

> Hi Dan and Vishal,
>
> Thanks so much for quick turnaround on this.
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> On Fri, Oct 9, 2020 at 11:36 AM Verma, Vishal L
>> <vishal.l.verma@intel.com> wrote:
>>>
>>> On Fri, 2020-10-09 at 17:30 +0530, Vaibhav Jain wrote:
>>> > commit 107a24ff429f ("ndctl/list: Add firmware activation
>>> > enumeration") introduced changes in add_dimm() to enumerate the status
>>> > of firmware activation. However a branch added in that commit broke
>>> > the probe for non-nfit nvdimms like one provided by papr-scm. This
>>> > cause an error reported when listing namespaces like below:
>>> >
>>> > $ sudo ndctl list
>>> > libndctl: add_dimm: nmem0: probe failed: No such device
>>> > libndctl: __sysfs_device_parse: nmem0: add_dev() failed
>>> >
>>> > Do a fix for this by removing the offending branch in the add_dimm()
>>> > patch. This continues the flow of add_dimm() probe even if the nfit is
>>> > not detected on the associated bus.
>>> >
>>> > Fixes: 107a24ff429fa("ndctl/list: Add firmware activation enumeration")
>>> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>>> > ---
>>> >  ndctl/lib/libndctl.c | 3 ---
>>> >  1 file changed, 3 deletions(-)
>>>
>>> Ah apologies - this snuck in when I reflowed Dan's patches on top of the
>>> papr work for v70.
> No worries :-)
>
>>>
>>> I expect you'd like a point release with this fix asap?
> Yes, that will be great. Thanks
>
>>>
>>> Is there a way for me to incorporate some papr unit tests into my
>>> release workflow so I can avoid breaking things like this again?
>>>
>>> I'll also try to do a better job of pushing things out to the pending
>>> branch more frequently so if you're monitoring that branch, hopefully
>>> things like this will get caught before a release happens :)
>
> Fully agree, if that happens we can incorporate it into our CI system to
> ensure that such regressions are caught early on before any release is
> tagged.
>
>>
>> Would be nice to have something like a papr_test next to nfit_test for
>> such regression testing. These kinds of mistakes are really only
>> avoidable with regression tests.
> Yes Agree, fortunatly Santosh  has recently posted an RFC patchset
> implementing such tests at [1]. Once that gets merged, can used to
> perform regression testing.
>
> [1] "testing/nvdimm: Add test module for non-nfit platforms"
> https://lore.kernel.org/linux-nvdimm/20201006010013.848302-1-santosh@fossix.org/
>

Thanks Vaibhav to point that out.

Dan/Vishal/Ira,

If you could provide your comments on the above RFC we could move forward on
this.

Thanks,
Santosh
diff mbox series

Patch

diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index 554696386f48..ad521d365ee4 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -1875,9 +1875,6 @@  static void *add_dimm(void *parent, int id, const char *dimm_base)
 	else
 		dimm->fwa_result = fwa_result_to_result(buf);
 
-	if (!ndctl_bus_has_nfit(bus))
-		goto out;
-
 	/* Check if the given dimm supports nfit */
 	if (ndctl_bus_has_nfit(bus)) {
 		dimm->formats = formats;