Message ID | 20200527041244.37821-2-vaibhav@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Series | powerpc/papr_scm: Add support for reporting nvdimm health |
On Tue, May 26, 2020 at 9:13 PM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>
> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
> specification.
>

Please do a global s/SCM/PMEM/ or s/SCM/NVDIMM/. It's unfortunate that
we already have 2 ways to describe persistent memory devices, let's
not perpetuate a third so that "grep" has a chance to find
interrelated code across architectures. Other than that this looks
good to me.

> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> v7..v8:
> * Added a clarification on bit-ordering of Health Bitmap
>
> Resend:
> * None
>
> v6..v7:
> * None
>
> v5..v6:
> * New patch in the series
> ---
>  Documentation/powerpc/papr_hcalls.rst | 45 ++++++++++++++++++++++++---
>  1 file changed, 41 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
> index 3493631a60f8..45063f305813 100644
> --- a/Documentation/powerpc/papr_hcalls.rst
> +++ b/Documentation/powerpc/papr_hcalls.rst
> @@ -220,13 +220,50 @@ from the LPAR memory.
>  **H_SCM_HEALTH**
>
>  | Input: drcIndex
> -| Out: *health-bitmap, health-bit-valid-bitmap*
> +| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
>  | Return Value: *H_Success, H_Parameter, H_Hardware*
>
>  Given a DRC Index return the info on predictive failure and overall health of
> -the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
> -failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
> -valid.
> +the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
> +(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
> +which bits in health-bitmap are valid. The bits are reported in
> +reverse bit ordering for example a value of 0xC400000000000000
> +indicates bits 0, 1, and 5 are valid.
> +
> +Health Bitmap Flags:
> +
> ++------+-----------------------------------------------------------------------+
> +| Bit  | Definition                                                            |
> ++======+=======================================================================+
> +|  00  | SCM device is unable to persist memory contents.                      |
> +|      | If the system is powered down, nothing will be saved.                 |
> ++------+-----------------------------------------------------------------------+
> +|  01  | SCM device failed to persist memory contents. Either contents were not|
> +|      | saved successfully on power down or were not restored properly on     |
> +|      | power up.                                                             |
> ++------+-----------------------------------------------------------------------+
> +|  02  | SCM device contents are persisted from previous IPL. The data from    |
> +|      | the last boot were successfully restored.                             |
> ++------+-----------------------------------------------------------------------+
> +|  03  | SCM device contents are not persisted from previous IPL. There was no |
> +|      | data to restore from the last boot.                                   |
> ++------+-----------------------------------------------------------------------+
> +|  04  | SCM device memory life remaining is critically low                    |
> ++------+-----------------------------------------------------------------------+
> +|  05  | SCM device will be garded off next IPL due to failure                 |
> ++------+-----------------------------------------------------------------------+
> +|  06  | SCM contents cannot persist due to current platform health status. A  |
> +|      | hardware failure may prevent data from being saved or restored.       |
> ++------+-----------------------------------------------------------------------+
> +|  07  | SCM device is unable to persist memory contents in certain conditions |
> ++------+-----------------------------------------------------------------------+
> +|  08  | SCM device is encrypted                                               |
> ++------+-----------------------------------------------------------------------+
> +|  09  | SCM device has successfully completed a requested erase or secure     |
> +|      | erase procedure.                                                      |
> ++------+-----------------------------------------------------------------------+
> +|10:63 | Reserved / Unused                                                     |
> ++------+-----------------------------------------------------------------------+
>
>  **H_SCM_PERFORMANCE_STATS**
>
> --
> 2.26.2
>
Thanks for looking into this patchset Dan,

Dan Williams <dan.j.williams@intel.com> writes:

> On Tue, May 26, 2020 at 9:13 PM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>>
>> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
>> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
>> specification.
>>
>
> Please do a global s/SCM/PMEM/ or s/SCM/NVDIMM/. It's unfortunate that
> we already have 2 ways to describe persistent memory devices, let's
> not perpetuate a third so that "grep" has a chance to find
> interrelated code across architectures. Other than that this looks
> good to me.

Sure, will use PAPR_NVDIMM instead of PAPR_SCM for new code being
introduced. However, certain identifiers like H_SCM_HEALTH are taken from
the PAPR specification and hence need to keep the same name.

>
>> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>> Changelog:
>> v7..v8:
>> * Added a clarification on bit-ordering of Health Bitmap
>>
>> Resend:
>> * None
>>
>> v6..v7:
>> * None
>>
>> v5..v6:
>> * New patch in the series
>> ---
>>  Documentation/powerpc/papr_hcalls.rst | 45 ++++++++++++++++++++++++---
>>  1 file changed, 41 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
>> index 3493631a60f8..45063f305813 100644
>> --- a/Documentation/powerpc/papr_hcalls.rst
>> +++ b/Documentation/powerpc/papr_hcalls.rst
>> @@ -220,13 +220,50 @@ from the LPAR memory.
>>  **H_SCM_HEALTH**
>>
>>  | Input: drcIndex
>> -| Out: *health-bitmap, health-bit-valid-bitmap*
>> +| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
>>  | Return Value: *H_Success, H_Parameter, H_Hardware*
>>
>>  Given a DRC Index return the info on predictive failure and overall health of
>> -the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
>> -failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
>> -valid.
>> +the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
>> +(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
>> +which bits in health-bitmap are valid. The bits are reported in
>> +reverse bit ordering for example a value of 0xC400000000000000
>> +indicates bits 0, 1, and 5 are valid.
>> +
>> +Health Bitmap Flags:
>> +
>> ++------+-----------------------------------------------------------------------+
>> +| Bit  | Definition                                                            |
>> ++======+=======================================================================+
>> +|  00  | SCM device is unable to persist memory contents.                      |
>> +|      | If the system is powered down, nothing will be saved.                 |
>> ++------+-----------------------------------------------------------------------+
>> +|  01  | SCM device failed to persist memory contents. Either contents were not|
>> +|      | saved successfully on power down or were not restored properly on     |
>> +|      | power up.                                                             |
>> ++------+-----------------------------------------------------------------------+
>> +|  02  | SCM device contents are persisted from previous IPL. The data from    |
>> +|      | the last boot were successfully restored.                             |
>> ++------+-----------------------------------------------------------------------+
>> +|  03  | SCM device contents are not persisted from previous IPL. There was no |
>> +|      | data to restore from the last boot.                                   |
>> ++------+-----------------------------------------------------------------------+
>> +|  04  | SCM device memory life remaining is critically low                    |
>> ++------+-----------------------------------------------------------------------+
>> +|  05  | SCM device will be garded off next IPL due to failure                 |
>> ++------+-----------------------------------------------------------------------+
>> +|  06  | SCM contents cannot persist due to current platform health status. A  |
>> +|      | hardware failure may prevent data from being saved or restored.       |
>> ++------+-----------------------------------------------------------------------+
>> +|  07  | SCM device is unable to persist memory contents in certain conditions |
>> ++------+-----------------------------------------------------------------------+
>> +|  08  | SCM device is encrypted                                               |
>> ++------+-----------------------------------------------------------------------+
>> +|  09  | SCM device has successfully completed a requested erase or secure     |
>> +|      | erase procedure.                                                      |
>> ++------+-----------------------------------------------------------------------+
>> +|10:63 | Reserved / Unused                                                     |
>> ++------+-----------------------------------------------------------------------+
>>
>>  **H_SCM_PERFORMANCE_STATS**
>>
>> --
>> 2.26.2
>>
diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..45063f305813 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,50 @@ from the LPAR memory.
 **H_SCM_HEALTH**

 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*

 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400000000000000
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+| Bit  | Definition                                                            |
++======+=======================================================================+
+|  00  | SCM device is unable to persist memory contents.                      |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | SCM device failed to persist memory contents. Either contents were not|
+|      | saved successfully on power down or were not restored properly on     |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | SCM device contents are persisted from previous IPL. The data from    |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | SCM device contents are not persisted from previous IPL. There was no |
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | SCM device memory life remaining is critically low                    |
++------+-----------------------------------------------------------------------+
+|  05  | SCM device will be garded off next IPL due to failure                 |
++------+-----------------------------------------------------------------------+
+|  06  | SCM contents cannot persist due to current platform health status. A  |
+|      | hardware failure may prevent data from being saved or restored.       |
++------+-----------------------------------------------------------------------+
+|  07  | SCM device is unable to persist memory contents in certain conditions |
++------+-----------------------------------------------------------------------+
+|  08  | SCM device is encrypted                                               |
++------+-----------------------------------------------------------------------+
+|  09  | SCM device has successfully completed a requested erase or secure     |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+

 **H_SCM_PERFORMANCE_STATS**
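The "reverse bit ordering" described in the patch (bit 0 is the most significant bit of the 64-bit value) can be illustrated with a small sketch. This is not code from the series: the helper names `set_bits` and `valid_health_states` and the shortened flag descriptions are hypothetical, and combining the two bitmaps with a bitwise AND is an assumption about how a consumer would apply health-bit-valid-bitmap. Only the bit numbers and the example value 0xC400000000000000 come from the patch text.

```python
# Shortened descriptions of the Health Bitmap Flags table (bits 10:63 reserved).
HEALTH_FLAGS = {
    0: "unable to persist memory contents",
    1: "failed to persist memory contents",
    2: "contents persisted from previous IPL",
    3: "contents not persisted from previous IPL",
    4: "memory life remaining is critically low",
    5: "will be garded off next IPL due to failure",
    6: "contents cannot persist due to platform health status",
    7: "unable to persist memory contents in certain conditions",
    8: "encrypted",
    9: "completed a requested erase or secure erase",
}


def set_bits(bitmap, width=64):
    """Return the set bit numbers, counting bit 0 as the most significant bit."""
    return [n for n in range(width) if bitmap & (1 << (width - 1 - n))]


def valid_health_states(health, valid):
    """Describe the bits that are asserted in health and marked valid.

    Assumption: a health bit is only meaningful when the same bit is set
    in the valid bitmap, so the two values are ANDed together.
    """
    return [HEALTH_FLAGS.get(n, "reserved") for n in set_bits(health & valid)]


# The example value from the patch text decodes to bits 0, 1 and 5.
print(set_bits(0xC400000000000000))
```

With a hypothetical health value of 0xC000000000000000 and a valid mask of 0x8000000000000000, `valid_health_states` would report only bit 0, because bit 1 is asserted but not marked valid.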
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 45 ++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 4 deletions(-)