Message ID | 20181024200148.12597-1-vishal.l.verma@intel.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 5d96c9342c23ee1d084802dcf064caa67ecaa45b |
Series | nfit, mce: only handle uncorrectable machine checks |
On Wed, Oct 24, 2018 at 1:03 PM Vishal Verma <vishal.l.verma@intel.com> wrote:
>
> We only want a machine check error to be added to libnvdimm's 'badblock'
> if it was an uncorrectable error. Currently we insert both corrected and
> uncorrectable errors. Add a check in the nfit mce handler to filter out
> corrected mce events.
>
> Reported-by: Omar Avelar <omar.avelar@intel.com>
> Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
> Cc: stable@vger.kernel.org
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>

Looks good, will let this sit in -next until the back half of the merge window.

Drop stable@ from CC.

On Wed, Oct 24, 2018 at 02:01:48PM -0600, Vishal Verma wrote:
> We only want a machine check error to be added to libnvdimm's 'badblock'
> if it was an uncorrectable error.

What is libnvdimm's 'badblock' ?

Also, pls write in the commit message *why* you want only UE errors.

Also, write your commit message in impartial tone, without the "we".

Thx.

On Thu, 2018-10-25 at 12:04 +0200, Borislav Petkov wrote:
> Drop stable@ from CC.
>
> On Wed, Oct 24, 2018 at 02:01:48PM -0600, Vishal Verma wrote:
> > We only want a machine check error to be added to libnvdimm's
> > 'badblock'
> > if it was an uncorrectable error.
>
> What is libnvdimm's 'badblock' ?
>
> Also, pls write in the commit message *why* you want only UE errors.
>
> Also, write your commit message in impartial tone, without the "we".

Hi Borislav,

Thanks for the feedback, I'll send a new revision with better
explanations in the changelog.

Thanks,
-Vishal

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 3a17107594c8..3111b3cee2ee 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s
 
 int mce_available(struct cpuinfo_x86 *c);
 bool mce_is_memory_error(struct mce *m);
+bool mce_is_correctable(struct mce *m);
 
 DECLARE_PER_CPU(unsigned, mce_exception_count);
 DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 953b3ce92dcc..27015948bc41 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -534,7 +534,7 @@ bool mce_is_memory_error(struct mce *m)
 }
 EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
-static bool mce_is_correctable(struct mce *m)
+bool mce_is_correctable(struct mce *m)
 {
 	if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
 		return false;
@@ -544,6 +544,7 @@ static bool mce_is_correctable(struct mce *m)
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(mce_is_correctable);
 
 static bool cec_add_mce(struct mce *m)
 {
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index e9626bf6ca29..7a51707f87e9 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -25,8 +25,8 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 	struct acpi_nfit_desc *acpi_desc;
 	struct nfit_spa *nfit_spa;
 
-	/* We only care about memory errors */
-	if (!mce_is_memory_error(mce))
+	/* We only care about uncorrectable memory errors */
+	if (!mce_is_memory_error(mce) || mce_is_correctable(mce))
 		return NOTIFY_DONE;
 
 	/*

We only want a machine check error to be added to libnvdimm's 'badblock'
if it was an uncorrectable error. Currently we insert both corrected and
uncorrectable errors. Add a check in the nfit mce handler to filter out
corrected mce events.

Reported-by: Omar Avelar <omar.avelar@intel.com>
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Cc: stable@vger.kernel.org
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 arch/x86/include/asm/mce.h       | 1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
 drivers/acpi/nfit/mce.c          | 4 ++--
 3 files changed, 5 insertions(+), 3 deletions(-)
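
For readers who want to see the filter outside the kernel tree, here is a minimal userspace sketch of the decision the patch adds to the nfit mce handler. It is not kernel code. Every `sketch_` and `SKETCH_` name is hypothetical, the bit positions are stand-ins for the kernel's `MCI_STATUS_UC` and `MCI_STATUS_DEFERRED`, and the `is_memory_error` field stands in for the result of `mce_is_memory_error()`. The uncorrected-status test in `sketch_is_correctable()` is also an assumption, since the hunk above does not show the middle of `mce_is_correctable()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in status bits for this sketch only; the kernel headers define
 * the real MCI_STATUS_UC and MCI_STATUS_DEFERRED. */
#define SKETCH_STATUS_UC       (1ULL << 61)
#define SKETCH_STATUS_DEFERRED (1ULL << 44)

/* Hypothetical vendor tag, standing in for m->cpuvendor. */
enum sketch_vendor { SKETCH_VENDOR_INTEL, SKETCH_VENDOR_AMD };

/* Reduced model of struct mce: only the fields the filter looks at. */
struct sketch_mce {
	uint64_t status;
	enum sketch_vendor cpuvendor;
	bool is_memory_error; /* stands in for mce_is_memory_error() */
};

/* Models mce_is_correctable(): an event is not correctable if it is a
 * deferred error on AMD, or (assumed) if the uncorrected bit is set. */
bool sketch_is_correctable(const struct sketch_mce *m)
{
	assert(m != NULL);

	if (m->cpuvendor == SKETCH_VENDOR_AMD &&
	    (m->status & SKETCH_STATUS_DEFERRED))
		return false;

	if (m->status & SKETCH_STATUS_UC)
		return false;

	return true;
}

/* Models the patched early-return test in the nfit handler: only
 * uncorrectable memory errors go on to be recorded. */
bool sketch_should_record(const struct sketch_mce *m)
{
	if (!m->is_memory_error || sketch_is_correctable(m))
		return false; /* corresponds to returning NOTIFY_DONE */

	return true;
}
```

In this model a corrected memory error (no uncorrected bit set) is skipped, while an uncorrected memory error, or a deferred error on AMD, is passed on for recording.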