Message ID | 20220730061757.3441537-1-jane.chu@oracle.com (mailing list archive)
---|---
State | New, archived
Series | [v5] x86/mce: retrieve poison range from hardware
> 	struct mce m;
> +	int lsb = PAGE_SHIFT;

Some maintainers like to order local declaration lines from longest to shortest

> +	/*
> +	 * Even if the ->validation_bits are set for address mask,
> +	 * to be extra safe, check and reject an error radius '0',
> +	 * and fallback to the default page size.
> +	 */
> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> +		lsb = __ffs64(mem_err->physical_addr_mask);
> +		if (lsb == 1)
> +			lsb = PAGE_SHIFT;
> +	}

The comment above __ffs64() says:

 * The result is not defined if no bits are set, so check that @word
 * is non-zero before calling this.

So if the intent is "extra safe" should check for that:

	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK &&
	    mem_err->physical_addr_mask) {
		lsb = __ffs64(mem_err->physical_addr_mask);
		if (lsb == 1)
			lsb = PAGE_SHIFT;
	}

-Tony
Jane Chu wrote:
> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
> poison granularity") that changed nfit_handle_mce() callback to report
> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
> because 0x1000 bytes is 8 512-byte.
>
> Dan Williams noticed that apei_mce_report_mem_error() hardcode
> the LSB field to PAGE_SHIFT instead of consulting the input
> struct cper_sec_mem_err record. So change to rely on hardware whenever
> support is available.
>
> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
> index 717192915f28..2c7ea0ba9dd7 100644
> --- a/arch/x86/kernel/cpu/mce/apei.c
> +++ b/arch/x86/kernel/cpu/mce/apei.c
> @@ -29,15 +29,27 @@
>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
>  {
>  	struct mce m;
> +	int lsb = PAGE_SHIFT;
>
>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>  		return;
>
> +	/*
> +	 * Even if the ->validation_bits are set for address mask,
> +	 * to be extra safe, check and reject an error radius '0',
> +	 * and fallback to the default page size.
> +	 */
> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> +		lsb = __ffs64(mem_err->physical_addr_mask);
> +		if (lsb == 1)

This was the reason I recommended hweight64 and min_not_zero() as
hweight64 does not have the undefined behavior. However, an even better
option is to just do:

	find_first_bit(&mem_err->physical_addr_mask, PAGE_SHIFT)

...as that trims the result to the PAGE_SHIFT max and handles the zero
case.
On 8/1/2022 8:58 AM, Luck, Tony wrote:
>> 	struct mce m;
>> +	int lsb = PAGE_SHIFT;
>
> Some maintainers like to order local declaration lines from longest to shortest
>
>> +	/*
>> +	 * Even if the ->validation_bits are set for address mask,
>> +	 * to be extra safe, check and reject an error radius '0',
>> +	 * and fallback to the default page size.
>> +	 */
>> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>> +		lsb = __ffs64(mem_err->physical_addr_mask);
>> +		if (lsb == 1)
>> +			lsb = PAGE_SHIFT;
>> +	}
>
> The comment above __ffs64() says:
>
>  * The result is not defined if no bits are set, so check that @word
>  * is non-zero before calling this.
>
> So if the intent is "extra safe" should check for that:
>
> 	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK &&
> 	    mem_err->physical_addr_mask) {
> 		lsb = __ffs64(mem_err->physical_addr_mask);
> 		if (lsb == 1)
> 			lsb = PAGE_SHIFT;
> 	}

Indeed, thanks a lot!
-jane

> -Tony
On 8/1/2022 9:44 AM, Dan Williams wrote:
> Jane Chu wrote:
>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>> poison granularity") that changed nfit_handle_mce() callback to report
>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>> because 0x1000 bytes is 8 512-byte.
>>
>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>> the LSB field to PAGE_SHIFT instead of consulting the input
>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>> support is available.
>>
>> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>
>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>>  arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
>> index 717192915f28..2c7ea0ba9dd7 100644
>> --- a/arch/x86/kernel/cpu/mce/apei.c
>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>> @@ -29,15 +29,27 @@
>>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
>>  {
>>  	struct mce m;
>> +	int lsb = PAGE_SHIFT;
>>
>>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>  		return;
>>
>> +	/*
>> +	 * Even if the ->validation_bits are set for address mask,
>> +	 * to be extra safe, check and reject an error radius '0',
>> +	 * and fallback to the default page size.
>> +	 */
>> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>> +		lsb = __ffs64(mem_err->physical_addr_mask);
>> +		if (lsb == 1)
>
> This was the reason I recommended hweight64 and min_not_zero() as
> hweight64 does not have the undefined behavior. However, an even better
> option is to just do:
>
> 	find_first_bit(&mem_err->physical_addr_mask, PAGE_SHIFT)
>
> ...as that trims the result to the PAGE_SHIFT max and handles the zero
> case.

Thanks Dan! However it looks like find_first_bit() could call into
__ffs(x) which has the same limitation as __ffs64(x), as Tony pointed out.

I'll post v6 shortly.

thanks,
-jane
Jane Chu wrote:
> On 8/1/2022 9:44 AM, Dan Williams wrote:
> > Jane Chu wrote:
> >> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
> >> poison granularity") that changed nfit_handle_mce() callback to report
> >> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> >> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
> >> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
> >> because 0x1000 bytes is 8 512-byte.
> >>
> >> Dan Williams noticed that apei_mce_report_mem_error() hardcode
> >> the LSB field to PAGE_SHIFT instead of consulting the input
> >> struct cper_sec_mem_err record. So change to rely on hardware whenever
> >> support is available.
> >>
> >> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
> >>
> >> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> >> ---
> >>  arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
> >>  1 file changed, 13 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
> >> index 717192915f28..2c7ea0ba9dd7 100644
> >> --- a/arch/x86/kernel/cpu/mce/apei.c
> >> +++ b/arch/x86/kernel/cpu/mce/apei.c
> >> @@ -29,15 +29,27 @@
> >>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
> >>  {
> >>  	struct mce m;
> >> +	int lsb = PAGE_SHIFT;
> >>
> >>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> >>  		return;
> >>
> >> +	/*
> >> +	 * Even if the ->validation_bits are set for address mask,
> >> +	 * to be extra safe, check and reject an error radius '0',
> >> +	 * and fallback to the default page size.
> >> +	 */
> >> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> >> +		lsb = __ffs64(mem_err->physical_addr_mask);
> >> +		if (lsb == 1)
> >
> > This was the reason I recommended hweight64 and min_not_zero() as
> > hweight64 does not have the undefined behavior. However, an even better
> > option is to just do:
> >
> > 	find_first_bit(&mem_err->physical_addr_mask, PAGE_SHIFT)
> >
> > ...as that trims the result to the PAGE_SHIFT max and handles the zero
> > case.
>
> Thanks Dan! However it looks like find_first_bit() could call into
> __ffs(x) which has the same limitation as __ffs64(x), as Tony pointed out.

Not quite, no. __ffs() behavior is *undefined* if the input is zero.
find_first_bit() is *defined* and returns @size if the input is zero.
Which is the behavior this wants to default to PAGE_SHIFT in the absence
of any smaller granularity information.
On 8/1/2022 2:20 PM, Dan Williams wrote:
> Jane Chu wrote:
>> On 8/1/2022 9:44 AM, Dan Williams wrote:
>>> Jane Chu wrote:
>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>>> poison granularity") that changed nfit_handle_mce() callback to report
>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>>> because 0x1000 bytes is 8 512-byte.
>>>>
>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>>> support is available.
>>>>
>>>> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>>
>>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>> ---
>>>>  arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
>>>> index 717192915f28..2c7ea0ba9dd7 100644
>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>>> @@ -29,15 +29,27 @@
>>>>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
>>>>  {
>>>>  	struct mce m;
>>>> +	int lsb = PAGE_SHIFT;
>>>>
>>>>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>>>  		return;
>>>>
>>>> +	/*
>>>> +	 * Even if the ->validation_bits are set for address mask,
>>>> +	 * to be extra safe, check and reject an error radius '0',
>>>> +	 * and fallback to the default page size.
>>>> +	 */
>>>> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>>> +		lsb = __ffs64(mem_err->physical_addr_mask);
>>>> +		if (lsb == 1)
>>>
>>> This was the reason I recommended hweight64 and min_not_zero() as
>>> hweight64 does not have the undefined behavior. However, an even better
>>> option is to just do:
>>>
>>> 	find_first_bit(&mem_err->physical_addr_mask, PAGE_SHIFT)
>>>
>>> ...as that trims the result to the PAGE_SHIFT max and handles the zero
>>> case.
>>
>> Thanks Dan! However it looks like find_first_bit() could call into
>> __ffs(x) which has the same limitation as __ffs64(x), as Tony pointed out.
>
> Not quite, no. __ffs() behavior is *undefined* if the input is zero.
> find_first_bit() is *defined* and returns @size if the input is zero.
> Which is the behavior this wants to default to PAGE_SHIFT in the absence
> of any smaller granularity information.

You're right, because of this line -

	return val ? __ffs(val) : size;

Thanks!
-jane
diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 717192915f28..2c7ea0ba9dd7 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -29,15 +29,27 @@
 void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
 {
 	struct mce m;
+	int lsb = PAGE_SHIFT;
 
 	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
 		return;
 
+	/*
+	 * Even if the ->validation_bits are set for address mask,
+	 * to be extra safe, check and reject an error radius '0',
+	 * and fallback to the default page size.
+	 */
+	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
+		lsb = __ffs64(mem_err->physical_addr_mask);
+		if (lsb == 1)
+			lsb = PAGE_SHIFT;
+	}
+
 	mce_setup(&m);
 	m.bank = -1;
 	/* Fake a memory read error with unknown channel */
 	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | MCI_STATUS_MISCV | 0x9f;
-	m.misc = (MCI_MISC_ADDR_PHYS << 6) | PAGE_SHIFT;
+	m.misc = (MCI_MISC_ADDR_PHYS << 6) | lsb;
 
 	if (severity >= GHES_SEV_RECOVERABLE)
 		m.status |= MCI_STATUS_UC;