Message ID | 20200819143544.155096-2-alex.kluver@hpe.com (mailing list archive) |
---|---|
State | New, archived |
Series | UEFI v2.8 Memory Error Record Updates |
On Wed, Aug 19, 2020 at 09:35:43AM -0500, Alex Kluver wrote:
> Memory errors could be printed with incorrect row values since the DIMM
> size has outgrown the 16 bit row field in the CPER structure. UEFI
> Specification Version 2.8 has increased the size of row by allowing it to
> use the first 2 bits from a previously reserved space within the structure.
>
> When needed, add the extension bits to the row value printed.
>
> Based on UEFI 2.8 Table 299. Memory Error Record
>
> Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com>
> Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
> Tested-by: Russ Anderson <russ.anderson@hpe.com>
> Signed-off-by: Alex Kluver <alex.kluver@hpe.com>
> ---
>
> v1 -> v2:
>    * Add static inline cper_get_mem_extension() to make it
>      more readable, as suggested by Borislav Petkov.
>
>    * Add second patch for bank field, bank group, and chip id.
>
> ---
>  drivers/edac/ghes_edac.c    |  8 ++++++--
>  drivers/firmware/efi/cper.c |  9 +++++++--
>  include/linux/cper.h        | 16 ++++++++++++++--
>  3 files changed, 27 insertions(+), 6 deletions(-)

For the EDAC bits:

Acked-by: Borislav Petkov <bp@suse.de>

Also, I could take both through the EDAC tree, if people prefer.

On Tue, 15 Sep 2020 at 19:33, Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Aug 19, 2020 at 09:35:43AM -0500, Alex Kluver wrote:
> > Memory errors could be printed with incorrect row values since the DIMM
> > size has outgrown the 16 bit row field in the CPER structure. UEFI
> > Specification Version 2.8 has increased the size of row by allowing it to
> > use the first 2 bits from a previously reserved space within the structure.
> >
> > When needed, add the extension bits to the row value printed.
> >
> > Based on UEFI 2.8 Table 299. Memory Error Record
> >
> > Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com>
> > Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
> > Tested-by: Russ Anderson <russ.anderson@hpe.com>
> > Signed-off-by: Alex Kluver <alex.kluver@hpe.com>
> > ---
> >
> > v1 -> v2:
> >    * Add static inline cper_get_mem_extension() to make it
> >      more readable, as suggested by Borislav Petkov.
> >
> >    * Add second patch for bank field, bank group, and chip id.
> >
> > ---
> >  drivers/edac/ghes_edac.c    |  8 ++++++--
> >  drivers/firmware/efi/cper.c |  9 +++++++--
> >  include/linux/cper.h        | 16 ++++++++++++++--
> >  3 files changed, 27 insertions(+), 6 deletions(-)
>
> For the EDAC bits:
>
> Acked-by: Borislav Petkov <bp@suse.de>
>
> Also, I could take both through the EDAC tree, if people prefer.
>

I'll take this via the EFI tree - I was just preparing the branch for
a PR anyways.

On Tue, 15 Sep 2020 at 20:07, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 15 Sep 2020 at 19:33, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Wed, Aug 19, 2020 at 09:35:43AM -0500, Alex Kluver wrote:
> > > Memory errors could be printed with incorrect row values since the DIMM
> > > size has outgrown the 16 bit row field in the CPER structure. UEFI
> > > Specification Version 2.8 has increased the size of row by allowing it to
> > > use the first 2 bits from a previously reserved space within the structure.
> > >
> > > When needed, add the extension bits to the row value printed.
> > >
> > > Based on UEFI 2.8 Table 299. Memory Error Record
> > >
> > > Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com>
> > > Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
> > > Tested-by: Russ Anderson <russ.anderson@hpe.com>
> > > Signed-off-by: Alex Kluver <alex.kluver@hpe.com>
> > > ---
> > >
> > > v1 -> v2:
> > >    * Add static inline cper_get_mem_extension() to make it
> > >      more readable, as suggested by Borislav Petkov.
> > >
> > >    * Add second patch for bank field, bank group, and chip id.
> > >
> > > ---
> > >  drivers/edac/ghes_edac.c    |  8 ++++++--
> > >  drivers/firmware/efi/cper.c |  9 +++++++--
> > >  include/linux/cper.h        | 16 ++++++++++++++--
> > >  3 files changed, 27 insertions(+), 6 deletions(-)
> >
> > For the EDAC bits:
> >
> > Acked-by: Borislav Petkov <bp@suse.de>
> >
> > Also, I could take both through the EDAC tree, if people prefer.
> >
>
> I'll take this via the EFI tree - I was just preparing the branch for
> a PR anyways.

Alex - these patches do not apply cleanly. Could you please respin
them on top of the next branch in
https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git?

Boris - do you anticipate any conflicts? If so, please take these via
the EDAC tree - the CPER code is mostly self contained so I don't
expect any conflicts with the EFI tree in that case.

On Tue, Sep 15, 2020 at 08:12:31PM +0300, Ard Biesheuvel wrote:
> Boris - do you anticipate any conflicts? If so, please take these via
> the EDAC tree - the CPER code is mostly self contained so I don't
> expect any conflicts with the EFI tree in that case.

None so far, and I applied them for testing on top of my EDAC queue for
5.10 so it should be all good. But if you want me to, I can test-merge your
branch once ready, just in case...

On Tue, 15 Sep 2020 at 20:19, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Sep 15, 2020 at 08:12:31PM +0300, Ard Biesheuvel wrote:
> > Boris - do you anticipate any conflicts? If so, please take these via
> > the EDAC tree - the CPER code is mostly self contained so I don't
> > expect any conflicts with the EFI tree in that case.
>
> None so far, and I applied them for testing on top of my EDAC queue for
> 5.10 so it should be all good. But if you want me to, I can test-merge your
> branch once ready, just in case...
>

I managed to apply these patches by using a different base and
cherry-picking them into efi/next.

I expect to send out a couple of PRs tomorrow, once the bots have had
a go at building the branches. In the meantime, you can take a look at

git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git next

On Wed, Sep 16, 2020 at 04:09:36PM +0300, Ard Biesheuvel wrote:
> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git next
Looks good and no conflicts, builds fine too.
[boris@zn: ~/kernel/linux> git fetch efi
remote: Enumerating objects: 85, done.
remote: Counting objects: 100% (85/85), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 131 (delta 71), reused 85 (delta 71), pack-reused 46
Receiving objects: 100% (131/131), 113.14 KiB | 1.69 MiB/s, done.
Resolving deltas: 100% (89/89), completed with 33 local objects.
From git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
+ 84780c5438ef...744de4180a43 next -> efi/next (forced update)
fb1201aececc..46908326c6b8 urgent -> efi/urgent
* [new tag] efi-next-for-v5.10 -> efi-next-for-v5.10
* [new tag] efi-urgent-for-v5.9-rc5 -> efi-urgent-for-v5.9-rc5
* [new tag] efi-riscv-shared-for-v5.10 -> efi-riscv-shared-for-v5.10
[boris@zn: ~/kernel/linux> git checkout -b test-merge ras/edac-for-next
Branch 'test-merge' set up to track remote branch 'edac-for-next' from 'ras'.
Switched to a new branch 'test-merge'
[boris@zn: ~/kernel/linux> git merge efi/next
Auto-merging drivers/firmware/efi/libstub/efi-stub-helper.c
Auto-merging drivers/firmware/efi/efi.c
Auto-merging drivers/edac/ghes_edac.c
Auto-merging arch/x86/platform/efi/efi.c
Merge made by the 'recursive' strategy.
arch/arm/include/asm/efi.h | 23 +++--
arch/arm64/include/asm/efi.h | 5 +-
arch/x86/kernel/setup.c | 1 +
arch/x86/platform/efi/efi.c | 3 +
drivers/edac/ghes_edac.c | 17 +++-
drivers/firmware/efi/Makefile | 3 +-
drivers/firmware/efi/cper.c | 18 +++-
drivers/firmware/efi/{arm-init.c => efi-init.c} | 1 +
drivers/firmware/efi/efi.c | 6 ++
drivers/firmware/efi/libstub/arm32-stub.c | 178 +++++++---------------------------
drivers/firmware/efi/libstub/arm64-stub.c | 1 -
drivers/firmware/efi/libstub/efi-stub-helper.c | 101 +++++++++++++++++++-
drivers/firmware/efi/libstub/efi-stub.c | 48 +---------
drivers/firmware/efi/libstub/efistub.h | 61 +++++++++++-
drivers/firmware/efi/libstub/file.c | 5 +-
drivers/firmware/efi/libstub/relocate.c | 4 +-
drivers/firmware/efi/libstub/vsprintf.c | 2 +-
drivers/firmware/efi/mokvar-table.c | 360 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/cper.h | 24 ++++-
include/linux/efi.h | 34 +++++++
include/linux/pe.h | 3 +
security/integrity/platform_certs/load_uefi.c | 85 +++++++++++++----
22 files changed, 746 insertions(+), 237 deletions(-)
rename drivers/firmware/efi/{arm-init.c => efi-init.c} (99%)
create mode 100644 drivers/firmware/efi/mokvar-table.c
On Wed, Sep 16, 2020 at 08:10:30PM +0200, Borislav Petkov wrote:
> On Wed, Sep 16, 2020 at 04:09:36PM +0300, Ard Biesheuvel wrote:
> > git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git next
>
> Looks good and no conflicts, builds fine too.

Excellent. Thanks!

> [boris@zn: ~/kernel/linux> git fetch efi
> remote: Enumerating objects: 85, done.
> remote: Counting objects: 100% (85/85), done.
> remote: Compressing objects: 100% (14/14), done.
> remote: Total 131 (delta 71), reused 85 (delta 71), pack-reused 46
> Receiving objects: 100% (131/131), 113.14 KiB | 1.69 MiB/s, done.
> Resolving deltas: 100% (89/89), completed with 33 local objects.
> From git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
> + 84780c5438ef...744de4180a43 next -> efi/next (forced update)
> fb1201aececc..46908326c6b8 urgent -> efi/urgent
> * [new tag] efi-next-for-v5.10 -> efi-next-for-v5.10
> * [new tag] efi-urgent-for-v5.9-rc5 -> efi-urgent-for-v5.9-rc5
> * [new tag] efi-riscv-shared-for-v5.10 -> efi-riscv-shared-for-v5.10
> [boris@zn: ~/kernel/linux> git checkout -b test-merge ras/edac-for-next
> Branch 'test-merge' set up to track remote branch 'edac-for-next' from 'ras'.
> Switched to a new branch 'test-merge'
> [boris@zn: ~/kernel/linux> git merge efi/next
> Auto-merging drivers/firmware/efi/libstub/efi-stub-helper.c
> Auto-merging drivers/firmware/efi/efi.c
> Auto-merging drivers/edac/ghes_edac.c
> Auto-merging arch/x86/platform/efi/efi.c
> Merge made by the 'recursive' strategy.
> arch/arm/include/asm/efi.h | 23 +++--
> arch/arm64/include/asm/efi.h | 5 +-
> arch/x86/kernel/setup.c | 1 +
> arch/x86/platform/efi/efi.c | 3 +
> drivers/edac/ghes_edac.c | 17 +++-
> drivers/firmware/efi/Makefile | 3 +-
> drivers/firmware/efi/cper.c | 18 +++-
> drivers/firmware/efi/{arm-init.c => efi-init.c} | 1 +
> drivers/firmware/efi/efi.c | 6 ++
> drivers/firmware/efi/libstub/arm32-stub.c | 178 +++++++---------------------------
> drivers/firmware/efi/libstub/arm64-stub.c | 1 -
> drivers/firmware/efi/libstub/efi-stub-helper.c | 101 +++++++++++++++++++-
> drivers/firmware/efi/libstub/efi-stub.c | 48 +---------
> drivers/firmware/efi/libstub/efistub.h | 61 +++++++++++-
> drivers/firmware/efi/libstub/file.c | 5 +-
> drivers/firmware/efi/libstub/relocate.c | 4 +-
> drivers/firmware/efi/libstub/vsprintf.c | 2 +-
> drivers/firmware/efi/mokvar-table.c | 360 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/cper.h | 24 ++++-
> include/linux/efi.h | 34 +++++++
> include/linux/pe.h | 3 +
> security/integrity/platform_certs/load_uefi.c | 85 +++++++++++++----
> 22 files changed, 746 insertions(+), 237 deletions(-)
> rename drivers/firmware/efi/{arm-init.c => efi-init.c} (99%)
> create mode 100644 drivers/firmware/efi/mokvar-table.c
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index cb3dab56a875..98fcdaf72a09 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -337,8 +337,12 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 		p += sprintf(p, "rank:%d ", mem_err->rank);
 	if (mem_err->validation_bits & CPER_MEM_VALID_BANK)
 		p += sprintf(p, "bank:%d ", mem_err->bank);
-	if (mem_err->validation_bits & CPER_MEM_VALID_ROW)
-		p += sprintf(p, "row:%d ", mem_err->row);
+	if (mem_err->validation_bits & (CPER_MEM_VALID_ROW | CPER_MEM_VALID_ROW_EXT)) {
+		u32 row = mem_err->row;
+
+		row |= cper_get_mem_extension(mem_err->validation_bits, mem_err->extended);
+		p += sprintf(p, "row:%d ", row);
+	}
 	if (mem_err->validation_bits & CPER_MEM_VALID_COLUMN)
 		p += sprintf(p, "col:%d ", mem_err->column);
 	if (mem_err->validation_bits & CPER_MEM_VALID_BIT_POSITION)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index f564e15fbc7e..a60acd17bcaa 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -234,8 +234,12 @@ static int cper_mem_err_location(struct cper_mem_err_compact *mem, char *msg)
 		n += scnprintf(msg + n, len - n, "bank: %d ", mem->bank);
 	if (mem->validation_bits & CPER_MEM_VALID_DEVICE)
 		n += scnprintf(msg + n, len - n, "device: %d ", mem->device);
-	if (mem->validation_bits & CPER_MEM_VALID_ROW)
-		n += scnprintf(msg + n, len - n, "row: %d ", mem->row);
+	if (mem->validation_bits & (CPER_MEM_VALID_ROW | CPER_MEM_VALID_ROW_EXT)) {
+		u32 row = mem->row;
+
+		row |= cper_get_mem_extension(mem->validation_bits, mem->extended);
+		n += scnprintf(msg + n, len - n, "row: %d ", row);
+	}
 	if (mem->validation_bits & CPER_MEM_VALID_COLUMN)
 		n += scnprintf(msg + n, len - n, "column: %d ", mem->column);
 	if (mem->validation_bits & CPER_MEM_VALID_BIT_POSITION)
@@ -292,6 +296,7 @@ void cper_mem_err_pack(const struct cper_sec_mem_err *mem,
 	cmem->requestor_id = mem->requestor_id;
 	cmem->responder_id = mem->responder_id;
 	cmem->target_id = mem->target_id;
+	cmem->extended = mem->extended;
 	cmem->rank = mem->rank;
 	cmem->mem_array_handle = mem->mem_array_handle;
 	cmem->mem_dev_handle = mem->mem_dev_handle;
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 8537e9282a65..bd2d8a77a784 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -230,6 +230,10 @@ enum {
 #define CPER_MEM_VALID_RANK_NUMBER		0x8000
 #define CPER_MEM_VALID_CARD_HANDLE		0x10000
 #define CPER_MEM_VALID_MODULE_HANDLE		0x20000
+#define CPER_MEM_VALID_ROW_EXT			0x40000
+
+#define CPER_MEM_EXT_ROW_MASK			0x3
+#define CPER_MEM_EXT_ROW_SHIFT			16
 
 #define CPER_PCIE_VALID_PORT_TYPE		0x0001
 #define CPER_PCIE_VALID_VERSION			0x0002
@@ -443,7 +447,7 @@ struct cper_sec_mem_err_old {
 	u8	error_type;
 };
 
-/* Memory Error Section (UEFI >= v2.3), UEFI v2.7 sec N.2.5 */
+/* Memory Error Section (UEFI >= v2.3), UEFI v2.8 sec N.2.5 */
 struct cper_sec_mem_err {
 	u64	validation_bits;
 	u64	error_status;
@@ -461,7 +465,7 @@ struct cper_sec_mem_err {
 	u64	responder_id;
 	u64	target_id;
 	u8	error_type;
-	u8	reserved;
+	u8	extended;
 	u16	rank;
 	u16	mem_array_handle;	/* "card handle" in UEFI 2.4 */
 	u16	mem_dev_handle;		/* "module handle" in UEFI 2.4 */
@@ -483,8 +487,16 @@ struct cper_mem_err_compact {
 	u16	rank;
 	u16	mem_array_handle;
 	u16	mem_dev_handle;
+	u8	extended;
 };
 
+static inline u32 cper_get_mem_extension(u64 mem_valid, u8 mem_extended)
+{
+	if (!(mem_valid & CPER_MEM_VALID_ROW_EXT))
+		return 0;
+	return (mem_extended & CPER_MEM_EXT_ROW_MASK) << CPER_MEM_EXT_ROW_SHIFT;
+}
+
 /* PCI Express Error Section, UEFI v2.7 sec N.2.7 */
 struct cper_sec_pcie {
 	u64	validation_bits;
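
For readers following the thread, here is a minimal userspace sketch of the row arithmetic the patch performs. It is an illustration only, not kernel code: the three constants and cper_get_mem_extension() are copied from the diff above with the kernel types replaced by <stdint.h> equivalents, while full_row() is a hypothetical wrapper that does not exist in the patch and only mirrors the "row |= ..." step done inline in ghes_edac.c and cper.c.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch to include/linux/cper.h. */
#define CPER_MEM_VALID_ROW_EXT	0x40000
#define CPER_MEM_EXT_ROW_MASK	0x3
#define CPER_MEM_EXT_ROW_SHIFT	16

/*
 * Userspace copy of the helper added by the patch (u64/u8/u32 replaced
 * by stdint types). Returns the two extension bits already shifted into
 * bit positions 16..17, or 0 when the ROW_EXT validation bit is clear.
 */
static inline uint32_t cper_get_mem_extension(uint64_t mem_valid,
					      uint8_t mem_extended)
{
	if (!(mem_valid & CPER_MEM_VALID_ROW_EXT))
		return 0;
	return (uint32_t)(mem_extended & CPER_MEM_EXT_ROW_MASK)
		<< CPER_MEM_EXT_ROW_SHIFT;
}

/*
 * Hypothetical wrapper (an assumption for illustration, not in the patch):
 * OR the legacy 16-bit row with the extension bits to get an 18-bit row.
 */
static inline uint32_t full_row(uint64_t valid, uint16_t row, uint8_t extended)
{
	return (uint32_t)row | cper_get_mem_extension(valid, extended);
}
```

With the ROW_EXT validation bit set, a 16-bit row of 0xABCD and an extended byte of 0x2 combine to 0x2ABCD. With the bit clear, the extended byte is ignored and the row stays 0xABCD, which matches the pre-2.8 behaviour. Bits above the low two in the extended byte are masked off here, since the row only claims the first 2 bits of the formerly reserved field.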