Message ID | 20240214200709.777166-3-Benjamin.Cheatham@amd.com |
---|---|
State | Superseded |
Headers | show |
Series | cxl, EINJ: Update EINJ for CXL error types | expand |
Ben Cheatham wrote: > Implement CXL helper functions in the EINJ module for getting/injecting > available CXL protocol error types and export them to sysfs under > kernel/debug/cxl. > > The kernel/debug/cxl/einj_types file will print the available CXL > protocol errors in the same format as the available_error_types > file provided by the EINJ module. The > kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the > error_type and error_inject files provided by the EINJ module, i.e.: > writing an error type into $dport_dev/einj_inject will inject said error > type into the CXL dport represented by $dport_dev. > > Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> [..] > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig > index 67998dbd1d46..d1fc3ce31fbb 100644 > --- a/drivers/cxl/Kconfig > +++ b/drivers/cxl/Kconfig > @@ -157,4 +157,16 @@ config CXL_PMU > monitoring units and provide standard perf based interfaces. > > If unsure say 'm'. > + > +config CXL_EINJ > + bool "CXL Error INJection Support" > + default ACPI_APEI_EINJ > + depends on ACPI_APEI_EINJ = CXL_BUS So I do not see CONFIG_CXL_EINJ used anywhere, not in a Makefile, not in a header file. My expectation is that if this variable is not set then no symbols from einj.ko are consumed by cxl_core.ko. > + help > + Support for CXL protocol Error INJection through debugfs/cxl. > + Availability and which errors are supported is dependent on > + the host platform. Look to ACPI v6.5 section 18.6.4 and kernel > + EINJ documentation for more information. > + > + If unsure say 'n' > endif [..] > diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h > new file mode 100644 > index 000000000000..92c0e2e37ad9 > --- /dev/null > +++ b/include/linux/einj-cxl.h > @@ -0,0 +1,40 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > +/* > + * CXL protocol Error INJection support. > + * > + * Copyright (c) 2023 Advanced Micro Devices, Inc. > + * All Rights Reserved. > + * > + * Author: Ben Cheatham <benjamin.cheatham@amd.com> > + */ > +#ifndef CXL_EINJ_H > +#define CXL_EINJ_H > + > +#include <linux/pci.h> > + > +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's the point of the config symbol? > +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); > +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); > +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); > +bool einj_is_initialized(void); > +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) > +static inline int einj_cxl_available_error_type_show(struct seq_file *m, > + void *v) > +{ > + return -ENXIO; > +} > + > +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) > +{ > + return -ENXIO; > +} > + > +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) > +{ > + return -ENXIO; > +} > + > +static inline bool einj_is_initialized(void) { return false; } > +#endif // CONFIG_ACPI_APEI_EINJ > + > +#endif // CXL_EINJ_H > -- > 2.34.1 > >
On Wed, 14 Feb 2024 14:07:08 -0600 Ben Cheatham <Benjamin.Cheatham@amd.com> wrote: > Implement CXL helper functions in the EINJ module for getting/injecting > available CXL protocol error types and export them to sysfs under > kernel/debug/cxl. > > The kernel/debug/cxl/einj_types file will print the available CXL > protocol errors in the same format as the available_error_types > file provided by the EINJ module. The > kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the > error_type and error_inject files provided by the EINJ module, i.e.: > writing an error type into $dport_dev/einj_inject will inject said error > type into the CXL dport represented by $dport_dev. > > Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Hi Ben, Just one trivial thing to add to Dan's comments. > diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h > new file mode 100644 > index 000000000000..92c0e2e37ad9 > --- /dev/null > +++ b/include/linux/einj-cxl.h > @@ -0,0 +1,40 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > +/* > + * CXL protocol Error INJection support. > + * > + * Copyright (c) 2023 Advanced Micro Devices, Inc. > + * All Rights Reserved. > + * > + * Author: Ben Cheatham <benjamin.cheatham@amd.com> > + */ > +#ifndef CXL_EINJ_H > +#define CXL_EINJ_H > + > +#include <linux/pci.h> > + > +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) > +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); > +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); > +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); > +bool einj_is_initialized(void); > +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) Whilst C++ Style comments are allowed for a few specific case, you should stick to local style. In the include directory a quick grep gave me loads of #else /* */ lines and only 2 #else // So C style comments preferred. > +static inline int einj_cxl_available_error_type_show(struct seq_file *m, > + void *v) > +{ > + return -ENXIO; > +} > + > +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) > +{ > + return -ENXIO; > +} > + > +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) > +{ > + return -ENXIO; > +} > + > +static inline bool einj_is_initialized(void) { return false; } > +#endif // CONFIG_ACPI_APEI_EINJ > + > +#endif // CXL_EINJ_H
On 2/14/24 8:25 PM, Dan Williams wrote: > Ben Cheatham wrote: >> Implement CXL helper functions in the EINJ module for getting/injecting >> available CXL protocol error types and export them to sysfs under >> kernel/debug/cxl. >> >> The kernel/debug/cxl/einj_types file will print the available CXL >> protocol errors in the same format as the available_error_types >> file provided by the EINJ module. The >> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the >> error_type and error_inject files provided by the EINJ module, i.e.: >> writing an error type into $dport_dev/einj_inject will inject said error >> type into the CXL dport represented by $dport_dev. >> >> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> > [..] >> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig >> index 67998dbd1d46..d1fc3ce31fbb 100644 >> --- a/drivers/cxl/Kconfig >> +++ b/drivers/cxl/Kconfig >> @@ -157,4 +157,16 @@ config CXL_PMU >> monitoring units and provide standard perf based interfaces. >> >> If unsure say 'm'. >> + >> +config CXL_EINJ >> + bool "CXL Error INJection Support" >> + default ACPI_APEI_EINJ >> + depends on ACPI_APEI_EINJ = CXL_BUS > > So I do not see CONFIG_CXL_EINJ used anywhere, not in a Makefile, not in > a header file. My expectation is that if this variable is not set then > no symbols from einj.ko are consumed by cxl_core.ko. > Yeah, you're right. More on this below. >> + help >> + Support for CXL protocol Error INJection through debugfs/cxl. >> + Availability and which errors are supported is dependent on >> + the host platform. Look to ACPI v6.5 section 18.6.4 and kernel >> + EINJ documentation for more information. >> + >> + If unsure say 'n' >> endif > [..] >> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h >> new file mode 100644 >> index 000000000000..92c0e2e37ad9 >> --- /dev/null >> +++ b/include/linux/einj-cxl.h >> @@ -0,0 +1,40 @@ >> +/* SPDX-License-Identifier: GPL-2.0-or-later */ >> +/* >> + * CXL protocol Error INJection support. >> + * >> + * Copyright (c) 2023 Advanced Micro Devices, Inc. >> + * All Rights Reserved. >> + * >> + * Author: Ben Cheatham <benjamin.cheatham@amd.com> >> + */ >> +#ifndef CXL_EINJ_H >> +#define CXL_EINJ_H >> + >> +#include <linux/pci.h> >> + >> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) > > Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's > the point of the config symbol? > So I've tried changing this to IS_ENABLED(CONFIG_CXL_EINJ) and always get redefinition errors that I can't figure out how to get around cleanly. I should've elaborated more in the last revision, but part of changing the dependency rule from ACPI_APEI_EINJ >= CXL_BUS to ACPI_APEI_EINJ = CXL_BUS is that the above guard stays as IS_ENABLED(CONFIG_ACPI_APEI_EINJ). I'm pretty sure the only thing this symbol is doing is enforcing the above dependency. I would love to be able to remove it at this point, but doing so would require moving the dependency to either the EINJ or CXL core modules, which sounds worse. I could implement one of the other solutions I outlined last revision, but I don't particularly like any of those (and I know you don't either :)). I think the solution here is to move the einj_cxl functions into a new file, gate that file by CONFIG_CXL_EINJ (or change the name to CONFIG_EINJ_CXL to match einj-cxl.h), and add declarations of the functions in the EINJ module used by said functions to drivers/acpi/apei/apei-internal.h. I'm not sure of another approach at this point, but if you have suggestions I'd be very happy to hear them! Thanks, Ben >> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); >> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); >> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); >> +bool einj_is_initialized(void); >> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) >> +static inline int einj_cxl_available_error_type_show(struct seq_file *m, >> + void *v) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline bool einj_is_initialized(void) { return false; } >> +#endif // CONFIG_ACPI_APEI_EINJ >> + >> +#endif // CXL_EINJ_H >> -- >> 2.34.1 >> >> > > >
On 2/15/24 3:33 AM, Jonathan Cameron wrote: > On Wed, 14 Feb 2024 14:07:08 -0600 > Ben Cheatham <Benjamin.Cheatham@amd.com> wrote: > >> Implement CXL helper functions in the EINJ module for getting/injecting >> available CXL protocol error types and export them to sysfs under >> kernel/debug/cxl. >> >> The kernel/debug/cxl/einj_types file will print the available CXL >> protocol errors in the same format as the available_error_types >> file provided by the EINJ module. The >> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the >> error_type and error_inject files provided by the EINJ module, i.e.: >> writing an error type into $dport_dev/einj_inject will inject said error >> type into the CXL dport represented by $dport_dev. >> >> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> > > Hi Ben, > > Just one trivial thing to add to Dan's comments. > >> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h >> new file mode 100644 >> index 000000000000..92c0e2e37ad9 >> --- /dev/null >> +++ b/include/linux/einj-cxl.h >> @@ -0,0 +1,40 @@ >> +/* SPDX-License-Identifier: GPL-2.0-or-later */ >> +/* >> + * CXL protocol Error INJection support. >> + * >> + * Copyright (c) 2023 Advanced Micro Devices, Inc. >> + * All Rights Reserved. >> + * >> + * Author: Ben Cheatham <benjamin.cheatham@amd.com> >> + */ >> +#ifndef CXL_EINJ_H >> +#define CXL_EINJ_H >> + >> +#include <linux/pci.h> >> + >> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) >> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); >> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); >> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); >> +bool einj_is_initialized(void); >> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) > Whilst C++ Style comments are allowed for a few specific case, you should stick > to local style. In the include directory a quick grep gave me loads of > #else /* */ > lines and only 2 > #else // > > So C style comments preferred. > You are 100% correct. I did a grep for examples and didn't notice that I did it from the base linux directory, so of course I found a bunch of examples of C++ style comments :/. I'll fix it in the next revision. Thanks, Ben > >> +static inline int einj_cxl_available_error_type_show(struct seq_file *m, >> + void *v) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) >> +{ >> + return -ENXIO; >> +} >> + >> +static inline bool einj_is_initialized(void) { return false; } >> +#endif // CONFIG_ACPI_APEI_EINJ >> + >> +#endif // CXL_EINJ_H >
Ben Cheatham wrote: [..] > >> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h > >> new file mode 100644 > >> index 000000000000..92c0e2e37ad9 > >> --- /dev/null > >> +++ b/include/linux/einj-cxl.h > >> @@ -0,0 +1,40 @@ > >> +/* SPDX-License-Identifier: GPL-2.0-or-later */ > >> +/* > >> + * CXL protocol Error INJection support. > >> + * > >> + * Copyright (c) 2023 Advanced Micro Devices, Inc. > >> + * All Rights Reserved. > >> + * > >> + * Author: Ben Cheatham <benjamin.cheatham@amd.com> > >> + */ > >> +#ifndef CXL_EINJ_H > >> +#define CXL_EINJ_H > >> + > >> +#include <linux/pci.h> > >> + > >> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) > > > > Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's > > the point of the config symbol? > > > [..] > I think the solution here is to move the einj_cxl functions into a new > file, gate that file by CONFIG_CXL_EINJ (or change the name to > CONFIG_EINJ_CXL to match einj-cxl.h), and add declarations of the > functions in the EINJ module used by said functions to > drivers/acpi/apei/apei-internal.h. I'm not sure of another approach at > this point, but if you have suggestions I'd be very happy to hear > them! Yes, this is the way.
diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl index fe61d372e3fa..4c0f62f881ca 100644 --- a/Documentation/ABI/testing/debugfs-cxl +++ b/Documentation/ABI/testing/debugfs-cxl @@ -33,3 +33,33 @@ Description: device cannot clear poison from the address, -ENXIO is returned. The clear_poison attribute is only visible for devices supporting the capability. + +What: /sys/kernel/debug/cxl/einj_types +Date: January, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) Prints the CXL protocol error types made available by + the platform in the format "0x<error number> <error type>". + The possible error types are (as of ACPI v6.5): + 0x1000 CXL.cache Protocol Correctable + 0x2000 CXL.cache Protocol Uncorrectable non-fatal + 0x4000 CXL.cache Protocol Uncorrectable fatal + 0x8000 CXL.mem Protocol Correctable + 0x10000 CXL.mem Protocol Uncorrectable non-fatal + 0x20000 CXL.mem Protocol Uncorrectable fatal + + The <error number> can be written to einj_inject to inject + <error type> into a chosen dport. + +What: /sys/kernel/debug/cxl/$dport_dev/einj_inject +Date: January, 2024 +KernelVersion: v6.9 +Contact: linux-cxl@vger.kernel.org +Description: + (WO) Writing an integer to this file injects the corresponding + CXL protocol error into $dport_dev ($dport_dev will be a device + name from /sys/bus/pci/devices). The integer to type mapping for + injection can be found by reading from einj_types. If the dport + was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise + a CXL 2.0 error is injected. diff --git a/MAINTAINERS b/MAINTAINERS index 73d898383e51..0980ed116dc2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5291,6 +5291,7 @@ S: Maintained F: drivers/cxl/ F: include/linux/cxl-event.h F: include/uapi/linux/cxl_mem.h +F: include/linux/einj-cxl.h F: tools/testing/cxl/ COMPUTE EXPRESS LINK PMU (CPMU) diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 4782ae0a0832..ab729a12dd7f 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -21,6 +21,7 @@ #include <linux/nmi.h> #include <linux/delay.h> #include <linux/mm.h> +#include <linux/einj-cxl.h> #include <linux/platform_device.h> #include <asm/unaligned.h> @@ -37,6 +38,20 @@ #define MEM_ERROR_MASK (ACPI_EINJ_MEMORY_CORRECTABLE | \ ACPI_EINJ_MEMORY_UNCORRECTABLE | \ ACPI_EINJ_MEMORY_FATAL) +#ifndef ACPI_EINJ_CXL_CACHE_CORRECTABLE +#define ACPI_EINJ_CXL_CACHE_CORRECTABLE BIT(12) +#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE BIT(13) +#define ACPI_EINJ_CXL_CACHE_FATAL BIT(14) +#define ACPI_EINJ_CXL_MEM_CORRECTABLE BIT(15) +#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE BIT(16) +#define ACPI_EINJ_CXL_MEM_FATAL BIT(17) +#endif +#define CXL_ERROR_MASK (ACPI_EINJ_CXL_CACHE_CORRECTABLE | \ + ACPI_EINJ_CXL_CACHE_UNCORRECTABLE | \ + ACPI_EINJ_CXL_CACHE_FATAL | \ + ACPI_EINJ_CXL_MEM_CORRECTABLE | \ + ACPI_EINJ_CXL_MEM_UNCORRECTABLE | \ + ACPI_EINJ_CXL_MEM_FATAL) /* * ACPI version 5 provides a SET_ERROR_TYPE_WITH_ADDRESS action. @@ -560,8 +575,11 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, if (type & ACPI5_VENDOR_BIT) { if (vendor_flags != SETWA_FLAGS_MEM) goto inject; - } else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) + } else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) { goto inject; + } else if ((type & CXL_ERROR_MASK) && (flags & SETWA_FLAGS_MEM)) { + goto inject; + } /* * Disallow crazy address masks that give BIOS leeway to pick @@ -613,13 +631,16 @@ static struct { u32 mask; const char *str; } const einj_error_type_string[] = { { BIT(9), "Platform Correctable" }, { BIT(10), "Platform Uncorrectable non-fatal" }, { BIT(11), "Platform Uncorrectable fatal"}, + { BIT(31), "Vendor Defined Error Types" }, +}; + +static struct { u32 mask; const char *str; } const einj_cxl_error_type_string[] = { { BIT(12), "CXL.cache Protocol Correctable" }, { BIT(13), "CXL.cache Protocol Uncorrectable non-fatal" }, { BIT(14), "CXL.cache Protocol Uncorrectable fatal" }, { BIT(15), "CXL.mem Protocol Correctable" }, { BIT(16), "CXL.mem Protocol Uncorrectable non-fatal" }, { BIT(17), "CXL.mem Protocol Uncorrectable fatal" }, - { BIT(31), "Vendor Defined Error Types" }, }; static int available_error_type_show(struct seq_file *m, void *v) @@ -640,29 +661,46 @@ static int available_error_type_show(struct seq_file *m, void *v) DEFINE_SHOW_ATTRIBUTE(available_error_type); -static int error_type_get(void *data, u64 *val) +int einj_cxl_available_error_type_show(struct seq_file *m, void *v) { - *val = error_type; + int cxl_err, rc; + u32 available_error_type = 0; + + if (!einj_initialized) + return -ENXIO; + + rc = einj_get_available_error_type(&available_error_type); + if (rc) + return rc; + + for (int pos = 0; pos < ARRAY_SIZE(einj_cxl_error_type_string); pos++) { + cxl_err = ACPI_EINJ_CXL_CACHE_CORRECTABLE << pos; + + if (available_error_type & cxl_err) + seq_printf(m, "0x%08x\t%s\n", + einj_cxl_error_type_string[pos].mask, + einj_cxl_error_type_string[pos].str); + } return 0; } +EXPORT_SYMBOL_NS_GPL(einj_cxl_available_error_type_show, CXL); -static int error_type_set(void *data, u64 val) +static int validate_error_type(u64 type) { + u32 tval, vendor, available_error_type = 0; int rc; - u32 available_error_type = 0; - u32 tval, vendor; /* Only low 32 bits for error type are valid */ - if (val & GENMASK_ULL(63, 32)) + if (type & GENMASK_ULL(63, 32)) return -EINVAL; /* * Vendor defined types have 0x80000000 bit set, and * are not enumerated by ACPI_EINJ_GET_ERROR_TYPE */ - vendor = val & ACPI5_VENDOR_BIT; - tval = val & 0x7fffffff; + vendor = type & ACPI5_VENDOR_BIT; + tval = type & GENMASK(30, 0); /* Only one error type can be specified */ if (tval & (tval - 1)) @@ -671,9 +709,101 @@ static int error_type_set(void *data, u64 val) rc = einj_get_available_error_type(&available_error_type); if (rc) return rc; - if (!(val & available_error_type)) + if (!(type & available_error_type)) return -EINVAL; } + + return 0; +} + +static int cxl_dport_get_sbdf(struct pci_dev *dport_dev, u64 *sbdf) +{ + struct pci_bus *pbus; + struct pci_host_bridge *bridge; + u64 seg = 0, bus; + + pbus = dport_dev->bus; + bridge = pci_find_host_bridge(pbus); + + if (!bridge) + return -ENODEV; + + if (bridge->domain_nr != PCI_DOMAIN_NR_NOT_SET) + seg = bridge->domain_nr; + + bus = pbus->number; + *sbdf = (seg << 24) | (bus << 16) | dport_dev->devfn; + + return 0; +} + +static bool is_cxl_error_type(u64 type) +{ + return (type & CXL_ERROR_MASK) && (!(type & ACPI5_VENDOR_BIT)); +} + +int einj_cxl_inject_rch_error(u64 rcrb, u64 type) +{ + int rc; + + if (!einj_initialized) + return -ENXIO; + + /* Only CXL error types can be specified */ + if (!is_cxl_error_type(type)) + return -EINVAL; + + rc = validate_error_type(type); + if (rc) + return rc; + + return einj_error_inject(type, 0x2, rcrb, GENMASK_ULL(63, 12), 0, 0); +} +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_rch_error, CXL); + +int einj_cxl_inject_error(struct pci_dev *dport, u64 type) +{ + u64 param4 = 0; + int rc; + + if (!einj_initialized) + return -ENXIO; + + /* Only CXL error types can be specified */ + if (!is_cxl_error_type(type)) + return -EINVAL; + + rc = validate_error_type(type); + if (rc) + return rc; + + rc = cxl_dport_get_sbdf(dport, ¶m4); + if (rc) + return rc; + + return einj_error_inject(type, 0x4, 0, 0, 0, param4); +} +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_error, CXL); + +static int error_type_get(void *data, u64 *val) +{ + *val = error_type; + + return 0; +} + +static int error_type_set(void *data, u64 val) +{ + int rc; + + /* CXL error types have to be injected from cxl debugfs */ + if (is_cxl_error_type(val)) + return -EINVAL; + + rc = validate_error_type(val); + if (rc) + return rc; + error_type = val; return 0; @@ -709,6 +839,12 @@ static int einj_check_table(struct acpi_table_einj *einj_tab) return 0; } +bool einj_is_initialized(void) +{ + return einj_initialized; +} +EXPORT_SYMBOL_GPL(einj_is_initialized); + static int __init einj_probe(struct platform_device *pdev) { int rc; diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 67998dbd1d46..d1fc3ce31fbb 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -157,4 +157,16 @@ config CXL_PMU monitoring units and provide standard perf based interfaces. If unsure say 'm'. + +config CXL_EINJ + bool "CXL Error INJection Support" + default ACPI_APEI_EINJ + depends on ACPI_APEI_EINJ = CXL_BUS + help + Support for CXL protocol Error INJection through debugfs/cxl. + Availability and which errors are supported is dependent on + the host platform. Look to ACPI v6.5 section 18.6.4 and kernel + EINJ documentation for more information. + + If unsure say 'n' endif diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index e59d9d37aa65..3700700c4e0c 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -3,6 +3,7 @@ #include <linux/platform_device.h> #include <linux/memregion.h> #include <linux/workqueue.h> +#include <linux/einj-cxl.h> #include <linux/debugfs.h> #include <linux/device.h> #include <linux/module.h> @@ -793,6 +794,39 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport, return rc; } +DEFINE_SHOW_ATTRIBUTE(einj_cxl_available_error_type); + +static int cxl_einj_inject(void *data, u64 type) +{ + struct cxl_dport *dport = data; + + if (dport->rch) + return einj_cxl_inject_rch_error(dport->rcrb.base, type); + + return einj_cxl_inject_error(to_pci_dev(dport->dport_dev), type); +} +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n"); + +static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport) +{ + struct dentry *dir; + + if (!einj_is_initialized()) + return; + + /* + * dport_dev needs to be a PCIe port for CXL 2.0+ ports because + * EINJ expects a dport SBDF to be specified for 2.0 error injection. + */ + if (!dport->rch && !dev_is_pci(dport->dport_dev)) + return; + + dir = cxl_debugfs_create_dir(dev_name(dport->dport_dev)); + + debugfs_create_file("einj_inject", 0200, dir, dport, + &cxl_einj_inject_fops); +} + static struct cxl_port *__devm_cxl_add_port(struct device *host, struct device *uport_dev, resource_size_t component_reg_phys, @@ -1149,6 +1183,8 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev, if (dev_is_pci(dport_dev)) dport->link_latency = cxl_pci_get_latency(to_pci_dev(dport_dev)); + cxl_debugfs_create_dport_dir(dport); + return dport; } @@ -2221,6 +2257,11 @@ static __init int cxl_core_init(void) cxl_debugfs = debugfs_create_dir("cxl", NULL); + if (einj_is_initialized()) { + debugfs_create_file("einj_types", 0400, cxl_debugfs, NULL, + &einj_cxl_available_error_type_fops); + } + cxl_mbox_init(); rc = cxl_memdev_init(); diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h new file mode 100644 index 000000000000..92c0e2e37ad9 --- /dev/null +++ b/include/linux/einj-cxl.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * CXL protocol Error INJection support. + * + * Copyright (c) 2023 Advanced Micro Devices, Inc. + * All Rights Reserved. + * + * Author: Ben Cheatham <benjamin.cheatham@amd.com> + */ +#ifndef CXL_EINJ_H +#define CXL_EINJ_H + +#include <linux/pci.h> + +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); +bool einj_is_initialized(void); +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) +static inline int einj_cxl_available_error_type_show(struct seq_file *m, + void *v) +{ + return -ENXIO; +} + +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) +{ + return -ENXIO; +} + +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) +{ + return -ENXIO; +} + +static inline bool einj_is_initialized(void) { return false; } +#endif // CONFIG_ACPI_APEI_EINJ + +#endif // CXL_EINJ_H
Implement CXL helper functions in the EINJ module for getting/injecting available CXL protocol error types and export them to sysfs under kernel/debug/cxl. The kernel/debug/cxl/einj_types file will print the available CXL protocol errors in the same format as the available_error_types file provided by the EINJ module. The kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the error_type and error_inject files provided by the EINJ module, i.e.: writing an error type into $dport_dev/einj_inject will inject said error type into the CXL dport represented by $dport_dev. Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> --- Documentation/ABI/testing/debugfs-cxl | 30 +++++ MAINTAINERS | 1 + drivers/acpi/apei/einj.c | 158 ++++++++++++++++++++++++-- drivers/cxl/Kconfig | 12 ++ drivers/cxl/core/port.c | 41 +++++++ include/linux/einj-cxl.h | 40 +++++++ 6 files changed, 271 insertions(+), 11 deletions(-) create mode 100644 include/linux/einj-cxl.h