Message ID | 20230314135128.2930580-1-sdonthineni@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 | expand |
On Tue, 14 Mar 2023 13:51:28 +0000, Shanker Donthineni <sdonthineni@nvidia.com> wrote: > > The T241 platform suffers from the T241-FABRIC-4 erratum which causes > unexpected behavior in the GIC when multiple transactions are received > simultaneously from different sources. This hardware issue impacts > NVIDIA server platforms that use more than two T241 chips > interconnected. Each chip has support for 320 {E}SPIs. > > This issue occurs when multiple packets from different GICs are > incorrectly interleaved at the target chip. The erratum text below > specifies exactly what can cause multiple transfer packets susceptible > to interleaving and GIC state corruption. GIC state corruption can > lead to a range of problems, including kernel panics, and unexpected > behavior. > > From the erratum text: > "In some cases, inter-socket AXI4 Stream packets with multiple > transfers, may be interleaved by the fabric when presented to ARM > Generic Interrupt Controller. GIC expects all transfers of a packet > to be delivered without any interleaving. > > The following GICv3 commands may result in multiple transfer packets > over inter-socket AXI4 Stream interface: > - Register reads from GICD_I* and GICD_N* > - Register writes to 64-bit GICD registers other than GICD_IROUTERn* > - ITS command MOVALL > > Multiple commands in GICv4+ utilize multiple transfer packets, > including VMOVP, VMOVI and VMAPP. > > This issue impacts system configurations with more than 2 sockets, > that require multi-transfer packets to be sent over inter-socket > AXI4 Stream interface between GIC instances on different sockets. > GICv4 cannot be supported. GICv3 SW model can only be supported > with the workaround. Single and Dual socket configurations are not > impacted by this issue and support GICv3 and GICv4." > > Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf > > Writing to the chip alias region of the GICD_In{E} registers except > GICD_ICENABLERn has an equivalent effect as writing to the global > distributor. The SPI interrupt deactivate path is not impacted by > the erratum. > > To fix this problem, implement a workaround that ensures read accesses > to the GICD_In{E} registers are directed to the chip that owns the > SPI, and disables GICv4.x features for KVM. To simplify code changes, > the gic_configure_irq() function uses the same alias region for both > read and write operations to GICD_ICFGR. > > Co-developed-by: Vikram Sethi <vsethi@nvidia.com> > Signed-off-by: Vikram Sethi <vsethi@nvidia.com> > Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> > --- > Changes since v1: > - Use SMCCC SOC-ID API for detecting the T241 chip > - Implement Marc's suggestions > - Edit commit text > > Documentation/arm64/silicon-errata.rst | 2 + > drivers/firmware/smccc/smccc.c | 13 +++ > drivers/irqchip/irq-gic-v3.c | 116 +++++++++++++++++++++++-- > include/linux/arm-smccc.h | 1 + > 4 files changed, 123 insertions(+), 9 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst > index ec5f889d76819..e31f6c0687041 100644 > --- a/Documentation/arm64/silicon-errata.rst > +++ b/Documentation/arm64/silicon-errata.rst > @@ -172,6 +172,8 @@ stable kernels. > +----------------+-----------------+-----------------+-----------------------------+ > | NVIDIA | Carmel Core | N/A | NVIDIA_CARMEL_CNP_ERRATUM | > +----------------+-----------------+-----------------+-----------------------------+ > +| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A | > ++----------------+-----------------+-----------------+-----------------------------+ > +----------------+-----------------+-----------------+-----------------------------+ > | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > +----------------+-----------------+-----------------+-----------------------------+ > diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c > index 60ccf3e90d7de..cb688121270b8 100644 > --- a/drivers/firmware/smccc/smccc.c > +++ b/drivers/firmware/smccc/smccc.c > @@ -17,9 +17,12 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; > > bool __ro_after_init smccc_trng_available = false; > u64 __ro_after_init smccc_has_sve_hint = false; > +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; > > void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > { > + struct arm_smccc_res res; > + > smccc_version = version; > smccc_conduit = conduit; > > @@ -27,6 +30,16 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > if (IS_ENABLED(CONFIG_ARM64_SVE) && > smccc_version >= ARM_SMCCC_VERSION_1_3) > smccc_has_sve_hint = true; > + > + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && > + (smccc_conduit != SMCCC_CONDUIT_NONE)) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, > + ARM_SMCCC_ARCH_SOC_ID, &res); > + if ((s32)res.a0 >= 0) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); > + smccc_soc_id_version = (s32)res.a0; > + } > + } Please don't duplicate existing code. There is already the required infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do is: - disassociate the SMCCC probing from the device registration - probe the SOC_ID early - add accessors for the relevant data - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig > } > > enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void) > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > index 997104d4338e7..02c699ce92b12 100644 > --- a/drivers/irqchip/irq-gic-v3.c > +++ b/drivers/irqchip/irq-gic-v3.c > @@ -24,6 +24,7 @@ > #include <linux/irqchip/arm-gic-common.h> > #include <linux/irqchip/arm-gic-v3.h> > #include <linux/irqchip/irq-partition-percpu.h> > +#include <linux/arm-smccc.h> > > #include <asm/cputype.h> > #include <asm/exception.h> > @@ -57,8 +58,14 @@ struct gic_chip_data { > bool has_rss; > unsigned int ppi_nr; > struct partition_desc **ppi_descs; > + phys_addr_t dist_phys_base; Place this field next to its mapping. > }; > > + Spurious blank line. > +#define T241_CHIPS_MAX 4 > +static void __iomem *t241_dist_base_alias[T241_CHIPS_MAX]; Make this __read_mostly. > +static DEFINE_STATIC_KEY_FALSE(gic_nvidia_t241_erratum); > + > static struct gic_chip_data gic_data __read_mostly; > static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key); > > @@ -188,6 +195,34 @@ static inline bool gic_irq_in_rdist(struct irq_data *d) > } > } > > +static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid) Have this function take a struct irq_data * instead, and only perform the conversion in the workaround code. > +{ > + u32 chip; Move this into the if (). > + > + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { > + /** > + * {E}SPI mappings for all 4 chips > + * Chip0 = 32-351 > + * Chip1 = 352-671 > + * Chip2 = 672-991 > + * Chip3 = 4096-4415 > + */ > + switch (__get_intid_range(intid)) { > + case SPI_RANGE: > + chip = (intid - 32) / 320; > + break; > + case ESPI_RANGE: > + chip = 3; > + break; > + default: > + unreachable(); > + } > + return t241_dist_base_alias[chip]; > + } > + > + return gic_data.dist_base; > +} > + > static inline void __iomem *gic_dist_base(struct irq_data *d) > { > switch (get_intid_range(d)) { > @@ -343,10 +378,14 @@ static int gic_peek_irq(struct irq_data *d, u32 offset) > offset = convert_offset_index(d, offset, &index); > mask = 1 << (index % 32); > > + /** > + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} > + * registers are directed to the chip that owns the SPI. > + */ Move this into the workaround code, and drop the ** at the beginning of the comment, > if (gic_irq_in_rdist(d)) > base = gic_data_rdist_sgi_base(); > else > - base = gic_data.dist_base; > + base = gic_dist_base_alias(irqd_to_hwirq(d)); > > return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask); > } > @@ -594,13 +633,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type) > type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING) > return -EINVAL; > > + /** > + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} > + * registers are directed to the chip that owns the SPI. Use the > + * same alias region for GICD_ICFGR writes to simplify code. > + */ Drop this and merge it with the above comment as required. > if (gic_irq_in_rdist(d)) > base = gic_data_rdist_sgi_base(); > else > - base = gic_data.dist_base; > + base = gic_dist_base_alias(irqd_to_hwirq(d)); > > offset = convert_offset_index(d, GICD_ICFGR, &index); > - Spurious blank line change. > ret = gic_configure_irq(index, type, base + offset, NULL); > if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) { > /* Misconfigured PPIs are usually not fatal */ > @@ -1719,6 +1762,45 @@ static bool gic_enable_quirk_hip06_07(void *data) > return false; > } > > +#define T241_CHIPN_MASK GENMASK_ULL(45, 44) > +#define T241_CHIP_GICDA_OFFSET 0x1580000 > +#define SMCCC_SOC_ID_MASK 0x7f7fffff > +#define SMCCC_SOC_ID_T241 0x036b0241 > + > +static bool gic_enable_quirk_nvidia_t241(void *data) > +{ > + unsigned long chip_bmask = 0; > + phys_addr_t phys; > + u32 i; > + > + /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */ > + if ((smccc_soc_id_version < 0) || > + ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) { > + return false; > + } > + > + /* Find the chips based on GICR regions PHYS addr */ > + for (i = 0; i < gic_data.nr_redist_regions; i++) { > + chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, > + gic_data.redist_regions[i].phys_base)); > + } > + > + if (hweight32(chip_bmask) < 3) > + return false; > + > + /* Setup GICD alias regions */ > + for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) { > + if (chip_bmask & BIT(i)) { > + phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET; > + phys |= FIELD_PREP(T241_CHIPN_MASK, i); > + t241_dist_base_alias[i] = ioremap(phys, SZ_64K); > + WARN_ON_ONCE(!t241_dist_base_alias[i]); > + } > + } > + static_branch_enable(&gic_nvidia_t241_erratum); > + return true; > +} > + > static const struct gic_quirk gic_quirks[] = { > { > .desc = "GICv3: Qualcomm MSM8996 broken firmware", > @@ -1750,6 +1832,12 @@ static const struct gic_quirk gic_quirks[] = { > .mask = 0xe8f00fff, > .init = gic_enable_quirk_cavium_38539, > }, > + { > + .desc = "GICv3: NVIDIA erratum T241-FABRIC-4", > + .iidr = 0x0402043b, > + .mask = 0xffffffff, > + .init = gic_enable_quirk_nvidia_t241, > + }, > { > } > }; > @@ -1821,7 +1909,8 @@ static int __init gic_init_bases(void __iomem *dist_base, > struct redist_region *rdist_regs, > u32 nr_redist_regions, > u64 redist_stride, > - struct fwnode_handle *handle) > + struct fwnode_handle *handle, > + phys_addr_t dist_phys_base) Pass dist_phys_base as a the first parameter, not the last. > { > u32 typer; > int err; > @@ -1834,6 +1923,7 @@ static int __init gic_init_bases(void __iomem *dist_base, > > gic_data.fwnode = handle; > gic_data.dist_base = dist_base; > + gic_data.dist_phys_base = dist_phys_base; > gic_data.redist_regions = rdist_regs; > gic_data.nr_redist_regions = nr_redist_regions; > gic_data.redist_stride = redist_stride; > @@ -2072,11 +2162,12 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare > void __iomem *dist_base; > struct redist_region *rdist_regs; > struct resource res; > + struct resource dist_res; Surely you don't need a full struct resource to keep the distributor base address around... > u64 redist_stride; > u32 nr_redist_regions; > int err, i; > > - dist_base = gic_of_iomap(node, 0, "GICD", &res); > + dist_base = gic_of_iomap(node, 0, "GICD", &dist_res); > if (IS_ERR(dist_base)) { > pr_err("%pOF: unable to map gic dist registers\n", node); > return PTR_ERR(dist_base); > @@ -2114,7 +2205,7 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare > gic_enable_of_quirks(node, gic_quirks, &gic_data); > > err = gic_init_bases(dist_base, rdist_regs, nr_redist_regions, > - redist_stride, &node->fwnode); > + redist_stride, &node->fwnode, dist_res.start); > if (err) > goto out_unmap_rdist; > > @@ -2377,8 +2468,14 @@ static void __init gic_acpi_setup_kvm_info(void) > vcpu->end = vcpu->start + ACPI_GICV2_VCPU_MEM_SIZE - 1; > } > > - gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; > - gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; > + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */ > + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { > + gic_v3_kvm_info.has_v4 = false; > + gic_v3_kvm_info.has_v4_1 = false; > + } else { > + gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; > + gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; > + } This is the wrong place to do this. We want to disable *all* the GICv4.1 features, and not just KVM's view. Specially given that there are a bunch of fields that are evaluated in the ITS driver. Something like the change below seems more logical. Maybe you can salvage direct_lpi from the disaster, but that's about it. diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index fd134e1f481a..a68d33b4523f 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -1841,10 +1841,13 @@ static int __init gic_init_bases(void __iomem *dist_base, gic_data.domain = irq_domain_create_tree(handle, &gic_irq_domain_ops, &gic_data); gic_data.rdists.rdist = alloc_percpu(typeof(*gic_data.rdists.rdist)); - gic_data.rdists.has_rvpeid = true; - gic_data.rdists.has_vlpis = true; - gic_data.rdists.has_direct_lpi = true; - gic_data.rdists.has_vpend_valid_dirty = true; + if (!static_branch_unlikely(&gic_nvidia_t241_erratum)) { + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */ + gic_data.rdists.has_rvpeid = true; + gic_data.rdists.has_vlpis = true; + gic_data.rdists.has_direct_lpi = true; + gic_data.rdists.has_vpend_valid_dirty = true; + } if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) { err = -ENOMEM; Thanks, M.
Hi Marc, On 3/15/23 03:34, Marc Zyngier wrote: > Please don't duplicate existing code. There is already the required > infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do > is: > > - disassociate the SMCCC probing from the device registration > > - probe the SOC_ID early > > - add accessors for the relevant data > > - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig I have not modified soc_id.c as it expects to be loaded as a module with the use of module_init() and module_exit() functions. The exported symbols in soc_id driver cannot be accessed from the built-in code. Agree, the SOD-ID discovery code was duplicated. Please guide me if the below approach is okay? 1) Probe the SOC-ID in arm_smccc_version_init() and export two functions arm_smccc_get_soc_id_version() and arm_smccc_get_soc_id_revision(). --- a/drivers/firmware/smccc/smccc.c +++ b/drivers/firmware/smccc/smccc.c @@ -17,9 +17,13 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; bool __ro_after_init smccc_trng_available = false; u64 __ro_after_init smccc_has_sve_hint = false; +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; +s32 __ro_after_init smccc_soc_id_revision = SMCCC_RET_NOT_SUPPORTED; void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) { + struct arm_smccc_res res; + smccc_version = version; smccc_conduit = conduit; @@ -27,6 +31,18 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) if (IS_ENABLED(CONFIG_ARM64_SVE) && smccc_version >= ARM_SMCCC_VERSION_1_3) smccc_has_sve_hint = true; + + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && + (smccc_conduit != SMCCC_CONDUIT_NONE)) { + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_SOC_ID, &res); + if ((s32)res.a0 >= 0) { + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); + smccc_soc_id_version = (s32)res.a0; + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 1, &res); + smccc_soc_id_revision = (s32)res.a0; + } + } } +s32 arm_smccc_get_soc_id_version(void) +{ + return smccc_soc_id_version; +} +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_version); + +s32 arm_smccc_get_soc_id_revision(void) +{ + return smccc_soc_id_revision; +} +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision); --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h +/** + * arm_smccc_get_soc_id_version() + * + * Returns the SOC ID version. + * + * When ARM_SMCCC_ARCH_SOC_ID is not present, returns SMCCC_RET_NOT_SUPPORTED. + */ +s32 arm_smccc_get_soc_id_version(void); +/** + * arm_smccc_get_soc_id_revision() + * + * Returns the SOC ID revision. + * + * When ARM_SMCCC_ARCH_SOC_ID is not present, returns SMCCC_RET_NOT_SUPPORTED. + */ +s32 arm_smccc_get_soc_id_revision(void); 2) Use the exported functions in soc_id module. --- a/drivers/firmware/smccc/soc_id.c +++ b/drivers/firmware/smccc/soc_id.c @@ -42,41 +42,23 @@ static int __init smccc_soc_init(void) if (arm_smccc_get_version() < ARM_SMCCC_VERSION_1_2) return 0; - if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) { - pr_err("%s: invalid SMCCC conduit\n", __func__); - return -EOPNOTSUPP; - } - - arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, - ARM_SMCCC_ARCH_SOC_ID, &res); - - if ((int)res.a0 == SMCCC_RET_NOT_SUPPORTED) { + soc_id_version = arm_smccc_get_soc_id_version(); + if (soc_id_version == SMCCC_RET_NOT_SUPPORTED) { pr_info("ARCH_SOC_ID not implemented, skipping ....\n"); return 0; } - if ((int)res.a0 < 0) { - pr_info("ARCH_FEATURES(ARCH_SOC_ID) returned error: %lx\n", - res.a0); - return -EINVAL; - } - - arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); - if ((int)res.a0 < 0) { + if (soc_id_version < 0) { pr_err("ARCH_SOC_ID(0) returned error: %lx\n", res.a0); return -EINVAL; } - soc_id_version = res.a0; - - arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 1, &res); - if ((int)res.a0 < 0) { + soc_id_rev = arm_smccc_get_soc_id_revision(); + if (soc_id_rev < 0) { pr_err("ARCH_SOC_ID(1) returned error: %lx\n", res.a0); return -EINVAL; } - soc_id_rev = res.a0; Kconfig: --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -35,6 +35,7 @@ config ARM_GIC_V3 select IRQ_DOMAIN_HIERARCHY select PARTITION_PERCPU select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP + select HAVE_ARM_SMCCC_DISCOVERY
Hi Shanker, Thank you for the patch! Yet something to improve: [auto build test ERROR on arm64/for-next/core] [also build test ERROR on tip/irq/core soc/for-next linus/master v6.3-rc2 next-20230315] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Shanker-Donthineni/irqchip-gicv3-Workaround-for-NVIDIA-erratum-T241-FABRIC-4/20230314-215648 base: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core patch link: https://lore.kernel.org/r/20230314135128.2930580-1-sdonthineni%40nvidia.com patch subject: [PATCH v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 config: arm-buildonly-randconfig-r003-20230312 (https://download.01.org/0day-ci/archive/20230316/202303160555.3zGOmtIL-lkp@intel.com/config) compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 67409911353323ca5edf2049ef0df54132fa1ca7) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm cross compiling tool for clang build # apt-get install binutils-arm-linux-gnueabi # https://github.com/intel-lab-lkp/linux/commit/f796361134151057b68a259013204e8fa5516aee git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Shanker-Donthineni/irqchip-gicv3-Workaround-for-NVIDIA-erratum-T241-FABRIC-4/20230314-215648 git checkout f796361134151057b68a259013204e8fa5516aee # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202303160555.3zGOmtIL-lkp@intel.com/ All errors (new ones prefixed by >>): >> drivers/irqchip/irq-gic-v3.c:1775:21: error: call to undeclared function 'FIELD_GET'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, ^ >> drivers/irqchip/irq-gic-v3.c:1786:12: error: call to undeclared function 'FIELD_PREP'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] phys |= FIELD_PREP(T241_CHIPN_MASK, i); ^ 2 errors generated. vim +/FIELD_GET +1775 drivers/irqchip/irq-gic-v3.c 1760 1761 static bool gic_enable_quirk_nvidia_t241(void *data) 1762 { 1763 unsigned long chip_bmask = 0; 1764 phys_addr_t phys; 1765 u32 i; 1766 1767 /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */ 1768 if ((smccc_soc_id_version < 0) || 1769 ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) { 1770 return false; 1771 } 1772 1773 /* Find the chips based on GICR regions PHYS addr */ 1774 for (i = 0; i < gic_data.nr_redist_regions; i++) { > 1775 chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, 1776 gic_data.redist_regions[i].phys_base)); 1777 } 1778 1779 if (hweight32(chip_bmask) < 3) 1780 return false; 1781 1782 /* Setup GICD alias regions */ 1783 for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) { 1784 if (chip_bmask & BIT(i)) { 1785 phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET; > 1786 phys |= FIELD_PREP(T241_CHIPN_MASK, i); 1787 t241_dist_base_alias[i] = ioremap(phys, SZ_64K); 1788 WARN_ON_ONCE(!t241_dist_base_alias[i]); 1789 } 1790 } 1791 static_branch_enable(&gic_nvidia_t241_erratum); 1792 return true; 1793 } 1794
Hi Shanker, Thank you for the patch! Yet something to improve: [auto build test ERROR on arm64/for-next/core] [also build test ERROR on tip/irq/core soc/for-next linus/master v6.3-rc2 next-20230315] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Shanker-Donthineni/irqchip-gicv3-Workaround-for-NVIDIA-erratum-T241-FABRIC-4/20230314-215648 base: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core patch link: https://lore.kernel.org/r/20230314135128.2930580-1-sdonthineni%40nvidia.com patch subject: [PATCH v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 config: arm-randconfig-r046-20230315 (https://download.01.org/0day-ci/archive/20230316/202303160647.R16cnZkc-lkp@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/f796361134151057b68a259013204e8fa5516aee git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Shanker-Donthineni/irqchip-gicv3-Workaround-for-NVIDIA-erratum-T241-FABRIC-4/20230314-215648 git checkout f796361134151057b68a259013204e8fa5516aee # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/irqchip/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202303160647.R16cnZkc-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from include/linux/bits.h:6, from include/linux/ioport.h:13, from include/linux/acpi.h:12, from drivers/irqchip/irq-gic-v3.c:9: drivers/irqchip/irq-gic-v3.c: In function 'gic_enable_quirk_nvidia_t241': >> drivers/irqchip/irq-gic-v3.c:1775:35: error: implicit declaration of function 'FIELD_GET'; did you mean 'FOLL_GET'? [-Werror=implicit-function-declaration] 1775 | chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, | ^~~~~~~~~ include/vdso/bits.h:7:44: note: in definition of macro 'BIT' 7 | #define BIT(nr) (UL(1) << (nr)) | ^~ >> drivers/irqchip/irq-gic-v3.c:1786:33: error: implicit declaration of function 'FIELD_PREP' [-Werror=implicit-function-declaration] 1786 | phys |= FIELD_PREP(T241_CHIPN_MASK, i); | ^~~~~~~~~~ cc1: some warnings being treated as errors vim +1775 drivers/irqchip/irq-gic-v3.c 1760 1761 static bool gic_enable_quirk_nvidia_t241(void *data) 1762 { 1763 unsigned long chip_bmask = 0; 1764 phys_addr_t phys; 1765 u32 i; 1766 1767 /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */ 1768 if ((smccc_soc_id_version < 0) || 1769 ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) { 1770 return false; 1771 } 1772 1773 /* Find the chips based on GICR regions PHYS addr */ 1774 for (i = 0; i < gic_data.nr_redist_regions; i++) { > 1775 chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, 1776 gic_data.redist_regions[i].phys_base)); 1777 } 1778 1779 if (hweight32(chip_bmask) < 3) 1780 return false; 1781 1782 /* Setup GICD alias regions */ 1783 for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) { 1784 if (chip_bmask & BIT(i)) { 1785 phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET; > 1786 phys |= FIELD_PREP(T241_CHIPN_MASK, i); 1787 t241_dist_base_alias[i] = ioremap(phys, SZ_64K); 1788 WARN_ON_ONCE(!t241_dist_base_alias[i]); 1789 } 1790 } 1791 static_branch_enable(&gic_nvidia_t241_erratum); 1792 return true; 1793 } 1794
On Wed, Mar 15, 2023 at 07:27:14AM -0500, Shanker Donthineni wrote: > Hi Marc, > > On 3/15/23 03:34, Marc Zyngier wrote: > > Please don't duplicate existing code. There is already the required > > infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do > > is: > > > > - disassociate the SMCCC probing from the device registration > > > > - probe the SOC_ID early > > > > - add accessors for the relevant data > > > > - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig > > > I have not modified soc_id.c as it expects to be loaded as a module with > the use of module_init() and module_exit() functions. The exported symbols > in soc_id driver cannot be accessed from the built-in code. > > Agree, the SOD-ID discovery code was duplicated. > > Please guide me if the below approach is okay? > > 1) Probe the SOC-ID in arm_smccc_version_init() and export two functions > arm_smccc_get_soc_id_version() and arm_smccc_get_soc_id_revision(). > > --- a/drivers/firmware/smccc/smccc.c > +++ b/drivers/firmware/smccc/smccc.c > @@ -17,9 +17,13 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; > > bool __ro_after_init smccc_trng_available = false; > u64 __ro_after_init smccc_has_sve_hint = false; > +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; > +s32 __ro_after_init smccc_soc_id_revision = SMCCC_RET_NOT_SUPPORTED; > > void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > { > + struct arm_smccc_res res; > + > smccc_version = version; > smccc_conduit = conduit; > > @@ -27,6 +31,18 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > if (IS_ENABLED(CONFIG_ARM64_SVE) && > smccc_version >= ARM_SMCCC_VERSION_1_3) > smccc_has_sve_hint = true; > + > + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && > + (smccc_conduit != SMCCC_CONDUIT_NONE)) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, > + ARM_SMCCC_ARCH_SOC_ID, &res); > + if ((s32)res.a0 >= 0) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); > + smccc_soc_id_version = (s32)res.a0; > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 1, &res); > + smccc_soc_id_revision = (s32)res.a0; > + } > + } > } > > > +s32 arm_smccc_get_soc_id_version(void) > +{ > + return smccc_soc_id_version; > +} > +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_version); > + > +s32 arm_smccc_get_soc_id_revision(void) > +{ > + return smccc_soc_id_revision; > +} > +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision); > > Overall, it looks OK to me. However I see neither the gic nor the soc_id can be build as module atm. So do we really need the export symbols if no other modules are using it ?
On 2023-03-16 15:10, Sudeep Holla wrote: > On Wed, Mar 15, 2023 at 07:27:14AM -0500, Shanker Donthineni wrote: >> Hi Marc, >> >> On 3/15/23 03:34, Marc Zyngier wrote: >> > Please don't duplicate existing code. There is already the required >> > infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do >> > is: >> > >> > - disassociate the SMCCC probing from the device registration >> > >> > - probe the SOC_ID early >> > >> > - add accessors for the relevant data >> > >> > - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig >> >> >> I have not modified soc_id.c as it expects to be loaded as a module >> with >> the use of module_init() and module_exit() functions. The exported >> symbols >> in soc_id driver cannot be accessed from the built-in code. >> >> Agree, the SOD-ID discovery code was duplicated. >> >> Please guide me if the below approach is okay? >> >> 1) Probe the SOC-ID in arm_smccc_version_init() and export two >> functions >> arm_smccc_get_soc_id_version() and arm_smccc_get_soc_id_revision(). >> >> --- a/drivers/firmware/smccc/smccc.c >> +++ b/drivers/firmware/smccc/smccc.c >> @@ -17,9 +17,13 @@ static enum arm_smccc_conduit smccc_conduit = >> SMCCC_CONDUIT_NONE; >> >> bool __ro_after_init smccc_trng_available = false; >> u64 __ro_after_init smccc_has_sve_hint = false; >> +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; >> +s32 __ro_after_init smccc_soc_id_revision = SMCCC_RET_NOT_SUPPORTED; >> >> void __init arm_smccc_version_init(u32 version, enum >> arm_smccc_conduit conduit) >> { >> + struct arm_smccc_res res; >> + >> smccc_version = version; >> smccc_conduit = conduit; >> >> @@ -27,6 +31,18 @@ void __init arm_smccc_version_init(u32 version, >> enum arm_smccc_conduit conduit) >> if (IS_ENABLED(CONFIG_ARM64_SVE) && >> smccc_version >= ARM_SMCCC_VERSION_1_3) >> smccc_has_sve_hint = true; >> + >> + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && >> + (smccc_conduit != SMCCC_CONDUIT_NONE)) { >> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, >> + ARM_SMCCC_ARCH_SOC_ID, &res); >> + if ((s32)res.a0 >= 0) { >> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, >> &res); >> + smccc_soc_id_version = (s32)res.a0; >> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 1, >> &res); >> + smccc_soc_id_revision = (s32)res.a0; >> + } >> + } >> } >> >> >> +s32 arm_smccc_get_soc_id_version(void) >> +{ >> + return smccc_soc_id_version; >> +} >> +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_version); >> + >> +s32 arm_smccc_get_soc_id_revision(void) >> +{ >> + return smccc_soc_id_revision; >> +} >> +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision); >> >> > > Overall, it looks OK to me. However I see neither the gic nor the > soc_id > can be build as module atm. So do we really need the export symbols if > no > other modules are using it ? It really shouldn't be exported. Having accessors should be enough. Thanks, M.
On 3/16/23 11:00, Marc Zyngier wrote: > External email: Use caution opening links or attachments > > > On 2023-03-16 15:10, Sudeep Holla wrote: >> On Wed, Mar 15, 2023 at 07:27:14AM -0500, Shanker Donthineni wrote: >>> Hi Marc, >>> >>> On 3/15/23 03:34, Marc Zyngier wrote: >>> > Please don't duplicate existing code. There is already the required >>> > infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do >>> > is: >>> > >>> > - disassociate the SMCCC probing from the device registration >>> > >>> > - probe the SOC_ID early >>> > >>> > - add accessors for the relevant data >>> > >>> > - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig >>> >>> >>> I have not modified soc_id.c as it expects to be loaded as a module >>> with >>> the use of module_init() and module_exit() functions. The exported >>> symbols >>> in soc_id driver cannot be accessed from the built-in code. >>> >>> Agree, the SOD-ID discovery code was duplicated. >>> >>> Please guide me if the below approach is okay? >>> >>> 1) Probe the SOC-ID in arm_smccc_version_init() and export two >>> functions >>> arm_smccc_get_soc_id_version() and arm_smccc_get_soc_id_revision(). >>> >>> --- a/drivers/firmware/smccc/smccc.c >>> +++ b/drivers/firmware/smccc/smccc.c >>> @@ -17,9 +17,13 @@ static enum arm_smccc_conduit smccc_conduit = >>> SMCCC_CONDUIT_NONE; >>> >>> bool __ro_after_init smccc_trng_available = false; >>> u64 __ro_after_init smccc_has_sve_hint = false; >>> +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; >>> +s32 __ro_after_init smccc_soc_id_revision = SMCCC_RET_NOT_SUPPORTED; >>> >>> void __init arm_smccc_version_init(u32 version, enum >>> arm_smccc_conduit conduit) >>> { >>> + struct arm_smccc_res res; >>> + >>> smccc_version = version; >>> smccc_conduit = conduit; >>> >>> @@ -27,6 +31,18 @@ void __init arm_smccc_version_init(u32 version, >>> enum arm_smccc_conduit conduit) >>> if (IS_ENABLED(CONFIG_ARM64_SVE) && >>> smccc_version >= ARM_SMCCC_VERSION_1_3) >>> smccc_has_sve_hint = true; >>> + >>> + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && >>> + (smccc_conduit != SMCCC_CONDUIT_NONE)) { >>> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, >>> + ARM_SMCCC_ARCH_SOC_ID, &res); >>> + if ((s32)res.a0 >= 0) { >>> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, >>> &res); >>> + smccc_soc_id_version = (s32)res.a0; >>> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 1, >>> &res); >>> + smccc_soc_id_revision = (s32)res.a0; >>> + } >>> + } >>> } >>> >>> >>> +s32 arm_smccc_get_soc_id_version(void) >>> +{ >>> + return smccc_soc_id_version; >>> +} >>> +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_version); >>> + >>> +s32 arm_smccc_get_soc_id_revision(void) >>> +{ >>> + return smccc_soc_id_revision; >>> +} >>> +EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision); >>> >>> >> >> Overall, it looks OK to me. However I see neither the gic nor the >> soc_id >> can be build as module atm. So do we really need the export symbols if >> no >> other modules are using it ? > > It really shouldn't be exported. Having accessors should be enough. > Thanks, I'll remove in v3 patch. -Shanker
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index ec5f889d76819..e31f6c0687041 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -172,6 +172,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | NVIDIA | Carmel Core | N/A | NVIDIA_CARMEL_CNP_ERRATUM | +----------------+-----------------+-----------------+-----------------------------+ +| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c index 60ccf3e90d7de..cb688121270b8 100644 --- a/drivers/firmware/smccc/smccc.c +++ b/drivers/firmware/smccc/smccc.c @@ -17,9 +17,12 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; bool __ro_after_init smccc_trng_available = false; u64 __ro_after_init smccc_has_sve_hint = false; +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) { + struct arm_smccc_res res; + smccc_version = version; smccc_conduit = conduit; @@ -27,6 +30,16 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) if (IS_ENABLED(CONFIG_ARM64_SVE) && smccc_version >= ARM_SMCCC_VERSION_1_3) smccc_has_sve_hint = true; + + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && + (smccc_conduit != SMCCC_CONDUIT_NONE)) { + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_SOC_ID, &res); + if ((s32)res.a0 >= 0) { + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); + smccc_soc_id_version = (s32)res.a0; + } + } } enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 997104d4338e7..02c699ce92b12 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -24,6 +24,7 @@ #include <linux/irqchip/arm-gic-common.h> #include <linux/irqchip/arm-gic-v3.h> #include <linux/irqchip/irq-partition-percpu.h> +#include <linux/arm-smccc.h> #include <asm/cputype.h> #include <asm/exception.h> @@ -57,8 +58,14 @@ struct gic_chip_data { bool has_rss; unsigned int ppi_nr; struct partition_desc **ppi_descs; + phys_addr_t dist_phys_base; }; + +#define T241_CHIPS_MAX 4 +static void __iomem *t241_dist_base_alias[T241_CHIPS_MAX]; +static DEFINE_STATIC_KEY_FALSE(gic_nvidia_t241_erratum); + static struct gic_chip_data gic_data __read_mostly; static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key); @@ -188,6 +195,34 @@ static inline bool gic_irq_in_rdist(struct irq_data *d) } } +static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid) +{ + u32 chip; + + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { + /** + * {E}SPI mappings for all 4 chips + * Chip0 = 32-351 + * Chip1 = 352-671 + * Chip2 = 672-991 + * Chip3 = 4096-4415 + */ + switch (__get_intid_range(intid)) { + case SPI_RANGE: + chip = (intid - 32) / 320; + break; + case ESPI_RANGE: + chip = 3; + break; + default: + unreachable(); + } + return t241_dist_base_alias[chip]; + } + + return gic_data.dist_base; +} + static inline void __iomem *gic_dist_base(struct irq_data *d) { switch (get_intid_range(d)) { @@ -343,10 +378,14 @@ static int gic_peek_irq(struct irq_data *d, u32 offset) offset = convert_offset_index(d, offset, &index); mask = 1 << (index % 32); + /** + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} + * registers are directed to the chip that owns the SPI. + */ if (gic_irq_in_rdist(d)) base = gic_data_rdist_sgi_base(); else - base = gic_data.dist_base; + base = gic_dist_base_alias(irqd_to_hwirq(d)); return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask); } @@ -594,13 +633,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type) type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING) return -EINVAL; + /** + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} + * registers are directed to the chip that owns the SPI. Use the + * same alias region for GICD_ICFGR writes to simplify code. + */ if (gic_irq_in_rdist(d)) base = gic_data_rdist_sgi_base(); else - base = gic_data.dist_base; + base = gic_dist_base_alias(irqd_to_hwirq(d)); offset = convert_offset_index(d, GICD_ICFGR, &index); - ret = gic_configure_irq(index, type, base + offset, NULL); if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) { /* Misconfigured PPIs are usually not fatal */ @@ -1719,6 +1762,45 @@ static bool gic_enable_quirk_hip06_07(void *data) return false; } +#define T241_CHIPN_MASK GENMASK_ULL(45, 44) +#define T241_CHIP_GICDA_OFFSET 0x1580000 +#define SMCCC_SOC_ID_MASK 0x7f7fffff +#define SMCCC_SOC_ID_T241 0x036b0241 + +static bool gic_enable_quirk_nvidia_t241(void *data) +{ + unsigned long chip_bmask = 0; + phys_addr_t phys; + u32 i; + + /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */ + if ((smccc_soc_id_version < 0) || + ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) { + return false; + } + + /* Find the chips based on GICR regions PHYS addr */ + for (i = 0; i < gic_data.nr_redist_regions; i++) { + chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, + gic_data.redist_regions[i].phys_base)); + } + + if (hweight32(chip_bmask) < 3) + return false; + + /* Setup GICD alias regions */ + for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) { + if (chip_bmask & BIT(i)) { + phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET; + phys |= FIELD_PREP(T241_CHIPN_MASK, i); + t241_dist_base_alias[i] = ioremap(phys, SZ_64K); + WARN_ON_ONCE(!t241_dist_base_alias[i]); + } + } + static_branch_enable(&gic_nvidia_t241_erratum); + return true; +} + static const struct gic_quirk gic_quirks[] = { { .desc = "GICv3: Qualcomm MSM8996 broken firmware", @@ -1750,6 +1832,12 @@ static const struct gic_quirk gic_quirks[] = { .mask = 0xe8f00fff, .init = gic_enable_quirk_cavium_38539, }, + { + .desc = "GICv3: NVIDIA erratum T241-FABRIC-4", + .iidr = 0x0402043b, + .mask = 0xffffffff, + .init = gic_enable_quirk_nvidia_t241, + }, { } }; @@ -1821,7 +1909,8 @@ static int __init gic_init_bases(void __iomem *dist_base, struct redist_region *rdist_regs, u32 nr_redist_regions, u64 redist_stride, - struct fwnode_handle *handle) + struct fwnode_handle *handle, + phys_addr_t dist_phys_base) { u32 typer; int err; @@ -1834,6 +1923,7 @@ static int __init gic_init_bases(void __iomem *dist_base, gic_data.fwnode = handle; gic_data.dist_base = dist_base; + gic_data.dist_phys_base = dist_phys_base; gic_data.redist_regions = rdist_regs; gic_data.nr_redist_regions = nr_redist_regions; gic_data.redist_stride = redist_stride; @@ -2072,11 +2162,12 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare void __iomem *dist_base; struct redist_region *rdist_regs; struct resource res; + struct resource dist_res; u64 redist_stride; u32 nr_redist_regions; int err, i; - dist_base = gic_of_iomap(node, 0, "GICD", &res); + dist_base = gic_of_iomap(node, 0, "GICD", &dist_res); if (IS_ERR(dist_base)) { pr_err("%pOF: unable to map gic dist registers\n", node); return PTR_ERR(dist_base); @@ -2114,7 +2205,7 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare gic_enable_of_quirks(node, gic_quirks, &gic_data); err = gic_init_bases(dist_base, rdist_regs, nr_redist_regions, - redist_stride, &node->fwnode); + redist_stride, &node->fwnode, dist_res.start); if (err) goto out_unmap_rdist; @@ -2377,8 +2468,14 @@ static void __init gic_acpi_setup_kvm_info(void) vcpu->end = vcpu->start + ACPI_GICV2_VCPU_MEM_SIZE - 1; } - gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; - gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */ + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { + gic_v3_kvm_info.has_v4 = false; + gic_v3_kvm_info.has_v4_1 = false; + } else { + gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; + gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; + } vgic_set_kvm_info(&gic_v3_kvm_info); } @@ -2431,7 +2528,8 @@ gic_acpi_init(union acpi_subtable_headers *header, const unsigned long end) } err = gic_init_bases(acpi_data.dist_base, acpi_data.redist_regs, - acpi_data.nr_redist_regions, 0, gsi_domain_handle); + acpi_data.nr_redist_regions, 0, gsi_domain_handle, + dist->base_address); if (err) goto out_fwhandle_free; diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 220c8c60e021a..f74e13938693e 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -225,6 +225,7 @@ u32 arm_smccc_get_version(void); void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit); extern u64 smccc_has_sve_hint; +extern s32 smccc_soc_id_version; /** * struct arm_smccc_res - Result from SMC/HVC call