Message ID | 20170208005736.18724-1-cov@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Christopher Covington wrote: > The Qualcomm Datacenter Technologies QDF2400 family of SoCs contains a > custom (non-PrimeCell) implementation of the SBSA UART. Occasionally the > BUSY bit in the Flag Register gets stuck as 1, erratum 44 for both 2432v1 > and 2400v1 SoCs. Checking that the Transmit FIFO Empty (TXFE) bit is 0, > instead of checking that the BUSY bit is 1, works around the issue. To > facilitate this substitution when UART AMBA Port (UAP) data is available, > introduce vendor-specific inversion of Feature Register bits. To keep the > change small, this patch only works around the full console case, where UAP > data is available, and does not handle the erratum for earlycon, as the UAP > data is not available then. > > Signed-off-by: Christopher Covington <cov@codeaurora.org> > Acked-by: Russell King <rmk+kernel@armlinux.org.uk> > --- > Changes between the previous RFC [1] and this PATCH: > * don't use arch/arm64/kernel/cpu_errata.c at Will's request > * separate out earlycon case to separate patch > * rework earlycon case to not depend on UAP as suggested by Timur > > Because they need a newly-defined MIDR values, the patches are currently > based on: > https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/log/?h=for-next/core > > I'm not confident that I know the best route for these two patches. Should > I request Catalin and Will to take them via arm64 as the essential MIDR > additions are their purview? > > Thanks Russell for the ack. Compared to the RFC, I've made minor changes to > what is now patch 1/2 and am making an educated guess that the ack sticks > (but if not please let me know). Patch 2/2 is significantly revised from > the RFC so I've not included the ack on it. > > 1. https://patchwork.codeaurora.org/patch/163281/ > --- > Documentation/arm64/silicon-errata.txt | 2 ++ > arch/arm64/include/asm/cputype.h | 2 ++ > drivers/tty/serial/Kconfig | 12 ++++++++++ > drivers/tty/serial/amba-pl011.c | 40 +++++++++++++++++++++++++++++++--- > 4 files changed, 53 insertions(+), 3 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt > index 50da8391e9dd..0993ebb3e86b 100644 > --- a/Documentation/arm64/silicon-errata.txt > +++ b/Documentation/arm64/silicon-errata.txt > @@ -65,3 +65,5 @@ stable kernels. > | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| > | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| > +| Qualcomm Tech. | QDF2432v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | > +| Qualcomm Tech. | QDF2400v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | > diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h > index fc502713ab37..cb399c7fe6ec 100644 > --- a/arch/arm64/include/asm/cputype.h > +++ b/arch/arm64/include/asm/cputype.h > @@ -88,12 +88,14 @@ > > #define BRCM_CPU_PART_VULCAN 0x516 > > +#define QCOM_CPU_PART_KRYO_V1 0x281 > #define QCOM_CPU_PART_FALKOR_V1 0x800 > > #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) > #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) > #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) > #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) > +#define MIDR_QCOM_KRYO_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_V1) > #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1) > > #ifndef __ASSEMBLY__ > diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig > index e9cf5b67f1b7..4cde8f48a540 100644 > --- a/drivers/tty/serial/Kconfig > +++ b/drivers/tty/serial/Kconfig > @@ -73,6 +73,18 @@ config SERIAL_AMBA_PL011_CONSOLE > your boot loader (lilo or loadlin) about how to pass options to the > kernel at boot time.) > > +config QCOM_QDF2400_ERRATUM_44 > + bool "Work around QDF2400 SoC E44 stuck BUSY bit" > + depends on SERIAL_AMBA_PL011_CONSOLE=y > + default y > + help > + The BUSY bit in the Flag Register of the UART on the QDF2432v1 and > + QDF2400v1 SoCs may get stuck as 1, resulting in a hung serial console. > + Say Y here to work around the issue by checking TXFE == 0 instead of > + BUSY == 1 on affected systems. > + > + If unsure, say Y. > + > config SERIAL_EARLYCON_ARM_SEMIHOST > bool "Early console using ARM semihosting" > depends on ARM64 || ARM > diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c > index d4171d71a258..41e51901d6ef 100644 > --- a/drivers/tty/serial/amba-pl011.c > +++ b/drivers/tty/serial/amba-pl011.c > @@ -97,6 +97,7 @@ struct vendor_data { > unsigned int fr_dsr; > unsigned int fr_cts; > unsigned int fr_ri; > + unsigned int inv_fr; > bool access_32b; > bool oversampling; > bool dma_threshold; > @@ -141,6 +142,25 @@ static struct vendor_data vendor_sbsa = { > .fixed_options = true, > }; > > +#ifdef CONFIG_QCOM_QDF2400_ERRATUM_44 > +static struct vendor_data vendor_qdt_qdf2400_e44 = { > + .reg_offset = pl011_std_offsets, > + .fr_busy = UART011_FR_TXFE, > + .fr_dsr = UART01x_FR_DSR, > + .fr_cts = UART01x_FR_CTS, > + .fr_ri = UART011_FR_RI, > + .inv_fr = UART011_FR_TXFE, > + .access_32b = true, > + .oversampling = false, > + .dma_threshold = false, > + .cts_event_workaround = false, > + .always_enabled = true, > + .fixed_options = true, > +}; > +#else > +#define vendor_qdt_qdf2400_e44 vendor_sbsa > +#endif Instead of the #else, just put the #ifdef inside qdf2400_e44(). That way, the function always returns False if CONFIG_QCOM_QDF2400_ERRATUM_44 is not defined. Also, don't you need to add a definition of .inv_fr in vendor_sbsa and the other vendor_xxx structures? > + > static u16 pl011_st_offsets[REG_ARRAY_SIZE] = { > [REG_DR] = UART01x_DR, > [REG_ST_DMAWM] = ST_UART011_DMAWM, > @@ -1518,7 +1538,7 @@ static unsigned int pl011_tx_empty(struct uart_port *port) > { > struct uart_amba_port *uap = > container_of(port, struct uart_amba_port, port); > - unsigned int status = pl011_read(uap, REG_FR); > + unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr; > return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ? > 0 : TIOCSER_TEMT; > } > @@ -2218,7 +2238,8 @@ pl011_console_write(struct console *co, const char *s, unsigned int count) > * Finally, wait for transmitter to become empty > * and restore the TCR > */ > - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) > + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) > + & uap->vendor->fr_busy) I really think the XOR logic needs to be documented wherever it's used. It's just too confusing. > cpu_relax(); > if (!uap->vendor->always_enabled) > pl011_write(old_cr, uap, REG_CR); > @@ -2383,6 +2404,13 @@ static struct console amba_console = { > > #define AMBA_CONSOLE (&amba_console) > > +static bool qdf2400_e44(void) { Call it needs_qdf2400_e44(void) ? > + u32 cpu_var_model = read_cpuid_id() & ~MIDR_REVISION_MASK; > + > + return (cpu_var_model == MIDR_QCOM_KRYO_V1 || > + cpu_var_model == MIDR_QCOM_FALKOR_V1); > +} > + > static void pl011_putc(struct uart_port *port, int c) > { > while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF) > @@ -2645,12 +2673,18 @@ static int sbsa_uart_probe(struct platform_device *pdev) > uap->port.irq = ret; > > uap->reg_offset = vendor_sbsa.reg_offset; > - uap->vendor = &vendor_sbsa; > uap->fifosize = 32; > uap->port.iotype = vendor_sbsa.access_32b ? UPIO_MEM32 : UPIO_MEM; > uap->port.ops = &sbsa_uart_pops; > uap->fixed_baud = baudrate; > > + if (qdf2400_e44()) { > + uap->vendor = &vendor_qdt_qdf2400_e44; > + dev_info(&pdev->dev, "working around QDF2400 SoC erratum 44\n"); > + } else { > + uap->vendor = &vendor_sbsa; > + } > + > snprintf(uap->type, sizeof(uap->type), "SBSA"); > > r = platform_get_resource(pdev, IORESOURCE_MEM, 0); >
On 08/02/17 00:57, Christopher Covington wrote: > The Qualcomm Datacenter Technologies QDF2400 family of SoCs contains a > custom (non-PrimeCell) implementation of the SBSA UART. Occasionally the > BUSY bit in the Flag Register gets stuck as 1, erratum 44 for both 2432v1 > and 2400v1 SoCs. Checking that the Transmit FIFO Empty (TXFE) bit is 0, > instead of checking that the BUSY bit is 1, works around the issue. To > facilitate this substitution when UART AMBA Port (UAP) data is available, > introduce vendor-specific inversion of Feature Register bits. To keep the > change small, this patch only works around the full console case, where UAP > data is available, and does not handle the erratum for earlycon, as the UAP > data is not available then. > > Signed-off-by: Christopher Covington <cov@codeaurora.org> > Acked-by: Russell King <rmk+kernel@armlinux.org.uk> > --- > Changes between the previous RFC [1] and this PATCH: > * don't use arch/arm64/kernel/cpu_errata.c at Will's request > * separate out earlycon case to separate patch > * rework earlycon case to not depend on UAP as suggested by Timur > > Because they need a newly-defined MIDR values, the patches are currently > based on: > https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/log/?h=for-next/core > > I'm not confident that I know the best route for these two patches. Should > I request Catalin and Will to take them via arm64 as the essential MIDR > additions are their purview? > > Thanks Russell for the ack. Compared to the RFC, I've made minor changes to > what is now patch 1/2 and am making an educated guess that the ack sticks > (but if not please let me know). Patch 2/2 is significantly revised from > the RFC so I've not included the ack on it. > > 1. https://patchwork.codeaurora.org/patch/163281/ > --- > Documentation/arm64/silicon-errata.txt | 2 ++ > arch/arm64/include/asm/cputype.h | 2 ++ > drivers/tty/serial/Kconfig | 12 ++++++++++ > drivers/tty/serial/amba-pl011.c | 40 +++++++++++++++++++++++++++++++--- > 4 files changed, 53 insertions(+), 3 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt > index 50da8391e9dd..0993ebb3e86b 100644 > --- a/Documentation/arm64/silicon-errata.txt > +++ b/Documentation/arm64/silicon-errata.txt > @@ -65,3 +65,5 @@ stable kernels. > | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| > | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| > +| Qualcomm Tech. | QDF2432v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | > +| Qualcomm Tech. | QDF2400v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | > diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h > index fc502713ab37..cb399c7fe6ec 100644 > --- a/arch/arm64/include/asm/cputype.h > +++ b/arch/arm64/include/asm/cputype.h > @@ -88,12 +88,14 @@ > > #define BRCM_CPU_PART_VULCAN 0x516 > > +#define QCOM_CPU_PART_KRYO_V1 0x281 > #define QCOM_CPU_PART_FALKOR_V1 0x800 > > #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) > #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) > #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) > #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) > +#define MIDR_QCOM_KRYO_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_V1) > #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1) > > #ifndef __ASSEMBLY__ > diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig > index e9cf5b67f1b7..4cde8f48a540 100644 > --- a/drivers/tty/serial/Kconfig > +++ b/drivers/tty/serial/Kconfig > @@ -73,6 +73,18 @@ config SERIAL_AMBA_PL011_CONSOLE > your boot loader (lilo or loadlin) about how to pass options to the > kernel at boot time.) > > +config QCOM_QDF2400_ERRATUM_44 > + bool "Work around QDF2400 SoC E44 stuck BUSY bit" > + depends on SERIAL_AMBA_PL011_CONSOLE=y > + default y > + help > + The BUSY bit in the Flag Register of the UART on the QDF2432v1 and > + QDF2400v1 SoCs may get stuck as 1, resulting in a hung serial console. > + Say Y here to work around the issue by checking TXFE == 0 instead of > + BUSY == 1 on affected systems. > + > + If unsure, say Y. Is there a reason anyone would ever want to turn this off? AFAICS you save a few dozen bytes in return for a kernel image which you know won't work properly on some hardware. That doesn't seem particularly worthwhile, and it's not like the PL011 driver isn't already ripe with unconditional vendor-specific data. > + > config SERIAL_EARLYCON_ARM_SEMIHOST > bool "Early console using ARM semihosting" > depends on ARM64 || ARM > diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c > index d4171d71a258..41e51901d6ef 100644 > --- a/drivers/tty/serial/amba-pl011.c > +++ b/drivers/tty/serial/amba-pl011.c > @@ -97,6 +97,7 @@ struct vendor_data { > unsigned int fr_dsr; > unsigned int fr_cts; > unsigned int fr_ri; > + unsigned int inv_fr; > bool access_32b; > bool oversampling; > bool dma_threshold; > @@ -141,6 +142,25 @@ static struct vendor_data vendor_sbsa = { > .fixed_options = true, > }; > > +#ifdef CONFIG_QCOM_QDF2400_ERRATUM_44 > +static struct vendor_data vendor_qdt_qdf2400_e44 = { > + .reg_offset = pl011_std_offsets, > + .fr_busy = UART011_FR_TXFE, > + .fr_dsr = UART01x_FR_DSR, > + .fr_cts = UART01x_FR_CTS, > + .fr_ri = UART011_FR_RI, > + .inv_fr = UART011_FR_TXFE, > + .access_32b = true, > + .oversampling = false, > + .dma_threshold = false, > + .cts_event_workaround = false, > + .always_enabled = true, > + .fixed_options = true, > +}; > +#else > +#define vendor_qdt_qdf2400_e44 vendor_sbsa > +#endif > + > static u16 pl011_st_offsets[REG_ARRAY_SIZE] = { > [REG_DR] = UART01x_DR, > [REG_ST_DMAWM] = ST_UART011_DMAWM, > @@ -1518,7 +1538,7 @@ static unsigned int pl011_tx_empty(struct uart_port *port) > { > struct uart_amba_port *uap = > container_of(port, struct uart_amba_port, port); > - unsigned int status = pl011_read(uap, REG_FR); > + unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr; > return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ? > 0 : TIOCSER_TEMT; > } > @@ -2218,7 +2238,8 @@ pl011_console_write(struct console *co, const char *s, unsigned int count) > * Finally, wait for transmitter to become empty > * and restore the TCR > */ > - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) > + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) > + & uap->vendor->fr_busy) > cpu_relax(); > if (!uap->vendor->always_enabled) > pl011_write(old_cr, uap, REG_CR); > @@ -2383,6 +2404,13 @@ static struct console amba_console = { > > #define AMBA_CONSOLE (&amba_console) > > +static bool qdf2400_e44(void) { > + u32 cpu_var_model = read_cpuid_id() & ~MIDR_REVISION_MASK; > + > + return (cpu_var_model == MIDR_QCOM_KRYO_V1 || > + cpu_var_model == MIDR_QCOM_FALKOR_V1); Are we to take it that every SoC now and always with any Kryo or Falkor core which also has an SBSA UART will require this workaround? There's a guarantee that the UART itself will definitely never be fixed in some future revision? That it'll never be integrated into, say, some Kryo/Cortex-Axx big.LITTLE system where you do still need the workaround, but might be making this check on the wrong core? If there's really no suitable ID register to identify this particular UART implementation, it probably needs to be described appropriately by the firmware - I can't claim knowledge of how that works under ACPI, but I do note that the only device ID currently being matched is "ARMH0011", which seems to me to be inappropriate to describe something which is not an ARM PL011, bugs or no bugs. Robin. > +} > + > static void pl011_putc(struct uart_port *port, int c) > { > while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF) > @@ -2645,12 +2673,18 @@ static int sbsa_uart_probe(struct platform_device *pdev) > uap->port.irq = ret; > > uap->reg_offset = vendor_sbsa.reg_offset; > - uap->vendor = &vendor_sbsa; > uap->fifosize = 32; > uap->port.iotype = vendor_sbsa.access_32b ? UPIO_MEM32 : UPIO_MEM; > uap->port.ops = &sbsa_uart_pops; > uap->fixed_baud = baudrate; > > + if (qdf2400_e44()) { > + uap->vendor = &vendor_qdt_qdf2400_e44; > + dev_info(&pdev->dev, "working around QDF2400 SoC erratum 44\n"); > + } else { > + uap->vendor = &vendor_sbsa; > + } > + > snprintf(uap->type, sizeof(uap->type), "SBSA"); > > r = platform_get_resource(pdev, IORESOURCE_MEM, 0); >
Robin Murphy wrote: > Is there a reason anyone would ever want to turn this off? AFAICS you > save a few dozen bytes in return for a kernel image which you know won't > work properly on some hardware. That doesn't seem particularly > worthwhile, and it's not like the PL011 driver isn't already ripe with > unconditional vendor-specific data. > >> > + >> > config SERIAL_EARLYCON_ARM_SEMIHOST >> > bool "Early console using ARM semihosting" >> > depends on ARM64 || ARM >> > diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c >> > index d4171d71a258..41e51901d6ef 100644 >> > --- a/drivers/tty/serial/amba-pl011.c >> > +++ b/drivers/tty/serial/amba-pl011.c >> > @@ -97,6 +97,7 @@ struct vendor_data { >> > unsigned int fr_dsr; >> > unsigned int fr_cts; >> > unsigned int fr_ri; >> > + unsigned int inv_fr; >> > bool access_32b; >> > bool oversampling; >> > bool dma_threshold; >> > @@ -141,6 +142,25 @@ static struct vendor_data vendor_sbsa = { >> > .fixed_options = true, >> > }; >> > >> > +#ifdef CONFIG_QCOM_QDF2400_ERRATUM_44 >> > +static struct vendor_data vendor_qdt_qdf2400_e44 = { >> > + .reg_offset = pl011_std_offsets, >> > + .fr_busy = UART011_FR_TXFE, >> > + .fr_dsr = UART01x_FR_DSR, >> > + .fr_cts = UART01x_FR_CTS, >> > + .fr_ri = UART011_FR_RI, >> > + .inv_fr = UART011_FR_TXFE, >> > + .access_32b = true, >> > + .oversampling = false, >> > + .dma_threshold = false, >> > + .cts_event_workaround = false, >> > + .always_enabled = true, >> > + .fixed_options = true, >> > +}; >> > +#else >> > +#define vendor_qdt_qdf2400_e44 vendor_sbsa >> > +#endif >> > + >> > static u16 pl011_st_offsets[REG_ARRAY_SIZE] = { >> > [REG_DR] = UART01x_DR, >> > [REG_ST_DMAWM] = ST_UART011_DMAWM, >> > @@ -1518,7 +1538,7 @@ static unsigned int pl011_tx_empty(struct uart_port *port) >> > { >> > struct uart_amba_port *uap = >> > container_of(port, struct uart_amba_port, port); >> > - unsigned int status = pl011_read(uap, REG_FR); >> > + unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr; >> > return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ? >> > 0 : TIOCSER_TEMT; >> > } >> > @@ -2218,7 +2238,8 @@ pl011_console_write(struct console *co, const char *s, unsigned int count) >> > * Finally, wait for transmitter to become empty >> > * and restore the TCR >> > */ >> > - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) >> > + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) >> > + & uap->vendor->fr_busy) >> > cpu_relax(); >> > if (!uap->vendor->always_enabled) >> > pl011_write(old_cr, uap, REG_CR); >> > @@ -2383,6 +2404,13 @@ static struct console amba_console = { >> > >> > #define AMBA_CONSOLE (&amba_console) >> > >> > +static bool qdf2400_e44(void) { >> > + u32 cpu_var_model = read_cpuid_id() & ~MIDR_REVISION_MASK; >> > + >> > + return (cpu_var_model == MIDR_QCOM_KRYO_V1 || >> > + cpu_var_model == MIDR_QCOM_FALKOR_V1); > Are we to take it that every SoC now and always with any Kryo or Falkor > core which also has an SBSA UART will require this workaround? No, only Kryo and Falkor V1 based SOCs have this problem. Falkor V2 will have this fixed. We intend to revert these fixes after Falkor V1 SOCs are no longer supported. > There's a > guarantee that the UART itself will definitely never be fixed in some > future revision? That it'll never be integrated into, say, some > Kryo/Cortex-Axx big.LITTLE system where you do still need the > workaround, but might be making this check on the wrong core? We are sure that this erratum is restricted to those specific SOCs. > > If there's really no suitable ID register to identify this particular > UART implementation, it probably needs to be described appropriately by > the firmware - I can't claim knowledge of how that works under ACPI, but > I do note that the only device ID currently being matched is "ARMH0011", > which seems to me to be inappropriate to describe something which is not > an ARM PL011, bugs or no bugs. ACPI is not like device tree. You can't just define a "qcom,needs-busy-bit-fix" property and call it a day. If you're saying that we should create a new ACPI HID, like QCOM0011, that probably would have worked as well, except hindsight is 20/20, and we already have firmware in the field.
On Wed, Feb 08, 2017 at 07:27:29AM -0600, Timur Tabi wrote: > Robin Murphy wrote: > >Are we to take it that every SoC now and always with any Kryo or Falkor > >core which also has an SBSA UART will require this workaround? > > No, only Kryo and Falkor V1 based SOCs have this problem. Falkor V2 > will have this fixed. We intend to revert these fixes after Falkor > V1 SOCs are no longer supported. Supported by whom? Generally, once something's upstreamed we expect it to remain supported. > > There's a > >guarantee that the UART itself will definitely never be fixed in some > >future revision? That it'll never be integrated into, say, some > >Kryo/Cortex-Axx big.LITTLE system where you do still need the > >workaround, but might be making this check on the wrong core? > > We are sure that this erratum is restricted to those specific SOCs. Ok. Then please identify the system as a whole (i.e. the SoC), and not the CPU. > >If there's really no suitable ID register to identify this particular > >UART implementation, it probably needs to be described appropriately by > >the firmware - I can't claim knowledge of how that works under ACPI, but > >I do note that the only device ID currently being matched is "ARMH0011", > >which seems to me to be inappropriate to describe something which is not > >an ARM PL011, bugs or no bugs. > > ACPI is not like device tree. You can't just define a > "qcom,needs-busy-bit-fix" property and call it a day. ACPI and DT have pretty much the exact same set of issues w.r.t. long term maintenance and support. The important step is raising issues early, rather than shipping FW with values that are not known to work (or known to not work)... > If you're saying that we should create a new ACPI HID, like > QCOM0011, that probably would have worked as well, except hindsight > is 20/20, and we already have firmware in the field. There are other way of identifying the system, if you look further than the device. Thanks, Thanks.
Mark Rutland wrote: >> No, only Kryo and Falkor V1 based SOCs have this problem. Falkor V2 >> will have this fixed. We intend to revert these fixes after Falkor >> V1 SOCs are no longer supported. > > Supported by whom? Qualcomm. > Generally, once something's upstreamed we expect it to remain supported. The V1 SOCs are technically pre-production, but still in wide use and it will be a while before they are all replaced by V2 SOCs. It's only because of the "ugliness" of the erratum and its work-around that we want to git rid of it when we can. Cov is better equipped to answer your other questions.
On Wed, Feb 08 2017 at 2:09:12 pm GMT, Timur Tabi <timur@codeaurora.org> wrote: > Mark Rutland wrote: > >>> No, only Kryo and Falkor V1 based SOCs have this problem. Falkor V2 >>> will have this fixed. We intend to revert these fixes after Falkor >>> V1 SOCs are no longer supported. >> >> Supported by whom? > > Qualcomm. > >> Generally, once something's upstreamed we expect it to remain supported. > > The V1 SOCs are technically pre-production, but still in wide use and > it will be a while before they are all replaced by V2 SOCs. It's only > because of the "ugliness" of the erratum and its work-around that we > want to git rid of it when we can. Interesting. How will you guarantee that nobody will ever want to run a mainline kernel on this box after a certain date? Self-destruct timer? ;-) As someone who runs mainline on HW that exceeded its "sell-by" date by a few decades, I'm genuinely interested. Thanks, M.
On 02/08/2017 09:35 AM, Marc Zyngier wrote: > Interesting. How will you guarantee that nobody will ever want to run a > mainline kernel on this box after a certain date? Self-destruct timer? > ;-) > > As someone who runs mainline on HW that exceeded its "sell-by" date by a > few decades, I'm genuinely interested. The affected chips are pre-production and are only available to select customers for a limited time.
Hi, On 02/08/2017 08:27 AM, Timur Tabi wrote: > Robin Murphy wrote: >> Is there a reason anyone would ever want to turn this off? AFAICS you >> save a few dozen bytes in return for a kernel image which you know won't >> work properly on some hardware. That doesn't seem particularly >> worthwhile, and it's not like the PL011 driver isn't already ripe with >> unconditional vendor-specific data. I'll make it unconditional in the next version. >>> > +static bool qdf2400_e44(void) { >>> > + u32 cpu_var_model = read_cpuid_id() & ~MIDR_REVISION_MASK; >>> > + >>> > + return (cpu_var_model == MIDR_QCOM_KRYO_V1 || >>> > + cpu_var_model == MIDR_QCOM_FALKOR_V1); > >> Are we to take it that every SoC now and always with any Kryo or Falkor >> core which also has an SBSA UART will require this workaround? I should have matched full 32 bit MIDR values, but I masked off the revision number after careful investigation of current and future known MIDR values and system pairings because MIDR_FOO isn't actually a 32 bit MIDR value as the name would suggest but has variant and revision zeroed. The CPU in the QDF2432 has a non-zero revision number and I didn't want to try to add an unprecedented revision number to cputype.h after Will told me I wasn't allowed to play with CPU toys for SoC games. I'm going to stop using MIDR, but as an aside, there are two very different CPUs, which may require different CPU errata workarounds, that (by accident) only differ by the variant field. Robert Richter's recent MIDR_CPU_VAR_REV is a welcome clarification. > No, only Kryo and Falkor V1 based SOCs have this problem. Falkor V2 > will have this fixed. A year ago, we were operating under the incorrect-in-hindsight assumption that the QDF2400 v1 SoC with Falkor v1 CPU would have this fixed. As for alternative means of identifying the errant hardware, I think that I or a colleague will experiment with the approach used for ACPI PCI quirks: checking the OEM ID, OEM Table ID, and OEM Revision. In this case, those header fields would come from the Serial Port Console Redirection (SPCR) table, rather than the PCI-specific MCFG table. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5b69b85ba1ddd36be01f5c57830b37a3c8256009 Regarding ongoing support, I don't see any disagreement that the code must be supported and maintained for the life of the hardware, however many years that ends up being. Thanks, Cov
On 02/07/2017 11:05 PM, Timur Tabi wrote: > Christopher Covington wrote: >> The Qualcomm Datacenter Technologies QDF2400 family of SoCs contains a >> custom (non-PrimeCell) implementation of the SBSA UART. Occasionally the >> BUSY bit in the Flag Register gets stuck as 1, erratum 44 for both 2432v1 >> and 2400v1 SoCs. Checking that the Transmit FIFO Empty (TXFE) bit is 0, >> instead of checking that the BUSY bit is 1, works around the issue. To >> facilitate this substitution when UART AMBA Port (UAP) data is available, >> introduce vendor-specific inversion of Feature Register bits. To keep the >> change small, this patch only works around the full console case, where UAP >> data is available, and does not handle the erratum for earlycon, as the UAP >> data is not available then. >> +#ifdef CONFIG_QCOM_QDF2400_ERRATUM_44 >> +static struct vendor_data vendor_qdt_qdf2400_e44 = { >> + .reg_offset = pl011_std_offsets, >> + .fr_busy = UART011_FR_TXFE, >> + .fr_dsr = UART01x_FR_DSR, >> + .fr_cts = UART01x_FR_CTS, >> + .fr_ri = UART011_FR_RI, >> + .inv_fr = UART011_FR_TXFE, >> + .access_32b = true, >> + .oversampling = false, >> + .dma_threshold = false, >> + .cts_event_workaround = false, >> + .always_enabled = true, >> + .fixed_options = true, >> +}; >> +#else >> +#define vendor_qdt_qdf2400_e44 vendor_sbsa >> +#endif > > Instead of the #else, just put the #ifdef inside qdf2400_e44(). That > way, the function always returns False if > CONFIG_QCOM_QDF2400_ERRATUM_44 is not defined. I'll consolidate the #ifdefery around qdf2400_e44(). > Also, don't you need to add a definition of .inv_fr in vendor_sbsa > and the other vendor_xxx structures? The current struct definitions appear to (ab)use implicit zero-initialization, so I did too. >> + >> static u16 pl011_st_offsets[REG_ARRAY_SIZE] = { >> [REG_DR] = UART01x_DR, >> [REG_ST_DMAWM] = ST_UART011_DMAWM, >> @@ -1518,7 +1538,7 @@ static unsigned int pl011_tx_empty(struct uart_port *port) >> { >> struct uart_amba_port *uap = >> container_of(port, struct uart_amba_port, port); >> - unsigned int status = pl011_read(uap, REG_FR); >> + unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr; >> return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ? >> 0 : TIOCSER_TEMT; >> } >> @@ -2218,7 +2238,8 @@ pl011_console_write(struct console *co, const char *s, unsigned int count) >> * Finally, wait for transmitter to become empty >> * and restore the TCR >> */ >> - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) >> + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) >> + & uap->vendor->fr_busy) > > I really think the XOR logic needs to be documented wherever it's > used. It's just too confusing. References such as the following for basic bit operations like setting, clearing, and toggling are easy enough to come by. http://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit-in-c-c Cov
On 02/08/2017 04:22 PM, Christopher Covington wrote: >>> >> - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) >>> >> + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) >>> >> + & uap->vendor->fr_busy) >> > >> > I really think the XOR logic needs to be documented wherever it's >> > used. It's just too confusing. > References such as the following for basic bit operations like > setting, clearing, and toggling are easy enough to come by. > > http://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit-in-c-c I know what XOR does. I was thinking a more high-level comment would be appropriate. With E44, we want to ignore the BUSY bit and use TXFE instead. So was hoping to a comment that says that: /* Normally, we poll until BUSY=0, but E44 says we should poll until TXFE=1 instead. So with E44, we set fr_busy to TXFE, but we have to invert it. */
On 02/08/2017 06:04 PM, Timur Tabi wrote: > On 02/08/2017 04:22 PM, Christopher Covington wrote: >>>> >> - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) >>>> >> + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) >>>> >> + & uap->vendor->fr_busy) >>> > >>> > I really think the XOR logic needs to be documented wherever it's >>> > used. It's just too confusing. >> References such as the following for basic bit operations like >> setting, clearing, and toggling are easy enough to come by. >> >> http://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit-in-c-c > > I know what XOR does. I was thinking a more high-level comment would > be appropriate. With E44, we want to ignore the BUSY bit and use TXFE > instead. So was hoping to a comment that says that: > > /* Normally, we poll until BUSY=0, but E44 says we should poll until > TXFE=1 instead. So with E44, we set fr_busy to TXFE, but we have to > invert it. */ Oh, yeah, that'd be good. I'll include better comments in v2. Cheers, Cov
diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 50da8391e9dd..0993ebb3e86b 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -65,3 +65,5 @@ stable kernels. | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| +| Qualcomm Tech. | QDF2432v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | +| Qualcomm Tech. | QDF2400v1 UART | SoC E44 | QCOM_QDF2400_ERRATUM_44 | diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index fc502713ab37..cb399c7fe6ec 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -88,12 +88,14 @@ #define BRCM_CPU_PART_VULCAN 0x516 +#define QCOM_CPU_PART_KRYO_V1 0x281 #define QCOM_CPU_PART_FALKOR_V1 0x800 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) +#define MIDR_QCOM_KRYO_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_V1) #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1) #ifndef __ASSEMBLY__ diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index e9cf5b67f1b7..4cde8f48a540 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -73,6 +73,18 @@ config SERIAL_AMBA_PL011_CONSOLE your boot loader (lilo or loadlin) about how to pass options to the kernel at boot time.) +config QCOM_QDF2400_ERRATUM_44 + bool "Work around QDF2400 SoC E44 stuck BUSY bit" + depends on SERIAL_AMBA_PL011_CONSOLE=y + default y + help + The BUSY bit in the Flag Register of the UART on the QDF2432v1 and + QDF2400v1 SoCs may get stuck as 1, resulting in a hung serial console. + Say Y here to work around the issue by checking TXFE == 0 instead of + BUSY == 1 on affected systems. + + If unsure, say Y. + config SERIAL_EARLYCON_ARM_SEMIHOST bool "Early console using ARM semihosting" depends on ARM64 || ARM diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c index d4171d71a258..41e51901d6ef 100644 --- a/drivers/tty/serial/amba-pl011.c +++ b/drivers/tty/serial/amba-pl011.c @@ -97,6 +97,7 @@ struct vendor_data { unsigned int fr_dsr; unsigned int fr_cts; unsigned int fr_ri; + unsigned int inv_fr; bool access_32b; bool oversampling; bool dma_threshold; @@ -141,6 +142,25 @@ static struct vendor_data vendor_sbsa = { .fixed_options = true, }; +#ifdef CONFIG_QCOM_QDF2400_ERRATUM_44 +static struct vendor_data vendor_qdt_qdf2400_e44 = { + .reg_offset = pl011_std_offsets, + .fr_busy = UART011_FR_TXFE, + .fr_dsr = UART01x_FR_DSR, + .fr_cts = UART01x_FR_CTS, + .fr_ri = UART011_FR_RI, + .inv_fr = UART011_FR_TXFE, + .access_32b = true, + .oversampling = false, + .dma_threshold = false, + .cts_event_workaround = false, + .always_enabled = true, + .fixed_options = true, +}; +#else +#define vendor_qdt_qdf2400_e44 vendor_sbsa +#endif + static u16 pl011_st_offsets[REG_ARRAY_SIZE] = { [REG_DR] = UART01x_DR, [REG_ST_DMAWM] = ST_UART011_DMAWM, @@ -1518,7 +1538,7 @@ static unsigned int pl011_tx_empty(struct uart_port *port) { struct uart_amba_port *uap = container_of(port, struct uart_amba_port, port); - unsigned int status = pl011_read(uap, REG_FR); + unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr; return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ? 0 : TIOCSER_TEMT; } @@ -2218,7 +2238,8 @@ pl011_console_write(struct console *co, const char *s, unsigned int count) * Finally, wait for transmitter to become empty * and restore the TCR */ - while (pl011_read(uap, REG_FR) & uap->vendor->fr_busy) + while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) + & uap->vendor->fr_busy) cpu_relax(); if (!uap->vendor->always_enabled) pl011_write(old_cr, uap, REG_CR); @@ -2383,6 +2404,13 @@ static struct console amba_console = { #define AMBA_CONSOLE (&amba_console) +static bool qdf2400_e44(void) { + u32 cpu_var_model = read_cpuid_id() & ~MIDR_REVISION_MASK; + + return (cpu_var_model == MIDR_QCOM_KRYO_V1 || + cpu_var_model == MIDR_QCOM_FALKOR_V1); +} + static void pl011_putc(struct uart_port *port, int c) { while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF) @@ -2645,12 +2673,18 @@ static int sbsa_uart_probe(struct platform_device *pdev) uap->port.irq = ret; uap->reg_offset = vendor_sbsa.reg_offset; - uap->vendor = &vendor_sbsa; uap->fifosize = 32; uap->port.iotype = vendor_sbsa.access_32b ? UPIO_MEM32 : UPIO_MEM; uap->port.ops = &sbsa_uart_pops; uap->fixed_baud = baudrate; + if (qdf2400_e44()) { + uap->vendor = &vendor_qdt_qdf2400_e44; + dev_info(&pdev->dev, "working around QDF2400 SoC erratum 44\n"); + } else { + uap->vendor = &vendor_sbsa; + } + snprintf(uap->type, sizeof(uap->type), "SBSA"); r = platform_get_resource(pdev, IORESOURCE_MEM, 0);