@@ -592,6 +592,26 @@ static int spear_mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
return 0;
}
+/*
+ * The purpose of this function is to ensure a memcpy_toio() with byte writes
+ * only. Its structure is inspired from the ARM implementation of _memcpy_toio()
+ * which also does single byte writes but cannot be used here as this is just an
+ * implementation detail and not part of the API. Not mentioning the comment
+ * stating that _memcpy_toio() should be optimized.
+ */
+static void spear_smi_memcpy_toio_b(volatile void __iomem *dest,
+ const void *src, size_t len)
+{
+ const unsigned char *from = src;
+
+ while (len) {
+ len--;
+ writeb(*from, dest);
+ from++;
+ dest++;
+ }
+}
+
static inline int spear_smi_cpy_toio(struct spear_smi *dev, u32 bank,
void __iomem *dest, const void *src, size_t len)
{
@@ -614,7 +634,23 @@ static inline int spear_smi_cpy_toio(struct spear_smi *dev, u32 bank,
ctrlreg1 = readl(dev->io_base + SMI_CR1);
writel((ctrlreg1 | WB_MODE) & ~SW_MODE, dev->io_base + SMI_CR1);
- memcpy_toio(dest, src, len);
+ /*
+ * In Write Burst mode (WB_MODE), the specs states that writes must be:
+ * - incremental
+ * - of the same size
+ * The ARM implementation of memcpy_toio() will optimize the number of
+ * I/O by using as much 4-byte writes as possible, surrounded by
+ * 2-byte/1-byte access if:
+ * - the destination is not 4-byte aligned
+ * - the length is not a multiple of 4-byte.
+ * Avoid this alternance of write access size by using our own 'byte
+ * access' helper if at least one of the two conditions above is true.
+ */
+ if (IS_ALIGNED(len, sizeof(u32)) &&
+ IS_ALIGNED((unsigned int)dest, sizeof(u32)))
+ memcpy_toio(dest, src, len);
+ else
+ spear_smi_memcpy_toio_b(dest, src, len);
writel(ctrlreg1, dev->io_base + SMI_CR1);
Any write with either dd or flashcp to a device driven by the spear_smi.c driver will pass through the spear_smi_cpy_toio() function. This function will get called for chunks of up to 256 bytes. If the amount of data is smaller, we may have a problem if the data length is not 4-byte aligned. In this situation, the kernel panics during the memcpy: # dd if=/dev/urandom bs=1001 count=1 of=/dev/mtd6 spear_smi_cpy_toio [620] dest c9070000, src c7be8800, len 256 spear_smi_cpy_toio [620] dest c9070100, src c7be8900, len 256 spear_smi_cpy_toio [620] dest c9070200, src c7be8a00, len 256 spear_smi_cpy_toio [620] dest c9070300, src c7be8b00, len 233 Unhandled fault: external abort on non-linefetch (0x808) at 0xc90703e8 [...] PC is at memcpy+0xcc/0x330 The above error occurs because the implementation of memcpy_toio() tries to optimize the number of I/O by writing 4 bytes at a time as much as possible, until there are less than 4 bytes left and then switches to word or byte writes. Unfortunately, the specification states about the Write Burst mode: "the next AHB Write request should point to the next incremented address and should have the same size (byte, half-word or word)" This means ARM architecture implementation of memcpy_toio() cannot reliably be used blindly here. Workaround this situation by update the write path to stick to byte access when the burst length is not multiple of 4. Fixes: f18dbbb1bfe0 ("mtd: ST SPEAr: Add SMI driver for serial NOR flash") Cc: Russell King <linux@armlinux.org.uk> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> --- Changes in v3: ============== * Prevent writes to non 4-byte aligned addresses to fail. * Use the IS_ALIGNED() macro. * Add a comment to explain why the 'memcpy_toio_b' helper is needed directly in the code. Changes in v2: ============== * This time I think the patch really fixes the problem: we use a memcpy_toio_b() function to force byte access only when needed. We don't use the _memcpy_toio() helper anymore as the fact that it is doing byte access is purely an implementation detail and is not part of the API, while the function is also flagged as "should be optimized". * One could argue that potentially memcpy_toio() does not ensure by design 4-bytes access only but I think it is good enough to use it in this case as the ARM implementation of this function is already extensively optimized. I also find clearer to use it than adding my own spear_smi_mempy_toio_l(). Please tell me if you disagree with this. * The volatile keyword has been taken voluntarily from the _memcpy_toio() implementation I was about to use previously. drivers/mtd/devices/spear_smi.c | 38 ++++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-)