Message ID | 1472443093.27061.4.camel@mtkswgap22 (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
+Jorge Hi Roger, On Mon, 29 Aug 2016 11:58:13 +0800 mtk04561 <rogercc.lin@mediatek.com> wrote: > Hi Boris, > > I found a problem in current Mediatek upstream NAND driver (of nand/next > branch). When testing the upstream driver with UBIFS file system, it has > some chance to create incorrect ECC data in the second subpage, this > makes UBIFS failed to mount. Below patch fixes this problem. Please help > to review the patch and upstream. Thank you. > > Best regards, > Roger Can you put this message after the '---' (below your SoB), so that it does not appear when applying the patch. > > > > Description: Drop the 'Description:' line. > When mtk_ecc_encode() is writing the ECC parity data to the OOB region, > because each register is 4 bytes in length, but the len's unit is in > bytes, the operation in the for loop will cross the ECC's boundary. > > And when mtk_nfc_do_write_page() comparing the sector number, because > the sector number field is at the 12th-bit position of NFI_BYTELEN > register, the masked register should be shifted 12 bits before being > compared. The result of this bug may cause the second subpage has > incomplete ECC parity bytes. Maybe you could split those changes in 2 patches (not a strong requirement). > > Test: Drop the 'Test:' line. > The patch passed the test of UBIFS file-system read/write on Mediatek's > RFB. The tested driver is checked-out from LEDE OpenWRT project's > upstream driver, which is pretty much same as nand/next branch upstream > driver(git clone https://git.lede-project.org/source.git). > > > Signed-off-by: RogerCC Lin <rogercc.lin@mediatek.com> > --- > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c > index 25a4fbd..0a7ce8d 100644 > --- a/drivers/mtd/nand/mtk_ecc.c > +++ b/drivers/mtd/nand/mtk_ecc.c > @@ -366,7 +366,8 @@ int mtk_ecc_encode(struct mtk_ecc *ecc, struct > mtk_ecc_config *config, > u8 *data, u32 bytes) > { > dma_addr_t addr; > - u32 *p, len, i; > + u8 *p; > + u32 len, i, val; > int ret = 0; > > addr = dma_map_single(ecc->dev, data, bytes, DMA_TO_DEVICE); > @@ -395,8 +396,11 @@ int mtk_ecc_encode(struct mtk_ecc *ecc, struct > mtk_ecc_config *config, > p = (u32 *)(data + bytes); Hm, you should now do: p = data + bytes; > > /* write the parity bytes generated by the ECC back to the OOB > region */ > - for (i = 0; i < len; i++) > - p[i] = readl(ecc->regs + ECC_ENCPAR(i)); > + for (i = 0; i < len; i++) { > + if ((i % 4) == 0) > + val = readl(ecc->regs + ECC_ENCPAR(i >> 2)); > + p[i] = *((u8 *)&val + (i % 4)); Please, use a shift operator here: p[i] = (val >> ((i % 4) * 8)) & 0xff; Otherwise you're exposed to endianness conversion problems (this works fine in your case because you're using a little-endian kernel). > + } > timeout: > > dma_unmap_single(ecc->dev, addr, bytes, DMA_TO_DEVICE); > diff --git a/drivers/mtd/nand/mtk_nand.c b/drivers/mtd/nand/mtk_nand.c > index ddaa2ac..9c49121 100644 > --- a/drivers/mtd/nand/mtk_nand.c > +++ b/drivers/mtd/nand/mtk_nand.c > @@ -699,8 +699,8 @@ static int mtk_nfc_do_write_page(struct mtd_info > *mtd, struct nand_chip *chip, > } > > ret = readl_poll_timeout_atomic(nfc->regs + NFI_ADDRCNTR, reg, > - (reg & CNTR_MASK) >= > chip->ecc.steps, > - 10, MTK_TIMEOUT); > + ((reg & CNTR_MASK) >> 12) >= > + chip->ecc.steps, 10, Please keep '((reg & CNTR_MASK) >> 12) >= chip->ecc.steps' on the same line. If you want to comply with the 80 char lines, you can do: ret = readl_poll_timeout_atomic(nfc->regs + NFI_ADDRCNTR, reg, ((reg & CNTR_MASK) >> 12) >= chip->ecc.steps, 10, MTK_TIMEOUT); > if (ret) > dev_err(dev, "hwecc write timeout\n"); > > @@ -902,8 +902,8 @@ static int mtk_nfc_read_subpage(struct mtd_info > *mtd, struct nand_chip *chip, > dev_warn(nfc->dev, "read ahb/dma done timeout\n"); > > rc = readl_poll_timeout_atomic(nfc->regs + NFI_BYTELEN, reg, > - (reg & CNTR_MASK) >= sectors, 10, > - MTK_TIMEOUT); > + ((reg & CNTR_MASK) >> 12) >= > sectors, > + 10, MTK_TIMEOUT); > if (rc < 0) { > dev_err(nfc->dev, "subpage done timeout\n"); > bitflips = -EIO; > >
On 08/29/2016 10:10 AM, Boris Brezillon wrote:
> +Jorge
Thanks Boris. Xiaolei is looking into this - he is aware of the patch.
This series fix chances to write incorrect ECC data which may cause uncorrectable ECC error when reading. changes since v1: - separate patches into 2. - use shift operator with byte access to avoid endianness conversion problems. - follow linux coding style. The patch passed the test of UBIFS file-system read/write on Mediatek's RFB. The tested driver is checked-out from LEDE OpenWRT project's upstream driver, which is pretty much same as nand/next branch upstream driver(git clone https://git.lede-project.org/source.git). RogerCC Lin (2): mtd: nand: fix generating over-boundary ECC data when writing. mtd: nand: fix chances to create incomplete ECC data when writing. drivers/mtd/nand/mtk_ecc.c | 12 ++++++++---- drivers/mtd/nand/mtk_nand.c | 8 ++++---- 2 files changed, 12 insertions(+), 8 deletions(-)
Hi Boris and all, I am sorry that though I have change the sender name in the email, but It seems the patchwork still show the wrong name, I will check what wrong with my email, sorry for the inconvenience. Roger
diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c index 25a4fbd..0a7ce8d 100644 --- a/drivers/mtd/nand/mtk_ecc.c +++ b/drivers/mtd/nand/mtk_ecc.c @@ -366,7 +366,8 @@ int mtk_ecc_encode(struct mtk_ecc *ecc, struct mtk_ecc_config *config, u8 *data, u32 bytes) { dma_addr_t addr; - u32 *p, len, i; + u8 *p; + u32 len, i, val; int ret = 0; addr = dma_map_single(ecc->dev, data, bytes, DMA_TO_DEVICE); @@ -395,8 +396,11 @@ int mtk_ecc_encode(struct mtk_ecc *ecc, struct mtk_ecc_config *config, p = (u32 *)(data + bytes); /* write the parity bytes generated by the ECC back to the OOB region */ - for (i = 0; i < len; i++) - p[i] = readl(ecc->regs + ECC_ENCPAR(i)); + for (i = 0; i < len; i++) { + if ((i % 4) == 0) + val = readl(ecc->regs + ECC_ENCPAR(i >> 2)); + p[i] = *((u8 *)&val + (i % 4)); + } timeout: dma_unmap_single(ecc->dev, addr, bytes, DMA_TO_DEVICE); diff --git a/drivers/mtd/nand/mtk_nand.c b/drivers/mtd/nand/mtk_nand.c index ddaa2ac..9c49121 100644 --- a/drivers/mtd/nand/mtk_nand.c +++ b/drivers/mtd/nand/mtk_nand.c @@ -699,8 +699,8 @@ static int mtk_nfc_do_write_page(struct mtd_info *mtd, struct nand_chip *chip, } ret = readl_poll_timeout_atomic(nfc->regs + NFI_ADDRCNTR, reg, - (reg & CNTR_MASK) >= chip->ecc.steps, - 10, MTK_TIMEOUT); + ((reg & CNTR_MASK) >> 12) >= + chip->ecc.steps, 10, MTK_TIMEOUT);
Hi Boris, I found a problem in current Mediatek upstream NAND driver (of nand/next branch). When testing the upstream driver with UBIFS file system, it has some chance to create incorrect ECC data in the second subpage, this makes UBIFS failed to mount. Below patch fixes this problem. Please help to review the patch and upstream. Thank you. Best regards, Roger Description: When mtk_ecc_encode() is writing the ECC parity data to the OOB region, because each register is 4 bytes in length, but the len's unit is in bytes, the operation in the for loop will cross the ECC's boundary. And when mtk_nfc_do_write_page() comparing the sector number, because the sector number field is at the 12th-bit position of NFI_BYTELEN register, the masked register should be shifted 12 bits before being compared. The result of this bug may cause the second subpage has incomplete ECC parity bytes. Test: The patch passed the test of UBIFS file-system read/write on Mediatek's RFB. The tested driver is checked-out from LEDE OpenWRT project's upstream driver, which is pretty much same as nand/next branch upstream driver(git clone https://git.lede-project.org/source.git). Signed-off-by: RogerCC Lin <rogercc.lin@mediatek.com> --- if (ret) dev_err(dev, "hwecc write timeout\n"); @@ -902,8 +902,8 @@ static int mtk_nfc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip, dev_warn(nfc->dev, "read ahb/dma done timeout\n"); rc = readl_poll_timeout_atomic(nfc->regs + NFI_BYTELEN, reg, - (reg & CNTR_MASK) >= sectors, 10, - MTK_TIMEOUT); + ((reg & CNTR_MASK) >> 12) >= sectors, + 10, MTK_TIMEOUT); if (rc < 0) { dev_err(nfc->dev, "subpage done timeout\n"); bitflips = -EIO;