Message ID | 1509331196-29412-2-git-send-email-xiaolei.li@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, 30 Oct 2017 10:39:56 +0800 Xiaolei Li <xiaolei.li@mediatek.com> wrote: > For MT2701 NAND Controller, there may generate infinite ECC decode IRQ > during long time burn test on some platforms. Once this issue occurred, > the ECC decode IRQ status cannot be cleared in the IRQ handler function, > and threads cannot be scheduled. > > ECC HW generates decode IRQ each sector, so there will have more than one > decode IRQ if read one page of large page NAND. > > Currently, ECC IRQ handle flow is that we will check whether it is decode > IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear > type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be > cleared at the same time. > Secondly, we will check whether all sectors are decoded by reading the > register ECC_DECDONE. This is because the current IRQ may be not dealed > in time, and the next sectors have been decoded before reading the > register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not > be generated. > Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we > will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by > programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the > next ECC IRQ. > > But, there is a timing issue between step one and two. When we read the > reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector, > and the ECC IRQ signal is cleared. But the last sector is decoded before > reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and > it means we will receive one extra ECC IRQ later. In step three, we will > find that all sectors were decoded, then disable ECC IRQ and return. > When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared > anymore. That is because the register ECC_DECIRQ_STA can only be cleared > when the register ECC_IRQ_REG(op) is enabled. But actually we have > disabled ECC IRQ in the previous ECC IRQ handle. So, there will > keep receiving ECC decode IRQ. > > Now, we read the register ECC_DECIRQ_STA once again before completing the > ecc done event. This ensures that there will be no extra ECC decode IRQ. > > Also, remove writel(0, ecc->regs + ECC_IRQ_REG(op)) from irq handler, > because ECC IRQ is disabled in mtk_ecc_disable(). And clear ECC_DECIRQ_STA > in mtk_ecc_disable() in case there is a timeout to wait decode IRQ. > > Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device") > Cc: <stable@vger.kernel.org> > Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com> Applied. Thanks, Boris > --- > drivers/mtd/nand/mtk_ecc.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c > index 7f3b065..c51d214 100644 > --- a/drivers/mtd/nand/mtk_ecc.c > +++ b/drivers/mtd/nand/mtk_ecc.c > @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id) > op = ECC_DECODE; > dec = readw(ecc->regs + ECC_DECDONE); > if (dec & ecc->sectors) { > + /* > + * Clear decode IRQ status once again to ensure that > + * there will be no extra IRQ. > + */ > + readw(ecc->regs + ECC_DECIRQ_STA); > ecc->sectors = 0; > complete(&ecc->done); > } else { > @@ -130,8 +135,6 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id) > } > } > > - writel(0, ecc->regs + ECC_IRQ_REG(op)); > - > return IRQ_HANDLED; > } > > @@ -307,6 +310,12 @@ void mtk_ecc_disable(struct mtk_ecc *ecc) > > /* disable it */ > mtk_ecc_wait_idle(ecc, op); > + if (op == ECC_DECODE) > + /* > + * Clear decode IRQ status in case there is a timeout to wait > + * decode IRQ. > + */ > + readw(ecc->regs + ECC_DECIRQ_STA); > writew(0, ecc->regs + ECC_IRQ_REG(op)); > writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op)); >
diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c index 7f3b065..c51d214 100644 --- a/drivers/mtd/nand/mtk_ecc.c +++ b/drivers/mtd/nand/mtk_ecc.c @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id) op = ECC_DECODE; dec = readw(ecc->regs + ECC_DECDONE); if (dec & ecc->sectors) { + /* + * Clear decode IRQ status once again to ensure that + * there will be no extra IRQ. + */ + readw(ecc->regs + ECC_DECIRQ_STA); ecc->sectors = 0; complete(&ecc->done); } else { @@ -130,8 +135,6 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id) } } - writel(0, ecc->regs + ECC_IRQ_REG(op)); - return IRQ_HANDLED; } @@ -307,6 +310,12 @@ void mtk_ecc_disable(struct mtk_ecc *ecc) /* disable it */ mtk_ecc_wait_idle(ecc, op); + if (op == ECC_DECODE) + /* + * Clear decode IRQ status in case there is a timeout to wait + * decode IRQ. + */ + readw(ecc->regs + ECC_DECIRQ_STA); writew(0, ecc->regs + ECC_IRQ_REG(op)); writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
For MT2701 NAND Controller, there may generate infinite ECC decode IRQ during long time burn test on some platforms. Once this issue occurred, the ECC decode IRQ status cannot be cleared in the IRQ handler function, and threads cannot be scheduled. ECC HW generates decode IRQ each sector, so there will have more than one decode IRQ if read one page of large page NAND. Currently, ECC IRQ handle flow is that we will check whether it is decode IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be cleared at the same time. Secondly, we will check whether all sectors are decoded by reading the register ECC_DECDONE. This is because the current IRQ may be not dealed in time, and the next sectors have been decoded before reading the register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not be generated. Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the next ECC IRQ. But, there is a timing issue between step one and two. When we read the reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector, and the ECC IRQ signal is cleared. But the last sector is decoded before reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and it means we will receive one extra ECC IRQ later. In step three, we will find that all sectors were decoded, then disable ECC IRQ and return. When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared anymore. That is because the register ECC_DECIRQ_STA can only be cleared when the register ECC_IRQ_REG(op) is enabled. But actually we have disabled ECC IRQ in the previous ECC IRQ handle. So, there will keep receiving ECC decode IRQ. Now, we read the register ECC_DECIRQ_STA once again before completing the ecc done event. This ensures that there will be no extra ECC decode IRQ. Also, remove writel(0, ecc->regs + ECC_IRQ_REG(op)) from irq handler, because ECC IRQ is disabled in mtk_ecc_disable(). And clear ECC_DECIRQ_STA in mtk_ecc_disable() in case there is a timeout to wait decode IRQ. Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device") Cc: <stable@vger.kernel.org> Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com> --- drivers/mtd/nand/mtk_ecc.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)