Message ID | 20230808105527.1707039-2-meenakshi.aggarwal@nxp.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Herbert Xu |
Series | crypto: caam - increase the domain of write memory barrier to full system |
Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com>

> -----Original Message-----
> From: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> Sent: Tuesday, August 8, 2023 4:25 PM
> To: Horia Geanta <horia.geanta@nxp.com>; Varun Sethi <V.Sethi@nxp.com>;
> Pankaj Gupta <pankaj.gupta@nxp.com>; Gaurav Jain <gaurav.jain@nxp.com>;
> herbert@gondor.apana.org.au; davem@davemloft.net;
> linux-crypto@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: Iuliana Prodan <iuliana.prodan@nxp.com>; Meenakshi Aggarwal
> <meenakshi.aggarwal@nxp.com>
> Subject: [PATCH] crypto: caam - increase the domain of write memory barrier to
> full system
>
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>
> In caam_jr_enqueue, under heavy DDR load, smp_wmb() or dma_wmb() fail to
> make the input ring be updated before the CAAM starts reading it. So, CAAM will
> process, again, an old descriptor address and will put it in the output ring. This
> will make caam_jr_dequeue() to fail, since this old descriptor is not in the
> software ring.
> To fix this, use wmb() which works on the full system instead of inner/outer
> shareable domains.
>
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 767fbf052536..5507d5d34a4c 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -464,8 +464,16 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
>  	 * Guarantee that the descriptor's DMA address has been written to
>  	 * the next slot in the ring before the write index is updated, since
>  	 * other cores may update this index independently.
> +	 *
> +	 * Under heavy DDR load, smp_wmb() or dma_wmb() fail to make the input
> +	 * ring be updated before the CAAM starts reading it. So, CAAM will
> +	 * process, again, an old descriptor address and will put it in the
> +	 * output ring. This will make caam_jr_dequeue() to fail, since this
> +	 * old descriptor is not in the software ring.
> +	 * To fix this, use wmb() which works on the full system instead of
> +	 * inner/outer shareable domains.
>  	 */
> -	smp_wmb();
> +	wmb();
>
>  	jrp->head = (head + 1) & (JOBR_DEPTH - 1);
>
> --
> 2.25.1
On Tue, Aug 08, 2023 at 12:55:26PM +0200, meenakshi.aggarwal@nxp.com wrote:
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>
> In caam_jr_enqueue, under heavy DDR load, smp_wmb() or dma_wmb()
> fail to make the input ring be updated before the CAAM starts
> reading it. So, CAAM will process, again, an old descriptor address
> and will put it in the output ring. This will make caam_jr_dequeue()
> to fail, since this old descriptor is not in the software ring.
> To fix this, use wmb() which works on the full system instead of
> inner/outer shareable domains.
>
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Indeed, smp_wmb is always wrong for barriers separating DMA writes.
I wonder if these should be changed too:

$ git grep smp_wmb drivers/crypto/
drivers/crypto/caam/jr.c:	smp_wmb();
drivers/crypto/cavium/cpt/cptvf_reqmanager.c:	smp_wmb();
drivers/crypto/hisilicon/qm.c:	smp_wmb();
drivers/crypto/marvell/octeontx/otx_cptvf_reqmgr.c:	smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:	smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:	smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:	smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:	smp_wmb();
drivers/crypto/talitos.c:	smp_wmb();
drivers/crypto/talitos.c:	smp_wmb();
$

Cheers,
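
For readers unfamiliar with the "inner/outer shareable" wording in the commit message, the table below is a rough sketch of how the three write barriers discussed in this thread map to instructions on arm64, as far as I recall from arch/arm64/include/asm/barrier.h. The struct, the table and barrier_lookup() are hypothetical names used only for illustration. They are not kernel API, and other architectures (and !CONFIG_SMP builds, where smp_wmb() is only a compiler barrier) differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: one row per barrier macro named in the thread. */
struct barrier_info {
	const char *macro;       /* kernel barrier macro */
	const char *arm64_insn;  /* instruction it expands to on arm64 (assumed) */
	const char *domain;      /* shareability domain the barrier covers */
};

static const struct barrier_info barrier_table[] = {
	{ "smp_wmb", "dmb ishst", "inner shareable" },
	{ "dma_wmb", "dmb oshst", "outer shareable" },
	{ "wmb",     "dsb st",    "full system"     },
};

static_assert(sizeof(barrier_table) / sizeof(barrier_table[0]) == 3,
	      "one row per barrier discussed in the thread");

/* Return the table row for a macro name, or NULL if it is not listed. */
static const struct barrier_info *barrier_lookup(const char *macro)
{
	for (size_t i = 0; i < sizeof(barrier_table) / sizeof(barrier_table[0]); i++) {
		if (strcmp(barrier_table[i].macro, macro) == 0)
			return &barrier_table[i];
	}
	return NULL;
}
```

Read this way, the patch moves caam_jr_enqueue() from the first row to the last one, skipping the dma_wmb() row that the commit message also reports as insufficient.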
On Tue, Aug 08, 2023 at 12:55:26PM +0200, meenakshi.aggarwal@nxp.com wrote:
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
>
> In caam_jr_enqueue, under heavy DDR load, smp_wmb() or dma_wmb()
> fail to make the input ring be updated before the CAAM starts
> reading it. So, CAAM will process, again, an old descriptor address
> and will put it in the output ring. This will make caam_jr_dequeue()
> to fail, since this old descriptor is not in the software ring.
> To fix this, use wmb() which works on the full system instead of
> inner/outer shareable domains.
>
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Patch applied. Thanks.
diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index 767fbf052536..5507d5d34a4c 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -464,8 +464,16 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
 	 * Guarantee that the descriptor's DMA address has been written to
 	 * the next slot in the ring before the write index is updated, since
 	 * other cores may update this index independently.
+	 *
+	 * Under heavy DDR load, smp_wmb() or dma_wmb() fail to make the input
+	 * ring be updated before the CAAM starts reading it. So, CAAM will
+	 * process, again, an old descriptor address and will put it in the
+	 * output ring. This will make caam_jr_dequeue() to fail, since this
+	 * old descriptor is not in the software ring.
+	 * To fix this, use wmb() which works on the full system instead of
+	 * inner/outer shareable domains.
 	 */
-	smp_wmb();
+	wmb();

 	jrp->head = (head + 1) & (JOBR_DEPTH - 1);
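
The ordering requirement stated in the comment above (slot contents first, index second) can be sketched in userspace C11, under some loud assumptions: struct ring, ring_enqueue(), ring_dequeue() and RING_DEPTH are hypothetical names, not taken from jr.c, and the consumer here is another CPU thread rather than the CAAM. A C11 release fence only orders stores as seen by other CPU threads, which is roughly the guarantee smp_wmb() gives. It cannot model a device reading the ring by DMA, and that gap is the reason the patch switches to wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_DEPTH 8u	/* hypothetical depth; must be a power of two */
static_assert((RING_DEPTH & (RING_DEPTH - 1u)) == 0,
	      "RING_DEPTH must be a power of two");

struct ring {
	uint64_t slot[RING_DEPTH];	/* stands in for the ring of descriptor addresses */
	atomic_uint head;		/* producer index, read by the consumer */
	atomic_uint tail;		/* consumer index, read by the producer */
};

/* Single-producer enqueue. Returns 0 on success, -1 if the ring is full. */
static int ring_enqueue(struct ring *r, uint64_t desc_addr)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (((head + 1u) & (RING_DEPTH - 1u)) == tail)
		return -1;

	/* Step 1: fill the slot. */
	r->slot[head] = desc_addr;

	/*
	 * Step 2: order the slot write before the index update. This is the
	 * position of smp_wmb()/wmb() in the patch; the release fence is only
	 * a CPU-to-CPU analogue.
	 */
	atomic_thread_fence(memory_order_release);

	/* Step 3: publish the new index. */
	atomic_store_explicit(&r->head, (head + 1u) & (RING_DEPTH - 1u),
			      memory_order_relaxed);
	return 0;
}

/* Single-consumer dequeue. Returns 0 on success, -1 if the ring is empty. */
static int ring_dequeue(struct ring *r, uint64_t *out)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail == head)
		return -1;

	*out = r->slot[tail];
	atomic_store_explicit(&r->tail, (tail + 1u) & (RING_DEPTH - 1u),
			      memory_order_release);
	return 0;
}
```

If step 2 is missing or too weak for the consumer in question, the consumer can observe the new index while the slot still holds a stale address. That matches the failure described in the commit message, where the CAAM processes an old descriptor address again.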