Message ID | 20220701132735.1594822-1-clabbe@baylibre.com (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC] crypto: flush poison data |
On 01/07/2022 14:27, Corentin Labbe wrote:
> On my Allwinner D1 nezha, the sun8i-ce fails self-tests due to:
> alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
>
> In fact the buffer is not overrun by the device but by the dma_map_single() operation.
>
> To prevent any corruption of the poisoned data, simply flush them before
> giving the buffer to the tested driver.
>
> Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
> ---
>
> Hello
>
> I put this patch as RFC, since this behaviour happens only on not yet merged RISCV code.
> (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
>
> Regards
>
>  crypto/testmgr.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index c59bd9e07978..187163e2e593 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -19,6 +19,7 @@
>  #include <crypto/aead.h>
>  #include <crypto/hash.h>
>  #include <crypto/skcipher.h>
> +#include <linux/cacheflush.h>
>  #include <linux/err.h>
>  #include <linux/fips.h>
>  #include <linux/module.h>
> @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
>  static inline void testmgr_poison(void *addr, size_t len)
>  {
>  	memset(addr, TESTMGR_POISON_BYTE, len);
> +	/* Be sure data is written to prevent corruption from some DMA sync */
> +	flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
>  }
>
>  /* Is the memory region still fully poisoned? */

Why are you flushing the instruction cache and not the data cache?

On Fri, 1 Jul 2022 13:27:35 +0000
Corentin Labbe <clabbe@baylibre.com> wrote:

Hi,

> On my Allwinner D1 nezha, the sun8i-ce fails self-tests due to:
> alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
>
> In fact the buffer is not overrun by the device but by the dma_map_single() operation.
>
> To prevent any corruption of the poisoned data, simply flush them before
> giving the buffer to the tested driver.
>
> Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
> ---
>
> Hello
>
> I put this patch as RFC, since this behaviour happens only on not yet merged RISCV code.
> (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
>
> Regards
>
>  crypto/testmgr.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> index c59bd9e07978..187163e2e593 100644
> --- a/crypto/testmgr.c
> +++ b/crypto/testmgr.c
> @@ -19,6 +19,7 @@
>  #include <crypto/aead.h>
>  #include <crypto/hash.h>
>  #include <crypto/skcipher.h>
> +#include <linux/cacheflush.h>
>  #include <linux/err.h>
>  #include <linux/fips.h>
>  #include <linux/module.h>
> @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
>  static inline void testmgr_poison(void *addr, size_t len)
>  {
>  	memset(addr, TESTMGR_POISON_BYTE, len);
> +	/* Be sure data is written to prevent corruption from some DMA sync */
> +	flush_icache_range((unsigned long)addr, (unsigned long)addr + len);

As Ben already mentioned, this looks like it has nothing to do with the I
cache. I guess you picked that because it does the required cache cleaning
and doesn't require a vma parameter?

But more importantly: I think drivers shouldn't do explicit cache
maintenance; this is what the DMA API is for.
So if you get DMA corruption, then this points to some flaw in the DMA API
usage: either the buffer belongs to the CPU, and then the device must not
write to it, or the buffer belongs to the device, and then the CPU cannot
expect to write to it without that data potentially getting corrupted.
So can you check if that's the case?

Cheers,
Andre

>  }
>
>  /* Is the memory region still fully poisoned? */

On Fri, Jul 01, 2022 at 03:36:14PM +0100, Andre Przywara wrote:
> On Fri, 1 Jul 2022 13:27:35 +0000
> Corentin Labbe <clabbe@baylibre.com> wrote:
>
> Hi,
>
> > On my Allwinner D1 nezha, the sun8i-ce fails self-tests due to:
> > alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
> >
> > In fact the buffer is not overrun by the device but by the dma_map_single() operation.
> >
> > To prevent any corruption of the poisoned data, simply flush them before
> > giving the buffer to the tested driver.
> >
> > Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
> > ---
> >
> > Hello
> >
> > I put this patch as RFC, since this behaviour happens only on not yet merged RISCV code.
> > (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
> >
> > Regards
> >
> >  crypto/testmgr.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> > index c59bd9e07978..187163e2e593 100644
> > --- a/crypto/testmgr.c
> > +++ b/crypto/testmgr.c
> > @@ -19,6 +19,7 @@
> >  #include <crypto/aead.h>
> >  #include <crypto/hash.h>
> >  #include <crypto/skcipher.h>
> > +#include <linux/cacheflush.h>
> >  #include <linux/err.h>
> >  #include <linux/fips.h>
> >  #include <linux/module.h>
> > @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> >  static inline void testmgr_poison(void *addr, size_t len)
> >  {
> >  	memset(addr, TESTMGR_POISON_BYTE, len);
> > +	/* Be sure data is written to prevent corruption from some DMA sync */
> > +	flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
>
> As Ben already mentioned, this looks like it has nothing to do with the I
> cache. I guess you picked that because it does the required cache cleaning
> and doesn't require a vma parameter?

The reality is simpler: I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.

>
> But more importantly: I think drivers shouldn't do explicit cache
> maintenance; this is what the DMA API is for.
> So if you get DMA corruption, then this points to some flaw in the DMA API
> usage: either the buffer belongs to the CPU, and then the device must not
> write to it, or the buffer belongs to the device, and then the CPU cannot
> expect to write to it without that data potentially getting corrupted.

The device does nothing wrong: I removed all sun8i-ce device actions (and kept
the DMA API actions) and the whole buffer is still corrupted.

Anyway, if the driver were doing something wrong, it should have failed on arm or arm64.
See my previous report https://lore.kernel.org/lkml/YllWTN+15CoskNBt@Red/ which
shows the problem (the invalidated size is bigger than the dma_sync length parameter).

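Editorial illustration: one possible mechanism behind "the invalidated size is bigger than the dma_sync length" can be shown with a toy userspace model of a write-back cache. This is only a sketch under stated assumptions, not the RISC-V CMO code and not confirmed as the root cause in this thread. The 64-byte line size, the `struct toy` model, and every function name below are hypothetical; the data buffer and the poison region are assumed to share one cache line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LINE    64		/* assumed cache line size in bytes */
#define NLINES  2
#define POISON  0xfe		/* assumed poison pattern */

/* Toy write-back cache: line i caches ram[i * LINE .. i * LINE + LINE). */
struct toy {
	unsigned char ram[NLINES * LINE];
	unsigned char cache[NLINES][LINE];
	bool valid[NLINES];
	bool dirty[NLINES];
};

/* Load a line from RAM if it is not already cached. */
static void fill_line(struct toy *t, size_t l)
{
	if (!t->valid[l]) {
		memcpy(t->cache[l], &t->ram[l * LINE], LINE);
		t->valid[l] = true;
		t->dirty[l] = false;
	}
}

/* CPU store: lands in the cache only and marks the line dirty. */
static void cpu_write(struct toy *t, size_t addr, unsigned char v)
{
	fill_line(t, addr / LINE);
	t->cache[addr / LINE][addr % LINE] = v;
	t->dirty[addr / LINE] = true;
}

/* CPU load: served from the cache, refilled from RAM on a miss. */
static unsigned char cpu_read(struct toy *t, size_t addr)
{
	fill_line(t, addr / LINE);
	return t->cache[addr / LINE][addr % LINE];
}

/* Clean: write dirty lines overlapping [addr, addr + len) to RAM (len > 0). */
static void cache_clean(struct toy *t, size_t addr, size_t len)
{
	for (size_t l = addr / LINE; l <= (addr + len - 1) / LINE; l++)
		if (t->valid[l] && t->dirty[l]) {
			memcpy(&t->ram[l * LINE], t->cache[l], LINE);
			t->dirty[l] = false;
		}
}

/* Invalidate: drop whole lines overlapping the range, dirty data included. */
static void cache_invalidate(struct toy *t, size_t addr, size_t len)
{
	for (size_t l = addr / LINE; l <= (addr + len - 1) / LINE; l++)
		t->valid[l] = t->dirty[l] = false;
}

/* Data buffer at [0, 16), poison at [16, 32): both share cache line 0. */
static bool poison_survives(bool clean_first)
{
	struct toy t;

	memset(&t, 0, sizeof(t));
	for (size_t a = 16; a < 32; a++)
		cpu_write(&t, a, POISON);	/* poison sits dirty in the cache */
	if (clean_first)
		cache_clean(&t, 16, 16);	/* push the poison out to RAM first */
	cache_invalidate(&t, 0, 16);		/* sync of the data buffer only */
	for (size_t a = 16; a < 32; a++)
		if (cpu_read(&t, a) != POISON)
			return false;
	return true;
}
```

In this model `poison_survives(false)` reports the poison as lost even though no device ever wrote to memory, while `poison_survives(true)` keeps it intact, which is the effect the RFC patch tries to obtain by flushing after the memset().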
On Fri, Jul 01, 2022 at 02:35:41PM +0100, Ben Dooks wrote:
> On 01/07/2022 14:27, Corentin Labbe wrote:
> > On my Allwinner D1 nezha, the sun8i-ce fails self-tests due to:
> > alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0
> >
> > In fact the buffer is not overrun by the device but by the dma_map_single() operation.
> >
> > To prevent any corruption of the poisoned data, simply flush them before
> > giving the buffer to the tested driver.
> >
> > Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
> > ---
> >
> > Hello
> >
> > I put this patch as RFC, since this behaviour happens only on not yet merged RISCV code.
> > (Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)
> >
> > Regards
> >
> >  crypto/testmgr.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/crypto/testmgr.c b/crypto/testmgr.c
> > index c59bd9e07978..187163e2e593 100644
> > --- a/crypto/testmgr.c
> > +++ b/crypto/testmgr.c
> > @@ -19,6 +19,7 @@
> >  #include <crypto/aead.h>
> >  #include <crypto/hash.h>
> >  #include <crypto/skcipher.h>
> > +#include <linux/cacheflush.h>
> >  #include <linux/err.h>
> >  #include <linux/fips.h>
> >  #include <linux/module.h>
> > @@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
> >  static inline void testmgr_poison(void *addr, size_t len)
> >  {
> >  	memset(addr, TESTMGR_POISON_BYTE, len);
> > +	/* Be sure data is written to prevent corruption from some DMA sync */
> > +	flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
> >  }
> >
> >  /* Is the memory region still fully poisoned? */
>
> Why are you flushing the instruction cache and not the data cache?
>

I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.
I tried flush_dcache_range(), but it does not seem to be implemented on RISC-V.
And flush_dcache_page(virt_to_page(addr), len) produces a kernel panic.

Any advice on how to go further?

On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
>
> I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.
> I tried flush_dcache_range(), but it does not seem to be implemented on RISC-V.

That driver is broken and should not have been merged in that form.

> And flush_dcache_page(virt_to_page(addr), len) produces a kernel panic.

And that is a good thing. Drivers have no business doing their own cache
flushing. That is the job of the dma-mapping implementation, so I'd
suggest looking for problems there.

On Tue, Jul 05, 2022 at 06:42:13PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
> >
> > I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.
> > I tried flush_dcache_range(), but it does not seem to be implemented on RISC-V.
>
> That driver is broken and should not have been merged in that form.
>
> > And flush_dcache_page(virt_to_page(addr), len) produces a kernel panic.
>
> And that is a good thing. Drivers have no business doing their own cache
> flushing. That is the job of the dma-mapping implementation, so I'd
> suggest looking for problems there.

I am sorry, but this code is not in a driver but in the crypto API code.

It seems that I didn't explain the problem well.

The crypto API runs a number of crypto operations against every driver that registers crypto algorithms.
For each buffer given to the tested driver, the crypto API sets up a poison buffer contiguous to this buffer.
The goal is to detect whether the driver does bad things outside of the buffer it got.

So the tested driver doesn't know about the existence of this poison buffer and so cannot handle it.

My problem is that a dma_sync on the data buffer corrupts the poison buffer as collateral damage.
Probably because the sync operates on a larger region than the requested dma_sync length.
So I try to flush the poison data in the crypto API.

Any hint on how to do it properly is welcome.

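Editorial illustration: the poison-buffer check described in this message can be sketched in plain userspace C. This is not the testmgr code itself, only a minimal model of the idea. `poison()` mirrors what testmgr_poison() does in the quoted diff; `is_poisoned()`, `guard_survives()`, the buffer sizes, and the memset() standing in for a driver are hypothetical, and the 0xfe pattern is an assumed stand-in for TESTMGR_POISON_BYTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0xfe	/* assumed value; stands in for TESTMGR_POISON_BYTE */
#define DATA_LEN    16		/* bytes the "driver" is allowed to write */
#define GUARD_LEN   16		/* poisoned bytes placed right after the data */

/* Fill a region with the poison pattern, as testmgr_poison() does. */
static void poison(void *addr, size_t len)
{
	memset(addr, POISON_BYTE, len);
}

/* True if every byte of the region still holds the poison pattern. */
static bool is_poisoned(const void *addr, size_t len)
{
	const unsigned char *p = addr;

	for (size_t i = 0; i < len; i++)
		if (p[i] != POISON_BYTE)
			return false;
	return true;
}

/*
 * Hypothetical stand-in for one self-test: a "driver" writes `written`
 * bytes into a buffer whose first DATA_LEN bytes are the real destination.
 * Returns true if the guard region behind the destination survived.
 */
static bool guard_survives(size_t written)
{
	unsigned char buf[DATA_LEN + GUARD_LEN];

	assert(written <= sizeof(buf));
	poison(buf + DATA_LEN, GUARD_LEN);
	memset(buf, 0xaa, written);	/* the "driver" output */
	return is_poisoned(buf + DATA_LEN, GUARD_LEN);
}
```

A write of exactly `DATA_LEN` bytes leaves the guard intact, while a single extra byte is detected. The same check cannot tell a misbehaving driver from a cache operation that disturbs the guard, which is why the self-test reports "overran dst buffer" here.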
On Tue, Jul 05, 2022 at 07:56:11PM +0200, LABBE Corentin wrote:
> My problem is that a dma_sync on the data buffer corrupts the poison buffer as collateral damage.
> Probably because the sync operates on a larger region than the requested dma_sync length.
> So I try to flush the poison data in the crypto API.

Data structures that are DMAed to must be aligned to
the value returned by dma_get_cache_alignment(), as non-coherent DMA
by definition can disturb the data inside that boundary. That is not
a bug but fundamentally part of how DMA works when the device attachment
is not cache coherent.

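Editorial illustration: the alignment point above comes down to simple address arithmetic. The sketch below, in userspace C, rounds a sync range out to whole cache lines and checks whether a neighbouring region falls inside the result. The helper names and `struct span` are hypothetical, and the line size is passed in as a plain parameter; in the kernel the relevant value would come from dma_get_cache_alignment().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
struct span {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Range actually covered when a cache operation on [addr, addr + len)
 * is rounded out to whole lines of `line` bytes (a power of two).
 */
static struct span cacheline_span(uintptr_t addr, size_t len, size_t line)
{
	struct span s;

	assert(line != 0 && (line & (line - 1)) == 0);
	s.start = addr & ~(uintptr_t)(line - 1);
	s.end = (addr + len + line - 1) & ~(uintptr_t)(line - 1);
	return s;
}

/* True if the rounded-out operation also touches [other, other + other_len). */
static bool sync_touches(uintptr_t addr, size_t len, size_t line,
			 uintptr_t other, size_t other_len)
{
	struct span s = cacheline_span(addr, len, line);

	return other < s.end && other + other_len > s.start;
}
```

With an assumed 64-byte line, a 16-byte buffer at 0x1010 is synced as the whole range [0x1000, 0x1040), so a poison region at 0x1020 is covered as well. A buffer that starts on a line boundary and spans a whole number of lines does not spill into its neighbour.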
On Tue, Jul 05, 2022 at 07:58:34PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 07:56:11PM +0200, LABBE Corentin wrote:
> > My problem is that a dma_sync on the data buffer corrupts the poison buffer as collateral damage.
> > Probably because the sync operates on a larger region than the requested dma_sync length.
> > So I try to flush the poison data in the crypto API.
>
> Data structures that are DMAed to must be aligned to
> the value returned by dma_get_cache_alignment(), as non-coherent DMA
> by definition can disturb the data inside that boundary. That is not
> a bug but fundamentally part of how DMA works when the device attachment
> is not cache coherent.

I am sorry, but I don't see how this can help with my problem.

On 05/07/2022 17:42, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 10:21:13AM +0200, LABBE Corentin wrote:
>>
>> I just copied what drivers/crypto/xilinx/zynqmp-sha.c does.
>> I tried flush_dcache_range(), but it does not seem to be implemented on RISC-V.
>
> That driver is broken and should not have been merged in that form.
>
>> And flush_dcache_page(virt_to_page(addr), len) produces a kernel panic.
>
> And that is a good thing. Drivers have no business doing their own cache
> flushing. That is the job of the dma-mapping implementation, so I'd
> suggest looking for problems there.

I'm not sure that the dma-mapping code for non-coherent riscv systems
did get sorted. I couldn't find any when looking in 5.17.

I expect the flush of the icache is implicitly doing a dcache flush
too, as from what I've seen it is only being used when code or TLBs
are being modified.

On Wed, Jul 06, 2022 at 10:47:24AM +0100, Ben Dooks wrote:
> I'm not sure that the dma-mapping code for non-coherent riscv systems
> did get sorted. I couldn't find any when looking in 5.17.

Yes, none of that is upstream. But as supporting it is essential for
the Allwinner SoCs I'm pretty sure Corentin is not actually using an
upstream kernel anyway.

On Wed, Jul 06, 2022 at 01:58:07PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 06, 2022 at 10:47:24AM +0100, Ben Dooks wrote:
> > I'm not sure that the dma-mapping code for non-coherent riscv systems
> > did get sorted. I couldn't find any when looking in 5.17.
>
> Yes, none of that is upstream. But as supporting it is essential for
> the Allwinner SoCs I'm pretty sure Corentin is not actually using an
> upstream kernel anyway.

I use an upstream kernel plus some "not yet merged but sent for review" patch series,
like "riscv: implement Zicbom-based CMO instructions + the t-head variant".

And good news: I just updated to v6 of this series (posted today) and my problem disappeared.

Regards

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index c59bd9e07978..187163e2e593 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -19,6 +19,7 @@
 #include <crypto/aead.h>
 #include <crypto/hash.h>
 #include <crypto/skcipher.h>
+#include <linux/cacheflush.h>
 #include <linux/err.h>
 #include <linux/fips.h>
 #include <linux/module.h>
@@ -205,6 +206,8 @@ static void testmgr_free_buf(char *buf[XBUFSIZE])
 static inline void testmgr_poison(void *addr, size_t len)
 {
 	memset(addr, TESTMGR_POISON_BYTE, len);
+	/* Be sure data is written to prevent corruption from some DMA sync */
+	flush_icache_range((unsigned long)addr, (unsigned long)addr + len);
 }
 
 /* Is the memory region still fully poisoned? */

On my Allwinner D1 nezha, the sun8i-ce fails self-tests due to:
alg: skcipher: cbc-des3-sun8i-ce encryption overran dst buffer on test vector 0

In fact the buffer is not overrun by the device but by the dma_map_single() operation.

To prevent any corruption of the poisoned data, simply flush them before
giving the buffer to the tested driver.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---

Hello

I put this patch as RFC, since this behaviour happens only on not yet merged RISCV code.
(Mostly riscv: implement Zicbom-based CMO instructions + the t-head variant)

Regards

 crypto/testmgr.c | 3 +++
 1 file changed, 3 insertions(+)