
[v3,11/13] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

Message ID 20221106220143.2129263-12-catalin.marinas@arm.com (mailing list archive)
State New, archived
Series mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8

Commit Message

Catalin Marinas Nov. 6, 2022, 10:01 p.m. UTC
ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
alignment. This will ensure that the static alignment of various
structures or members of those structures (e.g. __ctx[] in struct
aead_request) is safe for DMA. Note that the size of such structures
becomes a multiple of ARCH_DMA_MINALIGN, and kmalloc() will honour that
alignment, so there is no confusion for the compiler.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@kernel.org>
---

I know Herbert NAK'ed this patch but I'm still keeping it here
temporarily, until we agree on some refactoring of the crypto code. FTR,
I don't think there's anything wrong with this patch since kmalloc()
will return ARCH_DMA_MINALIGN-aligned objects when the size of such
objects is a multiple of ARCH_DMA_MINALIGN (a side-effect of
CRYPTO_MINALIGN_ATTR).

 include/linux/crypto.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
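
For illustration, here is a minimal sketch of the layout the commit
message relies on (a simplified stand-in for struct aead_request, with
CRYPTO_MINALIGN assumed to be 128 as on arm64; the struct and member
names below are illustrative, not the real definition):

#define CRYPTO_MINALIGN_ATTR __attribute__((__aligned__(128)))

struct aead_request_like {
	unsigned int cryptlen;			/* other members elided */
	void *__ctx[] CRYPTO_MINALIGN_ATTR;	/* context data follows */
};

/* The aligned flexible-array member pulls both the alignment and the
 * size of the containing struct up to 128; per the commit notes, a
 * kmalloc() whose size is a multiple of ARCH_DMA_MINALIGN returns an
 * object aligned to it, even with ARCH_KMALLOC_MINALIGN reduced to 8. */
_Static_assert(__alignof__(struct aead_request_like) == 128, "alignment");
_Static_assert(sizeof(struct aead_request_like) % 128 == 0, "size");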

Comments

Herbert Xu Nov. 7, 2022, 2:22 a.m. UTC | #1
On Sun, Nov 06, 2022 at 10:01:41PM +0000, Catalin Marinas wrote:
> ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
> operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
> alignment. This will ensure that the static alignment of various
> structures or members of those structures (e.g. __ctx[] in struct
> aead_request) is safe for DMA. Note that the size of such structures
> becomes a multiple of ARCH_DMA_MINALIGN, and kmalloc() will honour that
> alignment, so there is no confusion for the compiler.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> ---
> 
> I know Herbert NAK'ed this patch but I'm still keeping it here
> temporarily, until we agree on some refactoring of the crypto code. FTR,
> I don't think there's anything wrong with this patch since kmalloc()
> will return ARCH_DMA_MINALIGN-aligned objects when the size of such
> objects is a multiple of ARCH_DMA_MINALIGN (a side-effect of
> CRYPTO_MINALIGN_ATTR).

As I said before, changing CRYPTO_MINALIGN doesn't do anything, and
that's why this patch is broken.

To get what you want the drivers have to be modified.

Cheers,
Catalin Marinas Nov. 7, 2022, 9:05 a.m. UTC | #2
On Mon, Nov 07, 2022 at 10:22:18AM +0800, Herbert Xu wrote:
> On Sun, Nov 06, 2022 at 10:01:41PM +0000, Catalin Marinas wrote:
> > ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA
> > operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc()
> > alignment. This will ensure that the static alignment of various
> > structures or members of those structures (e.g. __ctx[] in struct
> > aead_request) is safe for DMA. Note that the size of such structures
> > becomes a multiple of ARCH_DMA_MINALIGN, and kmalloc() will honour that
> > alignment, so there is no confusion for the compiler.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > 
> > I know Herbert NAK'ed this patch but I'm still keeping it here
> > temporarily, until we agree on some refactoring of the crypto code. FTR,
> > I don't think there's anything wrong with this patch since kmalloc()
> > will return ARCH_DMA_MINALIGN-aligned objects when the size of such
> > objects is a multiple of ARCH_DMA_MINALIGN (a side-effect of
> > CRYPTO_MINALIGN_ATTR).
> 
> As I said before, changing CRYPTO_MINALIGN doesn't do anything, and
> that's why this patch is broken.

Well, it does ensure that the __alignof__ and sizeof of structures like
crypto_alg and aead_request are still 128 after this change. A kmalloc()
of a size that is a multiple of 128 returns a 128-byte aligned object.
So the aim is just to keep the current binary layout/alignment at 128
on arm64. In theory, no functional change.

Of course, there are better ways to do it but I think the crypto code
should move away from ARCH_KMALLOC_MINALIGN and use something like
dma_get_cache_alignment() instead. The cra_alignmask should be specific
to the device and typically a small value (or 0 if the device requires
no alignment). The DMA alignment is specific to the SoC and CPU, so it
should be handled elsewhere.

As I don't fully understand the crypto code, I had a naive attempt at
forcing a higher alignmask but it ended up in a kernel panic:

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 2324ab6f1846..6dc84c504b52 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -13,6 +13,7 @@
 #define _LINUX_CRYPTO_H
 
 #include <linux/atomic.h>
+#include <linux/dma-mapping.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/bug.h>
@@ -696,7 +697,7 @@ static inline unsigned int crypto_tfm_alg_blocksize(struct crypto_tfm *tfm)
 
 static inline unsigned int crypto_tfm_alg_alignmask(struct crypto_tfm *tfm)
 {
-	return tfm->__crt_alg->cra_alignmask;
+	return tfm->__crt_alg->cra_alignmask | (dma_get_cache_alignment() - 1);
 }
 
 static inline u32 crypto_tfm_get_flags(struct crypto_tfm *tfm)
Herbert Xu Nov. 7, 2022, 9:12 a.m. UTC | #3
On Mon, Nov 07, 2022 at 09:05:04AM +0000, Catalin Marinas wrote:
>
> Well, it does ensure that the __alignof__ and sizeof of structures like
> crypto_alg and aead_request are still 128 after this change. A kmalloc()
> of a size that is a multiple of 128 returns a 128-byte aligned object.
> So the aim is just to keep the current binary layout/alignment at 128
> on arm64. In theory, no functional change.

Changing CRYPTO_MINALIGN to 128 does not cause structures that are
smaller than 128 bytes to magically become larger than 128 bytes.
All it does is declare to the compiler that it may assume that
these pointers are 128-byte aligned (which is obviously untrue
if kmalloc does not guarantee that).

So I don't see how this changes anything in practice.  Buffers
that required bouncing prior to your change will still require
bouncing.

If you're set on doing it this way then I can proceed with the
original patch-set to change the drivers.  I've just been putting
it off because it seems that you guys weren't quite decided on
which way to go.

Cheers,
Catalin Marinas Nov. 7, 2022, 9:38 a.m. UTC | #4
On Mon, Nov 07, 2022 at 05:12:53PM +0800, Herbert Xu wrote:
> On Mon, Nov 07, 2022 at 09:05:04AM +0000, Catalin Marinas wrote:
> > Well, it does ensure that the __alignof__ and sizeof of structures like
> > crypto_alg and aead_request are still 128 after this change. A kmalloc()
> > of a size that is a multiple of 128 returns a 128-byte aligned object.
> > So the aim is just to keep the current binary layout/alignment at 128
> > on arm64. In theory, no functional change.
> 
> Changing CRYPTO_MINALIGN to 128 does not cause structures that are
> smaller than 128 bytes to magically become larger than 128 bytes.

For structures, it does (not arrays though):

#define __aligned(x)	__attribute__((__aligned__(x)))

struct align_test1 {
	char c;
	char __aligned(128) data[];
};

struct align_test2 {
	char c;
} __aligned(128);

char aligned_array[4] __aligned(128);

With the above, we have:

sizeof(align_test1) == 128; __alignof__(align_test1) == 128;
sizeof(align_test2) == 128; __alignof__(align_test2) == 128;
sizeof(aligned_array) == 4;   __alignof__(aligned_array) == 128;

> If you're set on doing it this way then I can proceed with the
> original patch-set to change the drivers.  I've just been putting
> it off because it seems that you guys weren't quite decided on
> which way to go.

Yes, reviving your patchset would help and that can be done
independently of this series as long as the crypto code starts using
dma_get_cache_alignment() and drops CRYPTO_MINALIGN_ATTR entirely. If at
the point of creating the mask the code knows whether the device is
coherent, it can even avoid any additional alignment (though still
honouring the cra_alignmask that a device requires). So such reworking
would be beneficial irrespective of this series.
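
As a purely illustrative sketch of that direction (not part of this
series; the helper below is made up, although dma_get_cache_alignment()
and dev_is_dma_coherent() are existing kernel interfaces), the context
alignment could be computed at runtime instead of being baked into
CRYPTO_MINALIGN:

#include <linux/device.h>
#include <linux/dma-map-ops.h>
#include <linux/dma-mapping.h>
#include <linux/minmax.h>

/* Hypothetical helper: a coherent device only needs whatever the driver
 * asked for via cra_alignmask; a non-coherent one additionally needs
 * cache-line alignment so that cache maintenance for DMA does not
 * corrupt neighbouring data. */
static inline unsigned int crypto_req_ctx_align(struct device *dev,
						unsigned int cra_alignmask)
{
	if (dev_is_dma_coherent(dev))
		return cra_alignmask + 1;
	return max_t(unsigned int, cra_alignmask + 1,
		     dma_get_cache_alignment());
}

(dev_is_dma_coherent() lives in a DMA-internals header, so real code
would probably have to obtain the coherency information differently.)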

It seems that swiotlb bouncing is the preferred and least intrusive
route, but let's see the feedback on the other parts of the series.

Thanks.

Patch

diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 2324ab6f1846..654b9c355575 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -167,7 +167,7 @@ 
  * maintenance for non-coherent DMA (cache invalidation in particular) does not
  * affect data that may be accessed by the CPU concurrently.
  */
-#define CRYPTO_MINALIGN ARCH_KMALLOC_MINALIGN
+#define CRYPTO_MINALIGN ARCH_DMA_MINALIGN
 
 #define CRYPTO_MINALIGN_ATTR __attribute__ ((__aligned__(CRYPTO_MINALIGN)))