diff mbox series

[01/92] iio: core: Fix IIO_ALIGN as it was not sufficiently large on some platforms.

Message ID 20220503085935.1533814-2-jic23@kernel.org (mailing list archive)
State Superseded
Headers show
Series IIO: Fix alignment of buffers for DMA | expand

Commit Message

Jonathan Cameron May 3, 2022, 8:58 a.m. UTC
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Discussion of the series:
https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/
mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that
our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm
platforms out their with non coherent DMA and larger cache lines at
at higher levels of their cache hierarchy.

Note this patch will greatly reduce the padding on some architectures
that have smaller requirements for DMA safe buffers.

The history of changing values of ARCH_KMALLOC_MINALIGN via
ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this
as fixing a particular patch from that route as it's not clear what to tag.

Most recently a change to bring them back inline was reverted because
of some Qualcomm Kryo cores with an L2 cache with 128-byte lines
sitting above the point of coherency.

c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)"
That reverts:
65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which
refers to the change originally being motivated by Thunder x1 performance
rather than correctness.

Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 include/linux/iio/iio.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Uwe Kleine-König May 3, 2022, 2:27 p.m. UTC | #1
On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Discussion of the series:
> https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/
> mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that
> our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm
> platforms out their with non coherent DMA and larger cache lines at
> at higher levels of their cache hierarchy.
> 
> Note this patch will greatly reduce the padding on some architectures
> that have smaller requirements for DMA safe buffers.
> 
> The history of changing values of ARCH_KMALLOC_MINALIGN via
> ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this
> as fixing a particular patch from that route as it's not clear what to tag.
> 
> Most recently a change to bring them back inline was reverted because
> of some Qualcomm Kryo cores with an L2 cache with 128-byte lines
> sitting above the point of coherency.
> 
> c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)"
> That reverts:
> 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which
> refers to the change originally being motivated by Thunder x1 performance
> rather than correctness.
> 
> Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device")
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  include/linux/iio/iio.h | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
> index faf00f2c0be6..30937f8f9424 100644
> --- a/include/linux/iio/iio.h
> +++ b/include/linux/iio/iio.h
> @@ -9,6 +9,7 @@
>  
>  #include <linux/device.h>
>  #include <linux/cdev.h>
> +#include <linux/slab.h>
>  #include <linux/iio/types.h>
>  #include <linux/of.h>
>  /* IIO TODO LIST */
> @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
>  	return dev_get_drvdata(&indio_dev->dev);
>  }
>  
> -/* Can we make this smaller? */
> -#define IIO_ALIGN L1_CACHE_BYTES
> +/*
> + * Used to ensure the iio_priv() structure is aligned to allow that structure
> + * to in turn include IIO_ALIGN'd elements such as buffers which must not share
> + * cachelines with the rest of the structure, thus making them safe for use with
> + * non-coherent DMA.
> + */
> +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN

Given the purpose of IIO_ALIGN is to define the alignment for DMA'able
memory, I wonder why it's called "IIO_ALIGN" and not for example
"DMA_MINALIGN".

There is nothing iio specific about this value, is there? Then
consequently it doesn't need to be defined in an iio header, but
somewhere generic. Or even one step further: Why isn't there a macro
__align_for_dma that can be used directly to annotate the relevant
member in a struct?

Best regards
Uwe
Jonathan Cameron May 3, 2022, 4:24 p.m. UTC | #2
On Tue, 3 May 2022 16:27:25 +0200
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:

> On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote:
> > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > Discussion of the series:
> > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/
> > mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that
> > our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm
> > platforms out their with non coherent DMA and larger cache lines at
> > at higher levels of their cache hierarchy.
> > 
> > Note this patch will greatly reduce the padding on some architectures
> > that have smaller requirements for DMA safe buffers.
> > 
> > The history of changing values of ARCH_KMALLOC_MINALIGN via
> > ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this
> > as fixing a particular patch from that route as it's not clear what to tag.
> > 
> > Most recently a change to bring them back inline was reverted because
> > of some Qualcomm Kryo cores with an L2 cache with 128-byte lines
> > sitting above the point of coherency.
> > 
> > c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)"
> > That reverts:
> > 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which
> > refers to the change originally being motivated by Thunder x1 performance
> > rather than correctness.
> > 
> > Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device")
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  include/linux/iio/iio.h | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
> > index faf00f2c0be6..30937f8f9424 100644
> > --- a/include/linux/iio/iio.h
> > +++ b/include/linux/iio/iio.h
> > @@ -9,6 +9,7 @@
> >  
> >  #include <linux/device.h>
> >  #include <linux/cdev.h>
> > +#include <linux/slab.h>
> >  #include <linux/iio/types.h>
> >  #include <linux/of.h>
> >  /* IIO TODO LIST */
> > @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
> >  	return dev_get_drvdata(&indio_dev->dev);
> >  }
> >  
> > -/* Can we make this smaller? */
> > -#define IIO_ALIGN L1_CACHE_BYTES
> > +/*
> > + * Used to ensure the iio_priv() structure is aligned to allow that structure
> > + * to in turn include IIO_ALIGN'd elements such as buffers which must not share
> > + * cachelines with the rest of the structure, thus making them safe for use with
> > + * non-coherent DMA.
> > + */
> > +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN  
> 
> Given the purpose of IIO_ALIGN is to define the alignment for DMA'able
> memory, I wonder why it's called "IIO_ALIGN" and not for example
> "DMA_MINALIGN".

Much like crypto I want a single place that provides the IIO requirements
for this.  Could rename it IIO_DMA_MINALIGN I guess.  The reason behind
that is to allow for a switch on mass if a new approach is accepted
along the lines of what Catalin proposed.  The discussions around
CRYPTO made it clear that there are sometimes additional requirements
from a subsystem beyond simply that of DMA (IIO has a similar issue
to crypto that mean it's not simple to shift the alignment requirements
at runtime because the compiler is getting told things are aligned to
a higher degree than the allocations).

https://lore.kernel.org/all/20220405135758.774016-8-catalin.marinas@arm.com/
and below that point.

> 
> There is nothing iio specific about this value, is there? Then
> consequently it doesn't need to be defined in an iio header, but
> somewhere generic. Or even one step further: Why isn't there a macro
> __align_for_dma that can be used directly to annotate the relevant
> member in a struct?

There is, but it's not currently available on all architectures.

ARCH_DMA_MINALIGN

Catalin's series proposed making it generally available:
https://lore.kernel.org/all/20220405135758.774016-2-catalin.marinas@arm.com/

but suggestion for now was to go with ARCH_KMALLOC_MINALIGN

https://lore.kernel.org/linux-iio/Yl6jB5DOUy+Yqyzl@arm.com/

Thanks,

Jonathan


> 
> Best regards
> Uwe
>
Jonathan Cameron May 8, 2022, 4:17 p.m. UTC | #3
On Tue, 3 May 2022 17:24:21 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Tue, 3 May 2022 16:27:25 +0200
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> 
> > On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote:  
> > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > 
> > > Discussion of the series:
> > > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/
> > > mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that
> > > our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm
> > > platforms out their with non coherent DMA and larger cache lines at
> > > at higher levels of their cache hierarchy.
> > > 
> > > Note this patch will greatly reduce the padding on some architectures
> > > that have smaller requirements for DMA safe buffers.
> > > 
> > > The history of changing values of ARCH_KMALLOC_MINALIGN via
> > > ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this
> > > as fixing a particular patch from that route as it's not clear what to tag.
> > > 
> > > Most recently a change to bring them back inline was reverted because
> > > of some Qualcomm Kryo cores with an L2 cache with 128-byte lines
> > > sitting above the point of coherency.
> > > 
> > > c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)"
> > > That reverts:
> > > 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which
> > > refers to the change originally being motivated by Thunder x1 performance
> > > rather than correctness.
> > > 
> > > Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device")
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > >  include/linux/iio/iio.h | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
> > > index faf00f2c0be6..30937f8f9424 100644
> > > --- a/include/linux/iio/iio.h
> > > +++ b/include/linux/iio/iio.h
> > > @@ -9,6 +9,7 @@
> > >  
> > >  #include <linux/device.h>
> > >  #include <linux/cdev.h>
> > > +#include <linux/slab.h>
> > >  #include <linux/iio/types.h>
> > >  #include <linux/of.h>
> > >  /* IIO TODO LIST */
> > > @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
> > >  	return dev_get_drvdata(&indio_dev->dev);
> > >  }
> > >  
> > > -/* Can we make this smaller? */
> > > -#define IIO_ALIGN L1_CACHE_BYTES
> > > +/*
> > > + * Used to ensure the iio_priv() structure is aligned to allow that structure
> > > + * to in turn include IIO_ALIGN'd elements such as buffers which must not share
> > > + * cachelines with the rest of the structure, thus making them safe for use with
> > > + * non-coherent DMA.
> > > + */
> > > +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN    
> > 
> > Given the purpose of IIO_ALIGN is to define the alignment for DMA'able
> > memory, I wonder why it's called "IIO_ALIGN" and not for example
> > "DMA_MINALIGN".  
> 
> Much like crypto I want a single place that provides the IIO requirements
> for this.  Could rename it IIO_DMA_MINALIGN I guess. 

I've switched to this naming for v2.

> The reason behind
> that is to allow for a switch on mass if a new approach is accepted
> along the lines of what Catalin proposed.  The discussions around
> CRYPTO made it clear that there are sometimes additional requirements
> from a subsystem beyond simply that of DMA (IIO has a similar issue
> to crypto that mean it's not simple to shift the alignment requirements
> at runtime because the compiler is getting told things are aligned to
> a higher degree than the allocations).
> 
> https://lore.kernel.org/all/20220405135758.774016-8-catalin.marinas@arm.com/
> and below that point.
> 
> > 
> > There is nothing iio specific about this value, is there? Then
> > consequently it doesn't need to be defined in an iio header, but
> > somewhere generic. Or even one step further: Why isn't there a macro
> > __align_for_dma that can be used directly to annotate the relevant
> > member in a struct?  
> 
> There is, but it's not currently available on all architectures.
> 
> ARCH_DMA_MINALIGN
> 
> Catalin's series proposed making it generally available:
> https://lore.kernel.org/all/20220405135758.774016-2-catalin.marinas@arm.com/
> 
> but suggestion for now was to go with ARCH_KMALLOC_MINALIGN
> 
> https://lore.kernel.org/linux-iio/Yl6jB5DOUy+Yqyzl@arm.com/
> 
> Thanks,
> 
> Jonathan
> 
> 
> > 
> > Best regards
> > Uwe
> >   
>
diff mbox series

Patch

diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
index faf00f2c0be6..30937f8f9424 100644
--- a/include/linux/iio/iio.h
+++ b/include/linux/iio/iio.h
@@ -9,6 +9,7 @@ 
 
 #include <linux/device.h>
 #include <linux/cdev.h>
+#include <linux/slab.h>
 #include <linux/iio/types.h>
 #include <linux/of.h>
 /* IIO TODO LIST */
@@ -657,8 +658,13 @@  static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
 	return dev_get_drvdata(&indio_dev->dev);
 }
 
-/* Can we make this smaller? */
-#define IIO_ALIGN L1_CACHE_BYTES
+/*
+ * Used to ensure the iio_priv() structure is aligned to allow that structure
+ * to in turn include IIO_ALIGN'd elements such as buffers which must not share
+ * cachelines with the rest of the structure, thus making them safe for use with
+ * non-coherent DMA.
+ */
+#define IIO_ALIGN ARCH_KMALLOC_MINALIGN
 struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv);
 
 /* The information at the returned address is guaranteed to be cacheline aligned */