Message ID | 20220503085935.1533814-2-jic23@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | IIO: Fix alignment of buffers for DMA | expand |
On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote: > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Discussion of the series: > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/ > mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that > our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm > platforms out their with non coherent DMA and larger cache lines at > at higher levels of their cache hierarchy. > > Note this patch will greatly reduce the padding on some architectures > that have smaller requirements for DMA safe buffers. > > The history of changing values of ARCH_KMALLOC_MINALIGN via > ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this > as fixing a particular patch from that route as it's not clear what to tag. > > Most recently a change to bring them back inline was reverted because > of some Qualcomm Kryo cores with an L2 cache with 128-byte lines > sitting above the point of coherency. > > c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)" > That reverts: > 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which > refers to the change originally being motivated by Thunder x1 performance > rather than correctness. > > Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device") > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > include/linux/iio/iio.h | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h > index faf00f2c0be6..30937f8f9424 100644 > --- a/include/linux/iio/iio.h > +++ b/include/linux/iio/iio.h > @@ -9,6 +9,7 @@ > > #include <linux/device.h> > #include <linux/cdev.h> > +#include <linux/slab.h> > #include <linux/iio/types.h> > #include <linux/of.h> > /* IIO TODO LIST */ > @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev) > return dev_get_drvdata(&indio_dev->dev); > } > > -/* Can we make this smaller? */ > -#define IIO_ALIGN L1_CACHE_BYTES > +/* > + * Used to ensure the iio_priv() structure is aligned to allow that structure > + * to in turn include IIO_ALIGN'd elements such as buffers which must not share > + * cachelines with the rest of the structure, thus making them safe for use with > + * non-coherent DMA. > + */ > +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN Given the purpose of IIO_ALIGN is to define the alignment for DMA'able memory, I wonder why it's called "IIO_ALIGN" and not for example "DMA_MINALIGN". There is nothing iio specific about this value, is there? Then consequently it doesn't need to be defined in an iio header, but somewhere generic. Or even one step further: Why isn't there a macro __align_for_dma that can be used directly to annotate the relevant member in a struct? Best regards Uwe
On Tue, 3 May 2022 16:27:25 +0200 Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote: > On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote: > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > Discussion of the series: > > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/ > > mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that > > our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm > > platforms out their with non coherent DMA and larger cache lines at > > at higher levels of their cache hierarchy. > > > > Note this patch will greatly reduce the padding on some architectures > > that have smaller requirements for DMA safe buffers. > > > > The history of changing values of ARCH_KMALLOC_MINALIGN via > > ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this > > as fixing a particular patch from that route as it's not clear what to tag. > > > > Most recently a change to bring them back inline was reverted because > > of some Qualcomm Kryo cores with an L2 cache with 128-byte lines > > sitting above the point of coherency. > > > > c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)" > > That reverts: > > 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which > > refers to the change originally being motivated by Thunder x1 performance > > rather than correctness. > > > > Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device") > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > --- > > include/linux/iio/iio.h | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h > > index faf00f2c0be6..30937f8f9424 100644 > > --- a/include/linux/iio/iio.h > > +++ b/include/linux/iio/iio.h > > @@ -9,6 +9,7 @@ > > > > #include <linux/device.h> > > #include <linux/cdev.h> > > +#include <linux/slab.h> > > #include <linux/iio/types.h> > > #include <linux/of.h> > > /* IIO TODO LIST */ > > @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev) > > return dev_get_drvdata(&indio_dev->dev); > > } > > > > -/* Can we make this smaller? */ > > -#define IIO_ALIGN L1_CACHE_BYTES > > +/* > > + * Used to ensure the iio_priv() structure is aligned to allow that structure > > + * to in turn include IIO_ALIGN'd elements such as buffers which must not share > > + * cachelines with the rest of the structure, thus making them safe for use with > > + * non-coherent DMA. > > + */ > > +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN > > Given the purpose of IIO_ALIGN is to define the alignment for DMA'able > memory, I wonder why it's called "IIO_ALIGN" and not for example > "DMA_MINALIGN". Much like crypto I want a single place that provides the IIO requirements for this. Could rename it IIO_DMA_MINALIGN I guess. The reason behind that is to allow for a switch on mass if a new approach is accepted along the lines of what Catalin proposed. The discussions around CRYPTO made it clear that there are sometimes additional requirements from a subsystem beyond simply that of DMA (IIO has a similar issue to crypto that mean it's not simple to shift the alignment requirements at runtime because the compiler is getting told things are aligned to a higher degree than the allocations). https://lore.kernel.org/all/20220405135758.774016-8-catalin.marinas@arm.com/ and below that point. > > There is nothing iio specific about this value, is there? Then > consequently it doesn't need to be defined in an iio header, but > somewhere generic. Or even one step further: Why isn't there a macro > __align_for_dma that can be used directly to annotate the relevant > member in a struct? There is, but it's not currently available on all architectures. ARCH_DMA_MINALIGN Catalin's series proposed making it generally available: https://lore.kernel.org/all/20220405135758.774016-2-catalin.marinas@arm.com/ but suggestion for now was to go with ARCH_KMALLOC_MINALIGN https://lore.kernel.org/linux-iio/Yl6jB5DOUy+Yqyzl@arm.com/ Thanks, Jonathan > > Best regards > Uwe >
On Tue, 3 May 2022 17:24:21 +0100 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote: > On Tue, 3 May 2022 16:27:25 +0200 > Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote: > > > On Tue, May 03, 2022 at 09:58:04AM +0100, Jonathan Cameron wrote: > > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > > > Discussion of the series: > > > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@arm.com/ > > > mm, arm64: Reduce ARCH_KMALLOC_MINALIGN brought to my attention that > > > our current IIO usage of L1CACHE_ALIGN is insufficient as their are Arm > > > platforms out their with non coherent DMA and larger cache lines at > > > at higher levels of their cache hierarchy. > > > > > > Note this patch will greatly reduce the padding on some architectures > > > that have smaller requirements for DMA safe buffers. > > > > > > The history of changing values of ARCH_KMALLOC_MINALIGN via > > > ARCH_DMA_MINALIGN on arm64 is rather complex. I'm not tagging this > > > as fixing a particular patch from that route as it's not clear what to tag. > > > > > > Most recently a change to bring them back inline was reverted because > > > of some Qualcomm Kryo cores with an L2 cache with 128-byte lines > > > sitting above the point of coherency. > > > > > > c1132702c71f Revert "arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)" > > > That reverts: > > > 65688d2a05de arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) which > > > refers to the change originally being motivated by Thunder x1 performance > > > rather than correctness. > > > > > > Fixes: 6f7c8ee585e9d ("staging:iio: Add ability to allocate private data space to iio_allocate_device") > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > --- > > > include/linux/iio/iio.h | 10 ++++++++-- > > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > > > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h > > > index faf00f2c0be6..30937f8f9424 100644 > > > --- a/include/linux/iio/iio.h > > > +++ b/include/linux/iio/iio.h > > > @@ -9,6 +9,7 @@ > > > > > > #include <linux/device.h> > > > #include <linux/cdev.h> > > > +#include <linux/slab.h> > > > #include <linux/iio/types.h> > > > #include <linux/of.h> > > > /* IIO TODO LIST */ > > > @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev) > > > return dev_get_drvdata(&indio_dev->dev); > > > } > > > > > > -/* Can we make this smaller? */ > > > -#define IIO_ALIGN L1_CACHE_BYTES > > > +/* > > > + * Used to ensure the iio_priv() structure is aligned to allow that structure > > > + * to in turn include IIO_ALIGN'd elements such as buffers which must not share > > > + * cachelines with the rest of the structure, thus making them safe for use with > > > + * non-coherent DMA. > > > + */ > > > +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN > > > > Given the purpose of IIO_ALIGN is to define the alignment for DMA'able > > memory, I wonder why it's called "IIO_ALIGN" and not for example > > "DMA_MINALIGN". > > Much like crypto I want a single place that provides the IIO requirements > for this. Could rename it IIO_DMA_MINALIGN I guess. I've switched to this naming for v2. > The reason behind > that is to allow for a switch on mass if a new approach is accepted > along the lines of what Catalin proposed. The discussions around > CRYPTO made it clear that there are sometimes additional requirements > from a subsystem beyond simply that of DMA (IIO has a similar issue > to crypto that mean it's not simple to shift the alignment requirements > at runtime because the compiler is getting told things are aligned to > a higher degree than the allocations). > > https://lore.kernel.org/all/20220405135758.774016-8-catalin.marinas@arm.com/ > and below that point. > > > > > There is nothing iio specific about this value, is there? Then > > consequently it doesn't need to be defined in an iio header, but > > somewhere generic. Or even one step further: Why isn't there a macro > > __align_for_dma that can be used directly to annotate the relevant > > member in a struct? > > There is, but it's not currently available on all architectures. > > ARCH_DMA_MINALIGN > > Catalin's series proposed making it generally available: > https://lore.kernel.org/all/20220405135758.774016-2-catalin.marinas@arm.com/ > > but suggestion for now was to go with ARCH_KMALLOC_MINALIGN > > https://lore.kernel.org/linux-iio/Yl6jB5DOUy+Yqyzl@arm.com/ > > Thanks, > > Jonathan > > > > > > Best regards > > Uwe > > >
diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h index faf00f2c0be6..30937f8f9424 100644 --- a/include/linux/iio/iio.h +++ b/include/linux/iio/iio.h @@ -9,6 +9,7 @@ #include <linux/device.h> #include <linux/cdev.h> +#include <linux/slab.h> #include <linux/iio/types.h> #include <linux/of.h> /* IIO TODO LIST */ @@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev) return dev_get_drvdata(&indio_dev->dev); } -/* Can we make this smaller? */ -#define IIO_ALIGN L1_CACHE_BYTES +/* + * Used to ensure the iio_priv() structure is aligned to allow that structure + * to in turn include IIO_ALIGN'd elements such as buffers which must not share + * cachelines with the rest of the structure, thus making them safe for use with + * non-coherent DMA. + */ +#define IIO_ALIGN ARCH_KMALLOC_MINALIGN struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv); /* The information at the returned address is guaranteed to be cacheline aligned */