diff mbox series

[v7,1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management

Message ID 20230330204217.47666-2-prabhakar.mahadev-lad.rj@bp.renesas.com (mailing list archive)
State Superseded
Headers show
Series RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP | expand

Checks

Context Check Description
conchuod/tree_selection fail Failed to apply to next/pending-fixes or riscv/for-next

Commit Message

Lad, Prabhakar March 30, 2023, 8:42 p.m. UTC
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Currently, selecting which CMOs to use on a given platform is done using
and ALTERNATIVE_X() macro. This was manageable when there were just two
CMO implementations, but now that there are more and more platforms coming
needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.

To avoid such issues this patch switches to use of function pointers
instead of ALTERNATIVE_X() macro for cache management (the only drawback
being performance over the previous approach).

void (*clean_range)(unsigned long addr, unsigned long size);
void (*inv_range)(unsigned long addr, unsigned long size);
void (*flush_range)(unsigned long addr, unsigned long size);

The above function pointers are provided to be overridden for platforms
needing CMO.

Convert ZICBOM and T-HEAD CMO to use function pointers.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v6->v7
* Updated commit description
* Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
* Used static const struct ptr to register CMO ops
* Dropped riscv_dma_noncoherent_cmo_ops
* Moved ZICBOM CMO setup to setup.c

v5->v6
* New patch
---
 arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
 arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
 arch/riscv/include/asm/errata_list.h     | 53 -----------------
 arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
 arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
 arch/riscv/mm/pmem.c                     |  6 +-
 6 files changed, 221 insertions(+), 61 deletions(-)
 create mode 100644 arch/riscv/include/asm/dma-noncoherent.h

Comments

Arnd Bergmann March 30, 2023, 9:34 p.m. UTC | #1
On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

This is looking pretty good. There are a few things that I
still see sticking out, and I think I've mentioned some of 
them before, but don't remember if there was a reason for
doing it like this:

> +#ifdef CONFIG_ERRATA_THEAD_CMO

I would rename this to not call this an 'ERRATA' but
just make it a driver. Not important though, and there
was probably a reason you did it like this.

> +extern struct riscv_cache_ops noncoherent_cache_ops;
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops 
> *ops);
> +
> +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t 
> size)
> +{
> +	if (noncoherent_cache_ops.clean_range) {
> +		unsigned long addr = (unsigned long)vaddr;
> +
> +		noncoherent_cache_ops.clean_range(addr, size);
> +	}
> +}

The type case should not be necessary here. Instead I would 
make the callbacks use 'void *' as well, not 'unsigned long'.

It's possible that some future cache controller driver requires
passing physical addresses, as is common for last level cache,
but as long as all drivers pass virtual addresses, it's easier
to do the phys_to_virt() in common code.

It also seems wrong to have the fallback be to do nothing
when the pointer is NULL, since that cannot actually work
when a device is not cache coherent.

I would either initialize the function pointer to the
zicbom version and remove the NULL check, or keep the
pointer NULL and have an explicit
'else zicbom_clean_range()' fallback.

> +static void zicbom_register_cmo_ops(void)
> +{
> +	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> +}
> +#else
> +static void zicbom_register_cmo_ops(void) {}
> +#endif

As far as I recall, the #else path here was needed previously
to work around a binutils dependency, but with the current
code, it should be possible to just always enable
CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.

     Arnd
Geert Uytterhoeven March 31, 2023, 7:31 a.m. UTC | #2
Hi Prabhakar,

Thanks for your patch!

On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two

the ALTERNATIVE_X()

> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers

"the use" or "using"

> instead of ALTERNATIVE_X() macro for cache management (the only drawback

the ALTERNATIVE_X()

> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c

> +#ifdef CONFIG_ERRATA_THEAD_CMO

> +static void thead_register_cmo_ops(void)
> +{
> +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> +}
> +#else
> +static void thead_register_cmo_ops(void) {}
> +#endif

> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c

> @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
>              "Non-coherent DMA support enabled without a block size\n");
>         noncoherent_supported = true;
>  }
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> +{
> +       if (!ops)
> +               return;

This is never true.
I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
unconditionally from common code, instead of the various *register_cmo_ops()?
But that would have required something like

#ifdef CONFIG_ERRATA_THEAD_CMO
#define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
#else
#define THEAD_CMO_OPS_PTR   NULL
#endif

Or can we come up with some macro like pm_ptr(), but that also takes
care of the "&", so we can do "#define thead_cmo_ops NULL"?

> +
> +       noncoherent_cache_ops = *ops;
> +}
> +EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);

Gr{oetje,eeting}s,

                        Geert
Conor Dooley March 31, 2023, 7:54 a.m. UTC | #3
On Thu, Mar 30, 2023 at 11:34:02PM +0200, Arnd Bergmann wrote:
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> > +#ifdef CONFIG_ERRATA_THEAD_CMO
> 
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.

I think what was discussed in a prior iteration was that we'd leave
refactoring the T-HEAD bits into a driver for a subsequent work.
Arnd Bergmann March 31, 2023, 7:58 a.m. UTC | #4
On Fri, Mar 31, 2023, at 09:54, Conor Dooley wrote:
> On Thu, Mar 30, 2023 at 11:34:02PM +0200, Arnd Bergmann wrote:
>> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
>> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
>> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>> 
>> I would rename this to not call this an 'ERRATA' but
>> just make it a driver. Not important though, and there
>> was probably a reason you did it like this.
>
> I think what was discussed in a prior iteration was that we'd leave
> refactoring the T-HEAD bits into a driver for a subsequent work.

Ok, makes sense.

     Arnd
Lad, Prabhakar March 31, 2023, 10:37 a.m. UTC | #5
Hi Arnd,

Thank you for the review.

On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> This is looking pretty good. There are a few things that I
> still see sticking out, and I think I've mentioned some of
> them before, but don't remember if there was a reason for
> doing it like this:
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.
>
As agreed, we will keep this as is.

> > +extern struct riscv_cache_ops noncoherent_cache_ops;
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops
> > *ops);
> > +
> > +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t
> > size)
> > +{
> > +     if (noncoherent_cache_ops.clean_range) {
> > +             unsigned long addr = (unsigned long)vaddr;
> > +
> > +             noncoherent_cache_ops.clean_range(addr, size);
> > +     }
> > +}
>
> The type case should not be necessary here. Instead I would
> make the callbacks use 'void *' as well, not 'unsigned long'.
>
Ok, I'll update this on using void *

> It's possible that some future cache controller driver requires
> passing physical addresses, as is common for last level cache,
> but as long as all drivers pass virtual addresses, it's easier
> to do the phys_to_virt() in common code.
>
Agreed.

> It also seems wrong to have the fallback be to do nothing
> when the pointer is NULL, since that cannot actually work
> when a device is not cache coherent.
>
If the device is non cache coherent and if it doesn't support ZICBOM
ISA extension the device won't work anyway. So non-cache coherent
devices until they have their CMO config enabled won't work anyway. So
I didn't see any benefit in enabling ZICBOM by default. Please let me
know if I am misunderstanding.

> I would either initialize the function pointer to the
> zicbom version and remove the NULL check, or keep the
> pointer NULL and have an explicit
> 'else zicbom_clean_range()' fallback.
>
> > +static void zicbom_register_cmo_ops(void)
> > +{
> > +     riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> > +}
> > +#else
> > +static void zicbom_register_cmo_ops(void) {}
> > +#endif
>
> As far as I recall, the #else path here was needed previously
> to work around a binutils dependency, but with the current
> code, it should be possible to just always enable
> CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.
>
Are you suggesting something like below?

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4dadf35ac721..a55dee98ccf8 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -242,6 +242,7 @@ config RISCV_DMA_NONCOHERENT
        select ARCH_HAS_SYNC_DMA_FOR_CPU
        select ARCH_HAS_SYNC_DMA_FOR_DEVICE
        select DMA_DIRECT_REMAP
+       select RISCV_ISA_ZICBOM

 config AS_HAS_INSN
        def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma)
t0$(comma) t0$(comma) zero)
@@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
        depends on MMU
        depends on RISCV_ALTERNATIVE
        default y
-       select RISCV_DMA_NONCOHERENT
        help
           Adds support to dynamically detect the presence of the ZICBOM
           extension (Cache Block Management Operations) and enable its

But what if the platform doesn't have the ZICBOM ISA extension?

Cheers,
Prabhakar
Arnd Bergmann March 31, 2023, 10:44 a.m. UTC | #6
On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
>> It also seems wrong to have the fallback be to do nothing
>> when the pointer is NULL, since that cannot actually work
>> when a device is not cache coherent.
>>
> If the device is non cache coherent and if it doesn't support ZICBOM
> ISA extension the device won't work anyway. So non-cache coherent
> devices until they have their CMO config enabled won't work anyway. So
> I didn't see any benefit in enabling ZICBOM by default. Please let me
> know if I am misunderstanding.

Two things:

- Having a broken machine crash with in invalid instruction
  exception is better than having it run into silent data
  corruption.

- a correctly predicted branch is typically faster than an
  indirect function call, so the fallback to zicbom makes the
  expected (at least in the future) case the fast one.

> @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
>         depends on MMU
>         depends on RISCV_ALTERNATIVE
>         default y
> -       select RISCV_DMA_NONCOHERENT
>         help
>            Adds support to dynamically detect the presence of the ZICBOM
>            extension (Cache Block Management Operations) and enable its
>
> But what if the platform doesn't have the ZICBOM ISA extension?

Then it needs to register its cache operations before the first
DMA, which is something that it should do anyway. With your
current code, it may work by accident depending on the state of
the cache, but with the version I suggested, it will either work
correctly all the time or crash in an obvious way when misconfigured.

     Arnd
Lad, Prabhakar March 31, 2023, 10:45 a.m. UTC | #7
Hi Geert,

Thank you for the review.

On Fri, Mar 31, 2023 at 8:31 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Prabhakar,
>
> Thanks for your patch!
>
> On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
>
> the ALTERNATIVE_X()
>
OK.

> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
>
> "the use" or "using"
>
OK.

> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
>
> the ALTERNATIVE_X()
>
OK.

> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> > --- a/arch/riscv/errata/thead/errata.c
> > +++ b/arch/riscv/errata/thead/errata.c
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> > +static void thead_register_cmo_ops(void)
> > +{
> > +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> > +}
> > +#else
> > +static void thead_register_cmo_ops(void) {}
> > +#endif
>
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
>
> > @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
> >              "Non-coherent DMA support enabled without a block size\n");
> >         noncoherent_supported = true;
> >  }
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> > +{
> > +       if (!ops)
> > +               return;
>
> This is never true.
I just wanted to add a sanity check.

> I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
> unconditionally from common code, instead of the various *register_cmo_ops()?
> But that would have required something like
>
> #ifdef CONFIG_ERRATA_THEAD_CMO
> #define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
> #else
> #define THEAD_CMO_OPS_PTR   NULL
> #endif
>
riscv_noncoherent_register_cache_ops() will only be called if the
ISA/Errata needs to be applied so I'll drop this check.

Cheers,
Prabhakar
Conor Dooley March 31, 2023, 10:55 a.m. UTC | #8
On Fri, Mar 31, 2023 at 11:37:30AM +0100, Lad, Prabhakar wrote:

> > As far as I recall, the #else path here was needed previously
> > to work around a binutils dependency, but with the current
> > code, it should be possible to just always enable
> > CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.
> >
> Are you suggesting something like below?
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 4dadf35ac721..a55dee98ccf8 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -242,6 +242,7 @@ config RISCV_DMA_NONCOHERENT
>         select ARCH_HAS_SYNC_DMA_FOR_CPU
>         select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>         select DMA_DIRECT_REMAP
> +       select RISCV_ISA_ZICBOM
> 
>  config AS_HAS_INSN
>         def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma)
> t0$(comma) t0$(comma) zero)
> @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
>         depends on MMU
>         depends on RISCV_ALTERNATIVE
>         default y
> -       select RISCV_DMA_NONCOHERENT
>         help
>            Adds support to dynamically detect the presence of the ZICBOM
>            extension (Cache Block Management Operations) and enable its
>

Does that actually work? I don't think it does.
If you try to enable RISCV_ISA_ZICBOM then you won't get
RISC_DMA_NONCOHERENT turned on. Run menuconfig and disable support for
Renesas, SiFive and T-Head SoCs & you can replicate.

I think one of RISCV_ISA_ZICBOM and RISCV_DMA_NONCOHERENT should just be
dropped, although I don't know which one to pick!
Making RISCV_DMA_NONCOHERENT user selectable probably makes the most
sense.
Arnd Bergmann March 31, 2023, 11:36 a.m. UTC | #9
On Fri, Mar 31, 2023, at 12:55, Conor Dooley wrote:
> On Fri, Mar 31, 2023 at 11:37:30AM +0100, Lad, Prabhakar wrote:
>>
>
> Does that actually work? I don't think it does.
> If you try to enable RISCV_ISA_ZICBOM then you won't get
> RISC_DMA_NONCOHERENT turned on. Run menuconfig and disable support for
> Renesas, SiFive and T-Head SoCs & you can replicate.

Right, the circular dependency has to be broken in some form.

> I think one of RISCV_ISA_ZICBOM and RISCV_DMA_NONCOHERENT should just be
> dropped, although I don't know which one to pick!
> Making RISCV_DMA_NONCOHERENT user selectable probably makes the most
> sense.

That sounds good to me.

    Arnd
Lad, Prabhakar March 31, 2023, 12:11 p.m. UTC | #10
On Fri, Mar 31, 2023 at 11:45 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> > On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >> It also seems wrong to have the fallback be to do nothing
> >> when the pointer is NULL, since that cannot actually work
> >> when a device is not cache coherent.
> >>
> > If the device is non cache coherent and if it doesn't support ZICBOM
> > ISA extension the device won't work anyway. So non-cache coherent
> > devices until they have their CMO config enabled won't work anyway. So
> > I didn't see any benefit in enabling ZICBOM by default. Please let me
> > know if I am misunderstanding.
>
> Two things:
>
> - Having a broken machine crash with in invalid instruction
>   exception is better than having it run into silent data
>   corruption.
>
> - a correctly predicted branch is typically faster than an
>   indirect function call, so the fallback to zicbom makes the
>   expected (at least in the future) case the fast one.
>
Ok, thank you for the clarification. I'll default to zicbom.

> > @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
> >         depends on MMU
> >         depends on RISCV_ALTERNATIVE
> >         default y
> > -       select RISCV_DMA_NONCOHERENT
> >         help
> >            Adds support to dynamically detect the presence of the ZICBOM
> >            extension (Cache Block Management Operations) and enable its
> >
> > But what if the platform doesn't have the ZICBOM ISA extension?
>
> Then it needs to register its cache operations before the first
> DMA, which is something that it should do anyway. With your
> current code, it may work by accident depending on the state of
> the cache, but with the version I suggested, it will either work
> correctly all the time or crash in an obvious way when misconfigured.
>
Okay, agreed.

Cheers,
Prabhakar
Conor Dooley March 31, 2023, 12:24 p.m. UTC | #11
Hey,

I think most of what I wanted to talk about has been raised by Arnd or
Geert, so I kinda only have a couple of small comments for you here.

On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> 
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> 
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
> 
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);

So, given Arnd has renamed the generic helpers, should we use the
writeback/invalidate/writeback & invalidate terminology here too?
I think it'd be nice to make the "driver" functions match the generic
implementations's names (even though that means not making the
instruction naming).

> The above function pointers are provided to be overridden for platforms
> needing CMO.
> 
> Convert ZICBOM and T-HEAD CMO to use function pointers.
> 
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> ---
> v6->v7
> * Updated commit description
> * Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
> * Used static const struct ptr to register CMO ops
> * Dropped riscv_dma_noncoherent_cmo_ops

> * Moved ZICBOM CMO setup to setup.c

Why'd you do that?
What is the reason that the Zicbom stuff cannot live in
dma-noncoherent.[ch] and only expose, say:
void riscv_cbom_register_cmo_ops(void)
{
	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
}
which you then call from setup.c?

> v5->v6
> * New patch
> ---
>  arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
>  arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
>  arch/riscv/include/asm/errata_list.h     | 53 -----------------
>  arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
>  arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
>  arch/riscv/mm/pmem.c                     |  6 +-
>  6 files changed, 221 insertions(+), 61 deletions(-)
>  create mode 100644 arch/riscv/include/asm/dma-noncoherent.h

> +#ifdef CONFIG_RISCV_ISA_ZICBOM
> +
> +#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)				\
> +	asm volatile("mv a0, %1\n\t"						\
> +		     "j 2f\n\t"							\
> +		     "3:\n\t"							\
> +		     CBO_##_op(a0)						\
> +		     "add a0, a0, %0\n\t"					\
> +		     "2:\n\t"							\
> +		     "bltu a0, %2, 3b\n\t"					\
> +		     : : "r"(_cachesize),					\
> +			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
> +			 "r"((unsigned long)(_start) + (_size))			\
> +		     : "a0")
> +
> +static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> +}
> +
> +const struct riscv_cache_ops zicbom_cmo_ops = {
> +	.clean_range = &zicbom_cmo_clean_range,
> +	.inv_range = &zicbom_cmo_inval_range,
> +	.flush_range = &zicbom_cmo_flush_range,
> +};
> +
> +static void zicbom_register_cmo_ops(void)
> +{
> +	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> +}
> +#else
> +static void zicbom_register_cmo_ops(void) {}
> +#endif

I think all of the above should be prefixed with riscv_cbom to match the
other riscv_cbom stuff we already have.
Although, given the discussion elsewhere about just falling back to
zicbom in the absence of specific ops, most of these functions will
probably disappear (if not all of them).

Cheers,
Conor.
Lad, Prabhakar April 3, 2023, 5 p.m. UTC | #12
Hi Arnd,

On Fri, Mar 31, 2023 at 11:45 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> > On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >> It also seems wrong to have the fallback be to do nothing
> >> when the pointer is NULL, since that cannot actually work
> >> when a device is not cache coherent.
> >>
> > If the device is non cache coherent and if it doesn't support ZICBOM
> > ISA extension the device won't work anyway. So non-cache coherent
> > devices until they have their CMO config enabled won't work anyway. So
> > I didn't see any benefit in enabling ZICBOM by default. Please let me
> > know if I am misunderstanding.
>
> Two things:
>
> - Having a broken machine crash with in invalid instruction
>   exception is better than having it run into silent data
>   corruption.
>
> - a correctly predicted branch is typically faster than an
>   indirect function call, so the fallback to zicbom makes the
>   expected (at least in the future) case the fast one.
>
> > @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
> >         depends on MMU
> >         depends on RISCV_ALTERNATIVE
> >         default y
> > -       select RISCV_DMA_NONCOHERENT
> >         help
> >            Adds support to dynamically detect the presence of the ZICBOM
> >            extension (Cache Block Management Operations) and enable its
> >
> > But what if the platform doesn't have the ZICBOM ISA extension?
>
> Then it needs to register its cache operations before the first
> DMA, which is something that it should do anyway. With your
> current code, it may work by accident depending on the state of
> the cache, but with the version I suggested, it will either work
> correctly all the time or crash in an obvious way when misconfigured.
>
You were right, defaulting to ZICBOM is giving me a crash. So I need
to switch to something else rather than using arch_initcall().

I tried early_initcall() but this doesn't let me register a platform
driver. I should be able to access the resources and DT from init
callback and then register CMO callbacks (I havent tried this yet) but
it wont be a platform driver. Should this be OK?

Cheers,
Prabhakar
Lad, Prabhakar April 3, 2023, 6:23 p.m. UTC | #13
Hi Conor,

Thank you for the review.

On Fri, Mar 31, 2023 at 1:24 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey,
>
> I think most of what I wanted to talk about has been raised by Arnd or
> Geert, so I kinda only have a couple of small comments for you here.
>
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
>
> So, given Arnd has renamed the generic helpers, should we use the
> writeback/invalidate/writeback & invalidate terminology here too?
> I think it'd be nice to make the "driver" functions match the generic
> implementations's names (even though that means not making the
> instruction naming).
>
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > ---
> > v6->v7
> > * Updated commit description
> > * Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
> > * Used static const struct ptr to register CMO ops
> > * Dropped riscv_dma_noncoherent_cmo_ops
>
> > * Moved ZICBOM CMO setup to setup.c
>
> Why'd you do that?
> What is the reason that the Zicbom stuff cannot live in
> dma-noncoherent.[ch] and only expose, say:
> void riscv_cbom_register_cmo_ops(void)
> {
>         riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> }
> which you then call from setup.c?
>
Commit abcc445acd ("riscv: move riscv_noncoherent_supported() out of
ZICBOM probe) moved the zicbom the setup to setup.c hence I moved the
CMO stuff here. Said that, now I am defaulting to zicbom ops so I have
mode the functions to dma-noncoherent.c  .

> > v5->v6
> > * New patch
> > ---
> >  arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
> >  arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
> >  arch/riscv/include/asm/errata_list.h     | 53 -----------------
> >  arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
> >  arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
> >  arch/riscv/mm/pmem.c                     |  6 +-
> >  6 files changed, 221 insertions(+), 61 deletions(-)
> >  create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
>
> > +#ifdef CONFIG_RISCV_ISA_ZICBOM
> > +
> > +#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)                                \
> > +     asm volatile("mv a0, %1\n\t"                                            \
> > +                  "j 2f\n\t"                                                 \
> > +                  "3:\n\t"                                                   \
> > +                  CBO_##_op(a0)                                              \
> > +                  "add a0, a0, %0\n\t"                                       \
> > +                  "2:\n\t"                                                   \
> > +                  "bltu a0, %2, 3b\n\t"                                      \
> > +                  : : "r"(_cachesize),                                       \
> > +                      "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),  \
> > +                      "r"((unsigned long)(_start) + (_size))                 \
> > +                  : "a0")
> > +
> > +static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +const struct riscv_cache_ops zicbom_cmo_ops = {
> > +     .clean_range = &zicbom_cmo_clean_range,
> > +     .inv_range = &zicbom_cmo_inval_range,
> > +     .flush_range = &zicbom_cmo_flush_range,
> > +};
> > +
> > +static void zicbom_register_cmo_ops(void)
> > +{
> > +     riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> > +}
> > +#else
> > +static void zicbom_register_cmo_ops(void) {}
> > +#endif
>
> I think all of the above should be prefixed with riscv_cbom to match the
> other riscv_cbom stuff we already have.
Just to clarify, the riscv_cbom prefix should just be applied to the
ZICOM functions and not to T-HEAD?

Cheers,
Prabhakar
Conor Dooley April 3, 2023, 6:31 p.m. UTC | #14
On Mon, Apr 03, 2023 at 07:23:37PM +0100, Lad, Prabhakar wrote:
> > I think all of the above should be prefixed with riscv_cbom to match the
> > other riscv_cbom stuff we already have.
> Just to clarify, the riscv_cbom prefix should just be applied to the
> ZICOM functions and not to T-HEAD?

Yah, can leave the T-HEAD stuff as is IMO.

Cheers,
Conor.
Christoph Hellwig April 4, 2023, 5:29 a.m. UTC | #15
On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> 
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> 
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
> 
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
> 

NAK.  Function pointers for somthing high performance as cache
maintainance is a complete no-go.
Biju Das April 4, 2023, 6:24 a.m. UTC | #16
Hi Christoph Hellwig,

> -----Original Message-----
> From: Christoph Hellwig <hch@infradead.org>
> Sent: Tuesday, April 4, 2023 6:29 AM
> To: Prabhakar <prabhakar.csengg@gmail.com>
> Cc: Arnd Bergmann <arnd@arndb.de>; Conor Dooley
> <conor.dooley@microchip.com>; Geert Uytterhoeven <geert+renesas@glider.be>;
> Heiko Stuebner <heiko@sntech.de>; Guo Ren <guoren@kernel.org>; Andrew Jones
> <ajones@ventanamicro.com>; Paul Walmsley <paul.walmsley@sifive.com>; Palmer
> Dabbelt <palmer@dabbelt.com>; Albert Ou <aou@eecs.berkeley.edu>; Samuel
> Holland <samuel@sholland.org>; linux-riscv@lists.infradead.org; Rob Herring
> <robh+dt@kernel.org>; Krzysztof Kozlowski
> <krzysztof.kozlowski+dt@linaro.org>; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-renesas-soc@vger.kernel.org; Biju Das
> <biju.das.jz@bp.renesas.com>; Prabhakar Mahadev Lad <prabhakar.mahadev-
> lad.rj@bp.renesas.com>
> Subject: Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using
> function pointers for cache management
> 
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done
> > using and ALTERNATIVE_X() macro. This was manageable when there were
> > just two CMO implementations, but now that there are more and more
> > platforms coming needing custom CMOs, the use of the ALTERNATIVE_X() macro
> is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only
> > drawback being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size); void
> > (*inv_range)(unsigned long addr, unsigned long size); void
> > (*flush_range)(unsigned long addr, unsigned long size);
> >
> 
> NAK.  Function pointers for somthing high performance as cache maintainance
> is a complete no-go.

Just a question, how does function pointer makes a performance difference compared to
ALTERNATIVE_X() macros?

On both cases, we are pushing function parameters to stack, jumping to the actual routine
And then on return pop the variables from stack. Am I missing something here?

Benchmark results by [1], shows that there is no performance degradation. I am not sure
What type of benchmarking used in this case and How accurate is this benchmark?

https://lore.kernel.org/linux-renesas-soc/40cdea465fef49a8a337b48e2a748138c66a08eb.camel@icenowy.me/T/#m093c1f3d8f7f0e15bd2909900bf137d5714c553c

Cheers,
Biju
Arnd Bergmann April 4, 2023, 6:50 a.m. UTC | #17
On Tue, Apr 4, 2023, at 07:29, Christoph Hellwig wrote:
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
>> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>> 
>> Currently, selecting which CMOs to use on a given platform is done using
>> and ALTERNATIVE_X() macro. This was manageable when there were just two
>> CMO implementations, but now that there are more and more platforms coming
>> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>> 
>> To avoid such issues this patch switches to use of function pointers
>> instead of ALTERNATIVE_X() macro for cache management (the only drawback
>> being performance over the previous approach).
>> 
>> void (*clean_range)(unsigned long addr, unsigned long size);
>> void (*inv_range)(unsigned long addr, unsigned long size);
>> void (*flush_range)(unsigned long addr, unsigned long size);
>> 
>
> NAK.  Function pointers for somthing high performance as cache
> maintainance is a complete no-go.

As we already discussed, this is now planned to use a direct
branch to the zicbom version when the function pointer is NULL,
which should be a fast predicted branch on all standard implementations
and only be slow on the nonstandard ones, which I think is a reasonable
compromise.

I'm also not sure I'd actually consider this a fast path, since
there is already a significant cost in accessing the invalidated
cache lines afterwards, which is likely going to be much higher than
the cost of an indirect branch.

I suppose an alternative would be to use the linux/static_call.h
infrastructure to avoid the overhead of indirect branches altogether.
Right now, this has no riscv specific implementation though, so
using it just turns into a regular indirect branch until someone
implements static_call.

       Arnd
Conor Dooley April 4, 2023, 6:59 a.m. UTC | #18
On Tue, Apr 04, 2023 at 08:50:16AM +0200, Arnd Bergmann wrote:
> On Tue, Apr 4, 2023, at 07:29, Christoph Hellwig wrote:
> > On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> >> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >> 
> >> Currently, selecting which CMOs to use on a given platform is done using
> >> and ALTERNATIVE_X() macro. This was manageable when there were just two
> >> CMO implementations, but now that there are more and more platforms coming
> >> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >> 
> >> To avoid such issues this patch switches to use of function pointers
> >> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> >> being performance over the previous approach).
> >> 
> >> void (*clean_range)(unsigned long addr, unsigned long size);
> >> void (*inv_range)(unsigned long addr, unsigned long size);
> >> void (*flush_range)(unsigned long addr, unsigned long size);
> >> 
> >
> > NAK.  Function pointers for somthing high performance as cache
> > maintainance is a complete no-go.
> 
> As we already discussed, this is now planned to use a direct
> branch to the zicbom version when the function pointer is NULL,
> which should be a fast predicted branch on all standard implementations
> and only be slow on the nonstandard ones, which I think is a reasonable
> compromise.
> 
> I'm also not sure I'd actually consider this a fast path, since
> there is already a significant cost in accessing the invalidated
> cache lines afterwards, which is likely going to be much higher than
> the cost of an indirect branch.
> 
> I suppose an alternative would be to use the linux/static_call.h
> infrastructure to avoid the overhead of indirect branches altogether.
> Right now, this has no riscv specific implementation though, so
> using it just turns into a regular indirect branch until someone
> implements static_call.

Someone actually posted an implementation for that the other day:
https://patchwork.kernel.org/project/linux-riscv/patch/tencent_A8A256967B654625AEE1DB222514B0613B07@qq.com/

I haven't looked at it at all, other than noticing the build issues
outside of for !rv64 || !mmu, so I have no idea what state it is
actually in.
Christoph Hellwig April 4, 2023, 3:42 p.m. UTC | #19
On Tue, Apr 04, 2023 at 06:24:16AM +0000, Biju Das wrote:
> Just a question, how does function pointer makes a performance difference compared to
> ALTERNATIVE_X() macros?
> 
> On both cases, we are pushing function parameters to stack, jumping to the actual routine
> And then on return pop the variables from stack. Am I missing something here?

Indirect calls have always been more expensive, and with the hard- and
software mitigations for spectre-like attacks they are becoming even
more expensive.

But other other point is adding more cache flushing variants should not
be easy.  Everyone should be using the standardize version.  If it's not
implemented in hardware despite having ratified extensions you can fake
it up in SBI.  Yes, that's way more expensive than indirect calls, but
that's what you get for taping out new hardware that ignores the actual
architecture specification and just does their own made up shit.
Biju Das April 5, 2023, 6:08 a.m. UTC | #20
Hi Christoph Hellwig,

Thanks for the feedback.

> Subject: Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using
> function pointers for cache management
> 
> On Tue, Apr 04, 2023 at 06:24:16AM +0000, Biju Das wrote:
> > Just a question, how does function pointer makes a performance
> > difference compared to
> > ALTERNATIVE_X() macros?
> >
> > On both cases, we are pushing function parameters to stack, jumping to
> > the actual routine And then on return pop the variables from stack. Am I
> missing something here?
> 
> Indirect calls have always been more expensive, and with the hard- and
> software mitigations for spectre-like attacks they are becoming even more
> expensive.

Thanks for the info. I agree, it will be more expensive with software mitigations
for spectre-like attacks.

Cheers,
Biju
Lad, Prabhakar April 6, 2023, 6:59 p.m. UTC | #21
Hi Christoph,

On Tue, Apr 4, 2023 at 6:29 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
>
> NAK.  Function pointers for somthing high performance as cache
> maintainance is a complete no-go.
>
Ok, I will take the ALTERNATIVE() macro route.

Cheers,
Prabhakar
Andrea Parri April 7, 2023, 12:03 a.m. UTC | #22
> But other other point is adding more cache flushing variants should not
> be easy.  Everyone should be using the standardize version.  If it's not
> implemented in hardware despite having ratified extensions you can fake
> it up in SBI.  Yes, that's way more expensive than indirect calls, but
> that's what you get for taping out new hardware that ignores the actual
> architecture specification and just does their own made up shit.

FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
crystal-clear semantics" already smells like "we're doing it wrong" to
me, function pointers would be closer to "we're looking for trouble".

Thanks,
  Andrea
Christoph Hellwig April 7, 2023, 5:33 a.m. UTC | #23
On Fri, Apr 07, 2023 at 02:03:36AM +0200, Andrea Parri wrote:
> > But other other point is adding more cache flushing variants should not
> > be easy.  Everyone should be using the standardize version.  If it's not
> > implemented in hardware despite having ratified extensions you can fake
> > it up in SBI.  Yes, that's way more expensive than indirect calls, but
> > that's what you get for taping out new hardware that ignores the actual
> > architecture specification and just does their own made up shit.
> 
> FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
> crystal-clear semantics" already smells like "we're doing it wrong" to
> me, function pointers would be closer to "we're looking for trouble".

Thanks for putting my feelings into such concise words.
diff mbox series

Patch

diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index 7e8d50ebb71a..4bb3f2baee03 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -11,10 +11,83 @@ 
 #include <linux/uaccess.h>
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/errata_list.h>
 #include <asm/patch.h>
 #include <asm/vendorid_list.h>
 
+#ifdef CONFIG_ERRATA_THEAD_CMO
+/*
+ * dcache.ipa rs1 (invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01010      rs1       000      00000  0001011
+ * dache.iva rs1 (invalida, virtual address)
+ *   0000001    00110      rs1       000      00000  0001011
+ *
+ * dcache.cpa rs1 (clean, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01001      rs1       000      00000  0001011
+ * dcache.cva rs1 (clean, virtual address)
+ *   0000001    00100      rs1       000      00000  0001011
+ *
+ * dcache.cipa rs1 (clean then invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01011      rs1       000      00000  0001011
+ * dcache.civa rs1 (... virtual address)
+ *   0000001    00111      rs1       000      00000  0001011
+ *
+ * sync.s (make sure all cache operations finished)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000000    11001     00000      000      00000  0001011
+ */
+#define THEAD_inval_A0	".long 0x0265000b"
+#define THEAD_clean_A0	".long 0x0245000b"
+#define THEAD_flush_A0	".long 0x0275000b"
+#define THEAD_SYNC_S	".long 0x0190000b"
+
+#define THEAD_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     THEAD_##_op##_A0 "\n\t"					\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     THEAD_SYNC_S						\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void thead_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+static const struct riscv_cache_ops thead_cmo_ops = {
+	.clean_range = &thead_cmo_clean_range,
+	.inv_range = &thead_cmo_inval_range,
+	.flush_range = &thead_cmo_flush_range,
+};
+
+static void thead_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
+}
+#else
+static void thead_register_cmo_ops(void) {}
+#endif
+
 static bool errata_probe_pbmt(unsigned int stage,
 			      unsigned long arch_id, unsigned long impid)
 {
@@ -45,6 +118,9 @@  static bool errata_probe_cmo(unsigned int stage,
 
 	riscv_cbom_block_size = L1_CACHE_BYTES;
 	riscv_noncoherent_supported();
+
+	thead_register_cmo_ops();
+
 	return true;
 }
 
diff --git a/arch/riscv/include/asm/dma-noncoherent.h b/arch/riscv/include/asm/dma-noncoherent.h
new file mode 100644
index 000000000000..fc8d16c89f01
--- /dev/null
+++ b/arch/riscv/include/asm/dma-noncoherent.h
@@ -0,0 +1,73 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#ifndef __ASM_DMA_NONCOHERENT_H
+#define __ASM_DMA_NONCOHERENT_H
+
+#include <linux/dma-direct.h>
+
+#ifdef CONFIG_RISCV_DMA_NONCOHERENT
+
+/*
+ * struct riscv_cache_ops - Structure for CMO function pointers
+ *
+ * @clean_range: Function pointer for clean cache
+ * @inv_range: Function pointer for invalidate cache
+ * @flush_range: Function pointer for flushing the cache
+ */
+struct riscv_cache_ops {
+	void (*clean_range)(unsigned long addr, unsigned long size);
+	void (*inv_range)(unsigned long addr, unsigned long size);
+	void (*flush_range)(unsigned long addr, unsigned long size);
+};
+
+extern struct riscv_cache_ops noncoherent_cache_ops;
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops);
+
+static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.clean_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.clean_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_flush(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.flush_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.flush_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_inval(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.inv_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.inv_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_clean(vaddr, size);
+}
+
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_inval(vaddr, size);
+}
+#else
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size) {}
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size) {}
+
+#endif /* CONFIG_RISCV_DMA_NONCOHERENT */
+
+#endif	/* __ASM_DMA_NONCOHERENT_H */
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index fb1a810f3d8c..112429910ee6 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -89,59 +89,6 @@  asm volatile(ALTERNATIVE(						\
 #define ALT_THEAD_PMA(_val)
 #endif
 
-/*
- * dcache.ipa rs1 (invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01010      rs1       000      00000  0001011
- * dache.iva rs1 (invalida, virtual address)
- *   0000001    00110      rs1       000      00000  0001011
- *
- * dcache.cpa rs1 (clean, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01001      rs1       000      00000  0001011
- * dcache.cva rs1 (clean, virtual address)
- *   0000001    00100      rs1       000      00000  0001011
- *
- * dcache.cipa rs1 (clean then invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01011      rs1       000      00000  0001011
- * dcache.civa rs1 (... virtual address)
- *   0000001    00111      rs1       000      00000  0001011
- *
- * sync.s (make sure all cache operations finished)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000000    11001     00000      000      00000  0001011
- */
-#define THEAD_inval_A0	".long 0x0265000b"
-#define THEAD_clean_A0	".long 0x0245000b"
-#define THEAD_flush_A0	".long 0x0275000b"
-#define THEAD_SYNC_S	".long 0x0190000b"
-
-#define ALT_CMO_OP(_op, _start, _size, _cachesize)			\
-asm volatile(ALTERNATIVE_2(						\
-	__nops(6),							\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	CBO_##_op(a0)							\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	"nop", 0, RISCV_ISA_EXT_ZICBOM, CONFIG_RISCV_ISA_ZICBOM,	\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	THEAD_##_op##_A0 "\n\t"						\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	THEAD_SYNC_S, THEAD_VENDOR_ID,					\
-			ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO)	\
-	: : "r"(_cachesize),						\
-	    "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
-	    "r"((unsigned long)(_start) + (_size))			\
-	: "a0")
-
 #define THEAD_C9XX_RV_IRQ_PMU			17
 #define THEAD_C9XX_CSR_SCOUNTEROF		0x5c5
 
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 5d3184cbf518..b2b69d1dec23 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -24,6 +24,7 @@ 
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/early_ioremap.h>
 #include <asm/pgtable.h>
 #include <asm/setup.h>
@@ -262,6 +263,50 @@  static void __init parse_dtb(void)
 #endif
 }
 
+#ifdef CONFIG_RISCV_ISA_ZICBOM
+
+#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     CBO_##_op(a0)						\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+const struct riscv_cache_ops zicbom_cmo_ops = {
+	.clean_range = &zicbom_cmo_clean_range,
+	.inv_range = &zicbom_cmo_inval_range,
+	.flush_range = &zicbom_cmo_flush_range,
+};
+
+static void zicbom_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
+}
+#else
+static void zicbom_register_cmo_ops(void) {}
+#endif
+
 void __init setup_arch(char **cmdline_p)
 {
 	parse_dtb();
@@ -301,8 +346,10 @@  void __init setup_arch(char **cmdline_p)
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 	if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
-	    riscv_isa_extension_available(NULL, ZICBOM))
+	    riscv_isa_extension_available(NULL, ZICBOM)) {
 		riscv_noncoherent_supported();
+		zicbom_register_cmo_ops();
+	}
 }
 
 static int __init topology_init(void)
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index b9a9f57e02be..ab8f6dc67914 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -9,28 +9,36 @@ 
 #include <linux/dma-map-ops.h>
 #include <linux/mm.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 static bool noncoherent_supported;
 
+struct riscv_cache_ops noncoherent_cache_ops = {
+	.clean_range = NULL,
+	.inv_range = NULL,
+	.flush_range = NULL,
+};
+EXPORT_SYMBOL_GPL(noncoherent_cache_ops);
+
 static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_clean(vaddr, size);
 }
 
 static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_inval(vaddr, size);
 }
 
 static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(vaddr, size);
 }
 
 static inline bool arch_sync_dma_clean_before_fromdevice(void)
@@ -50,7 +58,7 @@  void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	void *flush_addr = page_address(page);
 
-	ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(flush_addr, size);
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
@@ -75,3 +83,12 @@  void riscv_noncoherent_supported(void)
 	     "Non-coherent DMA support enabled without a block size\n");
 	noncoherent_supported = true;
 }
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
+{
+	if (!ops)
+		return;
+
+	noncoherent_cache_ops = *ops;
+}
+EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);
diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
index 089df92ae876..aad7c2209eca 100644
--- a/arch/riscv/mm/pmem.c
+++ b/arch/riscv/mm/pmem.c
@@ -6,16 +6,16 @@ 
 #include <linux/export.h>
 #include <linux/libnvdimm.h>
 
-#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 void arch_wb_cache_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_clean(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
 
 void arch_invalidate_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_inval(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);