[RESEND] irqchip: riscv: Order normal writes and IPI writes

Message ID 20250127093846.98625-1-luxu.kernel@bytedance.com (mailing list archive)
State New

Checks

Context                                Check    Description
conchuod/vmtest-for-next-PR            success  PR summary
conchuod/patch-1-test-1                success  .github/scripts/patches/tests/build_rv32_defconfig.sh took 107.17s
conchuod/patch-1-test-2                success  .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh took 1076.12s
conchuod/patch-1-test-3                success  .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh took 1274.48s
conchuod/patch-1-test-4                success  .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh took 16.54s
conchuod/patch-1-test-5                success  .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh took 18.30s
conchuod/patch-1-test-6                success  .github/scripts/patches/tests/checkpatch.sh took 0.56s
conchuod/patch-1-test-7                success  .github/scripts/patches/tests/dtb_warn_rv64.sh took 36.95s
conchuod/patch-1-test-8                success  .github/scripts/patches/tests/header_inline.sh took 0.00s
conchuod/patch-1-test-9                success  .github/scripts/patches/tests/kdoc.sh took 0.48s
conchuod/patch-1-test-10               success  .github/scripts/patches/tests/module_param.sh took 0.01s
conchuod/patch-1-test-11               success  .github/scripts/patches/tests/verify_fixes.sh took 0.00s
conchuod/patch-1-test-12               success  .github/scripts/patches/tests/verify_signedoff.sh took 0.02s
bjorn/build-rv32-defconfig             success  build-rv32-defconfig
bjorn/build-rv64-clang-allmodconfig    success  build-rv64-clang-allmodconfig
bjorn/build-rv64-gcc-allmodconfig      success  build-rv64-gcc-allmodconfig
bjorn/build-rv64-nommu-k210-defconfig  success  build-rv64-nommu-k210-defconfig
bjorn/build-rv64-nommu-k210-virt       success  build-rv64-nommu-k210-virt
bjorn/checkpatch                       success  checkpatch
bjorn/dtb-warn-rv64                    success  dtb-warn-rv64
bjorn/header-inline                    success  header-inline
bjorn/kdoc                             success  kdoc
bjorn/module-param                     success  module-param
bjorn/verify-fixes                     success  verify-fixes
bjorn/verify-signedoff                 success  verify-signedoff

Commit Message

Xu Lu Jan. 27, 2025, 9:38 a.m. UTC
RISC-V distinguishes between normal memory accesses and device I/O and
uses the FENCE instruction to order them as viewed by other RISC-V harts
and external devices or coprocessors. The FENCE instruction can order any
combination of device input (I), device output (O), memory reads (R) and
memory writes (W). For example, 'fence w, o' can be used to ensure all
memory writes from instructions preceding the FENCE instruction appear
earlier in the global memory order than device output writes from
instructions after the FENCE instruction.

RISC-V issues an IPI by writing a certain value to IMSIC/ACLINT MMIO
registers, which is regarded as a device output operation. However, the
existing implementation of the IMSIC/ACLINT drivers issues IPIs via
writel_relaxed(), which does not guarantee the order between the device
output operation and preceding memory writes. The hart receiving the IPI
may therefore not have seen the latest data yet.

This commit fixes this by replacing writel_relaxed() with writel() when
issuing an IPI, which will use 'fence w, o' to ensure all previous writes
made by the current hart are visible to other harts before they receive
the IPI.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
---
 drivers/irqchip/irq-riscv-imsic-early.c      | 2 +-
 drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
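
The ordering requirement described in the commit message can be
illustrated outside the kernel. The following is only a userspace sketch
under stated assumptions: it uses C11 atomics and POSIX threads, and the
names payload, doorbell, doorbell_write_ordered() and send_and_receive()
are hypothetical stand-ins, not kernel or driver APIs. A release store
plays the role of writel() (data first, then the doorbell), and the
receiving thread plays the role of the hart that takes the IPI.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's writel()/writel_relaxed(). */
static uint32_t payload;           /* models ordinary memory */
static _Atomic uint32_t doorbell;  /* models the IPI MMIO register */

/* Models writel(): earlier memory writes are ordered before the doorbell. */
static void doorbell_write_ordered(uint32_t v)
{
	atomic_store_explicit(&doorbell, v, memory_order_release);
}

/* Models the receiving hart: wait for the doorbell, then read the data. */
static void *receiver(void *arg)
{
	uint32_t *seen = arg;

	while (atomic_load_explicit(&doorbell, memory_order_acquire) == 0)
		;	/* spin until the "IPI" arrives */
	*seen = payload;
	return NULL;
}

/* Publish value, ring the doorbell, and return what the receiver saw. */
static uint32_t send_and_receive(uint32_t value)
{
	pthread_t thread;
	uint32_t seen = 0;
	int rc;

	payload = 0;
	atomic_store_explicit(&doorbell, 0, memory_order_relaxed);

	rc = pthread_create(&thread, NULL, receiver, &seen);
	assert(rc == 0);

	payload = value;            /* the "preceding memory write" */
	doorbell_write_ordered(1);  /* the "IPI" */

	pthread_join(thread, NULL);
	return seen;
}
```

With the release/acquire pairing, send_and_receive(42) is guaranteed to
return 42. If the doorbell store were relaxed instead, which is the
analogue of writel_relaxed() here, the language would no longer guarantee
that the receiver observes the new payload.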

Comments

Thomas Gleixner Jan. 27, 2025, 10:33 a.m. UTC | #1
On Mon, Jan 27 2025 at 17:38, Xu Lu wrote:

This is not a RESEND. The change log has been modified, no?

The prefix is incorrect. See

  https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

> RISC-V distinguishes between normal memory accesses and device I/O and

What is a normal memory write? Are there abnormal memory writes too?

> uses FENCE instruction to order them as viewed by other RISC-V harts and
> external devices or coprocessors. The FENCE instruction can order any
> combination of device input(I), device output(O), memory reads(R) and
> memory writes(W). For example, 'fence w, o' can be used to ensure all

Can be? It _is_ used, no?

> memory writes from instructions preceding the FENCE instruction appear
> earlier in the global memory order than device output writes from
> instructions after the FENCE instruction.
>
> RISC-V issues IPI by writing certain value to IMSIC/ACLINT MMIO
> registers, which is regarded as device output operation. However, the
> existing implementation of IMSIC/ACLINT driver issues IPI via
> writel_relaxed(), which does not guarantee the order of device output
> operation and preceding memory writes. Then the hart receiving IPI may
> not have seen the latest data yet.
>
> This commit fixes this by replacing writel_relaxed() with writel() when

'This commit' is equally wrong as 'This patch'. See Documentation/process/

> issuing IPI, which will use 'fence w, o' to ensure all previous writes
> made by current hart are visible to other harts before they receive
> the IPI.

I've fixed it up for you.

Thanks,

        tglx
Xu Lu Jan. 27, 2025, 2:26 p.m. UTC | #2
On Mon, Jan 27, 2025 at 6:33 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Jan 27 2025 at 17:38, Xu Lu wrote:
>
> This is not a RESEND. The change log has been modified, no?

Sure, the change log has been modified. I will pay attention next time.

>
> The prefix is incorrect. See
>
>   https://www.kernel.org/doc/html/latest/process/maintainer-tip.html
>
> > RISC-V distinguishes between normal memory accesses and device I/O and
>
> What is a normal memory write? Are there abnormal memory writes too?

Sorry for the confusion. By "normal memory write" I mean an ordinary
memory write, as distinct from an MMIO write.

>
> > uses FENCE instruction to order them as viewed by other RISC-V harts and
> > external devices or coprocessors. The FENCE instruction can order any
> > combination of device input(I), device output(O), memory reads(R) and
> > memory writes(W). For example, 'fence w, o' can be used to ensure all
>
> Can be? It _is_ used, no?

Yes, it _is_ used. 'Can be' is not accurate.

>
> > memory writes from instructions preceding the FENCE instruction appear
> > earlier in the global memory order than device output writes from
> > instructions after the FENCE instruction.
> >
> > RISC-V issues IPI by writing certain value to IMSIC/ACLINT MMIO
> > registers, which is regarded as device output operation. However, the
> > existing implementation of IMSIC/ACLINT driver issues IPI via
> > writel_relaxed(), which does not guarantee the order of device output
> > operation and preceding memory writes. Then the hart receiving IPI may
> > not have seen the latest data yet.
> >
> > This commit fixes this by replacing writel_relaxed() with writel() when
>
> 'This commit' is equally wrong as 'This patch'. See Documentation/process/

Thanks very much. I will check the documentation.

Best Regards,

Xu Lu

>
> > issuing IPI, which will use 'fence w, o' to ensure all previous writes
> > made by current hart are visible to other harts before they receive
> > the IPI.
>
> I've fixed it up for you.
>
> Thanks,
>
>         tglx
Arnd Bergmann Jan. 27, 2025, 4:21 p.m. UTC | #3
On Mon, Jan 27, 2025, at 10:38, Xu Lu wrote:
> RISC-V distinguishes between normal memory accesses and device I/O and
> uses FENCE instruction to order them as viewed by other RISC-V harts and
> external devices or coprocessors. The FENCE instruction can order any
> combination of device input(I), device output(O), memory reads(R) and
> memory writes(W). For example, 'fence w, o' can be used to ensure all
> memory writes from instructions preceding the FENCE instruction appear
> earlier in the global memory order than device output writes from
> instructions after the FENCE instruction.

There is nothing risc-v specific in here really, it's just a bug
in the driver: writel() means access the mmio register with appropriate
barriers, while writel_relaxed() is a special case that should
only ever be used if a particular function is sensitive to
performance and never needs to be serialized.
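
As a rough illustration of that distinction (a userspace sketch, not
kernel code), the ordered accessor can be thought of as a barrier
followed by the relaxed accessor. The names mock_reg,
mock_writel_relaxed(), mock_writel() and mock_ring_doorbell() are
hypothetical, and the C11 release fence only approximates the
'fence w, o' that the patch description says writel() uses on RISC-V.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock register; a real driver would use an __iomem pointer. */
static _Atomic uint32_t mock_reg;

/* Models writel_relaxed(): the store itself, with no ordering guarantee. */
static inline void mock_writel_relaxed(uint32_t v, _Atomic uint32_t *reg)
{
	assert(reg != NULL);
	atomic_store_explicit(reg, v, memory_order_relaxed);
}

/* Models writel(): a barrier first, then the same relaxed store. */
static inline void mock_writel(uint32_t v, _Atomic uint32_t *reg)
{
	/* Earlier memory writes are ordered before the register write. */
	atomic_thread_fence(memory_order_release);
	mock_writel_relaxed(v, reg);
}

/* Write through the ordered accessor and read the mock register back. */
static uint32_t mock_ring_doorbell(uint32_t v)
{
	mock_writel(v, &mock_reg);
	return atomic_load_explicit(&mock_reg, memory_order_relaxed);
}
```

Both accessors store the same value; the only difference is the barrier,
which is why the ordered variant is the safe default and the relaxed one
needs a justification.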

> diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> index b0e366ade427..8ff6e7a1363b 100644
> --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
> 
>  static void thead_aclint_sswi_ipi_send(unsigned int cpu)
>  {
> -	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
> +	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
>  }
> 
>  static void thead_aclint_sswi_ipi_clear(void)
> -- 

thead_aclint_sswi_ipi_clear() seems to have the same bug,
it also uses the _relaxed() version for no apparent reason.

     Arnd
Xu Lu Jan. 27, 2025, 4:37 p.m. UTC | #4
On Tue, Jan 28, 2025 at 12:23 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Jan 27, 2025, at 10:38, Xu Lu wrote:
> > RISC-V distinguishes between normal memory accesses and device I/O and
> > uses FENCE instruction to order them as viewed by other RISC-V harts and
> > external devices or coprocessors. The FENCE instruction can order any
> > combination of device input(I), device output(O), memory reads(R) and
> > memory writes(W). For example, 'fence w, o' can be used to ensure all
> > memory writes from instructions preceding the FENCE instruction appear
> > earlier in the global memory order than device output writes from
> > instructions after the FENCE instruction.
>
> There is nothing risc-v specific in here really, it's just a bug
> in the driver: writel() means access the mmio register with appropriate
> barriers, while writel_relaxed() is a special case that should
> only ever be used if a particular function is sensitive to
> performance and never needs to be serialized.
>
> > diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > index b0e366ade427..8ff6e7a1363b 100644
> > --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
> >
> >  static void thead_aclint_sswi_ipi_send(unsigned int cpu)
> >  {
> > -     writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
> > +     writel(0x1, per_cpu(sswi_cpu_regs, cpu));
> >  }
> >
> >  static void thead_aclint_sswi_ipi_clear(void)
> > --
>
> thead_aclint_sswi_ipi_clear() seems to have the same bug,
> it also uses the _relaxed() version for no apparent reason.

Hi Arnd,

There seems to be no need to modify thead_aclint_sswi_ipi_clear(), as it
only clears the pending IPI on the current hart. No other hart needs to
observe a strict order between preceding memory writes and this ACLINT
MMIO write. Please correct me if I missed anything.

Thanks,

Xu Lu

>
>      Arnd
Arnd Bergmann Jan. 27, 2025, 7:07 p.m. UTC | #5
On Mon, Jan 27, 2025, at 17:37, Xu Lu wrote:
> On Tue, Jan 28, 2025 at 12:23 AM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Mon, Jan 27, 2025, at 10:38, Xu Lu wrote:
>>
>> thead_aclint_sswi_ipi_clear() seems to have the same bug,
>> it also uses the _relaxed() version for no apparent reason.
>
>
> There seems no need to modify thead_aclint_sswi_ipi_clear() as it only
> clears pending IPI on current hart. No other harts require to see
> strict order between preceding memory writes and this ACLINT MMIO
> write. Please correct me if I missed anything.

My point was that you should always default to the normal operations,
since the relaxed ones keep causing problems. If you still think it's
important to use writel_relaxed() in thead_aclint_sswi_ipi_clear(),
please add a comment explaining how you have shown that it's correct
and why it matters, otherwise convert them both.

      Arnd

Patch

diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
index c5c2e6929a2f..275df5005705 100644
--- a/drivers/irqchip/irq-riscv-imsic-early.c
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -27,7 +27,7 @@  static void imsic_ipi_send(unsigned int cpu)
 {
 	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
 
-	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
+	writel(IMSIC_IPI_ID, local->msi_va);
 }
 
 static void imsic_ipi_starting_cpu(void)
diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
index b0e366ade427..8ff6e7a1363b 100644
--- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
+++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
@@ -31,7 +31,7 @@  static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
 
 static void thead_aclint_sswi_ipi_send(unsigned int cpu)
 {
-	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
+	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
 }
 
 static void thead_aclint_sswi_ipi_clear(void)