diff mbox series

[v6,05/13] riscv: Only send remote fences when some other CPU is online

Message ID 20240327045035.368512-6-samuel.holland@sifive.com (mailing list archive)
State New
Series riscv: ASID-related and UP-related TLB flush enhancements

Commit Message

Samuel Holland March 27, 2024, 4:49 a.m. UTC
If no other CPU is online, a local cache or TLB flush is sufficient.
These checks can be constant-folded when SMP is disabled.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---

(no changes since v4)

Changes in v4:
 - New patch for v4

 arch/riscv/mm/cacheflush.c | 4 +++-
 arch/riscv/mm/tlbflush.c   | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

yunhui cui March 27, 2024, 6:16 a.m. UTC | #1
Hi Samuel,

On Wed, Mar 27, 2024 at 12:50 PM Samuel Holland
<samuel.holland@sifive.com> wrote:
>
> If no other CPU is online, a local cache or TLB flush is sufficient.
> These checks can be constant-folded when SMP is disabled.
>
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
>
> (no changes since v4)
>
> Changes in v4:
>  - New patch for v4
>
>  arch/riscv/mm/cacheflush.c | 4 +++-
>  arch/riscv/mm/tlbflush.c   | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index d76fc73e594b..f5be1fec8191 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -21,7 +21,9 @@ void flush_icache_all(void)
>  {
>         local_flush_icache_all();
>
> -       if (riscv_use_sbi_for_rfence())
> +       if (num_online_cpus() < 2)
> +               return;
> +       else if (riscv_use_sbi_for_rfence())
>                 sbi_remote_fence_i(NULL);
>         else
>                 on_each_cpu(ipi_remote_fence_i, NULL, 1);
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index da821315d43e..0901aa47b58f 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -79,7 +79,9 @@ static void __ipi_flush_tlb_all(void *info)
>
>  void flush_tlb_all(void)
>  {
> -       if (riscv_use_sbi_for_rfence())
> +       if (num_online_cpus() < 2)
> +               local_flush_tlb_all();
> +       else if (riscv_use_sbi_for_rfence())
>                 sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
>         else
>                 on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
> --
> 2.43.1
>

Intuitively, this change does not seem necessary, since on_each_cpu()
already contains such logic. Can you share your test data?


Thanks,
Yunhui

Samuel Holland March 27, 2024, 8:14 p.m. UTC | #2
Hi Yunhui,

On 2024-03-27 1:16 AM, yunhui cui wrote:
> On Wed, Mar 27, 2024 at 12:50 PM Samuel Holland
> <samuel.holland@sifive.com> wrote:
>>
>> If no other CPU is online, a local cache or TLB flush is sufficient.
>> These checks can be constant-folded when SMP is disabled.
>>
>> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
>> ---
>>
>> (no changes since v4)
>>
>> Changes in v4:
>>  - New patch for v4
>>
>>  arch/riscv/mm/cacheflush.c | 4 +++-
>>  arch/riscv/mm/tlbflush.c   | 4 +++-
>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>> index d76fc73e594b..f5be1fec8191 100644
>> --- a/arch/riscv/mm/cacheflush.c
>> +++ b/arch/riscv/mm/cacheflush.c
>> @@ -21,7 +21,9 @@ void flush_icache_all(void)
>>  {
>>         local_flush_icache_all();
>>
>> -       if (riscv_use_sbi_for_rfence())
>> +       if (num_online_cpus() < 2)
>> +               return;
>> +       else if (riscv_use_sbi_for_rfence())
>>                 sbi_remote_fence_i(NULL);
>>         else
>>                 on_each_cpu(ipi_remote_fence_i, NULL, 1);
>> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
>> index da821315d43e..0901aa47b58f 100644
>> --- a/arch/riscv/mm/tlbflush.c
>> +++ b/arch/riscv/mm/tlbflush.c
>> @@ -79,7 +79,9 @@ static void __ipi_flush_tlb_all(void *info)
>>
>>  void flush_tlb_all(void)
>>  {
>> -       if (riscv_use_sbi_for_rfence())
>> +       if (num_online_cpus() < 2)
>> +               local_flush_tlb_all();
>> +       else if (riscv_use_sbi_for_rfence())
>>                 sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
>>         else
>>                 on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
>> --
>> 2.43.1
>>
> 
> Intuitively, this change does not seem necessary, since on_each_cpu()
> already contains such logic. Can you share your test data?

The logic in on_each_cpu() doesn't apply when riscv_use_sbi_for_rfence() is
true, so we would make unnecessary SBI calls, and those calls cannot be optimized
out when CONFIG_SMP=n. The cover letter includes benchmarks for a representative
single-core system (D1). There was no measurable performance impact from this
portion of the series on multi-core systems. If there are specific benchmarks
you think I should run, please let me know.
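
As a rough illustration of the point above, here is a user-space model of the
two orderings of the checks. It is not kernel code: the model_* names and the
counters are invented for this sketch, and the SBI and IPI paths are reduced to
counter increments. It only shows that, with one CPU online and the SBI path
selected, the old ordering still issues an SBI call while the new ordering
stays local.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; invented for this sketch. */
unsigned int model_online_cpus = 1;
bool model_use_sbi = true;

/* Counters recording which path each call took. */
unsigned int model_local_flushes;
unsigned int model_sbi_calls;
unsigned int model_ipi_broadcasts;

/* Reset the model to a given configuration and clear the counters. */
void model_reset(unsigned int cpus, bool use_sbi)
{
	assert(cpus >= 1);	/* at least the calling CPU is online */
	model_online_cpus = cpus;
	model_use_sbi = use_sbi;
	model_local_flushes = 0;
	model_sbi_calls = 0;
	model_ipi_broadcasts = 0;
}

/* Old ordering: the SBI path is taken regardless of the online CPU count. */
void model_flush_tlb_all_old(void)
{
	if (model_use_sbi)
		model_sbi_calls++;
	else
		model_ipi_broadcasts++;	/* stands in for on_each_cpu() */
}

/* New ordering: the online CPU count is checked before either remote path. */
void model_flush_tlb_all_new(void)
{
	if (model_online_cpus < 2)
		model_local_flushes++;
	else if (model_use_sbi)
		model_sbi_calls++;
	else
		model_ipi_broadcasts++;
}
```

With a single CPU and the SBI path enabled, model_flush_tlb_all_old() bumps
model_sbi_calls, while model_flush_tlb_all_new() bumps only
model_local_flushes. When the CPU count is a compile-time constant, as on a
build with SMP disabled, the compiler can drop the two remote branches.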

Regards,
Samuel

yunhui cui March 28, 2024, 2:21 a.m. UTC | #3
Hi Samuel,

On Thu, Mar 28, 2024 at 4:14 AM Samuel Holland
<samuel.holland@sifive.com> wrote:
>
> Hi Yunhui,
>
> On 2024-03-27 1:16 AM, yunhui cui wrote:
> > On Wed, Mar 27, 2024 at 12:50 PM Samuel Holland
> > <samuel.holland@sifive.com> wrote:
> >>
> >> If no other CPU is online, a local cache or TLB flush is sufficient.
> >> These checks can be constant-folded when SMP is disabled.
> >>
> >> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> >> ---
> >>
> >> (no changes since v4)
> >>
> >> Changes in v4:
> >>  - New patch for v4
> >>
> >>  arch/riscv/mm/cacheflush.c | 4 +++-
> >>  arch/riscv/mm/tlbflush.c   | 4 +++-
> >>  2 files changed, 6 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> >> index d76fc73e594b..f5be1fec8191 100644
> >> --- a/arch/riscv/mm/cacheflush.c
> >> +++ b/arch/riscv/mm/cacheflush.c
> >> @@ -21,7 +21,9 @@ void flush_icache_all(void)
> >>  {
> >>         local_flush_icache_all();
> >>
> >> -       if (riscv_use_sbi_for_rfence())
> >> +       if (num_online_cpus() < 2)
> >> +               return;
> >> +       else if (riscv_use_sbi_for_rfence())
> >>                 sbi_remote_fence_i(NULL);
> >>         else
> >>                 on_each_cpu(ipi_remote_fence_i, NULL, 1);
> >> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> >> index da821315d43e..0901aa47b58f 100644
> >> --- a/arch/riscv/mm/tlbflush.c
> >> +++ b/arch/riscv/mm/tlbflush.c
> >> @@ -79,7 +79,9 @@ static void __ipi_flush_tlb_all(void *info)
> >>
> >>  void flush_tlb_all(void)
> >>  {
> >> -       if (riscv_use_sbi_for_rfence())
> >> +       if (num_online_cpus() < 2)
> >> +               local_flush_tlb_all();
> >> +       else if (riscv_use_sbi_for_rfence())
> >>                 sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
> >>         else
> >>                 on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
> >> --
> >> 2.43.1
> >>
> >
> > Intuitively, this change does not seem necessary, since on_each_cpu()
> > already contains such logic. Can you share your test data?
>
> The logic in on_each_cpu() doesn't apply when riscv_use_sbi_for_rfence() is
> true, so we would make unnecessary SBI calls, and those calls cannot be
> optimized out when CONFIG_SMP=n.

Is it possible to do this:
"sbi_remote_sfence_vma_asid(cpu_online_mask,...); " instead of adding:
"if (num_online_cpus() < 2)" ?

Thanks,
Yunhui

Alexandre Ghiti April 4, 2024, 8:04 a.m. UTC | #4
On 27/03/2024 05:49, Samuel Holland wrote:
> If no other CPU is online, a local cache or TLB flush is sufficient.
> These checks can be constant-folded when SMP is disabled.
>
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
>
> (no changes since v4)
>
> Changes in v4:
>   - New patch for v4
>
>   arch/riscv/mm/cacheflush.c | 4 +++-
>   arch/riscv/mm/tlbflush.c   | 4 +++-
>   2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index d76fc73e594b..f5be1fec8191 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -21,7 +21,9 @@ void flush_icache_all(void)
>   {
>   	local_flush_icache_all();
>   
> -	if (riscv_use_sbi_for_rfence())
> +	if (num_online_cpus() < 2)
> +		return;
> +	else if (riscv_use_sbi_for_rfence())
>   		sbi_remote_fence_i(NULL);
>   	else
>   		on_each_cpu(ipi_remote_fence_i, NULL, 1);
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index da821315d43e..0901aa47b58f 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -79,7 +79,9 @@ static void __ipi_flush_tlb_all(void *info)
>   
>   void flush_tlb_all(void)
>   {
> -	if (riscv_use_sbi_for_rfence())
> +	if (num_online_cpus() < 2)
> +		local_flush_tlb_all();
> +	else if (riscv_use_sbi_for_rfence())
>   		sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
>   	else
>   		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);


Could this be done directly in __sbi_rfence() instead?

Otherwise:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
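
For illustration only, here is a user-space sketch of what moving the check
into a common remote-fence helper might look like. The sketch_* names are
hypothetical and do not match the real __sbi_rfence() signature; the firmware
call and the local fence are reduced to counters. The assumption being modelled
is that every remote-fence request funnels through one helper, which falls
back to a local fence when no other CPU is online.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's __sbi_rfence() or its signature. */
enum sketch_fence { SKETCH_FENCE_I, SKETCH_SFENCE_VMA };

unsigned int sketch_online_cpus = 1;
unsigned int sketch_local_fences;	/* fences done on the calling CPU only */
unsigned int sketch_ecalls;		/* modelled firmware (SBI) calls */

/* Common helper: every remote-fence request goes through here. */
int sketch_rfence(enum sketch_fence type)
{
	assert(sketch_online_cpus >= 1);
	(void)type;	/* a real helper would select the fence by type */

	if (sketch_online_cpus < 2) {
		/* No other CPU to notify: fence locally, skip the firmware call. */
		sketch_local_fences++;
		return 0;
	}

	sketch_ecalls++;	/* models one call covering all online CPUs */
	return 0;
}
```

One thing such a variant would have to settle is who performs the local fence.
In the patch, flush_icache_all() already calls local_flush_icache_all() before
the check, whereas flush_tlb_all() relies on the single-CPU branch to do the
local flush, so a shared helper like the one sketched here would fence twice in
the first case unless the callers were adjusted.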

Patch

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index d76fc73e594b..f5be1fec8191 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -21,7 +21,9 @@  void flush_icache_all(void)
 {
 	local_flush_icache_all();
 
-	if (riscv_use_sbi_for_rfence())
+	if (num_online_cpus() < 2)
+		return;
+	else if (riscv_use_sbi_for_rfence())
 		sbi_remote_fence_i(NULL);
 	else
 		on_each_cpu(ipi_remote_fence_i, NULL, 1);
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index da821315d43e..0901aa47b58f 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -79,7 +79,9 @@  static void __ipi_flush_tlb_all(void *info)
 
 void flush_tlb_all(void)
 {
-	if (riscv_use_sbi_for_rfence())
+	if (num_online_cpus() < 2)
+		local_flush_tlb_all();
+	else if (riscv_use_sbi_for_rfence())
 		sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
 	else
 		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);