Message ID | 20240624082141.153871-1-alexghiti@rivosinc.com (mailing list archive) |
---|---|
State | Accepted |
Commit | edf2d546bfd6f5c4d143715cef1b1e7ce5718c4e |
Headers | show |
Series | [-fixes] riscv: patch: Flush the icache right after patching to avoid illegal insns | expand |
Hi Alex, On Mon, Jun 24, 2024 at 4:21 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote: > > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. > > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > Reported-by: Conor Dooley <conor.dooley@microchip.com> > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> > --- > arch/riscv/kernel/ftrace.c | 7 ++----- > arch/riscv/kernel/patch.c | 26 ++++++++++++++++++-------- > 2 files changed, 20 insertions(+), 13 deletions(-) > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > index 87cbd86576b2..4b95c574fd04 100644 > --- a/arch/riscv/kernel/ftrace.c > +++ b/arch/riscv/kernel/ftrace.c > @@ -120,9 +120,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > mutex_unlock(&text_mutex); > > - if (!mod) > - local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); > - > return out; > } > > @@ -156,9 +153,9 @@ static int __ftrace_modify_code(void *data) > } else { > while (atomic_read(¶m->cpu_count) <= num_online_cpus()) > cpu_relax(); > - } > > - local_flush_icache_all(); > + local_flush_icache_all(); > + } Sorry, maybe I should point it out directly earlier. But I don't think this diff chunk is required. Threads running in the else clause from stop_machine must not run into any patchable entry. If it runs into a patchable entry, running the local fence.i is not going to fix the problem. > > return 0; > } > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c > index 4007563fb607..ab03732d06c4 100644 > --- a/arch/riscv/kernel/patch.c > +++ b/arch/riscv/kernel/patch.c > @@ -89,6 +89,14 @@ static int __patch_insn_set(void *addr, u8 c, size_t len) > > memset(waddr, c, len); > > + /* > + * We could have just patched a function that is about to be > + * called so make sure we don't execute partially patched > + * instructions by flushing the icache as soon as possible. > + */ > + local_flush_icache_range((unsigned long)waddr, > + (unsigned long)waddr + len); > + > patch_unmap(FIX_TEXT_POKE0); > > if (across_pages) > @@ -135,6 +143,14 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len) > > ret = copy_to_kernel_nofault(waddr, insn, len); > > + /* > + * We could have just patched a function that is about to be > + * called so make sure we don't execute partially patched > + * instructions by flushing the icache as soon as possible. > + */ > + local_flush_icache_range((unsigned long)waddr, > + (unsigned long)waddr + len); > + > patch_unmap(FIX_TEXT_POKE0); > > if (across_pages) > @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) > > ret = patch_insn_set(tp, c, len); > > - if (!ret) > - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_set_nosync); > @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) > > ret = patch_insn_write(tp, insns, len); > > - if (!ret) > - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_nosync); > @@ -253,9 +263,9 @@ static int patch_text_cb(void *data) > } else { > while (atomic_read(&patch->cpu_count) <= num_online_cpus()) > cpu_relax(); > - } > > - local_flush_icache_all(); > + local_flush_icache_all(); > + } > > return ret; > } > -- > 2.39.2 > Thanks, Andy
On Mon, Jun 24, 2024 at 7:49 PM Andy Chiu <andy.chiu@sifive.com> wrote: > > Hi Alex, > > On Mon, Jun 24, 2024 at 4:21 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote: > > > > We cannot delay the icache flush after patching some functions as we may > > have patched a function that will get called before the icache flush. > > > > The only way to completely avoid such scenario is by flushing the icache > > as soon as we patch a function. This will probably be costly as we don't > > batch the icache maintenance anymore. > > > > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > > Reported-by: Conor Dooley <conor.dooley@microchip.com> > > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> > > --- > > arch/riscv/kernel/ftrace.c | 7 ++----- > > arch/riscv/kernel/patch.c | 26 ++++++++++++++++++-------- > > 2 files changed, 20 insertions(+), 13 deletions(-) > > > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > > index 87cbd86576b2..4b95c574fd04 100644 > > --- a/arch/riscv/kernel/ftrace.c > > +++ b/arch/riscv/kernel/ftrace.c > > @@ -120,9 +120,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > > out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > > mutex_unlock(&text_mutex); > > > > - if (!mod) > > - local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); > > - > > return out; > > } > > > > @@ -156,9 +153,9 @@ static int __ftrace_modify_code(void *data) > > } else { > > while (atomic_read(¶m->cpu_count) <= num_online_cpus()) > > cpu_relax(); > > - } > > > > - local_flush_icache_all(); > > + local_flush_icache_all(); > > + } > > Sorry, maybe I should point it out directly earlier. But I don't think > this diff chunk is required. Threads running in the else clause from > stop_machine must not run into any patchable entry. If it runs into a > patchable entry, running the local fence.i is not going to fix the > problem. Sorry, I read the code wrong. The local flush is not in the while loop. I have no problem with this chunk after properly seeing them. > > > > > return 0; > > } > > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c > > index 4007563fb607..ab03732d06c4 100644 > > --- a/arch/riscv/kernel/patch.c > > +++ b/arch/riscv/kernel/patch.c > > @@ -89,6 +89,14 @@ static int __patch_insn_set(void *addr, u8 c, size_t len) > > > > memset(waddr, c, len); > > > > + /* > > + * We could have just patched a function that is about to be > > + * called so make sure we don't execute partially patched > > + * instructions by flushing the icache as soon as possible. > > + */ > > + local_flush_icache_range((unsigned long)waddr, > > + (unsigned long)waddr + len); > > + > > patch_unmap(FIX_TEXT_POKE0); > > > > if (across_pages) > > @@ -135,6 +143,14 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len) > > > > ret = copy_to_kernel_nofault(waddr, insn, len); > > > > + /* > > + * We could have just patched a function that is about to be > > + * called so make sure we don't execute partially patched > > + * instructions by flushing the icache as soon as possible. > > + */ > > + local_flush_icache_range((unsigned long)waddr, > > + (unsigned long)waddr + len); > > + > > patch_unmap(FIX_TEXT_POKE0); > > > > if (across_pages) > > @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) > > > > ret = patch_insn_set(tp, c, len); > > > > - if (!ret) > > - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); > > - > > return ret; > > } > > NOKPROBE_SYMBOL(patch_text_set_nosync); > > @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) > > > > ret = patch_insn_write(tp, insns, len); > > > > - if (!ret) > > - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); > > - > > return ret; > > } > > NOKPROBE_SYMBOL(patch_text_nosync); > > @@ -253,9 +263,9 @@ static int patch_text_cb(void *data) > > } else { > > while (atomic_read(&patch->cpu_count) <= num_online_cpus()) > > cpu_relax(); > > - } > > > > - local_flush_icache_all(); > > + local_flush_icache_all(); > > + } > > > > return ret; > > } > > -- > > 2.39.2 > > > > Thanks, > Andy
On Mon, Jun 24, 2024 at 10:21:41AM +0200, Alexandre Ghiti wrote: > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. > > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > Reported-by: Conor Dooley <conor.dooley@microchip.com> > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> So, I can't even test this right now :/ The issue is annoying enough to reproduce that same config + compiler + commit isn't enough.
On Mon, 24 Jun 2024 01:21:41 PDT (-0700), alexghiti@rivosinc.com wrote: > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. Ya, it's going to be pretty miserable for performance. We'd talked about using objtool for the static rewriting a few weeks ago in the patchwork meeting, but with the dynamic rewriting suffering from similar issues it seems best to just pick this one up. We can always sort out the performance isuses later, at least this is correct. > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > Reported-by: Conor Dooley <conor.dooley@microchip.com> > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> > --- > arch/riscv/kernel/ftrace.c | 7 ++----- > arch/riscv/kernel/patch.c | 26 ++++++++++++++++++-------- > 2 files changed, 20 insertions(+), 13 deletions(-) > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > index 87cbd86576b2..4b95c574fd04 100644 > --- a/arch/riscv/kernel/ftrace.c > +++ b/arch/riscv/kernel/ftrace.c > @@ -120,9 +120,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > mutex_unlock(&text_mutex); > > - if (!mod) > - local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); > - > return out; > } > > @@ -156,9 +153,9 @@ static int __ftrace_modify_code(void *data) > } else { > while (atomic_read(¶m->cpu_count) <= num_online_cpus()) > cpu_relax(); > - } > > - local_flush_icache_all(); > + local_flush_icache_all(); > + } > > return 0; > } > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c > index 4007563fb607..ab03732d06c4 100644 > --- a/arch/riscv/kernel/patch.c > +++ b/arch/riscv/kernel/patch.c > @@ -89,6 +89,14 @@ static int __patch_insn_set(void *addr, u8 c, size_t len) > > memset(waddr, c, len); > > + /* > + * We could have just patched a function that is about to be > + * called so make sure we don't execute partially patched > + * instructions by flushing the icache as soon as possible. > + */ > + local_flush_icache_range((unsigned long)waddr, > + (unsigned long)waddr + len); > + > patch_unmap(FIX_TEXT_POKE0); > > if (across_pages) > @@ -135,6 +143,14 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len) > > ret = copy_to_kernel_nofault(waddr, insn, len); > > + /* > + * We could have just patched a function that is about to be > + * called so make sure we don't execute partially patched > + * instructions by flushing the icache as soon as possible. > + */ > + local_flush_icache_range((unsigned long)waddr, > + (unsigned long)waddr + len); > + > patch_unmap(FIX_TEXT_POKE0); > > if (across_pages) > @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) > > ret = patch_insn_set(tp, c, len); > > - if (!ret) > - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_set_nosync); > @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) > > ret = patch_insn_write(tp, insns, len); > > - if (!ret) > - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_nosync); > @@ -253,9 +263,9 @@ static int patch_text_cb(void *data) > } else { > while (atomic_read(&patch->cpu_count) <= num_online_cpus()) > cpu_relax(); > - } > > - local_flush_icache_all(); > + local_flush_icache_all(); > + } > > return ret; > }
Hello: This patch was applied to riscv/linux.git (fixes) by Palmer Dabbelt <palmer@rivosinc.com>: On Mon, 24 Jun 2024 10:21:41 +0200 you wrote: > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. > > [...] Here is the summary with links: - [-fixes] riscv: patch: Flush the icache right after patching to avoid illegal insns https://git.kernel.org/riscv/c/edf2d546bfd6 You are awesome, thank you!
Hi Alexandre, On Mon, Jun 24, 2024 at 10:23 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote: > We cannot delay the icache flush after patching some functions as we may > have patched a function that will get called before the icache flush. > > The only way to completely avoid such scenario is by flushing the icache > as soon as we patch a function. This will probably be costly as we don't > batch the icache maintenance anymore. > > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > Reported-by: Conor Dooley <conor.dooley@microchip.com> > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks for your patch, which is now commit edf2d546bfd6f5c4 ("riscv: patch: Flush the icache right after patching to avoid illegal insns") in v6.10-rc6. > --- a/arch/riscv/kernel/patch.c > +++ b/arch/riscv/kernel/patch.c > @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) > > ret = patch_insn_set(tp, c, len); > > - if (!ret) > - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_set_nosync); patch_text_set_nosync() is now identical to (static) patch_insn_set(), and the latter has no other callers. > @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) > > ret = patch_insn_write(tp, insns, len); > > - if (!ret) > - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); > - > return ret; > } > NOKPROBE_SYMBOL(patch_text_nosync); patch_text_nosync() is now identical to patch_insn_write(), and both functions are called from outside this file. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Geert, On Tue, Jul 9, 2024 at 10:13 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > Hi Alexandre, > > On Mon, Jun 24, 2024 at 10:23 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote: > > We cannot delay the icache flush after patching some functions as we may > > have patched a function that will get called before the icache flush. > > > > The only way to completely avoid such scenario is by flushing the icache > > as soon as we patch a function. This will probably be costly as we don't > > batch the icache maintenance anymore. > > > > Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") > > Reported-by: Conor Dooley <conor.dooley@microchip.com> > > Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ > > Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> > > Thanks for your patch, which is now commit edf2d546bfd6f5c4 ("riscv: > patch: Flush the icache right after patching to avoid illegal > insns") in v6.10-rc6. > > > --- a/arch/riscv/kernel/patch.c > > +++ b/arch/riscv/kernel/patch.c > > @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) > > > > ret = patch_insn_set(tp, c, len); > > > > - if (!ret) > > - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); > > - > > return ret; > > } > > NOKPROBE_SYMBOL(patch_text_set_nosync); > > patch_text_set_nosync() is now identical to (static) patch_insn_set(), > and the latter has no other callers. > > > @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) > > > > ret = patch_insn_write(tp, insns, len); > > > > - if (!ret) > > - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); > > - > > return ret; > > } > > NOKPROBE_SYMBOL(patch_text_nosync); > > patch_text_nosync() is now identical to patch_insn_write(), and both > functions are called from outside this file. > I'll do something about that, thanks. Alex > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like that. > -- Linus Torvalds
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 87cbd86576b2..4b95c574fd04 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -120,9 +120,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); mutex_unlock(&text_mutex); - if (!mod) - local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); - return out; } @@ -156,9 +153,9 @@ static int __ftrace_modify_code(void *data) } else { while (atomic_read(¶m->cpu_count) <= num_online_cpus()) cpu_relax(); - } - local_flush_icache_all(); + local_flush_icache_all(); + } return 0; } diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c index 4007563fb607..ab03732d06c4 100644 --- a/arch/riscv/kernel/patch.c +++ b/arch/riscv/kernel/patch.c @@ -89,6 +89,14 @@ static int __patch_insn_set(void *addr, u8 c, size_t len) memset(waddr, c, len); + /* + * We could have just patched a function that is about to be + * called so make sure we don't execute partially patched + * instructions by flushing the icache as soon as possible. + */ + local_flush_icache_range((unsigned long)waddr, + (unsigned long)waddr + len); + patch_unmap(FIX_TEXT_POKE0); if (across_pages) @@ -135,6 +143,14 @@ static int __patch_insn_write(void *addr, const void *insn, size_t len) ret = copy_to_kernel_nofault(waddr, insn, len); + /* + * We could have just patched a function that is about to be + * called so make sure we don't execute partially patched + * instructions by flushing the icache as soon as possible. + */ + local_flush_icache_range((unsigned long)waddr, + (unsigned long)waddr + len); + patch_unmap(FIX_TEXT_POKE0); if (across_pages) @@ -189,9 +205,6 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) ret = patch_insn_set(tp, c, len); - if (!ret) - flush_icache_range((uintptr_t)tp, (uintptr_t)tp + len); - return ret; } NOKPROBE_SYMBOL(patch_text_set_nosync); @@ -224,9 +237,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len) ret = patch_insn_write(tp, insns, len); - if (!ret) - flush_icache_range((uintptr_t) tp, (uintptr_t) tp + len); - return ret; } NOKPROBE_SYMBOL(patch_text_nosync); @@ -253,9 +263,9 @@ static int patch_text_cb(void *data) } else { while (atomic_read(&patch->cpu_count) <= num_online_cpus()) cpu_relax(); - } - local_flush_icache_all(); + local_flush_icache_all(); + } return ret; }
We cannot delay the icache flush after patching some functions as we may have patched a function that will get called before the icache flush. The only way to completely avoid such scenario is by flushing the icache as soon as we patch a function. This will probably be costly as we don't batch the icache maintenance anymore. Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") Reported-by: Conor Dooley <conor.dooley@microchip.com> Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> --- arch/riscv/kernel/ftrace.c | 7 ++----- arch/riscv/kernel/patch.c | 26 ++++++++++++++++++-------- 2 files changed, 20 insertions(+), 13 deletions(-)