diff mbox

arm: Fix text patching via fixmap with virtually tagged D-caches

Message ID 20170316133609.22679-1-tixy@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Jon Medhurst (Tixy) March 16, 2017, 1:36 p.m. UTC
When __patch_text_real changes an instruction via a fixmap on systems
with a virtually tagged cache, there may still be a stale entry in the
data cache for the real instruction address. Fix this by also flushing
the cache at that address.

One consequence of this issue is that if a kprobe is added then removed,
the D-cache may still hold the breakpoint instruction from when the
probe was active. In that situation, when re-inserting the kprobe, the
kernel thinks the instruction being probed is a breakpoint instruction
and will reject the attempt. This shows up with test failures when
enabling CONFIG_ARM_KPROBES_TEST on a device with a Marvel Kirkwood SoC
and also enabling CONFIG_STRICT_KERNEL_RWX which triggers the use of
fixmaps.

Fixes: ab0615e2d6fb ("arm: use fixmap for text patching when text is RO")

Signed-off-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/patch.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Russell King (Oracle) March 17, 2017, 12:04 p.m. UTC | #1
On Thu, Mar 16, 2017 at 01:36:09PM +0000, Jon Medhurst wrote:
> When __patch_text_real changes an instruction via a fixmap on systems
> with a virtually tagged cache, there may still be a stale entry in the
> data cache for the real instruction address. Fix this by also flushing
> the cache at that address.

The flush_icache_range() function cleans the data cache, and invalidates
the instruction cache so that the new instruction is visible to the
instruction path, but leaves the data visible in the data cache.

> One consequence of this issue is that if a kprobe is added then removed,
> the D-cache may still hold the breakpoint instruction from when the
> probe was active. In that situation, when re-inserting the kprobe, the
> kernel thinks the instruction being probed is a breakpoint instruction
> and will reject the attempt. This shows up with test failures when
> enabling CONFIG_ARM_KPROBES_TEST on a device with a Marvel Kirkwood SoC
> and also enabling CONFIG_STRICT_KERNEL_RWX which triggers the use of
> fixmaps.

flush_icache_range() assumes that we write through the same alias that
the instruction will be executed from.  Since the strict memory
permissions, and the modifications that this has caused, this simply is
no longer true.

I wonder whether a better solution would be to change flush_icache_range()
to flush the data cache for the region instead of merely cleaning it.

The only performance regression I can think would be that module load
would end up flushing out all the data cache lines for the module rather
than just cleaning them, but loading a module is not a fast path so it
probably doesn't matter.
Jon Medhurst (Tixy) March 17, 2017, 2 p.m. UTC | #2
On Fri, 2017-03-17 at 12:04 +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 16, 2017 at 01:36:09PM +0000, Jon Medhurst wrote:
> > When __patch_text_real changes an instruction via a fixmap on systems
> > with a virtually tagged cache, there may still be a stale entry in the
> > data cache for the real instruction address. Fix this by also flushing
> > the cache at that address.
> 
> The flush_icache_range() function cleans the data cache, and invalidates
> the instruction cache so that the new instruction is visible to the
> instruction path, but leaves the data visible in the data cache.
> 
> > One consequence of this issue is that if a kprobe is added then removed,
> > the D-cache may still hold the breakpoint instruction from when the
> > probe was active. In that situation, when re-inserting the kprobe, the
> > kernel thinks the instruction being probed is a breakpoint instruction
> > and will reject the attempt. This shows up with test failures when
> > enabling CONFIG_ARM_KPROBES_TEST on a device with a Marvel Kirkwood SoC
> > and also enabling CONFIG_STRICT_KERNEL_RWX which triggers the use of
> > fixmaps.
> 
> flush_icache_range() assumes that we write through the same alias that
> the instruction will be executed from.  Since the strict memory
> permissions, and the modifications that this has caused, this simply is
> no longer true.
> 
> I wonder whether a better solution would be to change flush_icache_range()
> to flush the data cache for the region instead of merely cleaning it.
> 
> The only performance regression I can think would be that module load
> would end up flushing out all the data cache lines for the module rather
> than just cleaning them, but loading a module is not a fast path so it
> probably doesn't matter.

You'd think that it would be a generally true rule that
loading/modifying kernel code isn't on any kind of performance critical
path. Though some uses of flush_icache_range include the BPF JIT
compiler, and in fncpy() which seems to have a big role in suspend. I
don't know fast they are expected to be, or even if they get much use on
the old CPU's which have virtually indexed d-caches.

Speaking of which, would the practical implementation of getting
flush_icache_range to invalidate the d-cache involve fixing up all the
*_coherent_kern_range functions for the different cache implementations?
That seems to me a little fraught with regression risks given that it
wouldn't be possible to test them all. Also, another consideration is
that quite a few implementations seem to share the same code between
_coherent_user_range and _coherent_kern_range. (I'm a little fuzzy on
how these are used, but would coherent_user_range be used when loading
userside binaries? In which case, the performance regression might be
noticable.

BTW, in looking at uses of flush_icache_range, I spotted at least one
more (set_fiq_handler) that seems to write code to a different virtual
address than it's run from.
diff mbox

Patch

diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 020560b2dcb7..c3c64bc2f50d 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -101,6 +101,7 @@  void __kprobes __patch_text_real(void *addr, unsigned int insn, bool remap)
 	if (waddr != addr) {
 		flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
 		patch_unmap(FIX_TEXT_POKE0, &flags);
+		flush_kernel_vmap_range(addr, size);
 	} else
 		__release(&patch_lock);