Message ID | 20230210100006.1161696-1-ardb@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC] arm64: Move HYP text out of kernel mapping |
On Fri, 10 Feb 2023 10:00:06 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
>
> The HYP text region contains the code that the hypervisor runs when
> running KVM at EL2. This code is never called by the kernel running at
> EL1, regardless of whether it booted at EL2 or whether it runs KVM in
> VHE mode or not.
>
> This means that this code has no need to be mapped with executable
> permissions in the kernel's address space, and should therefore be
> moved out of it. That way, any gadgets that may exist in this code are
> no longer exploitable at the kernel's exception level (speculative or
> otherwise).

I *really* like this, as it also means that we can simply free this
code when running VHE or when EL2 isn't available at all (in a guest).

>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Quentin Perret <qperret@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/vmlinux.lds.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> This change currently results in the following warnings*:
>
> (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x16c/0x570
> (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___deactivate_traps+0x40/0x144
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x380/0x570
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_hyp_panic+0x54/0xf8
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid_ipa+0xc0/0x1b8
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid+0x84/0x150
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_flush_cpu_context+0x84/0x150
> (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_handle_trap+0x80/0x128
>
> The warnings are due to the fact that the jump label code refuses to
> patch sections that are not kernel text.
>
> So the questions are:
> a) Mark pointed out off-list that he has been getting rid of static keys
>    in favor of alternatives in the arch code, as those are guaranteed to
>    be patched only once. Should we try to get rid of these as well?

The question is whether we can use these alternatives at such a late
point in the boot process. Today, we are done with the alternatives as
soon as all the early CPUs are up.

> b) These look like they are set only once and never turned off again.
>    The pKVM one is definitely only set at boot time, but I couldn't
>    figure out whether the same applies to the PMU one?

Yes, the PMU is in the same bag. As soon as we have found an
architectural PMU *and* that the driver has been registered, we're
good. But we cannot just rely on the CPU ID regs as the perf backend
could fail to register.

In both cases, this would be very late patching. Mark?

Thanks,

	M.
On Fri, Feb 10, 2023 at 11:56:01AM +0000, Marc Zyngier wrote:
> On Fri, 10 Feb 2023 10:00:06 +0000,
> Ard Biesheuvel <ardb@kernel.org> wrote:
> > So the questions are:
> > a) Mark pointed out off-list that he has been getting rid of static keys
> >    in favor of alternatives in the arch code, as those are guaranteed to
> >    be patched only once. Should we try to get rid of these as well?
>
> The question is whether we can use these alternatives at such a late
> point in the boot process. Today, we are done with the alternatives as
> soon as all the early CPUs are up.

My thinking is that anything pKVM relies upon must be settled around that
time (and certainly before any late secondaries are onlined), so we should
be able to pull the few remaining bits and pieces a little earlier.

> > b) These look like they are set only once and never turned off again.
> >    The pKVM one is definitely only set at boot time, but I couldn't
> >    figure out whether the same applies to the PMU one?
>
> Yes, the PMU is in the same bag. As soon as we have found an
> architectural PMU *and* that the driver has been registered, we're
> good.

As above, I was hoping we could somehow pull that before patching.

> But we cannot just rely on the CPU ID regs as the perf backend
> could fail to register.

I thought pKVM just cared about homogeneity here, and was hiding the PMU
state from the host, so does it matter what the host does, and if the host
fails to register a perf backend?

It doesn't seem right that pKVM would rely upon the host to manage the PMU
given pKVM cannot trust the host...

Thanks,
Mark.
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 1a43df27a20461ca..f42c070c3b4530c6 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -180,7 +180,6 @@ SECTIONS
 			CPUIDLE_TEXT
 			LOCK_TEXT
 			KPROBES_TEXT
-			HYPERVISOR_TEXT
 			*(.gnu.warning)
 		. = ALIGN(16);
 		*(.got)			/* Global offset table */
@@ -208,6 +207,7 @@ SECTIONS
 		HIBERNATE_TEXT
 		KEXEC_TEXT
 		IDMAP_TEXT
+		HYPERVISOR_TEXT
 		. = ALIGN(PAGE_SIZE);
 	}
The HYP text region contains the code that the hypervisor runs when
running KVM at EL2. This code is never called by the kernel running at
EL1, regardless of whether it booted at EL2 or whether it runs KVM in
VHE mode or not.

This means that this code has no need to be mapped with executable
permissions in the kernel's address space, and should therefore be
moved out of it. That way, any gadgets that may exist in this code are
no longer exploitable at the kernel's exception level (speculative or
otherwise).

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <qperret@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/vmlinux.lds.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This change currently results in the following warnings*:

(kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x16c/0x570
(kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___deactivate_traps+0x40/0x144
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x380/0x570
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_hyp_panic+0x54/0xf8
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid_ipa+0xc0/0x1b8
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid+0x84/0x150
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_flush_cpu_context+0x84/0x150
(kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_handle_trap+0x80/0x128

The warnings are due to the fact that the jump label code refuses to
patch sections that are not kernel text.
So the questions are:
a) Mark pointed out off-list that he has been getting rid of static keys
   in favor of alternatives in the arch code, as those are guaranteed to
   be patched only once. Should we try to get rid of these as well?
b) These look like they are set only once and never turned off again.
   The pKVM one is definitely only set at boot time, but I couldn't
   figure out whether the same applies to the PMU one?
c) for Peter: could we relax this check (kernel/jump_label.c:446) to
   permit jump labels in .rodata as well?

(* after changing the WARN_ONCE() to pr_warn() and tweaking the output)
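
For readers following question (c), the check being discussed can be
modelled in a few lines of userspace C. This is only a sketch under
stated assumptions, not the code in kernel/jump_label.c: the region
bounds, the struct, and the function names can_patch_strict() and
can_patch_relaxed() are all hypothetical and exist purely for
illustration. The strict variant mirrors the behaviour described above
(refuse to patch anything outside kernel text and emit the warning); the
relaxed variant shows what additionally permitting the read-only region
that now holds the HYP text might look like.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical half-open address range [start, end). */
struct region {
	uintptr_t start;
	uintptr_t end;
};

/* Made-up bounds standing in for linker-defined section symbols. */
static const struct region kernel_text = { 0x1000, 0x5000 };
static const struct region rodata_text = { 0x8000, 0x9000 };

static bool in_region(const struct region *r, uintptr_t addr)
{
	return addr >= r->start && addr < r->end;
}

/* Model of the current behaviour: only kernel text may be patched. */
static bool can_patch_strict(uintptr_t addr)
{
	if (!in_region(&kernel_text, addr)) {
		fprintf(stderr, "can't patch jump_label at %#lx\n",
			(unsigned long)addr);
		return false;
	}
	return true;
}

/* Model of the relaxation in question (c): also accept .rodata text. */
static bool can_patch_relaxed(uintptr_t addr)
{
	return in_region(&kernel_text, addr) ||
	       in_region(&rodata_text, addr);
}
```

With these made-up bounds, an address such as 0x8000 (inside the
hypothetical read-only region) is rejected by can_patch_strict() and
accepted by can_patch_relaxed(), which is the difference the question
is about.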