[00/10] KVM: arm64: Add support for hypervisor kCFI

Message ID cover.1710446682.git.ptosi@google.com (mailing list archive)

Message

Pierre-Clément Tosi March 14, 2024, 8:23 p.m. UTC
CONFIG_CFI_CLANG ("kernel Control Flow Integrity") makes the compiler inject
runtime type checks before any indirect function call. On AArch64, it generates
a BRK instruction to be executed on type mismatch and encodes the indices of the
registers holding the branch target and expected type in the immediate of the
instruction. As a result, a synchronous exception gets triggered on kCFI failure
and the fault handler can retrieve the immediate (and indices) from ESR_ELx.

This feature has been supported at EL1 ("host") since it was introduced by
b26e484b8bb3 ("arm64: Add CFI error handling"), where cfi_handler() decodes
ESR_EL1, giving informative panic messages such as

  [   21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
  (target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
  [   21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP

However, support at EL2 is either missing or only partial: in nVHE (or pKVM),
CONFIG_CFI_CLANG gets filtered out at build time, preventing the compiler from
injecting the checks. In VHE (or hVHE), EL2 code gets compiled with the checks
but the handlers in VBAR_EL2 are not aware of kCFI and will produce a generic
and not-so-helpful panic message such as

  [   36.456088][  T200] Kernel panic - not syncing: HYP panic:
  [   36.456088][  T200] PS:204003c9 PC:ffffffc080092310 ESR:f2008228
  [   36.456088][  T200] FAR:0000000081a50000 HPFAR:000000000081a500 PAR:1de7ec7edbadc0de
  [   36.456088][  T200] VCPU:00000000e189c7cf
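
As an illustration that is not part of this series, the ESR value in the panic
above can be decoded by hand. The sketch below assumes the ESR_ELx field
positions (EC in bits [31:26], BRK immediate in ISS[15:0]) and the kCFI BRK
immediate layout (base 0x8000, target register index in bits [4:0], type
register index in bits [9:5]). The macro, struct, and function names are
made up for this example and are not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ESR_ELx layout: EC in bits [31:26], BRK immediate in ISS[15:0]. */
#define EX_ESR_EC_SHIFT      26
#define EX_ESR_EC_MASK       0x3fu
#define EX_ESR_EC_BRK64      0x3cu    /* BRK executed in AArch64 state */
#define EX_ESR_BRK_IMM_MASK  0xffffu

/* Assumed kCFI BRK immediate layout (hypothetical names). */
#define EX_CFI_IMM_BASE      0x8000u
#define EX_CFI_IMM_TARGET    0x001fu  /* bits [4:0]: register holding the target */
#define EX_CFI_IMM_TYPE      0x03e0u  /* bits [9:5]: register holding the type */
#define EX_CFI_IMM_MASK      (EX_CFI_IMM_TARGET | EX_CFI_IMM_TYPE)

struct ex_kcfi_regs {
	unsigned int target;  /* index n of the Xn register holding the branch target */
	unsigned int type;    /* index n of the Wn register holding the expected type */
};

/*
 * Return true and fill *out if esr describes a kCFI BRK, false otherwise.
 * Only the register indices are recovered here; the handler still has to
 * read the saved registers to get the actual target address and type.
 */
bool ex_kcfi_decode_esr(uint64_t esr, struct ex_kcfi_regs *out)
{
	uint32_t ec = (esr >> EX_ESR_EC_SHIFT) & EX_ESR_EC_MASK;
	uint32_t imm = esr & EX_ESR_BRK_IMM_MASK;

	if (ec != EX_ESR_EC_BRK64)
		return false;

	/* Every bit outside the two index fields must match the kCFI base. */
	if ((imm & ~EX_CFI_IMM_MASK) != EX_CFI_IMM_BASE)
		return false;

	out->target = imm & EX_CFI_IMM_TARGET;
	out->type = (imm & EX_CFI_IMM_TYPE) >> 5;
	return true;
}
```

Under these assumptions, ESR:f2008228 has EC 0x3c and immediate 0x8228, so the
target register index is 8 and the type register index is 17. The generic HYP
panic message carries that information but does not spell it out.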

To address this,

- [01/10] fixes an existing bug where ELR_EL2 was getting clobbered on
  synchronous exceptions, causing the wrong "PC" to be reported by
  nvhe_hyp_panic_handler() or __hyp_call_panic(). This is particularly limiting
  for kCFI, as it would mask the location of the failed type check.
- [02/10] & [03/10] (resp.) fix and improve __pkvm_init_switch_pgd for kCFI
- [04/10] to [06/10] prepare nVHE for CONFIG_CFI_CLANG and [09/10] enables it
- [10/10] improves kCFI error messages by saving and then parsing the CPU context

As a result, an informative kCFI panic message is printed by or on behalf of EL2
giving the expected type and target address (possibly resolved to a symbol) for
VHE/hVHE, nVHE, and pKVM (iff CONFIG_NVHE_EL2_DEBUG=y).

Note that kCFI errors remain fatal at EL2, even when CONFIG_CFI_PERMISSIVE=y.

Pierre-Clément Tosi (10):
  KVM: arm64: Fix clobbered ELR in sync abort
  KVM: arm64: Fix __pkvm_init_switch_pgd C signature
  KVM: arm64: Pass pointer to __pkvm_init_switch_pgd
  KVM: arm64: nVHE: Simplify __guest_exit_panic path
  KVM: arm64: nVHE: Add EL2 sync exception handler
  KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
  KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
  arm64: Move esr_comment() to <asm/esr.h>
  KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
  KVM: arm64: Improve CONFIG_CFI_CLANG error message

 arch/arm64/include/asm/esr.h            | 11 +++++++
 arch/arm64/include/asm/kvm_hyp.h        |  4 +--
 arch/arm64/kernel/asm-offsets.c         |  1 +
 arch/arm64/kernel/debug-monitors.c      |  4 +--
 arch/arm64/kernel/traps.c               |  2 --
 arch/arm64/kvm/handle_exit.c            | 39 +++++++++++++++++++++-
 arch/arm64/kvm/hyp/entry.S              | 43 ++++++++++++++++++++++---
 arch/arm64/kvm/hyp/hyp-entry.S          |  4 +--
 arch/arm64/kvm/hyp/include/hyp/switch.h |  6 ++--
 arch/arm64/kvm/hyp/nvhe/Makefile        |  6 ++--
 arch/arm64/kvm/hyp/nvhe/gen-hyprel.c    |  6 ++++
 arch/arm64/kvm/hyp/nvhe/host.S          | 19 ++++++-----
 arch/arm64/kvm/hyp/nvhe/hyp-init.S      | 11 ++++---
 arch/arm64/kvm/hyp/nvhe/setup.c         |  6 ++--
 arch/arm64/kvm/hyp/vhe/switch.c         | 27 ++++++++++++++--
 15 files changed, 149 insertions(+), 40 deletions(-)

Comments

Marc Zyngier March 14, 2024, 10:40 p.m. UTC | #1
Hi Pierre-Clément,

On Thu, 14 Mar 2024 20:23:00 +0000,
Pierre-Clément Tosi <ptosi@google.com> wrote:
> 
> CONFIG_CFI_CLANG ("kernel Control Flow Integrity") makes the compiler inject
> runtime type checks before any indirect function call. On AArch64, it generates
> a BRK instruction to be executed on type mismatch and encodes the indices of the
> registers holding the branch target and expected type in the immediate of the
> instruction. As a result, a synchronous exception gets triggered on kCFI failure
> and the fault handler can retrieve the immediate (and indices) from ESR_ELx.
> 
> This feature has been supported at EL1 ("host") since it was introduced by
> b26e484b8bb3 ("arm64: Add CFI error handling"), where cfi_handler() decodes
> ESR_EL1, giving informative panic messages such as
> 
>   [   21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
>   (target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
>   [   21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP
> 
> However, it is not or only partially supported at EL2: in nVHE (or pKVM),
> CONFIG_CFI_CLANG gets filtered out at build time, preventing the compiler from
> injecting the checks. In VHE (or hVHE), EL2 code gets compiled with the checks

Are you sure about hVHE? hVHE is essentially the nVHE object running
with a slightly different HCR_EL2 configuration. So if you don't have
the checks in the nVHE code, you don't have them for hVHE either.

Or am I missing something obvious?

Thanks,

	M.

Pierre-Clément Tosi March 15, 2024, 10:22 a.m. UTC | #2
Hi Marc,

On Thu, Mar 14, 2024 at 10:40:47PM +0000, Marc Zyngier wrote:
> Hi Pierre-Clément,
> 
> On Thu, 14 Mar 2024 20:23:00 +0000,
> Pierre-Clément Tosi <ptosi@google.com> wrote:
> > 
> > CONFIG_CFI_CLANG ("kernel Control Flow Integrity") makes the compiler inject
> > runtime type checks before any indirect function call. On AArch64, it generates
> > a BRK instruction to be executed on type mismatch and encodes the indices of the
> > registers holding the branch target and expected type in the immediate of the
> > instruction. As a result, a synchronous exception gets triggered on kCFI failure
> > and the fault handler can retrieve the immediate (and indices) from ESR_ELx.
> > 
> > This feature has been supported at EL1 ("host") since it was introduced by
> > b26e484b8bb3 ("arm64: Add CFI error handling"), where cfi_handler() decodes
> > ESR_EL1, giving informative panic messages such as
> > 
> >   [   21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
> >   (target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
> >   [   21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP
> > 
> > However, it is not or only partially supported at EL2: in nVHE (or pKVM),
> > CONFIG_CFI_CLANG gets filtered out at build time, preventing the compiler from
> > injecting the checks. In VHE (or hVHE), EL2 code gets compiled with the checks
> 
> Are you sure about hVHE? hVHE is essentially the nVHE object running
> with a slightly different HCR_EL2 configuration. So if you don't have
> the checks in the nVHE code, you don't have them for hVHE either.

No, I am not, and my assumption that hVHE was running the VHE hyp code was wrong.
FYI, these patches were tested in VHE, nVHE, and pKVM (with NVHE_EL2_DEBUG set
and unset) but not in hVHE (clearly!). Thanks for pointing this out.

> 
> Or am I missing something obvious?
> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.