From patchwork Wed May 29 14:56:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13679067 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 64FFBC27C43 for ; Wed, 29 May 2024 14:57:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=EStl3kRJTPv/yTT9JQwzYP3Rcshbug/PYdA4GQYpVw0=; b=dQWWqsGKDTF47T sChStaiodnGu5x8zX+ROKJ4X+fm38I5hGYUBMFYV5bIoP3e9jctFTxsjm8dXVA/pf87tTB4ZunIRZ 5J8e3h1qbMXzdlQMJaZlRUd11xVoiMf4dpkWxHezZrjDFzZnNveGawQ5jJ7/QSTv4/qUL09xF6NKv MUouYvuZH9xqnukwMNmq5YNLZSTgQv5Mnz7WCE4P+CexXdnSc12L4W8saLJ3VZvN31jmOFkMh0BHQ dFsk9zRVi/29LNjQEagYFvbPsclmCWZXk1Xfov0/C/p/uj9RHdR5Np5TFMjEm511BPWFO2CDPrVcG sCkcPP3tnOUUG5/5n1LA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sCKjV-00000004ZoS-2gdr; Wed, 29 May 2024 14:56:53 +0000 Received: from dfw.source.kernel.org ([2604:1380:4641:c500::1]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sCKjD-00000004Zan-1vbc for linux-arm-kernel@lists.infradead.org; Wed, 29 May 2024 14:56:36 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id EF55061217; Wed, 29 May 2024 14:56:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 864F1C2BD10; Wed, 29 May 2024 14:56:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1716994594; bh=d6gYfXREzLx9AC4i/x7IJLh2xN8ioZDrcyAlf13uvAA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=I1nejoL5g1fUFP+mt7ebayHqAVddPT0chfguMCnBQurxYVcnMLQj5J+1MkVn/nfvX tzS+E0a7GF1+q+9g9UpnIo6pOI0dk98m+DKkvDCuoVnbhXOgS2wNY71pSCaocIo00a 2ytjVol8tiR+vdFPXd6QXCbBnfjVPxS4p12CK/USXx6UL9eKI9scijc1UOAK9AAnKi iwA4wv7u1zJlf+TBhsz0jRcxbMIOuWbYAm6dNLkxcRkdE4F1JFKPr/telGC6P8Nzs3 8V+bQV811rSij/JhJ9PKQZtKqex1VfZAUphg/RGmXISiXjHVqd1LFTOO8NzXdr9xU9 zuw9hr+AXpF/A== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1sCKjA-00GekF-Bu; Wed, 29 May 2024 15:56:32 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu , Joey Gouly , Alexandru Elisei , Christoffer Dall , Ganapatrao Kulkarni Subject: [PATCH v2 02/16] KVM: arm64: nv: Implement nested Stage-2 page table walk logic Date: Wed, 29 May 2024 15:56:14 +0100 Message-Id: <20240529145628.3272630-3-maz@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240529145628.3272630-1-maz@kernel.org> References: <20240529145628.3272630-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com, joey.gouly@arm.com, alexandru.elisei@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240529_075635_728743_E07FC9DD X-CRM114-Status: GOOD ( 25.01 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Christoffer Dall Based on the pseudo-code in the ARM ARM, implement a stage 2 software page table walker. Co-developed-by: Jintack Lim Signed-off-by: Jintack Lim Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/esr.h | 1 + arch/arm64/include/asm/kvm_nested.h | 13 ++ arch/arm64/kvm/nested.c | 265 ++++++++++++++++++++++++++++ 3 files changed, 279 insertions(+) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index 7abf09df7033..15a4be765cad 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -152,6 +152,7 @@ #define ESR_ELx_Xs_MASK (GENMASK_ULL(4, 0)) /* ISS field definitions for exceptions taken in to Hyp */ +#define ESR_ELx_FSC_ADDRSZ (0x00) #define ESR_ELx_CV (UL(1) << 24) #define ESR_ELx_COND_SHIFT (20) #define ESR_ELx_COND_MASK (UL(0xF) << ESR_ELx_COND_SHIFT) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index a69faee31342..5404b7b843cf 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -68,6 +68,19 @@ extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); +struct kvm_s2_trans { + phys_addr_t output; + unsigned long block_size; + bool writable; + bool readable; + int level; + u32 esr; + u64 upper_attr; +}; + +extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, + struct kvm_s2_trans *result); + int kvm_init_nv_sysregs(struct kvm *kvm); #ifdef CONFIG_ARM64_PTR_AUTH diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 0a6b894b6390..627470fbe0f4 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -91,6 +91,271 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) return 0; } +struct s2_walk_info { + int (*read_desc)(phys_addr_t pa, u64 *desc, void *data); + void *data; + u64 baddr; + unsigned int max_pa_bits; + unsigned int max_ipa_bits; + unsigned int pgshift; + unsigned int pgsize; + unsigned int ps; + unsigned int sl; + unsigned int t0sz; + bool be; +}; + +static unsigned int ps_to_output_size(unsigned int ps) +{ + switch (ps) { + case 0: return 32; + case 1: return 36; + case 2: return 40; + case 3: return 42; + case 4: return 44; + case 5: + default: + return 48; + } +} + +static u32 compute_fsc(int level, u32 fsc) +{ + return fsc | (level & 0x3); +} + +static int check_base_s2_limits(struct s2_walk_info *wi, + int level, int input_size, int stride) +{ + int start_size; + + /* Check translation limits */ + switch (wi->pgsize) { + case SZ_64K: + if (level == 0 || (level == 1 && wi->max_ipa_bits <= 42)) + return -EFAULT; + break; + case SZ_16K: + if (level == 0 || (level == 1 && wi->max_ipa_bits <= 40)) + return -EFAULT; + break; + case SZ_4K: + if (level < 0 || (level == 0 && wi->max_ipa_bits <= 42)) + return -EFAULT; + break; + } + + /* Check input size limits */ + if (input_size > wi->max_ipa_bits) + return -EFAULT; + + /* Check number of entries in starting level table */ + start_size = input_size - ((3 - level) * stride + wi->pgshift); + if (start_size < 1 || start_size > stride + 4) + return -EFAULT; + + return 0; +} + +/* Check if output is within boundaries */ +static int check_output_size(struct s2_walk_info *wi, phys_addr_t output) +{ + unsigned int output_size = ps_to_output_size(wi->ps); + + if (output_size > wi->max_pa_bits) + output_size = wi->max_pa_bits; + + if (output_size != 48 && (output & GENMASK_ULL(47, output_size))) + return -1; + + return 0; +} + +/* + * This is essentially a C-version of the pseudo code from the ARM ARM + * AArch64.TranslationTableWalk function. I strongly recommend looking at + * that pseudocode in trying to understand this. + * + * Must be called with the kvm->srcu read lock held + */ +static int walk_nested_s2_pgd(phys_addr_t ipa, + struct s2_walk_info *wi, struct kvm_s2_trans *out) +{ + int first_block_level, level, stride, input_size, base_lower_bound; + phys_addr_t base_addr; + unsigned int addr_top, addr_bottom; + u64 desc; /* page table entry */ + int ret; + phys_addr_t paddr; + + switch (wi->pgsize) { + default: + case SZ_64K: + case SZ_16K: + level = 3 - wi->sl; + first_block_level = 2; + break; + case SZ_4K: + level = 2 - wi->sl; + first_block_level = 1; + break; + } + + stride = wi->pgshift - 3; + input_size = 64 - wi->t0sz; + if (input_size > 48 || input_size < 25) + return -EFAULT; + + ret = check_base_s2_limits(wi, level, input_size, stride); + if (WARN_ON(ret)) + return ret; + + base_lower_bound = 3 + input_size - ((3 - level) * stride + + wi->pgshift); + base_addr = wi->baddr & GENMASK_ULL(47, base_lower_bound); + + if (check_output_size(wi, base_addr)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + return 1; + } + + addr_top = input_size - 1; + + while (1) { + phys_addr_t index; + + addr_bottom = (3 - level) * stride + wi->pgshift; + index = (ipa & GENMASK_ULL(addr_top, addr_bottom)) + >> (addr_bottom - 3); + + paddr = base_addr | index; + ret = wi->read_desc(paddr, &desc, wi->data); + if (ret < 0) + return ret; + + /* + * Handle reversedescriptors if endianness differs between the + * host and the guest hypervisor. + */ + if (wi->be) + desc = be64_to_cpu((__force __be64)desc); + else + desc = le64_to_cpu((__force __le64)desc); + + /* Check for valid descriptor at this point */ + if (!(desc & 1) || ((desc & 3) == 1 && level == 3)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_FAULT); + out->upper_attr = desc; + return 1; + } + + /* We're at the final level or block translation level */ + if ((desc & 3) == 1 || level == 3) + break; + + if (check_output_size(wi, desc)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + out->upper_attr = desc; + return 1; + } + + base_addr = desc & GENMASK_ULL(47, wi->pgshift); + + level += 1; + addr_top = addr_bottom - 1; + } + + if (level < first_block_level) { + out->esr = compute_fsc(level, ESR_ELx_FSC_FAULT); + out->upper_attr = desc; + return 1; + } + + /* + * We don't use the contiguous bit in the stage-2 ptes, so skip check + * for misprogramming of the contiguous bit. + */ + + if (check_output_size(wi, desc)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + out->upper_attr = desc; + return 1; + } + + if (!(desc & BIT(10))) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ACCESS); + out->upper_attr = desc; + return 1; + } + + /* Calculate and return the result */ + paddr = (desc & GENMASK_ULL(47, addr_bottom)) | + (ipa & GENMASK_ULL(addr_bottom - 1, 0)); + out->output = paddr; + out->block_size = 1UL << ((3 - level) * stride + wi->pgshift); + out->readable = desc & (0b01 << 6); + out->writable = desc & (0b10 << 6); + out->level = level; + out->upper_attr = desc & GENMASK_ULL(63, 52); + return 0; +} + +static int read_guest_s2_desc(phys_addr_t pa, u64 *desc, void *data) +{ + struct kvm_vcpu *vcpu = data; + + return kvm_read_guest(vcpu->kvm, pa, desc, sizeof(*desc)); +} + +static void vtcr_to_walk_info(u64 vtcr, struct s2_walk_info *wi) +{ + wi->t0sz = vtcr & TCR_EL2_T0SZ_MASK; + + switch (vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + wi->pgshift = 12; break; + case VTCR_EL2_TG0_16K: + wi->pgshift = 14; break; + case VTCR_EL2_TG0_64K: + default: /* IMPDEF: treat any other value as 64k */ + wi->pgshift = 16; break; + } + + wi->pgsize = BIT(wi->pgshift); + wi->ps = FIELD_GET(VTCR_EL2_PS_MASK, vtcr); + wi->sl = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); + wi->max_ipa_bits = VTCR_EL2_IPA(vtcr); + /* Global limit for now, should eventually be per-VM */ + wi->max_pa_bits = get_kvm_ipa_limit(); +} + +int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, + struct kvm_s2_trans *result) +{ + u64 vtcr = vcpu_read_sys_reg(vcpu, VTCR_EL2); + struct s2_walk_info wi; + int ret; + + result->esr = 0; + + if (!vcpu_has_nv(vcpu)) + return 0; + + wi.read_desc = read_guest_s2_desc; + wi.data = vcpu; + wi.baddr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + + vtcr_to_walk_info(vtcr, &wi); + + wi.be = vcpu_read_sys_reg(vcpu, SCTLR_EL2) & SCTLR_ELx_EE; + + ret = walk_nested_s2_pgd(gipa, &wi, result); + if (ret) + result->esr |= (kvm_vcpu_get_esr(vcpu) & ~ESR_ELx_FSC); + + return ret; +} + struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu) { struct kvm *kvm = vcpu->kvm;