From patchwork Sat Feb 15 15:01:27 2025
From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Cc: Joey Gouly <joey.gouly@arm.com>,
 Suzuki K Poulose <suzuki.poulose@arm.com>,
 Oliver Upton <oliver.upton@linux.dev>,
 Zenghui Yu <yuzenghui@huawei.com>,
 Eric Auger <eric.auger@redhat.com>
Subject: [PATCH 07/14] KVM: arm64: nv: Add pseudo-TLB backing VNCR_EL2
Date: Sat, 15 Feb 2025 15:01:27 +0000
Message-Id: <20250215150134.3765791-8-maz@kernel.org>
In-Reply-To: <20250215150134.3765791-1-maz@kernel.org>
References: <20250215150134.3765791-1-maz@kernel.org>

FEAT_NV2 introduces an interesting problem for NV, as VNCR_EL2.BADDR
is a virtual address in the EL2&0 (or EL2, but we thankfully ignore
this) translation regime.

As we need to replicate such a mapping in the real EL2, it means that
we need to remember that there is such a translation, and that any
TLBI affecting EL2 can possibly affect this translation. It also
means that any invalidation driven by an MMU notifier must be able to
shoot down any such mapping.

All in all, we need a data structure that represents this mapping,
and that is extremely close to a TLB. Given that we can only use one
of those per vcpu at any given time, we only allocate one. No effort
is made to keep that structure small. If we need to start caching
multiple of them, we may want to revisit that design point. But for
now, it is kept simple so that we can reason about it.

Oh, and add a braindump of how things are supposed to work, because I
will definitely page this out at some point. Yes, pun intended.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h   |  5 ++
 arch/arm64/include/asm/kvm_nested.h |  3 ++
 arch/arm64/kvm/arm.c                |  6 +++
 arch/arm64/kvm/nested.c             | 72 +++++++++++++++++++++++++++++
 arch/arm64/kvm/reset.c              |  1 +
 5 files changed, 87 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 519023dad3b47..dd287ccaffdb7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -706,6 +706,8 @@ struct vcpu_reset_state {
 	bool		reset;
 };
 
+struct vncr_tlb;
+
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
 
@@ -800,6 +802,9 @@ struct kvm_vcpu_arch {
 
 	/* Per-vcpu CCSIDR override or NULL */
 	u32 *ccsidr;
+
+	/* Per-vcpu TLB for VNCR_EL2 -- NULL when !NV */
+	struct vncr_tlb	*vncr_tlb;
 };
 
 /*
diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
index cc1302cb7929f..6a168ae95aef4 100644
--- a/arch/arm64/include/asm/kvm_nested.h
+++ b/arch/arm64/include/asm/kvm_nested.h
@@ -332,4 +332,7 @@ struct s1_walk_result {
 int __kvm_translate_va(struct kvm_vcpu *vcpu, struct s1_walk_info *wi,
 		       struct s1_walk_result *wr, u64 va);
 
+/* VNCR management */
+int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu);
+
 #endif /* __ARM64_KVM_NESTED_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 071a7d75be689..274883bf4dd4e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -815,6 +815,12 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 	if (ret)
 		return ret;
 
+	if (vcpu_has_nv(vcpu)) {
+		ret = kvm_vcpu_allocate_vncr_tlb(vcpu);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * This needs to happen after any restriction has been applied
 	 * to the feature set.
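
[As an illustration of the fault-handling flow described in the commit
message -- a minimal sketch only, not code from this series. Only
struct vncr_tlb and __kvm_translate_va() below come from this patch;
the function name, the sysreg accessor usage, the BADDR masking and
the locking are assumptions:]

/*
 * Sketch: plausible shape of a VNCR_EL2 translation-fault handler
 * built on the pseudo-TLB added by this patch. Hypothetical helper.
 */
static int handle_vncr_abort_sketch(struct kvm_vcpu *vcpu)
{
	struct vncr_tlb *vt = vcpu->arch.vncr_tlb;
	/* Guest VA of the virtual VNCR page; BADDR masking elided here */
	u64 va = vcpu_read_sys_reg(vcpu, VNCR_EL2);
	int ret;

	/* Walk the guest's EL2&0 stage-1 tables: GVA -> IPA */
	ret = __kvm_translate_va(vcpu, &vt->wi, &vt->wr, va);
	if (ret)
		return ret;	/* no mapping: the guest inherits the fault */

	vt->gva = va;
	/* Resolving IPA -> PA via the canonical stage-2 would fill vt->hpa */

	/* Per the struct documentation, flipping this requires the mmu_lock */
	vt->valid = true;

	/* No actual mapping yet: that is deferred to vcpu_load()/a request */
	return 0;
}
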
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 952a1558f5214..6ae5ec43ddeaa 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -16,6 +16,24 @@
 
 #include "sys_regs.h"
 
+struct vncr_tlb {
+	/* The guest's VNCR_EL2 */
+	u64			gva;
+	struct s1_walk_info	wi;
+	struct s1_walk_result	wr;
+
+	u64			hpa;
+
+	/* -1 when not mapped on a CPU */
+	int			cpu;
+
+	/*
+	 * true if the TLB is valid. Can only be changed with the
+	 * mmu_lock held.
+	 */
+	bool			valid;
+};
+
 /* Protection against the sysreg repainting madness... */
 #define NV_FTR(r, f)		ID_AA64##r##_EL1_##f
 
@@ -809,6 +827,60 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
 	kvm_uninit_stage2_mmu(kvm);
 }
 
+/*
+ * Dealing with VNCR_EL2 exposed by the *guest* is a complicated matter:
+ *
+ * - We introduce an internal representation of a vcpu-private TLB,
+ *   representing the mapping between the guest VA contained in VNCR_EL2,
+ *   the IPA the guest's EL2 PTs point to, and the actual PA this lives at.
+ *
+ * - On translation fault from a nested VNCR access, we create such a TLB.
+ *   If there is no mapping to describe, the guest inherits the fault.
+ *   Crucially, no actual mapping is done at this stage.
+ *
+ * - On vcpu_load() in a non-HYP context with HCR_EL2.NV==1, if the above
+ *   TLB exists, we map it in the fixmap for this CPU, and run with it. We
+ *   have to respect the permissions dictated by the guest, but not the
+ *   memory type (FWB is a must).
+ *
+ * - Note that we usually don't do a vcpu_load() on the back of a fault
+ *   (unless we are preempted), so the resolution of a translation fault
+ *   must go via a request that will map the VNCR page in the fixmap.
+ *   vcpu_load() might as well use the same mechanism.
+ *
+ * - On vcpu_put() in a non-HYP context with HCR_EL2.NV==1, if the TLB was
+ *   mapped, we unmap it. Yes it is that simple. The TLB still exists
+ *   though, and may be reused at a later load.
+ *
+ * - On permission fault, we simply forward the fault to the guest's EL2.
+ *   Get out of my way.
+ *
+ * - On any TLBI for the EL2&0 translation regime, we must find any TLB that
+ *   intersects with the TLBI request, invalidate it, and unmap the page
+ *   from the fixmap. Because we need to look at all the vcpu-private TLBs,
+ *   this requires some wide-ranging locking to ensure that nothing races
+ *   against it. This may require some refcounting to avoid the search when
+ *   no such TLB is present.
+ *
+ * - On MMU notifiers, we must invalidate our TLB in a similar way, but
+ *   looking at the IPA instead. The funny part is that there may not be a
+ *   stage-2 mapping for this page if L1 hasn't accessed it using LD/ST
+ *   instructions.
+ */
+
+int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
+{
+	if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
+		return 0;
+
+	vcpu->arch.vncr_tlb = kzalloc(sizeof(*vcpu->arch.vncr_tlb),
+				      GFP_KERNEL_ACCOUNT);
+	if (!vcpu->arch.vncr_tlb)
+		return -ENOMEM;
+
+	return 0;
+}
+
 /*
  * Our emulated CPU doesn't support all the possible features. For the
  * sake of simplicity (and probably mental sanity), wipe out a number
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3c48527aef360..0d95b512eec12 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -159,6 +159,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
 	kfree(sve_state);
 	free_page((unsigned long)vcpu->arch.ctxt.vncr_array);
+	kfree(vcpu->arch.vncr_tlb);
 	kfree(vcpu->arch.ccsidr);
 }
 
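
[To illustrate the TLBI bullet in the design comment above -- again a
sketch under stated assumptions, not code from this series. The helper
name and the overlap test are made up; struct vncr_tlb and its fields
come from this patch, and the real series additionally tears down the
per-CPU fixmap mapping and handles IPA-based MMU-notifier
invalidation:]

/*
 * Sketch: shoot down any vcpu-private VNCR TLB whose cached GVA falls
 * in an EL2&0 TLBI range. Hypothetical helper.
 */
static void invalidate_vncr_va_range_sketch(struct kvm *kvm, u64 start, u64 end)
{
	struct kvm_vcpu *vcpu;
	unsigned long i;

	/* vt->valid may only change with the mmu_lock held */
	lockdep_assert_held_write(&kvm->mmu_lock);

	kvm_for_each_vcpu(i, vcpu, kvm) {
		struct vncr_tlb *vt = vcpu->arch.vncr_tlb;

		if (!vt || !vt->valid)
			continue;

		/* Page-granule overlap check against the cached GVA */
		if (vt->gva >= end || vt->gva + PAGE_SIZE <= start)
			continue;

		vt->valid = false;
		/* Fixmap unmap on the CPU recorded in vt->cpu not shown */
	}
}
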