From patchwork Thu Apr 3 11:25:23 2025
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 14037363
From: Radim Krčmář
To: kvm-riscv@lists.infradead.org
Cc: kvm@vger.kernel.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, Anup Patel, Atish Patra, Paul Walmsley,
 Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Andrew Jones,
 Mayuresh Chitale
Subject: [PATCH 4/5] KVM: RISC-V: reset VCPU state when becoming runnable
Date: Thu, 3 Apr 2025 13:25:23 +0200
Message-ID: <20250403112522.1566629-7-rkrcmar@ventanamicro.com>
In-Reply-To: <20250403112522.1566629-3-rkrcmar@ventanamicro.com>
References: <20250403112522.1566629-3-rkrcmar@ventanamicro.com>

Beware, this patch "breaks" the userspace interface, because it fixes
a KVM/QEMU bug where the boot VCPU is not reset by KVM.

The VCPU reset paths are inconsistent right now.  KVM resets VCPUs that
are brought up by KVM-accelerated SBI calls, but does nothing for VCPUs
brought up through ioctls.

We need to perform a KVM reset even when the VCPU is started through an
ioctl.  This patch is one of the ways we can achieve it.

Assume that userspace has no business setting the post-reset state.
KVM is the de facto SBI implementation, as the SBI HSM acceleration
cannot be disabled and userspace cannot control the reset state, so KVM
should be in full control of the post-reset state.

Do not reset the pc and a1 registers, because an SBI reset is expected
to provide them and KVM has no idea what these registers should be --
only userspace knows where it put the data.

An important consideration is resume.  Userspace might want to start
with a non-reset state.  Check ran_atleast_once to allow this, because
KVM-SBI HSM creates some VCPUs as STOPPED.

The drawback is that userspace can still start the boot VCPU with an
incorrect reset state, because there is no way on the KVM side to
distinguish a freshly reset new VCPU (userspace might set some values
by mistake) from a restored VCPU (userspace must set all values).
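[Editor's note: for context, a minimal, hypothetical sketch of the
userspace ioctl path this patch hooks -- not code from the patch.  It
assumes an already created VCPU fd and omits error handling; the
KVM_SET_MP_STATE ioctl and struct kvm_mp_state are the standard KVM
userspace API.]

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* A QEMU-like VMM making a stopped VCPU runnable purely through
 * ioctls.  Before this patch, KVM performed no reset on this path;
 * only the SBI HSM hart_start path did. */
static int make_vcpu_runnable(int vcpu_fd)
{
	struct kvm_mp_state mp_state = {
		.mp_state = KVM_MP_STATE_RUNNABLE,
	};

	/* Userspace still programs pc and a1 (e.g. via KVM_SET_ONE_REG)
	 * itself; KVM deliberately leaves those two registers alone. */
	return ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp_state);
}

With this patch, such a RUNNABLE transition on a stopped VCPU that has
already run once also triggers the KVM-side reset, matching the SBI HSM
behavior.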
The advantage of this solution is that it fixes current QEMU and makes
some sense under the assumption that KVM implements SBI HSM.  I do not
like it too much, so I'd be in favor of a different solution if we can
still afford to drop support for current userspaces.

For a cleaner solution, we should add interfaces to perform the KVM-SBI
reset request on userspace demand.  I also think it would be much
better if userspace were in control of the post-reset state.

Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com>
---
 arch/riscv/include/asm/kvm_host.h     |  1 +
 arch/riscv/include/asm/kvm_vcpu_sbi.h |  3 +++
 arch/riscv/kvm/vcpu.c                 |  9 +++++++++
 arch/riscv/kvm/vcpu_sbi.c             | 21 +++++++++++++++++++--
 4 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 0c8c9c05af91..9bbf8c4a286b 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -195,6 +195,7 @@ struct kvm_vcpu_smstateen_csr {
 
 struct kvm_vcpu_reset_state {
 	spinlock_t lock;
+	bool active;
 	unsigned long pc;
 	unsigned long a1;
 };
diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index aaaa81355276..2c334a87e02a 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -57,6 +57,9 @@ void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
 				     u32 type, u64 flags);
 void kvm_riscv_vcpu_sbi_request_reset(struct kvm_vcpu *vcpu,
 				      unsigned long pc, unsigned long a1);
+void __kvm_riscv_vcpu_set_reset_state(struct kvm_vcpu *vcpu,
+				      unsigned long pc, unsigned long a1);
+void kvm_riscv_vcpu_sbi_request_reset_from_userspace(struct kvm_vcpu *vcpu);
 int kvm_riscv_vcpu_sbi_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_riscv_vcpu_set_reg_sbi_ext(struct kvm_vcpu *vcpu,
 				   const struct kvm_one_reg *reg);
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index b8485c1c1ce4..4578863a39e3 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -58,6 +58,11 @@ static void kvm_riscv_vcpu_context_reset(struct kvm_vcpu *vcpu)
 	struct kvm_vcpu_reset_state *reset_state = &vcpu->arch.reset_state;
 	void *vector_datap = cntx->vector.datap;
 
+	spin_lock(&reset_state->lock);
+	if (!reset_state->active)
+		__kvm_riscv_vcpu_set_reset_state(vcpu, cntx->sepc, cntx->a1);
+	spin_unlock(&reset_state->lock);
+
 	memset(cntx, 0, sizeof(*cntx));
 	memset(csr, 0, sizeof(*csr));
@@ -520,6 +525,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 
 	switch (mp_state->mp_state) {
 	case KVM_MP_STATE_RUNNABLE:
+		if (riscv_vcpu_supports_sbi_ext(vcpu, KVM_RISCV_SBI_EXT_HSM) &&
+		    vcpu->arch.ran_atleast_once &&
+		    kvm_riscv_vcpu_stopped(vcpu))
+			kvm_riscv_vcpu_sbi_request_reset_from_userspace(vcpu);
 		WRITE_ONCE(vcpu->arch.mp_state, *mp_state);
 		break;
 	case KVM_MP_STATE_STOPPED:
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index 3d7955e05cc3..77f9f0bd3842 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -156,12 +156,29 @@ void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
 	run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
+/* must be called with vcpu->arch.reset_state.lock held */
+void __kvm_riscv_vcpu_set_reset_state(struct kvm_vcpu *vcpu,
+				      unsigned long pc, unsigned long a1)
+{
+	vcpu->arch.reset_state.active = true;
+	vcpu->arch.reset_state.pc = pc;
+	vcpu->arch.reset_state.a1 = a1;
+}
+
 void kvm_riscv_vcpu_sbi_request_reset(struct kvm_vcpu *vcpu,
 				      unsigned long pc, unsigned long a1)
 {
 	spin_lock(&vcpu->arch.reset_state.lock);
-	vcpu->arch.reset_state.pc = pc;
-	vcpu->arch.reset_state.a1 = a1;
+	__kvm_riscv_vcpu_set_reset_state(vcpu, pc, a1);
+	spin_unlock(&vcpu->arch.reset_state.lock);
+
+	kvm_make_request(KVM_REQ_VCPU_RESET, vcpu);
+}
+
+void kvm_riscv_vcpu_sbi_request_reset_from_userspace(struct kvm_vcpu *vcpu)
+{
+	spin_lock(&vcpu->arch.reset_state.lock);
+	vcpu->arch.reset_state.active = false;
 	spin_unlock(&vcpu->arch.reset_state.lock);
 
 	kvm_make_request(KVM_REQ_VCPU_RESET, vcpu);
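
[Editor's note: an informal summary of the reset_state.active protocol,
derived entirely from the diff above; the comment block below is
illustration, not code from the patch.]

/*
 * reset_state.active protocol (informal sketch of the diff above):
 *
 * - SBI HSM hart_start -> kvm_riscv_vcpu_sbi_request_reset():
 *   records the SBI-provided pc/a1, sets active = true, and raises
 *   KVM_REQ_VCPU_RESET, so the reset uses the SBI-provided values.
 *
 * - KVM_SET_MP_STATE(RUNNABLE) on a stopped, previously run VCPU ->
 *   kvm_riscv_vcpu_sbi_request_reset_from_userspace():
 *   sets active = false and raises KVM_REQ_VCPU_RESET; no pc/a1 are
 *   recorded because only userspace knows them.
 *
 * - kvm_riscv_vcpu_context_reset(): if no reset state is active, it
 *   latches the current sepc/a1 before wiping the context, so the
 *   userspace-programmed entry point and argument survive the reset.
 */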