From patchwork Fri May 3 18:17:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 13653268 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D9F51586CB for ; Fri, 3 May 2024 18:17:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760262; cv=none; b=kS13Qw3rvmEgqK3KZ6V42g3Unhw0NmThTm0hBZ2PGsDWo7bpQYunLeO/FJLSzL84s2E28rzoB5N7CG2kBg6Lr7dXepX5ye3o5YKy2vRekmVC9qGeM10sEllKmtt9NxPWeOz3gdb+zqp+gRrJCopIy2fv3yRWyTIGI8E8JLAyJww= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760262; c=relaxed/simple; bh=SbR2pYdcwP7d8d0EllRiVsBDfxgWnwUD2Dl29PJc+BQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=CMnF2Ye9iImejXe+SgceGutsNTopAWBQ4WwOZ3mqaB89LDzuv1mcJcU65nIBBHZ9+H5A7NIc3sdO0hbMkYzuZ5UmPdpyuSydNMOt+06xImJtzI/bCvUI66ha5XglvHqPIxSvtH/NSEg1oOfni7McRMP7s+JB/8fAq1pjMvK96P0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=qjB58JOo; arc=none smtp.client-ip=209.85.219.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="qjB58JOo" Received: by mail-yb1-f201.google.com with SMTP id 3f1490d57ef6-de603db5d6aso9890617276.2 for ; Fri, 03 May 2024 11:17:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1714760260; x=1715365060; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gxJS1crzpielwDie8rL3hx1PCStQKnU+yTalK3ZSiI0=; b=qjB58JOoAVKzWjGSuaNYeuxnwEdJHsc8ElO38SuAxBMgC/HO27kdewrhKh8tUMwLmC 0i1xnis++2592ebrLNPYM9MtKnjLDlLtqx/t2T1WkzkDvszoYzRigIwRj7aJW/wsq7IN kp2yoHFnOV6QnPYCcBJYPEqHY2X9j7pySLGafb7iR7nf9cmb2vOV1qIsLizxb/g+I867 VSKZyZHhJMSATuSbJl960nrcRUKBd/0gexvhBWdSXILCm+hzDsHTZe9gRthvfIInaDRy r0zyKurR1jUyqDP+fLqaG98S4QtJgA9h+Lz0e1aFLzAWuv0noORI5i1XD1cs3y6KNdNC LqPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714760260; x=1715365060; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gxJS1crzpielwDie8rL3hx1PCStQKnU+yTalK3ZSiI0=; b=TQvP2ReRRz6gT5APyIpieQfcC7vO5974yk2JJ0D/jgRkzZFK1BY33IJ5MdSWaXvCu0 DSN+rCuTCZeGx9q38upmLpUuuvuMcPIUzCPBVw4FI9JANzNvBx7+GzQEVKQxsbqT1G1f +QdCq0UMErvkpMyfaIJILuTT5i7DQWhtCAjuNoubwtHV5FBSuDIjlDh1T9FU9Vmwibdv dAdHGUUZFemefbrBVAMPjPZo6qXa4dpXZoM5YQ+6PLRRZqH9MJHTMNEeisaiTmsw06bG 4l7Wf9We/o1HL0nVJF4hsnDSjXGsXFyEYVKfnAY3wGTVMuc0zfLoM2dpFW8ovorjNhAo gQ3g== X-Forwarded-Encrypted: i=1; AJvYcCVx997d0zPOYUaDZJEGg0106RVdKxFEBoUMZqOuvqGA1pXVAnEczOVJ6ulDKh0NsrJGe6ioyR3NQc1u6LYjoCJZp+qmznsKzueLEA== X-Gm-Message-State: AOJu0YyvsfoAnV5t377qCMMOqRj4p/Ma7gbb2qrq+5SyBXAJYWVbAm25 qkU5eNYLvsqwEgARCLgAr2osVHduc6mr/YObO1SsPTfn8KkIylcpHhsfKrPZ5Byw1SWN9qtN1AX l5P2n3vojyg== X-Google-Smtp-Source: AGHT+IFV8XW0q6Vc+/y+/ULmmUgE0pK+amRiYPmJG0lNXqT19q15J8nIrLHDkebqo9zBkflGqNFe7OG5YJeiaQ== X-Received: from dmatlack-n2d-128.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1309]) (user=dmatlack job=sendgmr) by 2002:a05:6902:1023:b0:dd9:1db5:8348 with SMTP id x3-20020a056902102300b00dd91db58348mr1087385ybt.8.1714760260354; Fri, 03 May 2024 11:17:40 -0700 (PDT) Date: Fri, 3 May 2024 11:17:32 -0700 In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com> Precedence: bulk X-Mailing-List: linux-mips@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240503181734.1467938-1-dmatlack@google.com> X-Mailer: git-send-email 2.45.0.rc1.225.g2a3ae87e7f-goog Message-ID: <20240503181734.1467938-2-dmatlack@google.com> Subject: [PATCH v3 1/3] KVM: Introduce vcpu->wants_to_run From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Oliver Upton , James Morse , Suzuki K Poulose , Zenghui Yu , Tianrui Zhao , Bibo Mao , Huacai Chen , Michael Ellerman , Nicholas Piggin , Anup Patel , Atish Patra , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , David Hildenbrand , Sean Christopherson , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, David Matlack Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit was not set. Replace all references to vcpu->run->immediate_exit with !vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a malicious userspace could invoked KVM_RUN with immediate_exit=true and then after KVM reads it to set wants_to_run=false, flip it to false. This would result in the vCPU running in KVM_RUN with wants_to_run=false. This wouldn't cause any real bugs today but is a dangerous landmine. Signed-off-by: David Matlack --- arch/arm64/kvm/arm.c | 2 +- arch/loongarch/kvm/vcpu.c | 2 +- arch/mips/kvm/mips.c | 2 +- arch/powerpc/kvm/powerpc.c | 2 +- arch/riscv/kvm/vcpu.c | 2 +- arch/s390/kvm/kvm-s390.c | 2 +- arch/x86/kvm/x86.c | 4 ++-- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 3 +++ 9 files changed, 12 insertions(+), 8 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index c4a0a35e02c7..c587e5d9396e 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -986,7 +986,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) vcpu_load(vcpu); - if (run->immediate_exit) { + if (!vcpu->wants_to_run) { ret = -EINTR; goto out; } diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c index 3a8779065f73..847ef54f3a84 100644 --- a/arch/loongarch/kvm/vcpu.c +++ b/arch/loongarch/kvm/vcpu.c @@ -1163,7 +1163,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) kvm_complete_iocsr_read(vcpu, run); } - if (run->immediate_exit) + if (!vcpu->wants_to_run) return r; /* Clear exit_reason */ diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 231ac052b506..f1a99962027a 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -436,7 +436,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) vcpu->mmio_needed = 0; } - if (vcpu->run->immediate_exit) + if (!vcpu->wants_to_run) goto out; lose_fpu(1); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index d32abe7fe6ab..961aadc71de2 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -1852,7 +1852,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) kvm_sigset_activate(vcpu); - if (run->immediate_exit) + if (!vcpu->wants_to_run) r = -EINTR; else r = kvmppc_vcpu_run(vcpu); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index b5ca9f2e98ac..3d8349470ee6 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -711,7 +711,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) return ret; } - if (run->immediate_exit) { + if (!vcpu->wants_to_run) { kvm_vcpu_srcu_read_unlock(vcpu); return -EINTR; } diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 5147b943a864..b1ea25aacbf9 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -5033,7 +5033,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) if (vcpu->kvm->arch.pv.dumping) return -EINVAL; - if (kvm_run->immediate_exit) + if (!vcpu->wants_to_run) return -EINTR; if (kvm_run->kvm_valid_regs & ~KVM_SYNC_S390_VALID_FIELDS || diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2d2619d3eee4..f70ae1558684 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11396,7 +11396,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) kvm_vcpu_srcu_read_lock(vcpu); if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) { - if (kvm_run->immediate_exit) { + if (!vcpu->wants_to_run) { r = -EINTR; goto out; } @@ -11474,7 +11474,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) WARN_ON_ONCE(vcpu->mmio_needed); } - if (kvm_run->immediate_exit) { + if (!vcpu->wants_to_run) { r = -EINTR; goto out; } diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index afbc99264ffa..f9b9ce0c3cd9 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -380,6 +380,7 @@ struct kvm_vcpu { bool dy_eligible; } spin_loop; #endif + bool wants_to_run; bool preempted; bool ready; struct kvm_vcpu_arch arch; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 38b498669ef9..bdea5b978f80 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4425,7 +4425,10 @@ static long kvm_vcpu_ioctl(struct file *filp, synchronize_rcu(); put_pid(oldpid); } + vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit); r = kvm_arch_vcpu_ioctl_run(vcpu); + vcpu->wants_to_run = false; + trace_kvm_userspace_exit(vcpu->run->exit_reason, r); break; } From patchwork Fri May 3 18:17:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 13653269 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E93BA158D67 for ; Fri, 3 May 2024 18:17:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760265; cv=none; b=Q0cuw8Y6eNECifktuHZn97CMPk3al3ZqBe3qer9qMuxRosY/myIRkrsX1C8zlA6bFXJG1z+Z7W576aZlVoKI+R5u5rLgL8insSBhqEV3QFKdmYF0E33mPMsmjtpbkoV/hPqvz14yJ625la3v/tdZPLj+xm2XiNZD/wFFEuSoO28= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760265; c=relaxed/simple; bh=mXEp7/JVq5ahA3gE3cHKv/B7qtsS8wcRgKa4iiw3GNs=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=lrEizUVC7zsEYMuMvDDEmrK6G16HQcdeLS/01lnIBPqHZBD1+0xb+LBEX3uuTkD5T5O5WMpsKHcb0DGR37/E+mueGAQr3NCGKxOfLs8XtYFwODvxPlC9/vIRMLg/u1ip8D+UP3Mhdl/NCNmDwJ+ODw1w3s5AzIMIgJHW1qNb2Q0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Xp7JuRF6; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Xp7JuRF6" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-de603db5d6aso9890696276.2 for ; Fri, 03 May 2024 11:17:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1714760262; x=1715365062; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=QHXu+3wMl1x8qNonydytLIV6nGk0zgXCPmBNIKdSDOU=; b=Xp7JuRF6KzK9RhwEzZqYKBxAr5JJ1yJQJCZxHhdf9FBLSCiDNoBj099QKXwT+ooPnr Bywfs1QmeQk6JiEoVBu48J8Susf50V5mmMXr1JlogZfrowZj/Xr21gsbz4k4bHDi2CJQ qzSG5fwIiDZj3ug4Ss98yonOIlB97aLBM5AqdqtLksyO6KQpfwrojEI54fIXjUJa0uue Rtfv+vydWfgpAjjNSL2CVEY4l0o1rUbeOt9f/xZRtiDTR/FcVTuCRYEXbzPzEiP7SoKM WRCJ0kTgGEgDBKU074MywegCaqRe81cehnIw1jHmL0oyVVqDN9Yc1aPAYVMoHXH3u7Ov IpUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714760262; x=1715365062; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=QHXu+3wMl1x8qNonydytLIV6nGk0zgXCPmBNIKdSDOU=; b=N4IU+Y/1ahaREpma0FDgzN17rShRMgctuLCaG1yHHi8vVSgfi94CZUzUx+qi1NAzSS uHGBSdWGJPniGBUcH3uj0uf214NsXechFFc0Nk+A1+JfzhxaqAaTjp66YZpCTW2/ZylC rkT5UiOCuTRY8PTQ5c5w1NU+1vVZQblo7JmfIfpEzvyn5EGReRyWA3HYsZPlUdel6qnc qGK6VexcPLcwhoJMeCm28bCzGwTrPmwtRV52uLYW2tB+66BTAuzcZ/t3lQKnPPENPMk4 a9gBRLD//qpb8NdhX3D20SPYjWUA+ZyHwVoof1j2a77gxi0PyFJs1HOPBz5uHMCbcW80 iCLA== X-Forwarded-Encrypted: i=1; AJvYcCUsYS9gXig8AGaK5OVdeCxNiUnxEdTvgUnnZAYURYueyJNDjqSqKNsphhRykI17v9ZeikQy8Ujs2UK7OptcVw+uV5EIVhkaZXRo/g== X-Gm-Message-State: AOJu0Yx6dCGCuK5Gv4hWclN4xHGxsCXT0WfS6QFzTEurXEyI7CvHhyFs QkrYDNDV1BnfirkftIzPSVgwyTnXFedN923LTllHv38YhhuFxWkd4K0+JfWHJdjIrC8MWu6vh1I Nb2l4nUpDxw== X-Google-Smtp-Source: AGHT+IH9qLn6o/uHe9WEM8VyAqFxWSMZ3MFnxC1YcYZ+IPtGArmXDrrlLFlgmBP/HenuKUWhjozjdfJl16Pvww== X-Received: from dmatlack-n2d-128.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1309]) (user=dmatlack job=sendgmr) by 2002:a25:26cc:0:b0:de4:e042:eee9 with SMTP id m195-20020a2526cc000000b00de4e042eee9mr1003169ybm.6.1714760262024; Fri, 03 May 2024 11:17:42 -0700 (PDT) Date: Fri, 3 May 2024 11:17:33 -0700 In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com> Precedence: bulk X-Mailing-List: linux-mips@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240503181734.1467938-1-dmatlack@google.com> X-Mailer: git-send-email 2.45.0.rc1.225.g2a3ae87e7f-goog Message-ID: <20240503181734.1467938-3-dmatlack@google.com> Subject: [PATCH v3 2/3] KVM: Ensure new code that references immediate_exit gets extra scrutiny From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Oliver Upton , James Morse , Suzuki K Poulose , Zenghui Yu , Tianrui Zhao , Bibo Mao , Huacai Chen , Michael Ellerman , Nicholas Piggin , Anup Patel , Atish Patra , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , David Hildenbrand , Sean Christopherson , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, David Matlack Ensure that any new KVM code that references immediate_exit gets extra scrutiny by renaming it to immediate_exit__unsafe in kernel code. All fields in struct kvm_run are subject to TOCTOU races since they are mapped into userspace, which may be malicious or buggy. To protect KVM, this commit introduces a new macro that appends __unsafe to field names in struct kvm_run, hinting to developers and reviewers that accessing this field must be done carefully. Apply the new macro to immediate_exit, since userspace can make immediate_exit inconsistent with vcpu->wants_to_run, i.e. accessing immediate_exit directly could lead to unexpected bugs in the future. Signed-off-by: David Matlack --- include/uapi/linux/kvm.h | 15 ++++++++++++++- virt/kvm/kvm_main.c | 2 +- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 2190adbe3002..3611ad3b9c2a 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -192,11 +192,24 @@ struct kvm_xen_exit { /* Flags that describe what fields in emulation_failure hold valid data. */ #define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0) +/* + * struct kvm_run can be modified by userspace at any time, so KVM must be + * careful to avoid TOCTOU bugs. In order to protect KVM, HINT_UNSAFE_IN_KVM() + * renames fields in struct kvm_run from to __unsafe when + * compiled into the kernel, ensuring that any use within KVM is obvious and + * gets extra scrutiny. + */ +#ifdef __KERNEL__ +#define HINT_UNSAFE_IN_KVM(_symbol) _symbol##__unsafe +#else +#define HINT_UNSAFE_IN_KVM(_symbol) _symbol +#endif + /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */ struct kvm_run { /* in */ __u8 request_interrupt_window; - __u8 immediate_exit; + __u8 HINT_UNSAFE_IN_KVM(immediate_exit); __u8 padding1[6]; /* out */ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index bdea5b978f80..2b29851a90bd 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4425,7 +4425,7 @@ static long kvm_vcpu_ioctl(struct file *filp, synchronize_rcu(); put_pid(oldpid); } - vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit); + vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit__unsafe); r = kvm_arch_vcpu_ioctl_run(vcpu); vcpu->wants_to_run = false; From patchwork Fri May 3 18:17:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 13653270 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85D81158D74 for ; Fri, 3 May 2024 18:17:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760265; cv=none; b=T8CUbhVR6c17Gou9+z4ABx3DoUamLhvBy+2p+u9R0YYRIeSjzZloL27Ujo4p7dEckHGxCzWmeawum84F+fLNDOqo/1QYqUnHnZ+EzZD99iWAHLcdgmpdCGTeNbH41E1R70CDhUo369fFUnbQMrJQH8AmXKdoIjNdwTV92ZuF/uM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760265; c=relaxed/simple; bh=MnH1Wy/qfGLoK8pqY0vb3YbaD+FyQMej5pfO9Kyf6Bw=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=p3gz+iSNxofBliVClqGBPieoy92297kjSqwBJaCvvG7MzVVjZCJCF/TNiBdts7MBHqCdlmvcmrfJwuMesqkIgF5f7uNNkT1pS7cW7c9tHKV+2ESznNRAcRHcu5VujUJWtvxh5LzTdF1jyE6ToqUWCTgQIWWSWjjeptpcGjPXyA8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=UTtRwsdT; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="UTtRwsdT" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-61df903af62so71850517b3.1 for ; Fri, 03 May 2024 11:17:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1714760263; x=1715365063; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=XCWz5AboezUKY+fKcSkwtkc56yFvTydQOL6+C1uV1TA=; b=UTtRwsdT8uk2thfIcXtPuU7pHscZJWBYeHv7wvyURQSARTlHnNS47ky1LbBwz0blCB 0maxojdoE78kFdCLmP+U9j89gf+UzWPkc1HPBMb5cZsLFLlkIAT1RiyfPgycKBIIYN/m bf/M8giqwWdJmM9kT+zGgGmUpiMQkg5nrXbpb845/aYrXvGvUJ01nrWIFnbndFtN5oJn bZ1/hGtwl7ol849efgECnOaJDPuU6QAZSf96V3SmmqmZ/fNbo4LgAKLRlk8xVZXwaQsc F/++iDVJsx3+s+OKWIedek4S7FLqsA+i925Kzi/RPg1WKJvg1yTRl7u5A4hS+ouNrgYu 1O8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714760263; x=1715365063; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=XCWz5AboezUKY+fKcSkwtkc56yFvTydQOL6+C1uV1TA=; b=Dc3oBzuF0vdzoSgvxno5xDvVE7XEKhlbsSZ+r93gN3hRoOMSEX59nU32o1T50GZTtU iDApxUXpXKj+sp+KWch6Mc+dY49KxTZZwZ/UYsEAvGG3SRtFyMX06PTA2FT6pCqVk/hA SYUllG9dnFqwYHj1Myx1X8pYW9jK1IlV8pGZzbshBFlKvQ1ID0SEeTSdOqmFMMkwWMSv oCYd6WcTqYzHIrQ7UtlrQdXb1DFU9NmrAv8k3/Y5aRDtbmVhO/bZT95TQ5GOFFX4cVQi NjTrB9nZmvmF+SQSBKvlncVUeu7yNxh2Owh+2zyYEpvEF65aFyJqvCUoA0VoycrAiMkM uw9A== X-Forwarded-Encrypted: i=1; AJvYcCW8/sRcILoxx40HJKJptIcefkm/4keFFDYzwe22PoGvHznMLTS7PvVdkS7mo7eUYDHbyDx+fgxMqtiml6ZnHH+Wzt6Bzjuf1O4ejw== X-Gm-Message-State: AOJu0Yzn/LDzGSqunNr0reZ82lN71KicD/kmnZ678HRwDAOgg+OKaOLT FuvR0qr/n7k1LgXscSN8Xuh8HkbWwBfAVHtipc4eET/qhQn9un3DxwNrV4tU6NDvw1ebbBkR2LH FkcNeMdWuPA== X-Google-Smtp-Source: AGHT+IG8z91B85kZw12g8ILejBDce9nCeI6uZrPNjYvj7vQXfcVKq57xpnYqIWHHCJ8gvpfpvg4aSbrDSFG7Mg== X-Received: from dmatlack-n2d-128.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1309]) (user=dmatlack job=sendgmr) by 2002:a0d:ea05:0:b0:61b:7912:6cad with SMTP id t5-20020a0dea05000000b0061b79126cadmr815080ywe.2.1714760263504; Fri, 03 May 2024 11:17:43 -0700 (PDT) Date: Fri, 3 May 2024 11:17:34 -0700 In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com> Precedence: bulk X-Mailing-List: linux-mips@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240503181734.1467938-1-dmatlack@google.com> X-Mailer: git-send-email 2.45.0.rc1.225.g2a3ae87e7f-goog Message-ID: <20240503181734.1467938-4-dmatlack@google.com> Subject: [PATCH v3 3/3] KVM: Mark a vCPU as preempted/ready iff it's scheduled out while running From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Oliver Upton , James Morse , Suzuki K Poulose , Zenghui Yu , Tianrui Zhao , Bibo Mao , Huacai Chen , Michael Ellerman , Nicholas Piggin , Anup Patel , Atish Patra , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , David Hildenbrand , Sean Christopherson , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, David Matlack Mark a vCPU as preempted/ready if-and-only-if it's scheduled out while running. i.e. Do not mark a vCPU preempted/ready if it's scheduled out during a non-KVM_RUN ioctl() or when userspace is doing KVM_RUN with immediate_exit. Commit 54aa83c90198 ("KVM: x86: do not set st->preempted when going back to user space") stopped marking a vCPU as preempted when returning to userspace, but if userspace then invokes a KVM vCPU ioctl() that gets preempted, the vCPU will be marked preempted/ready. This is arguably incorrect behavior since the vCPU was not actually preempted while the guest was running, it was preempted while doing something on behalf of userspace. This commit also avoids KVM dirtying guest memory after userspace has paused vCPUs, e.g. for Live Migration, which allows userspace to collect the final dirty bitmap before or in parallel with saving vCPU state without having to worry about saving vCPU state triggering writes to guest memory. Suggested-by: Sean Christopherson Signed-off-by: David Matlack --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2b29851a90bd..3973e62acc7c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -6302,7 +6302,7 @@ static void kvm_sched_out(struct preempt_notifier *pn, { struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn); - if (current->on_rq) { + if (current->on_rq && vcpu->wants_to_run) { WRITE_ONCE(vcpu->preempted, true); WRITE_ONCE(vcpu->ready, true); }