From patchwork Fri May 3 18:17:32 2024
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 13653276
Date: Fri, 3 May 2024 11:17:32 -0700
In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com>
References: <20240503181734.1467938-1-dmatlack@google.com>
Message-ID: <20240503181734.1467938-2-dmatlack@google.com>
Subject: [PATCH v3 1/3] KVM: Introduce vcpu->wants_to_run
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
 Zenghui Yu, Tianrui Zhao, Bibo Mao, Huacai Chen, Michael Ellerman,
 Nicholas Piggin, Anup Patel, Atish Patra, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 David Hildenbrand, Sean Christopherson,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 David Matlack

Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run
loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit
was not set.

Replace all references to vcpu->run->immediate_exit with
!vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a
malicious userspace could invoke KVM_RUN with immediate_exit=true and
then, after KVM reads it to set wants_to_run=false, flip it to false.
This would result in the vCPU running in KVM_RUN with wants_to_run=false.
This wouldn't cause any real bugs today but is a dangerous landmine.

Signed-off-by: David Matlack
---
 arch/arm64/kvm/arm.c       | 2 +-
 arch/loongarch/kvm/vcpu.c  | 2 +-
 arch/mips/kvm/mips.c       | 2 +-
 arch/powerpc/kvm/powerpc.c | 2 +-
 arch/riscv/kvm/vcpu.c      | 2 +-
 arch/s390/kvm/kvm-s390.c   | 2 +-
 arch/x86/kvm/x86.c         | 4 ++--
 include/linux/kvm_host.h   | 1 +
 virt/kvm/kvm_main.c        | 3 +++
 9 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c4a0a35e02c7..c587e5d9396e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -986,7 +986,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	vcpu_load(vcpu);
 
-	if (run->immediate_exit) {
+	if (!vcpu->wants_to_run) {
 		ret = -EINTR;
 		goto out;
 	}
diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c
index 3a8779065f73..847ef54f3a84 100644
--- a/arch/loongarch/kvm/vcpu.c
+++ b/arch/loongarch/kvm/vcpu.c
@@ -1163,7 +1163,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			kvm_complete_iocsr_read(vcpu, run);
 	}
 
-	if (run->immediate_exit)
+	if (!vcpu->wants_to_run)
 		return r;
 
 	/* Clear exit_reason */
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 231ac052b506..f1a99962027a 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -436,7 +436,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		vcpu->mmio_needed = 0;
 	}
 
-	if (vcpu->run->immediate_exit)
+	if (!vcpu->wants_to_run)
 		goto out;
 
 	lose_fpu(1);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index d32abe7fe6ab..961aadc71de2 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1852,7 +1852,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	kvm_sigset_activate(vcpu);
 
-	if (run->immediate_exit)
+	if (!vcpu->wants_to_run)
 		r = -EINTR;
 	else
 		r = kvmppc_vcpu_run(vcpu);
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index b5ca9f2e98ac..3d8349470ee6 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -711,7 +711,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		return ret;
 	}
 
-	if (run->immediate_exit) {
+	if (!vcpu->wants_to_run) {
 		kvm_vcpu_srcu_read_unlock(vcpu);
 		return -EINTR;
 	}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 5147b943a864..b1ea25aacbf9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5033,7 +5033,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	if (vcpu->kvm->arch.pv.dumping)
 		return -EINVAL;
 
-	if (kvm_run->immediate_exit)
+	if (!vcpu->wants_to_run)
 		return -EINTR;
 
 	if (kvm_run->kvm_valid_regs & ~KVM_SYNC_S390_VALID_FIELDS ||
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2d2619d3eee4..f70ae1558684 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11396,7 +11396,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	kvm_vcpu_srcu_read_lock(vcpu);
 	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
-		if (kvm_run->immediate_exit) {
+		if (!vcpu->wants_to_run) {
 			r = -EINTR;
 			goto out;
 		}
@@ -11474,7 +11474,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		WARN_ON_ONCE(vcpu->mmio_needed);
 	}
 
-	if (kvm_run->immediate_exit) {
+	if (!vcpu->wants_to_run) {
 		r = -EINTR;
 		goto out;
 	}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index afbc99264ffa..f9b9ce0c3cd9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -380,6 +380,7 @@ struct kvm_vcpu {
 		bool dy_eligible;
 	} spin_loop;
 #endif
+	bool wants_to_run;
 	bool preempted;
 	bool ready;
 	struct kvm_vcpu_arch arch;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38b498669ef9..bdea5b978f80 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4425,7 +4425,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			synchronize_rcu();
 			put_pid(oldpid);
 		}
+		vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit);
 		r = kvm_arch_vcpu_ioctl_run(vcpu);
+		vcpu->wants_to_run = false;
+
 		trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
 		break;
 	}

From patchwork Fri May 3 18:17:33 2024
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 13653277
Date: Fri, 3 May 2024 11:17:33 -0700
In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com>
References: <20240503181734.1467938-1-dmatlack@google.com>
Message-ID: <20240503181734.1467938-3-dmatlack@google.com>
Subject: [PATCH v3 2/3] KVM: Ensure new code that references immediate_exit
 gets extra scrutiny
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
 Zenghui Yu, Tianrui Zhao, Bibo Mao, Huacai Chen, Michael Ellerman,
 Nicholas Piggin, Anup Patel, Atish Patra, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 David Hildenbrand, Sean Christopherson,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 David Matlack

Ensure that any new KVM code that references immediate_exit gets extra
scrutiny by renaming it to immediate_exit__unsafe in kernel code.

All fields in struct kvm_run are subject to TOCTOU races since they are
mapped into userspace, which may be malicious or buggy. To protect KVM,
this commit introduces a new macro that appends __unsafe to field names
in struct kvm_run, hinting to developers and reviewers that accessing
this field must be done carefully.
Apply the new macro to immediate_exit, since userspace can make
immediate_exit inconsistent with vcpu->wants_to_run, i.e. accessing
immediate_exit directly could lead to unexpected bugs in the future.

Signed-off-by: David Matlack
---
 include/uapi/linux/kvm.h | 15 ++++++++++++++-
 virt/kvm/kvm_main.c      |  2 +-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..3611ad3b9c2a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -192,11 +192,24 @@ struct kvm_xen_exit {
 /* Flags that describe what fields in emulation_failure hold valid data. */
 #define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0)
 
+/*
+ * struct kvm_run can be modified by userspace at any time, so KVM must be
+ * careful to avoid TOCTOU bugs. In order to protect KVM, HINT_UNSAFE_IN_KVM()
+ * renames fields in struct kvm_run from <symbol> to <symbol>__unsafe when
+ * compiled into the kernel, ensuring that any use within KVM is obvious and
+ * gets extra scrutiny.
+ */
+#ifdef __KERNEL__
+#define HINT_UNSAFE_IN_KVM(_symbol) _symbol##__unsafe
+#else
+#define HINT_UNSAFE_IN_KVM(_symbol) _symbol
+#endif
+
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
 	__u8 request_interrupt_window;
-	__u8 immediate_exit;
+	__u8 HINT_UNSAFE_IN_KVM(immediate_exit);
 	__u8 padding1[6];
 
 	/* out */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bdea5b978f80..2b29851a90bd 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4425,7 +4425,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			synchronize_rcu();
 			put_pid(oldpid);
 		}
-		vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit);
+		vcpu->wants_to_run = !READ_ONCE(vcpu->run->immediate_exit__unsafe);
 		r = kvm_arch_vcpu_ioctl_run(vcpu);
 		vcpu->wants_to_run = false;
 

From patchwork Fri May 3 18:17:34 2024
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 13653278
Date: Fri, 3 May 2024 11:17:34 -0700
In-Reply-To: <20240503181734.1467938-1-dmatlack@google.com>
References: <20240503181734.1467938-1-dmatlack@google.com>
Message-ID: <20240503181734.1467938-4-dmatlack@google.com>
Subject: [PATCH v3 3/3] KVM: Mark a vCPU as preempted/ready iff it's
 scheduled out while running
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
 Zenghui Yu, Tianrui Zhao, Bibo Mao, Huacai Chen, Michael Ellerman,
 Nicholas Piggin, Anup Patel, Atish Patra, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 David Hildenbrand, Sean Christopherson,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 David Matlack

Mark a vCPU as preempted/ready if-and-only-if it's scheduled out while
running, i.e. do not mark a vCPU preempted/ready if it's scheduled out
during a non-KVM_RUN ioctl() or when userspace is doing KVM_RUN with
immediate_exit.

Commit 54aa83c90198 ("KVM: x86: do not set st->preempted when going back
to user space") stopped marking a vCPU as preempted when returning to
userspace, but if userspace then invokes a KVM vCPU ioctl() that gets
preempted, the vCPU will be marked preempted/ready. This is arguably
incorrect behavior since the vCPU was not actually preempted while the
guest was running; it was preempted while doing something on behalf of
userspace.

This commit also avoids KVM dirtying guest memory after userspace has
paused vCPUs, e.g. for live migration, which allows userspace to collect
the final dirty bitmap before or in parallel with saving vCPU state
without having to worry about saving vCPU state triggering writes to
guest memory.

Suggested-by: Sean Christopherson
Signed-off-by: David Matlack
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2b29851a90bd..3973e62acc7c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6302,7 +6302,7 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
 
-	if (current->on_rq) {
+	if (current->on_rq && vcpu->wants_to_run) {
 		WRITE_ONCE(vcpu->preempted, true);
 		WRITE_ONCE(vcpu->ready, true);
 	}
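
For readers following the series, the pattern the three patches combine can
be sketched in standalone userspace C: snapshot the userspace-writable field
exactly once, act only on the snapshot, and gate the preempted/ready marking
on it. This is only an illustration under stated assumptions, not kernel
code. Every name prefixed demo_ or DEMO_, and READ_ONCE_U8, is a
hypothetical stand-in; the real code uses READ_ONCE(), __KERNEL__,
struct kvm_run and struct kvm_vcpu as shown in the diffs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): one volatile load. */
#define READ_ONCE_U8(x) (*(const volatile uint8_t *)&(x))

/* Same shape as the macro in patch 2; DEMO_KERNEL stands in for __KERNEL__. */
#define DEMO_KERNEL 1
#ifdef DEMO_KERNEL
#define HINT_UNSAFE_IN_KVM(_symbol) _symbol##__unsafe
#else
#define HINT_UNSAFE_IN_KVM(_symbol) _symbol
#endif

/* Stand-in for struct kvm_run: memory userspace may rewrite at any time. */
struct demo_run {
	uint8_t request_interrupt_window;
	uint8_t HINT_UNSAFE_IN_KVM(immediate_exit);
};

/* Stand-in for struct kvm_vcpu: private state userspace cannot touch. */
struct demo_vcpu {
	struct demo_run *run;
	bool wants_to_run;
	bool preempted;
	bool ready;
};

/* Patch 3: mark the vCPU preempted/ready only if it was really running. */
void demo_sched_out(struct demo_vcpu *vcpu, bool on_rq)
{
	if (on_rq && vcpu->wants_to_run) {
		vcpu->preempted = true;
		vcpu->ready = true;
	}
}

/* Stand-in for kvm_arch_vcpu_ioctl_run(): consults only the snapshot. */
int demo_arch_run(struct demo_vcpu *vcpu)
{
	/* Simulate userspace clearing the shared field in mid-ioctl. */
	vcpu->run->immediate_exit__unsafe = 0;

	if (!vcpu->wants_to_run)
		return -EINTR;

	/* Pretend the task is scheduled out while the guest is running. */
	demo_sched_out(vcpu, true);
	return 0;
}

/* Patch 1: read the shared field once, then clear the flag on the way out. */
int demo_ioctl_run(struct demo_vcpu *vcpu)
{
	int r;

	vcpu->wants_to_run = !READ_ONCE_U8(vcpu->run->immediate_exit__unsafe);
	r = demo_arch_run(vcpu);
	vcpu->wants_to_run = false;

	return r;
}
```

If the shared field is 1 when demo_ioctl_run() is called, the call returns
-EINTR and leaves preempted/ready false even though the field is cleared
mid-call, because the decision was taken from the snapshot. If the field is 0,
the call returns 0 and both flags end up true.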