From patchwork Fri May 3 18:17:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 13653267 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9492F1586CB for ; Fri, 3 May 2024 18:17:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760261; cv=none; b=N/tHfzWGlJ+izNKF/vxU17TfVmU4P8m9tykqajaMpq1ahL96ORiOXJmUD1JdQhaNHhj2uw46QgQ3sPk6pqYC57VSigeF/LItKHT2qLn4s3GxPJ43V4TQ5EymHmYwaCg+nKqYt48+qyxvwIKceWJu5QnxWP+Cn97Qrq4ovSznTRw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714760261; c=relaxed/simple; bh=xBGFTYdyUOnHArOcW3JXAxjXjjvPLqJKxAY8xK+n/W4=; h=Date:Mime-Version:Message-ID:Subject:From:To:Cc:Content-Type; b=dlNLJe64OqSynX93MWI1Pgnddn2QjZXSEl+N1DL42IR9DPEAG/IAycENHYXiMU3sOcfqy8LcR7ZCSrGVqEalv0SYZQwnV6+uqoA96eGILHr5FMjh3VoZb/FJRe0kGBZURmqgncBnj7BimCnPSh0xG+b9Gq6X8LZdxnUJg+JHsHE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=KY9EphbW; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--dmatlack.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="KY9EphbW" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-de60321ce6cso9989969276.1 for ; Fri, 03 May 2024 11:17:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1714760258; x=1715365058; darn=vger.kernel.org; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=u5lOataUl4CqNIE0BOfGoSKjtAnehJHFOaimjTGPj/Y=; b=KY9EphbWEvaWYHycofn0JF+oFT1mV0XIuivbBWnIfDJQv5jMYDnOjvyO88kHqX/HPE nyf3+FOHtdA20hzKXRBcBOzvP9+LTiarTyJnqxGSC9QPoR58KUpIQSTmDmCMZzxW+XdF jd3SD5VL/OA52STy0Q3yg0JlonqF8mSC4G+asI65FHUYrkp5Q3NlcKcOn21NhvTnFoV2 +yZ/Ru2gU4aN8o8mnjH4aZflGF54Ug1F3T4nV+5zh75Ooi+pcznvJx1Wr5GV7AeIYY9V hL7VUF9eQ7AcTO0263AQ58k0PyG+JbI0onDpo5Ylle4A1Dwr9Xy8p9dSWFv3Edmk4GKW yZsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714760258; x=1715365058; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=u5lOataUl4CqNIE0BOfGoSKjtAnehJHFOaimjTGPj/Y=; b=v4CZPJctpwVTd5ko1c1U90N6ZNM/xogSNMXUGSxGVJ54c7uHOm17h/YnxJyZwILJoq 7AI3WisN4tJPEJVQMIvvP3aXTTp2kMfvTF8QsXP3YjXkmYRD9E3UlEyTpGxs8GJMmvC+ GJhbgwZGMZNoZYzjavrSt5GGmm2iPynE8PO9Q1IEm5YBbhpnxYLWp9XmdQ+UzbsNll0/ ByycOI6LPLgt91HHtDuYbyoaRR2JlWh+Hkm68Y3yt29aTRfyGRpMYjMss6JITPncWnGF rGWNHo/YRXmd4JiBiloXLSKSRBxbTS8rd18/nyjmMT4IVAG/hf0glY60AD5Km49wB3DE S+Hg== X-Forwarded-Encrypted: i=1; AJvYcCVR4S3mx8R1id29OWcsyVeZt+aJvdlmTTWuEackp77T9o+PKAFll2oBREEIafm3R1Hc82p+ozVKJMAMC6Y6wxg/X8e7z+4zRZWnOA== X-Gm-Message-State: AOJu0YzLau1u3+dOGOqCiJ0vjamkKHPL7CACRNChFA5zc1n8pP91OK4B 49jgms3Gb6AEe+dEevNya29htHkZdfwmJcVzlUHCkHy7tsI1Aj7hofkY4bdd8TP7ZuXmd1pE0Tw riqvJj5Kpfw== X-Google-Smtp-Source: AGHT+IGi/6BGSqLUvb7wYnWHXOlhd9obpauunW/BcufWuEVmkE7H211A4Dm9y03t3pfp70UoCYU1Se1aO50snA== X-Received: from dmatlack-n2d-128.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1309]) (user=dmatlack job=sendgmr) by 2002:a05:6902:729:b0:dcc:c57c:8873 with SMTP id l9-20020a056902072900b00dccc57c8873mr1104781ybt.9.1714760258668; Fri, 03 May 2024 11:17:38 -0700 (PDT) Date: Fri, 3 May 2024 11:17:31 -0700 Precedence: bulk X-Mailing-List: linux-mips@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 X-Mailer: git-send-email 2.45.0.rc1.225.g2a3ae87e7f-goog Message-ID: <20240503181734.1467938-1-dmatlack@google.com> Subject: [PATCH v3 0/3] KVM: Set vcpu->preempted/ready iff scheduled out while running From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Oliver Upton , James Morse , Suzuki K Poulose , Zenghui Yu , Tianrui Zhao , Bibo Mao , Huacai Chen , Michael Ellerman , Nicholas Piggin , Anup Patel , Atish Patra , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , David Hildenbrand , Sean Christopherson , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, David Matlack This series changes KVM to mark a vCPU as preempted/ready if-and-only-if it's scheduled out while running. i.e. Do not mark a vCPU preempted/ready if it's scheduled out during a non-KVM_RUN ioctl() or when userspace is doing KVM_RUN with immediate_exit=true. This is a logical extension of commit 54aa83c90198 ("KVM: x86: do not set st->preempted when going back to user space"), which stopped marking a vCPU as preempted when returning to userspace. But if userspace invokes a KVM vCPU ioctl() that gets preempted, the vCPU will be marked preempted/ready. This is arguably incorrect behavior since the vCPU was not actually preempted while the guest was running, it was preempted while doing something on behalf of userspace. In practice, this avoids KVM dirtying guest memory via the steal time page after userspace has paused vCPUs, e.g. for Live Migration, which allows userspace to collect the final dirty bitmap before or in parallel with saving vCPU state without having to worry about saving vCPU state triggering writes to guest memory. Patch 1 introduces vcpu->wants_to_run to allow KVM to detect when a vCPU is in its core run loop. Patch 2 renames immediated_exit to immediated_exit__unsafe within KVM to ensure that any new references get extra scrutiny. Patch 3 perform leverages vcpu->wants_to_run to contrain when vcpu->preempted and vcpu->ready are set. v3: - Use READ_ONCE() to read immediate_exit [Sean] - Replace use of immediate_exit with !wants_to_run to avoid TOCTOU [Sean] - Hide/Rename immediate_exit in KVM to harden against TOCTOU bugs [Sean] v2: https://lore.kernel.org/kvm/20240307163541.92138-1-dmatlack@google.com/ - Drop Google-specific "PRODKERNEL: " shortlog prefix [me] v1: https://lore.kernel.org/kvm/20231218185850.1659570-1-dmatlack@google.com/ David Matlack (3): KVM: Introduce vcpu->wants_to_run KVM: Ensure new code that references immediate_exit gets extra scrutiny KVM: Mark a vCPU as preempted/ready iff it's scheduled out while running arch/arm64/kvm/arm.c | 2 +- arch/loongarch/kvm/vcpu.c | 2 +- arch/mips/kvm/mips.c | 2 +- arch/powerpc/kvm/powerpc.c | 2 +- arch/riscv/kvm/vcpu.c | 2 +- arch/s390/kvm/kvm-s390.c | 2 +- arch/x86/kvm/x86.c | 4 ++-- include/linux/kvm_host.h | 1 + include/uapi/linux/kvm.h | 15 ++++++++++++++- virt/kvm/kvm_main.c | 5 ++++- 10 files changed, 27 insertions(+), 10 deletions(-) base-commit: 296655d9bf272cfdd9d2211d099bcb8a61b93037