[0/6] KVM: Fix bugs in vCPUs xarray usage

Message ID: 20241009150455.1057573-1-seanjc@google.com

Message

Sean Christopherson Oct. 9, 2024, 3:04 p.m. UTC
This series stems from Will's observation[*] that kvm_vm_ioctl_create_vcpu()'s
handling of xa_store() failure when inserting into vcpu_array is technically
broken, although in practice it's impossible for xa_store() to fail.

After much back and forth and staring, I realized that commit afb2acb2e3a3
("KVM: Fix vcpu_array[0] races") papered over underlying bugs in
kvm_get_vcpu() and kvm_for_each_vcpu().  The core problem is that KVM
allowed other tasks to see vCPU0 while online_vcpus==0, and thus trying
to gracefully error out of vCPU creation led to use-after-free failures.

So, rather than trying to solve the unsolvable problem for an error path
that should be impossible to hit, fix the underlying issue and ensure that
vcpu_array[0] is accessed if and only if online_vcpus is non-zero.
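For illustration only, the invariant can be sketched in userspace C.  This is
not the code from the patches: the toy_* names are hypothetical, a fixed-size
array stands in for the vcpu_array xarray, and C11 acquire/release atomics
stand in for the kernel's atomic_t and memory barriers.  The point is that a
lookup consults online_vcpus before touching the array, so a vCPU that has
been stored but not yet published is never returned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical stand-ins for struct kvm_vcpu and struct kvm. */
struct toy_vcpu {
	int id;
};

struct toy_vm {
	atomic_int online_vcpus;
	struct toy_vcpu *vcpu_array[TOY_MAX_VCPUS];	/* stand-in for the xarray */
};

/*
 * Look up vCPU @i.  The slot is read only if @i is below the published
 * online count, so a stored-but-not-yet-online vCPU is never returned.
 */
static struct toy_vcpu *toy_get_vcpu(struct toy_vm *vm, int i)
{
	int online;

	assert(vm != NULL);

	/* Acquire pairs with the release store in toy_create_vcpu(). */
	online = atomic_load_explicit(&vm->online_vcpus, memory_order_acquire);
	if (i < 0 || i >= online)
		return NULL;

	return vm->vcpu_array[i];
}

/*
 * Create a vCPU: fill the array slot first, then publish it by bumping
 * online_vcpus.  Assumes callers are serialized, as a lock would do in
 * real code.  Returns the new index, or -1 if the VM is full.
 */
static int toy_create_vcpu(struct toy_vm *vm, struct toy_vcpu *vcpu)
{
	int idx = atomic_load_explicit(&vm->online_vcpus, memory_order_relaxed);

	if (idx >= TOY_MAX_VCPUS)
		return -1;

	vm->vcpu_array[idx] = vcpu;

	/* Release makes the slot contents visible before the new count. */
	atomic_store_explicit(&vm->online_vcpus, idx + 1, memory_order_release);
	return idx;
}
```

In this model, an error path that runs after the slot is filled but before the
count is bumped can clear the slot and free the vCPU, because no reader can
have observed it.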

Patch 3 fixes a race Michal identified when we were trying to figure out
how to handle the xa_store() mess.

Patch 4 reverts afb2acb2e3a3.

Patches 5 and 6 are tangentially related cleanups.

[*] https://lkml.kernel.org/r/20240730155646.1687-1-will%40kernel.org

Sean Christopherson (6):
  KVM: Explicitly verify target vCPU is online in kvm_get_vcpu()
  KVM: Verify there's at least one online vCPU when iterating over all
    vCPUs
  KVM: Grab vcpu->mutex across installing the vCPU's fd and bumping
    online_vcpus
  Revert "KVM: Fix vcpu_array[0] races"
  KVM: Don't BUG() the kernel if xa_insert() fails with -EBUSY
  KVM: Drop hack that "manually" informs lockdep of kvm->lock vs.
    vcpu->mutex

 include/linux/kvm_host.h | 16 ++++++++--
 virt/kvm/kvm_main.c      | 68 ++++++++++++++++++++++++++++++----------
 2 files changed, 65 insertions(+), 19 deletions(-)


base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b

Comments

Will Deacon Oct. 29, 2024, 2:18 p.m. UTC | #1
On Wed, Oct 09, 2024 at 08:04:49AM -0700, Sean Christopherson wrote:
> This series stems from Will's observation[*] that kvm_vm_ioctl_create_vcpu()'s
> handling of xa_store() failure when inserting into vcpu_array is technically
> broken, although in practice it's impossible for xa_store() to fail.
> 
> After much back and forth and staring, I realized that commit afb2acb2e3a3
> ("KVM: Fix vcpu_array[0] races") papered over underlying bugs in
> kvm_get_vcpu() and kvm_for_each_vcpu().  The core problem is that KVM
> allowed other tasks to see vCPU0 while online_vcpus==0, and thus trying
> to gracefully error out of vCPU creation led to use-after-free failures.
> 
> So, rather than trying to solve the unsolvable problem for an error path
> that should be impossible to hit, fix the underlying issue and ensure that
> vcpu_array[0] is accessed if and only if online_vcpus is non-zero.
> 
> Patch 3 fixes a race Michal identified when we were trying to figure out
> how to handle the xa_store() mess.
> 
> Patch 4 reverts afb2acb2e3a3.
> 
> Patches 5 and 6 are tangentially related cleanups.

Thanks, Sean. For the series:

Acked-by: Will Deacon <will@kernel.org>

I sympathise a little with Paolo on patch 4, but at the end of the day
it's a revert and I think that the code is better for it, even if the
whole scenario is messy.

Will