From patchwork Tue Apr 1 20:44:12 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 14035305
Date: Tue, 1 Apr 2025 13:44:12 -0700
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-1-seanjc@google.com>
Subject: [PATCH 00/12] KVM: Make irqfd registration globally unique
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson,
 Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 linux-riscv@lists.infradead.org, David Matlack, Juergen Gross,
 Stefano Stabellini, Oleksandr Tyshchenko
Reply-To: Sean Christopherson

Xen folks, I Cc'd y'all because "sched/wait: Drop WQ_FLAG_EXCLUSIVE from
add_wait_queue_priority()" changes the behavior of
add_wait_queue_priority(), which Xen's privcmd uses.  Unless I've misread
the code, the Xen behavior isn't actually affected, but extra eyeballs and
any testing you can provide would be much appreciated.

Rework KVM's irqfd registration to require that an eventfd is bound to at
most one irqfd throughout the entire system.  KVM currently disallows
binding an eventfd to multiple irqfds for a single VM, but doesn't reject
attempts to bind an eventfd to multiple VMs.

This is obviously an ABI change, but I'm fairly confident that it won't
break userspace, because binding an eventfd to multiple irqfds hasn't
truly worked since commit e8dbf19508a1 ("kvm/eventfd: Use priority
waitqueue to catch events before userspace").

A somewhat undocumented, and perhaps even unintentional, side effect of
suppressing eventfd notifications for userspace is that the
priority+exclusive behavior also suppresses eventfd notifications for any
subsequent waiters, even if they are priority waiters.  I.e. only the
first VM with an irqfd+eventfd binding will get notifications.

And for IRQ bypass, a.k.a. device posted interrupts, globally unique
bindings are a hard requirement (at least on x86; I assume other archs are
the same).  KVM and the IRQ bypass manager kinda sorta handle this, but in
the absolute worst way possible (IMO).  Instead of surfacing an error to
userspace, KVM silently ignores IRQ bypass registration errors.

The motivation for this series is to harden against userspace goofs.
AFAIK, we (Google) have never actually had a bug where userspace tries to
assign an eventfd to multiple VMs, but the possibility has come up in more
than one bug investigation (our intra-host, a.k.a. copyless, migration
scheme transfers eventfds from the old to the new VM when updating the
host VMM).

Sean Christopherson (12):
  KVM: Use a local struct to do the initial vfs_poll() on an irqfd
  KVM: Acquire SRCU lock outside of irqfds.lock during assignment
  KVM: Initialize irqfd waitqueue callback when adding to the queue
  KVM: Add irqfd to KVM's list via the vfs_poll() callback
  KVM: Add irqfd to eventfd's waitqueue while holding irqfds.lock
  sched/wait: Add a waitqueue helper for fully exclusive priority
    waiters
  KVM: Disallow binding multiple irqfds to an eventfd with a priority
    waiter
  sched/wait: Drop WQ_FLAG_EXCLUSIVE from add_wait_queue_priority()
  KVM: Drop sanity check that per-VM list of irqfds is unique
  KVM: selftests: Assert that eventfd() succeeds in Xen shinfo test
  KVM: selftests: Add utilities to create eventfds and do KVM_IRQFD
  KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements

 include/linux/kvm_irqfd.h                     |   1 -
 include/linux/wait.h                          |   2 +
 kernel/sched/wait.c                           |  24 +++-
 tools/testing/selftests/kvm/Makefile.kvm      |   4 +
 tools/testing/selftests/kvm/arm64/vgic_irq.c  |  12 +-
 .../testing/selftests/kvm/include/kvm_util.h  |  40 ++++++
 tools/testing/selftests/kvm/irqfd_test.c      | 130 ++++++++++++++++++
 .../selftests/kvm/x86/xen_shinfo_test.c       |  21 +--
 virt/kvm/eventfd.c                            | 130 +++++++++++++-----
 9 files changed, 299 insertions(+), 65 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/irqfd_test.c

base-commit: 782f9feaa9517caf33186dcdd6b50a8f770ed29b