From patchwork Wed Jan 8 11:50:07 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930711
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com,
 jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 01/26] vfio/container: pass MemoryRegion to DMA operations
Date: Wed, 8 Jan 2025 11:50:07 +0000
Message-Id: <20250108115032.1677686-2-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: Jagannathan Raman

Pass through the MemoryRegion to DMA operation handlers of vfio
containers. The vfio-user container will need this later.
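
For illustration only (not part of the patch): the new @mrp argument
follows the same optional out-parameter convention that callers already
rely on for @vaddr and @read_only, i.e. it is written only when the
caller passes a non-NULL pointer. A minimal standalone sketch of that
convention follows; Region, lookup_region(), region_name_at() and
region_exists() are hypothetical stand-ins, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MemoryRegion; not the QEMU type. */
typedef struct Region {
    const char *name;
    bool readonly;
} Region;

/*
 * Hypothetical lookup with optional out-parameters: @read_only and
 * @mrp are each written only if the caller supplied a non-NULL pointer.
 * Returns false (and writes nothing) when @idx is out of range.
 */
static bool lookup_region(Region *table, size_t n, size_t idx,
                          bool *read_only, Region **mrp)
{
    if (idx >= n) {
        return false;
    }
    if (read_only != NULL) {
        *read_only = table[idx].readonly;
    }
    if (mrp != NULL) {
        *mrp = &table[idx];
    }
    return true;
}

/* A caller that needs the region, as the DMA map path now does. */
static const char *region_name_at(Region *table, size_t n, size_t idx)
{
    Region *mr = NULL;

    if (!lookup_region(table, n, idx, NULL, &mr)) {
        return NULL;
    }
    assert(mr != NULL);
    return mr->name;
}

/* A caller that only wants success or failure passes NULL for both. */
static bool region_exists(Region *table, size_t n, size_t idx)
{
    return lookup_region(table, n, idx, NULL, NULL);
}
```

Callers that do not need the region simply pass NULL, which is why
vhost_vdpa_iommu_map_notify() and vfio_iommu_map_dirty_notify() only
gain one extra NULL argument in this patch.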

Originally-by: John Johnson
Signed-off-by: Jagannathan Raman
Signed-off-by: Elena Ufimtseva
Signed-off-by: John Levon
---
 hw/vfio/common.c                      | 17 ++++++++++-------
 hw/vfio/container-base.c              |  4 ++--
 hw/vfio/container.c                   |  3 ++-
 hw/vfio/iommufd.c                     |  3 ++-
 hw/virtio/vhost-vdpa.c                |  2 +-
 include/exec/memory.h                 |  4 +++-
 include/hw/vfio/vfio-container-base.h |  4 ++--
 system/memory.c                       |  7 ++++++-
 8 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f7499a9b74..0e3ea71aae 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -248,12 +248,12 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
 /* Called with rcu_read_lock held. */
 static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
                                ram_addr_t *ram_addr, bool *read_only,
-                               Error **errp)
+                               MemoryRegion **mrp, Error **errp)
 {
     bool ret, mr_has_discard_manager;
 
     ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
-                               &mr_has_discard_manager, errp);
+                               &mr_has_discard_manager, mrp, errp);
     if (ret && mr_has_discard_manager) {
         /*
          * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
@@ -281,6 +281,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
     VFIOContainerBase *bcontainer = giommu->bcontainer;
     hwaddr iova = iotlb->iova + giommu->iommu_offset;
+    MemoryRegion *mrp;
     void *vaddr;
     int ret;
     Error *local_err = NULL;
@@ -300,7 +301,8 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
         bool read_only;
 
-        if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &local_err)) {
+        if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &mrp,
+                                &local_err)) {
             error_report_err(local_err);
             goto out;
         }
@@ -313,7 +315,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
          */
         ret = vfio_container_dma_map(bcontainer, iova,
                                      iotlb->addr_mask + 1, vaddr,
-                                     read_only);
+                                     read_only, mrp);
         if (ret) {
             error_report("vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx", %p) = %d (%s)",
@@ -378,7 +380,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl,
         vaddr = memory_region_get_ram_ptr(section->mr) + start;
 
         ret = vfio_container_dma_map(bcontainer, iova, next - start,
-                                     vaddr, section->readonly);
+                                     vaddr, section->readonly, section->mr);
         if (ret) {
             /* Rollback */
             vfio_ram_discard_notify_discard(rdl, section);
@@ -662,7 +664,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     }
 
     ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize),
-                                 vaddr, section->readonly);
+                                 vaddr, section->readonly, section->mr);
     if (ret) {
         error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", "
                    "0x%"HWADDR_PRIx", %p) = %d (%s)",
@@ -1214,7 +1216,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     }
 
     rcu_read_lock();
-    if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) {
+    if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, NULL,
+                            &local_err)) {
         error_report_err(local_err);
         goto out_unlock;
     }
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 749a3fd29d..5e0c9700d9 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -17,12 +17,12 @@
 
 int vfio_container_dma_map(VFIOContainerBase *bcontainer,
                            hwaddr iova, ram_addr_t size,
-                           void *vaddr, bool readonly)
+                           void *vaddr, bool readonly, MemoryRegion *mrp)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
 
     g_assert(vioc->dma_map);
-    return vioc->dma_map(bcontainer, iova, size, vaddr, readonly);
+    return vioc->dma_map(bcontainer, iova, size, vaddr, readonly, mrp);
 }
 
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 4ebb526808..fe193ac7da 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -176,7 +176,8 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
 }
 
 static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
-                               ram_addr_t size, void *vaddr, bool readonly)
+                               ram_addr_t size, void *vaddr, bool readonly,
+                               MemoryRegion *mrp)
 {
     const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
                                                   bcontainer);
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 3490a8f1eb..f541b00785 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -28,7 +28,8 @@
 #include "exec/ram_addr.h"
 
 static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
-                            ram_addr_t size, void *vaddr, bool readonly)
+                            ram_addr_t size, void *vaddr, bool readonly,
+                            MemoryRegion *mrp)
 {
     const VFIOIOMMUFDContainer *container =
         container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3cdaa12ed5..a1866bb396 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -228,7 +228,7 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
         bool read_only;
 
-        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL,
+        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL, NULL,
                                   &local_err)) {
             error_report_err(local_err);
             return;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9458e2801d..50e7b7be30 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -737,13 +737,15 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
  * @read_only: indicates if writes are allowed
  * @mr_has_discard_manager: indicates memory is controlled by a
  *                          RamDiscardManager
+ * @mrp: if non-NULL, fill in with MemoryRegion
  * @errp: pointer to Error*, to store an error if it happens.
  *
  * Return: true on success, else false setting @errp with error.
  */
 bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
                           ram_addr_t *ram_addr, bool *read_only,
-                          bool *mr_has_discard_manager, Error **errp);
+                          bool *mr_has_discard_manager, MemoryRegion **mrp,
+                          Error **errp);
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 4cff9943ab..c9d339383e 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -73,7 +73,7 @@ typedef struct VFIORamDiscardListener {
 
 int vfio_container_dma_map(VFIOContainerBase *bcontainer,
                            hwaddr iova, ram_addr_t size,
-                           void *vaddr, bool readonly);
+                           void *vaddr, bool readonly, MemoryRegion *mrp);
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
                              hwaddr iova, ram_addr_t size,
                              IOMMUTLBEntry *iotlb);
@@ -113,7 +113,7 @@ struct VFIOIOMMUClass {
     bool (*setup)(VFIOContainerBase *bcontainer, Error **errp);
     int (*dma_map)(const VFIOContainerBase *bcontainer,
                    hwaddr iova, ram_addr_t size,
-                   void *vaddr, bool readonly);
+                   void *vaddr, bool readonly, MemoryRegion *mrp);
     int (*dma_unmap)(const VFIOContainerBase *bcontainer,
                      hwaddr iova, ram_addr_t size,
                      IOMMUTLBEntry *iotlb);
diff --git a/system/memory.c b/system/memory.c
index 78e17e0efa..82ac19d473 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2185,7 +2185,8 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
 /* Called with rcu_read_lock held. */
 bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
                           ram_addr_t *ram_addr, bool *read_only,
-                          bool *mr_has_discard_manager, Error **errp)
+                          bool *mr_has_discard_manager, MemoryRegion **mrp,
+                          Error **errp)
 {
     MemoryRegion *mr;
     hwaddr xlat;
@@ -2250,6 +2251,10 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
         *read_only = !writable || mr->readonly;
     }
 
+    if (mrp != NULL) {
+        *mrp = mr;
+    }
+
     return true;
 }
 

From patchwork Wed Jan 8 11:50:08 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930716
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com,
 jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 02/26] vfio/container: pass listener_begin/commit callbacks
Date: Wed, 8 Jan 2025 11:50:08 +0000
Message-Id: <20250108115032.1677686-3-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: John Levon

The vfio-user container will later need to hook into these callbacks;
set up vfio to use them, and optionally pass them through to the
container.

Signed-off-by: John Levon
---
 hw/vfio/common.c                      | 28 +++++++++++++++++++++++++++
 include/hw/vfio/vfio-container-base.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0e3ea71aae..0cacc66c85 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -557,6 +557,32 @@ static bool vfio_get_section_iova_range(VFIOContainerBase *bcontainer,
     return true;
 }
 
+static void vfio_listener_begin(MemoryListener *listener)
+{
+    VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
+                                                 listener);
+    void (*listener_begin)(VFIOContainerBase *bcontainer);
+
+    listener_begin = VFIO_IOMMU_GET_CLASS(bcontainer)->listener_begin;
+
+    if (listener_begin) {
+        listener_begin(bcontainer);
+    }
+}
+
+static void vfio_listener_commit(MemoryListener *listener)
+{
+    VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
+                                                 listener);
+    void (*listener_commit)(VFIOContainerBase *bcontainer);
+
+    listener_commit = VFIO_IOMMU_GET_CLASS(bcontainer)->listener_commit;
+
+    if (listener_commit) {
+        listener_commit(bcontainer);
+    }
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -1378,6 +1404,8 @@ static void vfio_listener_log_sync(MemoryListener *listener,
 
 const MemoryListener vfio_memory_listener = {
     .name = "vfio",
+    .begin = vfio_listener_begin,
+    .commit = vfio_listener_commit,
     .region_add = vfio_listener_region_add,
     .region_del = vfio_listener_region_del,
     .log_global_start = vfio_listener_log_global_start,
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index c9d339383e..0a863df0dc 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -111,6 +111,8 @@ struct VFIOIOMMUClass {
 
     /* basic feature */
     bool (*setup)(VFIOContainerBase *bcontainer, Error **errp);
+    void (*listener_begin)(VFIOContainerBase *bcontainer);
+    void (*listener_commit)(VFIOContainerBase *bcontainer);
     int (*dma_map)(const VFIOContainerBase *bcontainer,
                    hwaddr iova, ram_addr_t size,
                    void *vaddr, bool readonly, MemoryRegion *mrp);

From patchwork Wed Jan 8 11:50:09 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930705
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8584E77199 for ; Wed, 8 Jan 2025 11:55:13 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdI-0006Ro-MQ; Wed, 08 Jan 2025 06:53:56 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUd7-0006PU-8i for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:49 -0500
Received: from mx0b-002c1b01.pphosted.com ([148.163.155.12]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUd1-0002D1-1O for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:42 -0500
Received: from pps.filterd (m0127844.ppops.net [127.0.0.1]) by mx0b-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50885pFE021537; Wed, 8 Jan 2025 03:53:36 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to
:message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=gfZZgflabec01m7nn1Io2co+v8baC8sayzUsnESXz gY=; b=0ELCh8jHy1HArGlM5b4zOPMSFBFPvQa+NYuV3oRhF/E4STpoHV0mjqwzU S05eufz02s1ZRXQQZiRerC2fbBs8dhDexwJDjaUr6hEaBtx7M2m7g0BPAgaShxf/ /1RSLtoLt/DWsTvWwmGgBlJRK04Nt8AKUgAwzCUUKb1KbSwbbtpRqP3OxzVCY/xP HXzYN8VgL9RDo6MbuQNa/a2Hy+ULJQsx7fSH1EE8ka74VP4uEE2ffEZI/wkc20ME tyr5FOda363jnbD8//QFan0fEQAi32thKFFQkvUU6tIwIaE4mX6N7/ikIVGD8NTP 115Pu/uHpEmoQJfKC5202mxbuyqQw== Received: from nam04-bn8-obe.outbound.protection.outlook.com (mail-bn8nam04lp2040.outbound.protection.outlook.com [104.47.74.40]) by mx0b-002c1b01.pphosted.com (PPS) with ESMTPS id 43y5d3rs7x-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=bRG41Q8lJ9OnX0Xd1JpwCfRzLtV6xihSFBQBD5KE7INCtGHk0QRZRzbqRvyqe3hb0014Lx0JVT463kltf9izuT51GckdOrk2OHuYb9hYQMpPwPLxQGJJUWQ8N5pxnKAzrgaWwbDGWbnaaYQdXP5uLnYK7PdOXCUcl2QlfN0k2cRrry89nVVDo2uB8pIVkjWg3uCREUTPE/xVMguJx69SoHuAFdUHfbcikdQz53RiEFN07C1Pgz/akRYXsPoxA0fqgNAHn2EWLB6DZxSdp/RTIj9f6zgi7367/Z4c5Ung/NeRIjHKIgqwQN7/cdZQNcguTWq6snPKbZ00cOX/XjAQ3g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gfZZgflabec01m7nn1Io2co+v8baC8sayzUsnESXzgY=; b=rB8Vxw3YzxedTm0HBT27H+JehfC9vvqk7WpdhJLPTQ0LGcyD1AaLcb1NqivuP0bzF+7EQUr/KvpLZES8QH5CfH+IWK5F7maeV9Go8sriIqgxdx389A4ns7z2KDk78eYNV0KzEgHBHbR/IKxtbIR//DcmucWJQJXaFkYluagevosJclBqe+l3SHYXYs+i9/Z9Rm+oj/ozLWJqisYQGVfpORcOusWzwLrZMMIQ7F2y9rOKzD4KLnpDqNSbuM1mY/QA8NMKsj8PTERFi4ENc9AKg5ICZEn7nu1JpXuqiWd/0WmGioGaclFlaPGwosBxL7sOiJAOK8wt49I/M7VEVbeaQw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none 
header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gfZZgflabec01m7nn1Io2co+v8baC8sayzUsnESXzgY=; b=JQShnw+VpTNC9zCr9HC5A77szv4rS1SPcNiaAwvJj85sYGzKMFaQEPOWtR2Z55hslhwZE8Qs9DmuI3bC0E4+LhkjT/R8S85A3nrGvOsMYtwsS5dJHZjz8YicBv9ABItF4ySuIC/w/pOmSIfg3C6wTwdBKv+Nt8mKln/XHQw5j9Hc//mS/5pdexdRgFRWeBOWV0y+KCrWg9KHeve60ULW53fsG27XdAP8ZIHUzxssvq/3mu48Wz3fNRW8f/UdXQwTyDmvTqPCKKAgHmp9RRyy2O0GaAysaqgn2dlJKdSB7SsHf48R8RwskEyw2FJgmzyPsRw+I46h2em60r4M+aZJLw== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by SJ0PR02MB7502.namprd02.prod.outlook.com (2603:10b6:a03:2a2::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.11; Wed, 8 Jan 2025 11:53:33 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:33 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 03/26] vfio/container: support VFIO_DMA_UNMAP_FLAG_ALL Date: Wed, 8 Jan 2025 11:50:09 +0000 Message-Id: <20250108115032.1677686-4-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|SJ0PR02MB7502:EE_ X-MS-Office365-Filtering-Correlation-Id: cbfb7bf5-bf42-4d2b-6b17-08dd2fdb13db x-proofpoint-crosstenant: true 

Some containers can directly implement unmapping all regions; add a new
flag to support this.

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/common.c                      | 24 +++++++----------
 hw/vfio/container-base.c              |  4 +--
 hw/vfio/container.c                   | 38 +++++++++++++++++++++++++--
 hw/vfio/iommufd.c                     | 19 +++++++++++++-
 include/hw/vfio/vfio-common.h         |  1 +
 include/hw/vfio/vfio-container-base.h |  4 +--
 6 files changed, 68 insertions(+), 22 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0cacc66c85..49e3543c89 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -324,7 +324,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
         }
     } else {
         ret = vfio_container_dma_unmap(bcontainer, iova,
-                                       iotlb->addr_mask + 1, iotlb);
+                                       iotlb->addr_mask + 1, iotlb, 0);
         if (ret) {
             error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
@@ -348,7 +348,7 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl,
     int ret;
 
     /* Unmap with a single call. */
-    ret = vfio_container_dma_unmap(bcontainer, iova, size , NULL);
+    ret = vfio_container_dma_unmap(bcontainer, iova, size, NULL, 0);
     if (ret) {
         error_report("%s: vfio_container_dma_unmap() failed: %s", __func__,
                      strerror(-ret));
@@ -789,21 +789,15 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 
     if (try_unmap) {
+        int flags = 0;
+
         if (int128_eq(llsize, int128_2_64())) {
-            /* The unmap ioctl doesn't accept a full 64-bit span. */
-            llsize = int128_rshift(llsize, 1);
-            ret = vfio_container_dma_unmap(bcontainer, iova,
-                                           int128_get64(llsize), NULL);
-            if (ret) {
-                error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                             "0x%"HWADDR_PRIx") = %d (%s)",
-                             bcontainer, iova, int128_get64(llsize), ret,
-                             strerror(-ret));
-            }
-            iova += int128_get64(llsize);
+            flags = VFIO_DMA_UNMAP_FLAG_ALL;
         }
-        ret = vfio_container_dma_unmap(bcontainer, iova,
-                                       int128_get64(llsize), NULL);
+
+        ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize),
+                                       NULL, flags);
+
         if (ret) {
             error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 5e0c9700d9..db27e9c31d 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -27,12 +27,12 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer,
 
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
                              hwaddr iova, ram_addr_t size,
-                             IOMMUTLBEntry *iotlb)
+                             IOMMUTLBEntry *iotlb, int flags)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
 
     g_assert(vioc->dma_unmap);
-    return vioc->dma_unmap(bcontainer, iova, size, iotlb);
+    return vioc->dma_unmap(bcontainer, iova, size, iotlb, flags);
 }
 
 bool vfio_container_add_section_window(VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index fe193ac7da..39c77d402c 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -117,7 +117,7 @@ unmap_exit:
  */
 static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
                                  hwaddr iova, ram_addr_t size,
-                                 IOMMUTLBEntry *iotlb)
+                                 IOMMUTLBEntry *iotlb, int flags)
 {
     const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
                                                   bcontainer);
@@ -140,6 +140,34 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
         need_dirty_sync = true;
     }
 
+    /* use unmap all if supported */
+    if (flags & VFIO_DMA_UNMAP_FLAG_ALL) {
+        unmap.iova = 0;
+        unmap.size = 0;
+        if (container->unmap_all_supported) {
+            ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
+        } else {
+            /* unmap in halves */
+            Int128 llsize = int128_rshift(int128_2_64(), 1);
+
+            unmap.size = int128_get64(llsize);
+
+            ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
+
+            if (ret == 0) {
+                unmap.iova += int128_get64(llsize);
+
+                ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
+            }
+        }
+
+        if (ret != 0) {
+            return -errno;
+        }
+
+        goto out;
+    }
+
     while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
         /*
          * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
@@ -163,6 +191,7 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
         return -errno;
     }
 
+out:
     if (need_dirty_sync) {
         ret = vfio_get_dirty_bitmap(bcontainer, iova, size,
                                     iotlb->translated_addr, &local_err);
@@ -200,7 +229,7 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
      */
     if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
         (errno == EBUSY &&
-         vfio_legacy_dma_unmap(bcontainer, iova, size, NULL) == 0 &&
+         vfio_legacy_dma_unmap(bcontainer, iova, size, NULL, 0) == 0 &&
          ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
         return 0;
     }
@@ -535,6 +564,11 @@ static bool vfio_legacy_setup(VFIOContainerBase *bcontainer, Error **errp)
     vfio_get_info_iova_range(info, bcontainer);
 
     vfio_get_iommu_info_migration(container, info);
+
+    ret = ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UNMAP_ALL);
+
+    container->unmap_all_supported = (ret != 0);
+
     return true;
 }
 
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index f541b00785..39c2b802b0 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -41,11 +41,28 @@ static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
 
 static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
                               hwaddr iova, ram_addr_t size,
-                              IOMMUTLBEntry *iotlb)
+                              IOMMUTLBEntry *iotlb, int flags)
 {
     const VFIOIOMMUFDContainer *container =
         container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
 
+    /* unmap in halves */
+    if (flags & VFIO_DMA_UNMAP_FLAG_ALL) {
+        Int128 llsize = int128_rshift(int128_2_64(), 1);
+        int ret;
+
+        ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
+                                        iova, int128_get64(llsize));
+        iova += int128_get64(llsize);
+
+        if (ret == 0) {
+            ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
+                                            iova, int128_get64(llsize));
+        }
+
+        return ret;
+    }
+
     /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
     return iommufd_backend_unmap_dma(container->be,
                                      container->ioas_id, iova, size);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 0c60be5b15..13c67d25cb 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -84,6 +84,7 @@ typedef struct VFIOContainer {
     VFIOContainerBase bcontainer;
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     unsigned iommu_type;
+    bool unmap_all_supported;
     QLIST_HEAD(, VFIOGroup) group_list;
 } VFIOContainer;
 
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 0a863df0dc..24e48e3a07 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -76,7 +76,7 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer,
                            void *vaddr, bool readonly, MemoryRegion *mrp);
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
                              hwaddr iova, ram_addr_t size,
-                             IOMMUTLBEntry *iotlb);
+                             IOMMUTLBEntry *iotlb, int flags);
 bool vfio_container_add_section_window(VFIOContainerBase *bcontainer,
                                        MemoryRegionSection *section,
                                        Error **errp);
@@ -118,7 +118,7 @@ struct VFIOIOMMUClass {
                    void *vaddr, bool readonly, MemoryRegion *mrp);
     int (*dma_unmap)(const VFIOContainerBase *bcontainer,
                      hwaddr iova, ram_addr_t size,
-                     IOMMUTLBEntry *iotlb);
+                     IOMMUTLBEntry *iotlb, int flags);
     bool (*attach_device)(const char *name, VFIODevice *vbasedev,
                           AddressSpace *as, Error **errp);
     void (*detach_device)(VFIODevice *vbasedev);

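
Note on the "unmap in halves" fallback used above: a full 64-bit span is 2^64
bytes, which cannot be expressed in a 64-bit size field, so a backend without
native unmap-all support can issue two calls of 2^63 bytes each. The
standalone sketch below illustrates only that arithmetic and the
stop-on-first-failure behaviour. It is not code from this series: `unmap_fn`,
`unmap_all_in_halves`, `struct unmap_log` and `record_unmap` are hypothetical
names made up for the illustration, and the sketch assumes the span starts at
IOVA 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical callback type standing in for a backend unmap call.
 * Returns 0 on success, nonzero on failure. Not a QEMU or kernel API.
 */
typedef int (*unmap_fn)(uint64_t iova, uint64_t size, void *opaque);

/*
 * Unmap the whole 64-bit IOVA space in two halves. 2^64 is not
 * representable in a uint64_t, but 2^63 is, so issue two calls of
 * 2^63 bytes each and skip the second call if the first one fails.
 */
static int unmap_all_in_halves(unmap_fn unmap, void *opaque)
{
    const uint64_t half = UINT64_C(1) << 63;
    int ret;

    ret = unmap(0, half, opaque);
    if (ret == 0) {
        ret = unmap(half, half, opaque);
    }
    return ret;
}

/* Illustrative recorder used to observe which calls were made. */
struct unmap_log {
    uint64_t iova[2];
    uint64_t size[2];
    int calls;
    int fail_on_call;   /* 1-based index of the call to fail; 0 = never */
};

static int record_unmap(uint64_t iova, uint64_t size, void *opaque)
{
    struct unmap_log *log = opaque;

    assert(log->calls < 2);
    log->iova[log->calls] = iova;
    log->size[log->calls] = size;
    log->calls++;
    return log->calls == log->fail_on_call ? -1 : 0;
}
```

Passing `record_unmap` with a zero-initialised `struct unmap_log` shows the
two ranges that would be requested; setting `fail_on_call = 1` shows that the
second half is not attempted after a failure.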
From patchwork Wed Jan 8 11:50:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930704
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 04/26] vfio: add vfio_attach_device_by_iommu_type()
Date: Wed, 8 Jan 2025 11:50:10 +0000
Message-Id: <20250108115032.1677686-5-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

Allow attachment by explicitly passing a TYPE_VFIO_IOMMU_* string; vfio-user
will use this later.

Signed-off-by: John Levon
---
 hw/vfio/common.c              | 30 +++++++++++++++++++-----------
 include/hw/vfio/vfio-common.h |  3 +++
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 49e3543c89..cb299fc3bf 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1551,25 +1551,20 @@ retry:
     return info;
 }
 
-bool vfio_attach_device(char *name, VFIODevice *vbasedev,
-                        AddressSpace *as, Error **errp)
+bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name,
+                                      VFIODevice *vbasedev, AddressSpace *as,
+                                      Error **errp)
 {
-    const VFIOIOMMUClass *ops =
-        VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
     HostIOMMUDevice *hiod = NULL;
-
-    if (vbasedev->iommufd) {
-        ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
-    }
-
-    assert(ops);
-
+    const VFIOIOMMUClass *ops =
+        VFIO_IOMMU_CLASS(object_class_by_name(iommu_type));
 
     if (!vbasedev->mdev) {
         hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
         vbasedev->hiod = hiod;
     }
 
+
     if (!ops->attach_device(name, vbasedev, as, errp)) {
         object_unref(hiod);
         vbasedev->hiod = NULL;
@@ -1579,6 +1574,19 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
     return true;
 }
 
+bool vfio_attach_device(char *name, VFIODevice *vbasedev,
+                        AddressSpace *as, Error **errp)
+{
+    const char *iommu_type = TYPE_VFIO_IOMMU_LEGACY;
+
+    if (vbasedev->iommufd) {
+        iommu_type = TYPE_VFIO_IOMMU_IOMMUFD;
+    }
+
+    return vfio_attach_device_by_iommu_type(iommu_type, name, vbasedev,
+                                            as, errp);
+}
+
 void vfio_detach_device(VFIODevice *vbasedev)
 {
     if (!vbasedev->bcontainer) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 13c67d25cb..387854cb0b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -252,6 +252,9 @@ bool vfio_device_is_mdev(VFIODevice *vbasedev);
 bool vfio_device_hiod_realize(VFIODevice *vbasedev, Error **errp);
 bool vfio_attach_device(char *name, VFIODevice *vbasedev,
                         AddressSpace *as, Error **errp);
+bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name,
+                                      VFIODevice *vbasedev, AddressSpace *as,
+                                      Error **errp);
 void vfio_detach_device(VFIODevice *vbasedev);
 
 int vfio_kvm_device_add_fd(int fd, Error **errp);

From patchwork Wed Jan 8 11:50:11 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930724
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1;
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gBp4yAV4zVQos1p2lAL0HCRy3wcK0mtDievm/YWpjAI=; b=l7gpf6q7LJWPcz+m2AfLqNY8v+ECnnp7WVClhwyYAIZTs8+jawWCPCxeLzZcLV4bFwg+M5NU/gihn7goOFpve7fmqHED2A/EKMhjTHLKa5HTQYRZ579fqpJ/fqpPi9InuH+S40OvcDCwOLxGWkU5aOZ0jx6kgdrpKRTIGgJjILQ2UfdeYimfxuhfxyeJreY7qq8svR2ickQyfwKUgvlJtsMHEiDrgZ8GH2fcjwgXSSIrmQy0zUm4e6r7sa7VCS4mXSGMRDl1QjgsCb4jw45hn6eENHVZDWu3nDjqGciqbUq4r/2o8S0Snmtb0lCNSnUVjj2Wuo+GQMnMaFp9Lopn2Q== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by SJ0PR02MB7502.namprd02.prod.outlook.com (2603:10b6:a03:2a2::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.11; Wed, 8 Jan 2025 11:53:36 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:36 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 05/26] vfio: add vfio_prepare_device() Date: Wed, 8 Jan 2025 11:50:11 +0000 Message-Id: <20250108115032.1677686-6-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|SJ0PR02MB7502:EE_ X-MS-Office365-Filtering-Correlation-Id: 52a3c438-1eec-42bd-2df8-08dd2fdb155a x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
wjD848bw3j6wUKHyIF6G43ampA6sxCr7ZxyxtTN5ss2AK2ZmYEFvU4VD72QZmPsnkvGBsYY/pjyE9QZVLY26ypwdlbtW6AKKpIDVOq9RPMFEG2zmUFAGWgDCMPwsP/6uwd6d2iPLFk9q3AMNScFhQXA1OjeSNrljQh6nqPqD7enuiQpHjIr0gElv3+70MGY0duXNulBIY+TH2TSejPdkJhFbD3uxs4dy0gG70VP5mSg299eQwnnPmcY6NDdxB/Zn/1hMeTBCndqYeL+V2qeObAVKfy0ehRqImdvB3HFz2JfnNub6g4UBy7MqrA1DWLcTUC8HLQMqqrOnjryJWXCeLAS5kaU0kicImtamOMb2Q7L1lp/aMbZ9BEQ6VsACyU/yIGSYh8rzx5cmPOx7BinzxS7SCKEI5bk1fjK0yHfPX2swXxM/BxrqyijBcrbl4nlBTSGbfnhcPCZVcDcJHzGdW0xr7NQgiK5wH6egUcKa0dem2YlEFHg9+Yy/eneJ6sV0jfkZAnNoEJqp2zMArsyPLO4vf5zGtda12IzUNEQ0HEUMHR0soSliFNA8Baw3NIetqJOPROgQZqNobc4T41cN8vo5Fdc5WMGoYQr8vPqlSvhMuV4biKAMaXrAhnKHqoxoSCpCehwhy+4Heks2g6lpH+focmYO8QjuvEsmLj0QBw5WOKQQt2Bxfuq/9x3tEjcWqXZt9FnhjhvmArc1kTCHKToE7Fufb2hllI2KOoL64a41yUURvua6jcEKQv1A5QSH+b12uGsNGFE7my2mVAludEGWhJOUkEsuS57Wwa7GjpmBGGGb7LX+SvUPtUugGK7FRM/uzBnlao8DZxv9TjFmiR1RGNd/x/yocHtHPHGTQAj34oQ/t4HYXhywR1gdHhgvDSQnn57dKYYAOKqcKA1DK9gzs9pVwzbYg4dXG0M5wqgE/Kg3cW2QamhtaVoiHmpL+MSDK9EgE2Z/iWhKuGcvPNCDlmHHAVP0KyfvQpSRFs8Li4wBOMHPAqLW87D9e1ZqEPGE4JtRZX232IDf772u3WNq8oGpdNzrXaKEmAvtI2PMr2NaB/KRtUeHWsLy6pIQeBiBDcc1lhPM0PerFVaC9Y/SDyTNF5ucm0r6FDVG7Q2a+tii7Y/TvwW60i2kbRkVhdVJbK+rLL5M4pFHkuT2yZ252YfHNH818DUga83ai0OVz+GEpLnItboy6vbJmvJ67YFt2UJ4VIz7prx/j7AwFq4thoQXhZZ+qvEL2OLJXETJTiYe+S6aUKvwgpFkI12GgWq0O/Cdl8kUs1DDv20OQ7ucZP6GpblGQpSOSAR/bJvimFD3DVkKaU2/FmmvQrCXQUvDN4efN7N5IXohmLMHX8s9WmInUNHYjrtrGSA5zZidHEi7xMiWtlj29YCTcT9J X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
bQFCM3dGSQcNvyTeQ4Hm6ED00MoD/Z0BPYJH3Q31loB3CWAEaDTQqwL0n2aYwcKjKuQAyXzSj8oZhfWIyzj6MPqcZPtdhAotOJ+lTZiIp+ntuoi5N4sgikPOECBxDaSeivsZrdKXzF6Jv2bfE6uZltnjmdixPTShjYFn9VTBfeH+woM2o3+NiJPwikZAkTZdffen6JL5GYZFO7IGkixPeS3yR++puTyaLiv4zK/uQRUMNsVIDse4e3cO8u+kPxSkXDX7fMa0FMDFv6sKTkbACWJ6bxWyKxrgmtA+ZdP3URm6sNz3ypVVg90e1Umeq8JhLQpoGusKNqBsaG7gFuQeY5sJfWB//7YRhF+U4gu6n5bApG6jv3zvRgDAbPQHJIr7iI8gZ0/pgdSLnSYX6DnzeyToWyuPdSxrlI4up+iFFyolOWkrWH2BXfCOAs8nfpzvnd09S9x3vlwMK/bt8YF3Lb6aTzWz+PAEXJV4Xk+n6pfpSmNlIq7oPpKE3gcSagwsJeThQE27633xdylg+NWna48DSoUxMeDHP/aXZ7x/5epUky3vZN3bKS6bJ9p+7Lt85CF57an/aWGYZuB2jJ/N0987uzrcebRptMPMo7mc5anlO5mGj59Tol9u0G9k3+8Cpdy5VEy4f2zbAVkCd/oLkkeCwbchL2LpF23o90cZ9WW7HWzW4i9nATt1U0i98KIbH7hLv3n8mGT5Ma0e6Ig7WvCfxc8fqPuVrcpaRgIgT2IQGiL3APA6Q31+ILBIHsEE3q98bPn/eH+cmmQVmMwZ4QPCFmb3s6ZqA0kdNwqLcrwmGH9DBJDkS+0JZIgaiqKWKRznS0DVHFuvwxOXI4EMiirqsrvgXZ/jZMC0CnBGnZ2n1FrMQgIaEp+jja9Tki1Zbt8ugw736jLZ2sjGhao3Yvxd/aoDTPlDjaiHAT1A+lLF4yD7+lXGcDyKh8SeX8I8n70ilQrQYB3dtyEgfdXD9btGS1y7pRCbwH/qDU5qr4yYazKlsKKi9TsKGj8DnjhCO5PvtyYUmXTxL6QPqRq0YXhzG0zt+C5/PmTnKRDC+QDiRyOoEWvb/LqAUXx320Jq/jY1WSwC//HjIFuR/NXM7Po69cqz8M1iGnP2X39gv6BRudQbhxcwEaxuF1fjpo51e/tx+M3EGdWVCq3YEW4ZwzUfbk7/uAJ8iIBr7XL2poqFDzaH/vWidu7eP4LEqoA7kqWO+o0j89Drc8eC4KRUfx4rlLK0mRyNXi0Mrzh0LUzfQhYnRaUMb8pm51EJ8nu79CyNrSpYOC0RqSsp2jcJo8+agB5x39H+1NOQEhsfObkMiQ/c1SVYDOdtfLlvg8YaQ0HvWa6pm1I4r+2QoNj+Pxr6+zfT43uLrWTlC5qrDH+PL3JK8IBJqAJTTCuYZVJGHDwZCNt7YN54qi3M1t8Ki92NfK3fTxeF679LYuM3ZWgm53Yo+JHYxQEwsxUW2e7IG0AyBjmU9Q4FW+pmhewUi5UV4a0Bfhmofkidxr49MXsfWhbxJ1E/brY1xMgM7/5fDqpoAXgml1MJKcAt0aFoTuSYFA+dxSVtcYo96gwh0sAMmp0/sLmAATCyrKajGLGw X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 52a3c438-1eec-42bd-2df8-08dd2fdb155a X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:36.0343 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Pt8UW0DzqzXpgzuuFPu9IoVyN3BdDOLbjWQE5O0hkIvTJQUkCQ8F1v4O4WPg/yGT1SAzecq3mwnlp+h8Az/yOA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR02MB7502 X-Proofpoint-ORIG-GUID: SBv15s84lpS9M1PzrzJBIefrZcDBcM_S X-Proofpoint-GUID: SBv15s84lpS9M1PzrzJBIefrZcDBcM_S X-Authority-Analysis: v=2.4 cv=YLtlyQGx c=1 sm=1 tr=0 ts=677e6741 cx=c_pps a=ZeveGGZnxkNpWlA7A6AaFA==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=64Cc0HZtAAAA:8 a=nsq3sfgFcGYXhcUyotgA:9 a=Bg_hFbzUHdN-jkeYMkSU:22 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Commonize some initialization code shared by the legacy and iommufd vfio implementations (and later by vfio-user). 
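The shape of this refactoring can be sketched outside QEMU as follows. This is only an illustration under stated assumptions, not the code from the patch: the types `Device`, `Group`, `Container` and `DeviceInfo`, the flag `FLAG_RESET`, and the helper `prepare_device()` are hypothetical stand-ins, and plain counters replace the QLIST insertions. The point it shows is that one helper can serve both a group-based caller and a group-less caller by accepting a NULL group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU types; not the real definitions. */
#define FLAG_RESET 0x1u /* illustrative flag bit, assumed for this sketch */

typedef struct Group { int ndevices; } Group;
typedef struct Container { int ndevices; } Container;

typedef struct DeviceInfo {
    unsigned flags;
    unsigned num_regions;
    unsigned num_irqs;
} DeviceInfo;

typedef struct Device {
    Group *group;
    Container *container;
    unsigned flags;
    unsigned num_regions;
    unsigned num_irqs;
    bool reset_works;
} Device;

/*
 * Shared initialization for every backend. The group argument may be
 * NULL for backends that have no notion of groups.
 */
void prepare_device(Device *dev, Container *container, Group *group,
                    const DeviceInfo *info)
{
    assert(dev != NULL && container != NULL && info != NULL);

    dev->group = group;

    /* Copy the device description reported by the backend. */
    dev->num_irqs = info->num_irqs;
    dev->num_regions = info->num_regions;
    dev->flags = info->flags;
    dev->reset_works = (info->flags & FLAG_RESET) != 0;

    /* Register with the container, and with the group only if present. */
    dev->container = container;
    container->ndevices++;
    if (group != NULL) {
        group->ndevices++;
    }
}
```

In this sketch a group-based caller passes its group and a group-less caller passes NULL, so neither has to repeat the field copies or the registration steps.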
Signed-off-by: John Levon --- hw/vfio/common.c | 19 +++++++++++++++++++ hw/vfio/container.c | 14 +------------- hw/vfio/iommufd.c | 9 +-------- include/hw/vfio/vfio-common.h | 2 ++ 4 files changed, 23 insertions(+), 21 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index cb299fc3bf..a8243c0c58 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1551,6 +1551,25 @@ retry: return info; } +void vfio_prepare_device(VFIODevice *vbasedev, VFIOContainerBase *bcontainer, + VFIOGroup *group, struct vfio_device_info *info) +{ + vbasedev->group = group; + + vbasedev->num_irqs = info->num_irqs; + vbasedev->num_regions = info->num_regions; + vbasedev->flags = info->flags; + vbasedev->reset_works = !!(info->flags & VFIO_DEVICE_FLAGS_RESET); + + vbasedev->bcontainer = bcontainer; + QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); + if (group) { + QLIST_INSERT_HEAD(&group->device_list, vbasedev, next); + } + + QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); +} + bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name, VFIODevice *vbasedev, AddressSpace *as, Error **errp) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 39c77d402c..b1a58b0579 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -878,17 +878,11 @@ static bool vfio_get_device(VFIOGroup *group, const char *name, } vbasedev->fd = fd; - vbasedev->group = group; - QLIST_INSERT_HEAD(&group->device_list, vbasedev, next); - vbasedev->num_irqs = info->num_irqs; - vbasedev->num_regions = info->num_regions; - vbasedev->flags = info->flags; + vfio_prepare_device(vbasedev, &group->container->bcontainer, group, info); trace_vfio_get_device(name, info->flags, info->num_regions, info->num_irqs); - vbasedev->reset_works = !!(info->flags & VFIO_DEVICE_FLAGS_RESET); - return true; } @@ -941,7 +935,6 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev, int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice 
*vbasedev_iter; VFIOGroup *group; - VFIOContainerBase *bcontainer; if (groupid < 0) { return false; @@ -970,11 +963,6 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev, return false; } - bcontainer = &group->container->bcontainer; - vbasedev->bcontainer = bcontainer; - QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); - QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); - return true; } diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 39c2b802b0..ef0a7f8ead 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -603,14 +603,7 @@ found_container: iommufd_cdev_ram_block_discard_disable(false); } - vbasedev->group = 0; - vbasedev->num_irqs = dev_info.num_irqs; - vbasedev->num_regions = dev_info.num_regions; - vbasedev->flags = dev_info.flags; - vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET); - vbasedev->bcontainer = bcontainer; - QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); - QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); + vfio_prepare_device(vbasedev, bcontainer, NULL, &dev_info); trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev->num_irqs, vbasedev->num_regions, vbasedev->flags); diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 387854cb0b..da2c5947c4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -250,6 +250,8 @@ void vfio_reset_handler(void *opaque); struct vfio_device_info *vfio_get_device_info(int fd); bool vfio_device_is_mdev(VFIODevice *vbasedev); bool vfio_device_hiod_realize(VFIODevice *vbasedev, Error **errp); +void vfio_prepare_device(VFIODevice *vbasedev, VFIOContainerBase *bcontainer, + VFIOGroup *group, struct vfio_device_info *info); bool vfio_attach_device(char *name, VFIODevice *vbasedev, AddressSpace *as, Error **errp); bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name, From patchwork Wed Jan 8 11:50:12 2025 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930726 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6804EE7719A for ; Wed, 8 Jan 2025 11:59:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdX-0006UR-0K; Wed, 08 Jan 2025 06:54:11 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUd7-0006PY-Fx for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:51 -0500 Received: from mx0b-002c1b01.pphosted.com ([148.163.155.12]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUd1-0002DL-LL for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:43 -0500 Received: from pps.filterd (m0127844.ppops.net [127.0.0.1]) by mx0b-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50885pFH021537; Wed, 8 Jan 2025 03:53:38 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=kN0ixm/yOIeHZmtYT4GwTKVGIhlQbWPPkE+BqxBpp z4=; b=zS3VV02oz7Af6fy17z5LfkpZVeiAHvhero9Wj13NWUopeO+BtKTu3bO8V HfdYpk6f8ho+9QhkWWRE2XBi+V/8WcxE4VoRpAgbznpWTaDHjrq942U1MaBcf1XT oCqKxR8FUXh5hRuKyjaAluei/QVdeC2dESMdPcT/sX+kk8wiODSp0cV1t3B9aiy1 /TOfd5fEEFpMb59HJ4zSqhG8bI4CcyTzgYaILZLbBoY1joJFzahzj1lhmLHLUsox 8tB8ywEo2LSY7J5WduSe2tORhqNlM45i1ju20wl2x9Lw1NqsM6RjFsyNVNo2jM6a CT7xwSIZzGlaC5radoOhnjhBmW93w== Received: from 
nam04-bn8-obe.outbound.protection.outlook.com (mail-bn8nam04lp2040.outbound.protection.outlook.com [104.47.74.40]) by mx0b-002c1b01.pphosted.com (PPS) with ESMTPS id 43y5d3rs7x-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=K3iYr5tec7me94TAndoM9Cf96zHO0ftal7ANMKlqzzC16Pgar+1KkJdQ7yyflu5AavRyCbgUssSQYwYuQdSqFzj3mQG2LvHFBfTVe1W3K/hQ9xed05kwJCMBH02Rm9ovrpEkL5u/8LF9rCBvtjzkdrotahmlgHruOQaYCrt2RSOR3LUoC2A9kAe0qdNwTwkw/IME9WCGlgo0alZw88qqEeH8eaCa2vTMI8nAkkshFn1iTBbcc5jsvEYDQ9WCMFcQe/QcyIpFv8fn8R/Gi4X1Wy4ye0eUmiF1DEfn+R73zjJaD+v/cOKxNqOjiBo7AkVFxM/gKbAfVmEamktZASvFdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=kN0ixm/yOIeHZmtYT4GwTKVGIhlQbWPPkE+BqxBppz4=; b=lAmLq1mi7z1uv1mD2cDAIU/a0tejB2JzkaoXJKYzlg9sX7syAXbHVsL5enf+NWgpgTblxip3jXBJt4whhOCgBflV9JrlTZtvzTjaGEczuYgE9vbQdlBBZyEmZ158M9MxX7JgG0CtyLina2wPmpML6WQ/RbVWDKhYz0C/d2F8uUvMPTBxyr+N9FlqNwrW4tXvf50A0xgcitBN5aT3GKYXT36Sz5g4WOrORbvQqbyqGvFJIDARxB2NvE747sTLqm2x7Fh/0vpjpmuIh4pc+4EemQ526rGVoDR+8LAP8bKaioSceVtQ2F0gntpK81CoVUTqenfNClTEHJaS2KknY5ihKQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=kN0ixm/yOIeHZmtYT4GwTKVGIhlQbWPPkE+BqxBppz4=; 
b=f7iB8GqZlPgq65SPhs6knhbryIzLBDF32WSiFnIDo6T6COlQ+cpRGt88Wp0oB3O7wSbJxRTW1wGK4O3djNmYhnpaQ2NuYCMo1cn9ziWfooQybN4uQaRv/EdncgCKxyJxwNsNXqkPQtaJ+tcLDJ6IYDM9LhltT46f2O3gTbfJCdqGlUwmCjzkW0db6MWxeDGEOBLMGRRV2xV0Q9qolzy58Io2YbI9hsZqx6KAkGuMRKNq0kvDHSDbF5Xn1+LcAduQZI3yVU4u9js/7W2KzMRTOxWqe199ZlZtUhrMmgoS1mKawsRakyxUEGeE/cjiUK+vC2wmFu+RFfU3+BYGw94hQA== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by SJ0PR02MB7502.namprd02.prod.outlook.com (2603:10b6:a03:2a2::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.11; Wed, 8 Jan 2025 11:53:37 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:37 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 06/26] vfio: add region cache Date: Wed, 8 Jan 2025 11:50:12 +0000 Message-Id: <20250108115032.1677686-7-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|SJ0PR02MB7502:EE_ X-MS-Office365-Filtering-Correlation-Id: 3bad6733-e019-454d-d533-08dd2fdb161a x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
mYpb3b1VU1++G6HMDUKy3NAEclLHdPoYvoGJhDPkcRlVz+tge2wEjDIaG9cxrJu9qsp0N+d9021PS1X0LIWrMg069OnZb6Q7ZIEr+B5LPUGJICZ2WYtBK6/doAOx5Dpdpd3MwowzgrKBysd9EQ+GcShNVhcOWKmAESzpZOeS2qj/XauplcI25hE8Wwz8G4AJme4Y1z4SJVFVJ/y9oBJVgPtT2dC11r8U6Yj4tLdFXlSsZBVJZo4KGA5fwxAgTMZZiWk+A2T0RVCrZa3F2/wP2hKfRq2iiCBdolUWDrlTbFpeyVKIMcFSHmtfOfuIT2nX5pLtucZXBIhPTfNkw0PIf15UkBdfJKmVEn1jVwc0LrHxa6YUA8vImfJkm8h5E+QAiBF5ZIzv7TCvlj4LfsvlU9rpO5M1i9+zk7+D8yj1GsbDnV5+keBVgSyg53yleLZyF7Qrwf6E6KXQAQRSe5S4+vnLHy2m1Zo5DVovw5KVy3rcHO+4So9Sbm+U2qokmp16xqGD0HtAid71CuZvnWQjlmQN3/qiGD2ewOgGxIBwiDZJKwm5zaHSd16DQ5E6ZgMNLor8LVaLXjh+4wJJaSEVHXt8aYsNDOVCTzvbFdcSFaN2xu9ReoHHXT8KI0XwxWXLTJAv7zlKjjmxsIICPuYFs2ra5dLvDp/Tz7l6SkHaiqWF+a01x6hOfnypbPUfBuZ9KE47UT7EuWS+XkCRkNDGh0hVEJ0nQlpZv/yTUh5AtUjaGT0zjxYGWl+zLLmvPiJpZVeT1laIQh4B7qpl3OoYcBPYCutxBw4HOF6y3IxOxjdM1k5atHt2q5y8e8WDatOmCrtj2Usx0/yTG8tza5xoxsMaHm3C2M/F62ejCPAQpzMU0INJlbgjB60YCijgMzTpjBjjiI+4r1VKAfLpfy1JfBwrh4TAxF1ojqq7XLsCBScVPup7UU4VjT9Wlpq9s+HMRoa72BfGvSxVEVRYwzRC2EdTAurK0NbrJlejo2TZmKWTqHmOIXckF5PNH3y4aSIwEmBPg43deb7Fj0RI3PRQFNmuFMiF93rcETVuGcOhqu7ZenLChCfvel1Rwj1HmDQPmCb38C7NWG6+ajRbFYYwdRPZtCYjAZ0LEqP5m+k7tiqEUal9EjvuWlxdKQcpduscSb3scHdLTls2iA5FaNwUhzOVRMuNdGcDuQfpyXAtcNQ8VZlzIXhJ2My4dNpR6FeNR3Z7s6+nfbQ/6kqKoIm8p5tAOs8n5M498POieLAHhvffL6sNNUHtAk+maCcM8Tk8mEnHKMklYrKTD9ZtuSxU7lW5L+cEW/d4Xpf4UyITdFJonvesmBocVxesP2JdVj2Wn073ebXOY8gRM2fZs8ymU9ly94daFu9z/qNDM2wPhkK4UEDI4E8vrgDE2UVa1meK X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
AVsWUVevDQaF0QE7aU9ECdwOeT9dG4yvjigbs/vjZt4x3X/1l5x89YdOahukuPVR68Ry1OgNI2rVI+c0au2T1fGWcJsuE8ub9JIcbS3urs+lcX7JHyi0YWPiS2dSOC+I714+g327SoLnjsEHwp29SW1bJD/H8Y6G/oPN0Bw8WbVxHYsVmh5Eil9blcx07hnSUAAjEGUuHVrphsRERtVY4570MNrO8VY8KCEchtimbT0sZ9jUKuxFvYHNTGC5nxdnuRiMkKBQ8V45gctn6g7aUP7MVPQNKqybJaAdc8bInt0lpVeIzpu559iHDZ7MMd2zyswJmkSUxQb21KeSjZAB7WVj7zXKtbrHnifXrBHQo9WfgcZZYCe9D45eYQlFzBZAyxXUBGl+5agy4XmYoUYtkIotrV2ThST79h7LN87O/XGhOK9XFkDIu/6B9Yf48wPMHjPbkaFs3ETg7+NSJIVg87DOHtVTTUYmS8qGWbDQNSkHiBJ/MjQTKRVMw6vvHyIBUXfynuFeNMH31x7jYwVGNO/8Yh0mIRK2xWdm/5r5qcKsSgq8YYLc/xgdt0ph3wvOlEOBxzJPTuTxleAbTgFMovFmnW3CXlRfsk6ac01RbJfwD4nvxwsdKqQyDMS6vjLTPNv/bffWdLg8EVDqcLhNWqnFxNS9sBO83SlR9yGAvQ+G/MTcpebLenQcHdpjtILh/NNkNhiT5Sv9CxFw+CEAt2l2P+05dLZ0zyXQAUWvEMoaIPBSzrpXmtURMdkjA2CQO6aSyOjZbSLl+YnYec9PL7AIwv4UBYUIkQzpDZWqTl6IN+D5T6dsg9moodggLprBrfTunmjgdfMrTprEyoOlI8NEXSzPBGDjmi4JQSDQTRcMqbjIq8cScNtpYjK3qsfR7m/fEb+qCwvcpLHC0xba8zXqLrqzBicdyJCUJ5BYswHygATdsJADxS41nzktvOnOd51aa2Y/CuYWY9E0B3fImsZPcxVtqZLOoGmT26H7WyB382DedrO1jAJR66BC6WVFRwftWvB6KZF5RgsT45nzxylzkOPcl9PxgLot/DZoQ5ngabTSfj4yaLn5RndkRTmKcPHPnhyGfxDgMpRQmp1cNs79xewoBIsF2OvOFOdpaZs6YUiE9Tx8rHrA50hKtAbYNu06g28bg8cvlymJKN5YFYDB8/l3FA8FIVDHv0r5QiNdz0MPHkEkPmRJbkuBjNJQr9RYKD4rqJREOITI52D2JUl0nWP+V7qynqxD6CRYRvVfdFhv5thO8nt24GaiptMh4qFC8g008GZK62L/sT1jIyx0iL9/nL76wvXSXiaB5kfA3FgbPJaOczK8CTKCf3CT8qgni6CBTQfQVGMXzb2PK4q9g/jgkTtzc4Br+tAzNQPZT5w2IMcLjR/Q7VrsQ3isw0ODJenZ8qEgGR/AOIxDxu/AD/7MmM9xc8v1oaM9oDcWZdoQoxVDMOMOnJQ8gR/rwK3blFtSQuEcPg7cSxviRwy1aRl/3NaEG/KMOU+lq38wf3v3vYIPcACfTBgz86HNzz8/i430/wTpfA7OsLZIOtXXjqDMxhz4HHEMZOoCGds9zWzJR/E+fJllpp2EuJiJ X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 3bad6733-e019-454d-d533-08dd2fdb161a X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:37.2790 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: XLYvTcXJ3CcxnyBa5fxKh9zHTwbC7HCbWnHOU7AReFyvPMGfgKIGXttXQQ/nbNl9z7SIdFqWOX8AKNxbqD4Jgw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR02MB7502 X-Proofpoint-ORIG-GUID: yxi4PcqsKG0K1wbJaPezkV6qgFzvCIpx X-Proofpoint-GUID: yxi4PcqsKG0K1wbJaPezkV6qgFzvCIpx X-Authority-Analysis: v=2.4 cv=YLtlyQGx c=1 sm=1 tr=0 ts=677e6742 cx=c_pps a=ZeveGGZnxkNpWlA7A6AaFA==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=FNYyX2e1CiGw8cD_ergA:9 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman cache VFIO_DEVICE_GET_REGION_INFO results to reduce memory alloc/free cycles and as prep work for vfio-user Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: 
John Levon --- hw/vfio/ccw.c | 5 ----- hw/vfio/common.c | 12 ++++++++++++ hw/vfio/container.c | 10 ++++++++++ hw/vfio/helpers.c | 21 ++++++++++++++++----- hw/vfio/igd.c | 8 ++++---- hw/vfio/pci.c | 8 ++++---- include/hw/vfio/vfio-common.h | 1 + 7 files changed, 47 insertions(+), 18 deletions(-) diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 67bc137f9b..22378d50bc 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -510,7 +510,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) vcdev->io_region_offset = info->offset; vcdev->io_region = g_malloc0(info->size); - g_free(info); /* check for the optional async command region */ ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -523,7 +522,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->async_cmd_region_offset = info->offset; vcdev->async_cmd_region = g_malloc0(info->size); - g_free(info); } ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -536,7 +534,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->schib_region_offset = info->offset; vcdev->schib_region = g_malloc(info->size); - g_free(info); } ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -550,7 +547,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->crw_region_offset = info->offset; vcdev->crw_region = g_malloc(info->size); - g_free(info); } return true; @@ -560,7 +556,6 @@ out_err: g_free(vcdev->schib_region); g_free(vcdev->async_cmd_region); g_free(vcdev->io_region); - g_free(info); return false; } diff --git a/hw/vfio/common.c b/hw/vfio/common.c index a8243c0c58..c0a6263678 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1551,6 +1551,16 @@ retry: return info; } +static void vfio_get_all_regions(VFIODevice *vbasedev) +{ + struct vfio_region_info *info; + int i; + + for (i = 0; i < vbasedev->num_regions; i++) { + vfio_get_region_info(vbasedev, i, &info); + } +} + void vfio_prepare_device(VFIODevice 
*vbasedev, VFIOContainerBase *bcontainer, VFIOGroup *group, struct vfio_device_info *info) { @@ -1568,6 +1578,8 @@ void vfio_prepare_device(VFIODevice *vbasedev, VFIOContainerBase *bcontainer, } QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); + + vfio_get_all_regions(vbasedev); } bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index b1a58b0579..e0fd5a153b 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -888,6 +888,16 @@ static bool vfio_get_device(VFIOGroup *group, const char *name, static void vfio_put_base_device(VFIODevice *vbasedev) { + if (vbasedev->regions != NULL) { + int i; + + for (i = 0; i < vbasedev->num_regions; i++) { + g_free(vbasedev->regions[i]); + } + g_free(vbasedev->regions); + vbasedev->regions = NULL; + } + if (!vbasedev->group) { return; } diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 913796f437..a8951176b8 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -344,7 +344,7 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region, int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, int index, const char *name) { - g_autofree struct vfio_region_info *info = NULL; + struct vfio_region_info *info = NULL; int ret; ret = vfio_get_region_info(vbasedev, index, &info); @@ -561,6 +561,17 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, { size_t argsz = sizeof(struct vfio_region_info); + /* create region cache */ + if (vbasedev->regions == NULL) { + vbasedev->regions = g_new0(struct vfio_region_info *, + vbasedev->num_regions); + } + /* check cache */ + if (vbasedev->regions[index] != NULL) { + *info = vbasedev->regions[index]; + return 0; + } + *info = g_malloc0(argsz); (*info)->index = index; @@ -580,6 +591,9 @@ retry: goto retry; } + /* fill cache */ + vbasedev->regions[index] = *info; + return 0; } @@ -598,7 +612,6 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, 
hdr = vfio_get_region_info_cap(*info, VFIO_REGION_INFO_CAP_TYPE); if (!hdr) { - g_free(*info); continue; } @@ -610,8 +623,6 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, if (cap_type->type == type && cap_type->subtype == subtype) { return 0; } - - g_free(*info); } *info = NULL; @@ -620,7 +631,7 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type) { - g_autofree struct vfio_region_info *info = NULL; + struct vfio_region_info *info = NULL; bool ret = false; if (!vfio_get_region_info(vbasedev, region, &info)) { diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c index 0740a5dd8c..d2f9300e9a 100644 --- a/hw/vfio/igd.c +++ b/hw/vfio/igd.c @@ -553,10 +553,10 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr) void vfio_probe_igd_bar4_quirk(VFIOPCIDevice *vdev, int nr) { - g_autofree struct vfio_region_info *rom = NULL; - g_autofree struct vfio_region_info *opregion = NULL; - g_autofree struct vfio_region_info *host = NULL; - g_autofree struct vfio_region_info *lpc = NULL; + struct vfio_region_info *rom = NULL; + struct vfio_region_info *opregion = NULL; + struct vfio_region_info *host = NULL; + struct vfio_region_info *lpc = NULL; VFIOQuirk *quirk; VFIOIGDQuirk *igd; PCIDevice *lpc_bridge; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 17080b9dc0..8e6f20b3ad 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -879,7 +879,7 @@ static void vfio_update_msi(VFIOPCIDevice *vdev) static void vfio_pci_load_rom(VFIOPCIDevice *vdev) { - g_autofree struct vfio_region_info *reg_info = NULL; + struct vfio_region_info *reg_info = NULL; uint64_t size; off_t off = 0; ssize_t bytes; @@ -2666,7 +2666,7 @@ static VFIODeviceOps vfio_pci_ops = { bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp) { VFIODevice *vbasedev = &vdev->vbasedev; - g_autofree struct vfio_region_info *reg_info = NULL; + struct vfio_region_info *reg_info = NULL; int ret; ret = 
vfio_get_region_info(vbasedev, VFIO_PCI_VGA_REGION_INDEX, &reg_info); @@ -2731,7 +2731,7 @@ bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp) static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) { VFIODevice *vbasedev = &vdev->vbasedev; - g_autofree struct vfio_region_info *reg_info = NULL; + struct vfio_region_info *reg_info = NULL; struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) }; int i, ret = -1; @@ -3135,7 +3135,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) if (!vdev->igd_opregion && vdev->features & VFIO_FEATURE_ENABLE_IGD_OPREGION) { - g_autofree struct vfio_region_info *opregion = NULL; + struct vfio_region_info *opregion = NULL; if (vdev->pdev.qdev.hotplugged) { error_setg(errp, diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index da2c5947c4..59348b81aa 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -151,6 +151,7 @@ typedef struct VFIODevice { IOMMUFDBackend *iommufd; VFIOIOASHwpt *hwpt; QLIST_ENTRY(VFIODevice) hwpt_next; + struct vfio_region_info **regions; } VFIODevice; struct VFIODeviceOps { From patchwork Wed Jan 8 11:50:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 37AEDE77199 for ; Wed, 8 Jan 2025 11:58:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdn-0006Xc-9h; Wed, 08 Jan 2025 06:54:27 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 
4.90_1) (envelope-from ) id 1tVUd9-0006Pa-DT for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:52 -0500 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 07/26] vfio: add VFIO base abstract class Date: Wed, 8 Jan 2025 11:50:13 +0000 Message-Id: <20250108115032.1677686-8-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> MIME-Version: 1.0 X-Spam_action: no 
action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Add an abstract base class both the kernel driver and user socket implementations can use to share code. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/pci.c | 107 ++++++++++++++++++++++++++++++-------------------- hw/vfio/pci.h | 16 +++++++- 2 files changed, 79 insertions(+), 44 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 8e6f20b3ad..bb0d26915b 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -239,7 +239,7 @@ static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route) static void vfio_intx_routing_notifier(PCIDevice *pdev) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); PCIINTxRoute route; if (vdev->interrupt != VFIO_INT_INTx) { @@ -514,7 +514,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg, static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, MSIMessage *msg, IOHandler *handler) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIOMSIVector *vector; int ret; bool resizing = !!(vdev->nr_vectors < nr + 1); @@ -619,7 +619,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev, static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIOMSIVector *vector = &vdev->msi_vectors[nr]; trace_vfio_msix_vector_release(vdev->vbasedev.name, nr); @@ -1168,7 +1168,7 @@ static const MemoryRegionOps vfio_vga_ops = { */ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar) { - VFIOPCIDevice *vdev = 
VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIORegion *region = &vdev->bars[bar].region; MemoryRegion *mmap_mr, *region_mr, *base_mr; PCIIORegion *r; @@ -1214,7 +1214,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar) */ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val; memcpy(&emu_bits, vdev->emulated_config_bits + addr, len); @@ -1247,7 +1247,7 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr, uint32_t val, int len) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); uint32_t val_le = cpu_to_le32(val); trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len); @@ -2961,7 +2961,7 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev) static void vfio_realize(PCIDevice *pdev, Error **errp) { ERRP_GUARD(); - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIODevice *vbasedev = &vdev->vbasedev; int i, ret; char uuid[UUID_STR_LEN]; @@ -3251,7 +3251,7 @@ error: static void vfio_instance_finalize(Object *obj) { - VFIOPCIDevice *vdev = VFIO_PCI(obj); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); vfio_display_finalize(vdev); vfio_bars_finalize(vdev); @@ -3269,7 +3269,7 @@ static void vfio_instance_finalize(Object *obj) static void vfio_exitfn(PCIDevice *pdev) { - VFIOPCIDevice *vdev = VFIO_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIODevice *vbasedev = &vdev->vbasedev; vfio_unregister_req_notifier(vdev); @@ -3293,7 +3293,7 @@ static void vfio_exitfn(PCIDevice *pdev) static void vfio_pci_reset(DeviceState *dev) { - VFIOPCIDevice *vdev = VFIO_PCI(dev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev); trace_vfio_pci_reset(vdev->vbasedev.name); @@ -3333,7 +3333,7 @@ post_reset: static void 
vfio_instance_init(Object *obj) { PCIDevice *pci_dev = PCI_DEVICE(obj); - VFIOPCIDevice *vdev = VFIO_PCI(obj); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); VFIODevice *vbasedev = &vdev->vbasedev; device_add_bootindex_property(obj, &vdev->bootindex, @@ -3354,28 +3354,15 @@ static void vfio_instance_init(Object *obj) pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; } -static const Property vfio_pci_dev_properties[] = { - DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), - DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), - DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev), +static const Property vfio_pci_base_dev_properties[] = { DEFINE_PROP_ON_OFF_AUTO("x-pre-copy-dirty-page-tracking", VFIOPCIDevice, vbasedev.pre_copy_dirty_page_tracking, ON_OFF_AUTO_ON), DEFINE_PROP_ON_OFF_AUTO("x-device-dirty-page-tracking", VFIOPCIDevice, vbasedev.device_dirty_page_tracking, ON_OFF_AUTO_ON), - DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice, - display, ON_OFF_AUTO_OFF), - DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0), - DEFINE_PROP_UINT32("yres", VFIOPCIDevice, display_yres, 0), DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice, intx.mmap_timeout, 1100), - DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features, - VFIO_FEATURE_ENABLE_VGA_BIT, false), - DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, - VFIO_FEATURE_ENABLE_REQ_BIT, true), - DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features, - VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, vbasedev.enable_migration, ON_OFF_AUTO_AUTO), DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, @@ -3386,8 +3373,6 @@ static const Property vfio_pci_dev_properties[] = { DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false), DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false), DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false), - DEFINE_PROP_BOOL("x-no-geforce-quirks", 
VFIOPCIDevice, - no_geforce_quirks, false), DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd, false), DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd, @@ -3398,12 +3383,57 @@ static const Property vfio_pci_dev_properties[] = { sub_vendor_id, PCI_ANY_ID), DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, sub_device_id, PCI_ANY_ID), + DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo, + OFF_AUTO_PCIBAR_OFF), +}; + + +static void vfio_pci_base_dev_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass); + + device_class_set_props(dc, vfio_pci_base_dev_properties); + dc->desc = "VFIO PCI base device"; + set_bit(DEVICE_CATEGORY_MISC, dc->categories); + pdc->exit = vfio_exitfn; + pdc->config_read = vfio_pci_read_config; + pdc->config_write = vfio_pci_write_config; +} + +static const TypeInfo vfio_pci_base_dev_info = { + .name = TYPE_VFIO_PCI_BASE, + .parent = TYPE_PCI_DEVICE, + .instance_size = 0, + .abstract = true, + .class_init = vfio_pci_base_dev_class_init, + .interfaces = (InterfaceInfo[]) { + { INTERFACE_PCIE_DEVICE }, + { INTERFACE_CONVENTIONAL_PCI_DEVICE }, + { } + }, +}; + +static const Property vfio_pci_dev_properties[] = { + DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), + DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), + DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev), + DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice, + display, ON_OFF_AUTO_OFF), + DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0), + DEFINE_PROP_UINT32("yres", VFIOPCIDevice, display_yres, 0), + DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_VGA_BIT, false), + DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_REQ_BIT, true), + DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), + 
DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice, + no_geforce_quirks, false), DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice, nv_gpudirect_clique, qdev_prop_nv_gpudirect_clique, uint8_t), - DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo, - OFF_AUTO_PCIBAR_OFF), #ifdef CONFIG_IOMMUFD DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd, TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), @@ -3414,7 +3444,8 @@ static const Property vfio_pci_dev_properties[] = { #ifdef CONFIG_IOMMUFD static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) { - vfio_device_set_fd(&VFIO_PCI(obj)->vbasedev, str, errp); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); + vfio_device_set_fd(&vdev->vbasedev, str, errp); } #endif @@ -3429,25 +3460,16 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd); #endif dc->desc = "VFIO-based PCI device assignment"; - set_bit(DEVICE_CATEGORY_MISC, dc->categories); pdc->realize = vfio_realize; - pdc->exit = vfio_exitfn; - pdc->config_read = vfio_pci_read_config; - pdc->config_write = vfio_pci_write_config; } static const TypeInfo vfio_pci_dev_info = { .name = TYPE_VFIO_PCI, - .parent = TYPE_PCI_DEVICE, - .instance_size = sizeof(VFIOPCIDevice), + .parent = TYPE_VFIO_PCI_BASE, + .instance_size = sizeof(VFIOKernelPCIDevice), .class_init = vfio_pci_dev_class_init, .instance_init = vfio_instance_init, .instance_finalize = vfio_instance_finalize, - .interfaces = (InterfaceInfo[]) { - { INTERFACE_PCIE_DEVICE }, - { INTERFACE_CONVENTIONAL_PCI_DEVICE }, - { } - }, }; static const Property vfio_pci_dev_nohotplug_properties[] = { @@ -3467,12 +3489,13 @@ static void vfio_pci_nohotplug_dev_class_init(ObjectClass *klass, void *data) static const TypeInfo vfio_pci_nohotplug_dev_info = { .name = TYPE_VFIO_PCI_NOHOTPLUG, .parent = TYPE_VFIO_PCI, - .instance_size = 
sizeof(VFIOPCIDevice), + .instance_size = sizeof(VFIOKernelPCIDevice), .class_init = vfio_pci_nohotplug_dev_class_init, }; static void register_vfio_pci_dev_type(void) { + type_register_static(&vfio_pci_base_dev_info); type_register_static(&vfio_pci_dev_info); type_register_static(&vfio_pci_nohotplug_dev_info); } diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index 43c166680a..8e79740ddb 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -116,8 +116,13 @@ typedef struct VFIOMSIXInfo { bool noresize; } VFIOMSIXInfo; -#define TYPE_VFIO_PCI "vfio-pci" -OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI) +/* + * TYPE_VFIO_PCI_BASE is an abstract type used to share code + * between VFIO implementations that use a kernel driver + * with those that use user sockets. + */ +#define TYPE_VFIO_PCI_BASE "vfio-pci-base" +OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI_BASE) struct VFIOPCIDevice { PCIDevice pdev; @@ -182,6 +187,13 @@ struct VFIOPCIDevice { Notifier irqchip_change_notifier; }; +#define TYPE_VFIO_PCI "vfio-pci" +OBJECT_DECLARE_SIMPLE_TYPE(VFIOKernelPCIDevice, VFIO_PCI) + +struct VFIOKernelPCIDevice { + VFIOPCIDevice device; +}; + /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device) { From patchwork Wed Jan 8 11:50:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 737B0E7719B for ; Wed, 8 Jan 2025 11:59:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1tVUdO-0006SC-79; Wed, 08 Jan 2025 06:54:02 -0500 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 08/26] vfio: add device IO ops vector Date: Wed, 8 Jan 2025 11:50:14 +0000 Message-Id: <20250108115032.1677686-9-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> MIME-Version: 1.0 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Used for communication with VFIO driver (prep work for vfio-user, which will communicate over a socket) Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: 
John Levon --- hw/vfio/ap.c | 2 +- hw/vfio/ccw.c | 2 +- hw/vfio/common.c | 7 +- hw/vfio/helpers.c | 100 +++++++++++++++++++++++-- hw/vfio/pci.c | 137 +++++++++++++++++++++------------- hw/vfio/platform.c | 2 +- include/hw/vfio/vfio-common.h | 27 ++++++- 7 files changed, 211 insertions(+), 66 deletions(-) diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 30b08ad375..1adce1ab40 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -228,7 +228,7 @@ static void vfio_ap_instance_init(Object *obj) * handle ram_block_discard_disable(). */ vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_AP, &vfio_ap_ops, - DEVICE(vapdev), true); + &vfio_dev_io_ioctl, DEVICE(vapdev), true); /* AP device is mdev type device */ vbasedev->mdev = true; diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 22378d50bc..8c16648819 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -682,7 +682,7 @@ static void vfio_ccw_instance_init(Object *obj) * ram_block_discard_disable(). */ vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_CCW, &vfio_ccw_ops, - DEVICE(vcdev), true); + &vfio_dev_io_ioctl, DEVICE(vcdev), true); } #ifdef CONFIG_IOMMUFD diff --git a/hw/vfio/common.c b/hw/vfio/common.c index c0a6263678..edc1efc251 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -953,7 +953,7 @@ static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer) continue; } - if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) { + if (vbasedev->io->device_feature(vbasedev, feature)) { warn_report("%s: Failed to stop DMA logging, err %d (%s)", vbasedev->name, -errno, strerror(errno)); } @@ -1056,9 +1056,8 @@ static bool vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer, continue; } - ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature); + ret = vbasedev->io->device_feature(vbasedev, feature); if (ret) { - ret = -errno; error_setg_errno(errp, errno, "%s: Failed to start DMA logging", vbasedev->name); goto out; @@ -1137,7 +1136,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova, 
feature->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT; - if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) { + if (vbasedev->io->device_feature(vbasedev, feature)) { return -errno; } diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index a8951176b8..529520c1d6 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -43,7 +43,7 @@ void vfio_disable_irqindex(VFIODevice *vbasedev, int index) .count = 0, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + vbasedev->io->set_irqs(vbasedev, &irq_set); } void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index) @@ -56,7 +56,7 @@ void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index) .count = 1, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + vbasedev->io->set_irqs(vbasedev, &irq_set); } void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index) @@ -69,7 +69,7 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index) .count = 1, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + vbasedev->io->set_irqs(vbasedev, &irq_set); } static inline const char *action_to_str(int action) @@ -116,6 +116,7 @@ bool vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex, int argsz; const char *name; int32_t *pfd; + int ret; argsz = sizeof(*irq_set) + sizeof(*pfd); @@ -128,7 +129,9 @@ bool vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex, pfd = (int32_t *)&irq_set->data; *pfd = fd; - if (!ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) { + ret = vbasedev->io->set_irqs(vbasedev, irq_set); + + if (!ret) { return true; } @@ -160,6 +163,7 @@ void vfio_region_write(void *opaque, hwaddr addr, uint32_t dword; uint64_t qword; } buf; + int ret; switch (size) { case 1: @@ -179,7 +183,8 @@ void vfio_region_write(void *opaque, hwaddr addr, break; } - if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) { + ret = vbasedev->io->region_write(vbasedev, region->nr, addr, size, &buf); + if (ret != size) { 
error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64 ",%d) failed: %m", __func__, vbasedev->name, region->nr, @@ -211,8 +216,10 @@ uint64_t vfio_region_read(void *opaque, uint64_t qword; } buf; uint64_t data = 0; + int ret; - if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) { + ret = vbasedev->io->region_read(vbasedev, region->nr, addr, size, &buf); + if (ret != size) { error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m", __func__, vbasedev->name, region->nr, addr, size); @@ -560,6 +567,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, struct vfio_region_info **info) { size_t argsz = sizeof(struct vfio_region_info); + int ret; /* create region cache */ if (vbasedev->regions == NULL) { @@ -578,7 +586,8 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, retry: (*info)->argsz = argsz; - if (ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, *info)) { + ret = vbasedev->io->get_region_info(vbasedev, *info); + if (ret != 0) { g_free(*info); *info = NULL; return -errno; @@ -688,11 +697,12 @@ void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp) } void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops, - DeviceState *dev, bool ram_discard) + VFIODeviceIO *io, DeviceState *dev, bool ram_discard) { vbasedev->type = type; vbasedev->ops = ops; vbasedev->dev = dev; + vbasedev->io = io; vbasedev->fd = -1; vbasedev->ram_block_discard_allowed = ram_discard; @@ -739,3 +749,77 @@ bool vfio_device_hiod_realize(VFIODevice *vbasedev, Error **errp) return HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp); } + +/* + * Traditional ioctl() based io + */ + +static int vfio_io_device_feature(VFIODevice *vbasedev, + struct vfio_device_feature *feature) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature); + + return ret < 0 ? 
-errno : ret; +} + +static int vfio_io_get_region_info(VFIODevice *vbasedev, + struct vfio_region_info *info) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_get_irq_info(VFIODevice *vbasedev, + struct vfio_irq_info *info) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, info); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_set_irqs(VFIODevice *vbasedev, struct vfio_irq_set *irqs) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off, + uint32_t size, void *data) +{ + struct vfio_region_info *info = vbasedev->regions[index]; + int ret; + + ret = pread(vbasedev->fd, data, size, info->offset + off); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off, + uint32_t size, void *data) +{ + struct vfio_region_info *info = vbasedev->regions[index]; + int ret; + + ret = pwrite(vbasedev->fd, data, size, info->offset + off); + + return ret < 0 ? 
-errno : ret; +} + +VFIODeviceIO vfio_dev_io_ioctl = { + .device_feature = vfio_io_device_feature, + .get_region_info = vfio_io_get_region_info, + .get_irq_info = vfio_io_get_irq_info, + .set_irqs = vfio_io_set_irqs, + .region_read = vfio_io_region_read, + .region_write = vfio_io_region_write, +}; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index bb0d26915b..c6d7ebfd9b 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -45,6 +45,14 @@ #include "migration/qemu-file.h" #include "system/iommufd.h" +/* convenience macros for PCI config space */ +#define VDEV_CONFIG_READ(vbasedev, off, size, data) \ + ((vbasedev)->io->region_read((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, \ + (off), (size), (data))) +#define VDEV_CONFIG_WRITE(vbasedev, off, size, data) \ + ((vbasedev)->io->region_write((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, \ + (off), (size), (data))) + #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" /* Protected by BQL */ @@ -379,6 +387,7 @@ static void vfio_msi_interrupt(void *opaque) static int vfio_enable_msix_no_vec(VFIOPCIDevice *vdev) { g_autofree struct vfio_irq_set *irq_set = NULL; + VFIODevice *vbasedev = &vdev->vbasedev; int ret = 0, argsz; int32_t *fd; @@ -394,7 +403,7 @@ static int vfio_enable_msix_no_vec(VFIOPCIDevice *vdev) fd = (int32_t *)&irq_set->data; *fd = -1; - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set); + ret = vbasedev->io->set_irqs(vbasedev, irq_set); return ret; } @@ -453,7 +462,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix) fds[i] = fd; } - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set); + ret = vdev->vbasedev.io->set_irqs(&vdev->vbasedev, irq_set); g_free(irq_set); @@ -879,13 +888,14 @@ static void vfio_update_msi(VFIOPCIDevice *vdev) static void vfio_pci_load_rom(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; struct vfio_region_info *reg_info = NULL; uint64_t size; off_t off = 0; ssize_t bytes; - if (vfio_get_region_info(&vdev->vbasedev, - VFIO_PCI_ROM_REGION_INDEX, 
&reg_info)) { + if (vfio_get_region_info(vbasedev, + VFIO_PCI_ROM_REGION_INDEX, &reg_info)) { error_report("vfio: Error getting ROM info: %m"); return; } @@ -911,18 +921,19 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev) memset(vdev->rom, 0xff, size); while (size) { - bytes = pread(vdev->vbasedev.fd, vdev->rom + off, - size, vdev->rom_offset + off); + bytes = vbasedev->io->region_read(vbasedev, VFIO_PCI_ROM_REGION_INDEX, + off, size, vdev->rom + off); if (bytes == 0) { break; } else if (bytes > 0) { off += bytes; size -= bytes; } else { - if (errno == EINTR || errno == EAGAIN) { + if (bytes == -EINTR || bytes == -EAGAIN) { continue; } - error_report("vfio: Error reading device ROM: %m"); + error_report("vfio: Error reading device ROM: %s", + strerror(-bytes)); break; } } @@ -1010,11 +1021,10 @@ static const MemoryRegionOps vfio_rom_ops = { static void vfio_pci_size_rom(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK); - off_t offset = vdev->config_offset + PCI_ROM_ADDRESS; DeviceState *dev = DEVICE(vdev); char *name; - int fd = vdev->vbasedev.fd; if (vdev->pdev.romfile || !vdev->pdev.rom_bar) { /* Since pci handles romfile, just print a message and return */ @@ -1031,11 +1041,12 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev) * Use the same size ROM BAR as the physical device. The contents * will get filled in later when the guest tries to read it. 
*/ - if (pread(fd, &orig, 4, offset) != 4 || - pwrite(fd, &size, 4, offset) != 4 || - pread(fd, &size, 4, offset) != 4 || - pwrite(fd, &orig, 4, offset) != 4) { - error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name); + if (VDEV_CONFIG_READ(vbasedev, PCI_ROM_ADDRESS, 4, &orig) != 4 || + VDEV_CONFIG_WRITE(vbasedev, PCI_ROM_ADDRESS, 4, &size) != 4 || + VDEV_CONFIG_READ(vbasedev, PCI_ROM_ADDRESS, 4, &size) != 4 || + VDEV_CONFIG_WRITE(vbasedev, PCI_ROM_ADDRESS, 4, &orig) != 4) { + + error_report("%s(%s) ROM access failed", __func__, vbasedev->name); return; } @@ -1215,6 +1226,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar) uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val; memcpy(&emu_bits, vdev->emulated_config_bits + addr, len); @@ -1227,12 +1239,13 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) if (~emu_bits & (0xffffffffU >> (32 - len * 8))) { ssize_t ret; - ret = pread(vdev->vbasedev.fd, &phys_val, len, - vdev->config_offset + addr); + ret = VDEV_CONFIG_READ(vbasedev, addr, len, &phys_val); if (ret != len) { - error_report("%s(%s, 0x%x, 0x%x) failed: %m", - __func__, vdev->vbasedev.name, addr, len); - return -errno; + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; + + error_report("%s(%s, 0x%x, 0x%x) failed: %s", + __func__, vbasedev->name, addr, len, err); + return -1; } phys_val = le32_to_cpu(phys_val); } @@ -1248,15 +1261,19 @@ void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr, uint32_t val, int len) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t val_le = cpu_to_le32(val); + int ret; trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len); /* Write everything to VFIO, let it filter out what we can't write */ - if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr) - != len) { - error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m", - __func__, vdev->vbasedev.name, addr, val, len); + ret = VDEV_CONFIG_WRITE(vbasedev, addr, len, &val_le); + if (ret != len) { + const char *err = ret < 0 ? strerror(-ret) : "short write"; + + error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %s", + __func__, vbasedev->name, addr, val, len, err); } /* MSI/MSI-X Enabling/Disabling */ @@ -1344,9 +1361,12 @@ static bool vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp) int ret, entries; Error *err = NULL; - if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), - vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { - error_setg_errno(errp, errno, "failed reading MSI PCI_CAP_FLAGS"); + ret = VDEV_CONFIG_READ(&vdev->vbasedev, pos + PCI_CAP_FLAGS, + sizeof(ctrl), &ctrl); + if (ret != sizeof(ctrl)) { + const char *errmsg = ret < 0 ? 
strerror(-ret) : "short read"; + + error_setg(errp, "failed reading MSI PCI_CAP_FLAGS %s", errmsg); return false; } ctrl = le16_to_cpu(ctrl); @@ -1550,34 +1570,43 @@ static bool vfio_pci_relocate_msix(VFIOPCIDevice *vdev, Error **errp) */ static bool vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) { + VFIODevice *vbasedev = &vdev->vbasedev; uint8_t pos; uint16_t ctrl; uint32_t table, pba; - int ret, fd = vdev->vbasedev.fd; struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info), .index = VFIO_PCI_MSIX_IRQ_INDEX }; VFIOMSIXInfo *msix; + int ret; pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX); if (!pos) { return true; } - if (pread(fd, &ctrl, sizeof(ctrl), - vdev->config_offset + pos + PCI_MSIX_FLAGS) != sizeof(ctrl)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX FLAGS"); + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_FLAGS, + sizeof(ctrl), &ctrl); + if (ret != sizeof(ctrl)) { + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX FLAGS %s", err); return false; } - if (pread(fd, &table, sizeof(table), - vdev->config_offset + pos + PCI_MSIX_TABLE) != sizeof(table)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX TABLE"); + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_TABLE, + sizeof(table), &table); + if (ret != sizeof(table)) { + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX TABLE %s", err); return false; } - if (pread(fd, &pba, sizeof(pba), - vdev->config_offset + pos + PCI_MSIX_PBA) != sizeof(pba)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX PBA"); + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_PBA, sizeof(pba), &pba); + if (ret != sizeof(pba)) { + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX PBA %s", err); return false; } @@ -1592,7 +1621,7 @@ static bool vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK; msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1; - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); + ret = vdev->vbasedev.io->get_irq_info(&vdev->vbasedev, &irq_info); if (ret < 0) { error_setg_errno(errp, -ret, "failed to get MSI-X irq info"); g_free(msix); @@ -1736,10 +1765,12 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) } /* Determine what type of BAR this is for registration */ - ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar), - vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr)); + ret = VDEV_CONFIG_READ(&vdev->vbasedev, PCI_BASE_ADDRESS_0 + (4 * nr), + sizeof(pci_bar), &pci_bar); if (ret != sizeof(pci_bar)) { - error_report("vfio: Failed to read BAR %d (%m)", nr); + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_report("vfio: Failed to read BAR %d (%s)", nr, err); return; } @@ -2439,21 +2470,25 @@ void vfio_pci_pre_reset(VFIOPCIDevice *vdev) void vfio_pci_post_reset(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; Error *err = NULL; - int nr; + int ret, nr; if (!vfio_intx_enable(vdev, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); } for (nr = 0; nr < PCI_NUM_REGIONS - 1; ++nr) { - off_t addr = vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr); + off_t addr = PCI_BASE_ADDRESS_0 + (4 * nr); uint32_t val = 0; uint32_t len = sizeof(val); - if (pwrite(vdev->vbasedev.fd, &val, len, addr) != len) { - error_report("%s(%s) reset bar %d failed: %m", __func__, - vdev->vbasedev.name, nr); + ret = VDEV_CONFIG_WRITE(vbasedev, addr, len, &val); + if (ret != len) { + const char *errmsg = ret < 0 ? 
strerror(-ret) : "short write"; + + error_report("%s(%s) reset bar %d failed: %s", __func__, + vbasedev->name, nr, errmsg); } } @@ -2795,7 +2830,7 @@ static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) irq_info.index = VFIO_PCI_ERR_IRQ_INDEX; - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); + ret = vbasedev->io->get_irq_info(vbasedev, &irq_info); if (ret) { /* This can fail for an old kernel or legacy PCI dev */ trace_vfio_populate_device_get_irq_info_failure(strerror(errno)); @@ -2916,8 +2951,10 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) return; } - if (ioctl(vdev->vbasedev.fd, - VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0 || irq_info.count < 1) { + if (vdev->vbasedev.io->get_irq_info(&vdev->vbasedev, &irq_info) < 0) { + return; + } + if (irq_info.count < 1) { return; } @@ -3345,7 +3382,7 @@ static void vfio_instance_init(Object *obj) vdev->host.function = ~0U; vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PCI, &vfio_pci_ops, - DEVICE(vdev), false); + &vfio_dev_io_ioctl, DEVICE(vdev), false); vdev->nv_gpudirect_clique = 0xFF; diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 1070a2113a..1194e55807 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -648,7 +648,7 @@ static void vfio_platform_instance_init(Object *obj) VFIODevice *vbasedev = &vdev->vbasedev; vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PLATFORM, &vfio_platform_ops, - DEVICE(vdev), false); + &vfio_dev_io_ioctl, DEVICE(vdev), false); } #ifdef CONFIG_IOMMUFD diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 59348b81aa..1104ed63e3 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -116,6 +116,7 @@ typedef struct VFIOIOMMUFDContainer { OBJECT_DECLARE_SIMPLE_TYPE(VFIOIOMMUFDContainer, VFIO_IOMMU_IOMMUFD); typedef struct VFIODeviceOps VFIODeviceOps; +typedef struct VFIODeviceIO VFIODeviceIO; typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) next; @@ -136,6 +137,7 @@ typedef 
struct VFIODevice { OnOffAuto enable_migration; bool migration_events; VFIODeviceOps *ops; + VFIODeviceIO *io; unsigned int num_irqs; unsigned int num_regions; unsigned int flags; @@ -186,6 +188,29 @@ struct VFIODeviceOps { int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f); }; +#ifdef CONFIG_LINUX + +/* + * How devices communicate with the server. The default option is through + * ioctl() to the kernel VFIO driver, but vfio-user can use a socket to a remote + * process. + */ +struct VFIODeviceIO { + int (*device_feature)(VFIODevice *vdev, struct vfio_device_feature *); + int (*get_region_info)(VFIODevice *vdev, + struct vfio_region_info *info); + int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq); + int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs); + int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, + void *data); + int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, + void *data); +}; + +extern VFIODeviceIO vfio_dev_io_ioctl; + +#endif /* CONFIG_LINUX */ + typedef struct VFIOGroup { int fd; int groupid; @@ -316,6 +341,6 @@ int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova, bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp); void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp); void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops, - DeviceState *dev, bool ram_discard); + VFIODeviceIO *io, DeviceState *dev, bool ram_discard); int vfio_device_get_aw_bits(VFIODevice *vdev); #endif /* HW_VFIO_VFIO_COMMON_H */
From patchwork Wed Jan 8 11:50:15 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930712
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 09/26] vfio-user: introduce vfio-user protocol specification
Date: Wed, 8 Jan 2025 11:50:15 +0000
Message-Id: <20250108115032.1677686-10-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: Thanos Makatos

This patch introduces the vfio-user protocol specification
(formerly known as VFIO-over-socket), which is designed to allow devices to be emulated outside QEMU, in a separate process. vfio-user reuses the existing VFIO defines, structs and concepts. It has been earlier discussed as an RFC in: "RFC: use VFIO over a UNIX domain socket to implement device offloading" Signed-off-by: Thanos Makatos Signed-off-by: John Levon --- MAINTAINERS | 8 +- docs/devel/index-internals.rst | 1 + docs/devel/vfio-user.rst | 1522 ++++++++++++++++++++++++++++++++ 3 files changed, 1530 insertions(+), 1 deletion(-) create mode 100644 docs/devel/vfio-user.rst diff --git a/MAINTAINERS b/MAINTAINERS index 2101b51217..f60f0a4dd2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4129,12 +4129,18 @@ F: hw/remote/proxy-memory-listener.c F: include/hw/remote/proxy-memory-listener.h F: hw/remote/iohub.c F: include/hw/remote/iohub.h -F: subprojects/libvfio-user F: hw/remote/vfio-user-obj.c F: include/hw/remote/vfio-user-obj.h F: hw/remote/iommu.c F: include/hw/remote/iommu.h +VFIO-USER: +M: John Levon +M: Thanos Makatos +S: Supported +F: docs/devel/vfio-user.rst +F: subprojects/libvfio-user + EBPF: M: Jason Wang R: Andrew Melnychenko diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index ab9fbc4482..268f13cd2d 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -19,6 +19,7 @@ Details about QEMU's various subsystems including how to add features to them. s390-dasd-ipl tracing vfio-iommufd + vfio-user writing-monitor-commands virtio-backends crypto diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst new file mode 100644 index 0000000000..0d96477a68 --- /dev/null +++ b/docs/devel/vfio-user.rst @@ -0,0 +1,1522 @@ +.. include:: +******************************** +vfio-user Protocol Specification +******************************** + +-------------- +Version_ 0.9.1 +-------------- + +.. 
contents:: Table of Contents + +Introduction +============ +vfio-user is a protocol that allows a device to be emulated in a separate +process outside of a Virtual Machine Monitor (VMM). vfio-user devices consist +of a generic VFIO device type, living inside the VMM, which we call the client, +and the core device implementation, living outside the VMM, which we call the +server. + +The vfio-user specification is partly based on the +`Linux VFIO ioctl interface `_. + +VFIO is a mature and stable API, backed by an extensively used framework. The +existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be largely +re-used, though there is nothing in this specification that requires that +particular implementation. None of the VFIO kernel modules are required for +supporting the protocol, on either the client or server side. Some source +definitions in VFIO are re-used for vfio-user. + +The main idea is to allow a virtual device to function in a separate process in +the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``) is +chosen because file descriptors can be trivially sent over it, which in turn +allows: + +* Sharing of client memory for DMA with the server. +* Sharing of server memory with the client for fast MMIO. +* Efficient sharing of eventfd's for triggering interrupts. + +Other socket types could be used which allow the server to run in a separate +guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretically +the underlying transport does not necessarily have to be a socket, however we do +not examine such alternatives. In this protocol version we focus on using a UNIX +domain socket and introduce basic support for the other two types of sockets +without considering performance implications. + +While passing of file descriptors is desirable for performance reasons, support +is not necessary for either the client or the server in order to implement the +protocol. 
There is always an in-band, message-passing fallback mechanism. + +Overview +======== + +VFIO is a framework that allows a physical device to be securely passed through +to a user space process; the device-specific kernel driver does not drive the +device at all. Typically, the user space process is a VMM and the device is +passed through to it in order to achieve high performance. VFIO provides an API +and the required functionality in the kernel. QEMU has adopted VFIO to allow a +guest to directly access physical devices, instead of emulating them in +software. + +vfio-user reuses the core VFIO concepts defined in its API, but implements them +as messages to be sent over a socket. It does not change the kernel-based VFIO +in any way; in fact, none of the VFIO kernel modules need to be loaded to use +vfio-user. It is also possible for the client to concurrently use the current +kernel-based VFIO for one device, and vfio-user for another device. + +VFIO Device Model +----------------- + +A device under VFIO presents a standard interface to the user process. Many of +the VFIO operations in the existing interface use the ``ioctl()`` system call, and +references to the existing interface are called the ``ioctl()`` implementation in +this document. + +The following sections describe the set of messages that implement the vfio-user +interface over a socket. In many cases, the messages are analogous to data +structures used in the ``ioctl()`` implementation. Messages derived from an +``ioctl()`` have a name derived from the ``ioctl()`` command name. E.g., the +``VFIO_DEVICE_GET_INFO`` ``ioctl()`` command becomes a +``VFIO_USER_DEVICE_GET_INFO`` message. The purpose of this reuse is to share as +much code as feasible with the ``ioctl()`` implementation.
+ +Connection Initiation +^^^^^^^^^^^^^^^^^^^^^ + +After the client connects to the server, the initial client message is +``VFIO_USER_VERSION`` to propose a protocol version and set of capabilities to +apply to the session. The server replies with a compatible version and set of +capabilities it supports, or closes the connection if it cannot support the +advertised version. + +Device Information +^^^^^^^^^^^^^^^^^^ + +The client uses a ``VFIO_USER_DEVICE_GET_INFO`` message to query the server for +information about the device. This information includes: + +* The device type and whether it supports reset (``VFIO_DEVICE_FLAGS_``), +* the number of device regions, and +* the number of interrupt types the device supports. + +Region Information +^^^^^^^^^^^^^^^^^^ + +The client uses ``VFIO_USER_DEVICE_GET_REGION_INFO`` messages to query the +server for information about the device's regions. This information describes: + +* Read and write permissions, whether it can be memory mapped, and whether it + supports additional capabilities (``VFIO_REGION_INFO_CAP_``). +* Region index, size, and offset. + +When a device region can be mapped by the client, the server provides a file +descriptor which the client can ``mmap()``. The server is responsible for +polling for client updates to memory mapped regions. + +Region Capabilities +""""""""""""""""""" + +Some regions have additional capabilities that cannot be described adequately +by the region info data structure. These capabilities are returned in the +region info reply in a list similar to PCI capabilities in a PCI device's +configuration space. + +Sparse Regions +"""""""""""""" +A region can be memory-mappable in whole or in part. When only a subset of a +region can be mapped by the client, a ``VFIO_REGION_INFO_CAP_SPARSE_MMAP`` +capability is included in the region info reply. This capability describes +which portions can be mapped by the client. + +.. 
Note:: + For example, in a virtual NVMe controller, sparse regions can be used so + that accesses to the NVMe registers (found in the beginning of BAR0) are + trapped (an infrequent event), while allowing direct access to the doorbells + (an extremely frequent event as every I/O submission requires a write to + BAR0), found in the next page after the NVMe registers in BAR0. + +Device-Specific Regions +""""""""""""""""""""""" + +A device can define regions additional to the standard ones (e.g. PCI indexes +0-8). This is achieved by including a ``VFIO_REGION_INFO_CAP_TYPE`` capability +in the region info reply of a device-specific region. Such regions are reflected +in ``struct vfio_user_device_info.num_regions``. Thus, for PCI devices this +value can be equal to, or higher than, ``VFIO_PCI_NUM_REGIONS``. + +Region I/O via file descriptors +------------------------------- + +For unmapped regions, region I/O from the client is done via +``VFIO_USER_REGION_READ/WRITE``. As an optimization, ioeventfds or ioregionfds +may be configured for sub-regions of some regions. A client may request +information on these sub-regions via ``VFIO_USER_DEVICE_GET_REGION_IO_FDS``; by +configuring the returned file descriptors as ioeventfds or ioregionfds, the +server can be directly notified of I/O (for example, by KVM) without taking a +trip through the client. + +Interrupts +^^^^^^^^^^ + +The client uses ``VFIO_USER_DEVICE_GET_IRQ_INFO`` messages to query the server +for the device's interrupt types. The interrupt types are specific to the bus +the device is attached to, and the client is expected to know the capabilities +of each interrupt type. The server can signal an interrupt by directly injecting +interrupts into the guest via an event file descriptor. The client configures +how the server signals an interrupt with ``VFIO_USER_SET_IRQS`` messages. 
+ +Device Read and Write +^^^^^^^^^^^^^^^^^^^^^ + +When the guest executes load or store operations to an unmapped device region, +the client forwards these operations to the server with +``VFIO_USER_REGION_READ`` or ``VFIO_USER_REGION_WRITE`` messages. The server +will reply with data from the device on read operations or an acknowledgement on +write operations. See `Read and Write Operations`_. + +Client memory access +-------------------- + +The client uses ``VFIO_USER_DMA_MAP`` and ``VFIO_USER_DMA_UNMAP`` messages to +inform the server of the valid DMA ranges that the server can access on behalf +of a device (typically, VM guest memory). DMA memory may be accessed by the +server via ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages over the +socket. In this case, the "DMA" part of the naming is a misnomer. + +Actual direct memory access of client memory from the server is possible if the +client provides file descriptors the server can ``mmap()``. Note that ``mmap()`` +privileges cannot be revoked by the client, therefore file descriptors should +only be exported in environments where the client trusts the server not to +corrupt guest memory. + +See `Read and Write Operations`_. + +Client/server interactions +========================== + +Socket +------ + +A server can serve: + +1) one or more clients, and/or +2) one or more virtual devices, belonging to one or more clients. + +The current protocol specification requires a dedicated socket per +client/server connection. It is a server-side implementation detail whether a +single server handles multiple virtual devices from the same or multiple +clients. The location of the socket is implementation-specific. Multiplexing +clients, devices, and servers over the same socket is not supported in this +version of the protocol. 
+ +Authentication +-------------- + +For ``AF_UNIX``, we rely on OS mandatory access controls on the socket files, +therefore it is up to the management layer to set up the socket as required. +Socket types that span guests or hosts will require a proper authentication +mechanism. Defining that mechanism is deferred to a future version of the +protocol. + +Command Concurrency +------------------- + +A client may pipeline multiple commands without waiting for previous command +replies. The server will process commands in the order they are received. A +consequence of this is if a client issues a command with the *No_reply* bit, +then subsequently issues a command without *No_reply*, the older command will +have been processed before the reply to the younger command is sent by the +server. The client must be aware of the device's capability to process +concurrent commands if pipelining is used. For example, pipelining allows +multiple client threads to concurrently access device regions; the client must +ensure these accesses obey device semantics. + +An example is a frame buffer device, where the device may allow concurrent +access to different areas of video memory, but may have indeterminate behavior +if concurrent accesses are performed to command or status registers. + +Note that unrelated messages sent from the server to the client can appear in +between a client to server request/reply and vice versa. + +Implementers should be prepared for certain commands to exhibit potentially +unbounded latencies. For example, ``VFIO_USER_DEVICE_RESET`` may take an +arbitrarily long time to complete; clients should take care not to block +unnecessarily. + +Socket Disconnection Behavior +----------------------------- +The server and the client can disconnect from each other, either intentionally +or unexpectedly. Both the client and the server need to know how to handle such +events. 
+ +Server Disconnection +^^^^^^^^^^^^^^^^^^^^ +A server disconnecting from the client may indicate that: + +1) A virtual device has been restarted, either intentionally (e.g. because of a + device update) or unintentionally (e.g. because of a crash). +2) A virtual device has been shut down with no intention of being restarted. + +It is impossible for the client to know whether or not a failure is +intermittent or innocuous and should be retried; therefore, the client should +reset the VFIO device when it detects the socket has been disconnected. +Error recovery will be driven by the guest's device error handling +behavior. + +Client Disconnection +^^^^^^^^^^^^^^^^^^^^ +The client disconnecting from the server primarily means that the client +has exited. Currently, this means that the guest is shut down and the device is +no longer needed, so the server can automatically exit. However, there +can be cases where a client disconnection should not result in a server exit: + +1) A single server serving multiple clients. +2) A multi-process QEMU upgrading itself step by step, which is not yet + implemented. + +Therefore, in order for the protocol to be forward compatible, the server should +respond to a client disconnection as follows: + + - all client memory regions are unmapped and cleaned up (including closing any + passed file descriptors) + - all IRQ file descriptors passed from the old client are closed + - the device state should otherwise be retained + +The expectation is that when a client reconnects, it will re-establish IRQ and +client memory mappings. + +If anything happens to the client (for example, QEMU really did exit), the control +stack will know about it and can clean up resources accordingly. + +Security Considerations +----------------------- + +Speaking generally, vfio-user clients should not trust servers, and vice versa.
+Standard tools and mechanisms should be used on both sides to validate input and +protect against denial of service scenarios, buffer overflow, etc. + +Request Retry and Response Timeout +---------------------------------- +A failed command is a command that has been successfully sent and has been +responded to with an error code. Failure to send the command in the first place +(e.g. because the socket is disconnected) is a different type of error examined +earlier in the disconnect section. + +.. Note:: + QEMU's VFIO retries certain operations if they fail. While this makes sense + for real HW, we don't know for sure whether it makes sense for virtual + devices. + +Defining a retry and timeout scheme is deferred to a future version of the +protocol. + +Message sizes +------------- + +Some requests have an ``argsz`` field. In a request, it defines the maximum +expected reply payload size, which should be at least the size of the fixed +reply payload headers defined here. The *request* payload size is defined by the +usual ``msg_size`` field in the header, not the ``argsz`` field. + +In a reply, the server sets the ``argsz`` field to the size needed for the full +payload. This may be less than the requested maximum size. It may also be +larger than the requested maximum size: in that case, the full payload is not +included in the reply, but the ``argsz`` field in the reply indicates the needed +size, allowing a client to allocate a larger buffer for holding the reply before +trying again. + +In addition, during negotiation (see `Version`_), the client and server may +each specify a ``max_data_xfer_size`` value; this defines the maximum data that +may be read or written via one of the ``VFIO_USER_DMA/REGION_READ/WRITE`` +messages; see `Read and Write Operations`_. + +Protocol Specification +====================== + +To distinguish them from the base VFIO symbols, all vfio-user symbols are prefixed +with ``vfio_user`` or ``VFIO_USER``.
In this revision, all data is in the +endianness of the host system, although this may be relaxed in future +revisions in cases where the client and server run on different hosts +with different endianness. + +Unless otherwise specified, all sizes should be presumed to be in bytes. + +.. _Commands: + +Commands +-------- +The following table lists the VFIO message command IDs, and whether the +message command is sent from the client or the server. + +====================================== ========= ================= +Name Command Request Direction +====================================== ========= ================= +``VFIO_USER_VERSION`` 1 client -> server +``VFIO_USER_DMA_MAP`` 2 client -> server +``VFIO_USER_DMA_UNMAP`` 3 client -> server +``VFIO_USER_DEVICE_GET_INFO`` 4 client -> server +``VFIO_USER_DEVICE_GET_REGION_INFO`` 5 client -> server +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` 6 client -> server +``VFIO_USER_DEVICE_GET_IRQ_INFO`` 7 client -> server +``VFIO_USER_DEVICE_SET_IRQS`` 8 client -> server +``VFIO_USER_REGION_READ`` 9 client -> server +``VFIO_USER_REGION_WRITE`` 10 client -> server +``VFIO_USER_DMA_READ`` 11 server -> client +``VFIO_USER_DMA_WRITE`` 12 server -> client +``VFIO_USER_DEVICE_RESET`` 13 client -> server +``VFIO_USER_REGION_WRITE_MULTI`` 15 client -> server +====================================== ========= ================= + +Header +------ + +All messages, both command messages and reply messages, are preceded by a +16-byte header that contains basic information about the message. The header is +followed by message-specific data described in the sections below. 
+ ++----------------+--------+-------------+ +| Name | Offset | Size | ++================+========+=============+ +| Message ID | 0 | 2 | ++----------------+--------+-------------+ +| Command | 2 | 2 | ++----------------+--------+-------------+ +| Message size | 4 | 4 | ++----------------+--------+-------------+ +| Flags | 8 | 4 | ++----------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=====+============+ | +| | | 0-3 | Type | | +| | +-----+------------+ | +| | | 4 | No_reply | | +| | +-----+------------+ | +| | | 5 | Error | | +| | +-----+------------+ | ++----------------+--------+-------------+ +| Error | 12 | 4 | ++----------------+--------+-------------+ +| | 16 | variable | ++----------------+--------+-------------+ + +* *Message ID* identifies the message, and is echoed in the command's reply + message. Message IDs belong entirely to the sender, can be re-used (even + concurrently) and the receiver must not make any assumptions about their + uniqueness. +* *Command* specifies the command to be executed, listed in Commands_. It is + also set in the reply header. +* *Message size* contains the size of the entire message, including the header. +* *Flags* contains attributes of the message: + + * The *Type* bits indicate the message type. + + * *Command* (value 0x0) indicates a command message. + * *Reply* (value 0x1) indicates a reply message acknowledging a previous + command with the same message ID. + * *No_reply* in a command message indicates that no reply is needed for this + command. This is commonly used when multiple commands are sent, and only + the last needs acknowledgement. + * *Error* in a reply message indicates the command being acknowledged had + an error. In this case, the *Error* field will be valid. + +* *Error* in a reply message is an optional UNIX errno value. It may be zero + even if the Error bit is set in Flags. It is reserved in a command message. 
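As an illustration only, the header layout and the flag bits can be sketched in C. This is not part of the specification: the names ``vfio_user_hdr_sketch``, ``HDR_*`` and ``hdr_init_*`` are hypothetical, and the sketch assumes a compiler that lays out the fixed-width fields without padding, which the static assertions check. It also treats a non-zero errno as the only failure signal, although the specification allows the Error bit to be set with a zero Error field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; field order and sizes follow the header table. */
struct vfio_user_hdr_sketch {
    uint16_t msg_id;   /* offset 0: chosen by the sender, echoed in the reply */
    uint16_t command;  /* offset 2: command ID, also set in the reply */
    uint32_t msg_size; /* offset 4: size of the whole message, header included */
    uint32_t flags;    /* offset 8: Type, No_reply and Error bits */
    uint32_t error;    /* offset 12: optional errno value, replies only */
};

static_assert(sizeof(struct vfio_user_hdr_sketch) == 16, "header is 16 bytes");
static_assert(offsetof(struct vfio_user_hdr_sketch, flags) == 8, "flags at 8");
static_assert(offsetof(struct vfio_user_hdr_sketch, error) == 12, "error at 12");

#define HDR_TYPE_MASK     0xfu       /* bits 0-3 */
#define HDR_TYPE_COMMAND  0x0u
#define HDR_TYPE_REPLY    0x1u
#define HDR_FLAG_NO_REPLY (1u << 4)  /* bit 4 */
#define HDR_FLAG_ERROR    (1u << 5)  /* bit 5 */

/* Fill in the header of a command carrying payload_size bytes of data. */
static void hdr_init_command(struct vfio_user_hdr_sketch *hdr, uint16_t msg_id,
                             uint16_t command, uint32_t payload_size,
                             int no_reply)
{
    memset(hdr, 0, sizeof(*hdr));
    hdr->msg_id = msg_id;
    hdr->command = command;
    hdr->msg_size = (uint32_t)sizeof(*hdr) + payload_size;
    hdr->flags = HDR_TYPE_COMMAND | (no_reply ? HDR_FLAG_NO_REPLY : 0u);
}

/* A command needs a reply unless it sets the No_reply bit. */
static int hdr_expects_reply(const struct vfio_user_hdr_sketch *cmd)
{
    return (cmd->flags & HDR_TYPE_MASK) == HDR_TYPE_COMMAND &&
           !(cmd->flags & HDR_FLAG_NO_REPLY);
}

/*
 * Build the reply header for cmd. When err is non-zero the Error bit is set
 * and the reply consists of the header only, so payload_size is ignored.
 */
static void hdr_init_reply(struct vfio_user_hdr_sketch *reply,
                           const struct vfio_user_hdr_sketch *cmd,
                           uint32_t payload_size, uint32_t err)
{
    memset(reply, 0, sizeof(*reply));
    reply->msg_id = cmd->msg_id;
    reply->command = cmd->command;
    reply->flags = HDR_TYPE_REPLY;
    reply->msg_size = (uint32_t)sizeof(*reply);
    if (err != 0) {
        reply->flags |= HDR_FLAG_ERROR;
        reply->error = err;
    } else {
        reply->msg_size += payload_size;
    }
}
```

A receiver would typically read the 16 header bytes first, then use ``msg_size`` to decide how many further bytes belong to the same message.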
+ +Each command message in Commands_ must be replied to with a reply message, +unless the message sets the *No_reply* bit. The reply consists of the header +with the *Reply* bit set, plus any additional data. + +If an error occurs, the reply message must only include the reply header. + +As the header is standard in both requests and replies, it is not included in +the command-specific specifications below; each message definition should be +appended to the standard header, and the offsets are given from the end of the +standard header. + +``VFIO_USER_VERSION`` +--------------------- + +.. _Version: + +This is the initial message sent by the client after the socket connection is +established; the same format is used for the server's reply. + +Upon establishing a connection, the client must send a ``VFIO_USER_VERSION`` +message proposing a protocol version and a set of capabilities. The server +compares these with the versions and capabilities it supports and sends a +``VFIO_USER_VERSION`` reply according to the following rules. + +* The major version in the reply must be the same as proposed. If the server + does not support the proposed major, it closes the connection. +* The minor version in the reply must be equal to or less than the minor + version proposed. +* The capability list must be a subset of those proposed. If the server + requires a capability the client did not include, it closes the connection. + +The protocol major version will only change when incompatible protocol changes +are made, such as changing the message format. The minor version may change +when compatible changes are made, such as adding new messages or capabilities. +Both the client and the server must support all minor versions less than the +maximum minor version they support. E.g., an implementation that supports +version 1.3 must also support 1.0 through 1.2.
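The negotiation rules above can be sketched as a server-side check. This is a minimal sketch under stated assumptions, not the specification's algorithm: the type and function names are hypothetical, and capabilities are modelled as a bitmask purely for brevity, whereas the protocol carries them as JSON.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: caps is a bitmask stand-in for the JSON capabilities. */
struct version_offer_sketch {
    uint16_t major;
    uint16_t minor;
    uint32_t caps;
};

/*
 * Server-side check of a client proposal. Returns 0 and fills *reply on
 * success, or -1 when the server should close the connection.
 */
static int negotiate_version_sketch(const struct version_offer_sketch *proposed,
                                    const struct version_offer_sketch *supported,
                                    uint32_t required_caps,
                                    struct version_offer_sketch *reply)
{
    assert(proposed && supported && reply);

    /* The major version in the reply must equal the proposed one. */
    if (proposed->major != supported->major) {
        return -1;
    }
    /* A capability the server requires but the client did not offer. */
    if (required_caps & ~proposed->caps) {
        return -1;
    }

    reply->major = proposed->major;
    /* The reply minor is never greater than the proposed minor. */
    reply->minor = proposed->minor < supported->minor ? proposed->minor
                                                      : supported->minor;
    /* The reply capabilities are a subset of those proposed. */
    reply->caps = proposed->caps & supported->caps;
    return 0;
}
```

Picking the smaller of the two minor versions relies on the rule that an implementation supports every minor version below its maximum.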
+ +When making a change to this specification, the protocol version number must +be included in the form "added in version X.Y". + +Request +^^^^^^^ + +============== ====== ==== +Name Offset Size +============== ====== ==== +version major 0 2 +version minor 2 2 +version data 4 variable (including terminating NUL). Optional. +============== ====== ==== + +The version data is an optional UTF-8 encoded JSON byte array with the following +format: + ++--------------+--------+-----------------------------------+ +| Name | Type | Description | ++==============+========+===================================+ +| capabilities | object | Contains common capabilities that | +| | | the sender supports. Optional. | ++--------------+--------+-----------------------------------+ + +Capabilities: + ++--------------------+---------+------------------------------------------------+ +| Name | Type | Description | ++====================+=========+================================================+ +| max_msg_fds | number | Maximum number of file descriptors that can be | +| | | received by the sender in one message. | +| | | Optional. If not specified then the receiver | +| | | must assume a value of ``1``. | ++--------------------+---------+------------------------------------------------+ +| max_data_xfer_size | number | Maximum ``count`` for data transfer messages; | +| | | see `Read and Write Operations`_. Optional, | +| | | with a default value of 1048576 bytes. | ++--------------------+---------+------------------------------------------------+ +| pgsizes | number | Page sizes supported in DMA map operations | +| | | or'ed together. Optional, with a default value | +| | | of supporting only 4k pages. | ++--------------------+---------+------------------------------------------------+ +| max_dma_maps | number | Maximum number of DMA map windows that can be | +| | | valid simultaneously. Optional, with a | +| | | default value of 65535 (64k-1).
| ++--------------------+---------+------------------------------------------------+ +| migration | object | Migration capability parameters. If missing | +| | | then migration is not supported by the sender. | ++--------------------+---------+------------------------------------------------+ +| write_multiple | boolean | ``VFIO_USER_REGION_WRITE_MULTI`` messages | +| | | are supported if the value is ``true``. | ++--------------------+---------+------------------------------------------------+ + +The migration capability contains the following name/value pairs: + ++-----------------+--------+--------------------------------------------------+ +| Name | Type | Description | ++=================+========+==================================================+ +| pgsize | number | Page size of dirty pages bitmap. The smaller of | +| | | the client's and the server's values is used. | ++-----------------+--------+--------------------------------------------------+ +| max_bitmap_size | number | Maximum bitmap size in ``VFIO_USER_DIRTY_PAGES`` | +| | | and ``VFIO_USER_DMA_UNMAP`` messages. Optional, | +| | | with a default value of 256MB. | ++-----------------+--------+--------------------------------------------------+ + +Reply +^^^^^ + +The same message format is used in the server's reply with the semantics +described above. + +``VFIO_USER_DMA_MAP`` +--------------------- + +This command message is sent by the client to the server to inform it of the +memory regions the server can access. It must be sent before the server can +perform any DMA to the client. It is normally sent directly after the version +handshake is completed, but may also occur when memory is added to the client, +or if the client uses a vIOMMU.
+ +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++-------------+--------+-------------+ +| Name | Offset | Size | ++=============+========+=============+ +| argsz | 0 | 4 | ++-------------+--------+-------------+ +| flags | 4 | 4 | ++-------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=====+============+ | +| | | 0 | readable | | +| | +-----+------------+ | +| | | 1 | writeable | | +| | +-----+------------+ | ++-------------+--------+-------------+ +| offset | 8 | 8 | ++-------------+--------+-------------+ +| address | 16 | 8 | ++-------------+--------+-------------+ +| size | 24 | 8 | ++-------------+--------+-------------+ + +* *argsz* is the size of the above structure. Note there is no reply payload, + so this field differs from other message types. +* *flags* contains the following region attributes: + + * *readable* indicates that the region can be read from. + + * *writeable* indicates that the region can be written to. + +* *offset* is the file offset of the region with respect to the associated file + descriptor, or zero if the region is not mappable. +* *address* is the base DMA address of the region. +* *size* is the size of the region. + +This structure is 32 bytes in size, so the message size is 16 + 32 = 48 bytes. + +If the DMA region being added can be directly mapped by the server, a file +descriptor must be sent as part of the message meta-data. The region can be +mapped via the ``mmap()`` system call. On ``AF_UNIX`` sockets, the file descriptor +must be passed as ``SCM_RIGHTS`` type ancillary data. Otherwise, if the DMA +region cannot be directly mapped by the server, a file descriptor must not be sent +as part of the message meta-data and the DMA region can be accessed by the +server using ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages, +explained in `Read and Write Operations`_.
A command to map over an existing +region must be failed by the server with ``EEXIST`` set in error field in the +reply. + +Reply +^^^^^ + +There is no payload in the reply message. + +``VFIO_USER_DMA_UNMAP`` +----------------------- + +This command message is sent by the client to the server to inform it that a +DMA region, previously made available via a ``VFIO_USER_DMA_MAP`` command +message, is no longer available for DMA. It typically occurs when memory is +subtracted from the client or if the client uses a vIOMMU. The DMA region is +described by the following structure: + +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++--------------+--------+------------------------+ +| Name | Offset | Size | ++==============+========+========================+ +| argsz | 0 | 4 | ++--------------+--------+------------------------+ +| flags | 4 | 4 | ++--------------+--------+------------------------+ +| address | 8 | 8 | ++--------------+--------+------------------------+ +| size | 16 | 8 | ++--------------+--------+------------------------+ + +* *argsz* is the maximum size of the reply payload. +* *flags* is unused in this version. +* *address* is the base DMA address of the DMA region. +* *size* is the size of the DMA region. + +The address and size of the DMA region being unmapped must match exactly a +previous mapping. + +Reply +^^^^^ + +Upon receiving a ``VFIO_USER_DMA_UNMAP`` command, if the file descriptor is +mapped then the server must release all references to that DMA region before +replying, which potentially includes in-flight DMA transactions. + +The server responds with the original DMA entry in the request. + + +``VFIO_USER_DEVICE_GET_INFO`` +----------------------------- + +This command message is sent by the client to the server to query for basic +information about the device. 
+ +Request +^^^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the maximum size of the reply payload +* all other fields must be zero. + +Reply +^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* contains the following device attributes. + + * ``VFIO_DEVICE_FLAGS_RESET`` indicates that the device supports the + ``VFIO_USER_DEVICE_RESET`` message. + * ``VFIO_DEVICE_FLAGS_PCI`` indicates that the device is a PCI device. 
+
+* *num_regions* is the number of memory regions that the device exposes.
+* *num_irqs* is the number of distinct interrupt types that the device supports.
+
+This version of the protocol only supports PCI devices. Additional devices may
+be supported in future versions.
+
+``VFIO_USER_DEVICE_GET_REGION_INFO``
+------------------------------------
+
+This command message is sent by the client to the server to query for
+information about device regions. The VFIO region info structure is defined in
+``<linux/vfio.h>`` (``struct vfio_region_info``).
+
+Request
+^^^^^^^
+
++------------+--------+------------------------------+
+| Name       | Offset | Size                         |
++============+========+==============================+
+| argsz      | 0      | 4                            |
++------------+--------+------------------------------+
+| flags      | 4      | 4                            |
++------------+--------+------------------------------+
+| index      | 8      | 4                            |
++------------+--------+------------------------------+
+| cap_offset | 12     | 4                            |
++------------+--------+------------------------------+
+| size       | 16     | 8                            |
++------------+--------+------------------------------+
+| offset     | 24     | 8                            |
++------------+--------+------------------------------+
+
+* *argsz* is the maximum size of the reply payload.
+* *index* is the index of the memory region being queried; it is the only field
+  that is required to be set in the command message.
+* all other fields must be zero.
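
The request layout above can be illustrated with a short Python sketch that packs a ``VFIO_USER_DEVICE_GET_REGION_INFO`` request payload. This is only a sketch under stated assumptions: little-endian byte order is assumed, the helper name ``pack_get_region_info`` is hypothetical, and the vfio-user message header that precedes the payload is left out.

```python
import struct

# Field order and sizes follow the request table: argsz, flags, index and
# cap_offset are 4 bytes each, size and offset are 8 bytes each.
# Assumption: fields are encoded little-endian.
REGION_INFO_FMT = "<IIIIQQ"
REGION_INFO_SIZE = struct.calcsize(REGION_INFO_FMT)


def pack_get_region_info(index, max_reply_size):
    """Build the request payload (hypothetical helper, not a real API)."""
    # Only argsz and index are set; all other fields must be zero.
    return struct.pack(REGION_INFO_FMT, max_reply_size, 0, index, 0, 0, 0)


payload = pack_get_region_info(index=2, max_reply_size=4096)
fields = struct.unpack(REGION_INFO_FMT, payload)
```

The value 4096 for *argsz* is arbitrary here; a client would pick whatever reply size it is prepared to receive.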
+
+Reply
+^^^^^
+
++------------+--------+------------------------------+
+| Name       | Offset | Size                         |
++============+========+==============================+
+| argsz      | 0      | 4                            |
++------------+--------+------------------------------+
+| flags      | 4      | 4                            |
++------------+--------+------------------------------+
+|            | +-----+-----------------------------+ |
+|            | | Bit | Definition                  | |
+|            | +=====+=============================+ |
+|            | | 0   | VFIO_REGION_INFO_FLAG_READ  | |
+|            | +-----+-----------------------------+ |
+|            | | 1   | VFIO_REGION_INFO_FLAG_WRITE | |
+|            | +-----+-----------------------------+ |
+|            | | 2   | VFIO_REGION_INFO_FLAG_MMAP  | |
+|            | +-----+-----------------------------+ |
+|            | | 3   | VFIO_REGION_INFO_FLAG_CAPS  | |
+|            | +-----+-----------------------------+ |
++------------+--------+------------------------------+
+| index      | 8      | 4                            |
++------------+--------+------------------------------+
+| cap_offset | 12     | 4                            |
++------------+--------+------------------------------+
+| size       | 16     | 8                            |
++------------+--------+------------------------------+
+| offset     | 24     | 8                            |
++------------+--------+------------------------------+
+
+* *argsz* is the size required for the full reply payload (region info structure
+  plus the size of any region capabilities)
+* *flags* are attributes of the region:
+
+  * ``VFIO_REGION_INFO_FLAG_READ`` allows client read access to the region.
+  * ``VFIO_REGION_INFO_FLAG_WRITE`` allows client write access to the region.
+  * ``VFIO_REGION_INFO_FLAG_MMAP`` specifies the client can mmap() the region.
+    When this flag is set, the reply will include a file descriptor in its
+    meta-data. On ``AF_UNIX`` sockets, the file descriptors will be passed as
+    ``SCM_RIGHTS`` type ancillary data.
+  * ``VFIO_REGION_INFO_FLAG_CAPS`` indicates additional capabilities found in the
+    reply.
+
+* *index* is the index of the memory region being queried; it is the only field
+  that is required to be set in the command message.
+* *cap_offset* describes where additional region capabilities can be found.
+  cap_offset is relative to the beginning of the VFIO region info structure.
+  The data structure it points to is a VFIO cap header defined in
+  ``<linux/vfio.h>``.
+* *size* is the size of the region.
+* *offset* is the offset that should be given to the mmap() system call for
+  regions with the MMAP attribute. It is also used as the base offset when
+  mapping a VFIO sparse mmap area, described below.
+
+VFIO region capabilities
+""""""""""""""""""""""""
+
+The VFIO region information can also include a capabilities list. This list is
+similar to a PCI capability list - each entry has a common header that
+identifies a capability and where the next capability in the list can be found.
+The VFIO capability header format is defined in ``<linux/vfio.h>`` (``struct
+vfio_info_cap_header``).
+
+VFIO cap header format
+""""""""""""""""""""""
+
++---------+--------+------+
+| Name    | Offset | Size |
++=========+========+======+
+| id      | 0      | 2    |
++---------+--------+------+
+| version | 2      | 2    |
++---------+--------+------+
+| next    | 4      | 4    |
++---------+--------+------+
+
+* *id* is the capability identity.
+* *version* is a capability-specific version number.
+* *next* specifies the offset of the next capability in the capability list. It
+  is relative to the beginning of the VFIO region info structure.
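
To make the capability chain concrete, here is a minimal Python sketch that walks the list in a region info reply payload. It rests on assumptions not stated in the text above: little-endian encoding, a *next* value of 0 marking the end of the list (the convention used by Linux VFIO), and a hypothetical helper name ``iter_region_caps``. The capability ids in the example data are placeholders.

```python
import struct

REGION_INFO_FMT = "<IIIIQQ"  # argsz, flags, index, cap_offset, size, offset
CAP_HEADER_FMT = "<HHI"      # id, version, next
CAP_HEADER_SIZE = struct.calcsize(CAP_HEADER_FMT)
VFIO_REGION_INFO_FLAG_CAPS = 1 << 3  # bit 3 in the reply flags table


def iter_region_caps(reply):
    """Yield (id, version, offset) for each capability header in the reply."""
    _argsz, flags, _index, cap_offset, _size, _offset = struct.unpack_from(
        REGION_INFO_FMT, reply, 0)
    if not flags & VFIO_REGION_INFO_FLAG_CAPS:
        return
    seen = set()
    off = cap_offset
    # Assumption: next == 0 terminates the list. Offsets are relative to the
    # start of the region info structure, i.e. the start of the payload.
    while off != 0:
        if off in seen or off + CAP_HEADER_SIZE > len(reply):
            raise ValueError("malformed capability list")
        seen.add(off)
        cap_id, version, nxt = struct.unpack_from(CAP_HEADER_FMT, reply, off)
        yield cap_id, version, off
        off = nxt


# Example payload: two capabilities at offsets 32 and 48 (placeholder ids).
reply = (struct.pack(REGION_INFO_FMT, 56, VFIO_REGION_INFO_FLAG_CAPS, 0, 32, 0x1000, 0)
         + struct.pack(CAP_HEADER_FMT, 1, 1, 48) + bytes(8)
         + struct.pack(CAP_HEADER_FMT, 2, 1, 0))
caps = list(iter_region_caps(reply))
```

The loop guards against cycles and out-of-bounds offsets because the payload comes from the peer and should not be trusted blindly.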
+
+VFIO sparse mmap cap header
+"""""""""""""""""""""""""""
+
++------------------+----------------------------------+
+| Name             | Value                            |
++==================+==================================+
+| id               | VFIO_REGION_INFO_CAP_SPARSE_MMAP |
++------------------+----------------------------------+
+| version          | 0x1                              |
++------------------+----------------------------------+
+| next             |                                  |
++------------------+----------------------------------+
+| sparse mmap info | VFIO region info sparse mmap     |
++------------------+----------------------------------+
+
+This capability is defined when only a subrange of the region supports
+direct access by the client via mmap(). The VFIO sparse mmap area is defined in
+``<linux/vfio.h>`` (``struct vfio_region_sparse_mmap_area`` and ``struct
+vfio_region_info_cap_sparse_mmap``).
+
+VFIO region info cap sparse mmap
+""""""""""""""""""""""""""""""""
+
++----------+--------+------+
+| Name     | Offset | Size |
++==========+========+======+
+| nr_areas | 0      | 4    |
++----------+--------+------+
+| reserved | 4      | 4    |
++----------+--------+------+
+| offset   | 8      | 8    |
++----------+--------+------+
+| size     | 16     | 8    |
++----------+--------+------+
+| ...      |        |      |
++----------+--------+------+
+
+* *nr_areas* is the number of sparse mmap areas in the region.
+* *offset* and *size* describe a single area that can be mapped by the client.
+  There will be *nr_areas* pairs of offset and size. The offset will be added to
+  the base offset given in the ``VFIO_USER_DEVICE_GET_REGION_INFO`` reply to
+  form the offset argument of the subsequent mmap() call.
+
+The VFIO sparse mmap area is defined in ``<linux/vfio.h>`` (``struct
+vfio_region_info_cap_sparse_mmap``).
+
+
+``VFIO_USER_DEVICE_GET_REGION_IO_FDS``
+--------------------------------------
+
+Clients can access regions via ``VFIO_USER_REGION_READ/WRITE`` or, if provided, by
+``mmap()`` of a file descriptor provided by the server.
+
+``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` provides an alternative access mechanism via
+file descriptors.
This is an optional feature intended for performance
+improvements where an underlying sub-system (such as KVM) supports communication
+across such file descriptors to the vfio-user server, without needing to
+round-trip through the client.
+
+The server returns an array of sub-regions for the requested region. Each
+sub-region describes a span (offset and size) of a region, along with the
+requested file descriptor notification mechanism to use. Each sub-region in the
+response message may choose to use a different method, as defined below. The
+two mechanisms supported in this specification are ioeventfds and ioregionfds.
+
+The server in addition returns a file descriptor in the ancillary data; clients
+are expected to configure each sub-region's file descriptor with the requested
+notification method. For example, a client could configure KVM with the
+requested ioeventfd via a ``KVM_IOEVENTFD`` ``ioctl()``.
+
+Request
+^^^^^^^
+
++-------------+--------+------+
+| Name        | Offset | Size |
++=============+========+======+
+| argsz       | 0      | 4    |
++-------------+--------+------+
+| flags       | 4      | 4    |
++-------------+--------+------+
+| index       | 8      | 4    |
++-------------+--------+------+
+| count       | 12     | 4    |
++-------------+--------+------+
+
+* *argsz* is the maximum size of the reply payload
+* *index* is the index of the memory region being queried
+* all other fields must be zero
+
+The client must set ``flags`` to zero and specify the region being queried in
+the ``index``.
+
+Reply
+^^^^^
+
++-------------+--------+------+
+| Name        | Offset | Size |
++=============+========+======+
+| argsz       | 0      | 4    |
++-------------+--------+------+
+| flags       | 4      | 4    |
++-------------+--------+------+
+| index       | 8      | 4    |
++-------------+--------+------+
+| count       | 12     | 4    |
++-------------+--------+------+
+| sub-regions | 16     | ...  |
++-------------+--------+------+
+
+* *argsz* is the size of the region IO FD info structure plus the
+  total size of the sub-region array.
Thus, the size of each array entry is
+  (argsz - 16) / count, and entry "i" starts at offset
+  16 + i * ((argsz - 16) / count) in the reply payload. Note that currently
+  each entry is 40 bytes for both IO FD types, but this is not to be relied on.
+  As elsewhere, this indicates the full reply payload size needed.
+* *flags* must be zero
+* *index* is the index of the memory region being queried
+* *count* is the number of sub-regions in the array
+* *sub-regions* is the array of Sub-Region IO FD info structures
+
+The reply message will additionally include at least one file descriptor in the
+ancillary data. Note that more than one sub-region may share the same file
+descriptor.
+
+Note that it is the client's responsibility to verify the requested values (for
+example, that the requested offset does not exceed the region's bounds).
+
+Each sub-region given in the response has one of two possible structures,
+depending on whether *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` or
+``VFIO_USER_IO_FD_TYPE_IOREGIONFD``:
+
+Sub-Region IO FD info format (ioeventfd)
+""""""""""""""""""""""""""""""""""""""""
+
++-----------+--------+------+
+| Name      | Offset | Size |
++===========+========+======+
+| offset    | 0      | 8    |
++-----------+--------+------+
+| size      | 8      | 8    |
++-----------+--------+------+
+| fd_index  | 16     | 4    |
++-----------+--------+------+
+| type      | 20     | 4    |
++-----------+--------+------+
+| flags     | 24     | 4    |
++-----------+--------+------+
+| padding   | 28     | 4    |
++-----------+--------+------+
+| datamatch | 32     | 8    |
++-----------+--------+------+
+
+* *offset* is the offset of the start of the sub-region within the region
+  requested ("physical address offset" for the region)
+* *size* is the length of the sub-region. This may be zero if the access size is
+  not relevant, which may allow for optimizations
+* *fd_index* is the index in the ancillary data of the FD to use for ioeventfd
+  notification; it may be shared.
+* *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD``
+* *flags* is any of:
+
+  * ``KVM_IOEVENTFD_FLAG_DATAMATCH``
+  * ``KVM_IOEVENTFD_FLAG_PIO``
+  * ``KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY`` (FIXME: makes sense?)
+
+* *datamatch* is the datamatch value if needed
+
+See https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt, *4.59
+KVM_IOEVENTFD* for further context on the ioeventfd-specific fields.
+
+Sub-Region IO FD info format (ioregionfd)
+"""""""""""""""""""""""""""""""""""""""""
+
++-----------+--------+------+
+| Name      | Offset | Size |
++===========+========+======+
+| offset    | 0      | 8    |
++-----------+--------+------+
+| size      | 8      | 8    |
++-----------+--------+------+
+| fd_index  | 16     | 4    |
++-----------+--------+------+
+| type      | 20     | 4    |
++-----------+--------+------+
+| flags     | 24     | 4    |
++-----------+--------+------+
+| padding   | 28     | 4    |
++-----------+--------+------+
+| user_data | 32     | 8    |
++-----------+--------+------+
+
+* *offset* is the offset of the start of the sub-region within the region
+  requested ("physical address offset" for the region)
+* *size* is the length of the sub-region. This may be zero if the access size is
+  not relevant, which may allow for optimizations; ``KVM_IOREGION_POSTED_WRITES``
+  must be set in *flags* in this case
+* *fd_index* is the index in the ancillary data of the FD to use for ioregionfd
+  messages; it may be shared
+* *type* is ``VFIO_USER_IO_FD_TYPE_IOREGIONFD``
+* *flags* is any of:
+
+  * ``KVM_IOREGION_PIO``
+  * ``KVM_IOREGION_POSTED_WRITES``
+
+* *user_data* is an opaque value passed back to the server via a message on the
+  file descriptor
+
+For further information on the ioregionfd-specific fields, see:
+https://lore.kernel.org/kvm/cover.1613828726.git.eafanasova@gmail.com/
+
+(FIXME: update with final API docs.)
+
+``VFIO_USER_DEVICE_GET_IRQ_INFO``
+---------------------------------
+
+This command message is sent by the client to the server to query for
+information about device interrupt types.
The VFIO IRQ info structure is
+defined in ``<linux/vfio.h>`` (``struct vfio_irq_info``).
+
+Request
+^^^^^^^
+
++-------+--------+---------------------------+
+| Name  | Offset | Size                      |
++=======+========+===========================+
+| argsz | 0      | 4                         |
++-------+--------+---------------------------+
+| flags | 4      | 4                         |
++-------+--------+---------------------------+
+|       | +-----+--------------------------+ |
+|       | | Bit | Definition               | |
+|       | +=====+==========================+ |
+|       | | 0   | VFIO_IRQ_INFO_EVENTFD    | |
+|       | +-----+--------------------------+ |
+|       | | 1   | VFIO_IRQ_INFO_MASKABLE   | |
+|       | +-----+--------------------------+ |
+|       | | 2   | VFIO_IRQ_INFO_AUTOMASKED | |
+|       | +-----+--------------------------+ |
+|       | | 3   | VFIO_IRQ_INFO_NORESIZE   | |
+|       | +-----+--------------------------+ |
++-------+--------+---------------------------+
+| index | 8      | 4                         |
++-------+--------+---------------------------+
+| count | 12     | 4                         |
++-------+--------+---------------------------+
+
+* *argsz* is the maximum size of the reply payload (16 bytes today)
+* *index* is the index of the IRQ type being queried (e.g.
``VFIO_PCI_MSIX_IRQ_INDEX``)
+* all other fields must be zero
+
+Reply
+^^^^^
+
++-------+--------+---------------------------+
+| Name  | Offset | Size                      |
++=======+========+===========================+
+| argsz | 0      | 4                         |
++-------+--------+---------------------------+
+| flags | 4      | 4                         |
++-------+--------+---------------------------+
+|       | +-----+--------------------------+ |
+|       | | Bit | Definition               | |
+|       | +=====+==========================+ |
+|       | | 0   | VFIO_IRQ_INFO_EVENTFD    | |
+|       | +-----+--------------------------+ |
+|       | | 1   | VFIO_IRQ_INFO_MASKABLE   | |
+|       | +-----+--------------------------+ |
+|       | | 2   | VFIO_IRQ_INFO_AUTOMASKED | |
+|       | +-----+--------------------------+ |
+|       | | 3   | VFIO_IRQ_INFO_NORESIZE   | |
+|       | +-----+--------------------------+ |
++-------+--------+---------------------------+
+| index | 8      | 4                         |
++-------+--------+---------------------------+
+| count | 12     | 4                         |
++-------+--------+---------------------------+
+
+* *argsz* is the size required for the full reply payload (16 bytes today)
+* *flags* defines IRQ attributes:
+
+  * ``VFIO_IRQ_INFO_EVENTFD`` indicates the IRQ type can support server eventfd
+    signalling.
+  * ``VFIO_IRQ_INFO_MASKABLE`` indicates that the IRQ type supports the ``MASK``
+    and ``UNMASK`` actions in a ``VFIO_USER_DEVICE_SET_IRQS`` message.
+  * ``VFIO_IRQ_INFO_AUTOMASKED`` indicates the IRQ type masks itself after being
+    triggered, and the client must send an ``UNMASK`` action to receive new
+    interrupts.
+  * ``VFIO_IRQ_INFO_NORESIZE`` indicates ``VFIO_USER_DEVICE_SET_IRQS``
+    operations set up interrupts as a set, and new sub-indexes cannot be
+    enabled without disabling the entire type.
+
+* *index* is the index of the IRQ type being queried
+* *count* describes the number of interrupts of the queried type.
+
+``VFIO_USER_DEVICE_SET_IRQS``
+-----------------------------
+
+This command message is sent by the client to the server to set actions for
+device interrupt types.
The VFIO IRQ set structure is defined in
+``<linux/vfio.h>`` (``struct vfio_irq_set``).
+
+Request
+^^^^^^^
+
++-------+--------+------------------------------+
+| Name  | Offset | Size                         |
++=======+========+==============================+
+| argsz | 0      | 4                            |
++-------+--------+------------------------------+
+| flags | 4      | 4                            |
++-------+--------+------------------------------+
+|       | +-----+-----------------------------+ |
+|       | | Bit | Definition                  | |
+|       | +=====+=============================+ |
+|       | | 0   | VFIO_IRQ_SET_DATA_NONE      | |
+|       | +-----+-----------------------------+ |
+|       | | 1   | VFIO_IRQ_SET_DATA_BOOL      | |
+|       | +-----+-----------------------------+ |
+|       | | 2   | VFIO_IRQ_SET_DATA_EVENTFD   | |
+|       | +-----+-----------------------------+ |
+|       | | 3   | VFIO_IRQ_SET_ACTION_MASK    | |
+|       | +-----+-----------------------------+ |
+|       | | 4   | VFIO_IRQ_SET_ACTION_UNMASK  | |
+|       | +-----+-----------------------------+ |
+|       | | 5   | VFIO_IRQ_SET_ACTION_TRIGGER | |
+|       | +-----+-----------------------------+ |
++-------+--------+------------------------------+
+| index | 8      | 4                            |
++-------+--------+------------------------------+
+| start | 12     | 4                            |
++-------+--------+------------------------------+
+| count | 16     | 4                            |
++-------+--------+------------------------------+
+| data  | 20     | variable                     |
++-------+--------+------------------------------+
+
+* *argsz* is the size of the VFIO IRQ set request payload, including any *data*
+  field. Note there is no reply payload, so this field differs from other
+  message types.
+* *flags* defines the action performed on the interrupt range. The ``DATA``
+  flags describe the data field sent in the message; the ``ACTION`` flags
+  describe the action to be performed. Within each of the two sets, the flags
+  are mutually exclusive.
+
+  * ``VFIO_IRQ_SET_DATA_NONE`` indicates there is no data field in the command.
+    The action is performed unconditionally.
+  * ``VFIO_IRQ_SET_DATA_BOOL`` indicates the data field is an array of boolean
+    bytes.
The action is performed if the corresponding boolean is true.
+  * ``VFIO_IRQ_SET_DATA_EVENTFD`` indicates an array of event file descriptors
+    was sent in the message meta-data. These descriptors will be signalled when
+    the action defined by the action flags occurs. In ``AF_UNIX`` sockets, the
+    descriptors are sent as ``SCM_RIGHTS`` type ancillary data.
+    If no file descriptors are provided, this de-assigns the specified
+    previously configured interrupts.
+  * ``VFIO_IRQ_SET_ACTION_MASK`` indicates a masking event. It can be used with
+    ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to mask an interrupt,
+    or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the guest masks
+    the interrupt.
+  * ``VFIO_IRQ_SET_ACTION_UNMASK`` indicates an unmasking event. It can be used
+    with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to unmask an
+    interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the
+    guest unmasks the interrupt.
+  * ``VFIO_IRQ_SET_ACTION_TRIGGER`` indicates a triggering event. It can be used
+    with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to trigger an
+    interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the
+    server triggers the interrupt.
+
+* *index* is the index of the IRQ type being set up.
+* *start* is the start of the sub-index being set.
+* *count* describes the number of sub-indexes being set. As a special case, a
+  count (and start) of 0, with data flags of ``VFIO_IRQ_SET_DATA_NONE`` disables
+  all interrupts of the index.
+* *data* is an optional field included when the
+  ``VFIO_IRQ_SET_DATA_BOOL`` flag is present. It contains an array of booleans
+  that specify whether the action is to be performed on the corresponding
+  index. It's used when the action is only performed on a subset of the range
+  specified.
+
+Not all interrupt types support every combination of data and action flags.
+The client must know the capabilities of the device and IRQ index before it
+sends a ``VFIO_USER_DEVICE_SET_IRQS`` message.
+
+In typical operation, a specific IRQ may operate as follows:
+
+1. The client sends a ``VFIO_USER_DEVICE_SET_IRQS`` message with
+   ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_TRIGGER)`` along
+   with an eventfd. This associates the IRQ with a particular eventfd on the
+   server side.
+
+2. The client may send a ``VFIO_USER_DEVICE_SET_IRQS`` message with
+   ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_MASK/UNMASK)`` along
+   with another eventfd. This associates the given eventfd with the
+   mask/unmask state on the server side.
+
+3. The server may trigger the IRQ by writing 1 to the eventfd.
+
+4. The server may mask/unmask an IRQ which will write 1 to the corresponding
+   mask/unmask eventfd, if there is one.
+
+5. A client may trigger a device IRQ itself, by sending a
+   ``VFIO_USER_DEVICE_SET_IRQS`` message with
+   ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_TRIGGER)``.
+
+6. A client may mask or unmask the IRQ, by sending a
+   ``VFIO_USER_DEVICE_SET_IRQS`` message with
+   ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_MASK/UNMASK)``.
+
+Reply
+^^^^^
+
+There is no payload in the reply.
+
+.. _Read and Write Operations:
+
+Note that all of these operations must be supported by the client and/or server,
+even if the corresponding memory or device region has been shared as mappable.
+
+The ``count`` field must not exceed the value of ``max_data_xfer_size`` of the
+peer, for both reads and writes.
+
+``VFIO_USER_REGION_READ``
+-------------------------
+
+If a device region is not mappable, it's not directly accessible by the client
+via ``mmap()`` of the underlying file descriptor. In this case, a client can
+read from a device region with this message.
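
Because ``count`` may not exceed the peer's ``max_data_xfer_size``, a read larger than that limit has to be issued as several requests. The Python sketch below shows one way to split such a read into request payloads using the 8-byte *offset*, 4-byte *region* and 4-byte *count* layout of the ``VFIO_USER_REGION_READ`` request. It assumes little-endian encoding, and the helper name ``split_region_read`` is hypothetical.

```python
import struct

# offset (8 bytes), region (4 bytes), count (4 bytes).
# Assumption: fields are encoded little-endian.
REGION_ACCESS_FMT = "<QII"


def split_region_read(region, offset, length, max_data_xfer_size):
    """Return request payloads that together cover [offset, offset + length)."""
    if max_data_xfer_size <= 0:
        raise ValueError("max_data_xfer_size must be positive")
    requests = []
    done = 0
    while done < length:
        # Each request transfers at most max_data_xfer_size bytes.
        count = min(max_data_xfer_size, length - done)
        requests.append(
            struct.pack(REGION_ACCESS_FMT, offset + done, region, count))
        done += count
    return requests


reqs = split_region_read(region=1, offset=0x100, length=10, max_data_xfer_size=4)
decoded = [struct.unpack(REGION_ACCESS_FMT, r) for r in reqs]
```

The same splitting applies to writes, with the data for each chunk appended after the fixed fields.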
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+* *data* is the data that was read from the device region.
+
+``VFIO_USER_REGION_WRITE``
+--------------------------
+
+If a device region is not mappable, it's not directly accessible by the client
+via mmap() of the underlying fd. In this case, a client can write to a device
+region with this message.
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DMA_READ``
+----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+read from guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed. This address must have
+  been previously exported to the server with a ``VFIO_USER_DMA_MAP`` message.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+* *data* is the data read.
+
+``VFIO_USER_DMA_WRITE``
+-----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+write to guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
This address must have
+  been previously exported to the server with a ``VFIO_USER_DMA_MAP`` message.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 4        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DEVICE_RESET``
+--------------------------
+
+This command message is sent from the client to the server to reset the device.
+Neither the request nor the reply has a payload.
+
+``VFIO_USER_REGION_WRITE_MULTI``
+--------------------------------
+
+This message can be used to coalesce multiple device write operations
+into a single message. It is only used as an optimization when the
+outgoing message queue is relatively full.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| wr_cnt  | 0      | 8        |
++---------+--------+----------+
+| wrs     | 8      | variable |
++---------+--------+----------+
+
+* *wr_cnt* is the number of device writes coalesced in the message
+* *wrs* is an array of device writes defined below
+
+Single Device Write Format
+""""""""""""""""""""""""""
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | 8        |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred. This format can
+  only describe writes of 8 bytes or less.
+* *data* is the data to write.
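
As an illustration of the coalesced format, the following Python sketch packs a ``VFIO_USER_REGION_WRITE_MULTI`` request payload from a list of small writes. It is a sketch under assumptions: little-endian encoding, zero padding of the unused bytes in the 8-byte *data* field, and a hypothetical helper name ``pack_write_multi``.

```python
import struct

# One coalesced write: offset (8), region (4), count (4), data (8).
# Assumptions: little-endian encoding; data shorter than 8 bytes is
# zero-padded, which is what the "8s" format does.
WRITE_FMT = "<QII8s"
WRITE_SIZE = struct.calcsize(WRITE_FMT)


def pack_write_multi(writes):
    """Pack (offset, region, data) tuples into one request payload."""
    entries = []
    for offset, region, data in writes:
        if len(data) > 8:
            raise ValueError("this format only describes writes of 8 bytes or less")
        entries.append(struct.pack(WRITE_FMT, offset, region, len(data), data))
    # wr_cnt (8 bytes) is followed by the array of writes.
    return struct.pack("<Q", len(entries)) + b"".join(entries)


msg = pack_write_multi([(0x10, 0, b"\x01\x02\x03\x04"), (0x18, 1, b"\xff")])
wr_cnt = struct.unpack_from("<Q", msg, 0)[0]
first = struct.unpack_from(WRITE_FMT, msg, 8)
```

Larger writes do not fit this format and would need to be sent as ordinary ``VFIO_USER_REGION_WRITE`` messages.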
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| wr_cnt  | 0      | 8        |
++---------+--------+----------+
+
+* *wr_cnt* is the number of device writes completed.
+
+
+Appendices
+==========
+
+Unused VFIO ``ioctl()`` commands
+--------------------------------
+
+The following VFIO commands do not have an equivalent vfio-user command:
+
+* ``VFIO_GET_API_VERSION``
+* ``VFIO_CHECK_EXTENSION``
+* ``VFIO_SET_IOMMU``
+* ``VFIO_GROUP_GET_STATUS``
+* ``VFIO_GROUP_SET_CONTAINER``
+* ``VFIO_GROUP_UNSET_CONTAINER``
+* ``VFIO_GROUP_GET_DEVICE_FD``
+* ``VFIO_IOMMU_GET_INFO``
+
+However, once support for live migration for VFIO devices is finalized, some
+of the above commands may have to be handled by the client in their
+corresponding vfio-user form. This will be addressed in a future protocol
+version.
+
+VFIO groups and containers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The current VFIO implementation includes group and container idioms that
+describe how a device relates to the host IOMMU. In the vfio-user
+implementation, the IOMMU is implemented in software by the client, and is not
+visible to the server. The simplest approach would be for the client to put
+each device into its own group and container.
+
+Backend Program Conventions
+---------------------------
+
+vfio-user backend program conventions are based on the vhost-user ones.
+
+* The backend program must not daemonize itself.
+* No assumptions must be made as to what access the backend program has on the
+  system.
+* File descriptors 0, 1 and 2 must exist, must have regular
+  stdin/stdout/stderr semantics, and can be redirected.
+* The backend program must honor the SIGTERM signal.
+* The backend program must accept the following commands line options: + + * ``--socket-path=PATH``: path to UNIX domain socket, + * ``--fd=FDNUM``: file descriptor for UNIX domain socket, incompatible with + ``--socket-path`` +* The backend program must be accompanied with a JSON file stored under + ``/usr/share/vfio-user``. + +TODO add schema similar to docs/interop/vhost-user.json. From patchwork Wed Jan 8 11:50:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930728 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0833BE77188 for ; Wed, 8 Jan 2025 11:59:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdO-0006SL-Oa; Wed, 08 Jan 2025 06:54:02 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdE-0006Pn-4N for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:52 -0500 Received: from mx0b-002c1b01.pphosted.com ([148.163.155.12]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdA-0002Es-UL for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:51 -0500 Received: from pps.filterd (m0127842.ppops.net [127.0.0.1]) by mx0b-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5087vts6007169; Wed, 8 Jan 2025 03:53:48 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; 
bh=XQDDwFQyZm+cONjGQVEFxxSms0Hyy119B4qDMsr8z 84=; b=IKqPA2YsnTVU+NmsV3iCJmAJjVcDk/XCg0eLRMe0wzzgJNeQ8+1cL4CBP 10SJlqnVHx7jROml22oGjOOZVZ4D2xYKYgL5dL/RtBsH6bhYIGgzRVSCAxq62TSx Np+sjYPiuQpfOpvWrGvxrStBDIW3fIHp1qbRmWwpWBFWzWe5tY6uYVdl+lxlF29+ UEjbNRZ0ghITdMEdvPpFiSs8aoTWUpZ+6m6Snp++8Lfpn6CNis1D8NheJgsVgbYG yVTZGoloxZwVPD5tZ7lYqckrM31mxLL3OelLa8jhxocOcSwgBdFqQKsXXBUQPv+M aiyY3gaXhnDtplFeXRtLgps7O9u3w== Received: from nam04-dm6-obe.outbound.protection.outlook.com (mail-dm6nam04lp2048.outbound.protection.outlook.com [104.47.73.48]) by mx0b-002c1b01.pphosted.com (PPS) with ESMTPS id 43y56eryxp-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=LZN5crZp3dWJWVL0RC2aUIhBvm4tzY+jdccczkP5STDot8RPbqPAnS3to3oix3ZYSHWv4uAQv1+/RZcGygi8W57cUilZz3DQ9bBbszwudi1akQRUmafFABiUHORetZ6OQUxDr8tmMlD6JJ6zPlto8BkYbKjt9f+csOdJsNiWbSVRiNyEUyI5RzSV+3OpMimgSmfI4UjbJWZIaWTmTgETeGtH4H9z1wKziLHOUiow3mTfFmryZHNK6R2P+muiuJYUubpQ5cYJqTimSQAl1acx/kIp6/rCsTR1aR7hedTfd9niUyPHniUwehgKPcqVmhtQy4bOC+i32zfec4AY880IlA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XQDDwFQyZm+cONjGQVEFxxSms0Hyy119B4qDMsr8z84=; b=kI2LMtnSVJeR6nJjcSgWh1EjpvG8+53OCVrfLGF9mgmoBjpQZ6Zd83tk4vNylrdwdFB/hQnfgZyrlkp/cQiXmiv7xIOY/bRX/oC///phESDt27SqlH7K5OdFb4dnA3nRsCPvn4X4p5ntHN+QHTWXh5wRE9EyiYqjf/yQPH1HOIApR6FfVdg5fWRkSrZn9KPmE/ch9GqBA2Q0DXdd1lFb94/xlyK9dbxcHJF6JPKDha2lzmlYI5seLO5ZZieapDrV7os/5iWz7gdM6cG6RfCUUhvtfuwcHFyixMALRwEx0Q51lP4gcSKSJnV4BvRnawPFRn0poPu/AjJC/23+N2fofQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: 
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 10/26] vfio-user: add vfio-user class and container
Date: Wed, 8 Jan 2025 11:50:16 +0000
Message-Id: <20250108115032.1677686-11-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: Jagannathan Raman

Add a new class for vfio-user with its class and instance
constructors and destructors, and its pci ops.
Introduce VFIOUserContainer for handling container operations
for such classes.

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 MAINTAINERS                           |   2 +
 hw/vfio/container.c                   |   2 +-
 hw/vfio/meson.build                   |   5 +
 hw/vfio/pci.c                         |  12 +-
 hw/vfio/pci.h                         |   7 +
 hw/vfio/user-container.c              | 222 ++++++++++++++++++++++++++
 hw/vfio/user-pci.c                    | 158 ++++++++++++++++++
 include/hw/vfio/vfio-common.h         |  10 ++
 include/hw/vfio/vfio-container-base.h |   1 +
 meson_options.txt                     |   2 +
 scripts/meson-buildoptions.sh         |   4 +
 11 files changed, 418 insertions(+), 7 deletions(-)
 create mode 100644 hw/vfio/user-container.c
 create mode 100644 hw/vfio/user-pci.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f60f0a4dd2..b0f9b54500 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4139,6 +4139,8 @@ M: John Levon
 M: Thanos Makatos
 S: Supported
 F: docs/devel/vfio-user.rst
+F: hw/vfio/user-container.c
+F: hw/vfio/user-pci.c
 F: subprojects/libvfio-user
 
 EBPF:
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index e0fd5a153b..039241c9c5 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -886,7 +886,7 @@ static bool vfio_get_device(VFIOGroup *group, const char *name,
     return true;
 }
 
-static void vfio_put_base_device(VFIODevice *vbasedev)
+void vfio_put_base_device(VFIODevice *vbasedev)
 {
     if (vbasedev->regions != NULL) {
         int i;
diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
index bba776f75c..f897c5b81a 100644
--- a/hw/vfio/meson.build
+++ b/hw/vfio/meson.build
@@ -16,6 +16,11 @@ vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files(
   'pci-quirks.c',
   'pci.c',
 ))
+
+if get_option('vfio_user_client').enabled()
+  vfio_ss.add(files('user-container.c', 'user-pci.c'))
+endif
+
 vfio_ss.add(when: 'CONFIG_VFIO_CCW', if_true: files('ccw.c'))
 vfio_ss.add(when: 'CONFIG_VFIO_PLATFORM', if_true: files('platform.c'))
 vfio_ss.add(when: 'CONFIG_VFIO_XGMAC', if_true: files('calxeda-xgmac.c'))
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c6d7ebfd9b..27f82d6517 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -109,7 +109,7 @@ static void vfio_intx_interrupt(void *opaque)
     }
 }
 
-static void vfio_intx_eoi(VFIODevice *vbasedev)
+void vfio_intx_eoi(VFIODevice *vbasedev)
 {
     VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
 
@@ -2585,7 +2585,7 @@ static void vfio_pci_compute_needs_reset(VFIODevice *vbasedev)
     }
 }
 
-static Object *vfio_pci_get_object(VFIODevice *vbasedev)
+Object *vfio_pci_get_object(VFIODevice *vbasedev)
 {
     VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
 
@@ -2641,7 +2641,7 @@ static const VMStateDescription vmstate_vfio_pci_config = {
     }
 };
 
-static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
+int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
 {
     VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
 
@@ -2649,7 +2649,7 @@ static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
                                        errp);
 }
 
-static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
+int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
 {
     VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
     PCIDevice *pdev = &vdev->pdev;
@@ -2845,7 +2845,7 @@ static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
     return true;
 }
 
-static void vfio_pci_put_device(VFIOPCIDevice *vdev)
+void vfio_pci_put_device(VFIOPCIDevice *vdev)
 {
     vfio_detach_device(&vdev->vbasedev);
 
@@ -3367,7 +3367,7 @@ post_reset:
     vfio_pci_post_reset(vdev);
 }
 
-static void vfio_instance_init(Object *obj)
+void vfio_instance_init(Object *obj)
 {
     PCIDevice *pci_dev = PCI_DEVICE(obj);
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 8e79740ddb..c0f030f4db 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -213,6 +213,13 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
 void vfio_pci_write_config(PCIDevice *pdev,
                            uint32_t addr, uint32_t val, int len);
 
+void vfio_intx_eoi(VFIODevice *vbasedev);
+Object *vfio_pci_get_object(VFIODevice *vbasedev);
+int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp);
+int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f);
+void vfio_pci_put_device(VFIOPCIDevice *vdev);
+void vfio_instance_init(Object *obj);
+
 uint64_t vfio_vga_read(void *opaque, hwaddr addr, unsigned size);
 void vfio_vga_write(void *opaque, hwaddr addr, uint64_t data, unsigned size);
 
diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c
new file mode 100644
index 0000000000..f0e2dc6b6b
--- /dev/null
+++ b/hw/vfio/user-container.c
@@ -0,0 +1,222 @@
+/*
+ * Container for vfio-user IOMMU type: rather than communicating with the kernel
+ * vfio driver, we communicate over a socket to a server using the vfio-user
+ * protocol.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include
+#include
+
+#include "hw/vfio/vfio-common.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "exec/ram_addr.h"
+#include "hw/hw.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "trace.h"
+#include "qapi/error.h"
+#include "pci.h"
+
+static int vfio_user_dma_unmap(const VFIOContainerBase *bcontainer,
+                               hwaddr iova, ram_addr_t size,
+                               IOMMUTLBEntry *iotlb, int flags)
+{
+    return -ENOTSUP;
+}
+
+static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
+                             ram_addr_t size, void *vaddr, bool readonly,
+                             MemoryRegion *mrp)
+{
+    return -ENOTSUP;
+}
+
+static int
+vfio_user_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
+                                  bool start, Error **errp)
+{
+    error_setg_errno(errp, ENOTSUP, "Not supported");
+    return -ENOTSUP;
+}
+
+static int vfio_user_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
+                                        VFIOBitmap *vbmap, hwaddr iova,
+                                        hwaddr size, Error **errp)
+{
+    error_setg_errno(errp, ENOTSUP, "Not supported");
+    return -ENOTSUP;
+}
+
+static bool vfio_user_setup(VFIOContainerBase *bcontainer, Error **errp)
+{
+    error_setg_errno(errp, ENOTSUP, "Not supported");
+    return -ENOTSUP;
+}
+
+static VFIOUserContainer *vfio_create_user_container(Error **errp)
+{
+    VFIOUserContainer *container;
+
+    container = VFIO_IOMMU_USER(object_new(TYPE_VFIO_IOMMU_USER));
+    return container;
+}
+
+/*
+ * Try to mirror vfio_connect_container() as much as possible.
+ */
+static VFIOUserContainer *
+vfio_connect_user_container(AddressSpace *as, Error **errp)
+{
+    VFIOContainerBase *bcontainer;
+    VFIOUserContainer *container;
+    VFIOAddressSpace *space;
+    VFIOIOMMUClass *vioc;
+
+    space = vfio_get_address_space(as);
+
+    container = vfio_create_user_container(errp);
+    if (!container) {
+        goto put_space_exit;
+    }
+
+    bcontainer = &container->bcontainer;
+
+    if (!vfio_cpr_register_container(bcontainer, errp)) {
+        goto free_container_exit;
+    }
+
+    vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+    assert(vioc->setup);
+
+    if (!vioc->setup(bcontainer, errp)) {
+        goto unregister_container_exit;
+    }
+
+    vfio_address_space_insert(space, bcontainer);
+
+    bcontainer->listener = vfio_memory_listener;
+    memory_listener_register(&bcontainer->listener, bcontainer->space->as);
+
+    if (bcontainer->error) {
+        errno = EINVAL;
+        error_propagate_prepend(errp, bcontainer->error,
+            "memory listener initialization failed: ");
+        goto listener_release_exit;
+    }
+
+    bcontainer->initialized = true;
+
+    return container;
+
+listener_release_exit:
+    memory_listener_unregister(&bcontainer->listener);
+    if (vioc->release) {
+        vioc->release(bcontainer);
+    }
+
+unregister_container_exit:
+    vfio_cpr_unregister_container(bcontainer);
+
+free_container_exit:
+    object_unref(container);
+
+put_space_exit:
+    vfio_put_address_space(space);
+
+    return NULL;
+}
+
+static void vfio_disconnect_user_container(VFIOUserContainer *container)
+{
+    VFIOContainerBase *bcontainer = &container->bcontainer;
+    VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+
+    memory_listener_unregister(&bcontainer->listener);
+    if (vioc->release) {
+        vioc->release(bcontainer);
+    }
+
+    VFIOAddressSpace *space = bcontainer->space;
+
+    vfio_cpr_unregister_container(bcontainer);
+    object_unref(container);
+
+    vfio_put_address_space(space);
+}
+
+static bool vfio_user_get_device(VFIOUserContainer *container,
+                                 VFIODevice *vbasedev, Error **errp)
+{
+    struct vfio_device_info info = { 0 };
+
+    vbasedev->fd = -1;
+
+    vfio_prepare_device(vbasedev, &container->bcontainer, NULL, &info);
+
+    return true;
+}
+
+/*
+ * vfio_user_attach_device: attach a device to a new container.
+ */
+static bool vfio_user_attach_device(const char *name, VFIODevice *vbasedev,
+                                    AddressSpace *as, Error **errp)
+{
+    VFIOUserContainer *container;
+
+    container = vfio_connect_user_container(as, errp);
+    if (container == NULL) {
+        error_prepend(errp, "failed to connect proxy");
+        return false;
+    }
+
+    return vfio_user_get_device(container, vbasedev, errp);
+}
+
+static void vfio_user_detach_device(VFIODevice *vbasedev)
+{
+    VFIOUserContainer *container = container_of(vbasedev->bcontainer,
+                                                VFIOUserContainer, bcontainer);
+
+    QLIST_REMOVE(vbasedev, global_next);
+    QLIST_REMOVE(vbasedev, container_next);
+    vbasedev->bcontainer = NULL;
+    vfio_put_base_device(vbasedev);
+    vfio_disconnect_user_container(container);
+}
+
+static int vfio_user_pci_hot_reset(VFIODevice *vbasedev, bool single)
+{
+    /* ->needs_reset is always false for vfio-user. */
+    return 0;
+}
+
+static void vfio_iommu_user_class_init(ObjectClass *klass, void *data)
+{
+    VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+
+    vioc->setup = vfio_user_setup;
+    vioc->dma_map = vfio_user_dma_map;
+    vioc->dma_unmap = vfio_user_dma_unmap;
+    vioc->attach_device = vfio_user_attach_device;
+    vioc->detach_device = vfio_user_detach_device;
+    vioc->set_dirty_page_tracking = vfio_user_set_dirty_page_tracking;
+    vioc->query_dirty_bitmap = vfio_user_query_dirty_bitmap;
+    vioc->pci_hot_reset = vfio_user_pci_hot_reset;
+};
+
+static const TypeInfo types[] = {
+    {
+        .name = TYPE_VFIO_IOMMU_USER,
+        .parent = TYPE_VFIO_IOMMU,
+        .instance_size = sizeof(VFIOUserContainer),
+        .class_init = vfio_iommu_user_class_init,
+    },
+};
+
+DEFINE_TYPES(types)
diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c
new file mode 100644
index 0000000000..a06115cd55
--- /dev/null
+++ b/hw/vfio/user-pci.c
@@ -0,0 +1,158 @@
+/*
+ * vfio PCI device over a UNIX socket.
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include
+#include
+
+#include "hw/hw.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "migration/vmstate.h"
+#include "qapi/qmp/qdict.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/units.h"
+#include "system/kvm.h"
+#include "pci.h"
+#include "trace.h"
+#include "qapi/error.h"
+#include "migration/blocker.h"
+#include "migration/qemu-file.h"
+
+#define TYPE_VFIO_USER_PCI "vfio-user-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI)
+
+struct VFIOUserPCIDevice {
+    VFIOPCIDevice device;
+    char *sock_name;
+};
+
+/*
+ * Emulated devices don't use host hot reset
+ */
+static void vfio_user_compute_needs_reset(VFIODevice *vbasedev)
+{
+    vbasedev->needs_reset = false;
+}
+
+static VFIODeviceOps vfio_user_pci_ops = {
+    .vfio_compute_needs_reset = vfio_user_compute_needs_reset,
+    .vfio_eoi = vfio_intx_eoi,
+    .vfio_get_object = vfio_pci_get_object,
+    .vfio_save_config = vfio_pci_save_config,
+    .vfio_load_config = vfio_pci_load_config,
+};
+
+static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
+{
+    ERRP_GUARD();
+    VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    AddressSpace *as;
+
+    /*
+     * TODO: make option parser understand SocketAddress
+     * and use that instead of having scalar options
+     * for each socket type.
+     */
+    if (!udev->sock_name) {
+        error_setg(errp, "No socket specified");
+        error_append_hint(errp, "Use -device vfio-user-pci,socket=\n");
+        return;
+    }
+
+    vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
+    vbasedev->ops = &vfio_user_pci_ops;
+    vbasedev->type = VFIO_DEVICE_TYPE_PCI;
+    vbasedev->dev = DEVICE(vdev);
+
+    /*
+     * vfio-user devices are effectively mdevs (don't use a host iommu).
+     */
+    vbasedev->mdev = true;
+
+    as = pci_device_iommu_address_space(pdev);
+    if (!vfio_attach_device_by_iommu_type(TYPE_VFIO_IOMMU_USER,
+                                          vbasedev->name, vbasedev,
+                                          as, errp)) {
+        error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name);
+        return;
+    }
+}
+
+static void vfio_user_instance_init(Object *obj)
+{
+    PCIDevice *pci_dev = PCI_DEVICE(obj);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+
+    device_add_bootindex_property(obj, &vdev->bootindex,
+                                  "bootindex", NULL,
+                                  &pci_dev->qdev);
+    vdev->host.domain = ~0U;
+    vdev->host.bus = ~0U;
+    vdev->host.slot = ~0U;
+    vdev->host.function = ~0U;
+
+    vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PCI, &vfio_user_pci_ops,
+                     &vfio_dev_io_ioctl, DEVICE(vdev), false);
+
+    vdev->nv_gpudirect_clique = 0xFF;
+
+    /*
+     * QEMU_PCI_CAP_EXPRESS initialization does not depend on QEMU command
+     * line, therefore, no need to wait to realize like other devices.
+     */
+    pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
+}
+
+static void vfio_user_instance_finalize(Object *obj)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
+
+    vfio_pci_put_device(vdev);
+}
+
+static const Property vfio_user_pci_dev_properties[] = {
+    DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
+};
+
+static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
+
+    device_class_set_props(dc, vfio_user_pci_dev_properties);
+    dc->desc = "VFIO over socket PCI device assignment";
+    pdc->realize = vfio_user_pci_realize;
+}
+
+static const TypeInfo vfio_user_pci_dev_info = {
+    .name = TYPE_VFIO_USER_PCI,
+    .parent = TYPE_VFIO_PCI_BASE,
+    .instance_size = sizeof(VFIOUserPCIDevice),
+    .class_init = vfio_user_pci_dev_class_init,
+    .instance_init = vfio_user_instance_init,
+    .instance_finalize = vfio_user_instance_finalize,
+};
+
+static void register_vfio_user_dev_type(void)
+{
+    type_register_static(&vfio_user_pci_dev_info);
+}
+
+type_init(register_vfio_user_dev_type)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1104ed63e3..50afa944ae 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -80,6 +80,7 @@ typedef struct VFIOMigration {
 
 struct VFIOGroup;
 
+/* MMU container sub-class for legacy vfio implementation. */
 typedef struct VFIOContainer {
     VFIOContainerBase bcontainer;
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
@@ -106,6 +107,7 @@ typedef struct VFIOIOASHwpt {
     QLIST_ENTRY(VFIOIOASHwpt) next;
 } VFIOIOASHwpt;
 
+/* MMU container sub-class for vfio iommufd implementation. */
 typedef struct VFIOIOMMUFDContainer {
     VFIOContainerBase bcontainer;
     IOMMUFDBackend *be;
@@ -115,6 +117,13 @@ typedef struct VFIOIOMMUFDContainer {
 
 OBJECT_DECLARE_SIMPLE_TYPE(VFIOIOMMUFDContainer, VFIO_IOMMU_IOMMUFD);
 
+/* MMU container sub-class for vfio-user. */
+typedef struct VFIOUserContainer {
+    VFIOContainerBase bcontainer;
+} VFIOUserContainer;
+
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserContainer, VFIO_IOMMU_USER);
+
 typedef struct VFIODeviceOps VFIODeviceOps;
 typedef struct VFIODeviceIO VFIODeviceIO;
 
@@ -284,6 +293,7 @@ bool vfio_attach_device_by_iommu_type(const char *iommu_type, char *name,
                                       VFIODevice *vbasedev, AddressSpace *as,
                                       Error **errp);
 void vfio_detach_device(VFIODevice *vbasedev);
+void vfio_put_base_device(VFIODevice *vbasedev);
 
 int vfio_kvm_device_add_fd(int fd, Error **errp);
 int vfio_kvm_device_del_fd(int fd, Error **errp);
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 24e48e3a07..1ce93c5b9b 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -100,6 +100,7 @@ vfio_container_get_page_size_mask(const VFIOContainerBase *bcontainer)
 #define TYPE_VFIO_IOMMU_LEGACY TYPE_VFIO_IOMMU "-legacy"
 #define TYPE_VFIO_IOMMU_SPAPR TYPE_VFIO_IOMMU "-spapr"
 #define TYPE_VFIO_IOMMU_IOMMUFD TYPE_VFIO_IOMMU "-iommufd"
+#define TYPE_VFIO_IOMMU_USER TYPE_VFIO_IOMMU "-user"
 
 OBJECT_DECLARE_TYPE(VFIOContainerBase, VFIOIOMMUClass, VFIO_IOMMU)
 
diff --git a/meson_options.txt b/meson_options.txt
index 5eeaf3eee5..ba9bc07fcf 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -109,6 +109,8 @@ option('multiprocess', type: 'feature', value: 'auto',
        description: 'Out of process device emulation support')
 option('relocatable', type : 'boolean', value : true,
        description: 'toggle relocatable install')
+option('vfio_user_client', type: 'feature', value: 'disabled',
+       description: 'vfio-user client support')
 option('vfio_user_server', type: 'feature', value: 'disabled',
        description: 'vfio-user server support')
 option('dbus_display', type: 'feature', value: 'auto',
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index a8066aab03..6ee381df8c 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -201,6 +201,8 @@ meson_options_help() {
   printf "%s\n" ' vdi vdi image format support'
   printf "%s\n" ' vduse-blk-export'
   printf "%s\n" ' VDUSE block export support'
+  printf "%s\n" ' vfio-user-client'
+  printf "%s\n" ' vfio-user client support'
   printf "%s\n" ' vfio-user-server'
   printf "%s\n" ' vfio-user server support'
   printf "%s\n" ' vhdx vhdx image format support'
@@ -529,6 +531,8 @@ _meson_option_parse() {
     --disable-vdi) printf "%s" -Dvdi=disabled ;;
     --enable-vduse-blk-export) printf "%s" -Dvduse_blk_export=enabled ;;
     --disable-vduse-blk-export) printf "%s" -Dvduse_blk_export=disabled ;;
+    --enable-vfio-user-client) printf "%s" -Dvfio_user_client=enabled ;;
+    --disable-vfio-user-client) printf "%s" -Dvfio_user_client=disabled ;;
     --enable-vfio-user-server) printf "%s" -Dvfio_user_server=enabled ;;
     --disable-vfio-user-server) printf "%s" -Dvfio_user_server=disabled ;;
     --enable-vhdx) printf "%s" -Dvhdx=enabled ;;

From patchwork Wed Jan 8 11:50:17 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930702
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 11/26] vfio-user: connect vfio proxy to remote server
Date: Wed, 8 Jan 2025 11:50:17 +0000
Message-Id: <20250108115032.1677686-12-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>
=?utf-8?q?beq1yY24f93YxKGlNuL2tJ15g3ykzjlkBwkBaJHNqh3+RaaRQLT0zWr6tAdQ0fdi4?= =?utf-8?q?4B2LqHn44VNIde6beb3RahUT5BVwWlsASudIw16MM6jrqV23A2fJ88nYCAqHqqXW7?= =?utf-8?q?S1yoF+2RgtJGVMwvuLBwQWkyGDjczb0zIW77jSzsRjjSLhuRnJUGiLQVi0Kb++ixW?= =?utf-8?q?UZ5Z7/WWhu5e0G7/QRRcJnaJMjeLxxmseLAOja/lMUKg+m+4qcBYXRUJsXI677y1Y?= =?utf-8?q?WzWGjS80SPNvXhQL6NLMAsNylNjYQQoAtbjwiaL58Jz8znd46ugLCm5AfKxHh1VPp?= =?utf-8?q?anSUwfdEOG4D9sOOwH5KC19dMvVjh9cP/1jLsPckMbVZtfWpJsGBLRUJarVnx6KI9?= =?utf-8?q?4+52AXv6k4Fano65OwZBBy9yza1HwTg4tUkSjs/MX7g6FVocQimPXGAfujZG2tjGo?= =?utf-8?q?V5CEVvRS2TPY9LiVr/LRu1mcaiiDV/wAVF/VYbtPJQdclg9217fUBxpVxG3UQjn13?= =?utf-8?q?DxD2rlEJN98vNlN2f15BMrQE4qafi7uHChJI8mWDHcB3RMUZlNqG1r14U1U/D2/Du?= =?utf-8?q?0nDKQ9C5dk78ydpEk0Q+K85ovbquoRlTMfLlKl9dqEzKCI/LR9oeu3fhTH2YMMAeA?= =?utf-8?q?NSxOhMNsk8wOYX7SI/XgsOG3T+2u5VFDbUl8cnfZAylPPpjcNigHDBIdxirnYbxnE?= =?utf-8?q?6/6ZYOftRPFj8eRNb6pqeeVDUwOS+6Vpvbo7Kd9lC27RA4zZzOwHN6QxiHRjKcKss?= =?utf-8?q?5wO+FnsQiaLfXeAHDS4CSKHCdvEljMi5Z7NuLfxABSSOVTHLWFPGbHJs0gct+NpSV?= =?utf-8?q?7SjQM3xrLT/VcQKz7YtOfvlnhRCrV74b7CupLksHZNg8Qm2WZ1oGdV7xoSj5T2gny?= =?utf-8?q?8QOwbQIVrvpXzFFKENLteeHHKJkhneYAPjbi/dtrj7/lk2OkOuIs++Pl5t0UC80W7?= =?utf-8?q?tT1I0fJ/a1rj9cT/LHGhHV795EW8yzgZkMN/VbNh9SvZIUuPAKQko8mMFO+HaDFwi?= =?utf-8?q?xISVpkKCpt3w?= X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?q?clmnlOd6w2fDLte6zYlBVWlsL0n+?= =?utf-8?q?ZZn8ookDrsL/5kvNz3sd5gOsVmLrNt+zJwg9JnGhh7/mNQdmI1CrhFC7blNfGn0TH?= =?utf-8?q?LedM2UVDUuFnOxBRbnlMENUkgMxWwft8m+ta4PIPCX+4FWND0A/PgF1XIn6LAN7eg?= =?utf-8?q?JVPYkpomDCNbtzRTdpX5r1Advsg12xnsWkE+E6d8bIJDCbREJxObWrSJYL80S1bbp?= =?utf-8?q?Uko+rV+ZlZs2r99M8jFiR/w2/LxNzhy3Scv4kEtKr8zdYekOYfcVYjWT/oQMTMt2D?= 
=?utf-8?q?blxU2pSmN2L0HsjRpkCQ5N5cJb0XGmAWwGT0Uec8p2bVkLIUk2WjiFgtoohRxSNsf?= =?utf-8?q?vj37oQfdS1ZjaseeoFbCFumcNSBqayJ9vbHObw7GrYwdl7VCgsO9pTrKZK5TZyIQV?= =?utf-8?q?XCWHGAyMPi9TBBmJ1zari+KCixRzRew9Y9saQIyZWYnagIi+u9hw+JzRXQ+7aL29u?= =?utf-8?q?3mWPF3KxdyxtI20AO7rjyEGR9ovlFxB6GkYDld/2SndAjWFcIpwnRFRZoQF7U/xR6?= =?utf-8?q?qOIVvTB6nqFFBKKC8nw3PgcwHBdnV4tqyagvLUsl3mM2LO1yXajOXoTDjiu3Vm5cZ?= =?utf-8?q?0Ao6FyaTxkABHRUfZRyJw5xLMnhuK/BlvnVAnGYPhdFOmSNnQPtCJ/SUOM/UJ8KfX?= =?utf-8?q?UhmsmC0/uc+yQefZjPc5rcfeYXhwSwVQhId2UqWE3lWN3uXujQFfc+vDKqN33I2qp?= =?utf-8?q?IzEC6KKvt9DZpXMxmnkGGhRWxEKsj+cVU47VMIveupuLFdWQXGWAHUjedmjTkp+hU?= =?utf-8?q?9C4MHja/f8VWQdBbm95mZ1p09CF2kTN3KIGPyhGuJQjkyodizLkziyfGJPO0/5iK8?= =?utf-8?q?y2KkRlqYDV+wxrSq71U0PLl5aQwcRNG1wdky3awPARPvA13LyVatKDobV4AyyrDyt?= =?utf-8?q?b9d9uaUEF0XVFdxbXXZTQ2iCk5fC3Lg0DBD+TVp9uAn3llyHxhjcbzuQU99Wzyi1k?= =?utf-8?q?eXn2PO8gCJ70tVFW+XPO0yMcvie/400HrGnfbGQhEOM+dXWYG3GFkjMBd12uF3OUa?= =?utf-8?q?CcIWpr/5GbCq0XD02V0STCl6WB57k5MqJ3ZqJfMdeSRsnERvdiXyiP1KFTGeG9BDk?= =?utf-8?q?HSvWUOlOQO5sMO1Cgy6k2MsZSUiTbwnvQt7/jt8ArazG84ylQd7mMYszq+TLUowHP?= =?utf-8?q?2TXhtu2+7QSYmeT7tm3pmIV8prKDYO6AMPO2yj2sYMfZkFrIoIu8tcjMpEZ9lIT/7?= =?utf-8?q?hH8KwDv2ME3kjs6l1AwVXgE+pLAoeUfVKJImaLDTkkwa04wwvdilP43OD5TRKRuSe?= =?utf-8?q?z8ehZPCTOGP3TMx1pzez/W7JFYJwlWD7LOFg/zIuqx3ks9bz7dOoWuo3OGLxsmPIy?= =?utf-8?q?9cQzQ4eWWLyVY9SONfpdznjzF+lEn41vkpONbBdK5bV8hGGbfy5/tFnbw+3dMHZfA?= =?utf-8?q?DDyAI8cClrlshabVU1J9wkYiFq+R197Z+WJpVktAHxbWveT1t6BKWRPXHSOzBdWgB?= =?utf-8?q?KLnXAkEG+0MXcFxvnpsbNd0WU/SgNoNt/RE9MMNxoSd0TLjxmgpnnDvjnH9p367r3?= =?utf-8?q?v6R4hwY5H4s5?= X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8cb719b7-4503-4865-e9c8-08dd2fdb1a03 X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:43.8832 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: QUE/CwdwABaM0P2HrxmnKpv1t4bKfkYCWgl0HBmUulrbW5UZedS9I9aE2RlGbpb9PNXQehpgnJLsJtoIwpvHcA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR02MB7502 X-Authority-Analysis: v=2.4 cv=A6aWP7WG c=1 sm=1 tr=0 ts=677e674c cx=c_pps a=MHkl0I0wjNeC5ak5fNlPUA==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=IkcTkHD0fZMA:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=vhHEnE-nI2wpHcbFAcgA:9 a=3ZKOabzyN94A:10 a=QEXdDO2ut3YA:10 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-GUID: A276CHxlHgv9nT6e407yzMQKnP_ikcKU X-Proofpoint-ORIG-GUID: A276CHxlHgv9nT6e407yzMQKnP_ikcKU X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman add user.c & user.h files for vfio-user code add proxy struct to handle comms with remote server Originally-by: John Johnson Signed-off-by: Elena Ufimtseva 
Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- MAINTAINERS | 1 + hw/vfio/meson.build | 2 +- hw/vfio/user-pci.c | 17 ++++ hw/vfio/user.c | 171 ++++++++++++++++++++++++++++++++++ hw/vfio/user.h | 78 ++++++++++++++++ include/hw/vfio/vfio-common.h | 2 + 6 files changed, 270 insertions(+), 1 deletion(-) create mode 100644 hw/vfio/user.c create mode 100644 hw/vfio/user.h diff --git a/MAINTAINERS b/MAINTAINERS index b0f9b54500..153825b463 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4141,6 +4141,7 @@ S: Supported F: docs/devel/vfio-user.rst F: hw/vfio/user-container.c F: hw/vfio/user-pci.c +F: hw/vfio/user.* F: subprojects/libvfio-user EBPF: diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index f897c5b81a..32ad5ca6b7 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -18,7 +18,7 @@ vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( )) if get_option('vfio_user_client').enabled() - vfio_ss.add(files('user-container.c', 'user-pci.c')) + vfio_ss.add(files('user.c', 'user-container.c', 'user-pci.c')) endif vfio_ss.add(when: 'CONFIG_VFIO_CCW', if_true: files('ccw.c')) diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index a06115cd55..7610b47163 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -18,6 +18,7 @@ #include "hw/pci/pci_bridge.h" #include "hw/qdev-properties.h" #include "hw/qdev-properties-system.h" +#include "hw/vfio/user.h" #include "migration/vmstate.h" #include "qapi/qmp/qdict.h" #include "qemu/error-report.h" @@ -63,6 +64,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIODevice *vbasedev = &vdev->vbasedev; AddressSpace *as; + SocketAddress addr; + VFIOUserProxy *proxy; /* * TODO: make option parser understand SocketAddress @@ -75,6 +78,15 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } + memset(&addr, 0, sizeof(addr)); + addr.type = SOCKET_ADDRESS_TYPE_UNIX; + addr.u.q_unix.path = udev->sock_name; + proxy = 
vfio_user_connect_dev(&addr, errp); + if (!proxy) { + return; + } + vbasedev->proxy = proxy; + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->ops = &vfio_user_pci_ops; vbasedev->type = VFIO_DEVICE_TYPE_PCI; @@ -123,8 +135,13 @@ static void vfio_user_instance_init(Object *obj) static void vfio_user_instance_finalize(Object *obj) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); + VFIODevice *vbasedev = &vdev->vbasedev; vfio_pci_put_device(vdev); + + if (vbasedev->proxy != NULL) { + vfio_user_disconnect(vbasedev->proxy); + } } static const Property vfio_user_pci_dev_properties[] = { diff --git a/hw/vfio/user.c b/hw/vfio/user.c new file mode 100644 index 0000000000..1c79fb1cb9 --- /dev/null +++ b/hw/vfio/user.c @@ -0,0 +1,171 @@ +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include +#include + +#include "qemu/error-report.h" +#include "qapi/error.h" +#include "qemu/main-loop.h" +#include "qemu/lockable.h" +#include "hw/hw.h" +#include "hw/vfio/vfio-common.h" +#include "qemu/sockets.h" +#include "io/channel.h" +#include "io/channel-socket.h" +#include "io/channel-util.h" +#include "system/iothread.h" +#include "user.h" + +static IOThread *vfio_user_iothread; + +static void vfio_user_shutdown(VFIOUserProxy *proxy); + + +/* + * Functions called by main, CPU, or iothread threads + */ + +static void vfio_user_shutdown(VFIOUserProxy *proxy) +{ + qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, + proxy->ctx, NULL, NULL); +} + +/* + * Functions only called by iothread + */ + +static void vfio_user_cb(void *opaque) +{ + VFIOUserProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + proxy->state = VFIO_PROXY_CLOSED; + 
qemu_cond_signal(&proxy->close_cv); +} + + +/* + * Functions called by main or CPU threads + */ + +static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets = + QLIST_HEAD_INITIALIZER(vfio_user_sockets); + +VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) +{ + VFIOUserProxy *proxy; + QIOChannelSocket *sioc; + QIOChannel *ioc; + char *sockname; + + if (addr->type != SOCKET_ADDRESS_TYPE_UNIX) { + error_setg(errp, "vfio_user_connect - bad address family"); + return NULL; + } + sockname = addr->u.q_unix.path; + + sioc = qio_channel_socket_new(); + ioc = QIO_CHANNEL(sioc); + if (qio_channel_socket_connect_sync(sioc, addr, errp)) { + object_unref(OBJECT(ioc)); + return NULL; + } + qio_channel_set_blocking(ioc, false, NULL); + + proxy = g_malloc0(sizeof(VFIOUserProxy)); + proxy->sockname = g_strdup_printf("unix:%s", sockname); + proxy->ioc = ioc; + proxy->flags = VFIO_PROXY_CLIENT; + proxy->state = VFIO_PROXY_CONNECTED; + + qemu_mutex_init(&proxy->lock); + qemu_cond_init(&proxy->close_cv); + + if (vfio_user_iothread == NULL) { + vfio_user_iothread = iothread_create("VFIO user", errp); + } + + proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + + QTAILQ_INIT(&proxy->outgoing); + QTAILQ_INIT(&proxy->incoming); + QTAILQ_INIT(&proxy->free); + QTAILQ_INIT(&proxy->pending); + QLIST_INSERT_HEAD(&vfio_user_sockets, proxy, next); + + return proxy; +} + +void vfio_user_disconnect(VFIOUserProxy *proxy) +{ + VFIOUserMsg *r1, *r2; + + qemu_mutex_lock(&proxy->lock); + + /* our side is quitting */ + if (proxy->state == VFIO_PROXY_CONNECTED) { + vfio_user_shutdown(proxy); + if (!QTAILQ_EMPTY(&proxy->pending)) { + error_printf("vfio_user_disconnect: outstanding requests\n"); + } + } + object_unref(OBJECT(proxy->ioc)); + proxy->ioc = NULL; + + proxy->state = VFIO_PROXY_CLOSING; + QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->outgoing, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, 
&proxy->incoming, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->incoming, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->pending, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->pending, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->free, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->free, r1, next); + g_free(r1); + } + + /* + * Make sure the iothread isn't blocking anywhere + * with a ref to this proxy by waiting for a BH + * handler to run after the proxy fd handlers were + * deleted above. + */ + aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy); + qemu_cond_wait(&proxy->close_cv, &proxy->lock); + + /* we now hold the only ref to proxy */ + qemu_mutex_unlock(&proxy->lock); + qemu_cond_destroy(&proxy->close_cv); + qemu_mutex_destroy(&proxy->lock); + + QLIST_REMOVE(proxy, next); + if (QLIST_EMPTY(&vfio_user_sockets)) { + iothread_destroy(vfio_user_iothread); + vfio_user_iothread = NULL; + } + + g_free(proxy->sockname); + g_free(proxy); +} diff --git a/hw/vfio/user.h b/hw/vfio/user.h new file mode 100644 index 0000000000..ac7d15dfa8 --- /dev/null +++ b/hw/vfio/user.h @@ -0,0 +1,78 @@ +#ifndef VFIO_USER_H +#define VFIO_USER_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. 
+ * + */ + +typedef struct { + int send_fds; + int recv_fds; + int *fds; +} VFIOUserFDs; + +enum msg_type { + VFIO_MSG_NONE, + VFIO_MSG_ASYNC, + VFIO_MSG_WAIT, + VFIO_MSG_NOWAIT, + VFIO_MSG_REQ, +}; + +typedef struct VFIOUserMsg { + QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserFDs *fds; + uint32_t rsize; + uint32_t id; + QemuCond cv; + bool complete; + enum msg_type type; +} VFIOUserMsg; + + +enum proxy_state { + VFIO_PROXY_CONNECTED = 1, + VFIO_PROXY_ERROR = 2, + VFIO_PROXY_CLOSING = 3, + VFIO_PROXY_CLOSED = 4, +}; + +typedef QTAILQ_HEAD(VFIOUserMsgQ, VFIOUserMsg) VFIOUserMsgQ; + +typedef struct VFIOUserProxy { + QLIST_ENTRY(VFIOUserProxy) next; + char *sockname; + struct QIOChannel *ioc; + void (*request)(void *opaque, VFIOUserMsg *msg); + void *req_arg; + int flags; + QemuCond close_cv; + AioContext *ctx; + QEMUBH *req_bh; + + /* + * above only changed when BQL is held + * below are protected by per-proxy lock + */ + QemuMutex lock; + VFIOUserMsgQ free; + VFIOUserMsgQ pending; + VFIOUserMsgQ incoming; + VFIOUserMsgQ outgoing; + VFIOUserMsg *last_nowait; + enum proxy_state state; +} VFIOUserProxy; + +/* VFIOProxy flags */ +#define VFIO_PROXY_CLIENT 0x1 + +VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); +void vfio_user_disconnect(VFIOUserProxy *proxy); + +#endif /* VFIO_USER_H */ diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 50afa944ae..afc67a3a77 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -79,6 +79,7 @@ typedef struct VFIOMigration { } VFIOMigration; struct VFIOGroup; +typedef struct VFIOUserProxy VFIOUserProxy; /* MMU container sub-class for legacy vfio implementation. 
*/ typedef struct VFIOContainer { @@ -162,6 +163,7 @@ typedef struct VFIODevice { IOMMUFDBackend *iommufd; VFIOIOASHwpt *hwpt; QLIST_ENTRY(VFIODevice) hwpt_next; + VFIOUserProxy *proxy; struct vfio_region_info **regions; } VFIODevice; From patchwork Wed Jan 8 11:50:18 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930715 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 82503E77199 for ; Wed, 8 Jan 2025 11:57:30 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdT-0006TA-Cw; Wed, 08 Jan 2025 06:54:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdH-0006RG-Iq for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:55 -0500 Received: from mx0b-002c1b01.pphosted.com ([148.163.155.12]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdC-0002F6-IW for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:53:53 -0500 Received: from pps.filterd (m0127842.ppops.net [127.0.0.1]) by mx0b-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5087vts8007169; Wed, 8 Jan 2025 03:53:49 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=cZiJP88qlQLTQpmiOnVzHbOcsIU7PmhGTdhYzTdgO Rw=; b=XrSorF9RwYQarJ49q/kGA9qAB91T7irtIA2COXRho3D6uRttKD+q3Xyry 
hUMhBkd5obiB2/ADQyz0XSQLGUgAFrRRAOmgfQlOFZkPhLviZFckdIAgYyjACPii rijhxcfQ0HccWXd+l8zoLUg5ZQLLaM8ivycnYXiotLPa/9yLQPCl7f2yZOoRnvtX 6zSODgtgbpE6JzQg9SexfNYEA6DHOnMQ5hpwMVPzRKYU9KuzVJh6NHkI0FHLLAaL wuXqNNaKkRz3/GnGN1eTCUkVDjh+D2gsXfDDL1qxpN7hieHiROAXRXSpjMq2uo8w +/CIeTiUGgP5k9En1NPvy4M+Kb9hg== Received: from nam04-dm6-obe.outbound.protection.outlook.com (mail-dm6nam04lp2048.outbound.protection.outlook.com [104.47.73.48]) by mx0b-002c1b01.pphosted.com (PPS) with ESMTPS id 43y56eryxp-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:48 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=p/76y5quWiW6tRkAb8pIuRTiHthB2rVZqHm7QQZGtaZrb3SNz4y1W/A3siw9/nMDmz0PIWpwLYZJEW67zC5Bbx03vx7KLp5Bu6r0bHWHak+HA7TKnCCHTqlRPIPznb7zpUNi1FoKYA5uY6GyywgLT5UnLd1FVROGBdiUen0IXvJief4Wx/pGbjLknj7yspRZ20gT5oh7yppvYc8y1EvCn17gcwKExsW6M9PyFX1u5RuVvA7OR2DdTZ62n6w2JPf1JPGVoi4Wl6eN9l2rSBHa/ssgixdF1VKxYaaSG236GoS5XGSMksuVKsKzXh1Aq0lS36qVmPMaO7xbToH2RVwHAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cZiJP88qlQLTQpmiOnVzHbOcsIU7PmhGTdhYzTdgORw=; b=Q1L1rpL7kf1YBtWIUoD9AmLM3t3IPZDji/libiVqMHpVRA0yd1PqiuvWwUeBs2SiHs4bEEcW/ZsIqQHTtlpTUA5gI7VUtbwmXUZy3qLxcRcWxLWRdAzlqQWpWI4Ro3yYkoXMSM8KGnYFUnvDo3AB2/7cDMg6Elt/6VSGafifMevw7HulNuMpPkgj3j9Wccg8TK9a/8JeSsiCTp3GJq1fRY1gArpJKMxQcf5hmcHxwNaQlc9ksSjHD6QF2qXnqWFBiXQhknyBoulwg61nAAN7MEiv59zJxcrEoB7isaTGxSi7vizJFQoqoJZtVQ2NNDyNCy2XcUvbGoZXUMH4Nm9xnQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cZiJP88qlQLTQpmiOnVzHbOcsIU7PmhGTdhYzTdgORw=; b=qvi8/IwFi0ogebVSNS0FWYqG0MeIzfAGQW5KmgaukPa3zJnZZdP68cUw8yMiHIzj9szAqNGOVbgVSBCjndeVtQRLYyjMKj8yV8m3Bqsi6MYH5RfR8hZ+UQpr3FlUM/mBanTsFpinAZ3HYdYyDggJNol+n4CvF/xJWnTGxn/AH/j6d3aKZdIJLr9IDgjG7YIRBOeFBOAlCgA9w1oxPhsTAvDBduPsNDOeQgi26QCQhLMy1JDvn1bcQHx6SADBFsmr08HVHeUE2Nk9sldTdIanQ0ft3r4b73/hLdgGjnAL9uSys1qv7ME72fJA2T19I/54T1zJ12ZfDkkz9ltHJk/4VQ== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by SJ0PR02MB7502.namprd02.prod.outlook.com (2603:10b6:a03:2a2::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.11; Wed, 8 Jan 2025 11:53:45 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:45 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 12/26] vfio-user: define socket receive functions Date: Wed, 8 Jan 2025 11:50:18 +0000 Message-Id: <20250108115032.1677686-13-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|SJ0PR02MB7502:EE_ X-MS-Office365-Filtering-Correlation-Id: 6042a6f1-6e0e-46a1-dbc7-08dd2fdb1acc x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; 
X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6042a6f1-6e0e-46a1-dbc7-08dd2fdb1acc X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:45.2889 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: KToaY+CEToc6H2JsGNpcisw0lNkMBJDLph/VCSt4QDnKjHrbWBztNvQRmVnGTDCIiiJqJkZmewvjSOVoaEnIDA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR02MB7502 X-Authority-Analysis: v=2.4 cv=A6aWP7WG c=1 sm=1 tr=0 ts=677e674c cx=c_pps a=MHkl0I0wjNeC5ak5fNlPUA==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=IkcTkHD0fZMA:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=isiwTlKnufRvzBNuimkA:9 a=3ZKOabzyN94A:10 a=QEXdDO2ut3YA:10 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-GUID: i3gusQ1N1ZG-hplGifrZSKRmFbT3uZIO X-Proofpoint-ORIG-GUID: i3gusQ1N1ZG-hplGifrZSKRmFbT3uZIO X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Add infrastructure needed to receive incoming messages Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/trace-events | 5 + hw/vfio/user-pci.c | 11 ++ hw/vfio/user-protocol.h | 54 ++++++ hw/vfio/user.c | 408 ++++++++++++++++++++++++++++++++++++++++ hw/vfio/user.h | 10 + 5 files changed, 488 insertions(+) create mode 100644 hw/vfio/user-protocol.h diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index cab1cf1de0..0e3e7be10c 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -180,3 +180,8 @@ iommufd_cdev_fail_attach_existing_container(const char *msg) " %s" iommufd_cdev_alloc_ioas(int iommufd, int ioas_id) " [iommufd=%d] new IOMMUFD container with ioasid=%d" iommufd_cdev_device_info(char *name, int devfd, int num_irqs, int num_regions, int flags) " %s (%d) num_irqs=%d num_regions=%d flags=%d" iommufd_cdev_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, int dev_id) "\t%04x:%02x:%02x.%x devid %d" + +# user.c +vfio_user_recv_hdr(const char *name, uint16_t id, uint16_t cmd, uint32_t size, uint32_t flags) " (%s) id 0x%x cmd 0x%x size 0x%x flags 0x%x" +vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0x%x" +vfio_user_recv_request(uint16_t cmd) " command 0x%x" diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index 7610b47163..b62fd4edef 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -41,6 +41,16 @@ struct VFIOUserPCIDevice { char *sock_name; }; +/* + * Incoming request message callback. + * + * Runs off main loop, so BQL held. 
+ */ +static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) +{ + +} + /* * Emulated devices don't use host hot reset */ @@ -86,6 +96,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } vbasedev->proxy = proxy; + vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->ops = &vfio_user_pci_ops; diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h new file mode 100644 index 0000000000..d23877c958 --- /dev/null +++ b/hw/vfio/user-protocol.h @@ -0,0 +1,54 @@ +#ifndef VFIO_USER_PROTOCOL_H +#define VFIO_USER_PROTOCOL_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Each message has a standard header that describes the command + * being sent, which is almost always a VFIO ioctl(). + * + * The header may be followed by command-specific data, such as the + * region and offset info for read and write commands. 
+ */ + +typedef struct { + uint16_t id; + uint16_t command; + uint32_t size; + uint32_t flags; + uint32_t error_reply; +} VFIOUserHdr; + +/* VFIOUserHdr commands */ +enum vfio_user_command { + VFIO_USER_VERSION = 1, + VFIO_USER_DMA_MAP = 2, + VFIO_USER_DMA_UNMAP = 3, + VFIO_USER_DEVICE_GET_INFO = 4, + VFIO_USER_DEVICE_GET_REGION_INFO = 5, + VFIO_USER_DEVICE_GET_REGION_IO_FDS = 6, + VFIO_USER_DEVICE_GET_IRQ_INFO = 7, + VFIO_USER_DEVICE_SET_IRQS = 8, + VFIO_USER_REGION_READ = 9, + VFIO_USER_REGION_WRITE = 10, + VFIO_USER_DMA_READ = 11, + VFIO_USER_DMA_WRITE = 12, + VFIO_USER_DEVICE_RESET = 13, + VFIO_USER_DIRTY_PAGES = 14, + VFIO_USER_MAX, +}; + +/* VFIOUserHdr flags */ +#define VFIO_USER_REQUEST 0x0 +#define VFIO_USER_REPLY 0x1 +#define VFIO_USER_TYPE 0xF + +#define VFIO_USER_NO_REPLY 0x10 +#define VFIO_USER_ERROR 0x20 + +#endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 1c79fb1cb9..1ab8e10739 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -24,11 +24,27 @@ #include "io/channel-util.h" #include "system/iothread.h" #include "user.h" +#include "trace.h" static IOThread *vfio_user_iothread; static void vfio_user_shutdown(VFIOUserProxy *proxy); +static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); +static VFIOUserFDs *vfio_user_getfds(int numfds); +static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_recv(void *opaque); +static int vfio_user_recv_one(VFIOUserProxy *proxy); +static void vfio_user_cb(void *opaque); + +static void vfio_user_request(void *opaque); + +static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) +{ + hdr->flags |= VFIO_USER_ERROR; + hdr->error_reply = err; +} /* * Functions called by main, CPU, or iothread threads @@ -41,10 +57,340 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy) proxy->ctx, NULL, NULL); } +static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) 
+{ + VFIOUserMsg *msg; + + msg = QTAILQ_FIRST(&proxy->free); + if (msg != NULL) { + QTAILQ_REMOVE(&proxy->free, msg, next); + } else { + msg = g_malloc0(sizeof(*msg)); + qemu_cond_init(&msg->cv); + } + + msg->hdr = hdr; + msg->fds = fds; + return msg; +} + +/* + * Recycle a message list entry to the free list. + */ +static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg) +{ + if (msg->type == VFIO_MSG_NONE) { + error_printf("vfio_user_recycle - freeing free msg\n"); + return; + } + + /* free msg buffer if no one is waiting to consume the reply */ + if (msg->type == VFIO_MSG_NOWAIT || msg->type == VFIO_MSG_ASYNC) { + g_free(msg->hdr); + if (msg->fds != NULL) { + g_free(msg->fds); + } + } + + msg->type = VFIO_MSG_NONE; + msg->hdr = NULL; + msg->fds = NULL; + msg->complete = false; + QTAILQ_INSERT_HEAD(&proxy->free, msg, next); +} + +static VFIOUserFDs *vfio_user_getfds(int numfds) +{ + VFIOUserFDs *fds = g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); + + fds->fds = (int *)((char *)fds + sizeof(*fds)); + + return fds; +} + /* * Functions only called by iothread */ +/* + * Process a received message. + */ +static void vfio_user_process(VFIOUserProxy *proxy, VFIOUserMsg *msg, + bool isreply) +{ + + /* + * Replies signal a waiter, if none just check for errors + * and free the message buffer. + * + * Requests get queued for the BH. 
+ */ + if (isreply) { + msg->complete = true; + if (msg->type == VFIO_MSG_WAIT) { + qemu_cond_signal(&msg->cv); + } else { + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_process: error reply on async "); + error_printf("request command %x error %s\n", + msg->hdr->command, + strerror(msg->hdr->error_reply)); + } + /* youngest nowait msg has been ack'd */ + if (proxy->last_nowait == msg) { + proxy->last_nowait = NULL; + } + vfio_user_recycle(proxy, msg); + } + } else { + QTAILQ_INSERT_TAIL(&proxy->incoming, msg, next); + qemu_bh_schedule(proxy->req_bh); + } +} + +/* + * Complete a partial message read + */ +static int vfio_user_complete(VFIOUserProxy *proxy, Error **errp) +{ + VFIOUserMsg *msg = proxy->part_recv; + size_t msgleft = proxy->recv_left; + bool isreply; + char *data; + int ret; + + data = (char *)msg->hdr + (msg->hdr->size - msgleft); + while (msgleft > 0) { + ret = qio_channel_read(proxy->ioc, data, msgleft, errp); + + /* error or would block */ + if (ret <= 0) { + /* try for rest on next iteration */ + if (ret == QIO_CHANNEL_ERR_BLOCK) { + proxy->recv_left = msgleft; + } + return ret; + } + trace_vfio_user_recv_read(msg->hdr->id, ret); + + msgleft -= ret; + data += ret; + } + + /* + * Read complete message, process it. + */ + proxy->part_recv = NULL; + proxy->recv_left = 0; + isreply = (msg->hdr->flags & VFIO_USER_TYPE) == VFIO_USER_REPLY; + vfio_user_process(proxy, msg, isreply); + + /* return positive value */ + return 1; +} + +static void vfio_user_recv(void *opaque) +{ + VFIOUserProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state == VFIO_PROXY_CONNECTED) { + while (vfio_user_recv_one(proxy) == 0) { + ; + } + } +} + +/* + * Receive and process one incoming message. + * + * For replies, find matching outgoing request and wake any waiters. + * For requests, queue in incoming list and run request BH.
+ */ +static int vfio_user_recv_one(VFIOUserProxy *proxy) +{ + VFIOUserMsg *msg = NULL; + g_autofree int *fdp = NULL; + VFIOUserFDs *reqfds; + VFIOUserHdr hdr; + struct iovec iov = { + .iov_base = &hdr, + .iov_len = sizeof(hdr), + }; + bool isreply = false; + int i, ret; + size_t msgleft, numfds = 0; + char *data = NULL; + char *buf = NULL; + Error *local_err = NULL; + + /* + * Complete any partial reads + */ + if (proxy->part_recv != NULL) { + ret = vfio_user_complete(proxy, &local_err); + + /* still not complete, try later */ + if (ret == QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + + if (ret <= 0) { + goto fatal; + } + /* else fall into reading another msg */ + } + + /* + * Read header + */ + ret = qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds, 0, + &local_err); + if (ret == QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + + /* read error or other side closed connection */ + if (ret <= 0) { + goto fatal; + } + + if (ret < sizeof(hdr)) { + error_setg(&local_err, "short read of header"); + goto fatal; + } + + /* + * Validate header + */ + if (hdr.size < sizeof(VFIOUserHdr)) { + error_setg(&local_err, "bad header size"); + goto fatal; + } + switch (hdr.flags & VFIO_USER_TYPE) { + case VFIO_USER_REQUEST: + isreply = false; + break; + case VFIO_USER_REPLY: + isreply = true; + break; + default: + error_setg(&local_err, "unknown message type"); + goto fatal; + } + trace_vfio_user_recv_hdr(proxy->sockname, hdr.id, hdr.command, hdr.size, + hdr.flags); + + /* + * For replies, find the matching pending request. + * For requests, reap incoming FDs. 
+ */ + if (isreply) { + QTAILQ_FOREACH(msg, &proxy->pending, next) { + if (hdr.id == msg->id) { + break; + } + } + if (msg == NULL) { + error_setg(&local_err, "unexpected reply"); + goto err; + } + QTAILQ_REMOVE(&proxy->pending, msg, next); + + /* + * Process any received FDs + */ + if (numfds != 0) { + if (msg->fds == NULL || msg->fds->recv_fds < numfds) { + error_setg(&local_err, "unexpected FDs"); + goto err; + } + msg->fds->recv_fds = numfds; + memcpy(msg->fds->fds, fdp, numfds * sizeof(int)); + } + } else { + if (numfds != 0) { + reqfds = vfio_user_getfds(numfds); + memcpy(reqfds->fds, fdp, numfds * sizeof(int)); + } else { + reqfds = NULL; + } + } + + /* + * Put the whole message into a single buffer. + */ + if (isreply) { + if (hdr.size > msg->rsize) { + error_setg(&local_err, "reply larger than recv buffer"); + goto err; + } + *msg->hdr = hdr; + data = (char *)msg->hdr + sizeof(hdr); + } else { + buf = g_malloc0(hdr.size); + memcpy(buf, &hdr, sizeof(hdr)); + data = buf + sizeof(hdr); + msg = vfio_user_getmsg(proxy, (VFIOUserHdr *)buf, reqfds); + msg->type = VFIO_MSG_REQ; + } + + /* + * Read rest of message. 
+ */ + msgleft = hdr.size - sizeof(hdr); + while (msgleft > 0) { + ret = qio_channel_read(proxy->ioc, data, msgleft, &local_err); + + /* prepare to complete read on next iteration */ + if (ret == QIO_CHANNEL_ERR_BLOCK) { + proxy->part_recv = msg; + proxy->recv_left = msgleft; + return ret; + } + + if (ret <= 0) { + goto fatal; + } + trace_vfio_user_recv_read(hdr.id, ret); + + msgleft -= ret; + data += ret; + } + + vfio_user_process(proxy, msg, isreply); + return 0; + + /* + * fatal means the other side closed or we don't trust the stream + * err means this message is corrupt + */ +fatal: + vfio_user_shutdown(proxy); + proxy->state = VFIO_PROXY_ERROR; + + /* set error if server side closed */ + if (ret == 0) { + error_setg(&local_err, "server closed socket"); + } + +err: + for (i = 0; i < numfds; i++) { + close(fdp[i]); + } + if (isreply && msg != NULL) { + /* force an error to keep sending thread from hanging */ + vfio_user_set_error(msg->hdr, EINVAL); + msg->complete = true; + qemu_cond_signal(&msg->cv); + } + error_prepend(&local_err, "vfio_user_recv_one: "); + error_report_err(local_err); + return -1; +} + static void vfio_user_cb(void *opaque) { VFIOUserProxy *proxy = opaque; @@ -60,6 +406,53 @@ static void vfio_user_cb(void *opaque) * Functions called by main or CPU threads */ +/* + * Process incoming requests. + * + * The bus-specific callback has the form: + * request(opaque, msg) + * where 'opaque' was specified in vfio_user_set_handler + * and 'msg' is the inbound message. + * + * The callback is responsible for disposing of the message buffer, + * usually by re-using it when calling vfio_send_reply or vfio_send_error, + * both of which free their message buffer when the reply is sent. + * + * If the callback uses a new buffer, it needs to free the old one.
+ */ +static void vfio_user_request(void *opaque) +{ + VFIOUserProxy *proxy = opaque; + VFIOUserMsgQ new, free; + VFIOUserMsg *msg, *m1; + + /* reap all incoming */ + QTAILQ_INIT(&new); + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &proxy->incoming, next, m1) { + QTAILQ_REMOVE(&proxy->incoming, msg, next); + QTAILQ_INSERT_TAIL(&new, msg, next); + } + } + + /* process list */ + QTAILQ_INIT(&free); + QTAILQ_FOREACH_SAFE(msg, &new, next, m1) { + QTAILQ_REMOVE(&new, msg, next); + trace_vfio_user_recv_request(msg->hdr->command); + proxy->request(proxy->req_arg, msg); + QTAILQ_INSERT_HEAD(&free, msg, next); + } + + /* free list */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &free, next, m1) { + vfio_user_recycle(proxy, msg); + } + } +} + + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -98,6 +491,7 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) } proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + proxy->req_bh = qemu_bh_new(vfio_user_request, proxy); QTAILQ_INIT(&proxy->outgoing); QTAILQ_INIT(&proxy->incoming); @@ -108,6 +502,18 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) return proxy; } +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *req_arg) +{ + VFIOUserProxy *proxy = vbasedev->proxy; + + proxy->request = handler; + proxy->req_arg = req_arg; + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, NULL, proxy); +} + void vfio_user_disconnect(VFIOUserProxy *proxy) { VFIOUserMsg *r1, *r2; @@ -123,6 +529,8 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) } object_unref(OBJECT(proxy->ioc)); proxy->ioc = NULL; + qemu_bh_delete(proxy->req_bh); + proxy->req_bh = NULL; proxy->state = VFIO_PROXY_CLOSING; QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { diff --git a/hw/vfio/user.h b/hw/vfio/user.h index ac7d15dfa8..30cf35d3e4 
100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -11,6 +11,8 @@ * */ +#include "user-protocol.h" + typedef struct { int send_fds; int recv_fds; @@ -27,6 +29,7 @@ enum msg_type { typedef struct VFIOUserMsg { QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserHdr *hdr; VFIOUserFDs *fds; uint32_t rsize; uint32_t id; @@ -66,13 +69,20 @@ typedef struct VFIOUserProxy { VFIOUserMsgQ incoming; VFIOUserMsgQ outgoing; VFIOUserMsg *last_nowait; + VFIOUserMsg *part_recv; + size_t recv_left; enum proxy_state state; } VFIOUserProxy; /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +typedef struct VFIODevice VFIODevice; + VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOUserProxy *proxy); +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *reqarg); #endif /* VFIO_USER_H */ From patchwork Wed Jan 8 11:50:19 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AACF3E77199 for ; Wed, 8 Jan 2025 11:58:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdp-0006Z3-4B; Wed, 08 Jan 2025 06:54:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdQ-0006SO-Db for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:05 -0500 Received: from mx0a-002c1b01.pphosted.com ([148.163.151.68]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) 
(envelope-from ) id 1tVUdL-0002GA-Hr for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:03 -0500 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 13/26] vfio-user: define socket send functions Date: Wed, 8 Jan 2025 11:50:19 +0000 Message-Id: <20250108115032.1677686-14-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> MIME-Version: 1.0 Errors-To:
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Also negotiate protocol version with remote server Originally-by: John Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John Levon --- hw/vfio/trace-events | 2 + hw/vfio/user-pci.c | 18 +- hw/vfio/user-protocol.h | 62 +++++ hw/vfio/user.c | 495 ++++++++++++++++++++++++++++++++++++++++ hw/vfio/user.h | 9 + 5 files changed, 584 insertions(+), 2 deletions(-) diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 0e3e7be10c..d66fc6c214 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -185,3 +185,5 @@ iommufd_cdev_pci_hot_reset_dep_devices(int domain, int bus, int slot, int functi vfio_user_recv_hdr(const char *name, uint16_t id, uint16_t cmd, uint32_t size, uint32_t flags) " (%s) id 0x%x cmd 0x%x size 0x%x flags 0x%x" vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0x%x" vfio_user_recv_request(uint16_t cmd) " command 0x%x" +vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" +vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " major %d minor %d caps: %s" diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index b62fd4edef..62259db473 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -39,6 +39,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; + bool send_queued; /* all sends are queued */ }; /* @@ -98,6 +99,14 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->proxy = proxy; vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); + if (udev->send_queued) { + proxy->flags |= VFIO_PROXY_FORCE_QUEUED; + } + + if (!vfio_user_validate_version(proxy, errp)) { + goto error; + } + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->ops = &vfio_user_pci_ops; 
vbasedev->type = VFIO_DEVICE_TYPE_PCI; @@ -112,9 +121,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) if (!vfio_attach_device_by_iommu_type(TYPE_VFIO_IOMMU_USER, vbasedev->name, vbasedev, as, errp)) { - error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name); - return; + goto error; } + + return; + +error: + error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); } static void vfio_user_instance_init(Object *obj) @@ -157,6 +170,7 @@ static void vfio_user_instance_finalize(Object *obj) static const Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), }; static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index d23877c958..5de5b2030c 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -51,4 +51,66 @@ enum vfio_user_command { #define VFIO_USER_NO_REPLY 0x10 #define VFIO_USER_ERROR 0x20 + +/* + * VFIO_USER_VERSION + */ +typedef struct { + VFIOUserHdr hdr; + uint16_t major; + uint16_t minor; + char capabilities[]; +} VFIOUserVersion; + +#define VFIO_USER_MAJOR_VER 0 +#define VFIO_USER_MINOR_VER 0 + +#define VFIO_USER_CAP "capabilities" + +/* "capabilities" members */ +#define VFIO_USER_CAP_MAX_FDS "max_msg_fds" +#define VFIO_USER_CAP_MAX_XFER "max_data_xfer_size" +#define VFIO_USER_CAP_PGSIZES "pgsizes" +#define VFIO_USER_CAP_MAP_MAX "max_dma_maps" +#define VFIO_USER_CAP_MIGR "migration" + +/* "migration" members */ +#define VFIO_USER_CAP_PGSIZE "pgsize" +#define VFIO_USER_CAP_MAX_BITMAP "max_bitmap_size" + +/* + * Max FDs mainly comes into play when a device supports multiple interrupts + * where each one uses an eventfd to inject it into the guest. + * It is clamped by the number of FDs the qio channel supports in a + * single message.
+ */ +#define VFIO_USER_DEF_MAX_FDS 8 +#define VFIO_USER_MAX_MAX_FDS 16 + +/* + * Max transfer limits the amount of data in region and DMA messages. + * Region R/W will be very small (limited by how much a single instruction + * can process) so just use a reasonable limit here. + */ +#define VFIO_USER_DEF_MAX_XFER (1024 * 1024) +#define VFIO_USER_MAX_MAX_XFER (64 * 1024 * 1024) + +/* + * The default supported page size is 4k. + */ +#define VFIO_USER_DEF_PGSIZE 4096 + +/* + * Default max number of DMA mappings is stolen from the + * Linux kernel "dma_entry_limit" + */ +#define VFIO_USER_DEF_MAP_MAX 65535 + +/* + * Default max bitmap size is also taken from the Linux kernel, + * where usage of signed ints limits the VA range to 2^31 bytes. + * Dividing that by the number of bits per byte yields 256MB + */ +#define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 1ab8e10739..4e48bc65fe 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -23,12 +23,18 @@ #include "io/channel-socket.h" #include "io/channel-util.h" #include "system/iothread.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qjson.h" +#include "qapi/qmp/qstring.h" +#include "qapi/qmp/qnum.h" #include "user.h" #include "trace.h" +static int wait_time = 5000; /* wait up to 5 sec for busy servers */ static IOThread *vfio_user_iothread; static void vfio_user_shutdown(VFIOUserProxy *proxy); +static int vfio_user_send_qio(VFIOUserProxy *proxy, VFIOUserMsg *msg); static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); static VFIOUserFDs *vfio_user_getfds(int numfds); @@ -36,9 +42,16 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); static void vfio_user_recv(void *opaque); static int vfio_user_recv_one(VFIOUserProxy *proxy); +static void vfio_user_send(void *opaque); +static int vfio_user_send_one(VFIOUserProxy *proxy); static void vfio_user_cb(void *opaque); static void
vfio_user_request(void *opaque); +static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize); +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags); static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) { @@ -57,6 +70,35 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy) proxy->ctx, NULL, NULL); } +static int vfio_user_send_qio(VFIOUserProxy *proxy, VFIOUserMsg *msg) +{ + VFIOUserFDs *fds = msg->fds; + struct iovec iov = { + .iov_base = msg->hdr, + .iov_len = msg->hdr->size, + }; + size_t numfds = 0; + int ret, *fdp = NULL; + Error *local_err = NULL; + + if (fds != NULL && fds->send_fds != 0) { + numfds = fds->send_fds; + fdp = fds->fds; + } + + ret = qio_channel_writev_full(proxy->ioc, &iov, 1, fdp, numfds, 0, + &local_err); + + if (ret == -1) { + vfio_user_set_error(msg->hdr, EIO); + vfio_user_shutdown(proxy); + error_report_err(local_err); + } + trace_vfio_user_send_write(msg->hdr->id, ret); + + return ret; +} + static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds) { @@ -97,6 +139,7 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg) msg->hdr = NULL; msg->fds = NULL; msg->complete = false; + msg->pending = false; QTAILQ_INSERT_HEAD(&proxy->free, msg, next); } @@ -391,6 +434,54 @@ err: return -1; } +/* + * Send messages from outgoing queue when the socket buffer has space. + * If we deplete 'outgoing', remove ourselves from the poll list. 
+ */ +static void vfio_user_send(void *opaque) +{ + VFIOUserProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state == VFIO_PROXY_CONNECTED) { + while (!QTAILQ_EMPTY(&proxy->outgoing)) { + if (vfio_user_send_one(proxy) < 0) { + return; + } + } + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, NULL, proxy); + } +} + +/* + * Send a single message. + * + * Sent async messages are freed, others are moved to pending queue. + */ +static int vfio_user_send_one(VFIOUserProxy *proxy) +{ + VFIOUserMsg *msg; + int ret; + + msg = QTAILQ_FIRST(&proxy->outgoing); + ret = vfio_user_send_qio(proxy, msg); + if (ret < 0) { + return ret; + } + + QTAILQ_REMOVE(&proxy->outgoing, msg, next); + if (msg->type == VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + msg->pending = true; + } + + return 0; +} + static void vfio_user_cb(void *opaque) { VFIOUserProxy *proxy = opaque; @@ -452,6 +543,119 @@ static void vfio_user_request(void *opaque) } } +/* + * Messages are queued onto the proxy's outgoing list. + * + * It handles 3 types of messages: + * + * async messages - replies and posted writes + * + * There will be no reply from the server, so message + * buffers are freed after they're sent. + * + * nowait messages - map/unmap during address space transactions + * + * These are also sent async, but a reply is expected so that + * vfio_wait_reqs() can wait for the youngest nowait request. + * They transition from the outgoing list to the pending list + * when sent, and are freed when the reply is received. + * + * wait messages - all other requests + * + * The reply to these messages is waited for by their caller. + * They also transition from outgoing to pending when sent, but + * the message buffer is returned to the caller with the reply + * contents. The caller is responsible for freeing these messages. 
+ * + * As an optimization, if the outgoing list and the socket send + * buffer are empty, the message is sent inline instead of being + * added to the outgoing list. The rest of the transitions are + * unchanged. + * + * returns 0 if the message was sent or queued + * returns -1 on send error + */ +static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg) +{ + int ret; + + /* + * Unsent outgoing msgs - add to tail + */ + if (!QTAILQ_EMPTY(&proxy->outgoing)) { + QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next); + return 0; + } + + /* + * Try inline - if blocked, queue it and kick send poller + */ + if (proxy->flags & VFIO_PROXY_FORCE_QUEUED) { + ret = QIO_CHANNEL_ERR_BLOCK; + } else { + ret = vfio_user_send_qio(proxy, msg); + } + if (ret == QIO_CHANNEL_ERR_BLOCK) { + QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, proxy->ctx, + vfio_user_send, proxy); + return 0; + } + if (ret == -1) { + return ret; + } + + /* + * Sent - free async, add others to pending + */ + if (msg->type == VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + msg->pending = true; + } + + return 0; +} + +static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize) +{ + VFIOUserMsg *msg; + int ret; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_printf("vfio_user_send_wait on async message\n"); + vfio_user_set_error(hdr, EINVAL); + return; + } + + qemu_mutex_lock(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = rsize ? rsize : hdr->size; + msg->type = VFIO_MSG_WAIT; + + ret = vfio_user_send_queued(proxy, msg); + + if (ret == 0) { + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + VFIOUserMsgQ *list; + + list = msg->pending ? 
&proxy->pending : &proxy->outgoing; + QTAILQ_REMOVE(list, msg, next); + vfio_user_set_error(hdr, ETIMEDOUT); + break; + } + } + } + vfio_user_recycle(proxy, msg); + + qemu_mutex_unlock(&proxy->lock); +} static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -480,6 +684,15 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) proxy = g_malloc0(sizeof(VFIOUserProxy)); proxy->sockname = g_strdup_printf("unix:%s", sockname); proxy->ioc = ioc; + + /* init defaults */ + proxy->max_xfer_size = VFIO_USER_DEF_MAX_XFER; + proxy->max_send_fds = VFIO_USER_DEF_MAX_FDS; + proxy->max_dma = VFIO_USER_DEF_MAP_MAX; + proxy->dma_pgsizes = VFIO_USER_DEF_PGSIZE; + proxy->max_bitmap = VFIO_USER_DEF_MAX_BITMAP; + proxy->migr_pgsize = VFIO_USER_DEF_PGSIZE; + proxy->flags = VFIO_PROXY_CLIENT; proxy->state = VFIO_PROXY_CONNECTED; @@ -577,3 +790,285 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) g_free(proxy->sockname); g_free(proxy); } + +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags) +{ + static uint16_t next_id; + + hdr->id = qatomic_fetch_inc(&next_id); + hdr->command = cmd; + hdr->size = size; + hdr->flags = (flags & ~VFIO_USER_TYPE) | VFIO_USER_REQUEST; + hdr->error_reply = 0; +} + +struct cap_entry { + const char *name; + bool (*check)(VFIOUserProxy *proxy, QObject *qobj, Error **errp); +}; + +static bool caps_parse(VFIOUserProxy *proxy, QDict *qdict, + struct cap_entry caps[], Error **errp) +{ + QObject *qobj; + struct cap_entry *p; + + for (p = caps; p->name != NULL; p++) { + qobj = qdict_get(qdict, p->name); + if (qobj != NULL) { + if (!p->check(proxy, qobj, errp)) { + return false; + } + qdict_del(qdict, p->name); + } + } + + /* warning, for now */ + if (qdict_size(qdict) != 0) { + warn_report("spurious capabilities"); + } + return true; +} + +static bool check_migr_pgsize(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, 
qobj); + uint64_t pgsize; + + if (qn == NULL || !qnum_get_try_uint(qn, &pgsize)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZE); + return false; + } + + /* must be larger than default */ + if (pgsize & (VFIO_USER_DEF_PGSIZE - 1)) { + error_setg(errp, "pgsize 0x%"PRIx64" too small", pgsize); + return false; + } + + proxy->migr_pgsize = pgsize; + return true; +} + +static bool check_bitmap(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t bitmap_size; + + if (qn == NULL || !qnum_get_try_uint(qn, &bitmap_size)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_BITMAP); + return false; + } + + /* can only lower it */ + if (bitmap_size > VFIO_USER_DEF_MAX_BITMAP) { + error_setg(errp, "%s too large", VFIO_USER_CAP_MAX_BITMAP); + return false; + } + + proxy->max_bitmap = bitmap_size; + return true; +} + +static struct cap_entry caps_migr[] = { + { VFIO_USER_CAP_PGSIZE, check_migr_pgsize }, + { VFIO_USER_CAP_MAX_BITMAP, check_bitmap }, + { NULL } +}; + +static bool check_max_fds(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t max_send_fds; + + if (qn == NULL || !qnum_get_try_uint(qn, &max_send_fds) || + max_send_fds > VFIO_USER_MAX_MAX_FDS) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_FDS); + return false; + } + proxy->max_send_fds = max_send_fds; + return true; +} + +static bool check_max_xfer(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t max_xfer_size; + + if (qn == NULL || !qnum_get_try_uint(qn, &max_xfer_size) || + max_xfer_size > VFIO_USER_MAX_MAX_XFER) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_XFER); + return false; + } + proxy->max_xfer_size = max_xfer_size; + return true; +} + +static bool check_pgsizes(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t pgsizes; + + if (qn == NULL || !qnum_get_try_uint(qn, &pgsizes)) 
{ + error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZES); + return false; + } + + /* must be larger than default */ + if (pgsizes & (VFIO_USER_DEF_PGSIZE - 1)) { + error_setg(errp, "pgsize 0x%"PRIx64" too small", pgsizes); + return false; + } + + proxy->dma_pgsizes = pgsizes; + return true; +} + +static bool check_max_dma(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t max_dma; + + if (qn == NULL || !qnum_get_try_uint(qn, &max_dma)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAP_MAX); + return false; + } + + /* can only lower it */ + if (max_dma > VFIO_USER_DEF_MAP_MAX) { + error_setg(errp, "%s too large", VFIO_USER_CAP_MAP_MAX); + return false; + } + + proxy->max_dma = max_dma; + return true; +} + +static bool check_migr(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QDict *qdict = qobject_to(QDict, qobj); + + if (qdict == NULL) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MIGR); + return false; + } + return caps_parse(proxy, qdict, caps_migr, errp); +} + +static struct cap_entry caps_cap[] = { + { VFIO_USER_CAP_MAX_FDS, check_max_fds }, + { VFIO_USER_CAP_MAX_XFER, check_max_xfer }, + { VFIO_USER_CAP_PGSIZES, check_pgsizes }, + { VFIO_USER_CAP_MAP_MAX, check_max_dma }, + { VFIO_USER_CAP_MIGR, check_migr }, + { NULL } +}; + +static bool check_cap(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QDict *qdict = qobject_to(QDict, qobj); + + if (qdict == NULL) { + error_setg(errp, "malformed %s", VFIO_USER_CAP); + return false; + } + return caps_parse(proxy, qdict, caps_cap, errp); +} + +static struct cap_entry ver_0_0[] = { + { VFIO_USER_CAP, check_cap }, + { NULL } +}; + +static bool caps_check(VFIOUserProxy *proxy, int minor, const char *caps, + Error **errp) +{ + QObject *qobj; + QDict *qdict; + bool ret; + + qobj = qobject_from_json(caps, NULL); + if (qobj == NULL) { + error_setg(errp, "malformed capabilities %s", caps); + return false; + } + qdict = qobject_to(QDict, qobj); +
if (qdict == NULL) { + error_setg(errp, "capabilities %s not an object", caps); + qobject_unref(qobj); + return false; + } + ret = caps_parse(proxy, qdict, ver_0_0, errp); + + qobject_unref(qobj); + return ret; +} + +static GString *caps_json(void) +{ + QDict *dict = qdict_new(); + QDict *capdict = qdict_new(); + QDict *migdict = qdict_new(); + GString *str; + + qdict_put_int(migdict, VFIO_USER_CAP_PGSIZE, VFIO_USER_DEF_PGSIZE); + qdict_put_int(migdict, VFIO_USER_CAP_MAX_BITMAP, VFIO_USER_DEF_MAX_BITMAP); + qdict_put_obj(capdict, VFIO_USER_CAP_MIGR, QOBJECT(migdict)); + + qdict_put_int(capdict, VFIO_USER_CAP_MAX_FDS, VFIO_USER_MAX_MAX_FDS); + qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER); + qdict_put_int(capdict, VFIO_USER_CAP_PGSIZES, VFIO_USER_DEF_PGSIZE); + qdict_put_int(capdict, VFIO_USER_CAP_MAP_MAX, VFIO_USER_DEF_MAP_MAX); + + qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict)); + + str = qobject_to_json(QOBJECT(dict)); + qobject_unref(dict); + return str; +} + +bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp) +{ + g_autofree VFIOUserVersion *msgp = NULL; + GString *caps; + char *reply; + int size, caplen; + + caps = caps_json(); + caplen = caps->len + 1; + size = sizeof(*msgp) + caplen; + msgp = g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_VERSION, size, 0); + msgp->major = VFIO_USER_MAJOR_VER; + msgp->minor = VFIO_USER_MINOR_VER; + memcpy(&msgp->capabilities, caps->str, caplen); + g_string_free(caps, true); + trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + error_setg_errno(errp, msgp->hdr.error_reply, "version reply"); + return false; + } + + if (msgp->major != VFIO_USER_MAJOR_VER || + msgp->minor > VFIO_USER_MINOR_VER) { + error_setg(errp, "incompatible server version"); + return false; + } + + reply = msgp->capabilities; + if (reply[msgp->hdr.size - sizeof(*msgp) - 1] != 
'\0') { + error_setg(errp, "corrupt version reply"); + return false; + } + + if (!caps_check(proxy, msgp->minor, reply, errp)) { + return false; + } + + trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); + return true; +} diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 30cf35d3e4..9c3b279839 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -35,6 +35,7 @@ typedef struct VFIOUserMsg { uint32_t id; QemuCond cv; bool complete; + bool pending; enum msg_type type; } VFIOUserMsg; @@ -54,6 +55,12 @@ typedef struct VFIOUserProxy { struct QIOChannel *ioc; void (*request)(void *opaque, VFIOUserMsg *msg); void *req_arg; + uint64_t max_xfer_size; + uint64_t max_send_fds; + uint64_t max_dma; + uint64_t dma_pgsizes; + uint64_t max_bitmap; + uint64_t migr_pgsize; int flags; QemuCond close_cv; AioContext *ctx; @@ -76,6 +83,7 @@ typedef struct VFIOUserProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_FORCE_QUEUED 0x4 typedef struct VFIODevice VFIODevice; @@ -84,5 +92,6 @@ void vfio_user_disconnect(VFIOUserProxy *proxy); void vfio_user_set_handler(VFIODevice *vbasedev, void (*handler)(void *opaque, VFIOUserMsg *msg), void *reqarg); +bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); #endif /* VFIO_USER_H */ From patchwork Wed Jan 8 11:50:20 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930714 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 120C7E7719A for ; Wed, 8 Jan 2025 11:57:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdb-0006Us-Uy; Wed, 
08 Jan 2025 06:54:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdQ-0006SP-NU for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:05 -0500 Received: from mx0a-002c1b01.pphosted.com ([148.163.151.68]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdL-0002GE-GG for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:04 -0500 Received: from pps.filterd (m0127840.ppops.net [127.0.0.1]) by mx0a-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5085doxb029507; Wed, 8 Jan 2025 03:53:57 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=QL9J/OlG6EPJwdAw/Q2NPHWl0j00g2towHkyvpToj NA=; b=JJG3Z7I44IkONXY32Ef2bvtedhe5YAstrY+5nVhc1bVO1bH6w9nuGiGRr D3sIBuibvolMscOV9+es7BIIOVjQSQNANnDrQrt3m/c+YoYfV1j5JfyuDxWFPYEa mkF2NJ2CEwF6gVEBxzXN1vEsvg1Ht/LYiOE11l6T26COiVYVOQTNk3CcXt3szV+Y P4L1bWryEuCJ4WeX3UpAiVoqon7GRmYilOS+ZFqAgL06RK/DihLdBohNQ+8pIcxi 9RXM2M+zsihpFw5Si3wXdZmeREd7ODWUql8iH0nEKLUneAEUk9vINvRX6ncV/DCm 8J7oylJeJEHf/8bvfxPqOjrFvmD6g== Received: from nam11-co1-obe.outbound.protection.outlook.com (mail-co1nam11lp2169.outbound.protection.outlook.com [104.47.56.169]) by mx0a-002c1b01.pphosted.com (PPS) with ESMTPS id 43y26xqhax-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=B73wroFO8dx0x3/2lxTkx21wWPqwWqcWNVkeTyZn1iTrAAdfHBTx3zsfFmcA1SxoNCKSRthUAYgNuph69lqKmJ7yv4uKD58tlwY0QCNwpmvu7m+KL3rZ5qQVk8vie92t2LefIInI4T4fTd5egoQwMfeo9PnvcLqTM+YApvX0eG+SCbxq0aNTQWb/FhpJPncEGI2mfjNoJs1zKiHGn+Q7tKHkN7rEbnn0oENy/erwkJRFZ5IPJZP/OWd1xeth1sZrt0tTfuBJu/MyiLFez8ZwWFPogkwKloVxn+5kSJBQ6fwhNSCmsrv/iUR4Ct1FwIPtBMbFi5F0hAu1K9tZZP9+PA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QL9J/OlG6EPJwdAw/Q2NPHWl0j00g2towHkyvpTojNA=; b=vGGVH4i5VnC0FGd4qz3hyyNv13ZdXTDWROr3rtezxn0FT34qSYU7SZulf4+NqF6PbVUjI7EdgY9FTNNUQBdSbUw3KmJsMzFA/0rovjJ76nXNVdccTBSu9UQOVTZey2ehrDxMYulyhqp5N22Y01IZQQnv5Ul98g8IP9PZVh1/Hu5Y5MCU92Ke7HUqC4wggZzVqFMGtYR32S+TnUoyN0/xupRK34icReC/k865aWs3hpwzwgocnoNiIquUpcX3CJra752fuPCS5a3WYNSOUyKL9AZSMPbuhrn1YBEVI8rwBFEuCpaypZJP4RCoxPlNRQ2nII9+cWF/gn8kXFWR1aaQjQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QL9J/OlG6EPJwdAw/Q2NPHWl0j00g2towHkyvpTojNA=; b=biNiOAXMU8u+53ephOTjTJDb+5Gi/DrhP0EvGYv9n93dLS0T1Qr1KybhLShuzo3LuI4ri91mZtbPX19gbaDWe1fGv4fO7nXXKqZH7pxIL8/nTxuXd36BcHFk2KqeAN76zJR0B/YryLtGhtjTVAlOPvi2lCuaCYCiu4y0UgRzMd4s4wLcvV9RApKgAmBHBzaCVCU5OHCVJL2BNu1k7JLd/nKxicEaqvhau6bI8bbHdlKo3V8WycRT1CVbaDqgWLQsqKphBF5VWj/8VLrJPE8CMSPpuDAD19uie7JziulDwhS5e2unPzhMESLmLM7acxbt+k4WZ0oqBqi6e7AmtCWoCg== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by PH0PR02MB7670.namprd02.prod.outlook.com (2603:10b6:510:50::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
15.20.8335.10; Wed, 8 Jan 2025 11:53:48 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:48 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 14/26] vfio-user: get device info Date: Wed, 8 Jan 2025 11:50:20 +0000 Message-Id: <20250108115032.1677686-15-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|PH0PR02MB7670:EE_ X-MS-Office365-Filtering-Correlation-Id: 98f2280e-91b2-42e7-272a-08dd2fdb1c72 x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
bKDgDuuYKRFWSYtUMJRjvpSxJEkwkIXFc565ZD135gw/CDSkZIC0+w1vKsYnmeb/efuwrjETCkuB3CA2mr2ZBX+9orGRJvX+/7vI/G8IBzqaihBsaFPMmx4Kt2MypZXJOwKw/HnewcHLdV0kpsRgi7NYdNiqYKeg8YZrQuXjDCaEj0RE9JctW1O8IXhLRrSu86KhjgYtIVOF0oqI+8u00kMTRnofA0Vi6xwhuXYQwa7qgkkIXd28nm/TgxXHKOuL+5M2vPG9dMM27M6HvkRXkLoB9GgkOGefR9RlLmTUCwKXVQ+bkhHAagobQWYTbtTKhRBlUkZCw5B9/dHZvv9KVO9hkSlxq6Q7L0VdDTFtpH0t8+AB5p9NMELGdEeCkF5bbWQjM8ZcyyQK2OGBNtgF97Q0sdKEY+MDrNgfeetqPjbhNg3FDV3JfvShDY7Ael6TtKgAPAIk2N+PtR2VeP/7oHhvrpQAUG1GhXsJtLPFPs5Q6oP50OqRrUTKFutvckhfTXgwjNJpBCeG8IVgmh6cfr3wlFpqLkeKWjoZF4dUm9EOAqxsSDPTL9sJv6OOogXdMM9Uu853tqkQf8Ym0mLhE407BMBlCKaZ5Ev/5dfy2C3mVkQz+nBAqzbSITyf1c1DuBIKQaQIJx9PnryprAIpYvO9uqvVvy/H0abry1geHFwD7FVe6UjnW0qujiSsHf+mFpz868Fqb4en10u678dQQCt1oSnImybafJeMq5BE2hdXkEAGR3KHIE6FGLA8FUrjOmYjzrGsZQZ74jPs560odZ30qzTM3+mTvupb8MnBHcIU1ZEosnELgnA41aEkQvfJGvjgZxqFNSBmC9xipG+XdgIxOL8denr8gtniS3LyLyBf1slnrVpxiqktaAtDBIqoFKQuGDS01/spG8tZ6K6zIo3uGYjYIhE8NMShuXQ/gRJkyac74andw1+qynjM2QnkRQn5FAzOWAOzOO1Hyv1p32ADVzHmOO1qWwMLkZo3pyfEnpDNkJ5WbCWLS65smn35wru4dM7r1xh2tylW0i3zfh5Lxms5sA+DqOlbY1vsDTMdObKARFoIQOWrLeavzW6L40oPxncv2XfDD/UnOyaSeF6z4KO1PvtPLWra1ggi9shqT/cSyS5r24V/oE9bleW0a1/CDiyrBZFIrq/EoyrDH9MfuKS1EMl2Ra3TFmX8I1QlqZTGBBYxpilcQKHLoCryCM2uHtNbG0Oy/T1bSp1hwdr2QggZAUrK//oD/WPzvGxZ0fKzjxPX5KqrKRyr1B8qFfllQ/pKOhRgvFPbYDBc0kqZNEai7v/IyAae9r4uWvqa6qAUbnbcx3zZWad1YEOzpns3I8pFmVfqHJDpS25oloJN112YarOkwuK/hj+CP1Z8HrLMadbx7jUNJkqdEZfI X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
SNDTY3QlxK/l+wDmWAd9WgFypLU32VQloysUzvPrbU2CFgOUza3VZqA2NAnhcv+J40CPvrAlBe+qP1UQ+L2G0/Xo7nC3gIZxye1QiH4KlynYe4FN8FM9Ztx9GZgW9wUVoJzoX/PIeEJlNpRDZDgWu8CGEpJsekBfuxdscdw3pGOW//PIfEQydjr8sQJVVGorIFhXvJNExYjmXLC+vmyrl77kQ9efACJalFnHLgEKvlg12VEn9B1dymqM3qm8E/AafsgsZLJMZna1MJC98AHT1FH+UkQCJge7pudOqA0gsxhxBIM5/j2gnU1uFsq3fA30zHlTzAYqwl5y3E0CWlZL7fgYt4ortj6XuR3AHXLSGWCVcDITSjCt3aARAWGcRE8z3weQqFn+3ou38KlvXX/dCPnqFU5mNOhhQEr1UpNBn3t4fr6rUsT0jIfm2DMt24/4kRMe1xTnV2oVnBv0aZbdL69oqjKIgD+IjFmsPQ2YFxOFqmBMfN4MY9BWEKntU+BBr6vmbXuOhuYkWgPiwCb/+enAzYVbTuQQ14hoQ5QAkdljlMbcKIWkQluNi0i60PGHegt3xMpbZol0PR0ZNJCVEdL9coJfakXw3S8JytOjZv5o9gGr2NH+y0K2GNGehX4RWk+UjkTe14nCivNoPA27HFYx1JssIZ8pduFZxSg1vkObXUvW1Ekwh5L53u1ZH8yHYX0lfpopWQCNM07iYjJ3H0KPzMSHUeiWuJZFViwwjT/RQt9KUen6f8DjoVzvt1NT1IZnuWq5MOsZtqyeJu9etTj4SMVt5b6GH1qnwkOqQDeUdarY+QGp0Mxghy2ObV3Vn70JjOW8QBxDbw0W7wtA93q+7wgwIIwSt1vOUUJ35xZQs+i0CPlG8Gc2IDIprOBR1cbntvTYO9kCid3E9yyN5OQhWe3/X8eTqlOEpnLseLU/fU3jb1g53osbhIat7SPsTWpK+AFGB8vG6ynOtftz9Pe2F81RP7xXIiHOhTYJJXPNlz9iPQkOfa6CdF1v/x/TKoSIIQobLELDJ+NDe7rKNnRSsTRN6W/eWyuw8ltXlH5IuGaX5uoAuUjAsXRJU27usHoTFINzlooB6IRiyoLOhCF1oW93y76sovCjEiEzJILaIa37Ox/vXXTokplVbIPs3NRoosjrCxOoEOy1vyTfp5nl3fUF7MNoXoReuPQ8D1L8wBbZc3XUVo9fNimkKtLWICHptPNCaVF9LoBK4jDN5Dde6jICrbsS0TlxKN1usRpon+/t2K7C+NDo2ajuc9tiwtijp8alYEjbTot5ZBS5j+/3Z+8+p8H02NFcINO0tSfj/3PlXht6d8adrJm9KO9zqgY872zkyDyOZetnadsXZ/Dqj8CUsQOvlEbIeQa7BTJnS3zFZnJV7QoxCc98KPzgAsjWFKGuztMF8lJ1MzvPj16hL73s5y0v88mrqkI0g/A2BDaYSGROp0ALfq7EgzwFrZYAKwNR0xpUdru3yc3tDlqj+2z2dmLRPCy7d0M/+TkqVabUTHyqUUmgH5TzDZ5qT4m21jBgqJw6cACL12nYlGBExrntjpU+KVvcOlKakc0d3TLNNrMPjp9XKUq1uhY1 X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 98f2280e-91b2-42e7-272a-08dd2fdb1c72 X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:47.9472 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 0NCHMTUJFBJACHM7Hg3ejx4VkfP1BQV/dPuE6hA9kcLwtPujYRzEB3FpB2WVUbMR6BR+b8rYM0okA3nuFCcStA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670 X-Authority-Analysis: v=2.4 cv=Z/cWHGRA c=1 sm=1 tr=0 ts=677e6755 cx=c_pps a=MPHjzrODTC1L994aNYq1fw==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=1OX1_22eDB2b9HH4iF4A:9 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-GUID: QWVHU4qTgvsLTJJ0Toxq51bhIB1rB9da X-Proofpoint-ORIG-GUID: QWVHU4qTgvsLTJJ0Toxq51bhIB1rB9da X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.151.68; envelope-from=john.levon@nutanix.com; helo=mx0a-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/trace-events | 1 + hw/vfio/user-container.c | 10 +++++++++- hw/vfio/user-protocol.h 
| 12 ++++++++++++ hw/vfio/user.c | 34 ++++++++++++++++++++++++++++++++++ hw/vfio/user.h | 1 + 5 files changed, 57 insertions(+), 1 deletion(-) diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index d66fc6c214..662bc4edfd 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -187,3 +187,4 @@ vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0x%x" vfio_user_recv_request(uint16_t cmd) " command 0x%x" vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " major %d minor %d caps: %s" +vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs %d" diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c index f0e2dc6b6b..201755e3d1 100644 --- a/hw/vfio/user-container.c +++ b/hw/vfio/user-container.c @@ -12,6 +12,7 @@ #include #include "hw/vfio/vfio-common.h" +#include "hw/vfio/user.h" #include "exec/address-spaces.h" #include "exec/memory.h" #include "exec/ram_addr.h" @@ -152,7 +153,14 @@ static void vfio_disconnect_user_container(VFIOUserContainer *container) static bool vfio_user_get_device(VFIOUserContainer *container, VFIODevice *vbasedev, Error **errp) { - struct vfio_device_info info = { 0 }; + struct vfio_device_info info = { .argsz = sizeof(info) }; + int ret; + + ret = vfio_user_get_info(vbasedev->proxy, &info); + if (ret) { + error_setg_errno(errp, -ret, "get info failure"); + return false; + } vbasedev->fd = -1; diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 5de5b2030c..5f9ef1768f 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -113,4 +113,16 @@ typedef struct { */ #define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) +/* + * VFIO_USER_DEVICE_GET_INFO + * imported from struct vfio_device_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t num_regions; + uint32_t num_irqs; +} VFIOUserDeviceInfo; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git
a/hw/vfio/user.c b/hw/vfio/user.c index 4e48bc65fe..93c7eea649 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -30,6 +30,13 @@ #include "user.h" #include "trace.h" +/* + * These are to defend against a malign server trying + * to force us to run out of memory. + */ +#define VFIO_USER_MAX_REGIONS 100 +#define VFIO_USER_MAX_IRQS 50 + static int wait_time = 5000; /* wait up to 5 sec for busy servers */ static IOThread *vfio_user_iothread; @@ -1072,3 +1079,30 @@ bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp) trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); return true; } + +int vfio_user_get_info(VFIOUserProxy *proxy, struct vfio_device_info *info) +{ + VFIOUserDeviceInfo msg; + uint32_t argsz = sizeof(msg) - sizeof(msg.hdr); + + memset(&msg, 0, sizeof(msg)); + vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_INFO, sizeof(msg), 0); + msg.argsz = argsz; + + vfio_user_send_wait(proxy, &msg.hdr, NULL, 0); + if (msg.hdr.flags & VFIO_USER_ERROR) { + return -msg.hdr.error_reply; + } + trace_vfio_user_get_info(msg.num_regions, msg.num_irqs); + + memcpy(info, &msg.argsz, argsz); + + /* defend against a malicious server */ + if (info->num_regions > VFIO_USER_MAX_REGIONS || + info->num_irqs > VFIO_USER_MAX_IRQS) { + error_printf("%s: invalid reply\n", __func__); + return -EINVAL; + } + + return 0; +} diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 9c3b279839..18a5a40073 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -93,5 +93,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev, void (*handler)(void *opaque, VFIOUserMsg *msg), void *reqarg); bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); +int vfio_user_get_info(VFIOUserProxy *proxy, struct vfio_device_info *info); #endif /* VFIO_USER_H */ From patchwork Wed Jan 8 11:50:21 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930718 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 96EB1E77188 for ; Wed, 8 Jan 2025 11:58:07 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdj-0006Vz-9x; Wed, 08 Jan 2025 06:54:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdR-0006Sk-H2 for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:05 -0500 Received: from mx0a-002c1b01.pphosted.com ([148.163.151.68]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdL-0002GN-KK for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:04 -0500 Received: from pps.filterd (m0127840.ppops.net [127.0.0.1]) by mx0a-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5085doxd029507; Wed, 8 Jan 2025 03:53:58 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=XmcaJ+Jl5ZNUEeGhmZ8/MV6gYbOCVd+eV5KYQgpNF js=; b=hXN9HSdnNrTcK21z3asx8lSd1e+UoTB5/o964PxGDF4dFi2lHGVqLG0Na aAROxCg1NpaCSQWv8UMsqKjSQOIjSP/PkfN0JAmlkIcxywVgJqHNjIOXySBBjm5Q uxcW10BgXlpRl3AWuT3xAnULi/ECG8ZUMmzUb6gEqsxQ3RaRJ0NZa/Ku3Si4SERB 8+RyWyQ4MORK7U7NWGFRWj5SBpOV4p4FrRNaF4J46liXvcWKbj4eHOW9NcHJCP1J 6kJYAthQff2afsj6bR8p5ssJiwF1TvnYhomK70LQev+GAVboFquyOQuqE7xP4+lS FGsJepDauujc+KFVl4fW/WU66Xs6w== Received: from nam11-co1-obe.outbound.protection.outlook.com (mail-co1nam11lp2169.outbound.protection.outlook.com [104.47.56.169]) by mx0a-002c1b01.pphosted.com (PPS) with ESMTPS id 43y26xqhax-3 
(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=fiLxpxfnOt7MSxoLEofR/sE8kmh5KQ4FptsIubtNrHqAMm7wj4tFqce0u/imjyq6RjImo8+Mb8Cbj6sa2DZnRedvuoKX9mgFX3yHk9pYhxJNTfOqJJTDcg5SgjPeaBJVuSLByzH2kMB825HaPYgXbtILNjsGi3Rh9JKq0gITQbCpWoSb1FXCUIooIm6dH7+ovie9rioCSFC5fdGxOqc+xiH+WoGRDS1Iw8BvtmUSPuvY4YcwmuPPi4JR6JxjVo+6G2lUIPu347+QZOI1p8SnhzQ7cHnKcAy0dL+uotwe0utKlnGaChbvIli+21xEkQor5PoC//RpKxUMqz9GoKAMtg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XmcaJ+Jl5ZNUEeGhmZ8/MV6gYbOCVd+eV5KYQgpNFjs=; b=yr77+wyvjcXtjcOW69Np2PM0zMdNSo++1/YcpW/vyS9QDziFR4ygir91jf9aQIbx8rFL/77kbwVQGRL4Al/NkuR7dDX0Br/t+By0SugJh6kDP865WVBWSmdzrpX6eRMKNgawgk8HWeG+LUC8ZBRhL27Br41LtQl+6dd5Wqfs+57lJNPKPNX4nnbUjirJOC7rsBq253nnd46ZmOu0mMYXgJPXjUjffVYuX3IBzsiwwWALX/TDXCVMId3xzwXVRPEdYN4mm50ZScANbKG115a98XBq4hmagFyfboXfchPHr03xfDF8P4RwDdWgkEZxDoqYTcyWvFURAmBI0EdrakRtbQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=XmcaJ+Jl5ZNUEeGhmZ8/MV6gYbOCVd+eV5KYQgpNFjs=; b=S8Ehjb3sGTbuu9eqBWJVlIpKy8vSaZackZ4EH029tEeRlo73JU51MHQDnVKYN9kJkYsr8BOhqJUDroKO9OvmuPFbycgaNmuSOs0BzFwRKUlfVHuyZiz4lI4fj1ZCAIcwzeNiO/RqXb+44T16UyNt10shbThKcBdSKr0cwgknDxCVDDYdFrX1M2pyz7hIebxsEaK6jEvu4+g85/YPG6RyqwinY8zRDiGA4xkPLgY7u1JE65JIFSn1qHpwBop0UTfXAgIX2hE/siqcSKWWjmfmFWrZvnq5o30zbxIHBdsTdNaIwGntWgzsPaEm0IkJy73p6nGHjeLaFf48L9SFBLfVhQ== Received: from 
CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by PH0PR02MB7670.namprd02.prod.outlook.com (2603:10b6:510:50::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.10; Wed, 8 Jan 2025 11:53:49 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:49 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 15/26] vfio-user: get region info Date: Wed, 8 Jan 2025 11:50:21 +0000 Message-Id: <20250108115032.1677686-16-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|PH0PR02MB7670:EE_ X-MS-Office365-Filtering-Correlation-Id: c89450c1-2ec5-4fb0-5e3d-08dd2fdb1d37 x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
lsR0BZBL/JR2x/F1TRobj5mIfOuI+qhXKMkugg9exjBbey+v7CS1IjWcKg0xhd00oHMimIx8urraxwxPJXDprgn+8KgBAWu+EKQstSFls8RqH2rQmujTFzj8CLFEVgE1ZpTLWQnBv4N6PExFwVUDfGi/iVpJLKwcXfhuBIgDXF3SiwNYPv9Gi1A7YsMV1rNSgNKBMpBOx7PYteDLVZVGm1kiRe8D4ES0gJbXEFH/XzoP8KQ6IOxlGWb1GxQvIc7dqicz9I6BXsx+18v6JlL0rosCrO1L6dY4A2cfSooX8RVh2wrINPn/OwsFGtALYnSsr8mYXhxrV1hzQNpz0P9AmaytdDTlUYP4JG5oXC0kpzro31lVJaSAsdU0+wDmz4eaNUE1C/qt+qZJvivyYGwVdUsD5YtIOPH3wJZTBCWoAjLONZQs/3noMc+NlTKiA+OFIk09kwN+u1nh5fbpi2KSf2wO6/I6CCi/GXvbczR8+kWTkIDwzYUJzZQaT6KZ3qroIOV10Bcda8PzPbTTzvy1DPnumcURdSPQWDwLFJh+PxZUc5zwxnSjV5zULVTiCHqZsB2w4TwEu7Y4dcnmF4CG28+rwWkn1W+W3LEdDkpHFaQ9TrYhRat9Uh4YROjRPRqortcXdbYStZ4emVRiHMqs7Y7AQCihOjXnNb5Yg4p0mDelZQhQhEId2RoWux+3RM+PtuaAMIK91anDQgYVW34wVCw83qk7jB+kTuXb1Dko2c1/JULR5B8+aOAGYwMySVdArm6RvD6XJdtsoL1f3brqcay+hnG1EAX6wMG4Sj/OaXx1AWJx5HaKUFO0XMVuYj3YiBSuv5dJaJCz9/6czB3TcjRgSaDHhZlV87i2qWpk7VQYa9hgj0+jIj4xNiJGC7Murllyaz2mIRZXHc7L3x4RarI8C20U3NCLYC6rCGojyDIrmqe396HCw+w6uNT1GbeIynFjC3ajOrtxMN+HDU+Opp1xCJO0lAEivFCnYyVHomTjPC2fV78Q7k73z+skppdFlxI0dyW8qMBYFlhKobHDMClW2E+kHVUFX/UEHiaQfMCM5UqsIc2efLAC18LCkKg0Qwm6jIIpdDl8AJ+44snPnDDJ304OkH/Ditdkv7PpZJuz8GLk4tf3bOkfCmoy3aPlU+dDYxjkFJXbsrW/vPkTLqDQXav9ZoKqCHwWHUX2hTmsoy2+vmT3K/KdEPh5qxvjumEBb42/M4cx1WLCSkO8Sm0VG188ET+LgEH4sXHuO22N9rp3wV3fmT7WjBroFE1KmhwxmoNxlcrb7DZ4tpfR35oJOi/Nq9gr9x+tNIqUfzr7MVuTA8R/wIbanndRP7O51c6ErV+CC5pK5Xtel9QFXEFVf3hAbF5zy/5p7lI4JdFXNIXJY8UiUWgjRtoQF0bm X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
CC+X0lTZwxbLF0IENz+to3bPE7uUpqO6GrBj9oHtTQ7ZqEGS3k43/tBdwhuQT9yggTQ1xHspEDwbYreWR8oEsgNDwgL50oA8Z53y2+lVy1K5lSX9QAu3cV/mR4qnsqXsMzazgxuksCjQI+qx8iw14OGPQPI+W106Vwq4r+mmxRawRZHQQmiZpcwRVuqucwlO9iTVMWdmtfOVemI76FWYB/NHOl2NG1FNClw2HzaJwvX6huJedJrxfJZPYLjvwFt2NSwBlNV66vw+LujzvMfSkQudGjLkOv7WfaOcF8k/d2Gh66e5FbvvB02pr2MlMeXCbydVL1/kCj5PjUH/kHNCMwacydOpVrLTa8md9RFjLjPWws25LvEI6Q/NWvKxuI0y+vVC3FMA5rk/wyyt68GCzIJNqdOK/e6tT5XXaB7kf+68MNU+FMc0puyOLqD00V8t5gDBWaknnA5CpnS88cV0vOlnZ6Yb5P9839lCLGfelrgo4Ux4sk4+zT9wpPWJ3y1MPB0jDuo1P3mqhB4/tt6dB+aZMOgQLoFw8itCfnu4TRx6NAslGhdzStACcISQLLv2nNLhiEcuT/Rz/vB0nM8Tsft1mBUbVN9G4aCbvScOm6vNgbbzq5BCAarFmyMS6EFKlhgmAjSNIqdfMKxZzCb3EbbTxBBiapJdUjqIPXOUcRHu5UrL1apcl21o9+dwMHb/b/+UJ7itJ1yoztiQw3CQ5j4OfVMimcVH+UoSBRTM1xQjOgHSQJRr3SNzo+o+HfvckcYY45Z3Cu0KZGeE8ajaUWgyR7246D6wYOiffIFQPs4gSbr47h+flNpR8HROp+esCPTOh6bfPEriQFOePiHnZukn/4T7OuBaDgwSrT2wbDSNmp3g50eDQ2LAdUou2VOo1RKHcyjoPCDT+YpXuM7/DVi6kh74PAK/LYUKIFBJymqkoVo9NEXFUH5mJzxiw3PrfbJ6/77loRx9MjA20m9wLZkzv41nUbQqszOYz7fESl+iuaRK2Efnw730nYVRzzMXIQe1v8U5SffVaFmbcLqdtk4tCQyUAfECf+JM6zhWbcQHFV7whreDVg9eWgjh6ranj/SX6jQlWDYDaDh8xuI+sKmEwGbisGqtGKRDx4pnI4SjQfjLJAHVf375Nghjm9dGPclm338q5/jXMuDusbRWWIbmKzmj9UldMLuzId6sLNxZfo+Gh3LolFb6y9WqOxkw02lUQvDvVAWdhJszAFtqHr2sF0VkCKQJYJsVLnsawOa05O0eXhmhKSCi3TkP/pDAuJW52JAP5kbZjaEdJ1qONvefGG5PWF0aPOoexLJA4wRXvF9lhqcRmIQJhLbB3Fgy/LLlMwrKyouSJiJ87ltgF0VLB628vXBY7mbCiJmZGena5YlA1DlaoLBb30d44n8hAPrh8mwfTUV3VeyZ9ZwRTWAaABSJAVulvfvKhDHrG7MBB8/da9FFlabtJR6t26c+jgl1lWOuVDPS+EPipZZDaOxy9YWOd37LFeppoHwE8qsbJwNnGe0oJYeYeEQb+tycPHtWAX3yKJaByQKotjKuRT/EaqkN5LeQXwoxB8wLHb3RkJbOlnuCnHptIEcApvIe X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: c89450c1-2ec5-4fb0-5e3d-08dd2fdb1d37 X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:49.2505 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: /fRSWCoR5LRTrrqWM1nOuzoeTZ3FUS5YHtNiZ+jwAu/C6vU67qzBNI4pDjeCyS25vph7RQCkL1wfsdDOrInVUw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670 X-Authority-Analysis: v=2.4 cv=Z/cWHGRA c=1 sm=1 tr=0 ts=677e6755 cx=c_pps a=MPHjzrODTC1L994aNYq1fw==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=t8GBkXr_Z3fVsEiobiEA:9 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-GUID: sYlGEDHRKCNfWmztNnu7AzPUHET5gT2O X-Proofpoint-ORIG-GUID: sYlGEDHRKCNfWmztNnu7AzPUHET5gT2O X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.151.68; envelope-from=john.levon@nutanix.com; helo=mx0a-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman Add per-region FD to support mmap() of remote device regions Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/ap.c | 2 ++ 
hw/vfio/ccw.c | 2 ++ hw/vfio/container.c | 7 ++++ hw/vfio/helpers.c | 28 +++++++++++++-- hw/vfio/pci.c | 2 ++ hw/vfio/platform.c | 2 ++ hw/vfio/trace-events | 1 + hw/vfio/user-pci.c | 2 ++ hw/vfio/user-protocol.h | 14 ++++++++ hw/vfio/user.c | 68 +++++++++++++++++++++++++++++++++++ include/hw/vfio/vfio-common.h | 6 +++- 11 files changed, 130 insertions(+), 4 deletions(-) diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 1adce1ab40..54b1815f1d 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -162,6 +162,8 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp) return; } + vbasedev->use_regfds = false; + if (!vfio_attach_device(vbasedev->name, vbasedev, &address_space_memory, errp)) { goto error; diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 8c16648819..085a3fc6e6 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -586,6 +586,8 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp) goto out_unrealize; } + vbasedev->use_regfds = false; + if (!vfio_attach_device(cdev->mdevid, vbasedev, &address_space_memory, errp)) { goto out_attach_dev_err; diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 039241c9c5..e017cd4b08 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -892,10 +892,17 @@ void vfio_put_base_device(VFIODevice *vbasedev) int i; for (i = 0; i < vbasedev->num_regions; i++) { + if (vbasedev->regfds != NULL && vbasedev->regfds[i] != -1) { + close(vbasedev->regfds[i]); + } g_free(vbasedev->regions[i]); } g_free(vbasedev->regions); vbasedev->regions = NULL; + if (vbasedev->regfds != NULL) { + g_free(vbasedev->regfds); + vbasedev->regfds = NULL; + } } if (!vbasedev->group) { diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 529520c1d6..802d6ae101 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -364,6 +364,12 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, region->size = info->size; region->fd_offset = info->offset; region->nr = index; + if (vbasedev->regfds != NULL) { + region->fd = 
vbasedev->regfds[index]; + } else { + region->fd = vbasedev->fd; + } + if (region->size) { region->mem = g_new0(MemoryRegion, 1); @@ -442,7 +448,7 @@ int vfio_region_mmap(VFIORegion *region) region->mmaps[i].mmap = mmap(map_align, region->mmaps[i].size, prot, MAP_SHARED | MAP_FIXED, - region->vbasedev->fd, + region->fd, region->fd_offset + region->mmaps[i].offset); if (region->mmaps[i].mmap == MAP_FAILED) { @@ -567,12 +573,16 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, struct vfio_region_info **info) { size_t argsz = sizeof(struct vfio_region_info); + int fd = -1; int ret; /* create region cache */ if (vbasedev->regions == NULL) { vbasedev->regions = g_new0(struct vfio_region_info *, vbasedev->num_regions); + if (vbasedev->use_regfds) { + vbasedev->regfds = g_new0(int, vbasedev->num_regions); + } } /* check cache */ if (vbasedev->regions[index] != NULL) { @@ -586,22 +596,33 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, retry: (*info)->argsz = argsz; - ret = vbasedev->io->get_region_info(vbasedev, *info); + ret = vbasedev->io->get_region_info(vbasedev, *info, &fd); if (ret != 0) { g_free(*info); *info = NULL; + if (vbasedev->regfds != NULL) { + vbasedev->regfds[index] = -1; + } + return -errno; } if ((*info)->argsz > argsz) { argsz = (*info)->argsz; *info = g_realloc(*info, argsz); + if (fd != -1) { + close(fd); + fd = -1; + } goto retry; } /* fill cache */ vbasedev->regions[index] = *info; + if (vbasedev->regfds != NULL) { + vbasedev->regfds[index] = fd; + } return 0; } @@ -765,10 +786,11 @@ static int vfio_io_device_feature(VFIODevice *vbasedev, } static int vfio_io_get_region_info(VFIODevice *vbasedev, - struct vfio_region_info *info) + struct vfio_region_info *info, int *fd) { int ret; + *fd = -1; ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info); return ret < 0 ? 
-errno : ret; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 27f82d6517..b57059d676 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3048,6 +3048,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) name = g_strdup(vbasedev->name); } + vbasedev->use_regfds = false; + if (!vfio_attach_device(name, vbasedev, pci_device_iommu_address_space(pdev), errp)) { goto error; diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 1194e55807..6e19573b3b 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -575,6 +575,8 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp) VFIODevice *vbasedev = &vdev->vbasedev; int i; + vbasedev->use_regfds = false; + qemu_mutex_init(&vdev->intp_mutex); trace_vfio_platform_realize(vbasedev->sysfsdev ? diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 662bc4edfd..ee6d7a0d0a 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -188,3 +188,4 @@ vfio_user_recv_request(uint16_t cmd) " command 0x%x" vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " major %d minor %d caps: %s" vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs %d" +vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index %d flags 0x%x size 0x%"PRIx64 diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index 62259db473..60cd9c941c 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -111,6 +111,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->ops = &vfio_user_pci_ops; vbasedev->type = VFIO_DEVICE_TYPE_PCI; vbasedev->dev = DEVICE(vdev); + vbasedev->io = &vfio_dev_io_sock; + vbasedev->use_regfds = true; /* * vfio-user devices are effectively mdevs (don't use a host iommu). 
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 5f9ef1768f..6f70a48905 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -125,4 +125,18 @@ typedef struct { uint32_t num_irqs; } VFIOUserDeviceInfo; +/* + * VFIO_USER_DEVICE_GET_REGION_INFO + * imported from struct vfio_region_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t index; + uint32_t cap_offset; + uint64_t size; + uint64_t offset; +} VFIOUserRegionInfo; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 93c7eea649..44e8da8aa1 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -1106,3 +1106,71 @@ int vfio_user_get_info(VFIOUserProxy *proxy, struct vfio_device_info *info) return 0; } + +static int vfio_user_get_region_info(VFIOUserProxy *proxy, + struct vfio_region_info *info, + VFIOUserFDs *fds) +{ + g_autofree VFIOUserRegionInfo *msgp = NULL; + uint32_t size; + + /* data returned can be larger than vfio_region_info */ + if (info->argsz < sizeof(*info)) { + error_printf("vfio_user_get_region_info argsz too small\n"); + return -E2BIG; + } + if (fds != NULL && fds->send_fds != 0) { + error_printf("vfio_user_get_region_info can't send FDs\n"); + return -EINVAL; + } + + size = info->argsz + sizeof(VFIOUserHdr); + msgp = g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_GET_REGION_INFO, + sizeof(*msgp), 0); + msgp->argsz = info->argsz; + msgp->index = info->index; + + vfio_user_send_wait(proxy, &msgp->hdr, fds, size); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } + trace_vfio_user_get_region_info(msgp->index, msgp->flags, msgp->size); + + memcpy(info, &msgp->argsz, info->argsz); + return 0; +} + + +/* + * Socket-based io_ops + */ + +static int vfio_user_io_get_region_info(VFIODevice *vbasedev, + struct vfio_region_info *info, + int *fd) +{ + int ret; + VFIOUserFDs fds = { 0, 1, fd}; + + ret = vfio_user_get_region_info(vbasedev->proxy, 
info, &fds); + if (ret) { + return ret; + } + + if (info->index > vbasedev->num_regions) { + return -EINVAL; + } + /* cap_offset in valid area */ + if ((info->flags & VFIO_REGION_INFO_FLAG_CAPS) && + (info->cap_offset < sizeof(*info) || info->cap_offset > info->argsz)) { + return -EINVAL; + } + + return 0; +} + +VFIODeviceIO vfio_dev_io_sock = { + .get_region_info = vfio_user_io_get_region_info, +}; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index afc67a3a77..50b136b7dc 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -59,6 +59,7 @@ typedef struct VFIORegion { uint32_t nr_mmaps; VFIOMmap *mmaps; uint8_t nr; /* cache the region number for debug */ + int fd; /* fd to mmap() region */ } VFIORegion; typedef struct VFIOMigration { @@ -146,6 +147,7 @@ typedef struct VFIODevice { bool ram_block_discard_allowed; OnOffAuto enable_migration; bool migration_events; + bool use_regfds; VFIODeviceOps *ops; VFIODeviceIO *io; unsigned int num_irqs; @@ -165,6 +167,7 @@ typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) hwpt_next; VFIOUserProxy *proxy; struct vfio_region_info **regions; + int *regfds; } VFIODevice; struct VFIODeviceOps { @@ -209,7 +212,7 @@ struct VFIODeviceOps { struct VFIODeviceIO { int (*device_feature)(VFIODevice *vdev, struct vfio_device_feature *); int (*get_region_info)(VFIODevice *vdev, - struct vfio_region_info *info); + struct vfio_region_info *info, int *fd); int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq); int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs); int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, @@ -219,6 +222,7 @@ struct VFIODeviceIO { }; extern VFIODeviceIO vfio_dev_io_ioctl; +extern VFIODeviceIO vfio_dev_io_sock; #endif /* CONFIG_LINUX */ From patchwork Wed Jan 8 11:50:22 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930707 
From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 16/26] vfio-user: region read/write Date: Wed, 8 Jan 2025 11:50:22 +0000 Message-Id: <20250108115032.1677686-17-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> MIME-Version: 1.0 From: Jagannathan Raman Add support for posted writes on remote devices Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/helpers.c | 20 ++++-- hw/vfio/pci.c 
| 5 +- hw/vfio/trace-events | 1 + hw/vfio/user-pci.c | 5 ++ hw/vfio/user-protocol.h | 12 ++++ hw/vfio/user.c | 120 ++++++++++++++++++++++++++++++++++ hw/vfio/user.h | 1 + include/hw/vfio/vfio-common.h | 3 +- 8 files changed, 158 insertions(+), 9 deletions(-) diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 802d6ae101..ea3dbfa96d 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -183,12 +183,15 @@ void vfio_region_write(void *opaque, hwaddr addr, break; } - ret = vbasedev->io->region_write(vbasedev, region->nr, addr, size, &buf); + ret = vbasedev->io->region_write(vbasedev, region->nr, addr, size, &buf, + region->post_wr); if (ret != size) { + const char *errmsg = ret < 0 ? strerror(-ret) : "short write"; + error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64 - ",%d) failed: %m", + ",%d) failed: %s", __func__, vbasedev->name, region->nr, - addr, data, size); + addr, data, size, errmsg); } trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size); @@ -220,9 +223,11 @@ uint64_t vfio_region_read(void *opaque, ret = vbasedev->io->region_read(vbasedev, region->nr, addr, size, &buf); if (ret != size) { - error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m", + const char *errmsg = ret < 0 ? 
strerror(-ret) : "short read"; + + error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %s", __func__, vbasedev->name, region->nr, - addr, size); + addr, size, errmsg); return (uint64_t)-1; } switch (size) { @@ -364,13 +369,14 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, region->size = info->size; region->fd_offset = info->offset; region->nr = index; + region->post_wr = false; + if (vbasedev->regfds != NULL) { region->fd = vbasedev->regfds[index]; } else { region->fd = vbasedev->fd; } - if (region->size) { region->mem = g_new0(MemoryRegion, 1); memory_region_init_io(region->mem, obj, &vfio_region_ops, @@ -827,7 +833,7 @@ static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off, } static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off, - uint32_t size, void *data) + uint32_t size, void *data, bool post) { struct vfio_region_info *info = vbasedev->regions[index]; int ret; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index b57059d676..90cf29325f 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -51,7 +51,7 @@ (off), (size), (data))) #define VDEV_CONFIG_WRITE(vbasedev, off, size, data) \ ((vbasedev)->io->region_write((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, \ - (off), (size), (data))) + (off), (size), (data), false)) #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -1780,6 +1780,9 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) bar->type = pci_bar & (bar->ioport ? 
~PCI_BASE_ADDRESS_IO_MASK : ~PCI_BASE_ADDRESS_MEM_MASK); bar->size = bar->region.size; + + /* IO regions are sync, memory can be async */ + bar->region.post_wr = (bar->ioport == 0); } static void vfio_bars_prepare(VFIOPCIDevice *vdev) diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index ee6d7a0d0a..da8af45ee9 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -189,3 +189,4 @@ vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " major %d minor %d caps: %s" vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs %d" vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index %d flags 0x%x size 0x%"PRIx64 +vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " region %d offset 0x%"PRIx64" count %d" diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index 60cd9c941c..aa5146db0a 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -40,6 +40,7 @@ struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; bool send_queued; /* all sends are queued */ + bool no_post; /* all regions write are sync */ }; /* @@ -102,6 +103,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } + if (udev->no_post) { + proxy->flags |= VFIO_PROXY_NO_POST; + } if (!vfio_user_validate_version(proxy, errp)) { goto error; @@ -173,6 +177,7 @@ static void vfio_user_instance_finalize(Object *obj) static const Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), + DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), }; static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 6f70a48905..6987435e96 100644 --- 
a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -139,4 +139,16 @@ typedef struct { uint64_t offset; } VFIOUserRegionInfo; +/* + * VFIO_USER_REGION_READ + * VFIO_USER_REGION_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t region; + uint32_t count; + char data[]; +} VFIOUserRegionRW; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 44e8da8aa1..118314b363 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -55,6 +55,8 @@ static void vfio_user_cb(void *opaque); static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize); static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, @@ -626,6 +628,33 @@ static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg) return 0; } +/* + * async send - msg can be queued, but will be freed when sent + */ +static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + int ret; + + if (!(hdr->flags & (VFIO_USER_NO_REPLY | VFIO_USER_REPLY))) { + error_printf("vfio_user_send_async on sync message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = 0; + msg->type = VFIO_MSG_ASYNC; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + } +} + static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize) { @@ -1139,9 +1168,84 @@ static int vfio_user_get_region_info(VFIOUserProxy *proxy, trace_vfio_user_get_region_info(msgp->index, msgp->flags, msgp->size); memcpy(info, &msgp->argsz, info->argsz); + + /* read-after-write hazard if guest can directly access region */ + if 
(info->flags & VFIO_REGION_INFO_FLAG_MMAP) { + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + proxy->flags |= VFIO_PROXY_NO_POST; + } + } + return 0; } +static int vfio_user_region_read(VFIOUserProxy *proxy, uint8_t index, + off_t offset, uint32_t count, void *data) +{ + g_autofree VFIOUserRegionRW *msgp = NULL; + int size = sizeof(*msgp) + count; + + if (count > proxy->max_xfer_size) { + return -EINVAL; + } + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_READ, sizeof(*msgp), 0); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count); + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, size); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } else if (msgp->count > count) { + return -E2BIG; + } else { + memcpy(data, &msgp->data, msgp->count); + } + + return msgp->count; +} + +static int vfio_user_region_write(VFIOUserProxy *proxy, uint8_t index, + off_t offset, uint32_t count, void *data, + bool post) +{ + VFIOUserRegionRW *msgp = NULL; + int flags = post ? 
VFIO_USER_NO_REPLY : 0; + int size = sizeof(*msgp) + count; + int ret; + + if (count > proxy->max_xfer_size) { + return -EINVAL; + } + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + memcpy(&msgp->data, data, count); + trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count); + + /* async send will free msg after it's sent */ + if (post && !(proxy->flags & VFIO_PROXY_NO_POST)) { + vfio_user_send_async(proxy, &msgp->hdr, NULL); + return count; + } + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret = -msgp->hdr.error_reply; + } else { + ret = count; + } + + g_free(msgp); + return ret; +} + /* * Socket-based io_ops @@ -1171,6 +1275,22 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev, return 0; } +static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index, + off_t off, uint32_t size, void *data) +{ + return vfio_user_region_read(vbasedev->proxy, index, off, size, data); +} + +static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index, + off_t off, unsigned size, void *data, + bool post) +{ + return vfio_user_region_write(vbasedev->proxy, index, off, size, data, + post); +} + VFIODeviceIO vfio_dev_io_sock = { .get_region_info = vfio_user_io_get_region_info, + .region_read = vfio_user_io_region_read, + .region_write = vfio_user_io_region_write, }; diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 18a5a40073..1f99a976d6 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -84,6 +84,7 @@ typedef struct VFIOUserProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 #define VFIO_PROXY_FORCE_QUEUED 0x4 +#define VFIO_PROXY_NO_POST 0x8 typedef struct VFIODevice VFIODevice; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 50b136b7dc..3a2e3afaaf 100644 --- a/include/hw/vfio/vfio-common.h +++ 
b/include/hw/vfio/vfio-common.h @@ -60,6 +60,7 @@ typedef struct VFIORegion { VFIOMmap *mmaps; uint8_t nr; /* cache the region number for debug */ int fd; /* fd to mmap() region */ + bool post_wr; /* writes can be posted */ } VFIORegion; typedef struct VFIOMigration { @@ -218,7 +219,7 @@ struct VFIODeviceIO { int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, void *data); int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, - void *data); + void *data, bool post); }; extern VFIODeviceIO vfio_dev_io_ioctl; From patchwork Wed Jan 8 11:50:23 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930717 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 17/26] vfio-user: pci_user_realize PCI setup Date: Wed, 8 Jan 2025 11:50:23 +0000 Message-Id: <20250108115032.1677686-18-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> MIME-Version: 1.0 X-MS-Office365-Filtering-Correlation-Id: 
ab74846f-c99b-45c1-866b-08dd2fdb1eda x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: VBcpAB6/U7FWJ7VH4F3Hzv9L0bYBFWnAwbZ5/1wKGF0PJvuUEFA9DKR42IN57BWPdkXq8q+8iOrgno/A7Y4/RfPazKq6IgHW5Ti5uexmeU6RfnCRaGAGYNFeQIdxXT3ExYCImWgvM1JZ3R3zK39Ki8QZbToRHQlEcJsL4pyQH7aRxsFVC5+OvkbxtMOlMll2IY00ZJLsm8pAXrG3cbj8b1vfJgI1GhMU4kuv8bJ4ZiBBMm0MnS6x+h6qEwIDwKIUr78f86bModyv0cwlu2k0SUY2Kmt75j1sd4sblc44PkgbMNgzcdv+Wtaf+ioNUb2dFjHkuMOX90pVAltm5Y5uGM6cy2dYgbVCwbSryUHHoqS3LCpCn+lib9Su0VNq7QXgDreuz2P5mNs93afpU0xE9a6CYOT67qvHUIF80VKXZiVGFFGTVmU8zfVpUVq6rcu1g57umy24B6lIo/XYBhNou0f7eFMXZ4uZr0YqZiRygxGY9IojG0MIb8VpYS71jti6clGvxGnpx6wCDbWEU06X/tcVa+FFsapDjcvmMQ69z77oCJCsYSwCDORpB/CpA6KQwvKptWJAIpHLqp29DvIxw2HAW9bTk77XBPuGCAtR1FlrBcZQCR2JR/CsHtRcFaMRd8Xg2BapThYV58Pe9wpEBFCo/bl0dkYaUXxwoG20SoLh5nBsrZFXXQHySa6Pic2sKlZqsPtz/KNZTad2m2I84TOIT4jY0vZzKATYM5/lqd5dh7sJ8v7AyRSYOzleuWWgmBZyvZkqhCJ2kNg9GQ0SH4M8Ixa8yf/fdRRgVrZrqccNnKHRmdyzk8I1N4QODRupzWIO5Qyv2sl2+xfYGvZRjlFjyRroF3un3/MmBhKvhjg+W8B6gqe7A6qY03N98mjD+zYJFvS2EgVyMOecX6N1VCGvA+3DSwAZoqLuNK1X7+aQyvYxzZCus5z9McyMlTDMmli6zKOg4sZ33yRq2AIm6ccVNdnJlYJqYaeItZEYujJp9DcEePm2pXm6QYyX9IouLnQXaiP90FFLr88LGHkQCVMh+tvbyD0ZwZ2vZb1UCyfxGTaiNG9kfYOGSpZkMsZ5MQrSoEg8ZEW3jJ4KSVb/oCO+3tC1f/r4EZO1DerR5sulqUELbdQd3m8uQT+D6Xfz3dHeUw3wJt0YpEmSucxIcN8PvI30JRYgFdmdbVoUQMAMJGfM9JiTmXUQPdyQGliOYhYDlLj1HTtXA8DJyQ4kPjxPXN88fhy10O8+QFTF7UM5kLq4Rl/NZLaX0aXipAwneIJVXUP2eWGLPbRzbJjLsivYqPxifuYDMn7csez9YNlDeX84iYA9wmeXlYAXnDKCIwxsLigbEuJJNJbpKIh3t8spz6TPVyB99w82lecky7bDmJ3pzgDLQpXL4FjIDAWm X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
Rtb+IcOXHLDtE0mMBsgidY7Z6iV8Js1KNlaC6hDANhZWc8YxaMTYKn1YTzhySWJ8gNJ0AIK4jhOtZ9kg9RT6Bn0W93hgI/sioiOxsOIcWsfbFEAG0I9uB1/4cJYmw2XK6q6AvVvWqp9Mfy5cUeUvn8pPUyXjSup/4hPEgxZ2BGKatT1UvAH1eRweFcP3XeWCvy/pXB5ipyBtsHLM4EHalOnS90NW7K9FjunCte+/Qzmj6XKjZzquEHCa1KbVvApZQhL+zWZMYgIbtXYmV6o+zmUglP+r3TlU78k2yv38QXMzmPUFvFox78T7J7DDkFZLcFQxn9UVWD+Ol1nv/tkYCfYsWl4XkeqKHoE7tA4sgp6GGpowsXd3P3yfH07Re8QwqkcNP5/MQ0TvsA6fJrTSBHSt0JFQUkDikGEPDmT9ZQkOzTjyTYmxLeFclo76xuYDg/84rV5NDX7jsccLW0J6fRZ+qUU9SD032tBpw9CGhMetJPEjGqVt5KUv72axklhcWZAvZv1/Yoe86JSzEoTr3Z7OFkR0CyR22QM9xTbXjyhz3xlE+P2zr6K6gTB0VcVaZImg46hcdmnVFjxohcaj7gfO14qOP4MI0CK3BHofJgvFLJMxYGt5oDofMCqkg2pswC4qjcxyxBcManK1LK5ctOlsgVRrXMjPRaI08/ZN5kcPXTc2fN/fcPCHjnw4WaPCjK3O4DaDeqM6hJDfpm9Vg5DC0mB/oimIj0F9MLQZZVpOrX8j2H3MKSf8kiDHEzT4DV5Q6Iz5t/Algb2ftXHz6bA5xMEnzlc6pMIbXlxildzV7IVAdz/jLssp/HGwgBylTTz02ppxYEKSYloHUvJ92lNEd1tj19cA94j70JKijablC6r6vl00KLqakrrRfTwTl/WEeRZ0RRGrdJrjV10+1hvX0yY+i1/KxYUE1IcKqt5TTQToDCX+JG6SZQf1HzfZneRXaouigRfT086wWdl8Cyb7ewXhoulNeqJUWQyWjS+na1nUD4ebYgUidAniJ3IrjvdrLcGAvFvhsRuoI/SC5YQaEPiJvbmPssOra+0P8NTKWS3WZjCcBaGWMGr2iTxdpSiQFoYLZRzX6CIkyd2F6BDc1kQFFETzy19KE0OkmUMmFsGBsK/SuJFS+srYZhiEQ2WT339AYWw2rAtJYrlp8scp52/LNHY77kk7ReaG3QG/nJxQroKefK2O6yeDf4vhRDNB0ywk4FRcmObgAUCmgl2uVfkmdJvxvr/slPfXupCAjQw1YZSLFc3lzsAzsJLjM5Vk21FBtCS57wtMt0/PTLLmGc0/kgFFZiF5tNg4FSoKqJfDOE+oTIhknboyw1v6AY/gR5q4/jk2R29fSQoubGwinBgkZ3MuX5JyVYYOpcEmupq9dlTmjecKF8ILI5yxlG35zZFtN4AxcZ0Gg9Qq4VT14e5SxGkmG2bLxomh9uBgwlqnmg+1TrOxNlFzIOrW/Nyx2EDNDon9oSZMdVzSSz4hoocoDWwbM/9pAM+GMATtjz/YpHQGYJkJyHRaLF4ya+aMyfDQY1oVawWnXqi6mt+zlN2yesnj3T4YHERBZSQ0Jjhf1Db/E9vN+cKSYEBv X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: ab74846f-c99b-45c1-866b-08dd2fdb1eda X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:52.1128 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: cN51PScSZVBt0DfwCNswsQ5AT3mg8AknxVTh4wyvyO+7tm9iDLlLde3ZXarYL8ksWBsLLuEs0fzMTjz+pQd4rg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670 X-Authority-Analysis: v=2.4 cv=A6aWP7WG c=1 sm=1 tr=0 ts=677e6757 cx=c_pps a=Odf1NfffwWNqZHMsEJ1rEg==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=TQvnsvYg8OETrsMx9EMA:9 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-GUID: ngiEvAIY_y1bc97OHON-atSu31GCAfwT X-Proofpoint-ORIG-GUID: ngiEvAIY_y1bc97OHON-atSu31GCAfwT X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Jagannathan Raman PCI BARs read from remote device PCI config reads/writes sent to remote server Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/pci.c 
| 243 ++++++++++++++++++++++++++------------------- hw/vfio/pci.h | 10 ++ hw/vfio/user-pci.c | 42 ++++++++ 3 files changed, 191 insertions(+), 104 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 90cf29325f..cd7bff2b4c 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -1728,7 +1728,7 @@ static bool vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp) return true; } -static void vfio_teardown_msi(VFIOPCIDevice *vdev) +void vfio_teardown_msi(VFIOPCIDevice *vdev) { msi_uninit(&vdev->pdev); @@ -1829,7 +1829,7 @@ static void vfio_bars_register(VFIOPCIDevice *vdev) } } -static void vfio_bars_exit(VFIOPCIDevice *vdev) +void vfio_bars_exit(VFIOPCIDevice *vdev) { int i; @@ -1849,7 +1849,7 @@ static void vfio_bars_exit(VFIOPCIDevice *vdev) } } -static void vfio_bars_finalize(VFIOPCIDevice *vdev) +void vfio_bars_finalize(VFIOPCIDevice *vdev) { int i; @@ -2417,7 +2417,7 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev) return; } -static bool vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp) +bool vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp) { PCIDevice *pdev = &vdev->pdev; @@ -2766,7 +2766,7 @@ bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp) return true; } -static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) +bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) { VFIODevice *vbasedev = &vdev->vbasedev; struct vfio_region_info *reg_info = NULL; @@ -2884,7 +2884,7 @@ static void vfio_err_notifier_handler(void *opaque) * and continue after disabling error recovery support for the * device. 
  */
-static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
+void vfio_register_err_notifier(VFIOPCIDevice *vdev)
 {
     Error *err = NULL;
     int32_t fd;
@@ -2943,7 +2943,7 @@ static void vfio_req_notifier_handler(void *opaque)
     }
 }
 
-static void vfio_register_req_notifier(VFIOPCIDevice *vdev)
+void vfio_register_req_notifier(VFIOPCIDevice *vdev)
 {
     struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info),
                                       .index = VFIO_PCI_REQ_IRQ_INDEX };
@@ -2998,79 +2998,10 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
     vdev->req_enabled = false;
 }
 
-static void vfio_realize(PCIDevice *pdev, Error **errp)
+bool vfio_pci_config_setup(VFIOPCIDevice *vdev, Error **errp)
 {
-    ERRP_GUARD();
-    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    PCIDevice *pdev = &vdev->pdev;
     VFIODevice *vbasedev = &vdev->vbasedev;
-    int i, ret;
-    char uuid[UUID_STR_LEN];
-    g_autofree char *name = NULL;
-
-    if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
-        if (!(~vdev->host.domain || ~vdev->host.bus ||
-              ~vdev->host.slot || ~vdev->host.function)) {
-            error_setg(errp, "No provided host device");
-            error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F "
-#ifdef CONFIG_IOMMUFD
-                              "or -device vfio-pci,fd=DEVICE_FD "
-#endif
-                              "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
-            return;
-        }
-        vbasedev->sysfsdev =
-            g_strdup_printf("/sys/bus/pci/devices/%04x:%02x:%02x.%01x",
-                            vdev->host.domain, vdev->host.bus,
-                            vdev->host.slot, vdev->host.function);
-    }
-
-    if (!vfio_device_get_name(vbasedev, errp)) {
-        return;
-    }
-
-    /*
-     * Mediated devices *might* operate compatibly with discarding of RAM, but
-     * we cannot know for certain, it depends on whether the mdev vendor driver
-     * stays in sync with the active working set of the guest driver. Prevent
-     * the x-balloon-allowed option unless this is minimally an mdev device.
-     */
-    vbasedev->mdev = vfio_device_is_mdev(vbasedev);
-
-    trace_vfio_mdev(vbasedev->name, vbasedev->mdev);
-
-    if (vbasedev->ram_block_discard_allowed && !vbasedev->mdev) {
-        error_setg(errp, "x-balloon-allowed only potentially compatible "
-                   "with mdev devices");
-        goto error;
-    }
-
-    if (!qemu_uuid_is_null(&vdev->vf_token)) {
-        qemu_uuid_unparse(&vdev->vf_token, uuid);
-        name = g_strdup_printf("%s vf_token=%s", vbasedev->name, uuid);
-    } else {
-        name = g_strdup(vbasedev->name);
-    }
-
-    vbasedev->use_regfds = false;
-
-    if (!vfio_attach_device(name, vbasedev,
-                            pci_device_iommu_address_space(pdev), errp)) {
-        goto error;
-    }
-
-    if (!vfio_populate_device(vdev, errp)) {
-        goto error;
-    }
-
-    /* Get a copy of config space */
-    ret = pread(vbasedev->fd, vdev->pdev.config,
-                MIN(pci_config_size(&vdev->pdev), vdev->config_size),
-                vdev->config_offset);
-    if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) {
-        ret = ret < 0 ? -errno : -EFAULT;
-        error_setg_errno(errp, -ret, "failed to read device config space");
-        goto error;
-    }
 
     /* vfio emulates a lot for us, but some bits need extra love */
     vdev->emulated_config_bits = g_malloc0(vdev->config_size);
@@ -3088,10 +3019,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->vendor_id != PCI_ANY_ID) {
         if (vdev->vendor_id >= 0xffff) {
             error_setg(errp, "invalid PCI vendor ID provided");
-            goto error;
+            return false;
         }
         vfio_add_emulated_word(vdev, PCI_VENDOR_ID, vdev->vendor_id, ~0);
-        trace_vfio_pci_emulated_vendor_id(vbasedev->name, vdev->vendor_id);
+        trace_vfio_pci_emulated_vendor_id(vdev->vbasedev.name, vdev->vendor_id);
     } else {
         vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID);
     }
@@ -3099,7 +3030,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->device_id != PCI_ANY_ID) {
         if (vdev->device_id > 0xffff) {
             error_setg(errp, "invalid PCI device ID provided");
-            goto error;
+            return false;
         }
         vfio_add_emulated_word(vdev, PCI_DEVICE_ID, vdev->device_id, ~0);
         trace_vfio_pci_emulated_device_id(vbasedev->name, vdev->device_id);
@@ -3110,7 +3041,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->sub_vendor_id != PCI_ANY_ID) {
         if (vdev->sub_vendor_id > 0xffff) {
             error_setg(errp, "invalid PCI subsystem vendor ID provided");
-            goto error;
+            return false;
         }
         vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_VENDOR_ID,
                                vdev->sub_vendor_id, ~0);
@@ -3121,7 +3052,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     if (vdev->sub_device_id != PCI_ANY_ID) {
         if (vdev->sub_device_id > 0xffff) {
             error_setg(errp, "invalid PCI subsystem device ID provided");
-            goto error;
+            return false;
         }
         vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_ID, vdev->sub_device_id, ~0);
         trace_vfio_pci_emulated_sub_device_id(vbasedev->name,
@@ -3152,11 +3083,129 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     vfio_bars_prepare(vdev);
 
     if (!vfio_msix_early_setup(vdev, errp)) {
-        goto error;
+        return false;
     }
 
     vfio_bars_register(vdev);
 
+    return true;
+}
+
+bool vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp)
+{
+    PCIDevice *pdev = &vdev->pdev;
+
+    /* QEMU emulates all of MSI & MSIX */
+    if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
+        memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
+               MSIX_CAP_LENGTH);
+    }
+
+    if (pdev->cap_present & QEMU_PCI_CAP_MSI) {
+        memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff,
+               vdev->msi_cap_size);
+    }
+
+    if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
+        vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                             vfio_intx_mmap_enable, vdev);
+        pci_device_set_intx_routing_notifier(&vdev->pdev,
+                                             vfio_intx_routing_notifier);
+        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
+        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
+        if (!vfio_intx_enable(vdev, errp)) {
+            pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
+            kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier);
+            return false;
+        }
+    }
+    return true;
+}
+
+static void vfio_realize(PCIDevice *pdev, Error **errp)
+{
+    ERRP_GUARD();
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    int i, ret;
+    char uuid[UUID_STR_LEN];
+    g_autofree char *name = NULL;
+
+    if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
+        if (!(~vdev->host.domain || ~vdev->host.bus ||
+              ~vdev->host.slot || ~vdev->host.function)) {
+            error_setg(errp, "No provided host device");
+            error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F "
+#ifdef CONFIG_IOMMUFD
+                              "or -device vfio-pci,fd=DEVICE_FD "
+#endif
+                              "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
+            return;
+        }
+        vbasedev->sysfsdev =
+            g_strdup_printf("/sys/bus/pci/devices/%04x:%02x:%02x.%01x",
+                            vdev->host.domain, vdev->host.bus,
+                            vdev->host.slot, vdev->host.function);
+    }
+
+    if (!vfio_device_get_name(vbasedev, errp)) {
+        return;
+    }
+
+    /*
+     * Mediated devices *might* operate compatibly with discarding of RAM, but
+     * we cannot know for certain, it depends on whether the mdev vendor driver
+     * stays in sync with the active working set of the guest driver. Prevent
+     * the x-balloon-allowed option unless this is minimally an mdev device.
+ */ + vbasedev->mdev = vfio_device_is_mdev(vbasedev); + + trace_vfio_mdev(vbasedev->name, vbasedev->mdev); + + if (vbasedev->ram_block_discard_allowed && !vbasedev->mdev) { + error_setg(errp, "x-balloon-allowed only potentially compatible " + "with mdev devices"); + goto error; + } + + if (!qemu_uuid_is_null(&vdev->vf_token)) { + qemu_uuid_unparse(&vdev->vf_token, uuid); + name = g_strdup_printf("%s vf_token=%s", vbasedev->name, uuid); + } else { + name = g_strdup(vbasedev->name); + } + + vbasedev->use_regfds = false; + + if (!vfio_attach_device(name, vbasedev, + pci_device_iommu_address_space(pdev), errp)) { + goto error; + } + + if (!vfio_populate_device(vdev, errp)) { + goto error; + } + + /* Get a copy of config space */ + ret = pread(vbasedev->fd, vdev->pdev.config, + MIN(pci_config_size(&vdev->pdev), vdev->config_size), + vdev->config_offset); + if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) { + ret = ret < 0 ? -errno : -EFAULT; + error_setg_errno(errp, -ret, "failed to read device config space"); + goto error; + } + + if (!vfio_pci_config_setup(vdev, errp)) { + goto error; + } + + /* + * vfio_pci_config_setup will have registered the device's BARs + * and setup any MSIX BARs, so errors after it succeeds must + * use out_teardown + */ + if (!vbasedev->mdev && !pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) { error_prepend(errp, "Failed to set iommu_device: "); @@ -3200,28 +3249,14 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) } } - /* QEMU emulates all of MSI & MSIX */ - if (pdev->cap_present & QEMU_PCI_CAP_MSIX) { - memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff, - MSIX_CAP_LENGTH); - } - - if (pdev->cap_present & QEMU_PCI_CAP_MSI) { - memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff, - vdev->msi_cap_size); + if (!vfio_interrupt_setup(vdev, errp)) { + goto out_teardown; } - if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) { - vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, - 
vfio_intx_mmap_enable, vdev); - pci_device_set_intx_routing_notifier(&vdev->pdev, - vfio_intx_routing_notifier); - vdev->irqchip_change_notifier.notify = vfio_irqchip_change; - kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier); - if (!vfio_intx_enable(vdev, errp)) { - goto out_deregister; - } - } + /* + * vfio_interrupt_setup will have setup INTx's KVM routing + * so errors after it succeeds must use out_deregister + */ if (vdev->display != ON_OFF_AUTO_OFF) { if (!vfio_display_probe(vdev, errp)) { diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index c0f030f4db..5fe6eb282c 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -218,6 +218,16 @@ Object *vfio_pci_get_object(VFIODevice *vbasedev); int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp); int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f); void vfio_pci_put_device(VFIOPCIDevice *vdev); +bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp); +void vfio_teardown_msi(VFIOPCIDevice *vdev); +void vfio_bars_exit(VFIOPCIDevice *vdev); +void vfio_bars_finalize(VFIOPCIDevice *vdev); +bool vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp); +void vfio_put_device(VFIOPCIDevice *vdev); +void vfio_register_err_notifier(VFIOPCIDevice *vdev); +void vfio_register_req_notifier(VFIOPCIDevice *vdev); +bool vfio_pci_config_setup(VFIOPCIDevice *vdev, Error **errp); +bool vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp); void vfio_instance_init(Object *obj); uint64_t vfio_vga_read(void *opaque, hwaddr addr, unsigned size); diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index aa5146db0a..5758e1e234 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -78,6 +78,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) AddressSpace *as; SocketAddress addr; VFIOUserProxy *proxy; + int ret; /* * TODO: make option parser understand SocketAddress @@ -130,8 +131,45 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) goto error; } + if 
(!vfio_populate_device(vdev, errp)) { + goto error; + } + + /* Get a copy of config space */ + ret = vbasedev->io->region_read(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, 0, + MIN(pci_config_size(pdev), vdev->config_size), + pdev->config); + if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) { + error_setg_errno(errp, -ret, "failed to read device config space"); + goto error; + } + + if (!vfio_pci_config_setup(vdev, errp)) { + goto error; + } + + /* + * vfio_pci_config_setup will have registered the device's BARs + * and setup any MSIX BARs, so errors after it succeeds must + * use out_teardown + */ + + if (!vfio_add_capabilities(vdev, errp)) { + goto out_teardown; + } + + if (!vfio_interrupt_setup(vdev, errp)) { + goto out_teardown; + } + + vfio_register_err_notifier(vdev); + vfio_register_req_notifier(vdev); + return; +out_teardown: + vfio_teardown_msi(vdev); + vfio_bars_exit(vdev); error: error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); } @@ -167,6 +205,10 @@ static void vfio_user_instance_finalize(Object *obj) VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); VFIODevice *vbasedev = &vdev->vbasedev; + vfio_bars_finalize(vdev); + g_free(vdev->emulated_config_bits); + g_free(vdev->rom); + vfio_pci_put_device(vdev); if (vbasedev->proxy != NULL) { From patchwork Wed Jan 8 11:50:24 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon X-Patchwork-Id: 13930719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2B51E77188 for ; Wed, 8 Jan 2025 11:58:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdp-0006ZJ-N6; Wed, 
08 Jan 2025 06:54:30 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdS-0006TC-VE for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:07 -0500 Received: from mx0a-002c1b01.pphosted.com ([148.163.151.68]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdN-0002Gg-FB for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:05 -0500 Received: from pps.filterd (m0127840.ppops.net [127.0.0.1]) by mx0a-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5085doxf029507; Wed, 8 Jan 2025 03:54:00 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=yEQ+2mZvRFUSxRFvKhMU6t4jWNStRpwNrcDO0tY+M 3A=; b=R3b1Y4ImIs6RaqA7BxNmOtA0Boc+kP2zp5qVp7TIJUGCMcCCG37550QD4 S5iLJTelGeq6qN8fsJrvlXWv1HP/6ulQJ8jWc/LhxNCo9CrZ2NcdUVgwqscVkrMT v0uUubNa68AT43S64PPPXJ5rLRXn4A+jpySfDi024UR/pBiGyYgwApEhdjIN8Xo6 N9PVKliTBn3sA0Tx+5jyyD0cTJx9cs+1gS5LuLOgLAdZJzNDjI+Z44I0FvHIJLQA LeHuB8DYWu2rWoWHuIuZ2oYcLcRbOD0LtLLVEb3eQEfghuYVhKCD9EEAKWtQ2YQ+ h3htmpAKUT7Bbf7HlN+C65AkXjJXg== Received: from nam11-co1-obe.outbound.protection.outlook.com (mail-co1nam11lp2169.outbound.protection.outlook.com [104.47.56.169]) by mx0a-002c1b01.pphosted.com (PPS) with ESMTPS id 43y26xqhax-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:53:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=nDQFD4vDFV9aTYoh7nV95SoFCz3jvnxXHJoKdDqTsYW/bgwYtKR3RXyW5RKp2KtU4WjeHgmfym7Ox1bniANTsifvv4+BOT/rCt/qpJEvYpLu6lnHuc3EytxUXqXRZPFfNT+PY43qmhfDDsCi33o5iTeq8nnuJ87eToBsz8jGENOJCf7GxXjoG0fTKjtpi5C8mLNUaBxbJRULUK6b943FD1vDncL5waQC9EJHG2AzUjN6m+sknPxOkhBq2jv77zlLHbsyYuwx4lEC0MmR4eqHBmRtEAvPjvhKNoPvI42ATff7H4Neh3lDonzLZwRA9anjRNbzfDWKZp9FlEGHsfdGYw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=yEQ+2mZvRFUSxRFvKhMU6t4jWNStRpwNrcDO0tY+M3A=; b=nQt07Jyzj3jb+dq71OrbeMuJ3zYV1QnrJFc656nEG9Kn2AWffbvmbbA80B7fCdk614HMg+gXei4Ln3N9dqJ8LJLia9uXWICM4MZjeU585pr7W4Nw84qfCGFj7HCLVtV69I2UVldUjSRK56fKZ/PsgXDvUAwBs90F4QyI0SPQLLAnrH34sQ3PenIFOgP1XdqpLUqlJoZXz/WKKF3SNwR2fGF+BHPGp3raTjw8B5Zmm9HLukAVODs66W/zWZUtoD3lfiUvMu5qhlfRZG5XlTUP+AByJ8m61g+nletoh/0fh7PoaIsCHM8pH844Zc8AbwIBOWKFG3eMlnKiDjaUv6Ivxw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yEQ+2mZvRFUSxRFvKhMU6t4jWNStRpwNrcDO0tY+M3A=; b=ILLszTr5+7jbaX+GYoroDQ3zha9jf9lXvIcXlMpjzlFu8x2anBhh84NoqiQLPp8hQ/Zc1rq56GEHxhMwkkvtoHZqMe4WNJf7XKAj579Yjdhe0pO11r988efFjEgxqWzDfMjvLcCRjWmHQaFR/hkrwx+Cg6HSuKRtkYCnQbBg/DWfDf1lE0yxFtd0VtOKqWcgnfcyRbG78FH1aQJ5Wjfc6b1kQWsUmvrjDuiF2pDdKM4GXynjuAqAJcHMsjNIt2uAIyVM781QOEu1qkU7PgEp69N+mVwj7VHNm/lrAmO6N0/B4Fd+FuJrl4qGqcb/qx7jNzuzrpaP2hDZ26PvM/tYCQ== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by PH0PR02MB7670.namprd02.prod.outlook.com (2603:10b6:510:50::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
15.20.8335.10; Wed, 8 Jan 2025 11:53:53 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:53 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 18/26] vfio-user: get and set IRQs Date: Wed, 8 Jan 2025 11:50:24 +0000 Message-Id: <20250108115032.1677686-19-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|PH0PR02MB7670:EE_ X-MS-Office365-Filtering-Correlation-Id: 14677b4a-63fe-4e63-1991-08dd2fdb1fae x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
DPe3EYG6tiX5pmKTgLg4Yxpdw2VfDevf3ACBLQZMsCX1YFhKoR0hLNbTw8KHGwxNsu/K1E11nQwQz2M4ryQTekKwxWlV+W7t9kyR43C4xIF7kOp5kTRmeXPzwYWaVVD6ncZbQxqcuXUfCT2eFg0BUS06afr97cBxt5yC73s8DvzvukHpzM+rpG1LVGV6CH4rAiWY/h8HhpaYwmwBHTGjDgnxqJ5EPCFuvh8bBF4pP2cCsFZR0tvU5uHBcwVmVqnlZU+lx4OdQ30dmPcu5VJHpRUEMnJwXZj7KnVWadxlXv5reB2yemUJ7EgGXM/AMCYghD39S/8HlK/4CzWu5oqWEh9NYKUSW0q2/fZBGg4/2l2I+LHB4aeUM635QrYdXU1fIkGSJKFDRoUu7Abkvro7hswHsZsZQchgT9i6z8tClx3ntfDwF8Ok6YcdH/cHhYRSHANmpqu0bDgb5u7obuysRv5BEyq1lv8ibaOEJko6o3fCQuU4iuhYCy1zdD1GOSyikaBUw8emEVniIgAvWm3tPJbip3LK0MdFDohrIe0oaRUb09ow08WgPL7H21EYIIMpF4IbvqTwmq3bk/Ov7SUTVzpEo1MQzF3JE83TLeUpLsI+t90U1XcI+jWb1vyj5EZvM43uh4p0tncIZOo/cNUxZskC2HSQQ1sVckb+jnkNxJ8s7WjhzTLVFZEt94ni+Od+6o4MeOCHs84WenSyXpVOTZw93sFvpkYSfg/EbdCeZDDUEEazWTu1KElrKmS6y6kz46BwIO04RqGLJVMZQqp7zuKVPu3FQQqJ0HEY7qpUl3L1N3LbIRYCrb2YNFMySxqqCOq6rbom1/fT37+e0q0LL0tTeuG5oCZVECRUn6UgLLzdTWThTkyZXpv/YthOab3XJk9zuR16HSQA0FlUy872h48YA3kA+fyKjYsXgmvS4c47S+zOtwRXg+jPz23XTBUj6Naa0DRZofIpDFXEpa88si/h6j60V1Laf1ngBhes1gfwQapwIVLRfmn7ZBAAC9BbxioR7vPW2p1DEBr/EYudHxHm8Fy4YphCAzgsHT0xGt4L1zV6LOfjUrx/ahbcD2b48TX5Hw1YhOJwgBb+FW1OMExUxiPu/ctagj5GatkyI1XtOXZhvyNdIicgZs0zktkjTSd/kTW9TWwiKpPUaH4g37eN0utLiXpiQ1bfnwvvaw/jLlEOWT2mX7IbPK1dAySvRwWRReNmTw8dhdbyuR3qd9PyfeEQn/auqDrySjN3DUchQizgGYbrOkxeLyjnn9sWMj+qL9qqdGPv9yMcrGS2+QRD2T/E82gpchj5zleLdgNjQoNQyIaG4nclClbIePK+djXIRF1HOD65emKKS3euhIUi8jjiH1+R8VDH23wRLnR+QMPAsoxMxiArZiVAPIC1 X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
FQNtgk5Hfl+I/yuT9sukzJtNN3qS4uQciAslSe08ClthBtTg7qtOxlMDU9E+vgS4wcTbt1mB49N3HQoCFGAZVz2MZ1bRFJe9QydzpNr1UAGIwM7IyH/XGj8SHe2yOd7ZErlumtTtS36pTRM5nU9RT4pvyYn5pg2FCRIqJpyCYtHC5gZPIWU3OEe5h1hxdHwVkJF/qcR9QxKCznzeShOn8RE2FToDSxaS1Hct9oCqiBMhrNHtI6Pp4K7zBlZLQbbbSj3VQtH9ADwl7pzih3KunmnHbKqLMQhNJ7v1xO+w4rC3zol0jcM3gZD+mnnr3w2t98JqzwqN6Mf+qfG9jzKnp7eS+WE+nXI0nIysxGWIdwPA3tQ2hQ/XrPwBPhz4TmoMUFs9c3WKFBrMRXjo1EZGvem8bJhH57i6HoiZHJhV6nBVCnNYN7Ygu1KJTp1kHzo6jBVprdSygfCrbBnV+x/Tl1C8yAwGuKm7cS2zb+SIf8Xn2L7pPWmj0rInvg7rcbs8dvUpeMieiuFI6/i88XsHZO0dJDjdgNwsOanTJD8MYuEgPqv6S6KIuQGjm5TXehA2/DR2wTX8Ew6CbPJe3TSscJVhGvl/bNwSSaFopJbnf2DnrEyXbNmcEwd1sKUJQn50U0r+Lll2d7HeA+4F5UEKPCH4iqajB2rryWO/a2tN4pEJiQfIHFnw0I2WPgYGAMgK8HPD/3gBc5Feq+gZVTrsOX5G1mKGjZytdxgTEPyA7qggFOeZoJNTe3UT3eXuzRbGRw2+boYwn5ivlknNyZ3d87z1IITWLncQLL3CGqBZTExBNyXl07n85qefxUm5dSUnp1SNC/MnM9pLO+OQO4J2mZkqWpZSsHAM68N8TgjPuLxPVqR0j/EvZ6UJ5yEOFxReVCBFU3LNNMmWgIn7ixi6nt/83q17VLsYLeFyal7ztdxrlIuNTOa5jcNIKqgj15hoAY+5O/CZ3y70bo9zwKQSg+kK9NZ6fXOA1fX2ZgwpODJtKNVqXhraq3S60FQ9loOYKqhLl3uFHNA3DR8qlwpSzOX9htGCJdV0yDG7SiaeQL0uHRWdzkRVUZ1WpU0KmwJotr72Zzpz6Lebj1XwMc7lI+gFdmIZnK+HBMBOg6vXJnvYT3CI2cwVui4MaF6lJDc9kninThDBrqaZt23ht+P6djlM0nFBO1UWRHCHG+sxmMYzldBap7QzihWJmdd+KNHNNLcY+tlzD8TwRnDkFULfFJd1Zf+flAH+ioSaOohvTfSofnp4IM/XU0QRw5e3ibcXPMa9ey8+l344TBfybbQsqXWAmFAcov1ab9QneWttG19wFWlUFanwin0H2HqvuFmrWwAXzt3IQ9PX/4HobV3K0MdUziPMOzM4wsp5wdS+85VQ63nQovGdHu7p/NqsQvNGKGNmIHZrXnuoiprf3aH8ZJuSBD2Nbb10h424wKt9hG9U0BguQinvsF+gEIHsLyzTE+EnGNScwG5wx3zsFgUWGsirf10nirfIbCz2oi2J7ULU7ovD7p7ge+oQ5zaiT9Up0QbV2L4BEgEpip4wpLbbHbFCwzn4EnZ855RT5GU7bEid6FQ7JlkuAZm2dwTtlyrZ X-OriginatorOrg: nutanix.com X-MS-Exchange-CrossTenant-Network-Message-Id: 14677b4a-63fe-4e63-1991-08dd2fdb1fae X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:53.3818 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
 bb047546-786f-4de1-bd75-24e5b6f79043
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: bzggMoS0VMv9tEL+8NMwp0S6E51QnRBWcjdTC3Mzk8vZlBIK4BvuTpxM2CvxJ7zaLahEAAYeibJ5JNtz4VF6Yg==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670
X-Authority-Analysis: v=2.4 cv=Z/cWHGRA c=1 sm=1 tr=0 ts=677e6757 cx=c_pps a=MPHjzrODTC1L994aNYq1fw==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=eeJ3FIb8qCAeb1asaDIA:9 a=14NRyaPF5x3gF6G45PvQ:22
X-Proofpoint-GUID: oIYaQVakPDNn9rHFjawt0zOvippVLi1y
X-Proofpoint-ORIG-GUID: oIYaQVakPDNn9rHFjawt0zOvippVLi1y
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01
X-Proofpoint-Spam-Reason: safe
Received-SPF: pass client-ip=148.163.151.68; envelope-from=john.levon@nutanix.com; helo=mx0a-002c1b01.pphosted.com
X-Spam_score_int: -31
X-Spam_score: -3.2
X-Spam_bar: ---
X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Jagannathan Raman

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/pci.c           |   3 +-
 hw/vfio/trace-events    |   2 +
 hw/vfio/user-protocol.h |  25 +++++++
 hw/vfio/user.c          | 140 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index cd7bff2b4c..57ed6f5363 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -772,7 +772,8 @@ retry:
     ret = vfio_enable_vectors(vdev, false);
     if (ret) {
         if (ret < 0) {
-            error_report("vfio: Error: Failed to setup MSI fds: %m");
+            error_report("vfio: Error: Failed to setup MSI fds: %s",
+                         strerror(-ret));
         } else {
             error_report("vfio: Error: Failed to enable %d "
                          "MSI vectors, retry with %d", vdev->nr_vectors, ret);
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index da8af45ee9..eceaa0c0fd 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -190,3 +190,5 @@ vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " major %d m
 vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs %d"
 vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index %d flags 0x%x size 0x%"PRIx64
 vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " region %d offset 0x%"PRIx64" count %d"
+vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " index %d flags 0x%x count %d"
+vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_t flags) " index %d start %d count %d flags 0x%x"
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 6987435e96..48dd475ab3 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -139,6 +139,31 @@ typedef struct {
     uint64_t offset;
 } VFIOUserRegionInfo;
 
+/*
+ * VFIO_USER_DEVICE_GET_IRQ_INFO
+ * imported from struct vfio_irq_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t count;
+} VFIOUserIRQInfo;
+
+/*
+ * VFIO_USER_DEVICE_SET_IRQS
+ * imported from struct vfio_irq_set
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t start;
+    uint32_t count;
+} VFIOUserIRQSet;
+
 /*
  * VFIO_USER_REGION_READ
  * VFIO_USER_REGION_WRITE
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 118314b363..be2fba522d 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1179,6 +1179,122 @@ static int vfio_user_get_region_info(VFIOUserProxy *proxy,
     return 0;
 }
 
+static int vfio_user_get_irq_info(VFIOUserProxy *proxy,
+                                  struct vfio_irq_info *info)
+{
+    VFIOUserIRQInfo msg;
+
+    memset(&msg, 0, sizeof(msg));
+    vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_IRQ_INFO,
+                          sizeof(msg), 0);
+    msg.argsz = info->argsz;
+    msg.index = info->index;
+
+    vfio_user_send_wait(proxy, &msg.hdr, NULL, 0);
+    if (msg.hdr.flags & VFIO_USER_ERROR) {
+        return -msg.hdr.error_reply;
+    }
+    trace_vfio_user_get_irq_info(msg.index, msg.flags, msg.count);
+
+    memcpy(info, &msg.argsz, sizeof(*info));
+    return 0;
+}
+
+static int irq_howmany(int *fdp, uint32_t cur, uint32_t max)
+{
+    int n = 0;
+
+    if (fdp[cur] != -1) {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] != -1);
+    } else {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] == -1);
+    }
+
+    return n;
+}
+
+static int vfio_user_set_irqs(VFIOUserProxy *proxy, struct vfio_irq_set *irq)
+{
+    g_autofree VFIOUserIRQSet *msgp = NULL;
+    uint32_t size, nfds, send_fds, sent_fds, max;
+
+    if (irq->argsz < sizeof(*irq)) {
+        error_printf("vfio_user_set_irqs argsz too small\n");
+        return -EINVAL;
+    }
+
+    /*
+     * Handle simple case
+     */
+    if ((irq->flags & VFIO_IRQ_SET_DATA_EVENTFD) == 0) {
+        size = sizeof(VFIOUserHdr) + irq->argsz;
+        msgp = g_malloc0(size);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, size, 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start;
+        msgp->count = irq->count;
+        trace_vfio_user_set_irqs(msgp->index, msgp->start, msgp->count,
+                                 msgp->flags);
+
+        vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+
+        return 0;
+    }
+
+    /*
+     * Calculate the number of FDs to send
+     * and adjust argsz
+     */
+    nfds = (irq->argsz - sizeof(*irq)) / sizeof(int);
+    irq->argsz = sizeof(*irq);
+    msgp = g_malloc0(sizeof(*msgp));
+    /*
+     * Send in chunks if over max_send_fds
+     */
+    for (sent_fds = 0; nfds > sent_fds; sent_fds += send_fds) {
+        VFIOUserFDs *arg_fds, loop_fds;
+
+        /* must send all valid FDs or all invalid FDs in single msg */
+        max = nfds - sent_fds;
+        if (max > proxy->max_send_fds) {
+            max = proxy->max_send_fds;
+        }
+        send_fds = irq_howmany((int *)irq->data, sent_fds, max);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS,
+                              sizeof(*msgp), 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start + sent_fds;
+        msgp->count = send_fds;
+        trace_vfio_user_set_irqs(msgp->index, msgp->start, msgp->count,
+                                 msgp->flags);
+
+        loop_fds.send_fds = send_fds;
+        loop_fds.recv_fds = 0;
+        loop_fds.fds = (int *)irq->data + sent_fds;
+        arg_fds = loop_fds.fds[0] != -1 ? &loop_fds : NULL;
+
+        vfio_user_send_wait(proxy, &msgp->hdr, arg_fds, 0);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+    }
+
+    return 0;
+}
+
 static int vfio_user_region_read(VFIOUserProxy *proxy, uint8_t index,
                                  off_t offset, uint32_t count, void *data)
 {
@@ -1275,6 +1391,28 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev,
     return 0;
 }
 
+static int vfio_user_io_get_irq_info(VFIODevice *vbasedev,
+                                     struct vfio_irq_info *irq)
+{
+    int ret;
+
+    ret = vfio_user_get_irq_info(vbasedev->proxy, irq);
+    if (ret) {
+        return ret;
+    }
+
+    if (irq->index > vbasedev->num_irqs) {
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int vfio_user_io_set_irqs(VFIODevice *vbasedev,
+                                 struct vfio_irq_set *irqs)
+{
+    return vfio_user_set_irqs(vbasedev->proxy, irqs);
+}
+
 static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index,
                                     off_t off, uint32_t size, void *data)
 {
@@ -1291,6 +1429,8 @@ static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index,
 
 VFIODeviceIO vfio_dev_io_sock = {
     .get_region_info = vfio_user_io_get_region_info,
+    .get_irq_info = vfio_user_io_get_irq_info,
+    .set_irqs = vfio_user_io_set_irqs,
     .region_read = vfio_user_io_region_read,
     .region_write = vfio_user_io_region_write,
 };

From patchwork Wed Jan 8 11:50:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930706
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 89142E77188 for ; Wed, 8 Jan 2025 11:55:46 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tVUdb-0006Ul-IZ; Wed, 08 Jan 2025 06:54:15 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdS-0006TD-VX for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:07 -0500
Received: from mx0b-002c1b01.pphosted.com ([148.163.155.12]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tVUdN-0002Gl-MB for qemu-devel@nongnu.org; Wed, 08 Jan 2025 06:54:06 -0500
Received: from pps.filterd (m0127842.ppops.net [127.0.0.1]) by mx0b-002c1b01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 5087vtsA007169; Wed, 8 Jan 2025 03:54:00 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s= proofpoint20171006; bh=gkODlHe2hn9pUwEKcizqU9/W8lqBZqXJEw5lYnJEJ xg=; b=B4gWtTEAJ3VwUQ8hhjEr+4JJdZsb01XR+BC9c/PS/wx1KcXZMn8YBvC57
niplVuooKAXOS491sUkqc/qOYTFdztwbsJzUgw6urVt7uFtvy+pYdf7IYXmqZHHD OS/nVnUA2LPOQXbvJtdI9M/5J9n+23apU1YUhk5aUNYAZZxJM5qI7u9lcGls+ZtW BGls8hqMvTKuaQvMFErTiYEAvNdwYWu0adITvqNTBQ7ouSfYQwkQGWq4FM/Mv+dO naYuJv+a/6QSjGJFSTDa7l/OcbxvcHsoWeJ1TPNGB9q4bInaGXu/oSpGAmX1Y+D0 zDj/dpTv3tgijAdeFsmRYwtnvx6kA== Received: from nam12-bn8-obe.outbound.protection.outlook.com (mail-bn8nam12lp2177.outbound.protection.outlook.com [104.47.55.177]) by mx0b-002c1b01.pphosted.com (PPS) with ESMTPS id 43y56eryy2-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jan 2025 03:54:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=I5bXrQMuRmkHYeP232E5JaanMS0MuhwGY4IfwGTvDRmYJOB9sslJwWdnBIC6XlOYXBovNXIUvXFoRQpLccRGpXDUM5gfMSln0aqntIeyrVfGdSIgmoTQRESZgkM2w79MQhpIrNPcz7p+bhJMGKRI4lxoW3+rLPLox44ntBMLrcWIi5zpxoPgHaMJAsftF41fj7LrrXMF32IAMYspanTsSKLCGVFf8wqvebd3Rs0dmNC7YKAi71a7o5C3Cs/iCF/A6vnrUdYipUnmocjs8foIyS1IQMldM2PwhoQGaU95SQqbiZ9pdbiynjaert+t1jtsM27v+RncbHdIivmSQ7291g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gkODlHe2hn9pUwEKcizqU9/W8lqBZqXJEw5lYnJEJxg=; b=c7cln6UtuZyRxtPeSCKTpgpjWSLeCzgQXWi8Iboq/oEsfDc91HrAOepDs47mZMSkz5WnCovAiJisNAE56AjEZTnGN0VjUy9ODVlakeX2oEhwknez77d8tdV3Mv9Bt7HaId+a7tHefNAWTEMsCdyWmxJsvy+0LymDTccliQIt6MMl8R59kz7+39B+8eu47txOAyri1+3biMlxLG8EYixlh4Vd6AIIKDWcgg5BUdC0sqCinhdZ8yXLPFORf19g7etSfH2F2i82JLaZ4EdGfBDGCu/Su7KQN+a/o/0su/yFnGFHjDBPJTdATc7yD1DBak/HBLX0pCbk3bvyk5+2/E7Nhw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gkODlHe2hn9pUwEKcizqU9/W8lqBZqXJEw5lYnJEJxg=; b=Fzeo7T0dcmxmlVKq583Dtz7tGNja8802VtjyNZUCqar6QonyvJF78ctcvsFfryhxv9C4e+V1KUIhlpelrWIj/XIgTuqlCCZ+yWXlCU6y3BbUYz9sFZfDIW7k+Wp5czBEfa3FO4HbXxsdLznFsQtybyu75+K8kgI6XxsCu1HIPdj8OqEG+p+b/1wwMrsPtxHWgDQnhc4EBkkbaRgwjpS4jjBt9xs7Udd9b1KB1MXtR/EQzfdXNmGCyJHesMlzbsTEPkC6kE9V/M3tvqx2CYFWRFyJ5DoMkQhsTXyFrLRiCh9aJ+V/161YZOH37ojygEKY5mkOmnHqz1PwCJ9cw6ZORw== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by PH0PR02MB7670.namprd02.prod.outlook.com (2603:10b6:510:50::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.10; Wed, 8 Jan 2025 11:53:54 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:53:54 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 19/26] vfio-user: forward msix BAR accesses to server Date: Wed, 8 Jan 2025 11:50:25 +0000 Message-Id: <20250108115032.1677686-20-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|PH0PR02MB7670:EE_ X-MS-Office365-Filtering-Correlation-Id: 2b741c72-9f75-40be-3a2b-08dd2fdb2072 x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; 
X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102;
X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1
X-OriginatorOrg: nutanix.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 2b741c72-9f75-40be-3a2b-08dd2fdb2072
X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:53:54.7663 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id:
 bb047546-786f-4de1-bd75-24e5b6f79043
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: 6kKWu1sTofuzsxqVH22/W6G+6ZGQHpvkwBjnlpOP8slJpIfYznnfccffMFyIljnLc77rY2vvdkFHYy08PA1Jow==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670
X-Authority-Analysis: v=2.4 cv=A6aWP7WG c=1 sm=1 tr=0 ts=677e6758 cx=c_pps a=Odf1NfffwWNqZHMsEJ1rEg==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=NfzBEOeHTuBK66BqFZUA:9 a=14NRyaPF5x3gF6G45PvQ:22
X-Proofpoint-GUID: IVFg3U0L3C9TL2actLAz1vM7fqeQ7dzh
X-Proofpoint-ORIG-GUID: IVFg3U0L3C9TL2actLAz1vM7fqeQ7dzh
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01
X-Proofpoint-Spam-Reason: safe
Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com
X-Spam_score_int: -31
X-Spam_score: -3.2
X-Spam_bar: ---
X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Jagannathan Raman

The server holds the device's current pending state.
Use IRQ masking commands in the socket case.

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/helpers.c             | 26 +++++++++++
 hw/vfio/pci.c                 | 86 +++++++++++++++++++++++++----------
 hw/vfio/pci.h                 |  2 +
 hw/vfio/user-pci.c            | 63 +++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  2 +
 5 files changed, 156 insertions(+), 23 deletions(-)

diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index ea3dbfa96d..623634a614 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -72,6 +72,32 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
     vbasedev->io->set_irqs(vbasedev, &irq_set);
 }
 
+void vfio_mask_single_irq(VFIODevice *vbasedev, int index, int irq)
+{
+    struct vfio_irq_set irq_set = {
+        .argsz = sizeof(irq_set),
+        .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
+        .index = index,
+        .start = irq,
+        .count = 1,
+    };
+
+    vbasedev->io->set_irqs(vbasedev, &irq_set);
+}
+
+void vfio_unmask_single_irq(VFIODevice *vbasedev, int index, int irq)
+{
+    struct vfio_irq_set irq_set = {
+        .argsz = sizeof(irq_set),
+        .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
+        .index = index,
+        .start = irq,
+        .count = 1,
+    };
+
+    vbasedev->io->set_irqs(vbasedev, &irq_set);
+}
+
 static inline const char *action_to_str(int action)
 {
     switch (action) {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 57ed6f5363..fdb6d033f1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -520,11 +520,30 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg,
     kvm_irqchip_commit_routes(kvm_state);
 }
 
+static void set_irq_signalling(VFIODevice *vbasedev, VFIOMSIVector *vector,
+                               unsigned int nr)
+{
+    Error *err = NULL;
+    int32_t fd;
+
+    if (vector->virq >= 0) {
+        fd = event_notifier_get_fd(&vector->kvm_interrupt);
+    } else {
+        fd = event_notifier_get_fd(&vector->interrupt);
+    }
+
+    if (!vfio_set_irq_signaling(vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, nr,
+                                VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vbasedev->name);
+    }
+}
+
 static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
                                    MSIMessage *msg, IOHandler *handler)
 {
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector;
+    bool new_vec = false;
     int ret;
     bool resizing = !!(vdev->nr_vectors < nr + 1);
 
@@ -539,6 +558,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
             error_report("vfio: Error: event_notifier_init failed");
         }
         vector->use = true;
+        new_vec = true;
         msix_vector_use(pdev, nr);
     }
 
@@ -565,6 +585,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
                 kvm_irqchip_commit_route_changes(&vfio_route_change);
                 vfio_connect_kvm_msi_virq(vector);
             }
+            new_vec = true;
         }
     }
 
@@ -574,38 +595,35 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
      * in use, so we shutdown and incrementally increase them as needed.
      * nr_vectors represents the total number of vectors allocated.
      *
+     * Otherwise, unmask the vector if the vector is already setup (and we can
+     * do so) or send the fd if not.
+     *
      * When dynamic allocation is supported, let the host only allocate
      * and enable a vector when it is in use in guest. nr_vectors represents
      * the upper bound of vectors being enabled (but not all of the ranges
      * is allocated or enabled).
      */
+
     if (resizing) {
         vdev->nr_vectors = nr + 1;
     }
 
     if (!vdev->defer_kvm_irq_routing) {
-        if (vdev->msix->noresize && resizing) {
-            vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
-            ret = vfio_enable_vectors(vdev, true);
-            if (ret) {
-                error_report("vfio: failed to enable vectors, %d", ret);
-            }
-        } else {
-            Error *err = NULL;
-            int32_t fd;
-
-            if (vector->virq >= 0) {
-                fd = event_notifier_get_fd(&vector->kvm_interrupt);
+        if (resizing) {
+            if (vdev->msix->noresize) {
+                vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
+                ret = vfio_enable_vectors(vdev, true);
+                if (ret) {
+                    error_report("vfio: failed to enable vectors, %d", ret);
+                }
             } else {
-                fd = event_notifier_get_fd(&vector->interrupt);
-            }
-
-            if (!vfio_set_irq_signaling(&vdev->vbasedev,
-                                        VFIO_PCI_MSIX_IRQ_INDEX, nr,
-                                        VFIO_IRQ_SET_ACTION_TRIGGER, fd,
-                                        &err)) {
-                error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+                set_irq_signalling(&vdev->vbasedev, vector, nr);
             }
+        } else if (vdev->can_mask_msix && !new_vec) {
+            vfio_unmask_single_irq(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX,
+                                   nr);
+        } else {
+            set_irq_signalling(&vdev->vbasedev, vector, nr);
         }
     }
 
@@ -633,6 +651,12 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 
     trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
 
+    /* just mask vector if peer supports it */
+    if (vdev->can_mask_msix) {
+        vfio_mask_single_irq(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, nr);
+        return;
+    }
+
     /*
      * There are still old guests that mask and unmask vectors on every
      * interrupt. If we're using QEMU bypass with a KVM irqfd, leave all of
@@ -704,7 +728,7 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
         if (ret) {
             error_report("vfio: failed to enable vectors, %d", ret);
         }
-    } else {
+    } else if (!vdev->can_mask_msix) {
         /*
          * Some communication channels between VF & PF or PF & fw rely on the
          * physical state of the device and expect that enabling MSI-X from the
@@ -721,6 +745,13 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
         if (ret) {
             error_report("vfio: failed to enable MSI-X, %d", ret);
         }
+    } else {
+        /*
+         * If we can use irq masking, send an invalid fd on vector 0
+         * to enable MSI-X without any vectors enabled.
+         */
+        vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, -1, NULL);
     }
 
     trace_vfio_msix_enable(vdev->vbasedev.name);
@@ -2771,7 +2802,7 @@ bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
 {
     VFIODevice *vbasedev = &vdev->vbasedev;
     struct vfio_region_info *reg_info = NULL;
-    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
+    struct vfio_irq_info irq_info;
     int i, ret = -1;
 
     /* Sanity check device */
@@ -2832,8 +2863,17 @@ bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
         }
     }
 
-    irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
+    irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX;
+    irq_info.argsz = sizeof(irq_info);
+    ret = vbasedev->io->get_irq_info(vbasedev, &irq_info);
+    if (ret == 0 && (irq_info.flags & VFIO_IRQ_INFO_MASKABLE)) {
+        vdev->can_mask_msix = true;
+    } else {
+        vdev->can_mask_msix = false;
+    }
 
+    irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
+    irq_info.argsz = sizeof(irq_info);
     ret = vbasedev->io->get_irq_info(vbasedev, &irq_info);
     if (ret) {
         /* This can fail for an old kernel or legacy PCI dev */
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 5fe6eb282c..6f024936ea 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -114,6 +114,7 @@ typedef struct VFIOMSIXInfo {
     uint32_t pba_offset;
     unsigned long *pending;
     bool noresize;
+    MemoryRegion *pba_region;
 } VFIOMSIXInfo;
 
 /*
@@ -183,6 +184,7 @@ struct VFIOPCIDevice {
     bool defer_kvm_irq_routing;
     bool clear_parent_atomics_on_exit;
     bool skip_vsc_check;
+    bool can_mask_msix;
     VFIODisplay *dpy;
     Notifier irqchip_change_notifier;
 };
diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c
index 5758e1e234..53d230fdd3 100644
--- a/hw/vfio/user-pci.c
+++ b/hw/vfio/user-pci.c
@@ -43,6 +43,62 @@ struct VFIOUserPCIDevice {
     bool no_post; /* all regions write are sync */
 };
 
+/*
+ * The server maintains the device's pending interrupts,
+ * via its MSIX table and PBA, so we treat these accesses
+ * like PCI config space and forward them.
+ */
+static uint64_t vfio_user_pba_read(void *opaque, hwaddr addr,
+                                   unsigned size)
+{
+    VFIOPCIDevice *vdev = opaque;
+    VFIORegion *region = &vdev->bars[vdev->msix->pba_bar].region;
+    uint64_t data;
+
+    /* server copy is what matters */
+    data = vfio_region_read(region, addr + vdev->msix->pba_offset, size);
+    return data;
+}
+
+static void vfio_user_pba_write(void *opaque, hwaddr addr,
+                                uint64_t data, unsigned size)
+{
+    /* dropped */
+}
+
+static const MemoryRegionOps vfio_user_pba_ops = {
+    .read = vfio_user_pba_read,
+    .write = vfio_user_pba_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static void vfio_user_msix_setup(VFIOPCIDevice *vdev)
+{
+    MemoryRegion *vfio_reg, *msix_reg, *pba_reg;
+
+    pba_reg = g_new0(MemoryRegion, 1);
+    vdev->msix->pba_region = pba_reg;
+
+    vfio_reg = vdev->bars[vdev->msix->pba_bar].mr;
+    msix_reg = &vdev->pdev.msix_pba_mmio;
+    memory_region_init_io(pba_reg, OBJECT(vdev), &vfio_user_pba_ops, vdev,
+                          "VFIO MSIX PBA", int128_get64(msix_reg->size));
+    memory_region_add_subregion_overlap(vfio_reg, vdev->msix->pba_offset,
+                                        pba_reg, 1);
+}
+
+static void vfio_user_msix_teardown(VFIOPCIDevice *vdev)
+{
+    MemoryRegion *mr, *sub;
+
+    mr = vdev->bars[vdev->msix->pba_bar].mr;
+    sub = vdev->msix->pba_region;
+    memory_region_del_subregion(mr, sub);
+
+    g_free(vdev->msix->pba_region);
+    vdev->msix->pba_region = NULL;
+}
+
 /*
  * Incoming request message callback.
  *
@@ -157,6 +213,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     if (!vfio_add_capabilities(vdev, errp)) {
         goto out_teardown;
     }
+    if (vdev->msix != NULL) {
+        vfio_user_msix_setup(vdev);
+    }
 
     if (!vfio_interrupt_setup(vdev, errp)) {
         goto out_teardown;
@@ -209,6 +268,10 @@ static void vfio_user_instance_finalize(Object *obj)
     g_free(vdev->emulated_config_bits);
     g_free(vdev->rom);
 
+    if (vdev->msix != NULL) {
+        vfio_user_msix_teardown(vdev);
+    }
+
     vfio_pci_put_device(vdev);
 
     if (vbasedev->proxy != NULL) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 3a2e3afaaf..593e304ee0 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -275,6 +275,8 @@ void vfio_address_space_insert(VFIOAddressSpace *space,
 void vfio_disable_irqindex(VFIODevice *vbasedev, int index);
 void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index);
 void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index);
+void vfio_unmask_single_irq(VFIODevice *vbasedev, int index, int irq);
+void vfio_mask_single_irq(VFIODevice *vbasedev, int index, int irq);
 bool vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
                             int action, int fd, Error **errp);
 void vfio_region_write(void *opaque, hwaddr addr,

From patchwork Wed Jan 8 11:50:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930725
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC5DAE77188 for ; Wed, 8 Jan 2025 11:58:59 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1)
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 20/26] vfio-user: proxy container connect/disconnect
Date: Wed, 8 Jan 2025 11:50:26 +0000
Message-Id: <20250108115032.1677686-21-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>
From: Jagannathan Raman

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/container.c           |  4 +++-
 hw/vfio/user-container.c      | 43 +++++++++++++++++++++++++++--------
 hw/vfio/user-protocol.h       |  2 ++
 hw/vfio/user.c                |  3 +++
 hw/vfio/user.h                | 10 ++++++++
 include/hw/vfio/vfio-common.h |  1 +
 6 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index e017cd4b08..5f8d949beb 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -911,7 +911,9 @@ void vfio_put_base_device(VFIODevice *vbasedev)
     QLIST_REMOVE(vbasedev, next);
     vbasedev->group = NULL;
     trace_vfio_put_base_device(vbasedev->fd);
-    close(vbasedev->fd);
+    if (vbasedev->fd != -1) {
+        close(vbasedev->fd);
+    }
 }
 
 static int vfio_device_groupid(VFIODevice *vbasedev, Error **errp)
diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c
index 201755e3d1..99839edeed 100644
--- a/hw/vfio/user-container.c
+++ b/hw/vfio/user-container.c
@@ -55,15 +55,28 @@ static int vfio_user_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
 
 static bool vfio_user_setup(VFIOContainerBase *bcontainer, Error **errp)
 {
-    error_setg_errno(errp, ENOTSUP, "Not supported");
-    return -ENOTSUP;
+    VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer,
+                                                bcontainer);
+
+    assert(container->proxy->dma_pgsizes != 0);
+    bcontainer->pgsizes = container->proxy->dma_pgsizes;
+    bcontainer->dma_max_mappings = container->proxy->max_dma;
+
+    /* No live migration support yet. */
+    bcontainer->dirty_pages_supported = false;
+    bcontainer->max_dirty_bitmap_size = container->proxy->max_bitmap;
+    bcontainer->dirty_pgsizes = container->proxy->migr_pgsize;
+
+    return true;
 }
 
-static VFIOUserContainer *vfio_create_user_container(Error **errp)
+static VFIOUserContainer *vfio_create_user_container(VFIODevice *vbasedev,
+                                                     Error **errp)
 {
     VFIOUserContainer *container;
 
     container = VFIO_IOMMU_USER(object_new(TYPE_VFIO_IOMMU_USER));
+    container->proxy = vbasedev->proxy;
     return container;
 }
 
@@ -71,16 +84,18 @@ static VFIOUserContainer *vfio_create_user_container(Error **errp)
  * Try to mirror vfio_connect_container() as much as possible.
  */
 static VFIOUserContainer *
-vfio_connect_user_container(AddressSpace *as, Error **errp)
+vfio_connect_user_container(AddressSpace *as, VFIODevice *vbasedev,
+                            Error **errp)
 {
     VFIOContainerBase *bcontainer;
     VFIOUserContainer *container;
     VFIOAddressSpace *space;
     VFIOIOMMUClass *vioc;
+    int ret;
 
     space = vfio_get_address_space(as);
 
-    container = vfio_create_user_container(errp);
+    container = vfio_create_user_container(vbasedev, errp);
     if (!container) {
         goto put_space_exit;
     }
@@ -91,11 +106,17 @@ vfio_connect_user_container(AddressSpace *as, Error **errp)
         goto free_container_exit;
     }
 
+    ret = ram_block_uncoordinated_discard_disable(true);
+    if (ret) {
+        error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
+        goto unregister_container_exit;
+    }
+
     vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
     assert(vioc->setup);
 
     if (!vioc->setup(bcontainer, errp)) {
-        goto unregister_container_exit;
+        goto enable_discards_exit;
     }
 
     vfio_address_space_insert(space, bcontainer);
@@ -120,6 +141,9 @@ listener_release_exit:
         vioc->release(bcontainer);
     }
 
+enable_discards_exit:
+    ram_block_uncoordinated_discard_disable(false);
+
 unregister_container_exit:
     vfio_cpr_unregister_container(bcontainer);
 
@@ -136,14 +160,15 @@ static void vfio_disconnect_user_container(VFIOUserContainer *container)
 {
     VFIOContainerBase *bcontainer = &container->bcontainer;
     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+    VFIOAddressSpace *space = bcontainer->space;
+
+    ram_block_uncoordinated_discard_disable(false);
 
     memory_listener_unregister(&bcontainer->listener);
     if (vioc->release) {
         vioc->release(bcontainer);
     }
 
-    VFIOAddressSpace *space = bcontainer->space;
-
     vfio_cpr_unregister_container(bcontainer);
     object_unref(container);
 
@@ -177,7 +202,7 @@ static bool vfio_user_attach_device(const char *name, VFIODevice *vbasedev,
 {
     VFIOUserContainer *container;
 
-    container = vfio_connect_user_container(as, errp);
+    container = vfio_connect_user_container(as, vbasedev, errp);
     if (container == NULL) {
         error_prepend(errp, "failed to connect proxy");
         return false;
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 48dd475ab3..87e43ddc72 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -16,6 +16,8 @@
  * region and offset info for read and write commands.
  */
 
+#include
+
 typedef struct {
     uint16_t id;
     uint16_t command;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index be2fba522d..4b1549cf8e 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -18,6 +18,9 @@
 #include "qemu/lockable.h"
 #include "hw/hw.h"
 #include "hw/vfio/vfio-common.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "exec/ram_addr.h"
 #include "qemu/sockets.h"
 #include "io/channel.h"
 #include "io/channel-socket.h"
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 1f99a976d6..9039e96069 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -11,7 +11,17 @@
  *
  */
 
+#include
+
+#include "glib-compat.h"
 #include "user-protocol.h"
+#include "qemu/osdep.h"
+#include "qemu/typedefs.h"
+#include "qemu/queue.h"
+#include "qemu/sockets.h"
+#include "qemu/thread.h"
+
+typedef struct VFIODevice VFIODevice;
 
 typedef struct {
     int send_fds;
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 593e304ee0..06cdf05c61 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -123,6 +123,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOIOMMUFDContainer, VFIO_IOMMU_IOMMUFD);
 /* MMU container sub-class for vfio-user. */
 typedef struct VFIOUserContainer {
     VFIOContainerBase bcontainer;
+    VFIOUserProxy *proxy;
 } VFIOUserContainer;
 
 OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserContainer, VFIO_IOMMU_USER);

From patchwork Wed Jan 8 11:50:27 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930708
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 21/26] vfio-user: dma map/unmap operations
Date: Wed, 8 Jan 2025 11:50:27 +0000
Message-Id: <20250108115032.1677686-22-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>
From: John Levon

Implement DMA map/unmap for the vfio-user container. Add ability to do
async operations during memory transactions.
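
As a rough illustration of the "async operations during memory transactions"
idea, here is a minimal sketch in plain C. It assumes a toy proxy that only
counts requests; ToyProxy and every toy_* function are hypothetical names made
up for this note, not QEMU APIs, and the real code sends messages over a
socket and waits on a condition variable instead of bumping counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proxy state; not QEMU's VFIOUserProxy. */
typedef struct {
    bool async_ops;  /* true between listener begin and commit */
    int in_flight;   /* requests sent without waiting for a reply */
    int completed;   /* requests whose reply has been consumed */
} ToyProxy;

/* Start of a memory transaction: switch map/unmap to deferred mode. */
static void toy_listener_begin(ToyProxy *p)
{
    p->async_ops = true;
}

/* Issue one map/unmap request; defer the wait while in a transaction. */
static int toy_dma_request(ToyProxy *p)
{
    if (p->async_ops) {
        p->in_flight++;   /* sent "nowait"; reply is collected at commit */
        return 0;
    }
    p->completed++;       /* synchronous path: reply handled immediately */
    return 0;
}

/* Collect the replies for everything that was sent without waiting. */
static void toy_wait_reqs(ToyProxy *p)
{
    p->completed += p->in_flight;
    p->in_flight = 0;
}

/* End of the transaction: leave deferred mode, then drain replies. */
static void toy_listener_commit(ToyProxy *p)
{
    p->async_ops = false;
    toy_wait_reqs(p);
}
```

The point of the split is that requests issued inside the transaction never
block, so whatever lock the caller holds stays held; the single wait at
commit time accounts for all of them.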
Originally-by: John Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John Levon --- hw/vfio/trace-events | 4 ++ hw/vfio/user-container.c | 107 ++++++++++++++++++++++++++++++++++++++- hw/vfio/user-protocol.h | 32 ++++++++++++ hw/vfio/user.c | 89 ++++++++++++++++++++++++++++---- hw/vfio/user.h | 10 ++++ 5 files changed, 230 insertions(+), 12 deletions(-) diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index eceaa0c0fd..e3a7f82550 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -192,3 +192,7 @@ vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " region %d offset 0x%"PRIx64" count %d" vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " index %d flags 0x%x count %d" vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_t flags) " index %d start %d count %d flags 0x%x" + +# user-container.c +vfio_user_dma_map(uint64_t iova, uint64_t size, uint64_t off, uint32_t flags, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" off 0x%"PRIx64" flags 0x%x async_ops %d" +vfio_user_dma_unmap(uint64_t iova, uint64_t size, uint32_t flags, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" flags 0x%x async_ops %d" diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c index 99839edeed..77ffec9561 100644 --- a/hw/vfio/user-container.c +++ b/hw/vfio/user-container.c @@ -23,18 +23,119 @@ #include "qapi/error.h" #include "pci.h" +/* + * When DMA space is the physical address space, the region add/del listeners + * will fire during memory update transactions. These depend on BQL being held, + * so do any resulting map/demap ops async while keeping BQL. 
+ */ +static void vfio_user_listener_begin(VFIOContainerBase *bcontainer) +{ + VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer, + bcontainer); + + container->proxy->async_ops = true; +} + +static void vfio_user_listener_commit(VFIOContainerBase *bcontainer) +{ + VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer, + bcontainer); + + /* wait here for any async requests sent during the transaction */ + container->proxy->async_ops = false; + vfio_user_wait_reqs(container->proxy); +} + static int vfio_user_dma_unmap(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb, int flags) { - return -ENOTSUP; + VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer, + bcontainer); + + VFIOUserDMAUnmap *msgp = g_malloc(sizeof(*msgp)); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_UNMAP, sizeof(*msgp), 0); + msgp->argsz = sizeof(struct vfio_iommu_type1_dma_unmap); + msgp->flags = flags; + msgp->iova = iova; + msgp->size = size; + trace_vfio_user_dma_unmap(msgp->iova, msgp->size, msgp->flags, + container->proxy->async_ops); + + if (container->proxy->async_ops) { + vfio_user_send_nowait(container->proxy, &msgp->hdr, NULL, 0); + return 0; + } + + vfio_user_send_wait(container->proxy, &msgp->hdr, NULL, 0); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } + + g_free(msgp); + return 0; } static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly, MemoryRegion *mrp) { - return -ENOTSUP; + VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer, + bcontainer); + + VFIOUserProxy *proxy = container->proxy; + int fd = memory_region_get_fd(mrp); + int ret; + + VFIOUserFDs *fds = NULL; + VFIOUserDMAMap *msgp = g_malloc0(sizeof(*msgp)); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_MAP, sizeof(*msgp), 0); + msgp->argsz = sizeof(struct vfio_iommu_type1_dma_map); + 
msgp->flags = VFIO_DMA_MAP_FLAG_READ; + msgp->offset = 0; + msgp->iova = iova; + msgp->size = size; + + /* + * vaddr enters as a QEMU process address; make it either a file offset + * for mapped areas or leave as 0. + */ + if (fd != -1) { + msgp->offset = qemu_ram_block_host_offset(mrp->ram_block, vaddr); + } + + if (!readonly) { + msgp->flags |= VFIO_DMA_MAP_FLAG_WRITE; + } + + trace_vfio_user_dma_map(msgp->iova, msgp->size, msgp->offset, msgp->flags, + container->proxy->async_ops); + + /* + * The async_ops case sends without blocking or dropping BQL. + * They're later waited for in vfio_send_wait_reqs. + */ + if (container->proxy->async_ops) { + /* can't use auto variable since we don't block */ + if (fd != -1) { + fds = vfio_user_getfds(1); + fds->send_fds = 1; + fds->fds[0] = fd; + } + vfio_user_send_nowait(proxy, &msgp->hdr, fds, 0); + ret = 0; + } else { + VFIOUserFDs local_fds = { 1, 0, &fd }; + + fds = fd != -1 ? &local_fds : NULL; + vfio_user_send_wait(proxy, &msgp->hdr, fds, 0); + ret = (msgp->hdr.flags & VFIO_USER_ERROR) ? 
-msgp->hdr.error_reply : 0; + g_free(msgp); + } + + return ret; } static int @@ -234,6 +335,8 @@ static void vfio_iommu_user_class_init(ObjectClass *klass, void *data) VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass); vioc->setup = vfio_user_setup; + vioc->listener_begin = vfio_user_listener_begin, + vioc->listener_commit = vfio_user_listener_commit, vioc->dma_map = vfio_user_dma_map; vioc->dma_unmap = vfio_user_dma_unmap; vioc->attach_device = vfio_user_attach_device; diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 87e43ddc72..9b569156fa 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -115,6 +115,31 @@ typedef struct { */ #define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) +/* + * VFIO_USER_DMA_MAP + * imported from struct vfio_iommu_type1_dma_map + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t offset; /* FD offset */ + uint64_t iova; + uint64_t size; +} VFIOUserDMAMap; + +/* + * VFIO_USER_DMA_UNMAP + * imported from struct vfio_iommu_type1_dma_unmap + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t iova; + uint64_t size; +} VFIOUserDMAUnmap; + /* * VFIO_USER_DEVICE_GET_INFO * imported from struct vfio_device_info @@ -178,4 +203,11 @@ typedef struct { char data[]; } VFIOUserRegionRW; +/*imported from struct vfio_bitmap */ +typedef struct { + uint64_t pgsize; + uint64_t size; + char data[]; +} VFIOUserBitmap; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 4b1549cf8e..ef644848ed 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -47,7 +47,6 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy); static int vfio_user_send_qio(VFIOUserProxy *proxy, VFIOUserMsg *msg); static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); -static VFIOUserFDs *vfio_user_getfds(int numfds); static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); static void 
vfio_user_recv(void *opaque); @@ -60,10 +59,6 @@ static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg); static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); -static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, - VFIOUserFDs *fds, int rsize); -static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, - uint32_t size, uint32_t flags); static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) { @@ -155,7 +150,7 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg) QTAILQ_INSERT_HEAD(&proxy->free, msg, next); } -static VFIOUserFDs *vfio_user_getfds(int numfds) +VFIOUserFDs *vfio_user_getfds(int numfds) { VFIOUserFDs *fds = g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); @@ -658,8 +653,38 @@ static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, } } -static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, - VFIOUserFDs *fds, int rsize) +/* + * nowait send - vfio_user_wait_reqs() can wait for it later + */ +void vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize) +{ + VFIOUserMsg *msg; + int ret; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_printf("vfio_user_send_nowait on async message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = rsize ? 
rsize : hdr->size; + msg->type = VFIO_MSG_NOWAIT; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + return; + } + + proxy->last_nowait = msg; +} + +void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize) { VFIOUserMsg *msg; int ret; @@ -696,6 +721,50 @@ static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, qemu_mutex_unlock(&proxy->lock); } +void vfio_user_wait_reqs(VFIOUserProxy *proxy) +{ + VFIOUserMsg *msg; + + /* + * Any DMA map/unmap requests sent in the middle + * of a memory region transaction were sent nowait. + * Wait for them here. + */ + qemu_mutex_lock(&proxy->lock); + if (proxy->last_nowait != NULL) { + /* + * Change type to WAIT to wait for reply + */ + msg = proxy->last_nowait; + msg->type = VFIO_MSG_WAIT; + proxy->last_nowait = NULL; + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + VFIOUserMsgQ *list; + + list = msg->pending ? 
&proxy->pending : &proxy->outgoing; + QTAILQ_REMOVE(list, msg, next); + error_printf("vfio_user_wait_reqs - timed out\n"); + break; + } + } + + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_wait_reqs - error reply on async "); + error_printf("request: command %x error %s\n", msg->hdr->command, + strerror(msg->hdr->error_reply)); + } + + /* + * Change type back to NOWAIT to free + */ + msg->type = VFIO_MSG_NOWAIT; + vfio_user_recycle(proxy, msg); + } + + qemu_mutex_unlock(&proxy->lock); +} + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -830,8 +899,8 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) g_free(proxy); } -static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, - uint32_t size, uint32_t flags) +void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags) { static uint16_t next_id; diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 9039e96069..31d2c5abd9 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -75,6 +75,7 @@ typedef struct VFIOUserProxy { QemuCond close_cv; AioContext *ctx; QEMUBH *req_bh; + bool async_ops; /* * above only changed when BQL is held @@ -106,4 +107,13 @@ void vfio_user_set_handler(VFIODevice *vbasedev, bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); int vfio_user_get_info(VFIOUserProxy *proxy, struct vfio_device_info *info); +VFIOUserFDs *vfio_user_getfds(int numfds); +void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags); +void vfio_user_wait_reqs(VFIOUserProxy *proxy); +void vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize); +void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize); + #endif /* VFIO_USER_H */ From patchwork Wed Jan 8 11:50:28 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Levon 
X-Patchwork-Id: 13930720
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 22/26] vfio-user: no-mmap DMA support
Date: Wed, 8 Jan 2025 11:50:28 +0000
Message-Id: <20250108115032.1677686-23-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: Jagannathan Raman

Force remote process to use DMA r/w messages instead of directly mapping guest memory.
Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon --- hw/vfio/user-container.c | 2 +- hw/vfio/user-pci.c | 5 +++++ hw/vfio/user.h | 1 + 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c index 77ffec9561..89bd1850ef 100644 --- a/hw/vfio/user-container.c +++ b/hw/vfio/user-container.c @@ -102,7 +102,7 @@ static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova, * vaddr enters as a QEMU process address; make it either a file offset * for mapped areas or leave as 0. */ - if (fd != -1) { + if (fd != -1 && !(container->proxy->flags & VFIO_PROXY_NO_MMAP)) { msgp->offset = qemu_ram_block_host_offset(mrp->ram_block, vaddr); } diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c index 53d230fdd3..b1125f7403 100644 --- a/hw/vfio/user-pci.c +++ b/hw/vfio/user-pci.c @@ -39,6 +39,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; + bool no_direct_dma; /* disable shared mem for DMA */ bool send_queued; /* all sends are queued */ bool no_post; /* all regions write are sync */ }; @@ -157,6 +158,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->proxy = proxy; vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); + if (udev->no_direct_dma) { + proxy->flags |= VFIO_PROXY_NO_MMAP; + } if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } @@ -281,6 +285,7 @@ static void vfio_user_instance_finalize(Object *obj) static const Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_BOOL("no-direct-dma", VFIOUserPCIDevice, no_direct_dma, false), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), }; diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 
31d2c5abd9..fe24a881f2 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -94,6 +94,7 @@ typedef struct VFIOUserProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_NO_MMAP 0x2 #define VFIO_PROXY_FORCE_QUEUED 0x4 #define VFIO_PROXY_NO_POST 0x8

From patchwork Wed Jan 8 11:50:29 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930709
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 23/26] vfio-user: dma read/write operations
Date: Wed, 8 Jan 2025 11:50:29 +0000
Message-Id: <20250108115032.1677686-24-john.levon@nutanix.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>

From: Jagannathan Raman

Messages from server to client that perform device DMA.
Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/user-pci.c      | 110 ++++++++++++++++++++++++++++++++++++++++
 hw/vfio/user-protocol.h |  13 ++++-
 hw/vfio/user.c          |  57 +++++++++++++++++++++
 hw/vfio/user.h          |   3 ++
 4 files changed, 182 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c
index b1125f7403..8cd397b75a 100644
--- a/hw/vfio/user-pci.c
+++ b/hw/vfio/user-pci.c
@@ -100,6 +100,95 @@ static void vfio_user_msix_teardown(VFIOPCIDevice *vdev)
     vdev->msix->pba_region = NULL;
 }
 
+static void vfio_user_dma_read(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    VFIOUserProxy *proxy = vdev->vbasedev.proxy;
+    VFIOUserDMARW *res;
+    MemTxResult r;
+    size_t size;
+
+    if (msg->hdr.size < sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+        return;
+    }
+    if (msg->count > proxy->max_xfer_size) {
+        vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+        return;
+    }
+
+    /* switch to our own message buffer */
+    size = msg->count + sizeof(VFIOUserDMARW);
+    res = g_malloc0(size);
+    memcpy(res, msg, sizeof(*res));
+    g_free(msg);
+
+    r = pci_dma_read(pdev, res->offset, &res->data, res->count);
+
+    switch (r) {
+    case MEMTX_OK:
+        if (res->hdr.flags & VFIO_USER_NO_REPLY) {
+            g_free(res);
+            return;
+        }
+        vfio_user_send_reply(proxy, &res->hdr, size);
+        break;
+    case MEMTX_ERROR:
+        vfio_user_send_error(proxy, &res->hdr, EFAULT);
+        break;
+    case MEMTX_DECODE_ERROR:
+        vfio_user_send_error(proxy, &res->hdr, ENODEV);
+        break;
+    case MEMTX_ACCESS_ERROR:
+        vfio_user_send_error(proxy, &res->hdr, EPERM);
+        break;
+    default:
+        error_printf("vfio_user_dma_read unknown error %d\n", r);
+        vfio_user_send_error(vdev->vbasedev.proxy, &res->hdr, EINVAL);
+    }
+}
+
+static void vfio_user_dma_write(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    VFIOUserProxy *proxy = vdev->vbasedev.proxy;
+    MemTxResult r;
+
+    if (msg->hdr.size < sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+        return;
+    }
+    /* make sure transfer count isn't larger than the message data */
+    if (msg->count > msg->hdr.size - sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+        return;
+    }
+
+    r = pci_dma_write(pdev, msg->offset, &msg->data, msg->count);
+
+    switch (r) {
+    case MEMTX_OK:
+        if ((msg->hdr.flags & VFIO_USER_NO_REPLY) == 0) {
+            vfio_user_send_reply(proxy, &msg->hdr, sizeof(msg->hdr));
+        } else {
+            g_free(msg);
+        }
+        break;
+    case MEMTX_ERROR:
+        vfio_user_send_error(proxy, &msg->hdr, EFAULT);
+        break;
+    case MEMTX_DECODE_ERROR:
+        vfio_user_send_error(proxy, &msg->hdr, ENODEV);
+        break;
+    case MEMTX_ACCESS_ERROR:
+        vfio_user_send_error(proxy, &msg->hdr, EPERM);
+        break;
+    default:
+        error_printf("vfio_user_dma_write unknown error %d\n", r);
+        vfio_user_send_error(vdev->vbasedev.proxy, &msg->hdr, EINVAL);
+    }
+}
+
 /*
  * Incoming request message callback.
  *
@@ -107,7 +196,28 @@ static void vfio_user_msix_teardown(VFIOPCIDevice *vdev)
  */
 static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg)
 {
+    VFIOPCIDevice *vdev = opaque;
+    VFIOUserHdr *hdr = msg->hdr;
+
+    /* no incoming PCI requests pass FDs */
+    if (msg->fds != NULL) {
+        vfio_user_send_error(vdev->vbasedev.proxy, hdr, EINVAL);
+        vfio_user_putfds(msg);
+        return;
+    }
 
+    switch (hdr->command) {
+    case VFIO_USER_DMA_READ:
+        vfio_user_dma_read(vdev, (VFIOUserDMARW *)hdr);
+        break;
+    case VFIO_USER_DMA_WRITE:
+        vfio_user_dma_write(vdev, (VFIOUserDMARW *)hdr);
+        break;
+    default:
+        error_printf("vfio_user_pci_process_req unknown cmd %d\n",
+                     hdr->command);
+        vfio_user_send_error(vdev->vbasedev.proxy, hdr, ENOSYS);
+    }
 }
 
 /*
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 9b569156fa..607e0f4b7f 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -203,7 +203,18 @@ typedef struct {
     char data[];
 } VFIOUserRegionRW;
 
-/*imported from struct vfio_bitmap */
+/*
+ * VFIO_USER_DMA_READ
+ * VFIO_USER_DMA_WRITE
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint64_t offset;
+    uint32_t count;
+    char data[];
+} VFIOUserDMARW;
+
+/* imported from struct vfio_bitmap */
 typedef struct {
     uint64_t pgsize;
     uint64_t size;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index ef644848ed..6f0358bd8f 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -380,6 +380,10 @@ static int vfio_user_recv_one(VFIOUserProxy *proxy)
         *msg->hdr = hdr;
         data = (char *)msg->hdr + sizeof(hdr);
     } else {
+        if (hdr.size > proxy->max_xfer_size + sizeof(VFIOUserDMARW)) {
+            error_setg(&local_err, "vfio_user_recv request larger than max");
+            goto err;
+        }
         buf = g_malloc0(hdr.size);
         memcpy(buf, &hdr, sizeof(hdr));
         data = buf + sizeof(hdr);
@@ -765,6 +769,59 @@ void vfio_user_wait_reqs(VFIOUserProxy *proxy)
     qemu_mutex_unlock(&proxy->lock);
 }
 
+/*
+ * Reply to an incoming request.
+ */
+void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size)
+{
+
+    if (size < sizeof(VFIOUserHdr)) {
+        error_printf("vfio_user_send_reply - size too small\n");
+        g_free(hdr);
+        return;
+    }
+
+    /*
+     * convert header to associated reply
+     */
+    hdr->flags = VFIO_USER_REPLY;
+    hdr->size = size;
+
+    vfio_user_send_async(proxy, hdr, NULL);
+}
+
+/*
+ * Send an error reply to an incoming request.
+ */
+void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int error)
+{
+
+    /*
+     * convert header to associated reply
+     */
+    hdr->flags = VFIO_USER_REPLY;
+    hdr->flags |= VFIO_USER_ERROR;
+    hdr->error_reply = error;
+    hdr->size = sizeof(*hdr);
+
+    vfio_user_send_async(proxy, hdr, NULL);
+}
+
+/*
+ * Close FDs erroneously received in an incoming request.
+ */
+void vfio_user_putfds(VFIOUserMsg *msg)
+{
+    VFIOUserFDs *fds = msg->fds;
+    int i;
+
+    for (i = 0; i < fds->recv_fds; i++) {
+        close(fds->fds[i]);
+    }
+    g_free(fds);
+    msg->fds = NULL;
+}
+
 static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =
     QLIST_HEAD_INITIALIZER(vfio_user_sockets);
 
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index fe24a881f2..fa6bc9a9d6 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -116,5 +116,8 @@ void vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
                            VFIOUserFDs *fds, int rsize);
 void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
                          VFIOUserFDs *fds, int rsize);
+void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size);
+void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int error);
+void vfio_user_putfds(VFIOUserMsg *msg);
 
 #endif /* VFIO_USER_H */

From patchwork Wed Jan 8 11:50:30 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930703
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 24/26] vfio-user: pci reset
Date: Wed, 8 Jan 2025 11:50:30 +0000
Message-Id: <20250108115032.1677686-25-john.levon@nutanix.com>
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>
From: Jagannathan Raman

Message to tell the server to reset the device.

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/user-pci.c | 15 +++++++++++++++
 hw/vfio/user.c     | 12 ++++++++++++
 hw/vfio/user.h     |  1 +
 3 files changed, 28 insertions(+)

diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c
index 8cd397b75a..84802556e9 100644
--- a/hw/vfio/user-pci.c
+++ b/hw/vfio/user-pci.c
@@ -393,6 +393,20 @@ static void vfio_user_instance_finalize(Object *obj)
     }
 }
 
+static void vfio_user_pci_reset(DeviceState *dev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+
+    vfio_pci_pre_reset(vdev);
+
+    if (vbasedev->reset_works) {
+        vfio_user_reset(vbasedev->proxy);
+    }
+
+    vfio_pci_post_reset(vdev);
+}
+
 static const Property vfio_user_pci_dev_properties[] = {
     DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
     DEFINE_PROP_BOOL("no-direct-dma", VFIOUserPCIDevice, no_direct_dma, false),
@@ -405,6 +419,7 @@ static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
 
+    device_class_set_legacy_reset(dc, vfio_user_pci_reset);
     device_class_set_props(dc, vfio_user_pci_dev_properties);
     dc->desc = "VFIO over socket PCI device assignment";
     pdc->realize = vfio_user_pci_realize;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 6f0358bd8f..9fba36e196 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1491,6 +1491,18 @@ static int vfio_user_region_write(VFIOUserProxy *proxy, uint8_t index,
     return ret;
 }
 
+void vfio_user_reset(VFIOUserProxy *proxy)
+{
+    VFIOUserHdr msg;
+
+    vfio_user_request_msg(&msg, VFIO_USER_DEVICE_RESET, sizeof(msg), 0);
+
+    vfio_user_send_wait(proxy, &msg, NULL, 0);
+    if (msg.flags & VFIO_USER_ERROR) {
+        error_printf("reset reply error %d\n", msg.error_reply);
+    }
+}
+
 /*
  * Socket-based io_ops
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index fa6bc9a9d6..d9aa1759df 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -119,5 +119,6 @@ void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
 void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size);
 void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int error);
 void vfio_user_putfds(VFIOUserMsg *msg);
+void vfio_user_reset(VFIOUserProxy *proxy);
 
 #endif /* VFIO_USER_H */

From patchwork Wed Jan 8 11:50:31 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930713
From: John Levon
To: qemu-devel@nongnu.org
Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com
Subject: [PATCH 25/26] vfio-user: add 'x-msg-timeout' option that specifies msg wait times
Date: Wed, 8 Jan 2025 11:50:31 +0000
Message-Id: <20250108115032.1677686-26-john.levon@nutanix.com>
In-Reply-To: <20250108115032.1677686-1-john.levon@nutanix.com>
References: <20250108115032.1677686-1-john.levon@nutanix.com>
From: Jagannathan Raman

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/user-pci.c | 4 ++++
 hw/vfio/user.c     | 7 ++++---
 hw/vfio/user.h     | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/user-pci.c b/hw/vfio/user-pci.c
index 84802556e9..148e451dbf 100644
--- a/hw/vfio/user-pci.c
+++ b/hw/vfio/user-pci.c
@@ -42,6 +42,7 @@ struct VFIOUserPCIDevice {
     bool no_direct_dma; /* disable shared mem for DMA */
     bool send_queued; /* all sends are queued */
     bool no_post; /* all regions write are sync */
+    uint32_t wait_time; /* timeout for message replies */
 };
 
 /*
@@ -277,6 +278,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     if (udev->no_post) {
         proxy->flags |= VFIO_PROXY_NO_POST;
     }
+    /* user specified or 5 sec default */
+    proxy->wait_time = udev->wait_time;
 
     if (!vfio_user_validate_version(proxy, errp)) {
         goto error;
@@ -412,6 +415,7 @@ static const Property vfio_user_pci_dev_properties[] = {
     DEFINE_PROP_BOOL("no-direct-dma", VFIOUserPCIDevice, no_direct_dma, false),
     DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false),
     DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false),
+    DEFINE_PROP_UINT32("x-msg-timeout", VFIOUserPCIDevice, wait_time, 5000),
 };
 
 static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 9fba36e196..217d0e9ea4 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -40,7 +40,6 @@
 #define VFIO_USER_MAX_REGIONS 100
 #define VFIO_USER_MAX_IRQS 50
 
-static int wait_time = 5000; /* wait up to 5 sec for busy servers */
 static IOThread *vfio_user_iothread;
 
 static void vfio_user_shutdown(VFIOUserProxy *proxy);
@@ -710,7 +709,8 @@ void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
 
     if (ret == 0) {
         while (!msg->complete) {
-            if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) {
+            if (!qemu_cond_timedwait(&msg->cv, &proxy->lock,
+                                     proxy->wait_time)) {
                 VFIOUserMsgQ *list;
 
                 list = msg->pending ? &proxy->pending : &proxy->outgoing;
@@ -743,7 +743,8 @@ void vfio_user_wait_reqs(VFIOUserProxy *proxy)
     msg->type = VFIO_MSG_WAIT;
     proxy->last_nowait = NULL;
     while (!msg->complete) {
-        if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) {
+        if (!qemu_cond_timedwait(&msg->cv, &proxy->lock,
+                                 proxy->wait_time)) {
             VFIOUserMsgQ *list;
 
             list = msg->pending ? &proxy->pending : &proxy->outgoing;
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index d9aa1759df..ff2aa005eb 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -72,6 +72,7 @@ typedef struct VFIOUserProxy {
     uint64_t max_bitmap;
     uint64_t migr_pgsize;
     int flags;
+    uint32_t wait_time;
     QemuCond close_cv;
     AioContext *ctx;
     QEMUBH *req_bh;

From patchwork Wed Jan 8 11:50:32 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Levon
X-Patchwork-Id: 13930710
b=J2s1b7s0noImallteVUZ843cSGUNhWnep8nmLawdMb1gCEKoLVFEMfIyLxQT4zKhPvNuSl7kONHtbxGjg7ifYofAgD+37hmnYmG77v03+/0/a3fn9Duyb1tu5KP1bX38/Hz2ARMime3uSi72/KHiOlrdwP5j8F042stTAGFZMPNllHXTZ+xzNgnTMAtC8ee14OTj97Z1moLszz+pLkklPg2OyX/pO1UAQEh8EdpBkAYRJbJMiMwmIvJod3688RZ8OpnqYJurA3itHr6H9reNg4ZxQ+0lBovmSjOZpzluhH+EdYoHHKT5uTGic6dod1OxZwoRjcvT9uEt4NO+eghyqg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nutanix.com; dmarc=pass action=none header.from=nutanix.com; dkim=pass header.d=nutanix.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nutanix.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ZnGB7j6yjuAFNZMfhz++dR+OD+RYXIWGIqPPWhnna8I=; b=UaV7hNqm1A4b3KNCXdZP4+m8wmP/eksleBCiSsj4cU30rB2opElabWTYsO6bcRO2fSX/anj6I+rcvSxu9GiovRd0ZVSwYJLBwwaQxPO1DU/hPj/Zd0oIpvkIbPfcEqjy1hBM3+xW2Mha0rFNv8oNdCuB7CGqYIkUycNVIBdLbLlgAGp4c21HbAj5k5SgCOewqXDt9LBVWOGmd4jPfGJDowC31UBb6N70Rd/lQrChSjLMUzJTEw0qZUsKZiYDl5CyuPlAE3bhUh/Ed+vIpSF205r19k6eDbScUPkHxgjZ7fI10JXPQqjkZagu2wNMH9u74UULJ4A4o+EzbILGQR2Wgg== Received: from CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) by PH0PR02MB7670.namprd02.prod.outlook.com (2603:10b6:510:50::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8335.10; Wed, 8 Jan 2025 11:54:04 +0000 Received: from CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51]) by CH2PR02MB6760.namprd02.prod.outlook.com ([fe80::fd77:ea65:a159:ef51%4]) with mapi id 15.20.8335.010; Wed, 8 Jan 2025 11:54:04 +0000 From: John Levon To: qemu-devel@nongnu.org Cc: elena.ufimtseva@oracle.com, alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com, thanos.makatos@nutanix.com Subject: [PATCH 26/26] vfio-user: add coalesced posted writes Date: Wed, 8 Jan 2025 11:50:32 +0000 Message-Id: <20250108115032.1677686-27-john.levon@nutanix.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: 
<20250108115032.1677686-1-john.levon@nutanix.com> References: <20250108115032.1677686-1-john.levon@nutanix.com> X-ClientProxiedBy: AS4P189CA0066.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::14) To CH2PR02MB6760.namprd02.prod.outlook.com (2603:10b6:610:7f::9) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH2PR02MB6760:EE_|PH0PR02MB7670:EE_ X-MS-Office365-Filtering-Correlation-Id: 073f55f8-3c9d-49e5-6961-08dd2fdb25fc x-proofpoint-crosstenant: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info: WpwHoAC06ex/XRxMg7SmVjjX7ykOAEYlVhvBn/ILgkTD2bBkh2GuWEdTuPtUklsBgqSEC/NAmzQUi0VmQVICgTvw6okLRB6fwtASFRseUoSNE6+wrzCijg7JX7Kv0kRdfcQjAEbqEyUjohewMlgAi1WYPVTLXQUVwu1pX61NNy5JZjTTJPgFA58txdB55DM0XyvdfdOpB9fEIY+aBMhq08mUXX+Vp8SnjQmXlUQbttsLZDgaGiEjW0mlnAE1W5BjTwdN1tlHczP92ESJuin5yJxRWuam/1hxCFM/gVTnLDspvy+Toi4NOQ/qcNFyNQfNNjs8g+QkQw8qlqLINHjSevTWbmdeFH4G2w1z18RmDzZLCA/sGoq/pn01jaO+c85UoN0wKjxrpwhrYsvw20ziAuFMFKGw15pB7Ot30DMuKrs5lZFrovNLh5RQPWNOJ9PfMUItdJKbsfo8so6C2gW9Gm641MWu7Q0IdEvLJDFEY8svU/COgl/fzo0/pYTGt2nyZXLh0bC17tQEfb6dTWfTJ4Y4tX9Igvp+XjehfcyuZ+2iGrMUUerYHZUV/lbDWpxRzIEJe3ydky1wUWsRA+Iv7fXXqPmy2ZraMfHAc5jCp62U+RDpKnLxuhotPZcnDm1MNDAUBT+jMPD999chtfwNaQE1hpdMsmW4doVZ79ilwjO8uL8B8S8uANIvj6iE+4ZJhP8YlOJWMHJ4v39Ue5xOaYdw0xXkTyW1BFDYTVNVaZ6PhpjnwGM6dZp6l78YrB8JHZuMiKw1YeQd8ivC7fNcTqnpn1Xy0aQ18F5oVwkNSto201e6wfJqLdDopBJcVd27cvf51jLDmCPfsIFo9PfAZLypdAx+F6H+abcJu/XVrfhTAWH/lBLiLiNRrfgxwUlC+BuJl/SufQgXZAy3S6KTBN96pRlQcpafYg7AEcSGGFsrfqPhCvBAqlCQDv5f6SSNKNB3rY2ncc9/WNlj2/YQrGCAiGjRfw/YJDJ9M7oFDjfpZ4k26B0hIvz4kEAsnDu/KUIeCd8u5/CUwhiPDKwi5pP16zvF6KhOjNamPqZfM+yJXNJMtpuw+JLXxitDRweL2zaKgN/ZiLHmeR0WVMsStFhM+D2Ke5GwKIUxbzBUqGJ1FVpo2T/5MLzKhf5GTCzAwQfi5cIDVuca7VAFNj4Yt7Wp//VMnf4AjL4yqWWTPuGfiPvADpccWx5cLvvLG1WDIHTd0q7iRvdDMLaXwg6lJd6mMT3HuxoQtM/9mWareg7vwbnR4bwoF6RSFQN2YPkoO8mU2LXuXJooyEmwG9xVcxtWyKsasYCus/p+PiEYXWYlvLdwgValDIiGanNamF+DhxilnWkrEgI
PJ5+cJKu9Rx5Z5VOulOgV8RauM4WoCbOPRDFTIRf2xGmFJ4yEIZsE X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH2PR02MB6760.namprd02.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: za9X+5LOxxAuBOInrv2RDjtPmEFocTcASvsik/Rd/LUYZIg89DbCWOgEI6xNcnXIabugpzL6iXoftJIUAUihFgEYQT/bwMnbQQ5jOU3yNOLl1medcTV56CX6dfpbrpNr/aV5HEI6aL76pfjunaJnXEnJX2UiCE0ZczAY3A7y6NNwOlr6otKklaE7qcreNdqXQ3ogFyMFr8n5Zmi1lX0JXNF2VVe2veNPA2aaGNAiq8DtC8GnvBweJv6YKxj3mrRhMk7ap7dO/hoVrIvsnCThzKbJb72Gh7jSqe2yU2YQFQHM6Uc7wbo8f4kmZMxM5qQbaFdD5DOjmqAPOtGYHPWwOYKaKGezOtzLi8OXjtNPpphLkK/3XhfbSxJT/LLnY7Tg19ch2nKmKnZmXcohMqeqBRTU6QQEA5kMTsGGEMWqv2F0daLVym1R2uL0iiAqYiw3G5Px2CBx9T0cA9zT4mF9cNMso79hKEGYnVt8trJ6sBSbtVnyVS71cHj+JMfpwfB+2LXBpqwIvCD3upHGg1dwXjbX8ulBi98FEpeTmbXgGChfZRqTldSkxyJcbkP2dC+LbZOu5B5Jlw2Y+GoZ6Rq48j4GDe/kF2oI6HlPQ4v0y0g2L7siXL6aKpXmDftWYluB/NTLhQDiZSO+Dd1NEWaLeDdAl5tjAeBX/Sto1TsVMCyGKyLMv6P5UDhbX+V/wojx8cKOmDNWam2Kj+SACty4ZJJsBZqcPvp0gcR9mU0mapsbnY8MVP2hAp/CsXyp3PkKZDCZVkxJsxN+g5WeEAGf20Aw7SUwAiy9vct6IynnUHJQL2HnMSgkUTKCAVOm4Zuv2OGrPSXTFbKthfbI2K9M9vh1qLTPNDK8651iGwGXfYhk8/cLo4SvFlUAidrY83h3aJtaWDc/BN+cvnaketUuAKqmApNtMhdQtzkMA+f7JpSvlS282dZxWB+zyu8N3mr1fNXZg1WVxidOTnANNrPp9zmMvaHvhoIuEG79yvxCorM48DesrPDKaXh17PiwV1zZ0UZhRsJosdq3XrkS0KZQ/uHyXNeDuGYPfc304Dj4OPDMl0ry2lWmv38fMfb8hQF04sv2a7jCcjLjktMpGyGr6mMFxhOe4xLAMKY00bu3SMuUOfzsUxELb02r1tlqIwyMqyXnxPTJbv8pxxaD7hTzD+kEJPqSSPhfboTy9Y+JZEcygraf91YtFXRTH6+0FfbWghB3wQwz2ag2Dx0gMUslGJe4F/VI6+U0eluyWjG3wdVwxoaw1bs2sJLCTvgwECKE26x+5+LTPEBWrlJJ0nv3Kcdrm0+Y9CLiMItd7EO9EKvkbkEHgkK3y9XRCf24Q2kDoR+WyH1BbIlCP3xRbLGGFoMMJoj1ZjLp14THTGH0sj0Ii9tuFRMJSyQIcSFh/YA+OWCCWqAcjyrnskFZEd0IHSSAYu4+WrZ02AmKk+ogKFMMqvVvVQvyUCFlGSz1D0utetMKDr27Ihwa6D9xe+1A908PQifPL1iRq0EfWAqurRZcOQtJh0IpDlE2Cpeqe0u6VYE0b2sfc85gtitetbDq4lIIIOi7ut7DKK7yvULHxUWSOSE5ZPW8kyhyE32IDs7R X-OriginatorOrg: nutanix.com 
X-MS-Exchange-CrossTenant-Network-Message-Id: 073f55f8-3c9d-49e5-6961-08dd2fdb25fc X-MS-Exchange-CrossTenant-AuthSource: CH2PR02MB6760.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jan 2025 11:54:03.9478 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: bb047546-786f-4de1-bd75-24e5b6f79043 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: hwKZDt2XPLVSCw6e/wEveygWbMrlRO3FzRK8tFsb1aPddeGVkk5s4VBwd3yIPjWkWg6UV0XUl1u2rxmZaOMwjg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR02MB7670 X-Proofpoint-GUID: QiSurhCSaSPczzfheMSal95rZBSVTxUe X-Authority-Analysis: v=2.4 cv=CrlFcm4D c=1 sm=1 tr=0 ts=677e675e cx=c_pps a=98TgpmV4a5moxWevO5qy4g==:117 a=wKuvFiaSGQ0qltdbU6+NXLB8nM8=:19 a=Ol13hO9ccFRV9qXi2t6ftBPywas=:19 a=xqWC_Br6kY4A:10 a=VdSt8ZQiCzkA:10 a=0034W8JfsZAA:10 a=0kUYKlekyDsA:10 a=yPCof4ZbAAAA:8 a=64Cc0HZtAAAA:8 a=oAmoKkZGSHXUR1tpXt0A:9 a=14NRyaPF5x3gF6G45PvQ:22 X-Proofpoint-ORIG-GUID: QiSurhCSaSPczzfheMSal95rZBSVTxUe X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-08_02,2025-01-08_01,2024-11-22_01 X-Proofpoint-Spam-Reason: safe Received-SPF: pass client-ip=148.163.155.12; envelope-from=john.levon@nutanix.com; helo=mx0b-002c1b01.pphosted.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.432, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
 qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: Jagannathan Raman

Add new message to send multiple writes to server.
Prevents the outgoing queue from overflowing when
a long latency operation is followed by a series
of posted writes.

Originally-by: John Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John Levon
---
 hw/vfio/trace-events    |   1 +
 hw/vfio/user-protocol.h |  21 +++++++
 hw/vfio/user.c          | 131 +++++++++++++++++++++++++++++++++++++++-
 hw/vfio/user.h          |   7 +++
 4 files changed, 158 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index e3a7f82550..fe9d797af7 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -192,6 +192,7 @@ vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index
 vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " region %d offset 0x%"PRIx64" count %d"
 vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " index %d flags 0x%x count %d"
 vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_t flags) " index %d start %d count %d flags 0x%x"
+vfio_user_wrmulti(const char *s, uint64_t wr_cnt) " %s count 0x%"PRIx64
 
 # user-container.c
 vfio_user_dma_map(uint64_t iova, uint64_t size, uint64_t off, uint32_t flags, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" off 0x%"PRIx64" flags 0x%x async_ops %d"
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 607e0f4b7f..22e3265f58 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -42,6 +42,7 @@ enum vfio_user_command {
     VFIO_USER_DMA_WRITE = 12,
     VFIO_USER_DEVICE_RESET = 13,
     VFIO_USER_DIRTY_PAGES = 14,
+    VFIO_USER_REGION_WRITE_MULTI = 15,
     VFIO_USER_MAX,
 };
 
@@ -75,6 +76,7 @@ typedef struct {
 #define VFIO_USER_CAP_PGSIZES "pgsizes"
 #define VFIO_USER_CAP_MAP_MAX "max_dma_maps"
 #define VFIO_USER_CAP_MIGR "migration"
+#define VFIO_USER_CAP_MULTI "write_multiple"
 
 /* "migration" members */
 #define VFIO_USER_CAP_PGSIZE "pgsize"
@@ -221,4 +223,23 @@ typedef struct {
     char data[];
 } VFIOUserBitmap;
 
+/*
+ * VFIO_USER_REGION_WRITE_MULTI
+ */
+#define VFIO_USER_MULTI_DATA 8
+#define VFIO_USER_MULTI_MAX 200
+
+typedef struct {
+    uint64_t offset;
+    uint32_t region;
+    uint32_t count;
+    char data[VFIO_USER_MULTI_DATA];
+} VFIOUserWROne;
+
+typedef struct {
+    VFIOUserHdr hdr;
+    uint64_t wr_cnt;
+    VFIOUserWROne wrs[VFIO_USER_MULTI_MAX];
+} VFIOUserWRMulti;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 217d0e9ea4..128a65a3e7 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -26,6 +26,7 @@
 #include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "system/iothread.h"
+#include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qstring.h"
@@ -58,6 +59,7 @@ static void vfio_user_request(void *opaque);
 static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg);
 static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
                                  VFIOUserFDs *fds);
+static void vfio_user_flush_multi(VFIOUserProxy *proxy);
 
 static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err)
 {
@@ -462,6 +464,11 @@ static void vfio_user_send(void *opaque)
         }
         qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx,
                                        vfio_user_recv, NULL, NULL, proxy);
+
+        /* queue empty - send any pending multi write msgs */
+        if (proxy->wr_multi != NULL) {
+            vfio_user_flush_multi(proxy);
+        }
     }
 }
 
@@ -482,6 +489,7 @@ static int vfio_user_send_one(VFIOUserProxy *proxy)
     }
 
     QTAILQ_REMOVE(&proxy->outgoing, msg, next);
+    proxy->num_outgoing--;
     if (msg->type == VFIO_MSG_ASYNC) {
         vfio_user_recycle(proxy, msg);
     } else {
@@ -589,11 +597,18 @@ static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg)
 {
     int ret;
 
+    /* older coalesced writes go first */
+    if (proxy->wr_multi != NULL &&
+        ((msg->hdr->flags & VFIO_USER_TYPE) == VFIO_USER_REQUEST)) {
+        vfio_user_flush_multi(proxy);
+    }
+
     /*
      * Unsent outgoing msgs - add to tail
      */
     if (!QTAILQ_EMPTY(&proxy->outgoing)) {
         QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next);
+        proxy->num_outgoing++;
         return 0;
     }
 
@@ -607,6 +622,7 @@ static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg)
     }
     if (ret == QIO_CHANNEL_ERR_BLOCK) {
         QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next);
+        proxy->num_outgoing = 1;
         qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx,
                                        vfio_user_recv, proxy->ctx,
                                        vfio_user_send, proxy);
@@ -1122,12 +1138,27 @@ static bool check_migr(VFIOUserProxy *proxy, QObject *qobj, Error **errp)
     return caps_parse(proxy, qdict, caps_migr, errp);
 }
 
+static bool check_multi(VFIOUserProxy *proxy, QObject *qobj, Error **errp)
+{
+    QBool *qb = qobject_to(QBool, qobj);
+
+    if (qb == NULL) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP_MULTI);
+        return false;
+    }
+    if (qbool_get_bool(qb)) {
+        proxy->flags |= VFIO_PROXY_USE_MULTI;
+    }
+    return true;
+}
+
 static struct cap_entry caps_cap[] = {
     { VFIO_USER_CAP_MAX_FDS, check_max_fds },
     { VFIO_USER_CAP_MAX_XFER, check_max_xfer },
     { VFIO_USER_CAP_PGSIZES, check_pgsizes },
     { VFIO_USER_CAP_MAP_MAX, check_max_dma },
     { VFIO_USER_CAP_MIGR, check_migr },
+    { VFIO_USER_CAP_MULTI, check_multi },
     { NULL }
 };
 
@@ -1186,6 +1217,7 @@ static GString *caps_json(void)
     qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER);
     qdict_put_int(capdict, VFIO_USER_CAP_PGSIZES, VFIO_USER_DEF_PGSIZE);
     qdict_put_int(capdict, VFIO_USER_CAP_MAP_MAX, VFIO_USER_DEF_MAP_MAX);
+    qdict_put_bool(capdict, VFIO_USER_CAP_MULTI, true);
 
     qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict));
 
@@ -1454,19 +1486,114 @@ static int vfio_user_region_read(VFIOUserProxy *proxy, uint8_t index,
     return msgp->count;
 }
 
+static void vfio_user_flush_multi(VFIOUserProxy *proxy)
+{
+    VFIOUserMsg *msg;
+    VFIOUserWRMulti *wm = proxy->wr_multi;
+    int ret;
+
+    proxy->wr_multi = NULL;
+
+    /* adjust size for actual # of writes */
+    wm->hdr.size -= (VFIO_USER_MULTI_MAX - wm->wr_cnt) * sizeof(VFIOUserWROne);
+
+    msg = vfio_user_getmsg(proxy, &wm->hdr, NULL);
+    msg->id = wm->hdr.id;
+    msg->rsize = 0;
+    msg->type = VFIO_MSG_ASYNC;
+    trace_vfio_user_wrmulti("flush", wm->wr_cnt);
+
+    ret = vfio_user_send_queued(proxy, msg);
+    if (ret < 0) {
+        vfio_user_recycle(proxy, msg);
+    }
+}
+
+static void vfio_user_create_multi(VFIOUserProxy *proxy)
+{
+    VFIOUserWRMulti *wm;
+
+    wm = g_malloc0(sizeof(*wm));
+    vfio_user_request_msg(&wm->hdr, VFIO_USER_REGION_WRITE_MULTI,
+                          sizeof(*wm), VFIO_USER_NO_REPLY);
+    proxy->wr_multi = wm;
+}
+
+static void vfio_user_add_multi(VFIOUserProxy *proxy, uint8_t index,
+                                off_t offset, uint32_t count, void *data)
+{
+    VFIOUserWRMulti *wm = proxy->wr_multi;
+    VFIOUserWROne *w1 = &wm->wrs[wm->wr_cnt];
+
+    w1->offset = offset;
+    w1->region = index;
+    w1->count = count;
+    memcpy(&w1->data, data, count);
+
+    wm->wr_cnt++;
+    trace_vfio_user_wrmulti("add", wm->wr_cnt);
+    if (wm->wr_cnt == VFIO_USER_MULTI_MAX ||
+        proxy->num_outgoing < VFIO_USER_OUT_LOW) {
+        vfio_user_flush_multi(proxy);
+    }
+}
+
 static int vfio_user_region_write(VFIOUserProxy *proxy, uint8_t index,
                                   off_t offset, uint32_t count, void *data,
                                   bool post)
 {
     VFIOUserRegionRW *msgp = NULL;
-    int flags = post ? VFIO_USER_NO_REPLY : 0;
+    int flags;
     int size = sizeof(*msgp) + count;
+    bool can_multi;
     int ret;
 
     if (count > proxy->max_xfer_size) {
         return -EINVAL;
     }
 
+    if (proxy->flags & VFIO_PROXY_NO_POST) {
+        post = false;
+    }
+
+    /* write eligible to be in a WRITE_MULTI msg ? */
+    can_multi = (proxy->flags & VFIO_PROXY_USE_MULTI) && post &&
+        count <= VFIO_USER_MULTI_DATA;
+
+    /*
+     * This should be a rare case, so first check without the lock,
+     * if we're wrong, vfio_send_queued() will flush any posted writes
+     * we missed here
+     */
+    if (proxy->wr_multi != NULL ||
+        (proxy->num_outgoing > VFIO_USER_OUT_HIGH && can_multi)) {
+
+        /*
+         * re-check with lock
+         *
+         * if already building a WRITE_MULTI msg,
+         * add this one if possible else flush pending before
+         * sending the current one
+         *
+         * else if outgoing queue is over the highwater,
+         * start a new WRITE_MULTI message
+         */
+        WITH_QEMU_LOCK_GUARD(&proxy->lock) {
+            if (proxy->wr_multi != NULL) {
+                if (can_multi) {
+                    vfio_user_add_multi(proxy, index, offset, count, data);
+                    return count;
+                }
+                vfio_user_flush_multi(proxy);
+            } else if (proxy->num_outgoing > VFIO_USER_OUT_HIGH && can_multi) {
+                vfio_user_create_multi(proxy);
+                vfio_user_add_multi(proxy, index, offset, count, data);
+                return count;
+            }
+        }
+    }
+
+    flags = post ? VFIO_USER_NO_REPLY : 0;
     msgp = g_malloc0(size);
     vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags);
     msgp->offset = offset;
@@ -1476,7 +1603,7 @@ static int vfio_user_region_write(VFIOUserProxy *proxy, uint8_t index,
     trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count);
 
     /* async send will free msg after it's sent */
-    if (post && !(proxy->flags & VFIO_PROXY_NO_POST)) {
+    if (post) {
         vfio_user_send_async(proxy, &msgp->hdr, NULL);
         return count;
     }
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index ff2aa005eb..dc4d41cc0e 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -90,6 +90,8 @@ typedef struct VFIOUserProxy {
     VFIOUserMsg *last_nowait;
     VFIOUserMsg *part_recv;
     size_t recv_left;
+    VFIOUserWRMulti *wr_multi;
+    int num_outgoing;
     enum proxy_state state;
 } VFIOUserProxy;
 
@@ -98,6 +100,11 @@ typedef struct VFIOUserProxy {
 #define VFIO_PROXY_NO_MMAP 0x2
 #define VFIO_PROXY_FORCE_QUEUED 0x4
 #define VFIO_PROXY_NO_POST 0x8
+#define VFIO_PROXY_USE_MULTI 0x10
+
+/* coalescing high and low water marks for VFIOProxy num_outgoing */
+#define VFIO_USER_OUT_HIGH 1024
+#define VFIO_USER_OUT_LOW 128
 
 typedef struct VFIODevice VFIODevice;