From patchwork Thu Jun 30 10:25:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901613
From: Yishai Hadas
Subject: [PATCH vfio 01/13] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device
Date: Thu, 30 Jun 2022 13:25:33 +0300
Message-ID: <20220630102545.18005-2-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Protect mlx5vf_disable_fds() on device close by calling it under the
state mutex, as is done in all other places.

This prevents a race with any other flow that calls
mlx5vf_disable_fds(), such as health/recovery upon an
MLX5_PF_NOTIFY_DISABLE_VF event.

Encapsulate this functionality in a separate function,
mlx5vf_cmd_close_migratable(), which takes the migration caps into
account and can be extended for further use on device close.

Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
Reviewed-by: Kevin Tian
Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c  | 10 ++++++++++
 drivers/vfio/pci/mlx5/cmd.h  |  1 +
 drivers/vfio/pci/mlx5/main.c |  2 +-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 9b9f33ca270a..cdd0c667dc77 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -88,6 +88,16 @@ static int mlx5fv_vf_event(struct notifier_block *nb,
 	return 0;
 }
 
+void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev)
+{
+	if (!mvdev->migrate_cap)
+		return;
+
+	mutex_lock(&mvdev->state_mutex);
+	mlx5vf_disable_fds(mvdev);
+	mlx5vf_state_mutex_unlock(mvdev);
+}
+
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev)
 {
 	if (!mvdev->migrate_cap)
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 6c3112fdd8b1..aa692d9ce656 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -64,6 +64,7 @@ int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 					  size_t *state_size);
 void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev);
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev);
+void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
 int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
 			       struct mlx5_vf_migration_file *migf);
 int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev,
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index 0558d0649ddb..d754990f0662 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -570,7 +570,7 @@ static void mlx5vf_pci_close_device(struct vfio_device *core_vdev)
 	struct mlx5vf_pci_core_device *mvdev = container_of(
 		core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);
 
-	mlx5vf_disable_fds(mvdev);
+	mlx5vf_cmd_close_migratable(mvdev);
 	vfio_pci_core_close_device(core_vdev);
 }
 

From patchwork Thu Jun 30 10:25:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901614
From: Yishai Hadas
Subject: [PATCH vfio 02/13] vfio: Split migration ops from main device ops
Date: Thu, 30 Jun 2022 13:25:34 +0300
Message-ID: <20220630102545.18005-3-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

The vfio core checks whether the driver sets a migration op (e.g.
set_state/get_state) and calls it accordingly.

However, the mlx5 driver currently sets these ops regardless of its
migration caps.

This can lead to unexpected usage or an Oops if user space calls these
ops when the driver doesn't support migration. For example, the
migration state_mutex is not initialized in that case.

The cleanest way to handle this seems to be to split the migration ops
from the main device ops, which lets the driver set them separately from
the main ops when applicable.

As part of that, validate the ops construction on registration and
include a check for VFIO_MIGRATION_STOP_COPY, since the uAPI states that
it must be set in migration_flags.

The HISI driver was changed as well to match this scheme.

This scheme may later allow adding further groups of ops (e.g. DMA log)
that can be set independently of the other options, based on driver
caps.
Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices") Reviewed-by: Kevin Tian Signed-off-by: Yishai Hadas --- .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 11 +++++-- drivers/vfio/pci/mlx5/cmd.c | 4 ++- drivers/vfio/pci/mlx5/cmd.h | 3 +- drivers/vfio/pci/mlx5/main.c | 9 ++++-- drivers/vfio/pci/vfio_pci_core.c | 7 +++++ drivers/vfio/vfio.c | 11 ++++--- include/linux/vfio.h | 30 ++++++++++++------- 7 files changed, 51 insertions(+), 24 deletions(-) diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c index 4def43f5f7b6..ea762e28c1cc 100644 --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c @@ -1185,7 +1185,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev) if (ret) return ret; - if (core_vdev->ops->migration_set_state) { + if (core_vdev->mig_ops) { ret = hisi_acc_vf_qm_init(hisi_acc_vdev); if (ret) { vfio_pci_core_disable(vdev); @@ -1208,6 +1208,11 @@ static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev) vfio_pci_core_close_device(core_vdev); } +static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = { + .migration_set_state = hisi_acc_vfio_pci_set_device_state, + .migration_get_state = hisi_acc_vfio_pci_get_device_state, +}; + static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = { .name = "hisi-acc-vfio-pci-migration", .open_device = hisi_acc_vfio_pci_open_device, @@ -1219,8 +1224,6 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = { .mmap = hisi_acc_vfio_pci_mmap, .request = vfio_pci_core_request, .match = vfio_pci_core_match, - .migration_set_state = hisi_acc_vfio_pci_set_device_state, - .migration_get_state = hisi_acc_vfio_pci_get_device_state, }; static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { @@ -1272,6 +1275,8 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device if (!ret) { 
vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev, &hisi_acc_vfio_pci_migrn_ops); + hisi_acc_vdev->core_device.vdev.mig_ops = + &hisi_acc_vfio_pci_migrn_state_ops; } else { pci_warn(pdev, "migration support failed, continue with generic interface\n"); vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev, diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index cdd0c667dc77..dd5d7bfe0a49 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -108,7 +108,8 @@ void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev) destroy_workqueue(mvdev->cb_wq); } -void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev) +void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev, + const struct vfio_migration_ops *mig_ops) { struct pci_dev *pdev = mvdev->core_device.pdev; int ret; @@ -149,6 +150,7 @@ void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev) mvdev->core_device.vdev.migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P; + mvdev->core_device.vdev.mig_ops = mig_ops; end: mlx5_vf_put_core_dev(mvdev->mdev); diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index aa692d9ce656..8208f4701a90 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -62,7 +62,8 @@ int mlx5vf_cmd_suspend_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod); int mlx5vf_cmd_resume_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod); int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev, size_t *state_size); -void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev); +void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev, + const struct vfio_migration_ops *mig_ops); void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev); void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev); int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, diff 
--git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index d754990f0662..a9b63d15c5d3 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -574,6 +574,11 @@ static void mlx5vf_pci_close_device(struct vfio_device *core_vdev) vfio_pci_core_close_device(core_vdev); } +static const struct vfio_migration_ops mlx5vf_pci_mig_ops = { + .migration_set_state = mlx5vf_pci_set_device_state, + .migration_get_state = mlx5vf_pci_get_device_state, +}; + static const struct vfio_device_ops mlx5vf_pci_ops = { .name = "mlx5-vfio-pci", .open_device = mlx5vf_pci_open_device, @@ -585,8 +590,6 @@ static const struct vfio_device_ops mlx5vf_pci_ops = { .mmap = vfio_pci_core_mmap, .request = vfio_pci_core_request, .match = vfio_pci_core_match, - .migration_set_state = mlx5vf_pci_set_device_state, - .migration_get_state = mlx5vf_pci_get_device_state, }; static int mlx5vf_pci_probe(struct pci_dev *pdev, @@ -599,7 +602,7 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev, if (!mvdev) return -ENOMEM; vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops); - mlx5vf_cmd_set_migratable(mvdev); + mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops); dev_set_drvdata(&pdev->dev, &mvdev->core_device); ret = vfio_pci_core_register_device(&mvdev->core_device); if (ret) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a0d69ddaf90d..2e003913c561 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1855,6 +1855,13 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev) if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL) return -EINVAL; + if (vdev->vdev.mig_ops) { + if (!(vdev->vdev.mig_ops->migration_get_state && + vdev->vdev.mig_ops->migration_set_state) || + !(vdev->vdev.migration_flags & VFIO_MIGRATION_STOP_COPY)) + return -EINVAL; + } + /* * Prevent binding to PFs with VFs enabled, the VFs might be in use * by the host or other users. 
We cannot capture the VFs if they diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 61e71c1154be..aac9213a783d 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1541,8 +1541,7 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device, struct file *filp = NULL; int ret; - if (!device->ops->migration_set_state || - !device->ops->migration_get_state) + if (!device->mig_ops) return -ENOTTY; ret = vfio_check_feature(flags, argsz, @@ -1558,7 +1557,8 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device, if (flags & VFIO_DEVICE_FEATURE_GET) { enum vfio_device_mig_state curr_state; - ret = device->ops->migration_get_state(device, &curr_state); + ret = device->mig_ops->migration_get_state(device, + &curr_state); if (ret) return ret; mig.device_state = curr_state; @@ -1566,7 +1566,7 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device, } /* Handle the VFIO_DEVICE_FEATURE_SET */ - filp = device->ops->migration_set_state(device, mig.device_state); + filp = device->mig_ops->migration_set_state(device, mig.device_state); if (IS_ERR(filp) || !filp) goto out_copy; @@ -1589,8 +1589,7 @@ static int vfio_ioctl_device_feature_migration(struct vfio_device *device, }; int ret; - if (!device->ops->migration_set_state || - !device->ops->migration_get_state) + if (!device->mig_ops) return -ENOTTY; ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET, diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 49580fa2073a..4d26e149db81 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -32,6 +32,11 @@ struct vfio_device_set { struct vfio_device { struct device *dev; const struct vfio_device_ops *ops; + /* + * mig_ops is a static property of the vfio_device which must be set + * prior to registering the vfio_device. 
+ */ + const struct vfio_migration_ops *mig_ops; struct vfio_group *group; struct vfio_device_set *dev_set; struct list_head dev_set_list; @@ -61,16 +66,6 @@ struct vfio_device { * match, -errno for abort (ex. match with insufficient or incorrect * additional args) * @device_feature: Optional, fill in the VFIO_DEVICE_FEATURE ioctl - * @migration_set_state: Optional callback to change the migration state for - * devices that support migration. It's mandatory for - * VFIO_DEVICE_FEATURE_MIGRATION migration support. - * The returned FD is used for data transfer according to the FSM - * definition. The driver is responsible to ensure that FD reaches end - * of stream or error whenever the migration FSM leaves a data transfer - * state or before close_device() returns. - * @migration_get_state: Optional callback to get the migration state for - * devices that support migration. It's mandatory for - * VFIO_DEVICE_FEATURE_MIGRATION migration support. */ struct vfio_device_ops { char *name; @@ -87,6 +82,21 @@ struct vfio_device_ops { int (*match)(struct vfio_device *vdev, char *buf); int (*device_feature)(struct vfio_device *device, u32 flags, void __user *arg, size_t argsz); +}; + +/** + * @migration_set_state: Optional callback to change the migration state for + * devices that support migration. It's mandatory for + * VFIO_DEVICE_FEATURE_MIGRATION migration support. + * The returned FD is used for data transfer according to the FSM + * definition. The driver is responsible to ensure that FD reaches end + * of stream or error whenever the migration FSM leaves a data transfer + * state or before close_device() returns. + * @migration_get_state: Optional callback to get the migration state for + * devices that support migration. It's mandatory for + * VFIO_DEVICE_FEATURE_MIGRATION migration support. 
+ */ +struct vfio_migration_ops { struct file *(*migration_set_state)( struct vfio_device *device, enum vfio_device_mig_state new_state); From patchwork Thu Jun 30 10:25:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yishai Hadas X-Patchwork-Id: 12901617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07272C43334 for ; Thu, 30 Jun 2022 10:27:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234958AbiF3K1Q (ORCPT ); Thu, 30 Jun 2022 06:27:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233879AbiF3K0z (ORCPT ); Thu, 30 Jun 2022 06:26:55 -0400 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2089.outbound.protection.outlook.com [40.107.220.89]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 87215647A; Thu, 30 Jun 2022 03:26:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gF4HY/Gvty08xLqOPxt3Ify2new5ECBjfX5po1yKXM/uTKTNHcsR8C0ZxyHGYFKT3eGW8RGpwNAUMdHcFJuCp14jKQNmxPbwRQR5BO8eOEMkYwL1MNX86uLdx7YMYLN/Bw7JsBz2H6NtpUH1f3LqMj7WaiSIzTX/M+RGRuPOft9pGPfD+ETQeC5WdMrU55Y3oFatz+/lcMXetKW2Y12KAoen/8s3ZmVl4Sm+7weYYNKfPO0LIKuwPSJ9ruCeFcdHIM8+HGkCyfYJtVP0RH8FvPd1knPkTjSV3FbIQnkMm+ILtshA6XhsjaqjVY0P3/7ht/yuF7qMdZqlJqVgwb2inA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=1IF1CK6d5+J7F2DHDnfClHruppP4hP5qBD/+lM40s54=; 
From: Yishai Hadas
Subject: [PATCH vfio 03/13] net/mlx5: Introduce ifc bits for page tracker
Date: Thu, 30 Jun 2022 13:25:35 +0300
Message-ID: <20220630102545.18005-4-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
X-Mailing-List: kvm@vger.kernel.org

Introduce the ifc definitions needed to use the page tracker.

A page tracker is a dirty page tracking object used by the device to
report the tracking log.

Signed-off-by: Yishai Hadas
---
 include/linux/mlx5/mlx5_ifc.h | 79 ++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index fd7d083a34d3..b2d56fea6a09 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -89,6 +89,7 @@ enum {
 	MLX5_OBJ_TYPE_VIRTIO_NET_Q = 0x000d,
 	MLX5_OBJ_TYPE_VIRTIO_Q_COUNTERS = 0x001c,
 	MLX5_OBJ_TYPE_MATCH_DEFINER = 0x0018,
+	MLX5_OBJ_TYPE_PAGE_TRACK = 0x46,
 	MLX5_OBJ_TYPE_MKEY = 0xff01,
 	MLX5_OBJ_TYPE_QP = 0xff02,
 	MLX5_OBJ_TYPE_PSV = 0xff03,
@@ -1711,7 +1712,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         max_geneve_tlv_options[0x8];
 	u8         reserved_at_568[0x3];
 	u8         max_geneve_tlv_option_data_len[0x5];
-	u8         reserved_at_570[0x10];
+	u8         reserved_at_570[0x9];
+	u8         adv_virtualization[0x1];
+	u8         reserved_at_57a[0x6];
 
 	u8         reserved_at_580[0xb];
 	u8         log_max_dci_stream_channels[0x5];
@@ -11668,4 +11671,78 @@ struct mlx5_ifc_load_vhca_state_out_bits {
 	u8         reserved_at_40[0x40];
 };
 
+struct mlx5_ifc_adv_virtualization_cap_bits {
+	u8         reserved_at_0[0x3];
+	u8         pg_track_log_max_num[0x5];
+	u8         pg_track_max_num_range[0x8];
+	u8         pg_track_log_min_addr_space[0x8];
+	u8         pg_track_log_max_addr_space[0x8];
+
+	u8         reserved_at_20[0x3];
+	u8         pg_track_log_min_msg_size[0x5];
+	u8         pg_track_log_max_msg_size[0x8];
+	u8         pg_track_log_min_page_size[0x8];
+	u8         pg_track_log_max_page_size[0x8];
+
+	u8         reserved_at_40[0x7c0];
+};
+
+struct mlx5_ifc_page_track_report_entry_bits {
+	u8         dirty_address_high[0x20];
+
+	u8         dirty_address_low[0x20];
+};
+
+enum {
+	MLX5_PAGE_TRACK_STATE_TRACKING,
+	MLX5_PAGE_TRACK_STATE_REPORTING,
+	MLX5_PAGE_TRACK_STATE_ERROR,
+};
+
+struct mlx5_ifc_page_track_range_bits {
+	u8         start_address[0x40];
+
+	u8         length[0x40];
+};
+
+struct mlx5_ifc_page_track_bits {
+	u8         modify_field_select[0x40];
+
+	u8         reserved_at_40[0x10];
+	u8         vhca_id[0x10];
+
+	u8         reserved_at_60[0x20];
+
+	u8         state[0x4];
+	u8         track_type[0x4];
+	u8         log_addr_space_size[0x8];
+	u8         reserved_at_90[0x3];
+	u8         log_page_size[0x5];
+	u8         reserved_at_98[0x3];
+	u8         log_msg_size[0x5];
+
+	u8         reserved_at_a0[0x8];
+	u8         reporting_qpn[0x18];
+
+	u8         reserved_at_c0[0x18];
+	u8         num_ranges[0x8];
+
+	u8         reserved_at_e0[0x20];
+
+	u8         range_start_address[0x40];
+
+	u8         length[0x40];
+
+	struct mlx5_ifc_page_track_range_bits track_range[0];
+};
+
+struct mlx5_ifc_create_page_track_obj_in_bits {
+	struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+	struct mlx5_ifc_page_track_bits obj_context;
+};
+
+struct mlx5_ifc_modify_page_track_obj_in_bits {
+	struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+	struct mlx5_ifc_page_track_bits obj_context;
+};
 #endif /* MLX5_IFC_H */

From patchwork Thu Jun 30 10:25:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901615
From: Yishai Hadas
Subject: [PATCH vfio 04/13] net/mlx5: Query ADV_VIRTUALIZATION capabilities
Date: Thu, 30 Jun 2022 13:25:36 +0300
Message-ID: <20220630102545.18005-5-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>

Query the ADV_VIRTUALIZATION capabilities, which provide information
about advanced virtualization features.

The current capabilities describe the page tracker object, which is used
for tracking the pages that are dirtied by the device.

Signed-off-by: Yishai Hadas
---
 drivers/net/ethernet/mellanox/mlx5/core/fw.c   | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 1 +
 include/linux/mlx5/device.h                    | 9 +++++++++
 3 files changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index cfb8bedba512..45b9891b7947 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -273,6 +273,12 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
 			return err;
 	}
 
+	if (MLX5_CAP_GEN(dev, adv_virtualization)) {
+		err = mlx5_core_get_caps(dev, MLX5_CAP_ADV_VIRTUALIZATION);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c9b4e50a593e..5ecaaee2624c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1432,6 +1432,7 @@ static const int types[] = {
 	MLX5_CAP_IPSEC,
 	MLX5_CAP_PORT_SELECTION,
 	MLX5_CAP_DEV_SHAMPO,
+	MLX5_CAP_ADV_VIRTUALIZATION,
 };
 
 static void mlx5_hca_caps_free(struct mlx5_core_dev *dev)
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 604b85dd770a..96ea0c1796f8 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1204,6 +1204,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_DEV_SHAMPO = 0x1d,
 	MLX5_CAP_GENERAL_2 = 0x20,
 	MLX5_CAP_PORT_SELECTION = 0x25,
+	MLX5_CAP_ADV_VIRTUALIZATION = 0x26,
 	/* NUM OF CAP Types */
 	MLX5_CAP_NUM
 };
@@ -1369,6 +1370,14 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET(port_selection_cap, \
 		 mdev->caps.hca[MLX5_CAP_PORT_SELECTION]->max, cap)
 
+#define MLX5_CAP_ADV_VIRTUALIZATION(mdev, cap) \
+	MLX5_GET(adv_virtualization_cap, \
+		 mdev->caps.hca[MLX5_CAP_ADV_VIRTUALIZATION]->cur, cap)
+
+#define MLX5_CAP_ADV_VIRTUALIZATION_MAX(mdev, cap) \
+	MLX5_GET(adv_virtualization_cap, \
+		 mdev->caps.hca[MLX5_CAP_ADV_VIRTUALIZATION]->max, cap)
+
 #define MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) \
 	MLX5_CAP_PORT_SELECTION(mdev, flow_table_properties_port_selection.cap)
 

From patchwork Thu Jun 30 10:25:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901618
From: Yishai Hadas
Subject: [PATCH vfio 05/13] vfio: Introduce DMA logging uAPIs
Date: Thu, 30 Jun 2022 13:25:37 +0300
Message-ID: <20220630102545.18005-6-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
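
A minimal userspace sketch of how the report bitmap from
VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT (added by patch 05/13, "vfio:
Introduce DMA logging uAPIs") might be sized and read back. It is a reading
aid under stated assumptions, not code from the series:
struct dma_logging_report_args is a hypothetical local mirror of
struct vfio_device_feature_dma_logging_report, the helper names are invented
for illustration, and the lookup assumes a packed layout in which page n of
the queried range is bit (n % 64) of word (n / 64). That layout is one
interpretation of the mapping formula in the vfio.h comment, which the patch
does not spell out in these terms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical local mirror of the report arguments described in the patch;
 * real code would include <linux/vfio.h> instead.
 */
struct dma_logging_report_args {
	uint64_t iova;
	uint64_t length;
	uint64_t page_size;
	uint64_t bitmap;	/* userspace pointer to an array of u64 */
};

/* Number of u64 words needed for one bit per page_size unit of the range. */
static uint64_t report_bitmap_words(uint64_t length, uint64_t page_size)
{
	uint64_t pages;

	/* The uAPI comment requires page_size to be a power of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	pages = (length + page_size - 1) / page_size;
	return (pages + 63) / 64;
}

/*
 * Allocate a zeroed bitmap and fill in the report arguments.
 * The bitmap is zeroed because the uAPI comment says bits are OR-ed in.
 * Returns NULL if the allocation fails; the caller frees the bitmap.
 */
static uint64_t *report_prepare(struct dma_logging_report_args *args,
				uint64_t iova, uint64_t length,
				uint64_t page_size)
{
	uint64_t *bitmap;

	bitmap = calloc(report_bitmap_words(length, page_size),
			sizeof(*bitmap));
	if (!bitmap)
		return NULL;

	args->iova = iova;
	args->length = length;
	args->page_size = page_size;
	args->bitmap = (uint64_t)(uintptr_t)bitmap;
	return bitmap;
}

/* Assumed layout: page n of the range is bit (n % 64) of word (n / 64). */
static int report_page_is_dirty(const uint64_t *bitmap, uint64_t iova,
				uint64_t page_size, uint64_t addr)
{
	uint64_t page = (addr - iova) / page_size;

	return (int)((bitmap[page / 64] >> (page % 64)) & 1);
}
```

A caller would pass the filled arguments to the FEATURE ioctl with the GET
option and then test individual addresses with report_page_is_dirty(). The
ioctl call itself is left out because it depends on the final form of the
uAPI.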

DMA logging allows a device to internally record what DMAs the device is
initiating and report them back to userspace. It is part of the VFIO
migration infrastructure that allows implementing dirty page tracking
during the pre-copy phase of live migration. Only DMA WRITEs are logged,
and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.

This patch introduces the DMA logging uAPIs. They use the FEATURE ioctl
with its GET/SET/PROBE options as follows:

- A PROBE option to detect whether the device supports DMA logging.
- A SET option to start device DMA logging on the given IOVA ranges.
- A SET option to stop device DMA logging that was previously started.
- A GET option to read back and clear the device DMA log.

Further details for each option are documented in vfio.h.

Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe
---
 include/uapi/linux/vfio.h | 79 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 733a1cddde30..81475c3e7c92 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -986,6 +986,85 @@ enum vfio_device_mig_state {
 	VFIO_DEVICE_STATE_RUNNING_P2P = 5,
 };
 
+/*
+ * Upon VFIO_DEVICE_FEATURE_SET start device DMA logging.
+ * VFIO_DEVICE_FEATURE_PROBE can be used to detect if the device supports
+ * DMA logging.
+ *
+ * DMA logging allows a device to internally record what DMAs the device is
+ * initiating and report them back to userspace. It is part of the VFIO
+ * migration infrastructure that allows implementing dirty page tracking
+ * during the pre copy phase of live migration. Only DMA WRITEs are logged,
+ * and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
+ *
+ * When DMA logging is started a range of IOVAs to monitor is provided and the
+ * device can optimize its logging to cover only the IOVA range given. Each
+ * DMA that the device initiates inside the range will be logged by the device
+ * for later retrieval.
+ *
+ * page_size is an input that hints what tracking granularity the device
+ * should try to achieve. If the device cannot do the hinted page size then it
+ * should pick the next closest page size it supports. On output the device
+ * will return the page size it selected.
+ *
+ * ranges is a pointer to an array of
+ * struct vfio_device_feature_dma_logging_range.
+ */
+struct vfio_device_feature_dma_logging_control {
+	__aligned_u64 page_size;
+	__u32 num_ranges;
+	__u32 __reserved;
+	__aligned_u64 ranges;
+};
+
+struct vfio_device_feature_dma_logging_range {
+	__aligned_u64 iova;
+	__aligned_u64 length;
+};
+
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_START 3
+
+/*
+ * Upon VFIO_DEVICE_FEATURE_SET stop device DMA logging that was started
+ * by VFIO_DEVICE_FEATURE_DMA_LOGGING_START
+ */
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP 4
+
+/*
+ * Upon VFIO_DEVICE_FEATURE_GET read back and clear the device DMA log
+ *
+ * Query the device's DMA log for written pages within the given IOVA range.
+ * During querying the log is cleared for the IOVA range.
+ *
+ * bitmap is a pointer to an array of u64s that will hold the output bitmap
+ * with 1 bit reporting a page_size unit of IOVA. The mapping of IOVA to bits
+ * is given by:
+ *  bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64))
+ *
+ * The input page_size can be any power of two value and does not have to
+ * match the value given to VFIO_DEVICE_FEATURE_DMA_LOGGING_START. The driver
+ * will format its internal logging to match the reporting page size, possibly
+ * by replicating bits if the internal page size is lower than requested.
+ *
+ * Bits will be updated in bitmap using atomic or to allow userspace to
+ * combine bitmaps from multiple trackers together. Therefore userspace must
+ * zero the bitmap before doing any reports.
+ *
+ * If any error is returned userspace should assume that the dirty log is
+ * corrupted and restart.
+ *
+ * If DMA logging is not enabled, an error will be returned.
+ *
+ */
+struct vfio_device_feature_dma_logging_report {
+	__aligned_u64 iova;
+	__aligned_u64 length;
+	__aligned_u64 page_size;
+	__aligned_u64 bitmap;
+};
+
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT 5
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**

From patchwork Thu Jun 30 10:25:38 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901616
From: Yishai Hadas
Subject: [PATCH vfio 06/13] vfio: Move vfio.c to vfio_main.c
Date: Thu, 30 Jun 2022 13:25:38 +0300
Message-ID: <20220630102545.18005-7-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
945e707b-6129-42ed-3886-08da5a830ac8 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[12.22.5.235];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT028.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM5PR1201MB0044 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jason Gunthorpe If a source file has the same name as a module then kbuild only supports a single source file in the module. Rename vfio.c to vfio_main.c so that we can have more that one .c file in vfio.ko. Signed-off-by: Jason Gunthorpe Signed-off-by: Yishai Hadas --- drivers/vfio/Makefile | 2 ++ drivers/vfio/{vfio.c => vfio_main.c} | 0 2 files changed, 2 insertions(+) rename drivers/vfio/{vfio.c => vfio_main.c} (100%) diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index fee73f3d9480..1a32357592e3 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -1,6 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 vfio_virqfd-y := virqfd.o +vfio-y += vfio_main.o + obj-$(CONFIG_VFIO) += vfio.o obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio_main.c similarity index 100% rename from drivers/vfio/vfio.c rename to drivers/vfio/vfio_main.c From patchwork Thu Jun 30 10:25:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yishai Hadas X-Patchwork-Id: 12901621 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D13A1C433EF for ; Thu, 30 Jun 2022 
10:27:44 +0000 (UTC)
From: Yishai Hadas
Subject: [PATCH vfio 07/13] vfio: Add an IOVA bitmap support
Date: Thu, 30 Jun 2022 13:25:39 +0300
Message-ID: <20220630102545.18005-8-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

From: Joao Martins

The new facility adds a set of wrappers that abstract how an IOVA range is
represented in a bitmap whose granularity is a given page_size. It handles
the work of translating user pointers into the kernel addresses backing
that user memory, and then performs the bitmap operations that set the
relevant bits.
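
As a rough user-space illustration of that representation (a sketch only,
not the kernel implementation): assuming the bitmap is an array of 64-bit
words and bit N tracks the N-th page after a base IOVA, a page can be
marked and tested as shown here. The sketch_* names are hypothetical and
do not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical helper: index of the page containing @iova, counted from @base. */
static inline uint64_t sketch_page_index(uint64_t base, uint64_t iova,
					 unsigned int pgshift)
{
	return (iova - base) >> pgshift;
}

/* Mark the page containing @iova as dirty (assumes 64-bit bitmap words). */
static inline void sketch_set_dirty(uint64_t *data, uint64_t base,
				    uint64_t iova, unsigned int pgshift)
{
	uint64_t idx = sketch_page_index(base, iova, pgshift);

	data[idx / 64] |= UINT64_C(1) << (idx % 64);
}

/* Return 1 if the page containing @iova is marked dirty, 0 otherwise. */
static inline int sketch_test_dirty(const uint64_t *data, uint64_t base,
				    uint64_t iova, unsigned int pgshift)
{
	uint64_t idx = sketch_page_index(base, iova, pgshift);

	return (int)((data[idx / 64] >> (idx % 64)) & 1);
}
```

With 4K pages (pgshift = 12), the 66th page after the base lands in the
second word of the array, at bit position 1.
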

The formula for the bitmap is:

   data[(iova / page_size) / 64] & (1ULL << ((iova / page_size) % 64))

where 64 is the number of bits in an unsigned long (depending on the arch).

An example usage of these helpers for a given @iova, @page_size, @length
and __user @data:

	iova_bitmap_init(&iter.dirty, iova, __ffs(page_size));
	ret = iova_bitmap_iter_init(&iter, iova, length, data);
	if (ret)
		return -ENOMEM;

	for (; iova_bitmap_iter_done(&iter);
	     iova_bitmap_iter_advance(&iter)) {
		ret = iova_bitmap_iter_get(&iter);
		if (ret)
			break;
		if (dirty)
			iova_bitmap_set(&iter.dirty,
					iova_bitmap_iova(&iter),
					iova_bitmap_length(&iter));

		iova_bitmap_iter_put(&iter);

		if (ret)
			break;
	}

	iova_bitmap_iter_free(&iter);

The facility is intended to be used for user bitmaps representing dirtied
IOVAs by IOMMU (via IOMMUFD) and PCI Devices (via vfio-pci).

Signed-off-by: Joao Martins
Signed-off-by: Yishai Hadas
---
 drivers/vfio/Makefile       |   6 +-
 drivers/vfio/iova_bitmap.c  | 164 ++++++++++++++++++++++++++++++++++++
 include/linux/iova_bitmap.h |  46 ++++++++++
 3 files changed, 214 insertions(+), 2 deletions(-)
 create mode 100644 drivers/vfio/iova_bitmap.c
 create mode 100644 include/linux/iova_bitmap.h

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 1a32357592e3..1d6cad32d366 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,9 +1,11 @@
 # SPDX-License-Identifier: GPL-2.0
 vfio_virqfd-y := virqfd.o
 
-vfio-y += vfio_main.o
-
 obj-$(CONFIG_VFIO) += vfio.o
+
+vfio-y := vfio_main.o \
+          iova_bitmap.o \
+
 obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
diff --git a/drivers/vfio/iova_bitmap.c b/drivers/vfio/iova_bitmap.c
new file mode 100644
index 000000000000..58abf485eba8
--- /dev/null
+++ b/drivers/vfio/iova_bitmap.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022, Oracle and/or its affiliates.
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include
+
+static unsigned long iova_bitmap_array_length(unsigned long iova_length,
+					      unsigned long page_shift)
+{
+	return DIV_ROUND_UP(iova_length, BITS_PER_TYPE(u64) * (1 << page_shift));
+}
+
+static unsigned long iova_bitmap_index_to_length(struct iova_bitmap_iter *iter,
+						 unsigned long index)
+{
+	return ((index * sizeof(u64) * BITS_PER_BYTE) << iter->dirty.pgshift);
+}
+
+static unsigned long iova_bitmap_iter_left(struct iova_bitmap_iter *iter)
+{
+	unsigned long left = iter->count - iter->offset;
+
+	left = min_t(unsigned long, left,
+		     (iter->dirty.npages << PAGE_SHIFT) / sizeof(u64));
+
+	return left;
+}
+
+/*
+ * Input argument of number of bits to bitmap_set() is unsigned integer, which
+ * further casts to signed integer for unaligned multi-bit operation,
+ * __bitmap_set().
+ * Then maximum bitmap size supported is 2^31 bits divided by 2^3 bits/byte,
+ * that is 2^28 (256 MB) which maps to 2^31 * 2^12 = 2^43 (8TB) on 4K page
+ * system.
+ */
+int iova_bitmap_iter_init(struct iova_bitmap_iter *iter,
+			  unsigned long iova, unsigned long length,
+			  unsigned long __user *data)
+{
+	struct iova_bitmap *dirty = &iter->dirty;
+
+	iter->data = data;
+	iter->offset = 0;
+	iter->count = iova_bitmap_array_length(length, dirty->pgshift);
+	iter->iova = iova;
+	iter->length = length;
+	dirty->pages = (struct page **)__get_free_page(GFP_KERNEL);
+
+	return !dirty->pages ? -ENOMEM : 0;
+}
+
+void iova_bitmap_iter_free(struct iova_bitmap_iter *iter)
+{
+	struct iova_bitmap *dirty = &iter->dirty;
+
+	if (dirty->pages) {
+		free_page((unsigned long)dirty->pages);
+		dirty->pages = NULL;
+	}
+}
+
+bool iova_bitmap_iter_done(struct iova_bitmap_iter *iter)
+{
+	return (iter->count - iter->offset) > 0;
+}
+
+static unsigned long iova_bitmap_iter_length(struct iova_bitmap_iter *iter)
+{
+	return iova_bitmap_index_to_length(iter, iter->count);
+}
+
+unsigned long iova_bitmap_length(struct iova_bitmap_iter *iter)
+{
+	unsigned long left = iova_bitmap_iter_left(iter);
+	unsigned long iova = iova_bitmap_iova(iter);
+
+	left = iova_bitmap_index_to_length(iter, left);
+	if (iova_bitmap_iter_length(iter) > iter->length &&
+	    iova + left > iter->iova + iter->length - 1)
+		left -= ((iova + left) - (iova + iter->length));
+	return left;
+}
+
+unsigned long iova_bitmap_iova(struct iova_bitmap_iter *iter)
+{
+	unsigned long skip = iter->offset;
+
+	return iter->iova + iova_bitmap_index_to_length(iter, skip);
+}
+
+void iova_bitmap_iter_advance(struct iova_bitmap_iter *iter)
+{
+	unsigned long length = iova_bitmap_length(iter);
+
+	iter->offset += iova_bitmap_array_length(length, iter->dirty.pgshift);
+}
+
+void iova_bitmap_iter_put(struct iova_bitmap_iter *iter)
+{
+	struct iova_bitmap *dirty = &iter->dirty;
+
+	if (dirty->npages)
+		unpin_user_pages(dirty->pages, dirty->npages);
+}
+
+int iova_bitmap_iter_get(struct iova_bitmap_iter *iter)
+{
+	struct iova_bitmap *dirty = &iter->dirty;
+	unsigned long npages;
+	void __user *addr;
+	long ret;
+
+	npages = DIV_ROUND_UP((iter->count - iter->offset) * sizeof(u64),
+			      PAGE_SIZE);
+	npages = min(npages, PAGE_SIZE / sizeof(struct page *));
+	addr = iter->data + (iter->offset * sizeof(u64));
+	ret = pin_user_pages_fast((unsigned long)addr, npages,
+				  FOLL_WRITE, dirty->pages);
+	if (ret <= 0)
+		return ret;
+
+	dirty->npages = (unsigned long)ret;
+	dirty->iova = iova_bitmap_iova(iter);
+	dirty->start_offset = offset_in_page(addr);
+	return 0;
+}
+
+void iova_bitmap_init(struct iova_bitmap *bitmap,
+		      unsigned long base, unsigned long pgshift)
+{
+	memset(bitmap, 0, sizeof(*bitmap));
+	bitmap->iova = base;
+	bitmap->pgshift = pgshift;
+}
+
+unsigned int iova_bitmap_set(struct iova_bitmap *dirty,
+			     unsigned long iova,
+			     unsigned long length)
+{
+	unsigned long nbits, offset, start_offset, idx, size, *kaddr;
+
+	nbits = max(1UL, length >> dirty->pgshift);
+	offset = (iova - dirty->iova) >> dirty->pgshift;
+	idx = offset / (PAGE_SIZE * BITS_PER_BYTE);
+	offset = offset % (PAGE_SIZE * BITS_PER_BYTE);
+	start_offset = dirty->start_offset;
+
+	while (nbits > 0) {
+		kaddr = kmap_local_page(dirty->pages[idx]) + start_offset;
+		size = min(PAGE_SIZE * BITS_PER_BYTE - offset, nbits);
+		bitmap_set(kaddr, offset, size);
+		kunmap_local(kaddr - start_offset);
+		start_offset = offset = 0;
+		nbits -= size;
+		idx++;
+	}
+
+	return nbits;
+}
+EXPORT_SYMBOL_GPL(iova_bitmap_set);
+
diff --git a/include/linux/iova_bitmap.h b/include/linux/iova_bitmap.h
new file mode 100644
index 000000000000..ff19ad47a126
--- /dev/null
+++ b/include/linux/iova_bitmap.h
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022, Oracle and/or its affiliates.
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef _IOVA_BITMAP_H_
+#define _IOVA_BITMAP_H_
+
+#include
+#include
+#include
+
+struct iova_bitmap {
+	unsigned long iova;
+	unsigned long pgshift;
+	unsigned long start_offset;
+	unsigned long npages;
+	struct page **pages;
+};
+
+struct iova_bitmap_iter {
+	struct iova_bitmap dirty;
+	void __user *data;
+	size_t offset;
+	size_t count;
+	unsigned long iova;
+	unsigned long length;
+};
+
+int iova_bitmap_iter_init(struct iova_bitmap_iter *iter, unsigned long iova,
+			  unsigned long length, unsigned long __user *data);
+void iova_bitmap_iter_free(struct iova_bitmap_iter *iter);
+bool iova_bitmap_iter_done(struct iova_bitmap_iter *iter);
+unsigned long iova_bitmap_length(struct iova_bitmap_iter *iter);
+unsigned long iova_bitmap_iova(struct iova_bitmap_iter *iter);
+void iova_bitmap_iter_advance(struct iova_bitmap_iter *iter);
+int iova_bitmap_iter_get(struct iova_bitmap_iter *iter);
+void iova_bitmap_iter_put(struct iova_bitmap_iter *iter);
+void iova_bitmap_init(struct iova_bitmap *bitmap,
+		      unsigned long base, unsigned long pgshift);
+unsigned int iova_bitmap_set(struct iova_bitmap *dirty,
+			     unsigned long iova,
+			     unsigned long length);
+
+#endif

From patchwork Thu Jun 30 10:25:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901620
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40556 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234937AbiF3K07 (ORCPT ); Thu, 30 Jun
2022 06:26:59 -0400
From: Yishai Hadas
Subject: [PATCH vfio 08/13] vfio: Introduce the DMA logging feature support
Date: Thu, 30 Jun 2022 13:25:40 +0300
Message-ID: <20220630102545.18005-9-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Introduce the DMA logging feature support in the vfio core layer.

It includes the processing of the device start/stop/report DMA logging
UAPIs and calling the relevant driver 'op' to do the work.

Specifically:
Upon start, the core translates the given input ranges into an interval
tree, checks for unexpected overlapping or non-aligned ranges, and then
passes the translated input to the driver to start tracking the given
ranges.

Upon report, the core translates the given input user space bitmap and
page size into an IOVA kernel bitmap iterator. It then iterates over it
and calls the driver to set the corresponding bits for the dirtied pages
in a specific IOVA range.
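
The report flow above can be pictured with a small user-space sketch: a
requested IOVA range is walked in bounded windows, and each window is
handed to a callback standing in for the driver's read-and-clear op. The
window size, the sketch_* names and the callback type are assumptions
made for illustration only; the in-kernel iterator derives its window
from the number of bitmap pages it has pinned.

```c
#include <stdint.h>

/* Hypothetical callback type standing in for a driver read-and-clear op. */
typedef int (*sketch_report_fn)(uint64_t iova, uint64_t length, void *opaque);

/*
 * Walk [iova, iova + length) in windows of at most max_pages pages and
 * pass each window to fn. Stops at the first non-zero return value.
 */
static int sketch_report_walk(uint64_t iova, uint64_t length,
			      unsigned int pgshift, uint64_t max_pages,
			      sketch_report_fn fn, void *opaque)
{
	uint64_t window = max_pages << pgshift;
	uint64_t done = 0;

	if (!window)
		return -1;	/* a zero-sized window would never make progress */

	while (done < length) {
		uint64_t chunk = length - done;
		int ret;

		if (chunk > window)
			chunk = window;
		ret = fn(iova + done, chunk, opaque);
		if (ret)
			return ret;
		done += chunk;
	}
	return 0;
}

/* Example callback: counts how many windows and bytes were reported. */
struct sketch_stats {
	unsigned int calls;
	uint64_t bytes;
};

static int sketch_count(uint64_t iova, uint64_t length, void *opaque)
{
	struct sketch_stats *stats = opaque;

	(void)iova;
	stats->calls++;
	stats->bytes += length;
	return 0;
}
```

Walking ten 4K pages with a four-page window, for instance, invokes the
callback three times, with the final call covering the two-page
remainder.
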
Upon stop, the driver is called to stop the previous started tracking. The next patches from the series will introduce the mlx5 driver implementation for the logging ops. Signed-off-by: Yishai Hadas Reported-by: kernel test robot Reported-by: kernel test robot --- drivers/vfio/pci/vfio_pci_core.c | 5 + drivers/vfio/vfio_main.c | 162 +++++++++++++++++++++++++++++++ include/linux/vfio.h | 21 +++- 3 files changed, 186 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 2e003913c561..8dcd212971fe 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1862,6 +1862,11 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev) return -EINVAL; } + if (vdev->vdev.log_ops && !(vdev->vdev.log_ops->log_start && + vdev->vdev.log_ops->log_stop && + vdev->vdev.log_ops->log_read_and_clear)) + return -EINVAL; + /* * Prevent binding to PFs with VFs enabled, the VFs might be in use * by the host or other users. 
We cannot capture the VFs if they diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index aac9213a783d..8eb8ba837059 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include #include "vfio.h" #define DRIVER_VERSION "0.3" @@ -1601,6 +1603,154 @@ static int vfio_ioctl_device_feature_migration(struct vfio_device *device, return 0; } +#define LOG_MAX_RANGES 1024 + +static int +vfio_ioctl_device_feature_logging_start(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + size_t minsz = + offsetofend(struct vfio_device_feature_dma_logging_control, + ranges); + struct vfio_device_feature_dma_logging_range __user *ranges; + struct vfio_device_feature_dma_logging_control control; + struct vfio_device_feature_dma_logging_range range; + struct rb_root_cached root = RB_ROOT_CACHED; + struct interval_tree_node *nodes; + u32 nnodes; + int i, ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_SET, + sizeof(control)); + if (ret != 1) + return ret; + + if (copy_from_user(&control, arg, minsz)) + return -EFAULT; + + nnodes = control.num_ranges; + if (!nnodes || nnodes > LOG_MAX_RANGES) + return -EINVAL; + + ranges = (struct vfio_device_feature_dma_logging_range __user *) + control.ranges; + nodes = kmalloc_array(nnodes, sizeof(struct interval_tree_node), + GFP_KERNEL); + if (!nodes) + return -ENOMEM; + + for (i = 0; i < nnodes; i++) { + if (copy_from_user(&range, &ranges[i], sizeof(range))) { + ret = -EFAULT; + goto end; + } + if (!IS_ALIGNED(range.iova, control.page_size) || + !IS_ALIGNED(range.length, control.page_size)) { + ret = -EINVAL; + goto end; + } + nodes[i].start = range.iova; + nodes[i].last = range.iova + range.length - 1; + if (interval_tree_iter_first(&root, nodes[i].start, + nodes[i].last)) { + /* Range overlapping */ + ret = -EINVAL; + goto end; + } + interval_tree_insert(nodes + i, 
&root); + } + + ret = device->log_ops->log_start(device, &root, nnodes, + &control.page_size); + if (ret) + goto end; + + if (copy_to_user(arg, &control, sizeof(control))) { + ret = -EFAULT; + device->log_ops->log_stop(device); + } + +end: + kfree(nodes); + return ret; +} + +static int +vfio_ioctl_device_feature_logging_stop(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + int ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_SET, 0); + if (ret != 1) + return ret; + + return device->log_ops->log_stop(device); +} + +static int +vfio_ioctl_device_feature_logging_report(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + size_t minsz = + offsetofend(struct vfio_device_feature_dma_logging_report, + bitmap); + struct vfio_device_feature_dma_logging_report report; + struct iova_bitmap_iter iter; + int ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_GET, + sizeof(report)); + if (ret != 1) + return ret; + + if (copy_from_user(&report, arg, minsz)) + return -EFAULT; + + if (report.page_size < PAGE_SIZE) + return -EINVAL; + + iova_bitmap_init(&iter.dirty, report.iova, ilog2(report.page_size)); + ret = iova_bitmap_iter_init(&iter, report.iova, report.length, + (unsigned long __user *)report.bitmap); + if (ret) + return ret; + + for (; iova_bitmap_iter_done(&iter); + iova_bitmap_iter_advance(&iter)) { + ret = iova_bitmap_iter_get(&iter); + if (ret) + break; + + ret = device->log_ops->log_read_and_clear(device, + iova_bitmap_iova(&iter), + iova_bitmap_length(&iter), &iter.dirty); + + iova_bitmap_iter_put(&iter); + + if (ret) + break; + } + + iova_bitmap_iter_free(&iter); + return ret; +} + static int vfio_ioctl_device_feature(struct vfio_device *device, struct vfio_device_feature __user *arg) { @@ -1634,6 +1784,18 @@ static int vfio_ioctl_device_feature(struct vfio_device *device, return 
vfio_ioctl_device_feature_mig_device_state( device, feature.flags, arg->data, feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_START: + return vfio_ioctl_device_feature_logging_start( + device, feature.flags, arg->data, + feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP: + return vfio_ioctl_device_feature_logging_stop( + device, feature.flags, arg->data, + feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT: + return vfio_ioctl_device_feature_logging_report( + device, feature.flags, arg->data, + feature.argsz - minsz); default: if (unlikely(!device->ops->device_feature)) return -EINVAL; diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 4d26e149db81..feed84d686ec 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -14,6 +14,7 @@ #include #include #include +#include struct kvm; @@ -33,10 +34,11 @@ struct vfio_device { struct device *dev; const struct vfio_device_ops *ops; /* - * mig_ops is a static property of the vfio_device which must be set - * prior to registering the vfio_device. + * mig_ops/log_ops are static properties of the vfio_device which must + * be set prior to registering the vfio_device. */ const struct vfio_migration_ops *mig_ops; + const struct vfio_log_ops *log_ops; struct vfio_group *group; struct vfio_device_set *dev_set; struct list_head dev_set_list; @@ -104,6 +106,21 @@ struct vfio_migration_ops { enum vfio_device_mig_state *curr_state); }; +/** + * @log_start: Optional callback to ask the device to start DMA logging. + * @log_stop: Optional callback to ask the device to stop DMA logging. + * @log_read_and_clear: Optional callback to ask the device to read + * and clear the dirty DMAs in a given range.
+ */ +struct vfio_log_ops { + int (*log_start)(struct vfio_device *device, + struct rb_root_cached *ranges, u32 nnodes, u64 *page_size); + int (*log_stop)(struct vfio_device *device); + int (*log_read_and_clear)(struct vfio_device *device, + unsigned long iova, unsigned long length, + struct iova_bitmap *dirty); +}; + /** * vfio_check_feature - Validate user input for the VFIO_DEVICE_FEATURE ioctl * @flags: Arg from the device_feature op From patchwork Thu Jun 30 10:25:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yishai Hadas X-Patchwork-Id: 12901619
From: Yishai Hadas
Subject: [PATCH vfio 09/13] vfio/mlx5: Init QP based resources for dirty tracking
Date: Thu, 30 Jun 2022 13:25:41 +0300
Message-ID: <20220630102545.18005-10-yishaih@nvidia.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
X-Mailing-List: kvm@vger.kernel.org

Init QP based resources for dirty tracking, to be used when logging starts.

This includes:
- Creating the host and firmware RC QPs and moving each of them to its expected state, based on the device specification.
- Creating the resources needed by both QPs, such as the UAR and PD.
- Creating the host receive-side resources, such as the MKEY, CQ, and receive WQEs.

These resources are cleaned up when logging stops.

The tracker object introduced by the next patches will use these resources.

Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c | 595 +++++++++++++++++++++++++++++++++++-
 drivers/vfio/pci/mlx5/cmd.h | 53 ++++
 2 files changed, 636 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index dd5d7bfe0a49..0a362796d567 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -7,6 +7,8 @@ static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id, u16 *vhca_id); +static void +_mlx5vf_free_page_tracker_resources(struct mlx5vf_pci_core_device *mvdev); int mlx5vf_cmd_suspend_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod) { @@ -72,19 +74,22 @@ static int mlx5fv_vf_event(struct notifier_block *nb, struct mlx5vf_pci_core_device *mvdev = container_of(nb, struct mlx5vf_pci_core_device, nb); - mutex_lock(&mvdev->state_mutex); switch (event) { case MLX5_PF_NOTIFY_ENABLE_VF: + mutex_lock(&mvdev->state_mutex); mvdev->mdev_detach = false; + mlx5vf_state_mutex_unlock(mvdev); break; case MLX5_PF_NOTIFY_DISABLE_VF: - mlx5vf_disable_fds(mvdev); + mlx5vf_cmd_close_migratable(mvdev); + mutex_lock(&mvdev->state_mutex); mvdev->mdev_detach = true; + mlx5vf_state_mutex_unlock(mvdev); break;
default: break; } - mlx5vf_state_mutex_unlock(mvdev); + return 0; } @@ -95,6 +100,7 @@ void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev) mutex_lock(&mvdev->state_mutex); mlx5vf_disable_fds(mvdev); + _mlx5vf_free_page_tracker_resources(mvdev); mlx5vf_state_mutex_unlock(mvdev); } @@ -188,11 +194,13 @@ static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id, return ret; } -static int _create_state_mkey(struct mlx5_core_dev *mdev, u32 pdn, - struct mlx5_vf_migration_file *migf, u32 *mkey) +static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, + struct mlx5_vf_migration_file *migf, + struct mlx5_vhca_recv_buf *recv_buf, + u32 *mkey) { - size_t npages = DIV_ROUND_UP(migf->total_length, PAGE_SIZE); - struct sg_dma_page_iter dma_iter; + size_t npages = migf ? DIV_ROUND_UP(migf->total_length, PAGE_SIZE) : + recv_buf->npages; int err = 0, inlen; __be64 *mtt; void *mkc; @@ -209,8 +217,17 @@ static int _create_state_mkey(struct mlx5_core_dev *mdev, u32 pdn, DIV_ROUND_UP(npages, 2)); mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); - for_each_sgtable_dma_page(&migf->table.sgt, &dma_iter, 0) - *mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter)); + if (migf) { + struct sg_dma_page_iter dma_iter; + + for_each_sgtable_dma_page(&migf->table.sgt, &dma_iter, 0) + *mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter)); + } else { + int i; + + for (i = 0; i < npages; i++) + *mtt++ = cpu_to_be64(recv_buf->dma_addrs[i]); + } mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); @@ -223,7 +240,8 @@ static int _create_state_mkey(struct mlx5_core_dev *mdev, u32 pdn, MLX5_SET(mkc, mkc, qpn, 0xffffff); MLX5_SET(mkc, mkc, log_page_size, PAGE_SHIFT); MLX5_SET(mkc, mkc, translations_octword_size, DIV_ROUND_UP(npages, 2)); - MLX5_SET64(mkc, mkc, len, migf->total_length); + MLX5_SET64(mkc, mkc, len, + migf ? 
migf->total_length : (npages * PAGE_SIZE)); err = mlx5_core_create_mkey(mdev, mkey, in, inlen); kvfree(in); return err; @@ -297,7 +315,7 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, if (err) goto err_dma_map; - err = _create_state_mkey(mdev, pdn, migf, &mkey); + err = _create_mkey(mdev, pdn, migf, NULL, &mkey); if (err) goto err_create_mkey; @@ -369,7 +387,7 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev, if (err) goto err_reg; - err = _create_state_mkey(mdev, pdn, migf, &mkey); + err = _create_mkey(mdev, pdn, migf, NULL, &mkey); if (err) goto err_mkey; @@ -391,3 +409,556 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev, mutex_unlock(&migf->lock); return err; } + +static int alloc_cq_frag_buf(struct mlx5_core_dev *mdev, + struct mlx5_vhca_cq_buf *buf, int nent, + int cqe_size) +{ + struct mlx5_frag_buf *frag_buf = &buf->frag_buf; + u8 log_wq_stride = 6 + (cqe_size == 128 ? 1 : 0); + u8 log_wq_sz = ilog2(cqe_size); + int err; + + err = mlx5_frag_buf_alloc_node(mdev, nent * cqe_size, frag_buf, + mdev->priv.numa_node); + if (err) + return err; + + mlx5_init_fbc(frag_buf->frags, log_wq_stride, log_wq_sz, &buf->fbc); + buf->cqe_size = cqe_size; + buf->nent = nent; + return 0; +} + +static void init_cq_frag_buf(struct mlx5_vhca_cq_buf *buf) +{ + struct mlx5_cqe64 *cqe64; + void *cqe; + int i; + + for (i = 0; i < buf->nent; i++) { + cqe = mlx5_frag_buf_get_wqe(&buf->fbc, i); + cqe64 = buf->cqe_size == 64 ? cqe : cqe + 64; + cqe64->op_own = MLX5_CQE_INVALID << 4; + } +} + +static void mlx5vf_destroy_cq(struct mlx5_core_dev *mdev, + struct mlx5_vhca_cq *cq) +{ + mlx5_core_destroy_cq(mdev, &cq->mcq); + mlx5_frag_buf_free(mdev, &cq->buf.frag_buf); + mlx5_db_free(mdev, &cq->db); +} + +static int mlx5vf_create_cq(struct mlx5_core_dev *mdev, + struct mlx5_vhca_page_tracker *tracker, + size_t ncqe) +{ + int cqe_size = cache_line_size() == 128 ? 
128 : 64; + u32 out[MLX5_ST_SZ_DW(create_cq_out)]; + struct mlx5_vhca_cq *cq; + int inlen, err, eqn; + void *cqc, *in; + __be64 *pas; + int vector; + + cq = &tracker->cq; + ncqe = roundup_pow_of_two(ncqe); + err = mlx5_db_alloc_node(mdev, &cq->db, mdev->priv.numa_node); + if (err) + return err; + + cq->ncqe = ncqe; + cq->mcq.set_ci_db = cq->db.db; + cq->mcq.arm_db = cq->db.db + 1; + cq->mcq.cqe_sz = cqe_size; + err = alloc_cq_frag_buf(mdev, &cq->buf, ncqe, cqe_size); + if (err) + goto err_db_free; + + init_cq_frag_buf(&cq->buf); + inlen = MLX5_ST_SZ_BYTES(create_cq_in) + + MLX5_FLD_SZ_BYTES(create_cq_in, pas[0]) * + cq->buf.frag_buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) { + err = -ENOMEM; + goto err_buff; + } + + vector = raw_smp_processor_id() % mlx5_comp_vectors_count(mdev); + err = mlx5_vector2eqn(mdev, vector, &eqn); + if (err) + goto err_vec; + + cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context); + MLX5_SET(cqc, cqc, log_cq_size, ilog2(ncqe)); + MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn); + MLX5_SET(cqc, cqc, uar_page, tracker->uar->index); + MLX5_SET(cqc, cqc, log_page_size, cq->buf.frag_buf.page_shift - + MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(cqc, cqc, dbr_addr, cq->db.dma); + pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas); + mlx5_fill_page_frag_array(&cq->buf.frag_buf, pas); + err = mlx5_core_create_cq(mdev, &cq->mcq, in, inlen, out, sizeof(out)); + if (err) + goto err_vec; + + kvfree(in); + return 0; + +err_vec: + kvfree(in); +err_buff: + mlx5_frag_buf_free(mdev, &cq->buf.frag_buf); +err_db_free: + mlx5_db_free(mdev, &cq->db); + return err; +} + +static struct mlx5_vhca_qp * +mlx5vf_create_rc_qp(struct mlx5_core_dev *mdev, + struct mlx5_vhca_page_tracker *tracker, u32 max_recv_wr) +{ + u32 out[MLX5_ST_SZ_DW(create_qp_out)] = {}; + struct mlx5_vhca_qp *qp; + u8 log_rq_stride; + u8 log_rq_sz; + void *qpc; + int inlen; + void *in; + int err; + + qp = kzalloc(sizeof(*qp), GFP_KERNEL); + if (!qp) + return ERR_PTR(-ENOMEM); + + 
qp->rq.wqe_cnt = roundup_pow_of_two(max_recv_wr); + log_rq_stride = ilog2(MLX5_SEND_WQE_DS); + log_rq_sz = ilog2(qp->rq.wqe_cnt); + err = mlx5_db_alloc_node(mdev, &qp->db, mdev->priv.numa_node); + if (err) + goto err_free; + + if (max_recv_wr) { + err = mlx5_frag_buf_alloc_node(mdev, + wq_get_byte_sz(log_rq_sz, log_rq_stride), + &qp->buf, mdev->priv.numa_node); + if (err) + goto err_db_free; + mlx5_init_fbc(qp->buf.frags, log_rq_stride, log_rq_sz, &qp->rq.fbc); + } + + qp->rq.db = &qp->db.db[MLX5_RCV_DBR]; + inlen = MLX5_ST_SZ_BYTES(create_qp_in) + + MLX5_FLD_SZ_BYTES(create_qp_in, pas[0]) * + qp->buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) { + err = -ENOMEM; + goto err_in; + } + + qpc = MLX5_ADDR_OF(create_qp_in, in, qpc); + MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC); + MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); + MLX5_SET(qpc, qpc, pd, tracker->pdn); + MLX5_SET(qpc, qpc, uar_page, tracker->uar->index); + MLX5_SET(qpc, qpc, log_page_size, + qp->buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET(qpc, qpc, ts_format, mlx5_get_qp_default_ts(mdev)); + if (MLX5_CAP_GEN(mdev, cqe_version) == 1) + MLX5_SET(qpc, qpc, user_index, 0xFFFFFF); + MLX5_SET(qpc, qpc, no_sq, 1); + if (max_recv_wr) { + MLX5_SET(qpc, qpc, cqn_rcv, tracker->cq.mcq.cqn); + MLX5_SET(qpc, qpc, log_rq_stride, log_rq_stride - 4); + MLX5_SET(qpc, qpc, log_rq_size, log_rq_sz); + MLX5_SET(qpc, qpc, rq_type, MLX5_NON_ZERO_RQ); + MLX5_SET64(qpc, qpc, dbr_addr, qp->db.dma); + mlx5_fill_page_frag_array(&qp->buf, + (__be64 *)MLX5_ADDR_OF(create_qp_in, + in, pas)); + } else { + MLX5_SET(qpc, qpc, rq_type, MLX5_ZERO_LEN_RQ); + } + + MLX5_SET(create_qp_in, in, opcode, MLX5_CMD_OP_CREATE_QP); + err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out)); + kvfree(in); + if (err) + goto err_in; + + qp->qpn = MLX5_GET(create_qp_out, out, qpn); + return qp; + +err_in: + if (max_recv_wr) + mlx5_frag_buf_free(mdev, &qp->buf); +err_db_free: + mlx5_db_free(mdev, &qp->db); +err_free: + kfree(qp); + return 
ERR_PTR(err); +} + +static void mlx5vf_post_recv(struct mlx5_vhca_qp *qp) +{ + struct mlx5_wqe_data_seg *data; + unsigned int ix; + + WARN_ON(qp->rq.pc - qp->rq.cc >= qp->rq.wqe_cnt); + ix = qp->rq.pc & (qp->rq.wqe_cnt - 1); + data = mlx5_frag_buf_get_wqe(&qp->rq.fbc, ix); + data->byte_count = cpu_to_be32(qp->max_msg_size); + data->lkey = cpu_to_be32(qp->recv_buf.mkey); + data->addr = cpu_to_be64(qp->recv_buf.next_rq_offset); + qp->rq.pc++; + /* Make sure that descriptors are written before doorbell record. */ + dma_wmb(); + *qp->rq.db = cpu_to_be32(qp->rq.pc & 0xffff); +} + +static int mlx5vf_activate_qp(struct mlx5_core_dev *mdev, + struct mlx5_vhca_qp *qp, u32 remote_qpn, + bool host_qp) +{ + u32 init_in[MLX5_ST_SZ_DW(rst2init_qp_in)] = {}; + u32 rtr_in[MLX5_ST_SZ_DW(init2rtr_qp_in)] = {}; + u32 rts_in[MLX5_ST_SZ_DW(rtr2rts_qp_in)] = {}; + void *qpc; + int ret; + + /* Init */ + qpc = MLX5_ADDR_OF(rst2init_qp_in, init_in, qpc); + MLX5_SET(qpc, qpc, primary_address_path.vhca_port_num, 1); + MLX5_SET(qpc, qpc, pm_state, MLX5_QPC_PM_STATE_MIGRATED); + MLX5_SET(qpc, qpc, rre, 1); + MLX5_SET(qpc, qpc, rwe, 1); + MLX5_SET(rst2init_qp_in, init_in, opcode, MLX5_CMD_OP_RST2INIT_QP); + MLX5_SET(rst2init_qp_in, init_in, qpn, qp->qpn); + ret = mlx5_cmd_exec_in(mdev, rst2init_qp, init_in); + if (ret) + return ret; + + if (host_qp) { + struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; + int i; + + for (i = 0; i < qp->rq.wqe_cnt; i++) { + mlx5vf_post_recv(qp); + recv_buf->next_rq_offset += qp->max_msg_size; + } + } + + /* RTR */ + qpc = MLX5_ADDR_OF(init2rtr_qp_in, rtr_in, qpc); + MLX5_SET(init2rtr_qp_in, rtr_in, qpn, qp->qpn); + MLX5_SET(qpc, qpc, mtu, IB_MTU_4096); + MLX5_SET(qpc, qpc, log_msg_max, MLX5_CAP_GEN(mdev, log_max_msg)); + MLX5_SET(qpc, qpc, remote_qpn, remote_qpn); + MLX5_SET(qpc, qpc, primary_address_path.vhca_port_num, 1); + MLX5_SET(qpc, qpc, primary_address_path.fl, 1); + MLX5_SET(qpc, qpc, min_rnr_nak, 1); + MLX5_SET(init2rtr_qp_in, rtr_in, opcode, 
MLX5_CMD_OP_INIT2RTR_QP); + MLX5_SET(init2rtr_qp_in, rtr_in, qpn, qp->qpn); + ret = mlx5_cmd_exec_in(mdev, init2rtr_qp, rtr_in); + if (ret || host_qp) + return ret; + + /* RTS */ + qpc = MLX5_ADDR_OF(rtr2rts_qp_in, rts_in, qpc); + MLX5_SET(rtr2rts_qp_in, rts_in, qpn, qp->qpn); + MLX5_SET(qpc, qpc, retry_count, 7); + MLX5_SET(qpc, qpc, rnr_retry, 7); /* Infinite retry if RNR NACK */ + MLX5_SET(qpc, qpc, primary_address_path.ack_timeout, 0x8); /* ~1ms */ + MLX5_SET(rtr2rts_qp_in, rts_in, opcode, MLX5_CMD_OP_RTR2RTS_QP); + MLX5_SET(rtr2rts_qp_in, rts_in, qpn, qp->qpn); + + return mlx5_cmd_exec_in(mdev, rtr2rts_qp, rts_in); +} + +static void mlx5vf_destroy_qp(struct mlx5_core_dev *mdev, + struct mlx5_vhca_qp *qp) +{ + u32 in[MLX5_ST_SZ_DW(destroy_qp_in)] = {}; + + MLX5_SET(destroy_qp_in, in, opcode, MLX5_CMD_OP_DESTROY_QP); + MLX5_SET(destroy_qp_in, in, qpn, qp->qpn); + mlx5_cmd_exec_in(mdev, destroy_qp, in); + + mlx5_frag_buf_free(mdev, &qp->buf); + mlx5_db_free(mdev, &qp->db); + kfree(qp); +} + +static void free_recv_pages(struct mlx5_vhca_recv_buf *recv_buf) +{ + int i; + + /* Undo alloc_pages_bulk_array() */ + for (i = 0; i < recv_buf->npages; i++) + __free_page(recv_buf->page_list[i]); + + kvfree(recv_buf->page_list); +} + +static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf, + unsigned int npages) +{ + unsigned int filled = 0, done = 0; + int i; + + recv_buf->page_list = kvcalloc(npages, sizeof(*recv_buf->page_list), + GFP_KERNEL); + if (!recv_buf->page_list) + return -ENOMEM; + + for (;;) { + filled = alloc_pages_bulk_array(GFP_KERNEL, npages - done, + recv_buf->page_list + done); + if (!filled) + goto err; + + done += filled; + if (done == npages) + break; + } + + recv_buf->npages = npages; + return 0; + +err: + for (i = 0; i < npages; i++) { + if (recv_buf->page_list[i]) + __free_page(recv_buf->page_list[i]); + } + + kvfree(recv_buf->page_list); + return -ENOMEM; +} + +static int register_dma_recv_pages(struct mlx5_core_dev *mdev, + struct 
mlx5_vhca_recv_buf *recv_buf) +{ + int i, j; + + recv_buf->dma_addrs = kvcalloc(recv_buf->npages, + sizeof(*recv_buf->dma_addrs), + GFP_KERNEL); + if (!recv_buf->dma_addrs) + return -ENOMEM; + + for (i = 0; i < recv_buf->npages; i++) { + recv_buf->dma_addrs[i] = dma_map_page(mdev->device, + recv_buf->page_list[i], + 0, PAGE_SIZE, + DMA_FROM_DEVICE); + if (dma_mapping_error(mdev->device, recv_buf->dma_addrs[i])) + goto error; + } + return 0; + +error: + for (j = 0; j < i; j++) + dma_unmap_single(mdev->device, recv_buf->dma_addrs[j], + PAGE_SIZE, DMA_FROM_DEVICE); + + kvfree(recv_buf->dma_addrs); + return -ENOMEM; +} + +static void unregister_dma_recv_pages(struct mlx5_core_dev *mdev, + struct mlx5_vhca_recv_buf *recv_buf) +{ + int i; + + for (i = 0; i < recv_buf->npages; i++) + dma_unmap_single(mdev->device, recv_buf->dma_addrs[i], + PAGE_SIZE, DMA_FROM_DEVICE); + + kvfree(recv_buf->dma_addrs); +} + +static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev, + struct mlx5_vhca_qp *qp) +{ + struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; + + mlx5_core_destroy_mkey(mdev, recv_buf->mkey); + unregister_dma_recv_pages(mdev, recv_buf); + free_recv_pages(&qp->recv_buf); +} + +static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, + struct mlx5_vhca_qp *qp, u32 pdn, + u64 rq_size) +{ + unsigned int npages = DIV_ROUND_UP_ULL(rq_size, PAGE_SIZE); + struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; + int err; + + err = alloc_recv_pages(recv_buf, npages); + if (err < 0) + return err; + + err = register_dma_recv_pages(mdev, recv_buf); + if (err) + goto end; + + err = _create_mkey(mdev, pdn, NULL, recv_buf, &recv_buf->mkey); + if (err) + goto err_create_mkey; + + return 0; + +err_create_mkey: + unregister_dma_recv_pages(mdev, recv_buf); +end: + free_recv_pages(recv_buf); + return err; +} + +static void +_mlx5vf_free_page_tracker_resources(struct mlx5vf_pci_core_device *mvdev) +{ + struct mlx5_vhca_page_tracker *tracker = &mvdev->tracker; + struct 
mlx5_core_dev *mdev = mvdev->mdev; + + lockdep_assert_held(&mvdev->state_mutex); + + if (!mvdev->log_active) + return; + + WARN_ON(mvdev->mdev_detach); + + mlx5vf_destroy_qp(mdev, tracker->fw_qp); + mlx5vf_free_qp_recv_resources(mdev, tracker->host_qp); + mlx5vf_destroy_qp(mdev, tracker->host_qp); + mlx5vf_destroy_cq(mdev, &tracker->cq); + mlx5_core_dealloc_pd(mdev, tracker->pdn); + mlx5_put_uars_page(mdev, tracker->uar); + mvdev->log_active = false; +} + +int mlx5vf_stop_page_tracker(struct vfio_device *vdev) +{ + struct mlx5vf_pci_core_device *mvdev = container_of( + vdev, struct mlx5vf_pci_core_device, core_device.vdev); + + mutex_lock(&mvdev->state_mutex); + if (!mvdev->log_active) + goto end; + + _mlx5vf_free_page_tracker_resources(mvdev); + mvdev->log_active = false; +end: + mlx5vf_state_mutex_unlock(mvdev); + return 0; +} + +int mlx5vf_start_page_tracker(struct vfio_device *vdev, + struct rb_root_cached *ranges, u32 nnodes, + u64 *page_size) +{ + struct mlx5vf_pci_core_device *mvdev = container_of( + vdev, struct mlx5vf_pci_core_device, core_device.vdev); + struct mlx5_vhca_page_tracker *tracker = &mvdev->tracker; + u8 log_tracked_page = ilog2(*page_size); + struct mlx5_vhca_qp *host_qp; + struct mlx5_vhca_qp *fw_qp; + struct mlx5_core_dev *mdev; + u32 max_msg_size = PAGE_SIZE; + u64 rq_size = SZ_2M; + u32 max_recv_wr; + int err; + + mutex_lock(&mvdev->state_mutex); + if (mvdev->mdev_detach) { + err = -ENOTCONN; + goto end; + } + + if (mvdev->log_active) { + err = -EINVAL; + goto end; + } + + mdev = mvdev->mdev; + memset(tracker, 0, sizeof(*tracker)); + tracker->uar = mlx5_get_uars_page(mdev); + if (IS_ERR(tracker->uar)) { + err = PTR_ERR(tracker->uar); + goto end; + } + + err = mlx5_core_alloc_pd(mdev, &tracker->pdn); + if (err) + goto err_uar; + + max_recv_wr = DIV_ROUND_UP_ULL(rq_size, max_msg_size); + err = mlx5vf_create_cq(mdev, tracker, max_recv_wr); + if (err) + goto err_dealloc_pd; + + host_qp = mlx5vf_create_rc_qp(mdev, tracker, max_recv_wr); + if 
(IS_ERR(host_qp)) { + err = PTR_ERR(host_qp); + goto err_cq; + } + + host_qp->max_msg_size = max_msg_size; + if (log_tracked_page < MLX5_CAP_ADV_VIRTUALIZATION(mdev, + pg_track_log_min_page_size)) { + log_tracked_page = MLX5_CAP_ADV_VIRTUALIZATION(mdev, + pg_track_log_min_page_size); + } else if (log_tracked_page > MLX5_CAP_ADV_VIRTUALIZATION(mdev, + pg_track_log_max_page_size)) { + log_tracked_page = MLX5_CAP_ADV_VIRTUALIZATION(mdev, + pg_track_log_max_page_size); + } + + host_qp->tracked_page_size = (1ULL << log_tracked_page); + err = mlx5vf_alloc_qp_recv_resources(mdev, host_qp, tracker->pdn, + rq_size); + if (err) + goto err_host_qp; + + fw_qp = mlx5vf_create_rc_qp(mdev, tracker, 0); + if (IS_ERR(fw_qp)) { + err = PTR_ERR(fw_qp); + goto err_recv_resources; + } + + err = mlx5vf_activate_qp(mdev, host_qp, fw_qp->qpn, true); + if (err) + goto err_activate; + + err = mlx5vf_activate_qp(mdev, fw_qp, host_qp->qpn, false); + if (err) + goto err_activate; + + tracker->host_qp = host_qp; + tracker->fw_qp = fw_qp; + *page_size = host_qp->tracked_page_size; + mvdev->log_active = true; + mlx5vf_state_mutex_unlock(mvdev); + return 0; + +err_activate: + mlx5vf_destroy_qp(mdev, fw_qp); +err_recv_resources: + mlx5vf_free_qp_recv_resources(mdev, host_qp); +err_host_qp: + mlx5vf_destroy_qp(mdev, host_qp); +err_cq: + mlx5vf_destroy_cq(mdev, &tracker->cq); +err_dealloc_pd: + mlx5_core_dealloc_pd(mdev, tracker->pdn); +err_uar: + mlx5_put_uars_page(mdev, tracker->uar); +end: + mlx5vf_state_mutex_unlock(mvdev); + return err; +} diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index 8208f4701a90..e71ec017bf04 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -9,6 +9,8 @@ #include #include #include +#include +#include struct mlx5vf_async_data { struct mlx5_async_work cb_work; @@ -39,6 +41,52 @@ struct mlx5_vf_migration_file { struct mlx5vf_async_data async_data; }; +struct mlx5_vhca_cq_buf { + struct mlx5_frag_buf_ctrl fbc; + struct 
mlx5_frag_buf frag_buf; + int cqe_size; + int nent; +}; + +struct mlx5_vhca_cq { + struct mlx5_vhca_cq_buf buf; + struct mlx5_db db; + struct mlx5_core_cq mcq; + size_t ncqe; +}; + +struct mlx5_vhca_recv_buf { + u32 npages; + struct page **page_list; + dma_addr_t *dma_addrs; + u32 next_rq_offset; + u32 mkey; +}; + +struct mlx5_vhca_qp { + struct mlx5_frag_buf buf; + struct mlx5_db db; + struct mlx5_vhca_recv_buf recv_buf; + u32 tracked_page_size; + u32 max_msg_size; + u32 qpn; + struct { + unsigned int pc; + unsigned int cc; + unsigned int wqe_cnt; + __be32 *db; + struct mlx5_frag_buf_ctrl fbc; + } rq; +}; + +struct mlx5_vhca_page_tracker { + u32 pdn; + struct mlx5_uars_page *uar; + struct mlx5_vhca_cq cq; + struct mlx5_vhca_qp *host_qp; + struct mlx5_vhca_qp *fw_qp; +}; + struct mlx5vf_pci_core_device { struct vfio_pci_core_device core_device; int vf_id; @@ -46,6 +94,7 @@ struct mlx5vf_pci_core_device { u8 migrate_cap:1; u8 deferred_reset:1; u8 mdev_detach:1; + u8 log_active:1; /* protect migration state */ struct mutex state_mutex; enum vfio_device_mig_state mig_state; @@ -53,6 +102,7 @@ struct mlx5vf_pci_core_device { spinlock_t reset_lock; struct mlx5_vf_migration_file *resuming_migf; struct mlx5_vf_migration_file *saving_migf; + struct mlx5_vhca_page_tracker tracker; struct workqueue_struct *cb_wq; struct notifier_block nb; struct mlx5_core_dev *mdev; @@ -73,4 +123,7 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev, void mlx5vf_state_mutex_unlock(struct mlx5vf_pci_core_device *mvdev); void mlx5vf_disable_fds(struct mlx5vf_pci_core_device *mvdev); void mlx5vf_mig_file_cleanup_cb(struct work_struct *_work); +int mlx5vf_start_page_tracker(struct vfio_device *vdev, + struct rb_root_cached *ranges, u32 nnodes, u64 *page_size); +int mlx5vf_stop_page_tracker(struct vfio_device *vdev); #endif /* MLX5_VFIO_CMD_H */ From patchwork Thu Jun 30 10:25:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Yishai Hadas X-Patchwork-Id: 12901622
From: Yishai Hadas
Subject: [PATCH vfio 10/13] vfio/mlx5: Create and destroy page tracker object
Date: Thu, 30 Jun 2022 13:25:42 +0300
Message-ID: <20220630102545.18005-11-yishaih@nvidia.com>
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
X-Mailing-List: kvm@vger.kernel.org

Add support for creating and destroying the page tracker object.

This object is used to control/report the device dirty pages.

As part of creating the tracker, we need to consider the device
capabilities for max ranges and adapt/combine ranges accordingly.

Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c | 147 ++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/mlx5/cmd.h |   1 +
 2 files changed, 148 insertions(+)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 0a362796d567..f1cad96af6ab 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -410,6 +410,148 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev,
 	return err;
 }
 
+static void combine_ranges(struct rb_root_cached *root, u32 cur_nodes,
+			   u32 req_nodes)
+{
+	struct interval_tree_node *prev, *curr, *comb_start, *comb_end;
+	unsigned long min_gap;
+	unsigned long curr_gap;
+
+	/* Special shortcut when a single range is required */
+	if (req_nodes == 1) {
+		unsigned long last;
+
+		curr = comb_start = interval_tree_iter_first(root, 0, ULONG_MAX);
+		while (curr) {
+			last = curr->last;
+			prev = curr;
+			curr = interval_tree_iter_next(curr, 0, ULONG_MAX);
+			if (prev != comb_start)
+				interval_tree_remove(prev, root);
+		}
+		comb_start->last = last;
+		return;
+	}
+
+	/* Combine ranges which have the smallest gap */
+	while (cur_nodes > req_nodes) {
+		prev = NULL;
+		min_gap = ULONG_MAX;
+		curr = interval_tree_iter_first(root, 0, ULONG_MAX);
+		while (curr) {
+			if (prev) {
+				curr_gap = curr->start - prev->last;
+				if (curr_gap < min_gap) {
+					min_gap = curr_gap;
+					comb_start = prev;
+					comb_end = curr;
+				}
+			}
+			prev = curr;
+			curr = interval_tree_iter_next(curr, 0, ULONG_MAX);
+		}
+		comb_start->last = comb_end->last;
+		interval_tree_remove(comb_end, root);
+		cur_nodes--;
+	}
+}
+
+static int mlx5vf_create_tracker(struct mlx5_core_dev *mdev,
+				 struct mlx5vf_pci_core_device *mvdev,
+				 struct rb_root_cached *ranges, u32 nnodes)
+{
+	int max_num_range =
+		MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_max_num_range);
+	struct mlx5_vhca_page_tracker *tracker = &mvdev->tracker;
+	int record_size = MLX5_ST_SZ_BYTES(page_track_range);
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+	struct interval_tree_node *node = NULL;
+	u64 total_ranges_len = 0;
+	u32 num_ranges = nnodes;
+	u8 log_addr_space_size;
+	void *range_list_ptr;
+	void *obj_context;
+	void *cmd_hdr;
+	int inlen;
+	void *in;
+	int err;
+	int i;
+
+	if (num_ranges > max_num_range) {
+		combine_ranges(ranges, nnodes, max_num_range);
+		num_ranges = max_num_range;
+	}
+
+	inlen = MLX5_ST_SZ_BYTES(create_page_track_obj_in) +
+				 record_size * num_ranges;
+	in = kzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	cmd_hdr = MLX5_ADDR_OF(create_page_track_obj_in, in,
+			       general_obj_in_cmd_hdr);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode,
+		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type,
+		 MLX5_OBJ_TYPE_PAGE_TRACK);
+	obj_context = MLX5_ADDR_OF(create_page_track_obj_in, in, obj_context);
+	MLX5_SET(page_track, obj_context, vhca_id, mvdev->vhca_id);
+	MLX5_SET(page_track, obj_context, track_type, 1);
+	MLX5_SET(page_track, obj_context, log_page_size,
+		 ilog2(tracker->host_qp->tracked_page_size));
+	MLX5_SET(page_track, obj_context, log_msg_size,
+		 ilog2(tracker->host_qp->max_msg_size));
+	MLX5_SET(page_track, obj_context, reporting_qpn, tracker->fw_qp->qpn);
+	MLX5_SET(page_track, obj_context, num_ranges, num_ranges);
+
+	range_list_ptr = MLX5_ADDR_OF(page_track, obj_context, track_range);
+	node = interval_tree_iter_first(ranges, 0, ULONG_MAX);
+	for (i = 0; i < num_ranges; i++) {
+		void *addr_range_i_base = range_list_ptr + record_size * i;
+		unsigned long length = node->last - node->start;
+
+		MLX5_SET64(page_track_range, addr_range_i_base, start_address,
+			   node->start);
+		MLX5_SET64(page_track_range, addr_range_i_base, length, length);
+		total_ranges_len += length;
+		node = interval_tree_iter_next(node, 0, ULONG_MAX);
+	}
+
+	WARN_ON(node);
+	log_addr_space_size = ilog2(total_ranges_len);
+	if (log_addr_space_size <
+	    (MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_min_addr_space)) ||
+	    log_addr_space_size >
+	    (MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_max_addr_space))) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	MLX5_SET(page_track, obj_context, log_addr_space_size,
+		 log_addr_space_size);
+	err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
+	if (err)
+		goto out;
+
+	tracker->id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+out:
+	kfree(in);
+	return err;
+}
+
+static int mlx5vf_cmd_destroy_tracker(struct mlx5_core_dev *mdev,
+				      u32 tracker_id)
+{
+	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+
+	MLX5_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_PAGE_TRACK);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, tracker_id);
+
+	return mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
+}
+
 static int alloc_cq_frag_buf(struct mlx5_core_dev *mdev,
 			     struct mlx5_vhca_cq_buf *buf, int nent,
 			     int cqe_size)
@@ -833,6 +975,7 @@ _mlx5vf_free_page_tracker_resources(struct mlx5vf_pci_core_device *mvdev)
 
 	WARN_ON(mvdev->mdev_detach);
 
+	mlx5vf_cmd_destroy_tracker(mdev, tracker->id);
 	mlx5vf_destroy_qp(mdev, tracker->fw_qp);
 	mlx5vf_free_qp_recv_resources(mdev, tracker->host_qp);
 	mlx5vf_destroy_qp(mdev, tracker->host_qp);
@@ -941,6 +1084,10 @@ int mlx5vf_start_page_tracker(struct vfio_device *vdev,
 
 	tracker->host_qp = host_qp;
 	tracker->fw_qp = fw_qp;
+	err = mlx5vf_create_tracker(mdev, mvdev, ranges, nnodes);
+	if (err)
+		goto err_activate;
+
 	*page_size = host_qp->tracked_page_size;
 	mvdev->log_active = true;
 	mlx5vf_state_mutex_unlock(mvdev);
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index e71ec017bf04..658925ba5459 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -80,6 +80,7 @@ struct mlx5_vhca_qp {
 };
 
 struct mlx5_vhca_page_tracker {
+	u32 id;
 	u32 pdn;
 	struct mlx5_uars_page *uar;
 	struct mlx5_vhca_cq cq;

From patchwork Thu Jun 30 10:25:43 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901623
From: Yishai Hadas
Subject: [PATCH vfio 11/13] vfio/mlx5: Report dirty pages from tracker
Date: Thu, 30 Jun 2022 13:25:43 +0300
Message-ID: <20220630102545.18005-12-yishaih@nvidia.com>
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
X-Mailing-List: kvm@vger.kernel.org

Report dirty pages from tracker.

It includes:
- Querying for dirty pages in a given IOVA range; this is done by
  modifying the tracker into the reporting state and supplying the
  required range.
- Using the CQ event completion mechanism to be notified once data is
  ready on the CQ/QP to be processed.
- Once data is available, turning on the corresponding bits in the
  bitmap.

This functionality will be used as part of the 'log_read_and_clear'
driver callback in the next patches.

Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c | 191 ++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/mlx5/cmd.h |   4 +
 2 files changed, 195 insertions(+)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index f1cad96af6ab..fa9ddd926500 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -5,6 +5,8 @@
 
 #include "cmd.h"
 
+enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 };
+
 static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id,
 				  u16 *vhca_id);
 static void
@@ -157,6 +159,7 @@ void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
 		VFIO_MIGRATION_STOP_COPY |
 		VFIO_MIGRATION_P2P;
 	mvdev->core_device.vdev.mig_ops = mig_ops;
+	init_completion(&mvdev->tracker_comp);
 
 end:
 	mlx5_vf_put_core_dev(mvdev->mdev);
@@ -552,6 +555,29 @@ static int mlx5vf_cmd_destroy_tracker(struct mlx5_core_dev *mdev,
 	return mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
 }
 
+static int mlx5vf_cmd_modify_tracker(struct mlx5_core_dev *mdev,
+				     u32 tracker_id, unsigned long iova,
+				     unsigned long length, u32 tracker_state)
+{
+	u32 in[MLX5_ST_SZ_DW(modify_page_track_obj_in)] = {};
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+	void *obj_context;
+	void *cmd_hdr;
+
+	cmd_hdr = MLX5_ADDR_OF(modify_page_track_obj_in, in, general_obj_in_cmd_hdr);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type, MLX5_OBJ_TYPE_PAGE_TRACK);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_id, tracker_id);
+
+	obj_context = MLX5_ADDR_OF(modify_page_track_obj_in, in, obj_context);
+	MLX5_SET64(page_track, obj_context, modify_field_select, 0x3);
+	MLX5_SET64(page_track, obj_context, range_start_address, iova);
+	MLX5_SET64(page_track, obj_context, length, length);
+	MLX5_SET(page_track, obj_context, state, tracker_state);
+
+	return mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
+}
+
 static int alloc_cq_frag_buf(struct mlx5_core_dev *mdev,
 			     struct mlx5_vhca_cq_buf *buf, int nent,
 			     int cqe_size)
@@ -593,6 +619,16 @@ static void mlx5vf_destroy_cq(struct mlx5_core_dev *mdev,
 	mlx5_db_free(mdev, &cq->db);
 }
 
+static void mlx5vf_cq_complete(struct mlx5_core_cq *mcq,
+			       struct mlx5_eqe *eqe)
+{
+	struct mlx5vf_pci_core_device *mvdev =
+		container_of(mcq, struct mlx5vf_pci_core_device,
+			     tracker.cq.mcq);
+
+	complete(&mvdev->tracker_comp);
+}
+
 static int mlx5vf_create_cq(struct mlx5_core_dev *mdev,
 			    struct mlx5_vhca_page_tracker *tracker,
 			    size_t ncqe)
@@ -643,10 +679,13 @@ static int mlx5vf_create_cq(struct mlx5_core_dev *mdev,
 	MLX5_SET64(cqc, cqc, dbr_addr, cq->db.dma);
 	pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas);
 	mlx5_fill_page_frag_array(&cq->buf.frag_buf, pas);
+	cq->mcq.comp = mlx5vf_cq_complete;
 	err = mlx5_core_create_cq(mdev, &cq->mcq, in, inlen, out, sizeof(out));
 	if (err)
 		goto err_vec;
 
+	mlx5_cq_arm(&cq->mcq, MLX5_CQ_DB_REQ_NOT, tracker->uar->map,
+		    cq->mcq.cons_index);
 	kvfree(in);
 	return 0;
 
@@ -1109,3 +1148,155 @@ int mlx5vf_start_page_tracker(struct vfio_device *vdev,
 	mlx5vf_state_mutex_unlock(mvdev);
 	return err;
 }
+
+static void
+set_report_output(u32 size, int index, struct mlx5_vhca_qp *qp,
+		  struct iova_bitmap *dirty)
+{
+	u32 entry_size = MLX5_ST_SZ_BYTES(page_track_report_entry);
+	u32 nent = size / entry_size;
+	struct page *page;
+	u64 addr;
+	u64 *buf;
+	int i;
+
+	if (WARN_ON(index >= qp->recv_buf.npages ||
+		    (nent > qp->max_msg_size / entry_size)))
+		return;
+
+	page = qp->recv_buf.page_list[index];
+	buf = kmap_local_page(page);
+	for (i = 0; i < nent; i++) {
+		addr = MLX5_GET(page_track_report_entry, buf + i,
+				dirty_address_low);
+		addr |= (u64)MLX5_GET(page_track_report_entry, buf + i,
+				      dirty_address_high) << 32;
+		iova_bitmap_set(dirty, addr, qp->tracked_page_size);
+	}
+	kunmap_local(buf);
+}
+
+static void
+mlx5vf_rq_cqe(struct mlx5_vhca_qp *qp, struct mlx5_cqe64 *cqe,
+	      struct iova_bitmap *dirty, int *tracker_status)
+{
+	u32 size;
+	int ix;
+
+	qp->rq.cc++;
+	*tracker_status = be32_to_cpu(cqe->immediate) >> 28;
+	size = be32_to_cpu(cqe->byte_cnt);
+	ix = be16_to_cpu(cqe->wqe_counter) & (qp->rq.wqe_cnt - 1);
+
+	/* zero length CQE, no data */
+	WARN_ON(!size && *tracker_status == MLX5_PAGE_TRACK_STATE_REPORTING);
+	if (size)
+		set_report_output(size, ix, qp, dirty);
+
+	qp->recv_buf.next_rq_offset = ix * qp->max_msg_size;
+	mlx5vf_post_recv(qp);
+}
+
+static void *get_cqe(struct mlx5_vhca_cq *cq, int n)
+{
+	return mlx5_frag_buf_get_wqe(&cq->buf.fbc, n);
+}
+
+static struct mlx5_cqe64 *get_sw_cqe(struct mlx5_vhca_cq *cq, int n)
+{
+	void *cqe = get_cqe(cq, n & (cq->ncqe - 1));
+	struct mlx5_cqe64 *cqe64;
+
+	cqe64 = (cq->mcq.cqe_sz == 64) ? cqe : cqe + 64;
+
+	if (likely(get_cqe_opcode(cqe64) != MLX5_CQE_INVALID) &&
+	    !((cqe64->op_own & MLX5_CQE_OWNER_MASK) ^ !!(n & (cq->ncqe)))) {
+		return cqe64;
+	} else {
+		return NULL;
+	}
+}
+
+static int
+mlx5vf_cq_poll_one(struct mlx5_vhca_cq *cq, struct mlx5_vhca_qp *qp,
+		   struct iova_bitmap *dirty, int *tracker_status)
+{
+	struct mlx5_cqe64 *cqe;
+	u8 opcode;
+
+	cqe = get_sw_cqe(cq, cq->mcq.cons_index);
+	if (!cqe)
+		return CQ_EMPTY;
+
+	++cq->mcq.cons_index;
+	/*
+	 * Make sure we read CQ entry contents after we've checked the
+	 * ownership bit.
+	 */
+	rmb();
+	opcode = get_cqe_opcode(cqe);
+	switch (opcode) {
+	case MLX5_CQE_RESP_SEND_IMM:
+		mlx5vf_rq_cqe(qp, cqe, dirty, tracker_status);
+		return CQ_OK;
+	default:
+		return CQ_POLL_ERR;
+	}
+}
+
+int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
+				  unsigned long length,
+				  struct iova_bitmap *dirty)
+{
+	struct mlx5vf_pci_core_device *mvdev = container_of(
+		vdev, struct mlx5vf_pci_core_device, core_device.vdev);
+	struct mlx5_vhca_page_tracker *tracker = &mvdev->tracker;
+	struct mlx5_vhca_cq *cq = &tracker->cq;
+	struct mlx5_core_dev *mdev;
+	int poll_err, err;
+
+	mutex_lock(&mvdev->state_mutex);
+	if (!mvdev->log_active) {
+		err = -EINVAL;
+		goto end;
+	}
+
+	if (mvdev->mdev_detach) {
+		err = -ENOTCONN;
+		goto end;
+	}
+
+	mdev = mvdev->mdev;
+	err = mlx5vf_cmd_modify_tracker(mdev, tracker->id, iova, length,
+					MLX5_PAGE_TRACK_STATE_REPORTING);
+	if (err)
+		goto end;
+
+	tracker->status = MLX5_PAGE_TRACK_STATE_REPORTING;
+	while (tracker->status == MLX5_PAGE_TRACK_STATE_REPORTING) {
+		poll_err = mlx5vf_cq_poll_one(cq, tracker->host_qp, dirty,
+					      &tracker->status);
+		if (poll_err == CQ_EMPTY) {
+			mlx5_cq_arm(&cq->mcq, MLX5_CQ_DB_REQ_NOT, tracker->uar->map,
+				    cq->mcq.cons_index);
+			poll_err = mlx5vf_cq_poll_one(cq, tracker->host_qp,
+						      dirty, &tracker->status);
+			if (poll_err == CQ_EMPTY) {
+				wait_for_completion(&mvdev->tracker_comp);
+				continue;
+			}
+		}
+		if (poll_err == CQ_POLL_ERR) {
+			err = -EIO;
+			goto end;
+		}
+		mlx5_cq_set_ci(&cq->mcq);
+	}
+
+	if (tracker->status == MLX5_PAGE_TRACK_STATE_ERROR)
+		err = -EIO;
+
+end:
+	mlx5vf_state_mutex_unlock(mvdev);
+	return err;
+}
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 658925ba5459..fa1f9ab4d3d0 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -86,6 +86,7 @@ struct mlx5_vhca_page_tracker {
 	struct mlx5_vhca_cq cq;
 	struct mlx5_vhca_qp *host_qp;
 	struct mlx5_vhca_qp *fw_qp;
+	int status;
 };
 
 struct mlx5vf_pci_core_device {
@@ -96,6 +97,7 @@ struct mlx5vf_pci_core_device {
 	u8 deferred_reset:1;
 	u8 mdev_detach:1;
 	u8 log_active:1;
+	struct completion tracker_comp;
 	/* protect migration state */
 	struct mutex state_mutex;
 	enum vfio_device_mig_state mig_state;
@@ -127,4 +129,6 @@ void mlx5vf_mig_file_cleanup_cb(struct work_struct *_work);
 int mlx5vf_start_page_tracker(struct vfio_device *vdev,
 		struct rb_root_cached *ranges, u32 nnodes, u64 *page_size);
 int mlx5vf_stop_page_tracker(struct vfio_device *vdev);
+int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
+			unsigned long length, struct iova_bitmap *dirty);
 #endif /* MLX5_VFIO_CMD_H */

From patchwork Thu Jun 30 10:25:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901624
2022 03:27:08 -0700 Received: from vdi.nvidia.com (10.127.8.10) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.986.26 via Frontend Transport; Thu, 30 Jun 2022 03:27:05 -0700 From: Yishai Hadas To: , CC: , , , , , , , , , Subject: [PATCH vfio 12/13] vfio/mlx5: Manage error scenarios on tracker Date: Thu, 30 Jun 2022 13:25:44 +0300 Message-ID: <20220630102545.18005-13-yishaih@nvidia.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com> References: <20220630102545.18005-1-yishaih@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: e00ea3fb-dca3-48ce-6c28-08da5a8317dc X-MS-TrafficTypeDiagnostic: IA1PR12MB6164:EE_ X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: BatnxOVzpk2R/kbzh98adWma/M2uzGeqEICh+OQ7ifu7dUwOmMKJMHWEX+cY2KaWii/3MNy2++yrkzlhaNGiW4OVONM8XCYFrIxyYa1NPeqdg5aZ0hhJfXeNX12fiUqPdK96lOfwayE28NE8pFsdEEV67siABuE23DeH+5zEmEr/JkF6QiCI41qQHTwDMkT+jnXMetH4WkiW3LFbSAKojKDmWL8bU2Ii6tyQZMqS4ZH3CTHEeI+O27PkCSglEmZxU5oBXh1eOLVDLXkwhqp+s2bnj3KMD3DCDiYmdBEy34BiftenFnrrFyXLcyt/VQRUlO51H5BCQjAHmp8tquZ+8h6HgC2CGyyS+xX6oWzThcOXHwnTiBY/RPjdfHgJp8AdgXRQoY1u+hard260jhhzEyBNE9IKt+O/Abt0cDkuZ4DGn1c/4BdowouA5SqOdeF465OrezzPlXx2zj7jXZvFeuOQNoZIz+ZcJU6Vaba/ru6IVGTzx5dE+3tvUEqhJQFQauqPvDlcR4//sseV4jpkvR/YpWW20f7OKX9wCL1NHb2Y5HfBAICAO837/TpCiJsXWE04ss8xCaSGZ+VD7uVgqgVZFmli8KFiT5npHpRF5mgSlWLbFNZdXlp+8R1Ffrhn45e74L0HUPsuQYFayMqKqK+uKzsfmqAxv5pCdRZZhHt+WrWy4jxzDNOwVzRIcHvd2UfQW8YlwlVn4763mpflzlxwqqgDo6zXdaDywB0Ssy44odqLxGPGMpkrmKpcOMenoaAv1X0a+3FaIpNN5ghTil3b7apOtEEsU52d3tuXQHRxGrMcshLrGlb1d1jK9tLO1JJ7X6YyBMgHQRpA7t+JpKXVOg0y0MaOfrF5h3jlDWY= X-Forefront-Antispam-Report: 
CIP:12.22.5.235;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:InfoNoRecords;CAT:NONE;SFS:(13230016)(4636009)(136003)(396003)(346002)(39860400002)(376002)(40470700004)(36840700001)(46966006)(2906002)(86362001)(70586007)(4326008)(8676002)(6666004)(478600001)(5660300002)(70206006)(81166007)(40460700003)(7696005)(8936002)(82740400003)(6636002)(316002)(54906003)(356005)(336012)(26005)(41300700001)(186003)(1076003)(83380400001)(36860700001)(2616005)(36756003)(426003)(40480700001)(82310400005)(47076005)(110136005)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 30 Jun 2022 10:27:11.2165 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: e00ea3fb-dca3-48ce-6c28-08da5a8317dc X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[12.22.5.235];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT038.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6164 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Handle async error events and health/recovery flow to safely stop the tracker upon error scenarios. 

Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c | 61 +++++++++++++++++++++++++++++++++++--
 drivers/vfio/pci/mlx5/cmd.h |  2 ++
 2 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index fa9ddd926500..3e92b4d92be2 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -70,6 +70,13 @@ int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 	return 0;
 }
 
+static void set_tracker_error(struct mlx5vf_pci_core_device *mvdev)
+{
+	/* Mark the tracker under an error and wake it up if it's running */
+	mvdev->tracker.is_err = true;
+	complete(&mvdev->tracker_comp);
+}
+
 static int mlx5fv_vf_event(struct notifier_block *nb,
 			   unsigned long event, void *data)
 {
@@ -100,6 +107,8 @@ void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev)
 	if (!mvdev->migrate_cap)
 		return;
 
+	/* Must be done outside the lock to let it progress */
+	set_tracker_error(mvdev);
 	mutex_lock(&mvdev->state_mutex);
 	mlx5vf_disable_fds(mvdev);
 	_mlx5vf_free_page_tracker_resources(mvdev);
@@ -619,6 +628,47 @@ static void mlx5vf_destroy_cq(struct mlx5_core_dev *mdev,
 	mlx5_db_free(mdev, &cq->db);
 }
 
+static void mlx5vf_cq_event(struct mlx5_core_cq *mcq, enum mlx5_event type)
+{
+	if (type != MLX5_EVENT_TYPE_CQ_ERROR)
+		return;
+
+	set_tracker_error(container_of(mcq, struct mlx5vf_pci_core_device,
+				       tracker.cq.mcq));
+}
+
+static int mlx5vf_event_notifier(struct notifier_block *nb, unsigned long type,
+				 void *data)
+{
+	struct mlx5_vhca_page_tracker *tracker =
+		mlx5_nb_cof(nb, struct mlx5_vhca_page_tracker, nb);
+	struct mlx5vf_pci_core_device *mvdev = container_of(
+		tracker, struct mlx5vf_pci_core_device, tracker);
+	struct mlx5_eqe *eqe = data;
+	u8 event_type = (u8)type;
+	u8 queue_type;
+	int qp_num;
+
+	switch (event_type) {
+	case MLX5_EVENT_TYPE_WQ_CATAS_ERROR:
+	case MLX5_EVENT_TYPE_WQ_ACCESS_ERROR:
+	case MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR:
+		queue_type = eqe->data.qp_srq.type;
+		if (queue_type != MLX5_EVENT_QUEUE_TYPE_QP)
+			break;
+		qp_num = be32_to_cpu(eqe->data.qp_srq.qp_srq_n) & 0xffffff;
+		if (qp_num != tracker->host_qp->qpn &&
+		    qp_num != tracker->fw_qp->qpn)
+			break;
+		set_tracker_error(mvdev);
+		break;
+	default:
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
 static void mlx5vf_cq_complete(struct mlx5_core_cq *mcq,
 			       struct mlx5_eqe *eqe)
 {
@@ -680,6 +730,7 @@ static int mlx5vf_create_cq(struct mlx5_core_dev *mdev,
 	pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas);
 	mlx5_fill_page_frag_array(&cq->buf.frag_buf, pas);
 	cq->mcq.comp = mlx5vf_cq_complete;
+	cq->mcq.event = mlx5vf_cq_event;
 	err = mlx5_core_create_cq(mdev, &cq->mcq, in, inlen, out, sizeof(out));
 	if (err)
 		goto err_vec;
@@ -1014,6 +1065,7 @@ _mlx5vf_free_page_tracker_resources(struct mlx5vf_pci_core_device *mvdev)
 
 	WARN_ON(mvdev->mdev_detach);
 
+	mlx5_eq_notifier_unregister(mdev, &tracker->nb);
 	mlx5vf_cmd_destroy_tracker(mdev, tracker->id);
 	mlx5vf_destroy_qp(mdev, tracker->fw_qp);
 	mlx5vf_free_qp_recv_resources(mdev, tracker->host_qp);
@@ -1127,6 +1179,8 @@ int mlx5vf_start_page_tracker(struct vfio_device *vdev,
 	if (err)
 		goto err_activate;
 
+	MLX5_NB_INIT(&tracker->nb, mlx5vf_event_notifier, NOTIFY_ANY);
+	mlx5_eq_notifier_register(mdev, &tracker->nb);
 	*page_size = host_qp->tracked_page_size;
 	mvdev->log_active = true;
 	mlx5vf_state_mutex_unlock(mvdev);
@@ -1273,7 +1327,8 @@ int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
 		goto end;
 
 	tracker->status = MLX5_PAGE_TRACK_STATE_REPORTING;
-	while (tracker->status == MLX5_PAGE_TRACK_STATE_REPORTING) {
+	while (tracker->status == MLX5_PAGE_TRACK_STATE_REPORTING &&
+	       !tracker->is_err) {
 		poll_err = mlx5vf_cq_poll_one(cq, tracker->host_qp, dirty,
 					      &tracker->status);
 		if (poll_err == CQ_EMPTY) {
@@ -1294,8 +1349,10 @@ int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
 	}
 
 	if (tracker->status == MLX5_PAGE_TRACK_STATE_ERROR)
-		err = -EIO;
+		tracker->is_err = true;
 
+	if (tracker->is_err)
+		err = -EIO;
 end:
 	mlx5vf_state_mutex_unlock(mvdev);
 	return err;
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index fa1f9ab4d3d0..8b0ae40c620c 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -82,10 +82,12 @@ struct mlx5_vhca_qp {
 struct mlx5_vhca_page_tracker {
 	u32 id;
 	u32 pdn;
+	u8 is_err:1;
 	struct mlx5_uars_page *uar;
 	struct mlx5_vhca_cq cq;
 	struct mlx5_vhca_qp *host_qp;
 	struct mlx5_vhca_qp *fw_qp;
+	struct mlx5_nb nb;
 	int status;
 };

From patchwork Thu Jun 30 10:25:45 2022
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 12901625
From: Yishai Hadas
Subject: [PATCH vfio 13/13] vfio/mlx5: Set the driver DMA logging callbacks
Date: Thu, 30 Jun 2022 13:25:45 +0300
Message-ID: <20220630102545.18005-14-yishaih@nvidia.com>
In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com>
References: <20220630102545.18005-1-yishaih@nvidia.com>
X-Mailing-List: kvm@vger.kernel.org

Now that everything is ready, set the driver DMA logging callbacks if
supported by the device.

Signed-off-by: Yishai Hadas
---
 drivers/vfio/pci/mlx5/cmd.c  | 5 ++++-
 drivers/vfio/pci/mlx5/cmd.h  | 3 ++-
 drivers/vfio/pci/mlx5/main.c | 9 ++++++++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 3e92b4d92be2..c604b70437a5 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -126,7 +126,8 @@ void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev)
 }
 
 void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
-			       const struct vfio_migration_ops *mig_ops)
+			       const struct vfio_migration_ops *mig_ops,
+			       const struct vfio_log_ops *log_ops)
 {
 	struct pci_dev *pdev = mvdev->core_device.pdev;
 	int ret;
@@ -169,6 +170,8 @@ void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
 		VFIO_MIGRATION_P2P;
 	mvdev->core_device.vdev.mig_ops = mig_ops;
 	init_completion(&mvdev->tracker_comp);
+	if (MLX5_CAP_GEN(mvdev->mdev, adv_virtualization))
+		mvdev->core_device.vdev.log_ops = log_ops;
 
 end:
 	mlx5_vf_put_core_dev(mvdev->mdev);
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 8b0ae40c620c..921d5720a1e5 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -118,7 +118,8 @@ int mlx5vf_cmd_resume_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod);
 int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 					  size_t *state_size);
 void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
-			       const struct vfio_migration_ops *mig_ops);
+			       const struct vfio_migration_ops *mig_ops,
+			       const struct vfio_log_ops *log_ops);
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev);
 void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
 int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index a9b63d15c5d3..759a5f5f7b3f 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -579,6 +579,12 @@ static const struct vfio_migration_ops mlx5vf_pci_mig_ops = {
 	.migration_get_state = mlx5vf_pci_get_device_state,
 };
 
+static const struct vfio_log_ops mlx5vf_pci_log_ops = {
+	.log_start = mlx5vf_start_page_tracker,
+	.log_stop = mlx5vf_stop_page_tracker,
+	.log_read_and_clear = mlx5vf_tracker_read_and_clear,
+};
+
 static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.name = "mlx5-vfio-pci",
 	.open_device = mlx5vf_pci_open_device,
@@ -602,7 +608,8 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
 	if (!mvdev)
 		return -ENOMEM;
 	vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops);
-	mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops);
+	mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops,
+				  &mlx5vf_pci_log_ops);
 	dev_set_drvdata(&pdev->dev, &mvdev->core_device);
 	ret = vfio_pci_core_register_device(&mvdev->core_device);
 	if (ret)