From patchwork Fri Dec 21 14:40:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuval Shaia
X-Patchwork-Id: 10740521
From: Yuval Shaia
To: yuval.shaia@oracle.com, marcel.apfelbaum@gmail.com,
 dmitry.fleytman@gmail.com, jasowang@redhat.com, eblake@redhat.com,
 armbru@redhat.com, pbonzini@redhat.com, qemu-devel@nongnu.org,
 shamir.rabinovitch@oracle.com, cohuck@redhat.com
Date: Fri, 21 Dec 2018 16:40:14 +0200
Message-Id: <20181221144037.10290-1-yuval.shaia@oracle.com>
X-Mailer: git-send-email 2.17.2
Subject: [Qemu-devel] [PATCH v9 00/23] Add support for RDMA MAD

Hi all.

This is a major enhancement to the pvrdma device to allow it to work with
state-of-the-art applications such as MPI.

As described in patch #5, MAD packets are management packets that are used
for many purposes, including but not limited to the communication layer above
the IB verbs API.

Patch 1 adds a new external executable (under contrib) that aims to address a
specific limitation in the RDMA userspace MAD stack.

This patch-set mainly presents the MAD enhancement, but while working on it I
came across some bugs and enhancements that needed to be implemented before
doing any MAD coding. This is the role of patches 2 to 4, 7 to 9 and 15 to
17.

Patches 6 and 18 are cosmetic changes which, while not relevant to this
patch-set, are still introduced with it since (at least for 6) they are hard
to decouple.

Patches 12 to 15 couple the pvrdma device with the vmxnet3 device, as this is
the configuration enforced by the pvrdma driver in the guest: a vmxnet3 device
in function 0 and a pvrdma device in function 1 of the same PCI slot. Patch 12
moves the needed code from the vmxnet3 device to a new header file that can be
used by the pvrdma code, while patches 13 to 15 make use of it.

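To illustrate the slot/function pairing described above, here is a minimal
sketch of the standard PCI devfn encoding (5-bit slot, 3-bit function). It is
not taken from the patches: the helper names func0_devfn_of() and
is_pvrdma_function() are hypothetical, and the macros are redefined locally so
that the snippet is self-contained.

```c
#include <stdint.h>

/* Standard PCI devfn layout: bits 7..3 = slot, bits 2..0 = function. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)

/*
 * Hypothetical helper: return the devfn of function 0 in the same slot.
 * In the configuration described in the cover letter, this is where the
 * vmxnet3 (Ethernet) function is expected to live.
 */
static inline uint8_t func0_devfn_of(uint8_t devfn)
{
    return (uint8_t)PCI_DEVFN(PCI_SLOT(devfn), 0);
}

/*
 * Hypothetical check: the pvrdma device is expected at function 1.
 * This only inspects the function number; verifying that function 0
 * really is a vmxnet3 device is outside the scope of this sketch.
 */
static inline int is_pvrdma_function(uint8_t devfn)
{
    return PCI_FUNC(devfn) == 1;
}
```

For example, a pvrdma device at slot 0x10, function 1 has devfn 0x81, and its
sibling Ethernet function sits at devfn 0x80.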
Along with this patch-set, a parallel patch has been posted to libvirt to
apply the change needed there as part of the process implemented in patches
10 and 11. This change is needed so that the guest is able to configure any
IP on the Ethernet function of the pvrdma device.
https://www.redhat.com/archives/libvir-list/2018-November/msg00135.html

Since we maintain external resources such as GIDs in the host GID table, we
need to do some cleanup before going down. This is the job of patches 19 and
20.

Patches 21 to 22 contain fixes for bugs detected during the work on
processing the VM shutdown notification.

Patch 23 fixes documentation.

Optional second review is welcome for:
  [10] qapi: Define new QMP message for pvrdma

v1 -> v2:
    * Fix compilation issue detected when compiling for mingw.
    * Address comment from Eric Blake re version of QEMU in json message.
    * Fix example from QMP message in json file.
    * Fix case where a VM tries to remove an invalid GID from the GID table.
    * rdmacm-mux: Clean up entries in socket-gids table when socket is
      closed.
    * Clean up resources (GIDs, QPs etc.) when VM goes down.

v2 -> v3:
    * Address comment from Cornelia Huck for patch #19.
    * Add some R-Bs from Marcel Apfelbaum and Dmitry Fleytman.
    * Update docs/pvrdma.txt with the changes made by this patchset.
    * Address comments from Shamir Rabinovitch for UMAD multiplexer.

v3 -> v4:
    * Address some comments from Marcel.
    * Add some R-Bs from Cornelia Huck and Shamir Rabinovitch.

v4 -> v5:
    * Add one more patch that deletes code that performs unneeded (and
      buggy) cleanup of resources during VM shutdown.
    * Fix race condition that might happen when a MAD response arrives
      before the ack for the send is received.
    * Base the qapi patch on Eric Blake's patch "qapi: Reduce Makefile
      boilerplate" per Markus Armbruster's suggestion.
      Please note that this will cause a build error until Eric's patch is
      applied.
    * Add some debug log messages to rdmacm-mux.

v5 -> v6:
    * Add some R-Bs from Marcel.
    * Set hop_limit to 0xFF in mad_send.
    * Accept comment from Marcel re clearing response in execute_command.
    * Change version for QMP message per Eric Blake's comment.
    * Add some notes to docs/pvrdma.txt as suggested by Marcel.
    * In rdmacm-mux, do not default to rxe0.

v6 -> v7:
    * Fix formatting (checkpatch) in patch #17.
    * Undo wrong setting done in patch #17 (found after testing with
      Prasad's patchset).
    * Add Marcel's r-b for patches #11 and #17.

v7 -> v8:
    * Accept Eric's comments for patches 10 and 11.

v8 -> v9:
    * Resolve conflict caused by the change made to the QMP shutdown message
      in commit ecd7a0d5bbf ("qmp: Add reason to SHUTDOWN and RESET events").
    * s/---/---- in patch #1

Thanks,
Yuval

Yuval Shaia (23):
  contrib/rdmacm-mux: Add implementation of RDMA User MAD multiplexer
  hw/rdma: Add ability to force notification without re-arm
  hw/rdma: Return qpn 1 if ibqp is NULL
  hw/rdma: Abort send-op if fail to create addr handler
  hw/rdma: Add support for MAD packets
  hw/pvrdma: Make function reset_device return void
  hw/pvrdma: Make default pkey 0xFFFF
  hw/pvrdma: Set the correct opcode for recv completion
  hw/pvrdma: Set the correct opcode for send completion
  qapi: Define new QMP message for pvrdma
  hw/pvrdma: Add support to allow guest to configure GID table
  vmxnet3: Move some definitions to header file
  hw/pvrdma: Make sure PCI function 0 is vmxnet3
  hw/rdma: Initialize node_guid from vmxnet3 mac address
  hw/pvrdma: Make device state depend on Ethernet function state
  hw/pvrdma: Fill all CQE fields
  hw/pvrdma: Fill error code in command's response
  hw/rdma: Remove unneeded code that handles more that one port
  vl: Introduce shutdown_notifiers
  hw/pvrdma: Clean device's resource when system is shutdown
  hw/rdma: Do not use bitmap_zero_extend to free bitmap
  hw/rdma: Do not call rdma_backend_del_gid on an empty gid
  docs: Update pvrdma device documentation

 MAINTAINERS                      |   2 +
 Makefile                         |   3 +
 Makefile.objs                    |   4 +-
 contrib/rdmacm-mux/Makefile.objs |   4 +
 contrib/rdmacm-mux/main.c        | 798 +++++++++++++++++++++++++++++++
 contrib/rdmacm-mux/rdmacm-mux.h  |  61 +++
 docs/pvrdma.txt                  | 126 ++++-
 hw/net/vmxnet3.c                 | 116 +----
 hw/net/vmxnet3_defs.h            | 133 ++++++
 hw/rdma/rdma_backend.c           | 515 +++++++++++++++++---
 hw/rdma/rdma_backend.h           |  28 +-
 hw/rdma/rdma_backend_defs.h      |  19 +-
 hw/rdma/rdma_rm.c                | 120 ++++-
 hw/rdma/rdma_rm.h                |  17 +-
 hw/rdma/rdma_rm_defs.h           |  21 +-
 hw/rdma/rdma_utils.h             |  25 +
 hw/rdma/vmw/pvrdma.h             |  10 +-
 hw/rdma/vmw/pvrdma_cmd.c         | 225 ++++-----
 hw/rdma/vmw/pvrdma_main.c        |  61 ++-
 hw/rdma/vmw/pvrdma_qp_ops.c      |  62 ++-
 include/sysemu/sysemu.h          |   1 +
 qapi/qapi-schema.json            |   1 +
 qapi/rdma.json                   |  38 ++
 vl.c                             |  15 +-
 24 files changed, 2022 insertions(+), 383 deletions(-)
 create mode 100644 contrib/rdmacm-mux/Makefile.objs
 create mode 100644 contrib/rdmacm-mux/main.c
 create mode 100644 contrib/rdmacm-mux/rdmacm-mux.h
 create mode 100644 hw/net/vmxnet3_defs.h
 create mode 100644 qapi/rdma.json