From patchwork Wed Jul 24 23:38:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741441
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Dave Ertman, Shiraz Saleem, Tatyana Nikolova
Subject: [RFC PATCH 01/25] iidc/ice/irdma: Update IDC to support multiple consumers
Date: Wed, 24 Jul 2024 18:38:53 -0500
Message-Id: <20240724233917.704-2-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Dave Ertman

To support RDMA for the E2000 product, the idpf driver will need to use the IDC interface with the irdma auxiliary driver, thus
becoming a second consumer of it. This requires that the IDC be updated to support multiple consumers. Exported symbols no longer make sense, because they would require every core driver (ice/idpf) that can interface with the irdma auxiliary driver to be loaded even when the hardware for that driver is not present.

To address this, implement an ops struct: a universal set of bare function pointers that each core driver populates for the irdma auxiliary driver to call.

Previously, the ice driver simply exported its entire pf struct to the auxiliary driver. Since each core driver will have its own, different pf struct, implement a universal struct that all core drivers can export to the auxiliary driver through the probe call.

The iidc.h header file is split into two files. The first, idc_rdma.h, hosts all of the generic header information needed for RDMA support in the auxiliary device. The second, iidc_rdma.h, contains the elements specific to Intel drivers supporting RDMA. This is primarily the definition of a new struct that is assigned to the new generic opaque idc_priv element of the idc_rdma_core_dev_info struct.

Update ice and irdma to conform with the new IIDC interface definitions.

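The design described above can be illustrated with a minimal userspace sketch. It is not the code from this patch: every name prefixed with demo_ is hypothetical, and the real structs (idc_rdma_core_dev_info, idc_rdma_core_ops, iidc_rdma_priv_dev_info) carry more fields. The sketch only shows the shape of the idea: each core driver fills in a table of function pointers and hangs its own data off an opaque pointer, so the consumer never links against a core driver's symbols.

```c
#include <assert.h>
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins, not the kernel structs. */
enum demo_reset_type { DEMO_FUNC_RESET, DEMO_DEV_RESET };

struct demo_core_dev_info;

/* Generic ops table that each core driver populates. */
struct demo_core_ops {
	int (*request_reset)(struct demo_core_dev_info *cdev,
			     enum demo_reset_type type);
};

/* Generic struct a core driver hands to the consumer at probe time. */
struct demo_core_dev_info {
	const struct demo_core_ops *ops;
	void *priv;	/* opaque, core-driver-specific data */
};

/* Core driver A: accepts both reset types and counts them. */
struct demo_a_priv { int resets_seen; };

int demo_a_request_reset(struct demo_core_dev_info *cdev,
			 enum demo_reset_type type)
{
	struct demo_a_priv *p = cdev->priv;

	if (type != DEMO_FUNC_RESET && type != DEMO_DEV_RESET)
		return -1;
	p->resets_seen++;
	return 0;
}

const struct demo_core_ops demo_a_ops = {
	.request_reset = demo_a_request_reset,
};

/* Core driver B: only supports a function-level reset. */
struct demo_b_priv { int last_type; };

int demo_b_request_reset(struct demo_core_dev_info *cdev,
			 enum demo_reset_type type)
{
	struct demo_b_priv *p = cdev->priv;

	if (type != DEMO_FUNC_RESET)
		return -1;
	p->last_type = type;
	return 0;
}

const struct demo_core_ops demo_b_ops = {
	.request_reset = demo_b_request_reset,
};

/*
 * Consumer side: it sees only the generic struct and calls through
 * the ops table, so it works with whichever core driver is present.
 */
int demo_consumer_reset(struct demo_core_dev_info *cdev,
			enum demo_reset_type type)
{
	assert(cdev != NULL);
	if (!cdev->ops || !cdev->ops->request_reset)
		return -1;
	return cdev->ops->request_reset(cdev, type);
}
```

A caller would build one demo_core_dev_info per core driver, pointing ops at that driver's table and priv at its private data, then pass it to demo_consumer_reset(). Because the consumer resolves the call at run time through the pointer, a core driver whose hardware is absent never has to be loaded.
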
Signed-off-by: Dave Ertman Co-developed-by: Mustafa Ismail Signed-off-by: Mustafa Ismail Co-developed-by: Shiraz Saleem Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/main.c | 110 +++++----- drivers/infiniband/hw/irdma/main.h | 3 +- drivers/infiniband/hw/irdma/osdep.h | 4 +- drivers/net/ethernet/intel/ice/devlink/devlink.c | 41 +++- drivers/net/ethernet/intel/ice/ice.h | 6 +- drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 46 ++++- drivers/net/ethernet/intel/ice/ice_dcb_lib.h | 4 + drivers/net/ethernet/intel/ice/ice_ethtool.c | 8 +- drivers/net/ethernet/intel/ice/ice_idc.c | 245 ++++++++++++++--------- drivers/net/ethernet/intel/ice/ice_idc_int.h | 5 +- drivers/net/ethernet/intel/ice/ice_main.c | 18 +- include/linux/net/intel/idc_rdma.h | 138 +++++++++++++ include/linux/net/intel/iidc.h | 107 ---------- include/linux/net/intel/iidc_rdma.h | 61 ++++++ 14 files changed, 512 insertions(+), 284 deletions(-) create mode 100644 include/linux/net/intel/idc_rdma.h delete mode 100644 include/linux/net/intel/iidc.h create mode 100644 include/linux/net/intel/iidc_rdma.h diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c index 3f13200..9b6f1d8 100644 --- a/drivers/infiniband/hw/irdma/main.c +++ b/drivers/infiniband/hw/irdma/main.c @@ -1,7 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2015 - 2021 Intel Corporation */ #include "main.h" -#include "../../../net/ethernet/intel/ice/ice.h" MODULE_ALIAS("i40iw"); MODULE_AUTHOR("Intel Corporation, "); @@ -61,7 +60,7 @@ static void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev) } static void irdma_fill_qos_info(struct irdma_l2params *l2params, - struct iidc_qos_params *qos_info) + struct iidc_rdma_qos_params *qos_info) { int i; @@ -85,12 +84,13 @@ static void irdma_fill_qos_info(struct irdma_l2params *l2params, } } -static void irdma_iidc_event_handler(struct ice_pf *pf, struct iidc_event *event) +static void 
irdma_idc_event_handler(struct idc_rdma_core_dev_info *cdev_info, + struct idc_rdma_event *event) { - struct irdma_device *iwdev = dev_get_drvdata(&pf->adev->dev); + struct irdma_device *iwdev = dev_get_drvdata(&cdev_info->adev->dev); struct irdma_l2params l2params = {}; - if (*event->type & BIT(IIDC_EVENT_AFTER_MTU_CHANGE)) { + if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_MTU_CHANGE)) { ibdev_dbg(&iwdev->ibdev, "CLNT: new MTU = %d\n", iwdev->netdev->mtu); if (iwdev->vsi.mtu != iwdev->netdev->mtu) { l2params.mtu = iwdev->netdev->mtu; @@ -98,25 +98,26 @@ static void irdma_iidc_event_handler(struct ice_pf *pf, struct iidc_event *event irdma_log_invalid_mtu(l2params.mtu, &iwdev->rf->sc_dev); irdma_change_l2params(&iwdev->vsi, &l2params); } - } else if (*event->type & BIT(IIDC_EVENT_BEFORE_TC_CHANGE)) { + } else if (*event->type & BIT(IDC_RDMA_EVENT_BEFORE_TC_CHANGE)) { if (iwdev->vsi.tc_change_pending) return; irdma_prep_tc_change(iwdev); - } else if (*event->type & BIT(IIDC_EVENT_AFTER_TC_CHANGE)) { - struct iidc_qos_params qos_info = {}; + } else if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_TC_CHANGE)) { + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; if (!iwdev->vsi.tc_change_pending) return; l2params.tc_changed = true; ibdev_dbg(&iwdev->ibdev, "CLNT: TC Change\n"); - ice_get_qos_params(pf, &qos_info); - irdma_fill_qos_info(&l2params, &qos_info); + + irdma_fill_qos_info(&l2params, &idc_priv->qos_info); if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) - iwdev->dcb_vlan_mode = qos_info.num_tc > 1 && !l2params.dscp_mode; + iwdev->dcb_vlan_mode = + l2params.num_tc > 1 && !l2params.dscp_mode; irdma_change_l2params(&iwdev->vsi, &l2params); - } else if (*event->type & BIT(IIDC_EVENT_CRIT_ERR)) { + } else if (*event->type & BIT(IDC_RDMA_EVENT_CRIT_ERR)) { ibdev_warn(&iwdev->ibdev, "ICE OICR event notification: oicr = 0x%08x\n", event->reg); if (event->reg & IRDMAPFINT_OICR_PE_CRITERR_M) { @@ -151,10 +152,10 @@ static void 
irdma_iidc_event_handler(struct ice_pf *pf, struct iidc_event *event */ static void irdma_request_reset(struct irdma_pci_f *rf) { - struct ice_pf *pf = rf->cdev; + struct idc_rdma_core_dev_info *cdev_info = rf->cdev; ibdev_warn(&rf->iwdev->ibdev, "Requesting a reset\n"); - ice_rdma_request_reset(pf, IIDC_PFR); + cdev_info->ops->request_reset(rf->cdev, IDC_FUNC_RESET); } /** @@ -166,14 +167,15 @@ static int irdma_lan_register_qset(struct irdma_sc_vsi *vsi, struct irdma_ws_node *tc_node) { struct irdma_device *iwdev = vsi->back_vsi; - struct ice_pf *pf = iwdev->rf->cdev; + struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; struct iidc_rdma_qset_params qset = {}; int ret; qset.qs_handle = tc_node->qs_handle; qset.tc = tc_node->traffic_class; qset.vport_id = vsi->vsi_idx; - ret = ice_add_rdma_qset(pf, &qset); + ret = idc_priv->priv_ops->alloc_res(cdev_info, &qset); if (ret) { ibdev_dbg(&iwdev->ibdev, "WS: LAN alloc_res for rdma qset failed.\n"); return ret; @@ -194,7 +196,8 @@ static void irdma_lan_unregister_qset(struct irdma_sc_vsi *vsi, struct irdma_ws_node *tc_node) { struct irdma_device *iwdev = vsi->back_vsi; - struct ice_pf *pf = iwdev->rf->cdev; + struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; struct iidc_rdma_qset_params qset = {}; qset.qs_handle = tc_node->qs_handle; @@ -202,40 +205,48 @@ static void irdma_lan_unregister_qset(struct irdma_sc_vsi *vsi, qset.vport_id = vsi->vsi_idx; qset.teid = tc_node->l2_sched_node_id; - if (ice_del_rdma_qset(pf, &qset)) + if (idc_priv->priv_ops->free_res(cdev_info, &qset)) ibdev_dbg(&iwdev->ibdev, "WS: LAN free_res for rdma qset failed.\n"); } static void irdma_remove(struct auxiliary_device *aux_dev) { - struct iidc_auxiliary_dev *iidc_adev = container_of(aux_dev, - struct iidc_auxiliary_dev, - adev); - struct ice_pf *pf = iidc_adev->pf; + struct 
idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; struct irdma_device *iwdev = auxiliary_get_drvdata(aux_dev); + idc_priv->priv_ops->update_vport_filter(cdev_info, + iwdev->vsi_num, false); irdma_ib_unregister_device(iwdev); - ice_rdma_update_vsi_filter(pf, iwdev->vsi_num, false); - pr_debug("INIT: Gen2 PF[%d] device remove success\n", PCI_FUNC(pf->pdev->devfn)); + pr_debug("INIT: Gen2 PF[%d] device remove success\n", PCI_FUNC(cdev_info->pdev->devfn)); } -static void irdma_fill_device_info(struct irdma_device *iwdev, struct ice_pf *pf, - struct ice_vsi *vsi) +static void irdma_fill_device_info(struct irdma_device *iwdev, + struct idc_rdma_core_dev_info *cdev_info) { + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; struct irdma_pci_f *rf = iwdev->rf; - rf->cdev = pf; + rf->sc_dev.hw = &rf->hw; + rf->iwdev = iwdev; + rf->cdev = cdev_info; + rf->hw.hw_addr = idc_priv->hw_addr; + rf->pcidev = cdev_info->pdev; + rf->hw.device = &rf->pcidev->dev; + rf->msix_count = cdev_info->msix_count; + rf->pf_id = idc_priv->pf_id; + rf->msix_entries = cdev_info->msix_entries; + rf->gen_ops.register_qset = irdma_lan_register_qset; rf->gen_ops.unregister_qset = irdma_lan_unregister_qset; - rf->hw.hw_addr = pf->hw.hw_addr; - rf->pcidev = pf->pdev; - rf->msix_count = pf->num_rdma_msix; - rf->pf_id = pf->hw.pf_id; - rf->msix_entries = &pf->msix_entries[pf->rdma_base_vector]; - rf->default_vsi.vsi_idx = vsi->vsi_num; - rf->protocol_used = pf->rdma_mode & IIDC_RDMA_PROTOCOL_ROCEV2 ? - IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY; + + rf->default_vsi.vsi_idx = idc_priv->vport_id; + rf->protocol_used = + cdev_info->rdma_protocol == IDC_RDMA_PROTOCOL_ROCEV2 ? 
+ IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY; rf->rdma_ver = IRDMA_GEN_2; rf->rsrc_profile = IRDMA_HMC_PROFILE_DEFAULT; rf->rst_to = IRDMA_RST_TIMEOUT_HZ; @@ -243,8 +254,9 @@ static void irdma_fill_device_info(struct irdma_device *iwdev, struct ice_pf *pf rf->limits_sel = 7; rf->iwdev = iwdev; mutex_init(&iwdev->ah_tbl_lock); - iwdev->netdev = vsi->netdev; - iwdev->vsi_num = vsi->vsi_num; + + iwdev->netdev = idc_priv->netdev; + iwdev->vsi_num = idc_priv->vport_id; iwdev->init_state = INITIAL_STATE; iwdev->roce_cwnd = IRDMA_ROCE_CWND_DEFAULT; iwdev->roce_ackcreds = IRDMA_ROCE_ACKCREDS_DEFAULT; @@ -256,19 +268,15 @@ static void irdma_fill_device_info(struct irdma_device *iwdev, struct ice_pf *pf static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id) { - struct iidc_auxiliary_dev *iidc_adev = container_of(aux_dev, - struct iidc_auxiliary_dev, - adev); - struct ice_pf *pf = iidc_adev->pf; - struct ice_vsi *vsi = ice_get_main_vsi(pf); - struct iidc_qos_params qos_info = {}; + struct idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; struct irdma_device *iwdev; struct irdma_pci_f *rf; struct irdma_l2params l2params = {}; int err; - if (!vsi) - return -EIO; iwdev = ib_alloc_device(irdma_device, ibdev); if (!iwdev) return -ENOMEM; @@ -278,7 +286,7 @@ static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_ return -ENOMEM; } - irdma_fill_device_info(iwdev, pf, vsi); + irdma_fill_device_info(iwdev, cdev_info); rf = iwdev->rf; err = irdma_ctrl_init_hw(rf); @@ -286,8 +294,7 @@ static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_ goto err_ctrl_init; l2params.mtu = iwdev->netdev->mtu; - ice_get_qos_params(pf, &qos_info); - irdma_fill_qos_info(&l2params, &qos_info); + 
irdma_fill_qos_info(&l2params, &idc_priv->qos_info); if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) iwdev->dcb_vlan_mode = l2params.num_tc > 1 && !l2params.dscp_mode; @@ -299,7 +306,8 @@ static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_ if (err) goto err_ibreg; - ice_rdma_update_vsi_filter(pf, iwdev->vsi_num, true); + idc_priv->priv_ops->update_vport_filter(cdev_info, iwdev->vsi_num, + true); ibdev_dbg(&iwdev->ibdev, "INIT: Gen2 PF[%d] device probe success\n", PCI_FUNC(rf->pcidev->devfn)); auxiliary_set_drvdata(aux_dev, iwdev); @@ -325,13 +333,13 @@ static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_ MODULE_DEVICE_TABLE(auxiliary, irdma_auxiliary_id_table); -static struct iidc_auxiliary_drv irdma_auxiliary_drv = { +static struct idc_rdma_core_auxiliary_drv irdma_auxiliary_drv = { .adrv = { .id_table = irdma_auxiliary_id_table, .probe = irdma_probe, .remove = irdma_remove, }, - .event_handler = irdma_iidc_event_handler, + .event_handler = irdma_idc_event_handler, }; static int __init irdma_init_module(void) diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 9f0ed6e..e81f375 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -29,7 +29,8 @@ #include #endif #include -#include +#include +#include #include #include #include diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h index e1e3d3a..b41134b 100644 --- a/drivers/infiniband/hw/irdma/osdep.h +++ b/drivers/infiniband/hw/irdma/osdep.h @@ -5,7 +5,9 @@ #include #include -#include +#include +#include + #include #include diff --git a/drivers/net/ethernet/intel/ice/devlink/devlink.c b/drivers/net/ethernet/intel/ice/devlink/devlink.c index c4b6965..0eb84a0 100644 --- a/drivers/net/ethernet/intel/ice/devlink/devlink.c +++ b/drivers/net/ethernet/intel/ice/devlink/devlink.c @@ -1286,8 +1286,14 @@ static int ice_devlink_reinit_up(struct ice_pf 
*pf) struct devlink_param_gset_ctx *ctx) { struct ice_pf *pf = devlink_priv(devlink); + struct idc_rdma_core_dev_info *cdev; - ctx->val.vbool = pf->rdma_mode & IIDC_RDMA_PROTOCOL_ROCEV2 ? true : false; + cdev = pf->cdev_info; + if (!cdev) + return 0; + + ctx->val.vbool = cdev->rdma_protocol & IDC_RDMA_PROTOCOL_ROCEV2 ? + true : false; return 0; } @@ -1297,19 +1303,24 @@ static int ice_devlink_enable_roce_set(struct devlink *devlink, u32 id, struct netlink_ext_ack *extack) { struct ice_pf *pf = devlink_priv(devlink); + struct idc_rdma_core_dev_info *cdev; bool roce_ena = ctx->val.vbool; int ret; + cdev = pf->cdev_info; + if (!cdev) + return -EINVAL; + if (!roce_ena) { ice_unplug_aux_dev(pf); - pf->rdma_mode &= ~IIDC_RDMA_PROTOCOL_ROCEV2; + cdev->rdma_protocol &= ~IDC_RDMA_PROTOCOL_ROCEV2; return 0; } - pf->rdma_mode |= IIDC_RDMA_PROTOCOL_ROCEV2; + cdev->rdma_protocol |= IDC_RDMA_PROTOCOL_ROCEV2; ret = ice_plug_aux_dev(pf); if (ret) - pf->rdma_mode &= ~IIDC_RDMA_PROTOCOL_ROCEV2; + cdev->rdma_protocol &= ~IDC_RDMA_PROTOCOL_ROCEV2; return ret; } @@ -1320,11 +1331,16 @@ static int ice_devlink_enable_roce_set(struct devlink *devlink, u32 id, struct netlink_ext_ack *extack) { struct ice_pf *pf = devlink_priv(devlink); + struct idc_rdma_core_dev_info *cdev; + + cdev = pf->cdev_info; + if (!cdev) + return -EINVAL; if (!test_bit(ICE_FLAG_RDMA_ENA, pf->flags)) return -EOPNOTSUPP; - if (pf->rdma_mode & IIDC_RDMA_PROTOCOL_IWARP) { + if (cdev->rdma_protocol & IDC_RDMA_PROTOCOL_IWARP) { NL_SET_ERR_MSG_MOD(extack, "iWARP is currently enabled. 
This device cannot enable iWARP and RoCEv2 simultaneously"); return -EOPNOTSUPP; } @@ -1338,7 +1354,8 @@ static int ice_devlink_enable_roce_set(struct devlink *devlink, u32 id, { struct ice_pf *pf = devlink_priv(devlink); - ctx->val.vbool = pf->rdma_mode & IIDC_RDMA_PROTOCOL_IWARP; + ctx->val.vbool = pf->cdev_info->rdma_protocol & + IDC_RDMA_PROTOCOL_IWARP; return 0; } @@ -1348,19 +1365,23 @@ static int ice_devlink_enable_iw_set(struct devlink *devlink, u32 id, struct netlink_ext_ack *extack) { struct ice_pf *pf = devlink_priv(devlink); + struct idc_rdma_core_dev_info *cdev; bool iw_ena = ctx->val.vbool; int ret; + cdev = pf->cdev_info; + if (!cdev) + return -EINVAL; if (!iw_ena) { ice_unplug_aux_dev(pf); - pf->rdma_mode &= ~IIDC_RDMA_PROTOCOL_IWARP; + cdev->rdma_protocol &= ~IDC_RDMA_PROTOCOL_IWARP; return 0; } - pf->rdma_mode |= IIDC_RDMA_PROTOCOL_IWARP; + cdev->rdma_protocol |= IDC_RDMA_PROTOCOL_IWARP; ret = ice_plug_aux_dev(pf); if (ret) - pf->rdma_mode &= ~IIDC_RDMA_PROTOCOL_IWARP; + cdev->rdma_protocol &= ~IDC_RDMA_PROTOCOL_IWARP; return ret; } @@ -1375,7 +1396,7 @@ static int ice_devlink_enable_iw_set(struct devlink *devlink, u32 id, if (!test_bit(ICE_FLAG_RDMA_ENA, pf->flags)) return -EOPNOTSUPP; - if (pf->rdma_mode & IIDC_RDMA_PROTOCOL_ROCEV2) { + if (pf->cdev_info->rdma_protocol & IDC_RDMA_PROTOCOL_ROCEV2) { NL_SET_ERR_MSG_MOD(extack, "RoCEv2 is currently enabled. 
This device cannot enable iWARP and RoCEv2 simultaneously"); return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index 6ad8002..8177eec 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -405,7 +405,6 @@ struct ice_vsi { u16 req_rxq; /* User requested Rx queues */ u16 num_rx_desc; u16 num_tx_desc; - u16 qset_handle[ICE_MAX_TRAFFIC_CLASS]; struct ice_tc_cfg tc_cfg; struct bpf_prog *xdp_prog; struct ice_tx_ring **xdp_rings; /* XDP ring array */ @@ -551,7 +550,6 @@ struct ice_pf { struct devlink_port devlink_port; /* OS reserved IRQ details */ - struct msix_entry *msix_entries; struct ice_irq_tracker irq_tracker; /* First MSIX vector used by SR-IOV VFs. Calculated by subtracting the * number of MSIX vectors needed for all SR-IOV VFs from the number of @@ -592,7 +590,6 @@ struct ice_pf { struct gnss_serial *gnss_serial; struct gnss_device *gnss_dev; u16 num_rdma_msix; /* Total MSIX vectors for RDMA driver */ - u16 rdma_base_vector; /* spinlock to protect the AdminQ wait list */ spinlock_t aq_wait_lock; @@ -625,14 +622,12 @@ struct ice_pf { struct ice_hw_port_stats stats_prev; struct ice_hw hw; u8 stat_prev_loaded:1; /* has previous stats been loaded */ - u8 rdma_mode; u16 dcbx_cap; u32 tx_timeout_count; unsigned long tx_timeout_last_recovery; u32 tx_timeout_recovery_level; char int_name[ICE_INT_NAME_STR_LEN]; char int_name_ll_ts[ICE_INT_NAME_STR_LEN]; - struct auxiliary_device *adev; int aux_idx; u32 sw_int_count; /* count of tc_flower filters specific to channel (aka where filter @@ -660,6 +655,7 @@ struct ice_pf { struct ice_agg_node vf_agg_node[ICE_MAX_VF_AGG_NODES]; struct ice_dplls dplls; struct device *hwmon_dev; + struct idc_rdma_core_dev_info *cdev_info; }; extern struct workqueue_struct *ice_lag_wq; diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c index a94e707..c85d86c 100644 --- 
a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c @@ -352,7 +352,7 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked) struct ice_dcbx_cfg *old_cfg, *curr_cfg; struct device *dev = ice_pf_to_dev(pf); int ret = ICE_DCB_NO_HW_CHG; - struct iidc_event *event; + struct idc_rdma_event *event; struct ice_vsi *pf_vsi; curr_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; @@ -404,7 +404,7 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked) goto free_cfg; } - set_bit(IIDC_EVENT_BEFORE_TC_CHANGE, event->type); + set_bit(IDC_RDMA_EVENT_BEFORE_TC_CHANGE, event->type); ice_send_event_to_aux(pf, event); kfree(event); @@ -739,7 +739,9 @@ static int ice_dcb_noncontig_cfg(struct ice_pf *pf) void ice_pf_dcb_recfg(struct ice_pf *pf, bool locked) { struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; - struct iidc_event *event; + struct iidc_rdma_priv_dev_info *privd; + struct idc_rdma_core_dev_info *cdev; + struct idc_rdma_event *event; u8 tc_map = 0; int v, ret; @@ -782,13 +784,17 @@ void ice_pf_dcb_recfg(struct ice_pf *pf, bool locked) if (vsi->type == ICE_VSI_PF) ice_dcbnl_set_all(vsi); } - if (!locked) { + + cdev = pf->cdev_info; + if (cdev && !locked) { + privd = cdev->idc_priv; + ice_setup_dcb_qos_info(pf, &privd->qos_info); /* Notify the AUX drivers that TC change is finished */ event = kzalloc(sizeof(*event), GFP_KERNEL); if (!event) return; - set_bit(IIDC_EVENT_AFTER_TC_CHANGE, event->type); + set_bit(IDC_RDMA_EVENT_AFTER_TC_CHANGE, event->type); ice_send_event_to_aux(pf, event); kfree(event); } @@ -944,6 +950,36 @@ void ice_update_dcb_stats(struct ice_pf *pf) } /** + * ice_setup_dcb_qos_info - Setup DCB QoS information + * @pf: ptr to ice_pf + * @qos_info: QoS param instance + */ +void ice_setup_dcb_qos_info(struct ice_pf *pf, struct iidc_rdma_qos_params *qos_info) +{ + struct ice_dcbx_cfg *dcbx_cfg; + unsigned int i; + u32 up2tc; + + if (!pf 
|| !qos_info) + return; + + dcbx_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; + up2tc = rd32(&pf->hw, PRTDCB_TUP2TC); + + qos_info->num_tc = ice_dcb_get_num_tc(dcbx_cfg); + + for (i = 0; i < IIDC_MAX_USER_PRIORITY; i++) + qos_info->up2tc[i] = (up2tc >> (i * 3)) & 0x7; + + for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) + qos_info->tc_info[i].rel_bw = dcbx_cfg->etscfg.tcbwtable[i]; + + qos_info->pfc_mode = dcbx_cfg->pfc_mode; + for (i = 0; i < ICE_DSCP_NUM_VAL; i++) + qos_info->dscp_map[i] = dcbx_cfg->dscp_map[i]; +} + +/** * ice_dcb_is_mib_change_pending - Check if MIB change is pending * @state: MIB change state */ diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.h b/drivers/net/ethernet/intel/ice/ice_dcb_lib.h index 800879a..80efbf0 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.h @@ -31,6 +31,8 @@ ice_tx_prepare_vlan_flags_dcb(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first); void +ice_setup_dcb_qos_info(struct ice_pf *pf, struct iidc_rdma_qos_params *qos_info); +void ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, struct ice_rq_event_info *event); /** @@ -134,5 +136,7 @@ static inline void ice_update_dcb_stats(struct ice_pf *pf) { } static inline void ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, struct ice_rq_event_info *event) { } static inline void ice_set_cgd_num(struct ice_tlan_ctx *tlan_ctx, u8 dcb_tc) { } +static inline void +ice_setup_dcb_qos_info(struct ice_pf *pf, struct iidc_rdma_qos_params *qos_info) { } #endif /* CONFIG_DCB */ #endif /* _ICE_DCB_LIB_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index 62c8205..4dd7a69 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -3643,11 +3643,11 @@ static int ice_set_channels(struct net_device *dev, struct ethtool_channels *ch) return -EINVAL; } - if (pf->adev) { + if (pf->cdev_info->adev) { 
mutex_lock(&pf->adev_mutex); - device_lock(&pf->adev->dev); + device_lock(&pf->cdev_info->adev->dev); locked = true; - if (pf->adev->dev.driver) { + if (pf->cdev_info->adev->dev.driver) { netdev_err(dev, "Cannot change channels when RDMA is active\n"); ret = -EBUSY; goto adev_unlock; @@ -3666,7 +3666,7 @@ static int ice_set_channels(struct net_device *dev, struct ethtool_channels *ch) adev_unlock: if (locked) { - device_unlock(&pf->adev->dev); + device_unlock(&pf->cdev_info->adev->dev); mutex_unlock(&pf->adev_mutex); } return ret; diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c index 145b27f..3454b91 100644 --- a/drivers/net/ethernet/intel/ice/ice_idc.c +++ b/drivers/net/ethernet/intel/ice/ice_idc.c @@ -9,21 +9,22 @@ static DEFINE_XARRAY_ALLOC1(ice_aux_id); /** - * ice_get_auxiliary_drv - retrieve iidc_auxiliary_drv struct - * @pf: pointer to PF struct + * ice_get_auxiliary_drv - retrieve iidc_rdma_core_auxiliary_drv struct + * @cdev: pointer to iidc_rdma_core_dev_info struct * * This function has to be called with a device_lock on the - * pf->adev.dev to avoid race conditions. + * cdev->adev.dev to avoid race conditions. 
*/ -static struct iidc_auxiliary_drv *ice_get_auxiliary_drv(struct ice_pf *pf) +static struct idc_rdma_core_auxiliary_drv +*ice_get_auxiliary_drv(struct idc_rdma_core_dev_info *cdev) { struct auxiliary_device *adev; - adev = pf->adev; + adev = cdev->adev; if (!adev || !adev->dev.driver) return NULL; - return container_of(adev->dev.driver, struct iidc_auxiliary_drv, + return container_of(adev->dev.driver, struct idc_rdma_core_auxiliary_drv, adrv.driver); } @@ -32,44 +33,52 @@ static struct iidc_auxiliary_drv *ice_get_auxiliary_drv(struct ice_pf *pf) * @pf: pointer to PF struct * @event: event struct */ -void ice_send_event_to_aux(struct ice_pf *pf, struct iidc_event *event) +void ice_send_event_to_aux(struct ice_pf *pf, struct idc_rdma_event *event) { - struct iidc_auxiliary_drv *iadrv; + struct idc_rdma_core_auxiliary_drv *iadrv; + struct idc_rdma_core_dev_info *cdev; if (WARN_ON_ONCE(!in_task())) return; + cdev = pf->cdev_info; + if (!cdev) + return; + mutex_lock(&pf->adev_mutex); - if (!pf->adev) + if (!cdev->adev) goto finish; - device_lock(&pf->adev->dev); - iadrv = ice_get_auxiliary_drv(pf); + device_lock(&cdev->adev->dev); + iadrv = ice_get_auxiliary_drv(cdev); if (iadrv && iadrv->event_handler) - iadrv->event_handler(pf, event); - device_unlock(&pf->adev->dev); + iadrv->event_handler(cdev, event); + device_unlock(&cdev->adev->dev); finish: mutex_unlock(&pf->adev_mutex); } /** * ice_add_rdma_qset - Add Leaf Node for RDMA Qset - * @pf: PF struct + * @cdev: pointer to iidc_rdma_core_dev_info struct * @qset: Resource to be allocated */ -int ice_add_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset) +static int ice_add_rdma_qset(struct idc_rdma_core_dev_info *cdev, + struct iidc_rdma_qset_params *qset) { u16 max_rdmaqs[ICE_MAX_TRAFFIC_CLASS]; struct ice_vsi *vsi; struct device *dev; + struct ice_pf *pf; u32 qset_teid; u16 qs_handle; int status; int i; - if (WARN_ON(!pf || !qset)) + if (WARN_ON(!cdev || !qset)) return -EINVAL; + pf = 
pci_get_drvdata(cdev->pdev); dev = ice_pf_to_dev(pf); if (!ice_is_rdma_ena(pf)) @@ -100,27 +109,28 @@ int ice_add_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset) dev_err(dev, "Failed VSI RDMA Qset enable\n"); return status; } - vsi->qset_handle[qset->tc] = qset->qs_handle; qset->teid = qset_teid; return 0; } -EXPORT_SYMBOL_GPL(ice_add_rdma_qset); /** * ice_del_rdma_qset - Delete leaf node for RDMA Qset - * @pf: PF struct + * @cdev: pointer to iidc_rdma_core_dev_info struct * @qset: Resource to be freed */ -int ice_del_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset) +static int ice_del_rdma_qset(struct idc_rdma_core_dev_info *cdev, + struct iidc_rdma_qset_params *qset) { struct ice_vsi *vsi; + struct ice_pf *pf; u32 teid; u16 q_id; - if (WARN_ON(!pf || !qset)) + if (WARN_ON(!cdev || !qset)) return -EINVAL; + pf = pci_get_drvdata(cdev->pdev); vsi = ice_find_vsi(pf, qset->vport_id); if (!vsi) { dev_err(ice_pf_to_dev(pf), "RDMA Invalid VSI\n"); @@ -130,57 +140,56 @@ int ice_del_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset) q_id = qset->qs_handle; teid = qset->teid; - vsi->qset_handle[qset->tc] = 0; - return ice_dis_vsi_rdma_qset(vsi->port_info, 1, &teid, &q_id); } -EXPORT_SYMBOL_GPL(ice_del_rdma_qset); /** * ice_rdma_request_reset - accept request from RDMA to perform a reset - * @pf: struct for PF + * @cdev: pointer to iidc_rdma_core_dev_info struct * @reset_type: type of reset */ -int ice_rdma_request_reset(struct ice_pf *pf, enum iidc_reset_type reset_type) +static int ice_rdma_request_reset(struct idc_rdma_core_dev_info *cdev, + enum idc_rdma_reset_type reset_type) { enum ice_reset_req reset; + struct ice_pf *pf; - if (WARN_ON(!pf)) + if (WARN_ON(!cdev)) return -EINVAL; + pf = pci_get_drvdata(cdev->pdev); + switch (reset_type) { - case IIDC_PFR: + case IDC_FUNC_RESET: reset = ICE_RESET_PFR; break; - case IIDC_CORER: + case IDC_DEV_RESET: reset = ICE_RESET_CORER; break; - case IIDC_GLOBR: - reset = 
ICE_RESET_GLOBR; - break; default: - dev_err(ice_pf_to_dev(pf), "incorrect reset request\n"); return -EINVAL; } return ice_schedule_reset(pf, reset); } -EXPORT_SYMBOL_GPL(ice_rdma_request_reset); /** * ice_rdma_update_vsi_filter - update main VSI filters for RDMA - * @pf: pointer to struct for PF + * @cdev: pointer to iidc_rdma_core_dev_info struct * @vsi_id: VSI HW idx to update filter on * @enable: bool whether to enable or disable filters */ -int ice_rdma_update_vsi_filter(struct ice_pf *pf, u16 vsi_id, bool enable) +static int ice_rdma_update_vsi_filter(struct idc_rdma_core_dev_info *cdev, + u16 vsi_id, bool enable) { struct ice_vsi *vsi; + struct ice_pf *pf; int status; - if (WARN_ON(!pf)) + if (WARN_ON(!cdev)) return -EINVAL; + pf = pci_get_drvdata(cdev->pdev); vsi = ice_find_vsi(pf, vsi_id); if (!vsi) return -EINVAL; @@ -198,35 +207,6 @@ int ice_rdma_update_vsi_filter(struct ice_pf *pf, u16 vsi_id, bool enable) return status; } -EXPORT_SYMBOL_GPL(ice_rdma_update_vsi_filter); - -/** - * ice_get_qos_params - parse QoS params for RDMA consumption - * @pf: pointer to PF struct - * @qos: set of QoS values - */ -void ice_get_qos_params(struct ice_pf *pf, struct iidc_qos_params *qos) -{ - struct ice_dcbx_cfg *dcbx_cfg; - unsigned int i; - u32 up2tc; - - dcbx_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; - up2tc = rd32(&pf->hw, PRTDCB_TUP2TC); - - qos->num_tc = ice_dcb_get_num_tc(dcbx_cfg); - for (i = 0; i < IIDC_MAX_USER_PRIORITY; i++) - qos->up2tc[i] = (up2tc >> (i * 3)) & 0x7; - - for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) - qos->tc_info[i].rel_bw = dcbx_cfg->etscfg.tcbwtable[i]; - - qos->pfc_mode = dcbx_cfg->pfc_mode; - if (qos->pfc_mode == IIDC_DSCP_PFC_MODE) - for (i = 0; i < IIDC_MAX_DSCP_MAPPING; i++) - qos->dscp_map[i] = dcbx_cfg->dscp_map[i]; -} -EXPORT_SYMBOL_GPL(ice_get_qos_params); /** * ice_alloc_rdma_qvectors - Allocate vector resources for RDMA driver @@ -234,22 +214,26 @@ void ice_get_qos_params(struct ice_pf *pf, struct iidc_qos_params *qos) */ 
static int ice_alloc_rdma_qvectors(struct ice_pf *pf) { + struct idc_rdma_core_dev_info *cdev; + + cdev = pf->cdev_info; + if (!cdev) + return -EINVAL; + if (ice_is_rdma_ena(pf)) { int i; - pf->msix_entries = kcalloc(pf->num_rdma_msix, - sizeof(*pf->msix_entries), - GFP_KERNEL); - if (!pf->msix_entries) + cdev->msix_entries = kcalloc(pf->num_rdma_msix, + sizeof(*cdev->msix_entries), + GFP_KERNEL); + if (!cdev->msix_entries) return -ENOMEM; - /* RDMA is the only user of pf->msix_entries array */ - pf->rdma_base_vector = 0; - for (i = 0; i < pf->num_rdma_msix; i++) { - struct msix_entry *entry = &pf->msix_entries[i]; + struct msix_entry *entry; struct msi_map map; + entry = &cdev->msix_entries[i]; map = ice_alloc_irq(pf, false); if (map.index < 0) break; @@ -267,32 +251,49 @@ static int ice_alloc_rdma_qvectors(struct ice_pf *pf) */ static void ice_free_rdma_qvector(struct ice_pf *pf) { + struct idc_rdma_core_dev_info *cdev; int i; - if (!pf->msix_entries) + cdev = pf->cdev_info; + if (!cdev) + return; + + if (!cdev->msix_entries) return; for (i = 0; i < pf->num_rdma_msix; i++) { struct msi_map map; - map.index = pf->msix_entries[i].entry; - map.virq = pf->msix_entries[i].vector; + map.index = cdev->msix_entries[i].entry; + map.virq = cdev->msix_entries[i].vector; ice_free_irq(pf, map); } - kfree(pf->msix_entries); - pf->msix_entries = NULL; + kfree(cdev->msix_entries); + cdev->msix_entries = NULL; } +/* Initialize the ice_ops struct, which is used in 'ice_init_rdma' */ +static const struct idc_rdma_core_ops idc_c_ops = { + .vc_send_sync = NULL, + .request_reset = ice_rdma_request_reset, +}; + +static const struct iidc_rdma_priv_ops iidc_p_ops = { + .alloc_res = ice_add_rdma_qset, + .free_res = ice_del_rdma_qset, + .update_vport_filter = ice_rdma_update_vsi_filter, +}; + /** * ice_adev_release - function to be mapped to AUX dev's release op * @dev: pointer to device to free */ static void ice_adev_release(struct device *dev) { - struct iidc_auxiliary_dev *iadev; + 
struct idc_rdma_core_auxiliary_dev *iadev; - iadev = container_of(dev, struct iidc_auxiliary_dev, adev.dev); + iadev = container_of(dev, struct idc_rdma_core_auxiliary_dev, adev.dev); kfree(iadev); } @@ -302,7 +303,8 @@ static void ice_adev_release(struct device *dev) */ int ice_plug_aux_dev(struct ice_pf *pf) { - struct iidc_auxiliary_dev *iadev; + struct idc_rdma_core_auxiliary_dev *iadev; + struct idc_rdma_core_dev_info *cdev; struct auxiliary_device *adev; int ret; @@ -312,17 +314,22 @@ int ice_plug_aux_dev(struct ice_pf *pf) if (!ice_is_rdma_ena(pf)) return 0; + cdev = pf->cdev_info; + if (!cdev) + return -EINVAL; + iadev = kzalloc(sizeof(*iadev), GFP_KERNEL); if (!iadev) return -ENOMEM; adev = &iadev->adev; - iadev->pf = pf; + iadev->cdev_info = cdev; adev->id = pf->aux_idx; adev->dev.release = ice_adev_release; adev->dev.parent = &pf->pdev->dev; - adev->name = pf->rdma_mode & IIDC_RDMA_PROTOCOL_ROCEV2 ? "roce" : "iwarp"; + adev->name = cdev->rdma_protocol & IDC_RDMA_PROTOCOL_ROCEV2 ? 
+ "roce" : "iwarp"; ret = auxiliary_device_init(adev); if (ret) { @@ -337,7 +344,7 @@ int ice_plug_aux_dev(struct ice_pf *pf) } mutex_lock(&pf->adev_mutex); - pf->adev = adev; + cdev->adev = adev; mutex_unlock(&pf->adev_mutex); return 0; @@ -351,8 +358,8 @@ void ice_unplug_aux_dev(struct ice_pf *pf) struct auxiliary_device *adev; mutex_lock(&pf->adev_mutex); - adev = pf->adev; - pf->adev = NULL; + adev = pf->cdev_info->adev; + pf->cdev_info->adev = NULL; mutex_unlock(&pf->adev_mutex); if (adev) { @@ -362,12 +369,38 @@ void ice_unplug_aux_dev(struct ice_pf *pf) } /** + * ice_init_rdma_qos_info - initialize qos_info for RDMA aux + * @pf: pointer to ice_pf + * @qos_info: pointer to qos_info struct + */ +static void +ice_init_rdma_qos_info(struct ice_pf *pf, struct iidc_rdma_qos_params *qos_info) +{ + int j; + + /* setup qos_info fields with defaults */ + qos_info->num_tc = 1; + + for (j = 0; j < IIDC_MAX_USER_PRIORITY; j++) + qos_info->up2tc[j] = 0; + + qos_info->tc_info[0].rel_bw = 100; + for (j = 1; j < IEEE_8021QAZ_MAX_TCS; j++) + qos_info->tc_info[j].rel_bw = 0; + + /* for DCB, override the qos_info defaults. 
*/ + ice_setup_dcb_qos_info(pf, qos_info); +} + +/** * ice_init_rdma - initializes PF for RDMA use * @pf: ptr to ice_pf */ int ice_init_rdma(struct ice_pf *pf) { struct device *dev = &pf->pdev->dev; + struct iidc_rdma_priv_dev_info *privd; + struct idc_rdma_core_dev_info *cdev; int ret; if (!ice_is_rdma_ena(pf)) { @@ -375,20 +408,46 @@ int ice_init_rdma(struct ice_pf *pf) return 0; } + cdev = kzalloc(sizeof(*cdev), GFP_KERNEL); + if (!cdev) + return -ENOMEM; + + pf->cdev_info = cdev; + + privd = kzalloc(sizeof(*privd), GFP_KERNEL); + if (!privd) { + ret = -ENOMEM; + goto err_privd_alloc; + } + + privd->pf_id = pf->hw.pf_id; ret = xa_alloc(&ice_aux_id, &pf->aux_idx, NULL, XA_LIMIT(1, INT_MAX), GFP_KERNEL); if (ret) { dev_err(dev, "Failed to allocate device ID for AUX driver\n"); - return -ENOMEM; + ret = -ENOMEM; + goto err_alloc_xa; } + cdev->ops = &idc_c_ops; + cdev->idc_priv = privd; + privd->priv_ops = &iidc_p_ops; + privd->netdev = pf->vsi[0]->netdev; + + cdev->msix_count = pf->num_rdma_msix; + privd->hw_addr = (u8 __iomem *)pf->hw.hw_addr; + cdev->pdev = pf->pdev; + privd->vport_id = pf->vsi[0]->vsi_num; + /* Reserve vector resources */ ret = ice_alloc_rdma_qvectors(pf); if (ret < 0) { dev_err(dev, "failed to reserve vectors for RDMA\n"); goto err_reserve_rdma_qvector; } - pf->rdma_mode |= IIDC_RDMA_PROTOCOL_ROCEV2; + + pf->cdev_info->rdma_protocol |= IDC_RDMA_PROTOCOL_ROCEV2; + ice_init_rdma_qos_info(pf, &privd->qos_info); ret = ice_plug_aux_dev(pf); if (ret) goto err_plug_aux_dev; @@ -397,8 +456,14 @@ int ice_init_rdma(struct ice_pf *pf) err_plug_aux_dev: ice_free_rdma_qvector(pf); err_reserve_rdma_qvector: - pf->adev = NULL; + pf->cdev_info->adev = NULL; xa_erase(&ice_aux_id, pf->aux_idx); +err_alloc_xa: + kfree(privd); +err_privd_alloc: + kfree(cdev); + pf->cdev_info = NULL; + return ret; } diff --git a/drivers/net/ethernet/intel/ice/ice_idc_int.h b/drivers/net/ethernet/intel/ice/ice_idc_int.h index 4b0c867..8a1e9ed 100644 --- 
a/drivers/net/ethernet/intel/ice/ice_idc_int.h +++ b/drivers/net/ethernet/intel/ice/ice_idc_int.h @@ -4,10 +4,11 @@ #ifndef _ICE_IDC_INT_H_ #define _ICE_IDC_INT_H_ -#include +#include +#include struct ice_pf; -void ice_send_event_to_aux(struct ice_pf *pf, struct iidc_event *event); +void ice_send_event_to_aux(struct ice_pf *pf, struct idc_rdma_event *event); #endif /* !_ICE_IDC_INT_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index f60c022..200540b 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2378,11 +2378,11 @@ static void ice_service_task(struct work_struct *work) } if (test_and_clear_bit(ICE_AUX_ERR_PENDING, pf->state)) { - struct iidc_event *event; + struct idc_rdma_event *event; event = kzalloc(sizeof(*event), GFP_KERNEL); if (event) { - set_bit(IIDC_EVENT_CRIT_ERR, event->type); + set_bit(IDC_RDMA_EVENT_CRIT_ERR, event->type); /* report the entire OICR value to AUX driver */ swap(event->reg, pf->oicr_err_reg); ice_send_event_to_aux(pf, event); @@ -2401,11 +2401,11 @@ static void ice_service_task(struct work_struct *work) ice_plug_aux_dev(pf); if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) { - struct iidc_event *event; + struct idc_rdma_event *event; event = kzalloc(sizeof(*event), GFP_KERNEL); if (event) { - set_bit(IIDC_EVENT_AFTER_MTU_CHANGE, event->type); + set_bit(IDC_RDMA_EVENT_AFTER_MTU_CHANGE, event->type); ice_send_event_to_aux(pf, event); kfree(event); } @@ -9216,6 +9216,7 @@ static int ice_setup_tc_mqprio_qdisc(struct net_device *netdev, void *type_data) { struct ice_netdev_priv *np = netdev_priv(netdev); struct ice_pf *pf = np->vsi->back; + struct idc_rdma_core_dev_info *cdev; bool locked = false; int err; @@ -9231,11 +9232,12 @@ static int ice_setup_tc_mqprio_qdisc(struct net_device *netdev, void *type_data) return -EOPNOTSUPP; } - if (pf->adev) { + cdev = pf->cdev_info; + if (cdev && cdev->adev) { 
mutex_lock(&pf->adev_mutex); - device_lock(&pf->adev->dev); + device_lock(&cdev->adev->dev); locked = true; - if (pf->adev->dev.driver) { + if (cdev->adev->dev.driver) { netdev_err(netdev, "Cannot change qdisc when RDMA is active\n"); err = -EBUSY; goto adev_unlock; @@ -9249,7 +9251,7 @@ static int ice_setup_tc_mqprio_qdisc(struct net_device *netdev, void *type_data) adev_unlock: if (locked) { - device_unlock(&pf->adev->dev); + device_unlock(&cdev->adev->dev); mutex_unlock(&pf->adev_mutex); } return err; diff --git a/include/linux/net/intel/idc_rdma.h b/include/linux/net/intel/idc_rdma.h new file mode 100644 index 0000000..5c31c6d --- /dev/null +++ b/include/linux/net/intel/idc_rdma.h @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2021, Intel Corporation. */ + +#ifndef _IDC_RDMA_H_ +#define _IDC_RDMA_H_ + +#include +#include +#include +#include +#include + +#define IDC_RDMA_ROCE_NAME "roce" +#define IDC_RDMA_IWARP_NAME "iwarp" + +enum idc_rdma_reset_type { + IDC_FUNC_RESET, + IDC_DEV_RESET, +}; + +enum idc_rdma_event_type { + IDC_RDMA_EVENT_BEFORE_MTU_CHANGE, + IDC_RDMA_EVENT_AFTER_MTU_CHANGE, + IDC_RDMA_EVENT_BEFORE_TC_CHANGE, + IDC_RDMA_EVENT_AFTER_TC_CHANGE, + IDC_RDMA_EVENT_WARN_RESET, + IDC_RDMA_EVENT_CRIT_ERR, + IDC_RDMA_EVENT_NBITS, /* must be last */ +}; + +struct idc_rdma_event { + DECLARE_BITMAP(type, IDC_RDMA_EVENT_NBITS); + u32 reg; +}; + +enum idc_rdma_protocol { + IDC_RDMA_PROTOCOL_IWARP = BIT(0), + IDC_RDMA_PROTOCOL_ROCEV2 = BIT(1), +}; + +struct idc_rdma_qv_info { + u32 v_idx; + u16 ceq_idx; + u16 aeq_idx; + u8 itr_idx; +}; + +struct idc_rdma_qvlist_info { + u32 num_vectors; + struct idc_rdma_qv_info qv_info[]; +}; + +struct idc_rdma_core_dev_info; + +/* Following APIs are implemented by core PCI driver */ +struct idc_rdma_core_ops { + int (*vc_send_sync)(struct idc_rdma_core_dev_info *cdev_info, u8 *msg, + u16 len, u8 *recv_msg, u16 *recv_len); + int (*vc_queue_vec_map_unmap)(struct idc_rdma_core_dev_info *cdev_info, + 
struct idc_rdma_qvlist_info *qvl_info, + bool map); + /* vport_dev_ctrl is for RDMA CORE driver to indicate it is either ready + * for individual vport aux devices, or it is leaving the state where it + * can support vports and they need to be downed + */ + int (*vport_dev_ctrl)(struct idc_rdma_core_dev_info *cdev_info, + bool up); + int (*request_reset)(struct idc_rdma_core_dev_info *cdev_info, + enum idc_rdma_reset_type reset_type); +}; + +enum idc_function_type { + IDC_FUNCTION_TYPE_PF, + IDC_FUNCTION_TYPE_VF, +}; + +struct idc_rdma_lan_mapped_mem_region { + u8 __iomem *region_addr; + __le64 size; + __le64 start_offset; +}; + +/* struct to be populated by core LAN PCI driver */ +struct idc_rdma_core_dev_info { + struct pci_dev *pdev; /* PCI device of corresponding to main function */ + struct auxiliary_device *adev; + struct idc_rdma_lan_mapped_mem_region *mapped_mem_regions; + __le16 num_memory_regions; + /* Current active RDMA protocol */ + enum idc_rdma_protocol rdma_protocol; + enum idc_function_type ftype; + struct msix_entry *msix_entries; + u16 msix_count; /* How many vectors are reserved for this device */ + /* Following struct contains function pointers to be initialized + * by core PCI driver and called by auxiliary driver + */ + const struct idc_rdma_core_ops *ops; + void *idc_priv; +}; + +struct idc_rdma_core_auxiliary_dev { + struct auxiliary_device adev; + struct idc_rdma_core_dev_info *cdev_info; +}; + +/* struct to be populated by core LAN PCI driver */ +struct idc_rdma_vport_dev_info { + struct auxiliary_device *adev; + struct auxiliary_device *core_adev; + struct net_device *netdev; + u16 vport_id; +}; + +struct idc_rdma_vport_auxiliary_dev { + struct auxiliary_device adev; + struct idc_rdma_vport_dev_info *vdev_info; +}; + +/* structures representing the auxiliary drivers. These structs are to be + * allocated and populated by the auxiliary drivers' owner. 
The core PCI + * driver will access these ops by performing a container_of on the + * auxiliary_device->dev.driver. + */ +struct idc_rdma_core_auxiliary_drv { + struct auxiliary_driver adrv; + void (*event_handler)(struct idc_rdma_core_dev_info *cdev, + struct idc_rdma_event *event); + int (*vc_receive)(struct idc_rdma_core_dev_info *cdev_info, u8 *msg, + u16 len); +}; + +struct idc_rdma_vport_auxiliary_drv { + struct auxiliary_driver adrv; + void (*event_handler)(struct idc_rdma_vport_dev_info *vdev, + struct idc_rdma_event *event); +}; + +#endif /* _IDC_RDMA_H_*/ diff --git a/include/linux/net/intel/iidc.h b/include/linux/net/intel/iidc.h deleted file mode 100644 index 1c1332e..0000000 --- a/include/linux/net/intel/iidc.h +++ /dev/null @@ -1,107 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright (C) 2021, Intel Corporation. */ - -#ifndef _IIDC_H_ -#define _IIDC_H_ - -#include -#include -#include -#include -#include -#include - -enum iidc_event_type { - IIDC_EVENT_BEFORE_MTU_CHANGE, - IIDC_EVENT_AFTER_MTU_CHANGE, - IIDC_EVENT_BEFORE_TC_CHANGE, - IIDC_EVENT_AFTER_TC_CHANGE, - IIDC_EVENT_CRIT_ERR, - IIDC_EVENT_NBITS /* must be last */ -}; - -enum iidc_reset_type { - IIDC_PFR, - IIDC_CORER, - IIDC_GLOBR, -}; - -enum iidc_rdma_protocol { - IIDC_RDMA_PROTOCOL_IWARP = BIT(0), - IIDC_RDMA_PROTOCOL_ROCEV2 = BIT(1), -}; - -#define IIDC_MAX_USER_PRIORITY 8 -#define IIDC_MAX_DSCP_MAPPING 64 -#define IIDC_DSCP_PFC_MODE 0x1 - -/* Struct to hold per RDMA Qset info */ -struct iidc_rdma_qset_params { - /* Qset TEID returned to the RDMA driver in - * ice_add_rdma_qset and used by RDMA driver - * for calls to ice_del_rdma_qset - */ - u32 teid; /* Qset TEID */ - u16 qs_handle; /* RDMA driver provides this */ - u16 vport_id; /* VSI index */ - u8 tc; /* TC branch the Qset should belong to */ -}; - -struct iidc_qos_info { - u64 tc_ctx; - u8 rel_bw; - u8 prio_type; - u8 egress_virt_up; - u8 ingress_virt_up; -}; - -/* Struct to pass QoS info */ -struct iidc_qos_params { - 
struct iidc_qos_info tc_info[IEEE_8021QAZ_MAX_TCS]; - u8 up2tc[IIDC_MAX_USER_PRIORITY]; - u8 vport_relative_bw; - u8 vport_priority_type; - u8 num_tc; - u8 pfc_mode; - u8 dscp_map[IIDC_MAX_DSCP_MAPPING]; -}; - -struct iidc_event { - DECLARE_BITMAP(type, IIDC_EVENT_NBITS); - u32 reg; -}; - -struct ice_pf; - -int ice_add_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset); -int ice_del_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset); -int ice_rdma_request_reset(struct ice_pf *pf, enum iidc_reset_type reset_type); -int ice_rdma_update_vsi_filter(struct ice_pf *pf, u16 vsi_id, bool enable); -void ice_get_qos_params(struct ice_pf *pf, struct iidc_qos_params *qos); - -/* Structure representing auxiliary driver tailored information about the core - * PCI dev, each auxiliary driver using the IIDC interface will have an - * instance of this struct dedicated to it. - */ - -struct iidc_auxiliary_dev { - struct auxiliary_device adev; - struct ice_pf *pf; -}; - -/* structure representing the auxiliary driver. This struct is to be - * allocated and populated by the auxiliary driver's owner. The core PCI - * driver will access these ops by performing a container_of on the - * auxiliary_device->dev.driver. - */ -struct iidc_auxiliary_drv { - struct auxiliary_driver adrv; - /* This event_handler is meant to be a blocking call. For instance, - * when a BEFORE_MTU_CHANGE event comes in, the event_handler will not - * return until the auxiliary driver is ready for the MTU change to - * happen. - */ - void (*event_handler)(struct ice_pf *pf, struct iidc_event *event); -}; - -#endif /* _IIDC_H_*/ diff --git a/include/linux/net/intel/iidc_rdma.h b/include/linux/net/intel/iidc_rdma.h new file mode 100644 index 0000000..2e30b04 --- /dev/null +++ b/include/linux/net/intel/iidc_rdma.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2021, Intel Corporation. 
*/ + +#ifndef _IIDC_RDMA_H_ +#define _IIDC_RDMA_H_ + +#include + +#define IIDC_MAX_USER_PRIORITY 8 +#define IIDC_MAX_DSCP_MAPPING 64 +#define IIDC_DSCP_PFC_MODE 0x1 + +/* Struct to hold per RDMA Qset info */ +struct iidc_rdma_qset_params { + /* Qset TEID returned to the RDMA driver in + * ice_add_rdma_qset and used by RDMA driver + * for calls to ice_del_rdma_qset + */ + u32 teid; /* Qset TEID */ + u16 qs_handle; /* RDMA driver provides this */ + u16 vport_id; /* VSI index */ + u8 tc; /* TC branch the Qset should belong to */ +}; + +struct iidc_rdma_qos_info { + u64 tc_ctx; + u8 rel_bw; + u8 prio_type; + u8 egress_virt_up; + u8 ingress_virt_up; +}; + +/* Struct to pass QoS info */ +struct iidc_rdma_qos_params { + struct iidc_rdma_qos_info tc_info[IEEE_8021QAZ_MAX_TCS]; + u8 up2tc[IIDC_MAX_USER_PRIORITY]; + u8 vport_relative_bw; + u8 vport_priority_type; + u8 num_tc; + u8 pfc_mode; + u8 dscp_map[IIDC_MAX_DSCP_MAPPING]; +}; + +struct iidc_rdma_priv_ops { + int (*alloc_res)(struct idc_rdma_core_dev_info *cdev_info, + struct iidc_rdma_qset_params *qset); + int (*free_res)(struct idc_rdma_core_dev_info *cdev_info, + struct iidc_rdma_qset_params *qset); + int (*update_vport_filter)(struct idc_rdma_core_dev_info *cdev_info, + u16 vport_id, bool enable); +}; + +struct iidc_rdma_priv_dev_info { + u8 pf_id; + u16 vport_id; + struct net_device *netdev; + struct iidc_rdma_qos_params qos_info; + const struct iidc_rdma_priv_ops *priv_ops; + u8 __iomem *hw_addr; +}; +#endif /* _IIDC_RDMA_H_ */ From patchwork Wed Jul 24 23:38:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741440 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3613013D635 for ; Wed, 24 Jul 2024 23:40:36 +0000 (UTC)
X-CSE-ConnectionGUID: qoeNebCKTFWIVzuhzfSgiw== X-CSE-MsgGUID: Pv/v/9UsS+yGIgItBzprOA== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999739" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999739" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:35 -0700 X-CSE-ConnectionGUID: vOWURArjSdS3/AESvN+zJQ== X-CSE-MsgGUID: dHssaaIUT7SoXlGxFP3C1g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52425993" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:34 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay , Tatyana Nikolova Subject: [RFC PATCH 02/25] idpf: use reserved rdma vectors from control plane Date: Wed, 24 Jul 2024 18:38:54 -0500 Message-Id: <20240724233917.704-3-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Joshua Hay Fetch the number of reserved rdma vectors from the control plane. Adjust the number of reserved lan vectors if necessary. Adjust the minimum number of vectors the OS should reserve to include rdma; and fail if the OS cannot reserve enough vectors for the minimum number of lan and rdma vectors required. Create a separate msix table for the reserved rdma vectors, which will just get handed off to the rdma core device to do with what it will. 
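The vector accounting described above can be modelled in a few lines of plain C. This is only a userspace sketch under stated assumptions, not the driver code: split_vectors(), struct vec_split and the min_lan parameter are hypothetical names introduced for illustration, MIN_RDMA_VEC mirrors IDPF_MIN_RDMA_VEC from this patch, and the final range guard is an addition of the model rather than something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors IDPF_MIN_RDMA_VEC from this patch. */
#define MIN_RDMA_VEC 4

struct vec_split {
	uint16_t lan;	/* entries for the LAN MSI-X table */
	uint16_t rdma;	/* entries for the separate RDMA MSI-X table */
};

/*
 * Hypothetical userspace model, not driver code.
 *
 * total:         vectors reserved by the control plane (upper bound asked of the OS)
 * rdma_reserved: RDMA vectors reported by the control plane
 * granted:       vectors the OS actually handed back
 * min_lan:       minimum number of LAN vectors the caller requires
 *
 * Returns 0 and fills *out on success, -1 if the grant cannot cover the
 * combined LAN + RDMA minimum.
 */
int split_vectors(uint16_t total, uint16_t rdma_reserved, uint16_t granted,
		  uint16_t min_lan, bool rdma_ena, struct vec_split *out)
{
	uint16_t min_rdma = rdma_ena ? MIN_RDMA_VEC : 0;
	uint16_t rdma = 0;

	/* Top RDMA up to its minimum; the shortfall comes out of LAN. */
	if (rdma_ena)
		rdma = rdma_reserved < min_rdma ? min_rdma : rdma_reserved;

	/* Fail if the OS could not cover the LAN + RDMA minimum. */
	if (granted < min_lan + min_rdma)
		return -1;

	/* Short grant: RDMA drops to its minimum, LAN gets the remainder. */
	if (rdma_ena && granted < total)
		rdma = min_rdma;

	/* Extra guard in this model only: keep the LAN minimum intact. */
	if (rdma > granted - min_lan)
		return -1;

	out->rdma = rdma;
	out->lan = granted - rdma;
	assert(out->lan + out->rdma == granted);
	return 0;
}
```

For example, with 16 vectors reserved, 8 of them for RDMA, and only 10 granted by the OS, the model hands 4 vectors to RDMA and 6 to LAN.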
Signed-off-by: Joshua Hay Signed-off-by: Tatyana Nikolova --- drivers/net/ethernet/intel/idpf/idpf.h | 24 +++++++++- drivers/net/ethernet/intel/idpf/idpf_lib.c | 70 +++++++++++++++++++++++------ drivers/net/ethernet/intel/idpf/idpf_txrx.h | 1 + drivers/net/ethernet/intel/idpf/virtchnl2.h | 5 ++- 4 files changed, 83 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index e7a0365..d25e783 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -493,10 +493,11 @@ struct idpf_vport_config { * @flags: See enum idpf_flags * @reset_reg: See struct idpf_reset_reg * @hw: Device access data - * @num_req_msix: Requested number of MSIX vectors * @num_avail_msix: Available number of MSIX vectors * @num_msix_entries: Number of entries in MSIX table * @msix_entries: MSIX table + * @num_rdma_msix_entries: Available number of MSIX vectors for RDMA + * @rdma_msix_entries: RDMA MSIX table * @req_vec_chunks: Requested vector chunk data * @mb_vector: Mailbox vector data * @vector_stack: Stack to store the msix vector indexes @@ -546,10 +547,11 @@ struct idpf_adapter { DECLARE_BITMAP(flags, IDPF_FLAGS_NBITS); struct idpf_reset_reg reset_reg; struct idpf_hw hw; - u16 num_req_msix; u16 num_avail_msix; u16 num_msix_entries; struct msix_entry *msix_entries; + u16 num_rdma_msix_entries; + struct msix_entry *rdma_msix_entries; struct virtchnl2_alloc_vectors *req_vec_chunks; struct idpf_q_vector mb_vector; struct idpf_vector_lifo vector_stack; @@ -612,6 +614,15 @@ static inline int idpf_is_queue_model_split(u16 q_model) bool idpf_is_capability_ena(struct idpf_adapter *adapter, bool all, enum idpf_cap_field field, u64 flag); +/** + * idpf_is_rdma_cap_ena - Determine if RDMA is supported + * @adapter: private data struct + */ +static inline bool idpf_is_rdma_cap_ena(struct idpf_adapter *adapter) +{ + return idpf_is_cap_ena(adapter, IDPF_OTHER_CAPS, VIRTCHNL2_CAP_RDMA); +} + 
#define IDPF_CAP_RSS (\ VIRTCHNL2_CAP_RSS_IPV4_TCP |\ VIRTCHNL2_CAP_RSS_IPV4_TCP |\ @@ -667,6 +678,15 @@ static inline u16 idpf_get_reserved_vecs(struct idpf_adapter *adapter) } /** + * idpf_get_reserved_rdma_vecs - Get reserved RDMA vectors + * @adapter: private data struct + */ +static inline u16 idpf_get_reserved_rdma_vecs(struct idpf_adapter *adapter) +{ + return le16_to_cpu(adapter->caps.num_rdma_allocated_vectors); +} + +/** * idpf_get_default_vports - Get default number of vports * @adapter: private data struct */ diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 52ceda6..0b96518 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -88,6 +88,8 @@ void idpf_intr_rel(struct idpf_adapter *adapter) idpf_deinit_vector_stack(adapter); kfree(adapter->msix_entries); adapter->msix_entries = NULL; + kfree(adapter->rdma_msix_entries); + adapter->rdma_msix_entries = NULL; } /** @@ -316,13 +318,29 @@ int idpf_req_rel_vector_indexes(struct idpf_adapter *adapter, */ int idpf_intr_req(struct idpf_adapter *adapter) { + u16 num_lan_vecs, min_lan_vecs, num_rdma_vecs = 0, min_rdma_vecs = 0; u16 default_vports = idpf_get_default_vports(adapter); int num_q_vecs, total_vecs, num_vec_ids; int min_vectors, v_actual, err; unsigned int vector; u16 *vecids; + int i; total_vecs = idpf_get_reserved_vecs(adapter); + num_lan_vecs = total_vecs; + if (idpf_is_rdma_cap_ena(adapter)) { + num_rdma_vecs = idpf_get_reserved_rdma_vecs(adapter); + min_rdma_vecs = IDPF_MIN_RDMA_VEC; + + if (num_rdma_vecs < min_rdma_vecs) { + /* If idpf_get_reserved_rdma_vecs is 0 or less than the + * minimum, the difference is taken from the LAN vecs + */ + num_lan_vecs -= (min_rdma_vecs - num_rdma_vecs); + num_rdma_vecs = min_rdma_vecs; + } + } + num_q_vecs = total_vecs - IDPF_MBX_Q_VEC; err = idpf_send_alloc_vectors_msg(adapter, num_q_vecs); @@ -333,27 +351,44 @@ int idpf_intr_req(struct idpf_adapter 
*adapter) return -EAGAIN; } - min_vectors = IDPF_MBX_Q_VEC + IDPF_MIN_Q_VEC * default_vports; + min_lan_vecs = IDPF_MBX_Q_VEC + IDPF_MIN_Q_VEC * default_vports; + min_vectors = min_lan_vecs + min_rdma_vecs; v_actual = pci_alloc_irq_vectors(adapter->pdev, min_vectors, total_vecs, PCI_IRQ_MSIX); if (v_actual < min_vectors) { - dev_err(&adapter->pdev->dev, "Failed to allocate MSIX vectors: %d\n", + dev_err(&adapter->pdev->dev, "Failed to allocate minimum MSIX vectors required: %d\n", v_actual); err = -EAGAIN; goto send_dealloc_vecs; } - adapter->msix_entries = kcalloc(v_actual, sizeof(struct msix_entry), - GFP_KERNEL); + if (idpf_is_rdma_cap_ena(adapter)) { + if (v_actual < total_vecs) { + dev_warn(&adapter->pdev->dev, + "Warning: not enough vectors available. Defaulting to minimum for RDMA and remaining for LAN.\n"); + num_rdma_vecs = IDPF_MIN_RDMA_VEC; + } + adapter->rdma_msix_entries = + kcalloc(num_rdma_vecs, + sizeof(struct msix_entry), GFP_KERNEL); + if (!adapter->rdma_msix_entries) { + err = -ENOMEM; + goto free_irq; + } + } + + num_lan_vecs = v_actual - num_rdma_vecs; + adapter->msix_entries = kcalloc(num_lan_vecs, sizeof(struct msix_entry), + GFP_KERNEL); if (!adapter->msix_entries) { err = -ENOMEM; - goto free_irq; + goto free_rdma_msix; } idpf_set_mb_vec_id(adapter); - vecids = kcalloc(total_vecs, sizeof(u16), GFP_KERNEL); + vecids = kcalloc(v_actual, sizeof(u16), GFP_KERNEL); if (!vecids) { err = -ENOMEM; goto free_msix; @@ -373,25 +408,29 @@ int idpf_intr_req(struct idpf_adapter *adapter) goto free_vecids; } } else { - int i; - for (i = 0; i < v_actual; i++) vecids[i] = i; } - for (vector = 0; vector < v_actual; vector++) { - adapter->msix_entries[vector].entry = vecids[vector]; - adapter->msix_entries[vector].vector = + for (i = 0, vector = 0; vector < num_lan_vecs; vector++, i++) { + adapter->msix_entries[i].entry = vecids[vector]; + adapter->msix_entries[i].vector = + pci_irq_vector(adapter->pdev, vector); + } + for (i = 0; i < num_rdma_vecs; vector++, 
i++) { + adapter->rdma_msix_entries[i].entry = vecids[vector]; + adapter->rdma_msix_entries[i].vector = pci_irq_vector(adapter->pdev, vector); } - adapter->num_req_msix = total_vecs; - adapter->num_msix_entries = v_actual; /* 'num_avail_msix' is used to distribute excess vectors to the vports * after considering the minimum vectors required per each default * vport */ - adapter->num_avail_msix = v_actual - min_vectors; + adapter->num_avail_msix = num_lan_vecs - min_lan_vecs; + adapter->num_msix_entries = num_lan_vecs; + if (idpf_is_rdma_cap_ena(adapter)) + adapter->num_rdma_msix_entries = num_rdma_vecs; /* Fill MSIX vector lifo stack with vector indexes */ err = idpf_init_vector_stack(adapter); @@ -413,6 +452,9 @@ int idpf_intr_req(struct idpf_adapter *adapter) free_msix: kfree(adapter->msix_entries); adapter->msix_entries = NULL; +free_rdma_msix: + kfree(adapter->rdma_msix_entries); + adapter->rdma_msix_entries = NULL; free_irq: pci_free_irq_vectors(adapter->pdev); send_dealloc_vecs: diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h index 3d046b8..9cb3e8a 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h @@ -54,6 +54,7 @@ /* Default vector sharing */ #define IDPF_MBX_Q_VEC 1 #define IDPF_MIN_Q_VEC 1 +#define IDPF_MIN_RDMA_VEC 4 #define IDPF_DFLT_TX_Q_DESC_COUNT 512 #define IDPF_DFLT_TX_COMPLQ_DESC_COUNT 512 diff --git a/drivers/net/ethernet/intel/idpf/virtchnl2.h b/drivers/net/ethernet/intel/idpf/virtchnl2.h index 63deb12..80c17e4 100644 --- a/drivers/net/ethernet/intel/idpf/virtchnl2.h +++ b/drivers/net/ethernet/intel/idpf/virtchnl2.h @@ -473,6 +473,8 @@ struct virtchnl2_version_info { * segment offload. * @max_hdr_buf_per_lso: Max number of header buffers that can be used for * an LSO. + * @num_rdma_allocated_vectors: Maximum number of allocated RDMA vectors for + * the device. * @pad1: Padding for future extensions. 
* * Dataplane driver sends this message to CP to negotiate capabilities and @@ -520,7 +522,8 @@ struct virtchnl2_get_capabilities { __le32 device_type; u8 min_sso_packet_len; u8 max_hdr_buf_per_lso; - u8 pad1[10]; + __le16 num_rdma_allocated_vectors; + u8 pad1[8]; }; VIRTCHNL2_CHECK_STRUCT_LEN(80, virtchnl2_get_capabilities); From patchwork Wed Jul 24 23:38:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741442 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A17E91420B0 for ; Wed, 24 Jul 2024 23:40:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864440; cv=none; b=uxOKUKZ+7bj6NuBf6oFKYv+6pX7LSRklWp7X1hNsg0WFx15wFammb8sr7kBg+Q+deRVewnSfVHIOUa0o4VjL6gk5aEoE13Xv9kFTsMJE2XlFKpdeoP1/zNV5X6KDO2En1vDsEl9Q+zX7JhGa1HwwCQXyS7prdiuZ0J5yzBWcV3Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864440; c=relaxed/simple; bh=oE1PKPWDMb3q8wLF1mFqvXmeaNH9MKK/Jwu2GpZGsyY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=OyrXLaEVGoYiwj8fKeE40WGozxmLxFWVsmVEOKAnzt5iQN88R8lM+eEciWlRFCuISsXQbgWOlYdRryyDkHJ1b/96C3bvghsTfx4MYGzGhVG6hBOnRaN/GFbfyjIMoGRIpzSH2XIDvUkJHTDQqw9gGCbgULAOIw7DpKSRcc91vAc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=AaWPE0c9; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; 
spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="AaWPE0c9" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864438; x=1753400438; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oE1PKPWDMb3q8wLF1mFqvXmeaNH9MKK/Jwu2GpZGsyY=; b=AaWPE0c9UJpnaoT9Ov5a/pNP1vyhjmU67rywmZJCZWjWR6eNuUMySNQE zUihyJ8chP1wWRyLlRUz/cST5k6H63GWFrjGGaddsmpXqUqIJ3oWUdknF 22qxvwNZtKXLbP+5WxuOdjRIdMfjaiJas+zBoDgJC+3LlUL/PV+aFDnG4 jD58XDWYpy4lzSBtrnbZO+DDyRZTbB/J+ePDF/gnBjVnz/NGdmGzwoeco XUsunY60S0O/ZcFdOPHNnRTTuv9TcniAJCy0xjjiLyrM+WNHSU9JRdd71 OxH+yeCDXQ+j4K8x1LR+BcyOvkUuI4mhFLdxnUcOMpr3YEaa8hI8wQ/yh w==; X-CSE-ConnectionGUID: m5jRbarUQ+KUwH4nGf8exg== X-CSE-MsgGUID: uK4XstIFRcm8/STHp3axJQ== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999742" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999742" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:36 -0700 X-CSE-ConnectionGUID: h9mWjUYHQa63vELzGIkxMQ== X-CSE-MsgGUID: iHbIbab3QkqRciZaeJWb1Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426001" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:35 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay , Tatyana Nikolova Subject: [RFC PATCH 03/25] idpf: implement core rdma auxiliary dev create, init, and destroy Date: Wed, 24 Jul 2024 18:38:55 -0500 Message-Id: <20240724233917.704-4-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: 
<20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Joshua Hay Add the initial idpf_idc.c file with the functions to kick off the idc initialization, create and initialize a core rdma auxiliary device, and destroy said device. The rdma core has a dependency on the vports being created by the control plane before it can be initialized. Therefore, once all the vports are up after a hard reset (either during driver load or a function level reset), the core rdma device info will be created. It is populated with the function type (as distinguished by the idc initialization function pointer), the core idc_ops function pointers (just stubs for now), the reserved rdma msix table, and various other info the core rdma auxiliary driver will need. It is then plugged on to the bus. During a function level reset or driver unload, the device will be unplugged from the bus and destroyed. 
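The lifecycle described above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: every sketch_* name is hypothetical and is not an idpf or auxiliary bus API, the bus_rejects_device knob exists only to exercise the error path, and clearing the adapter pointer on teardown is a choice made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PF/VF function type. */
enum sketch_ftype { SKETCH_FTYPE_PF, SKETCH_FTYPE_VF };

/* Hypothetical, trimmed-down core device info. */
struct sketch_cdev_info {
	enum sketch_ftype ftype;
	int msix_count;
	bool plugged;
};

/* Hypothetical adapter; bus_rejects_device simulates a failing bus add. */
struct sketch_adapter {
	int num_rdma_msix;
	bool bus_rejects_device;
	struct sketch_cdev_info *cdev_info;
};

/* Simulated "plug on to the bus" step. */
static int sketch_plug(struct sketch_adapter *a)
{
	if (a->bus_rejects_device)
		return -ENODEV;
	a->cdev_info->plugged = true;
	return 0;
}

/* Create and populate the device info, then plug it; unwind on failure. */
int sketch_idc_init(struct sketch_adapter *a, enum sketch_ftype ftype)
{
	struct sketch_cdev_info *ci;
	int err;

	assert(a != NULL);

	ci = calloc(1, sizeof(*ci));
	if (!ci)
		return -ENOMEM;

	ci->ftype = ftype;
	ci->msix_count = a->num_rdma_msix;
	a->cdev_info = ci;

	err = sketch_plug(a);
	if (err)
		goto err_plug;

	return 0;

err_plug:
	free(ci);
	a->cdev_info = NULL;
	return err;
}

/* Unplug and destroy; safe to call when nothing was created. */
void sketch_idc_deinit(struct sketch_adapter *a)
{
	assert(a != NULL);

	if (!a->cdev_info)
		return;

	a->cdev_info->plugged = false;
	free(a->cdev_info);
	a->cdev_info = NULL;
}
```

A caller would invoke sketch_idc_init() once its equivalent of "all vports are up" holds, and sketch_idc_deinit() on reset or unload.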
Signed-off-by: Joshua Hay Signed-off-by: Tatyana Nikolova --- drivers/net/ethernet/intel/idpf/Makefile | 1 + drivers/net/ethernet/intel/idpf/idpf.h | 8 + drivers/net/ethernet/intel/idpf/idpf_dev.c | 11 ++ drivers/net/ethernet/intel/idpf/idpf_idc.c | 212 ++++++++++++++++++++++++ drivers/net/ethernet/intel/idpf/idpf_lib.c | 4 + drivers/net/ethernet/intel/idpf/idpf_vf_dev.c | 11 ++ drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 17 ++ drivers/net/ethernet/intel/idpf/idpf_virtchnl.h | 3 + 8 files changed, 267 insertions(+) create mode 100644 drivers/net/ethernet/intel/idpf/idpf_idc.c diff --git a/drivers/net/ethernet/intel/idpf/Makefile b/drivers/net/ethernet/intel/idpf/Makefile index 6844ead..e86d8f5 100644 --- a/drivers/net/ethernet/intel/idpf/Makefile +++ b/drivers/net/ethernet/intel/idpf/Makefile @@ -10,6 +10,7 @@ idpf-y := \ idpf_controlq_setup.o \ idpf_dev.o \ idpf_ethtool.o \ + idpf_idc.o \ idpf_lib.o \ idpf_main.o \ idpf_singleq_txrx.o \ diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index d25e783..9397208 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -18,6 +18,7 @@ #include #include #include +#include #include "virtchnl2.h" #include "idpf_lan_txrx.h" @@ -205,6 +206,8 @@ struct idpf_reg_ops { */ struct idpf_dev_ops { struct idpf_reg_ops reg_ops; + + int (*idc_init)(struct idpf_adapter *adapter); }; /** @@ -584,6 +587,7 @@ struct idpf_adapter { struct idpf_vc_xn_manager *vcxn_mngr; struct idpf_dev_ops dev_ops; + struct idc_rdma_core_dev_info *cdev_info; int num_vfs; bool crc_enable; bool req_tx_splitq; @@ -857,5 +861,9 @@ void idpf_vport_intr_write_itr(struct idpf_q_vector *q_vector, u8 idpf_vport_get_hsplit(const struct idpf_vport *vport); bool idpf_vport_set_hsplit(const struct idpf_vport *vport, u8 val); +int idpf_idc_init(struct idpf_adapter *adapter); +int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter, + enum idc_function_type ftype); +void 
idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info); #endif /* !_IDPF_H_ */ diff --git a/drivers/net/ethernet/intel/idpf/idpf_dev.c b/drivers/net/ethernet/intel/idpf/idpf_dev.c index 3df9935..f4c5691 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_dev.c +++ b/drivers/net/ethernet/intel/idpf/idpf_dev.c @@ -144,6 +144,15 @@ static void idpf_trigger_reset(struct idpf_adapter *adapter, } /** + * idpf_idc_register - register for IDC callbacks + * @adapter: Driver specific private structure + */ +static int idpf_idc_register(struct idpf_adapter *adapter) +{ + return idpf_idc_init_aux_core_dev(adapter, IDC_FUNCTION_TYPE_PF); +} + +/** * idpf_reg_ops_init - Initialize register API function pointers * @adapter: Driver specific private structure */ @@ -163,4 +172,6 @@ static void idpf_reg_ops_init(struct idpf_adapter *adapter) void idpf_dev_ops_init(struct idpf_adapter *adapter) { idpf_reg_ops_init(adapter); + + adapter->dev_ops.idc_init = idpf_idc_register; } diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c new file mode 100644 index 0000000..9eb0e0c --- /dev/null +++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c @@ -0,0 +1,212 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (C) 2022 Intel Corporation */ + +#include "idpf.h" +#include "idpf_virtchnl.h" + +static DEFINE_IDA(idpf_idc_ida); + +#define IDPF_IDC_MAX_ADEV_NAME_LEN 15 + +/** + * idpf_idc_init - Called to initialize IDC + * @adapter: driver private data structure + */ +int idpf_idc_init(struct idpf_adapter *adapter) +{ + int err; + + if (!idpf_is_rdma_cap_ena(adapter) || + !adapter->dev_ops.idc_init) + return 0; + + err = adapter->dev_ops.idc_init(adapter); + if (err) + dev_err(&adapter->pdev->dev, "failed to initialize idc: %d\n", + err); + + return err; +} + +/** + * idpf_core_adev_release - function to be mapped to aux dev's release op + * @dev: pointer to device to free + */ +static void idpf_core_adev_release(struct device 
*dev) +{ + struct idc_rdma_core_auxiliary_dev *iadev; + + iadev = container_of(dev, struct idc_rdma_core_auxiliary_dev, adev.dev); + kfree(iadev); + iadev = NULL; +} + +/* idpf_plug_core_aux_dev - allocate and register an Auxiliary device + * @cdev_info: idc core device info pointer + */ +static int idpf_plug_core_aux_dev(struct idc_rdma_core_dev_info *cdev_info) +{ + struct idc_rdma_core_auxiliary_dev *iadev; + char name[IDPF_IDC_MAX_ADEV_NAME_LEN]; + struct auxiliary_device *adev; + int err; + + iadev = (struct idc_rdma_core_auxiliary_dev *) + kzalloc(sizeof(*iadev), GFP_KERNEL); + if (!iadev) + return -ENOMEM; + + adev = &iadev->adev; + cdev_info->adev = adev; + iadev->cdev_info = cdev_info; + + adev->id = ida_alloc(&idpf_idc_ida, GFP_KERNEL); + if (adev->id < 0) { + pr_err("failed to allocate unique device ID for Auxiliary driver\n"); + err = -ENOMEM; + goto err_ida_alloc; + } + adev->dev.release = idpf_core_adev_release; + adev->dev.parent = &cdev_info->pdev->dev; + sprintf(name, "%04x.rdma.core", cdev_info->pdev->vendor); + adev->name = name; + + err = auxiliary_device_init(adev); + if (err) + goto err_aux_dev_init; + + err = auxiliary_device_add(adev); + if (err) + goto err_aux_dev_add; + + return 0; + +err_aux_dev_add: + cdev_info->adev = NULL; + auxiliary_device_uninit(adev); +err_aux_dev_init: + ida_free(&idpf_idc_ida, adev->id); +err_ida_alloc: + kfree(iadev); + + return err; +} + +/* idpf_unplug_aux_dev - unregister and free an Auxiliary device + * @adev: auxiliary device struct + */ +static void idpf_unplug_aux_dev(struct auxiliary_device *adev) +{ + auxiliary_device_delete(adev); + auxiliary_device_uninit(adev); + + ida_free(&idpf_idc_ida, adev->id); +} + +/** + * idpf_idc_vport_dev_ctrl - Called by an Auxiliary Driver + * @cdev_info: idc core device info pointer + * @up: RDMA core driver status + * + * This callback function is accessed by an Auxiliary Driver to indicate + * whether core driver is ready to support vport driver load or if vport + * 
drivers need to be taken down. + */ +static int +idpf_idc_vport_dev_ctrl(struct idc_rdma_core_dev_info *cdev_info, + bool up) +{ + return -EOPNOTSUPP; +} + +/** + * idpf_idc_request_reset - Called by an Auxiliary Driver + * @cdev_info: idc core device info pointer + * @reset_type: function, core or other + * + * This callback function is accessed by an Auxiliary Driver to request a reset + * on the Auxiliary Device + */ +static int +idpf_idc_request_reset(struct idc_rdma_core_dev_info *cdev_info, + enum idc_rdma_reset_type __always_unused reset_type) +{ + return -EOPNOTSUPP; +} + +/* Implemented by the Auxiliary Device and called by the Auxiliary Driver */ +static const struct idc_rdma_core_ops idc_ops = { + .vport_dev_ctrl = idpf_idc_vport_dev_ctrl, + .request_reset = idpf_idc_request_reset, + .vc_send_sync = idpf_idc_rdma_vc_send_sync, +}; + +/** + * idpf_idc_init_msix_data - initialize MSIX data for the cdev_info structure + * @adapter: driver private data structure + */ +static void +idpf_idc_init_msix_data(struct idpf_adapter *adapter) +{ + struct idc_rdma_core_dev_info *cdev_info; + + if (!adapter->rdma_msix_entries) + return; + + cdev_info = adapter->cdev_info; + + cdev_info->msix_entries = adapter->rdma_msix_entries; + cdev_info->msix_count = adapter->num_rdma_msix_entries; +} + +/** + * idpf_idc_init_aux_core_dev - initialize Auxiliary Device(s) + * @adapter: driver private data structure + * @ftype: PF or VF + */ +int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter, + enum idc_function_type ftype) +{ + struct idc_rdma_core_dev_info *cdev_info; + int err; + + adapter->cdev_info = (struct idc_rdma_core_dev_info *) + kzalloc(sizeof(struct idc_rdma_core_dev_info), GFP_KERNEL); + if (!adapter->cdev_info) + return -ENOMEM; + + cdev_info = adapter->cdev_info; + cdev_info->pdev = adapter->pdev; + cdev_info->ops = &idc_ops; + cdev_info->rdma_protocol = IDC_RDMA_PROTOCOL_ROCEV2; + cdev_info->ftype = ftype; + + idpf_idc_init_msix_data(adapter); + + err = 
idpf_plug_core_aux_dev(cdev_info); + if (err) + goto err_plug_aux_dev; + + return 0; + +err_plug_aux_dev: + kfree(cdev_info); + adapter->cdev_info = NULL; + + return err; +} + +/** + * idpf_idc_deinit_core_aux_device - de-initialize Auxiliary Device(s) + * @cdev_info: idc core device info pointer + */ +void idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info) +{ + if (!cdev_info) + return; + + idpf_unplug_aux_dev(cdev_info->adev); + + kfree(cdev_info->mapped_mem_regions); + kfree(cdev_info); +} diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 0b96518..5e1414b 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1858,6 +1858,10 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter) unlock_mutex: mutex_unlock(&adapter->vport_ctrl_lock); + /* Wait until all vports are created to init RDMA CORE AUX */ + if (!err) + err = idpf_idc_init(adapter); + return err; } diff --git a/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c b/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c index 629cb5c..db6a595 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c +++ b/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c @@ -142,6 +142,15 @@ static void idpf_vf_trigger_reset(struct idpf_adapter *adapter, } /** + * idpf_idc_vf_register - register for IDC callbacks + * @adapter: Driver specific private structure + */ +static int idpf_idc_vf_register(struct idpf_adapter *adapter) +{ + return idpf_idc_init_aux_core_dev(adapter, IDC_FUNCTION_TYPE_VF); +} + +/** * idpf_vf_reg_ops_init - Initialize register API function pointers * @adapter: Driver specific private structure */ @@ -161,4 +170,6 @@ static void idpf_vf_reg_ops_init(struct idpf_adapter *adapter) void idpf_vf_dev_ops_init(struct idpf_adapter *adapter) { idpf_vf_reg_ops_init(adapter); + + adapter->dev_ops.idc_init = idpf_idc_vf_register; } diff --git 
a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index a5f9b7a..cdfd440 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -891,6 +891,7 @@ static int idpf_send_get_caps_msg(struct idpf_adapter *adapter) caps.other_caps = cpu_to_le64(VIRTCHNL2_CAP_SRIOV | + VIRTCHNL2_CAP_RDMA | VIRTCHNL2_CAP_MACFILTER | VIRTCHNL2_CAP_SPLITQ_QSCHED | VIRTCHNL2_CAP_PROMISC | @@ -3042,6 +3043,7 @@ void idpf_vc_core_deinit(struct idpf_adapter *adapter) idpf_vc_xn_shutdown(adapter->vcxn_mngr); idpf_deinit_task(adapter); + idpf_idc_deinit_core_aux_device(adapter->cdev_info); idpf_intr_rel(adapter); cancel_delayed_work_sync(&adapter->serv_task); @@ -3688,3 +3690,18 @@ int idpf_set_promiscuous(struct idpf_adapter *adapter, return reply_sz < 0 ? reply_sz : 0; } + +/** + * idpf_idc_rdma_vc_send_sync - virtchnl send callback for IDC registered drivers + * @cdev_info: idc core device info pointer + * @send_msg: message to send + * @msg_size: size of message to send + * @recv_msg: message to populate on reception of response + * @recv_len: length of message copied into recv_msg or 0 on error + */ +int idpf_idc_rdma_vc_send_sync(struct idc_rdma_core_dev_info *cdev_info, + u8 *send_msg, u16 msg_size, + u8 *recv_msg, u16 *recv_len) +{ + return -EOPNOTSUPP; +} diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h index 83da5d8..6163cfa 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h @@ -66,5 +66,8 @@ int idpf_set_promiscuous(struct idpf_adapter *adapter, int idpf_send_set_sriov_vfs_msg(struct idpf_adapter *adapter, u16 num_vfs); int idpf_send_get_set_rss_key_msg(struct idpf_vport *vport, bool get); int idpf_send_get_set_rss_lut_msg(struct idpf_vport *vport, bool get); +int idpf_idc_rdma_vc_send_sync(struct idc_rdma_core_dev_info *cdev_info, + u8 *send_msg, u16 
msg_size, + u8 *recv_msg, u16 *recv_len); #endif /* _IDPF_VIRTCHNL_H_ */ From patchwork Wed Jul 24 23:38:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741443 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7DA1A13BC3E for ; Wed, 24 Jul 2024 23:40:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864440; cv=none; b=pZffVt5GDQirs4c9Thhz8UdKeaxOjBaTYiQG2TYgFr9/qLPxQEvEVIMQLdd6qf2typcbwyY0N+4YKHOZh7bbRBi9u+w9r95730vciwj8LXjzOW+nm2gzCGnkLNk7HzHQtuYr1Iz/1kf4BGD9X1T5kLisxVA2Ky9Mi9FQH0nndd0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864440; c=relaxed/simple; bh=Q2USVksmYtK7mddJ+9HZDrvw38qok5ardrI1zRf6+y8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=IKCW1HV5snGPty3pq7rPyq0nGtswS/X6gBQEGUPje1FKE8sB9qHqkOTrf2b0ZQ09RW8rouZLvfyl9wmND42kfg48tUBXErKQZiZ5dNfw6Z9TC6lZDQcTJQnQCHJHjdXCPLLURwY5M9GJpGeiF6JNvKneNY7ruI8/g2KkwD2c5Rk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=JBXMvyre; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="JBXMvyre" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
s=Intel; t=1721864440; x=1753400440; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Q2USVksmYtK7mddJ+9HZDrvw38qok5ardrI1zRf6+y8=; b=JBXMvyreudz+yBpf5c80TWsIQHOhZPxPK7xqoZVFL5LAFMVhhny6B1Gc FP9qYyPe9lXtFDpg7uA6wOhKxPnDIslCimM0JpXgPvZGcc+hR1FGcwAj1 QPp2PAxnDVq39/QdwaQ1a6AV845fnrJfz+KIG6syP77ifDasXChTYW5TU sB9To8TkvidsV7w4U9pPkJHRQRvirJTdHSmWTyyhmpOTVCYOjr4IH/Pq1 dMiWRAvGDt0n2Gs9RFXqAFsh7/F8rdFHzVYH4s08pr7nMWYa97CQEV6Fw /7BWOyfBPvtwfwn5lRkPud9j0aKDV0HQGvBONnEG4qvTrAUufpvqgyZDJ w==; X-CSE-ConnectionGUID: MCzGs1y1T1+ot9HWGimrVQ== X-CSE-MsgGUID: 2uFV/nmzRu2+77L37K7IUg== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999745" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999745" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:37 -0700 X-CSE-ConnectionGUID: aRJiCRpcQtCWv7tUBD4OSA== X-CSE-MsgGUID: 9AC7/ACsQXiFAp7RYNh6Uw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426006" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:35 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay , Tatyana Nikolova Subject: [RFC PATCH 04/25] idpf: prevent deadlock with irdma get link settings Date: Wed, 24 Jul 2024 18:38:56 -0500 Message-Id: <20240724233917.704-5-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Joshua Hay When the rdma core deinitializes (reset or remove), it is calling get_link_ksettings. 
Add logic to get link settings to avoid even taking the lock during a reset or remove, since we will not care what the link settings are in that state anyways. Signed-off-by: Joshua Hay Signed-off-by: Tatyana Nikolova --- drivers/net/ethernet/intel/idpf/idpf_ethtool.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c index 1885ba6..48f1677 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c @@ -1310,20 +1310,25 @@ static void idpf_set_msglevel(struct net_device *netdev, u32 data) static int idpf_get_link_ksettings(struct net_device *netdev, struct ethtool_link_ksettings *cmd) { + struct idpf_adapter *adapter = idpf_netdev_to_adapter(netdev); struct idpf_vport *vport; - idpf_vport_ctrl_lock(netdev); - vport = idpf_netdev_to_vport(netdev); - ethtool_link_ksettings_zero_link_mode(cmd, supported); cmd->base.autoneg = AUTONEG_DISABLE; cmd->base.port = PORT_NONE; + cmd->base.duplex = DUPLEX_UNKNOWN; + cmd->base.speed = SPEED_UNKNOWN; + + if (idpf_is_reset_in_prog(adapter) || + test_bit(IDPF_REMOVE_IN_PROG, adapter->flags)) + return 0; + + idpf_vport_ctrl_lock(netdev); + vport = idpf_netdev_to_vport(netdev); + if (vport->link_up) { cmd->base.duplex = DUPLEX_FULL; cmd->base.speed = vport->link_speed_mbps; - } else { - cmd->base.duplex = DUPLEX_UNKNOWN; - cmd->base.speed = SPEED_UNKNOWN; } idpf_vport_ctrl_unlock(netdev); From patchwork Wed Jul 24 23:38:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741444 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEE71143888 for ; Wed, 24 Jul 2024 23:40:39 
+0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864441; cv=none; b=qOH8yn8roeDnpIsseLK5DUEHFAZ+pDBJ0Gv9mlriRY13E7/QaxE4m8Gi6xGJnkKNRs1j/3A2LDciV4GK5WqKhAfTJW9wSvXwo2u3ms7PCbcvCggriAIMvecbqvw+eMdhv8NXfZmPuQuAoCB+hn8J584Cc/+OEvT9R17RUxk7eSs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864441; c=relaxed/simple; bh=OnBHFG5Pr6ADKwrVtJHTycEjn1QZrzo8yDVJoB29Xf8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=il5suvVBPAMwtywv7U65Ym58WlDgo60xgaT6jl2l6HemFyKPkSynmJhrAvzwO2VPQCfzohVvScucKjiNY3anCYAkM49TSSQP/jMVTU+d+3MSK6Z7H9X//yB8vmwtcXy7/Jh239ADyqwvh7RZbuw+ats2zdA5kokqCiDAwit8QZY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Jo7LQepY; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Jo7LQepY" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864440; x=1753400440; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OnBHFG5Pr6ADKwrVtJHTycEjn1QZrzo8yDVJoB29Xf8=; b=Jo7LQepYSAh/Ralg/qpEWKHGj7sfdhUiDzMxi+SaIIRM7pP19QMK+HtB SO+4FfRuE0CK9cW92IR8Zzfb5MIDLLAoUukwleKdSTRMHAnHUnHqJYOSj 6wpfYbmf14/2TEIhYjkCrUK0HlkJE/D1/mz18s2wehuZ6bqw5beLwZYwr tvynln1uXdxYQ/Xt6bUHQcAfjEVsDOf4tVhflyT/b7o/Lh9c3fNUBYSzI 1pi1PLYZzzV60rY6WZWCIvFtthwVBXPCXgDmFUb05lE2ahJ1h9YxNY/UG vUi++AfJGo6bDFpVMejTeVqwzcVMnCvYvhDGaVqN++J0t3Lbwlq8QgAJ5 g==; 
X-CSE-ConnectionGUID: IKVNBEnlQRujCo9NqvZ/2w== X-CSE-MsgGUID: wUvw665+QHCI1fDUBl6ofA== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999748" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999748" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:37 -0700 X-CSE-ConnectionGUID: pxPvqRryQZillXRucJVOWw== X-CSE-MsgGUID: hlcswRXmS6Oqk1lgJ/BdvA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426010" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:36 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay , Tatyana Nikolova Subject: [RFC PATCH 05/25] idpf: implement rdma vport auxiliary dev create, init, and destroy Date: Wed, 24 Jul 2024 18:38:57 -0500 Message-Id: <20240724233917.704-6-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Joshua Hay Implement the functions to create, initialize, and destroy an rdma vport auxiliary device. The vport aux dev creation is dependent on the core aux device to call idpf_idc_vport_dev_ctrl to signal that it is ready for vport aux devices. Implement that core callback to either create and initialize the vport aux dev or deinitialize. Rdma vport aux dev creation is also dependent on the control plane to tell us the vport is rdma enabled. Add a flag in the create vport message to signal individual vport rdma capabilities. 
Signed-off-by: Joshua Hay Signed-off-by: Tatyana Nikolova --- drivers/net/ethernet/intel/idpf/idpf.h | 3 + drivers/net/ethernet/intel/idpf/idpf_idc.c | 170 +++++++++++++++++++++++++++- drivers/net/ethernet/intel/idpf/idpf_lib.c | 2 + drivers/net/ethernet/intel/idpf/virtchnl2.h | 13 ++- 4 files changed, 185 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h index 9397208..46ec54e 100644 --- a/drivers/net/ethernet/intel/idpf/idpf.h +++ b/drivers/net/ethernet/intel/idpf/idpf.h @@ -319,6 +319,8 @@ struct idpf_vport { u32 rxq_model; struct idpf_rx_ptype_decoded rx_ptype_lkup[IDPF_RX_MAX_PTYPE]; + struct idc_rdma_vport_dev_info *vdev_info; + struct idpf_adapter *adapter; struct net_device *netdev; DECLARE_BITMAP(flags, IDPF_VPORT_FLAGS_NBITS); @@ -865,5 +867,6 @@ void idpf_vport_intr_write_itr(struct idpf_q_vector *q_vector, int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter, enum idc_function_type ftype); void idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info); +void idpf_idc_deinit_vport_aux_device(struct idc_rdma_vport_dev_info *vdev_info); #endif /* !_IDPF_H_ */ diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c index 9eb0e0c..bb69b2d 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_idc.c +++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c @@ -29,6 +29,111 @@ int idpf_idc_init(struct idpf_adapter *adapter) } /** + * idpf_vport_adev_release - function to be mapped to aux dev's release op + * @dev: pointer to device to free + */ +static void idpf_vport_adev_release(struct device *dev) +{ + struct idc_rdma_vport_auxiliary_dev *iadev; + + iadev = container_of(dev, struct idc_rdma_vport_auxiliary_dev, adev.dev); + kfree(iadev); + iadev = NULL; +} + +/* idpf_plug_vport_aux_dev - allocate and register a vport Auxiliary device + * @cdev_info: idc core device info pointer + * @vdev_info: idc vport device info pointer + */ 
+static int idpf_plug_vport_aux_dev(struct idc_rdma_core_dev_info *cdev_info, + struct idc_rdma_vport_dev_info *vdev_info) +{ + struct idc_rdma_vport_auxiliary_dev *iadev; + char name[IDPF_IDC_MAX_ADEV_NAME_LEN]; + struct auxiliary_device *adev; + int err; + + iadev = (struct idc_rdma_vport_auxiliary_dev *) + kzalloc(sizeof(*iadev), GFP_KERNEL); + if (!iadev) + return -ENOMEM; + + adev = &iadev->adev; + vdev_info->adev = &iadev->adev; + iadev->vdev_info = vdev_info; + + adev->id = ida_alloc(&idpf_idc_ida, GFP_KERNEL); + if (adev->id < 0) { + pr_err("failed to allocate unique device ID for Auxiliary driver\n"); + err = -ENOMEM; + goto err_ida_alloc; + } + adev->dev.release = idpf_vport_adev_release; + adev->dev.parent = &cdev_info->pdev->dev; + sprintf(name, "%04x.rdma.vdev", cdev_info->pdev->vendor); + adev->name = name; + + err = auxiliary_device_init(adev); + if (err) + goto err_aux_dev_init; + + err = auxiliary_device_add(adev); + if (err) + goto err_aux_dev_add; + + return 0; + +err_aux_dev_add: + vdev_info->adev = NULL; + auxiliary_device_uninit(adev); +err_aux_dev_init: + ida_free(&idpf_idc_ida, adev->id); +err_ida_alloc: + kfree(iadev); + + return err; +} + +/** + * idpf_idc_init_aux_vport_dev - initialize vport Auxiliary Device(s) + * @vport: virtual port data struct + */ +static int idpf_idc_init_aux_vport_dev(struct idpf_vport *vport) +{ + struct idpf_adapter *adapter = vport->adapter; + struct idc_rdma_vport_dev_info *vdev_info; + struct idc_rdma_core_dev_info *cdev_info; + struct virtchnl2_create_vport *vport_msg; + int err; + + vport_msg = (struct virtchnl2_create_vport *) + adapter->vport_params_recvd[vport->idx]; + + if (!(le16_to_cpu(vport_msg->vport_flags) & VIRTCHNL2_VPORT_ENABLE_RDMA)) + return 0; + + vport->vdev_info = (struct idc_rdma_vport_dev_info *) + kzalloc(sizeof(*vdev_info), GFP_KERNEL); + if (!vport->vdev_info) + return -ENOMEM; + + cdev_info = vport->adapter->cdev_info; + + vdev_info = vport->vdev_info; + vdev_info->vport_id = 
vport->vport_id; + vdev_info->netdev = vport->netdev; + vdev_info->core_adev = cdev_info->adev; + + err = idpf_plug_vport_aux_dev(cdev_info, vdev_info); + if (err) { + kfree(vdev_info); + return err; + } + + return 0; +} + +/** * idpf_core_adev_release - function to be mapped to aux dev's release op * @dev: pointer to device to free */ @@ -104,6 +209,48 @@ static void idpf_unplug_aux_dev(struct auxiliary_device *adev) } /** + * idpf_idc_vport_dev_up - called when CORE is ready for vport aux devs + * @adapter: private data struct + */ +static int idpf_idc_vport_dev_up(struct idpf_adapter *adapter) +{ + int i, err = 0; + + for (i = 0; i < adapter->num_alloc_vports; i++) { + struct idpf_vport *vport = adapter->vports[i]; + + if (!vport) + continue; + + if (!vport->vdev_info) + err = idpf_idc_init_aux_vport_dev(vport); + else + err = idpf_plug_vport_aux_dev(vport->adapter->cdev_info, + vport->vdev_info); + } + + return err; +} + +/** + * idpf_idc_vport_dev_down - called CORE is leaving vport aux dev support state + * @adapter: private data struct + */ +static void idpf_idc_vport_dev_down(struct idpf_adapter *adapter) +{ + int i; + + for (i = 0; i < adapter->num_alloc_vports; i++) { + struct idpf_vport *vport = adapter->vports[i]; + + if (!vport) + continue; + + idpf_unplug_aux_dev(vport->vdev_info->adev); + } +} + +/** * idpf_idc_vport_dev_ctrl - Called by an Auxiliary Driver * @cdev_info: idc core device info pointer * @up: RDMA core driver status @@ -116,7 +263,14 @@ static void idpf_unplug_aux_dev(struct auxiliary_device *adev) idpf_idc_vport_dev_ctrl(struct idc_rdma_core_dev_info *cdev_info, bool up) { - return -EOPNOTSUPP; + struct idpf_adapter *adapter = pci_get_drvdata(cdev_info->pdev); + + if (up) + return idpf_idc_vport_dev_up(adapter); + + idpf_idc_vport_dev_down(adapter); + + return 0; } /** @@ -210,3 +364,17 @@ void idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info) kfree(cdev_info->mapped_mem_regions); kfree(cdev_info); } + +/** + * 
idpf_idc_deinit_vport_aux_device - de-initialize Auxiliary Device(s) + * @vdev_info: idc vport device info pointer + */ +void idpf_idc_deinit_vport_aux_device(struct idc_rdma_vport_dev_info *vdev_info) +{ + if (!vdev_info) + return; + + idpf_unplug_aux_dev(vdev_info->adev); + + kfree(vdev_info); +} diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c index 5e1414b..718d40c 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c @@ -1056,6 +1056,8 @@ static void idpf_vport_dealloc(struct idpf_vport *vport) struct idpf_adapter *adapter = vport->adapter; unsigned int i = vport->idx; + idpf_idc_deinit_vport_aux_device(vport->vdev_info); + idpf_deinit_mac_addr(vport); idpf_vport_stop(vport); diff --git a/drivers/net/ethernet/intel/idpf/virtchnl2.h b/drivers/net/ethernet/intel/idpf/virtchnl2.h index 80c17e4..673a39e 100644 --- a/drivers/net/ethernet/intel/idpf/virtchnl2.h +++ b/drivers/net/ethernet/intel/idpf/virtchnl2.h @@ -563,6 +563,15 @@ struct virtchnl2_queue_reg_chunks { VIRTCHNL2_CHECK_STRUCT_LEN(8, virtchnl2_queue_reg_chunks); /** + * enum virtchnl2_vport_flags - Vport flags + * @VIRTCHNL2_VPORT_ENABLE_RDMA: RDMA is enabled for this vport + */ +enum virtchnl2_vport_flags { + /* VIRTCHNL2_VPORT_* bits [0:3] rsvd */ + VIRTCHNL2_VPORT_ENABLE_RDMA = BIT(4), +}; + +/** * struct virtchnl2_create_vport - Create vport config info. * @vport_type: See enum virtchnl2_vport_type. * @txq_model: See virtchnl2_queue_model. @@ -580,7 +589,7 @@ struct virtchnl2_queue_reg_chunks { * @max_mtu: Max MTU. CP populates this field on response. * @vport_id: Vport id. CP populates this field on response. * @default_mac_addr: Default MAC address. - * @pad: Padding. + * @vport_flags: See enum virtchnl2_vport_flags * @rx_desc_ids: See VIRTCHNL2_RX_DESC_IDS definitions. * @tx_desc_ids: See VIRTCHNL2_TX_DESC_IDS definitions. * @pad1: Padding. 
@@ -613,7 +622,7 @@ struct virtchnl2_create_vport {
 	__le16 max_mtu;
 	__le32 vport_id;
 	u8 default_mac_addr[ETH_ALEN];
-	__le16 pad;
+	__le16 vport_flags;
 	__le64 rx_desc_ids;
 	__le64 tx_desc_ids;
 	u8 pad1[72];

From patchwork Wed Jul 24 23:38:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741445
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay, Tatyana Nikolova
Subject: [RFC PATCH 06/25] idpf: implement remaining idc rdma core callbacks and handlers
Date: Wed, 24 Jul 2024 18:38:58 -0500
Message-Id: <20240724233917.704-7-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0

From: Joshua Hay

Implement the idpf_idc_request_reset and idpf_idc_rdma_vc_send_sync
callbacks for the rdma core auxiliary driver to issue reset events to
the idpf and send (synchronous) virtchnl messages to the control plane
respectively.

Implement and plumb the reset handler for the opposite flow as well,
i.e. when the idpf is resetting and needs to notify the rdma core
auxiliary driver.

Signed-off-by: Joshua Hay
Signed-off-by: Tatyana Nikolova
---
 drivers/net/ethernet/intel/idpf/idpf.h          |  1 +
 drivers/net/ethernet/intel/idpf/idpf_idc.c      | 43 ++++++++++++++++++++++++-
 drivers/net/ethernet/intel/idpf/idpf_lib.c      |  2 ++
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 23 ++++++++++++-
 drivers/net/ethernet/intel/idpf/virtchnl2.h     |  3 +-
 5 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 46ec54e..0e3e7000 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -868,5 +868,6 @@ int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter,
 			       enum idc_function_type ftype);
 void idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info);
 void idpf_idc_deinit_vport_aux_device(struct idc_rdma_vport_dev_info *vdev_info);
+void idpf_idc_issue_reset_event(struct idc_rdma_core_dev_info *cdev_info);
 
 #endif /* !_IDPF_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
index bb69b2d..24ab9a4 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
@@ -209,6 +209,38 @@ static void idpf_unplug_aux_dev(struct auxiliary_device *adev)
 }
 
 /**
+ * idpf_idc_issue_reset_event - Function to handle reset IDC event
+ * @cdev_info: idc core device info pointer
+ */
+void idpf_idc_issue_reset_event(struct idc_rdma_core_dev_info *cdev_info)
+{
+	enum idc_rdma_event_type event_type = IDC_RDMA_EVENT_WARN_RESET;
+	struct idc_rdma_core_auxiliary_drv *iadrv;
+	struct idc_rdma_event event = { };
+	struct auxiliary_device *adev;
+
+	if (!cdev_info)
+		/* RDMA is not enabled */
+		return;
+
+	set_bit(event_type, event.type);
+
+	device_lock(&cdev_info->adev->dev);
+
+	adev = cdev_info->adev;
+	if (!adev || !adev->dev.driver)
+		goto unlock;
+
+	iadrv = container_of(adev->dev.driver,
+			     struct idc_rdma_core_auxiliary_drv,
+			     adrv.driver);
+	if (iadrv && iadrv->event_handler)
+		iadrv->event_handler(cdev_info, &event);
+unlock:
+	device_unlock(&cdev_info->adev->dev);
+}
+
+/**
  * idpf_idc_vport_dev_up - called when CORE is ready for vport aux devs
  * @adapter: private data struct
  */
@@ -285,7 +317,16 @@ static void idpf_idc_vport_dev_down(struct idpf_adapter *adapter)
 idpf_idc_request_reset(struct idc_rdma_core_dev_info *cdev_info,
 		       enum idc_rdma_reset_type __always_unused reset_type)
 {
-	return -EOPNOTSUPP;
+	struct idpf_adapter *adapter = pci_get_drvdata(cdev_info->pdev);
+
+	if (!idpf_is_reset_in_prog(adapter)) {
+		set_bit(IDPF_HR_FUNC_RESET, adapter->flags);
+		queue_delayed_work(adapter->vc_event_wq,
+				   &adapter->vc_event_task,
+				   msecs_to_jiffies(10));
+	}
+
+	return 0;
 }
 
 /* Implemented by the Auxiliary Device and called by the Auxiliary Driver */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 718d40c..237cc04 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1814,6 +1814,8 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter)
 	} else if (test_and_clear_bit(IDPF_HR_FUNC_RESET, adapter->flags)) {
 		bool is_reset = idpf_is_reset_detected(adapter);
 
+		idpf_idc_issue_reset_event(adapter->cdev_info);
+
 		idpf_set_vport_state(adapter);
 		idpf_vc_core_deinit(adapter);
 		if (!is_reset)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index cdfd440..65936de 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -3703,5 +3703,26 @@ int idpf_idc_rdma_vc_send_sync(struct idc_rdma_core_dev_info *cdev_info,
 			       u8 *send_msg, u16 msg_size,
 			       u8 *recv_msg, u16 *recv_len)
 {
-	return -EOPNOTSUPP;
+	struct idpf_adapter *adapter = pci_get_drvdata(cdev_info->pdev);
+	struct idpf_vc_xn_params xn_params = { };
+	ssize_t reply_sz;
+	u16 recv_size;
+
+	if (!recv_msg || !recv_len || msg_size > IDPF_CTLQ_MAX_BUF_LEN)
+		return -EINVAL;
+
+	recv_size = min_t(u16, *recv_len, IDPF_CTLQ_MAX_BUF_LEN);
+	*recv_len = 0;
+	xn_params.vc_op = VIRTCHNL2_OP_RDMA;
+	xn_params.timeout_ms = IDPF_VC_XN_DEFAULT_TIMEOUT_MSEC;
+	xn_params.send_buf.iov_base = send_msg;
+	xn_params.send_buf.iov_len = msg_size;
+	xn_params.recv_buf.iov_base = recv_msg;
+	xn_params.recv_buf.iov_len = recv_size;
+	reply_sz = idpf_vc_xn_exec(adapter, &xn_params);
+	if (reply_sz < 0)
+		return reply_sz;
+	*recv_len = reply_sz;
+
+	return 0;
 }
diff --git a/drivers/net/ethernet/intel/idpf/virtchnl2.h b/drivers/net/ethernet/intel/idpf/virtchnl2.h
index 673a39e..e6541152 100644
--- a/drivers/net/ethernet/intel/idpf/virtchnl2.h
+++ b/drivers/net/ethernet/intel/idpf/virtchnl2.h
@@ -62,8 +62,9 @@ enum virtchnl2_op {
 	VIRTCHNL2_OP_GET_PTYPE_INFO		= 526,
 	/* Opcode 527 and 528 are reserved for VIRTCHNL2_OP_GET_PTYPE_ID and
 	 * VIRTCHNL2_OP_GET_PTYPE_INFO_RAW.
-	 * Opcodes 529, 530, 531, 532 and 533 are reserved.
 	 */
+	VIRTCHNL2_OP_RDMA			= 529,
+	/* Opcodes 530 through 533 are reserved. */
 	VIRTCHNL2_OP_LOOPBACK			= 534,
 	VIRTCHNL2_OP_ADD_MAC_ADDR		= 535,
 	VIRTCHNL2_OP_DEL_MAC_ADDR		= 536,

From patchwork Wed Jul 24 23:38:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741446
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay, Tatyana Nikolova
Subject: [RFC PATCH 07/25] idpf: use actual mbx receive payload length
Date: Wed, 24 Jul 2024 18:38:59 -0500
Message-Id: <20240724233917.704-8-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0

From: Joshua Hay

When a mailbox message is received, the driver checks for a nonzero
data_len in the controlq descriptor. If it is nonzero, the payload is
attached to the ctlq message to give to the upper layer. However, the
payload response size given to the upper layer was taken from the buffer
metadata, which is _always_ the max buffer size. This meant the API was
returning 4K as the payload size for all messages.

This went unnoticed because the virtchnl exchange response logic only
rejected a response size that was less than 0 (error), less than the
exact expected size, or less than the max mailbox buffer size (4K). All
of these checks pass in the success case since the size provided is
always 4K.

Fetch the actual payload length from the descriptor data_len field
instead of the buffer metadata.

Unfortunately, this means we lose some extra error parsing for
variable-sized virtchnl responses such as create vport and get ptypes.
However, the original checks weren't really helping anyway since the
size was _always_ 4K.

Signed-off-by: Joshua Hay
Signed-off-by: Tatyana Nikolova
---
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 65936de..e9a71d3 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -664,7 +664,7 @@ static ssize_t idpf_vc_xn_exec(struct idpf_adapter *adapter,
 
 	if (ctlq_msg->data_len) {
 		payload = ctlq_msg->ctx.indirect.payload->va;
-		payload_size = ctlq_msg->ctx.indirect.payload->size;
+		payload_size = ctlq_msg->data_len;
 	}
 
 	xn->reply_sz = payload_size;
@@ -1291,10 +1291,6 @@ int idpf_send_create_vport_msg(struct idpf_adapter *adapter,
 		err = reply_sz;
 		goto free_vport_params;
 	}
-	if (reply_sz < IDPF_CTLQ_MAX_BUF_LEN) {
-		err = -EIO;
-		goto free_vport_params;
-	}
 
 	return 0;
 
@@ -2557,9 +2553,6 @@ int idpf_send_get_rx_ptype_msg(struct idpf_vport *vport)
 		if (reply_sz < 0)
 			return reply_sz;
 
-		if (reply_sz < IDPF_CTLQ_MAX_BUF_LEN)
-			return -EIO;
-
 		ptypes_recvd += le16_to_cpu(ptype_info->num_ptypes);
 		if (ptypes_recvd > max_ptype)
 			return -EINVAL;

From patchwork Wed Jul 24 23:39:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741447
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay, Tatyana Nikolova
Subject: [RFC PATCH 08/25] idpf: implement idc vport aux driver mtu change handler
Date: Wed, 24 Jul 2024 18:39:00 -0500
Message-Id: <20240724233917.704-9-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0

From: Joshua Hay

The only event an rdma vport aux driver cares about right now is an MTU
change on its underlying vport. Implement and plumb the handler to
signal the pre- and post-MTU-change events to the rdma vport aux driver.

Signed-off-by: Joshua Hay
Signed-off-by: Tatyana Nikolova
---
 drivers/net/ethernet/intel/idpf/idpf.h     |  2 ++
 drivers/net/ethernet/intel/idpf/idpf_idc.c | 31 ++++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/idpf/idpf_lib.c | 11 ++++++++---
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 0e3e7000..8ab4b935 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -869,5 +869,7 @@ int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter,
 void idpf_idc_deinit_core_aux_device(struct idc_rdma_core_dev_info *cdev_info);
 void idpf_idc_deinit_vport_aux_device(struct idc_rdma_vport_dev_info *vdev_info);
 void idpf_idc_issue_reset_event(struct idc_rdma_core_dev_info *cdev_info);
+void idpf_idc_vdev_mtu_event(struct idc_rdma_vport_dev_info *vdev_info,
+			     enum idc_rdma_event_type event_type);
 
 #endif /* !_IDPF_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
index 24ab9a4..b87a72b 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
@@ -134,6 +134,37 @@ static int idpf_idc_init_aux_vport_dev(struct idpf_vport *vport)
 }
 
 /**
+ * idpf_idc_vdev_mtu_event - Function to handle IDC vport mtu change events
+ * @vdev_info: idc vport device info pointer
+ * @event_type: type of event to pass to handler
+ */
+void idpf_idc_vdev_mtu_event(struct idc_rdma_vport_dev_info *vdev_info,
+			     enum idc_rdma_event_type event_type)
+{
+	struct idc_rdma_vport_auxiliary_drv *iadrv;
+	struct idc_rdma_event event = { };
+	struct auxiliary_device *adev;
+
+	if (!vdev_info)
+		/* RDMA is not enabled */
+		return;
+
+	set_bit(event_type, event.type);
+
+	device_lock(&vdev_info->adev->dev);
+	adev = vdev_info->adev;
+	if (!adev || !adev->dev.driver)
+		goto unlock;
+	iadrv = container_of(adev->dev.driver,
+			     struct idc_rdma_vport_auxiliary_drv,
+			     adrv.driver);
+	if (iadrv && iadrv->event_handler)
+		iadrv->event_handler(vdev_info, &event);
+unlock:
+	device_unlock(&vdev_info->adev->dev);
+}
+
+/**
  * idpf_core_adev_release - function to be mapped to aux dev's release op
  * @dev: pointer to device to free
  */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 237cc04..8124aa8 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1941,6 +1941,8 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
 		idpf_vport_calc_num_q_desc(new_vport);
 		break;
 	case IDPF_SR_MTU_CHANGE:
+		idpf_idc_vdev_mtu_event(vport->vdev_info,
+					IDC_RDMA_EVENT_BEFORE_MTU_CHANGE);
 	case IDPF_SR_RSC_CHANGE:
 		break;
 	default:
@@ -1952,6 +1954,7 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
 	err = idpf_vport_queues_alloc(new_vport);
 	if (err)
 		goto free_vport;
+
 	if (current_state <= __IDPF_VPORT_DOWN) {
 		idpf_send_delete_queues_msg(vport);
 	} else {
@@ -2028,15 +2031,17 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
 	if (current_state == __IDPF_VPORT_UP)
 		err = idpf_vport_open(vport, false);
 
-	kfree(new_vport);
-
-	return err;
+	goto free_vport;
 
 err_reset:
 	idpf_vport_queues_rel(new_vport);
 free_vport:
 	kfree(new_vport);
 
+	if (reset_cause == IDPF_SR_MTU_CHANGE)
+		idpf_idc_vdev_mtu_event(vport->vdev_info,
+					IDC_RDMA_EVENT_AFTER_MTU_CHANGE);
+
 	return err;
 }
 

From patchwork Wed Jul 24 23:39:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741448
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Joshua Hay, Tatyana Nikolova
Subject: [RFC PATCH 09/25] idpf: implement get lan mmio memory regions
Date: Wed, 24 Jul 2024 18:39:01 -0500
Message-Id: <20240724233917.704-10-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0

From: Joshua Hay

The rdma driver needs to map its own mmio regions for the sake of
performance, meaning the idpf needs to avoid mapping portions of the BAR
space. However, to be vendor-agnostic, the idpf cannot assume where
these are and must avoid mapping hard-coded regions.

Instead, the idpf will map the bare minimum to load and communicate with
the control plane, i.e. the mailbox registers and the reset state
registers. The idpf will then call a new virtchnl op to fetch a list of
mmio regions that it should map.
Every other register access then finds the region that contains its
offset and computes its address from that region's mapping.

Signed-off-by: Joshua Hay
Signed-off-by: Tatyana Nikolova
---
 drivers/net/ethernet/intel/idpf/idpf.h          |  66 +++++++++-
 drivers/net/ethernet/intel/idpf/idpf_controlq.h |  13 +-
 drivers/net/ethernet/intel/idpf/idpf_dev.c      |  35 +++---
 drivers/net/ethernet/intel/idpf/idpf_idc.c      |  26 +++-
 drivers/net/ethernet/intel/idpf/idpf_main.c     |  36 +++++-
 drivers/net/ethernet/intel/idpf/idpf_mem.h      |   8 +-
 drivers/net/ethernet/intel/idpf/idpf_vf_dev.c   |  31 +++--
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 154 +++++++++++++++++++++++-
 drivers/net/ethernet/intel/idpf/virtchnl2.h     |  30 ++++-
 9 files changed, 355 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 8ab4b935..5240a44 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -192,7 +192,8 @@ struct idpf_vport_max_q {
  * @trigger_reset: Trigger a reset to occur
  */
 struct idpf_reg_ops {
-	void (*ctlq_reg_init)(struct idpf_ctlq_create_info *cq);
+	void (*ctlq_reg_init)(struct idpf_adapter *adapter,
+			      struct idpf_ctlq_create_info *cq);
 	int (*intr_reg_init)(struct idpf_vport *vport);
 	void (*mb_intr_reg_init)(struct idpf_adapter *adapter);
 	void (*reset_reg_init)(struct idpf_adapter *adapter);
@@ -200,6 +201,11 @@ struct idpf_reg_ops {
 			      enum idpf_flags trig_cause);
 };
 
+#define IDPF_PF_MBX_REGION_SZ		4096
+#define IDPF_PF_RSTAT_REGION_SZ		2048
+#define IDPF_VF_MBX_REGION_SZ		10240
+#define IDPF_VF_RSTAT_REGION_SZ		2048
+
 /**
  * struct idpf_dev_ops - Device specific operations
  * @reg_ops: Register operations
@@ -208,6 +214,11 @@ struct idpf_dev_ops {
 	struct idpf_reg_ops reg_ops;
 
 	int (*idc_init)(struct idpf_adapter *adapter);
+
+	resource_size_t mbx_reg_start;
+	resource_size_t mbx_reg_sz;
+	resource_size_t rstat_reg_start;
+	resource_size_t rstat_reg_sz;
 };
 
 /**
@@ -731,6 +742,35 @@ static inline u8 idpf_get_min_tx_pkt_len(struct idpf_adapter *adapter)
 }
 
 /**
+ * idpf_get_mbx_reg_addr - Get BAR0 mailbox register address
+ * @adapter: private data struct
+ * @reg_offset: register offset value
+ *
+ * Based on the register offset, return the actual BAR0 register address
+ */
+static inline void __iomem *idpf_get_mbx_reg_addr(struct idpf_adapter *adapter,
+						  resource_size_t reg_offset)
+{
+	return (void __iomem *)(adapter->hw.mbx.addr + reg_offset);
+}
+
+/**
+ * idpf_get_rstat_reg_addr - Get BAR0 rstat register address
+ * @adapter: private data struct
+ * @reg_offset: register offset value
+ *
+ * Based on the register offset, return the actual BAR0 register address
+ */
+static inline
+void __iomem *idpf_get_rstat_reg_addr(struct idpf_adapter *adapter,
+				      resource_size_t reg_offset)
+{
+	reg_offset -= adapter->dev_ops.rstat_reg_start;
+
+	return (void __iomem *)(adapter->hw.rstat.addr + reg_offset);
+}
+
+/**
  * idpf_get_reg_addr - Get BAR0 register address
  * @adapter: private data struct
  * @reg_offset: register offset value
@@ -740,7 +780,27 @@ static inline u8 idpf_get_min_tx_pkt_len(struct idpf_adapter *adapter)
 static inline void __iomem *idpf_get_reg_addr(struct idpf_adapter *adapter,
 					      resource_size_t reg_offset)
 {
-	return (void __iomem *)(adapter->hw.hw_addr + reg_offset);
+	struct idpf_hw *hw = &adapter->hw;
+	int i;
+
+	for (i = 0; i < hw->num_lan_regs; i++) {
+		struct idpf_mmio_reg *region = &hw->lan_regs[i];
+
+		if (reg_offset >= region->addr_start &&
+		    reg_offset < (region->addr_start + region->addr_len)) {
+			reg_offset -= region->addr_start;
+
+			return (u8 __iomem *)(region->addr + reg_offset);
+		}
+	}
+
+	/* It's impossible to hit this case with offsets from the CP. But if we
+	 * do for any other reason, the kernel will panic on that register
+	 * access. Might as well do it here to make it clear what's happening.
+	 */
+	BUG();
+
+	return NULL;
 }
 
 /**
@@ -754,7 +814,7 @@ static inline bool idpf_is_reset_detected(struct idpf_adapter *adapter)
 	if (!adapter->hw.arq)
 		return true;
 
-	return !(readl(idpf_get_reg_addr(adapter, adapter->hw.arq->reg.len)) &
+	return !(readl(idpf_get_mbx_reg_addr(adapter, adapter->hw.arq->reg.len)) &
 		 adapter->hw.arq->reg.len_mask);
 }
 
diff --git a/drivers/net/ethernet/intel/idpf/idpf_controlq.h b/drivers/net/ethernet/intel/idpf/idpf_controlq.h
index c1aba09..96a2be2 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_controlq.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_controlq.h
@@ -94,12 +94,21 @@ struct idpf_mbxq_desc {
 	u32 pf_vf_id;		/* used by CP when sending to PF */
 };
 
+struct idpf_mmio_reg {
+	void __iomem *addr;
+	resource_size_t addr_start;
+	resource_size_t addr_len;
+};
+
 /* Define the driver hardware struct to replace other control structs as needed
  * Align to ctlq_hw_info
  */
 struct idpf_hw {
-	void __iomem *hw_addr;
-	resource_size_t hw_addr_len;
+	struct idpf_mmio_reg mbx;
+	struct idpf_mmio_reg rstat;
+	/* Array of remaining LAN BAR regions */
+	int num_lan_regs;
+	struct idpf_mmio_reg *lan_regs;
 
 	struct idpf_adapter *back;
 
diff --git a/drivers/net/ethernet/intel/idpf/idpf_dev.c b/drivers/net/ethernet/intel/idpf/idpf_dev.c
index f4c5691..c364beb 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_dev.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_dev.c
@@ -9,9 +9,11 @@
 
 /**
  * idpf_ctlq_reg_init - initialize default mailbox registers
+ * @adapter: adapter structure
  * @cq: pointer to the array of create control queues
  */
-static void idpf_ctlq_reg_init(struct idpf_ctlq_create_info *cq)
+static void idpf_ctlq_reg_init(struct idpf_adapter *adapter,
+			       struct idpf_ctlq_create_info *cq)
 {
 	int i;
 
@@ -21,22 +23,22 @@ static void idpf_ctlq_reg_init(struct idpf_ctlq_create_info *cq)
 		switch (ccq->type) {
 		case IDPF_CTLQ_TYPE_MAILBOX_TX:
 			/* set head and tail registers in our local struct */
-			ccq->reg.head = PF_FW_ATQH;
-			ccq->reg.tail = PF_FW_ATQT;
-			ccq->reg.len = PF_FW_ATQLEN;
-			ccq->reg.bah = PF_FW_ATQBAH;
-			ccq->reg.bal = PF_FW_ATQBAL;
+			ccq->reg.head = PF_FW_ATQH - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.tail = PF_FW_ATQT - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.len = PF_FW_ATQLEN - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.bah = PF_FW_ATQBAH - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.bal = PF_FW_ATQBAL - adapter->dev_ops.mbx_reg_start;
 			ccq->reg.len_mask = PF_FW_ATQLEN_ATQLEN_M;
 			ccq->reg.len_ena_mask = PF_FW_ATQLEN_ATQENABLE_M;
 			ccq->reg.head_mask = PF_FW_ATQH_ATQH_M;
 			break;
 		case IDPF_CTLQ_TYPE_MAILBOX_RX:
 			/* set head and tail registers in our local struct */
-			ccq->reg.head = PF_FW_ARQH;
-			ccq->reg.tail = PF_FW_ARQT;
-			ccq->reg.len = PF_FW_ARQLEN;
-			ccq->reg.bah = PF_FW_ARQBAH;
-			ccq->reg.bal = PF_FW_ARQBAL;
+			ccq->reg.head = PF_FW_ARQH - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.tail = PF_FW_ARQT - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.len = PF_FW_ARQLEN - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.bah = PF_FW_ARQBAH - adapter->dev_ops.mbx_reg_start;
+			ccq->reg.bal = PF_FW_ARQBAL - adapter->dev_ops.mbx_reg_start;
 			ccq->reg.len_mask = PF_FW_ARQLEN_ARQLEN_M;
 			ccq->reg.len_ena_mask = PF_FW_ARQLEN_ARQENABLE_M;
 			ccq->reg.head_mask = PF_FW_ARQH_ARQH_M;
@@ -124,7 +126,7 @@ static int idpf_intr_reg_init(struct idpf_vport *vport)
  */
 static void idpf_reset_reg_init(struct idpf_adapter *adapter)
 {
-	adapter->reset_reg.rstat = idpf_get_reg_addr(adapter, PFGEN_RSTAT);
+	adapter->reset_reg.rstat = idpf_get_rstat_reg_addr(adapter, PFGEN_RSTAT);
 	adapter->reset_reg.rstat_m = PFGEN_RSTAT_PFR_STATE_M;
 }
 
@@ -138,9 +140,9 @@ static void idpf_trigger_reset(struct idpf_adapter *adapter,
 {
 	u32 reset_reg;
 
-	reset_reg = readl(idpf_get_reg_addr(adapter, PFGEN_CTRL));
+	reset_reg = readl(idpf_get_rstat_reg_addr(adapter, PFGEN_CTRL));
 	writel(reset_reg | PFGEN_CTRL_PFSWR,
-	       idpf_get_reg_addr(adapter, PFGEN_CTRL));
+	       idpf_get_rstat_reg_addr(adapter, PFGEN_CTRL));
 }
 
 /**
@@ -174,4 +176,9 @@ void idpf_dev_ops_init(struct idpf_adapter *adapter)
 	idpf_reg_ops_init(adapter);
 
 	adapter->dev_ops.idc_init = idpf_idc_register;
+
+	adapter->dev_ops.mbx_reg_start = PF_FW_BASE;
+	adapter->dev_ops.mbx_reg_sz = IDPF_PF_MBX_REGION_SZ;
+	adapter->dev_ops.rstat_reg_start = PFGEN_RTRIG;
+	adapter->dev_ops.rstat_reg_sz = IDPF_PF_RSTAT_REGION_SZ;
 }
diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
index b87a72b..c18b8c7 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
@@ -394,7 +394,7 @@ int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter,
 			       enum idc_function_type ftype)
 {
 	struct idc_rdma_core_dev_info *cdev_info;
-	int err;
+	int err, i;
 
 	adapter->cdev_info = (struct idc_rdma_core_dev_info *)
 		kzalloc(sizeof(struct idc_rdma_core_dev_info), GFP_KERNEL);
@@ -407,14 +407,36 @@ int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter,
 	cdev_info->rdma_protocol = IDC_RDMA_PROTOCOL_ROCEV2;
 	cdev_info->ftype = ftype;
 
+	cdev_info->mapped_mem_regions =
+		kcalloc(adapter->hw.num_lan_regs,
+			sizeof(struct idc_rdma_lan_mapped_mem_region),
+			GFP_KERNEL);
+	if (!cdev_info->mapped_mem_regions) {
+		err = -ENOMEM;
+		goto err_plug_aux_dev;
+	}
+
+	cdev_info->num_memory_regions = cpu_to_le16(adapter->hw.num_lan_regs);
+	for (i = 0; i < adapter->hw.num_lan_regs; i++) {
+		cdev_info->mapped_mem_regions[i].region_addr =
+			adapter->hw.lan_regs[i].addr;
+		cdev_info->mapped_mem_regions[i].size =
+			cpu_to_le64(adapter->hw.lan_regs[i].addr_len);
+		cdev_info->mapped_mem_regions[i].start_offset =
+			cpu_to_le64(adapter->hw.lan_regs[i].addr_start);
+	}
+
 	idpf_idc_init_msix_data(adapter);
 
 	err = idpf_plug_core_aux_dev(cdev_info);
 	if (err)
-		goto err_plug_aux_dev;
+		goto err_free_mem_regions;
 
 	return 0;
 
+err_free_mem_regions:
+	kfree(cdev_info->mapped_mem_regions);
+	cdev_info->mapped_mem_regions = NULL;
 err_plug_aux_dev:
 	kfree(cdev_info);
 	adapter->cdev_info = NULL;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
index f784eea..efd4342 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_main.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -76,6 +76,11 @@ static void idpf_remove(struct pci_dev *pdev)
 	mutex_destroy(&adapter->queue_lock);
 	mutex_destroy(&adapter->vc_buf_lock);
 
+	iounmap(adapter->hw.mbx.addr);
+	iounmap(adapter->hw.rstat.addr);
+	for (i = 0; i < adapter->hw.num_lan_regs; i++)
+		iounmap(adapter->hw.lan_regs[i].addr);
+
 	pci_set_drvdata(pdev, NULL);
 	kfree(adapter);
 }
@@ -102,13 +107,34 @@ static int idpf_cfg_hw(struct idpf_adapter *adapter)
 {
 	struct pci_dev *pdev = adapter->pdev;
 	struct idpf_hw *hw = &adapter->hw;
+	resource_size_t res_start;
+	long len;
+
+	/* Map mailbox space for virtchnl communication */
+	res_start = pci_resource_start(pdev, 0) +
+			adapter->dev_ops.mbx_reg_start;
+	len = adapter->dev_ops.mbx_reg_sz;
+	hw->mbx.addr = ioremap(res_start, len);
+	if (!hw->mbx.addr) {
+		pci_err(pdev, "failed to allocate bar0 mbx region\n");
+
+		return -ENOMEM;
+	}
+	hw->mbx.addr_start = adapter->dev_ops.mbx_reg_start;
+	hw->mbx.addr_len = len;
 
-	hw->hw_addr = pcim_iomap_table(pdev)[0];
-	if (!hw->hw_addr) {
-		pci_err(pdev, "failed to allocate PCI iomap table\n");
+	/* Map rstat space for resets */
+	res_start = pci_resource_start(pdev, 0) +
+			adapter->dev_ops.rstat_reg_start;
+	len = adapter->dev_ops.rstat_reg_sz;
+	hw->rstat.addr = ioremap(res_start, len);
+	if (!hw->rstat.addr) {
+		pci_err(pdev, "failed to allocate bar0 rstat region\n");
 
 		return -ENOMEM;
 	}
+	hw->rstat.addr_start = adapter->dev_ops.rstat_reg_start;
+	hw->rstat.addr_len = len;
 
 	hw->back = adapter;
 
@@ -155,9 +181,9 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		goto err_free;
 
-	err = pcim_iomap_regions(pdev, BIT(0), pci_name(pdev));
+	err = pci_request_mem_regions(pdev, pci_name(pdev));
 	if (err) {
-		pci_err(pdev, "pcim_iomap_regions failed %pe\n", ERR_PTR(err));
+		pci_err(pdev, "pci_request_mem_regions failed
%pe\n", ERR_PTR(err)); goto err_free; } diff --git a/drivers/net/ethernet/intel/idpf/idpf_mem.h b/drivers/net/ethernet/intel/idpf/idpf_mem.h index b21a04f..d7cc938 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_mem.h +++ b/drivers/net/ethernet/intel/idpf/idpf_mem.h @@ -12,9 +12,9 @@ struct idpf_dma_mem { size_t size; }; -#define wr32(a, reg, value) writel((value), ((a)->hw_addr + (reg))) -#define rd32(a, reg) readl((a)->hw_addr + (reg)) -#define wr64(a, reg, value) writeq((value), ((a)->hw_addr + (reg))) -#define rd64(a, reg) readq((a)->hw_addr + (reg)) +#define wr32(a, reg, value) writel((value), ((a)->mbx.addr + (reg))) +#define rd32(a, reg) readl((a)->mbx.addr + (reg)) +#define wr64(a, reg, value) writeq((value), ((a)->mbx.addr + (reg))) +#define rd64(a, reg) readq((a)->mbx.addr + (reg)) #endif /* _IDPF_MEM_H_ */ diff --git a/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c b/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c index db6a595..5ad66b6 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c +++ b/drivers/net/ethernet/intel/idpf/idpf_vf_dev.c @@ -9,9 +9,11 @@ /** * idpf_vf_ctlq_reg_init - initialize default mailbox registers + * @adapter: adapter structure * @cq: pointer to the array of create control queues */ -static void idpf_vf_ctlq_reg_init(struct idpf_ctlq_create_info *cq) +static void idpf_vf_ctlq_reg_init(struct idpf_adapter *adapter, + struct idpf_ctlq_create_info *cq) { int i; @@ -21,22 +23,22 @@ static void idpf_vf_ctlq_reg_init(struct idpf_ctlq_create_info *cq) switch (ccq->type) { case IDPF_CTLQ_TYPE_MAILBOX_TX: /* set head and tail registers in our local struct */ - ccq->reg.head = VF_ATQH; - ccq->reg.tail = VF_ATQT; - ccq->reg.len = VF_ATQLEN; - ccq->reg.bah = VF_ATQBAH; - ccq->reg.bal = VF_ATQBAL; + ccq->reg.head = VF_ATQH - adapter->dev_ops.mbx_reg_start; + ccq->reg.tail = VF_ATQT - adapter->dev_ops.mbx_reg_start; + ccq->reg.len = VF_ATQLEN - adapter->dev_ops.mbx_reg_start; + ccq->reg.bah = VF_ATQBAH - adapter->dev_ops.mbx_reg_start; + 
ccq->reg.bal = VF_ATQBAL - adapter->dev_ops.mbx_reg_start; ccq->reg.len_mask = VF_ATQLEN_ATQLEN_M; ccq->reg.len_ena_mask = VF_ATQLEN_ATQENABLE_M; ccq->reg.head_mask = VF_ATQH_ATQH_M; break; case IDPF_CTLQ_TYPE_MAILBOX_RX: /* set head and tail registers in our local struct */ - ccq->reg.head = VF_ARQH; - ccq->reg.tail = VF_ARQT; - ccq->reg.len = VF_ARQLEN; - ccq->reg.bah = VF_ARQBAH; - ccq->reg.bal = VF_ARQBAL; + ccq->reg.head = VF_ARQH - adapter->dev_ops.mbx_reg_start; + ccq->reg.tail = VF_ARQT - adapter->dev_ops.mbx_reg_start; + ccq->reg.len = VF_ARQLEN - adapter->dev_ops.mbx_reg_start; + ccq->reg.bah = VF_ARQBAH - adapter->dev_ops.mbx_reg_start; + ccq->reg.bal = VF_ARQBAL - adapter->dev_ops.mbx_reg_start; ccq->reg.len_mask = VF_ARQLEN_ARQLEN_M; ccq->reg.len_ena_mask = VF_ARQLEN_ARQENABLE_M; ccq->reg.head_mask = VF_ARQH_ARQH_M; @@ -123,7 +125,7 @@ static int idpf_vf_intr_reg_init(struct idpf_vport *vport) */ static void idpf_vf_reset_reg_init(struct idpf_adapter *adapter) { - adapter->reset_reg.rstat = idpf_get_reg_addr(adapter, VFGEN_RSTAT); + adapter->reset_reg.rstat = idpf_get_rstat_reg_addr(adapter, VFGEN_RSTAT); adapter->reset_reg.rstat_m = VFGEN_RSTAT_VFR_STATE_M; } @@ -172,4 +174,9 @@ void idpf_vf_dev_ops_init(struct idpf_adapter *adapter) idpf_vf_reg_ops_init(adapter); adapter->dev_ops.idc_init = idpf_idc_vf_register; + + adapter->dev_ops.mbx_reg_start = VF_BASE; + adapter->dev_ops.mbx_reg_sz = IDPF_VF_MBX_REGION_SZ; + adapter->dev_ops.rstat_reg_start = VFGEN_RSTAT; + adapter->dev_ops.rstat_reg_sz = IDPF_VF_RSTAT_REGION_SZ; } diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c index e9a71d3..dd3473c 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c @@ -892,6 +892,7 @@ static int idpf_send_get_caps_msg(struct idpf_adapter *adapter) caps.other_caps = cpu_to_le64(VIRTCHNL2_CAP_SRIOV | VIRTCHNL2_CAP_RDMA | + 
VIRTCHNL2_CAP_LAN_MEMORY_REGIONS | VIRTCHNL2_CAP_MACFILTER | VIRTCHNL2_CAP_SPLITQ_QSCHED | VIRTCHNL2_CAP_PROMISC | @@ -914,6 +915,133 @@ static int idpf_send_get_caps_msg(struct idpf_adapter *adapter) } /** + * idpf_send_get_lan_memory_regions - Send virtchnl get LAN memory regions msg + * @adapter: Driver specific private struct + */ +static int idpf_send_get_lan_memory_regions(struct idpf_adapter *adapter) +{ + struct virtchnl2_get_lan_memory_regions *rcvd_regions; + struct idpf_vc_xn_params xn_params = {}; + int num_regions, size, i; + struct idpf_hw *hw; + ssize_t reply_sz; + int err = 0; + + rcvd_regions = kzalloc(IDPF_CTLQ_MAX_BUF_LEN, GFP_KERNEL); + if (!rcvd_regions) + return -ENOMEM; + + xn_params.vc_op = VIRTCHNL2_OP_GET_LAN_MEMORY_REGIONS; + xn_params.recv_buf.iov_base = rcvd_regions; + xn_params.recv_buf.iov_len = IDPF_CTLQ_MAX_BUF_LEN; + xn_params.timeout_ms = IDPF_VC_XN_DEFAULT_TIMEOUT_MSEC; + reply_sz = idpf_vc_xn_exec(adapter, &xn_params); + if (reply_sz < 0) { + err = reply_sz; + goto send_get_lan_regions_out; + } + + num_regions = le16_to_cpu(rcvd_regions->num_memory_regions); + size = struct_size(rcvd_regions, mem_reg, num_regions); + if (reply_sz < size) { + err = -EIO; + goto send_get_lan_regions_out; + } + + if (size > IDPF_CTLQ_MAX_BUF_LEN) { + err = -EINVAL; + goto send_get_lan_regions_out; + } + + hw = &adapter->hw; + hw->lan_regs = + kcalloc(num_regions, sizeof(struct idpf_mmio_reg), GFP_ATOMIC); + if (!hw->lan_regs) { + err = -ENOMEM; + goto send_get_lan_regions_out; + } + + for (i = 0; i < num_regions; i++) { + hw->lan_regs[i].addr_len = + le64_to_cpu(rcvd_regions->mem_reg[i].size); + hw->lan_regs[i].addr_start = + le64_to_cpu(rcvd_regions->mem_reg[i].start_offset); + } + hw->num_lan_regs = num_regions; + +send_get_lan_regions_out: + kfree(rcvd_regions); + + return err; +} + +/** + * idpf_calc_remaining_mmio_regs - calculate MMIO regions outside mbx and rstat + * @adapter: Driver specific private structure + * + * Called when
idpf_send_get_lan_memory_regions fails or is not supported. This + * will calculate the offsets and sizes for the regions before, in between, and + * after the mailbox and rstat MMIO mappings. + */ +static int idpf_calc_remaining_mmio_regs(struct idpf_adapter *adapter) +{ + struct idpf_hw *hw = &adapter->hw; + + hw->num_lan_regs = 3; + hw->lan_regs = kcalloc(hw->num_lan_regs, + sizeof(struct idpf_mmio_reg), + GFP_ATOMIC); + if (!hw->lan_regs) + return -ENOMEM; + + /* Region preceding mailbox */ + hw->lan_regs[0].addr_start = 0; + hw->lan_regs[0].addr_len = adapter->dev_ops.mbx_reg_start; + /* Region between mailbox and rstat */ + hw->lan_regs[1].addr_start = adapter->dev_ops.mbx_reg_start + + adapter->dev_ops.mbx_reg_sz; + hw->lan_regs[1].addr_len = adapter->dev_ops.rstat_reg_start - + hw->lan_regs[1].addr_start; + /* Region after rstat */ + hw->lan_regs[2].addr_start = adapter->dev_ops.rstat_reg_start + + adapter->dev_ops.rstat_reg_sz; + hw->lan_regs[2].addr_len = pci_resource_len(adapter->pdev, 0) - + hw->lan_regs[2].addr_start; + + return 0; +} + +/** + * idpf_map_lan_mmio_regs - map remaining LAN BAR regions + * @adapter: Driver specific private structure + */ +static int idpf_map_lan_mmio_regs(struct idpf_adapter *adapter) +{ + struct pci_dev *pdev = adapter->pdev; + struct idpf_hw *hw = &adapter->hw; + int i; + + for (i = 0; i < hw->num_lan_regs; i++) { + resource_size_t res_start; + long len; + + len = hw->lan_regs[i].addr_len; + if (!len) + continue; + res_start = hw->lan_regs[i].addr_start; + res_start += pci_resource_start(pdev, 0); + + hw->lan_regs[i].addr = ioremap(res_start, len); + if (!hw->lan_regs[i].addr) { + pci_err(pdev, "failed to allocate bar0 region\n"); + return -ENOMEM; + } + } + + return 0; +} + +/** * idpf_vport_alloc_max_qs - Allocate max queues for a vport * @adapter: Driver specific private structure * @max_q: vport max queue structure @@ -2778,7 +2906,7 @@ int idpf_init_dflt_mbx(struct idpf_adapter *adapter) struct idpf_hw *hw = 
&adapter->hw; int err; - adapter->dev_ops.reg_ops.ctlq_reg_init(ctlq_info); + adapter->dev_ops.reg_ops.ctlq_reg_init(adapter, ctlq_info); err = idpf_ctlq_init(hw, IDPF_NUM_DFLT_MBX_Q, ctlq_info); if (err) @@ -2938,6 +3066,30 @@ int idpf_vc_core_init(struct idpf_adapter *adapter) msleep(task_delay); } + if (idpf_is_cap_ena(adapter, IDPF_OTHER_CAPS, VIRTCHNL2_CAP_LAN_MEMORY_REGIONS)) { + err = idpf_send_get_lan_memory_regions(adapter); + if (err) { + dev_err(&adapter->pdev->dev, "Failed to get LAN memory regions: %d\n", + err); + return -EINVAL; + } + } else { + /* Fallback to mapping the remaining regions of the entire BAR */ + err = idpf_calc_remaining_mmio_regs(adapter); + if (err) { + dev_err(&adapter->pdev->dev, "Failed to allocate bar0 region(s): %d\n", + err); + return -ENOMEM; + } + } + + err = idpf_map_lan_mmio_regs(adapter); + if (err) { + dev_err(&adapter->pdev->dev, "Failed to map bar0 region(s): %d\n", + err); + return -ENOMEM; + } + pci_sriov_set_totalvfs(adapter->pdev, idpf_get_max_vfs(adapter)); num_max_vports = idpf_get_max_vports(adapter); adapter->max_vports = num_max_vports; diff --git a/drivers/net/ethernet/intel/idpf/virtchnl2.h b/drivers/net/ethernet/intel/idpf/virtchnl2.h index e6541152..92ab03c 100644 --- a/drivers/net/ethernet/intel/idpf/virtchnl2.h +++ b/drivers/net/ethernet/intel/idpf/virtchnl2.h @@ -69,6 +69,8 @@ enum virtchnl2_op { VIRTCHNL2_OP_ADD_MAC_ADDR = 535, VIRTCHNL2_OP_DEL_MAC_ADDR = 536, VIRTCHNL2_OP_CONFIG_PROMISCUOUS_MODE = 537, + /* Opcodes 538 through 548 are reserved. 
*/ + VIRTCHNL2_OP_GET_LAN_MEMORY_REGIONS = 549, }; /** @@ -202,7 +204,8 @@ enum virtchnl2_cap_other { VIRTCHNL2_CAP_RX_FLEX_DESC = BIT_ULL(17), VIRTCHNL2_CAP_PTYPE = BIT_ULL(18), VIRTCHNL2_CAP_LOOPBACK = BIT_ULL(19), - /* Other capability 20 is reserved */ + /* Other capabilities 20-21 are reserved */ + VIRTCHNL2_CAP_LAN_MEMORY_REGIONS = BIT_ULL(22), /* this must be the last capability */ VIRTCHNL2_CAP_OEM = BIT_ULL(63), @@ -1283,4 +1286,29 @@ struct virtchnl2_promisc_info { }; VIRTCHNL2_CHECK_STRUCT_LEN(8, virtchnl2_promisc_info); +/** + * struct virtchnl2_mem_region - MMIO memory region + * @start_offset: starting offset of the MMIO memory region + * @size: size of the MMIO memory region + */ +struct virtchnl2_mem_region { + __le64 start_offset; + __le64 size; +}; +VIRTCHNL2_CHECK_STRUCT_LEN(16, virtchnl2_mem_region); + +/** + * struct virtchnl2_get_lan_memory_regions - List of LAN MMIO memory regions + * @num_memory_regions: number of memory regions + * @mem_reg: List with memory region info + * + * PF/VF sends this message to learn what LAN MMIO memory regions it should map.
+ */ +struct virtchnl2_get_lan_memory_regions { + __le16 num_memory_regions; + u8 pad[6]; + struct virtchnl2_mem_region mem_reg[]; +}; +VIRTCHNL2_CHECK_STRUCT_LEN(8, virtchnl2_get_lan_memory_regions); + #endif /* _VIRTCHNL_2_H_ */
From patchwork Wed Jul 24 23:39:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741449 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Tatyana Nikolova Subject: [RFC PATCH 10/25] RDMA/irdma: Refactor GEN2 auxiliary driver Date: Wed, 24 Jul 2024 18:39:02 -0500 Message-Id: <20240724233917.704-11-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0 From: Mustafa Ismail Move the irdma auxiliary driver and its associated interfaces out of main.c into a standalone GEN2-specific source file, icrdma_if.c, and rename the driver to gen_2. This prepares for adding GEN3 auxiliary drivers: each HW generation will have its own gen-specific interface file. Additionally, move the Address Handle hash table and its associated locks under the rf struct so that GEN3 code can use them easily. Signed-off-by: Mustafa Ismail Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/Makefile | 1 + drivers/infiniband/hw/irdma/i40iw_if.c | 2 + drivers/infiniband/hw/irdma/icrdma_if.c | 265 +++++++++++++++++++++++++++++++ drivers/infiniband/hw/irdma/main.c | 272 +------------------------------- drivers/infiniband/hw/irdma/main.h | 9 +- drivers/infiniband/hw/irdma/verbs.c | 16 +- 6 files changed, 290 insertions(+), 275 deletions(-) create mode 100644 drivers/infiniband/hw/irdma/icrdma_if.c diff --git a/drivers/infiniband/hw/irdma/Makefile b/drivers/infiniband/hw/irdma/Makefile index 48c3854..2522e4c 100644 --- a/drivers/infiniband/hw/irdma/Makefile +++ b/drivers/infiniband/hw/irdma/Makefile @@ -13,6 +13,7 @@ irdma-objs := cm.o \ hw.o \ i40iw_hw.o \ i40iw_if.o \ + icrdma_if.o \ icrdma_hw.o \ main.o \ pble.o \ diff --git a/drivers/infiniband/hw/irdma/i40iw_if.c b/drivers/infiniband/hw/irdma/i40iw_if.c index cc50a70..6fa807e 100644 --- a/drivers/infiniband/hw/irdma/i40iw_if.c +++ b/drivers/infiniband/hw/irdma/i40iw_if.c @@ -75,6 +75,8 @@ static void i40iw_fill_device_info(struct irdma_device *iwdev, struct i40e_info struct irdma_pci_f *rf = iwdev->rf; rf->rdma_ver = IRDMA_GEN_1; + rf->sc_dev.hw = &rf->hw; + rf->sc_dev.hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_1; rf->gen_ops.request_reset = i40iw_request_reset; rf->pcidev = cdev_info->pcidev; rf->pf_id = cdev_info->fid; diff --git a/drivers/infiniband/hw/irdma/icrdma_if.c b/drivers/infiniband/hw/irdma/icrdma_if.c new file mode 100644 index 0000000 ---
/dev/null +++ b/drivers/infiniband/hw/irdma/icrdma_if.c @@ -0,0 +1,265 @@ +// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB +// /* Copyright (c) 2015 - 2024 Intel Corporation */ +#include "main.h" + +static void icrdma_prep_tc_change(struct irdma_device *iwdev) +{ + iwdev->vsi.tc_change_pending = true; + irdma_sc_suspend_resume_qps(&iwdev->vsi, IRDMA_OP_SUSPEND); + + /* Wait for all qp's to suspend */ + wait_event_timeout(iwdev->suspend_wq, + !atomic_read(&iwdev->vsi.qp_suspend_reqs), + msecs_to_jiffies(IRDMA_EVENT_TIMEOUT_MS)); + irdma_ws_reset(&iwdev->vsi); +} + +static void icrdma_idc_event_handler(struct idc_rdma_core_dev_info *cdev_info, + struct idc_rdma_event *event) +{ + struct irdma_device *iwdev = dev_get_drvdata(&cdev_info->adev->dev); + struct irdma_l2params l2params = {}; + + if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_MTU_CHANGE)) { + ibdev_dbg(&iwdev->ibdev, "CLNT: new MTU = %d\n", iwdev->netdev->mtu); + if (iwdev->vsi.mtu != iwdev->netdev->mtu) { + l2params.mtu = iwdev->netdev->mtu; + l2params.mtu_changed = true; + irdma_log_invalid_mtu(l2params.mtu, &iwdev->rf->sc_dev); + irdma_change_l2params(&iwdev->vsi, &l2params); + } + } else if (*event->type & BIT(IDC_RDMA_EVENT_BEFORE_TC_CHANGE)) { + if (iwdev->vsi.tc_change_pending) + return; + + icrdma_prep_tc_change(iwdev); + } else if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_TC_CHANGE)) { + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + + if (!iwdev->vsi.tc_change_pending) + return; + + l2params.tc_changed = true; + ibdev_dbg(&iwdev->ibdev, "CLNT: TC Change\n"); + + irdma_fill_qos_info(&l2params, &idc_priv->qos_info); + if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) + iwdev->dcb_vlan_mode = + l2params.num_tc > 1 && !l2params.dscp_mode; + irdma_change_l2params(&iwdev->vsi, &l2params); + } else if (*event->type & BIT(IDC_RDMA_EVENT_CRIT_ERR)) { + ibdev_warn(&iwdev->ibdev, "ICE OICR event notification: oicr = 0x%08x\n", + event->reg); + if (event->reg & 
IRDMAPFINT_OICR_PE_CRITERR_M) { + u32 pe_criterr; + + pe_criterr = readl(iwdev->rf->sc_dev.hw_regs[IRDMA_GLPE_CRITERR]); +#define IRDMA_Q1_RESOURCE_ERR 0x0001024d + if (pe_criterr != IRDMA_Q1_RESOURCE_ERR) { + ibdev_err(&iwdev->ibdev, "critical PE Error, GLPE_CRITERR=0x%08x\n", + pe_criterr); + iwdev->rf->reset = true; + } else { + ibdev_warn(&iwdev->ibdev, "Q1 Resource Check\n"); + } + } + if (event->reg & IRDMAPFINT_OICR_HMC_ERR_M) { + ibdev_err(&iwdev->ibdev, "HMC Error\n"); + iwdev->rf->reset = true; + } + if (event->reg & IRDMAPFINT_OICR_PE_PUSH_M) { + ibdev_err(&iwdev->ibdev, "PE Push Error\n"); + iwdev->rf->reset = true; + } + if (iwdev->rf->reset) + iwdev->rf->gen_ops.request_reset(iwdev->rf); + } +} + +/** + * icrdma_lan_register_qset - Register qset with LAN driver + * @vsi: vsi structure + * @tc_node: Traffic class node + */ +static int icrdma_lan_register_qset(struct irdma_sc_vsi *vsi, + struct irdma_ws_node *tc_node) +{ + struct irdma_device *iwdev = vsi->back_vsi; + struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + struct iidc_rdma_qset_params qset = {}; + int ret; + + qset.qs_handle = tc_node->qs_handle; + qset.tc = tc_node->traffic_class; + qset.vport_id = vsi->vsi_idx; + ret = idc_priv->priv_ops->alloc_res(cdev_info, &qset); + if (ret) { + ibdev_dbg(&iwdev->ibdev, "WS: LAN alloc_res for rdma qset failed.\n"); + return ret; + } + + tc_node->l2_sched_node_id = qset.teid; + vsi->qos[tc_node->user_pri].l2_sched_node_id = qset.teid; + + return 0; +} + +/** + * icrdma_lan_unregister_qset - Unregister qset with LAN driver + * @vsi: vsi structure + * @tc_node: Traffic class node + */ +static void icrdma_lan_unregister_qset(struct irdma_sc_vsi *vsi, + struct irdma_ws_node *tc_node) +{ + struct irdma_device *iwdev = vsi->back_vsi; + struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + struct 
iidc_rdma_qset_params qset = {}; + + qset.qs_handle = tc_node->qs_handle; + qset.tc = tc_node->traffic_class; + qset.vport_id = vsi->vsi_idx; + qset.teid = tc_node->l2_sched_node_id; + + if (idc_priv->priv_ops->free_res(cdev_info, &qset)) + ibdev_dbg(&iwdev->ibdev, "WS: LAN free_res for rdma qset failed.\n"); +} + +static void icrdma_fill_device_info(struct irdma_device *iwdev, + struct idc_rdma_core_dev_info *cdev_info) +{ + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + struct irdma_pci_f *rf = iwdev->rf; + + rf->sc_dev.hw = &rf->hw; + rf->iwdev = iwdev; + rf->cdev = cdev_info; + rf->hw.hw_addr = idc_priv->hw_addr; + rf->pcidev = cdev_info->pdev; + rf->hw.device = &rf->pcidev->dev; + rf->msix_count = cdev_info->msix_count; + rf->pf_id = idc_priv->pf_id; + rf->msix_entries = cdev_info->msix_entries; + rf->rdma_ver = IRDMA_GEN_2; + rf->sc_dev.hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_2; + + rf->gen_ops.register_qset = icrdma_lan_register_qset; + rf->gen_ops.unregister_qset = icrdma_lan_unregister_qset; + + rf->default_vsi.vsi_idx = idc_priv->vport_id; + rf->protocol_used = + cdev_info->rdma_protocol == IDC_RDMA_PROTOCOL_ROCEV2 ? 
+ IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY; + rf->rsrc_profile = IRDMA_HMC_PROFILE_DEFAULT; + rf->rst_to = IRDMA_RST_TIMEOUT_HZ; + rf->gen_ops.request_reset = irdma_request_reset; + rf->limits_sel = 7; + mutex_init(&rf->ah_tbl_lock); + + iwdev->netdev = idc_priv->netdev; + iwdev->vsi_num = idc_priv->vport_id; + iwdev->init_state = INITIAL_STATE; + iwdev->roce_cwnd = IRDMA_ROCE_CWND_DEFAULT; + iwdev->roce_ackcreds = IRDMA_ROCE_ACKCREDS_DEFAULT; + iwdev->rcv_wnd = IRDMA_CM_DEFAULT_RCV_WND_SCALED; + iwdev->rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE; + if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) + iwdev->roce_mode = true; +} + +static int icrdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id) +{ + struct idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + struct irdma_device *iwdev; + struct irdma_pci_f *rf; + struct irdma_l2params l2params = {}; + int err; + + iwdev = ib_alloc_device(irdma_device, ibdev); + if (!iwdev) + return -ENOMEM; + iwdev->rf = kzalloc(sizeof(*rf), GFP_KERNEL); + if (!iwdev->rf) { + ib_dealloc_device(&iwdev->ibdev); + return -ENOMEM; + } + + icrdma_fill_device_info(iwdev, cdev_info); + rf = iwdev->rf; + + err = irdma_ctrl_init_hw(rf); + if (err) + goto err_ctrl_init; + + l2params.mtu = iwdev->netdev->mtu; + irdma_fill_qos_info(&l2params, &idc_priv->qos_info); + if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) + iwdev->dcb_vlan_mode = l2params.num_tc > 1 && !l2params.dscp_mode; + + err = irdma_rt_init_hw(iwdev, &l2params); + if (err) + goto err_rt_init; + + err = irdma_ib_register_device(iwdev); + if (err) + goto err_ibreg; + + idc_priv->priv_ops->update_vport_filter(cdev_info, iwdev->vsi_num, + true); + + ibdev_dbg(&iwdev->ibdev, "INIT: Gen[%d] PF[%d] device probe success\n", + rf->rdma_ver, 
PCI_FUNC(rf->pcidev->devfn)); + + auxiliary_set_drvdata(aux_dev, iwdev); + + return 0; + +err_ibreg: + irdma_rt_deinit_hw(iwdev); +err_rt_init: + irdma_ctrl_deinit_hw(rf); +err_ctrl_init: + kfree(iwdev->rf); + ib_dealloc_device(&iwdev->ibdev); + + return err; +} + +static void icrdma_remove(struct auxiliary_device *aux_dev) +{ + struct idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; + struct irdma_device *iwdev = auxiliary_get_drvdata(aux_dev); + u8 rdma_ver = iwdev->rf->rdma_ver; + + idc_priv->priv_ops->update_vport_filter(cdev_info, + iwdev->vsi_num, false); + irdma_ib_unregister_device(iwdev); + pr_debug("INIT: Gen[%d] func[%d] device remove success\n", + rdma_ver, PCI_FUNC(cdev_info->pdev->devfn)); +} + +static const struct auxiliary_device_id icrdma_auxiliary_id_table[] = { + {.name = "ice.iwarp", }, + {.name = "ice.roce", }, + {}, +}; + +MODULE_DEVICE_TABLE(auxiliary, icrdma_auxiliary_id_table); + +struct idc_rdma_core_auxiliary_drv icrdma_core_auxiliary_drv = { + .adrv = { + .name = "gen_2", + .id_table = icrdma_auxiliary_id_table, + .probe = icrdma_probe, + .remove = icrdma_remove, + }, + .event_handler = icrdma_idc_event_handler, +}; diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c index 9b6f1d8..ee59ca1 100644 --- a/drivers/infiniband/hw/irdma/main.c +++ b/drivers/infiniband/hw/irdma/main.c @@ -39,19 +39,7 @@ static void irdma_unregister_notifiers(void) unregister_netdevice_notifier(&irdma_netdevice_notifier); } -static void irdma_prep_tc_change(struct irdma_device *iwdev) -{ - iwdev->vsi.tc_change_pending = true; - irdma_sc_suspend_resume_qps(&iwdev->vsi, IRDMA_OP_SUSPEND); - - /* Wait for all qp's to suspend */ - wait_event_timeout(iwdev->suspend_wq, - !atomic_read(&iwdev->vsi.qp_suspend_reqs), - 
msecs_to_jiffies(IRDMA_EVENT_TIMEOUT_MS)); - irdma_ws_reset(&iwdev->vsi); -} - -static void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev) +void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev) { if (mtu < IRDMA_MIN_MTU_IPV4) ibdev_warn(to_ibdev(dev), "MTU setting [%d] too low for RDMA traffic. Minimum MTU is 576 for IPv4\n", mtu); @@ -59,8 +47,8 @@ static void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev) ibdev_warn(to_ibdev(dev), "MTU setting [%d] too low for RDMA traffic. Minimum MTU is 1280 for IPv6\\n", mtu); } -static void irdma_fill_qos_info(struct irdma_l2params *l2params, - struct iidc_rdma_qos_params *qos_info) +void irdma_fill_qos_info(struct irdma_l2params *l2params, + struct iidc_rdma_qos_params *qos_info) { int i; @@ -84,73 +72,11 @@ static void irdma_fill_qos_info(struct irdma_l2params *l2params, } } -static void irdma_idc_event_handler(struct idc_rdma_core_dev_info *cdev_info, - struct idc_rdma_event *event) -{ - struct irdma_device *iwdev = dev_get_drvdata(&cdev_info->adev->dev); - struct irdma_l2params l2params = {}; - - if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_MTU_CHANGE)) { - ibdev_dbg(&iwdev->ibdev, "CLNT: new MTU = %d\n", iwdev->netdev->mtu); - if (iwdev->vsi.mtu != iwdev->netdev->mtu) { - l2params.mtu = iwdev->netdev->mtu; - l2params.mtu_changed = true; - irdma_log_invalid_mtu(l2params.mtu, &iwdev->rf->sc_dev); - irdma_change_l2params(&iwdev->vsi, &l2params); - } - } else if (*event->type & BIT(IDC_RDMA_EVENT_BEFORE_TC_CHANGE)) { - if (iwdev->vsi.tc_change_pending) - return; - - irdma_prep_tc_change(iwdev); - } else if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_TC_CHANGE)) { - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - - if (!iwdev->vsi.tc_change_pending) - return; - - l2params.tc_changed = true; - ibdev_dbg(&iwdev->ibdev, "CLNT: TC Change\n"); - - irdma_fill_qos_info(&l2params, &idc_priv->qos_info); - if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) - iwdev->dcb_vlan_mode = - 
l2params.num_tc > 1 && !l2params.dscp_mode; - irdma_change_l2params(&iwdev->vsi, &l2params); - } else if (*event->type & BIT(IDC_RDMA_EVENT_CRIT_ERR)) { - ibdev_warn(&iwdev->ibdev, "ICE OICR event notification: oicr = 0x%08x\n", - event->reg); - if (event->reg & IRDMAPFINT_OICR_PE_CRITERR_M) { - u32 pe_criterr; - - pe_criterr = readl(iwdev->rf->sc_dev.hw_regs[IRDMA_GLPE_CRITERR]); -#define IRDMA_Q1_RESOURCE_ERR 0x0001024d - if (pe_criterr != IRDMA_Q1_RESOURCE_ERR) { - ibdev_err(&iwdev->ibdev, "critical PE Error, GLPE_CRITERR=0x%08x\n", - pe_criterr); - iwdev->rf->reset = true; - } else { - ibdev_warn(&iwdev->ibdev, "Q1 Resource Check\n"); - } - } - if (event->reg & IRDMAPFINT_OICR_HMC_ERR_M) { - ibdev_err(&iwdev->ibdev, "HMC Error\n"); - iwdev->rf->reset = true; - } - if (event->reg & IRDMAPFINT_OICR_PE_PUSH_M) { - ibdev_err(&iwdev->ibdev, "PE Push Error\n"); - iwdev->rf->reset = true; - } - if (iwdev->rf->reset) - iwdev->rf->gen_ops.request_reset(iwdev->rf); - } -} - /** * irdma_request_reset - Request a reset * @rf: RDMA PCI function */ -static void irdma_request_reset(struct irdma_pci_f *rf) +void irdma_request_reset(struct irdma_pci_f *rf) { struct idc_rdma_core_dev_info *cdev_info = rf->cdev; @@ -158,190 +84,6 @@ static void irdma_request_reset(struct irdma_pci_f *rf) cdev_info->ops->request_reset(rf->cdev, IDC_FUNC_RESET); } -/** - * irdma_lan_register_qset - Register qset with LAN driver - * @vsi: vsi structure - * @tc_node: Traffic class node - */ -static int irdma_lan_register_qset(struct irdma_sc_vsi *vsi, - struct irdma_ws_node *tc_node) -{ - struct irdma_device *iwdev = vsi->back_vsi; - struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - struct iidc_rdma_qset_params qset = {}; - int ret; - - qset.qs_handle = tc_node->qs_handle; - qset.tc = tc_node->traffic_class; - qset.vport_id = vsi->vsi_idx; - ret = idc_priv->priv_ops->alloc_res(cdev_info, &qset); - if (ret) { - 
ibdev_dbg(&iwdev->ibdev, "WS: LAN alloc_res for rdma qset failed.\n"); - return ret; - } - - tc_node->l2_sched_node_id = qset.teid; - vsi->qos[tc_node->user_pri].l2_sched_node_id = qset.teid; - - return 0; -} - -/** - * irdma_lan_unregister_qset - Unregister qset with LAN driver - * @vsi: vsi structure - * @tc_node: Traffic class node - */ -static void irdma_lan_unregister_qset(struct irdma_sc_vsi *vsi, - struct irdma_ws_node *tc_node) -{ - struct irdma_device *iwdev = vsi->back_vsi; - struct idc_rdma_core_dev_info *cdev_info = iwdev->rf->cdev; - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - struct iidc_rdma_qset_params qset = {}; - - qset.qs_handle = tc_node->qs_handle; - qset.tc = tc_node->traffic_class; - qset.vport_id = vsi->vsi_idx; - qset.teid = tc_node->l2_sched_node_id; - - if (idc_priv->priv_ops->free_res(cdev_info, &qset)) - ibdev_dbg(&iwdev->ibdev, "WS: LAN free_res for rdma qset failed.\n"); -} - -static void irdma_remove(struct auxiliary_device *aux_dev) -{ - struct idc_rdma_core_auxiliary_dev *idc_adev = - container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); - struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - struct irdma_device *iwdev = auxiliary_get_drvdata(aux_dev); - - idc_priv->priv_ops->update_vport_filter(cdev_info, - iwdev->vsi_num, false); - irdma_ib_unregister_device(iwdev); - - pr_debug("INIT: Gen2 PF[%d] device remove success\n", PCI_FUNC(cdev_info->pdev->devfn)); -} - -static void irdma_fill_device_info(struct irdma_device *iwdev, - struct idc_rdma_core_dev_info *cdev_info) -{ - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - struct irdma_pci_f *rf = iwdev->rf; - - rf->sc_dev.hw = &rf->hw; - rf->iwdev = iwdev; - rf->cdev = cdev_info; - rf->hw.hw_addr = idc_priv->hw_addr; - rf->pcidev = cdev_info->pdev; - rf->hw.device = &rf->pcidev->dev; - rf->msix_count = cdev_info->msix_count; - rf->pf_id = idc_priv->pf_id; 
- rf->msix_entries = cdev_info->msix_entries; - - rf->gen_ops.register_qset = irdma_lan_register_qset; - rf->gen_ops.unregister_qset = irdma_lan_unregister_qset; - - rf->default_vsi.vsi_idx = idc_priv->vport_id; - rf->protocol_used = - cdev_info->rdma_protocol == IDC_RDMA_PROTOCOL_ROCEV2 ? - IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY; - rf->rdma_ver = IRDMA_GEN_2; - rf->rsrc_profile = IRDMA_HMC_PROFILE_DEFAULT; - rf->rst_to = IRDMA_RST_TIMEOUT_HZ; - rf->gen_ops.request_reset = irdma_request_reset; - rf->limits_sel = 7; - rf->iwdev = iwdev; - mutex_init(&iwdev->ah_tbl_lock); - - iwdev->netdev = idc_priv->netdev; - iwdev->vsi_num = idc_priv->vport_id; - iwdev->init_state = INITIAL_STATE; - iwdev->roce_cwnd = IRDMA_ROCE_CWND_DEFAULT; - iwdev->roce_ackcreds = IRDMA_ROCE_ACKCREDS_DEFAULT; - iwdev->rcv_wnd = IRDMA_CM_DEFAULT_RCV_WND_SCALED; - iwdev->rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE; - if (rf->protocol_used == IRDMA_ROCE_PROTOCOL_ONLY) - iwdev->roce_mode = true; -} - -static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id) -{ - struct idc_rdma_core_auxiliary_dev *idc_adev = - container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); - struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; - struct iidc_rdma_priv_dev_info *idc_priv = cdev_info->idc_priv; - struct irdma_device *iwdev; - struct irdma_pci_f *rf; - struct irdma_l2params l2params = {}; - int err; - - iwdev = ib_alloc_device(irdma_device, ibdev); - if (!iwdev) - return -ENOMEM; - iwdev->rf = kzalloc(sizeof(*rf), GFP_KERNEL); - if (!iwdev->rf) { - ib_dealloc_device(&iwdev->ibdev); - return -ENOMEM; - } - - irdma_fill_device_info(iwdev, cdev_info); - rf = iwdev->rf; - - err = irdma_ctrl_init_hw(rf); - if (err) - goto err_ctrl_init; - - l2params.mtu = iwdev->netdev->mtu; - irdma_fill_qos_info(&l2params, &idc_priv->qos_info); - if (iwdev->rf->protocol_used != IRDMA_IWARP_PROTOCOL_ONLY) - iwdev->dcb_vlan_mode = l2params.num_tc > 1 && 
!l2params.dscp_mode; - - err = irdma_rt_init_hw(iwdev, &l2params); - if (err) - goto err_rt_init; - - err = irdma_ib_register_device(iwdev); - if (err) - goto err_ibreg; - - idc_priv->priv_ops->update_vport_filter(cdev_info, iwdev->vsi_num, - true); - - ibdev_dbg(&iwdev->ibdev, "INIT: Gen2 PF[%d] device probe success\n", PCI_FUNC(rf->pcidev->devfn)); - auxiliary_set_drvdata(aux_dev, iwdev); - - return 0; - -err_ibreg: - irdma_rt_deinit_hw(iwdev); -err_rt_init: - irdma_ctrl_deinit_hw(rf); -err_ctrl_init: - kfree(iwdev->rf); - ib_dealloc_device(&iwdev->ibdev); - - return err; -} - -static const struct auxiliary_device_id irdma_auxiliary_id_table[] = { - {.name = "ice.iwarp", }, - {.name = "ice.roce", }, - {}, -}; - -MODULE_DEVICE_TABLE(auxiliary, irdma_auxiliary_id_table); - -static struct idc_rdma_core_auxiliary_drv irdma_auxiliary_drv = { - .adrv = { - .id_table = irdma_auxiliary_id_table, - .probe = irdma_probe, - .remove = irdma_remove, - }, - .event_handler = irdma_idc_event_handler, -}; - static int __init irdma_init_module(void) { int ret; @@ -353,10 +95,10 @@ static int __init irdma_init_module(void) return ret; } - ret = auxiliary_driver_register(&irdma_auxiliary_drv.adrv); + ret = auxiliary_driver_register(&icrdma_core_auxiliary_drv.adrv); if (ret) { auxiliary_driver_unregister(&i40iw_auxiliary_drv); - pr_err("Failed irdma auxiliary_driver_register() ret=%d\n", + pr_err("Failed icrdma(gen_2) auxiliary_driver_register() ret=%d\n", ret); return ret; } @@ -369,7 +111,7 @@ static int __init irdma_init_module(void) static void __exit irdma_exit_module(void) { irdma_unregister_notifiers(); - auxiliary_driver_unregister(&irdma_auxiliary_drv.adrv); + auxiliary_driver_unregister(&icrdma_core_auxiliary_drv.adrv); auxiliary_driver_unregister(&i40iw_auxiliary_drv); } diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index e81f375..7360e17 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -55,6 
+55,7 @@ #include "puda.h" extern struct auxiliary_driver i40iw_auxiliary_drv; +extern struct idc_rdma_core_auxiliary_drv icrdma_core_auxiliary_drv; #define IRDMA_FW_VER_DEFAULT 2 #define IRDMA_HW_VER 2 @@ -329,6 +330,8 @@ struct irdma_pci_f { void *back_fcn; struct irdma_gen_ops gen_ops; struct irdma_device *iwdev; + DECLARE_HASHTABLE(ah_hash_tbl, 8); + struct mutex ah_tbl_lock; /* protect AH hash table access */ }; struct irdma_device { @@ -338,8 +341,6 @@ struct irdma_device { struct workqueue_struct *cleanup_wq; struct irdma_sc_vsi vsi; struct irdma_cm_core cm_core; - DECLARE_HASHTABLE(ah_hash_tbl, 8); - struct mutex ah_tbl_lock; /* protect AH hash table access */ u32 roce_cwnd; u32 roce_ackcreds; u32 vendor_id; @@ -555,4 +556,8 @@ int irdma_netdevice_event(struct notifier_block *notifier, unsigned long event, void *ptr); void irdma_add_ip(struct irdma_device *iwdev); void cqp_compl_worker(struct work_struct *work); +void irdma_fill_qos_info(struct irdma_l2params *l2params, + struct iidc_rdma_qos_params *qos_info); +void irdma_request_reset(struct irdma_pci_f *rf); +void irdma_log_invalid_mtu(u16 mtu, struct irdma_sc_dev *dev); #endif /* IRDMA_MAIN_H */ diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 12704ef..89937d4 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -4529,7 +4529,7 @@ static bool irdma_ah_exists(struct irdma_device *iwdev, new_ah->sc_ah.ah_info.dest_ip_addr[2] ^ new_ah->sc_ah.ah_info.dest_ip_addr[3]; - hash_for_each_possible(iwdev->ah_hash_tbl, ah, list, key) { + hash_for_each_possible(iwdev->rf->ah_hash_tbl, ah, list, key) { /* Set ah_valid and ah_id the same so memcmp can work */ new_ah->sc_ah.ah_info.ah_idx = ah->sc_ah.ah_info.ah_idx; new_ah->sc_ah.ah_info.ah_valid = ah->sc_ah.ah_info.ah_valid; @@ -4555,14 +4555,14 @@ static int irdma_destroy_ah(struct ib_ah *ibah, u32 ah_flags) struct irdma_ah *ah = to_iwah(ibah); if ((ah_flags & 
RDMA_DESTROY_AH_SLEEPABLE) && ah->parent_ah) { - mutex_lock(&iwdev->ah_tbl_lock); + mutex_lock(&iwdev->rf->ah_tbl_lock); if (!refcount_dec_and_test(&ah->parent_ah->refcnt)) { - mutex_unlock(&iwdev->ah_tbl_lock); + mutex_unlock(&iwdev->rf->ah_tbl_lock); return 0; } hash_del(&ah->parent_ah->list); kfree(ah->parent_ah); - mutex_unlock(&iwdev->ah_tbl_lock); + mutex_unlock(&iwdev->rf->ah_tbl_lock); } irdma_ah_cqp_op(iwdev->rf, &ah->sc_ah, IRDMA_OP_AH_DESTROY, @@ -4599,11 +4599,11 @@ static int irdma_create_user_ah(struct ib_ah *ibah, err = irdma_setup_ah(ibah, attr); if (err) return err; - mutex_lock(&iwdev->ah_tbl_lock); + mutex_lock(&iwdev->rf->ah_tbl_lock); if (!irdma_ah_exists(iwdev, ah)) { err = irdma_create_hw_ah(iwdev, ah, true); if (err) { - mutex_unlock(&iwdev->ah_tbl_lock); + mutex_unlock(&iwdev->rf->ah_tbl_lock); return err; } /* Add new AH to list */ @@ -4615,11 +4615,11 @@ static int irdma_create_user_ah(struct ib_ah *ibah, parent_ah->sc_ah.ah_info.dest_ip_addr[3]; ah->parent_ah = parent_ah; - hash_add(iwdev->ah_hash_tbl, &parent_ah->list, key); + hash_add(iwdev->rf->ah_hash_tbl, &parent_ah->list, key); refcount_set(&parent_ah->refcnt, 1); } } - mutex_unlock(&iwdev->ah_tbl_lock); + mutex_unlock(&iwdev->rf->ah_tbl_lock); uresp.ah_id = ah->sc_ah.ah_info.ah_idx; err = ib_copy_to_udata(udata, &uresp, min(sizeof(uresp), udata->outlen)); From patchwork Wed Jul 24 23:39:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741451 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 646AA1448D8 for ; Wed, 24 Jul 2024 23:40:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1721864446; cv=none; b=rKRfNic0g5znxXiOwq23YlJutccI3ZdeKddBZhjYkzEIqDHNVQ9VKBr2IMNYfi6UGeWQeS9BQo9WjF/G2ZNfKkIXYvnm0CzmZDkCpTErQY/goKkRyrbIN2OeYaNPDdxQbibxjDekL+4WB3nR/0WKF7EELXxRge+3G6vkXJEf8QU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864446; c=relaxed/simple; bh=q00fszpD2Qyk3oAdePGLjSF2xntq88+vGVo40ifKgi4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=VPjhztp2saDnZdVLFl6NZUYIKD2f7JuBgzvl6vqT+0rmnLAgLF1ewd8r1w/QgZNztWNox3+OxH96KG4HUuHM3T+k30VbvuHmCwmjIC1FvwxYuztMYbU1r8TSKj+TPvRGMzxSG6thoDj8YYf6rxdO9mwwtL1FKQLW+zQqw1vYWjc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Nkf5gHR3; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Nkf5gHR3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864443; x=1753400443; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=q00fszpD2Qyk3oAdePGLjSF2xntq88+vGVo40ifKgi4=; b=Nkf5gHR3vk9d7MNPbY1zMzrEYZ9bpeTG7ZcZ47Q2I17h9FlkcPZ3qbdb jqkClXb2fUQti94ZdiqUBpIJ50z9ytASVgYKpsLPgwNEQsxXPlr8PbM6R 8os7rhEVMJDU5BMQ+GMTFuVwvOqS/uK3kjh8h6688oRq6xUEIKVqiC6wf DX6QvtuSYCNLz/CXhqLf4GCIb1GLMumiXgOZjCvpmS0U8RN5h9cD2GWJ6 OZTxNectUcLtOC/jDZA+rajCYDhbwiPaRY4esPJBwVp7e7wVVzoC2JS5n rYYTHlniqWBu6UacdYscGG6wKVJlhhhrA3RXxrW+9rYIcrer28M6rrq3p A==; X-CSE-ConnectionGUID: OjNA+TZST5Gv6bs5SgHQjw== X-CSE-MsgGUID: WKLrwsFSRueC5W8AlDKDOg== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999766" 
X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999766"
Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:41 -0700
X-CSE-ConnectionGUID: 9KnClcBSRkK2iDwTy+ocFQ==
X-CSE-MsgGUID: tylJq2wQQBuASyk93AY5cA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426043"
Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:40 -0700
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Tatyana Nikolova
Subject: [RFC PATCH 11/25] RDMA/irdma: Add GEN3 core driver support
Date: Wed, 24 Jul 2024 18:39:03 -0500
Message-Id: <20240724233917.704-12-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Mustafa Ismail

Introduce support for the GEN3 auxiliary core driver, which is
responsible for initializing PCI-level RDMA resources. Facilitate
host-driver communication with the device's Control Plane (CP) to
discover capabilities and perform privileged operations through an
RDMA-specific messaging interface built atop the IDPF mailbox and
virtchannel protocol.

Establish the RDMA virtual channel message interface and incorporate
operations to retrieve the hardware version and discover capabilities
from the CP. Additionally, set up the RDMA MMIO regions and initialize
the RF structure.
Signed-off-by: Mustafa Ismail Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/Makefile | 2 + drivers/infiniband/hw/irdma/ctrl.c | 438 +++++++++++++++++++++++++++---- drivers/infiniband/hw/irdma/defs.h | 30 ++- drivers/infiniband/hw/irdma/hmc.c | 18 +- drivers/infiniband/hw/irdma/hmc.h | 19 +- drivers/infiniband/hw/irdma/hw.c | 4 + drivers/infiniband/hw/irdma/i40iw_if.c | 1 + drivers/infiniband/hw/irdma/icrdma_if.c | 2 + drivers/infiniband/hw/irdma/ig3rdma_hw.h | 11 + drivers/infiniband/hw/irdma/ig3rdma_if.c | 171 ++++++++++++ drivers/infiniband/hw/irdma/irdma.h | 3 + drivers/infiniband/hw/irdma/main.c | 55 ++++ drivers/infiniband/hw/irdma/main.h | 4 + drivers/infiniband/hw/irdma/pble.c | 20 +- drivers/infiniband/hw/irdma/type.h | 63 ++++- drivers/infiniband/hw/irdma/user.h | 4 +- drivers/infiniband/hw/irdma/virtchnl.c | 300 +++++++++++++++++++++ drivers/infiniband/hw/irdma/virtchnl.h | 96 +++++++ 18 files changed, 1166 insertions(+), 75 deletions(-) create mode 100644 drivers/infiniband/hw/irdma/ig3rdma_hw.h create mode 100644 drivers/infiniband/hw/irdma/ig3rdma_if.c create mode 100644 drivers/infiniband/hw/irdma/virtchnl.c create mode 100644 drivers/infiniband/hw/irdma/virtchnl.h diff --git a/drivers/infiniband/hw/irdma/Makefile b/drivers/infiniband/hw/irdma/Makefile index 2522e4c..3aa63b9 100644 --- a/drivers/infiniband/hw/irdma/Makefile +++ b/drivers/infiniband/hw/irdma/Makefile @@ -13,6 +13,7 @@ irdma-objs := cm.o \ hw.o \ i40iw_hw.o \ i40iw_if.o \ + ig3rdma_if.o\ icrdma_if.o \ icrdma_hw.o \ main.o \ @@ -23,6 +24,7 @@ irdma-objs := cm.o \ uk.o \ utils.o \ verbs.o \ + virtchnl.o \ ws.o \ CFLAGS_trace.o = -I$(src) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 6aed616..9d7b151 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2794,7 +2794,10 @@ static u64 irdma_sc_decode_fpm_commit(struct irdma_sc_dev *dev, __le64 *buf, obj_info[rsrc_idx].cnt = 
(u32)FLD_RS_64(dev, temp, IRDMA_COMMIT_FPM_CQCNT); break; case IRDMA_HMC_IW_APBVT_ENTRY: - obj_info[rsrc_idx].cnt = 1; + if (dev->hw_attrs.uk_attrs.hw_rev <= IRDMA_GEN_2) + obj_info[rsrc_idx].cnt = 1; + else + obj_info[rsrc_idx].cnt = 0; break; default: obj_info[rsrc_idx].cnt = (u32)temp; @@ -2829,7 +2832,8 @@ static u64 irdma_sc_decode_fpm_commit(struct irdma_sc_dev *dev, __le64 *buf, IRDMA_HMC_IW_QP); irdma_sc_decode_fpm_commit(dev, buf, 8, info, IRDMA_HMC_IW_CQ); - /* skiping RSRVD */ + irdma_sc_decode_fpm_commit(dev, buf, 16, info, + IRDMA_HMC_IW_SRQ); irdma_sc_decode_fpm_commit(dev, buf, 24, info, IRDMA_HMC_IW_HTE); irdma_sc_decode_fpm_commit(dev, buf, 32, info, @@ -2864,15 +2868,17 @@ static u64 irdma_sc_decode_fpm_commit(struct irdma_sc_dev *dev, __le64 *buf, IRDMA_HMC_IW_HDR); irdma_sc_decode_fpm_commit(dev, buf, 152, info, IRDMA_HMC_IW_MD); - irdma_sc_decode_fpm_commit(dev, buf, 160, info, - IRDMA_HMC_IW_OOISC); - irdma_sc_decode_fpm_commit(dev, buf, 168, info, - IRDMA_HMC_IW_OOISCFFL); + if (dev->cqp->protocol_used == IRDMA_IWARP_PROTOCOL_ONLY) { + irdma_sc_decode_fpm_commit(dev, buf, 160, info, + IRDMA_HMC_IW_OOISC); + irdma_sc_decode_fpm_commit(dev, buf, 168, info, + IRDMA_HMC_IW_OOISCFFL); + } } /* searching for the last object in HMC to find the size of the HMC area. 
*/ for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) { - if (info[i].base > max_base) { + if (info[i].base > max_base && info[i].cnt) { max_base = info[i].base; last_hmc_obj = i; } @@ -2937,6 +2943,14 @@ static int irdma_sc_parse_fpm_query_buf(struct irdma_sc_dev *dev, __le64 *buf, hmc_info->first_sd_index = (u16)FIELD_GET(IRDMA_QUERY_FPM_FIRST_PE_SD_INDEX, temp); max_pe_sds = (u16)FIELD_GET(IRDMA_QUERY_FPM_MAX_PE_SDS, temp); + /* Reduce SD count for unprivileged functions by 1 to account for PBLE + * backing page rounding + */ + if (dev->hw_attrs.uk_attrs.hw_rev <= IRDMA_GEN_2 && + (hmc_info->hmc_fn_id >= dev->hw_attrs.first_hw_vf_fpm_id || + !dev->privileged)) + max_pe_sds--; + hmc_fpm_misc->max_sds = max_pe_sds; hmc_info->sd_table.sd_cnt = max_pe_sds + hmc_info->first_sd_index; get_64bit_val(buf, 8, &temp); @@ -2949,11 +2963,17 @@ static int irdma_sc_parse_fpm_query_buf(struct irdma_sc_dev *dev, __le64 *buf, size = (u32)(temp >> 32); obj_info[IRDMA_HMC_IW_CQ].size = BIT_ULL(size); + irdma_sc_decode_fpm_query(buf, 24, obj_info, IRDMA_HMC_IW_SRQ); irdma_sc_decode_fpm_query(buf, 32, obj_info, IRDMA_HMC_IW_HTE); irdma_sc_decode_fpm_query(buf, 40, obj_info, IRDMA_HMC_IW_ARP); - obj_info[IRDMA_HMC_IW_APBVT_ENTRY].size = 8192; - obj_info[IRDMA_HMC_IW_APBVT_ENTRY].max_cnt = 1; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + obj_info[IRDMA_HMC_IW_APBVT_ENTRY].size = 0; + obj_info[IRDMA_HMC_IW_APBVT_ENTRY].max_cnt = 0; + } else { + obj_info[IRDMA_HMC_IW_APBVT_ENTRY].size = 8192; + obj_info[IRDMA_HMC_IW_APBVT_ENTRY].max_cnt = 1; + } irdma_sc_decode_fpm_query(buf, 48, obj_info, IRDMA_HMC_IW_MR); irdma_sc_decode_fpm_query(buf, 56, obj_info, IRDMA_HMC_IW_XF); @@ -2962,7 +2982,7 @@ static int irdma_sc_parse_fpm_query_buf(struct irdma_sc_dev *dev, __le64 *buf, obj_info[IRDMA_HMC_IW_XFFL].max_cnt = (u32)temp; obj_info[IRDMA_HMC_IW_XFFL].size = 4; hmc_fpm_misc->xf_block_size = FIELD_GET(IRDMA_QUERY_FPM_XFBLOCKSIZE, temp); - if (!hmc_fpm_misc->xf_block_size) + if
(obj_info[IRDMA_HMC_IW_XF].max_cnt && !hmc_fpm_misc->xf_block_size) return -EINVAL; irdma_sc_decode_fpm_query(buf, 72, obj_info, IRDMA_HMC_IW_Q1); @@ -2998,17 +3018,30 @@ static int irdma_sc_parse_fpm_query_buf(struct irdma_sc_dev *dev, __le64 *buf, obj_info[IRDMA_HMC_IW_RRFFL].max_cnt) return -EINVAL; + if (!obj_info[IRDMA_HMC_IW_XF].max_cnt) + obj_info[IRDMA_HMC_IW_RRF].max_cnt = IRDMA_HMC_MIN_RRF; + irdma_sc_decode_fpm_query(buf, 144, obj_info, IRDMA_HMC_IW_HDR); irdma_sc_decode_fpm_query(buf, 152, obj_info, IRDMA_HMC_IW_MD); - irdma_sc_decode_fpm_query(buf, 160, obj_info, IRDMA_HMC_IW_OOISC); - - get_64bit_val(buf, 168, &temp); - obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt = (u32)temp; - obj_info[IRDMA_HMC_IW_OOISCFFL].size = 4; - hmc_fpm_misc->ooiscf_block_size = FIELD_GET(IRDMA_QUERY_FPM_OOISCFBLOCKSIZE, temp); - if (!hmc_fpm_misc->ooiscf_block_size && - obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt) - return -EINVAL; + + if (dev->cqp->protocol_used == IRDMA_IWARP_PROTOCOL_ONLY) { + irdma_sc_decode_fpm_query(buf, 160, obj_info, IRDMA_HMC_IW_OOISC); + + get_64bit_val(buf, 168, &temp); + obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt = (u32)temp; + obj_info[IRDMA_HMC_IW_OOISCFFL].size = 4; + hmc_fpm_misc->ooiscf_block_size = FIELD_GET(IRDMA_QUERY_FPM_OOISCFBLOCKSIZE, temp); + if (!hmc_fpm_misc->ooiscf_block_size && + obj_info[IRDMA_HMC_IW_OOISCFFL].max_cnt) + return -EINVAL; + } + + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + get_64bit_val(buf, 176, &temp); + hmc_fpm_misc->loc_mem_pages = (u32)FIELD_GET(IRDMA_QUERY_FPM_LOC_MEM_PAGES, temp); + if (!hmc_fpm_misc->loc_mem_pages) + return -EINVAL; + } return 0; } @@ -4336,6 +4369,26 @@ int irdma_sc_init_iw_hmc(struct irdma_sc_dev *dev, u8 hmc_fn_id) } /** + * irdma_set_loc_mem() - set a local memory bit field + * @buf: ptr to a buffer where local memory gets enabled + */ +static void irdma_set_loc_mem(__le64 *buf) +{ + u64 loc_mem_en = BIT_ULL(ENABLE_LOC_MEM); + u32 offset; + u64 temp; + + for (offset = 0; offset < 
IRDMA_COMMIT_FPM_BUF_SIZE; + offset += sizeof(__le64)) { + if (offset == IRDMA_PBLE_COMMIT_OFFSET) + continue; + get_64bit_val(buf, offset, &temp); + if (temp) + set_64bit_val(buf, offset, temp | loc_mem_en); + } +} + +/** * irdma_sc_cfg_iw_fpm() - commits hmc obj cnt values using cqp * command and populates fpm base address in hmc_info * @dev : ptr to irdma_dev struct @@ -4356,7 +4409,7 @@ static int irdma_sc_cfg_iw_fpm(struct irdma_sc_dev *dev, u8 hmc_fn_id) set_64bit_val(buf, 0, (u64)obj_info[IRDMA_HMC_IW_QP].cnt); set_64bit_val(buf, 8, (u64)obj_info[IRDMA_HMC_IW_CQ].cnt); - set_64bit_val(buf, 16, (u64)0); /* RSRVD */ + set_64bit_val(buf, 16, (u64)obj_info[IRDMA_HMC_IW_SRQ].cnt); set_64bit_val(buf, 24, (u64)obj_info[IRDMA_HMC_IW_HTE].cnt); set_64bit_val(buf, 32, (u64)obj_info[IRDMA_HMC_IW_ARP].cnt); set_64bit_val(buf, 40, (u64)0); /* RSVD */ @@ -4383,7 +4436,9 @@ static int irdma_sc_cfg_iw_fpm(struct irdma_sc_dev *dev, u8 hmc_fn_id) (u64)obj_info[IRDMA_HMC_IW_OOISC].cnt); set_64bit_val(buf, 168, (u64)obj_info[IRDMA_HMC_IW_OOISCFFL].cnt); - + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3 && + dev->hmc_fpm_misc.loc_mem_pages) + irdma_set_loc_mem(buf); commit_fpm_mem.pa = dev->fpm_commit_buf_pa; commit_fpm_mem.va = dev->fpm_commit_buf; @@ -4592,6 +4647,7 @@ static bool irdma_cqp_ring_full(struct irdma_sc_cqp *cqp) static u32 irdma_est_sd(struct irdma_sc_dev *dev, struct irdma_hmc_info *hmc_info) { + struct irdma_hmc_obj_info *pble_info; int i; u64 size = 0; u64 sd; @@ -4600,12 +4656,22 @@ static u32 irdma_est_sd(struct irdma_sc_dev *dev, if (i != IRDMA_HMC_IW_PBLE) size += round_up(hmc_info->hmc_obj[i].cnt * hmc_info->hmc_obj[i].size, 512); - size += round_up(hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt * - hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].size, 512); + + pble_info = &hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE]; + if (dev->privileged) + size += round_up(pble_info->cnt * pble_info->size, 512); if (size & 0x1FFFFF) sd = (size >> 21) + 1; /* add 1 for remainder */ else sd 
= size >> 21; + if (!dev->privileged && !dev->hmc_fpm_misc.loc_mem_pages) { + /* 2MB alignment for VF PBLE HMC */ + size = pble_info->cnt * pble_info->size; + if (size & 0x1FFFFF) + sd += (size >> 21) + 1; /* add 1 for remainder */ + else + sd += size >> 21; + } if (sd > 0xFFFFFFFF) { ibdev_dbg(to_ibdev(dev), "HMC: sd overflow[%lld]\n", sd); sd = 0xFFFFFFFF - 1; @@ -4786,22 +4852,287 @@ static void cfg_fpm_value_gen_2(struct irdma_sc_dev *dev, } /** + * irdma_get_rsrc_mem_config - configure resources if local memory or host + * @dev: sc device struct + * @is_mrte_loc_mem: if true, MR's to be in local memory because sd=loc pages + * + * Only mr can be configured host or local memory if qp's are in local memory. + * If qp is in local memory, then all resource object will be in local memory + * except mr which can be either host or local memory. The only exception + * is pble's which are always in host memory. + */ +static void irdma_get_rsrc_mem_config(struct irdma_sc_dev *dev, bool is_mrte_loc_mem) +{ + struct irdma_hmc_info *hmc_info = dev->hmc_info; + int i; + + for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) + hmc_info->hmc_obj[i].mem_loc = IRDMA_LOC_MEM; + + if (dev->feature_info[IRDMA_OBJ_1] && !is_mrte_loc_mem) { + u8 mem_type; + + mem_type = (u8)FIELD_GET(IRDMA_MR_MEM_LOC, dev->feature_info[IRDMA_OBJ_1]); + + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc = + (mem_type & IRDMA_OBJ_LOC_MEM_BIT) ? 
+ IRDMA_LOC_MEM : IRDMA_HOST_MEM; + } else { + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc = IRDMA_LOC_MEM; + } + + hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].mem_loc = IRDMA_HOST_MEM; + + ibdev_dbg(to_ibdev(dev), "HMC: INFO: mrte_mem_loc = %d pble = %d\n", + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc, + hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].mem_loc); +} + +/** + * irdma_cfg_sd_mem - allocate sd memory + * @dev: sc device struct + * @hmc_info: ptr to irdma_hmc_obj_info struct + */ +static int irdma_cfg_sd_mem(struct irdma_sc_dev *dev, + struct irdma_hmc_info *hmc_info) +{ + struct irdma_virt_mem virt_mem; + u32 mem_size; + + mem_size = sizeof(struct irdma_hmc_sd_entry) * hmc_info->sd_table.sd_cnt; + virt_mem.size = mem_size; + virt_mem.va = kzalloc(virt_mem.size, GFP_KERNEL); + if (!virt_mem.va) + return -ENOMEM; + hmc_info->sd_table.sd_entry = virt_mem.va; + + return 0; +} + +/** + * irdma_get_objs_pages - get number of 2M pages needed + * @dev: sc device struct + * @hmc_info: pointer to the HMC configuration information struct + * @mem_loc: pages for local or host memory + */ +static u32 irdma_get_objs_pages(struct irdma_sc_dev *dev, + struct irdma_hmc_info *hmc_info, + enum irdma_hmc_obj_mem mem_loc) +{ + u64 size = 0; + int i; + + for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) { + if (hmc_info->hmc_obj[i].mem_loc == mem_loc) { + size += round_up(hmc_info->hmc_obj[i].cnt * + hmc_info->hmc_obj[i].size, 512); + } + } + + return DIV_ROUND_UP(size, IRDMA_HMC_PAGE_SIZE); +} + +/** + * irdma_set_host_hmc_rsrc_gen_3 - calculate host hmc resources for gen 3 + * @dev: sc device struct + */ +static void irdma_set_host_hmc_rsrc_gen_3(struct irdma_sc_dev *dev) +{ + struct irdma_hmc_fpm_misc *hmc_fpm_misc; + struct irdma_hmc_info *hmc_info; + enum irdma_hmc_obj_mem mrte_loc; + u32 mrwanted, pblewanted; + u32 avail_sds, mr_sds; + + hmc_info = dev->hmc_info; + hmc_fpm_misc = &dev->hmc_fpm_misc; + avail_sds = hmc_fpm_misc->max_sds; + mrte_loc = 
hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc; + mrwanted = hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt; + pblewanted = hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].max_cnt; + + if (mrte_loc == IRDMA_HOST_MEM && avail_sds > IRDMA_MIN_PBLE_PAGES) { + mr_sds = avail_sds - IRDMA_MIN_PBLE_PAGES; + mrwanted = min(mrwanted, mr_sds * MAX_MR_PER_SD); + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt = mrwanted; + avail_sds -= DIV_ROUND_UP(mrwanted, MAX_MR_PER_SD); + } + + pblewanted = min(pblewanted, avail_sds * MAX_PBLE_PER_SD); + hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt = pblewanted; +} + +/** + * irdma_set_loc_hmc_rsrc_gen_3 - calculate hmc resources for gen 3 + * @dev: sc device struct + * @max_pages: max local memory available + * @qpwanted: number of qp's wanted + */ +static int irdma_set_loc_hmc_rsrc_gen_3(struct irdma_sc_dev *dev, + u32 max_pages, + u32 qpwanted) +{ + struct irdma_hmc_fpm_misc *hmc_fpm_misc; + u32 xf_cnt, timer_cnt, pages_needed; + struct irdma_hmc_info *hmc_info; + u32 ird, ord, min_ird; + + hmc_info = dev->hmc_info; + hmc_fpm_misc = &dev->hmc_fpm_misc; + ird = dev->hw_attrs.max_hw_ird; + ord = dev->hw_attrs.max_hw_ord; + min_ird = IRDMA_MIN_IRD; + + hmc_info->hmc_obj[IRDMA_HMC_IW_HDR].cnt = qpwanted; + hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt = qpwanted; + + hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt = + min(hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt, qpwanted * 2); + + hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt = + min(qpwanted * 8, hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt); + + hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].cnt = + min(hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].max_cnt, + IRDMA_RRF_MULTIPLIER * qpwanted); + + if (hmc_info->hmc_obj[IRDMA_HMC_IW_RRFFL].max_cnt) + hmc_info->hmc_obj[IRDMA_HMC_IW_RRFFL].cnt = + hmc_info->hmc_obj[IRDMA_HMC_IW_RRF].cnt / + hmc_fpm_misc->rrf_block_size; + + xf_cnt = IRDMA_XF_MULTIPLIER * qpwanted; + xf_cnt = min(hmc_info->hmc_obj[IRDMA_HMC_IW_XF].max_cnt, xf_cnt); + hmc_info->hmc_obj[IRDMA_HMC_IW_XF].cnt = xf_cnt; + + if (xf_cnt) + 
hmc_info->hmc_obj[IRDMA_HMC_IW_XFFL].cnt = + xf_cnt / hmc_fpm_misc->xf_block_size; + + timer_cnt = (round_up(qpwanted, 512) / 512 + 1) * + hmc_fpm_misc->timer_bucket; + hmc_info->hmc_obj[IRDMA_HMC_IW_TIMER].cnt = + min(timer_cnt, hmc_info->hmc_obj[IRDMA_HMC_IW_TIMER].cnt); + + do { + hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].cnt = ird * 2 * qpwanted; + hmc_info->hmc_obj[IRDMA_HMC_IW_Q1FL].cnt = + hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].cnt / hmc_fpm_misc->q1_block_size; + + pages_needed = irdma_get_objs_pages(dev, hmc_info, IRDMA_LOC_MEM); + if (pages_needed <= max_pages) + break; + + ird /= 2; + ord /= 2; + } while (ird >= IRDMA_MIN_IRD); + + if (ird < IRDMA_MIN_IRD) { + ibdev_dbg(to_ibdev(dev), "HMC: FAIL: IRD=%u Q1 CNT = %u\n", + ird, hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].cnt); + return -EINVAL; + } + + dev->hw_attrs.max_hw_ird = ird; + dev->hw_attrs.max_hw_ord = ord; + hmc_fpm_misc->max_sds -= pages_needed; + + return 0; +} + +/** + * cfg_fpm_value_gen_3 - configure fpm for gen 3 + * @dev: sc device struct + * @hmc_info: ptr to irdma_hmc_obj_info struct + * @hmc_fpm_misc: ptr to fpm data + */ +static int cfg_fpm_value_gen_3(struct irdma_sc_dev *dev, + struct irdma_hmc_info *hmc_info, + struct irdma_hmc_fpm_misc *hmc_fpm_misc) +{ + enum irdma_hmc_obj_mem mrte_loc; + u32 mrwanted, qpwanted; + int i, ret_code = 0; + u32 loc_mem_pages; + bool is_mrte_loc_mem; + + loc_mem_pages = hmc_fpm_misc->loc_mem_pages; + is_mrte_loc_mem = hmc_fpm_misc->loc_mem_pages == hmc_fpm_misc->max_sds ? 
+ true : false; + + irdma_get_rsrc_mem_config(dev, is_mrte_loc_mem); + mrte_loc = hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc; + + if (is_mrte_loc_mem) + loc_mem_pages -= IRDMA_MIN_PBLE_PAGES; + + ibdev_dbg(to_ibdev(dev), + "HMC: mrte_loc %d loc_mem %u fpm max sds %u host_obj %d\n", + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].mem_loc, + hmc_fpm_misc->loc_mem_pages, hmc_fpm_misc->max_sds, + is_mrte_loc_mem); + + mrwanted = hmc_info->hmc_obj[IRDMA_HMC_IW_MR].max_cnt; + qpwanted = hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt; + hmc_info->hmc_obj[IRDMA_HMC_IW_HDR].cnt = qpwanted; + + hmc_info->hmc_obj[IRDMA_HMC_IW_OOISC].max_cnt = 0; + hmc_info->hmc_obj[IRDMA_HMC_IW_OOISCFFL].max_cnt = 0; + hmc_info->hmc_obj[IRDMA_HMC_IW_HTE].max_cnt = 0; + hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].max_cnt = 0; + hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt = + min(hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt, + (u32)IRDMA_FSIAV_CNT_MAX); + for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) + hmc_info->hmc_obj[i].cnt = hmc_info->hmc_obj[i].max_cnt; + + while (qpwanted >= IRDMA_MIN_QP_CNT) { + if (!irdma_set_loc_hmc_rsrc_gen_3(dev, loc_mem_pages, qpwanted)) + break; + + qpwanted /= 2; + if (mrte_loc == IRDMA_LOC_MEM) { + mrwanted = qpwanted * IRDMA_MIN_MR_PER_QP; + hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt = + min(hmc_info->hmc_obj[IRDMA_HMC_IW_MR].max_cnt, mrwanted); + } + } + + if (qpwanted < IRDMA_MIN_QP_CNT) { + ibdev_dbg(to_ibdev(dev), + "HMC: ERROR: could not allocate fpm resources\n"); + return -EINVAL; + } + + irdma_set_host_hmc_rsrc_gen_3(dev); + ret_code = irdma_sc_cfg_iw_fpm(dev, dev->hmc_fn_id); + if (ret_code) { + ibdev_dbg(to_ibdev(dev), + "HMC: cfg_iw_fpm returned error_code[x%08X]\n", + readl(dev->hw_regs[IRDMA_CQPERRCODES])); + + return ret_code; + } + + return irdma_cfg_sd_mem(dev, hmc_info); +} + +/** * irdma_cfg_fpm_val - configure HMC objects * @dev: sc device struct * @qp_count: desired qp count */ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) { - struct 
irdma_virt_mem virt_mem; - u32 i, mem_size; u32 qpwanted, mrwanted, pblewanted; - u32 powerof2, hte; + u32 powerof2, hte, i; u32 sd_needed; u32 sd_diff; u32 loop_count = 0; struct irdma_hmc_info *hmc_info; struct irdma_hmc_fpm_misc *hmc_fpm_misc; int ret_code = 0; + u32 max_sds; hmc_info = dev->hmc_info; hmc_fpm_misc = &dev->hmc_fpm_misc; @@ -4814,14 +5145,16 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) return ret_code; } + max_sds = hmc_fpm_misc->max_sds; + + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + return cfg_fpm_value_gen_3(dev, hmc_info, hmc_fpm_misc); + for (i = IRDMA_HMC_IW_QP; i < IRDMA_HMC_IW_MAX; i++) hmc_info->hmc_obj[i].cnt = hmc_info->hmc_obj[i].max_cnt; sd_needed = irdma_est_sd(dev, hmc_info); - ibdev_dbg(to_ibdev(dev), - "HMC: FW max resources sd_needed[%08d] first_sd_index[%04d]\n", - sd_needed, hmc_info->first_sd_index); - ibdev_dbg(to_ibdev(dev), "HMC: sd count %d where max sd is %d\n", - hmc_info->sd_table.sd_cnt, hmc_fpm_misc->max_sds); + ibdev_dbg(to_ibdev(dev), "HMC: sd count %u where max sd is %u\n", + hmc_info->sd_table.sd_cnt, max_sds); qpwanted = min(qp_count, hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt); @@ -4835,8 +5168,8 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) pblewanted = hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].max_cnt; ibdev_dbg(to_ibdev(dev), - "HMC: req_qp=%d max_sd=%d, max_qp = %d, max_cq=%d, max_mr=%d, max_pble=%d, mc=%d, av=%d\n", - qp_count, hmc_fpm_misc->max_sds, + "HMC: req_qp=%d max_sd=%u, max_qp = %u, max_cq=%u, max_mr=%u, max_pble=%u, mc=%d, av=%u\n", + qp_count, max_sds, hmc_info->hmc_obj[IRDMA_HMC_IW_QP].max_cnt, hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].max_cnt, hmc_info->hmc_obj[IRDMA_HMC_IW_MR].max_cnt, @@ -4849,7 +5182,6 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].max_cnt; hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].cnt = hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].max_cnt; - hmc_info->hmc_obj[IRDMA_HMC_IW_APBVT_ENTRY].cnt = 1; 
while (irdma_q1_cnt(dev, hmc_info, qpwanted) > hmc_info->hmc_obj[IRDMA_HMC_IW_Q1].max_cnt) @@ -4860,7 +5192,7 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt = qpwanted; hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt = min(2 * qpwanted, hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt); - hmc_info->hmc_obj[IRDMA_HMC_IW_RESERVED].cnt = 0; /* Reserved */ + hmc_info->hmc_obj[IRDMA_HMC_IW_SRQ].cnt = 0; /* Reserved */ hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt = mrwanted; hte = round_up(qpwanted + hmc_info->hmc_obj[IRDMA_HMC_IW_FSIMC].cnt, 512); @@ -4898,11 +5230,12 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) if (!(loop_count % 2) && qpwanted > 128) { qpwanted /= 2; } else { - mrwanted /= 2; pblewanted /= 2; + mrwanted /= 2; } continue; } + if (dev->cqp->hmc_profile != IRDMA_HMC_PROFILE_FAVOR_VF && pblewanted > (512 * FPM_MULTIPLIER * sd_diff)) { pblewanted -= 256 * FPM_MULTIPLIER * sd_diff; @@ -4928,14 +5261,13 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) if (sd_needed > hmc_fpm_misc->max_sds) { ibdev_dbg(to_ibdev(dev), - "HMC: cfg_fpm failed loop_cnt=%d, sd_needed=%d, max sd count %d\n", + "HMC: cfg_fpm failed loop_cnt=%u, sd_needed=%u, max sd count %u\n", loop_count, sd_needed, hmc_info->sd_table.sd_cnt); return -EINVAL; } - if (loop_count > 1 && sd_needed < hmc_fpm_misc->max_sds) { - pblewanted += (hmc_fpm_misc->max_sds - sd_needed) * 256 * - FPM_MULTIPLIER; + if (loop_count > 1 && sd_needed < max_sds) { + pblewanted += (max_sds - sd_needed) * 256 * FPM_MULTIPLIER; hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt = pblewanted; sd_needed = irdma_est_sd(dev, hmc_info); } @@ -4959,18 +5291,7 @@ int irdma_cfg_fpm_val(struct irdma_sc_dev *dev, u32 qp_count) return ret_code; } - mem_size = sizeof(struct irdma_hmc_sd_entry) * - (hmc_info->sd_table.sd_cnt + hmc_info->first_sd_index + 1); - virt_mem.size = mem_size; - virt_mem.va = kzalloc(virt_mem.size, GFP_KERNEL); - if (!virt_mem.va) { - 
ibdev_dbg(to_ibdev(dev), - "HMC: failed to allocate memory for sd_entry buffer\n"); - return -ENOMEM; - } - hmc_info->sd_table.sd_entry = virt_mem.va; - - return ret_code; + return irdma_cfg_sd_mem(dev, hmc_info); } /** @@ -5381,6 +5702,7 @@ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev, dev->fpm_commit_buf = info->fpm_commit_buf; dev->hw = info->hw; dev->hw->hw_addr = info->bar0; + dev->protocol_used = info->protocol_used; /* Setup the hardware limits, hmc may limit further */ dev->hw_attrs.min_hw_qp_id = IRDMA_MIN_IW_QP_ID; dev->hw_attrs.min_hw_aeq_size = IRDMA_MIN_AEQ_ENTRIES; @@ -5409,7 +5731,17 @@ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev, dev->hw_attrs.max_sleep_count = IRDMA_SLEEP_COUNT; dev->hw_attrs.max_cqp_compl_wait_time_ms = CQP_COMPL_WAIT_TIME_MS; - dev->hw_attrs.uk_attrs.hw_rev = ver; + if (!dev->privileged) { + ret_code = irdma_vchnl_req_get_hmc_fcn(dev); + if (ret_code) { + ibdev_dbg(to_ibdev(dev), + "DEV: Get HMC function ret = %d\n", + ret_code); + + return ret_code; + } + } + irdma_sc_init_hw(dev); if (irdma_wait_pe_ready(dev)) diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 2cb4b96..7825896 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -114,6 +114,12 @@ enum irdma_protocol_used { #define IRDMA_UPDATE_SD_BUFF_SIZE 128 #define IRDMA_FEATURE_BUF_SIZE (8 * IRDMA_MAX_FEATURES) +#define ENABLE_LOC_MEM 63 +#define MAX_PBLE_PER_SD 0x40000 +#define MAX_PBLE_SD_PER_FCN 0x400 +#define MAX_MR_PER_SD 0x8000 +#define MAX_MR_SD_PER_FCN 0x80 +#define IRDMA_PBLE_COMMIT_OFFSET 112 #define IRDMA_MAX_QUANTA_PER_WR 8 #define IRDMA_QP_SW_MAX_WQ_QUANTA 32768 @@ -396,6 +402,11 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_STATS_HMC_FCN_INDEX GENMASK_ULL(5, 0) #define IRDMA_CQPSQ_WS_WQEVALID BIT_ULL(63) #define IRDMA_CQPSQ_WS_NODEOP GENMASK_ULL(53, 52) +#define IRDMA_SD_MAX GENMASK_ULL(15, 0) +#define IRDMA_MEM_MAX GENMASK_ULL(15, 0) 
+#define IRDMA_QP_MEM_LOC GENMASK_ULL(47, 44) +#define IRDMA_MR_MEM_LOC_S 24 +#define IRDMA_MR_MEM_LOC GENMASK_ULL(27, 24) #define IRDMA_CQPSQ_WS_ENABLENODE BIT_ULL(62) #define IRDMA_CQPSQ_WS_NODETYPE BIT_ULL(61) @@ -660,10 +671,12 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_AEQ_VMAP BIT_ULL(47) #define IRDMA_CQPSQ_AEQ_FIRSTPMPBLIDX GENMASK_ULL(27, 0) -#define IRDMA_COMMIT_FPM_QPCNT GENMASK_ULL(18, 0) +#define IRDMA_COMMIT_FPM_QPCNT_S 0 +#define IRDMA_COMMIT_FPM_QPCNT GENMASK_ULL(20, 0) #define IRDMA_COMMIT_FPM_BASE_S 32 -#define IRDMA_CQPSQ_CFPM_HMCFNID GENMASK_ULL(5, 0) +#define IRDMA_CQPSQ_CFPM_HMCFNID GENMASK_ULL(15, 0) + #define IRDMA_CQPSQ_FWQE_AECODE GENMASK_ULL(15, 0) #define IRDMA_CQPSQ_FWQE_AESOURCE GENMASK_ULL(19, 16) #define IRDMA_CQPSQ_FWQE_RQMNERR GENMASK_ULL(15, 0) @@ -903,10 +916,17 @@ enum irdma_cqp_op_type { #define IRDMAPFINT_OICR_PE_PUSH_M BIT(27) #define IRDMAPFINT_OICR_PE_CRITERR_M BIT(28) -#define IRDMA_QUERY_FPM_MAX_QPS GENMASK_ULL(18, 0) -#define IRDMA_QUERY_FPM_MAX_CQS GENMASK_ULL(19, 0) +#define IRDMA_QUERY_FPM_LOC_MEM_PAGES_S 32 +#define IRDMA_QUERY_FPM_LOC_MEM_PAGES GENMASK_ULL(63, 32) +#define IRDMA_QUERY_FPM_MAX_QPS_S 0 +#define IRDMA_QUERY_FPM_MAX_QPS GENMASK_ULL(31, 0) +#define IRDMA_QUERY_FPM_MAX_CQS_S 0 +#define IRDMA_QUERY_FPM_MAX_CQS GENMASK_ULL(31, 0) +#define IRDMA_QUERY_FPM_FIRST_PE_SD_INDEX_S 0 #define IRDMA_QUERY_FPM_FIRST_PE_SD_INDEX GENMASK_ULL(13, 0) -#define IRDMA_QUERY_FPM_MAX_PE_SDS GENMASK_ULL(45, 32) +#define IRDMA_QUERY_FPM_MAX_PE_SDS_S 32 +#define IRDMA_QUERY_FPM_MAX_PE_SDS GENMASK_ULL(44, 32) + #define IRDMA_QUERY_FPM_MAX_CEQS GENMASK_ULL(9, 0) #define IRDMA_QUERY_FPM_XFBLOCKSIZE GENMASK_ULL(63, 32) #define IRDMA_QUERY_FPM_Q1BLOCKSIZE GENMASK_ULL(63, 32) diff --git a/drivers/infiniband/hw/irdma/hmc.c b/drivers/infiniband/hw/irdma/hmc.c index ac58088..da18add1 100644 --- a/drivers/infiniband/hw/irdma/hmc.c +++ b/drivers/infiniband/hw/irdma/hmc.c @@ -5,6 +5,7 @@ #include "defs.h" #include "type.h" #include 
"protos.h" +#include "virtchnl.h" /** * irdma_find_sd_index_limit - finds segment descriptor index limit @@ -228,6 +229,10 @@ int irdma_sc_create_hmc_obj(struct irdma_sc_dev *dev, bool pd_error = false; int ret_code = 0; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3 && + dev->hmc_info->hmc_obj[info->rsrc_type].mem_loc == IRDMA_LOC_MEM) + return 0; + if (info->start_idx >= info->hmc_info->hmc_obj[info->rsrc_type].cnt) return -EINVAL; @@ -330,7 +335,7 @@ static int irdma_finish_del_sd_reg(struct irdma_sc_dev *dev, u32 i, sd_idx; struct irdma_dma_mem *mem; - if (!reset) + if (dev->privileged && !reset) ret_code = irdma_hmc_sd_grp(dev, info->hmc_info, info->hmc_info->sd_indexes[0], info->del_sd_cnt, false); @@ -376,6 +381,9 @@ int irdma_sc_del_hmc_obj(struct irdma_sc_dev *dev, u32 i, j; int ret_code = 0; + if (dev->hmc_info->hmc_obj[info->rsrc_type].mem_loc == IRDMA_LOC_MEM) + return 0; + if (info->start_idx >= info->hmc_info->hmc_obj[info->rsrc_type].cnt) { ibdev_dbg(to_ibdev(dev), "HMC: error start_idx[%04d] >= [type %04d].cnt[%04d]\n", @@ -589,7 +597,10 @@ int irdma_add_pd_table_entry(struct irdma_sc_dev *dev, pd_entry->sd_index = sd_idx; pd_entry->valid = true; pd_table->use_cnt++; - irdma_invalidate_pf_hmc_pd(dev, sd_idx, rel_pd_idx); + + if (hmc_info->hmc_fn_id < dev->hw_attrs.first_hw_vf_fpm_id && + dev->privileged) + irdma_invalidate_pf_hmc_pd(dev, sd_idx, rel_pd_idx); } pd_entry->bp.use_cnt++; @@ -640,7 +651,8 @@ int irdma_remove_pd_bp(struct irdma_sc_dev *dev, pd_addr = pd_table->pd_page_addr.va; pd_addr += rel_pd_idx; memset(pd_addr, 0, sizeof(u64)); - irdma_invalidate_pf_hmc_pd(dev, sd_idx, idx); + if (dev->privileged && dev->hmc_fn_id == hmc_info->hmc_fn_id) + irdma_invalidate_pf_hmc_pd(dev, sd_idx, idx); if (!pd_entry->rsrc_pg) { mem = &pd_entry->bp.addr; diff --git a/drivers/infiniband/hw/irdma/hmc.h b/drivers/infiniband/hw/irdma/hmc.h index 415f9e2..257a5d2 100644 --- a/drivers/infiniband/hw/irdma/hmc.h +++ b/drivers/infiniband/hw/irdma/hmc.h @@ 
-16,11 +16,21 @@ #define IRDMA_HMC_PD_BP_BUF_ALIGNMENT 4096 #define IRDMA_FIRST_VF_FPM_ID 8 #define FPM_MULTIPLIER 1024 +#define IRDMA_OBJ_LOC_MEM_BIT 0x4 +#define IRDMA_XF_MULTIPLIER 16 +#define IRDMA_RRF_MULTIPLIER 8 +#define IRDMA_MIN_PBLE_PAGES 3 +#define IRDMA_HMC_PAGE_SIZE 2097152 +#define IRDMA_MIN_MR_PER_QP 4 +#define IRDMA_MIN_QP_CNT 64 +#define IRDMA_FSIAV_CNT_MAX 1048576 +#define IRDMA_MIN_IRD 8 +#define IRDMA_HMC_MIN_RRF 16 enum irdma_hmc_rsrc_type { IRDMA_HMC_IW_QP = 0, IRDMA_HMC_IW_CQ = 1, - IRDMA_HMC_IW_RESERVED = 2, + IRDMA_HMC_IW_SRQ = 2, IRDMA_HMC_IW_HTE = 3, IRDMA_HMC_IW_ARP = 4, IRDMA_HMC_IW_APBVT_ENTRY = 5, @@ -48,11 +58,17 @@ enum irdma_sd_entry_type { IRDMA_SD_TYPE_DIRECT = 2, }; +enum irdma_hmc_obj_mem { + IRDMA_HOST_MEM = 0, + IRDMA_LOC_MEM = 1, +}; + struct irdma_hmc_obj_info { u64 base; u32 max_cnt; u32 cnt; u64 size; + enum irdma_hmc_obj_mem mem_loc; }; struct irdma_hmc_bp { @@ -117,6 +133,7 @@ struct irdma_update_sds_info { struct irdma_ccq_cqe_info; struct irdma_hmc_fcn_info { u32 vf_id; + u8 protocol_used; u8 free_fcn; }; diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index ad50b77..2881314 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -33,6 +33,7 @@ static enum irdma_hmc_rsrc_type iw_hmc_obj_types[] = { IRDMA_HMC_IW_QP, IRDMA_HMC_IW_CQ, + IRDMA_HMC_IW_SRQ, IRDMA_HMC_IW_HTE, IRDMA_HMC_IW_ARP, IRDMA_HMC_IW_APBVT_ENTRY, @@ -1571,6 +1572,8 @@ static void irdma_del_init_mem(struct irdma_pci_f *rf) { struct irdma_sc_dev *dev = &rf->sc_dev; + if (!rf->sc_dev.privileged) + irdma_vchnl_req_put_hmc_fcn(&rf->sc_dev); kfree(dev->hmc_info->sd_table.sd_entry); dev->hmc_info->sd_table.sd_entry = NULL; vfree(rf->mem_rsrc); @@ -1637,6 +1640,7 @@ static int irdma_initialize_dev(struct irdma_pci_f *rf) info.bar0 = rf->hw.hw_addr; info.hmc_fn_id = rf->pf_id; + info.protocol_used = rf->protocol_used; info.hw = &rf->hw; status = irdma_sc_dev_init(rf->rdma_ver, &rf->sc_dev, &info); 
if (status) diff --git a/drivers/infiniband/hw/irdma/i40iw_if.c b/drivers/infiniband/hw/irdma/i40iw_if.c index 6fa807e..15e036d 100644 --- a/drivers/infiniband/hw/irdma/i40iw_if.c +++ b/drivers/infiniband/hw/irdma/i40iw_if.c @@ -77,6 +77,7 @@ static void i40iw_fill_device_info(struct irdma_device *iwdev, struct i40e_info rf->rdma_ver = IRDMA_GEN_1; rf->sc_dev.hw = &rf->hw; rf->sc_dev.hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_1; + rf->sc_dev.privileged = true; rf->gen_ops.request_reset = i40iw_request_reset; rf->pcidev = cdev_info->pcidev; rf->pf_id = cdev_info->fid; diff --git a/drivers/infiniband/hw/irdma/icrdma_if.c b/drivers/infiniband/hw/irdma/icrdma_if.c index 5fcbf69..0ddccf1 100644 --- a/drivers/infiniband/hw/irdma/icrdma_if.c +++ b/drivers/infiniband/hw/irdma/icrdma_if.c @@ -144,6 +144,8 @@ static void icrdma_fill_device_info(struct irdma_device *iwdev, rf->msix_entries = cdev_info->msix_entries; rf->rdma_ver = IRDMA_GEN_2; rf->sc_dev.hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_2; + rf->sc_dev.is_pf = true; + rf->sc_dev.privileged = true; rf->gen_ops.register_qset = icrdma_lan_register_qset; rf->gen_ops.unregister_qset = icrdma_lan_unregister_qset; diff --git a/drivers/infiniband/hw/irdma/ig3rdma_hw.h b/drivers/infiniband/hw/irdma/ig3rdma_hw.h new file mode 100644 index 0000000..4c3d186 --- /dev/null +++ b/drivers/infiniband/hw/irdma/ig3rdma_hw.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ +/* Copyright (c) 2021 - 2024 Intel Corporation */ +#ifndef IG3RDMA_HW_H +#define IG3RDMA_HW_H + +#define IG3_PF_RDMA_REGION_OFFSET 0xBC00000 +#define IG3_PF_RDMA_REGION_LEN 0x401000 +#define IG3_VF_RDMA_REGION_OFFSET 0x8C00 +#define IG3_VF_RDMA_REGION_LEN 0x8400 + +#endif /* IG3RDMA_HW_H*/ diff --git a/drivers/infiniband/hw/irdma/ig3rdma_if.c b/drivers/infiniband/hw/irdma/ig3rdma_if.c new file mode 100644 index 0000000..70b1ed3 --- /dev/null +++ b/drivers/infiniband/hw/irdma/ig3rdma_if.c @@ -0,0 +1,171 @@ +// SPDX-License-Identifier: GPL-2.0 or 
Linux-OpenIB +/* Copyright (c) 2023 - 2024 Intel Corporation */ +#include "main.h" +#include "ig3rdma_hw.h" + +static void ig3rdma_idc_core_event_handler(struct idc_rdma_core_dev_info *cdev_info, + struct idc_rdma_event *event) +{ + struct irdma_pci_f *rf = auxiliary_get_drvdata(cdev_info->adev); + + if (*event->type & BIT(IDC_RDMA_EVENT_WARN_RESET)) { + rf->reset = true; + rf->sc_dev.vchnl_up = false; + } +} + +static int ig3rdma_cfg_regions(struct irdma_hw *hw, + struct idc_rdma_core_dev_info *cdev_info) +{ + struct pci_dev *pdev = cdev_info->pdev; + int i; + + switch (cdev_info->ftype) { + case IDC_FUNCTION_TYPE_PF: + hw->rdma_reg.len = IG3_PF_RDMA_REGION_LEN; + hw->rdma_reg.offset = IG3_PF_RDMA_REGION_OFFSET; + break; + case IDC_FUNCTION_TYPE_VF: + hw->rdma_reg.len = IG3_VF_RDMA_REGION_LEN; + hw->rdma_reg.offset = IG3_VF_RDMA_REGION_OFFSET; + break; + default: + return -ENODEV; + } + + hw->rdma_reg.addr = ioremap(pci_resource_start(pdev, 0) + hw->rdma_reg.offset, + hw->rdma_reg.len); + + if (!hw->rdma_reg.addr) + return -ENOMEM; + + hw->io_regs = kcalloc(cdev_info->num_memory_regions, + sizeof(struct irdma_mmio_region), GFP_KERNEL); + + if (!hw->io_regs) { + iounmap(hw->rdma_reg.addr); + return -ENOMEM; + } + + hw->num_io_regions = le16_to_cpu(cdev_info->num_memory_regions); + for (i = 0; i < cdev_info->num_memory_regions; i++) { + hw->io_regs[i].addr = + cdev_info->mapped_mem_regions[i].region_addr; + hw->io_regs[i].len = + cdev_info->mapped_mem_regions[i].size; + hw->io_regs[i].offset = + cdev_info->mapped_mem_regions[i].start_offset; + } + + return 0; +} + +static void ig3rdma_decfg_rf(struct irdma_pci_f *rf) +{ + struct irdma_hw *hw = &rf->hw; + + destroy_workqueue(rf->vchnl_wq); + kfree(hw->io_regs); + iounmap(hw->rdma_reg.addr); +} + +static int ig3rdma_cfg_rf(struct irdma_pci_f *rf, + struct idc_rdma_core_dev_info *cdev_info) +{ + int err; + + rf->sc_dev.hw = &rf->hw; + rf->cdev = cdev_info; + rf->pcidev = cdev_info->pdev; + rf->hw.device = 
&rf->pcidev->dev; + rf->msix_count = cdev_info->msix_count; + rf->msix_entries = cdev_info->msix_entries; + + err = irdma_vchnl_init(rf, cdev_info, &rf->rdma_ver); + if (err) + return err; + + err = ig3rdma_cfg_regions(&rf->hw, cdev_info); + if (err) { + destroy_workqueue(rf->vchnl_wq); + return err; + } + + rf->protocol_used = IRDMA_ROCE_PROTOCOL_ONLY; + rf->rsrc_profile = IRDMA_HMC_PROFILE_DEFAULT; + rf->rst_to = IRDMA_RST_TIMEOUT_HZ; + rf->gen_ops.request_reset = irdma_request_reset; + rf->limits_sel = 7; + mutex_init(&rf->ah_tbl_lock); + + return 0; +} + +static int ig3rdma_core_probe(struct auxiliary_device *aux_dev, + const struct auxiliary_device_id *id) +{ + struct idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct irdma_pci_f *rf; + int err; + + rf = kzalloc(sizeof(*rf), GFP_KERNEL); + if (!rf) + return -ENOMEM; + + err = ig3rdma_cfg_rf(rf, cdev_info); + if (err) + goto err_cfg_rf; + + err = irdma_ctrl_init_hw(rf); + if (err) + goto err_ctrl_init; + + auxiliary_set_drvdata(aux_dev, rf); + + err = cdev_info->ops->vport_dev_ctrl(cdev_info, true); + if (err) + goto err_vport_ctrl; + + return 0; + +err_vport_ctrl: + irdma_ctrl_deinit_hw(rf); +err_ctrl_init: + ig3rdma_decfg_rf(rf); +err_cfg_rf: + kfree(rf); + + return err; +} + +static void ig3rdma_core_remove(struct auxiliary_device *aux_dev) +{ + struct idc_rdma_core_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_core_auxiliary_dev, adev); + struct idc_rdma_core_dev_info *cdev_info = idc_adev->cdev_info; + struct irdma_pci_f *rf = auxiliary_get_drvdata(aux_dev); + + cdev_info->ops->vport_dev_ctrl(cdev_info, false); + irdma_ctrl_deinit_hw(rf); + ig3rdma_decfg_rf(rf); + kfree(rf); +} + +static const struct auxiliary_device_id ig3rdma_core_auxiliary_id_table[] = { + {.name = "idpf.8086.rdma.core", }, + {}, +}; + +MODULE_DEVICE_TABLE(auxiliary, 
ig3rdma_core_auxiliary_id_table); + +struct idc_rdma_core_auxiliary_drv ig3rdma_core_auxiliary_drv = { + .adrv = { + .name = "core", + .id_table = ig3rdma_core_auxiliary_id_table, + .probe = ig3rdma_core_probe, + .remove = ig3rdma_core_remove, + }, + .event_handler = ig3rdma_idc_core_event_handler, +}; \ No newline at end of file diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h index 20d2e739..7691704 100644 --- a/drivers/infiniband/hw/irdma/irdma.h +++ b/drivers/infiniband/hw/irdma/irdma.h @@ -107,6 +107,9 @@ enum irdma_vers { IRDMA_GEN_RSVD, IRDMA_GEN_1, IRDMA_GEN_2, + IRDMA_GEN_3, + IRDMA_GEN_NEXT, + IRDMA_GEN_MAX = IRDMA_GEN_NEXT-1 }; struct irdma_uk_attrs { diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c index ee59ca1..e9524de 100644 --- a/drivers/infiniband/hw/irdma/main.c +++ b/drivers/infiniband/hw/irdma/main.c @@ -7,6 +7,23 @@ MODULE_DESCRIPTION("Intel(R) Ethernet Protocol Driver for RDMA"); MODULE_LICENSE("Dual BSD/GPL"); +int irdma_vchnl_send_sync(struct irdma_sc_dev *dev, u8 *msg, u16 len, + u8 *recv_msg, u16 *recv_len) +{ + struct idc_rdma_core_dev_info *cdev_info = dev_to_rf(dev)->cdev; + int ret; + + ret = cdev_info->ops->vc_send_sync(cdev_info, msg, len, recv_msg, + recv_len); + if (ret == -ETIMEDOUT) { + ibdev_err(&(dev_to_rf(dev)->iwdev->ibdev), + "Virtual channel Req <-> Resp completion timeout\n"); + dev->vchnl_up = false; + } + + return ret; +} + static struct notifier_block irdma_inetaddr_notifier = { .notifier_call = irdma_inetaddr_event }; @@ -103,16 +120,54 @@ static int __init irdma_init_module(void) return ret; } + ret = auxiliary_driver_register(&ig3rdma_core_auxiliary_drv.adrv); + if (ret) { + auxiliary_driver_unregister(&icrdma_core_auxiliary_drv.adrv); + auxiliary_driver_unregister(&i40iw_auxiliary_drv); + pr_err("Failed ig3rdma(gen_3) core auxiliary_driver_register() ret=%d\n", + ret); + + return ret; + } irdma_register_notifiers(); return 0; } +int 
irdma_vchnl_init(struct irdma_pci_f *rf, + struct idc_rdma_core_dev_info *cdev_info, u8 *rdma_ver) +{ + struct irdma_vchnl_init_info virt_info; + u8 gen = rf->rdma_ver; + int ret; + + rf->vchnl_wq = alloc_ordered_workqueue("irdma-virtchnl-wq", 0); + if (!rf->vchnl_wq) + return -ENOMEM; + + mutex_init(&rf->sc_dev.vchnl_mutex); + + virt_info.is_pf = !cdev_info->ftype; + virt_info.hw_rev = gen; + virt_info.privileged = gen == IRDMA_GEN_2; + virt_info.vchnl_wq = rf->vchnl_wq; + ret = irdma_sc_vchnl_init(&rf->sc_dev, &virt_info); + if (ret) { + destroy_workqueue(rf->vchnl_wq); + return ret; + } + + *rdma_ver = rf->sc_dev.hw_attrs.uk_attrs.hw_rev; + + return 0; +} + static void __exit irdma_exit_module(void) { irdma_unregister_notifiers(); auxiliary_driver_unregister(&icrdma_core_auxiliary_drv.adrv); auxiliary_driver_unregister(&i40iw_auxiliary_drv); + auxiliary_driver_unregister(&ig3rdma_core_auxiliary_drv.adrv); } module_init(irdma_init_module); diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 7360e17..a7f3d19 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -55,6 +55,7 @@ #include "puda.h" extern struct auxiliary_driver i40iw_auxiliary_drv; +extern struct idc_rdma_core_auxiliary_drv ig3rdma_core_auxiliary_drv; extern struct idc_rdma_core_auxiliary_drv icrdma_core_auxiliary_drv; #define IRDMA_FW_VER_DEFAULT 2 @@ -326,6 +327,7 @@ struct irdma_pci_f { wait_queue_head_t vchnl_waitq; struct workqueue_struct *cqp_cmpl_wq; struct work_struct cqp_cmpl_work; + struct workqueue_struct *vchnl_wq; struct irdma_sc_vsi default_vsi; void *back_fcn; struct irdma_gen_ops gen_ops; @@ -556,6 +558,8 @@ int irdma_netdevice_event(struct notifier_block *notifier, unsigned long event, void *ptr); void irdma_add_ip(struct irdma_device *iwdev); void cqp_compl_worker(struct work_struct *work); +int irdma_vchnl_init(struct irdma_pci_f *rf, + struct idc_rdma_core_dev_info *cdev_info, u8 *rdma_ver); void 
irdma_fill_qos_info(struct irdma_l2params *l2params, struct iidc_rdma_qos_params *qos_info); void irdma_request_reset(struct irdma_pci_f *rf); diff --git a/drivers/infiniband/hw/irdma/pble.c b/drivers/infiniband/hw/irdma/pble.c index e7ce684..2ef60e6 100644 --- a/drivers/infiniband/hw/irdma/pble.c +++ b/drivers/infiniband/hw/irdma/pble.c @@ -193,8 +193,15 @@ static enum irdma_sd_entry_type irdma_get_type(struct irdma_sc_dev *dev, { enum irdma_sd_entry_type sd_entry_type; - sd_entry_type = !idx->rel_pd_idx && pages == IRDMA_HMC_PD_CNT_IN_SD ? - IRDMA_SD_TYPE_DIRECT : IRDMA_SD_TYPE_PAGED; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + sd_entry_type = (!idx->rel_pd_idx && + pages == IRDMA_HMC_PD_CNT_IN_SD) ? + IRDMA_SD_TYPE_DIRECT : IRDMA_SD_TYPE_PAGED; + else + sd_entry_type = (!idx->rel_pd_idx && + pages == IRDMA_HMC_PD_CNT_IN_SD && + dev->privileged) ? + IRDMA_SD_TYPE_DIRECT : IRDMA_SD_TYPE_PAGED; return sd_entry_type; } @@ -279,10 +286,11 @@ static int add_pble_prm(struct irdma_hmc_pble_rsrc *pble_rsrc) sd_reg_val = (sd_entry_type == IRDMA_SD_TYPE_PAGED) ? 
sd_entry->u.pd_table.pd_page_addr.pa : sd_entry->u.bp.addr.pa; - - if (!sd_entry->valid) { - ret_code = irdma_hmc_sd_one(dev, hmc_info->hmc_fn_id, sd_reg_val, - idx->sd_idx, sd_entry->entry_type, true); + if ((dev->privileged && !sd_entry->valid) || + dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + ret_code = irdma_hmc_sd_one(dev, hmc_info->hmc_fn_id, + sd_reg_val, idx->sd_idx, + sd_entry->entry_type, true); if (ret_code) goto error; } diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 59b34af..cfcb5d9 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -8,6 +8,8 @@ #include "hmc.h" #include "uda.h" #include "ws.h" +#include "virtchnl.h" + #define IRDMA_DEBUG_ERR "ERR" #define IRDMA_DEBUG_INIT "INIT" #define IRDMA_DEBUG_DEV "DEV" @@ -159,7 +161,34 @@ enum irdma_hw_stats_index { enum irdma_feature_type { IRDMA_FEATURE_FW_INFO = 0, IRDMA_HW_VERSION_INFO = 1, + IRDMA_QP_MAX_INCR = 2, + IRDMA_CQ_MAX_INCR = 3, + IRDMA_CEQ_MAX_INCR = 4, + IRDMA_SD_MAX_INCR = 5, + IRDMA_QP_SMALL = 6, + IRDMA_QP_MEDIUM = 7, + IRDMA_QP_LARGE = 8, + IRDMA_QP_XLARGE = 9, + IRDMA_CQ_SMALL = 10, + IRDMA_CQ_MEDIUM = 11, + IRDMA_CQ_LARGE = 12, + IRDMA_CQ_XLARGE = 13, + IRDMA_CEQ_SMALL = 14, + IRDMA_CEQ_MEDIUM = 15, + IRDMA_CEQ_LARGE = 16, + IRDMA_CEQ_XLARGE = 17, + IRDMA_SD_SMALL = 18, + IRDMA_SD_MEDIUM = 19, + IRDMA_SD_LARGE = 20, + IRDMA_SD_XLARGE = 21, + IRDMA_OBJ_1 = 22, + IRDMA_OBJ_2 = 23, + IRDMA_ENDPT_TRK = 24, + IRDMA_FTN_INLINE_MAX = 25, IRDMA_QSETS_MAX = 26, + IRDMA_ASO = 27, + IRDMA_FTN_FLAGS = 32, + IRDMA_FTN_NOP = 33, IRDMA_MAX_FEATURES, /* Must be last entry */ }; @@ -310,9 +339,21 @@ struct irdma_vsi_pestat { spinlock_t lock; /* rdma stats lock */ }; +struct irdma_mmio_region { + u8 __iomem *addr; + resource_size_t len; + resource_size_t offset; +}; + struct irdma_hw { - u8 __iomem *hw_addr; - u8 __iomem *priv_hw_addr; + union { + u8 __iomem *hw_addr; + struct { + struct irdma_mmio_region rdma_reg; /* 
RDMA region */ + struct irdma_mmio_region *io_regs; /* Non-RDMA MMIO regions */ + u16 num_io_regions; /* Number of Non-RDMA MMIO regions */ + }; + }; struct device *device; struct irdma_hmc_info hmc; }; @@ -518,6 +559,7 @@ struct irdma_ws_node_info { struct irdma_hmc_fpm_misc { u32 max_ceqs; u32 max_sds; + u32 loc_mem_pages; u32 xf_block_size; u32 q1_block_size; u32 ht_multiplier; @@ -526,6 +568,7 @@ struct irdma_hmc_fpm_misc { u32 ooiscf_block_size; }; +#define IRDMA_VCHNL_MAX_MSG_SIZE 512 #define IRDMA_LEAF_DEFAULT_REL_BW 64 #define IRDMA_PARENT_DEFAULT_REL_BW 1 @@ -601,19 +644,28 @@ struct irdma_sc_dev { u64 cqp_cmd_stats[IRDMA_MAX_CQP_OPS]; struct irdma_hw_attrs hw_attrs; struct irdma_hmc_info *hmc_info; + struct irdma_vchnl_rdma_caps vc_caps; + u8 vc_recv_buf[IRDMA_VCHNL_MAX_MSG_SIZE]; + u16 vc_recv_len; struct irdma_sc_cqp *cqp; struct irdma_sc_aeq *aeq; struct irdma_sc_ceq *ceq[IRDMA_CEQ_MAX_COUNT]; struct irdma_sc_cq *ccq; const struct irdma_irq_ops *irq_ops; + struct irdma_qos qos[IRDMA_MAX_USER_PRIORITY]; struct irdma_hmc_fpm_misc hmc_fpm_misc; struct irdma_ws_node *ws_tree_root; struct mutex ws_mutex; /* ws tree mutex */ + u32 vchnl_ver; u16 num_vfs; - u8 hmc_fn_id; + u16 hmc_fn_id; u8 vf_id; + bool privileged:1; bool vchnl_up:1; bool ceq_valid:1; + bool is_pf:1; + u8 protocol_used; + struct mutex vchnl_mutex; /* mutex to synchronize RDMA virtual channel messages */ u8 pci_rev; int (*ws_add)(struct irdma_sc_vsi *vsi, u8 user_pri); void (*ws_remove)(struct irdma_sc_vsi *vsi, u8 user_pri); @@ -731,7 +783,8 @@ struct irdma_device_init_info { __le64 *fpm_commit_buf; struct irdma_hw *hw; void __iomem *bar0; - u8 hmc_fn_id; + enum irdma_protocol_used protocol_used; + u16 hmc_fn_id; }; struct irdma_ceq_init_info { @@ -972,7 +1025,7 @@ struct irdma_allocate_stag_info { bool use_hmc_fcn_index:1; bool use_pf_rid:1; bool all_memory:1; - u8 hmc_fcn_index; + u16 hmc_fcn_index; }; struct irdma_mw_alloc_info { diff --git a/drivers/infiniband/hw/irdma/user.h 
b/drivers/infiniband/hw/irdma/user.h index 380e4a4..8fd7eeb 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -55,8 +55,8 @@ enum irdma_device_caps_const { IRDMA_CEQE_SIZE = 1, IRDMA_CQP_CTX_SIZE = 8, IRDMA_SHADOW_AREA_SIZE = 8, - IRDMA_QUERY_FPM_BUF_SIZE = 176, - IRDMA_COMMIT_FPM_BUF_SIZE = 176, + IRDMA_QUERY_FPM_BUF_SIZE = 192, + IRDMA_COMMIT_FPM_BUF_SIZE = 192, IRDMA_GATHER_STATS_BUF_SIZE = 1024, IRDMA_MIN_IW_QP_ID = 0, IRDMA_MAX_IW_QP_ID = 262143, diff --git a/drivers/infiniband/hw/irdma/virtchnl.c b/drivers/infiniband/hw/irdma/virtchnl.c new file mode 100644 index 0000000..2abfc39 --- /dev/null +++ b/drivers/infiniband/hw/irdma/virtchnl.c @@ -0,0 +1,300 @@ +// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB +/* Copyright (c) 2015 - 2024 Intel Corporation */ +#include "osdep.h" +#include "hmc.h" +#include "defs.h" +#include "type.h" +#include "protos.h" +#include "virtchnl.h" +#include "ws.h" +#include "i40iw_hw.h" + +/** + * irdma_sc_vchnl_init - Initialize dev virtchannel and get hw_rev + * @dev: dev structure to update + * @info: virtchannel info parameters to fill into the dev structure + */ +int irdma_sc_vchnl_init(struct irdma_sc_dev *dev, + struct irdma_vchnl_init_info *info) +{ + dev->vchnl_up = true; + dev->privileged = info->privileged; + dev->is_pf = info->is_pf; + dev->hw_attrs.uk_attrs.hw_rev = info->hw_rev; + + if (!dev->privileged) { + int ret = irdma_vchnl_req_get_ver(dev, IRDMA_VCHNL_CHNL_VER_MAX, + &dev->vchnl_ver); + + ibdev_dbg(to_ibdev(dev), + "DEV: Get Channel version ret = %d, version is %u\n", + ret, dev->vchnl_ver); + + if (ret) + return ret; + + ret = irdma_vchnl_req_get_caps(dev); + if (ret) + return ret; + + dev->hw_attrs.uk_attrs.hw_rev = dev->vc_caps.hw_rev; + } + + return 0; +} + +/** + * irdma_vchnl_req_verify_resp - Verify requested response size + * @vchnl_req: vchnl message requested + * @resp_len: response length sent from vchnl peer + */ +static int 
irdma_vchnl_req_verify_resp(struct irdma_vchnl_req *vchnl_req, + u16 resp_len) +{ + switch (vchnl_req->vchnl_msg->op_code) { + case IRDMA_VCHNL_OP_GET_VER: + case IRDMA_VCHNL_OP_GET_HMC_FCN: + case IRDMA_VCHNL_OP_PUT_HMC_FCN: + if (resp_len != vchnl_req->parm_len) + return -EBADMSG; + break; + case IRDMA_VCHNL_OP_GET_RDMA_CAPS: + if (resp_len < IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE) + return -EBADMSG; + break; + default: + return -EOPNOTSUPP; + } + + return 0; +} + +static void irdma_free_vchnl_req_msg(struct irdma_vchnl_req *vchnl_req) +{ + kfree(vchnl_req->vchnl_msg); +} + +static int irdma_alloc_vchnl_req_msg(struct irdma_vchnl_req *vchnl_req, + struct irdma_vchnl_req_init_info *info) +{ + struct irdma_vchnl_op_buf *vchnl_msg; + + vchnl_msg = kzalloc(IRDMA_VCHNL_MAX_MSG_SIZE, GFP_KERNEL); + + if (!vchnl_msg) + return -ENOMEM; + + vchnl_msg->op_ctx = (uintptr_t)vchnl_req; + vchnl_msg->buf_len = sizeof(*vchnl_msg) + info->req_parm_len; + if (info->req_parm_len) + memcpy(vchnl_msg->buf, info->req_parm, info->req_parm_len); + vchnl_msg->op_code = info->op_code; + vchnl_msg->op_ver = info->op_ver; + + vchnl_req->vchnl_msg = vchnl_msg; + vchnl_req->parm = info->resp_parm; + vchnl_req->parm_len = info->resp_parm_len; + + return 0; +} + +static int irdma_vchnl_req_send_sync(struct irdma_sc_dev *dev, + struct irdma_vchnl_req_init_info *info) +{ + u16 resp_len = sizeof(dev->vc_recv_buf); + struct irdma_vchnl_req vchnl_req = {}; + u16 msg_len; + u8 *msg; + int ret; + + ret = irdma_alloc_vchnl_req_msg(&vchnl_req, info); + if (ret) + return ret; + + msg_len = vchnl_req.vchnl_msg->buf_len; + msg = (u8 *)vchnl_req.vchnl_msg; + + mutex_lock(&dev->vchnl_mutex); + ret = irdma_vchnl_send_sync(dev, msg, msg_len, dev->vc_recv_buf, + &resp_len); + dev->vc_recv_len = resp_len; + if (ret) + goto exit; + + ret = irdma_vchnl_req_get_resp(dev, &vchnl_req); +exit: + mutex_unlock(&dev->vchnl_mutex); + ibdev_dbg(to_ibdev(dev), + "VIRT: virtual channel send %s caller: %pS ret=%d op=%u 
op_ver=%u req_len=%u parm_len=%u resp_len=%u\n", + !ret ? "SUCCEEDS" : "FAILS", __builtin_return_address(0), + ret, vchnl_req.vchnl_msg->op_code, + vchnl_req.vchnl_msg->op_ver, vchnl_req.vchnl_msg->buf_len, + vchnl_req.parm_len, vchnl_req.resp_len); + irdma_free_vchnl_req_msg(&vchnl_req); + + return ret; +} + +/** + * irdma_vchnl_req_get_ver - Request Channel version + * @dev: RDMA device pointer + * @ver_req: Virtual channel version requested + * @ver_res: Virtual channel version response + */ +int irdma_vchnl_req_get_ver(struct irdma_sc_dev *dev, u16 ver_req, u32 *ver_res) +{ + struct irdma_vchnl_req_init_info info = {}; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_GET_VER; + info.op_ver = ver_req; + info.resp_parm = ver_res; + info.resp_parm_len = sizeof(*ver_res); + + ret = irdma_vchnl_req_send_sync(dev, &info); + if (ret) + return ret; + + if (*ver_res < IRDMA_VCHNL_CHNL_VER_MIN) { + ibdev_dbg(to_ibdev(dev), + "VIRT: %s unsupported vchnl version 0x%0x\n", + __func__, *ver_res); + return -EOPNOTSUPP; + } + + return 0; +} + +/** + * irdma_vchnl_req_get_hmc_fcn - Request VF HMC Function + * @dev: RDMA device pointer + */ +int irdma_vchnl_req_get_hmc_fcn(struct irdma_sc_dev *dev) +{ + struct irdma_vchnl_req_hmc_info req_hmc = {}; + struct irdma_vchnl_resp_hmc_info resp_hmc = {}; + struct irdma_vchnl_req_init_info info = {}; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_GET_HMC_FCN; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + info.op_ver = IRDMA_VCHNL_OP_GET_HMC_FCN_V2; + req_hmc.protocol_used = dev->protocol_used; + info.req_parm_len = sizeof(req_hmc); + info.req_parm = &req_hmc; + info.resp_parm = &resp_hmc; + info.resp_parm_len = sizeof(resp_hmc); + } + + ret = irdma_vchnl_req_send_sync(dev, &info); + + if (ret) + return ret; + + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + int i; + + for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) { + dev->qos[i].qs_handle = 
resp_hmc.qs_handle[i]; + dev->qos[i].valid = true; + } + } + return 0; +} + +/** + * irdma_vchnl_req_put_hmc_fcn - Free VF HMC Function + * @dev: RDMA device pointer + */ +int irdma_vchnl_req_put_hmc_fcn(struct irdma_sc_dev *dev) +{ + struct irdma_vchnl_req_init_info info = {}; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_PUT_HMC_FCN; + info.op_ver = IRDMA_VCHNL_OP_PUT_HMC_FCN_V0; + + return irdma_vchnl_req_send_sync(dev, &info); +} + +/** + * irdma_vchnl_req_get_caps - Request RDMA capabilities + * @dev: RDMA device pointer + */ +int irdma_vchnl_req_get_caps(struct irdma_sc_dev *dev) +{ + struct irdma_vchnl_req_init_info info = {}; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_GET_RDMA_CAPS; + info.op_ver = IRDMA_VCHNL_OP_GET_RDMA_CAPS_V0; + info.resp_parm = &dev->vc_caps; + info.resp_parm_len = sizeof(dev->vc_caps); + + ret = irdma_vchnl_req_send_sync(dev, &info); + + if (ret) + return ret; + + if (dev->vc_caps.hw_rev > IRDMA_GEN_MAX || + dev->vc_caps.hw_rev < IRDMA_GEN_2) { + ibdev_dbg(to_ibdev(dev), + "ERR: %s unsupported hw_rev version 0x%0x\n", + __func__, dev->vc_caps.hw_rev); + return -EOPNOTSUPP; + } + + return 0; +} + +/** + * irdma_vchnl_req_get_resp - Receive the inbound vchnl response. 
+ * @dev: Dev pointer + * @vchnl_req: Vchannel request + */ +int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev, + struct irdma_vchnl_req *vchnl_req) +{ + struct irdma_vchnl_resp_buf *vchnl_msg_resp = + (struct irdma_vchnl_resp_buf *)dev->vc_recv_buf; + u16 resp_len; + int ret; + + if ((uintptr_t)vchnl_req != (uintptr_t)vchnl_msg_resp->op_ctx) { + ibdev_dbg(to_ibdev(dev), + "VIRT: error vchnl context value does not match\n"); + return -EBADMSG; + } + + resp_len = dev->vc_recv_len - sizeof(*vchnl_msg_resp); + resp_len = min(resp_len, vchnl_req->parm_len); + + ret = irdma_vchnl_req_verify_resp(vchnl_req, resp_len); + if (ret) + return ret; + + ret = (int)vchnl_msg_resp->op_ret; + if (ret) + return ret; + + vchnl_req->resp_len = 0; + if (vchnl_req->parm_len && vchnl_req->parm && resp_len) { + memcpy(vchnl_req->parm, vchnl_msg_resp->buf, resp_len); + vchnl_req->resp_len = resp_len; + ibdev_dbg(to_ibdev(dev), "VIRT: Got response, data size %u\n", + resp_len); + } + + return 0; +} diff --git a/drivers/infiniband/hw/irdma/virtchnl.h b/drivers/infiniband/hw/irdma/virtchnl.h new file mode 100644 index 0000000..fb28fa0 --- /dev/null +++ b/drivers/infiniband/hw/irdma/virtchnl.h @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ +/* Copyright (c) 2015 - 2024 Intel Corporation */ +#ifndef IRDMA_VIRTCHNL_H +#define IRDMA_VIRTCHNL_H + +#include "hmc.h" +#include "irdma.h" + +/* IRDMA_VCHNL_CHNL_VER_V0 is for legacy hw, no longer supported. 
*/ +#define IRDMA_VCHNL_CHNL_VER_V2 2 +#define IRDMA_VCHNL_CHNL_VER_MIN IRDMA_VCHNL_CHNL_VER_V2 +#define IRDMA_VCHNL_CHNL_VER_MAX IRDMA_VCHNL_CHNL_VER_V2 +#define IRDMA_VCHNL_OP_GET_HMC_FCN_V0 0 +#define IRDMA_VCHNL_OP_GET_HMC_FCN_V1 1 +#define IRDMA_VCHNL_OP_GET_HMC_FCN_V2 2 +#define IRDMA_VCHNL_OP_PUT_HMC_FCN_V0 0 +#define IRDMA_VCHNL_OP_GET_RDMA_CAPS_V0 0 +#define IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE 1 + +enum irdma_vchnl_ops { + IRDMA_VCHNL_OP_GET_VER = 0, + IRDMA_VCHNL_OP_GET_HMC_FCN = 1, + IRDMA_VCHNL_OP_PUT_HMC_FCN = 2, + IRDMA_VCHNL_OP_GET_RDMA_CAPS = 13, +}; + +struct irdma_vchnl_req_hmc_info { + u8 protocol_used; + u8 disable_qos; +} __packed; + +struct irdma_vchnl_resp_hmc_info { + u16 hmc_func; + u16 qs_handle[IRDMA_MAX_USER_PRIORITY]; +} __packed; + +struct irdma_vchnl_op_buf { + u16 op_code; + u16 op_ver; + u16 buf_len; + u16 rsvd; + u64 op_ctx; + u8 buf[]; +} __packed; + +struct irdma_vchnl_resp_buf { + u64 op_ctx; + u16 buf_len; + s16 op_ret; + u16 rsvd[2]; + u8 buf[]; +} __packed; + +struct irdma_vchnl_rdma_caps { + u8 hw_rev; + u16 cqp_timeout_s; + u16 cqp_def_timeout_s; + u16 max_hw_push_len; +} __packed; + +struct irdma_vchnl_init_info { + struct workqueue_struct *vchnl_wq; + enum irdma_vers hw_rev; + bool privileged; + bool is_pf; +}; + +struct irdma_vchnl_req { + struct irdma_vchnl_op_buf *vchnl_msg; + void *parm; + u32 vf_id; + u16 parm_len; + u16 resp_len; +}; + +struct irdma_vchnl_req_init_info { + void *req_parm; + void *resp_parm; + u16 req_parm_len; + u16 resp_parm_len; + u16 op_code; + u16 op_ver; +} __packed; + +int irdma_sc_vchnl_init(struct irdma_sc_dev *dev, + struct irdma_vchnl_init_info *info); +int irdma_vchnl_send_sync(struct irdma_sc_dev *dev, u8 *msg, u16 len, + u8 *recv_msg, u16 *recv_len); +int irdma_vchnl_req_get_ver(struct irdma_sc_dev *dev, u16 ver_req, + u32 *ver_res); +int irdma_vchnl_req_get_hmc_fcn(struct irdma_sc_dev *dev); +int irdma_vchnl_req_put_hmc_fcn(struct irdma_sc_dev *dev); +int 
irdma_vchnl_req_get_caps(struct irdma_sc_dev *dev); +int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev, + struct irdma_vchnl_req *vc_req); +#endif /* IRDMA_VIRTCHNL_H */ From patchwork Wed Jul 24 23:39:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741450 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D253143888 for ; Wed, 24 Jul 2024 23:40:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864446; cv=none; b=R7XRLur/hBdJASgLgKciHIXeObPdEUaAMpiJWc5OHhe1OzKwZP2N09FJjPTsXhGRzTQUwGVBJsJkwL5fH9WAz2PQ00ErCLZlyTzc7X+9jrEs7+qHGtpjEYlAgSCOVQoU28iRxZaJeK6E3Ofrku7i37GrAFmgxAy5FZNhVkqOlw4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864446; c=relaxed/simple; bh=iyi2+P3FSSeAZmxpMJxHMQs4jCGujOfYqtpPDRhm+3k=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Hzzb1yIADnJnffL2TfyWPFKTwKECiNI/m+3tzLhmjCXRh9+GIhG42CeMxknXDeUd26mdSLqtPJLJ4vvl4oF8uRtSIlQajEmuag7dDM8hfYiTH1mYhzx4N8QJ67S8RTMBbyLtD+MZcmhDT2UH7SoMBC8EBgyxBGmqiiDv8Zywlqc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=eLG4Z6at; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com 
header.b="eLG4Z6at" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864444; x=1753400444; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iyi2+P3FSSeAZmxpMJxHMQs4jCGujOfYqtpPDRhm+3k=; b=eLG4Z6at6qUsUm2dqq2/DvpHNxTL+f59X+zKahBtxSFto+DQP+yHuzxV ttQ0JzkdwwKsCG8nX8xqEY0UnVQG0j+21jWdCm9+67XSo8uPOgWA9McoV ZzrqJIUK6D2/dZkJRu3kKzEESsCbKPZyJWCRctWKwpvREtRYRSKIi3hHJ jVEhm+ScKsSnFSFzGQGsPBD8whGXmsx3h33frQvX1s4p3nfgraz/Oxqng Ow+fHAqy6lfD3bTfJBTGMvuZ7No378dg9oPPjGMigyHhEFtBI1bf+uwtG RswWzGWtch7eCKlnUHzdWQBcodCAIsBEVwpG1+Zot3UAe2uJjhjvqPY20 A==; X-CSE-ConnectionGUID: 43f8LOcXSW+TXtUDurYUEA== X-CSE-MsgGUID: hWmUnCxCSSerFTyBPZel2Q== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999769" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999769" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:41 -0700 X-CSE-ConnectionGUID: Ssoxhwx+T/Ct4oOzfDAY4A== X-CSE-MsgGUID: +Y5x/iK5R8CkB43ZMORK3g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426054" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:40 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Christopher Bednarz , Tatyana Nikolova Subject: [RFC PATCH 12/25] RDMA/irdma: Discover and set up GEN3 hardware register layout Date: Wed, 24 Jul 2024 18:39:04 -0500 Message-Id: <20240724233917.704-13-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 
1.0 From: Christopher Bednarz Discover the hardware register layout for GEN3 devices through an RDMA virtual channel operation with the Control Plane (CP). Set up the corresponding hardware attributes specific to GEN3 devices. Signed-off-by: Christopher Bednarz Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/Makefile | 1 + drivers/infiniband/hw/irdma/ctrl.c | 31 ++++-- drivers/infiniband/hw/irdma/defs.h | 12 ++- drivers/infiniband/hw/irdma/i40iw_hw.c | 2 + drivers/infiniband/hw/irdma/i40iw_hw.h | 2 + drivers/infiniband/hw/irdma/icrdma_hw.c | 3 + drivers/infiniband/hw/irdma/icrdma_hw.h | 5 +- drivers/infiniband/hw/irdma/ig3rdma_hw.c | 65 +++++++++++ drivers/infiniband/hw/irdma/ig3rdma_hw.h | 18 ++++ drivers/infiniband/hw/irdma/irdma.h | 5 + drivers/infiniband/hw/irdma/virtchnl.c | 178 +++++++++++++++++++++++++++++++ drivers/infiniband/hw/irdma/virtchnl.h | 44 ++++++++ 12 files changed, 351 insertions(+), 15 deletions(-) create mode 100644 drivers/infiniband/hw/irdma/ig3rdma_hw.c diff --git a/drivers/infiniband/hw/irdma/Makefile b/drivers/infiniband/hw/irdma/Makefile index 3aa63b9..03ceb9e 100644 --- a/drivers/infiniband/hw/irdma/Makefile +++ b/drivers/infiniband/hw/irdma/Makefile @@ -16,6 +16,7 @@ irdma-objs := cm.o \ ig3rdma_if.o\ icrdma_if.o \ icrdma_hw.o \ + ig3rdma_hw.o\ main.o \ pble.o \ puda.o \ diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 9d7b151..34875cb 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -5677,6 +5677,9 @@ static inline void irdma_sc_init_hw(struct irdma_sc_dev *dev) case IRDMA_GEN_2: icrdma_init_hw(dev); break; + case IRDMA_GEN_3: + ig3rdma_init_hw(dev); + break; } } @@ -5744,18 +5747,26 @@ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev, irdma_sc_init_hw(dev); - if (irdma_wait_pe_ready(dev)) - return -ETIMEDOUT; + if (dev->privileged) { + if (irdma_wait_pe_ready(dev)) + return -ETIMEDOUT; - val = 
readl(dev->hw_regs[IRDMA_GLPCI_LBARCTRL]); - db_size = (u8)FIELD_GET(IRDMA_GLPCI_LBARCTRL_PE_DB_SIZE, val); - if (db_size != IRDMA_PE_DB_SIZE_4M && db_size != IRDMA_PE_DB_SIZE_8M) { - ibdev_dbg(to_ibdev(dev), - "DEV: RDMA PE doorbell is not enabled in CSR val 0x%x db_size=%d\n", - val, db_size); - return -ENODEV; + val = readl(dev->hw_regs[IRDMA_GLPCI_LBARCTRL]); + db_size = (u8)FIELD_GET(IRDMA_GLPCI_LBARCTRL_PE_DB_SIZE, val); + if (db_size != IRDMA_PE_DB_SIZE_4M && + db_size != IRDMA_PE_DB_SIZE_8M) { + ibdev_dbg(to_ibdev(dev), + "DEV: RDMA PE doorbell is not enabled in CSR val 0x%x db_size=%d\n", + val, db_size); + return -ENODEV; + } + } else { + ret_code = irdma_vchnl_req_get_reg_layout(dev); + if (ret_code) + ibdev_dbg(to_ibdev(dev), + "DEV: Get Register layout failed ret = %d\n", + ret_code); } - dev->db_addr = dev->hw->hw_addr + (uintptr_t)dev->hw_regs[IRDMA_DB_ADDR_OFFSET]; return ret_code; } diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 7825896..fe75737 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -115,6 +115,7 @@ enum irdma_protocol_used { #define IRDMA_FEATURE_BUF_SIZE (8 * IRDMA_MAX_FEATURES) #define ENABLE_LOC_MEM 63 +#define IRDMA_ATOMICS_ALLOWED_BIT 1 #define MAX_PBLE_PER_SD 0x40000 #define MAX_PBLE_SD_PER_FCN 0x400 #define MAX_MR_PER_SD 0x8000 @@ -127,7 +128,7 @@ enum irdma_protocol_used { #define IRDMA_QP_SW_MAX_RQ_QUANTA 32768 #define IRDMA_MAX_QP_WRS(max_quanta_per_wr) \ ((IRDMA_QP_SW_MAX_WQ_QUANTA - IRDMA_SQ_RSVD) / (max_quanta_per_wr)) - +#define IRDMA_SRQ_MAX_QUANTA 262144 #define IRDMAQP_TERM_SEND_TERM_AND_FIN 0 #define IRDMAQP_TERM_SEND_TERM_ONLY 1 #define IRDMAQP_TERM_SEND_FIN_ONLY 2 @@ -153,8 +154,13 @@ enum irdma_protocol_used { #define IRDMA_SQ_RSVD 258 #define IRDMA_RQ_RSVD 1 -#define IRDMA_FEATURE_RTS_AE 1ULL -#define IRDMA_FEATURE_CQ_RESIZE 2ULL +#define IRDMA_FEATURE_RTS_AE BIT_ULL(0) +#define IRDMA_FEATURE_CQ_RESIZE BIT_ULL(1) +#define 
IRDMA_FEATURE_64_BYTE_CQE BIT_ULL(5) +#define IRDMA_FEATURE_ATOMIC_OPS BIT_ULL(6) +#define IRDMA_FEATURE_SRQ BIT_ULL(7) +#define IRDMA_FEATURE_CQE_TIMESTAMPING BIT_ULL(8) + #define IRDMAQP_OP_RDMA_WRITE 0x00 #define IRDMAQP_OP_RDMA_READ 0x01 #define IRDMAQP_OP_RDMA_SEND 0x03 diff --git a/drivers/infiniband/hw/irdma/i40iw_hw.c b/drivers/infiniband/hw/irdma/i40iw_hw.c index ce61a27..60c1f2b 100644 --- a/drivers/infiniband/hw/irdma/i40iw_hw.c +++ b/drivers/infiniband/hw/irdma/i40iw_hw.c @@ -85,6 +85,7 @@ I40E_CQPSQ_CQ_CEQID, I40E_CQPSQ_CQ_CQID, I40E_COMMIT_FPM_CQCNT, + I40E_CQPSQ_UPESD_HMCFNID, }; static u64 i40iw_shifts[IRDMA_MAX_SHIFTS] = { @@ -94,6 +95,7 @@ I40E_CQPSQ_CQ_CEQID_S, I40E_CQPSQ_CQ_CQID_S, I40E_COMMIT_FPM_CQCNT_S, + I40E_CQPSQ_UPESD_HMCFNID_S, }; /** diff --git a/drivers/infiniband/hw/irdma/i40iw_hw.h b/drivers/infiniband/hw/irdma/i40iw_hw.h index e1db84d..0095b32 100644 --- a/drivers/infiniband/hw/irdma/i40iw_hw.h +++ b/drivers/infiniband/hw/irdma/i40iw_hw.h @@ -123,6 +123,8 @@ #define I40E_CQPSQ_CQ_CQID GENMASK_ULL(15, 0) #define I40E_COMMIT_FPM_CQCNT_S 0 #define I40E_COMMIT_FPM_CQCNT GENMASK_ULL(17, 0) +#define I40E_CQPSQ_UPESD_HMCFNID_S 0 +#define I40E_CQPSQ_UPESD_HMCFNID GENMASK_ULL(5, 0) #define I40E_VSIQF_CTL(_VSI) (0x0020D800 + ((_VSI) * 4)) diff --git a/drivers/infiniband/hw/irdma/icrdma_hw.c b/drivers/infiniband/hw/irdma/icrdma_hw.c index 941d3ed..32f2628 100644 --- a/drivers/infiniband/hw/irdma/icrdma_hw.c +++ b/drivers/infiniband/hw/irdma/icrdma_hw.c @@ -38,6 +38,7 @@ ICRDMA_CQPSQ_CQ_CEQID, ICRDMA_CQPSQ_CQ_CQID, ICRDMA_COMMIT_FPM_CQCNT, + ICRDMA_CQPSQ_UPESD_HMCFNID, }; static u64 icrdma_shifts[IRDMA_MAX_SHIFTS] = { @@ -47,6 +48,7 @@ ICRDMA_CQPSQ_CQ_CEQID_S, ICRDMA_CQPSQ_CQ_CQID_S, ICRDMA_COMMIT_FPM_CQCNT_S, + ICRDMA_CQPSQ_UPESD_HMCFNID_S, }; /** @@ -194,6 +196,7 @@ void icrdma_init_hw(struct irdma_sc_dev *dev) dev->hw_attrs.max_hw_ord = ICRDMA_MAX_ORD_SIZE; dev->hw_attrs.max_stat_inst = ICRDMA_MAX_STATS_COUNT; dev->hw_attrs.max_stat_idx = 
IRDMA_HW_STAT_INDEX_MAX_GEN_2; + dev->hw_attrs.max_hw_device_pages = ICRDMA_MAX_PUSH_PAGE_COUNT; dev->hw_attrs.uk_attrs.min_hw_wq_size = ICRDMA_MIN_WQ_SIZE; dev->hw_attrs.uk_attrs.max_hw_sq_chunk = IRDMA_MAX_QUANTA_PER_WR; diff --git a/drivers/infiniband/hw/irdma/icrdma_hw.h b/drivers/infiniband/hw/irdma/icrdma_hw.h index 697b957..d97944a 100644 --- a/drivers/infiniband/hw/irdma/icrdma_hw.h +++ b/drivers/infiniband/hw/irdma/icrdma_hw.h @@ -58,14 +58,15 @@ #define ICRDMA_CQPSQ_CQ_CQID GENMASK_ULL(18, 0) #define ICRDMA_COMMIT_FPM_CQCNT_S 0 #define ICRDMA_COMMIT_FPM_CQCNT GENMASK_ULL(19, 0) - +#define ICRDMA_CQPSQ_UPESD_HMCFNID_S 0 +#define ICRDMA_CQPSQ_UPESD_HMCFNID GENMASK_ULL(5, 0) enum icrdma_device_caps_const { ICRDMA_MAX_STATS_COUNT = 128, ICRDMA_MAX_IRD_SIZE = 127, ICRDMA_MAX_ORD_SIZE = 255, ICRDMA_MIN_WQ_SIZE = 8 /* WQEs */, - + ICRDMA_MAX_PUSH_PAGE_COUNT = 256, }; void icrdma_init_hw(struct irdma_sc_dev *dev); diff --git a/drivers/infiniband/hw/irdma/ig3rdma_hw.c b/drivers/infiniband/hw/irdma/ig3rdma_hw.c new file mode 100644 index 0000000..83ef6af --- /dev/null +++ b/drivers/infiniband/hw/irdma/ig3rdma_hw.c @@ -0,0 +1,65 @@ +// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB +/* Copyright (c) 2018 - 2024 Intel Corporation */ +#include "osdep.h" +#include "type.h" +#include "protos.h" +#include "ig3rdma_hw.h" + +void ig3rdma_init_hw(struct irdma_sc_dev *dev) +{ + dev->hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_3; + dev->hw_attrs.uk_attrs.max_hw_wq_frags = IG3RDMA_MAX_WQ_FRAGMENT_COUNT; + dev->hw_attrs.uk_attrs.max_hw_read_sges = IG3RDMA_MAX_SGE_RD; + dev->hw_attrs.uk_attrs.max_hw_sq_chunk = IRDMA_MAX_QUANTA_PER_WR; + dev->hw_attrs.first_hw_vf_fpm_id = 0; + dev->hw_attrs.max_hw_vf_fpm_id = IG3_MAX_APFS + IG3_MAX_AVFS; + dev->hw_attrs.uk_attrs.feature_flags |= IRDMA_FEATURE_64_BYTE_CQE; + if (dev->feature_info[IRDMA_FTN_FLAGS] & IRDMA_ATOMICS_ALLOWED_BIT) + dev->hw_attrs.uk_attrs.feature_flags |= + IRDMA_FEATURE_ATOMIC_OPS; + dev->hw_attrs.uk_attrs.feature_flags |= 
IRDMA_FEATURE_CQE_TIMESTAMPING; + + dev->hw_attrs.uk_attrs.feature_flags |= IRDMA_FEATURE_SRQ; + dev->hw_attrs.uk_attrs.feature_flags |= IRDMA_FEATURE_RTS_AE | + IRDMA_FEATURE_CQ_RESIZE; + dev->hw_attrs.page_size_cap = SZ_4K | SZ_2M | SZ_1G; + dev->hw_attrs.max_hw_ird = IG3RDMA_MAX_IRD_SIZE; + dev->hw_attrs.max_hw_ord = IG3RDMA_MAX_ORD_SIZE; + dev->hw_attrs.uk_attrs.min_hw_wq_size = IG3RDMA_MIN_WQ_SIZE; + dev->hw_attrs.uk_attrs.max_hw_srq_quanta = IRDMA_SRQ_MAX_QUANTA; + dev->hw_attrs.uk_attrs.max_hw_inline = IG3RDMA_MAX_INLINE_DATA_SIZE; + dev->hw_attrs.max_hw_device_pages = + dev->is_pf ? IG3RDMA_MAX_PF_PUSH_PAGE_COUNT : IG3RDMA_MAX_VF_PUSH_PAGE_COUNT; +} + +static void __iomem *__ig3rdma_get_reg_addr(struct irdma_mmio_region *region, u64 reg_offset) +{ + if (reg_offset >= region->offset && + reg_offset < (region->offset + region->len)) { + reg_offset -= region->offset; + + return region->addr + reg_offset; + } + + return NULL; +} + +void __iomem *ig3rdma_get_reg_addr(struct irdma_hw *hw, u64 reg_offset) +{ + u8 __iomem *reg_addr; + int i; + + reg_addr = __ig3rdma_get_reg_addr(&hw->rdma_reg, reg_offset); + if (reg_addr) + return reg_addr; + + for (i = 0; i < hw->num_io_regions; i++) { + reg_addr = __ig3rdma_get_reg_addr(&hw->io_regs[i], reg_offset); + if (reg_addr) + return reg_addr; + } + + WARN_ON_ONCE(1); + + return NULL; +} diff --git a/drivers/infiniband/hw/irdma/ig3rdma_hw.h b/drivers/infiniband/hw/irdma/ig3rdma_hw.h index 4c3d186..d0793308 100644 --- a/drivers/infiniband/hw/irdma/ig3rdma_hw.h +++ b/drivers/infiniband/hw/irdma/ig3rdma_hw.h @@ -3,9 +3,27 @@ #ifndef IG3RDMA_HW_H #define IG3RDMA_HW_H +#define IG3_MAX_APFS 1 +#define IG3_MAX_AVFS 0 + #define IG3_PF_RDMA_REGION_OFFSET 0xBC00000 #define IG3_PF_RDMA_REGION_LEN 0x401000 #define IG3_VF_RDMA_REGION_OFFSET 0x8C00 #define IG3_VF_RDMA_REGION_LEN 0x8400 +enum ig3rdma_device_caps_const { + IG3RDMA_MAX_WQ_FRAGMENT_COUNT = 14, + IG3RDMA_MAX_SGE_RD = 14, + + IG3RDMA_MAX_STATS_COUNT = 128, + + 
IG3RDMA_MAX_IRD_SIZE = 2048, + IG3RDMA_MAX_ORD_SIZE = 2048, + IG3RDMA_MIN_WQ_SIZE = 16 /* WQEs */, + IG3RDMA_MAX_INLINE_DATA_SIZE = 216, + IG3RDMA_MAX_PF_PUSH_PAGE_COUNT = 8192, + IG3RDMA_MAX_VF_PUSH_PAGE_COUNT = 16, +}; + +void __iomem *ig3rdma_get_reg_addr(struct irdma_hw *hw, u64 reg_offset); #endif /* IG3RDMA_HW_H*/ diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h index 7691704..4dc6bf5 100644 --- a/drivers/infiniband/hw/irdma/irdma.h +++ b/drivers/infiniband/hw/irdma/irdma.h @@ -67,6 +67,7 @@ enum irdma_shifts { IRDMA_CQPSQ_CQ_CEQID_S, IRDMA_CQPSQ_CQ_CQID_S, IRDMA_COMMIT_FPM_CQCNT_S, + IRDMA_CQPSQ_UPESD_HMCFNID_S, IRDMA_MAX_SHIFTS, }; @@ -77,6 +78,7 @@ enum irdma_masks { IRDMA_CQPSQ_CQ_CEQID_M, IRDMA_CQPSQ_CQ_CQID_M, IRDMA_COMMIT_FPM_CQCNT_M, + IRDMA_CQPSQ_UPESD_HMCFNID_M, IRDMA_MAX_MASKS, /* Must be last entry */ }; @@ -121,6 +123,7 @@ struct irdma_uk_attrs { u32 max_hw_wq_quanta; u32 min_hw_cq_size; u32 max_hw_cq_size; + u32 max_hw_srq_quanta; u16 max_hw_sq_chunk; u16 min_hw_wq_size; u8 hw_rev; @@ -156,4 +159,6 @@ struct irdma_hw_attrs { void i40iw_init_hw(struct irdma_sc_dev *dev); void icrdma_init_hw(struct irdma_sc_dev *dev); +void ig3rdma_init_hw(struct irdma_sc_dev *dev); +void __iomem *ig3rdma_get_reg_addr(struct irdma_hw *hw, u64 reg_offset); #endif /* IRDMA_H*/ diff --git a/drivers/infiniband/hw/irdma/virtchnl.c b/drivers/infiniband/hw/irdma/virtchnl.c index 2abfc39..fcb8ef2 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.c +++ b/drivers/infiniband/hw/irdma/virtchnl.c @@ -9,6 +9,51 @@ #include "ws.h" #include "i40iw_hw.h" +struct vchnl_reg_map_elem { + u16 reg_id; + u16 reg_idx; + bool pg_rel; +}; + +struct vchnl_regfld_map_elem { + u16 regfld_id; + u16 regfld_idx; +}; + +static struct vchnl_reg_map_elem vchnl_reg_map[] = { + {IRDMA_VCHNL_REG_ID_CQPTAIL, IRDMA_CQPTAIL, false}, + {IRDMA_VCHNL_REG_ID_CQPDB, IRDMA_CQPDB, false}, + {IRDMA_VCHNL_REG_ID_CCQPSTATUS, IRDMA_CCQPSTATUS, false}, + 
{IRDMA_VCHNL_REG_ID_CCQPHIGH, IRDMA_CCQPHIGH, false}, + {IRDMA_VCHNL_REG_ID_CCQPLOW, IRDMA_CCQPLOW, false}, + {IRDMA_VCHNL_REG_ID_CQARM, IRDMA_CQARM, false}, + {IRDMA_VCHNL_REG_ID_CQACK, IRDMA_CQACK, false}, + {IRDMA_VCHNL_REG_ID_AEQALLOC, IRDMA_AEQALLOC, false}, + {IRDMA_VCHNL_REG_ID_CQPERRCODES, IRDMA_CQPERRCODES, false}, + {IRDMA_VCHNL_REG_ID_WQEALLOC, IRDMA_WQEALLOC, false}, + {IRDMA_VCHNL_REG_ID_DB_ADDR_OFFSET, IRDMA_DB_ADDR_OFFSET, false }, + {IRDMA_VCHNL_REG_ID_DYN_CTL, IRDMA_GLINT_DYN_CTL, false }, + {IRDMA_VCHNL_REG_INV_ID, IRDMA_VCHNL_REG_INV_ID, false } +}; + +static struct vchnl_regfld_map_elem vchnl_regfld_map[] = { + {IRDMA_VCHNL_REGFLD_ID_CCQPSTATUS_CQP_OP_ERR, IRDMA_CCQPSTATUS_CCQP_ERR_M}, + {IRDMA_VCHNL_REGFLD_ID_CCQPSTATUS_CCQP_DONE, IRDMA_CCQPSTATUS_CCQP_DONE_M}, + {IRDMA_VCHNL_REGFLD_ID_CQPSQ_STAG_PDID, IRDMA_CQPSQ_STAG_PDID_M}, + {IRDMA_VCHNL_REGFLD_ID_CQPSQ_CQ_CEQID, IRDMA_CQPSQ_CQ_CEQID_M}, + {IRDMA_VCHNL_REGFLD_ID_CQPSQ_CQ_CQID, IRDMA_CQPSQ_CQ_CQID_M}, + {IRDMA_VCHNL_REGFLD_ID_COMMIT_FPM_CQCNT, IRDMA_COMMIT_FPM_CQCNT_M}, + {IRDMA_VCHNL_REGFLD_ID_UPESD_HMCN_ID, IRDMA_CQPSQ_UPESD_HMCFNID_M}, + {IRDMA_VCHNL_REGFLD_INV_ID, IRDMA_VCHNL_REGFLD_INV_ID} +}; + +#define IRDMA_VCHNL_REG_COUNT ARRAY_SIZE(vchnl_reg_map) +#define IRDMA_VCHNL_REGFLD_COUNT ARRAY_SIZE(vchnl_regfld_map) +#define IRDMA_VCHNL_REGFLD_BUF_SIZE \ + (IRDMA_VCHNL_REG_COUNT * sizeof(struct irdma_vchnl_reg_info) + \ + IRDMA_VCHNL_REGFLD_COUNT * sizeof(struct irdma_vchnl_reg_field_info)) +#define IRDMA_REGMAP_RESP_BUF_SIZE (IRDMA_VCHNL_RESP_MIN_SIZE + IRDMA_VCHNL_REGFLD_BUF_SIZE) + /** * irdma_sc_vchnl_init - Initialize dev virtchannel and get hw_rev * @dev: dev structure to update @@ -62,6 +107,8 @@ static int irdma_vchnl_req_verify_resp(struct irdma_vchnl_req *vchnl_req, if (resp_len < IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE) return -EBADMSG; break; + case IRDMA_VCHNL_OP_GET_REG_LAYOUT: + break; default: return -EOPNOTSUPP; } @@ -136,6 +183,137 @@ static int 
irdma_vchnl_req_send_sync(struct irdma_sc_dev *dev, } /** + * irdma_vchnl_req_get_reg_layout - Get Register Layout + * @dev: RDMA device pointer + */ +int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev) +{ + u16 reg_idx, reg_id, tmp_reg_id, regfld_idx, regfld_id, tmp_regfld_id; + struct irdma_vchnl_reg_field_info *regfld_array = NULL; + u8 resp_buffer[IRDMA_REGMAP_RESP_BUF_SIZE] = {}; + struct vchnl_regfld_map_elem *regfld_map_array; + struct irdma_vchnl_req_init_info info = {}; + struct vchnl_reg_map_elem *reg_map_array; + struct irdma_vchnl_reg_info *reg_array; + u8 num_bits, shift_cnt; + u16 buf_len = 0; + u64 bitmask; + u32 rindex; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_GET_REG_LAYOUT; + info.op_ver = IRDMA_VCHNL_OP_GET_REG_LAYOUT_V0; + info.resp_parm = resp_buffer; + info.resp_parm_len = sizeof(resp_buffer); + + ret = irdma_vchnl_req_send_sync(dev, &info); + + if (ret) + return ret; + + /* parse the response buffer and update reg info*/ + /* Parse registers till invalid */ + /* Parse register fields till invalid */ + reg_array = (struct irdma_vchnl_reg_info *)resp_buffer; + for (rindex = 0; rindex < IRDMA_VCHNL_REG_COUNT; rindex++) { + buf_len += sizeof(struct irdma_vchnl_reg_info); + if (buf_len >= sizeof(resp_buffer)) + return -ENOMEM; + + regfld_array = + (struct irdma_vchnl_reg_field_info *)&reg_array[rindex + 1]; + reg_id = reg_array[rindex].reg_id; + if (reg_id == IRDMA_VCHNL_REG_INV_ID) + break; + + reg_id &= ~IRDMA_VCHNL_REG_PAGE_REL; + if (reg_id >= IRDMA_VCHNL_REG_COUNT) + return -EINVAL; + + /* search regmap for register index in hw_regs.*/ + reg_map_array = vchnl_reg_map; + do { + tmp_reg_id = reg_map_array->reg_id; + if (tmp_reg_id == reg_id) + break; + + reg_map_array++; + } while (tmp_reg_id != IRDMA_VCHNL_REG_INV_ID); + if (tmp_reg_id != reg_id) + continue; + + reg_idx = reg_map_array->reg_idx; + + /* Page relative, DB Offset do not need bar offset */ + if (reg_idx == IRDMA_DB_ADDR_OFFSET ||
+ (reg_array[rindex].reg_id & IRDMA_VCHNL_REG_PAGE_REL)) { + dev->hw_regs[reg_idx] = + (u32 __iomem *)(uintptr_t)reg_array[rindex].reg_offset; + continue; + } + + /* Update the local HW struct */ + dev->hw_regs[reg_idx] = ig3rdma_get_reg_addr(dev->hw, + reg_array[rindex].reg_offset); + if (!dev->hw_regs[reg_idx]) + return -EINVAL; + } + + if (!regfld_array) + return -ENOMEM; + + /* set up doorbell variables using mapped DB page */ + dev->wqe_alloc_db = dev->hw_regs[IRDMA_WQEALLOC]; + dev->cq_arm_db = dev->hw_regs[IRDMA_CQARM]; + dev->aeq_alloc_db = dev->hw_regs[IRDMA_AEQALLOC]; + dev->cqp_db = dev->hw_regs[IRDMA_CQPDB]; + dev->cq_ack_db = dev->hw_regs[IRDMA_CQACK]; + + for (rindex = 0; rindex < IRDMA_VCHNL_REGFLD_COUNT; rindex++) { + buf_len += sizeof(struct irdma_vchnl_reg_field_info); + if ((buf_len - 1) > sizeof(resp_buffer)) + break; + + if (regfld_array[rindex].fld_id == IRDMA_VCHNL_REGFLD_INV_ID) + break; + + regfld_id = regfld_array[rindex].fld_id; + regfld_map_array = vchnl_regfld_map; + do { + tmp_regfld_id = regfld_map_array->regfld_id; + if (tmp_regfld_id == regfld_id) + break; + + regfld_map_array++; + } while (tmp_regfld_id != IRDMA_VCHNL_REGFLD_INV_ID); + + if (tmp_regfld_id != regfld_id) + continue; + + regfld_idx = regfld_map_array->regfld_idx; + + num_bits = regfld_array[rindex].fld_bits; + shift_cnt = regfld_array[rindex].fld_shift; + if ((num_bits + shift_cnt > 64) || !num_bits) { + ibdev_dbg(to_ibdev(dev), + "ERR: Invalid field mask id %d bits %d shift %d\n", + regfld_id, num_bits, shift_cnt); + + continue; + } + + bitmask = (1ULL << num_bits) - 1; + dev->hw_masks[regfld_idx] = bitmask << shift_cnt; + dev->hw_shifts[regfld_idx] = shift_cnt; + } + + return 0; +} + +/** * irdma_vchnl_req_get_ver - Request Channel version * @dev: RDMA device pointer * @ver_req: Virtual channel version requested diff --git a/drivers/infiniband/hw/irdma/virtchnl.h b/drivers/infiniband/hw/irdma/virtchnl.h index fb28fa0..20526c0 100644 ---
a/drivers/infiniband/hw/irdma/virtchnl.h +++ b/drivers/infiniband/hw/irdma/virtchnl.h @@ -14,13 +14,44 @@ #define IRDMA_VCHNL_OP_GET_HMC_FCN_V1 1 #define IRDMA_VCHNL_OP_GET_HMC_FCN_V2 2 #define IRDMA_VCHNL_OP_PUT_HMC_FCN_V0 0 +#define IRDMA_VCHNL_OP_GET_REG_LAYOUT_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE 1 +#define IRDMA_VCHNL_REG_ID_CQPTAIL 0 +#define IRDMA_VCHNL_REG_ID_CQPDB 1 +#define IRDMA_VCHNL_REG_ID_CCQPSTATUS 2 +#define IRDMA_VCHNL_REG_ID_CCQPHIGH 3 +#define IRDMA_VCHNL_REG_ID_CCQPLOW 4 +#define IRDMA_VCHNL_REG_ID_CQARM 5 +#define IRDMA_VCHNL_REG_ID_CQACK 6 +#define IRDMA_VCHNL_REG_ID_AEQALLOC 7 +#define IRDMA_VCHNL_REG_ID_CQPERRCODES 8 +#define IRDMA_VCHNL_REG_ID_WQEALLOC 9 +#define IRDMA_VCHNL_REG_ID_IPCONFIG0 10 +#define IRDMA_VCHNL_REG_ID_DB_ADDR_OFFSET 11 +#define IRDMA_VCHNL_REG_ID_DYN_CTL 12 +#define IRDMA_VCHNL_REG_ID_AEQITRMASK 13 +#define IRDMA_VCHNL_REG_ID_CEQITRMASK 14 +#define IRDMA_VCHNL_REG_INV_ID 0xFFFF +#define IRDMA_VCHNL_REG_PAGE_REL 0x8000 + +#define IRDMA_VCHNL_REGFLD_ID_CCQPSTATUS_CQP_OP_ERR 2 +#define IRDMA_VCHNL_REGFLD_ID_CCQPSTATUS_CCQP_DONE 5 +#define IRDMA_VCHNL_REGFLD_ID_CQPSQ_STAG_PDID 6 +#define IRDMA_VCHNL_REGFLD_ID_CQPSQ_CQ_CEQID 7 +#define IRDMA_VCHNL_REGFLD_ID_CQPSQ_CQ_CQID 8 +#define IRDMA_VCHNL_REGFLD_ID_COMMIT_FPM_CQCNT 9 +#define IRDMA_VCHNL_REGFLD_ID_UPESD_HMCN_ID 10 +#define IRDMA_VCHNL_REGFLD_INV_ID 0xFFFF + +#define IRDMA_VCHNL_RESP_MIN_SIZE (sizeof(struct irdma_vchnl_resp_buf)) + enum irdma_vchnl_ops { IRDMA_VCHNL_OP_GET_VER = 0, IRDMA_VCHNL_OP_GET_HMC_FCN = 1, IRDMA_VCHNL_OP_PUT_HMC_FCN = 2, + IRDMA_VCHNL_OP_GET_REG_LAYOUT = 11, IRDMA_VCHNL_OP_GET_RDMA_CAPS = 13, }; @@ -65,6 +96,18 @@ struct irdma_vchnl_init_info { bool is_pf; }; +struct irdma_vchnl_reg_info { + u32 reg_offset; + u16 field_cnt; + u16 reg_id; /* High bit of reg_id: bar or page relative */ +}; + +struct irdma_vchnl_reg_field_info { + u8 fld_shift; + u8 fld_bits; + u16 fld_id; +}; + struct 
irdma_vchnl_req { struct irdma_vchnl_op_buf *vchnl_msg; void *parm; @@ -93,4 +136,5 @@ int irdma_vchnl_req_get_ver(struct irdma_sc_dev *dev, u16 ver_req, int irdma_vchnl_req_get_caps(struct irdma_sc_dev *dev); int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev, struct irdma_vchnl_req *vc_req); +int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev); #endif /* IRDMA_VIRTCHNL_H */ From patchwork Wed Jul 24 23:39:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741452 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2081C146D79 for ; Wed, 24 Jul 2024 23:40:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864447; cv=none; b=RQIEELuzLnrq4dkAPjrYDhVN3BC0Vk3mXeYh5/qKG0SegANOVQAh5IfIALUeBa8ZiShCfjjeQEcm6xhw226zqb03SpU4sq2BgWA6YJLbpRjOp7wxl53JM35+WBO9D58f2mIvbbtYOtH2qZjY3ep/uE5CFgkReGfn02s/iGdudLA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864447; c=relaxed/simple; bh=yPKVftTccuHWkZERgxn1EcpymuXyBQrkFOr5sv8cJRA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=L5WgEtKZjIN9JSGLKRcPAUnsAukQ1eCQx9BsxCC1JyaWzj48FWFGjFV0P8MDqP8jNP9XbQ4l7pd8bTT89SPNWClAbK+9+sLDsaqcrs6hTRaefaE14GZh/4GB8hq4qoKSDjEkg7n3C12UKdms2I2bm64lW3Jf7mrJLxnbYd5Wbrk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=C6vXgZhS; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="C6vXgZhS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864445; x=1753400445; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yPKVftTccuHWkZERgxn1EcpymuXyBQrkFOr5sv8cJRA=; b=C6vXgZhS7n54FjVfoRZFwH+V7ZytCjtH51oEQlEBJAUN9z4/Kli5Bs+X yigXubG/xGETIQlGO0TbkRn2F8l4L7/nUBXNo47O+fRk+AiZc0tANWE84 RwLMWUPHbB0jcZXoequAn9p2KFqV4fp8AwMEmKu9Bm64TUKcmxNmQ7m9y H21SRNbpYat+s04qhO+5+vBkTDVQmnCAV79PUtijk51USrEHG+IncjGIy olW/dI4hZYmrhO/JLPyA1hlbHv0yWHsUQCCnBQ51wXd+5hdJY/8ifg//l EjmNEFrGEiQav5Zm4qUVpeJ67Y5feLKgGHM0n3vw/mc5Q0qyDz0GPdjaj Q==; X-CSE-ConnectionGUID: 9igdHK1YRAGsohk35MHIHg== X-CSE-MsgGUID: zkL5751oR2OVQn5uRJXKGA== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999773" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999773" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:42 -0700 X-CSE-ConnectionGUID: FkkpH0HlTF+t0Thx/Lxf7Q== X-CSE-MsgGUID: yTTIUJVgSiOpy2UNjjQjbA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426057" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:41 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Krzysztof Czurylo , Tatyana Nikolova Subject: [RFC PATCH 13/25] RDMA/irdma: Add GEN3 CQP support with deferred completions Date: Wed, 24 Jul 2024 18:39:05 -0500 Message-Id: <20240724233917.704-14-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: 
<20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Krzysztof Czurylo GEN3 introduces asynchronous handling of Control QP (CQP) operations to minimize head-of-line blocking. Create the CQP using the updated GEN3-specific descriptor fields and implement the necessary support for this deferred completion mechanism. Signed-off-by: Krzysztof Czurylo Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 254 ++++++++++++++++++++++++++++++++++- drivers/infiniband/hw/irdma/defs.h | 15 +++ drivers/infiniband/hw/irdma/hw.c | 89 ++++++++++-- drivers/infiniband/hw/irdma/main.h | 2 + drivers/infiniband/hw/irdma/protos.h | 1 + drivers/infiniband/hw/irdma/type.h | 43 +++++- drivers/infiniband/hw/irdma/utils.c | 50 ++++++- 7 files changed, 439 insertions(+), 15 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 34875cb..e524b61 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2738,6 +2738,90 @@ static inline void irdma_get_cqp_reg_info(struct irdma_sc_cqp *cqp, u32 *val, } /** + * irdma_sc_cqp_def_cmpl_ae_handler - remove completed requests from pending list + * @dev: sc device struct + * @info: AE entry info + * @first: true if this is the first call to this handler for given AEQE + * @scratch: (out) scratch entry pointer + * @sw_def_info: (in/out) SW ticket value for this AE + * + * In case of AE_DEF_CMPL event, this function should be called in a loop + * until it returns NULL-ptr via scratch. + * For each call, it looks for a matching CQP request on pending list, + * removes it from the list and returns the pointer to the associated scratch + * entry. + * If this is the first call to this function for given AEQE, sw_def_info + * value is not used to find matching requests.
Instead, it is populated + * with the value from the first matching cqp_request on the list. + * For subsequent calls, ooo_op->sw_def_info needs to match the value passed + * by the caller. + * + * Return: scratch entry pointer for cqp_request to be released or NULL + * if no matching request is found. + */ +void irdma_sc_cqp_def_cmpl_ae_handler(struct irdma_sc_dev *dev, + struct irdma_aeqe_info *info, + bool first, u64 *scratch, + u32 *sw_def_info) +{ + struct irdma_ooo_cqp_op *ooo_op; + unsigned long flags; + + *scratch = 0; + + spin_lock_irqsave(&dev->cqp->ooo_list_lock, flags); + list_for_each_entry(ooo_op, &dev->cqp->ooo_pnd, list_entry) { + if (ooo_op->deferred && + ((first && ooo_op->def_info == info->def_info) || + (!first && ooo_op->sw_def_info == *sw_def_info))) { + *sw_def_info = ooo_op->sw_def_info; + *scratch = ooo_op->scratch; + + list_del(&ooo_op->list_entry); + list_add(&ooo_op->list_entry, &dev->cqp->ooo_avail); + atomic64_inc(&dev->cqp->completed_ops); + + break; + } + } + spin_unlock_irqrestore(&dev->cqp->ooo_list_lock, flags); + + if (first && !*scratch) + ibdev_dbg(to_ibdev(dev), + "AEQ: deferred completion with unknown ticket: def_info 0x%x\n", + info->def_info); +} + +/** + * irdma_sc_cqp_cleanup_handler - remove requests from pending list + * @dev: sc device struct + * + * This function should be called in a loop from irdma_cleanup_pending_cqp_op. + * For each call, it returns first CQP request on pending list, removes it + * from the list and returns the pointer to the associated scratch entry. + * + * Return: scratch entry pointer for cqp_request to be released or NULL + * if pending list is empty.
+ */ +u64 irdma_sc_cqp_cleanup_handler(struct irdma_sc_dev *dev) +{ + struct irdma_ooo_cqp_op *ooo_op; + u64 scratch = 0; + + list_for_each_entry(ooo_op, &dev->cqp->ooo_pnd, list_entry) { + scratch = ooo_op->scratch; + + list_del(&ooo_op->list_entry); + list_add(&ooo_op->list_entry, &dev->cqp->ooo_avail); + atomic64_inc(&dev->cqp->completed_ops); + + break; + } + + return scratch; +} + +/** * irdma_cqp_poll_registers - poll cqp registers * @cqp: struct for cqp hw * @tail: wqtail register value @@ -3121,6 +3205,8 @@ void irdma_sc_remove_cq_ctx(struct irdma_sc_ceq *ceq, struct irdma_sc_cq *cq) int irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, struct irdma_cqp_init_info *info) { + struct irdma_ooo_cqp_op *ooo_op; + u32 num_ooo_ops; u8 hw_sq_size; if (info->sq_size > IRDMA_CQP_SW_SQSIZE_2048 || @@ -3151,17 +3237,43 @@ int irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, cqp->rocev2_rto_policy = info->rocev2_rto_policy; cqp->protocol_used = info->protocol_used; memcpy(&cqp->dcqcn_params, &info->dcqcn_params, sizeof(cqp->dcqcn_params)); + if (cqp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + cqp->ooisc_blksize = info->ooisc_blksize; + cqp->rrsp_blksize = info->rrsp_blksize; + cqp->q1_blksize = info->q1_blksize; + cqp->xmit_blksize = info->xmit_blksize; + cqp->blksizes_valid = info->blksizes_valid; + cqp->ts_shift = info->ts_shift; + cqp->ts_override = info->ts_override; + cqp->en_fine_grained_timers = info->en_fine_grained_timers; + cqp->pe_en_vf_cnt = info->pe_en_vf_cnt; + cqp->ooo_op_array = info->ooo_op_array; + /* initialize the OOO lists */ + INIT_LIST_HEAD(&cqp->ooo_avail); + INIT_LIST_HEAD(&cqp->ooo_pnd); + if (cqp->ooo_op_array) { + /* Populate avail list entries */ + for (num_ooo_ops = 0, ooo_op = info->ooo_op_array; + num_ooo_ops < cqp->sq_size; + num_ooo_ops++, ooo_op++) + list_add(&ooo_op->list_entry, &cqp->ooo_avail); + } + } info->dev->cqp = cqp; IRDMA_RING_INIT(cqp->sq_ring, cqp->sq_size); + cqp->last_def_cmpl_ticket = 0; + cqp->sw_def_cmpl_ticket = 0; 
cqp->requested_ops = 0; atomic64_set(&cqp->completed_ops, 0); /* for the cqp commands backlog. */ INIT_LIST_HEAD(&cqp->dev->cqp_cmd_head); writel(0, cqp->dev->hw_regs[IRDMA_CQPTAIL]); - writel(0, cqp->dev->hw_regs[IRDMA_CQPDB]); - writel(0, cqp->dev->hw_regs[IRDMA_CCQPSTATUS]); + if (cqp->dev->hw_attrs.uk_attrs.hw_rev <= IRDMA_GEN_2) { + writel(0, cqp->dev->hw_regs[IRDMA_CQPDB]); + writel(0, cqp->dev->hw_regs[IRDMA_CCQPSTATUS]); + } ibdev_dbg(to_ibdev(cqp->dev), "WQE: sq_size[%04d] hw_sq_size[%04d] sq_base[%p] sq_pa[%pK] cqp[%p] polarity[x%04x]\n", @@ -3193,6 +3305,7 @@ int irdma_sc_cqp_create(struct irdma_sc_cqp *cqp, u16 *maj_err, u16 *min_err) return -ENOMEM; spin_lock_init(&cqp->dev->cqp_lock); + spin_lock_init(&cqp->ooo_list_lock); temp = FIELD_PREP(IRDMA_CQPHC_SQSIZE, cqp->hw_sq_size) | FIELD_PREP(IRDMA_CQPHC_SVER, cqp->struct_ver) | @@ -3204,12 +3317,29 @@ int irdma_sc_cqp_create(struct irdma_sc_cqp *cqp, u16 *maj_err, u16 *min_err) FIELD_PREP(IRDMA_CQPHC_PROTOCOL_USED, cqp->protocol_used); } + if (hw_rev >= IRDMA_GEN_3) + temp |= FIELD_PREP(IRDMA_CQPHC_EN_FINE_GRAINED_TIMERS, + cqp->en_fine_grained_timers); set_64bit_val(cqp->host_ctx, 0, temp); set_64bit_val(cqp->host_ctx, 8, cqp->sq_pa); temp = FIELD_PREP(IRDMA_CQPHC_ENABLED_VFS, cqp->ena_vf_count) | FIELD_PREP(IRDMA_CQPHC_HMC_PROFILE, cqp->hmc_profile); + + if (hw_rev >= IRDMA_GEN_3) + temp |= FIELD_PREP(IRDMA_CQPHC_OOISC_BLKSIZE, + cqp->ooisc_blksize) | + FIELD_PREP(IRDMA_CQPHC_RRSP_BLKSIZE, + cqp->rrsp_blksize) | + FIELD_PREP(IRDMA_CQPHC_Q1_BLKSIZE, cqp->q1_blksize) | + FIELD_PREP(IRDMA_CQPHC_XMIT_BLKSIZE, + cqp->xmit_blksize) | + FIELD_PREP(IRDMA_CQPHC_BLKSIZES_VALID, + cqp->blksizes_valid) | + FIELD_PREP(IRDMA_CQPHC_TIMESTAMP_OVERRIDE, + cqp->ts_override) | + FIELD_PREP(IRDMA_CQPHC_TS_SHIFT, cqp->ts_shift); set_64bit_val(cqp->host_ctx, 16, temp); set_64bit_val(cqp->host_ctx, 24, (uintptr_t)cqp); temp = FIELD_PREP(IRDMA_CQPHC_HW_MAJVER, cqp->hw_maj_ver) | @@ -3371,6 +3501,87 @@ void 
irdma_sc_ccq_arm(struct irdma_sc_cq *ccq) } /** + * irdma_sc_process_def_cmpl - process deferred or pending completion + * @cqp: CQP sc struct + * @info: CQP CQE info + * @wqe_idx: CQP WQE descriptor index + * @def_info: deferred op ticket value or out-of-order completion id + * @def_cmpl: true for deferred completion, false for pending (RCA) + */ +static void irdma_sc_process_def_cmpl(struct irdma_sc_cqp *cqp, + struct irdma_ccq_cqe_info *info, + u32 wqe_idx, u32 def_info, bool def_cmpl) +{ + struct irdma_ooo_cqp_op *ooo_op; + unsigned long flags; + + /* Deferred and out-of-order completions share the same list of pending + * completions. Since the list can be also accessed from AE handler, + * it must be protected by a lock. + */ + spin_lock_irqsave(&cqp->ooo_list_lock, flags); + + /* For deferred completions bump up SW completion ticket value. */ + if (def_cmpl) { + cqp->last_def_cmpl_ticket = def_info; + cqp->sw_def_cmpl_ticket++; + } + if (!list_empty(&cqp->ooo_avail)) { + ooo_op = (struct irdma_ooo_cqp_op *) + list_entry(cqp->ooo_avail.next, + struct irdma_ooo_cqp_op, list_entry); + + list_del(&ooo_op->list_entry); + ooo_op->scratch = info->scratch; + ooo_op->def_info = def_info; + ooo_op->sw_def_info = cqp->sw_def_cmpl_ticket; + ooo_op->deferred = def_cmpl; + ooo_op->wqe_idx = wqe_idx; + /* Pending completions must be chronologically ordered, + * so adding at the end of list. 
+ */ + list_add_tail(&ooo_op->list_entry, &cqp->ooo_pnd); + } + spin_unlock_irqrestore(&cqp->ooo_list_lock, flags); + + info->pending = true; +} + +/** + * irdma_sc_process_ooo_cmpl - process out-of-order (final) completion + * @cqp: CQP sc struct + * @info: CQP CQE info + * @def_info: out-of-order completion id + */ +static void irdma_sc_process_ooo_cmpl(struct irdma_sc_cqp *cqp, + struct irdma_ccq_cqe_info *info, + u32 def_info) +{ + struct irdma_ooo_cqp_op *ooo_op_tmp; + struct irdma_ooo_cqp_op *ooo_op; + unsigned long flags; + + info->scratch = 0; + + spin_lock_irqsave(&cqp->ooo_list_lock, flags); + list_for_each_entry_safe(ooo_op, ooo_op_tmp, &cqp->ooo_pnd, + list_entry) { + if (!ooo_op->deferred && ooo_op->def_info == def_info) { + list_del(&ooo_op->list_entry); + info->scratch = ooo_op->scratch; + list_add(&ooo_op->list_entry, &cqp->ooo_avail); + break; + } + } + spin_unlock_irqrestore(&cqp->ooo_list_lock, flags); + + if (!info->scratch) + ibdev_dbg(to_ibdev(cqp->dev), + "CQP: DEBUG_FW_OOO out-of-order completion with unknown def_info = 0x%x\n", + def_info); +} + +/** * irdma_sc_ccq_get_cqe_info - get ccq's cq entry * @ccq: ccq sc struct * @info: completion q entry to return @@ -3378,6 +3589,10 @@ void irdma_sc_ccq_arm(struct irdma_sc_cq *ccq) int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, struct irdma_ccq_cqe_info *info) { + u32 def_info; + bool def_cmpl = false; + bool pend_cmpl = false; + bool ooo_final_cmpl = false; u64 qp_ctx, temp, temp1; __le64 *cqe; struct irdma_sc_cqp *cqp; @@ -3385,6 +3600,7 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, u32 error; u8 polarity; int ret_code = 0; + unsigned long flags; if (ccq->cq_uk.avoid_mem_cflct) cqe = IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(&ccq->cq_uk); @@ -3416,6 +3632,25 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, get_64bit_val(cqe, 16, &temp1); info->op_ret_val = (u32)FIELD_GET(IRDMA_CCQ_OPRETVAL, temp1); + if (cqp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + def_cmpl = 
info->maj_err_code == IRDMA_CQPSQ_MAJ_NO_ERROR && + info->min_err_code == IRDMA_CQPSQ_MIN_DEF_CMPL; + def_info = (u32)FIELD_GET(IRDMA_CCQ_DEFINFO, temp1); + + pend_cmpl = info->maj_err_code == IRDMA_CQPSQ_MAJ_NO_ERROR && + info->min_err_code == IRDMA_CQPSQ_MIN_OOO_CMPL; + + ooo_final_cmpl = (bool)FIELD_GET(IRDMA_OOO_CMPL, temp); + + if (def_cmpl || pend_cmpl || ooo_final_cmpl) { + if (ooo_final_cmpl) + irdma_sc_process_ooo_cmpl(cqp, info, def_info); + else + irdma_sc_process_def_cmpl(cqp, info, wqe_idx, + def_info, def_cmpl); + } + } + get_64bit_val(cqp->sq_base[wqe_idx].elem, 24, &temp1); info->op_code = (u8)FIELD_GET(IRDMA_CQPSQ_OPCODE, temp1); info->cqp = cqp; @@ -3432,7 +3667,16 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, dma_wmb(); /* make sure shadow area is updated before moving tail */ - IRDMA_RING_MOVE_TAIL(cqp->sq_ring); + spin_lock_irqsave(&cqp->dev->cqp_lock, flags); + if (!ooo_final_cmpl) + IRDMA_RING_MOVE_TAIL(cqp->sq_ring); + spin_unlock_irqrestore(&cqp->dev->cqp_lock, flags); + + /* Do not increment completed_ops counter on pending or deferred + * completions. 
+ */ + if (pend_cmpl || def_cmpl) + return ret_code; atomic64_inc(&cqp->completed_ops); return ret_code; @@ -4118,6 +4362,10 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, info->compl_ctx = compl_ctx << 1; ae_src = IRDMA_AE_SOURCE_RSVD; break; + case IRDMA_AE_CQP_DEFERRED_COMPLETE: + info->def_info = info->wqe_idx; + ae_src = IRDMA_AE_SOURCE_RSVD; + break; case IRDMA_AE_ROCE_EMPTY_MCG: case IRDMA_AE_ROCE_BAD_MC_IP_ADDR: case IRDMA_AE_ROCE_BAD_MC_QPID: diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index fe75737..5e4d62c 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -367,6 +367,7 @@ enum irdma_cqp_op_type { #define IRDMA_AE_LCE_FUNCTION_CATASTROPHIC 0x0701 #define IRDMA_AE_LCE_CQ_CATASTROPHIC 0x0702 #define IRDMA_AE_QP_SUSPEND_COMPLETE 0x0900 +#define IRDMA_AE_CQP_DEFERRED_COMPLETE 0x0901 #define FLD_LS_64(dev, val, field) \ (((u64)(val) << (dev)->hw_shifts[field ## _S]) & (dev)->hw_masks[field ## _M]) @@ -465,6 +466,16 @@ enum irdma_cqp_op_type { #define IRDMA_CQPHC_SVER GENMASK_ULL(31, 24) #define IRDMA_CQPHC_SQBASE GENMASK_ULL(63, 9) +#define IRDMA_CQPHC_TIMESTAMP_OVERRIDE BIT_ULL(5) +#define IRDMA_CQPHC_TS_SHIFT GENMASK_ULL(12, 8) +#define IRDMA_CQPHC_EN_FINE_GRAINED_TIMERS BIT_ULL(0) + +#define IRDMA_CQPHC_OOISC_BLKSIZE GENMASK_ULL(63, 60) +#define IRDMA_CQPHC_RRSP_BLKSIZE GENMASK_ULL(59, 56) +#define IRDMA_CQPHC_Q1_BLKSIZE GENMASK_ULL(55, 52) +#define IRDMA_CQPHC_XMIT_BLKSIZE GENMASK_ULL(51, 48) +#define IRDMA_CQPHC_BLKSIZES_VALID BIT_ULL(4) + #define IRDMA_CQPHC_QPCTX GENMASK_ULL(63, 0) #define IRDMA_QP_DBSA_HW_SQ_TAIL GENMASK_ULL(14, 0) #define IRDMA_CQ_DBSA_CQEIDX GENMASK_ULL(19, 0) @@ -478,6 +489,8 @@ enum irdma_cqp_op_type { #define IRDMA_CCQ_OPRETVAL GENMASK_ULL(31, 0) +#define IRDMA_CCQ_DEFINFO GENMASK_ULL(63, 32) + #define IRDMA_CQ_MINERR GENMASK_ULL(15, 0) #define IRDMA_CQ_MAJERR GENMASK_ULL(31, 16) #define IRDMA_CQ_WQEIDX GENMASK_ULL(46, 32) @@ -715,6 +728,8 
@@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_MIN_STAG_INVALID 0x0001 #define IRDMA_CQPSQ_MIN_SUSPEND_PND 0x0005 +#define IRDMA_CQPSQ_MIN_DEF_CMPL 0x0006 +#define IRDMA_CQPSQ_MIN_OOO_CMPL 0x0007 #define IRDMA_CQPSQ_MAJ_NO_ERROR 0x0000 #define IRDMA_CQPSQ_MAJ_OBJCACHE_ERROR 0xF000 diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 2881314..55b10a8 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -208,6 +208,51 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp, } /** + * irdma_complete_cqp_request - perform post-completion cleanup + * @cqp: device CQP + * @cqp_request: CQP request + * + * Mark CQP request as done, wake up waiting thread or invoke + * callback function and release/free CQP request. + */ +static void irdma_complete_cqp_request(struct irdma_cqp *cqp, + struct irdma_cqp_request *cqp_request) +{ + if (cqp_request->waiting) { + WRITE_ONCE(cqp_request->request_done, true); + wake_up(&cqp_request->waitq); + } else if (cqp_request->callback_fcn) { + cqp_request->callback_fcn(cqp_request); + } + irdma_put_cqp_request(cqp, cqp_request); +} + +/** + * irdma_process_ae_def_cmpl - handle IRDMA_AE_CQP_DEFERRED_COMPLETE event + * @rf: RDMA PCI function + * @info: AEQ entry info + */ +static void irdma_process_ae_def_cmpl(struct irdma_pci_f *rf, + struct irdma_aeqe_info *info) +{ + u32 sw_def_info; + u64 scratch; + + irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq); + + irdma_sc_cqp_def_cmpl_ae_handler(&rf->sc_dev, info, true, + &scratch, &sw_def_info); + while (scratch) { + struct irdma_cqp_request *cqp_request = + (struct irdma_cqp_request *)(uintptr_t)scratch; + + irdma_complete_cqp_request(&rf->cqp, cqp_request); + irdma_sc_cqp_def_cmpl_ae_handler(&rf->sc_dev, info, false, + &scratch, &sw_def_info); + } +} + +/** * irdma_process_aeq - handle aeq events * @rf: RDMA PCI function */ @@ -269,7 +314,8 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) 
spin_unlock_irqrestore(&iwqp->lock, flags); ctx_info = &iwqp->ctx_info; } else { - if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR) + if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR && + info->ae_id != IRDMA_AE_CQP_DEFERRED_COMPLETE) continue; } @@ -364,6 +410,12 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) } irdma_cq_rem_ref(&iwcq->ibcq); break; + case IRDMA_AE_CQP_DEFERRED_COMPLETE: + /* Remove completed CQP requests from pending list + * and notify about those CQP ops completion. + */ + irdma_process_ae_def_cmpl(rf, info); + break; case IRDMA_AE_RESET_NOT_SENT: case IRDMA_AE_LLP_DOUBT_REACHABILITY: case IRDMA_AE_RESOURCE_EXHAUSTION: @@ -602,6 +654,8 @@ static void irdma_destroy_cqp(struct irdma_pci_f *rf) dma_free_coherent(dev->hw->device, cqp->sq.size, cqp->sq.va, cqp->sq.pa); cqp->sq.va = NULL; + kfree(cqp->oop_op_array); + cqp->oop_op_array = NULL; kfree(cqp->scratch_array); cqp->scratch_array = NULL; kfree(cqp->cqp_requests); @@ -945,6 +999,13 @@ static int irdma_create_cqp(struct irdma_pci_f *rf) goto err_scratch; } + cqp->oop_op_array = kcalloc(sqsize, sizeof(*cqp->oop_op_array), + GFP_KERNEL); + if (!cqp->oop_op_array) { + status = -ENOMEM; + goto err_oop; + } + cqp_init_info.ooo_op_array = cqp->oop_op_array; dev->cqp = &cqp->sc_cqp; dev->cqp->dev = dev; cqp->sq.size = ALIGN(sizeof(struct irdma_cqp_sq_wqe) * sqsize, @@ -981,6 +1042,10 @@ static int irdma_create_cqp(struct irdma_pci_f *rf) case IRDMA_GEN_2: cqp_init_info.hw_maj_ver = IRDMA_CQPHC_HW_MAJVER_GEN_2; break; + case IRDMA_GEN_3: + cqp_init_info.hw_maj_ver = IRDMA_CQPHC_HW_MAJVER_GEN_3; + cqp_init_info.ts_override = 1; + break; } status = irdma_sc_cqp_init(dev->cqp, &cqp_init_info); if (status) { @@ -1015,6 +1080,9 @@ static int irdma_create_cqp(struct irdma_pci_f *rf) cqp->sq.va, cqp->sq.pa); cqp->sq.va = NULL; err_sq: + kfree(cqp->oop_op_array); + cqp->oop_op_array = NULL; +err_oop: kfree(cqp->scratch_array); cqp->scratch_array = NULL; err_scratch: @@ -2106,15 +2174,16 @@ void 
irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq) cqp_request->compl_info.op_ret_val = info.op_ret_val; cqp_request->compl_info.error = info.error; - if (cqp_request->waiting) { - WRITE_ONCE(cqp_request->request_done, true); - wake_up(&cqp_request->waitq); - irdma_put_cqp_request(&rf->cqp, cqp_request); - } else { - if (cqp_request->callback_fcn) - cqp_request->callback_fcn(cqp_request); - irdma_put_cqp_request(&rf->cqp, cqp_request); - } + /* + * If this is deferred or pending completion, then mark + * CQP request as pending to not block the CQ, but don't + * release CQP request, as it is still on the OOO list. + */ + if (info.pending) + cqp_request->pending = true; + else + irdma_complete_cqp_request(&rf->cqp, + cqp_request); } cqe_count++; diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index a7f3d19..5d13718 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -167,6 +167,7 @@ struct irdma_cqp_request { bool request_done; /* READ/WRITE_ONCE macros operate on it */ bool waiting:1; bool dynamic:1; + bool pending:1; }; struct irdma_cqp { @@ -179,6 +180,7 @@ struct irdma_cqp { struct irdma_dma_mem host_ctx; u64 *scratch_array; struct irdma_cqp_request *cqp_requests; + struct irdma_ooo_cqp_op *oop_op_array; struct list_head cqp_avail_reqs; struct list_head cqp_pending_reqs; }; diff --git a/drivers/infiniband/hw/irdma/protos.h b/drivers/infiniband/hw/irdma/protos.h index d7c8ea9..fac823a 100644 --- a/drivers/infiniband/hw/irdma/protos.h +++ b/drivers/infiniband/hw/irdma/protos.h @@ -10,6 +10,7 @@ #define ALL_TC2PFC 0xff #define CQP_COMPL_WAIT_TIME_MS 10 #define CQP_TIMEOUT_THRESHOLD 500 +#define CQP_DEF_CMPL_TIMEOUT_THRESHOLD 2500 /* init operations */ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev, diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index cfcb5d9..2b93a70 100644 --- a/drivers/infiniband/hw/irdma/type.h 
+++ b/drivers/infiniband/hw/irdma/type.h @@ -262,12 +262,22 @@ struct irdma_cqp_init_info { __le64 *host_ctx; u64 *scratch_array; u32 sq_size; + struct irdma_ooo_cqp_op *ooo_op_array; + u32 pe_en_vf_cnt; u16 hw_maj_ver; u16 hw_min_ver; u8 struct_ver; u8 hmc_profile; u8 ena_vf_count; u8 ceqs_per_vf; + u8 ooisc_blksize; + u8 rrsp_blksize; + u8 q1_blksize; + u8 xmit_blksize; + u8 ts_override; + u8 ts_shift; + u8 en_fine_grained_timers; + u8 blksizes_valid; bool en_datacenter_tcp:1; bool disable_packed:1; bool rocev2_rto_policy:1; @@ -392,7 +402,21 @@ struct irdma_cqp_quanta { __le64 elem[IRDMA_CQP_WQE_SIZE]; }; +struct irdma_ooo_cqp_op { + struct list_head list_entry; + u64 scratch; + u32 def_info; + u32 sw_def_info; + u32 wqe_idx; + bool deferred:1; +}; + struct irdma_sc_cqp { + spinlock_t ooo_list_lock; /* protects list of pending completions */ + struct list_head ooo_avail; + struct list_head ooo_pnd; + u32 last_def_cmpl_ticket; + u32 sw_def_cmpl_ticket; u32 size; u64 sq_pa; u64 host_ctx_pa; @@ -408,8 +432,10 @@ struct irdma_sc_cqp { u64 *scratch_array; u64 requested_ops; atomic64_t completed_ops; + struct irdma_ooo_cqp_op *ooo_op_array; u32 cqp_id; u32 sq_size; + u32 pe_en_vf_cnt; u32 hw_sq_size; u16 hw_maj_ver; u16 hw_min_ver; @@ -419,6 +445,14 @@ struct irdma_sc_cqp { u8 ena_vf_count; u8 timeout_count; u8 ceqs_per_vf; + u8 ooisc_blksize; + u8 rrsp_blksize; + u8 q1_blksize; + u8 xmit_blksize; + u8 ts_override; + u8 ts_shift; + u8 en_fine_grained_timers; + u8 blksizes_valid; bool en_datacenter_tcp:1; bool disable_packed:1; bool rocev2_rto_policy:1; @@ -723,7 +757,8 @@ struct irdma_ccq_cqe_info { u16 maj_err_code; u16 min_err_code; u8 op_code; - bool error; + bool error:1; + bool pending:1; }; struct irdma_dcb_app_info { @@ -998,6 +1033,7 @@ struct irdma_qp_host_ctx_info { struct irdma_aeqe_info { u64 compl_ctx; u32 qp_cq_id; + u32 def_info; /* only valid for DEF_CMPL */ u16 ae_id; u16 wqe_idx; u8 tcp_state; @@ -1242,6 +1278,11 @@ void irdma_sc_pd_init(struct 
irdma_sc_dev *dev, struct irdma_sc_pd *pd, u32 pd_i void irdma_cfg_aeq(struct irdma_sc_dev *dev, u32 idx, bool enable); void irdma_check_cqp_progress(struct irdma_cqp_timeout *cqp_timeout, struct irdma_sc_dev *dev); +void irdma_sc_cqp_def_cmpl_ae_handler(struct irdma_sc_dev *dev, + struct irdma_aeqe_info *info, + bool first, u64 *scratch, + u32 *sw_def_info); +u64 irdma_sc_cqp_cleanup_handler(struct irdma_sc_dev *dev); int irdma_sc_cqp_create(struct irdma_sc_cqp *cqp, u16 *maj_err, u16 *min_err); int irdma_sc_cqp_destroy(struct irdma_sc_cqp *cqp); int irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 0422787..e940d32 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -484,6 +484,7 @@ void irdma_free_cqp_request(struct irdma_cqp *cqp, WRITE_ONCE(cqp_request->request_done, false); cqp_request->callback_fcn = NULL; cqp_request->waiting = false; + cqp_request->pending = false; spin_lock_irqsave(&cqp->req_lock, flags); list_add_tail(&cqp_request->list, &cqp->cqp_avail_reqs); @@ -524,6 +525,22 @@ void irdma_put_cqp_request(struct irdma_cqp *cqp, } /** + * irdma_cleanup_deferred_cqp_ops - clean-up cqp with no completions + * @dev: sc_dev + * @cqp: cqp + */ +static void irdma_cleanup_deferred_cqp_ops(struct irdma_sc_dev *dev, + struct irdma_cqp *cqp) +{ + u64 scratch; + + /* process all CQP requests with deferred/pending completions */ + while ((scratch = irdma_sc_cqp_cleanup_handler(dev))) + irdma_free_pending_cqp_request(cqp, (struct irdma_cqp_request *) + (uintptr_t)scratch); +} + +/** * irdma_cleanup_pending_cqp_op - clean-up cqp with no * completions * @rf: RDMA PCI function @@ -536,6 +553,8 @@ void irdma_cleanup_pending_cqp_op(struct irdma_pci_f *rf) struct cqp_cmds_info *pcmdinfo = NULL; u32 i, pending_work, wqe_idx; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + irdma_cleanup_deferred_cqp_ops(dev, cqp); pending_work = 
IRDMA_RING_USED_QUANTA(cqp->sc_cqp.sq_ring); wqe_idx = IRDMA_RING_CURRENT_TAIL(cqp->sc_cqp.sq_ring); for (i = 0; i < pending_work; i++) { @@ -555,6 +574,26 @@ void irdma_cleanup_pending_cqp_op(struct irdma_pci_f *rf) } } +static int irdma_get_timeout_threshold(struct irdma_sc_dev *dev) +{ + u16 time_s = dev->vc_caps.cqp_timeout_s; + + if (!time_s) + return CQP_TIMEOUT_THRESHOLD; + + return time_s * 1000 / dev->hw_attrs.max_cqp_compl_wait_time_ms; +} + +static int irdma_get_def_timeout_threshold(struct irdma_sc_dev *dev) +{ + u16 time_s = dev->vc_caps.cqp_def_timeout_s; + + if (!time_s) + return CQP_DEF_CMPL_TIMEOUT_THRESHOLD; + + return time_s * 1000 / dev->hw_attrs.max_cqp_compl_wait_time_ms; +} + /** * irdma_wait_event - wait for completion * @rf: RDMA PCI function @@ -564,6 +603,7 @@ static int irdma_wait_event(struct irdma_pci_f *rf, struct irdma_cqp_request *cqp_request) { struct irdma_cqp_timeout cqp_timeout = {}; + int timeout_threshold = irdma_get_timeout_threshold(&rf->sc_dev); bool cqp_error = false; int err_code = 0; @@ -575,9 +615,17 @@ static int irdma_wait_event(struct irdma_pci_f *rf, msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS))) break; + if (cqp_request->pending) + /* There was a deferred or pending completion + * received for this CQP request, so we need + * to wait longer than usual. 
+ */ + timeout_threshold = + irdma_get_def_timeout_threshold(&rf->sc_dev); + irdma_check_cqp_progress(&cqp_timeout, &rf->sc_dev); - if (cqp_timeout.count < CQP_TIMEOUT_THRESHOLD) + if (cqp_timeout.count < timeout_threshold) continue; if (!rf->reset) { From patchwork Wed Jul 24 23:39:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741455 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77FF01482EE for ; Wed, 24 Jul 2024 23:40:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864449; cv=none; b=dDz08rqAvmacFStqfjHKbGKXWYVQvenaO9dHBKi1kE4OG0j/yv5taBs3HR5fw4kMG64/V0Jr8mw0wUCN/ffgPSg91k35NaxceAuBtkf0oScJNcRoS3+R57V23ZwAaimVejQtu1qnCzypVAzu0gzN4iNcTA8hpUlQkeVMfB3jN+c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864449; c=relaxed/simple; bh=O6oJ9OlYufXW4bhx3bmOP0qeCdiAljOu1vCsV0GUi3k=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=L1dULUsu+zYU9GerYqoSLYpDKusjAhfULnba72BJi1HF5Z1KEkIYE42Kk3VC4rwlhpKTiaViYuGniNfNdWXl8vQz5mWU3Upi7zkTOZi4QkTroyUtrUHje0f2VJ89yMSBvQ106b8zitziE/jRXj/kmFphU7zmrsQczRSBfBVQrTk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Zb3bcYIF; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Zb3bcYIF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864447; x=1753400447; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=O6oJ9OlYufXW4bhx3bmOP0qeCdiAljOu1vCsV0GUi3k=; b=Zb3bcYIFjZxij9SIMSa2GJ+b8KuabLDPPkmdcMEOqDJjmj8GeVymNW9l uzQnhVP15rzn5Ebfx2mI3RvDcLYt1DrCuWy+/lQRIhsstrq4l0zou3jB+ qJGotbJ72Josg0HiT7VBhyKnx+GAkvTIpt2oDSTl7rwznBFRNvOodXSe7 4Z8pEO9ZdbaLxE7N5WL+BhAMYvEN02Ra2QtwuU2pjKddWGY0mVRdOdpwq EWV1OCDd2KXseyzba81VZ3nZPGAQlQtMhUUGp7YrLEJvRNiy8SZntO53D 1JC1P4DPof0h2czKbmMYy6aIOclo7RqrS0McVgUVU8XmJmYIC3uSqXb4s Q==; X-CSE-ConnectionGUID: rfaTkOhpTJCI6HyVpU7ANw== X-CSE-MsgGUID: xK2BXmVJTyON2dPNupBcGw== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999776" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999776" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:42 -0700 X-CSE-ConnectionGUID: ntz7ZfBmTuKvQgYZ7AWA1g== X-CSE-MsgGUID: GU8O9MzBTo6SucT3IDreyA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426062" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:41 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 14/25] RDMA/irdma: Add GEN3 support for AEQ and CEQ Date: Wed, 24 Jul 2024 18:39:06 -0500 Message-Id: <20240724233917.704-15-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: 
linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Shiraz Saleem Extend support for GEN3 devices by programming the necessary hardware IRQ registers and the updated descriptor fields for the Asynchronous Event Queue (AEQ) and Completion Event Queue (CEQ). Introduce an RDMA virtual channel operation with the Control Plane (CP) to associate interrupt vectors appropriately with AEQ and CEQ. Add new Asynchronous Event (AE) definitions specific to GEN3. Additionally, refactor the AEQ and CEQ setup into the irdma_ctrl_init_hw device control initialization routine. This completes the PCI device level initialization for RDMA in the core driver. Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 76 +++++++++++++++--- drivers/infiniband/hw/irdma/defs.h | 29 ++++++- drivers/infiniband/hw/irdma/hw.c | 130 ++++++++++++++++++------------- drivers/infiniband/hw/irdma/ig3rdma_hw.c | 45 +++++++++++ drivers/infiniband/hw/irdma/irdma.h | 11 ++- drivers/infiniband/hw/irdma/main.h | 6 +- drivers/infiniband/hw/irdma/type.h | 11 ++- drivers/infiniband/hw/irdma/virtchnl.c | 84 ++++++++++++++++++++ drivers/infiniband/hw/irdma/virtchnl.h | 19 +++++ 9 files changed, 338 insertions(+), 73 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index e524b61..5a5d47c 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2562,6 +2562,9 @@ static int irdma_sc_cq_create(struct irdma_sc_cq *cq, u64 scratch, FIELD_PREP(IRDMA_CQPSQ_CQ_LPBLSIZE, cq->pbl_chunk_size) | FIELD_PREP(IRDMA_CQPSQ_CQ_CHKOVERFLOW, check_overflow) | FIELD_PREP(IRDMA_CQPSQ_CQ_VIRTMAP, cq->virtual_map) | + FIELD_PREP(IRDMA_CQPSQ_CQ_CQID_HIGH, cq->cq_uk.cq_id >> 22) | + FIELD_PREP(IRDMA_CQPSQ_CQ_CEQID_HIGH, + (cq->ceq_id_valid ?
cq->ceq_id : 0) >> 10) | FIELD_PREP(IRDMA_CQPSQ_CQ_ENCEQEMASK, cq->ceqe_mask) | FIELD_PREP(IRDMA_CQPSQ_CQ_CEQIDVALID, cq->ceq_id_valid) | FIELD_PREP(IRDMA_CQPSQ_TPHEN, cq->tph_en) | @@ -3924,7 +3927,7 @@ int irdma_sc_ceq_init(struct irdma_sc_ceq *ceq, ceq->pbl_list = (ceq->virtual_map ? info->pbl_list : NULL); ceq->tph_en = info->tph_en; ceq->tph_val = info->tph_val; - ceq->vsi = info->vsi; + ceq->vsi_idx = info->vsi_idx; ceq->polarity = 1; IRDMA_RING_INIT(ceq->ceq_ring, ceq->elem_cnt); ceq->dev->ceq[info->ceq_id] = ceq; @@ -3957,13 +3960,16 @@ static int irdma_sc_ceq_create(struct irdma_sc_ceq *ceq, u64 scratch, (ceq->virtual_map ? ceq->first_pm_pbl_idx : 0)); set_64bit_val(wqe, 56, FIELD_PREP(IRDMA_CQPSQ_TPHVAL, ceq->tph_val) | - FIELD_PREP(IRDMA_CQPSQ_VSIIDX, ceq->vsi->vsi_idx)); + FIELD_PREP(IRDMA_CQPSQ_PASID, ceq->pasid) | + FIELD_PREP(IRDMA_CQPSQ_VSIIDX, ceq->vsi_idx)); hdr = FIELD_PREP(IRDMA_CQPSQ_CEQ_CEQID, ceq->ceq_id) | + FIELD_PREP(IRDMA_CQPSQ_CEQ_CEQID_HIGH, ceq->ceq_id >> 10) | FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_CEQ) | FIELD_PREP(IRDMA_CQPSQ_CEQ_LPBLSIZE, ceq->pbl_chunk_size) | FIELD_PREP(IRDMA_CQPSQ_CEQ_VMAP, ceq->virtual_map) | FIELD_PREP(IRDMA_CQPSQ_CEQ_ITRNOEXPIRE, ceq->itr_no_expire) | FIELD_PREP(IRDMA_CQPSQ_TPHEN, ceq->tph_en) | + FIELD_PREP(IRDMA_CQPSQ_PASID_VALID, ceq->pasid_valid) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ @@ -4018,7 +4024,7 @@ int irdma_sc_cceq_create(struct irdma_sc_ceq *ceq, u64 scratch) int ret_code; struct irdma_sc_dev *dev = ceq->dev; - dev->ccq->vsi = ceq->vsi; + dev->ccq->vsi_idx = ceq->vsi_idx; if (ceq->reg_cq) { ret_code = irdma_sc_add_cq_ctx(ceq, ceq->dev->ccq); if (ret_code) @@ -4051,11 +4057,14 @@ int irdma_sc_ceq_destroy(struct irdma_sc_ceq *ceq, u64 scratch, bool post_sq) set_64bit_val(wqe, 16, ceq->elem_cnt); set_64bit_val(wqe, 48, ceq->first_pm_pbl_idx); + set_64bit_val(wqe, 56, + FIELD_PREP(IRDMA_CQPSQ_PASID, 
ceq->pasid)); hdr = ceq->ceq_id | FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_CEQ) | FIELD_PREP(IRDMA_CQPSQ_CEQ_LPBLSIZE, ceq->pbl_chunk_size) | FIELD_PREP(IRDMA_CQPSQ_CEQ_VMAP, ceq->virtual_map) | FIELD_PREP(IRDMA_CQPSQ_TPHEN, ceq->tph_en) | + FIELD_PREP(IRDMA_CQPSQ_PASID_VALID, ceq->pasid_valid) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ @@ -4219,10 +4228,13 @@ static int irdma_sc_aeq_create(struct irdma_sc_aeq *aeq, u64 scratch, (aeq->virtual_map ? 0 : aeq->aeq_elem_pa)); set_64bit_val(wqe, 48, (aeq->virtual_map ? aeq->first_pm_pbl_idx : 0)); + set_64bit_val(wqe, 56, + FIELD_PREP(IRDMA_CQPSQ_PASID, aeq->pasid)); hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_AEQ) | FIELD_PREP(IRDMA_CQPSQ_AEQ_LPBLSIZE, aeq->pbl_chunk_size) | FIELD_PREP(IRDMA_CQPSQ_AEQ_VMAP, aeq->virtual_map) | + FIELD_PREP(IRDMA_CQPSQ_PASID_VALID, aeq->pasid_valid) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ @@ -4251,7 +4263,8 @@ static int irdma_sc_aeq_destroy(struct irdma_sc_aeq *aeq, u64 scratch, u64 hdr; dev = aeq->dev; - writel(0, dev->hw_regs[IRDMA_PFINT_AEQCTL]); + if (dev->privileged) + writel(0, dev->hw_regs[IRDMA_PFINT_AEQCTL]); cqp = dev->cqp; wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch); @@ -4259,9 +4272,12 @@ static int irdma_sc_aeq_destroy(struct irdma_sc_aeq *aeq, u64 scratch, return -ENOMEM; set_64bit_val(wqe, 16, aeq->elem_cnt); set_64bit_val(wqe, 48, aeq->first_pm_pbl_idx); + set_64bit_val(wqe, 56, + FIELD_PREP(IRDMA_CQPSQ_PASID, aeq->pasid)); hdr = FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_AEQ) | FIELD_PREP(IRDMA_CQPSQ_AEQ_LPBLSIZE, aeq->pbl_chunk_size) | FIELD_PREP(IRDMA_CQPSQ_AEQ_VMAP, aeq->virtual_map) | + FIELD_PREP(IRDMA_CQPSQ_PASID_VALID, aeq->pasid_valid) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ @@ -4302,18 +4318,39 @@ 
int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, print_hex_dump_debug("WQE: AEQ_ENTRY WQE", DUMP_PREFIX_OFFSET, 16, 8, aeqe, 16, false); - ae_src = (u8)FIELD_GET(IRDMA_AEQE_AESRC, temp); - info->wqe_idx = (u16)FIELD_GET(IRDMA_AEQE_WQDESCIDX, temp); - info->qp_cq_id = (u32)FIELD_GET(IRDMA_AEQE_QPCQID_LOW, temp) | + if (aeq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + ae_src = (u8)FIELD_GET(IRDMA_AEQE_AESRC_GEN_3, temp); + info->wqe_idx = (u16)FIELD_GET(IRDMA_AEQE_WQDESCIDX_GEN_3, + temp); + info->qp_cq_id = (u32)FIELD_GET(IRDMA_AEQE_QPCQID_GEN_3, temp); + info->ae_id = (u16)FIELD_GET(IRDMA_AEQE_AECODE_GEN_3, temp); + info->tcp_state = (u8)FIELD_GET(IRDMA_AEQE_TCPSTATE_GEN_3, compl_ctx); + info->iwarp_state = (u8)FIELD_GET(IRDMA_AEQE_IWSTATE_GEN_3, temp); + info->q2_data_written = (u8)FIELD_GET(IRDMA_AEQE_Q2DATA_GEN_3, compl_ctx); + info->aeqe_overflow = (bool)FIELD_GET(IRDMA_AEQE_OVERFLOW_GEN_3, temp); + info->compl_ctx = FIELD_GET(IRDMA_AEQE_CMPL_CTXT, compl_ctx); + compl_ctx = FIELD_GET(IRDMA_AEQE_CMPL_CTXT, compl_ctx) << IRDMA_AEQE_CMPL_CTXT_S; + } else { + ae_src = (u8)FIELD_GET(IRDMA_AEQE_AESRC, temp); + info->wqe_idx = (u16)FIELD_GET(IRDMA_AEQE_WQDESCIDX, temp); + info->qp_cq_id = (u32)FIELD_GET(IRDMA_AEQE_QPCQID_LOW, temp) | ((u32)FIELD_GET(IRDMA_AEQE_QPCQID_HI, temp) << 18); - info->ae_id = (u16)FIELD_GET(IRDMA_AEQE_AECODE, temp); - info->tcp_state = (u8)FIELD_GET(IRDMA_AEQE_TCPSTATE, temp); - info->iwarp_state = (u8)FIELD_GET(IRDMA_AEQE_IWSTATE, temp); - info->q2_data_written = (u8)FIELD_GET(IRDMA_AEQE_Q2DATA, temp); - info->aeqe_overflow = (bool)FIELD_GET(IRDMA_AEQE_OVERFLOW, temp); + info->ae_id = (u16)FIELD_GET(IRDMA_AEQE_AECODE, temp); + info->tcp_state = (u8)FIELD_GET(IRDMA_AEQE_TCPSTATE, temp); + info->iwarp_state = (u8)FIELD_GET(IRDMA_AEQE_IWSTATE, temp); + info->q2_data_written = (u8)FIELD_GET(IRDMA_AEQE_Q2DATA, temp); + info->aeqe_overflow = (bool)FIELD_GET(IRDMA_AEQE_OVERFLOW, + temp); + } info->ae_src = ae_src; switch (info->ae_id) { + 
case IRDMA_AE_SRQ_LIMIT: + info->srq = true; + /* [63:6] from CMPL_CTXT, [5:0] from WQDESCIDX. */ + info->compl_ctx = compl_ctx | info->wqe_idx; + ae_src = IRDMA_AE_SOURCE_RSVD; + break; case IRDMA_AE_PRIV_OPERATION_DENIED: case IRDMA_AE_AMP_INVALIDATE_TYPE1_MW: case IRDMA_AE_AMP_MWBIND_ZERO_BASED_TYPE1_MW: @@ -4346,6 +4383,10 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR: case IRDMA_AE_LLP_SEGMENT_TOO_SMALL: case IRDMA_AE_LLP_TOO_MANY_RETRIES: + case IRDMA_AE_LLP_TOO_MANY_RNRS: + case IRDMA_AE_REMOTE_QP_CATASTROPHIC: + case IRDMA_AE_LOCAL_QP_CATASTROPHIC: + case IRDMA_AE_RCE_QP_CATASTROPHIC: case IRDMA_AE_LLP_DOUBT_REACHABILITY: case IRDMA_AE_LLP_CONNECTION_ESTABLISHED: case IRDMA_AE_RESET_SENT: @@ -4391,6 +4432,7 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, info->qp = true; info->rq = true; info->compl_ctx = compl_ctx; + info->err_rq_idx_valid = true; break; case IRDMA_AE_SOURCE_CQ: case IRDMA_AE_SOURCE_CQ_0110: @@ -4406,8 +4448,18 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, info->compl_ctx = compl_ctx; break; case IRDMA_AE_SOURCE_IN_RR_WR: + info->qp = true; + if (aeq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + info->err_rq_idx_valid = true; + info->compl_ctx = compl_ctx; + info->in_rdrsp_wr = true; + break; case IRDMA_AE_SOURCE_IN_RR_WR_1011: info->qp = true; + if (aeq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + info->sq = true; + info->err_rq_idx_valid = true; + } info->compl_ctx = compl_ctx; info->in_rdrsp_wr = true; break; diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 5e4d62c..5829c72 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -319,13 +319,18 @@ enum irdma_cqp_op_type { #define IRDMA_AE_STAG_ZERO_INVALID 0x0206 #define IRDMA_AE_IB_RREQ_AND_Q1_FULL 0x0207 #define IRDMA_AE_IB_INVALID_REQUEST 0x0208 +#define IRDMA_AE_SRQ_LIMIT 0x0209 #define IRDMA_AE_WQE_UNEXPECTED_OPCODE 0x020a 
#define IRDMA_AE_WQE_INVALID_PARAMETER 0x020b #define IRDMA_AE_WQE_INVALID_FRAG_DATA 0x020c #define IRDMA_AE_IB_REMOTE_ACCESS_ERROR 0x020d #define IRDMA_AE_IB_REMOTE_OP_ERROR 0x020e +#define IRDMA_AE_SRQ_CATASTROPHIC_ERROR 0x020f #define IRDMA_AE_WQE_LSMM_TOO_LONG 0x0220 +#define IRDMA_AE_ATOMIC_ALIGNMENT 0x0221 +#define IRDMA_AE_ATOMIC_MASK 0x0222 #define IRDMA_AE_INVALID_REQUEST 0x0223 +#define IRDMA_AE_PCIE_ATOMIC_DISABLE 0x0224 #define IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN 0x0301 #define IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER 0x0303 #define IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION 0x0304 @@ -366,8 +371,12 @@ enum irdma_cqp_op_type { #define IRDMA_AE_LCE_QP_CATASTROPHIC 0x0700 #define IRDMA_AE_LCE_FUNCTION_CATASTROPHIC 0x0701 #define IRDMA_AE_LCE_CQ_CATASTROPHIC 0x0702 +#define IRDMA_AE_REMOTE_QP_CATASTROPHIC 0x0703 +#define IRDMA_AE_LOCAL_QP_CATASTROPHIC 0x0704 +#define IRDMA_AE_RCE_QP_CATASTROPHIC 0x0705 #define IRDMA_AE_QP_SUSPEND_COMPLETE 0x0900 #define IRDMA_AE_CQP_DEFERRED_COMPLETE 0x0901 +#define IRDMA_AE_ADAPTER_CATASTROPHIC 0x0B0B #define FLD_LS_64(dev, val, field) \ (((u64)(val) << (dev)->hw_shifts[field ## _S]) & (dev)->hw_masks[field ## _M]) @@ -538,6 +547,17 @@ enum irdma_cqp_op_type { #define IRDMA_AEQE_Q2DATA GENMASK_ULL(62, 61) #define IRDMA_AEQE_VALID BIT_ULL(63) +#define IRDMA_AEQE_Q2DATA_GEN_3 GENMASK_ULL(5, 4) +#define IRDMA_AEQE_TCPSTATE_GEN_3 GENMASK_ULL(3, 0) +#define IRDMA_AEQE_QPCQID_GEN_3 GENMASK_ULL(24, 0) +#define IRDMA_AEQE_AECODE_GEN_3 GENMASK_ULL(61, 50) +#define IRDMA_AEQE_OVERFLOW_GEN_3 BIT_ULL(62) +#define IRDMA_AEQE_WQDESCIDX_GEN_3 GENMASK_ULL(49, 32) +#define IRDMA_AEQE_IWSTATE_GEN_3 GENMASK_ULL(31, 29) +#define IRDMA_AEQE_AESRC_GEN_3 GENMASK_ULL(28, 25) +#define IRDMA_AEQE_CMPL_CTXT_S 6 +#define IRDMA_AEQE_CMPL_CTXT GENMASK_ULL(63, 6) + #define IRDMA_UDA_QPSQ_NEXT_HDR GENMASK_ULL(23, 16) #define IRDMA_UDA_QPSQ_OPCODE GENMASK_ULL(37, 32) #define IRDMA_UDA_QPSQ_L4LEN GENMASK_ULL(45, 42) @@ -560,11 +580,14 @@ enum 
irdma_cqp_op_type { #define IRDMA_CQPSQ_WQEVALID BIT_ULL(63) #define IRDMA_CQPSQ_TPHVAL GENMASK_ULL(7, 0) -#define IRDMA_CQPSQ_VSIIDX GENMASK_ULL(17, 8) +#define IRDMA_CQPSQ_VSIIDX GENMASK_ULL(23, 8) #define IRDMA_CQPSQ_TPHEN BIT_ULL(60) #define IRDMA_CQPSQ_PBUFADDR IRDMA_CQPHC_QPCTX +#define IRDMA_CQPSQ_PASID GENMASK_ULL(51, 32) +#define IRDMA_CQPSQ_PASID_VALID BIT_ULL(62) + /* Create/Modify/Destroy QP */ #define IRDMA_CQPSQ_QP_NEWMSS GENMASK_ULL(45, 32) @@ -600,6 +623,8 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_CQ_CQCTX GENMASK_ULL(62, 0) #define IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD GENMASK(17, 0) +#define IRDMA_CQPSQ_CQ_CQID_HIGH GENMASK_ULL(52, 50) +#define IRDMA_CQPSQ_CQ_CEQID_HIGH GENMASK_ULL(59, 54) #define IRDMA_CQPSQ_CQ_OP GENMASK_ULL(37, 32) #define IRDMA_CQPSQ_CQ_CQRESIZE BIT_ULL(43) #define IRDMA_CQPSQ_CQ_LPBLSIZE GENMASK_ULL(45, 44) @@ -681,6 +706,8 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_CEQ_CEQSIZE GENMASK_ULL(21, 0) #define IRDMA_CQPSQ_CEQ_CEQID GENMASK_ULL(9, 0) +#define IRDMA_CQPSQ_CEQ_CEQID_HIGH GENMASK_ULL(15, 10) + #define IRDMA_CQPSQ_CEQ_LPBLSIZE IRDMA_CQPSQ_CQ_LPBLSIZE #define IRDMA_CQPSQ_CEQ_VMAP BIT_ULL(47) #define IRDMA_CQPSQ_CEQ_ITRNOEXPIRE BIT_ULL(46) diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 55b10a8..f01ec21 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -282,6 +282,13 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) if (ret) break; + if (info->aeqe_overflow) { + ibdev_err(&iwdev->ibdev, "AEQ has overflowed\n"); + rf->reset = true; + rf->gen_ops.request_reset(rf); + return; + } + aeqcnt++; ibdev_dbg(&iwdev->ibdev, "AEQ: ae_id = 0x%x bool qp=%d qp_id = %d tcp_state=%d iwarp_state=%d ae_src=%d\n", @@ -442,6 +449,9 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) case IRDMA_AE_LCE_FUNCTION_CATASTROPHIC: case IRDMA_AE_LLP_TOO_MANY_RNRS: case IRDMA_AE_LCE_CQ_CATASTROPHIC: + case IRDMA_AE_REMOTE_QP_CATASTROPHIC: + case 
IRDMA_AE_LOCAL_QP_CATASTROPHIC: + case IRDMA_AE_RCE_QP_CATASTROPHIC: case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG: default: ibdev_err(&iwdev->ibdev, "abnormal ae_id = 0x%x bool qp=%d qp_id = %d, ae_src=%d\n", @@ -688,7 +698,9 @@ static void irdma_destroy_aeq(struct irdma_pci_f *rf) int status = -EBUSY; if (!rf->msix_shared) { - rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, rf->iw_msixtbl->idx, false); + if (rf->sc_dev.privileged) + rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, + rf->iw_msixtbl->idx, false); irdma_destroy_irq(rf, rf->iw_msixtbl, rf); } if (rf->reset) @@ -754,9 +766,10 @@ static void irdma_del_ceq_0(struct irdma_pci_f *rf) if (rf->msix_shared) { msix_vec = &rf->iw_msixtbl[0]; - rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, - msix_vec->ceq_id, - msix_vec->idx, false); + if (rf->sc_dev.privileged) + rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, + msix_vec->ceq_id, + msix_vec->idx, false); irdma_destroy_irq(rf, msix_vec, rf); } else { msix_vec = &rf->iw_msixtbl[1]; @@ -787,8 +800,10 @@ static void irdma_del_ceqs(struct irdma_pci_f *rf) msix_vec = &rf->iw_msixtbl[2]; for (i = 1; i < rf->ceqs_count; i++, msix_vec++, iwceq++) { - rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, msix_vec->ceq_id, - msix_vec->idx, false); + if (rf->sc_dev.privileged) + rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, + msix_vec->ceq_id, + msix_vec->idx, false); irdma_destroy_irq(rf, msix_vec, iwceq); irdma_cqp_ceq_cmd(&rf->sc_dev, &iwceq->sc_ceq, IRDMA_OP_CEQ_DESTROY); @@ -1211,9 +1226,13 @@ static int irdma_cfg_ceq_vector(struct irdma_pci_f *rf, struct irdma_ceq *iwceq, } msix_vec->ceq_id = ceq_id; - rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, ceq_id, msix_vec->idx, true); - - return 0; + if (rf->sc_dev.privileged) + rf->sc_dev.irq_ops->irdma_cfg_ceq(&rf->sc_dev, ceq_id, + msix_vec->idx, true); + else + status = irdma_vchnl_req_ceq_vec_map(&rf->sc_dev, ceq_id, + msix_vec->idx); + return status; } /** @@ -1226,7 +1245,7 @@ static int irdma_cfg_ceq_vector(struct irdma_pci_f *rf, 
struct irdma_ceq *iwceq, static int irdma_cfg_aeq_vector(struct irdma_pci_f *rf) { struct irdma_msix_vector *msix_vec = rf->iw_msixtbl; - u32 ret = 0; + int ret = 0; if (!rf->msix_shared) { snprintf(msix_vec->name, sizeof(msix_vec->name) - 1, @@ -1237,12 +1256,16 @@ static int irdma_cfg_aeq_vector(struct irdma_pci_f *rf) } if (ret) { ibdev_dbg(&rf->iwdev->ibdev, "ERR: aeq irq config fail\n"); - return -EINVAL; + return ret; } - rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, msix_vec->idx, true); + if (rf->sc_dev.privileged) + rf->sc_dev.irq_ops->irdma_cfg_aeq(&rf->sc_dev, msix_vec->idx, + true); + else + ret = irdma_vchnl_req_aeq_vec_map(&rf->sc_dev, msix_vec->idx); - return 0; + return ret; } /** @@ -1250,13 +1273,13 @@ static int irdma_cfg_aeq_vector(struct irdma_pci_f *rf) * @rf: RDMA PCI function * @iwceq: pointer to the ceq resources to be created * @ceq_id: the id number of the iwceq - * @vsi: SC vsi struct + * @vsi_idx: vsi idx * * Return 0, if the ceq and the resources associated with it * are successfully created, otherwise return error */ static int irdma_create_ceq(struct irdma_pci_f *rf, struct irdma_ceq *iwceq, - u32 ceq_id, struct irdma_sc_vsi *vsi) + u32 ceq_id, u16 vsi_idx) { int status; struct irdma_ceq_init_info info = {}; @@ -1280,7 +1303,7 @@ static int irdma_create_ceq(struct irdma_pci_f *rf, struct irdma_ceq *iwceq, info.elem_cnt = ceq_size; iwceq->sc_ceq.ceq_id = ceq_id; info.dev = dev; - info.vsi = vsi; + info.vsi_idx = vsi_idx; status = irdma_sc_ceq_init(&iwceq->sc_ceq, &info); if (!status) { if (dev->ceq_valid) @@ -1323,7 +1346,7 @@ static int irdma_setup_ceq_0(struct irdma_pci_f *rf) } iwceq = &rf->ceqlist[0]; - status = irdma_create_ceq(rf, iwceq, 0, &rf->default_vsi); + status = irdma_create_ceq(rf, iwceq, 0, rf->default_vsi.vsi_idx); if (status) { ibdev_dbg(&rf->iwdev->ibdev, "ERR: create ceq status = %d\n", status); @@ -1358,13 +1381,13 @@ static int irdma_setup_ceq_0(struct irdma_pci_f *rf) /** * irdma_setup_ceqs - manage the device 
ceq's and their interrupt resources * @rf: RDMA PCI function - * @vsi: VSI structure for this CEQ + * @vsi_idx: vsi_idx for this CEQ * * Allocate a list for all device completion event queues * Create the ceq's and configure their msix interrupt vectors * Return 0, if ceqs are successfully set up, otherwise return error */ -static int irdma_setup_ceqs(struct irdma_pci_f *rf, struct irdma_sc_vsi *vsi) +static int irdma_setup_ceqs(struct irdma_pci_f *rf, u16 vsi_idx) { u32 i; u32 ceq_id; @@ -1377,7 +1400,7 @@ static int irdma_setup_ceqs(struct irdma_pci_f *rf, struct irdma_sc_vsi *vsi) i = (rf->msix_shared) ? 1 : 2; for (ceq_id = 1; i < num_ceqs; i++, ceq_id++) { iwceq = &rf->ceqlist[ceq_id]; - status = irdma_create_ceq(rf, iwceq, ceq_id, vsi); + status = irdma_create_ceq(rf, iwceq, ceq_id, vsi_idx); if (status) { ibdev_dbg(&rf->iwdev->ibdev, "ERR: create ceq status = %d\n", status); @@ -1458,7 +1481,10 @@ static int irdma_create_aeq(struct irdma_pci_f *rf) aeq_size = multiplier * hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt + hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt; aeq_size = min(aeq_size, dev->hw_attrs.max_hw_aeq_size); - + /* GEN_3 does not support virtual AEQ. Cap at max Kernel alloc size */ + if (rf->rdma_ver == IRDMA_GEN_3) + aeq_size = min(aeq_size, (u32)((PAGE_SIZE << MAX_PAGE_ORDER) / + sizeof(struct irdma_sc_aeqe))); aeq->mem.size = ALIGN(sizeof(struct irdma_sc_aeqe) * aeq_size, IRDMA_AEQ_ALIGNMENT); aeq->mem.va = dma_alloc_coherent(dev->hw->device, aeq->mem.size, @@ -1466,6 +1492,8 @@ static int irdma_create_aeq(struct irdma_pci_f *rf) GFP_KERNEL | __GFP_NOWARN); if (aeq->mem.va) goto skip_virt_aeq; + else if (rf->rdma_ver == IRDMA_GEN_3) + return -ENOMEM; /* physically mapped aeq failed. 
setup virtual aeq */ status = irdma_create_virt_aeq(rf, aeq_size); @@ -1739,9 +1767,6 @@ void irdma_rt_deinit_hw(struct irdma_device *iwdev) irdma_del_local_mac_entry(iwdev->rf, (u8)iwdev->mac_ip_table_idx); fallthrough; - case AEQ_CREATED: - case PBLE_CHUNK_MEM: - case CEQS_CREATED: case IEQ_CREATED: if (!iwdev->roce_mode) irdma_puda_dele_rsrc(&iwdev->vsi, IRDMA_PUDA_RSRC_TYPE_IEQ, @@ -1824,13 +1849,17 @@ void irdma_ctrl_deinit_hw(struct irdma_pci_f *rf) enum init_completion_state state = rf->init_state; rf->init_state = INVALID_STATE; - if (rf->rsrc_created) { + + switch (state) { + case AEQ_CREATED: irdma_destroy_aeq(rf); + fallthrough; + case PBLE_CHUNK_MEM: irdma_destroy_pble_prm(rf->pble_rsrc); + fallthrough; + case CEQS_CREATED: irdma_del_ceqs(rf); - rf->rsrc_created = false; - } - switch (state) { + fallthrough; case CEQ0_CREATED: irdma_del_ceq_0(rf); fallthrough; @@ -1909,32 +1938,6 @@ int irdma_rt_init_hw(struct irdma_device *iwdev, break; iwdev->init_state = IEQ_CREATED; } - if (!rf->rsrc_created) { - status = irdma_setup_ceqs(rf, &iwdev->vsi); - if (status) - break; - - iwdev->init_state = CEQS_CREATED; - - status = irdma_hmc_init_pble(&rf->sc_dev, - rf->pble_rsrc); - if (status) { - irdma_del_ceqs(rf); - break; - } - - iwdev->init_state = PBLE_CHUNK_MEM; - - status = irdma_setup_aeq(rf); - if (status) { - irdma_destroy_pble_prm(rf->pble_rsrc); - irdma_del_ceqs(rf); - break; - } - iwdev->init_state = AEQ_CREATED; - rf->rsrc_created = true; - } - if (iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_1) irdma_alloc_set_mac(iwdev); irdma_add_ip(iwdev); @@ -2016,6 +2019,25 @@ int irdma_ctrl_init_hw(struct irdma_pci_f *rf) } INIT_WORK(&rf->cqp_cmpl_work, cqp_compl_worker); irdma_sc_ccq_arm(dev->ccq); + + status = irdma_setup_ceqs(rf, rf->iwdev ? 
rf->iwdev->vsi_num : 0); + if (status) + break; + + rf->init_state = CEQS_CREATED; + + status = irdma_hmc_init_pble(&rf->sc_dev, + rf->pble_rsrc); + if (status) + break; + + rf->init_state = PBLE_CHUNK_MEM; + + status = irdma_setup_aeq(rf); + if (status) + break; + rf->init_state = AEQ_CREATED; + return 0; } while (0); diff --git a/drivers/infiniband/hw/irdma/ig3rdma_hw.c b/drivers/infiniband/hw/irdma/ig3rdma_hw.c index 83ef6af..1d582c5 100644 --- a/drivers/infiniband/hw/irdma/ig3rdma_hw.c +++ b/drivers/infiniband/hw/irdma/ig3rdma_hw.c @@ -5,8 +5,53 @@ #include "protos.h" #include "ig3rdma_hw.h" +/** + * ig3rdma_ena_irq - Enable interrupt + * @dev: pointer to the device structure + * @idx: vector index + */ +static void ig3rdma_ena_irq(struct irdma_sc_dev *dev, u32 idx) +{ + u32 val; + u32 int_stride = 1; /* one u32 per register */ + + if (dev->is_pf) + int_stride = 0x400; + else + idx--; /* VFs use DYN_CTL_N */ + + val = FIELD_PREP(IRDMA_GLINT_DYN_CTL_INTENA, 1) | + FIELD_PREP(IRDMA_GLINT_DYN_CTL_CLEARPBA, 1); + + writel(val, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + (idx * int_stride)); +} + +/** + * ig3rdma_disable_irq - Disable interrupt + * @dev: pointer to the device structure + * @idx: vector index + */ +static void ig3rdma_disable_irq(struct irdma_sc_dev *dev, u32 idx) +{ + u32 int_stride = 1; /* one u32 per register */ + + if (dev->is_pf) + int_stride = 0x400; + else + idx--; /* VFs use DYN_CTL_N */ + + writel(0, dev->hw_regs[IRDMA_GLINT_DYN_CTL] + (idx * int_stride)); +} + +static const struct irdma_irq_ops ig3rdma_irq_ops = { + .irdma_dis_irq = ig3rdma_disable_irq, + .irdma_en_irq = ig3rdma_ena_irq, +}; + void ig3rdma_init_hw(struct irdma_sc_dev *dev) { + dev->irq_ops = &ig3rdma_irq_ops; + dev->hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_3; dev->hw_attrs.uk_attrs.max_hw_wq_frags = IG3RDMA_MAX_WQ_FRAGMENT_COUNT; dev->hw_attrs.uk_attrs.max_hw_read_sges = IG3RDMA_MAX_SGE_RD; diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h index 
4dc6bf5..0544cba 100644 --- a/drivers/infiniband/hw/irdma/irdma.h +++ b/drivers/infiniband/hw/irdma/irdma.h @@ -32,7 +32,16 @@ #define IRDMA_PFHMC_SDDATALOW_PMSDDATALOW GENMASK(31, 12) #define IRDMA_PFHMC_SDCMD_PMSDWR BIT(31) -#define IRDMA_INVALID_CQ_IDX 0xffffffff +#define IRDMA_INVALID_CQ_IDX 0xffffffff +#define IRDMA_Q_INVALID_IDX 0xffff + +enum irdma_dyn_idx_t { + IRDMA_IDX_ITR0 = 0, + IRDMA_IDX_ITR1 = 1, + IRDMA_IDX_ITR2 = 2, + IRDMA_IDX_NOITR = 3, +}; + enum irdma_registers { IRDMA_CQPTAIL, IRDMA_CQPDB, diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 5d13718..1716933 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -127,12 +127,12 @@ enum init_completion_state { HMC_OBJS_CREATED, HW_RSRC_INITIALIZED, CCQ_CREATED, - CEQ0_CREATED, /* Last state of probe */ - ILQ_CREATED, - IEQ_CREATED, + CEQ0_CREATED, CEQS_CREATED, PBLE_CHUNK_MEM, AEQ_CREATED, + ILQ_CREATED, + IEQ_CREATED, /* Last state of probe */ IP_ADDR_REGISTERED, /* Last state of open */ }; diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 2b93a70..0faf9cf 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -472,6 +472,8 @@ struct irdma_sc_aeq { u32 msix_idx; u8 polarity; bool virtual_map:1; + bool pasid_valid:1; + u32 pasid; }; struct irdma_sc_ceq { @@ -487,13 +489,15 @@ struct irdma_sc_ceq { u8 tph_val; u32 first_pm_pbl_idx; u8 polarity; - struct irdma_sc_vsi *vsi; + u16 vsi_idx; struct irdma_sc_cq **reg_cq; u32 reg_cq_size; spinlock_t req_cq_lock; /* protect access to reg_cq array */ bool virtual_map:1; bool tph_en:1; bool itr_no_expire:1; + bool pasid_valid:1; + u32 pasid; }; struct irdma_sc_cq { @@ -501,6 +505,7 @@ struct irdma_sc_cq { u64 cq_pa; u64 shadow_area_pa; struct irdma_sc_dev *dev; + u16 vsi_idx; struct irdma_sc_vsi *vsi; void *pbl_list; void *back_cq; @@ -834,8 +839,8 @@ struct irdma_ceq_init_info { bool itr_no_expire:1; u8 
pbl_chunk_size; u8 tph_val; + u16 vsi_idx; u32 first_pm_pbl_idx; - struct irdma_sc_vsi *vsi; struct irdma_sc_cq **reg_cq; u32 reg_cq_idx; }; @@ -1042,9 +1047,11 @@ struct irdma_aeqe_info { bool cq:1; bool sq:1; bool rq:1; + bool srq:1; bool in_rdrsp_wr:1; bool out_rdrsp:1; bool aeqe_overflow:1; + bool err_rq_idx_valid:1; u8 q2_data_written; u8 ae_src; }; diff --git a/drivers/infiniband/hw/irdma/virtchnl.c b/drivers/infiniband/hw/irdma/virtchnl.c index fcb8ef2..fc669b5 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.c +++ b/drivers/infiniband/hw/irdma/virtchnl.c @@ -108,6 +108,8 @@ static int irdma_vchnl_req_verify_resp(struct irdma_vchnl_req *vchnl_req, return -EBADMSG; break; case IRDMA_VCHNL_OP_GET_REG_LAYOUT: + case IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP: + case IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP: break; default: return -EOPNOTSUPP; @@ -314,6 +316,88 @@ int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev) } /** + * irdma_vchnl_req_aeq_vec_map - Map AEQ to vector on this function + * @dev: RDMA device pointer + * @v_idx: vector index + */ +int irdma_vchnl_req_aeq_vec_map(struct irdma_sc_dev *dev, u32 v_idx) +{ + struct irdma_vchnl_req_init_info info = {}; + struct irdma_vchnl_qvlist_info *qvl; + struct irdma_vchnl_qv_info *qv; + u16 qvl_size, num_vectors = 1; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + qvl_size = struct_size(qvl, qv_info, num_vectors); + + qvl = kzalloc(qvl_size, GFP_KERNEL); + if (!qvl) + return -ENOMEM; + + qvl->num_vectors = 1; + qv = qvl->qv_info; + + qv->ceq_idx = IRDMA_Q_INVALID_IDX; + qv->v_idx = v_idx; + qv->itr_idx = IRDMA_IDX_ITR0; + + info.op_code = IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP; + info.op_ver = IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP_V0; + info.req_parm = qvl; + info.req_parm_len = qvl_size; + + ret = irdma_vchnl_req_send_sync(dev, &info); + kfree(qvl); + + return ret; +} + +/** + * irdma_vchnl_req_ceq_vec_map - Map CEQ to vector on this function + * @dev: RDMA device pointer + * @ceq_id: CEQ index + * @v_idx: vector index + 
*/ +int irdma_vchnl_req_ceq_vec_map(struct irdma_sc_dev *dev, u16 ceq_id, u32 v_idx) +{ + struct irdma_vchnl_req_init_info info = {}; + struct irdma_vchnl_qvlist_info *qvl; + struct irdma_vchnl_qv_info *qv; + u16 qvl_size, num_vectors = 1; + int ret; + + if (!dev->vchnl_up) + return -EBUSY; + + qvl_size = struct_size(qvl, qv_info, num_vectors); + + qvl = kzalloc(qvl_size, GFP_KERNEL); + if (!qvl) + return -ENOMEM; + + qvl->num_vectors = num_vectors; + qv = qvl->qv_info; + + qv->aeq_idx = IRDMA_Q_INVALID_IDX; + qv->ceq_idx = ceq_id; + qv->v_idx = v_idx; + qv->itr_idx = IRDMA_IDX_ITR0; + + info.op_code = IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP; + info.op_ver = IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP_V0; + info.req_parm = qvl; + info.req_parm_len = qvl_size; + + ret = irdma_vchnl_req_send_sync(dev, &info); + kfree(qvl); + + return ret; +} + +/** * irdma_vchnl_req_get_ver - Request Channel version * @dev: RDMA device pointer * @ver_req: Virtual channel version requested diff --git a/drivers/infiniband/hw/irdma/virtchnl.h b/drivers/infiniband/hw/irdma/virtchnl.h index 20526c0..3af72558 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.h +++ b/drivers/infiniband/hw/irdma/virtchnl.h @@ -15,6 +15,8 @@ #define IRDMA_VCHNL_OP_GET_HMC_FCN_V2 2 #define IRDMA_VCHNL_OP_PUT_HMC_FCN_V0 0 #define IRDMA_VCHNL_OP_GET_REG_LAYOUT_V0 0 +#define IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP_V0 0 +#define IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE 1 @@ -53,6 +55,8 @@ enum irdma_vchnl_ops { IRDMA_VCHNL_OP_PUT_HMC_FCN = 2, IRDMA_VCHNL_OP_GET_REG_LAYOUT = 11, IRDMA_VCHNL_OP_GET_RDMA_CAPS = 13, + IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP = 14, + IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP = 15, }; struct irdma_vchnl_req_hmc_info { @@ -65,6 +69,18 @@ struct irdma_vchnl_resp_hmc_info { u16 qs_handle[IRDMA_MAX_USER_PRIORITY]; } __packed; +struct irdma_vchnl_qv_info { + u32 v_idx; + u16 ceq_idx; + u16 aeq_idx; + u8 itr_idx; +}; + +struct irdma_vchnl_qvlist_info { 
+ u32 num_vectors; + struct irdma_vchnl_qv_info qv_info[]; +}; + struct irdma_vchnl_op_buf { u16 op_code; u16 op_ver; @@ -137,4 +153,7 @@ int irdma_vchnl_req_get_ver(struct irdma_sc_dev *dev, u16 ver_req, int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev, struct irdma_vchnl_req *vc_req); int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev); +int irdma_vchnl_req_aeq_vec_map(struct irdma_sc_dev *dev, u32 v_idx); +int irdma_vchnl_req_ceq_vec_map(struct irdma_sc_dev *dev, u16 ceq_id, + u32 v_idx); #endif /* IRDMA_VIRTCHNL_H */

From patchwork Wed Jul 24 23:39:07 2024
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741453
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Krzysztof Czurylo, Tatyana Nikolova
Subject: [RFC PATCH 15/25] RDMA/irdma: Add GEN3 HW statistics support
Date: Wed, 24 Jul 2024 18:39:07 -0500
Message-Id: <20240724233917.704-16-tatyana.e.nikolova@intel.com>
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>

From: Krzysztof Czurylo

Plug into the unified HW statistics framework by adding a hardware statistics map array for GEN3, defining the HW-specific width and location for each counter in the statistics buffer.

Signed-off-by: Krzysztof Czurylo
Signed-off-by: Tatyana Nikolova
---
 drivers/infiniband/hw/irdma/ctrl.c       |  33 +++++++---
 drivers/infiniband/hw/irdma/defs.h       |   2 +-
 drivers/infiniband/hw/irdma/ig3rdma_hw.c |  63 ++++++++++++++++++
 drivers/infiniband/hw/irdma/type.h       |  19 +++++-
 drivers/infiniband/hw/irdma/verbs.c      | 110 +++++++++++++++++--------------
 5 files changed, 166 insertions(+), 61 deletions(-)

diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 5a5d47c..88eb7a0 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -1964,7 +1964,8 @@ int irdma_vsi_stats_init(struct irdma_sc_vsi *vsi, (void *)((uintptr_t)stats_buff_mem->va + IRDMA_GATHER_STATS_BUF_SIZE); - irdma_hw_stats_start_timer(vsi); + if (vsi->dev->hw_attrs.uk_attrs.hw_rev < IRDMA_GEN_3) + irdma_hw_stats_start_timer(vsi); /* when stat allocation is not required default to fcn_id.
*/ vsi->stats_idx = info->fcn_id; @@ -2009,7 +2010,9 @@ void irdma_vsi_stats_free(struct irdma_sc_vsi *vsi) if (!vsi->pestat) return; - irdma_hw_stats_stop_timer(vsi); + + if (dev->hw_attrs.uk_attrs.hw_rev < IRDMA_GEN_3) + irdma_hw_stats_stop_timer(vsi); dma_free_coherent(vsi->pestat->hw->device, vsi->pestat->gather_info.stats_buff_mem.size, vsi->pestat->gather_info.stats_buff_mem.va, @@ -5935,14 +5938,26 @@ void irdma_cfg_aeq(struct irdma_sc_dev *dev, u32 idx, bool enable) */ void sc_vsi_update_stats(struct irdma_sc_vsi *vsi) { - struct irdma_gather_stats *gather_stats; - struct irdma_gather_stats *last_gather_stats; + struct irdma_dev_hw_stats *hw_stats = &vsi->pestat->hw_stats; + struct irdma_gather_stats *gather_stats = + vsi->pestat->gather_info.gather_stats_va; + struct irdma_gather_stats *last_gather_stats = + vsi->pestat->gather_info.last_gather_stats_va; + const struct irdma_hw_stat_map *map = vsi->dev->hw_stats_map; + u16 max_stat_idx = vsi->dev->hw_attrs.max_stat_idx; + u16 i; + + if (vsi->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + for (i = 0; i < max_stat_idx; i++) { + u16 idx = map[i].byteoff / sizeof(u64); + + hw_stats->stats_val[i] = gather_stats->val[idx]; + } + return; + } - gather_stats = vsi->pestat->gather_info.gather_stats_va; - last_gather_stats = vsi->pestat->gather_info.last_gather_stats_va; - irdma_update_stats(&vsi->pestat->hw_stats, gather_stats, - last_gather_stats, vsi->dev->hw_stats_map, - vsi->dev->hw_attrs.max_stat_idx); + irdma_update_stats(hw_stats, gather_stats, last_gather_stats, + map, max_stat_idx); } /** diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 5829c72..492529a 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -415,7 +415,7 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_STATS_USE_INST BIT_ULL(61) #define IRDMA_CQPSQ_STATS_OP GENMASK_ULL(37, 32) #define IRDMA_CQPSQ_STATS_INST_INDEX GENMASK_ULL(6, 0) -#define 
IRDMA_CQPSQ_STATS_HMC_FCN_INDEX GENMASK_ULL(5, 0) +#define IRDMA_CQPSQ_STATS_HMC_FCN_INDEX GENMASK_ULL(15, 0) #define IRDMA_CQPSQ_WS_WQEVALID BIT_ULL(63) #define IRDMA_CQPSQ_WS_NODEOP GENMASK_ULL(53, 52) #define IRDMA_SD_MAX GENMASK_ULL(15, 0) diff --git a/drivers/infiniband/hw/irdma/ig3rdma_hw.c b/drivers/infiniband/hw/irdma/ig3rdma_hw.c index 1d582c5..2a3d714 100644 --- a/drivers/infiniband/hw/irdma/ig3rdma_hw.c +++ b/drivers/infiniband/hw/irdma/ig3rdma_hw.c @@ -48,9 +48,70 @@ static void ig3rdma_disable_irq(struct irdma_sc_dev *dev, u32 idx) .irdma_en_irq = ig3rdma_ena_irq, }; +static const struct irdma_hw_stat_map ig3rdma_hw_stat_map[] = { + [IRDMA_HW_STAT_INDEX_RXVLANERR] = { 0, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXOCTS] = { 8, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXPKTS] = { 16, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXDISCARD] = { 24, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXTRUNC] = { 32, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXFRAGS] = { 40, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXMCOCTS] = { 48, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4RXMCPKTS] = { 56, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXOCTS] = { 64, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXPKTS] = { 72, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXDISCARD] = { 80, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXTRUNC] = { 88, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXFRAGS] = { 96, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXMCOCTS] = { 104, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6RXMCPKTS] = { 112, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXOCTS] = { 120, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXPKTS] = { 128, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXFRAGS] = { 136, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXMCOCTS] = { 144, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXMCPKTS] = { 152, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6TXOCTS] = { 160, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6TXPKTS] = { 168, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6TXFRAGS] = { 176, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6TXMCOCTS] = { 184, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP6TXMCPKTS] = { 192, 0, 0 }, + [IRDMA_HW_STAT_INDEX_IP4TXNOROUTE] = { 200, 0, 0 }, + 
[IRDMA_HW_STAT_INDEX_IP6TXNOROUTE] = { 208, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TCPRTXSEG] = { 216, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TCPRXOPTERR] = { 224, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TCPRXPROTOERR] = { 232, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TCPTXSEG] = { 240, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TCPRXSEGS] = { 248, 0, 0 }, + [IRDMA_HW_STAT_INDEX_UDPRXPKTS] = { 256, 0, 0 }, + [IRDMA_HW_STAT_INDEX_UDPTXPKTS] = { 264, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMARXWRS] = { 272, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMARXRDS] = { 280, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMARXSNDS] = { 288, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMATXWRS] = { 296, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMATXRDS] = { 304, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMATXSNDS] = { 312, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMAVBND] = { 320, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMAVINV] = { 328, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RXNPECNMARKEDPKTS] = { 336, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RXRPCNPHANDLED] = { 344, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RXRPCNPIGNORED] = { 352, 0, 0 }, + [IRDMA_HW_STAT_INDEX_TXNPCNPSENT] = { 360, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RNR_SENT] = { 368, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RNR_RCVD] = { 376, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMAORDLMTCNT] = { 384, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMAIRDLMTCNT] = { 392, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMARXATS] = { 408, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RDMATXATS] = { 416, 0, 0 }, + [IRDMA_HW_STAT_INDEX_NAKSEQERR] = { 424, 0, 0 }, + [IRDMA_HW_STAT_INDEX_NAKSEQERR_IMPLIED] = { 432, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RTO] = { 440, 0, 0 }, + [IRDMA_HW_STAT_INDEX_RXOOOPKTS] = { 448, 0, 0 }, + [IRDMA_HW_STAT_INDEX_ICRCERR] = { 456, 0, 0 }, +}; + void ig3rdma_init_hw(struct irdma_sc_dev *dev) { dev->irq_ops = &ig3rdma_irq_ops; + dev->hw_stats_map = ig3rdma_hw_stat_map; dev->hw_attrs.uk_attrs.hw_rev = IRDMA_GEN_3; dev->hw_attrs.uk_attrs.max_hw_wq_frags = IG3RDMA_MAX_WQ_FRAGMENT_COUNT; @@ -70,6 +131,8 @@ void ig3rdma_init_hw(struct irdma_sc_dev *dev) dev->hw_attrs.page_size_cap = SZ_4K | SZ_2M | SZ_1G; 
dev->hw_attrs.max_hw_ird = IG3RDMA_MAX_IRD_SIZE; dev->hw_attrs.max_hw_ord = IG3RDMA_MAX_ORD_SIZE; + dev->hw_attrs.max_stat_inst = IG3RDMA_MAX_STATS_COUNT; + dev->hw_attrs.max_stat_idx = IRDMA_HW_STAT_INDEX_MAX_GEN_3; dev->hw_attrs.uk_attrs.min_hw_wq_size = IG3RDMA_MIN_WQ_SIZE; dev->hw_attrs.uk_attrs.max_hw_srq_quanta = IRDMA_SRQ_MAX_QUANTA; dev->hw_attrs.uk_attrs.max_hw_inline = IG3RDMA_MAX_INLINE_DATA_SIZE; diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 0faf9cf..17fc726 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -156,6 +156,21 @@ enum irdma_hw_stats_index { IRDMA_HW_STAT_INDEX_RXRPCNPIGNORED = 44, IRDMA_HW_STAT_INDEX_TXNPCNPSENT = 45, IRDMA_HW_STAT_INDEX_MAX_GEN_2 = 46, + + /* gen3 */ + IRDMA_HW_STAT_INDEX_RNR_SENT = 46, + IRDMA_HW_STAT_INDEX_RNR_RCVD = 47, + IRDMA_HW_STAT_INDEX_RDMAORDLMTCNT = 48, + IRDMA_HW_STAT_INDEX_RDMAIRDLMTCNT = 49, + IRDMA_HW_STAT_INDEX_RDMARXATS = 50, + IRDMA_HW_STAT_INDEX_RDMATXATS = 51, + IRDMA_HW_STAT_INDEX_NAKSEQERR = 52, + IRDMA_HW_STAT_INDEX_NAKSEQERR_IMPLIED = 53, + IRDMA_HW_STAT_INDEX_RTO = 54, + IRDMA_HW_STAT_INDEX_RXOOOPKTS = 55, + IRDMA_HW_STAT_INDEX_ICRCERR = 56, + + IRDMA_HW_STAT_INDEX_MAX_GEN_3 = 57, }; enum irdma_feature_type { @@ -569,7 +584,7 @@ struct irdma_sc_qp { struct irdma_stats_inst_info { bool use_hmc_fcn_index; u8 hmc_fn_id; - u8 stats_idx; + u16 stats_idx; }; struct irdma_up_info { @@ -1027,7 +1042,7 @@ struct irdma_qp_host_ctx_info { u32 send_cq_num; u32 rcv_cq_num; u32 rem_endpoint_idx; - u8 stats_idx; + u16 stats_idx; bool srq_valid:1; bool tcp_info_valid:1; bool iwarp_info_valid:1; diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 89937d4..9a42a88 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -3913,40 +3913,7 @@ static int irdma_req_notify_cq(struct ib_cq *ibcq, return ret; } -static int irdma_roce_port_immutable(struct 
ib_device *ibdev, u32 port_num, - struct ib_port_immutable *immutable) -{ - struct ib_port_attr attr; - int err; - - immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP; - err = ib_query_port(ibdev, port_num, &attr); - if (err) - return err; - - immutable->max_mad_size = IB_MGMT_MAD_SIZE; - immutable->pkey_tbl_len = attr.pkey_tbl_len; - immutable->gid_tbl_len = attr.gid_tbl_len; - - return 0; -} - -static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num, - struct ib_port_immutable *immutable) -{ - struct ib_port_attr attr; - int err; - - immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; - err = ib_query_port(ibdev, port_num, &attr); - if (err) - return err; - immutable->gid_tbl_len = attr.gid_tbl_len; - - return 0; -} - -static const struct rdma_stat_desc irdma_hw_stat_names[] = { +static const struct rdma_stat_desc irdma_hw_stat_descs[] = { /* gen1 - 32-bit */ [IRDMA_HW_STAT_INDEX_IP4RXDISCARD].name = "ip4InDiscards", [IRDMA_HW_STAT_INDEX_IP4RXTRUNC].name = "ip4InTruncatedPkts", @@ -3954,9 +3921,6 @@ static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num, [IRDMA_HW_STAT_INDEX_IP6RXDISCARD].name = "ip6InDiscards", [IRDMA_HW_STAT_INDEX_IP6RXTRUNC].name = "ip6InTruncatedPkts", [IRDMA_HW_STAT_INDEX_IP6TXNOROUTE].name = "ip6OutNoRoutes", - [IRDMA_HW_STAT_INDEX_TCPRTXSEG].name = "tcpRetransSegs", - [IRDMA_HW_STAT_INDEX_TCPRXOPTERR].name = "tcpInOptErrors", - [IRDMA_HW_STAT_INDEX_TCPRXPROTOERR].name = "tcpInProtoErrors", [IRDMA_HW_STAT_INDEX_RXVLANERR].name = "rxVlanErrors", /* gen1 - 64-bit */ [IRDMA_HW_STAT_INDEX_IP4RXOCTS].name = "ip4InOctets", @@ -3975,16 +3939,14 @@ static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num, [IRDMA_HW_STAT_INDEX_IP6TXPKTS].name = "ip6OutPkts", [IRDMA_HW_STAT_INDEX_IP6TXFRAGS].name = "ip6OutSegRqd", [IRDMA_HW_STAT_INDEX_IP6TXMCPKTS].name = "ip6OutMcastPkts", - [IRDMA_HW_STAT_INDEX_TCPRXSEGS].name = "tcpInSegs", - [IRDMA_HW_STAT_INDEX_TCPTXSEG].name = "tcpOutSegs", - 
[IRDMA_HW_STAT_INDEX_RDMARXRDS].name = "iwInRdmaReads", - [IRDMA_HW_STAT_INDEX_RDMARXSNDS].name = "iwInRdmaSends", - [IRDMA_HW_STAT_INDEX_RDMARXWRS].name = "iwInRdmaWrites", - [IRDMA_HW_STAT_INDEX_RDMATXRDS].name = "iwOutRdmaReads", - [IRDMA_HW_STAT_INDEX_RDMATXSNDS].name = "iwOutRdmaSends", - [IRDMA_HW_STAT_INDEX_RDMATXWRS].name = "iwOutRdmaWrites", - [IRDMA_HW_STAT_INDEX_RDMAVBND].name = "iwRdmaBnd", - [IRDMA_HW_STAT_INDEX_RDMAVINV].name = "iwRdmaInv", + [IRDMA_HW_STAT_INDEX_RDMARXRDS].name = "InRdmaReads", + [IRDMA_HW_STAT_INDEX_RDMARXSNDS].name = "InRdmaSends", + [IRDMA_HW_STAT_INDEX_RDMARXWRS].name = "InRdmaWrites", + [IRDMA_HW_STAT_INDEX_RDMATXRDS].name = "OutRdmaReads", + [IRDMA_HW_STAT_INDEX_RDMATXSNDS].name = "OutRdmaSends", + [IRDMA_HW_STAT_INDEX_RDMATXWRS].name = "OutRdmaWrites", + [IRDMA_HW_STAT_INDEX_RDMAVBND].name = "RdmaBnd", + [IRDMA_HW_STAT_INDEX_RDMAVINV].name = "RdmaInv", /* gen2 - 32-bit */ [IRDMA_HW_STAT_INDEX_RXRPCNPHANDLED].name = "cnpHandled", @@ -3998,9 +3960,59 @@ static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num, [IRDMA_HW_STAT_INDEX_UDPRXPKTS].name = "RxUDP", [IRDMA_HW_STAT_INDEX_UDPTXPKTS].name = "TxUDP", [IRDMA_HW_STAT_INDEX_RXNPECNMARKEDPKTS].name = "RxECNMrkd", - + [IRDMA_HW_STAT_INDEX_TCPRTXSEG].name = "RetransSegs", + [IRDMA_HW_STAT_INDEX_TCPRXOPTERR].name = "InOptErrors", + [IRDMA_HW_STAT_INDEX_TCPRXPROTOERR].name = "InProtoErrors", + [IRDMA_HW_STAT_INDEX_TCPRXSEGS].name = "InSegs", + [IRDMA_HW_STAT_INDEX_TCPTXSEG].name = "OutSegs", + + /* gen3 */ + [IRDMA_HW_STAT_INDEX_RNR_SENT].name = "RNR sent", + [IRDMA_HW_STAT_INDEX_RNR_RCVD].name = "RNR received", + [IRDMA_HW_STAT_INDEX_RDMAORDLMTCNT].name = "ord limit count", + [IRDMA_HW_STAT_INDEX_RDMAIRDLMTCNT].name = "ird limit count", + [IRDMA_HW_STAT_INDEX_RDMARXATS].name = "Rx ATS", + [IRDMA_HW_STAT_INDEX_RDMATXATS].name = "Tx ATS", + [IRDMA_HW_STAT_INDEX_NAKSEQERR].name = "Nak Sequence Error", + [IRDMA_HW_STAT_INDEX_NAKSEQERR_IMPLIED].name = "Nak Sequence Error 
Implied", + [IRDMA_HW_STAT_INDEX_RTO].name = "RTO", + [IRDMA_HW_STAT_INDEX_RXOOOPKTS].name = "Rcvd Out of order packets", + [IRDMA_HW_STAT_INDEX_ICRCERR].name = "CRC errors", }; +static int irdma_roce_port_immutable(struct ib_device *ibdev, u32 port_num, + struct ib_port_immutable *immutable) +{ + struct ib_port_attr attr; + int err; + + immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP; + err = ib_query_port(ibdev, port_num, &attr); + if (err) + return err; + + immutable->max_mad_size = IB_MGMT_MAD_SIZE; + immutable->pkey_tbl_len = attr.pkey_tbl_len; + immutable->gid_tbl_len = attr.gid_tbl_len; + + return 0; +} + +static int irdma_iw_port_immutable(struct ib_device *ibdev, u32 port_num, + struct ib_port_immutable *immutable) +{ + struct ib_port_attr attr; + int err; + + immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; + err = ib_query_port(ibdev, port_num, &attr); + if (err) + return err; + immutable->gid_tbl_len = attr.gid_tbl_len; + + return 0; +} + static void irdma_get_dev_fw_str(struct ib_device *dev, char *str) { struct irdma_device *iwdev = to_iwdev(dev); @@ -4024,7 +4036,7 @@ static struct rdma_hw_stats *irdma_alloc_hw_port_stats(struct ib_device *ibdev, int num_counters = dev->hw_attrs.max_stat_idx; unsigned long lifespan = RDMA_HW_STATS_DEFAULT_LIFESPAN; - return rdma_alloc_hw_stats_struct(irdma_hw_stat_names, num_counters, + return rdma_alloc_hw_stats_struct(irdma_hw_stat_descs, num_counters, lifespan); } From patchwork Wed Jul 24 23:39:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741454 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 75EAB1448D8 for ; Wed, 24 Jul 2024 23:40:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864449; cv=none; b=bxUYprX+4mPJvu2wuC4i3ETzpvJHHicP1wWkKDIy1o1mSpuYyqakIdBL5LbsTz94j1+vf2OEisEDkhEJWwzjxhV4FogLvnj0GY0sm94qCbHaCPe9raLRMXUh4Lfx4kGKhfk54BePWTaa/JvtkrUOZnvhtKZ4aVCZjbVpeQ9i1JU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864449; c=relaxed/simple; bh=1JbHvQeQlMfHeYUQOYebKU6bO7marf/qLoZLX7h0NM4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=i3qxz5B1SDaJ4SHaCCIkK3tNxQPTZMCCKX+mPLGbfa/1ZwFzO7bvKE9uQxld5bG5+VjLr6vs0e42xzXtatLdHsKE5tvLjO4uS+ktw0IPxH4b2Su23JcKS/oQHEqrrsmr45RhhHi+BsB/a9a9gt6qLLL03fq2nGd2E+bJWQuJROI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=PfyR2ZzL; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PfyR2ZzL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864448; x=1753400448; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1JbHvQeQlMfHeYUQOYebKU6bO7marf/qLoZLX7h0NM4=; b=PfyR2ZzLQ5VRgRcq3xJt6auzRIifaBSNKlU6r5MaFHE1EOX2LMp9aYMT sUv5LvPJDZqe71Q9Q9tAB3JiYUnMB3umBjIjPUIGNaoylUaS0tIrIbrzs ASk5TW8mjBJfuxXGIZ7UYB7NRW74Htm3wCA9iHPwL3+HR6hJbcWrLu9sO d8c6u0S7zEyH1j8OKb0RT9f5V9o4DU4sOjl/GBXFu+7UzHyPegC43l9HT DPf5bVLIPjz8PFf+ze8dwRzFoOoHIEpzTj8AwvT6sX6Jk66KQklVuupqz +8R01OjolUttVAm2/9mo7b0aeHkJzk3XbSIgljXQ6inWC2yqkARf/ktJB A==; X-CSE-ConnectionGUID: YozypsV3QJmD2zWtCmzWUg== X-CSE-MsgGUID: 
IRkCwvMfSv2PmtDjqxz7FQ== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999782" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999782" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:43 -0700 X-CSE-ConnectionGUID: HpFXINCEQTC1NEvfglR1UQ== X-CSE-MsgGUID: 1vIIHAZ2STGbs1tsb96JLA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426073" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:42 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Tatyana Nikolova Subject: [RFC PATCH 16/25] RDMA/irdma: Introduce GEN3 vPort driver support Date: Wed, 24 Jul 2024 18:39:08 -0500 Message-Id: <20240724233917.704-17-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Mustafa Ismail In the IPU model, a function can host one or more logical network endpoints called vPorts. Each vPort may be associated with either a physical or an internal communication port, and can be RDMA capable. A vPort features a netdev and, if RDMA capable, must have an associated ib_dev. This change introduces a GEN3 auxiliary vPort driver responsible for registering a verbs device for every RDMA-capable vPort. Additionally, the UAPI is updated to prevent the binding of GEN3 devices to older user-space providers. 
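The UAPI gate described above can be sketched in isolation. This is a minimal user-space model, not the driver code: `check_ucontext_compat` and `enum hw_gen` are hypothetical names introduced for illustration, and the generation numbering is an assumption. Only the flag value `1 << 3` and the shape of the condition come from the hunks in this patch.

```c
#include <errno.h>
#include <stdint.h>

/* Flag value taken from the irdma-abi.h hunk in this patch. */
#define IRDMA_SUPPORT_WQE_FORMAT_V2 (1u << 3)

/* Assumption: generation numbering used for illustration only. */
enum hw_gen { GEN_1 = 1, GEN_2 = 2, GEN_3 = 3 };

/*
 * Hypothetical helper: returns 0 when the provider may bind to the device,
 * or -EOPNOTSUPP when a GEN3 device is paired with a provider that did not
 * advertise WQE format v2 in comp_mask.
 */
static int check_ucontext_compat(enum hw_gen hw_rev, uint64_t comp_mask)
{
	if (!(comp_mask & IRDMA_SUPPORT_WQE_FORMAT_V2) && hw_rev >= GEN_3)
		return -EOPNOTSUPP;

	return 0;
}
```

Under this model an older provider, which never sets the new bit, is rejected only on GEN3 hardware and keeps working on earlier generations.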
Signed-off-by: Mustafa Ismail Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ig3rdma_if.c | 110 ++++++++++++++++++++++++++++++- drivers/infiniband/hw/irdma/main.c | 12 ++++ drivers/infiniband/hw/irdma/main.h | 3 + drivers/infiniband/hw/irdma/verbs.c | 12 +++- include/uapi/rdma/irdma-abi.h | 1 + 5 files changed, 135 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ig3rdma_if.c b/drivers/infiniband/hw/irdma/ig3rdma_if.c index 70b1ed3..1e2e41d 100644 --- a/drivers/infiniband/hw/irdma/ig3rdma_if.c +++ b/drivers/infiniband/hw/irdma/ig3rdma_if.c @@ -14,6 +14,23 @@ static void ig3rdma_idc_core_event_handler(struct idc_rdma_core_dev_info *cdev_i } } +static void ig3rdma_idc_vport_event_handler(struct idc_rdma_vport_dev_info *cdev_info, + struct idc_rdma_event *event) +{ + struct irdma_device *iwdev = auxiliary_get_drvdata(cdev_info->adev); + struct irdma_l2params l2params = {}; + + if (*event->type & BIT(IDC_RDMA_EVENT_AFTER_MTU_CHANGE)) { + ibdev_dbg(&iwdev->ibdev, "CLNT: new MTU = %d\n", iwdev->netdev->mtu); + if (iwdev->vsi.mtu != iwdev->netdev->mtu) { + l2params.mtu = iwdev->netdev->mtu; + l2params.mtu_changed = true; + irdma_log_invalid_mtu(l2params.mtu, &iwdev->rf->sc_dev); + irdma_change_l2params(&iwdev->vsi, &l2params); + } + } +} + static int ig3rdma_cfg_regions(struct irdma_hw *hw, struct idc_rdma_core_dev_info *cdev_info) { @@ -168,4 +185,95 @@ struct idc_rdma_core_auxiliary_drv ig3rdma_core_auxiliary_drv = { .remove = ig3rdma_core_remove, }, .event_handler = ig3rdma_idc_core_event_handler, -}; \ No newline at end of file +}; + +static int ig3rdma_vport_probe(struct auxiliary_device *aux_dev, + const struct auxiliary_device_id *id) +{ + struct idc_rdma_vport_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_vport_auxiliary_dev, adev); + struct auxiliary_device *aux_core_dev = idc_adev->vdev_info->core_adev; + struct irdma_pci_f *rf = auxiliary_get_drvdata(aux_core_dev); + struct iidc_rdma_qos_params qos_info 
= {}; + struct irdma_l2params l2params = {}; + struct irdma_device *iwdev; + int err; + + if (!rf) { + WARN_ON_ONCE(1); + return -ENOMEM; + } + iwdev = ib_alloc_device(irdma_device, ibdev); + /* Fill iwdev info */ + iwdev->is_vport = true; + iwdev->rf = rf; + iwdev->vport_id = idc_adev->vdev_info->vport_id; + iwdev->netdev = idc_adev->vdev_info->netdev; + iwdev->init_state = INITIAL_STATE; + iwdev->roce_cwnd = IRDMA_ROCE_CWND_DEFAULT; + iwdev->roce_ackcreds = IRDMA_ROCE_ACKCREDS_DEFAULT; + iwdev->rcv_wnd = IRDMA_CM_DEFAULT_RCV_WND_SCALED; + iwdev->rcv_wscale = IRDMA_CM_DEFAULT_RCV_WND_SCALE; + iwdev->roce_mode = true; + iwdev->push_mode = true; + + l2params.mtu = iwdev->netdev->mtu; + irdma_fill_qos_info(&l2params, &qos_info); + + err = irdma_rt_init_hw(iwdev, &l2params); + if (err) + goto err_rt_init; + + err = irdma_ib_register_device(iwdev); + if (err) + goto err_ibreg; + + auxiliary_set_drvdata(aux_dev, iwdev); + + ibdev_dbg(&iwdev->ibdev, + "INIT: Gen[%d] vport[%d] probe success. dev_name = %s, core_dev_name = %s, netdev=%s\n", + rf->rdma_ver, idc_adev->vdev_info->vport_id, + dev_name(&aux_dev->dev), + dev_name(&idc_adev->vdev_info->core_adev->dev), + netdev_name(idc_adev->vdev_info->netdev)); + + return 0; +err_ibreg: + irdma_rt_deinit_hw(iwdev); +err_rt_init: + ib_dealloc_device(&iwdev->ibdev); + + return err; +} + +static void ig3rdma_vport_remove(struct auxiliary_device *aux_dev) +{ + struct idc_rdma_vport_auxiliary_dev *idc_adev = + container_of(aux_dev, struct idc_rdma_vport_auxiliary_dev, adev); + struct irdma_device *iwdev = auxiliary_get_drvdata(aux_dev); + + ibdev_dbg(&iwdev->ibdev, + "INIT: Gen[%d] dev_name = %s, core_dev_name = %s, netdev=%s\n", + iwdev->rf->rdma_ver, dev_name(&aux_dev->dev), + dev_name(&idc_adev->vdev_info->core_adev->dev), + netdev_name(idc_adev->vdev_info->netdev)); + + irdma_ib_unregister_device(iwdev); +} + +static const struct auxiliary_device_id ig3rdma_vport_auxiliary_id_table[] = { + {.name = "idpf.8086.rdma.vdev", }, + 
{}, +}; + +MODULE_DEVICE_TABLE(auxiliary, ig3rdma_vport_auxiliary_id_table); + +struct idc_rdma_vport_auxiliary_drv ig3rdma_vport_auxiliary_drv = { + .adrv = { + .name = "vdev", + .id_table = ig3rdma_vport_auxiliary_id_table, + .probe = ig3rdma_vport_probe, + .remove = ig3rdma_vport_remove, + }, + .event_handler = ig3rdma_idc_vport_event_handler, +}; diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c index e9524de..4b07b07 100644 --- a/drivers/infiniband/hw/irdma/main.c +++ b/drivers/infiniband/hw/irdma/main.c @@ -129,6 +129,17 @@ static int __init irdma_init_module(void) return ret; } + + ret = auxiliary_driver_register(&ig3rdma_vport_auxiliary_drv.adrv); + if (ret) { + auxiliary_driver_unregister(&ig3rdma_core_auxiliary_drv.adrv); + auxiliary_driver_unregister(&icrdma_core_auxiliary_drv.adrv); + auxiliary_driver_unregister(&i40iw_auxiliary_drv); + pr_err("Failed ig3rdma vport auxiliary_driver_register() ret=%d\n", + ret); + + return ret; + } irdma_register_notifiers(); return 0; @@ -168,6 +179,7 @@ static void __exit irdma_exit_module(void) auxiliary_driver_unregister(&icrdma_core_auxiliary_drv.adrv); auxiliary_driver_unregister(&i40iw_auxiliary_drv); auxiliary_driver_unregister(&ig3rdma_core_auxiliary_drv.adrv); + auxiliary_driver_unregister(&ig3rdma_vport_auxiliary_drv.adrv); } module_init(irdma_init_module); diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 1716933..1dab2ff 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -56,6 +56,7 @@ extern struct auxiliary_driver i40iw_auxiliary_drv; extern struct idc_rdma_core_auxiliary_drv ig3rdma_core_auxiliary_drv; +extern struct idc_rdma_vport_auxiliary_drv ig3rdma_vport_auxiliary_drv; extern struct idc_rdma_core_auxiliary_drv icrdma_core_auxiliary_drv; #define IRDMA_FW_VER_DEFAULT 2 @@ -353,12 +354,14 @@ struct irdma_device { u32 rcv_wnd; u16 mac_ip_table_idx; u16 vsi_num; + u16 vport_id; u8 
rcv_wscale; u8 iw_status; bool roce_mode:1; bool roce_dcqcn_en:1; bool dcb_vlan_mode:1; bool iw_ooo:1; + bool is_vport:1; enum init_completion_state init_state; wait_queue_head_t suspend_wq; diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 9a42a88..bb654f4 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -292,6 +292,10 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx, ucontext->iwdev = iwdev; ucontext->abi_ver = req.userspace_ver; + if (!(req.comp_mask & IRDMA_SUPPORT_WQE_FORMAT_V2) && + uk_attrs->hw_rev >= IRDMA_GEN_3) + return -EOPNOTSUPP; + if (req.comp_mask & IRDMA_ALLOC_UCTX_USE_RAW_ATTR) ucontext->use_raw_attrs = true; @@ -4881,6 +4885,10 @@ void irdma_ib_dealloc_device(struct ib_device *ibdev) struct irdma_device *iwdev = to_iwdev(ibdev); irdma_rt_deinit_hw(iwdev); - irdma_ctrl_deinit_hw(iwdev->rf); - kfree(iwdev->rf); + if (!iwdev->is_vport) { + irdma_ctrl_deinit_hw(iwdev->rf); + if (iwdev->rf->vchnl_wq) + destroy_workqueue(iwdev->rf->vchnl_wq); + kfree(iwdev->rf); + } } diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h index bb18f15..4e42054 100644 --- a/include/uapi/rdma/irdma-abi.h +++ b/include/uapi/rdma/irdma-abi.h @@ -25,6 +25,7 @@ enum irdma_memreg_type { enum { IRDMA_ALLOC_UCTX_USE_RAW_ATTR = 1 << 0, IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE = 1 << 1, + IRDMA_SUPPORT_WQE_FORMAT_V2 = 1 << 3, }; struct irdma_alloc_ucontext_req { From patchwork Wed Jul 24 23:39:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741456 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA9DF143C5B for ; Wed, 24 Jul 2024 23:40:48 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864451; cv=none; b=NzimpmZUB3/socs1Z2h/rjqBYuAf+fg4/vEX1hiliX1xVetW4VlP5ZZlemg9PzLNwyiTrWv/c99YkKY+0EWIC38O/h69upQThLscOrUC8jZXMEzBmoxxN+nQ6ETb7MhMJOjudgJ+KRVp1qF5H0HCRmQc9F8RsGp6BxS8u4TbfU8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864451; c=relaxed/simple; bh=e0EUq8wiP2CvjTMitYpHwmgpSzPGRi6pIa94T94QODo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cV7uS1dc4SO60lPMiNMZHj/WSN+JVQSHgl0x90o/0VrHdys31B5HvznHuXOH0wuZCvm8I+aa3iHre4WUEcd9J61zNsgUlV1KleZ1BrDwP8BRP2cVrDJKWDeaWvGUHi3rC1EgqxjmKhIgbKg/OFAIUAkI/fO+iTH2rw1G5Nu0szU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=h6JgAnDW; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="h6JgAnDW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864449; x=1753400449; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e0EUq8wiP2CvjTMitYpHwmgpSzPGRi6pIa94T94QODo=; b=h6JgAnDWtF7+h8aYnT5oOEGW7yz1EAELRqPU33ef9Rh8J/+jEU4PAj86 FGfxZSGYaaiW58BiuZLS6/AnPD60AX096PfCMtStlCOZcwFNyPySPPc4T LeG0HCp8xwo6/uslDXmbyRBuqxTW0D67UwYYoJqvl0mMcQ09nOV2sf8tk 5vJZAM8IX70NOdhrge05iyNL/Q+8BEHXx8vecWtkLhJTgM07deBLRM+Kf a3qfid8wHwRDa/H81uNQoNJD77QYXeG4SbeOgH2jWuxcrMeQ/YwCngqXS oo1x9wtUYiox1mUc/7I9+R9G8nrrujKWhlZfwnFfCeCJbaeCs+ttm47HK Q==; X-CSE-ConnectionGUID: 
9mlMpzrOTD6/+3RAriwdNg== X-CSE-MsgGUID: Vzq+e5UFSfOCFxOnLfKumw== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999785" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999785" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:44 -0700 X-CSE-ConnectionGUID: qA7ryYrmSFa8RBF/e3jPZA== X-CSE-MsgGUID: i2+gUAVsT5mHcqnXmtf92g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426079" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:43 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 17/25] RDMA/irdma: Add GEN3 virtual QP1 support Date: Wed, 24 Jul 2024 18:39:09 -0500 Message-Id: <20240724233917.704-18-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Shiraz Saleem Add a new RDMA virtual channel op during QP1 creation that allows the Control Plane (CP) to virtualize a regular QP as QP1 on non-default RDMA-capable vPorts. Additionally, the CP will return the Qsets to use on the ib_device of the vPort.
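The QP1 virtualization described above can be illustrated with a simplified model of how a hardware QP id might be chosen for each GSI QP on GEN3. This is a sketch under stated assumptions: `struct fn_state`, `pick_gsi_hw_qp` and `release_gsi_hw_qp` are hypothetical stand-ins, a plain counter replaces the driver's resource bitmap, and the locking and the virtual channel request to the CP are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-function state (struct irdma_pci_f). */
struct fn_state {
	bool hwqp1_rsvd;   /* hardware QP 1 already claimed by a vPort */
	uint32_t next_qp;  /* next free regular QP id; assumed to start above 2 */
	uint32_t max_qp;   /* exclusive upper bound on regular QP ids */
};

/*
 * Pick the hardware QP id backing a GSI QP. Returns 0 and stores the id in
 * *qp_num, or -1 when no regular QP id is left.
 */
static int pick_gsi_hw_qp(struct fn_state *fn, uint32_t *qp_num)
{
	if (!fn->hwqp1_rsvd) {
		/* The first vPort gets the real hardware QP 1. */
		fn->hwqp1_rsvd = true;
		*qp_num = 1;
		return 0;
	}
	if (fn->next_qp >= fn->max_qp)
		return -1;
	/* Later vPorts get a regular QP that the CP treats as QP1. */
	*qp_num = fn->next_qp++;
	return 0;
}

/* Release: mirrors the split between hardware QP 1 and regular ids. */
static void release_gsi_hw_qp(struct fn_state *fn, uint32_t qp_num)
{
	if (qp_num == 1)
		fn->hwqp1_rsvd = false;
	/* A regular id would go back to the allocator; omitted in this model. */
}
```

In the patch itself the verbs-visible `ibqp.qp_num` stays 1 for every GSI QP, while the id chosen this way is the one passed to the CP in the add-vPort request.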
Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 10 +++- drivers/infiniband/hw/irdma/main.h | 1 + drivers/infiniband/hw/irdma/utils.c | 30 ++++++++++-- drivers/infiniband/hw/irdma/verbs.c | 84 ++++++++++++++++++++++++++-------- drivers/infiniband/hw/irdma/virtchnl.c | 52 +++++++++++++++++++++ drivers/infiniband/hw/irdma/virtchnl.h | 19 ++++++++ 6 files changed, 174 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 88eb7a0..4f05d0e 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -74,6 +74,14 @@ static void irdma_set_qos_info(struct irdma_sc_vsi *vsi, { u8 i; + if (vsi->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) { + vsi->qos[i].qs_handle = vsi->dev->qos[i].qs_handle; + vsi->qos[i].valid = true; + } + + return; + } vsi->qos_rel_bw = l2p->vsi_rel_bw; vsi->qos_prio_type = l2p->vsi_prio_type; vsi->dscp_mode = l2p->dscp_mode; @@ -1873,7 +1881,7 @@ void irdma_sc_vsi_init(struct irdma_sc_vsi *vsi, mutex_init(&vsi->qos[i].qos_mutex); INIT_LIST_HEAD(&vsi->qos[i].qplist); } - if (vsi->register_qset) { + if (vsi->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_2) { vsi->dev->ws_add = irdma_ws_add; vsi->dev->ws_remove = irdma_ws_remove; vsi->dev->ws_reset = irdma_ws_reset; diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 1dab2ff..f0196aa 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -260,6 +260,7 @@ struct irdma_pci_f { bool reset:1; bool rsrc_created:1; bool msix_shared:1; + bool hwqp1_rsvd:1; u8 rsrc_profile; u8 *hmc_info_mem; u8 *mem_rsrc; diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index e940d32..894ced3 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -1184,6 +1184,26 @@ static void 
irdma_dealloc_push_page(struct irdma_pci_f *rf, irdma_put_cqp_request(&rf->cqp, cqp_request); } +static void irdma_free_gsi_qp_rsrc(struct irdma_qp *iwqp, u32 qp_num) +{ + struct irdma_device *iwdev = iwqp->iwdev; + struct irdma_pci_f *rf = iwdev->rf; + unsigned long flags; + + if (rf->sc_dev.hw_attrs.uk_attrs.hw_rev < IRDMA_GEN_3) + return; + + irdma_vchnl_req_del_vport(&rf->sc_dev, iwdev->vport_id, qp_num); + + if (qp_num == 1) { + spin_lock_irqsave(&rf->rsrc_lock, flags); + rf->hwqp1_rsvd = false; + spin_unlock_irqrestore(&rf->rsrc_lock, flags); + } else if (qp_num > 2) { + irdma_free_rsrc(rf, rf->allocated_qps, qp_num); + } +} + /** * irdma_free_qp_rsrc - free up memory resources for qp * @iwqp: qp ptr (user or kernel) @@ -1192,7 +1212,7 @@ void irdma_free_qp_rsrc(struct irdma_qp *iwqp) { struct irdma_device *iwdev = iwqp->iwdev; struct irdma_pci_f *rf = iwdev->rf; - u32 qp_num = iwqp->ibqp.qp_num; + u32 qp_num = iwqp->sc_qp.qp_uk.qp_id; irdma_ieq_cleanup_qp(iwdev->vsi.ieq, &iwqp->sc_qp); irdma_dealloc_push_page(rf, &iwqp->sc_qp); @@ -1202,8 +1222,12 @@ void irdma_free_qp_rsrc(struct irdma_qp *iwqp) iwqp->sc_qp.user_pri); } - if (qp_num > 2) - irdma_free_rsrc(rf, rf->allocated_qps, qp_num); + if (iwqp->ibqp.qp_type == IB_QPT_GSI) { + irdma_free_gsi_qp_rsrc(iwqp, qp_num); + } else { + if (qp_num > 2) + irdma_free_rsrc(rf, rf->allocated_qps, qp_num); + } dma_free_coherent(rf->sc_dev.hw->device, iwqp->q2_ctx_mem.size, iwqp->q2_ctx_mem.va, iwqp->q2_ctx_mem.pa); iwqp->q2_ctx_mem.va = NULL; diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index bb654f4..6b48236 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -545,6 +545,9 @@ static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp); irdma_remove_push_mmap_entries(iwqp); + + if (iwqp->sc_qp.qp_uk.qp_id == 1) + iwdev->rf->hwqp1_rsvd = false; 
irdma_free_qp_rsrc(iwqp); return 0; @@ -723,6 +726,7 @@ static int irdma_setup_kmode_qp(struct irdma_device *iwdev, info->rq_pa + (ukinfo->rq_depth * IRDMA_QP_WQE_MIN_SIZE); ukinfo->sq_size = ukinfo->sq_depth >> ukinfo->sq_shift; ukinfo->rq_size = ukinfo->rq_depth >> ukinfo->rq_shift; + ukinfo->qp_id = info->qp_uk_init_info.qp_id; iwqp->max_send_wr = (ukinfo->sq_depth - IRDMA_SQ_RSVD) >> ukinfo->sq_shift; iwqp->max_recv_wr = (ukinfo->rq_depth - IRDMA_RQ_RSVD) >> ukinfo->rq_shift; @@ -779,6 +783,8 @@ static void irdma_roce_fill_and_set_qpctx_info(struct irdma_qp *iwqp, roce_info = &iwqp->roce_info; ether_addr_copy(roce_info->mac_addr, iwdev->netdev->dev_addr); + if (iwqp->ibqp.qp_type == IB_QPT_GSI && iwqp->ibqp.qp_num != 1) + roce_info->is_qp1 = true; roce_info->rd_en = true; roce_info->wr_rdresp_en = true; roce_info->bind_en = true; @@ -868,6 +874,47 @@ static void irdma_flush_worker(struct work_struct *work) irdma_generate_flush_completions(iwqp); } +static int irdma_setup_gsi_qp_rsrc(struct irdma_qp *iwqp, u32 *qp_num) +{ + struct irdma_device *iwdev = iwqp->iwdev; + struct irdma_pci_f *rf = iwdev->rf; + unsigned long flags; + int ret; + + if (rf->rdma_ver <= IRDMA_GEN_2) { + *qp_num = 1; + return 0; + } + + spin_lock_irqsave(&rf->rsrc_lock, flags); + if (!rf->hwqp1_rsvd) { + *qp_num = 1; + rf->hwqp1_rsvd = true; + spin_unlock_irqrestore(&rf->rsrc_lock, flags); + } else { + spin_unlock_irqrestore(&rf->rsrc_lock, flags); + ret = irdma_alloc_rsrc(rf, rf->allocated_qps, rf->max_qp, + qp_num, &rf->next_qp); + if (ret) + return ret; + } + + ret = irdma_vchnl_req_add_vport(&rf->sc_dev, iwdev->vport_id, *qp_num, + (&iwdev->vsi)->qos); + if (ret) { + if (*qp_num != 1) { + irdma_free_rsrc(rf, rf->allocated_qps, *qp_num); + } else { + spin_lock_irqsave(&rf->rsrc_lock, flags); + rf->hwqp1_rsvd = false; + spin_unlock_irqrestore(&rf->rsrc_lock, flags); + } + return ret; + } + + return 0; +} + /** * irdma_create_qp - create qp * @ibqp: ptr of qp @@ -929,16 +976,20 @@ static 
int irdma_create_qp(struct ib_qp *ibqp, init_info.host_ctx = (__le64 *)(init_info.q2 + IRDMA_Q2_BUF_SIZE); init_info.host_ctx_pa = init_info.q2_pa + IRDMA_Q2_BUF_SIZE; - if (init_attr->qp_type == IB_QPT_GSI) - qp_num = 1; - else + if (init_attr->qp_type == IB_QPT_GSI) { + err_code = irdma_setup_gsi_qp_rsrc(iwqp, &qp_num); + if (err_code) + goto error; + iwqp->ibqp.qp_num = 1; + } else { err_code = irdma_alloc_rsrc(rf, rf->allocated_qps, rf->max_qp, &qp_num, &rf->next_qp); - if (err_code) - goto error; + if (err_code) + goto error; + iwqp->ibqp.qp_num = qp_num; + } iwqp->iwpd = iwpd; - iwqp->ibqp.qp_num = qp_num; qp = &iwqp->sc_qp; iwqp->iwscq = to_iwcq(init_attr->send_cq); iwqp->iwrcq = to_iwcq(init_attr->recv_cq); @@ -998,10 +1049,17 @@ static int irdma_create_qp(struct ib_qp *ibqp, ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id; ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id; - if (rdma_protocol_roce(&iwdev->ibdev, 1)) + if (rdma_protocol_roce(&iwdev->ibdev, 1)) { + if (dev->ws_add(&iwdev->vsi, 0)) { + irdma_cqp_qp_destroy_cmd(&rf->sc_dev, &iwqp->sc_qp); + err_code = -EINVAL; + goto error; + } + irdma_qp_add_qos(&iwqp->sc_qp); irdma_roce_fill_and_set_qpctx_info(iwqp, ctx_info); - else + } else { irdma_iw_fill_and_set_qpctx_info(iwqp, ctx_info); + } err_code = irdma_cqp_create_qp_cmd(iwqp); if (err_code) @@ -1013,16 +1071,6 @@ static int irdma_create_qp(struct ib_qp *ibqp, iwqp->sig_all = init_attr->sq_sig_type == IB_SIGNAL_ALL_WR; rf->qp_table[qp_num] = iwqp; - if (rdma_protocol_roce(&iwdev->ibdev, 1)) { - if (dev->ws_add(&iwdev->vsi, 0)) { - irdma_cqp_qp_destroy_cmd(&rf->sc_dev, &iwqp->sc_qp); - err_code = -EINVAL; - goto error; - } - - irdma_qp_add_qos(&iwqp->sc_qp); - } - if (udata) { /* GEN_1 legacy support with libi40iw does not have expanded uresp struct */ if (udata->outlen < sizeof(uresp)) { diff --git a/drivers/infiniband/hw/irdma/virtchnl.c b/drivers/infiniband/hw/irdma/virtchnl.c index fc669b5..9f39cd6 100644 --- 
a/drivers/infiniband/hw/irdma/virtchnl.c +++ b/drivers/infiniband/hw/irdma/virtchnl.c @@ -110,6 +110,8 @@ static int irdma_vchnl_req_verify_resp(struct irdma_vchnl_req *vchnl_req, case IRDMA_VCHNL_OP_GET_REG_LAYOUT: case IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP: case IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP: + case IRDMA_VCHNL_OP_ADD_VPORT: + case IRDMA_VCHNL_OP_DEL_VPORT: break; default: return -EOPNOTSUPP; @@ -315,6 +317,56 @@ int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev) return 0; } +int irdma_vchnl_req_add_vport(struct irdma_sc_dev *dev, u16 vport_id, + u32 qp1_id, struct irdma_qos *qos) +{ + struct irdma_vchnl_resp_vport_info resp_vport = { 0 }; + struct irdma_vchnl_req_vport_info req_vport = { 0 }; + struct irdma_vchnl_req_init_info info = { 0 }; + int ret, i; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_ADD_VPORT; + info.op_ver = IRDMA_VCHNL_OP_ADD_VPORT_V0; + req_vport.vport_id = vport_id; + req_vport.qp1_id = qp1_id; + info.req_parm_len = sizeof(req_vport); + info.req_parm = &req_vport; + info.resp_parm = &resp_vport; + info.resp_parm_len = sizeof(resp_vport); + + ret = irdma_vchnl_req_send_sync(dev, &info); + if (ret) + return ret; + + for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) { + qos[i].qs_handle = resp_vport.qs_handle[i]; + qos[i].valid = true; + } + + return 0; +} + +int irdma_vchnl_req_del_vport(struct irdma_sc_dev *dev, u16 vport_id, u32 qp1_id) +{ + struct irdma_vchnl_req_init_info info = { 0 }; + struct irdma_vchnl_req_vport_info req_vport = { 0 }; + + if (!dev->vchnl_up) + return -EBUSY; + + info.op_code = IRDMA_VCHNL_OP_DEL_VPORT; + info.op_ver = IRDMA_VCHNL_OP_DEL_VPORT_V0; + req_vport.vport_id = vport_id; + req_vport.qp1_id = qp1_id; + info.req_parm_len = sizeof(req_vport); + info.req_parm = &req_vport; + + return irdma_vchnl_req_send_sync(dev, &info); +} + /** * irdma_vchnl_req_aeq_vec_map - Map AEQ to vector on this function * @dev: RDMA device pointer diff --git a/drivers/infiniband/hw/irdma/virtchnl.h 
b/drivers/infiniband/hw/irdma/virtchnl.h index 3af72558..23e66bc 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.h +++ b/drivers/infiniband/hw/irdma/virtchnl.h @@ -17,6 +17,8 @@ #define IRDMA_VCHNL_OP_GET_REG_LAYOUT_V0 0 #define IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP_V0 0 #define IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP_V0 0 +#define IRDMA_VCHNL_OP_ADD_VPORT_V0 0 +#define IRDMA_VCHNL_OP_DEL_VPORT_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_V0 0 #define IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE 1 @@ -57,6 +59,8 @@ enum irdma_vchnl_ops { IRDMA_VCHNL_OP_GET_RDMA_CAPS = 13, IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP = 14, IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP = 15, + IRDMA_VCHNL_OP_ADD_VPORT = 16, + IRDMA_VCHNL_OP_DEL_VPORT = 17, }; struct irdma_vchnl_req_hmc_info { @@ -81,6 +85,15 @@ struct irdma_vchnl_qvlist_info { struct irdma_vchnl_qv_info qv_info[]; }; +struct irdma_vchnl_req_vport_info { + u16 vport_id; + u32 qp1_id; +}; + +struct irdma_vchnl_resp_vport_info { + u16 qs_handle[IRDMA_MAX_USER_PRIORITY]; +}; + struct irdma_vchnl_op_buf { u16 op_code; u16 op_ver; @@ -141,6 +154,8 @@ struct irdma_vchnl_req_init_info { u16 op_ver; } __packed; +struct irdma_qos; + int irdma_sc_vchnl_init(struct irdma_sc_dev *dev, struct irdma_vchnl_init_info *info); int irdma_vchnl_send_sync(struct irdma_sc_dev *dev, u8 *msg, u16 len, @@ -156,4 +171,8 @@ int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev, int irdma_vchnl_req_aeq_vec_map(struct irdma_sc_dev *dev, u32 v_idx); int irdma_vchnl_req_ceq_vec_map(struct irdma_sc_dev *dev, u16 ceq_id, u32 v_idx); +int irdma_vchnl_req_add_vport(struct irdma_sc_dev *dev, u16 vport_id, + u32 qp1_id, struct irdma_qos *qos); +int irdma_vchnl_req_del_vport(struct irdma_sc_dev *dev, u16 vport_id, + u32 qp1_id); #endif /* IRDMA_VIRTCHNL_H */ From patchwork Wed Jul 24 23:39:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741457 Received: from mgamail.intel.com 
(mgamail.intel.com [192.198.163.7]) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5240C1494B9; Wed, 24 Jul 2024 23:40:49 +0000 (UTC) From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 18/25] RDMA/irdma: Extend QP context programming for GEN3 Date: Wed, 24 Jul 2024 18:39:10 -0500 Message-Id: <20240724233917.704-19-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org MIME-Version: 1.0 From: Shiraz Saleem Extend the QP context structure with support for new fields specific to GEN3 hardware capabilities.
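One of the GEN3-specific fields is the encoded IRD size. As a reading aid only, the mapping that irdma_sc_get_encoded_ird_size_gen_3() in this patch appears to implement can be sketched in userspace C as below. The helper names roundup_pow2() and encode_ird_size_gen3() are hypothetical and not part of the driver; the sketch assumes the IRDMA_IRD_HW_SIZE_*_GEN3 values 0 through 10 defined in defs.h by this patch, and it replaces the switch statement with an equivalent loop.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's roundup_pow_of_two(); expects n >= 1. */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical restatement of the GEN3 IRD encoding: double the
 * requested IRD, round up to a power of two, then map the HW sizes
 * 4, 8, ..., 4096 to the codes 0, 1, ..., 10. Any size outside that
 * range falls back to code 0 (HW size 4), mirroring the default case
 * of the switch in the patch.
 */
static uint8_t encode_ird_size_gen3(uint16_t ird_size)
{
	uint32_t hw_size = ird_size ? roundup_pow2(2u * ird_size) : 4;
	uint8_t code = 0;

	if (hw_size < 4 || hw_size > 4096)
		return 0;

	/* code ends up as log2(hw_size) - 2 */
	while ((4u << code) < hw_size)
		code++;
	return code;
}
```

For example, under these assumptions an IRD of 5 doubles to 10, rounds up to 16, and encodes as 2, which corresponds to IRDMA_IRD_HW_SIZE_16_GEN3.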
Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 184 +++++++++++++++++++++++++++++++++++- drivers/infiniband/hw/irdma/defs.h | 24 ++++- drivers/infiniband/hw/irdma/type.h | 4 + drivers/infiniband/hw/irdma/uda_d.h | 5 +- drivers/infiniband/hw/irdma/verbs.c | 5 + 5 files changed, 215 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 4f05d0e..3205385 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -637,13 +637,14 @@ static u8 irdma_sc_get_encoded_ird_size(u16 ird_size) } /** - * irdma_sc_qp_setctx_roce - set qp's context + * irdma_sc_qp_setctx_roce_gen_2 - set qp's context * @qp: sc qp * @qp_ctx: context ptr * @info: ctx info */ -void irdma_sc_qp_setctx_roce(struct irdma_sc_qp *qp, __le64 *qp_ctx, - struct irdma_qp_host_ctx_info *info) +static void irdma_sc_qp_setctx_roce_gen_2(struct irdma_sc_qp *qp, + __le64 *qp_ctx, + struct irdma_qp_host_ctx_info *info) { struct irdma_roce_offload_info *roce_info; struct irdma_udp_offload_info *udp; @@ -761,6 +762,183 @@ void irdma_sc_qp_setctx_roce(struct irdma_sc_qp *qp, __le64 *qp_ctx, 8, qp_ctx, IRDMA_QP_CTX_SIZE, false); } +/** + * irdma_sc_get_encoded_ird_size_gen_3 - get encoded IRD size for GEN 3 + * @ird_size: IRD size + * The ird from the connection is rounded to a supported HW setting and then encoded + * for ird_size field of qp_ctx. Consumers are expected to provide valid ird size based + * on hardware attributes. IRD size defaults to a value of 4 in case of invalid input. + */ +static u8 irdma_sc_get_encoded_ird_size_gen_3(u16 ird_size) +{ + switch (ird_size ? 
+ roundup_pow_of_two(2 * ird_size) : 4) { + case 4096: + return IRDMA_IRD_HW_SIZE_4096_GEN3; + case 2048: + return IRDMA_IRD_HW_SIZE_2048_GEN3; + case 1024: + return IRDMA_IRD_HW_SIZE_1024_GEN3; + case 512: + return IRDMA_IRD_HW_SIZE_512_GEN3; + case 256: + return IRDMA_IRD_HW_SIZE_256_GEN3; + case 128: + return IRDMA_IRD_HW_SIZE_128_GEN3; + case 64: + return IRDMA_IRD_HW_SIZE_64_GEN3; + case 32: + return IRDMA_IRD_HW_SIZE_32_GEN3; + case 16: + return IRDMA_IRD_HW_SIZE_16_GEN3; + case 8: + return IRDMA_IRD_HW_SIZE_8_GEN3; + case 4: + default: + break; + } + + return IRDMA_IRD_HW_SIZE_4_GEN3; +} + +/** + * irdma_sc_qp_setctx_roce_gen_3 - set qp's context + * @qp: sc qp + * @qp_ctx: context ptr + * @info: ctx info + */ +static void irdma_sc_qp_setctx_roce_gen_3(struct irdma_sc_qp *qp, + __le64 *qp_ctx, + struct irdma_qp_host_ctx_info *info) +{ + struct irdma_roce_offload_info *roce_info = info->roce_info; + struct irdma_udp_offload_info *udp = info->udp_info; + u64 qw0, qw3, qw7 = 0, qw8 = 0; + u8 push_mode_en; + u32 push_idx; + + qp->user_pri = info->user_pri; + if (qp->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX) { + push_mode_en = 0; + push_idx = 0; + } else { + push_mode_en = 1; + push_idx = qp->push_idx; + } + + qw0 = FIELD_PREP(IRDMAQPC_RQWQESIZE, qp->qp_uk.rq_wqe_size) | + FIELD_PREP(IRDMAQPC_RCVTPHEN, qp->rcv_tph_en) | + FIELD_PREP(IRDMAQPC_XMITTPHEN, qp->xmit_tph_en) | + FIELD_PREP(IRDMAQPC_RQTPHEN, qp->rq_tph_en) | + FIELD_PREP(IRDMAQPC_SQTPHEN, qp->sq_tph_en) | + FIELD_PREP(IRDMAQPC_PPIDX, push_idx) | + FIELD_PREP(IRDMAQPC_PMENA, push_mode_en) | + FIELD_PREP(IRDMAQPC_DC_TCP_EN, roce_info->dctcp_en) | + FIELD_PREP(IRDMAQPC_ISQP1, roce_info->is_qp1) | + FIELD_PREP(IRDMAQPC_ROCE_TVER, roce_info->roce_tver) | + FIELD_PREP(IRDMAQPC_IPV4, udp->ipv4) | + FIELD_PREP(IRDMAQPC_INSERTVLANTAG, udp->insert_vlan_tag); + set_64bit_val(qp_ctx, 0, qw0); + set_64bit_val(qp_ctx, 8, qp->sq_pa); + set_64bit_val(qp_ctx, 16, qp->rq_pa); + qw3 = FIELD_PREP(IRDMAQPC_RQSIZE, 
qp->hw_rq_size) | + FIELD_PREP(IRDMAQPC_SQSIZE, qp->hw_sq_size) | + FIELD_PREP(IRDMAQPC_TTL, udp->ttl) | + FIELD_PREP(IRDMAQPC_TOS, udp->tos) | + FIELD_PREP(IRDMAQPC_SRCPORTNUM, udp->src_port) | + FIELD_PREP(IRDMAQPC_DESTPORTNUM, udp->dst_port); + set_64bit_val(qp_ctx, 24, qw3); + set_64bit_val(qp_ctx, 32, + FIELD_PREP(IRDMAQPC_DESTIPADDR2, udp->dest_ip_addr[2]) | + FIELD_PREP(IRDMAQPC_DESTIPADDR3, udp->dest_ip_addr[3])); + set_64bit_val(qp_ctx, 40, + FIELD_PREP(IRDMAQPC_DESTIPADDR0, udp->dest_ip_addr[0]) | + FIELD_PREP(IRDMAQPC_DESTIPADDR1, udp->dest_ip_addr[1])); + set_64bit_val(qp_ctx, 48, + FIELD_PREP(IRDMAQPC_SNDMSS, udp->snd_mss) | + FIELD_PREP(IRDMAQPC_VLANTAG, udp->vlan_tag) | + FIELD_PREP(IRDMAQPC_ARPIDX, udp->arp_idx)); + qw7 = FIELD_PREP(IRDMAQPC_PKEY, roce_info->p_key) | + FIELD_PREP(IRDMAQPC_ACKCREDITS, roce_info->ack_credits) | + FIELD_PREP(IRDMAQPC_FLOWLABEL, udp->flow_label); + set_64bit_val(qp_ctx, 56, qw7); + qw8 = FIELD_PREP(IRDMAQPC_QKEY, roce_info->qkey) | + FIELD_PREP(IRDMAQPC_DESTQP, roce_info->dest_qp); + set_64bit_val(qp_ctx, 64, qw8); + set_64bit_val(qp_ctx, 80, + FIELD_PREP(IRDMAQPC_PSNNXT, udp->psn_nxt) | + FIELD_PREP(IRDMAQPC_LSN, udp->lsn)); + set_64bit_val(qp_ctx, 88, + FIELD_PREP(IRDMAQPC_EPSN, udp->epsn)); + set_64bit_val(qp_ctx, 96, + FIELD_PREP(IRDMAQPC_PSNMAX, udp->psn_max) | + FIELD_PREP(IRDMAQPC_PSNUNA, udp->psn_una)); + set_64bit_val(qp_ctx, 112, + FIELD_PREP(IRDMAQPC_CWNDROCE, udp->cwnd)); + set_64bit_val(qp_ctx, 128, + FIELD_PREP(IRDMAQPC_MINRNR_TIMER, udp->min_rnr_timer) | + FIELD_PREP(IRDMAQPC_RNRNAK_THRESH, udp->rnr_nak_thresh) | + FIELD_PREP(IRDMAQPC_REXMIT_THRESH, udp->rexmit_thresh) | + FIELD_PREP(IRDMAQPC_RNRNAK_TMR, udp->rnr_nak_tmr) | + FIELD_PREP(IRDMAQPC_RTOMIN, roce_info->rtomin)); + set_64bit_val(qp_ctx, 136, + FIELD_PREP(IRDMAQPC_TXCQNUM, info->send_cq_num) | + FIELD_PREP(IRDMAQPC_RXCQNUM, info->rcv_cq_num)); + set_64bit_val(qp_ctx, 152, + FIELD_PREP(IRDMAQPC_MACADDRESS, + 
ether_addr_to_u64(roce_info->mac_addr)) | + FIELD_PREP(IRDMAQPC_LOCALACKTIMEOUT, + roce_info->local_ack_timeout)); + set_64bit_val(qp_ctx, 160, + FIELD_PREP(IRDMAQPC_ORDSIZE_GEN3, roce_info->ord_size) | + FIELD_PREP(IRDMAQPC_IRDSIZE_GEN3, + irdma_sc_get_encoded_ird_size_gen_3(roce_info->ird_size)) | + FIELD_PREP(IRDMAQPC_WRRDRSPOK, roce_info->wr_rdresp_en) | + FIELD_PREP(IRDMAQPC_RDOK, roce_info->rd_en) | + FIELD_PREP(IRDMAQPC_USESTATSINSTANCE, + info->stats_idx_valid) | + FIELD_PREP(IRDMAQPC_BINDEN, roce_info->bind_en) | + FIELD_PREP(IRDMAQPC_FASTREGEN, roce_info->fast_reg_en) | + FIELD_PREP(IRDMAQPC_DCQCNENABLE, roce_info->dcqcn_en) | + FIELD_PREP(IRDMAQPC_RCVNOICRC, roce_info->rcv_no_icrc) | + FIELD_PREP(IRDMAQPC_FW_CC_ENABLE, + roce_info->fw_cc_enable) | + FIELD_PREP(IRDMAQPC_UDPRIVCQENABLE, + roce_info->udprivcq_en) | + FIELD_PREP(IRDMAQPC_PRIVEN, roce_info->priv_mode_en) | + FIELD_PREP(IRDMAQPC_TIMELYENABLE, roce_info->timely_en)); + set_64bit_val(qp_ctx, 168, + FIELD_PREP(IRDMAQPC_QPCOMPCTX, info->qp_compl_ctx)); + set_64bit_val(qp_ctx, 176, + FIELD_PREP(IRDMAQPC_SQTPHVAL, qp->sq_tph_val) | + FIELD_PREP(IRDMAQPC_RQTPHVAL, qp->rq_tph_val) | + FIELD_PREP(IRDMAQPC_QSHANDLE, qp->qs_handle)); + set_64bit_val(qp_ctx, 184, + FIELD_PREP(IRDMAQPC_LOCAL_IPADDR3, udp->local_ipaddr[3]) | + FIELD_PREP(IRDMAQPC_LOCAL_IPADDR2, udp->local_ipaddr[2])); + set_64bit_val(qp_ctx, 192, + FIELD_PREP(IRDMAQPC_LOCAL_IPADDR1, udp->local_ipaddr[1]) | + FIELD_PREP(IRDMAQPC_LOCAL_IPADDR0, udp->local_ipaddr[0])); + set_64bit_val(qp_ctx, 200, + FIELD_PREP(IRDMAQPC_THIGH, roce_info->t_high) | + FIELD_PREP(IRDMAQPC_TLOW, roce_info->t_low)); + set_64bit_val(qp_ctx, 208, roce_info->pd_id | + FIELD_PREP(IRDMAQPC_STAT_INDEX_GEN3, info->stats_idx) | + FIELD_PREP(IRDMAQPC_PKT_LIMIT, qp->pkt_limit)); + + print_hex_dump_debug("WQE: QP_HOST ROCE CTX WQE", DUMP_PREFIX_OFFSET, + 16, 8, qp_ctx, IRDMA_QP_CTX_SIZE, false); +} + +void irdma_sc_qp_setctx_roce(struct irdma_sc_qp *qp, __le64 *qp_ctx, + 
struct irdma_qp_host_ctx_info *info) +{ + if (qp->dev->hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_2) + irdma_sc_qp_setctx_roce_gen_2(qp, qp_ctx, info); + else + irdma_sc_qp_setctx_roce_gen_3(qp, qp_ctx, info); +} + /* irdma_sc_alloc_local_mac_entry - allocate a mac entry * @cqp: struct for cqp hw * @scratch: u64 saved to be used during cqp completion diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 492529a..b548490 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -14,6 +14,18 @@ #define IRDMA_PE_DB_SIZE_4M 1 #define IRDMA_PE_DB_SIZE_8M 2 +#define IRDMA_IRD_HW_SIZE_4_GEN3 0 +#define IRDMA_IRD_HW_SIZE_8_GEN3 1 +#define IRDMA_IRD_HW_SIZE_16_GEN3 2 +#define IRDMA_IRD_HW_SIZE_32_GEN3 3 +#define IRDMA_IRD_HW_SIZE_64_GEN3 4 +#define IRDMA_IRD_HW_SIZE_128_GEN3 5 +#define IRDMA_IRD_HW_SIZE_256_GEN3 6 +#define IRDMA_IRD_HW_SIZE_512_GEN3 7 +#define IRDMA_IRD_HW_SIZE_1024_GEN3 8 +#define IRDMA_IRD_HW_SIZE_2048_GEN3 9 +#define IRDMA_IRD_HW_SIZE_4096_GEN3 10 + #define IRDMA_IRD_HW_SIZE_4 0 #define IRDMA_IRD_HW_SIZE_16 1 #define IRDMA_IRD_HW_SIZE_64 2 @@ -843,7 +855,8 @@ enum irdma_cqp_op_type { #define IRDMAQPC_CWNDROCE GENMASK_ULL(55, 32) #define IRDMAQPC_SNDWL1 GENMASK_ULL(31, 0) #define IRDMAQPC_SNDWL2 GENMASK_ULL(63, 32) -#define IRDMAQPC_ERR_RQ_IDX GENMASK_ULL(45, 32) +#define IRDMAQPC_MINRNR_TIMER GENMASK_ULL(4, 0) +#define IRDMAQPC_ERR_RQ_IDX GENMASK_ULL(46, 32) #define IRDMAQPC_RTOMIN GENMASK_ULL(63, 57) #define IRDMAQPC_MAXSNDWND GENMASK_ULL(31, 0) #define IRDMAQPC_REXMIT_THRESH GENMASK_ULL(53, 48) @@ -856,8 +869,17 @@ enum irdma_cqp_op_type { #define IRDMAQPC_MACADDRESS GENMASK_ULL(63, 16) #define IRDMAQPC_ORDSIZE GENMASK_ULL(7, 0) +#define IRDMAQPC_LOCALACKTIMEOUT GENMASK_ULL(12, 8) +#define IRDMAQPC_RNRNAK_TMR GENMASK_ULL(4, 0) +#define IRDMAQPC_ORDSIZE_GEN3 GENMASK_ULL(10, 0) +#define IRDMAQPC_REMOTE_ATOMIC_EN BIT_ULL(18) +#define IRDMAQPC_STAT_INDEX_GEN3 GENMASK_ULL(47, 32) +#define 
IRDMAQPC_PKT_LIMIT GENMASK_ULL(55, 48) + #define IRDMAQPC_IRDSIZE GENMASK_ULL(18, 16) +#define IRDMAQPC_IRDSIZE_GEN3 GENMASK_ULL(17, 14) + #define IRDMAQPC_UDPRIVCQENABLE BIT_ULL(19) #define IRDMAQPC_WRRDRSPOK BIT_ULL(20) #define IRDMAQPC_RDOK BIT_ULL(21) diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 17fc726..2432104 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -574,6 +574,7 @@ struct irdma_sc_qp { bool flush_rq:1; bool sq_flush_code:1; bool rq_flush_code:1; + u32 pkt_limit; enum irdma_flush_opcode flush_code; enum irdma_qp_event_type event_type; u8 term_flags; @@ -915,6 +916,8 @@ struct irdma_udp_offload_info { u32 cwnd; u8 rexmit_thresh; u8 rnr_nak_thresh; + u8 rnr_nak_tmr; + u8 min_rnr_timer; }; struct irdma_roce_offload_info { @@ -941,6 +944,7 @@ struct irdma_roce_offload_info { bool dctcp_en:1; bool fw_cc_enable:1; bool use_stats_inst:1; + u8 local_ack_timeout; u16 t_high; u16 t_low; u8 last_byte_sent; diff --git a/drivers/infiniband/hw/irdma/uda_d.h b/drivers/infiniband/hw/irdma/uda_d.h index 5a9e6ea..4fb4daa 100644 --- a/drivers/infiniband/hw/irdma/uda_d.h +++ b/drivers/infiniband/hw/irdma/uda_d.h @@ -78,8 +78,7 @@ #define IRDMA_UDAQPC_IPID GENMASK_ULL(47, 32) #define IRDMA_UDAQPC_SNDMSS GENMASK_ULL(29, 16) #define IRDMA_UDAQPC_VLANTAG GENMASK_ULL(15, 0) - -#define IRDMA_UDA_CQPSQ_MAV_PDINDEXHI GENMASK_ULL(21, 20) +#define IRDMA_UDA_CQPSQ_MAV_PDINDEXHI GENMASK_ULL(27, 20) #define IRDMA_UDA_CQPSQ_MAV_PDINDEXLO GENMASK_ULL(63, 48) #define IRDMA_UDA_CQPSQ_MAV_SRCMACADDRINDEX GENMASK_ULL(29, 24) #define IRDMA_UDA_CQPSQ_MAV_ARPINDEX GENMASK_ULL(63, 48) @@ -94,7 +93,7 @@ #define IRDMA_UDA_CQPSQ_MAV_OPCODE GENMASK_ULL(37, 32) #define IRDMA_UDA_CQPSQ_MAV_DOLOOPBACKK BIT_ULL(62) #define IRDMA_UDA_CQPSQ_MAV_IPV4VALID BIT_ULL(59) -#define IRDMA_UDA_CQPSQ_MAV_AVIDX GENMASK_ULL(16, 0) +#define IRDMA_UDA_CQPSQ_MAV_AVIDX GENMASK_ULL(23, 0) #define IRDMA_UDA_CQPSQ_MAV_INSERTVLANTAG 
BIT_ULL(60) #define IRDMA_UDA_MGCTX_VFFLAG BIT_ULL(29) #define IRDMA_UDA_MGCTX_DESTPORT GENMASK_ULL(47, 32) diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 6b48236..70652ab 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -1162,6 +1162,7 @@ static int irdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, attr->pkey_index = iwqp->roce_info.p_key; attr->retry_cnt = iwqp->udp_info.rexmit_thresh; attr->rnr_retry = iwqp->udp_info.rnr_nak_thresh; + attr->min_rnr_timer = iwqp->udp_info.min_rnr_timer; attr->max_rd_atomic = iwqp->roce_info.ord_size; attr->max_dest_rd_atomic = iwqp->roce_info.ird_size; } @@ -1294,6 +1295,10 @@ int irdma_modify_qp_roce(struct ib_qp *ibqp, struct ib_qp_attr *attr, if (attr_mask & IB_QP_RNR_RETRY) udp_info->rnr_nak_thresh = attr->rnr_retry; + if (attr_mask & IB_QP_MIN_RNR_TIMER && + dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + udp_info->min_rnr_timer = attr->min_rnr_timer; + if (attr_mask & IB_QP_RETRY_CNT) udp_info->rexmit_thresh = attr->retry_cnt; From patchwork Wed Jul 24 23:39:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741458 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABA2B1494CB for ; Wed, 24 Jul 2024 23:40:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864452; cv=none; b=uVbYW6xbcdcc1T7u+MREfBzodIVaMKYjLzuPvCfh/M0pniRnX+t8AsO22zuYL3cs17t+WHPPsW9PiFI2f9kz8NwpXQW2b8Z7xIRbZbOdz7z1PruP2UHo3SbLKlQDc2kORCVFcdShHc9HcsDSJvokFKTmUy0/X1SBHegnenguCAY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; 
X-CSE-MsgGUID: wZ9/miywRCa4jmPpAZOgUw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426092" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:44 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 19/25] RDMA/irdma: Support 64-byte CQEs and GEN3 CQE opcode decoding Date: Wed, 24 Jul 2024 18:39:11 -0500 Message-Id: <20240724233917.704-20-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Shiraz Saleem Introduce support for 64-byte CQEs in GEN3 devices. Additionally, implement GEN3-specific CQE opcode decoding. Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/verbs.c | 19 +++++++++++++++---- drivers/infiniband/hw/irdma/verbs.h | 13 +++++++++++++ 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 70652ab..593dbbd 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -2114,6 +2114,7 @@ static int irdma_create_cq(struct ib_cq *ibcq, unsigned long flags; int err_code; int entries = attr->cqe; + bool cqe_64byte_ena; err_code = cq_validate_flags(attr->flags, dev->hw_attrs.uk_attrs.hw_rev); if (err_code) @@ -2137,6 +2138,9 @@ static int irdma_create_cq(struct ib_cq *ibcq, info.dev = dev; ukinfo->cq_size = max(entries, 4); ukinfo->cq_id = cq_num; + cqe_64byte_ena = dev->hw_attrs.uk_attrs.feature_flags & IRDMA_FEATURE_64_BYTE_CQE ? 
+ true : false; + ukinfo->avoid_mem_cflct = cqe_64byte_ena; iwcq->ibcq.cqe = info.cq_uk_init_info.cq_size; if (attr->comp_vector < rf->ceqs_count) info.ceq_id = attr->comp_vector; @@ -2212,11 +2216,14 @@ static int irdma_create_cq(struct ib_cq *ibcq, } entries++; - if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) + if (!cqe_64byte_ena && dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) entries *= 2; ukinfo->cq_size = entries; - rsize = info.cq_uk_init_info.cq_size * sizeof(struct irdma_cqe); + if (cqe_64byte_ena) + rsize = info.cq_uk_init_info.cq_size * sizeof(struct irdma_extended_cqe); + else + rsize = info.cq_uk_init_info.cq_size * sizeof(struct irdma_cqe); iwcq->kmem.size = ALIGN(round_up(rsize, 256), 256); iwcq->kmem.va = dma_alloc_coherent(dev->hw->device, iwcq->kmem.size, @@ -3774,8 +3781,12 @@ static void irdma_process_cqe(struct ib_wc *entry, if (cq_poll_info->q_type == IRDMA_CQE_QTYPE_SQ) { set_ib_wc_op_sq(cq_poll_info, entry); } else { - set_ib_wc_op_rq(cq_poll_info, entry, - qp->qp_uk.qp_caps & IRDMA_SEND_WITH_IMM); + if (qp->dev->hw_attrs.uk_attrs.hw_rev <= IRDMA_GEN_2) + set_ib_wc_op_rq(cq_poll_info, entry, + qp->qp_uk.qp_caps & IRDMA_SEND_WITH_IMM ? 
+ true : false); + else + set_ib_wc_op_rq_gen_3(cq_poll_info, entry); if (qp->qp_uk.qp_type != IRDMA_QP_TYPE_ROCE_UD && cq_poll_info->stag_invalid_set) { entry->ex.invalidate_rkey = cq_poll_info->inv_stag; diff --git a/drivers/infiniband/hw/irdma/verbs.h b/drivers/infiniband/hw/irdma/verbs.h index cfa140b..fcb163c 100644 --- a/drivers/infiniband/hw/irdma/verbs.h +++ b/drivers/infiniband/hw/irdma/verbs.h @@ -267,6 +267,19 @@ static inline void set_ib_wc_op_sq(struct irdma_cq_poll_info *cq_poll_info, } } +static inline void set_ib_wc_op_rq_gen_3(struct irdma_cq_poll_info *info, + struct ib_wc *entry) +{ + switch (info->op_type) { + case IRDMA_OP_TYPE_RDMA_WRITE: + case IRDMA_OP_TYPE_RDMA_WRITE_SOL: + entry->opcode = IB_WC_RECV_RDMA_WITH_IMM; + break; + default: + entry->opcode = IB_WC_RECV; + } +} + static inline void set_ib_wc_op_rq(struct irdma_cq_poll_info *cq_poll_info, struct ib_wc *entry, bool send_imm_support) { From patchwork Wed Jul 24 23:39:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741461 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F90E1494BC for ; Wed, 24 Jul 2024 23:40:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864454; cv=none; b=oIDSwdySJ8WuWNVQA2R7lURQSwJcTPQjKvfEY1FVCFA6luNR3GF0yW67MymKn3zy4ol5IY9M2vBN8Mpi/B4jQufM4pS7vN0hB1MrTEXHnGqz2CGI/O9sfaYzMevV+O23V/6LVaFAIOaA8FhC8+/64OSLViQE2d4ByF2lP9jRkU8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864454; c=relaxed/simple; bh=mHsX72t3vDENpJTp2cPRFdvzy+QE7/dwZRiGCfhPmw4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: 
MIME-Version; Received: from
tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:45 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Faisal Latif , Tatyana Nikolova Subject: [RFC PATCH 20/25] RDMA/irdma: Add SRQ support Date: Wed, 24 Jul 2024 18:39:12 -0500 Message-Id: <20240724233917.704-21-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Faisal Latif Implement verb API and UAPI changes to support SRQ functionality in GEN3 devices. Signed-off-by: Faisal Latif Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 237 +++++++++++++++++- drivers/infiniband/hw/irdma/defs.h | 36 ++- drivers/infiniband/hw/irdma/hw.c | 22 +- drivers/infiniband/hw/irdma/irdma.h | 1 + drivers/infiniband/hw/irdma/main.h | 12 +- drivers/infiniband/hw/irdma/type.h | 66 +++++ drivers/infiniband/hw/irdma/uk.c | 162 +++++++++++- drivers/infiniband/hw/irdma/user.h | 42 +++- drivers/infiniband/hw/irdma/utils.c | 27 ++ drivers/infiniband/hw/irdma/verbs.c | 478 +++++++++++++++++++++++++++++++++++- drivers/infiniband/hw/irdma/verbs.h | 25 ++ include/uapi/rdma/irdma-abi.h | 15 +- 12 files changed, 1103 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 3205385..d7165bd 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -412,7 +412,8 @@ int irdma_sc_qp_init(struct irdma_sc_qp *qp, struct irdma_qp_init_info *info) pble_obj_cnt = info->pd->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_PBLE].cnt; if ((info->virtual_map && info->sq_pa >= pble_obj_cnt) || - (info->virtual_map && 
info->rq_pa >= pble_obj_cnt)) + (!info->qp_uk_init_info.srq_uk && + info->virtual_map && info->rq_pa >= pble_obj_cnt)) return -EINVAL; qp->llp_stream_handle = (void *)(-1); @@ -447,6 +448,208 @@ int irdma_sc_qp_init(struct irdma_sc_qp *qp, struct irdma_qp_init_info *info) } /** + * irdma_sc_srq_init - init sc_srq structure + * @srq: srq sc struct + * @info: parameters for srq init + */ +int irdma_sc_srq_init(struct irdma_sc_srq *srq, + struct irdma_srq_init_info *info) +{ + u32 srq_size_quanta; + int ret_code; + + ret_code = irdma_uk_srq_init(&srq->srq_uk, &info->srq_uk_init_info); + if (ret_code) + return ret_code; + + srq->dev = info->pd->dev; + srq->pd = info->pd; + srq->vsi = info->vsi; + srq->srq_pa = info->srq_pa; + srq->first_pm_pbl_idx = info->first_pm_pbl_idx; + srq->pasid = info->pasid; + srq->pasid_valid = info->pasid_valid; + srq->srq_limit = info->srq_limit; + srq->leaf_pbl_size = info->leaf_pbl_size; + srq->virtual_map = info->virtual_map; + srq->tph_en = info->tph_en; + srq->arm_limit_event = info->arm_limit_event; + srq->tph_val = info->tph_value; + srq->shadow_area_pa = info->shadow_area_pa; + + /* Smallest SRQ size is 256B i.e. 
8 quanta */ + srq_size_quanta = max((u32)IRDMA_SRQ_MIN_QUANTA, + srq->srq_uk.srq_size * + srq->srq_uk.wqe_size_multiplier); + srq->hw_srq_size = irdma_get_encoded_wqe_size(srq_size_quanta, + IRDMA_QUEUE_TYPE_SRQ); + + return 0; +} + +/** + * irdma_sc_srq_create - send srq create CQP WQE + * @srq: srq sc struct + * @scratch: u64 saved to be used during cqp completion + * @post_sq: flag for cqp db to ring + */ +static int irdma_sc_srq_create(struct irdma_sc_srq *srq, u64 scratch, + bool post_sq) +{ + struct irdma_sc_cqp *cqp; + __le64 *wqe; + u64 hdr; + + cqp = srq->pd->dev->cqp; + if (srq->srq_uk.srq_id < cqp->dev->hw_attrs.min_hw_srq_id || + srq->srq_uk.srq_id > + (cqp->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_SRQ].max_cnt - 1)) + return -EINVAL; + + wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch); + if (!wqe) + return -ENOMEM; + + set_64bit_val(wqe, 0, + FIELD_PREP(IRDMA_CQPSQ_SRQ_SRQ_LIMIT, srq->srq_limit) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_RQSIZE, srq->hw_srq_size) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_RQ_WQE_SIZE, srq->srq_uk.wqe_size)); + set_64bit_val(wqe, 8, (uintptr_t)srq); + set_64bit_val(wqe, 16, + FIELD_PREP(IRDMA_CQPSQ_SRQ_PD_ID, srq->pd->pd_id)); + set_64bit_val(wqe, 32, + FIELD_PREP(IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR, + srq->srq_pa >> + IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR_S)); + set_64bit_val(wqe, 40, + FIELD_PREP(IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR, + srq->shadow_area_pa >> + IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR_S)); + set_64bit_val(wqe, 48, + FIELD_PREP(IRDMA_CQPSQ_SRQ_FIRST_PM_PBL_IDX, + srq->first_pm_pbl_idx)); + + hdr = srq->srq_uk.srq_id | + FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_CREATE_SRQ) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_LEAF_PBL_SIZE, srq->leaf_pbl_size) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_VIRTMAP, srq->virtual_map) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_ARM_LIMIT_EVENT, + srq->arm_limit_event) | + FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); + + dma_wmb(); /* make sure WQE is written before valid bit is set */ + + set_64bit_val(wqe, 24, hdr); + + 
print_hex_dump_debug("WQE: SRQ_CREATE WQE", DUMP_PREFIX_OFFSET, 16, 8, + wqe, IRDMA_CQP_WQE_SIZE * 8, false); + if (post_sq) + irdma_sc_cqp_post_sq(cqp); + + return 0; +} + +/** + * irdma_sc_srq_modify - send modify_srq CQP WQE + * @srq: srq sc struct + * @info: parameters for srq modification + * @scratch: u64 saved to be used during cqp completion + * @post_sq: flag for cqp db to ring + */ +static int irdma_sc_srq_modify(struct irdma_sc_srq *srq, + struct irdma_modify_srq_info *info, u64 scratch, + bool post_sq) +{ + struct irdma_sc_cqp *cqp; + __le64 *wqe; + u64 hdr; + + cqp = srq->dev->cqp; + if (srq->srq_uk.srq_id < cqp->dev->hw_attrs.min_hw_srq_id || + srq->srq_uk.srq_id > + (cqp->dev->hmc_info->hmc_obj[IRDMA_HMC_IW_SRQ].max_cnt - 1)) + return -EINVAL; + + wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch); + if (!wqe) + return -ENOMEM; + + set_64bit_val(wqe, 0, + FIELD_PREP(IRDMA_CQPSQ_SRQ_SRQ_LIMIT, info->srq_limit) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_RQSIZE, srq->hw_srq_size) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_RQ_WQE_SIZE, srq->srq_uk.wqe_size)); + set_64bit_val(wqe, 8, + FIELD_PREP(IRDMA_CQPSQ_SRQ_SRQCTX, srq->srq_uk.srq_id)); + set_64bit_val(wqe, 16, + FIELD_PREP(IRDMA_CQPSQ_SRQ_PD_ID, srq->pd->pd_id)); + set_64bit_val(wqe, 32, + FIELD_PREP(IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR, + srq->srq_pa >> + IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR_S)); + set_64bit_val(wqe, 40, + FIELD_PREP(IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR, + srq->shadow_area_pa >> + IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR_S)); + set_64bit_val(wqe, 48, + FIELD_PREP(IRDMA_CQPSQ_SRQ_FIRST_PM_PBL_IDX, + srq->first_pm_pbl_idx)); + + hdr = srq->srq_uk.srq_id | + FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_MODIFY_SRQ) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_LEAF_PBL_SIZE, srq->leaf_pbl_size) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_VIRTMAP, srq->virtual_map) | + FIELD_PREP(IRDMA_CQPSQ_SRQ_ARM_LIMIT_EVENT, + info->arm_limit_event) | + FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); + dma_wmb(); /* make sure WQE is written before valid bit is set 
*/ + + set_64bit_val(wqe, 24, hdr); + + print_hex_dump_debug("WQE: SRQ_MODIFY WQE", DUMP_PREFIX_OFFSET, 16, 8, + wqe, IRDMA_CQP_WQE_SIZE * 8, false); + if (post_sq) + irdma_sc_cqp_post_sq(cqp); + + return 0; +} + +/** + * irdma_sc_srq_destroy - send srq_destroy CQP WQE + * @srq: srq sc struct + * @scratch: u64 saved to be used during cqp completion + * @post_sq: flag for cqp db to ring + */ +static int irdma_sc_srq_destroy(struct irdma_sc_srq *srq, u64 scratch, + bool post_sq) +{ + struct irdma_sc_cqp *cqp; + __le64 *wqe; + u64 hdr; + + cqp = srq->dev->cqp; + + wqe = irdma_sc_cqp_get_next_send_wqe(cqp, scratch); + if (!wqe) + return -ENOMEM; + + set_64bit_val(wqe, 8, (uintptr_t)srq); + + hdr = srq->srq_uk.srq_id | + FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_DESTROY_SRQ) | + FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); + dma_wmb(); /* make sure WQE is written before valid bit is set */ + + set_64bit_val(wqe, 24, hdr); + + print_hex_dump_debug("WQE: SRQ_DESTROY WQE", DUMP_PREFIX_OFFSET, 16, + 8, wqe, IRDMA_CQP_WQE_SIZE * 8, false); + if (post_sq) + irdma_sc_cqp_post_sq(cqp); + + return 0; +} + +/** * irdma_sc_qp_create - create qp * @qp: sc qp * @info: qp create info @@ -837,6 +1040,7 @@ static void irdma_sc_qp_setctx_roce_gen_3(struct irdma_sc_qp *qp, FIELD_PREP(IRDMAQPC_ISQP1, roce_info->is_qp1) | FIELD_PREP(IRDMAQPC_ROCE_TVER, roce_info->roce_tver) | FIELD_PREP(IRDMAQPC_IPV4, udp->ipv4) | + FIELD_PREP(IRDMAQPC_USE_SRQ, !qp->qp_uk.srq_uk ? 0 : 1) | FIELD_PREP(IRDMAQPC_INSERTVLANTAG, udp->insert_vlan_tag); set_64bit_val(qp_ctx, 0, qw0); set_64bit_val(qp_ctx, 8, qp->sq_pa); @@ -921,6 +1125,9 @@ static void irdma_sc_qp_setctx_roce_gen_3(struct irdma_sc_qp *qp, FIELD_PREP(IRDMAQPC_LOCAL_IPADDR0, udp->local_ipaddr[0])); set_64bit_val(qp_ctx, 200, FIELD_PREP(IRDMAQPC_THIGH, roce_info->t_high) | + FIELD_PREP(IRDMAQPC_SRQ_ID, + !qp->qp_uk.srq_uk ? 
+ 0 : qp->qp_uk.srq_uk->srq_id) | FIELD_PREP(IRDMAQPC_TLOW, roce_info->t_low)); set_64bit_val(qp_ctx, 208, roce_info->pd_id | FIELD_PREP(IRDMAQPC_STAT_INDEX_GEN3, info->stats_idx) | @@ -2215,6 +2422,14 @@ u8 irdma_get_encoded_wqe_size(u32 wqsize, enum irdma_queue_type queue_type) { u8 encoded_size = 0; + if (queue_type == IRDMA_QUEUE_TYPE_SRQ) { + /* Smallest SRQ size is 256B (8 quanta) that gets + * encoded to 0. + */ + encoded_size = ilog2(wqsize) - 3; + + return encoded_size; + } /* cqp sq's hw coded value starts from 1 for size of 4 * while it starts from 0 for qp' wq's. */ @@ -5464,13 +5679,12 @@ static int irdma_set_loc_hmc_rsrc_gen_3(struct irdma_sc_dev *dev, struct irdma_hmc_fpm_misc *hmc_fpm_misc; u32 xf_cnt, timer_cnt, pages_needed; struct irdma_hmc_info *hmc_info; - u32 ird, ord, min_ird; + u32 ird, ord; hmc_info = dev->hmc_info; hmc_fpm_misc = &dev->hmc_fpm_misc; ird = dev->hw_attrs.max_hw_ird; ord = dev->hw_attrs.max_hw_ord; - min_ird = IRDMA_MIN_IRD; hmc_info->hmc_obj[IRDMA_HMC_IW_HDR].cnt = qpwanted; hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt = qpwanted; @@ -6052,6 +6266,22 @@ static int irdma_exec_cqp_cmd(struct irdma_sc_dev *dev, &pcmdinfo->in.u.mc_modify.info, pcmdinfo->in.u.mc_modify.scratch); break; + case IRDMA_OP_SRQ_CREATE: + status = irdma_sc_srq_create(pcmdinfo->in.u.srq_create.srq, + pcmdinfo->in.u.srq_create.scratch, + pcmdinfo->post_sq); + break; + case IRDMA_OP_SRQ_MODIFY: + status = irdma_sc_srq_modify(pcmdinfo->in.u.srq_modify.srq, + &pcmdinfo->in.u.srq_modify.info, + pcmdinfo->in.u.srq_modify.scratch, + pcmdinfo->post_sq); + break; + case IRDMA_OP_SRQ_DESTROY: + status = irdma_sc_srq_destroy(pcmdinfo->in.u.srq_destroy.srq, + pcmdinfo->in.u.srq_destroy.scratch, + pcmdinfo->post_sq); + break; default: status = -EOPNOTSUPP; break; @@ -6209,6 +6439,7 @@ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev, dev->protocol_used = info->protocol_used; /* Setup the hardware limits, hmc may limit further */ 
dev->hw_attrs.min_hw_qp_id = IRDMA_MIN_IW_QP_ID; + dev->hw_attrs.min_hw_srq_id = IRDMA_MIN_IW_SRQ_ID; dev->hw_attrs.min_hw_aeq_size = IRDMA_MIN_AEQ_ENTRIES; dev->hw_attrs.max_hw_aeq_size = IRDMA_MAX_AEQ_ENTRIES; dev->hw_attrs.min_hw_ceq_size = IRDMA_MIN_CEQ_ENTRIES; diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index b548490..8ead170 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -140,7 +140,11 @@ enum irdma_protocol_used { #define IRDMA_QP_SW_MAX_RQ_QUANTA 32768 #define IRDMA_MAX_QP_WRS(max_quanta_per_wr) \ ((IRDMA_QP_SW_MAX_WQ_QUANTA - IRDMA_SQ_RSVD) / (max_quanta_per_wr)) +#define IRDMA_SRQ_MIN_QUANTA 8 #define IRDMA_SRQ_MAX_QUANTA 262144 +#define IRDMA_MAX_SRQ_WRS \ + ((IRDMA_SRQ_MAX_QUANTA - IRDMA_RQ_RSVD) / IRDMA_MAX_QUANTA_PER_WR) + #define IRDMAQP_TERM_SEND_TERM_AND_FIN 0 #define IRDMAQP_TERM_SEND_TERM_ONLY 1 #define IRDMAQP_TERM_SEND_FIN_ONLY 2 @@ -236,9 +240,12 @@ enum irdma_cqp_op_type { IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 46, IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 47, IRDMA_OP_CQ_MODIFY = 48, + IRDMA_OP_SRQ_CREATE = 51, + IRDMA_OP_SRQ_MODIFY = 52, + IRDMA_OP_SRQ_DESTROY = 53, /* Must be last entry*/ - IRDMA_MAX_CQP_OPS = 49, + IRDMA_MAX_CQP_OPS = 54, }; /* CQP SQ WQES */ @@ -248,6 +255,9 @@ enum irdma_cqp_op_type { #define IRDMA_CQP_OP_CREATE_CQ 0x03 #define IRDMA_CQP_OP_MODIFY_CQ 0x04 #define IRDMA_CQP_OP_DESTROY_CQ 0x05 +#define IRDMA_CQP_OP_CREATE_SRQ 0x06 +#define IRDMA_CQP_OP_MODIFY_SRQ 0x07 +#define IRDMA_CQP_OP_DESTROY_SRQ 0x08 #define IRDMA_CQP_OP_ALLOC_STAG 0x09 #define IRDMA_CQP_OP_REG_MR 0x0a #define IRDMA_CQP_OP_QUERY_STAG 0x0b @@ -520,6 +530,7 @@ enum irdma_cqp_op_type { #define IRDMA_CQ_ERROR BIT_ULL(55) #define IRDMA_CQ_SQ BIT_ULL(62) +#define IRDMA_CQ_SRQ BIT_ULL(52) #define IRDMA_CQ_VALID BIT_ULL(63) #define IRDMA_CQ_IMMVALID BIT_ULL(62) #define IRDMA_CQ_UDSMACVALID BIT_ULL(61) @@ -631,6 +642,24 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_QP_DBSHADOWADDR 
IRDMA_CQPHC_QPCTX +#define IRDMA_CQPSQ_SRQ_RQSIZE GENMASK_ULL(3, 0) +#define IRDMA_CQPSQ_SRQ_RQ_WQE_SIZE GENMASK_ULL(5, 4) +#define IRDMA_CQPSQ_SRQ_SRQ_LIMIT GENMASK_ULL(43, 32) +#define IRDMA_CQPSQ_SRQ_SRQCTX GENMASK_ULL(63, 6) +#define IRDMA_CQPSQ_SRQ_PD_ID GENMASK_ULL(39, 16) +#define IRDMA_CQPSQ_SRQ_SRQ_ID GENMASK_ULL(15, 0) +#define IRDMA_CQPSQ_SRQ_OP GENMASK_ULL(37, 32) +#define IRDMA_CQPSQ_SRQ_LEAF_PBL_SIZE GENMASK_ULL(45, 44) +#define IRDMA_CQPSQ_SRQ_VIRTMAP BIT_ULL(47) +#define IRDMA_CQPSQ_SRQ_TPH_EN BIT_ULL(60) +#define IRDMA_CQPSQ_SRQ_ARM_LIMIT_EVENT BIT_ULL(61) +#define IRDMA_CQPSQ_SRQ_FIRST_PM_PBL_IDX GENMASK_ULL(27, 0) +#define IRDMA_CQPSQ_SRQ_TPH_VALUE GENMASK_ULL(7, 0) +#define IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR_S 8 +#define IRDMA_CQPSQ_SRQ_PHYSICAL_BUFFER_ADDR GENMASK_ULL(63, 8) +#define IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR_S 6 +#define IRDMA_CQPSQ_SRQ_DB_SHADOW_ADDR GENMASK_ULL(63, 6) + #define IRDMA_CQPSQ_CQ_CQSIZE GENMASK_ULL(20, 0) #define IRDMA_CQPSQ_CQ_CQCTX GENMASK_ULL(62, 0) #define IRDMA_CQPSQ_CQ_SHADOW_READ_THRESHOLD GENMASK(17, 0) @@ -785,6 +814,11 @@ enum irdma_cqp_op_type { #define IRDMAQPC_INSERTL2TAG2 BIT_ULL(11) #define IRDMAQPC_LIMIT GENMASK_ULL(13, 12) +#define IRDMAQPC_USE_SRQ BIT_ULL(10) +#define IRDMAQPC_SRQ_ID GENMASK_ULL(15, 0) +#define IRDMAQPC_PASID GENMASK_ULL(19, 0) +#define IRDMAQPC_PASID_VALID BIT_ULL(11) + #define IRDMAQPC_ECN_EN BIT_ULL(14) #define IRDMAQPC_DROPOOOSEG BIT_ULL(15) #define IRDMAQPC_DUPACK_THRESH GENMASK_ULL(18, 16) diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index f01ec21..524fe5d 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -269,6 +269,7 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) struct irdma_sc_qp *qp = NULL; struct irdma_qp_host_ctx_info *ctx_info = NULL; struct irdma_device *iwdev = rf->iwdev; + struct irdma_sc_srq *srq; unsigned long flags; u32 aeqcnt = 0; @@ -319,7 +320,9 @@ static void 
irdma_process_aeq(struct irdma_pci_f *rf) if (info->ae_id != IRDMA_AE_QP_SUSPEND_COMPLETE) iwqp->last_aeq = info->ae_id; spin_unlock_irqrestore(&iwqp->lock, flags); - ctx_info = &iwqp->ctx_info; + } else if (info->srq) { + if (info->ae_id != IRDMA_AE_SRQ_LIMIT) + continue; } else { if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR && info->ae_id != IRDMA_AE_CQP_DEFERRED_COMPLETE) @@ -417,6 +420,12 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) } irdma_cq_rem_ref(&iwcq->ibcq); break; + case IRDMA_AE_SRQ_LIMIT: + srq = (struct irdma_sc_srq *)(uintptr_t)info->compl_ctx; + irdma_srq_event(srq); + break; + case IRDMA_AE_SRQ_CATASTROPHIC_ERROR: + break; case IRDMA_AE_CQP_DEFERRED_COMPLETE: /* Remove completed CQP requests from pending list * and notify about those CQP ops completion. @@ -1839,7 +1848,9 @@ static void irdma_get_used_rsrc(struct irdma_device *iwdev) iwdev->rf->used_qps = find_first_zero_bit(iwdev->rf->allocated_qps, iwdev->rf->max_qp); iwdev->rf->used_cqs = find_first_zero_bit(iwdev->rf->allocated_cqs, - iwdev->rf->max_cq); + iwdev->rf->max_cq); + iwdev->rf->used_srqs = find_first_zero_bit(iwdev->rf->allocated_srqs, + iwdev->rf->max_srq); iwdev->rf->used_mrs = find_first_zero_bit(iwdev->rf->allocated_mrs, iwdev->rf->max_mr); } @@ -2056,7 +2067,8 @@ static void irdma_set_hw_rsrc(struct irdma_pci_f *rf) rf->allocated_qps = (void *)(rf->mem_rsrc + (sizeof(struct irdma_arp_entry) * rf->arp_table_size)); rf->allocated_cqs = &rf->allocated_qps[BITS_TO_LONGS(rf->max_qp)]; - rf->allocated_mrs = &rf->allocated_cqs[BITS_TO_LONGS(rf->max_cq)]; + rf->allocated_srqs = &rf->allocated_cqs[BITS_TO_LONGS(rf->max_cq)]; + rf->allocated_mrs = &rf->allocated_srqs[BITS_TO_LONGS(rf->max_srq)]; rf->allocated_pds = &rf->allocated_mrs[BITS_TO_LONGS(rf->max_mr)]; rf->allocated_ahs = &rf->allocated_pds[BITS_TO_LONGS(rf->max_pd)]; rf->allocated_mcgs = &rf->allocated_ahs[BITS_TO_LONGS(rf->max_ah)]; @@ -2084,12 +2096,14 @@ static u32 irdma_calc_mem_rsrc_size(struct irdma_pci_f 
*rf) rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_qp); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_mr); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_cq); + rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_srq); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_pd); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->arp_table_size); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_ah); rsrc_size += sizeof(unsigned long) * BITS_TO_LONGS(rf->max_mcg); rsrc_size += sizeof(struct irdma_qp **) * rf->max_qp; rsrc_size += sizeof(struct irdma_cq **) * rf->max_cq; + rsrc_size += sizeof(struct irdma_srq **) * rf->max_srq; return rsrc_size; } @@ -2117,6 +2131,7 @@ u32 irdma_initialize_hw_rsrc(struct irdma_pci_f *rf) rf->max_qp = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_QP].cnt; rf->max_mr = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_MR].cnt; rf->max_cq = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_CQ].cnt; + rf->max_srq = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_SRQ].cnt; rf->max_pd = rf->sc_dev.hw_attrs.max_hw_pds; rf->arp_table_size = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_ARP].cnt; rf->max_ah = rf->sc_dev.hmc_info->hmc_obj[IRDMA_HMC_IW_FSIAV].cnt; @@ -2136,6 +2151,7 @@ u32 irdma_initialize_hw_rsrc(struct irdma_pci_f *rf) set_bit(0, rf->allocated_mrs); set_bit(0, rf->allocated_qps); set_bit(0, rf->allocated_cqs); + set_bit(0, rf->allocated_srqs); set_bit(0, rf->allocated_pds); set_bit(0, rf->allocated_arps); set_bit(0, rf->allocated_ahs); diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h index 0544cba..6af79bb 100644 --- a/drivers/infiniband/hw/irdma/irdma.h +++ b/drivers/infiniband/hw/irdma/irdma.h @@ -162,6 +162,7 @@ struct irdma_hw_attrs { u32 max_done_count; u32 max_sleep_count; u32 max_cqp_compl_wait_time_ms; + u32 min_hw_srq_id; u16 max_stat_inst; u16 max_stat_idx; }; diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h 
index f0196aa..7135271 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -273,6 +273,8 @@ struct irdma_pci_f { u32 max_mr; u32 max_qp; u32 max_cq; + u32 max_srq; + u32 next_srq; u32 max_ah; u32 next_ah; u32 max_mcg; @@ -286,6 +288,7 @@ struct irdma_pci_f { u32 mr_stagmask; u32 used_pds; u32 used_cqs; + u32 used_srqs; u32 used_mrs; u32 used_qps; u32 arp_table_size; @@ -297,6 +300,7 @@ struct irdma_pci_f { unsigned long *allocated_ws_nodes; unsigned long *allocated_qps; unsigned long *allocated_cqs; + unsigned long *allocated_srqs; unsigned long *allocated_mrs; unsigned long *allocated_pds; unsigned long *allocated_mcgs; @@ -420,6 +424,11 @@ static inline struct irdma_pci_f *dev_to_rf(struct irdma_sc_dev *dev) return container_of(dev, struct irdma_pci_f, sc_dev); } +static inline struct irdma_srq *to_iwsrq(struct ib_srq *ibsrq) +{ + return container_of(ibsrq, struct irdma_srq, ibsrq); +} + /** * irdma_alloc_resource - allocate a resource * @iwdev: device pointer @@ -515,7 +524,8 @@ int irdma_modify_qp_roce(struct ib_qp *ibqp, struct ib_qp_attr *attr, void irdma_cq_add_ref(struct ib_cq *ibcq); void irdma_cq_rem_ref(struct ib_cq *ibcq); void irdma_cq_wq_destroy(struct irdma_pci_f *rf, struct irdma_sc_cq *cq); - +void irdma_srq_event(struct irdma_sc_srq *srq); +void irdma_srq_wq_destroy(struct irdma_pci_f *rf, struct irdma_sc_srq *srq); void irdma_cleanup_pending_cqp_op(struct irdma_pci_f *rf); int irdma_hw_modify_qp(struct irdma_device *iwdev, struct irdma_qp *iwqp, struct irdma_modify_qp_info *info, bool wait); diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 2432104..adfc528 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -250,6 +250,7 @@ enum irdma_syn_rst_handling { enum irdma_queue_type { IRDMA_QUEUE_TYPE_SQ_RQ = 0, IRDMA_QUEUE_TYPE_CQP, + IRDMA_QUEUE_TYPE_SRQ, }; struct irdma_sc_dev; @@ -739,6 +740,51 @@ struct irdma_modify_cq_info { 
bool cq_resize:1; }; +struct irdma_srq_init_info { + struct irdma_sc_pd *pd; + struct irdma_sc_vsi *vsi; + u64 srq_pa; + u64 shadow_area_pa; + u32 first_pm_pbl_idx; + u32 pasid; + u32 srq_size; + u16 srq_limit; + u8 pasid_valid; + u8 wqe_size; + u8 leaf_pbl_size; + u8 virtual_map; + u8 tph_en; + u8 arm_limit_event; + u8 tph_value; + u8 pbl_chunk_size; + struct irdma_srq_uk_init_info srq_uk_init_info; +}; + +struct irdma_sc_srq { + struct irdma_sc_dev *dev; + struct irdma_sc_vsi *vsi; + struct irdma_sc_pd *pd; + struct irdma_srq_uk srq_uk; + void *back_srq; + u64 srq_pa; + u64 shadow_area_pa; + u32 first_pm_pbl_idx; + u32 pasid; + u32 hw_srq_size; + u16 srq_limit; + u8 pasid_valid; + u8 leaf_pbl_size; + u8 virtual_map; + u8 tph_en; + u8 arm_limit_event; + u8 tph_val; +}; + +struct irdma_modify_srq_info { + u16 srq_limit; + u8 arm_limit_event; +}; + struct irdma_create_qp_info { bool ord_valid:1; bool tcp_ctx_valid:1; @@ -1045,6 +1091,7 @@ struct irdma_qp_host_ctx_info { }; u32 send_cq_num; u32 rcv_cq_num; + u32 srq_id; u32 rem_endpoint_idx; u16 stats_idx; bool srq_valid:1; @@ -1344,6 +1391,8 @@ void irdma_sc_qp_setctx_roce(struct irdma_sc_qp *qp, __le64 *qp_ctx, int irdma_sc_static_hmc_pages_allocated(struct irdma_sc_cqp *cqp, u64 scratch, u8 hmc_fn_id, bool post_sq, bool poll_registers); +int irdma_sc_srq_init(struct irdma_sc_srq *srq, + struct irdma_srq_init_info *info); void sc_vsi_update_stats(struct irdma_sc_vsi *vsi); struct cqp_info { @@ -1587,6 +1636,23 @@ struct cqp_info { struct irdma_dma_mem query_buff_mem; u64 scratch; } query_rdma; + + struct { + struct irdma_sc_srq *srq; + u64 scratch; + } srq_create; + + struct { + struct irdma_sc_srq *srq; + struct irdma_modify_srq_info info; + u64 scratch; + } srq_modify; + + struct { + struct irdma_sc_srq *srq; + u64 scratch; + } srq_destroy; + } u; }; diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index 38c54e5..26f3475 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ 
b/drivers/infiniband/hw/irdma/uk.c @@ -198,6 +198,26 @@ __le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx, return wqe; } +__le64 *irdma_srq_get_next_recv_wqe(struct irdma_srq_uk *srq, u32 *wqe_idx) +{ + int ret_code; + __le64 *wqe; + + if (IRDMA_RING_FULL_ERR(srq->srq_ring)) + return NULL; + + IRDMA_ATOMIC_RING_MOVE_HEAD(srq->srq_ring, *wqe_idx, ret_code); + if (ret_code) + return NULL; + + if (!*wqe_idx) + srq->srwqe_polarity = !srq->srwqe_polarity; + /* wqe_size_multiplier is the number of 32-byte quanta in one srq wqe */ + wqe = srq->srq_base[*wqe_idx * (srq->wqe_size_multiplier)].elem; + + return wqe; +} + /** * irdma_qp_get_next_recv_wqe - get next qp's rcv wqe * @qp: hw qp ptr @@ -318,6 +338,58 @@ int irdma_uk_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, } /** + * irdma_uk_srq_post_receive - post a receive wqe to a shared rq + * @srq: shared rq ptr + * @info: post rq information + */ +int irdma_uk_srq_post_receive(struct irdma_srq_uk *srq, + struct irdma_post_rq_info *info) +{ + u32 wqe_idx, i, byte_off; + u32 addl_frag_cnt; + __le64 *wqe; + u64 hdr; + + if (srq->max_srq_frag_cnt < info->num_sges) + return -EINVAL; + + wqe = irdma_srq_get_next_recv_wqe(srq, &wqe_idx); + if (!wqe) + return -ENOMEM; + + addl_frag_cnt = info->num_sges > 1 ? 
info->num_sges - 1 : 0; + srq->wqe_ops.iw_set_fragment(wqe, 0, info->sg_list, + srq->srwqe_polarity); + + for (i = 1, byte_off = 32; i < info->num_sges; i++) { + srq->wqe_ops.iw_set_fragment(wqe, byte_off, &info->sg_list[i], + srq->srwqe_polarity); + byte_off += 16; + } + + /* if not an odd number set valid bit in next fragment */ + if (srq->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(info->num_sges & 0x01) && + info->num_sges) { + srq->wqe_ops.iw_set_fragment(wqe, byte_off, NULL, + srq->srwqe_polarity); + if (srq->uk_attrs->hw_rev == IRDMA_GEN_2) + ++addl_frag_cnt; + } + + set_64bit_val(wqe, 16, (u64)info->wr_id); + hdr = FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) | + FIELD_PREP(IRDMAQPSQ_VALID, srq->srwqe_polarity); + + dma_wmb(); /* make sure WQE is populated before valid bit is set */ + + set_64bit_val(wqe, 24, hdr); + + set_64bit_val(srq->shadow_area, 0, (wqe_idx + 1) % srq->srq_ring.size); + + return 0; +} + +/** * irdma_uk_rdma_read - rdma read command * @qp: hw qp ptr * @info: post sq information @@ -973,6 +1045,8 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, u64 comp_ctx, qword0, qword2, qword3; __le64 *cqe; struct irdma_qp_uk *qp; + struct irdma_srq_uk *srq; + u8 is_srq; struct irdma_ring *pring = NULL; u32 wqe_idx; int ret_code; @@ -1046,8 +1120,14 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, } info->q_type = (u8)FIELD_GET(IRDMA_CQ_SQ, qword3); + is_srq = (u8)FIELD_GET(IRDMA_CQ_SRQ, qword3); info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, qword3); info->ipv4 = (bool)FIELD_GET(IRDMACQ_IPV4, qword3); + get_64bit_val(cqe, 8, &comp_ctx); + if (is_srq) + get_64bit_val(cqe, 40, (u64 *)&qp); + else + qp = (struct irdma_qp_uk *)(unsigned long)comp_ctx; if (info->error) { info->major_err = FIELD_GET(IRDMA_CQ_MAJERR, qword3); info->minor_err = FIELD_GET(IRDMA_CQ_MINERR, qword3); @@ -1085,7 +1165,22 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, info->qp_handle = (irdma_qp_handle)(unsigned long)qp; info->op_type = (u8)FIELD_GET(IRDMACQ_OP, qword3); 
- if (info->q_type == IRDMA_CQE_QTYPE_RQ) { + if (info->q_type == IRDMA_CQE_QTYPE_RQ && is_srq) { + srq = qp->srq_uk; + + get_64bit_val(cqe, 8, &info->wr_id); + info->bytes_xfered = (u32)FIELD_GET(IRDMACQ_PAYLDLEN, qword0); + + if (qword3 & IRDMACQ_STAG) { + info->stag_invalid_set = true; + info->inv_stag = (u32)FIELD_GET(IRDMACQ_INVSTAG, + qword2); + } else { + info->stag_invalid_set = false; + } + IRDMA_RING_MOVE_TAIL(srq->srq_ring); + pring = &srq->srq_ring; + } else if (info->q_type == IRDMA_CQE_QTYPE_RQ && !is_srq) { u32 array_idx; array_idx = wqe_idx / qp->rq_wqe_size_multiplier; @@ -1210,10 +1305,10 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, } /** - * irdma_qp_round_up - return round up qp wq depth + * irdma_round_up_wq - return round up qp wq depth * @wqdepth: wq depth in quanta to round up */ -static int irdma_qp_round_up(u32 wqdepth) +static int irdma_round_up_wq(u32 wqdepth) { int scount = 1; @@ -1268,7 +1363,7 @@ int irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs, u32 sq_size, u8 shift, { u32 min_size = (u32)uk_attrs->min_hw_wq_size << shift; - *sqdepth = irdma_qp_round_up((sq_size << shift) + IRDMA_SQ_RSVD); + *sqdepth = irdma_round_up_wq((sq_size << shift) + IRDMA_SQ_RSVD); if (*sqdepth < min_size) *sqdepth = min_size; @@ -1290,7 +1385,7 @@ int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, u32 rq_size, u8 shift, { u32 min_size = (u32)uk_attrs->min_hw_wq_size << shift; - *rqdepth = irdma_qp_round_up((rq_size << shift) + IRDMA_RQ_RSVD); + *rqdepth = irdma_round_up_wq((rq_size << shift) + IRDMA_RQ_RSVD); if (*rqdepth < min_size) *rqdepth = min_size; @@ -1300,6 +1395,26 @@ int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, u32 rq_size, u8 shift, return 0; } +/* + * irdma_get_srqdepth - get SRQ depth (quanta) + * @uk_attrs: qp HW attributes + * @srq_size: SRQ size + * @shift: shift which determines size of WQE + * @srqdepth: depth of SRQ + */ +int irdma_get_srqdepth(struct irdma_uk_attrs *uk_attrs, u32 srq_size, u8 shift, + u32 
*srqdepth) +{ + *srqdepth = irdma_round_up_wq((srq_size << shift) + IRDMA_RQ_RSVD); + + if (*srqdepth < ((u32)uk_attrs->min_hw_wq_size << shift)) + *srqdepth = uk_attrs->min_hw_wq_size << shift; + else if (*srqdepth > uk_attrs->max_hw_srq_quanta) + return -EINVAL; + + return 0; +} + static const struct irdma_wqe_uk_ops iw_wqe_uk_ops = { .iw_copy_inline_data = irdma_copy_inline_data, .iw_inline_data_size_to_quanta = irdma_inline_data_size_to_quanta, @@ -1336,6 +1451,42 @@ static void irdma_setup_connection_wqes(struct irdma_qp_uk *qp, } /** + * irdma_uk_srq_init - initialize shared rq + * @srq: hw srq (user and kernel) + * @info: srq initialization info + * + * initializes the vars used in both user and kernel mode. + * size of the wqe depends on numbers of max. fragments + * allowed. Then size of wqe * the number of wqes should be the + * amount of memory allocated for srq. + */ +int irdma_uk_srq_init(struct irdma_srq_uk *srq, + struct irdma_srq_uk_init_info *info) +{ + u8 rqshift; + + srq->uk_attrs = info->uk_attrs; + if (info->max_srq_frag_cnt > srq->uk_attrs->max_hw_wq_frags) + return -EINVAL; + + irdma_get_wqe_shift(srq->uk_attrs, info->max_srq_frag_cnt, 0, &rqshift); + srq->srq_caps = info->srq_caps; + srq->srq_base = info->srq; + srq->shadow_area = info->shadow_area; + srq->srq_id = info->srq_id; + srq->srwqe_polarity = 0; + srq->srq_size = info->srq_size; + srq->wqe_size = rqshift; + srq->max_srq_frag_cnt = min(srq->uk_attrs->max_hw_wq_frags, + ((u32)2 << rqshift) - 1); + IRDMA_RING_INIT(srq->srq_ring, srq->srq_size); + srq->wqe_size_multiplier = 1 << rqshift; + srq->wqe_ops = iw_wqe_uk_ops; + + return 0; +} + +/** * irdma_uk_calc_shift_wq - calculate WQE shift for both SQ and RQ * @ukinfo: qp initialization info * @sq_shift: Returns shift of SQ @@ -1461,6 +1612,7 @@ int irdma_uk_qp_init(struct irdma_qp_uk *qp, struct irdma_qp_uk_init_info *info) qp->wqe_ops = iw_wqe_uk_ops_gen_1; else qp->wqe_ops = iw_wqe_uk_ops; + qp->srq_uk = info->srq_uk; return 
ret_code; } diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h index 8fd7eeb..af15529 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -59,7 +59,7 @@ enum irdma_device_caps_const { IRDMA_COMMIT_FPM_BUF_SIZE = 192, IRDMA_GATHER_STATS_BUF_SIZE = 1024, IRDMA_MIN_IW_QP_ID = 0, - IRDMA_MAX_IW_QP_ID = 262143, + IRDMA_MIN_IW_SRQ_ID = 0, IRDMA_MIN_CEQID = 0, IRDMA_MAX_CEQID = 1023, IRDMA_CEQ_MAX_COUNT = IRDMA_MAX_CEQID + 1, @@ -147,6 +147,8 @@ enum irdma_qp_caps { IRDMA_PUSH_MODE = 8, }; +struct irdma_srq_uk; +struct irdma_srq_uk_init_info; struct irdma_qp_uk; struct irdma_cq_uk; struct irdma_qp_uk_init_info; @@ -300,6 +302,39 @@ int irdma_uk_calc_depth_shift_sq(struct irdma_qp_uk_init_info *ukinfo, u32 *sq_depth, u8 *sq_shift); int irdma_uk_calc_depth_shift_rq(struct irdma_qp_uk_init_info *ukinfo, u32 *rq_depth, u8 *rq_shift); +int irdma_uk_srq_init(struct irdma_srq_uk *srq, + struct irdma_srq_uk_init_info *info); +int irdma_uk_srq_post_receive(struct irdma_srq_uk *srq, + struct irdma_post_rq_info *info); + +struct irdma_srq_uk { + u32 srq_caps; + struct irdma_qp_quanta *srq_base; + struct irdma_uk_attrs *uk_attrs; + __le64 *shadow_area; + struct irdma_ring srq_ring; + struct irdma_ring initial_ring; + u32 srq_id; + u32 srq_size; + u32 max_srq_frag_cnt; + struct irdma_wqe_uk_ops wqe_ops; + u8 srwqe_polarity; + u8 wqe_size; + u8 wqe_size_multiplier; + u8 deferred_flag; +}; + +struct irdma_srq_uk_init_info { + struct irdma_qp_quanta *srq; + struct irdma_uk_attrs *uk_attrs; + __le64 *shadow_area; + u64 *srq_wrid_array; + u32 srq_id; + u32 srq_caps; + u32 srq_size; + u32 max_srq_frag_cnt; +}; + struct irdma_sq_uk_wr_trk_info { u64 wrid; u32 wr_len; @@ -344,6 +379,7 @@ struct irdma_qp_uk { bool destroy_pending:1; /* Indicates the QP is being destroyed */ void *back_qp; u8 dbg_rq_flushed; + struct irdma_srq_uk *srq_uk; u8 sq_flush_seen; u8 rq_flush_seen; }; @@ -383,6 +419,7 @@ struct 
irdma_qp_uk_init_info { u8 rq_shift; int abi_ver; bool legacy_mode; + struct irdma_srq_uk *srq_uk; }; struct irdma_cq_uk_init_info { @@ -398,6 +435,7 @@ struct irdma_cq_uk_init_info { __le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx, u16 quanta, u32 total_size, struct irdma_post_sq_info *info); +__le64 *irdma_srq_get_next_recv_wqe(struct irdma_srq_uk *srq, u32 *wqe_idx); __le64 *irdma_qp_get_next_recv_wqe(struct irdma_qp_uk *qp, u32 *wqe_idx); void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq); int irdma_nop(struct irdma_qp_uk *qp, u64 wr_id, bool signaled, bool post_sq); @@ -409,5 +447,7 @@ int irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs, u32 sq_size, u8 shift, u32 *wqdepth); int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, u32 rq_size, u8 shift, u32 *wqdepth); +int irdma_get_srqdepth(struct irdma_uk_attrs *uk_attrs, u32 srq_size, u8 shift, + u32 *srqdepth); void irdma_clr_wqes(struct irdma_qp_uk *qp, u32 qp_wqe_idx); #endif /* IRDMA_USER_H */ diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 894ced3..2fdcb88 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -700,6 +700,9 @@ static int irdma_wait_event(struct irdma_pci_f *rf, [IRDMA_OP_ADD_LOCAL_MAC_ENTRY] = "Add Local MAC Entry Cmd", [IRDMA_OP_DELETE_LOCAL_MAC_ENTRY] = "Delete Local MAC Entry Cmd", [IRDMA_OP_CQ_MODIFY] = "CQ Modify Cmd", + [IRDMA_OP_SRQ_CREATE] = "Create SRQ Cmd", + [IRDMA_OP_SRQ_MODIFY] = "Modify SRQ Cmd", + [IRDMA_OP_SRQ_DESTROY] = "Destroy SRQ Cmd", }; static const struct irdma_cqp_err_info irdma_noncrit_err_list[] = { @@ -1239,6 +1242,30 @@ void irdma_free_qp_rsrc(struct irdma_qp *iwqp) } /** + * irdma_srq_wq_destroy - send srq destroy cqp + * @rf: RDMA PCI function + * @srq: hardware control srq + */ +void irdma_srq_wq_destroy(struct irdma_pci_f *rf, struct irdma_sc_srq *srq) +{ + struct irdma_cqp_request *cqp_request; + struct cqp_cmds_info *cqp_info; + + 
cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true); + if (!cqp_request) + return; + + cqp_info = &cqp_request->info; + cqp_info->cqp_cmd = IRDMA_OP_SRQ_DESTROY; + cqp_info->post_sq = 1; + cqp_info->in.u.srq_destroy.srq = srq; + cqp_info->in.u.srq_destroy.scratch = (uintptr_t)cqp_request; + + irdma_handle_cqp_op(rf, cqp_request); + irdma_put_cqp_request(&rf->cqp, cqp_request); +} + +/** * irdma_cq_wq_destroy - send cq destroy cqp * @rf: RDMA PCI function * @cq: hardware control cq diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 593dbbd..868722b 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -56,9 +56,9 @@ static int irdma_query_device(struct ib_device *ibdev, props->max_mcast_qp_attach = IRDMA_MAX_MGS_PER_CTX; props->max_total_mcast_qp_attach = rf->max_qp * IRDMA_MAX_MGS_PER_CTX; props->max_fast_reg_page_list_len = IRDMA_MAX_PAGES_PER_FMR; -#define HCA_CLOCK_TIMESTAMP_MASK 0x1ffff - if (hw_attrs->uk_attrs.hw_rev >= IRDMA_GEN_2) - props->timestamp_mask = HCA_CLOCK_TIMESTAMP_MASK; + props->max_srq = rf->max_srq - rf->used_srqs; + props->max_srq_wr = IRDMA_MAX_SRQ_WRS; + props->max_srq_sge = hw_attrs->uk_attrs.max_hw_wq_frags; return 0; } @@ -336,6 +336,8 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx, uresp.comp_mask |= IRDMA_ALLOC_UCTX_USE_RAW_ATTR; uresp.min_hw_wq_size = uk_attrs->min_hw_wq_size; uresp.comp_mask |= IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE; + uresp.max_hw_srq_quanta = uk_attrs->max_hw_srq_quanta; + uresp.comp_mask |= IRDMA_ALLOC_UCTX_MAX_HW_SRQ_QUANTA; if (ib_copy_to_udata(udata, &uresp, min(sizeof(uresp), udata->outlen))) { rdma_user_mmap_entry_remove(ucontext->db_mmap_entry); @@ -347,6 +349,8 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx, spin_lock_init(&ucontext->cq_reg_mem_list_lock); INIT_LIST_HEAD(&ucontext->qp_reg_mem_list); spin_lock_init(&ucontext->qp_reg_mem_list_lock); + INIT_LIST_HEAD(&ucontext->srq_reg_mem_list); + 
spin_lock_init(&ucontext->srq_reg_mem_list_lock); return 0; @@ -571,7 +575,11 @@ static void irdma_setup_virt_qp(struct irdma_device *iwdev, if (iwpbl->pbl_allocated) { init_info->virtual_map = true; init_info->sq_pa = qpmr->sq_pbl.idx; - init_info->rq_pa = qpmr->rq_pbl.idx; + /* Need to use contiguous buffer for RQ of QP + * in case it is associated with SRQ. + */ + init_info->rq_pa = init_info->qp_uk_init_info.srq_uk ? + qpmr->rq_pa : qpmr->rq_pbl.idx; } else { init_info->sq_pa = qpmr->sq_pbl.addr; init_info->rq_pa = qpmr->rq_pbl.addr; @@ -940,6 +948,18 @@ static int irdma_create_qp(struct ib_qp *ibqp, struct irdma_uk_attrs *uk_attrs = &dev->hw_attrs.uk_attrs; struct irdma_qp_init_info init_info = {}; struct irdma_qp_host_ctx_info *ctx_info; + struct irdma_srq *iwsrq; + bool srq_valid = false; + u32 srq_id = 0; + + if (init_attr->srq) { + iwsrq = to_iwsrq(init_attr->srq); + srq_valid = true; + srq_id = iwsrq->srq_num; + init_attr->cap.max_recv_sge = uk_attrs->max_hw_wq_frags; + init_attr->cap.max_recv_wr = 4; + init_info.qp_uk_init_info.srq_uk = &iwsrq->sc_srq.srq_uk; + } err_code = irdma_validate_qp_attrs(init_attr, iwdev); if (err_code) @@ -1046,6 +1066,8 @@ static int irdma_create_qp(struct ib_qp *ibqp, } ctx_info = &iwqp->ctx_info; + ctx_info->srq_valid = srq_valid; + ctx_info->srq_id = srq_id; ctx_info->send_cq_num = iwqp->iwscq->sc_cq.cq_uk.cq_id; ctx_info->rcv_cq_num = iwqp->iwrcq->sc_cq.cq_uk.cq_id; @@ -1171,6 +1193,7 @@ static int irdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, init_attr->qp_context = iwqp->ibqp.qp_context; init_attr->send_cq = iwqp->ibqp.send_cq; init_attr->recv_cq = iwqp->ibqp.recv_cq; + init_attr->srq = iwqp->ibqp.srq; init_attr->cap = attr->cap; return 0; @@ -1834,6 +1857,24 @@ int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, } /** + * irdma_srq_free_rsrc - free up resources for srq + * @rf: RDMA PCI function + * @iwsrq: srq ptr + */ +static void irdma_srq_free_rsrc(struct irdma_pci_f *rf, 
struct irdma_srq *iwsrq) +{ + struct irdma_sc_srq *srq = &iwsrq->sc_srq; + + if (!iwsrq->user_mode) { + dma_free_coherent(rf->sc_dev.hw->device, iwsrq->kmem.size, + iwsrq->kmem.va, iwsrq->kmem.pa); + iwsrq->kmem.va = NULL; + } + + irdma_free_rsrc(rf, rf->allocated_srqs, srq->srq_uk.srq_id); +} + +/** * irdma_cq_free_rsrc - free up resources for cq * @rf: RDMA PCI function * @iwcq: cq ptr @@ -1897,6 +1938,22 @@ static int irdma_process_resize_list(struct irdma_cq *iwcq, } /** + * irdma_destroy_srq - destroy srq + * @ibsrq: srq pointer + * @udata: user data + */ +static int irdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) +{ + struct irdma_device *iwdev = to_iwdev(ibsrq->device); + struct irdma_srq *iwsrq = to_iwsrq(ibsrq); + struct irdma_sc_srq *srq = &iwsrq->sc_srq; + + irdma_srq_wq_destroy(iwdev->rf, srq); + irdma_srq_free_rsrc(iwdev->rf, iwsrq); + return 0; +} + +/** * irdma_destroy_cq - destroy cq * @ib_cq: cq pointer * @udata: user data @@ -2079,6 +2136,293 @@ static int irdma_resize_cq(struct ib_cq *ibcq, int entries, return ret; } +/** + * irdma_srq_event - event notification for srq limit + * @srq: shared srq struct + */ +void irdma_srq_event(struct irdma_sc_srq *srq) +{ + struct irdma_srq *iwsrq = container_of(srq, struct irdma_srq, sc_srq); + struct ib_srq *ibsrq = &iwsrq->ibsrq; + struct ib_event event; + + srq->srq_limit = 0; + + if (!ibsrq->event_handler) + return; + + event.device = ibsrq->device; + event.element.port_num = 1; + event.element.srq = ibsrq; + event.event = IB_EVENT_SRQ_LIMIT_REACHED; + ibsrq->event_handler(&event, ibsrq->srq_context); +} + +/** + * irdma_modify_srq - modify srq request + * @ibsrq: srq's pointer for modify + * @attr: access attributes + * @attr_mask: state mask + * @udata: user data + */ +static int irdma_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, + enum ib_srq_attr_mask attr_mask, + struct ib_udata *udata) +{ + struct irdma_device *iwdev = to_iwdev(ibsrq->device); + struct irdma_srq *iwsrq 
= to_iwsrq(ibsrq); + struct irdma_cqp_request *cqp_request; + struct irdma_pci_f *rf = iwdev->rf; + struct irdma_modify_srq_info *info; + struct cqp_cmds_info *cqp_info; + int status; + + if (attr_mask & IB_SRQ_MAX_WR) + return -EINVAL; + + if (!(attr_mask & IB_SRQ_LIMIT)) + return 0; + + if (attr->srq_limit > iwsrq->sc_srq.srq_uk.srq_size) + return -EINVAL; + + /* Execute this cqp op synchronously, so we can update srq_limit + * upon successful completion. + */ + cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true); + if (!cqp_request) + return -ENOMEM; + + cqp_info = &cqp_request->info; + info = &cqp_info->in.u.srq_modify.info; + info->srq_limit = attr->srq_limit; + if (info->srq_limit > 0xFFF) + info->srq_limit = 0xFFF; + info->arm_limit_event = 1; + + cqp_info->cqp_cmd = IRDMA_OP_SRQ_MODIFY; + cqp_info->post_sq = 1; + cqp_info->in.u.srq_modify.srq = &iwsrq->sc_srq; + cqp_info->in.u.srq_modify.scratch = (uintptr_t)cqp_request; + status = irdma_handle_cqp_op(rf, cqp_request); + irdma_put_cqp_request(&rf->cqp, cqp_request); + if (status) + return status; + + iwsrq->sc_srq.srq_limit = info->srq_limit; + + return 0; +} + +static int irdma_setup_umode_srq(struct irdma_device *iwdev, + struct irdma_srq *iwsrq, + struct irdma_srq_init_info *info, + struct ib_udata *udata) +{ +#define IRDMA_CREATE_SRQ_MIN_REQ_LEN \ + offsetofend(struct irdma_create_srq_req, user_shadow_area) + struct irdma_create_srq_req req = {}; + struct irdma_ucontext *ucontext; + struct irdma_srq_mr *srqmr; + struct irdma_pbl *iwpbl; + unsigned long flags; + + iwsrq->user_mode = true; + ucontext = rdma_udata_to_drv_context(udata, struct irdma_ucontext, + ibucontext); + + if (udata->inlen < IRDMA_CREATE_SRQ_MIN_REQ_LEN) + return -EINVAL; + + if (ib_copy_from_udata(&req, udata, + min(sizeof(req), udata->inlen))) + return -EFAULT; + + spin_lock_irqsave(&ucontext->srq_reg_mem_list_lock, flags); + iwpbl = irdma_get_pbl((unsigned long)req.user_srq_buf, + &ucontext->srq_reg_mem_list); + 
spin_unlock_irqrestore(&ucontext->srq_reg_mem_list_lock, flags); + if (!iwpbl) + return -EPROTO; + + iwsrq->iwpbl = iwpbl; + srqmr = &iwpbl->srq_mr; + + if (iwpbl->pbl_allocated) { + info->virtual_map = true; + info->pbl_chunk_size = 1; + info->first_pm_pbl_idx = srqmr->srq_pbl.idx; + info->leaf_pbl_size = 1; + } else { + info->srq_pa = srqmr->srq_pbl.addr; + } + info->shadow_area_pa = srqmr->shadow; + + return 0; +} + +static int irdma_setup_kmode_srq(struct irdma_device *iwdev, + struct irdma_srq *iwsrq, + struct irdma_srq_init_info *info, u32 depth, + u8 shift) +{ + struct irdma_srq_uk_init_info *ukinfo = &info->srq_uk_init_info; + struct irdma_dma_mem *mem = &iwsrq->kmem; + u32 size, ring_size; + + ring_size = depth * IRDMA_QP_WQE_MIN_SIZE; + size = ring_size + (IRDMA_SHADOW_AREA_SIZE << 3); + + mem->size = ALIGN(size, 256); + mem->va = dma_alloc_coherent(iwdev->rf->hw.device, mem->size, + &mem->pa, GFP_KERNEL); + if (!mem->va) + return -ENOMEM; + + ukinfo->srq = mem->va; + ukinfo->srq_size = depth >> shift; + ukinfo->shadow_area = mem->va + ring_size; + + info->srq_pa = mem->pa; + info->shadow_area_pa = info->srq_pa + ring_size; + + return 0; +} + +/** + * irdma_create_srq - create srq + * @ibsrq: ib's srq pointer + * @initattrs: attributes for srq + * @udata: user data for create srq + */ +static int irdma_create_srq(struct ib_srq *ibsrq, + struct ib_srq_init_attr *initattrs, + struct ib_udata *udata) +{ + struct irdma_device *iwdev = to_iwdev(ibsrq->device); + struct ib_srq_attr *attr = &initattrs->attr; + struct irdma_pd *iwpd = to_iwpd(ibsrq->pd); + struct irdma_srq *iwsrq = to_iwsrq(ibsrq); + struct irdma_srq_uk_init_info *ukinfo; + struct irdma_cqp_request *cqp_request; + struct irdma_srq_init_info info = {}; + struct irdma_pci_f *rf = iwdev->rf; + struct irdma_uk_attrs *uk_attrs; + struct cqp_cmds_info *cqp_info; + int err_code = 0; + u32 depth; + u8 shift; + + uk_attrs = &rf->sc_dev.hw_attrs.uk_attrs; + ukinfo = &info.srq_uk_init_info; + + if
(initattrs->srq_type != IB_SRQT_BASIC) + return -EOPNOTSUPP; + + if (!(uk_attrs->feature_flags & IRDMA_FEATURE_SRQ) || + attr->max_sge > uk_attrs->max_hw_wq_frags) + return -EINVAL; + + refcount_set(&iwsrq->refcnt, 1); + spin_lock_init(&iwsrq->lock); + err_code = irdma_alloc_rsrc(rf, rf->allocated_srqs, rf->max_srq, + &iwsrq->srq_num, &rf->next_srq); + if (err_code) + return err_code; + + ukinfo->max_srq_frag_cnt = attr->max_sge; + ukinfo->uk_attrs = uk_attrs; + ukinfo->srq_id = iwsrq->srq_num; + + irdma_get_wqe_shift(ukinfo->uk_attrs, ukinfo->max_srq_frag_cnt, 0, + &shift); + + err_code = irdma_get_srqdepth(ukinfo->uk_attrs, attr->max_wr, + shift, &depth); + if (err_code) + goto free_rsrc; + + /* Actual SRQ size in WRs for ring and HW */ + ukinfo->srq_size = depth >> shift; + + /* Max postable WRs to SRQ */ + iwsrq->max_wr = (depth - IRDMA_RQ_RSVD) >> shift; + attr->max_wr = iwsrq->max_wr; + + if (udata) + err_code = irdma_setup_umode_srq(iwdev, iwsrq, &info, udata); + else + err_code = irdma_setup_kmode_srq(iwdev, iwsrq, &info, depth, + shift); + + if (err_code) + goto free_rsrc; + + info.vsi = &iwdev->vsi; + info.pd = &iwpd->sc_pd; + + err_code = irdma_sc_srq_init(&iwsrq->sc_srq, &info); + if (err_code) + goto free_dmem; + + cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, true); + if (!cqp_request) { + err_code = -ENOMEM; + goto free_dmem; + } + + cqp_info = &cqp_request->info; + cqp_info->cqp_cmd = IRDMA_OP_SRQ_CREATE; + cqp_info->post_sq = 1; + cqp_info->in.u.srq_create.srq = &iwsrq->sc_srq; + cqp_info->in.u.srq_create.scratch = (uintptr_t)cqp_request; + err_code = irdma_handle_cqp_op(rf, cqp_request); + irdma_put_cqp_request(&rf->cqp, cqp_request); + if (err_code) + goto free_dmem; + + if (udata) { + struct irdma_create_srq_resp resp = {}; + + resp.srq_id = iwsrq->srq_num; + resp.srq_size = ukinfo->srq_size; + if (ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen))) { + err_code = -EPROTO; + goto srq_destroy; + } + } + + return 0; +
+srq_destroy: + irdma_srq_wq_destroy(rf, &iwsrq->sc_srq); + +free_dmem: + if (!iwsrq->user_mode) + dma_free_coherent(rf->hw.device, iwsrq->kmem.size, + iwsrq->kmem.va, iwsrq->kmem.pa); +free_rsrc: + irdma_free_rsrc(rf, rf->allocated_srqs, iwsrq->srq_num); + return err_code; +} + +/** + * irdma_query_srq - get SRQ attributes + * @ibsrq: the SRQ to query + * @attr: the attributes of the SRQ + */ +static int irdma_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr) +{ + struct irdma_srq *iwsrq = to_iwsrq(ibsrq); + + attr->max_wr = iwsrq->max_wr; + attr->max_sge = iwsrq->sc_srq.srq_uk.max_srq_frag_cnt; + attr->srq_limit = iwsrq->sc_srq.srq_limit; + + return 0; +} + static inline int cq_validate_flags(u32 flags, u8 hw_rev) { /* GEN1 does not support CQ create flags */ @@ -2526,6 +2870,7 @@ static int irdma_handle_q_mem(struct irdma_device *iwdev, struct irdma_mr *iwmr = iwpbl->iwmr; struct irdma_qp_mr *qpmr = &iwpbl->qp_mr; struct irdma_cq_mr *cqmr = &iwpbl->cq_mr; + struct irdma_srq_mr *srqmr = &iwpbl->srq_mr; struct irdma_hmc_pble *hmc_p; u64 *arr = iwmr->pgaddrmem; u32 pg_size, total; @@ -2545,7 +2890,10 @@ static int irdma_handle_q_mem(struct irdma_device *iwdev, total = req->sq_pages + req->rq_pages; hmc_p = &qpmr->sq_pbl; qpmr->shadow = (dma_addr_t)arr[total]; - + /* Need to use physical address for RQ of QP + * in case it is associated with SRQ. 
+ */ + qpmr->rq_pa = (dma_addr_t)arr[req->sq_pages]; if (lvl) { ret = irdma_check_mem_contiguous(arr, req->sq_pages, pg_size); @@ -2565,6 +2913,18 @@ static int irdma_handle_q_mem(struct irdma_device *iwdev, hmc_p->addr = arr[req->sq_pages]; } break; + case IRDMA_MEMREG_TYPE_SRQ: + hmc_p = &srqmr->srq_pbl; + srqmr->shadow = (dma_addr_t)arr[req->rq_pages]; + if (lvl) + ret = irdma_check_mem_contiguous(arr, req->rq_pages, + pg_size); + + if (!ret) + hmc_p->idx = palloc->level1.idx; + else + hmc_p->addr = arr[0]; + break; case IRDMA_MEMREG_TYPE_CQ: hmc_p = &cqmr->cq_pbl; @@ -3035,6 +3395,37 @@ static int irdma_reg_user_mr_type_qp(struct irdma_mem_reg_req req, return 0; } +static int irdma_reg_user_mr_type_srq(struct irdma_mem_reg_req req, + struct ib_udata *udata, + struct irdma_mr *iwmr) +{ + struct irdma_device *iwdev = to_iwdev(iwmr->ibmr.device); + struct irdma_pbl *iwpbl = &iwmr->iwpbl; + struct irdma_ucontext *ucontext; + unsigned long flags; + u32 total; + int err; + u8 lvl; + + total = req.rq_pages + IRDMA_SHADOW_PGCNT; + if (total > iwmr->page_cnt) + return -EINVAL; + + lvl = req.rq_pages > 1 ? 
PBLE_LEVEL_1 : PBLE_LEVEL_0; + err = irdma_handle_q_mem(iwdev, &req, iwpbl, lvl); + if (err) + return err; + + ucontext = rdma_udata_to_drv_context(udata, struct irdma_ucontext, + ibucontext); + spin_lock_irqsave(&ucontext->srq_reg_mem_list_lock, flags); + list_add_tail(&iwpbl->list, &ucontext->srq_reg_mem_list); + iwpbl->on_list = true; + spin_unlock_irqrestore(&ucontext->srq_reg_mem_list_lock, flags); + + return 0; +} + static int irdma_reg_user_mr_type_cq(struct irdma_mem_reg_req req, struct ib_udata *udata, struct irdma_mr *iwmr) @@ -3121,6 +3512,12 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len, goto error; break; + case IRDMA_MEMREG_TYPE_SRQ: + err = irdma_reg_user_mr_type_srq(req, udata, iwmr); + if (err) + goto error; + + break; case IRDMA_MEMREG_TYPE_CQ: err = irdma_reg_user_mr_type_cq(req, udata, iwmr); if (err) @@ -3436,6 +3833,14 @@ static void irdma_del_memlist(struct irdma_mr *iwmr, } spin_unlock_irqrestore(&ucontext->qp_reg_mem_list_lock, flags); break; + case IRDMA_MEMREG_TYPE_SRQ: + spin_lock_irqsave(&ucontext->srq_reg_mem_list_lock, flags); + if (iwpbl->on_list) { + iwpbl->on_list = false; + list_del(&iwpbl->list); + } + spin_unlock_irqrestore(&ucontext->srq_reg_mem_list_lock, flags); + break; default: break; } @@ -3655,6 +4060,47 @@ static int irdma_post_send(struct ib_qp *ibqp, } /** + * irdma_post_srq_recv - post receive wr for kernel application + * @ibsrq: ib srq pointer + * @ib_wr: work request for receive + * @bad_wr: bad wr caused an error + */ +static int irdma_post_srq_recv(struct ib_srq *ibsrq, + const struct ib_recv_wr *ib_wr, + const struct ib_recv_wr **bad_wr) +{ + struct irdma_srq *iwsrq = to_iwsrq(ibsrq); + struct irdma_srq_uk *uksrq = &iwsrq->sc_srq.srq_uk; + struct irdma_post_rq_info post_recv = {}; + unsigned long flags; + int err = 0; + + spin_lock_irqsave(&iwsrq->lock, flags); + while (ib_wr) { + if (ib_wr->num_sge > uksrq->max_srq_frag_cnt) { + err = -EINVAL; + goto out; + } + 
post_recv.num_sges = ib_wr->num_sge; + post_recv.wr_id = ib_wr->wr_id; + post_recv.sg_list = ib_wr->sg_list; + err = irdma_uk_srq_post_receive(uksrq, &post_recv); + if (err) + goto out; + + ib_wr = ib_wr->next; + } + +out: + spin_unlock_irqrestore(&iwsrq->lock, flags); + + if (err) + *bad_wr = ib_wr; + + return err; +} + +/** * irdma_post_recv - post receive wr for kernel application * @ibqp: ib qp pointer * @ib_wr: work request for receive @@ -3673,6 +4119,11 @@ static int irdma_post_recv(struct ib_qp *ibqp, iwqp = to_iwqp(ibqp); ukqp = &iwqp->sc_qp.qp_uk; + if (ukqp->srq_uk) { + *bad_wr = ib_wr; + return -EINVAL; + } + spin_lock_irqsave(&iwqp->lock, flags); while (ib_wr) { post_recv.num_sges = ib_wr->num_sge; @@ -4761,6 +5212,18 @@ static enum rdma_link_layer irdma_get_link_layer(struct ib_device *ibdev, return IB_LINK_LAYER_ETHERNET; } +static const struct ib_device_ops irdma_gen1_dev_ops = { + .dealloc_driver = irdma_ib_dealloc_device, +}; + +static const struct ib_device_ops irdma_gen3_dev_ops = { + .create_srq = irdma_create_srq, + .destroy_srq = irdma_destroy_srq, + .modify_srq = irdma_modify_srq, + .post_srq_recv = irdma_post_srq_recv, + .query_srq = irdma_query_srq, +}; + static const struct ib_device_ops irdma_roce_dev_ops = { .attach_mcast = irdma_attach_mcast, .create_ah = irdma_create_ah, @@ -4831,6 +5294,7 @@ static enum rdma_link_layer irdma_get_link_layer(struct ib_device *ibdev, INIT_RDMA_OBJ_SIZE(ib_cq, irdma_cq, ibcq), INIT_RDMA_OBJ_SIZE(ib_mw, irdma_mr, ibmw), INIT_RDMA_OBJ_SIZE(ib_qp, irdma_qp, ibqp), + INIT_RDMA_OBJ_SIZE(ib_srq, irdma_srq, ibsrq), }; /** @@ -4878,6 +5342,10 @@ static void irdma_init_rdma_device(struct irdma_device *iwdev) iwdev->ibdev.num_comp_vectors = iwdev->rf->ceqs_count; iwdev->ibdev.dev.parent = &pcidev->dev; ib_set_device_ops(&iwdev->ibdev, &irdma_dev_ops); + if (iwdev->rf->rdma_ver == IRDMA_GEN_1) + ib_set_device_ops(&iwdev->ibdev, &irdma_gen1_dev_ops); + if (iwdev->rf->rdma_ver >= IRDMA_GEN_3) + 
ib_set_device_ops(&iwdev->ibdev, &irdma_gen3_dev_ops); } /** diff --git a/drivers/infiniband/hw/irdma/verbs.h b/drivers/infiniband/hw/irdma/verbs.h index fcb163c..157dfa2 100644 --- a/drivers/infiniband/hw/irdma/verbs.h +++ b/drivers/infiniband/hw/irdma/verbs.h @@ -8,6 +8,7 @@ #define IRDMA_PKEY_TBL_SZ 1 #define IRDMA_DEFAULT_PKEY 0xFFFF +#define IRDMA_SHADOW_PGCNT 1 struct irdma_ucontext { struct ib_ucontext ibucontext; @@ -17,6 +18,8 @@ struct irdma_ucontext { spinlock_t cq_reg_mem_list_lock; /* protect CQ memory list */ struct list_head qp_reg_mem_list; spinlock_t qp_reg_mem_list_lock; /* protect QP memory list */ + struct list_head srq_reg_mem_list; + spinlock_t srq_reg_mem_list_lock; /* protect SRQ memory list */ int abi_ver; u8 legacy_mode : 1; u8 use_raw_attrs : 1; @@ -65,10 +68,16 @@ struct irdma_cq_mr { bool split; }; +struct irdma_srq_mr { + struct irdma_hmc_pble srq_pbl; + dma_addr_t shadow; +}; + struct irdma_qp_mr { struct irdma_hmc_pble sq_pbl; struct irdma_hmc_pble rq_pbl; dma_addr_t shadow; + dma_addr_t rq_pa; struct page *sq_page; }; @@ -85,6 +94,7 @@ struct irdma_pbl { union { struct irdma_qp_mr qp_mr; struct irdma_cq_mr cq_mr; + struct irdma_srq_mr srq_mr; }; bool pbl_allocated:1; @@ -112,6 +122,21 @@ struct irdma_mr { struct irdma_pbl iwpbl; }; +struct irdma_srq { + struct ib_srq ibsrq; + struct irdma_sc_srq sc_srq; + struct irdma_dma_mem kmem; + u64 *srq_wrid_mem; + refcount_t refcnt; + spinlock_t lock; /* for poll srq */ + struct irdma_pbl *iwpbl; + struct irdma_sge *sg_list; + u16 srq_head; + u32 srq_num; + u32 max_wr; + bool user_mode:1; +}; + struct irdma_cq { struct ib_cq ibcq; struct irdma_sc_cq sc_cq; diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h index 4e42054..f7788d3 100644 --- a/include/uapi/rdma/irdma-abi.h +++ b/include/uapi/rdma/irdma-abi.h @@ -20,11 +20,13 @@ enum irdma_memreg_type { IRDMA_MEMREG_TYPE_MEM = 0, IRDMA_MEMREG_TYPE_QP = 1, IRDMA_MEMREG_TYPE_CQ = 2, + IRDMA_MEMREG_TYPE_SRQ = 3, }; enum { 
IRDMA_ALLOC_UCTX_USE_RAW_ATTR = 1 << 0, IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE = 1 << 1, + IRDMA_ALLOC_UCTX_MAX_HW_SRQ_QUANTA = 1 << 2, IRDMA_SUPPORT_WQE_FORMAT_V2 = 1 << 3, }; @@ -55,7 +57,8 @@ struct irdma_alloc_ucontext_resp { __u8 rsvd2; __aligned_u64 comp_mask; __u16 min_hw_wq_size; - __u8 rsvd3[6]; + __u32 max_hw_srq_quanta; + __u8 rsvd3[2]; }; struct irdma_alloc_pd_resp { @@ -72,6 +75,16 @@ struct irdma_create_cq_req { __aligned_u64 user_shadow_area; }; +struct irdma_create_srq_req { + __aligned_u64 user_srq_buf; + __aligned_u64 user_shadow_area; +}; + +struct irdma_create_srq_resp { + __u32 srq_id; + __u32 srq_size; +}; + struct irdma_create_qp_req { __aligned_u64 user_wqe_bufs; __aligned_u64 user_compl_ctx; From patchwork Wed Jul 24 23:39:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741459 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D25741494D7 for ; Wed, 24 Jul 2024 23:40:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864453; cv=none; b=eCs20ASW0Xfp60w3S/Dt0rTH6/Or4ZHKi6p49DolpYbgOaMuHSgC9LcsTNlyJtdAlytslhr4hJnAkBH65P4m5ATAdh5gEofkDXUTliKJECQvPIab5gV3ebZPHQ9d7ZOjS4fl1W1tpXoSOVWBIglgvjiFKr0mZ80asMbhVqzTckA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864453; c=relaxed/simple; bh=l29p/3BbnxZ+jD81BxeplAP+MiXiUV0uNhGpolCg9Jg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=u/LAohKIf72DA1AUoPyAl8m71jYi2KnyAXY2J0a7gS2LiF/ihCo06IdrOt9d7qX8tJ7qNv49xhR9yLvXhkSbZK18HwEEIj1fRgpFSlWb1wAkdXsSxi0ct5l4jzdDf60nDa8SqdTpeNjvF7YlJWXXaGRafjCVSILr3nWZ9xvAc7c= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ie/IcvGC; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ie/IcvGC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864452; x=1753400452; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=l29p/3BbnxZ+jD81BxeplAP+MiXiUV0uNhGpolCg9Jg=; b=Ie/IcvGC9j3CTcaeJL2FbcBRo2D6X3PL4mRAaqVGwmdBYDVQ5AcQpcTk 2/eMqHTwoM8HjefVSpKQvYWtIZtFJ+3yAoAYSkQJ9g7kafXib1h1gmh7C mxzZLTi2arWqz6B5fFWOArfc8ufr/lR1D3D31gpzsXfA7aHJUUzrOpQxw /Q/LyOZx4EZza5ViB/U2e2FxAhE1PZ1bg0lnpFUW5n+xxqBn0cVGO+thm VmXLKeAQVo0Z7jd7k6gf9JLWWa9pmHctFkaa1r8KtWiu/kT9xF6tEPmcY QiKXfd8LhkAeacDSbjxS0vejKoRs9XY5SupLaZn3FJF16gTBtQ2EUArMh Q==; X-CSE-ConnectionGUID: svJ32GkyQhqLrLu2XUYUtg== X-CSE-MsgGUID: 6peAIfokTg6oNIewwP/fKw== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999798" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999798" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:46 -0700 X-CSE-ConnectionGUID: GBS02B83QF+1JwpY5SNSww== X-CSE-MsgGUID: eUdYaBN4RiGaq6SajW810Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426099" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:45 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: 
linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 21/25] RDMA/irdma: Restrict Memory Window and CQE Timestamping to GEN3 Date: Wed, 24 Jul 2024 18:39:13 -0500 Message-Id: <20240724233917.704-22-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Shiraz Saleem With the deprecation of Memory Window and Timestamping support in GEN2, move these features to be exclusive to GEN3. This iteration supports only Type2 Memory Windows. Additionally, it includes the reporting of the timestamp mask and Host Channel Adapter (HCA) core clock frequency via the query device verb. Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/verbs.c | 39 ++++++++++++++++++++++++------------- 1 file changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 868722b..66439706 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -41,7 +41,8 @@ static int irdma_query_device(struct ib_device *ibdev, props->max_cq = rf->max_cq - rf->used_cqs; props->max_cqe = rf->max_cqe - 1; props->max_mr = rf->max_mr - rf->used_mrs; - props->max_mw = props->max_mr; + if (hw_attrs->uk_attrs.hw_rev >= IRDMA_GEN_3) + props->max_mw = props->max_mr; props->max_pd = rf->max_pd - rf->used_pds; props->max_sge_rd = hw_attrs->uk_attrs.max_hw_read_sges; props->max_qp_rd_atom = hw_attrs->max_hw_ird; @@ -59,6 +60,13 @@ static int irdma_query_device(struct ib_device *ibdev, props->max_srq = rf->max_srq - rf->used_srqs; props->max_srq_wr = IRDMA_MAX_SRQ_WRS; props->max_srq_sge = hw_attrs->uk_attrs.max_hw_wq_frags; + if (hw_attrs->uk_attrs.hw_rev >= 
IRDMA_GEN_3) { +#define HCA_CORE_CLOCK_KHZ 1000000UL + props->timestamp_mask = GENMASK(31, 0); + props->hca_core_clock = HCA_CORE_CLOCK_KHZ; + } + if (hw_attrs->uk_attrs.hw_rev >= IRDMA_GEN_3) + props->device_cap_flags |= IB_DEVICE_MEM_WINDOW_TYPE_2B; return 0; } @@ -795,7 +803,8 @@ static void irdma_roce_fill_and_set_qpctx_info(struct irdma_qp *iwqp, roce_info->is_qp1 = true; roce_info->rd_en = true; roce_info->wr_rdresp_en = true; - roce_info->bind_en = true; + if (dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + roce_info->bind_en = true; roce_info->dcqcn_en = false; roce_info->rtomin = 5; @@ -826,7 +835,6 @@ static void irdma_iw_fill_and_set_qpctx_info(struct irdma_qp *iwqp, ether_addr_copy(iwarp_info->mac_addr, iwdev->netdev->dev_addr); iwarp_info->rd_en = true; iwarp_info->wr_rdresp_en = true; - iwarp_info->bind_en = true; iwarp_info->ecn_en = true; iwarp_info->rtomin = 5; @@ -1144,8 +1152,6 @@ static int irdma_get_ib_acc_flags(struct irdma_qp *iwqp) } if (iwqp->iwarp_info.rd_en) acc_flags |= IB_ACCESS_REMOTE_READ; - if (iwqp->iwarp_info.bind_en) - acc_flags |= IB_ACCESS_MW_BIND; } return acc_flags; } @@ -2425,8 +2431,8 @@ static int irdma_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr) static inline int cq_validate_flags(u32 flags, u8 hw_rev) { - /* GEN1 does not support CQ create flags */ - if (hw_rev == IRDMA_GEN_1) + /* GEN1/2 does not support CQ create flags */ + if (hw_rev <= IRDMA_GEN_2) return flags ? -EOPNOTSUPP : 0; return flags & ~IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION ? 
-EOPNOTSUPP : 0; @@ -2647,8 +2653,9 @@ static int irdma_create_cq(struct ib_cq *ibcq, /** * irdma_get_mr_access - get hw MR access permissions from IB access flags * @access: IB access flags + * @hw_rev: Hardware version */ -static inline u16 irdma_get_mr_access(int access) +static inline u16 irdma_get_mr_access(int access, u8 hw_rev) { u16 hw_access = 0; @@ -2658,8 +2665,10 @@ static inline u16 irdma_get_mr_access(int access) IRDMA_ACCESS_FLAGS_REMOTEWRITE : 0; hw_access |= (access & IB_ACCESS_REMOTE_READ) ? IRDMA_ACCESS_FLAGS_REMOTEREAD : 0; - hw_access |= (access & IB_ACCESS_MW_BIND) ? - IRDMA_ACCESS_FLAGS_BIND_WINDOW : 0; + if (hw_rev >= IRDMA_GEN_3) { + hw_access |= (access & IB_ACCESS_MW_BIND) ? + IRDMA_ACCESS_FLAGS_BIND_WINDOW : 0; + } hw_access |= (access & IB_ZERO_BASED) ? IRDMA_ACCESS_FLAGS_ZERO_BASED : 0; hw_access |= IRDMA_ACCESS_FLAGS_LOCALREAD; @@ -3229,7 +3238,8 @@ static int irdma_hwreg_mr(struct irdma_device *iwdev, struct irdma_mr *iwmr, stag_info->stag_idx = iwmr->stag >> IRDMA_CQPSQ_STAG_IDX_S; stag_info->stag_key = (u8)iwmr->stag; stag_info->total_len = iwmr->len; - stag_info->access_rights = irdma_get_mr_access(access); + stag_info->access_rights = irdma_get_mr_access(access, + iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev); stag_info->pd_id = iwpd->sc_pd.pd_id; stag_info->all_memory = pd->flags & IB_PD_UNSAFE_GLOBAL_RKEY; if (stag_info->access_rights & IRDMA_ACCESS_FLAGS_ZERO_BASED) @@ -4014,7 +4024,9 @@ static int irdma_post_send(struct ib_qp *ibqp, stag_info.signaled = info.signaled; stag_info.read_fence = info.read_fence; - stag_info.access_rights = irdma_get_mr_access(reg_wr(ib_wr)->access); + stag_info.access_rights = + irdma_get_mr_access(reg_wr(ib_wr)->access, + dev->hw_attrs.uk_attrs.hw_rev); stag_info.stag_key = reg_wr(ib_wr)->key & 0xff; stag_info.stag_idx = reg_wr(ib_wr)->key >> 8; stag_info.page_size = reg_wr(ib_wr)->mr->page_size; @@ -5217,7 +5229,9 @@ static enum rdma_link_layer irdma_get_link_layer(struct ib_device *ibdev, }; 
static const struct ib_device_ops irdma_gen3_dev_ops = { + .alloc_mw = irdma_alloc_mw, .create_srq = irdma_create_srq, + .dealloc_mw = irdma_dealloc_mw, .destroy_srq = irdma_destroy_srq, .modify_srq = irdma_modify_srq, .post_srq_recv = irdma_post_srq_recv, @@ -5258,7 +5272,6 @@ static enum rdma_link_layer irdma_get_link_layer(struct ib_device *ibdev, .alloc_hw_port_stats = irdma_alloc_hw_port_stats, .alloc_mr = irdma_alloc_mr, - .alloc_mw = irdma_alloc_mw, .alloc_pd = irdma_alloc_pd, .alloc_ucontext = irdma_alloc_ucontext, .create_cq = irdma_create_cq, From patchwork Wed Jul 24 23:39:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741460 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A86B149C40 for ; Wed, 24 Jul 2024 23:40:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864454; cv=none; b=BsA1+hmkm1qSqtA51U9csX+orAXeCbnm5itqvFKX1jTwS8xzPUZ+OC1t4usi/3fG33DS7idAtJvG7FMCPozD4tuIB7LF/uD1NetkFoik/9D1ThASWWx0CfOcyYMxsbOhzboYdglt8IYDKeew0N6xLhD+NpaXo2WRUGOWBn3yPFA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864454; c=relaxed/simple; bh=El8fAWrPsIKvlHQb2fqCPBcxd8TFyBSlCBABC8zcovM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Ja1gNPlaoLrB0wk9U8nyLIlhrF2K8QOSy59GMFWk9WAsK3+ywQDE5iOsXlI+Zv+MZXPafu7Ywk0gJIL7zLwEfdlnLdb1B/SWiSLCVDcJ5ETeiaqnyUj3/0mY4ugLS/M7cvxUpioEJKmwBMkhmoPgr+9aTe8KqV6eYz+Drhntk2E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com 
header.i=@intel.com header.b=idQeAxaR; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="idQeAxaR" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721864452; x=1753400452; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=El8fAWrPsIKvlHQb2fqCPBcxd8TFyBSlCBABC8zcovM=; b=idQeAxaRcfIwK2hu9nuZfsB2GKBpy0k02Eu9zOKU7DEOoOmQls2LtO8/ f0Kq0mojI1wQgHsAIRkZRyqMFGDzwk0McmFJdhVh8gGnB/uwoSct6NYjM xM23Y0ITAt+MPmd9lGkZ2vAvn5KxkXarjAM8u6ICbWSPH0pbuNPgoNgHg jeUAEs+Z1LpCu/FGScNWQJy1eB6pzijGQE4IgPD4IZ6ElWD2yRDUOxYFv NF9lwUCNMSb3wMxzK321foJoscnNkD6PU8U/Vd3n4fBogpBW6C1+gZVWT dpf/Sbn09SQ/1WGhMXF/zxoOeo7y8DwTZZaSLpLx2HDWiE3BqO1wVZY7p g==; X-CSE-ConnectionGUID: +JrwE+G6T3+5UTrF8hQa6Q== X-CSE-MsgGUID: fuunQyfUSmCnQq5cfdzXPA== X-IronPort-AV: E=McAfee;i="6700,10204,11143"; a="44999801" X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="44999801" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:47 -0700 X-CSE-ConnectionGUID: 1tSUTex1QdGcaVQypGG2hQ== X-CSE-MsgGUID: A+379/M0ST+EDb64/14JZQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,234,1716274800"; d="scan'208";a="52426105" Received: from tenikolo-mobl1.amr.corp.intel.com ([10.124.96.138]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jul 2024 16:40:46 -0700 From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Faisal Latif , Tatyana Nikolova Subject: [RFC PATCH 22/25] RDMA/irdma: Add Atomic Operations support Date: Wed, 24 Jul 2024 18:39:14 
-0500 Message-Id: <20240724233917.704-23-tatyana.e.nikolova@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Faisal Latif Extend irdma to support atomic operations, namely Compare and Swap and Fetch and Add, for GEN3 devices. Signed-off-by: Faisal Latif Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 7 +++ drivers/infiniband/hw/irdma/defs.h | 10 +++- drivers/infiniband/hw/irdma/type.h | 4 ++ drivers/infiniband/hw/irdma/uk.c | 102 ++++++++++++++++++++++++++++++++++++ drivers/infiniband/hw/irdma/user.h | 27 ++++++++++ drivers/infiniband/hw/irdma/verbs.c | 38 ++++++++++++++ drivers/infiniband/hw/irdma/verbs.h | 6 +++ 7 files changed, 193 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index d7165bd..40868b5 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -1110,6 +1110,8 @@ static void irdma_sc_qp_setctx_roce_gen_3(struct irdma_sc_qp *qp, FIELD_PREP(IRDMAQPC_UDPRIVCQENABLE, roce_info->udprivcq_en) | FIELD_PREP(IRDMAQPC_PRIVEN, roce_info->priv_mode_en) | + FIELD_PREP(IRDMAQPC_REMOTE_ATOMIC_EN, + info->remote_atomics_en) | FIELD_PREP(IRDMAQPC_TIMELYENABLE, roce_info->timely_en)); set_64bit_val(qp_ctx, 168, FIELD_PREP(IRDMAQPC_QPCOMPCTX, info->qp_compl_ctx)); @@ -1489,6 +1491,8 @@ static int irdma_sc_alloc_stag(struct irdma_sc_dev *dev, FIELD_PREP(IRDMA_CQPSQ_STAG_REMACCENABLED, info->remote_access) | FIELD_PREP(IRDMA_CQPSQ_STAG_USEHMCFNIDX, info->use_hmc_fcn_index) | FIELD_PREP(IRDMA_CQPSQ_STAG_USEPFRID, info->use_pf_rid) | + FIELD_PREP(IRDMA_CQPSQ_STAG_REMOTE_ATOMIC_EN, + info->remote_atomics_en) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before 
valid bit is set */ @@ -1580,6 +1584,8 @@ static int irdma_sc_mr_reg_non_shared(struct irdma_sc_dev *dev, FIELD_PREP(IRDMA_CQPSQ_STAG_VABASEDTO, addr_type) | FIELD_PREP(IRDMA_CQPSQ_STAG_USEHMCFNIDX, info->use_hmc_fcn_index) | FIELD_PREP(IRDMA_CQPSQ_STAG_USEPFRID, info->use_pf_rid) | + FIELD_PREP(IRDMA_CQPSQ_STAG_REMOTE_ATOMIC_EN, + info->remote_atomics_en) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ @@ -1736,6 +1742,7 @@ int irdma_sc_mr_fast_register(struct irdma_sc_qp *qp, FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | + FIELD_PREP(IRDMAQPSQ_REMOTE_ATOMICS_EN, info->remote_atomics_en) | FIELD_PREP(IRDMAQPSQ_VALID, qp->qp_uk.swqe_polarity); dma_wmb(); /* make sure WQE is written before valid bit is set */ diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 8ead170..9c0fd46 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -189,6 +189,8 @@ enum irdma_protocol_used { #define IRDMAQP_OP_RDMA_READ_LOC_INV 0x0b #define IRDMAQP_OP_NOP 0x0c #define IRDMAQP_OP_RDMA_WRITE_SOL 0x0d +#define IRDMAQP_OP_ATOMIC_FETCH_ADD 0x0f +#define IRDMAQP_OP_ATOMIC_COMPARE_SWAP_ADD 0x11 #define IRDMAQP_OP_GEN_RTS_AE 0x30 enum irdma_cqp_op_type { @@ -696,7 +698,8 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_STAG_USEPFRID BIT_ULL(61) #define IRDMA_CQPSQ_STAG_PBA IRDMA_CQPHC_QPCTX -#define IRDMA_CQPSQ_STAG_HMCFNIDX GENMASK_ULL(5, 0) +#define IRDMA_CQPSQ_STAG_HMCFNIDX GENMASK_ULL(15, 0) +#define IRDMA_CQPSQ_STAG_REMOTE_ATOMIC_EN BIT_ULL(61) #define IRDMA_CQPSQ_STAG_FIRSTPMPBLIDX GENMASK_ULL(27, 0) #define IRDMA_CQPSQ_QUERYSTAG_IDX IRDMA_CQPSQ_STAG_IDX @@ -986,6 +989,9 @@ enum irdma_cqp_op_type { #define IRDMAQPSQ_REMTO IRDMA_CQPHC_QPCTX +#define IRDMAQPSQ_STAG GENMASK_ULL(31, 0) +#define IRDMAQPSQ_REMOTE_STAG GENMASK_ULL(31, 0) + 
#define IRDMAQPSQ_STAGRIGHTS GENMASK_ULL(52, 48) #define IRDMAQPSQ_VABASEDTO BIT_ULL(53) #define IRDMAQPSQ_MEMWINDOWTYPE BIT_ULL(54) @@ -996,6 +1002,8 @@ enum irdma_cqp_op_type { #define IRDMAQPSQ_BASEVA_TO_FBO IRDMA_CQPHC_QPCTX +#define IRDMAQPSQ_REMOTE_ATOMICS_EN BIT_ULL(55) + #define IRDMAQPSQ_LOCSTAG GENMASK_ULL(31, 0) #define IRDMAQPSQ_STAGKEY GENMASK_ULL(7, 0) diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index adfc528..52aa1dd 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -1094,6 +1094,7 @@ struct irdma_qp_host_ctx_info { u32 srq_id; u32 rem_endpoint_idx; u16 stats_idx; + bool remote_atomics_en:1; bool srq_valid:1; bool tcp_info_valid:1; bool iwarp_info_valid:1; @@ -1134,6 +1135,7 @@ struct irdma_allocate_stag_info { bool use_hmc_fcn_index:1; bool use_pf_rid:1; bool all_memory:1; + bool remote_atomics_en:1; u16 hmc_fcn_index; }; @@ -1162,6 +1164,7 @@ struct irdma_reg_ns_stag_info { u8 hmc_fcn_index; bool use_pf_rid:1; bool all_memory:1; + bool remote_atomics_en:1; }; struct irdma_fast_reg_stag_info { @@ -1185,6 +1188,7 @@ struct irdma_fast_reg_stag_info { u8 hmc_fcn_index; bool use_pf_rid:1; bool defer_flag:1; + bool remote_atomics_en:1; }; struct irdma_dealloc_stag_info { diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index 26f3475..24e8df0 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ b/drivers/infiniband/hw/irdma/uk.c @@ -338,6 +338,108 @@ int irdma_uk_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, } /** + * irdma_uk_atomic_fetch_add - atomic fetch and add operation + * @qp: hw qp ptr + * @info: post sq information + * @post_sq: flag to post sq + */ +int irdma_uk_atomic_fetch_add(struct irdma_qp_uk *qp, + struct irdma_post_sq_info *info, bool post_sq) +{ + struct irdma_atomic_fetch_add *op_info; + u32 total_size = 0; + u16 quanta = 2; + u32 wqe_idx; + __le64 *wqe; + u64 hdr; + + op_info = 
&info->op.atomic_fetch_add; + wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, + info); + if (!wqe) + return -ENOMEM; + + set_64bit_val(wqe, 0, op_info->tagged_offset); + set_64bit_val(wqe, 8, + FIELD_PREP(IRDMAQPSQ_STAG, op_info->stag)); + set_64bit_val(wqe, 16, op_info->remote_tagged_offset); + + hdr = FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, 1) | + FIELD_PREP(IRDMAQPSQ_REMOTE_STAG, op_info->remote_stag) | + FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_ATOMIC_FETCH_ADD) | + FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) | + FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | + FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | + FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); + + set_64bit_val(wqe, 32, op_info->fetch_add_data_bytes); + set_64bit_val(wqe, 40, 0); + set_64bit_val(wqe, 48, 0); + set_64bit_val(wqe, 56, + FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity)); + + dma_wmb(); /* make sure WQE is populated before valid bit is set */ + + set_64bit_val(wqe, 24, hdr); + + if (post_sq) + irdma_uk_qp_post_wr(qp); + + return 0; +} + +/** + * irdma_uk_atomic_compare_swap - atomic compare and swap operation + * @qp: hw qp ptr + * @info: post sq information + * @post_sq: flag to post sq + */ +int irdma_uk_atomic_compare_swap(struct irdma_qp_uk *qp, + struct irdma_post_sq_info *info, bool post_sq) +{ + struct irdma_atomic_compare_swap *op_info; + u32 total_size = 0; + u16 quanta = 2; + u32 wqe_idx; + __le64 *wqe; + u64 hdr; + + op_info = &info->op.atomic_compare_swap; + wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, + info); + if (!wqe) + return -ENOMEM; + + set_64bit_val(wqe, 0, op_info->tagged_offset); + set_64bit_val(wqe, 8, + FIELD_PREP(IRDMAQPSQ_STAG, op_info->stag)); + set_64bit_val(wqe, 16, op_info->remote_tagged_offset); + + hdr = FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, 1) | + FIELD_PREP(IRDMAQPSQ_REMOTE_STAG, op_info->remote_stag) | + FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_ATOMIC_COMPARE_SWAP_ADD) | + FIELD_PREP(IRDMAQPSQ_READFENCE, 
info->read_fence) | + FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | + FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | + FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); + + set_64bit_val(wqe, 32, op_info->swap_data_bytes); + set_64bit_val(wqe, 40, op_info->compare_data_bytes); + set_64bit_val(wqe, 48, 0); + set_64bit_val(wqe, 56, + FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity)); + + dma_wmb(); /* make sure WQE is populated before valid bit is set */ + + set_64bit_val(wqe, 24, hdr); + + if (post_sq) + irdma_uk_qp_post_wr(qp); + + return 0; +} + +/** * irdma_uk_srq_post_receive - post a receive wqe to a shared rq * @srq: shared rq ptr * @info: post rq information diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h index af15529..96dea01 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -41,6 +41,8 @@ #define IRDMA_OP_TYPE_INV_STAG 0x0a #define IRDMA_OP_TYPE_RDMA_READ_INV_STAG 0x0b #define IRDMA_OP_TYPE_NOP 0x0c +#define IRDMA_OP_TYPE_ATOMIC_FETCH_AND_ADD 0x0f +#define IRDMA_OP_TYPE_ATOMIC_COMPARE_AND_SWAP 0x11 #define IRDMA_OP_TYPE_REC 0x3e #define IRDMA_OP_TYPE_REC_IMM 0x3f @@ -203,6 +205,24 @@ struct irdma_bind_window { bool ena_writes:1; irdma_stag mw_stag; bool mem_window_type_1:1; + bool remote_atomics_en:1; +}; + +struct irdma_atomic_fetch_add { + u64 tagged_offset; + u64 remote_tagged_offset; + u64 fetch_add_data_bytes; + u32 stag; + u32 remote_stag; +}; + +struct irdma_atomic_compare_swap { + u64 tagged_offset; + u64 remote_tagged_offset; + u64 swap_data_bytes; + u64 compare_data_bytes; + u32 stag; + u32 remote_stag; }; struct irdma_inv_local_stag { @@ -221,6 +241,7 @@ struct irdma_post_sq_info { bool report_rtt:1; bool udp_hdr:1; bool defer_flag:1; + bool remote_atomic_en:1; u32 imm_data; u32 stag_to_inv; union { @@ -229,6 +250,8 @@ struct irdma_post_sq_info { struct irdma_rdma_read rdma_read; struct irdma_bind_window bind_window; struct irdma_inv_local_stag inv_local_stag; + 
struct irdma_atomic_fetch_add atomic_fetch_add; + struct irdma_atomic_compare_swap atomic_compare_swap; } op; }; @@ -257,6 +280,10 @@ struct irdma_cq_poll_info { bool imm_valid:1; }; +int irdma_uk_atomic_compare_swap(struct irdma_qp_uk *qp, + struct irdma_post_sq_info *info, bool post_sq); +int irdma_uk_atomic_fetch_add(struct irdma_qp_uk *qp, + struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_inline_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_inline_send(struct irdma_qp_uk *qp, diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 66439706..db92465 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -60,6 +60,11 @@ static int irdma_query_device(struct ib_device *ibdev, props->max_srq = rf->max_srq - rf->used_srqs; props->max_srq_wr = IRDMA_MAX_SRQ_WRS; props->max_srq_sge = hw_attrs->uk_attrs.max_hw_wq_frags; + if (hw_attrs->uk_attrs.feature_flags & IRDMA_FEATURE_ATOMIC_OPS) + props->atomic_cap = IB_ATOMIC_HCA; + else + props->atomic_cap = IB_ATOMIC_NONE; + props->masked_atomic_cap = props->atomic_cap; if (hw_attrs->uk_attrs.hw_rev >= IRDMA_GEN_3) { #define HCA_CORE_CLOCK_KHZ 1000000UL props->timestamp_mask = GENMASK(31, 0); @@ -1145,6 +1150,8 @@ static int irdma_get_ib_acc_flags(struct irdma_qp *iwqp) acc_flags |= IB_ACCESS_REMOTE_READ; if (iwqp->roce_info.bind_en) acc_flags |= IB_ACCESS_MW_BIND; + if (iwqp->ctx_info.remote_atomics_en) + acc_flags |= IB_ACCESS_REMOTE_ATOMIC; } else { if (iwqp->iwarp_info.wr_rdresp_en) { acc_flags |= IB_ACCESS_LOCAL_WRITE; @@ -1152,6 +1159,8 @@ static int irdma_get_ib_acc_flags(struct irdma_qp *iwqp) } if (iwqp->iwarp_info.rd_en) acc_flags |= IB_ACCESS_REMOTE_READ; + if (iwqp->ctx_info.remote_atomics_en) + acc_flags |= IB_ACCESS_REMOTE_ATOMIC; } return acc_flags; } @@ -1448,6 +1457,8 @@ int irdma_modify_qp_roce(struct ib_qp *ibqp, struct ib_qp_attr *attr, roce_info->wr_rdresp_en = true; if 
(attr->qp_access_flags & IB_ACCESS_REMOTE_READ) roce_info->rd_en = true; + if (attr->qp_access_flags & IB_ACCESS_REMOTE_ATOMIC) + ctx_info->remote_atomics_en = true; } wait_event(iwqp->mod_qp_waitq, !atomic_read(&iwqp->hw_mod_qp_pend)); @@ -1778,6 +1789,8 @@ int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, offload_info->wr_rdresp_en = true; if (attr->qp_access_flags & IB_ACCESS_REMOTE_READ) offload_info->rd_en = true; + if (attr->qp_access_flags & IB_ACCESS_REMOTE_ATOMIC) + ctx_info->remote_atomics_en = true; } if (ctx_info->iwarp_info_valid) { @@ -3240,6 +3253,7 @@ static int irdma_hwreg_mr(struct irdma_device *iwdev, struct irdma_mr *iwmr, stag_info->total_len = iwmr->len; stag_info->access_rights = irdma_get_mr_access(access, iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev); + stag_info->remote_atomics_en = (access & IB_ACCESS_REMOTE_ATOMIC) ? 1 : 0; stag_info->pd_id = iwpd->sc_pd.pd_id; stag_info->all_memory = pd->flags & IB_PD_UNSAFE_GLOBAL_RKEY; if (stag_info->access_rights & IRDMA_ACCESS_FLAGS_ZERO_BASED) @@ -3930,6 +3944,30 @@ static int irdma_post_send(struct ib_qp *ibqp, if (ib_wr->send_flags & IB_SEND_FENCE) info.read_fence = true; switch (ib_wr->opcode) { + case IB_WR_ATOMIC_CMP_AND_SWP: + info.op_type = IRDMA_OP_TYPE_ATOMIC_COMPARE_AND_SWAP; + info.op.atomic_compare_swap.tagged_offset = ib_wr->sg_list[0].addr; + info.op.atomic_compare_swap.remote_tagged_offset = + atomic_wr(ib_wr)->remote_addr; + info.op.atomic_compare_swap.swap_data_bytes = atomic_wr(ib_wr)->swap; + info.op.atomic_compare_swap.compare_data_bytes = + atomic_wr(ib_wr)->compare_add; + info.op.atomic_compare_swap.stag = ib_wr->sg_list[0].lkey; + info.op.atomic_compare_swap.remote_stag = atomic_wr(ib_wr)->rkey; + err = irdma_uk_atomic_compare_swap(ukqp, &info, false); + break; + case IB_WR_ATOMIC_FETCH_AND_ADD: + info.op_type = IRDMA_OP_TYPE_ATOMIC_FETCH_AND_ADD; + info.op.atomic_fetch_add.tagged_offset = ib_wr->sg_list[0].addr; + 
info.op.atomic_fetch_add.remote_tagged_offset = + atomic_wr(ib_wr)->remote_addr; + info.op.atomic_fetch_add.fetch_add_data_bytes = + atomic_wr(ib_wr)->compare_add; + info.op.atomic_fetch_add.stag = ib_wr->sg_list[0].lkey; + info.op.atomic_fetch_add.remote_stag = + atomic_wr(ib_wr)->rkey; + err = irdma_uk_atomic_fetch_add(ukqp, &info, false); + break; case IB_WR_SEND_WITH_IMM: if (ukqp->qp_caps & IRDMA_SEND_WITH_IMM) { info.imm_data_valid = true; diff --git a/drivers/infiniband/hw/irdma/verbs.h b/drivers/infiniband/hw/irdma/verbs.h index 157dfa2..0922a22 100644 --- a/drivers/infiniband/hw/irdma/verbs.h +++ b/drivers/infiniband/hw/irdma/verbs.h @@ -284,6 +284,12 @@ static inline void set_ib_wc_op_sq(struct irdma_cq_poll_info *cq_poll_info, case IRDMA_OP_TYPE_FAST_REG_NSMR: entry->opcode = IB_WC_REG_MR; break; + case IRDMA_OP_TYPE_ATOMIC_COMPARE_AND_SWAP: + entry->opcode = IB_WC_COMP_SWAP; + break; + case IRDMA_OP_TYPE_ATOMIC_FETCH_AND_ADD: + entry->opcode = IB_WC_FETCH_ADD; + break; case IRDMA_OP_TYPE_INV_STAG: entry->opcode = IB_WC_LOCAL_INV; break; From patchwork Wed Jul 24 23:39:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741462 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 948E71494CB for ; Wed, 24 Jul 2024 23:40:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864455; cv=none; b=OA5NF6DTAp+QPD/qQrW/PE7WyHJF0CKNrtSIyzryu1x4ONDnGP9yeWs11wimZtZzNK6tQoTmTlPzDa9AN1f25khZONKNRcfUC+U3bO+6pgZ7wV+h//gHvR9eulmN4KW7PdWZli0R8ReWcjrzPmyb7Q0TQv2i8vZFmnu3Eg1Y+A0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864455; 
From: Tatyana Nikolova To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem , Tatyana Nikolova Subject: [RFC PATCH 23/25] RDMA/irdma: Extend CQE Error and Flush Handling for GEN3 Devices Date: Wed, 24 Jul 2024 18:39:15 -0500 Message-Id: <20240724233917.704-24-tatyana.e.nikolova@intel.com> In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com> References: <20240724233917.704-1-tatyana.e.nikolova@intel.com> From: Shiraz Saleem Enhance the CQE error and flush handling specific to GEN3 devices. Unlike GEN1/2 devices, which depend on software to generate completions in error, GEN3 devices leverage firmware to generate CQEs in error for all WQEs posted after a QP moves to an error state. Key changes include: - Updating the CQ poll logic to properly advance the CQ head in the event of a flush CQE. - Updating the flush logic for GEN3 to pass error WQE idx for SQ on an AE to flush out unprocessed WQEs in error. - Isolating the decoding of AE to flush codes into a separate routine irdma_ae_to_qp_err_code.
This routine can now be leveraged to flush error CQEs on an AE and when an error CQE is received for an SRQ. Signed-off-by: Shiraz Saleem Signed-off-by: Tatyana Nikolova --- drivers/infiniband/hw/irdma/ctrl.c | 9 ++ drivers/infiniband/hw/irdma/defs.h | 105 +------------------ drivers/infiniband/hw/irdma/hw.c | 104 ++++++------------- drivers/infiniband/hw/irdma/type.h | 14 +-- drivers/infiniband/hw/irdma/uk.c | 39 ++++++-- drivers/infiniband/hw/irdma/user.h | 194 +++++++++++++++++++++++++++++++++++- drivers/infiniband/hw/irdma/verbs.c | 31 ++++-- 7 files changed, 297 insertions(+), 199 deletions(-) diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 40868b5..73cab77 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2670,6 +2670,12 @@ int irdma_sc_qp_flush_wqes(struct irdma_sc_qp *qp, info->ae_code | FIELD_PREP(IRDMA_CQPSQ_FWQE_AESOURCE, info->ae_src) : 0; set_64bit_val(wqe, 8, temp); + if (cqp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) { + set_64bit_val(wqe, 40, + FIELD_PREP(IRDMA_CQPSQ_FWQE_ERR_SQ_IDX, info->err_sq_idx)); + set_64bit_val(wqe, 48, + FIELD_PREP(IRDMA_CQPSQ_FWQE_ERR_RQ_IDX, info->err_rq_idx)); + } hdr = qp->qp_uk.qp_id | FIELD_PREP(IRDMA_CQPSQ_OPCODE, IRDMA_CQP_OP_FLUSH_WQES) | @@ -2678,6 +2684,9 @@ int irdma_sc_qp_flush_wqes(struct irdma_sc_qp *qp, FIELD_PREP(IRDMA_CQPSQ_FWQE_FLUSHSQ, flush_sq) | FIELD_PREP(IRDMA_CQPSQ_FWQE_FLUSHRQ, flush_rq) | FIELD_PREP(IRDMA_CQPSQ_WQEVALID, cqp->polarity); + if (cqp->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_3) + hdr |= FIELD_PREP(IRDMA_CQPSQ_FWQE_ERR_SQ_IDX_VALID, info->err_sq_idx_valid) | + FIELD_PREP(IRDMA_CQPSQ_FWQE_ERR_RQ_IDX_VALID, info->err_rq_idx_valid); dma_wmb(); /* make sure WQE is written before valid bit is set */ set_64bit_val(wqe, 24, hdr); diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 9c0fd46..e75dd8b 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++
b/drivers/infiniband/hw/irdma/defs.h @@ -301,107 +301,6 @@ enum irdma_cqp_op_type { #define IRDMA_CQP_OP_GATHER_STATS 0x2e #define IRDMA_CQP_OP_UP_MAP 0x2f -/* Async Events codes */ -#define IRDMA_AE_AMP_UNALLOCATED_STAG 0x0102 -#define IRDMA_AE_AMP_INVALID_STAG 0x0103 -#define IRDMA_AE_AMP_BAD_QP 0x0104 -#define IRDMA_AE_AMP_BAD_PD 0x0105 -#define IRDMA_AE_AMP_BAD_STAG_KEY 0x0106 -#define IRDMA_AE_AMP_BAD_STAG_INDEX 0x0107 -#define IRDMA_AE_AMP_BOUNDS_VIOLATION 0x0108 -#define IRDMA_AE_AMP_RIGHTS_VIOLATION 0x0109 -#define IRDMA_AE_AMP_TO_WRAP 0x010a -#define IRDMA_AE_AMP_FASTREG_VALID_STAG 0x010c -#define IRDMA_AE_AMP_FASTREG_MW_STAG 0x010d -#define IRDMA_AE_AMP_FASTREG_INVALID_RIGHTS 0x010e -#define IRDMA_AE_AMP_FASTREG_INVALID_LENGTH 0x0110 -#define IRDMA_AE_AMP_INVALIDATE_SHARED 0x0111 -#define IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS 0x0112 -#define IRDMA_AE_AMP_INVALIDATE_MR_WITH_BOUND_WINDOWS 0x0113 -#define IRDMA_AE_AMP_MWBIND_VALID_STAG 0x0114 -#define IRDMA_AE_AMP_MWBIND_OF_MR_STAG 0x0115 -#define IRDMA_AE_AMP_MWBIND_TO_ZERO_BASED_STAG 0x0116 -#define IRDMA_AE_AMP_MWBIND_TO_MW_STAG 0x0117 -#define IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS 0x0118 -#define IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS 0x0119 -#define IRDMA_AE_AMP_MWBIND_TO_INVALID_PARENT 0x011a -#define IRDMA_AE_AMP_MWBIND_BIND_DISABLED 0x011b -#define IRDMA_AE_PRIV_OPERATION_DENIED 0x011c -#define IRDMA_AE_AMP_INVALIDATE_TYPE1_MW 0x011d -#define IRDMA_AE_AMP_MWBIND_ZERO_BASED_TYPE1_MW 0x011e -#define IRDMA_AE_AMP_FASTREG_INVALID_PBL_HPS_CFG 0x011f -#define IRDMA_AE_AMP_MWBIND_WRONG_TYPE 0x0120 -#define IRDMA_AE_AMP_FASTREG_PBLE_MISMATCH 0x0121 -#define IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG 0x0132 -#define IRDMA_AE_UDA_XMIT_BAD_PD 0x0133 -#define IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT 0x0134 -#define IRDMA_AE_UDA_L4LEN_INVALID 0x0135 -#define IRDMA_AE_BAD_CLOSE 0x0201 -#define IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE 0x0202 -#define IRDMA_AE_CQ_OPERATION_ERROR 0x0203 -#define IRDMA_AE_RDMA_READ_WHILE_ORD_ZERO 0x0205 
-#define IRDMA_AE_STAG_ZERO_INVALID 0x0206 -#define IRDMA_AE_IB_RREQ_AND_Q1_FULL 0x0207 -#define IRDMA_AE_IB_INVALID_REQUEST 0x0208 -#define IRDMA_AE_SRQ_LIMIT 0x0209 -#define IRDMA_AE_WQE_UNEXPECTED_OPCODE 0x020a -#define IRDMA_AE_WQE_INVALID_PARAMETER 0x020b -#define IRDMA_AE_WQE_INVALID_FRAG_DATA 0x020c -#define IRDMA_AE_IB_REMOTE_ACCESS_ERROR 0x020d -#define IRDMA_AE_IB_REMOTE_OP_ERROR 0x020e -#define IRDMA_AE_SRQ_CATASTROPHIC_ERROR 0x020f -#define IRDMA_AE_WQE_LSMM_TOO_LONG 0x0220 -#define IRDMA_AE_ATOMIC_ALIGNMENT 0x0221 -#define IRDMA_AE_ATOMIC_MASK 0x0222 -#define IRDMA_AE_INVALID_REQUEST 0x0223 -#define IRDMA_AE_PCIE_ATOMIC_DISABLE 0x0224 -#define IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN 0x0301 -#define IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER 0x0303 -#define IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION 0x0304 -#define IRDMA_AE_DDP_UBE_INVALID_MO 0x0305 -#define IRDMA_AE_DDP_UBE_INVALID_MSN_NO_BUFFER_AVAILABLE 0x0306 -#define IRDMA_AE_DDP_UBE_INVALID_QN 0x0307 -#define IRDMA_AE_DDP_NO_L_BIT 0x0308 -#define IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION 0x0311 -#define IRDMA_AE_RDMAP_ROE_UNEXPECTED_OPCODE 0x0312 -#define IRDMA_AE_ROE_INVALID_RDMA_READ_REQUEST 0x0313 -#define IRDMA_AE_ROE_INVALID_RDMA_WRITE_OR_READ_RESP 0x0314 -#define IRDMA_AE_ROCE_RSP_LENGTH_ERROR 0x0316 -#define IRDMA_AE_ROCE_EMPTY_MCG 0x0380 -#define IRDMA_AE_ROCE_BAD_MC_IP_ADDR 0x0381 -#define IRDMA_AE_ROCE_BAD_MC_QPID 0x0382 -#define IRDMA_AE_MCG_QP_PROTOCOL_MISMATCH 0x0383 -#define IRDMA_AE_INVALID_ARP_ENTRY 0x0401 -#define IRDMA_AE_INVALID_TCP_OPTION_RCVD 0x0402 -#define IRDMA_AE_STALE_ARP_ENTRY 0x0403 -#define IRDMA_AE_INVALID_AH_ENTRY 0x0406 -#define IRDMA_AE_LLP_CLOSE_COMPLETE 0x0501 -#define IRDMA_AE_LLP_CONNECTION_RESET 0x0502 -#define IRDMA_AE_LLP_FIN_RECEIVED 0x0503 -#define IRDMA_AE_LLP_RECEIVED_MARKER_AND_LENGTH_FIELDS_DONT_MATCH 0x0504 -#define IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR 0x0505 -#define IRDMA_AE_LLP_SEGMENT_TOO_SMALL 0x0507 -#define IRDMA_AE_LLP_SYN_RECEIVED 0x0508 
-#define IRDMA_AE_LLP_TERMINATE_RECEIVED 0x0509 -#define IRDMA_AE_LLP_TOO_MANY_RETRIES 0x050a -#define IRDMA_AE_LLP_TOO_MANY_KEEPALIVE_RETRIES 0x050b -#define IRDMA_AE_LLP_DOUBT_REACHABILITY 0x050c -#define IRDMA_AE_LLP_CONNECTION_ESTABLISHED 0x050e -#define IRDMA_AE_LLP_TOO_MANY_RNRS 0x050f -#define IRDMA_AE_RESOURCE_EXHAUSTION 0x0520 -#define IRDMA_AE_RESET_SENT 0x0601 -#define IRDMA_AE_TERMINATE_SENT 0x0602 -#define IRDMA_AE_RESET_NOT_SENT 0x0603 -#define IRDMA_AE_LCE_QP_CATASTROPHIC 0x0700 -#define IRDMA_AE_LCE_FUNCTION_CATASTROPHIC 0x0701 -#define IRDMA_AE_LCE_CQ_CATASTROPHIC 0x0702 -#define IRDMA_AE_REMOTE_QP_CATASTROPHIC 0x0703 -#define IRDMA_AE_LOCAL_QP_CATASTROPHIC 0x0704 -#define IRDMA_AE_RCE_QP_CATASTROPHIC 0x0705 -#define IRDMA_AE_QP_SUSPEND_COMPLETE 0x0900 -#define IRDMA_AE_CQP_DEFERRED_COMPLETE 0x0901 -#define IRDMA_AE_ADAPTER_CATASTROPHIC 0x0B0B - #define FLD_LS_64(dev, val, field) \ (((u64)(val) << (dev)->hw_shifts[field ## _S]) & (dev)->hw_masks[field ## _M]) #define FLD_RS_64(dev, val, field) \ @@ -778,6 +677,10 @@ enum irdma_cqp_op_type { #define IRDMA_CQPSQ_FWQE_USERFLCODE BIT_ULL(60) #define IRDMA_CQPSQ_FWQE_FLUSHSQ BIT_ULL(61) #define IRDMA_CQPSQ_FWQE_FLUSHRQ BIT_ULL(62) +#define IRDMA_CQPSQ_FWQE_ERR_SQ_IDX_VALID BIT_ULL(42) +#define IRDMA_CQPSQ_FWQE_ERR_SQ_IDX GENMASK_ULL(49, 32) +#define IRDMA_CQPSQ_FWQE_ERR_RQ_IDX_VALID BIT_ULL(43) +#define IRDMA_CQPSQ_FWQE_ERR_RQ_IDX GENMASK_ULL(46, 32) #define IRDMA_CQPSQ_MAPT_PORT GENMASK_ULL(15, 0) #define IRDMA_CQPSQ_MAPT_ADDPORT BIT_ULL(62) #define IRDMA_CQPSQ_UPESD_SDCMD GENMASK_ULL(31, 0) diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 524fe5d..4daaefa 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -133,78 +133,26 @@ static void irdma_process_ceq(struct irdma_pci_f *rf, struct irdma_ceq *ceq) } static void irdma_set_flush_fields(struct irdma_sc_qp *qp, - struct irdma_aeqe_info *info) + struct irdma_aeqe_info *info, + 
struct irdma_qp_host_ctx_info *ctx_info) { + struct qp_err_code qp_err; + qp->sq_flush_code = info->sq; qp->rq_flush_code = info->rq; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - - switch (info->ae_id) { - case IRDMA_AE_AMP_BOUNDS_VIOLATION: - case IRDMA_AE_AMP_INVALID_STAG: - case IRDMA_AE_AMP_RIGHTS_VIOLATION: - case IRDMA_AE_AMP_UNALLOCATED_STAG: - case IRDMA_AE_AMP_BAD_PD: - case IRDMA_AE_AMP_BAD_QP: - case IRDMA_AE_AMP_BAD_STAG_KEY: - case IRDMA_AE_AMP_BAD_STAG_INDEX: - case IRDMA_AE_AMP_TO_WRAP: - case IRDMA_AE_PRIV_OPERATION_DENIED: - qp->flush_code = FLUSH_PROT_ERR; - qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR; - break; - case IRDMA_AE_UDA_XMIT_BAD_PD: - case IRDMA_AE_WQE_UNEXPECTED_OPCODE: - qp->flush_code = FLUSH_LOC_QP_OP_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; - case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG: - case IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT: - case IRDMA_AE_UDA_L4LEN_INVALID: - case IRDMA_AE_DDP_UBE_INVALID_MO: - case IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER: - qp->flush_code = FLUSH_LOC_LEN_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; - case IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS: - case IRDMA_AE_IB_REMOTE_ACCESS_ERROR: - qp->flush_code = FLUSH_REM_ACCESS_ERR; - qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR; - break; - case IRDMA_AE_LLP_SEGMENT_TOO_SMALL: - case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR: - case IRDMA_AE_ROCE_RSP_LENGTH_ERROR: - case IRDMA_AE_IB_REMOTE_OP_ERROR: - qp->flush_code = FLUSH_REM_OP_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; - case IRDMA_AE_LCE_QP_CATASTROPHIC: - qp->flush_code = FLUSH_FATAL_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; - case IRDMA_AE_IB_RREQ_AND_Q1_FULL: - qp->flush_code = FLUSH_GENERAL_ERR; - break; - case IRDMA_AE_LLP_TOO_MANY_RETRIES: - qp->flush_code = FLUSH_RETRY_EXC_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; - case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS: - case 
IRDMA_AE_AMP_MWBIND_BIND_DISABLED: - case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS: - case IRDMA_AE_AMP_MWBIND_VALID_STAG: - qp->flush_code = FLUSH_MW_BIND_ERR; - qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR; - break; - case IRDMA_AE_IB_INVALID_REQUEST: - qp->flush_code = FLUSH_REM_INV_REQ_ERR; - qp->event_type = IRDMA_QP_EVENT_REQ_ERR; - break; - default: - qp->flush_code = FLUSH_GENERAL_ERR; - qp->event_type = IRDMA_QP_EVENT_CATASTROPHIC; - break; + if (info->sq && qp->qp_uk.uk_attrs->hw_rev >= IRDMA_GEN_3) { + qp->err_sq_idx_valid = true; + qp->err_sq_idx = info->wqe_idx; + } + if (ctx_info->roce_info->err_rq_idx_valid && qp->qp_uk.uk_attrs->hw_rev >= IRDMA_GEN_3) { + qp->err_rq_idx_valid = true; + qp->err_rq_idx = ctx_info->roce_info->err_rq_idx; } + + qp_err = irdma_ae_to_qp_err_code(info->ae_id); + + qp->flush_code = qp_err.flush_code; + qp->event_type = qp_err.event_type; } /** @@ -465,14 +413,16 @@ static void irdma_process_aeq(struct irdma_pci_f *rf) default: ibdev_err(&iwdev->ibdev, "abnormal ae_id = 0x%x bool qp=%d qp_id = %d, ae_src=%d\n", info->ae_id, info->qp, info->qp_cq_id, info->ae_src); - if (rdma_protocol_roce(&iwdev->ibdev, 1)) { - ctx_info->roce_info->err_rq_idx_valid = info->rq; - if (info->rq) { + ctx_info = &iwqp->ctx_info; + if (rdma_protocol_roce(&iwqp->iwdev->ibdev, 1)) { + ctx_info->roce_info->err_rq_idx_valid = + ctx_info->srq_valid ? 
false : info->err_rq_idx_valid; + if (ctx_info->roce_info->err_rq_idx_valid) { ctx_info->roce_info->err_rq_idx = info->wqe_idx; irdma_sc_qp_setctx_roce(&iwqp->sc_qp, iwqp->host_ctx.va, ctx_info); } - irdma_set_flush_fields(qp, info); + irdma_set_flush_fields(qp, info, ctx_info); irdma_cm_disconn(iwqp); break; } @@ -2831,7 +2781,9 @@ void irdma_flush_wqes(struct irdma_qp *iwqp, u32 flush_mask) struct irdma_pci_f *rf = iwqp->iwdev->rf; u8 flush_code = iwqp->sc_qp.flush_code; - if (!(flush_mask & IRDMA_FLUSH_SQ) && !(flush_mask & IRDMA_FLUSH_RQ)) + if ((!(flush_mask & IRDMA_FLUSH_SQ) && + !(flush_mask & IRDMA_FLUSH_RQ)) || + ((flush_mask & IRDMA_REFLUSH) && rf->rdma_ver >= IRDMA_GEN_3)) return; /* Set flush info fields*/ @@ -2844,6 +2796,10 @@ void irdma_flush_wqes(struct irdma_qp *iwqp, u32 flush_mask) info.rq_major_code = IRDMA_FLUSH_MAJOR_ERR; info.rq_minor_code = FLUSH_GENERAL_ERR; info.userflushcode = true; + info.err_sq_idx_valid = iwqp->sc_qp.err_sq_idx_valid; + info.err_sq_idx = iwqp->sc_qp.err_sq_idx; + info.err_rq_idx_valid = iwqp->sc_qp.err_rq_idx_valid; + info.err_rq_idx = iwqp->sc_qp.err_rq_idx; if (flush_mask & IRDMA_REFLUSH) { if (info.sq) @@ -2857,7 +2813,7 @@ void irdma_flush_wqes(struct irdma_qp *iwqp, u32 flush_mask) if (info.rq && iwqp->sc_qp.rq_flush_code) info.rq_minor_code = flush_code; } - if (!iwqp->user_mode) + if (!iwqp->user_mode && rf->rdma_ver <= IRDMA_GEN_2) queue_delayed_work(iwqp->iwdev->cleanup_wq, &iwqp->dwork_flush, msecs_to_jiffies(IRDMA_FLUSH_DELAY_MS)); diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 52aa1dd..9916091 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -97,12 +97,6 @@ enum irdma_term_mpa_errors { MPA_REQ_RSP = 0x04, }; -enum irdma_qp_event_type { - IRDMA_QP_EVENT_CATASTROPHIC, - IRDMA_QP_EVENT_ACCESS_ERR, - IRDMA_QP_EVENT_REQ_ERR, -}; - enum irdma_hw_stats_index { /* gen1 - 32-bit */ IRDMA_HW_STAT_INDEX_IP4RXDISCARD = 0, @@ -573,6 
+567,10 @@ struct irdma_sc_qp { bool virtual_map:1; bool flush_sq:1; bool flush_rq:1; + bool err_sq_idx_valid:1; + bool err_rq_idx_valid:1; + u32 err_sq_idx; + u32 err_rq_idx; bool sq_flush_code:1; bool rq_flush_code:1; u32 pkt_limit; @@ -1296,6 +1294,8 @@ struct irdma_cqp_manage_push_page_info { }; struct irdma_qp_flush_info { + u32 err_sq_idx; + u32 err_rq_idx; u16 sq_minor_code; u16 sq_major_code; u16 rq_minor_code; @@ -1306,6 +1306,8 @@ struct irdma_qp_flush_info { bool rq:1; bool userflushcode:1; bool generate_ae:1; + bool err_sq_idx_valid:1; + bool err_rq_idx_valid:1; }; struct irdma_gen_ae_info { diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index 24e8df0..682e848 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ b/drivers/infiniband/hw/irdma/uk.c @@ -1148,6 +1148,7 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, __le64 *cqe; struct irdma_qp_uk *qp; struct irdma_srq_uk *srq; + struct qp_err_code qp_err; u8 is_srq; struct irdma_ring *pring = NULL; u32 wqe_idx; @@ -1233,16 +1234,35 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, if (info->error) { info->major_err = FIELD_GET(IRDMA_CQ_MAJERR, qword3); info->minor_err = FIELD_GET(IRDMA_CQ_MINERR, qword3); - if (info->major_err == IRDMA_FLUSH_MAJOR_ERR) { - info->comp_status = IRDMA_COMPL_STATUS_FLUSHED; + switch (info->major_err) { + case IRDMA_SRQFLUSH_RSVD_MAJOR_ERR: + qp_err = irdma_ae_to_qp_err_code(info->minor_err); + info->minor_err = qp_err.flush_code; + fallthrough; + case IRDMA_FLUSH_MAJOR_ERR: /* Set the min error to standard flush error code for remaining cqes */ if (info->minor_err != FLUSH_GENERAL_ERR) { qword3 &= ~IRDMA_CQ_MINERR; qword3 |= FIELD_PREP(IRDMA_CQ_MINERR, FLUSH_GENERAL_ERR); set_64bit_val(cqe, 24, qword3); } - } else { - info->comp_status = IRDMA_COMPL_STATUS_UNKNOWN; + info->comp_status = IRDMA_COMPL_STATUS_FLUSHED; + break; + default: +#define IRDMA_CIE_SIGNATURE 0xE +#define IRDMA_CQMAJERR_HIGH_NIBBLE GENMASK(15, 12) + if (info->q_type == 
IRDMA_CQE_QTYPE_SQ && + qp->qp_type == IRDMA_QP_TYPE_ROCE_UD && + FIELD_GET(IRDMA_CQMAJERR_HIGH_NIBBLE, info->major_err) + == IRDMA_CIE_SIGNATURE) { + info->error = 0; + info->major_err = 0; + info->minor_err = 0; + info->comp_status = IRDMA_COMPL_STATUS_SUCCESS; + } else { + info->comp_status = IRDMA_COMPL_STATUS_UNKNOWN; + } + break; } } else { info->comp_status = IRDMA_COMPL_STATUS_SUCCESS; @@ -1251,7 +1271,6 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, get_64bit_val(cqe, 0, &qword0); get_64bit_val(cqe, 16, &qword2); - info->tcp_seq_num_rtt = (u32)FIELD_GET(IRDMACQ_TCPSEQNUMRTT, qword0); info->qp_id = (u32)FIELD_GET(IRDMACQ_QPID, qword2); info->ud_src_qpn = (u32)FIELD_GET(IRDMACQ_UDSRCQPN, qword2); @@ -1377,9 +1396,15 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, ret_code = 0; exit: - if (!ret_code && info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) + if (!ret_code && info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) { if (pring && IRDMA_RING_MORE_WORK(*pring)) - move_cq_head = false; + /* Park CQ head during a flush to generate additional CQEs + * from SW for all unprocessed WQEs. For GEN3 and beyond + * FW will generate/flush these CQEs so move to the next CQE + */ + move_cq_head = qp->uk_attrs->hw_rev <= IRDMA_GEN_2 ? 
+ false : true; + } if (move_cq_head) { IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring); diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h index 96dea01..a2029f5 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -46,7 +46,109 @@ #define IRDMA_OP_TYPE_REC 0x3e #define IRDMA_OP_TYPE_REC_IMM 0x3f -#define IRDMA_FLUSH_MAJOR_ERR 1 +#define IRDMA_FLUSH_MAJOR_ERR 1 +#define IRDMA_SRQFLUSH_RSVD_MAJOR_ERR 0xfffe + +/* Async Events codes */ +#define IRDMA_AE_AMP_UNALLOCATED_STAG 0x0102 +#define IRDMA_AE_AMP_INVALID_STAG 0x0103 +#define IRDMA_AE_AMP_BAD_QP 0x0104 +#define IRDMA_AE_AMP_BAD_PD 0x0105 +#define IRDMA_AE_AMP_BAD_STAG_KEY 0x0106 +#define IRDMA_AE_AMP_BAD_STAG_INDEX 0x0107 +#define IRDMA_AE_AMP_BOUNDS_VIOLATION 0x0108 +#define IRDMA_AE_AMP_RIGHTS_VIOLATION 0x0109 +#define IRDMA_AE_AMP_TO_WRAP 0x010a +#define IRDMA_AE_AMP_FASTREG_VALID_STAG 0x010c +#define IRDMA_AE_AMP_FASTREG_MW_STAG 0x010d +#define IRDMA_AE_AMP_FASTREG_INVALID_RIGHTS 0x010e +#define IRDMA_AE_AMP_FASTREG_INVALID_LENGTH 0x0110 +#define IRDMA_AE_AMP_INVALIDATE_SHARED 0x0111 +#define IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS 0x0112 +#define IRDMA_AE_AMP_INVALIDATE_MR_WITH_BOUND_WINDOWS 0x0113 +#define IRDMA_AE_AMP_MWBIND_VALID_STAG 0x0114 +#define IRDMA_AE_AMP_MWBIND_OF_MR_STAG 0x0115 +#define IRDMA_AE_AMP_MWBIND_TO_ZERO_BASED_STAG 0x0116 +#define IRDMA_AE_AMP_MWBIND_TO_MW_STAG 0x0117 +#define IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS 0x0118 +#define IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS 0x0119 +#define IRDMA_AE_AMP_MWBIND_TO_INVALID_PARENT 0x011a +#define IRDMA_AE_AMP_MWBIND_BIND_DISABLED 0x011b +#define IRDMA_AE_PRIV_OPERATION_DENIED 0x011c +#define IRDMA_AE_AMP_INVALIDATE_TYPE1_MW 0x011d +#define IRDMA_AE_AMP_MWBIND_ZERO_BASED_TYPE1_MW 0x011e +#define IRDMA_AE_AMP_FASTREG_INVALID_PBL_HPS_CFG 0x011f +#define IRDMA_AE_AMP_MWBIND_WRONG_TYPE 0x0120 +#define IRDMA_AE_AMP_FASTREG_PBLE_MISMATCH 0x0121 +#define 
IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG 0x0132 +#define IRDMA_AE_UDA_XMIT_BAD_PD 0x0133 +#define IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT 0x0134 +#define IRDMA_AE_UDA_L4LEN_INVALID 0x0135 +#define IRDMA_AE_BAD_CLOSE 0x0201 +#define IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE 0x0202 +#define IRDMA_AE_CQ_OPERATION_ERROR 0x0203 +#define IRDMA_AE_RDMA_READ_WHILE_ORD_ZERO 0x0205 +#define IRDMA_AE_STAG_ZERO_INVALID 0x0206 +#define IRDMA_AE_IB_RREQ_AND_Q1_FULL 0x0207 +#define IRDMA_AE_IB_INVALID_REQUEST 0x0208 +#define IRDMA_AE_SRQ_LIMIT 0x0209 +#define IRDMA_AE_WQE_UNEXPECTED_OPCODE 0x020a +#define IRDMA_AE_WQE_INVALID_PARAMETER 0x020b +#define IRDMA_AE_WQE_INVALID_FRAG_DATA 0x020c +#define IRDMA_AE_IB_REMOTE_ACCESS_ERROR 0x020d +#define IRDMA_AE_IB_REMOTE_OP_ERROR 0x020e +#define IRDMA_AE_SRQ_CATASTROPHIC_ERROR 0x020f +#define IRDMA_AE_WQE_LSMM_TOO_LONG 0x0220 +#define IRDMA_AE_ATOMIC_ALIGNMENT 0x0221 +#define IRDMA_AE_ATOMIC_MASK 0x0222 +#define IRDMA_AE_INVALID_REQUEST 0x0223 +#define IRDMA_AE_PCIE_ATOMIC_DISABLE 0x0224 +#define IRDMA_AE_DDP_INVALID_MSN_GAP_IN_MSN 0x0301 +#define IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER 0x0303 +#define IRDMA_AE_DDP_UBE_INVALID_DDP_VERSION 0x0304 +#define IRDMA_AE_DDP_UBE_INVALID_MO 0x0305 +#define IRDMA_AE_DDP_UBE_INVALID_MSN_NO_BUFFER_AVAILABLE 0x0306 +#define IRDMA_AE_DDP_UBE_INVALID_QN 0x0307 +#define IRDMA_AE_DDP_NO_L_BIT 0x0308 +#define IRDMA_AE_RDMAP_ROE_INVALID_RDMAP_VERSION 0x0311 +#define IRDMA_AE_RDMAP_ROE_UNEXPECTED_OPCODE 0x0312 +#define IRDMA_AE_ROE_INVALID_RDMA_READ_REQUEST 0x0313 +#define IRDMA_AE_ROE_INVALID_RDMA_WRITE_OR_READ_RESP 0x0314 +#define IRDMA_AE_ROCE_RSP_LENGTH_ERROR 0x0316 +#define IRDMA_AE_ROCE_EMPTY_MCG 0x0380 +#define IRDMA_AE_ROCE_BAD_MC_IP_ADDR 0x0381 +#define IRDMA_AE_ROCE_BAD_MC_QPID 0x0382 +#define IRDMA_AE_MCG_QP_PROTOCOL_MISMATCH 0x0383 +#define IRDMA_AE_INVALID_ARP_ENTRY 0x0401 +#define IRDMA_AE_INVALID_TCP_OPTION_RCVD 0x0402 +#define IRDMA_AE_STALE_ARP_ENTRY 0x0403 +#define IRDMA_AE_INVALID_AH_ENTRY 
0x0406 +#define IRDMA_AE_LLP_CLOSE_COMPLETE 0x0501 +#define IRDMA_AE_LLP_CONNECTION_RESET 0x0502 +#define IRDMA_AE_LLP_FIN_RECEIVED 0x0503 +#define IRDMA_AE_LLP_RECEIVED_MARKER_AND_LENGTH_FIELDS_DONT_MATCH 0x0504 +#define IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR 0x0505 +#define IRDMA_AE_LLP_SEGMENT_TOO_SMALL 0x0507 +#define IRDMA_AE_LLP_SYN_RECEIVED 0x0508 +#define IRDMA_AE_LLP_TERMINATE_RECEIVED 0x0509 +#define IRDMA_AE_LLP_TOO_MANY_RETRIES 0x050a +#define IRDMA_AE_LLP_TOO_MANY_KEEPALIVE_RETRIES 0x050b +#define IRDMA_AE_LLP_DOUBT_REACHABILITY 0x050c +#define IRDMA_AE_LLP_CONNECTION_ESTABLISHED 0x050e +#define IRDMA_AE_LLP_TOO_MANY_RNRS 0x050f +#define IRDMA_AE_RESOURCE_EXHAUSTION 0x0520 +#define IRDMA_AE_RESET_SENT 0x0601 +#define IRDMA_AE_TERMINATE_SENT 0x0602 +#define IRDMA_AE_RESET_NOT_SENT 0x0603 +#define IRDMA_AE_LCE_QP_CATASTROPHIC 0x0700 +#define IRDMA_AE_LCE_FUNCTION_CATASTROPHIC 0x0701 +#define IRDMA_AE_LCE_CQ_CATASTROPHIC 0x0702 +#define IRDMA_AE_REMOTE_QP_CATASTROPHIC 0x0703 +#define IRDMA_AE_LOCAL_QP_CATASTROPHIC 0x0704 +#define IRDMA_AE_RCE_QP_CATASTROPHIC 0x0705 +#define IRDMA_AE_QP_SUSPEND_COMPLETE 0x0900 +#define IRDMA_AE_CQP_DEFERRED_COMPLETE 0x0901 +#define IRDMA_AE_ADAPTER_CATASTROPHIC 0x0B0B enum irdma_device_caps_const { IRDMA_WQE_SIZE = 4, @@ -107,6 +209,13 @@ enum irdma_flush_opcode { FLUSH_RETRY_EXC_ERR, FLUSH_MW_BIND_ERR, FLUSH_REM_INV_REQ_ERR, + FLUSH_RNR_RETRY_EXC_ERR, +}; + +enum irdma_qp_event_type { + IRDMA_QP_EVENT_CATASTROPHIC, + IRDMA_QP_EVENT_ACCESS_ERR, + IRDMA_QP_EVENT_REQ_ERR, }; enum irdma_cmpl_status { @@ -280,6 +389,11 @@ struct irdma_cq_poll_info { bool imm_valid:1; }; +struct qp_err_code { + enum irdma_flush_opcode flush_code; + enum irdma_qp_event_type event_type; +}; + int irdma_uk_atomic_compare_swap(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_atomic_fetch_add(struct irdma_qp_uk *qp, @@ -477,4 +591,82 @@ int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, u32 rq_size, u8 shift, int 
irdma_get_srqdepth(struct irdma_uk_attrs *uk_attrs, u32 srq_size, u8 shift, u32 *srqdepth); void irdma_clr_wqes(struct irdma_qp_uk *qp, u32 qp_wqe_idx); + +static inline struct qp_err_code irdma_ae_to_qp_err_code(u16 ae_id) +{ + struct qp_err_code qp_err = {}; + + switch (ae_id) { + case IRDMA_AE_AMP_BOUNDS_VIOLATION: + case IRDMA_AE_AMP_INVALID_STAG: + case IRDMA_AE_AMP_RIGHTS_VIOLATION: + case IRDMA_AE_AMP_UNALLOCATED_STAG: + case IRDMA_AE_AMP_BAD_PD: + case IRDMA_AE_AMP_BAD_QP: + case IRDMA_AE_AMP_BAD_STAG_KEY: + case IRDMA_AE_AMP_BAD_STAG_INDEX: + case IRDMA_AE_AMP_TO_WRAP: + case IRDMA_AE_PRIV_OPERATION_DENIED: + qp_err.flush_code = FLUSH_PROT_ERR; + qp_err.event_type = IRDMA_QP_EVENT_ACCESS_ERR; + break; + case IRDMA_AE_UDA_XMIT_BAD_PD: + case IRDMA_AE_WQE_UNEXPECTED_OPCODE: + qp_err.flush_code = FLUSH_LOC_QP_OP_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + case IRDMA_AE_UDA_XMIT_DGRAM_TOO_SHORT: + case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG: + case IRDMA_AE_UDA_L4LEN_INVALID: + case IRDMA_AE_DDP_UBE_INVALID_MO: + case IRDMA_AE_DDP_UBE_DDP_MESSAGE_TOO_LONG_FOR_AVAILABLE_BUFFER: + qp_err.flush_code = FLUSH_LOC_LEN_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + case IRDMA_AE_AMP_INVALIDATE_NO_REMOTE_ACCESS_RIGHTS: + case IRDMA_AE_IB_REMOTE_ACCESS_ERROR: + qp_err.flush_code = FLUSH_REM_ACCESS_ERR; + qp_err.event_type = IRDMA_QP_EVENT_ACCESS_ERR; + break; + case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS: + case IRDMA_AE_AMP_MWBIND_BIND_DISABLED: + case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS: + case IRDMA_AE_AMP_MWBIND_VALID_STAG: + qp_err.flush_code = FLUSH_MW_BIND_ERR; + qp_err.event_type = IRDMA_QP_EVENT_ACCESS_ERR; + break; + case IRDMA_AE_LLP_TOO_MANY_RETRIES: + qp_err.flush_code = FLUSH_RETRY_EXC_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + case IRDMA_AE_IB_INVALID_REQUEST: + qp_err.flush_code = FLUSH_REM_INV_REQ_ERR; + qp_err.event_type = IRDMA_QP_EVENT_REQ_ERR; + break; + case IRDMA_AE_LLP_SEGMENT_TOO_SMALL: + 
case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR: + case IRDMA_AE_ROCE_RSP_LENGTH_ERROR: + case IRDMA_AE_IB_REMOTE_OP_ERROR: + qp_err.flush_code = FLUSH_REM_OP_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + case IRDMA_AE_LLP_TOO_MANY_RNRS: + qp_err.flush_code = FLUSH_RNR_RETRY_EXC_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + case IRDMA_AE_LCE_QP_CATASTROPHIC: + case IRDMA_AE_REMOTE_QP_CATASTROPHIC: + case IRDMA_AE_LOCAL_QP_CATASTROPHIC: + case IRDMA_AE_RCE_QP_CATASTROPHIC: + qp_err.flush_code = FLUSH_FATAL_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + default: + qp_err.flush_code = FLUSH_GENERAL_ERR; + qp_err.event_type = IRDMA_QP_EVENT_CATASTROPHIC; + break; + } + + return qp_err; +} #endif /* IRDMA_USER_H */ diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index db92465..8765a2a 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -542,10 +542,10 @@ static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) iwqp->sc_qp.qp_uk.destroy_pending = true; - if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS) + if (iwqp->iwarp_state >= IRDMA_QP_STATE_IDLE) irdma_modify_qp_to_err(&iwqp->sc_qp); - if (!iwqp->user_mode) + if (iwdev->rf->rdma_ver <= IRDMA_GEN_2 && !iwqp->user_mode) cancel_delayed_work_sync(&iwqp->dwork_flush); if (!iwqp->user_mode) { @@ -1043,7 +1043,9 @@ static int irdma_create_qp(struct ib_qp *ibqp, err_code = irdma_setup_umode_qp(udata, iwdev, iwqp, &init_info, init_attr); } else { - INIT_DELAYED_WORK(&iwqp->dwork_flush, irdma_flush_worker); + if (uk_attrs->hw_rev <= IRDMA_GEN_2) + INIT_DELAYED_WORK(&iwqp->dwork_flush, + irdma_flush_worker); init_info.qp_uk_init_info.abi_ver = IRDMA_ABI_VER; err_code = irdma_setup_kmode_qp(iwdev, iwqp, &init_info, init_attr); } @@ -4094,15 +4096,22 @@ static int irdma_post_send(struct ib_qp *ibqp, ib_wr = ib_wr->next; } - if (!iwqp->flush_issued) { - if (iwqp->hw_iwarp_state <= 
IRDMA_QP_STATE_RTS) - irdma_uk_qp_post_wr(ukqp); - spin_unlock_irqrestore(&iwqp->lock, flags); + if (ukqp->uk_attrs->hw_rev <= IRDMA_GEN_2) { + if (!iwqp->flush_issued) { + if (iwqp->hw_iwarp_state <= IRDMA_QP_STATE_RTS) + irdma_uk_qp_post_wr(ukqp); + spin_unlock_irqrestore(&iwqp->lock, flags); + } else { + spin_unlock_irqrestore(&iwqp->lock, flags); + mod_delayed_work(iwqp->iwdev->cleanup_wq, + &iwqp->dwork_flush, + msecs_to_jiffies(IRDMA_FLUSH_DELAY_MS)); + } } else { + irdma_uk_qp_post_wr(ukqp); spin_unlock_irqrestore(&iwqp->lock, flags); - mod_delayed_work(iwqp->iwdev->cleanup_wq, &iwqp->dwork_flush, - msecs_to_jiffies(IRDMA_FLUSH_DELAY_MS)); } + if (err) *bad_wr = ib_wr; @@ -4191,7 +4200,7 @@ static int irdma_post_recv(struct ib_qp *ibqp, out: spin_unlock_irqrestore(&iwqp->lock, flags); - if (iwqp->flush_issued) + if (ukqp->uk_attrs->hw_rev <= IRDMA_GEN_2 && iwqp->flush_issued) mod_delayed_work(iwqp->iwdev->cleanup_wq, &iwqp->dwork_flush, msecs_to_jiffies(IRDMA_FLUSH_DELAY_MS)); @@ -4226,6 +4235,8 @@ static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode return IB_WC_MW_BIND_ERR; case FLUSH_REM_INV_REQ_ERR: return IB_WC_REM_INV_REQ_ERR; + case FLUSH_RNR_RETRY_EXC_ERR: + return IB_WC_RNR_RETRY_EXC_ERR; case FLUSH_FATAL_ERR: default: return IB_WC_FATAL_ERR; From patchwork Wed Jul 24 23:39:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nikolova, Tatyana E" X-Patchwork-Id: 13741464 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 563451494DB for ; Wed, 24 Jul 2024 23:40:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721864456; cv=none; 
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Jay Bhat, Tatyana Nikolova
Subject: [RFC PATCH 24/25] RDMA/irdma: Add Push Page Support for GEN3
Date: Wed, 24 Jul 2024 18:39:16 -0500
Message-Id: <20240724233917.704-25-tatyana.e.nikolova@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Jay Bhat

Implement the necessary support for enabling push on GEN3 devices.

Key Changes:
- Introduce an RDMA virtual channel operation with the Control Plane (CP)
  to manage the doorbell/push page, which is a privileged operation.
- Implement the MMIO mapping of push pages, which adheres to the updated
  BAR layout and page indexing specific to GEN3 devices.
- Support up to 16 QPs on a single push page, given that they are tied to
  the same Queue Set.
- Impose limits on the size of WQEs pushed based on the message length
  constraints provided by the CP.
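
The sharing rule in the third item can be sketched in plain C. This is a minimal userspace illustration, not the driver code: the constants mirror IRDMA_QPS_PER_PUSH_PAGE (16) and IRDMA_PUSH_WIN_SIZE (256) from this patch, while struct push_page, push_win_alloc() and push_win_free() are hypothetical names used only here. The patch itself keeps the equivalent state in struct irdma_pd under push_alloc_mutex.

```c
#include <stdbool.h>
#include <stdint.h>

#define QPS_PER_PUSH_PAGE 16   /* mirrors IRDMA_QPS_PER_PUSH_PAGE */
#define PUSH_WIN_SIZE     256  /* mirrors IRDMA_PUSH_WIN_SIZE */
#define NO_SLOT           (-1)

/* Hypothetical stand-in for the per-PD push state (not the driver's struct). */
struct push_page {
	uint16_t used;      /* bit n set => window n is owned by a QP */
	uint16_t qs_handle; /* Queue Set the page is tied to */
};

/* Claim the lowest free window; returns its byte offset or NO_SLOT. */
static int push_win_alloc(struct push_page *pg, uint16_t qs_handle)
{
	/* A page already in use can only be shared within one Queue Set. */
	if (pg->used && pg->qs_handle != qs_handle)
		return NO_SLOT;

	for (int pos = 0; pos < QPS_PER_PUSH_PAGE; pos++) {
		if (!(pg->used & (1u << pos))) {
			pg->used |= (uint16_t)(1u << pos);
			pg->qs_handle = qs_handle;
			return pos * PUSH_WIN_SIZE;
		}
	}
	return NO_SLOT; /* all 16 windows are taken */
}

/* Release a window; returns true once the page holds no QPs at all. */
static bool push_win_free(struct push_page *pg, int offset)
{
	pg->used &= (uint16_t)~(1u << (offset / PUSH_WIN_SIZE));
	return pg->used == 0;
}
```

In this sketch a true return from push_win_free() marks the point where the page itself could be handed back, which corresponds to the bitmap_empty() check in irdma_dealloc_push_page(). Sixteen windows of 256 bytes span 4096 bytes in total.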

Signed-off-by: Jay Bhat
Signed-off-by: Tatyana Nikolova
---
 drivers/infiniband/hw/irdma/ctrl.c     |  1 -
 drivers/infiniband/hw/irdma/defs.h     |  2 +
 drivers/infiniband/hw/irdma/irdma.h    |  1 +
 drivers/infiniband/hw/irdma/type.h     |  3 ++
 drivers/infiniband/hw/irdma/user.h     |  1 -
 drivers/infiniband/hw/irdma/utils.c    | 48 ++++++++++++++++++---
 drivers/infiniband/hw/irdma/verbs.c    | 79 ++++++++++++++++++++++++++++++----
 drivers/infiniband/hw/irdma/verbs.h    |  7 +++
 drivers/infiniband/hw/irdma/virtchnl.c | 40 +++++++++++++++++
 drivers/infiniband/hw/irdma/virtchnl.h | 11 +++++
 include/uapi/rdma/irdma-abi.h          |  3 +-
 11 files changed, 178 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c
index 73cab77..0c13b0b 100644
--- a/drivers/infiniband/hw/irdma/ctrl.c
+++ b/drivers/infiniband/hw/irdma/ctrl.c
@@ -6467,7 +6467,6 @@ int irdma_sc_dev_init(enum irdma_vers ver, struct irdma_sc_dev *dev,
 	dev->hw_attrs.max_hw_outbound_msg_size = IRDMA_MAX_OUTBOUND_MSG_SIZE;
 	dev->hw_attrs.max_mr_size = IRDMA_MAX_MR_SIZE;
 	dev->hw_attrs.max_hw_inbound_msg_size = IRDMA_MAX_INBOUND_MSG_SIZE;
-	dev->hw_attrs.max_hw_device_pages = IRDMA_MAX_PUSH_PAGE_COUNT;
 	dev->hw_attrs.uk_attrs.max_hw_inline = IRDMA_MAX_INLINE_DATA_SIZE;
 	dev->hw_attrs.max_hw_wqes = IRDMA_MAX_WQ_ENTRIES;
 	dev->hw_attrs.max_qp_wr = IRDMA_MAX_QP_WRS(IRDMA_MAX_QUANTA_PER_WR);
diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h
index e75dd8b..53d9588 100644
--- a/drivers/infiniband/hw/irdma/defs.h
+++ b/drivers/infiniband/hw/irdma/defs.h
@@ -167,6 +167,8 @@ enum irdma_protocol_used {
 #define IRDMA_MAX_RQ_WQE_SHIFT_GEN1 2
 #define IRDMA_MAX_RQ_WQE_SHIFT_GEN2 3
+#define IRDMA_DEFAULT_MAX_PUSH_LEN 8192
+
 #define IRDMA_SQ_RSVD 258
 #define IRDMA_RQ_RSVD 1
diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h
index 6af79bb..955ab98 100644
--- a/drivers/infiniband/hw/irdma/irdma.h
+++ b/drivers/infiniband/hw/irdma/irdma.h
@@ -133,6 +133,7 @@ struct irdma_uk_attrs {
 	u32 min_hw_cq_size;
 	u32 max_hw_cq_size;
 	u32 max_hw_srq_quanta;
+	u16 max_hw_push_len;
 	u16 max_hw_sq_chunk;
 	u16 min_hw_wq_size;
 	u8 hw_rev;
diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h
index 9916091..0bfaf8b 100644
--- a/drivers/infiniband/hw/irdma/type.h
+++ b/drivers/infiniband/hw/irdma/type.h
@@ -1289,8 +1289,11 @@ struct irdma_qhash_table_info {
 struct irdma_cqp_manage_push_page_info {
 	u32 push_idx;
 	u16 qs_handle;
+	u16 hmc_fn_id;
 	u8 free_page;
 	u8 push_page_type;
+	u8 page_type;
+	u8 use_hmc_fn_id;
 };
 struct irdma_qp_flush_info {
diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h
index a2029f5..d04175d 100644
--- a/drivers/infiniband/hw/irdma/user.h
+++ b/drivers/infiniband/hw/irdma/user.h
@@ -180,7 +180,6 @@ enum irdma_device_caps_const {
 	IRDMA_MAX_SGE_RD = 13,
 	IRDMA_MAX_OUTBOUND_MSG_SIZE = 2147483647,
 	IRDMA_MAX_INBOUND_MSG_SIZE = 2147483647,
-	IRDMA_MAX_PUSH_PAGE_COUNT = 1024,
 	IRDMA_MAX_PE_ENA_VF_COUNT = 32,
 	IRDMA_MAX_VF_FPM_ID = 47,
 	IRDMA_MAX_SQ_PAYLOAD_SIZE = 2145386496,
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
index 2fdcb88..2e210be 100644
--- a/drivers/infiniband/hw/irdma/utils.c
+++ b/drivers/infiniband/hw/irdma/utils.c
@@ -1156,21 +1156,51 @@ int irdma_cqp_qp_create_cmd(struct irdma_sc_dev *dev, struct irdma_sc_qp *qp)
 /**
  * irdma_dealloc_push_page - free a push page for qp
  * @rf: RDMA PCI function
- * @qp: hardware control qp
+ * @iwqp: QP pointer
  */
 static void irdma_dealloc_push_page(struct irdma_pci_f *rf,
-				    struct irdma_sc_qp *qp)
+				    struct irdma_qp *iwqp)
 {
 	struct irdma_cqp_request *cqp_request;
 	struct cqp_cmds_info *cqp_info;
 	int status;
+	struct irdma_sc_qp *qp = &iwqp->sc_qp;
+	struct irdma_pd *pd = iwqp->iwpd;
+	u32 push_pos;
+	bool is_empty;
 	if (qp->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX)
 		return;
+	mutex_lock(&pd->push_alloc_mutex);
+
+	push_pos = qp->push_offset / IRDMA_PUSH_WIN_SIZE;
+
__clear_bit(push_pos, pd->push_offset_bmap); + is_empty = bitmap_empty(pd->push_offset_bmap, IRDMA_QPS_PER_PUSH_PAGE); + if (!is_empty) { + qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + goto exit; + } + + if (!rf->sc_dev.privileged) { + u32 pg_idx = qp->push_idx; + + status = irdma_vchnl_req_manage_push_pg(&rf->sc_dev, false, + qp->qs_handle, &pg_idx); + if (!status) { + qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + pd->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + } else { + __set_bit(push_pos, pd->push_offset_bmap); + } + goto exit; + } + cqp_request = irdma_alloc_and_get_cqp_request(&rf->cqp, false); - if (!cqp_request) - return; + if (!cqp_request) { + __set_bit(push_pos, pd->push_offset_bmap); + goto exit; + } cqp_info = &cqp_request->info; cqp_info->cqp_cmd = IRDMA_OP_MANAGE_PUSH_PAGE; @@ -1182,9 +1212,15 @@ static void irdma_dealloc_push_page(struct irdma_pci_f *rf, cqp_info->in.u.manage_push_page.cqp = &rf->cqp.sc_cqp; cqp_info->in.u.manage_push_page.scratch = (uintptr_t)cqp_request; status = irdma_handle_cqp_op(rf, cqp_request); - if (!status) + if (!status) { qp->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + pd->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + } else { + __set_bit(push_pos, pd->push_offset_bmap); + } irdma_put_cqp_request(&rf->cqp, cqp_request); +exit: + mutex_unlock(&pd->push_alloc_mutex); } static void irdma_free_gsi_qp_rsrc(struct irdma_qp *iwqp, u32 qp_num) @@ -1218,7 +1254,7 @@ void irdma_free_qp_rsrc(struct irdma_qp *iwqp) u32 qp_num = iwqp->sc_qp.qp_uk.qp_id; irdma_ieq_cleanup_qp(iwdev->vsi.ieq, &iwqp->sc_qp); - irdma_dealloc_push_page(rf, &iwqp->sc_qp); + irdma_dealloc_push_page(rf, iwqp); if (iwqp->sc_qp.vsi) { irdma_qp_rem_qos(&iwqp->sc_qp); iwqp->sc_qp.dev->ws_remove(iwqp->sc_qp.vsi, diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 8765a2a..edcd720 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -245,11 +245,46 @@ static void 
irdma_alloc_push_page(struct irdma_qp *iwqp) struct cqp_cmds_info *cqp_info; struct irdma_device *iwdev = iwqp->iwdev; struct irdma_sc_qp *qp = &iwqp->sc_qp; + struct irdma_pd *pd = iwqp->iwpd; + u32 push_pos = 0; int status; + mutex_lock(&pd->push_alloc_mutex); + if (pd->push_idx == IRDMA_INVALID_PUSH_PAGE_INDEX) { + bitmap_zero(pd->push_offset_bmap, IRDMA_QPS_PER_PUSH_PAGE); + } else { + if (pd->qs_handle == qp->qs_handle) { + push_pos = find_first_zero_bit(pd->push_offset_bmap, + IRDMA_QPS_PER_PUSH_PAGE); + if (push_pos < IRDMA_QPS_PER_PUSH_PAGE) { + qp->push_idx = pd->push_idx; + qp->push_offset = + push_pos * IRDMA_PUSH_WIN_SIZE; + __set_bit(push_pos, pd->push_offset_bmap); + } + } + goto exit; + } + + if (!iwdev->rf->sc_dev.privileged) { + u32 pg_idx; + + status = irdma_vchnl_req_manage_push_pg(&iwdev->rf->sc_dev, + true, qp->qs_handle, + &pg_idx); + if (!status && pg_idx != IRDMA_INVALID_PUSH_PAGE_INDEX) { + qp->push_idx = pg_idx; + qp->push_offset = push_pos * IRDMA_PUSH_WIN_SIZE; + __set_bit(push_pos, pd->push_offset_bmap); + pd->push_idx = pg_idx; + pd->qs_handle = qp->qs_handle; + } + goto exit; + } + cqp_request = irdma_alloc_and_get_cqp_request(&iwdev->rf->cqp, true); if (!cqp_request) - return; + goto exit; cqp_info = &cqp_request->info; cqp_info->cqp_cmd = IRDMA_OP_MANAGE_PUSH_PAGE; @@ -266,10 +301,15 @@ static void irdma_alloc_push_page(struct irdma_qp *iwqp) if (!status && cqp_request->compl_info.op_ret_val < iwdev->rf->sc_dev.hw_attrs.max_hw_device_pages) { qp->push_idx = cqp_request->compl_info.op_ret_val; - qp->push_offset = 0; + qp->push_offset = push_pos * IRDMA_PUSH_WIN_SIZE; + __set_bit(push_pos, pd->push_offset_bmap); + pd->push_idx = cqp_request->compl_info.op_ret_val; + pd->qs_handle = qp->qs_handle; } irdma_put_cqp_request(&iwdev->rf->cqp, cqp_request); +exit: + mutex_unlock(&pd->push_alloc_mutex); } /** @@ -351,6 +391,9 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx, uresp.comp_mask |= IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE; 
uresp.max_hw_srq_quanta = uk_attrs->max_hw_srq_quanta; uresp.comp_mask |= IRDMA_ALLOC_UCTX_MAX_HW_SRQ_QUANTA; + uresp.max_hw_push_len = uk_attrs->max_hw_push_len; + uresp.comp_mask |= IRDMA_SUPPORT_MAX_HW_PUSH_LEN; + if (ib_copy_to_udata(udata, &uresp, min(sizeof(uresp), udata->outlen))) { rdma_user_mmap_entry_remove(ucontext->db_mmap_entry); @@ -410,6 +453,9 @@ static int irdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata) if (err) return err; + iwpd->push_idx = IRDMA_INVALID_PUSH_PAGE_INDEX; + mutex_init(&iwpd->push_alloc_mutex); + sc_pd = &iwpd->sc_pd; if (udata) { struct irdma_ucontext *ucontext = @@ -485,6 +531,23 @@ static void irdma_clean_cqes(struct irdma_qp *iwqp, struct irdma_cq *iwcq) spin_unlock_irqrestore(&iwcq->lock, flags); } +static u64 irdma_compute_push_wqe_offset(struct irdma_device *iwdev, u32 page_idx) +{ + u64 bar_off = (uintptr_t)iwdev->rf->sc_dev.hw_regs[IRDMA_DB_ADDR_OFFSET]; + + if (iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev == IRDMA_GEN_2) { + /* skip over db page */ + bar_off += IRDMA_HW_PAGE_SIZE; + /* skip over reserved space */ + bar_off += IRDMA_PF_BAR_RSVD; + } + + /* push wqe page */ + bar_off += (u64)page_idx * IRDMA_HW_PAGE_SIZE; + + return bar_off; +} + static void irdma_remove_push_mmap_entries(struct irdma_qp *iwqp) { if (iwqp->push_db_mmap_entry) { @@ -503,14 +566,12 @@ static int irdma_setup_push_mmap_entries(struct irdma_ucontext *ucontext, u64 *push_db_mmap_key) { struct irdma_device *iwdev = ucontext->iwdev; - u64 rsvd, bar_off; + u64 bar_off; + + WARN_ON_ONCE(iwdev->rf->sc_dev.hw_attrs.uk_attrs.hw_rev < IRDMA_GEN_2); + + bar_off = irdma_compute_push_wqe_offset(iwdev, iwqp->sc_qp.push_idx); - rsvd = IRDMA_PF_BAR_RSVD; - bar_off = (uintptr_t)iwdev->rf->sc_dev.hw_regs[IRDMA_DB_ADDR_OFFSET]; - /* skip over db page */ - bar_off += IRDMA_HW_PAGE_SIZE; - /* push wqe page */ - bar_off += rsvd + iwqp->sc_qp.push_idx * IRDMA_HW_PAGE_SIZE; iwqp->push_wqe_mmap_entry = irdma_user_mmap_entry_insert(ucontext, bar_off, 
IRDMA_MMAP_IO_WC, push_wqe_mmap_key); diff --git a/drivers/infiniband/hw/irdma/verbs.h b/drivers/infiniband/hw/irdma/verbs.h index 0922a22..71b627d 100644 --- a/drivers/infiniband/hw/irdma/verbs.h +++ b/drivers/infiniband/hw/irdma/verbs.h @@ -8,6 +8,9 @@ #define IRDMA_PKEY_TBL_SZ 1 #define IRDMA_DEFAULT_PKEY 0xFFFF + +#define IRDMA_QPS_PER_PUSH_PAGE 16 +#define IRDMA_PUSH_WIN_SIZE 256 #define IRDMA_SHADOW_PGCNT 1 struct irdma_ucontext { @@ -28,6 +31,10 @@ struct irdma_ucontext { struct irdma_pd { struct ib_pd ibpd; struct irdma_sc_pd sc_pd; + struct mutex push_alloc_mutex; /* protect push page alloc within a PD*/ + DECLARE_BITMAP(push_offset_bmap, IRDMA_QPS_PER_PUSH_PAGE); + u32 push_idx; + u16 qs_handle; }; union irdma_sockaddr { diff --git a/drivers/infiniband/hw/irdma/virtchnl.c b/drivers/infiniband/hw/irdma/virtchnl.c index 9f39cd6..667ec0e 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.c +++ b/drivers/infiniband/hw/irdma/virtchnl.c @@ -66,6 +66,7 @@ int irdma_sc_vchnl_init(struct irdma_sc_dev *dev, dev->privileged = info->privileged; dev->is_pf = info->is_pf; dev->hw_attrs.uk_attrs.hw_rev = info->hw_rev; + dev->hw_attrs.uk_attrs.max_hw_push_len = IRDMA_DEFAULT_MAX_PUSH_LEN; if (!dev->privileged) { int ret = irdma_vchnl_req_get_ver(dev, IRDMA_VCHNL_CHNL_VER_MAX, @@ -83,6 +84,7 @@ int irdma_sc_vchnl_init(struct irdma_sc_dev *dev, return ret; dev->hw_attrs.uk_attrs.hw_rev = dev->vc_caps.hw_rev; + dev->hw_attrs.uk_attrs.max_hw_push_len = dev->vc_caps.max_hw_push_len; } return 0; @@ -107,6 +109,7 @@ static int irdma_vchnl_req_verify_resp(struct irdma_vchnl_req *vchnl_req, if (resp_len < IRDMA_VCHNL_OP_GET_RDMA_CAPS_MIN_SIZE) return -EBADMSG; break; + case IRDMA_VCHNL_OP_MANAGE_PUSH_PAGE: case IRDMA_VCHNL_OP_GET_REG_LAYOUT: case IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP: case IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP: @@ -187,6 +190,40 @@ static int irdma_vchnl_req_send_sync(struct irdma_sc_dev *dev, } /** + * irdma_vchnl_req_manage_push_pg - manage push page + * @dev: rdma device 
pointer + * @add: Add or remove push page + * @qs_handle: qs_handle of push page for add + * @pg_idx: index of push page that is added or removed + */ +int irdma_vchnl_req_manage_push_pg(struct irdma_sc_dev *dev, bool add, + u32 qs_handle, u32 *pg_idx) +{ + struct irdma_vchnl_manage_push_page add_push_pg = {}; + struct irdma_vchnl_req_init_info info = {}; + + if (!dev->vchnl_up) + return -EBUSY; + + add_push_pg.add = add; + add_push_pg.pg_idx = add ? 0 : *pg_idx; + add_push_pg.qs_handle = qs_handle; + + info.op_code = IRDMA_VCHNL_OP_MANAGE_PUSH_PAGE; + info.op_ver = IRDMA_VCHNL_OP_MANAGE_PUSH_PAGE_V0; + info.req_parm = &add_push_pg; + info.req_parm_len = sizeof(add_push_pg); + info.resp_parm = pg_idx; + info.resp_parm_len = sizeof(*pg_idx); + + ibdev_dbg(to_ibdev(dev), + "VIRT: Sending msg: manage_push_pg add = %d, idx %u, qsh %u\n", + add_push_pg.add, add_push_pg.pg_idx, add_push_pg.qs_handle); + + return irdma_vchnl_req_send_sync(dev, &info); +} + +/** * irdma_vchnl_req_get_reg_layout - Get Register Layout * @dev: RDMA device pointer */ @@ -561,6 +598,9 @@ int irdma_vchnl_req_get_caps(struct irdma_sc_dev *dev) if (ret) return ret; + if (!dev->vc_caps.max_hw_push_len) + dev->vc_caps.max_hw_push_len = IRDMA_DEFAULT_MAX_PUSH_LEN; + if (dev->vc_caps.hw_rev > IRDMA_GEN_MAX || dev->vc_caps.hw_rev < IRDMA_GEN_2) { ibdev_dbg(to_ibdev(dev), diff --git a/drivers/infiniband/hw/irdma/virtchnl.h b/drivers/infiniband/hw/irdma/virtchnl.h index 23e66bc..0c88f64 100644 --- a/drivers/infiniband/hw/irdma/virtchnl.h +++ b/drivers/infiniband/hw/irdma/virtchnl.h @@ -14,6 +14,7 @@ #define IRDMA_VCHNL_OP_GET_HMC_FCN_V1 1 #define IRDMA_VCHNL_OP_GET_HMC_FCN_V2 2 #define IRDMA_VCHNL_OP_PUT_HMC_FCN_V0 0 +#define IRDMA_VCHNL_OP_MANAGE_PUSH_PAGE_V0 0 #define IRDMA_VCHNL_OP_GET_REG_LAYOUT_V0 0 #define IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP_V0 0 #define IRDMA_VCHNL_OP_QUEUE_VECTOR_UNMAP_V0 0 @@ -55,6 +56,7 @@ enum irdma_vchnl_ops { IRDMA_VCHNL_OP_GET_VER = 0, IRDMA_VCHNL_OP_GET_HMC_FCN = 1, 
 	IRDMA_VCHNL_OP_PUT_HMC_FCN = 2,
+	IRDMA_VCHNL_OP_MANAGE_PUSH_PAGE = 10,
 	IRDMA_VCHNL_OP_GET_REG_LAYOUT = 11,
 	IRDMA_VCHNL_OP_GET_RDMA_CAPS = 13,
 	IRDMA_VCHNL_OP_QUEUE_VECTOR_MAP = 14,
@@ -125,6 +127,13 @@ struct irdma_vchnl_init_info {
 	bool is_pf;
 };
 
+struct irdma_vchnl_manage_push_page {
+	u8 page_type;
+	u8 add;
+	u32 pg_idx;
+	u32 qs_handle;
+};
+
 struct irdma_vchnl_reg_info {
 	u32 reg_offset;
 	u16 field_cnt;
@@ -167,6 +176,8 @@ int irdma_vchnl_req_get_ver(struct irdma_sc_dev *dev, u16 ver_req,
 int irdma_vchnl_req_get_caps(struct irdma_sc_dev *dev);
 int irdma_vchnl_req_get_resp(struct irdma_sc_dev *dev,
 			     struct irdma_vchnl_req *vc_req);
+int irdma_vchnl_req_manage_push_pg(struct irdma_sc_dev *dev, bool add,
+				   u32 qs_handle, u32 *pg_idx);
 int irdma_vchnl_req_get_reg_layout(struct irdma_sc_dev *dev);
 int irdma_vchnl_req_aeq_vec_map(struct irdma_sc_dev *dev, u32 v_idx);
 int irdma_vchnl_req_ceq_vec_map(struct irdma_sc_dev *dev, u16 ceq_id,
diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h
index f7788d3..9c8cee0 100644
--- a/include/uapi/rdma/irdma-abi.h
+++ b/include/uapi/rdma/irdma-abi.h
@@ -28,6 +28,7 @@ enum {
 	IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE = 1 << 1,
 	IRDMA_ALLOC_UCTX_MAX_HW_SRQ_QUANTA = 1 << 2,
 	IRDMA_SUPPORT_WQE_FORMAT_V2 = 1 << 3,
+	IRDMA_SUPPORT_MAX_HW_PUSH_LEN = 1 << 4,
 };
 
 struct irdma_alloc_ucontext_req {
@@ -58,7 +59,7 @@ struct irdma_alloc_ucontext_resp {
 	__aligned_u64 comp_mask;
 	__u16 min_hw_wq_size;
 	__u32 max_hw_srq_quanta;
-	__u8 rsvd3[2];
+	__u16 max_hw_push_len;
 };
 
 struct irdma_alloc_pd_resp {

From patchwork Wed Jul 24 23:39:17 2024
X-Patchwork-Submitter: "Nikolova, Tatyana E"
X-Patchwork-Id: 13741463
From: Tatyana Nikolova
To: jgg@nvidia.com, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, mustafa.ismail@intel.com, Shiraz Saleem, Tatyana Nikolova
Subject: [RFC PATCH 25/25] RDMA/irdma: Update Kconfig
Date: Wed, 24 Jul 2024 18:39:17 -0500
Message-Id: <20240724233917.704-26-tatyana.e.nikolova@intel.com>
In-Reply-To: <20240724233917.704-1-tatyana.e.nikolova@intel.com>
References: <20240724233917.704-1-tatyana.e.nikolova@intel.com>

From: Shiraz Saleem

Update Kconfig to add dependency on idpf module. Additionally, add IPU
E2000 to list of devices supported.

Signed-off-by: Shiraz Saleem
Signed-off-by: Tatyana Nikolova
---
 drivers/infiniband/hw/irdma/Kconfig | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/irdma/Kconfig b/drivers/infiniband/hw/irdma/Kconfig
index b6f9c41..f6b39f3 100644
--- a/drivers/infiniband/hw/irdma/Kconfig
+++ b/drivers/infiniband/hw/irdma/Kconfig
@@ -4,9 +4,10 @@ config INFINIBAND_IRDMA
 	depends on INET
 	depends on IPV6 || !IPV6
 	depends on PCI
-	depends on ICE && I40E
+	depends on (IDPF || ICE) && I40E
 	select GENERIC_ALLOCATOR
 	select AUXILIARY_BUS
 	help
-	  This is an Intel(R) Ethernet Protocol Driver for RDMA driver
-	  that support E810 (iWARP/RoCE) and X722 (iWARP) network devices.
+	  This is an Intel(R) Ethernet Protocol Driver for RDMA that
+	  supports IPU E2000 (RoCEv2), E810 (iWARP/RoCE) and X722 (iWARP)
+	  network devices.
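
The changed dependency line can be read through Kconfig's tristate rules, where n < m < y, `&&` evaluates to the minimum of its operands and `||` to the maximum. The following is a minimal sketch in C, assuming INFINIBAND_IRDMA is a tristate symbol; the names `tri_and`, `tri_or` and `irdma_dep_limit` are hypothetical helpers made up for illustration and are not part of the kernel or of this series. It only models the upper limit that `depends on (IDPF || ICE) && I40E` places on the symbol and ignores the other `depends on` lines in the entry.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig '&&' evaluates to the minimum of its operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Kconfig '||' evaluates to the maximum of its operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical helper (not kernel code): upper limit imposed by
 * "depends on (IDPF || ICE) && I40E". The INET, IPV6 and PCI
 * dependencies of the same entry are deliberately left out.
 */
static enum tristate irdma_dep_limit(enum tristate idpf, enum tristate ice,
				     enum tristate i40e)
{
	return tri_and(tri_or(idpf, ice), i40e);
}
```

Under these rules, IDPF=y with ICE=n and I40E=y gives a limit of y, whereas the previous `ICE && I40E` expression would have evaluated to n for the same configuration. I40E=n still forces n regardless of IDPF and ICE.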