From patchwork Mon Jan 9 13:30:52 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093565
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Boris Pismenny, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com
Subject: [PATCH v8 01/25] net: Introduce direct data placement tcp offload
Date: Mon, 9 Jan 2023 15:30:52 +0200
Message-Id: <20230109133116.20801-2-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org
From: Boris Pismenny

This commit introduces direct data placement (DDP) offload for TCP.

The motivation is to save the compute resources/cycles spent copying data from SKBs to block layer buffers, and on CRC calculation/verification for received PDUs (Protocol Data Units).

The DDP capability is accompanied by new net_device operations that configure hardware contexts: there is a context per socket, and a context per DDP operation. Additionally, a resynchronization routine is used to assist the hardware in handling TCP out-of-order (OOO) segments and continuing the offload. Furthermore, the offloading driver can advertise its maximum hardware sectors/segments.

The interface includes five net-device DDP operations:

1. sk_add - add offload for the queue represented by a socket+config pair
2. sk_del - remove the offload for the socket/queue
3. ddp_setup - request copy offload for buffers associated with an IO
4. ddp_teardown - release offload resources for that IO
5. limits - query the NIC driver for quirks and limitations (e.g. max number of scatter-gather entries per IO)

Using this interface, the NIC hardware will scatter TCP payload directly to the BIO pages according to the command_id. To maintain the correctness of the network stack, the driver is expected to construct SKBs that point to the BIO pages.

The SKB passed to the network stack from the driver represents data as it is on the wire, while pointing directly at data in the destination buffers. As a result, data from page frags should not be copied out to the linear part. To avoid needless copies, such as when using skb_condense(), we set the skb->ulp_ddp bit. In addition, skb->ulp_crc will be used by the upper layers to determine whether CRC re-calculation is required. The two separate skb indications are needed to avoid false-positive GRO flush events.

Follow-up patches will use this interface for DDP in NVMe-TCP.

Capability bits stored in net_device allow drivers to report which ULP DDP capabilities a device supports. Control over these capabilities will be exposed to userspace in later patches.
Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Shai Malin Signed-off-by: Aurelien Aptel --- include/linux/netdevice.h | 12 ++ include/linux/skbuff.h | 24 ++++ include/net/inet_connection_sock.h | 4 + include/net/ulp_ddp.h | 173 +++++++++++++++++++++++++++++ include/net/ulp_ddp_caps.h | 41 +++++++ net/Kconfig | 20 ++++ net/core/skbuff.c | 3 +- net/ipv4/tcp_input.c | 8 ++ net/ipv4/tcp_ipv4.c | 3 + net/ipv4/tcp_offload.c | 3 + 10 files changed, 290 insertions(+), 1 deletion(-) create mode 100644 include/net/ulp_ddp.h create mode 100644 include/net/ulp_ddp_caps.h diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index aad12a179e54..bd270c4bbf97 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,10 @@ #include #include +#ifdef CONFIG_ULP_DDP +#include +#endif + struct netpoll_info; struct device; struct ethtool_ops; @@ -1392,6 +1396,8 @@ struct netdev_net_notifier { * Get hardware timestamp based on normal/adjustable time or free running * cycle counter. This function is required if physical clock supports a * free running cycle counter. 
+ * struct ulp_ddp_dev_ops *ulp_ddp_ops; + * ULP DDP operations (see include/net/ulp_ddp.h) */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); @@ -1616,6 +1622,9 @@ struct net_device_ops { ktime_t (*ndo_get_tstamp)(struct net_device *dev, const struct skb_shared_hwtstamps *hwtstamps, bool cycles); +#if IS_ENABLED(CONFIG_ULP_DDP) + const struct ulp_ddp_dev_ops *ulp_ddp_ops; +#endif }; /** @@ -2071,6 +2080,9 @@ struct net_device { netdev_features_t mpls_features; netdev_features_t gso_partial_features; +#ifdef CONFIG_ULP_DDP + struct ulp_ddp_netdev_caps ulp_ddp_caps; +#endif unsigned int min_mtu; unsigned int max_mtu; unsigned short type; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4c8492401a10..8708c5935e89 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -811,6 +811,8 @@ typedef unsigned char *sk_buff_data_t; * delivery_time in mono clock base (i.e. EDT). Otherwise, the * skb->tstamp has the (rcv) timestamp at ingress and * delivery_time at egress. + * @ulp_ddp: DDP offloaded + * @ulp_crc: CRC offloaded * @napi_id: id of the NAPI struct this skb came from * @sender_cpu: (aka @napi_id) source CPU in XPS * @alloc_cpu: CPU which did the skb allocation. 
@@ -983,6 +985,10 @@ struct sk_buff { __u8 slow_gro:1; __u8 csum_not_inet:1; __u8 scm_io_uring:1; +#ifdef CONFIG_ULP_DDP + __u8 ulp_ddp:1; + __u8 ulp_crc:1; +#endif #ifdef CONFIG_NET_SCHED __u16 tc_index; /* traffic control index */ @@ -5053,5 +5059,23 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb) } #endif +static inline bool skb_is_ulp_ddp(struct sk_buff *skb) +{ +#ifdef CONFIG_ULP_DDP + return skb->ulp_ddp; +#else + return 0; +#endif +} + +static inline bool skb_is_ulp_crc(struct sk_buff *skb) +{ +#ifdef CONFIG_ULP_DDP + return skb->ulp_crc; +#else + return 0; +#endif +} + #endif /* __KERNEL__ */ #endif /* _LINUX_SKBUFF_H */ diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index c2b15f7e5516..2ba73167b3bb 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -68,6 +68,8 @@ struct inet_connection_sock_af_ops { * @icsk_ulp_ops Pluggable ULP control hook * @icsk_ulp_data ULP private data * @icsk_clean_acked Clean acked data hook + * @icsk_ulp_ddp_ops Pluggable ULP direct data placement control hook + * @icsk_ulp_ddp_data ULP direct data placement private data * @icsk_ca_state: Congestion control state * @icsk_retransmits: Number of unrecovered [RTO] timeouts * @icsk_pending: Scheduled timer event @@ -98,6 +100,8 @@ struct inet_connection_sock { const struct tcp_ulp_ops *icsk_ulp_ops; void __rcu *icsk_ulp_data; void (*icsk_clean_acked)(struct sock *sk, u32 acked_seq); + const struct ulp_ddp_ulp_ops *icsk_ulp_ddp_ops; + void __rcu *icsk_ulp_ddp_data; unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu); __u8 icsk_ca_state:5, icsk_ca_initialized:1, diff --git a/include/net/ulp_ddp.h b/include/net/ulp_ddp.h new file mode 100644 index 000000000000..d3e0180462a5 --- /dev/null +++ b/include/net/ulp_ddp.h @@ -0,0 +1,173 @@ +/* SPDX-License-Identifier: GPL-2.0 + * + * ulp_ddp.h + * Author: Boris Pismenny + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+ */ +#ifndef _ULP_DDP_H +#define _ULP_DDP_H + +#include +#include +#include + +#include "ulp_ddp_caps.h" + +enum ulp_ddp_type { + ULP_DDP_NVME = 1, +}; + +/** + * struct nvme_tcp_ddp_limits - nvme tcp driver limitations + * + * @full_ccid_range: true if the driver supports the full CID range + */ +struct nvme_tcp_ddp_limits { + bool full_ccid_range; +}; + +/** + * struct ulp_ddp_limits - Generic ulp ddp limits: tcp ddp + * protocol limits. + * Add new instances of ulp_ddp_limits in the union below (nvme-tcp, etc.). + * + * @type: type of this limits struct + * @max_ddp_sgl_len: maximum sgl size supported (zero means no limit) + * @io_threshold: minimum payload size required to offload + * @nvmeotcp: NVMe-TCP specific limits + */ +struct ulp_ddp_limits { + enum ulp_ddp_type type; + int max_ddp_sgl_len; + int io_threshold; + union { + struct nvme_tcp_ddp_limits nvmeotcp; + }; +}; + +/** + * struct nvme_tcp_ddp_config - nvme tcp ddp configuration for an IO queue + * + * @pfv: pdu version (e.g., NVME_TCP_PFV_1_0) + * @cpda: controller pdu data alignment (dwords, 0's based) + * @dgst: digest types enabled (header or data, see enum nvme_tcp_digest_option). + * The netdev will offload crc if it is supported. + * @queue_size: number of nvme-tcp IO queue elements + * @queue_id: queue identifier + * @io_cpu: cpu core running the IO thread for this queue + */ +struct nvme_tcp_ddp_config { + u16 pfv; + u8 cpda; + u8 dgst; + int queue_size; + int queue_id; + int io_cpu; +}; + +/** + * struct ulp_ddp_config - Generic ulp ddp configuration + * Add new instances of ulp_ddp_config in the union below (nvme-tcp, etc.). + * + * @type: type of this config struct + * @nvmeotcp: NVMe-TCP specific config + */ +struct ulp_ddp_config { + enum ulp_ddp_type type; + union { + struct nvme_tcp_ddp_config nvmeotcp; + }; +}; + +/** + * struct ulp_ddp_io - ulp ddp configuration for an IO request. 
+ * + * @command_id: identifier on the wire associated with these buffers + * @nents: number of entries in the sg_table + * @sg_table: describing the buffers for this IO request + * @first_sgl: first SGL in sg_table + */ +struct ulp_ddp_io { + u32 command_id; + int nents; + struct sg_table sg_table; + struct scatterlist first_sgl[SG_CHUNK_SIZE]; +}; + +/** + * struct ulp_ddp_dev_ops - operations used by an upper layer protocol + * to configure ddp offload + * + * @ulp_ddp_limits: query ulp driver limitations and quirks. + * @ulp_ddp_sk_add: add offload for the queue represented by socket+config + * pair. this function is used to configure either copy, crc + * or both offloads. + * @ulp_ddp_sk_del: remove offload from the socket, and release any device + * related resources. + * @ulp_ddp_setup: request copy offload for buffers associated with a + * command_id in ulp_ddp_io. + * @ulp_ddp_teardown: release offload resources association between buffers + * and command_id in ulp_ddp_io. + * @ulp_ddp_resync: respond to the driver's resync_request. Called only if + * resync is successful. + */ +struct ulp_ddp_dev_ops { + int (*ulp_ddp_limits)(struct net_device *netdev, + struct ulp_ddp_limits *limits); + int (*ulp_ddp_sk_add)(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_config *config); + void (*ulp_ddp_sk_del)(struct net_device *netdev, + struct sock *sk); + int (*ulp_ddp_setup)(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_io *io); + void (*ulp_ddp_teardown)(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_io *io, + void *ddp_ctx); + void (*ulp_ddp_resync)(struct net_device *netdev, + struct sock *sk, u32 seq); +}; + +#define ULP_DDP_RESYNC_PENDING BIT(0) + +/** + * struct ulp_ddp_ulp_ops - Interface to register upper layer + * Direct Data Placement (DDP) TCP offload. + * @resync_request: NIC requests ulp to indicate if @seq is the start + * of a message. 
+ * @ddp_teardown_done: NIC driver informs the ulp that teardown is done, + * used for async completions. + */ +struct ulp_ddp_ulp_ops { + bool (*resync_request)(struct sock *sk, u32 seq, u32 flags); + void (*ddp_teardown_done)(void *ddp_ctx); +}; + +/** + * struct ulp_ddp_ctx - Generic ulp ddp context + * + * @type: type of this context struct + * @buf: protocol-specific context struct + */ +struct ulp_ddp_ctx { + enum ulp_ddp_type type; + unsigned char buf[]; +}; + +static inline struct ulp_ddp_ctx *ulp_ddp_get_ctx(const struct sock *sk) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + return (__force struct ulp_ddp_ctx *)icsk->icsk_ulp_ddp_data; +} + +static inline void ulp_ddp_set_ctx(struct sock *sk, void *ctx) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + rcu_assign_pointer(icsk->icsk_ulp_ddp_data, ctx); +} + +#endif /* _ULP_DDP_H */ diff --git a/include/net/ulp_ddp_caps.h b/include/net/ulp_ddp_caps.h new file mode 100644 index 000000000000..e16e5a694238 --- /dev/null +++ b/include/net/ulp_ddp_caps.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 + * + * ulp_ddp.h + * Author: Aurelien Aptel + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+ */ +#ifndef _ULP_DDP_CAPS_H +#define _ULP_DDP_CAPS_H + +#include + +enum { + ULP_DDP_C_NVME_TCP_BIT, + ULP_DDP_C_NVME_TCP_DDGST_RX_BIT, + + /* add capabilities above */ + ULP_DDP_C_COUNT, +}; + +#define __ULP_DDP_C_BIT(bit) ((u64)1 << (bit)) +#define __ULP_DDP_C(name) __ULP_DDP_C_BIT(ULP_DDP_C_##name##_BIT) + +#define ULP_DDP_C_NVME_TCP __ULP_DDP_C(NVME_TCP) +#define ULP_DDP_C_NVME_TCP_DDGST_RX __ULP_DDP_C(NVME_TCP_DDGST_RX) + +struct ulp_ddp_netdev_caps { + DECLARE_BITMAP(active, ULP_DDP_C_COUNT); + DECLARE_BITMAP(hw, ULP_DDP_C_COUNT); +}; + +static inline bool ulp_ddp_cap_turned_on(unsigned long *old, unsigned long *new, int bit_nr) +{ + return !test_bit(bit_nr, old) && test_bit(bit_nr, new); +} + +static inline bool ulp_ddp_cap_turned_off(unsigned long *old, unsigned long *new, int bit_nr) +{ + return test_bit(bit_nr, old) && !test_bit(bit_nr, new); +} + +#endif diff --git a/net/Kconfig b/net/Kconfig index 48c33c222199..3c59eba4a438 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -471,4 +471,24 @@ config NETDEV_ADDR_LIST_TEST default KUNIT_ALL_TESTS depends on KUNIT +config ULP_DDP + bool "ULP direct data placement offload" + help + This feature provides a generic infrastructure for Direct + Data Placement (DDP) offload for Upper Layer Protocols (ULP, + such as NVMe-TCP). + + If the ULP and NIC driver support it, the ULP code can + request the NIC to place ULP response data directly + into application memory, avoiding a costly copy. + + This infrastructure also allows for offloading the ULP data + integrity checks (e.g. data digest) that would otherwise + require another costly pass on the data we managed to avoid + copying. + + For more information, see + .
+ + endif # if NET diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 3a10387f9434..4b77f7c4687f 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -72,6 +72,7 @@ #include #include #include +#include #include #include @@ -6476,7 +6477,7 @@ void skb_condense(struct sk_buff *skb) { if (skb->data_len) { if (skb->data_len > skb->end - skb->tail || - skb_cloned(skb)) + skb_cloned(skb) || skb_is_ulp_ddp(skb)) return; /* Nice, we can free page frag(s) right now */ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index cc072d2cfcd8..c711614604a6 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5234,6 +5234,10 @@ tcp_collapse(struct sock *sk, struct sk_buff_head *list, struct rb_root *root, memcpy(nskb->cb, skb->cb, sizeof(skb->cb)); #ifdef CONFIG_TLS_DEVICE nskb->decrypted = skb->decrypted; +#endif +#ifdef CONFIG_ULP_DDP + nskb->ulp_ddp = skb->ulp_ddp; + nskb->ulp_crc = skb->ulp_crc; #endif TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(nskb)->end_seq = start; if (list) @@ -5267,6 +5271,10 @@ tcp_collapse(struct sock *sk, struct sk_buff_head *list, struct rb_root *root, #ifdef CONFIG_TLS_DEVICE if (skb->decrypted != nskb->decrypted) goto end; +#endif +#ifdef CONFIG_ULP_DDP + if (skb_is_ulp_crc(skb) != skb_is_ulp_crc(nskb)) + goto end; #endif } } diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 8320d0ecb13a..8c5e1e2e2809 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1861,6 +1861,9 @@ bool tcp_add_backlog(struct sock *sk, struct sk_buff *skb, TCP_SKB_CB(skb)->tcp_flags) & (TCPHDR_ECE | TCPHDR_CWR)) || #ifdef CONFIG_TLS_DEVICE tail->decrypted != skb->decrypted || +#endif +#ifdef CONFIG_ULP_DDP + skb_is_ulp_crc(tail) != skb_is_ulp_crc(skb) || #endif thtail->doff != th->doff || memcmp(thtail + 1, th + 1, hdrlen - sizeof(*th))) diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index 45dda7889387..2e62f18e85c0 100644 --- a/net/ipv4/tcp_offload.c +++ b/net/ipv4/tcp_offload.c @@ -268,6 +268,9 @@ struct sk_buff 
*tcp_gro_receive(struct list_head *head, struct sk_buff *skb) #ifdef CONFIG_TLS_DEVICE flush |= p->decrypted ^ skb->decrypted; #endif +#ifdef CONFIG_ULP_DDP + flush |= skb_is_ulp_crc(p) ^ skb_is_ulp_crc(skb); +#endif if (flush || skb_gro_receive(p, skb)) { mss = 1;

From patchwork Mon Jan 9 13:30:53 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093566
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 02/25] net/ethtool: add new stringset ETH_SS_ULP_DDP{,_STATS}
Date: Mon, 9 Jan 2023 15:30:53 +0200
Message-Id: <20230109133116.20801-3-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

This commit exposes ULP DDP capability and statistics names to userspace via netlink.

In order to support future ULP DDP capabilities and statistics without having to change the netlink protocol (and userspace ethtool), we add new string sets to let userspace dynamically fetch what the kernel supports. This also allows drivers to return their own statistics.

* ETH_SS_ULP_DDP stores the names of ULP DDP capabilities.
* ETH_SS_ULP_DDP_STATS stores the names of ULP DDP statistics. Different drivers can report different stats, so this is a per-device string set.

These string sets will be used in later commits when implementing the new ULP DDP GET/SET netlink messages.

We keep the convention of strset.c of having the static_assert() right after the array declaration, despite the checkpatch warning.
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 include/uapi/linux/ethtool.h | 4 ++++
 net/ethtool/common.c         | 7 +++++++
 net/ethtool/common.h         | 1 +
 net/ethtool/strset.c         | 9 +++++++++
 4 files changed, 21 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 3135fa0ba9a4..c7bd00976309 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -681,6 +681,8 @@ enum ethtool_link_ext_substate_module {
  * @ETH_SS_STATS_ETH_MAC: names of IEEE 802.3 MAC statistics
  * @ETH_SS_STATS_ETH_CTRL: names of IEEE 802.3 MAC Control statistics
  * @ETH_SS_STATS_RMON: names of RMON statistics
+ * @ETH_SS_ULP_DDP: names of ULP DDP capabilities
+ * @ETH_SS_ULP_DDP_STATS: per-device names of ULP DDP statistics
  *
  * @ETH_SS_COUNT: number of defined string sets
  */
@@ -706,6 +708,8 @@ enum ethtool_stringset {
 	ETH_SS_STATS_ETH_MAC,
 	ETH_SS_STATS_ETH_CTRL,
 	ETH_SS_STATS_RMON,
+	ETH_SS_ULP_DDP,
+	ETH_SS_ULP_DDP_STATS,
 
 	/* add new constants above here */
 	ETH_SS_COUNT
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 6f399afc2ff2..5ecaaa25cc98 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 
 #include "common.h"
 
@@ -457,6 +458,12 @@ const char udp_tunnel_type_names[][ETH_GSTRING_LEN] = {
 static_assert(ARRAY_SIZE(udp_tunnel_type_names) ==
 	      __ETHTOOL_UDP_TUNNEL_TYPE_CNT);
 
+const char ulp_ddp_names[][ETH_GSTRING_LEN] = {
+	[ULP_DDP_C_NVME_TCP_BIT] = "nvme-tcp-ddp",
+	[ULP_DDP_C_NVME_TCP_DDGST_RX_BIT] = "nvme-tcp-ddgst-rx-offload",
+};
+static_assert(ARRAY_SIZE(ulp_ddp_names) == ULP_DDP_C_COUNT);
+
 /* return false if legacy contained non-0 deprecated fields
  * maxtxpkt/maxrxpkt. rest of ksettings always updated
  */
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index b1b9db810eca..3d4bb3fb43db 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -36,6 +36,7 @@ extern const char sof_timestamping_names[][ETH_GSTRING_LEN];
 extern const char ts_tx_type_names[][ETH_GSTRING_LEN];
 extern const char ts_rx_filter_names[][ETH_GSTRING_LEN];
 extern const char udp_tunnel_type_names[][ETH_GSTRING_LEN];
+extern const char ulp_ddp_names[][ETH_GSTRING_LEN];
 
 int __ethtool_get_link(struct net_device *dev);
 
diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c
index 3f7de54d85fb..3928b5548713 100644
--- a/net/ethtool/strset.c
+++ b/net/ethtool/strset.c
@@ -2,6 +2,7 @@
 
 #include
 #include
+#include
 
 #include "netlink.h"
 #include "common.h"
@@ -105,6 +106,14 @@ static const struct strset_info info_template[] = {
 		.count	= __ETHTOOL_A_STATS_RMON_CNT,
 		.strings = stats_rmon_names,
 	},
+	[ETH_SS_ULP_DDP] = {
+		.per_dev = false,
+		.count	= ULP_DDP_C_COUNT,
+		.strings = ulp_ddp_names,
+	},
+	[ETH_SS_ULP_DDP_STATS] = {
+		.per_dev = true,
+	},
 };
 
 struct strset_req_info {
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 03/25] net/ethtool: add ULP_DDP_{GET,SET} operations for caps and stats
Date: Mon, 9 Jan 2023 15:30:54 +0200
Message-Id: <20230109133116.20801-4-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
This commit adds:
- 2 new netlink messages:
  * ULP_DDP_GET: returns a bitset of supported and active capabilities
  * ULP_DDP_SET: tries to activate requested bitset and returns results
- 2 new netdev ethtool_ops operations:
  * ethtool_ops->get_ulp_ddp_stats(): retrieve device statistics
  * ethtool_ops->set_ulp_ddp_capabilities(): try to apply capability changes

ULP DDP capabilities handling is similar to netdev features handling.

If a ULP_DDP_GET message has requested statistics via the ETHTOOL_FLAG_STATS header flag, then per-device statistics are returned to userspace.

Similar to netdev features, ULP_DDP_GET capabilities and statistics can be returned in a verbose (default) or compact form (if ETHTOOL_FLAG_COMPACT_BITSET is set in header flags).

Verbose statistics are nested as follows:

  STATS (nest)
    COUNT (u32)
    MAP (nest)
      ITEM (nest)
        NAME (strz)
        VAL (u64)
      ...

Compact statistics are nested as follows:

  STATS (nest)
    COUNT (u32)
    COMPACT_VALUES (array of u64)

Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 include/linux/ethtool.h              |   2 +
 include/uapi/linux/ethtool_netlink.h |  49 +++
 net/ethtool/Makefile                 |   2 +-
 net/ethtool/netlink.c                |  17 ++
 net/ethtool/netlink.h                |   4 +
 net/ethtool/ulp_ddp.c                | 430 +++++++++++++++++++++++++++
 6 files changed, 503 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/ulp_ddp.c

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 9e0a76fc7de9..7b7ba8a89cb1 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -777,6 +777,8 @@ struct ethtool_ops {
 	int	(*set_module_power_mode)(struct net_device *dev,
 					 const struct ethtool_module_power_mode_params *params,
 					 struct netlink_ext_ack *extack);
+	int	(*get_ulp_ddp_stats)(struct net_device *dev, u64 *ulp_ddp_stats);
+	int	(*set_ulp_ddp_capabilities)(struct net_device *dev, unsigned long *bits);
 };
 
 int ethtool_check_ops(const struct ethtool_ops *ops);
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 5799a9db034e..7a4a66a9b3ea 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -52,6 +52,8 @@ enum {
 	ETHTOOL_MSG_PSE_GET,
 	ETHTOOL_MSG_PSE_SET,
 	ETHTOOL_MSG_RSS_GET,
+	ETHTOOL_MSG_ULP_DDP_GET,
+	ETHTOOL_MSG_ULP_DDP_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -99,6 +101,8 @@ enum {
 	ETHTOOL_MSG_MODULE_NTF,
 	ETHTOOL_MSG_PSE_GET_REPLY,
 	ETHTOOL_MSG_RSS_GET_REPLY,
+	ETHTOOL_MSG_ULP_DDP_GET_REPLY,
+	ETHTOOL_MSG_ULP_DDP_SET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -894,6 +898,51 @@ enum {
 	ETHTOOL_A_RSS_MAX = (__ETHTOOL_A_RSS_CNT - 1),
 };
 
+/* ULP DDP */
+
+enum {
+	ETHTOOL_A_ULP_DDP_UNSPEC,
+	ETHTOOL_A_ULP_DDP_HEADER,	/* nest - _A_HEADER_* */
+	ETHTOOL_A_ULP_DDP_HW,		/* bitset */
+	ETHTOOL_A_ULP_DDP_ACTIVE,	/* bitset */
+	ETHTOOL_A_ULP_DDP_WANTED,	/* bitset */
+	ETHTOOL_A_ULP_DDP_STATS,	/* nest - _A_ULP_DDP_STATS_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_ULP_DDP_CNT,
+	ETHTOOL_A_ULP_DDP_MAX = __ETHTOOL_A_ULP_DDP_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_ULP_DDP_STATS_UNSPEC,
+	ETHTOOL_A_ULP_DDP_STATS_COUNT,		/* u32 */
+	ETHTOOL_A_ULP_DDP_STATS_COMPACT_VALUES,	/* array, u64 */
+	ETHTOOL_A_ULP_DDP_STATS_MAP,		/* nest - _A_ULP_DDP_STATS_MAP_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_ULP_DDP_STATS_CNT,
+	ETHTOOL_A_ULP_DDP_STATS_MAX = __ETHTOOL_A_ULP_DDP_STATS_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_ULP_DDP_STATS_MAP_UNSPEC,
+	ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM,	/* nest - _A_ULP_DDP_STATS_MAP_ITEM_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_ULP_DDP_STATS_MAP_CNT,
+	ETHTOOL_A_ULP_DDP_STATS_MAP_MAX = __ETHTOOL_A_ULP_DDP_STATS_MAP_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_UNSPEC,
+	ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_NAME,	/* string */
+	ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_VAL,	/* u64 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_CNT,
+	ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_MAX = __ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 228f13df2e18..c1c6ddce7d3f 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -8,4 +8,4 @@ ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o rss.o \
 		   linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		   channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
 		   tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o \
-		   pse-pd.o
+		   pse-pd.o ulp_ddp.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index aee98be6237f..1ebd512dca2e 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -288,6 +288,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_MODULE_GET]	= &ethnl_module_request_ops,
 	[ETHTOOL_MSG_PSE_GET]		= &ethnl_pse_request_ops,
 	[ETHTOOL_MSG_RSS_GET]		= &ethnl_rss_request_ops,
+	[ETHTOOL_MSG_ULP_DDP_GET]	= &ethnl_ulp_ddp_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -1047,6 +1048,22 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_rss_get_policy,
 		.maxattr = ARRAY_SIZE(ethnl_rss_get_policy) - 1,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_ULP_DDP_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+		.policy	= ethnl_ulp_ddp_get_policy,
+		.maxattr = ARRAY_SIZE(ethnl_ulp_ddp_get_policy) - 1,
+	},
+	{
+		.cmd	= ETHTOOL_MSG_ULP_DDP_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_ulp_ddp,
+		.policy	= ethnl_ulp_ddp_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_ulp_ddp_set_policy) - 1,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 3753787ba233..8040fb1e86e4 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -347,6 +347,7 @@ extern const struct ethnl_request_ops ethnl_phc_vclocks_request_ops;
 extern const struct ethnl_request_ops ethnl_module_request_ops;
 extern const struct ethnl_request_ops ethnl_pse_request_ops;
 extern const struct ethnl_request_ops ethnl_rss_request_ops;
+extern const struct ethnl_request_ops ethnl_ulp_ddp_request_ops;
 
 extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
 extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
@@ -388,6 +389,8 @@ extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MO
 extern const struct nla_policy ethnl_pse_get_policy[ETHTOOL_A_PSE_HEADER + 1];
 extern const struct nla_policy ethnl_pse_set_policy[ETHTOOL_A_PSE_MAX + 1];
 extern const struct nla_policy ethnl_rss_get_policy[ETHTOOL_A_RSS_CONTEXT + 1];
+extern const struct nla_policy ethnl_ulp_ddp_get_policy[ETHTOOL_A_ULP_DDP_HEADER + 1];
+extern const struct nla_policy ethnl_ulp_ddp_set_policy[ETHTOOL_A_ULP_DDP_WANTED + 1];
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
@@ -408,6 +411,7 @@ int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
 int ethnl_set_fec(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_module(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_pse(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_ulp_ddp(struct sk_buff *skb, struct genl_info *info);
 
 extern const char stats_std_names[__ETHTOOL_STATS_CNT][ETH_GSTRING_LEN];
 extern const char stats_eth_phy_names[__ETHTOOL_A_STATS_ETH_PHY_CNT][ETH_GSTRING_LEN];
diff --git a/net/ethtool/ulp_ddp.c b/net/ethtool/ulp_ddp.c
new file mode 100644
index 000000000000..f4339e964d2d
--- /dev/null
+++ b/net/ethtool/ulp_ddp.c
@@ -0,0 +1,430 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * ulp_ddp.c
+ *	Author: Aurelien Aptel
+ *	Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ */
+
+#include "netlink.h"
+#include "common.h"
+#include "bitset.h"
+#include
+
+static struct ulp_ddp_netdev_caps *netdev_ulp_ddp_caps(struct net_device *dev)
+{
+#ifdef CONFIG_ULP_DDP
+	return &dev->ulp_ddp_caps;
+#else
+	return NULL;
+#endif
+}
+
+/* ULP_DDP_GET */
+
+struct ulp_ddp_req_info {
+	struct ethnl_req_info base;
+};
+
+struct ulp_ddp_reply_data {
+	struct ethnl_reply_data base;
+	DECLARE_BITMAP(hw, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(active, ULP_DDP_C_COUNT);
+	const char (*stat_names)[ETH_GSTRING_LEN];
+	int stat_count;
+	u64 *stats;
+};
+
+#define ULP_DDP_REPDATA(__reply_base) \
+	container_of(__reply_base, struct ulp_ddp_reply_data, base)
+
+const struct nla_policy ethnl_ulp_ddp_get_policy[] = {
+	[ETHTOOL_A_ULP_DDP_HEADER] =
+		NLA_POLICY_NESTED(ethnl_header_policy_stats),
+};
+
+/* When requested (ETHTOOL_FLAG_STATS) ULP DDP stats are appended to
+ * the response.
+ *
+ * Similar to bitsets, stats can be in a compact or verbose form.
+ *
+ * The verbose form is as follow:
+ *
+ * STATS (nest)
+ *   COUNT (u32)
+ *   MAP (nest)
+ *     ITEM (nest)
+ *       NAME (strz)
+ *       VAL (u64)
+ *     ...
+ *
+ * The compact form is as follow:
+ *
+ * STATS (nest)
+ *   COUNT (u32)
+ *   COMPACT_VALUES (array of u64)
+ *
+ */
+static int ulp_ddp_stats64_size(const struct ethnl_req_info *req_base,
+				const struct ethnl_reply_data *reply_base,
+				ethnl_string_array_t names,
+				unsigned int count,
+				bool compact)
+{
+	unsigned int len = 0;
+	unsigned int i;
+
+	/* count */
+	len += nla_total_size(sizeof(u32));
+
+	if (compact) {
+		/* values */
+		len += nla_total_size(count * sizeof(u64));
+	} else {
+		unsigned int maplen = 0;
+
+		for (i = 0; i < count; i++) {
+			unsigned int itemlen = 0;
+
+			/* name */
+			itemlen += ethnl_strz_size(names[i]);
+			/* value */
+			itemlen += nla_total_size(sizeof(u64));
+
+			/* item nest */
+			maplen += nla_total_size(itemlen);
+		}
+
+		/* map nest */
+		len += nla_total_size(maplen);
+	}
+	/* outermost nest */
+	return nla_total_size(len);
+}
+
+static int ulp_ddp_put_stats64(struct sk_buff *skb, int attrtype, const u64 *val,
+			       unsigned int count, ethnl_string_array_t names, bool compact)
+{
+	struct nlattr *nest;
+	struct nlattr *attr;
+
+	nest = nla_nest_start(skb, attrtype);
+	if (!nest)
+		return -EMSGSIZE;
+
+	if (nla_put_u32(skb, ETHTOOL_A_ULP_DDP_STATS_COUNT, count))
+		goto nla_put_failure;
+	if (compact) {
+		unsigned int nbytes = count * sizeof(*val);
+		u64 *dst;
+
+		attr = nla_reserve(skb, ETHTOOL_A_ULP_DDP_STATS_COMPACT_VALUES, nbytes);
+		if (!attr)
+			goto nla_put_failure;
+		dst = nla_data(attr);
+		memcpy(dst, val, nbytes);
+	} else {
+		struct nlattr *map;
+		unsigned int i;
+
+		map = nla_nest_start(skb, ETHTOOL_A_ULP_DDP_STATS_MAP);
+		if (!map)
+			goto nla_put_failure;
+		for (i = 0; i < count; i++) {
+			attr = nla_nest_start(skb, ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM);
+			if (!attr)
+				goto nla_put_failure;
+			if (ethnl_put_strz(skb, ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_NAME,
+					   names[i]))
+				goto nla_put_failure;
+			if (nla_put_u64_64bit(skb, ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_VAL,
+					      val[i], -1))
+				goto nla_put_failure;
+			nla_nest_end(skb, attr);
+		}
+		nla_nest_end(skb, map);
+	}
+	nla_nest_end(skb, nest);
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
+}
+
+static int ulp_ddp_prepare_data(const struct ethnl_req_info *req_base,
+				struct ethnl_reply_data *reply_base,
+				struct genl_info *info)
+{
+	struct ulp_ddp_reply_data *data = ULP_DDP_REPDATA(reply_base);
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	const struct ethtool_ops *ops = reply_base->dev->ethtool_ops;
+	struct net_device *dev = reply_base->dev;
+	struct ulp_ddp_netdev_caps *caps;
+	int nstats;
+
+	caps = netdev_ulp_ddp_caps(dev);
+	if (!caps)
+		return -EOPNOTSUPP;
+
+	bitmap_copy(data->hw, caps->hw, ULP_DDP_C_COUNT);
+	bitmap_copy(data->active, caps->active, ULP_DDP_C_COUNT);
+
+	if (req_base->flags & ETHTOOL_FLAG_STATS) {
+		if (!ops->get_sset_count || !ops->get_strings || !ops->get_ulp_ddp_stats)
+			return -EOPNOTSUPP;
+
+		nstats = ops->get_sset_count(dev, ETH_SS_ULP_DDP_STATS);
+		if (nstats < 0)
+			return nstats;
+
+		data->stats = kcalloc(nstats, sizeof(u64), GFP_KERNEL);
+		if (!data->stats)
+			return -ENOMEM;
+
+		if (!compact) {
+			data->stat_names = kcalloc(nstats, ETH_GSTRING_LEN, GFP_KERNEL);
+			if (!data->stat_names) {
+				kfree(data->stats);
+				return -ENOMEM;
+			}
+			ops->get_strings(dev, ETH_SS_ULP_DDP_STATS,
+					 (u8 *)data->stat_names);
+		}
+		data->stat_count = nstats;
+		ops->get_ulp_ddp_stats(dev, data->stats);
+	}
+	return 0;
+}
+
+static int ulp_ddp_reply_size(const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	const struct ulp_ddp_reply_data *data = ULP_DDP_REPDATA(reply_base);
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	unsigned int len = 0;
+	int ret;
+
+	ret = ethnl_bitset_size(data->hw, NULL, ULP_DDP_C_COUNT,
+				ulp_ddp_names, compact);
+	if (ret < 0)
+		return ret;
+	len += ret;
+	ret = ethnl_bitset_size(data->active, NULL, ULP_DDP_C_COUNT,
+				ulp_ddp_names, compact);
+	if (ret < 0)
+		return ret;
+	len += ret;
+
+	if ((req_base->flags & ETHTOOL_FLAG_STATS) && data->stats) {
+		ret = ulp_ddp_stats64_size(req_base, reply_base,
+					   data->stat_names, data->stat_count, compact);
+		if (ret < 0)
+			return ret;
+		len += ret;
+	}
+	return len;
+}
+
+static int ulp_ddp_fill_reply(struct sk_buff *skb,
+			      const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	const struct ulp_ddp_reply_data *data = ULP_DDP_REPDATA(reply_base);
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	int ret;
+
+	ret = ethnl_put_bitset(skb, ETHTOOL_A_ULP_DDP_HW, data->hw,
+			       NULL, ULP_DDP_C_COUNT,
+			       ulp_ddp_names, compact);
+	if (ret < 0)
+		return ret;
+
+	ret = ethnl_put_bitset(skb, ETHTOOL_A_ULP_DDP_ACTIVE, data->active,
+			       NULL, ULP_DDP_C_COUNT,
+			       ulp_ddp_names, compact);
+	if (ret < 0)
+		return ret;
+
+	if ((req_base->flags & ETHTOOL_FLAG_STATS) && data->stats) {
+		ret = ulp_ddp_put_stats64(skb, ETHTOOL_A_ULP_DDP_STATS,
+					  data->stats,
+					  data->stat_count,
+					  data->stat_names,
+					  compact);
+		if (ret < 0)
+			return ret;
+	}
+	return ret;
+}
+
+static void ulp_ddp_cleanup_data(struct ethnl_reply_data *reply_data)
+{
+	struct ulp_ddp_reply_data *data = ULP_DDP_REPDATA(reply_data);
+
+	kfree(data->stat_names);
+	kfree(data->stats);
+}
+
+const struct ethnl_request_ops ethnl_ulp_ddp_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_ULP_DDP_GET,
+	.reply_cmd		= ETHTOOL_MSG_ULP_DDP_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_ULP_DDP_HEADER,
+	.req_info_size		= sizeof(struct ulp_ddp_req_info),
+	.reply_data_size	= sizeof(struct ulp_ddp_reply_data),
+
+	.prepare_data		= ulp_ddp_prepare_data,
+	.reply_size		= ulp_ddp_reply_size,
+	.fill_reply		= ulp_ddp_fill_reply,
+	.cleanup_data		= ulp_ddp_cleanup_data,
+};
+
+/* ULP_DDP_SET */
+
+const struct nla_policy ethnl_ulp_ddp_set_policy[] = {
+	[ETHTOOL_A_ULP_DDP_HEADER] =
+		NLA_POLICY_NESTED(ethnl_header_policy),
+	[ETHTOOL_A_ULP_DDP_WANTED] = { .type = NLA_NESTED },
+};
+
+static int ulp_ddp_send_reply(struct net_device *dev, struct genl_info *info,
+			      const unsigned long *wanted,
+			      const unsigned long *wanted_mask,
+			      const unsigned long *active,
+			      const unsigned long *active_mask, bool compact)
+{
+	struct sk_buff *rskb;
+	void *reply_payload;
+	int reply_len = 0;
+	int ret;
+
+	reply_len = ethnl_reply_header_size();
+	ret = ethnl_bitset_size(wanted, wanted_mask, ULP_DDP_C_COUNT,
+				ulp_ddp_names, compact);
+	if (ret < 0)
+		goto err;
+	reply_len += ret;
+	ret = ethnl_bitset_size(active, active_mask, ULP_DDP_C_COUNT,
+				ulp_ddp_names, compact);
+	if (ret < 0)
+		goto err;
+	reply_len += ret;
+
+	ret = -ENOMEM;
+	rskb = ethnl_reply_init(reply_len, dev, ETHTOOL_MSG_ULP_DDP_SET_REPLY,
+				ETHTOOL_A_ULP_DDP_HEADER, info,
+				&reply_payload);
+	if (!rskb)
+		goto err;
+
+	ret = ethnl_put_bitset(rskb, ETHTOOL_A_ULP_DDP_WANTED, wanted,
+			       wanted_mask, ULP_DDP_C_COUNT,
+			       ulp_ddp_names, compact);
+	if (ret < 0)
+		goto nla_put_failure;
+	ret = ethnl_put_bitset(rskb, ETHTOOL_A_ULP_DDP_ACTIVE, active,
+			       active_mask, ULP_DDP_C_COUNT,
+			       ulp_ddp_names, compact);
+	if (ret < 0)
+		goto nla_put_failure;
+
+	genlmsg_end(rskb, reply_payload);
+	ret = genlmsg_reply(rskb, info);
+	return ret;
+
+nla_put_failure:
+	nlmsg_free(rskb);
+	WARN_ONCE(1, "calculated message payload length (%d) not sufficient\n",
+		  reply_len);
+err:
+	GENL_SET_ERR_MSG(info, "failed to send reply message");
+	return ret;
+}
+
+int ethnl_set_ulp_ddp(struct sk_buff *skb, struct genl_info *info)
+{
+	DECLARE_BITMAP(old_active, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(new_active, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(req_wanted, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(req_mask, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(all_bits, ULP_DDP_C_COUNT);
+	DECLARE_BITMAP(tmp, ULP_DDP_C_COUNT);
+	struct ethnl_req_info req_info = {};
+	struct nlattr **tb = info->attrs;
+	struct ulp_ddp_netdev_caps *caps;
+	struct net_device *dev;
+	int ret;
+
+	if (!tb[ETHTOOL_A_ULP_DDP_WANTED])
+		return -EINVAL;
+	ret = ethnl_parse_header_dev_get(&req_info,
+					 tb[ETHTOOL_A_ULP_DDP_HEADER],
+					 genl_info_net(info), info->extack,
+					 true);
+	if (ret < 0)
+		return ret;
+
+	dev = req_info.dev;
+	rtnl_lock();
+	caps = netdev_ulp_ddp_caps(dev);
+	if (!caps) {
+		ret = -EOPNOTSUPP;
+		goto out_rtnl;
+	}
+
+	ret = ethnl_parse_bitset(req_wanted, req_mask, ULP_DDP_C_COUNT,
+				 tb[ETHTOOL_A_ULP_DDP_WANTED],
+				 ulp_ddp_names, info->extack);
+	if (ret < 0)
+		goto out_rtnl;
+
+	/* if (req_mask & ~all_bits) */
+	bitmap_fill(all_bits, ULP_DDP_C_COUNT);
+	bitmap_andnot(tmp, req_mask, all_bits, ULP_DDP_C_COUNT);
+	if (!bitmap_empty(tmp, ULP_DDP_C_COUNT)) {
+		ret = -EINVAL;
+		goto out_rtnl;
+	}
+
+	/* new_active = (old_active & ~req_mask) | (wanted & req_mask)
+	 * new_active &= caps_hw
+	 */
+	bitmap_copy(old_active, caps->active, ULP_DDP_C_COUNT);
+	bitmap_and(req_wanted, req_wanted, req_mask, ULP_DDP_C_COUNT);
+	bitmap_andnot(new_active, old_active, req_mask, ULP_DDP_C_COUNT);
+	bitmap_or(new_active, new_active, req_wanted, ULP_DDP_C_COUNT);
+	bitmap_and(new_active, new_active, caps->hw, ULP_DDP_C_COUNT);
+	if (!bitmap_equal(old_active, new_active, ULP_DDP_C_COUNT)) {
+		ret = dev->ethtool_ops->set_ulp_ddp_capabilities(dev, new_active);
+		if (ret)
+			netdev_err(dev, "set_ulp_ddp_capabilities() returned error %d\n", ret);
+		bitmap_copy(new_active, caps->active, ULP_DDP_C_COUNT);
+	}
+
+	ret = 0;
+	if (!(req_info.flags & ETHTOOL_FLAG_OMIT_REPLY)) {
+		DECLARE_BITMAP(wanted_diff_mask, ULP_DDP_C_COUNT);
+		DECLARE_BITMAP(active_diff_mask, ULP_DDP_C_COUNT);
+		bool compact = req_info.flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+
+		/* wanted_diff_mask = req_wanted ^ new_active
+		 * active_diff_mask = old_active ^ new_active -> mask of bits that have changed
+		 * wanted_diff_mask &= req_mask -> mask of bits that have diff value than wanted
+		 * req_wanted &= wanted_diff_mask -> bits that have diff value than wanted
+		 * new_active &= active_diff_mask -> bits that have changed
+		 */
+		bitmap_xor(wanted_diff_mask, req_wanted, new_active, ULP_DDP_C_COUNT);
+		bitmap_xor(active_diff_mask, old_active, new_active, ULP_DDP_C_COUNT);
+		bitmap_and(wanted_diff_mask, wanted_diff_mask, req_mask, ULP_DDP_C_COUNT);
+		bitmap_and(req_wanted, req_wanted, wanted_diff_mask, ULP_DDP_C_COUNT);
+		bitmap_and(new_active, new_active, active_diff_mask, ULP_DDP_C_COUNT);
+		ret = ulp_ddp_send_reply(dev, info,
+					 req_wanted, wanted_diff_mask,
+					 new_active, active_diff_mask,
+					 compact);
+	}
+
+out_rtnl:
+	rtnl_unlock();
+	ethnl_parse_header_dev_put(&req_info);
+	return ret;
+}
From patchwork Mon Jan 9 13:30:55 2023
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 04/25] Documentation: document netlink ULP_DDP_GET/SET messages
Date: Mon, 9 Jan 2023 15:30:55 +0200
Message-Id: <20230109133116.20801-5-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
Add detailed documentation about:

- ETHTOOL_MSG_ULP_DDP_GET and ETHTOOL_MSG_ULP_DDP_SET netlink messages
- ETH_SS_ULP_DDP and ETH_SS_ULP_DDP_STATS stringsets

ETHTOOL_MSG_ULP_DDP_GET/SET messages are used to configure ULP DDP
capabilities and retrieve ULP DDP statistics. Both statistics and
capability names can be retrieved dynamically from the kernel via
string sets (no need to hardcode them and keep them in sync in
ethtool).

Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 Documentation/networking/ethtool-netlink.rst | 106 +++++++++++++++++++
 Documentation/networking/statistics.rst      |   1 +
 2 files changed, 107 insertions(+)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index f10f8eb44255..146c6b474913 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -223,6 +223,8 @@ Userspace to kernel:
   ``ETHTOOL_MSG_PSE_SET``               set PSE parameters
   ``ETHTOOL_MSG_PSE_GET``               get PSE parameters
   ``ETHTOOL_MSG_RSS_GET``               get RSS settings
+  ``ETHTOOL_MSG_ULP_DDP_GET``           get ULP DDP capabilities and stats
+  ``ETHTOOL_MSG_ULP_DDP_SET``           set ULP DDP capabilities
   ===================================== =================================

 Kernel to userspace:

@@ -265,6 +267,8 @@ Kernel to userspace:
  ``ETHTOOL_MSG_MODULE_GET_REPLY``         transceiver module parameters
  ``ETHTOOL_MSG_PSE_GET_REPLY``            PSE parameters
  ``ETHTOOL_MSG_RSS_GET_REPLY``            RSS settings
+ ``ETHTOOL_MSG_ULP_DDP_GET_REPLY``        ULP DDP capabilities and stats
+ ``ETHTOOL_MSG_ULP_DDP_SET_REPLY``        optional reply to ULP_DDP_SET
 ======================================== =================================

``GET`` requests are sent by userspace applications to retrieve device
@@ -1716,6 +1720,108 @@ being used. Current supported options are toeplitz, xor or crc32.
 ETHTOOL_A_RSS_INDIR attribute returns RSS indirection table where each
 byte indicates queue number.

+ULP_DDP_GET
+===========
+
+Get ULP DDP capabilities for the interface and optional driver-defined stats.
+
+Request contents:
+
+  ==================================== ======  ==========================
+  ``ETHTOOL_A_ULP_DDP_HEADER``         nested  request header
+  ==================================== ======  ==========================
+
+Kernel response contents:
+
+  ==================================== ======  ==========================
+  ``ETHTOOL_A_ULP_DDP_HEADER``         nested  reply header
+  ``ETHTOOL_A_ULP_DDP_HW``             bitset  dev->ulp_ddp_caps.hw
+  ``ETHTOOL_A_ULP_DDP_ACTIVE``         bitset  dev->ulp_ddp_caps.active
+  ``ETHTOOL_A_ULP_DDP_STATS``          nested  ULP DDP statistics
+  ==================================== ======  ==========================
+
+* If ``ETHTOOL_FLAG_COMPACT_BITSETS`` was set in
+  ``ETHTOOL_A_HEADER_FLAGS``, the bitsets of the reply are in compact
+  form. In that form, the names of the individual bits can be retrieved
+  via the ``ETH_SS_ULP_DDP`` string set.
+* ``ETHTOOL_A_ULP_DDP_STATS`` contains driver-defined statistics which
+  are only reported if ``ETHTOOL_FLAG_STATS`` was set in
+  ``ETHTOOL_A_HEADER_FLAGS``.
+
+Similar to the bitsets, statistics can be reported in a verbose or
+compact form. This is controlled by the same header flag
+(``ETHTOOL_FLAG_COMPACT_BITSETS``).
+
+Verbose statistics contents:
+
+ +-----------------------------------------------+--------+---------------------------------+
+ | ``ETHTOOL_A_ULP_DDP_STATS_COUNT``             | u32    | number of statistics            |
+ +-----------------------------------------------+--------+---------------------------------+
+ | ``ETHTOOL_A_ULP_DDP_STATS_MAP``               | nested | nest containing a list of stats |
+ +-+---------------------------------------------+--------+---------------------------------+
+ | | ``ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM``        | nested | nest containing one statistic   |
+ +-+-+-------------------------------------------+--------+---------------------------------+
+ | | | ``ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_NAME`` | string | statistic name                  |
+ +-+-+-------------------------------------------+--------+---------------------------------+
+ | | | ``ETHTOOL_A_ULP_DDP_STATS_MAP_ITEM_VAL``  | u64    | statistic value                 |
+ +-+-+-------------------------------------------+--------+---------------------------------+
+
+Compact statistics content:
+
+ +-----------------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_ULP_DDP_STATS_COUNT``             | u32    | number of statistics  |
+ +-----------------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_ULP_DDP_STATS_COMPACT_VALUES``    | u64[]  | stats values          |
+ +-----------------------------------------------+--------+-----------------------+
+
+In compact form, ``ETHTOOL_A_ULP_DDP_STATS_COMPACT_VALUES`` contains
+an array of *count* unsigned 64-bit integers, as a binary blob.
+
+The name of each statistic is per-device rather than global. The names
+can be retrieved via the ``ETH_SS_ULP_DDP_STATS`` string set.
+
+ULP_DDP_SET
+===========
+
+Request to set ULP DDP capabilities for the interface.
+
+Request contents:
+
+  ==================================== ======  ==========================
+  ``ETHTOOL_A_ULP_DDP_HEADER``         nested  request header
+  ``ETHTOOL_A_ULP_DDP_WANTED``         bitset  requested capabilities
+  ==================================== ======  ==========================
+
+Kernel response contents:
+
+  ==================================== ======  ==========================
+  ``ETHTOOL_A_ULP_DDP_HEADER``         nested  reply header
+  ``ETHTOOL_A_ULP_DDP_WANTED``         bitset  diff wanted vs. result
+  ``ETHTOOL_A_ULP_DDP_ACTIVE``         bitset  diff old vs. new active
+  ==================================== ======  ==========================
+
+The request contains only one bitset, which can be either a value/mask
+pair (a request to change specific capabilities and leave the rest
+alone) or only a value (a request to set the complete capability set
+to exactly the value provided).
+
+Requests are subject to sanity checks by drivers, so an optional
+kernel reply (which can be suppressed with the
+``ETHTOOL_FLAG_OMIT_REPLY`` flag in the request header) informs the
+client about the actual results:
+
+* ``ETHTOOL_A_ULP_DDP_WANTED`` reports the difference between the
+  client request and the actual result: the mask consists of the bits
+  which differ between the requested capability and the result
+  (dev->ulp_ddp_caps.active after the operation), the value consists
+  of the values of these bits in the request (i.e. negated values
+  from the resulting capabilities).
+* ``ETHTOOL_A_ULP_DDP_ACTIVE`` reports the difference between the old
+  and new dev->ulp_ddp_caps.active: the mask consists of the bits
+  which have changed, the values are their values in the new
+  dev->ulp_ddp_caps.active (after the operation).
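The two diff bitsets reported by ``ULP_DDP_SET`` can be sketched in plain C. This is an illustrative userspace model (not kernel code): single ``unsigned long`` words stand in for the kernel's ``bitmap_*()`` helpers, and the struct and function names are local to the example.

```c
#include <assert.h>

/* Hypothetical container for the two reply bitsets. */
struct ulp_ddp_reply {
	unsigned long wanted;       /* ETHTOOL_A_ULP_DDP_WANTED value */
	unsigned long wanted_mask;  /* ETHTOOL_A_ULP_DDP_WANTED mask  */
	unsigned long active;       /* ETHTOOL_A_ULP_DDP_ACTIVE value */
	unsigned long active_mask;  /* ETHTOOL_A_ULP_DDP_ACTIVE mask  */
};

static struct ulp_ddp_reply
compute_reply(unsigned long req_wanted, unsigned long req_mask,
	      unsigned long old_active, unsigned long new_active)
{
	struct ulp_ddp_reply r;

	/* bits that ended up with a different value than requested */
	r.wanted_mask = (req_wanted ^ new_active) & req_mask;
	/* requested values of those bits */
	r.wanted = req_wanted & r.wanted_mask;
	/* bits that changed between the old and new active state */
	r.active_mask = old_active ^ new_active;
	/* their values in the new active state */
	r.active = new_active & r.active_mask;
	return r;
}
```

For example, if a client requests bit 0 but the driver refuses it (old and new active both 0), the WANTED diff flags bit 0 with its requested value, while the ACTIVE diff is empty.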
+
+
 Request translation
 ===================

diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst
index c9aeb70dafa2..518bf0cbeffc 100644
--- a/Documentation/networking/statistics.rst
+++ b/Documentation/networking/statistics.rst
@@ -171,6 +171,7 @@ statistics are supported in the following commands:

 - `ETHTOOL_MSG_PAUSE_GET`
 - `ETHTOOL_MSG_FEC_GET`
+- `ETHTOOL_MSG_ULP_DDP_GET`

 debugfs
 -------

From patchwork Mon Jan 9 13:30:56 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093569
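The compact statistics encoding documented above is just *count* u64 values packed back to back in one attribute payload. A minimal parsing sketch (hypothetical userspace helper, not an ethtool API; ``memcpy()`` is used to avoid alignment assumptions about netlink attribute data):

```c
#include <stdint.h>
#include <string.h>

/*
 * Unpack a compact-form ULP DDP stats blob: "count" little-endian-
 * native u64 values stored contiguously in "payload". The caller
 * matches each value to a name from the ETH_SS_ULP_DDP_STATS string
 * set by index.
 */
static void parse_compact_stats(const void *payload, uint32_t count,
				uint64_t *out)
{
	const unsigned char *p = payload;

	for (uint32_t i = 0; i < count; i++)
		memcpy(&out[i], p + i * sizeof(uint64_t), sizeof(uint64_t));
}
```

The i-th value pairs with the i-th name in the ``ETH_SS_ULP_DDP_STATS`` string set for the same device.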
From: Aurelien Aptel
To:
linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org, Alexander Viro
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 05/25] iov_iter: skip copy if src == dst for direct data placement
Date: Mon, 9 Jan 2023 15:30:56 +0200
Message-Id: <20230109133116.20801-6-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Ben Ben-Ishay

When using direct data placement (DDP) the NIC can write the payload
directly into the destination buffer and construct SKBs such that
they point to this data. To skip copies when the SKB data already
resides in the destination buffer, we check if (src == dst) and skip
the copy when it is true.

Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Chaitanya Kulkarni
---
 lib/iov_iter.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f9a3ff37ecd1..2df634bb6d27 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -526,9 +526,15 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 		return copy_pipe_to_iter(addr, bytes, i);
 	if (user_backed_iter(i))
 		might_fault();
+	/*
+	 * When using direct data placement (DDP) the hardware writes
+	 * data directly to the destination buffer, and constructs
+	 * IOVs such that they point to this data.
+	 * Thus, when src == dst we skip the memcpy.
+	 */
 	iterate_and_advance(i, bytes, base, len, off,
 		copyout(base, addr + off, len),
-		memcpy(base, addr + off, len)
+		(base != addr + off) && memcpy(base, addr + off, len)
 	)
 	return bytes;

From patchwork Mon Jan 9 13:30:57 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093570
X-Patchwork-Delegate: kuba@kernel.org
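The conditional-memcpy pattern in the iov_iter patch above can be shown in a standalone sketch. This is illustrative userspace code under local names (``copy_chunk`` is not a kernel function): since ``memcpy()`` returns its first argument (always non-NULL), ``&&`` short-circuits the call away when source and destination are the same buffer, exactly the case the NIC has already handled by DDP.

```c
#include <string.h>

/*
 * Copy len bytes from addr+off to base, unless base already points at
 * that data (src == dst), in which case the copy is skipped. Skipping
 * also avoids calling memcpy() with identical pointers, which the C
 * standard does not sanction for overlapping objects.
 */
static size_t copy_chunk(void *base, const void *addr, size_t off, size_t len)
{
	const void *src = (const char *)addr + off;

	/* memcpy() returns base (non-NULL), so && evaluates it only when needed */
	(base != src) && memcpy(base, src, len);
	return len;
}
```

When ``base != src`` the data is copied as usual; when they alias, the buffer is left untouched and no copy runs.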
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org,
netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 06/25] net/tls,core: export get_netdev_for_sock
Date: Mon, 9 Jan 2023 15:30:57 +0200
Message-Id: <20230109133116.20801-7-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
* remove netdev_sk_get_lowest_dev() from net/core
* move get_netdev_for_sock() from net/tls to net/core

get_netdev_for_sock() is a utility that is used to obtain the
net_device structure from a connected socket. Later patches will use
this for nvme-tcp DDP and DDP DDGST offloads.
Suggested-by: Christoph Hellwig
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 include/linux/netdevice.h |  3 +--
 net/core/dev.c            | 26 +++++++++++++-------------
 net/tls/tls_device.c      | 16 ----------------
 3 files changed, 14 insertions(+), 31 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index bd270c4bbf97..ba3806a1a11b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3079,8 +3079,7 @@ int init_dummy_netdev(struct net_device *dev);
 struct net_device *netdev_get_xmit_slave(struct net_device *dev,
 					 struct sk_buff *skb,
 					 bool all_slaves);
-struct net_device *netdev_sk_get_lowest_dev(struct net_device *dev,
-					    struct sock *sk);
+struct net_device *get_netdev_for_sock(struct sock *sk);
 struct net_device *dev_get_by_index(struct net *net, int ifindex);
 struct net_device *__dev_get_by_index(struct net *net, int ifindex);
 struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
diff --git a/net/core/dev.c b/net/core/dev.c
index cf78f35bc0b9..ea80f77ba003 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8152,27 +8152,27 @@ static struct net_device *netdev_sk_get_lower_dev(struct net_device *dev,
 }

 /**
- * netdev_sk_get_lowest_dev - Get the lowest device in chain given device and socket
- * @dev: device
+ * get_netdev_for_sock - Get the lowest device in socket
  * @sk: the socket
  *
- * %NULL is returned if no lower device is found.
+ * Assumes that the socket is already connected.
+ * Returns the lower device or %NULL if no lower device is found.
  */
-
-struct net_device *netdev_sk_get_lowest_dev(struct net_device *dev,
-					    struct sock *sk)
+struct net_device *get_netdev_for_sock(struct sock *sk)
 {
-	struct net_device *lower;
+	struct dst_entry *dst = sk_dst_get(sk);
+	struct net_device *dev, *lower;

-	lower = netdev_sk_get_lower_dev(dev, sk);
-	while (lower) {
+	if (unlikely(!dst))
+		return NULL;
+	dev = dst->dev;
+	while ((lower = netdev_sk_get_lower_dev(dev, sk)))
 		dev = lower;
-		lower = netdev_sk_get_lower_dev(dev, sk);
-	}
-
+	dev_hold(dev);
+	dst_release(dst);
 	return dev;
 }
-EXPORT_SYMBOL(netdev_sk_get_lowest_dev);
+EXPORT_SYMBOL_GPL(get_netdev_for_sock);

 static void netdev_adjacent_add_links(struct net_device *dev)
 {
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 6c593788dc25..3c298dfb77cb 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -120,22 +120,6 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
 	tls_device_free_ctx(ctx);
 }

-/* We assume that the socket is already connected */
-static struct net_device *get_netdev_for_sock(struct sock *sk)
-{
-	struct dst_entry *dst = sk_dst_get(sk);
-	struct net_device *netdev = NULL;
-
-	if (likely(dst)) {
-		netdev = netdev_sk_get_lowest_dev(dst->dev, sk);
-		dev_hold(netdev);
-	}
-
-	dst_release(dst);
-
-	return netdev;
-}
-
 static void destroy_record(struct tls_record_info *record)
 {
 	int i;

From patchwork Mon Jan 9 13:30:58 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093571
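The core of the get_netdev_for_sock() patch above is the lowest-device walk: start from the device attached to the socket's route and keep descending through lower devices (e.g. a bond or VLAN stack) until none is found. A minimal standalone model of that loop, with ``struct dev`` and ``lower_of()`` as stand-ins for ``struct net_device`` and ``netdev_sk_get_lower_dev()``:

```c
#include <stddef.h>

/* Toy device with a single "lower device" link, or NULL at the bottom. */
struct dev {
	const char *name;
	struct dev *lower;
};

static struct dev *lower_of(struct dev *d)
{
	return d->lower;
}

/*
 * Same loop shape as in the patch: descend while a lower device
 * exists, then return the bottom of the stack. The real code also
 * takes a reference (dev_hold) before returning.
 */
static struct dev *lowest_dev(struct dev *d)
{
	struct dev *lower;

	while ((lower = lower_of(d)))
		d = lower;
	return d;
}
```

For a stack bond0 -> eth0.100 -> eth0, the walk ends at eth0; a device with no lower links is returned unchanged.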
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Subject: [PATCH v8 07/25] nvme-tcp: Add DDP offload control path
Date: Mon, 9 Jan 2023 15:30:58 +0200
Message-Id: <20230109133116.20801-8-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

From: Boris Pismenny

This commit introduces direct data placement offload to NVMe-TCP. There
is a context per queue, which is established after the handshake using
the ulp_ddp_sk_add/del NDOs.

Additionally, a resynchronization routine is used to assist hardware
recovery from TCP OOO, and continue the offload. Resynchronization
operates as follows:

1. TCP OOO causes the NIC HW to stop the offload.

2. NIC HW identifies a PDU header at some TCP sequence number, and asks
   NVMe-TCP to confirm it. This request is delivered from the NIC driver
   to NVMe-TCP by first finding the socket for the packet that triggered
   the request, and then finding the nvme_tcp_queue that is used by this
   routine. Finally, the request is recorded in the nvme_tcp_queue.

3. When NVMe-TCP observes the requested TCP sequence, it compares it
   with the PDU header TCP sequence, and reports the result to the NIC
   driver (ulp_ddp_resync), which updates the HW and resumes the offload
   when all is successful.

Some HW implementations such as ConnectX-7 assume linear CCIDs (0...N-1
for a queue of size N), whereas the Linux NVMe driver uses part of the
16-bit CCID for a generation counter. To address that, we use the
existing quirk in the nvme layer when the HW driver advertises that the
device does not support the full 16-bit CCID range.

Furthermore, we let the offloading driver advertise the max HW
sectors/segments via ulp_ddp_limits.

A follow-up patch introduces the data-path changes required for this
offload.

Socket operations need a netdev reference. This reference is dropped on
NETDEV_GOING_DOWN events to allow the device to go down in a follow-up
patch.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Chaitanya Kulkarni
---
 drivers/nvme/host/tcp.c | 252 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 243 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8cedc1ef496c..3c35290d630f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -15,6 +15,10 @@
 #include
 #include
 
+#ifdef CONFIG_ULP_DDP
+#include
+#endif
+
 #include "nvme.h"
 #include "fabrics.h"
 
@@ -103,6 +107,7 @@ enum nvme_tcp_queue_flags {
 	NVME_TCP_Q_ALLOCATED	= 0,
 	NVME_TCP_Q_LIVE		= 1,
 	NVME_TCP_Q_POLLING	= 2,
+	NVME_TCP_Q_OFF_DDP	= 3,
 };
 
 enum nvme_tcp_recv_state {
@@ -130,6 +135,16 @@ struct nvme_tcp_queue {
 	size_t			ddgst_remaining;
 	unsigned int		nr_cqe;
 
+	/*
+	 * resync_req is a speculative PDU header tcp seq number (with
+	 * an additional flag at 32 lower bits) that the HW send to
+	 * the SW, for the SW to verify.
+	 * - The 32 high bits store the seq number
+	 * - The 32 low bits are used as a flag to know if a request
+	 *   is pending (ULP_DDP_RESYNC_PENDING).
+	 */
+	atomic64_t		resync_req;
+
 	/* send state */
 	struct nvme_tcp_request *request;
 
@@ -169,6 +184,9 @@ struct nvme_tcp_ctrl {
 	struct delayed_work	connect_work;
 	struct nvme_tcp_request async_req;
 	u32			io_queues[HCTX_MAX_TYPES];
+
+	struct net_device	*offloading_netdev;
+	u32			offload_io_threshold;
 };
 
 static LIST_HEAD(nvme_tcp_ctrl_list);
@@ -260,6 +278,190 @@ static inline size_t nvme_tcp_pdu_last_send(struct nvme_tcp_request *req,
 	return nvme_tcp_pdu_data_left(req) <= len;
 }
 
+#ifdef CONFIG_ULP_DDP
+
+static inline bool is_netdev_ulp_offload_active(struct net_device *netdev)
+{
+	return test_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.active);
+}
+
+static bool nvme_tcp_ddp_query_limits(struct net_device *netdev,
+				      struct ulp_ddp_limits *limits)
+{
+	int ret;
+
+	if (!netdev || !is_netdev_ulp_offload_active(netdev) ||
+	    !netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_limits)
+		return false;
+
+	limits->type = ULP_DDP_NVME;
+	ret = netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_limits(netdev, limits);
+	if (ret == -EOPNOTSUPP) {
+		return false;
+	} else if (ret) {
+		WARN_ONCE(ret, "ddp limits failed (ret=%d)", ret);
+		return false;
+	}
+
+	return true;
+}
+
+static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags);
+static const struct ulp_ddp_ulp_ops nvme_tcp_ddp_ulp_ops = {
+	.resync_request		= nvme_tcp_resync_request,
+};
+
+static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
+{
+	struct net_device *netdev = queue->ctrl->offloading_netdev;
+	struct ulp_ddp_config config = {.type = ULP_DDP_NVME};
+	int ret;
+
+	config.nvmeotcp.pfv = NVME_TCP_PFV_1_0;
+	config.nvmeotcp.cpda = 0;
+	config.nvmeotcp.dgst =
+		queue->hdr_digest ? NVME_TCP_HDR_DIGEST_ENABLE : 0;
+	config.nvmeotcp.dgst |=
+		queue->data_digest ? NVME_TCP_DATA_DIGEST_ENABLE : 0;
+	config.nvmeotcp.queue_size = queue->ctrl->ctrl.sqsize + 1;
+	config.nvmeotcp.queue_id = nvme_tcp_queue_id(queue);
+	config.nvmeotcp.io_cpu = queue->io_cpu;
+
+	/* Socket ops keep a netdev reference. It is put in
+	 * nvme_tcp_unoffload_socket(). This ref is dropped on
+	 * NETDEV_GOING_DOWN events to allow the device to go down
+	 */
+	dev_hold(netdev);
+	ret = netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_sk_add(netdev,
+							      queue->sock->sk,
+							      &config);
+	if (ret) {
+		dev_put(netdev);
+		return ret;
+	}
+
+	inet_csk(queue->sock->sk)->icsk_ulp_ddp_ops = &nvme_tcp_ddp_ulp_ops;
+	set_bit(NVME_TCP_Q_OFF_DDP, &queue->flags);
+	return 0;
+}
+
+static void nvme_tcp_unoffload_socket(struct nvme_tcp_queue *queue)
+{
+	struct net_device *netdev = queue->ctrl->offloading_netdev;
+
+	if (!netdev) {
+		dev_info_ratelimited(queue->ctrl->ctrl.device, "netdev not found\n");
+		return;
+	}
+
+	clear_bit(NVME_TCP_Q_OFF_DDP, &queue->flags);
+
+	netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_sk_del(netdev, queue->sock->sk);
+
+	inet_csk(queue->sock->sk)->icsk_ulp_ddp_ops = NULL;
+	dev_put(netdev); /* held by offload_socket */
+}
+
+static void nvme_tcp_offload_limits(struct nvme_tcp_queue *queue,
+				    struct net_device *netdev)
+{
+	struct ulp_ddp_limits limits = {.type = ULP_DDP_NVME };
+
+	if (!nvme_tcp_ddp_query_limits(netdev, &limits)) {
+		queue->ctrl->offloading_netdev = NULL;
+		return;
+	}
+
+	queue->ctrl->offloading_netdev = netdev;
+	dev_dbg_ratelimited(queue->ctrl->ctrl.device,
+			    "netdev %s offload limits: max_ddp_sgl_len %d\n",
+			    netdev->name, limits.max_ddp_sgl_len);
+	queue->ctrl->ctrl.max_segments = limits.max_ddp_sgl_len;
+	queue->ctrl->ctrl.max_hw_sectors =
+		limits.max_ddp_sgl_len << (ilog2(SZ_4K) - 9);
+	queue->ctrl->offload_io_threshold = limits.io_threshold;
+
+	/* offloading HW doesn't support full ccid range, apply the quirk */
+	queue->ctrl->ctrl.quirks |=
+		limits.nvmeotcp.full_ccid_range ? 0 : NVME_QUIRK_SKIP_CID_GEN;
+}
+
+/* In presence of packet drops or network packet reordering, the device may lose
+ * synchronization between the TCP stream and the L5P framing, and require a
+ * resync with the kernel's TCP stack.
+ *
+ * - NIC HW identifies a PDU header at some TCP sequence number,
+ *   and asks NVMe-TCP to confirm it.
+ * - When NVMe-TCP observes the requested TCP sequence, it will compare
+ *   it with the PDU header TCP sequence, and report the result to the
+ *   NIC driver
+ */
+static void nvme_tcp_resync_response(struct nvme_tcp_queue *queue,
+				     struct sk_buff *skb, unsigned int offset)
+{
+	u64 pdu_seq = TCP_SKB_CB(skb)->seq + offset - queue->pdu_offset;
+	struct net_device *netdev = queue->ctrl->offloading_netdev;
+	u64 pdu_val = (pdu_seq << 32) | ULP_DDP_RESYNC_PENDING;
+	u64 resync_val;
+	u32 resync_seq;
+
+	resync_val = atomic64_read(&queue->resync_req);
+	/* Lower 32 bit flags. Check validity of the request */
+	if ((resync_val & ULP_DDP_RESYNC_PENDING) == 0)
+		return;
+
+	/* Obtain and check requested sequence number: is this PDU header before the request? */
+	resync_seq = resync_val >> 32;
+	if (before(pdu_seq, resync_seq))
+		return;
+
+	/*
+	 * The atomic operation guarantees that we don't miss any NIC driver
+	 * resync requests submitted after the above checks.
+	 */
+	if (atomic64_cmpxchg(&queue->resync_req, pdu_val,
+			     pdu_val & ~ULP_DDP_RESYNC_PENDING) !=
+	    atomic64_read(&queue->resync_req))
+		netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_resync(netdev,
+								queue->sock->sk,
+								pdu_seq);
+}
+
+static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags)
+{
+	struct nvme_tcp_queue *queue = sk->sk_user_data;
+
+	/*
+	 * "seq" (TCP seq number) is what the HW assumes is the
+	 * beginning of a PDU. The nvme-tcp layer needs to store the
+	 * number along with the "flags" (ULP_DDP_RESYNC_PENDING) to
+	 * indicate that a request is pending.
+	 */
+	atomic64_set(&queue->resync_req, (((uint64_t)seq << 32) | flags));
+
+	return true;
+}
+
+#else
+
+static inline bool is_netdev_ulp_offload_active(struct net_device *netdev)
+{
+	return false;
+}
+
+static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
+{
+	return 0;
+}
+
+static void nvme_tcp_unoffload_socket(struct nvme_tcp_queue *queue)
+{}
+
+static void nvme_tcp_offload_limits(struct nvme_tcp_queue *queue,
+				    struct net_device *netdev)
+{}
+
+static void nvme_tcp_resync_response(struct nvme_tcp_queue *queue,
+				     struct sk_buff *skb, unsigned int offset)
+{}
+
+#endif
+
 static void nvme_tcp_init_iter(struct nvme_tcp_request *req,
 		unsigned int dir)
 {
@@ -702,6 +904,9 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 	size_t rcv_len = min_t(size_t, *len, queue->pdu_remaining);
 	int ret;
 
+	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags))
+		nvme_tcp_resync_response(queue, skb, *offset);
+
 	ret = skb_copy_bits(skb, *offset,
 		&pdu[queue->pdu_offset], rcv_len);
 	if (unlikely(ret))
@@ -1657,6 +1862,8 @@ static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue)
 	kernel_sock_shutdown(queue->sock, SHUT_RDWR);
 	nvme_tcp_restore_sock_calls(queue);
 	cancel_work_sync(&queue->io_work);
+	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags))
+		nvme_tcp_unoffload_socket(queue);
 }
 
 static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid)
@@ -1676,21 +1883,48 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid)
 static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
+	struct net_device *netdev;
 	int ret;
 
-	if (idx)
+	if (idx) {
 		ret = nvmf_connect_io_queue(nctrl, idx);
-	else
+		if (ret)
+			goto err;
+
+		netdev = ctrl->queues[idx].ctrl->offloading_netdev;
+		if (netdev && is_netdev_ulp_offload_active(netdev)) {
+			ret = nvme_tcp_offload_socket(&ctrl->queues[idx]);
+			if (ret) {
+				dev_err(nctrl->device,
+					"failed to setup offload on queue %d ret=%d\n",
+					idx, ret);
+			}
+		}
+	} else {
 		ret = nvmf_connect_admin_queue(nctrl);
+		if (ret)
+			goto err;
 
-	if (!ret) {
-		set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags);
-	} else {
-		if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags))
-			__nvme_tcp_stop_queue(&ctrl->queues[idx]);
-		dev_err(nctrl->device,
-			"failed to connect queue: %d ret=%d\n", idx, ret);
+		netdev = get_netdev_for_sock(ctrl->queues[idx].sock->sk);
+		if (!netdev) {
+			dev_info_ratelimited(ctrl->ctrl.device, "netdev not found\n");
+			ctrl->offloading_netdev = NULL;
+			goto done;
+		}
+		if (is_netdev_ulp_offload_active(netdev))
+			nvme_tcp_offload_limits(&ctrl->queues[idx], netdev);
+		/* release the device as no offload context is established yet. */
+		dev_put(netdev);
 	}
+
+done:
+	set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags);
+	return 0;
+err:
+	if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags))
+		__nvme_tcp_stop_queue(&ctrl->queues[idx]);
+	dev_err(nctrl->device,
+		"failed to connect queue: %d ret=%d\n", idx, ret);
 	return ret;
 }

From patchwork Mon Jan 9 13:30:59 2023
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Subject: [PATCH v8 08/25] nvme-tcp: Add DDP data-path
Date: Mon, 9 Jan 2023 15:30:59 +0200
Message-Id: <20230109133116.20801-9-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

From: Boris Pismenny

Introduce the NVMe-TCP DDP data-path offload.

Using this interface, the NIC hardware will scatter TCP payload directly
to the BIO pages according to the command_id in the PDU. To maintain the
correctness of the network stack, the driver is expected to construct
SKBs that point to the BIO pages.

The data-path interface contains two routines: ulp_ddp_setup/teardown.
The setup provides the mapping from command_id to the request buffers,
while the teardown removes this mapping.

For efficiency, we introduce an asynchronous nvme completion, which is
split between NVMe-TCP and the NIC driver as follows: NVMe-TCP performs
the specific completion, while the NIC driver performs the generic
blk-mq completion.
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Chaitanya Kulkarni
---
 drivers/nvme/host/tcp.c | 117 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3c35290d630f..718d968d94d6 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -101,6 +101,13 @@ struct nvme_tcp_request {
 	size_t			offset;
 	size_t			data_sent;
 	enum nvme_tcp_send_state state;
+
+#ifdef CONFIG_ULP_DDP
+	bool			offloaded;
+	struct ulp_ddp_io	ddp;
+	__le16			ddp_status;
+	union nvme_result	result;
+#endif
 };
 
 enum nvme_tcp_queue_flags {
@@ -306,11 +313,75 @@ static bool nvme_tcp_ddp_query_limits(struct net_device *netdev,
 	return true;
 }
 
+static int nvme_tcp_req_map_sg(struct nvme_tcp_request *req, struct request *rq)
+{
+	int ret;
+
+	req->ddp.sg_table.sgl = req->ddp.first_sgl;
+	ret = sg_alloc_table_chained(&req->ddp.sg_table,
+				     blk_rq_nr_phys_segments(rq),
+				     req->ddp.sg_table.sgl, SG_CHUNK_SIZE);
+	if (ret)
+		return -ENOMEM;
+	req->ddp.nents = blk_rq_map_sg(rq->q, rq, req->ddp.sg_table.sgl);
+	return 0;
+}
+
 static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags);
+static void nvme_tcp_ddp_teardown_done(void *ddp_ctx);
 static const struct ulp_ddp_ulp_ops nvme_tcp_ddp_ulp_ops = {
 	.resync_request		= nvme_tcp_resync_request,
+	.ddp_teardown_done	= nvme_tcp_ddp_teardown_done,
 };
 
+static void nvme_tcp_teardown_ddp(struct nvme_tcp_queue *queue,
+				  struct request *rq)
+{
+	struct net_device *netdev = queue->ctrl->offloading_netdev;
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+
+	netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_teardown(netdev, queue->sock->sk,
+							  &req->ddp, rq);
+	sg_free_table_chained(&req->ddp.sg_table, SG_CHUNK_SIZE);
+}
+
+static void nvme_tcp_ddp_teardown_done(void *ddp_ctx)
+{
+	struct request *rq = ddp_ctx;
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+
+	if (!nvme_try_complete_req(rq, req->ddp_status, req->result))
+		nvme_complete_rq(rq);
+}
+
+static int nvme_tcp_setup_ddp(struct nvme_tcp_queue *queue, u16 command_id,
+			      struct request *rq)
+{
+	struct net_device *netdev = queue->ctrl->offloading_netdev;
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+	int ret;
+
+	if (!(rq_data_dir(rq) == READ) ||
+	    queue->ctrl->offload_io_threshold > blk_rq_payload_bytes(rq))
+		return 0;
+
+	req->ddp.command_id = command_id;
+	ret = nvme_tcp_req_map_sg(req, rq);
+	if (ret)
+		return -ENOMEM;
+
+	ret = netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_setup(netdev,
+							     queue->sock->sk,
+							     &req->ddp);
+	if (ret) {
+		sg_free_table_chained(&req->ddp.sg_table, SG_CHUNK_SIZE);
+		return ret;
+	}
+
+	/* if successful, sg table is freed in nvme_tcp_teardown_ddp() */
+	req->offloaded = true;
+	return 0;
+}
+
 static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
 {
 	struct net_device *netdev = queue->ctrl->offloading_netdev;
@@ -445,6 +516,12 @@ static inline bool is_netdev_ulp_offload_active(struct net_device *netdev)
 	return false;
 }
 
+static int nvme_tcp_setup_ddp(struct nvme_tcp_queue *queue, u16 command_id,
+			      struct request *rq)
+{
+	return 0;
+}
+
 static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
 {
 	return 0;
@@ -731,6 +808,26 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
 	queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
 }
 
+static void nvme_tcp_complete_request(struct request *rq,
+				      __le16 status,
+				      union nvme_result result,
+				      __u16 command_id)
+{
+#ifdef CONFIG_ULP_DDP
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+
+	if (req->offloaded) {
+		req->ddp_status = status;
+		req->result = result;
+		nvme_tcp_teardown_ddp(req->queue, rq);
+		return;
+	}
+#endif
+
+	if (!nvme_try_complete_req(rq, status, result))
+		nvme_complete_rq(rq);
+}
+
 static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 		struct nvme_completion *cqe)
 {
@@ -750,10 +847,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 	if (req->status == cpu_to_le16(NVME_SC_SUCCESS))
 		req->status = cqe->status;
 
-	if (!nvme_try_complete_req(rq, req->status, cqe->result))
-		nvme_complete_rq(rq);
+	nvme_tcp_complete_request(rq, req->status, cqe->result, cqe->command_id);
 	queue->nr_cqe++;
-
 	return 0;
 }
 
@@ -951,10 +1046,12 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 
 static inline void nvme_tcp_end_request(struct request *rq, u16 status)
 {
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+	struct nvme_tcp_queue *queue = req->queue;
+	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
 	union nvme_result res = {};
 
-	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
-		nvme_complete_rq(rq);
+	nvme_tcp_complete_request(rq, cpu_to_le16(status << 1), res, pdu->command_id);
 }
 
 static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
@@ -1252,6 +1349,13 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req)
 	else
 		flags |= MSG_EOR;
 
+	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags)) {
+		ret = nvme_tcp_setup_ddp(queue, pdu->cmd.common.command_id,
+					 blk_mq_rq_from_pdu(req));
+		WARN_ONCE(ret, "ddp setup failed (queue 0x%x, cid 0x%x, ret=%d)",
+			  nvme_tcp_queue_id(queue), pdu->cmd.common.command_id, ret);
+	}
+
 	if (queue->hdr_digest && !req->offset)
 		nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu));
 
@@ -2580,6 +2684,9 @@ static blk_status_t nvme_tcp_setup_cmd_pdu(struct nvme_ns *ns,
 	if (ret)
 		return ret;
 
+#ifdef CONFIG_ULP_DDP
+	req->offloaded = false;
+#endif
 	req->state = NVME_TCP_SEND_CMD_PDU;
 	req->status = cpu_to_le16(NVME_SC_SUCCESS);
 	req->offset = 0;

From patchwork Mon Jan 9 13:31:00 2023
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Yoray Zack, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 09/25] nvme-tcp: RX DDGST offload
Date: Mon, 9 Jan 2023 15:31:00 +0200
Message-Id: <20230109133116.20801-10-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Yoray Zack

Enable rx side of DDGST offload when supported. At the end of the
capsule, check if all the skb bits are on, and if not recalculate the
DDGST in SW and check it.

Signed-off-by: Yoray Zack
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Chaitanya Kulkarni
---
 drivers/nvme/host/tcp.c | 138 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 124 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 718d968d94d6..4bd2b03dcf4f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -115,6 +115,7 @@ enum nvme_tcp_queue_flags {
 	NVME_TCP_Q_LIVE		= 1,
 	NVME_TCP_Q_POLLING	= 2,
 	NVME_TCP_Q_OFF_DDP	= 3,
+	NVME_TCP_Q_OFF_DDGST_RX	= 4,
 };
 
 enum nvme_tcp_recv_state {
@@ -142,6 +143,9 @@ struct nvme_tcp_queue {
 	size_t ddgst_remaining;
 	unsigned int nr_cqe;
 
+#ifdef CONFIG_ULP_DDP
+	bool ddp_ddgst_valid;
+
 	/*
 	 * resync_req is a speculative PDU header tcp seq number (with
 	 * an additional flag at 32 lower bits) that the HW send to
	 * is pending (ULP_DDP_RESYNC_PENDING).
 	 */
 	atomic64_t resync_req;
+#endif
 
 	/* send state */
 	struct nvme_tcp_request *request;
 
@@ -287,9 +292,21 @@ static inline size_t nvme_tcp_pdu_last_send(struct nvme_tcp_request *req,
 
 #ifdef CONFIG_ULP_DDP
 
-static inline bool is_netdev_ulp_offload_active(struct net_device *netdev)
+static inline bool is_netdev_ulp_offload_active(struct net_device *netdev,
+						struct nvme_tcp_queue *queue)
 {
-	return test_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.active);
+	bool ddgst_offload;
+
+	if (test_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.active))
+		return true;
+
+	ddgst_offload = test_bit(ULP_DDP_C_NVME_TCP_DDGST_RX_BIT, netdev->ulp_ddp_caps.active);
+	if (!queue && ddgst_offload)
+		return true;
+	if (queue && queue->data_digest && ddgst_offload)
+		return true;
+
+	return false;
 }
 
 static bool nvme_tcp_ddp_query_limits(struct net_device *netdev,
@@ -297,7 +314,7 @@
 {
 	int ret;
 
-	if (!netdev || !is_netdev_ulp_offload_active(netdev) ||
+	if (!netdev || !is_netdev_ulp_offload_active(netdev, NULL) ||
 	    !netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_limits)
 		return false;
 
@@ -313,6 +330,18 @@ static bool nvme_tcp_ddp_query_limits(struct net_device *netdev,
 	return true;
 }
 
+static inline bool nvme_tcp_ddp_ddgst_ok(struct nvme_tcp_queue *queue)
+{
+	return queue->ddp_ddgst_valid;
+}
+
+static inline void nvme_tcp_ddp_ddgst_update(struct nvme_tcp_queue *queue,
+					     struct sk_buff *skb)
+{
+	if (queue->ddp_ddgst_valid)
+		queue->ddp_ddgst_valid = skb_is_ulp_crc(skb);
+}
+
 static int nvme_tcp_req_map_sg(struct nvme_tcp_request *req, struct request *rq)
 {
 	int ret;
@@ -327,6 +356,38 @@ static int nvme_tcp_req_map_sg(struct nvme_tcp_request *req, struct request *rq)
 	return 0;
 }
 
+static void nvme_tcp_ddp_ddgst_recalc(struct ahash_request *hash,
+				      struct request *rq,
+				      __le32 *ddgst)
+{
+	struct nvme_tcp_request *req;
+
+	if (!rq)
+		return;
+
+	req = blk_mq_rq_to_pdu(rq);
+
+	if (!req->offloaded) {
+		/* if we have DDGST_RX offload without DDP the request
+		 * wasn't mapped, so we need to map it here
+		 */
+		if (nvme_tcp_req_map_sg(req, rq))
+			return;
+	}
+
+	req->ddp.sg_table.sgl = req->ddp.first_sgl;
+	ahash_request_set_crypt(hash, req->ddp.sg_table.sgl, (u8 *)ddgst,
+				req->data_len);
+	crypto_ahash_digest(hash);
+
+	if (!req->offloaded) {
+		/* without DDP, ddp_teardown() won't be called, so
+		 * free the table here
+		 */
+		sg_free_table_chained(&req->ddp.sg_table, SG_CHUNK_SIZE);
+	}
+}
+
 static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags);
 static void nvme_tcp_ddp_teardown_done(void *ddp_ctx);
 static const struct ulp_ddp_ulp_ops nvme_tcp_ddp_ulp_ops = {
@@ -386,6 +447,9 @@ static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
 {
 	struct net_device *netdev = queue->ctrl->offloading_netdev;
 	struct ulp_ddp_config config = {.type = ULP_DDP_NVME};
+	bool offload_ddp = test_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.active);
+	bool offload_ddgst_rx = test_bit(ULP_DDP_C_NVME_TCP_DDGST_RX_BIT,
+					 netdev->ulp_ddp_caps.active);
 	int ret;
 
 	config.nvmeotcp.pfv = NVME_TCP_PFV_1_0;
@@ -410,7 +474,10 @@ static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
 	}
 
 	inet_csk(queue->sock->sk)->icsk_ulp_ddp_ops = &nvme_tcp_ddp_ulp_ops;
-	set_bit(NVME_TCP_Q_OFF_DDP, &queue->flags);
+	if (offload_ddp)
+		set_bit(NVME_TCP_Q_OFF_DDP, &queue->flags);
+	if (queue->data_digest && offload_ddgst_rx)
+		set_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags);
 
 	return 0;
 }
@@ -424,6 +491,7 @@ static void nvme_tcp_unoffload_socket(struct nvme_tcp_queue *queue)
 	}
 
 	clear_bit(NVME_TCP_Q_OFF_DDP, &queue->flags);
+	clear_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags);
 
 	netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_sk_del(netdev, queue->sock->sk);
 
@@ -511,11 +579,26 @@ static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags)
 
 #else
 
-static inline bool is_netdev_ulp_offload_active(struct net_device *netdev)
+static inline bool is_netdev_ulp_offload_active(struct net_device *netdev,
+						struct nvme_tcp_queue *queue)
 {
 	return false;
 }
 
+static inline bool nvme_tcp_ddp_ddgst_ok(struct nvme_tcp_queue *queue)
+{
+	return true;
+}
+
+static inline void nvme_tcp_ddp_ddgst_update(struct nvme_tcp_queue *queue,
+					     struct sk_buff *skb)
+{}
+
+static void nvme_tcp_ddp_ddgst_recalc(struct ahash_request *hash,
+				      struct request *rq,
+				      __le32 *ddgst)
+{}
+
 static int nvme_tcp_setup_ddp(struct nvme_tcp_queue *queue, u16 command_id,
 			      struct request *rq)
 {
@@ -797,6 +880,9 @@ static void nvme_tcp_init_recv_ctx(struct nvme_tcp_queue *queue)
 	queue->pdu_offset = 0;
 	queue->data_remaining = -1;
 	queue->ddgst_remaining = 0;
+#ifdef CONFIG_ULP_DDP
+	queue->ddp_ddgst_valid = true;
+#endif
 }
 
 static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
@@ -999,7 +1085,8 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 	size_t rcv_len = min_t(size_t, *len, queue->pdu_remaining);
 	int ret;
 
-	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags))
+	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags) ||
+	    test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags))
 		nvme_tcp_resync_response(queue, skb, *offset);
 
 	ret = skb_copy_bits(skb, *offset,
@@ -1062,6 +1149,10 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
+	if (queue->data_digest &&
+	    test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags))
+		nvme_tcp_ddp_ddgst_update(queue, skb);
+
 	while (true) {
 		int recv_len, ret;
 
@@ -1090,7 +1181,8 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 		recv_len = min_t(size_t, recv_len, iov_iter_count(&req->iter));
 
-		if (queue->data_digest)
+		if (queue->data_digest &&
+		    !test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags))
 			ret = skb_copy_and_hash_datagram_iter(skb, *offset,
 				&req->iter, recv_len, queue->rcv_hash);
 		else
@@ -1132,8 +1224,11 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
 	char *ddgst = (char *)&queue->recv_ddgst;
 	size_t recv_len = min_t(size_t, *len, queue->ddgst_remaining);
 	off_t off = NVME_TCP_DIGEST_LENGTH - queue->ddgst_remaining;
+	struct request *rq;
 	int ret;
 
+	if (test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags))
+		nvme_tcp_ddp_ddgst_update(queue, skb);
 	ret = skb_copy_bits(skb, *offset, &ddgst[off], recv_len);
 	if (unlikely(ret))
 		return ret;
@@ -1144,9 +1239,24 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
 	if (queue->ddgst_remaining)
 		return 0;
 
+	rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
+			    pdu->command_id);
+
+	if (test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags)) {
+		/*
+		 * If HW successfully offloaded the digest
+		 * verification, we can skip it
+		 */
+		if (nvme_tcp_ddp_ddgst_ok(queue))
+			goto out;
+		/*
+		 * Otherwise we have to recalculate and verify the
+		 * digest with the software-fallback
+		 */
+		nvme_tcp_ddp_ddgst_recalc(queue->rcv_hash, rq, &queue->exp_ddgst);
+	}
+
 	if (queue->recv_ddgst != queue->exp_ddgst) {
-		struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
-						    pdu->command_id);
 		struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 		req->status = cpu_to_le16(NVME_SC_DATA_XFER_ERROR);
@@ -1157,9 +1267,8 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
 			le32_to_cpu(queue->exp_ddgst));
 	}
 
+out:
 	if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
-		struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
-						    pdu->command_id);
 		struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 		nvme_tcp_end_request(rq, le16_to_cpu(req->status));
@@ -1966,7 +2075,8 @@ static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue)
 	kernel_sock_shutdown(queue->sock, SHUT_RDWR);
 	nvme_tcp_restore_sock_calls(queue);
 	cancel_work_sync(&queue->io_work);
-	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags))
+	if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags) ||
+	    test_bit(NVME_TCP_Q_OFF_DDGST_RX, &queue->flags))
 		nvme_tcp_unoffload_socket(queue);
 }
 
@@ -1996,7 +2106,7 @@ static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
 		goto err;
 
 	netdev = ctrl->queues[idx].ctrl->offloading_netdev;
-	if (netdev && is_netdev_ulp_offload_active(netdev)) {
+	if (netdev && is_netdev_ulp_offload_active(netdev, &ctrl->queues[idx])) {
 		ret = nvme_tcp_offload_socket(&ctrl->queues[idx]);
 		if (ret) {
 			dev_err(nctrl->device,
@@ -2015,7 +2125,7 @@ static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
 		ctrl->offloading_netdev = NULL;
 		goto done;
 	}
-	if (is_netdev_ulp_offload_active(netdev))
+	if (is_netdev_ulp_offload_active(netdev, &ctrl->queues[idx]))
 		nvme_tcp_offload_limits(&ctrl->queues[idx], netdev);
 	/* release the device as no offload context is established yet. */
 	dev_put(netdev);

From patchwork Mon Jan 9 13:31:01 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093574
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Or Gerlitz, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 10/25] nvme-tcp: Deal with netdevice DOWN events
Date: Mon, 9 Jan 2023 15:31:01 +0200
Message-Id: <20230109133116.20801-11-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Or Gerlitz

For ddp setup/teardown and resync, the offloading logic uses HW
resources at the NIC driver such as SQ and CQ. These resources are
destroyed when the netdevice goes down, so we must stop using them
before the NIC driver destroys them.

Use a netdevice notifier for that matter -- offloaded connections are
stopped before the stack continues to call the NIC driver close ndo.

We use the existing recovery flow, which has the advantage of resuming
the offload once the connection is re-established. This also buys us
proper handling of the UNREGISTER event, because our offloading starts
in the UP state and DOWN always occurs between UP and UNREGISTER.
Signed-off-by: Or Gerlitz
Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Chaitanya Kulkarni
---
 drivers/nvme/host/tcp.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4bd2b03dcf4f..52e0db53d067 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -203,6 +203,7 @@ struct nvme_tcp_ctrl {
 
 static LIST_HEAD(nvme_tcp_ctrl_list);
 static DEFINE_MUTEX(nvme_tcp_ctrl_mutex);
+static struct notifier_block nvme_tcp_netdevice_nb;
 static struct workqueue_struct *nvme_tcp_wq;
 static const struct blk_mq_ops nvme_tcp_mq_ops;
 static const struct blk_mq_ops nvme_tcp_admin_mq_ops;
@@ -3107,6 +3108,30 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 	return ERR_PTR(ret);
 }
 
+static int nvme_tcp_netdev_event(struct notifier_block *this,
+				 unsigned long event, void *ptr)
+{
+	struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
+	struct nvme_tcp_ctrl *ctrl;
+
+	switch (event) {
+	case NETDEV_GOING_DOWN:
+		mutex_lock(&nvme_tcp_ctrl_mutex);
+		list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) {
+			if (ndev != ctrl->offloading_netdev)
+				continue;
+			nvme_tcp_error_recovery(&ctrl->ctrl);
+		}
+		mutex_unlock(&nvme_tcp_ctrl_mutex);
+		flush_workqueue(nvme_reset_wq);
+		/*
+		 * The associated controllers teardown has completed, ddp contexts
+		 * were also torn down so we should be safe to continue...
+		 */
+	}
+	return NOTIFY_DONE;
+}
+
 static struct nvmf_transport_ops nvme_tcp_transport = {
 	.name		= "tcp",
 	.module		= THIS_MODULE,
@@ -3121,13 +3146,26 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
 
 static int __init nvme_tcp_init_module(void)
 {
+	int ret;
+
 	nvme_tcp_wq = alloc_workqueue("nvme_tcp_wq",
 			WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
 	if (!nvme_tcp_wq)
 		return -ENOMEM;
 
+	nvme_tcp_netdevice_nb.notifier_call = nvme_tcp_netdev_event;
+	ret = register_netdevice_notifier(&nvme_tcp_netdevice_nb);
+	if (ret) {
+		pr_err("failed to register netdev notifier\n");
+		goto out_free_workqueue;
+	}
+
 	nvmf_register_transport(&nvme_tcp_transport);
 	return 0;
+
+out_free_workqueue:
+	destroy_workqueue(nvme_tcp_wq);
+	return ret;
 }
 
 static void __exit nvme_tcp_cleanup_module(void)
@@ -3135,6 +3173,7 @@ static void __exit nvme_tcp_cleanup_module(void)
 	struct nvme_tcp_ctrl *ctrl;
 
 	nvmf_unregister_transport(&nvme_tcp_transport);
+	unregister_netdevice_notifier(&nvme_tcp_netdevice_nb);
 
 	mutex_lock(&nvme_tcp_ctrl_mutex);
 	list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list)

From patchwork Mon Jan 9 13:31:02 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093577
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 11/25] nvme-tcp: Add modparam to control the ULP offload enablement
Date: Mon, 9 Jan 2023 15:31:02 +0200
Message-Id: <20230109133116.20801-12-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: WwzJmgz1/dEz/+LWSi55lSzxQWXKwTdZV1yT8JGvvZKT8+gtufCJKxvdVvXCkNGls/tnpGYUagxgrS/yxgJarChUkWb2t/G/yAiAhBIFi4ROL1RMBSZOjgXpo9BnqjrYVIqK3Gb34fkLE5ZghBCCluvVATdm7cpfSS3LL0oF8e8tr51AVmY/LWOwuXGU0PJj5rodSgQd6H1vIzLCj+dVd9pt1llYe39S6YfEskrnTxV903zFDwXcGEmDQmNfh9gMwuzc4xM0PwpOzxq6si7zAzs8YjOT6Hs/MRN0OHUFgduIUqH+ZjTRD0x3j9kCU8e6s6IJr4o164aryudfvZZHcJWxFXpCdRxfBx2fpw1+ZR1+FoGpy8IMi4m+zpiq4zwRpcj44UU1bwjg1Wbs9kHc1TWvpDrXBsX5ocKmlfeamH8ZId61Bgh8IXAwa5twNfA1ICts0dOTHbJrQ4ab/B1w/C3rI/QVGE5DolrJE0gPrLVgMHscRvKKF+euPVS6NSteRHZ2GVboKB+P+eTaw+bu3bhQfWfaiD26eNH9GJNpcRIqZuz4lZqxASP8TugxxM9Mv5KZI64BLPM0WS2ZzChtmEYBlCMyiFR5vnL+cKcnOXZZxIDx8u8vCjlH1KDFFwflbzyYJFlQZOFkgyQ5IWH1lQ== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ1PR12MB6075.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(4636009)(376002)(396003)(39860400002)(136003)(346002)(366004)(451199015)(2906002)(83380400001)(1076003)(2616005)(66476007)(7416002)(66556008)(5660300002)(66946007)(6666004)(107886003)(26005)(186003)(36756003)(8936002)(6506007)(6512007)(478600001)(38100700002)(8676002)(41300700001)(6486002)(86362001)(4326008)(316002);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
XqXTB5YjaGrBeuNKjN2FeBGNyIwq/Ouyrlhau1zJH1UO0BVWUL4TQ31NuPn2TeYMFgQgGm6qsXnpabjgJwBmYWbiUiDHdn7fLKqQLVCy8mIL0ENawcKpBM08NGaGG4NsP8DmwxrS97PBMQRXZ1GbGeyjbRnOleHz1OFECESpVy34FALbqGMOuXpQpbdrjGGdo3Ry8LVf5D8YwnjGZ1bSSgRcFDfJ9cpQlEOJA6x+177FZuTpFjHFy/r77M3dGp4bOEIJQ9p3gWUFUa2+HhFbSQZc8n8TKhgTurK6EFN+A8TW2CKfvXwOYy/rZl4e/U2N7w7rbe3RrwD+z+P8pUpS7+vxIhdfLMFjwwKzrRCCcpJIJWI78xOY/VzLATwVEyN+mKFxum4xQFhRFKkaGGJJ2unSNZS9d+rtSsOhmWbGSEx8BXREisBYjWZn4WNfFwppYyyLCoaGyQgIUYiBPZKbfYntj03aOMD0NbRi4VnmDaL5oAXezggVds+uEAiZFmJKjGHI8O66/8SY8fsY3hORrvCqt+CExqidJ299GR/M52GGB29nMYbdH8xWDhjCLm0Ra9QXEtqje1gFyDL/+07CqzffkFBpdWL0CJGBOpBjJZ1sTodGRpHbzr+cG44xcD/WeGF9Bh076vLh5o3xn6IWGdeynPIfE/lhhreJRGiBzod6do9bpZWxu/EqPCyyFU8K2M7E4B0fx630Fhdk7DqF0tRMm5JmOS3PapWHoC0swM0BJLiRdPZqx4ptAmyVJobCAXp+EFwJv+dN6Aw1HbPtuIfMmPRxjpI+ErFNN5MUncZRu5CrFe8IgxEHMcPB8+fOvWH56B0M8UeNUUm/okIrrxARMvtJX7qqui21ut+h6oyKE0KqXc1Tlqi1YpF5QPqrLvDYdTXlnDNYpoqhtlSuh0XnNALaxDcDr97bmkOP3QNB3msOThSUDnr2Red6eF7Ee822VMqBVQlFgo9GCXhwN7D/60Uf2Ab0Wkrt82rNUvJkzOxfuktoqFNtH1LEtYZMDC0LpvQvI7oGT9BttKvusEVs3t4Xf5JwyhmaTWVmo6BK/DeLjvxPJ2s4YhN5qzI4T12+KTH5H9asm6NYQF5EyhLhoCunN1GruML2XwaiLzsc/qdJXWHk7fPat4ciqdvWTUAFtnzesCzOeptxzm7zVhSj2oj87COxMtL5T0qbEs4pUCGKOyWnYJuclwJUuPxFdhMbmuN+hwnUvCU/UYHJf0mMUxn78FD0yMBUkoG6Xdp3tafFqXE6rIuUQ9nhXUeNlTt2cP2LWg9ORqoVTzvddtJ9AFYF4P3uCJ8vuPZNnoTATAL2VvNANvlqMc24L1alvr7WYKNOnqCSZ+BOkDBI5c3SZ4+MzBBXF+BkZgsxW86LFoSU856FDSACUlzq3QwqjQUYmQQjnX3FFTd5yP6oE01lFxYS1NnJDkquABurWot1D3mCazRD1VXGQ1cD2awKjx4l/Gi6FyiF2W1wRZXDEZcfu98K3RgjgmaQoP7AKEdW6ef3gtsBHwqrs4TIUsbgdMqJ/k3tk9Y9afOsswokUuTqUbssxTrDZM1ux8LBoIfFCqYgU4+BQejv8tg5aIQb X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2b5b8070-a784-4d2b-2dea-08daf245f87a X-MS-Exchange-CrossTenant-AuthSource: SJ1PR12MB6075.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Jan 2023 13:32:36.2706 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
X-Mailing-List: netdev@vger.kernel.org

Add a ulp_offload module parameter to the nvme-tcp module to control
ULP offload at the NVMe-TCP layer. Turn ULP offload off by default,
regardless of the NIC driver support.

Overall, in order to enable ULP offload:

- nvme-tcp ulp_offload modparam must be set to 1
- netdev->ulp_ddp_caps.active must have the ULP_DDP_C_NVME_TCP and/or
  ULP_DDP_C_NVME_TCP_DDGST_RX capability flags set

Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 drivers/nvme/host/tcp.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 52e0db53d067..1ce9de41a2f5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -34,6 +34,16 @@ static int so_priority;
 module_param(so_priority, int, 0644);
 MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
 
+#ifdef CONFIG_ULP_DDP
+/* NVMeTCP direct data placement and data digest offload will not
+ * happen if this parameter is false (default), regardless of what the
+ * underlying netdev capabilities are.
+ */
+static bool ulp_offload;
+module_param(ulp_offload, bool, 0644);
+MODULE_PARM_DESC(ulp_offload, "Enable or disable NVMeTCP ULP support");
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 /* lockdep can detect a circular dependency of the form
  * sk_lock -> mmap_lock (page fault) -> fs locks -> sk_lock
@@ -315,6 +325,9 @@ static bool nvme_tcp_ddp_query_limits(struct net_device *netdev,
 {
 	int ret;
 
+	if (!ulp_offload)
+		return false;
+
 	if (!netdev || !is_netdev_ulp_offload_active(netdev, NULL) ||
 	    !netdev->netdev_ops->ulp_ddp_ops->ulp_ddp_limits)
 		return false;
@@ -453,6 +466,9 @@ static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
 			    netdev->ulp_ddp_caps.active);
 	int ret;
 
+	if (!ulp_offload)
+		return 0;
+
 	config.nvmeotcp.pfv = NVME_TCP_PFV_1_0;
 	config.nvmeotcp.cpda = 0;
 	config.nvmeotcp.dgst =
 		queue->hdr_digest ? NVME_TCP_HDR_DIGEST_ENABLE : 0;
@@ -504,6 +520,9 @@ static void nvme_tcp_offload_limits(struct nvme_tcp_queue *queue, struct net_dev
 {
 	struct ulp_ddp_limits limits = {.type = ULP_DDP_NVME };
 
+	if (!ulp_offload)
+		return;
+
 	if (!nvme_tcp_ddp_query_limits(netdev, &limits)) {
 		queue->ctrl->offloading_netdev = NULL;
 		return;

From patchwork Mon Jan 9 13:31:03 2023
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Yoray Zack, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 12/25] Documentation: add ULP DDP offload documentation
Date: Mon, 9 Jan 2023 15:31:03 +0200
Message-Id: <20230109133116.20801-13-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
MIME-Version: 1.0
X-Mailing-List: netdev@vger.kernel.org
From: Yoray Zack

Document the new ULP DDP API and add it under "networking".
Use the NVMe-TCP implementation as an example.

Signed-off-by: Boris Pismenny
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
---
 Documentation/networking/index.rst           |   1 +
 Documentation/networking/ulp-ddp-offload.rst | 374 +++++++++++++++++++
 2 files changed, 375 insertions(+)
 create mode 100644 Documentation/networking/ulp-ddp-offload.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 4f2d1f682a18..10dbbb6694dc 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -106,6 +106,7 @@ Contents:
    tc-actions-env-rules
    tc-queue-filters
    tcp-thin
+   ulp-ddp-offload
    team
    timestamping
    tipc

diff --git a/Documentation/networking/ulp-ddp-offload.rst b/Documentation/networking/ulp-ddp-offload.rst
new file mode 100644
index 000000000000..55a16662938b
--- /dev/null
+++ b/Documentation/networking/ulp-ddp-offload.rst
@@ -0,0 +1,374 @@
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+=================================
+ULP direct data placement offload
+=================================
+
+Overview
+========
+
+The Linux kernel ULP direct data placement (DDP) offload infrastructure
+provides tagged request-response protocols, such as NVMe-TCP, the ability to
+place response data directly in pre-registered buffers according to header
+tags. DDP is particularly useful for data-intensive pipelined protocols whose
+responses may be reordered.
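The tag-based placement the overview describes can be modeled in a few lines of plain C. This is only an illustrative software analogue of what the NIC does in hardware; the names `ddp_table`, `ddp_register` and `ddp_place` are invented for this sketch and are not part of the kernel ULP DDP API:

```c
/* Software model of tag-based direct data placement.
 * All names here are made up for illustration only.
 */
#include <assert.h>
#include <string.h>

#define MAX_TAGS 8

struct ddp_table {
	void  *dst[MAX_TAGS];   /* tag -> pre-registered destination buffer */
	size_t len[MAX_TAGS];   /* capacity of each registered buffer       */
};

/* Register a destination buffer under a tag (the CID in NVMe-TCP). */
static void ddp_register(struct ddp_table *t, unsigned int tag,
			 void *buf, size_t len)
{
	t->dst[tag] = buf;
	t->len[tag] = len;
}

/* "Place" a response payload: the tag carried in the PDU header selects
 * the buffer, regardless of the order in which responses arrive. */
static int ddp_place(struct ddp_table *t, unsigned int tag,
		     const void *payload, size_t len)
{
	if (tag >= MAX_TAGS || !t->dst[tag] || len > t->len[tag])
		return -1;      /* not offloadable: fall back to software */
	memcpy(t->dst[tag], payload, len);
	return 0;
}
```

Because the tag alone selects the destination, responses can be placed in any order, which is what lets DDP tolerate reordered responses.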
+
+For example, in NVMe-TCP numerous read requests are sent together and each
+request is tagged using the PDU header CID field. Receiving servers process
+requests as fast as possible and sometimes responses for smaller requests
+bypass responses to larger requests, e.g., 4KB reads bypass 1GB reads.
+Thereafter, clients correlate responses to requests using PDU header CID tags.
+The processing of each response requires copying data from SKBs to read
+request destination buffers; the offload avoids this copy. The offload is
+oblivious to destination buffers which can reside either in userspace
+(O_DIRECT) or in kernel pagecache.
+
+Request TCP byte-stream:
+
+.. parsed-literal::
+
+ +---------------+-------+---------------+-------+---------------+-------+
+ | PDU hdr CID=1 | Req 1 | PDU hdr CID=2 | Req 2 | PDU hdr CID=3 | Req 3 |
+ +---------------+-------+---------------+-------+---------------+-------+
+
+Response TCP byte-stream:
+
+.. parsed-literal::
+
+ +---------------+--------+---------------+--------+---------------+--------+
+ | PDU hdr CID=2 | Resp 2 | PDU hdr CID=3 | Resp 3 | PDU hdr CID=1 | Resp 1 |
+ +---------------+--------+---------------+--------+---------------+--------+
+
+The driver builds SKB page fragments that point to destination buffers.
+Consequently, SKBs represent the original data on the wire, which enables
+*transparent* inter-operation with the network stack. To avoid copies between
+SKBs and destination buffers, the layer-5 protocol (L5P) will check
+``if (src == dst)`` for SKB page fragments; success indicates that data is
+already placed there by NIC hardware and the copy should be skipped.
+
+In addition, L5P might have DDGST which ensures data integrity over
+the network. If not offloaded, ULP DDP might not be efficient as L5P
+will need to go over the data and calculate it by itself, cancelling
+out the benefits of the DDP copy skip. ULP DDP has support for Rx/Tx
+DDGST offload. On the receive side the NIC will verify DDGST for
+received PDUs and update SKB->ulp_ddp and SKB->ulp_crc bits. If all the SKBs
+making up a L5P PDU have crc on, L5P will skip calculating and
+verifying the DDGST for the corresponding PDU. On the Tx side, the NIC
+will be responsible for calculating and filling the DDGST fields in
+the sent PDUs.
+
+Offloading does require NIC hardware to track L5P protocol framing, similarly
+to RX TLS offload (see Documentation/networking/tls-offload.rst). NIC hardware
+will parse PDU headers, extract fields such as operation type, length, tag
+identifier, etc. and only offload segments that correspond to tags registered
+with the NIC, see the :ref:`buf_reg` section.
+
+Device configuration
+====================
+
+During driver initialization the driver sets the following
+:c:type:`struct net_device <net_device>` properties:
+
+* The ULP DDP capabilities it supports
+  in :c:type:`struct ulp_ddp_netdev_caps <ulp_ddp_netdev_caps>`
+* The ULP DDP operations pointer in :c:type:`struct ulp_ddp_dev_ops`.
+
+The current list of capabilities is:
+
+.. code-block:: c
+
+  enum ulp_ddp_offload_capabilities {
+    ULP_DDP_C_NVME_TCP = 1,
+    ULP_DDP_C_NVME_TCP_DDGST_RX = 2,
+  };
+
+The enablement of capabilities can be controlled from userspace via
+netlink. See Documentation/networking/ethtool-netlink.rst for more
+details.
+
+Later, after the L5P completes its handshake, the L5P queries the
+driver for its runtime limitations via the :c:member:`ulp_ddp_limits` operation:
+
+.. code-block:: c
+
+  int (*ulp_ddp_limits)(struct net_device *netdev,
+                        struct ulp_ddp_limits *limits);
+
+
+All L5Ps share a common set of limits and parameters (:c:type:`struct ulp_ddp_limits`):
+
+.. code-block:: c
+
+  /**
+   * struct ulp_ddp_limits - Generic ulp ddp limits: tcp ddp
+   * protocol limits.
+   * Add new instances of ulp_ddp_limits in the union below (nvme-tcp, etc.).
+   *
+   * @type: type of this limits struct
+   * @max_ddp_sgl_len: maximum sgl size supported (zero means no limit)
+   * @io_threshold: minimum payload size required to offload
+   * @nvmeotcp: NVMe-TCP specific limits
+   */
+  struct ulp_ddp_limits {
+    enum ulp_ddp_type type;
+    int max_ddp_sgl_len;
+    int io_threshold;
+    union {
+      /* ... protocol-specific limits ... */
+      struct nvme_tcp_ddp_limits nvmeotcp;
+    };
+  };
+
+But each L5P can also add protocol-specific limits e.g.:
+
+.. code-block:: c
+
+  /**
+   * struct nvme_tcp_ddp_limits - nvme tcp driver limitations
+   *
+   * @full_ccid_range: true if the driver supports the full CID range
+   */
+  struct nvme_tcp_ddp_limits {
+    bool full_ccid_range;
+  };
+
+Once the L5P has made sure the device is supported, the offload
+operations are installed on the socket.
+
+If offload installation fails, then the connection is handled by software as if
+offload was not attempted.
+
+To request offload for a socket `sk`, the L5P calls :c:member:`ulp_ddp_sk_add`:
+
+.. code-block:: c
+
+  int (*ulp_ddp_sk_add)(struct net_device *netdev,
+                        struct sock *sk,
+                        struct ulp_ddp_config *config);
+
+The function returns 0 for success. In case of failure, L5P software should
+fall back to normal non-offloaded operations. The `config` parameter indicates
+the L5P type and any metadata relevant for that protocol. For example, in
+NVMe-TCP the following config is used:
+
+.. code-block:: c
+
+  /**
+   * struct nvme_tcp_ddp_config - nvme tcp ddp configuration for an IO queue
+   *
+   * @pfv: pdu version (e.g., NVME_TCP_PFV_1_0)
+   * @cpda: controller pdu data alignment (dwords, 0's based)
+   * @dgst: digest types enabled.
+   *    The netdev will offload crc if L5P data digest is supported.
+   * @queue_size: number of nvme-tcp IO queue elements
+   * @queue_id: queue identifier
+   * @io_cpu: cpu core running the IO thread for this queue
+   */
+  struct nvme_tcp_ddp_config {
+    u16 pfv;
+    u8 cpda;
+    u8 dgst;
+    int queue_size;
+    int queue_id;
+    int io_cpu;
+  };
+
+When offload is not needed anymore, e.g. when the socket is being released, the L5P
+calls :c:member:`ulp_ddp_sk_del` to release device contexts:
+
+.. code-block:: c
+
+  void (*ulp_ddp_sk_del)(struct net_device *netdev,
+                         struct sock *sk);
+
+Normal operation
+================
+
+At the very least, the device maintains the following state for each connection:
+
+ * 5-tuple
+ * expected TCP sequence number
+ * mapping between tags and corresponding buffers
+ * current offset within PDU, PDU length, current PDU tag
+
+NICs should not assume any correlation between PDUs and TCP packets.
+If TCP packets arrive in-order, offload will place PDU payloads
+directly inside corresponding registered buffers. NIC offload should
+not delay packets. If offload is not possible, then the packet is
+passed as-is to software. To perform offload on incoming packets
+without buffering packets in the NIC, the NIC stores some inter-packet
+state, such as partial PDU headers.
+
+RX data-path
+------------
+
+After the device validates TCP checksums, it can perform DDP offload. The
+packet is steered to the DDP offload context according to the 5-tuple.
+Thereafter, the expected TCP sequence number is checked against the packet
+TCP sequence number. If there is a match, offload is performed: the PDU payload
+is DMA written to the corresponding destination buffer according to the PDU header
+tag. The data should be DMAed only once, and the NIC receive ring will only
+store the remaining TCP and PDU headers.
+
+We remark that a single TCP packet may have numerous PDUs embedded inside. NICs
+can choose to offload one or more of these PDUs according to various
+trade-offs. Possibly, offloading such small PDUs is of little value, and it is
+better to leave it to software.
+
+Upon receiving a DDP offloaded packet, the driver reconstructs the original SKB
+using page frags, while pointing to the destination buffers whenever possible.
+This method enables seamless integration with the network stack, which can
+inspect and modify packet fields transparently to the offload.
+
+.. _buf_reg:
+
+Destination buffer registration
+-------------------------------
+
+To register the mapping between tags and destination buffers for a socket
+`sk`, the L5P calls :c:member:`ulp_ddp_setup` of :c:type:`struct ulp_ddp_ops
+<ulp_ddp_ops>`:
+
+.. code-block:: c
+
+  int (*ulp_ddp_setup)(struct net_device *netdev,
+                       struct sock *sk,
+                       struct ulp_ddp_io *io);
+
+
+The `io` provides the buffer via scatter-gather list (`sg_table`) and
+corresponding tag (`command_id`):
+
+.. code-block:: c
+
+  /**
+   * struct ulp_ddp_io - tcp ddp configuration for an IO request.
+   *
+   * @command_id: identifier on the wire associated with these buffers
+   * @nents: number of entries in the sg_table
+   * @sg_table: describing the buffers for this IO request
+   * @first_sgl: first SGL in sg_table
+   */
+  struct ulp_ddp_io {
+    u32 command_id;
+    int nents;
+    struct sg_table sg_table;
+    struct scatterlist first_sgl[SG_CHUNK_SIZE];
+  };
+
+After the buffers have been consumed by the L5P, to release the NIC mapping of
+buffers the L5P calls :c:member:`ulp_ddp_teardown` of :c:type:`struct
+ulp_ddp_ops <ulp_ddp_ops>`:
+
+.. code-block:: c
+
+  void (*ulp_ddp_teardown)(struct net_device *netdev,
+                           struct sock *sk,
+                           struct ulp_ddp_io *io,
+                           void *ddp_ctx);
+
+`ulp_ddp_teardown` receives the same `io` context and an additional opaque
+`ddp_ctx` that is used for asynchronous teardown, see the :ref:`async_release`
+section.
+
+.. _async_release:
+
+Asynchronous teardown
+---------------------
+
+To tear down the association between tags and buffers and allow tag reuse,
+the NIC driver calls into NIC HW during `ulp_ddp_teardown`. This operation may be
+performed either synchronously or asynchronously. In asynchronous teardown,
+`ulp_ddp_teardown` returns immediately without unmapping NIC HW buffers. Later,
+when the unmapping completes by NIC HW, the NIC driver will call up to L5P
+using :c:member:`ddp_teardown_done` of :c:type:`struct ulp_ddp_ulp_ops`:
+
+.. code-block:: c
+
+  void (*ddp_teardown_done)(void *ddp_ctx);
+
+The `ddp_ctx` parameter passed in `ddp_teardown_done` is the same one provided
+in `ulp_ddp_teardown` and it is used to carry some context about the buffers
+and tags that are released.
+
+Resync handling
+===============
+
+RX
+--
+In presence of packet drops or network packet reordering, the device may lose
+synchronization between the TCP stream and the L5P framing, and require a
+resync with the kernel's TCP stack. When the device is out of sync, no offload
+takes place, and packets are passed as-is to software. Resync is very similar
+to TLS offload (see documentation at Documentation/networking/tls-offload.rst).
+
+If only packets with L5P data are lost or reordered, then resynchronization may
+be avoided by NIC HW that keeps tracking PDU headers. If, however, PDU headers
+are reordered, then resynchronization is necessary.
+
+To resynchronize hardware during traffic, we use a handshake between hardware
+and software. The NIC HW searches for a sequence of bytes that identifies L5P
+headers (i.e., magic pattern). For example, in NVMe-TCP, the PDU operation
+type can be used for this purpose. Using the PDU header length field, the NIC
+HW will continue to find and match magic patterns in subsequent PDU headers. If
+the pattern is missing in an expected position, then searching for the pattern
+starts anew.
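The hardware/software handshake described above can be sketched as a tiny state machine in C. This is a toy model: the names (`resync_state`, `nic_resync_request`, `l5p_confirm`) are invented for the example; in the real flow these steps are the `resync_request` and `ulp_ddp_resync` callbacks exchanged between the NIC driver and the L5P:

```c
/* Toy model of the RX resync handshake between NIC HW and L5P software.
 * All names and the state layout are illustrative only.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct resync_state {
	bool     offload_active;  /* false while device is out of sync      */
	bool     pending;         /* NIC asked L5P to confirm a header      */
	uint32_t candidate_seq;   /* TCP seq of the candidate PDU header    */
};

/* NIC side: a magic pattern was spotted at 'seq'; ask L5P to confirm
 * (the resync_request callback in the real API). */
static void nic_resync_request(struct resync_state *s, uint32_t seq)
{
	s->pending = true;
	s->candidate_seq = seq;
}

/* L5P side: it has now processed the in-order byte stream up to 'seq'
 * and knows whether a PDU header really starts there. Only a confirmed
 * header re-enables offload (the ulp_ddp_resync call in the real API). */
static void l5p_confirm(struct resync_state *s, uint32_t seq, bool is_pdu_hdr)
{
	if (!s->pending || seq != s->candidate_seq)
		return;                  /* stale or unsolicited confirmation */
	s->pending = false;
	if (is_pdu_hdr)
		s->offload_active = true;
}
```

A false positive (the pattern occurred inside payload data) simply leaves offload disabled and the search starts anew, matching the behavior described above.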
+
+The NIC will not resume offload when the magic pattern is first identified.
+Instead, it will request L5P software to confirm that indeed this is a PDU
+header. To request confirmation the NIC driver calls up to L5P using
+:c:member:`*resync_request` of :c:type:`struct ulp_ddp_ulp_ops`:
+
+.. code-block:: c
+
+  bool (*resync_request)(struct sock *sk, u32 seq, u32 flags);
+
+The `seq` parameter contains the TCP sequence of the last byte in the PDU header.
+The `flags` parameter contains a flag (`ULP_DDP_RESYNC_PENDING`) indicating whether
+a request is pending or not.
+L5P software will respond to this request after observing the packet containing
+TCP sequence `seq` in-order. If the PDU header is indeed there, then L5P
+software calls the NIC driver using the :c:member:`ulp_ddp_resync` function of
+the :c:type:`struct ulp_ddp_ops <ulp_ddp_ops>` inside the :c:type:`struct
+net_device <net_device>` while passing the same `seq` to confirm it is a PDU
+header.
+
+.. code-block:: c
+
+  void (*ulp_ddp_resync)(struct net_device *netdev,
+                         struct sock *sk, u32 seq);
+
+Statistics
+==========
+
+Per L5P protocol, the NIC driver must report statistics for the above
+netdevice operations and packets processed by offload.
+These statistics are per-device and can be retrieved from userspace
+via netlink (see Documentation/networking/ethtool-netlink.rst).
+
+For example, NVMe-TCP offload reports:
+
+ * ``rx_nvmeotcp_sk_add`` - number of NVMe-TCP Rx offload contexts created.
+ * ``rx_nvmeotcp_sk_add_fail`` - number of NVMe-TCP Rx offload context creation
+   failures.
+ * ``rx_nvmeotcp_sk_del`` - number of NVMe-TCP Rx offload contexts destroyed.
+ * ``rx_nvmeotcp_ddp_setup`` - number of DDP buffers mapped.
+ * ``rx_nvmeotcp_ddp_setup_fail`` - number of DDP buffer mappings that failed.
+ * ``rx_nvmeotcp_ddp_teardown`` - number of DDP buffers unmapped.
+ * ``rx_nvmeotcp_drop`` - number of packets dropped in the driver due to fatal
+   errors.
+ * ``rx_nvmeotcp_resync`` - number of packets with resync requests.
+ * ``rx_nvmeotcp_offload_packets`` - number of packets that used offload.
+ * ``rx_nvmeotcp_offload_bytes`` - number of bytes placed in DDP buffers.
+
+NIC requirements
+================
+
+NIC hardware should meet the following requirements to provide this offload:
+
+ * Offload must never buffer TCP packets.
+ * Offload must never modify TCP packet headers.
+ * Offload must never reorder TCP packets within a flow.
+ * Offload must never drop TCP packets.
+ * Offload must not depend on any TCP fields beyond the
+   5-tuple and TCP sequence number.

From patchwork Mon Jan 9 13:31:04 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093576
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
    hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
    davem@davemloft.net, kuba@kernel.org
Cc: Or Gerlitz, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
    malin1024@gmail.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 13/25] net/mlx5e: Rename from tls to transport static params
Date: Mon, 9 Jan 2023 15:31:04 +0200
Message-Id: <20230109133116.20801-14-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Or Gerlitz

The static params structure is used in TLS but also in other transports
we're offloading like nvmeotcp:

- Rename the relevant structures/fields
- Create common file for appropriate transports
- Apply changes in the TLS code

No functional change here.

Signed-off-by: Or Gerlitz
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 .../mlx5/core/en_accel/common_utils.h         | 32 +++++++++++++++++
 .../mellanox/mlx5/core/en_accel/ktls.c        |  2 +-
 .../mellanox/mlx5/core/en_accel/ktls_rx.c     |  6 ++--
 .../mellanox/mlx5/core/en_accel/ktls_tx.c     |  8 ++---
 .../mellanox/mlx5/core/en_accel/ktls_txrx.c   | 36 ++++++++-----------
 .../mellanox/mlx5/core/en_accel/ktls_utils.h  | 17 ++-------
 include/linux/mlx5/device.h                   |  8 ++---
 include/linux/mlx5/mlx5_ifc.h                 |  8 +++--
 8 files changed, 67 insertions(+), 50 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/common_utils.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/common_utils.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/common_utils.h
new file mode 100644
index 000000000000..efdf48125848
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/common_utils.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */
+#ifndef __MLX5E_COMMON_UTILS_H__
+#define __MLX5E_COMMON_UTILS_H__
+
+#include "en.h"
+
+struct mlx5e_set_transport_static_params_wqe {
+	struct mlx5_wqe_ctrl_seg ctrl;
+	struct mlx5_wqe_umr_ctrl_seg uctrl;
+	struct mlx5_mkey_seg mkc;
+	struct mlx5_wqe_transport_static_params_seg params;
+};
+
+/* macros for transport_static_params handling */
+#define MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS \
+	(DIV_ROUND_UP(sizeof(struct mlx5e_set_transport_static_params_wqe), MLX5_SEND_WQE_BB))
+
+#define MLX5E_TRANSPORT_FETCH_SET_STATIC_PARAMS_WQE(sq, pi) \
+	((struct mlx5e_set_transport_static_params_wqe *)\
+	 mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_set_transport_static_params_wqe)))
+
+#define MLX5E_TRANSPORT_STATIC_PARAMS_WQE_SZ \
+	(sizeof(struct mlx5e_set_transport_static_params_wqe))
+
+#define MLX5E_TRANSPORT_STATIC_PARAMS_DS_CNT \
+	(DIV_ROUND_UP(MLX5E_TRANSPORT_STATIC_PARAMS_WQE_SZ, MLX5_SEND_WQE_DS))
+
+#define MLX5E_TRANSPORT_STATIC_PARAMS_OCTWORD_SIZE \
+	(MLX5_ST_SZ_BYTES(transport_static_params) / MLX5_SEND_WQE_DS)
+
+#endif /* __MLX5E_COMMON_UTILS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
index da2184c94203..26695e74a475 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
@@ -100,7 +100,7 @@ bool mlx5e_is_ktls_rx(struct mlx5_core_dev *mdev)
 		return false;
 
 	/* Check the possibility to post the required ICOSQ WQEs. */
-	if (WARN_ON_ONCE(max_sq_wqebbs < MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS))
+	if (WARN_ON_ONCE(max_sq_wqebbs < MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS))
 		return false;
 	if (WARN_ON_ONCE(max_sq_wqebbs < MLX5E_TLS_SET_PROGRESS_PARAMS_WQEBBS))
 		return false;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index 3e54834747ce..8551ddd500b2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -136,16 +136,16 @@ static struct mlx5_wqe_ctrl_seg *
 post_static_params(struct mlx5e_icosq *sq,
 		   struct mlx5e_ktls_offload_context_rx *priv_rx)
 {
-	struct mlx5e_set_tls_static_params_wqe *wqe;
+	struct mlx5e_set_transport_static_params_wqe *wqe;
 	struct mlx5e_icosq_wqe_info wi;
 	u16 pi, num_wqebbs;
 
-	num_wqebbs = MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS;
+	num_wqebbs = MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS;
 	if (unlikely(!mlx5e_icosq_can_post_wqe(sq, num_wqebbs)))
 		return ERR_PTR(-ENOSPC);
 
 	pi = mlx5e_icosq_get_next_pi(sq, num_wqebbs);
-	wqe = MLX5E_TLS_FETCH_SET_STATIC_PARAMS_WQE(sq, pi);
+	wqe = MLX5E_TRANSPORT_FETCH_SET_STATIC_PARAMS_WQE(sq, pi);
 	mlx5e_ktls_build_static_params(wqe, sq->pc, sq->sqn, &priv_rx->crypto_info,
 				       mlx5e_tir_get_tirn(&priv_rx->tir),
 				       priv_rx->key_id, priv_rx->resync.seq, false,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 78072bf93f3f..ac82f32d0a7a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -32,7 +32,7 @@ u16 mlx5e_ktls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 
 	num_dumps = mlx5e_ktls_dumps_num_wqes(params, MAX_SKB_FRAGS, TLS_MAX_PAYLOAD_SIZE);
 
-	stop_room += mlx5e_stop_room_for_wqe(mdev, MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS);
+	stop_room += mlx5e_stop_room_for_wqe(mdev, MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS);
 	stop_room += mlx5e_stop_room_for_wqe(mdev, MLX5E_TLS_SET_PROGRESS_PARAMS_WQEBBS);
 	stop_room += num_dumps * mlx5e_stop_room_for_wqe(mdev, MLX5E_KTLS_DUMP_WQEBBS);
 	stop_room += 1; /* fence nop */
@@ -543,12 +543,12 @@ post_static_params(struct mlx5e_txqsq *sq,
 		   struct mlx5e_ktls_offload_context_tx *priv_tx,
 		   bool fence)
 {
-	struct mlx5e_set_tls_static_params_wqe *wqe;
+	struct mlx5e_set_transport_static_params_wqe *wqe;
 	u16 pi, num_wqebbs;
 
-	num_wqebbs = MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS;
+	num_wqebbs = MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS;
 	pi = mlx5e_txqsq_get_next_pi(sq, num_wqebbs);
-	wqe = MLX5E_TLS_FETCH_SET_STATIC_PARAMS_WQE(sq, pi);
+	wqe = MLX5E_TRANSPORT_FETCH_SET_STATIC_PARAMS_WQE(sq, pi);
 	mlx5e_ktls_build_static_params(wqe, sq->pc, sq->sqn, &priv_tx->crypto_info,
 				       priv_tx->tisn, priv_tx->key_id, 0, fence,
 				       TLS_OFFLOAD_CTX_DIR_TX);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.c
index 570a912dd6fa..8abea6fe6cd9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.c
@@ -8,10 +8,6 @@ enum {
 	MLX5E_STATIC_PARAMS_CONTEXT_TLS_1_2 = 0x2,
 };
 
-enum {
-	MLX5E_ENCRYPTION_STANDARD_TLS = 0x1,
-};
-
 #define EXTRACT_INFO_FIELDS do { \
 	salt    = info->salt;    \
 	rec_seq = info->rec_seq; \
@@ -20,7 +16,7 @@ enum {
 } while (0)
 
 static void
-fill_static_params(struct mlx5_wqe_tls_static_params_seg *params,
+fill_static_params(struct mlx5_wqe_transport_static_params_seg *params,
 		   union mlx5e_crypto_info *crypto_info,
 		   u32 key_id, u32 resync_tcp_sn)
 {
@@ -53,25 +49,25 @@ fill_static_params(struct mlx5_wqe_tls_static_params_seg *params,
 		return;
 	}
 
-	gcm_iv      = MLX5_ADDR_OF(tls_static_params, ctx, gcm_iv);
-	initial_rn  = MLX5_ADDR_OF(tls_static_params, ctx, initial_record_number);
+	gcm_iv      = MLX5_ADDR_OF(transport_static_params, ctx, gcm_iv);
+	initial_rn  = MLX5_ADDR_OF(transport_static_params, ctx, initial_record_number);
 
 	memcpy(gcm_iv,      salt,    salt_sz);
 	memcpy(initial_rn,  rec_seq, rec_seq_sz);
 
 	tls_version = MLX5E_STATIC_PARAMS_CONTEXT_TLS_1_2;
 
-	MLX5_SET(tls_static_params, ctx, tls_version, tls_version);
-	MLX5_SET(tls_static_params, ctx, const_1, 1);
-	MLX5_SET(tls_static_params, ctx, const_2, 2);
-	MLX5_SET(tls_static_params, ctx, encryption_standard,
-		 MLX5E_ENCRYPTION_STANDARD_TLS);
-	MLX5_SET(tls_static_params, ctx, resync_tcp_sn, resync_tcp_sn);
-	MLX5_SET(tls_static_params, ctx, dek_index, key_id);
+	MLX5_SET(transport_static_params, ctx, tls_version, tls_version);
+	MLX5_SET(transport_static_params, ctx, const_1, 1);
+	MLX5_SET(transport_static_params, ctx, const_2, 2);
+	MLX5_SET(transport_static_params, ctx, acc_type,
+		 MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_TLS);
+	MLX5_SET(transport_static_params, ctx, resync_tcp_sn, resync_tcp_sn);
+	MLX5_SET(transport_static_params, ctx, dek_index, key_id);
 }
 
 void
-mlx5e_ktls_build_static_params(struct mlx5e_set_tls_static_params_wqe *wqe,
+mlx5e_ktls_build_static_params(struct mlx5e_set_transport_static_params_wqe *wqe,
 			       u16 pc, u32 sqn,
 			       union mlx5e_crypto_info *crypto_info,
 			       u32 tis_tir_num, u32 key_id, u32 resync_tcp_sn,
@@ -80,19 +76,17 @@ mlx5e_ktls_build_static_params(struct mlx5e_set_tls_static_params_wqe *wqe,
 	struct mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl;
 	struct mlx5_wqe_ctrl_seg     *cseg  = &wqe->ctrl;
 	u8 opmod = direction == TLS_OFFLOAD_CTX_DIR_TX ?
-		MLX5_OPC_MOD_TLS_TIS_STATIC_PARAMS :
-		MLX5_OPC_MOD_TLS_TIR_STATIC_PARAMS;
-
-#define STATIC_PARAMS_DS_CNT DIV_ROUND_UP(sizeof(*wqe), MLX5_SEND_WQE_DS)
+		MLX5_OPC_MOD_TRANSPORT_TIS_STATIC_PARAMS :
+		MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS;
 
 	cseg->opmod_idx_opcode = cpu_to_be32((pc << 8) | MLX5_OPCODE_UMR | (opmod << 24));
 	cseg->qpn_ds           = cpu_to_be32((sqn << MLX5_WQE_CTRL_QPN_SHIFT) |
-					     STATIC_PARAMS_DS_CNT);
+					     MLX5E_TRANSPORT_STATIC_PARAMS_DS_CNT);
 	cseg->fm_ce_se         = fence ? MLX5_FENCE_MODE_INITIATOR_SMALL : 0;
 	cseg->tis_tir_num      = cpu_to_be32(tis_tir_num << 8);
 
 	ucseg->flags = MLX5_UMR_INLINE;
-	ucseg->bsf_octowords = cpu_to_be16(MLX5_ST_SZ_BYTES(tls_static_params) / 16);
+	ucseg->bsf_octowords = cpu_to_be16(MLX5E_TRANSPORT_STATIC_PARAMS_OCTWORD_SIZE);
 
 	fill_static_params(&wqe->params, crypto_info, key_id, resync_tcp_sn);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_utils.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_utils.h
index 3d79cd379890..5e2d186778aa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_utils.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_utils.h
@@ -6,6 +6,7 @@
 
 #include <net/tls.h>
 #include "en.h"
+#include "en_accel/common_utils.h"
 
 enum {
 	MLX5E_TLS_PROGRESS_PARAMS_AUTH_STATE_NO_OFFLOAD = 0,
@@ -33,13 +34,6 @@ union mlx5e_crypto_info {
 	struct tls12_crypto_info_aes_gcm_256 crypto_info_256;
 };
 
-struct mlx5e_set_tls_static_params_wqe {
-	struct mlx5_wqe_ctrl_seg ctrl;
-	struct mlx5_wqe_umr_ctrl_seg uctrl;
-	struct mlx5_mkey_seg mkc;
-	struct mlx5_wqe_tls_static_params_seg params;
-};
-
 struct mlx5e_set_tls_progress_params_wqe {
 	struct mlx5_wqe_ctrl_seg ctrl;
 	struct mlx5_wqe_tls_progress_params_seg params;
@@ -50,19 +44,12 @@ struct mlx5e_get_tls_progress_params_wqe {
 	struct mlx5_seg_get_psv  psv;
 };
 
-#define MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS \
-	(DIV_ROUND_UP(sizeof(struct mlx5e_set_tls_static_params_wqe), MLX5_SEND_WQE_BB))
-
 #define MLX5E_TLS_SET_PROGRESS_PARAMS_WQEBBS \
 	(DIV_ROUND_UP(sizeof(struct mlx5e_set_tls_progress_params_wqe), MLX5_SEND_WQE_BB))
 
 #define MLX5E_KTLS_GET_PROGRESS_WQEBBS \
 	(DIV_ROUND_UP(sizeof(struct mlx5e_get_tls_progress_params_wqe), MLX5_SEND_WQE_BB))
 
-#define MLX5E_TLS_FETCH_SET_STATIC_PARAMS_WQE(sq, pi) \
-	((struct mlx5e_set_tls_static_params_wqe *)\
-	 mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_set_tls_static_params_wqe)))
-
 #define MLX5E_TLS_FETCH_SET_PROGRESS_PARAMS_WQE(sq, pi) \
 	((struct mlx5e_set_tls_progress_params_wqe *)\
 	 mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_set_tls_progress_params_wqe)))
@@ -76,7 +63,7 @@ struct mlx5e_get_tls_progress_params_wqe {
 	 mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_dump_wqe)))
 
 void
-mlx5e_ktls_build_static_params(struct mlx5e_set_tls_static_params_wqe *wqe,
+mlx5e_ktls_build_static_params(struct mlx5e_set_transport_static_params_wqe *wqe,
 			       u16 pc, u32 sqn,
 			       union mlx5e_crypto_info *crypto_info,
 			       u32 tis_tir_num, u32 key_id, u32 resync_tcp_sn,
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 29d4b201c7b2..b50b15dbf3c1 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -444,8 +444,8 @@ enum {
 };
 
 enum {
-	MLX5_OPC_MOD_TLS_TIS_STATIC_PARAMS = 0x1,
-	MLX5_OPC_MOD_TLS_TIR_STATIC_PARAMS = 0x2,
+	MLX5_OPC_MOD_TRANSPORT_TIS_STATIC_PARAMS = 0x1,
+	MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS = 0x2,
 };
 
 enum {
@@ -453,8 +453,8 @@ enum {
 	MLX5_OPC_MOD_TLS_TIR_PROGRESS_PARAMS = 0x2,
 };
 
-struct mlx5_wqe_tls_static_params_seg {
-	u8     ctx[MLX5_ST_SZ_BYTES(tls_static_params)];
+struct mlx5_wqe_transport_static_params_seg {
+	u8     ctx[MLX5_ST_SZ_BYTES(transport_static_params)];
 };
 
 struct mlx5_wqe_tls_progress_params_seg {
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index a9ee7bc59c90..bbe5b0f233c4 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -11938,12 +11938,16 @@ enum {
 	MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_TYPE_MACSEC = 0x4,
 };
 
-struct mlx5_ifc_tls_static_params_bits {
+enum {
+	MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_TLS = 0x1,
+};
+
+struct mlx5_ifc_transport_static_params_bits {
 	u8         const_2[0x2];
 	u8         tls_version[0x4];
 	u8         const_1[0x2];
 	u8         reserved_at_8[0x14];
-	u8         encryption_standard[0x4];
+	u8         acc_type[0x4];
 
 	u8         reserved_at_20[0x20];

From patchwork Mon Jan 9 13:31:05 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093578
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
    hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
    davem@davemloft.net, kuba@kernel.org
Cc: Or Gerlitz, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
    malin1024@gmail.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 14/25] net/mlx5e: Refactor ico sq polling to get budget
Date: Mon, 9 Jan 2023 15:31:05 +0200
Message-Id: <20230109133116.20801-15-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Or Gerlitz

The mlx5e driver uses ICO SQs for internal control operations which are
not visible to the network stack, such as UMR mapping for striding RQ
(MPWQ), among other cases.

The upcoming nvmeotcp offload uses an ICO SQ for UMR mapping as part of
the offload. As a pre-step for nvmeotcp ICO SQs, which have their own
NAPI and need to comply with a budget, add the budget as a parameter to
the polling of CQs related to ICO SQs.

The polling already stops after a limit is reached, so just have the
caller provide this limit as the budget.

No functional change here.

Signed-off-by: Or Gerlitz
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index 853f312cd757..a7799cee9918 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -62,7 +62,7 @@ void mlx5e_trigger_irq(struct mlx5e_icosq *sq);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq, struct mlx5_eqe *eqe);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
-int mlx5e_poll_ico_cq(struct mlx5e_cq *cq);
+int mlx5e_poll_ico_cq(struct mlx5e_cq *cq, int budget);
 
 /* RX */
 void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct page *page);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index c8820ab22169..7bf69e35af18 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -916,7 +916,7 @@ static void mlx5e_handle_shampo_hd_umr(struct mlx5e_shampo_umr umr,
 	shampo->ci = (shampo->ci + umr.len) & (shampo->hd_per_wq - 1);
 }
 
-int mlx5e_poll_ico_cq(struct mlx5e_cq *cq)
+int mlx5e_poll_ico_cq(struct mlx5e_cq *cq, int budget)
 {
 	struct mlx5e_icosq *sq = container_of(cq, struct mlx5e_icosq, cq);
 	struct mlx5_cqe64 *cqe;
@@ -991,7 +991,7 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq)
 						 wi->wqe_type);
 			}
 		} while (!last_wqe);
-	} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
+	} while ((++i < budget) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
 
 	sq->cc = sqcc;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 9a458a5d9853..9ddacb5e1bf4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -176,8 +176,8 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 		busy |= work_done == budget;
 	}
 
-	mlx5e_poll_ico_cq(&c->icosq.cq);
-	if (mlx5e_poll_ico_cq(&c->async_icosq.cq))
+	mlx5e_poll_ico_cq(&c->icosq.cq, MLX5E_TX_CQ_POLL_BUDGET);
+	if (mlx5e_poll_ico_cq(&c->async_icosq.cq, MLX5E_TX_CQ_POLL_BUDGET))
 		/* Don't clear the flag if nothing was polled to prevent
 		 * queueing more WQEs and overflowing the async ICOSQ.
		 */

From patchwork Mon Jan 9 13:31:06 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093579
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
    hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
    davem@davemloft.net, kuba@kernel.org
Cc: Or Gerlitz, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
malin1024@gmail.com, yorayz@nvidia.com, borisp@nvidia.com Subject: [PATCH v8 15/25] net/mlx5e: Have mdev pointer directly on the icosq structure Date: Mon, 9 Jan 2023 15:31:06 +0200 Message-Id: <20230109133116.20801-16-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com> X-ClientProxiedBy: FR2P281CA0083.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9b::7) To SJ1PR12MB6075.namprd12.prod.outlook.com (2603:10b6:a03:45e::8) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PR12MB6075:EE_|DM4PR12MB5056:EE_ X-MS-Office365-Filtering-Correlation-Id: 4fc9fc20-9542-40ec-6aa6-08daf2460734 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: prOALxfED+cQDPr3bEM2jVpRtXMe4+t8g1CD3ipAPC44JpuXEyufFgjxz9DArA46aO33SMyZfFwi5gtmNKf371ko0tU3OP1IthZskOhCZ+EjMwPTUQA8VDy/Hz8uoMJUL/RjaaMkcTkHoG0lsfNXsQr8DpMJfn2NWzPVGfOu4bDQ/tVbLrT8yKSYlb/EOxaH079f/3O2mJUwhc3ktUBUDKZg2F8D9ySfd+p84DArPVSKOassJxrDrxdf4zGoTB8CUY37XcHa3YWcQip798Gd0WaYePgE2JVPkl5Zdwnys6JSyWDwz1wCleA81AYIlPurqxTyz4vTNkaeOC1KWk7PQ5zNg2ksZGNCaeSuvv7NAvX+06TAZyENPMuFAzr3YW0c0r5VEqWtgp9mfOqGrMIlet/N14xswwVWV3LsOKX2q/+lWISnC3u/40tn/zPU9Z6Y8x8+hzKU0gBFkGJRcP1A64xCvBzWRct/lDhyQQ/zjcotWK7qScSax4WgN/FwV+67KwFdqkZWSIbQvYYfH3vWd2vWF9yjCPKWq+p0uLWMgF+sWllf91cU+8voazP4i+Nc3Cok/dlresWo/czFCAuD5mZTJF27h4EjmrCbSGw+zgic4PA6YA7I09aThhdtYJM+fVEYn0mPzYGUkG+DO0i6wg== X-Forefront-Antispam-Report: 
CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ1PR12MB6075.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(4636009)(376002)(396003)(39860400002)(136003)(346002)(366004)(451199015)(2906002)(83380400001)(54906003)(1076003)(2616005)(66476007)(7416002)(66556008)(5660300002)(66946007)(107886003)(26005)(186003)(36756003)(8936002)(6506007)(6512007)(478600001)(38100700002)(8676002)(41300700001)(6486002)(86362001)(4326008)(316002);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: NM2Ghm/ZCL39YrBdnGbY+ZSuOwXEXqew8dbgj0Dn71eJXoeRmVhUj8jmNsftbg8SH0WaWIvSkJW4dvaPOjTS8SvHbcy0Z/xzgiSXxxPdLdIMju09axKf7q26qF0JOSzYyofJ2QQISNEyfCh4v6BYFkZXnEOASE+tRrqd+CRXRMk5hewAECLTrQ08pip2zD+8Cnp9K3z4DeMa4cAcAZlcL45MRCyb5iVdAu8RN+rGQ5CVQOvPny6r9Zqws0jxlGN/3zxn/swxgkbsmVnzYniihHmLBLXBGInkRXW2YWrWKPBiH0E6FKmS3RhyaSF+nDlmvBrceJoLZofz8lJCkHS1Ya9Gsfnc4NihCZb17mYZnpk/jxbCce1LjpXRFF3I0dLNz9BzrsRmH9T0zGeuvN4Us0tKTfs7XRJWvguN10Sltz+bS31y3LYcYGogRlx3APqCUXaOgF11u0Jim/dZOOwnQArJ5agRvE2T81RNxousKODxt+Z4F47nqz8XloKGTEwrz4D7vOG4440t7rUqlIteJ10ino3Swe9nNL/Pa7h2OcmbaVUmQZ+Ivrgu5sUCzMc5/+oodpGE1gLVDCouB7G8v0tcVGVaoaejqW/oAiGjZ9L8rLiadF+lF/5U6oKXI3AwhW5mmGnAjjqnhBb2n3ZNaz6gtggZmzKxjXrvpKopXl/9Hb4VCpSkjP39ebJX+4VyoICTBpmxia2FhS8qHWAQB0g1LO4ta60j9JX4ikK7ELQLSVA52YYLUausDmz3BgCsv5s2kLXKQhV41Ab8fdOsHyvu4QdP3B1qRD5kz4/zjiZIgP+Oe0yCCj1oU9EmQJ3ujMFpkqOF4V+w9YcYUCLfy/+WEckrUtE2euw0CPAKRQPPzmYxds2mR4JXj21vpQERXKtQXF4BUdwzeGzcx5HlERSWCg8xNK/6PmYr0l89XXGQSKRHr+6PUK9CW5IAajp/CDo/kUkiB0iKeiGJGn6mH8dPPQ6eGQWCLhM0cDFqK7lrN36d7udmME8m6G5anF/MTCr0SgEUKeQw+0d1mz57f58hXPQU6eQ2f/tjqwko2nI2+Cn3Z2kYx6BAuRA4U19AqTw7fIxHcdBYSMKOmdzbCRQzO+VZJnU2DsOfQImGIcblBtBMjniMrVjLZvplO269mkSXBvjPSRy3jcpSNxiiw7PbVvAqe85tdYqzwjG7YCfdCV9GkPBn8k2B37emFWXqeglDhCZ3ADdX9KCr5qeFQfXgXykJVfNzTGJ3u76GIdATFL/zNZkoBra/ac8xgj+ixbNmQdZ0fZjqwMP9OVyED0RFFm8NuNg9GDk8KqCPTBH1cu3idlnEC35PR/rcGSBtPDmhU8NOCt01waA+VsR95BjyU6sbhQF1bg0bUf0l94AA4gM7KFieDKwaO1kRGaH1Vdv3vgt2MLrVpBo2jicDVzwoG5dYLm5sr14
6DRJOpBcQblxywge0em66rqINRSywjA4YMVpUflY6J8h1thnvmUAITgMfBMy4fTUXfWR2uztiEOew00ayRgjwP8aiffg4lQRULV5BXQ+d9RJpi2mB/5GCdkiN/zpH58R5tbfsVOmlwDKuXwenkIgiRnewRW1w X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4fc9fc20-9542-40ec-6aa6-08daf2460734 X-MS-Exchange-CrossTenant-AuthSource: SJ1PR12MB6075.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Jan 2023 13:33:00.9635 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 12OGbxrtr6kKdqZMrSrW9TYn7Ho1xSrmuioOPa/7tFnB189Uwws2PCAFvU1mK5sjNxyq16zOXZCIwOVILbirmA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM4PR12MB5056 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Or Gerlitz This provides better separation between channels to ICO SQs for use-cases where they are not tightly coupled (such as the upcoming nvmeotcp code). No functional change here. 
Signed-off-by: Or Gerlitz
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h               | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c   | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c          | 5 ++---
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 2d77fb8a8a01..f0ceb182ac43 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -592,6 +592,7 @@ struct mlx5e_icosq {
 	/* control path */
 	struct mlx5_wq_ctrl        wq_ctrl;
 	struct mlx5e_channel      *channel;
+	struct mlx5_core_dev      *mdev;

 	struct work_struct         recover_work;
 } ____cacheline_aligned_in_smp;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
index 1ae15b8536a8..12bdc4c04e70 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
@@ -33,7 +33,7 @@ static int mlx5e_query_rq_state(struct mlx5_core_dev *dev, u32 rqn, u8 *state)
 static int mlx5e_wait_for_icosq_flush(struct mlx5e_icosq *icosq)
 {
-	struct mlx5_core_dev *dev = icosq->channel->mdev;
+	struct mlx5_core_dev *dev = icosq->mdev;
 	unsigned long exp_time;

 	exp_time = jiffies + msecs_to_jiffies(mlx5_tout_ms(dev, FLUSH_ON_ERROR));
@@ -78,7 +78,7 @@ static int mlx5e_rx_reporter_err_icosq_cqe_recover(void *ctx)
 	rq = &icosq->channel->rq;
 	if (test_bit(MLX5E_RQ_STATE_ENABLED, &icosq->channel->xskrq.state))
 		xskrq = &icosq->channel->xskrq;
-	mdev = icosq->channel->mdev;
+	mdev = icosq->mdev;
 	dev = icosq->channel->netdev;
 	err = mlx5_core_query_sq_state(mdev, icosq->sqn, &state);
 	if (err) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index 8551ddd500b2..fe9e04068b0f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -266,7 +266,7 @@ resync_post_get_progress_params(struct mlx5e_icosq *sq,
 		goto err_out;
 	}

-	pdev = mlx5_core_dma_dev(sq->channel->priv->mdev);
+	pdev = mlx5_core_dma_dev(sq->mdev);
 	buf->dma_addr = dma_map_single(pdev, &buf->progress,
 				       PROGRESS_PARAMS_PADDED_SIZE, DMA_FROM_DEVICE);
 	if (unlikely(dma_mapping_error(pdev, buf->dma_addr))) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index cff5f2e29e1e..01418af45dc8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1388,6 +1388,7 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
 	int err;

 	sq->channel   = c;
+	sq->mdev      = mdev;
 	sq->uar_map   = mdev->mlx5e_res.hw_objs.bfreg.map;
 	sq->reserved_room = param->stop_room;

@@ -1785,11 +1786,9 @@ void mlx5e_deactivate_icosq(struct mlx5e_icosq *icosq)

 static void mlx5e_close_icosq(struct mlx5e_icosq *sq)
 {
-	struct mlx5e_channel *c = sq->channel;
-
 	if (sq->ktls_resync)
 		mlx5e_ktls_rx_resync_destroy_resp_list(sq->ktls_resync);
-	mlx5e_destroy_sq(c->mdev, sq->sqn);
+	mlx5e_destroy_sq(sq->mdev, sq->sqn);
 	mlx5e_free_icosq_descs(sq);
 	mlx5e_free_icosq(sq);
 }

From patchwork Mon Jan 9 13:31:07 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093580
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
    hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
    davem@davemloft.net, kuba@kernel.org
Cc: Or Gerlitz, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
    malin1024@gmail.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 16/25] net/mlx5e: Refactor doorbell function to allow avoiding a completion
Date: Mon, 9 Jan 2023 15:31:07 +0200
Message-Id: <20230109133116.20801-17-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Or Gerlitz

Currently the doorbell function always asks for a completion to be
generated. Refactor things such that all existing call sites are
untouched and no branching is added. This is done using an inner
function which can be invoked directly in cases where a completion is
not desired (as done in a downstream patch).

No functional change here.

Signed-off-by: Or Gerlitz
Signed-off-by: Aurelien Aptel
---
 drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index a7799cee9918..a690a90a4c9c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -238,10 +238,10 @@ static inline u16 mlx5e_icosq_get_next_pi(struct mlx5e_icosq *sq, u16 size)
 }

 static inline void
-mlx5e_notify_hw(struct mlx5_wq_cyc *wq, u16 pc, void __iomem *uar_map,
-		struct mlx5_wqe_ctrl_seg *ctrl)
+__mlx5e_notify_hw(struct mlx5_wq_cyc *wq, u16 pc, void __iomem *uar_map,
+		  struct mlx5_wqe_ctrl_seg *ctrl, u8 cq_update)
 {
-	ctrl->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE;
+	ctrl->fm_ce_se |= cq_update;

 	/* ensure wqe is visible to device before updating doorbell record */
 	dma_wmb();
@@ -255,6 +255,13 @@ mlx5e_notify_hw(struct mlx5_wq_cyc *wq, u16 pc, void __iomem *uar_map,
 	mlx5_write64((__be32 *)ctrl, uar_map);
 }

+static inline void
+mlx5e_notify_hw(struct mlx5_wq_cyc *wq, u16 pc, void __iomem *uar_map,
+		struct mlx5_wqe_ctrl_seg *ctrl)
+{
+	__mlx5e_notify_hw(wq, pc, uar_map, ctrl, MLX5_WQE_CTRL_CQ_UPDATE);
+}
+
 static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
 {
 	struct mlx5_core_cq *mcq;

From patchwork Mon Jan 9 13:31:08 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093581
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
    hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
    davem@davemloft.net, kuba@kernel.org
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
    malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 17/25] net/mlx5: Add NVMEoTCP caps, HW bits, 128B CQE and enumerations
Date: Mon, 9 Jan 2023 15:31:08 +0200
Message-Id: <20230109133116.20801-18-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
From: Ben Ben-Ishay

Add the necessary infrastructure for NVMEoTCP offload:
- Create the mlx5_cqe128 structure for NVMEoTCP offload. The new
  structure consists of the regular mlx5_cqe64 plus NVMEoTCP data
  information for offloaded packets.
- Add an nvmetcp field to mlx5_cqe64; this field defines the type of
  data that the additional NVMEoTCP part represents.
- Add the nvmeotcp_zero_copy_en + nvmeotcp_crc_en bits to the TIR, to
  identify NVMEoTCP offload flows, and a tag_buffer_id that will be
  used by the connected nvmeotcp_queues.
- Add a new capability to HCA_CAP that represents the NVMEoTCP
  offload ability.
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Or Gerlitz
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 drivers/net/ethernet/mellanox/mlx5/core/fw.c |  6 ++
 include/linux/mlx5/device.h                  | 51 +++++++++++++-
 include/linux/mlx5/mlx5_ifc.h                | 74 ++++++++++++++++++--
 include/linux/mlx5/qp.h                      |  1 +
 4 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index f34e758a2f1f..bfe540a4d588 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -286,6 +286,12 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
 			return err;
 	}

+	if (MLX5_CAP_GEN(dev, nvmeotcp)) {
+		err = mlx5_core_get_caps(dev, MLX5_CAP_DEV_NVMEOTCP);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index b50b15dbf3c1..8b13b0326fc1 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -263,6 +263,7 @@ enum {
 enum {
 	MLX5_MKEY_MASK_LEN		= 1ull << 0,
 	MLX5_MKEY_MASK_PAGE_SIZE	= 1ull << 1,
+	MLX5_MKEY_MASK_XLT_OCT_SIZE	= 1ull << 2,
 	MLX5_MKEY_MASK_START_ADDR	= 1ull << 6,
 	MLX5_MKEY_MASK_PD		= 1ull << 7,
 	MLX5_MKEY_MASK_EN_RINVAL	= 1ull << 8,
@@ -787,7 +788,11 @@ struct mlx5_err_cqe {

 struct mlx5_cqe64 {
 	u8		tls_outer_l3_tunneled;
-	u8		rsvd0;
+	u8		rsvd16bit:4;
+	u8		nvmeotcp_zc:1;
+	u8		nvmeotcp_ddgst:1;
+	u8		nvmeotcp_resync:1;
+	u8		rsvd23bit:1;
 	__be16		wqe_id;
 	union {
 		struct {
@@ -836,6 +841,19 @@ struct mlx5_cqe64 {
 	u8		op_own;
 };

+struct mlx5e_cqe128 {
+	__be16		cclen;
+	__be16		hlen;
+	union {
+		__be32	resync_tcp_sn;
+		__be32	ccoff;
+	};
+	__be16		ccid;
+	__be16		rsvd8;
+	u8		rsvd12[52];
+	struct mlx5_cqe64 cqe64;
+};
+
 struct mlx5_mini_cqe8 {
 	union {
 		__be32 rx_hash_result;
@@ -871,6 +889,28 @@ enum {

 #define MLX5_MINI_CQE_ARRAY_SIZE 8

+static inline bool cqe_is_nvmeotcp_resync(struct mlx5_cqe64 *cqe)
+{
+	return cqe->nvmeotcp_resync;
+}
+
+static inline bool cqe_is_nvmeotcp_crcvalid(struct mlx5_cqe64 *cqe)
+{
+	return cqe->nvmeotcp_ddgst;
+}
+
+static inline bool cqe_is_nvmeotcp_zc(struct mlx5_cqe64 *cqe)
+{
+	return cqe->nvmeotcp_zc;
+}
+
+/* check if cqe is zc or crc or resync */
+static inline bool cqe_is_nvmeotcp(struct mlx5_cqe64 *cqe)
+{
+	return cqe_is_nvmeotcp_zc(cqe) || cqe_is_nvmeotcp_crcvalid(cqe) ||
+	       cqe_is_nvmeotcp_resync(cqe);
+}
+
 static inline u8 mlx5_get_cqe_format(struct mlx5_cqe64 *cqe)
 {
 	return (cqe->op_own >> 2) & 0x3;
@@ -1204,6 +1244,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_VDPA_EMULATION = 0x13,
 	MLX5_CAP_DEV_EVENT = 0x14,
 	MLX5_CAP_IPSEC,
+	MLX5_CAP_DEV_NVMEOTCP = 0x19,
 	MLX5_CAP_DEV_SHAMPO = 0x1d,
 	MLX5_CAP_MACSEC = 0x1f,
 	MLX5_CAP_GENERAL_2 = 0x20,
@@ -1466,6 +1507,14 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_MACSEC(mdev, cap)\
 	MLX5_GET(macsec_cap, (mdev)->caps.hca[MLX5_CAP_MACSEC]->cur, cap)

+#define MLX5_CAP_DEV_NVMEOTCP(mdev, cap)\
+	MLX5_GET(nvmeotcp_cap, \
+		 (mdev)->caps.hca[MLX5_CAP_DEV_NVMEOTCP]->cur, cap)
+
+#define MLX5_CAP64_DEV_NVMEOTCP(mdev, cap)\
+	MLX5_GET64(nvmeotcp_cap, \
+		   (mdev)->caps.hca[MLX5_CAP_DEV_NVMEOTCP]->cur, cap)
+
 enum {
 	MLX5_CMD_STAT_OK			= 0x0,
 	MLX5_CMD_STAT_INT_ERR			= 0x1,
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index bbe5b0f233c4..69d35c591c55 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1449,7 +1449,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         event_cap[0x1];
 	u8         reserved_at_91[0x2];
 	u8         isolate_vl_tc_new[0x1];
-	u8         reserved_at_94[0x4];
+	u8         reserved_at_94[0x2];
+	u8         nvmeotcp[0x1];
+	u8         reserved_at_97[0x1];
 	u8         prio_tag_required[0x1];
 	u8         reserved_at_99[0x2];
 	u8         log_max_qp[0x5];
@@ -3347,7 +3349,20 @@ struct mlx5_ifc_shampo_cap_bits {
 	u8    reserved_at_20[0x3];
 	u8    shampo_max_log_headers_entry_size[0x5];
 	u8    reserved_at_28[0x18];
+	u8    reserved_at_40[0x7c0];
+};
+
+struct mlx5_ifc_nvmeotcp_cap_bits {
+	u8    zerocopy[0x1];
+	u8    crc_rx[0x1];
+	u8    crc_tx[0x1];
+	u8    reserved_at_3[0x15];
+	u8    version[0x8];
+
+	u8    reserved_at_20[0x13];
+	u8    log_max_nvmeotcp_tag_buffer_table[0x5];
+	u8    reserved_at_38[0x3];
+	u8    log_max_nvmeotcp_tag_buffer_size[0x5];

 	u8    reserved_at_40[0x7c0];
 };
@@ -3371,6 +3386,7 @@ union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_virtio_emulation_cap_bits virtio_emulation_cap;
 	struct mlx5_ifc_shampo_cap_bits shampo_cap;
 	struct mlx5_ifc_macsec_cap_bits macsec_cap;
+	struct mlx5_ifc_nvmeotcp_cap_bits nvmeotcp_cap;
 	u8         reserved_at_0[0x8000];
 };
@@ -3617,7 +3633,9 @@ struct mlx5_ifc_tirc_bits {
 	u8         disp_type[0x4];
 	u8         tls_en[0x1];
-	u8         reserved_at_25[0x1b];
+	u8         nvmeotcp_zero_copy_en[0x1];
+	u8         nvmeotcp_crc_en[0x1];
+	u8         reserved_at_27[0x19];

 	u8         reserved_at_40[0x40];
@@ -3648,7 +3666,8 @@ struct mlx5_ifc_tirc_bits {

 	struct mlx5_ifc_rx_hash_field_select_bits rx_hash_field_selector_inner;

-	u8         reserved_at_2c0[0x4c0];
+	u8         nvmeotcp_tag_buffer_table_id[0x20];
+	u8         reserved_at_2e0[0x4a0];
 };

 enum {
@@ -11630,6 +11649,7 @@ enum {
 	MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = BIT_ULL(0xc),
 	MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_IPSEC = BIT_ULL(0x13),
 	MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = BIT_ULL(0x20),
+	MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE = BIT_ULL(0x21),
 	MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_FLOW_METER_ASO = BIT_ULL(0x24),
 };
@@ -11637,6 +11657,7 @@ enum {
 	MLX5_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = 0xc,
 	MLX5_GENERAL_OBJECT_TYPES_IPSEC = 0x13,
 	MLX5_GENERAL_OBJECT_TYPES_SAMPLER = 0x20,
+	MLX5_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE = 0x21,
 	MLX5_GENERAL_OBJECT_TYPES_FLOW_METER_ASO = 0x24,
 	MLX5_GENERAL_OBJECT_TYPES_MACSEC = 0x27,
 };
@@ -11927,6 +11948,20 @@ struct mlx5_ifc_query_sampler_obj_out_bits {
 	struct mlx5_ifc_sampler_obj_bits sampler_object;
 };

+struct mlx5_ifc_nvmeotcp_tag_buf_table_obj_bits {
+	u8    modify_field_select[0x40];
+
+	u8    reserved_at_40[0x20];
+
+	u8    reserved_at_60[0x1b];
+	u8    log_tag_buffer_table_size[0x5];
+};
+
+struct mlx5_ifc_create_nvmeotcp_tag_buf_table_in_bits {
+	struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+	struct mlx5_ifc_nvmeotcp_tag_buf_table_obj_bits nvmeotcp_tag_buf_table_obj;
+};
+
 enum {
 	MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_128 = 0x0,
 	MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_256 = 0x1,
@@ -11940,6 +11975,13 @@ enum {

 enum {
 	MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_TLS               = 0x1,
+	MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP           = 0x2,
+	MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP_WITH_TLS  = 0x3,
+};
+
+enum {
+	MLX5_TRANSPORT_STATIC_PARAMS_TI_INITIATOR = 0x0,
+	MLX5_TRANSPORT_STATIC_PARAMS_TI_TARGET    = 0x1,
 };

 struct mlx5_ifc_transport_static_params_bits {
@@ -11962,7 +12004,20 @@ struct mlx5_ifc_transport_static_params_bits {
 	u8         reserved_at_100[0x8];
 	u8         dek_index[0x18];

-	u8         reserved_at_120[0xe0];
+	u8         reserved_at_120[0x14];
+
+	u8         cccid_ttag[0x1];
+	u8         ti[0x1];
+	u8         zero_copy_en[0x1];
+	u8         ddgst_offload_en[0x1];
+	u8         hdgst_offload_en[0x1];
+	u8         ddgst_en[0x1];
+	u8         hddgst_en[0x1];
+	u8         pda[0x5];
+
+	u8         nvme_resync_tcp_sn[0x20];
+
+	u8         reserved_at_160[0xa0];
 };

 struct mlx5_ifc_tls_progress_params_bits {
@@ -12201,4 +12256,15 @@ struct mlx5_ifc_modify_page_track_obj_in_bits {
 	struct mlx5_ifc_page_track_bits obj_context;
 };

+struct mlx5_ifc_nvmeotcp_progress_params_bits {
+	u8         next_pdu_tcp_sn[0x20];
+
+	u8         hw_resync_tcp_sn[0x20];
+
+	u8         pdu_tracker_state[0x2];
+	u8         offloading_state[0x2];
+	u8         reserved_at_44[0xc];
+	u8         cccid_ttag[0x10];
+};
+
 #endif /* MLX5_IFC_H */
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index 4657d5c54abe..bda53b241d71 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -227,6 +227,7 @@ struct mlx5_wqe_ctrl_seg {
 #define MLX5_WQE_CTRL_OPCODE_MASK 0xff
 #define MLX5_WQE_CTRL_WQE_INDEX_MASK 0x00ffff00
 #define MLX5_WQE_CTRL_WQE_INDEX_SHIFT 8
+#define MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT 8

 enum {
 	MLX5_ETH_WQE_L3_INNER_CSUM = 1 << 4,

From patchwork Mon Jan 9 13:31:09 2023
X-Patchwork-Submitter: Aurelien Aptel X-Patchwork-Id: 13093582 X-Patchwork-Delegate: kuba@kernel.org From: Aurelien Aptel To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org Cc: Ben Ben-Ishay , Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com,
malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com Subject: [PATCH v8 18/25] net/mlx5e: NVMEoTCP, offload initialization Date: Mon, 9 Jan 2023 15:31:09 +0200 Message-Id: <20230109133116.20801-19-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com> MIME-Version: 1.0
Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org

From: Ben Ben-Ishay

This commit introduces the driver structures and initialization blocks
for NVMEoTCP offload. The mlx5 nvmeotcp structures are:

- queue (mlx5e_nvmeotcp_queue) - pairs 1:1 with the nvmeotcp driver
  queues and handles the offloading parts. The mlx5e queue is accessed
  in the ddp ops: initialized on sk_add, used in ddp setup, teardown,
  resync and in the fast path when dealing with packets, and destroyed
  in the sk_del op.
- queue entry (nvmeotcp_queue_entry) - pairs 1:1 with each offloaded IO
  from that queue. Keeps pointers to the SG elements describing the
  buffers used for the IO and to its ddp context.
- queue handler (mlx5e_nvmeotcp_queue_handler) - we use an icosq per
  NVMe-TCP queue for UMR mapping as part of the ddp offload. These
  dedicated SQs are unique in the sense that they are driven directly
  by the NVMe-TCP layer to submit and invalidate ddp requests.
Since the life-cycle of these icosqs is not tied to the channels, we
create dedicated napi contexts for polling them, so that channels can
be re-created during offloading. The queue handler has a pointer to
the CQ associated with the queue's SQ and to the napi context.

- main offload context (mlx5e_nvmeotcp) - holds the ida and hash table
  instances. Each offloaded queue gets an ID from the ida instance, and
  the (ID, queue) pairs are kept in the hash table. The ID is programmed
  as the flow tag that HW sets on the completion (cqe) of all packets
  related to this queue (via 5-tuple steering). The fast path uses the
  flow tag to access the hash table and retrieve the queue for
  processing.

We query the nvmeotcp capabilities to see whether the offload can be
supported, and use 128B CQEs when it is. By default the offload is off,
but it can be enabled with `ethtool --ulp-ddp nvme-tcp-ddp on`.

Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Shai Malin
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 .../net/ethernet/mellanox/mlx5/core/Kconfig   |  11 ++
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   4 +
 .../net/ethernet/mellanox/mlx5/core/en/fs.h   |   4 +-
 .../ethernet/mellanox/mlx5/core/en/params.c   |  12 +-
 .../ethernet/mellanox/mlx5/core/en/params.h   |   3 +
 .../mellanox/mlx5/core/en_accel/en_accel.h    |   3 +
 .../mellanox/mlx5/core/en_accel/fs_tcp.h      |   2 +-
 .../mellanox/mlx5/core/en_accel/nvmeotcp.c    | 168 ++++++++++++++++++
 .../mellanox/mlx5/core/en_accel/nvmeotcp.h    | 121 +++++++++++++
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  59 +++++-
 .../net/ethernet/mellanox/mlx5/core/en_fs.c   |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  18 ++
 .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
 14 files changed, 402 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
 create mode 100644
drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index 26685fd0fdaa..0c790952fdf7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -163,6 +163,17 @@ config MLX5_EN_TLS help Build support for TLS cryptography-offload acceleration in the NIC. +config MLX5_EN_NVMEOTCP + bool "NVMEoTCP acceleration" + depends on ULP_DDP + depends on MLX5_CORE_EN + default y + help + Build support for NVMEoTCP acceleration in the NIC. + This includes Direct Data Placement and CRC offload. + Note: Support for hardware with this capability needs to be selected + for this option to become available. + config MLX5_SW_STEERING bool "Mellanox Technologies software-managed steering" depends on MLX5_CORE_EN && MLX5_ESWITCH diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index cd4a1ab0ea78..9df9999047d1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -103,6 +103,8 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \ en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \ en_accel/ktls_tx.o en_accel/ktls_rx.o +mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o + mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \ steering/dr_matcher.o steering/dr_rule.o \ steering/dr_icm_pool.o steering/dr_buddy.o \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index f0ceb182ac43..a6da839a8f82 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -328,6 +328,7 @@ struct mlx5e_params { unsigned int sw_mtu; int hard_mtu; bool ptp_rx; + bool nvmeotcp; }; static inline u8 mlx5e_get_dcb_num_tc(struct mlx5e_params 
*params) @@ -959,6 +960,9 @@ struct mlx5e_priv { #endif #ifdef CONFIG_MLX5_EN_TLS struct mlx5e_tls *tls; +#endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + struct mlx5e_nvmeotcp *nvmeotcp; #endif struct devlink_health_reporter *tx_reporter; struct devlink_health_reporter *rx_reporter; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h index 379c6dc9a3be..3c81b7603131 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h @@ -77,7 +77,7 @@ enum { MLX5E_INNER_TTC_FT_LEVEL, MLX5E_FS_TT_UDP_FT_LEVEL = MLX5E_INNER_TTC_FT_LEVEL + 1, MLX5E_FS_TT_ANY_FT_LEVEL = MLX5E_INNER_TTC_FT_LEVEL + 1, -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) MLX5E_ACCEL_FS_TCP_FT_LEVEL = MLX5E_INNER_TTC_FT_LEVEL + 1, #endif #ifdef CONFIG_MLX5_EN_ARFS @@ -168,7 +168,7 @@ struct mlx5e_fs_any *mlx5e_fs_get_any(struct mlx5e_flow_steering *fs); void mlx5e_fs_set_any(struct mlx5e_flow_steering *fs, struct mlx5e_fs_any *any); struct mlx5e_fs_udp *mlx5e_fs_get_udp(struct mlx5e_flow_steering *fs); void mlx5e_fs_set_udp(struct mlx5e_flow_steering *fs, struct mlx5e_fs_udp *udp); -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) struct mlx5e_accel_fs_tcp *mlx5e_fs_get_accel_tcp(struct mlx5e_flow_steering *fs); void mlx5e_fs_set_accel_tcp(struct mlx5e_flow_steering *fs, struct mlx5e_accel_fs_tcp *accel_tcp); #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index 585bdc8383ee..36f53251a32b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -797,7 +797,8 @@ static void mlx5e_build_common_cq_param(struct mlx5_core_dev *mdev, void *cqc = param->cqc; MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index); - if (MLX5_CAP_GEN(mdev, cqe_128_always) && 
cache_line_size() >= 128) + if (MLX5_CAP_GEN(mdev, cqe_128_always) && + (cache_line_size() >= 128 || param->force_cqe128)) MLX5_SET(cqc, cqc, cqe_sz, CQE_STRIDE_128_PAD); } @@ -827,6 +828,9 @@ static void mlx5e_build_rx_cq_param(struct mlx5_core_dev *mdev, void *cqc = param->cqc; u8 log_cq_size; + /* nvme-tcp offload mandates 128 byte cqes */ + param->force_cqe128 |= IS_ENABLED(CONFIG_MLX5_EN_NVMEOTCP) && params->nvmeotcp; + switch (params->rq_wq_type) { case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ: hw_stridx = MLX5_CAP_GEN(mdev, mini_cqe_resp_stride_index); @@ -1166,9 +1170,9 @@ static u8 mlx5e_build_async_icosq_log_wq_sz(struct mlx5_core_dev *mdev) return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE; } -static void mlx5e_build_icosq_param(struct mlx5_core_dev *mdev, - u8 log_wq_size, - struct mlx5e_sq_param *param) +void mlx5e_build_icosq_param(struct mlx5_core_dev *mdev, + u8 log_wq_size, + struct mlx5e_sq_param *param) { void *sqc = param->sqc; void *wq = MLX5_ADDR_OF(sqc, sqc, wq); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h index c9be6eb88012..d5b3455c4875 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h @@ -17,6 +17,7 @@ struct mlx5e_cq_param { struct mlx5_wq_param wq; u16 eq_ix; u8 cq_period_mode; + bool force_cqe128; }; struct mlx5e_rq_param { @@ -146,6 +147,8 @@ void mlx5e_build_xdpsq_param(struct mlx5_core_dev *mdev, struct mlx5e_params *params, struct mlx5e_xsk_param *xsk, struct mlx5e_sq_param *param); +void mlx5e_build_icosq_param(struct mlx5_core_dev *mdev, + u8 log_wq_size, struct mlx5e_sq_param *param); int mlx5e_build_channel_param(struct mlx5_core_dev *mdev, struct mlx5e_params *params, u16 q_counter, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h index 07187028f0d3..e38656229399 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h @@ -40,6 +40,7 @@ #include "en_accel/ktls.h" #include "en_accel/ktls_txrx.h" #include +#include "en_accel/nvmeotcp.h" #include "en.h" #include "en/txrx.h" @@ -202,11 +203,13 @@ static inline void mlx5e_accel_tx_finish(struct mlx5e_txqsq *sq, static inline int mlx5e_accel_init_rx(struct mlx5e_priv *priv) { + mlx5e_nvmeotcp_init_rx(priv); return mlx5e_ktls_init_rx(priv); } static inline void mlx5e_accel_cleanup_rx(struct mlx5e_priv *priv) { + mlx5e_nvmeotcp_cleanup_rx(priv); mlx5e_ktls_cleanup_rx(priv); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h index a032bff482a6..d907e352ffae 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h @@ -6,7 +6,7 @@ #include "en/fs.h" -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) int mlx5e_accel_fs_tcp_create(struct mlx5e_flow_steering *fs); void mlx5e_accel_fs_tcp_destroy(struct mlx5e_flow_steering *fs); struct mlx5_flow_handle *mlx5e_accel_fs_add_sk(struct mlx5e_flow_steering *fs, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c new file mode 100644 index 000000000000..a1d143ea93cc --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -0,0 +1,168 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
+ +#include +#include +#include "en_accel/nvmeotcp.h" +#include "en_accel/fs_tcp.h" +#include "en/txrx.h" + +#define MAX_NUM_NVMEOTCP_QUEUES (4000) +#define MIN_NUM_NVMEOTCP_QUEUES (1) + +static const struct rhashtable_params rhash_queues = { + .key_len = sizeof(int), + .key_offset = offsetof(struct mlx5e_nvmeotcp_queue, id), + .head_offset = offsetof(struct mlx5e_nvmeotcp_queue, hash), + .automatic_shrinking = true, + .min_size = MIN_NUM_NVMEOTCP_QUEUES, + .max_size = MAX_NUM_NVMEOTCP_QUEUES, +}; + +static int +mlx5e_nvmeotcp_offload_limits(struct net_device *netdev, + struct ulp_ddp_limits *ulp_limits) +{ + return 0; +} + +static int +mlx5e_nvmeotcp_queue_init(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_config *tconfig) +{ + return 0; +} + +static void +mlx5e_nvmeotcp_queue_teardown(struct net_device *netdev, + struct sock *sk) +{ +} + +static int +mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_io *ddp) +{ + return 0; +} + +static void +mlx5e_nvmeotcp_ddp_teardown(struct net_device *netdev, + struct sock *sk, + struct ulp_ddp_io *ddp, + void *ddp_ctx) +{ +} + +static void +mlx5e_nvmeotcp_ddp_resync(struct net_device *netdev, + struct sock *sk, u32 seq) +{ +} + +const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops = { + .ulp_ddp_limits = mlx5e_nvmeotcp_offload_limits, + .ulp_ddp_sk_add = mlx5e_nvmeotcp_queue_init, + .ulp_ddp_sk_del = mlx5e_nvmeotcp_queue_teardown, + .ulp_ddp_setup = mlx5e_nvmeotcp_ddp_setup, + .ulp_ddp_teardown = mlx5e_nvmeotcp_ddp_teardown, + .ulp_ddp_resync = mlx5e_nvmeotcp_ddp_resync, +}; + +int set_ulp_ddp_nvme_tcp(struct net_device *netdev, bool enable) +{ + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_params new_params; + int err = 0; + + /* There may be offloaded queues when an ethtool callback to disable the feature is made. 
+ * Hence, we can't destroy the tcp flow-table since it may be referenced by the offload + * related flows and we'll keep the 128B CQEs on the channel RQs. Also, since we don't + * deref/destroy the fs tcp table when the feature is disabled, we don't ref it again + * if the feature is enabled multiple times. + */ + if (!enable || priv->nvmeotcp->enabled) + return 0; + + err = mlx5e_accel_fs_tcp_create(priv->fs); + if (err) + return err; + + new_params = priv->channels.params; + new_params.nvmeotcp = enable; + err = mlx5e_safe_switch_params(priv, &new_params, NULL, NULL, true); + if (err) + goto fs_tcp_destroy; + + priv->nvmeotcp->enabled = true; + return 0; + +fs_tcp_destroy: + mlx5e_accel_fs_tcp_destroy(priv->fs); + return err; +} + +void mlx5e_nvmeotcp_build_netdev(struct mlx5e_priv *priv) +{ + struct net_device *netdev = priv->netdev; + struct mlx5_core_dev *mdev = priv->mdev; + + if (!(MLX5_CAP_GEN(mdev, nvmeotcp) && + MLX5_CAP_DEV_NVMEOTCP(mdev, zerocopy) && + MLX5_CAP_DEV_NVMEOTCP(mdev, crc_rx) && MLX5_CAP_GEN(mdev, cqe_128_always))) + return; + + /* report ULP DDP as supported, but don't enable it by default */ + set_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.hw); + set_bit(ULP_DDP_C_NVME_TCP_DDGST_RX_BIT, netdev->ulp_ddp_caps.hw); +} + +void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv) +{ + if (priv->nvmeotcp && priv->nvmeotcp->enabled) + mlx5e_accel_fs_tcp_destroy(priv->fs); +} + +int mlx5e_nvmeotcp_init(struct mlx5e_priv *priv) +{ + struct mlx5e_nvmeotcp *nvmeotcp = NULL; + int ret = 0; + + if (!MLX5_CAP_GEN(priv->mdev, nvmeotcp)) + return 0; + + nvmeotcp = kzalloc(sizeof(*nvmeotcp), GFP_KERNEL); + + if (!nvmeotcp) + return -ENOMEM; + + ida_init(&nvmeotcp->queue_ids); + ret = rhashtable_init(&nvmeotcp->queue_hash, &rhash_queues); + if (ret) + goto err_ida; + + nvmeotcp->enabled = false; + + priv->nvmeotcp = nvmeotcp; + return 0; + +err_ida: + ida_destroy(&nvmeotcp->queue_ids); + kfree(nvmeotcp); + return ret; +} + +void
mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv) +{ + struct mlx5e_nvmeotcp *nvmeotcp = priv->nvmeotcp; + + if (!nvmeotcp) + return; + + rhashtable_destroy(&nvmeotcp->queue_hash); + ida_destroy(&nvmeotcp->queue_ids); + kfree(nvmeotcp); + priv->nvmeotcp = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h new file mode 100644 index 000000000000..a665b7a72bc2 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h @@ -0,0 +1,121 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */ +#ifndef __MLX5E_NVMEOTCP_H__ +#define __MLX5E_NVMEOTCP_H__ + +#ifdef CONFIG_MLX5_EN_NVMEOTCP + +#include +#include "en.h" +#include "en/params.h" + +struct mlx5e_nvmeotcp_queue_entry { + struct mlx5e_nvmeotcp_queue *queue; + u32 sgl_length; + u32 klm_mkey; + struct scatterlist *sgl; + u32 ccid_gen; + u64 size; + + /* for the ddp invalidate done callback */ + void *ddp_ctx; + struct ulp_ddp_io *ddp; +}; + +struct mlx5e_nvmeotcp_queue_handler { + struct napi_struct napi; + struct mlx5e_cq *cq; +}; + +/** + * struct mlx5e_nvmeotcp_queue - mlx5 metadata for NVMEoTCP queue + * @ulp_ddp_ctx: Generic ulp ddp context + * @tir: Destination TIR created for NVMEoTCP offload + * @fh: Flow handle representing the 5-tuple steering for this flow + * @id: Flow tag ID used to identify this queue + * @size: NVMEoTCP queue depth + * @ccid_gen: Generation ID for the CCID, used to avoid conflicts in DDP + * @max_klms_per_wqe: Number of KLMs per DDP operation + * @hash: Hash table of queues mapped by @id + * @pda: Padding alignment + * @tag_buf_table_id: Tag buffer table for CCIDs + * @dgst: Digest supported (header and/or data) + * @sq: Send queue used for posting umrs + * @ref_count: Reference count for this structure + * @after_resync_cqe: Indicate if resync occurred + * @ccid_table: Table holding metadata for each CC (Command 
Capsule) + * @ccid: ID of the current CC + * @ccsglidx: Index within the scatter-gather list (SGL) of the current CC + * @ccoff: Offset within the current CC + * @ccoff_inner: Current offset within the @ccsglidx element + * @channel_ix: Channel IX for this nvmeotcp_queue + * @sk: The socket used by the NVMe-TCP queue + * @crc_rx: CRC Rx offload indication for this queue + * @priv: mlx5e netdev priv + * @static_params_done: Async completion structure for the initial umr mapping + * synchronization + * @sq_lock: Spin lock for the icosq + * @qh: Completion queue handler for processing umr completions + */ +struct mlx5e_nvmeotcp_queue { + struct ulp_ddp_ctx ulp_ddp_ctx; + struct mlx5e_tir tir; + struct mlx5_flow_handle *fh; + int id; + u32 size; + /* needed when the upper layer immediately reuses CCID + some packet loss happens */ + u32 ccid_gen; + u32 max_klms_per_wqe; + struct rhash_head hash; + int pda; + u32 tag_buf_table_id; + u8 dgst; + struct mlx5e_icosq sq; + + /* data-path section cache aligned */ + refcount_t ref_count; + /* for MASK HW resync cqe */ + bool after_resync_cqe; + struct mlx5e_nvmeotcp_queue_entry *ccid_table; + /* current ccid fields */ + int ccid; + int ccsglidx; + off_t ccoff; + int ccoff_inner; + + u32 channel_ix; + struct sock *sk; + u8 crc_rx:1; + /* for ddp invalidate flow */ + struct mlx5e_priv *priv; + /* end of data-path section */ + + struct completion static_params_done; + /* spin lock for the ico sq, ULP can issue requests from multiple contexts */ + spinlock_t sq_lock; + struct mlx5e_nvmeotcp_queue_handler qh; +}; + +struct mlx5e_nvmeotcp { + struct ida queue_ids; + struct rhashtable queue_hash; + bool enabled; +}; + +void mlx5e_nvmeotcp_build_netdev(struct mlx5e_priv *priv); +int mlx5e_nvmeotcp_init(struct mlx5e_priv *priv); +int set_ulp_ddp_nvme_tcp(struct net_device *netdev, bool enable); +void mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv); +static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {} +void 
mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv); +extern const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops; +#else + +static inline void mlx5e_nvmeotcp_build_netdev(struct mlx5e_priv *priv) {} +static inline int mlx5e_nvmeotcp_init(struct mlx5e_priv *priv) { return 0; } +static inline void mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv) {} +static inline int set_ulp_ddp_nvme_tcp(struct net_device *dev, bool en) { return -EOPNOTSUPP; } +static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {} +static inline void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv) {} +#endif +#endif /* __MLX5E_NVMEOTCP_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 7708acc9b2ab..7f763152f989 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -38,6 +38,7 @@ #include "en/ptp.h" #include "lib/clock.h" #include "en/fs_ethtool.h" +#include "en_accel/nvmeotcp.h" void mlx5e_ethtool_get_drvinfo(struct mlx5e_priv *priv, struct ethtool_drvinfo *drvinfo) @@ -1939,6 +1940,11 @@ int mlx5e_modify_rx_cqe_compression_locked(struct mlx5e_priv *priv, bool new_val return -EINVAL; } + if (priv->channels.params.nvmeotcp) { + netdev_warn(priv->netdev, "Can't set CQE compression after ULP DDP NVMe-TCP offload\n"); + return -EINVAL; + } + new_params = priv->channels.params; MLX5E_SET_PFLAG(&new_params, MLX5E_PFLAG_RX_CQE_COMPRESS, new_val); if (rx_filter) @@ -2393,6 +2399,54 @@ static void mlx5e_get_rmon_stats(struct net_device *netdev, mlx5e_stats_rmon_get(priv, rmon_stats, ranges); } +#ifdef CONFIG_MLX5_EN_NVMEOTCP +static int mlx5e_set_ulp_ddp_capabilities(struct net_device *netdev, unsigned long *new_caps) +{ + struct mlx5e_priv *priv = netdev_priv(netdev); + DECLARE_BITMAP(old_caps, ULP_DDP_C_COUNT); + struct mlx5e_params *params; + int ret = 0; + int nvme = -1; + + mutex_lock(&priv->state_lock); + params = 
&priv->channels.params; + bitmap_copy(old_caps, netdev->ulp_ddp_caps.active, ULP_DDP_C_COUNT); + + /* always handle nvme-tcp-ddp and nvme-tcp-ddgst-rx together (all or nothing) */ + + if (ulp_ddp_cap_turned_on(old_caps, new_caps, ULP_DDP_C_NVME_TCP_BIT) && + ulp_ddp_cap_turned_on(old_caps, new_caps, ULP_DDP_C_NVME_TCP_DDGST_RX_BIT)) { + if (MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS)) { + netdev_warn(netdev, + "NVMe-TCP offload not supported when CQE compress is active. Disable rx_cqe_compress ethtool private flag first\n"); + goto out; + } + + if (netdev->features & (NETIF_F_LRO | NETIF_F_GRO_HW)) { + netdev_warn(netdev, + "NVMe-TCP offload not supported when HW_GRO/LRO is active. Disable rx-gro-hw ethtool feature first\n"); + goto out; + } + nvme = 1; + } else if (ulp_ddp_cap_turned_off(old_caps, new_caps, ULP_DDP_C_NVME_TCP_BIT) && + ulp_ddp_cap_turned_off(old_caps, new_caps, ULP_DDP_C_NVME_TCP_DDGST_RX_BIT)) { + nvme = 0; + } + + if (nvme >= 0) { + ret = set_ulp_ddp_nvme_tcp(netdev, nvme); + if (ret) + goto out; + change_bit(ULP_DDP_C_NVME_TCP_BIT, netdev->ulp_ddp_caps.active); + change_bit(ULP_DDP_C_NVME_TCP_DDGST_RX_BIT, netdev->ulp_ddp_caps.active); + } + +out: + mutex_unlock(&priv->state_lock); + return ret; +} +#endif + const struct ethtool_ops mlx5e_ethtool_ops = { .supported_coalesce_params = ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_MAX_FRAMES | @@ -2445,5 +2499,8 @@ const struct ethtool_ops mlx5e_ethtool_ops = { .get_eth_mac_stats = mlx5e_get_eth_mac_stats, .get_eth_ctrl_stats = mlx5e_get_eth_ctrl_stats, .get_rmon_stats = mlx5e_get_rmon_stats, - .get_link_ext_stats = mlx5e_get_link_ext_stats + .get_link_ext_stats = mlx5e_get_link_ext_stats, +#ifdef CONFIG_MLX5_EN_NVMEOTCP + .set_ulp_ddp_capabilities = mlx5e_set_ulp_ddp_capabilities, +#endif }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c index 1892ccb889b3..a791c6a4bf85 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c @@ -61,7 +61,7 @@ struct mlx5e_flow_steering { #ifdef CONFIG_MLX5_EN_ARFS struct mlx5e_arfs_tables *arfs; #endif -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) struct mlx5e_accel_fs_tcp *accel_tcp; #endif struct mlx5e_fs_udp *udp; @@ -1540,7 +1540,7 @@ void mlx5e_fs_set_any(struct mlx5e_flow_steering *fs, struct mlx5e_fs_any *any) fs->any = any; } -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) struct mlx5e_accel_fs_tcp *mlx5e_fs_get_accel_tcp(struct mlx5e_flow_steering *fs) { return fs->accel_tcp; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 01418af45dc8..5e1e556384cf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -48,6 +48,7 @@ #include "en_accel/macsec.h" #include "en_accel/en_accel.h" #include "en_accel/ktls.h" +#include "en_accel/nvmeotcp.h" #include "lib/vxlan.h" #include "lib/clock.h" #include "en/port.h" @@ -4142,6 +4143,13 @@ static netdev_features_t mlx5e_fix_features(struct net_device *netdev, } } + if (features & (NETIF_F_LRO | NETIF_F_GRO_HW)) { + if (params->nvmeotcp) { + netdev_warn(netdev, "Disabling HW-GRO/LRO, not supported after ULP DDP NVMe-TCP offload\n"); + features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW); + } + } + if (mlx5e_is_uplink_rep(priv)) features = mlx5e_fix_uplink_rep_features(netdev, features); @@ -4910,6 +4918,9 @@ const struct net_device_ops mlx5e_netdev_ops = { .ndo_has_offload_stats = mlx5e_has_offload_stats, .ndo_get_offload_stats = mlx5e_get_offload_stats, #endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + .ulp_ddp_ops = &mlx5e_nvmeotcp_ops, +#endif }; static u32 mlx5e_choose_lro_timeout(struct mlx5_core_dev *mdev, u32 wanted_timeout) @@ -5176,6 +5187,7 @@ static void 
mlx5e_build_nic_netdev(struct net_device *netdev) mlx5e_macsec_build_netdev(priv); mlx5e_ipsec_build_netdev(priv); mlx5e_ktls_build_netdev(priv); + mlx5e_nvmeotcp_build_netdev(priv); } void mlx5e_create_q_counters(struct mlx5e_priv *priv) @@ -5241,13 +5253,19 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev, if (err) mlx5_core_err(mdev, "TLS initialization failed, %d\n", err); + err = mlx5e_nvmeotcp_init(priv); + if (err) + mlx5_core_err(mdev, "NVMEoTCP initialization failed, %d\n", err); + mlx5e_health_create_reporters(priv); + return 0; } static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) { mlx5e_health_destroy_reporters(priv); + mlx5e_nvmeotcp_cleanup(priv); mlx5e_ktls_cleanup(priv); mlx5e_fs_cleanup(priv->fs); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index df134f6d32dc..40c597d9a55d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1555,6 +1555,7 @@ static const int types[] = { MLX5_CAP_DEV_SHAMPO, MLX5_CAP_MACSEC, MLX5_CAP_ADV_VIRTUALIZATION, + MLX5_CAP_DEV_NVMEOTCP, }; static void mlx5_hca_caps_free(struct mlx5_core_dev *dev)

From patchwork Mon Jan 9 13:31:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aurelien Aptel X-Patchwork-Id: 13093583 X-Patchwork-Delegate: kuba@kernel.org From: Aurelien Aptel To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org Cc: Boris Pismenny , Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com Subject: [PATCH v8 19/25] net/mlx5e: TCP flow steering for nvme-tcp acceleration Date: Mon, 9 Jan 2023 15:31:10 +0200 Message-Id: <20230109133116.20801-20-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com>
Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Boris Pismenny Both nvme-tcp and tls acceleration require TCP flow steering. Add a reference counter so that the TCP flow steering structure can be shared between them. Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Aurelien Aptel Reviewed-by: Tariq Toukan --- .../ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c index d7c020f72401..c30224ab6ef3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c @@ -14,6 +14,7 @@ enum accel_fs_tcp_type { struct mlx5e_accel_fs_tcp { struct mlx5e_flow_table tables[ACCEL_FS_TCP_NUM_TYPES]; struct mlx5_flow_handle *default_rules[ACCEL_FS_TCP_NUM_TYPES]; + refcount_t user_count; }; static enum mlx5_traffic_types fs_accel2tt(enum accel_fs_tcp_type i) @@ -360,6 +361,9 @@ void mlx5e_accel_fs_tcp_destroy(struct mlx5e_flow_steering *fs) if (!accel_tcp) return; + if (!refcount_dec_and_test(&accel_tcp->user_count)) + return; + accel_fs_tcp_disable(fs); for (i = 0; i < ACCEL_FS_TCP_NUM_TYPES; i++) @@ -371,12 +375,17 @@ void mlx5e_accel_fs_tcp_destroy(struct mlx5e_flow_steering *fs) int mlx5e_accel_fs_tcp_create(struct mlx5e_flow_steering *fs) { - struct mlx5e_accel_fs_tcp *accel_tcp; + struct mlx5e_accel_fs_tcp *accel_tcp = mlx5e_fs_get_accel_tcp(fs); int i, err; if (!MLX5_CAP_FLOWTABLE_NIC_RX(mlx5e_fs_get_mdev(fs), 
ft_field_support.outer_ip_version)) return -EOPNOTSUPP; + if (accel_tcp) { + refcount_inc(&accel_tcp->user_count); + return 0; + } + accel_tcp = kvzalloc(sizeof(*accel_tcp), GFP_KERNEL); if (!accel_tcp) return -ENOMEM; @@ -392,6 +401,7 @@ int mlx5e_accel_fs_tcp_create(struct mlx5e_flow_steering *fs) if (err) goto err_destroy_tables; + refcount_set(&accel_tcp->user_count, 1); return 0; err_destroy_tables:

From patchwork Mon Jan 9 13:31:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aurelien Aptel X-Patchwork-Id: 13093584 X-Patchwork-Delegate: kuba@kernel.org From: Aurelien Aptel To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org Cc: Ben Ben-Ishay , Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com Subject: [PATCH v8 20/25] net/mlx5e: NVMEoTCP, use KLM UMRs for buffer registration Date: Mon, 9 Jan 2023 15:31:11 +0200 Message-Id: <20230109133116.20801-21-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com> From: Ben Ben-Ishay NVMEoTCP offload uses buffer registration for DDP operations. Every request is built from an SG list whose elements can have many different sizes, so the appropriate way to perform buffer registration is with KLM UMRs. UMR stands for user-mode memory registration; it is a mechanism for altering the address translation properties of an MKEY by posting a work queue element (WQE) on the send queue. An MKEY (memory key) describes a region of memory that can later be used by the HW. KLM stands for {Key, Length, MemVa}; a KLM_MKEY is an indirect MKEY that can map multiple memory regions of different sizes under a single unified MKEY. A KLM UMR is a UMR used to update a KLM_MKEY.
Signed-off-by: Ben Ben-Ishay Signed-off-by: Boris Pismenny Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Aurelien Aptel Reviewed-by: Tariq Toukan --- .../net/ethernet/mellanox/mlx5/core/en/txrx.h | 3 + .../mellanox/mlx5/core/en_accel/nvmeotcp.c | 125 +++++++++++++++++- .../mlx5/core/en_accel/nvmeotcp_utils.h | 25 ++++ .../net/ethernet/mellanox/mlx5/core/en_rx.c | 4 + 4 files changed, 156 insertions(+), 1 deletion(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h index a690a90a4c9c..2781d9eaf4b5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h @@ -50,6 +50,9 @@ enum mlx5e_icosq_wqe_type { MLX5E_ICOSQ_WQE_SET_PSV_TLS, MLX5E_ICOSQ_WQE_GET_PSV_TLS, #endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + MLX5E_ICOSQ_WQE_UMR_NVMEOTCP, +#endif }; /* General */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c index a1d143ea93cc..5611e18c4246 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -4,6 +4,7 @@ #include #include #include "en_accel/nvmeotcp.h" +#include "en_accel/nvmeotcp_utils.h" #include "en_accel/fs_tcp.h" #include "en/txrx.h" @@ -19,9 +20,123 @@ static const struct rhashtable_params rhash_queues = { .max_size = MAX_NUM_NVMEOTCP_QUEUES, }; +static void +fill_nvmeotcp_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe *wqe, u16 ccid, + u32 klm_entries, u16 klm_offset) +{ + struct scatterlist *sgl_mkey; + u32 lkey, i; + + lkey = queue->priv->mdev->mlx5e_res.hw_objs.mkey; + for (i = 0; i < klm_entries; i++) { + sgl_mkey = &queue->ccid_table[ccid].sgl[i + klm_offset]; + wqe->inline_klms[i].bcount = cpu_to_be32(sg_dma_len(sgl_mkey)); + 
wqe->inline_klms[i].key = cpu_to_be32(lkey); + wqe->inline_klms[i].va = cpu_to_be64(sgl_mkey->dma_address); + } + + for (; i < ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT); i++) { + wqe->inline_klms[i].bcount = 0; + wqe->inline_klms[i].key = 0; + wqe->inline_klms[i].va = 0; + } +} + +static void +build_nvmeotcp_klm_umr(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe *wqe, + u16 ccid, int klm_entries, u32 klm_offset, u32 len, + enum wqe_type klm_type) +{ + u32 id = (klm_type == KLM_UMR) ? queue->ccid_table[ccid].klm_mkey : + (mlx5e_tir_get_tirn(&queue->tir) << MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT); + u8 opc_mod = (klm_type == KLM_UMR) ? MLX5_CTRL_SEGMENT_OPC_MOD_UMR_UMR : + MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS; + u32 ds_cnt = MLX5E_KLM_UMR_DS_CNT(ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)); + struct mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl; + struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl; + struct mlx5_mkey_seg *mkc = &wqe->mkc; + u32 sqn = queue->sq.sqn; + u16 pc = queue->sq.pc; + + cseg->opmod_idx_opcode = cpu_to_be32((pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) | + MLX5_OPCODE_UMR | (opc_mod) << 24); + cseg->qpn_ds = cpu_to_be32((sqn << MLX5_WQE_CTRL_QPN_SHIFT) | ds_cnt); + cseg->general_id = cpu_to_be32(id); + + if (klm_type == KLM_UMR && !klm_offset) { + ucseg->mkey_mask = cpu_to_be64(MLX5_MKEY_MASK_XLT_OCT_SIZE | + MLX5_MKEY_MASK_LEN | MLX5_MKEY_MASK_FREE); + mkc->xlt_oct_size = cpu_to_be32(ALIGN(len, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)); + mkc->len = cpu_to_be64(queue->ccid_table[ccid].size); + } + + ucseg->flags = MLX5_UMR_INLINE | MLX5_UMR_TRANSLATION_OFFSET_EN; + ucseg->xlt_octowords = cpu_to_be16(ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)); + ucseg->xlt_offset = cpu_to_be16(klm_offset); + fill_nvmeotcp_klm_wqe(queue, wqe, ccid, klm_entries, klm_offset); +} + +static void +mlx5e_nvmeotcp_fill_wi(struct mlx5e_icosq *sq, u32 wqebbs, u16 pi) +{ + struct mlx5e_icosq_wqe_info *wi = &sq->db.wqe_info[pi]; + + memset(wi, 0, 
sizeof(*wi)); + + wi->num_wqebbs = wqebbs; + wi->wqe_type = MLX5E_ICOSQ_WQE_UMR_NVMEOTCP; +} + +static u32 +post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, + enum wqe_type wqe_type, + u16 ccid, + u32 klm_length, + u32 klm_offset) +{ + struct mlx5e_icosq *sq = &queue->sq; + u32 wqebbs, cur_klm_entries; + struct mlx5e_umr_wqe *wqe; + u16 pi, wqe_sz; + + cur_klm_entries = min_t(int, queue->max_klms_per_wqe, klm_length - klm_offset); + wqe_sz = MLX5E_KLM_UMR_WQE_SZ(ALIGN(cur_klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)); + wqebbs = DIV_ROUND_UP(wqe_sz, MLX5_SEND_WQE_BB); + pi = mlx5e_icosq_get_next_pi(sq, wqebbs); + wqe = MLX5E_NVMEOTCP_FETCH_KLM_WQE(sq, pi); + mlx5e_nvmeotcp_fill_wi(sq, wqebbs, pi); + build_nvmeotcp_klm_umr(queue, wqe, ccid, cur_klm_entries, klm_offset, + klm_length, wqe_type); + sq->pc += wqebbs; + sq->doorbell_cseg = &wqe->ctrl; + return cur_klm_entries; +} + +static void +mlx5e_nvmeotcp_post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, enum wqe_type wqe_type, + u16 ccid, u32 klm_length) +{ + struct mlx5e_icosq *sq = &queue->sq; + u32 klm_offset = 0, wqes, i; + + wqes = DIV_ROUND_UP(klm_length, queue->max_klms_per_wqe); + + spin_lock_bh(&queue->sq_lock); + + for (i = 0; i < wqes; i++) + klm_offset += post_klm_wqe(queue, wqe_type, ccid, klm_length, klm_offset); + + if (wqe_type == KLM_UMR) /* not asking for completion on ddp_setup UMRs */ + __mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, sq->doorbell_cseg, 0); + else + mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, sq->doorbell_cseg); + + spin_unlock_bh(&queue->sq_lock); +} + static int mlx5e_nvmeotcp_offload_limits(struct net_device *netdev, - struct ulp_ddp_limits *ulp_limits) + struct ulp_ddp_limits *limits) { return 0; } @@ -45,6 +160,14 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, struct sock *sk, struct ulp_ddp_io *ddp) { + struct mlx5e_nvmeotcp_queue *queue; + + queue = container_of(ulp_ddp_get_ctx(sk), + struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx); + + /* Placeholder - map_sg and 
initializing the count */ + + mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, 0); return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h new file mode 100644 index 000000000000..6ef92679c5d0 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */ +#ifndef __MLX5E_NVMEOTCP_UTILS_H__ +#define __MLX5E_NVMEOTCP_UTILS_H__ + +#include "en.h" + +#define MLX5E_NVMEOTCP_FETCH_KLM_WQE(sq, pi) \ + ((struct mlx5e_umr_wqe *)\ + mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_umr_wqe))) + +#define MLX5_CTRL_SEGMENT_OPC_MOD_UMR_NVMEOTCP_TIR_PROGRESS_PARAMS 0x4 + +#define MLX5_CTRL_SEGMENT_OPC_MOD_UMR_TIR_PARAMS 0x2 +#define MLX5_CTRL_SEGMENT_OPC_MOD_UMR_UMR 0x0 + +enum wqe_type { + KLM_UMR, + BSF_KLM_UMR, + SET_PSV_UMR, + BSF_UMR, + KLM_INV_UMR, +}; + +#endif /* __MLX5E_NVMEOTCP_UTILS_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 7bf69e35af18..edfe60e641ae 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -984,6 +984,10 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq, int budget) case MLX5E_ICOSQ_WQE_GET_PSV_TLS: mlx5e_ktls_handle_get_psv_completion(wi, sq); break; +#endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + case MLX5E_ICOSQ_WQE_UMR_NVMEOTCP: + break; #endif default: netdev_WARN_ONCE(cq->netdev, From patchwork Mon Jan 9 13:31:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aurelien Aptel X-Patchwork-Id: 13093585 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62A83C54EBD for ; Mon, 9 Jan 2023 13:35:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237204AbjAINfC (ORCPT ); Mon, 9 Jan 2023 08:35:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234870AbjAINdn (ORCPT ); Mon, 9 Jan 2023 08:33:43 -0500 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2044.outbound.protection.outlook.com [40.107.243.44]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30C7C1EC67 for ; Mon, 9 Jan 2023 05:33:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=lMq8C7a4R18Q0YNBsLMagGfAe/SIaVx1kH7QKiO8wLUts4sZbu/N+2EjvgTeqPihPPiJhW/m3Qq+CUgESJ0Pj3+xHY5t1KI4rJIQg0lDf7xOpA+Dk4MtTTtcH4V1vUfHt5KmGY+cacjD5gsumJPdYsySTxjkkp+7X/577otc0qed9n7oL74DYunIZ6n/vtC3jC+2XYbw/iNje2ZiCZjuRTrP2D1n8JXuCVyYhuDXJwho4HwRrbBI43ffDxzrrnUdvtqlPw+dR3+1rCy1q4jK9ibnCpQCb2QwIOWZpGEIXp3rNcHyjDGuGhhnoOO5MoIPiEOC8uPYMM79Mdcnt0BBxw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0i63ecjBVL+Jf/eA49qCMHr0yG625eIWBGFEKTZ0iyQ=; b=V4bth0432IAbFaOQkAIezHGkjGVHddy9b/6Y0XJJBbWIZbSvp+F/AXilZB3pZWNOqFWd7p3nHSYyGlem2DUFzYi7rKXSqKoRPvMKi8dBpAMjE4Ifqi2XFZHbfHeH5JprGxkD5yarjwKN56TmsYUEMEGMQxZHLAI+Hyf1ORNPLB1vv4qVs2GyDdgiPwQZI45iTUuMjRqOjjBwQds5yimaMm3Wp13AZjPtBfwVIKkxnKOqxl0U9NAKqjOOjW4TcMh9xuqzQMHLKv5S/GgKc+5kcYFFjvy2MYUFTfn4WNWHqxMOxReyPoRG1OhMpLhGCQ9MOwZpDgo3l5zIj2Us6pI5Ow== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; 
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 21/25] net/mlx5e: NVMEoTCP, queue init/teardown
Date: Mon, 9 Jan 2023 15:31:12 +0200
Message-Id: <20230109133116.20801-22-aaptel@nvidia.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

From: Ben Ben-Ishay

Add the ddp ops sk_add, sk_del, and offload limits.

When nvme-tcp establishes a new queue/connection, the sk_add op is called. We allocate a hardware context to offload operations for this queue:
- use a steering rule based on the connection 5-tuple to mark packets of this queue/connection with a flow tag in their completion (CQE)
- use a dedicated TIR to identify the queue and maintain the HW context
- use a dedicated ICOSQ to maintain the HW context by UMR postings
- use a dedicated tag buffer for buffer registration
- maintain static and progress HW contexts by posting the proper WQEs

When nvme-tcp tears down a queue/connection, the sk_del op is called. We tear down the queue and free the corresponding contexts.

The offload limits we advertise deal with the maximum SGL length supported.
[Re-enabled calling open/close icosq out of en_main.c]

Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   4 +
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   |  28 +
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |   4 +
 .../net/ethernet/mellanox/mlx5/core/en/tir.c  |  15 +
 .../net/ethernet/mellanox/mlx5/core/en/tir.h  |   2 +
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |   6 +
 .../mellanox/mlx5/core/en_accel/nvmeotcp.c    | 562 +++++++++++++++++-
 .../mellanox/mlx5/core/en_accel/nvmeotcp.h    |   4 +
 .../mlx5/core/en_accel/nvmeotcp_utils.h       |  41 ++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   8 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  15 +-
 11 files changed, 679 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index a6da839a8f82..217b5480c3aa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1053,6 +1053,10 @@ int mlx5e_create_rq(struct mlx5e_rq *rq, struct mlx5e_rq_param *param); void mlx5e_destroy_rq(struct mlx5e_rq *rq); struct mlx5e_sq_param; +int mlx5e_open_icosq(struct mlx5e_channel *c, struct mlx5e_params *params, + struct mlx5e_sq_param *param, struct mlx5e_icosq *sq, + work_func_t recover_work_func); +void mlx5e_close_icosq(struct mlx5e_icosq *sq); int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params, struct mlx5e_sq_param *param, struct xsk_buff_pool *xsk_pool, struct mlx5e_xdpsq *sq, bool is_redirect); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c index e1095bc36543..4a88b675a02c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c @@ -611,6 +611,34 @@ struct mlx5e_rss_params_hash
mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res * return mlx5e_rss_get_hash(res->rss[0]); } +int mlx5e_rx_res_nvmeotcp_tir_create(struct mlx5e_rx_res *res, unsigned int rxq, bool crc_rx, + u32 tag_buf_id, struct mlx5e_tir *tir) +{ + bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT; + struct mlx5e_tir_builder *builder; + u32 rqtn; + int err; + + builder = mlx5e_tir_builder_alloc(false); + if (!builder) + return -ENOMEM; + + rqtn = mlx5e_rx_res_get_rqtn_direct(res, rxq); + + mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn, rqtn, + inner_ft_support); + mlx5e_tir_builder_build_direct(builder); + mlx5e_tir_builder_build_nvmeotcp(builder, crc_rx, tag_buf_id); + down_read(&res->pkt_merge_param_sem); + mlx5e_tir_builder_build_packet_merge(builder, &res->pkt_merge_param); + err = mlx5e_tir_init(tir, builder, res->mdev, false); + up_read(&res->pkt_merge_param_sem); + + mlx5e_tir_builder_free(builder); + + return err; +} + int mlx5e_rx_res_tls_tir_create(struct mlx5e_rx_res *res, unsigned int rxq, struct mlx5e_tir *tir) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h index 5d5f64fab60f..59c22cac9ef4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h @@ -66,4 +66,8 @@ struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res * /* Accel TIRs */ int mlx5e_rx_res_tls_tir_create(struct mlx5e_rx_res *res, unsigned int rxq, struct mlx5e_tir *tir); + +int mlx5e_rx_res_nvmeotcp_tir_create(struct mlx5e_rx_res *res, unsigned int rxq, bool crc_rx, + u32 tag_buf_id, struct mlx5e_tir *tir); + #endif /* __MLX5_EN_RX_RES_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c index d4239e3b3c88..8bdf74cbd8cd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c +++ 
b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.c @@ -143,6 +143,21 @@ void mlx5e_tir_builder_build_direct(struct mlx5e_tir_builder *builder) MLX5_SET(tirc, tirc, rx_hash_fn, MLX5_RX_HASH_FN_INVERTED_XOR8); } +void mlx5e_tir_builder_build_nvmeotcp(struct mlx5e_tir_builder *builder, bool crc_rx, + u32 tag_buf_id) +{ + void *tirc = mlx5e_tir_builder_get_tirc(builder); + + WARN_ON(builder->modify); + + MLX5_SET(tirc, tirc, nvmeotcp_zero_copy_en, 1); + MLX5_SET(tirc, tirc, nvmeotcp_tag_buffer_table_id, tag_buf_id); + MLX5_SET(tirc, tirc, nvmeotcp_crc_en, !!crc_rx); + MLX5_SET(tirc, tirc, self_lb_block, + MLX5_TIRC_SELF_LB_BLOCK_BLOCK_UNICAST | + MLX5_TIRC_SELF_LB_BLOCK_BLOCK_MULTICAST); +} + void mlx5e_tir_builder_build_tls(struct mlx5e_tir_builder *builder) { void *tirc = mlx5e_tir_builder_get_tirc(builder); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.h index 857a84bcd53a..bdec6931444b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tir.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tir.h @@ -35,6 +35,8 @@ void mlx5e_tir_builder_build_rss(struct mlx5e_tir_builder *builder, bool inner); void mlx5e_tir_builder_build_direct(struct mlx5e_tir_builder *builder); void mlx5e_tir_builder_build_tls(struct mlx5e_tir_builder *builder); +void mlx5e_tir_builder_build_nvmeotcp(struct mlx5e_tir_builder *builder, bool crc_rx, + u32 tag_buf_id); struct mlx5_core_dev; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h index 2781d9eaf4b5..971265280e55 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h @@ -52,6 +52,7 @@ enum mlx5e_icosq_wqe_type { #endif #ifdef CONFIG_MLX5_EN_NVMEOTCP MLX5E_ICOSQ_WQE_UMR_NVMEOTCP, + MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP, #endif }; @@ -206,6 +207,11 @@ struct mlx5e_icosq_wqe_info { struct { struct mlx5e_ktls_rx_resync_buf *buf; } tls_get_params; +#endif 
+#ifdef CONFIG_MLX5_EN_NVMEOTCP + struct { + struct mlx5e_nvmeotcp_queue *queue; + } nvmeotcp_q; #endif }; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c index 5611e18c4246..0e0a261ec4e8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -3,6 +3,7 @@ #include #include +#include #include "en_accel/nvmeotcp.h" #include "en_accel/nvmeotcp_utils.h" #include "en_accel/fs_tcp.h" @@ -11,6 +12,11 @@ #define MAX_NUM_NVMEOTCP_QUEUES (4000) #define MIN_NUM_NVMEOTCP_QUEUES (1) +/* Max PDU data will be 512K */ +#define MLX5E_NVMEOTCP_MAX_SEGMENTS (128) +#define MLX5E_NVMEOTCP_IO_THRESHOLD (32 * 1024) +#define MLX5E_NVMEOTCP_FULL_CCID_RANGE (0) + static const struct rhashtable_params rhash_queues = { .key_len = sizeof(int), .key_offset = offsetof(struct mlx5e_nvmeotcp_queue, id), @@ -20,6 +26,95 @@ static const struct rhashtable_params rhash_queues = { .max_size = MAX_NUM_NVMEOTCP_QUEUES, }; +static u32 mlx5e_get_max_sgl(struct mlx5_core_dev *mdev) +{ + return min_t(u32, + MLX5E_NVMEOTCP_MAX_SEGMENTS, + 1 << MLX5_CAP_GEN(mdev, log_max_klm_list_size)); +} + +static u32 +mlx5e_get_channel_ix_from_io_cpu(struct mlx5e_params *params, u32 io_cpu) +{ + int num_channels = params->num_channels; + u32 channel_ix = io_cpu; + + if (channel_ix >= num_channels) + channel_ix = channel_ix % num_channels; + + return channel_ix; +} + +static +int mlx5e_create_nvmeotcp_tag_buf_table(struct mlx5_core_dev *mdev, + struct mlx5e_nvmeotcp_queue *queue, + u8 log_table_size) +{ + u32 in[MLX5_ST_SZ_DW(create_nvmeotcp_tag_buf_table_in)] = {}; + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)]; + u64 general_obj_types; + void *obj; + int err; + + obj = MLX5_ADDR_OF(create_nvmeotcp_tag_buf_table_in, in, + nvmeotcp_tag_buf_table_obj); + + general_obj_types = MLX5_CAP_GEN_64(mdev, general_obj_types); + if (!(general_obj_types & + 
MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE)) + return -EINVAL; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, + MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, + MLX5_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE); + MLX5_SET(nvmeotcp_tag_buf_table_obj, obj, + log_tag_buffer_table_size, log_table_size); + + err = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (!err) + queue->tag_buf_table_id = MLX5_GET(general_obj_out_cmd_hdr, + out, obj_id); + return err; +} + +static +void mlx5_destroy_nvmeotcp_tag_buf_table(struct mlx5_core_dev *mdev, u32 uid) +{ + u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)]; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, + MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, + MLX5_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, uid); + + mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); +} + +static void +fill_nvmeotcp_bsf_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe *wqe, + u16 ccid, u32 klm_entries, u16 klm_offset) +{ + u32 i; + + /* BSF_KLM_UMR is used to update the tag_buffer. To spare the + * need to update both mkc.length and tag_buffer[i].len in two + * different UMRs we initialize the tag_buffer[*].len to the + * maximum size of an entry so the HW check will pass and the + * validity of the MKEY len will be checked against the + * updated mkey context field. 
+ */ + for (i = 0; i < klm_entries; i++) { + u32 lkey = queue->ccid_table[i + klm_offset].klm_mkey; + + wqe->inline_klms[i].bcount = cpu_to_be32(U32_MAX); + wqe->inline_klms[i].key = cpu_to_be32(lkey); + wqe->inline_klms[i].va = 0; + } +} + static void fill_nvmeotcp_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe *wqe, u16 ccid, u32 klm_entries, u16 klm_offset) @@ -73,18 +168,149 @@ build_nvmeotcp_klm_umr(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe ucseg->flags = MLX5_UMR_INLINE | MLX5_UMR_TRANSLATION_OFFSET_EN; ucseg->xlt_octowords = cpu_to_be16(ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)); ucseg->xlt_offset = cpu_to_be16(klm_offset); - fill_nvmeotcp_klm_wqe(queue, wqe, ccid, klm_entries, klm_offset); + if (klm_type == BSF_KLM_UMR) + fill_nvmeotcp_bsf_klm_wqe(queue, wqe, ccid, klm_entries, klm_offset); + else + fill_nvmeotcp_klm_wqe(queue, wqe, ccid, klm_entries, klm_offset); +} + +static void +fill_nvmeotcp_progress_params(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5_seg_nvmeotcp_progress_params *params, + u32 seq) +{ + void *ctx = params->ctx; + + params->tir_num = cpu_to_be32(mlx5e_tir_get_tirn(&queue->tir)); + + MLX5_SET(nvmeotcp_progress_params, ctx, next_pdu_tcp_sn, seq); + MLX5_SET(nvmeotcp_progress_params, ctx, pdu_tracker_state, + MLX5E_NVMEOTCP_PROGRESS_PARAMS_PDU_TRACKER_STATE_START); +} + +void +build_nvmeotcp_progress_params(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_set_nvmeotcp_progress_params_wqe *wqe, + u32 seq) +{ + struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl; + u32 sqn = queue->sq.sqn; + u16 pc = queue->sq.pc; + u8 opc_mod; + + memset(wqe, 0, MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQE_SZ); + opc_mod = MLX5_CTRL_SEGMENT_OPC_MOD_UMR_NVMEOTCP_TIR_PROGRESS_PARAMS; + cseg->opmod_idx_opcode = cpu_to_be32((pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) | + MLX5_OPCODE_SET_PSV | (opc_mod << 24)); + cseg->qpn_ds = cpu_to_be32((sqn << MLX5_WQE_CTRL_QPN_SHIFT) | + PROGRESS_PARAMS_DS_CNT); + 
fill_nvmeotcp_progress_params(queue, &wqe->params, seq); +} + +static void +fill_nvmeotcp_static_params(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5_wqe_transport_static_params_seg *params, + u32 resync_seq, bool ddgst_offload_en) +{ + void *ctx = params->ctx; + + MLX5_SET(transport_static_params, ctx, const_1, 1); + MLX5_SET(transport_static_params, ctx, const_2, 2); + MLX5_SET(transport_static_params, ctx, acc_type, + MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP); + MLX5_SET(transport_static_params, ctx, nvme_resync_tcp_sn, resync_seq); + MLX5_SET(transport_static_params, ctx, pda, queue->pda); + MLX5_SET(transport_static_params, ctx, ddgst_en, + !!(queue->dgst & NVME_TCP_DATA_DIGEST_ENABLE)); + MLX5_SET(transport_static_params, ctx, ddgst_offload_en, ddgst_offload_en); + MLX5_SET(transport_static_params, ctx, hddgst_en, + !!(queue->dgst & NVME_TCP_HDR_DIGEST_ENABLE)); + MLX5_SET(transport_static_params, ctx, hdgst_offload_en, 0); + MLX5_SET(transport_static_params, ctx, ti, + MLX5_TRANSPORT_STATIC_PARAMS_TI_INITIATOR); + MLX5_SET(transport_static_params, ctx, cccid_ttag, 1); + MLX5_SET(transport_static_params, ctx, zero_copy_en, 1); +} + +void +build_nvmeotcp_static_params(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_set_transport_static_params_wqe *wqe, + u32 resync_seq, bool crc_rx) +{ + u8 opc_mod = MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS; + struct mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl; + struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl; + u32 sqn = queue->sq.sqn; + u16 pc = queue->sq.pc; + + memset(wqe, 0, MLX5E_TRANSPORT_STATIC_PARAMS_WQE_SZ); + + cseg->opmod_idx_opcode = cpu_to_be32((pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) | + MLX5_OPCODE_UMR | (opc_mod) << 24); + cseg->qpn_ds = cpu_to_be32((sqn << MLX5_WQE_CTRL_QPN_SHIFT) | + MLX5E_TRANSPORT_STATIC_PARAMS_DS_CNT); + cseg->imm = cpu_to_be32(mlx5e_tir_get_tirn(&queue->tir) + << MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT); + + ucseg->flags = MLX5_UMR_INLINE; + ucseg->bsf_octowords = 
cpu_to_be16(MLX5E_TRANSPORT_STATIC_PARAMS_OCTWORD_SIZE); + fill_nvmeotcp_static_params(queue, &wqe->params, resync_seq, crc_rx); } static void -mlx5e_nvmeotcp_fill_wi(struct mlx5e_icosq *sq, u32 wqebbs, u16 pi) +mlx5e_nvmeotcp_fill_wi(struct mlx5e_nvmeotcp_queue *nvmeotcp_queue, + struct mlx5e_icosq *sq, u32 wqebbs, u16 pi, + enum wqe_type type) { struct mlx5e_icosq_wqe_info *wi = &sq->db.wqe_info[pi]; memset(wi, 0, sizeof(*wi)); wi->num_wqebbs = wqebbs; - wi->wqe_type = MLX5E_ICOSQ_WQE_UMR_NVMEOTCP; + switch (type) { + case SET_PSV_UMR: + wi->wqe_type = MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP; + wi->nvmeotcp_q.queue = nvmeotcp_queue; + break; + default: + /* cases where no further action is required upon completion, such as ddp setup */ + wi->wqe_type = MLX5E_ICOSQ_WQE_UMR_NVMEOTCP; + break; + } +} + +static void +mlx5e_nvmeotcp_rx_post_static_params_wqe(struct mlx5e_nvmeotcp_queue *queue, u32 resync_seq) +{ + struct mlx5e_set_transport_static_params_wqe *wqe; + struct mlx5e_icosq *sq = &queue->sq; + u16 pi, wqebbs; + + spin_lock_bh(&queue->sq_lock); + wqebbs = MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS; + pi = mlx5e_icosq_get_next_pi(sq, wqebbs); + wqe = MLX5E_TRANSPORT_FETCH_SET_STATIC_PARAMS_WQE(sq, pi); + mlx5e_nvmeotcp_fill_wi(NULL, sq, wqebbs, pi, BSF_UMR); + build_nvmeotcp_static_params(queue, wqe, resync_seq, queue->crc_rx); + sq->pc += wqebbs; + mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &wqe->ctrl); + spin_unlock_bh(&queue->sq_lock); +} + +static void +mlx5e_nvmeotcp_rx_post_progress_params_wqe(struct mlx5e_nvmeotcp_queue *queue, u32 seq) +{ + struct mlx5e_set_nvmeotcp_progress_params_wqe *wqe; + struct mlx5e_icosq *sq = &queue->sq; + u16 pi, wqebbs; + + wqebbs = MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQEBBS; + pi = mlx5e_icosq_get_next_pi(sq, wqebbs); + wqe = MLX5E_NVMEOTCP_FETCH_PROGRESS_PARAMS_WQE(sq, pi); + mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, SET_PSV_UMR); + build_nvmeotcp_progress_params(queue, wqe, seq); + sq->pc += wqebbs; + mlx5e_notify_hw(&sq->wq, 
sq->pc, sq->uar_map, &wqe->ctrl); } static u32 @@ -104,7 +330,7 @@ post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, wqebbs = DIV_ROUND_UP(wqe_sz, MLX5_SEND_WQE_BB); pi = mlx5e_icosq_get_next_pi(sq, wqebbs); wqe = MLX5E_NVMEOTCP_FETCH_KLM_WQE(sq, pi); - mlx5e_nvmeotcp_fill_wi(sq, wqebbs, pi); + mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, wqe_type); build_nvmeotcp_klm_umr(queue, wqe, ccid, cur_klm_entries, klm_offset, klm_length, wqe_type); sq->pc += wqebbs; @@ -134,25 +360,326 @@ mlx5e_nvmeotcp_post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, enum wqe_type wq spin_unlock_bh(&queue->sq_lock); } +static int mlx5e_create_nvmeotcp_mkey(struct mlx5_core_dev *mdev, u8 access_mode, + u32 translation_octword_size, u32 *mkey) +{ + int inlen = MLX5_ST_SZ_BYTES(create_mkey_in); + void *mkc; + u32 *in; + int err; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + MLX5_SET(mkc, mkc, free, 1); + MLX5_SET(mkc, mkc, translations_octword_size, translation_octword_size); + MLX5_SET(mkc, mkc, umr_en, 1); + MLX5_SET(mkc, mkc, lw, 1); + MLX5_SET(mkc, mkc, lr, 1); + MLX5_SET(mkc, mkc, access_mode_1_0, access_mode); + + MLX5_SET(mkc, mkc, qpn, 0xffffff); + MLX5_SET(mkc, mkc, pd, mdev->mlx5e_res.hw_objs.pdn); + + err = mlx5_core_create_mkey(mdev, mkey, in, inlen); + + kvfree(in); + return err; +} + static int mlx5e_nvmeotcp_offload_limits(struct net_device *netdev, struct ulp_ddp_limits *limits) { + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5_core_dev *mdev = priv->mdev; + + if (limits->type != ULP_DDP_NVME) + return -EOPNOTSUPP; + + limits->max_ddp_sgl_len = mlx5e_get_max_sgl(mdev); + limits->io_threshold = MLX5E_NVMEOTCP_IO_THRESHOLD; + limits->nvmeotcp.full_ccid_range = MLX5E_NVMEOTCP_FULL_CCID_RANGE; return 0; } +static int mlx5e_nvmeotcp_queue_handler_poll(struct napi_struct *napi, int budget) +{ + struct mlx5e_nvmeotcp_queue_handler *qh; + int work_done; + + qh = container_of(napi, 
struct mlx5e_nvmeotcp_queue_handler, napi); + + work_done = mlx5e_poll_ico_cq(qh->cq, budget); + + if (work_done == budget || !napi_complete_done(napi, work_done)) + goto out; + + mlx5e_cq_arm(qh->cq); + +out: + return work_done; +} + +static void +mlx5e_nvmeotcp_destroy_icosq(struct mlx5e_icosq *sq) +{ + mlx5e_close_icosq(sq); + mlx5e_close_cq(&sq->cq); +} + +static void mlx5e_nvmeotcp_icosq_err_cqe_work(struct work_struct *recover_work) +{ + struct mlx5e_icosq *sq = container_of(recover_work, struct mlx5e_icosq, recover_work); + + /* Not implemented yet. */ + + netdev_warn(sq->channel->netdev, "nvmeotcp icosq recovery is not implemented\n"); +} + +static int +mlx5e_nvmeotcp_build_icosq(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_priv *priv, int io_cpu) +{ + u16 max_sgl, max_klm_per_wqe, max_umr_per_ccid, sgl_rest, wqebbs_rest; + struct mlx5e_channel *c = priv->channels.c[queue->channel_ix]; + struct mlx5e_sq_param icosq_param = {}; + struct mlx5e_create_cq_param ccp = {}; + struct dim_cq_moder icocq_moder = {}; + struct mlx5e_icosq *icosq; + int err = -ENOMEM; + u16 log_icosq_sz; + u32 max_wqebbs; + + icosq = &queue->sq; + max_sgl = mlx5e_get_max_sgl(priv->mdev); + max_klm_per_wqe = queue->max_klms_per_wqe; + max_umr_per_ccid = max_sgl / max_klm_per_wqe; + sgl_rest = max_sgl % max_klm_per_wqe; + wqebbs_rest = sgl_rest ? 
MLX5E_KLM_UMR_WQEBBS(sgl_rest) : 0; + max_wqebbs = (MLX5E_KLM_UMR_WQEBBS(max_klm_per_wqe) * + max_umr_per_ccid + wqebbs_rest) * queue->size; + log_icosq_sz = order_base_2(max_wqebbs); + + mlx5e_build_icosq_param(priv->mdev, log_icosq_sz, &icosq_param); + ccp.napi = &queue->qh.napi; + ccp.ch_stats = &priv->channel_stats[queue->channel_ix]->ch; + ccp.node = cpu_to_node(io_cpu); + ccp.ix = queue->channel_ix; + + err = mlx5e_open_cq(priv, icocq_moder, &icosq_param.cqp, &ccp, &icosq->cq); + if (err) + goto err_nvmeotcp_sq; + err = mlx5e_open_icosq(c, &priv->channels.params, &icosq_param, icosq, + mlx5e_nvmeotcp_icosq_err_cqe_work); + if (err) + goto close_cq; + + spin_lock_init(&queue->sq_lock); + return 0; + +close_cq: + mlx5e_close_cq(&icosq->cq); +err_nvmeotcp_sq: + return err; +} + +static void +mlx5e_nvmeotcp_destroy_rx(struct mlx5e_priv *priv, struct mlx5e_nvmeotcp_queue *queue, + struct mlx5_core_dev *mdev) +{ + int i; + + mlx5e_accel_fs_del_sk(queue->fh); + + for (i = 0; i < queue->size; i++) + mlx5_core_destroy_mkey(mdev, queue->ccid_table[i].klm_mkey); + + mlx5e_tir_destroy(&queue->tir); + mlx5_destroy_nvmeotcp_tag_buf_table(mdev, queue->tag_buf_table_id); + + mlx5e_deactivate_icosq(&queue->sq); + napi_disable(&queue->qh.napi); + mlx5e_nvmeotcp_destroy_icosq(&queue->sq); + netif_napi_del(&queue->qh.napi); +} + +static int +mlx5e_nvmeotcp_queue_rx_init(struct mlx5e_nvmeotcp_queue *queue, + struct nvme_tcp_ddp_config *config, + struct net_device *netdev) +{ + u8 log_queue_size = order_base_2(config->queue_size); + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5_core_dev *mdev = priv->mdev; + struct sock *sk = queue->sk; + int err, max_sgls, i; + + if (config->queue_size > + BIT(MLX5_CAP_DEV_NVMEOTCP(mdev, log_max_nvmeotcp_tag_buffer_size))) + return -EINVAL; + + err = mlx5e_create_nvmeotcp_tag_buf_table(mdev, queue, log_queue_size); + if (err) + return err; + + queue->qh.cq = &queue->sq.cq; + netif_napi_add(priv->netdev, &queue->qh.napi, 
mlx5e_nvmeotcp_queue_handler_poll); + + mutex_lock(&priv->state_lock); + err = mlx5e_nvmeotcp_build_icosq(queue, priv, config->io_cpu); + mutex_unlock(&priv->state_lock); + if (err) + goto del_napi; + + napi_enable(&queue->qh.napi); + mlx5e_activate_icosq(&queue->sq); + + /* initializes queue->tir */ + err = mlx5e_rx_res_nvmeotcp_tir_create(priv->rx_res, queue->channel_ix, queue->crc_rx, + queue->tag_buf_table_id, &queue->tir); + if (err) + goto destroy_icosq; + + mlx5e_nvmeotcp_rx_post_static_params_wqe(queue, 0); + mlx5e_nvmeotcp_rx_post_progress_params_wqe(queue, tcp_sk(sk)->copied_seq); + + queue->ccid_table = kcalloc(queue->size, sizeof(struct mlx5e_nvmeotcp_queue_entry), + GFP_KERNEL); + if (!queue->ccid_table) { + err = -ENOMEM; + goto destroy_tir; + } + + max_sgls = mlx5e_get_max_sgl(mdev); + for (i = 0; i < queue->size; i++) { + err = mlx5e_create_nvmeotcp_mkey(mdev, MLX5_MKC_ACCESS_MODE_KLMS, max_sgls, + &queue->ccid_table[i].klm_mkey); + if (err) + goto free_ccid_table; + } + + mlx5e_nvmeotcp_post_klm_wqe(queue, BSF_KLM_UMR, 0, queue->size); + + if (!(WARN_ON(!wait_for_completion_timeout(&queue->static_params_done, + msecs_to_jiffies(3000))))) + queue->fh = mlx5e_accel_fs_add_sk(priv->fs, sk, mlx5e_tir_get_tirn(&queue->tir), + queue->id); + + if (IS_ERR_OR_NULL(queue->fh)) { + err = -EINVAL; + goto destroy_mkeys; + } + + return 0; + +destroy_mkeys: + while ((i--)) + mlx5_core_destroy_mkey(mdev, queue->ccid_table[i].klm_mkey); +free_ccid_table: + kfree(queue->ccid_table); +destroy_tir: + mlx5e_tir_destroy(&queue->tir); +destroy_icosq: + mlx5e_deactivate_icosq(&queue->sq); + napi_disable(&queue->qh.napi); + mlx5e_nvmeotcp_destroy_icosq(&queue->sq); +del_napi: + netif_napi_del(&queue->qh.napi); + mlx5_destroy_nvmeotcp_tag_buf_table(mdev, queue->tag_buf_table_id); + + return err; +} + static int mlx5e_nvmeotcp_queue_init(struct net_device *netdev, struct sock *sk, struct ulp_ddp_config *tconfig) { + struct nvme_tcp_ddp_config *config = &tconfig->nvmeotcp; + 
 	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5e_nvmeotcp_queue *queue;
+	int queue_id, err;
+
+	if (tconfig->type != ULP_DDP_NVME) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+	if (!queue) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	queue_id = ida_simple_get(&priv->nvmeotcp->queue_ids,
+				  MIN_NUM_NVMEOTCP_QUEUES, MAX_NUM_NVMEOTCP_QUEUES,
+				  GFP_KERNEL);
+	if (queue_id < 0) {
+		err = -ENOSPC;
+		goto free_queue;
+	}
+
+	queue->crc_rx = !!(config->dgst & NVME_TCP_DATA_DIGEST_ENABLE);
+	queue->ulp_ddp_ctx.type = ULP_DDP_NVME;
+	queue->sk = sk;
+	queue->id = queue_id;
+	queue->dgst = config->dgst;
+	queue->pda = config->cpda;
+	queue->channel_ix = mlx5e_get_channel_ix_from_io_cpu(&priv->channels.params,
+							     config->io_cpu);
+	queue->size = config->queue_size;
+	queue->max_klms_per_wqe = MLX5E_MAX_KLM_PER_WQE(mdev);
+	queue->priv = priv;
+	init_completion(&queue->static_params_done);
+
+	err = mlx5e_nvmeotcp_queue_rx_init(queue, config, netdev);
+	if (err)
+		goto remove_queue_id;
+
+	err = rhashtable_insert_fast(&priv->nvmeotcp->queue_hash, &queue->hash,
+				     rhash_queues);
+	if (err)
+		goto destroy_rx;
+
+	write_lock_bh(&sk->sk_callback_lock);
+	ulp_ddp_set_ctx(sk, queue);
+	write_unlock_bh(&sk->sk_callback_lock);
+	refcount_set(&queue->ref_count, 1);
 	return 0;
+
+destroy_rx:
+	mlx5e_nvmeotcp_destroy_rx(priv, queue, mdev);
+remove_queue_id:
+	ida_simple_remove(&priv->nvmeotcp->queue_ids, queue_id);
+free_queue:
+	kfree(queue);
+out:
+	return err;
 }
 
 static void
 mlx5e_nvmeotcp_queue_teardown(struct net_device *netdev,
 			      struct sock *sk)
 {
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5e_nvmeotcp_queue *queue;
+
+	queue = container_of(ulp_ddp_get_ctx(sk), struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx);
+
+	WARN_ON(refcount_read(&queue->ref_count) != 1);
+	mlx5e_nvmeotcp_destroy_rx(priv, queue, mdev);
+
+	rhashtable_remove_fast(&priv->nvmeotcp->queue_hash, &queue->hash,
+			       rhash_queues);
+	ida_simple_remove(&priv->nvmeotcp->queue_ids, queue->id);
+	write_lock_bh(&sk->sk_callback_lock);
+	ulp_ddp_set_ctx(sk, NULL);
+	write_unlock_bh(&sk->sk_callback_lock);
+	mlx5e_nvmeotcp_put_queue(queue);
 }
 
 static int
@@ -171,6 +698,13 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev,
 	return 0;
 }
 
+void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi)
+{
+	struct mlx5e_nvmeotcp_queue *queue = wi->nvmeotcp_q.queue;
+
+	complete(&queue->static_params_done);
+}
+
 static void
 mlx5e_nvmeotcp_ddp_teardown(struct net_device *netdev,
 			    struct sock *sk,
@@ -194,6 +728,26 @@ const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops = {
 	.ulp_ddp_resync = mlx5e_nvmeotcp_ddp_resync,
 };
 
+struct mlx5e_nvmeotcp_queue *
+mlx5e_nvmeotcp_get_queue(struct mlx5e_nvmeotcp *nvmeotcp, int id)
+{
+	struct mlx5e_nvmeotcp_queue *queue;
+
+	queue = rhashtable_lookup_fast(&nvmeotcp->queue_hash,
+				       &id, rhash_queues);
+	if (!IS_ERR_OR_NULL(queue))
+		refcount_inc(&queue->ref_count);
+	return queue;
+}
+
+void mlx5e_nvmeotcp_put_queue(struct mlx5e_nvmeotcp_queue *queue)
+{
+	if (refcount_dec_and_test(&queue->ref_count)) {
+		kfree(queue->ccid_table);
+		kfree(queue);
+	}
+}
+
 int set_ulp_ddp_nvme_tcp(struct net_device *netdev, bool enable)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
index a665b7a72bc2..555f3ed7e2e2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
@@ -106,6 +106,10 @@ void mlx5e_nvmeotcp_build_netdev(struct mlx5e_priv *priv);
 int mlx5e_nvmeotcp_init(struct mlx5e_priv *priv);
 int set_ulp_ddp_nvme_tcp(struct net_device *netdev, bool enable);
 void mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv);
+struct mlx5e_nvmeotcp_queue *
+mlx5e_nvmeotcp_get_queue(struct mlx5e_nvmeotcp *nvmeotcp, int id);
+void mlx5e_nvmeotcp_put_queue(struct mlx5e_nvmeotcp_queue *queue);
+void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi);
 static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {}
 void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv);
 extern const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h
index 6ef92679c5d0..fdb194c30e3b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_utils.h
@@ -4,6 +4,35 @@
 #define __MLX5E_NVMEOTCP_UTILS_H__
 
 #include "en.h"
+#include "en_accel/nvmeotcp.h"
+#include "en_accel/common_utils.h"
+
+enum {
+	MLX5E_NVMEOTCP_PROGRESS_PARAMS_PDU_TRACKER_STATE_START = 0,
+	MLX5E_NVMEOTCP_PROGRESS_PARAMS_PDU_TRACKER_STATE_TRACKING = 1,
+	MLX5E_NVMEOTCP_PROGRESS_PARAMS_PDU_TRACKER_STATE_SEARCHING = 2,
+};
+
+struct mlx5_seg_nvmeotcp_progress_params {
+	__be32 tir_num;
+	u8 ctx[MLX5_ST_SZ_BYTES(nvmeotcp_progress_params)];
+};
+
+struct mlx5e_set_nvmeotcp_progress_params_wqe {
+	struct mlx5_wqe_ctrl_seg ctrl;
+	struct mlx5_seg_nvmeotcp_progress_params params;
+};
+
+/* macros for wqe handling */
+#define MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQE_SZ \
+	(sizeof(struct mlx5e_set_nvmeotcp_progress_params_wqe))
+
+#define MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQEBBS \
+	(DIV_ROUND_UP(MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQE_SZ, MLX5_SEND_WQE_BB))
+
+#define MLX5E_NVMEOTCP_FETCH_PROGRESS_PARAMS_WQE(sq, pi) \
+	((struct mlx5e_set_nvmeotcp_progress_params_wqe *)\
+	 mlx5e_fetch_wqe(&(sq)->wq, pi, sizeof(struct mlx5e_set_nvmeotcp_progress_params_wqe)))
 
 #define MLX5E_NVMEOTCP_FETCH_KLM_WQE(sq, pi) \
 	((struct mlx5e_umr_wqe *)\
@@ -14,6 +43,9 @@
 #define MLX5_CTRL_SEGMENT_OPC_MOD_UMR_TIR_PARAMS 0x2
 #define MLX5_CTRL_SEGMENT_OPC_MOD_UMR_UMR 0x0
 
+#define PROGRESS_PARAMS_DS_CNT \
+	DIV_ROUND_UP(MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQE_SZ, MLX5_SEND_WQE_DS)
+
 enum wqe_type {
 	KLM_UMR,
 	BSF_KLM_UMR,
@@ -22,4 +54,13 @@ enum wqe_type {
 	KLM_INV_UMR,
 };
 
+void
+build_nvmeotcp_progress_params(struct mlx5e_nvmeotcp_queue *queue,
+			       struct mlx5e_set_nvmeotcp_progress_params_wqe *wqe, u32 seq);
+
+void
+build_nvmeotcp_static_params(struct mlx5e_nvmeotcp_queue *queue,
+			     struct mlx5e_set_transport_static_params_wqe *wqe,
+			     u32 resync_seq, bool crc_rx);
+
 #endif /* __MLX5E_NVMEOTCP_UTILS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5e1e556384cf..2c37c94ffcf4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1739,9 +1739,9 @@ void mlx5e_tx_err_cqe_work(struct work_struct *recover_work)
 	mlx5e_reporter_tx_err_cqe(sq);
 }
 
-static int mlx5e_open_icosq(struct mlx5e_channel *c, struct mlx5e_params *params,
-			    struct mlx5e_sq_param *param, struct mlx5e_icosq *sq,
-			    work_func_t recover_work_func)
+int mlx5e_open_icosq(struct mlx5e_channel *c, struct mlx5e_params *params,
+		     struct mlx5e_sq_param *param, struct mlx5e_icosq *sq,
+		     work_func_t recover_work_func)
 {
 	struct mlx5e_create_sq_param csp = {};
 	int err;
@@ -1785,7 +1785,7 @@ void mlx5e_deactivate_icosq(struct mlx5e_icosq *icosq)
 	synchronize_net(); /* Sync with NAPI. */
 }
 
-static void mlx5e_close_icosq(struct mlx5e_icosq *sq)
+void mlx5e_close_icosq(struct mlx5e_icosq *sq)
 {
 	if (sq->ktls_resync)
 		mlx5e_ktls_rx_resync_destroy_resp_list(sq->ktls_resync);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index edfe60e641ae..74b45c94d1cb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -53,6 +53,7 @@
 #include "en_accel/macsec.h"
 #include "en_accel/ipsec_rxtx.h"
 #include "en_accel/ktls_txrx.h"
+#include "en_accel/nvmeotcp.h"
 #include "en/xdp.h"
 #include "en/xsk/rx.h"
 #include "en/health.h"
@@ -880,16 +881,23 @@ void mlx5e_free_icosq_descs(struct mlx5e_icosq *sq)
 		ci = mlx5_wq_cyc_ctr2ix(&sq->wq, sqcc);
 		wi = &sq->db.wqe_info[ci];
 		sqcc += wi->num_wqebbs;
-#ifdef CONFIG_MLX5_EN_TLS
 		switch (wi->wqe_type) {
+#ifdef CONFIG_MLX5_EN_TLS
 		case MLX5E_ICOSQ_WQE_SET_PSV_TLS:
 			mlx5e_ktls_handle_ctx_completion(wi);
 			break;
 		case MLX5E_ICOSQ_WQE_GET_PSV_TLS:
 			mlx5e_ktls_handle_get_psv_completion(wi, sq);
 			break;
-		}
 #endif
+#ifdef CONFIG_MLX5_EN_NVMEOTCP
+		case MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP:
+			mlx5e_nvmeotcp_ctx_complete(wi);
+			break;
+#endif
+		default:
+			break;
+		}
 	}
 	sq->cc = sqcc;
 }
@@ -988,6 +996,9 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq, int budget)
 #ifdef CONFIG_MLX5_EN_NVMEOTCP
 		case MLX5E_ICOSQ_WQE_UMR_NVMEOTCP:
 			break;
+		case MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP:
+			mlx5e_nvmeotcp_ctx_complete(wi);
+			break;
 #endif
 		default:
 			netdev_WARN_ONCE(cq->netdev,

From patchwork Mon Jan 9 13:31:13 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093586
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
 hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
 davem@davemloft.net, kuba@kernel.org
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
 malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 22/25] net/mlx5e: NVMEoTCP, ddp setup and resync
Date: Mon, 9 Jan 2023 15:31:13 +0200
Message-Id: <20230109133116.20801-23-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

From: Ben Ben-Ishay

NVMEoTCP offload uses buffer registration for every NVMe request to
perform direct data placement. This is achieved by creating a NIC HW
mapping between the CCID (command capsule ID) and the set of buffers
that compose the request. The registration is implemented via an MKEY,
for which we do a fast/async mapping using a KLM UMR WQE.

The buffer registration takes place when the ULP calls the ddp_setup
op, before it sends the corresponding request to the other side
(e.g. an NVMe-oF target). We don't wait for the registration to
complete before returning to the ULP; the rationale is that the HW
mapping should be in place quickly relative to the RTT it takes for
the request to be responded to. If it isn't, some IO may not be
ddp-offloaded, but that doesn't stop the overall offloading session.

When the offloading HW gets out of sync with the protocol session, a
hardware/software handshake takes place to resync. The ddp_resync op
is the part of the handshake in which the SW confirms to the HW that
it has indeed correctly identified a PDU header at a certain TCP
sequence number, allowing the HW to resume the offload. The first part
of the handshake happens when the HW identifies such a sequence number
in an arriving packet: a special mark is made on the completion (CQE),
and the mlx5 driver then invokes the resync_request ddp callback
advertised by the ULP in the ddp context - this is in a downstream
patch.
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 .../mellanox/mlx5/core/en_accel/nvmeotcp.c | 146 +++++++++++++++++-
 1 file changed, 144 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
index 0e0a261ec4e8..3f1c0e7682c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
@@ -682,19 +682,156 @@ mlx5e_nvmeotcp_queue_teardown(struct net_device *netdev,
 	mlx5e_nvmeotcp_put_queue(queue);
 }
 
+static bool
+mlx5e_nvmeotcp_validate_small_sgl_suffix(struct scatterlist *sg, int sg_len, int mtu)
+{
+	int i, hole_size, hole_len, chunk_size = 0;
+
+	for (i = 1; i < sg_len; i++)
+		chunk_size += sg_dma_len(&sg[i]);
+
+	if (chunk_size >= mtu)
+		return true;
+
+	hole_size = mtu - chunk_size - 1;
+	hole_len = DIV_ROUND_UP(hole_size, PAGE_SIZE);
+
+	if (sg_len + hole_len > MAX_SKB_FRAGS)
+		return false;
+
+	return true;
+}
+
+static bool
+mlx5e_nvmeotcp_validate_big_sgl_suffix(struct scatterlist *sg, int sg_len, int mtu)
+{
+	int i, j, last_elem, window_idx, window_size = MAX_SKB_FRAGS - 1;
+	int chunk_size = 0;
+
+	last_elem = sg_len - window_size;
+	window_idx = window_size;
+
+	for (j = 1; j < window_size; j++)
+		chunk_size += sg_dma_len(&sg[j]);
+
+	for (i = 1; i <= last_elem; i++, window_idx++) {
+		chunk_size += sg_dma_len(&sg[window_idx]);
+		if (chunk_size < mtu - 1)
+			return false;
+
+		chunk_size -= sg_dma_len(&sg[i]);
+	}
+
+	return true;
+}
+
+/* This function makes sure that the middle/suffix of a PDU SGL meets the
+ * restriction of MAX_SKB_FRAGS. There are two cases here:
+ * 1. sg_len < MAX_SKB_FRAGS - the extreme case here is a packet that consists
+ * of one byte from the first SG element + the rest of the SGL; the remaining
+ * space of the packet will be scattered to the WQE and will be pointed to by
+ * SKB frags.
+ * 2. sg_len >= MAX_SKB_FRAGS - the extreme case here is a packet that consists
+ * of one byte from a middle SG element + 15 contiguous SG elements + one byte
+ * from a subsequent SG element or the rest of the packet.
+ */
+static bool
+mlx5e_nvmeotcp_validate_sgl_suffix(struct scatterlist *sg, int sg_len, int mtu)
+{
+	int ret;
+
+	if (sg_len < MAX_SKB_FRAGS)
+		ret = mlx5e_nvmeotcp_validate_small_sgl_suffix(sg, sg_len, mtu);
+	else
+		ret = mlx5e_nvmeotcp_validate_big_sgl_suffix(sg, sg_len, mtu);
+
+	return ret;
+}
+
+static bool
+mlx5e_nvmeotcp_validate_sgl_prefix(struct scatterlist *sg, int sg_len, int mtu)
+{
+	int i, hole_size, hole_len, tmp_len, chunk_size = 0;
+
+	tmp_len = min_t(int, sg_len, MAX_SKB_FRAGS);
+
+	for (i = 0; i < tmp_len; i++)
+		chunk_size += sg_dma_len(&sg[i]);
+
+	if (chunk_size >= mtu)
+		return true;
+
+	hole_size = mtu - chunk_size;
+	hole_len = DIV_ROUND_UP(hole_size, PAGE_SIZE);
+
+	if (tmp_len + hole_len > MAX_SKB_FRAGS)
+		return false;
+
+	return true;
+}
+
+/* This function is responsible for ensuring that a PDU can be offloaded.
+ * A PDU is offloaded by building a non-linear SKB such that each SGL element
+ * is placed in a frag, thus this function should ensure that no packet that
+ * carries part of the PDU needs more than MAX_SKB_FRAGS SGL elements.
+ * In addition, the NVMEoTCP offload has a one-PDU-per-packet restriction:
+ * a packet may start with a new PDU, in which case the prefix of the PDU
+ * must meet the requirement, or it may start in the middle of an SG element,
+ * in which case the suffix of the PDU must meet the requirement.
+ */
+static bool
+mlx5e_nvmeotcp_validate_sgl(struct scatterlist *sg, int sg_len, int mtu)
+{
+	int max_hole_frags;
+
+	max_hole_frags = DIV_ROUND_UP(mtu, PAGE_SIZE);
+	if (sg_len + max_hole_frags <= MAX_SKB_FRAGS)
+		return true;
+
+	if (!mlx5e_nvmeotcp_validate_sgl_prefix(sg, sg_len, mtu) ||
+	    !mlx5e_nvmeotcp_validate_sgl_suffix(sg, sg_len, mtu))
+		return false;
+
+	return true;
+}
+
 static int
 mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev,
 			 struct sock *sk,
 			 struct ulp_ddp_io *ddp)
 {
+	struct scatterlist *sg = ddp->sg_table.sgl;
+	struct mlx5e_nvmeotcp_queue_entry *nvqt;
 	struct mlx5e_nvmeotcp_queue *queue;
+	struct mlx5_core_dev *mdev;
+	int i, size = 0, count = 0;
 
 	queue = container_of(ulp_ddp_get_ctx(sk),
 			     struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx);
 
+	mdev = queue->priv->mdev;
+	count = dma_map_sg(mdev->device, ddp->sg_table.sgl, ddp->nents,
+			   DMA_FROM_DEVICE);
+
+	if (count <= 0)
+		return -EINVAL;
 
-	/* Placeholder - map_sg and initializing the count */
+	if (WARN_ON(count > mlx5e_get_max_sgl(mdev)))
+		return -ENOSPC;
+
+	if (!mlx5e_nvmeotcp_validate_sgl(sg, count, READ_ONCE(netdev->mtu)))
+		return -EOPNOTSUPP;
+
+	for (i = 0; i < count; i++)
+		size += sg_dma_len(&sg[i]);
+
+	nvqt = &queue->ccid_table[ddp->command_id];
+	nvqt->size = size;
+	nvqt->ddp = ddp;
+	nvqt->sgl = sg;
+	nvqt->ccid_gen++;
+	nvqt->sgl_length = count;
+	mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, count);
 
-	mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, 0);
 	return 0;
 }
 
@@ -717,6 +854,11 @@ static void
 mlx5e_nvmeotcp_ddp_resync(struct net_device *netdev,
 			  struct sock *sk, u32 seq)
 {
+	struct mlx5e_nvmeotcp_queue *queue =
+		container_of(ulp_ddp_get_ctx(sk), struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx);
+
+	queue->after_resync_cqe = 1;
+	mlx5e_nvmeotcp_rx_post_static_params_wqe(queue, seq);
 }
 
 const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops = {

From patchwork Mon Jan 9 13:31:14 2023
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093589
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me,
 hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com,
 davem@davemloft.net, kuba@kernel.org
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com,
 malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 23/25] net/mlx5e: NVMEoTCP, async ddp invalidation
Date: Mon, 9 Jan 2023 15:31:14 +0200
Message-Id: <20230109133116.20801-24-aaptel@nvidia.com>
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

From: Ben Ben-Ishay

After the ULP has consumed the buffers of the offloaded request, it
calls the ddp_teardown op to release the NIC mapping for them and
allow the NIC to reuse the HW contexts associated with offloading this
IO. We do a fast/async un-mapping via a UMR WQE. In this case, the ULP
holds off on completing the request towards the upper/application
layers until the HW unmapping is done. When the corresponding CQE is
received, a notification is delivered via the teardown_done ddp
callback advertised by the ULP in the ddp context.
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |  4 ++
 .../mellanox/mlx5/core/en_accel/nvmeotcp.c    | 66 ++++++++++++++++---
 .../mellanox/mlx5/core/en_accel/nvmeotcp.h    |  1 +
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  6 ++
 4 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index 971265280e55..c6952ff4c7ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -52,6 +52,7 @@ enum mlx5e_icosq_wqe_type {
 #endif
 #ifdef CONFIG_MLX5_EN_NVMEOTCP
 	MLX5E_ICOSQ_WQE_UMR_NVMEOTCP,
+	MLX5E_ICOSQ_WQE_UMR_NVMEOTCP_INVALIDATE,
 	MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP,
 #endif
 };
@@ -212,6 +213,9 @@ struct mlx5e_icosq_wqe_info {
 		struct {
 			struct mlx5e_nvmeotcp_queue *queue;
 		} nvmeotcp_q;
+		struct {
+			struct mlx5e_nvmeotcp_queue_entry *entry;
+		} nvmeotcp_qe;
 #endif
 	};
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
index 3f1c0e7682c3..b440ed10c373 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c
@@ -142,10 +142,11 @@ build_nvmeotcp_klm_umr(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe
 		       u16 ccid, int klm_entries, u32 klm_offset, u32 len,
 		       enum wqe_type klm_type)
 {
-	u32 id = (klm_type == KLM_UMR) ? queue->ccid_table[ccid].klm_mkey :
-		 (mlx5e_tir_get_tirn(&queue->tir) << MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT);
-	u8 opc_mod = (klm_type == KLM_UMR) ? MLX5_CTRL_SEGMENT_OPC_MOD_UMR_UMR :
-		MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS;
+	u32 id = (klm_type == BSF_KLM_UMR) ?
+		 (mlx5e_tir_get_tirn(&queue->tir) << MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT) :
+		 queue->ccid_table[ccid].klm_mkey;
+	u8 opc_mod = (klm_type == BSF_KLM_UMR) ? MLX5_OPC_MOD_TRANSPORT_TIR_STATIC_PARAMS :
+		MLX5_CTRL_SEGMENT_OPC_MOD_UMR_UMR;
 	u32 ds_cnt = MLX5E_KLM_UMR_DS_CNT(ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT));
 	struct mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl;
 	struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
@@ -158,6 +159,13 @@ build_nvmeotcp_klm_umr(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_umr_wqe
 	cseg->qpn_ds = cpu_to_be32((sqn << MLX5_WQE_CTRL_QPN_SHIFT) | ds_cnt);
 	cseg->general_id = cpu_to_be32(id);
 
+	if (!klm_entries) { /* this is invalidate */
+		ucseg->mkey_mask = cpu_to_be64(MLX5_MKEY_MASK_FREE);
+		ucseg->flags = MLX5_UMR_INLINE;
+		mkc->status = MLX5_MKEY_STATUS_FREE;
+		return;
+	}
+
 	if (klm_type == KLM_UMR && !klm_offset) {
 		ucseg->mkey_mask = cpu_to_be64(MLX5_MKEY_MASK_XLT_OCT_SIZE |
 					       MLX5_MKEY_MASK_LEN | MLX5_MKEY_MASK_FREE);
@@ -259,8 +267,8 @@ build_nvmeotcp_static_params(struct mlx5e_nvmeotcp_queue *queue,
 
 static void
 mlx5e_nvmeotcp_fill_wi(struct mlx5e_nvmeotcp_queue *nvmeotcp_queue,
-		       struct mlx5e_icosq *sq, u32 wqebbs, u16 pi,
-		       enum wqe_type type)
+		       struct mlx5e_icosq *sq, u32 wqebbs,
+		       u16 pi, u16 ccid, enum wqe_type type)
 {
 	struct mlx5e_icosq_wqe_info *wi = &sq->db.wqe_info[pi];
 
@@ -272,6 +280,10 @@ mlx5e_nvmeotcp_fill_wi(struct mlx5e_nvmeotcp_queue *nvmeotcp_queue,
 		wi->wqe_type = MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP;
 		wi->nvmeotcp_q.queue = nvmeotcp_queue;
 		break;
+	case KLM_INV_UMR:
+		wi->wqe_type = MLX5E_ICOSQ_WQE_UMR_NVMEOTCP_INVALIDATE;
+		wi->nvmeotcp_qe.entry = &nvmeotcp_queue->ccid_table[ccid];
+		break;
 	default:
 		/* cases where no further action is required upon completion, such as ddp setup */
 		wi->wqe_type = MLX5E_ICOSQ_WQE_UMR_NVMEOTCP;
@@ -290,7 +302,7 @@ mlx5e_nvmeotcp_rx_post_static_params_wqe(struct mlx5e_nvmeotcp_queue *queue,
 	u32 wqebbs = MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS;
 
 	pi = mlx5e_icosq_get_next_pi(sq, wqebbs);
 	wqe = MLX5E_TRANSPORT_FETCH_SET_STATIC_PARAMS_WQE(sq, pi);
-	mlx5e_nvmeotcp_fill_wi(NULL, sq, wqebbs, pi, BSF_UMR);
+	mlx5e_nvmeotcp_fill_wi(NULL, sq, wqebbs, pi, 0, BSF_UMR);
 	build_nvmeotcp_static_params(queue, wqe, resync_seq, queue->crc_rx);
 	sq->pc += wqebbs;
 	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &wqe->ctrl);
@@ -307,7 +319,7 @@ mlx5e_nvmeotcp_rx_post_progress_params_wqe(struct mlx5e_nvmeotcp_queue *queue, u
 	wqebbs = MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQEBBS;
 	pi = mlx5e_icosq_get_next_pi(sq, wqebbs);
 	wqe = MLX5E_NVMEOTCP_FETCH_PROGRESS_PARAMS_WQE(sq, pi);
-	mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, SET_PSV_UMR);
+	mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, 0, SET_PSV_UMR);
 	build_nvmeotcp_progress_params(queue, wqe, seq);
 	sq->pc += wqebbs;
 	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &wqe->ctrl);
@@ -330,7 +342,7 @@ post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue,
 	wqebbs = DIV_ROUND_UP(wqe_sz, MLX5_SEND_WQE_BB);
 	pi = mlx5e_icosq_get_next_pi(sq, wqebbs);
 	wqe = MLX5E_NVMEOTCP_FETCH_KLM_WQE(sq, pi);
-	mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, wqe_type);
+	mlx5e_nvmeotcp_fill_wi(queue, sq, wqebbs, pi, ccid, wqe_type);
 	build_nvmeotcp_klm_umr(queue, wqe, ccid, cur_klm_entries, klm_offset,
 			       klm_length, wqe_type);
 	sq->pc += wqebbs;
@@ -345,7 +357,10 @@ mlx5e_nvmeotcp_post_klm_wqe(struct mlx5e_nvmeotcp_queue *queue, enum wqe_type wq
 	struct mlx5e_icosq *sq = &queue->sq;
 	u32 klm_offset = 0, wqes, i;
 
-	wqes = DIV_ROUND_UP(klm_length, queue->max_klms_per_wqe);
+	if (wqe_type == KLM_INV_UMR)
+		wqes = 1;
+	else
+		wqes = DIV_ROUND_UP(klm_length, queue->max_klms_per_wqe);
 
 	spin_lock_bh(&queue->sq_lock);
 
@@ -842,12 +857,43 @@ void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi)
 	complete(&queue->static_params_done);
 }
 
+void mlx5e_nvmeotcp_ddp_inv_done(struct mlx5e_icosq_wqe_info *wi)
+{
+	struct mlx5e_nvmeotcp_queue_entry *q_entry = wi->nvmeotcp_qe.entry;
+	struct mlx5e_nvmeotcp_queue *queue = q_entry->queue;
+	struct mlx5_core_dev *mdev = queue->priv->mdev;
+	struct ulp_ddp_io *ddp = q_entry->ddp;
+	const struct ulp_ddp_ulp_ops *ulp_ops;
+
+	dma_unmap_sg(mdev->device, ddp->sg_table.sgl,
+		     q_entry->sgl_length, DMA_FROM_DEVICE);
+
+	q_entry->sgl_length = 0;
+
+	ulp_ops = inet_csk(queue->sk)->icsk_ulp_ddp_ops;
+	if (ulp_ops && ulp_ops->ddp_teardown_done)
+		ulp_ops->ddp_teardown_done(q_entry->ddp_ctx);
+}
+
 static void
 mlx5e_nvmeotcp_ddp_teardown(struct net_device *netdev,
 			    struct sock *sk,
 			    struct ulp_ddp_io *ddp,
 			    void *ddp_ctx)
 {
+	struct mlx5e_nvmeotcp_queue_entry *q_entry;
+	struct mlx5e_nvmeotcp_queue *queue;
+
+	queue = container_of(ulp_ddp_get_ctx(sk), struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx);
+	q_entry = &queue->ccid_table[ddp->command_id];
+	WARN_ONCE(q_entry->sgl_length == 0,
+		  "Invalidation of empty sgl (CID 0x%x, queue 0x%x)\n",
+		  ddp->command_id, queue->id);
+
+	q_entry->ddp_ctx = ddp_ctx;
+	q_entry->queue = queue;
+
+	mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_INV_UMR, ddp->command_id, 0);
 }
 
 static void
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
index 555f3ed7e2e2..a5cfd9e31be7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h
@@ -109,6 +109,7 @@ void mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv);
 struct mlx5e_nvmeotcp_queue *
 mlx5e_nvmeotcp_get_queue(struct mlx5e_nvmeotcp *nvmeotcp, int id);
 void mlx5e_nvmeotcp_put_queue(struct mlx5e_nvmeotcp_queue *queue);
+void mlx5e_nvmeotcp_ddp_inv_done(struct mlx5e_icosq_wqe_info *wi);
 void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi);
 static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {}
 void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 74b45c94d1cb..1b3660d05350 100644
---
a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -891,6 +891,9 @@ void mlx5e_free_icosq_descs(struct mlx5e_icosq *sq)
 			break;
 #endif
 #ifdef CONFIG_MLX5_EN_NVMEOTCP
+		case MLX5E_ICOSQ_WQE_UMR_NVMEOTCP_INVALIDATE:
+			mlx5e_nvmeotcp_ddp_inv_done(wi);
+			break;
 		case MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP:
 			mlx5e_nvmeotcp_ctx_complete(wi);
 			break;
@@ -996,6 +999,9 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq, int budget)
 #ifdef CONFIG_MLX5_EN_NVMEOTCP
 		case MLX5E_ICOSQ_WQE_UMR_NVMEOTCP:
 			break;
+		case MLX5E_ICOSQ_WQE_UMR_NVMEOTCP_INVALIDATE:
+			mlx5e_nvmeotcp_ddp_inv_done(wi);
+			break;
 		case MLX5E_ICOSQ_WQE_SET_PSV_NVMEOTCP:
 			mlx5e_nvmeotcp_ctx_complete(wi);
 			break;

From patchwork Mon Jan 9 13:31:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13093591
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Ben Ben-Ishay, Aurelien Aptel, aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v8 24/25] net/mlx5e: NVMEoTCP, data-path for DDP+DDGST offload
Date: Mon, 9 Jan 2023 15:31:15 +0200
Message-Id: <20230109133116.20801-25-aaptel@nvidia.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com>
References: <20230109133116.20801-1-aaptel@nvidia.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Ben Ben-Ishay

This patch implements the data-path for direct data placement (DDP) and DDGST offloads. NVMEoTCP DDP constructs an SKB from each CQE, while pointing at NVMe destination buffers. In turn, this enables the offload, as the NVMe-TCP layer will skip the copy when src == dst.

Additionally, this patch adds support for DDGST (CRC32) offload. HW will report DDGST offload only if it has not encountered an error in the received packet. We pass this indication in skb->ulp_crc up the stack to NVMe-TCP to skip computing the DDGST if all corresponding SKBs were verified by HW.

This patch also handles context resynchronization requests made by the NIC HW. The resync request is passed to the NVMe-TCP layer to be handled at a later point in time.

Finally, we also use the skb->ulp_ddp bit to avoid skb_condense. This is critical, as every SKB that uses DDP has a hole that fits perfectly with skb_condense's policy, but filling this hole is counter-productive as the data there already resides in its destination buffer.

This work has been done on a pre-silicon functional simulator, and hence data-path performance numbers are not provided.
Signed-off-by: Ben Ben-Ishay Signed-off-by: Boris Pismenny Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Aurelien Aptel Reviewed-by: Tariq Toukan --- .../net/ethernet/mellanox/mlx5/core/Makefile | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 1 + .../ethernet/mellanox/mlx5/core/en/xsk/rx.h | 1 + .../mlx5/core/en_accel/nvmeotcp_rxtx.c | 316 ++++++++++++++++++ .../mlx5/core/en_accel/nvmeotcp_rxtx.h | 37 ++ .../net/ethernet/mellanox/mlx5/core/en_rx.c | 51 ++- 7 files changed, 395 insertions(+), 14 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 9df9999047d1..9804bd086bf4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -103,7 +103,7 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \ en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \ en_accel/ktls_tx.o en_accel/ktls_rx.o -mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o +mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o en_accel/nvmeotcp_rxtx.o mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \ steering/dr_matcher.o steering/dr_rule.o \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 217b5480c3aa..dfd0f1aa650c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -628,6 +628,7 @@ struct mlx5e_rq; typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe_mpwrq)(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 
cqe_bcnt, u32 head_offset, u32 page_idx); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe)(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c index c91b54d9ff27..03b416989bba 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c @@ -229,6 +229,7 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h index 087c943bd8e9..22e972398d92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h @@ -13,6 +13,7 @@ int mlx5e_xsk_alloc_rx_wqes_batched(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); int mlx5e_xsk_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c new file mode 100644 index 000000000000..4c7dab28ef56 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c @@ -0,0 +1,316 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
+ +#include "en_accel/nvmeotcp_rxtx.h" +#include + +#define MLX5E_TC_FLOW_ID_MASK 0x00ffffff +static void nvmeotcp_update_resync(struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_cqe128 *cqe128) +{ + const struct ulp_ddp_ulp_ops *ulp_ops; + u32 seq; + + seq = be32_to_cpu(cqe128->resync_tcp_sn); + ulp_ops = inet_csk(queue->sk)->icsk_ulp_ddp_ops; + if (ulp_ops && ulp_ops->resync_request) + ulp_ops->resync_request(queue->sk, seq, ULP_DDP_RESYNC_PENDING); +} + +static void mlx5e_nvmeotcp_advance_sgl_iter(struct mlx5e_nvmeotcp_queue *queue) +{ + struct mlx5e_nvmeotcp_queue_entry *nqe = &queue->ccid_table[queue->ccid]; + + queue->ccoff += nqe->sgl[queue->ccsglidx].length; + queue->ccoff_inner = 0; + queue->ccsglidx++; +} + +static inline void +mlx5e_nvmeotcp_add_skb_frag(struct net_device *netdev, struct sk_buff *skb, + struct mlx5e_nvmeotcp_queue *queue, + struct mlx5e_nvmeotcp_queue_entry *nqe, u32 fragsz) +{ + dma_sync_single_for_cpu(&netdev->dev, + nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner, + fragsz, DMA_FROM_DEVICE); + page_ref_inc(compound_head(sg_page(&nqe->sgl[queue->ccsglidx]))); + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, + sg_page(&nqe->sgl[queue->ccsglidx]), + nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner, + fragsz, + fragsz); +} + +static inline void +mlx5_nvmeotcp_add_tail_nonlinear(struct mlx5e_nvmeotcp_queue *queue, + struct sk_buff *skb, skb_frag_t *org_frags, + int org_nr_frags, int frag_index) +{ + while (org_nr_frags != frag_index) { + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, + skb_frag_page(&org_frags[frag_index]), + skb_frag_off(&org_frags[frag_index]), + skb_frag_size(&org_frags[frag_index]), + skb_frag_size(&org_frags[frag_index])); + page_ref_inc(skb_frag_page(&org_frags[frag_index])); + frag_index++; + } +} + +static void +mlx5_nvmeotcp_add_tail(struct mlx5e_nvmeotcp_queue *queue, struct sk_buff *skb, + int offset, int len) +{ + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, virt_to_page(skb->data), offset, 
len, + len); + page_ref_inc(virt_to_page(skb->data)); +} + +static void mlx5_nvmeotcp_trim_nonlinear(struct sk_buff *skb, skb_frag_t *org_frags, + int *frag_index, int remaining) +{ + unsigned int frag_size; + int nr_frags; + + /* skip @remaining bytes in frags */ + *frag_index = 0; + while (remaining) { + frag_size = skb_frag_size(&skb_shinfo(skb)->frags[*frag_index]); + if (frag_size > remaining) { + skb_frag_off_add(&skb_shinfo(skb)->frags[*frag_index], + remaining); + skb_frag_size_sub(&skb_shinfo(skb)->frags[*frag_index], + remaining); + remaining = 0; + } else { + remaining -= frag_size; + skb_frag_unref(skb, *frag_index); + *frag_index += 1; + } + } + + /* save original frags for the tail and unref */ + nr_frags = skb_shinfo(skb)->nr_frags; + memcpy(&org_frags[*frag_index], &skb_shinfo(skb)->frags[*frag_index], + (nr_frags - *frag_index) * sizeof(skb_frag_t)); + while (--nr_frags >= *frag_index) + skb_frag_unref(skb, nr_frags); + + /* remove frags from skb */ + skb_shinfo(skb)->nr_frags = 0; + skb->len -= skb->data_len; + skb->truesize -= skb->data_len; + skb->data_len = 0; +} + +static bool +mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; + struct net_device *netdev = rq->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_nvmeotcp_queue_entry *nqe; + skb_frag_t org_frags[MAX_SKB_FRAGS]; + struct mlx5e_nvmeotcp_queue *queue; + int org_nr_frags, frag_index; + struct mlx5e_cqe128 *cqe128; + u32 queue_id; + + queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK); + queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); + if (unlikely(!queue)) { + dev_kfree_skb_any(skb); + return false; + } + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + if (cqe_is_nvmeotcp_resync(cqe)) { + nvmeotcp_update_resync(queue, cqe128); + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + 
/* If a resync occurred in the previous cqe, + * the current cqe.crcvalid bit may not be valid, + * so we will treat it as 0 + */ + if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) { + skb->ulp_crc = 0; + queue->after_resync_cqe = 0; + } else { + if (queue->crc_rx) + skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe); + } + + skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe); + if (!cqe_is_nvmeotcp_zc(cqe)) { + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* cc ddp from cqe */ + ccid = be16_to_cpu(cqe128->ccid); + ccoff = be32_to_cpu(cqe128->ccoff); + cclen = be16_to_cpu(cqe128->cclen); + hlen = be16_to_cpu(cqe128->hlen); + + /* carve a hole in the skb for DDP data */ + org_nr_frags = skb_shinfo(skb)->nr_frags; + mlx5_nvmeotcp_trim_nonlinear(skb, org_frags, &frag_index, cclen); + nqe = &queue->ccid_table[ccid]; + + /* packet starts new ccid? */ + if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) { + queue->ccid = ccid; + queue->ccoff = 0; + queue->ccoff_inner = 0; + queue->ccsglidx = 0; + queue->ccid_gen = nqe->ccid_gen; + } + + /* skip inside cc until the ccoff in the cqe */ + while (queue->ccoff + queue->ccoff_inner < ccoff) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(off_t, remaining, + ccoff - (queue->ccoff + queue->ccoff_inner)); + + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + /* adjust the skb according to the cqe cc */ + while (to_copy < cclen) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(int, remaining, cclen - to_copy); + + mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz); + to_copy += fragsz; + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + if (cqe_bcnt > hlen + cclen) { + remaining = cqe_bcnt - hlen - cclen; + mlx5_nvmeotcp_add_tail_nonlinear(queue, skb, org_frags, + org_nr_frags, + frag_index); 
+ } + + mlx5e_nvmeotcp_put_queue(queue); + return true; +} + +static bool +mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; + struct net_device *netdev = rq->netdev; + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_nvmeotcp_queue_entry *nqe; + struct mlx5e_nvmeotcp_queue *queue; + struct mlx5e_cqe128 *cqe128; + u32 queue_id; + + queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK); + queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); + if (unlikely(!queue)) { + dev_kfree_skb_any(skb); + return false; + } + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + if (cqe_is_nvmeotcp_resync(cqe)) { + nvmeotcp_update_resync(queue, cqe128); + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* If a resync occurred in the previous cqe, + * the current cqe.crcvalid bit may not be valid, + * so we will treat it as 0 + */ + if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) { + skb->ulp_crc = 0; + queue->after_resync_cqe = 0; + } else { + if (queue->crc_rx) + skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe); + } + + skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe); + if (!cqe_is_nvmeotcp_zc(cqe)) { + mlx5e_nvmeotcp_put_queue(queue); + return true; + } + + /* cc ddp from cqe */ + ccid = be16_to_cpu(cqe128->ccid); + ccoff = be32_to_cpu(cqe128->ccoff); + cclen = be16_to_cpu(cqe128->cclen); + hlen = be16_to_cpu(cqe128->hlen); + + /* carve a hole in the skb for DDP data */ + skb_trim(skb, hlen); + nqe = &queue->ccid_table[ccid]; + + /* packet starts new ccid? 
*/ + if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) { + queue->ccid = ccid; + queue->ccoff = 0; + queue->ccoff_inner = 0; + queue->ccsglidx = 0; + queue->ccid_gen = nqe->ccid_gen; + } + + /* skip inside cc until the ccoff in the cqe */ + while (queue->ccoff + queue->ccoff_inner < ccoff) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(off_t, remaining, + ccoff - (queue->ccoff + queue->ccoff_inner)); + + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + /* adjust the skb according to the cqe cc */ + while (to_copy < cclen) { + remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner; + fragsz = min_t(int, remaining, cclen - to_copy); + + mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz); + to_copy += fragsz; + if (fragsz == remaining) + mlx5e_nvmeotcp_advance_sgl_iter(queue); + else + queue->ccoff_inner += fragsz; + } + + if (cqe_bcnt > hlen + cclen) { + remaining = cqe_bcnt - hlen - cclen; + mlx5_nvmeotcp_add_tail(queue, skb, + offset_in_page(skb->data) + + hlen + cclen, remaining); + } + + mlx5e_nvmeotcp_put_queue(queue); + return true; +} + +bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + if (skb->data_len) + return mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(rq, skb, cqe, cqe_bcnt); + else + return mlx5e_nvmeotcp_rebuild_rx_skb_linear(rq, skb, cqe, cqe_bcnt); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h new file mode 100644 index 000000000000..a8ca8a53bac6 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
*/ +#ifndef __MLX5E_NVMEOTCP_RXTX_H__ +#define __MLX5E_NVMEOTCP_RXTX_H__ + +#ifdef CONFIG_MLX5_EN_NVMEOTCP + +#include +#include "en_accel/nvmeotcp.h" + +bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt); + +static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + struct mlx5e_cqe128 *cqe128; + + if (!cqe_is_nvmeotcp_zc(cqe)) + return cqe_bcnt; + + cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); + return be16_to_cpu(cqe128->hlen); +} + +#else + +static inline bool +mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ return true; } + +static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ return cqe_bcnt; } + +#endif /* CONFIG_MLX5_EN_NVMEOTCP */ +#endif /* __MLX5E_NVMEOTCP_RXTX_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 1b3660d05350..d6dd611c1ebf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -53,7 +53,7 @@ #include "en_accel/macsec.h" #include "en_accel/ipsec_rxtx.h" #include "en_accel/ktls_txrx.h" -#include "en_accel/nvmeotcp.h" +#include "en_accel/nvmeotcp_rxtx.h" #include "en/xdp.h" #include "en/xsk/rx.h" #include "en/health.h" @@ -63,9 +63,11 @@ static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); @@ -1484,7 +1486,7 @@ static inline void 
mlx5e_handle_csum(struct net_device *netdev, #define MLX5E_CE_BIT_MASK 0x80 -static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, +static inline bool mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, u32 cqe_bcnt, struct mlx5e_rq *rq, struct sk_buff *skb) @@ -1495,6 +1497,13 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, skb->mac_len = ETH_HLEN; + if (IS_ENABLED(CONFIG_MLX5_EN_NVMEOTCP) && cqe_is_nvmeotcp(cqe)) { + bool ret = mlx5e_nvmeotcp_rebuild_rx_skb(rq, skb, cqe, cqe_bcnt); + + if (unlikely(!ret)) + return ret; + } + if (unlikely(get_cqe_tls_offload(cqe))) mlx5e_ktls_handle_rx_skb(rq, skb, cqe, &cqe_bcnt); @@ -1540,6 +1549,8 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, if (unlikely(mlx5e_skb_is_multicast(skb))) stats->mcast_packets++; + + return true; } static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq, @@ -1563,7 +1574,7 @@ static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq, } } -static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, +static inline bool mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, u32 cqe_bcnt, struct sk_buff *skb) @@ -1572,7 +1583,7 @@ static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq, stats->packets++; stats->bytes += cqe_bcnt; - mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb); + return mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb); } static inline @@ -1810,7 +1821,8 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) goto free_wqe; } - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; if (mlx5e_cqe_regb_chain(cqe)) if (!mlx5e_tc_update_skb(cqe, skb)) { @@ -1863,7 +1875,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) goto free_wqe; } - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; if (rep->vlan && skb_vlan_tag_present(skb)) 
skb_vlan_pop(skb); @@ -1910,11 +1923,12 @@ static void mlx5e_handle_rx_cqe_mpwrq_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 skb = INDIRECT_CALL_2(rq->mpwqe.skb_from_cqe_mpwrq, mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, page_idx); if (!skb) goto mpwrq_cqe_out; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto mpwrq_cqe_out; mlx5e_rep_tc_receive(cqe, rq, skb); @@ -1959,12 +1973,18 @@ mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq, } } +static inline u16 mlx5e_get_headlen_hint(struct mlx5_cqe64 *cqe, u32 cqe_bcnt) +{ + return min_t(u32, MLX5E_RX_MAX_HEAD, mlx5_nvmeotcp_get_headlen(cqe, cqe_bcnt)); +} + static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; - u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); + u16 headlen = mlx5e_get_headlen_hint(cqe, cqe_bcnt); u32 frag_offset = head_offset + headlen; u32 byte_cnt = cqe_bcnt - headlen; union mlx5e_alloc_unit *head_au = au; @@ -2000,6 +2020,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; @@ -2195,7 +2216,8 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq if (likely(head_size)) *skb = mlx5e_skb_from_cqe_shampo(rq, wi, cqe, header_index); else - *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe_bcnt, data_offset, + *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe, + cqe_bcnt, data_offset, page_idx); if (unlikely(!*skb)) goto free_hd_entry; 
@@ -2270,11 +2292,12 @@ static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cq mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, mlx5e_xsk_skb_from_cqe_mpwrq_linear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, page_idx); if (!skb) goto mpwrq_cqe_out; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto mpwrq_cqe_out; if (mlx5e_cqe_regb_chain(cqe)) if (!mlx5e_tc_update_skb(cqe, skb)) { @@ -2611,7 +2634,9 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe if (!skb) goto free_wqe; - mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); + if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb))) + goto free_wqe; + skb_push(skb, ETH_HLEN); dl_port = mlx5e_devlink_get_dl_port(priv); From patchwork Mon Jan 9 13:31:16 2023 X-Patchwork-Submitter: Aurelien Aptel X-Patchwork-Id: 13093590 X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org Cc: Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com Subject: [PATCH v8 25/25] net/mlx5e: NVMEoTCP, statistics Date: Mon, 9 Jan 2023 15:31:16 +0200 Message-Id: <20230109133116.20801-26-aaptel@nvidia.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230109133116.20801-1-aaptel@nvidia.com> References: <20230109133116.20801-1-aaptel@nvidia.com> MIME-Version: 1.0
Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org NVMEoTCP offload statistics include both control and data path statistics: counters for the netdev ddp ops, offloaded packets/bytes, and dropped packets. Expose the statistics using the new ethtool_ops->get_ulp_ddp_stats() and the new ETH_SS_ULP_DDP_STATS string set instead of the regular statistics flow. Signed-off-by: Ben Ben-Ishay Signed-off-by: Boris Pismenny Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack Signed-off-by: Shai Malin Signed-off-by: Aurelien Aptel Reviewed-by: Tariq Toukan --- .../net/ethernet/mellanox/mlx5/core/Makefile | 3 +- .../mellanox/mlx5/core/en_accel/nvmeotcp.c | 43 +++++-- .../mellanox/mlx5/core/en_accel/nvmeotcp.h | 19 +++ .../mlx5/core/en_accel/nvmeotcp_rxtx.c | 11 +- .../mlx5/core/en_accel/nvmeotcp_stats.c | 108 ++++++++++++++++++ .../ethernet/mellanox/mlx5/core/en_ethtool.c | 17 +++ .../ethernet/mellanox/mlx5/core/en_stats.c | 17 +++ .../ethernet/mellanox/mlx5/core/en_stats.h | 10 ++ 8 files changed, 217 insertions(+), 11 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_stats.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 9804bd086bf4..d48bde4ca8de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -103,7 +103,8 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \ en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \ en_accel/ktls_tx.o en_accel/ktls_rx.o -mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o en_accel/nvmeotcp_rxtx.o
+mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o \ + en_accel/nvmeotcp_rxtx.o en_accel/nvmeotcp_stats.o mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \ steering/dr_matcher.o steering/dr_rule.o \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c index b440ed10c373..6e07cea438ba 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -614,9 +614,15 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, { struct nvme_tcp_ddp_config *config = &tconfig->nvmeotcp; struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_nvmeotcp_sw_stats *sw_stats; struct mlx5_core_dev *mdev = priv->mdev; struct mlx5e_nvmeotcp_queue *queue; int queue_id, err; + u32 channel_ix; + + channel_ix = mlx5e_get_channel_ix_from_io_cpu(&priv->channels.params, + config->io_cpu); + sw_stats = &priv->nvmeotcp->sw_stats; if (tconfig->type != ULP_DDP_NVME) { err = -EOPNOTSUPP; @@ -643,11 +649,11 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, queue->id = queue_id; queue->dgst = config->dgst; queue->pda = config->cpda; - queue->channel_ix = mlx5e_get_channel_ix_from_io_cpu(&priv->channels.params, - config->io_cpu); + queue->channel_ix = channel_ix; queue->size = config->queue_size; queue->max_klms_per_wqe = MLX5E_MAX_KLM_PER_WQE(mdev); queue->priv = priv; + queue->sw_stats = sw_stats; init_completion(&queue->static_params_done); err = mlx5e_nvmeotcp_queue_rx_init(queue, config, netdev); @@ -659,6 +665,7 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, if (err) goto destroy_rx; + atomic64_inc(&sw_stats->rx_nvmeotcp_sk_add); write_lock_bh(&sk->sk_callback_lock); ulp_ddp_set_ctx(sk, queue); write_unlock_bh(&sk->sk_callback_lock); @@ -672,6 +679,7 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, free_queue: kfree(queue); out: + 
atomic64_inc(&sw_stats->rx_nvmeotcp_sk_add_fail); return err; } @@ -685,6 +693,8 @@ mlx5e_nvmeotcp_queue_teardown(struct net_device *netdev, queue = container_of(ulp_ddp_get_ctx(sk), struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx); + atomic64_inc(&queue->sw_stats->rx_nvmeotcp_sk_del); + WARN_ON(refcount_read(&queue->ref_count) != 1); mlx5e_nvmeotcp_destroy_rx(priv, queue, mdev); @@ -816,25 +826,35 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, struct ulp_ddp_io *ddp) { struct scatterlist *sg = ddp->sg_table.sgl; + struct mlx5e_nvmeotcp_sw_stats *sw_stats; struct mlx5e_nvmeotcp_queue_entry *nvqt; struct mlx5e_nvmeotcp_queue *queue; struct mlx5_core_dev *mdev; int i, size = 0, count = 0; + int ret = 0; queue = container_of(ulp_ddp_get_ctx(sk), struct mlx5e_nvmeotcp_queue, ulp_ddp_ctx); + sw_stats = queue->sw_stats; mdev = queue->priv->mdev; count = dma_map_sg(mdev->device, ddp->sg_table.sgl, ddp->nents, DMA_FROM_DEVICE); - if (count <= 0) - return -EINVAL; + if (count <= 0) { + ret = -EINVAL; + goto ddp_setup_fail; + } + atomic64_inc(&sw_stats->rx_nvmeotcp_ddp_setup); - if (WARN_ON(count > mlx5e_get_max_sgl(mdev))) - return -ENOSPC; + if (WARN_ON(count > mlx5e_get_max_sgl(mdev))) { + ret = -ENOSPC; + goto ddp_setup_fail; + } - if (!mlx5e_nvmeotcp_validate_sgl(sg, count, READ_ONCE(netdev->mtu))) - return -EOPNOTSUPP; + if (!mlx5e_nvmeotcp_validate_sgl(sg, count, READ_ONCE(netdev->mtu))) { + ret = -EOPNOTSUPP; + goto ddp_setup_fail; + } for (i = 0; i < count; i++) size += sg_dma_len(&sg[i]); @@ -846,8 +866,12 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, nvqt->ccid_gen++; nvqt->sgl_length = count; mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, count); - return 0; + +ddp_setup_fail: + dma_unmap_sg(mdev->device, ddp->sg_table.sgl, count, DMA_FROM_DEVICE); + atomic64_inc(&sw_stats->rx_nvmeotcp_ddp_setup_fail); + return ret; } void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi) @@ -894,6 +918,7 @@ mlx5e_nvmeotcp_ddp_teardown(struct 
net_device *netdev, q_entry->queue = queue; mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_INV_UMR, ddp->command_id, 0); + atomic64_inc(&queue->sw_stats->rx_nvmeotcp_ddp_teardown); } static void diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h index a5cfd9e31be7..2d6a12b40429 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h @@ -9,6 +9,15 @@ #include "en.h" #include "en/params.h" +struct mlx5e_nvmeotcp_sw_stats { + atomic64_t rx_nvmeotcp_sk_add; + atomic64_t rx_nvmeotcp_sk_add_fail; + atomic64_t rx_nvmeotcp_sk_del; + atomic64_t rx_nvmeotcp_ddp_setup; + atomic64_t rx_nvmeotcp_ddp_setup_fail; + atomic64_t rx_nvmeotcp_ddp_teardown; +}; + struct mlx5e_nvmeotcp_queue_entry { struct mlx5e_nvmeotcp_queue *queue; u32 sgl_length; @@ -52,6 +61,7 @@ struct mlx5e_nvmeotcp_queue_handler { * @sk: The socket used by the NVMe-TCP queue * @crc_rx: CRC Rx offload indication for this queue * @priv: mlx5e netdev priv + * @sw_stats: Global software statistics for nvmeotcp offload * @static_params_done: Async completion structure for the initial umr mapping * synchronization * @sq_lock: Spin lock for the icosq @@ -88,6 +98,7 @@ struct mlx5e_nvmeotcp_queue { u8 crc_rx:1; /* for ddp invalidate flow */ struct mlx5e_priv *priv; + struct mlx5e_nvmeotcp_sw_stats *sw_stats; /* end of data-path section */ struct completion static_params_done; @@ -97,6 +108,7 @@ struct mlx5e_nvmeotcp_queue { }; struct mlx5e_nvmeotcp { + struct mlx5e_nvmeotcp_sw_stats sw_stats; struct ida queue_ids; struct rhashtable queue_hash; bool enabled; @@ -113,6 +125,9 @@ void mlx5e_nvmeotcp_ddp_inv_done(struct mlx5e_icosq_wqe_info *wi); void mlx5e_nvmeotcp_ctx_complete(struct mlx5e_icosq_wqe_info *wi); static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {} void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv); +int 
mlx5e_nvmeotcp_get_count(struct mlx5e_priv *priv); +int mlx5e_nvmeotcp_get_strings(struct mlx5e_priv *priv, uint8_t *data); +int mlx5e_nvmeotcp_get_stats(struct mlx5e_priv *priv, u64 *data); extern const struct ulp_ddp_dev_ops mlx5e_nvmeotcp_ops; #else @@ -122,5 +137,9 @@ static inline void mlx5e_nvmeotcp_cleanup(struct mlx5e_priv *priv) {} static inline int set_ulp_ddp_nvme_tcp(struct net_device *dev, bool en) { return -EOPNOTSUPP; } static inline void mlx5e_nvmeotcp_init_rx(struct mlx5e_priv *priv) {} static inline void mlx5e_nvmeotcp_cleanup_rx(struct mlx5e_priv *priv) {} +static inline int mlx5e_nvmeotcp_get_count(struct mlx5e_priv *priv) { return 0; } +static inline int mlx5e_nvmeotcp_get_strings(struct mlx5e_priv *priv, uint8_t *data) +{ return 0; } +static inline int mlx5e_nvmeotcp_get_stats(struct mlx5e_priv *priv, u64 *data) { return 0; } #endif #endif /* __MLX5E_NVMEOTCP_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c index 4c7dab28ef56..4f23f6e396b7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c @@ -111,6 +111,7 @@ mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; struct net_device *netdev = rq->netdev; struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_rq_stats *stats = rq->stats; struct mlx5e_nvmeotcp_queue_entry *nqe; skb_frag_t org_frags[MAX_SKB_FRAGS]; struct mlx5e_nvmeotcp_queue *queue; @@ -122,12 +123,14 @@ mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); if (unlikely(!queue)) { dev_kfree_skb_any(skb); + stats->nvmeotcp_drop++; return false; } cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); if (cqe_is_nvmeotcp_resync(cqe)) { 
nvmeotcp_update_resync(queue, cqe128); + stats->nvmeotcp_resync++; mlx5e_nvmeotcp_put_queue(queue); return true; } @@ -201,7 +204,8 @@ mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb org_nr_frags, frag_index); } - + stats->nvmeotcp_offload_packets++; + stats->nvmeotcp_offload_bytes += cclen; mlx5e_nvmeotcp_put_queue(queue); return true; } @@ -213,6 +217,7 @@ mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb, int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0; struct net_device *netdev = rq->netdev; struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_rq_stats *stats = rq->stats; struct mlx5e_nvmeotcp_queue_entry *nqe; struct mlx5e_nvmeotcp_queue *queue; struct mlx5e_cqe128 *cqe128; @@ -222,12 +227,14 @@ mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb, queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id); if (unlikely(!queue)) { dev_kfree_skb_any(skb); + stats->nvmeotcp_drop++; return false; } cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64); if (cqe_is_nvmeotcp_resync(cqe)) { nvmeotcp_update_resync(queue, cqe128); + stats->nvmeotcp_resync++; mlx5e_nvmeotcp_put_queue(queue); return true; } @@ -301,6 +308,8 @@ mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb, hlen + cclen, remaining); } + stats->nvmeotcp_offload_packets++; + stats->nvmeotcp_offload_bytes += cclen; mlx5e_nvmeotcp_put_queue(queue); return true; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_stats.c new file mode 100644 index 000000000000..8e800886cf27 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_stats.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
+ +#include "en_accel/nvmeotcp.h" + +/* Global counters */ +static const struct counter_desc nvmeotcp_sw_stats_desc[] = { + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_sk_add) }, + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_sk_add_fail) }, + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_sk_del) }, + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_ddp_setup) }, + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_ddp_setup_fail) }, + { MLX5E_DECLARE_STAT(struct mlx5e_nvmeotcp_sw_stats, rx_nvmeotcp_ddp_teardown) }, +}; + +/* Per-rx-queue counters */ +static const struct counter_desc nvmeotcp_rq_stats_desc[] = { + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_drop) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_resync) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_offload_packets) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_offload_bytes) }, +}; + +/* Names of sums of the per-rx-queue counters + * + * The per-queue desc have the queue number in their name, so we + * cannot use them for the sums. We don't store the sums in sw_stats + * so there are no struct offsets to specify. 
+ */ +static const char *const nvmeotcp_rq_sum_names[] = { + "rx_nvmeotcp_drop", + "rx_nvmeotcp_resync", + "rx_nvmeotcp_offload_packets", + "rx_nvmeotcp_offload_bytes", +}; + +static_assert(ARRAY_SIZE(nvmeotcp_rq_stats_desc) == ARRAY_SIZE(nvmeotcp_rq_sum_names)); + +#define MLX5E_READ_CTR_ATOMIC64(ptr, dsc, i) \ + atomic64_read((atomic64_t *)((char *)(ptr) + (dsc)[i].offset)) + +int mlx5e_nvmeotcp_get_count(struct mlx5e_priv *priv) +{ + int max_nch = priv->stats_nch; + + if (!priv->nvmeotcp) + return 0; + + return ARRAY_SIZE(nvmeotcp_sw_stats_desc) + + ARRAY_SIZE(nvmeotcp_rq_stats_desc) + + (max_nch * ARRAY_SIZE(nvmeotcp_rq_stats_desc)); +} + +int mlx5e_nvmeotcp_get_strings(struct mlx5e_priv *priv, uint8_t *data) +{ + unsigned int i, ch, n = 0, idx = 0; + + if (!priv->nvmeotcp) + return 0; + + /* global counters */ + for (i = 0; i < ARRAY_SIZE(nvmeotcp_sw_stats_desc); i++, n++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, + nvmeotcp_sw_stats_desc[i].format); + + /* summed per-rx-queue counters */ + for (i = 0; i < ARRAY_SIZE(nvmeotcp_rq_stats_desc); i++, n++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, + nvmeotcp_rq_sum_names[i]); + + /* per-rx-queue counters */ + for (ch = 0; ch < priv->stats_nch; ch++) + for (i = 0; i < ARRAY_SIZE(nvmeotcp_rq_stats_desc); i++, n++) + sprintf(data + (idx++) * ETH_GSTRING_LEN, + nvmeotcp_rq_stats_desc[i].format, ch); + + return n; +} + +int mlx5e_nvmeotcp_get_stats(struct mlx5e_priv *priv, u64 *data) +{ + unsigned int i, ch, n = 0, idx = 0, sum_start = 0; + + if (!priv->nvmeotcp) + return 0; + + /* global counters */ + for (i = 0; i < ARRAY_SIZE(nvmeotcp_sw_stats_desc); i++, n++) + data[idx++] = MLX5E_READ_CTR_ATOMIC64(&priv->nvmeotcp->sw_stats, + nvmeotcp_sw_stats_desc, i); + + /* summed per-rx-queue counters */ + sum_start = idx; + for (i = 0; i < ARRAY_SIZE(nvmeotcp_rq_stats_desc); i++, n++) + data[idx++] = 0; + + /* per-rx-queue counters */ + for (ch = 0; ch < priv->stats_nch; ch++) { + for (i = 0; i < 
ARRAY_SIZE(nvmeotcp_rq_stats_desc); i++, n++) { + u64 v = MLX5E_READ_CTR64_CPU(&priv->channel_stats[ch]->rq, + nvmeotcp_rq_stats_desc, i); + data[idx++] = v; + data[sum_start + i] += v; + } + } + + return n; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 7f763152f989..dc9fc48eff12 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -246,6 +246,8 @@ int mlx5e_ethtool_get_sset_count(struct mlx5e_priv *priv, int sset) return MLX5E_NUM_PFLAGS; case ETH_SS_TEST: return mlx5e_self_test_num(priv); + case ETH_SS_ULP_DDP_STATS: + return mlx5e_stats_ulp_ddp_total_num(priv); default: return -EOPNOTSUPP; } @@ -276,6 +278,10 @@ void mlx5e_ethtool_get_strings(struct mlx5e_priv *priv, u32 stringset, u8 *data) case ETH_SS_STATS: mlx5e_stats_fill_strings(priv, data); break; + + case ETH_SS_ULP_DDP_STATS: + mlx5e_stats_ulp_ddp_fill_strings(priv, data); + break; } } @@ -2400,6 +2406,16 @@ static void mlx5e_get_rmon_stats(struct net_device *netdev, } #ifdef CONFIG_MLX5_EN_NVMEOTCP +static int mlx5e_get_ulp_ddp_stats(struct net_device *netdev, + u64 *stats) +{ + struct mlx5e_priv *priv = netdev_priv(netdev); + + mlx5e_stats_ulp_ddp_get(priv, stats); + + return 0; +} + static int mlx5e_set_ulp_ddp_capabilities(struct net_device *netdev, unsigned long *new_caps) { struct mlx5e_priv *priv = netdev_priv(netdev); @@ -2501,6 +2517,7 @@ const struct ethtool_ops mlx5e_ethtool_ops = { .get_rmon_stats = mlx5e_get_rmon_stats, .get_link_ext_stats = mlx5e_get_link_ext_stats, #ifdef CONFIG_MLX5_EN_NVMEOTCP + .get_ulp_ddp_stats = mlx5e_get_ulp_ddp_stats, .set_ulp_ddp_capabilities = mlx5e_set_ulp_ddp_capabilities, #endif }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 6687b8136e44..811f71ed8153 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ 
b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -2497,3 +2497,20 @@ unsigned int mlx5e_nic_stats_grps_num(struct mlx5e_priv *priv) { return ARRAY_SIZE(mlx5e_nic_stats_grps); } + +/* ULP DDP stats */ + +unsigned int mlx5e_stats_ulp_ddp_total_num(struct mlx5e_priv *priv) +{ + return mlx5e_nvmeotcp_get_count(priv); +} + +void mlx5e_stats_ulp_ddp_fill_strings(struct mlx5e_priv *priv, u8 *data) +{ + mlx5e_nvmeotcp_get_strings(priv, data); +} + +void mlx5e_stats_ulp_ddp_get(struct mlx5e_priv *priv, u64 *stats) +{ + mlx5e_nvmeotcp_get_stats(priv, stats); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 375752d6546d..1b2a2c7de824 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -129,6 +129,10 @@ void mlx5e_stats_rmon_get(struct mlx5e_priv *priv, void mlx5e_get_link_ext_stats(struct net_device *dev, struct ethtool_link_ext_stats *stats); +unsigned int mlx5e_stats_ulp_ddp_total_num(struct mlx5e_priv *priv); +void mlx5e_stats_ulp_ddp_fill_strings(struct mlx5e_priv *priv, u8 *data); +void mlx5e_stats_ulp_ddp_get(struct mlx5e_priv *priv, u64 *stats); + /* Concrete NIC Stats */ struct mlx5e_sw_stats { @@ -395,6 +399,12 @@ struct mlx5e_rq_stats { u64 tls_resync_res_skip; u64 tls_err; #endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + u64 nvmeotcp_drop; + u64 nvmeotcp_resync; + u64 nvmeotcp_offload_packets; + u64 nvmeotcp_offload_bytes; +#endif }; struct mlx5e_sq_stats {