From patchwork Tue Jan 14 15:27:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Yoong Siang X-Patchwork-Id: 13938867 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 741EC155359; Tue, 14 Jan 2025 15:28:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736868494; cv=none; b=dNVzZ9QnDHenwo2b4SU5zD+lCUTXP02HN88CG9qV0g7xjPrJiiktTDSBcp2ffBO4IjZGoBxyXvWmxpg9R6AJ0+CxJ/u8SgHan/wtWz7QQZAgQ6CGK6eqLFzJs/XE+WnnvzZv5jX07o6uXrba1Pc6w/jHainsOBHV+AftuO4PR04= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736868494; c=relaxed/simple; bh=nbhPGPRtlZxwG7dQxueDinzXdUgkXuGkpkZ/lgycMo0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=SXKTrM+2hlVlJsDHF8hZLmiKhn2RvxqFwPvamdbRf9SVlz+aJ9mnZO54biKH8oNUCiG/s8mYyR/7A2VUoydbKWd2dykEKDLkvU/5EpayrHU8ppOB3mKsLvwp/8cAvS48WZ6iSpaqWLHerLkzaBsJt4IcuelIfywFaYxXBOVgxUQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=EndLYD8V; arc=none smtp.client-ip=192.198.163.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="EndLYD8V" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1736868492; x=1768404492; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=nbhPGPRtlZxwG7dQxueDinzXdUgkXuGkpkZ/lgycMo0=; b=EndLYD8VQBgRwCAVvtUVZ0pLsY90UVsPdObFXeGqxCfSiEzCLJN51Au7 jMjAv934JRjDSA6POCch8Irtq97XivVIC8c+eZCwDIQ8mZn27qPLIZ2KW jDYvcJ7COJK95UvZRA3ZDNIs3QwagxhDy8lnIEmgwt/kFQbHOdWe/gIAo zHb6CS+M5el5Spl9PUye4IY1asLJiIAUvYgqx9fld9/z50tguJMxL9vqe uUA2/gdol83FnOR27BA3wn+dz14Tx6Xzi1vV/6WhtTQ0ZfGIaiHVrIqnb 5cxIv8ezS/Pyga/ZrN8uR5TX7zpUBQSjHbSo4gMNLsE9OVtRvPu1Azgdq w==; X-CSE-ConnectionGUID: OunqZBSMS9emn270YAqkPA== X-CSE-MsgGUID: xDh6D160QOCfpxMZ7yI20Q== X-IronPort-AV: E=McAfee;i="6700,10204,11315"; a="37325324" X-IronPort-AV: E=Sophos;i="6.12,314,1728975600"; d="scan'208";a="37325324" Received: from orviesa001.jf.intel.com ([10.64.159.141]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jan 2025 07:28:11 -0800 X-CSE-ConnectionGUID: TvLyFUCHRdCi8tHoog1QHw== X-CSE-MsgGUID: YMYpBkm0RQKqva8oQkV1ow== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="142117571" Received: from p12ill20yoongsia.png.intel.com ([10.88.227.38]) by orviesa001.jf.intel.com with ESMTP; 14 Jan 2025 07:27:59 -0800 From: Song Yoong Siang To: "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Willem de Bruijn , Florian Bezdeka , Donald Hunter , Jonathan Corbet , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Andrew Lunn , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Joe Damato , Stanislav Fomichev , Xuan Zhuo , Mina Almasry , Daniel Jurgens , Song Yoong Siang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Hao Luo , Jiri Olsa , Shuah Khan , Alexandre Torgue , Jose Abreu , Maxime Coquelin , Tony Nguyen , Przemek Kitszel Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, intel-wired-lan@lists.osuosl.org, xdp-hints@xdp-project.net Subject: [PATCH bpf-next v5 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata Date: Tue, 14 Jan 2025 23:27:15 +0800 Message-Id: <20250114152718.120588-2-yoong.siang.song@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250114152718.120588-1-yoong.siang.song@intel.com> References: <20250114152718.120588-1-yoong.siang.song@intel.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Extend the XDP Tx metadata framework so that user can requests launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev Signed-off-by: Song Yoong Siang --- Documentation/netlink/specs/netdev.yaml | 4 ++ Documentation/networking/xsk-tx-metadata.rst | 62 ++++++++++++++++++++ include/net/xdp_sock.h | 10 ++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 10 ++++ include/uapi/linux/netdev.h | 3 + net/core/netdev-genl.c | 2 + net/xdp/xsk.c | 3 + tools/include/uapi/linux/if_xdp.h | 10 ++++ tools/include/uapi/linux/netdev.h | 3 + 10 files changed, 108 insertions(+) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index cbb544bd6c84..901b5afb3df0 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -70,6 +70,10 @@ definitions: name: tx-checksum doc: L3 checksum HW offload is supported by the driver. + - + name: tx-launch-time-fifo + doc: + Launch time HW offload is supported by the driver. - name: queue-type type: enum diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index e76b0cfc32f7..df53a10ccac3 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -50,6 +50,10 @@ The flags field enables the particular offload: checksum. ``csum_start`` specifies byte offset of where the checksumming should start and ``csum_offset`` specifies byte offset where the device should store the computed checksum. +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the + packet for transmission at a pre-determined time called launch time. The + value of launch time is indicated by ``launch_time`` field of + ``union xsk_tx_metadata``. Besides the flags above, in order to trigger the offloads, the first packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` @@ -65,6 +69,63 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum is calculated on the CPU. Do not enable this option in production because it will negatively affect performance. +Launch Time +=========== + +The value of the requested launch time should be based on the device's PTP +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path +compared to the ETF queuing discipline, which organizes packets and delays +their transmission. Instead, AF_XDP immediately hands off the packets to +the device driver without rearranging their order or holding them prior to +transmission. Since the driver maintains FIFO behavior and does not perform +packet reordering, a packet with a launch time request will block other +packets in the same Tx Queue until it is sent. Therefore, it is recommended +to allocate separate queue for scheduling traffic that is intended for +future transmission. + +In scenarios where the launch time offload feature is disabled, the device +driver is expected to disregard the launch time request. For correct +interpretation and meaningful operation, the launch time should never be +set to a value larger than the farthest programmable time in the future +(the horizon). Different devices have different hardware limitations on the +launch time offload feature. + +stmmac driver +------------- + +For stmmac, TSO and launch time (TBS) features are mutually exclusive for +each individual Tx Queue. By default, the driver configures Tx Queue 0 to +support TSO and the rest of the Tx Queues to support TBS. The launch time +hardware offload feature can be enabled or disabled by using the tc-etf +command to call the driver's ndo_setup_tc() callback. + +The value of the launch time that is programmed in the Enhanced Normal +Transmit Descriptors is a 32-bit value, where the most significant 8 bits +represent the time in seconds and the remaining 24 bits represent the time +in 256 ns increments. The programmed launch time is compared against the +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the +future. + +igc driver +---------- + +For igc, all four Tx Queues support the launch time feature. The launch +time hardware offload feature can be enabled or disabled by using the +tc-etf command to call the driver's ndo_setup_tc() callback. When entering +TSN mode, the igc driver will reset the device and create a default Qbv +schedule with a 1-second cycle time, with all Tx Queues open at all times. + +The value of the launch time that is programmed in the Advanced Transmit +Context Descriptor is a relative offset to the starting time of the Qbv +transmission window of the queue. The Frst flag of the descriptor can be +set to schedule the packet for the next Qbv cycle. Therefore, the horizon +of the launch time for i225 and i226 is the ending time of the next cycle +of the Qbv transmission window of the queue. For example, when the Qbv +cycle time is set to 1 second, the horizon of the launch time ranges +from 1 second to 2 seconds, depending on where the Qbv cycle is currently +running. + Querying Device Capabilities ============================ @@ -74,6 +135,7 @@ Refer to ``xsk-flags`` features bitmask in - ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` - ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` +- ``tx-launch-time-fifo``: device supports ``XDP_TXMD_FLAGS_LAUNCH_TIME`` See ``tools/net/ynl/samples/netdev.c`` on how to query this information. diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index bfe625b55d55..a58ae7589d12 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -110,11 +110,16 @@ struct xdp_sock { * indicates position where checksumming should start. * csum_offset indicates position where checksum should be stored. * + * void (*tmo_request_launch_time)(u64 launch_time, void *priv) + * Called when AF_XDP frame requested launch time HW offload support. + * launch_time indicates the PTP time at which the device can schedule the + * packet for transmission. */ struct xsk_tx_metadata_ops { void (*tmo_request_timestamp)(void *priv); u64 (*tmo_fill_timestamp)(void *priv); void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv); + void (*tmo_request_launch_time)(u64 launch_time, void *priv); }; #ifdef CONFIG_XDP_SOCKETS @@ -162,6 +167,11 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta, if (!meta) return; + if (ops->tmo_request_launch_time) + if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME) + ops->tmo_request_launch_time(meta->request.launch_time, + priv); + if (ops->tmo_request_timestamp) if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP) ops->tmo_request_timestamp(priv); diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 86620c818965..8c005b50dfd9 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -208,6 +208,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) #define XDP_TXMD_FLAGS_VALID ( \ XDP_TXMD_FLAGS_TIMESTAMP | \ XDP_TXMD_FLAGS_CHECKSUM | \ + XDP_TXMD_FLAGS_LAUNCH_TIME | \ 0) static inline bool diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 42ec5ddaab8d..42869770776e 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -127,6 +127,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) +/* Request launch time hardware offload. The device will schedule the packet for + * transmission at a pre-determined time called launch time. The value of + * launch time is communicated via launch_time field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -142,6 +148,10 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_LAUNCH_TIME */ + /* Launch time in nanosecond against the PTP HW Clock */ + __u64 launch_time; } request; struct { diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index e4be227d3ad6..5ab85f4af009 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the + * driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_LAUNCH_TIME = 4, }; enum netdev_queue_type { diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index b0772d135efb..586cd485045d 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -52,6 +52,8 @@ XDP_METADATA_KFUNC_xxx xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP; if (netdev->xsk_tx_metadata_ops->tmo_request_checksum) xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM; + if (netdev->xsk_tx_metadata_ops->tmo_request_launch_time) + xsk_features |= NETDEV_XSK_FLAGS_LAUNCH_TIME; } if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) || diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 3fa70286c846..8feaa0e86f07 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -743,6 +743,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, goto free_err; } } + + if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME) + skb->skb_mstamp_ns = meta->request.launch_time; } } diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 2f082b01ff22..67719f8966c2 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -127,6 +127,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) +/* Request launch time hardware offload. The device will schedule the packet for + * transmission at a pre-determined time called launch time. The value of + * launch time is communicated via launch_time field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -142,6 +148,10 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_LAUNCH_TIME */ + /* Launch time in nanosecond against the PTP HW Clock */ + __u64 launch_time; } request; struct { diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index e4be227d3ad6..5ab85f4af009 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the + * driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_LAUNCH_TIME = 4, }; enum netdev_queue_type {