From patchwork Mon Nov 27 19:03:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470120 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="CuvHXQ8y" Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7DC2D5D for ; Mon, 27 Nov 2023 11:03:23 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 41be03b00d2f7-5bddc607b45so5895015a12.0 for ; Mon, 27 Nov 2023 11:03:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111803; x=1701716603; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=psXeWSFV9jNwPqAMO4arGwTup4UEtWfZR60Db4FTCc0=; b=CuvHXQ8ymbSGyZryLG59MFyO7EvhS5h83X1IgkZ6ZPlCTnBBm+MJGQKCGDI2/dd1id nArconxY2fVSmNPjnp2tQTBCvoHYTU8gltzv9YoNoROVlvG2hUfNBtM4EJ4cPZlrP6Qr 46KWaV62kwp6uoDO2Ni88bbGvD0Rv6u4PxBX30BYKkyjNwhzdn2LXtrB3NfxkbEmPBgy RgjXPGkYRge8DUt91YcpmWGsIufupyutIBEEMgN2UUpbC1d6lC6Z/AfiiB6szcQAVF9q TbxLA8O5En2Szxq1Q5hOIWvwXRCOEEg2UQ2Z9H1lENLWkFBMDln1H+3/4nRvzc6eyVDV nZZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111803; x=1701716603; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=psXeWSFV9jNwPqAMO4arGwTup4UEtWfZR60Db4FTCc0=; b=WjZ0YFHrbXAvJSuWwFQqD0sc9GaHMj/aTL4c2fopu9dOo9eoHlKxL2tk3tDQ+W8mXk OMC1RPYghteaK9tABFeJdBJvVAs+dpwqvc2Oj+IZlPOwe00UudrYReY5A5e2fwMVMEZ/ QTtjfIaFacqUt3aOyaipDY8//XzaR2wYbFrziINUYfMvoVeDYomh2OdO9YyTokCBLyQq OqnDxiTfFEse43mcDYXZPvnPhioq54hBH9NN3inEkTpDzIGhLRhiFmAxHCd2d0WQ4Dl9 ZJs//8BiSXayb9S93ckir28cQFpRcW8WI6pgYpySbmdn8Bp/JeNZW+o/XLjAkRV8VYSK 3wxw== X-Gm-Message-State: AOJu0Yyh9yI+Cexrew8iwAOkMCNUVv+qfezH8SgZngLMxImg+zt6bnM4 DCHQcT0e/YxTHD6C2ssuoO6B/J+rCvguPIrB0jPFlDI2XI4/ygRDdDjuxTNbYt7pbu9mZ4BhNKN 9vsWXRddAe/g6T97DKSJjn3H4NohYsiVsvp58kZax0+k52fTaJw== X-Google-Smtp-Source: AGHT+IGmlmDeKoqQIpCdZIz5m+tgvYQiWdhiP3RR3TZ9IiHLEWD4GuSk6K4vRAEBjui7lE/Ujmo5mQ8= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a63:d209:0:b0:5be:1194:9c0b with SMTP id a9-20020a63d209000000b005be11949c0bmr2068487pgg.3.1701111802937; Mon, 27 Nov 2023 11:03:22 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:07 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-2-sdf@google.com> Subject: [PATCH bpf-next v6 01/13] xsk: Support tx_metadata_len From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net For zerocopy mode, tx_desc->addr can point to an arbitrary offset and carry some TX metadata in the headroom. For copy mode, there is no way currently to populate skb metadata. Introduce new tx_metadata_len umem config option that indicates how many bytes to treat as metadata. Metadata bytes come prior to tx_desc address (same as in RX case). The size of the metadata has mostly the same constraints as XDP: - less than 256 bytes - 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte timestamp in the completion) - non-zero This data is not interpreted in any way right now. Reviewed-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev Reviewed-by: Jakub Kicinski --- include/net/xdp_sock.h | 1 + include/net/xsk_buff_pool.h | 1 + include/uapi/linux/if_xdp.h | 1 + net/xdp/xdp_umem.c | 4 ++++ net/xdp/xsk.c | 12 +++++++++++- net/xdp/xsk_buff_pool.c | 1 + net/xdp/xsk_queue.h | 17 ++++++++++------- tools/include/uapi/linux/if_xdp.h | 1 + 8 files changed, 30 insertions(+), 8 deletions(-) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index f83128007fb0..bcf765124f72 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -30,6 +30,7 @@ struct xdp_umem { struct user_struct *user; refcount_t users; u8 flags; + u8 tx_metadata_len; bool zc; struct page **pgs; int id; diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index b0bdff26fc88..1985ffaf9b0c 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -77,6 +77,7 @@ struct xsk_buff_pool { u32 chunk_size; u32 chunk_shift; u32 frame_len; + u8 tx_metadata_len; /* inherited from umem */ u8 cached_need_wakeup; bool uses_need_wakeup; bool dma_need_sync; diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 8d48863472b9..2ecf79282c26 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -76,6 +76,7 @@ struct xdp_umem_reg { __u32 chunk_size; __u32 headroom; __u32 flags; + __u32 tx_metadata_len; }; struct xdp_statistics { diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 06cead2b8e34..946a687fb8e8 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -199,6 +199,9 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) if (headroom >= chunk_size - XDP_PACKET_HEADROOM) return -EINVAL; + if (mr->tx_metadata_len >= 256 || mr->tx_metadata_len % 8) + return -EINVAL; + umem->size = size; umem->headroom = headroom; umem->chunk_size = chunk_size; @@ -207,6 +210,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) umem->pgs = NULL; umem->user = NULL; umem->flags = mr->flags; + umem->tx_metadata_len = mr->tx_metadata_len; INIT_LIST_HEAD(&umem->xsk_dma_list); refcount_set(&umem->users, 1); diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index ae9f8cb611f6..c904356e2800 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -1283,6 +1283,14 @@ struct xdp_umem_reg_v1 { __u32 headroom; }; +struct xdp_umem_reg_v2 { + __u64 addr; /* Start of packet data area */ + __u64 len; /* Length of packet data area */ + __u32 chunk_size; + __u32 headroom; + __u32 flags; +}; + static int xsk_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { @@ -1326,8 +1334,10 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname, if (optlen < sizeof(struct xdp_umem_reg_v1)) return -EINVAL; - else if (optlen < sizeof(mr)) + else if (optlen < sizeof(struct xdp_umem_reg_v2)) mr_size = sizeof(struct xdp_umem_reg_v1); + else if (optlen < sizeof(mr)) + mr_size = sizeof(struct xdp_umem_reg_v2); if (copy_from_sockptr(&mr, optval, mr_size)) return -EFAULT; diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 49cb9f9a09be..386eddcdf837 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -85,6 +85,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs, XDP_PACKET_HEADROOM; pool->umem = umem; pool->addrs = umem->addrs; + pool->tx_metadata_len = umem->tx_metadata_len; INIT_LIST_HEAD(&pool->free_list); INIT_LIST_HEAD(&pool->xskb_list); INIT_LIST_HEAD(&pool->xsk_tx_list); diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 13354a1e4280..c74a1372bcb9 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -143,15 +143,17 @@ static inline bool xp_unused_options_set(u32 options) static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 offset = desc->addr & (pool->chunk_size - 1); + u64 addr = desc->addr - pool->tx_metadata_len; + u64 len = desc->len + pool->tx_metadata_len; + u64 offset = addr & (pool->chunk_size - 1); if (!desc->len) return false; - if (offset + desc->len > pool->chunk_size) + if (offset + len > pool->chunk_size) return false; - if (desc->addr >= pool->addrs_cnt) + if (addr >= pool->addrs_cnt) return false; if (xp_unused_options_set(desc->options)) @@ -162,16 +164,17 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 addr = xp_unaligned_add_offset_to_addr(desc->addr); + u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len; + u64 len = desc->len + pool->tx_metadata_len; if (!desc->len) return false; - if (desc->len > pool->chunk_size) + if (len > pool->chunk_size) return false; - if (addr >= pool->addrs_cnt || addr + desc->len > pool->addrs_cnt || - xp_desc_crosses_non_contig_pg(pool, addr, desc->len)) + if (addr >= pool->addrs_cnt || addr + len > pool->addrs_cnt || + xp_desc_crosses_non_contig_pg(pool, addr, len)) return false; if (xp_unused_options_set(desc->options)) diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 73a47da885dc..34411a2e5b6c 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -76,6 +76,7 @@ struct xdp_umem_reg { __u32 chunk_size; __u32 headroom; __u32 flags; + __u32 tx_metadata_len; }; struct xdp_statistics { From patchwork Mon Nov 27 19:03:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470122 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="q3bkLiZQ" Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0CB7131 for ; Mon, 27 Nov 2023 11:03:25 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5d12853cb89so6795247b3.3 for ; Mon, 27 Nov 2023 11:03:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111805; x=1701716605; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=CwjKcY8INDHEWCKu1BjLvMDStlAFGSYX6XGUHdqNH9o=; b=q3bkLiZQ4vTg8OM1cVHauU5BY3M9J3wANoBbiSgkYX/w5ptpH647ZCAPxK7mVMPNCa h8Btm8fUs/mVCbMHQX06sP1sYn6Y7MSD9F4lebZCDSt6HGaOepF3AswJgsX6kh3MKCzJ H5fqJZdU2xlmz7d8ryVj1+A1YLw4RdCd9WHuKCTxUWkpMFe06LBjEdtyHyKyh8IWvbij 9tA5OSEYdCWkQR0nlIzbgWwoQPP3eiWRDgMXPEqUZlgb1btz8uKMgLZuCa6hElpLBUiy OUSGl3+b2xVN55Lxt6JDqdx92Q7Ffy8PNKxeszlGH8Fi7VFSPN9QSF/CISVO3zCSUYc6 SIKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111805; x=1701716605; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=CwjKcY8INDHEWCKu1BjLvMDStlAFGSYX6XGUHdqNH9o=; b=mwIF09kUw5JQRm65GPB9JgXQ8NS1vHwWKU1bKh+n9dlUGzkrXzJkewak50BLu/Z86G ty2tmIhgs6z/k8lyIr4QjQ7K7/jGZoPxJgTN44XyVC1OGqvE2Pja4tyT82wVdZQfrPVS HYS+uW8fIQGEe+/rEpmiB6cxX2Pk/6ImO/U7Cd36Mk/rWRza8kel30Tek8qgEK4fWc06 cBo75JGv/GeeRHStFzY0ZhqKTXNt9KJflbShj98SyUUiU9flDn6zGBmymL+YmTSz0OQO WPlJAc2lx6WBKvCh20SRVDxexZzUxkgyHtQr5juNwRTtVd2uJvpTvHYOz5ukUeeQ/maC yNsw== X-Gm-Message-State: AOJu0YyxfGCm8kwn/IkqkdxFAVC1vcZvOV7Vei8pxukhIoUbXWHKFD2C AjQVWCbarDNa7+JesKT21brE6Q2xIrBYqKpXb8RnwjCFwxxteurNz4N5xoy2tqv9KkM9lYRquaO LAWyu2kyACpmQCqPBaehrMFSvCtt9md6INn4wsjrbOEtqKbGxtQ== X-Google-Smtp-Source: AGHT+IHVk1+xZujiaYbWewjGuyUdz7EK3l349MjS40cTkhNid0nixK70qSJd7zmJz7Wyc0HFeq2H8eg= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:cf14:0:b0:daf:660e:9bdb with SMTP id f20-20020a25cf14000000b00daf660e9bdbmr396648ybg.6.1701111804820; Mon, 27 Nov 2023 11:03:24 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:08 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-3-sdf@google.com> Subject: [PATCH bpf-next v6 02/13] xsk: Add TX timestamp and TX checksum offload support From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net This change actually defines the (initial) metadata layout that should be used by AF_XDP userspace (xsk_tx_metadata). The first field is flags which requests appropriate offloads, followed by the offload-specific fields. The supported per-device offloads are exported via netlink (new xsk-flags). The offloads themselves are still implemented in a bit of a framework-y fashion that's left from my initial kfunc attempt. I'm introducing new xsk_tx_metadata_ops which drivers are supposed to implement. The drivers are also supposed to call xsk_tx_metadata_request/xsk_tx_metadata_complete in the right places. Since xsk_tx_metadata_{request,_complete} are static inline, we don't incur any extra overhead doing indirect calls. The benefit of this scheme is as follows: - keeps all metadata layout parsing away from driver code - makes it easy to grep and see which drivers implement what - don't need any extra flags to maintain to keep track of what offloads are implemented; if the callback is implemented - the offload is supported (used by netlink reporting code) Two offloads are defined right now: 1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset 2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata area upon completion (tx_timestamp field) XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass metadata pointer). The struct is forward-compatible and can be extended in the future by appending more fields. Reviewed-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev Acked-by: Jakub Kicinski --- Documentation/netlink/specs/netdev.yaml | 19 +++- include/linux/netdevice.h | 2 + include/linux/skbuff.h | 14 ++- include/net/xdp_sock.h | 110 ++++++++++++++++++++++++ include/net/xdp_sock_drv.h | 13 +++ include/net/xsk_buff_pool.h | 6 ++ include/uapi/linux/if_xdp.h | 38 ++++++++ include/uapi/linux/netdev.h | 16 ++++ net/core/netdev-genl.c | 13 ++- net/xdp/xsk.c | 34 ++++++++ net/xdp/xsk_queue.h | 2 +- tools/include/uapi/linux/if_xdp.h | 52 +++++++++-- tools/include/uapi/linux/netdev.h | 16 ++++ tools/net/ynl/generated/netdev-user.c | 19 ++++ tools/net/ynl/generated/netdev-user.h | 3 + 15 files changed, 348 insertions(+), 9 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 14511b13f305..00439bcbd2e3 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -45,7 +45,6 @@ name: netdev - type: flags name: xdp-rx-metadata - render-max: true entries: - name: timestamp @@ -55,6 +54,18 @@ name: netdev name: hash doc: Device is capable of exposing receive packet hash via bpf_xdp_metadata_rx_hash(). + - + type: flags + name: xsk-flags + entries: + - + name: tx-timestamp + doc: + HW timestamping egress packets is supported by the driver. + - + name: tx-checksum + doc: + L3 checksum HW offload is supported by the driver. attribute-sets: - @@ -86,6 +97,11 @@ name: netdev See Documentation/networking/xdp-rx-metadata.rst for more details. type: u64 enum: xdp-rx-metadata + - + name: xsk-features + doc: Bitmask of enabled AF_XDP features. + type: u64 + enum: xsk-flags operations: list: @@ -103,6 +119,7 @@ name: netdev - xdp-features - xdp-zc-max-segs - xdp-rx-metadata-features + - xsk-features dump: reply: *dev-all - diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e87caa81f70c..08da8b28c816 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1865,6 +1865,7 @@ enum netdev_stat_type { * @netdev_ops: Includes several pointers to callbacks, * if one wants to override the ndo_*() functions * @xdp_metadata_ops: Includes pointers to XDP metadata callbacks. + * @xsk_tx_metadata_ops: Includes pointers to AF_XDP TX metadata callbacks. * @ethtool_ops: Management operations * @l3mdev_ops: Layer 3 master device operations * @ndisc_ops: Includes callbacks for different IPv6 neighbour @@ -2128,6 +2129,7 @@ struct net_device { unsigned long long priv_flags; const struct net_device_ops *netdev_ops; const struct xdp_metadata_ops *xdp_metadata_ops; + const struct xsk_tx_metadata_ops *xsk_tx_metadata_ops; int ifindex; unsigned short gflags; unsigned short hard_header_len; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 27998f73183e..b370eb8d70f7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -566,6 +566,15 @@ struct ubuf_info_msgzc { int mm_account_pinned_pages(struct mmpin *mmp, size_t size); void mm_unaccount_pinned_pages(struct mmpin *mmp); +/* Preserve some data across TX submission and completion. + * + * Note, this state is stored in the driver. Extending the layout + * might need some special care. + */ +struct xsk_tx_metadata_compl { + __u64 *tx_timestamp; +}; + /* This data is invariant across clones and lives at * the end of the header data, ie. at skb->end. */ @@ -578,7 +587,10 @@ struct skb_shared_info { /* Warning: this field is not always filled in (UFO)! */ unsigned short gso_segs; struct sk_buff *frag_list; - struct skb_shared_hwtstamps hwtstamps; + union { + struct skb_shared_hwtstamps hwtstamps; + struct xsk_tx_metadata_compl xsk_meta; + }; unsigned int gso_type; u32 tskey; diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index bcf765124f72..3cb4dc9bd70e 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -93,12 +93,105 @@ struct xdp_sock { struct xsk_queue *cq_tmp; /* Only as tmp storage before bind */ }; +/* + * AF_XDP TX metadata hooks for network devices. + * The following hooks can be defined; unless noted otherwise, they are + * optional and can be filled with a null pointer. + * + * void (*tmo_request_timestamp)(void *priv) + * Called when AF_XDP frame requested egress timestamp. + * + * u64 (*tmo_fill_timestamp)(void *priv) + * Called when AF_XDP frame, that had requested egress timestamp, + * received a completion. The hook needs to return the actual HW timestamp. + * + * void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv) + * Called when AF_XDP frame requested HW checksum offload. csum_start + * indicates position where checksumming should start. + * csum_offset indicates position where checksum should be stored. + * + */ +struct xsk_tx_metadata_ops { + void (*tmo_request_timestamp)(void *priv); + u64 (*tmo_fill_timestamp)(void *priv); + void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv); +}; + #ifdef CONFIG_XDP_SOCKETS int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); +/** + * xsk_tx_metadata_to_compl - Save enough relevant metadata information + * to perform tx completion in the future. + * @meta: pointer to AF_XDP metadata area + * @compl: pointer to output struct xsk_tx_metadata_to_compl + * + * This function should be called by the networking device when + * it prepares AF_XDP egress packet. The value of @compl should be stored + * and passed to xsk_tx_metadata_complete upon TX completion. + */ +static inline void xsk_tx_metadata_to_compl(struct xsk_tx_metadata *meta, + struct xsk_tx_metadata_compl *compl) +{ + if (!meta) + return; + + if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP) + compl->tx_timestamp = &meta->completion.tx_timestamp; + else + compl->tx_timestamp = NULL; +} + +/** + * xsk_tx_metadata_request - Evaluate AF_XDP TX metadata at submission + * and call appropriate xsk_tx_metadata_ops operation. + * @meta: pointer to AF_XDP metadata area + * @ops: pointer to struct xsk_tx_metadata_ops + * @priv: pointer to driver-private aread + * + * This function should be called by the networking device when + * it prepares AF_XDP egress packet. + */ +static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ + if (!meta) + return; + + if (ops->tmo_request_timestamp) + if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP) + ops->tmo_request_timestamp(priv); + + if (ops->tmo_request_checksum) + if (meta->flags & XDP_TXMD_FLAGS_CHECKSUM) + ops->tmo_request_checksum(meta->request.csum_start, + meta->request.csum_offset, priv); +} + +/** + * xsk_tx_metadata_complete - Evaluate AF_XDP TX metadata at completion + * and call appropriate xsk_tx_metadata_ops operation. + * @compl: pointer to completion metadata produced from xsk_tx_metadata_to_compl + * @ops: pointer to struct xsk_tx_metadata_ops + * @priv: pointer to driver-private aread + * + * This function should be called by the networking device upon + * AF_XDP egress completion. + */ +static inline void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ + if (!compl) + return; + + *compl->tx_timestamp = ops->tmo_fill_timestamp(priv); +} + #else static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) @@ -115,6 +208,23 @@ static inline void __xsk_map_flush(void) { } +static inline void xsk_tx_metadata_to_compl(struct xsk_tx_metadata *meta, + struct xsk_tx_metadata_compl *compl) +{ +} + +static inline void xsk_tx_metadata_request(struct xsk_tx_metadata *meta, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ +} + +static inline void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ +} + #endif /* CONFIG_XDP_SOCKETS */ #if defined(CONFIG_XDP_SOCKETS) && defined(CONFIG_DEBUG_NET) diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 1f6fc8c7a84c..e2558ac3e195 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -165,6 +165,14 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return xp_raw_get_data(pool, addr); } +static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +{ + if (!pool->tx_metadata_len) + return NULL; + + return xp_raw_get_data(pool, addr) - pool->tx_metadata_len; +} + static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) { struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); @@ -324,6 +332,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return NULL; } +static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +{ + return NULL; +} + static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) { } diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 1985ffaf9b0c..97f5cc10d79e 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -33,6 +33,7 @@ struct xdp_buff_xsk { }; #define XSK_CHECK_PRIV_TYPE(t) BUILD_BUG_ON(sizeof(t) > offsetofend(struct xdp_buff_xsk, cb)) +#define XSK_TX_COMPL_FITS(t) BUILD_BUG_ON(sizeof(struct xsk_tx_metadata_compl) > sizeof(t)) struct xsk_dma_map { dma_addr_t *dma_pages; @@ -234,4 +235,9 @@ static inline u64 xp_get_handle(struct xdp_buff_xsk *xskb) return xskb->orig_addr + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); } +static inline bool xp_tx_metadata_enabled(const struct xsk_buff_pool *pool) +{ + return pool->tx_metadata_len > 0; +} + #endif /* XSK_BUFF_POOL_H_ */ diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 2ecf79282c26..95de66d5a26c 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -106,6 +106,41 @@ struct xdp_options { #define XSK_UNALIGNED_BUF_ADDR_MASK \ ((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1) +/* Request transmit timestamp. Upon completion, put it into tx_timestamp + * field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0) + +/* Request transmit checksum offload. Checksum start position and offset + * are communicated via csum_start and csum_offset fields of struct + * xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) + +/* AF_XDP offloads request. 'request' union member is consumed by the driver + * when the packet is being transmitted. 'completion' union member is + * filled by the driver when the transmit completion arrives. + */ +struct xsk_tx_metadata { + __u64 flags; + + union { + struct { + /* XDP_TXMD_FLAGS_CHECKSUM */ + + /* Offset from desc->addr where checksumming should start. */ + __u16 csum_start; + /* Offset from csum_start where checksum should be stored. */ + __u16 csum_offset; + } request; + + struct { + /* XDP_TXMD_FLAGS_TIMESTAMP */ + __u64 tx_timestamp; + } completion; + }; +}; + /* Rx/Tx descriptor */ struct xdp_desc { __u64 addr; @@ -122,4 +157,7 @@ struct xdp_desc { */ #define XDP_PKT_CONTD (1 << 0) +/* TX packet carries valid metadata. */ +#define XDP_TX_METADATA (1 << 1) + #endif /* _LINUX_IF_XDP_H */ diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 2943a151d4f1..48d5477a668c 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -53,12 +53,28 @@ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_MASK = 3, }; +/** + * enum netdev_xsk_flags + * @NETDEV_XSK_FLAGS_TX_TIMESTAMP: HW timestamping egress packets is supported + * by the driver. + * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the + * driver. + */ +enum netdev_xsk_flags { + NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, + NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + + /* private: */ + NETDEV_XSK_FLAGS_MASK = 3, +}; + enum { NETDEV_A_DEV_IFINDEX = 1, NETDEV_A_DEV_PAD, NETDEV_A_DEV_XDP_FEATURES, NETDEV_A_DEV_XDP_ZC_MAX_SEGS, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, + NETDEV_A_DEV_XSK_FEATURES, __NETDEV_A_DEV_MAX, NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index fe61f85bcf33..10f2124e9e23 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -6,6 +6,7 @@ #include #include #include +#include #include "netdev-genl-gen.h" @@ -13,6 +14,7 @@ static int netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp, const struct genl_info *info) { + u64 xsk_features = 0; u64 xdp_rx_meta = 0; void *hdr; @@ -26,11 +28,20 @@ netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp, XDP_METADATA_KFUNC_xxx #undef XDP_METADATA_KFUNC + if (netdev->xsk_tx_metadata_ops) { + if (netdev->xsk_tx_metadata_ops->tmo_fill_timestamp) + xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP; + if (netdev->xsk_tx_metadata_ops->tmo_request_checksum) + xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM; + } + if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) || nla_put_u64_64bit(rsp, NETDEV_A_DEV_XDP_FEATURES, netdev->xdp_features, NETDEV_A_DEV_PAD) || nla_put_u64_64bit(rsp, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, - xdp_rx_meta, NETDEV_A_DEV_PAD)) { + xdp_rx_meta, NETDEV_A_DEV_PAD) || + nla_put_u64_64bit(rsp, NETDEV_A_DEV_XSK_FEATURES, + xsk_features, NETDEV_A_DEV_PAD)) { genlmsg_cancel(rsp, hdr); return -EINVAL; } diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index c904356e2800..e83ade32f1fd 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -571,6 +571,13 @@ static u32 xsk_get_num_desc(struct sk_buff *skb) static void xsk_destruct_skb(struct sk_buff *skb) { + struct xsk_tx_metadata_compl *compl = &skb_shinfo(skb)->xsk_meta; + + if (compl->tx_timestamp) { + /* sw completion timestamp, not a real one */ + *compl->tx_timestamp = ktime_get_tai_fast_ns(); + } + xsk_cq_submit_locked(xdp_sk(skb->sk), xsk_get_num_desc(skb)); sock_wfree(skb); } @@ -655,8 +662,10 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs, static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, struct xdp_desc *desc) { + struct xsk_tx_metadata *meta = NULL; struct net_device *dev = xs->dev; struct sk_buff *skb = xs->skb; + bool first_frag = false; int err; if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) { @@ -687,6 +696,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, kfree_skb(skb); goto free_err; } + + first_frag = true; } else { int nr_frags = skb_shinfo(skb)->nr_frags; struct page *page; @@ -709,12 +720,35 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, skb_add_rx_frag(skb, nr_frags, page, 0, len, 0); } + + if (first_frag && desc->options & XDP_TX_METADATA) { + if (unlikely(xs->pool->tx_metadata_len == 0)) { + err = -EINVAL; + goto free_err; + } + + meta = buffer - xs->pool->tx_metadata_len; + + if (meta->flags & XDP_TXMD_FLAGS_CHECKSUM) { + if (unlikely(meta->request.csum_start + + meta->request.csum_offset + + sizeof(__sum16) > len)) { + err = -EINVAL; + goto free_err; + } + + skb->csum_start = hr + meta->request.csum_start; + skb->csum_offset = meta->request.csum_offset; + skb->ip_summed = CHECKSUM_PARTIAL; + } + } } skb->dev = dev; skb->priority = READ_ONCE(xs->sk.sk_priority); skb->mark = READ_ONCE(xs->sk.sk_mark); skb->destructor = xsk_destruct_skb; + xsk_tx_metadata_to_compl(meta, &skb_shinfo(skb)->xsk_meta); xsk_set_destructor_arg(skb); return skb; diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index c74a1372bcb9..6f2d1621c992 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -137,7 +137,7 @@ static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr) static inline bool xp_unused_options_set(u32 options) { - return options & ~XDP_PKT_CONTD; + return options & ~(XDP_PKT_CONTD | XDP_TX_METADATA); } static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 34411a2e5b6c..d0882edc1642 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -26,11 +26,11 @@ */ #define XDP_USE_NEED_WAKEUP (1 << 3) /* By setting this option, userspace application indicates that it can - * handle multiple descriptors per packet thus enabling xsk core to split + * handle multiple descriptors per packet thus enabling AF_XDP to split * multi-buffer XDP frames into multiple Rx descriptors. Without this set - * such frames will be dropped by xsk. + * such frames will be dropped. */ -#define XDP_USE_SG (1 << 4) +#define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ #define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) @@ -106,6 +106,41 @@ struct xdp_options { #define XSK_UNALIGNED_BUF_ADDR_MASK \ ((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1) +/* Request transmit timestamp. Upon completion, put it into tx_timestamp + * field of union xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0) + +/* Request transmit checksum offload. Checksum start position and offset + * are communicated via csum_start and csum_offset fields of union + * xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) + +/* AF_XDP offloads request. 'request' union member is consumed by the driver + * when the packet is being transmitted. 'completion' union member is + * filled by the driver when the transmit completion arrives. + */ +struct xsk_tx_metadata { + __u64 flags; + + union { + struct { + /* XDP_TXMD_FLAGS_CHECKSUM */ + + /* Offset from desc->addr where checksumming should start. */ + __u16 csum_start; + /* Offset from csum_start where checksum should be stored. */ + __u16 csum_offset; + } request; + + struct { + /* XDP_TXMD_FLAGS_TIMESTAMP */ + __u64 tx_timestamp; + } completion; + }; +}; + /* Rx/Tx descriptor */ struct xdp_desc { __u64 addr; @@ -113,9 +148,16 @@ struct xdp_desc { __u32 options; }; -/* Flag indicating packet constitutes of multiple buffers*/ +/* UMEM descriptor is __u64 */ + +/* Flag indicating that the packet continues with the buffer pointed out by the + * next frame in the ring. The end of the packet is signalled by setting this + * bit to zero. For single buffer packets, every descriptor has 'options' set + * to 0 and this maintains backward compatibility. + */ #define XDP_PKT_CONTD (1 << 0) -/* UMEM descriptor is __u64 */ +/* TX packet carries valid metadata. */ +#define XDP_TX_METADATA (1 << 1) #endif /* _LINUX_IF_XDP_H */ diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 2943a151d4f1..48d5477a668c 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -53,12 +53,28 @@ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_MASK = 3, }; +/** + * enum netdev_xsk_flags + * @NETDEV_XSK_FLAGS_TX_TIMESTAMP: HW timestamping egress packets is supported + * by the driver. + * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the + * driver. + */ +enum netdev_xsk_flags { + NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, + NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + + /* private: */ + NETDEV_XSK_FLAGS_MASK = 3, +}; + enum { NETDEV_A_DEV_IFINDEX = 1, NETDEV_A_DEV_PAD, NETDEV_A_DEV_XDP_FEATURES, NETDEV_A_DEV_XDP_ZC_MAX_SEGS, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, + NETDEV_A_DEV_XSK_FEATURES, __NETDEV_A_DEV_MAX, NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c index b5ffe8cd1144..6283d87dad37 100644 --- a/tools/net/ynl/generated/netdev-user.c +++ b/tools/net/ynl/generated/netdev-user.c @@ -58,6 +58,19 @@ const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value) return netdev_xdp_rx_metadata_strmap[value]; } +static const char * const netdev_xsk_flags_strmap[] = { + [0] = "tx-timestamp", + [1] = "tx-checksum", +}; + +const char *netdev_xsk_flags_str(enum netdev_xsk_flags value) +{ + value = ffs(value) - 1; + if (value < 0 || value >= (int)MNL_ARRAY_SIZE(netdev_xsk_flags_strmap)) + return NULL; + return netdev_xsk_flags_strmap[value]; +} + /* Policies */ struct ynl_policy_attr netdev_dev_policy[NETDEV_A_DEV_MAX + 1] = { [NETDEV_A_DEV_IFINDEX] = { .name = "ifindex", .type = YNL_PT_U32, }, @@ -65,6 +78,7 @@ struct ynl_policy_attr netdev_dev_policy[NETDEV_A_DEV_MAX + 1] = { [NETDEV_A_DEV_XDP_FEATURES] = { .name = "xdp-features", .type = YNL_PT_U64, }, [NETDEV_A_DEV_XDP_ZC_MAX_SEGS] = { .name = "xdp-zc-max-segs", .type = YNL_PT_U32, }, [NETDEV_A_DEV_XDP_RX_METADATA_FEATURES] = { .name = "xdp-rx-metadata-features", .type = YNL_PT_U64, }, + [NETDEV_A_DEV_XSK_FEATURES] = { .name = "xsk-features", .type = YNL_PT_U64, }, }; struct ynl_policy_nest netdev_dev_nest = { @@ -116,6 +130,11 @@ int netdev_dev_get_rsp_parse(const struct nlmsghdr *nlh, void *data) return MNL_CB_ERROR; dst->_present.xdp_rx_metadata_features = 1; dst->xdp_rx_metadata_features = mnl_attr_get_u64(attr); + } else if (type == NETDEV_A_DEV_XSK_FEATURES) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.xsk_features = 1; + dst->xsk_features = mnl_attr_get_u64(attr); } } diff --git a/tools/net/ynl/generated/netdev-user.h b/tools/net/ynl/generated/netdev-user.h index 4fafac879df3..39af1908444b 100644 --- a/tools/net/ynl/generated/netdev-user.h +++ b/tools/net/ynl/generated/netdev-user.h @@ -19,6 +19,7 @@ extern const struct ynl_family ynl_netdev_family; const char *netdev_op_str(int op); const char *netdev_xdp_act_str(enum netdev_xdp_act value); const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value); +const char *netdev_xsk_flags_str(enum netdev_xsk_flags value); /* Common nested types */ /* ============== NETDEV_CMD_DEV_GET ============== */ @@ -50,12 +51,14 @@ struct netdev_dev_get_rsp { __u32 xdp_features:1; __u32 xdp_zc_max_segs:1; __u32 xdp_rx_metadata_features:1; + __u32 xsk_features:1; } _present; __u32 ifindex; __u64 xdp_features; __u32 xdp_zc_max_segs; __u64 xdp_rx_metadata_features; + __u64 xsk_features; }; void netdev_dev_get_rsp_free(struct netdev_dev_get_rsp *rsp); From patchwork Mon Nov 27 19:03:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470121 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="yCqGIp0w" Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B86AD5B for ; Mon, 27 Nov 2023 11:03:27 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id d2e1a72fcca58-6cb42be51easo5638403b3a.1 for ; Mon, 27 Nov 2023 11:03:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111807; x=1701716607; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=NZmbhowMLIPaXt1/O5qGG+PgL70lJmkQfw56CCK9VMM=; b=yCqGIp0wh4kscly7TPfxKd8ENbvCWsBGz8UM0tXSFrkIlAvN3a4QmeG94+2NkL8f9g uAo/lbezA50R1bJhpwxl2iETPWJ1KUPDakkzYgkfpxiWbGD/0ZvDEzHePaqSWxkhHP1H EfUChIfKUrkEwDD+SSaTSEhULhTW7COugUYWJMnXiaTn/EG2lYwZL/Yd0SAjvpW8Sk3p b8i1prk7M+ptxJZfGVs3wsJZL9tGmlyoQZnJss/lkck93SH3gEiAxo7yw37aH8+BkaEl IflDKxk/H6OSvAsRY8E9HeMOrztK0ysaZ9N6Kw7zHJ0xCHXbUMc3/yNJJ+lRKY0xneaD vMsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111807; x=1701716607; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=NZmbhowMLIPaXt1/O5qGG+PgL70lJmkQfw56CCK9VMM=; b=nDjBBokVL2LQdElUV2MmgBv72ew2JTPE1Y23JOm1ko7ToMf5UBiguY2jUS0/LebBmP hUbmHjemYRQMqs1Adfh6BYKHkXPxr6T7aL18e18ygmSti5Na4DeUI8tetM/M6Szgj5GD erwvNQzmfhxXVuFLRoqlS1PZLzSwLT/OxAOtAjdecGjLx3jJDbyeED2BKPbgKwfAvx/m znv3WZXWjf4nBLju+isx/oJeDFNitujVhVSrFtK5xw4gQi76xxAUBTrlKj9Bo4yhCVTY S7IAxfVK3NqCLGpr6JwZ/TFfgHAjVy/gD/4v7+MzY0US28Yq84HeHTPxsojl5b7NC9Bn vZAw== X-Gm-Message-State: AOJu0YzpexbxJqVm2bflDtaVejqarnn4A7FWl2gQaw8MBQr9IeKquOwP QihRRht0OL6W1xnGtdmugu9vXsFiEbQVZSpXGW9qQcfecl5XyjjMNEYcikfFFzFAvIB9AK0fz2f T3TnDD81ySrERQVZFtRtqHWpCkBI1yZhPYoMhYqbqEYyeHgrHSQ== X-Google-Smtp-Source: AGHT+IFKylgGgz4FPHYQx7uLvq9IjNMDty/QvWbEOwBFCwVeAcMS9nY3gUMmOn56aiimoTypBJj+hDU= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:6a00:3a12:b0:692:c216:8830 with SMTP id fj18-20020a056a003a1200b00692c2168830mr3426229pfb.0.1701111806732; Mon, 27 Nov 2023 11:03:26 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:09 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-4-sdf@google.com> Subject: [PATCH bpf-next v6 03/13] tools: ynl: Print xsk-features from the sample From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net In a similar fashion we do for the other bit masks. Fix mask parsing (>= vs >) while we are it. Signed-off-by: Stanislav Fomichev Reviewed-by: Jakub Kicinski --- tools/net/ynl/samples/netdev.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/net/ynl/samples/netdev.c b/tools/net/ynl/samples/netdev.c index b828225daad0..591b90e21890 100644 --- a/tools/net/ynl/samples/netdev.c +++ b/tools/net/ynl/samples/netdev.c @@ -33,17 +33,23 @@ static void netdev_print_device(struct netdev_dev_get_rsp *d, unsigned int op) return; printf("xdp-features (%llx):", d->xdp_features); - for (int i = 0; d->xdp_features > 1U << i; i++) { + for (int i = 0; d->xdp_features >= 1U << i; i++) { if (d->xdp_features & (1U << i)) printf(" %s", netdev_xdp_act_str(1 << i)); } printf(" xdp-rx-metadata-features (%llx):", d->xdp_rx_metadata_features); - for (int i = 0; d->xdp_rx_metadata_features > 1U << i; i++) { + for (int i = 0; d->xdp_rx_metadata_features >= 1U << i; i++) { if (d->xdp_rx_metadata_features & (1U << i)) printf(" %s", netdev_xdp_rx_metadata_str(1 << i)); } + printf(" xsk-features (%llx):", d->xsk_features); + for (int i = 0; d->xsk_features >= 1U << i; i++) { + if (d->xsk_features & (1U << i)) + printf(" %s", netdev_xsk_flags_str(1 << i)); + } + printf(" xdp-zc-max-segs=%u", d->xdp_zc_max_segs); name = netdev_op_str(op); From patchwork Mon Nov 27 19:03:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470123 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="3eYatKmL" Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 928E7D60 for ; Mon, 27 Nov 2023 11:03:29 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5cdc42ce40fso44452117b3.2 for ; Mon, 27 Nov 2023 11:03:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111809; x=1701716609; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=mXyfbdDoDU35V1qQBuOVjHj1blo3Ypu1CcZZ4PLEhjs=; b=3eYatKmLBYS8PMkTytp4zEEhwOHBVCt6FQ3085teBl5I74cs9Uat6QKQGrYHteORvj Q4up39gBI/uFd6jtdTBma0UmY+k61CdJxz6tWu9TeDpW2zgTFR1hPHSLt5G5bjqGE3H/ FhA/7MvMfz6nrwdpJUGpchAiZfpmlp5cvdq/umZPbytftswbIxhvbUoMC+vlD5yyKBL+ 8bYVDRrcpI7ysIWZ1/twQ54OIs6OK1/ByfGm2POy3vyvLkn0bEYbpRZiXMDxBh8kmJl3 zmX4uQZIuCE9tssUbrEB/BZtZb7pzyy2J+HTIfMjeXrqrpbJl4GHiLuROimguEjq0KOT zMhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111809; x=1701716609; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=mXyfbdDoDU35V1qQBuOVjHj1blo3Ypu1CcZZ4PLEhjs=; b=EdEdyZx26xxO+yfhyXVrISYWWVFZluoZhghoBbIRZXwkzsfMg2Jso3+B+RYGc3PKKB 9/VHfpKq/yPF3SIgFRRpSkjFJv9gx2vNNj31dO1fbb3NllSRI61XU+LJMK8tS2fgu+af hhfERppMuSbY+p7hppa9bU1Fys3l3dDflJcdVqiST8n2QsMIt709Bc6RIHvwZRJnzwuR ktVzNlvPdq0S/CNdfJ6jEzBIeQ3mxFgqq56K8s9PurCJvK+aA3pX85ObvYBNCQTn9rdj /7jN4gIoQCQ1M7U7E1IUtArRjKZlvGjCKdtcBZYv49GBIilwxseJOA0CvGjr/IXmQ6GC KvqA== X-Gm-Message-State: AOJu0YzDs887nfwePi1G524OCN3VwTLyxZmtwh+LbGIKpDv4GET4RnAF 3jEDF+T1ReobFSg1qdtIWvhzEFR1HTes1cozJ3CJwcXUshw5Tv13BSFS2lv7GXpuX9PHAXs2eNX 58hMJawR/sWULUEAs1H43ou2m9XyCszjp+2KGv3RM/s8k0kbtNA== X-Google-Smtp-Source: AGHT+IFmV7ENvadmUVxNYSmWVJCUreSJr5yA3h5e+KYA8IeXEvPIO5RG7Zwwa7QxyP6sr7PitxKzavM= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:690c:480d:b0:5ce:a88:8436 with SMTP id hc13-20020a05690c480d00b005ce0a888436mr364489ywb.10.1701111808609; Mon, 27 Nov 2023 11:03:28 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:10 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-5-sdf@google.com> Subject: [PATCH bpf-next v6 04/13] net/mlx5e: Implement AF_XDP TX timestamp and checksum offload From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net, Saeed Mahameed X-Patchwork-Delegate: bpf@iogearbox.net TX timestamp: - requires passing clock, not sure I'm passing the correct one (from cq->mdev), but the timestamp value looks convincing TX checksum: - looks like device does packet parsing (and doesn't accept custom start/offset), so I'm ignoring user offsets Cc: Saeed Mahameed Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 4 +- .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 72 ++++++++++++++++--- .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 11 ++- .../ethernet/mellanox/mlx5/core/en/xsk/tx.c | 17 ++++- .../net/ethernet/mellanox/mlx5/core/en_main.c | 1 + 5 files changed, 89 insertions(+), 16 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index b2a5da9739d2..43f027bf2da3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -484,10 +484,12 @@ struct mlx5e_xdp_info_fifo { struct mlx5e_xdpsq; struct mlx5e_xmit_data; +struct xsk_tx_metadata; typedef int (*mlx5e_fp_xmit_xdp_frame_check)(struct mlx5e_xdpsq *); typedef bool (*mlx5e_fp_xmit_xdp_frame)(struct mlx5e_xdpsq *, struct mlx5e_xmit_data *, - int); + int, + struct xsk_tx_metadata *); struct mlx5e_xdpsq { /* data path */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 7decc81ed33a..e2e7d82cfca4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -103,7 +103,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL))) return false; /* xmit_mode == MLX5E_XDP_XMIT_MODE_FRAME */ @@ -145,7 +145,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL))) return false; /* xmit_mode == MLX5E_XDP_XMIT_MODE_PAGE */ @@ -261,6 +261,37 @@ const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = { .xmo_rx_hash = mlx5e_xdp_rx_hash, }; +struct mlx5e_xsk_tx_complete { + struct mlx5_cqe64 *cqe; + struct mlx5e_cq *cq; +}; + +static u64 mlx5e_xsk_fill_timestamp(void *_priv) +{ + struct mlx5e_xsk_tx_complete *priv = _priv; + u64 ts; + + ts = get_cqe_ts(priv->cqe); + + if (mlx5_is_real_time_rq(priv->cq->mdev) || mlx5_is_real_time_sq(priv->cq->mdev)) + return mlx5_real_time_cyc2time(&priv->cq->mdev->clock, ts); + + return mlx5_timecounter_cyc2time(&priv->cq->mdev->clock, ts); +} + +static void mlx5e_xsk_request_checksum(u16 csum_start, u16 csum_offset, void *priv) +{ + struct mlx5_wqe_eth_seg *eseg = priv; + + /* HW/FW is doing parsing, so offsets are largely ignored. */ + eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; +} + +const struct xsk_tx_metadata_ops mlx5e_xsk_tx_metadata_ops = { + .tmo_fill_timestamp = mlx5e_xsk_fill_timestamp, + .tmo_request_checksum = mlx5e_xsk_request_checksum, +}; + /* returns true if packet was consumed by xdp */ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf) @@ -398,11 +429,11 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result); + int check_result, struct xsk_tx_metadata *meta); INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result) + int check_result, struct xsk_tx_metadata *meta) { struct mlx5e_tx_mpwqe *session = &sq->mpwqe; struct mlx5e_xdpsq_stats *stats = sq->stats; @@ -420,7 +451,7 @@ mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptx */ if (unlikely(sq->mpwqe.wqe)) mlx5e_xdp_mpwqe_complete(sq); - return mlx5e_xmit_xdp_frame(sq, xdptxd, 0); + return mlx5e_xmit_xdp_frame(sq, xdptxd, 0, meta); } if (!xdptxd->len) { skb_frag_t *frag = &xdptxdf->sinfo->frags[0]; @@ -450,6 +481,7 @@ mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptx * and it's safe to complete it at any time. */ mlx5e_xdp_mpwqe_session_start(sq); + xsk_tx_metadata_request(meta, &mlx5e_xsk_tx_metadata_ops, &session->wqe->eth); } mlx5e_xdp_mpwqe_add_dseg(sq, p, stats); @@ -480,7 +512,7 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq) INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result) + int check_result, struct xsk_tx_metadata *meta) { struct mlx5e_xmit_data_frags *xdptxdf = container_of(xdptxd, struct mlx5e_xmit_data_frags, xd); @@ -599,6 +631,8 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, sq->pc++; } + xsk_tx_metadata_request(meta, &mlx5e_xsk_tx_metadata_ops, eseg); + sq->doorbell_cseg = cseg; stats->xmit++; @@ -608,7 +642,9 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_wqe_info *wi, u32 *xsk_frames, - struct xdp_frame_bulk *bq) + struct xdp_frame_bulk *bq, + struct mlx5e_cq *cq, + struct mlx5_cqe64 *cqe) { struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo; u16 i; @@ -668,10 +704,24 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, break; } - case MLX5E_XDP_XMIT_MODE_XSK: + case MLX5E_XDP_XMIT_MODE_XSK: { /* AF_XDP send */ + struct xsk_tx_metadata_compl *compl = NULL; + struct mlx5e_xsk_tx_complete priv = { + .cqe = cqe, + .cq = cq, + }; + + if (xp_tx_metadata_enabled(sq->xsk_pool)) { + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + compl = &xdpi.xsk_meta; + + xsk_tx_metadata_complete(compl, &mlx5e_xsk_tx_metadata_ops, &priv); + } + (*xsk_frames)++; break; + } default: WARN_ON_ONCE(true); } @@ -720,7 +770,7 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq) sqcc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq, cq, cqe); } while (!last_wqe); if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) { @@ -767,7 +817,7 @@ void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq) sq->cc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq, NULL, NULL); } xdp_flush_frame_bulk(&bq); @@ -840,7 +890,7 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, } ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0); + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL); if (unlikely(!ret)) { int j; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h index ecfe93a479da..e054db1e10f8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h @@ -33,6 +33,7 @@ #define __MLX5_EN_XDP_H__ #include +#include #include "en.h" #include "en/txrx.h" @@ -82,7 +83,7 @@ enum mlx5e_xdp_xmit_mode { * num, page_1, page_2, ... , page_num. * * MLX5E_XDP_XMIT_MODE_XSK: - * none. + * frame.xsk_meta. */ #define MLX5E_XDP_FIFO_ENTRIES2DS_MAX_RATIO 4 @@ -97,6 +98,7 @@ union mlx5e_xdp_info { u8 num; struct page *page; } page; + struct xsk_tx_metadata_compl xsk_meta; }; struct mlx5e_xsk_param; @@ -112,13 +114,16 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, u32 flags); extern const struct xdp_metadata_ops mlx5e_xdp_metadata_ops; +extern const struct xsk_tx_metadata_ops mlx5e_xsk_tx_metadata_ops; INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result)); + int check_result, + struct xsk_tx_metadata *meta)); INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result)); + int check_result, + struct xsk_tx_metadata *meta)); INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq *sq)); INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c index 597f319d4770..a59199ed590d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c @@ -55,12 +55,16 @@ static void mlx5e_xsk_tx_post_err(struct mlx5e_xdpsq *sq, nopwqe = mlx5e_post_nop(&sq->wq, sq->sqn, &sq->pc); mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, *xdpi); + if (xp_tx_metadata_enabled(sq->xsk_pool)) + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .xsk_meta = {} }); sq->doorbell_cseg = &nopwqe->ctrl; } bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) { struct xsk_buff_pool *pool = sq->xsk_pool; + struct xsk_tx_metadata *meta = NULL; union mlx5e_xdp_info xdpi; bool work_done = true; bool flush = false; @@ -93,12 +97,13 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) xdptxd.dma_addr = xsk_buff_raw_get_dma(pool, desc.addr); xdptxd.data = xsk_buff_raw_get_data(pool, desc.addr); xdptxd.len = desc.len; + meta = xsk_buff_get_metadata(pool, desc.addr); xsk_buff_raw_dma_sync_for_device(pool, xdptxd.dma_addr, xdptxd.len); ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, mlx5e_xmit_xdp_frame, sq, &xdptxd, - check_result); + check_result, meta); if (unlikely(!ret)) { if (sq->mpwqe.wqe) mlx5e_xdp_mpwqe_complete(sq); @@ -106,6 +111,16 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) mlx5e_xsk_tx_post_err(sq, &xdpi); } else { mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi); + if (xp_tx_metadata_enabled(sq->xsk_pool)) { + struct xsk_tx_metadata_compl compl; + + xsk_tx_metadata_to_compl(meta, &compl); + XSK_TX_COMPL_FITS(void *); + + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) + { .xsk_meta = compl }); + } } flush = true; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 3aecdf099a2f..0197eefb2108 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5164,6 +5164,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev) netdev->netdev_ops = &mlx5e_netdev_ops; netdev->xdp_metadata_ops = &mlx5e_xdp_metadata_ops; + netdev->xsk_tx_metadata_ops = &mlx5e_xsk_tx_metadata_ops; mlx5e_dcbnl_build_netdev(netdev); From patchwork Mon Nov 27 19:03:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470124 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ToxMr7Rk" Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39F76D63 for ; Mon, 27 Nov 2023 11:03:31 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 41be03b00d2f7-5c19a0a2fbfso4775105a12.2 for ; Mon, 27 Nov 2023 11:03:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111810; x=1701716610; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=7OOxlshDL1+qiGyZH7/4x6ZmA5ZmMNeW0Lx+zNtrVXo=; b=ToxMr7Rkip/O6fy5cMpQ1HJ+ptmncngkKlKOZOrgHLh3hBFpjMQrEAcKjLE5Ul0M7A nkDBmxJ2i7Jt7wyac3PNdXYR4Ru7KZ8GUFUOAFBCWZsUKvaWSuELM9xB6oS0z6+Tg0y7 YsR6ZSj4s0znrZVn4VW56hDMOM8iqNfivWt9javlB5AbwYGrXyvJ6OaodxF4Vg0MLlPZ JD3FeBRx+v/ivkbxJAKVXo+tugoFDe7atmZ94Jr9JOHTtIC2HKBRi8jP6IVo+HKsKzdy ll0I7U3++irrKoXvlxuqEjueOnT/E13B8TkG/241DJE8IxNeQBOqnDbGqKwdB7xWBwJS n8ig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111810; x=1701716610; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=7OOxlshDL1+qiGyZH7/4x6ZmA5ZmMNeW0Lx+zNtrVXo=; b=stUicTZ/MGuYUcC0dCkphdvmr84lLIOYUkVFA7L4c2zL6T1+LPAT02dI4rZimx5L5X CeAZblDpicZ82phCHTWwPSzk3ngsrtitmFhmKTaGbiPHzDIDQ6rEWeQUD1PNgzeip5WP BTtC69QUIpBmSV7MsWk0ntmuvb4SJMPOPUt/+lYtSJj1hEuI1bFBxUyN6+imaWbj5BNo PFDiJHEzrZYqW/AHuLgnD4iheUT6X9H21NlyONF1RIGrEbivqap9NnPnms7vOKbse1I+ pg/4vCykyS37f0ywDkNjlfpc3DwG0llQw41GNqDTS1/NGe3aKXv3VKQvTIi7Zq1LQLf3 X2Aw== X-Gm-Message-State: AOJu0Yw3zVYCTSIfNVVK/XOKnKaAjztYP2koYDWKbizHGGSpk46JFb55 Y+LdTcCWjjJi0SXBPwRiFNxjqZy+HbBcqd6twWLJ6nrpdxNL2vAhn9T1fiS0phfU9qPW0jEQK+V FI+nulKW69Llp2DolwG17ntx/3VgLW1DCkexYHdRXq0oI9/Skmg== X-Google-Smtp-Source: AGHT+IEOIzMiS+XOzQc1u4IE3dLr+juJzGGWDgeT/yHSLeJn0FuerExxj/8qzJZKZf0TgtvvLe4Alm0= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a63:211f:0:b0:5bd:4149:a123 with SMTP id h31-20020a63211f000000b005bd4149a123mr2141422pgh.7.1701111810604; Mon, 27 Nov 2023 11:03:30 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:11 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-6-sdf@google.com> Subject: [PATCH bpf-next v6 05/13] net: stmmac: Add Tx HWTS support to XDP ZC From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net From: Song Yoong Siang This patch enables transmit hardware timestamp support to XDP zero copy via XDP Tx metadata framework. This patchset is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and results. Command on DUT: sudo ./xdp_hw_metadata sudo hwstamp_ctl -i -t 1 -r 1 Command on Link Partner: echo -n xdp | nc -u -q1 9091 Result: xsk_ring_cons__peek: 1 0x55bbbf08b6d0: rx_desc[2]->addr=8c100 addr=8c100 comp_addr=8c100 EoP No rx_hash err=-95 rx_timestamp: 1677762688429141540 (sec:1677762688.4291) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to User RX-time sec:0.0003 (250.665 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User RX-time sec:0.0000 (16.608 usec) 0x55bbbf08b6d0: ping-pong with csum=561c (want f488) csum_start=34 csum_offset=6 0x55bbbf08b6d0: complete tx idx=2 addr=2008 tx_timestamp: 1677762688431127273 (sec:1677762688.4311) HW TX-complete-time: 1677762688431127273 (sec:1677762688.4311) delta to User TX-complete-time sec:0.0083 (8331.655 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User TX-complete-time sec:0.0101 (10083.331 usec) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to HW TX-complete-time sec:0.0020 (1985.733 usec) 0x55bbbf08b6d0: complete rx idx=130 addr=8c100 Signed-off-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 12 ++++ .../net/ethernet/stmicro/stmmac/stmmac_main.c | 64 ++++++++++++++++++- 2 files changed, 75 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index cd7a9768de5f..686c94c2e8a7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -51,6 +51,7 @@ struct stmmac_tx_info { bool last_segment; bool is_jumbo; enum stmmac_txbuf_type buf_type; + struct xsk_tx_metadata_compl xsk_meta; }; #define STMMAC_TBS_AVAIL BIT(0) @@ -100,6 +101,17 @@ struct stmmac_xdp_buff { struct dma_desc *ndesc; }; +struct stmmac_metadata_request { + struct stmmac_priv *priv; + struct dma_desc *tx_desc; + bool *set_ic; +}; + +struct stmmac_xsk_tx_complete { + struct stmmac_priv *priv; + struct dma_desc *desc; +}; + struct stmmac_rx_queue { u32 rx_count_frames; u32 queue_index; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 8964fc8a955f..c2ac88aaffed 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2430,6 +2430,46 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv) } } +static void stmmac_xsk_request_timestamp(void *_priv) +{ + struct stmmac_metadata_request *meta_req = _priv; + + stmmac_enable_tx_timestamp(meta_req->priv, meta_req->tx_desc); + *meta_req->set_ic = true; +} + +static u64 stmmac_xsk_fill_timestamp(void *_priv) +{ + struct stmmac_xsk_tx_complete *tx_compl = _priv; + struct stmmac_priv *priv = tx_compl->priv; + struct dma_desc *desc = tx_compl->desc; + bool found = false; + u64 ns = 0; + + if (!priv->hwts_tx_en) + return 0; + + /* check tx tstamp status */ + if (stmmac_get_tx_timestamp_status(priv, desc)) { + stmmac_get_timestamp(priv, desc, priv->adv_ts, &ns); + found = true; + } else if (!stmmac_get_mac_tx_timestamp(priv, priv->hw, &ns)) { + found = true; + } + + if (found) { + ns -= priv->plat->cdc_error_adj; + return ns_to_ktime(ns); + } + + return 0; +} + +static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { + .tmo_request_timestamp = stmmac_xsk_request_timestamp, + .tmo_fill_timestamp = stmmac_xsk_fill_timestamp, +}; + static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) { struct netdev_queue *nq = netdev_get_tx_queue(priv->dev, queue); @@ -2449,6 +2489,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) budget = min(budget, stmmac_tx_avail(priv, queue)); while (budget-- > 0) { + struct stmmac_metadata_request meta_req; + struct xsk_tx_metadata *meta = NULL; dma_addr_t dma_addr; bool set_ic; @@ -2472,6 +2514,7 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) tx_desc = tx_q->dma_tx + entry; dma_addr = xsk_buff_raw_get_dma(pool, xdp_desc.addr); + meta = xsk_buff_get_metadata(pool, xdp_desc.addr); xsk_buff_raw_dma_sync_for_device(pool, dma_addr, xdp_desc.len); tx_q->tx_skbuff_dma[entry].buf_type = STMMAC_TXBUF_T_XSK_TX; @@ -2499,6 +2542,11 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) else set_ic = false; + meta_req.priv = priv; + meta_req.tx_desc = tx_desc; + meta_req.set_ic = &set_ic; + xsk_tx_metadata_request(meta, &stmmac_xsk_tx_metadata_ops, + &meta_req); if (set_ic) { tx_q->tx_count_frames = 0; stmmac_set_tx_ic(priv, tx_desc); @@ -2511,6 +2559,9 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) stmmac_enable_dma_transmission(priv, priv->ioaddr); + xsk_tx_metadata_to_compl(meta, + &tx_q->tx_skbuff_dma[entry].xsk_meta); + tx_q->cur_tx = STMMAC_GET_ENTRY(tx_q->cur_tx, priv->dma_conf.dma_tx_size); entry = tx_q->cur_tx; } @@ -2620,8 +2671,18 @@ static int stmmac_tx_clean(struct stmmac_priv *priv, int budget, u32 queue, } else { tx_packets++; } - if (skb) + if (skb) { stmmac_get_tx_hwtstamp(priv, p, skb); + } else { + struct stmmac_xsk_tx_complete tx_compl = { + .priv = priv, + .desc = p, + }; + + xsk_tx_metadata_complete(&tx_q->tx_skbuff_dma[entry].xsk_meta, + &stmmac_xsk_tx_metadata_ops, + &tx_compl); + } } if (likely(tx_q->tx_skbuff_dma[entry].buf && @@ -7464,6 +7525,7 @@ int stmmac_dvr_probe(struct device *device, ndev->netdev_ops = &stmmac_netdev_ops; ndev->xdp_metadata_ops = &stmmac_xdp_metadata_ops; + ndev->xsk_tx_metadata_ops = &stmmac_xsk_tx_metadata_ops; ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM; From patchwork Mon Nov 27 19:03:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470125 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="OscE3gt4" Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4BFFCD6D for ; Mon, 27 Nov 2023 11:03:33 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5cd6a86a898so52729067b3.3 for ; Mon, 27 Nov 2023 11:03:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111812; x=1701716612; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=zp0NfVT6k+O+cDddTZy09JqchV8tKkVXuv7rsNqmLRk=; b=OscE3gt4+csTDYj2aCvRlzNko3HlRLpX8E/IdycxO/jCWsv4HJUU/ca7gqew1cL9/j jPRIsGULNg1pDBeewZ4wtsPZmWBLXSJQiHPJlzIXGajvKj/prnVWe959Eedf6UYELnGK o+splSS4zVl0R+8la3tp0bTAEaCWqwHE2IaWQEWs+H7yZdY/83STRrM6nHYg1R2V6Xpu 4E1BwhZ7a3/1maIOH2Omj9lDfxPApicM2iQpI2yJR7+0RBCW4pv7QTDSWU0QdxDDRykb Ku2QSZZpXomAbOysagoXH4MqufKOzWyPuHDfGWApEsJucH5hfkqS/5XefVto0dGVKFu+ zfEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111812; x=1701716612; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=zp0NfVT6k+O+cDddTZy09JqchV8tKkVXuv7rsNqmLRk=; b=LJrVTjkUColzB923wtm1rBB2rTBc9CYpG8oBdrxrloudahc/kNhqtZU3MeQLpkzhgn D8BSYMNW4cL3PH4Jh1QRB8kyGX+rQWD5Ha6+kX2MwPDEvqCbXRvbiMfGqNakixtDpgVN Lx903X9tQpJUX7JyTRnFV8VDuQsB15VYSBuPjOEiZs4nddMHmmb+JIBF0lgCB2V79Vn4 d9QbOpdmO4rkEZu2vs2CICvZm4AUf9WiWHSaFkXx09kBlY2JHx4TjBVrIDW7Sziu6H/l ViOBh20v9rJamCTfuD+/xWTtb6IcPInn43kEcfWIIg4OBjVOIWT+kN+aqeIaHTFxpyYO Xwhg== X-Gm-Message-State: AOJu0YziarDkezXRiGcwsh54qbOg0nZkAWiFH9uoG86c8Z1BwZzjc5S7 sCY3ZGf6XV6GId0V/EItNrpvpiucJmFp2rpIhBxDGDBbVdZccXexOMUQ8luDs7MfFR8AXnJEJoI 6MCLY5deNmbDHPRrLttTGKkR0wQwq/l/tAVWvFG9khUkIdNuXdw== X-Google-Smtp-Source: AGHT+IFD35ro+x7PheJe+DmDJlu2/a0Hzt5yVOIeVEMuqqZBdyVEqsELGNDZNdlVsxO3j8SPEB5mPXE= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:690c:844:b0:5cb:d84d:218c with SMTP id bz4-20020a05690c084400b005cbd84d218cmr415973ywb.0.1701111812437; Mon, 27 Nov 2023 11:03:32 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:12 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-7-sdf@google.com> Subject: [PATCH bpf-next v6 06/13] xsk: Document tx_metadata_len layout From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net - how to use - how to query features - pointers to the examples Signed-off-by: Stanislav Fomichev --- Documentation/networking/index.rst | 1 + Documentation/networking/xdp-rx-metadata.rst | 2 + Documentation/networking/xsk-tx-metadata.rst | 70 ++++++++++++++++++++ 3 files changed, 73 insertions(+) create mode 100644 Documentation/networking/xsk-tx-metadata.rst diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 683eb42309cc..a297a894b366 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -123,6 +123,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics. xfrm_sync xfrm_sysctl xdp-rx-metadata + xsk-tx-metadata .. only:: subproject and html diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst index 205696780b78..e3e9420fd817 100644 --- a/Documentation/networking/xdp-rx-metadata.rst +++ b/Documentation/networking/xdp-rx-metadata.rst @@ -1,3 +1,5 @@ +.. SPDX-License-Identifier: GPL-2.0 + =============== XDP RX Metadata =============== diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst new file mode 100644 index 000000000000..4f376560b23f --- /dev/null +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -0,0 +1,70 @@ +================== +AF_XDP TX Metadata +================== + +This document describes how to enable offloads when transmitting packets +via :doc:`af_xdp`. Refer to :doc:`xdp-rx-metadata` on how to access similar +metadata on the receive side. + +General Design +============== + +The headroom for the metadata is reserved via ``tx_metadata_len`` in +``struct xdp_umem_reg``. The metadata length is therefore the same for +every socket that shares the same umem. The metadata layout is a fixed UAPI, +refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``. +Thus, generally, the ``tx_metadata_len`` field above should contain +``sizeof(union xsk_tx_metadata)``. + +The headroom and the metadata itself should be located right before +``xdp_desc->addr`` in the umem frame. Within a frame, the metadata +layout is as follows:: + + tx_metadata_len + / \ + +-----------------+---------+----------------------------+ + | xsk_tx_metadata | padding | payload | + +-----------------+---------+----------------------------+ + ^ + | + xdp_desc->addr + +An AF_XDP application can request headrooms larger than ``sizeof(struct +xsk_tx_metadata)``. The kernel will ignore the padding (and will still +use ``xdp_desc->addr - tx_metadata_len`` to locate +the ``xsk_tx_metadata``). For the frames that shouldn't carry +any metadata (i.e., the ones that don't have ``XDP_TX_METADATA`` option), +the metadata area is ignored by the kernel as well. + +The flags field enables the particular offload: + +- ``XDP_TXMD_FLAGS_TIMESTAMP``: requests the device to put transmission + timestamp into ``tx_timestamp`` field of ``union xsk_tx_metadata``. +- ``XDP_TXMD_FLAGS_CHECKSUM``: requests the device to calculate L4 + checksum. ``csum_start`` specifies byte offset of where the checksumming + should start and ``csum_offset`` specifies byte offset where the + device should store the computed checksum. + +Besides the flags above, in order to trigger the offloads, the first +packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` +bit in the ``options`` field. Also note that in a multi-buffer packet +only the first chunk should carry the metadata. + +Querying Device Capabilities +============================ + +Every devices exports its offloads capabilities via netlink netdev family. +Refer to ``xsk-flags`` features bitmask in +``Documentation/netlink/specs/netdev.yaml``. + +- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` +- ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` + +See ``tools/net/ynl/samples/netdev.c`` on how to query this information. + +Example +======= + +See ``tools/testing/selftests/bpf/xdp_hw_metadata.c`` for an example +program that handles TX metadata. Also see https://github.com/fomichev/xskgen +for a more bare-bones example. From patchwork Mon Nov 27 19:03:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470127 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="4PvLDmPl" Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2FDED5D for ; Mon, 27 Nov 2023 11:03:34 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id 98e67ed59e1d1-28525981a55so4788969a91.1 for ; Mon, 27 Nov 2023 11:03:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111814; x=1701716614; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=99qVGEnNEQZ0Ub6NY42NFhmi4pNAJrO1wmYriAMOllE=; b=4PvLDmPlZoP9x2c6EE2CeEQ3xJyFqpq6tQvQHcwqsSu+B/DfaRPrOQBCug46/nuEWA MgUMsufYSklRmUVJ6D4Wj+vyYaipVEQUZA8ywZA/lpSq6hpyOCoUymPkGdimBYSayN1T ZcjaiSX00TcWVKDGfLaz3lU7uqtsYcN06/zrWp4G2XVdCkQrmmD3kdB6dM3qB95sNTK7 9p05aAWOcIP59u6e6Wo0pjsQliIEt3huBacaL1fiSFNlAzlWmhR7xfUzplsp6z0Usuxp gvHgvfR8vxkp432D9lk3iuTVDiTuIFwDUuKi5CAUA6+Ke7hhAmtmvgXdxTmwtLDRhbm9 AI+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111814; x=1701716614; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=99qVGEnNEQZ0Ub6NY42NFhmi4pNAJrO1wmYriAMOllE=; b=McIzjU67B3NS2kGIWb0k+aL9DjIxbo7sYjzzhea23mp6hgxo/2m026eJ41IyKKi1bz 9M9tz8WopfTmSJWzfEUPgDko0HqaCIxRetk5S+LaYARX7+olT3+2Mc7eNmGY46KSVZ47 VHGsmyZ51kcKgPSu7/8y0L6FP1WiXakr9cEhH4+YE4WKs+hDcUTFASAv6cwHD44rz8zs rc0QJFH77EgP6fz3On/E90VaMqcu1h7+g2NC+r4fHOpOA8ZdcJMcQC+wUH76nn6DE/9G GteSKp/KTrjf9upGn5+5g/DpAZWHbWZzY5Gwn/VSvGy/Sruwbo+5ze0a3HMT8MgX/yiT jq4Q== X-Gm-Message-State: AOJu0Yy64A0cwtKyQfwb2T4iIHSEWmDW7svS2yBicMjbqyynrMncpnaB 191K5BstMp2x+yhtUjt0rW/WyPA3u3a6cevMTxBBGZCcMxIb4IWJ8/N/NX60JPIfQpMEMKrwxgC gX5plxKP7WMfUPl2qzw/LW3yhkEzv815EXmv1S8HNoe7wG7xHCw== X-Google-Smtp-Source: AGHT+IF4LeF4F6ujUaxMms+ifKigtIl1GqgRBxF8gPMPkhkgaMo0syeBjyEJ09ROh5SanlMUaHBbGMU= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:90a:ee8d:b0:27d:2cf5:7eb1 with SMTP id i13-20020a17090aee8d00b0027d2cf57eb1mr2904712pjz.4.1701111814269; Mon, 27 Nov 2023 11:03:34 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:13 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-8-sdf@google.com> Subject: [PATCH bpf-next v6 07/13] xsk: Validate xsk_tx_metadata flags From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Accept only the flags that the kernel knows about to make sure we can extend this field in the future. Note that only in XDP_COPY mode we propagate the error signal back to the user (via sendmsg). For zerocopy mode we silently skip the metadata for the descriptors that have wrong flags (since we process the descriptors deep in the driver). Signed-off-by: Stanislav Fomichev --- include/net/xdp_sock_drv.h | 23 ++++++++++++++++++++++- net/xdp/xsk.c | 4 ++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index e2558ac3e195..81e02de3f453 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -165,12 +165,28 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return xp_raw_get_data(pool, addr); } +#define XDP_TXMD_FLAGS_VALID ( \ + XDP_TXMD_FLAGS_TIMESTAMP | \ + XDP_TXMD_FLAGS_CHECKSUM | \ + 0) + +static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) +{ + return !(meta->flags & ~XDP_TXMD_FLAGS_VALID); +} + static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) { + struct xsk_tx_metadata *meta; + if (!pool->tx_metadata_len) return NULL; - return xp_raw_get_data(pool, addr) - pool->tx_metadata_len; + meta = xp_raw_get_data(pool, addr) - pool->tx_metadata_len; + if (unlikely(!xsk_buff_valid_tx_metadata(meta))) + return NULL; /* no way to signal the error to the user */ + + return meta; } static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) @@ -332,6 +348,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return NULL; } +static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) +{ + return false; +} + static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) { return NULL; diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index e83ade32f1fd..d66ba9d6154f 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -728,6 +728,10 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, } meta = buffer - xs->pool->tx_metadata_len; + if (unlikely(!xsk_buff_valid_tx_metadata(meta))) { + err = -EINVAL; + goto free_err; + } if (meta->flags & XDP_TXMD_FLAGS_CHECKSUM) { if (unlikely(meta->request.csum_start + From patchwork Mon Nov 27 19:03:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470126 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="eIVjk+U+" Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55F8FD64 for ; Mon, 27 Nov 2023 11:03:37 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5cd573c2cccso27462067b3.1 for ; Mon, 27 Nov 2023 11:03:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111816; x=1701716616; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=5QvtONr4OjbkhkoDPQ2air09KCE/816l/OMoWhPA4kI=; b=eIVjk+U+yo/SqZViQJPge6E0rN5WX+uhQLx/l7vgSwVKN0AnzOYHnTlY8nkzOCmMvZ A0RANphhj1LIwgJed2pmCoiVvMzrNAhoUxjH90FiFk9vtxsypWjLHBdsBEvX7F7FLA+C MYtWIHc+wUFkXO2QlRPCj0ILMlnUIqbAi9gz9nqkljMqOQG0hUW1xNwx9K4BBfWjR82Q lPNSPy7Tx2Wk7OYzjkDcBwE1H9S8JtnHvf1byasA8bYFFo7OZOTXitdj37BHaGdWi0CT LwSEXUM+UI/x/TAbJaylJ/3aJ5lwXsQu9cC6tBFFDfz7GFDxf4Mgj8jiZgIk19Gi81uH 9ixg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111816; x=1701716616; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=5QvtONr4OjbkhkoDPQ2air09KCE/816l/OMoWhPA4kI=; b=n1wA2FGWFcRmVBZMM6LLAeDWQFcwVdNGc/z9KbSMFGoUDAs6hVMoZVRimbf5rIX8I8 5ViVbZOuTjI3GqVug6r5B78Df7er5XEqxePiGPQmAZHycvyc06WNS4CZZQzsqwPnOqbL AzXgQJgPciEZnOpp0uvF4OEhtxsJkSy4islUuHIX+e2BhYZbCySwZ4JSUkituoysxvDk Zf9avhHZfhHx18BeBMYrR1WBp4LJhlhjOwkRcCmP0skVVOwlqp5fYJAqTgETg/3XciY4 QEwwFG7WAWFSOg/c5xNt6VpsUgaIFjvcdEKh82dr3h4zG3ls3Fy5fe1un10HlKByuKVX C0ZQ== X-Gm-Message-State: AOJu0Yzoy+/kZ3IUm7Na/imHvOwTkbHjmJJkVUsEhqkkfI8PuMB9qaaY 9iaku3GtYTCfVj4DotlB0mLAZxqTnvqNuttarQ5pWeJoqvCAYMfQCDlbS36gp1ed50Jd3CyG0Wz AjoDXV7iiOXUH2Tw6d2wHFdtNWQfN4Hr/x6pqlG5so+O+PcQL2Q== X-Google-Smtp-Source: AGHT+IF7uTKY5e37aqevUScyDnyGHkG7JIEfn64W34mnrvyLoyNqrtOGVcerCTq7PO979xB+WpnVmmA= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:690c:844:b0:5cc:ba30:bcd1 with SMTP id bz4-20020a05690c084400b005ccba30bcd1mr487498ywb.3.1701111816186; Mon, 27 Nov 2023 11:03:36 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:14 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-9-sdf@google.com> Subject: [PATCH bpf-next v6 08/13] xsk: Add option to calculate TX checksum in SW From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net For XDP_COPY mode, add a UMEM option XDP_UMEM_TX_SW_CSUM to call skb_checksum_help in transmit path. Might be useful to debugging issues with real hardware. I also use this mode in the selftests. Signed-off-by: Stanislav Fomichev --- Documentation/networking/xsk-tx-metadata.rst | 9 +++++++++ include/net/xsk_buff_pool.h | 1 + include/uapi/linux/if_xdp.h | 8 +++++++- net/xdp/xdp_umem.c | 7 ++++++- net/xdp/xsk.c | 6 ++++++ net/xdp/xsk_buff_pool.c | 1 + tools/include/uapi/linux/if_xdp.h | 8 +++++++- 7 files changed, 37 insertions(+), 3 deletions(-) diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index 4f376560b23f..97ecfa480d00 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -50,6 +50,15 @@ packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` bit in the ``options`` field. Also note that in a multi-buffer packet only the first chunk should carry the metadata. +Software TX Checksum +==================== + +For development and testing purposes its possible to pass +``XDP_UMEM_TX_SW_CSUM`` flag to ``XDP_UMEM_REG`` UMEM registration call. +In this case, when running in ``XDK_COPY`` mode, the TX checksum +is calculated on the CPU. Do not enable this option in production because +it will negatively affect performance. + Querying Device Capabilities ============================ diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 97f5cc10d79e..8d48d37ab7c0 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -83,6 +83,7 @@ struct xsk_buff_pool { bool uses_need_wakeup; bool dma_need_sync; bool unaligned; + bool tx_sw_csum; void *addrs; /* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect: * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 95de66d5a26c..d31698410410 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -33,7 +33,13 @@ #define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ -#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) + +/* Force checksum calculation in software. Can be used for testing or + * working around potential HW issues. This option causes performance + * degradation and only works in XDP_COPY mode. + */ +#define XDP_UMEM_TX_SW_CSUM (1 << 1) struct sockaddr_xdp { __u16 sxdp_family; diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 946a687fb8e8..caa340134b0e 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -148,6 +148,11 @@ static int xdp_umem_account_pages(struct xdp_umem *umem) return 0; } +#define XDP_UMEM_FLAGS_VALID ( \ + XDP_UMEM_UNALIGNED_CHUNK_FLAG | \ + XDP_UMEM_TX_SW_CSUM | \ + 0) + static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) { bool unaligned_chunks = mr->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG; @@ -167,7 +172,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) return -EINVAL; } - if (mr->flags & ~XDP_UMEM_UNALIGNED_CHUNK_FLAG) + if (mr->flags & ~XDP_UMEM_FLAGS_VALID) return -EINVAL; if (!unaligned_chunks && !is_power_of_2(chunk_size)) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index d66ba9d6154f..281d49b4fca4 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -744,6 +744,12 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, skb->csum_start = hr + meta->request.csum_start; skb->csum_offset = meta->request.csum_offset; skb->ip_summed = CHECKSUM_PARTIAL; + + if (unlikely(xs->pool->tx_sw_csum)) { + err = skb_checksum_help(skb); + if (err) + goto free_err; + } } } } diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 386eddcdf837..4f6f538a5462 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -86,6 +86,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs, pool->umem = umem; pool->addrs = umem->addrs; pool->tx_metadata_len = umem->tx_metadata_len; + pool->tx_sw_csum = umem->flags & XDP_UMEM_TX_SW_CSUM; INIT_LIST_HEAD(&pool->free_list); INIT_LIST_HEAD(&pool->xskb_list); INIT_LIST_HEAD(&pool->xsk_tx_list); diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index d0882edc1642..638c606dfa74 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -33,7 +33,13 @@ #define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ -#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) + +/* Force checksum calculation in software. Can be used for testing or + * working around potential HW issues. This option causes performance + * degradation and only works in XDP_COPY mode. + */ +#define XDP_UMEM_TX_SW_CSUM (1 << 1) struct sockaddr_xdp { __u16 sxdp_family; From patchwork Mon Nov 27 19:03:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470128 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="YGy2yYij" Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36F93D5B for ; Mon, 27 Nov 2023 11:03:39 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5cd2100f467so57975607b3.1 for ; Mon, 27 Nov 2023 11:03:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111818; x=1701716618; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=AIZwWc9a6llwfy/GJ9Hn1wyFbDKHqDafvICOFwSzkaA=; b=YGy2yYijMlf1v8lscnCk8S1rmGLTo7DiO65mq6NyRR2ESO+ajSQwgHBpCOnynb7D/R MwM4dUZLQ+WmRRieTObi0dzgLnUxpLAndzP8Z7XuFDKwEjOShU4eDlndqNBlI6wXI8zN D4rySoLVPPpQJPvWn0WQzgal6VSprZ8t50aDw4jBPOlJqJWTZbqAkzUBQ7/XSDXnIPBX p+c1q4nibtRjHzQKH+XSyrVepe2k1MU0UkGzn3FvM/DXCkTtMFg2pRK6VT29+lUf1Am9 jZoqIeWGgRQ91m3MsUW7e4DhhwBQJg+c7Aj7xww7hKwPnqgcGLNztsf4W6qi8TaaCexJ G3wQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111818; x=1701716618; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=AIZwWc9a6llwfy/GJ9Hn1wyFbDKHqDafvICOFwSzkaA=; b=qDhtrOSp3ZziGceEoEmeSnT8f3/TvnW5FIV8fnXLpkr8U/WwCQf34XbOmzCZSZtDuN WYkRiE6I/Ni6S3/wN36PULzjgKbPPboXQOqipTU3a2i5YiDjh7FYBaE7Xc6RU2nBvWFJ TuyS/DXc3dxc5Rr5zU/Hz7YrED3CpOT08C32hAaKt6zxmOcSz+5emUhtB+dJXUf8KsBy jEN6otgRvKSaEqz3nYK/Ik0PG4mJdTHWDiyX/gS3aiGWt9aVsqG+5dQ7DUtpirsbNhkd oZeq8a/z80tsI4r2MzUe3gBja+z3zSPSeC/Qk8lahyb2aRdncybk0nJiToEY8xS4bcaU 57qg== X-Gm-Message-State: AOJu0Ywzjl4Gp2w7qXiZdBvwihyYWvdqfPhmmjdvCuddDsd2e0V2Au9w rgldgwocvW5u8GAoN/2UcbPg5DiwZLISmvi5h0prpFCDawotX3tIFXiyvfmYkA5D0M7rvjNtpf6 F+GQlSyhccH5dJSUz7cBYMtGOUtS5A3Kmf6FX5xygodngwtN8mQ== X-Google-Smtp-Source: AGHT+IE8EPn2D8FiPemXYIFXefsg8Tsk7m3Yd2lEnAXCBSDAZJZtgbnRhsMN5Z+iMFgisc45OHcT14o= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:690c:3209:b0:5cc:234f:1860 with SMTP id ff9-20020a05690c320900b005cc234f1860mr371027ywb.3.1701111818184; Mon, 27 Nov 2023 11:03:38 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:15 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-10-sdf@google.com> Subject: [PATCH bpf-next v6 09/13] selftests/xsk: Support tx_metadata_len From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Add new config field and propagate to UMEM registration setsockopt. Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xsk.c | 3 +++ tools/testing/selftests/bpf/xsk.h | 1 + 2 files changed, 4 insertions(+) diff --git a/tools/testing/selftests/bpf/xsk.c b/tools/testing/selftests/bpf/xsk.c index e574711eeb84..25d568abf0f2 100644 --- a/tools/testing/selftests/bpf/xsk.c +++ b/tools/testing/selftests/bpf/xsk.c @@ -115,6 +115,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg, cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM; cfg->flags = XSK_UMEM__DEFAULT_FLAGS; + cfg->tx_metadata_len = 0; return; } @@ -123,6 +124,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg, cfg->frame_size = usr_cfg->frame_size; cfg->frame_headroom = usr_cfg->frame_headroom; cfg->flags = usr_cfg->flags; + cfg->tx_metadata_len = usr_cfg->tx_metadata_len; } static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg, @@ -252,6 +254,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, mr.chunk_size = umem->config.frame_size; mr.headroom = umem->config.frame_headroom; mr.flags = umem->config.flags; + mr.tx_metadata_len = umem->config.tx_metadata_len; err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr)); if (err) { diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h index 771570bc3731..93c2cc413cfc 100644 --- a/tools/testing/selftests/bpf/xsk.h +++ b/tools/testing/selftests/bpf/xsk.h @@ -200,6 +200,7 @@ struct xsk_umem_config { __u32 frame_size; __u32 frame_headroom; __u32 flags; + __u32 tx_metadata_len; }; int xsk_attach_xdp_program(struct bpf_program *prog, int ifindex, u32 xdp_flags); From patchwork Mon Nov 27 19:03:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470129 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ddfi+4eH" Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ACC40D5F for ; Mon, 27 Nov 2023 11:03:40 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id d9443c01a7336-1cfaeab7dafso28631205ad.1 for ; Mon, 27 Nov 2023 11:03:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111820; x=1701716620; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=RPaN4vg9SQDJHqYebN2/xKdXNgaWBfzTAU2wIw8MN1M=; b=ddfi+4eHj5Wh4pGNj6IGvkv/wNLWH1oB//3Sxg0otVK4lJIjv1wJ5KczsyN+aHmlsM dCjY/htveE0GjdbkkNe5PSQwQZv2s440K1Vc86xhnXlhT6JUZIzJy8m/h5tPL62vhyTx Vh/ovvgjVH6qy2l16PVb1cs4vir6+Lmsf7SrqIv8U3JEJ6OaLChTZwGwFaOdfubF53x0 w/43esw5uXq6kvoK2tWBhHJGoUqxzDDguTepi0yWc6PN+ojepYnY59LuBB69N0enIEWO Dpk5WBWa8YckW5Yfhofgt0UVXJOnB/V7N78rO+nbuExwp3/4NzfgJ/n6C/8XCv7EV1DU wmpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111820; x=1701716620; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=RPaN4vg9SQDJHqYebN2/xKdXNgaWBfzTAU2wIw8MN1M=; b=k7ywXRVCxa3JEM0LwMTtVfHEGie+S9fn6n3DCZIu+l1WRnhF/lkpghXz2XLOM1EvWq QWPEGfKreyyBeJ8mkEZcHRvblTPFdPj5i4Dz7IVfcdRpFzq1zxlrNi8sS0mJ2QC+LWx0 UFX/i7NSMA7AEKYOg77LdwjOTPJ0827VTheZhsbbQirWNIdPU40bJS163EWxB25K4p0q rKuKrdzjVCmAO2Qt8DNeoiDyByTY7bY+vN3XXahkotCPa4lyznogzNf6s1Z+2ciId5zi 6wb5LKj1lh7YmvI0MaeHJ+fl19uWzejKEWkuCgWUzKzKUQTv/+6xvJZyuy8VOeApvCNP JeeQ== X-Gm-Message-State: AOJu0Yx+wVJdP+ULknsDsVNz1muJuksztR5ryoGl+ppjEsbtKGnJAzyL 75Ckq81kX2JYGyPuv80gtJ0ZcmxQC1n1TcLTmugG3Y8W/4+PJjiPfgkni36r/IF1w3Lffj6Q5XY gFNdMZgHF5nKcI51A+dorIfPCjmNT677PBvb0+TcE4XKM+MmHsg== X-Google-Smtp-Source: AGHT+IFBTjrElYjTEV9WQEofymDfgw7+CwNBeZf4OoRmbs32unp/IJ0ulmfxZYSw4cDD/GY5fzng/uo= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:b948:b0:1cc:cc77:94ed with SMTP id h8-20020a170902b94800b001cccc7794edmr2695776pls.10.1701111819885; Mon, 27 Nov 2023 11:03:39 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:16 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-11-sdf@google.com> Subject: [PATCH bpf-next v6 10/13] selftests/bpf: Add csum helpers From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Checksum helpers will be used to calculate pseudo-header checksum in AF_XDP metadata selftests. The helpers are mirroring existing kernel ones: - csum_tcpudp_magic : IPv4 pseudo header csum - csum_ipv6_magic : IPv6 pseudo header csum - csum_fold : fold csum and do one's complement Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/network_helpers.h | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 34f1200a781b..94b9be24e39b 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -71,4 +71,47 @@ struct nstoken; */ struct nstoken *open_netns(const char *name); void close_netns(struct nstoken *token); + +static __u16 csum_fold(__u32 csum) +{ + csum = (csum & 0xffff) + (csum >> 16); + csum = (csum & 0xffff) + (csum >> 16); + + return (__u16)~csum; +} + +static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, + __wsum csum) +{ + __u64 s = csum; + + s += (__u32)saddr; + s += (__u32)daddr; + s += htons(proto + len); + s = (s & 0xffffffff) + (s >> 32); + s = (s & 0xffffffff) + (s >> 32); + + return csum_fold((__u32)s); +} + +static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, + __wsum csum) +{ + __u64 s = csum; + int i; + + for (i = 0; i < 4; i++) + s += (__u32)saddr->s6_addr32[i]; + for (i = 0; i < 4; i++) + s += (__u32)daddr->s6_addr32[i]; + s += htons(proto + len); + s = (s & 0xffffffff) + (s >> 32); + s = (s & 0xffffffff) + (s >> 32); + + return csum_fold((__u32)s); +} + #endif From patchwork Mon Nov 27 19:03:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470130 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="GA3PC37b" Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2B99D5D for ; Mon, 27 Nov 2023 11:03:41 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cfdf2b510fso8895315ad.2 for ; Mon, 27 Nov 2023 11:03:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111821; x=1701716621; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xeLlbkCksJisb3y6jjL289ZDTds1CqtWZxwNSr9V8Pk=; b=GA3PC37bEFnVv1FMuugicFqye6zG4W7STtE1hPt2UxR/Hi+YXLGEQnNq1fL70t+b9j yiigqwtoo7Zrxp3ly6z4GMFibSwW497OEnBNEOSyoF3Zvp9upXqROvciKw0OPbvqIZJx ATZKdbR4M70n7CxGeH44Z5mk9s7JG6J9HLbu8B9rhuDXNyagv7MjOzLcQ5tvmjKTJxB7 vKcvna5gGjhuwCH5ror1UBXtiMYf/gnHimHdTGtbf8y9P6FjmcQXHDD7z0R47NzF+zbg vdeXTtWEf1uN1C4pISW2hhg+P+4ubpAem0cojZY878q4qrN1BMj3SlnvMvtg9Xp6vo5M Q9/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111821; x=1701716621; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xeLlbkCksJisb3y6jjL289ZDTds1CqtWZxwNSr9V8Pk=; b=vovOq1943WnfdkTInFWcc/biiB6jRlay76KLeRSTXf31hv8OQz27/rFG9fafYefk7b r0U++khaIimqlhvjHAFnRsfxrJaElEzYQ+mYztNbciciF0q34BNI5qGGoHDaXWMObW1k s/s0Cl5ZgsrCqIp9EC/JtOgdAbMjNJCciWNSIJxAh+X7alXliUFrSxu1Pyy4aKqo5/PD FIzvCOi4v6sa5JRc+XeRmebt+X7Ki4wpd1NlCbfJkrMLGJvP67el4jGW8wFnViFCTQko ynK1THeiZ1ONGpK931T+7zuTu0gdC8zt7FVk1ZsxmGeSLj27Hv9Dk66pauzDMSwOian3 dihg== X-Gm-Message-State: AOJu0YzG0I3L/h2Kcick1I5uraCN6/Tyfl6OopFGhBww4P7FyS4hK6Iq jPnnM2KRk0GC0iqeIDxUerXc7a38jB3hG448n7d20tpvlr7GDCu1iREmP30BC+rOMTfQADTzJhV 8uYGYK8GNjfs3nHI8rbiBVDGxp5YSDmGps2onJ2rFN5CYiBYocw== X-Google-Smtp-Source: AGHT+IF6RNDkcP+Fi382qERYGP7LZKJ89O5uRYNHli3mJyFA9QcODvz5FyQmIFY35v4/oNc61SLKBxQ= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:903:1302:b0:1ce:1829:2df8 with SMTP id iy2-20020a170903130200b001ce18292df8mr2455154plb.12.1701111821379; Mon, 27 Nov 2023 11:03:41 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:17 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-12-sdf@google.com> Subject: [PATCH bpf-next v6 11/13] selftests/bpf: Add TX side to xdp_metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Request TX timestamp and make sure it's not empty. Request TX checksum offload (SW-only) and make sure it's resolved to the correct one. Signed-off-by: Stanislav Fomichev --- .../selftests/bpf/prog_tests/xdp_metadata.c | 33 ++++++++++++++++--- 1 file changed, 29 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c index 4439ba9392f8..33cdf88efa6b 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -56,7 +56,8 @@ static int open_xsk(int ifindex, struct xsk *xsk) .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, - .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG | XDP_UMEM_TX_SW_CSUM, + .tx_metadata_len = sizeof(struct xsk_tx_metadata), }; __u32 idx; u64 addr; @@ -138,6 +139,7 @@ static void ip_csum(struct iphdr *iph) static int generate_packet(struct xsk *xsk, __u16 dst_port) { + struct xsk_tx_metadata *meta; struct xdp_desc *tx_desc; struct udphdr *udph; struct ethhdr *eth; @@ -151,10 +153,14 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) return -1; tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); - tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE; + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE + sizeof(struct xsk_tx_metadata); printf("%p: tx_desc[%u]->addr=%llx\n", xsk, idx, tx_desc->addr); data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + meta = data - sizeof(struct xsk_tx_metadata); + memset(meta, 0, sizeof(*meta)); + meta->flags = XDP_TXMD_FLAGS_TIMESTAMP; + eth = data; iph = (void *)(eth + 1); udph = (void *)(iph + 1); @@ -178,11 +184,17 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) udph->source = htons(AF_XDP_SOURCE_PORT); udph->dest = htons(dst_port); udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); - udph->check = 0; + udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); memset(udph + 1, 0xAA, UDP_PAYLOAD_BYTES); + meta->flags |= XDP_TXMD_FLAGS_CHECKSUM; + meta->request.csum_start = sizeof(*eth) + sizeof(*iph); + meta->request.csum_offset = offsetof(struct udphdr, check); + tx_desc->len = sizeof(*eth) + sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES; + tx_desc->options |= XDP_TX_METADATA; xsk_ring_prod__submit(&xsk->tx, 1); ret = sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); @@ -194,13 +206,21 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) static void complete_tx(struct xsk *xsk) { - __u32 idx; + struct xsk_tx_metadata *meta; __u64 addr; + void *data; + __u32 idx; if (ASSERT_EQ(xsk_ring_cons__peek(&xsk->comp, 1, &idx), 1, "xsk_ring_cons__peek")) { addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr); + + data = xsk_umem__get_data(xsk->umem_area, addr); + meta = data - sizeof(struct xsk_tx_metadata); + + ASSERT_NEQ(meta->completion.tx_timestamp, 0, "tx_timestamp"); + xsk_ring_cons__release(&xsk->comp, 1); } } @@ -221,6 +241,7 @@ static int verify_xsk_metadata(struct xsk *xsk) const struct xdp_desc *rx_desc; struct pollfd fds = {}; struct xdp_meta *meta; + struct udphdr *udph; struct ethhdr *eth; struct iphdr *iph; __u64 comp_addr; @@ -257,6 +278,7 @@ static int verify_xsk_metadata(struct xsk *xsk) ASSERT_EQ(eth->h_proto, htons(ETH_P_IP), "eth->h_proto"); iph = (void *)(eth + 1); ASSERT_EQ((int)iph->version, 4, "iph->version"); + udph = (void *)(iph + 1); /* custom metadata */ @@ -270,6 +292,9 @@ static int verify_xsk_metadata(struct xsk *xsk) ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type"); + /* checksum offload */ + ASSERT_EQ(udph->check, htons(0x721c), "csum"); + xsk_ring_cons__release(&xsk->rx, 1); refill_rx(xsk, comp_addr); From patchwork Mon Nov 27 19:03:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470131 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="oEDmmPpJ" Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2EDED63 for ; Mon, 27 Nov 2023 11:03:43 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 41be03b00d2f7-5b7fb057153so5917926a12.1 for ; Mon, 27 Nov 2023 11:03:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111823; x=1701716623; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Mi8JIb1HDOpvCzNHvZfkM5tn49cG4glA3ceC2JikRio=; b=oEDmmPpJmbyhMR8AQJhlygNa+Lbsvmn6uX6PQxQcdI30N31yR9tp88oM8PdsSB/NcR jdhZcHCVTNn9S0ACPCL0avyLaF/nofXgA5YlZkdL/cHeieBImTQn5abBPt35QDtE85qy MzGvD75lZ72ypWZDqOZnNp4bLy3V4AUfmsu1oXghZDeUgObHKLj/jM+bagv6gCdusK1I Qi7HuUSPBnzFfeeA6Vs84ZUutW73Bq9ZkaJwgrIrqunGAvaIMJL56+ynUrKxTgU9sZus LtmTHfXcYYUzBglmj1U6xNulEM0YLxHwkvRFxn96TPs+TqcMEzMzPTeKfJJZ8VPLjyvN +KwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111823; x=1701716623; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Mi8JIb1HDOpvCzNHvZfkM5tn49cG4glA3ceC2JikRio=; b=Ent6fbqcg8zFqmjBcCdsmBv36GXDWxKiDWd+lEy8gaf2Wbycn5NY8JBDTfORi0e5/c D2nnTv0N0GiU6HEwX4m15ZG71/HhHD8xV7ohDdY2+Po38fOKhMwGlwwnigv5L3EFYzeu UT/UulffFuYjile1KQhTjgbO/px8s/QScNXhCgegWIbfMuYeRy+3TViDP6sBiRQF1VRF QSFgM+IZSwLH/BEZjDMUDGJTVWaSDYUIY6rTUq7arAUoYslTl4CHbSMm4E0qxBntfkk7 r8EXfZG2dR3aKQ7yYNASNDiYwGNlOrlhPXYqQISvzYrhHUTRma95y0UtdUOptMuO0QTF nKKA== X-Gm-Message-State: AOJu0Yyq4csvxtyPegQQwVyBaV+927muYxUFLHPHzLm+ZJIpob1YAAvq dvMpz3A6HUarQKQtR0W27i5qarWimdlHPc1A4lAvr5P+Ql/jN0CxX2wmeF8aHPeIupiR69SPJGd JAyW4c9v/xfrTAT1p9s1g2YKLKnO4eVv2gSGxokr/dmOvaxTVaQ== X-Google-Smtp-Source: AGHT+IH1KUk+5+dQZAufzInV+dXbKA0Ma+WWsCT+tk3mIzZPJf/ZmIWseSSOzCnMVCpv3ENQ4+iIF68= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a63:481a:0:b0:585:6402:41e4 with SMTP id v26-20020a63481a000000b00585640241e4mr2013275pga.7.1701111823236; Mon, 27 Nov 2023 11:03:43 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:18 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-13-sdf@google.com> Subject: [PATCH bpf-next v6 12/13] selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net This is the recommended way to run AF_XDP, so let's use it in the test. Also, some unrelated changes to now blow up the log too much: - change default mode to zerocopy and add -c to use copy mode - small fixes for the flags/sizes/prints - add print_tstamp_delta to print timestamp + reference Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 73 +++++++++++++------ 1 file changed, 49 insertions(+), 24 deletions(-) diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index c3ba40d0b9de..4484b17c7a56 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -32,7 +32,7 @@ #include "xdp_metadata.h" -#define UMEM_NUM 16 +#define UMEM_NUM 256 #define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE #define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) #define XDP_FLAGS (XDP_FLAGS_DRV_MODE | XDP_FLAGS_REPLACE) @@ -48,7 +48,7 @@ struct xsk { }; struct xdp_hw_metadata *bpf_obj; -__u16 bind_flags = XDP_COPY; +__u16 bind_flags = XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY; struct xsk *rx_xsk; const char *ifname; int ifindex; @@ -68,7 +68,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, - .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + .flags = XSK_UMEM__DEFAULT_FLAGS, }; __u32 idx; u64 addr; @@ -110,7 +110,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) for (i = 0; i < UMEM_NUM / 2; i++) { addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); - *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + *xsk_ring_prod__fill_addr(&xsk->fill, idx + i) = addr; } xsk_ring_prod__submit(&xsk->fill, ret); @@ -131,12 +131,22 @@ static void refill_rx(struct xsk *xsk, __u64 addr) __u32 idx; if (xsk_ring_prod__reserve(&xsk->fill, 1, &idx) == 1) { - printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + printf("%p: complete rx idx=%u addr=%llx\n", xsk, idx, addr); *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; xsk_ring_prod__submit(&xsk->fill, 1); } } +static int kick_tx(struct xsk *xsk) +{ + return sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); +} + +static int kick_rx(struct xsk *xsk) +{ + return recvfrom(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, NULL); +} + #define NANOSEC_PER_SEC 1000000000 /* 10^9 */ static __u64 gettime(clockid_t clock_id) { @@ -152,6 +162,17 @@ static __u64 gettime(clockid_t clock_id) return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec; } +static void print_tstamp_delta(const char *name, const char *refname, + __u64 tstamp, __u64 reference) +{ + __s64 delta = (__s64)reference - (__s64)tstamp; + + printf("%s: %llu (sec:%0.4f) delta to %s sec:%0.4f (%0.3f usec)\n", + name, tstamp, (double)tstamp / NANOSEC_PER_SEC, refname, + (double)delta / NANOSEC_PER_SEC, + (double)delta / 1000); +} + static void verify_xdp_metadata(void *data, clockid_t clock_id) { struct xdp_meta *meta; @@ -167,22 +188,13 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id) printf("rx_timestamp: %llu (sec:%0.4f)\n", meta->rx_timestamp, (double)meta->rx_timestamp / NANOSEC_PER_SEC); if (meta->rx_timestamp) { - __u64 usr_clock = gettime(clock_id); - __u64 xdp_clock = meta->xdp_timestamp; - __s64 delta_X = xdp_clock - meta->rx_timestamp; - __s64 delta_X2U = usr_clock - xdp_clock; - - printf("XDP RX-time: %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n", - xdp_clock, (double)xdp_clock / NANOSEC_PER_SEC, - (double)delta_X / NANOSEC_PER_SEC, - (double)delta_X / 1000); - - printf("AF_XDP time: %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n", - usr_clock, (double)usr_clock / NANOSEC_PER_SEC, - (double)delta_X2U / NANOSEC_PER_SEC, - (double)delta_X2U / 1000); - } + __u64 ref_tstamp = gettime(clock_id); + print_tstamp_delta("HW RX-time", "User RX-time", + meta->rx_timestamp, ref_tstamp); + print_tstamp_delta("XDP RX-time", "User RX-time", + meta->xdp_timestamp, ref_tstamp); + } } static void verify_skb_metadata(int fd) @@ -252,6 +264,13 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t while (true) { errno = 0; + + for (i = 0; i < rxq; i++) { + ret = kick_rx(&rx_xsk[i]); + if (ret) + printf("kick_rx ret=%d\n", ret); + } + ret = poll(fds, rxq + 1, 1000); printf("poll: %d (%d) skip=%llu fail=%llu redir=%llu\n", ret, errno, bpf_obj->bss->pkts_skip, @@ -420,8 +439,9 @@ static void print_usage(void) { const char *usage = "Usage: xdp_hw_metadata [OPTIONS] [IFNAME]\n" - " -m Enable multi-buffer XDP for larger MTU\n" + " -c Run in copy mode (zerocopy is default)\n" " -h Display this help and exit\n\n" + " -m Enable multi-buffer XDP for larger MTU\n" "Generate test packets on the other machine with:\n" " echo -n xdp | nc -u -q1 9091\n"; @@ -432,14 +452,19 @@ static void read_args(int argc, char *argv[]) { int opt; - while ((opt = getopt(argc, argv, "mh")) != -1) { + while ((opt = getopt(argc, argv, "chm")) != -1) { switch (opt) { - case 'm': - bind_flags |= XDP_USE_SG; + case 'c': + bind_flags &= ~XDP_USE_NEED_WAKEUP; + bind_flags &= ~XDP_ZEROCOPY; + bind_flags |= XDP_COPY; break; case 'h': print_usage(); exit(0); + case 'm': + bind_flags |= XDP_USE_SG; + break; case '?': if (isprint(optopt)) fprintf(stderr, "Unknown option: -%c\n", optopt); From patchwork Mon Nov 27 19:03:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13470132 X-Patchwork-Delegate: bpf@iogearbox.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="G9ALPJsg" Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2039BD64 for ; Mon, 27 Nov 2023 11:03:46 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-daee86e2d70so5564570276.0 for ; Mon, 27 Nov 2023 11:03:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701111825; x=1701716625; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=aC0MUVrEZYxjfAF4f/0rZDSKLUrYPaQCejwf2lWEL6M=; b=G9ALPJsg57c+HgathfbTBKDANAU0IhRIII7umWuVlM3YfTYmNSIG0npc76E0/yhIUe f8WteS1iMEibEWGTYEMsZOc6ucMgc+9KywDXh4/TdBbW0/rn+qz4FNsViW0rzqalhswa cvWM7TfwiuBmfulwtokREMRHNZT7YM9nmMx1gpFV4sIEcyDrVJw4NvppUDeFoqivsmg1 0u1PUgOkbCwMYBJyLRNYLXhf6CSnkGXGOkce/QWjvZL1DSq2iMUMG197PEDL7geZK6cB z+mGqSt5xxUUHytGsXscU2/hPtAXACxwAFO7Z0HCoKQQOUGJ5qgSgMMmtXMaFLnMaoHR GFlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701111825; x=1701716625; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=aC0MUVrEZYxjfAF4f/0rZDSKLUrYPaQCejwf2lWEL6M=; b=CDXyGYprFjGR9NNkQR1CINUQp8++Z//XI148WklFyCOo2T6DVNBw4fGNKY92xSSbsj InI4/+SADS4am3QwuDeMn3X4U8dWiP02vE5ZssFbkj08khG5TLqbZYOiaSCwzAIJBZDe Z4MbFZTL9kphm86LaeDn4OGJnOjmhwd1P9oFiY4ON0ynZj6rTaLb8MLja3dOt8tALHLM HiI8WjVjlqjKRhlfZ5Mzx73wt3adC4Ek6uYO8J0yGMaqSkpICscEcu5+UydLs2W8xx9D amoO3Cfbvs06E+AeqVsL2miXlqseNDkKfBFN1OSOI9eqt4YwXLme/LYhz4A8YuAGm8tu +1Ew== X-Gm-Message-State: AOJu0YzL5gR0Ia42tJ3RzKBs7dPXnkcevurMupcJSZaWlP+UkpGpc9Tz 7tVuKYjUMTTSpojwIAzDYYw3pGa/H4zs9cPyTKgiqNyaXn/qDbYVALrTJGCMY30PkgvTp6Zp/Bd u7Xd4ady4Z79Ewf9sTX33+osSGqn0sDfPan00gSvG5jQQl7AVRw== X-Google-Smtp-Source: AGHT+IEyWw1rFaEtva2abKBWLH/BWw/JiEVXaHeYe1SwZrUJDV9SaIrZHHbmcNLI6LhJzWJrHYqfjtE= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:6f09:0:b0:d9a:d233:b2a3 with SMTP id k9-20020a256f09000000b00d9ad233b2a3mr400180ybc.1.1701111825142; Mon, 27 Nov 2023 11:03:45 -0800 (PST) Date: Mon, 27 Nov 2023 11:03:19 -0800 In-Reply-To: <20231127190319.1190813-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231127190319.1190813-1-sdf@google.com> X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog Message-ID: <20231127190319.1190813-14-sdf@google.com> Subject: [PATCH bpf-next v6 13/13] selftests/bpf: Add TX side to xdp_hw_metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net When we get a packet on port 9091, we swap src/dst and send it out. At this point we also request the timestamp and checksum offloads. Checksum offload is verified by looking at the tcpdump on the other side. The tool prints pseudo-header csum and the final one it expects. The final checksum actually matches the incoming packets checksum because we only flip the src/dst and don't change the payload. Some other related changes: - switched to zerocopy mode by default; new flag can be used to force old behavior - request fixed tx_metadata_len headroom - some other small fixes (umem size, fill idx+i, etc) mvbz3:~# ./xdp_hw_metadata eth3 ... xsk_ring_cons__peek: 1 0x19546f8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 rx_hash: 0x80B7EA8B with RSS type:0x2A rx_timestamp: 1697580171852147395 (sec:1697580171.8521) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to User RX-time sec:0.2797 (279673.082 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User RX-time sec:0.0001 (121.430 usec) 0x19546f8: ping-pong with csum=3b8e (want d862) csum_start=54 csum_offset=6 0x19546f8: complete tx idx=0 addr=8 tx_timestamp: 1697580172056756493 (sec:1697580172.0568) HW TX-complete-time: 1697580172056756493 (sec:1697580172.0568), delta to User TX-complete-time sec:0.0852 (85175.537 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User TX-complete-time sec:0.0102 (10232.983 usec) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to HW TX-complete-time sec:0.2046 (204609.098 usec) 0x19546f8: complete rx idx=128 addr=80100 mvbz4:~# nc -Nu -q1 ${MVBZ3_LINK_LOCAL_IP}%eth3 9091 mvbz4:~# tcpdump -vvx -i eth3 udp tcpdump: listening on eth3, link-type EN10MB (Ethernet), snapshot length 262144 bytes 12:26:09.301074 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1087.55807 > fe80::1270:fdff:fe48:1077.9091: [bad udp cksum 0x3b8e -> 0xde7e!] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1087 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1077 d9ff 2383 000b 3b8e 0x0030: 7864 70 12:26:09.301976 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1077.9091 > fe80::1270:fdff:fe48:1087.55807: [udp sum ok] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1077 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1087 2383 d9ff 000b de7e 0x0030: 7864 70 Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 164 +++++++++++++++++- 1 file changed, 160 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 4484b17c7a56..3291625ba4fb 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -10,7 +10,9 @@ * - rx_hash * * TX: - * - TBD + * - UDP 9091 packets trigger TX reply + * - TX HW timestamp is requested and reported back upon completion + * - TX checksum is requested */ #include @@ -24,11 +26,14 @@ #include #include #include +#include #include #include #include #include #include +#include +#include #include "xdp_metadata.h" @@ -53,6 +58,9 @@ struct xsk *rx_xsk; const char *ifname; int ifindex; int rxq; +bool skip_tx; +__u64 last_hw_rx_timestamp; +__u64 last_xdp_rx_timestamp; void test__fail(void) { /* for network_helpers.c */ } @@ -69,6 +77,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, .flags = XSK_UMEM__DEFAULT_FLAGS, + .tx_metadata_len = sizeof(struct xsk_tx_metadata), }; __u32 idx; u64 addr; @@ -185,15 +194,19 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id) printf("rx_hash: 0x%X with RSS type:0x%X\n", meta->rx_hash, meta->rx_hash_type); - printf("rx_timestamp: %llu (sec:%0.4f)\n", meta->rx_timestamp, - (double)meta->rx_timestamp / NANOSEC_PER_SEC); if (meta->rx_timestamp) { __u64 ref_tstamp = gettime(clock_id); + /* store received timestamps to calculate a delta at tx */ + last_hw_rx_timestamp = meta->rx_timestamp; + last_xdp_rx_timestamp = meta->xdp_timestamp; + print_tstamp_delta("HW RX-time", "User RX-time", meta->rx_timestamp, ref_tstamp); print_tstamp_delta("XDP RX-time", "User RX-time", meta->xdp_timestamp, ref_tstamp); + } else { + printf("No rx_timestamp\n"); } } @@ -242,6 +255,129 @@ static void verify_skb_metadata(int fd) printf("skb hwtstamp is not found!\n"); } +static bool complete_tx(struct xsk *xsk, clockid_t clock_id) +{ + struct xsk_tx_metadata *meta; + __u64 addr; + void *data; + __u32 idx; + + if (!xsk_ring_cons__peek(&xsk->comp, 1, &idx)) + return false; + + addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); + data = xsk_umem__get_data(xsk->umem_area, addr); + meta = data - sizeof(struct xsk_tx_metadata); + + printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr); + + if (meta->completion.tx_timestamp) { + __u64 ref_tstamp = gettime(clock_id); + + print_tstamp_delta("HW TX-complete-time", "User TX-complete-time", + meta->completion.tx_timestamp, ref_tstamp); + print_tstamp_delta("XDP RX-time", "User TX-complete-time", + last_xdp_rx_timestamp, ref_tstamp); + print_tstamp_delta("HW RX-time", "HW TX-complete-time", + last_hw_rx_timestamp, meta->completion.tx_timestamp); + } else { + printf("No tx_timestamp\n"); + } + + xsk_ring_cons__release(&xsk->comp, 1); + + return true; +} + +#define swap(a, b, len) do { \ + for (int i = 0; i < len; i++) { \ + __u8 tmp = ((__u8 *)a)[i]; \ + ((__u8 *)a)[i] = ((__u8 *)b)[i]; \ + ((__u8 *)b)[i] = tmp; \ + } \ +} while (0) + +static void ping_pong(struct xsk *xsk, void *rx_packet, clockid_t clock_id) +{ + struct xsk_tx_metadata *meta; + struct ipv6hdr *ip6h = NULL; + struct iphdr *iph = NULL; + struct xdp_desc *tx_desc; + struct udphdr *udph; + struct ethhdr *eth; + __sum16 want_csum; + void *data; + __u32 idx; + int ret; + int len; + + ret = xsk_ring_prod__reserve(&xsk->tx, 1, &idx); + if (ret != 1) { + printf("%p: failed to reserve tx slot\n", xsk); + return; + } + + tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE + sizeof(struct xsk_tx_metadata); + data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + + meta = data - sizeof(struct xsk_tx_metadata); + memset(meta, 0, sizeof(*meta)); + meta->flags = XDP_TXMD_FLAGS_TIMESTAMP; + + eth = rx_packet; + + if (eth->h_proto == htons(ETH_P_IP)) { + iph = (void *)(eth + 1); + udph = (void *)(iph + 1); + } else if (eth->h_proto == htons(ETH_P_IPV6)) { + ip6h = (void *)(eth + 1); + udph = (void *)(ip6h + 1); + } else { + printf("%p: failed to detect IP version for ping pong %04x\n", xsk, eth->h_proto); + xsk_ring_prod__cancel(&xsk->tx, 1); + return; + } + + len = ETH_HLEN; + if (ip6h) + len += sizeof(*ip6h) + ntohs(ip6h->payload_len); + if (iph) + len += ntohs(iph->tot_len); + + swap(eth->h_dest, eth->h_source, ETH_ALEN); + if (iph) + swap(&iph->saddr, &iph->daddr, 4); + else + swap(&ip6h->saddr, &ip6h->daddr, 16); + swap(&udph->source, &udph->dest, 2); + + want_csum = udph->check; + if (ip6h) + udph->check = ~csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); + else + udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); + + meta->flags |= XDP_TXMD_FLAGS_CHECKSUM; + if (iph) + meta->request.csum_start = sizeof(*eth) + sizeof(*iph); + else + meta->request.csum_start = sizeof(*eth) + sizeof(*ip6h); + meta->request.csum_offset = offsetof(struct udphdr, check); + + printf("%p: ping-pong with csum=%04x (want %04x) csum_start=%d csum_offset=%d\n", + xsk, ntohs(udph->check), ntohs(want_csum), + meta->request.csum_start, meta->request.csum_offset); + + memcpy(data, rx_packet, len); /* don't share umem chunk for simplicity */ + tx_desc->options |= XDP_TX_METADATA; + tx_desc->len = len; + + xsk_ring_prod__submit(&xsk->tx, 1); +} + static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t clock_id) { const struct xdp_desc *rx_desc; @@ -307,6 +443,22 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr), clock_id); first_seg = false; + + if (!skip_tx) { + /* mirror first chunk back */ + ping_pong(xsk, xsk_umem__get_data(xsk->umem_area, addr), + clock_id); + + ret = kick_tx(xsk); + if (ret) + printf("kick_tx ret=%d\n", ret); + + for (int j = 0; j < 500; j++) { + if (complete_tx(xsk, clock_id)) + break; + usleep(10*1000); + } + } } xsk_ring_cons__release(&xsk->rx, 1); @@ -442,6 +594,7 @@ static void print_usage(void) " -c Run in copy mode (zerocopy is default)\n" " -h Display this help and exit\n\n" " -m Enable multi-buffer XDP for larger MTU\n" + " -r Don't generate AF_XDP reply (rx metadata only)\n" "Generate test packets on the other machine with:\n" " echo -n xdp | nc -u -q1 9091\n"; @@ -452,7 +605,7 @@ static void read_args(int argc, char *argv[]) { int opt; - while ((opt = getopt(argc, argv, "chm")) != -1) { + while ((opt = getopt(argc, argv, "chmr")) != -1) { switch (opt) { case 'c': bind_flags &= ~XDP_USE_NEED_WAKEUP; @@ -465,6 +618,9 @@ static void read_args(int argc, char *argv[]) case 'm': bind_flags |= XDP_USE_SG; break; + case 'r': + skip_tx = true; + break; case '?': if (isprint(optopt)) fprintf(stderr, "Unknown option: -%c\n", optopt);