From patchwork Thu Nov 2 22:58:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444252 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2EE3522338 for ; Thu, 2 Nov 2023 22:58:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="2TvXycth" Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65C3519F for ; Thu, 2 Nov 2023 15:58:42 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id 3f1490d57ef6-d9caf486775so1715790276.2 for ; Thu, 02 Nov 2023 15:58:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965921; x=1699570721; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9cdHM8U+7vYB8ZopY/OT9TsipodHioEVlGuj8FuKK4M=; b=2TvXycthMsGbWGPDGbD5s29oiG347T6XXxMmtPjNW8n5qXVrTybRvHNYPoP2dmjEXq Zl5ganieILUq/3Squ91WcKo1Z2Pw8H961APnx4A/qem80VOTpbrZeQf16bg/UiLykYoe auIqcQmX3HTRI0vzIlj33ajVI+haqgEmhmhoMLv3dFOw+6/T81ZFLw1ExsC3/3AnUoiG o67UNaSkeROAxVr4wqx/jp699IPFi7WkpyMBphHl3kS7T/j9lVyZ6eLUniXnM+GIiTz+ B6zL7TAG/F8k18sE8g6vIPmRbNmOkX63fFK1w95HIz54dKQuW9wnjjbJKXR6kFJnptnV n3wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965921; x=1699570721; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9cdHM8U+7vYB8ZopY/OT9TsipodHioEVlGuj8FuKK4M=; b=G6yGdqbcKYHIdH1whsUmQdNgh+rpTtXSdkLtqG2qc1G3TL8XjWYS0z1XXwCVCvR+16 K6C9TRYOY+0dvjQbt8d9ArlwCmHQJpgDkZzZPhQomMruJegNZ5wCKLvpo0EeOhua2Jv3 Uu1cVNrk7YiVaM2Q31pfPa5+DUDjwtOObJJDkWCVAqq37eDOcF8t2lJES3GIguT1PFbB dA+IPT/rRA7sYOA20lKurh81guMXdGslQwHt45sNynKRrTzuPNycGaLufhpP70jkB6EV Ycfa6Iqc4CJBHwhRpcaoeAyW7caFTHq+vlG9cfndEhnV6atfs5aI3YzMJGyS03fu33F9 c/kg== X-Gm-Message-State: AOJu0YyYUJ51w5wky91VerMb88XJzHSNbya+Vp37xZbvkwOO1BY6ECml Z+E0wuWqxVwOqdCFep0Mn84uTKc3FKU9MyI0rxhC59WDN9FZfYpnqrG5BoAP2rU739rKPgQ/mhd XK6W9B2W3X//I8pP/6Y3AP7F9sm89a1jH1pooaE9ftJfZurg42w== X-Google-Smtp-Source: AGHT+IE94t9ZLShH9Cp0c62mpXZv9teBKKznmrKHmThmMXLvUiMPWt/wb+cyW7UOj4t5DF+5Y4mXMP0= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:e706:0:b0:d9a:36cd:482e with SMTP id e6-20020a25e706000000b00d9a36cd482emr385028ybh.13.1698965921073; Thu, 02 Nov 2023 15:58:41 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:25 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-2-sdf@google.com> Subject: [PATCH bpf-next v5 01/13] xsk: Support tx_metadata_len From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net For zerocopy mode, tx_desc->addr can point to an arbitrary offset and carry some TX metadata in the headroom. For copy mode, there is no way currently to populate skb metadata. Introduce new tx_metadata_len umem config option that indicates how many bytes to treat as metadata. Metadata bytes come prior to tx_desc address (same as in RX case). The size of the metadata has mostly the same constraints as XDP: - less than 256 bytes - 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte timestamp in the completion) - non-zero This data is not interpreted in any way right now. Reviewed-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev --- include/net/xdp_sock.h | 1 + include/net/xsk_buff_pool.h | 1 + include/uapi/linux/if_xdp.h | 1 + net/xdp/xdp_umem.c | 4 ++++ net/xdp/xsk.c | 12 +++++++++++- net/xdp/xsk_buff_pool.c | 1 + net/xdp/xsk_queue.h | 17 ++++++++++------- tools/include/uapi/linux/if_xdp.h | 1 + 8 files changed, 30 insertions(+), 8 deletions(-) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index f83128007fb0..bcf765124f72 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -30,6 +30,7 @@ struct xdp_umem { struct user_struct *user; refcount_t users; u8 flags; + u8 tx_metadata_len; bool zc; struct page **pgs; int id; diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index b0bdff26fc88..1985ffaf9b0c 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -77,6 +77,7 @@ struct xsk_buff_pool { u32 chunk_size; u32 chunk_shift; u32 frame_len; + u8 tx_metadata_len; /* inherited from umem */ u8 cached_need_wakeup; bool uses_need_wakeup; bool dma_need_sync; diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 8d48863472b9..2ecf79282c26 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -76,6 +76,7 @@ struct xdp_umem_reg { __u32 chunk_size; __u32 headroom; __u32 flags; + __u32 tx_metadata_len; }; struct xdp_statistics { diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 06cead2b8e34..946a687fb8e8 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -199,6 +199,9 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) if (headroom >= chunk_size - XDP_PACKET_HEADROOM) return -EINVAL; + if (mr->tx_metadata_len >= 256 || mr->tx_metadata_len % 8) + return -EINVAL; + umem->size = size; umem->headroom = headroom; umem->chunk_size = chunk_size; @@ -207,6 +210,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) umem->pgs = NULL; umem->user = NULL; umem->flags = mr->flags; + umem->tx_metadata_len = mr->tx_metadata_len; INIT_LIST_HEAD(&umem->xsk_dma_list); refcount_set(&umem->users, 1); diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index ae9f8cb611f6..c904356e2800 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -1283,6 +1283,14 @@ struct xdp_umem_reg_v1 { __u32 headroom; }; +struct xdp_umem_reg_v2 { + __u64 addr; /* Start of packet data area */ + __u64 len; /* Length of packet data area */ + __u32 chunk_size; + __u32 headroom; + __u32 flags; +}; + static int xsk_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { @@ -1326,8 +1334,10 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname, if (optlen < sizeof(struct xdp_umem_reg_v1)) return -EINVAL; - else if (optlen < sizeof(mr)) + else if (optlen < sizeof(struct xdp_umem_reg_v2)) mr_size = sizeof(struct xdp_umem_reg_v1); + else if (optlen < sizeof(mr)) + mr_size = sizeof(struct xdp_umem_reg_v2); if (copy_from_sockptr(&mr, optval, mr_size)) return -EFAULT; diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 49cb9f9a09be..386eddcdf837 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -85,6 +85,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs, XDP_PACKET_HEADROOM; pool->umem = umem; pool->addrs = umem->addrs; + pool->tx_metadata_len = umem->tx_metadata_len; INIT_LIST_HEAD(&pool->free_list); INIT_LIST_HEAD(&pool->xskb_list); INIT_LIST_HEAD(&pool->xsk_tx_list); diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 13354a1e4280..c74a1372bcb9 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -143,15 +143,17 @@ static inline bool xp_unused_options_set(u32 options) static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 offset = desc->addr & (pool->chunk_size - 1); + u64 addr = desc->addr - pool->tx_metadata_len; + u64 len = desc->len + pool->tx_metadata_len; + u64 offset = addr & (pool->chunk_size - 1); if (!desc->len) return false; - if (offset + desc->len > pool->chunk_size) + if (offset + len > pool->chunk_size) return false; - if (desc->addr >= pool->addrs_cnt) + if (addr >= pool->addrs_cnt) return false; if (xp_unused_options_set(desc->options)) @@ -162,16 +164,17 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 addr = xp_unaligned_add_offset_to_addr(desc->addr); + u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len; + u64 len = desc->len + pool->tx_metadata_len; if (!desc->len) return false; - if (desc->len > pool->chunk_size) + if (len > pool->chunk_size) return false; - if (addr >= pool->addrs_cnt || addr + desc->len > pool->addrs_cnt || - xp_desc_crosses_non_contig_pg(pool, addr, desc->len)) + if (addr >= pool->addrs_cnt || addr + len > pool->addrs_cnt || + xp_desc_crosses_non_contig_pg(pool, addr, len)) return false; if (xp_unused_options_set(desc->options)) diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 73a47da885dc..34411a2e5b6c 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -76,6 +76,7 @@ struct xdp_umem_reg { __u32 chunk_size; __u32 headroom; __u32 flags; + __u32 tx_metadata_len; }; struct xdp_statistics { From patchwork Thu Nov 2 22:58:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444253 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0CED6224C3 for ; Thu, 2 Nov 2023 22:58:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="1C0lUolF" Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E0E918E for ; Thu, 2 Nov 2023 15:58:44 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-da13698a6d3so1853635276.0 for ; Thu, 02 Nov 2023 15:58:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965923; x=1699570723; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=5LplasE/jbebvMXOheQj5vE8wcvemUl8O7m6JW98rYs=; b=1C0lUolFNPj/IlcaV3n7tt9gHhK/hA0EqS/Yc2vs3wHo+MH/9/xjx30ls3eC4Vph01 yIB+0f4Rao6Qy7GAWqPEYzjZBmUPXABV0nJvORQu6qVsuaUG1eZcc8ogE9RxfvTagdRk hx5EDiTalsfuhl0w1RE9Dv4QUb2V5oAZm7NHCKjqcmWIuqSm2U3RqGuxaQynArAjUCmj oZ3uBRc76o6Ve64ye8UmQHd3egGYSSbHXyIhYDAv1Mc7+qpIRQi6vCmr7ogak0/ZOqrR XMwviLi4um7hgqlbPFgxL0ooKgcAK8fP/dZlPSEhcjk+ZdStZBDgKrcYg4PpjJrEAw6f RvQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965923; x=1699570723; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=5LplasE/jbebvMXOheQj5vE8wcvemUl8O7m6JW98rYs=; b=BiswqTgXJg2AdUNx7sLvg6//YWrXG2Adb0iDtZeBcFBcd1b6bO7aGLialPMnc7d8OD /x105xpW9o4E5ZySk1hSc2YVwRKN3Zwr1Bv396Eo9RNb4dCj9mBhFZAegL5UBD8+SeKX 00Az4lMGBEfXqrDqDC5FgTIQwSB71r/3gt45Uc2UtlQ0OMseH9MkyPBqPwLF96wFE44H 7E79IWmrzKe5vPt4FPkBNGV/ae3pBVGtmaY+0oEDCNWvE6Nsbnw8Uhc08xg9H/j//ArP NHTRd+Nz5IuIIN2v5xAD+sCh3WF2t9qrKdhOwqc5WODVftrm623Pd9805LDRZ5jgPMaa Sm9A== X-Gm-Message-State: AOJu0Yy5KTUNyXECxN8E7AeKq09VgEGlOJy3cnjsCc+HUq/TwoDYBgBo 9j9pSowYffkHVcFwD6y8hzvLOdHIwIaA4/9SjJTNAMdc7xIeuNtJRZsiXIopt94YhaV+xB8ijtB XFh5aPScolRiMXebwDoPFdPFLJRxngKiAjCcJI1f+OrHtOp7f4w== X-Google-Smtp-Source: AGHT+IGxLVqyhVsIpK4cYO9/bv+1GIyS3KEHgsbqVafCAai+AjwpIviNMG/hahZy6iN6gHM4v5rJTxM= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:84d1:0:b0:da0:cbe9:6bb5 with SMTP id x17-20020a2584d1000000b00da0cbe96bb5mr393270ybm.11.1698965922870; Thu, 02 Nov 2023 15:58:42 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:26 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-3-sdf@google.com> Subject: [PATCH bpf-next v5 02/13] xsk: Add TX timestamp and TX checksum offload support From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net This change actually defines the (initial) metadata layout that should be used by AF_XDP userspace (xsk_tx_metadata). The first field is flags which requests appropriate offloads, followed by the offload-specific fields. The supported per-device offloads are exported via netlink (new xsk-flags). The offloads themselves are still implemented in a bit of a framework-y fashion that's left from my initial kfunc attempt. I'm introducing new xsk_tx_metadata_ops which drivers are supposed to implement. The drivers are also supposed to call xsk_tx_metadata_request/xsk_tx_metadata_complete in the right places. Since xsk_tx_metadata_{request,_complete} are static inline, we don't incur any extra overhead doing indirect calls. The benefit of this scheme is as follows: - keeps all metadata layout parsing away from driver code - makes it easy to grep and see which drivers implement what - don't need any extra flags to maintain to keep track of what offloads are implemented; if the callback is implemented - the offload is supported (used by netlink reporting code) Two offloads are defined right now: 1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset 2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata area upon completion (tx_timestamp field) XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass metadata pointer). The struct is forward-compatible and can be extended in the future by appending more fields. Reviewed-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev --- Documentation/netlink/specs/netdev.yaml | 19 +++- include/linux/netdevice.h | 2 + include/linux/skbuff.h | 14 ++- include/net/xdp_sock.h | 110 ++++++++++++++++++++++++ include/net/xdp_sock_drv.h | 13 +++ include/net/xsk_buff_pool.h | 6 ++ include/uapi/linux/if_xdp.h | 38 ++++++++ include/uapi/linux/netdev.h | 16 ++++ net/core/netdev-genl.c | 13 ++- net/xdp/xsk.c | 34 ++++++++ net/xdp/xsk_queue.h | 2 +- tools/include/uapi/linux/if_xdp.h | 52 +++++++++-- tools/include/uapi/linux/netdev.h | 16 ++++ tools/net/ynl/generated/netdev-user.c | 19 ++++ tools/net/ynl/generated/netdev-user.h | 3 + 15 files changed, 348 insertions(+), 9 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 14511b13f305..00439bcbd2e3 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -45,7 +45,6 @@ name: netdev - type: flags name: xdp-rx-metadata - render-max: true entries: - name: timestamp @@ -55,6 +54,18 @@ name: netdev name: hash doc: Device is capable of exposing receive packet hash via bpf_xdp_metadata_rx_hash(). + - + type: flags + name: xsk-flags + entries: + - + name: tx-timestamp + doc: + HW timestamping egress packets is supported by the driver. + - + name: tx-checksum + doc: + L3 checksum HW offload is supported by the driver. attribute-sets: - @@ -86,6 +97,11 @@ name: netdev See Documentation/networking/xdp-rx-metadata.rst for more details. type: u64 enum: xdp-rx-metadata + - + name: xsk-features + doc: Bitmask of enabled AF_XDP features. + type: u64 + enum: xsk-flags operations: list: @@ -103,6 +119,7 @@ name: netdev - xdp-features - xdp-zc-max-segs - xdp-rx-metadata-features + - xsk-features dump: reply: *dev-all - diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index a16c9cc063fe..fe915fa334cc 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1858,6 +1858,7 @@ enum netdev_ml_priv_type { * @netdev_ops: Includes several pointers to callbacks, * if one wants to override the ndo_*() functions * @xdp_metadata_ops: Includes pointers to XDP metadata callbacks. + * @xsk_tx_metadata_ops: Includes pointers to AF_XDP TX metadata callbacks. * @ethtool_ops: Management operations * @l3mdev_ops: Layer 3 master device operations * @ndisc_ops: Includes callbacks for different IPv6 neighbour @@ -2117,6 +2118,7 @@ struct net_device { unsigned long long priv_flags; const struct net_device_ops *netdev_ops; const struct xdp_metadata_ops *xdp_metadata_ops; + const struct xsk_tx_metadata_ops *xsk_tx_metadata_ops; int ifindex; unsigned short gflags; unsigned short hard_header_len; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 27998f73183e..b370eb8d70f7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -566,6 +566,15 @@ struct ubuf_info_msgzc { int mm_account_pinned_pages(struct mmpin *mmp, size_t size); void mm_unaccount_pinned_pages(struct mmpin *mmp); +/* Preserve some data across TX submission and completion. + * + * Note, this state is stored in the driver. Extending the layout + * might need some special care. + */ +struct xsk_tx_metadata_compl { + __u64 *tx_timestamp; +}; + /* This data is invariant across clones and lives at * the end of the header data, ie. at skb->end. */ @@ -578,7 +587,10 @@ struct skb_shared_info { /* Warning: this field is not always filled in (UFO)! */ unsigned short gso_segs; struct sk_buff *frag_list; - struct skb_shared_hwtstamps hwtstamps; + union { + struct skb_shared_hwtstamps hwtstamps; + struct xsk_tx_metadata_compl xsk_meta; + }; unsigned int gso_type; u32 tskey; diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index bcf765124f72..52afde24596b 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -93,12 +93,105 @@ struct xdp_sock { struct xsk_queue *cq_tmp; /* Only as tmp storage before bind */ }; +/* + * AF_XDP TX metadata hooks for network devices. + * The following hooks can be defined; unless noted otherwise, they are + * optional and can be filled with a null pointer. + * + * void (*tmo_request_timestamp)(void *priv) + * Called when AF_XDP frame requested egress timestamp. + * + * u64 (*tmo_fill_timestamp)(void *priv) + * Called when AF_XDP frame, that had requested egress timestamp, + * received a completion. The hook needs to return the actual HW timestamp. + * + * void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv) + * Called when AF_XDP frame requested HW checksum offload. csum_start + * indicates position where checksumming should start. + * csum_offset indicates position where checksum should be stored. + * + */ +struct xsk_tx_metadata_ops { + void (*tmo_request_timestamp)(void *priv); + u64 (*tmo_fill_timestamp)(void *priv); + void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv); +}; + #ifdef CONFIG_XDP_SOCKETS int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); +/** + * xsk_tx_metadata_to_compl - Save enough relevant metadata information + * to perform tx completion in the future. + * @meta: pointer to AF_XDP metadata area + * @compl: pointer to output struct xsk_tx_metadata_to_compl + * + * This function should be called by the networking device when + * it prepares AF_XDP egress packet. The value of @compl should be stored + * and passed to xsk_tx_metadata_complete upon TX completion. + */ +static inline void xsk_tx_metadata_to_compl(struct xsk_tx_metadata *meta, + struct xsk_tx_metadata_compl *compl) +{ + if (!meta) + return; + + if (meta->request.flags & XDP_TXMD_FLAGS_TIMESTAMP) + compl->tx_timestamp = &meta->completion.tx_timestamp; + else + compl->tx_timestamp = NULL; +} + +/** + * xsk_tx_metadata_request - Evaluate AF_XDP TX metadata at submission + * and call appropriate xsk_tx_metadata_ops operation. + * @meta: pointer to AF_XDP metadata area + * @ops: pointer to struct xsk_tx_metadata_ops + * @priv: pointer to driver-private aread + * + * This function should be called by the networking device when + * it prepares AF_XDP egress packet. + */ +static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ + if (!meta) + return; + + if (ops->tmo_request_timestamp) + if (meta->request.flags & XDP_TXMD_FLAGS_TIMESTAMP) + ops->tmo_request_timestamp(priv); + + if (ops->tmo_request_checksum) + if (meta->request.flags & XDP_TXMD_FLAGS_CHECKSUM) + ops->tmo_request_checksum(meta->request.csum_start, + meta->request.csum_offset, priv); +} + +/** + * xsk_tx_metadata_complete - Evaluate AF_XDP TX metadata at completion + * and call appropriate xsk_tx_metadata_ops operation. + * @compl: pointer to completion metadata produced from xsk_tx_metadata_to_compl + * @ops: pointer to struct xsk_tx_metadata_ops + * @priv: pointer to driver-private aread + * + * This function should be called by the networking device upon + * AF_XDP egress completion. + */ +static inline void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ + if (!compl) + return; + + *compl->tx_timestamp = ops->tmo_fill_timestamp(priv); +} + #else static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) @@ -115,6 +208,23 @@ static inline void __xsk_map_flush(void) { } +static inline void xsk_tx_metadata_to_compl(struct xsk_tx_metadata *meta, + struct xsk_tx_metadata_compl *compl) +{ +} + +static inline void xsk_tx_metadata_request(struct xsk_tx_metadata *meta, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ +} + +static inline void xsk_tx_metadata_complete(struct xsk_tx_metadata_compl *compl, + const struct xsk_tx_metadata_ops *ops, + void *priv) +{ +} + #endif /* CONFIG_XDP_SOCKETS */ #if defined(CONFIG_XDP_SOCKETS) && defined(CONFIG_DEBUG_NET) diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 1f6fc8c7a84c..e2558ac3e195 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -165,6 +165,14 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return xp_raw_get_data(pool, addr); } +static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +{ + if (!pool->tx_metadata_len) + return NULL; + + return xp_raw_get_data(pool, addr) - pool->tx_metadata_len; +} + static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) { struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); @@ -324,6 +332,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return NULL; } +static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +{ + return NULL; +} + static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) { } diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 1985ffaf9b0c..97f5cc10d79e 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -33,6 +33,7 @@ struct xdp_buff_xsk { }; #define XSK_CHECK_PRIV_TYPE(t) BUILD_BUG_ON(sizeof(t) > offsetofend(struct xdp_buff_xsk, cb)) +#define XSK_TX_COMPL_FITS(t) BUILD_BUG_ON(sizeof(struct xsk_tx_metadata_compl) > sizeof(t)) struct xsk_dma_map { dma_addr_t *dma_pages; @@ -234,4 +235,9 @@ static inline u64 xp_get_handle(struct xdp_buff_xsk *xskb) return xskb->orig_addr + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); } +static inline bool xp_tx_metadata_enabled(const struct xsk_buff_pool *pool) +{ + return pool->tx_metadata_len > 0; +} + #endif /* XSK_BUFF_POOL_H_ */ diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 2ecf79282c26..b0ee7ad19b51 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -106,6 +106,41 @@ struct xdp_options { #define XSK_UNALIGNED_BUF_ADDR_MASK \ ((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1) +/* Request transmit timestamp. Upon completion, put it into tx_timestamp + * field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0) + +/* Request transmit checksum offload. Checksum start position and offset + * are communicated via csum_start and csum_offset fields of struct + * xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) + +/* AF_XDP offloads request. 'request' union member is consumed by the driver + * when the packet is being transmitted. 'completion' union member is + * filled by the driver when the transmit completion arrives. + */ +struct xsk_tx_metadata { + union { + struct { + __u32 flags; + + /* XDP_TXMD_FLAGS_CHECKSUM */ + + /* Offset from desc->addr where checksumming should start. */ + __u16 csum_start; + /* Offset from csum_start where checksum should be stored. */ + __u16 csum_offset; + } request; + + struct { + /* XDP_TXMD_FLAGS_TIMESTAMP */ + __u64 tx_timestamp; + } completion; + }; +}; + /* Rx/Tx descriptor */ struct xdp_desc { __u64 addr; @@ -122,4 +157,7 @@ struct xdp_desc { */ #define XDP_PKT_CONTD (1 << 0) +/* TX packet carries valid metadata. */ +#define XDP_TX_METADATA (1 << 1) + #endif /* _LINUX_IF_XDP_H */ diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 2943a151d4f1..48d5477a668c 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -53,12 +53,28 @@ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_MASK = 3, }; +/** + * enum netdev_xsk_flags + * @NETDEV_XSK_FLAGS_TX_TIMESTAMP: HW timestamping egress packets is supported + * by the driver. + * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the + * driver. + */ +enum netdev_xsk_flags { + NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, + NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + + /* private: */ + NETDEV_XSK_FLAGS_MASK = 3, +}; + enum { NETDEV_A_DEV_IFINDEX = 1, NETDEV_A_DEV_PAD, NETDEV_A_DEV_XDP_FEATURES, NETDEV_A_DEV_XDP_ZC_MAX_SEGS, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, + NETDEV_A_DEV_XSK_FEATURES, __NETDEV_A_DEV_MAX, NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index fe61f85bcf33..10f2124e9e23 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -6,6 +6,7 @@ #include #include #include +#include #include "netdev-genl-gen.h" @@ -13,6 +14,7 @@ static int netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp, const struct genl_info *info) { + u64 xsk_features = 0; u64 xdp_rx_meta = 0; void *hdr; @@ -26,11 +28,20 @@ netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp, XDP_METADATA_KFUNC_xxx #undef XDP_METADATA_KFUNC + if (netdev->xsk_tx_metadata_ops) { + if (netdev->xsk_tx_metadata_ops->tmo_fill_timestamp) + xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP; + if (netdev->xsk_tx_metadata_ops->tmo_request_checksum) + xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM; + } + if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) || nla_put_u64_64bit(rsp, NETDEV_A_DEV_XDP_FEATURES, netdev->xdp_features, NETDEV_A_DEV_PAD) || nla_put_u64_64bit(rsp, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, - xdp_rx_meta, NETDEV_A_DEV_PAD)) { + xdp_rx_meta, NETDEV_A_DEV_PAD) || + nla_put_u64_64bit(rsp, NETDEV_A_DEV_XSK_FEATURES, + xsk_features, NETDEV_A_DEV_PAD)) { genlmsg_cancel(rsp, hdr); return -EINVAL; } diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index c904356e2800..84fd10201f2a 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -571,6 +571,13 @@ static u32 xsk_get_num_desc(struct sk_buff *skb) static void xsk_destruct_skb(struct sk_buff *skb) { + struct xsk_tx_metadata_compl *compl = &skb_shinfo(skb)->xsk_meta; + + if (compl->tx_timestamp) { + /* sw completion timestamp, not a real one */ + *compl->tx_timestamp = ktime_get_tai_fast_ns(); + } + xsk_cq_submit_locked(xdp_sk(skb->sk), xsk_get_num_desc(skb)); sock_wfree(skb); } @@ -655,8 +662,10 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs, static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, struct xdp_desc *desc) { + struct xsk_tx_metadata *meta = NULL; struct net_device *dev = xs->dev; struct sk_buff *skb = xs->skb; + bool first_frag = false; int err; if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) { @@ -687,6 +696,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, kfree_skb(skb); goto free_err; } + + first_frag = true; } else { int nr_frags = skb_shinfo(skb)->nr_frags; struct page *page; @@ -709,12 +720,35 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, skb_add_rx_frag(skb, nr_frags, page, 0, len, 0); } + + if (first_frag && desc->options & XDP_TX_METADATA) { + if (unlikely(xs->pool->tx_metadata_len == 0)) { + err = -EINVAL; + goto free_err; + } + + meta = buffer - xs->pool->tx_metadata_len; + + if (meta->request.flags & XDP_TXMD_FLAGS_CHECKSUM) { + if (unlikely(meta->request.csum_start + + meta->request.csum_offset + + sizeof(__sum16) > len)) { + err = -EINVAL; + goto free_err; + } + + skb->csum_start = hr + meta->request.csum_start; + skb->csum_offset = meta->request.csum_offset; + skb->ip_summed = CHECKSUM_PARTIAL; + } + } } skb->dev = dev; skb->priority = READ_ONCE(xs->sk.sk_priority); skb->mark = READ_ONCE(xs->sk.sk_mark); skb->destructor = xsk_destruct_skb; + xsk_tx_metadata_to_compl(meta, &skb_shinfo(skb)->xsk_meta); xsk_set_destructor_arg(skb); return skb; diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index c74a1372bcb9..6f2d1621c992 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -137,7 +137,7 @@ static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr) static inline bool xp_unused_options_set(u32 options) { - return options & ~XDP_PKT_CONTD; + return options & ~(XDP_PKT_CONTD | XDP_TX_METADATA); } static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 34411a2e5b6c..29942c2c32dc 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -26,11 +26,11 @@ */ #define XDP_USE_NEED_WAKEUP (1 << 3) /* By setting this option, userspace application indicates that it can - * handle multiple descriptors per packet thus enabling xsk core to split + * handle multiple descriptors per packet thus enabling AF_XDP to split * multi-buffer XDP frames into multiple Rx descriptors. Without this set - * such frames will be dropped by xsk. + * such frames will be dropped. */ -#define XDP_USE_SG (1 << 4) +#define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ #define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) @@ -106,6 +106,41 @@ struct xdp_options { #define XSK_UNALIGNED_BUF_ADDR_MASK \ ((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1) +/* Request transmit timestamp. Upon completion, put it into tx_timestamp + * field of union xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0) + +/* Request transmit checksum offload. Checksum start position and offset + * are communicated via csum_start and csum_offset fields of union + * xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) + +/* AF_XDP offloads request. 'request' union member is consumed by the driver + * when the packet is being transmitted. 'completion' union member is + * filled by the driver when the transmit completion arrives. + */ +struct xsk_tx_metadata { + union { + struct { + __u32 flags; + + /* XDP_TXMD_FLAGS_CHECKSUM */ + + /* Offset from desc->addr where checksumming should start. */ + __u16 csum_start; + /* Offset from csum_start where checksum should be stored. */ + __u16 csum_offset; + } request; + + struct { + /* XDP_TXMD_FLAGS_TIMESTAMP */ + __u64 tx_timestamp; + } completion; + }; +}; + /* Rx/Tx descriptor */ struct xdp_desc { __u64 addr; @@ -113,9 +148,16 @@ struct xdp_desc { __u32 options; }; -/* Flag indicating packet constitutes of multiple buffers*/ +/* UMEM descriptor is __u64 */ + +/* Flag indicating that the packet continues with the buffer pointed out by the + * next frame in the ring. The end of the packet is signalled by setting this + * bit to zero. For single buffer packets, every descriptor has 'options' set + * to 0 and this maintains backward compatibility. + */ #define XDP_PKT_CONTD (1 << 0) -/* UMEM descriptor is __u64 */ +/* TX packet carries valid metadata. */ +#define XDP_TX_METADATA (1 << 1) #endif /* _LINUX_IF_XDP_H */ diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 2943a151d4f1..48d5477a668c 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -53,12 +53,28 @@ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_MASK = 3, }; +/** + * enum netdev_xsk_flags + * @NETDEV_XSK_FLAGS_TX_TIMESTAMP: HW timestamping egress packets is supported + * by the driver. + * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the + * driver. + */ +enum netdev_xsk_flags { + NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, + NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + + /* private: */ + NETDEV_XSK_FLAGS_MASK = 3, +}; + enum { NETDEV_A_DEV_IFINDEX = 1, NETDEV_A_DEV_PAD, NETDEV_A_DEV_XDP_FEATURES, NETDEV_A_DEV_XDP_ZC_MAX_SEGS, NETDEV_A_DEV_XDP_RX_METADATA_FEATURES, + NETDEV_A_DEV_XSK_FEATURES, __NETDEV_A_DEV_MAX, NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c index b5ffe8cd1144..6283d87dad37 100644 --- a/tools/net/ynl/generated/netdev-user.c +++ b/tools/net/ynl/generated/netdev-user.c @@ -58,6 +58,19 @@ const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value) return netdev_xdp_rx_metadata_strmap[value]; } +static const char * const netdev_xsk_flags_strmap[] = { + [0] = "tx-timestamp", + [1] = "tx-checksum", +}; + +const char *netdev_xsk_flags_str(enum netdev_xsk_flags value) +{ + value = ffs(value) - 1; + if (value < 0 || value >= (int)MNL_ARRAY_SIZE(netdev_xsk_flags_strmap)) + return NULL; + return netdev_xsk_flags_strmap[value]; +} + /* Policies */ struct ynl_policy_attr netdev_dev_policy[NETDEV_A_DEV_MAX + 1] = { [NETDEV_A_DEV_IFINDEX] = { .name = "ifindex", .type = YNL_PT_U32, }, @@ -65,6 +78,7 @@ struct ynl_policy_attr netdev_dev_policy[NETDEV_A_DEV_MAX + 1] = { [NETDEV_A_DEV_XDP_FEATURES] = { .name = "xdp-features", .type = YNL_PT_U64, }, [NETDEV_A_DEV_XDP_ZC_MAX_SEGS] = { .name = "xdp-zc-max-segs", .type = YNL_PT_U32, }, [NETDEV_A_DEV_XDP_RX_METADATA_FEATURES] = { .name = "xdp-rx-metadata-features", .type = YNL_PT_U64, }, + [NETDEV_A_DEV_XSK_FEATURES] = { .name = "xsk-features", .type = YNL_PT_U64, }, }; struct ynl_policy_nest netdev_dev_nest = { @@ -116,6 +130,11 @@ int netdev_dev_get_rsp_parse(const struct nlmsghdr *nlh, void *data) return MNL_CB_ERROR; dst->_present.xdp_rx_metadata_features = 1; dst->xdp_rx_metadata_features = mnl_attr_get_u64(attr); + } else if (type == NETDEV_A_DEV_XSK_FEATURES) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.xsk_features = 1; + dst->xsk_features = mnl_attr_get_u64(attr); } } diff --git a/tools/net/ynl/generated/netdev-user.h b/tools/net/ynl/generated/netdev-user.h index 4fafac879df3..39af1908444b 100644 --- a/tools/net/ynl/generated/netdev-user.h +++ b/tools/net/ynl/generated/netdev-user.h @@ -19,6 +19,7 @@ extern const struct ynl_family ynl_netdev_family; const char *netdev_op_str(int op); const char *netdev_xdp_act_str(enum netdev_xdp_act value); const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value); +const char *netdev_xsk_flags_str(enum netdev_xsk_flags value); /* Common nested types */ /* ============== NETDEV_CMD_DEV_GET ============== */ @@ -50,12 +51,14 @@ struct netdev_dev_get_rsp { __u32 xdp_features:1; __u32 xdp_zc_max_segs:1; __u32 xdp_rx_metadata_features:1; + __u32 xsk_features:1; } _present; __u32 ifindex; __u64 xdp_features; __u32 xdp_zc_max_segs; __u64 xdp_rx_metadata_features; + __u64 xsk_features; }; void netdev_dev_get_rsp_free(struct netdev_dev_get_rsp *rsp); From patchwork Thu Nov 2 22:58:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444251 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B56F82233C for ; Thu, 2 Nov 2023 22:58:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="sHgr8sjh" Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C345B1A6 for ; Thu, 2 Nov 2023 15:58:45 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5a7b9e83b70so12190477b3.0 for ; Thu, 02 Nov 2023 15:58:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965925; x=1699570725; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=yIQWlqlKx7HVoISOZnXwD//BGCQLkTId8oEkXKIs8r4=; b=sHgr8sjhD1xQJyhp5uq7ikcbzl/+zjcqkaIKHxi1RJjTMMPdUhR3HBIFD8hiTpx2lM 9/ZDQQeA4o8CT9YcCVV0GW3jR7/YPdEwKIArKtDNcqE1DoWp6fOKSelozAZMUWKa12kz 1C84Ir+tKSQQd0L+6pS3vlZfT7ny1WhCxJvjDWI4Xg35IJ5jbo+poCAVWnhDDH8fwmEF J/UPPmAt0JxnvNI9myYY/cvfJ3+tp40h/nAkJdESn3+Pp2RNlV1wVIyzN7vHDKYBOU/Q 88pf9GV+M92NDl8d8619NVaUWBCDd0PkGcc+jWO9yuhW49dXc3viYGWNerCW5WnXS7TR 14aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965925; x=1699570725; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yIQWlqlKx7HVoISOZnXwD//BGCQLkTId8oEkXKIs8r4=; b=TQZA/Gzto4wUWm3COeSzAXFoh1h/6ievIj9lCTYpVpDd+7licjAmSfcTGWhTYOcoq4 85vEbUPh0pjgppTrjx70btT95vCMdJCPNTUxdK/ML5wVPdlBePdX4DtOmcnJc98OYl2x XTyRYyVN2qFw8kfKsgtWxmAtc+zEPvRIogP8fYLi60pwfj/PQ7DaQJlxvcPsw5E3QnYU jAxsQEca75S3fM7ixGjgq8hOi2TZTXxKTPSfnrWmlaAI48h0MQnHb05Oy5HxF5bYRa9i xe5Cv1J1Y+xcE4SIiWzSj41Ohtdq4AHSwEijYgx+rbxV3HghCTD8w7OHAjKPzdVuY6xd /kLQ== X-Gm-Message-State: AOJu0YxTH2DD9f4HF99P+lhxQH4240n8usj8nzxFgECZ+szo0YFNFIxN uTeZCbp6JwrCCWUdAyaHLhUvklstXCjM6gHc7lcZgZm84qmZwGiiJpCNLlceDIbsf2TvTKytg8N PaoNMZ2/6OIXw6l4HhiKiwFxm2/dcupc2FMeoMOuSn7ORGeLnqQ== X-Google-Smtp-Source: AGHT+IGxEl/+xuhibdGZev48+wG7/mtLXtVNyeGjcD7sBqjMuhiBM+KGfxbvnH9d9S2CPir4NbnlgGQ= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:690c:88d:b0:5a8:207a:143a with SMTP id cd13-20020a05690c088d00b005a8207a143amr26239ywb.0.1698965924683; Thu, 02 Nov 2023 15:58:44 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:27 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-4-sdf@google.com> Subject: [PATCH bpf-next v5 03/13] tools: ynl: Print xsk-features from the sample From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net In a similar fashion we do for the other bit masks. Fix mask parsing (>= vs >) while we are it. Signed-off-by: Stanislav Fomichev --- tools/net/ynl/samples/netdev.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/net/ynl/samples/netdev.c b/tools/net/ynl/samples/netdev.c index b828225daad0..591b90e21890 100644 --- a/tools/net/ynl/samples/netdev.c +++ b/tools/net/ynl/samples/netdev.c @@ -33,17 +33,23 @@ static void netdev_print_device(struct netdev_dev_get_rsp *d, unsigned int op) return; printf("xdp-features (%llx):", d->xdp_features); - for (int i = 0; d->xdp_features > 1U << i; i++) { + for (int i = 0; d->xdp_features >= 1U << i; i++) { if (d->xdp_features & (1U << i)) printf(" %s", netdev_xdp_act_str(1 << i)); } printf(" xdp-rx-metadata-features (%llx):", d->xdp_rx_metadata_features); - for (int i = 0; d->xdp_rx_metadata_features > 1U << i; i++) { + for (int i = 0; d->xdp_rx_metadata_features >= 1U << i; i++) { if (d->xdp_rx_metadata_features & (1U << i)) printf(" %s", netdev_xdp_rx_metadata_str(1 << i)); } + printf(" xsk-features (%llx):", d->xsk_features); + for (int i = 0; d->xsk_features >= 1U << i; i++) { + if (d->xsk_features & (1U << i)) + printf(" %s", netdev_xsk_flags_str(1 << i)); + } + printf(" xdp-zc-max-segs=%u", d->xdp_zc_max_segs); name = netdev_op_str(op); From patchwork Thu Nov 2 22:58:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444261 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 244E3225AE for ; Thu, 2 Nov 2023 22:58:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="IiIX6yph" Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C14591B4 for ; Thu, 2 Nov 2023 15:58:47 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5b064442464so21281207b3.3 for ; Thu, 02 Nov 2023 15:58:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965927; x=1699570727; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=fJs4BRlxYO5R4S5llZy1lr4c+O0yiJGCyUqLmBsZHsg=; b=IiIX6yphJ2tY4nw5KCW/0zzyAefseY3efJBu8uJe2ajnWCUB39AnJS/M4GBLwyp+Es VNvHX9OC8pYBGfdfcTI26v3Tm7woS0Ma80CnqU8jCvwT5dqFnTkY9RCTrno/npr3+58X D4rg5+F649/mB2t61qQTYGrP2JgozTjNQ2lHIn954/iRKEI3aVH28mJo69RNRNEq2uNK baI9V21a3LFymgcREQpEmhvMGUvouGEn5M/L6BpoLlod5/SPQJaTkr1rSEFXgb+82rTG oqs8pU7gPFxOQgBFc8Vtqwt3oiBkDLPx43kVO0OFm+v3+ydna6TdIsDJBsFpqKG9sN43 nJTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965927; x=1699570727; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=fJs4BRlxYO5R4S5llZy1lr4c+O0yiJGCyUqLmBsZHsg=; b=gpyUWMEPLlD6+ePKqVYZEFqtPZrK9PcneA8Uq6LCebYhy2S/ClbivR8vETC5sfHn8C OXgrb9ySoIQsZ11U+TNoJg/L70wfpSmY2pkm9nnp7o6cYM5CMnv20BIzG4xBuY/3NpCA PMyH3ad5/A2Va1yPPhTlJn3+j/TSWBLiJEJ/eKp5SD4kl9ZmHF8RnxDB/92oVXxHaggJ 5Tfwote+N2qEmdVtMRWS58Km0rRfX5w0xpOoIjhhUbnuZtA6b9dhIH9haAsxCp3VcDNJ Tk7Xt5oIlH10DGCNlk1AWdhOo4Pm9aFDUK0e9gmgJXpWAqTWpOMvBUpKxWt2fTK2bcs3 50oA== X-Gm-Message-State: AOJu0YxArFSsx0I8n50Ng5I7YKC/4jPNUPqbuXgC6qTXmlJNQWnE0k3y /BiJ5RTQONYDofJ8n4TpVNdbrRorhHmivtk6e614PPKV63WuxlMhsoBBIIF8qBEW2wDg7qpPdcp raATXZadUEi6q46C1XlyJjFawa9Gy/QcHfSQ78wOEzqq6fcDMeg== X-Google-Smtp-Source: AGHT+IHVBCdzQX4Ee8IM1qlHC4A0gQtGqrPIixaTlLnDLMSJrE04fVXblfRKM3PfzgWFD5CU2n2RBTw= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a0d:d785:0:b0:59b:c811:a709 with SMTP id z127-20020a0dd785000000b0059bc811a709mr23870ywd.0.1698965926695; Thu, 02 Nov 2023 15:58:46 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:28 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-5-sdf@google.com> Subject: [PATCH bpf-next v5 04/13] net/mlx5e: Implement AF_XDP TX timestamp and checksum offload From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net, Saeed Mahameed X-Patchwork-Delegate: bpf@iogearbox.net TX timestamp: - requires passing clock, not sure I'm passing the correct one (from cq->mdev), but the timestamp value looks convincing TX checksum: - looks like device does packet parsing (and doesn't accept custom start/offset), so I'm ignoring user offsets Cc: Saeed Mahameed Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 4 +- .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 72 ++++++++++++++++--- .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 11 ++- .../ethernet/mellanox/mlx5/core/en/xsk/tx.c | 17 ++++- .../net/ethernet/mellanox/mlx5/core/en_main.c | 1 + 5 files changed, 89 insertions(+), 16 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index b2a5da9739d2..43f027bf2da3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -484,10 +484,12 @@ struct mlx5e_xdp_info_fifo { struct mlx5e_xdpsq; struct mlx5e_xmit_data; +struct xsk_tx_metadata; typedef int (*mlx5e_fp_xmit_xdp_frame_check)(struct mlx5e_xdpsq *); typedef bool (*mlx5e_fp_xmit_xdp_frame)(struct mlx5e_xdpsq *, struct mlx5e_xmit_data *, - int); + int, + struct xsk_tx_metadata *); struct mlx5e_xdpsq { /* data path */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 7decc81ed33a..e2e7d82cfca4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -103,7 +103,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL))) return false; /* xmit_mode == MLX5E_XDP_XMIT_MODE_FRAME */ @@ -145,7 +145,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL))) return false; /* xmit_mode == MLX5E_XDP_XMIT_MODE_PAGE */ @@ -261,6 +261,37 @@ const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = { .xmo_rx_hash = mlx5e_xdp_rx_hash, }; +struct mlx5e_xsk_tx_complete { + struct mlx5_cqe64 *cqe; + struct mlx5e_cq *cq; +}; + +static u64 mlx5e_xsk_fill_timestamp(void *_priv) +{ + struct mlx5e_xsk_tx_complete *priv = _priv; + u64 ts; + + ts = get_cqe_ts(priv->cqe); + + if (mlx5_is_real_time_rq(priv->cq->mdev) || mlx5_is_real_time_sq(priv->cq->mdev)) + return mlx5_real_time_cyc2time(&priv->cq->mdev->clock, ts); + + return mlx5_timecounter_cyc2time(&priv->cq->mdev->clock, ts); +} + +static void mlx5e_xsk_request_checksum(u16 csum_start, u16 csum_offset, void *priv) +{ + struct mlx5_wqe_eth_seg *eseg = priv; + + /* HW/FW is doing parsing, so offsets are largely ignored. */ + eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; +} + +const struct xsk_tx_metadata_ops mlx5e_xsk_tx_metadata_ops = { + .tmo_fill_timestamp = mlx5e_xsk_fill_timestamp, + .tmo_request_checksum = mlx5e_xsk_request_checksum, +}; + /* returns true if packet was consumed by xdp */ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf) @@ -398,11 +429,11 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result); + int check_result, struct xsk_tx_metadata *meta); INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result) + int check_result, struct xsk_tx_metadata *meta) { struct mlx5e_tx_mpwqe *session = &sq->mpwqe; struct mlx5e_xdpsq_stats *stats = sq->stats; @@ -420,7 +451,7 @@ mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptx */ if (unlikely(sq->mpwqe.wqe)) mlx5e_xdp_mpwqe_complete(sq); - return mlx5e_xmit_xdp_frame(sq, xdptxd, 0); + return mlx5e_xmit_xdp_frame(sq, xdptxd, 0, meta); } if (!xdptxd->len) { skb_frag_t *frag = &xdptxdf->sinfo->frags[0]; @@ -450,6 +481,7 @@ mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptx * and it's safe to complete it at any time. */ mlx5e_xdp_mpwqe_session_start(sq); + xsk_tx_metadata_request(meta, &mlx5e_xsk_tx_metadata_ops, &session->wqe->eth); } mlx5e_xdp_mpwqe_add_dseg(sq, p, stats); @@ -480,7 +512,7 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq) INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result) + int check_result, struct xsk_tx_metadata *meta) { struct mlx5e_xmit_data_frags *xdptxdf = container_of(xdptxd, struct mlx5e_xmit_data_frags, xd); @@ -599,6 +631,8 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, sq->pc++; } + xsk_tx_metadata_request(meta, &mlx5e_xsk_tx_metadata_ops, eseg); + sq->doorbell_cseg = cseg; stats->xmit++; @@ -608,7 +642,9 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_wqe_info *wi, u32 *xsk_frames, - struct xdp_frame_bulk *bq) + struct xdp_frame_bulk *bq, + struct mlx5e_cq *cq, + struct mlx5_cqe64 *cqe) { struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo; u16 i; @@ -668,10 +704,24 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, break; } - case MLX5E_XDP_XMIT_MODE_XSK: + case MLX5E_XDP_XMIT_MODE_XSK: { /* AF_XDP send */ + struct xsk_tx_metadata_compl *compl = NULL; + struct mlx5e_xsk_tx_complete priv = { + .cqe = cqe, + .cq = cq, + }; + + if (xp_tx_metadata_enabled(sq->xsk_pool)) { + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + compl = &xdpi.xsk_meta; + + xsk_tx_metadata_complete(compl, &mlx5e_xsk_tx_metadata_ops, &priv); + } + (*xsk_frames)++; break; + } default: WARN_ON_ONCE(true); } @@ -720,7 +770,7 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq) sqcc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq, cq, cqe); } while (!last_wqe); if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) { @@ -767,7 +817,7 @@ void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq) sq->cc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq, NULL, NULL); } xdp_flush_frame_bulk(&bq); @@ -840,7 +890,7 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, } ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, xdptxd, 0); + mlx5e_xmit_xdp_frame, sq, xdptxd, 0, NULL); if (unlikely(!ret)) { int j; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h index ecfe93a479da..e054db1e10f8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h @@ -33,6 +33,7 @@ #define __MLX5_EN_XDP_H__ #include +#include #include "en.h" #include "en/txrx.h" @@ -82,7 +83,7 @@ enum mlx5e_xdp_xmit_mode { * num, page_1, page_2, ... , page_num. * * MLX5E_XDP_XMIT_MODE_XSK: - * none. + * frame.xsk_meta. */ #define MLX5E_XDP_FIFO_ENTRIES2DS_MAX_RATIO 4 @@ -97,6 +98,7 @@ union mlx5e_xdp_info { u8 num; struct page *page; } page; + struct xsk_tx_metadata_compl xsk_meta; }; struct mlx5e_xsk_param; @@ -112,13 +114,16 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, u32 flags); extern const struct xdp_metadata_ops mlx5e_xdp_metadata_ops; +extern const struct xsk_tx_metadata_ops mlx5e_xsk_tx_metadata_ops; INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result)); + int check_result, + struct xsk_tx_metadata *meta)); INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - int check_result)); + int check_result, + struct xsk_tx_metadata *meta)); INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq *sq)); INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c index 597f319d4770..a59199ed590d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c @@ -55,12 +55,16 @@ static void mlx5e_xsk_tx_post_err(struct mlx5e_xdpsq *sq, nopwqe = mlx5e_post_nop(&sq->wq, sq->sqn, &sq->pc); mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, *xdpi); + if (xp_tx_metadata_enabled(sq->xsk_pool)) + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .xsk_meta = {} }); sq->doorbell_cseg = &nopwqe->ctrl; } bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) { struct xsk_buff_pool *pool = sq->xsk_pool; + struct xsk_tx_metadata *meta = NULL; union mlx5e_xdp_info xdpi; bool work_done = true; bool flush = false; @@ -93,12 +97,13 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) xdptxd.dma_addr = xsk_buff_raw_get_dma(pool, desc.addr); xdptxd.data = xsk_buff_raw_get_data(pool, desc.addr); xdptxd.len = desc.len; + meta = xsk_buff_get_metadata(pool, desc.addr); xsk_buff_raw_dma_sync_for_device(pool, xdptxd.dma_addr, xdptxd.len); ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, mlx5e_xmit_xdp_frame, sq, &xdptxd, - check_result); + check_result, meta); if (unlikely(!ret)) { if (sq->mpwqe.wqe) mlx5e_xdp_mpwqe_complete(sq); @@ -106,6 +111,16 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) mlx5e_xsk_tx_post_err(sq, &xdpi); } else { mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi); + if (xp_tx_metadata_enabled(sq->xsk_pool)) { + struct xsk_tx_metadata_compl compl; + + xsk_tx_metadata_to_compl(meta, &compl); + XSK_TX_COMPL_FITS(void *); + + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) + { .xsk_meta = compl }); + } } flush = true; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index ea58c6917433..9da668f2b10b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5164,6 +5164,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev) netdev->netdev_ops = &mlx5e_netdev_ops; netdev->xdp_metadata_ops = &mlx5e_xdp_metadata_ops; + netdev->xsk_tx_metadata_ops = &mlx5e_xsk_tx_metadata_ops; mlx5e_dcbnl_build_netdev(netdev); From patchwork Thu Nov 2 22:58:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444255 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 72E22224FE for ; Thu, 2 Nov 2023 22:58:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="EjitwUay" Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7AE6719E for ; Thu, 2 Nov 2023 15:58:49 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-d9b9aeb4962so1805960276.3 for ; Thu, 02 Nov 2023 15:58:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965928; x=1699570728; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=KtdaCxDB1Z7LnNP3aiIiuOYbZNPYQ7+xO02WZ42BWyA=; b=EjitwUayeUz1g5myyvuHO84yH09Aa1MgCDxFZgHh1Gep69sGDJzLppfgQdqFsNpTiC tsdFBWlVZrmhks03YulUC3XSl6eu/68vStJdSOXHyZko1T2O/mxyeLlalrcJstkDwj1t 4Rb0x4ZJ47oN08Z8AmpBi155wt/UTwx8+ovlxNIIA9bpdDfibaM/G2JTMKCQTVLsc3yj HnVAa4wSAHEmaKsvtw+kYqnmMEU8pqMZiBP8fJQRn5KKalT7SeF0CpzvP20ZF/4OhRu6 g5lpx5QAdCsEzV1dPfBuYPEOIrbSLp+jM8k9P1qMZn/xFyY7mbhtzql43ASLpvYl0YXx CwAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965928; x=1699570728; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=KtdaCxDB1Z7LnNP3aiIiuOYbZNPYQ7+xO02WZ42BWyA=; b=P9Yq/SFLRoXVIltSI7RhW1gq5LhoTX2mZn1TFPCgDpWbLvgAcBIzVejHBfHBdIjLWd YnMuJqa8Q9PFLeZ503UKig9qpOz9QlZPnSg15rrxxQztMTLkEvoyGKMcbVA1DKv6Vb33 nuOeBWmp3cgd7d7YQZHhXC4mv2IuLuOMOUSvz976gSi3mGvruGGthjtsVCvmn2y5+ydF uCLCCGjgRbApyZnq4/wPSlb3KvqNvhZ30ephPzn5+t0MibhaS4mmi9SDRR+S13Q/mCri CSHch3bN91ogJl5f8ZYaQwe166WTsM0XNY2nqQhb/UkoVR0umC8nTPGyhrVQTBPtCE6S GXnQ== X-Gm-Message-State: AOJu0YxF1z1f5MwVwB92Icba7M4nICO8Uy6KYHYamJW9Hy7x2hK7fYup k2m7wlj5RIYyZDQhmUJ65tooESdTCKfzebgr/4GomqhuxFeIp93L9b3mbiqgNklgTnHwl/3Z8sZ nA8W4EcKnBLPnd2iLZBo7oUjXHSF9QuuMhzVbbVhu24cp7nuy5g== X-Google-Smtp-Source: AGHT+IFNsVaNZr1WOL3uuMaQod85mask3VzmjAizM9z45cFZ80x7hgtpDWX1yMYqOnQ5i0D4JEIIY+U= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:9392:0:b0:da0:cbe9:6bac with SMTP id a18-20020a259392000000b00da0cbe96bacmr401466ybm.11.1698965928393; Thu, 02 Nov 2023 15:58:48 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:29 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-6-sdf@google.com> Subject: [PATCH bpf-next v5 05/13] net: stmmac: Add Tx HWTS support to XDP ZC From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net From: Song Yoong Siang This patch enables transmit hardware timestamp support to XDP zero copy via XDP Tx metadata framework. This patchset is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and results. Command on DUT: sudo ./xdp_hw_metadata sudo hwstamp_ctl -i -t 1 -r 1 Command on Link Partner: echo -n xdp | nc -u -q1 9091 Result: xsk_ring_cons__peek: 1 0x55bbbf08b6d0: rx_desc[2]->addr=8c100 addr=8c100 comp_addr=8c100 EoP No rx_hash err=-95 rx_timestamp: 1677762688429141540 (sec:1677762688.4291) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to User RX-time sec:0.0003 (250.665 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User RX-time sec:0.0000 (16.608 usec) 0x55bbbf08b6d0: ping-pong with csum=561c (want f488) csum_start=34 csum_offset=6 0x55bbbf08b6d0: complete tx idx=2 addr=2008 tx_timestamp: 1677762688431127273 (sec:1677762688.4311) HW TX-complete-time: 1677762688431127273 (sec:1677762688.4311) delta to User TX-complete-time sec:0.0083 (8331.655 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User TX-complete-time sec:0.0101 (10083.331 usec) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to HW TX-complete-time sec:0.0020 (1985.733 usec) 0x55bbbf08b6d0: complete rx idx=130 addr=8c100 Signed-off-by: Song Yoong Siang Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 12 ++++ .../net/ethernet/stmicro/stmmac/stmmac_main.c | 64 ++++++++++++++++++- 2 files changed, 75 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index cd7a9768de5f..686c94c2e8a7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -51,6 +51,7 @@ struct stmmac_tx_info { bool last_segment; bool is_jumbo; enum stmmac_txbuf_type buf_type; + struct xsk_tx_metadata_compl xsk_meta; }; #define STMMAC_TBS_AVAIL BIT(0) @@ -100,6 +101,17 @@ struct stmmac_xdp_buff { struct dma_desc *ndesc; }; +struct stmmac_metadata_request { + struct stmmac_priv *priv; + struct dma_desc *tx_desc; + bool *set_ic; +}; + +struct stmmac_xsk_tx_complete { + struct stmmac_priv *priv; + struct dma_desc *desc; +}; + struct stmmac_rx_queue { u32 rx_count_frames; u32 queue_index; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 3e50fd53a617..001a07a69539 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2430,6 +2430,46 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv) } } +static void stmmac_xsk_request_timestamp(void *_priv) +{ + struct stmmac_metadata_request *meta_req = _priv; + + stmmac_enable_tx_timestamp(meta_req->priv, meta_req->tx_desc); + *meta_req->set_ic = true; +} + +static u64 stmmac_xsk_fill_timestamp(void *_priv) +{ + struct stmmac_xsk_tx_complete *tx_compl = _priv; + struct stmmac_priv *priv = tx_compl->priv; + struct dma_desc *desc = tx_compl->desc; + bool found = false; + u64 ns = 0; + + if (!priv->hwts_tx_en) + return 0; + + /* check tx tstamp status */ + if (stmmac_get_tx_timestamp_status(priv, desc)) { + stmmac_get_timestamp(priv, desc, priv->adv_ts, &ns); + found = true; + } else if (!stmmac_get_mac_tx_timestamp(priv, priv->hw, &ns)) { + found = true; + } + + if (found) { + ns -= priv->plat->cdc_error_adj; + return ns_to_ktime(ns); + } + + return 0; +} + +static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { + .tmo_request_timestamp = stmmac_xsk_request_timestamp, + .tmo_fill_timestamp = stmmac_xsk_fill_timestamp, +}; + static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) { struct netdev_queue *nq = netdev_get_tx_queue(priv->dev, queue); @@ -2449,6 +2489,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) budget = min(budget, stmmac_tx_avail(priv, queue)); while (budget-- > 0) { + struct stmmac_metadata_request meta_req; + struct xsk_tx_metadata *meta = NULL; dma_addr_t dma_addr; bool set_ic; @@ -2472,6 +2514,7 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) tx_desc = tx_q->dma_tx + entry; dma_addr = xsk_buff_raw_get_dma(pool, xdp_desc.addr); + meta = xsk_buff_get_metadata(pool, xdp_desc.addr); xsk_buff_raw_dma_sync_for_device(pool, dma_addr, xdp_desc.len); tx_q->tx_skbuff_dma[entry].buf_type = STMMAC_TXBUF_T_XSK_TX; @@ -2499,6 +2542,11 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) else set_ic = false; + meta_req.priv = priv; + meta_req.tx_desc = tx_desc; + meta_req.set_ic = &set_ic; + xsk_tx_metadata_request(meta, &stmmac_xsk_tx_metadata_ops, + &meta_req); if (set_ic) { tx_q->tx_count_frames = 0; stmmac_set_tx_ic(priv, tx_desc); @@ -2511,6 +2559,9 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) stmmac_enable_dma_transmission(priv, priv->ioaddr); + xsk_tx_metadata_to_compl(meta, + &tx_q->tx_skbuff_dma[entry].xsk_meta); + tx_q->cur_tx = STMMAC_GET_ENTRY(tx_q->cur_tx, priv->dma_conf.dma_tx_size); entry = tx_q->cur_tx; } @@ -2620,8 +2671,18 @@ static int stmmac_tx_clean(struct stmmac_priv *priv, int budget, u32 queue, } else { tx_packets++; } - if (skb) + if (skb) { stmmac_get_tx_hwtstamp(priv, p, skb); + } else { + struct stmmac_xsk_tx_complete tx_compl = { + .priv = priv, + .desc = p, + }; + + xsk_tx_metadata_complete(&tx_q->tx_skbuff_dma[entry].xsk_meta, + &stmmac_xsk_tx_metadata_ops, + &tx_compl); + } } if (likely(tx_q->tx_skbuff_dma[entry].buf && @@ -7449,6 +7510,7 @@ int stmmac_dvr_probe(struct device *device, ndev->netdev_ops = &stmmac_netdev_ops; ndev->xdp_metadata_ops = &stmmac_xdp_metadata_ops; + ndev->xsk_tx_metadata_ops = &stmmac_xsk_tx_metadata_ops; ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM; From patchwork Thu Nov 2 22:58:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444256 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABC95225A9 for ; Thu, 2 Nov 2023 22:58:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="AGmGFi4d" Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EF7F192 for ; Thu, 2 Nov 2023 15:58:51 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id d9443c01a7336-1cc23f2226bso11828605ad.2 for ; Thu, 02 Nov 2023 15:58:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965930; x=1699570730; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=HxShEwzn2SwGeh/fxMP8q3kc87+t2JtKbOUp4FoNDAk=; b=AGmGFi4d/yKsspPXCF0T6zVS8eEc6pCR3ZgirkgHNiVzXHmVO/hRtdj0jgMGmXy87C QlojbcraG5/z6axNc+Zz4bxDHCT7U/NAU3+GS7iI4nszsWuAXRvHv1wxkL1gy5FDnHbZ Xhr1GqWwAMS/62qXHXEWsrAcaJlLaO4kECjQUIbP46jdGeucXWIUs3iUOB0/1029mKNr mZEIlNYOsz+1VI/8QuEh0P1tcM9YhetlzD9I4grNehWO0GqDmz7N7RuqlxBLq3uIbOFs 2l/+h8xO3q2aSCxYBPxsmQ74HwXUT2xEGInz2N0s5TODxq1Unz58N/VXY43ouNAOmp5/ bqYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965930; x=1699570730; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=HxShEwzn2SwGeh/fxMP8q3kc87+t2JtKbOUp4FoNDAk=; b=DNP+qIeuPWKzJOfLVFjQmf5KC1s93m0QeyGoWdsknTMgtQHX51SDqtiEZF29wxtcTU pMXk1IFZAEY02AJtmSSd/P2s642x1rE2tnc2FX7Icvv2XdzwRW9Rpzki6dtlGtP1euCG a/YecoS0iwgV+ZKK8wqz3N8pCcbwlvdFN2ttQHdmOs3Di+TTkI+TgQymZ2e9utuiLA4U 0ECxNpm5knTm6C6aZrdZtfxH1m3Xxo/57ceF5qMS1sGd0perm0R4EZJrVWzwPA8zjSe6 EtR/mgjUKQkWZh7gGbdYzdKVU+z9gF9ZG7B/Sn1ug3zEoImNx700dga97rCiFhxynEGJ t/Mg== X-Gm-Message-State: AOJu0Yx/iBY+wGGhK1oOv98kujvjNmEFSl1/oR/NT0hKqkHc4Pwawqtl Hpa4BFfb+8jTHh7NZI8ViPu4VU2JVZ4KqN1HIb2GarRHiWBAzzFqxmKFMtbhTWYXQjZElllRamZ 0mGRCtn2H+Jh8BATjfSAry2IDK0dz6H4KGz7yVuUFMfIobv7fGw== X-Google-Smtp-Source: AGHT+IFHksWobVq1DgklW4mDeMreDlgWsTTNqA/kCtQ42AoUqzv4G7g6ZHTYaGfKWIPjKCcaSl6uF+I= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:d48c:b0:1cc:2f2a:7d33 with SMTP id c12-20020a170902d48c00b001cc2f2a7d33mr319925plg.2.1698965929763; Thu, 02 Nov 2023 15:58:49 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:30 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-7-sdf@google.com> Subject: [PATCH bpf-next v5 06/13] xsk: Document tx_metadata_len layout From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net - how to use - how to query features - pointers to the examples Signed-off-by: Stanislav Fomichev --- Documentation/networking/index.rst | 1 + Documentation/networking/xsk-tx-metadata.rst | 70 ++++++++++++++++++++ 2 files changed, 71 insertions(+) create mode 100644 Documentation/networking/xsk-tx-metadata.rst diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 683eb42309cc..a297a894b366 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -123,6 +123,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics. xfrm_sync xfrm_sysctl xdp-rx-metadata + xsk-tx-metadata .. only:: subproject and html diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst new file mode 100644 index 000000000000..4f376560b23f --- /dev/null +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -0,0 +1,70 @@ +================== +AF_XDP TX Metadata +================== + +This document describes how to enable offloads when transmitting packets +via :doc:`af_xdp`. Refer to :doc:`xdp-rx-metadata` on how to access similar +metadata on the receive side. + +General Design +============== + +The headroom for the metadata is reserved via ``tx_metadata_len`` in +``struct xdp_umem_reg``. The metadata length is therefore the same for +every socket that shares the same umem. The metadata layout is a fixed UAPI, +refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``. +Thus, generally, the ``tx_metadata_len`` field above should contain +``sizeof(union xsk_tx_metadata)``. + +The headroom and the metadata itself should be located right before +``xdp_desc->addr`` in the umem frame. Within a frame, the metadata +layout is as follows:: + + tx_metadata_len + / \ + +-----------------+---------+----------------------------+ + | xsk_tx_metadata | padding | payload | + +-----------------+---------+----------------------------+ + ^ + | + xdp_desc->addr + +An AF_XDP application can request headrooms larger than ``sizeof(struct +xsk_tx_metadata)``. The kernel will ignore the padding (and will still +use ``xdp_desc->addr - tx_metadata_len`` to locate +the ``xsk_tx_metadata``). For the frames that shouldn't carry +any metadata (i.e., the ones that don't have ``XDP_TX_METADATA`` option), +the metadata area is ignored by the kernel as well. + +The flags field enables the particular offload: + +- ``XDP_TXMD_FLAGS_TIMESTAMP``: requests the device to put transmission + timestamp into ``tx_timestamp`` field of ``union xsk_tx_metadata``. +- ``XDP_TXMD_FLAGS_CHECKSUM``: requests the device to calculate L4 + checksum. ``csum_start`` specifies byte offset of where the checksumming + should start and ``csum_offset`` specifies byte offset where the + device should store the computed checksum. + +Besides the flags above, in order to trigger the offloads, the first +packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` +bit in the ``options`` field. Also note that in a multi-buffer packet +only the first chunk should carry the metadata. + +Querying Device Capabilities +============================ + +Every devices exports its offloads capabilities via netlink netdev family. +Refer to ``xsk-flags`` features bitmask in +``Documentation/netlink/specs/netdev.yaml``. + +- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` +- ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` + +See ``tools/net/ynl/samples/netdev.c`` on how to query this information. + +Example +======= + +See ``tools/testing/selftests/bpf/xdp_hw_metadata.c`` for an example +program that handles TX metadata. Also see https://github.com/fomichev/xskgen +for a more bare-bones example. From patchwork Thu Nov 2 22:58:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444257 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD25B225CC for ; Thu, 2 Nov 2023 22:58:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="22l0g9aA" Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9330018E for ; Thu, 2 Nov 2023 15:58:52 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id 3f1490d57ef6-d86dac81f8fso1820691276.1 for ; Thu, 02 Nov 2023 15:58:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965932; x=1699570732; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=cchCkg2hcFOX1ZtG27UCGvSuVnk6LdagSTMBDQs0CwQ=; b=22l0g9aAN7x9h7sy9Fd1R+xCasJ7Q7ooziP0A4fuG0zYmMyh/lOnHI/i7SpYLfdYkV aZUAxnunix6Tjy6jSf1zwMSvG9XP+SjNzTtcK3fyeTZFpAW7VFEJ3eKxNVj7LB+y3uGH gnptSoN3ETRq8jNFWnBVjUKtD4e6IgkWMV2Au8e3dtGcxaww+Qk9bXbrzXw3Rv1mbeLF opn+w0TcnTb1bBUQAIJhhqryFmM5iVEGaY+/vfgc78z7kn+9Xk3msWr7ScnkxXetfg6K sJhDaAgJbuvjeZGAUTBOswmVoHEcc7NJydFm6zuu0lD3oaSjN7p/hoasOoTDwbK3ao5O +Uyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965932; x=1699570732; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=cchCkg2hcFOX1ZtG27UCGvSuVnk6LdagSTMBDQs0CwQ=; b=tGzgZpgDoDuY1D3a5sWcWRSEYwFgvkkYHo/yaxaIG7pYovA460SVaxXg3zcxrKrJz9 UBJ0k4+ApoO8Sx+g6Px3SSZCiDuV8mSPj7cxUkBk8vLU5i7vMojL9hPSvXrZ1OsRGrrW hob7HOdzt8BhcTCb+MNVhyFulTvhJFyq133LcVMVsmU1h/ZmyJCqiQSMx8itTmYomLXF JBinzUHRFOdVK9kcY+jdC9HIBuGhY0AF5u0zKmp74o09fMgsEpKi3H5kV9Yyun5G1RYY Bc0GiHPvalKfvWRGBYxOZTbnu/3bnGiCzk5UdWeBWdkD3um+xlkIlDJeyyKrtnTFIWX6 GTEQ== X-Gm-Message-State: AOJu0YztsFtv0t7szHTsXHm5F5II0y89BhPH232OHwgfyzpMYideEzlj pu0c8cYIYVklI27GJFD+ELEIFwuyd4xXmkFuJ9/1qt537udNXcvSGO7Kd65rs0EIWzPeIbXtiF4 i3F85BM/75nDG9lhISkgAPO7TzKr3eDUkyY+Hvg3FX0Fc7vW2Lg== X-Google-Smtp-Source: AGHT+IHj4JA6kBeIWI5vfIKduxpsOKCkJQxZJFhMF509ST6fAhXbSHC2dh+QTfMr/1MT9xvQhw/RoiI= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:9392:0:b0:da0:cbe9:6bac with SMTP id a18-20020a259392000000b00da0cbe96bacmr401471ybm.11.1698965931578; Thu, 02 Nov 2023 15:58:51 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:31 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-8-sdf@google.com> Subject: [PATCH bpf-next v5 07/13] xsk: Validate xsk_tx_metadata flags From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Accept only the flags that the kernel knows about to make sure we can extend this field in the future. Note that only in XDP_COPY mode we propagate the error signal back to the user (via sendmsg). For zerocopy mode we silently skip the metadata for the descriptors that have wrong flags (since we process the descriptors deep in the driver). Signed-off-by: Stanislav Fomichev --- include/net/xdp_sock_drv.h | 23 ++++++++++++++++++++++- net/xdp/xsk.c | 4 ++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index e2558ac3e195..5885176ea01e 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -165,12 +165,28 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return xp_raw_get_data(pool, addr); } +#define XDP_TXMD_FLAGS_VALID ( \ + XDP_TXMD_FLAGS_TIMESTAMP | \ + XDP_TXMD_FLAGS_CHECKSUM | \ + 0) + +static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) +{ + return !(meta->request.flags & ~XDP_TXMD_FLAGS_VALID); +} + static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) { + struct xsk_tx_metadata *meta; + if (!pool->tx_metadata_len) return NULL; - return xp_raw_get_data(pool, addr) - pool->tx_metadata_len; + meta = xp_raw_get_data(pool, addr) - pool->tx_metadata_len; + if (unlikely(!xsk_buff_valid_tx_metadata(meta))) + return NULL; /* no way to signal the error to the user */ + + return meta; } static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool) @@ -332,6 +348,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return NULL; } +static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) +{ + return false; +} + static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) { return NULL; diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 84fd10201f2a..0e81ae6bfff4 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -728,6 +728,10 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, } meta = buffer - xs->pool->tx_metadata_len; + if (unlikely(!xsk_buff_valid_tx_metadata(meta))) { + err = -EINVAL; + goto free_err; + } if (meta->request.flags & XDP_TXMD_FLAGS_CHECKSUM) { if (unlikely(meta->request.csum_start + From patchwork Thu Nov 2 22:58:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444258 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 069FB224CE for ; Thu, 2 Nov 2023 22:58:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="LZZ2lYtL" Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D904196 for ; Thu, 2 Nov 2023 15:58:55 -0700 (PDT) Received: by mail-pg1-x54a.google.com with SMTP id 41be03b00d2f7-5b97a03aa53so1077445a12.1 for ; Thu, 02 Nov 2023 15:58:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965935; x=1699570735; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=P/kzkpf7mJPDxgqCP2JD59SSWcrFcpewjk7q9SbAWjE=; b=LZZ2lYtLNxxtIQTFMVwcWmMynegzwoI14IwMaJgopgB9QATvN2cFbeNc8QejUlMzfm jV4X+f1pKgoclQmAb4H7aocGOu9qWP7BxKoxiwoSVUkEzuZSk9LexCMVvEiy+ZILKMSI 7Yw8ydUP2VsnNEKdDXaQnOymB64CGXpk2wKoqISCk9NsZ/oppcR79HOeayF0LMN8AB4y eOWD0dMS3/OAMUHe6y6QKiJ2b/BdvagcXL7t1AjaDiiu/Smm6MGgusdWX16NE4uxnZrQ Upu/WgNZwaxY1bQnQ1c5vY2tFkgsWavj0mc8OMG9mkCyr45Zpg1qknog8+ld1llczhSQ qMEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965935; x=1699570735; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=P/kzkpf7mJPDxgqCP2JD59SSWcrFcpewjk7q9SbAWjE=; b=vvObHQiNtwGXtnHVFgJpOajsisewdRbf+5FZHhBTgVzHX4lhzsqCkSaMUDOm/BXibR iKp1VRleCk9wUu2mgQI9BpmVCxAiTj9KsnUycURR12N293ILs/tQl32xX9fVgWMISVGH CiVxtfxZm7HBjuzelXfv67myzvxskj6tt3hPVqvyDblU3MUSf8oPuGZECGDROZClGdLC Mw+Ep27GmJTfmmA7h3ttehsrWT1B6BjjmvKT0p0m8GKfGdZLStBF46HDOdBhD8tgOsq6 utzEG9h/Iu1ZDPwYAZF+xQMi9rpwj0aTaedMnV5Bluz9PdTnSBDnzV3w/1IJAON801B/ Cb+g== X-Gm-Message-State: AOJu0Yyd93zju3EfDDAHtjr0j+tbjIDXS7S2Jq+6PbpuJle0oA6N5f2l r9GTldL/mCfNxEUrnIpQHJulE/8ThHrX2oBpce/81yBe/yxMQFaTZLOfJlEHAvTMhgQR/4nkGdo a/pRoZi79WqSKjQQJ/PkDjAIUSM3E8vUvRJeTRjG0FMMQoc5yMA== X-Google-Smtp-Source: AGHT+IGeHGTku3StH3yaDLbw7vu7lMh6tOBynVJJF9KK/TP8fFYcPyHqNZCmU0a6fMxM0JDPRMXFKL8= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:903:444:b0:1cc:29fb:f39c with SMTP id iw4-20020a170903044400b001cc29fbf39cmr334431plb.1.1698965933297; Thu, 02 Nov 2023 15:58:53 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:32 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-9-sdf@google.com> Subject: [PATCH bpf-next v5 08/13] xsk: Add option to calculate TX checksum in SW From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net For XDP_COPY mode, add a UMEM option XDP_UMEM_TX_SW_CSUM to call skb_checksum_help in transmit path. Might be useful to debugging issues with real hardware. I also use this mode in the selftests. Signed-off-by: Stanislav Fomichev --- Documentation/networking/xsk-tx-metadata.rst | 9 +++++++++ include/net/xsk_buff_pool.h | 1 + include/uapi/linux/if_xdp.h | 8 +++++++- net/xdp/xdp_umem.c | 7 ++++++- net/xdp/xsk.c | 6 ++++++ net/xdp/xsk_buff_pool.c | 1 + tools/include/uapi/linux/if_xdp.h | 8 +++++++- 7 files changed, 37 insertions(+), 3 deletions(-) diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index 4f376560b23f..97ecfa480d00 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -50,6 +50,15 @@ packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` bit in the ``options`` field. Also note that in a multi-buffer packet only the first chunk should carry the metadata. +Software TX Checksum +==================== + +For development and testing purposes its possible to pass +``XDP_UMEM_TX_SW_CSUM`` flag to ``XDP_UMEM_REG`` UMEM registration call. +In this case, when running in ``XDK_COPY`` mode, the TX checksum +is calculated on the CPU. Do not enable this option in production because +it will negatively affect performance. + Querying Device Capabilities ============================ diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 97f5cc10d79e..8d48d37ab7c0 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -83,6 +83,7 @@ struct xsk_buff_pool { bool uses_need_wakeup; bool dma_need_sync; bool unaligned; + bool tx_sw_csum; void *addrs; /* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect: * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index b0ee7ad19b51..bed6adc3e9b4 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -33,7 +33,13 @@ #define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ -#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) + +/* Force checksum calculation in software. Can be used for testing or + * working around potential HW issues. This option causes performance + * degradation and only works in XDP_COPY mode. + */ +#define XDP_UMEM_TX_SW_CSUM (1 << 1) struct sockaddr_xdp { __u16 sxdp_family; diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 946a687fb8e8..caa340134b0e 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -148,6 +148,11 @@ static int xdp_umem_account_pages(struct xdp_umem *umem) return 0; } +#define XDP_UMEM_FLAGS_VALID ( \ + XDP_UMEM_UNALIGNED_CHUNK_FLAG | \ + XDP_UMEM_TX_SW_CSUM | \ + 0) + static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) { bool unaligned_chunks = mr->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG; @@ -167,7 +172,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) return -EINVAL; } - if (mr->flags & ~XDP_UMEM_UNALIGNED_CHUNK_FLAG) + if (mr->flags & ~XDP_UMEM_FLAGS_VALID) return -EINVAL; if (!unaligned_chunks && !is_power_of_2(chunk_size)) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 0e81ae6bfff4..e109b2aaeb2a 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -744,6 +744,12 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, skb->csum_start = hr + meta->request.csum_start; skb->csum_offset = meta->request.csum_offset; skb->ip_summed = CHECKSUM_PARTIAL; + + if (unlikely(xs->pool->tx_sw_csum)) { + err = skb_checksum_help(skb); + if (err) + goto free_err; + } } } } diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 386eddcdf837..4f6f538a5462 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -86,6 +86,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs, pool->umem = umem; pool->addrs = umem->addrs; pool->tx_metadata_len = umem->tx_metadata_len; + pool->tx_sw_csum = umem->flags & XDP_UMEM_TX_SW_CSUM; INIT_LIST_HEAD(&pool->free_list); INIT_LIST_HEAD(&pool->xskb_list); INIT_LIST_HEAD(&pool->xsk_tx_list); diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 29942c2c32dc..8ed2ed9d1b17 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -33,7 +33,13 @@ #define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ -#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) +#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) + +/* Force checksum calculation in software. Can be used for testing or + * working around potential HW issues. This option causes performance + * degradation and only works in XDP_COPY mode. + */ +#define XDP_UMEM_TX_SW_CSUM (1 << 1) struct sockaddr_xdp { __u16 sxdp_family; From patchwork Thu Nov 2 22:58:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444259 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9000B225A9 for ; Thu, 2 Nov 2023 22:58:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nS3bbo+s" Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4796C198 for ; Thu, 2 Nov 2023 15:58:56 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id d9443c01a7336-1cc3130ba31so12515225ad.0 for ; Thu, 02 Nov 2023 15:58:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965936; x=1699570736; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=VeQrdnKYuBooTnpvZQFnSOSMYJ4ZuSmlztUMmTptENE=; b=nS3bbo+si+mq/K6/ZJAAI5oUNgET9d8/ohhoNCD51k0ZNbZ7BcrNjzQl5nr1XorE4Q hJWtZgU0WvvBNnw8a/gi3YQuhGyI6imRkjpjbf588sdAh6iSrxs6pxLdQQbrx1Wvem9N EQ7Hlld2mMNN95U5u52qBhDP3btYI2n/XvVsVt7lPcWvFIOtUQZl7I3WgdfStWWmoeda tB7JSDfzsOkuajrk3HJNeu0XfuxM2jW9xKNxXphYqV57hjfnRBrhtlCLt6s+bkfp+T0T vQSXLfkeHktr7fWiDugeVxfVuccfMNtnDjaHLWA9Rth78LuQ4LMJubLgy/yGXclNyALk DFqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965936; x=1699570736; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VeQrdnKYuBooTnpvZQFnSOSMYJ4ZuSmlztUMmTptENE=; b=gXGx9/iCTJamK4iIvG4CO5KPXjMm8tgONlvdw2Lsdn7D/zTcSd4QwkjBGGohNj48Na HEjsfqjQz0u1NwY3F5EpEADi6Oax74nMU+aaBJZOz2ef0h7J//5xiTOVBiMuB8R8zQZn zFwbekPZm2ucGW8G3HaC5CDndbwk4GUKUnQcqOvePBZ1oV+bCrlwvgTd6b+62X4juOmp Q6VEBytOhJq6PnQEsEgSVfN0VEuhRnk0P4fPIirn/fenvfjNEQMtzwg4BGcVs9vyYT7f JCIJWTKMfO9i7CpRwxnnqtLbcmbjUJaXCvgLtH7W7ITI7xwy7K7Dx+3Ab98FEo5lGM7K GHTw== X-Gm-Message-State: AOJu0YwZN66N3OAS+Sxry0EtmSb7Iy085aXyOvxq3OEa4DCJdy0Fr5+9 5drC50+CPDtWYMhZWC6wGtPP5h7i+9dHhlMejqTktu1g+cEdShM18xIFPhTA3qOARqhke+8/Kem yGLO+TjS4jlfD2pgCZq6CzMr6U3R5DYVALetg6lM4s+rMwvZmxw== X-Google-Smtp-Source: AGHT+IE30pQeNxwTIxy6isUW+aAqMiu0jHID3fpIuTjvgcz86AcrLPHw0kA5islC10OPVkIGp4LNh+A= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:903:328e:b0:1ca:1e12:7c85 with SMTP id jh14-20020a170903328e00b001ca1e127c85mr341274plb.3.1698965935122; Thu, 02 Nov 2023 15:58:55 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:33 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-10-sdf@google.com> Subject: [PATCH bpf-next v5 09/13] selftests/xsk: Support tx_metadata_len From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Add new config field and propagate to UMEM registration setsockopt. Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xsk.c | 3 +++ tools/testing/selftests/bpf/xsk.h | 1 + 2 files changed, 4 insertions(+) diff --git a/tools/testing/selftests/bpf/xsk.c b/tools/testing/selftests/bpf/xsk.c index e574711eeb84..25d568abf0f2 100644 --- a/tools/testing/selftests/bpf/xsk.c +++ b/tools/testing/selftests/bpf/xsk.c @@ -115,6 +115,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg, cfg->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; cfg->frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM; cfg->flags = XSK_UMEM__DEFAULT_FLAGS; + cfg->tx_metadata_len = 0; return; } @@ -123,6 +124,7 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg, cfg->frame_size = usr_cfg->frame_size; cfg->frame_headroom = usr_cfg->frame_headroom; cfg->flags = usr_cfg->flags; + cfg->tx_metadata_len = usr_cfg->tx_metadata_len; } static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg, @@ -252,6 +254,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, mr.chunk_size = umem->config.frame_size; mr.headroom = umem->config.frame_headroom; mr.flags = umem->config.flags; + mr.tx_metadata_len = umem->config.tx_metadata_len; err = setsockopt(umem->fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr)); if (err) { diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h index 771570bc3731..93c2cc413cfc 100644 --- a/tools/testing/selftests/bpf/xsk.h +++ b/tools/testing/selftests/bpf/xsk.h @@ -200,6 +200,7 @@ struct xsk_umem_config { __u32 frame_size; __u32 frame_headroom; __u32 flags; + __u32 tx_metadata_len; }; int xsk_attach_xdp_program(struct bpf_program *prog, int ifindex, u32 xdp_flags); From patchwork Thu Nov 2 22:58:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444260 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1D35F23742 for ; Thu, 2 Nov 2023 22:58:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="BlSieQbq" Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1036B192 for ; Thu, 2 Nov 2023 15:58:58 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cc23f2226bso11829205ad.2 for ; Thu, 02 Nov 2023 15:58:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965937; x=1699570737; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=+IGkzuCjLv5zFnRniqT4SWqZjcb9N4mGztS9ZvMTe5s=; b=BlSieQbqwz+zCz1YKwNpwhBwPcKv6CtTslX0XprklqUCUXOGg+Mf63eDV1PVxq5fUU Bjyidpksx6m0M41WIMext+1qkSGmpmwRIXQrwg7lRhbKG9kW88vjV9XRON7w1x0X+Q5A EBLN0JlnpOm1sGDxBIkDm/D/ZJFCVKxNLWRqdnPQNl4K4BnVMiWWsmcyX5ss2EDaW0Kl 5xvdeLGKTQqkVHrk7+MSIzRSvrINzmOOrzAj3LqhREoXw6JEPYza+JLZkhH0sytEzj9B sRoWRywsoUwRgX+SP4XEau1FoThxa8le+puk/VzvAT5EdgDiW8BUWgraR6KozpbKFm4W W6xA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965937; x=1699570737; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=+IGkzuCjLv5zFnRniqT4SWqZjcb9N4mGztS9ZvMTe5s=; b=v1uoXwvtvKSDHeqpxXZbdffmjkt00DiWatDvWAsrpdtq96PkXC1aXiNmfx2kqujjrf RUZXgSxR1hHfIY5uQRjJDXmBG3NorZlyEVfx/pTj4XhI9mOsyxJ+ugX4urUKytnSpaio SZ/OG6gPZH+6tK5OyN2TphgnekTMkx3vgbwLON9HJPM1LaBU5VtuxyobXIgsQAQtoOdV q4ATVKqVhedWkdHwEKpEkKrwKQ+06c9TGlJZ4YlXE8LvO87VCRDaMELayZJd/3orVjFx 3f9vYpr7OiPB+Amw65tMVwfIaL2nRJisbh0h5rzVRii12BXAlRfN6A3pV0zxu7MSzevy RE8Q== X-Gm-Message-State: AOJu0YxdQvP+EdLeiQpLQ90dPdQ0UDWOazzV6S/W57ICeJmNjyVkKmPL CZlcpw+bzgy1QirfJF73ynOmqkBrWiknlP3q0s04qiqXiMRP5KnQvBgrNIytbg3Hu7ICLIu2vVK cTB0NNEwj5xLmz0ut0uVUeF09bDFbVm5DUEtYTtrwkJjCB5xpNA== X-Google-Smtp-Source: AGHT+IH6Wxf2G64tR0jzO3T+h2cnip19vyYFp36oRClwsoNM0OizspJXCj9yuO9CV3dzDnbjyYHxgsQ= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:ac8e:b0:1cc:30cf:eae6 with SMTP id h14-20020a170902ac8e00b001cc30cfeae6mr305364plr.10.1698965936662; Thu, 02 Nov 2023 15:58:56 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:34 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-11-sdf@google.com> Subject: [PATCH bpf-next v5 10/13] selftests/bpf: Add csum helpers From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Checksum helpers will be used to calculate pseudo-header checksum in AF_XDP metadata selftests. The helpers are mirroring existing kernel ones: - csum_tcpudp_magic : IPv4 pseudo header csum - csum_ipv6_magic : IPv6 pseudo header csum - csum_fold : fold csum and do one's complement Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/network_helpers.h | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 34f1200a781b..94b9be24e39b 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -71,4 +71,47 @@ struct nstoken; */ struct nstoken *open_netns(const char *name); void close_netns(struct nstoken *token); + +static __u16 csum_fold(__u32 csum) +{ + csum = (csum & 0xffff) + (csum >> 16); + csum = (csum & 0xffff) + (csum >> 16); + + return (__u16)~csum; +} + +static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr, + __u32 len, __u8 proto, + __wsum csum) +{ + __u64 s = csum; + + s += (__u32)saddr; + s += (__u32)daddr; + s += htons(proto + len); + s = (s & 0xffffffff) + (s >> 32); + s = (s & 0xffffffff) + (s >> 32); + + return csum_fold((__u32)s); +} + +static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, + __wsum csum) +{ + __u64 s = csum; + int i; + + for (i = 0; i < 4; i++) + s += (__u32)saddr->s6_addr32[i]; + for (i = 0; i < 4; i++) + s += (__u32)daddr->s6_addr32[i]; + s += htons(proto + len); + s = (s & 0xffffffff) + (s >> 32); + s = (s & 0xffffffff) + (s >> 32); + + return csum_fold((__u32)s); +} + #endif From patchwork Thu Nov 2 22:58:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444262 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D0702376D for ; Thu, 2 Nov 2023 22:59:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="vHDYirYS" Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 720C7196 for ; Thu, 2 Nov 2023 15:58:59 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cc29f3afe0so12292815ad.2 for ; Thu, 02 Nov 2023 15:58:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965939; x=1699570739; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gH3ul6akAfngEDdVa6ettNYmN5CbVfUiDAfxlJmoIdI=; b=vHDYirYSvdo9m4t5zpb27EKU7cVmpa0jusrji1vAsrwhUi0umZeK1nhvlreTdnh6nJ J29NvP7qKzMbo7kInLbS0yO0EXOitkPR6O3ATkINjGgommZU8luSO6XC37DcuBiNON9a yXiQt9wzgSQ4cv5sIigmJvG9wxWcjPem5kN7KvVXhUJtVbB+69YxZB1Y7VqIXGUNCeZq fm6xo/JdzWWMZ2q72mw/qKgRpebTpcGbsKDeWuJaX66Jnb0hQxQpvXkrU8rHxKgohoFg uS2shwcjjhd2bt5cUe93K1JcD1K6uVHtHTOgLZQerlTZazHVg3eQ7DK82hLRConJUYT8 8yqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965939; x=1699570739; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gH3ul6akAfngEDdVa6ettNYmN5CbVfUiDAfxlJmoIdI=; b=wnoLKsix1qdy1/8SOCfZed7G60YUvdKNzg3KCCR9JWx8ange6wbXnQVwAsNW6cYuxn dGzO/RxdVzNZVdnXIczKj82EXzglUYUqKaKJMfwVDiOcCY6MBOOJiPKUh+VuTyh+6hR4 RN7ASo+VSp5JHGQZyUOXTE+r4m37he2u1lHEKGKEdP0GAkoUvqz+zLVCACUb9iV+iOHx sxqqwADAp6iiygBizx2+auojN2GAzaekMI0OXw0P4R6LOcSMaeDJBEiOovgQmGfeS/t1 KiSnseTzoqMRvc+r5QTHsaPon3BYpSL98bJQfISdDgEtMNQAiCFw1xxrxhEmR/n7gr58 +akA== X-Gm-Message-State: AOJu0Yx5X5TbErVVvRfnyHRRSuxGwQhg/S2kM2fbmvTjfIgKmbPZCjyF vxucaDuToC6m4qkVcahBVi/S9sFfPdabYd9FgS8hmV8RQnsIYjEU4lBwRmoR+yef18JUrX5/abx 7wc0WcwEdDhWT5jQEN5r3F+RpiYwUMGfjn+CkVBMokjJwoFyO6w== X-Google-Smtp-Source: AGHT+IEzJ8//TNpnPklZK9Sg4SXhqD7yPXp8IJUUrVVcPRQXVB6UZwCQb19FTQTizWLybUJWGb++3D8= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:9b86:b0:1cc:323c:fe4a with SMTP id y6-20020a1709029b8600b001cc323cfe4amr301672plp.12.1698965938529; Thu, 02 Nov 2023 15:58:58 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:35 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-12-sdf@google.com> Subject: [PATCH bpf-next v5 11/13] selftests/bpf: Add TX side to xdp_metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net Request TX timestamp and make sure it's not empty. Request TX checksum offload (SW-only) and make sure it's resolved to the correct one. Signed-off-by: Stanislav Fomichev --- .../selftests/bpf/prog_tests/xdp_metadata.c | 33 ++++++++++++++++--- 1 file changed, 29 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c index 4439ba9392f8..9eec39969b1e 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -56,7 +56,8 @@ static int open_xsk(int ifindex, struct xsk *xsk) .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, - .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG | XDP_UMEM_TX_SW_CSUM, + .tx_metadata_len = sizeof(struct xsk_tx_metadata), }; __u32 idx; u64 addr; @@ -138,6 +139,7 @@ static void ip_csum(struct iphdr *iph) static int generate_packet(struct xsk *xsk, __u16 dst_port) { + struct xsk_tx_metadata *meta; struct xdp_desc *tx_desc; struct udphdr *udph; struct ethhdr *eth; @@ -151,10 +153,14 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) return -1; tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); - tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE; + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE + sizeof(struct xsk_tx_metadata); printf("%p: tx_desc[%u]->addr=%llx\n", xsk, idx, tx_desc->addr); data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + meta = data - sizeof(struct xsk_tx_metadata); + memset(meta, 0, sizeof(*meta)); + meta->request.flags = XDP_TXMD_FLAGS_TIMESTAMP; + eth = data; iph = (void *)(eth + 1); udph = (void *)(iph + 1); @@ -178,11 +184,17 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) udph->source = htons(AF_XDP_SOURCE_PORT); udph->dest = htons(dst_port); udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); - udph->check = 0; + udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); memset(udph + 1, 0xAA, UDP_PAYLOAD_BYTES); + meta->request.flags |= XDP_TXMD_FLAGS_CHECKSUM; + meta->request.csum_start = sizeof(*eth) + sizeof(*iph); + meta->request.csum_offset = offsetof(struct udphdr, check); + tx_desc->len = sizeof(*eth) + sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES; + tx_desc->options |= XDP_TX_METADATA; xsk_ring_prod__submit(&xsk->tx, 1); ret = sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); @@ -194,13 +206,21 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) static void complete_tx(struct xsk *xsk) { - __u32 idx; + struct xsk_tx_metadata *meta; __u64 addr; + void *data; + __u32 idx; if (ASSERT_EQ(xsk_ring_cons__peek(&xsk->comp, 1, &idx), 1, "xsk_ring_cons__peek")) { addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr); + + data = xsk_umem__get_data(xsk->umem_area, addr); + meta = data - sizeof(struct xsk_tx_metadata); + + ASSERT_NEQ(meta->completion.tx_timestamp, 0, "tx_timestamp"); + xsk_ring_cons__release(&xsk->comp, 1); } } @@ -221,6 +241,7 @@ static int verify_xsk_metadata(struct xsk *xsk) const struct xdp_desc *rx_desc; struct pollfd fds = {}; struct xdp_meta *meta; + struct udphdr *udph; struct ethhdr *eth; struct iphdr *iph; __u64 comp_addr; @@ -257,6 +278,7 @@ static int verify_xsk_metadata(struct xsk *xsk) ASSERT_EQ(eth->h_proto, htons(ETH_P_IP), "eth->h_proto"); iph = (void *)(eth + 1); ASSERT_EQ((int)iph->version, 4, "iph->version"); + udph = (void *)(iph + 1); /* custom metadata */ @@ -270,6 +292,9 @@ static int verify_xsk_metadata(struct xsk *xsk) ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type"); + /* checksum offload */ + ASSERT_EQ(udph->check, 0x1c72, "csum"); + xsk_ring_cons__release(&xsk->rx, 1); refill_rx(xsk, comp_addr); From patchwork Thu Nov 2 22:58:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444263 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F0D7E224E0 for ; Thu, 2 Nov 2023 22:59:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="CebcoSbz" Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95B9C18B for ; Thu, 2 Nov 2023 15:59:01 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1cc1ddb34ccso12324325ad.1 for ; Thu, 02 Nov 2023 15:59:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965941; x=1699570741; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dQa6UGFIhvagf5iHtxwy2S77Dl6tSWrDDUnqP4yyJ24=; b=CebcoSbz0pyg5dEZVaSIZWuYu7NiHsNzCjXj6nZcCZuNdDeid/lxEXiRlOzQXo3zf1 v37O9mjP3cNv5o9SWPGf+inYweoppTJHQQroyPYJZqk0G2frgtXRachAPBkkIh56arTH XW0mQJVN5tCDIbD0Z2iW3f8cE2xGLD5xhUxGDrmdKg27qSYWUv2WhuL/G8y0Cvhvh28X zIZOqNlyZtKqRG2oDFOKtPPUIRHZg9/WgpX/Q429GLIhuTBtWf10kwVGswSERbAnsXoK rrk3VRBZHy+Z/rFBkhv0x76OzRfU8NvTCCWfGjNqnAVwy4g0G/l2GjLcgGYv6ubPIzO5 9Epw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965941; x=1699570741; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dQa6UGFIhvagf5iHtxwy2S77Dl6tSWrDDUnqP4yyJ24=; b=hjBonYTgHV6u/guUf4/Omb2eIgquwyGKz18bypSV4D6spO/LwQ2rhHV0MvZUCCJk7d t9QAN3IiwvZNhQ4PsumRc4rydXziyPfiFetl85evZ4soIEcHAk79I8Axys2NKH9wowwP zw3YsXQD+jS8/L7UXDjWqIRnRUSk7ZEp1gmY8AyMjbUIk3LXmRUSRBf8302NFY8wpiJO 76AD6yyAYKZw9I9gKFH5twIJS6Xht8JxWbcCPzKZUJpJgmTm7GGzuG79Y5WPNi/M92dg Fyt0rX4+Iefn1v1jErUXf4KhNCkGRkMApUdYFSHGzGGv3RIXaaLIXNPpSqbJER53TT+y y9wQ== X-Gm-Message-State: AOJu0YwJEyDh94uig6jVGAyoKVwR+iZk3B5Wu1C8IBqDdDXSznDQch2m snd0VpffbnBLZG14uqNtS4n5TB8s9+xAkvYh2WrQU/AV6leSM9EwU5Exj4p9lsX0JGJIUdT+vtG lJ2rJs4nBKgHvyqfzz6MrbhN3nrGjKkbh6vcwjt0M1MjzdeCi6g== X-Google-Smtp-Source: AGHT+IHv0MEzw3LQnjLafs43YmbvepD2LmqtZjHeCsplTM331Pz+i8egVtOE5buydbeig+TVsZuQU3o= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:f7d6:b0:1cc:23d2:bb94 with SMTP id h22-20020a170902f7d600b001cc23d2bb94mr322063plw.1.1698965940246; Thu, 02 Nov 2023 15:59:00 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:36 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-13-sdf@google.com> Subject: [PATCH bpf-next v5 12/13] selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net This is the recommended way to run AF_XDP, so let's use it in the test. Also, some unrelated changes to now blow up the log too much: - change default mode to zerocopy and add -c to use copy mode - small fixes for the flags/sizes/prints - add print_tstamp_delta to print timestamp + reference Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 73 +++++++++++++------ 1 file changed, 49 insertions(+), 24 deletions(-) diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 17c0f92ff160..057f7c145f62 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -32,7 +32,7 @@ #include "xdp_metadata.h" -#define UMEM_NUM 16 +#define UMEM_NUM 256 #define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE #define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) #define XDP_FLAGS (XDP_FLAGS_DRV_MODE | XDP_FLAGS_REPLACE) @@ -48,7 +48,7 @@ struct xsk { }; struct xdp_hw_metadata *bpf_obj; -__u16 bind_flags = XDP_COPY; +__u16 bind_flags = XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY; struct xsk *rx_xsk; const char *ifname; int ifindex; @@ -68,7 +68,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, - .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + .flags = XSK_UMEM__DEFAULT_FLAGS, }; __u32 idx; u64 addr; @@ -110,7 +110,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) for (i = 0; i < UMEM_NUM / 2; i++) { addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); - *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + *xsk_ring_prod__fill_addr(&xsk->fill, idx + i) = addr; } xsk_ring_prod__submit(&xsk->fill, ret); @@ -131,12 +131,22 @@ static void refill_rx(struct xsk *xsk, __u64 addr) __u32 idx; if (xsk_ring_prod__reserve(&xsk->fill, 1, &idx) == 1) { - printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + printf("%p: complete rx idx=%u addr=%llx\n", xsk, idx, addr); *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; xsk_ring_prod__submit(&xsk->fill, 1); } } +static int kick_tx(struct xsk *xsk) +{ + return sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); +} + +static int kick_rx(struct xsk *xsk) +{ + return recvfrom(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, NULL); +} + #define NANOSEC_PER_SEC 1000000000 /* 10^9 */ static __u64 gettime(clockid_t clock_id) { @@ -152,6 +162,17 @@ static __u64 gettime(clockid_t clock_id) return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec; } +static void print_tstamp_delta(const char *name, const char *refname, + __u64 tstamp, __u64 reference) +{ + __s64 delta = (__s64)reference - (__s64)tstamp; + + printf("%s: %llu (sec:%0.4f) delta to %s sec:%0.4f (%0.3f usec)\n", + name, tstamp, (double)tstamp / NANOSEC_PER_SEC, refname, + (double)delta / NANOSEC_PER_SEC, + (double)delta / 1000); +} + static void verify_xdp_metadata(void *data, clockid_t clock_id) { struct xdp_meta *meta; @@ -167,22 +188,13 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id) printf("rx_timestamp: %llu (sec:%0.4f)\n", meta->rx_timestamp, (double)meta->rx_timestamp / NANOSEC_PER_SEC); if (meta->rx_timestamp) { - __u64 usr_clock = gettime(clock_id); - __u64 xdp_clock = meta->xdp_timestamp; - __s64 delta_X = xdp_clock - meta->rx_timestamp; - __s64 delta_X2U = usr_clock - xdp_clock; - - printf("XDP RX-time: %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n", - xdp_clock, (double)xdp_clock / NANOSEC_PER_SEC, - (double)delta_X / NANOSEC_PER_SEC, - (double)delta_X / 1000); - - printf("AF_XDP time: %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n", - usr_clock, (double)usr_clock / NANOSEC_PER_SEC, - (double)delta_X2U / NANOSEC_PER_SEC, - (double)delta_X2U / 1000); - } + __u64 ref_tstamp = gettime(clock_id); + print_tstamp_delta("HW RX-time", "User RX-time", + meta->rx_timestamp, ref_tstamp); + print_tstamp_delta("XDP RX-time", "User RX-time", + meta->xdp_timestamp, ref_tstamp); + } } static void verify_skb_metadata(int fd) @@ -252,6 +264,13 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t while (true) { errno = 0; + + for (i = 0; i < rxq; i++) { + ret = kick_rx(&rx_xsk[i]); + if (ret) + printf("kick_rx ret=%d\n", ret); + } + ret = poll(fds, rxq + 1, 1000); printf("poll: %d (%d) skip=%llu fail=%llu redir=%llu\n", ret, errno, bpf_obj->bss->pkts_skip, @@ -420,8 +439,9 @@ static void print_usage(void) { const char *usage = "Usage: xdp_hw_metadata [OPTIONS] [IFNAME]\n" - " -m Enable multi-buffer XDP for larger MTU\n" + " -c Run in copy mode (zerocopy is default)\n" " -h Display this help and exit\n\n" + " -m Enable multi-buffer XDP for larger MTU\n" "Generate test packets on the other machine with:\n" " echo -n xdp | nc -u -q1 9091\n"; @@ -432,14 +452,19 @@ static void read_args(int argc, char *argv[]) { char opt; - while ((opt = getopt(argc, argv, "mh")) != -1) { + while ((opt = getopt(argc, argv, "chm")) != -1) { switch (opt) { - case 'm': - bind_flags |= XDP_USE_SG; + case 'c': + bind_flags &= ~XDP_USE_NEED_WAKEUP; + bind_flags &= ~XDP_ZEROCOPY; + bind_flags |= XDP_COPY; break; case 'h': print_usage(); exit(0); + case 'm': + bind_flags |= XDP_USE_SG; + break; case '?': if (isprint(optopt)) fprintf(stderr, "Unknown option: -%c\n", optopt); From patchwork Thu Nov 2 22:58:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13444264 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C646B22338 for ; Thu, 2 Nov 2023 22:59:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Q5AIZlOL" Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 37B5A193 for ; Thu, 2 Nov 2023 15:59:03 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id d9443c01a7336-1cc3130ba31so12515595ad.0 for ; Thu, 02 Nov 2023 15:59:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698965942; x=1699570742; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/NH2QxfbfXdu6q/NKkSOhpUqwFaO0XJc1nEbLYxViCY=; b=Q5AIZlOLabKOnJWG8j7CgMb40VrjUIdvZZxDFW0a1HL6Sdz0ozdDkq+fJ7XTo4Aq++ oB7WOjmE2f8zUPBAu0rhe/w1hVv7FqYYVFyRYnc5JS/RvwApUvi1oxvQsGjXzXH0UXRP 6nOT177i3nFrxHFB9RKGYBP8B0udMF/My9niF7awxNcixNXnsomoXTCWb9rXObJQaqtc qHmL0RIedAq7fHz867GMseh057Fatv2U1bcW/IyVmulGvXTFAYD10jF0QdbjP9gJWDfw DmoTAev1W8hgGP5ydM5YtUHPjaGuU3+qAOB5kFpNmqx+iP1uPiyLZLcPhyFIK+K49cPl /7/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698965942; x=1699570742; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/NH2QxfbfXdu6q/NKkSOhpUqwFaO0XJc1nEbLYxViCY=; b=uOh9Cuk+g08OtQW6GvlVlbufmYp9PaMWdJRH35Y8hg2++AHHjjO47YYGE2tl61w+H7 jrTfS/JY+EE3q92II1WRS8F61zvpSRQ14W2D7iF/sCTyUXGLi6pLYoPjOOEE3Lg7QvKA pUedd8jbkyS9zJ5DzLJ8s47bz0j0pNXZLuJdRHiHPREbHTUceDcDLkPZLcqUG12WkTVO y9kZZPKJ9gOOPoA1Bttmsy2G9W07vhfGAgRWMeLpitVg/n06yM/BqHmgwYczyz4XvxM8 o8cQHHJ+Tgi58p5+gsGXC38IhUsynTfCND1ZM5GhPKVvBIqLbvC/0tkaM3XHABUvVt5l ffmg== X-Gm-Message-State: AOJu0YycJV8CPxbOKQaTsJ8pd6EdTN6Z4jRY+iL0QnARm8uovx2Y+Smz j+vfODfimjyCixdwJkxQ2mcjLEcER145P1Ser4VPieJulDIIdtAbcqdsIOrOZNraCIoqe6aggEp A7vxT7rR2WCapSYSTwkqnwKTY8VOCkyBX92SZA6T2wkFG++UblA== X-Google-Smtp-Source: AGHT+IGnPc3ySTbyIpSSr+xEGKtnNGG4nSw5Xg36zEVKJmKs2POT/T0DNKSLiovSBH4XKF3O3pp1S5I= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:903:2594:b0:1cc:1389:512e with SMTP id jb20-20020a170903259400b001cc1389512emr335139plb.4.1698965942117; Thu, 02 Nov 2023 15:59:02 -0700 (PDT) Date: Thu, 2 Nov 2023 15:58:37 -0700 In-Reply-To: <20231102225837.1141915-1-sdf@google.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231102225837.1141915-1-sdf@google.com> X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog Message-ID: <20231102225837.1141915-14-sdf@google.com> Subject: [PATCH bpf-next v5 13/13] selftests/bpf: Add TX side to xdp_hw_metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, yoong.siang.song@intel.com, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Patchwork-Delegate: bpf@iogearbox.net When we get a packet on port 9091, we swap src/dst and send it out. At this point we also request the timestamp and checksum offloads. Checksum offload is verified by looking at the tcpdump on the other side. The tool prints pseudo-header csum and the final one it expects. The final checksum actually matches the incoming packets checksum because we only flip the src/dst and don't change the payload. Some other related changes: - switched to zerocopy mode by default; new flag can be used to force old behavior - request fixed tx_metadata_len headroom - some other small fixes (umem size, fill idx+i, etc) mvbz3:~# ./xdp_hw_metadata eth3 ... xsk_ring_cons__peek: 1 0x19546f8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 rx_hash: 0x80B7EA8B with RSS type:0x2A rx_timestamp: 1697580171852147395 (sec:1697580171.8521) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to User RX-time sec:0.2797 (279673.082 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User RX-time sec:0.0001 (121.430 usec) 0x19546f8: ping-pong with csum=3b8e (want d862) csum_start=54 csum_offset=6 0x19546f8: complete tx idx=0 addr=8 tx_timestamp: 1697580172056756493 (sec:1697580172.0568) HW TX-complete-time: 1697580172056756493 (sec:1697580172.0568), delta to User TX-complete-time sec:0.0852 (85175.537 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User TX-complete-time sec:0.0102 (10232.983 usec) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to HW TX-complete-time sec:0.2046 (204609.098 usec) 0x19546f8: complete rx idx=128 addr=80100 mvbz4:~# nc -Nu -q1 ${MVBZ3_LINK_LOCAL_IP}%eth3 9091 mvbz4:~# tcpdump -vvx -i eth3 udp tcpdump: listening on eth3, link-type EN10MB (Ethernet), snapshot length 262144 bytes 12:26:09.301074 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1087.55807 > fe80::1270:fdff:fe48:1077.9091: [bad udp cksum 0x3b8e -> 0xde7e!] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1087 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1077 d9ff 2383 000b 3b8e 0x0030: 7864 70 12:26:09.301976 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1077.9091 > fe80::1270:fdff:fe48:1087.55807: [udp sum ok] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1077 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1087 2383 d9ff 000b de7e 0x0030: 7864 70 Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 164 +++++++++++++++++- 1 file changed, 160 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 057f7c145f62..efaa3243483a 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -10,7 +10,9 @@ * - rx_hash * * TX: - * - TBD + * - UDP 9091 packets trigger TX reply + * - TX HW timestamp is requested and reported back upon completion + * - TX checksum is requested */ #include @@ -24,11 +26,14 @@ #include #include #include +#include #include #include #include #include #include +#include +#include #include "xdp_metadata.h" @@ -53,6 +58,9 @@ struct xsk *rx_xsk; const char *ifname; int ifindex; int rxq; +bool skip_tx; +__u64 last_hw_rx_timestamp; +__u64 last_xdp_rx_timestamp; void test__fail(void) { /* for network_helpers.c */ } @@ -69,6 +77,7 @@ static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, .flags = XSK_UMEM__DEFAULT_FLAGS, + .tx_metadata_len = sizeof(struct xsk_tx_metadata), }; __u32 idx; u64 addr; @@ -185,15 +194,19 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id) printf("rx_hash: 0x%X with RSS type:0x%X\n", meta->rx_hash, meta->rx_hash_type); - printf("rx_timestamp: %llu (sec:%0.4f)\n", meta->rx_timestamp, - (double)meta->rx_timestamp / NANOSEC_PER_SEC); if (meta->rx_timestamp) { __u64 ref_tstamp = gettime(clock_id); + /* store received timestamps to calculate a delta at tx */ + last_hw_rx_timestamp = meta->rx_timestamp; + last_xdp_rx_timestamp = meta->xdp_timestamp; + print_tstamp_delta("HW RX-time", "User RX-time", meta->rx_timestamp, ref_tstamp); print_tstamp_delta("XDP RX-time", "User RX-time", meta->xdp_timestamp, ref_tstamp); + } else { + printf("No rx_timestamp\n"); } } @@ -242,6 +255,129 @@ static void verify_skb_metadata(int fd) printf("skb hwtstamp is not found!\n"); } +static bool complete_tx(struct xsk *xsk, clockid_t clock_id) +{ + struct xsk_tx_metadata *meta; + __u64 addr; + void *data; + __u32 idx; + + if (!xsk_ring_cons__peek(&xsk->comp, 1, &idx)) + return false; + + addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); + data = xsk_umem__get_data(xsk->umem_area, addr); + meta = data - sizeof(struct xsk_tx_metadata); + + printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr); + + if (meta->completion.tx_timestamp) { + __u64 ref_tstamp = gettime(clock_id); + + print_tstamp_delta("HW TX-complete-time", "User TX-complete-time", + meta->completion.tx_timestamp, ref_tstamp); + print_tstamp_delta("XDP RX-time", "User TX-complete-time", + last_xdp_rx_timestamp, ref_tstamp); + print_tstamp_delta("HW RX-time", "HW TX-complete-time", + last_hw_rx_timestamp, meta->completion.tx_timestamp); + } else { + printf("No tx_timestamp\n"); + } + + xsk_ring_cons__release(&xsk->comp, 1); + + return true; +} + +#define swap(a, b, len) do { \ + for (int i = 0; i < len; i++) { \ + __u8 tmp = ((__u8 *)a)[i]; \ + ((__u8 *)a)[i] = ((__u8 *)b)[i]; \ + ((__u8 *)b)[i] = tmp; \ + } \ +} while (0) + +static void ping_pong(struct xsk *xsk, void *rx_packet, clockid_t clock_id) +{ + struct xsk_tx_metadata *meta; + struct ipv6hdr *ip6h = NULL; + struct iphdr *iph = NULL; + struct xdp_desc *tx_desc; + struct udphdr *udph; + struct ethhdr *eth; + __sum16 want_csum; + void *data; + __u32 idx; + int ret; + int len; + + ret = xsk_ring_prod__reserve(&xsk->tx, 1, &idx); + if (ret != 1) { + printf("%p: failed to reserve tx slot\n", xsk); + return; + } + + tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE + sizeof(struct xsk_tx_metadata); + data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + + meta = data - sizeof(struct xsk_tx_metadata); + memset(meta, 0, sizeof(*meta)); + meta->request.flags = XDP_TXMD_FLAGS_TIMESTAMP; + + eth = rx_packet; + + if (eth->h_proto == htons(ETH_P_IP)) { + iph = (void *)(eth + 1); + udph = (void *)(iph + 1); + } else if (eth->h_proto == htons(ETH_P_IPV6)) { + ip6h = (void *)(eth + 1); + udph = (void *)(ip6h + 1); + } else { + printf("%p: failed to detect IP version for ping pong %04x\n", xsk, eth->h_proto); + xsk_ring_prod__cancel(&xsk->tx, 1); + return; + } + + len = ETH_HLEN; + if (ip6h) + len += sizeof(*ip6h) + ntohs(ip6h->payload_len); + if (iph) + len += ntohs(iph->tot_len); + + swap(eth->h_dest, eth->h_source, ETH_ALEN); + if (iph) + swap(&iph->saddr, &iph->daddr, 4); + else + swap(&ip6h->saddr, &ip6h->daddr, 16); + swap(&udph->source, &udph->dest, 2); + + want_csum = udph->check; + if (ip6h) + udph->check = ~csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); + else + udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, + ntohs(udph->len), IPPROTO_UDP, 0); + + meta->request.flags |= XDP_TXMD_FLAGS_CHECKSUM; + if (iph) + meta->request.csum_start = sizeof(*eth) + sizeof(*iph); + else + meta->request.csum_start = sizeof(*eth) + sizeof(*ip6h); + meta->request.csum_offset = offsetof(struct udphdr, check); + + printf("%p: ping-pong with csum=%04x (want %04x) csum_start=%d csum_offset=%d\n", + xsk, ntohs(udph->check), ntohs(want_csum), + meta->request.csum_start, meta->request.csum_offset); + + memcpy(data, rx_packet, len); /* don't share umem chunk for simplicity */ + tx_desc->options |= XDP_TX_METADATA; + tx_desc->len = len; + + xsk_ring_prod__submit(&xsk->tx, 1); +} + static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t clock_id) { const struct xdp_desc *rx_desc; @@ -307,6 +443,22 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr), clock_id); first_seg = false; + + if (!skip_tx) { + /* mirror first chunk back */ + ping_pong(xsk, xsk_umem__get_data(xsk->umem_area, addr), + clock_id); + + ret = kick_tx(xsk); + if (ret) + printf("kick_tx ret=%d\n", ret); + + for (int j = 0; j < 500; j++) { + if (complete_tx(xsk, clock_id)) + break; + usleep(10*1000); + } + } } xsk_ring_cons__release(&xsk->rx, 1); @@ -442,6 +594,7 @@ static void print_usage(void) " -c Run in copy mode (zerocopy is default)\n" " -h Display this help and exit\n\n" " -m Enable multi-buffer XDP for larger MTU\n" + " -r Don't generate AF_XDP reply (rx metadata only)\n" "Generate test packets on the other machine with:\n" " echo -n xdp | nc -u -q1 9091\n"; @@ -452,7 +605,7 @@ static void read_args(int argc, char *argv[]) { char opt; - while ((opt = getopt(argc, argv, "chm")) != -1) { + while ((opt = getopt(argc, argv, "chmr")) != -1) { switch (opt) { case 'c': bind_flags &= ~XDP_USE_NEED_WAKEUP; @@ -465,6 +618,9 @@ static void read_args(int argc, char *argv[]) case 'm': bind_flags |= XDP_USE_SG; break; + case 'r': + skip_tx = true; + break; case '?': if (isprint(optopt)) fprintf(stderr, "Unknown option: -%c\n", optopt);