[net-next,v2,16/18] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go

Message ID 20241015145350.4077765-17-aleksander.lobakin@intel.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series idpf: XDP chapter III: core XDP changes (+libeth_xdp)

Checks

Context                         Check    Description
netdev/series_format            fail     Series longer than 15 patches
netdev/tree_selection           success  Clearly marked for net-next, async
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 5 this patch: 5
netdev/build_tools              success  Errors and warnings before: 0 (+0) this patch: 0 (+0)
netdev/cc_maintainers           warning  4 maintainers not CCed: jonathan.lemon@gmail.com bjorn@kernel.org maciej.fijalkowski@intel.com hawk@kernel.org
netdev/build_clang              success  Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 35 this patch: 35
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 92 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-10-16--12-00 (tests: 776)

Commit Message

Alexander Lobakin Oct. 15, 2024, 2:53 p.m. UTC
Currently, when you send an XSk frame without metadata, you need to do
the following:

* call external xsk_buff_raw_get_dma();
* call inline xsk_buff_get_metadata(), which calls external
  xsk_buff_raw_get_data() and then do some inline checks.

This effectively means that the following piece:

addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;

is done twice per frame, plus you have 2 external calls per frame, plus
this:

	meta = pool->addrs + addr - pool->tx_metadata_len;
	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))

is always inlined, even if there's no meta or it's invalid.

Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
in one go. It returns a small structure with 2 fields: DMA address,
filled unconditionally, and metadata pointer, valid only if it's
present. The address correction is performed only once and you also
have only 1 external call per XSk frame, which does all the calculations
and checks outside of your hotpath. You only need to check
`if (ctx.meta)` for the metadata presence.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 include/net/xdp_sock_drv.h  | 23 +++++++++++++++++++++
 include/net/xsk_buff_pool.h |  8 ++++++++
 net/xdp/xsk_buff_pool.c     | 40 +++++++++++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+)
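
The contract described in the commit message (DMA address always filled, metadata pointer set only when metadata is enabled and valid) can be sketched in userspace. This is only a model, not the kernel code: every `model_*` / `MODEL_*` name is a hypothetical stand-in, the metadata is reduced to a single 64-bit flags word, and the page size, the contiguity bit and the 48-bit split used for unaligned addresses are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins, NOT the kernel definitions (all values assumed). */
#define MODEL_PAGE_SHIFT	12
#define MODEL_PAGE_SIZE		(1ULL << MODEL_PAGE_SHIFT)
#define MODEL_CONTIG_BIT	1ULL	/* flag kept in the low bit of dma_pages[] */
#define MODEL_OFFSET_SHIFT	48	/* unaligned mode: offset in the upper bits */
#define MODEL_META_FLAGS_VALID	0x3ULL	/* assumed set of known flag bits */

struct model_pool {
	uint8_t *addrs;			/* start of the UMEM mapping */
	const uint64_t *dma_pages;	/* per-page DMA base | contig flag */
	uint32_t tx_metadata_len;	/* 0 when TX metadata is disabled */
	int unaligned;
};

struct model_desc_ctx {
	uint64_t dma;	/* always filled */
	void *meta;	/* NULL unless metadata is enabled and valid */
};

/* Fold the offset into the base address when the pool is unaligned. */
static uint64_t model_fix_addr(const struct model_pool *pool, uint64_t addr)
{
	if (!pool->unaligned)
		return addr;
	return (addr & ((1ULL << MODEL_OFFSET_SHIFT) - 1)) +
	       (addr >> MODEL_OFFSET_SHIFT);
}

/* Metadata is treated as valid when no unknown flag bit is set. */
static int model_meta_valid(const void *meta)
{
	uint64_t flags;

	memcpy(&flags, meta, sizeof(flags));
	return !(flags & ~MODEL_META_FLAGS_VALID);
}

/* One call: correct the address once, then derive both results from it. */
static struct model_desc_ctx
model_get_ctx(const struct model_pool *pool, uint64_t addr)
{
	struct model_desc_ctx ret = { 0 };
	uint8_t *meta;

	addr = model_fix_addr(pool, addr);
	assert(pool->dma_pages != NULL);
	ret.dma = (pool->dma_pages[addr >> MODEL_PAGE_SHIFT] & ~MODEL_CONTIG_BIT) +
		  (addr & (MODEL_PAGE_SIZE - 1));

	if (!pool->tx_metadata_len)
		return ret;

	/* Metadata sits immediately before the frame data. */
	meta = pool->addrs + addr - pool->tx_metadata_len;
	if (!model_meta_valid(meta))
		return ret;

	ret.meta = meta;
	return ret;
}
```

In this model a caller would use `ctx.dma` unconditionally and touch the metadata only under `if (ctx.meta)`, which is the single check the commit message leaves on the driver side.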

Comments

Maciej Fijalkowski Oct. 22, 2024, 3:42 p.m. UTC | #1
On Tue, Oct 15, 2024 at 04:53:48PM +0200, Alexander Lobakin wrote:
> Currently, when you send an XSk frame without metadata, you need to do

you meant *with* metadata?

> the following:
> 
> * call external xsk_buff_raw_get_dma();
> * call inline xsk_buff_get_metadata(), which calls external
>   xsk_buff_raw_get_data() and then do some inline checks.
> 
> This effectively means that the following piece:
> 
> addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
> 
> is done twice per frame, plus you have 2 external calls per frame, plus
> this:
> 
> 	meta = pool->addrs + addr - pool->tx_metadata_len;
> 	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
> 
> is always inlined, even if there's no meta or it's invalid.

when there is no meta you bail out early in xsk_buff_get_metadata() as
tx_metadata_len was not set, no?

> 
> Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
> in one go. It returns a small structure with 2 fields: DMA address,
> filled unconditionally, and metadata pointer, valid only if it's
> present. The address correction is performed only once and you also
> have only 1 external call per XSk frame, which does all the calculations
> and checks outside of your hotpath. You only need to check
> `if (ctx.meta)` for the metadata presence.

IMHO adding this might confuse future users which approach should be
preferred.

Thinking out loud...couldn't we export address correction logic and pass
the corrected addr to xsk_buff_get_metadata and then add it to
pool->addrs. But that would require modifying existing callsites +
addressing xp_raw_get_dma() as well :<

Standard question - any perf improvement when micro benchmarking? :P

> 
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>  include/net/xdp_sock_drv.h  | 23 +++++++++++++++++++++
>  include/net/xsk_buff_pool.h |  8 ++++++++
>  net/xdp/xsk_buff_pool.c     | 40 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+)
> 
> diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
> index 6aae95b83645..324a4bb04431 100644
> --- a/include/net/xdp_sock_drv.h
> +++ b/include/net/xdp_sock_drv.h
> @@ -205,6 +205,23 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
>  	return xp_raw_get_data(pool, addr);
>  }
>  
> +/**
> + * xsk_buff_raw_get_ctx - get &xdp_desc context
> + * @pool: XSk buff pool desc address belongs to
> + * @addr: desc address (from userspace)
> + *
> + * Wrapper for xp_raw_get_ctx() to be used in drivers, see its kdoc for
> + * details.
> + *
> + * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata
> + * pointer, if it is present and valid (initialized to %NULL otherwise).
> + */
> +static inline struct xdp_desc_ctx
> +xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
> +{
> +	return xp_raw_get_ctx(pool, addr);
> +}
> +
>  #define XDP_TXMD_FLAGS_VALID ( \
>  		XDP_TXMD_FLAGS_TIMESTAMP | \
>  		XDP_TXMD_FLAGS_CHECKSUM | \
> @@ -402,6 +419,12 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
>  	return NULL;
>  }
>  
> +static inline struct xdp_desc_ctx
> +xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
> +{
> +	return (struct xdp_desc_ctx){ };
> +}
> +
>  static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta)
>  {
>  	return false;
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index 3832997cc605..6c540696a299 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -141,6 +141,14 @@ u32 xp_alloc_batch(struct xsk_buff_pool *pool, struct xdp_buff **xdp, u32 max);
>  bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count);
>  void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr);
>  dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr);
> +
> +struct xdp_desc_ctx {
> +	dma_addr_t dma;
> +	struct xsk_tx_metadata *meta;
> +};
> +
> +struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr);
> +
>  static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb)
>  {
>  	return xskb->dma;
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index ae71da7d2cd6..02c42caec9f4 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -715,3 +715,43 @@ dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr)
>  		(addr & ~PAGE_MASK);
>  }
>  EXPORT_SYMBOL(xp_raw_get_dma);
> +
> +/**
> + * xp_raw_get_ctx - get &xdp_desc context
> + * @pool: XSk buff pool desc address belongs to
> + * @addr: desc address (from userspace)
> + *
> + * Helper for getting desc's DMA address and metadata pointer, if present.
> + * Saves one call on hotpath, double calculation of the actual address,
> + * and inline checks for metadata presence and sanity.
> + * Please use xsk_buff_raw_get_ctx() in drivers instead.
> + *
> + * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata
> + * pointer, if it is present and valid (initialized to %NULL otherwise).
> + */
> +struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
> +{
> +	struct xsk_tx_metadata *meta;
> +	struct xdp_desc_ctx ret;
> +
> +	addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
> +	ret = (typeof(ret)){
> +		/* Same logic as in xp_raw_get_dma() */
> +		.dma	= (pool->dma_pages[addr >> PAGE_SHIFT] &
> +			   ~XSK_NEXT_PG_CONTIG_MASK) + (addr & ~PAGE_MASK),
> +	};
> +
> +	if (!pool->tx_metadata_len)
> +		goto out;
> +
> +	/* Same logic as in xp_raw_get_data() + xsk_buff_get_metadata() */
> +	meta = pool->addrs + addr - pool->tx_metadata_len;
> +	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
> +		goto out;
> +
> +	ret.meta = meta;
> +
> +out:
> +	return ret;
> +}
> +EXPORT_SYMBOL(xp_raw_get_ctx);
> -- 
> 2.46.2
>
Alexander Lobakin Oct. 23, 2024, 2:50 p.m. UTC | #2
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Tue, 22 Oct 2024 17:42:13 +0200

> On Tue, Oct 15, 2024 at 04:53:48PM +0200, Alexander Lobakin wrote:
>> Currently, when you send an XSk frame without metadata, you need to do
> 
> you meant *with* metadata?

Eeeeh... Maybe, I forgot already what I wanted to say =\

> 
>> the following:
>>
>> * call external xsk_buff_raw_get_dma();
>> * call inline xsk_buff_get_metadata(), which calls external
>>   xsk_buff_raw_get_data() and then do some inline checks.
>>
>> This effectively means that the following piece:
>>
>> addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
>>
>> is done twice per frame, plus you have 2 external calls per frame, plus
>> this:
>>
>> 	meta = pool->addrs + addr - pool->tx_metadata_len;
>> 	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
>>
>> is always inlined, even if there's no meta or it's invalid.
> 
> when there is no meta you bail out early in xsk_buff_get_metadata() as
> tx_metadata_len was not set, no?

Yes, but this code is still inlined.
See below (at the end of the reply).

> 
>>
>> Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
>> in one go. It returns a small structure with 2 fields: DMA address,
>> filled unconditionally, and metadata pointer, valid only if it's
>> present. The address correction is performed only once and you also
>> have only 1 external call per XSk frame, which does all the calculations
>> and checks outside of your hotpath. You only need to check
>> `if (ctx.meta)` for the metadata presence.
> 
> IMHO adding this might confuse future users which approach should be
> preferred.

It's a regular practice in the kernel that we have several functions to
do +/- the same. It's up to the developer which one to pick, he reads
the code and decides himself.

> 
> Thinking out loud...couldn't we export address correction logic and pass
> the corrected addr to xsk_buff_get_metadata and then add it to
> pool->addrs. But that would require modifying existing callsites +
> addressing xp_raw_get_dma() as well :<

Yes, modifying the current API requires touching the users.
+ keeping xsk_buff_get_metadata() negates most of the main purpose of
this patch, see below.

> 
> Standard question - any perf improvement when micro benchmarking? :P

TBH I didn't test before/after with the meta enabled, but it was enough
for me that using this function instead of the get_dma + get_meta pair
reduces the object code size by 1 Kb when unrolling by 8.

Thanks,
Olek
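
To make the two points of the exchange above concrete (the early bail-out when tx_metadata_len is zero, and the work repeated when it is not), here is a small instrumented userspace model. It is a sketch under assumptions, not the kernel code: all `model_*` names are hypothetical stand-ins, the DMA mapping is flattened to a single base address, the metadata is reduced to one 64-bit flags word, and the counters only approximate what "external call" and "address correction" mean in a real build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_META_FLAGS_VALID	0x3ULL	/* assumed set of known flag bits */

struct model_pool {
	uint8_t *addrs;			/* start of the UMEM mapping */
	uint64_t dma_base;		/* simplification: one contiguous DMA range */
	uint32_t tx_metadata_len;	/* 0 when TX metadata is disabled */
	int unaligned;
};

struct model_desc_ctx {
	uint64_t dma;
	void *meta;
};

static unsigned int fixups;	/* address-correction steps evaluated */
static unsigned int ext_calls;	/* calls that would be out of line in the kernel */

static uint64_t model_fix_addr(const struct model_pool *pool, uint64_t addr)
{
	fixups++;
	if (!pool->unaligned)
		return addr;
	return (addr & ((1ULL << 48) - 1)) + (addr >> 48);
}

static int model_meta_valid(const void *meta)
{
	uint64_t flags;

	memcpy(&flags, meta, sizeof(flags));
	return !(flags & ~MODEL_META_FLAGS_VALID);
}

/* Stand-in for the external DMA lookup. */
static uint64_t model_get_dma(const struct model_pool *pool, uint64_t addr)
{
	ext_calls++;
	return pool->dma_base + model_fix_addr(pool, addr);
}

/* Stand-in for the external data-pointer lookup. */
static uint8_t *model_get_data(const struct model_pool *pool, uint64_t addr)
{
	ext_calls++;
	return pool->addrs + model_fix_addr(pool, addr);
}

/* Stand-in for the inline metadata getter: this body lands in every caller. */
static inline void *
model_get_metadata(const struct model_pool *pool, uint64_t addr)
{
	uint8_t *meta;

	if (!pool->tx_metadata_len)
		return NULL;	/* early bail-out, no second external call */

	meta = model_get_data(pool, addr) - pool->tx_metadata_len;
	return model_meta_valid(meta) ? meta : NULL;
}

/* Stand-in for the combined helper: one call, one correction. */
static struct model_desc_ctx
model_get_ctx(const struct model_pool *pool, uint64_t addr)
{
	struct model_desc_ctx ret = { 0 };
	uint8_t *meta;

	ext_calls++;
	addr = model_fix_addr(pool, addr);
	ret.dma = pool->dma_base + addr;

	if (!pool->tx_metadata_len)
		return ret;

	meta = pool->addrs + addr - pool->tx_metadata_len;
	if (model_meta_valid(meta))
		ret.meta = meta;

	return ret;
}
```

With metadata enabled, the model_get_dma() + model_get_metadata() pair bumps both counters twice per frame, while model_get_ctx() bumps each once and returns the same DMA address and pointer. With tx_metadata_len set to zero the pair also costs only one call and one correction, but the presence and validity checks are still compiled into the caller, which is the part the object-size figure above refers to.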
Patch

diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 6aae95b83645..324a4bb04431 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -205,6 +205,23 @@  static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 	return xp_raw_get_data(pool, addr);
 }
 
+/**
+ * xsk_buff_raw_get_ctx - get &xdp_desc context
+ * @pool: XSk buff pool desc address belongs to
+ * @addr: desc address (from userspace)
+ *
+ * Wrapper for xp_raw_get_ctx() to be used in drivers, see its kdoc for
+ * details.
+ *
+ * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata
+ * pointer, if it is present and valid (initialized to %NULL otherwise).
+ */
+static inline struct xdp_desc_ctx
+xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
+{
+	return xp_raw_get_ctx(pool, addr);
+}
+
 #define XDP_TXMD_FLAGS_VALID ( \
 		XDP_TXMD_FLAGS_TIMESTAMP | \
 		XDP_TXMD_FLAGS_CHECKSUM | \
@@ -402,6 +419,12 @@  static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 	return NULL;
 }
 
+static inline struct xdp_desc_ctx
+xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
+{
+	return (struct xdp_desc_ctx){ };
+}
+
 static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta)
 {
 	return false;
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 3832997cc605..6c540696a299 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -141,6 +141,14 @@  u32 xp_alloc_batch(struct xsk_buff_pool *pool, struct xdp_buff **xdp, u32 max);
 bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count);
 void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr);
 dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr);
+
+struct xdp_desc_ctx {
+	dma_addr_t dma;
+	struct xsk_tx_metadata *meta;
+};
+
+struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr);
+
 static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb)
 {
 	return xskb->dma;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index ae71da7d2cd6..02c42caec9f4 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -715,3 +715,43 @@  dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr)
 		(addr & ~PAGE_MASK);
 }
 EXPORT_SYMBOL(xp_raw_get_dma);
+
+/**
+ * xp_raw_get_ctx - get &xdp_desc context
+ * @pool: XSk buff pool desc address belongs to
+ * @addr: desc address (from userspace)
+ *
+ * Helper for getting desc's DMA address and metadata pointer, if present.
+ * Saves one call on hotpath, double calculation of the actual address,
+ * and inline checks for metadata presence and sanity.
+ * Please use xsk_buff_raw_get_ctx() in drivers instead.
+ *
+ * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata
+ * pointer, if it is present and valid (initialized to %NULL otherwise).
+ */
+struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr)
+{
+	struct xsk_tx_metadata *meta;
+	struct xdp_desc_ctx ret;
+
+	addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
+	ret = (typeof(ret)){
+		/* Same logic as in xp_raw_get_dma() */
+		.dma	= (pool->dma_pages[addr >> PAGE_SHIFT] &
+			   ~XSK_NEXT_PG_CONTIG_MASK) + (addr & ~PAGE_MASK),
+	};
+
+	if (!pool->tx_metadata_len)
+		goto out;
+
+	/* Same logic as in xp_raw_get_data() + xsk_buff_get_metadata() */
+	meta = pool->addrs + addr - pool->tx_metadata_len;
+	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
+		goto out;
+
+	ret.meta = meta;
+
+out:
+	return ret;
+}
+EXPORT_SYMBOL(xp_raw_get_ctx);