Message ID | 20221123144641.339138-2-toke@redhat.com (mailing list archive) |
---|---|
State | Deferred |
Delegated to: | BPF |
Series | [bpf-next,1/2] xdp: Add drv_priv pointer to struct xdp_buff |
On 23 Nov 15:46, Toke Høiland-Jørgensen wrote:
>Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
>pointer to the mlx5e_skb_from* functions so it can be retrieved from the
>XDP ctx to do this.
>
>Cc: John Fastabend <john.fastabend@gmail.com>
>Cc: David Ahern <dsahern@gmail.com>
>Cc: Martin KaFai Lau <martin.lau@linux.dev>
>Cc: Jakub Kicinski <kuba@kernel.org>
>Cc: Willem de Bruijn <willemb@google.com>
>Cc: Jesper Dangaard Brouer <brouer@redhat.com>
>Cc: Anatoly Burakov <anatoly.burakov@intel.com>
>Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
>Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
>Cc: Maryam Tahhan <mtahhan@redhat.com>
>Cc: Stanislav Fomichev <sdf@google.com>
>Cc: xdp-hints@xdp-project.net
>Cc: netdev@vger.kernel.org
>Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>---
>This goes on top of Stanislav's series, obviously. Verified that it works using
>the xdp_hw_metadata utility; going to do some benchmarking and follow up with the
>results, but figured I'd send this out straight away in case others wanted to
>play with it.
>
>Stanislav, feel free to fold it into the next version of your series if you
>want!
>

[...]

> #endif /* __MLX5_EN_XSK_RX_H__ */
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>index 14bd86e368d5..015bfe891458 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>@@ -4890,6 +4890,10 @@ const struct net_device_ops mlx5e_netdev_ops = {
> 	.ndo_tx_timeout = mlx5e_tx_timeout,
> 	.ndo_bpf = mlx5e_xdp,
> 	.ndo_xdp_xmit = mlx5e_xdp_xmit,
>+	.ndo_xdp_rx_timestamp_supported = mlx5e_xdp_rx_timestamp_supported,
>+	.ndo_xdp_rx_timestamp = mlx5e_xdp_rx_timestamp,
>+	.ndo_xdp_rx_hash_supported = mlx5e_xdp_rx_hash_supported,
>+	.ndo_xdp_rx_hash = mlx5e_xdp_rx_hash,

I hope I am not late to the party, but I already expressed my feelings
about using kfuncs for XDP hints at LPC and netdevconf.

I think it's wrong to use indirect calls, and for many use cases the
overhead will be higher than just calculating the metadata on the spot.

So you will need two indirect calls per packet per hint.
Some would argue that on some systems calculating the hash would be much faster,
and one major reason to have the hints is to accelerate XDP edge and
security programs with the hardware-provided hints.

What happened to just asking the driver to place the data in a specific
location in the headroom?
On Wed, Nov 23, 2022 at 2:29 PM Saeed Mahameed <saeed@kernel.org> wrote:
>
> On 23 Nov 15:46, Toke Høiland-Jørgensen wrote:
> >Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
> >pointer to the mlx5e_skb_from* functions so it can be retrieved from the
> >XDP ctx to do this.
> >
> >Cc: John Fastabend <john.fastabend@gmail.com>
> >Cc: David Ahern <dsahern@gmail.com>
> >Cc: Martin KaFai Lau <martin.lau@linux.dev>
> >Cc: Jakub Kicinski <kuba@kernel.org>
> >Cc: Willem de Bruijn <willemb@google.com>
> >Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> >Cc: Anatoly Burakov <anatoly.burakov@intel.com>
> >Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
> >Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
> >Cc: Maryam Tahhan <mtahhan@redhat.com>
> >Cc: Stanislav Fomichev <sdf@google.com>
> >Cc: xdp-hints@xdp-project.net
> >Cc: netdev@vger.kernel.org
> >Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >---
> >This goes on top of Stanislav's series, obviously. Verified that it works using
> >the xdp_hw_metadata utility; going to do some benchmarking and follow up with the
> >results, but figured I'd send this out straight away in case others wanted to
> >play with it.
> >
> >Stanislav, feel free to fold it into the next version of your series if you
> >want!
> >
>
> [...]
>
> > #endif /* __MLX5_EN_XSK_RX_H__ */
> >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >index 14bd86e368d5..015bfe891458 100644
> >--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >@@ -4890,6 +4890,10 @@ const struct net_device_ops mlx5e_netdev_ops = {
> > 	.ndo_tx_timeout = mlx5e_tx_timeout,
> > 	.ndo_bpf = mlx5e_xdp,
> > 	.ndo_xdp_xmit = mlx5e_xdp_xmit,
> >+	.ndo_xdp_rx_timestamp_supported = mlx5e_xdp_rx_timestamp_supported,
> >+	.ndo_xdp_rx_timestamp = mlx5e_xdp_rx_timestamp,
> >+	.ndo_xdp_rx_hash_supported = mlx5e_xdp_rx_hash_supported,
> >+	.ndo_xdp_rx_hash = mlx5e_xdp_rx_hash,
>
> I hope I am not late to the party, but I already expressed my feelings
> about using kfuncs for XDP hints at LPC and netdevconf.
>
> I think it's wrong to use indirect calls, and for many use cases the
> overhead will be higher than just calculating the metadata on the spot.
>
> So you will need two indirect calls per packet per hint.
> Some would argue that on some systems calculating the hash would be much faster,
> and one major reason to have the hints is to accelerate XDP edge and
> security programs with the hardware-provided hints.
>
> What happened to just asking the driver to place the data in a specific
> location in the headroom?

Take a look at [0], we are resolving indirect calls. We can also
always go back to unrolling those calls as was done initially in [1].

0: https://lore.kernel.org/bpf/20221121182552.2152891-3-sdf@google.com/
1: https://lore.kernel.org/bpf/20221115030210.3159213-4-sdf@google.com/

The kfunc approach seems more flexible than an all-or-nothing approach
with the driver pre-filling all metadata.
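To make the two designs under discussion concrete, here is a minimal userspace sketch, not kernel code. Every name in it (`model_desc`, `model_md_ops`, `model_prefill`, and so on) is hypothetical and invented for illustration; the ops table only loosely mirrors the `ndo_xdp_rx_*` callbacks quoted above, and the prefilled layout is an assumed example of "a specific location in the headroom", not a layout proposed anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-packet descriptor: what the (model) hardware reports. */
struct model_desc {
	uint32_t hash;
	uint64_t timestamp_ns;
};

/*
 * Approach 1: on-demand accessors reached through an ops table.
 * A consumer pays one call for the "supported" check and one for the
 * value, but only for the hints it actually asks for.
 */
struct model_md_ops {
	bool (*rx_hash_supported)(const struct model_desc *d);
	uint32_t (*rx_hash)(const struct model_desc *d);
};

static bool model_hash_supported(const struct model_desc *d)
{
	(void)d; /* this model always reports a hash */
	return true;
}

static uint32_t model_hash(const struct model_desc *d)
{
	return d->hash;
}

static const struct model_md_ops model_ops = {
	.rx_hash_supported = model_hash_supported,
	.rx_hash = model_hash,
};

/*
 * Approach 2: the driver writes every hint into a fixed layout placed
 * directly in front of the packet data, whether or not it is consumed.
 */
struct model_prefilled_md {
	uint32_t hash;
	uint64_t timestamp_ns;
};

static void *model_prefill(uint8_t *data, size_t headroom,
			   const struct model_desc *d)
{
	struct model_prefilled_md md = {
		.hash = d->hash,
		.timestamp_ns = d->timestamp_ns,
	};
	uint8_t *meta;

	/* The caller must have reserved enough headroom for the layout. */
	assert(headroom >= sizeof(md));
	meta = data - sizeof(md);
	memcpy(meta, &md, sizeof(md));
	return meta;
}
```

The sketch says nothing about relative cost; it only shows that the first shape lets a program pick individual hints, while the second fixes the set of hints and their layout up front.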
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index ff5b302531d5..960404027f0b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -629,7 +629,7 @@ typedef struct sk_buff *
 				   u16 cqe_bcnt, u32 head_offset, u32 page_idx);
 typedef struct sk_buff *
 (*mlx5e_fp_skb_from_cqe)(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
-			 u32 cqe_bcnt);
+			 struct mlx5_cqe64 *cqe, u32 cqe_bcnt);
 typedef bool (*mlx5e_fp_post_rx_wqes)(struct mlx5e_rq *rq);
 typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16);
 typedef void (*mlx5e_fp_shampo_dealloc_hd)(struct mlx5e_rq*, u16, u16, bool);
@@ -1035,6 +1035,11 @@ int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
 			   u16 vid);
 void mlx5e_timestamp_init(struct mlx5e_priv *priv);
 
+static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config)
+{
+	return config->rx_filter == HWTSTAMP_FILTER_ALL;
+}
+
 struct mlx5e_xsk_param;
 struct mlx5e_rq_param;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 20507ef2f956..604c8cdfde02 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -156,6 +156,38 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 	return true;
 }
 
+bool mlx5e_xdp_rx_timestamp_supported(const struct xdp_md *ctx)
+{
+	const struct xdp_buff *xdp = (void *)ctx;
+	struct mlx5_xdp_ctx *mctx = xdp->drv_priv;
+
+	return mlx5e_rx_hw_stamp(mctx->rq->tstamp);
+}
+
+u64 mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx)
+{
+	const struct xdp_buff *xdp = (void *)ctx;
+	struct mlx5_xdp_ctx *mctx = xdp->drv_priv;
+
+	return mlx5e_cqe_ts_to_ns(mctx->rq->ptp_cyc2time,
+				  mctx->rq->clock, get_cqe_ts(mctx->cqe));
+}
+
+bool mlx5e_xdp_rx_hash_supported(const struct xdp_md *ctx)
+{
+	const struct xdp_buff *xdp = (void *)ctx;
+
+	return xdp->rxq->dev->features & NETIF_F_RXHASH;
+}
+
+u32 mlx5e_xdp_rx_hash(const struct xdp_md *ctx)
+{
+	const struct xdp_buff *xdp = (void *)ctx;
+	struct mlx5_xdp_ctx *mctx = xdp->drv_priv;
+
+	return be32_to_cpu(mctx->cqe->rss_hash_result);
+}
+
 /* returns true if packet was consumed by xdp */
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
 		      struct bpf_prog *prog, struct xdp_buff *xdp)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index bc2d9034af5b..07d80d0446ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -44,6 +44,11 @@
 	(MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \
 	 sizeof(struct mlx5_wqe_inline_seg))
 
+struct mlx5_xdp_ctx {
+	struct mlx5_cqe64 *cqe;
+	struct mlx5e_rq *rq;
+};
+
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
@@ -56,6 +61,11 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq);
 int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
 		   u32 flags);
 
+bool mlx5e_xdp_rx_hash_supported(const struct xdp_md *ctx);
+u32 mlx5e_xdp_rx_hash(const struct xdp_md *ctx);
+bool mlx5e_xdp_rx_timestamp_supported(const struct xdp_md *ctx);
+u64 mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx);
+
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 							  struct mlx5e_xmit_data *xdptxd,
 							  struct skb_shared_info *sinfo,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index c91b54d9ff27..c6715cb23d45 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -283,8 +283,10 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 
 struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 					      struct mlx5e_wqe_frag_info *wi,
+					      struct mlx5_cqe64 *cqe,
 					      u32 cqe_bcnt)
 {
+	struct mlx5_xdp_ctx mlctx = { .cqe = cqe, .rq = rq };
 	struct xdp_buff *xdp = wi->au->xsk;
 	struct bpf_prog *prog;
 
@@ -298,6 +300,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 	xsk_buff_set_size(xdp, cqe_bcnt);
 	xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool);
 	net_prefetch(xdp->data);
+	xdp->drv_priv = &mlctx;
 
 	prog = rcu_dereference(rq->xdp_prog);
 	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, xdp)))
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
index 087c943bd8e9..9198f137f48f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
@@ -18,6 +18,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 						    u32 page_idx);
 struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 					      struct mlx5e_wqe_frag_info *wi,
-					      u32 cqe_bcnt);
+					      struct mlx5_cqe64 *cqe,
+					      u32 cqe_bcnt);
 
 #endif /* __MLX5_EN_XSK_RX_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 14bd86e368d5..015bfe891458 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4890,6 +4890,10 @@ const struct net_device_ops mlx5e_netdev_ops = {
 	.ndo_tx_timeout = mlx5e_tx_timeout,
 	.ndo_bpf = mlx5e_xdp,
 	.ndo_xdp_xmit = mlx5e_xdp_xmit,
+	.ndo_xdp_rx_timestamp_supported = mlx5e_xdp_rx_timestamp_supported,
+	.ndo_xdp_rx_timestamp = mlx5e_xdp_rx_timestamp,
+	.ndo_xdp_rx_hash_supported = mlx5e_xdp_rx_hash_supported,
+	.ndo_xdp_rx_hash = mlx5e_xdp_rx_hash,
 	.ndo_xsk_wakeup = mlx5e_xsk_wakeup,
 #ifdef CONFIG_MLX5_EN_ARFS
 	.ndo_rx_flow_steer = mlx5e_rx_flow_steer,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b1ea0b995d9c..1d6600441e74 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -76,11 +76,6 @@ const struct mlx5e_rx_handlers mlx5e_rx_handlers_nic = {
 	.handle_rx_cqe_mpwqe_shampo = mlx5e_handle_rx_cqe_mpwrq_shampo,
 };
 
-static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config)
-{
-	return config->rx_filter == HWTSTAMP_FILTER_ALL;
-}
-
 static inline void mlx5e_read_cqe_slot(struct mlx5_cqwq *wq,
 				       u32 cqcc, void *data)
 {
@@ -1573,7 +1568,7 @@ static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, void *va, u16 headroom,
 
 static struct sk_buff *
 mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
-			  u32 cqe_bcnt)
+			  struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
 {
 	union mlx5e_alloc_unit *au = wi->au;
 	u16 rx_headroom = rq->buff.headroom;
@@ -1595,7 +1590,8 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
 
 	prog = rcu_dereference(rq->xdp_prog);
 	if (prog) {
-		struct xdp_buff xdp;
+		struct mlx5_xdp_ctx mlctx = { .cqe = cqe, .rq = rq };
+		struct xdp_buff xdp = { .drv_priv = &mlctx };
 
 		net_prefetchw(va); /* xdp_frame data area */
 		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
@@ -1619,16 +1615,17 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
 
 static struct sk_buff *
 mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
-			     u32 cqe_bcnt)
+			     struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
 {
 	struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0];
+	struct mlx5_xdp_ctx mlctx = { .cqe = cqe, .rq = rq };
+	struct xdp_buff xdp = { .drv_priv = &mlctx };
 	struct mlx5e_wqe_frag_info *head_wi = wi;
 	union mlx5e_alloc_unit *au = wi->au;
 	u16 rx_headroom = rq->buff.headroom;
 	struct skb_shared_info *sinfo;
 	u32 frag_consumed_bytes;
 	struct bpf_prog *prog;
-	struct xdp_buff xdp;
 	struct sk_buff *skb;
 	dma_addr_t addr;
 	u32 truesize;
@@ -1766,7 +1763,7 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 			      mlx5e_skb_from_cqe_linear,
 			      mlx5e_skb_from_cqe_nonlinear,
 			      mlx5e_xsk_skb_from_cqe_linear,
-			      rq, wi, cqe_bcnt);
+			      rq, wi, cqe, cqe_bcnt);
 	if (!skb) {
 		/* probably for XDP */
 		if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
@@ -2575,7 +2572,7 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe
 		goto free_wqe;
 	}
 
-	skb = mlx5e_skb_from_cqe_nonlinear(rq, wi, cqe_bcnt);
+	skb = mlx5e_skb_from_cqe_nonlinear(rq, wi, cqe, cqe_bcnt);
 	if (!skb)
 		goto free_wqe;
 
Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
pointer to the mlx5e_skb_from* functions so it can be retrieved from the
XDP ctx to do this.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
This goes on top of Stanislav's series, obviously. Verified that it works using
the xdp_hw_metadata utility; going to do some benchmarking and follow up with the
results, but figured I'd send this out straight away in case others wanted to
play with it.

Stanislav, feel free to fold it into the next version of your series if you
want!

-Toke

 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  7 +++-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 32 +++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  | 10 ++++++
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  3 ++
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.h   |  3 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  4 +++
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 19 +++++------
 7 files changed, 65 insertions(+), 13 deletions(-)
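
For readers who want to see the drv_priv idea in isolation, the following is a rough userspace model of it, not the driver code. All `model_*` names are hypothetical stand-ins for `struct mlx5_cqe64`, `struct mlx5e_rq`, `struct mlx5_xdp_ctx` and `struct xdp_buff`, reduced to the fields needed here. Two simplifications are assumptions of the sketch: the "supported" flag lives on the model queue (the patch checks `NETIF_F_RXHASH` on the netdev instead), and `model_load_be32()` stands in for `be32_to_cpu()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion entry: the hash is stored big-endian. */
struct model_cqe {
	uint8_t rss_hash_result[4];
};

/* Hypothetical receive queue, reduced to one flag for this sketch. */
struct model_rq {
	bool rx_hash_enabled;
};

/* Per-packet driver context, analogous in spirit to struct mlx5_xdp_ctx. */
struct model_xdp_ctx {
	struct model_cqe *cqe;
	struct model_rq *rq;
};

/* Generic buffer with an opaque driver-private pointer. */
struct model_xdp_buff {
	void *data;
	void *drv_priv;
};

/* Portable big-endian load; userspace stand-in for be32_to_cpu(). */
static uint32_t model_load_be32(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

static bool model_rx_hash_supported(const struct model_xdp_buff *xdp)
{
	const struct model_xdp_ctx *mctx = xdp->drv_priv;

	/* The RX path must set drv_priv before the program runs. */
	assert(mctx != NULL);
	return mctx->rq->rx_hash_enabled;
}

static uint32_t model_rx_hash(const struct model_xdp_buff *xdp)
{
	const struct model_xdp_ctx *mctx = xdp->drv_priv;

	assert(mctx != NULL);
	return model_load_be32(mctx->cqe->rss_hash_result);
}
```

In this model the RX path would build a `model_xdp_ctx` on its stack for each packet and point `drv_priv` at it before invoking the program, which is the same lifetime the patch relies on for `mlctx`.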