Message ID | 20250123150118.583039-3-maciej.fijalkowski@intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | ice: fix Rx data path for heavy 9k MTU traffic |
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Thursday, January 23, 2025 8:31 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>;
>netdev@vger.kernel.org; xudu@redhat.com; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Simon Horman <horms@kernel.org>; Kitszel,
>Przemyslaw <przemyslaw.kitszel@intel.com>; Keller, Jacob E
><jacob.e.keller@intel.com>; Maxwell, Jon <jmaxwell@redhat.com>; Karlsson,
>Magnus <magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v5 iwl-net 2/3] ice: gather page_count()'s of
>each frag right before XDP prog call
>
>If we store the pgcnt on few fragments while being in the middle of gathering
>the whole frame and we stumbled upon DD bit not being set, we terminate the
>NAPI Rx processing loop and come back later on. Then on next NAPI execution
>we work on previously stored pgcnt.
>
>Imagine that second half of page was used actively by networking stack and by
>the time we came back, stack is not busy with this page anymore and
>decremented the refcnt. The page reuse algorithm in this case should be good
>to reuse the page but given the old refcnt it will not do so and attempt to
>release the page via page_frag_cache_drain() with pagecnt_bias used as an arg.
>This in turn will result in negative refcnt on struct page, which was initially
>observed by Xu Du.
>
>Therefore, move the page count storage from ice_get_rx_buf() to a place where
>we are sure that whole frame has been collected, but before calling XDP
>program as it internally can also change the page count of fragments belonging
>to xdp_buff.
>
>Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf")
>Reported-and-tested-by: Xu Du <xudu@redhat.com>
>Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>Reviewed-by: Simon Horman <horms@kernel.org>
>Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
>Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_txrx.c | 27 ++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
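
To make the stale-snapshot problem described in the commit message concrete, here is a minimal userspace sketch. It is not driver code: `model_page`, `model_rx_buf`, `model_snapshot()` and `model_can_reuse()` are hypothetical names, and the reuse rule (snapshot minus bias must not exceed 1) is an assumption loosely modeled on the kind of check the commit message refers to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count matters. */
struct model_page {
	int refcount;
};

/* Hypothetical stand-in for an Rx buffer that caches a refcount snapshot. */
struct model_rx_buf {
	struct model_page *page;
	int pgcnt;        /* snapshot of the page refcount */
	int pagecnt_bias; /* references still owned by the driver */
};

/* Record the current refcount, in the spirit of
 * rx_buf->pgcnt = page_count(rx_buf->page).
 */
static void model_snapshot(struct model_rx_buf *buf)
{
	assert(buf && buf->page);
	buf->pgcnt = buf->page->refcount;
}

/* Assumed reuse rule: the page is reusable only if, according to the
 * snapshot, at most one reference exists beyond the driver's bias.
 */
static bool model_can_reuse(const struct model_rx_buf *buf)
{
	return buf->pgcnt - buf->pagecnt_bias <= 1;
}
```

In this model, a snapshot taken while the stack still holds a reference says "do not reuse" even after the stack has dropped that reference; only a snapshot taken once the whole frame is gathered reflects the real state of the page.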
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index e173d9c98988..cf46bcf143b4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	struct ice_rx_buf *rx_buf;
 
 	rx_buf = &rx_ring->rx_buf[ntc];
-	rx_buf->pgcnt = page_count(rx_buf->page);
 	prefetchw(rx_buf->page);
 
 	if (!size)
@@ -940,6 +939,31 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	return rx_buf;
 }
 
+/**
+ * ice_get_pgcnts - grab page_count() for gathered fragments
+ * @rx_ring: Rx descriptor ring to store the page counts on
+ *
+ * This function is intended to be called right before running XDP
+ * program so that the page recycling mechanism will be able to take
+ * a correct decision regarding underlying pages; this is done in such
+ * way as XDP program can change the refcount of page
+ */
+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	struct ice_rx_buf *rx_buf;
+	u32 cnt = rx_ring->count;
+
+	for (int i = 0; i < nr_frags; i++) {
+		rx_buf = &rx_ring->rx_buf[idx];
+		rx_buf->pgcnt = page_count(rx_buf->page);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+}
+
 /**
  * ice_build_skb - Build skb around an existing buffer
  * @rx_ring: Rx descriptor ring to transact packets on
@@ -1241,6 +1265,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
+		ice_get_pgcnts(rx_ring);
 		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
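
The ring walk in `ice_get_pgcnts()` can be tried out in isolation with a small userspace model. This is only a sketch under stated assumptions: `model_ring` and `model_get_pgcnts()` are hypothetical names, the `refcounts` array stands in for calling `page_count()` on each buffer's page, and the return value is added purely so the wrap-around is easy to check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the Rx ring fields used by the walk. */
struct model_ring {
	uint32_t count;       /* number of slots in the ring */
	uint32_t first_desc;  /* slot holding the first buffer of the frame */
	uint32_t nr_frags;    /* fragments following the first buffer */
	const int *refcounts; /* current page refcount per slot (models page_count()) */
	int *pgcnt;           /* per-slot snapshot written by the walk */
};

/* Snapshot the refcount of every buffer belonging to the gathered frame:
 * nr_frags + 1 slots, starting at first_desc and wrapping at count.
 * Returns the index of the slot right after the frame.
 */
static uint32_t model_get_pgcnts(struct model_ring *ring)
{
	uint32_t nr_bufs = ring->nr_frags + 1;
	uint32_t idx = ring->first_desc;

	assert(ring->count > 0 && idx < ring->count);

	for (uint32_t i = 0; i < nr_bufs; i++) {
		ring->pgcnt[idx] = ring->refcounts[idx];

		/* Wrap to the start of the ring after the last slot. */
		if (++idx == ring->count)
			idx = 0;
	}

	return idx;
}
```

With a four-slot ring, `first_desc = 3` and `nr_frags = 2`, the walk touches slots 3, 0 and 1 and leaves slot 2 alone, which mirrors how a frame that straddles the end of the descriptor ring is handled.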