diff mbox series

[intel-net] i40e: fix receiving of single packets in xsk zero-copy mode

Message ID 20210319094410.3633-1-magnus.karlsson@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [intel-net] i40e: fix receiving of single packets in xsk zero-copy mode | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers fail 2 blamed authors not CCed: daniel@iogearbox.net john.fastabend@gmail.com; 5 maintainers not CCed: daniel@iogearbox.net john.fastabend@gmail.com jesse.brandeburg@intel.com davem@davemloft.net kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link

Commit Message

Magnus Karlsson March 19, 2021, 9:44 a.m. UTC
From: Magnus Karlsson <magnus.karlsson@intel.com>

Fix so that single packets are received immediately instead of in
batches of 8. If you sent 1 pss to a system, you received 8 packets
every 8 seconds instead of 1 packet every second. The problem behind
this was that the work_done reporting from the Tx part of the driver
was broken. The work_done reporting in i40e controls not only the
reporting back to the napi logic but also the setting of the interrupt
throttling logic. When Tx or Rx reports that it has more to do,
interrupts are throttled or coalesced and when they both report that
they are done, interrupts are armed right away. If the wrong work_done
value is returned, the logic will start to throttle interrupts in a
situation where it should have just enabled them. This leads to the
undesired batching behavior seen in user-space.

Fix this by returning the correct boolean value from the Tx xsk
zero-copy path. Return true if there is nothing to do or if we got
fewer packets to process than we asked for. Return false if we got as
many packets as the budget since there might be more packets we can
process.

Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance")
Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: c79a707072fe3fea0e3c92edee6ca85c1e53c29f

Comments

Fijalkowski, Maciej March 19, 2021, 11 a.m. UTC | #1
On Fri, Mar 19, 2021 at 10:44:10AM +0100, Magnus Karlsson wrote:
> From: Magnus Karlsson <magnus.karlsson@intel.com>
> 
> Fix so that single packets are received immediately instead of in
> batches of 8. If you sent 1 pss to a system, you received 8 packets

pps?

> every 8 seconds instead of 1 packet every second. The problem behind
> this was that the work_done reporting from the Tx part of the driver
> was broken. The work_done reporting in i40e controls not only the
> reporting back to the napi logic but also the setting of the interrupt
> throttling logic. When Tx or Rx reports that it has more to do,
> interrupts are throttled or coalesced and when they both report that
> they are done, interrupts are armed right away. If the wrong work_done
> value is returned, the logic will start to throttle interrupts in a
> situation where it should have just enabled them. This leads to the
> undesired batching behavior seen in user-space.
> 
> Fix this by returning the correct boolean value from the Tx xsk
> zero-copy path. Return true if there is nothing to do or if we got
> fewer packets to process than we asked for. Return false if we got as
> many packets as the budget since there might be more packets we can
> process.
> 
> Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance")
> Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index fc32c5019b0f..12ca84113587 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -471,7 +471,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>  
>  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget);
>  	if (!nb_pkts)
> -		return false;
> +		return true;
>  
>  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
>  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> @@ -488,7 +488,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>  
>  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
>  
> -	return true;
> +	return nb_pkts < budget;
>  }
>  
>  /**
> 
> base-commit: c79a707072fe3fea0e3c92edee6ca85c1e53c29f
> -- 
> 2.29.0
>
Magnus Karlsson March 19, 2021, 12:14 p.m. UTC | #2
On Fri, Mar 19, 2021 at 12:10 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Fri, Mar 19, 2021 at 10:44:10AM +0100, Magnus Karlsson wrote:
> > From: Magnus Karlsson <magnus.karlsson@intel.com>
> >
> > Fix so that single packets are received immediately instead of in
> > batches of 8. If you sent 1 pss to a system, you received 8 packets
>
> pps?

Arghh, yes it should be pps, i.e. packets per second. I wonder what pss is?

> > every 8 seconds instead of 1 packet every second. The problem behind
> > this was that the work_done reporting from the Tx part of the driver
> > was broken. The work_done reporting in i40e controls not only the
> > reporting back to the napi logic but also the setting of the interrupt
> > throttling logic. When Tx or Rx reports that it has more to do,
> > interrupts are throttled or coalesced and when they both report that
> > they are done, interrupts are armed right away. If the wrong work_done
> > value is returned, the logic will start to throttle interrupts in a
> > situation where it should have just enabled them. This leads to the
> > undesired batching behavior seen in user-space.
> >
> > Fix this by returning the correct boolean value from the Tx xsk
> > zero-copy path. Return true if there is nothing to do or if we got
> > fewer packets to process than we asked for. Return false if we got as
> > many packets as the budget since there might be more packets we can
> > process.
> >
> > Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance")
> > Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > index fc32c5019b0f..12ca84113587 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > @@ -471,7 +471,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >
> >       nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget);
> >       if (!nb_pkts)
> > -             return false;
> > +             return true;
> >
> >       if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> >               nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > @@ -488,7 +488,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >
> >       i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> >
> > -     return true;
> > +     return nb_pkts < budget;
> >  }
> >
> >  /**
> >
> > base-commit: c79a707072fe3fea0e3c92edee6ca85c1e53c29f
> > --
> > 2.29.0
> >
Bhandare, KiranX March 31, 2021, 12:57 p.m. UTC | #3
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Magnus Karlsson
> Sent: Friday, March 19, 2021 3:14 PM
> To: Karlsson, Magnus <magnus.karlsson@intel.com>; intel-wired-
> lan@lists.osuosl.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> Fijalkowski, Maciej <maciej.fijalkowski@intel.com>
> Cc: netdev@vger.kernel.org; Joshi, Sreedevi <sreedevi.joshi@intel.com>
> Subject: [Intel-wired-lan] [PATCH intel-net] i40e: fix receiving of single packets
> in xsk zero-copy mode
> 
> From: Magnus Karlsson <magnus.karlsson@intel.com>
> 
> Fix so that single packets are received immediately instead of in batches of 8.
> If you sent 1 pss to a system, you received 8 packets every 8 seconds instead
> of 1 packet every second. The problem behind this was that the work_done
> reporting from the Tx part of the driver was broken. The work_done
> reporting in i40e controls not only the reporting back to the napi logic but
> also the setting of the interrupt throttling logic. When Tx or Rx reports that it
> has more to do, interrupts are throttled or coalesced and when they both
> report that they are done, interrupts are armed right away. If the wrong
> work_done value is returned, the logic will start to throttle interrupts in a
> situation where it should have just enabled them. This leads to the undesired
> batching behavior seen in user-space.
> 
> Fix this by returning the correct boolean value from the Tx xsk zero-copy
> path. Return true if there is nothing to do or if we got fewer packets to
> process than we asked for. Return false if we got as many packets as the
> budget since there might be more packets we can process.
> 
> Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase
> performance")
> Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index fc32c5019b0f..12ca84113587 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -471,7 +471,7 @@  static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 
 	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget);
 	if (!nb_pkts)
-		return false;
+		return true;
 
 	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
 		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
@@ -488,7 +488,7 @@  static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 
 	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
 
-	return true;
+	return nb_pkts < budget;
 }
 
 /**