Message ID | 20210319094410.3633-1-magnus.karlsson@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [intel-net] i40e: fix receiving of single packets in xsk zero-copy mode | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 2 blamed authors not CCed: daniel@iogearbox.net john.fastabend@gmail.com; 5 maintainers not CCed: daniel@iogearbox.net john.fastabend@gmail.com jesse.brandeburg@intel.com davem@davemloft.net kuba@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 1 this patch: 1 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 16 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | Link |
On Fri, Mar 19, 2021 at 10:44:10AM +0100, Magnus Karlsson wrote: > From: Magnus Karlsson <magnus.karlsson@intel.com> > > Fix so that single packets are received immediately instead of in > batches of 8. If you sent 1 pss to a system, you received 8 packets pps? > every 8 seconds instead of 1 packet every second. The problem behind > this was that the work_done reporting from the Tx part of the driver > was broken. The work_done reporting in i40e controls not only the > reporting back to the napi logic but also the setting of the interrupt > throttling logic. When Tx or Rx reports that it has more to do, > interrupts are throttled or coalesced and when they both report that > they are done, interrupts are armed right away. If the wrong work_done > value is returned, the logic will start to throttle interrupts in a > situation where it should have just enabled them. This leads to the > undesired batching behavior seen in user-space. > > Fix this by returning the correct boolean value from the Tx xsk > zero-copy path. Return true if there is nothing to do or if we got > fewer packets to process than we asked for. Return false if we got as > many packets as the budget since there might be more packets we can > process. > > Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") > Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > --- > drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > index fc32c5019b0f..12ca84113587 100644 > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > @@ -471,7 +471,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) > > nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget); > if (!nb_pkts) > - return false; > + return true; > > if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { > nb_processed = xdp_ring->count - xdp_ring->next_to_use; > @@ -488,7 +488,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) > > i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes); > > - return true; > + return nb_pkts < budget; > } > > /** > > base-commit: c79a707072fe3fea0e3c92edee6ca85c1e53c29f > -- > 2.29.0 >
On Fri, Mar 19, 2021 at 12:10 PM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote: > > On Fri, Mar 19, 2021 at 10:44:10AM +0100, Magnus Karlsson wrote: > > From: Magnus Karlsson <magnus.karlsson@intel.com> > > > > Fix so that single packets are received immediately instead of in > > batches of 8. If you sent 1 pss to a system, you received 8 packets > > pps? Arghh, yes it should be pps, i.e. packets per second. I wonder what pss is? > > every 8 seconds instead of 1 packet every second. The problem behind > > this was that the work_done reporting from the Tx part of the driver > > was broken. The work_done reporting in i40e controls not only the > > reporting back to the napi logic but also the setting of the interrupt > > throttling logic. When Tx or Rx reports that it has more to do, > > interrupts are throttled or coalesced and when they both report that > > they are done, interrupts are armed right away. If the wrong work_done > > value is returned, the logic will start to throttle interrupts in a > > situation where it should have just enabled them. This leads to the > > undesired batching behavior seen in user-space. > > > > Fix this by returning the correct boolean value from the Tx xsk > > zero-copy path. Return true if there is nothing to do or if we got > > fewer packets to process than we asked for. Return false if we got as > > many packets as the budget since there might be more packets we can > > process. > > > > Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") > > Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> > > Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > --- > > drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > index fc32c5019b0f..12ca84113587 100644 > > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > @@ -471,7 +471,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) > > > > nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget); > > if (!nb_pkts) > > - return false; > > + return true; > > > > if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { > > nb_processed = xdp_ring->count - xdp_ring->next_to_use; > > @@ -488,7 +488,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) > > > > i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes); > > > > - return true; > > + return nb_pkts < budget; > > } > > > > /** > > > > base-commit: c79a707072fe3fea0e3c92edee6ca85c1e53c29f > > -- > > 2.29.0 > >
> -----Original Message----- > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of > Magnus Karlsson > Sent: Friday, March 19, 2021 3:14 PM > To: Karlsson, Magnus <magnus.karlsson@intel.com>; intel-wired- > lan@lists.osuosl.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; > Fijalkowski, Maciej <maciej.fijalkowski@intel.com> > Cc: netdev@vger.kernel.org; Joshi, Sreedevi <sreedevi.joshi@intel.com> > Subject: [Intel-wired-lan] [PATCH intel-net] i40e: fix receiving of single packets > in xsk zero-copy mode > > From: Magnus Karlsson <magnus.karlsson@intel.com> > > Fix so that single packets are received immediately instead of in batches of 8. > If you sent 1 pss to a system, you received 8 packets every 8 seconds instead > of 1 packet every second. The problem behind this was that the work_done > reporting from the Tx part of the driver was broken. The work_done > reporting in i40e controls not only the reporting back to the napi logic but > also the setting of the interrupt throttling logic. When Tx or Rx reports that it > has more to do, interrupts are throttled or coalesced and when they both > report that they are done, interrupts are armed right away. If the wrong > work_done value is returned, the logic will start to throttle interrupts in a > situation where it should have just enabled them. This leads to the undesired > batching behavior seen in user-space. > > Fix this by returning the correct boolean value from the Tx xsk zero-copy > path. Return true if there is nothing to do or if we got fewer packets to > process than we asked for. Return false if we got as many packets as the > budget since there might be more packets we can process. > > Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase > performance") > Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> > --- > drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> A Contingent Worker at Intel
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index fc32c5019b0f..12ca84113587 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -471,7 +471,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, descs, budget); if (!nb_pkts) - return false; + return true; if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { nb_processed = xdp_ring->count - xdp_ring->next_to_use; @@ -488,7 +488,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes); - return true; + return nb_pkts < budget; } /**