[net-next,v3,2/2] igc: Link queues to NAPI instances

Message ID	20241018171343.314835-3-jdamato@fastly.com (mailing list archive)
State	Superseded
Delegated to:	Netdev Maintainers
Headers	show Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD7E7204030 for <bpf@vger.kernel.org>; Fri, 18 Oct 2024 17:14:09 +0000 (UTC) From: Joe Damato <jdamato@fastly.com> To: netdev@vger.kernel.org Cc: kurt@linutronix.de, vinicius.gomes@intel.com, Joe Damato <jdamato@fastly.com>, Tony Nguyen <anthony.l.nguyen@intel.com>, Przemek Kitszel <przemyslaw.kitszel@intel.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Jesper Dangaard Brouer <hawk@kernel.org>, John Fastabend <john.fastabend@gmail.com>, intel-wired-lan@lists.osuosl.org (moderated list:INTEL ETHERNET DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [net-next v3 2/2] igc: Link queues to NAPI instances Date: Fri, 18 Oct 2024 17:13:43 +0000 Message-Id: <20241018171343.314835-3-jdamato@fastly.com> In-Reply-To: <20241018171343.314835-1-jdamato@fastly.com> References: <20241018171343.314835-1-jdamato@fastly.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	igc: Link IRQs and queues to NAPIs \| expand [net-next,v3,0/2] igc: Link IRQs and queues to NAPIs [net-next,v3,1/2] igc: Link IRQs to NAPI instances [net-next,v3,2/2] igc: Link queues to NAPI instances

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 5 this patch: 5
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	success	CCed 12 of 12 maintainers
netdev/build_clang	success	Errors and warnings before: 5 this patch: 5
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 4 this patch: 4
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 91 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 29 this patch: 29
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-10-19--00-00 (tests: 777)

Joe Damato Oct. 18, 2024, 5:13 p.m. UTC

Link queues to NAPI instances via netdev-genl API so that users can
query this information with netlink. Handle a few cases in the driver:
  1. Link/unlink the NAPIs when XDP is enabled/disabled
  2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled

Example output when IGC_FLAG_QUEUE_PAIRS is enabled:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'

[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]

Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
is present for both rx and tx queues at the same index, for example
index 0:

{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},

To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
the grub command line option "maxcpus=2" to force
igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.

Example output when IGC_FLAG_QUEUE_PAIRS is disabled:

$ lscpu | grep "On-line CPU"
On-line CPU(s) list:      0,2

$ ethtool -l enp86s0  | tail -5
Current hardware settings:
RX:		n/a
TX:		n/a
Other:		1
Combined:	2

$ cat /proc/interrupts  | grep enp
 144: [...] enp86s0
 145: [...] enp86s0-rx-0
 146: [...] enp86s0-rx-1
 147: [...] enp86s0-tx-0
 148: [...] enp86s0-tx-1

1 "other" IRQ, and 2 IRQs for each of RX and Tx, so we expect netlink to
report 4 IRQs with unique NAPI IDs:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 2}'
[{'id': 8196, 'ifindex': 2, 'irq': 148},
 {'id': 8195, 'ifindex': 2, 'irq': 147},
 {'id': 8194, 'ifindex': 2, 'irq': 146},
 {'id': 8193, 'ifindex': 2, 'irq': 145}]

Now we examine which queues these NAPIs are associated with, expecting
that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
have its own NAPI instance:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 v3:
   - Replace igc_unset_queue_napi with igc_set_queue_napi(adapater, i,
     NULL), as suggested by Vinicius Costa Gomes
   - Simplify implemention of igc_set_queue_napi as suggested by Kurt
     Kanzenbach, with a tweak to use ring->queue_index

 v2:
   - Update commit message to include tests for IGC_FLAG_QUEUE_PAIRS
     disabled
   - Refactored code to move napi queue mapping and unmapping to helper
     functions igc_set_queue_napi and igc_unset_queue_napi
   - Adjust the code to handle IGC_FLAG_QUEUE_PAIRS disabled
   - Call helpers to map/unmap queues to NAPIs in igc_up, __igc_open,
     igc_xdp_enable_pool, and igc_xdp_disable_pool

 drivers/net/ethernet/intel/igc/igc.h      |  2 ++
 drivers/net/ethernet/intel/igc/igc_main.c | 33 ++++++++++++++++++++---
 drivers/net/ethernet/intel/igc/igc_xdp.c  |  2 ++
 3 files changed, 33 insertions(+), 4 deletions(-)

Vinicius Costa Gomes Oct. 21, 2024, 5:48 p.m. UTC | #1

Joe Damato <jdamato@fastly.com> writes:

> Link queues to NAPI instances via netdev-genl API so that users can
> query this information with netlink. Handle a few cases in the driver:
>   1. Link/unlink the NAPIs when XDP is enabled/disabled
>   2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled
>
> Example output when IGC_FLAG_QUEUE_PAIRS is enabled:
>
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
>
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
>
> Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
> is present for both rx and tx queues at the same index, for example
> index 0:
>
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
>
> To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
> the grub command line option "maxcpus=2" to force
> igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.
>
> Example output when IGC_FLAG_QUEUE_PAIRS is disabled:
>
> $ lscpu | grep "On-line CPU"
> On-line CPU(s) list:      0,2
>
> $ ethtool -l enp86s0  | tail -5
> Current hardware settings:
> RX:		n/a
> TX:		n/a
> Other:		1
> Combined:	2
>
> $ cat /proc/interrupts  | grep enp
>  144: [...] enp86s0
>  145: [...] enp86s0-rx-0
>  146: [...] enp86s0-rx-1
>  147: [...] enp86s0-tx-0
>  148: [...] enp86s0-tx-1
>
> 1 "other" IRQ, and 2 IRQs for each of RX and Tx, so we expect netlink to
> report 4 IRQs with unique NAPI IDs:
>
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump napi-get --json='{"ifindex": 2}'
> [{'id': 8196, 'ifindex': 2, 'irq': 148},
>  {'id': 8195, 'ifindex': 2, 'irq': 147},
>  {'id': 8194, 'ifindex': 2, 'irq': 146},
>  {'id': 8193, 'ifindex': 2, 'irq': 145}]
>
> Now we examine which queues these NAPIs are associated with, expecting
> that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
> have its own NAPI instance:
>
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>


Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,

Joe Damato Oct. 22, 2024, 8:28 p.m. UTC | #2

On Fri, Oct 18, 2024 at 05:13:43PM +0000, Joe Damato wrote:
> Link queues to NAPI instances via netdev-genl API so that users can
> query this information with netlink. Handle a few cases in the driver:
>   1. Link/unlink the NAPIs when XDP is enabled/disabled
>   2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled
> 
> Example output when IGC_FLAG_QUEUE_PAIRS is enabled:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
> 
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> 
> Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
> is present for both rx and tx queues at the same index, for example
> index 0:
> 
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
> 
> To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
> the grub command line option "maxcpus=2" to force
> igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.
> 
> Example output when IGC_FLAG_QUEUE_PAIRS is disabled:
> 
> $ lscpu | grep "On-line CPU"
> On-line CPU(s) list:      0,2
> 
> $ ethtool -l enp86s0  | tail -5
> Current hardware settings:
> RX:		n/a
> TX:		n/a
> Other:		1
> Combined:	2
> 
> $ cat /proc/interrupts  | grep enp
>  144: [...] enp86s0
>  145: [...] enp86s0-rx-0
>  146: [...] enp86s0-rx-1
>  147: [...] enp86s0-tx-0
>  148: [...] enp86s0-tx-1
> 
> 1 "other" IRQ, and 2 IRQs for each of RX and Tx, so we expect netlink to
> report 4 IRQs with unique NAPI IDs:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump napi-get --json='{"ifindex": 2}'
> [{'id': 8196, 'ifindex': 2, 'irq': 148},
>  {'id': 8195, 'ifindex': 2, 'irq': 147},
>  {'id': 8194, 'ifindex': 2, 'irq': 146},
>  {'id': 8193, 'ifindex': 2, 'irq': 145}]
> 
> Now we examine which queues these NAPIs are associated with, expecting
> that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
> have its own NAPI instance:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>  v3:
>    - Replace igc_unset_queue_napi with igc_set_queue_napi(adapater, i,
>      NULL), as suggested by Vinicius Costa Gomes
>    - Simplify implemention of igc_set_queue_napi as suggested by Kurt
>      Kanzenbach, with a tweak to use ring->queue_index
> 
>  v2:
>    - Update commit message to include tests for IGC_FLAG_QUEUE_PAIRS
>      disabled
>    - Refactored code to move napi queue mapping and unmapping to helper
>      functions igc_set_queue_napi and igc_unset_queue_napi
>    - Adjust the code to handle IGC_FLAG_QUEUE_PAIRS disabled
>    - Call helpers to map/unmap queues to NAPIs in igc_up, __igc_open,
>      igc_xdp_enable_pool, and igc_xdp_disable_pool
> 
>  drivers/net/ethernet/intel/igc/igc.h      |  2 ++
>  drivers/net/ethernet/intel/igc/igc_main.c | 33 ++++++++++++++++++++---
>  drivers/net/ethernet/intel/igc/igc_xdp.c  |  2 ++
>  3 files changed, 33 insertions(+), 4 deletions(-)

I took another look at this to make sure that RTNL is held when
igc_set_queue_napi is called after the e1000 bug report came in [1],
and there may be two locations I've missed:

1. igc_resume, which calls __igc_open
2. igc_io_error_detected, which calls igc_down

In both cases, I think the code can be modified to hold rtnl around
calls to __igc_open and igc_down.

Let me know what you think ?

If you agree that I should hold rtnl in both of those cases, what is
the best way to proceed:
  - send a v4, or
  - wait for this to get merged (since I got the notification it was
    pulled into intel-next) and send a fixes ?

Here's the full analysis I came up with; I tried to be thorough, but
it is certainly possible I missed a call site:

For the up case:

- igc_up:
  - called from igc_reinit_locked, which is called via:
    - igc_reset_task (rtnl is held)
    - igc_set_features (ndo_set_features, which itself has an ASSERT_RTNL)
    - various places in igc_ethtool (set_priv_flags, nway_reset,
      ethtool_set_eee) all of which have RTNL held
  - igc_change_mtu which also has RTNL held
- __igc_open
  - called from igc_resume, which may need an rtnl_lock ?
  - igc_open
    - called from igc_io_resume, rtnl is held
    - called from igc_reinit_queues, only via ethool set_channels,
      where rtnl is held
    - ndo_open where rtnl is held

For the down case:

- igc_down:
  - called from various ethtool locations (set_ringparam,
    set_pauseparam, set_link_ksettings) all of which hold rtnl
  - called from igc_io_error_detected, which may need an rtnl_lock
  - igc_reinit_locked which is fine, as described above
  - igc_change_mtu which is fine, as described above
  - called from __igc_close
    - called from __igc_shutdown which holds rtnl
    - called from igc_reinit_queues which is fine as described above
    - called from igc_close which is ndo_close

Jacob Keller Oct. 22, 2024, 8:53 p.m. UTC | #3

On 10/22/2024 1:28 PM, Joe Damato wrote:
> I took another look at this to make sure that RTNL is held when
> igc_set_queue_napi is called after the e1000 bug report came in [1],
> and there may be two locations I've missed:
> 
> 1. igc_resume, which calls __igc_open
> 2. igc_io_error_detected, which calls igc_down
> 
> In both cases, I think the code can be modified to hold rtnl around
> calls to __igc_open and igc_down.
> 
> Let me know what you think ?
> 
> If you agree that I should hold rtnl in both of those cases, what is
> the best way to proceed:
>   - send a v4, or
>   - wait for this to get merged (since I got the notification it was
>     pulled into intel-next) and send a fixes ?
> 

Intel-next uses a stacked set of patches which we then send in batches
via PRs as they pass our internal testing.

We can drop the v3 and await v4.

> Here's the full analysis I came up with; I tried to be thorough, but
> it is certainly possible I missed a call site:
> 
> For the up case:
> 
> - igc_up:
>   - called from igc_reinit_locked, which is called via:
>     - igc_reset_task (rtnl is held)
>     - igc_set_features (ndo_set_features, which itself has an ASSERT_RTNL)
>     - various places in igc_ethtool (set_priv_flags, nway_reset,
>       ethtool_set_eee) all of which have RTNL held
>   - igc_change_mtu which also has RTNL held
> - __igc_open
>   - called from igc_resume, which may need an rtnl_lock ?
>   - igc_open
>     - called from igc_io_resume, rtnl is held
>     - called from igc_reinit_queues, only via ethool set_channels,
>       where rtnl is held
>     - ndo_open where rtnl is held
> 
> For the down case:
> 
> - igc_down:
>   - called from various ethtool locations (set_ringparam,
>     set_pauseparam, set_link_ksettings) all of which hold rtnl
>   - called from igc_io_error_detected, which may need an rtnl_lock
>   - igc_reinit_locked which is fine, as described above
>   - igc_change_mtu which is fine, as described above
>   - called from __igc_close
>     - called from __igc_shutdown which holds rtnl
>     - called from igc_reinit_queues which is fine as described above
>     - called from igc_close which is ndo_close

This analysis looks complete to me.

Joe Damato Oct. 22, 2024, 8:58 p.m. UTC | #4

On Tue, Oct 22, 2024 at 01:53:01PM -0700, Jacob Keller wrote:
> 
> 
> On 10/22/2024 1:28 PM, Joe Damato wrote:
> > I took another look at this to make sure that RTNL is held when
> > igc_set_queue_napi is called after the e1000 bug report came in [1],
> > and there may be two locations I've missed:
> > 
> > 1. igc_resume, which calls __igc_open
> > 2. igc_io_error_detected, which calls igc_down
> > 
> > In both cases, I think the code can be modified to hold rtnl around
> > calls to __igc_open and igc_down.
> > 
> > Let me know what you think ?
> > 
> > If you agree that I should hold rtnl in both of those cases, what is
> > the best way to proceed:
> >   - send a v4, or
> >   - wait for this to get merged (since I got the notification it was
> >     pulled into intel-next) and send a fixes ?
> > 
> 
> Intel-next uses a stacked set of patches which we then send in batches
> via PRs as they pass our internal testing.
> 
> We can drop the v3 and await v4.

OK, thanks for the info. I will prepare, test locally, and send a
v4 shortly.

> > Here's the full analysis I came up with; I tried to be thorough, but
> > it is certainly possible I missed a call site:
> > 
> > For the up case:
> > 
> > - igc_up:
> >   - called from igc_reinit_locked, which is called via:
> >     - igc_reset_task (rtnl is held)
> >     - igc_set_features (ndo_set_features, which itself has an ASSERT_RTNL)
> >     - various places in igc_ethtool (set_priv_flags, nway_reset,
> >       ethtool_set_eee) all of which have RTNL held
> >   - igc_change_mtu which also has RTNL held
> > - __igc_open
> >   - called from igc_resume, which may need an rtnl_lock ?
> >   - igc_open
> >     - called from igc_io_resume, rtnl is held
> >     - called from igc_reinit_queues, only via ethool set_channels,
> >       where rtnl is held
> >     - ndo_open where rtnl is held
> > 
> > For the down case:
> > 
> > - igc_down:
> >   - called from various ethtool locations (set_ringparam,
> >     set_pauseparam, set_link_ksettings) all of which hold rtnl
> >   - called from igc_io_error_detected, which may need an rtnl_lock
> >   - igc_reinit_locked which is fine, as described above
> >   - igc_change_mtu which is fine, as described above
> >   - called from __igc_close
> >     - called from __igc_shutdown which holds rtnl
> >     - called from igc_reinit_queues which is fine as described above
> >     - called from igc_close which is ndo_close
> 
> This analysis looks complete to me.

Thanks; I'd appreciate your comments on the e1000 RFC I posted, if
you have a moment. I'm going to update that thread with more data
now that I have analysed igc as there are some parallels:

https://lore.kernel.org/netdev/20241022172153.217890-1-jdamato@fastly.com/

[net-next,v3,2/2] igc: Link queues to NAPI instances

Checks

Commit Message

Comments

Patch