[net,v2,0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()

Message ID 20230125074901.2737-1-magnus.karlsson@gmail.com (mailing list archive)

Message

Magnus Karlsson Jan. 25, 2023, 7:48 a.m. UTC
Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
NAPI context X on CPU Y will be followed by an xdp_do_flush() from the
same NAPI context and CPU. This is not guaranteed if
napi_complete_done() is executed before xdp_do_flush(), as it tells
the NAPI logic that it is fine to schedule NAPI context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found in [1].

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in [2].
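
To make the ordering concrete, here is a minimal user-space sketch, not
code from any of the drivers in this series. Every mock_* name and
struct mock_napi are hypothetical stand-ins for the kernel calls; they
only record state so that the two orderings can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NAPI instance; it only tracks ordering state. */
struct mock_napi {
	bool flush_pending;      /* a redirect was queued and not yet flushed */
	bool completed;          /* mock_napi_complete_done() has been called */
	bool ordering_violated;  /* flush happened after (or was pending at) completion */
};

/* Stand-in for xdp_do_redirect(): queues work that needs a later flush. */
static void mock_xdp_do_redirect(struct mock_napi *n)
{
	n->flush_pending = true;
}

/* Stand-in for xdp_do_flush(): must run before the NAPI context completes. */
static void mock_xdp_do_flush(struct mock_napi *n)
{
	if (n->completed)
		n->ordering_violated = true;
	n->flush_pending = false;
}

/* Stand-in for napi_complete_done(): after this, the context may move CPUs. */
static void mock_napi_complete_done(struct mock_napi *n)
{
	if (n->flush_pending)
		n->ordering_violated = true;
	n->completed = true;
}

/* Ordering the series asks for: flush first, then complete. */
static int mock_poll_fixed(struct mock_napi *n, int budget, int pkts)
{
	int work_done = 0;
	bool redirected = false;

	while (work_done < budget && pkts > 0) {
		mock_xdp_do_redirect(n);
		redirected = true;
		work_done++;
		pkts--;
	}

	if (redirected)
		mock_xdp_do_flush(n);
	if (work_done < budget)
		mock_napi_complete_done(n);

	return work_done;
}

/* Ordering the series removes: complete first, flush afterwards. */
static int mock_poll_buggy(struct mock_napi *n, int budget, int pkts)
{
	int work_done = 0;
	bool redirected = false;

	while (work_done < budget && pkts > 0) {
		mock_xdp_do_redirect(n);
		redirected = true;
		work_done++;
		pkts--;
	}

	if (work_done < budget)
		mock_napi_complete_done(n);
	if (redirected)
		mock_xdp_do_flush(n);

	return work_done;
}

/* Runs both orderings; returns 0 if only the buggy one trips the check. */
static int demo_ordering(void)
{
	struct mock_napi good = {0}, bad = {0};

	mock_poll_fixed(&good, 64, 3);
	mock_poll_buggy(&bad, 64, 3);

	return (!good.ordering_violated && bad.ordering_violated) ? 0 : 1;
}
```

In this sketch only mock_poll_buggy() sets ordering_violated, which
mirrors the window in which the real NAPI context could be rescheduled
on another CPU before the flush runs.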

The drivers have only been compile-tested since I do not own any of
the HW below. So if you are a maintainer, it would be great if you
could take a quick look to make sure I did not mess something up.

Note that these were the drivers I found that violated the ordering by
running a simple script and manually checking the ones that came up as
potential offenders. But the script was not perfect in any way. There
might still be offenders out there, since the script can generate
false negatives.

v1 -> v2:
* Added acks [Toke, Steen]
* Corrected two spelling errors [Toke]

[1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
[2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/

Thanks: Magnus

Magnus Karlsson (5):
  qede: execute xdp_do_flush() before napi_complete_done()
  lan966x: execute xdp_do_flush() before napi_complete_done()
  virtio-net: execute xdp_do_flush() before napi_complete_done()
  dpaa_eth: execute xdp_do_flush() before napi_complete_done()
  dpaa2-eth: execute xdp_do_flush() before napi_complete_done()

 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c        | 6 +++---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c      | 9 ++++++---
 drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
 drivers/net/ethernet/qlogic/qede/qede_fp.c            | 7 ++++---
 drivers/net/virtio_net.c                              | 6 +++---
 5 files changed, 19 insertions(+), 15 deletions(-)


base-commit: 2a48216cff7a2e3964fbed16f84d33f68b3e5e42
--
2.34.1

Comments

Michael S. Tsirkin Jan. 27, 2023, 10:50 a.m. UTC | #1
On Wed, Jan 25, 2023 at 08:48:56AM +0100, Magnus Karlsson wrote:
> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> NAPI context X on CPU Y will be followed by an xdp_do_flush() from the
> same NAPI context and CPU. This is not guaranteed if
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the NAPI logic that it is fine to schedule NAPI context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found in [1].
> 
> The second reason is that the XDP_REDIRECT logic in itself relies on
> being inside a single NAPI instance through to the xdp_do_flush() call
> for RCU protection of all in-kernel data structures. Details can be
> found in [2].
> 
> The drivers have only been compile-tested since I do not own any of
> the HW below. So if you are a maintainer, it would be great if you
> could take a quick look to make sure I did not mess something up.
> 
> Note that these were the drivers I found that violated the ordering by
> running a simple script and manually checking the ones that came up as
> potential offenders. But the script was not perfect in any way. There
> might still be offenders out there, since the script can generate
> false negatives.


BTW, this whole series is stable material, right?


> v1 -> v2:
> * Added acks [Toke, Steen]
> * Corrected two spelling errors [Toke]
> 
> [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
> 
> Thanks: Magnus
> 
> Magnus Karlsson (5):
>   qede: execute xdp_do_flush() before napi_complete_done()
>   lan966x: execute xdp_do_flush() before napi_complete_done()
>   virtio-net: execute xdp_do_flush() before napi_complete_done()
>   dpaa_eth: execute xdp_do_flush() before napi_complete_done()
>   dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
> 
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c        | 6 +++---
>  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c      | 9 ++++++---
>  drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
>  drivers/net/ethernet/qlogic/qede/qede_fp.c            | 7 ++++---
>  drivers/net/virtio_net.c                              | 6 +++---
>  5 files changed, 19 insertions(+), 15 deletions(-)
> 
> 
> base-commit: 2a48216cff7a2e3964fbed16f84d33f68b3e5e42
> --
> 2.34.1

patchwork-bot+netdevbpf@kernel.org Jan. 28, 2023, 6:40 a.m. UTC | #2
Hello:

This series was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 25 Jan 2023 08:48:56 +0100 you wrote:
> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> NAPI context X on CPU Y will be followed by an xdp_do_flush() from the
> same NAPI context and CPU. This is not guaranteed if
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the NAPI logic that it is fine to schedule NAPI context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found in [1].
> 
> [...]

Here is the summary with links:
  - [net,v2,1/5] qede: execute xdp_do_flush() before napi_complete_done()
    https://git.kernel.org/netdev/net/c/2ccce20d51fa
  - [net,v2,2/5] lan966x: execute xdp_do_flush() before napi_complete_done()
    https://git.kernel.org/netdev/net/c/12b5717990c8
  - [net,v2,3/5] virtio-net: execute xdp_do_flush() before napi_complete_done()
    https://git.kernel.org/netdev/net/c/ad7e615f646c
  - [net,v2,4/5] dpaa_eth: execute xdp_do_flush() before napi_complete_done()
    https://git.kernel.org/netdev/net/c/b534013798b7
  - [net,v2,5/5] dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
    https://git.kernel.org/netdev/net/c/a3191c4d86c5

You are awesome, thank you!