Message ID | 20240604122927.29080-1-magnus.karlsson@gmail.com |
---|---|
Series | Revert "xsk: support redirect to any socket bound to the same umem" |
Magnus Karlsson wrote:
> Revert "xsk: support redirect to any socket bound to the same umem"
>
> This patch introduced a potential kernel crash when multiple napi
> instances redirect to the same AF_XDP socket. By removing the
> queue_index check, it is possible for multiple napi instances to
> access the Rx ring at the same time, which will result in a corrupted
> ring state which can lead to a crash when flushing the rings in
> __xsk_flush(). This can happen when the linked list of sockets to
> flush gets corrupted by concurrent accesses. A quick and small fix is
> unfortunately not possible, so let us revert this for now.

This is a very useful feature, to be able to use AF_XDP sockets with
a standard RSS nic configuration. Not all AF_XDP use cases require the
absolute highest packet rate.

Can this be addressed with an optional spinlock on the RxQ, only for
this case?

If there is no simple enough fix in the short term, do you plan to
reintroduce this in another form later?
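For context, the corruption mode described in the revert message is the classic lost-update race on a single-producer ring. The following minimal userspace C sketch is not the kernel's xsk ring code; it only illustrates, under that assumption, what happens when two producers (here, two threads standing in for two napi instances) push to a ring whose producer cursor expects a single writer:

```c
/* Illustration only (not kernel code): a single-producer ring fed by two
 * producers. Each producer reads the cursor, fills a slot, then publishes
 * cursor + 1. Interleaved, updates are lost and the ring state no longer
 * matches what was produced. The data race is deliberate. */
#include <pthread.h>
#include <stdio.h>

#define RING_SIZE 1024

struct ring {
	unsigned int prod;          /* producer cursor, written without locking */
	int slots[RING_SIZE];
};

static struct ring r;

static void *producer(void *arg)
{
	for (int i = 0; i < 100000; i++) {
		unsigned int idx = r.prod;          /* read cursor */
		r.slots[idx & (RING_SIZE - 1)] = i; /* fill slot   */
		r.prod = idx + 1;                   /* publish: races with the peer */
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, producer, NULL);
	pthread_create(&b, NULL, producer, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* With two racing producers the cursor typically ends up well short
	 * of the 200000 entries that were "produced". */
	printf("final prod cursor: %u (expected 200000)\n", r.prod);
	return 0;
}
```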
On Wed, 5 Jun 2024 at 01:03, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Magnus Karlsson wrote:
> > Revert "xsk: support redirect to any socket bound to the same umem"
> >
> > This patch introduced a potential kernel crash when multiple napi
> > instances redirect to the same AF_XDP socket. By removing the
> > queue_index check, it is possible for multiple napi instances to
> > access the Rx ring at the same time, which will result in a corrupted
> > ring state which can lead to a crash when flushing the rings in
> > __xsk_flush(). This can happen when the linked list of sockets to
> > flush gets corrupted by concurrent accesses. A quick and small fix is
> > unfortunately not possible, so let us revert this for now.
>
> This is a very useful feature, to be able to use AF_XDP sockets with
> a standard RSS nic configuration.

I completely agree.

> Not all AF_XDP use cases require the absolute highest packet rate.
>
> Can this be addressed with an optional spinlock on the RxQ, only for
> this case?

Yes, or with a MPSC ring implementation.

> If there is no simple enough fix in the short term, do you plan to
> reintroduce this in another form later?

Yuval and I are looking into a solution based around an optional
spinlock since it is easier to pull off than an MPSC ring. The
discussion is on-going on the xdp-newbies list [0], but as soon as we
have a first patch, we will post it here for review and debate.

[0] https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com/
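For readers following along, here is a rough userspace sketch of the "optional spinlock" direction mentioned above. The names (`shared_rx`, `ring_lock`, `ring_produce()`) are hypothetical and a pthread spinlock stands in for a kernel `spinlock_t`; this is not the AF_XDP ring code or the patch under discussion, only an illustration of how the lock could be taken solely when producers from multiple queues are allowed, leaving the existing single-producer path lock-free:

```c
/* Sketch with hypothetical names: serialize ring producers only when the
 * socket was bound in a mode that allows redirects from multiple queues. */
#include <pthread.h>
#include <stdbool.h>

#define RING_SIZE 1024

struct rx_ring {
	unsigned int prod;
	unsigned int cons;
	bool shared_rx;              /* redirects may come from any queue */
	pthread_spinlock_t ring_lock;
	void *slots[RING_SIZE];
};

static void ring_init(struct rx_ring *r, bool shared_rx)
{
	r->prod = 0;
	r->cons = 0;
	r->shared_rx = shared_rx;
	pthread_spin_init(&r->ring_lock, PTHREAD_PROCESS_PRIVATE);
}

/* Single-producer fast path: valid only when one napi/queue feeds the ring. */
static int ring_produce(struct rx_ring *r, void *desc)
{
	if (r->prod - r->cons == RING_SIZE)
		return -1;           /* ring full */
	r->slots[r->prod & (RING_SIZE - 1)] = desc;
	r->prod++;                   /* in the kernel this would be a release store */
	return 0;
}

/* Wrapper that takes the lock only when sharing is enabled. */
static int ring_produce_maybe_locked(struct rx_ring *r, void *desc)
{
	int ret;

	if (!r->shared_rx)
		return ring_produce(r, desc);

	pthread_spin_lock(&r->ring_lock);
	ret = ring_produce(r, desc);
	pthread_spin_unlock(&r->ring_lock);
	return ret;
}
```

The appeal of this shape is that sockets bound the traditional way never pay for the lock; only the shared-umem, any-queue case does.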
Hello:

This series was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue, 4 Jun 2024 14:29:24 +0200 you wrote:
> Revert "xsk: support redirect to any socket bound to the same umem"
>
> This patch introduced a potential kernel crash when multiple napi
> instances redirect to the same AF_XDP socket. By removing the
> queue_index check, it is possible for multiple napi instances to
> access the Rx ring at the same time, which will result in a corrupted
> ring state which can lead to a crash when flushing the rings in
> __xsk_flush(). This can happen when the linked list of sockets to
> flush gets corrupted by concurrent accesses. A quick and small fix is
> unfortunately not possible, so let us revert this for now.
>
> [...]

Here is the summary with links:
  - [bpf,1/2] Revert "xsk: support redirect to any socket bound to the same umem"
    https://git.kernel.org/bpf/bpf/c/7fcf26b315bb
  - [bpf,2/2] Revert "xsk: document ability to redirect to any socket bound to the same umem"
    https://git.kernel.org/bpf/bpf/c/03e38d315f3c

You are awesome, thank you!
Magnus Karlsson wrote:
> On Wed, 5 Jun 2024 at 01:03, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Magnus Karlsson wrote:
> > > Revert "xsk: support redirect to any socket bound to the same umem"
> > >
> > > This patch introduced a potential kernel crash when multiple napi
> > > instances redirect to the same AF_XDP socket. By removing the
> > > queue_index check, it is possible for multiple napi instances to
> > > access the Rx ring at the same time, which will result in a corrupted
> > > ring state which can lead to a crash when flushing the rings in
> > > __xsk_flush(). This can happen when the linked list of sockets to
> > > flush gets corrupted by concurrent accesses. A quick and small fix is
> > > unfortunately not possible, so let us revert this for now.
> >
> > This is a very useful feature, to be able to use AF_XDP sockets with
> > a standard RSS nic configuration.
>
> I completely agree.
>
> > Not all AF_XDP use cases require the absolute highest packet rate.
> >
> > Can this be addressed with an optional spinlock on the RxQ, only for
> > this case?
>
> Yes, or with a MPSC ring implementation.
>
> > If there is no simple enough fix in the short term, do you plan to
> > reintroduce this in another form later?
>
> Yuval and I are looking into a solution based around an optional
> spinlock since it is easier to pull off than an MPSC ring. The
> discussion is on-going on the xdp-newbies list [0], but as soon as we
> have a first patch, we will post it here for review and debate.
>
> [0] https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com/

Glad to hear that it's intended to be supported, and even being worked
on, thanks! I'll follow the conversation there.