Message ID | 20240528-net-2024-05-28-intel-net-fixes-v1-4-dc8593d2bbc6@intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) | expand |
On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote: > + /* Called from netdev unregister context. Unload the XDP program. */ > + if (vsi->netdev->reg_state == NETREG_UNREGISTERING) { > + xdp_features_clear_redirect_target(vsi->netdev); > + old_prog = xchg(&vsi->xdp_prog, NULL); > + if (old_prog) > + bpf_prog_put(old_prog); > + > + return 0; > + } This is not great. The netdevice is closed at this stage, why is the xdp setup try to do work if the device is closed even when not unregistering?
On 5/29/2024 6:54 PM, Jakub Kicinski wrote: > On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote: >> + /* Called from netdev unregister context. Unload the XDP program. */ >> + if (vsi->netdev->reg_state == NETREG_UNREGISTERING) { >> + xdp_features_clear_redirect_target(vsi->netdev); >> + old_prog = xchg(&vsi->xdp_prog, NULL); >> + if (old_prog) >> + bpf_prog_put(old_prog); >> + >> + return 0; >> + } > > This is not great. The netdevice is closed at this stage, why is the xdp > setup try to do work if the device is closed even when not > unregistering? The comment makes this seem like its happening during unregistration. It looks like i40e_xdp_setup() is only called from i40e_xdp(), which is if xdp->command is XDP_SETUP_PROG From the looks of things, ndo_bpf is called both for setup and teardown? > 7 >-------/* Set or clear a bpf program used in the earliest stages of packet > 6 >------- * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee > 5 >------- * is responsible for calling bpf_prog_put on any old progs that are > 4 >------- * stored. In case of error, the callee need not release the new prog > 3 >------- * reference, but on success it takes ownership and must bpf_prog_put > 2 >------- * when it is no longer used. > 1 >------- */ Indeed, dev_xdp_uninstall calls dev_xdp_install in a loop to remove programs. As far as I can tell, it looks like the .ndo_bpf call is made with a program set to NULL during uninstall: > 30 static void dev_xdp_uninstall(struct net_device *dev) > 29 { > 28 >-------struct bpf_xdp_link *link; > 27 >-------struct bpf_prog *prog; > 26 >-------enum bpf_xdp_mode mode; > 25 >-------bpf_op_t bpf_op; > 24 > 23 >-------ASSERT_RTNL(); > 22 > 21 >-------for (mode = XDP_MODE_SKB; mode < __MAX_XDP_MODE; mode++) { > 20 >------->-------prog = dev_xdp_prog(dev, mode); > 19 >------->-------if (!prog) > 18 >------->------->-------continue; > 17 > 16 >------->-------bpf_op = dev_xdp_bpf_op(dev, mode); > 15 >------->-------if (!bpf_op) > 14 >------->------->-------continue; > 13 > 12 >------->-------WARN_ON(dev_xdp_install(dev, mode, bpf_op, NULL, 0, NULL)); > 11 Here, dev_xdp_install is called with a prog of NULL > 10 >------->-------/* auto-detach link from net device */ > 9 >------->-------link = dev_xdp_link(dev, mode); > 8 >------->-------if (link) > 7 >------->------->-------link->dev = NULL; > 6 >------->-------else > 5 >------->------->-------bpf_prog_put(prog); > 4 > 3 >------->-------dev_xdp_set_link(dev, mode, NULL); > 2 >-------} > 1 } I think the semantics are confusing here. Basically, the issue is this function needs to remove the XDP properly when called by the netdev unregister flow but has a check against adding a new program if its called during remove... I think this is confusing and could be improved by refactoring how the i40e flow works. If the passed-in prog is NULL, its a request to remove. If its otherwise, its a request to add a new program (possibly replacing an existing one?). I think we ought to just be checking NULL and not needing to bother with the netdev_unregister state at all here? Hopefully Michal can chime in with a better understanding.
On Thu, May 30, 2024 at 09:38:04AM -0700, Jacob Keller wrote: > > > On 5/29/2024 6:54 PM, Jakub Kicinski wrote: > > On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote: > >> + /* Called from netdev unregister context. Unload the XDP program. */ > >> + if (vsi->netdev->reg_state == NETREG_UNREGISTERING) { > >> + xdp_features_clear_redirect_target(vsi->netdev); > >> + old_prog = xchg(&vsi->xdp_prog, NULL); > >> + if (old_prog) > >> + bpf_prog_put(old_prog); > >> + > >> + return 0; > >> + } > > > > This is not great. The netdevice is closed at this stage, why is the xdp > > setup try to do work if the device is closed even when not > > unregistering? > > The comment makes this seem like its happening during unregistration. It > looks like i40e_xdp_setup() is only called from i40e_xdp(), which is if > xdp->command is XDP_SETUP_PROG > > From the looks of things, ndo_bpf is called both for setup and teardown? Exactly, ndo_bpf with the command XDP_SETUP_PROG can be called for both loading and unloading the XDP program. When the XDP program is unloaded, the callback is simply called with NULL as the pointer to the program. Also, unloading the XDP program can be initiated not only from the user space directly, but also from the kernel core. In this specific case we are handling that last case. Calling ndo_bpf when reg_state == NETREG_UNREGISTERING is the case when unloading the XDP program is an immanent part of the netdevice unregistering process. In such a case we have to unconditionally decrease the reference counter for the XDP program using bpf_prog_put() call and exit with no error to assure the consistency between BPF core code and our driver. > > > 7 >-------/* Set or clear a bpf program used in the earliest stages of packet > > 6 >------- * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee > > 5 >------- * is responsible for calling bpf_prog_put on any old progs that are > > 4 >------- * stored. In case of error, the callee need not release the new prog > > 3 >------- * reference, but on success it takes ownership and must bpf_prog_put > > 2 >------- * when it is no longer used. > > 1 >------- */ > > Indeed, dev_xdp_uninstall calls dev_xdp_install in a loop to remove > programs. > > As far as I can tell, it looks like the .ndo_bpf call is made with a > program set to NULL during uninstall: > > > 30 static void dev_xdp_uninstall(struct net_device *dev) > > 29 { > > 28 >-------struct bpf_xdp_link *link; > > 27 >-------struct bpf_prog *prog; > > 26 >-------enum bpf_xdp_mode mode; > > 25 >-------bpf_op_t bpf_op; > > 24 > > 23 >-------ASSERT_RTNL(); > > 22 > > 21 >-------for (mode = XDP_MODE_SKB; mode < __MAX_XDP_MODE; mode++) { > > 20 >------->-------prog = dev_xdp_prog(dev, mode); > > 19 >------->-------if (!prog) > > 18 >------->------->-------continue; > > 17 > > 16 >------->-------bpf_op = dev_xdp_bpf_op(dev, mode); > > 15 >------->-------if (!bpf_op) > > 14 >------->------->-------continue; > > 13 > > 12 >------->-------WARN_ON(dev_xdp_install(dev, mode, bpf_op, NULL, 0, NULL)); > > 11 > > Here, dev_xdp_install is called with a prog of NULL > > > 10 >------->-------/* auto-detach link from net device */ > > 9 >------->-------link = dev_xdp_link(dev, mode); > > 8 >------->-------if (link) > > 7 >------->------->-------link->dev = NULL; > > 6 >------->-------else > > 5 >------->------->-------bpf_prog_put(prog); > > 4 > > 3 >------->-------dev_xdp_set_link(dev, mode, NULL); > > 2 >-------} > > 1 } > I confirm. The current design of netdevice unregistering algorithm includes checking (in a loop) if the netdevice has any XDP program attached and forces unloading that program because it won't be used anymore on that device. > I think the semantics are confusing here. > > Basically, the issue is this function needs to remove the XDP properly > when called by the netdev unregister flow but has a check against adding > a new program if its called during remove... The check for __I40E_IN_REMOVE has been introduced by the commit 6533e558c650 ("i40e: Fix reset path while removing the driver"). Similar checks have been added in other callbacks. I believe the intention was to fix some synchronization issues by exiting from callbacks or reset immediately if the driver is being removed. Unfortunately, although it could work for other callbacks, we cannot do that in ndo_bpf because we need to leave kernel structures and ref counters consistent. I decided to keep the check for IN_REMOVE because I believe it covers the case when NETREG_UNREGISTERING flag is not set yet but we started to destroy our internal data structures. > > I think this is confusing and could be improved by refactoring how the > i40e flow works. If the passed-in prog is NULL, its a request to remove. > If its otherwise, its a request to add a new program (possibly replacing > an existing one?). > > I think we ought to just be checking NULL and not needing to bother with > the netdev_unregister state at all here? I am afraid checking for NULL won't be enough here. Normally, when ndo_bpf is called from the user space application, that callback can be called with NULL too (when the user just wants to unload the XDP program). In such a case, apart from calling bpf_prog_put(), we have to rebuild our internal data structures (queues, pointers, counters etc.) to restore the i40e driver working back in normal mode (with no XDP program). My intention of adding the check for NETREG_REGISTERING was to implement a quick handler for unloading the XDP program from the netdev unregistering context only, when our internal data structures are already destroyed but we need to leave kernel's ref counters in a consistent state. > > Hopefully Michal can chime in with a better understanding. Thanks, Michal
On Wed, 5 Jun 2024 17:00:02 +0200 Michal Kubiak wrote: > I am afraid checking for NULL won't be enough here. > Normally, when ndo_bpf is called from the user space application, that > callback can be called with NULL too (when the user just wants to unload > the XDP program). In such a case, apart from calling bpf_prog_put(), we > have to rebuild our internal data structures (queues, pointers, counters > etc.) to restore the i40e driver working back in normal mode (with no > XDP program). Apologizes for asking a question which can be answered by studying the code longer, but why do you need to rebuild internal data structures for a device which is *down*. Unregistering or not.
On Wed, Jun 05, 2024 at 12:29:57PM -0700, Jakub Kicinski wrote: > On Wed, 5 Jun 2024 17:00:02 +0200 Michal Kubiak wrote: > > I am afraid checking for NULL won't be enough here. > > Normally, when ndo_bpf is called from the user space application, that > > callback can be called with NULL too (when the user just wants to unload > > the XDP program). In such a case, apart from calling bpf_prog_put(), we > > have to rebuild our internal data structures (queues, pointers, counters > > etc.) to restore the i40e driver working back in normal mode (with no > > XDP program). > > Apologizes for asking a question which can be answered by studying > the code longer, but why do you need to rebuild internal data > structures for a device which is *down*. Unregistering or not. Excuse me, but I don't understand why we should assume that a device is *down* when that callback is being called? Maybe I didn't make it clear, but the ndo_bpf can be called every time when the userspace application wants to load or unload the XDP program. It can happen when a device is *up* and also when the link is *up*. Michal
On Thu, 6 Jun 2024 12:02:27 +0200 Michal Kubiak wrote: > > Apologizes for asking a question which can be answered by studying > > the code longer, but why do you need to rebuild internal data > > structures for a device which is *down*. Unregistering or not. > > Excuse me, but I don't understand why we should assume that a device is > *down* when that callback is being called? > Maybe I didn't make it clear, but the ndo_bpf can be called every time > when the userspace application wants to load or unload the XDP program. > It can happen when a device is *up* and also when the link is *up*. The patch was adding a special case for NETREG_UNREGISTERING, at that point the device will be closed. Calling ndo_close is one of the first things core does during unregistering. Simplifying the handling for when the device is closed would be better.
On Thu, Jun 06, 2024 at 06:43:28AM -0700, Jakub Kicinski wrote: > On Thu, 6 Jun 2024 12:02:27 +0200 Michal Kubiak wrote: > > > Apologizes for asking a question which can be answered by studying > > > the code longer, but why do you need to rebuild internal data > > > structures for a device which is *down*. Unregistering or not. > > > > Excuse me, but I don't understand why we should assume that a device is > > *down* when that callback is being called? > > Maybe I didn't make it clear, but the ndo_bpf can be called every time > > when the userspace application wants to load or unload the XDP program. > > It can happen when a device is *up* and also when the link is *up*. > > The patch was adding a special case for NETREG_UNREGISTERING, > at that point the device will be closed. Calling ndo_close is one > of the first things core does during unregistering. > Simplifying the handling for when the device is closed would be > better. I think I'm getting your point but moving the code for NETREG_UNREGISTERING to ndo_stop wouldn't be enough and it seems to be against the current design of 'unregister_netdevice_many_notify()'. In 'unregister_netdevice_many_notify()' we have the call to 'dev_close_many()' (which calls ndo_stop on i40e driver) and then 'dev_xdp_uninstall()' is called (where there is the call to ndo_bpf). 'dev_xdp_uninstall()' seems to be the right function where all activities related to XDP program unloading during unregistering are expected from the driver. Anyway, I analyzed that code one more time and I agree that the special case for NETREG_UNREGISTERING makes the code more complex and I can implement it in a simpler way. The root cause of the problem I'm trying to fix is that __I40E_IN_REMOVE flag is handled in a wrong way. I will post the v2 then. Thanks, Michal
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 284c3fad5a6e..2f478edb9f9f 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -13293,6 +13293,20 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog, bool need_reset; int i; + /* Called from netdev unregister context. Unload the XDP program. */ + if (vsi->netdev->reg_state == NETREG_UNREGISTERING) { + xdp_features_clear_redirect_target(vsi->netdev); + old_prog = xchg(&vsi->xdp_prog, NULL); + if (old_prog) + bpf_prog_put(old_prog); + + return 0; + } + + /* VSI shall be deleted in a moment, just return EINVAL */ + if (test_bit(__I40E_IN_REMOVE, pf->state)) + return -EINVAL; + /* Don't allow frames that span over multiple buffers */ if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) { NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags"); @@ -13301,14 +13315,9 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog, /* When turning XDP on->off/off->on we reset and rebuild the rings. */ need_reset = (i40e_enabled_xdp_vsi(vsi) != !!prog); - if (need_reset) i40e_prep_for_reset(pf); - /* VSI shall be deleted in a moment, just return EINVAL */ - if (test_bit(__I40E_IN_REMOVE, pf->state)) - return -EINVAL; - old_prog = xchg(&vsi->xdp_prog, prog); if (need_reset) {