Message ID | 20240929061839.1175300-1-willemdebruijn.kernel@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | b04c4d9eb4f25b950b33218e33b04c94e7445e51 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] vrf: revert "vrf: Remove unnecessary RCU-bh critical section" | expand |
On Sun, Sep 29, 2024 at 02:18:20AM -0400, Willem de Bruijn wrote: > From: Willem de Bruijn <willemb@google.com> > > This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. > > dev_queue_xmit_nit is expected to be called with BH disabled. > __dev_queue_xmit has the following: > > /* Disable soft irqs for various locks below. Also > * stops preemption for RCU. > */ > rcu_read_lock_bh(); > > VRF must follow this invariant. The referenced commit removed this > protection. Which triggered a lockdep warning: [...] > > Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section") > Link: https://lore.kernel.org/netdev/20240925185216.1990381-1-greearb@candelatech.com/ > Reported-by: Ben Greear <greearb@candelatech.com> > Signed-off-by: Willem de Bruijn <willemb@google.com> > Cc: stable@vger.kernel.org Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Thanks Willem! The reason my script from 504fc6f4f7f6 did not trigger the problem is that it was pinging the address inside the VRF, so vrf_finish_direct() was only called from the Rx path. If you ping the address outside of the VRF: ping -I vrf1 -i 0.1 -c 10 -q 192.0.2.1 Then vrf_finish_direct() is called from process context and the lockdep warning is triggered. Tested that it does not trigger after applying the revert.
Ido Schimmel wrote: > On Sun, Sep 29, 2024 at 02:18:20AM -0400, Willem de Bruijn wrote: > > From: Willem de Bruijn <willemb@google.com> > > > > This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. > > > > dev_queue_xmit_nit is expected to be called with BH disabled. > > __dev_queue_xmit has the following: > > > > /* Disable soft irqs for various locks below. Also > > * stops preemption for RCU. > > */ > > rcu_read_lock_bh(); > > > > VRF must follow this invariant. The referenced commit removed this > > protection. Which triggered a lockdep warning: > > [...] > > > > > Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section") > > Link: https://lore.kernel.org/netdev/20240925185216.1990381-1-greearb@candelatech.com/ > > Reported-by: Ben Greear <greearb@candelatech.com> > > Signed-off-by: Willem de Bruijn <willemb@google.com> > > Cc: stable@vger.kernel.org > > Reviewed-by: Ido Schimmel <idosch@nvidia.com> > Tested-by: Ido Schimmel <idosch@nvidia.com> > > Thanks Willem! Thanks for testing immediately and sharing the below context, Ido! > The reason my script from 504fc6f4f7f6 did not trigger the problem is > that it was pinging the address inside the VRF, so vrf_finish_direct() > was only called from the Rx path. > > If you ping the address outside of the VRF: > > ping -I vrf1 -i 0.1 -c 10 -q 192.0.2.1 > > Then vrf_finish_direct() is called from process context and the lockdep > warning is triggered. Tested that it does not trigger after applying the > revert.
On 9/29/24 3:11 AM, Ido Schimmel wrote: > On Sun, Sep 29, 2024 at 02:18:20AM -0400, Willem de Bruijn wrote: >> From: Willem de Bruijn <willemb@google.com> >> >> This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. >> >> dev_queue_xmit_nit is expected to be called with BH disabled. >> __dev_queue_xmit has the following: >> >> /* Disable soft irqs for various locks below. Also >> * stops preemption for RCU. >> */ >> rcu_read_lock_bh(); >> >> VRF must follow this invariant. The referenced commit removed this >> protection. Which triggered a lockdep warning: > > [...] > >> >> Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section") >> Link: https://lore.kernel.org/netdev/20240925185216.1990381-1-greearb@candelatech.com/ >> Reported-by: Ben Greear <greearb@candelatech.com> >> Signed-off-by: Willem de Bruijn <willemb@google.com> >> Cc: stable@vger.kernel.org > > Reviewed-by: Ido Schimmel <idosch@nvidia.com> > Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> > > Thanks Willem! > > The reason my script from 504fc6f4f7f6 did not trigger the problem is > that it was pinging the address inside the VRF, so vrf_finish_direct() > was only called from the Rx path. > > If you ping the address outside of the VRF: > > ping -I vrf1 -i 0.1 -c 10 -q 192.0.2.1 > > Then vrf_finish_direct() is called from process context and the lockdep > warning is triggered. Tested that it does not trigger after applying the > revert. That case should be covered by the fcnal-test suite which does all combinations of addresses.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Sun, 29 Sep 2024 02:18:20 -0400 you wrote: > From: Willem de Bruijn <willemb@google.com> > > This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853. > > dev_queue_xmit_nit is expected to be called with BH disabled. > __dev_queue_xmit has the following: > > [...] Here is the summary with links: - [net] vrf: revert "vrf: Remove unnecessary RCU-bh critical section" https://git.kernel.org/netdev/net/c/b04c4d9eb4f2 You are awesome, thank you!
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index 4d8ccaf9a2b4..4087f72f0d2b 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -608,7 +608,9 @@ static void vrf_finish_direct(struct sk_buff *skb) eth_zero_addr(eth->h_dest); eth->h_proto = skb->protocol; + rcu_read_lock_bh(); dev_queue_xmit_nit(skb, vrf_dev); + rcu_read_unlock_bh(); skb_pull(skb, ETH_HLEN); }