Message ID | 20220607171732.21191-5-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: few debug refinements | expand |
On Tue, 7 Jun 2022 10:17:28 -0700 Eric Dumazet wrote: > sk_stream_kill_queues() has three checks which have been > useful to detect kernel bugs in the past. > > However they are potentially a problem because they > could flood the syslog, and really only a developper > can make sense of them. > > Keep the checks for CONFIG_DEBUG_NET=y builds, > and issue them once only. I feel like 3 & 4 had caught plenty of bugs which triggered only in production / at scale. In my head DEBUG_NET_WARN_ON_ONCE() is great for things we are relatively sure syzbot will trigger. Am I mis-characterizing things or should we WARN_ON_ONCE() those?
On Tue, 2022-06-07 at 21:10 -0700, Jakub Kicinski wrote: > On Tue, 7 Jun 2022 10:17:28 -0700 Eric Dumazet wrote: > > sk_stream_kill_queues() has three checks which have been > > useful to detect kernel bugs in the past. > > > > However they are potentially a problem because they > > could flood the syslog, and really only a developper > > can make sense of them. > > > > Keep the checks for CONFIG_DEBUG_NET=y builds, > > and issue them once only. > > I feel like 3 & 4 had caught plenty of bugs which triggered only > in production / at scale. > I have a somewhat similar experience: I hit a few races spotted by the warnings in patches 3 and 4 observable only in non-debug build. The checks in patch 4 are almost rendundant with the ones in patch 3 - at least in my experience I could not trigger the first without hitting the latter. Perhaps we could use WARN_ON_ONCE only in patch 3? Thanks! Paolo
On Wed, Jun 8, 2022 at 1:11 AM Paolo Abeni <pabeni@redhat.com> wrote: > > On Tue, 2022-06-07 at 21:10 -0700, Jakub Kicinski wrote: > > On Tue, 7 Jun 2022 10:17:28 -0700 Eric Dumazet wrote: > > > sk_stream_kill_queues() has three checks which have been > > > useful to detect kernel bugs in the past. > > > > > > However they are potentially a problem because they > > > could flood the syslog, and really only a developper > > > can make sense of them. > > > > > > Keep the checks for CONFIG_DEBUG_NET=y builds, > > > and issue them once only. > > > > I feel like 3 & 4 had caught plenty of bugs which triggered only > > in production / at scale. > > > I have a somewhat similar experience: I hit a few races spotted by the > warnings in patches 3 and 4 observable only in non-debug build. > > The checks in patch 4 are almost rendundant with the ones in patch 3 - > at least in my experience I could not trigger the first without hitting > the latter. Perhaps we could use WARN_ON_ONCE only in patch 3? > Well, I certainly can stick to WARN_ON_ONCE(). For syzbot it is really the same thing.
diff --git a/net/core/stream.c b/net/core/stream.c index 06b36c730ce8a29bb2d8984495e780931907ca72..a5aa3620be95574c6d0f371f5943bb3b8f36cb4c 100644 --- a/net/core/stream.c +++ b/net/core/stream.c @@ -196,13 +196,13 @@ void sk_stream_kill_queues(struct sock *sk) __skb_queue_purge(&sk->sk_receive_queue); /* Next, the write queue. */ - WARN_ON(!skb_queue_empty(&sk->sk_write_queue)); + DEBUG_NET_WARN_ON_ONCE(!skb_queue_empty(&sk->sk_write_queue)); /* Account for returned memory. */ sk_mem_reclaim_final(sk); - WARN_ON(sk->sk_wmem_queued); - WARN_ON(sk->sk_forward_alloc); + DEBUG_NET_WARN_ON_ONCE(sk->sk_wmem_queued); + DEBUG_NET_WARN_ON_ONCE(sk->sk_forward_alloc); /* It is _impossible_ for the backlog to contain anything * when we get here. All user references to this socket