Message ID | 1603971288-4786-1-git-send-email-linyunsheng@huawei.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net: add in_softirq() debug checking in napi_consume_skb() | expand |
On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote: > The current semantic for napi_consume_skb() is that caller need > to provide non-zero budget when calling from NAPI context, and > breaking this semantic will cause hard to debug problem, because > _kfree_skb_defer() need to run in atomic context in order to push > the skb to the particular cpu' napi_alloc_cache atomically. > > So add a in_softirq() debug checking in napi_consume_skb() to catch > this kind of error. > > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 1ba8f01..1834007 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) > return; > } > > + DEBUG_NET_WARN(!in_softirq(), > + "%s is called with non-zero budget outside softirq context.\n", > + __func__); Can't we use lockdep instead of defining our own knobs? Like this maybe? diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index f5594879175a..5253a167d00c 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -594,6 +594,14 @@ do { \ this_cpu_read(hardirqs_enabled))); \ } while (0) +#define lockdep_assert_in_softirq() \ +do { \ + WARN_ON_ONCE(__lockdep_enabled && \ + (softirq_count() == 0 || \ + this_cpu_read(hardirq_context))); \ +} while (0) > if (!skb_unref(skb)) > return; >
On 2020/11/1 6:38, Jakub Kicinski wrote: > On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote: >> The current semantic for napi_consume_skb() is that caller need >> to provide non-zero budget when calling from NAPI context, and >> breaking this semantic will cause hard to debug problem, because >> _kfree_skb_defer() need to run in atomic context in order to push >> the skb to the particular cpu' napi_alloc_cache atomically. >> >> So add a in_softirq() debug checking in napi_consume_skb() to catch >> this kind of error. >> >> Suggested-by: Eric Dumazet <edumazet@google.com> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> index 1ba8f01..1834007 100644 >> --- a/net/core/skbuff.c >> +++ b/net/core/skbuff.c >> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) >> return; >> } >> >> + DEBUG_NET_WARN(!in_softirq(), >> + "%s is called with non-zero budget outside softirq context.\n", >> + __func__); > > Can't we use lockdep instead of defining our own knobs? From the first look, using the below seems better than defining our own knobs, because there is similar lockdep_assert_in_irq() checking already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING is not defined. > > Like this maybe? > > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h > index f5594879175a..5253a167d00c 100644 > --- a/include/linux/lockdep.h > +++ b/include/linux/lockdep.h > @@ -594,6 +594,14 @@ do { \ > this_cpu_read(hardirqs_enabled))); \ > } while (0) > > +#define lockdep_assert_in_softirq() \ > +do { \ > + WARN_ON_ONCE(__lockdep_enabled && \ > + (softirq_count() == 0 || \ > + this_cpu_read(hardirq_context))); \ Using in_softirq() seems more obvious then using softirq_count()? And there is below comment above avoiding the using of in_softirq(), maybe that is why you use softirq_count() directly here? "softirq_count() == 0" still mean we are not in the SoftIRQ context and BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled() is more obvious? /* * Are we doing bottom half or hardware interrupt processing? * * in_irq() - We're in (hard) IRQ context * in_softirq() - We have BH disabled, or are processing softirqs * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled * in_serving_softirq() - We're in softirq context * in_nmi() - We're in NMI context * in_task() - We're in task context * * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really * should not be used in new code. */ Also, is there any particular reason we do the "this_cpu_read(hardirq_context)" checking? Thanks. > +} while (0) > > > >> if (!skb_unref(skb)) >> return; >> > > . >
On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote: > On 2020/11/1 6:38, Jakub Kicinski wrote: > > On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote: > >> The current semantic for napi_consume_skb() is that caller need > >> to provide non-zero budget when calling from NAPI context, and > >> breaking this semantic will cause hard to debug problem, because > >> _kfree_skb_defer() need to run in atomic context in order to push > >> the skb to the particular cpu' napi_alloc_cache atomically. > >> > >> So add a in_softirq() debug checking in napi_consume_skb() to catch > >> this kind of error. > >> > >> Suggested-by: Eric Dumazet <edumazet@google.com> > >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > > > >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > >> index 1ba8f01..1834007 100644 > >> --- a/net/core/skbuff.c > >> +++ b/net/core/skbuff.c > >> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) > >> return; > >> } > >> > >> + DEBUG_NET_WARN(!in_softirq(), > >> + "%s is called with non-zero budget outside softirq context.\n", > >> + __func__); > > > > Can't we use lockdep instead of defining our own knobs? > > From the first look, using the below seems better than defining our > own knobs, because there is similar lockdep_assert_in_irq() checking > already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING > is not defined. > > > > > Like this maybe? > > > > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h > > index f5594879175a..5253a167d00c 100644 > > --- a/include/linux/lockdep.h > > +++ b/include/linux/lockdep.h > > @@ -594,6 +594,14 @@ do { \ > > this_cpu_read(hardirqs_enabled))); \ > > } while (0) > > > > +#define lockdep_assert_in_softirq() \ > > +do { \ > > + WARN_ON_ONCE(__lockdep_enabled && \ > > + (softirq_count() == 0 || \ > > + this_cpu_read(hardirq_context))); \ > > Using in_softirq() seems more obvious then using softirq_count()? > And there is below comment above avoiding the using of in_softirq(), maybe > that is why you use softirq_count() directly here? > "softirq_count() == 0" still mean we are not in the SoftIRQ context and > BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled() > is more obvious? Let's add Peter to the recipients to get his opinion. We have a per-cpu resource which is also accessed from BH (see _kfree_skb_defer()). We're trying to come up with the correct lockdep incantation for it. > /* > * Are we doing bottom half or hardware interrupt processing? > * > * in_irq() - We're in (hard) IRQ context > * in_softirq() - We have BH disabled, or are processing softirqs > * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled > * in_serving_softirq() - We're in softirq context > * in_nmi() - We're in NMI context > * in_task() - We're in task context > * > * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really > * should not be used in new code. > */ > > > Also, is there any particular reason we do the "this_cpu_read(hardirq_context)" > checking? Accessing BH resources from a hard IRQ context would be a bug, we may have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest calls to _kfree_skb_defer().
On 2020/11/3 3:41, Jakub Kicinski wrote: > On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote: >> On 2020/11/1 6:38, Jakub Kicinski wrote: >>> On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote: >>>> The current semantic for napi_consume_skb() is that caller need >>>> to provide non-zero budget when calling from NAPI context, and >>>> breaking this semantic will cause hard to debug problem, because >>>> _kfree_skb_defer() need to run in atomic context in order to push >>>> the skb to the particular cpu' napi_alloc_cache atomically. >>>> >>>> So add a in_softirq() debug checking in napi_consume_skb() to catch >>>> this kind of error. >>>> >>>> Suggested-by: Eric Dumazet <edumazet@google.com> >>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> >>> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index 1ba8f01..1834007 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) >>>> return; >>>> } >>>> >>>> + DEBUG_NET_WARN(!in_softirq(), >>>> + "%s is called with non-zero budget outside softirq context.\n", >>>> + __func__); >>> >>> Can't we use lockdep instead of defining our own knobs? >> >> From the first look, using the below seems better than defining our >> own knobs, because there is similar lockdep_assert_in_irq() checking >> already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING >> is not defined. >> >>> >>> Like this maybe? >>> >>> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h >>> index f5594879175a..5253a167d00c 100644 >>> --- a/include/linux/lockdep.h >>> +++ b/include/linux/lockdep.h >>> @@ -594,6 +594,14 @@ do { \ >>> this_cpu_read(hardirqs_enabled))); \ >>> } while (0) >>> >>> +#define lockdep_assert_in_softirq() \ >>> +do { \ >>> + WARN_ON_ONCE(__lockdep_enabled && \ >>> + (softirq_count() == 0 || \ >>> + this_cpu_read(hardirq_context))); \ >> >> Using in_softirq() seems more obvious then using softirq_count()? >> And there is below comment above avoiding the using of in_softirq(), maybe >> that is why you use softirq_count() directly here? >> "softirq_count() == 0" still mean we are not in the SoftIRQ context and >> BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled() >> is more obvious? > > Let's add Peter to the recipients to get his opinion. > > We have a per-cpu resource which is also accessed from BH (see > _kfree_skb_defer()). > > We're trying to come up with the correct lockdep incantation for it. Hi, Peter Any suggestion? > >> /* >> * Are we doing bottom half or hardware interrupt processing? >> * >> * in_irq() - We're in (hard) IRQ context >> * in_softirq() - We have BH disabled, or are processing softirqs >> * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled >> * in_serving_softirq() - We're in softirq context >> * in_nmi() - We're in NMI context >> * in_task() - We're in task context >> * >> * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really >> * should not be used in new code. >> */ >> >> >> Also, is there any particular reason we do the "this_cpu_read(hardirq_context)" >> checking? > > Accessing BH resources from a hard IRQ context would be a bug, we may > have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest > calls to _kfree_skb_defer(). In that case, maybe just call lockdep_assert_in_irq() is enough? > . >
On Wed, 18 Nov 2020 09:57:30 +0800 Yunsheng Lin wrote: > On 2020/11/3 3:41, Jakub Kicinski wrote: > > On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote: > >> On 2020/11/1 6:38, Jakub Kicinski wrote: > >>> On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote: > >>>> The current semantic for napi_consume_skb() is that caller need > >>>> to provide non-zero budget when calling from NAPI context, and > >>>> breaking this semantic will cause hard to debug problem, because > >>>> _kfree_skb_defer() need to run in atomic context in order to push > >>>> the skb to the particular cpu' napi_alloc_cache atomically. > >>>> > >>>> So add a in_softirq() debug checking in napi_consume_skb() to catch > >>>> this kind of error. > >>>> > >>>> Suggested-by: Eric Dumazet <edumazet@google.com> > >>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > >>> > >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > >>>> index 1ba8f01..1834007 100644 > >>>> --- a/net/core/skbuff.c > >>>> +++ b/net/core/skbuff.c > >>>> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) > >>>> return; > >>>> } > >>>> > >>>> + DEBUG_NET_WARN(!in_softirq(), > >>>> + "%s is called with non-zero budget outside softirq context.\n", > >>>> + __func__); > >>> > >>> Can't we use lockdep instead of defining our own knobs? > >> > >> From the first look, using the below seems better than defining our > >> own knobs, because there is similar lockdep_assert_in_irq() checking > >> already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING > >> is not defined. > >> > >>> > >>> Like this maybe? > >>> > >>> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h > >>> index f5594879175a..5253a167d00c 100644 > >>> --- a/include/linux/lockdep.h > >>> +++ b/include/linux/lockdep.h > >>> @@ -594,6 +594,14 @@ do { \ > >>> this_cpu_read(hardirqs_enabled))); \ > >>> } while (0) > >>> > >>> +#define lockdep_assert_in_softirq() \ > >>> +do { \ > >>> + WARN_ON_ONCE(__lockdep_enabled && \ > >>> + (softirq_count() == 0 || \ > >>> + this_cpu_read(hardirq_context))); \ > >> > >> Using in_softirq() seems more obvious then using softirq_count()? > >> And there is below comment above avoiding the using of in_softirq(), maybe > >> that is why you use softirq_count() directly here? > >> "softirq_count() == 0" still mean we are not in the SoftIRQ context and > >> BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled() > >> is more obvious? > > > > Let's add Peter to the recipients to get his opinion. > > > > We have a per-cpu resource which is also accessed from BH (see > > _kfree_skb_defer()). > > > > We're trying to come up with the correct lockdep incantation for it. > > Hi, Peter > Any suggestion? Let's just try lockdep_assert_in_softirq() and see if anyone complains. People are more likely to respond to a patch than a discussion. > >> /* > >> * Are we doing bottom half or hardware interrupt processing? > >> * > >> * in_irq() - We're in (hard) IRQ context > >> * in_softirq() - We have BH disabled, or are processing softirqs > >> * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled > >> * in_serving_softirq() - We're in softirq context > >> * in_nmi() - We're in NMI context > >> * in_task() - We're in task context > >> * > >> * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really > >> * should not be used in new code. > >> */ > >> > >> > >> Also, is there any particular reason we do the "this_cpu_read(hardirq_context)" > >> checking? > > > > Accessing BH resources from a hard IRQ context would be a bug, we may > > have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest > > calls to _kfree_skb_defer(). > > In that case, maybe just call lockdep_assert_in_irq() is enough? TBH the last sentence I wrote isn't clear even to me at this point ;D Maybe using just the macros from preempt.h - like this? #define lockdep_assert_in_softirq() \ do { \ WARN_ON_ONCE(__lockdep_enabled && \ (!in_softirq() || in_irq() || in_nmi()) \ } while (0) We know what we're doing so in_softirq() should be fine (famous last words).
On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote: > TBH the last sentence I wrote isn't clear even to me at this point ;D > > Maybe using just the macros from preempt.h - like this? > > #define lockdep_assert_in_softirq() \ > do { \ > WARN_ON_ONCE(__lockdep_enabled && \ > (!in_softirq() || in_irq() || in_nmi()) \ > } while (0) > > We know what we're doing so in_softirq() should be fine (famous last > words). So that's not actually using any lockdep state. But if that's what you need, I don't have any real complaints.
On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote: > On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote: > > > TBH the last sentence I wrote isn't clear even to me at this point ;D > > > > Maybe using just the macros from preempt.h - like this? > > > > #define lockdep_assert_in_softirq() \ > > do { \ > > WARN_ON_ONCE(__lockdep_enabled && \ > > (!in_softirq() || in_irq() || in_nmi()) \ > > } while (0) > > > > We know what we're doing so in_softirq() should be fine (famous last > > words). > > So that's not actually using any lockdep state. But if that's what you > need, I don't have any real complaints. Great, thanks! The motivation was to piggy back on lockdep rather than adding a one-off Kconfig knob for a check in the fast path in networking.
On 2020/11/19 0:26, Jakub Kicinski wrote: > On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote: >> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote: >> >>> TBH the last sentence I wrote isn't clear even to me at this point ;D >>> >>> Maybe using just the macros from preempt.h - like this? >>> >>> #define lockdep_assert_in_softirq() \ >>> do { \ >>> WARN_ON_ONCE(__lockdep_enabled && \ >>> (!in_softirq() || in_irq() || in_nmi()) \ >>> } while (0) One thing I am not so sure about is the different irq context indicator in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses current_thread_info()->preempt_count in preempt.h, if they are the same thing? >>> >>> We know what we're doing so in_softirq() should be fine (famous last >>> words). >> >> So that's not actually using any lockdep state. But if that's what you >> need, I don't have any real complaints. > > Great, thanks! > > The motivation was to piggy back on lockdep rather than adding a > one-off Kconfig knob for a check in the fast path in networking. > . >
On Thu, Nov 19, 2020 at 05:19:44PM +0800, Yunsheng Lin wrote: > On 2020/11/19 0:26, Jakub Kicinski wrote: > > On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote: > >> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote: > >> > >>> TBH the last sentence I wrote isn't clear even to me at this point ;D > >>> > >>> Maybe using just the macros from preempt.h - like this? > >>> > >>> #define lockdep_assert_in_softirq() \ > >>> do { \ > >>> WARN_ON_ONCE(__lockdep_enabled && \ > >>> (!in_softirq() || in_irq() || in_nmi()) \ > >>> } while (0) > > One thing I am not so sure about is the different irq context indicator > in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses > this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses > current_thread_info()->preempt_count in preempt.h, if they are the same > thing? Very close, for more regular code they should be the same.
On 2020/11/19 19:41, Peter Zijlstra wrote: > On Thu, Nov 19, 2020 at 05:19:44PM +0800, Yunsheng Lin wrote: >> On 2020/11/19 0:26, Jakub Kicinski wrote: >>> On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote: >>>> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote: >>>> >>>>> TBH the last sentence I wrote isn't clear even to me at this point ;D >>>>> >>>>> Maybe using just the macros from preempt.h - like this? >>>>> >>>>> #define lockdep_assert_in_softirq() \ >>>>> do { \ >>>>> WARN_ON_ONCE(__lockdep_enabled && \ >>>>> (!in_softirq() || in_irq() || in_nmi()) \ >>>>> } while (0) >> >> One thing I am not so sure about is the different irq context indicator >> in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses >> this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses >> current_thread_info()->preempt_count in preempt.h, if they are the same >> thing? > > Very close, for more regular code they should be the same. Thanks for clarifying. So I assmue the lockdep_assert_in_softirq() interface we want to add is regular code, right? > . >
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 964b494..8042bf1 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -5158,6 +5158,12 @@ do { \ }) #endif +#if defined(CONFIG_DEBUG_NET) +#define DEBUG_NET_WARN(condition, ...) WARN(condition, ##__VA_ARGS__) +#else +#define DEBUG_NET_WARN(condition, ...) +#endif + /* * The list of packet types we will receive (as opposed to discard) * and the routines to invoke. diff --git a/net/Kconfig b/net/Kconfig index d656716..82e69b0 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -459,6 +459,13 @@ config ETHTOOL_NETLINK netlink. It provides better extensibility and some new features, e.g. notification messages. +config DEBUG_NET + bool "Net debugging and diagnostics" + depends on DEBUG_KERNEL + default n + help + Say Y here to add some extra checks and diagnostics to networking. + endif # if NET # Used by archs to tell that they support BPF JIT compiler plus which flavour. diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 1ba8f01..1834007 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget) return; } + DEBUG_NET_WARN(!in_softirq(), + "%s is called with non-zero budget outside softirq context.\n", + __func__); + if (!skb_unref(skb)) return;
The current semantic for napi_consume_skb() is that caller need to provide non-zero budget when calling from NAPI context, and breaking this semantic will cause hard to debug problem, because _kfree_skb_defer() need to run in atomic context in order to push the skb to the particular cpu' napi_alloc_cache atomically. So add a in_softirq() debug checking in napi_consume_skb() to catch this kind of error. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> --- v1: drop RFC in the title --- include/linux/netdevice.h | 6 ++++++ net/Kconfig | 7 +++++++ net/core/skbuff.c | 4 ++++ 3 files changed, 17 insertions(+)