diff mbox series

[net-next] net: add in_softirq() debug checking in napi_consume_skb()

Message ID 1603971288-4786-1-git-send-email-linyunsheng@huawei.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: add in_softirq() debug checking in napi_consume_skb() | expand

Commit Message

Yunsheng Lin Oct. 29, 2020, 11:34 a.m. UTC
The current semantic for napi_consume_skb() is that caller need
to provide non-zero budget when calling from NAPI context, and
breaking this semantic will cause hard to debug problem, because
_kfree_skb_defer() need to run in atomic context in order to push
the skb to the particular cpu' napi_alloc_cache atomically.

So add a in_softirq() debug checking in napi_consume_skb() to catch
this kind of error.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
v1: drop RFC in the title
---
 include/linux/netdevice.h | 6 ++++++
 net/Kconfig               | 7 +++++++
 net/core/skbuff.c         | 4 ++++
 3 files changed, 17 insertions(+)

Comments

Jakub Kicinski Oct. 31, 2020, 10:38 p.m. UTC | #1
On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote:
> The current semantic for napi_consume_skb() is that caller need
> to provide non-zero budget when calling from NAPI context, and
> breaking this semantic will cause hard to debug problem, because
> _kfree_skb_defer() need to run in atomic context in order to push
> the skb to the particular cpu' napi_alloc_cache atomically.
> 
> So add a in_softirq() debug checking in napi_consume_skb() to catch
> this kind of error.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 1ba8f01..1834007 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
>  		return;
>  	}
>  
> +	DEBUG_NET_WARN(!in_softirq(),
> +		       "%s is called with non-zero budget outside softirq context.\n",
> +		       __func__);

Can't we use lockdep instead of defining our own knobs?

Like this maybe?

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index f5594879175a..5253a167d00c 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -594,6 +594,14 @@ do {                                                                       \
                      this_cpu_read(hardirqs_enabled)));                \
 } while (0)
 
+#define lockdep_assert_in_softirq()                                    \
+do {                                                                   \
+       WARN_ON_ONCE(__lockdep_enabled                  &&              \
+                    (softirq_count() == 0              ||              \
+                     this_cpu_read(hardirq_context)));                 \
+} while (0)



>  	if (!skb_unref(skb))
>  		return;
>
Yunsheng Lin Nov. 2, 2020, 3:14 a.m. UTC | #2
On 2020/11/1 6:38, Jakub Kicinski wrote:
> On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote:
>> The current semantic for napi_consume_skb() is that caller need
>> to provide non-zero budget when calling from NAPI context, and
>> breaking this semantic will cause hard to debug problem, because
>> _kfree_skb_defer() need to run in atomic context in order to push
>> the skb to the particular cpu' napi_alloc_cache atomically.
>>
>> So add a in_softirq() debug checking in napi_consume_skb() to catch
>> this kind of error.
>>
>> Suggested-by: Eric Dumazet <edumazet@google.com>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> 
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 1ba8f01..1834007 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
>>  		return;
>>  	}
>>  
>> +	DEBUG_NET_WARN(!in_softirq(),
>> +		       "%s is called with non-zero budget outside softirq context.\n",
>> +		       __func__);
> 
> Can't we use lockdep instead of defining our own knobs?

From the first look, using the below seems better than defining our
own knobs, because there is similar lockdep_assert_in_irq() checking
already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING
is not defined.

> 
> Like this maybe?
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index f5594879175a..5253a167d00c 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -594,6 +594,14 @@ do {                                                                       \
>                       this_cpu_read(hardirqs_enabled)));                \
>  } while (0)
>  
> +#define lockdep_assert_in_softirq()                                    \
> +do {                                                                   \
> +       WARN_ON_ONCE(__lockdep_enabled                  &&              \
> +                    (softirq_count() == 0              ||              \
> +                     this_cpu_read(hardirq_context)));                 \

Using in_softirq() seems more obvious then using softirq_count()?
And there is below comment above avoiding the using of in_softirq(), maybe
that is why you use softirq_count() directly here?
"softirq_count() == 0" still mean we are not in the SoftIRQ context and
BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled()
is more obvious?

/*
 * Are we doing bottom half or hardware interrupt processing?
 *
 * in_irq()       - We're in (hard) IRQ context
 * in_softirq()   - We have BH disabled, or are processing softirqs
 * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
 * in_serving_softirq() - We're in softirq context
 * in_nmi()       - We're in NMI context
 * in_task()	  - We're in task context
 *
 * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
 *       should not be used in new code.
 */


Also, is there any particular reason we do the "this_cpu_read(hardirq_context)"
checking?

Thanks.

> +} while (0)
> 
> 
> 
>>  	if (!skb_unref(skb))
>>  		return;
>>  
> 
> .
>
Jakub Kicinski Nov. 2, 2020, 7:41 p.m. UTC | #3
On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote:
> On 2020/11/1 6:38, Jakub Kicinski wrote:
> > On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote:  
> >> The current semantic for napi_consume_skb() is that caller need
> >> to provide non-zero budget when calling from NAPI context, and
> >> breaking this semantic will cause hard to debug problem, because
> >> _kfree_skb_defer() need to run in atomic context in order to push
> >> the skb to the particular cpu' napi_alloc_cache atomically.
> >>
> >> So add a in_softirq() debug checking in napi_consume_skb() to catch
> >> this kind of error.
> >>
> >> Suggested-by: Eric Dumazet <edumazet@google.com>
> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>  
> >   
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index 1ba8f01..1834007 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
> >>  		return;
> >>  	}
> >>  
> >> +	DEBUG_NET_WARN(!in_softirq(),
> >> +		       "%s is called with non-zero budget outside softirq context.\n",
> >> +		       __func__);  
> > 
> > Can't we use lockdep instead of defining our own knobs?  
> 
> From the first look, using the below seems better than defining our
> own knobs, because there is similar lockdep_assert_in_irq() checking
> already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING
> is not defined.
> 
> > 
> > Like this maybe?
> > 
> > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> > index f5594879175a..5253a167d00c 100644
> > --- a/include/linux/lockdep.h
> > +++ b/include/linux/lockdep.h
> > @@ -594,6 +594,14 @@ do {                                                                       \
> >                       this_cpu_read(hardirqs_enabled)));                \
> >  } while (0)
> >  
> > +#define lockdep_assert_in_softirq()                                    \
> > +do {                                                                   \
> > +       WARN_ON_ONCE(__lockdep_enabled                  &&              \
> > +                    (softirq_count() == 0              ||              \
> > +                     this_cpu_read(hardirq_context)));                 \  
> 
> Using in_softirq() seems more obvious then using softirq_count()?
> And there is below comment above avoiding the using of in_softirq(), maybe
> that is why you use softirq_count() directly here?
> "softirq_count() == 0" still mean we are not in the SoftIRQ context and
> BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled()
> is more obvious?

Let's add Peter to the recipients to get his opinion.

We have a per-cpu resource which is also accessed from BH (see
_kfree_skb_defer()).

We're trying to come up with the correct lockdep incantation for it.

> /*
>  * Are we doing bottom half or hardware interrupt processing?
>  *
>  * in_irq()       - We're in (hard) IRQ context
>  * in_softirq()   - We have BH disabled, or are processing softirqs
>  * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
>  * in_serving_softirq() - We're in softirq context
>  * in_nmi()       - We're in NMI context
>  * in_task()	  - We're in task context
>  *
>  * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
>  *       should not be used in new code.
>  */
> 
> 
> Also, is there any particular reason we do the "this_cpu_read(hardirq_context)"
> checking?

Accessing BH resources from a hard IRQ context would be a bug, we may
have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest
calls to _kfree_skb_defer().
Yunsheng Lin Nov. 18, 2020, 1:57 a.m. UTC | #4
On 2020/11/3 3:41, Jakub Kicinski wrote:
> On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote:
>> On 2020/11/1 6:38, Jakub Kicinski wrote:
>>> On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote:  
>>>> The current semantic for napi_consume_skb() is that caller need
>>>> to provide non-zero budget when calling from NAPI context, and
>>>> breaking this semantic will cause hard to debug problem, because
>>>> _kfree_skb_defer() need to run in atomic context in order to push
>>>> the skb to the particular cpu' napi_alloc_cache atomically.
>>>>
>>>> So add a in_softirq() debug checking in napi_consume_skb() to catch
>>>> this kind of error.
>>>>
>>>> Suggested-by: Eric Dumazet <edumazet@google.com>
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>  
>>>   
>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>> index 1ba8f01..1834007 100644
>>>> --- a/net/core/skbuff.c
>>>> +++ b/net/core/skbuff.c
>>>> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
>>>>  		return;
>>>>  	}
>>>>  
>>>> +	DEBUG_NET_WARN(!in_softirq(),
>>>> +		       "%s is called with non-zero budget outside softirq context.\n",
>>>> +		       __func__);  
>>>
>>> Can't we use lockdep instead of defining our own knobs?  
>>
>> From the first look, using the below seems better than defining our
>> own knobs, because there is similar lockdep_assert_in_irq() checking
>> already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING
>> is not defined.
>>
>>>
>>> Like this maybe?
>>>
>>> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
>>> index f5594879175a..5253a167d00c 100644
>>> --- a/include/linux/lockdep.h
>>> +++ b/include/linux/lockdep.h
>>> @@ -594,6 +594,14 @@ do {                                                                       \
>>>                       this_cpu_read(hardirqs_enabled)));                \
>>>  } while (0)
>>>  
>>> +#define lockdep_assert_in_softirq()                                    \
>>> +do {                                                                   \
>>> +       WARN_ON_ONCE(__lockdep_enabled                  &&              \
>>> +                    (softirq_count() == 0              ||              \
>>> +                     this_cpu_read(hardirq_context)));                 \  
>>
>> Using in_softirq() seems more obvious then using softirq_count()?
>> And there is below comment above avoiding the using of in_softirq(), maybe
>> that is why you use softirq_count() directly here?
>> "softirq_count() == 0" still mean we are not in the SoftIRQ context and
>> BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled()
>> is more obvious?
> 
> Let's add Peter to the recipients to get his opinion.
> 
> We have a per-cpu resource which is also accessed from BH (see
> _kfree_skb_defer()).
> 
> We're trying to come up with the correct lockdep incantation for it.

Hi, Peter
	Any suggestion?

> 
>> /*
>>  * Are we doing bottom half or hardware interrupt processing?
>>  *
>>  * in_irq()       - We're in (hard) IRQ context
>>  * in_softirq()   - We have BH disabled, or are processing softirqs
>>  * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
>>  * in_serving_softirq() - We're in softirq context
>>  * in_nmi()       - We're in NMI context
>>  * in_task()	  - We're in task context
>>  *
>>  * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
>>  *       should not be used in new code.
>>  */
>>
>>
>> Also, is there any particular reason we do the "this_cpu_read(hardirq_context)"
>> checking?
> 
> Accessing BH resources from a hard IRQ context would be a bug, we may
> have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest
> calls to _kfree_skb_defer().

In that case, maybe just call lockdep_assert_in_irq() is enough?

> .
>
Jakub Kicinski Nov. 18, 2020, 3:43 p.m. UTC | #5
On Wed, 18 Nov 2020 09:57:30 +0800 Yunsheng Lin wrote:
> On 2020/11/3 3:41, Jakub Kicinski wrote:
> > On Mon, 2 Nov 2020 11:14:32 +0800 Yunsheng Lin wrote:  
> >> On 2020/11/1 6:38, Jakub Kicinski wrote:  
> >>> On Thu, 29 Oct 2020 19:34:48 +0800 Yunsheng Lin wrote:    
> >>>> The current semantic for napi_consume_skb() is that caller need
> >>>> to provide non-zero budget when calling from NAPI context, and
> >>>> breaking this semantic will cause hard to debug problem, because
> >>>> _kfree_skb_defer() need to run in atomic context in order to push
> >>>> the skb to the particular cpu' napi_alloc_cache atomically.
> >>>>
> >>>> So add a in_softirq() debug checking in napi_consume_skb() to catch
> >>>> this kind of error.
> >>>>
> >>>> Suggested-by: Eric Dumazet <edumazet@google.com>
> >>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>    
> >>>     
> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >>>> index 1ba8f01..1834007 100644
> >>>> --- a/net/core/skbuff.c
> >>>> +++ b/net/core/skbuff.c
> >>>> @@ -897,6 +897,10 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
> >>>>  		return;
> >>>>  	}
> >>>>  
> >>>> +	DEBUG_NET_WARN(!in_softirq(),
> >>>> +		       "%s is called with non-zero budget outside softirq context.\n",
> >>>> +		       __func__);    
> >>>
> >>> Can't we use lockdep instead of defining our own knobs?    
> >>
> >> From the first look, using the below seems better than defining our
> >> own knobs, because there is similar lockdep_assert_in_irq() checking
> >> already and lockdep_assert_in_*() is NULL-OP when CONFIG_PROVE_LOCKING
> >> is not defined.
> >>  
> >>>
> >>> Like this maybe?
> >>>
> >>> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> >>> index f5594879175a..5253a167d00c 100644
> >>> --- a/include/linux/lockdep.h
> >>> +++ b/include/linux/lockdep.h
> >>> @@ -594,6 +594,14 @@ do {                                                                       \
> >>>                       this_cpu_read(hardirqs_enabled)));                \
> >>>  } while (0)
> >>>  
> >>> +#define lockdep_assert_in_softirq()                                    \
> >>> +do {                                                                   \
> >>> +       WARN_ON_ONCE(__lockdep_enabled                  &&              \
> >>> +                    (softirq_count() == 0              ||              \
> >>> +                     this_cpu_read(hardirq_context)));                 \    
> >>
> >> Using in_softirq() seems more obvious then using softirq_count()?
> >> And there is below comment above avoiding the using of in_softirq(), maybe
> >> that is why you use softirq_count() directly here?
> >> "softirq_count() == 0" still mean we are not in the SoftIRQ context and
> >> BH is not disabled. right? Perhap lockdep_assert_in_softirq_or_bh_disabled()
> >> is more obvious?  
> > 
> > Let's add Peter to the recipients to get his opinion.
> > 
> > We have a per-cpu resource which is also accessed from BH (see
> > _kfree_skb_defer()).
> > 
> > We're trying to come up with the correct lockdep incantation for it.  
> 
> Hi, Peter
> 	Any suggestion?

Let's just try lockdep_assert_in_softirq() and see if anyone complains.
People are more likely to respond to a patch than a discussion.

> >> /*
> >>  * Are we doing bottom half or hardware interrupt processing?
> >>  *
> >>  * in_irq()       - We're in (hard) IRQ context
> >>  * in_softirq()   - We have BH disabled, or are processing softirqs
> >>  * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
> >>  * in_serving_softirq() - We're in softirq context
> >>  * in_nmi()       - We're in NMI context
> >>  * in_task()	  - We're in task context
> >>  *
> >>  * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
> >>  *       should not be used in new code.
> >>  */
> >>
> >>
> >> Also, is there any particular reason we do the "this_cpu_read(hardirq_context)"
> >> checking?  
> > 
> > Accessing BH resources from a hard IRQ context would be a bug, we may
> > have interrupted a BH, so AFAIU softirq_count() != 0, but we can nest
> > calls to _kfree_skb_defer().  
> 
> In that case, maybe just call lockdep_assert_in_irq() is enough?

TBH the last sentence I wrote isn't clear even to me at this point ;D

Maybe using just the macros from preempt.h - like this?

#define lockdep_assert_in_softirq()                                    \
do {                                                                   \
       WARN_ON_ONCE(__lockdep_enabled                  &&              \
                    (!in_softirq() || in_irq() || in_nmi())	\
} while (0)

We know what we're doing so in_softirq() should be fine (famous last
words).
Peter Zijlstra Nov. 18, 2020, 3:57 p.m. UTC | #6
On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote:

> TBH the last sentence I wrote isn't clear even to me at this point ;D
> 
> Maybe using just the macros from preempt.h - like this?
> 
> #define lockdep_assert_in_softirq()                                    \
> do {                                                                   \
>        WARN_ON_ONCE(__lockdep_enabled                  &&              \
>                     (!in_softirq() || in_irq() || in_nmi())	\
> } while (0)
> 
> We know what we're doing so in_softirq() should be fine (famous last
> words).

So that's not actually using any lockdep state. But if that's what you
need, I don't have any real complaints.
Jakub Kicinski Nov. 18, 2020, 4:26 p.m. UTC | #7
On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote:
> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote:
> 
> > TBH the last sentence I wrote isn't clear even to me at this point ;D
> > 
> > Maybe using just the macros from preempt.h - like this?
> > 
> > #define lockdep_assert_in_softirq()                                    \
> > do {                                                                   \
> >        WARN_ON_ONCE(__lockdep_enabled                  &&              \
> >                     (!in_softirq() || in_irq() || in_nmi())	\
> > } while (0)
> > 
> > We know what we're doing so in_softirq() should be fine (famous last
> > words).  
> 
> So that's not actually using any lockdep state. But if that's what you
> need, I don't have any real complaints.

Great, thanks! 

The motivation was to piggy back on lockdep rather than adding a
one-off Kconfig knob for a check in the fast path in networking.
Yunsheng Lin Nov. 19, 2020, 9:19 a.m. UTC | #8
On 2020/11/19 0:26, Jakub Kicinski wrote:
> On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote:
>> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote:
>>
>>> TBH the last sentence I wrote isn't clear even to me at this point ;D
>>>
>>> Maybe using just the macros from preempt.h - like this?
>>>
>>> #define lockdep_assert_in_softirq()                                    \
>>> do {                                                                   \
>>>        WARN_ON_ONCE(__lockdep_enabled                  &&              \
>>>                     (!in_softirq() || in_irq() || in_nmi())	\
>>> } while (0)

One thing I am not so sure about is the different irq context indicator
in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses
this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses
current_thread_info()->preempt_count in preempt.h, if they are the same
thing?

>>>
>>> We know what we're doing so in_softirq() should be fine (famous last
>>> words).  
>>
>> So that's not actually using any lockdep state. But if that's what you
>> need, I don't have any real complaints.
> 
> Great, thanks! 
> 
> The motivation was to piggy back on lockdep rather than adding a
> one-off Kconfig knob for a check in the fast path in networking.
> .
>
Peter Zijlstra Nov. 19, 2020, 11:41 a.m. UTC | #9
On Thu, Nov 19, 2020 at 05:19:44PM +0800, Yunsheng Lin wrote:
> On 2020/11/19 0:26, Jakub Kicinski wrote:
> > On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote:
> >> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote:
> >>
> >>> TBH the last sentence I wrote isn't clear even to me at this point ;D
> >>>
> >>> Maybe using just the macros from preempt.h - like this?
> >>>
> >>> #define lockdep_assert_in_softirq()                                    \
> >>> do {                                                                   \
> >>>        WARN_ON_ONCE(__lockdep_enabled                  &&              \
> >>>                     (!in_softirq() || in_irq() || in_nmi())	\
> >>> } while (0)
> 
> One thing I am not so sure about is the different irq context indicator
> in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses
> this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses
> current_thread_info()->preempt_count in preempt.h, if they are the same
> thing?

Very close, for more regular code they should be the same.
Yunsheng Lin Nov. 19, 2020, 12:29 p.m. UTC | #10
On 2020/11/19 19:41, Peter Zijlstra wrote:
> On Thu, Nov 19, 2020 at 05:19:44PM +0800, Yunsheng Lin wrote:
>> On 2020/11/19 0:26, Jakub Kicinski wrote:
>>> On Wed, 18 Nov 2020 16:57:57 +0100 Peter Zijlstra wrote:
>>>> On Wed, Nov 18, 2020 at 07:43:48AM -0800, Jakub Kicinski wrote:
>>>>
>>>>> TBH the last sentence I wrote isn't clear even to me at this point ;D
>>>>>
>>>>> Maybe using just the macros from preempt.h - like this?
>>>>>
>>>>> #define lockdep_assert_in_softirq()                                    \
>>>>> do {                                                                   \
>>>>>        WARN_ON_ONCE(__lockdep_enabled                  &&              \
>>>>>                     (!in_softirq() || in_irq() || in_nmi())	\
>>>>> } while (0)
>>
>> One thing I am not so sure about is the different irq context indicator
>> in preempt.h and lockdep.h, for example lockdep_assert_in_irq() uses
>> this_cpu_read(hardirq_context) in lockdep.h, and in_irq() uses
>> current_thread_info()->preempt_count in preempt.h, if they are the same
>> thing?
> 
> Very close, for more regular code they should be the same.

Thanks for clarifying.
So I assmue the lockdep_assert_in_softirq() interface we want to add
is regular code, right?

> .
>
diff mbox series

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 964b494..8042bf1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -5158,6 +5158,12 @@  do {								\
 })
 #endif
 
+#if defined(CONFIG_DEBUG_NET)
+#define DEBUG_NET_WARN(condition, ...)	WARN(condition, ##__VA_ARGS__)
+#else
+#define DEBUG_NET_WARN(condition, ...)
+#endif
+
 /*
  *	The list of packet types we will receive (as opposed to discard)
  *	and the routines to invoke.
diff --git a/net/Kconfig b/net/Kconfig
index d656716..82e69b0 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -459,6 +459,13 @@  config ETHTOOL_NETLINK
 	  netlink. It provides better extensibility and some new features,
 	  e.g. notification messages.
 
+config DEBUG_NET
+	bool "Net debugging and diagnostics"
+	depends on DEBUG_KERNEL
+	default n
+	help
+	  Say Y here to add some extra checks and diagnostics to networking.
+
 endif   # if NET
 
 # Used by archs to tell that they support BPF JIT compiler plus which flavour.
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 1ba8f01..1834007 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -897,6 +897,10 @@  void napi_consume_skb(struct sk_buff *skb, int budget)
 		return;
 	}
 
+	DEBUG_NET_WARN(!in_softirq(),
+		       "%s is called with non-zero budget outside softirq context.\n",
+		       __func__);
+
 	if (!skb_unref(skb))
 		return;