diff mbox series

[net] net/sched: cls_api: Fix lockup on flushing explicitly created chain

Message ID 20230612093426.2867183-1-vladbu@nvidia.com (mailing list archive)
State Accepted
Commit c9a82bec02c339cdda99b37c5e62b3b71fc4209c
Delegated to: Netdev Maintainers
Series [net] net/sched: cls_api: Fix lockup on flushing explicitly created chain

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 17 this patch: 17
netdev/cc_maintainers           fail     1 blamed authors not CCed: vladbu@mellanox.com; 2 maintainers not CCed: vladbu@mellanox.com edumazet@google.com
netdev/build_clang              success  Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 17 this patch: 17
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Vlad Buslov June 12, 2023, 9:34 a.m. UTC
Mingshuai Ren reports:

When a new chain is added using tc, a soft lockup alarm is
 generated after deleting the prio 0 filter of the chain. To reproduce
 the problem, perform the following steps:
(1) tc qdisc add dev eth0 root handle 1: htb default 1
(2) tc chain add dev eth0
(3) tc filter del dev eth0 chain 0 parent 1: prio 0
(4) tc filter add dev eth0 chain 0 parent 1:

Fix the issue by accounting for the additional reference held on chains
that are explicitly created by an RTM_NEWCHAIN message, as opposed to
implicitly by an RTM_NEWTFILTER message.

Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
---
 net/sched/cls_api.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Comments

Pedro Tammela June 12, 2023, 1:59 p.m. UTC | #1
On 12/06/2023 06:34, Vlad Buslov wrote:
> Mingshuai Ren reports:
> 
> When a new chain is added by using tc, one soft lockup alarm will be
>   generated after delete the prio 0 filter of the chain. To reproduce
>   the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
> 
> Fix the issue by accounting for additional reference to chains that are
> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
> RTM_NEWTFILTER message.
> 
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> ---
>   net/sched/cls_api.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)


Hi Vlad,

Thanks for taking a look.
Could you also carry over the tdc test, or ask Ren to post it in a
separate patch?

> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2621550bfddc..e4df96e133cd 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -659,8 +659,8 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>   {
>   	struct tcf_block *block = chain->block;
>   	const struct tcf_proto_ops *tmplt_ops;
> +	unsigned int refcnt, non_act_refcnt;
>   	bool free_block = false;
> -	unsigned int refcnt;
>   	void *tmplt_priv;
>   
>   	mutex_lock(&block->lock);
> @@ -680,13 +680,15 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>   	 * save these to temporary variables.
>   	 */
>   	refcnt = --chain->refcnt;
> +	non_act_refcnt = refcnt - chain->action_refcnt;
>   	tmplt_ops = chain->tmplt_ops;
>   	tmplt_priv = chain->tmplt_priv;
>   
> -	/* The last dropped non-action reference will trigger notification. */
> -	if (refcnt - chain->action_refcnt == 0 && !by_act) {
> -		tc_chain_notify_delete(tmplt_ops, tmplt_priv, chain->index,
> -				       block, NULL, 0, 0, false);
> +	if (non_act_refcnt == chain->explicitly_created && !by_act) {
> +		if (non_act_refcnt == 0)
> +			tc_chain_notify_delete(tmplt_ops, tmplt_priv,
> +					       chain->index, block, NULL, 0, 0,
> +					       false);
>   		/* Last reference to chain, no need to lock. */
>   		chain->flushing = false;
>   	}

Vlad Buslov June 12, 2023, 2:07 p.m. UTC | #2
On Mon 12 Jun 2023 at 10:59, Pedro Tammela <pctammela@mojatatu.com> wrote:
> On 12/06/2023 06:34, Vlad Buslov wrote:
>> Mingshuai Ren reports:
>> When a new chain is added by using tc, one soft lockup alarm will be
>>   generated after delete the prio 0 filter of the chain. To reproduce
>>   the problem, perform the following steps:
>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>> (2) tc chain add dev eth0
>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>> (4) tc filter add dev eth0 chain 0 parent 1:
>> Fix the issue by accounting for additional reference to chains that are
>> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
>> RTM_NEWTFILTER message.
>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during
>> chain flush")
>> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
>> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>> ---
>>   net/sched/cls_api.c | 12 +++++++-----
>>   1 file changed, 7 insertions(+), 5 deletions(-)
>
>
> Hi Vlad,
>
> Thanks for taking a look.
> Could you also carry over the tdc test or ask Ren to post in a separate patch?

Sure. I was planning to ask Mingshuai Ren to submit the new test as a
standalone patch after my fix has been accepted, since including his code
with my fix would require his explicit approval of the whole patch and
his Signed-off-by tag, AFAIK.

>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 2621550bfddc..e4df96e133cd 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -659,8 +659,8 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>>   {
>>   	struct tcf_block *block = chain->block;
>>   	const struct tcf_proto_ops *tmplt_ops;
>> +	unsigned int refcnt, non_act_refcnt;
>>   	bool free_block = false;
>> -	unsigned int refcnt;
>>   	void *tmplt_priv;
>>   
>>   	mutex_lock(&block->lock);
>> @@ -680,13 +680,15 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>>   	 * save these to temporary variables.
>>   	 */
>>   	refcnt = --chain->refcnt;
>> +	non_act_refcnt = refcnt - chain->action_refcnt;
>>   	tmplt_ops = chain->tmplt_ops;
>>   	tmplt_priv = chain->tmplt_priv;
>>   
>> -	/* The last dropped non-action reference will trigger notification. */
>> -	if (refcnt - chain->action_refcnt == 0 && !by_act) {
>> -		tc_chain_notify_delete(tmplt_ops, tmplt_priv, chain->index,
>> -				       block, NULL, 0, 0, false);
>> +	if (non_act_refcnt == chain->explicitly_created && !by_act) {
>> +		if (non_act_refcnt == 0)
>> +			tc_chain_notify_delete(tmplt_ops, tmplt_priv,
>> +					       chain->index, block, NULL, 0, 0,
>> +					       false);
>>   		/* Last reference to chain, no need to lock. */
>>   		chain->flushing = false;
>>   	}

renmingshuai June 12, 2023, 2:35 p.m. UTC | #3
>On Mon 12 Jun 2023 at 10:59, Pedro Tammela <pctammela@mojatatu.com> wrote:
>> On 12/06/2023 06:34, Vlad Buslov wrote:
>>> Mingshuai Ren reports:
>>> When a new chain is added by using tc, one soft lockup alarm will be
>>>   generated after delete the prio 0 filter of the chain. To reproduce
>>>   the problem, perform the following steps:
>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>> (2) tc chain add dev eth0
>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>> Fix the issue by accounting for additional reference to chains that are
>>> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
>>> RTM_NEWTFILTER message.
>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during
>>> chain flush")
>>> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
>>> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
>>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>>> ---
>>>   net/sched/cls_api.c | 12 +++++++-----
>>>   1 file changed, 7 insertions(+), 5 deletions(-)
>>
>>
>> Hi Vlad,
>>
>> Thanks for taking a look.
>> Could you also carry over the tdc test or ask Ren to post in a separate patch?
>
>Sure. I was planning to ask Mingshuai Ren to submit the new test as
>standalone patch after my fix has been accepted since including his code
>with my fix would require explicit approval of the whole patch and his
>Signed-off-by clause AFAIK.

OK. I will submit the new test as a standalone patch after your fix has been accepted.

patchwork-bot+netdevbpf@kernel.org June 15, 2023, 6:10 a.m. UTC | #4
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 12 Jun 2023 11:34:26 +0200 you wrote:
> Mingshuai Ren reports:
> 
> When a new chain is added by using tc, one soft lockup alarm will be
>  generated after delete the prio 0 filter of the chain. To reproduce
>  the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
> 
> [...]

Here is the summary with links:
  - [net] net/sched: cls_api: Fix lockup on flushing explicitly created chain
    https://git.kernel.org/netdev/net/c/c9a82bec02c3

You are awesome, thank you!

Patch

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..e4df96e133cd 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -659,8 +659,8 @@  static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
 {
 	struct tcf_block *block = chain->block;
 	const struct tcf_proto_ops *tmplt_ops;
+	unsigned int refcnt, non_act_refcnt;
 	bool free_block = false;
-	unsigned int refcnt;
 	void *tmplt_priv;
 
 	mutex_lock(&block->lock);
@@ -680,13 +680,15 @@  static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
 	 * save these to temporary variables.
 	 */
 	refcnt = --chain->refcnt;
+	non_act_refcnt = refcnt - chain->action_refcnt;
 	tmplt_ops = chain->tmplt_ops;
 	tmplt_priv = chain->tmplt_priv;
 
-	/* The last dropped non-action reference will trigger notification. */
-	if (refcnt - chain->action_refcnt == 0 && !by_act) {
-		tc_chain_notify_delete(tmplt_ops, tmplt_priv, chain->index,
-				       block, NULL, 0, 0, false);
+	if (non_act_refcnt == chain->explicitly_created && !by_act) {
+		if (non_act_refcnt == 0)
+			tc_chain_notify_delete(tmplt_ops, tmplt_priv,
+					       chain->index, block, NULL, 0, 0,
+					       false);
 		/* Last reference to chain, no need to lock. */
 		chain->flushing = false;
 	}
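
The effect of the changed condition can be checked in isolation with a small userspace model. The sketch below is not kernel code: struct chain_model and the two chain_put_*() helpers are hypothetical names introduced here, and only the fields read by the hunk above are modelled. Locking, the by_act bookkeeping done earlier in __tcf_chain_put(), and chain freeing are all left out. The return value stands in for "tc_chain_notify_delete() would have been called".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct tcf_chain. */
struct chain_model {
	unsigned int refcnt;
	unsigned int action_refcnt;
	bool explicitly_created;
	bool flushing;
};

/* Pre-patch condition: flushing is cleared only when the last
 * non-action reference is dropped.
 */
bool chain_put_original(struct chain_model *chain, bool by_act)
{
	unsigned int refcnt;
	bool notified = false;

	assert(chain->refcnt > 0);
	refcnt = --chain->refcnt;

	if (refcnt - chain->action_refcnt == 0 && !by_act) {
		notified = true;
		chain->flushing = false;
	}
	return notified;
}

/* Patched condition: the reference kept by an explicitly created
 * chain is accounted for, so flushing is also cleared when only that
 * reference remains. The notification is still sent only at zero.
 */
bool chain_put_patched(struct chain_model *chain, bool by_act)
{
	unsigned int refcnt, non_act_refcnt;
	bool notified = false;

	assert(chain->refcnt > 0);
	refcnt = --chain->refcnt;
	non_act_refcnt = refcnt - chain->action_refcnt;

	if (non_act_refcnt == chain->explicitly_created && !by_act) {
		if (non_act_refcnt == 0)
			notified = true;
		chain->flushing = false;
	}
	return notified;
}
```

As an illustration with assumed counts: take an explicitly created chain with refcnt = 2 (the explicit reference plus one temporary reference held by the flush request) and flushing = true. chain_put_original() leaves flushing set, because one non-action reference remains. chain_put_patched() clears it and sends no delete notification. For an implicitly created chain with refcnt = 1, the patched helper behaves like the original: the count reaches zero, the notification is sent, and flushing is cleared.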