Message ID | 20230612093426.2867183-1-vladbu@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Commit | c9a82bec02c339cdda99b37c5e62b3b71fc4209c |
Delegated to: | Netdev Maintainers |
Series | [net] net/sched: cls_api: Fix lockup on flushing explicitly created chain |
On 12/06/2023 06:34, Vlad Buslov wrote:
> Mingshuai Ren reports:
>
> When a new chain is added by using tc, one soft lockup alarm will be
> generated after delete the prio 0 filter of the chain. To reproduce
> the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
>
> Fix the issue by accounting for additional reference to chains that are
> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
> RTM_NEWTFILTER message.
>
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> ---
>  net/sched/cls_api.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)

Hi Vlad,

Thanks for taking a look.
Could you also carry over the tdc test or ask Ren to post in a separate patch?

>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2621550bfddc..e4df96e133cd 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -659,8 +659,8 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>  {
>  	struct tcf_block *block = chain->block;
>  	const struct tcf_proto_ops *tmplt_ops;
> +	unsigned int refcnt, non_act_refcnt;
>  	bool free_block = false;
> -	unsigned int refcnt;
>  	void *tmplt_priv;
>
>  	mutex_lock(&block->lock);
> @@ -680,13 +680,15 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
>  	 * save these to temporary variables.
>  	 */
>  	refcnt = --chain->refcnt;
> +	non_act_refcnt = refcnt - chain->action_refcnt;
>  	tmplt_ops = chain->tmplt_ops;
>  	tmplt_priv = chain->tmplt_priv;
>
> -	/* The last dropped non-action reference will trigger notification. */
> -	if (refcnt - chain->action_refcnt == 0 && !by_act) {
> -		tc_chain_notify_delete(tmplt_ops, tmplt_priv, chain->index,
> -				       block, NULL, 0, 0, false);
> +	if (non_act_refcnt == chain->explicitly_created && !by_act) {
> +		if (non_act_refcnt == 0)
> +			tc_chain_notify_delete(tmplt_ops, tmplt_priv,
> +					       chain->index, block, NULL, 0, 0,
> +					       false);
>  		/* Last reference to chain, no need to lock. */
>  		chain->flushing = false;
>  	}
On Mon 12 Jun 2023 at 10:59, Pedro Tammela <pctammela@mojatatu.com> wrote:
> On 12/06/2023 06:34, Vlad Buslov wrote:
>> Mingshuai Ren reports:
>> When a new chain is added by using tc, one soft lockup alarm will be
>> generated after delete the prio 0 filter of the chain. To reproduce
>> the problem, perform the following steps:
>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>> (2) tc chain add dev eth0
>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>> (4) tc filter add dev eth0 chain 0 parent 1:
>> Fix the issue by accounting for additional reference to chains that are
>> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
>> RTM_NEWTFILTER message.
>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during
>> chain flush")
>> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
>> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>> ---
>>  net/sched/cls_api.c | 12 +++++++-----
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>
>
> Hi Vlad,
>
> Thanks for taking a look.
> Could you also carry over the tdc test or ask Ren to post in a separate patch?

Sure. I was planning to ask Mingshuai Ren to submit the new test as
standalone patch after my fix has been accepted since including his code
with my fix would require explicit approval of the whole patch and his
Signed-off-by clause AFAIK.
>On Mon 12 Jun 2023 at 10:59, Pedro Tammela <pctammela@mojatatu.com> wrote:
>> On 12/06/2023 06:34, Vlad Buslov wrote:
>>> Mingshuai Ren reports:
>>> When a new chain is added by using tc, one soft lockup alarm will be
>>> generated after delete the prio 0 filter of the chain. To reproduce
>>> the problem, perform the following steps:
>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>> (2) tc chain add dev eth0
>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>> Fix the issue by accounting for additional reference to chains that are
>>> explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
>>> RTM_NEWTFILTER message.
>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during
>>> chain flush")
>>> Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
>>> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
>>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>>> ---
>>>  net/sched/cls_api.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>>
>> Hi Vlad,
>>
>> Thanks for taking a look.
>> Could you also carry over the tdc test or ask Ren to post in a separate patch?
>
>Sure. I was planning to ask Mingshuai Ren to submit the new test as
>standalone patch after my fix has been accepted since including his code
>with my fix would require explicit approval of the whole patch and his
>Signed-off-by clause AFAIK.

OK. I will submit the new test as a standalone patch after your fix has been accepted.
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 12 Jun 2023 11:34:26 +0200 you wrote:
> Mingshuai Ren reports:
>
> When a new chain is added by using tc, one soft lockup alarm will be
> generated after delete the prio 0 filter of the chain. To reproduce
> the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
>
> [...]

Here is the summary with links:
  - [net] net/sched: cls_api: Fix lockup on flushing explicitly created chain
    https://git.kernel.org/netdev/net/c/c9a82bec02c3

You are awesome, thank you!
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..e4df96e133cd 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -659,8 +659,8 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
 {
 	struct tcf_block *block = chain->block;
 	const struct tcf_proto_ops *tmplt_ops;
+	unsigned int refcnt, non_act_refcnt;
 	bool free_block = false;
-	unsigned int refcnt;
 	void *tmplt_priv;
 
 	mutex_lock(&block->lock);
@@ -680,13 +680,15 @@ static void __tcf_chain_put(struct tcf_chain *chain, bool by_act,
 	 * save these to temporary variables.
 	 */
 	refcnt = --chain->refcnt;
+	non_act_refcnt = refcnt - chain->action_refcnt;
 	tmplt_ops = chain->tmplt_ops;
 	tmplt_priv = chain->tmplt_priv;
 
-	/* The last dropped non-action reference will trigger notification. */
-	if (refcnt - chain->action_refcnt == 0 && !by_act) {
-		tc_chain_notify_delete(tmplt_ops, tmplt_priv, chain->index,
-				       block, NULL, 0, 0, false);
+	if (non_act_refcnt == chain->explicitly_created && !by_act) {
+		if (non_act_refcnt == 0)
+			tc_chain_notify_delete(tmplt_ops, tmplt_priv,
+					       chain->index, block, NULL, 0, 0,
+					       false);
 		/* Last reference to chain, no need to lock. */
 		chain->flushing = false;
 	}
Mingshuai Ren reports:

When a new chain is added by using tc, one soft lockup alarm will be
generated after delete the prio 0 filter of the chain. To reproduce
the problem, perform the following steps:
(1) tc qdisc add dev eth0 root handle 1: htb default 1
(2) tc chain add dev eth0
(3) tc filter del dev eth0 chain 0 parent 1: prio 0
(4) tc filter add dev eth0 chain 0 parent 1:

Fix the issue by accounting for additional reference to chains that are
explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
RTM_NEWTFILTER message.

Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
---
 net/sched/cls_api.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
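
For readers who want to see the reference accounting in isolation, here is a minimal userspace sketch of the old and new checks. It is not kernel code: `struct chain_model`, `put_old()` and `put_new()` are hypothetical names made up for illustration, only the fields that appear in the patched condition are modeled, and the `action_refcnt` bookkeeping for the `by_act` path is left out. The assumed starting state (an explicitly created chain holding one extra reference while a flush is in progress) is an interpretation of the commit message, and the idea that a stuck `flushing` flag keeps later filter insertions from succeeding is a reading of the Fixes: commit title rather than something the thread states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct tcf_chain. Only the
 * fields used by the patched condition are modeled. */
struct chain_model {
	unsigned int refcnt;
	unsigned int action_refcnt;
	bool explicitly_created;
	bool flushing;
};

/* Check before the patch: "flushing" is cleared only when no
 * non-action reference is left. Returns true if the delete
 * notification would have been sent. */
static bool put_old(struct chain_model *c, bool by_act)
{
	unsigned int refcnt;
	bool notified = false;

	assert(c->refcnt > 0);	/* sanity check for the model only */
	refcnt = --c->refcnt;

	if (refcnt - c->action_refcnt == 0 && !by_act) {
		notified = true;
		c->flushing = false;
	}
	return notified;
}

/* Check after the patch: an explicitly created chain keeps one extra
 * reference, so "flushing" is cleared when the non-action count drops
 * to that baseline; the notification still requires zero. */
static bool put_new(struct chain_model *c, bool by_act)
{
	unsigned int refcnt, non_act_refcnt;
	bool notified = false;

	assert(c->refcnt > 0);	/* sanity check for the model only */
	refcnt = --c->refcnt;
	non_act_refcnt = refcnt - c->action_refcnt;

	if (non_act_refcnt == c->explicitly_created && !by_act) {
		if (non_act_refcnt == 0)
			notified = true;
		c->flushing = false;
	}
	return notified;
}
```

Starting from `refcnt = 2`, `explicitly_created = true` and `flushing = true`, `put_old()` leaves `flushing` set because one reference remains, while `put_new()` clears it without sending a delete notification. For an implicitly created chain with `refcnt = 1`, both variants clear the flag and report the notification.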