Message ID | 20230602103750.2290132-3-vladimir.oltean@nxp.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | Improve the taprio qdisc's relationship with its children |
Context | Check | Description |
---|---|---|
netdev/series_format | success | Posting correctly formatted |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/fixes_present | success | Fixes tag not required for -next series |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 8 this patch: 8 |
netdev/cc_maintainers | success | CCed 9 of 9 maintainers |
netdev/build_clang | success | Errors and warnings before: 8 this patch: 8 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/deprecated_api | success | None detected |
netdev/check_selftest | success | No net selftest shell script |
netdev/verify_fixes | success | No Fixes tag |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 8 this patch: 8 |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 55 lines checked |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 0 now: 0 |
On Fri, 2023-06-02 at 13:37 +0300, Vladimir Oltean wrote:
> The taprio qdisc currently lives in the following equilibrium.
>
> In the software scheduling case, taprio attaches itself to all TXQs,
> thus having a refcount of 1 + the number of TX queues. In this mode,
> q->qdiscs[] is not visible directly to the Qdisc API. The lifetime of
> the Qdiscs from this private array lasts until qdisc_destroy() ->
> taprio_destroy().
>
> In the fully offloaded case, the root taprio has a refcount of 1, and
> all child q->qdiscs[] also have a refcount of 1. The child q->qdiscs[]
> are visible to the Qdisc API (they are attached to the netdev TXQs
> directly), however taprio loses a reference to them very early - during
> qdisc_graft(parent==NULL) -> taprio_attach(). At that time, taprio frees
> the q->qdiscs[] array to not leak memory, but interestingly, it does not
> release a reference on these qdiscs because it doesn't effectively own
> them - they are created by taprio but owned by the Qdisc core, and will
> be freed by qdisc_graft(parent==NULL, new==NULL) -> qdisc_put(old) when
> the Qdisc is deleted or when the child Qdisc is replaced with something
> else.
>
> My interest is to change this equilibrium such that taprio also owns a
> reference on the q->qdiscs[] child Qdiscs for the lifetime of the root
> Qdisc, including in full offload mode. I want this because I would like
> taprio_leaf(), taprio_dump_class(), taprio_dump_class_stats() to have
> insight into q->qdiscs[] for the software scheduling mode - currently
> they look at dev_queue->qdisc_sleeping, which is, as mentioned, the same
> as the root taprio.
>
> The following set of changes is necessary:
> - don't free q->qdiscs[] early in taprio_attach(), free it late in
>   taprio_destroy() for consistency with software mode. But:
> - currently that's not possible, because taprio doesn't own a reference
>   on q->qdiscs[]. So hold that reference - once during the initial
>   attach() and once during subsequent graft() calls when the child is
>   changed.
> - always keep track of the current child in q->qdiscs[], even for full
>   offload mode, so that we free in taprio_destroy() what we should, and
>   not something stale.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  net/sched/sch_taprio.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index b1c611c72aa4..8807fc915b79 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -2138,23 +2138,20 @@ static void taprio_attach(struct Qdisc *sch)
>
>  			qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
>  			old = dev_graft_qdisc(dev_queue, qdisc);
> +			/* Keep refcount of q->qdiscs[ntx] at 2 */
> +			qdisc_refcount_inc(qdisc);
>  		} else {
>  			/* In software mode, attach the root taprio qdisc
>  			 * to all netdev TX queues, so that dev_qdisc_enqueue()
>  			 * goes through taprio_enqueue().
>  			 */
>  			old = dev_graft_qdisc(dev_queue, sch);
> +			/* Keep root refcount at 1 + num_tx_queues */
>  			qdisc_refcount_inc(sch);
>  		}
>  		if (old)
>  			qdisc_put(old);
>  	}
> -
> -	/* access to the child qdiscs is not needed in offload mode */
> -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> -		kfree(q->qdiscs);
> -		q->qdiscs = NULL;
> -	}
>  }
>
>  static struct netdev_queue *taprio_queue_get(struct Qdisc *sch,
> @@ -2183,15 +2180,24 @@ static int taprio_graft(struct Qdisc *sch, unsigned long cl,
>  	if (dev->flags & IFF_UP)
>  		dev_deactivate(dev);
>
> -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> +	/* In software mode, the root taprio qdisc is still the one attached to
> +	 * all netdev TX queues, and hence responsible for taprio_enqueue() to
> +	 * forward the skbs to the child qdiscs from the private q->qdiscs[]
> +	 * array. So only attach the new qdisc to the netdev queue in offload
> +	 * mode, where the enqueue must bypass taprio. However, save the
> +	 * reference to the new qdisc in the private array in both cases, to
> +	 * have an up-to-date reference to our children.
> +	 */
> +	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
>  		*old = dev_graft_qdisc(dev_queue, new);
> -	} else {
> +	else
>  		*old = q->qdiscs[cl - 1];
> -		q->qdiscs[cl - 1] = new;
> -	}
>
> -	if (new)
> +	q->qdiscs[cl - 1] = new;
> +	if (new) {
> +		qdisc_refcount_inc(new);
>  		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
> +	}
>

Isn't the above leaking a reference to old with something alike:

tc qdisc replace dev eth0 handle 8001: parent root stab overhead 24 taprio flags 0x2 #...
	# each q in q->qdiscs has refcnt == 2
tc qdisc replace dev eth0 parent 8001:8 cbs #...
	# -> taprio_graft(..., cbs, ...)
	# cbs refcnt is 2
	# 'old' refcnt decreases by 1, refcount will not reach 0?!?

kmemleak should be able to see that.

BTW, what about including your tests from the cover letter somewhere under tc-testing?

Thanks!

Paolo

On Tue, Jun 06, 2023 at 12:27:09PM +0200, Paolo Abeni wrote:
> On Fri, 2023-06-02 at 13:37 +0300, Vladimir Oltean wrote:
> > The taprio qdisc currently lives in the following equilibrium.
> >
> > In the software scheduling case, taprio attaches itself to all TXQs,
> > thus having a refcount of 1 + the number of TX queues. In this mode,
> > q->qdiscs[] is not visible directly to the Qdisc API. The lifetime of
> > the Qdiscs from this private array lasts until qdisc_destroy() ->
> > taprio_destroy().
> >
> > In the fully offloaded case, the root taprio has a refcount of 1, and
> > all child q->qdiscs[] also have a refcount of 1. The child q->qdiscs[]
> > are visible to the Qdisc API (they are attached to the netdev TXQs
> > directly), however taprio loses a reference to them very early - during
> > qdisc_graft(parent==NULL) -> taprio_attach(). At that time, taprio frees
> > the q->qdiscs[] array to not leak memory, but interestingly, it does not
> > release a reference on these qdiscs because it doesn't effectively own
> > them - they are created by taprio but owned by the Qdisc core, and will
> > be freed by qdisc_graft(parent==NULL, new==NULL) -> qdisc_put(old) when
> > the Qdisc is deleted or when the child Qdisc is replaced with something
> > else.
> >
> > My interest is to change this equilibrium such that taprio also owns a
> > reference on the q->qdiscs[] child Qdiscs for the lifetime of the root
> > Qdisc, including in full offload mode. I want this because I would like
> > taprio_leaf(), taprio_dump_class(), taprio_dump_class_stats() to have
> > insight into q->qdiscs[] for the software scheduling mode - currently
> > they look at dev_queue->qdisc_sleeping, which is, as mentioned, the same
> > as the root taprio.
> >
> > The following set of changes is necessary:
> > - don't free q->qdiscs[] early in taprio_attach(), free it late in
> >   taprio_destroy() for consistency with software mode. But:
> > - currently that's not possible, because taprio doesn't own a reference
> >   on q->qdiscs[]. So hold that reference - once during the initial
> >   attach() and once during subsequent graft() calls when the child is
> >   changed.
> > - always keep track of the current child in q->qdiscs[], even for full
> >   offload mode, so that we free in taprio_destroy() what we should, and
> >   not something stale.
> >
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> >  net/sched/sch_taprio.c | 28 +++++++++++++++++-----------
> >  1 file changed, 17 insertions(+), 11 deletions(-)
> >
> > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> > index b1c611c72aa4..8807fc915b79 100644
> > --- a/net/sched/sch_taprio.c
> > +++ b/net/sched/sch_taprio.c
> > @@ -2138,23 +2138,20 @@ static void taprio_attach(struct Qdisc *sch)
> >
> >  			qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
> >  			old = dev_graft_qdisc(dev_queue, qdisc);
> > +			/* Keep refcount of q->qdiscs[ntx] at 2 */
> > +			qdisc_refcount_inc(qdisc);
> >  		} else {
> >  			/* In software mode, attach the root taprio qdisc
> >  			 * to all netdev TX queues, so that dev_qdisc_enqueue()
> >  			 * goes through taprio_enqueue().
> >  			 */
> >  			old = dev_graft_qdisc(dev_queue, sch);
> > +			/* Keep root refcount at 1 + num_tx_queues */
> >  			qdisc_refcount_inc(sch);
> >  		}
> >  		if (old)
> >  			qdisc_put(old);
> >  	}
> > -
> > -	/* access to the child qdiscs is not needed in offload mode */
> > -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> > -		kfree(q->qdiscs);
> > -		q->qdiscs = NULL;
> > -	}
> >  }
> >
> >  static struct netdev_queue *taprio_queue_get(struct Qdisc *sch,
> > @@ -2183,15 +2180,24 @@ static int taprio_graft(struct Qdisc *sch, unsigned long cl,
> >  	if (dev->flags & IFF_UP)
> >  		dev_deactivate(dev);
> >
> > -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> > +	/* In software mode, the root taprio qdisc is still the one attached to
> > +	 * all netdev TX queues, and hence responsible for taprio_enqueue() to
> > +	 * forward the skbs to the child qdiscs from the private q->qdiscs[]
> > +	 * array. So only attach the new qdisc to the netdev queue in offload
> > +	 * mode, where the enqueue must bypass taprio. However, save the
> > +	 * reference to the new qdisc in the private array in both cases, to
> > +	 * have an up-to-date reference to our children.
> > +	 */
> > +	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
> >  		*old = dev_graft_qdisc(dev_queue, new);
> > -	} else {
> > +	else
> >  		*old = q->qdiscs[cl - 1];
> > -		q->qdiscs[cl - 1] = new;
> > -	}
> >
> > -	if (new)
> > +	q->qdiscs[cl - 1] = new;
> > +	if (new) {
> > +		qdisc_refcount_inc(new);
> >  		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
> > +	}
> >
> Isn't the above leaking a reference to old with something alike:
>
> tc qdisc replace dev eth0 handle 8001: parent root stab overhead 24 taprio flags 0x2 #...
> 	# each q in q->qdiscs has refcnt == 2
> tc qdisc replace dev eth0 parent 8001:8 cbs #...
> 	# -> taprio_graft(..., cbs, ...)
> 	# cbs refcnt is 2
> 	# 'old' refcnt decreases by 1, refcount will not reach 0?!?
>
> kmemleak should be able to see that.

You're right - in full offload mode, the refcount of "old" (pfifo, parent 8001:8)
does not drop to 0 after grafting something else to 8001:8.

I believe this other implementation should work in all cases.

static int taprio_graft(struct Qdisc *sch, unsigned long cl,
			struct Qdisc *new, struct Qdisc **old,
			struct netlink_ext_ack *extack)
{
	struct taprio_sched *q = qdisc_priv(sch);
	struct net_device *dev = qdisc_dev(sch);
	struct netdev_queue *dev_queue = taprio_queue_get(sch, cl);

	if (!dev_queue)
		return -EINVAL;

	if (dev->flags & IFF_UP)
		dev_deactivate(dev);

	/* In offload mode, the child Qdisc is directly attached to the netdev
	 * TX queue, and thus, we need to keep its refcount elevated in order
	 * to counteract qdisc_graft()'s call to qdisc_put() once per TX queue.
	 * However, save the reference to the new qdisc in the private array in
	 * both software and offload cases, to have an up-to-date reference to
	 * our children.
	 */
	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
		*old = dev_graft_qdisc(dev_queue, new);
		if (new)
			qdisc_refcount_inc(new);
		if (*old)
			qdisc_put(*old);
	} else {
		*old = q->qdiscs[cl - 1];
	}

	q->qdiscs[cl - 1] = new;
	if (new)
		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;

	if (dev->flags & IFF_UP)
		dev_activate(dev);

	return 0;
}

> BTW, what about including your tests from the cover letter somewhere under tc-testing?

I don't know about that. Does it involve adding taprio hw offload to netdevsim,
so that both code paths are covered?

To be honest with you, I never ran tdc, and I worry about what I may find
there, unrelated to what I'm being asked to add (at a cursory glance I see
overlapping TXQs in taprio.json), and which has the possibility of turning
this into a huge time pit.

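The refcount arithmetic discussed in this exchange can be checked with a small user-space model. This is only a sketch: `toy_qdisc`, `toy_class`, `toy_attach()`, `toy_graft()` and `toy_replace_child()` are hypothetical names invented for illustration, not kernel APIs, and the model assumes that the Qdisc core drops exactly one reference on `old` after the graft callback returns.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct Qdisc: only the reference count matters here. */
struct toy_qdisc {
	int refcnt;
};

/* One offloaded traffic class: the TXQ pointer plus taprio's private slot. */
struct toy_class {
	struct toy_qdisc *txq;	/* models the qdisc grafted on the netdev TXQ */
	struct toy_qdisc *priv;	/* models q->qdiscs[cl - 1] */
};

/* Drop one reference; a count of 0 means "would be destroyed". */
void toy_put(struct toy_qdisc *q)
{
	assert(q->refcnt > 0);
	q->refcnt--;
}

/* Models the patched taprio_attach() in offload mode: refcount goes 1 -> 2. */
void toy_attach(struct toy_class *c, struct toy_qdisc *child)
{
	child->refcnt = 1;	/* as if freshly created */
	c->txq = child;
	c->priv = child;
	child->refcnt++;	/* the extra reference taprio now holds */
}

/* Models taprio_graft() in offload mode; 'revised' selects the second version. */
struct toy_qdisc *toy_graft(struct toy_class *c, struct toy_qdisc *new,
			    int revised)
{
	struct toy_qdisc *old = c->txq;

	c->txq = new;
	c->priv = new;
	if (new)
		new->refcnt++;
	if (revised && old)
		toy_put(old);	/* release the reference taprio itself held */
	return old;
}

/* Models the core: call graft, then put 'old' once.
 * Returns the final refcount of the replaced child.
 */
int toy_replace_child(int revised)
{
	struct toy_qdisc pfifo = { .refcnt = 0 };
	struct toy_qdisc cbs = { .refcnt = 1 };
	struct toy_class c;
	struct toy_qdisc *old;

	toy_attach(&c, &pfifo);
	old = toy_graft(&c, &cbs, revised);
	toy_put(old);
	return old->refcnt;
}
```

Under these assumptions, `toy_replace_child(0)` leaves the replaced child with one reference still outstanding (the leak described above), while `toy_replace_child(1)` brings it down to zero.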
On Tue, 2023-06-06 at 18:56 +0300, Vladimir Oltean wrote:
> On Tue, Jun 06, 2023 at 12:27:09PM +0200, Paolo Abeni wrote:
> > On Fri, 2023-06-02 at 13:37 +0300, Vladimir Oltean wrote:
> > > The taprio qdisc currently lives in the following equilibrium.
> > >
> > > In the software scheduling case, taprio attaches itself to all TXQs,
> > > thus having a refcount of 1 + the number of TX queues. In this mode,
> > > q->qdiscs[] is not visible directly to the Qdisc API. The lifetime of
> > > the Qdiscs from this private array lasts until qdisc_destroy() ->
> > > taprio_destroy().
> > >
> > > In the fully offloaded case, the root taprio has a refcount of 1, and
> > > all child q->qdiscs[] also have a refcount of 1. The child q->qdiscs[]
> > > are visible to the Qdisc API (they are attached to the netdev TXQs
> > > directly), however taprio loses a reference to them very early - during
> > > qdisc_graft(parent==NULL) -> taprio_attach(). At that time, taprio frees
> > > the q->qdiscs[] array to not leak memory, but interestingly, it does not
> > > release a reference on these qdiscs because it doesn't effectively own
> > > them - they are created by taprio but owned by the Qdisc core, and will
> > > be freed by qdisc_graft(parent==NULL, new==NULL) -> qdisc_put(old) when
> > > the Qdisc is deleted or when the child Qdisc is replaced with something
> > > else.
> > >
> > > My interest is to change this equilibrium such that taprio also owns a
> > > reference on the q->qdiscs[] child Qdiscs for the lifetime of the root
> > > Qdisc, including in full offload mode. I want this because I would like
> > > taprio_leaf(), taprio_dump_class(), taprio_dump_class_stats() to have
> > > insight into q->qdiscs[] for the software scheduling mode - currently
> > > they look at dev_queue->qdisc_sleeping, which is, as mentioned, the same
> > > as the root taprio.
> > >
> > > The following set of changes is necessary:
> > > - don't free q->qdiscs[] early in taprio_attach(), free it late in
> > >   taprio_destroy() for consistency with software mode. But:
> > > - currently that's not possible, because taprio doesn't own a reference
> > >   on q->qdiscs[]. So hold that reference - once during the initial
> > >   attach() and once during subsequent graft() calls when the child is
> > >   changed.
> > > - always keep track of the current child in q->qdiscs[], even for full
> > >   offload mode, so that we free in taprio_destroy() what we should, and
> > >   not something stale.
> > >
> > > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > ---
> > >  net/sched/sch_taprio.c | 28 +++++++++++++++++-----------
> > >  1 file changed, 17 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> > > index b1c611c72aa4..8807fc915b79 100644
> > > --- a/net/sched/sch_taprio.c
> > > +++ b/net/sched/sch_taprio.c
> > > @@ -2138,23 +2138,20 @@ static void taprio_attach(struct Qdisc *sch)
> > >
> > >  			qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
> > >  			old = dev_graft_qdisc(dev_queue, qdisc);
> > > +			/* Keep refcount of q->qdiscs[ntx] at 2 */
> > > +			qdisc_refcount_inc(qdisc);
> > >  		} else {
> > >  			/* In software mode, attach the root taprio qdisc
> > >  			 * to all netdev TX queues, so that dev_qdisc_enqueue()
> > >  			 * goes through taprio_enqueue().
> > >  			 */
> > >  			old = dev_graft_qdisc(dev_queue, sch);
> > > +			/* Keep root refcount at 1 + num_tx_queues */
> > >  			qdisc_refcount_inc(sch);
> > >  		}
> > >  		if (old)
> > >  			qdisc_put(old);
> > >  	}
> > > -
> > > -	/* access to the child qdiscs is not needed in offload mode */
> > > -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> > > -		kfree(q->qdiscs);
> > > -		q->qdiscs = NULL;
> > > -	}
> > >  }
> > >
> > >  static struct netdev_queue *taprio_queue_get(struct Qdisc *sch,
> > > @@ -2183,15 +2180,24 @@ static int taprio_graft(struct Qdisc *sch, unsigned long cl,
> > >  	if (dev->flags & IFF_UP)
> > >  		dev_deactivate(dev);
> > >
> > > -	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> > > +	/* In software mode, the root taprio qdisc is still the one attached to
> > > +	 * all netdev TX queues, and hence responsible for taprio_enqueue() to
> > > +	 * forward the skbs to the child qdiscs from the private q->qdiscs[]
> > > +	 * array. So only attach the new qdisc to the netdev queue in offload
> > > +	 * mode, where the enqueue must bypass taprio. However, save the
> > > +	 * reference to the new qdisc in the private array in both cases, to
> > > +	 * have an up-to-date reference to our children.
> > > +	 */
> > > +	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
> > >  		*old = dev_graft_qdisc(dev_queue, new);
> > > -	} else {
> > > +	else
> > >  		*old = q->qdiscs[cl - 1];
> > > -		q->qdiscs[cl - 1] = new;
> > > -	}
> > >
> > > -	if (new)
> > > +	q->qdiscs[cl - 1] = new;
> > > +	if (new) {
> > > +		qdisc_refcount_inc(new);
> > >  		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
> > > +	}
> > >
> > Isn't the above leaking a reference to old with something alike:
> >
> > tc qdisc replace dev eth0 handle 8001: parent root stab overhead 24 taprio flags 0x2 #...
> > 	# each q in q->qdiscs has refcnt == 2
> > tc qdisc replace dev eth0 parent 8001:8 cbs #...
> > 	# -> taprio_graft(..., cbs, ...)
> > 	# cbs refcnt is 2
> > 	# 'old' refcnt decreases by 1, refcount will not reach 0?!?
> >
> > kmemleak should be able to see that.
>
> You're right - in full offload mode, the refcount of "old" (pfifo, parent 8001:8)
> does not drop to 0 after grafting something else to 8001:8.
>
> I believe this other implementation should work in all cases.
>
> static int taprio_graft(struct Qdisc *sch, unsigned long cl,
> 			struct Qdisc *new, struct Qdisc **old,
> 			struct netlink_ext_ack *extack)
> {
> 	struct taprio_sched *q = qdisc_priv(sch);
> 	struct net_device *dev = qdisc_dev(sch);
> 	struct netdev_queue *dev_queue = taprio_queue_get(sch, cl);
>
> 	if (!dev_queue)
> 		return -EINVAL;
>
> 	if (dev->flags & IFF_UP)
> 		dev_deactivate(dev);
>
> 	/* In offload mode, the child Qdisc is directly attached to the netdev
> 	 * TX queue, and thus, we need to keep its refcount elevated in order
> 	 * to counteract qdisc_graft()'s call to qdisc_put() once per TX queue.
> 	 * However, save the reference to the new qdisc in the private array in
> 	 * both software and offload cases, to have an up-to-date reference to
> 	 * our children.
> 	 */
> 	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> 		*old = dev_graft_qdisc(dev_queue, new);
> 		if (new)
> 			qdisc_refcount_inc(new);
> 		if (*old)
> 			qdisc_put(*old);
> 	} else {
> 		*old = q->qdiscs[cl - 1];
> 	}

Perhaps the above chunk could be:

	*old = q->qdiscs[cl - 1];
	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
		WARN_ON_ONCE(dev_graft_qdisc(dev_queue, new) != *old);
		if (new)
			qdisc_refcount_inc(new);
		if (*old)
			qdisc_put(*old);
	}

(boldly assuming I'm not completely lost, which looks a wild bet ;)

> 	q->qdiscs[cl - 1] = new;
> 	if (new)
> 		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
>
> 	if (dev->flags & IFF_UP)
> 		dev_activate(dev);
>
> 	return 0;
> }
>
> > BTW, what about including your tests from the cover letter somewhere under tc-testing?
>
> I don't know about that. Does it involve adding taprio hw offload to netdevsim,
> so that both code paths are covered?

I guess I underlooked the needed effort and we could live without new
tests here.

Cheers,

Paolo

On Wed, Jun 07, 2023 at 12:05:19PM +0200, Paolo Abeni wrote:
> Perhaps the above chunk could be:
>
> 	*old = q->qdiscs[cl - 1];
> 	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
> 		WARN_ON_ONCE(dev_graft_qdisc(dev_queue, new) != *old);
> 		if (new)
> 			qdisc_refcount_inc(new);
> 		if (*old)
> 			qdisc_put(*old);
> 	}
>
> (boldly assuming I'm not completely lost, which looks a wild bet ;)

Yeah, could be like that. In full offload mode, q->qdiscs[cl - 1] is also
what's grafted to the TXQ, so the WARN_ON() would be indicative of a serious
bug if it triggered.

> > > BTW, what about including your tests from the cover letter somewhere under tc-testing?
> >
> > I don't know about that. Does it involve adding taprio hw offload to netdevsim,
> > so that both code paths are covered?
>
> I guess I underlooked the needed effort and we could live without new
> tests here.

Let's see how the discussion progresses.

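The invariant behind the suggested WARN_ON_ONCE() (in full offload mode, the private array slot and the qdisc grafted on the TXQ are the same object) can be illustrated with a user-space sketch. All names here (`sketch_qdisc`, `sketch_class`, `sketch_graft_checked()`, `sketch_invariant_demo()`) are hypothetical and exist only for this illustration; the return value stands in for the warning, which is an assumption about how one might model it rather than how the kernel reports it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct Qdisc. */
struct sketch_qdisc {
	int refcnt;
};

/* One offloaded traffic class. */
struct sketch_class {
	struct sketch_qdisc *txq;	/* qdisc currently grafted on the TXQ */
	struct sketch_qdisc *priv;	/* models q->qdiscs[cl - 1] */
};

/* Models the suggested offload-mode chunk: take 'old' from the private
 * array, swap the TXQ pointer, and report whether the two disagreed.
 * Returns nonzero where the kernel code would hit WARN_ON_ONCE().
 */
int sketch_graft_checked(struct sketch_class *c, struct sketch_qdisc *new,
			 struct sketch_qdisc **old)
{
	struct sketch_qdisc *grafted_old;

	assert(c != NULL && old != NULL);

	*old = c->priv;
	grafted_old = c->txq;	/* what a graft on the TXQ would hand back */
	c->txq = new;

	if (new)
		new->refcnt++;
	if (*old)
		(*old)->refcnt--;

	c->priv = new;
	return grafted_old != *old;
}

/* Runs one graft; 'corrupt' makes the private slot point at a stale qdisc. */
int sketch_invariant_demo(int corrupt)
{
	struct sketch_qdisc a = { .refcnt = 2 };
	struct sketch_qdisc stale = { .refcnt = 1 };
	struct sketch_qdisc b = { .refcnt = 1 };
	struct sketch_class c = { .txq = &a, .priv = corrupt ? &stale : &a };
	struct sketch_qdisc *old;

	return sketch_graft_checked(&c, &b, &old);
}
```

With a consistent class, `sketch_invariant_demo(0)` reports no mismatch; with a deliberately stale private slot, `sketch_invariant_demo(1)` reports one, which is the situation described above as a serious bug.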
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index b1c611c72aa4..8807fc915b79 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -2138,23 +2138,20 @@ static void taprio_attach(struct Qdisc *sch)
 
 			qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
 			old = dev_graft_qdisc(dev_queue, qdisc);
+			/* Keep refcount of q->qdiscs[ntx] at 2 */
+			qdisc_refcount_inc(qdisc);
 		} else {
 			/* In software mode, attach the root taprio qdisc
 			 * to all netdev TX queues, so that dev_qdisc_enqueue()
 			 * goes through taprio_enqueue().
 			 */
 			old = dev_graft_qdisc(dev_queue, sch);
+			/* Keep root refcount at 1 + num_tx_queues */
 			qdisc_refcount_inc(sch);
 		}
 		if (old)
 			qdisc_put(old);
 	}
-
-	/* access to the child qdiscs is not needed in offload mode */
-	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
-		kfree(q->qdiscs);
-		q->qdiscs = NULL;
-	}
 }
 
 static struct netdev_queue *taprio_queue_get(struct Qdisc *sch,
@@ -2183,15 +2180,24 @@ static int taprio_graft(struct Qdisc *sch, unsigned long cl,
 	if (dev->flags & IFF_UP)
 		dev_deactivate(dev);
 
-	if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
+	/* In software mode, the root taprio qdisc is still the one attached to
+	 * all netdev TX queues, and hence responsible for taprio_enqueue() to
+	 * forward the skbs to the child qdiscs from the private q->qdiscs[]
+	 * array. So only attach the new qdisc to the netdev queue in offload
+	 * mode, where the enqueue must bypass taprio. However, save the
+	 * reference to the new qdisc in the private array in both cases, to
+	 * have an up-to-date reference to our children.
+	 */
+	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
 		*old = dev_graft_qdisc(dev_queue, new);
-	} else {
+	else
 		*old = q->qdiscs[cl - 1];
-		q->qdiscs[cl - 1] = new;
-	}
 
-	if (new)
+	q->qdiscs[cl - 1] = new;
+	if (new) {
+		qdisc_refcount_inc(new);
 		new->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
+	}
 
 	if (dev->flags & IFF_UP)
 		dev_activate(dev);

The taprio qdisc currently lives in the following equilibrium.

In the software scheduling case, taprio attaches itself to all TXQs,
thus having a refcount of 1 + the number of TX queues. In this mode,
q->qdiscs[] is not visible directly to the Qdisc API. The lifetime of
the Qdiscs from this private array lasts until qdisc_destroy() ->
taprio_destroy().

In the fully offloaded case, the root taprio has a refcount of 1, and
all child q->qdiscs[] also have a refcount of 1. The child q->qdiscs[]
are visible to the Qdisc API (they are attached to the netdev TXQs
directly), however taprio loses a reference to them very early - during
qdisc_graft(parent==NULL) -> taprio_attach(). At that time, taprio frees
the q->qdiscs[] array to not leak memory, but interestingly, it does not
release a reference on these qdiscs because it doesn't effectively own
them - they are created by taprio but owned by the Qdisc core, and will
be freed by qdisc_graft(parent==NULL, new==NULL) -> qdisc_put(old) when
the Qdisc is deleted or when the child Qdisc is replaced with something
else.

My interest is to change this equilibrium such that taprio also owns a
reference on the q->qdiscs[] child Qdiscs for the lifetime of the root
Qdisc, including in full offload mode. I want this because I would like
taprio_leaf(), taprio_dump_class(), taprio_dump_class_stats() to have
insight into q->qdiscs[] for the software scheduling mode - currently
they look at dev_queue->qdisc_sleeping, which is, as mentioned, the same
as the root taprio.

The following set of changes is necessary:
- don't free q->qdiscs[] early in taprio_attach(), free it late in
  taprio_destroy() for consistency with software mode. But:
- currently that's not possible, because taprio doesn't own a reference
  on q->qdiscs[]. So hold that reference - once during the initial
  attach() and once during subsequent graft() calls when the child is
  changed.
- always keep track of the current child in q->qdiscs[], even for full
  offload mode, so that we free in taprio_destroy() what we should, and
  not something stale.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/sched/sch_taprio.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)
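
The equilibrium described in this commit message can be tabulated with a small user-space model. This is a sketch under stated assumptions: `model_counts` and `model_attach()` are hypothetical names, the child refcount of 1 in software mode is assumed (the message only states it for offload mode), and `patched` selects the behaviour this patch proposes for taprio_attach().

```c
#include <assert.h>

/* Refcounts right after taprio_attach() has run, in this toy model. */
struct model_counts {
	int root;	/* refcount of the root taprio qdisc */
	int child;	/* refcount of each child in q->qdiscs[] */
};

/* Computes the post-attach refcounts for the given mode.
 * full_offload: nonzero for full offload, zero for software scheduling.
 * patched: nonzero to model the extra child reference taken by this patch.
 */
struct model_counts model_attach(int num_txqs, int full_offload, int patched)
{
	/* Both start at 1, as if freshly created (assumption for children). */
	struct model_counts c = { .root = 1, .child = 1 };

	assert(num_txqs > 0);

	if (full_offload) {
		/* Each child is grafted on exactly one TXQ, so with the
		 * patch it gains exactly one extra reference.
		 */
		if (patched)
			c.child += 1;
	} else {
		/* The root is grafted on every TXQ and takes one reference
		 * per TXQ; children stay private to taprio.
		 */
		c.root += num_txqs;
	}
	return c;
}
```

For an 8-queue device, the model gives a root refcount of 9 in software mode, and in full offload mode a child refcount of 1 before the patch and 2 after it, matching the numbers quoted in the message and in the added code comments.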