diff mbox series

[v3,17/41] net: ieee802154: at86rf230: Call the complete helper when a transmission is over

Message ID 20220117115440.60296-18-miquel.raynal@bootlin.com (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show
Series IEEE 802.15.4 scan support | expand

Commit Message

Miquel Raynal Jan. 17, 2022, 11:54 a.m. UTC
ieee802154_xmit_complete() is the right helper to call when a
transmission is over. The fact that it completed or not is not really a
question, but drivers must tell the core that the completion is over,
even if it was canceled. Do not call ieee802154_wake_queue() manually,
in order to let full control of this task to the core.

By using the complete helper we also avoid leacking the skb structure.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/net/ieee802154/at86rf230.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Alexander Aring Jan. 17, 2022, 10:58 p.m. UTC | #1
Hi,

On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> ieee802154_xmit_complete() is the right helper to call when a
> transmission is over. The fact that it completed or not is not really a
> question, but drivers must tell the core that the completion is over,
> even if it was canceled. Do not call ieee802154_wake_queue() manually,
> in order to let full control of this task to the core.
>

This is not a cancellation of a transmission, it is something weird
going on.  Introduce a xmit_error() for this, you call consume_skb()
which is wrong for a non error case.

> By using the complete helper we also avoid leacking the skb structure.
>

Yes, we are leaking here.

- Alex
Alexander Aring Jan. 18, 2022, 12:34 a.m. UTC | #2
Hi,

On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> ieee802154_xmit_complete() is the right helper to call when a
> transmission is over. The fact that it completed or not is not really a
> question, but drivers must tell the core that the completion is over,
> even if it was canceled. Do not call ieee802154_wake_queue() manually,
> in order to let full control of this task to the core.
>
> By using the complete helper we also avoid leacking the skb structure.
>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  drivers/net/ieee802154/at86rf230.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> index 583f835c317a..1941e1f3d2ef 100644
> --- a/drivers/net/ieee802154/at86rf230.c
> +++ b/drivers/net/ieee802154/at86rf230.c
> @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
>         if (ctx->free)
>                 kfree(ctx);
>
> -       ieee802154_wake_queue(lp->hw);
> +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);

also this lp->tx_skb can be a dangled pointer, after xmit_complete()
we need to set it to NULL in a xmit_error() we can check on NULL
before calling kfree_skb().

- Alex
Alexander Aring Jan. 18, 2022, 12:36 a.m. UTC | #3
Hi,

On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > ieee802154_xmit_complete() is the right helper to call when a
> > transmission is over. The fact that it completed or not is not really a
> > question, but drivers must tell the core that the completion is over,
> > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > in order to let full control of this task to the core.
> >
> > By using the complete helper we also avoid leacking the skb structure.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  drivers/net/ieee802154/at86rf230.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > index 583f835c317a..1941e1f3d2ef 100644
> > --- a/drivers/net/ieee802154/at86rf230.c
> > +++ b/drivers/net/ieee802154/at86rf230.c
> > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> >         if (ctx->free)
> >                 kfree(ctx);
> >
> > -       ieee802154_wake_queue(lp->hw);
> > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);
>
> also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> we need to set it to NULL in a xmit_error() we can check on NULL
> before calling kfree_skb().
>

forget the NULL checking, it's already done by core. However in some
cases this is called with a dangled pointer on lp->tx_skb.

- Alex
Miquel Raynal Jan. 18, 2022, 6:22 p.m. UTC | #4
Hi Alexander,

alex.aring@gmail.com wrote on Mon, 17 Jan 2022 19:36:39 -0500:

> Hi,
> 
> On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > ieee802154_xmit_complete() is the right helper to call when a
> > > transmission is over. The fact that it completed or not is not really a
> > > question, but drivers must tell the core that the completion is over,
> > > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > > in order to let full control of this task to the core.
> > >
> > > By using the complete helper we also avoid leacking the skb structure.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  drivers/net/ieee802154/at86rf230.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > > index 583f835c317a..1941e1f3d2ef 100644
> > > --- a/drivers/net/ieee802154/at86rf230.c
> > > +++ b/drivers/net/ieee802154/at86rf230.c
> > > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> > >         if (ctx->free)
> > >                 kfree(ctx);
> > >
> > > -       ieee802154_wake_queue(lp->hw);
> > > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);  
> >
> > also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> > we need to set it to NULL in a xmit_error() we can check on NULL
> > before calling kfree_skb().
> >  
> 
> forget the NULL checking, it's already done by core. However in some
> cases this is called with a dangled pointer on lp->tx_skb.

I'll try to fix these dangling situation first if I find them.

I'll also introduce a xmit_error() helper as you suggest.

Thanks,
Miquèl
Miquel Raynal Jan. 19, 2022, 10:45 p.m. UTC | #5
Hi Alexander,

alex.aring@gmail.com wrote on Mon, 17 Jan 2022 19:36:39 -0500:

> Hi,
> 
> On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > ieee802154_xmit_complete() is the right helper to call when a
> > > transmission is over. The fact that it completed or not is not really a
> > > question, but drivers must tell the core that the completion is over,
> > > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > > in order to let full control of this task to the core.
> > >
> > > By using the complete helper we also avoid leacking the skb structure.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  drivers/net/ieee802154/at86rf230.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > > index 583f835c317a..1941e1f3d2ef 100644
> > > --- a/drivers/net/ieee802154/at86rf230.c
> > > +++ b/drivers/net/ieee802154/at86rf230.c
> > > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> > >         if (ctx->free)
> > >                 kfree(ctx);
> > >
> > > -       ieee802154_wake_queue(lp->hw);
> > > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);  
> >
> > also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> > we need to set it to NULL in a xmit_error() we can check on NULL
> > before calling kfree_skb().

I've created a xmit_error() helper as suggested, which call
dev_kfree_skb_any() instead of *consume_skb*().

> 
> forget the NULL checking, it's already done by core.

Indeed, it is.

> However in some
> cases this is called with a dangled pointer on lp->tx_skb.

I've fixed that by setting it to NULL after the call to the xmit_error
helper.

> 
> - Alex


Thanks,
Miquèl
Miquel Raynal Jan. 19, 2022, 10:56 p.m. UTC | #6
Hi Alexander,

alex.aring@gmail.com wrote on Mon, 17 Jan 2022 19:36:39 -0500:

> Hi,
> 
> On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > ieee802154_xmit_complete() is the right helper to call when a
> > > transmission is over. The fact that it completed or not is not really a
> > > question, but drivers must tell the core that the completion is over,
> > > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > > in order to let full control of this task to the core.
> > >
> > > By using the complete helper we also avoid leacking the skb structure.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  drivers/net/ieee802154/at86rf230.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > > index 583f835c317a..1941e1f3d2ef 100644
> > > --- a/drivers/net/ieee802154/at86rf230.c
> > > +++ b/drivers/net/ieee802154/at86rf230.c
> > > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> > >         if (ctx->free)
> > >                 kfree(ctx);
> > >
> > > -       ieee802154_wake_queue(lp->hw);
> > > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);  
> >
> > also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> > we need to set it to NULL in a xmit_error() we can check on NULL
> > before calling kfree_skb().
> >  
> 
> forget the NULL checking, it's already done by core. However in some
> cases this is called with a dangled pointer on lp->tx_skb.

Actually I don't see why tx_skb is dangling?

There is no function that could accesses lp->tx_skb between the free
operation and the next call to ->xmit() which does a lp->tx_skb = skb.
Am I missing something?

Thanks,
Miquèl
Alexander Aring Jan. 19, 2022, 11:34 p.m. UTC | #7
Hi,

On Wed, 19 Jan 2022 at 17:56, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> alex.aring@gmail.com wrote on Mon, 17 Jan 2022 19:36:39 -0500:
>
> > Hi,
> >
> > On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > >
> > > > ieee802154_xmit_complete() is the right helper to call when a
> > > > transmission is over. The fact that it completed or not is not really a
> > > > question, but drivers must tell the core that the completion is over,
> > > > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > > > in order to let full control of this task to the core.
> > > >
> > > > By using the complete helper we also avoid leacking the skb structure.
> > > >
> > > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > > ---
> > > >  drivers/net/ieee802154/at86rf230.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > > > index 583f835c317a..1941e1f3d2ef 100644
> > > > --- a/drivers/net/ieee802154/at86rf230.c
> > > > +++ b/drivers/net/ieee802154/at86rf230.c
> > > > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> > > >         if (ctx->free)
> > > >                 kfree(ctx);
> > > >
> > > > -       ieee802154_wake_queue(lp->hw);
> > > > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);
> > >
> > > also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> > > we need to set it to NULL in a xmit_error() we can check on NULL
> > > before calling kfree_skb().
> > >
> >
> > forget the NULL checking, it's already done by core. However in some
> > cases this is called with a dangled pointer on lp->tx_skb.
>
> Actually I don't see why tx_skb is dangling?
>
> There is no function that could accesses lp->tx_skb between the free
> operation and the next call to ->xmit() which does a lp->tx_skb = skb.
> Am I missing something?
>

look into at86rf230_sync_state_change() it is a sync over async and
the function "at86rf230_async_error_recover_complete()" is a generic
error handling to recover from a state change. It's e.g. being used in
e.g. at86rf230_start() which can occur in cases which are not xmit
related.

Indeed there is no dangled pointer in the irq handling, sorry. I
thought maybe the receive handling but the transceiver is doing a lot
of its own state change handling because of some framebuffer
protection which is not the case.

- Alex
Alexander Aring Jan. 20, 2022, 12:19 a.m. UTC | #8
Hi,

On Wed, 19 Jan 2022 at 18:34, Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> On Wed, 19 Jan 2022 at 17:56, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > alex.aring@gmail.com wrote on Mon, 17 Jan 2022 19:36:39 -0500:
> >
> > > Hi,
> > >
> > > On Mon, 17 Jan 2022 at 19:34, Alexander Aring <alex.aring@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, 17 Jan 2022 at 06:55, Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > > >
> > > > > ieee802154_xmit_complete() is the right helper to call when a
> > > > > transmission is over. The fact that it completed or not is not really a
> > > > > question, but drivers must tell the core that the completion is over,
> > > > > even if it was canceled. Do not call ieee802154_wake_queue() manually,
> > > > > in order to let full control of this task to the core.
> > > > >
> > > > > By using the complete helper we also avoid leacking the skb structure.
> > > > >
> > > > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > > > ---
> > > > >  drivers/net/ieee802154/at86rf230.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > > > > index 583f835c317a..1941e1f3d2ef 100644
> > > > > --- a/drivers/net/ieee802154/at86rf230.c
> > > > > +++ b/drivers/net/ieee802154/at86rf230.c
> > > > > @@ -343,7 +343,7 @@ at86rf230_async_error_recover_complete(void *context)
> > > > >         if (ctx->free)
> > > > >                 kfree(ctx);
> > > > >
> > > > > -       ieee802154_wake_queue(lp->hw);
> > > > > +       ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);
> > > >
> > > > also this lp->tx_skb can be a dangled pointer, after xmit_complete()
> > > > we need to set it to NULL in a xmit_error() we can check on NULL
> > > > before calling kfree_skb().
> > > >
> > >
> > > forget the NULL checking, it's already done by core. However in some
> > > cases this is called with a dangled pointer on lp->tx_skb.
> >
> > Actually I don't see why tx_skb is dangling?
> >
> > There is no function that could accesses lp->tx_skb between the free
> > operation and the next call to ->xmit() which does a lp->tx_skb = skb.
> > Am I missing something?
> >
>
> look into at86rf230_sync_state_change() it is a sync over async and
> the function "at86rf230_async_error_recover_complete()" is a generic
> error handling to recover from a state change. It's e.g. being used in
> e.g. at86rf230_start() which can occur in cases which are not xmit
> related.
>

which means there is more being broken that we should not simply call
to wake the queue in non-transmit cases...

- Alex
diff mbox series

Patch

diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
index 583f835c317a..1941e1f3d2ef 100644
--- a/drivers/net/ieee802154/at86rf230.c
+++ b/drivers/net/ieee802154/at86rf230.c
@@ -343,7 +343,7 @@  at86rf230_async_error_recover_complete(void *context)
 	if (ctx->free)
 		kfree(ctx);
 
-	ieee802154_wake_queue(lp->hw);
+	ieee802154_xmit_complete(lp->hw, lp->tx_skb, false);
 }
 
 static void