
[wpan-next,v3,2/4] net: ieee802154: Add support for inter PAN management

Message ID 20220620134018.62414-3-miquel.raynal@bootlin.com (mailing list archive)
State Superseded
Series net: ieee802154: PAN management

Commit Message

Miquel Raynal June 20, 2022, 1:40 p.m. UTC
Let's introduce the basics for defining PANs:
- structures defining a PAN
- helpers for PAN registration
- helpers discarding old PANs

Co-developed-by: David Girault <david.girault@qorvo.com>
Signed-off-by: David Girault <david.girault@qorvo.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/cfg802154.h |  31 ++++++
 net/ieee802154/Makefile |   2 +-
 net/ieee802154/core.c   |   2 +
 net/ieee802154/core.h   |  26 +++++
 net/ieee802154/pan.c    | 231 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 291 insertions(+), 1 deletion(-)
 create mode 100644 net/ieee802154/pan.c

Comments

Alexander Aring June 26, 2022, 2:29 a.m. UTC | #1
Hi,

On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
<miquel.raynal@bootlin.com> wrote:
>
> Let's introduce the basics for defining PANs:
> - structures defining a PAN
> - helpers for PAN registration
> - helpers discarding old PANs
>

I think the whole PAN management can/should be handled in user space by
a daemon running in the background. This can be a network manager: it
listens to netlink events such as "detect PAN xy", stores the PAN, and
offers it in its list of PANs to associate with.

We need to draw a line somewhere, and I guess the line is "Is this
information used, e.g., for a lookup or anything else in the hot path?".
I don't see that currently...

- Alex
Miquel Raynal June 27, 2022, 8:43 a.m. UTC | #2
Hi Alexander,

aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:

> Hi,
> 
> On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> <miquel.raynal@bootlin.com> wrote:
> >
> > Let's introduce the basics for defining PANs:
> > - structures defining a PAN
> > - helpers for PAN registration
> > - helpers discarding old PANs
> >  
> 
> I think the whole pan management can/should be stored in user space by
> a daemon running in background.

We need both, and currently:
- while the scan is happening, the kernel saves all the discovered PANs
- the kernel PAN list can be dumped (and also flushed) asynchronously by
  the userspace

IOW the userspace is responsible for keeping its own list of PANs in
sync with what the kernel discovers. At any moment, during a scan or
after, it can ask the kernel what it has in memory. It can also
request a new scan to update the entries, or flush the kernel list.
The scan operation is always requested by the user anyway, it's not
something happening in the background.
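
For illustration only (this is not the code from the patch), the
save/dump/flush behaviour described above can be modelled in plain C.
The names `pan_entry`, `pan_table` and the three functions are
hypothetical, and a fixed-size array stands in for whatever list the
kernel actually uses:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAN_TABLE_MAX 16 /* arbitrary capacity for this sketch */

/* Hypothetical descriptor: only the fields needed for the illustration. */
struct pan_entry {
	uint16_t pan_id;
	uint64_t coord_addr; /* extended address of the coordinator */
	uint8_t channel;
};

struct pan_table {
	struct pan_entry entries[PAN_TABLE_MAX];
	size_t count;
};

/*
 * Called for each beacon seen during a scan. Returns 1 if a new entry was
 * stored, 0 if the coordinator was already known (the entry is refreshed),
 * -1 if the table is full.
 */
int pan_table_record(struct pan_table *t, const struct pan_entry *e)
{
	size_t i;

	for (i = 0; i < t->count; i++) {
		if (t->entries[i].pan_id == e->pan_id &&
		    t->entries[i].coord_addr == e->coord_addr) {
			t->entries[i] = *e;
			return 0;
		}
	}
	if (t->count == PAN_TABLE_MAX)
		return -1;
	t->entries[t->count++] = *e;
	return 1;
}

/* Dump: copy up to max entries to the caller; usable during or after a scan. */
size_t pan_table_dump(const struct pan_table *t, struct pan_entry *out,
		      size_t max)
{
	size_t n = t->count < max ? t->count : max;

	memcpy(out, t->entries, n * sizeof(*out));
	return n;
}

/* Flush: forget every known PAN. */
void pan_table_flush(struct pan_table *t)
{
	t->count = 0;
}
```

A stateless tool would call the dump function whenever it needs the
list, and the flush function before a fresh scan if stale entries are
not wanted.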

> This can be a network manager as it
> listens to netlink events as "detect PAN xy" and stores it and offers
> it in their list to associate with it.

There are events produced, yes. But really, this is not something we
actually need. The user requests a scan over a given range, when the
scan is over it looks at the list and decides which PAN it
wants to associate with, and through which coordinator (95% of the
scenarii).

> We need somewhere to draw a line and I guess the line is "Is this
> information used e.g. as any lookup or something in the hot path", I
> don't see this currently...

Each PAN descriptor is about 20 bytes, which is why I don't feel bad
keeping them. I think it's easier to be able to serve the list of PANs
upon request rather than only forwarding events and not being able to
retrieve the list a second time (at least during development).

Overall I feel like this part is still a little bit blurry because it
has currently no user, perhaps I should send the next series which
actually makes the current series useful.

Thanks,
Miquèl
Alexander Aring June 28, 2022, 1:32 a.m. UTC | #3
Hi,

On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
>
> > Hi,
> >
> > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > <miquel.raynal@bootlin.com> wrote:
> > >
> > > Let's introduce the basics for defining PANs:
> > > - structures defining a PAN
> > > - helpers for PAN registration
> > > - helpers discarding old PANs
> > >
> >
> > I think the whole pan management can/should be stored in user space by
> > a daemon running in background.
>
> We need both, and currently:
> - while the scan is happening, the kernel saves all the discovered PANs
> - the kernel PAN list can be dumped (and also flushed) asynchronously by
>   the userspace
>
> IOW the userspace is responsible of keeping its own list of PANs in
> sync with what the kernel discovers, so at any moment it can ask the
> kernel what it has in memory, it can be done during a scan or after. It
> can request a new scan to update the entries, or flush the kernel list.
> The scan operation is always requested by the user anyway, it's not
> something happening in the background.
>

I don't see what advantage there is in keeping the discovered PANs in
the kernel. You can do everything with start/stop/"PAN discovered"
events. That approach also has a further advantage: you can look for a
specific PAN and stop afterwards.
In the end the daemon has everything that the kernel also has, as you
said it's in sync.
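
A minimal sketch of that event-driven idea, seen from the user space
side: the daemon feeds each event to a small state machine and learns
when it may ask for the scan to be stopped. The event types and all
names below are hypothetical; the real netlink events may look quite
different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event types, not taken from any existing netlink API. */
enum scan_event_type {
	SCAN_EVENT_STARTED,
	SCAN_EVENT_PAN_DISCOVERED,
	SCAN_EVENT_STOPPED,
};

struct scan_event {
	enum scan_event_type type;
	uint16_t pan_id; /* only meaningful for SCAN_EVENT_PAN_DISCOVERED */
};

struct scan_watcher {
	uint16_t wanted_pan_id;
	bool scanning;
	bool found;
};

/*
 * Feed one event to the watcher. Returns true exactly once per scan, when
 * the wanted PAN is first seen and the daemon should request a scan stop.
 */
bool scan_watcher_handle(struct scan_watcher *w, const struct scan_event *ev)
{
	switch (ev->type) {
	case SCAN_EVENT_STARTED:
		w->scanning = true;
		w->found = false;
		return false;
	case SCAN_EVENT_PAN_DISCOVERED:
		if (w->scanning && !w->found &&
		    ev->pan_id == w->wanted_pan_id) {
			w->found = true;
			return true;
		}
		return false;
	case SCAN_EVENT_STOPPED:
		w->scanning = false;
		return false;
	}
	return false;
}
```

The daemon keeps the only copy of the state here; nothing needs to
survive in the kernel once the event has been delivered.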

> > This can be a network manager as it
> > listens to netlink events as "detect PAN xy" and stores it and offers
> > it in their list to associate with it.
>
> There are events produced, yes. But really, this is not something we
> actually need. The user requests a scan over a given range, when the
> scan is over it looks at the list and decides which PAN it
> wants to associate with, and through which coordinator (95% of the
> scenarii).
>

It isn't the kernel's job either to decide which PAN to associate with.

> > We need somewhere to draw a line and I guess the line is "Is this
> > information used e.g. as any lookup or something in the hot path", I
> > don't see this currently...
>
> Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> keeping them, I think it's easier to be able to serve the list of PANs
> upon request rather than only forwarding events and not being able to
> retrieve the list a second time (at least during the development).
>

This has nothing to do with memory.

> Overall I feel like this part is still a little bit blurry because it
> has currently no user, perhaps I should send the next series which
> actually makes the current series useful.
>

Will it be used for anything more than caching entries in the kernel
for user space? Also, please no in-kernel association feature.

Maybe we can agree to put it under the
IEEE802154_NL802154_EXPERIMENTAL config. As soon as we have some
_open_ user space program ready, we will drop this feature again...
that program will show that there is no magic about it.

- Alex
Miquel Raynal June 28, 2022, 7:58 a.m. UTC | #4
Hi Alexander,

aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:

> Hi,
> 
> On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> >  
> > > Hi,
> > >
> > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > <miquel.raynal@bootlin.com> wrote:  
> > > >
> > > > Let's introduce the basics for defining PANs:
> > > > - structures defining a PAN
> > > > - helpers for PAN registration
> > > > - helpers discarding old PANs
> > > >  
> > >
> > > I think the whole pan management can/should be stored in user space by
> > > a daemon running in background.  
> >
> > We need both, and currently:
> > - while the scan is happening, the kernel saves all the discovered PANs
> > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> >   the userspace
> >
> > IOW the userspace is responsible of keeping its own list of PANs in
> > sync with what the kernel discovers, so at any moment it can ask the
> > kernel what it has in memory, it can be done during a scan or after. It
> > can request a new scan to update the entries, or flush the kernel list.
> > The scan operation is always requested by the user anyway, it's not
> > something happening in the background.
> >  
> 
> I don't see what advantage it has to keep the discovered pan in the
> kernel. You can do everything with a start/stop/pan discovered event.

I think the main reason is to be much more user friendly. Keeping track
of the known PANs in the kernel matters because when you start working
with 802.15.4 you won't blindly use a daemon (if there is any) and will
use test apps like iwpan which are stateless. Re-doing a scan on demand
just takes ages (from seconds to minutes, depending on the beacon
order).

Aside from this non-technical reason, I also had in mind to retrieve
values gathered from the beacons (and stored in the PAN descriptors) to
know more about the devices when, e.g., listing associations, like
registering the short address of a coordinator. I don't yet know how
useful this is TBH.

> It also has more advantages as you can look for a specific pan and
> stop afterwards. At the end the daemon has everything that the kernel
> also has, as you said it's in sync.
> 
> > > This can be a network manager as it
> > > listens to netlink events as "detect PAN xy" and stores it and
> > > offers it in their list to associate with it.  
> >
> > There are events produced, yes. But really, this is not something we
> > actually need. The user requests a scan over a given range, when the
> > scan is over it looks at the list and decides which PAN it
> > wants to associate with, and through which coordinator (95% of the
> > scenarii).
> >  
> 
> This isn't either a kernel job to decide which pan it will be
> associated with.

Yes, "it looks at the list and decides" referred to "the user".

> > > We need somewhere to draw a line and I guess the line is "Is this
> > > information used e.g. as any lookup or something in the hot path", I
> > > don't see this currently...  
> >
> > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > keeping them, I think it's easier to be able to serve the list of PANs
> > upon request rather than only forwarding events and not being able to
> > retrieve the list a second time (at least during the development).
> >  
> 
> This has nothing to do with memory.
> 
> > Overall I feel like this part is still a little bit blurry because it
> > has currently no user, perhaps I should send the next series which
> > actually makes the current series useful.
> >  
> 
> Will it get more used than caching entries in the kernel for user
> space? Please also no in-kernel association feature.

I am aligned on this.

> We can maybe agree to that point to put it under
> IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> _open_ user space program ready we will drop this feature again...
> this program will show that there is no magic about it.

Yeah, do you want to move all the scan/beacon/pan/association code
under EXPERIMENTAL sections? Or is it just the PAN management logic?

Thanks,
Miquèl
Alexander Aring June 30, 2022, 1:40 a.m. UTC | #5
Hi,

On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
>
> > Hi,
> >
> > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >
> > > Hi Alexander,
> > >
> > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > >
> > > > Hi,
> > > >
> > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > <miquel.raynal@bootlin.com> wrote:
> > > > >
> > > > > Let's introduce the basics for defining PANs:
> > > > > - structures defining a PAN
> > > > > - helpers for PAN registration
> > > > > - helpers discarding old PANs
> > > > >
> > > >
> > > > I think the whole pan management can/should be stored in user space by
> > > > a daemon running in background.
> > >
> > > We need both, and currently:
> > > - while the scan is happening, the kernel saves all the discovered PANs
> > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > >   the userspace
> > >
> > > IOW the userspace is responsible of keeping its own list of PANs in
> > > sync with what the kernel discovers, so at any moment it can ask the
> > > kernel what it has in memory, it can be done during a scan or after. It
> > > can request a new scan to update the entries, or flush the kernel list.
> > > The scan operation is always requested by the user anyway, it's not
> > > something happening in the background.
> > >
> >
> > I don't see what advantage it has to keep the discovered pan in the
> > kernel. You can do everything with a start/stop/pan discovered event.
>
> I think the main reason is to be much more user friendly. Keeping track
> of the known PANs in the kernel matters because when you start working
> with 802.15.4 you won't blindly use a daemon (if there is any) and will
> use test apps like iwpan which are stateless. Re-doing a scan on demand
> just takes ages (from seconds to minutes, depending on the beacon
> order).
>

I can see that things should work "out of the box", and we are already
doing that by manually setting pan_id, etc. However, doing it
automatically leaves a lot of room for "interpretation" about how you
want to handle it (regardless of whether this is what the spec says or
not)... moving it to user space will offload it to the user.

> Aside from this non technical reason, I also had in mind to retrieve
> values gathered from the beacons (and stored in the PAN descriptors) to
> know more about the devices when eg. listing associations, like
> registering the short address of a coordinator. I don't yet know how
> useful this is TBH.
>
> > It also has more advantages as you can look for a specific pan and
> > stop afterwards. At the end the daemon has everything that the kernel
> > also has, as you said it's in sync.
> >
> > > > This can be a network manager as it
> > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > offers it in their list to associate with it.
> > >
> > > There are events produced, yes. But really, this is not something we
> > > actually need. The user requests a scan over a given range, when the
> > > scan is over it looks at the list and decides which PAN it
> > > wants to associate with, and through which coordinator (95% of the
> > > scenarii).
> > >
> >
> > This isn't either a kernel job to decide which pan it will be
> > associated with.
>
> Yes, "it looks at the list and decides" referred to "the user".
>
> > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > information used e.g. as any lookup or something in the hot path", I
> > > > don't see this currently...
> > >
> > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > keeping them, I think it's easier to be able to serve the list of PANs
> > > upon request rather than only forwarding events and not being able to
> > > retrieve the list a second time (at least during the development).
> > >
> >
> > This has nothing to do with memory.
> >
> > > Overall I feel like this part is still a little bit blurry because it
> > > has currently no user, perhaps I should send the next series which
> > > actually makes the current series useful.
> > >
> >
> > Will it get more used than caching entries in the kernel for user
> > space? Please also no in-kernel association feature.
>
> I am aligned on this.
>

I am sorry I am not sure what that means.

> > We can maybe agree to that point to put it under
> > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > _open_ user space program ready we will drop this feature again...
> > this program will show that there is no magic about it.
>
> Yeah, do you want to move all the code scan/beacon/pan/association code
> under EXPERIMENTAL sections? Or is it just the PAN management logic?

Yes, why not. But as far as I can see, there are two categories of
netlink API that you are introducing:

1. API candidates which are very likely to become stable
2. API candidates which we want to remove when we have a user space
replacement for them (will probably never go stable)

Category 2 should be defined _after_ category 1 in the "big" netlink
API enums of the EXPERIMENTAL sections.

Also, for 2. you should provide some kind of ifdef/dummy functions/etc.
so that it's easy to remove from the kernel when we have a user space
replacement for it.
I hope that is fine for everybody.
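
One possible reading of that layout, as a sketch: all identifiers
below are made up for the illustration (they are not the actual
nl802154 commands), and `DEMO_IN_KERNEL_PAN_LIST` merely stands in for
a Kconfig option. The ordering is presumably meant to let the second
category be deleted without renumbering the first.

```c
/* Hypothetical command IDs; these are not the actual nl802154 values. */
enum demo_commands {
	DEMO_CMD_GET_INTERFACE,

	/* Category 1: experimental, but likely to become stable. */
	DEMO_CMD_TRIGGER_SCAN,
	DEMO_CMD_ABORT_SCAN,

	/*
	 * Category 2: to be dropped once user space replaces it. Because it
	 * comes last, deleting it leaves the values above unchanged.
	 */
	DEMO_CMD_DUMP_PANS,
	DEMO_CMD_FLUSH_PANS,
};

/* Build-time switch standing in for a Kconfig option (an assumption). */
#ifdef DEMO_IN_KERNEL_PAN_LIST
int demo_pan_list_count(void); /* real implementation lives elsewhere */
#else
/* Dummy stub: callers keep compiling when the feature is removed. */
static inline int demo_pan_list_count(void)
{
	return 0;
}
#endif
```

With the stub in place, removing the feature later comes down to
deleting the guarded implementation and the last enum entries.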

I am trying to find solutions here. I don't see a reason for putting
this PAN management into the kernel... but I appreciate the effort
that has been made here and will not force you to write some user space
software that does this job. From my point of view I can't accept this
functionality in the kernel "yet".

- Alex
Miquel Raynal June 30, 2022, 8:14 a.m. UTC | #6
Hi Alexander,

aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:

> Hi,
> 
> On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
> >  
> > > Hi,
> > >
> > > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > >
> > > > Hi Alexander,
> > > >
> > > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > > >  
> > > > > Hi,
> > > > >
> > > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > > <miquel.raynal@bootlin.com> wrote:  
> > > > > >
> > > > > > Let's introduce the basics for defining PANs:
> > > > > > - structures defining a PAN
> > > > > > - helpers for PAN registration
> > > > > > - helpers discarding old PANs
> > > > > >  
> > > > >
> > > > > I think the whole pan management can/should be stored in user space by
> > > > > a daemon running in background.  
> > > >
> > > > We need both, and currently:
> > > > - while the scan is happening, the kernel saves all the discovered PANs
> > > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > > >   the userspace
> > > >
> > > > IOW the userspace is responsible of keeping its own list of PANs in
> > > > sync with what the kernel discovers, so at any moment it can ask the
> > > > kernel what it has in memory, it can be done during a scan or after. It
> > > > can request a new scan to update the entries, or flush the kernel list.
> > > > The scan operation is always requested by the user anyway, it's not
> > > > something happening in the background.
> > > >  
> > >
> > > I don't see what advantage it has to keep the discovered pan in the
> > > kernel. You can do everything with a start/stop/pan discovered event.  
> >
> > I think the main reason is to be much more user friendly. Keeping track
> > of the known PANs in the kernel matters because when you start working
> > with 802.15.4 you won't blindly use a daemon (if there is any) and will
> > use test apps like iwpan which are stateless. Re-doing a scan on demand
> > just takes ages (from seconds to minutes, depending on the beacon
> > order).
> >  
> 
> I can see that things should work "out-of the box" and we are already
> doing it by manual setting pan_id, etc. However, doing it in an
> automatic way there exists a lot of "interpretation" about how you
> want to handle it (doesn't matter if this is what the spec says or
> not)... moving it to user space will offload it to the user.
> 
> > Aside from this non technical reason, I also had in mind to retrieve
> > values gathered from the beacons (and stored in the PAN descriptors) to
> > know more about the devices when eg. listing associations, like
> > registering the short address of a coordinator. I don't yet know how
> > useful this is TBH.
> >  
> > > It also has more advantages as you can look for a specific pan and
> > > stop afterwards. At the end the daemon has everything that the kernel
> > > also has, as you said it's in sync.
> > >  
> > > > > This can be a network manager as it
> > > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > > offers it in their list to associate with it.  
> > > >
> > > > There are events produced, yes. But really, this is not something we
> > > > actually need. The user requests a scan over a given range, when the
> > > > scan is over it looks at the list and decides which PAN it
> > > > wants to associate with, and through which coordinator (95% of the
> > > > scenarii).
> > > >  
> > >
> > > This isn't either a kernel job to decide which pan it will be
> > > associated with.  
> >
> > Yes, "it looks at the list and decides" referred to "the user".
> >  
> > > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > > information used e.g. as any lookup or something in the hot path", I
> > > > > don't see this currently...  
> > > >
> > > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > > keeping them, I think it's easier to be able to serve the list of PANs
> > > > upon request rather than only forwarding events and not being able to
> > > > retrieve the list a second time (at least during the development).
> > > >  
> > >
> > > This has nothing to do with memory.
> > >  
> > > > Overall I feel like this part is still a little bit blurry because it
> > > > has currently no user, perhaps I should send the next series which
> > > > actually makes the current series useful.
> > > >  
> > >
> > > Will it get more used than caching entries in the kernel for user
> > > space? Please also no in-kernel association feature.  
> >
> > I am aligned on this.
> >  
> 
> I am sorry I am not sure what that means.

I was referring to the "no in-kernel association feature".

There is however one situation which _had_ to be handled in the
kernel: other devices asking to be associated or disassociated. In
the case of a disassociation, the receiving device is only notified
and cannot refuse the disassociation. For an association however,
the device receiving the association request has to make a decision.
There are three possible outcomes:
- accepting
- refusing because the PAN is at capacity
- refusing because the device is blacklisted
For now I've only implemented the first refusal reason, because it's
much easier and only requires a maximum device number variable, set by
the user. The second refusal reason requires handling a
whitelist/blacklist. I don't plan to implement this for now, but that
should not impact the rest of the code. I'll leave that to other
developers, or future-me, perhaps :-). Anyhow, you can kick out devices
at any time if needed with a disassociation notification
controlled by the user.
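
The decision described above could be sketched as follows. This is an
illustration under assumptions, not the series' code: the structure and
function names are hypothetical, the status values are the ones I
recall from the association response status field of IEEE 802.15.4 and
should be checked against the spec, and the deny list (the
"blacklist" above) is included only to show where it would fit.

```c
#include <stddef.h>
#include <stdint.h>

/* Status values as recalled from IEEE 802.15.4; verify against the spec. */
enum assoc_status {
	ASSOC_SUCCESS = 0x00,
	ASSOC_PAN_AT_CAPACITY = 0x01,
	ASSOC_PAN_ACCESS_DENIED = 0x02,
};

/* Hypothetical policy state kept by the coordinator. */
struct assoc_policy {
	size_t max_devices;       /* limit set by the user */
	size_t nr_devices;        /* devices currently associated */
	const uint64_t *denylist; /* extended addresses; may be NULL */
	size_t denylist_len;
};

/* Decide how to answer an association request from ext_addr. */
enum assoc_status assoc_decide(struct assoc_policy *p, uint64_t ext_addr)
{
	size_t i;

	for (i = 0; i < p->denylist_len; i++)
		if (p->denylist[i] == ext_addr)
			return ASSOC_PAN_ACCESS_DENIED;

	if (p->nr_devices >= p->max_devices)
		return ASSOC_PAN_AT_CAPACITY;

	p->nr_devices++; /* the device is now counted as associated */
	return ASSOC_SUCCESS;
}
```

Leaving `denylist` NULL with a length of 0 gives the behaviour that is
said to be implemented so far: only the capacity check applies.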

> > > We can maybe agree to that point to put it under
> > > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > > _open_ user space program ready we will drop this feature again...
> > > this program will show that there is no magic about it.  
> >
> > Yeah, do you want to move all the code scan/beacon/pan/association code
> > under EXPERIMENTAL sections? Or is it just the PAN management logic?  
> 
> Yes, why not. But as I can see there exists two categories of
> introducing your netlink api:
> 
> 1. API candidates which are very likely to become stable
> 2. API candidates which we want to remove when we have a user
> replacement for it (will probably never go stable)
> 
> The 2. should be defining _after_ the 1. In the "big" netlink API
> enums of EXPERIMENTAL sections.

Yeah, got it.

> Also you should provide for 2. some kind of ifdef/functions dummy/etc.
> that it's easy to remove from the kernel when we have a user
> replacement for it.
> I hope that is fine for everybody.
> 
> I try to find solutions here, I don't see a reason for putting this
> pan management into the kernel... whereas I appreciate the effort
> which is done here and will not force you to write some user space
> software that does this job. From my point of view I can't accept this
> functionality in the kernel "yet".

I've already spent a couple of days reworking that whole part. I've
dropped most of the in-kernel PAN management, which means:
- when a new coordinator gets discovered (beacon received), if the mac
  was scanning then it calls a generic function from the cfg layer to
  advertise this PAN.
- the cfg layer will send a NL message to the user with all the
  important information
- BUT the cfg layer will also keep the beacon information in memory for
  the duration of the scan (only), to avoid flooding the user with the
  same information over and over again. This seems a necessary step to
  me, because otherwise, if you track two coordinators on the same
  channel that do not emit at the same pace, you might end up with 100
  user notifications for just 2 devices. I think it is the kernel's
  duty to filter out identical beacons.
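
The per-scan filtering in the last bullet can be sketched like this.
It is a model in plain C under assumptions, not the cfg layer code:
`beacon_key`, `scan_filter` and both functions are hypothetical names,
and "identical" is assumed to mean same PAN ID, coordinator address and
channel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCAN_SEEN_MAX 32 /* arbitrary bound for the sketch */

/* What is assumed to identify a beacon source during one scan. */
struct beacon_key {
	uint16_t pan_id;
	uint64_t coord_addr;
	uint8_t channel;
};

struct scan_filter {
	struct beacon_key seen[SCAN_SEEN_MAX];
	size_t count;
};

/*
 * Returns true if user space should be notified about this beacon, i.e. no
 * identical beacon has been recorded during the current scan.
 */
bool scan_filter_should_notify(struct scan_filter *f,
			       const struct beacon_key *k)
{
	size_t i;

	for (i = 0; i < f->count; i++) {
		if (f->seen[i].pan_id == k->pan_id &&
		    f->seen[i].coord_addr == k->coord_addr &&
		    f->seen[i].channel == k->channel)
			return false;
	}
	/* When the table is full, notify anyway rather than hide a new PAN. */
	if (f->count < SCAN_SEEN_MAX)
		f->seen[f->count++] = *k;
	return true;
}

/* Called when the scan ends: nothing is kept afterwards. */
void scan_filter_reset(struct scan_filter *f)
{
	f->count = 0;
}
```

With two coordinators sending 100 beacons between them, only the first
beacon from each one passes the filter, so the user gets two
notifications.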

I _will_ send a v4, including the scanning part this time, by the end
of the week. I need to settle everything down, ensure it still works
and clean the branch.

Thanks,
Miquèl
Alexander Aring June 30, 2022, 11:27 p.m. UTC | #7
Hi,

On Thu, Jun 30, 2022 at 4:14 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:
>
> > Hi,
> >
> > On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >
> > > Hi Alexander,
> > >
> > > aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
> > >
> > > > Hi,
> > > >
> > > > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > > >
> > > > > Hi Alexander,
> > > > >
> > > > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > > > <miquel.raynal@bootlin.com> wrote:
> > > > > > >
> > > > > > > Let's introduce the basics for defining PANs:
> > > > > > > - structures defining a PAN
> > > > > > > - helpers for PAN registration
> > > > > > > - helpers discarding old PANs
> > > > > > >
> > > > > >
> > > > > > I think the whole pan management can/should be stored in user space by
> > > > > > a daemon running in background.
> > > > >
> > > > > We need both, and currently:
> > > > > - while the scan is happening, the kernel saves all the discovered PANs
> > > > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > > > >   the userspace
> > > > >
> > > > > IOW the userspace is responsible of keeping its own list of PANs in
> > > > > sync with what the kernel discovers, so at any moment it can ask the
> > > > > kernel what it has in memory, it can be done during a scan or after. It
> > > > > can request a new scan to update the entries, or flush the kernel list.
> > > > > The scan operation is always requested by the user anyway, it's not
> > > > > something happening in the background.
> > > > >
> > > >
> > > > I don't see what advantage it has to keep the discovered pan in the
> > > > kernel. You can do everything with a start/stop/pan discovered event.
> > >
> > > I think the main reason is to be much more user friendly. Keeping track
> > > of the known PANs in the kernel matters because when you start working
> > > with 802.15.4 you won't blindly use a daemon (if there is any) and will
> > > use test apps like iwpan which are stateless. Re-doing a scan on demand
> > > just takes ages (from seconds to minutes, depending on the beacon
> > > order).
> > >
> >
> > I can see that things should work "out-of the box" and we are already
> > doing it by manual setting pan_id, etc. However, doing it in an
> > automatic way there exists a lot of "interpretation" about how you
> > want to handle it (doesn't matter if this is what the spec says or
> > not)... moving it to user space will offload it to the user.
> >
> > > Aside from this non technical reason, I also had in mind to retrieve
> > > values gathered from the beacons (and stored in the PAN descriptors) to
> > > know more about the devices when eg. listing associations, like
> > > registering the short address of a coordinator. I don't yet know how
> > > useful this is TBH.
> > >
> > > > It also has more advantages as you can look for a specific pan and
> > > > stop afterwards. At the end the daemon has everything that the kernel
> > > > also has, as you said it's in sync.
> > > >
> > > > > > This can be a network manager as it
> > > > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > > > offers it in their list to associate with it.
> > > > >
> > > > > There are events produced, yes. But really, this is not something we
> > > > > actually need. The user requests a scan over a given range, when the
> > > > > scan is over it looks at the list and decides which PAN it
> > > > > wants to associate with, and through which coordinator (95% of the
> > > > > scenarii).
> > > > >
> > > >
> > > > This isn't either a kernel job to decide which pan it will be
> > > > associated with.
> > >
> > > Yes, "it looks at the list and decides" referred to "the user".
> > >
> > > > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > > > information used e.g. as any lookup or something in the hot path", I
> > > > > > don't see this currently...
> > > > >
> > > > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > > > keeping them, I think it's easier to be able to serve the list of PANs
> > > > > upon request rather than only forwarding events and not being able to
> > > > > retrieve the list a second time (at least during the development).
> > > > >
> > > >
> > > > This has nothing to do with memory.
> > > >
> > > > > Overall I feel like this part is still a little bit blurry because it
> > > > > has currently no user, perhaps I should send the next series which
> > > > > actually makes the current series useful.
> > > > >
> > > >
> > > > Will it get more used than caching entries in the kernel for user
> > > > space? Please also no in-kernel association feature.
> > >
> > > I am aligned on this.
> > >
> >
> > I am sorry I am not sure what that means.
>
> I was referring to the "no in-kernel association feature".
>
> There is however one situation which I _had_ to be handled in the
> kernel: other devices asking for being associated or disassociated. In
> the case of the disassociation, the receiving device is only notified
> and cannot refuse the disassociation. For the association however,
> the device receiving the association request has to make a decision.
> There are three possible outcomes:
> - accepting
> - refusing because the PAN is at capacity
> - refusing because the device is blacklisted

Why not move this decision to the user as well? The kernel would wait
for the answer? This isn't required to be fast, and the decision may
depend on the current PAN management...

> For now I've only implemented the first reason, because it's much
> easier and only requires a maximum device number variable, set by the
> user. For the second reason, it requires handling a
> whitelist/blacklist, I don't plan to implement this for now, but that
> should not impact the rest of the code. I'll let that to other
> developers, or future-me, perhaps :-). Anyhow, you can kick-out devices
> at any time anyway if needed with a disassociation notification
> controlled by the user.
>
> > > > We can maybe agree to that point to put it under
> > > > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > > > _open_ user space program ready we will drop this feature again...
> > > > this program will show that there is no magic about it.
> > >
> > > Yeah, do you want to move all the code scan/beacon/pan/association code
> > > under EXPERIMENTAL sections? Or is it just the PAN management logic?
> >
> > Yes, why not. But as I can see there exists two categories of
> > introducing your netlink api:
> >
> > 1. API candidates which are very likely to become stable
> > 2. API candidates which we want to remove when we have a user
> > replacement for it (will probably never go stable)
> >
> > The 2. should be defining _after_ the 1. In the "big" netlink API
> > enums of EXPERIMENTAL sections.
>
> Yeah, got it.
>
> > Also you should provide for 2. some kind of ifdef/functions dummy/etc.
> > that it's easy to remove from the kernel when we have a user
> > replacement for it.
> > I hope that is fine for everybody.
> >
> > I try to find solutions here, I don't see a reason for putting this
> > pan management into the kernel... whereas I appreciate the effort
> > which is done here and will not force you to write some user space
> > software that does this job. From my point of view I can't accept this
> > functionality in the kernel "yet".
>
> I've already spent a couple of days reworking all that part, I've
> dropped most of the in-kernel PAN management, which means:
> - when a new coordinator gets discovered (beacon received), if the mac
>   was scanning then it calls a generic function from the cfg layer to
>   advertise this pan.
> - the cfg layer will send a NL message to the user with all the
>   important information
> - BUT the cfg layer will also keep in memory the beacon information for
>   the time of the scan (only), to avoid polluting the user with the same
>   information over and over again, this seems a necessary step to me,
>   because otherwise if you track on the same channel two coordinators
>   not emitting at the same pace, you might end up with 100 user
>   notifications, for just 2 devices. I think this is the kernel duty to
>   filter out identical beacons.
>

Okay. I am sure that if somebody complains about this kernel behaviour
and has a good argument for switching back, we can still do it.
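
The per-scan filtering described in the quoted list could be sketched roughly as follows. This is only an illustration, not the patch's code: the `ScanSession` class, the `notify` callback and the choice of (PAN ID, coordinator address, channel) as the key for "identical" beacons are all assumptions made for the example.

```python
from typing import Callable, Set, Tuple

# Hypothetical key identifying "the same" beacon source during one scan.
BeaconKey = Tuple[int, int, int]  # (pan_id, coordinator address, channel)


class ScanSession:
    """Remembers which coordinators were already reported during one scan."""

    def __init__(self, notify: Callable[[BeaconKey], None]) -> None:
        self._notify = notify  # stands in for the netlink notification
        self._seen: Set[BeaconKey] = set()

    def beacon_received(self, pan_id: int, coord_addr: int, channel: int) -> bool:
        """Notify user space once per coordinator; return True if notified."""
        key = (pan_id, coord_addr, channel)
        if key in self._seen:
            return False  # identical beacon already advertised, filter it out
        self._seen.add(key)
        self._notify(key)
        return True

    def end(self) -> None:
        """Forget everything: the cache only lives for the time of the scan."""
        self._seen.clear()
```

With two coordinators beaconing at different paces on the same channel, user space would receive two notifications however many beacons arrive, and nothing is kept once the scan ends.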

> I _will_ send a v4, including the scanning part this time by the end of
> the week, I need to settle everything down, ensure it still works and
> clean the branch.
>

ok.

- Alex
Alexander Aring June 30, 2022, 11:39 p.m. UTC | #8
Hi,

On Thu, Jun 30, 2022 at 7:27 PM Alexander Aring <aahringo@redhat.com> wrote:
...
>
> Why not move this decision to the user as well? The kernel will wait
> for the reason? This isn't required to be fast and the decision may
> depend on the current pan management...

To be clear here: this means that a coordinator would then only use a
user space daemon which manages whatever is needed for assoc/disassoc
management, e.g. how short addresses are allocated, etc. That should
also not be part of the kernel. If it is, then should we apply the same
strategy, i.e. keep it only until we have a user space replacement for it?
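
As a rough idea of what such a daemon could own, here is a minimal sketch of a short-address allocator. It is purely illustrative: the `ShortAddrAllocator` name, the lowest-free-address strategy and the default starting address are assumptions, not something proposed in this thread. The only fact relied on is that 0xFFFE and 0xFFFF are reserved values in IEEE 802.15.4.

```python
from typing import Dict, Optional

# 0xFFFE and 0xFFFF are reserved by IEEE 802.15.4, so 0xFFFD is the highest
# assignable short address.
LAST_ASSIGNABLE = 0xFFFD


class ShortAddrAllocator:
    """Hands out short addresses, keyed by the device's extended address."""

    def __init__(self, first: int = 0x0001, last: int = LAST_ASSIGNABLE) -> None:
        if not 0 <= first <= last <= LAST_ASSIGNABLE:
            raise ValueError("invalid short address range")
        self._first = first
        self._last = last
        self._by_ext: Dict[int, int] = {}  # extended address -> short address

    def allocate(self, ext_addr: int) -> Optional[int]:
        """Return the device's short address, or None if the range is full."""
        if ext_addr in self._by_ext:
            return self._by_ext[ext_addr]  # re-association keeps its address
        used = set(self._by_ext.values())
        for addr in range(self._first, self._last + 1):
            if addr not in used:
                self._by_ext[ext_addr] = addr
                return addr
        return None

    def release(self, ext_addr: int) -> None:
        """Free the address on disassociation (no-op for unknown devices)."""
        self._by_ext.pop(ext_addr, None)
```

Any other allocation scheme (random, hashed from the extended address, ...) would fit behind the same two calls, which is the point of keeping this policy out of the kernel.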

- Alex
Miquel Raynal July 1, 2022, 12:50 a.m. UTC | #9
Hi Alexander,

aahringo@redhat.com wrote on Thu, 30 Jun 2022 19:27:49 -0400:

> Hi,
> 
> On Thu, Jun 30, 2022 at 4:14 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:
> >  
> > > Hi,
> > >
> > > On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > >
> > > > Hi Alexander,
> > > >
> > > > aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
> > > >  
> > > > > Hi,
> > > > >
> > > > > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > > > >
> > > > > > Hi Alexander,
> > > > > >
> > > > > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > > > > >  
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > > > > <miquel.raynal@bootlin.com> wrote:  
> > > > > > > >
> > > > > > > > Let's introduce the basics for defining PANs:
> > > > > > > > - structures defining a PAN
> > > > > > > > - helpers for PAN registration
> > > > > > > > - helpers discarding old PANs
> > > > > > > >  
> > > > > > >
> > > > > > > I think the whole pan management can/should be stored in user space by
> > > > > > > a daemon running in background.  
> > > > > >
> > > > > > We need both, and currently:
> > > > > > - while the scan is happening, the kernel saves all the discovered PANs
> > > > > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > > > > >   the userspace
> > > > > >
> > > > > > IOW the userspace is responsible of keeping its own list of PANs in
> > > > > > sync with what the kernel discovers, so at any moment it can ask the
> > > > > > kernel what it has in memory, it can be done during a scan or after. It
> > > > > > can request a new scan to update the entries, or flush the kernel list.
> > > > > > The scan operation is always requested by the user anyway, it's not
> > > > > > something happening in the background.
> > > > > >  
> > > > >
> > > > > I don't see what advantage it has to keep the discovered pan in the
> > > > > kernel. You can do everything with a start/stop/pan discovered event.  
> > > >
> > > > I think the main reason is to be much more user friendly. Keeping track
> > > > of the known PANs in the kernel matters because when you start working
> > > > with 802.15.4 you won't blindly use a daemon (if there is any) and will
> > > > use test apps like iwpan which are stateless. Re-doing a scan on demand
> > > > just takes ages (from seconds to minutes, depending on the beacon
> > > > order).
> > > >  
> > >
> > > I can see that things should work "out-of the box" and we are already
> > > doing it by manual setting pan_id, etc. However, doing it in an
> > > automatic way there exists a lot of "interpretation" about how you
> > > want to handle it (doesn't matter if this is what the spec says or
> > > not)... moving it to user space will offload it to the user.
> > >  
> > > > Aside from this non technical reason, I also had in mind to retrieve
> > > > values gathered from the beacons (and stored in the PAN descriptors) to
> > > > know more about the devices when eg. listing associations, like
> > > > registering the short address of a coordinator. I don't yet know how
> > > > useful this is TBH.
> > > >  
> > > > > It also has more advantages as you can look for a specific pan and
> > > > > stop afterwards. At the end the daemon has everything that the kernel
> > > > > also has, as you said it's in sync.
> > > > >  
> > > > > > > This can be a network manager as it
> > > > > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > > > > offers it in their list to associate with it.  
> > > > > >
> > > > > > There are events produced, yes. But really, this is not something we
> > > > > > actually need. The user requests a scan over a given range, when the
> > > > > > scan is over it looks at the list and decides which PAN it
> > > > > > wants to associate with, and through which coordinator (95% of the
> > > > > > scenarii).
> > > > > >  
> > > > >
> > > > > This isn't either a kernel job to decide which pan it will be
> > > > > associated with.  
> > > >
> > > > Yes, "it looks at the list and decides" referred to "the user".
> > > >  
> > > > > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > > > > information used e.g. as any lookup or something in the hot path", I
> > > > > > > don't see this currently...  
> > > > > >
> > > > > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > > > > keeping them, I think it's easier to be able to serve the list of PANs
> > > > > > upon request rather than only forwarding events and not being able to
> > > > > > retrieve the list a second time (at least during the development).
> > > > > >  
> > > > >
> > > > > This has nothing to do with memory.
> > > > >  
> > > > > > Overall I feel like this part is still a little bit blurry because it
> > > > > > has currently no user, perhaps I should send the next series which
> > > > > > actually makes the current series useful.
> > > > > >  
> > > > >
> > > > > Will it get more used than caching entries in the kernel for user
> > > > > space? Please also no in-kernel association feature.  
> > > >
> > > > I am aligned on this.
> > > >  
> > >
> > > I am sorry I am not sure what that means.  
> >
> > I was referring to the "no in-kernel association feature".
> >
> > There is however one situation which I _had_ to be handled in the
> > kernel: other devices asking for being associated or disassociated. In
> > the case of the disassociation, the receiving device is only notified
> > and cannot refuse the disassociation. For the association however,
> > the device receiving the association request has to make a decision.
> > There are three possible outcomes:
> > - accepting
> > - refusing because the PAN is at capacity
> > - refusing because the device is blacklisted  
> 
> Why not move this decision to the user as well? The kernel will wait
> for the reason? This isn't required to be fast and the decision may
> depend on the current pan management...

I've opted for the simplest option, which is allowing X devices to be
associated, X being configurable by the user. For now I'll keep this
very simple approach; I propose we add the filtering feature later?
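
For illustration, the "at most X devices" rule amounts to something like the sketch below. It is not the actual implementation: `AssociationTable` and `max_devices` are hypothetical names, and only the two status values (0x00 for success, 0x01 for PAN at capacity) come from the 802.15.4 association response.

```python
from enum import IntEnum
from typing import Set


class AssocStatus(IntEnum):
    SUCCESS = 0x00
    PAN_AT_CAPACITY = 0x01


class AssociationTable:
    """Accepts association requests until a user-set limit is reached."""

    def __init__(self, max_devices: int) -> None:
        self.max_devices = max_devices  # "X", set by the user
        self._devices: Set[int] = set()  # extended addresses of associated devices

    def associate(self, ext_addr: int) -> AssocStatus:
        if ext_addr in self._devices:
            return AssocStatus.SUCCESS  # already associated, nothing to add
        if len(self._devices) >= self.max_devices:
            return AssocStatus.PAN_AT_CAPACITY
        self._devices.add(ext_addr)
        return AssocStatus.SUCCESS

    def disassociate(self, ext_addr: int) -> None:
        # Disassociation cannot be refused, so this never fails.
        self._devices.discard(ext_addr)
```

A disassociation frees a slot, so a device refused earlier can associate on a later attempt.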

> > For now I've only implemented the first reason, because it's much
> > easier and only requires a maximum device number variable, set by the
> > user. For the second reason, it requires handling a
> > whitelist/blacklist, I don't plan to implement this for now, but that
> > should not impact the rest of the code. I'll let that to other
> > developers, or future-me, perhaps :-). Anyhow, you can kick-out devices
> > at any time anyway if needed with a disassociation notification
> > controlled by the user.
> >  
> > > > > We can maybe agree to that point to put it under
> > > > > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > > > > _open_ user space program ready we will drop this feature again...
> > > > > this program will show that there is no magic about it.  
> > > >
> > > > Yeah, do you want to move all the code scan/beacon/pan/association code
> > > > under EXPERIMENTAL sections? Or is it just the PAN management logic?  
> > >
> > > Yes, why not. But as I can see there exists two categories of
> > > introducing your netlink api:
> > >
> > > 1. API candidates which are very likely to become stable
> > > 2. API candidates which we want to remove when we have a user
> > > replacement for it (will probably never go stable)
> > >
> > > The 2. should be defining _after_ the 1. In the "big" netlink API
> > > enums of EXPERIMENTAL sections.  
> >
> > Yeah, got it.
> >  
> > > Also you should provide for 2. some kind of ifdef/functions dummy/etc.
> > > that it's easy to remove from the kernel when we have a user
> > > replacement for it.
> > > I hope that is fine for everybody.
> > >
> > > I try to find solutions here, I don't see a reason for putting this
> > > pan management into the kernel... whereas I appreciate the effort
> > > which is done here and will not force you to write some user space
> > > software that does this job. From my point of view I can't accept this
> > > functionality in the kernel "yet".  
> >
> > I've already spent a couple of days reworking all that part, I've
> > dropped most of the in-kernel PAN management, which means:
> > - when a new coordinator gets discovered (beacon received), if the mac
> >   was scanning then it calls a generic function from the cfg layer to
> >   advertise this pan.
> > - the cfg layer will send a NL message to the user with all the
> >   important information
> > - BUT the cfg layer will also keep in memory the beacon information for
> >   the time of the scan (only), to avoid polluting the user with the same
> >   information over and over again, this seems a necessary step to me,
> >   because otherwise if you track on the same channel two coordinators
> >   not emitting at the same pace, you might end up with 100 user
> >   notifications, for just 2 devices. I think this is the kernel duty to
> >   filter out identical beacons.
> >  
> 
> Okay, I am sure if somebody complains about such kernel behaviour and
> has a good argument to switch back... we still can do it.

Great!

> 
> > I _will_ send a v4, including the scanning part this time by the end of
> > the week, I need to settle everything down, ensure it still works and
> > clean the branch.
> >  
> 
> ok.
> 
> - Alex
> 


Thanks,
Miquèl
Alexander Aring July 1, 2022, 12:23 p.m. UTC | #10
Hi,

On Thu, Jun 30, 2022 at 8:50 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> aahringo@redhat.com wrote on Thu, 30 Jun 2022 19:27:49 -0400:
>
> > Hi,
> >
> > On Thu, Jun 30, 2022 at 4:14 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >
> > > Hi Alexander,
> > >
> > > aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:
> > >
> > > > Hi,
> > > >
> > > > On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > > >
> > > > > Hi Alexander,
> > > > >
> > > > > aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > > > > >
> > > > > > > Hi Alexander,
> > > > > > >
> > > > > > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > > > > > <miquel.raynal@bootlin.com> wrote:
> > > > > > > > >
> > > > > > > > > Let's introduce the basics for defining PANs:
> > > > > > > > > - structures defining a PAN
> > > > > > > > > - helpers for PAN registration
> > > > > > > > > - helpers discarding old PANs
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think the whole pan management can/should be stored in user space by
> > > > > > > > a daemon running in background.
> > > > > > >
> > > > > > > We need both, and currently:
> > > > > > > - while the scan is happening, the kernel saves all the discovered PANs
> > > > > > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > > > > > >   the userspace
> > > > > > >
> > > > > > > IOW the userspace is responsible of keeping its own list of PANs in
> > > > > > > sync with what the kernel discovers, so at any moment it can ask the
> > > > > > > kernel what it has in memory, it can be done during a scan or after. It
> > > > > > > can request a new scan to update the entries, or flush the kernel list.
> > > > > > > The scan operation is always requested by the user anyway, it's not
> > > > > > > something happening in the background.
> > > > > > >
> > > > > >
> > > > > > I don't see what advantage it has to keep the discovered pan in the
> > > > > > kernel. You can do everything with a start/stop/pan discovered event.
> > > > >
> > > > > I think the main reason is to be much more user friendly. Keeping track
> > > > > of the known PANs in the kernel matters because when you start working
> > > > > with 802.15.4 you won't blindly use a daemon (if there is any) and will
> > > > > use test apps like iwpan which are stateless. Re-doing a scan on demand
> > > > > just takes ages (from seconds to minutes, depending on the beacon
> > > > > order).
> > > > >
> > > >
> > > > I can see that things should work "out-of the box" and we are already
> > > > doing it by manual setting pan_id, etc. However, doing it in an
> > > > automatic way there exists a lot of "interpretation" about how you
> > > > want to handle it (doesn't matter if this is what the spec says or
> > > > not)... moving it to user space will offload it to the user.
> > > >
> > > > > Aside from this non technical reason, I also had in mind to retrieve
> > > > > values gathered from the beacons (and stored in the PAN descriptors) to
> > > > > know more about the devices when eg. listing associations, like
> > > > > registering the short address of a coordinator. I don't yet know how
> > > > > useful this is TBH.
> > > > >
> > > > > > It also has more advantages as you can look for a specific pan and
> > > > > > stop afterwards. At the end the daemon has everything that the kernel
> > > > > > also has, as you said it's in sync.
> > > > > >
> > > > > > > > This can be a network manager as it
> > > > > > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > > > > > offers it in their list to associate with it.
> > > > > > >
> > > > > > > There are events produced, yes. But really, this is not something we
> > > > > > > actually need. The user requests a scan over a given range, when the
> > > > > > > scan is over it looks at the list and decides which PAN it
> > > > > > > wants to associate with, and through which coordinator (95% of the
> > > > > > > scenarii).
> > > > > > >
> > > > > >
> > > > > > This isn't either a kernel job to decide which pan it will be
> > > > > > associated with.
> > > > >
> > > > > Yes, "it looks at the list and decides" referred to "the user".
> > > > >
> > > > > > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > > > > > information used e.g. as any lookup or something in the hot path", I
> > > > > > > > don't see this currently...
> > > > > > >
> > > > > > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > > > > > keeping them, I think it's easier to be able to serve the list of PANs
> > > > > > > upon request rather than only forwarding events and not being able to
> > > > > > > retrieve the list a second time (at least during the development).
> > > > > > >
> > > > > >
> > > > > > This has nothing to do with memory.
> > > > > >
> > > > > > > Overall I feel like this part is still a little bit blurry because it
> > > > > > > has currently no user, perhaps I should send the next series which
> > > > > > > actually makes the current series useful.
> > > > > > >
> > > > > >
> > > > > > Will it get more used than caching entries in the kernel for user
> > > > > > space? Please also no in-kernel association feature.
> > > > >
> > > > > I am aligned on this.
> > > > >
> > > >
> > > > I am sorry I am not sure what that means.
> > >
> > > I was referring to the "no in-kernel association feature".
> > >
> > > There is however one situation which I _had_ to be handled in the
> > > kernel: other devices asking for being associated or disassociated. In
> > > the case of the disassociation, the receiving device is only notified
> > > and cannot refuse the disassociation. For the association however,
> > > the device receiving the association request has to make a decision.
> > > There are three possible outcomes:
> > > - accepting
> > > - refusing because the PAN is at capacity
> > > - refusing because the device is blacklisted
> >
> > Why not move this decision to the user as well? The kernel will wait
> > for the reason? This isn't required to be fast and the decision may
> > depend on the current pan management...
>
> I've opted out for the simplest option, which is allowing X devices
> being associated, X being manageable by the user. For now I'll keep
> this very simple approach, I propose we add this filtering feature
> later?
>

What I suggest here is to move the filtering logic into user space. If
the interface is a coordinator, it will trigger an event for the user
and wait for upper-layer user space logic to send back an answer saying
what to do.

However, as I said, I won't force you to write user space software
which does that job, but your code should be prepared to be replaced
by such handling.
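
A minimal sketch of that event/answer exchange, seen from the user space side, could look like this. Everything here is hypothetical (the `AssocRequest` event, the policy callable, the denylist); no such netlink event exists in this series. The status values are the 802.15.4 association response codes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

STATUS_SUCCESS = 0x00
STATUS_ACCESS_DENIED = 0x02  # "PAN access denied"


@dataclass(frozen=True)
class AssocRequest:
    """Hypothetical event sent by the kernel when a device asks to associate."""
    ext_addr: int  # 64-bit extended address of the requesting device


# A policy maps one request to the status the kernel should answer with.
Policy = Callable[[AssocRequest], int]


def make_denylist_policy(denied: Iterable[int]) -> Policy:
    """Build a policy refusing the given extended addresses."""
    denied_set = frozenset(denied)

    def policy(req: AssocRequest) -> int:
        if req.ext_addr in denied_set:
            return STATUS_ACCESS_DENIED
        return STATUS_SUCCESS

    return policy


def answer_requests(events: Iterable[AssocRequest],
                    policy: Policy) -> List[Tuple[int, int]]:
    """Compute the (ext_addr, status) answers to send back to the kernel."""
    return [(req.ext_addr, policy(req)) for req in events]
```

Because the decision is just a callable, an in-kernel default and a later user space replacement can sit behind the same hook.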

> > > For now I've only implemented the first reason, because it's much
> > > easier and only requires a maximum device number variable, set by the
> > > user. For the second reason, it requires handling a
> > > whitelist/blacklist, I don't plan to implement this for now, but that
> > > should not impact the rest of the code. I'll let that to other
> > > developers, or future-me, perhaps :-). Anyhow, you can kick-out devices
> > > at any time anyway if needed with a disassociation notification
> > > controlled by the user.
> > >
> > > > > > We can maybe agree to that point to put it under
> > > > > > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > > > > > _open_ user space program ready we will drop this feature again...
> > > > > > this program will show that there is no magic about it.
> > > > >
> > > > > Yeah, do you want to move all the code scan/beacon/pan/association code
> > > > > under EXPERIMENTAL sections? Or is it just the PAN management logic?
> > > >
> > > > Yes, why not. But as I can see there exists two categories of
> > > > introducing your netlink api:
> > > >
> > > > 1. API candidates which are very likely to become stable
> > > > 2. API candidates which we want to remove when we have a user
> > > > replacement for it (will probably never go stable)
> > > >
> > > > The 2. should be defining _after_ the 1. In the "big" netlink API
> > > > enums of EXPERIMENTAL sections.
> > >
> > > Yeah, got it.
> > >
> > > > Also you should provide for 2. some kind of ifdef/functions dummy/etc.
> > > > that it's easy to remove from the kernel when we have a user
> > > > replacement for it.
> > > > I hope that is fine for everybody.
> > > >
> > > > I try to find solutions here, I don't see a reason for putting this
> > > > pan management into the kernel... whereas I appreciate the effort
> > > > which is done here and will not force you to write some user space
> > > > software that does this job. From my point of view I can't accept this
> > > > functionality in the kernel "yet".
> > >
> > > I've already spent a couple of days reworking all that part, I've
> > > dropped most of the in-kernel PAN management, which means:
> > > - when a new coordinator gets discovered (beacon received), if the mac
> > >   was scanning then it calls a generic function from the cfg layer to
> > >   advertise this pan.
> > > - the cfg layer will send a NL message to the user with all the
> > >   important information
> > > - BUT the cfg layer will also keep in memory the beacon information for
> > >   the time of the scan (only), to avoid polluting the user with the same
> > >   information over and over again, this seems a necessary step to me,
> > >   because otherwise if you track on the same channel two coordinators
> > >   not emitting at the same pace, you might end up with 100 user
> > >   notifications, for just 2 devices. I think this is the kernel duty to
> > >   filter out identical beacons.
> > >
> >
> > Okay, I am sure if somebody complains about such kernel behaviour and
> > has a good argument to switch back... we still can do it.
>
> Great!
>

I would say more here... there might be some API documentation stating
that you cannot expect anything from the kernel, but that it tries to
avoid stupid things (whatever that means). As the API is experimental,
it can easily be changed; otherwise some additional flag would be
required to enable this feature or not. However, I can say more about
this when I see code and we have some user experience about what the
default behaviour should be, or whether such a flag is really necessary.

We probably will find some solution...

- Alex
Miquel Raynal July 1, 2022, 2:29 p.m. UTC | #11
Hi Alexander,

aahringo@redhat.com wrote on Fri, 1 Jul 2022 08:23:32 -0400:

> Hi,
> 
> On Thu, Jun 30, 2022 at 8:50 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > aahringo@redhat.com wrote on Thu, 30 Jun 2022 19:27:49 -0400:
> >  
> > > Hi,
> > >
> > > On Thu, Jun 30, 2022 at 4:14 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > >
> > > > Hi Alexander,
> > > >
> > > > aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:
> > > >  
> > > > > Hi,
> > > > >
> > > > > On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > > > >
> > > > > > Hi Alexander,
> > > > > >
> > > > > > aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:
> > > > > >  
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > > > > > > >
> > > > > > > > Hi Alexander,
> > > > > > > >
> > > > > > > > aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:
> > > > > > > >  
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
> > > > > > > > > <miquel.raynal@bootlin.com> wrote:  
> > > > > > > > > >
> > > > > > > > > > Let's introduce the basics for defining PANs:
> > > > > > > > > > - structures defining a PAN
> > > > > > > > > > - helpers for PAN registration
> > > > > > > > > > - helpers discarding old PANs
> > > > > > > > > >  
> > > > > > > > >
> > > > > > > > > I think the whole pan management can/should be stored in user space by
> > > > > > > > > a daemon running in background.  
> > > > > > > >
> > > > > > > > We need both, and currently:
> > > > > > > > - while the scan is happening, the kernel saves all the discovered PANs
> > > > > > > > - the kernel PAN list can be dumped (and also flushed) asynchronously by
> > > > > > > >   the userspace
> > > > > > > >
> > > > > > > > IOW the userspace is responsible of keeping its own list of PANs in
> > > > > > > > sync with what the kernel discovers, so at any moment it can ask the
> > > > > > > > kernel what it has in memory, it can be done during a scan or after. It
> > > > > > > > can request a new scan to update the entries, or flush the kernel list.
> > > > > > > > The scan operation is always requested by the user anyway, it's not
> > > > > > > > something happening in the background.
> > > > > > > >  
> > > > > > >
> > > > > > > I don't see what advantage it has to keep the discovered pan in the
> > > > > > > kernel. You can do everything with a start/stop/pan discovered event.  
> > > > > >
> > > > > > I think the main reason is to be much more user friendly. Keeping track
> > > > > > of the known PANs in the kernel matters because when you start working
> > > > > > with 802.15.4 you won't blindly use a daemon (if there is any) and will
> > > > > > use test apps like iwpan which are stateless. Re-doing a scan on demand
> > > > > > just takes ages (from seconds to minutes, depending on the beacon
> > > > > > order).
> > > > > >  
> > > > >
> > > > > I can see that things should work "out-of the box" and we are already
> > > > > doing it by manual setting pan_id, etc. However, doing it in an
> > > > > automatic way there exists a lot of "interpretation" about how you
> > > > > want to handle it (doesn't matter if this is what the spec says or
> > > > > not)... moving it to user space will offload it to the user.
> > > > >  
> > > > > > Aside from this non technical reason, I also had in mind to retrieve
> > > > > > values gathered from the beacons (and stored in the PAN descriptors) to
> > > > > > know more about the devices when eg. listing associations, like
> > > > > > registering the short address of a coordinator. I don't yet know how
> > > > > > useful this is TBH.
> > > > > >  
> > > > > > > It also has more advantages as you can look for a specific pan and
> > > > > > > stop afterwards. At the end the daemon has everything that the kernel
> > > > > > > also has, as you said it's in sync.
> > > > > > >  
> > > > > > > > > This can be a network manager as it
> > > > > > > > > listens to netlink events as "detect PAN xy" and stores it and
> > > > > > > > > offers it in their list to associate with it.  
> > > > > > > >
> > > > > > > > There are events produced, yes. But really, this is not something we
> > > > > > > > actually need. The user requests a scan over a given range, when the
> > > > > > > > scan is over it looks at the list and decides which PAN it
> > > > > > > > wants to associate with, and through which coordinator (95% of the
> > > > > > > > scenarii).
> > > > > > > >  
> > > > > > >
> > > > > > > This isn't either a kernel job to decide which pan it will be
> > > > > > > associated with.  
> > > > > >
> > > > > > Yes, "it looks at the list and decides" referred to "the user".
> > > > > >  
> > > > > > > > > We need somewhere to draw a line and I guess the line is "Is this
> > > > > > > > > information used e.g. as any lookup or something in the hot path", I
> > > > > > > > > don't see this currently...  
> > > > > > > >
> > > > > > > > Each PAN descriptor is like 20 bytes, so that's why I don't feel back
> > > > > > > > keeping them, I think it's easier to be able to serve the list of PANs
> > > > > > > > upon request rather than only forwarding events and not being able to
> > > > > > > > retrieve the list a second time (at least during the development).
> > > > > > > >  
> > > > > > >
> > > > > > > This has nothing to do with memory.
> > > > > > >  
> > > > > > > > Overall I feel like this part is still a little bit blurry because it
> > > > > > > > has currently no user, perhaps I should send the next series which
> > > > > > > > actually makes the current series useful.
> > > > > > > >  
> > > > > > >
> > > > > > > Will it get more used than caching entries in the kernel for user
> > > > > > > space? Please also no in-kernel association feature.  
> > > > > >
> > > > > > I am aligned on this.
> > > > > >  
> > > > >
> > > > > I am sorry I am not sure what that means.  
> > > >
> > > > I was referring to the "no in-kernel association feature".
> > > >
> > > > There is however one situation which I _had_ to be handled in the
> > > > kernel: other devices asking for being associated or disassociated. In
> > > > the case of the disassociation, the receiving device is only notified
> > > > and cannot refuse the disassociation. For the association however,
> > > > the device receiving the association request has to make a decision.
> > > > There are three possible outcomes:
> > > > - accepting
> > > > - refusing because the PAN is at capacity
> > > > - refusing because the device is blacklisted  
> > >
> > > Why not move this decision to the user as well? The kernel will wait
> > > for the reason? This isn't required to be fast and the decision may
> > > depend on the current pan management...  
> >
> > I've opted out for the simplest option, which is allowing X devices
> > being associated, X being manageable by the user. For now I'll keep
> > this very simple approach, I propose we add this filtering feature
> > later?
> >  
> 
> What I suggest here is to move the filtering logic into the user
> space. If the interface is a coordinator it will trigger an event for
> the user and waits for an upper layer user space logic to get an
> answer back what to do as answer.
> 
> However as I said I don't force you to program a user space software
> which does that job but you code should be prepared to be get replaced
> by such handling.

Actually I really think we should keep the maximum value, which returns
the 802.15.4 "PAN AT CAPACITY" error status. The filtering thing is an
additional feature; I don't think it will replace what I currently
provide, but rather complement it.

> > > > For now I've only implemented the first reason, because it's much
> > > > easier and only requires a maximum device number variable, set by the
> > > > user. For the second reason, it requires handling a
> > > > whitelist/blacklist, I don't plan to implement this for now, but that
> > > > should not impact the rest of the code. I'll let that to other
> > > > developers, or future-me, perhaps :-). Anyhow, you can kick-out devices
> > > > at any time anyway if needed with a disassociation notification
> > > > controlled by the user.
> > > >  
> > > > > > > We can maybe agree to that point to put it under
> > > > > > > IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
> > > > > > > _open_ user space program ready we will drop this feature again...
> > > > > > > this program will show that there is no magic about it.  
> > > > > >
> > > > > > Yeah, do you want to move all the code scan/beacon/pan/association code
> > > > > > under EXPERIMENTAL sections? Or is it just the PAN management logic?  
> > > > >
> > > > > Yes, why not. But as I can see there exists two categories of
> > > > > introducing your netlink api:
> > > > >
> > > > > 1. API candidates which are very likely to become stable
> > > > > 2. API candidates which we want to remove when we have a user
> > > > > replacement for it (will probably never go stable)
> > > > >
> > > > > The 2. should be defining _after_ the 1. In the "big" netlink API
> > > > > enums of EXPERIMENTAL sections.  
> > > >
> > > > Yeah, got it.
> > > >  
> > > > > Also you should provide for 2. some kind of ifdef/functions dummy/etc.
> > > > > that it's easy to remove from the kernel when we have a user
> > > > > replacement for it.
> > > > > I hope that is fine for everybody.
> > > > >
> > > > > I try to find solutions here, I don't see a reason for putting this
> > > > > pan management into the kernel... whereas I appreciate the effort
> > > > > which is done here and will not force you to write some user space
> > > > > software that does this job. From my point of view I can't accept this
> > > > > functionality in the kernel "yet".  
> > > >
> > > > I've already spent a couple of days reworking all that part, I've
> > > > dropped most of the in-kernel PAN management, which means:
> > > > - when a new coordinator gets discovered (beacon received), if the mac
> > > >   was scanning then it calls a generic function from the cfg layer to
> > > >   advertise this pan.
> > > > - the cfg layer will send a NL message to the user with all the
> > > >   important information
> > > > - BUT the cfg layer will also keep in memory the beacon information for
> > > >   the time of the scan (only), to avoid polluting the user with the same
> > > >   information over and over again, this seems a necessary step to me,
> > > >   because otherwise if you track on the same channel two coordinators
> > > >   not emitting at the same pace, you might end up with 100 user
> > > >   notifications, for just 2 devices. I think this is the kernel duty to
> > > >   filter out identical beacons.
> > > >  
> > >
> > > Okay, I am sure if somebody complains about such kernel behaviour and
> > > has a good argument to switch back... we still can do it.  
> >
> > Great!
> >  
> 
> I would say more here... there might be some API documentation where
> you cannot expect anything from the kernel but it tries to avoid
> stupid things (Whatever that means). As the API is experimental it can
> be easily changed, otherwise some additional flag is required to
> enable this feature or not. However I can say more about this when I
> see code and we have some user experience about whatever the default
> behaviour should be or such flag is really necessary.

As I've dropped the entire in-kernel PAN management handling in favor
of userspace, I have not put anything in the EXPERIMENTAL section yet.
I expect there will be further versions, so we can decide what else
should go in there and I'll do it in the next one.
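
For reference, the duplicate-beacon filtering kept in the kernel for the
duration of a scan could look roughly like the userspace sketch below.
This is only an illustration under assumptions: all names, the fixed
cache size and the choice of fields identifying a coordinator are
hypothetical and not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SCAN_MAX_SEEN 16	/* arbitrary bound for this sketch */

/* Hypothetical identity of a coordinator as seen in a beacon. */
struct seen_coord {
	uint8_t page;
	uint8_t channel;
	uint16_t pan_id;
	uint64_t extended_addr;
};

/* Per-scan cache, reset when a scan starts and dropped when it ends. */
struct scan_cache {
	struct seen_coord seen[SCAN_MAX_SEEN];
	unsigned int count;
};

static void scan_cache_reset(struct scan_cache *sc)
{
	memset(sc, 0, sizeof(*sc));
}

static bool same_coord(const struct seen_coord *a, const struct seen_coord *b)
{
	return a->page == b->page && a->channel == b->channel &&
	       a->pan_id == b->pan_id && a->extended_addr == b->extended_addr;
}

/*
 * Returns true when userspace should be notified, i.e. the first time this
 * coordinator is seen during the current scan. If the cache is full the
 * beacon is reported anyway: a duplicate notification is preferred over a
 * silently dropped coordinator.
 */
static bool scan_cache_should_notify(struct scan_cache *sc,
				     const struct seen_coord *c)
{
	unsigned int i;

	for (i = 0; i < sc->count; i++)
		if (same_coord(&sc->seen[i], c))
			return false;

	if (sc->count < SCAN_MAX_SEEN)
		sc->seen[sc->count++] = *c;

	return true;
}
```

Two coordinators beaconing at different paces on the same channel then
produce two notifications per scan, however many beacons are received.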

Thanks,
Miquèl

Patch

diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
index 04b996895fc1..9838eca3c41e 100644
--- a/include/net/cfg802154.h
+++ b/include/net/cfg802154.h
@@ -255,6 +255,24 @@  struct ieee802154_addr {
 	};
 };
 
+/**
+ * struct ieee802154_pan_desc - PAN descriptor information
+ * @coord: PAN ID and coordinator address
+ * @page: page this PAN is on
+ * @channel: channel this PAN is on
+ * @superframe_spec: SuperFrame specification as received
+ * @link_quality: link quality indicator at which the beacon was received
+ * @gts_permit: the PAN coordinator accepts GTS requests
+ */
+struct ieee802154_pan_desc {
+	struct ieee802154_addr *coord;
+	u8 page;
+	u8 channel;
+	u16 superframe_spec;
+	u8 link_quality;
+	bool gts_permit;
+};
+
 struct ieee802154_llsec_key_id {
 	u8 mode;
 	u8 id;
@@ -426,4 +444,17 @@  static inline const char *wpan_phy_name(struct wpan_phy *phy)
 
 void ieee802154_configure_durations(struct wpan_phy *phy);
 
+/**
+ * cfg802154_record_pan - Advertise a new PAN following a beacon's reception
+ * @wpan_phy: PHY receiving the beacon
+ * @pan: PAN descriptor
+ *
+ * Tells the internal PAN management layer to either register this PAN if it is
+ * new or at least update its entry if already discovered.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int cfg802154_record_pan(struct wpan_phy *wpan_phy,
+			 struct ieee802154_pan_desc *pan);
+
 #endif /* __NET_CFG802154_H */
diff --git a/net/ieee802154/Makefile b/net/ieee802154/Makefile
index f05b7bdae2aa..6b7c66de730d 100644
--- a/net/ieee802154/Makefile
+++ b/net/ieee802154/Makefile
@@ -4,7 +4,7 @@  obj-$(CONFIG_IEEE802154_SOCKET) += ieee802154_socket.o
 obj-y += 6lowpan/
 
 ieee802154-y := netlink.o nl-mac.o nl-phy.o nl_policy.o core.o \
-                header_ops.o sysfs.o nl802154.o trace.o
+                header_ops.o sysfs.o nl802154.o pan.o trace.o
 ieee802154_socket-y := socket.o
 
 CFLAGS_trace.o := -I$(src)
diff --git a/net/ieee802154/core.c b/net/ieee802154/core.c
index 57546e07e06a..f642db35d62e 100644
--- a/net/ieee802154/core.c
+++ b/net/ieee802154/core.c
@@ -115,6 +115,8 @@  wpan_phy_new(const struct cfg802154_ops *ops, size_t priv_size)
 		kfree(rdev);
 		return NULL;
 	}
+	spin_lock_init(&rdev->pan_lock);
+	INIT_LIST_HEAD(&rdev->pan_list);
 
 	/* atomic_inc_return makes it start at 1, make it start at 0 */
 	rdev->wpan_phy_idx--;
diff --git a/net/ieee802154/core.h b/net/ieee802154/core.h
index 1c19f575d574..cae071bede37 100644
--- a/net/ieee802154/core.h
+++ b/net/ieee802154/core.h
@@ -22,6 +22,14 @@  struct cfg802154_registered_device {
 	struct list_head wpan_dev_list;
 	int devlist_generation, wpan_dev_id;
 
+	/* PAN management */
+	spinlock_t pan_lock;
+	struct list_head pan_list;
+	unsigned int max_pan_entries;
+	unsigned int pan_expiration;
+	unsigned int pan_entries;
+	unsigned int pan_generation;
+
 	/* must be last because of the way we do wpan_phy_priv(),
 	 * and it should at least be aligned to NETDEV_ALIGN
 	 */
@@ -39,6 +47,17 @@  wpan_phy_to_rdev(struct wpan_phy *wpan_phy)
 extern struct list_head cfg802154_rdev_list;
 extern int cfg802154_rdev_list_generation;
 
+struct cfg802154_internal_pan {
+	struct list_head list;
+	unsigned long discovery_ts;
+	struct ieee802154_pan_desc desc;
+};
+
+/* Drops expired PANs, then iterates: two statements, needs pan_lock held */
+#define ieee802154_for_each_pan(pan, rdev)				\
+	cfg802154_expire_pans(rdev);					\
+	list_for_each_entry((pan), &(rdev)->pan_list, list)
+
 int cfg802154_switch_netns(struct cfg802154_registered_device *rdev,
 			   struct net *net);
 /* free object */
@@ -47,4 +66,11 @@  struct cfg802154_registered_device *
 cfg802154_rdev_by_wpan_phy_idx(int wpan_phy_idx);
 struct wpan_phy *wpan_phy_idx_to_wpan_phy(int wpan_phy_idx);
 
+void cfg802154_set_max_pan_entries(struct cfg802154_registered_device *rdev,
+				   unsigned int max);
+void cfg802154_set_pans_expiration(struct cfg802154_registered_device *rdev,
+				   unsigned int exp_time_s);
+void cfg802154_expire_pans(struct cfg802154_registered_device *rdev);
+void cfg802154_flush_pans(struct cfg802154_registered_device *rdev);
+
 #endif /* __IEEE802154_CORE_H */
diff --git a/net/ieee802154/pan.c b/net/ieee802154/pan.c
new file mode 100644
index 000000000000..b9f50f785960
--- /dev/null
+++ b/net/ieee802154/pan.c
@@ -0,0 +1,231 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * IEEE 802.15.4 PAN management
+ *
+ * Copyright (C) Qorvo, 2021
+ * Authors:
+ *   - David Girault <david.girault@qorvo.com>
+ *   - Miquel Raynal <miquel.raynal@bootlin.com>
+ */
+
+#include <linux/slab.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/device.h>
+
+#include <net/cfg802154.h>
+#include <net/af_ieee802154.h>
+
+#include "ieee802154.h"
+#include "core.h"
+
+static struct cfg802154_internal_pan *
+cfg802154_alloc_pan(struct ieee802154_pan_desc *desc)
+{
+	struct cfg802154_internal_pan *new;
+	struct ieee802154_addr *coord;
+
+	new = kzalloc(sizeof(*new), GFP_ATOMIC);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+	coord = kzalloc(sizeof(*coord), GFP_ATOMIC);
+	if (!coord) {
+		kfree(new);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	new->discovery_ts = jiffies;
+	new->desc = *desc;
+
+	*coord = *desc->coord;
+	new->desc.coord = coord;
+
+	return new;
+}
+
+static void cfg802154_free_pan(struct cfg802154_internal_pan *pan)
+{
+	kfree(pan->desc.coord);
+	kfree(pan);
+}
+
+static void cfg802154_unlink_pan(struct cfg802154_registered_device *rdev,
+				 struct cfg802154_internal_pan *pan)
+{
+	lockdep_assert_held(&rdev->pan_lock);
+
+	list_del(&pan->list);
+	cfg802154_free_pan(pan);
+	rdev->pan_entries--;
+	rdev->pan_generation++;
+}
+
+static void cfg802154_link_pan(struct cfg802154_registered_device *rdev,
+			       struct cfg802154_internal_pan *pan)
+{
+	lockdep_assert_held(&rdev->pan_lock);
+
+	list_add_tail(&pan->list, &rdev->pan_list);
+	rdev->pan_entries++;
+	rdev->pan_generation++;
+}
+
+void cfg802154_set_max_pan_entries(struct cfg802154_registered_device *rdev,
+				   unsigned int max)
+{
+	lockdep_assert_held(&rdev->pan_lock);
+
+	rdev->max_pan_entries = max;
+}
+EXPORT_SYMBOL(cfg802154_set_max_pan_entries);
+
+static bool
+cfg802154_need_to_expire_pans(struct cfg802154_registered_device *rdev)
+{
+	if (!rdev->max_pan_entries)
+		return false;
+
+	if (rdev->pan_entries >= rdev->max_pan_entries)
+		return true;
+
+	return false;
+}
+
+void cfg802154_set_pans_expiration(struct cfg802154_registered_device *rdev,
+				   unsigned int exp_time_s)
+{
+	lockdep_assert_held(&rdev->pan_lock);
+
+	rdev->pan_expiration = exp_time_s * HZ;
+}
+EXPORT_SYMBOL(cfg802154_set_pans_expiration);
+
+void cfg802154_expire_pans(struct cfg802154_registered_device *rdev)
+{
+	struct cfg802154_internal_pan *pan, *tmp;
+	unsigned long expiration_time;
+
+	lockdep_assert_held(&rdev->pan_lock);
+
+	if (!rdev->pan_expiration)
+		return;
+
+	expiration_time = jiffies - rdev->pan_expiration;
+	list_for_each_entry_safe(pan, tmp, &rdev->pan_list, list) {
+		if (!time_after(expiration_time, pan->discovery_ts))
+			continue;
+
+		cfg802154_unlink_pan(rdev, pan);
+	}
+}
+EXPORT_SYMBOL(cfg802154_expire_pans);
+
+static void cfg802154_expire_oldest_pan(struct cfg802154_registered_device *rdev)
+{
+	struct cfg802154_internal_pan *pan, *oldest;
+
+	lockdep_assert_held(&rdev->pan_lock);
+
+	if (WARN_ON(list_empty(&rdev->pan_list)))
+		return;
+
+	oldest = list_first_entry(&rdev->pan_list,
+				  struct cfg802154_internal_pan, list);
+
+	list_for_each_entry(pan, &rdev->pan_list, list) {
+		if (!time_before(oldest->discovery_ts, pan->discovery_ts))
+			oldest = pan;
+	}
+
+	cfg802154_unlink_pan(rdev, oldest);
+}
+
+void cfg802154_flush_pans(struct cfg802154_registered_device *rdev)
+{
+	struct cfg802154_internal_pan *pan, *tmp;
+
+	lockdep_assert_held(&rdev->pan_lock);
+
+	list_for_each_entry_safe(pan, tmp, &rdev->pan_list, list)
+		cfg802154_unlink_pan(rdev, pan);
+}
+EXPORT_SYMBOL(cfg802154_flush_pans);
+
+static bool cfg802154_same_pan(struct ieee802154_pan_desc *a,
+			       struct ieee802154_pan_desc *b)
+{
+	int ret;
+
+	if (a->page != b->page)
+		return false;
+
+	if (a->channel != b->channel)
+		return false;
+
+	ret = memcmp(&a->coord->pan_id, &b->coord->pan_id,
+		     sizeof(a->coord->pan_id));
+	if (ret)
+		return false;
+
+	if (a->coord->mode != b->coord->mode)
+		return false;
+
+	if (a->coord->mode == IEEE802154_ADDR_SHORT)
+		ret = memcmp(&a->coord->short_addr, &b->coord->short_addr,
+			     IEEE802154_SHORT_ADDR_LEN);
+	else
+		ret = memcmp(&a->coord->extended_addr, &b->coord->extended_addr,
+			     IEEE802154_EXTENDED_ADDR_LEN);
+
+	return !ret;
+}
+
+static struct cfg802154_internal_pan *
+cfg802154_find_matching_pan(struct cfg802154_registered_device *rdev,
+			    struct cfg802154_internal_pan *tmp)
+{
+	struct cfg802154_internal_pan *pan;
+
+	list_for_each_entry(pan, &rdev->pan_list, list) {
+		if (cfg802154_same_pan(&pan->desc, &tmp->desc))
+			return pan;
+	}
+
+	return NULL;
+}
+
+static void cfg802154_pan_update(struct cfg802154_registered_device *rdev,
+				 struct cfg802154_internal_pan *new)
+{
+	struct cfg802154_internal_pan *found;
+
+	spin_lock_bh(&rdev->pan_lock);
+
+	found = cfg802154_find_matching_pan(rdev, new);
+	if (found)
+		cfg802154_unlink_pan(rdev, found);
+
+	if (unlikely(cfg802154_need_to_expire_pans(rdev)))
+		cfg802154_expire_oldest_pan(rdev);
+
+	cfg802154_link_pan(rdev, new);
+
+	spin_unlock_bh(&rdev->pan_lock);
+}
+
+int cfg802154_record_pan(struct wpan_phy *wpan_phy,
+			 struct ieee802154_pan_desc *desc)
+{
+	struct cfg802154_registered_device *rdev = wpan_phy_to_rdev(wpan_phy);
+	struct cfg802154_internal_pan *new;
+
+	new = cfg802154_alloc_pan(desc);
+	if (IS_ERR(new))
+		return PTR_ERR(new);
+
+	cfg802154_pan_update(rdev, new);
+
+	return 0;
+}
+EXPORT_SYMBOL(cfg802154_record_pan);
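
The record/evict policy implemented by pan.c can be modelled in a few
lines of userspace C. The sketch below is an approximation for
discussion only: it uses a fixed array instead of a linked list, takes
no lock, identifies a PAN by its PAN ID alone, and assumes the entry
limit is meant as a hard cap. All `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SLOTS 8	/* storage size of this sketch, not a kernel value */

struct model_pan {
	bool used;
	uint16_t pan_id;
	uint32_t discovery_ts;	/* stands in for jiffies */
};

struct model_table {
	struct model_pan slot[MODEL_SLOTS];
	unsigned int entries;
	unsigned int max_entries;	/* 0 disables the limit */
};

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
static bool ts_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

static void model_drop(struct model_table *t, struct model_pan *p)
{
	p->used = false;
	t->entries--;
}

/* Find the entry with the oldest discovery timestamp, NULL if empty. */
static struct model_pan *model_oldest(struct model_table *t)
{
	struct model_pan *oldest = NULL;
	unsigned int i;

	for (i = 0; i < MODEL_SLOTS; i++) {
		struct model_pan *p = &t->slot[i];

		if (p->used && (!oldest ||
				ts_after(oldest->discovery_ts, p->discovery_ts)))
			oldest = p;
	}
	return oldest;
}

/* Record a PAN: refresh a known one, else evict the oldest when full. */
static int model_record(struct model_table *t, uint16_t pan_id, uint32_t now)
{
	unsigned int i;

	for (i = 0; i < MODEL_SLOTS; i++)
		if (t->slot[i].used && t->slot[i].pan_id == pan_id)
			model_drop(t, &t->slot[i]);

	if (t->max_entries && t->entries >= t->max_entries)
		model_drop(t, model_oldest(t));

	for (i = 0; i < MODEL_SLOTS; i++) {
		if (t->slot[i].used)
			continue;
		t->slot[i] = (struct model_pan){ true, pan_id, now };
		t->entries++;
		return 0;
	}
	return -1;	/* no free slot in the model */
}
```

Dropping a matching entry before the capacity check means that
refreshing a known PAN never evicts an unrelated one.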