diff mbox series

[RFC,v2,net-next,1/2] rtnetlink: Add new RTM_GETSYNCESTATE message to get SyncE status

Message ID 20210829080512.3573627-2-maciej.machnikowski@intel.com (mailing list archive)
State New
Headers show
Series Add RTNL interface for SyncE | expand

Commit Message

Machnikowski, Maciej Aug. 29, 2021, 8:05 a.m. UTC
This patch adds the new RTM_GETSYNCESTATE message to query the status
of SyncE syntonization on the device.

Initial implementation returns:
 - SyncE DPLL state
 - Source of signal driving SyncE DPLL (SyncE, GNSS, PTP or External)
 - Current index of Pin driving the DPLL

SyncE state read needs to be implemented as ndo_get_synce_state function.

This patch is SyncE-oriented. Future implementation can add additional
functionality for reading different DPLL states using the same structure.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h      |  6 +++
 include/uapi/linux/if_link.h   | 43 +++++++++++++++++++
 include/uapi/linux/rtnetlink.h | 11 +++--
 net/core/rtnetlink.c           | 77 ++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |  3 +-
 5 files changed, 135 insertions(+), 5 deletions(-)

Comments

Richard Cochran Aug. 29, 2021, 3:10 p.m. UTC | #1
On Sun, Aug 29, 2021 at 10:05:11AM +0200, Maciej Machnikowski wrote:
> This patch adds the new RTM_GETSYNCESTATE message to query the status
> of SyncE syntonization on the device.
> 
> Initial implementation returns:
>  - SyncE DPLL state
>  - Source of signal driving SyncE DPLL (SyncE, GNSS, PTP or External)
>  - Current index of Pin driving the DPLL
> 
> SyncE state read needs to be implemented as ndo_get_synce_state function.
> 
> This patch is SyncE-oriented. Future implementation can add additional
> functionality for reading different DPLL states using the same structure.

I would call this more "ice oriented" than SyncE oriented.  I'm not
sure there is even such a thing as "SyncE DPLL".  Does that term come
from 802.3?  To my understanding, that is one just way of implementing
it that works on super-Gigabit speed devices.

I have nothing against exposing the DPLL if you need to, however I'd
like to have an interface that support plain Gigabit as well.  This
could be done in a generic way by offering Control Register 9 as
described in 802.3.

Thanks,
Richard
Machnikowski, Maciej Aug. 29, 2021, 4:42 p.m. UTC | #2
> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: Sunday, August 29, 2021 5:10 PM
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Sun, Aug 29, 2021 at 10:05:11AM +0200, Maciej Machnikowski wrote:
> >
> > This patch is SyncE-oriented. Future implementation can add additional
> > functionality for reading different DPLL states using the same structure.
> 
> I would call this more "ice oriented" than SyncE oriented.  I'm not sure there is
> even such a thing as "SyncE DPLL".  Does that term come from 802.3?  To my
> understanding, that is one just way of implementing it that works on super-
> Gigabit speed devices.
> 
Hi,
This interface is ITU-T G.8264 SyncE-oriented. It is meant to monitor the state
of Ethernet Equipment Clock.

ITU-T G.8264 recommendation defines Synchronous Ethernet equipment
as a device  equipped with a system clock (e.g., a synchronous Ethernet
equipment clock). SyncE interfaces are able to extract the received clock
and pass it to a system clock.

Please take a look at the 10.2 Operation modes of the G.8264 and at the Figure A.1
which depicts the EEC. This interface is to report the status of the EEC.

If you prefer EEC over DPLL I'm fine with the name change. I think it will be less confusing.

> I have nothing against exposing the DPLL if you need to, however I'd like to have
> an interface that support plain Gigabit as well.  This could be done in a generic
> way by offering Control Register 9 as described in 802.3.

This part of Gigabit interface is a different part of SyncE device. It controls Master/Slave
operation of auto-negotiation. 
You would use it in slave mode if you want your EEC to tune to the frequency recovered
from network and to master if you use external source for your EEC and want to send it
as a reference for another devices. The decision can be made based on the EEC state
read by the interface proposed in this RFC.

This is a functionality that belongs to a different interface mentioned in the next steps.

Regards
Maciek
Andrew Lunn Aug. 29, 2021, 8:16 p.m. UTC | #3
> > I have nothing against exposing the DPLL if you need to, however I'd like to have
> > an interface that support plain Gigabit as well.  This could be done in a generic
> > way by offering Control Register 9 as described in 802.3.

Are we talking about Clause 22, register 9, also known as MII_CTRL1000?

> This part of Gigabit interface is a different part of SyncE device. It controls Master/Slave
> operation of auto-negotiation.

This is controlled using ethtool -s

      ethtool -s devname [speed N] [duplex half|full] [port tp|aui|bnc|mii]
              ....
              [sopass xx:yy:zz:aa:bb:cc] [master-slave preferred-
              master|preferred-slave|forced-master|forced-slave]

      Andrew
Richard Cochran Aug. 30, 2021, 8:57 p.m. UTC | #4
On Sun, Aug 29, 2021 at 04:42:55PM +0000, Machnikowski, Maciej wrote:

> Please take a look at the 10.2 Operation modes of the G.8264 and at the Figure A.1
> which depicts the EEC. This interface is to report the status of the EEC.

Well, I read it, and it is still fairly high level with no mention at
all of "DPLL".  I hope that the new RTNL states will cover other
possible EEC implementations, too.

The "Reference source selection mechanism" is also quite vague.  Your
patch is more specific:

+enum if_eec_src {
+       IF_EEC_SRC_INVALID = 0,
+       IF_EEC_SRC_UNKNOWN,
+       IF_EEC_SRC_SYNCE,
+       IF_EEC_SRC_GNSS,
+       IF_EEC_SRC_PTP,
+       IF_EEC_SRC_EXT,
+       __IF_EEC_SRC_MAX,
+};

But I guess your list is reasonable.  It can always be expanded, right?


> If you prefer EEC over DPLL I'm fine with the name change. I think it will be less confusing.

Yes, thanks for doing that.

Thanks,
Richard
Jakub Kicinski Aug. 30, 2021, 11:29 p.m. UTC | #5
On Mon, 30 Aug 2021 13:57:58 -0700 Richard Cochran wrote:
> > Please take a look at the 10.2 Operation modes of the G.8264 and at the Figure A.1
> > which depicts the EEC. This interface is to report the status of the EEC.  
> 
> Well, I read it, and it is still fairly high level with no mention at
> all of "DPLL".  I hope that the new RTNL states will cover other
> possible EEC implementations, too.
> 
> The "Reference source selection mechanism" is also quite vague.  Your
> patch is more specific:
> 
> +enum if_eec_src {
> +       IF_EEC_SRC_INVALID = 0,
> +       IF_EEC_SRC_UNKNOWN,
> +       IF_EEC_SRC_SYNCE,
> +       IF_EEC_SRC_GNSS,

Hmm, IDK if this really belongs in RTNL. The OCP time card that
Jonathan works on also wants to report signal lock, and it locks
to GNSS. It doesn't have any networking functionality whatsoever.

Can we add a genetlink family for clock info/configuration? From 
what I understood discussing this with Jonathan it sounded like most
clocks today have a vendor-specific character device for configuration
and reading status.

I'm happy to write the plumbing if this seems like an okay idea 
but too much work for anyone to commit.

> +       IF_EEC_SRC_PTP,
> +       IF_EEC_SRC_EXT,
> +       __IF_EEC_SRC_MAX,
> +};
> 
> But I guess your list is reasonable.  It can always be expanded, right?
Machnikowski, Maciej Aug. 31, 2021, 10:20 a.m. UTC | #6
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, August 31, 2021 1:29 AM
> To: Richard Cochran <richardcochran@gmail.com>
> Cc: Machnikowski, Maciej <maciej.machnikowski@intel.com>;
> netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org;
> abyagowi@fb.com; Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> davem@davemloft.net; linux-kselftest@vger.kernel.org; bsd@fb.com
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Mon, 30 Aug 2021 13:57:58 -0700 Richard Cochran wrote:
> > > Please take a look at the 10.2 Operation modes of the G.8264 and at the
> Figure A.1
> > > which depicts the EEC. This interface is to report the status of the EEC.
> >
> > Well, I read it, and it is still fairly high level with no mention at
> > all of "DPLL".  I hope that the new RTNL states will cover other
> > possible EEC implementations, too.
> >
> > The "Reference source selection mechanism" is also quite vague.  Your
> > patch is more specific:
> >
> > +enum if_eec_src {
> > +       IF_EEC_SRC_INVALID = 0,
> > +       IF_EEC_SRC_UNKNOWN,
> > +       IF_EEC_SRC_SYNCE,
> > +       IF_EEC_SRC_GNSS,
> 
> Hmm, IDK if this really belongs in RTNL. The OCP time card that
> Jonathan works on also wants to report signal lock, and it locks
> to GNSS. It doesn't have any networking functionality whatsoever.
> 
> Can we add a genetlink family for clock info/configuration? From
> what I understood discussing this with Jonathan it sounded like most
> clocks today have a vendor-specific character device for configuration
> and reading status.
> 
> I'm happy to write the plumbing if this seems like an okay idea
> but too much work for anyone to commit.
> 

I agree that this also is useful for Time card, yet it's also useful here.
PTP subsystem should implement a similar logic to this one for
DPLL-driven timers which can lock its frequency to external sources.

The reasoning behind putting it here is to enable returning the lock
to the GNSS receiver embedded on the NIC as a source for the
SyncE frequency. It helps distinguishing the embedded GNSS
from the external sources. As a result - the upper layer can report
GNSS lock based only on this message without the need to put the
embedded  GNSS receiver in the config file. On the other hand - if
sync to External source is reported such SW would need to read
the source of external sync from the config file.

And the list is expandable - if we need to define more embedded
sync source types we can always add more to it.

Regards
Maciek
Jakub Kicinski Aug. 31, 2021, 1:33 p.m. UTC | #7
On Tue, 31 Aug 2021 10:20:18 +0000 Machnikowski, Maciej wrote:
> > Hmm, IDK if this really belongs in RTNL. The OCP time card that
> > Jonathan works on also wants to report signal lock, and it locks
> > to GNSS. It doesn't have any networking functionality whatsoever.
> > 
> > Can we add a genetlink family for clock info/configuration? From
> > what I understood discussing this with Jonathan it sounded like most
> > clocks today have a vendor-specific character device for configuration
> > and reading status.
> > 
> > I'm happy to write the plumbing if this seems like an okay idea
> > but too much work for anyone to commit.
> >   
> 
> I agree that this also is useful for Time card, yet it's also useful here.
> PTP subsystem should implement a similar logic to this one for
> DPLL-driven timers which can lock its frequency to external sources.

Why would we have two APIs for doing the same thing? IIUC Richard does
not want this in the PTP ioctls which is fair, but we need to cater to
devices which do not have netdevs.

> The reasoning behind putting it here is to enable returning the lock
> to the GNSS receiver embedded on the NIC as a source for the
> SyncE frequency. It helps distinguishing the embedded GNSS
> from the external sources. As a result - the upper layer can report
> GNSS lock based only on this message without the need to put the
> embedded  GNSS receiver in the config file. On the other hand - if
> sync to External source is reported such SW would need to read
> the source of external sync from the config file.
> 
> And the list is expandable - if we need to define more embedded
> sync source types we can always add more to it.
Machnikowski, Maciej Aug. 31, 2021, 2:07 p.m. UTC | #8
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, August 31, 2021 3:33 PM
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Tue, 31 Aug 2021 10:20:18 +0000 Machnikowski, Maciej wrote:
> > > Hmm, IDK if this really belongs in RTNL. The OCP time card that
> > > Jonathan works on also wants to report signal lock, and it locks to
> > > GNSS. It doesn't have any networking functionality whatsoever.
> > >
> > > Can we add a genetlink family for clock info/configuration? From
> > > what I understood discussing this with Jonathan it sounded like most
> > > clocks today have a vendor-specific character device for
> > > configuration and reading status.
> > >
> > > I'm happy to write the plumbing if this seems like an okay idea but
> > > too much work for anyone to commit.
> > >
> >
> > I agree that this also is useful for Time card, yet it's also useful here.
> > PTP subsystem should implement a similar logic to this one for
> > DPLL-driven timers which can lock its frequency to external sources.
> 
> Why would we have two APIs for doing the same thing? IIUC Richard does
> not want this in the PTP ioctls which is fair, but we need to cater to devices
> which do not have netdevs.

From technical point of view - it can be explained by the fact that the DPLL
driving the SyncE logic can be separate from the one driving PTP.  Also
SyncE is frequency-only oriented and doesn't care about phase and
Time of Day that PTP also needs. The GNSS lock on the PTP side will be
multi-layered, as the full lock would mean that our PTP clock is not only
syntonized, but also has its time and phase set correctly.

A PTP can reuse the "physical" part of this interface later on, but it also needs
to solve more SW-specific challenges, like reporting the PTP lock on a SW level.

I agree that having such API for PTP subsystem will be very useful,
but let's address SyncE in netdev first and build the PTP netlink on top of what
we learn here. We can always move the structures defined here to the layer
above without affecting any APIs.

> 
> > The reasoning behind putting it here is to enable returning the lock
> > to the GNSS receiver embedded on the NIC as a source for the SyncE
> > frequency. It helps distinguishing the embedded GNSS from the external
> > sources. As a result - the upper layer can report GNSS lock based only
> > on this message without the need to put the embedded  GNSS receiver in
> > the config file. On the other hand - if sync to External source is
> > reported such SW would need to read the source of external sync from
> > the config file.
> >
> > And the list is expandable - if we need to define more embedded sync
> > source types we can always add more to it.
Jakub Kicinski Aug. 31, 2021, 2:18 p.m. UTC | #9
On Tue, 31 Aug 2021 14:07:32 +0000 Machnikowski, Maciej wrote:
> > > I agree that this also is useful for Time card, yet it's also useful here.
> > > PTP subsystem should implement a similar logic to this one for
> > > DPLL-driven timers which can lock its frequency to external sources.  
> > 
> > Why would we have two APIs for doing the same thing? IIUC Richard does
> > not want this in the PTP ioctls which is fair, but we need to cater to devices
> > which do not have netdevs.  
> 
> From technical point of view - it can be explained by the fact that the DPLL
> driving the SyncE logic can be separate from the one driving PTP.  Also
> SyncE is frequency-only oriented and doesn't care about phase and
> Time of Day that PTP also needs. The GNSS lock on the PTP side will be
> multi-layered, as the full lock would mean that our PTP clock is not only
> syntonized, but also has its time and phase set correctly.

Just because GNSS lock addresses more parameters (potentially) doesn't
mean the syntonization part shouldn't be addressed by the same API.

> A PTP can reuse the "physical" part of this interface later on, but it also needs
> to solve more SW-specific challenges, like reporting the PTP lock on a SW level.
> 
> I agree that having such API for PTP subsystem will be very useful,
> but let's address SyncE in netdev first and build the PTP netlink on top of what
> we learn here. We can always move the structures defined here to the layer
> above without affecting any APIs.

It's a reasonable SW design strategy to start simple. Unfortunately, 
it doesn't apply to stable uAPI design. You're adding a RTNL op, which
will have to be supported for ever. If we add anything "later" it will
be a strict addition, and will have to be backward compatible. Which
I'm not sure how to do when the object we'd operate on would be
completely different (clock vs netdev).

As I said I can write the boilerplate code for you if you prefer, the
code implementing the command and the driver interface will be almost
identical.

Is there a reason why RTNL is better?
Machnikowski, Maciej Aug. 31, 2021, 3:19 p.m. UTC | #10
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, August 31, 2021 4:18 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Tue, 31 Aug 2021 14:07:32 +0000 Machnikowski, Maciej wrote:
> > > > I agree that this also is useful for Time card, yet it's also useful here.
> > > > PTP subsystem should implement a similar logic to this one for
> > > > DPLL-driven timers which can lock its frequency to external sources.
> > >
> > > Why would we have two APIs for doing the same thing? IIUC Richard
> does
> > > not want this in the PTP ioctls which is fair, but we need to cater to
> devices
> > > which do not have netdevs.
> >
> > From technical point of view - it can be explained by the fact that the DPLL
> > driving the SyncE logic can be separate from the one driving PTP.  Also
> > SyncE is frequency-only oriented and doesn't care about phase and
> > Time of Day that PTP also needs. The GNSS lock on the PTP side will be
> > multi-layered, as the full lock would mean that our PTP clock is not only
> > syntonized, but also has its time and phase set correctly.
> 
> Just because GNSS lock addresses more parameters (potentially) doesn't
> mean the syntonization part shouldn't be addressed by the same API.

Fair enough.

> 
> > A PTP can reuse the "physical" part of this interface later on, but it also
> needs
> > to solve more SW-specific challenges, like reporting the PTP lock on a SW
> level.
> >
> > I agree that having such API for PTP subsystem will be very useful,
> > but let's address SyncE in netdev first and build the PTP netlink on top of
> what
> > we learn here. We can always move the structures defined here to the
> layer
> > above without affecting any APIs.
> 
> It's a reasonable SW design strategy to start simple. Unfortunately,
> it doesn't apply to stable uAPI design. You're adding a RTNL op, which
> will have to be supported for ever. If we add anything "later" it will
> be a strict addition, and will have to be backward compatible. Which
> I'm not sure how to do when the object we'd operate on would be
> completely different (clock vs netdev).

I agree - the point I'm trying to make here is that the existence of
the PTP-specific interface will not invalidate the need of having 
SyncE-specific one as well. Even if we report lock-states for the clock
we will still need to report lock-states for devices that don't use PTP
clocks, but support SyncE. (that's also a reason why RTNL is still required).

The RTNL interface will also address devices that only need the 
frequency syntonization (especially in Radio Access Networks).

> 
> As I said I can write the boilerplate code for you if you prefer, the
> code implementing the command and the driver interface will be almost
> identical.

I think it's a great idea to start that in parallel to this patch. Then move
the common structures to the generic layer and use them in both
SyncE-specific RTNL implementation and PTP-specific part that will
be added. This won't affect SyncE specific APIs. The "worst" that can
happen is that the driver will put the same info for PTP part and
SyncE part if that's the design someone follows.

Regards
Maciek

> 
> Is there a reason why RTNL is better?
Jakub Kicinski Aug. 31, 2021, 3:32 p.m. UTC | #11
On Tue, 31 Aug 2021 15:19:36 +0000 Machnikowski, Maciej wrote:
> > It's a reasonable SW design strategy to start simple. Unfortunately,
> > it doesn't apply to stable uAPI design. You're adding a RTNL op, which
> > will have to be supported for ever. If we add anything "later" it will
> > be a strict addition, and will have to be backward compatible. Which
> > I'm not sure how to do when the object we'd operate on would be
> > completely different (clock vs netdev).  
> 
> I agree - the point I'm trying to make here is that the existence of
> the PTP-specific interface will not invalidate the need of having 
> SyncE-specific one as well. Even if we report lock-states for the clock
> we will still need to report lock-states for devices that don't use PTP
> clocks, but support SyncE. (that's also a reason why RTNL is still required).
> 
> The RTNL interface will also address devices that only need the 
> frequency syntonization (especially in Radio Access Networks).
> 
> > 
> > As I said I can write the boilerplate code for you if you prefer, the
> > code implementing the command and the driver interface will be almost
> > identical.  
> 
> I think it's a great idea to start that in parallel to this patch. Then move
> the common structures to the generic layer and use them in both
> SyncE-specific RTNL implementation and PTP-specific part that will
> be added. This won't affect SyncE specific APIs. The "worst" that can
> happen is that the driver will put the same info for PTP part and
> SyncE part if that's the design someone follows.

I don't understand why we need the SyncE RTNL if we have clock API for
controlling the freq source. Are you saying that there are
implementations out there which use SyncE to recover Rx clock and use
it for Tx but the PTP ticker is in a different clock domain?
Machnikowski, Maciej Aug. 31, 2021, 4 p.m. UTC | #12
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, August 31, 2021 5:32 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Tue, 31 Aug 2021 15:19:36 +0000 Machnikowski, Maciej wrote:
> > > It's a reasonable SW design strategy to start simple. Unfortunately,
> > > it doesn't apply to stable uAPI design. You're adding a RTNL op, which
> > > will have to be supported for ever. If we add anything "later" it will
> > > be a strict addition, and will have to be backward compatible. Which
> > > I'm not sure how to do when the object we'd operate on would be
> > > completely different (clock vs netdev).
> >
> > I agree - the point I'm trying to make here is that the existence of
> > the PTP-specific interface will not invalidate the need of having
> > SyncE-specific one as well. Even if we report lock-states for the clock
> > we will still need to report lock-states for devices that don't use PTP
> > clocks, but support SyncE. (that's also a reason why RTNL is still required).
> >
> > The RTNL interface will also address devices that only need the
> > frequency syntonization (especially in Radio Access Networks).
> >
> > >
> > > As I said I can write the boilerplate code for you if you prefer, the
> > > code implementing the command and the driver interface will be almost
> > > identical.
> >
> > I think it's a great idea to start that in parallel to this patch. Then move
> > the common structures to the generic layer and use them in both
> > SyncE-specific RTNL implementation and PTP-specific part that will
> > be added. This won't affect SyncE specific APIs. The "worst" that can
> > happen is that the driver will put the same info for PTP part and
> > SyncE part if that's the design someone follows.
> 
> I don't understand why we need the SyncE RTNL if we have clock API for
> controlling the freq source. Are you saying that there are
> implementations out there which use SyncE to recover Rx clock and use
> it for Tx but the PTP ticker is in a different clock domain?

Exactly!
You can also have multiple PTP timers which may use different clock ticks
to support multiple time domains.

The PTP ticker can also be completely absent if all you need is the frequency
reference. There are many radio devices that expect 10 MHz reference only.
Richard Cochran Aug. 31, 2021, 4:19 p.m. UTC | #13
On Mon, Aug 30, 2021 at 04:29:09PM -0700, Jakub Kicinski wrote:
> Hmm, IDK if this really belongs in RTNL. The OCP time card that
> Jonathan works on also wants to report signal lock, and it locks
> to GNSS. It doesn't have any networking functionality whatsoever.
> 
> Can we add a genetlink family for clock info/configuration? From 
> what I understood discussing this with Jonathan it sounded like most
> clocks today have a vendor-specific character device for configuration
> and reading status.
> 
> I'm happy to write the plumbing if this seems like an okay idea 
> but too much work for anyone to commit.

This sounds nice.

As you said later on in this thread, any API we merge now will have to
last.  That is why I'm being so picky here.  We want new APIs to cover
current HW _and_ be reasonable for the future.

I don't see a DPLL as either a PTP Hardware Clock or a Network
Device.  It is a PLL.

The kernel can and should have a way to represent the relationship
between these three different kind of IP block.  We already have a
way to get from PHC to netdev interface.

I understand that Maciej and team want to get support for their card
ASAP.  However, proper kernel/user API takes time.  For example, the
PHC stuff took one year and fourteen revisions.  But it was worth the
wait, because the API has help up pretty well all these years since
the v3.0 kernel.

There is no need to quickly merge some poorly designed interfaces.

Thanks,
Richard
Machnikowski, Maciej Aug. 31, 2021, 10:09 p.m. UTC | #14
> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: Tuesday, August 31, 2021 6:19 PM
> To: Jakub Kicinski <kuba@kernel.org>
> Subject: Re: [RFC v2 net-next 1/2] rtnetlink: Add new RTM_GETSYNCESTATE
> message to get SyncE status
> 
> On Mon, Aug 30, 2021 at 04:29:09PM -0700, Jakub Kicinski wrote:
> > Can we add a genetlink family for clock info/configuration? From
> > what I understood discussing this with Jonathan it sounded like most
> > clocks today have a vendor-specific character device for configuration
> > and reading status.
> >
> > I'm happy to write the plumbing if this seems like an okay idea
> > but too much work for anyone to commit.
> 
> This sounds nice.
> 
> As you said later on in this thread, any API we merge now will have to
> last.  That is why I'm being so picky here.  We want new APIs to cover
> current HW _and_ be reasonable for the future.
> 
> I don't see a DPLL as either a PTP Hardware Clock or a Network
> Device.  It is a PLL.
> 
> The kernel can and should have a way to represent the relationship
> between these three different kind of IP block.  We already have a
> way to get from PHC to netdev interface.

OK I can strip down the RTNL EEC state interface to only report 
the state without any extras, like pin. Next step would be to add 
the control over recovered clock also to the netdev subsystem.

The EEC state read is needed for recovered/source clock validation
and that's why I think it belongs to the RTNL part as it gates the QL
for each port.

Those two interfaces will allow a minimalistic ESMC support
(receive the packet, extract the SSM from it, check if my clock is
recovered and my clock is in locked state, if all are good - pass
the message along to other related ports)

In parallel let's work on a proper clock generator subsystem. 
For starter It should handle:

 - reference configuration
 - reference status
 - reference priorities
 - output settings

Optionally:
 - NCO mode (here we'll duplicate the functionality of PHC in some 
    deployments)

Once we have that in place we can simply 
- reroute the internals of the EEC state the clock generator subsystem 
  on more complex systems,
- keeping the simple state-read for those who use other simpler
  Implementations of EEC.
- be able to support any hybrid between 1 and 2

Once we get there we'll know what else should this RTNL return and
extend it if needed.

Regards
Maciek
Jakub Kicinski Sept. 1, 2021, 1:58 a.m. UTC | #15
On Tue, 31 Aug 2021 09:19:27 -0700 Richard Cochran wrote:
> As you said later on in this thread, any API we merge now will have to
> last.  That is why I'm being so picky here.  We want new APIs to cover
> current HW _and_ be reasonable for the future.
> 
> I don't see a DPLL as either a PTP Hardware Clock or a Network
> Device.  It is a PLL.
> 
> The kernel can and should have a way to represent the relationship
> between these three different kind of IP block.  We already have a
> way to get from PHC to netdev interface.

Makes sense to me. I was wondering how to split things at high level
into the areas you mentioned, but TBH the part I'm struggling with is
the delineation of what falls under PTP. PLL by itself seems like an
awfully small unit to create a subsystem for, and PTP already has aux
stuff like PIN control. Then there's the whole bunch of stuff that Jonathan
is adding via driver specific sysfs interfaces [1]. I was hoping the
"new API" would cover his need but PLL would be a tiny part of it.

IOW after looking at the code I'm not so sure how to reasonably divide
things.

[1]
https://lore.kernel.org/netdev/20210830235236.309993-1-jonathan.lemon@gmail.com/
Jakub Kicinski Sept. 1, 2021, 2:02 a.m. UTC | #16
On Tue, 31 Aug 2021 22:09:18 +0000 Machnikowski, Maciej wrote:
> OK I can strip down the RTNL EEC state interface to only report 
> the state without any extras, like pin. Next step would be to add 
> the control over recovered clock also to the netdev subsystem.
> 
> The EEC state read is needed for recovered/source clock validation
> and that's why I think it belongs to the RTNL part as it gates the QL
> for each port.

If you mean just reporting state and have a syncE on/off without any
option for other sources that's fine by me.
Richard Cochran Sept. 1, 2021, 2:55 a.m. UTC | #17
On Tue, Aug 31, 2021 at 06:58:24PM -0700, Jakub Kicinski wrote:
> On Tue, 31 Aug 2021 09:19:27 -0700 Richard Cochran wrote:
> > As you said later on in this thread, any API we merge now will have to
> > last.  That is why I'm being so picky here.  We want new APIs to cover
> > current HW _and_ be reasonable for the future.
> > 
> > I don't see a DPLL as either a PTP Hardware Clock or a Network
> > Device.  It is a PLL.
> > 
> > The kernel can and should have a way to represent the relationship
> > between these three different kind of IP block.  We already have a
> > way to get from PHC to netdev interface.
> 
> Makes sense to me. I was wondering how to split things at high level
> into the areas you mentioned, but TBH the part I'm struggling with is
> the delineation of what falls under PTP. PLL by itself seems like an
> awfully small unit to create a subsystem for, and PTP already has aux
> stuff like PIN control.

These pins are a direct HW interface to the posix dynamic clock that
also generates time stamps on the PTP frames.  They can either
generate time stamps on external signals, or produce output signals
from the very same clock.  So the pins are rather tightly coupled to
the PTP clock itself.

But the pins do NOT cover input clock sources into the IP cores.  This
kind of thing is already covered by the DTS for many SoCs (for a
static input clock choice, not changeable at run time)

> Then there's the whole bunch of stuff that Jonathan
> is adding via driver specific sysfs interfaces [1]. I was hoping the
> "new API" would cover his need but PLL would be a tiny part of it.
> 
> IOW after looking at the code I'm not so sure how to reasonably divide
> things.

Right, me neither.  It is a big topic, and we needn't over engineer it
now, but I still think this DPLL is not part of the PTP clock.  There
has to be a better place for it.

Thanks,
Richard
Richard Cochran Sept. 1, 2021, 2:56 a.m. UTC | #18
On Tue, Aug 31, 2021 at 07:02:35PM -0700, Jakub Kicinski wrote:
> On Tue, 31 Aug 2021 22:09:18 +0000 Machnikowski, Maciej wrote:
> > OK I can strip down the RTNL EEC state interface to only report 
> > the state without any extras, like pin. Next step would be to add 
> > the control over recovered clock also to the netdev subsystem.
> > 
> > The EEC state read is needed for recovered/source clock validation
> > and that's why I think it belongs to the RTNL part as it gates the QL
> > for each port.
> 
> If you mean just reporting state and have a syncE on/off without any
> option for other sources that's fine by me.

Yeah, that is what I also imagined for a generic SyncE interface.

Thanks,
Richard
diff mbox series

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6fd3a4d42668..a091a35706a7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1344,6 +1344,8 @@  struct netdev_net_notifier {
  *	The caller must be under RCU read context.
  * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path);
  *     Get the forwarding path to reach the real device from the HW destination address
+ * int (*ndo_get_synce_state)(struct net_device *dev, struct if_synce_state_msg *state)
+ *	Get state of physical layer frequency syntonization (SyncE)
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1563,6 +1565,10 @@  struct net_device_ops {
 	struct net_device *	(*ndo_get_peer_dev)(struct net_device *dev);
 	int                     (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx,
                                                          struct net_device_path *path);
+	int			(*ndo_get_synce_state)(struct net_device *dev,
+						       enum if_synce_state *state,
+						       enum if_synce_src *src,
+						       u8 *pin_idx);
 };
 
 /**
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index eebd3894fe89..1fcb9c71dc7c 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1273,4 +1273,47 @@  enum {
 
 #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
 
+/* SyncE section */
+
+enum if_synce_state {
+	IF_SYNCE_STATE_INVALID = 0,
+	IF_SYNCE_STATE_FREERUN,
+	IF_SYNCE_STATE_LOCKACQ,
+	IF_SYNCE_STATE_LOCKREC,
+	IF_SYNCE_STATE_LOCKED,
+	IF_SYNCE_STATE_HOLDOVER,
+	IF_SYNCE_STATE_OPEN_LOOP,
+	__IF_SYNCE_STATE_MAX,
+};
+
+#define IF_SYNCE_STATE_MAX (__IF_SYNCE_STATE_MAX - 1)
+
+enum if_synce_src {
+	IF_SYNCE_SRC_INVALID = 0,
+	IF_SYNCE_SRC_UNKNOWN,
+	IF_SYNCE_SRC_SYNCE,
+	IF_SYNCE_SRC_GNSS,
+	IF_SYNCE_SRC_PTP,
+	IF_SYNCE_SRC_EXT,
+	__IF_SYNCE_SRC_MAX,
+};
+
+#define IF_SYNCE_PIN_UNKNOWN	0xFF
+
+struct if_synce_state_msg {
+	__u32 ifindex;
+	__u8 state;
+	__u8 src;
+	__u8 pin;
+	__u8 pad;
+};
+
+enum {
+	IFLA_SYNCE_UNSPEC,
+	IFLA_SYNCE_STATE,
+	__IFLA_SYNCE_MAX,
+};
+
+#define IFLA_SYNCE_MAX (__IFLA_SYNCE_MAX - 1)
+
 #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 5888492a5257..cd68045c475b 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -185,6 +185,9 @@  enum {
 	RTM_GETNEXTHOPBUCKET,
 #define RTM_GETNEXTHOPBUCKET	RTM_GETNEXTHOPBUCKET
 
+	RTM_GETSYNCESTATE = 120,
+#define RTM_GETSYNCESTATE	RTM_GETSYNCESTATE
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
@@ -193,7 +196,7 @@  enum {
 #define RTM_NR_FAMILIES	(RTM_NR_MSGTYPES >> 2)
 #define RTM_FAM(cmd)	(((cmd) - RTM_BASE) >> 2)
 
-/* 
+/*
    Generic structure for encapsulation of optional route information.
    It is reminiscent of sockaddr, but with sa_family replaced
    with attribute type.
@@ -233,7 +236,7 @@  struct rtmsg {
 
 	unsigned char		rtm_table;	/* Routing table id */
 	unsigned char		rtm_protocol;	/* Routing protocol; see below	*/
-	unsigned char		rtm_scope;	/* See below */	
+	unsigned char		rtm_scope;	/* See below */
 	unsigned char		rtm_type;	/* See below	*/
 
 	unsigned		rtm_flags;
@@ -555,7 +558,7 @@  struct ifinfomsg {
 };
 
 /********************************************************************
- *		prefix information 
+ *		prefix information
  ****/
 
 struct prefixmsg {
@@ -569,7 +572,7 @@  struct prefixmsg {
 	unsigned char	prefix_pad3;
 };
 
-enum 
+enum
 {
 	PREFIX_UNSPEC,
 	PREFIX_ADDRESS,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 972c8cb303a5..8c9638421049 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -5468,6 +5468,81 @@  static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+static int rtnl_fill_synce_state(struct sk_buff *msg, struct net_device *dev,
+				 u32 portid, u32 seq,
+				 struct netlink_callback *cb,
+				 int flags)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct if_synce_state_msg *state;
+	struct nlmsghdr *nlh;
+
+	ASSERT_RTNL();
+
+	nlh = nlmsg_put(msg, portid, seq, RTM_GETSYNCESTATE,
+			sizeof(*state), flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state = nlmsg_data(nlh);
+
+	if (ops->ndo_get_synce_state) {
+		enum if_synce_state sync_state;
+		enum if_synce_src src;
+		int err;
+		u8 pin;
+
+		err = ops->ndo_get_synce_state(dev, &sync_state, &src, &pin);
+		if (err)
+			return err;
+
+		memset(state, 0, sizeof(*state));
+
+		state->ifindex = dev->ifindex;
+		state->state = (u8)sync_state;
+		state->pin = pin;
+		state->src = (u8)src;
+
+		return 0;
+	}
+
+	return -EOPNOTSUPP;
+}
+
+static int rtnl_synce_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
+				struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_synce_state_msg *state;
+	struct net_device *dev = NULL;
+	struct sk_buff *nskb;
+	u32 filter_mask;
+	int err;
+
+	state = nlmsg_data(nlh);
+	if (state->ifindex > 0)
+		dev = __dev_get_by_index(net, state->ifindex);
+	else
+		return -EINVAL;
+
+	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!nskb)
+		return -ENOBUFS;
+
+	if (!dev)
+		return -ENODEV;
+
+	err = rtnl_fill_synce_state(nskb, dev,
+				    NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+				    NULL, filter_mask);
+	if (err < 0)
+		kfree_skb(nskb);
+	else
+		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+
+	return err;
+}
+
 /* Process one rtnetlink message. */
 
 static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -5693,4 +5768,6 @@  void __init rtnetlink_init(void)
 
 	rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
 		      0);
+
+	rtnl_register(PF_UNSPEC, RTM_GETSYNCESTATE, rtnl_synce_state_get, NULL, 0);
 }
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index d59276f48d4f..b4bea120f2af 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -91,6 +91,7 @@  static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_GETSYNCESTATE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 };
 
 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -174,7 +175,7 @@  int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWNEXTHOPBUCKET + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_GETSYNCESTATE + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;