Message ID | 20211110114448.2792314-7-maciej.machnikowski@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Add RTNL interface for SyncE | expand |
Maciej Machnikowski <maciej.machnikowski@intel.com> writes: > Add Documentation/networking/synce.rst describing new RTNL messages > and respective NDO ops supporting SyncE (Synchronous Ethernet). > > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> > --- > Documentation/networking/synce.rst | 124 +++++++++++++++++++++++++++++ > 1 file changed, 124 insertions(+) > create mode 100644 Documentation/networking/synce.rst > > diff --git a/Documentation/networking/synce.rst b/Documentation/networking/synce.rst > new file mode 100644 > index 000000000000..a7bb75685c07 > --- /dev/null > +++ b/Documentation/networking/synce.rst > @@ -0,0 +1,124 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +============================= > +Synchronous Equipment Clocks > +============================= > + > +Synchronous Equipment Clocks use a physical layer clock to syntonize > +the frequency across different network elements. > + > +Basic Synchronous network node consist of a Synchronous Equipment > +Clock (SEC) and and a PHY that has dedicated outputs of clocks recovered > +from the Receive side and a dedicated TX clock input that is used as > +a reference for the physical frequency of the transmit data to other nodes. > + > +The PHY is able to recover the physical signal frequency of the RX data > +stream on RX ports and redirect it (sometimes dividing it) to recovered > +clock outputs. Number of recovered clock output pins is usually lower than > +the number of RX portx. As a result the RX port to Recovered Clock output > +mapping needs to be configured. the TX frequency is directly depends on the > +input frequency - either on the PHY CLK input, or on a dedicated > +TX clock input. > + > + ┌──────────┬──────────┐ > + │ RX │ TX │ > + 1 │ ports │ ports │ 1 > + ───►├─────┐ │ ├─────► > + 2 │ │ │ │ 2 > + ───►├───┐ │ │ ├─────► > + 3 │ │ │ │ │ 3 > + ───►├─┐ │ │ │ ├─────► > + │ ▼ ▼ ▼ │ │ > + │ ────── │ │ > + │ \____/ │ │ > + └──┼──┼────┴──────────┘ > + 1│ 2│ ▲ > + RCLK out│ │ │ TX CLK in > + ▼ ▼ │ > + ┌─────────────┴───┐ > + │ │ > + │ SEC │ > + │ │ > + └─────────────────┘ > + > +The SEC can synchronize its frequency to one of the synchronization inputs > +either clocks recovered on traffic interfaces or (in advanced deployments) > +external frequency sources. > + > +Some SEC implementations can automatically select synchronization source > +through priority tables and synchronization status messaging and provide > +necessary filtering and holdover capabilities. > + > +The following interface can be applicable to diffferent packet network types > +following ITU-T G.8261/G.8262 recommendations. > + > +Interface > +========= > + > +The following RTNL messages are used to read/configure SyncE recovered > +clocks. > + > +RTM_GETRCLKSTATE > +----------------- > +Read the state of recovered pins that output recovered clock from > +a given port. The message will contain the number of assigned clocks > +(IFLA_RCLK_STATE_COUNT) and an N pin indexes in IFLA_RCLK_STATE_OUT_STATE > +To support multiple recovered clock outputs from the same port, this message > +will return the IFLA_RCLK_STATE_COUNT attribute containing the number of > +recovered clock outputs (N) and N IFLA_RCLK_STATE_OUT_STATE attributes > +listing the output indexes with the respective GET_RCLK_FLAGS_ENA flag. > +This message will call the ndo_get_rclk_range to determine the allowed > +recovered clock indexes and then will loop through them, calling > +the ndo_get_rclk_state for each of them. > + > + > +Attributes: > +IFLA_RCLK_STATE_COUNT - Returns the number of recovered clock outputs > +IFLA_RCLK_STATE_OUT_STATE - Returns the current state of a single recovered > + clock output in the struct if_get_rclk_msg. > +struct if_get_rclk_msg { > + __u32 out_idx; /* output index (from a valid range) */ > + __u32 flags; /* configuration flags */ > +}; > + > +Currently supported flags: > +#define GET_RCLK_FLAGS_ENA (1U << 0) > + > + > +RTM_SETRCLKSTATE > +----------------- > +Sets the redirection of the recovered clock for a given pin. This message > +expects one attribute: > +struct if_set_rclk_msg { > + __u32 ifindex; /* interface index */ > + __u32 out_idx; /* output index (from a valid range) */ > + __u32 flags; /* configuration flags */ > +}; > + > +Supported flags are: > +SET_RCLK_FLAGS_ENA - if set in flags - the given output will be enabled, > + if clear - the output will be disabled. > + > +RTM_GETEECSTATE > +---------------- > +Reads the state of the EEC or equivalent physical clock synchronizer. > +This message returns the following attributes: > +IFLA_EEC_STATE - current state of the EEC or equivalent clock generator. > + The states returned in this attribute are aligned to the > + ITU-T G.781 and are: > + IF_EEC_STATE_INVALID - state is not valid > + IF_EEC_STATE_FREERUN - clock is free-running > + IF_EEC_STATE_LOCKED - clock is locked to the reference, > + but the holdover memory is not valid > + IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the reference > + and holdover memory is valid > + IF_EEC_STATE_HOLDOVER - clock is in holdover mode > +State is read from the netdev calling the: > +int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state, > + u32 *src_idx, struct netlink_ext_ack *extack); > + > +IFLA_EEC_SRC_IDX - optional attribute returning the index of the reference > + that is used for the current IFLA_EEC_STATE, i.e., > + the index of the pin that the EEC is locked to. > + > +Will be returned only if the ndo_get_eec_src is implemented. > \ No newline at end of file Just to be clear, I have much the same objections to this UAPI as I had to v2: - RTM_GETEECSTATE will become obsolete as soon as DPLL object is added. - Reporting pins through the netdevices that use them allows for configurations that are likely invalid, like disjoint "frequency bridges". - It's not clear what enabling several pins means, and it's not clear whether this genericity is not going to be an issue in the future when we know what enabling more pins means. - No way as a user to tell whether two interfaces that report the same pins are actually connected to the same EEC. How many EEC's are there, in the system, anyway? In particular, I think that the proposed UAPIs should belong to a DPLL object. That object must know about the pins, so have it enumerate them. That object needs to know about which pin/s to track, so configure it there. That object has the state, so have it report it. Really, it looks basically 1:1 vs. the proposed API, except the object over which the UAPIs should be defined is a DPLL, not a netdev.
> -----Original Message----- > From: Petr Machata <petrm@nvidia.com> > Sent: Thursday, November 11, 2021 1:43 PM > To: Machnikowski, Maciej <maciej.machnikowski@intel.com> > Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE > interfaces > > > Maciej Machnikowski <maciej.machnikowski@intel.com> writes: > > > Add Documentation/networking/synce.rst describing new RTNL messages > > and respective NDO ops supporting SyncE (Synchronous Ethernet). > > > > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> > > --- ... > > +RTM_GETEECSTATE > > +---------------- > > +Reads the state of the EEC or equivalent physical clock synchronizer. > > +This message returns the following attributes: > > +IFLA_EEC_STATE - current state of the EEC or equivalent clock generator. > > + The states returned in this attribute are aligned to the > > + ITU-T G.781 and are: > > + IF_EEC_STATE_INVALID - state is not valid > > + IF_EEC_STATE_FREERUN - clock is free-running > > + IF_EEC_STATE_LOCKED - clock is locked to the reference, > > + but the holdover memory is not valid > > + IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the > reference > > + and holdover memory is valid > > + IF_EEC_STATE_HOLDOVER - clock is in holdover mode > > +State is read from the netdev calling the: > > +int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state > *state, > > + u32 *src_idx, struct netlink_ext_ack *extack); > > + > > +IFLA_EEC_SRC_IDX - optional attribute returning the index of the > reference > > + that is used for the current IFLA_EEC_STATE, i.e., > > + the index of the pin that the EEC is locked to. > > + > > +Will be returned only if the ndo_get_eec_src is implemented. > > \ No newline at end of file > > Just to be clear, I have much the same objections to this UAPI as I had > to v2: > > - RTM_GETEECSTATE will become obsolete as soon as DPLL object is added. Yes for more complex devices and no for simple ones > - Reporting pins through the netdevices that use them allows for > configurations that are likely invalid, like disjoint "frequency > bridges". Not sure if I understand that comment. In target application a given netdev will receive an ESMC message containing the quality of the clock that it has on the receive side. The upper layer software will see QL_PRC on one port and QL_EEC on other and will need to enable clock output from the port that received QL_PRC, as it's the higher clock class. Once the EEC reports Locked state all other ports that are traceable to a given EEC (either using the DPLL subsystem, or the config file) will start reporting QL_PRC to downstream devices. > - It's not clear what enabling several pins means, and it's not clear > whether this genericity is not going to be an issue in the future when > we know what enabling more pins means. It means that the recovered frequency will appear on 2 or more physical pins of the package. > - No way as a user to tell whether two interfaces that report the same > pins are actually connected to the same EEC. How many EEC's are there, > in the system, anyway? For now we can fix that with the config file, for future we will be able to trace them to the same EEC. It's like BC in PTP - you can rely on automated mode, but can also override it with the config file. > In particular, I think that the proposed UAPIs should belong to a DPLL > object. That object must know about the pins, so have it enumerate them. > That object needs to know about which pin/s to track, so configure it > there. That object has the state, so have it report it. Really, it looks > basically 1:1 vs. the proposed API, except the object over which the > UAPIs should be defined is a DPLL, not a netdev. RCLK pin API does not belong to the DPLL and never will. That part will always belong to the netdev.
Machnikowski, Maciej <maciej.machnikowski@intel.com> writes: >> -----Original Message----- >> From: Petr Machata <petrm@nvidia.com> >> Sent: Thursday, November 11, 2021 1:43 PM >> To: Machnikowski, Maciej <maciej.machnikowski@intel.com> >> Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE >> interfaces >> >> >> Maciej Machnikowski <maciej.machnikowski@intel.com> writes: >> >> > Add Documentation/networking/synce.rst describing new RTNL messages >> > and respective NDO ops supporting SyncE (Synchronous Ethernet). >> > >> > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> >> > --- > ... >> > +RTM_GETEECSTATE >> > +---------------- >> > +Reads the state of the EEC or equivalent physical clock synchronizer. >> > +This message returns the following attributes: >> > +IFLA_EEC_STATE - current state of the EEC or equivalent clock generator. >> > + The states returned in this attribute are aligned to the >> > + ITU-T G.781 and are: >> > + IF_EEC_STATE_INVALID - state is not valid >> > + IF_EEC_STATE_FREERUN - clock is free-running >> > + IF_EEC_STATE_LOCKED - clock is locked to the reference, >> > + but the holdover memory is not valid >> > + IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the >> reference >> > + and holdover memory is valid >> > + IF_EEC_STATE_HOLDOVER - clock is in holdover mode >> > +State is read from the netdev calling the: >> > +int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state >> *state, >> > + u32 *src_idx, struct netlink_ext_ack *extack); >> > + >> > +IFLA_EEC_SRC_IDX - optional attribute returning the index of the >> reference >> > + that is used for the current IFLA_EEC_STATE, i.e., >> > + the index of the pin that the EEC is locked to. >> > + >> > +Will be returned only if the ndo_get_eec_src is implemented. >> > \ No newline at end of file >> >> Just to be clear, I have much the same objections to this UAPI as I had >> to v2: >> >> - RTM_GETEECSTATE will become obsolete as soon as DPLL object is added. > > Yes for more complex devices and no for simple ones If we have an interface suitable for more complex netdevices, the simpler ones can use it as well. Should in fact. There should not be two interfaces for the same thing. For reasons of maintenance, documentation, tool support, user experience.
Machnikowski, Maciej <maciej.machnikowski@intel.com> writes: >> - Reporting pins through the netdevices that use them allows for >> configurations that are likely invalid, like disjoint "frequency >> bridges". > > Not sure if I understand that comment. In target application a given > netdev will receive an ESMC message containing the quality of the > clock that it has on the receive side. The upper layer software will > see QL_PRC on one port and QL_EEC on other and will need to enable > clock output from the port that received QL_PRC, as it's the higher clock > class. Once the EEC reports Locked state all other ports that are traceable > to a given EEC (either using the DPLL subsystem, or the config file) > will start reporting QL_PRC to downstream devices. I think I had the reading of the UAPI wrong. So RTM_SETRCLKSTATE means, take the clock recovered from ifindex, and send it to pins that I have marked with the ENA flag. But that still does not work well for multi-port devices. I can set it up to forward frequency from swp1 to swp2 and swp3, from swp4 to swp5 and swp6, etc. But in reality I only have one underlying DPLL and can't support this. So yeah, obviously, I bounce it in the driver. It also means that when I want to switch tracking from swp1 to swp2, I first need to unset all the swp1 pins (64 messages or whaveter) and then set it up at swp2 (64 more messages). As a user I still don't know which of my ports share DPLL. It's just not a great interface for multi-port devices. Having this stuff at a dedicated DPLL object would make the issue go away completely. A driver then instantiates one DPLL, sets it up with RCLK pins and TX pins. The DPLL can be configured with which pin to take the frequency from, and which subset of pins to forward it to. There are as many DPLL objects as there are DPLL circuits in the system. This works for simple port devices as well as switches, as well as non-networked devices. The in-driver LOC overhead is a couple of _init / _fini calls and an ops structure that the DPLL subsystem uses to talk to the driver. Everything else remains the same. >> - It's not clear what enabling several pins means, and it's not clear >> whether this genericity is not going to be an issue in the future when >> we know what enabling more pins means. > > It means that the recovered frequency will appear on 2 or more physical > pins of the package. Yes, agreed now.
> -----Original Message----- > From: Petr Machata <petrm@nvidia.com> > Sent: Tuesday, November 16, 2021 12:53 PM > To: Machnikowski, Maciej <maciej.machnikowski@intel.com> > Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE > interfaces > > > Machnikowski, Maciej <maciej.machnikowski@intel.com> writes: > > >> - Reporting pins through the netdevices that use them allows for > >> configurations that are likely invalid, like disjoint "frequency > >> bridges". > > > > Not sure if I understand that comment. In target application a given > > netdev will receive an ESMC message containing the quality of the > > clock that it has on the receive side. The upper layer software will > > see QL_PRC on one port and QL_EEC on other and will need to enable > > clock output from the port that received QL_PRC, as it's the higher clock > > class. Once the EEC reports Locked state all other ports that are traceable > > to a given EEC (either using the DPLL subsystem, or the config file) > > will start reporting QL_PRC to downstream devices. > > I think I had the reading of the UAPI wrong. So RTM_SETRCLKSTATE means, > take the clock recovered from ifindex, and send it to pins that I have > marked with the ENA flag. > > But that still does not work well for multi-port devices. I can set it > up to forward frequency from swp1 to swp2 and swp3, from swp4 to swp5 > and swp6, etc. But in reality I only have one underlying DPLL and can't > support this. So yeah, obviously, I bounce it in the driver. It also > means that when I want to switch tracking from swp1 to swp2, I first > need to unset all the swp1 pins (64 messages or whaveter) and then set > it up at swp2 (64 more messages). As a user I still don't know which of > my ports share DPLL. It's just not a great interface for multi-port > devices. This will only be done on init - after everything is configured - you will not really need to check anything there. > Having this stuff at a dedicated DPLL object would make the issue go > away completely. A driver then instantiates one DPLL, sets it up with > RCLK pins and TX pins. The DPLL can be configured with which pin to take > the frequency from, and which subset of pins to forward it to. There are > as many DPLL objects as there are DPLL circuits in the system. > > This works for simple port devices as well as switches, as well as > non-networked devices. > > The in-driver LOC overhead is a couple of _init / _fini calls and an ops > structure that the DPLL subsystem uses to talk to the driver. Everything > else remains the same. That won't work - a single recovered clock may be physically connected to more than one DPLL device and a single DPLL device may be used for more than one MAC chip at the same time - we shouldn't mix subsystems as recovered clocks belong to PHY/MAC layer. Also in that case the DPLL would need to track the relation between all netdev ports upstream - which will be nightmare to keep track of when ports reset/get removed or added. Also the netdev is the one that will receive the packet containing quality so the userspace app will know which netdev received it and not which DPLL pin it should configure. I think this approach will make everything more complex (unless I'm missing something). > >> - It's not clear what enabling several pins means, and it's not clear > >> whether this genericity is not going to be an issue in the future when > >> we know what enabling more pins means. > > > > It means that the recovered frequency will appear on 2 or more physical > > pins of the package. > > Yes, agreed now.
diff --git a/Documentation/networking/synce.rst b/Documentation/networking/synce.rst new file mode 100644 index 000000000000..a7bb75685c07 --- /dev/null +++ b/Documentation/networking/synce.rst @@ -0,0 +1,124 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================= +Synchronous Equipment Clocks +============================= + +Synchronous Equipment Clocks use a physical layer clock to syntonize +the frequency across different network elements. + +Basic Synchronous network node consist of a Synchronous Equipment +Clock (SEC) and and a PHY that has dedicated outputs of clocks recovered +from the Receive side and a dedicated TX clock input that is used as +a reference for the physical frequency of the transmit data to other nodes. + +The PHY is able to recover the physical signal frequency of the RX data +stream on RX ports and redirect it (sometimes dividing it) to recovered +clock outputs. Number of recovered clock output pins is usually lower than +the number of RX portx. As a result the RX port to Recovered Clock output +mapping needs to be configured. the TX frequency is directly depends on the +input frequency - either on the PHY CLK input, or on a dedicated +TX clock input. + + ┌──────────┬──────────┐ + │ RX │ TX │ + 1 │ ports │ ports │ 1 + ───►├─────┐ │ ├─────► + 2 │ │ │ │ 2 + ───►├───┐ │ │ ├─────► + 3 │ │ │ │ │ 3 + ───►├─┐ │ │ │ ├─────► + │ ▼ ▼ ▼ │ │ + │ ────── │ │ + │ \____/ │ │ + └──┼──┼────┴──────────┘ + 1│ 2│ ▲ + RCLK out│ │ │ TX CLK in + ▼ ▼ │ + ┌─────────────┴───┐ + │ │ + │ SEC │ + │ │ + └─────────────────┘ + +The SEC can synchronize its frequency to one of the synchronization inputs +either clocks recovered on traffic interfaces or (in advanced deployments) +external frequency sources. + +Some SEC implementations can automatically select synchronization source +through priority tables and synchronization status messaging and provide +necessary filtering and holdover capabilities. + +The following interface can be applicable to diffferent packet network types +following ITU-T G.8261/G.8262 recommendations. + +Interface +========= + +The following RTNL messages are used to read/configure SyncE recovered +clocks. + +RTM_GETRCLKSTATE +----------------- +Read the state of recovered pins that output recovered clock from +a given port. The message will contain the number of assigned clocks +(IFLA_RCLK_STATE_COUNT) and an N pin indexes in IFLA_RCLK_STATE_OUT_STATE +To support multiple recovered clock outputs from the same port, this message +will return the IFLA_RCLK_STATE_COUNT attribute containing the number of +recovered clock outputs (N) and N IFLA_RCLK_STATE_OUT_STATE attributes +listing the output indexes with the respective GET_RCLK_FLAGS_ENA flag. +This message will call the ndo_get_rclk_range to determine the allowed +recovered clock indexes and then will loop through them, calling +the ndo_get_rclk_state for each of them. + + +Attributes: +IFLA_RCLK_STATE_COUNT - Returns the number of recovered clock outputs +IFLA_RCLK_STATE_OUT_STATE - Returns the current state of a single recovered + clock output in the struct if_get_rclk_msg. +struct if_get_rclk_msg { + __u32 out_idx; /* output index (from a valid range) */ + __u32 flags; /* configuration flags */ +}; + +Currently supported flags: +#define GET_RCLK_FLAGS_ENA (1U << 0) + + +RTM_SETRCLKSTATE +----------------- +Sets the redirection of the recovered clock for a given pin. This message +expects one attribute: +struct if_set_rclk_msg { + __u32 ifindex; /* interface index */ + __u32 out_idx; /* output index (from a valid range) */ + __u32 flags; /* configuration flags */ +}; + +Supported flags are: +SET_RCLK_FLAGS_ENA - if set in flags - the given output will be enabled, + if clear - the output will be disabled. + +RTM_GETEECSTATE +---------------- +Reads the state of the EEC or equivalent physical clock synchronizer. +This message returns the following attributes: +IFLA_EEC_STATE - current state of the EEC or equivalent clock generator. + The states returned in this attribute are aligned to the + ITU-T G.781 and are: + IF_EEC_STATE_INVALID - state is not valid + IF_EEC_STATE_FREERUN - clock is free-running + IF_EEC_STATE_LOCKED - clock is locked to the reference, + but the holdover memory is not valid + IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the reference + and holdover memory is valid + IF_EEC_STATE_HOLDOVER - clock is in holdover mode +State is read from the netdev calling the: +int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state, + u32 *src_idx, struct netlink_ext_ack *extack); + +IFLA_EEC_SRC_IDX - optional attribute returning the index of the reference + that is used for the current IFLA_EEC_STATE, i.e., + the index of the pin that the EEC is locked to. + +Will be returned only if the ndo_get_eec_src is implemented. \ No newline at end of file
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet). Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> --- Documentation/networking/synce.rst | 124 +++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 Documentation/networking/synce.rst