| Message ID | 20240822115939.1387015-2-o.rempel@pengutronix.de (mailing list archive) |
|---|---|
| State | Changes Requested |
| Delegated to | Netdev Maintainers |
| Series | Add Statistics Support for DP83TG720 PHY |
On Thu, Aug 22, 2024 at 01:59:37PM +0200, Oleksij Rempel wrote:
> Introduce a set of defines for link quality (LQ) related metrics in the
> Open Alliance helpers. These metrics include:
>
> - `oa_lq_lfl_esd_event_count`: Number of ESD events detected by the Link
>   Failures and Losses (LFL).
> - `oa_lq_link_training_time`: Time required to establish a link.
> - `oa_lq_remote_receiver_time`: Time required until the remote receiver
>   signals that it is locked.
> - `oa_lq_local_receiver_time`: Time required until the local receiver is
>   locked.
> - `oa_lq_lfl_link_loss_count`: Number of link losses.
> - `oa_lq_lfl_link_failure_count`: Number of link failures that do not
>   cause a link loss.
>
> These standardized defines will be used by PHY drivers to report these
> statistics.
>
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

	Andrew
On Thu, 22 Aug 2024 13:59:37 +0200 Oleksij Rempel wrote:
> Introduce a set of defines for link quality (LQ) related metrics in the
> Open Alliance helpers. These metrics include:
>
> - `oa_lq_lfl_esd_event_count`: Number of ESD events detected by the Link
>   Failures and Losses (LFL).
> - `oa_lq_link_training_time`: Time required to establish a link.
> - `oa_lq_remote_receiver_time`: Time required until the remote receiver
>   signals that it is locked.
> - `oa_lq_local_receiver_time`: Time required until the local receiver is
>   locked.
> - `oa_lq_lfl_link_loss_count`: Number of link losses.
> - `oa_lq_lfl_link_failure_count`: Number of link failures that do not
>   cause a link loss.
>
> These standardized defines will be used by PHY drivers to report these
> statistics.

If these are defined by a standard why not report them as structured
data? Like we report ethtool_eth_mac_stats, ethtool_eth_ctrl_stats,
ethtool_rmon_stats etc.?
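A structured group in the style Jakub suggests could look roughly like the sketch below. This is purely illustrative: `struct phy_lq_stats` and `dp83tg720_fill_lq_stats` are hypothetical names, not existing kernel API, and the hard-coded values stand in for hardware register reads.

```c
#include <stdint.h>

/* Hypothetical sketch of a structured PHY link-quality stats group,
 * modeled on ethtool_eth_mac_stats. Neither this struct nor the fill
 * function exist in the kernel. */
struct phy_lq_stats {
	uint64_t lfl_esd_event_count;
	uint64_t link_training_time;
	uint64_t remote_receiver_time;
	uint64_t local_receiver_time;
	uint64_t lfl_link_loss_count;
	uint64_t lfl_link_failure_count;
};

/* A driver callback would read hardware counters into the struct;
 * the values below are fakes standing in for register reads. */
static void dp83tg720_fill_lq_stats(struct phy_lq_stats *s)
{
	s->lfl_esd_event_count    = 0;
	s->link_training_time     = 37; /* fake: time to establish link */
	s->remote_receiver_time   = 21; /* fake: remote receiver locked */
	s->local_receiver_time    = 18; /* fake: local receiver locked */
	s->lfl_link_loss_count    = 2;
	s->lfl_link_failure_count = 5;
}
```

The point of the structured form is that the core would emit each field under a fixed, well-known attribute, so every vendor reports the same metric under the same name instead of a free-form string.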
On Mon, Aug 26, 2024 at 09:32:17AM -0700, Jakub Kicinski wrote:
> On Thu, 22 Aug 2024 13:59:37 +0200 Oleksij Rempel wrote:
> > Introduce a set of defines for link quality (LQ) related metrics in the
> > Open Alliance helpers. These metrics include:
> >
> > - `oa_lq_lfl_esd_event_count`: Number of ESD events detected by the Link
> >   Failures and Losses (LFL).
> > - `oa_lq_link_training_time`: Time required to establish a link.
> > - `oa_lq_remote_receiver_time`: Time required until the remote receiver
> >   signals that it is locked.
> > - `oa_lq_local_receiver_time`: Time required until the local receiver is
> >   locked.
> > - `oa_lq_lfl_link_loss_count`: Number of link losses.
> > - `oa_lq_lfl_link_failure_count`: Number of link failures that do not
> >   cause a link loss.
> >
> > These standardized defines will be used by PHY drivers to report these
> > statistics.
>
> If these are defined by a standard why not report them as structured
> data? Like we report ethtool_eth_mac_stats, ethtool_eth_ctrl_stats,
> ethtool_rmon_stats etc.?

We could do, but we have no infrastructure for this at the moment.
These are PHY statistics, not MAC statistics. We don't have all the
ethtool_ops infrastructure, etc.

We also need to think about which PHY we want the statistics from, the
Bootlin code for multiple PHYs, etc.

I will leave it up to Oleksij, but it would neatly avoid different
vendors returning the same stats with different names.

	Andrew
On Mon, 26 Aug 2024 19:12:52 +0200 Andrew Lunn wrote:
> > If these are defined by a standard why not report them as structured
> > data? Like we report ethtool_eth_mac_stats, ethtool_eth_ctrl_stats,
> > ethtool_rmon_stats etc.?
>
> We could do, but we have no infrastructure for this at the
> moment. These are PHY statistics, not MAC statistics.
> We don't have all the ethtool_ops infrastructure, etc.

This appears to not be a concern when calling phy_ops->get_sset_count().
You know this code better than me, but I can't think of any big 'infra'
that we'd need. ethtool code can just call phy_ops, the rest is likely
a repeat of the "MAC"/ethtool_ops stats.

> We also need to think about which PHY we want the statistics from,
> the Bootlin code for multiple PHYs etc.

True, that said I'd rather we added a new group for the well-defined
PHY stats without supporting multi-PHY, than let the additional
considerations prevent us from making progress. ioctl stats are
strictly worse.

I'm sorry to pick on this particular series, but the structured ethtool
stats have been around for 3 years. Feels like it's time to fill the
gaps on the PHY side.
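For context, the string-based path being discussed here, built on the `get_sset_count`/`get_strings`/`get_stats` callbacks of `struct phy_driver`, can be modeled in user space roughly as follows. The `demo_*` functions and data are stand-ins for illustration; the real kernel callbacks take a `struct phy_device` argument and are wired through the ethtool core.

```c
#include <stdint.h>
#include <string.h>

#define ETH_GSTRING_LEN 32	/* ethtool's fixed-width string slot */

/* Names and fake counter values for two demo statistics. */
static const char * const demo_stat_names[] = {
	"oa_lq_lfl_esd_event_count",
	"oa_lq_link_training_time",
};
static const uint64_t demo_stat_values[] = { 0, 123 };

/* Mirrors phy_driver::get_sset_count: how many stats are exposed. */
static int demo_get_sset_count(void)
{
	return sizeof(demo_stat_names) / sizeof(demo_stat_names[0]);
}

/* Mirrors phy_driver::get_strings: copy each name into its fixed slot. */
static void demo_get_strings(uint8_t *data)
{
	for (int i = 0; i < demo_get_sset_count(); i++)
		strncpy((char *)data + i * ETH_GSTRING_LEN,
			demo_stat_names[i], ETH_GSTRING_LEN);
}

/* Mirrors phy_driver::get_stats: one u64 per exposed name, in order. */
static void demo_get_stats(uint64_t *data)
{
	for (int i = 0; i < demo_get_sset_count(); i++)
		data[i] = demo_stat_values[i]; /* would be register reads */
}
```

The weakness Jakub points at is visible in the sketch: the only contract between kernel and userspace is the position of each name in the string array, so nothing stops two vendors exporting the same metric under different strings.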
Hi Jakub,

On Mon, Aug 26, 2024 at 12:57:19PM -0700, Jakub Kicinski wrote:
> On Mon, 26 Aug 2024 19:12:52 +0200 Andrew Lunn wrote:
> > > If these are defined by a standard why not report them as structured
> > > data? Like we report ethtool_eth_mac_stats, ethtool_eth_ctrl_stats,
> > > ethtool_rmon_stats etc.?
> >
> > We could do, but we have no infrastructure for this at the
> > moment. These are PHY statistics, not MAC statistics.
> > We don't have all the ethtool_ops infrastructure, etc.
>
> This appears to not be a concern when calling phy_ops->get_sset_count().
> You know this code better than me, but I can't think of any big 'infra'
> that we'd need. ethtool code can just call phy_ops, the rest is likely
> a repeat of the "MAC"/ethtool_ops stats.
>
> > We also need to think about which PHY we want the statistics from,
> > the Bootlin code for multiple PHYs etc.
>
> True, that said I'd rather we added a new group for the well-defined
> PHY stats without supporting multi-PHY, than let the additional
> considerations prevent us from making progress. ioctl stats are
> strictly worse.
>
> I'm sorry to pick on this particular series, but the structured ethtool
> stats have been around for 3 years. Feels like it's time to fill the
> gaps on the PHY side.

I completely agree with you, but I currently don't have additional
budget for this project. What might help is a diagnostic concept that I
can present to my customers to seek sponsorship for implementing various
interfaces based on their relevance and priority for different projects.
Since I haven't seen an existing concept from the end product or
component vendors, I suggest starting this within the Linux kernel
netdev community. I'll send my current thoughts in a separate email.

Regards,
Oleksij
On Tue, 27 Aug 2024 06:51:27 +0200 Oleksij Rempel wrote:
> I completely agree with you, but I currently don't have additional
> budget for this project.

Is this a legit reason not to do something relatively simple?
Especially since we're talking about uAPI, once we go down the string
path I presume they will stick around forever. IDK. Additional opinions
welcome...
On Tue, Aug 27, 2024 at 11:33:00AM -0700, Jakub Kicinski wrote:
> On Tue, 27 Aug 2024 06:51:27 +0200 Oleksij Rempel wrote:
> > I completely agree with you, but I currently don't have additional
> > budget for this project.
>
> Is this a legit reason not to do something relatively simple?

Due to the nature of my work in a consulting company, my time is
scheduled across multiple customers. For the 10BaseT1 PHY, I had 2 days
of budget left, which allowed me to implement some extra diagnostics.
This was simple, predictable, and within the scope of the original task.

However, now that the budget for this task and customer has been used
up, any additional work would require a full process:
- I would need to sell the idea to the customer.
- The new task would need to be prioritized.
- It would then be scheduled, which could happen this year, next year,
  or possibly never.

A similar situation occurred with the EEE implementation. I started
with a simple fix for the Atheros PHY's SmartEEE, but it led to
reworking the entire EEE infrastructure in the kernel. Once the budget
was exhausted, I couldn't continue with SmartEEE for Atheros PHYs.
These are the risks inherent to consulting work. I still see it as time
not wasted, because we have a better EEE infrastructure now.

Considering that you've requested a change to the uAPI, the work has
now become more predictable. I can plan for it within the task and
update the required time budget accordingly. However, it's worth noting
that while this work is manageable, the time spent on this particular
task could be seen as somewhat wasted from a budget perspective, as it
wasn't part of the original scope.

> Especially since we're talking about uAPI, once we go down
> the string path I presume they will stick around forever.

Yes, I agree with it. I just needed this feedback as early as possible.
On Wed, 28 Aug 2024 06:50:46 +0200 Oleksij Rempel wrote:
> Considering that you've requested a change to the uAPI, the work has now become
> more predictable. I can plan for it within the task and update the required
> time budget accordingly. However, it's worth noting that while this work is
> manageable, the time spent on this particular task could be seen as somewhat
> wasted from a budget perspective, as it wasn't part of the original scope.

I can probably take a stab at the kernel side; since I know the code
already it shouldn't take me more than an hour. Would that help?
You'd still need to retest, fix bugs, and go through review... so all
the not-so-fun parts.

> > Especially since we're talking about uAPI, once we go down
> > the string path I presume they will stick around forever.
>
> Yes, I agree with it. I just needed this feedback as early as possible.

Andrew? Do you want to decide? :)
On Wed, Aug 28, 2024 at 01:34:28PM -0700, Jakub Kicinski wrote:
> On Wed, 28 Aug 2024 06:50:46 +0200 Oleksij Rempel wrote:
> > Considering that you've requested a change to the uAPI, the work has now become
> > more predictable. I can plan for it within the task and update the required
> > time budget accordingly. However, it's worth noting that while this work is
> > manageable, the time spent on this particular task could be seen as somewhat
> > wasted from a budget perspective, as it wasn't part of the original scope.
>
> I can probably take a stab at the kernel side; since I know the code
> already it shouldn't take me more than an hour. Would that help?
> You'd still need to retest, fix bugs, and go through review... so all
> the not-so-fun parts.
>
> > > Especially since we're talking about uAPI, once we go down
> > > the string path I presume they will stick around forever.
> >
> > Yes, I agree with it. I just needed this feedback as early as possible.
>
> Andrew? Do you want to decide? :)

I agree about avoiding free text strings. Something more structured
would be good.

I can definitely help out with review, but I don't have any time at the
moment for writing code.

	Andrew
On Wed, Aug 28, 2024 at 01:34:28PM -0700, Jakub Kicinski wrote:
> On Wed, 28 Aug 2024 06:50:46 +0200 Oleksij Rempel wrote:
> > Considering that you've requested a change to the uAPI, the work has now become
> > more predictable. I can plan for it within the task and update the required
> > time budget accordingly. However, it's worth noting that while this work is
> > manageable, the time spent on this particular task could be seen as somewhat
> > wasted from a budget perspective, as it wasn't part of the original scope.
>
> I can probably take a stab at the kernel side; since I know the code
> already it shouldn't take me more than an hour. Would that help?

Ack. This will be a great help.

> You'd still need to retest, fix bugs, and go through review... so all
> the not-so-fun parts.

Sure :)
```diff
diff --git a/drivers/net/phy/open_alliance_helpers.h b/drivers/net/phy/open_alliance_helpers.h
index 8b7d97bc6f186..f8b392671e20d 100644
--- a/drivers/net/phy/open_alliance_helpers.h
+++ b/drivers/net/phy/open_alliance_helpers.h
@@ -3,6 +3,20 @@
 #ifndef OPEN_ALLIANCE_HELPERS_H
 #define OPEN_ALLIANCE_HELPERS_H
 
+/* Link quality (LQ) related metrics */
+/* The number of ESD events detected by the Link Failures and Losses (LFL) */
+#define OA_LQ_LFL_ESD_EVENT_COUNT "oa_lq_lfl_esd_event_count"
+/* Time required to establish a link */
+#define OA_LQ_LINK_TRAINING_TIME "oa_lq_link_training_time"
+/* Time required until the remote receiver is signaling that it is locked */
+#define OA_LQ_REMOTE_RECEIVER_TIME "oa_lq_remote_receiver_time"
+/* Time required until the local receiver is locked */
+#define OA_LQ_LOCAL_RECEIVER_TIME "oa_lq_local_receiver_time"
+/* Number of link losses */
+#define OA_LQ_LFL_LINK_LOSS_COUNT "oa_lq_lfl_link_loss_count"
+/* Number of link failures causing NOT a link loss */
+#define OA_LQ_LFL_LINK_FAILURE_COUNT "oa_lq_lfl_link_failure_count"
+
 /*
  * These defines reflect the TDR (Time Delay Reflection) diagnostic feature
  * for 1000BASE-T1 automotive Ethernet PHYs as specified by the OPEN Alliance.
```
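One mechanical constraint on string statistics like these is that ethtool exposes each name in a fixed `ETH_GSTRING_LEN` (32-byte) slot, NUL terminator included, so every name must fit. A small self-contained check of the six names from the patch (the helper function itself is for illustration and not part of the patch):

```c
#include <string.h>

#define ETH_GSTRING_LEN 32	/* ethtool's fixed string-slot width */

/* The six statistic names added by the patch. */
static const char * const oa_lq_names[] = {
	"oa_lq_lfl_esd_event_count",
	"oa_lq_link_training_time",
	"oa_lq_remote_receiver_time",
	"oa_lq_local_receiver_time",
	"oa_lq_lfl_link_loss_count",
	"oa_lq_lfl_link_failure_count",
};

/* Length of the longest name, NUL terminator excluded. */
static size_t oa_lq_longest_name(void)
{
	size_t max = 0;

	for (size_t i = 0; i < sizeof(oa_lq_names) / sizeof(oa_lq_names[0]); i++)
		if (strlen(oa_lq_names[i]) > max)
			max = strlen(oa_lq_names[i]);
	return max;
}
```

The longest name, `oa_lq_lfl_link_failure_count`, is 28 characters, so all six fit in a 32-byte slot with room for the NUL terminator.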
Introduce a set of defines for link quality (LQ) related metrics in the
Open Alliance helpers. These metrics include:

- `oa_lq_lfl_esd_event_count`: Number of ESD events detected by the Link
  Failures and Losses (LFL).
- `oa_lq_link_training_time`: Time required to establish a link.
- `oa_lq_remote_receiver_time`: Time required until the remote receiver
  signals that it is locked.
- `oa_lq_local_receiver_time`: Time required until the local receiver is
  locked.
- `oa_lq_lfl_link_loss_count`: Number of link losses.
- `oa_lq_lfl_link_failure_count`: Number of link failures that do not
  cause a link loss.

These standardized defines will be used by PHY drivers to report these
statistics.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 drivers/net/phy/open_alliance_helpers.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)