diff mbox series

[net] Revert "net/mlx5: Update cyclecounter shift value to improve ptp free running mode precision"

Message ID 20230815151507.3028503-1-vadfed@meta.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [net] Revert "net/mlx5: Update cyclecounter shift value to improve ptp free running mode precision" | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1328 this patch: 1328
netdev/cc_maintainers warning 8 maintainers not CCed: tariqt@nvidia.com ayal@nvidia.com leon@kernel.org davem@davemloft.net pabeni@redhat.com jacob.e.keller@intel.com edumazet@google.com linux-rdma@vger.kernel.org
netdev/build_clang success Errors and warnings before: 1351 this patch: 1351
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1351 this patch: 1351
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vadim Fedorenko Aug. 15, 2023, 3:15 p.m. UTC
From: Vadim Fedorenko <vadim.fedorenko@linux.dev>

This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.

There was an assumption in the original commit that all the devices
supported by mlx5 advertise 1GHz as an internal timer frequency.
Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
as an internal frequency and the original commit breaks PTP
synchronization on these cards.

Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
---
 drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Rahul Rameshbabu Aug. 15, 2023, 4:53 p.m. UTC | #1
On Tue, 15 Aug, 2023 08:15:07 -0700 Vadim Fedorenko <vadfed@meta.com> wrote:
> From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>
> This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.
>
> There was an assumption in the original commit that all the devices
> supported by mlx5 advertise 1GHz as an internal timer frequency.
> Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
> as an internal frequency and the original commit breaks PTP
> synchronization on these cards.
>
> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
> ---

I agree with this revert. This change was made with the assumption that
all mlx5 compatible devices were running a 1Ghz internal timer. Will
sync with folks internally about how we can support higher precision
free running mode while accounting for the different device timer clock
speeds.

Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Rahul Rameshbabu Aug. 16, 2023, 9:57 p.m. UTC | #2
On Tue, 15 Aug, 2023 09:53:27 -0700 Rahul Rameshbabu <rrameshbabu@nvidia.com> wrote:
> On Tue, 15 Aug, 2023 08:15:07 -0700 Vadim Fedorenko <vadfed@meta.com> wrote:
>> From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>>
>> This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.
>>
>> There was an assumption in the original commit that all the devices
>> supported by mlx5 advertise 1GHz as an internal timer frequency.
>> Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
>> as an internal frequency and the original commit breaks PTP
>> synchronization on these cards.
>>
>> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>> ---
>
> I agree with this revert. This change was made with the assumption that
> all mlx5 compatible devices were running a 1Ghz internal timer. Will
> sync with folks internally about how we can support higher precision
> free running mode while accounting for the different device timer clock
> speeds.

I have prepared a patch that resolves the issue of selecting a valid
shift constant based on the device frequency. The patch includes logic
for picking the optimal shift constant for PTP frequency adjustment
based on the device frequency. I have tested this against a variety of
mlx5 devices and believe this solution would be ideal. I expect this
patch to make its way to the mailing list by the end of the week.

--
Thanks,

Rahul Rameshbabu
Jakub Kicinski Aug. 17, 2023, 2:38 a.m. UTC | #3
On Tue, 15 Aug 2023 08:15:07 -0700 Vadim Fedorenko wrote:
> From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
> 
> This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.
> 
> There was an assumption in the original commit that all the devices
> supported by mlx5 advertise 1GHz as an internal timer frequency.
> Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
> as an internal frequency and the original commit breaks PTP
> synchronization on these cards.

Hi Saeed, any preference here? Given we're past -rc6 and the small
size of the revert it seems like a tempting solution?
Saeed Mahameed Aug. 17, 2023, 4:37 p.m. UTC | #4
On 16 Aug 19:38, Jakub Kicinski wrote:
>On Tue, 15 Aug 2023 08:15:07 -0700 Vadim Fedorenko wrote:
>> From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>>
>> This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.
>>
>> There was an assumption in the original commit that all the devices
>> supported by mlx5 advertise 1GHz as an internal timer frequency.
>> Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
>> as an internal frequency and the original commit breaks PTP
>> synchronization on these cards.
>
>Hi Saeed, any preference here? Given we're past -rc6 and the small
>size of the revert it seems like a tempting solution?

Rahul's solution is also very small and already passed review, we will be
conducting some tests, share the patch with Vadim for testing and I will be
submitting it early next week.

>
Vadim Fedorenko Aug. 21, 2023, 9:37 p.m. UTC | #5
Hi!

I have tested the diff, looks like it works on both CX4 and CX6 cards I have right now. Wasn't able to find CX5 to test it, but I think it's ok to publish it and wait for some other users of CX5 to test.

Best,
Vadim
Rahul Rameshbabu Aug. 21, 2023, 9:44 p.m. UTC | #6
On Mon, 21 Aug, 2023 21:37:51 +0000 Vadim Fedorenko <vadfed@meta.com> wrote:
> Hi!
>
> I have tested the diff, looks like it works on both CX4 and CX6 cards I have
> right now. Wasn't able to find CX5 to test it, but I think it's ok to publish it
> and wait for some other users of CX5 to test.

Thanks Vadim. I have tested on CX5 myself, but a CX5 user reached out to
me directly since he was also impacted by the original change. If we
can, getting his feedback would be good.

>
> Best,
> Vadim
>
> ________________________________________
> From: Saeed Mahameed <saeed@kernel.org>
> Sent: 17 August 2023 17:37
> To: Jakub Kicinski
> Cc: Saeed Mahameed; Vadim Fedorenko; Rahul Rameshbabu; Gal Pressman; Bar
> Shapira; Vadim Fedorenko; Richard Cochran; netdev@vger.kernel.org
> Subject: Re: [PATCH net] Revert "net/mlx5: Update cyclecounter shift value to
> improve ptp free running mode precision"
>
> On 16 Aug 19:38, Jakub Kicinski wrote:
>>On Tue, 15 Aug 2023 08:15:07 -0700 Vadim Fedorenko wrote:
>>> From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
>>>
>>> This reverts commit 6a40109275626267ebf413ceda81c64719b5c431.
>>>
>>> There was an assumption in the original commit that all the devices
>>> supported by mlx5 advertise 1GHz as an internal timer frequency.
>>> Apparently at least ConnectX-4 Lx (MCX4431N-GCAN) provides 156.250Mhz
>>> as an internal frequency and the original commit breaks PTP
>>> synchronization on these cards.
>>
>>Hi Saeed, any preference here? Given we're past -rc6 and the small
>>size of the revert it seems like a tempting solution?
>
> Rahul's solution is also very small and already passed review, we will be
> conducting some tests, share the patch with Vadim for testing and I will be
> submitting it early next week.
>

Hi Saeed, I think we can send the patch out whenever you want to. I will
share the patch directly with the individual CX5 user who reached out to
me as well.

--
Thanks,

Rahul Rameshbabu
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index 377372f0578a..3e504e7d24ce 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -39,7 +39,7 @@ 
 #include "clock.h"
 
 enum {
-	MLX5_CYCLES_SHIFT	= 31
+	MLX5_CYCLES_SHIFT	= 23
 };
 
 enum {