diff mbox series

[iwl-net] ice: ensure periodic output start time is in the future

Message ID 20250212-jk-gnrd-ptp-pin-patches-v1-1-7cbae692ac97@intel.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [iwl-net] ice: ensure periodic output start time is in the future | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 2 blamed authors not CCed: arkadiusz.kubalewski@intel.com horms@kernel.org; 9 maintainers not CCed: andrew+netdev@lunn.ch edumazet@google.com horms@kernel.org kuba@kernel.org intel-wired-lan@lists.osuosl.org richardcochran@gmail.com przemyslaw.kitszel@intel.com arkadiusz.kubalewski@intel.com pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 31 this patch: 31
netdev/source_inline success Was 0 now: 0

Commit Message

Jacob Keller Feb. 12, 2025, 11:54 p.m. UTC
From: Karol Kolacinski <karol.kolacinski@intel.com>

On E800 series hardware, if the start time for a periodic output signal is
programmed into GLTSYN_TGT_H and GLTSYN_TGT_L registers, the hardware logic
locks up and the periodic output signal never starts. Any future attempt to
reprogram the clock function is futile as the hardware will not reset until
a power on.

The ice_ptp_cfg_perout function has logic to prevent this, as it checks if
the requested start time is in the past. If so, a new start time is
calculated by rounding up.

Since commit d755a7e129a5 ("ice: Cache perout/extts requests and check
flags"), the rounding is done to the nearest multiple of the clock period,
rather than to a full second. This is more accurate, since it ensures the
signal matches the user request precisely.

Unfortunately, there is a race condition with this rounding logic. If the
current time is close to the multiple of the period, we could calculate a
target time that is extremely soon. It takes time for the software to
program the registers, during which time this requested start time could
become a start time in the past. If that happens, the periodic output
signal will lock up.

For large enough periods, or for the logic prior to the mentioned commit,
this is unlikely. However, with the new logic rounding to the period and
with a small enough period, this becomes inevitable.

For example, attempting to enable a 10MHz signal requires a period of 100
nanoseconds. This means in the *best* case, we have 99 nanoseconds to
program the clock output. This is essentially impossible, and thus such a
small period practically guarantees that the clock output function will
lock up.

To fix this, add some slop to the clock time used to check if the start
time is in the past. Because it is not critical that output signals start
immediately, but it *is* critical that we do not brick the function, 0.5
seconds is selected. This does mean that any requested output will be
delayed by at least 0.5 seconds.

This slop is applied before rounding, so that we always round up to the
nearest multiple of the period that is at least 0.5 seconds in the future,
ensuring a minimum of 0.5 seconds to program the clock output registers.

Finally, to ensure that the hardware registers programming the clock output
complete in a timely manner, add a write flush to the end of
ice_ptp_write_perout. This ensures we don't risk any issue with PCIe
transaction batching.

Strictly speaking, this fixes a race condition all the way back at the
initial implementation of periodic output programming, as it is
theoretically possible to trigger this bug even on the old logic when
always rounding to a full second. However, the window is narrow, and the
code has been refactored heavily since then, making a direct backport not
apply cleanly.

Fixes: d755a7e129a5 ("ice: Cache perout/extts requests and check flags")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)


---
base-commit: e589adf5b70c07b1ab974d077046fdbf583b2f36
change-id: 20250212-jk-gnrd-ptp-pin-patches-42561da45ec1

Best regards,

Comments

Simon Horman Feb. 15, 2025, 2:59 p.m. UTC | #1
On Wed, Feb 12, 2025 at 03:54:39PM -0800, Jacob Keller wrote:
> From: Karol Kolacinski <karol.kolacinski@intel.com>
> 
> On E800 series hardware, if the start time for a periodic output signal is
> programmed into GLTSYN_TGT_H and GLTSYN_TGT_L registers, the hardware logic
> locks up and the periodic output signal never starts. Any future attempt to
> reprogram the clock function is futile as the hardware will not reset until
> a power on.
> 
> The ice_ptp_cfg_perout function has logic to prevent this, as it checks if
> the requested start time is in the past. If so, a new start time is
> calculated by rounding up.
> 
> Since commit d755a7e129a5 ("ice: Cache perout/extts requests and check
> flags"), the rounding is done to the nearest multiple of the clock period,
> rather than to a full second. This is more accurate, since it ensures the
> signal matches the user request precisely.
> 
> Unfortunately, there is a race condition with this rounding logic. If the
> current time is close to the multiple of the period, we could calculate a
> target time that is extremely soon. It takes time for the software to
> program the registers, during which time this requested start time could
> become a start time in the past. If that happens, the periodic output
> signal will lock up.
> 
> For large enough periods, or for the logic prior to the mentioned commit,
> this is unlikely. However, with the new logic rounding to the period and
> with a small enough period, this becomes inevitable.
> 
> For example, attempting to enable a 10MHz signal requires a period of 100
> nanoseconds. This means in the *best* case, we have 99 nanoseconds to
> program the clock output. This is essentially impossible, and thus such a
> small period practically guarantees that the clock output function will
> lock up.
> 
> To fix this, add some slop to the clock time used to check if the start
> time is in the past. Because it is not critical that output signals start
> immediately, but it *is* critical that we do not brick the function, 0.5
> seconds is selected. This does mean that any requested output will be
> delayed by at least 0.5 seconds.
> 
> This slop is applied before rounding, so that we always round up to the
> nearest multiple of the period that is at least 0.5 seconds in the future,
> ensuring a minimum of 0.5 seconds to program the clock output registers.
> 
> Finally, to ensure that the hardware registers programming the clock output
> complete in a timely manner, add a write flush to the end of
> ice_ptp_write_perout. This ensures we don't risk any issue with PCIe
> transaction batching.
> 
> Strictly speaking, this fixes a race condition all the way back at the
> initial implementation of periodic output programming, as it is
> theoretically possible to trigger this bug even on the old logic when
> always rounding to a full second. However, the window is narrow, and the
> code has been refactored heavily since then, making a direct backport not
> apply cleanly.
> 
> Fixes: d755a7e129a5 ("ice: Cache perout/extts requests and check flags")
> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Thanks for the excellent patch description.

Reviewed-by: Simon Horman <horms@kernel.org>

...
Jacob Keller Feb. 19, 2025, 10:30 p.m. UTC | #2
On 2/15/2025 6:59 AM, Simon Horman wrote:
> On Wed, Feb 12, 2025 at 03:54:39PM -0800, Jacob Keller wrote:
>> From: Karol Kolacinski <karol.kolacinski@intel.com>
>>
>> On E800 series hardware, if the start time for a periodic output signal is
>> programmed into GLTSYN_TGT_H and GLTSYN_TGT_L registers, the hardware logic
>> locks up and the periodic output signal never starts. Any future attempt to
>> reprogram the clock function is futile as the hardware will not reset until
>> a power on.
>>
>> The ice_ptp_cfg_perout function has logic to prevent this, as it checks if
>> the requested start time is in the past. If so, a new start time is
>> calculated by rounding up.
>>
>> Since commit d755a7e129a5 ("ice: Cache perout/extts requests and check
>> flags"), the rounding is done to the nearest multiple of the clock period,
>> rather than to a full second. This is more accurate, since it ensures the
>> signal matches the user request precisely.
>>
>> Unfortunately, there is a race condition with this rounding logic. If the
>> current time is close to the multiple of the period, we could calculate a
>> target time that is extremely soon. It takes time for the software to
>> program the registers, during which time this requested start time could
>> become a start time in the past. If that happens, the periodic output
>> signal will lock up.
>>
>> For large enough periods, or for the logic prior to the mentioned commit,
>> this is unlikely. However, with the new logic rounding to the period and
>> with a small enough period, this becomes inevitable.
>>
>> For example, attempting to enable a 10MHz signal requires a period of 100
>> nanoseconds. This means in the *best* case, we have 99 nanoseconds to
>> program the clock output. This is essentially impossible, and thus such a
>> small period practically guarantees that the clock output function will
>> lock up.
>>
>> To fix this, add some slop to the clock time used to check if the start
>> time is in the past. Because it is not critical that output signals start
>> immediately, but it *is* critical that we do not brick the function, 0.5
>> seconds is selected. This does mean that any requested output will be
>> delayed by at least 0.5 seconds.
>>
>> This slop is applied before rounding, so that we always round up to the
>> nearest multiple of the period that is at least 0.5 seconds in the future,
>> ensuring a minimum of 0.5 seconds to program the clock output registers.
>>
>> Finally, to ensure that the hardware registers programming the clock output
>> complete in a timely manner, add a write flush to the end of
>> ice_ptp_write_perout. This ensures we don't risk any issue with PCIe
>> transaction batching.
>>
>> Strictly speaking, this fixes a race condition all the way back at the
>> initial implementation of periodic output programming, as it is
>> theoretically possible to trigger this bug even on the old logic when
>> always rounding to a full second. However, the window is narrow, and the
>> code has been refactored heavily since then, making a direct backport not
>> apply cleanly.
>>
>> Fixes: d755a7e129a5 ("ice: Cache perout/extts requests and check flags")
>> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
>> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> 
> Thanks for the excellent patch description.
> 
> Reviewed-by: Simon Horman <horms@kernel.org>
> 
> ...

Lol.. can't get anything right this week.. Just noticed I had already
sent this but forgot CC to Intel Wired LAN, so it wasn't showing up in
our patchwork.
Simon Horman Feb. 20, 2025, 10:36 a.m. UTC | #3
On Wed, Feb 19, 2025 at 02:30:25PM -0800, Jacob Keller wrote:
> 
> 
> On 2/15/2025 6:59 AM, Simon Horman wrote:
> > On Wed, Feb 12, 2025 at 03:54:39PM -0800, Jacob Keller wrote:

...

> > Thanks for the excellent patch description.
> > 
> > Reviewed-by: Simon Horman <horms@kernel.org>
> > 
> > ...
> 
> Lol.. can't get anything right this week.. Just noticed I had already
> sent this but forgot CC to Intel Wired LAN, so it wasn't showing up in
> our patchwork.

Lol, but the patch description was good :)
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index e26320ce52ca17b30a9538b11b754c7cf57c9af9..a99e0fbd0b8b55ad72a2bc7d13851562a6f2f5bd 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1783,6 +1783,7 @@  static int ice_ptp_write_perout(struct ice_hw *hw, unsigned int chan,
 				  8 + chan + (tmr_idx * 4));
 
 	wr32(hw, GLGEN_GPIO_CTL(gpio_pin), val);
+	ice_flush(hw);
 
 	return 0;
 }
@@ -1843,9 +1844,10 @@  static int ice_ptp_cfg_perout(struct ice_pf *pf, struct ptp_perout_request *rq,
 		div64_u64_rem(start, period, &phase);
 
 	/* If we have only phase or start time is in the past, start the timer
-	 * at the next multiple of period, maintaining phase.
+	 * at the next multiple of period, maintaining phase at least 0.5 second
+	 * from now, so we have time to write it to HW.
 	 */
-	clk = ice_ptp_read_src_clk_reg(pf, NULL);
+	clk = ice_ptp_read_src_clk_reg(pf, NULL) + NSEC_PER_MSEC * 500;
 	if (rq->flags & PTP_PEROUT_PHASE || start <= clk - prop_delay_ns)
 		start = div64_u64(clk + period - 1, period) * period + phase;