Message ID | 20220306085658.1943-1-gerhard@engleder-embedded.com (mailing list archive) |
---|---|
Headers | show |
Series | ptp: Support hardware clocks with additional free running time | expand |
On Sun, Mar 06, 2022 at 09:56:52AM +0100, Gerhard Engleder wrote: > If hardware would support a free running time additionally to the > physical clock, then the physical clock does not need to be forced to > free running. Thus, the physical clocks can still be synchronized while > vclocks are in use. So... the HW must provide frame time stamps using the one clock and ancillary operations using the other, right? Thanks, Richard
On Sun, Mar 06, 2022 at 08:49:41AM -0800, Richard Cochran wrote: > On Sun, Mar 06, 2022 at 09:56:52AM +0100, Gerhard Engleder wrote: > > > If hardware would support a free running time additionally to the > > physical clock, then the physical clock does not need to be forced to > > free running. Thus, the physical clocks can still be synchronized while > > vclocks are in use. > > So... the HW must provide frame time stamps using the one clock and > ancillary operations using the other, right? Looking at your tsnep driver, the requirement is that the HW provide each frame with TWO time stamps, one from each clock. Thanks, Richard
On Sun, Mar 06, 2022 at 09:56:52AM +0100, Gerhard Engleder wrote: > ptp vclocks require a clock with free running time for the timecounter. > Currently only a physical clock forced to free running is supported. > If vclocks are used, then the physical clock cannot be synchronized > anymore. The synchronized time is not available in hardware in this > case. As a result, timed transmission with ETF/TAPRIO hardware support > is not possible anymore. NAK. I don't see why you don't simply provide two PHC instances from this one device. AFAICT, this series is not needed at all, as the existing API covers this use case already. Thanks, Richard
On Sun, Mar 6, 2022 at 6:05 PM Richard Cochran <richardcochran@gmail.com> wrote: > > ptp vclocks require a clock with free running time for the timecounter. > > Currently only a physical clock forced to free running is supported. > > If vclocks are used, then the physical clock cannot be synchronized > > anymore. The synchronized time is not available in hardware in this > > case. As a result, timed transmission with ETF/TAPRIO hardware support > > is not possible anymore. > > NAK. > > I don't see why you don't simply provide two PHC instances from this > one device. Because with vclocks the user space interface is already available. Also In my opinion it is a good fit. The second PHC would be based on the free running hardware counter. So it would not provide any additional functionality compared to the vclocks based on it. Are two PHC instances supported? At least for ethtool there is only a single phc_index field. > AFAICT, this series is not needed at all, as the existing API covers > this use case already. I assume that you mean for ETF the transmission time can be converted, similar like for time stamps. So for ETF you are right. It was too quick to mention ETF along with TAPRIO. My use case is TAPRIO with hardware support. For TAPRIO the hardware has to act based on the synchronized time within the TSN network. No transmission times, which could be converted, are used. The hardware is in charge to transmit frames from a certain queue only during defined intervals. These intervals are based on the synchronized time. So the hardware must be synchronized somehow. This is my solution to keep the hardware synchronized while vclocks are in use. How can I cover my use case with the existing API? I had no idea so far. Thanks! Gerhard
On Sun, Mar 06, 2022 at 07:38:55PM +0100, Gerhard Engleder wrote:
> How can I cover my use case with the existing API? I had no idea so far.
Okay, so 2 PHCs doesn't help, but still all you need is:
1. a different method to convert time stamps to vclock time base
2. a different method for vclocks' gettime
So let me suggest a much smaller change to the phc/vclock api... stay tuned
Thanks,
Richard
Hi, > ptp vclocks require a clock with free running time for the timecounter. > Currently only a physical clock forced to free running is supported. > If vclocks are used, then the physical clock cannot be synchronized > anymore. The synchronized time is not available in hardware in this > case. As a result, timed transmission with ETF/TAPRIO hardware support > is not possible anymore. > > If hardware would support a free running time additionally to the > physical clock, then the physical clock does not need to be forced to > free running. Thus, the physical clocks can still be synchronized while > vclocks are in use. > > The physical clock could be used to synchronize the time domain of the > TSN network and trigger ETF/TAPRIO. In parallel vclocks can be used to > synchronize other time domains. > > One year ago I thought for two time domains within a TSN network also > two physical clocks are required. This would lead to new kernel > interfaces for asking for the second clock, ... . But actually for a > time triggered system like TSN there can be only one time domain that > controls the system itself. All other time domains belong to other > layers, but not to the time triggered system itself. So other time > domains can be based on a free running counter if similar mechanisms > to 2 step synchronisation are used. > > Synchronisation was tested with two time domains between two directly > connected hosts. Each host run two ptp4l instances, the first used the > physical clock and the second used the virtual clock. I used my FPGA > based network controller as network device. ptp4l was used in > combination with the virtual clock support patch set from Miroslav > Lichvar. > > Please give me some feedback. For me it seems like a straight forward > extension for ptp vclocks, but I'm new to this topic. From what I understand, you have a second PHC which is just a second PHC you cannot control; i.e. it is equivalent to a PHC in free running mode. This PHC will take the timestamps for the PTP frames. You can create multiple vclocks and you can use ptp4l to synchronize these. The first (controlable) PHC is used to do the Qbv scheduling, thus needs a synchronized time. How do you synchronize the vclock with this PHC? And how precise is it? I know that some cards can do cross timestamping in hardware to aid the synchronization (but I think that is not supported right now in linux). Richard Cochran wrote: > You are adding eight bytes per frame for what is arguably an extreme > niche case. I don't think it is a niche case, btw. I was always wondering why NXP introduced the vclock thingy. And apparently it is for 802.1AS-rev, but one use case for that is 802.1Qbv and there you'll need a (synchronized) hardware clock to control the gates. So while we can have multiple time domain support with the vclock, we cannot use Qbv with them. That was something I have always wondered about. Or.. maybe I'm missing something here. -michael
On Mon, Mar 07, 2022 at 01:07:51PM +0100, Michael Walle wrote: > Richard Cochran wrote: > > You are adding eight bytes per frame for what is arguably an extreme > > niche case. > > I don't think it is a niche case, btw. I was always wondering why > NXP introduced the vclock thingy. And apparently it is for > 802.1AS-rev, but one use case for that is 802.1Qbv and there you'll > need a (synchronized) hardware clock to control the gates. So while > we can have multiple time domain support with the vclock, we cannot > use Qbv with them. That was something I have always wondered about. > Or.. maybe I'm missing something here. Niche is relative. Believe it or not, the overwhelmingly great majority of people using Linux have no interest in TSN. Thanks, Richard
On Mon, Mar 07, 2022 at 06:05:31AM -0800, Richard Cochran wrote: > On Mon, Mar 07, 2022 at 01:07:51PM +0100, Michael Walle wrote: > > Richard Cochran wrote: > > > You are adding eight bytes per frame for what is arguably an extreme > > > niche case. > > > > I don't think it is a niche case, btw. I was always wondering why > > NXP introduced the vclock thingy. And apparently it is for ... > > Niche is relative. > > Believe it or not, the overwhelmingly great majority of people using > Linux have no interest in TSN. Is this not the same issue as what was recently discussed about some devices being able to provide two (e.g. PHY+MAC) or even more timestamps at the same time? There is a need to have multiple PHCs per device and for that to work the drivers need to be able to save multiple timestamps per packet. In this series it is an additional freerunning clock. That seems too specific. I think we need a more general approach that will support two and more physical PHCs per device. Virtual clocks are not involved here.
On Sun, Mar 06, 2022 at 01:50:32PM -0800, Richard Cochran wrote: > On Sun, Mar 06, 2022 at 07:38:55PM +0100, Gerhard Engleder wrote: > > How can I cover my use case with the existing API? I had no idea so far. > > Okay, so 2 PHCs doesn't help, but still all you need is: > > 1. a different method to convert time stamps to vclock time base > > 2. a different method for vclocks' gettime > > So let me suggest a much smaller change to the phc/vclock api... stay tuned For #1: On the receive path, the stack calls ptp_convert_timestamp() if the socket has the SOF_TIMESTAMPING_RAW_HARDWARE option. In that method, you need only get the raw cycle count if supported by the pclock. So instead of: vclock = info_to_vclock(ptp->info); ns = ktime_to_ns(hwtstamps->hwtstamp); spin_lock_irqsave(&vclock->lock, flags); ns = timecounter_cyc2time(&vclock->tc, ns); spin_unlock_irqrestore(&vclock->lock, flags); something like this: vclock = info_to_vclock(ptp->info); cycles = pclock->ktime_to_cycles(hwtstamps->hwtstamp); spin_lock_irqsave(&vclock->lock, flags); ns = timecounter_cyc2time(&vclock->tc, cycles); spin_unlock_irqrestore(&vclock->lock, flags); This new class method, ktime_to_cycles, can simply do ktime_to_ns() by default for all of the existing drivers, but your special driver can look up the hwtstamp in a cache of {hwtstamp, cycles} pairs. (No need to bloat skbuff by another eight bytes!) For #2: Similarly, add a new class method, say, pclock.get_cycles that does if (ptp->info->gettimex64) ptp->info->gettimex64(ptp->info, &ts, NULL); else ptp->info->gettime64(ptp->info, &ts); by default, but in your driver will read the special counter. Thanks, Richard
On Mon, Mar 07, 2022 at 03:23:58PM +0100, Miroslav Lichvar wrote: > There is a need to have multiple PHCs per device and for that to work > the drivers need to be able to save multiple timestamps per packet. That is a big job that entails a new user API. For the present, this device can be supported with only minimal changes in the PHC class layer. See my other reply. Thanks, Richard
> From what I understand, you have a second PHC which is just a second > PHC you cannot control; i.e. it is equivalent to a PHC in free running > mode. This PHC will take the timestamps for the PTP frames. You can > create multiple vclocks and you can use ptp4l to synchronize these. > > The first (controlable) PHC is used to do the Qbv scheduling, thus > needs a synchronized time. > > How do you synchronize the vclock with this PHC? And how precise > is it? I know that some cards can do cross timestamping in hardware > to aid the synchronization (but I think that is not supported right > now in linux). There is no need to synchronize the first (controlable) PHC with the vclock. A ptp4l instance is running and synchronizing the time for Qbv/TAPRIO. vclocks are used for other time domains, which do not affect Qbv/TAPRIO. > > You are adding eight bytes per frame for what is arguably an extreme > > niche case. > > I don't think it is a niche case, btw. I was always wondering why > NXP introduced the vclock thingy. And apparently it is for > 802.1AS-rev, but one use case for that is 802.1Qbv and there you'll > need a (synchronized) hardware clock to control the gates. So while > we can have multiple time domain support with the vclock, we cannot > use Qbv with them. That was something I have always wondered about. I agree that most people using Linux have no interest in TSN. For the few people who are interested in TSN, I assume using two time domains in combination with Qbv/TAPRIO is a common goal. Is there anyone else who wants to use two time domains in combination with Qbv/TAPRIO? Gerhard
> > > How can I cover my use case with the existing API? I had no idea so far. > > > > Okay, so 2 PHCs doesn't help, but still all you need is: > > > > 1. a different method to convert time stamps to vclock time base > > > > 2. a different method for vclocks' gettime > > > > So let me suggest a much smaller change to the phc/vclock api... stay tuned > > For #1: > > On the receive path, the stack calls ptp_convert_timestamp() if the > socket has the SOF_TIMESTAMPING_RAW_HARDWARE option. In that method, > you need only get the raw cycle count if supported by the pclock. > > So instead of: > > vclock = info_to_vclock(ptp->info); > > ns = ktime_to_ns(hwtstamps->hwtstamp); > > spin_lock_irqsave(&vclock->lock, flags); > ns = timecounter_cyc2time(&vclock->tc, ns); > spin_unlock_irqrestore(&vclock->lock, flags); > > something like this: > > vclock = info_to_vclock(ptp->info); > > cycles = pclock->ktime_to_cycles(hwtstamps->hwtstamp); > > spin_lock_irqsave(&vclock->lock, flags); > ns = timecounter_cyc2time(&vclock->tc, cycles); > spin_unlock_irqrestore(&vclock->lock, flags); > > This new class method, ktime_to_cycles, can simply do ktime_to_ns() by > default for all of the existing drivers, but your special driver can > look up the hwtstamp in a cache of {hwtstamp, cycles} pairs. ktime_to_cycles uses hwtstamp as key for the cache lookup. As long as the PHC is monotonic, the key is unique. If the time of the PHC is set, then the cache would be invalidated. I'm afraid that setting the PHC could lead to wrong or missing timestamps. Setting the PHC in hardware, timestamp generation in hardware, and cache invalidation in software would need to be synchronized somehow. Practically the PHC should be set only once. It would be also ok to use vclocks for PTP after PHC has been set. So it should work. But it would not be the generic PHC/vclock user space interface anymore. > (No need to bloat skbuff by another eight bytes!) I understand that bloating skbuff shall be prevented. Actually I need to find a way to generate the correct timestamp within ptp_convert_timestamp and without bloating skbuff. The cache is one possibility. What do you think about the following idea? For TX it is known which timestamp is required. So I would have to find a way to detect which timestamp shall be filled into hwtstamp. For RX both timestamps are already available within skbuff, because they are stored in front of the Ethernet header by the hardware. So I have to find a way to detect the RX case and copy the right timestamp to hwtstamp. This would prevent the cache and does not bloat skbuff. > For #2: > > Similarly, add a new class method, say, pclock.get_cycles that does > > if (ptp->info->gettimex64) > ptp->info->gettimex64(ptp->info, &ts, NULL); > else > ptp->info->gettime64(ptp->info, &ts); > > by default, but in your driver will read the special counter. Looks better than my getfreeruntimex64. Thanks for your suggestion! Gerhard
On Mon, Mar 07, 2022 at 06:54:19PM +0100, Gerhard Engleder wrote: > ktime_to_cycles uses hwtstamp as key for the cache lookup. As long as > the PHC is monotonic, the key is unique. If the time of the PHC is set, then > the cache would be invalidated. I'm afraid that setting the PHC could lead to > wrong or missing timestamps. "Wrong" timestamps happen with normal PHCs anyhow. Not a big deal. Your driver can drop the cache and force racing vclock frames to omit the Rx timestamp. User space must deal with this in any case. > For RX both timestamps are already available within skbuff, because they are > stored in front of the Ethernet header by the hardware. So I have to find a way > to detect the RX case and copy the right timestamp to hwtstamp. This works for your driver, but not for the generic vclock. Maybe you could let the ptp_convert_timestamp() method pass the skb back into your driver. Thanks, Richard
On Mon, Mar 07, 2022 at 06:54:19PM +0100, Gerhard Engleder wrote: > ktime_to_cycles uses hwtstamp as key for the cache lookup. As long as > the PHC is monotonic, the key is unique. If the time of the PHC is set, then > the cache would be invalidated. I'm afraid that setting the PHC could lead to > wrong or missing timestamps. Setting the PHC in hardware, timestamp > generation in hardware, and cache invalidation in software would need to > be synchronized somehow. You can avoid errors even with a time jump: Make a variant (union) of skb_shared_hwtstamps to allow driver to provide an address or cookie instead of ktime_t. Set a flag in the skbuff to signal this. Let the Rx path check the flag and fetch the time stamp by callback with the address/cookie. Thanks, Richard
On Mon, Mar 07, 2022 at 06:54:19PM +0100, Gerhard Engleder wrote: > For TX it is known which timestamp is required. So I would have to find a way > to detect which timestamp shall be filled into hwtstamp. How about tx_flags in struct skb_shared_info ? Thanks, Richard
> > ktime_to_cycles uses hwtstamp as key for the cache lookup. As long as > > the PHC is monotonic, the key is unique. If the time of the PHC is set, then > > the cache would be invalidated. I'm afraid that setting the PHC could lead to > > wrong or missing timestamps. Setting the PHC in hardware, timestamp > > generation in hardware, and cache invalidation in software would need to > > be synchronized somehow. > > You can avoid errors even with a time jump: > > Make a variant (union) of skb_shared_hwtstamps to allow driver to > provide an address or cookie instead of ktime_t. Set a flag in the > skbuff to signal this. > > Let the Rx path check the flag and fetch the time stamp by callback > with the address/cookie. > > > For TX it is known which timestamp is required. So I would have to find a way > > to detect which timestamp shall be filled into hwtstamp. > > How about tx_flags in struct skb_shared_info ? I will try to make an implementation! Shall I use tx_flags in struct skb_shared_info for both, TX and RX? Or should I use flags in struct skb_shared_info for address/cookie signalisation? 3 of 8 flags are unused in tx_flags. It may be possible to use the same flag for TX and RX. Is it worth it trying to save flags? For TX it signals "fill cycle based timestamp" and could be called SKBTX_HW_CSTAMP. For RX it signals "hwtstamp is an address/cookie" and could be called SKBFL_TSTAMP_COOKIE. Thank you! Gerhard
On Tue, Mar 08, 2022 at 08:49:03PM +0100, Gerhard Engleder wrote: > 3 of 8 flags are unused in tx_flags. It may be possible to use the same flag for > TX and RX. Is it worth it trying to save flags? Yes, every bit is precious. Thanks, Richard
On Mon, Mar 07, 2022 at 06:37:48AM -0800, Richard Cochran wrote: > On Mon, Mar 07, 2022 at 03:23:58PM +0100, Miroslav Lichvar wrote: > > > There is a need to have multiple PHCs per device and for that to work > > the drivers need to be able to save multiple timestamps per packet. > > That is a big job that entails a new user API. On second thought, maybe the address/cookie idea will at least open up multiple driver timestamps in a practical way... Thanks, Richard