Message ID | 20190816004449.10100-4-olteanv@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Deterministic SPI latency on NXP | expand |
On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote: > @@ -842,6 +843,9 @@ struct spi_transfer { > > u32 effective_speed_hz; > > + struct ptp_system_timestamp *ptp_sts; > + unsigned int ptp_sts_word_offset; > + You've not documented these fields at all so it's not clear what the intended usage is.
Hi Mark, On Fri, 16 Aug 2019 at 15:18, Mark Brown <broonie@kernel.org> wrote: > > On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote: > > > @@ -842,6 +843,9 @@ struct spi_transfer { > > > > u32 effective_speed_hz; > > > > + struct ptp_system_timestamp *ptp_sts; > > + unsigned int ptp_sts_word_offset; > > + > > You've not documented these fields at all so it's not clear what the > intended usage is. Thanks for looking into this. Indeed I didn't document them as the patch is part of a RFC and I thought the purpose was more clear from the context (cover letter etc). If I do ever send a patchset for submission I will document the newly introduced fields properly. So let me clarify: The SPI slave device driver is populating these fields to indicate to the controller driver that it wants word number @ptp_sts_word_offset from the tx buffer snapshotted. The controller driver is supposed to put the snapshot into the @ptp_sts field, which is a pointer to a memory location under the control of the SPI slave device driver. It is ok if the ptp_sts pointer is NULL (no need to check), because the API for taking snapshots already checks for that. At the moment there is yet no proposed mechanism for the SPI slave driver to ensure that the controller will really act upon this request. That would be really nice to have, since some SPI slave devices are time-sensitive and warning early is a good way to prevent unnecessary troubleshooting. Regards, -Vladimir
On Fri, Aug 16, 2019 at 03:35:30PM +0300, Vladimir Oltean wrote: > On Fri, 16 Aug 2019 at 15:18, Mark Brown <broonie@kernel.org> wrote: > > On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote: > > > @@ -842,6 +843,9 @@ struct spi_transfer { > > > > > > u32 effective_speed_hz; > > > > > > + struct ptp_system_timestamp *ptp_sts; > > > + unsigned int ptp_sts_word_offset; > > > + > > You've not documented these fields at all so it's not clear what the > > intended usage is. > Thanks for looking into this. > Indeed I didn't document them as the patch is part of a RFC and I > thought the purpose was more clear from the context (cover letter > etc). > If I do ever send a patchset for submission I will document the newly > introduced fields properly. The issue I'm having is that I have zero idea about the PTP API so I've got nothing to go on when thinking about if this approach makes any sense unless I go do some research. > So let me clarify: > The SPI slave device driver is populating these fields to indicate to > the controller driver that it wants word number @ptp_sts_word_offset > from the tx buffer snapshotted. The controller driver is supposed to > put the snapshot into the @ptp_sts field, which is a pointer to a > memory location under the control of the SPI slave device driver. Snapshot here basically meaning recording a timestamp? This interface does seem like it basically precludes DMA based controllers from using it unless someone happened to implement some very specific stuff in hardware which seems implausible. I'd be inclined to just require that users can only snapshot the first (and possibly also the last, though DMA completions make that fun) word of a transfer, we could then pull this out into the core a bit by providing a wrapper function drivers should call at the appropriate moment. > It is ok if the ptp_sts pointer is NULL (no need to check), because > the API for taking snapshots already checks for that. > At the moment there is yet no proposed mechanism for the SPI slave > driver to ensure that the controller will really act upon this > request. That would be really nice to have, since some SPI slave > devices are time-sensitive and warning early is a good way to prevent > unnecessary troubleshooting. Yes, that's one of the things I was thinking about looking at the series - we should at least be able to warn if we can't timestamp so nobody gets confused, possibly error out if the calling code particularly depends on it.
On Fri, 16 Aug 2019 at 15:58, Mark Brown <broonie@kernel.org> wrote: > > On Fri, Aug 16, 2019 at 03:35:30PM +0300, Vladimir Oltean wrote: > > On Fri, 16 Aug 2019 at 15:18, Mark Brown <broonie@kernel.org> wrote: > > > On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote: > > > > > @@ -842,6 +843,9 @@ struct spi_transfer { > > > > > > > > u32 effective_speed_hz; > > > > > > > > + struct ptp_system_timestamp *ptp_sts; > > > > + unsigned int ptp_sts_word_offset; > > > > + > > > > You've not documented these fields at all so it's not clear what the > > > intended usage is. > > > Thanks for looking into this. > > Indeed I didn't document them as the patch is part of a RFC and I > > thought the purpose was more clear from the context (cover letter > > etc). > > If I do ever send a patchset for submission I will document the newly > > introduced fields properly. > > The issue I'm having is that I have zero idea about the PTP API so I've > got nothing to go on when thinking about if this approach makes any > sense unless I go do some research. > > > So let me clarify: > > The SPI slave device driver is populating these fields to indicate to > > the controller driver that it wants word number @ptp_sts_word_offset > > from the tx buffer snapshotted. The controller driver is supposed to > > put the snapshot into the @ptp_sts field, which is a pointer to a > > memory location under the control of the SPI slave device driver. > > Snapshot here basically meaning recording a timestamp? This interface > does seem like it basically precludes DMA based controllers from using > it unless someone happened to implement some very specific stuff in > hardware which seems implausible. I'd be inclined to just require that > users can only snapshot the first (and possibly also the last, though > DMA completions make that fun) word of a transfer, we could then pull > this out into the core a bit by providing a wrapper function drivers > should call at the appropriate moment. > I'm not sure how to respond to this, because I don't know anything about the timing of DMA transfers. Maybe snapshotting DMA transfers the same way is not possible (if at all). Maybe they are not exactly adequate for this sort of application anyway. Maybe it depends. But the switch I'm working on is issuing an internal read transaction of the PTP timer exactly at the 4th-to-last bit of the 3rd byte. This is so that it has time (4 SPI clock cycles, to be precise) for the result of the read transaction to become available again to the SPI block, for output. It is impossible to know exactly when the switch will snapshot the time internally (because there are several clock domain crossings from the SPI interface towards its core) but for certain it takes place during the latter part of the 3rd SPI byte. I believe other devices are similar in this regard. In other words, from a purely performance perspective, I am against limiting the API to just snapshotting the first and last byte. At this level of "zoom", if I change the offset of the byte to anything other than 3, the synchronization offset refuses to converge towards zero, because the snapshot is incurring a constant offset that the servo loop from userspace (phc2sys) can't compensate for. Maybe the SPI master driver should just report what sort of snapshotting capability it can offer, ranging from none (default unless otherwise specified), to transfer-level (DMA style) or byte-level. I'm afraid more actual experimentation is needed with DMA-based controllers to understand what can be expected from them, and as a result, how the API should map around them. MDIO bus controllers are in a similar situation (with Hubert's patch) but at least there the frame size is fixed and I haven't heard of an MDIO controller to use DMA. I'm not really sure what the next step would be. In the other thread, Richard Cochran mentioned something about a two-part write API, although I didn't quite understand the idea behind it. > > It is ok if the ptp_sts pointer is NULL (no need to check), because > > the API for taking snapshots already checks for that. > > At the moment there is yet no proposed mechanism for the SPI slave > > driver to ensure that the controller will really act upon this > > request. That would be really nice to have, since some SPI slave > > devices are time-sensitive and warning early is a good way to prevent > > unnecessary troubleshooting. > > Yes, that's one of the things I was thinking about looking at the series > - we should at least be able to warn if we can't timestamp so nobody > gets confused, possibly error out if the calling code particularly > depends on it. Regards, -Vladimir
> MDIO bus controllers are in a similar situation (with Hubert's patch) > but at least there the frame size is fixed and I haven't heard of an > MDIO controller to use DMA. Linux does not have any DMA driver MDIO busses, as far as i know. It does not make sense, since you are only transferring 16bits of data. The vast majority are polled completion, but there is one which generates an interrupt on completion. Andrew
On Fri, Aug 16, 2019 at 05:05:53PM +0300, Vladimir Oltean wrote: > I'm not sure how to respond to this, because I don't know anything > about the timing of DMA transfers. > Maybe snapshotting DMA transfers the same way is not possible (if at > all). Maybe they are not exactly adequate for this sort of application > anyway. Maybe it depends. DMA transfers generally proceed without any involvement from the CPU, this is broadly the point of DMA. You *may* be able to split into multiple transactions but it's not reliable that you'd be able to do so on byte boundaries and there will be latency getting notified of completions. > In other words, from a purely performance perspective, I am against > limiting the API to just snapshotting the first and last byte. At this > level of "zoom", if I change the offset of the byte to anything other > than 3, the synchronization offset refuses to converge towards zero, > because the snapshot is incurring a constant offset that the servo > loop from userspace (phc2sys) can't compensate for. > Maybe the SPI master driver should just report what sort of > snapshotting capability it can offer, ranging from none (default > unless otherwise specified), to transfer-level (DMA style) or > byte-level. That does then have the consequence that the majority of controllers aren't going to be usable with the API which isn't great. > I'm afraid more actual experimentation is needed with DMA-based > controllers to understand what can be expected from them, and as a > result, how the API should map around them. > MDIO bus controllers are in a similar situation (with Hubert's patch) > but at least there the frame size is fixed and I haven't heard of an > MDIO controller to use DMA. I'm not 100% clear what the problem you're trying to solve is, or if it's a sensible problem to try to solve for that matter.
Hi Mark, On Tue, 20 Aug 2019 at 15:55, Mark Brown <broonie@kernel.org> wrote: > > On Fri, Aug 16, 2019 at 05:05:53PM +0300, Vladimir Oltean wrote: > > > I'm not sure how to respond to this, because I don't know anything > > about the timing of DMA transfers. > > Maybe snapshotting DMA transfers the same way is not possible (if at > > all). Maybe they are not exactly adequate for this sort of application > > anyway. Maybe it depends. > > DMA transfers generally proceed without any involvement from the CPU, > this is broadly the point of DMA. You *may* be able to split into > multiple transactions but it's not reliable that you'd be able to do so > on byte boundaries and there will be latency getting notified of > completions. > > > In other words, from a purely performance perspective, I am against > > limiting the API to just snapshotting the first and last byte. At this > > level of "zoom", if I change the offset of the byte to anything other > > than 3, the synchronization offset refuses to converge towards zero, > > because the snapshot is incurring a constant offset that the servo > > loop from userspace (phc2sys) can't compensate for. > > > Maybe the SPI master driver should just report what sort of > > snapshotting capability it can offer, ranging from none (default > > unless otherwise specified), to transfer-level (DMA style) or > > byte-level. > > That does then have the consequence that the majority of controllers > aren't going to be usable with the API which isn't great. > Can we continue this discussion on this thread: https://www.spinics.net/lists/netdev/msg593772.html The whole point there is that if there's nothing that the driver can do, the SPI core will take the timestamps and record their (bad) precision. > > I'm afraid more actual experimentation is needed with DMA-based > > controllers to understand what can be expected from them, and as a > > result, how the API should map around them. > > MDIO bus controllers are in a similar situation (with Hubert's patch) > > but at least there the frame size is fixed and I haven't heard of an > > MDIO controller to use DMA. > > I'm not 100% clear what the problem you're trying to solve is, or if > it's a sensible problem to try to solve for that matter. The problem can simply be summarized as: you're trying to read a clock over SPI, but there's so much timing jitter in you doing that, that you have a high degree of uncertainty in the actual precision of the readout you took. The solution has two parts: - Make the SPI access itself more predictable in terms of latency. This is always going to have to be dealt with on a driver-by-driver, hardware-by-hardware basis. - Provide a way of taking a software timestamp in the time interval when the latency is predictable, and as close as possible to the moment when the SPI slave will receive the request. Disabling interrupts and preemption always helps to snapshot that critical section. Again, the SPI core can't do that. And finding the correct "pre" and "post" hooks that surround the hardware transfer in a deterministic fashion is crucial. If you read the cover letter, I used a GPIO pin to make sure the timestamps are where they should be, and that they don't vary in width (post - pre) - there are also some screenshots on Gdrive. Maybe something similar is not impossible for a DMA transfer, although the problem formulation so far is too vague to emit a more clear statement. If you know when the SPI slave's clock was actually read, you have a better idea of what time it was. Regards, -Vladimir
On Tue, Aug 20, 2019 at 04:48:39PM +0300, Vladimir Oltean wrote: > On Tue, 20 Aug 2019 at 15:55, Mark Brown <broonie@kernel.org> wrote: > > On Fri, Aug 16, 2019 at 05:05:53PM +0300, Vladimir Oltean wrote: > > > Maybe the SPI master driver should just report what sort of > > > snapshotting capability it can offer, ranging from none (default > > > unless otherwise specified), to transfer-level (DMA style) or > > > byte-level. > > That does then have the consequence that the majority of controllers > > aren't going to be usable with the API which isn't great. > Can we continue this discussion on this thread: > https://www.spinics.net/lists/netdev/msg593772.html > The whole point there is that if there's nothing that the driver can > do, the SPI core will take the timestamps and record their (bad) > precision. I'm not on that thread. > > I'm not 100% clear what the problem you're trying to solve is, or if > > it's a sensible problem to try to solve for that matter. > The problem can simply be summarized as: you're trying to read a clock > over SPI, but there's so much timing jitter in you doing that, that > you have a high degree of uncertainty in the actual precision of the > readout you took. That doesn't seem like a great idea...
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..5a1e4b24c617 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -842,6 +843,9 @@ struct spi_transfer { u32 effective_speed_hz; + struct ptp_system_timestamp *ptp_sts; + unsigned int ptp_sts_word_offset; + struct list_head transfer_list; };
SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). Since there are lots of sources of timing jitter when retrieving the time from such a device (controller delays, locking contention etc), introduce a PTP system timestamp structure in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. Because SPI controllers may have jitter even between frames, also introduce a field which specifies to the controller driver specifically which byte needs to be snapshotted. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> --- include/linux/spi/spi.h | 4 ++++ 1 file changed, 4 insertions(+)