Message ID | 20190223084952.14758-4-kernel@martin.sperl.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | allow to define cs deassert times in us, ns and SCK-len | expand |
On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote: > Support setting a delay between cs assert and deassert as > a multiple of spi clock length. To what end? > --- a/drivers/spi/spi.c > +++ b/drivers/spi/spi.c > @@ -1092,6 +1092,7 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg, > { > u32 delay = xfer->cs_change_delay; > u32 unit = xfer->cs_change_delay_unit; > + u32 hz; > > /* return early on "fast" mode - for everything but USECS */ > if (!delay && unit != SPI_DELAY_UNIT_USECS) > @@ -1107,6 +1108,13 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg, > break; > case SPI_DELAY_UNIT_NSECS: /* nothing to do here */ > break; > + case SPI_DELAY_UNIT_SCK: > + /* if there is no effective speed know, then approximate > + * by underestimating with half the requested hz > + */ > + hz = xfer->effective_speed_hz ?: xfer->speed_hz / 2; > + delay *= DIV_ROUND_UP(1000000000, hz); > + break; > default: > dev_err_once(&msg->spi->dev, > "Use of unsupported delay unit %i, using default of 10us\n", > diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h > index f503a712423d..0b1d5e2b8b8b 100644 > --- a/include/linux/spi/spi.h > +++ b/include/linux/spi/spi.h > @@ -799,6 +799,7 @@ struct spi_transfer { > u8 cs_change_delay_unit; > #define SPI_DELAY_UNIT_USECS 0 > #define SPI_DELAY_UNIT_NSECS 1 > +#define SPI_DELAY_UNIT_SCK 2 The ability to define a unit seems somewhat over-engineered to me. You're introducing lots of branching and integer divisions (both comparatively expensive operations) which are executed for every single transfer (i.e. in a hot path). I'd recommend standardizing on a single unit. If 1 usec is too coarse-grained for your use case, convert everything to nsec. Patch [5/5] in this series didn't make it through to linux-rpi-kernel, this has happened to my own patches as well in the past: They're put in a moderation queue because the number of recipients exceeds a limit, but noone ever releases them from the moderation queue, which is unfortunate. Thanks, Lukas
Hi Lukas! > On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote: > > On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote: >> Support setting a delay between cs assert and deassert as >> a multiple of spi clock length. > > To what end? ... > > The ability to define a unit seems somewhat over-engineered to me. > You're introducing lots of branching and integer divisions (both > comparatively expensive operations) which are executed for every > single transfer (i.e. in a hot path). I'd recommend standardizing > on a single unit. If 1 usec is too coarse-grained for your use case, > convert everything to nsec. Well - a spi_device driver does not actually know what is the actual spi clock configured/used by the spi-bus-driver. All it knows is that the device requires X clock cycles of cs high. So how would you define that so that is valid for all cases without replicating code in every driver that needs it (everywhere slightly differently)? As for hot-path: we are not in the hot path - we need to waste time when any of those parameters are set! cs_change is not that commonly used - or cs_delay_usec for that matter. The corresponding ifs int the hot path are in the code right now - worsted case we are jumping a few more cpu instructions further than we would right now. Also the code has been moved into a separate function and some of it is used from multiple locations. Thus it is up to the compiler to know best if it is worth inlining the function or just calling it. From a CPU utilization perspective: for most situations we will (most likely) be busy waiting to pass the required time anyway. So those divides are just taking time from those busy cycles… Finally if the computation yields a shorter value than a (conservative) time-value that would have been given instead, then those computations (divides) have helped save cpu cycles and SPI bus bandwidth, so they are beneficial. As for the rpi-mailing-list: if this is an issue, then this is a separate issue that needs to get addressed separately as it is not really code-related. Martin
On Sat, Feb 23, 2019 at 02:15:13PM +0100, kernel@martin.sperl.org wrote: > On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote: > > On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote: > > > Support setting a delay between cs assert and deassert as > > > a multiple of spi clock length. > > > > The ability to define a unit seems somewhat over-engineered to me. > > You're introducing lots of branching and integer divisions (both > > comparatively expensive operations) which are executed for every > > single transfer (i.e. in a hot path). I'd recommend standardizing > > on a single unit. If 1 usec is too coarse-grained for your use case, > > convert everything to nsec. > > Well - a spi_device driver does not actually know what is the > actual spi clock configured/used by the spi-bus-driver. Which slave device and driver are we talking about anyway? Is the driver actually making use of the ability to change the speed per-transfer? Most slaves I've dealt with so far use a constant speed. In that case the master driver could set the slave's ->speed_hz member to the effective speed on ->setup() and the slave's driver could then calculate the delay in nsec or usec based on that speed. > All it knows is that the device requires X clock cycles of cs high. That's what I was driving at with my "to what end?" question. You've apparently got a slave device that requires a delay of a specific number of clock cycles. You should make that explicit in the commit messages, ideally by naming the slave that requires it. Right now your commit messages only explain the "what", not the "why". For an uninitiated reader it's helpful to understand the specific motivation of your code changes. > From a CPU utilization perspective: for most situations we will > (most likely) be busy waiting to pass the required time anyway. > So those divides are just taking time from those busy cycles??? Okay. > As for the rpi-mailing-list: if this is an issue, then this is a > separate issue that needs to get addressed separately as it is not > really code-related. Yes, obviously that issue is tangential to the present series and not a review comment, I raised it in the hope that someone reading it may have an idea how the situation can be improved. Thanks for resending the missing patch. Lukas
> On 24.02.2019, at 11:39, Lukas Wunner <lukas@wunner.de> wrote: > > On Sat, Feb 23, 2019 at 02:15:13PM +0100, kernel@martin.sperl.org wrote: >> On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote: >>> On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote: >>>> Support setting a delay between cs assert and deassert as >>>> a multiple of spi clock length. >>> >>> The ability to define a unit seems somewhat over-engineered to me. >>> You're introducing lots of branching and integer divisions (both >>> comparatively expensive operations) which are executed for every >>> single transfer (i.e. in a hot path). I'd recommend standardizing >>> on a single unit. If 1 usec is too coarse-grained for your use case, >>> convert everything to nsec. >> >> Well - a spi_device driver does not actually know what is the >> actual spi clock configured/used by the spi-bus-driver. > > Which slave device and driver are we talking about anyway? > Is the driver actually making use of the ability to change the > speed per-transfer? Most slaves I've dealt with so far use a > constant speed. In that case the master driver could set the > slave's ->speed_hz member to the effective speed on ->setup() > and the slave's driver could then calculate the delay in nsec > or usec based on that speed. Unfortunately this is not always the case... Some devices - like the mcp2517fd - have for example an internal PLL based on an external clock. So during setup you have to use speed_hz of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we may be using speed_hz from the dt (or less if a module parameter is used to limit ourselves further) So the initial setup would not be able to help here - and every bus controller would now be required to implement setup. It also means open coding the calculations in each driver that needs something like this. Thus it is - imo - in the right location to support it in spi core. This is also the reason why I was asking if the same argument for a multiple of SCKs may apply also to delay_us. An ADC spi-slave would be an example where this could be needed to keep the spi bus quiet during sampling - I remember having seen one datasheet where cs was required to remain asserted during sampling and after X SCK cycles the conversion could get read. > >> All it knows is that the device requires X clock cycles of cs high. > > That's what I was driving at with my "to what end?" question. You've > apparently got a slave device that requires a delay of a specific number > of clock cycles. You should make that explicit in the commit messages, > ideally by naming the slave that requires it. Right now your commit > messages only explain the "what", not the "why". For an uninitiated > reader it's helpful to understand the specific motivation of your code > changes. I will make that commit message clearer in case a new version of the patchset is needed for other reasons. Thanks, Martin
On Sun, Feb 24, 2019 at 12:03:33PM +0100, kernel@martin.sperl.org wrote: > Some devices - like the mcp2517fd - have for example an internal PLL > based on an external clock. So during setup you have to use speed_hz > of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we > may be using speed_hz from the dt (or less if a module parameter is > used to limit ourselves further) > So the initial setup would not be able to help here - and every > bus controller would now be required to implement setup. > It also means open coding the calculations in each driver that > needs something like this. > Thus it is - imo - in the right location to support it in spi core. I agree, this feature makes sense to me.
> On 26.02.2019, at 12:37, Mark Brown <broonie@kernel.org> wrote: > > On Sun, Feb 24, 2019 at 12:03:33PM +0100, kernel@martin.sperl.org wrote: > >> Some devices - like the mcp2517fd - have for example an internal PLL >> based on an external clock. So during setup you have to use speed_hz >> of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we >> may be using speed_hz from the dt (or less if a module parameter is >> used to limit ourselves further) > >> So the initial setup would not be able to help here - and every >> bus controller would now be required to implement setup. > >> It also means open coding the calculations in each driver that >> needs something like this. > >> Thus it is - imo - in the right location to support it in spi core. > > I agree, this feature makes sense to me. Is there anything that really block this patch? Do you want a rebase? Anything else? Martin
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 5a4616894d57..d5259bdd4b88 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1092,6 +1092,7 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg, { u32 delay = xfer->cs_change_delay; u32 unit = xfer->cs_change_delay_unit; + u32 hz; /* return early on "fast" mode - for everything but USECS */ if (!delay && unit != SPI_DELAY_UNIT_USECS) @@ -1107,6 +1108,13 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg, break; case SPI_DELAY_UNIT_NSECS: /* nothing to do here */ break; + case SPI_DELAY_UNIT_SCK: + /* if there is no effective speed know, then approximate + * by underestimating with half the requested hz + */ + hz = xfer->effective_speed_hz ?: xfer->speed_hz / 2; + delay *= DIV_ROUND_UP(1000000000, hz); + break; default: dev_err_once(&msg->spi->dev, "Use of unsupported delay unit %i, using default of 10us\n", diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index f503a712423d..0b1d5e2b8b8b 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -799,6 +799,7 @@ struct spi_transfer { u8 cs_change_delay_unit; #define SPI_DELAY_UNIT_USECS 0 #define SPI_DELAY_UNIT_NSECS 1 +#define SPI_DELAY_UNIT_SCK 2 u32 speed_hz; u16 word_delay;