diff mbox series

[3/5] spi: core: allow defining time that cs is deasserted as a multiple of SCK

Message ID 20190223084952.14758-4-kernel@martin.sperl.org (mailing list archive)
State New, archived
Headers show
Series allow to define cs deassert times in us, ns and SCK-len | expand

Commit Message

Martin Sperl Feb. 23, 2019, 8:49 a.m. UTC
From: Martin Sperl <kernel@martin.sperl.org>

Support setting a delay between cs assert and deassert as
a multiple of spi clock length.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/spi/spi.c       | 8 ++++++++
 include/linux/spi/spi.h | 1 +
 2 files changed, 9 insertions(+)

--
2.11.0

Comments

Lukas Wunner Feb. 23, 2019, 12:40 p.m. UTC | #1
On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote:
> Support setting a delay between cs assert and deassert as
> a multiple of spi clock length.

To what end?


> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -1092,6 +1092,7 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg,
>  {
>  	u32 delay = xfer->cs_change_delay;
>  	u32 unit = xfer->cs_change_delay_unit;
> +	u32 hz;
> 
>  	/* return early on "fast" mode - for everything but USECS */
>  	if (!delay && unit != SPI_DELAY_UNIT_USECS)
> @@ -1107,6 +1108,13 @@ static void _spi_transfer_cs_change_delay(struct spi_message *msg,
>  		break;
>  	case SPI_DELAY_UNIT_NSECS: /* nothing to do here */
>  		break;
> +	case SPI_DELAY_UNIT_SCK:
> +		/* if there is no effective speed know, then approximate
> +		 * by underestimating with half the requested hz
> +		 */
> +		hz = xfer->effective_speed_hz ?: xfer->speed_hz / 2;
> +		delay *= DIV_ROUND_UP(1000000000, hz);
> +		break;
>  	default:
>  		dev_err_once(&msg->spi->dev,
>  			     "Use of unsupported delay unit %i, using default of 10us\n",
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index f503a712423d..0b1d5e2b8b8b 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -799,6 +799,7 @@ struct spi_transfer {
>  	u8		cs_change_delay_unit;
>  #define SPI_DELAY_UNIT_USECS	0
>  #define SPI_DELAY_UNIT_NSECS	1
> +#define SPI_DELAY_UNIT_SCK	2

The ability to define a unit seems somewhat over-engineered to me.
You're introducing lots of branching and integer divisions (both
comparatively expensive operations) which are executed for every
single transfer (i.e. in a hot path).  I'd recommend standardizing
on a single unit.  If 1 usec is too coarse-grained for your use case,
convert everything to nsec.

Patch [5/5] in this series didn't make it through to linux-rpi-kernel,
this has happened to my own patches as well in the past:  They're put in
a moderation queue because the number of recipients exceeds a limit,
but noone ever releases them from the moderation queue, which is
unfortunate.

Thanks,

Lukas
Martin Sperl Feb. 23, 2019, 1:15 p.m. UTC | #2
Hi Lukas!

> On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote:
> 
> On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote:
>> Support setting a delay between cs assert and deassert as
>> a multiple of spi clock length.
> 
> To what end?
...
> 
> The ability to define a unit seems somewhat over-engineered to me.
> You're introducing lots of branching and integer divisions (both
> comparatively expensive operations) which are executed for every
> single transfer (i.e. in a hot path).  I'd recommend standardizing
> on a single unit.  If 1 usec is too coarse-grained for your use case,
> convert everything to nsec.

Well - a spi_device driver does not actually know what is the
actual spi clock configured/used by the spi-bus-driver.

All it knows is that the device requires X clock cycles of cs high.
So how would you define that so that is valid for all cases without
replicating code in every driver that needs it (everywhere slightly
differently)?

As for hot-path: we are not in the hot path - we need to waste time
when any of those parameters are set!

cs_change is not that commonly used - or cs_delay_usec for that matter.

The corresponding ifs int the hot path are in the code right now - 
worsted case we are jumping a few more cpu instructions further than we
would right now.

Also the code has been moved into a separate function and some of it
is used from multiple locations.

Thus it is up to the compiler to know best if it is worth inlining the 
function or just calling it.

From a CPU utilization perspective: for most situations we will
(most likely) be busy waiting to pass the required time anyway.
So those divides are just taking time from those busy cycles…

Finally if the computation yields a shorter value than a (conservative)
time-value that would have been given instead, then those computations
(divides)  have helped save cpu cycles and SPI bus bandwidth, so they
are beneficial.

As for the rpi-mailing-list: if this is an issue, then this is a
separate issue that needs to get addressed separately as it is not
really code-related.

Martin
Lukas Wunner Feb. 24, 2019, 10:39 a.m. UTC | #3
On Sat, Feb 23, 2019 at 02:15:13PM +0100, kernel@martin.sperl.org wrote:
> On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote:
> > On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote:
> > > Support setting a delay between cs assert and deassert as
> > > a multiple of spi clock length.
> > 
> > The ability to define a unit seems somewhat over-engineered to me.
> > You're introducing lots of branching and integer divisions (both
> > comparatively expensive operations) which are executed for every
> > single transfer (i.e. in a hot path).  I'd recommend standardizing
> > on a single unit.  If 1 usec is too coarse-grained for your use case,
> > convert everything to nsec.
> 
> Well - a spi_device driver does not actually know what is the
> actual spi clock configured/used by the spi-bus-driver.

Which slave device and driver are we talking about anyway?
Is the driver actually making use of the ability to change the
speed per-transfer?  Most slaves I've dealt with so far use a
constant speed.  In that case the master driver could set the
slave's ->speed_hz member to the effective speed on ->setup()
and the slave's driver could then calculate the delay in nsec
or usec based on that speed.


> All it knows is that the device requires X clock cycles of cs high.

That's what I was driving at with my "to what end?" question.  You've
apparently got a slave device that requires a delay of a specific number
of clock cycles.  You should make that explicit in the commit messages,
ideally by naming the slave that requires it.  Right now your commit
messages only explain the "what", not the "why".  For an uninitiated
reader it's helpful to understand the specific motivation of your code
changes.


> From a CPU utilization perspective: for most situations we will
> (most likely) be busy waiting to pass the required time anyway.
> So those divides are just taking time from those busy cycles???

Okay.


> As for the rpi-mailing-list: if this is an issue, then this is a
> separate issue that needs to get addressed separately as it is not
> really code-related.

Yes, obviously that issue is tangential to the present series and
not a review comment, I raised it in the hope that someone reading it
may have an idea how the situation can be improved.  Thanks for
resending the missing patch.

Lukas
Martin Sperl Feb. 24, 2019, 11:03 a.m. UTC | #4
> On 24.02.2019, at 11:39, Lukas Wunner <lukas@wunner.de> wrote:
> 
> On Sat, Feb 23, 2019 at 02:15:13PM +0100, kernel@martin.sperl.org wrote:
>> On 23.02.2019, at 13:40, Lukas Wunner <lukas@wunner.de> wrote:
>>> On Sat, Feb 23, 2019 at 08:49:50AM +0000, kernel@martin.sperl.org wrote:
>>>> Support setting a delay between cs assert and deassert as
>>>> a multiple of spi clock length.
>>> 
>>> The ability to define a unit seems somewhat over-engineered to me.
>>> You're introducing lots of branching and integer divisions (both
>>> comparatively expensive operations) which are executed for every
>>> single transfer (i.e. in a hot path).  I'd recommend standardizing
>>> on a single unit.  If 1 usec is too coarse-grained for your use case,
>>> convert everything to nsec.
>> 
>> Well - a spi_device driver does not actually know what is the
>> actual spi clock configured/used by the spi-bus-driver.
> 
> Which slave device and driver are we talking about anyway?
> Is the driver actually making use of the ability to change the
> speed per-transfer?  Most slaves I've dealt with so far use a
> constant speed.  In that case the master driver could set the
> slave's ->speed_hz member to the effective speed on ->setup()
> and the slave's driver could then calculate the delay in nsec
> or usec based on that speed.

Unfortunately this is not always the case...

Some devices - like the mcp2517fd -  have for example an internal PLL
based on an external clock. So during setup you have to use speed_hz 
of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we 
may be using speed_hz from the dt (or less if a module parameter is
used to limit ourselves further)

So the initial setup would not be able to help here - and every
bus controller would now be required to implement setup.

It also means open coding the calculations in each driver that 
needs something like this.

Thus it is - imo - in the right location to support it in spi core.

This is also the reason why I was asking if the same argument
for a multiple of SCKs may apply also to delay_us.

An ADC spi-slave would be an example where this could be needed
to keep the spi bus quiet during sampling - I remember having
seen one datasheet where cs was required to remain asserted
during sampling and after X SCK cycles the conversion could get read.

> 
>> All it knows is that the device requires X clock cycles of cs high.
> 
> That's what I was driving at with my "to what end?" question.  You've
> apparently got a slave device that requires a delay of a specific number
> of clock cycles.  You should make that explicit in the commit messages,
> ideally by naming the slave that requires it.  Right now your commit
> messages only explain the "what", not the "why".  For an uninitiated
> reader it's helpful to understand the specific motivation of your code
> changes.

I will make that commit message clearer in case a new version of 
the patchset is needed for other reasons.

Thanks,
	Martin
Mark Brown Feb. 26, 2019, 11:37 a.m. UTC | #5
On Sun, Feb 24, 2019 at 12:03:33PM +0100, kernel@martin.sperl.org wrote:

> Some devices - like the mcp2517fd -  have for example an internal PLL
> based on an external clock. So during setup you have to use speed_hz 
> of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we 
> may be using speed_hz from the dt (or less if a module parameter is
> used to limit ourselves further)

> So the initial setup would not be able to help here - and every
> bus controller would now be required to implement setup.

> It also means open coding the calculations in each driver that 
> needs something like this.

> Thus it is - imo - in the right location to support it in spi core.

I agree, this feature makes sense to me.
Martin Sperl May 7, 2019, 10:07 a.m. UTC | #6
> On 26.02.2019, at 12:37, Mark Brown <broonie@kernel.org> wrote:
> 
> On Sun, Feb 24, 2019 at 12:03:33PM +0100, kernel@martin.sperl.org wrote:
> 
>> Some devices - like the mcp2517fd -  have for example an internal PLL
>> based on an external clock. So during setup you have to use speed_hz 
>> of <clock_hz> / 2 (or 4MHz at most) and only when PLL is in sync we 
>> may be using speed_hz from the dt (or less if a module parameter is
>> used to limit ourselves further)
> 
>> So the initial setup would not be able to help here - and every
>> bus controller would now be required to implement setup.
> 
>> It also means open coding the calculations in each driver that 
>> needs something like this.
> 
>> Thus it is - imo - in the right location to support it in spi core.
> 
> I agree, this feature makes sense to me.

Is there anything that really block this patch?

Do you want a rebase?
Anything else?

Martin
diff mbox series

Patch

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 5a4616894d57..d5259bdd4b88 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1092,6 +1092,7 @@  static void _spi_transfer_cs_change_delay(struct spi_message *msg,
 {
 	u32 delay = xfer->cs_change_delay;
 	u32 unit = xfer->cs_change_delay_unit;
+	u32 hz;

 	/* return early on "fast" mode - for everything but USECS */
 	if (!delay && unit != SPI_DELAY_UNIT_USECS)
@@ -1107,6 +1108,13 @@  static void _spi_transfer_cs_change_delay(struct spi_message *msg,
 		break;
 	case SPI_DELAY_UNIT_NSECS: /* nothing to do here */
 		break;
+	case SPI_DELAY_UNIT_SCK:
+		/* if there is no effective speed know, then approximate
+		 * by underestimating with half the requested hz
+		 */
+		hz = xfer->effective_speed_hz ?: xfer->speed_hz / 2;
+		delay *= DIV_ROUND_UP(1000000000, hz);
+		break;
 	default:
 		dev_err_once(&msg->spi->dev,
 			     "Use of unsupported delay unit %i, using default of 10us\n",
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index f503a712423d..0b1d5e2b8b8b 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -799,6 +799,7 @@  struct spi_transfer {
 	u8		cs_change_delay_unit;
 #define SPI_DELAY_UNIT_USECS	0
 #define SPI_DELAY_UNIT_NSECS	1
+#define SPI_DELAY_UNIT_SCK	2
 	u32		speed_hz;
 	u16		word_delay;