[3/6] spi: bcm2835: fill FIFO before enabling interrupts to

Message ID	3628E5E2-7EA9-488F-AF5F-A2E43D2D1E3E@martin.sperl.org (mailing list archive)
State	Accepted
Commit	e3a2be3030e2fec27a2577d3c52203da090a4366
Headers	show Return-Path: <linux-spi-owner@kernel.org> From: Martin Sperl <kernel@martin.sperl.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8BIT Subject: [PATCH 3/6] spi: bcm2835: fill FIFO before enabling interrupts to Message-Id: <3628E5E2-7EA9-488F-AF5F-A2E43D2D1E3E@martin.sperl.org> Date: Sun, 29 Mar 2015 16:03:25 +0200 To: Mark Brown <broonie@kernel.org>, Stephen Warren <swarren@wwwdotorg.org>, lee@kernel.org, linux-spi@vger.kernel.org, linux-rpi-kernel <linux-rpi-kernel@lists.infradead.org> Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Sender: linux-spi-owner@vger.kernel.org Precedence: bulk

Martin Sperl March 29, 2015, 2:03 p.m. UTC

reduce interrupts/message

To reduce the number of interrupts/message we fill the FIFO before
enabling interrupts - for short messages this reduces the interrupt count
from 2 to 1 interrupt.

There have been rare cases where short (<200ns) chip-select switches with
native CS have been observed during such operation, this is why this
optimization is only enabled for GPIO-CS.

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Tested-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/spi/spi-bcm2835.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Applies against spi - topic/bcm2835

Mark Brown March 29, 2015, 4:55 p.m. UTC | #1

On Sun, Mar 29, 2015 at 04:03:25PM +0200, Martin Sperl wrote:
>  reduce interrupts/message

Applied, thanks.  There's something wrong with the way you're sending
patches - long subject lines are being wrapped into the body of the
patch.  Please check this, the easiest thing is most likely to be to use
git format-patch and git send-email to send your patches as that's
pretty much zero effort.

Please while looking at this also look into threading your patches into
a single thread per series, this makes things easier to work with.

Stephen Warren March 31, 2015, 3:16 a.m. UTC | #2

On 03/29/2015 08:03 AM, Martin Sperl wrote:
>  reduce interrupts/message
> 
> To reduce the number of interrupts/message we fill the FIFO before
> enabling interrupts - for short messages this reduces the interrupt count
> from 2 to 1 interrupt.
> 
> There have been rare cases where short (<200ns) chip-select switches with
> native CS have been observed during such operation, this is why this
> optimization is only enabled for GPIO-CS.

> diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c

> +	/* fill in fifo if we have gpio-cs
> +	 * note that there have been rare events where the native-CS
> +	 * flapped for <1us which may change the behaviour
> +	 * with gpio-cs this does not happen, so it is implemented
> +	 * only for this case
> +	 */
> +	if (gpio_is_valid(spi->cs_gpio)) {
> +		/* enable HW block, but without interrupts enabled
> +		 * this would triggern an immediate interrupt
> +		 */
> +		bcm2835_wr(bs, BCM2835_SPI_CS,
> +			   cs | BCM2835_SPI_CS_TA);
> +		/* fill in tx fifo as much as possible */
> +		bcm2835_wr_fifo(bs);
> +	}

I can understand perfectly why the code fills the FIFO before enabling
interrupts; it avoids having to immediately service an interrupt simply
to fill the FIFO.

However, I'm not sure why this is in any way related to whether the
chip-select GPIO is valid. Surely we always want to do this? How does
the mechanism used to control chip selects influence whether we want to
pre-fill the FIFO?
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mark Brown March 31, 2015, 5:25 a.m. UTC | #3

On Mon, Mar 30, 2015 at 09:16:21PM -0600, Stephen Warren wrote:
> On 03/29/2015 08:03 AM, Martin Sperl wrote:

> > There have been rare cases where short (<200ns) chip-select switches with
> > native CS have been observed during such operation, this is why this
> > optimization is only enabled for GPIO-CS.

> > +	/* fill in fifo if we have gpio-cs
> > +	 * note that there have been rare events where the native-CS
> > +	 * flapped for <1us which may change the behaviour
> > +	 * with gpio-cs this does not happen, so it is implemented
> > +	 * only for this case
> > +	 */

> However, I'm not sure why this is in any way related to whether the
> chip-select GPIO is valid. Surely we always want to do this? How does
> the mechanism used to control chip selects influence whether we want to
> pre-fill the FIFO?

I think both the comment and the commit message answer that question -
something triggers spurious chip select changes?

Martin Sperl March 31, 2015, 5:51 a.m. UTC | #4

> On 31.03.2015, at 05:16, Stephen Warren <swarren@wwwdotorg.org> wrote:
> 
> I can understand perfectly why the code fills the FIFO before enabling
> interrupts; it avoids having to immediately service an interrupt simply
> to fill the FIFO.
> 
> However, I'm not sure why this is in any way related to whether the
> chip-select GPIO is valid. Surely we always want to do this? How does
> the mechanism used to control chip selects influence whether we want to
> pre-fill the FIFO?

During the time I was building a DMA only driver I saw "rare" glitches
in the CS line when using native CS. The "glitch" is that the CS drops
to inactive for a short period of time - typically 1 sample length at
10MHz sample rate, so <0.1us.

These "glitches" also once have been observed with the current driver
when using native-CS, so I think it is prudent to avoid native-CS when
enabling this optimization.

Hence this limitation or maybe even the full move to GPIO-CS for all.
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren April 5, 2015, 1:49 a.m. UTC | #5

On 03/30/2015 11:25 PM, Mark Brown wrote:
> On Mon, Mar 30, 2015 at 09:16:21PM -0600, Stephen Warren wrote:
>> On 03/29/2015 08:03 AM, Martin Sperl wrote:
> 
>>> There have been rare cases where short (<200ns) chip-select
>>> switches with native CS have been observed during such
>>> operation, this is why this optimization is only enabled for
>>> GPIO-CS.
> 
>>> +	/* fill in fifo if we have gpio-cs +	 * note that there have
>>> been rare events where the native-CS +	 * flapped for <1us
>>> which may change the behaviour +	 * with gpio-cs this does not
>>> happen, so it is implemented +	 * only for this case +	 */
> 
>> However, I'm not sure why this is in any way related to whether
>> the chip-select GPIO is valid. Surely we always want to do this?
>> How does the mechanism used to control chip selects influence
>> whether we want to pre-fill the FIFO?
> 
> I think both the comment and the commit message answer that
> question - something triggers spurious chip select changes?

I must admit I don't feel either the commit message or the comment
explain the situation. They certainly state that there are glitches in
the "native CS" case, but that doesn't *explain* them. It seems more
likely the under-filling the FIFO would cause periods where the FIFO
was empty which would be aout the only case I could naively think of
for the CS to be de-asserted. More investigation would be useful.

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mark Brown April 6, 2015, 5:21 p.m. UTC | #6

On Sat, Apr 04, 2015 at 07:49:41PM -0600, Stephen Warren wrote:
> On 03/30/2015 11:25 PM, Mark Brown wrote:
> > On Mon, Mar 30, 2015 at 09:16:21PM -0600, Stephen Warren wrote:
> >> On 03/29/2015 08:03 AM, Martin Sperl wrote:

> >>> There have been rare cases where short (<200ns) chip-select
> >>> switches with native CS have been observed during such
> >>> operation, this is why this optimization is only enabled for
> >>> GPIO-CS.

> >> However, I'm not sure why this is in any way related to whether
> >> the chip-select GPIO is valid. Surely we always want to do this?
> >> How does the mechanism used to control chip selects influence
> >> whether we want to pre-fill the FIFO?

> > I think both the comment and the commit message answer that
> > question - something triggers spurious chip select changes?

> I must admit I don't feel either the commit message or the comment
> explain the situation. They certainly state that there are glitches in
> the "native CS" case, but that doesn't *explain* them. It seems more
> likely the under-filling the FIFO would cause periods where the FIFO
> was empty which would be aout the only case I could naively think of
> for the CS to be de-asserted. More investigation would be useful.

Right, and I have to say I do suspect that the underlying thing is that
the FIFO is underrunning, but as far as the optimization is concerned
that's a separate thing.  The reason this isn't enabled for native chip
selects is that it's not working, the reason it's not working is
something that should indeed probably be investigated.

Martin Sperl April 7, 2015, 6:25 a.m. UTC | #7

> On 06.04.2015, at 19:21, Mark Brown <broonie@kernel.org> wrote:
> Right, and I have to say I do suspect that the underlying thing is that
> the FIFO is underrunning, but as far as the optimization is concerned
> that's a separate thing.  The reason this isn't enabled for native chip
> selects is that it's not working, the reason it's not working is
> something that should indeed probably be investigated.
Actually it happens exactly when setting the CS-register with the
interrupt flags enabled - typically observed in the middle of a transmit
of a byte the CS jumps, but the clock and data continue the transfers
correctly

As the CS register contains the interrupt flags as well as the 
control for the native-chip-selects this is impacting the chip select
lines in native mode.

I have never seen a glitch on the CLK or MISO/MOSI line in my logic
analyzer only CS jumps in the order of 1 sample in the logic analyzer
- @20MHz sample rate, so in the order of 50ns, probably less...

On top that happens only under some rare circumstances and can get rarely
observed - lots of samples (and memory) are needed to catch that one,
which typically results in a message in dmesg due to some unexpected
response crc checksum errors stalled processing.

This is from my experience from over a year ago with a fully DMA
pipelined driver that did all the programming of SPI in DMA as well and
there were some reports of people seeing "glitches" before I moved to
GPIO-CS (equivalents)

See also:
http://www.raspberrypi.org/forums/viewtopic.php?f=44&t=19489&start=125#p451817
for some observations mostly related to CLEAR_TX/CLEAR_RX that also
de-assert CS for short periods.

I also saw it _once_ in one of my traces when using native_cs 2-3 weeks
ago, but unfortunately I did not keep the capture...

Hence the requirement to use of gpio-cs when running this optimization,
as then the CS is not managed by the HW itself.

I have been filling (dd if=/dev/zero) a 2GB SD card repeatedly (more
than 9 times) for the last  12 hours and I have seen no issues -
besides the spi bus being occupied and spi_bus_locked for 17ms for
the data-transfers, which obviously impacts packet-reception (packet
loss) on CAN and ethernet network - but that is expected due to this
heavy load and long transfers.

Note that this test was executed using the latest patches I have sent
running at 16k interrupts/s and 4400 context-switches/second at 4MHz 
SPI-bus-speed for the SD-card.
The transfer of 1.6GB to the SD-card (filling up the partition) took
about 4350 seconds or 366kB/s - 3600s would be the ideal situation
without overheads, latencies or similar.

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Martin Sperl April 7, 2015, 6:43 a.m. UTC | #8

Now remembering a bit more from what I have observed:
> On 07.04.2015, at 08:25, Martin Sperl <kernel@martin.sperl.org> wrote:
> 
> See also:
> http://www.raspberrypi.org/forums/viewtopic.php?f=44&t=19489&start=125#p451817
> for some observations mostly related to CLEAR_TX/CLEAR_RX that also
> de-assert CS for short periods.

Note that this TX/RX reset with native CS is also inhibiting the use
of DMA for any transfers not a multiple of 4 (DMA transfers data into
the FIFO in words not bytes). Because after those transfers we have
to reset the FIFO or the remaining bytes still in the FIFO will get
shifted out before a subsequent transfer.

So for any DMA enabled driver we need this kind of gpio-cs to avoid
this CS-glitch situation unless we want a "DMA only on multiple of 4"
with native-cs situation...

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren April 7, 2015, 3:39 p.m. UTC | #9

On 04/07/2015 12:25 AM, Martin Sperl wrote:
>
>> On 06.04.2015, at 19:21, Mark Brown <broonie@kernel.org> wrote:
>> Right, and I have to say I do suspect that the underlying thing is that
>> the FIFO is underrunning, but as far as the optimization is concerned
>> that's a separate thing.  The reason this isn't enabled for native chip
>> selects is that it's not working, the reason it's not working is
>> something that should indeed probably be investigated.
 >
> Actually it happens exactly when setting the CS-register with the
> interrupt flags enabled - typically observed in the middle of a transmit
> of a byte the CS jumps, but the clock and data continue the transfers
> correctly
>
> As the CS register contains the interrupt flags as well as the
> control for the native-chip-selects this is impacting the chip select
> lines in native mode.

Is the driver simply programming the HW incorrectly then? I would expect 
the driver to do something roughly like:

* Set up the HW to execute the transaction; everything except enabling 
IRQs and telling the HW to "go"
* Clear stale IRQ status (perhaps do this right at the start)
* Enable IRQs
* Tell the HW to "go"

... then not touch any CS-related register for the entire transfer. 
There shouldn't be a need to enable/disable IRQs during the transfer; 
just leave them enabled the entire time, until all bytes have been 
transferred.

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mark Brown April 7, 2015, 4:28 p.m. UTC | #10

On Tue, Apr 07, 2015 at 09:39:03AM -0600, Stephen Warren wrote:
> On 04/07/2015 12:25 AM, Martin Sperl wrote:

> >As the CS register contains the interrupt flags as well as the
> >control for the native-chip-selects this is impacting the chip select
> >lines in native mode.

> Is the driver simply programming the HW incorrectly then? I would expect the
> driver to do something roughly like:

> * Set up the HW to execute the transaction; everything except enabling IRQs
> and telling the HW to "go"
> * Clear stale IRQ status (perhaps do this right at the start)
> * Enable IRQs
> * Tell the HW to "go"

> ... then not touch any CS-related register for the entire transfer. There
> shouldn't be a need to enable/disable IRQs during the transfer; just leave
> them enabled the entire time, until all bytes have been transferred.

We will need to make sure /CS is kept asserted between transfers in a
message too.

Martin Sperl April 7, 2015, 5:05 p.m. UTC | #11

> On 07.04.2015, at 17:39, Stephen Warren <swarren@wwwdotorg.org> wrote:
> 
> Is the driver simply programming the HW incorrectly then? I would expect the driver to do something roughly like:
> 
> * Set up the HW to execute the transaction; everything except enabling IRQs and telling the HW to "go"
> * Clear stale IRQ status (perhaps do this right at the start)
> * Enable IRQs
> * Tell the HW to "go"

no it actually works differently:
* set TA and some of the other flags (POL,...) in CS-REG, which activates
  the SPI-block and as soon as there is data it will start the transfer.
  if you set the IRQ flags here, then IRQ will trigger as soon as the write
  hits the HW-block, as the fifo is empty

This is what the driver does right in 4.0rcX - it also sets the interrupt
flags.

But the problem is that it takes typcially 20us for the ISR to get REALLY
executed, which means an unecessary delay of 20us before the transfer
really starts.

The prefill approach instead:
* sets TA and some other flags, in CS,but leaves Interrupts disabled
* fills in FIFO which initiates the transfer
* sets the same values in CS-REG as above but now also with IRQ enabled

At this point in time there is a slight chance that CS will toggle for <1us
when using native CS - this does not happen with gpio-CS for obvious reasons.

Similar for the situation where you request a transfer of 13 Bytes via DMA.
You do this by filling in SPI-LEN with 13 to tell the engine to only shift
13 bytes out even if DMA fills in 16 bytes. When this transfer is finished
then there are still 3 bytes in the FIFO, so you have to clean that by 
setting CLEAR_TX/CLEAR_RX in the CS-register. 
If you set only those 2 bit without modifying the others, then there is 
again a chance that native-CS get toggled for a short period of time.

This is obviously not ideal when you have to keep CS low for the next
spi_transfer.

The only way around this was to actually cheat by using only CS2, and
modifying the CSPOL0 and CSPOL1 to do what I want because the CS only
seems to toggle for the "active" CS (as per bits 0:1 in CS).

But that is essentially the same as using cs-gpio, but just uses a
different set of registers.

I guess that there is some sort of logic in the HW that re-evaluates the
state of the native-chip-selects whenever CS-reg is written. Even for the
case where there is no change there seems to be a short period of time
when the CS is not driven (high or low), which, together with some pull-ups,
is pulling those lines high - and this is what we are seeing in some rare
cases.

So any write to the CS-register can influence native chip selects but this
only happens in rare cases typically showing after several hours of
repeated spi-transfers (in my case after 20M CAN messages received and about
100M SPI messages).

Using GPIO-cs solves these situations by not being influenced by the
SPI-hardware in any way in the first place.

Actually - if I think of it - even with the current driver in 4.0rcX
which is configuring CS-reg on every spi_transfer in a spi_message
could trigger the same behavior as well.

But as spi_messages with multiple spi_transfers are rarely used
and as the issue itself is only happening only with a low probability
and some devices not detecting cs changes that are below a certain time
the likleyhood of such a situation being detected are minimal and
would typically get attributed to other factors...

Hope that this answers the question and summarizes the observations
that i have made and why we have to move to gpio-cs for those
optimizations to become active...

Martin--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mark Brown April 7, 2015, 5:40 p.m. UTC | #12

On Tue, Apr 07, 2015 at 07:05:01PM +0200, Martin Sperl wrote:

> But as spi_messages with multiple spi_transfers are rarely used
> and as the issue itself is only happening only with a low probability
> and some devices not detecting cs changes that are below a certain time
> the likleyhood of such a situation being detected are minimal and
> would typically get attributed to other factors...

For your workload perhaps.  For some workloads they are very common
(register reads are the most obvious example, they are typically a write
transfer to provide the register address followed by a read transfer for
the data).

Martin Sperl April 7, 2015, 6:18 p.m. UTC | #13

> On 07.04.2015, at 19:40, Mark Brown <broonie@kernel.org> wrote:
> 
> For your workload perhaps.  For some workloads they are very common
> (register reads are the most obvious example, they are typically a write
> transfer to provide the register address followed by a read transfer for
> the data).
Which spi_write_then_read collapses into a single transfer.

Still I try to run now 4 distinct devices for my testing and
I believe the enc28j60 is one of those devices that does that
and I have not seen an issue there during typical loads...
Obviously there are some devices that are easier to load than
others to trigger such a situation.

It really just shows when running some transfers a lot of times...

So the move to gpio-cs should be on our priority list.

I will come up with a generic approach that allows also to set native
CS if needed... (as sketched in an earlier mail)

Martin

--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mark Brown April 7, 2015, 6:23 p.m. UTC | #14

On Tue, Apr 07, 2015 at 08:18:57PM +0200, Martin Sperl wrote:
> > On 07.04.2015, at 19:40, Mark Brown <broonie@kernel.org> wrote:

> > For your workload perhaps.  For some workloads they are very common
> > (register reads are the most obvious example, they are typically a write
> > transfer to provide the register address followed by a read transfer for
> > the data).

> Which spi_write_then_read collapses into a single transfer.

No it doesn't.  It creates a single message with two transfers (which is
what I'd expect any user doing this to do, there are others that don't
use the SPI helper APIs).

Martin Sperl April 7, 2015, 8:28 p.m. UTC | #15

> On 07.04.2015, at 20:23, Mark Brown <broonie@kernel.org> wrote:
> 
> No it doesn't.  It creates a single message with two transfers (which is
> what I'd expect any user doing this to do, there are others that don't
> use the SPI helper APIs).
Well after rereading it now - it does copying, so I was making a naiv
assumption that it would do a single transfer, but then there is the
SPI_3WIRE case that is special.

In principle it could run in a single transfer at the cost of some
memory (64 vs. 32 bytes) for anything that is not SPI_3WIRE. But then
spi_transfer is also in the order of 32 bytes....

This definitely would speed up things for a "typical" interrupt-driven
driver implementation using the transfer_one interface, as it would avoid
one interrupt and 2 context switches.

On the other hand this could also get optimized inside the framework at
the cost of copying twice and we would optimize other cases as well.
But that is some other project...
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren April 8, 2015, 2:01 a.m. UTC | #16

On 04/07/2015 11:05 AM, Martin Sperl wrote:
> 
>> On 07.04.2015, at 17:39, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>
>> Is the driver simply programming the HW incorrectly then? I would expect the driver to do something roughly like:
>>
>> * Set up the HW to execute the transaction; everything except enabling IRQs and telling the HW to "go"
>> * Clear stale IRQ status (perhaps do this right at the start)
>> * Enable IRQs
>> * Tell the HW to "go"
> 
> no it actually works differently:
> * set TA and some of the other flags (POL,...) in CS-REG, which activates
>   the SPI-block and as soon as there is data it will start the transfer.
>   if you set the IRQ flags here, then IRQ will trigger as soon as the write
>   hits the HW-block, as the fifo is empty
>
> This is what the driver does right in 4.0rcX - it also sets the interrupt
> flags.
> 
> But the problem is that it takes typcially 20us for the ISR to get REALLY
> executed, which means an unecessary delay of 20us before the transfer
> really starts.
> 
> The prefill approach instead:
> * sets TA and some other flags, in CS,but leaves Interrupts disabled
> * fills in FIFO which initiates the transfer
> * sets the same values in CS-REG as above but now also with IRQ enabled

Can you fill the FIFO as step 1? That way, you could presumably write to
that CS-REG just once, with all the desired bits set in one go.

> At this point in time there is a slight chance that CS will toggle for <1us
> when using native CS - this does not happen with gpio-CS for obvious reasons.
> 
> Similar for the situation where you request a transfer of 13 Bytes via DMA.
> You do this by filling in SPI-LEN with 13 to tell the engine to only shift
> 13 bytes out even if DMA fills in 16 bytes.

That sounds pretty typical for a 32-bit/dword-wide FIFO containing 8-bit
data. Surely DMA isn't relevant here; every dword write will put 4 bytes
into the FIFO irrespective of whether the write comes from the CPU or a
DMA engine. Generally, the HW will pull dword-sized values out of the
FIFO, and ignore any extra bytes that are packed into the dword beyond
whatever the programmed transfer length is. Doesn't the bcm283x SPI HW
do that too?

> When this transfer is finished
> then there are still 3 bytes in the FIFO, so you have to clean that by 
> setting CLEAR_TX/CLEAR_RX in the CS-register. 

That sounds pretty unusual. Have you tried not cleaning out the FIFO and
seeing if the next transfer does in fact only transfer the data you
write next, not the extra/stale bytes from the previous transfer?
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Martin Sperl April 8, 2015, 7:14 a.m. UTC | #17

> On 08.04.2015, at 04:01, Stephen Warren <swarren@wwwdotorg.org> wrote:
> 
> Can you fill the FIFO as step 1? That way, you could presumably write to
> that CS-REG just once, with all the desired bits set in one go.
Actually the "manual" says that you first need to enable TA - even for a
polling driver and then you can fill the FIFO.

But I have also tried it - it does not work, as the TA flag not only 
initiates the transfer but also activates/deactivates the whole SPI block
(and probably also all the registers). So the data filled into the FIFOs
is lost when TA is inactive.

> That sounds pretty typical for a 32-bit/dword-wide FIFO containing 8-bit
> data. Surely DMA isn't relevant here; every dword write will put 4 bytes
> into the FIFO irrespective of whether the write comes from the CPU or a
> DMA engine. Generally, the HW will pull dword-sized values out of the
> FIFO, and ignore any extra bytes that are packed into the dword beyond
> whatever the programmed transfer length is. Doesn't the bcm283x SPI HW
> do that too?
As the HW can do LoSSI (=9 bit) with the corresponding "command" parsing
where some commands trigger 8 bit reads, others 24 and others 32 bit
reads, it seems as if it reads as many byte from the fifo as necessary
for the specific transfer.

> That sounds pretty unusual. Have you tried not cleaning out the FIFO and
> seeing if the next transfer does in fact only transfer the data you
> write next, not the extra/stale bytes from the previous transfer?
I remember having tried this more than a year ago, but then you get the
"missing" data from the last word transfer (when using DMA).

As I am planning on implementing DMA I guess I will get there.
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Martin Sperl April 8, 2015, 9:41 a.m. UTC | #18

On 2015-04-07 18:28, Mark Brown wrote:
> On Tue, Apr 07, 2015 at 09:39:03AM -0600, Stephen Warren wrote:
>> On 04/07/2015 12:25 AM, Martin Sperl wrote:
>> ... then not touch any CS-related register for the entire transfer. There
>> shouldn't be a need to enable/disable IRQs during the transfer; just leave
>> them enabled the entire time, until all bytes have been transferred.
>
> We will need to make sure /CS is kept asserted between transfers in a
> message too.
The problem is here that you need to touch the CS-register when an
individual spi-transfer is finished, as only that way you can disable
the (level)interrupts, which would fire immediately again when you exit
chip-select.

So you are at the risk of it happening at that phase as well...
and during the next spi_transfer_one you need to reenable the
interrupts again.

Anyway: my latest incarnation of the driver does now implement full
cs_gpio operation with native-cs getting translated during the
new spi_register_cs phase when registering the bus using the (new)
function-pointer register_cs in spi_master.
So this issue goes away and the code becomes smaller at the same time.

This is also used to set up chip_selects to their defaults levels prior
to any activity on the SPI bus.
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[3/6] spi: bcm2835: fill FIFO before enabling interrupts to

Commit Message

Comments

Patch