i2c: omap: fix IRQ storms

Message ID 20250207185435.751878-1-andreas@kemnade.info
State New
Series i2c: omap: fix IRQ storms

Commit Message

Andreas Kemnade Feb. 7, 2025, 6:54 p.m. UTC
On the GTA04A5, writing a reset command to the gyroscope causes IRQ
storms because NACK IRQs are enabled, and therefore triggered, but
never acked.

Sending a reset command to the gyroscope with
i2cset 1 0x69 0x14 0xb6
and an additional debug print in the ISR itself (not the thread)
produces

[ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
[ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
[ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
[ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
[ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
[ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
[ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
repeating till infinity
[...]
(0x2 = NACK, 0x100 = Bus free, which is not enabled)
Apparently no other IRQ bit gets set, so this stalls.
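
The ISR values in the log can be decoded with a small helper. The sketch
below is illustrative only: the NACK (0x2) and Bus free (0x100) values come
from the note above, while the BB, XRDY and ARDY positions are assumptions
based on the status bit layout in i2c-omap.c and should be checked against
the driver. The names STAT_*, stat_bits and decode_stat are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* NACK and BF are taken from the log note; the rest are assumed. */
#define STAT_BB   (1u << 12)	/* bus busy (assumed position) */
#define STAT_BF   (1u << 8)	/* bus free */
#define STAT_XRDY (1u << 4)	/* transmit data ready (assumed position) */
#define STAT_ARDY (1u << 2)	/* register access ready (assumed position) */
#define STAT_NACK (1u << 1)	/* no acknowledgment */

struct stat_bit {
	uint16_t mask;
	const char *name;
};

/* Listed from the highest bit to the lowest. */
static const struct stat_bit stat_bits[] = {
	{ STAT_BB,   "BB"   },
	{ STAT_BF,   "BF"   },
	{ STAT_XRDY, "XRDY" },
	{ STAT_ARDY, "ARDY" },
	{ STAT_NACK, "NACK" },
};

/* Write the space-separated names of all known set bits into buf. */
void decode_stat(uint16_t stat, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(stat_bits) / sizeof(stat_bits[0]); i++) {
		if (!(stat & stat_bits[i].mask))
			continue;
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, stat_bits[i].name, len - strlen(buf) - 1);
	}
}
```

Under these assumptions, 0x1110 decodes to "BB BF XRDY", 0x0010 to "XRDY",
and the repeating 0x0102 to "BF NACK", i.e. a NACK with no ARDY alongside it.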

Do not ignore enabled interrupts and make sure they are acked.
If the NACK IRQ is not needed, it should simply not be enabled, but
according to the above log, handling it is necessary unless the
Bus free IRQ is enabled and handled. The assumption that it
will always come with an ARDY IRQ, which was the idea behind
ignoring it, proves wrong.
It does hold for simple reads from an unused address.
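
The effect of masking NACK in the hard IRQ handler can be modelled in a few
lines. This is a minimal sketch, not driver code: IE_EXAMPLE is a
hypothetical interrupt-enable value (NACK enabled, Bus free not enabled),
the bit positions other than NACK and Bus free are assumptions, and the
function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STAT_BF   (1u << 8)	/* bus free, from the log note */
#define STAT_XRDY (1u << 4)	/* assumed position */
#define STAT_RRDY (1u << 3)	/* assumed position */
#define STAT_ARDY (1u << 2)	/* assumed position */
#define STAT_NACK (1u << 1)	/* NACK, from the log note */
#define STAT_AL   (1u << 0)	/* assumed position */

/* Hypothetical IE contents: NACK enabled, Bus free not enabled. */
#define IE_EXAMPLE (STAT_XRDY | STAT_RRDY | STAT_ARDY | STAT_NACK | STAT_AL)

/* Decision with c770657bd261 applied: NACK is removed from the mask. */
bool wakes_thread_nack_masked(uint16_t stat, uint16_t ie)
{
	uint16_t mask = ie & (uint16_t)~STAT_NACK;

	return (stat & mask) != 0;
}

/* Decision after the revert: every enabled event wakes the thread. */
bool wakes_thread_reverted(uint16_t stat, uint16_t ie)
{
	return (stat & ie) != 0;
}

/* Check the cases discussed in the commit message. */
void check_log_cases(void)
{
	/* ISR = 0x0102 (BF | NACK): nobody handles it while NACK is masked. */
	assert(!wakes_thread_nack_masked(STAT_BF | STAT_NACK, IE_EXAMPLE));
	assert(wakes_thread_reverted(STAT_BF | STAT_NACK, IE_EXAMPLE));
	/* NACK together with ARDY: the thread is woken either way. */
	assert(wakes_thread_nack_masked(STAT_NACK | STAT_ARDY, IE_EXAMPLE));
	assert(wakes_thread_reverted(STAT_NACK | STAT_ARDY, IE_EXAMPLE));
}
```

In this model the masked variant only works when ARDY accompanies the NACK,
which is the assumption the log above contradicts.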

So revert
commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").

The offending commit was meant to reduce false detections in
i2cdetect. i2cdetect itself warns that it can confuse the I2C bus, so
having some rare false detections (I have never seen any on my systems)
is the lesser evil compared to the system basically hanging completely.

No more details have come to light in the corresponding email thread
for several months:
https://lore.kernel.org/linux-omap/20230426194956.689756-1-reidt@ti.com/
so no better fix that solves both problems can be developed right now.

Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings")
CC: <stable@kernel.org>
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
---
 drivers/i2c/busses/i2c-omap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andi Shyti Feb. 19, 2025, 7:22 p.m. UTC | #1
Hi,

I need someone from TI or someone who can test to ack here.

Can someone help?

Thanks,
Andi
H. Nikolaus Schaller Feb. 20, 2025, 8:43 a.m. UTC | #2
Hi,

> On 19.02.2025 at 20:22, Andi Shyti <andi.shyti@kernel.org> wrote:
> 
> Hi,
> 
> I need someone from TI or someone who can test to ack here.
> 
> Can someone help?

Well, I think this is simply a full revert of

commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings")

to the state before it. The state after the revert was therefore tested for
many years, until c770657bd261 arrived to try to solve one issue but apparently
introduced another one which is more severe and more difficult to work around.

For real world tests please see also:

https://lore.kernel.org/linux-omap/664241E0-8D6B-4783-997B-2D8510ADAEA3@goldelico.com/
https://lore.kernel.org/linux-omap/ad0fe7ca-fb6c-4c19-b4b3-0f29ddaa92c3@jm0.eu/

BR and thanks,
Nikolaus
Andreas Kemnade Feb. 20, 2025, 9:08 a.m. UTC | #3
On Wed, 19 Feb 2025 20:22:13 +0100,
Andi Shyti <andi.shyti@kernel.org> wrote:

> Hi,
> 
> I need someone from TI or someone who can test to ack here.
> 
> Can someone help?
>
The original (IMHO minor) problem that c770657bd261 was meant to fix
is hard to test; I have never seen it on any system I have access to
(and as a platform maintainer I have a bunch of them).
There is not much description anywhere of the system on which the
original problem occurred, and there has been no reaction from the
author for several months, so I do not see anything that can be done.
Maybe it was just faulty hardware.

As said in the commit message, reverting it should be the lesser evil.
And that state was tested for many years.

Regards,
Andreas
Patch

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 92faf03d64cf..b54d4120899f 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -1057,7 +1057,7 @@  omap_i2c_isr(int irq, void *dev_id)
 	u16 stat;
 
 	stat = omap_i2c_read_reg(omap, OMAP_I2C_STAT_REG);
-	mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG) & ~OMAP_I2C_STAT_NACK;
+	mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG);
 
 	if (stat & mask)
 		ret = IRQ_WAKE_THREAD;
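
Why the one-line change stops the storm can be illustrated with a toy model.
It assumes, as a simplification, that the interrupt line stays asserted for
as long as any enabled status bit is set, and that the threaded handler acks
all enabled events once it is woken. The function name count_hard_irqs and
its parameters are hypothetical; none of this is code from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how often the hard IRQ handler runs for a given status value.
 * ie:        enabled events (these keep the line asserted in this model)
 * hard_mask: events for which the hard handler wakes the thread
 * limit:     cut-off so that a storm terminates in the simulation
 */
unsigned int count_hard_irqs(uint16_t stat, uint16_t ie,
			     uint16_t hard_mask, unsigned int limit)
{
	unsigned int n = 0;

	assert(limit > 0);
	while ((stat & ie) != 0 && n < limit) {
		n++;
		if (stat & hard_mask) {
			/* Thread woken: it acks the enabled events. */
			stat &= (uint16_t)~ie;
		}
		/* Otherwise nothing is acked and the line stays asserted. */
	}
	return n;
}
```

With stat = 0x0102 and a hypothetical ie = 0x1f, passing hard_mask = 0x1d
(NACK masked out, as before the patch) runs into the limit, while
hard_mask = 0x1f (as after the patch) handles the event in a single pass.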