Message ID | 20221023185444.678573-1-conor@kernel.org (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Palmer Dabbelt |
Series | Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend" |

On Sun, 23 Oct 2022 11:54:44 PDT (-0700), Conor Dooley wrote:
> From: Conor Dooley <conor.dooley@microchip.com>
>
> This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
> If an AXI read to the PCIe controller on PolarFire SoC times out, the
> system will stall, with an expected:
>   io scheduler mq-deadline registered
>   io scheduler kyber registered
>   microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
>   microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: axi read request error
>   microchip-pcie 2000000000.pcie: axi read timeout
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   Freeing initrd memory: 7336K
>   mc_event_handler: 667402 callbacks suppressed
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   mc_event_handler: 666588 callbacks suppressed
>   <truncated>
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   mc_event_handler: 666748 callbacks suppressed
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>   rcu: 0-...0: (1 GPs behind) idle=19f/1/0x4000000000000002 softirq=34/36 fqs=2626
>   (detected by 1, t=5256 jiffies, g=-1151, q=1143 ncpus=4)
>   Task dump for CPU 0:
>   task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x00000008
>   Call Trace:
>   mc_event_handler: 666648 callbacks suppressed
>
> With this patch applied, the system just locks up without RCU stalling:
>   io scheduler mq-deadline registered
>   io scheduler kyber registered
>   microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
>   microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: axi read request error
>   microchip-pcie 2000000000.pcie: axi read timeout
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
>   microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
>   Freeing initrd memory: 7332K
>
> Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
> Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> ---
> I don't really want to post a revert, but it's been nearly a month since
> I posted about my issue initially & 2 weeks without a reply to Palmer's
> comments.
> CC: samuel@sholland.org
> CC: aou@eecs.berkeley.edu
> CC: atishp@atishpatra.org
> CC: daniel.lezcano@linaro.org
> CC: dmitriy@oss-tech.org
> CC: linux-kernel@vger.kernel.org
> CC: linux-riscv@lists.infradead.org
> CC: palmer@dabbelt.com
> CC: paul.walmsley@sifive.com
> CC: tglx@linutronix.de
> ---
>  drivers/clocksource/timer-riscv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
> index 969a552da8d2..a0d66fabf073 100644
> --- a/drivers/clocksource/timer-riscv.c
> +++ b/drivers/clocksource/timer-riscv.c
> @@ -51,7 +51,7 @@ static int riscv_clock_next_event(unsigned long delta,
>  static unsigned int riscv_clock_event_irq;
>  static DEFINE_PER_CPU(struct clock_event_device, riscv_clock_event) = {
>  	.name = "riscv_timer_clockevent",
> -	.features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
> +	.features = CLOCK_EVT_FEAT_ONESHOT,
>  	.rating = 100,
>  	.set_next_event = riscv_clock_next_event,
>  };

There's some discussion on that linked patch and we don't really have a
fix yet, but IMO we're better off reverting this as it breaks the common
case and it's not clear this is even a sane way to fix the bug.

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>

On 23/10/2022 20:54, Conor Dooley wrote:
> From: Conor Dooley <conor.dooley@microchip.com>
>
> This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
> If an AXI read to the PCIe controller on PolarFire SoC times out, the
> system will stall,

Applied, thanks

On Fri, Dec 02, 2022 at 01:02:20PM +0100, Daniel Lezcano wrote:
> On 23/10/2022 20:54, Conor Dooley wrote:
> > From: Conor Dooley <conor.dooley@microchip.com>
> >
> > This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
> > If an AXI read to the PCIe controller on PolarFire SoC times out, the
> > system will stall,
>
> Applied, thanks

Hey Daniel,

Looks like Thomas already took the v2 of this patch:
https://lore.kernel.org/all/166989319052.4906.3934360150862233210.tip-bot2@tip-bot2/

Thanks,
Conor.

On 02/12/2022 13:05, Conor Dooley wrote:
> On Fri, Dec 02, 2022 at 01:02:20PM +0100, Daniel Lezcano wrote:
>> On 23/10/2022 20:54, Conor Dooley wrote:
>>> From: Conor Dooley <conor.dooley@microchip.com>
>>>
>>> This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
>>> If an AXI read to the PCIe controller on PolarFire SoC times out, the
>>> system will stall,
>>
>> Applied, thanks
>
> Hey Daniel,
>
> Looks like Thomas already took the v2 of this patch:
> https://lore.kernel.org/all/166989319052.4906.3934360150862233210.tip-bot2@tip-bot2/

Ok, thanks for pointing this out. I'll drop it

diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
index 969a552da8d2..a0d66fabf073 100644
--- a/drivers/clocksource/timer-riscv.c
+++ b/drivers/clocksource/timer-riscv.c
@@ -51,7 +51,7 @@ static int riscv_clock_next_event(unsigned long delta,
 static unsigned int riscv_clock_event_irq;
 static DEFINE_PER_CPU(struct clock_event_device, riscv_clock_event) = {
 	.name = "riscv_timer_clockevent",
-	.features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
+	.features = CLOCK_EVT_FEAT_ONESHOT,
 	.rating = 100,
 	.set_next_event = riscv_clock_next_event,
 };
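
For readers less familiar with the flag being dropped: CLOCK_EVT_FEAT_C3STOP tells the clockevents core that a timer may stop in deep idle states, so the core arranges a broadcast fallback for it. The following is a userspace sketch of what the one-line change does to the feature mask, not kernel code. The `demo_*` struct and functions are hypothetical stand-ins, and the flag values are reproduced from memory of include/linux/clockchips.h, so check them against your tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumption: values as recalled from include/linux/clockchips.h.
 * They are illustrative; verify against the header in your tree.
 */
#define CLOCK_EVT_FEAT_ONESHOT	0x000002u
#define CLOCK_EVT_FEAT_C3STOP	0x000008u

/* Hypothetical, cut-down stand-in for struct clock_event_device. */
struct demo_clock_event_device {
	const char *name;
	unsigned int features;
	int rating;
};

/* True when the device declares that it may stop in deep idle states. */
bool demo_needs_broadcast_fallback(const struct demo_clock_event_device *dev)
{
	assert(dev != NULL);
	return (dev->features & CLOCK_EVT_FEAT_C3STOP) != 0;
}

/* Feature mask as set by commit 232ccac1bd9b, i.e. before this revert. */
struct demo_clock_event_device demo_before_revert(void)
{
	struct demo_clock_event_device dev = {
		.name = "riscv_timer_clockevent",
		.features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
		.rating = 100,
	};
	return dev;
}

/* Feature mask after this revert: one-shot only, no C3STOP. */
struct demo_clock_event_device demo_after_revert(void)
{
	struct demo_clock_event_device dev = {
		.name = "riscv_timer_clockevent",
		.features = CLOCK_EVT_FEAT_ONESHOT,
		.rating = 100,
	};
	return dev;
}
```

Calling `demo_needs_broadcast_fallback()` on the two variants returns true before the revert and false after it, which is the only behavioural difference the patch introduces at the level of the feature mask.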