Message ID | 20240814145642.344485-1-emil.renner.berthing@canonical.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix Allwinner D1 boot regression | expand |
On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote: > As described in the thread below[1] I haven't been able to boot my > boards based on the Allwinner D1 SoC since 6.9 where you converted the > SiFive PLIC driver to a platform driver. > > This is clearly a regression and there haven't really been much progress > on fixing the issue since then, so here is the revert that fixes it. > > If no other fix is found before 6.11 I suggest we apply this. So this mess has been ignored for two month now? From the pastebin in the initial report: [ 0.000000] irq: no irq domain found for interrupt-controller@10000000 ! [ 0.000000] Failed to map interrupt for /soc/timer@2050000 [ 0.000000] Failed to initialize '/soc/timer@2050000': -22 This comes back with -EINVAL. So the timer cannot find an interrupt, which makes it pretty obvious why the system stops to boot, unless there is some other timer available. This is obviously related to the SUN4I_TIMER because that message went away when it was disabled according to the next pastebin. Obviously that can't work because the SUN4I timer driver is using timer_of_init() which cannot handle deferred probing. Daniel: There was a partial fix for the sun4i driver, which you said you applied: https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com But that thing never materialized in a pull request. And of course everyone involved ignored the problem since March 13th 2024, i.e. almost half a year. Seriously? Can you RISCV folks get your act together and ensure to fix things you broke on the way? Especially when Emil reported this nobody pointed him to this patch and nobody noticed that it's still not merged? It took me less than 15 minutes to find that patch and the correlation, but this is absolutely not my job. I'm seriously grumpy about that. This is not how it works. If you break stuff, then you take care to fix it before you shove more changes into the tree and waste my time. I'm very much inclined to take the reverts right now, send them to Linus for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv patches until the next merge window is over. Emil, can you give that sun4i fix a test ride please? Thanks, tglx
Thomas Gleixner wrote: > On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote: > > As described in the thread below[1] I haven't been able to boot my > > boards based on the Allwinner D1 SoC since 6.9 where you converted the > > SiFive PLIC driver to a platform driver. > > > > This is clearly a regression and there haven't really been much progress > > on fixing the issue since then, so here is the revert that fixes it. > > > > If no other fix is found before 6.11 I suggest we apply this. > > So this mess has been ignored for two month now? > > From the pastebin in the initial report: > > [ 0.000000] irq: no irq domain found for interrupt-controller@10000000 ! > [ 0.000000] Failed to map interrupt for /soc/timer@2050000 > [ 0.000000] Failed to initialize '/soc/timer@2050000': -22 > > This comes back with -EINVAL. So the timer cannot find an interrupt, > which makes it pretty obvious why the system stops to boot, unless there > is some other timer available. > > This is obviously related to the SUN4I_TIMER because that message went > away when it was disabled according to the next pastebin. > > Obviously that can't work because the SUN4I timer driver is using > timer_of_init() which cannot handle deferred probing. > > Daniel: There was a partial fix for the sun4i driver, which you said you > applied: > > https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com > > But that thing never materialized in a pull request. > > And of course everyone involved ignored the problem since March 13th > 2024, i.e. almost half a year. > > Seriously? > > Can you RISCV folks get your act together and ensure to fix things you > broke on the way? Especially when Emil reported this nobody pointed him > to this patch and nobody noticed that it's still not merged? > > It took me less than 15 minutes to find that patch and the correlation, > but this is absolutely not my job. > > I'm seriously grumpy about that. This is not how it works. If you break > stuff, then you take care to fix it before you shove more changes into > the tree and waste my time. > > I'm very much inclined to take the reverts right now, send them to Linus > for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv > patches until the next merge window is over. > > Emil, can you give that sun4i fix a test ride please? Hi Thomas, Thanks for looking at this! Unfortunately the above patch isn't enough to fix the issue: https://termbin.com/7sgc It still hangs after the "[ 0.176451] cpuidle: using governor teo" message until the watchdog reboots the systems. /Emil
On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote: > Thomas Gleixner wrote: > Thanks for looking at this! Unfortunately the above patch isn't enough to fix > the issue: > > https://termbin.com/7sgc > > It still hangs after the "[ 0.176451] cpuidle: using governor teo" message > until the watchdog reboots the systems. So what's puzzling is that there is a timer installed early on: [ 0.000000] clocksource: riscv_clocksource: .... That same init function installs the per cpu riscv clockevent, so there should be a timer available. The deffered probing of the PLIC driver delays obviously the probing of the sun4i timer, but that should not matter when another timer is available. So the sun4i driver might be a red herring. Can you please add "ignore_loglevel initcall_debug" to the command line and provide the output of a booting and a failing kernel? And on the booting kernel please provide the output from: # cat /sys/devices/system/clockevents/clockevent0/current_device # cat /sys/devices/system/clockevents/broadcast/current_device # cat /sys/devices/system/clocksource/clocksource0/current_clocksource Thanks, tglx
Thomas Gleixner wrote: > On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote: > > Thomas Gleixner wrote: > > Thanks for looking at this! Unfortunately the above patch isn't enough to fix > > the issue: > > > > https://termbin.com/7sgc > > > > It still hangs after the "[ 0.176451] cpuidle: using governor teo" message > > until the watchdog reboots the systems. > > So what's puzzling is that there is a timer installed early on: > > [ 0.000000] clocksource: riscv_clocksource: .... > > That same init function installs the per cpu riscv clockevent, so there > should be a timer available. > > The deffered probing of the PLIC driver delays obviously the probing of > the sun4i timer, but that should not matter when another timer is > available. So the sun4i driver might be a red herring. > > Can you please add "ignore_loglevel initcall_debug" to the command line > and provide the output of a booting and a failing kernel? 6.11-rc3 + these reverts: https://termbin.com/q6wk 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs > And on the booting kernel please provide the output from: > > # cat /sys/devices/system/clockevents/clockevent0/current_device > # cat /sys/devices/system/clockevents/broadcast/current_device > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource On both a 6.8.6 kernel and 6.11-rc3 + reverts I get: # cat /sys/devices/system/clockevents/clockevent0/current_device sun4i_tick # cat /sys/devices/system/clockevents/broadcast/current_device riscv_timer_clockevent # cat /sys/devices/system/clocksource/clocksource0/current_clocksource riscv_clocksource /Emil
Emil Renner Berthing wrote: > Thomas Gleixner wrote: > > On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote: > > > Thomas Gleixner wrote: > > > Thanks for looking at this! Unfortunately the above patch isn't enough to fix > > > the issue: > > > > > > https://termbin.com/7sgc > > > > > > It still hangs after the "[ 0.176451] cpuidle: using governor teo" message > > > until the watchdog reboots the systems. > > > > So what's puzzling is that there is a timer installed early on: > > > > [ 0.000000] clocksource: riscv_clocksource: .... > > > > That same init function installs the per cpu riscv clockevent, so there > > should be a timer available. > > > > The deffered probing of the PLIC driver delays obviously the probing of > > the sun4i timer, but that should not matter when another timer is > > available. So the sun4i driver might be a red herring. > > > > Can you please add "ignore_loglevel initcall_debug" to the command line > > and provide the output of a booting and a failing kernel? > > 6.11-rc3 + these reverts: https://termbin.com/q6wk > 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs I think this confirms what Charlie found here: https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ > > > And on the booting kernel please provide the output from: > > > > # cat /sys/devices/system/clockevents/clockevent0/current_device > > # cat /sys/devices/system/clockevents/broadcast/current_device > > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource > > On both a 6.8.6 kernel and 6.11-rc3 + reverts I get: > > # cat /sys/devices/system/clockevents/clockevent0/current_device > sun4i_tick > # cat /sys/devices/system/clockevents/broadcast/current_device > riscv_timer_clockevent > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource > riscv_clocksource > > /Emil > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote: > Emil Renner Berthing wrote: >> 6.11-rc3 + these reverts: https://termbin.com/q6wk >> 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs > > I think this confirms what Charlie found here: > https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ Yes. So the riscv timer is not working on this thing or it stops somehow. Can you apply the debug patch below and check whether you see the 'J: ....' output at all and if so whether it stops at some point. Thanks, tglx --- --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -2459,6 +2459,9 @@ static void run_local_timers(void) { struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); + if (!(jiffies & 0xFF)) + pr_info("J: %lx\n", jiffies); + hrtimer_run_queues(); for (int i = 0; i < NR_BASES; i++, base++) {
Hi Thomas, Emil, On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote: >> Emil Renner Berthing wrote: >>> 6.11-rc3 + these reverts: https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv >>> 6.11-rc3 + Samuel's patch: https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p >> >> I think this confirms what Charlie found here: >> https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ > > Yes. So the riscv timer is not working on this thing or it stops > somehow. That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI firmware does not have a timer device, so it does not expose the (optional[1]) SBI time extension, and sbi_set_timer() does nothing. I wrote a patch (not submitted) to skip registering riscv_clock_event when the SBI time extension is unavailable, but this doesn't fully solve the issue either, because then we have no clockevent at all when check_unaligned_access_all_cpus() is called. How early in the boot process are we "required" to have a functional clockevent? Do we need to refactor check_unaligned_access_all_cpus() so it works on systems where the only clockevent is provided by a platform device? Regards, Samuel [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc > Can you apply the debug patch below and check whether you see the > 'J: ....' output at all and if so whether it stops at some point. > > Thanks, > > tglx > > --- > --- a/kernel/time/timer.c > +++ b/kernel/time/timer.c > @@ -2459,6 +2459,9 @@ static void run_local_timers(void) > { > struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); > > + if (!(jiffies & 0xFF)) > + pr_info("J: %lx\n", jiffies); > + > hrtimer_run_queues(); > > for (int i = 0; i < NR_BASES; i++, base++) { > > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv >
Thomas Gleixner wrote: > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote: > > Emil Renner Berthing wrote: > >> 6.11-rc3 + these reverts: https://termbin.com/q6wk > >> 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs > > > > I think this confirms what Charlie found here: > > https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ > > Yes. So the riscv timer is not working on this thing or it stops > somehow. > > Can you apply the debug patch below and check whether you see the > 'J: ....' output at all and if so whether it stops at some point. No, I don't see the J: ... debug output anywhere: https://termbin.com/3vez /Emil
On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: > On 2024-08-15 8:16 AM, Thomas Gleixner wrote: >> Yes. So the riscv timer is not working on this thing or it stops >> somehow. > > That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI > firmware does not have a timer device, so it does not expose the (optional[1]) > SBI time extension, and sbi_set_timer() does nothing. Sigh. Does RISCV really have to repeat all mistakes which have been made by x86, ARM and others before? It's known for decades that the kernel relies on a working timer... > I wrote a patch (not submitted) to skip registering riscv_clock_event when the > SBI time extension is unavailable, but this doesn't fully solve the issue > either, because then we have no clockevent at all when > check_unaligned_access_all_cpus() is called. check_unaligned_access_all_cpus() is irrelevant. > How early in the boot process are we "required" to have a functional clockevent? > Do we need to refactor check_unaligned_access_all_cpus() so it works on systems > where the only clockevent is provided by a platform device? Right after init/main::late_time_init() everything can depend on a working timer and on jiffies increasing. I'm actually surprised that the boot process gets that far. That's just by pure luck, really. Thanks, tglx
On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: > > On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > >> Yes. So the riscv timer is not working on this thing or it stops > >> somehow. > > > > That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI > > firmware does not have a timer device, so it does not expose the (optional[1]) > > SBI time extension, and sbi_set_timer() does nothing. > > Sigh. Does RISCV really have to repeat all mistakes which have been made > by x86, ARM and others before? It's known for decades that the kernel > relies on a working timer... My apologies for the delay in finding a fix for this issue. Almost all RISC-V platforms (except this one) have SBI Timer always available and Linux uses a better timer or Sstc extension whenever it is available. When Emil first reported this issue, I did try to help him root cause the issue but unfortunately I don't have this particular platform and PLIC on all other RISC-V platforms works fine. I am also surprised that none of the Allwiner folks tried helping. > > > I wrote a patch (not submitted) to skip registering riscv_clock_event when the > > SBI time extension is unavailable, but this doesn't fully solve the issue > > either, because then we have no clockevent at all when > > check_unaligned_access_all_cpus() is called. > > check_unaligned_access_all_cpus() is irrelevant. > > > How early in the boot process are we "required" to have a functional clockevent? > > Do we need to refactor check_unaligned_access_all_cpus() so it works on systems > > where the only clockevent is provided by a platform device? > > Right after init/main::late_time_init() everything can depend on a > working timer and on jiffies increasing. > > I'm actually surprised that the boot process gets that far. That's just > by pure luck, really. > > Thanks, > > tglx Regards, Anup
On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland <samuel.holland@sifive.com> wrote: > > Hi Thomas, Emil, > > On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote: > >> Emil Renner Berthing wrote: > >>> 6.11-rc3 + these reverts: https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv > >>> 6.11-rc3 + Samuel's patch: https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p > >> > >> I think this confirms what Charlie found here: > >> https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ > > > > Yes. So the riscv timer is not working on this thing or it stops > > somehow. > > That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI > firmware does not have a timer device, so it does not expose the (optional[1]) > SBI time extension, and sbi_set_timer() does nothing. OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to provide SBI time extension to Linux. The RISC-V privileged specification (v1.10 or higher) requires platform to provide a M-mode timer (mtime and mtimecmp). This platform not having any M-mode timer is yet another RISC-V spec violation by this platform. Regards, Anup > > I wrote a patch (not submitted) to skip registering riscv_clock_event when the > SBI time extension is unavailable, but this doesn't fully solve the issue > either, because then we have no clockevent at all when > check_unaligned_access_all_cpus() is called. > > How early in the boot process are we "required" to have a functional clockevent? > Do we need to refactor check_unaligned_access_all_cpus() so it works on systems > where the only clockevent is provided by a platform device? > > Regards, > Samuel > > [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc > > > Can you apply the debug patch below and check whether you see the > > 'J: ....' output at all and if so whether it stops at some point. > > > > Thanks, > > > > tglx > > > > --- > > --- a/kernel/time/timer.c > > +++ b/kernel/time/timer.c > > @@ -2459,6 +2459,9 @@ static void run_local_timers(void) > > { > > struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); > > > > + if (!(jiffies & 0xFF)) > > + pr_info("J: %lx\n", jiffies); > > + > > hrtimer_run_queues(); > > > > for (int i = 0; i < NR_BASES; i++, base++) { > > > > > > > > _______________________________________________ > > linux-riscv mailing list > > linux-riscv@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > >
On 2024-08-15 9:16 AM, Anup Patel wrote: > On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote: >> >> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: >>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote: >>>> Yes. So the riscv timer is not working on this thing or it stops >>>> somehow. >>> >>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI >>> firmware does not have a timer device, so it does not expose the (optional[1]) >>> SBI time extension, and sbi_set_timer() does nothing. >> >> Sigh. Does RISCV really have to repeat all mistakes which have been made >> by x86, ARM and others before? It's known for decades that the kernel >> relies on a working timer... > > My apologies for the delay in finding a fix for this issue. > > Almost all RISC-V platforms (except this one) have SBI Timer always > available and Linux uses a better timer or Sstc extension whenever > it is available. So this is the immediate solution: add the CLINT to the firmware devicetree so that the SBI time extension works, and Linux will boot without any code changes, albeit with a higher-overhead clockevent device. Additionally merging the sun4i timer patch[1] will allow the system to switch to the better MMIO clocksource later in the boot process. The reason the CLINT was not added to the devicetree already is that the T-HEAD version of the CLINT includes an extension to drive SSIP/STIP from a second S-mode visible set of registers. So it should really have twice as many entries in its interrupts-extended property as the existing CLINT, and I never got around to validating that this would work. The long-term solution would be adding driver support for the T-HEAD CLINT extensions, which provide an even better clockevent than the sun4i timer. [1]: https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com/ > When Emil first reported this issue, I did try to help him root cause > the issue but unfortunately I don't have this particular platform and > PLIC on all other RISC-V platforms works fine. > > I am also surprised that none of the Allwinner folks tried helping. Allwinner D1 support was upstreamed by unpaid hobbyists with very little first-party assistance. >>> I wrote a patch (not submitted) to skip registering riscv_clock_event when the >>> SBI time extension is unavailable, but this doesn't fully solve the issue >>> either, because then we have no clockevent at all when >>> check_unaligned_access_all_cpus() is called. >> >> check_unaligned_access_all_cpus() is irrelevant. >> >>> How early in the boot process are we "required" to have a functional clockevent? >>> Do we need to refactor check_unaligned_access_all_cpus() so it works on systems >>> where the only clockevent is provided by a platform device? >> >> Right after init/main::late_time_init() everything can depend on a >> working timer and on jiffies increasing. >> >> I'm actually surprised that the boot process gets that far. That's just >> by pure luck, really. Thanks for clearing this up! Regards, Samuel
Hi Anup, On 2024-08-15 9:30 AM, Anup Patel wrote: > On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland > <samuel.holland@sifive.com> wrote: >>> Yes. So the riscv timer is not working on this thing or it stops >>> somehow. >> >> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI >> firmware does not have a timer device, so it does not expose the (optional[1]) >> SBI time extension, and sbi_set_timer() does nothing. > > OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to > provide SBI time extension to Linux. > > The RISC-V privileged specification (v1.10 or higher) requires platform to > provide a M-mode timer (mtime and mtimecmp). > > This platform not having any M-mode timer is yet another RISC-V spec > violation by this platform. You've misunderstood here. Allwinner D1 (T-HEAD C906) _does_ have an M-mode timer (a CLINT). It is just omitted from devicetree that Emil happens to be using, so OpenSBI isn't using it. Currently OpenSBI allows the system to boot without a timer device, and the SBI specification does not mandate the time extension. If consensus is that either of these should change, that's fine, but currently I see nothing in either the privileged spec nor the SBI spec that guarantees the availability of some timer to the kernel in S-mode. Regards, Samuel
Samuel Holland wrote: > On 2024-08-15 9:16 AM, Anup Patel wrote: > > On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote: > >> > >> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: > >>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > >>>> Yes. So the riscv timer is not working on this thing or it stops > >>>> somehow. > >>> > >>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI > >>> firmware does not have a timer device, so it does not expose the (optional[1]) > >>> SBI time extension, and sbi_set_timer() does nothing. > >> > >> Sigh. Does RISCV really have to repeat all mistakes which have been made > >> by x86, ARM and others before? It's known for decades that the kernel > >> relies on a working timer... > > > > My apologies for the delay in finding a fix for this issue. > > > > Almost all RISC-V platforms (except this one) have SBI Timer always > > available and Linux uses a better timer or Sstc extension whenever > > it is available. > > So this is the immediate solution: add the CLINT to the firmware devicetree so > that the SBI time extension works, and Linux will boot without any code changes, > albeit with a higher-overhead clockevent device. But this will mean that you can't update your kernel to v6.9 or newer without reflashing OpenSBI and u-boot. That's still a regression right? /Emil
On Thu, Aug 15 2024 at 09:41, Samuel Holland wrote: > On 2024-08-15 9:16 AM, Anup Patel wrote: >> Almost all RISC-V platforms (except this one) have SBI Timer always >> available and Linux uses a better timer or Sstc extension whenever >> it is available. > > So this is the immediate solution: add the CLINT to the firmware > devicetree so that the SBI time extension works, and Linux will boot > without any code changes, albeit with a higher-overhead clockevent > device. That does not matter for the boot process when the sun4i timer becomes available afterwards. But how can this be retrofitted along with the kernel update? Thanks, tglx
On Thu, Aug 15, 2024 at 8:33 PM Samuel Holland <samuel.holland@sifive.com> wrote: > > Hi Anup, > > On 2024-08-15 9:30 AM, Anup Patel wrote: > > On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland > > <samuel.holland@sifive.com> wrote: > >>> Yes. So the riscv timer is not working on this thing or it stops > >>> somehow. > >> > >> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI > >> firmware does not have a timer device, so it does not expose the (optional[1]) > >> SBI time extension, and sbi_set_timer() does nothing. > > > > OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to > > provide SBI time extension to Linux. > > > > The RISC-V privileged specification (v1.10 or higher) requires platform to > > provide a M-mode timer (mtime and mtimecmp). > > > > This platform not having any M-mode timer is yet another RISC-V spec > > violation by this platform. > > You've misunderstood here. Allwinner D1 (T-HEAD C906) _does_ have an M-mode > timer (a CLINT). It is just omitted from devicetree that Emil happens to be > using, so OpenSBI isn't using it. > > Currently OpenSBI allows the system to boot without a timer device, and the SBI > specification does not mandate the time extension. If consensus is that either > of these should change, that's fine, but currently I see nothing in either the > privileged spec nor the SBI spec that guarantees the availability of some timer > to the kernel in S-mode. > The SBI time is certainly optional hence OpenSBI does not hang or crash if it can't provide SBI time to supervisor software. My comment is from the RISC-V privileged spec perspective: 1) Priv v1.10 says "Platforms provide a real-time counter, exposed as a memory-mapped machine-mode register, mtime." in section "3.1.15 Machine Timer Registers (mtime and mtimecmp)". 2) Similar statement in Priv v1.11 section "3.1.10 Machine Timer Registers (mtime and mtimecmp)" 3) Similar statement in Priv v1.12 section "3.2.1 Machine Timer Registers (mtime and mtimecmp)" But since the M-mode timer was omitted from the DT, I think the DT was always incomplete from the M-mode perspective. Regards, Anup
Hi Emil, On 2024-08-15 10:07 AM, Emil Renner Berthing wrote: > Samuel Holland wrote: >> On 2024-08-15 9:16 AM, Anup Patel wrote: >>> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote: >>>> >>>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: >>>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote: >>>>>> Yes. So the riscv timer is not working on this thing or it stops >>>>>> somehow. >>>>> >>>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI >>>>> firmware does not have a timer device, so it does not expose the (optional[1]) >>>>> SBI time extension, and sbi_set_timer() does nothing. >>>> >>>> Sigh. Does RISCV really have to repeat all mistakes which have been made >>>> by x86, ARM and others before? It's known for decades that the kernel >>>> relies on a working timer... >>> >>> My apologies for the delay in finding a fix for this issue. >>> >>> Almost all RISC-V platforms (except this one) have SBI Timer always >>> available and Linux uses a better timer or Sstc extension whenever >>> it is available. >> >> So this is the immediate solution: add the CLINT to the firmware devicetree so >> that the SBI time extension works, and Linux will boot without any code changes, >> albeit with a higher-overhead clockevent device. > > But this will mean that you can't update your kernel to v6.9 or newer without > reflashing OpenSBI and u-boot. That's still a regression right? I suppose that depends on if you think the SBI time extension is (or should have been) mandatory for platforms without Sstc. If the SBI time extension is mandatory, then this is a firmware bug, and not really Linux's responsibility to work around. If the SBI time extension is not mandatory, then Linux needs to be able to handle platforms where the S-mode visible timer is attached to an external interrupt controller (PLIC or APLIC), so the irqchip driver needs to be loaded before time_init() (timer_probe()). So in that case, the bug is a Linux regression, and we would need to revert the platform driver conversion. Regards, Samuel
On Thu, 15 Aug 2024 08:59:37 PDT (-0700), samuel.holland@sifive.com wrote: > Hi Emil, > > On 2024-08-15 10:07 AM, Emil Renner Berthing wrote: >> Samuel Holland wrote: >>> On 2024-08-15 9:16 AM, Anup Patel wrote: >>>> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote: >>>>> >>>>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: >>>>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote: >>>>>>> Yes. So the riscv timer is not working on this thing or it stops >>>>>>> somehow. >>>>>> >>>>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI >>>>>> firmware does not have a timer device, so it does not expose the (optional[1]) >>>>>> SBI time extension, and sbi_set_timer() does nothing. >>>>> >>>>> Sigh. Does RISCV really have to repeat all mistakes which have been made >>>>> by x86, ARM and others before? It's known for decades that the kernel >>>>> relies on a working timer... It's even worse than that: RISC-V doesn't even mandate any working _instructions_, much less anything in the platform/firmware. >>>> My apologies for the delay in finding a fix for this issue. >>>> >>>> Almost all RISC-V platforms (except this one) have SBI Timer always >>>> available and Linux uses a better timer or Sstc extension whenever >>>> it is available. >>> >>> So this is the immediate solution: add the CLINT to the firmware devicetree so >>> that the SBI time extension works, and Linux will boot without any code changes, >>> albeit with a higher-overhead clockevent device. >> >> But this will mean that you can't update your kernel to v6.9 or newer without >> reflashing OpenSBI and u-boot. That's still a regression right? Ya, I'd call that a regression. Updating the firmware on these things isn't generally something we can rely on users to do, we've worked around other firmware bugs where we can to avoid forced updates. > I suppose that depends on if you think the SBI time extension is (or should have > been) mandatory for platforms without Sstc. If the SBI time extension is > mandatory, then this is a firmware bug, and not really Linux's responsibility to > work around. > > If the SBI time extension is not mandatory, then Linux needs to be able to > handle platforms where the S-mode visible timer is attached to an external > interrupt controller (PLIC or APLIC), so the irqchip driver needs to be loaded > before time_init() (timer_probe()). So in that case, the bug is a Linux > regression, and we would need to revert the platform driver conversion. It doesn't really matter what the specs say (aka intended to say in RISC-V land): if there's a regression then we have to deal with it. It's not like whatever's written in the specs actually matters, vendors can just do whatever they want, so wer'e just stuck making the known implementations work. So I think if the revert is the best fix then we should revert it. That said: If the CLINT works, could we just add a probing quirk to make it appear on these systems even when it's not in the DT? I'm thinking something like adding a compatibly string to the CLINT driver for the SOC (or core or whatever, just something that's already there). We'd probably need a bit of special-case probing code, but shouldn't be so bad. We've got some other compatibility-oriented DT quirks floating around. > Regards, > Samuel
On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote: > On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote: >> As described in the thread below[1] I haven't been able to boot my >> boards based on the Allwinner D1 SoC since 6.9 where you converted the >> SiFive PLIC driver to a platform driver. >> >> This is clearly a regression and there haven't really been much progress >> on fixing the issue since then, so here is the revert that fixes it. >> >> If no other fix is found before 6.11 I suggest we apply this. > > So this mess has been ignored for two month now? > >>From the pastebin in the initial report: > > [ 0.000000] irq: no irq domain found for interrupt-controller@10000000 ! > [ 0.000000] Failed to map interrupt for /soc/timer@2050000 > [ 0.000000] Failed to initialize '/soc/timer@2050000': -22 > > This comes back with -EINVAL. So the timer cannot find an interrupt, > which makes it pretty obvious why the system stops to boot, unless there > is some other timer available. > > This is obviously related to the SUN4I_TIMER because that message went > away when it was disabled according to the next pastebin. > > Obviously that can't work because the SUN4I timer driver is using > timer_of_init() which cannot handle deferred probing. > > Daniel: There was a partial fix for the sun4i driver, which you said you > applied: > > https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com > > But that thing never materialized in a pull request. > > And of course everyone involved ignored the problem since March 13th > 2024, i.e. almost half a year. > > Seriously? > > Can you RISCV folks get your act together and ensure to fix things you > broke on the way? Especially when Emil reported this nobody pointed him > to this patch and nobody noticed that it's still not merged? > > It took me less than 15 minutes to find that patch and the correlation, > but this is absolutely not my job. Sorry, I guess I'd just sort of been ignoring the platform-specific side of things because it's so frustrating to deal with, but that's led to a bunch of breakages so it's obviously the wrong thing to do. > I'm seriously grumpy about that. This is not how it works. If you break > stuff, then you take care to fix it before you shove more changes into > the tree and waste my time. > > I'm very much inclined to take the reverts right now, send them to Linus > for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv > patches until the next merge window is over. Acked-by: Palmer Dabbelt <palmer@rivosinc.com> if you want to take the revert. IIUC the patch above doesn't actually fix it, that's what led to just sending the reverts -- at least reverts are better than breaking users. I'll post over there too... And it's no big deal if we're in the doghouse for a bit. Regressions should get fixed faster than this, so we deserve it. Probably also another sign we're way too focused on getting new features merged, as that's coming at the expense of making existing platforms work. IMO we've been way too focused on getting support for specs that don't even have implementations, and not enough on building real working systems. > Emil, can you give that sun4i fix a test ride please? > > Thanks, > > tglx
On Thu, Aug 15 2024 at 10:51, Palmer Dabbelt wrote: > On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote: >> I'm very much inclined to take the reverts right now, send them to Linus >> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv >> patches until the next merge window is over. > > Acked-by: Palmer Dabbelt <palmer@rivosinc.com> > > if you want to take the revert. I'm happy to wait a week and see whether someone gets that CLINT hack working or as I suggested the D1 PLIC early probe quirk. > IIUC the patch above doesn't actually fix it, that's what led to just > sending the reverts -- at least reverts are better than breaking users. > I'll post over there too... Right. We figured that out by now :) > And it's no big deal if we're in the doghouse for a bit. Regressions > should get fixed faster than this, so we deserve it. For a week I consider you probationers :) > Probably also another sign we're way too focused on getting new features > merged, as that's coming at the expense of making existing platforms > work. IMO we've been way too focused on getting support for specs that > don't even have implementations, and not enough on building real working > systems. RISCV is not alone with that. This whole industry is nuts about features and forgets the stuff what matters. Thanks, tglx
On Thu, 15 Aug 2024 11:10:25 PDT (-0700), tglx@linutronix.de wrote: > On Thu, Aug 15 2024 at 10:51, Palmer Dabbelt wrote: >> On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote: >>> I'm very much inclined to take the reverts right now, send them to Linus >>> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv >>> patches until the next merge window is over. >> >> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> >> >> if you want to take the revert. > > I'm happy to wait a week and see whether someone gets that CLINT hack > working or as I suggested the D1 PLIC early probe quirk. > >> IIUC the patch above doesn't actually fix it, that's what led to just >> sending the reverts -- at least reverts are better than breaking users. >> I'll post over there too... > > Right. We figured that out by now :) > >> And it's no big deal if we're in the doghouse for a bit. Regressions >> should get fixed faster than this, so we deserve it. > > For a week I consider you probationers :) Works for me ;) >> Probably also another sign we're way too focused on getting new features >> merged, as that's coming at the expense of making existing platforms >> work. IMO we've been way too focused on getting support for specs that >> don't even have implementations, and not enough on building real working >> systems. > > RISCV is not alone with that. This whole industry is nuts about features > and forgets the stuff what matters. > > Thanks, > > tglx
在 2024-08-15星期四的 20:00 +0530,Anup Patel写道: > On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland > <samuel.holland@sifive.com> wrote: > > > > Hi Thomas, Emil, > > > > On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > > > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote: > > > > Emil Renner Berthing wrote: > > > > > 6.11-rc3 + these reverts: > > > > > https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv > > > > > 6.11-rc3 + Samuel's patch: > > > > > https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p > > > > > > > > I think this confirms what Charlie found here: > > > > https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/ > > > > > > Yes. So the riscv timer is not working on this thing or it stops > > > somehow. > > > > That's correct. With the (firmware) devicetree that Emil is using, > > the OpenSBI > > firmware does not have a timer device, so it does not expose the > > (optional[1]) > > SBI time extension, and sbi_set_timer() does nothing. > > OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to > provide SBI time extension to Linux. > > The RISC-V privileged specification (v1.10 or higher) requires > platform to > provide a M-mode timer (mtime and mtimecmp). The T-Head cores' design is weird, and because of its source code is available (as OpenC906), we can investigate it further in the RTL level: - From software perspective, it has no mtime mmap'ed register, but it has mtimecmp, which compares with time CSR (a CSR R/O in all priv levels). - From SoC integration perspective, the value of the time CSR is sourced from an external input, pad_cpu_sys_cnt[63:0] [1]. BTW I already added support for this kind of non-standard CLINT to OpenSBI [2], however I don't think the current firmware DT of D1 utilizes it; and this is also a quite new SBI version I think. [1] https://github.com/XUANTIE-RV/openc906/blob/main/C906_RTL_FACTORY/gen_rtl/cpu/rtl/openC906.v#L84 [2] https://github.com/riscv-software-src/opensbi/commit/b848d8763a737de44b64bfc036c8f51200226440 > > This platform not having any M-mode timer is yet another RISC-V spec > violation by this platform. > > Regards, > Anup > > > > > I wrote a patch (not submitted) to skip registering > > riscv_clock_event when the > > SBI time extension is unavailable, but this doesn't fully solve the > > issue > > either, because then we have no clockevent at all when > > check_unaligned_access_all_cpus() is called. > > > > How early in the boot process are we "required" to have a > > functional clockevent? > > Do we need to refactor check_unaligned_access_all_cpus() so it > > works on systems > > where the only clockevent is provided by a platform device? > > > > Regards, > > Samuel > > > > [1] > > https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc > > > > > Can you apply the debug patch below and check whether you see the > > > 'J: ....' output at all and if so whether it stops at some point. > > > > > > Thanks, > > > > > > tglx > > > > > > --- > > > --- a/kernel/time/timer.c > > > +++ b/kernel/time/timer.c > > > @@ -2459,6 +2459,9 @@ static void run_local_timers(void) > > > { > > > struct timer_base *base = > > > this_cpu_ptr(&timer_bases[BASE_LOCAL]); > > > > > > + if (!(jiffies & 0xFF)) > > > + pr_info("J: %lx\n", jiffies); > > > + > > > hrtimer_run_queues(); > > > > > > for (int i = 0; i < NR_BASES; i++, base++) { > > > > > > > > > > > > _______________________________________________ > > > linux-riscv mailing list > > > linux-riscv@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
在 2024-08-15星期四的 10:51 -0700,Palmer Dabbelt写道: > On Thu, 15 Aug 2024 08:59:37 PDT (-0700), > samuel.holland@sifive.com wrote: > > Hi Emil, > > > > On 2024-08-15 10:07 AM, Emil Renner Berthing wrote: > > > Samuel Holland wrote: > > > > On 2024-08-15 9:16 AM, Anup Patel wrote: > > > > > On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner > > > > > <tglx@linutronix.de> wrote: > > > > > > > > > > > > On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote: > > > > > > > On 2024-08-15 8:16 AM, Thomas Gleixner wrote: > > > > > > > > Yes. So the riscv timer is not working on this thing or > > > > > > > > it stops > > > > > > > > somehow. > > > > > > > > > > > > > > That's correct. With the (firmware) devicetree that Emil > > > > > > > is using, the OpenSBI > > > > > > > firmware does not have a timer device, so it does not > > > > > > > expose the (optional[1]) > > > > > > > SBI time extension, and sbi_set_timer() does nothing. > > > > > > > > > > > > Sigh. Does RISCV really have to repeat all mistakes which > > > > > > have been made > > > > > > by x86, ARM and others before? It's known for decades that > > > > > > the kernel > > > > > > relies on a working timer... > > It's even worse than that: RISC-V doesn't even mandate any working > _instructions_, much less anything in the platform/firmware. > > > > > > My apologies for the delay in finding a fix for this issue. > > > > > > > > > > Almost all RISC-V platforms (except this one) have SBI Timer > > > > > always > > > > > available and Linux uses a better timer or Sstc extension > > > > > whenever > > > > > it is available. > > > > > > > > So this is the immediate solution: add the CLINT to the > > > > firmware devicetree so > > > > that the SBI time extension works, and Linux will boot without > > > > any code changes, > > > > albeit with a higher-overhead clockevent device. > > > > > > But this will mean that you can't update your kernel to v6.9 or > > > newer without > > > reflashing OpenSBI and u-boot. That's still a regression right? > > Ya, I'd call that a regression. Updating the firmware on these > things > isn't generally something we can rely on users to do, we've worked > around other firmware bugs where we can to avoid forced updates. > > > I suppose that depends on if you think the SBI time extension is > > (or should have > > been) mandatory for platforms without Sstc. If the SBI time > > extension is > > mandatory, then this is a firmware bug, and not really Linux's > > responsibility to > > work around. > > > > If the SBI time extension is not mandatory, then Linux needs to be > > able to > > handle platforms where the S-mode visible timer is attached to an > > external > > interrupt controller (PLIC or APLIC), so the irqchip driver needs > > to be loaded > > before time_init() (timer_probe()). So in that case, the bug is a > > Linux > > regression, and we would need to revert the platform driver > > conversion. > > It doesn't really matter what the specs say (aka intended to say in > RISC-V land): if there's a regression then we have to deal with it. > It's not like whatever's written in the specs actually matters, > vendors > can just do whatever they want, so wer'e just stuck making the known > implementations work. > > So I think if the revert is the best fix then we should revert it. > > That said: If the CLINT works, could we just add a probing quirk to > make CLINT works, but will will never work in S mode by its design -- the register used is all M-mode-only. So it this kind of probing quirk is being added, the target will be OpenSBI instead of Linux, and the problem of updating firmware still exists. > it appear on these systems even when it's not in the DT? I'm > thinking > something like adding a compatibly string to the CLINT driver for the > SOC (or core or whatever, just something that's already there). We'd > probably need a bit of special-case probing code, but shouldn't be so > bad. We've got some other compatibility-oriented DT quirks floating > around. > > > Regards, > > Samuel > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
在 2024-08-15星期四的 10:51 -0700,Palmer Dabbelt写道: > On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote: > > On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote: > > > As described in the thread below[1] I haven't been able to boot > > > my > > > boards based on the Allwinner D1 SoC since 6.9 where you > > > converted the > > > SiFive PLIC driver to a platform driver. > > > > > > This is clearly a regression and there haven't really been much > > > progress > > > on fixing the issue since then, so here is the revert that fixes > > > it. > > > > > > If no other fix is found before 6.11 I suggest we apply this. > > > > So this mess has been ignored for two month now? > > > > > From the pastebin in the initial report: > > > > [ 0.000000] irq: no irq domain found for > > interrupt-controller@10000000 ! > > [ 0.000000] Failed to map interrupt for /soc/timer@2050000 > > [ 0.000000] Failed to initialize '/soc/timer@2050000': -22 > > > > This comes back with -EINVAL. So the timer cannot find an > > interrupt, > > which makes it pretty obvious why the system stops to boot, unless > > there > > is some other timer available. > > > > This is obviously related to the SUN4I_TIMER because that message > > went > > away when it was disabled according to the next pastebin. > > > > Obviously that can't work because the SUN4I timer driver is using > > timer_of_init() which cannot handle deferred probing. > > > > Daniel: There was a partial fix for the sun4i driver, which you > > said you > > applied: > > > > > > https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com > > > > But that thing never materialized in a pull request. > > > > And of course everyone involved ignored the problem since March > > 13th > > 2024, i.e. almost half a year. > > > > Seriously? > > > > Can you RISCV folks get your act together and ensure to fix things > > you > > broke on the way? Especially when Emil reported this nobody pointed > > him > > to this patch and nobody noticed that it's still not merged? > > > > It took me less than 15 minutes to find that patch and the > > correlation, > > but this is absolutely not my job. > > Sorry, I guess I'd just sort of been ignoring the platform-specific > side > of things because it's so frustrating to deal with, but that's led to > a > bunch of breakages so it's obviously the wrong thing to do. > > > I'm seriously grumpy about that. This is not how it works. If you > > break > > stuff, then you take care to fix it before you shove more changes > > into > > the tree and waste my time. > > > > I'm very much inclined to take the reverts right now, send them to > > Linus > > for -rc5 tagged with cc: stable and ignore/nak any irqchip related > > riscv > > patches until the next merge window is over. > > Acked-by: Palmer Dabbelt <palmer@rivosinc.com> > > if you want to take the revert. > > IIUC the patch above doesn't actually fix it, that's what led to just > sending the reverts -- at least reverts are better than breaking > users. > I'll post over there too... > > And it's no big deal if we're in the doghouse for a bit. Regressions > should get fixed faster than this, so we deserve it. > > Probably also another sign we're way too focused on getting new > features > merged, as that's coming at the expense of making existing platforms > work. IMO we've been way too focused on getting support for specs > that > don't even have implementations, and not enough on building real > working > systems. Well I think all existing platforms are more or less weird (in specification-compatibility, stability, etc). (Maybe FU540 isn't so weird, but it has too few peripherals to be really useful, and it's discontinued; FU740 has some stability issues.) > > > Emil, can you give that sun4i fix a test ride please? > > > > Thanks, > > > > tglx > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Wed, 14 Aug 2024 07:56:32 PDT (-0700), Renner Berthing wrote: > Hi Anup, > > As described in the thread below[1] I haven't been able to boot my > boards based on the Allwinner D1 SoC since 6.9 where you converted the > SiFive PLIC driver to a platform driver. > > This is clearly a regression and there haven't really been much progress > on fixing the issue since then, so here is the revert that fixes it. > > If no other fix is found before 6.11 I suggest we apply this. > > [1]: https://lore.kernel.org/linux-riscv/CAJM55Z9hGKo4784N3s3DhWw=nMRKZKcmvZ58x7uVBghExhoc9A@mail.gmail.com/ > > /Emil > > Emil Renner Berthing (9): > Revert "irqchip/sifive-plic: Chain to parent IRQ after handlers are > ready" > Revert "irqchip/sifive-plic: Avoid explicit cpumask allocation on > stack" > Revert "irqchip/sifive-plic: Improve locking safety by using > irqsave/irqrestore" > Revert "irqchip/sifive-plic: Parse number of interrupts and contexts > early in plic_probe()" > Revert "irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain > creation failure" > Revert "irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent > fwnode" > Revert "irqchip/sifive-plic: Use devm_xyz() for managed allocation" > Revert "irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()" > Revert "irqchip/sifive-plic: Convert PLIC driver into a platform > driver" > > drivers/irqchip/irq-sifive-plic.c | 296 ++++++++++++------------------ > 1 file changed, 117 insertions(+), 179 deletions(-) I'm still only testing on the QEMU-emulated platforms, but this isn't regressing over there so Tested-by: Palmer Dabbelt <palmer@rivosinc.com> # QEMU Thanks!