Message ID | 20230411165919.23955-3-jim2101024@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | PCI: brcmstb: CLKREQ# accomodations of downstream device | expand |
Hi Jim, Jim Quinlan <jim2101024@gmail.com> (2023-04-11): > […] > This property has already been in use by Raspian Linux, but this > immplementation adds more details and discerns between (a) and (b) ^^^^^^^^^^^^^^^ implementation > automatically. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276 > Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Sorry, it seems like my initial tests with v1 (applied on top of v6.3-rc5-137-gf2afccfefe7b) weren't thorough enough, and I'm seeing the same problems with v2 (applied on top of v6.3-rc6-46-gde4664485abb): - same setup as in https://bugzilla.kernel.org/show_bug.cgi?id=217276 - the kernel panic is indeed gone; - a USB keyboard connected on that SupaHub PCIe-to-multiple-USB adapter isn't seen by the kernel; - a USB memory stick connected on the same adapter isn't seen by the kernel either; - of course both USB devices are confirmed to work fine if they're plugged directly on the CM4's USB ports. Logs with v2: root@cm4:~# dmesg|grep -i pci [ 0.610997] PCI: CLS 0 bytes, default 64 [ 1.664886] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 1.672083] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges: [ 1.679125] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff] [ 1.688279] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x0603ffffff -> 0x00f8000000 [ 1.696463] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0400000000 [ 1.705282] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00 [ 1.711629] pci_bus 0000:00: root bus resource [bus 00-ff] [ 1.717172] pci_bus 0000:00: root bus resource [mem 0x600000000-0x603ffffff] (bus address [0xf8000000-0xfbffffff]) [ 1.727653] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400 [ 1.733768] pci 0000:00:00.0: PME# supported from D0 D3hot [ 1.740235] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring [ 1.855826] brcm-pcie fd500000.pcie: CLKREQ# ignored; no ASPM [ 1.863666] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC) [ 1.870115] pci 0000:01:00.0: [1912:0014] type 00 class 0x0c0330 [ 1.876205] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit] [ 1.883177] pci 0000:01:00.0: PME# supported from D0 D3hot [ 1.888881] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 [ 1.895581] pci 0000:00:00.0: BAR 14: assigned [mem 0x600000000-0x6000fffff] [ 1.902707] pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x600001fff 64bit] [ 1.910279] pci 0000:00:00.0: PCI bridge to [bus 01] [ 1.915293] pci 0000:00:00.0: bridge window [mem 0x600000000-0x6000fffff] [ 1.922412] pcieport 0000:00:00.0: enabling device (0000 -> 0002) [ 1.928633] pcieport 0000:00:00.0: PME: Signaling with IRQ 23 [ 1.934609] pcieport 0000:00:00.0: AER: enabled with IRQ 23 [ 1.940340] pci 0000:01:00.0: enabling device (0000 -> 0002) [ 6.946090] pci 0000:01:00.0: xHCI HW not ready after 5 sec (HC bug?) status = 0x1801 [ 6.954026] pci 0000:01:00.0: quirk_usb_early_handoff+0x0/0x968 took 4896180 usecs Please let me know what I can do to help. Cheers,
Hello Cyril, Can you provide (a) the full boot log prior to applying the patch series and (b) full boot log after applying the series, using an IDENTICAL setup. If it fails on both then it has little to do with my patch series. In my last series your testing somehow conflated the effect of an unrelated MMC interrupt issue so please be precise. Regards, Jim Quinlan Broadcom STB On Thu, Apr 13, 2023 at 10:39 AM Cyril Brulebois <kibi@debian.org> wrote: > > Hi Jim, > > Jim Quinlan <jim2101024@gmail.com> (2023-04-11): > > […] > > This property has already been in use by Raspian Linux, but this > > immplementation adds more details and discerns between (a) and (b) > ^^^^^^^^^^^^^^^ > implementation > > > automatically. > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276 > > Signed-off-by: Jim Quinlan <jim2101024@gmail.com> > > Sorry, it seems like my initial tests with v1 (applied on top of > v6.3-rc5-137-gf2afccfefe7b) weren't thorough enough, and I'm seeing the > same problems with v2 (applied on top of v6.3-rc6-46-gde4664485abb): > - same setup as in https://bugzilla.kernel.org/show_bug.cgi?id=217276 > - the kernel panic is indeed gone; > - a USB keyboard connected on that SupaHub PCIe-to-multiple-USB adapter > isn't seen by the kernel; > - a USB memory stick connected on the same adapter isn't seen by the > kernel either; > - of course both USB devices are confirmed to work fine if they're > plugged directly on the CM4's USB ports. > > Logs with v2: > > root@cm4:~# dmesg|grep -i pci > [ 0.610997] PCI: CLS 0 bytes, default 64 > [ 1.664886] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 > [ 1.672083] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges: > [ 1.679125] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff] > [ 1.688279] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x0603ffffff -> 0x00f8000000 > [ 1.696463] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0400000000 > [ 1.705282] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00 > [ 1.711629] pci_bus 0000:00: root bus resource [bus 00-ff] > [ 1.717172] pci_bus 0000:00: root bus resource [mem 0x600000000-0x603ffffff] (bus address [0xf8000000-0xfbffffff]) > [ 1.727653] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400 > [ 1.733768] pci 0000:00:00.0: PME# supported from D0 D3hot > [ 1.740235] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring > [ 1.855826] brcm-pcie fd500000.pcie: CLKREQ# ignored; no ASPM > [ 1.863666] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC) > [ 1.870115] pci 0000:01:00.0: [1912:0014] type 00 class 0x0c0330 > [ 1.876205] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit] > [ 1.883177] pci 0000:01:00.0: PME# supported from D0 D3hot > [ 1.888881] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 > [ 1.895581] pci 0000:00:00.0: BAR 14: assigned [mem 0x600000000-0x6000fffff] > [ 1.902707] pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x600001fff 64bit] > [ 1.910279] pci 0000:00:00.0: PCI bridge to [bus 01] > [ 1.915293] pci 0000:00:00.0: bridge window [mem 0x600000000-0x6000fffff] > [ 1.922412] pcieport 0000:00:00.0: enabling device (0000 -> 0002) > [ 1.928633] pcieport 0000:00:00.0: PME: Signaling with IRQ 23 > [ 1.934609] pcieport 0000:00:00.0: AER: enabled with IRQ 23 > [ 1.940340] pci 0000:01:00.0: enabling device (0000 -> 0002) > [ 6.946090] pci 0000:01:00.0: xHCI HW not ready after 5 sec (HC bug?) status = 0x1801 > [ 6.954026] pci 0000:01:00.0: quirk_usb_early_handoff+0x0/0x968 took 4896180 usecs > > Please let me know what I can do to help. > > > Cheers, > -- > Cyril Brulebois (kibi@debian.org) <https://debamax.com/> > D-I release manager -- Release team member -- Freelance Consultant
On 4/13/2023 7:39 AM, Cyril Brulebois wrote: > Hi Jim, > > Jim Quinlan <jim2101024@gmail.com> (2023-04-11): >> […] >> This property has already been in use by Raspian Linux, but this >> immplementation adds more details and discerns between (a) and (b) > ^^^^^^^^^^^^^^^ > implementation > >> automatically. >> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276 >> Signed-off-by: Jim Quinlan <jim2101024@gmail.com> > > Sorry, it seems like my initial tests with v1 (applied on top of > v6.3-rc5-137-gf2afccfefe7b) weren't thorough enough, and I'm seeing the > same problems with v2 (applied on top of v6.3-rc6-46-gde4664485abb): > - same setup as in https://bugzilla.kernel.org/show_bug.cgi?id=217276 > - the kernel panic is indeed gone; > - a USB keyboard connected on that SupaHub PCIe-to-multiple-USB adapter > isn't seen by the kernel; > - a USB memory stick connected on the same adapter isn't seen by the > kernel either; > - of course both USB devices are confirmed to work fine if they're > plugged directly on the CM4's USB ports. > > Logs with v2: > > root@cm4:~# dmesg|grep -i pci > [ 0.610997] PCI: CLS 0 bytes, default 64 > [ 1.664886] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 > [ 1.672083] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges: > [ 1.679125] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff] > [ 1.688279] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x0603ffffff -> 0x00f8000000 > [ 1.696463] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0400000000 > [ 1.705282] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00 > [ 1.711629] pci_bus 0000:00: root bus resource [bus 00-ff] > [ 1.717172] pci_bus 0000:00: root bus resource [mem 0x600000000-0x603ffffff] (bus address [0xf8000000-0xfbffffff]) > [ 1.727653] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400 > [ 1.733768] pci 0000:00:00.0: PME# supported from D0 D3hot > [ 1.740235] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring > [ 1.855826] brcm-pcie fd500000.pcie: CLKREQ# ignored; no ASPM > [ 1.863666] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC) > [ 1.870115] pci 0000:01:00.0: [1912:0014] type 00 class 0x0c0330 > [ 1.876205] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit] > [ 1.883177] pci 0000:01:00.0: PME# supported from D0 D3hot > [ 1.888881] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 > [ 1.895581] pci 0000:00:00.0: BAR 14: assigned [mem 0x600000000-0x6000fffff] > [ 1.902707] pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x600001fff 64bit] > [ 1.910279] pci 0000:00:00.0: PCI bridge to [bus 01] > [ 1.915293] pci 0000:00:00.0: bridge window [mem 0x600000000-0x6000fffff] > [ 1.922412] pcieport 0000:00:00.0: enabling device (0000 -> 0002) > [ 1.928633] pcieport 0000:00:00.0: PME: Signaling with IRQ 23 > [ 1.934609] pcieport 0000:00:00.0: AER: enabled with IRQ 23 > [ 1.940340] pci 0000:01:00.0: enabling device (0000 -> 0002) > [ 6.946090] pci 0000:01:00.0: xHCI HW not ready after 5 sec (HC bug?) status = 0x1801 > [ 6.954026] pci 0000:01:00.0: quirk_usb_early_handoff+0x0/0x968 took 4896180 usecs > > Please let me know what I can do to help. Could you please attach your .config so we can check a few things?
On Thu, Apr 13, 2023 at 4:06 PM Cyril Brulebois <kibi@debian.org> wrote: > > Hi Jim, > > Jim Quinlan <jim2101024@gmail.com> (2023-04-13): > > Can you provide (a) the full boot log prior to applying the patch > > series and (b) full boot log after applying the series, using an > > IDENTICAL setup. If it fails on both then it has little to do with my > > patch series. > > Just to be clear, the issue I reported was with: > - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > > This was my minimal reproducer for the kernel panic at boot-up, which > goes away with either v1 or v2. When I realized I didn't actually check > whether the SupaHub board was working correctly, I plugged 2 devices to > obtain this setup: > - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > - Kingston DataTraveler G4 32GB on USB-A port #1 of the SupaHub board. > - Logitech K120 keyboard on USB-A port #2 of the SupaHub board. > > It turns out that this particular revision of the SupaHub board isn't > supported by xhci_hcd directly (failing to probe with error -110) and > one needs to enable CONFIG_USB_XHCI_PCI_RENESAS=m and also ship its > accompanying firmware (/lib/firmware/renesas_usb_fw.mem). With this > updated kernel config, I'm able to use the keyboard and to read data > from the memory stick without problems (70 MB/s). > > > In my last series your testing somehow conflated the effect of an > > unrelated MMC interrupt issue so please be precise. > > I wish things would be simpler and didn't involve combinatorics, let > alone other bugs/regressions at times, but I'm really trying my best to > navigate and report issues and test patches when I can spare some time… Hi Cyril, I want to encourage you and others doing testing and bug reporting: everyone wins when a bug or issue is reported, fixed, and tested. I'm just asking that when you have negative results, that you provide information on the "before" and "after" test results of the patch series, and run both on the same test environment. Regards, Jim Quinlan Broadcom STB > > > In my defence, the very similar board PCE6U1C-R02, VER 006 boots without > a kernel panic, and works fine with xhci_hcd without that extra module > and its firmware. It's based on the exact same chip (Renesas Technology > Corp. uPD720201) though, that's why I didn't realize the need for an > extra module until now for the PCE6U1C-R02, VER 006S one. (Florian, > thanks for mentioning .config earlier…) > > To sum up, it seems both (sub)versions of that SupaHub PCE6U1C-R02 board > are usable with this patch series: the kernel panic at boot-up is gone, > and USB devices plugged into those boards are working as expected. > > > But, speaking of combinatorics, while the patch series helps fix > https://bugzilla.kernel.org/show_bug.cgi?id=217276 on that particular > initial combination of CM4 and SupaHub PCE6U1C-R02, VER 006S, it also > generates a regression if I use a different CM4… > > Setup: > - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > > You'll find attached (a) and (b) as requested above, for this setup, as > well as the kernel configuration file: > - (a) = dmesg-unpatched.txt = boots fine (and USB devices work fine if > memory sticks or keyboards are plugged in). > - (b) = dmesg-patched.txt = kernel panic. > - cm4+pcie.config > > I'm getting similar results with the other SupaHub board: > - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006 > > > This is probably best summarized this way: > > | unpatched | patched > ------------------------------+--------------+-------------- > CM4 Rev 1.1 4G/32G + VER 006 | OK | OK > CM4 Rev 1.1 4G/32G + VER 006S | Kernel panic | OK ← Bugzilla entry > CM4 Rev 1.0 8G/32G + VER 006 | OK | Kernel panic > CM4 Rev 1.0 8G/32G + VER 006S | OK | Kernel panic > > > (And I'm of course using the same settings regarding the serial console, > to get traces when needed.) > > > Cheers, > -- > Cyril Brulebois (kibi@debian.org) <https://debamax.com/> > D-I release manager -- Release team member -- Freelance Consultant
On 4/14/2023 5:14 AM, Jim Quinlan wrote: > On Thu, Apr 13, 2023 at 4:06 PM Cyril Brulebois <kibi@debian.org> wrote: >> >> Hi Jim, >> >> Jim Quinlan <jim2101024@gmail.com> (2023-04-13): >>> Can you provide (a) the full boot log prior to applying the patch >>> series and (b) full boot log after applying the series, using an >>> IDENTICAL setup. If it fails on both then it has little to do with my >>> patch series. >> >> Just to be clear, the issue I reported was with: >> - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) >> - Raspberry Pi Compute Module 4 IO Board >> - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S >> >> This was my minimal reproducer for the kernel panic at boot-up, which >> goes away with either v1 or v2. When I realized I didn't actually check >> whether the SupaHub board was working correctly, I plugged 2 devices to >> obtain this setup: >> - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) >> - Raspberry Pi Compute Module 4 IO Board >> - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S >> - Kingston DataTraveler G4 32GB on USB-A port #1 of the SupaHub board. >> - Logitech K120 keyboard on USB-A port #2 of the SupaHub board. >> >> It turns out that this particular revision of the SupaHub board isn't >> supported by xhci_hcd directly (failing to probe with error -110) and >> one needs to enable CONFIG_USB_XHCI_PCI_RENESAS=m and also ship its >> accompanying firmware (/lib/firmware/renesas_usb_fw.mem). With this >> updated kernel config, I'm able to use the keyboard and to read data >> from the memory stick without problems (70 MB/s). >> >>> In my last series your testing somehow conflated the effect of an >>> unrelated MMC interrupt issue so please be precise. >> >> I wish things would be simpler and didn't involve combinatorics, let >> alone other bugs/regressions at times, but I'm really trying my best to >> navigate and report issues and test patches when I can spare some time… > > Hi Cyril, > > I want to encourage you and others doing testing and bug reporting: > everyone wins when a bug or issue is reported, fixed, and tested. > I'm just asking that when you have negative results, that you provide > information on the "before" and "after" test results of > the patch series, and run both on the same test environment. Cyril, based upon the table and logs you provided whereby you have used the following: - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) - Raspberry Pi Compute Module 4 IO Board - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S in the before/unpatched case we have a PCIe link down and in the after/patched we have a PCIe link up but a kernel panic. Neither are great nor resulting in a fully functional PCIe device. Looking at: https://www.amazon.co.uk/SupaHub-Express-BandWidth-Capable-Expanding/dp/B092ZQWG5B it would appear that it can accept an external power supply, do you have one connected to that USB expansion card by any chance? Are you able to boot the kernel before/after if you disconnect any USB peripheral? This looks like a broader electrical problem than the scope of this patch, though it would be neat if we could find a combination that works. At least with Jim's patch we have a PCIe link with uni-directional CLKREQ# so we could try a variety of things. Does that SupaHub board plugged to the CM4 1.0 system work fine in the Raspberry Pi kernel tree?
On Fri, Apr 14, 2023 at 8:27 AM Florian Fainelli <f.fainelli@gmail.com> wrote: > > > > On 4/14/2023 5:14 AM, Jim Quinlan wrote: > > On Thu, Apr 13, 2023 at 4:06 PM Cyril Brulebois <kibi@debian.org> wrote: > >> > >> Hi Jim, > >> > >> Jim Quinlan <jim2101024@gmail.com> (2023-04-13): > >>> Can you provide (a) the full boot log prior to applying the patch > >>> series and (b) full boot log after applying the series, using an > >>> IDENTICAL setup. If it fails on both then it has little to do with my > >>> patch series. > >> > >> Just to be clear, the issue I reported was with: > >> - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) > >> - Raspberry Pi Compute Module 4 IO Board > >> - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > >> > >> This was my minimal reproducer for the kernel panic at boot-up, which > >> goes away with either v1 or v2. When I realized I didn't actually check > >> whether the SupaHub board was working correctly, I plugged 2 devices to > >> obtain this setup: > >> - Raspberry Pi Compute Module 4 (Rev 1.1, 4G RAM, 32G storage) > >> - Raspberry Pi Compute Module 4 IO Board > >> - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > >> - Kingston DataTraveler G4 32GB on USB-A port #1 of the SupaHub board. > >> - Logitech K120 keyboard on USB-A port #2 of the SupaHub board. > >> > >> It turns out that this particular revision of the SupaHub board isn't > >> supported by xhci_hcd directly (failing to probe with error -110) and > >> one needs to enable CONFIG_USB_XHCI_PCI_RENESAS=m and also ship its > >> accompanying firmware (/lib/firmware/renesas_usb_fw.mem). With this > >> updated kernel config, I'm able to use the keyboard and to read data > >> from the memory stick without problems (70 MB/s). > >> > >>> In my last series your testing somehow conflated the effect of an > >>> unrelated MMC interrupt issue so please be precise. > >> > >> I wish things would be simpler and didn't involve combinatorics, let > >> alone other bugs/regressions at times, but I'm really trying my best to > >> navigate and report issues and test patches when I can spare some time… > > > > Hi Cyril, > > > > I want to encourage you and others doing testing and bug reporting: > > everyone wins when a bug or issue is reported, fixed, and tested. > > I'm just asking that when you have negative results, that you provide > > information on the "before" and "after" test results of > > the patch series, and run both on the same test environment. > > Cyril, based upon the table and logs you provided whereby you have used > the following: > > - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > > in the before/unpatched case we have a PCIe link down and in the > after/patched we have a PCIe link up but a kernel panic. Neither are > great nor resulting in a fully functional PCIe device. The data do not make sense to me. My new code is executed AFTER pcie link up/down determination. Both test results should have the same link status -- either "link up" or "link down". If the system is wishy-washy, i.e. it has link-up in 5/10 boots, then we need to repeat the experiment to compare the "link up" cases only. Or discount the test completely If the system is not wishy-washy, then something has been changed between the "before" and "after" tests. > > Looking at: > > https://www.amazon.co.uk/SupaHub-Express-BandWidth-Capable-Expanding/dp/B092ZQWG5B > > it would appear that it can accept an external power supply, do you have > one connected to that USB expansion card by any chance? Are you able to > boot the kernel before/after if you disconnect any USB peripheral? > > This looks like a broader electrical problem than the scope of this > patch, though it would be neat if we could find a combination that > works. At least with Jim's patch we have a PCIe link with > uni-directional CLKREQ# so we could try a variety of things. > > Does that SupaHub board plugged to the CM4 1.0 system work fine in the > Raspberry Pi kernel tree? > -- > Florian
Florian Fainelli <f.fainelli@gmail.com> (2023-04-14): > Cyril, based upon the table and logs you provided whereby you have used the > following: > > - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) > - Raspberry Pi Compute Module 4 IO Board > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > > in the before/unpatched case we have a PCIe link down and in the > after/patched we have a PCIe link up but a kernel panic. Neither are great > nor resulting in a fully functional PCIe device. Based on Jim's feedback, I'm attaching a new dmesg for the aforementioned setup, with an unpatched kernel (dmesg-unpatched-pcie-link-up.txt). I fear that initially I might have not waited enough before power off and power on when replugging the SupaHub adapter, and I've only seen the PCIe link down a few times (now that I'm actively paying attention to that part). I'm indeed seeing PCIe link up consistently (100%) when using the following method: for i in $(seq 1 20); do # power on, let the system boot up and settle: sispmctl -t 4; sleep 2m ssh cm4 sh -c "dmesg > dmesg-$i.txt; poweroff" # let the system power down, and power off: sleep 30; sispmctl -t 4 # wait before power cycling: sleep 10 done > Looking at: > https://www.amazon.co.uk/SupaHub-Express-BandWidth-Capable-Expanding/dp/B092ZQWG5B > > it would appear that it can accept an external power supply, do you have one > connected to that USB expansion card by any chance? With the patched kernel, I have tried several things (leaving the regular 12V adapter plugged in all the time): - No external power supply (that's what I've been using in all previous attempts). - Using a 5V PSU with 2 pins (ground and 5V) plugged on the Ext PSU 4-pin header on the IO Board. - Using a 5V PSU with a SATA connector plugged directly onto the SupaHub adapter. I'm getting the same trace in all cases. > Are you able to boot the kernel before/after if you disconnect any USB > peripheral? The trace is obtained without any USB peripheral (on the extension card or on the onboard USB slots). Besides the SupaHub adapter on the PCIe slot, I only have Ethernet and HDMI plugged in (I'm getting traces via serial console, and operating the system over SSH when it boots fine). As soon as I unplug the SupaHub board itself, the system boots fine. > Does that SupaHub board plugged to the CM4 1.0 system work fine in the > Raspberry Pi kernel tree? With the Raspberry Pi OS (64-bit) > Raspberry Pi OS Lite image (2023-02-21-raspios-bullseye-arm64-lite.img.xz), the system at least boots fine; I haven't tried adding devices to the mix just yet, trying to stick with that particular setup. A full dmesg is attached (dmesg-raspios.txt). Cheers,
This subject line no verb. Can you add a leading verb to suggest what this patch does? s/accomodations/accommodations/ On Tue, Apr 11, 2023 at 12:59:17PM -0400, Jim Quinlan wrote: > The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be > deliberately set by the probe() into one of three mutually exclusive modes: > > (a) No CLKREQ# expected or required, refclk is always available. > (b) CLKREQ# is expected to be driven by downstream device when needed. > (c) Bidirectional CLKREQ# for L1SS capable devices. > > Previously, only (b) was supported by the driver, as almost all STB/CM > boards operate in this mode. But now there is interest in activating L1SS > power savings from STB/CM customers, and also interest in accomodating mode > (a) for designs such as the RPi CM4 with IO board. accommodating > The HW+driver is able to tell us when mode (a) mode is needed. But there > is no easy way to tell if L1SS mode should be configured. In certain > situations, getting this wrong may cause a panic during boot time. So we > rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired. > Using this mode only makes sense when the downstream device is L1SS-capable > and the OS has been configured to activate L1SS > (e.g. policy==powersupersave). I'm really concerned about the user experience here. I assume users do not want to edit the DT based on what device they plug in. They shouldn't need to (and probably won't) know whether the device supports L1SS. I hate kernel/module parameters, but I think even that would be better then having to edit the DT. There's obviously a period of time when L1SS is supported but not yet enabled, so I'm *guessing* the "OS has been configured to activate L1SS" is not actually a requirement, and choosing (c) really just opens the possibility that L1SS can be used? Would be nice to have a hint (maybe a line or two of the panic message) to help users find the fix for a problem they're seeing. Obviously the ideal would be if we could use (c) in all cases, so I assume that's where a panic might happen. What situation would that be? An endpoint that doesn't support L1SS? One that supports L1SS but it's not enabled? Maybe if L1SS isn't configured correctly, e.g., LTR values programmed wrong? Bjorn
On 4/14/23 13:27, Bjorn Helgaas wrote: > This subject line no verb. Can you add a leading verb to suggest what > this patch does? > > s/accomodations/accommodations/ > > On Tue, Apr 11, 2023 at 12:59:17PM -0400, Jim Quinlan wrote: >> The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be >> deliberately set by the probe() into one of three mutually exclusive modes: >> >> (a) No CLKREQ# expected or required, refclk is always available. >> (b) CLKREQ# is expected to be driven by downstream device when needed. >> (c) Bidirectional CLKREQ# for L1SS capable devices. >> >> Previously, only (b) was supported by the driver, as almost all STB/CM >> boards operate in this mode. But now there is interest in activating L1SS >> power savings from STB/CM customers, and also interest in accomodating mode >> (a) for designs such as the RPi CM4 with IO board. > > accommodating > >> The HW+driver is able to tell us when mode (a) mode is needed. But there >> is no easy way to tell if L1SS mode should be configured. In certain >> situations, getting this wrong may cause a panic during boot time. So we >> rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired. >> Using this mode only makes sense when the downstream device is L1SS-capable >> and the OS has been configured to activate L1SS >> (e.g. policy==powersupersave). > > I'm really concerned about the user experience here. I assume users > do not want to edit the DT based on what device they plug in. They > shouldn't need to (and probably won't) know whether the device > supports L1SS. > > I hate kernel/module parameters, but I think even that would be better > then having to edit the DT. The problem I see with kernel/module parameters is that it is really hard to differentiate whether they should be applied across all instances of the device/drivers or just for one in particular. The Raspberry Pi 4 is a single pcie-brcmstb instance, but we have other systems supported by that driver on Set-top-box and Cable Modem chips for instance where we may have different types of end-points being connected, some of which could be Multi-chip-module (MCM) where everything is known ahead of time, and sometimes cards that are plugged to full-sized PCIe or mini-PCIe connectors, where some amount of runtime discoverability is involved. Without inventing some custom modular parameter syntax, it may not work that well. The Device Tree properties at least easily allow for per-instance configuration. > > There's obviously a period of time when L1SS is supported but not yet > enabled, so I'm *guessing* the "OS has been configured to activate > L1SS" is not actually a requirement, and choosing (c) really just > opens the possibility that L1SS can be used? > > Would be nice to have a hint (maybe a line or two of the panic > message) to help users find the fix for a problem they're seeing. > > Obviously the ideal would be if we could use (c) in all cases, so I > assume that's where a panic might happen. What situation would that > be? An endpoint that doesn't support L1SS? One that supports L1SS > but it's not enabled? Maybe if L1SS isn't configured correctly, e.g., > LTR values programmed wrong? > > Bjorn
On Fri, Apr 14, 2023 at 4:27 PM Bjorn Helgaas <helgaas@kernel.org> wrote: > > This subject line no verb. Can you add a leading verb to suggest what > this patch does? > > s/accomodations/accommodations/ > > On Tue, Apr 11, 2023 at 12:59:17PM -0400, Jim Quinlan wrote: > > The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be > > deliberately set by the probe() into one of three mutually exclusive modes: > > > > (a) No CLKREQ# expected or required, refclk is always available. > > (b) CLKREQ# is expected to be driven by downstream device when needed. > > (c) Bidirectional CLKREQ# for L1SS capable devices. > > > > Previously, only (b) was supported by the driver, as almost all STB/CM > > boards operate in this mode. But now there is interest in activating L1SS > > power savings from STB/CM customers, and also interest in accomodating mode > > (a) for designs such as the RPi CM4 with IO board. > > accommodating > > > The HW+driver is able to tell us when mode (a) mode is needed. But there > > is no easy way to tell if L1SS mode should be configured. In certain > > situations, getting this wrong may cause a panic during boot time. So we > > rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired. > > Using this mode only makes sense when the downstream device is L1SS-capable > > and the OS has been configured to activate L1SS > > (e.g. policy==powersupersave). > > I'm really concerned about the user experience here. I assume users > do not want to edit the DT based on what device they plug in. They > shouldn't need to (and probably won't) know whether the device > supports L1SS. > > I hate kernel/module parameters, but I think even that would be better > then having to edit the DT. > > There's obviously a period of time when L1SS is supported but not yet > enabled, so I'm *guessing* the "OS has been configured to activate > L1SS" is not actually a requirement, and choosing (c) really just > opens the possibility that L1SS can be used? Yes. Before this patch series we had two panic scenarios: (a) Endpoint devices with no CLKREQ# connection (b) Endpoints that are L1SS-capable Even without the "brcm,enable-l1ss" property present, both (a) and (b) should be eliminated. The reason (b) is eliminated is because the RC driver now unadvertises RC L1SS by default; subsequently, Linux does not turn it on. So the default setting should be fine for all devices. For those folks who have L1SS capable devices and desire L1SS power savings, they can add the brcm,enable-l1ss property. But everyone should have functionality w/o doing anything. As I am typing this I realize that my comments and dev_info()s are not aligned with what I am saying so I will change them in V3. Sorry about the confusion. > > Would be nice to have a hint (maybe a line or two of the panic > message) to help users find the fix for a problem they're seeing.SS > > Obviously the ideal would be if we could use (c) in all cases, so I > assume that's where a panic might happen. What situation would that > be? An endpoint that doesn't support L1SS? One that supports L1SS > but it's not enabled? Maybe if L1SS isn't configured correctly, e.g., > LTR values programmed wrong? Let me test everything on Monday and get back to you before I make any statements. Regards, Jim Quinilan Broadcom STB > > Bjorn
On Fri, Apr 14, 2023 at 01:33:29PM -0700, Florian Fainelli wrote: > On 4/14/23 13:27, Bjorn Helgaas wrote: > > On Tue, Apr 11, 2023 at 12:59:17PM -0400, Jim Quinlan wrote: > ... > > > The HW+driver is able to tell us when mode (a) mode is needed. But there > > > is no easy way to tell if L1SS mode should be configured. In certain > > > situations, getting this wrong may cause a panic during boot time. So we > > > rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired. > > > Using this mode only makes sense when the downstream device is L1SS-capable > > > and the OS has been configured to activate L1SS > > > (e.g. policy==powersupersave). > > > > I'm really concerned about the user experience here. I assume users > > do not want to edit the DT based on what device they plug in. They > > shouldn't need to (and probably won't) know whether the device > > supports L1SS. > > > > I hate kernel/module parameters, but I think even that would be better > > then having to edit the DT. > > The problem I see with kernel/module parameters is that it is really hard to > differentiate whether they should be applied across all instances of the > device/drivers or just for one in particular. > > The Raspberry Pi 4 is a single pcie-brcmstb instance, but we have other > systems supported by that driver on Set-top-box and Cable Modem chips for > instance where we may have different types of end-points being connected, > some of which could be Multi-chip-module (MCM) where everything is known > ahead of time, and sometimes cards that are plugged to full-sized PCIe or > mini-PCIe connectors, where some amount of runtime discoverability is > involved. > > Without inventing some custom modular parameter syntax, it may not work that > well. The Device Tree properties at least easily allow for per-instance > configuration. We do have this already (from Documentation/admin-guide/kernel-parameters.txt): pci=option[,option...] [PCI] various PCI subsystem options. Some options herein operate on a specific device or a set of devices (<pci_dev>). These are specified in one of the following formats: [<domain>:]<bus>:<dev>.<func>[/<dev>.<func>]* pci:<vendor>:<device>[:<subvendor>:<subdevice>] Note: the first format specifies a PCI bus/device/function address which may change if new hardware is inserted, if motherboard firmware changes, or due to changes caused by other kernel parameters. If the domain is left unspecified, it is taken to be zero. Optionally, a path to a device through multiple device/function addresses can be specified after the base address (this is more robust against renumbering issues). The second format selects devices using IDs from the configuration space which may match multiple devices in the system.
On Fri, Apr 14, 2023 at 12:19 PM Cyril Brulebois <kibi@debian.org> wrote: > > Florian Fainelli <f.fainelli@gmail.com> (2023-04-14): > > Cyril, based upon the table and logs you provided whereby you have used the > > following: > > > > - Raspberry Pi Compute Module 4 (Rev 1.0, 8G RAM, 32G storage) > > - Raspberry Pi Compute Module 4 IO Board > > - SupaHub PCIe-to-multiple-USB adapter, reference PCE6U1C-R02, VER 006S > > > > in the before/unpatched case we have a PCIe link down and in the > > after/patched we have a PCIe link up but a kernel panic. Neither are great > > nor resulting in a fully functional PCIe device. > > Based on Jim's feedback, I'm attaching a new dmesg for the aforementioned > setup, with an unpatched kernel (dmesg-unpatched-pcie-link-up.txt). Hello Cyril, I'm still seeing things in the logs that do not make sense to me. First, the "unpatched" version logs -- including the Raspian case -- all have the following lines: [ 0.000000] Movable zone start for each node /* ... */ [ 0.000000] pcpu-alloc: s88232 r8192 d30552 u126976 alloc=31*4096 [ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 The above lines are also in my logs. But they are missing from your "after patch" logs -- what did you change to have these lines disappear on the patched logs? Second, you say that you used a different "CM4" from the one used in the Bugzilla report -- what do you mean by that? Are you referring to the CM4 module proper, whose only change was going from 4GB to 8GB, or the IO board being used? My testing is done with a standard RPi CM4 and standard RPi IO board. Before this patch series, the only way this standard configuration can work for most hobbyist PCI cards (i.e. floating CLKREQ# pin) is by using Raspian AND adding "brcm,enable-l1ss" to the DT node. I'm guessing that you are using the configuration that you described to me in a personal email: "[the] chip is embedded directly on the modified PCB, as opposed to plugged into the PCIe slot on the official CM4 IO Board". If true, you are testing on a configuration that (a) is unique to you and your group and (b) must be doing something with the CLKREQ# signal so that your "before" case does not panic. Is this the case? Finally, and this is a big one, is the fact that in both of the "before" and "after" cases the value written to the internal Brcm CLKREQ# register is the same. This is evident by this line in the "after" patch: "brcm-pcie fd500000.pcie: uni-dir CLKREQ# for ASPM". This mode is the only mode supported by the "before" case. Now there are some other things going on in the patch series -- reading two registers and extending the timeout value of a completion abort, but I'm having a hard time imagining how they could cause a panic. Regards, Jim PS Can you please include in your logs the output of "sudo lspci -vvv" for those cases that boot to prompt? > > I fear that initially I might have not waited enough before power off and > power on when replugging the SupaHub adapter, and I've only seen the PCIe > link down a few times (now that I'm actively paying attention to that > part). I'm indeed seeing PCIe link up consistently (100%) when using the > following method: > > for i in $(seq 1 20); do > # power on, let the system boot up and settle: > sispmctl -t 4; sleep 2m > ssh cm4 sh -c "dmesg > dmesg-$i.txt; poweroff" > # let the system power down, and power off: > sleep 30; sispmctl -t 4 > # wait before power cycling: > sleep 10 > done > > > Looking at: > > https://www.amazon.co.uk/SupaHub-Express-BandWidth-Capable-Expanding/dp/B092ZQWG5B > > > > it would appear that it can accept an external power supply, do you have one > > connected to that USB expansion card by any chance? > > With the patched kernel, I have tried several things (leaving the regular > 12V adapter plugged in all the time): > - No external power supply (that's what I've been using in all previous > attempts). > - Using a 5V PSU with 2 pins (ground and 5V) plugged on the Ext PSU > 4-pin header on the IO Board. > - Using a 5V PSU with a SATA connector plugged directly onto the SupaHub > adapter. > > I'm getting the same trace in all cases. > > > Are you able to boot the kernel before/after if you disconnect any USB > > peripheral? > > The trace is obtained without any USB peripheral (on the extension card or > on the onboard USB slots). Besides the SupaHub adapter on the PCIe slot, I > only have Ethernet and HDMI plugged in (I'm getting traces via serial > console, and operating the system over SSH when it boots fine). > > As soon as I unplug the SupaHub board itself, the system boots fine. > > > Does that SupaHub board plugged to the CM4 1.0 system work fine in the > > Raspberry Pi kernel tree? > > With the Raspberry Pi OS (64-bit) > Raspberry Pi OS Lite image > (2023-02-21-raspios-bullseye-arm64-lite.img.xz), the system at least > boots fine; I haven't tried adding devices to the mix just yet, trying > to stick with that particular setup. > > A full dmesg is attached (dmesg-raspios.txt). > > > Cheers, > -- > Cyril Brulebois (kibi@debian.org) <https://debamax.com/> > D-I release manager -- Release team member -- Freelance Consultant
Hi Jim, It might take a few days before getting back to you regarding the various questions you asked. To be on the safe side, I'll probably run a cold boot for each setup a number of times (e.g. 10), so that any possible outlier can be spotted/rejected. And maybe share results off list once we have a better understanding of what's going on. Does that make sense to you? I'll cover a particular topic right away though. Jim Quinlan <jim2101024@gmail.com> (2023-04-19): > Second, you say that you used a different "CM4" from the one used in > the Bugzilla report -- what do you mean by that? Are you referring > to the CM4 module proper, whose only change was going from 4GB to 8GB, > or the IO board being used? My testing is done with a standard RPi > CM4 and standard RPi IO board. Before this patch series, the only way > this standard configuration can work for most hobbyist PCI cards (i.e. > floating CLKREQ# pin) is by using Raspian AND adding > "brcm,enable-l1ss" to the DT node. Regarding the IO Board, I'm using the official Compute Module 4 IO Board: https://www.raspberrypi.com/products/compute-module-4-io-board/ I've been using the very same IO Board for all my testing, and what I'm changing is the standard RPi CM4 plugged on it. > I'm guessing that you are using the configuration that you described > to me in a personal email: "[the] chip is embedded directly on the > modified PCB, as opposed to plugged into the PCIe slot on the official > CM4 IO Board". If true, you are testing on a configuration that (a) > is unique to you and your group and (b) must be doing something with > the CLKREQ# signal so that your "before" case does not panic. Is this > the case? That's definitely not the case. True, as mentioned in a personal mail, we've seen problems with a custom PCB, derived from the CM4 IO Board design, but of course there could be some faulty design at work there… So we've first researched whether the same problem could be produced with consumer grade products, and once we've verified that, I opened #217276 on Bugzilla. Since Florian's testing seems overwhelmingly positive, and since I'm seeing definitive improvement with at least one CM4, maybe it would make sense not to block the patch series on the kernel panic I'm seeing with the other CM4, and track that particular issue via a separate bug? Cheers,
diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c index edf283e2b5dd..56b96aa02221 100644 --- a/drivers/pci/controller/pcie-brcmstb.c +++ b/drivers/pci/controller/pcie-brcmstb.c @@ -48,10 +48,17 @@ #define PCIE_RC_CFG_PRIV1_LINK_CAPABILITY 0x04dc #define PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK 0xc00 +#define PCIE_RC_CFG_PRIV1_ROOT_CAP 0x4f8 +#define PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK 0xf8 + #define PCIE_RC_DL_MDIO_ADDR 0x1100 #define PCIE_RC_DL_MDIO_WR_DATA 0x1104 #define PCIE_RC_DL_MDIO_RD_DATA 0x1108 +#define PCIE_0_RC_PL_PHY_DBG_CLKREQ2_0 0x1e30 +#define CLKREQ2_0_CLKREQ_IN_CNT_MASK 0x3f000000 +#define CLKREQ2_0_CLKREQ_IN_MASK 0x40000000 + #define PCIE_MISC_MISC_CTRL 0x4008 #define PCIE_MISC_MISC_CTRL_PCIE_RCB_64B_MODE_MASK 0x80 #define PCIE_MISC_MISC_CTRL_PCIE_RCB_MPS_MODE_MASK 0x400 @@ -121,9 +128,12 @@ #define PCIE_MISC_HARD_PCIE_HARD_DEBUG 0x4204 #define PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK 0x2 +#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK 0x200000 #define PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK 0x08000000 #define PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK 0x00800000 - +#define PCIE_CLKREQ_MASK \ + (PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \ + PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK) #define PCIE_INTR2_CPU_BASE 0x4300 #define PCIE_MSI_INTR2_BASE 0x4500 @@ -1024,13 +1034,58 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie) return 0; } +static void brcm_config_clkreq(struct brcm_pcie *pcie) +{ + bool l1ss = of_property_read_bool(pcie->np, "brcm,enable-l1ss"); + void __iomem *base = pcie->base; + u32 clkreq_set, tmp = readl(base + PCIE_0_RC_PL_PHY_DBG_CLKREQ2_0); + bool clkreq_in_seen; + + /* + * We have "seen" CLKREQ# if it is asserted or has been in the past. + * Note that the CLKREQ_IN_MASK is 1 if CLKREQ# is asserted. + */ + clkreq_in_seen = !!(tmp & CLKREQ2_0_CLKREQ_IN_MASK) || + !!FIELD_GET(CLKREQ2_0_CLKREQ_IN_CNT_MASK, tmp); + + /* Start with safest setting where we provide refclk regardless */ + clkreq_set = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG) & + ~PCIE_CLKREQ_MASK; + + if (l1ss && IS_ENABLED(CONFIG_PCIEASPM)) { + /* + * Note: This (l1ss) mode may not meet requirement for + * downstream devices that require CLKREQ# assertion to + * clock active within 400ns. + */ + clkreq_set |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK; + dev_info(pcie->dev, "bi-dir CLKREQ# for l1ss-capable devs\n"); + dev_info(pcie->dev, "ASPM policy should be set to \"powersupersave\"\n"); + } else { + if (clkreq_in_seen && IS_ENABLED(CONFIG_PCIEASPM)) { + clkreq_set |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK; + dev_info(pcie->dev, "uni-dir CLKREQ# for ASPM\n"); + } else { + dev_info(pcie->dev, "CLKREQ# ignored; no ASPM\n"); + /* Might as well unadvertise ASPM */ + tmp = readl(base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY) & + ~PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK; + writel(tmp, base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY); + } + /* Setting the field to 2 unadvertises L1SS support */ + tmp = readl(base + PCIE_RC_CFG_PRIV1_ROOT_CAP); + u32p_replace_bits(&tmp, 2, PCIE_RC_CFG_PRIV1_ROOT_CAP_L1SS_MODE_MASK); + writel(tmp, base + PCIE_RC_CFG_PRIV1_ROOT_CAP); + } + writel(clkreq_set, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG); +} + static int brcm_pcie_start_link(struct brcm_pcie *pcie) { struct device *dev = pcie->dev; void __iomem *base = pcie->base; u16 nlw, cls, lnksta; bool ssc_good = false; - u32 tmp; int ret, i; /* Unassert the fundamental reset */ @@ -1055,6 +1110,8 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie) return -ENODEV; } + brcm_config_clkreq(pcie); + if (pcie->gen) brcm_pcie_set_gen(pcie, pcie->gen); @@ -1073,14 +1130,6 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie) pci_speed_string(pcie_link_speed[cls]), nlw, ssc_good ? "(SSC)" : "(!SSC)"); - /* - * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1 - * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1. - */ - tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG); - tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK; - writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG); - return 0; }
The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be deliberately set by the probe() into one of three mutually exclusive modes: (a) No CLKREQ# expected or required, refclk is always available. (b) CLKREQ# is expected to be driven by downstream device when needed. (c) Bidirectional CLKREQ# for L1SS capable devices. Previously, only (b) was supported by the driver, as almost all STB/CM boards operate in this mode. But now there is interest in activating L1SS power savings from STB/CM customers, and also interest in accomodating mode (a) for designs such as the RPi CM4 with IO board. The HW+driver is able to tell us when mode (a) mode is needed. But there is no easy way to tell if L1SS mode should be configured. In certain situations, getting this wrong may cause a panic during boot time. So we rely on the DT prop "brcm,enable-l1ss" to tell us when mode (c) is desired. Using this mode only makes sense when the downstream device is L1SS-capable and the OS has been configured to activate L1SS (e.g. policy==powersupersave). This property has already been in use by Raspian Linux, but this immplementation adds more details and discerns between (a) and (b) automatically. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276 Signed-off-by: Jim Quinlan <jim2101024@gmail.com> --- drivers/pci/controller/pcie-brcmstb.c | 69 +++++++++++++++++++++++---- 1 file changed, 59 insertions(+), 10 deletions(-)