[RFC] irqchip/gic-v3: Claim iomem resources

Message ID	c534c2a458a3bf94ccdae8abc6edc3d45a689c30.1649777295.git.robin.murphy@arm.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org> From: Robin Murphy <robin.murphy@arm.com> To: maz@kernel.org, tglx@linutronix.de Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH] irqchip/gic-v3: Claim iomem resources Date: Tue, 12 Apr 2022 16:28:15 +0100 Message-Id: <c534c2a458a3bf94ccdae8abc6edc3d45a689c30.1649777295.git.robin.murphy@arm.com> MIME-Version: 1.0 Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org
Series	[RFC] irqchip/gic-v3: Claim iomem resources \| expand [RFC] irqchip/gic-v3: Claim iomem resources

Robin Murphy April 12, 2022, 3:28 p.m. UTC

As a simple quality-of-life tweak, claim our MMIO regions when mapping
them, such that the GIC shows up in /proc/iomem. No effort is spent on
trying to release them, since frankly if the GIC fails to probe then
it's never getting a second try anyway.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>

---

I briefly looked at doing the same for GICv2, but quickly decided that
GICv2 isn't interesting enough to be worth the (greater) bother...

Lightly tested with ACPI.
---
 drivers/irqchip/irq-gic-v3.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Tim Harvey Feb. 28, 2023, 11:11 p.m. UTC | #1

On Tue, Apr 12, 2022 at 8:29 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> As a simple quality-of-life tweak, claim our MMIO regions when mapping
> them, such that the GIC shows up in /proc/iomem. No effort is spent on
> trying to release them, since frankly if the GIC fails to probe then
> it's never getting a second try anyway.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>
> ---
>
> I briefly looked at doing the same for GICv2, but quickly decided that
> GICv2 isn't interesting enough to be worth the (greater) bother...
>
> Lightly tested with ACPI.
> ---
>  drivers/irqchip/irq-gic-v3.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index b252d5534547..9815b692a47a 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1980,10 +1980,10 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>         u32 nr_redist_regions;
>         int err, i;
>
> -       dist_base = of_iomap(node, 0);
> -       if (!dist_base) {
> +       dist_base = of_io_request_and_map(node, 0, "GICD");
> +       if (IS_ERR(dist_base)) {
>                 pr_err("%pOF: unable to map gic dist registers\n", node);
> -               return -ENXIO;
> +               return PTR_ERR(dist_base);
>         }
>
>         err = gic_validate_dist_version(dist_base);
> @@ -2007,8 +2007,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>                 int ret;
>
>                 ret = of_address_to_resource(node, 1 + i, &res);
> -               rdist_regs[i].redist_base = of_iomap(node, 1 + i);
> -               if (ret || !rdist_regs[i].redist_base) {
> +               rdist_regs[i].redist_base = of_io_request_and_map(node, 1 + i, "GICR");
> +               if (ret || IS_ERR(rdist_regs[i].redist_base)) {
>                         pr_err("%pOF: couldn't map region %d\n", node, i);
>                         err = -ENODEV;
>                         goto out_unmap_rdist;
> @@ -2034,7 +2034,7 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>
>  out_unmap_rdist:
>         for (i = 0; i < nr_redist_regions; i++)
> -               if (rdist_regs[i].redist_base)
> +               if (rdist_regs[i].redist_base && !IS_ERR(rdist_regs[i].redist_base))
>                         iounmap(rdist_regs[i].redist_base);
>         kfree(rdist_regs);
>  out_unmap_dist:
> @@ -2081,6 +2081,7 @@ gic_acpi_parse_madt_redist(union acpi_subtable_headers *header,
>                 pr_err("Couldn't map GICR region @%llx\n", redist->base_address);
>                 return -ENOMEM;
>         }
> +       request_mem_region(redist->base_address, redist->length, "GICR");
>
>         gic_acpi_register_redist(redist->base_address, redist_base);
>         return 0;
> @@ -2103,6 +2104,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
>         redist_base = ioremap(gicc->gicr_base_address, size);
>         if (!redist_base)
>                 return -ENOMEM;
> +       request_mem_region(gicc->gicr_base_address, size, "GICR");
>
>         gic_acpi_register_redist(gicc->gicr_base_address, redist_base);
>         return 0;
> @@ -2304,6 +2306,7 @@ gic_acpi_init(union acpi_subtable_headers *header, const unsigned long end)
>                 pr_err("Unable to map GICD registers\n");
>                 return -ENOMEM;
>         }
> +       request_mem_region(dist->base_address, ACPI_GICV3_DIST_MEM_SIZE, "GICD");
>
>         err = gic_validate_dist_version(acpi_data.dist_base);
>         if (err) {
> --
> 2.28.0.dirty
>

Greetings,

I have bisected a kernel issue where octeontx (CN803x) will hang on
reboot caused by commit c0db06fd0993 ("mmc: core: Disable card detect
during shutdown"). This commit made it into 5.16 and stable kernels.
I've found that the patch here which is commit 2b2cd74a06c3 resolves
this hang but I'm not entirely clear why.

Does anyone have a good explanation of why the hang occurs in the
first place and why this resolves it? I would like to get the proper
fix into the affected stable branches.

Best Regards,

Tim

Marc Zyngier Feb. 28, 2023, 11:22 p.m. UTC | #2

On Tue, 28 Feb 2023 23:11:15 +0000,
Tim Harvey <tharvey@gateworks.com> wrote:
> 
> I have bisected a kernel issue where octeontx (CN803x) will hang on
> reboot caused by commit c0db06fd0993 ("mmc: core: Disable card detect
> during shutdown"). This commit made it into 5.16 and stable kernels.
> I've found that the patch here which is commit 2b2cd74a06c3 resolves
> this hang but I'm not entirely clear why.
> 
> Does anyone have a good explanation of why the hang occurs in the
> first place and why this resolves it? I would like to get the proper
> fix into the affected stable branches.

Wild guess: the reservation prevents some other driver from probing
because the firmware describes overlapping ranges, and that driver is
what is causing your above hang.

But you don't give much information that would allow this theory to be
checked. I guess you'll have to instrument your kernel and find out.

	M.

Robin Murphy March 1, 2023, 4:12 p.m. UTC | #3

On 2023-02-28 23:22, Marc Zyngier wrote:
> On Tue, 28 Feb 2023 23:11:15 +0000,
> Tim Harvey <tharvey@gateworks.com> wrote:
>>
>> I have bisected a kernel issue where octeontx (CN803x) will hang on
>> reboot caused by commit c0db06fd0993 ("mmc: core: Disable card detect
>> during shutdown"). This commit made it into 5.16 and stable kernels.
>> I've found that the patch here which is commit 2b2cd74a06c3 resolves
>> this hang but I'm not entirely clear why.
>>
>> Does anyone have a good explanation of why the hang occurs in the
>> first place and why this resolves it? I would like to get the proper
>> fix into the affected stable branches.
> 
> Wild guess: the reservation prevents some other driver from probing
> because the firmware describes overlapping ranges, and that driver is
> what is causing your above hang.

Indeed, according to [1], the GIC appears to overlap one of the "PCIe" 
windows of &ecam0, which conveniently appears to be the parent of the 
MMC controller as well.

Robin.

[1] 
https://github.com/Gateworks/dts-newport/blob/sdk-10.1.1.0-newport/cn81xx-linux.dtsi

Marc Zyngier March 1, 2023, 4:24 p.m. UTC | #4

On Wed, 01 Mar 2023 16:12:58 +0000,
Robin Murphy <robin.murphy@arm.com> wrote:
> 
> On 2023-02-28 23:22, Marc Zyngier wrote:
> > On Tue, 28 Feb 2023 23:11:15 +0000,
> > Tim Harvey <tharvey@gateworks.com> wrote:
> >> 
> >> I have bisected a kernel issue where octeontx (CN803x) will hang on
> >> reboot caused by commit c0db06fd0993 ("mmc: core: Disable card detect
> >> during shutdown"). This commit made it into 5.16 and stable kernels.
> >> I've found that the patch here which is commit 2b2cd74a06c3 resolves
> >> this hang but I'm not entirely clear why.
> >> 
> >> Does anyone have a good explanation of why the hang occurs in the
> >> first place and why this resolves it? I would like to get the proper
> >> fix into the affected stable branches.
> > 
> > Wild guess: the reservation prevents some other driver from probing
> > because the firmware describes overlapping ranges, and that driver is
> > what is causing your above hang.
> 
> Indeed, according to [1], the GIC appears to overlap one of the "PCIe"
> windows of &ecam0, which conveniently appears to be the parent of the
> MMC controller as well.

Huh, quality firmware. So clearly nothing to backport. Just a DT
update to distribute... :-/

	M.

Tim Harvey March 1, 2023, 5:29 p.m. UTC | #5

On Wed, Mar 1, 2023 at 8:13 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2023-02-28 23:22, Marc Zyngier wrote:
> > On Tue, 28 Feb 2023 23:11:15 +0000,
> > Tim Harvey <tharvey@gateworks.com> wrote:
> >>
> >> I have bisected a kernel issue where octeontx (CN803x) will hang on
> >> reboot caused by commit c0db06fd0993 ("mmc: core: Disable card detect
> >> during shutdown"). This commit made it into 5.16 and stable kernels.
> >> I've found that the patch here which is commit 2b2cd74a06c3 resolves
> >> this hang but I'm not entirely clear why.
> >>
> >> Does anyone have a good explanation of why the hang occurs in the
> >> first place and why this resolves it? I would like to get the proper
> >> fix into the affected stable branches.
> >
> > Wild guess: the reservation prevents some other driver from probing
> > because the firmware describes overlapping ranges, and that driver is
> > what is causing your above hang.
>
> Indeed, according to [1], the GIC appears to overlap one of the "PCIe"
> windows of &ecam0, which conveniently appears to be the parent of the
> MMC controller as well.
>
> Robin.
>
> [1]
> https://github.com/Gateworks/dts-newport/blob/sdk-10.1.1.0-newport/cn81xx-linux.dtsi

Robin and Marc,

Thanks! Yes, this patch exposes an issue in my dt which ends up
killing off ecam0 thus hiding the original 'hang on reboot' issue I
was trying to find related to mmc with a quick bisect.

As you expected ecam0 fails probe:
[    0.977514] pci-host-generic: probe of 848000000000.pci failed with error -12

Cavium/Marvell never mainlined their cn81xx device-tree and their
latest SDK appears to still have this issue. It looks to me like the
first mem range in ecam0 is just wrong and needs to be removed:
@@ -226,8 +226,7 @@ ecam0: pci@848000000000 {
                        u-boot,dm-pre-reloc;
                        dma-coherent;
                        reg = <0x8480 0x00000000 0 0x02000000>;  /*
Configuration space
 */
-                       ranges = <0x03000000 0x8010 0x00000000 0x8010
0x00000000 0x080
0x00000000>, /* mem ranges */
-                                <0x03000000 0x8100 0x00000000 0x8100
0x00000000 0x80
0x00000000>, /* SATA */
+                       ranges = <0x03000000 0x8100 0x00000000 0x8100
0x00000000 0x80
0x00000000>, /* SATA */
                                 <0x03000000 0x8680 0x00000000 0x8680
0x00000000 0x160
0x28000000>, /* UARTs */
                                 <0x03000000 0x87e0 0x2c000000 0x87e0
0x2c000000 0x000
0x94000000>, /* PEMs */
                                 <0x03000000 0x8400 0x00000000 0x8400
0x00000000 0x010
0x00000000>, /* RNM */

This results in:
[    0.911162] pci-host-generic 848000000000.pci: host bridge
/soc@0/pci@848000000000 ranges:
[    0.919506] pci-host-generic 848000000000.pci:      MEM
0x810000000000..0x817fffffffff -> 0x810000000000
[    0.929018] pci-host-generic 848000000000.pci:      MEM
0x868000000000..0x87e027ffffff -> 0x868000000000
[    0.938531] pci-host-generic 848000000000.pci:      MEM
0x87e02c000000..0x87e0bfffffff -> 0x87e02c000000
[    0.948039] pci-host-generic 848000000000.pci:      MEM
0x840000000000..0x840fffffffff -> 0x840000000000
[    0.957543] pci-host-generic 848000000000.pci:      MEM
0x843000000000..0x8431ffffffff -> 0x843000000000
[    0.967054] pci-host-generic 848000000000.pci:      MEM
0x87e0c6000000..0x87ffffffffff -> 0x87e0c6000000
[    0.976562] pci-host-generic 848000000000.pci: Memory resource size
exceeds max for 32 bits
[    0.984923] pci-host-generic 848000000000.pci: Memory resource size
exceeds max for 32 bits
[    0.993284] pci-host-generic 848000000000.pci: Memory resource size
exceeds max for 32 bits
[    1.001645] pci-host-generic 848000000000.pci: Memory resource size
exceeds max for 32 bits
[    1.010008] pci-host-generic 848000000000.pci: Memory resource size
exceeds max for 32 bits
[    1.018404] pci-host-generic 848000000000.pci: ECAM at [mem
0x848000000000-0x848001ffffff] for [bus 00-1f]
[    1.028240] pci-host-generic 848000000000.pci: PCI host bridge to bus 0000:00
[    1.035412] pci_bus 0000:00: root bus resource [bus 00-1f]
[    1.040907] pci_bus 0000:00: root bus resource [mem
0x810000000000-0x817fffffffff]
[    1.048488] pci_bus 0000:00: root bus resource [mem
0x868000000000-0x87e027ffffff]
[    1.056070] pci_bus 0000:00: root bus resource [mem
0x87e02c000000-0x87e0bfffffff]
[    1.063652] pci_bus 0000:00: root bus resource [mem
0x840000000000-0x840fffffffff]
[    1.071232] pci_bus 0000:00: root bus resource [mem
0x843000000000-0x8431ffffffff]
[    1.078815] pci_bus 0000:00: root bus resource [mem
0x87e0c6000000-0x87ffffffffff]

and all of the PCI devices appear to work fine. Is there an additional
issue I need to work on regarding the 'Memory resource size exceeds
max for 32 bits' warning above?

Best Regards,

Tim

[RFC] irqchip/gic-v3: Claim iomem resources

Commit Message

Comments

Patch