Message ID | 20241114200133.3069460-1-samuel.holland@sifive.com (mailing list archive) |
---|---|
State | New |
Series | irqchip/riscv-aplic: Fix crash when MSI domain is missing |
On Fri, Nov 15, 2024 at 1:31 AM Samuel Holland <samuel.holland@sifive.com> wrote:
>
> If the APLIC driver is probed before the IMSIC driver, the parent MSI
> domain will be missing, which causes a NULL pointer dereference in
> msi_create_device_irq_domain(). Avoid this by deferring probe until the
> parent MSI domain is available. Use dev_err_probe() to avoid printing an
> error message when returning -EPROBE_DEFER.

The -EPROBE_DEFER is not needed because we expect platforms to use the
"msi-parent" DT property in the APLIC DT node, which in turn allows the
Linux DD framework to re-order probing based on fw_devlink dependencies.
The APLIC DT bindings mandate that one of the "interrupts-extended" or
"msi-parent" DT properties MUST be present.

Can you elaborate a bit more on how you are hitting this issue?

Regards,
Anup

>
> Fixes: ca8df97fe679 ("irqchip/riscv-aplic: Add support for MSI-mode")
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
>
>  drivers/irqchip/irq-riscv-aplic-main.c | 3 ++-
>  drivers/irqchip/irq-riscv-aplic-msi.c  | 3 +++
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
> index 900e72541db9..93e7c51f944a 100644
> --- a/drivers/irqchip/irq-riscv-aplic-main.c
> +++ b/drivers/irqchip/irq-riscv-aplic-main.c
> @@ -207,7 +207,8 @@ static int aplic_probe(struct platform_device *pdev)
>  	else
>  		rc = aplic_direct_setup(dev, regs);
>  	if (rc)
> -		dev_err(dev, "failed to setup APLIC in %s mode\n", msi_mode ? "MSI" : "direct");
> +		dev_err_probe(dev, rc, "failed to setup APLIC in %s mode\n",
> +			      msi_mode ? "MSI" : "direct");
>
>  #ifdef CONFIG_ACPI
>  	if (!acpi_disabled)
> diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
> index 945bff28265c..fb8d1838609f 100644
> --- a/drivers/irqchip/irq-riscv-aplic-msi.c
> +++ b/drivers/irqchip/irq-riscv-aplic-msi.c
> @@ -266,6 +266,9 @@ int aplic_msi_setup(struct device *dev, void __iomem *regs)
>  			if (msi_domain)
>  				dev_set_msi_domain(dev, msi_domain);
>  		}
> +
> +		if (!dev_get_msi_domain(dev))
> +			return -EPROBE_DEFER;
>  	}
>
>  	if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN, &aplic_msi_template,
> --
> 2.45.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Hi Anup,

On 2024-11-15 9:42 AM, Anup Patel wrote:
> On Fri, Nov 15, 2024 at 1:31 AM Samuel Holland
> <samuel.holland@sifive.com> wrote:
>>
>> If the APLIC driver is probed before the IMSIC driver, the parent MSI
>> domain will be missing, which causes a NULL pointer dereference in
>> msi_create_device_irq_domain(). Avoid this by deferring probe until the
>> parent MSI domain is available. Use dev_err_probe() to avoid printing an
>> error message when returning -EPROBE_DEFER.
>
> The -EPROBE_DEFER is not needed because we expect platforms to use the
> "msi-parent" DT property in the APLIC DT node, which in turn allows the
> Linux DD framework to re-order probing based on fw_devlink dependencies.
> The APLIC DT bindings mandate that one of the "interrupts-extended" or
> "msi-parent" DT properties MUST be present.
>
> Can you elaborate a bit more on how you are hitting this issue?

I agree that fw_devlink should help avoid the situation where we need to return
-EPROBE_DEFER, but the kernel must still not crash even if fw_devlink is
disabled (which is a perfectly valid thing to do: "fw_devlink=off" on the kernel
command line) or if fw_devlink fails to come up with the ideal probe order.
fw_devlink is an optimization. It should not be relied on for correctness. In my
specific case, fw_devlink got the order wrong due to some false dependency
cycles, which I sent a patch for separately[1].

Regards,
Samuel

[1]: https://lore.kernel.org/lkml/20241114195652.3068725-1-samuel.holland@sifive.com/

>> Fixes: ca8df97fe679 ("irqchip/riscv-aplic: Add support for MSI-mode")
>> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
>> ---
>>
>>  drivers/irqchip/irq-riscv-aplic-main.c | 3 ++-
>>  drivers/irqchip/irq-riscv-aplic-msi.c  | 3 +++
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
>> index 900e72541db9..93e7c51f944a 100644
>> --- a/drivers/irqchip/irq-riscv-aplic-main.c
>> +++ b/drivers/irqchip/irq-riscv-aplic-main.c
>> @@ -207,7 +207,8 @@ static int aplic_probe(struct platform_device *pdev)
>>  	else
>>  		rc = aplic_direct_setup(dev, regs);
>>  	if (rc)
>> -		dev_err(dev, "failed to setup APLIC in %s mode\n", msi_mode ? "MSI" : "direct");
>> +		dev_err_probe(dev, rc, "failed to setup APLIC in %s mode\n",
>> +			      msi_mode ? "MSI" : "direct");
>>
>>  #ifdef CONFIG_ACPI
>>  	if (!acpi_disabled)
>> diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
>> index 945bff28265c..fb8d1838609f 100644
>> --- a/drivers/irqchip/irq-riscv-aplic-msi.c
>> +++ b/drivers/irqchip/irq-riscv-aplic-msi.c
>> @@ -266,6 +266,9 @@ int aplic_msi_setup(struct device *dev, void __iomem *regs)
>>  			if (msi_domain)
>>  				dev_set_msi_domain(dev, msi_domain);
>>  		}
>> +
>> +		if (!dev_get_msi_domain(dev))
>> +			return -EPROBE_DEFER;
>>  	}
>>
>>  	if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN, &aplic_msi_template,
>> --
>> 2.45.1
>>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
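The argument that correctness should not depend on probe order can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every model_* name is hypothetical, the retry loop is a simplified stand-in for the driver core's deferred probe handling, and the only value taken from the kernel is the numeric EPROBE_DEFER (517).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define MODEL_EPROBE_DEFER 517
#define MODEL_MAX_DRIVERS 8

/* Hypothetical stand-in for "the parent MSI domain exists". */
struct model_state {
	bool imsic_domain_ready;
	bool aplic_ready;
};

typedef int (*model_probe_fn)(struct model_state *st);

/* The "IMSIC" model driver publishes the parent domain. */
static int model_imsic_probe(struct model_state *st)
{
	st->imsic_domain_ready = true;
	return 0;
}

/* The "APLIC" model driver defers instead of using a missing parent. */
static int model_aplic_probe(struct model_state *st)
{
	if (!st->imsic_domain_ready)
		return -MODEL_EPROBE_DEFER;
	st->aplic_ready = true;
	return 0;
}

/*
 * Probe every driver in the given order and retry the deferred ones.
 * Returns the number of passes needed, or -1 if a driver reports a
 * real error or a whole pass makes no progress.
 */
static int model_probe_all(struct model_state *st,
			   const model_probe_fn *fns, size_t n)
{
	bool done[MODEL_MAX_DRIVERS] = { false };
	size_t remaining = n;
	int passes = 0;

	assert(n <= MODEL_MAX_DRIVERS);

	while (remaining > 0) {
		size_t progressed = 0;

		passes++;
		for (size_t i = 0; i < n; i++) {
			int rc;

			if (done[i])
				continue;
			rc = fns[i](st);
			if (rc == 0) {
				done[i] = true;
				progressed++;
			} else if (rc != -MODEL_EPROBE_DEFER) {
				return -1;
			}
		}
		if (progressed == 0)
			return -1;
		remaining -= progressed;
	}
	return passes;
}
```

In this model, calling model_probe_all() with the IMSIC entry first finishes in one pass, and with the APLIC entry first it finishes in two, so both orders end with the APLIC model ready. An ordering hint such as fw_devlink only saves the extra pass.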
On Fri, Nov 15, 2024 at 9:27 PM Samuel Holland <samuel.holland@sifive.com> wrote:
>
> Hi Anup,
>
> On 2024-11-15 9:42 AM, Anup Patel wrote:
> > On Fri, Nov 15, 2024 at 1:31 AM Samuel Holland
> > <samuel.holland@sifive.com> wrote:
> >>
> >> If the APLIC driver is probed before the IMSIC driver, the parent MSI
> >> domain will be missing, which causes a NULL pointer dereference in
> >> msi_create_device_irq_domain(). Avoid this by deferring probe until the
> >> parent MSI domain is available. Use dev_err_probe() to avoid printing an
> >> error message when returning -EPROBE_DEFER.
> >
> > The -EPROBE_DEFER is not needed because we expect platforms to use the
> > "msi-parent" DT property in the APLIC DT node, which in turn allows the
> > Linux DD framework to re-order probing based on fw_devlink dependencies.
> > The APLIC DT bindings mandate that one of the "interrupts-extended" or
> > "msi-parent" DT properties MUST be present.
> >
> > Can you elaborate a bit more on how you are hitting this issue?
>
> I agree that fw_devlink should help avoid the situation where we need to return
> -EPROBE_DEFER, but the kernel must still not crash even if fw_devlink is
> disabled (which is a perfectly valid thing to do: "fw_devlink=off" on the kernel
> command line) or if fw_devlink fails to come up with the ideal probe order.
> fw_devlink is an optimization. It should not be relied on for correctness. In my
> specific case, fw_devlink got the order wrong due to some false dependency
> cycles, which I sent a patch for separately[1].

The RISC-V kernel is heavily dependent on fw_devlink-based probe ordering,
and more upcoming drivers are going to increase this dependency. For
example, we also have the RISC-V IOMMU driver, which needs to be probed
after IMSIC since it can use MSIs.

I think we should ensure that fw_devlink can't be disabled/turned off for
the RISC-V kernel. If this is not possible, then we should have a very
verbose boot-time warning when fw_devlink is disabled/turned off.

Your other "interrupt-parent" related fix [1] looks fine to me.

Regards,
Anup

>
> Regards,
> Samuel
>
> [1]:
> https://lore.kernel.org/lkml/20241114195652.3068725-1-samuel.holland@sifive.com/
>
> >> Fixes: ca8df97fe679 ("irqchip/riscv-aplic: Add support for MSI-mode")
> >> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> >> ---
> >>
> >>  drivers/irqchip/irq-riscv-aplic-main.c | 3 ++-
> >>  drivers/irqchip/irq-riscv-aplic-msi.c  | 3 +++
> >>  2 files changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
> >> index 900e72541db9..93e7c51f944a 100644
> >> --- a/drivers/irqchip/irq-riscv-aplic-main.c
> >> +++ b/drivers/irqchip/irq-riscv-aplic-main.c
> >> @@ -207,7 +207,8 @@ static int aplic_probe(struct platform_device *pdev)
> >>  	else
> >>  		rc = aplic_direct_setup(dev, regs);
> >>  	if (rc)
> >> -		dev_err(dev, "failed to setup APLIC in %s mode\n", msi_mode ? "MSI" : "direct");
> >> +		dev_err_probe(dev, rc, "failed to setup APLIC in %s mode\n",
> >> +			      msi_mode ? "MSI" : "direct");
> >>
> >>  #ifdef CONFIG_ACPI
> >>  	if (!acpi_disabled)
> >> diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
> >> index 945bff28265c..fb8d1838609f 100644
> >> --- a/drivers/irqchip/irq-riscv-aplic-msi.c
> >> +++ b/drivers/irqchip/irq-riscv-aplic-msi.c
> >> @@ -266,6 +266,9 @@ int aplic_msi_setup(struct device *dev, void __iomem *regs)
> >>  			if (msi_domain)
> >>  				dev_set_msi_domain(dev, msi_domain);
> >>  		}
> >> +
> >> +		if (!dev_get_msi_domain(dev))
> >> +			return -EPROBE_DEFER;
> >>  	}
> >>
> >>  	if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN, &aplic_msi_template,
> >> --
> >> 2.45.1
> >>
> >>
> >> _______________________________________________
> >> linux-riscv mailing list
> >> linux-riscv@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
index 900e72541db9..93e7c51f944a 100644
--- a/drivers/irqchip/irq-riscv-aplic-main.c
+++ b/drivers/irqchip/irq-riscv-aplic-main.c
@@ -207,7 +207,8 @@ static int aplic_probe(struct platform_device *pdev)
 	else
 		rc = aplic_direct_setup(dev, regs);
 	if (rc)
-		dev_err(dev, "failed to setup APLIC in %s mode\n", msi_mode ? "MSI" : "direct");
+		dev_err_probe(dev, rc, "failed to setup APLIC in %s mode\n",
+			      msi_mode ? "MSI" : "direct");
 
 #ifdef CONFIG_ACPI
 	if (!acpi_disabled)
diff --git a/drivers/irqchip/irq-riscv-aplic-msi.c b/drivers/irqchip/irq-riscv-aplic-msi.c
index 945bff28265c..fb8d1838609f 100644
--- a/drivers/irqchip/irq-riscv-aplic-msi.c
+++ b/drivers/irqchip/irq-riscv-aplic-msi.c
@@ -266,6 +266,9 @@ int aplic_msi_setup(struct device *dev, void __iomem *regs)
 			if (msi_domain)
 				dev_set_msi_domain(dev, msi_domain);
 		}
+
+		if (!dev_get_msi_domain(dev))
+			return -EPROBE_DEFER;
 	}
 
 	if (!msi_create_device_irq_domain(dev, MSI_DEFAULT_DOMAIN, &aplic_msi_template,
If the APLIC driver is probed before the IMSIC driver, the parent MSI
domain will be missing, which causes a NULL pointer dereference in
msi_create_device_irq_domain(). Avoid this by deferring probe until the
parent MSI domain is available. Use dev_err_probe() to avoid printing an
error message when returning -EPROBE_DEFER.

Fixes: ca8df97fe679 ("irqchip/riscv-aplic: Add support for MSI-mode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---

 drivers/irqchip/irq-riscv-aplic-main.c | 3 ++-
 drivers/irqchip/irq-riscv-aplic-msi.c  | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)
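
The logging behavior the commit message relies on can be sketched in userspace as follows. This is a minimal imitation of the dev_err_probe() contract, not the kernel implementation: sketch_err_probe() and sketch_error_lines are hypothetical names, the message prefix is made up, and the only facts assumed from the kernel are that the helper returns the error code it was given and does not log at error level when that code is -EPROBE_DEFER (517).

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define SKETCH_EPROBE_DEFER 517

/* Counts messages that would have been printed at error level. */
static int sketch_error_lines;

/*
 * Userspace imitation of the dev_err_probe() contract: hand back err
 * unchanged, and stay quiet at error level when the probe is only
 * being deferred.
 */
static int sketch_err_probe(int err, const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);

	/* A deferral is not a failure; the kernel logs it at debug level. */
	if (err == -SKETCH_EPROBE_DEFER)
		return err;

	va_start(ap, fmt);
	fprintf(stderr, "error %d: ", err);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	sketch_error_lines++;

	return err;
}
```

With this model, passing -517 leaves sketch_error_lines untouched, while any other negative code prints one line and increments it. That is the effect the patch wants in aplic_probe(): a deferred APLIC probe no longer produces a "failed to setup APLIC" error in the log.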