Message ID | 20200718000637.3632841-3-saravanak@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | irqchip: Add IRQCHIP_PLATFORM_DRIVER helper macros |
On Fri, Jul 17, 2020 at 5:06 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Switch the driver to use the helper macros. In addition to reducing the
> number of lines, this also adds module unload protection (if the driver
> is compiled as a module) by switching from module_platform_driver to
> builtin_platform_driver.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
>  1 file changed, 3 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
> index 5b624e3295e4..c1c5dfad57cc 100644
> --- a/drivers/irqchip/qcom-pdc.c
> +++ b/drivers/irqchip/qcom-pdc.c
> @@ -432,28 +432,8 @@ static int qcom_pdc_init(struct device_node *node, struct device_node *parent)
>  	return ret;
>  }
>
> -static int qcom_pdc_probe(struct platform_device *pdev)
> -{
> -	struct device_node *np = pdev->dev.of_node;
> -	struct device_node *parent = of_irq_find_parent(np);
> -
> -	return qcom_pdc_init(np, parent);
> -}
> -
> -static const struct of_device_id qcom_pdc_match_table[] = {
> -	{ .compatible = "qcom,pdc" },
> -	{}
> -};
> -MODULE_DEVICE_TABLE(of, qcom_pdc_match_table);
> -
> -static struct platform_driver qcom_pdc_driver = {
> -	.probe = qcom_pdc_probe,
> -	.driver = {
> -		.name = "qcom-pdc",
> -		.of_match_table = qcom_pdc_match_table,
> -		.suppress_bind_attrs = true,
> -	},
> -};
> -module_platform_driver(qcom_pdc_driver);
> +IRQCHIP_PLATFORM_DRIVER_BEGIN(qcom_pdc)
> +IRQCHIP_MATCH("qcom,pdc", qcom_pdc_init)
> +IRQCHIP_PLATFORM_DRIVER_END(qcom_pdc)
>  MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Power Domain Controller");
>  MODULE_LICENSE("GPL v2");

<sigh>
So this is where I bashfully admit I didn't get a chance to try this
patch series out, as I had success with a much older version of
Saravana's macro magic.

But unfortunately, now that this has landed in mainline, I'm seeing
boot regressions on db845c. :( This is in the non-modular case,
building the driver in.

I managed to bisect it down to this patch, and reverting it avoids the
issue. I don't see what is wrong right off, but I really need to get
to bed, so I'll dig further tomorrow.

Saravana: Apologies for not getting around to testing this beforehand!

thanks
-john

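For readers unfamiliar with the BEGIN / MATCH / END pattern used in the patch, here is a minimal userspace sketch of how such macros can build a match table that maps a compatible string to an init callback. Everything prefixed `demo_` or `DEMO_` is a hypothetical stand-in written for this illustration; it is not the kernel's definition of `IRQCHIP_PLATFORM_DRIVER_BEGIN` and friends, and it leaves out the platform driver registration entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins, NOT the kernel structures. */
struct demo_device_node { const char *name; };
typedef int (*demo_init_cb_t)(struct demo_device_node *np,
			      struct demo_device_node *parent);

struct demo_of_device_id {
	const char *compatible;
	demo_init_cb_t data;	/* init callback carried in the table entry */
};

/* Hypothetical macros mirroring the BEGIN / MATCH / END shape of the patch. */
#define DEMO_IRQCHIP_DRIVER_BEGIN(drv) \
	static const struct demo_of_device_id drv##_match_table[] = {
#define DEMO_IRQCHIP_MATCH(compat, fn) \
	{ .compatible = compat, .data = fn },
#define DEMO_IRQCHIP_DRIVER_END(drv) \
	{ NULL, NULL } \
	};

/* Stand-in for a driver's init function such as qcom_pdc_init(). */
static int demo_pdc_init(struct demo_device_node *np,
			 struct demo_device_node *parent)
{
	(void)np;
	(void)parent;
	return 0;
}

/* Three lines replace the hand-written table, as in the patch. */
DEMO_IRQCHIP_DRIVER_BEGIN(demo_pdc)
DEMO_IRQCHIP_MATCH("qcom,pdc", demo_pdc_init)
DEMO_IRQCHIP_DRIVER_END(demo_pdc)

/* Return the init callback for a compatible string, or NULL if none matches. */
demo_init_cb_t demo_find_init(const struct demo_of_device_id *tbl,
			      const char *compat)
{
	for (; tbl->compatible; tbl++)
		if (strcmp(tbl->compatible, compat) == 0)
			return tbl->data;
	return NULL;
}

/* Small self-check a caller can run from main(). */
void demo_selfcheck(void)
{
	assert(demo_find_init(demo_pdc_match_table, "qcom,pdc") == demo_pdc_init);
	assert(demo_find_init(demo_pdc_match_table, "vendor,other") == NULL);
}
```

The point of the sketch is only that the per-driver probe function disappears: a shared probe routine can look up the callback from the table entry, which is why a bug in that shared routine affects every driver converted to the macros.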
On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
>
> On Fri, Jul 17, 2020 at 5:06 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Switch the driver to use the helper macros. In addition to reducing the
> > number of lines, this also adds module unload protection (if the driver
> > is compiled as a module) by switching from module_platform_driver to
> > builtin_platform_driver.
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> >  drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
> >  1 file changed, 3 insertions(+), 23 deletions(-)
> >
> > [...]
>
> <sigh>
> So this is where I bashfully admit I didn't get a chance to try this
> patch series out, as I had success with a much older version of
> Saravana's macro magic.
>
> But unfortunately, now that this has landed in mainline, I'm seeing
> boot regressions on db845c. :( This is in the non-modular case,
> building the driver in.

Does that mean the modular version is working? Or you haven't tried
that yet? I'll wait for your reply before I try to fix it. I don't
have the hardware, but it should be easy to guess this issue looking
at the code delta.

The only significant change from what your probe function is doing is
this snippet. But it'd be surprising if this only affects the builtin
case.

+	if (par_np == np)
+		par_np = NULL;
+
+	/*
+	 * If there's a parent interrupt controller and none of the parent irq
+	 * domains have been registered, that means the parent interrupt
+	 * controller has not been initialized yet. it's not time for this
+	 * interrupt controller to initialize. So, defer probe of this
+	 * interrupt controller. The actual initialization callback of this
+	 * interrupt controller can check for specific domains as necessary.
+	 */
+	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
+		return -EPROBE_DEFER;

> I managed to bisect it down to this patch, and reverting it avoids the
> issue. I don't see what is wrong right off, but I really need to get
> to bed, so I'll dig further tomorrow.
>
> Saravana: Apologies for not getting around to testing this beforehand!

No worries. Apologies for breaking it accidentally.

-Saravana

On 8/5/20 3:19 PM, Saravana Kannan wrote:
> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
>> <sigh>
>> So this is where I bashfully admit I didn't get a chance to try this
>> patch series out, as I had success with a much older version of
>> Saravana's macro magic.
>>
>> But unfortunately, now that this has landed in mainline, I'm seeing
>> boot regressions on db845c. :( This is in the non-modular case,
>> building the driver in.
> Does that mean the modular version is working? Or you haven't tried
> that yet? I'll wait for your reply before I try to fix it. I don't
> have the hardware, but it should be easy to guess this issue looking
> at the code delta.

For what it's worth, I saw this too on the Lenovo C630 (started on -next
around 20200727), but I didn't track it down as, well, there are fewer ways
to get debug output on the C630.

In my testing, module or built-in doesn't matter, but reverting does
allow me to boot again.

> The only significant change from what your probe function is doing is
> this snippet. But it'd be surprising if this only affects the builtin
> case.
>
> +	if (par_np == np)
> +		par_np = NULL;
> +
> +	/*
> +	 * If there's a parent interrupt controller and none of the parent irq
> +	 * domains have been registered, that means the parent interrupt
> +	 * controller has not been initialized yet. it's not time for this
> +	 * interrupt controller to initialize. So, defer probe of this
> +	 * interrupt controller. The actual initialization callback of this
> +	 * interrupt controller can check for specific domains as necessary.
> +	 */
> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
> +		return -EPROBE_DEFER;
>
>> I managed to bisect it down to this patch, and reverting it avoids the
>> issue. I don't see what is wrong right off, but I really need to get
>> to bed, so I'll dig further tomorrow.
>>
>> Saravana: Apologies for not getting around to testing this beforehand!
> No worries. Apologies for breaking it accidentally.
>
> -Saravana

On 8/5/20 4:16 PM, Steev Klimaszewski wrote:
> On 8/5/20 3:19 PM, Saravana Kannan wrote:
>> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
>>> <sigh>
>>> So this is where I bashfully admit I didn't get a chance to try this
>>> patch series out, as I had success with a much older version of
>>> Saravana's macro magic.
>>>
>>> But unfortunately, now that this has landed in mainline, I'm seeing
>>> boot regressions on db845c. :( This is in the non-modular case,
>>> building the driver in.
>> Does that mean the modular version is working? Or you haven't tried
>> that yet? I'll wait for your reply before I try to fix it. I don't
>> have the hardware, but it should be easy to guess this issue looking
>> at the code delta.
> For what it's worth, I saw this too on the Lenovo C630 (started on -next
> around 20200727), but I didn't track it down as, well, there are fewer ways
> to get debug output on the C630.
>
> In my testing, module or built-in doesn't matter, but reverting does
> allow me to boot again.
>
Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted
boots, however, module (on the c630 at least) doesn't boot whether it's
a module or built-in.

>> The only significant change from what your probe function is doing is
>> this snippet. But it'd be surprising if this only affects the builtin
>> case.
>>
>> +	if (par_np == np)
>> +		par_np = NULL;
>> +
>> +	/*
>> +	 * If there's a parent interrupt controller and none of the parent irq
>> +	 * domains have been registered, that means the parent interrupt
>> +	 * controller has not been initialized yet. it's not time for this
>> +	 * interrupt controller to initialize. So, defer probe of this
>> +	 * interrupt controller. The actual initialization callback of this
>> +	 * interrupt controller can check for specific domains as necessary.
>> +	 */
>> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
>> +		return -EPROBE_DEFER;
>>
>>> I managed to bisect it down to this patch, and reverting it avoids the
>>> issue. I don't see what is wrong right off, but I really need to get
>>> to bed, so I'll dig further tomorrow.
>>>
>>> Saravana: Apologies for not getting around to testing this beforehand!
>> No worries. Apologies for breaking it accidentally.
>>
>> -Saravana

On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote:
>
>
> On 8/5/20 4:16 PM, Steev Klimaszewski wrote:
> > On 8/5/20 3:19 PM, Saravana Kannan wrote:
> >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
> >>> <sigh>
> >>> So this is where I bashfully admit I didn't get a chance to try this
> >>> patch series out, as I had success with a much older version of
> >>> Saravana's macro magic.
> >>>
> >>> But unfortunately, now that this has landed in mainline, I'm seeing
> >>> boot regressions on db845c. :( This is in the non-modular case,
> >>> building the driver in.
> >> Does that mean the modular version is working? Or you haven't tried
> >> that yet? I'll wait for your reply before I try to fix it. I don't
> >> have the hardware, but it should be easy to guess this issue looking
> >> at the code delta.
> > For what it's worth, I saw this too on the Lenovo C630 (started on -next
> > around 20200727), but I didn't track it down as, well, there are fewer ways
> > to get debug output on the C630.
> >
> > In my testing, module or built-in doesn't matter, but reverting does
> > allow me to boot again.
> >
> Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted
> boots, however, module (on the c630 at least) doesn't boot whether it's
> a module or built-in.

You may need to set deferred_probe_timeout=30 to give things a bit
more grace time to load.

(I've most recently used qcom-pdc as a module with the android tree,
so the fw_devlink bits help there, but I need to re-check the state of
that upstream.)

I'll double-check this and dig more on the issue with the patch in
question once I can get back in my office later today.

thanks
-john

On Wed, Aug 5, 2020 at 1:19 PM Saravana Kannan <saravanak@google.com> wrote:
> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
> > On Fri, Jul 17, 2020 at 5:06 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > Switch the driver to use the helper macros. In addition to reducing the
> > > number of lines, this also adds module unload protection (if the driver
> > > is compiled as a module) by switching from module_platform_driver to
> > > builtin_platform_driver.
> > >
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > ---
> > >  drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
> > >  1 file changed, 3 insertions(+), 23 deletions(-)
> > >
> > > [...]
> >
> > <sigh>
> > So this is where I bashfully admit I didn't get a chance to try this
> > patch series out, as I had success with a much older version of
> > Saravana's macro magic.
> >
> > But unfortunately, now that this has landed in mainline, I'm seeing
> > boot regressions on db845c. :( This is in the non-modular case,
> > building the driver in.
>
> Does that mean the modular version is working? Or you haven't tried
> that yet? I'll wait for your reply before I try to fix it. I don't
> have the hardware, but it should be easy to guess this issue looking
> at the code delta.

I've not yet tested with modules with your patch.

> The only significant change from what your probe function is doing is
> this snippet. But it'd be surprising if this only affects the builtin
> case.
>
> +	if (par_np == np)
> +		par_np = NULL;
> +
> +	/*
> +	 * If there's a parent interrupt controller and none of the parent irq
> +	 * domains have been registered, that means the parent interrupt
> +	 * controller has not been initialized yet. it's not time for this
> +	 * interrupt controller to initialize. So, defer probe of this
> +	 * interrupt controller. The actual initialization callback of this
> +	 * interrupt controller can check for specific domains as necessary.
> +	 */
> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
> +		return -EPROBE_DEFER;

Yep. We're getting caught on the irq_find_matching_host() check. I'm a
little lost as when I look at the qcom,pdc node in the dtsi it's not
under a parent controller (instead the soc node).
Not sure if that's an issue in the dtsi or if par_np check needs to
ignore the soc node and pass null?

thanks
-john

On 2020-08-06 02:24, John Stultz wrote:
> On Wed, Aug 5, 2020 at 1:19 PM Saravana Kannan <saravanak@google.com>
> wrote:
>> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org>
>> wrote:
>> > On Fri, Jul 17, 2020 at 5:06 PM Saravana Kannan <saravanak@google.com> wrote:
>> > >
>> > > Switch the driver to use the helper macros. In addition to reducing the
>> > > number of lines, this also adds module unload protection (if the driver
>> > > is compiled as a module) by switching from module_platform_driver to
>> > > builtin_platform_driver.
>> > >
>> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
>> > > ---
>> > >  drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
>> > >  1 file changed, 3 insertions(+), 23 deletions(-)
>> > >
>> > > [...]
>> >
>> > <sigh>
>> > So this is where I bashfully admit I didn't get a chance to try this
>> > patch series out, as I had success with a much older version of
>> > Saravana's macro magic.
>> >
>> > But unfortunately, now that this has landed in mainline, I'm seeing
>> > boot regressions on db845c. :( This is in the non-modular case,
>> > building the driver in.
>>
>> Does that mean the modular version is working? Or you haven't tried
>> that yet? I'll wait for your reply before I try to fix it. I don't
>> have the hardware, but it should be easy to guess this issue looking
>> at the code delta.
>
> I've not yet tested with modules with your patch.
>
>> The only significant change from what your probe function is doing is
>> this snippet. But it'd be surprising if this only affects the builtin
>> case.
>>
>> +	if (par_np == np)
>> +		par_np = NULL;
>> +
>> +	/*
>> +	 * If there's a parent interrupt controller and none of the parent
>> irq
>> +	 * domains have been registered, that means the parent interrupt
>> +	 * controller has not been initialized yet. it's not time for this
>> +	 * interrupt controller to initialize. So, defer probe of this
>> +	 * interrupt controller. The actual initialization callback of this
>> +	 * interrupt controller can check for specific domains as necessary.
>> +	 */
>> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
>> +		return -EPROBE_DEFER;
>
> Yep. We're getting caught on the irq_find_matching_host() check. I'm a
> little lost as when I look at the qcom,pdc node in the dtsi it's not
> under a parent controller (instead the soc node).
> Not sure if that's an issue in the dtsi or if par_np check needs to
> ignore the soc node and pass null?

I think you have nailed it. This checks for a domain attached to
the driver we are about to probe, and this domain cannot possibly
exist. Instead, it is the *parent* this should check for, as we
depend on it for successful probing.

Can you please give this a go?

Thanks,

        M.

diff --git a/drivers/irqchip/irqchip.c b/drivers/irqchip/irqchip.c
index 1bb0e36c2bf3..d2341153e181 100644
--- a/drivers/irqchip/irqchip.c
+++ b/drivers/irqchip/irqchip.c
@@ -52,7 +52,7 @@ int platform_irqchip_probe(struct platform_device *pdev)
 	 * interrupt controller. The actual initialization callback of this
 	 * interrupt controller can check for specific domains as necessary.
 	 */
-	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
+	if (par_np && !irq_find_matching_host(par_np, DOMAIN_BUS_ANY))
 		return -EPROBE_DEFER;

 	return irq_init_cb(np, par_np);

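The one-word difference between the merged check and the fix can be reproduced in a small userspace model. This is only a sketch under stated assumptions: `demo_node`, `demo_find_host()` and both `demo_probe_*()` functions are hypothetical stand-ins for `struct device_node`, `irq_find_matching_host()` and `platform_irqchip_probe()`, and "registering a domain" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_EPROBE_DEFER 517	/* same numeric value as the kernel's EPROBE_DEFER */

struct demo_node {
	const char *name;
	bool has_domain;	/* stand-in for "an irq domain is registered for this node" */
};

/* Stand-in for irq_find_matching_host(): is a domain registered for np? */
bool demo_find_host(const struct demo_node *np)
{
	return np && np->has_domain;
}

/* Check as merged: looks for a domain on the node that is being probed. */
int demo_probe_buggy(struct demo_node *np, struct demo_node *par_np)
{
	if (par_np == np)
		par_np = NULL;
	/* np has no domain before its own init ran, so this always defers. */
	if (par_np && !demo_find_host(np))
		return -DEMO_EPROBE_DEFER;
	np->has_domain = true;	/* stand-in for the driver's init callback */
	return 0;
}

/* Check after the fix: looks for a domain on the parent controller. */
int demo_probe_fixed(struct demo_node *np, struct demo_node *par_np)
{
	if (par_np == np)
		par_np = NULL;
	if (par_np && !demo_find_host(par_np))
		return -DEMO_EPROBE_DEFER;
	np->has_domain = true;	/* stand-in for the driver's init callback */
	return 0;
}

/* Self-check: buggy variant defers forever, fixed variant probes. */
void demo_probe_selfcheck(void)
{
	struct demo_node gic = { "gic", true };
	struct demo_node pdc = { "pdc", false };

	assert(demo_probe_buggy(&pdc, &gic) == -DEMO_EPROBE_DEFER);
	assert(demo_probe_fixed(&pdc, &gic) == 0);
}
```

In this model the buggy variant can never succeed for a child controller, no matter how often the probe is retried, because the only thing that would register the child's domain is the init callback that the check prevents from running. The fixed variant still defers when the parent really is not ready yet.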
On Thu, Aug 6, 2020 at 5:12 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-08-06 02:24, John Stultz wrote:
> > On Wed, Aug 5, 2020 at 1:19 PM Saravana Kannan <saravanak@google.com>
> > wrote:
> >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org>
> >> wrote:
> >> > On Fri, Jul 17, 2020 at 5:06 PM Saravana Kannan <saravanak@google.com> wrote:
> >> > >
> >> > > Switch the driver to use the helper macros. In addition to reducing the
> >> > > number of lines, this also adds module unload protection (if the driver
> >> > > is compiled as a module) by switching from module_platform_driver to
> >> > > builtin_platform_driver.
> >> > >
> >> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> >> > > ---
> >> > >  drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
> >> > >  1 file changed, 3 insertions(+), 23 deletions(-)
> >> > >
> >> > > [...]
> >> >
> >> > <sigh>
> >> > So this is where I bashfully admit I didn't get a chance to try this
> >> > patch series out, as I had success with a much older version of
> >> > Saravana's macro magic.
> >> >
> >> > But unfortunately, now that this has landed in mainline, I'm seeing
> >> > boot regressions on db845c. :( This is in the non-modular case,
> >> > building the driver in.
> >>
> >> Does that mean the modular version is working? Or you haven't tried
> >> that yet? I'll wait for your reply before I try to fix it. I don't
> >> have the hardware, but it should be easy to guess this issue looking
> >> at the code delta.
> >
> > I've not yet tested with modules with your patch.
> >
> >> The only significant change from what your probe function is doing is
> >> this snippet. But it'd be surprising if this only affects the builtin
> >> case.
> >>
> >> +	if (par_np == np)
> >> +		par_np = NULL;
> >> +
> >> +	/*
> >> +	 * If there's a parent interrupt controller and none of the parent
> >> irq
> >> +	 * domains have been registered, that means the parent interrupt
> >> +	 * controller has not been initialized yet. it's not time for this
> >> +	 * interrupt controller to initialize. So, defer probe of this
> >> +	 * interrupt controller. The actual initialization callback of this
> >> +	 * interrupt controller can check for specific domains as necessary.
> >> +	 */
> >> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
> >> +		return -EPROBE_DEFER;
> >
> > Yep. We're getting caught on the irq_find_matching_host() check. I'm a
> > little lost as when I look at the qcom,pdc node in the dtsi it's not
> > under a parent controller (instead the soc node).
> > Not sure if that's an issue in the dtsi or if par_np check needs to
> > ignore the soc node and pass null?
>
> I think you have nailed it. This checks for a domain attached to
> the driver we are about to probe, and this domain cannot possibly
> exist. Instead, it is the *parent* this should check for, as we
> depend on it for successful probing.

Duh! Looks like I made a copy-paste/typo error. The comment clearly
says I'm trying to check the parent and then I end up checking the
node getting registered. I'm sure this will fix it.

Actually Nial sent an email a few hours after yours and he had found
the same issue. He even tested the fix with an irqchip driver and it
fixed the probe issue.

I'm assuming you'll put up the patch yourself. Please let me know if
you need me to send one.

Thanks,
Saravana

On 2020-08-06 19:05, Saravana Kannan wrote:
> On Thu, Aug 6, 2020 at 5:12 AM Marc Zyngier <maz@kernel.org> wrote:
>>
>> On 2020-08-06 02:24, John Stultz wrote:

[...]

>> >> +	if (par_np == np)
>> >> +		par_np = NULL;
>> >> +
>> >> +	/*
>> >> +	 * If there's a parent interrupt controller and none of the parent
>> >> irq
>> >> +	 * domains have been registered, that means the parent interrupt
>> >> +	 * controller has not been initialized yet. it's not time for this
>> >> +	 * interrupt controller to initialize. So, defer probe of this
>> >> +	 * interrupt controller. The actual initialization callback of this
>> >> +	 * interrupt controller can check for specific domains as necessary.
>> >> +	 */
>> >> +	if (par_np && !irq_find_matching_host(np, DOMAIN_BUS_ANY))
>> >> +		return -EPROBE_DEFER;
>> >
>> > Yep. We're getting caught on the irq_find_matching_host() check. I'm a
>> > little lost as when I look at the qcom,pdc node in the dtsi it's not
>> > under a parent controller (instead the soc node).
>> > Not sure if that's an issue in the dtsi or if par_np check needs to
>> > ignore the soc node and pass null?
>>
>> I think you have nailed it. This checks for a domain attached to
>> the driver we are about to probe, and this domain cannot possibly
>> exist. Instead, it is the *parent* this should check for, as we
>> depend on it for successful probing.
>
> Duh! Looks like I made a copy-paste/typo error. The comment clearly
> says I'm trying to check the parent and then I end up checking the
> node getting registered. I'm sure this will fix it.
>
> Actually Nial sent an email a few hours after yours and he had found
> the same issue. He even tested the fix with an irqchip driver and it
> fixed the probe issue.

OK, thanks for confirming. It would have been good if these patches
had seen a bit more testing.

>
> I'm assuming you'll put up the patch yourself. Please let me know if
> you need me to send one.

I have queued this [1] in -next.

It'd be good if someone (John?) could give a Tested-by.

Thanks,

        M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-next

On Thu, Aug 6, 2020 at 12:59 PM Marc Zyngier <maz@kernel.org> wrote:
> OK, thanks for confirming. It would have been good if these patches
> had seen a bit more testing.

Yes, again, my apologies for that!

> > I'm assuming you'll put up the patch yourself. Please let me know if
> > you need me to send one.
>
> I have queued this [1] in -next.
>
> It'd be good if someone (John?) could give a Tested-by.

Just validated.

Tested-by: John Stultz <john.stultz@linaro.org>

Thanks so much for the quick fix!
-john

On 2020-08-06 21:09, John Stultz wrote:
> On Thu, Aug 6, 2020 at 12:59 PM Marc Zyngier <maz@kernel.org> wrote:
>> OK, thanks for confirming. It would have been good if these patches
>> had seen a bit more testing.
>
> Yes, again, my apologies for that!

I would say this should be the job of the patch author, before
anyone else. Yes, silly bugs happen. In this occurrence, it
could have been avoided by just boot-testing it, though.

Oh well. At least it was caught early.

>> > I'm assuming you'll put up the patch yourself. Please let me know if
>> > you need me to send one.
>>
>> I have queued this [1] in -next.
>>
>> It'd be good if someone (John?) could give a Tested-by.
>
> Just validated.

> Tested-by: John Stultz <john.stultz@linaro.org>

Thanks for your patience, the reporting and testing.

        M.

On Thu, Aug 6, 2020 at 1:31 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-08-06 21:09, John Stultz wrote:
> > On Thu, Aug 6, 2020 at 12:59 PM Marc Zyngier <maz@kernel.org> wrote:
> >> OK, thanks for confirming. It would have been good if these patches
> >> had seen a bit more testing.
> >
> > Yes, again, my apologies for that!
>
> I would say this should be the job of the patch author, before
> anyone else. Yes, silly bugs happen. In this occurrence, it
> could have been avoided by just boot-testing it, though.

Sorry about this. I don't have a DB 845c to test this one. I generally
work with John or try and backport changes to devices I have that are
running older 4.xx kernels. This one seemed harmless/simple enough that
I didn't think I needed to go through all that. But obviously the few
times you don't test is when things fail :-(

-Saravana

On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote:

> On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote:
> >
> >
> > On 8/5/20 4:16 PM, Steev Klimaszewski wrote:
> > > On 8/5/20 3:19 PM, Saravana Kannan wrote:
> > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote:
> > >>> <sigh>
> > >>> So this is where I bashfully admit I didn't get a chance to try this
> > >>> patch series out, as I had success with a much older version of
> > >>> Saravana's macro magic.
> > >>>
> > >>> But unfortunately, now that this has landed in mainline, I'm seeing
> > >>> boot regressions on db845c. :( This is in the non-modular case,
> > >>> building the driver in.
> > >> Does that mean the modular version is working? Or you haven't tried
> > >> that yet? I'll wait for your reply before I try to fix it. I don't
> > >> have the hardware, but it should be easy to guess this issue looking
> > >> at the code delta.
> > > For what it's worth, I saw this too on the Lenovo C630 (started on -next
> > > around 20200727), but I didn't track it down as, well, there are fewer ways
> > > to get debug output on the C630.
> > >
> > > In my testing, module or built-in doesn't matter, but reverting does
> > > allow me to boot again.
> > >
> > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted
> > boots, however, module (on the c630 at least) doesn't boot whether it's
> > a module or built-in.
>
> You may need to set deferred_probe_timeout=30 to give things a bit
> more grace time to load.

With the risk of me reading more into this than what you're saying,
please don't upstream anything that depends on this parameter being
increased.

Compiling any of these drivers as a module should not require the user to
pass additional kernel command line parameters in order to get their
device to boot.

Regards,
Bjorn

> (I've most recently used qcom-pdc as a module with the android tree,
> so the fw_devlink bits help there, but I need to re-check the state of
> that upstream.)
>
> I'll double-check this and dig more on the issue with the patch in
> question once I can get back in my office later today.
>
> thanks
> -john

On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson <bjorn.andersson@linaro.org> wrote: > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > >>> <sigh> > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > >>> patch series out, as I had success with a much older version of > > > >>> Saravana's macro magic. > > > >>> > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > >>> building the driver in. > > > >> Does that mean the modular version is working? Or you haven't tried > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > >> have the hardware, but it should be easy to guess this issue looking > > > >> at the code delta. > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > to get debug output on the C630. > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > allow me to boot again. > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > a module or built-in. > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > more grace time to load. > > With the risk of me reading more into this than what you're saying, > please don't upstream anything that depend this parameter to be > increased. > > Compiling any of these drivers as module should not require the user to > pass additional kernel command line parameters in order to get their > device to boot. 
So, ideally I agree, and Saravana's fw_devlink work should allow us to
avoid it. But the reality is that it is already required (at least in
configurations heavily using modules) to give more time for modules
loaded to resolve missing dependencies after init begins (due to
changes in the driver core to fail loading after init so that optional
dt links aren't eternally looked for). This was seen when trying to
enable the Qualcomm clk drivers as modules.

It doesn't seem necessary in this case, but I suggested it here as
I've got it enabled by default in my AOSP builds so that the
module-heavy configs for GKI boot properly (even if Saravana's
fw_devlink work is disabled).

thanks
-john
On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > <bjorn.andersson@linaro.org> wrote: > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > > >>> <sigh> > > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > > >>> patch series out, as I had success with a much older version of > > > > >>> Saravana's macro magic. > > > > >>> > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > >>> building the driver in. > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > >> have the hardware, but it should be easy to guess this issue looking > > > > >> at the code delta. > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > > to get debug output on the C630. > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > allow me to boot again. > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > > a module or built-in. > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > more grace time to load. > > > > With the risk of me reading more into this than what you're saying, > > please don't upstream anything that depend this parameter to be > > increased. 
> >
> > Compiling any of these drivers as module should not require the user to
> > pass additional kernel command line parameters in order to get their
> > device to boot.
>
> So, ideally I agree, and Saravana's fw_devlink work should allow us to
> avoid it. But the reality is that it is already required (at least in
> configurations heavily using modules) to give more time for modules
> loaded to resolve missing dependencies after init begins (due to
> changes in the driver core to fail loading after init so that optional
> dt links aren't eternally looked for). This was seen when trying to
> enable the Qualcomm clk drivers as modules.
>

So to clarify what you're saying, any system that boots successfully
with the default options is a sign of pure luck - regardless of being
builtin or modules.

And there you have my exact argument against the deferred timeout magic
going on in the driver core. But as you know people insist that it's
more important to be able to boot some defunct system from NFS than a
properly configured one reliably.

> It doesn't seem necessary in this case, but I suggested it here as
> I've got it enabled by default in my AOSP builds so that the
> module-heavy configs for GKI boot properly (even if Saravana's
> fw_devlink work is disabled).
>

With all due respect, that's your downstream kernel, the upstream kernel
should not rely on luck, out-of-tree patches or kernel parameters.

Regards,
Bjorn
On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson <bjorn.andersson@linaro.org> wrote: > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > <bjorn.andersson@linaro.org> wrote: > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > > > >>> <sigh> > > > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > > > >>> patch series out, as I had success with a much older version of > > > > > >>> Saravana's macro magic. > > > > > >>> > > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > >>> building the driver in. > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > >> have the hardware, but it should be easy to guess this issue looking > > > > > >> at the code delta. > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > > > to get debug output on the C630. > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > allow me to boot again. > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > > > a module or built-in. > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > more grace time to load. 
> > > > > > With the risk of me reading more into this than what you're saying, > > > please don't upstream anything that depend this parameter to be > > > increased. > > > > > > Compiling any of these drivers as module should not require the user to > > > pass additional kernel command line parameters in order to get their > > > device to boot. > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > avoid it. But the reality is that it is already required (at least in > > configurations heavily using modules) to give more time for modules > > loaded to resolve missing dependencies after init begins (due to > > changes in the driver core to fail loading after init so that optional > > dt links aren't eternally looked for). This was seen when trying to > > enable the qualcom clk drivers to modules. > > > > So to clarify what you're saying, any system that boots successfully > with the default options is a sign of pure luck - regardless of being > builtin or modules. > > > And there you have my exact argument against the deferred timeout magic > going on in the driver core. But as you know people insist that it's > more important to be able to boot some defunct system from NFS than a > properly configured one reliably. I'd agree, but the NFS case was in use before, and when the original deferred timeout/optional link handling stuff landed no one complained they were broken by it (at least at the point where it landed). Only later when we started enabling more lower-level core drivers as modules did the shortened dependency resolution time start to bite folks. My attempt to set the default to be 30 seconds helped there, but caused trouble and delays for the NFS case, and "don't break existing users" seemed to rule, so I set the default timeout back to 0. 
> > It doesn't seem necessary in this case, but I suggested it here as
> > I've got it enabled by default in my AOSP builds so that the
> > module-heavy configs for GKI boot properly (even if Saravana's
> > fw_devlink work is disabled).
> >
>
> With all due respect, that's your downstream kernel, the upstream kernel
> should not rely on luck, out-of-tree patches or kernel parameters.

I agree that would be preferred. But kernel parameters are often there
for these sorts of cases where we can't always do the right thing. As
for out-of-tree patches, broken things don't get fixed until
out-of-tree patches are developed and upstreamed, and I know Saravana
is doing exactly that, and I hope his fw_devlink work helps fix it so
the module loading is not just a matter of luck.

Also I think Thierry's comments in the other thread today are also
good ideas for ways to better handle the optional dt link handling
(rather than using a timeout).

thanks
-john
On Thu, Aug 6, 2020 at 7:49 PM John Stultz <john.stultz@linaro.org> wrote: > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > <bjorn.andersson@linaro.org> wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > <bjorn.andersson@linaro.org> wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > > > > >>> <sigh> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > > >> have the hardware, but it should be easy to guess this issue looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > > > > a module or built-in. 
> > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depend this parameter to be > > > > increased. > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for modules > > > loaded to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the qualcom clk drivers to modules. > > > > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). Only > later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. 
> My attempt to set the default to be 30 seconds helped there,
> but caused trouble and delays for the NFS case, and "don't break
> existing users" seemed to rule, so I set the default timeout back to
> 0.
>
> > > It doesn't seem necessary in this case, but I suggested it here as
> > > I've got it enabled by default in my AOSP builds so that the
> > > module-heavy configs for GKI boot properly (even if Saravana's
> > > fw_devlink work is disabled).
> > >
> >
> > With all due respect, that's your downstream kernel, the upstream kernel
> > should not rely on luck, out-of-tree patches or kernel parameters.
>
> I agree that would be preferred. But kernel parameters are often there
> for these sorts of cases where we can't always do the right thing. As
> for out-of-tree patches, broken things don't get fixed until
> out-of-tree patches are developed and upstreamed, and I know Saravana
> is doing exactly that, and I hope his fw_devlink work helps fix it so
> the module loading is not just a matter of luck.

Btw, the only downstream fw_devlink change is setting it to =on (vs
=permissive in upstream).

> Also I think Thierry's comments in the other thread today are also
> good ideas for ways to better handle the optional dt link handling
> (rather than using a timeout).

Could you please give me a lore link to this thread? Just curious.

-Saravana
On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan <saravanak@google.com> wrote:
> On Thu, Aug 6, 2020 at 7:49 PM John Stultz <john.stultz@linaro.org> wrote:
> > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson
> > <bjorn.andersson@linaro.org> wrote:
> > > With all due respect, that's your downstream kernel, the upstream kernel
> > > should not rely on luck, out-of-tree patches or kernel parameters.
> >
> > I agree that would be preferred. But kernel parameters are often there
> > for these sorts of cases where we can't always do the right thing. As
> > for out-of-tree patches, broken things don't get fixed until
> > out-of-tree patches are developed and upstreamed, and I know Saravana
> > is doing exactly that, and I hope his fw_devlink work helps fix it so
> > the module loading is not just a matter of luck.
>
> Btw, the only downstream fw_devlink change is setting it to =on (vs
> =permissive in upstream).

I thought there was the clk_sync_state stuff as well?

> > Also I think Thierry's comments in the other thread today are also
> > good ideas for ways to better handle the optional dt link handling
> > (rather than using a timeout).
>
> Could you please give me a lore link to this thread? Just curious.

Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/

thanks
-john
On Thu, Aug 6, 2020 at 8:09 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan <saravanak@google.com> wrote:
> > On Thu, Aug 6, 2020 at 7:49 PM John Stultz <john.stultz@linaro.org> wrote:
> > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson
> > > <bjorn.andersson@linaro.org> wrote:
> > > > With all due respect, that's your downstream kernel, the upstream kernel
> > > > should not rely on luck, out-of-tree patches or kernel parameters.
> > >
> > > I agree that would be preferred. But kernel parameters are often there
> > > for these sorts of cases where we can't always do the right thing. As
> > > for out-of-tree patches, broken things don't get fixed until
> > > out-of-tree patches are developed and upstreamed, and I know Saravana
> > > is doing exactly that, and I hope his fw_devlink work helps fix it so
> > > the module loading is not just a matter of luck.
> >
> > Btw, the only downstream fw_devlink change is setting it to =on (vs
> > =permissive in upstream).
>
> I thought there was the clk_sync_state stuff as well?

That's not needed to solve the module load ordering issues and
deferred probe issues. That's only needed to keep clocks on till some
of the modules are loaded. It depends on fw_devlink, but it's not
really a part of fw_devlink IMHO. And yes, that's on my list of things
to upstream.

> > > Also I think Thierry's comments in the other thread today are also
> > > good ideas for ways to better handle the optional dt link handling
> > > (rather than using a timeout).
> >
> > Could you please give me a lore link to this thread? Just curious.
>
> Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/

Thanks.

-Saravana
On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote: > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > <bjorn.andersson@linaro.org> wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > <bjorn.andersson@linaro.org> wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > > > > >>> <sigh> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > > >> have the hardware, but it should be easy to guess this issue looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > > > > a module or built-in. 
> > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depend this parameter to be > > > > increased. > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for modules > > > loaded to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the qualcom clk drivers to modules. > > > > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). I did object when this was proposed and I've objected for the last two years, because we keep adding more and more subsystems to follow this broken behavior. > Only later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. 
> My attempt to set the default to be 30 seconds helped there,
> but caused trouble and delays for the NFS case, and "don't break
> existing users" seemed to rule, so I set the default timeout back to
> 0.
>

I can't argue with that and I'm at a loss on how to turn this around.

> > > It doesn't seem necessary in this case, but I suggested it here as
> > > I've got it enabled by default in my AOSP builds so that the
> > > module-heavy configs for GKI boot properly (even if Saravana's
> > > fw_devlink work is disabled).
> > >
> >
> > With all due respect, that's your downstream kernel, the upstream kernel
> > should not rely on luck, out-of-tree patches or kernel parameters.
>
> I agree that would be preferred. But kernel parameters are often there
> for these sorts of cases where we can't always do the right thing.
> As for out-of-tree patches, broken things don't get fixed until
> out-of-tree patches are developed and upstreamed, and I know Saravana
> is doing exactly that, and I hope his fw_devlink work helps fix it so
> the module loading is not just a matter of luck.
>

I don't agree with this, upstream should be functional in its default
configuration. Out-of-tree patches might be necessary to enable features
or get the most out of the hardware, but what we have upstream should
work. And no, this is not always the case, but we should at least aim
for this.

> Also I think Thierry's comments in the other thread today are also
> good ideas for ways to better handle the optional dt link handling
> (rather than using a timeout).
>

I'll take a look at that, but to repeat what I've said many times
before, for Qualcomm platforms there's pretty much no such thing as
optional links.

Regards,
Bjorn
On Thu, Aug 6, 2020 at 10:58 PM Bjorn Andersson <bjorn.andersson@linaro.org> wrote: > > On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > <bjorn.andersson@linaro.org> wrote: > > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > > <bjorn.andersson@linaro.org> wrote: > > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski <steev@kali.org> wrote: > > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz <john.stultz@linaro.org> wrote: > > > > > > > >>> <sigh> > > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try this > > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > > >>> Saravana's macro magic. > > > > > > > >>> > > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm seeing > > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > > >>> building the driver in. > > > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > > > >> have the hardware, but it should be easy to guess this issue looking > > > > > > > >> at the code delta. > > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on -next > > > > > > > > around 20200727, but I didn't track it down as, well, there's less way > > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > > > allow me to boot again. 
> > > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit reverted > > > > > > > boots, however, module (on the c630 at least) doesn't boot whether it's > > > > > > > a module or built-in. > > > > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > > more grace time to load. > > > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > > please don't upstream anything that depend this parameter to be > > > > > increased. > > > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > > pass additional kernel command line parameters in order to get their > > > > > device to boot. > > > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > > avoid it. But the reality is that it is already required (at least in > > > > configurations heavily using modules) to give more time for modules > > > > loaded to resolve missing dependencies after init begins (due to > > > > changes in the driver core to fail loading after init so that optional > > > > dt links aren't eternally looked for). This was seen when trying to > > > > enable the qualcom clk drivers to modules. > > > > > > > > > > So to clarify what you're saying, any system that boots successfully > > > with the default options is a sign of pure luck - regardless of being > > > builtin or modules. > > > > > > > > > And there you have my exact argument against the deferred timeout magic > > > going on in the driver core. But as you know people insist that it's > > > more important to be able to boot some defunct system from NFS than a > > > properly configured one reliably. > > > > I'd agree, but the NFS case was in use before, and when the original > > deferred timeout/optional link handling stuff landed no one complained > > they were broken by it (at least at the point where it landed). 
> > I did object when this was proposed and I've objected for the last two > years, because we keep adding more and more subsystems to follow this > broken behavior. > > > Only later when we started enabling more lower-level core drivers as > > modules did the shortened dependency resolution time start to bite > > folks. My attempt to set the default to be 30 seconds helped there, > > but caused trouble and delays for the NFS case, and "don't break > > existing users" seemed to rule, so I set the default timeout back to > > 0. > > > > I can't argue with that and I'm at loss on how to turn this around. > > > > > It doesn't seem necessary in this case, but I suggested it here as > > > > I've got it enabled by default in my AOSP builds so that the > > > > module-heavy configs for GKI boot properly (even if Saravana's > > > > fw_devlink work is disabled). > > > > > > > > > > With all due respect, that's your downstream kernel, the upstream kernel > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > I agree that would be preferred. But kernel parameters are often there > > for these sorts of cases where we can't always do the right thing. > > As for out-of-tree patches, broken things don't get fixed until > > out-of-tree patches are developed and upstreamed, and I know Saravana > > is doing exactly that, and I hope his fw_devlink work helps fix it so > > the module loading is not just a matter of luck. > > > > I don't agree with this, upstream should be functional in its default > configuration. Out-of-tree patches might be necessary to enable features > or get the most out of the hardware, but what we have upstream should > work. And no, this is not always the case, but we should at least aim > for this. > > > Also I think Thierry's comments in the other thread today are also > > good ideas for ways to better handle the optional dt link handling > > (rather than using a timeout). 
> >
>
> I'll take a look at that, but to repeat what I've said many times
> before, for Qualcomm platforms there's pretty much no such thing as
> optional links.

Nicolas had suggested earlier that we could have a way to enable
fw_devlink=on in DT under the "chosen" node. I liked that idea, but
wasn't sure how easy it would be to convince DT maintainers to allow
it. If that happens, we could just set that and not have to worry
about these timeouts for QC platforms. Not sure how easy it is to
update DT in a DB 845c though. But for cases where DT can't be
updated, we'd still have the crappy timeout to plaster over the issue.

Something like this, or whatever sensible name suggests that the
dependencies in DT are reliable enough to use for probe ordering,
deferred probe, etc. Then once that's there, we can turn on
fw_devlink=on and that'd completely skip this timeout code path (and
has a bunch of other benefits).

chosen {
        linux,reliable-dt-dependencies;
};

Just some food for thought.

Also, Thierry's idea would work, but it'll need changes to multiple
drivers. Also, I don't think it'll work for drivers that'll need to
ignore vs not ignore dependencies based on what board they are used
in. It's still better than the blind timeout though.

-Saravana
diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
index 5b624e3295e4..c1c5dfad57cc 100644
--- a/drivers/irqchip/qcom-pdc.c
+++ b/drivers/irqchip/qcom-pdc.c
@@ -432,28 +432,8 @@ static int qcom_pdc_init(struct device_node *node, struct device_node *parent)
 	return ret;
 }
 
-static int qcom_pdc_probe(struct platform_device *pdev)
-{
-	struct device_node *np = pdev->dev.of_node;
-	struct device_node *parent = of_irq_find_parent(np);
-
-	return qcom_pdc_init(np, parent);
-}
-
-static const struct of_device_id qcom_pdc_match_table[] = {
-	{ .compatible = "qcom,pdc" },
-	{}
-};
-MODULE_DEVICE_TABLE(of, qcom_pdc_match_table);
-
-static struct platform_driver qcom_pdc_driver = {
-	.probe = qcom_pdc_probe,
-	.driver = {
-		.name = "qcom-pdc",
-		.of_match_table = qcom_pdc_match_table,
-		.suppress_bind_attrs = true,
-	},
-};
-module_platform_driver(qcom_pdc_driver);
+IRQCHIP_PLATFORM_DRIVER_BEGIN(qcom_pdc)
+IRQCHIP_MATCH("qcom,pdc", qcom_pdc_init)
+IRQCHIP_PLATFORM_DRIVER_END(qcom_pdc)
 MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Power Domain Controller");
 MODULE_LICENSE("GPL v2");
Switch the driver to use the helper macros. In addition to reducing the
number of lines, this also adds module unload protection (if the driver
is compiled as a module) by switching from module_platform_driver to
builtin_platform_driver.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/irqchip/qcom-pdc.c | 26 +++-----------------------
 1 file changed, 3 insertions(+), 23 deletions(-)
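To make the BEGIN/MATCH/END pattern in the patch easier to follow, here is a user-space sketch of how such macros can open a match table, add entries that carry an init callback, and close the table with a generic probe wrapper. It is a simplified model and not the kernel's definition: the MODEL_* macros, struct match_entry, model_probe and qcom_pdc_init_stub are hypothetical stand-ins, and the callback takes strings instead of struct device_node pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an init callback such as qcom_pdc_init(). */
typedef int (*irqchip_init_fn)(const char *node, const char *parent);

/* Hypothetical stand-in for a match-table entry with callback data. */
struct match_entry {
	const char *compatible;
	irqchip_init_fn init;
};

/* Generic probe: find the matching entry and call its init callback. */
int model_probe(const struct match_entry *table, const char *compat,
		const char *node, const char *parent)
{
	assert(table != NULL);
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compat) == 0)
			return table->init(node, parent);
	return -1; /* no matching entry */
}

/* BEGIN opens the table, MATCH adds one entry, END closes the table
 * and emits a per-driver probe function that uses the generic probe. */
#define MODEL_DRIVER_BEGIN(drv) \
	static const struct match_entry drv##_match_table[] = {
#define MODEL_MATCH(compat, fn) { .compatible = (compat), .init = (fn) },
#define MODEL_DRIVER_END(drv)						\
		{ NULL, NULL }						\
	};								\
	int drv##_probe(const char *compat, const char *node,		\
			const char *parent)				\
	{								\
		return model_probe(drv##_match_table, compat,		\
				   node, parent);			\
	}

/* Stub init callback: succeeds whenever a node is supplied. */
static int qcom_pdc_init_stub(const char *node, const char *parent)
{
	(void)parent;
	return node ? 0 : -22;
}

MODEL_DRIVER_BEGIN(qcom_pdc)
MODEL_MATCH("qcom,pdc", qcom_pdc_init_stub)
MODEL_DRIVER_END(qcom_pdc)
```

The point of the sketch is that the three macro lines generate both the table and the probe glue, so the driver no longer writes its own probe function. In the model, qcom_pdc_probe("qcom,pdc", "pdc", "intc") reaches the stub through the table, while an unknown compatible string falls through to the no-match return.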