Message ID | 20210926155356.23861-1-nikita.yoush@cogentembedded.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Kieran Bingham |
Series | media: vsp1: mask interrupts before enabling |
Hi Nikita,

Quoting Nikita Yushchenko (2021-09-26 16:53:56)
> Setting up VSP interrupt handler without masking interrupt before causes
> interrupt handler to be immediately called (and crash due to null pointer
> dereference) on r8a77951-ulcb-kf board.
>
> Fix that by explicitly masking all interrupts before setting the interrupt
> handler. To do so, have to set the interrupt handler later, after hw
> revision is already detected and number of interrupts to mask gets
> known.
>
> Based on patch by Koji Matsuoka <koji.matsuoka.xm@renesas.com> included
> in the Renesas BSP kernel. Updated that to use wfp_count as the number of

s/wfp_count/wpf_count/

> WPF interrupts to mask.
>
> Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> ---
>  drivers/media/platform/vsp1/vsp1_drv.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
> index de442d6c9926..0e9a6fad54f8 100644
> --- a/drivers/media/platform/vsp1/vsp1_drv.c
> +++ b/drivers/media/platform/vsp1/vsp1_drv.c
> @@ -811,13 +811,6 @@ static int vsp1_probe(struct platform_device *pdev)
>  		return -EINVAL;
>  	}
>  
> -	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> -			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> -	if (ret < 0) {
> -		dev_err(&pdev->dev, "failed to request IRQ\n");
> -		return ret;
> -	}
> -
>  	/* FCP (optional). */
>  	fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
>  	if (fcp_node) {
> @@ -847,7 +840,6 @@ static int vsp1_probe(struct platform_device *pdev)
>  		goto done;
>  
>  	vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
> -	vsp1_device_put(vsp1);
>  
>  	for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
>  		if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
> @@ -861,11 +853,26 @@ static int vsp1_probe(struct platform_device *pdev)
>  		dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
>  			vsp1->version);
>  		ret = -ENXIO;
> +		vsp1_device_put(vsp1);
>  		goto done;
>  	}
>  
>  	dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
>  
> +	for (i = 0; i < vsp1->info->lif_count; ++i)
> +		vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
> +	for (i = 0; i < vsp1->info->wpf_count; ++i)
> +		vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);

Should any other state or context on the hardware be manually reset?

The initial value of VI6_WPFn_IRQ_ENB and VI6_DISPn_IRQ_ENB is explicitly
stated as H'00000000 in the datasheet. So perhaps that implies that
something else is going on here. Perhaps the display is already used
before the kernel boots to handle a bootsplash screen or such ?

Will the 'pending' interrupts have otherwise been cleared by the time we
get to come to enable them? or will we still have a race... Otherwise we
should be clearing the status bits too. And if we need to do a whole
software reset, we should use the software reset controls instead.

Looking at vsp1_device_init(), which does a vsp1_reset_wpf for any WPF
running, and is called at vsp1_pm_runtime_resume() means that everything
should already be getting reset by software at the first call to
pm_runtime_enable I think...

That said, I can see how there could still be a race so requesting the
IRQ below /after/ the device is initialised is a good thing. I just don't
think we need the manual resets that you've added above.

Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
and VI6_WPF_IRQ_ENB are really needed in your use case please?

--
Kieran

> +
> +	vsp1_device_put(vsp1);
> +
> +	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> +			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev, "failed to request IRQ\n");
> +		goto done;
> +	}
> +
>  	/* Instantiate entities. */
>  	ret = vsp1_create_entities(vsp1);
>  	if (ret < 0) {
> --
> 2.30.2
>
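The question about pending status bits can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it assumes the interrupt line is asserted whenever an enabled bit is also set in the status register, and every name in it (model_regs, MODEL_NUM_WPF, model_*) is hypothetical and not part of the vsp1 driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_WPF 2	/* hypothetical count, not taken from any SoC */

/* Toy stand-in for one group of interrupt enable/status registers. */
struct model_regs {
	uint32_t irq_enb[MODEL_NUM_WPF];	/* enable bits */
	uint32_t irq_sta[MODEL_NUM_WPF];	/* latched status bits */
};

/* Assumption: the line is asserted when any enabled status bit is set. */
bool model_line_asserted(const struct model_regs *r)
{
	assert(r != NULL);
	for (int i = 0; i < MODEL_NUM_WPF; i++)
		if (r->irq_enb[i] & r->irq_sta[i])
			return true;
	return false;
}

/* Masking only: clears the enable bits, leaves stale status in place. */
void model_mask_all(struct model_regs *r)
{
	assert(r != NULL);
	for (int i = 0; i < MODEL_NUM_WPF; i++)
		r->irq_enb[i] = 0;
}

/* Acknowledge everything: clears the latched status bits. */
void model_clear_status(struct model_regs *r)
{
	assert(r != NULL);
	for (int i = 0; i < MODEL_NUM_WPF; i++)
		r->irq_sta[i] = 0;
}
```

In this model, masking alone deasserts the line, but a stale status bit asserts it again as soon as the matching enable bit is set later. Clearing the status bits as well closes that window.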
Hi,

> Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
> and VI6_WPF_IRQ_ENB are really needed in your use case please?

Commenting out those register writes causes

[ 2.275137][ C0] irq 188: nobody cared (try booting with the "irqpoll" option)
[ 2.282621][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc5 #28
[ 2.289669][ C0] Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
[ 2.297844][ C0] Call trace:
[ 2.300981][ C0] dump_backtrace+0x0/0x198
[ 2.305348][ C0] show_stack+0x1c/0x28
[ 2.309357][ C0] dump_stack_lvl+0x64/0x7c
[ 2.313718][ C0] dump_stack+0x14/0x2c
[ 2.317725][ C0] __report_bad_irq+0x50/0xdc
[ 2.322254][ C0] note_interrupt+0x2e4/0x330
[ 2.326786][ C0] handle_irq_event_percpu+0x58/0x68
[ 2.331927][ C0] handle_irq_event+0x4c/0x98
[ 2.336456][ C0] handle_fasteoi_irq+0xd0/0x180
[ 2.341245][ C0] handle_domain_irq+0x94/0xd8
[ 2.345862][ C0] gic_handle_irq+0xa4/0xe0
[ 2.350216][ C0] do_interrupt_handler+0x38/0x60
[ 2.355093][ C0] el1_interrupt+0x2c/0x68
[ 2.359362][ C0] el1h_64_irq_handler+0x14/0x20
[ 2.364151][ C0] el1h_64_irq+0x74/0x78
[ 2.368244][ C0] __do_softirq+0xc8/0x404
[ 2.372511][ C0] irq_exit+0x118/0x140
[ 2.376521][ C0] handle_domain_irq+0x98/0xd8
[ 2.381137][ C0] gic_handle_irq+0xa4/0xe0
[ 2.385490][ C0] call_on_irq_stack+0x28/0x3c
[ 2.390105][ C0] do_interrupt_handler+0x54/0x60
[ 2.394981][ C0] el1_interrupt+0x2c/0x68
[ 2.399247][ C0] el1h_64_irq_handler+0x14/0x20
[ 2.404036][ C0] el1h_64_irq+0x74/0x78
[ 2.408129][ C0] _raw_spin_unlock_irqrestore+0x20/0x50
[ 2.413615][ C0] __setup_irq+0x56c/0x888
[ 2.417882][ C0] request_threaded_irq+0xf0/0x1a8
[ 2.422843][ C0] devm_request_threaded_irq+0x84/0xf8
[ 2.428155][ C0] vsp1_probe+0x218/0xb48
[ 2.432340][ C0] platform_probe+0x6c/0xd8
[ 2.436700][ C0] really_probe+0xc0/0x428
[ 2.440967][ C0] __driver_probe_device+0x114/0x188
[ 2.446103][ C0] driver_probe_device+0x44/0xe8
[ 2.450891][ C0] __driver_attach+0xbc/0x1a0
[ 2.455419][ C0] bus_for_each_dev+0x64/0xa0
[ 2.459947][ C0] driver_attach+0x28/0x30
[ 2.464215][ C0] bus_add_driver+0x144/0x228
[ 2.468743][ C0] driver_register+0x68/0x118
[ 2.473272][ C0] __platform_driver_register+0x2c/0x38
[ 2.478669][ C0] vsp1_platform_driver_init+0x20/0x28
[ 2.483985][ C0] do_one_initcall+0x38/0x258
[ 2.488513][ C0] kernel_init_freeable+0x228/0x28c
[ 2.493565][ C0] kernel_init+0x28/0x120
[ 2.497745][ C0] ret_from_fork+0x10/0x20
[ 2.502013][ C0] handlers:
[ 2.504974][ C0] [<0000000040be598b>] vsp1_irq_handler
[ 2.510376][ C0] Disabling IRQ #188

Nikita
Quoting Nikita Yushchenko (2021-10-20 18:45:50)
> Hi,
>
> > Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
> > and VI6_WPF_IRQ_ENB are really needed in your use case please?
>
> Commenting out those register writes causes
>
> [ 2.275137][ C0] irq 188: nobody cared (try booting with the "irqpoll" option)
> [ 2.282621][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc5 #28

Ok, so something has clearly caused these to get set, and the reset has
not cleared them.

I think I would rather see the code to reset them done in
vsp1_reset_wpf(), rather than in probe directly as that is what we are
doing, and is I believe already in the call path.

This should be with a comment stating explicitly that we are manually
resetting VI6_DISP_IRQ_ENB as the reset does not always clear these bits.

(But I'm really ... really concerned that the hardware is not really
getting reset when it should, and that might merit some further
investigation).

Requesting the devm_request_irq() /after/ initialising the hardware is
certainly a good thing though.

--
Kieran

> [ 2.289669][ C0] Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
> [ 2.297844][ C0] Call trace:
> [ 2.300981][ C0] dump_backtrace+0x0/0x198
> [ 2.305348][ C0] show_stack+0x1c/0x28
> [ 2.309357][ C0] dump_stack_lvl+0x64/0x7c
> [ 2.313718][ C0] dump_stack+0x14/0x2c
> [ 2.317725][ C0] __report_bad_irq+0x50/0xdc
> [ 2.322254][ C0] note_interrupt+0x2e4/0x330
> [ 2.326786][ C0] handle_irq_event_percpu+0x58/0x68
> [ 2.331927][ C0] handle_irq_event+0x4c/0x98
> [ 2.336456][ C0] handle_fasteoi_irq+0xd0/0x180
> [ 2.341245][ C0] handle_domain_irq+0x94/0xd8
> [ 2.345862][ C0] gic_handle_irq+0xa4/0xe0
> [ 2.350216][ C0] do_interrupt_handler+0x38/0x60
> [ 2.355093][ C0] el1_interrupt+0x2c/0x68
> [ 2.359362][ C0] el1h_64_irq_handler+0x14/0x20
> [ 2.364151][ C0] el1h_64_irq+0x74/0x78
> [ 2.368244][ C0] __do_softirq+0xc8/0x404
> [ 2.372511][ C0] irq_exit+0x118/0x140
> [ 2.376521][ C0] handle_domain_irq+0x98/0xd8
> [ 2.381137][ C0] gic_handle_irq+0xa4/0xe0
> [ 2.385490][ C0] call_on_irq_stack+0x28/0x3c
> [ 2.390105][ C0] do_interrupt_handler+0x54/0x60
> [ 2.394981][ C0] el1_interrupt+0x2c/0x68
> [ 2.399247][ C0] el1h_64_irq_handler+0x14/0x20
> [ 2.404036][ C0] el1h_64_irq+0x74/0x78
> [ 2.408129][ C0] _raw_spin_unlock_irqrestore+0x20/0x50
> [ 2.413615][ C0] __setup_irq+0x56c/0x888
> [ 2.417882][ C0] request_threaded_irq+0xf0/0x1a8
> [ 2.422843][ C0] devm_request_threaded_irq+0x84/0xf8
> [ 2.428155][ C0] vsp1_probe+0x218/0xb48
> [ 2.432340][ C0] platform_probe+0x6c/0xd8
> [ 2.436700][ C0] really_probe+0xc0/0x428
> [ 2.440967][ C0] __driver_probe_device+0x114/0x188
> [ 2.446103][ C0] driver_probe_device+0x44/0xe8
> [ 2.450891][ C0] __driver_attach+0xbc/0x1a0
> [ 2.455419][ C0] bus_for_each_dev+0x64/0xa0
> [ 2.459947][ C0] driver_attach+0x28/0x30
> [ 2.464215][ C0] bus_add_driver+0x144/0x228
> [ 2.468743][ C0] driver_register+0x68/0x118
> [ 2.473272][ C0] __platform_driver_register+0x2c/0x38
> [ 2.478669][ C0] vsp1_platform_driver_init+0x20/0x28
> [ 2.483985][ C0] do_one_initcall+0x38/0x258
> [ 2.488513][ C0] kernel_init_freeable+0x228/0x28c
> [ 2.493565][ C0] kernel_init+0x28/0x120
> [ 2.497745][ C0] ret_from_fork+0x10/0x20
> [ 2.502013][ C0] handlers:
> [ 2.504974][ C0] [<0000000040be598b>] vsp1_irq_handler
> [ 2.510376][ C0] Disabling IRQ #188
>
> Nikita
Hi.

Now I'm finally looking at this again.

> I think I would rather see the code to reset them done in
> vsp1_reset_wpf(), rather than in probe directly as that is what we are
> doing, and is I believe already in the call path.

Could you please explain how that is intended to be called on the probe
path?

As far as I can read from the code, vsp1_reset_wpf() is only called from
vsp1_device_init(), which in turn is called only from the PM resume hook,
and only if vsp1->info is already set. However, in the probe path,
pm_runtime_enable() is called before vsp1->info is set.

Nikita
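The ordering described in this message can be sketched as a small userspace model. It is a simplification based only on the description above, not driver code: the names model_info, model_vsp1, model_runtime_resume and model_probe are hypothetical, and the reset path is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-model device info. */
struct model_info {
	int wpf_count;
};

struct model_vsp1 {
	const struct model_info *info;	/* NULL until the IP version is matched */
	int reset_calls;		/* how many times the reset path ran */
};

/* Stand-in for the PM resume hook: init (and so reset) runs only if info is set. */
void model_runtime_resume(struct model_vsp1 *v)
{
	assert(v != NULL);
	if (v->info)
		v->reset_calls++;
}

/* Stand-in for the probe ordering: the first resume precedes the info lookup. */
void model_probe(struct model_vsp1 *v, const struct model_info *info)
{
	assert(v != NULL && info != NULL);
	v->info = NULL;
	v->reset_calls = 0;
	model_runtime_resume(v);	/* device powered up to read the version */
	v->info = info;			/* version matched, info assigned afterwards */
}
```

In this model the reset path does not run at all during probe. It only runs on a later resume, once info has been assigned, which is why a reset placed there would not help before the IRQ is requested.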
diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
index de442d6c9926..0e9a6fad54f8 100644
--- a/drivers/media/platform/vsp1/vsp1_drv.c
+++ b/drivers/media/platform/vsp1/vsp1_drv.c
@@ -811,13 +811,6 @@ static int vsp1_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}
 
-	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
-			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
-	if (ret < 0) {
-		dev_err(&pdev->dev, "failed to request IRQ\n");
-		return ret;
-	}
-
 	/* FCP (optional). */
 	fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
 	if (fcp_node) {
@@ -847,7 +840,6 @@ static int vsp1_probe(struct platform_device *pdev)
 		goto done;
 
 	vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
-	vsp1_device_put(vsp1);
 
 	for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
 		if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
@@ -861,11 +853,26 @@ static int vsp1_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
 			vsp1->version);
 		ret = -ENXIO;
+		vsp1_device_put(vsp1);
 		goto done;
 	}
 
 	dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
 
+	for (i = 0; i < vsp1->info->lif_count; ++i)
+		vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
+	for (i = 0; i < vsp1->info->wpf_count; ++i)
+		vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);
+
+	vsp1_device_put(vsp1);
+
+	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
+			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ\n");
+		goto done;
+	}
+
 	/* Instantiate entities. */
 	ret = vsp1_create_entities(vsp1);
 	if (ret < 0) {
Setting up VSP interrupt handler without masking interrupt before causes
interrupt handler to be immediately called (and crash due to null pointer
dereference) on r8a77951-ulcb-kf board.

Fix that by explicitly masking all interrupts before setting the interrupt
handler. To do so, have to set the interrupt handler later, after hw
revision is already detected and number of interrupts to mask gets
known.

Based on patch by Koji Matsuoka <koji.matsuoka.xm@renesas.com> included
in the Renesas BSP kernel. Updated that to use wfp_count as the number of
WPF interrupts to mask.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
---
 drivers/media/platform/vsp1/vsp1_drv.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)
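The ordering problem this commit message describes can be modelled in userspace as follows. This is a sketch under assumptions, not the driver's behaviour: it assumes a handler installed on an already asserted line is invoked at once, and the names model_dev, model_request_irq and model_mask are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_dev {
	uint32_t irq_enb;	/* enable bits, possibly left set before the kernel booted */
	uint32_t irq_sta;	/* pending status bits */
	bool handler_installed;
	int early_calls;	/* handler invocations that happened at install time */
};

/* Assumption: installing a handler on an asserted line runs it straight away. */
void model_request_irq(struct model_dev *d)
{
	assert(d != NULL);
	d->handler_installed = true;
	if (d->irq_enb & d->irq_sta)
		d->early_calls++;
}

/* Mask every source before the handler is installed. */
void model_mask(struct model_dev *d)
{
	assert(d != NULL);
	d->irq_enb = 0;
}
```

With enable and status bits left set, requesting the IRQ first yields an immediate handler call in this model, while masking first yields none.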