diff mbox series

media: vsp1: mask interrupts before enabling

Message ID 20210926155356.23861-1-nikita.yoush@cogentembedded.com (mailing list archive)
State Superseded
Delegated to: Kieran Bingham

Commit Message

Nikita Yushchenko Sept. 26, 2021, 3:53 p.m. UTC
Requesting the VSP interrupt handler without first masking the interrupts
causes the handler to be called immediately (and to crash with a NULL
pointer dereference) on the r8a77951-ulcb-kf board.

Fix that by explicitly masking all interrupts before requesting the
interrupt handler. To do so, the handler has to be requested later, once
the hardware revision has been detected and the number of interrupts to
mask is known.

Based on patch by Koji Matsuoka <koji.matsuoka.xm@renesas.com> included
in the Renesas BSP kernel. Updated that to use wfp_count as the number of
WPF interrupts to mask.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
---
 drivers/media/platform/vsp1/vsp1_drv.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)
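
The ordering problem described in the commit message can be illustrated
with a small user-space model. This is only a sketch: struct fake_vsp and
every fake_* or probe_* function are hypothetical stand-ins, not driver
code, and the model assumes the handler depends on state (here "info")
that probe fills in only after the hardware revision is detected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device: one enable and one status word. */
struct fake_vsp {
	uint32_t irq_enb;	/* models a VI6_*_IRQ_ENB register */
	uint32_t irq_sta;	/* models the matching status register */
	const int *info;	/* set late in probe, like vsp1->info */
	int handler_calls;	/* how often the handler ran */
	int null_derefs;	/* how often it ran while info was NULL */
};

/* Models the handler: it relies on info being set. */
static void fake_irq_handler(struct fake_vsp *vsp)
{
	vsp->handler_calls++;
	if (vsp->info == NULL)
		vsp->null_derefs++;	/* a real handler would crash here */
}

/* Models requesting an IRQ whose line is already asserted. */
static void fake_request_irq(struct fake_vsp *vsp)
{
	if (vsp->irq_enb & vsp->irq_sta)
		fake_irq_handler(vsp);
}

/* Old ordering: request the IRQ first, detect the hardware afterwards. */
static void probe_request_first(struct fake_vsp *vsp, const int *info)
{
	fake_request_irq(vsp);
	vsp->info = info;
}

/* Patched ordering: detect the hardware, mask, then request the IRQ. */
static void probe_mask_first(struct fake_vsp *vsp, const int *info)
{
	vsp->info = info;
	vsp->irq_enb = 0;
	fake_request_irq(vsp);
}
```

Starting from a device whose enable and status bits were left set,
probe_request_first() runs the handler once while info is still NULL,
whereas probe_mask_first() never runs it.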

Comments

Kieran Bingham Oct. 18, 2021, 11:58 a.m. UTC | #1
Hi Nikita,

Quoting Nikita Yushchenko (2021-09-26 16:53:56)
> Requesting the VSP interrupt handler without first masking the interrupts
> causes the handler to be called immediately (and to crash with a NULL
> pointer dereference) on the r8a77951-ulcb-kf board.
>
> Fix that by explicitly masking all interrupts before requesting the
> interrupt handler. To do so, the handler has to be requested later, once
> the hardware revision has been detected and the number of interrupts to
> mask is known.
> 
> Based on patch by Koji Matsuoka <koji.matsuoka.xm@renesas.com> included
> in the Renesas BSP kernel. Updated that to use wfp_count as the number of

s/wfp_count/wpf_count/

> WPF interrupts to mask.
> 
> Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> ---
>  drivers/media/platform/vsp1/vsp1_drv.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
> index de442d6c9926..0e9a6fad54f8 100644
> --- a/drivers/media/platform/vsp1/vsp1_drv.c
> +++ b/drivers/media/platform/vsp1/vsp1_drv.c
> @@ -811,13 +811,6 @@ static int vsp1_probe(struct platform_device *pdev)
>                 return -EINVAL;
>         }
>  
> -       ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> -                             IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> -       if (ret < 0) {
> -               dev_err(&pdev->dev, "failed to request IRQ\n");
> -               return ret;
> -       }
> -
>         /* FCP (optional). */
>         fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
>         if (fcp_node) {
> @@ -847,7 +840,6 @@ static int vsp1_probe(struct platform_device *pdev)
>                 goto done;
>  
>         vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
> -       vsp1_device_put(vsp1);
>  
>         for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
>                 if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
> @@ -861,11 +853,26 @@ static int vsp1_probe(struct platform_device *pdev)
>                 dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
>                         vsp1->version);
>                 ret = -ENXIO;
> +               vsp1_device_put(vsp1);
>                 goto done;
>         }
>  
>         dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
>  
> +       for (i = 0; i < vsp1->info->lif_count; ++i)
> +               vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
> +       for (i = 0; i < vsp1->info->wpf_count; ++i)
> +               vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);

Should any other state or context on the hardware be manually reset?

The initial value of VI6_WPFn_IRQ_ENB and VI6_DISPn_IRQ_ENB is
explicitly stated as H'00000000 in the datasheet. So perhaps that
implies that something else is going on here.

Perhaps the display is already used before the kernel boots, to show a
boot splash screen or similar?

Will the 'pending' interrupts otherwise have been cleared by the time we
come to enable them, or will we still have a race?

Otherwise we should be clearing the status bits too. And if we need to
do a whole software reset, we should use the software reset controls
instead.

vsp1_device_init() does a vsp1_reset_wpf() for any WPF that is running,
and it is called from vsp1_pm_runtime_resume(), so everything should
already be getting reset by software at the first call to
pm_runtime_enable, I think...

That said, I can see how there could still be a race so requesting the
IRQ below /after/ the device is initialised is a good thing. I just
don't think we need the manual resets that you've added above.

Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
and VI6_WPF_IRQ_ENB are really needed in your use case please?

--
Kieran


> +
> +       vsp1_device_put(vsp1);
> +
> +       ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> +                              IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> +       if (ret < 0) {
> +               dev_err(&pdev->dev, "failed to request IRQ\n");
> +               goto done;
> +       }
> +
>         /* Instantiate entities. */
>         ret = vsp1_create_entities(vsp1);
>         if (ret < 0) {
> -- 
> 2.30.2
>
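
The difference between masking the enable bits and clearing the status
bits, raised in the review above, can be sketched with a toy register
model. Everything here is an assumption made for illustration: the struct
and helper names are hypothetical, and the model simply treats the
interrupt line as asserted whenever an enabled source has its status bit
latched.

```c
#include <stdint.h>

/* Hypothetical pair of interrupt registers for one source group. */
struct fake_irq_regs {
	uint32_t enb;	/* enable bits */
	uint32_t sta;	/* latched status bits */
};

/* The line is asserted when any enabled source has its status bit set. */
static int line_asserted(const struct fake_irq_regs *r)
{
	return (r->enb & r->sta) != 0;
}

/* Mask every source; latched status is left untouched. */
static void mask_all(struct fake_irq_regs *r)
{
	r->enb = 0;
}

/* Acknowledge every source by dropping its latched status. */
static void ack_all(struct fake_irq_regs *r)
{
	r->sta = 0;
}

/* Enable the given sources. */
static void enable(struct fake_irq_regs *r, uint32_t bits)
{
	r->enb |= bits;
}
```

In this model, masking alone quiets the line only until the same source is
enabled again, because the stale status bit is still latched. Masking and
acknowledging before enabling avoids that stale interrupt.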
Nikita Yushchenko Oct. 20, 2021, 5:45 p.m. UTC | #2
Hi,

> Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
> and VI6_WPF_IRQ_ENB are really needed in your use case please?

Commenting out those register writes causes

[    2.275137][    C0] irq 188: nobody cared (try booting with the "irqpoll" option)
[    2.282621][    C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc5 #28
[    2.289669][    C0] Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
[    2.297844][    C0] Call trace:
[    2.300981][    C0]  dump_backtrace+0x0/0x198
[    2.305348][    C0]  show_stack+0x1c/0x28
[    2.309357][    C0]  dump_stack_lvl+0x64/0x7c
[    2.313718][    C0]  dump_stack+0x14/0x2c
[    2.317725][    C0]  __report_bad_irq+0x50/0xdc
[    2.322254][    C0]  note_interrupt+0x2e4/0x330
[    2.326786][    C0]  handle_irq_event_percpu+0x58/0x68
[    2.331927][    C0]  handle_irq_event+0x4c/0x98
[    2.336456][    C0]  handle_fasteoi_irq+0xd0/0x180
[    2.341245][    C0]  handle_domain_irq+0x94/0xd8
[    2.345862][    C0]  gic_handle_irq+0xa4/0xe0
[    2.350216][    C0]  do_interrupt_handler+0x38/0x60
[    2.355093][    C0]  el1_interrupt+0x2c/0x68
[    2.359362][    C0]  el1h_64_irq_handler+0x14/0x20
[    2.364151][    C0]  el1h_64_irq+0x74/0x78
[    2.368244][    C0]  __do_softirq+0xc8/0x404
[    2.372511][    C0]  irq_exit+0x118/0x140
[    2.376521][    C0]  handle_domain_irq+0x98/0xd8
[    2.381137][    C0]  gic_handle_irq+0xa4/0xe0
[    2.385490][    C0]  call_on_irq_stack+0x28/0x3c
[    2.390105][    C0]  do_interrupt_handler+0x54/0x60
[    2.394981][    C0]  el1_interrupt+0x2c/0x68
[    2.399247][    C0]  el1h_64_irq_handler+0x14/0x20
[    2.404036][    C0]  el1h_64_irq+0x74/0x78
[    2.408129][    C0]  _raw_spin_unlock_irqrestore+0x20/0x50
[    2.413615][    C0]  __setup_irq+0x56c/0x888
[    2.417882][    C0]  request_threaded_irq+0xf0/0x1a8
[    2.422843][    C0]  devm_request_threaded_irq+0x84/0xf8
[    2.428155][    C0]  vsp1_probe+0x218/0xb48
[    2.432340][    C0]  platform_probe+0x6c/0xd8
[    2.436700][    C0]  really_probe+0xc0/0x428
[    2.440967][    C0]  __driver_probe_device+0x114/0x188
[    2.446103][    C0]  driver_probe_device+0x44/0xe8
[    2.450891][    C0]  __driver_attach+0xbc/0x1a0
[    2.455419][    C0]  bus_for_each_dev+0x64/0xa0
[    2.459947][    C0]  driver_attach+0x28/0x30
[    2.464215][    C0]  bus_add_driver+0x144/0x228
[    2.468743][    C0]  driver_register+0x68/0x118
[    2.473272][    C0]  __platform_driver_register+0x2c/0x38
[    2.478669][    C0]  vsp1_platform_driver_init+0x20/0x28
[    2.483985][    C0]  do_one_initcall+0x38/0x258
[    2.488513][    C0]  kernel_init_freeable+0x228/0x28c
[    2.493565][    C0]  kernel_init+0x28/0x120
[    2.497745][    C0]  ret_from_fork+0x10/0x20
[    2.502013][    C0] handlers:
[    2.504974][    C0] [<0000000040be598b>] vsp1_irq_handler
[    2.510376][    C0] Disabling IRQ #188

Nikita
Kieran Bingham Nov. 2, 2021, 11:13 a.m. UTC | #3
Quoting Nikita Yushchenko (2021-10-20 18:45:50)
> Hi,
> 
> > Could you test to see if those lines to explicitly set VI6_DISP_IRQ_ENB
> > and VI6_WPF_IRQ_ENB are really needed in your use case please?
> 
> Commenting out those register writes causes
> 
> [    2.275137][    C0] irq 188: nobody cared (try booting with the "irqpoll" option)
> [    2.282621][    C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc5 #28

Ok, so something has clearly caused these to get set, and the reset has
not cleared them.

I think I would rather see the code to reset them done in
vsp1_reset_wpf(), rather than in probe directly as that is what we are
doing, and is I believe already in the call path.

This should be with a comment stating explicitly that we are manually
resetting VI6_DISP_IRQ_ENB as the reset does not always clear these
bits.

(But I'm really ... really concerned that the hardware is not actually
getting reset when it should, and that might merit some further
investigation).

Calling devm_request_irq() /after/ initialising the hardware is
certainly a good thing though.

--
Kieran



Nikita Yushchenko Dec. 13, 2021, 8:13 p.m. UTC | #4
Hi.

Now I'm finally looking at this again.

> I think I would rather see the code to reset them done in
> vsp1_reset_wpf(), rather than in probe directly as that is what we are
> doing, and is I believe already in the call path.

Could you please explain how that is intended to be called on the probe path?

As far as I can read from the code, vsp1_reset_wpf() is only called from
vsp1_device_init(), which in turn is called only from the PM resume hook,
and only if vsp1->info is already set. However, in the probe path,
pm_runtime_enable() is called before vsp1->info is set.

Nikita
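
The call-path argument above can be sketched as a small user-space model.
It is an illustration only: the fake_* names are hypothetical stand-ins
for the driver functions named in the message, and the model assumes that
the probe path resumes the device once, to read the version register,
before info is assigned.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-model device info. */
struct fake_info {
	unsigned int wpf_count;
};

struct fake_vsp1 {
	const struct fake_info *info;	/* NULL until the version is matched */
	unsigned int wpf_resets;	/* counts modelled vsp1_reset_wpf() calls */
};

/* Models vsp1_device_init(): reset every WPF. */
static void fake_device_init(struct fake_vsp1 *vsp1)
{
	unsigned int i;

	for (i = 0; i < vsp1->info->wpf_count; ++i)
		vsp1->wpf_resets++;
}

/* Models the PM resume hook: initialise only when info is known. */
static void fake_pm_runtime_resume(struct fake_vsp1 *vsp1)
{
	if (vsp1->info)
		fake_device_init(vsp1);
}

/* Models the probe path: the resume happens before info is set. */
static void fake_probe(struct fake_vsp1 *vsp1, const struct fake_info *info)
{
	fake_pm_runtime_resume(vsp1);	/* info is still NULL: no WPF reset */
	vsp1->info = info;
}
```

Under these assumptions no WPF reset happens during probe. The reset only
runs on a later resume, once info has been set.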

Patch

diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
index de442d6c9926..0e9a6fad54f8 100644
--- a/drivers/media/platform/vsp1/vsp1_drv.c
+++ b/drivers/media/platform/vsp1/vsp1_drv.c
@@ -811,13 +811,6 @@  static int vsp1_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}
 
-	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
-			      IRQF_SHARED, dev_name(&pdev->dev), vsp1);
-	if (ret < 0) {
-		dev_err(&pdev->dev, "failed to request IRQ\n");
-		return ret;
-	}
-
 	/* FCP (optional). */
 	fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
 	if (fcp_node) {
@@ -847,7 +840,6 @@  static int vsp1_probe(struct platform_device *pdev)
 		goto done;
 
 	vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
-	vsp1_device_put(vsp1);
 
 	for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
 		if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
@@ -861,11 +853,26 @@  static int vsp1_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
 			vsp1->version);
 		ret = -ENXIO;
+		vsp1_device_put(vsp1);
 		goto done;
 	}
 
 	dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
 
+	for (i = 0; i < vsp1->info->lif_count; ++i)
+		vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
+	for (i = 0; i < vsp1->info->wpf_count; ++i)
+		vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);
+
+	vsp1_device_put(vsp1);
+
+	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
+			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ\n");
+		goto done;
+	}
+
 	/* Instantiate entities. */
 	ret = vsp1_create_entities(vsp1);
 	if (ret < 0) {