diff mbox series

[v1] media: camss: vfe: Don't use vfe->base before it's assigned

Message ID 20210810103336.114077-1-robert.foss@linaro.org (mailing list archive)
State New, archived
Headers show
Series [v1] media: camss: vfe: Don't use vfe->base before it's assigned | expand

Commit Message

Robert Foss Aug. 10, 2021, 10:33 a.m. UTC
vfe->ops->hw_version(vfe) being called before vfe->base has been assigned
is incorrect and causes crashes.

Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly")

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
---
 drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Marek Szyprowski Aug. 11, 2021, 7:48 a.m. UTC | #1
On 10.08.2021 12:33, Robert Foss wrote:
> vfe->ops->hw_version(vfe) being called before vfe->base has been assigned
> is incorrect and causes crashes.
>
> Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly")
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Signed-off-by: Robert Foss <robert.foss@linaro.org>

With this patch applied on top of linux next-20210810 instead of the 
NULL pointer dereference I get following error on DragonBoard410c while 
loading kernel modules:

[   18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1
[   18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2
[   18.600373] Internal error: synchronous external abort: 96000010 [#1] 
PREEMPT SMP

> ---
>   drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
> index 6b2f33fc9be22..1c8d2f0f81207 100644
> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
> @@ -1299,7 +1299,6 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe,
>   		return -EINVAL;
>   	}
>   	vfe->ops->subdev_init(dev, vfe);
> -	vfe->ops->hw_version(vfe);
>   
>   	/* Memory */
>   
> @@ -1309,6 +1308,8 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe,
>   		return PTR_ERR(vfe->base);
>   	}
>   
> +	vfe->ops->hw_version(vfe);
> +
>   	/* Interrupt */
>   
>   	r = platform_get_resource_byname(pdev, IORESOURCE_IRQ,

Best regards
Robert Foss Aug. 11, 2021, 9:41 a.m. UTC | #2
Hey Marek,

Thanks for testing this.

On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>
> On 10.08.2021 12:33, Robert Foss wrote:
> > vfe->ops->hw_version(vfe) being called before vfe->base has been assigned
> > is incorrect and causes crashes.
> >
> > Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly")
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> > Signed-off-by: Robert Foss <robert.foss@linaro.org>
>
> With this patch applied on top of linux next-20210810 instead of the
> NULL pointer dereference I get following error on DragonBoard410c while
> loading kernel modules:
>
> [   18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1
> [   18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2
> [   18.600373] Internal error: synchronous external abort: 96000010 [#1]
> PREEMPT SMP

I'll spin a v2 asap.
Robert Foss Aug. 11, 2021, 1:42 p.m. UTC | #3
On Wed, 11 Aug 2021 at 11:41, Robert Foss <robert.foss@linaro.org> wrote:
>
> Hey Marek,
>
> Thanks for testing this.
>
> On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> >
> > On 10.08.2021 12:33, Robert Foss wrote:
> > > vfe->ops->hw_version(vfe) being called before vfe->base has been assigned
> > > is incorrect and causes crashes.
> > >
> > > Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly")
> > >
> > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> > > Signed-off-by: Robert Foss <robert.foss@linaro.org>
> >
> > With this patch applied on top of linux next-20210810 instead of the
> > NULL pointer dereference I get following error on DragonBoard410c while
> > loading kernel modules:
> >
> > [   18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1
> > [   18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2
> > [   18.600373] Internal error: synchronous external abort: 96000010 [#1]
> > PREEMPT SMP
>

After testing this patch + linux-next[1] I'm not able to replicate the
'Internal error' above on the db410c. And I don't think it is related
to this patch.

Are you seeing the same error on [1]? And are you seeing it before the
b10b5334528a9 ("media: camss: vfe: Don't read hardware version
needlessly") patch?

[1] https://git.linaro.org/people/robert.foss/linux.git/log/?h=camss_print_fix_v1


Rob
Marek Szyprowski Aug. 12, 2021, 7:40 a.m. UTC | #4
Hi Robert,

On 11.08.2021 15:42, Robert Foss wrote:
> On Wed, 11 Aug 2021 at 11:41, Robert Foss <robert.foss@linaro.org> wrote:
>> Hey Marek,
>>
>> Thanks for testing this.
>>
>> On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>>> On 10.08.2021 12:33, Robert Foss wrote:
>>>> vfe->ops->hw_version(vfe) being called before vfe->base has been assigned
>>>> is incorrect and causes crashes.
>>>>
>>>> Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly")
>>>>
>>>> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
>>>> Signed-off-by: Robert Foss <robert.foss@linaro.org>
>>> With this patch applied on top of linux next-20210810 instead of the
>>> NULL pointer dereference I get following error on DragonBoard410c while
>>> loading kernel modules:
>>>
>>> [   18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1
>>> [   18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2
>>> [   18.600373] Internal error: synchronous external abort: 96000010 [#1]
>>> PREEMPT SMP
> After testing this patch + linux-next[1] I'm not able to replicate the
> 'Internal error' above on the db410c. And I don't think it is related
> to this patch.
>
> Are you seeing the same error on [1]? And are you seeing it before the
> b10b5334528a9 ("media: camss: vfe: Don't read hardware version
> needlessly") patch?

I've checked once again on your branch. Yes, it is fully reproducible on 
my DragonBoard410c. On your branch I get the above synchronous external 
abort, while without your last patch I get Null ptr dereference.

Are you sure you have deployed the kernel modules for doing the test? 
This issue happens when udev triggers loading kernel modules for the 
detected hardware.

Here is the kernel configuration used for my tests on ARM64 based board:

# make ARCH=arm64 defconfig && ./scripts/config --set-val 
CMA_SIZE_MBYTES 96 -e PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PM_DEBUG -e 
PM_ADVANCED_DEBUG -d ARCH_SUNXI -d ARCH_ALPINE -d DRM_NOUVEAU -d 
ARCH_BCM_IPROC -d ARCH_BERLIN -d ARCH_BRCMSTB -d ARCH_LAYERSCAPE -d 
ARCH_LG1K -d ARCH_HISI -d ARCH_MEDIATEK -d ARCH_MVEBU -d ARCH_ROCKCHIP 
-d ARCH_SEATTLE -d ARCH_SYNQUACER -d ARCH_RENESAS -d ARCH_STRATIX10 -d 
ARCH_TEGRA -d ARCH_SPRD -d ARCH_THUNDER -d ARCH_THUNDER2 -d 
ARCH_UNIPHIER -d ARCH_XGENE -d ARCH_ZX -d ARCH_ZYNQMP -d HIBERNATION -d 
CLK_SUNXI -e BLK_DEV_RAM --set-val BLK_DEV_RAM_COUNT 4 --set-val 
BLK_DEV_RAM_SIZE 65536 -d CONFIG_EFI -d CONFIG_TEE

Comparing to the arm64's defconfig, I've enabled some debugging options 
and disabled some unused boards.

Best regards
Robert Foss Aug. 12, 2021, 9:22 a.m. UTC | #5
> >>>
> >>> [   18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1
> >>> [   18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2
> >>> [   18.600373] Internal error: synchronous external abort: 96000010 [#1]
> >>> PREEMPT SMP
> > After testing this patch + linux-next[1] I'm not able to replicate the
> > 'Internal error' above on the db410c. And I don't think it is related
> > to this patch.
> >
> > Are you seeing the same error on [1]? And are you seeing it before the
> > b10b5334528a9 ("media: camss: vfe: Don't read hardware version
> > needlessly") patch?
>
> I've checked once again on your branch. Yes, it is fully reproducible on
> my DragonBoard410c. On your branch I get the above synchronous external
> abort, while without your last patch I get Null ptr dereference.
>
> Are you sure you have deployed the kernel modules for doing the test?
> This issue happens when udev triggers loading kernel modules for the
> detected hardware.
>
> Here is the kernel configuration used for my tests on ARM64 based board:
>
> # make ARCH=arm64 defconfig && ./scripts/config --set-val
> CMA_SIZE_MBYTES 96 -e PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PM_DEBUG -e
> PM_ADVANCED_DEBUG -d ARCH_SUNXI -d ARCH_ALPINE -d DRM_NOUVEAU -d
> ARCH_BCM_IPROC -d ARCH_BERLIN -d ARCH_BRCMSTB -d ARCH_LAYERSCAPE -d
> ARCH_LG1K -d ARCH_HISI -d ARCH_MEDIATEK -d ARCH_MVEBU -d ARCH_ROCKCHIP
> -d ARCH_SEATTLE -d ARCH_SYNQUACER -d ARCH_RENESAS -d ARCH_STRATIX10 -d
> ARCH_TEGRA -d ARCH_SPRD -d ARCH_THUNDER -d ARCH_THUNDER2 -d
> ARCH_UNIPHIER -d ARCH_XGENE -d ARCH_ZX -d ARCH_ZYNQMP -d HIBERNATION -d
> CLK_SUNXI -e BLK_DEV_RAM --set-val BLK_DEV_RAM_COUNT 4 --set-val
> BLK_DEV_RAM_SIZE 65536 -d CONFIG_EFI -d CONFIG_TEE
>
> Comparing to the arm64's defconfig, I've enabled some debugging options
> and disabled some unused boards.
>

I made a mistake when testing on the db410c earlier, but with it fixed
I can replicate the issue. It seems to stem from the hardware not
actually being powered up when the hw_version() call is being made.

Unfortunately there is no simple way to achieve the original intention
of the series that reduced the frequency of the hw_version() being
called, while avoiding the above issue. So let's revert to the
previous behaviour for the time being.

Thank you for your help Marek! I'll submit a v2 shortly.


Rob
diff mbox series

Patch

diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
index 6b2f33fc9be22..1c8d2f0f81207 100644
--- a/drivers/media/platform/qcom/camss/camss-vfe.c
+++ b/drivers/media/platform/qcom/camss/camss-vfe.c
@@ -1299,7 +1299,6 @@  int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe,
 		return -EINVAL;
 	}
 	vfe->ops->subdev_init(dev, vfe);
-	vfe->ops->hw_version(vfe);
 
 	/* Memory */
 
@@ -1309,6 +1308,8 @@  int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe,
 		return PTR_ERR(vfe->base);
 	}
 
+	vfe->ops->hw_version(vfe);
+
 	/* Interrupt */
 
 	r = platform_get_resource_byname(pdev, IORESOURCE_IRQ,