Message ID | 20210810103336.114077-1-robert.foss@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] media: camss: vfe: Don't use vfe->base before it's assigned | expand |
On 10.08.2021 12:33, Robert Foss wrote: > vfe->ops->hw_version(vfe) being called before vfe->base has been assigned > is incorrect and causes crashes. > > Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> > Signed-off-by: Robert Foss <robert.foss@linaro.org> With this patch applied on top of linux next-20210810 instead of the NULL pointer dereference I get following error on DragonBoard410c while loading kernel modules: [ 18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1 [ 18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2 [ 18.600373] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP > --- > drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c > index 6b2f33fc9be22..1c8d2f0f81207 100644 > --- a/drivers/media/platform/qcom/camss/camss-vfe.c > +++ b/drivers/media/platform/qcom/camss/camss-vfe.c > @@ -1299,7 +1299,6 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe, > return -EINVAL; > } > vfe->ops->subdev_init(dev, vfe); > - vfe->ops->hw_version(vfe); > > /* Memory */ > > @@ -1309,6 +1308,8 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe, > return PTR_ERR(vfe->base); > } > > + vfe->ops->hw_version(vfe); > + > /* Interrupt */ > > r = platform_get_resource_byname(pdev, IORESOURCE_IRQ, Best regards
Hey Marek, Thanks for testing this. On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > On 10.08.2021 12:33, Robert Foss wrote: > > vfe->ops->hw_version(vfe) being called before vfe->base has been assigned > > is incorrect and causes crashes. > > > > Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") > > > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> > > Signed-off-by: Robert Foss <robert.foss@linaro.org> > > With this patch applied on top of linux next-20210810 instead of the > NULL pointer dereference I get following error on DragonBoard410c while > loading kernel modules: > > [ 18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1 > [ 18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2 > [ 18.600373] Internal error: synchronous external abort: 96000010 [#1] > PREEMPT SMP I'll spin a v2 asap.
On Wed, 11 Aug 2021 at 11:41, Robert Foss <robert.foss@linaro.org> wrote: > > Hey Marek, > > Thanks for testing this. > > On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > > > On 10.08.2021 12:33, Robert Foss wrote: > > > vfe->ops->hw_version(vfe) being called before vfe->base has been assigned > > > is incorrect and causes crashes. > > > > > > Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") > > > > > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> > > > Signed-off-by: Robert Foss <robert.foss@linaro.org> > > > > With this patch applied on top of linux next-20210810 instead of the > > NULL pointer dereference I get following error on DragonBoard410c while > > loading kernel modules: > > > > [ 18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1 > > [ 18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2 > > [ 18.600373] Internal error: synchronous external abort: 96000010 [#1] > > PREEMPT SMP > After testing this patch + linux-next[1] I'm not able to replicate the 'Internal error' above on the db410c. And I don't think it is related to this patch. Are you seeing the same error on [1]? And are you seeing it before the b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") patch? [1] https://git.linaro.org/people/robert.foss/linux.git/log/?h=camss_print_fix_v1 Rob
Hi Robert, On 11.08.2021 15:42, Robert Foss wrote: > On Wed, 11 Aug 2021 at 11:41, Robert Foss <robert.foss@linaro.org> wrote: >> Hey Marek, >> >> Thanks for testing this. >> >> On Wed, 11 Aug 2021 at 09:48, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>> On 10.08.2021 12:33, Robert Foss wrote: >>>> vfe->ops->hw_version(vfe) being called before vfe->base has been assigned >>>> is incorrect and causes crashes. >>>> >>>> Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") >>>> >>>> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> >>>> Signed-off-by: Robert Foss <robert.foss@linaro.org> >>> With this patch applied on top of linux next-20210810 instead of the >>> NULL pointer dereference I get following error on DragonBoard410c while >>> loading kernel modules: >>> >>> [ 18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1 >>> [ 18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2 >>> [ 18.600373] Internal error: synchronous external abort: 96000010 [#1] >>> PREEMPT SMP > After testing this patch + linux-next[1] I'm not able to replicate the > 'Internal error' above on the db410c. And I don't think it is related > to this patch. > > Are you seeing the same error on [1]? And are you seeing it before the > b10b5334528a9 ("media: camss: vfe: Don't read hardware version > needlessly") patch? I've checked once again on your branch. Yes, it is fully reproducible on my DragonBoard410c. On your branch I get the above synchronous external abort, while without your last patch I get Null ptr dereference. Are you sure you have deployed the kernel modules for doing the test? This issue happens when udev triggers loading kernel modules for the detected hardware. Here is the kernel configuration used for my tests on ARM64 based board: # make ARCH=arm64 defconfig && ./scripts/config --set-val CMA_SIZE_MBYTES 96 -e PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PM_DEBUG -e PM_ADVANCED_DEBUG -d ARCH_SUNXI -d ARCH_ALPINE -d DRM_NOUVEAU -d ARCH_BCM_IPROC -d ARCH_BERLIN -d ARCH_BRCMSTB -d ARCH_LAYERSCAPE -d ARCH_LG1K -d ARCH_HISI -d ARCH_MEDIATEK -d ARCH_MVEBU -d ARCH_ROCKCHIP -d ARCH_SEATTLE -d ARCH_SYNQUACER -d ARCH_RENESAS -d ARCH_STRATIX10 -d ARCH_TEGRA -d ARCH_SPRD -d ARCH_THUNDER -d ARCH_THUNDER2 -d ARCH_UNIPHIER -d ARCH_XGENE -d ARCH_ZX -d ARCH_ZYNQMP -d HIBERNATION -d CLK_SUNXI -e BLK_DEV_RAM --set-val BLK_DEV_RAM_COUNT 4 --set-val BLK_DEV_RAM_SIZE 65536 -d CONFIG_EFI -d CONFIG_TEE Comparing to the arm64's defconfig, I've enabled some debugging options and disabled some unused boards. Best regards
> >>> > >>> [ 18.480608] qcom-venus 1d00000.video-codec: Adding to iommu group 1 > >>> [ 18.536167] qcom-camss 1b0ac00.camss: Adding to iommu group 2 > >>> [ 18.600373] Internal error: synchronous external abort: 96000010 [#1] > >>> PREEMPT SMP > > After testing this patch + linux-next[1] I'm not able to replicate the > > 'Internal error' above on the db410c. And I don't think it is related > > to this patch. > > > > Are you seeing the same error on [1]? And are you seeing it before the > > b10b5334528a9 ("media: camss: vfe: Don't read hardware version > > needlessly") patch? > > I've checked once again on your branch. Yes, it is fully reproducible on > my DragonBoard410c. On your branch I get the above synchronous external > abort, while without your last patch I get Null ptr dereference. > > Are you sure you have deployed the kernel modules for doing the test? > This issue happens when udev triggers loading kernel modules for the > detected hardware. > > Here is the kernel configuration used for my tests on ARM64 based board: > > # make ARCH=arm64 defconfig && ./scripts/config --set-val > CMA_SIZE_MBYTES 96 -e PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PM_DEBUG -e > PM_ADVANCED_DEBUG -d ARCH_SUNXI -d ARCH_ALPINE -d DRM_NOUVEAU -d > ARCH_BCM_IPROC -d ARCH_BERLIN -d ARCH_BRCMSTB -d ARCH_LAYERSCAPE -d > ARCH_LG1K -d ARCH_HISI -d ARCH_MEDIATEK -d ARCH_MVEBU -d ARCH_ROCKCHIP > -d ARCH_SEATTLE -d ARCH_SYNQUACER -d ARCH_RENESAS -d ARCH_STRATIX10 -d > ARCH_TEGRA -d ARCH_SPRD -d ARCH_THUNDER -d ARCH_THUNDER2 -d > ARCH_UNIPHIER -d ARCH_XGENE -d ARCH_ZX -d ARCH_ZYNQMP -d HIBERNATION -d > CLK_SUNXI -e BLK_DEV_RAM --set-val BLK_DEV_RAM_COUNT 4 --set-val > BLK_DEV_RAM_SIZE 65536 -d CONFIG_EFI -d CONFIG_TEE > > Comparing to the arm64's defconfig, I've enabled some debugging options > and disabled some unused boards. > I made a mistake when testing on the db410c earlier, but with it fixed I can replicate the issue. It seems to stem from the hardware not actually being powered up when the hw_version() call is being made. Unfortunately there is no simple way to achieve the original intention of the series that reduced the frequency of the hw_version() being called, while avoiding the above issue. So let's revert to the previous behaviour for the time being. Thank you for your help Marek! I'll submit a v2 shortly. Rob
diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c index 6b2f33fc9be22..1c8d2f0f81207 100644 --- a/drivers/media/platform/qcom/camss/camss-vfe.c +++ b/drivers/media/platform/qcom/camss/camss-vfe.c @@ -1299,7 +1299,6 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe, return -EINVAL; } vfe->ops->subdev_init(dev, vfe); - vfe->ops->hw_version(vfe); /* Memory */ @@ -1309,6 +1308,8 @@ int msm_vfe_subdev_init(struct camss *camss, struct vfe_device *vfe, return PTR_ERR(vfe->base); } + vfe->ops->hw_version(vfe); + /* Interrupt */ r = platform_get_resource_byname(pdev, IORESOURCE_IRQ,
vfe->ops->hw_version(vfe) being called before vfe->base has been assigned is incorrect and causes crashes. Fixes: b10b5334528a9 ("media: camss: vfe: Don't read hardware version needlessly") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> --- drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)