diff mbox series

[drm-misc-next,v3,4/7] drm/arm/hdlcd: use drm_dev_unplug()

Message ID 20221001011905.433408-5-dakr@redhat.com (mailing list archive)
State New, archived
Headers show
Series drm/arm/hdlcd: use drm managed resources | expand

Commit Message

Danilo Krummrich Oct. 1, 2022, 1:19 a.m. UTC
When the driver is unbound, there might still be users in userspace
having an open fd and are calling into the driver.

While this is fine for drm managed resources, it is not for resources
bound to the device/driver lifecycle, e.g. clocks or MMIO mappings.

To prevent use-after-free issues we need to protect those resources with
drm_dev_enter() and drm_dev_exit(). This does only work if we indicate
that the drm device was unplugged, hence use drm_dev_unplug() instead of
drm_dev_unregister().

Protecting the particular resources with drm_dev_enter()/drm_dev_exit()
is handled by subsequent patches.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/arm/hdlcd_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Liviu Dudau Oct. 12, 2022, 3:07 p.m. UTC | #1
Hi Danilo,

Appologies again for the delay in reviewing this as I was at XDC last week.

This patch is causing a regression at 'rmmod' time as the drm_crtc_vblank_off() does
not get called when we disable outputs and the HDLCD remains active as I keep getting
unhandled context faults from the arm-smmu driver that sits in front of the HDLCD.

This is the stack trace for it:

root@alarm ~]# rmmod hdlcd
[  198.981343] Console: switching to colour dummy device 80x25
[  199.012492] ------------[ cut here ]------------
[  199.017209] driver forgot to call drm_crtc_vblank_off()
[  199.023513] WARNING: CPU: 5 PID: 176 at drivers/gpu/drm/drm_atomic_helper.c:1236 disable_outputs+0x2c4/0x2d0 [drm_kms_helper]
[  199.035055] Modules linked in: hdlcd(-) drm_dma_helper tda998x cec drm_kms_helper drm
[  199.042929] CPU: 5 PID: 176 Comm: kworker/5:2 Not tainted 6.0.0-rc2-00007-ge17e6f0211cd #6
[  199.051212] Hardware name: ARM Juno development board (r2) (DT)
[  199.057143] Workqueue: events drm_mode_rmfb_work_fn [drm]
[  199.062831] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  199.069809] pc : disable_outputs+0x2c4/0x2d0 [drm_kms_helper]
[  199.075664] lr : disable_outputs+0x2c4/0x2d0 [drm_kms_helper]
[  199.081519] sp : ffff80000a8f3b60
[  199.084836] x29: ffff80000a8f3b60 x28: 0000000000000028 x27: ffff0008013962b8
[  199.091996] x26: ffff800001049688 x25: ffff800001080520 x24: 0000000000000038
[  199.099155] x23: 0000000000000000 x22: ffff000801396000 x21: ffff0008013966f0
[  199.106314] x20: 0000000000000000 x19: ffff00080a65a800 x18: 0000000000000000
[  199.113472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  199.120630] x14: 00000000000000e1 x13: 0000000000000000 x12: 0000000000000000
[  199.127788] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff80000a8f3910
[  199.134947] x8 : ffff00080149d480 x7 : ffff00097efb5bc0 x6 : 0000000000000083
[  199.142106] x5 : 0000000000000000 x4 : ffff00097efaea18 x3 : ffff00097efb1c20
[  199.149264] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00080149c9c0
[  199.156423] Call trace:
[  199.158870]  disable_outputs+0x2c4/0x2d0 [drm_kms_helper]
[  199.164380]  drm_atomic_helper_commit_tail+0x24/0xa0 [drm_kms_helper]
[  199.170933]  commit_tail+0x150/0x180 [drm_kms_helper]
[  199.176093]  drm_atomic_helper_commit+0x13c/0x370 [drm_kms_helper]
[  199.182384]  drm_atomic_commit+0xa4/0xe0 [drm]
[  199.187074]  drm_framebuffer_remove+0x434/0x4d4 [drm]
[  199.192374]  drm_mode_rmfb_work_fn+0x74/0x9c [drm]
[  199.197413]  process_one_work+0x1d4/0x330
[  199.201437]  worker_thread+0x220/0x430
[  199.205195]  kthread+0x108/0x10c
[  199.208430]  ret_from_fork+0x10/0x20
[  199.212015] ---[ end trace 0000000000000000 ]---
[  199.221088] arm-smmu 7fb10000.iommu: Unhandled context fault: fsr=0x2, iova=0xffa53600, fsynr=0x182, cbfrsynra=0x0, cb=0
[  199.232958] arm-smmu 7fb10000.iommu: Unhandled context fault: fsr=0x2, iova=0xff800000, fsynr=0x182, cbfrsynra=0x0, cb=0
[  199.233228] ------------[ cut here ]------------
[  199.248618] hdlcd 7ff50000.hdlcd: drm_WARN_ON(({ do { __attribute__((__noreturn__)) extern void __compiletime_assert_384(void) __attribute__((__error__("Unsupported access size for {READ,WRITE}_ONCE()."))); if (!((sizeof(vblank->enabled) == sizeof(char) || sizeof(vblank->enabled) == sizeof(short) || sizeof(vblank->
Danilo Krummrich Oct. 14, 2022, 12:07 a.m. UTC | #2
Hi Liviu,

On 10/12/22 17:07, Liviu Dudau wrote:
> Hi Danilo,
> 
> Appologies again for the delay in reviewing this as I was at XDC last week.

No worries, thanks for following up.

> Looking at the documentation for drm_dev_unplug, you can get a hint about what is going on:
> 
>   /*
>   * [....] There is one
>   * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before
>   * drm_atomic_helper_shutdown() is called. This means that if the disable code
>   * paths are protected, they will not run on regular driver module unload,
>   * possibly leaving the hardware enabled.
>   */
> 

Yes, that's the issue we have and pretty unfortunate. What we'd want for 
platform device drivers is to still be able to enter the sections locked 
with drm_dev_{enter,exit} on driver unbind, which we can't for at the 
moment.

I discussed this with Daniel Vetter on #dri-devel and for now he 
suggests to just not lock access to platform device bound resources and 
respin the patchset removing those parts.

Besides that I'll also work on a solution for drm_dev_unplug() / 
drm_dev_{enter,exit} to overcome this issue in the future.

> I'm finally getting to a conclusion: I'm still not sure what problem you were trying
> to solve when you have started this series and if you have found a scenario where
> you've got use after free then I would like to be able to reproduce it on my setup.
> Otherwise, I think this whole series introduces a regression on the behaviour of the
> driver and I would be inclined to reject it.

The problem is that DRM modeset objects should never be allocated with 
devm_*, since this can result in use-after free issues on driver unload. 
They should be freed when the last reference to the object is dropped, 
which DRM managed APIs take care of. As a consequence, DRM managed 
objects can potentially out-live platform device bound resources, which 
then should be protected from access. The first at least we can and 
should do.

It's not an issue that's unique to hdlcd, it's just a known issue to be 
addressed by all drivers. There's still the shortcoming concerning 
protecting platform bound resources as discussed above, but "the drmm 
parts should be a good idea no matter what" to cite Daniel.

I'll send a v4 without the drm_dev_{enter,exit} parts removed if that's 
fine for you.

- Danilo

> 
> Best regards,
> Liviu
Liviu Dudau Oct. 14, 2022, 11:11 a.m. UTC | #3
On Fri, Oct 14, 2022 at 02:07:09AM +0200, Danilo Krummrich wrote:
> Hi Liviu,
> 
> On 10/12/22 17:07, Liviu Dudau wrote:
> > Hi Danilo,
> > 
> > Appologies again for the delay in reviewing this as I was at XDC last week.
> 
> No worries, thanks for following up.
> 
> > Looking at the documentation for drm_dev_unplug, you can get a hint about what is going on:
> > 
> >   /*
> >   * [....] There is one
> >   * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before
> >   * drm_atomic_helper_shutdown() is called. This means that if the disable code
> >   * paths are protected, they will not run on regular driver module unload,
> >   * possibly leaving the hardware enabled.
> >   */
> > 
> 
> Yes, that's the issue we have and pretty unfortunate. What we'd want for
> platform device drivers is to still be able to enter the sections locked
> with drm_dev_{enter,exit} on driver unbind, which we can't for at the
> moment.
> 
> I discussed this with Daniel Vetter on #dri-devel and for now he suggests to
> just not lock access to platform device bound resources and respin the
> patchset removing those parts.
> 
> Besides that I'll also work on a solution for drm_dev_unplug() /
> drm_dev_{enter,exit} to overcome this issue in the future.
> 
> > I'm finally getting to a conclusion: I'm still not sure what problem you were trying
> > to solve when you have started this series and if you have found a scenario where
> > you've got use after free then I would like to be able to reproduce it on my setup.
> > Otherwise, I think this whole series introduces a regression on the behaviour of the
> > driver and I would be inclined to reject it.
> 
> The problem is that DRM modeset objects should never be allocated with
> devm_*, since this can result in use-after free issues on driver unload.
> They should be freed when the last reference to the object is dropped, which
> DRM managed APIs take care of. As a consequence, DRM managed objects can
> potentially out-live platform device bound resources, which then should be
> protected from access. The first at least we can and should do.
> 
> It's not an issue that's unique to hdlcd, it's just a known issue to be
> addressed by all drivers. There's still the shortcoming concerning
> protecting platform bound resources as discussed above, but "the drmm parts
> should be a good idea no matter what" to cite Daniel.
> 
> I'll send a v4 without the drm_dev_{enter,exit} parts removed if that's fine
> for you.

Thanks for the description of the problem!

Also bear in mind that hdlcd and malidp use the component framework, which means that
the unbind is happening through the component_master_del() function.

I'm still keen to get a reproducer for the original issue of use-after free on hdlcd
(or malidp) that I can play with so that I can validate the final fix.

Best regards,
Liviu

> 
> - Danilo
> 
> > 
> > Best regards,
> > Liviu
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c
index 120c87934a91..e41def6d47cc 100644
--- a/drivers/gpu/drm/arm/hdlcd_drv.c
+++ b/drivers/gpu/drm/arm/hdlcd_drv.c
@@ -325,7 +325,7 @@  static void hdlcd_drm_unbind(struct device *dev)
 	struct drm_device *drm = dev_get_drvdata(dev);
 	struct hdlcd_drm_private *hdlcd = drm_to_hdlcd_priv(drm);
 
-	drm_dev_unregister(drm);
+	drm_dev_unplug(drm);
 	drm_kms_helper_poll_fini(drm);
 	component_unbind_all(dev, drm);
 	of_node_put(hdlcd->crtc.port);