diff mbox series

[1/3] drm: Restore driver.preclose() for all to use

Message ID 20200723172119.17649-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [1/3] drm: Restore driver.preclose() for all to use | expand

Commit Message

Chris Wilson July 23, 2020, 5:21 p.m. UTC
An unfortunate sequence of events, but it turns out there is a valid
usecase for being able to free/decouple the driver objects before they
are freed by the DRM core. In particular, if we have a pointer into a
drm core object from inside a driver object, that pointer needs to be
nerfed *before* it is freed so that concurrent access (e.g. debugfs)
does not following the dangling pointer.

The legacy marker was adding in the code movement from drp_fops.c to
drm_file.c

References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
---
 drivers/gpu/drm/drm_file.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Daniel Vetter July 27, 2020, 7:32 p.m. UTC | #1
On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> An unfortunate sequence of events, but it turns out there is a valid
> usecase for being able to free/decouple the driver objects before they
> are freed by the DRM core. In particular, if we have a pointer into a
> drm core object from inside a driver object, that pointer needs to be
> nerfed *before* it is freed so that concurrent access (e.g. debugfs)
> does not following the dangling pointer.
>
> The legacy marker was adding in the code movement from drp_fops.c to
> drm_file.c

I might fumble a lot, but not this one:

commit 45c3d213a400c952ab7119f394c5293bb6877e6b
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon May 8 10:26:33 2017 +0200

    drm: Nerf the preclose callback for modern drivers

Also looking at the debugfs hook that has some rather adventurous
stuff going on I think, feels a bit like a kitchensink with batteries
included. If that's really all needed I'd say iterate the contexts by
first going over files, then the ctx (which arent shared anyway) and
the problem should also be gone.
-Daniel

> References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Cc: <stable@vger.kernel.org> # v4.12+
> ---
>  drivers/gpu/drm/drm_file.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 0ac4566ae3f4..7b4258d6f7cc 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -258,8 +258,7 @@ void drm_file_free(struct drm_file *file)
>                   (long)old_encode_dev(file->minor->kdev->devt),
>                   atomic_read(&dev->open_count));
>
> -       if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
> -           dev->driver->preclose)
> +       if (dev->driver->preclose)
>                 dev->driver->preclose(dev, file);
>
>         if (drm_core_check_feature(dev, DRIVER_LEGACY))
> --
> 2.20.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Tang, CQ July 27, 2020, 8:11 p.m. UTC | #2
> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Monday, July 27, 2020 12:33 PM
> To: Chris Wilson <chris@chris-wilson.co.uk>; Dave Airlie <airlied@redhat.com>
> Cc: intel-gfx <intel-gfx@lists.freedesktop.org>; stable
> <stable@vger.kernel.org>; Gustavo Padovan
> <gustavo.padovan@collabora.com>; Tang, CQ <cq.tang@intel.com>; dri-
> devel <dri-devel@lists.freedesktop.org>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH 1/3] drm: Restore driver.preclose() for all to use
> 
> On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson <chris@chris-wilson.co.uk>
> wrote:
> >
> > An unfortunate sequence of events, but it turns out there is a valid
> > usecase for being able to free/decouple the driver objects before they
> > are freed by the DRM core. In particular, if we have a pointer into a
> > drm core object from inside a driver object, that pointer needs to be
> > nerfed *before* it is freed so that concurrent access (e.g. debugfs)
> > does not following the dangling pointer.
> >
> > The legacy marker was adding in the code movement from drp_fops.c to
> > drm_file.c
> 
> I might fumble a lot, but not this one:
> 
> commit 45c3d213a400c952ab7119f394c5293bb6877e6b
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date:   Mon May 8 10:26:33 2017 +0200
> 
>     drm: Nerf the preclose callback for modern drivers
> 
> Also looking at the debugfs hook that has some rather adventurous stuff
> going on I think, feels a bit like a kitchensink with batteries included. If that's
> really all needed I'd say iterate the contexts by first going over files, then the
> ctx (which arent shared anyway) and the problem should also be gone.

Debugfs code can jump in after drm_gem_release() (where file->object_idr is destroyed), but before postclose(). At this window, everything is fine for debugfs context accessing except the file->object_idr.

--CQ

> -Daniel
> 
> > References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
> > Cc: CQ Tang <cq.tang@intel.com>
> > Cc: <stable@vger.kernel.org> # v4.12+
> > ---
> >  drivers/gpu/drm/drm_file.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index 0ac4566ae3f4..7b4258d6f7cc 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -258,8 +258,7 @@ void drm_file_free(struct drm_file *file)
> >                   (long)old_encode_dev(file->minor->kdev->devt),
> >                   atomic_read(&dev->open_count));
> >
> > -       if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
> > -           dev->driver->preclose)
> > +       if (dev->driver->preclose)
> >                 dev->driver->preclose(dev, file);
> >
> >         if (drm_core_check_feature(dev, DRIVER_LEGACY))
> > --
> > 2.20.1
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Sasha Levin July 27, 2020, 9:24 p.m. UTC | #3
Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: 4.12+

The bot has tested the following trees: v5.7.10, v5.4.53, v4.19.134, v4.14.189.

v5.7.10: Build OK!
v5.4.53: Build OK!
v4.19.134: Build OK!
v4.14.189: Failed to apply! Possible dependencies:
    112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/")
    1572042a4ab2 ("drm: provide management functions for drm_file")
    7a2c65dd32b1 ("drm: Release filp before global lock")
    7e13ad896484 ("drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count")
    b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
    c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained")
    cc662126b413 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
    e7af3116836f ("drm/i915: Introduce a preempt context")
    f0e4a0639752 ("drm/i915: Move GEM domain management to its own file")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?
Chris Wilson July 28, 2020, 4:27 p.m. UTC | #4
Quoting Daniel Vetter (2020-07-27 20:32:45)
> On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > An unfortunate sequence of events, but it turns out there is a valid
> > usecase for being able to free/decouple the driver objects before they
> > are freed by the DRM core. In particular, if we have a pointer into a
> > drm core object from inside a driver object, that pointer needs to be
> > nerfed *before* it is freed so that concurrent access (e.g. debugfs)
> > does not following the dangling pointer.
> >
> > The legacy marker was adding in the code movement from drp_fops.c to
> > drm_file.c
> 
> I might fumble a lot, but not this one:
> 
> commit 45c3d213a400c952ab7119f394c5293bb6877e6b
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date:   Mon May 8 10:26:33 2017 +0200
> 
>     drm: Nerf the preclose callback for modern drivers

Gah, when I going through the history it looked like it appeared out of
nowhere.

> Also looking at the debugfs hook that has some rather adventurous
> stuff going on I think, feels a bit like a kitchensink with batteries
> included. If that's really all needed I'd say iterate the contexts by
> first going over files, then the ctx (which arent shared anyway) and
> the problem should also be gone.

Or we could cut out the middlelayer and put the release under the driver
control with a call to the drm_release() when the driver is ready.
-Chris
Tang, CQ July 29, 2020, 3:09 p.m. UTC | #5
> -----Original Message-----
> From: Chris Wilson <chris@chris-wilson.co.uk>
> Sent: Tuesday, July 28, 2020 9:28 AM
> To: Daniel Vetter <daniel@ffwll.ch>; Dave Airlie <airlied@redhat.com>
> Cc: intel-gfx <intel-gfx@lists.freedesktop.org>; stable
> <stable@vger.kernel.org>; Gustavo Padovan
> <gustavo.padovan@collabora.com>; Tang, CQ <cq.tang@intel.com>; dri-
> devel <dri-devel@lists.freedesktop.org>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH 1/3] drm: Restore driver.preclose() for all to use
> 
> Quoting Daniel Vetter (2020-07-27 20:32:45)
> > On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson <chris@chris-wilson.co.uk>
> wrote:
> > >
> > > An unfortunate sequence of events, but it turns out there is a valid
> > > usecase for being able to free/decouple the driver objects before
> > > they are freed by the DRM core. In particular, if we have a pointer
> > > into a drm core object from inside a driver object, that pointer
> > > needs to be nerfed *before* it is freed so that concurrent access
> > > (e.g. debugfs) does not following the dangling pointer.
> > >
> > > The legacy marker was adding in the code movement from drp_fops.c to
> > > drm_file.c
> >
> > I might fumble a lot, but not this one:
> >
> > commit 45c3d213a400c952ab7119f394c5293bb6877e6b
> > Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Date:   Mon May 8 10:26:33 2017 +0200
> >
> >     drm: Nerf the preclose callback for modern drivers
> 
> Gah, when I going through the history it looked like it appeared out of
> nowhere.
> 
> > Also looking at the debugfs hook that has some rather adventurous
> > stuff going on I think, feels a bit like a kitchensink with batteries
> > included. If that's really all needed I'd say iterate the contexts by
> > first going over files, then the ctx (which arent shared anyway) and
> > the problem should also be gone.
> 
> Or we could cut out the middlelayer and put the release under the driver
> control with a call to the drm_release() when the driver is ready.

Chiris, can explain this idea, or post a patch ?

--CQ

> -Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 0ac4566ae3f4..7b4258d6f7cc 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -258,8 +258,7 @@  void drm_file_free(struct drm_file *file)
 		  (long)old_encode_dev(file->minor->kdev->devt),
 		  atomic_read(&dev->open_count));
 
-	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
-	    dev->driver->preclose)
+	if (dev->driver->preclose)
 		dev->driver->preclose(dev, file);
 
 	if (drm_core_check_feature(dev, DRIVER_LEGACY))