Message ID | 20200124130107.125404-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count |
On 1/24/20 2:01 PM, Chris Wilson wrote:
> Since drm_global_mutex is a true global mutex across devices, we don't
> want to acquire it unless absolutely necessary. For maintaining the
> device local open_count, we can use atomic operations on the counter
> itself, except when making the transition to/from 0. Here, we tackle the
> easy portion of delaying acquiring the drm_global_mutex for the final
> release by using atomic_dec_and_mutex_lock(), leaving the global
> serialisation across the device opens.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>

For the series:

Reviewed-by: Thomas Hellström <thellstrom@vmware.com>

Now the only remaining (though pre-existing) problem I can see is that
there is no corresponding mutex lock in drm_open() so that firstopen
might race with lastclose.. Or I might be missing something..

/Thomas
Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> On 1/24/20 2:01 PM, Chris Wilson wrote:
> > Since drm_global_mutex is a true global mutex across devices, we don't
> > want to acquire it unless absolutely necessary. For maintaining the
> > device local open_count, we can use atomic operations on the counter
> > itself, except when making the transition to/from 0. Here, we tackle the
> > easy portion of delaying acquiring the drm_global_mutex for the final
> > release by using atomic_dec_and_mutex_lock(), leaving the global
> > serialisation across the device opens.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
>
> For the series:
>
> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
>
> Now the only remaining (though pre-existing) problem I can see is that
> there is no corresponding mutex lock in drm_open() so that firstopen
> might race with lastclose.. Or I might be missing something..

iirc, it's a complicated dance where it goes through drm_stub_open()
first which acquires the drm_global_mutex.
-Chris
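For context, the first-open serialisation Chris refers to looks roughly like the sketch below. This is a simplified paraphrase of drm_stub_open() in drivers/gpu/drm/drm_file.c around this kernel version, not the verbatim code: the char-dev fops initially point at the stub, which takes drm_global_mutex before swapping in the driver fops and calling the real ->open().

static int drm_stub_open(struct inode *inode, struct file *filp)
{
	const struct file_operations *new_fops;
	struct drm_minor *minor;
	int err;

	/* Every open of a DRM char-dev enters here first, under the global
	 * mutex, so firstopen setup is serialised against lastclose (which
	 * also runs under drm_global_mutex). */
	mutex_lock(&drm_global_mutex);

	minor = drm_minor_acquire(iminor(inode));
	if (IS_ERR(minor)) {
		err = PTR_ERR(minor);
		goto out_unlock;
	}

	new_fops = fops_get(minor->dev->driver->fops);
	if (!new_fops) {
		err = -ENODEV;
		goto out_release;
	}

	/* Swap in the driver's fops and call its ->open(), e.g. drm_open(). */
	replace_fops(filp, new_fops);
	err = filp->f_op->open ? filp->f_op->open(inode, filp) : 0;

out_release:
	drm_minor_release(minor);
out_unlock:
	mutex_unlock(&drm_global_mutex);
	return err;
}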
Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> On 1/24/20 2:01 PM, Chris Wilson wrote:
> > Since drm_global_mutex is a true global mutex across devices, we don't
> > want to acquire it unless absolutely necessary. For maintaining the
> > device local open_count, we can use atomic operations on the counter
> > itself, except when making the transition to/from 0. Here, we tackle the
> > easy portion of delaying acquiring the drm_global_mutex for the final
> > release by using atomic_dec_and_mutex_lock(), leaving the global
> > serialisation across the device opens.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
>
> For the series:
>
> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>

Now being opt-in, it is fairly limited in scope and will not randomly
break others (touch wood) and the close() racing in BAT didn't throw
anything up, so pushed to drm-misc-next. Thanks for the review and
suggestions,

Next task is to suggest others might like to use it as well.
-Chris
On 1/24/20 7:39 PM, Chris Wilson wrote:
> Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
>> On 1/24/20 2:01 PM, Chris Wilson wrote:
>>> Since drm_global_mutex is a true global mutex across devices, we don't
>>> want to acquire it unless absolutely necessary. For maintaining the
>>> device local open_count, we can use atomic operations on the counter
>>> itself, except when making the transition to/from 0. Here, we tackle the
>>> easy portion of delaying acquiring the drm_global_mutex for the final
>>> release by using atomic_dec_and_mutex_lock(), leaving the global
>>> serialisation across the device opens.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
>> For the series:
>>
>> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
> Now being opt-in, it is fairly limited in scope and will not randomly
> break others (touch wood) and the close() racing in BAT didn't throw
> anything up, so pushed to drm-misc-next. Thanks for the review and
> suggestions,
>
> Next task is to suggest others might like to use it as well.
> -Chris

Thanks. I'll look at doing the same for those drivers I audited.

/Thomas
On Fri, Jan 24, 2020 at 06:39:26PM +0000, Chris Wilson wrote:
> Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> > On 1/24/20 2:01 PM, Chris Wilson wrote:
> > > Since drm_global_mutex is a true global mutex across devices, we don't
> > > want to acquire it unless absolutely necessary. For maintaining the
> > > device local open_count, we can use atomic operations on the counter
> > > itself, except when making the transition to/from 0. Here, we tackle the
> > > easy portion of delaying acquiring the drm_global_mutex for the final
> > > release by using atomic_dec_and_mutex_lock(), leaving the global
> > > serialisation across the device opens.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
> >
> > For the series:
> >
> > Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
>
> Now being opt-in, it is fairly limited in scope and will not randomly
> break others (touch wood) and the close() racing in BAT didn't throw
> anything up, so pushed to drm-misc-next. Thanks for the review and
> suggestions,

Yeah this version looks reasonable compared to the previous few (I'm
catching up on dri-devel). I've looked at getting rid of the
global_mutex, and all I have is a simple patch with a pile of notes.
It's real nasty. This one here is a neat trick that I missed, and I'm
semi-convinced it's safe :-)

> Next task is to suggest others might like to use it as well.

My idea for the opt-in was to look at whether ->load/->unload exists.
And ofc not bother with any of this for DRIVER_LEGACY. So maybe next
step would be to define a drm_can_noglobal() { return !DRIVER_LEGACY &&
!->load && !->unload; } and inline the close helper again and see what
breaks?

At least from what I've looked trying to duplicate paths and opt-in is
going to be real tough on the open side of things. Best I've done thus
far is minor pushing of the critical section.
-Daniel
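Spelling out the helper Daniel sketches above: drm_can_noglobal() is hypothetical and does not exist in the tree; the checks are taken directly from his suggestion.

/* Hypothetical opt-in check, per the suggestion above: skip the
 * drm_global_mutex handling only for non-legacy drivers that have no
 * ->load()/->unload() hooks relying on global serialisation. */
static bool drm_can_noglobal(struct drm_device *dev)
{
	return !drm_core_check_feature(dev, DRIVER_LEGACY) &&
	       !dev->driver->load &&
	       !dev->driver->unload;
}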
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 53d882000101..c3c0356dfa61 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1136,7 +1136,7 @@ static bool amdgpu_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops amdgpu_switcheroo_ops = {
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e25306c49cc6..1075b3a8b5b1 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -220,7 +220,7 @@ void drm_file_free(struct drm_file *file)
 	DRM_DEBUG("pid = %d, device = 0x%lx, open_count = %d\n",
 		  task_pid_nr(current),
 		  (long)old_encode_dev(file->minor->kdev->devt),
-		  READ_ONCE(dev->open_count));
+		  atomic_read(&dev->open_count));
 
 	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
 	    dev->driver->preclose)
@@ -379,7 +379,7 @@ int drm_open(struct inode *inode, struct file *filp)
 		return PTR_ERR(minor);
 
 	dev = minor->dev;
-	if (!dev->open_count++)
+	if (!atomic_fetch_inc(&dev->open_count))
 		need_setup = 1;
 
 	/* share address_space across all char-devs of a single device */
@@ -398,7 +398,7 @@ int drm_open(struct inode *inode, struct file *filp)
 	return 0;
 
 err_undo:
-	dev->open_count--;
+	atomic_dec(&dev->open_count);
 	drm_minor_release(minor);
 	return retcode;
 }
@@ -440,11 +440,11 @@ int drm_release(struct inode *inode, struct file *filp)
 
 	mutex_lock(&drm_global_mutex);
 
-	DRM_DEBUG("open_count = %d\n", dev->open_count);
+	DRM_DEBUG("open_count = %d\n", atomic_read(&dev->open_count));
 
 	drm_close_helper(filp);
 
-	if (!--dev->open_count)
+	if (atomic_dec_and_test(&dev->open_count))
 		drm_lastclose(dev);
 
 	mutex_unlock(&drm_global_mutex);
@@ -478,10 +478,10 @@ int drm_release_noglobal(struct inode *inode, struct file *filp)
 
 	drm_close_helper(filp);
 
-	mutex_lock(&drm_global_mutex);
-	if (!--dev->open_count)
+	if (atomic_dec_and_mutex_lock(&dev->open_count, &drm_global_mutex)) {
 		drm_lastclose(dev);
-	mutex_unlock(&drm_global_mutex);
+		mutex_unlock(&drm_global_mutex);
+	}
 
 	drm_minor_release(minor);
 
diff --git a/drivers/gpu/drm/i915/i915_switcheroo.c b/drivers/gpu/drm/i915/i915_switcheroo.c
index 39c79e1c5b52..ed69b5d4a375 100644
--- a/drivers/gpu/drm/i915/i915_switcheroo.c
+++ b/drivers/gpu/drm/i915/i915_switcheroo.c
@@ -43,7 +43,7 @@ static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return i915 && i915->drm.open_count == 0;
+	return i915 && atomic_read(&i915->drm.open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
index d865d8aeac3c..c85dd8afa3c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -72,7 +72,7 @@ nouveau_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index a522e092038b..266e3cbbd09b 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1263,7 +1263,7 @@ static bool radeon_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops radeon_switcheroo_ops = {
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 1acfc3bbd3fb..bb60a949f416 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -144,7 +144,7 @@ struct drm_device {
 	 * Usage counter for outstanding files open,
 	 * protected by drm_global_mutex
 	 */
-	int open_count;
+	atomic_t open_count;
 
 	/** @filelist_mutex: Protects @filelist. */
 	struct mutex filelist_mutex;
Since drm_global_mutex is a true global mutex across devices, we don't
want to acquire it unless absolutely necessary. For maintaining the
device local open_count, we can use atomic operations on the counter
itself, except when making the transition to/from 0. Here, we tackle the
easy portion of delaying acquiring the drm_global_mutex for the final
release by using atomic_dec_and_mutex_lock(), leaving the global
serialisation across the device opens.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
---
atomic_dec_and_mutex_lock needs pairing with mutex_unlock (you fool)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/drm_file.c                 | 16 ++++++++--------
 drivers/gpu/drm/i915/i915_switcheroo.c     |  2 +-
 drivers/gpu/drm/nouveau/nouveau_vga.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_device.c     |  2 +-
 include/drm/drm_device.h                   |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)
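As a reminder of the primitive the patch leans on (and of the unlock pairing the changelog note above refers to), here is a minimal, generic sketch; the object and function names are made up for illustration only.

/* atomic_dec_and_mutex_lock() keeps the common, non-final decrement
 * lock-free; only the 1 -> 0 transition takes the mutex, and it returns
 * true holding the mutex, so the caller must unlock it afterwards. */
static void put_object(struct my_object *obj)
{
	if (atomic_dec_and_mutex_lock(&obj->count, &obj->lock)) {
		/* count reached zero; obj->lock is held here */
		teardown_object(obj);
		mutex_unlock(&obj->lock);
	}
}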