[v8,0/5] drm: address potential UAF bugs with drm_master ptrs

Message ID 20210712043508.11584-1-desmondcheongzx@gmail.com (mailing list archive)

Message

Desmond Cheong Zhi Xi July 12, 2021, 4:35 a.m. UTC
Hi,

In the previous thread on this series we decided to remove a patch that was violating a lockdep requirement in drm_lease. In addition to this change, I took a closer look at the CI logs for the Basic Acceptance Tests and noticed that another regression was introduced. The new patch 2 is a response to this.

Overall, this series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique():
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803

The series is broken up into five patches:

1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.

2. Move a call to drm_is_current_master() out from the RCU read-side critical section in drm_clients_info().

3. Implement a locked version of the drm_is_current_master() function for use within drm_auth.c.

4. Serialize drm_file.master by introducing a new spinlock that's held whenever the value of drm_file.master changes.

5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
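
As a rough userspace illustration of patches 4 and 5 (not the kernel code itself), the sketch below pairs a lock that guards a pointer with a reference count on the object it points to. All names here (struct master, struct file, file_get_master(), file_set_master(), master_put()) are hypothetical stand-ins, and a pthread mutex stands in for the kernel spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_master: a refcounted object. */
struct master {
	atomic_int refcount;
};

/* Hypothetical stand-in for struct drm_file. */
struct file {
	pthread_mutex_t master_lock; /* models the new spinlock */
	struct master *master;       /* may be NULL */
};

static struct master *master_create(void)
{
	struct master *m = malloc(sizeof(*m));

	if (m)
		atomic_init(&m->refcount, 1);
	return m;
}

/* Drop one reference; free the object when the last one goes away. */
static void master_put(struct master *m)
{
	if (m && atomic_fetch_sub(&m->refcount, 1) == 1)
		free(m);
}

/*
 * Return file->master with an extra reference held, or NULL.
 * The lock guarantees the pointer cannot be swapped (and the old
 * object freed) between reading it and taking the reference.
 */
static struct master *file_get_master(struct file *f)
{
	struct master *m;

	pthread_mutex_lock(&f->master_lock);
	m = f->master;
	if (m)
		atomic_fetch_add(&m->refcount, 1);
	pthread_mutex_unlock(&f->master_lock);
	return m;
}

/* Every change to file->master happens under the same lock. */
static void file_set_master(struct file *f, struct master *new_master)
{
	struct master *old;

	pthread_mutex_lock(&f->master_lock);
	old = f->master;
	f->master = new_master;
	pthread_mutex_unlock(&f->master_lock);
	master_put(old); /* drop the file's reference to the old master */
}
```

A caller that needs to dereference the master takes a reference with file_get_master(), checks for NULL, and calls master_put() when done, so the object stays valid even if the file's master is replaced in the meantime.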

v7 -> v8:
- Remove the patch that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find. This patch violated an existing lockdep requirement as reported by the intel-gfx CI.
- Add a new patch that moves a call to drm_is_current_master out from the RCU critical section in drm_clients_info. The blocking call inside the RCU section was reported by the intel-gfx CI.

v6 -> v7:
- Modify code alignment as suggested by the intel-gfx CI.
- Add a new patch to the series that adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
- Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.

v5 -> v6:
- Add a new patch to the series that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
- Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
- Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
- Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.
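
The goto-label refactor mentioned above can be pictured with the following simplified sketch. It is only an illustration of the pattern: lease_query(), master_get(), and the error codes chosen are hypothetical, and a plain integer stands in for the kernel's reference count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted object; a plain int keeps the sketch short. */
struct master {
	int refcount;
};

static struct master *master_get(struct master *m)
{
	if (m)
		m->refcount++;
	return m;
}

static void master_put(struct master *m)
{
	if (m)
		m->refcount--;
}

/*
 * Every failure after the reference is taken jumps to "out", so the
 * function contains exactly one master_put() and cannot leak a
 * reference on an error path.
 */
static int lease_query(struct master *file_master, int object_id, int *out)
{
	struct master *master;
	int ret = 0;

	master = master_get(file_master);
	if (!master)
		return -ENOENT; /* nothing acquired, nothing to release */

	if (object_id < 0) {
		ret = -EINVAL;
		goto out;
	}
	if (!out) {
		ret = -EFAULT;
		goto out;
	}
	*out = object_id * 2; /* placeholder for the real work */

out:
	master_put(master);
	return ret;
}
```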

v4 -> v5:
- Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
- Add a missing semicolon to the patch, caught by the intel-gfx CI.

v3 -> v4:
- Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
- Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
- Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
- Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c, as suggested by Daniel Vetter.
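
To illustrate why hoisting the call avoids the circular dependency, here is a minimal userspace sketch. It assumes a simplified model in which the master check takes one mutex and the connector query takes another; struct device, is_current_master(), and get_connector() are hypothetical names, and what the result is used for is invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a device with two independent locks. */
struct device {
	pthread_mutex_t master_mutex;      /* guards is_master */
	pthread_mutex_t mode_config_mutex; /* guards connector_state */
	bool is_master;
	int connector_state;
};

/* Takes master_mutex internally, like the check described above. */
static bool is_current_master(struct device *dev)
{
	bool ret;

	pthread_mutex_lock(&dev->master_mutex);
	ret = dev->is_master;
	pthread_mutex_unlock(&dev->master_mutex);
	return ret;
}

/*
 * The master check runs before mode_config_mutex is taken, so this
 * path never acquires master_mutex while holding mode_config_mutex.
 * If another path takes the two locks in the opposite order, no
 * cycle can form through this function.
 */
static int get_connector(struct device *dev)
{
	bool is_master = is_current_master(dev);
	int state;

	pthread_mutex_lock(&dev->mode_config_mutex);
	state = is_master ? dev->connector_state : -1;
	pthread_mutex_unlock(&dev->mode_config_mutex);
	return state;
}
```

The trade-off is that the cached result may be stale by the time it is used under the second lock, which is acceptable only when the check is advisory for that code path.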

v2 -> v3:
- Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked, as suggested by Daniel Vetter.
- Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed, as suggested by Daniel Vetter.
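
The split between a function and its _locked variant can be sketched as follows. This is a simplified userspace model under stated assumptions: the structures, set_master(), and the comparison used to decide mastership are hypothetical and much simpler than the real logic in drm_auth.c.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified device and file structures. */
struct device {
	pthread_mutex_t master_mutex;
	const void *current_master;
};

struct file {
	struct device *dev; /* the lock is reached via the device, not the master */
	const void *master; /* may be NULL */
};

/* Caller must already hold dev->master_mutex. */
static bool is_current_master_locked(const struct file *f)
{
	return f->master && f->master == f->dev->current_master;
}

/* Public variant: takes the lock, then defers to the _locked helper. */
static bool is_current_master(struct file *f)
{
	bool ret;

	pthread_mutex_lock(&f->dev->master_mutex);
	ret = is_current_master_locked(f);
	pthread_mutex_unlock(&f->dev->master_mutex);
	return ret;
}

/*
 * A path that already holds master_mutex must use the _locked helper;
 * calling is_current_master() here would try to take the mutex twice.
 */
static int set_master(struct file *f)
{
	int ret = 0;

	pthread_mutex_lock(&f->dev->master_mutex);
	if (!f->master) {
		ret = -EINVAL;
		goto out;
	}
	if (is_current_master_locked(f))
		goto out; /* already master, nothing to do */
	if (f->dev->current_master) {
		ret = -EBUSY;
		goto out;
	}
	f->dev->current_master = f->master;
out:
	pthread_mutex_unlock(&f->dev->master_mutex);
	return ret;
}
```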

v1 -> v2:
- Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.

Desmond Cheong Zhi Xi (5):
  drm: avoid circular locks in drm_mode_getconnector
  drm: avoid blocking in drm_clients_info's rcu section
  drm: add a locked version of drm_is_current_master
  drm: serialize drm_file.master with a new spinlock
  drm: protect drm_master pointers in drm_lease.c

 drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
 drivers/gpu/drm/drm_connector.c |  5 +-
 drivers/gpu/drm/drm_debugfs.c   |  3 +-
 drivers/gpu/drm/drm_file.c      |  1 +
 drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
 include/drm/drm_auth.h          |  1 +
 include/drm/drm_file.h          | 18 +++++--
 7 files changed, 152 insertions(+), 50 deletions(-)

Comments

Daniel Vetter July 20, 2021, 6:24 p.m. UTC | #1
Apologies for the delay, I missed your series. Maybe just ping next time
there's silence.

It all looks great, merged to drm-misc-next. Given how complex this was I'm
wary of just pushing this to -fixes without some solid testing.

One thing I noticed is that drm_is_current_master could just use the
spinlock, since it's only doing a read access. Care to type up that patch?

Also, do you plan to look into that idea we've discussed to flush pending
access when we revoke a master or a lease? I think that would be a really
nice improvement here.
-Daniel

Desmond Cheong Zhi Xi July 21, 2021, 4:12 a.m. UTC | #2
On 21/7/21 2:24 am, Daniel Vetter wrote:
> Apologies for the delay, I missed your series. Maybe just ping next time
> there's silence.
> 
> It all looks great, merged to drm-misc-next. Given how complex this was I'm
> wary of just pushing this to -fixes without some solid testing.
> 

Hi Daniel,

Thanks for merging, more testing definitely sounds good to me.

> One thing I noticed is that drm_is_current_master could just use the
> spinlock, since it's only doing a read access. Care to type up that patch?
> 

I thought about this too, but I'm not sure if that's the best solution.

drm_is_current_master calls drm_lease_owner which then walks up the tree 
of master lessors. The spinlock protects the master of the current drm 
file, but subsequent lessors aren't protected without holding the 
device's master mutex.

> Also, do you plan to look into that idea we've discussed to flush pending
> access when we revoke a master or a lease? I think that would be a really
> nice improvement here.
> -Daniel
> 

Yup, now that the potential UAFs are addressed (hopefully), I'll take a 
closer look and propose a patch for this.

Best wishes,
Desmond

Daniel Vetter July 21, 2021, 10:29 a.m. UTC | #3
On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi
<desmondcheongzx@gmail.com> wrote:
> On 21/7/21 2:24 am, Daniel Vetter wrote:
> > One thing I noticed is that drm_is_current_master could just use the
> > spinlock, since it's only doing a read access. Care to type up that patch?
> >
>
> I thought about this too, but I'm not sure if that's the best solution.
>
> drm_is_current_master calls drm_lease_owner which then walks up the tree
> of master lessors. The spinlock protects the master of the current drm
> file, but subsequent lessors aren't protected without holding the
> device's master mutex.

But this isn't a fpriv->master pointer, but a master->lessor pointer.
Which should never ever be able to change (we'd have tons of uaf bugs
around drm_lease_owner otherwise). So I don't think there's anything
that dev->master_mutex protects here that fpriv->master_lookup_lock
doesn't protect already?

Or am I missing something?

The comment in the struct drm_master says it's protected by
mode_config.idr_mutex, but that only applies to the idrs and lists I
think.

> > Also, do you plan to look into that idea we've discussed to flush pending
> > access when we revoke a master or a lease? I think that would be a really
> > nice improvement here.
> > -Daniel
> >
>
> Yup, now that the potential UAFs are addressed (hopefully), I'll take a
> closer look and propose a patch for this.

Thanks a lot.
-Daniel

Desmond Cheong Zhi Xi July 21, 2021, 12:43 p.m. UTC | #4
On 21/7/21 6:29 pm, Daniel Vetter wrote:
> On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi
> <desmondcheongzx@gmail.com> wrote:
>> On 21/7/21 2:24 am, Daniel Vetter wrote:
>>> One thing I noticed is that drm_is_current_master could just use the
>>> spinlock, since it's only doing a read access. Care to type up that patch?
>>>
>>
>> I thought about this too, but I'm not sure if that's the best solution.
>>
>> drm_is_current_master calls drm_lease_owner which then walks up the tree
>> of master lessors. The spinlock protects the master of the current drm
>> file, but subsequent lessors aren't protected without holding the
>> device's master mutex.
> 
> But this isn't a fpriv->master pointer, but a master->lessor pointer.
> Which should never ever be able to change (we'd have tons of uaf bugs
> around drm_lease_owner otherwise). So I don't think there's anything
> that dev->master_mutex protects here that fpriv->master_lookup_lock
> doesn't protect already?
> 
> Or am I missing something?
> 
> The comment in the struct drm_master says it's protected by
> mode_config.idr_mutex, but that only applies to the idrs and lists I
> think.
> 

Ah you're right, I also completely forgot that lessees hold a reference 
to their lessor so nothing will be freed as long as the spinlock is 
held. I'll prepare that patch then, thanks for pointing it out.
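
The reasoning above can be sketched in a small userspace model: if each lessee takes a reference to its lessor at creation and the lessor pointer never changes afterwards, then pinning any master also pins its whole chain of lessors, so walking up to the owner is safe. The names (struct master, master_create(), lease_owner(), live_masters) are hypothetical, and a plain counter stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int live_masters; /* number of allocated masters, for the demo */

struct master {
	int refcount;
	struct master *lessor; /* set once at creation, never changed */
};

/* A lessee holds a reference to its lessor for its whole lifetime. */
static struct master *master_create(struct master *lessor)
{
	struct master *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->refcount = 1;
	if (lessor) {
		lessor->refcount++;
		m->lessor = lessor;
	}
	live_masters++;
	return m;
}

/* Freeing a lessee drops its reference to the lessor, and so on up. */
static void master_put(struct master *m)
{
	while (m && --m->refcount == 0) {
		struct master *lessor = m->lessor;

		free(m);
		live_masters--;
		m = lessor;
	}
}

/* Walk up the lessor chain to the top-level owner. */
static struct master *lease_owner(struct master *m)
{
	while (m->lessor)
		m = m->lessor;
	return m;
}
```

In this model, as long as the caller keeps the starting master alive (for example by holding the lock that guards the file's master pointer), every lessor reached by lease_owner() is still allocated.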

Daniel Vetter July 21, 2021, 1:23 p.m. UTC | #5
On Wed, Jul 21, 2021 at 2:44 PM Desmond Cheong Zhi Xi
<desmondcheongzx@gmail.com> wrote:
> On 21/7/21 6:29 pm, Daniel Vetter wrote:
> > On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi
> > <desmondcheongzx@gmail.com> wrote:
> >> On 21/7/21 2:24 am, Daniel Vetter wrote:
> >>> On Mon, Jul 12, 2021 at 12:35:03PM +0800, Desmond Cheong Zhi Xi wrote:
> >>>> - Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
> >>>> - Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
> >>>> - Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.
> >>>>
> >>>> v4 -> v5:
> >>>> - Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
> >>>> - Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
> >>>>
> >>>> v3 -> v4:
> >>>> - Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
> >>>> - Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
> >>>> - Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
> >>>> - Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
> >>>>
> >>>> v2 -> v3:
> >>>> - Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
> >>>> - Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
> >>>>
> >>>> v1 -> v2:
> >>>> - Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
> >>>
> >>> Apologies for the delay, I missed your series. Maybe just ping me next
> >>> time there's silence.
> >>>
> >>> Looks all great, merged to drm-misc-next. Given how complex this was I'm
> >>> wary of just pushing this to -fixes without some solid testing.
> >>>
> >>
> >> Hi Daniel,
> >>
> >> Thanks for merging, more testing definitely sounds good to me.
> >>
> >>> One thing I noticed is that drm_is_current_master could just use the
> >>> spinlock, since it's only doing a read access. Care to type up that patch?
> >>>
> >>
> >> I thought about this too, but I'm not sure if that's the best solution.
> >>
> >> drm_is_current_master calls drm_lease_owner which then walks up the tree
> >> of master lessors. The spinlock protects the master of the current drm
> >> file, but subsequent lessors aren't protected without holding the
> >> device's master mutex.
> >
> > But this isn't a fpriv->master pointer, but a master->lessor pointer.
> > Which should never ever be able to change (we'd have tons of uaf bugs
> > around drm_lease_owner otherwise). So I don't think there's anything
> > that dev->master_lock protects here that fpriv->master_lookup_lock
> > doesn't protect already?
> >
> > Or am I missing something?
> > The comment in the struct drm_master says it's protected by
> > mode_config.idr_mutex, but that only applies to the idrs and lists I
> > think.
> >
>
> Ah you're right, I also completely forgot that lessees hold a reference
> to their lessor so nothing will be freed as long as the spinlock is
> held. I'll prepare that patch then, thanks for pointing it out.
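
The lifetime argument above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: every sketch_ type and helper is a hypothetical stand-in (for struct drm_master, struct drm_file, drm_file_get_master() and drm_lease_owner()), the spinlock is a toy built on atomic_flag, and the refcount is a bare atomic_int rather than a kref. It shows why walking the lessor chain needs no extra lock as long as the starting master is pinned and each lessee holds a reference on its lessor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy spinlock; stands in for drm_file.master_lookup_lock. */
struct sketch_spinlock { atomic_flag locked; };

static void sketch_lock(struct sketch_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void sketch_unlock(struct sketch_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical, simplified stand-in for struct drm_master. */
struct sketch_master {
	atomic_int refcount;
	struct sketch_master *lessor; /* set once at creation, never reassigned */
};

/* Hypothetical, simplified stand-in for struct drm_file. */
struct sketch_file {
	struct sketch_spinlock master_lookup_lock;
	struct sketch_master *master; /* only changed with the lock held */
};

static struct sketch_master *sketch_master_create(struct sketch_master *lessor)
{
	struct sketch_master *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcount, 1);
	if (lessor) {
		atomic_fetch_add(&lessor->refcount, 1); /* lessee pins its lessor */
		m->lessor = lessor;
	}
	return m;
}

static void sketch_master_put(struct sketch_master *m)
{
	/* Freeing a lessee drops the reference it held on its lessor. */
	while (m && atomic_fetch_sub(&m->refcount, 1) == 1) {
		struct sketch_master *lessor = m->lessor;

		free(m);
		m = lessor;
	}
}

/* Take a counted reference to f->master, or return NULL if it is unset. */
static struct sketch_master *sketch_file_get_master(struct sketch_file *f)
{
	struct sketch_master *m;

	sketch_lock(&f->master_lookup_lock);
	m = f->master;
	if (m)
		atomic_fetch_add(&m->refcount, 1);
	sketch_unlock(&f->master_lookup_lock);
	return m;
}

/* Walk up the lessor chain; safe while the starting master is pinned. */
static struct sketch_master *sketch_lease_owner(struct sketch_master *m)
{
	assert(m);
	while (m->lessor)
		m = m->lessor;
	return m;
}

/* Read-only check done entirely under the spinlock. */
static int sketch_is_current_master(struct sketch_file *f,
				    struct sketch_master *dev_master)
{
	int ret;

	sketch_lock(&f->master_lookup_lock);
	ret = f->master && sketch_lease_owner(f->master) == dev_master;
	sketch_unlock(&f->master_lookup_lock);
	return ret;
}

/* Returns 0 if the chain stays valid for as long as a reference is held. */
static int sketch_demo(void)
{
	struct sketch_master *owner = sketch_master_create(NULL);
	struct sketch_master *lessee = owner ? sketch_master_create(owner) : NULL;
	struct sketch_file f = {
		.master_lookup_lock = { .locked = ATOMIC_FLAG_INIT },
		.master = lessee,
	};
	struct sketch_master *held;
	int ok;

	if (!lessee) {
		sketch_master_put(owner);
		return 1;
	}
	ok = sketch_is_current_master(&f, owner);
	held = sketch_file_get_master(&f);

	sketch_lock(&f.master_lookup_lock);
	f.master = NULL;           /* the file gives up its master */
	sketch_unlock(&f.master_lookup_lock);
	sketch_master_put(lessee); /* the file's reference */
	sketch_master_put(owner);  /* the creator's reference */

	/* owner is still alive: held pins lessee, and lessee pins owner */
	ok = ok && sketch_lease_owner(held) == owner;
	ok = ok && !sketch_is_current_master(&f, owner);
	sketch_master_put(held);   /* frees lessee, then owner */
	return ok ? 0 : 1;
}
```

Calling sketch_demo() from a small main() walks through the case discussed here: the file drops its master while another reference is still held, and the lessor chain remains valid until that last reference is put.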

btw since we've now looked at all this in detail, can you perhaps do a
patch to update the kerneldoc for all the lease fields in struct
drm_master? I think moving them to the inline style and then adding
comments for each field on how the locking/lifetime rules work would be
really good, since right now it's all fresh for us.
-Daniel

> >>> Also, do you plan to look into that idea we've discussed to flush pending
> >>> access when we revoke a master or a lease? I think that would be really
> >>> nice improvement here.
> >>> -Daniel
> >>>
> >>
> >> Yup, now that the potential UAFs are addressed (hopefully), I'll take a
> >> closer look and propose a patch for this.
> >
> > Thanks a lot.
> > -Daniel
> >
> >>
> >> Best wishes,
> >> Desmond
> >>
> >>>>
> >>>> Desmond Cheong Zhi Xi (5):
> >>>>     drm: avoid circular locks in drm_mode_getconnector
> >>>>     drm: avoid blocking in drm_clients_info's rcu section
> >>>>     drm: add a locked version of drm_is_current_master
> >>>>     drm: serialize drm_file.master with a new spinlock
> >>>>     drm: protect drm_master pointers in drm_lease.c
> >>>>
> >>>>    drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
> >>>>    drivers/gpu/drm/drm_connector.c |  5 +-
> >>>>    drivers/gpu/drm/drm_debugfs.c   |  3 +-
> >>>>    drivers/gpu/drm/drm_file.c      |  1 +
> >>>>    drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
> >>>>    include/drm/drm_auth.h          |  1 +
> >>>>    include/drm/drm_file.h          | 18 +++++--
> >>>>    7 files changed, 152 insertions(+), 50 deletions(-)
> >>>>
> >>>> --
> >>>> 2.25.1
> >>>>
> >>>
> >>
> >
> >
>
Desmond Cheong Zhi Xi July 22, 2021, 9:38 a.m. UTC | #6
On 21/7/21 9:23 pm, Daniel Vetter wrote:
> On Wed, Jul 21, 2021 at 2:44 PM Desmond Cheong Zhi Xi
> <desmondcheongzx@gmail.com> wrote:
>> On 21/7/21 6:29 pm, Daniel Vetter wrote:
>>> On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi
>>> <desmondcheongzx@gmail.com> wrote:
>>>> On 21/7/21 2:24 am, Daniel Vetter wrote:
>>>>> On Mon, Jul 12, 2021 at 12:35:03PM +0800, Desmond Cheong Zhi Xi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In the previous thread on this series we decided to remove a patch that was violating a lockdep requirement in drm_lease. In addition to this change, I took a closer look at the CI logs for the Basic Acceptance Tests and noticed that another regression was introduced. The new patch 2 is a response to this.
>>>>>>
>>>>>> Overall, this series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique():
>>>>>> https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803
>>>>>>
>>>>>> The series is broken up into five patches:
>>>>>>
>>>>>> 1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
>>>>>>
>>>>>> 2. Move a call to drm_is_current_master() out from the RCU read-side critical section in drm_clients_info().
>>>>>>
>>>>>> 3. Implement a locked version of drm_is_current_master() function that's used within drm_auth.c.
>>>>>>
>>>>>> 4. Serialize drm_file.master by introducing a new spinlock that's held whenever the value of drm_file.master changes.
>>>>>>
>>>>>> 5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
>>>>>>
>>>>>> v7 -> v8:
>>>>>> - Remove the patch that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find. This patch violated an existing lockdep requirement as reported by the intel-gfx CI.
>>>>>> - Added a new patch that moves a call to drm_is_current_master out from the RCU critical section in drm_clients_info. This was reported by the intel-gfx CI.
>>>>>>
>>>>>> v6 -> v7:
>>>>>> - Modify code alignment as suggested by the intel-gfx CI.
>>>>>> - Add a new patch to the series that adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
>>>>>> - Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
>>>>>>
>>>>>> v5 -> v6:
>>>>>> - Add a new patch to the series that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
>>>>>> - Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
>>>>>> - Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
>>>>>> - Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.
>>>>>>
>>>>>> v4 -> v5:
>>>>>> - Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
>>>>>> - Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
>>>>>>
>>>>>> v3 -> v4:
>>>>>> - Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
>>>>>> - Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
>>>>>> - Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
>>>>>> - Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
>>>>>>
>>>>>> v2 -> v3:
>>>>>> - Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
>>>>>> - Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
>>>>>>
>>>>>> v1 -> v2:
>>>>>> - Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
>>>>>
>>>>> Apologies for the delay, I missed your series. Maybe just ping me next
>>>>> time there's silence.
>>>>>
>>>>> Looks all great, merged to drm-misc-next. Given how complex this was I'm
>>>>> wary of just pushing this to -fixes without some solid testing.
>>>>>
>>>>
>>>> Hi Daniel,
>>>>
>>>> Thanks for merging, more testing definitely sounds good to me.
>>>>
>>>>> One thing I noticed is that drm_is_current_master could just use the
>>>>> spinlock, since it's only doing a read access. Care to type up that patch?
>>>>>
>>>>
>>>> I thought about this too, but I'm not sure if that's the best solution.
>>>>
>>>> drm_is_current_master calls drm_lease_owner which then walks up the tree
>>>> of master lessors. The spinlock protects the master of the current drm
>>>> file, but subsequent lessors aren't protected without holding the
>>>> device's master mutex.
>>>
>>> But this isn't a fpriv->master pointer, but a master->lessor pointer.
>>> Which should never ever be able to change (we'd have tons of uaf bugs
>>> around drm_lease_owner otherwise). So I don't think there's anything
>>> that dev->master_lock protects here that fpriv->master_lookup_lock
>>> doesn't protect already?
>>>
>>> Or am I missing something?
>>> The comment in the struct drm_master says it's protected by
>>> mode_config.idr_mutex, but that only applies to the idrs and lists I
>>> think.
>>>
>>
>> Ah you're right, I also completely forgot that lessees hold a reference
>> to their lessor so nothing will be freed as long as the spinlock is
>> held. I'll prepare that patch then, thanks for pointing it out.
> 
> btw since we've now looked at all this in detail, can you perhaps do a
> patch to update the kerneldoc for all the lease fields in struct
> drm_master? I think moving them to the inline style and then adding
> comments for each field on how the locking/lifetime rules work would be
> really good, since right now it's all fresh for us.
> -Daniel
> 

Sure thing. Just sent out the suggested changes in the same series, 
along with a relevant fix for drm/vmwgfx that I just noticed.

>>>>> Also, do you plan to look into that idea we've discussed to flush pending
>>>>> access when we revoke a master or a lease? I think that would be really
>>>>> nice improvement here.
>>>>> -Daniel
>>>>>
>>>>
>>>> Yup, now that the potential UAFs are addressed (hopefully), I'll take a
>>>> closer look and propose a patch for this.
>>>
>>> Thanks a lot.
>>> -Daniel
>>>
>>>>
>>>> Best wishes,
>>>> Desmond
>>>>
>>>>>>
>>>>>> Desmond Cheong Zhi Xi (5):
>>>>>>      drm: avoid circular locks in drm_mode_getconnector
>>>>>>      drm: avoid blocking in drm_clients_info's rcu section
>>>>>>      drm: add a locked version of drm_is_current_master
>>>>>>      drm: serialize drm_file.master with a new spinlock
>>>>>>      drm: protect drm_master pointers in drm_lease.c
>>>>>>
>>>>>>     drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
>>>>>>     drivers/gpu/drm/drm_connector.c |  5 +-
>>>>>>     drivers/gpu/drm/drm_debugfs.c   |  3 +-
>>>>>>     drivers/gpu/drm/drm_file.c      |  1 +
>>>>>>     drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
>>>>>>     include/drm/drm_auth.h          |  1 +
>>>>>>     include/drm/drm_file.h          | 18 +++++--
>>>>>>     7 files changed, 152 insertions(+), 50 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.25.1
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
>