mbox series

[0/2] drm/i915: fix rb-tree/llist/list confusion

Message ID 20230905113921.14071-1-minipli@grsecurity.net (mailing list archive)
Headers show
Series drm/i915: fix rb-tree/llist/list confusion | expand

Message

Mathias Krause Sept. 5, 2023, 11:39 a.m. UTC
Commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine
map") introduced a bug regarding engine iteration in default_engines()
as the rb tree isn't set up yet that early during driver initialization.
This triggered a sanity check we have in our grsecurity kernels, fixed
by reverting the offending commit (patch 1) and giving the
type-multiplexed members some more visibility to avoid making a similar
mistake again in the future (patch 2).

Please apply!

Thanks,
Mathias

Mathias Krause (2):
  Revert "drm/i915: Use uabi engines for the default engine map"
  drm/i915: Clarify type evolution of uabi_node/uabi_engines

 drivers/gpu/drm/i915/gem/i915_gem_context.c  |  9 +++++----
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 10 +++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_user.c  | 17 +++++++----------
 drivers/gpu/drm/i915/i915_drv.h              | 17 ++++++++++++++++-
 4 files changed, 37 insertions(+), 16 deletions(-)

Comments

Mathias Krause Sept. 21, 2023, 6:24 a.m. UTC | #1
On 05.09.23 13:39, Mathias Krause wrote:
> Commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine
> map") introduced a bug regarding engine iteration in default_engines()
> as the rb tree isn't set up yet that early during driver initialization.
> This triggered a sanity check we have in our grsecurity kernels, fixed
> by reverting the offending commit (patch 1) and giving the
> type-multiplexed members some more visibility to avoid making a similar
> mistake again in the future (patch 2).
> 
> Please apply!
> 
> Thanks,
> Mathias
> 
> Mathias Krause (2):
>   Revert "drm/i915: Use uabi engines for the default engine map"
>   drm/i915: Clarify type evolution of uabi_node/uabi_engines
> 
>  drivers/gpu/drm/i915/gem/i915_gem_context.c  |  9 +++++----
>  drivers/gpu/drm/i915/gt/intel_engine_types.h | 10 +++++++++-
>  drivers/gpu/drm/i915/gt/intel_engine_user.c  | 17 +++++++----------
>  drivers/gpu/drm/i915/i915_drv.h              | 17 ++++++++++++++++-
>  4 files changed, 37 insertions(+), 16 deletions(-)
> 

Ping. Any objections to this series?

- Mathias
Tvrtko Ursulin Sept. 28, 2023, 11:15 a.m. UTC | #2
Hi,

On 21/09/2023 07:24, Mathias Krause wrote:
> On 05.09.23 13:39, Mathias Krause wrote:
>> Commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine
>> map") introduced a bug regarding engine iteration in default_engines()
>> as the rb tree isn't set up yet that early during driver initialization.
>> This triggered a sanity check we have in our grsecurity kernels, fixed
>> by reverting the offending commit (patch 1) and giving the
>> type-multiplexed members some more visibility to avoid making a similar
>> mistake again in the future (patch 2).
>>
>> Please apply!
>>
>> Thanks,
>> Mathias
>>
>> Mathias Krause (2):
>>    Revert "drm/i915: Use uabi engines for the default engine map"
>>    drm/i915: Clarify type evolution of uabi_node/uabi_engines
>>
>>   drivers/gpu/drm/i915/gem/i915_gem_context.c  |  9 +++++----
>>   drivers/gpu/drm/i915/gt/intel_engine_types.h | 10 +++++++++-
>>   drivers/gpu/drm/i915/gt/intel_engine_user.c  | 17 +++++++----------
>>   drivers/gpu/drm/i915/i915_drv.h              | 17 ++++++++++++++++-
>>   4 files changed, 37 insertions(+), 16 deletions(-)
>>
> 
> Ping. Any objections to this series?

Apologies for the delay in getting back to you, I was away from work for 
a bit.

Tricky thing you discovered and a great analysis in the commit text.

But we cannot simply revert 1ec23ed7126e since that would miss to 
include the media tile engines on Meteorlake.

What I think should work is to move the call to 
intel_engines_driver_register() from i915_gem_driver_register() to 
i915_gem_init(). I can double-check and send a patch, or you can, 
keeping parts of your great commit message in 1/2?

That would align the expectations of intel_display_driver_probe() which 
expects GEM to be fully initialized by the time it runs.

2/2 looks good and I would be happy to merge it in the interim. Going 
forward I would pencil in looking into removing the rbtree and 
multi-stage complications. Former I think isn't needed, code which needs 
fast random lookup via the tree never materialized, and latter perhaps 
could be sorted in place somehow, that is, without the need for two list 
types externally visible.

Regards,

Tvrtko
Mathias Krause Sept. 28, 2023, 3:10 p.m. UTC | #3
On 28.09.23 13:15, Tvrtko Ursulin wrote:
> Hi,
> 
> On 21/09/2023 07:24, Mathias Krause wrote:
>> On 05.09.23 13:39, Mathias Krause wrote:
>>> Commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine
>>> map") introduced a bug regarding engine iteration in default_engines()
>>> as the rb tree isn't set up yet that early during driver initialization.
>>> This triggered a sanity check we have in our grsecurity kernels, fixed
>>> by reverting the offending commit (patch 1) and giving the
>>> type-multiplexed members some more visibility to avoid making a similar
>>> mistake again in the future (patch 2).
>>>
>>> Please apply!
>>>
>>> Thanks,
>>> Mathias
>>>
>>> Mathias Krause (2):
>>>    Revert "drm/i915: Use uabi engines for the default engine map"
>>>    drm/i915: Clarify type evolution of uabi_node/uabi_engines
>>>
>>>   drivers/gpu/drm/i915/gem/i915_gem_context.c  |  9 +++++----
>>>   drivers/gpu/drm/i915/gt/intel_engine_types.h | 10 +++++++++-
>>>   drivers/gpu/drm/i915/gt/intel_engine_user.c  | 17 +++++++----------
>>>   drivers/gpu/drm/i915/i915_drv.h              | 17 ++++++++++++++++-
>>>   4 files changed, 37 insertions(+), 16 deletions(-)
>>>
>>
>> Ping. Any objections to this series?
> 
> Apologies for the delay in getting back to you, I was away from work for
> a bit.

Don't worry. vger had its hickups with gmail-originated Emails, adding
further delay. I think my "ping" Email only made it to the list three
days ago? -- 4 days late :D

> 
> Tricky thing you discovered and a great analysis in the commit text.

Thanks! Was a nasty bug to chase when the display simply stays black ;)

> 
> But we cannot simply revert 1ec23ed7126e since that would miss to
> include the media tile engines on Meteorlake.

Oh, missed that! But, honestly, I have no clue about i915's inner
workings. It's quite a beast---a lot of code to chase to grasp all
inter-dependencies.

> 
> What I think should work is to move the call to
> intel_engines_driver_register() from i915_gem_driver_register() to
> i915_gem_init(). I can double-check and send a patch, or you can,
> keeping parts of your great commit message in 1/2?

I can prepare a patch and test it on my systems (both ADL, a NUC12 and a
Lenovo X1). I've no Meteorlake, though. Should have something ready
tomorrow.

> 
> That would align the expectations of intel_display_driver_probe() which
> expects GEM to be fully initialized by the time it runs.

Sounds about right.

> 
> 2/2 looks good and I would be happy to merge it in the interim.

It helped me, at least, to clarify for each user what type it expects
the list to be. Then it's easy to follow a code chain and see when a
type transformation happens and when assumptions get invalidated / changed.

> Going
> forward I would pencil in looking into removing the rbtree and
> multi-stage complications. Former I think isn't needed, code which needs
> fast random lookup via the tree never materialized, and latter perhaps
> could be sorted in place somehow, that is, without the need for two list
> types externally visible.

I agree, simplifying or even completely removing the type overload makes
the code less fragile, so I'm all for it!

Thanks,
Mathias

> 
> Regards,
> 
> Tvrtko