[v2,0/3] drm/amdgpu: Remove in_interrupt() usage.

Message ID 20210209124439.408140-1-bigeasy@linutronix.de (mailing list archive)

Sebastian Andrzej Siewior Feb. 9, 2021, 12:44 p.m. UTC
Folks,

in the discussion about preempt count consistency across kernel
configurations:

 https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/

it was concluded that the usage of in_interrupt() and related context
checks should be removed from non-core code.

In the long run, usage of 'preemptible, in_*irq etc.' should be banned from
driver code completely.

This series addresses parts of the amdgpu driver.  There are still call sites
left in the amdgpu driver.

v1…v2:
   - Limit to amdgpu only
   - use "bool" instead of "bool == true"

Sebastian

Christian König Feb. 9, 2021, 12:50 p.m. UTC | #1
Reviewed-by: Christian König <christian.koenig@amd.com> for the series.

Sebastian Andrzej Siewior Feb. 9, 2021, 4:53 p.m. UTC | #2
On 2021-02-09 13:50:31 [+0100], Christian König wrote:
> Reviewed-by: Christian König <christian.koenig@amd.com> for the series.

Thank you.
Any chance you could give me a hand with the remaining three users
within the amdgpu driver? I don't know if the in_interrupt() check can
be limited to certain callers.
What I noticed while tracing v5.10 is this:

|             Xorg-2257    [007] d... 57261.620043: amdgpu_device_wreg: 0x699f, 0x00001bcf, 0x00000100
|  => trace_event_raw_event_amdgpu_device_wreg
|  => amdgpu_device_wreg.part.0
|  => dce110_arm_vert_intr
|  => dce110_vblank_set
|  => dm_enable_vblank
|  => drm_vblank_enable
|  => drm_vblank_get
|  => drm_wait_vblank_ioctl
|  => drm_ioctl_kernel
|  => drm_ioctl
|  => amdgpu_drm_ioctl
|  => __x64_sys_ioctl
|  => do_syscall_64
|  => entry_SYSCALL_64_after_hwframe

I think that amdgpu_device_wreg() -> amdgpu_kiq_wreg() could be invoked.
It doesn't here because amdgpu_sriov_runtime() is false.
The trace says `d' which means interrupts are disabled but
in_interrupt() will return false in this case (no IRQ/softirq).
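
The distinction can be shown with a small userspace model in C. This is
only a sketch under stated assumptions: the struct, the mask values and all
model_* names are hypothetical and are not the kernel's definitions. It
assumes that in_interrupt() only looks at the softirq/hardirq/NMI nesting
counters, while "interrupts disabled" is separate CPU state that those
counters do not reflect.

```c
#include <stdbool.h>

/*
 * Simplified userspace model, NOT kernel code. The mask values are
 * illustrative; only the idea matters: interrupt nesting is counted in
 * preempt_count, while "IRQs off" is a separate piece of state.
 */
#define MODEL_SOFTIRQ_MASK 0x0000ff00u
#define MODEL_HARDIRQ_MASK 0x000f0000u
#define MODEL_NMI_MASK     0x00f00000u

struct model_cpu_state {
	unsigned int preempt_count; /* softirq/hardirq/NMI nesting counters */
	bool irqs_off;              /* interrupts masked on this CPU */
};

/* Models in_interrupt(): true only while servicing softirq/hardirq/NMI. */
static bool model_in_interrupt(const struct model_cpu_state *s)
{
	return (s->preempt_count & (MODEL_SOFTIRQ_MASK | MODEL_HARDIRQ_MASK |
				    MODEL_NMI_MASK)) != 0;
}

/* Models irqs_disabled(): independent of the nesting counters. */
static bool model_irqs_disabled(const struct model_cpu_state *s)
{
	return s->irqs_off;
}

/* A caller that must not sleep has to consider both conditions. */
static bool model_atomic_context(const struct model_cpu_state *s)
{
	return model_in_interrupt(s) || model_irqs_disabled(s);
}
```

A state with preempt_count == 0 and irqs_off == true corresponds to the
`d...' line in the trace above: model_in_interrupt() reports false although
the context is still atomic.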

Sebastian
Christian König Feb. 9, 2021, 5:43 p.m. UTC | #3
Hi Sebastian,

to be honest, I have been thinking about this for quite some time now and I
don't think that it is possible without a severe rewrite of the driver.

The problem is simply that we have a lot of functions which deal with
hardware handling independent of the context. But how registers are
accessed needs to differ depending on whether you are in the interrupt
handler or not.

You would need to push the information about whether we are coming in from
the interrupt handler through more than 10 function calls.

I don't think that this is feasible, nor is it good design.
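
For illustration, pushing the context through the call chain could look
roughly like the following C sketch. All names (model_dev, model_wreg,
model_vblank_set, the entry points, MODEL_VBLANK_REG) are hypothetical and
this is not amdgpu code; it also assumes that the indirect register path
must not be used from atomic context. Every layer between an entry point
and the register write has to carry the extra argument, which is the cost
described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not amdgpu code. */
#define MODEL_VBLANK_REG 0x10u

enum model_reg_path { MODEL_PATH_NONE, MODEL_PATH_MMIO, MODEL_PATH_INDIRECT };

struct model_dev {
	bool indirect_available;       /* stands in for a check like amdgpu_sriov_runtime() */
	enum model_reg_path last_path; /* records which path the last write took */
	uint32_t last_reg;
	uint32_t last_val;
};

/* Lowest layer: picks the access path from the caller-supplied flag. */
static void model_wreg(struct model_dev *dev, uint32_t reg, uint32_t val,
		       bool atomic)
{
	if (dev->indirect_available && !atomic)
		dev->last_path = MODEL_PATH_INDIRECT;
	else
		dev->last_path = MODEL_PATH_MMIO;
	dev->last_reg = reg;
	dev->last_val = val;
}

/* Intermediate layer: has no use for the flag, but must forward it. */
static void model_vblank_set(struct model_dev *dev, bool enable, bool atomic)
{
	model_wreg(dev, MODEL_VBLANK_REG, enable ? 1u : 0u, atomic);
}

/* Entry points state what they know about their own context. */
static void model_irq_entry(struct model_dev *dev)
{
	model_vblank_set(dev, true, true);   /* interrupt handler */
}

static void model_locked_ioctl_entry(struct model_dev *dev)
{
	model_vblank_set(dev, true, true);   /* task context, but IRQs off */
}

static void model_probe_entry(struct model_dev *dev)
{
	model_vblank_set(dev, false, false); /* plain task context */
}
```

With an explicit flag, the entry point that runs with interrupts disabled
also takes the direct path, which a check based on in_interrupt() alone
would not guarantee.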

Regards,
Christian.

Sebastian Andrzej Siewior March 10, 2021, 5:47 p.m. UTC | #4
On 2021-02-09 18:43:54 [+0100], Christian König wrote:
> Hi Sebastian,
Hi Christian,

> to be honest, I have been thinking about this for quite some time now and I
> don't think that it is possible without a severe rewrite of the driver.
>
> The problem is simply that we have a lot of functions which deal with
> hardware handling independent of the context. But how registers are accessed
> needs to differ depending on whether you are in the interrupt handler or not.
>
> You would need to push the information about whether we are coming in from
> the interrupt handler through more than 10 function calls.
>
> I don't think that this is feasible, nor is it good design.

Yeah, that is what I saw and didn't even try.

Is the possible backtrace (at the bottom of this email) a correct
assumption?

Another quick question: You acked my three-patch series. I don't see it
in the next tree as of today. Is there anything for me to do?

> Regards,
> Christian.
> 
> Am 09.02.21 um 17:53 schrieb Sebastian Andrzej Siewior:
> > On 2021-02-09 13:50:31 [+0100], Christian König wrote:
> > > Reviewed-by: Christian König <christian.koenig@amd.com> for the series.
> > Thank you.
> > Any chance you could give me a hand with the remaining three users
> > within the amdgpu driver? I don't know if the in_interrupt() check can
> > be limited to certain callers.
> > What I noticed while tracing v5.10 is this:
> > 
> > |             Xorg-2257    [007] d... 57261.620043: amdgpu_device_wreg: 0x699f, 0x00001bcf, 0x00000100
> > |  => trace_event_raw_event_amdgpu_device_wreg
> > |  => amdgpu_device_wreg.part.0
> > |  => dce110_arm_vert_intr
> > |  => dce110_vblank_set
> > |  => dm_enable_vblank
> > |  => drm_vblank_enable
> > |  => drm_vblank_get
> > |  => drm_wait_vblank_ioctl
> > |  => drm_ioctl_kernel
> > |  => drm_ioctl
> > |  => amdgpu_drm_ioctl
> > |  => __x64_sys_ioctl
> > |  => do_syscall_64
> > |  => entry_SYSCALL_64_after_hwframe
> > 
> > I think that amdgpu_device_wreg() -> amdgpu_kiq_wreg() could be invoked.
> > It doesn't here because amdgpu_sriov_runtime() is false.
> > The trace says `d' which means interrupts are disabled but
> > in_interrupt() will return false in this case (no IRQ/softirq).
> > 
> > Sebastian

Sebastian
Christian König March 11, 2021, 10:42 a.m. UTC | #5
Hi Sebastian,

Am 10.03.21 um 18:47 schrieb Sebastian Andrzej Siewior:
> On 2021-02-09 18:43:54 [+0100], Christian König wrote:
>> to be honest, I have been thinking about this for quite some time now and I
>> don't think that it is possible without a severe rewrite of the driver.
>>
>> The problem is simply that we have a lot of functions which deal with
>> hardware handling independent of the context. But how registers are accessed
>> needs to differ depending on whether you are in the interrupt handler or not.
>>
>> You would need to push the information about whether we are coming in from
>> the interrupt handler through more than 10 function calls.
>>
>> I don't think that this is feasible, nor is it good design.
> Yeah, that is what I saw and didn't even try.

I also have no idea where to start.

> Is the possible backtrace (at the bottom of this email) a correct
> assumption?

It's one of many, yes. But the really complicated ones are in the CS UAPI
and interrupt handling.

>
> Another quick question: You acked my three-patch series. I don't see it
> in the next tree as of today. Is there anything for me to do?

Alex usually picks them up into amd-staging-drm-next which is then 
merged into drm-next.

Regards,
Christian.

Alex Deucher March 11, 2021, 3:45 p.m. UTC | #6
Applied.  Thanks!

Alex
