Message ID | 20200717185757.2786-1-jajones@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/nouveau: Accept 'legacy' format modifiers | expand |
This should resolve the inability to start X with the new NV format modifier support in nouveau. FYI, I'm offline next week, but I'll check in tonight in case there are any review comments. When I'm back, I'll get the associated userspace fixes cleaned up and out to the appropriate lists. Thanks, -James On 7/17/20 11:57 AM, James Jones wrote: > Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() > family of modifiers to handle broken userspace > Xorg modesetting and Mesa drivers. > > Tested with Xorg 1.20 modesetting driver, > weston@c46c70dac84a4b3030cd05b380f9f410536690fc, > gnome & KDE wayland desktops from Ubuntu 18.04, > and sway 1.5 > > Signed-off-by: James Jones <jajones@nvidia.com> > --- > drivers/gpu/drm/nouveau/nouveau_display.c | 26 +++++++++++++++++++++-- > 1 file changed, 24 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c > index 496c4621cc78..31543086254b 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_display.c > +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > @@ -191,8 +191,14 @@ nouveau_decode_mod(struct nouveau_drm *drm, > uint32_t *tile_mode, > uint8_t *kind) > { > + struct nouveau_display *disp = nouveau_display(drm->dev); > BUG_ON(!tile_mode || !kind); > > + if ((modifier & (0xffull << 12)) == 0ull) { > + /* Legacy modifier. Translate to this device's 'kind.' */ > + modifier |= disp->format_modifiers[0] & (0xffull << 12); > + } > + > if (modifier == DRM_FORMAT_MOD_LINEAR) { > /* tile_mode will not be used in this case */ > *tile_mode = 0; > @@ -227,6 +233,16 @@ nouveau_framebuffer_get_layout(struct drm_framebuffer *fb, > } > } > > +static const u64 legacy_modifiers[] = { > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(1), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(2), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(3), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(4), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(5), > + DRM_FORMAT_MOD_INVALID > +}; > + > static int > nouveau_validate_decode_mod(struct nouveau_drm *drm, > uint64_t modifier, > @@ -247,8 +263,14 @@ nouveau_validate_decode_mod(struct nouveau_drm *drm, > (disp->format_modifiers[mod] != modifier); > mod++); > > - if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > - return -EINVAL; > + if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) { > + for (mod = 0; > + (legacy_modifiers[mod] != DRM_FORMAT_MOD_INVALID) && > + (legacy_modifiers[mod] != modifier); > + mod++); > + if (legacy_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > + return -EINVAL; > + } > > nouveau_decode_mod(drm, modifier, tile_mode, kind); > >
On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: > Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() > family of modifiers to handle broken userspace > Xorg modesetting and Mesa drivers. > > Tested with Xorg 1.20 modesetting driver, > weston@c46c70dac84a4b3030cd05b380f9f410536690fc, > gnome & KDE wayland desktops from Ubuntu 18.04, > and sway 1.5 Just bikeshed, but maybe a few more words on what exactly is broken and how this works around it. Specifically why we only accept these, but don't advertise them. > > Signed-off-by: James Jones <jajones@nvidia.com> Needs Fixes: line here. Also nice to mention the bug reporter/link. > --- > drivers/gpu/drm/nouveau/nouveau_display.c | 26 +++++++++++++++++++++-- > 1 file changed, 24 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c > index 496c4621cc78..31543086254b 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_display.c > +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > @@ -191,8 +191,14 @@ nouveau_decode_mod(struct nouveau_drm *drm, > uint32_t *tile_mode, > uint8_t *kind) > { > + struct nouveau_display *disp = nouveau_display(drm->dev); > BUG_ON(!tile_mode || !kind); > > + if ((modifier & (0xffull << 12)) == 0ull) { > + /* Legacy modifier. Translate to this device's 'kind.' */ > + modifier |= disp->format_modifiers[0] & (0xffull << 12); > + } Hm I tried to understand what this magic does by looking at drm_fourcc.h, but the drm_fourcc_canonicalize_nvidia_format_mod() in there implements something else. Is that function wrong, or should we use it here instead? Or is there something else going on entirely? Cheers, Daniel > + > if (modifier == DRM_FORMAT_MOD_LINEAR) { > /* tile_mode will not be used in this case */ > *tile_mode = 0; > @@ -227,6 +233,16 @@ nouveau_framebuffer_get_layout(struct drm_framebuffer *fb, > } > } > > +static const u64 legacy_modifiers[] = { > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(1), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(2), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(3), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(4), > + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(5), > + DRM_FORMAT_MOD_INVALID > +}; > + > static int > nouveau_validate_decode_mod(struct nouveau_drm *drm, > uint64_t modifier, > @@ -247,8 +263,14 @@ nouveau_validate_decode_mod(struct nouveau_drm *drm, > (disp->format_modifiers[mod] != modifier); > mod++); > > - if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > - return -EINVAL; > + if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) { > + for (mod = 0; > + (legacy_modifiers[mod] != DRM_FORMAT_MOD_INVALID) && > + (legacy_modifiers[mod] != modifier); > + mod++); > + if (legacy_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > + return -EINVAL; > + } > > nouveau_decode_mod(drm, modifier, tile_mode, kind); > > -- > 2.17.1 >
On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: > Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() > family of modifiers to handle broken userspace > Xorg modesetting and Mesa drivers. > > Tested with Xorg 1.20 modesetting driver, > weston@c46c70dac84a4b3030cd05b380f9f410536690fc, > gnome & KDE wayland desktops from Ubuntu 18.04, > and sway 1.5 > > Signed-off-by: James Jones <jajones@nvidia.com> I tried and it crashes. Not sure if it's related. [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF [drm:vblank_disable_fn] disabling vblank on crtc 0 [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_CPU_PREP [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF BUG: unable to handle page fault for address: 00000000ffff059c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 13 PID: 3351 Comm: alacritty Tainted: G I 5.8.0-rc5-00191-g086f86c033f9 #53 Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 RIP: 0010:kmem_cache_alloc_trace (/home/kas/linux/torvalds/mm/slub.c:272 /home/kas/linux/torvalds/mm/slub.c:278 /home/kas/linux/torvalds/mm/slub.c:292 /home/kas/linux/torvalds/mm/slub.c:2791 /home/kas/linux/torvalds/mm/slub.c:2832 /home/kas/linux/torvalds/mm/slub.c:2849) Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c All code ======== 0: 8b 51 08 mov 0x8(%rcx),%edx 3: 48 89 c8 mov %rcx,%rax 6: 65 48 03 05 d4 0e ca add %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 d: 70 e: 48 8b 70 08 mov 0x8(%rax),%rsi 12: 48 39 f2 cmp %rsi,%rdx 15: 75 e7 jne 0xfffffffffffffffe 17: 4c 8b 38 mov (%rax),%r15 1a: 4d 85 ff test %r15,%r15 1d: 0f 84 8f 01 00 00 je 0x1b2 23: 8b 45 20 mov 0x20(%rbp),%eax 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx <-- trapping instruction 2e: 40 f6 c7 0f test $0xf,%dil 32: 0f 85 95 01 00 00 jne 0x1cd 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx 3f: 4c rex.WR Code starting with the faulting instruction =========================================== 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx 4: 40 f6 c7 0f test $0xf,%dil 8: 0f 85 95 01 00 00 jne 0x1a3 e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx 15: 4c rex.WR RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, DRM_IOCTL_CRTC_GET_SEQUENCE R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: nouveau_fence_new (/home/kas/linux/torvalds/include/linux/slab.h:555 /home/kas/linux/torvalds/include/linux/slab.h:669 /home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_fence.c:423) [drm:drm_vblank_enable] enabling vblank on crtc 0, ret: 0 nouveau_gem_ioctl_pushbuf (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:852) [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, DRM_IOCTL_CRTC_QUEUE_SEQUENCE ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) ? drm_ioctl_kernel (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) drm_ioctl_kernel (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) drm_ioctl (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:888) ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) ? _raw_spin_unlock_irqrestore (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:160 /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) ? lockdep_hardirqs_on (/home/kas/linux/torvalds/kernel/locking/lockdep.c:3727 (discriminator 3)) ? _raw_spin_unlock_irqrestore (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:102 /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:161 /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) nouveau_drm_ioctl (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_drm.c:1120) ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 /home/kas/linux/torvalds/fs/ioctl.c:753) __x64_sys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:762 /home/kas/linux/torvalds/fs/ioctl.c:760 /home/kas/linux/torvalds/fs/ioctl.c:760) do_syscall_64 (/home/kas/linux/torvalds/arch/x86/entry/common.c:384) entry_SYSCALL_64_after_hwframe (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:124) RIP: 0033:0x7fbe7ab9cd37 Code: Bad RIP value. Code starting with the faulting instruction =========================================== RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380 Modules linked in: CR2: 00000000ffff059c ---[ end trace 9db4c132a431e9c9 ]--- RIP: 0010:kmem_cache_alloc_trace (/home/kas/linux/torvalds/mm/slub.c:272 /home/kas/linux/torvalds/mm/slub.c:278 /home/kas/linux/torvalds/mm/slub.c:292 /home/kas/linux/torvalds/mm/slub.c:2791 /home/kas/linux/torvalds/mm/slub.c:2832 /home/kas/linux/torvalds/mm/slub.c:2849) Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c All code ======== 0: 8b 51 08 mov 0x8(%rcx),%edx 3: 48 89 c8 mov %rcx,%rax 6: 65 48 03 05 d4 0e ca add %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 d: 70 e: 48 8b 70 08 mov 0x8(%rax),%rsi 12: 48 39 f2 cmp %rsi,%rdx 15: 75 e7 jne 0xfffffffffffffffe 17: 4c 8b 38 mov (%rax),%r15 1a: 4d 85 ff test %r15,%r15 1d: 0f 84 8f 01 00 00 je 0x1b2 23: 8b 45 20 mov 0x20(%rbp),%eax 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx <-- trapping instruction 2e: 40 f6 c7 0f test $0xf,%dil 32: 0f 85 95 01 00 00 jne 0x1cd 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx 3f: 4c rex.WR Code starting with the faulting instruction =========================================== 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx 4: 40 f6 c7 0f test $0xf,%dil 8: 0f 85 95 01 00 00 jne 0x1a3 e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx 15: 4c rex.WR RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3351, name: alacritty INFO: lockdep is turned off. irq event stamp: 302346 hardirqs last enabled at (302345): kfree_call_rcu (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 (discriminator 2) /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 (discriminator 2) /home/kas/linux/torvalds/kernel/rcu/tree.c:3287 (discriminator 2)) hardirqs last disabled at (302346): idtentry_enter_cond_rcu (/home/kas/linux/torvalds/arch/x86/entry/common.c:652) softirqs last enabled at (302186): __do_softirq (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:26 /home/kas/linux/torvalds/kernel/softirq.c:320) softirqs last disabled at (302175): asm_call_on_stack (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:719) CPU: 13 PID: 3351 Comm: alacritty Tainted: G D I 5.8.0-rc5-00191-g086f86c033f9 #53 Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 Call Trace: dump_stack (/home/kas/linux/torvalds/lib/dump_stack.c:120) ___might_sleep.cold (/home/kas/linux/torvalds/kernel/sched/core.c:6900) exit_signals (/home/kas/linux/torvalds/include/linux/percpu-rwsem.h:51 /home/kas/linux/torvalds/include/linux/cgroup-defs.h:733 /home/kas/linux/torvalds/kernel/signal.c:2828) do_exit (/home/kas/linux/torvalds/kernel/exit.c:764) ? ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 /home/kas/linux/torvalds/fs/ioctl.c:753) rewind_stack_do_exit (/home/kas/linux/torvalds/arch/x86/entry/thunk_64.S:40) RIP: 0033:0x7fbe7ab9cd37 Code: Bad RIP value. Code starting with the faulting instruction =========================================== RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380
This doesn't look related. -James On 7/17/20 2:30 PM, Kirill A. Shutemov wrote: > On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: >> Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() >> family of modifiers to handle broken userspace >> Xorg modesetting and Mesa drivers. >> >> Tested with Xorg 1.20 modesetting driver, >> weston@c46c70dac84a4b3030cd05b380f9f410536690fc, >> gnome & KDE wayland desktops from Ubuntu 18.04, >> and sway 1.5 >> >> Signed-off-by: James Jones <jajones@nvidia.com> > > I tried and it crashes. Not sure if it's related. > > [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF > [drm:vblank_disable_fn] disabling vblank on crtc 0 > [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF > [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_CPU_PREP > [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF > [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF > BUG: unable to handle page fault for address: 00000000ffff059c > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 0 P4D 0 > Oops: 0000 [#1] PREEMPT SMP PTI > CPU: 13 PID: 3351 Comm: alacritty Tainted: G I 5.8.0-rc5-00191-g086f86c033f9 #53 > Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 > RIP: 0010:kmem_cache_alloc_trace (/home/kas/linux/torvalds/mm/slub.c:272 /home/kas/linux/torvalds/mm/slub.c:278 /home/kas/linux/torvalds/mm/slub.c:292 /home/kas/linux/torvalds/mm/slub.c:2791 /home/kas/linux/torvalds/mm/slub.c:2832 /home/kas/linux/torvalds/mm/slub.c:2849) > Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c > All code > ======== > 0: 8b 51 08 mov 0x8(%rcx),%edx > 3: 48 89 c8 mov %rcx,%rax > 6: 65 48 03 05 d4 0e ca add %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 > d: 70 > e: 48 8b 70 08 mov 0x8(%rax),%rsi > 12: 48 39 f2 cmp %rsi,%rdx > 15: 75 e7 jne 0xfffffffffffffffe > 17: 4c 8b 38 mov (%rax),%r15 > 1a: 4d 85 ff test %r15,%r15 > 1d: 0f 84 8f 01 00 00 je 0x1b2 > 23: 8b 45 20 mov 0x20(%rbp),%eax > 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi > 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx <-- trapping instruction > 2e: 40 f6 c7 0f test $0xf,%dil > 32: 0f 85 95 01 00 00 jne 0x1cd > 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx > 3f: 4c rex.WR > > Code starting with the faulting instruction > =========================================== > 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx > 4: 40 f6 c7 0f test $0xf,%dil > 8: 0f 85 95 01 00 00 jne 0x1a3 > e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx > 15: 4c rex.WR > RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 > RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 > RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 > RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 > R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 > [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, DRM_IOCTL_CRTC_GET_SEQUENCE > R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c > FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > nouveau_fence_new (/home/kas/linux/torvalds/include/linux/slab.h:555 /home/kas/linux/torvalds/include/linux/slab.h:669 /home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_fence.c:423) > [drm:drm_vblank_enable] enabling vblank on crtc 0, ret: 0 > nouveau_gem_ioctl_pushbuf (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:852) > [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, DRM_IOCTL_CRTC_QUEUE_SEQUENCE > ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) > ? drm_ioctl_kernel (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) > ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) > drm_ioctl_kernel (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) > drm_ioctl (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:888) > ? nouveau_gem_ioctl_new (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) > ? _raw_spin_unlock_irqrestore (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:160 /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) > ? lockdep_hardirqs_on (/home/kas/linux/torvalds/kernel/locking/lockdep.c:3727 (discriminator 3)) > ? _raw_spin_unlock_irqrestore (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:102 /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:161 /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) > nouveau_drm_ioctl (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_drm.c:1120) > ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 /home/kas/linux/torvalds/fs/ioctl.c:753) > __x64_sys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:762 /home/kas/linux/torvalds/fs/ioctl.c:760 /home/kas/linux/torvalds/fs/ioctl.c:760) > do_syscall_64 (/home/kas/linux/torvalds/arch/x86/entry/common.c:384) > entry_SYSCALL_64_after_hwframe (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:124) > RIP: 0033:0x7fbe7ab9cd37 > Code: Bad RIP value. > > Code starting with the faulting instruction > =========================================== > RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 > RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 > RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 > R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 > R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380 > Modules linked in: > CR2: 00000000ffff059c > ---[ end trace 9db4c132a431e9c9 ]--- > RIP: 0010:kmem_cache_alloc_trace (/home/kas/linux/torvalds/mm/slub.c:272 /home/kas/linux/torvalds/mm/slub.c:278 /home/kas/linux/torvalds/mm/slub.c:292 /home/kas/linux/torvalds/mm/slub.c:2791 /home/kas/linux/torvalds/mm/slub.c:2832 /home/kas/linux/torvalds/mm/slub.c:2849) > Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c > All code > ======== > 0: 8b 51 08 mov 0x8(%rcx),%edx > 3: 48 89 c8 mov %rcx,%rax > 6: 65 48 03 05 d4 0e ca add %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 > d: 70 > e: 48 8b 70 08 mov 0x8(%rax),%rsi > 12: 48 39 f2 cmp %rsi,%rdx > 15: 75 e7 jne 0xfffffffffffffffe > 17: 4c 8b 38 mov (%rax),%r15 > 1a: 4d 85 ff test %r15,%r15 > 1d: 0f 84 8f 01 00 00 je 0x1b2 > 23: 8b 45 20 mov 0x20(%rbp),%eax > 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi > 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx <-- trapping instruction > 2e: 40 f6 c7 0f test $0xf,%dil > 32: 0f 85 95 01 00 00 jne 0x1cd > 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx > 3f: 4c rex.WR > > Code starting with the faulting instruction > =========================================== > 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx > 4: 40 f6 c7 0f test $0xf,%dil > 8: 0f 85 95 01 00 00 jne 0x1a3 > e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx > 15: 4c rex.WR > RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 > RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 > RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 > RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 > R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 > R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c > FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49 > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3351, name: alacritty > INFO: lockdep is turned off. > irq event stamp: 302346 > hardirqs last enabled at (302345): kfree_call_rcu (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 (discriminator 2) /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 (discriminator 2) /home/kas/linux/torvalds/kernel/rcu/tree.c:3287 (discriminator 2)) > hardirqs last disabled at (302346): idtentry_enter_cond_rcu (/home/kas/linux/torvalds/arch/x86/entry/common.c:652) > softirqs last enabled at (302186): __do_softirq (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:26 /home/kas/linux/torvalds/kernel/softirq.c:320) > softirqs last disabled at (302175): asm_call_on_stack (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:719) > CPU: 13 PID: 3351 Comm: alacritty Tainted: G D I 5.8.0-rc5-00191-g086f86c033f9 #53 > Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 > Call Trace: > dump_stack (/home/kas/linux/torvalds/lib/dump_stack.c:120) > ___might_sleep.cold (/home/kas/linux/torvalds/kernel/sched/core.c:6900) > exit_signals (/home/kas/linux/torvalds/include/linux/percpu-rwsem.h:51 /home/kas/linux/torvalds/include/linux/cgroup-defs.h:733 /home/kas/linux/torvalds/kernel/signal.c:2828) > do_exit (/home/kas/linux/torvalds/kernel/exit.c:764) > ? ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 /home/kas/linux/torvalds/fs/ioctl.c:753) > rewind_stack_do_exit (/home/kas/linux/torvalds/arch/x86/entry/thunk_64.S:40) > RIP: 0033:0x7fbe7ab9cd37 > Code: Bad RIP value. > > Code starting with the faulting instruction > =========================================== > RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 > RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 > RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 > R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 > R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380 >
On 7/17/20 12:47 PM, Daniel Vetter wrote: > On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: >> Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() >> family of modifiers to handle broken userspace >> Xorg modesetting and Mesa drivers. >> >> Tested with Xorg 1.20 modesetting driver, >> weston@c46c70dac84a4b3030cd05b380f9f410536690fc, >> gnome & KDE wayland desktops from Ubuntu 18.04, >> and sway 1.5 > > Just bikeshed, but maybe a few more words on what exactly is broken and > how this works around it. Specifically why we only accept these, but don't > advertise them. Added quite a few words. >> >> Signed-off-by: James Jones <jajones@nvidia.com> > > Needs Fixes: line here. Also nice to mention the bug reporter/link. Done in v2. >> --- >> drivers/gpu/drm/nouveau/nouveau_display.c | 26 +++++++++++++++++++++-- >> 1 file changed, 24 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c >> index 496c4621cc78..31543086254b 100644 >> --- a/drivers/gpu/drm/nouveau/nouveau_display.c >> +++ b/drivers/gpu/drm/nouveau/nouveau_display.c >> @@ -191,8 +191,14 @@ nouveau_decode_mod(struct nouveau_drm *drm, >> uint32_t *tile_mode, >> uint8_t *kind) >> { >> + struct nouveau_display *disp = nouveau_display(drm->dev); >> BUG_ON(!tile_mode || !kind); >> >> + if ((modifier & (0xffull << 12)) == 0ull) { >> + /* Legacy modifier. Translate to this device's 'kind.' */ >> + modifier |= disp->format_modifiers[0] & (0xffull << 12); >> + } > > Hm I tried to understand what this magic does by looking at drm_fourcc.h, > but the drm_fourcc_canonicalize_nvidia_format_mod() in there implements > something else. Is that function wrong, or should we use it here instead? > > Or is there something else going on entirely? This may be slightly clearer with the expanded change description: Canonicalize assumes the old modifiers are only used by certain Tegra revisions, because the Mesa patches were supposed to land and obliterate all uses beyond that. That assumption means it can assume the specific page kind (0xfe) used by the display-engine-compatible layout on those specific devices. There is no way to generally canonicalize a legacy modifier without referencing a specific device type, as is indirectly done here. This code does a limited device-specific canonicalization: It substitutes the display-appropriate page kind used by this specific device, ensuring we derive this correct page kind later in the function. I iterated on the best way to accomplish this a few times, and this was the least-invasive thing I came up with, but it does require a pretty thorough understanding of the NVIDIA modifier macros. Thanks for the quick review. -James > > Cheers, Daniel > >> + >> if (modifier == DRM_FORMAT_MOD_LINEAR) { >> /* tile_mode will not be used in this case */ >> *tile_mode = 0; >> @@ -227,6 +233,16 @@ nouveau_framebuffer_get_layout(struct drm_framebuffer *fb, >> } >> } >> >> +static const u64 legacy_modifiers[] = { >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0), >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(1), >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(2), >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(3), >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(4), >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(5), >> + DRM_FORMAT_MOD_INVALID >> +}; >> + >> static int >> nouveau_validate_decode_mod(struct nouveau_drm *drm, >> uint64_t modifier, >> @@ -247,8 +263,14 @@ nouveau_validate_decode_mod(struct nouveau_drm *drm, >> (disp->format_modifiers[mod] != modifier); >> mod++); >> >> - if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) >> - return -EINVAL; >> + if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) { >> + for (mod = 0; >> + (legacy_modifiers[mod] != DRM_FORMAT_MOD_INVALID) && >> + (legacy_modifiers[mod] != modifier); >> + mod++); >> + if (legacy_modifiers[mod] == DRM_FORMAT_MOD_INVALID) >> + return -EINVAL; >> + } >> >> nouveau_decode_mod(drm, modifier, tile_mode, kind); >> >> -- >> 2.17.1 >> >
Did you just cherry-pick my change, or were you running the latest drm-next or drm-fixes code? There do appear to be various MM-related fixes that may be related to this in drm-fixes when I scroll down the log looking for nouveau stuff. Shot in the dark, but might be worth trying with Dave's tree if you weren't already. I was testing with drm-fixes-2020-07-17-1 from here: git://anongit.freedesktop.org/drm/drm Thanks, -James On 7/17/20 8:13 PM, James Jones wrote: > This doesn't look related. > > -James > > On 7/17/20 2:30 PM, Kirill A. Shutemov wrote: >> On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: >>> Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() >>> family of modifiers to handle broken userspace >>> Xorg modesetting and Mesa drivers. >>> >>> Tested with Xorg 1.20 modesetting driver, >>> weston@c46c70dac84a4b3030cd05b380f9f410536690fc, >>> gnome & KDE wayland desktops from Ubuntu 18.04, >>> and sway 1.5 >>> >>> Signed-off-by: James Jones <jajones@nvidia.com> >> >> I tried and it crashes. Not sure if it's related. >> >> [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF >> [drm:vblank_disable_fn] disabling vblank on crtc 0 >> [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF >> [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_CPU_PREP >> [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF >> [drm:drm_ioctl] pid=3351, dev=0xe200, auth=1, NOUVEAU_GEM_PUSHBUF >> BUG: unable to handle page fault for address: 00000000ffff059c >> #PF: supervisor read access in kernel mode >> #PF: error_code(0x0000) - not-present page >> PGD 0 P4D 0 >> Oops: 0000 [#1] PREEMPT SMP PTI >> CPU: 13 PID: 3351 Comm: alacritty Tainted: G I >> 5.8.0-rc5-00191-g086f86c033f9 #53 >> Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 >> Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 >> RIP: 0010:kmem_cache_alloc_trace >> (/home/kas/linux/torvalds/mm/slub.c:272 >> /home/kas/linux/torvalds/mm/slub.c:278 >> /home/kas/linux/torvalds/mm/slub.c:292 >> /home/kas/linux/torvalds/mm/slub.c:2791 >> /home/kas/linux/torvalds/mm/slub.c:2832 >> /home/kas/linux/torvalds/mm/slub.c:2849) >> Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 >> 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b >> 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c >> All code >> ======== >> 0: 8b 51 08 mov 0x8(%rcx),%edx >> 3: 48 89 c8 mov %rcx,%rax >> 6: 65 48 03 05 d4 0e ca add >> %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 >> d: 70 >> e: 48 8b 70 08 mov 0x8(%rax),%rsi >> 12: 48 39 f2 cmp %rsi,%rdx >> 15: 75 e7 jne 0xfffffffffffffffe >> 17: 4c 8b 38 mov (%rax),%r15 >> 1a: 4d 85 ff test %r15,%r15 >> 1d: 0f 84 8f 01 00 00 je 0x1b2 >> 23: 8b 45 20 mov 0x20(%rbp),%eax >> 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi >> 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx >> <-- trapping instruction >> 2e: 40 f6 c7 0f test $0xf,%dil >> 32: 0f 85 95 01 00 00 jne 0x1cd >> 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx >> 3f: 4c rex.WR >> >> Code starting with the faulting instruction >> =========================================== >> 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx >> 4: 40 f6 c7 0f test $0xf,%dil >> 8: 0f 85 95 01 00 00 jne 0x1a3 >> e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx >> 15: 4c rex.WR >> RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 >> RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 >> RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 >> RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 >> R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 >> [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, DRM_IOCTL_CRTC_GET_SEQUENCE >> R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c >> FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) >> knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Call Trace: >> nouveau_fence_new (/home/kas/linux/torvalds/include/linux/slab.h:555 >> /home/kas/linux/torvalds/include/linux/slab.h:669 >> /home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_fence.c:423) >> [drm:drm_vblank_enable] enabling vblank on crtc 0, ret: 0 >> nouveau_gem_ioctl_pushbuf >> (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:852) >> [drm:drm_ioctl] pid=3327, dev=0xe200, auth=1, >> DRM_IOCTL_CRTC_QUEUE_SEQUENCE >> ? nouveau_gem_ioctl_new >> (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) >> ? drm_ioctl_kernel >> (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) >> ? nouveau_gem_ioctl_new >> (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) >> drm_ioctl_kernel >> (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:793) >> drm_ioctl (/home/kas/linux/torvalds/drivers/gpu/drm/drm_ioctl.c:888) >> ? nouveau_gem_ioctl_new >> (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_gem.c:680) >> ? _raw_spin_unlock_irqrestore >> (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 >> /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 >> /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:160 >> /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) >> ? lockdep_hardirqs_on >> (/home/kas/linux/torvalds/kernel/locking/lockdep.c:3727 (discriminator >> 3)) >> ? _raw_spin_unlock_irqrestore >> (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:102 >> /home/kas/linux/torvalds/include/linux/spinlock_api_smp.h:161 >> /home/kas/linux/torvalds/kernel/locking/spinlock.c:191) >> nouveau_drm_ioctl >> (/home/kas/linux/torvalds/drivers/gpu/drm/nouveau/nouveau_drm.c:1120) >> ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 >> /home/kas/linux/torvalds/fs/ioctl.c:753) >> __x64_sys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:762 >> /home/kas/linux/torvalds/fs/ioctl.c:760 >> /home/kas/linux/torvalds/fs/ioctl.c:760) >> do_syscall_64 (/home/kas/linux/torvalds/arch/x86/entry/common.c:384) >> entry_SYSCALL_64_after_hwframe >> (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:124) >> RIP: 0033:0x7fbe7ab9cd37 >> Code: Bad RIP value. >> >> Code starting with the faulting instruction >> =========================================== >> RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 >> RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 >> RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 >> RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 >> R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 >> R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380 >> Modules linked in: >> CR2: 00000000ffff059c >> ---[ end trace 9db4c132a431e9c9 ]--- >> RIP: 0010:kmem_cache_alloc_trace >> (/home/kas/linux/torvalds/mm/slub.c:272 >> /home/kas/linux/torvalds/mm/slub.c:278 >> /home/kas/linux/torvalds/mm/slub.c:292 >> /home/kas/linux/torvalds/mm/slub.c:2791 >> /home/kas/linux/torvalds/mm/slub.c:2832 >> /home/kas/linux/torvalds/mm/slub.c:2849) >> Code: 8b 51 08 48 89 c8 65 48 03 05 d4 0e ca 70 48 8b 70 08 48 39 f2 >> 75 e7 4c 8b 38 4d 85 ff 0f 84 8f 01 00 00 8b 45 20 48 8b 7d 00 <49> 8b >> 1c 07 40 f6 c7 0f 0f 85 95 01 00 00 48 8d 8a 80 00 00 00 4c >> All code >> ======== >> 0: 8b 51 08 mov 0x8(%rcx),%edx >> 3: 48 89 c8 mov %rcx,%rax >> 6: 65 48 03 05 d4 0e ca add >> %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2 >> d: 70 >> e: 48 8b 70 08 mov 0x8(%rax),%rsi >> 12: 48 39 f2 cmp %rsi,%rdx >> 15: 75 e7 jne 0xfffffffffffffffe >> 17: 4c 8b 38 mov (%rax),%r15 >> 1a: 4d 85 ff test %r15,%r15 >> 1d: 0f 84 8f 01 00 00 je 0x1b2 >> 23: 8b 45 20 mov 0x20(%rbp),%eax >> 26: 48 8b 7d 00 mov 0x0(%rbp),%rdi >> 2a:* 49 8b 1c 07 mov (%r15,%rax,1),%rbx >> <-- trapping instruction >> 2e: 40 f6 c7 0f test $0xf,%dil >> 32: 0f 85 95 01 00 00 jne 0x1cd >> 38: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx >> 3f: 4c rex.WR >> >> Code starting with the faulting instruction >> =========================================== >> 0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx >> 4: 40 f6 c7 0f test $0xf,%dil >> 8: 0f 85 95 01 00 00 jne 0x1a3 >> e: 48 8d 8a 80 00 00 00 lea 0x80(%rdx),%rcx >> 15: 4c rex.WR >> RSP: 0018:ffffa8a381bcfba0 EFLAGS: 00010206 >> RAX: 0000000000000030 RBX: ffff9c0b15e05e00 RCX: 000000000003fe50 >> RDX: 000000000000fc8d RSI: 000000000000fc8d RDI: 000000000003fe50 >> RBP: ffff9c0b18407840 R08: 0000000000000000 R09: 0000000000000001 >> R10: ffff9c0b06c28000 R11: 0000000000000001 R12: 0000000000000dc0 >> R13: 0000000000000060 R14: ffffffff8fa35a47 R15: 00000000ffff056c >> FS: 00007fbe7a8e3900(0000) GS:ffff9c0b1f880000(0000) >> knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00000000ffff059c CR3: 000000103c7fe004 CR4: 00000000003606e0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> BUG: sleeping function called from invalid context at >> include/linux/percpu-rwsem.h:49 >> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3351, name: >> alacritty >> INFO: lockdep is turned off. >> irq event stamp: 302346 >> hardirqs last enabled at (302345): kfree_call_rcu >> (/home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:41 >> (discriminator 2) >> /home/kas/linux/torvalds/arch/x86/include/asm/irqflags.h:84 >> (discriminator 2) /home/kas/linux/torvalds/kernel/rcu/tree.c:3287 >> (discriminator 2)) >> hardirqs last disabled at (302346): idtentry_enter_cond_rcu >> (/home/kas/linux/torvalds/arch/x86/entry/common.c:652) >> softirqs last enabled at (302186): __do_softirq >> (/home/kas/linux/torvalds/arch/x86/include/asm/preempt.h:26 >> /home/kas/linux/torvalds/kernel/softirq.c:320) >> softirqs last disabled at (302175): asm_call_on_stack >> (/home/kas/linux/torvalds/arch/x86/entry/entry_64.S:719) >> CPU: 13 PID: 3351 Comm: alacritty Tainted: G D I >> 5.8.0-rc5-00191-g086f86c033f9 #53 >> Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3 >> Pro/X299 AORUS Gaming 3 Pro-CF, BIOS F5d 11/28/2019 >> Call Trace: >> dump_stack (/home/kas/linux/torvalds/lib/dump_stack.c:120) >> ___might_sleep.cold (/home/kas/linux/torvalds/kernel/sched/core.c:6900) >> exit_signals (/home/kas/linux/torvalds/include/linux/percpu-rwsem.h:51 >> /home/kas/linux/torvalds/include/linux/cgroup-defs.h:733 >> /home/kas/linux/torvalds/kernel/signal.c:2828) >> do_exit (/home/kas/linux/torvalds/kernel/exit.c:764) >> ? ksys_ioctl (/home/kas/linux/torvalds/fs/ioctl.c:48 >> /home/kas/linux/torvalds/fs/ioctl.c:753) >> rewind_stack_do_exit >> (/home/kas/linux/torvalds/arch/x86/entry/thunk_64.S:40) >> RIP: 0033:0x7fbe7ab9cd37 >> Code: Bad RIP value. >> >> Code starting with the faulting instruction >> =========================================== >> RSP: 002b:00007fff4c0cf0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 >> RAX: ffffffffffffffda RBX: 00007fff4c0cf150 RCX: 00007fbe7ab9cd37 >> RDX: 00007fff4c0cf150 RSI: 00000000c0406481 RDI: 0000000000000009 >> RBP: 00000000c0406481 R08: 000055db823225e8 R09: 000055db8231b5e8 >> R10: 00007fbe79ce0920 R11: 0000000000000246 R12: 000055db823115e0 >> R13: 0000000000000009 R14: 000055db8230ffc0 R15: 000055db8230f380 >> > _______________________________________________ > Nouveau mailing list > Nouveau@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
On Sat, Jul 18, 2020 at 5:43 AM James Jones <jajones@nvidia.com> wrote: > > On 7/17/20 12:47 PM, Daniel Vetter wrote: > > On Fri, Jul 17, 2020 at 11:57:57AM -0700, James Jones wrote: > >> Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() > >> family of modifiers to handle broken userspace > >> Xorg modesetting and Mesa drivers. > >> > >> Tested with Xorg 1.20 modesetting driver, > >> weston@c46c70dac84a4b3030cd05b380f9f410536690fc, > >> gnome & KDE wayland desktops from Ubuntu 18.04, > >> and sway 1.5 > > > > Just bikeshed, but maybe a few more words on what exactly is broken and > > how this works around it. Specifically why we only accept these, but don't > > advertise them. > > Added quite a few words. > > >> > >> Signed-off-by: James Jones <jajones@nvidia.com> > > > > Needs Fixes: line here. Also nice to mention the bug reporter/link. > > Done in v2. > > >> --- > >> drivers/gpu/drm/nouveau/nouveau_display.c | 26 +++++++++++++++++++++-- > >> 1 file changed, 24 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c > >> index 496c4621cc78..31543086254b 100644 > >> --- a/drivers/gpu/drm/nouveau/nouveau_display.c > >> +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > >> @@ -191,8 +191,14 @@ nouveau_decode_mod(struct nouveau_drm *drm, > >> uint32_t *tile_mode, > >> uint8_t *kind) > >> { > >> + struct nouveau_display *disp = nouveau_display(drm->dev); > >> BUG_ON(!tile_mode || !kind); > >> > >> + if ((modifier & (0xffull << 12)) == 0ull) { > >> + /* Legacy modifier. Translate to this device's 'kind.' */ > >> + modifier |= disp->format_modifiers[0] & (0xffull << 12); > >> + } > > > > Hm I tried to understand what this magic does by looking at drm_fourcc.h, > > but the drm_fourcc_canonicalize_nvidia_format_mod() in there implements > > something else. Is that function wrong, or should we use it here instead? > > > > Or is there something else going on entirely? > > This may be slightly clearer with the expanded change description: > > Canonicalize assumes the old modifiers are only used by certain Tegra > revisions, because the Mesa patches were supposed to land and obliterate > all uses beyond that. That assumption means it can assume the specific > page kind (0xfe) used by the display-engine-compatible layout on those > specific devices. There is no way to generally canonicalize a legacy > modifier without referencing a specific device type, as is indirectly > done here. > > This code does a limited device-specific canonicalization: It > substitutes the display-appropriate page kind used by this specific > device, ensuring we derive this correct page kind later in the function. > I iterated on the best way to accomplish this a few times, and this > was the least-invasive thing I came up with, but it does require a > pretty thorough understanding of the NVIDIA modifier macros. > > Thanks for the quick review. Ah yes this makes a ton more sense with your explanation of what's all going on. Thanks for all the explaining, but probably better if someone with real nouveau clues takes a look too. Fwiw (i.e. not much) Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cheers, Daniel > > -James > > > > > Cheers, Daniel > > > >> + > >> if (modifier == DRM_FORMAT_MOD_LINEAR) { > >> /* tile_mode will not be used in this case */ > >> *tile_mode = 0; > >> @@ -227,6 +233,16 @@ nouveau_framebuffer_get_layout(struct drm_framebuffer *fb, > >> } > >> } > >> > >> +static const u64 legacy_modifiers[] = { > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0), > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(1), > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(2), > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(3), > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(4), > >> + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(5), > >> + DRM_FORMAT_MOD_INVALID > >> +}; > >> + > >> static int > >> nouveau_validate_decode_mod(struct nouveau_drm *drm, > >> uint64_t modifier, > >> @@ -247,8 +263,14 @@ nouveau_validate_decode_mod(struct nouveau_drm *drm, > >> (disp->format_modifiers[mod] != modifier); > >> mod++); > >> > >> - if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > >> - return -EINVAL; > >> + if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) { > >> + for (mod = 0; > >> + (legacy_modifiers[mod] != DRM_FORMAT_MOD_INVALID) && > >> + (legacy_modifiers[mod] != modifier); > >> + mod++); > >> + if (legacy_modifiers[mod] == DRM_FORMAT_MOD_INVALID) > >> + return -EINVAL; > >> + } > >> > >> nouveau_decode_mod(drm, modifier, tile_mode, kind); > >> > >> -- > >> 2.17.1 > >> > >
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 496c4621cc78..31543086254b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -191,8 +191,14 @@ nouveau_decode_mod(struct nouveau_drm *drm, uint32_t *tile_mode, uint8_t *kind) { + struct nouveau_display *disp = nouveau_display(drm->dev); BUG_ON(!tile_mode || !kind); + if ((modifier & (0xffull << 12)) == 0ull) { + /* Legacy modifier. Translate to this device's 'kind.' */ + modifier |= disp->format_modifiers[0] & (0xffull << 12); + } + if (modifier == DRM_FORMAT_MOD_LINEAR) { /* tile_mode will not be used in this case */ *tile_mode = 0; @@ -227,6 +233,16 @@ nouveau_framebuffer_get_layout(struct drm_framebuffer *fb, } } +static const u64 legacy_modifiers[] = { + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0), + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(1), + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(2), + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(3), + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(4), + DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(5), + DRM_FORMAT_MOD_INVALID +}; + static int nouveau_validate_decode_mod(struct nouveau_drm *drm, uint64_t modifier, @@ -247,8 +263,14 @@ nouveau_validate_decode_mod(struct nouveau_drm *drm, (disp->format_modifiers[mod] != modifier); mod++); - if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) - return -EINVAL; + if (disp->format_modifiers[mod] == DRM_FORMAT_MOD_INVALID) { + for (mod = 0; + (legacy_modifiers[mod] != DRM_FORMAT_MOD_INVALID) && + (legacy_modifiers[mod] != modifier); + mod++); + if (legacy_modifiers[mod] == DRM_FORMAT_MOD_INVALID) + return -EINVAL; + } nouveau_decode_mod(drm, modifier, tile_mode, kind);
Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK() family of modifiers to handle broken userspace Xorg modesetting and Mesa drivers. Tested with Xorg 1.20 modesetting driver, weston@c46c70dac84a4b3030cd05b380f9f410536690fc, gnome & KDE wayland desktops from Ubuntu 18.04, and sway 1.5 Signed-off-by: James Jones <jajones@nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_display.c | 26 +++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)