Message ID | 20220707102453.3633-6-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/6] drm/ttm: rename and cleanup ttm_bo_init_reserved |
>-----Original Message-----
>From: dri-devel <dri-devel-bounces@lists.freedesktop.org> On Behalf Of
>Christian König
>Sent: Thursday, July 7, 2022 6:25 AM
>To: intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
>nouveau@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
>Cc: Christian König <christian.koenig@amd.com>
>Subject: [PATCH 5/6] drm/ttm: stop allocating dummy resources during BO
>creation
>
>That should not be necessary any more when drivers should at least be
>able to handle the move without a resource.
>
>Signed-off-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>

M

>---
> drivers/gpu/drm/ttm/ttm_bo.c | 7 -------
> 1 file changed, 7 deletions(-)
>
>diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>index a2f49bdda8a1..f491be751a2f 100644
>--- a/drivers/gpu/drm/ttm/ttm_bo.c
>+++ b/drivers/gpu/drm/ttm/ttm_bo.c
>@@ -960,7 +960,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>struct ttm_buffer_object *bo,
> 			 struct sg_table *sg, struct dma_resv *resv,
> 			 void (*destroy) (struct ttm_buffer_object *))
> {
>-	static const struct ttm_place sys_mem = { .mem_type =
>TTM_PL_SYSTEM };
> 	int ret;
>
> 	kref_init(&bo->kref);
>@@ -978,12 +977,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>struct ttm_buffer_object *bo,
> 	bo->base.resv = &bo->base._resv;
> 	atomic_inc(&ttm_glob.bo_count);
>
>-	ret = ttm_resource_alloc(bo, &sys_mem, &bo->resource);
>-	if (unlikely(ret)) {
>-		ttm_bo_put(bo);
>-		return ret;
>-	}
>-
> 	/*
> 	 * For ttm_bo_type_device buffers, allocate
> 	 * address space from the device.
>--
>2.25.1
Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: b297c22b70703b3eefc45d1e58f9caff08db5f9e ("[PATCH 5/6] drm/ttm: stop allocating dummy resources during BO creation")
url: https://github.com/intel-lab-lkp/linux/commits/Christian-K-nig/drm-ttm-rename-and-cleanup-ttm_bo_init_reserved/20220707-192538
base: git://anongit.freedesktop.org/drm/drm-tip drm-tip
patch link: https://lore.kernel.org/intel-gfx/20220707102453.3633-6-christian.koenig@amd.com

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+------------------------------------------+------------+------------+
|                                          | dafc2f4605 | b297c22b70 |
+------------------------------------------+------------+------------+
| boot_successes                           | 10         | 0          |
| boot_failures                            | 0          | 10         |
| canonical_address#:#[##]                 | 0          | 10         |
| RIP:ttm_bo_validate[ttm]                 | 0          | 10         |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 10         |
+------------------------------------------+------------+------------+

If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@intel.com>

[ 40.164641][ T182] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc048 irq 15
[ 40.170483][ T184] [drm] Found bochs VGA, ID 0xb0c5.
[ 40.171513][ T184] [drm] Framebuffer size 16384 kB @ 0xfd000000, mmio @ 0xfebf0000.
[ 40.176577][ T184] [drm] Found EDID data blob.
[ 40.187472][ T184] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 0
[ 40.191227][ T184] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN PTI
[ 40.192995][ T184] KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
[ 40.194411][ T184] CPU: 1 PID: 184 Comm: systemd-udevd Not tainted 5.19.0-rc4-00721-gb297c22b7070 #1
[ 40.196063][ T184] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 40.199605][ T184] RIP: 0010:ttm_bo_validate (kbuild/src/consumer/drivers/gpu/drm/ttm/ttm_bo.c:915) ttm
[ 40.200754][ T184] Code: e8 72 c5 ff ff 83 f8 b8 74 d4 85 c0 75 54 49 8b 9e 58 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 10 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 3c 03 7e 44 8b 53 10 31 c0 85 d2 0f 85 58
All code
========
   0:   e8 72 c5 ff ff          callq  0xffffffffffffc577
   5:   83 f8 b8                cmp    $0xffffffb8,%eax
   8:   74 d4                   je     0xffffffffffffffde
   a:   85 c0                   test   %eax,%eax
   c:   75 54                   jne    0x62
   e:   49 8b 9e 58 01 00 00    mov    0x158(%r14),%rbx
  15:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
  1c:   fc ff df
  1f:   48 8d 7b 10             lea    0x10(%rbx),%rdi
  23:   48 89 fa                mov    %rdi,%rdx
  26:   48 c1 ea 03             shr    $0x3,%rdx
  2a:*  0f b6 04 02             movzbl (%rdx,%rax,1),%eax   <-- trapping instruction
  2e:   84 c0                   test   %al,%al
  30:   74 04                   je     0x36
  32:   3c 03                   cmp    $0x3,%al
  34:   7e 44                   jle    0x7a
  36:   8b 53 10                mov    0x10(%rbx),%edx
  39:   31 c0                   xor    %eax,%eax
  3b:   85 d2                   test   %edx,%edx
  3d:   0f                      .byte 0xf
  3e:   85                      .byte 0x85
  3f:   58                      pop    %rax

Code starting with the faulting instruction
===========================================
   0:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
   4:   84 c0                   test   %al,%al
   6:   74 04                   je     0xc
   8:   3c 03                   cmp    $0x3,%al
   a:   7e 44                   jle    0x50
   c:   8b 53 10                mov    0x10(%rbx),%edx
   f:   31 c0                   xor    %eax,%eax
  11:   85 d2                   test   %edx,%edx
  13:   0f                      .byte 0xf
  14:   85                      .byte 0x85
  15:   58                      pop    %rax
[ 40.203685][ T184] RSP: 0018:ffffc900006df0c8 EFLAGS: 00010202
[ 40.204630][ T184] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1102f4bb71b
[ 40.205864][ T184] RDX: 0000000000000002 RSI: ffffc900006df208 RDI: 0000000000000010
[ 40.207102][ T184] RBP: 1ffff920000dbe1a R08: ffffc900006df208 R09: 0000000000000000
[ 40.208394][ T184] R10: ffff88817a5f0000 R11: 0000000000000001 R12: ffffc900006df110
[ 40.209692][ T184] R13: ffffc900006df0f0 R14: ffff88817a5db800 R15: ffffc900006df208
[ 40.210862][ T184] FS: 00007f6b1d16e8c0(0000) GS:ffff88839d700000(0000) knlGS:0000000000000000
[ 40.212250][ T184] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 40.213275][ T184] CR2: 000055a1001d4ff0 CR3: 00000001700f4000 CR4: 00000000000006e0
[ 40.214469][ T184] Call Trace:
[ 40.214974][ T184] <TASK>
[ 40.215438][ T184] ? ttm_bo_bounce_temp_buffer+0x140/0x140 ttm
[ 40.216572][ T184] ? mutex_spin_on_owner (kbuild/src/consumer/kernel/locking/mutex.c:771)
[ 40.217456][ T184] ? drm_vma_offset_add (kbuild/src/consumer/include/drm/drm_mm.h:439 kbuild/src/consumer/include/drm/drm_mm.h:462 kbuild/src/consumer/drivers/gpu/drm/drm_vma_manager.c:209) drm
[ 40.218457][ T184] ttm_bo_init_reserved (kbuild/src/consumer/drivers/gpu/drm/ttm/ttm_bo.c:999) ttm

To reproduce:

        # build kernel
        cd linux
        cp config-5.19.0-rc4-00721-gb297c22b7070 .config
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a2f49bdda8a1..f491be751a2f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -960,7 +960,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
 			 struct sg_table *sg, struct dma_resv *resv,
 			 void (*destroy) (struct ttm_buffer_object *))
 {
-	static const struct ttm_place sys_mem = { .mem_type = TTM_PL_SYSTEM };
 	int ret;
 
 	kref_init(&bo->kref);
@@ -978,12 +977,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
 	bo->base.resv = &bo->base._resv;
 	atomic_inc(&ttm_glob.bo_count);
 
-	ret = ttm_resource_alloc(bo, &sys_mem, &bo->resource);
-	if (unlikely(ret)) {
-		ttm_bo_put(bo);
-		return ret;
-	}
-
 	/*
 	 * For ttm_bo_type_device buffers, allocate
 	 * address space from the device.
That should not be necessary any more when drivers should at least be
able to handle the move without a resource.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 7 -------
 1 file changed, 7 deletions(-)