Message ID | 20220225103443.225228-1-matthew.auld@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915: apply PM_EARLY for non-GTT mappable objects | expand |
On 2/25/22 11:34, Matthew Auld wrote: > On DG2 we allow objects that are smaller than the min_page_size, under > the premise that these are never mapped by the GTT, like with the paging > structures. Currently the suspend-resume path will try to map such > objects through the migration vm, which hits: > > [ 560.529217] kernel BUG at drivers/gpu/drm/i915/gt/intel_migrate.c:431! > [ 560.536081] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI > [ 560.541629] CPU: 4 PID: 2062 Comm: rtcwake Tainted: G W 5.17.0-rc5-demarchi+ #175 > [ 560.550716] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021 > [ 560.563627] RIP: 0010:emit_pte+0x2e7/0x380 [i915] > [ 560.568665] Code: ee 02 48 89 69 04 83 c6 05 83 c0 05 39 f0 0f 4f c6 48 8b 73 08 39 d0 0f 4f c2 44 89 f2 4c 8d 4a ff 49 85 f1 0f 84 62 fe ff ff <0f> 0b 48 c7 03 00 00 00 00 4d 89 c6 8b 01 48 29 ce 48 8d 57 0c 48 > [ 560.587691] RSP: 0018:ffffc9000104f8a0 EFLAGS: 00010206 > [ 560.592906] RAX: 0000000000000040 RBX: ffffc9000104f908 RCX: ffffc900025114d0 > [ 560.600024] RDX: 0000000000010000 RSI: 00000003f9fe2000 RDI: ffffc900025114dc > [ 560.607458] RBP: 0000000001840000 R08: ffff88810f335540 R09: 000000000000ffff > [ 560.614865] R10: 000000000000081b R11: 0000000000000001 R12: 000000000000081b > [ 560.622300] R13: 0000000000000000 R14: 0000000000010000 R15: ffff888107c3e240 > [ 560.629716] FS: 00007f5b7c086580(0000) GS:ffff88846dc00000(0000) knlGS:0000000000000000 > [ 560.638090] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 560.644132] CR2: 00007f3ab0a133a8 CR3: 000000010a43e003 CR4: 00000000003706e0 > [ 560.651590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 560.659002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 560.666438] Call Trace: > [ 560.668885] <TASK> > [ 560.670983] intel_context_migrate_copy+0x1b1/0x4c0 [i915] > [ 560.676794] __i915_ttm_move+0x628/0x790 [i915] > [ 560.681704] ? dma_resv_iter_next+0x8f/0xb0 > [ 560.686223] ? dma_resv_iter_first+0xe5/0x140 > [ 560.690894] ? i915_deps_add_resv+0x4b/0x110 [i915] > [ 560.696147] ? dma_resv_reserve_shared+0x161/0x310 > [ 560.701228] i915_gem_obj_copy_ttm+0x10f/0x220 [i915] > [ 560.706650] i915_ttm_backup+0x191/0x2f0 [i915] > [ 560.711558] i915_gem_process_region+0x266/0x3b0 [i915] > [ 560.717153] ? verify_cpu+0xf0/0x100 > [ 560.721040] ? pci_pm_resume_early+0x20/0x20 > [ 560.725603] i915_ttm_backup_region+0x47/0x70 [i915] > [ 560.730927] i915_gem_backup_suspend+0x141/0x170 [i91 > > For now let's just force the memcpy path for such objects during > suspend-resume. > > Fixes: 00e27ad85bc9 ("drm/i915/migrate: add acceleration support for DG2") > Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> > Signed-off-by: Matthew Auld <matthew.auld@intel.com> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> > --- > drivers/gpu/drm/i915/gem/i915_gem_region.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c > index a4350227e9ae..6d65c5849874 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_region.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c > @@ -67,6 +67,17 @@ i915_gem_object_create_region(struct intel_memory_region *mem, > if (!obj) > return ERR_PTR(-ENOMEM); > > + /* > + * Anything smaller than the min_page_size can't be freely inserted into > + * the GTT, due to alignemnt restrictions. For such special objects, s/alignemnt/alignment/ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> > + * make sure we force memcpy based suspend-resume. In the future we can > + * revisit this, either by allowing special mis-aligned objects in the > + * migration path, or by mapping all of LMEM upfront using cheap 1G > + * GTT entries. > + */ > + if (default_page_size < mem->min_page_size) > + flags |= I915_BO_ALLOC_PM_EARLY; > + > err = mem->ops->init_object(mem, obj, size, page_size, flags); > if (err) > goto err_object_free;
On Fri, Feb 25, 2022 at 10:34:43AM +0000, Matthew Auld wrote: >On DG2 we allow objects that are smaller than the min_page_size, under >the premise that these are never mapped by the GTT, like with the paging >structures. Currently the suspend-resume path will try to map such >objects through the migration vm, which hits: > >[ 560.529217] kernel BUG at drivers/gpu/drm/i915/gt/intel_migrate.c:431! >[ 560.536081] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI >[ 560.541629] CPU: 4 PID: 2062 Comm: rtcwake Tainted: G W 5.17.0-rc5-demarchi+ #175 >[ 560.550716] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021 >[ 560.563627] RIP: 0010:emit_pte+0x2e7/0x380 [i915] >[ 560.568665] Code: ee 02 48 89 69 04 83 c6 05 83 c0 05 39 f0 0f 4f c6 48 8b 73 08 39 d0 0f 4f c2 44 89 f2 4c 8d 4a ff 49 85 f1 0f 84 62 fe ff ff <0f> 0b 48 c7 03 00 00 00 00 4d 89 c6 8b 01 48 29 ce 48 8d 57 0c 48 >[ 560.587691] RSP: 0018:ffffc9000104f8a0 EFLAGS: 00010206 >[ 560.592906] RAX: 0000000000000040 RBX: ffffc9000104f908 RCX: ffffc900025114d0 >[ 560.600024] RDX: 0000000000010000 RSI: 00000003f9fe2000 RDI: ffffc900025114dc >[ 560.607458] RBP: 0000000001840000 R08: ffff88810f335540 R09: 000000000000ffff >[ 560.614865] R10: 000000000000081b R11: 0000000000000001 R12: 000000000000081b >[ 560.622300] R13: 0000000000000000 R14: 0000000000010000 R15: ffff888107c3e240 >[ 560.629716] FS: 00007f5b7c086580(0000) GS:ffff88846dc00000(0000) knlGS:0000000000000000 >[ 560.638090] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >[ 560.644132] CR2: 00007f3ab0a133a8 CR3: 000000010a43e003 CR4: 00000000003706e0 >[ 560.651590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >[ 560.659002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >[ 560.666438] Call Trace: >[ 560.668885] <TASK> >[ 560.670983] intel_context_migrate_copy+0x1b1/0x4c0 [i915] >[ 560.676794] __i915_ttm_move+0x628/0x790 [i915] >[ 560.681704] ? dma_resv_iter_next+0x8f/0xb0 >[ 560.686223] ? dma_resv_iter_first+0xe5/0x140 >[ 560.690894] ? i915_deps_add_resv+0x4b/0x110 [i915] >[ 560.696147] ? dma_resv_reserve_shared+0x161/0x310 >[ 560.701228] i915_gem_obj_copy_ttm+0x10f/0x220 [i915] >[ 560.706650] i915_ttm_backup+0x191/0x2f0 [i915] >[ 560.711558] i915_gem_process_region+0x266/0x3b0 [i915] >[ 560.717153] ? verify_cpu+0xf0/0x100 >[ 560.721040] ? pci_pm_resume_early+0x20/0x20 >[ 560.725603] i915_ttm_backup_region+0x47/0x70 [i915] >[ 560.730927] i915_gem_backup_suspend+0x141/0x170 [i91 > >For now let's just force the memcpy path for such objects during >suspend-resume. > >Fixes: 00e27ad85bc9 ("drm/i915/migrate: add acceleration support for DG2") >Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> >Signed-off-by: Matthew Auld <matthew.auld@intel.com> >Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> thanks Lucas De Marchi
On Fri, Feb 25, 2022 at 05:14:28PM -0800, Lucas De Marchi wrote: >On Fri, Feb 25, 2022 at 10:34:43AM +0000, Matthew Auld wrote: >>On DG2 we allow objects that are smaller than the min_page_size, under >>the premise that these are never mapped by the GTT, like with the paging >>structures. Currently the suspend-resume path will try to map such >>objects through the migration vm, which hits: >> >>[ 560.529217] kernel BUG at drivers/gpu/drm/i915/gt/intel_migrate.c:431! >>[ 560.536081] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI >>[ 560.541629] CPU: 4 PID: 2062 Comm: rtcwake Tainted: G W 5.17.0-rc5-demarchi+ #175 >>[ 560.550716] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021 >>[ 560.563627] RIP: 0010:emit_pte+0x2e7/0x380 [i915] >>[ 560.568665] Code: ee 02 48 89 69 04 83 c6 05 83 c0 05 39 f0 0f 4f c6 48 8b 73 08 39 d0 0f 4f c2 44 89 f2 4c 8d 4a ff 49 85 f1 0f 84 62 fe ff ff <0f> 0b 48 c7 03 00 00 00 00 4d 89 c6 8b 01 48 29 ce 48 8d 57 0c 48 >>[ 560.587691] RSP: 0018:ffffc9000104f8a0 EFLAGS: 00010206 >>[ 560.592906] RAX: 0000000000000040 RBX: ffffc9000104f908 RCX: ffffc900025114d0 >>[ 560.600024] RDX: 0000000000010000 RSI: 00000003f9fe2000 RDI: ffffc900025114dc >>[ 560.607458] RBP: 0000000001840000 R08: ffff88810f335540 R09: 000000000000ffff >>[ 560.614865] R10: 000000000000081b R11: 0000000000000001 R12: 000000000000081b >>[ 560.622300] R13: 0000000000000000 R14: 0000000000010000 R15: ffff888107c3e240 >>[ 560.629716] FS: 00007f5b7c086580(0000) GS:ffff88846dc00000(0000) knlGS:0000000000000000 >>[ 560.638090] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>[ 560.644132] CR2: 00007f3ab0a133a8 CR3: 000000010a43e003 CR4: 00000000003706e0 >>[ 560.651590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>[ 560.659002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>[ 560.666438] Call Trace: >>[ 560.668885] <TASK> >>[ 560.670983] intel_context_migrate_copy+0x1b1/0x4c0 [i915] >>[ 560.676794] __i915_ttm_move+0x628/0x790 [i915] >>[ 560.681704] ? dma_resv_iter_next+0x8f/0xb0 >>[ 560.686223] ? dma_resv_iter_first+0xe5/0x140 >>[ 560.690894] ? i915_deps_add_resv+0x4b/0x110 [i915] >>[ 560.696147] ? dma_resv_reserve_shared+0x161/0x310 >>[ 560.701228] i915_gem_obj_copy_ttm+0x10f/0x220 [i915] >>[ 560.706650] i915_ttm_backup+0x191/0x2f0 [i915] >>[ 560.711558] i915_gem_process_region+0x266/0x3b0 [i915] >>[ 560.717153] ? verify_cpu+0xf0/0x100 >>[ 560.721040] ? pci_pm_resume_early+0x20/0x20 >>[ 560.725603] i915_ttm_backup_region+0x47/0x70 [i915] >>[ 560.730927] i915_gem_backup_suspend+0x141/0x170 [i91 >> >>For now let's just force the memcpy path for such objects during >>suspend-resume. >> >>Fixes: 00e27ad85bc9 ("drm/i915/migrate: add acceleration support for DG2") >>Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> >>Signed-off-by: Matthew Auld <matthew.auld@intel.com> >>Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> > > >Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> double checked on a dg2 system it's indeed working and pushed. thanks Lucas De Marchi > >thanks >Lucas De Marchi
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c index a4350227e9ae..6d65c5849874 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_region.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c @@ -67,6 +67,17 @@ i915_gem_object_create_region(struct intel_memory_region *mem, if (!obj) return ERR_PTR(-ENOMEM); + /* + * Anything smaller than the min_page_size can't be freely inserted into + * the GTT, due to alignemnt restrictions. For such special objects, + * make sure we force memcpy based suspend-resume. In the future we can + * revisit this, either by allowing special mis-aligned objects in the + * migration path, or by mapping all of LMEM upfront using cheap 1G + * GTT entries. + */ + if (default_page_size < mem->min_page_size) + flags |= I915_BO_ALLOC_PM_EARLY; + err = mem->ops->init_object(mem, obj, size, page_size, flags); if (err) goto err_object_free;
On DG2 we allow objects that are smaller than the min_page_size, under the premise that these are never mapped by the GTT, like with the paging structures. Currently the suspend-resume path will try to map such objects through the migration vm, which hits: [ 560.529217] kernel BUG at drivers/gpu/drm/i915/gt/intel_migrate.c:431! [ 560.536081] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 560.541629] CPU: 4 PID: 2062 Comm: rtcwake Tainted: G W 5.17.0-rc5-demarchi+ #175 [ 560.550716] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021 [ 560.563627] RIP: 0010:emit_pte+0x2e7/0x380 [i915] [ 560.568665] Code: ee 02 48 89 69 04 83 c6 05 83 c0 05 39 f0 0f 4f c6 48 8b 73 08 39 d0 0f 4f c2 44 89 f2 4c 8d 4a ff 49 85 f1 0f 84 62 fe ff ff <0f> 0b 48 c7 03 00 00 00 00 4d 89 c6 8b 01 48 29 ce 48 8d 57 0c 48 [ 560.587691] RSP: 0018:ffffc9000104f8a0 EFLAGS: 00010206 [ 560.592906] RAX: 0000000000000040 RBX: ffffc9000104f908 RCX: ffffc900025114d0 [ 560.600024] RDX: 0000000000010000 RSI: 00000003f9fe2000 RDI: ffffc900025114dc [ 560.607458] RBP: 0000000001840000 R08: ffff88810f335540 R09: 000000000000ffff [ 560.614865] R10: 000000000000081b R11: 0000000000000001 R12: 000000000000081b [ 560.622300] R13: 0000000000000000 R14: 0000000000010000 R15: ffff888107c3e240 [ 560.629716] FS: 00007f5b7c086580(0000) GS:ffff88846dc00000(0000) knlGS:0000000000000000 [ 560.638090] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.644132] CR2: 00007f3ab0a133a8 CR3: 000000010a43e003 CR4: 00000000003706e0 [ 560.651590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.659002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 560.666438] Call Trace: [ 560.668885] <TASK> [ 560.670983] intel_context_migrate_copy+0x1b1/0x4c0 [i915] [ 560.676794] __i915_ttm_move+0x628/0x790 [i915] [ 560.681704] ? dma_resv_iter_next+0x8f/0xb0 [ 560.686223] ? dma_resv_iter_first+0xe5/0x140 [ 560.690894] ? i915_deps_add_resv+0x4b/0x110 [i915] [ 560.696147] ? dma_resv_reserve_shared+0x161/0x310 [ 560.701228] i915_gem_obj_copy_ttm+0x10f/0x220 [i915] [ 560.706650] i915_ttm_backup+0x191/0x2f0 [i915] [ 560.711558] i915_gem_process_region+0x266/0x3b0 [i915] [ 560.717153] ? verify_cpu+0xf0/0x100 [ 560.721040] ? pci_pm_resume_early+0x20/0x20 [ 560.725603] i915_ttm_backup_region+0x47/0x70 [i915] [ 560.730927] i915_gem_backup_suspend+0x141/0x170 [i91 For now let's just force the memcpy path for such objects during suspend-resume. Fixes: 00e27ad85bc9 ("drm/i915/migrate: add acceleration support for DG2") Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> --- drivers/gpu/drm/i915/gem/i915_gem_region.c | 11 +++++++++++ 1 file changed, 11 insertions(+)