Message ID | 20200607163803.GA10303@duo.ucw.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | 82fef0ad811f "x86/mm: unencrypted non-blocking DMA allocations use coherent pools" was Re: next-0519 on thinkpad x60: sound related? window manager crash | expand |
On Sun, 7 Jun 2020, Pavel Machek wrote: > > I have a similar issue, caused between aaa2faab4ed8 and b170290c2836. > > > > [ 20.263098] BUG: unable to handle page fault for address: ffffb2b582cc2000 > > [ 20.263104] #PF: supervisor write access in kernel mode > > [ 20.263105] #PF: error_code(0x000b) - reserved bit violation > > [ 20.263107] PGD 3fd03b067 P4D 3fd03b067 PUD 3fd03c067 PMD 3f8822067 PTE 8000273942ab2163 > > [ 20.263113] Oops: 000b [#1] PREEMPT SMP > > [ 20.263117] CPU: 3 PID: 691 Comm: mpv Not tainted 5.7.0-11262-gb170290c2836 #1 > > [ 20.263119] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Pro4, BIOS P4.10 03/05/2020 > > [ 20.263125] RIP: 0010:__memset+0x24/0x30 > > [ 20.263128] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 > > [ 20.263131] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 > > [ 20.263133] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 > > [ 20.263134] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 > > [ 20.263136] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 > > [ 20.263137] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 > > [ 20.263139] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 > > [ 20.263141] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 > > [ 20.263143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 20.263144] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 > > [ 20.263146] Call Trace: > > [ 20.263151] ? snd_pcm_hw_params+0x3f3/0x47a > > [ 20.263154] ? snd_pcm_common_ioctl+0xf2/0xf73 > > [ 20.263158] ? snd_pcm_ioctl+0x1e/0x29 > > [ 20.263161] ? ksys_ioctl+0x77/0x91 > > [ 20.263163] ? __x64_sys_ioctl+0x11/0x14 > > [ 20.263166] ? do_syscall_64+0x3d/0xf5 > > [ 20.263170] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 20.263173] Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev snd_usb_audio videobuf2_common snd_hwdep snd_usbmidi_lib input_leds snd_rawmidi led_class > > [ 20.263182] CR2: ffffb2b582cc2000 > > [ 20.263184] ---[ end trace c6b47a774b91f0a0 ]--- > > [ 20.263187] RIP: 0010:__memset+0x24/0x30 > > [ 20.263190] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 > > [ 20.263192] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 > > [ 20.263193] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 > > [ 20.263195] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 > > [ 20.263196] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 > > [ 20.263197] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 > > [ 20.263199] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 > > [ 20.263201] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 > > [ 20.263202] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 20.263204] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 > > > > I bisected this to 82fef0ad811f "x86/mm: unencrypted non-blocking DMA > > allocations use coherent pools". Reverting 1ee18de92927 resolves the > > issue. > > > > Looks like Thinkpad X60 doesn't have VT-d, but could still be DMA > > related. > > Note that newer -next releases seem to behave okay for me. The commit > pointed out by siection is really simple: > > AFAIK you could verify it is responsible by turning off > CONFIG_AMD_MEM_ENCRYPT on latest kernel... > > Best regards, > Pavel > > index 1d6104ea8af0..2bf2222819d3 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -1520,6 +1520,7 @@ config X86_CPA_STATISTICS > config AMD_MEM_ENCRYPT > bool "AMD Secure Memory Encryption (SME) support" > depends on X86_64 && CPU_SUP_AMD > + select DMA_COHERENT_POOL > select DYNAMIC_PHYSICAL_MASK > select ARCH_USE_MEMREMAP_PROT > select ARCH_HAS_FORCE_DMA_UNENCRYPTED Thanks for the report! Besides CONFIG_AMD_MEM_ENCRYPT, do you have CONFIG_DMA_DIRECT_REMAP enabled? If so, it may be caused by the virtual address passed to the set_memory_{decrypted,encrypted}() functions. And I assume you are enabling SME by using mem_encrypt=on on the kernel command line or CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is enabled. We likely need an atomic pool for devices that support DMA to addresses in sme_me_mask as well. I can test this tomorrow, but wanted to get it out early to see if it helps? --- diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c --- a/kernel/dma/pool.c +++ b/kernel/dma/pool.c @@ -13,6 +13,8 @@ #include <linux/slab.h> #include <linux/workqueue.h> +static struct gen_pool *atomic_pool __ro_after_init; +static unsigned long pool_size; static struct gen_pool *atomic_pool_dma __ro_after_init; static unsigned long pool_size_dma; static struct gen_pool *atomic_pool_dma32 __ro_after_init; @@ -41,24 +43,37 @@ static void __init dma_atomic_pool_debugfs_init(void) if (IS_ERR_OR_NULL(root)) return; + debugfs_create_ulong("pool_size", 0400, root, &pool_size); debugfs_create_ulong("pool_size_dma", 0400, root, &pool_size_dma); debugfs_create_ulong("pool_size_dma32", 0400, root, &pool_size_dma32); debugfs_create_ulong("pool_size_kernel", 0400, root, &pool_size_kernel); } -static void dma_atomic_pool_size_add(gfp_t gfp, size_t size) +static gfp_t dma_atomic_pool_gfp(void) { - if (gfp & __GFP_DMA) + if (IS_ENABLED(CONFIG_ZONE_DMA)) + return GFP_KERNEL | GFP_DMA; + if (IS_ENABLED(CONFIG_ZONE_DMA32)) + return GFP_KERNEL | GFP_DMA32; + return GFP_KERNEL; +} + +static void dma_atomic_pool_size_add(struct gen_pool *pool, size_t size) +{ + if (pool == atomic_pool) + pool_size += size; + else if (pool == atomic_pool_dma) pool_size_dma += size; - else if (gfp & __GFP_DMA32) + else if (pool == atomic_pool_dma32) pool_size_dma32 += size; - else + else if (pool == atomic_pool_kernel) pool_size_kernel += size; } static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size, gfp_t gfp) { + bool decrypt = pool != atomic_pool; unsigned int order; struct page *page; void *addr; @@ -94,8 +109,9 @@ static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size, * Memory in the atomic DMA pools must be unencrypted, the pools do not * shrink so no re-encryption occurs in dma_direct_free_pages(). */ - ret = set_memory_decrypted((unsigned long)page_to_virt(page), - 1 << order); + if (decrypt) + ret = set_memory_decrypted((unsigned long)page_to_virt(page), + 1 << order); if (ret) goto remove_mapping; ret = gen_pool_add_virt(pool, (unsigned long)addr, page_to_phys(page), @@ -103,12 +119,13 @@ static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size, if (ret) goto encrypt_mapping; - dma_atomic_pool_size_add(gfp, pool_size); + dma_atomic_pool_size_add(pool, pool_size); return 0; encrypt_mapping: - ret = set_memory_encrypted((unsigned long)page_to_virt(page), - 1 << order); + if (decrypt) + ret = set_memory_encrypted((unsigned long)page_to_virt(page), + 1 << order); if (WARN_ON_ONCE(ret)) { /* Decrypt succeeded but encrypt failed, purposely leak */ goto out; @@ -132,6 +149,7 @@ static void atomic_pool_resize(struct gen_pool *pool, gfp_t gfp) static void atomic_pool_work_fn(struct work_struct *work) { + atomic_pool_resize(atomic_pool, dma_atomic_pool_gfp()); if (IS_ENABLED(CONFIG_ZONE_DMA)) atomic_pool_resize(atomic_pool_dma, GFP_KERNEL | GFP_DMA); @@ -182,6 +200,10 @@ static int __init dma_atomic_pool_init(void) } INIT_WORK(&atomic_pool_work, atomic_pool_work_fn); + atomic_pool = __dma_atomic_pool_init(atomic_pool_size, + dma_atomic_pool_gfp()); + if (!atomic_pool) + ret = -ENOMEM; atomic_pool_kernel = __dma_atomic_pool_init(atomic_pool_size, GFP_KERNEL); if (!atomic_pool_kernel) @@ -209,6 +231,9 @@ static inline struct gen_pool *dev_to_pool(struct device *dev) u64 phys_mask; gfp_t gfp; + if (!force_dma_unencrypted(dev)) + return atomic_pool; + gfp = dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask, &phys_mask); if (IS_ENABLED(CONFIG_ZONE_DMA) && gfp == GFP_DMA)
Excerpts from David Rientjes's message of June 7, 2020 3:41 pm: > On Sun, 7 Jun 2020, Pavel Machek wrote: > >> > I have a similar issue, caused between aaa2faab4ed8 and b170290c2836. >> > >> > [ 20.263098] BUG: unable to handle page fault for address: ffffb2b582cc2000 >> > [ 20.263104] #PF: supervisor write access in kernel mode >> > [ 20.263105] #PF: error_code(0x000b) - reserved bit violation >> > [ 20.263107] PGD 3fd03b067 P4D 3fd03b067 PUD 3fd03c067 PMD 3f8822067 PTE 8000273942ab2163 >> > [ 20.263113] Oops: 000b [#1] PREEMPT SMP >> > [ 20.263117] CPU: 3 PID: 691 Comm: mpv Not tainted 5.7.0-11262-gb170290c2836 #1 >> > [ 20.263119] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Pro4, BIOS P4.10 03/05/2020 >> > [ 20.263125] RIP: 0010:__memset+0x24/0x30 >> > [ 20.263128] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 >> > [ 20.263131] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 >> > [ 20.263133] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 >> > [ 20.263134] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 >> > [ 20.263136] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 >> > [ 20.263137] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 >> > [ 20.263139] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 >> > [ 20.263141] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 >> > [ 20.263143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > [ 20.263144] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 >> > [ 20.263146] Call Trace: >> > [ 20.263151] ? snd_pcm_hw_params+0x3f3/0x47a >> > [ 20.263154] ? snd_pcm_common_ioctl+0xf2/0xf73 >> > [ 20.263158] ? snd_pcm_ioctl+0x1e/0x29 >> > [ 20.263161] ? ksys_ioctl+0x77/0x91 >> > [ 20.263163] ? __x64_sys_ioctl+0x11/0x14 >> > [ 20.263166] ? do_syscall_64+0x3d/0xf5 >> > [ 20.263170] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> > [ 20.263173] Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev snd_usb_audio videobuf2_common snd_hwdep snd_usbmidi_lib input_leds snd_rawmidi led_class >> > [ 20.263182] CR2: ffffb2b582cc2000 >> > [ 20.263184] ---[ end trace c6b47a774b91f0a0 ]--- >> > [ 20.263187] RIP: 0010:__memset+0x24/0x30 >> > [ 20.263190] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 >> > [ 20.263192] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 >> > [ 20.263193] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 >> > [ 20.263195] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 >> > [ 20.263196] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 >> > [ 20.263197] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 >> > [ 20.263199] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 >> > [ 20.263201] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 >> > [ 20.263202] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > [ 20.263204] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 >> > >> > I bisected this to 82fef0ad811f "x86/mm: unencrypted non-blocking DMA >> > allocations use coherent pools". Reverting 1ee18de92927 resolves the >> > issue. >> > >> > Looks like Thinkpad X60 doesn't have VT-d, but could still be DMA >> > related. >> >> Note that newer -next releases seem to behave okay for me. The commit >> pointed out by siection is really simple: >> >> AFAIK you could verify it is responsible by turning off >> CONFIG_AMD_MEM_ENCRYPT on latest kernel... >> >> Best regards, >> Pavel >> >> index 1d6104ea8af0..2bf2222819d3 100644 >> --- a/arch/x86/Kconfig >> +++ b/arch/x86/Kconfig >> @@ -1520,6 +1520,7 @@ config X86_CPA_STATISTICS >> config AMD_MEM_ENCRYPT >> bool "AMD Secure Memory Encryption (SME) support" >> depends on X86_64 && CPU_SUP_AMD >> + select DMA_COHERENT_POOL >> select DYNAMIC_PHYSICAL_MASK >> select ARCH_USE_MEMREMAP_PROT >> select ARCH_HAS_FORCE_DMA_UNENCRYPTED > > Thanks for the report! > > Besides CONFIG_AMD_MEM_ENCRYPT, do you have CONFIG_DMA_DIRECT_REMAP > enabled? If so, it may be caused by the virtual address passed to the > set_memory_{decrypted,encrypted}() functions. > > And I assume you are enabling SME by using mem_encrypt=on on the kernel > command line or CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is enabled. > > We likely need an atomic pool for devices that support DMA to addresses in > sme_me_mask as well. I can test this tomorrow, but wanted to get it out > early to see if it helps? This patch doesn't seem to help. I have the same problem (kernel page fault, __memset, snd_pcm_hw_params...). I don't have CONFIG_DMA_DIRECT_REMAP enabled, and AFAICT it doesn't seem to be selectable currently on x86, unless there are some patches floating around for that.
On Sun, 7 Jun 2020, Alex Xu (Hello71) wrote: > > On Sun, 7 Jun 2020, Pavel Machek wrote: > > > >> > I have a similar issue, caused between aaa2faab4ed8 and b170290c2836. > >> > > >> > [ 20.263098] BUG: unable to handle page fault for address: ffffb2b582cc2000 > >> > [ 20.263104] #PF: supervisor write access in kernel mode > >> > [ 20.263105] #PF: error_code(0x000b) - reserved bit violation > >> > [ 20.263107] PGD 3fd03b067 P4D 3fd03b067 PUD 3fd03c067 PMD 3f8822067 PTE 8000273942ab2163 > >> > [ 20.263113] Oops: 000b [#1] PREEMPT SMP > >> > [ 20.263117] CPU: 3 PID: 691 Comm: mpv Not tainted 5.7.0-11262-gb170290c2836 #1 > >> > [ 20.263119] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Pro4, BIOS P4.10 03/05/2020 > >> > [ 20.263125] RIP: 0010:__memset+0x24/0x30 > >> > [ 20.263128] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 > >> > [ 20.263131] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 > >> > [ 20.263133] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 > >> > [ 20.263134] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 > >> > [ 20.263136] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 > >> > [ 20.263137] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 > >> > [ 20.263139] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 > >> > [ 20.263141] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 > >> > [ 20.263143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> > [ 20.263144] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 > >> > [ 20.263146] Call Trace: > >> > [ 20.263151] ? snd_pcm_hw_params+0x3f3/0x47a > >> > [ 20.263154] ? snd_pcm_common_ioctl+0xf2/0xf73 > >> > [ 20.263158] ? snd_pcm_ioctl+0x1e/0x29 > >> > [ 20.263161] ? ksys_ioctl+0x77/0x91 > >> > [ 20.263163] ? __x64_sys_ioctl+0x11/0x14 > >> > [ 20.263166] ? do_syscall_64+0x3d/0xf5 > >> > [ 20.263170] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 > >> > [ 20.263173] Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev snd_usb_audio videobuf2_common snd_hwdep snd_usbmidi_lib input_leds snd_rawmidi led_class > >> > [ 20.263182] CR2: ffffb2b582cc2000 > >> > [ 20.263184] ---[ end trace c6b47a774b91f0a0 ]--- > >> > [ 20.263187] RIP: 0010:__memset+0x24/0x30 > >> > [ 20.263190] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3 > >> > [ 20.263192] RSP: 0018:ffffb2b583d07e10 EFLAGS: 00010216 > >> > [ 20.263193] RAX: 0000000000000000 RBX: ffff8b8000102c00 RCX: 0000000000004000 > >> > [ 20.263195] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb2b582cc2000 > >> > [ 20.263196] RBP: ffff8b8000101000 R08: 0000000000000000 R09: ffffb2b582cc2000 > >> > [ 20.263197] R10: 0000000000005356 R11: ffff8b8000102c18 R12: 0000000000000000 > >> > [ 20.263199] R13: 0000000000000000 R14: ffff8b8039944200 R15: ffffffff9794daa0 > >> > [ 20.263201] FS: 00007f41aa4b4200(0000) GS:ffff8b803ecc0000(0000) knlGS:0000000000000000 > >> > [ 20.263202] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> > [ 20.263204] CR2: ffffb2b582cc2000 CR3: 00000003b6731000 CR4: 00000000003406e0 > >> > > >> > I bisected this to 82fef0ad811f "x86/mm: unencrypted non-blocking DMA > >> > allocations use coherent pools". Reverting 1ee18de92927 resolves the > >> > issue. > >> > > >> > Looks like Thinkpad X60 doesn't have VT-d, but could still be DMA > >> > related. > >> > >> Note that newer -next releases seem to behave okay for me. The commit > >> pointed out by siection is really simple: > >> > >> AFAIK you could verify it is responsible by turning off > >> CONFIG_AMD_MEM_ENCRYPT on latest kernel... > >> > >> Best regards, > >> Pavel > >> > >> index 1d6104ea8af0..2bf2222819d3 100644 > >> --- a/arch/x86/Kconfig > >> +++ b/arch/x86/Kconfig > >> @@ -1520,6 +1520,7 @@ config X86_CPA_STATISTICS > >> config AMD_MEM_ENCRYPT > >> bool "AMD Secure Memory Encryption (SME) support" > >> depends on X86_64 && CPU_SUP_AMD > >> + select DMA_COHERENT_POOL > >> select DYNAMIC_PHYSICAL_MASK > >> select ARCH_USE_MEMREMAP_PROT > >> select ARCH_HAS_FORCE_DMA_UNENCRYPTED > > > > Thanks for the report! > > > > Besides CONFIG_AMD_MEM_ENCRYPT, do you have CONFIG_DMA_DIRECT_REMAP > > enabled? If so, it may be caused by the virtual address passed to the > > set_memory_{decrypted,encrypted}() functions. > > > > And I assume you are enabling SME by using mem_encrypt=on on the kernel > > command line or CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is enabled. > > > > We likely need an atomic pool for devices that support DMA to addresses in > > sme_me_mask as well. I can test this tomorrow, but wanted to get it out > > early to see if it helps? > > This patch doesn't seem to help. I have the same problem (kernel page > fault, __memset, snd_pcm_hw_params...). > > I don't have CONFIG_DMA_DIRECT_REMAP enabled, and AFAICT it doesn't seem > to be selectable currently on x86, unless there are some patches > floating around for that. > Thanks for trying it out, Alex. Would you mind sending your .config and command line? I assume either mem_encrypt=on or CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is enabled. Could you also give this a try? diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -99,10 +99,11 @@ static inline bool dma_should_alloc_from_pool(struct device *dev, gfp_t gfp, static inline bool dma_should_free_from_pool(struct device *dev, unsigned long attrs) { - if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL)) + if (!IS_ENABLED(CONFIG_DMA_COHERENT_POOL)) + return false; + if (force_dma_unencrypted(dev)) return true; - if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && - !force_dma_unencrypted(dev)) + if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) return false; if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP)) return true;
Excerpts from David Rientjes's message of June 7, 2020 8:57 pm: > Thanks for trying it out, Alex. Would you mind sending your .config and > command line? I assume either mem_encrypt=on or > CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is enabled. > > Could you also give this a try? > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c > --- a/kernel/dma/direct.c > +++ b/kernel/dma/direct.c > @@ -99,10 +99,11 @@ static inline bool dma_should_alloc_from_pool(struct device *dev, gfp_t gfp, > static inline bool dma_should_free_from_pool(struct device *dev, > unsigned long attrs) > { > - if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL)) > + if (!IS_ENABLED(CONFIG_DMA_COHERENT_POOL)) > + return false; > + if (force_dma_unencrypted(dev)) > return true; > - if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && > - !force_dma_unencrypted(dev)) > + if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) > return false; > if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP)) > return true; > This patch doesn't work for me either. It has since occurred to me that while I do have CONFIG_AMD_MEM_ENCYRPT=y, I have CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n, because it was broken with amdgpu (unfortunately a downgrade from radeon in this respect). Tried it again just now and it looks like it's now able to enable KMS, but all it displays is serious-looking errors. Sorry for not mentioning that earlier. I'll send you my .config and command line off-list. Thanks, Alex.
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1520,6 +1520,7 @@ config X86_CPA_STATISTICS config AMD_MEM_ENCRYPT bool "AMD Secure Memory Encryption (SME) support" depends on X86_64 && CPU_SUP_AMD + select DMA_COHERENT_POOL select DYNAMIC_PHYSICAL_MASK select ARCH_USE_MEMREMAP_PROT select ARCH_HAS_FORCE_DMA_UNENCRYPTED