Message ID | 20211223094435.248523-4-bhe@redhat.com |
---|---|
State | New |
Series | Handle warning of allocation failure on DMA zone w/o managed pages |
On 12/23/21 3:44 AM, Baoquan He wrote:
> In kdump kernel of x86_64, page allocation failure is observed:
>
>  kworker/u2:2: page allocation failure: order:0, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
......
> For now, let's mute the warning of allocation failure if requesting pages
> from DMA zone while no managed pages.
>
> Fixes: 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified")
> Cc: stable@vger.kernel.org
> Signed-off-by: Baoquan He <bhe@redhat.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>

......
On Thu, Dec 23, 2021 at 05:44:35PM +0800, Baoquan He wrote:
> In kdump kernel of x86_64, page allocation failure is observed:
......
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7c7a0b5de2ff..843bc8e5550a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4204,7 +4204,8 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
>  	va_list args;
>  	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
>
> -	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
> +	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
> +			(gfp_mask & __GFP_DMA) && !has_managed_dma())
>  		return;
>

Warning when there are never any pages in the DMA zone is unnecessary, and it confuses users.

The patch looks good.
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

There are also some drivers that allocate memory with GFP_DMA even though that flag is unnecessary. We need to clean those up later.

Baoquan, are you planning to do that soon? I would like to help.

Merry Christmas,
Hyeonggon

>  	va_start(args, fmt);
> --
> 2.26.3
On 12/25/21 at 05:53am, Hyeonggon Yoo wrote:
> On Thu, Dec 23, 2021 at 05:44:35PM +0800, Baoquan He wrote:
......
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 7c7a0b5de2ff..843bc8e5550a 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4204,7 +4204,8 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
> >  	va_list args;
> >  	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
> >
> > -	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
> > +	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
> > +			(gfp_mask & __GFP_DMA) && !has_managed_dma())
> >  		return;
> >
>
> Warning when there are never any pages in the DMA zone is unnecessary, and it confuses users.
>
> The patch looks good.
> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>
> There are also some drivers that allocate memory with GFP_DMA even though that flag is unnecessary. We need to clean those up later.

Thanks for reviewing and for the awesome suggestions.

> Baoquan, are you planning to do that soon? I would like to help.

Yes, I have that plan and have already done a small part of it. I talked to Christoph about my idea: collect all kmalloc(GFP_DMA) call sites and post an RFC mail, CC'ing the mailing lists and the related maintainers, so that anyone who is interested in or familiar with one or several of the call sites can help.

Christoph has now handled everything under drivers/scsi and posted patches to fix those. I have gone through the other places and found the call sites below, where GFP_DMA can be dropped from the kmalloc() call directly because it is not needed. I even found one place using kmalloc(GFP_DMA32):

  (HEAD -> master) vxge: don't use GFP_DMA
  mtd: rawnand: marvell: don't use GFP_DMA
  HID: intel-ish-hid: remove wrong GFP_DMA32 flag
  ps3disk: don't use GFP_DMA
  atm: iphase: don't use GFP_DMA

Next, I will send an RFC mail listing those suspect call sites so we can track them and help where needed. The suggested ways to change them (see the sketch following this mail) are:

1) use dma_alloc_xx(), or dma_map_xx() after a plain kmalloc()
2) use alloc_pages(GFP_DMA) instead

When we fix a call site, we should all post the patch with the subject keywords 'xxxx: don't use GFP_DMA'. Christoph has already posted patches with similar subjects, so we can search the subject line to collect all related patches for later backporting.

I will add you to CC when sending it, possibly tomorrow. Any suggestions or thoughts?

Thanks
Baoquan
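A minimal sketch of the two suggested conversions, assuming a hypothetical driver: the device pointer dev, the buffer length len, the DMA direction, and the function names below are illustrative placeholders, not taken from any of the call sites listed above.

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/slab.h>

/* Option 1a: coherent allocation; the DMA API picks memory that satisfies
 * the device's DMA mask, so no explicit GFP_DMA is needed. */
static void *example_alloc_coherent(struct device *dev, size_t len,
				    dma_addr_t *dma_handle)
{
	return dma_alloc_coherent(dev, len, dma_handle, GFP_KERNEL);
}

/* Option 1b: plain kmalloc() plus a streaming mapping; bounce buffering
 * (swiotlb) covers devices that cannot reach the allocated memory. */
static void *example_alloc_and_map(struct device *dev, size_t len,
				   dma_addr_t *dma_handle)
{
	void *buf = kmalloc(len, GFP_KERNEL);

	if (!buf)
		return NULL;

	*dma_handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, *dma_handle)) {
		kfree(buf);
		return NULL;
	}
	return buf;
}

/* Option 2: if memory from ZONE_DMA really is required, allocate whole
 * pages instead of going through the dma-kmalloc caches. */
static struct page *example_alloc_low_pages(size_t len)
{
	return alloc_pages(GFP_KERNEL | GFP_DMA, get_order(len));
}

The mapping-based variants are usually preferable because the DMA API, not the slab allocator, decides whether the device can reach the memory, which is exactly why the explicit GFP_DMA in most of these drivers is unnecessary.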
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7c7a0b5de2ff..843bc8e5550a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4204,7 +4204,8 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 	va_list args;
 	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
 
-	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
+	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
+			(gfp_mask & __GFP_DMA) && !has_managed_dma())
 		return;
 
 	va_start(args, fmt);
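The new check relies on has_managed_dma(), which is added by an earlier patch in this series. As a rough, non-authoritative sketch of what that helper does (reconstructed here for context rather than copied from the series), it reports whether any node's ZONE_DMA still has managed pages:

#include <linux/mmzone.h>

#ifdef CONFIG_ZONE_DMA
bool has_managed_dma(void)
{
	struct pglist_data *pgdat;

	for_each_online_pgdat(pgdat) {
		struct zone *zone = &pgdat->node_zones[ZONE_DMA];

		/* managed_zone() is true when the zone has pages the buddy
		 * allocator can actually hand out. */
		if (managed_zone(zone))
			return true;
	}
	return false;
}
#endif	/* CONFIG_ZONE_DMA; otherwise a stub returning false is assumed. */

In the x86_64 kdump kernel described in the commit message below, ZONE_DMA exists but contains no managed pages, so has_managed_dma() returns false and the allocation-failure warning is skipped for __GFP_DMA requests.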
In kdump kernel of x86_64, page allocation failure is observed:

 kworker/u2:2: page allocation failure: order:0, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
 CPU: 0 PID: 55 Comm: kworker/u2:2 Not tainted 5.16.0-rc4+ #5
 Hardware name: AMD Dinar/Dinar, BIOS RDN1505B 06/05/2013
 Workqueue: events_unbound async_run_entry_fn
 Call Trace:
  <TASK>
  dump_stack_lvl+0x48/0x5e
  warn_alloc.cold+0x72/0xd6
  __alloc_pages_slowpath.constprop.0+0xc69/0xcd0
  __alloc_pages+0x1df/0x210
  new_slab+0x389/0x4d0
  ___slab_alloc+0x58f/0x770
  __slab_alloc.constprop.0+0x4a/0x80
  kmem_cache_alloc_trace+0x24b/0x2c0
  sr_probe+0x1db/0x620
  ......
  device_add+0x405/0x920
  ......
  __scsi_add_device+0xe5/0x100
  ata_scsi_scan_host+0x97/0x1d0
  async_run_entry_fn+0x30/0x130
  process_one_work+0x1e8/0x3c0
  worker_thread+0x50/0x3b0
  ? rescuer_thread+0x350/0x350
  kthread+0x16b/0x190
  ? set_kthread_struct+0x40/0x40
  ret_from_fork+0x22/0x30
  </TASK>
 Mem-Info:
 ......

The above failure happened when calling kmalloc() to allocate buffer with
GFP_DMA. It requests to allocate slab page from DMA zone while no managed
pages at all in there.

 sr_probe()
 --> get_capabilities()
     --> buffer = kmalloc(512, GFP_KERNEL | GFP_DMA);

Because in the current kernel, dma-kmalloc will be created as long as
CONFIG_ZONE_DMA is enabled. However, kdump kernel of x86_64 doesn't have
managed pages on DMA zone since commit 6f599d84231f ("x86/kdump: Always
reserve the low 1M when the crashkernel option is specified"). The failure
can be always reproduced.

For now, let's mute the warning of allocation failure if requesting pages
from DMA zone while no managed pages.

Fixes: 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified")
Cc: stable@vger.kernel.org
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)