Message ID | 20240205092626.v2.1.Id9ad163b60d21c9e56c2d686b0cc9083a8ba7924@changeid (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] regset: use kvzalloc() for regset_get_alloc() |
Hi,

On Mon, Feb 5, 2024 at 9:27 AM Douglas Anderson <dianders@chromium.org> wrote:
>
> [...]
>
> An order 5 allocation is still something to avoid unless necessary and
> nothing needs the memory here to be contiguous. Change the allocation
> to kvzalloc() which should still be efficient for small allocations
> but doesn't force the memory subsystem to work hard (and maybe fail)
> at getting a large contiguous chunk.
>
> [...]
>
>  fs/binfmt_elf.c | 2 +-
>  kernel/regset.c | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)

Just wanted to check in to see if there's anything else that I need to
do here. Mark's patch to avoid the order 7 allocations [1] has landed,
but we still want this kvzalloc() because the order 5 allocations
can't really be avoided. I'm happy to sit tight for longer but just
wanted to make sure it was clear that we still want my patch _in
addition_ to Mark's patch and to see if there was anything else you
needed me to do.

Thanks!

[1] https://lore.kernel.org/r/20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org
Hi,

On Mon, Feb 26, 2024 at 3:55 PM Doug Anderson <dianders@chromium.org> wrote:
>
> [...]
>
> Just wanted to check in to see if there's anything else that I need to
> do here. Mark's patch to avoid the order 7 allocations [1] has landed,
> but we still want this kvzalloc() because the order 5 allocations
> can't really be avoided. I'm happy to sit tight for longer but just
> wanted to make sure it was clear that we still want my patch _in
> addition_ to Mark's patch and to see if there was anything else you
> needed me to do.
>
> Thanks!
>
> [1] https://lore.kernel.org/r/20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org

I'm not trying to be a pest here, so if this is on someone's todo list
and they'll get to it eventually then feel free to tell me to go away
and I'll snooze this for another few months. I just want to make sure
it's not forgotten.

I've been assuming that someone like Al Viro or Christian Brauner
would land this patch eventually and I know Al responded rather
quickly to my v1 [2]. I think all of Al's issues were resolved by Mark
Brown's patch [1] (which has landed in the arm64 tree) and my updating
of the patch description in v2. I see that Al and Christian are
flagged as maintainers of "fs/binfmt_elf.c" which is one of the two
files I'm touching, so that's mostly why I was assuming they would
land it.

...but I realize that perhaps my assumptions are wrong and this needs
to go through a different maintainer. In this case (if I'm reading it
correctly) Al and Christian are listed because the file is under "fs"
even though this isn't _really_ much of a filesystem-related patch.
Perhaps this needs to go through something like Andrew Morton's tree
since he often picks up patches that have nowhere else to land? If
someone else has suggestions, I'm all ears. I'm also happy to repost
this patch in case it helps with a maintainer applying it.

Thanks!

-Doug

[1] https://lore.kernel.org/r/20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org
[2] https://lore.kernel.org/r/20240202012249.GU2087318@ZenIV/
On Thu, Mar 28, 2024 at 07:16:37AM -0700, Doug Anderson wrote:
> [...]
>
> ...but I realize that perhaps my assumptions are wrong and this needs
> to go through a different maintainer. In this case (if I'm reading it
> correctly) Al and Christian are listed because the file is under "fs"
> even though this isn't _really_ much of a filesystem-related patch.
> Perhaps this needs to go through something like Andrew Morton's tree
> since he often picks up patches that have nowhere else to land? If
> someone else has suggestions, I'm all ears. I'm also happy to repost
> this patch in case it helps with a maintainer applying it.

FWIW, for this patch:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Yeah, normally Al or Christian would take it but with their ack we can
also take it through the arm64 tree (or Andrew can pick it up through
the mm tree).

With Mark's fix, I assume this is no longer urgent, cc stable material,
but rather something nice in the future to reduce the risk of allocation
failure on this path.
Hi,

On Thu, Mar 28, 2024 at 8:33 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > I'm not trying to be a pest here, so if this is on someone's todo list
> > and they'll get to it eventually then feel free to tell me to go away
> > and I'll snooze this for another few months. I just want to make sure
> > it's not forgotten.
> >
> > I've been assuming that someone like Al Viro or Christian Brauner
> > would land this patch eventually and I know Al responded rather
> > quickly to my v1 [2]. I think all of Al's issues were resolved by Mark
> > Brown's patch [1] (which has landed in the arm64 tree) and my updating
> > of the patch description in v2. I see that Al and Christian are
> > flagged as maintainers of "fs/binfmt_elf.c" which is one of the two
> > files I'm touching, so that's mostly why I was assuming they would
> > land it.
> >
> > ...but I realize that perhaps my assumptions are wrong and this needs
> > to go through a different maintainer. In this case (if I'm reading it
> > correctly) Al and Christian are listed because the file is under "fs"
> > even though this isn't _really_ much of a filesystem-related patch.
> > Perhaps this needs to go through something like Andrew Morton's tree
> > since he often picks up patches that have nowhere else to land? If
> > someone else has suggestions, I'm all ears. I'm also happy to repost
> > this patch in case it helps with a maintainer applying it.
>
> FWIW, for this patch:
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks!

> Yeah, normally Al or Christian would take it but with their ack we can
> also take it through the arm64 tree (or Andrew can pick it up through
> the mm tree).

OK, let's see what folks say.

> With Mark's fix, I assume this is no longer urgent, cc stable material,
> but rather something nice in the future to reduce the risk of allocation
> failure on this path.

It's not quite as urgent as before Mark's fix, which gets rid of the
order 7 allocation. ...but an unnecessary order 5 allocation is still
nothing to sneeze at. I'd let others make the decision about whether
to CC stable, but I'll at least advocate backporting it to all the
kernel trees I'm directly involved in.

-Doug
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..ac178ad38823 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1928,7 +1928,7 @@ static void free_note_info(struct elf_note_info *info)
 		threads = t->next;
 		WARN_ON(t->notes[0].data && t->notes[0].data != &t->prstatus);
 		for (i = 1; i < info->thread_notes; ++i)
-			kfree(t->notes[i].data);
+			kvfree(t->notes[i].data);
 		kfree(t);
 	}
 	kfree(info->psinfo.data);
diff --git a/kernel/regset.c b/kernel/regset.c
index 586823786f39..b2871fa68b2a 100644
--- a/kernel/regset.c
+++ b/kernel/regset.c
@@ -16,14 +16,14 @@ static int __regset_get(struct task_struct *target,
 	if (size > regset->n * regset->size)
 		size = regset->n * regset->size;
 	if (!p) {
-		to_free = p = kzalloc(size, GFP_KERNEL);
+		to_free = p = kvzalloc(size, GFP_KERNEL);
 		if (!p)
 			return -ENOMEM;
 	}
 	res = regset->regset_get(target, regset,
 			   (struct membuf){.p = p, .left = size});
 	if (res < 0) {
-		kfree(to_free);
+		kvfree(to_free);
 		return res;
 	}
 	*data = p;
@@ -71,6 +71,6 @@ int copy_regset_to_user(struct task_struct *target,
 	ret = regset_get_alloc(target, regset, size, &buf);
 	if (ret > 0)
 		ret = copy_to_user(data, buf, ret) ? -EFAULT : 0;
-	kfree(buf);
+	kvfree(buf);
 	return ret;
 }
While browsing through ChromeOS crash reports, I found one with an
allocation failure that looked like this:

  chrome: page allocation failure: order:7,
          mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
          nodemask=(null),cpuset=urgent,mems_allowed=0
  CPU: 7 PID: 3295 Comm: chrome Not tainted
          5.15.133-20574-g8044615ac35c #1 (HASH:1162 1)
  Hardware name: Google Lazor (rev3 - 8) with KB Backlight (DT)
  Call trace:
  ...
  warn_alloc+0x104/0x174
  __alloc_pages+0x5f0/0x6e4
  kmalloc_order+0x44/0x98
  kmalloc_order_trace+0x34/0x124
  __kmalloc+0x228/0x36c
  __regset_get+0x68/0xcc
  regset_get_alloc+0x1c/0x28
  elf_core_dump+0x3d8/0xd8c
  do_coredump+0xeb8/0x1378
  get_signal+0x14c/0x804
  ...

An order 7 allocation is (1 << 7) contiguous pages, or 512K. It's not
a surprise that this allocation failed on a system that's been running
for a while.

More digging showed that it was fairly easy to see the order 7
allocation by just sending a SIGQUIT to chrome (or other processes) to
generate a core dump. The actual amount being allocated was 279,584
bytes and it was for "core_note_type" NT_ARM_SVE.

There was quite a bit of discussion [1] on the mailing lists in
response to my v1 patch attempting to switch to vmalloc. The overall
conclusion was that we could likely reduce the 279,584 byte allocation
by quite a bit and Mark Brown has sent a patch to that effect [2].
However even with the 279,584 byte allocation gone there are still
65,552 byte allocations. These are just barely more than the 65,536
bytes and thus would require an order 5 allocation.

An order 5 allocation is still something to avoid unless necessary and
nothing needs the memory here to be contiguous. Change the allocation
to kvzalloc() which should still be efficient for small allocations
but doesn't force the memory subsystem to work hard (and maybe fail)
at getting a large contiguous chunk.

[1] https://lore.kernel.org/r/20240201171159.1.Id9ad163b60d21c9e56c2d686b0cc9083a8ba7924@changeid
[2] https://lore.kernel.org/r/20240203-arm64-sve-ptrace-regset-size-v1-1-2c3ba1386b9e@kernel.org

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v2:
- Use kvzalloc() instead of vmalloc().
- Update description based on v1 discussion.

 fs/binfmt_elf.c | 2 +-
 kernel/regset.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
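
The order figures quoted in the description (order 7 for 279,584 bytes, order 5 for 65,552 bytes, and 65,536 bytes as the largest size that still fits in order 4) can be checked with a small userspace sketch. It assumes 4 KiB pages, which is what "order 7 ... or 512K" implies; arm64 kernels built with 16K or 64K pages would give different orders. The names sketch_page_order() and sketch_check_thread_numbers() are hypothetical helpers for illustration only; this is not the kernel's get_order() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, matching "order 7 == 512K" in the description. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Return the smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * Hypothetical userspace stand-in for the page-order calculation; it only
 * illustrates the arithmetic and is not the kernel's code.
 */
static unsigned int sketch_page_order(size_t size)
{
	/* Round the byte count up to whole pages. */
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	unsigned int order = 0;

	/* Page allocations come in power-of-two page counts. */
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Check the sizes mentioned in the patch description. */
static void sketch_check_thread_numbers(void)
{
	assert(sketch_page_order(279584) == 7); /* 69 pages -> 128 pages */
	assert(sketch_page_order(65552) == 5);  /* 17 pages -> 32 pages */
	assert(sketch_page_order(65536) == 4);  /* exactly 16 pages */
}
```

The 65,552 byte case shows why the remaining allocation is awkward: the 16 bytes beyond 65,536 push the request from 16 pages to 17, and a physically contiguous allocation then has to round up to 32 pages.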