Message ID | 20200304030206.1706-1-jaewon31.kim@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: mmap: show vm_unmapped_area error log | expand |
Hello sorry for build warning. I've changed %d to %ld reported by kbuild. Let me attach full patch again below. -------------------------------------------------- Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. Virtual memory space shortage of a task on mmap is reported to userspace as -ENOMEM. It can be confused as physical memory shortage of overall system. The vm_unmapped_area can be called to by some drivers or other kernel core system like filesystem. It can be hard to know which code layer returns the -ENOMEM. Print error log of vm_unmapped_area with rate limited. Without rate limited, soft lockup ocurrs on infinite mmap sytem call. i.e.) <3>[ 576.024088] [6: mmap_infinite:14251] mmap: vm_unmapped_area err:-12 total_vm:0xfee08 flags:0x1 len:0xa00000 low:0x8000 high:0xf3f63000 Fixed type mismatching on previous patch which is reported by kbuild. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> --- include/linux/mm.h | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 52269e56c514..ecf9e1fcde79 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2379,10 +2379,19 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); static inline unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) { + unsigned long addr; + if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) - return unmapped_area_topdown(info); + addr = unmapped_area_topdown(info); else - return unmapped_area(info); + addr = unmapped_area(info); + + if (IS_ERR_VALUE(addr) && printk_ratelimit()) { + pr_err("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx\n", + __func__, addr, current->mm->total_vm, info->flags, + info->length, info->low_limit, info->high_limit); + } + return addr; } On 2020년 03월 04일 12:02, Jaewon Kim wrote: > Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > Virtual memory space shortage of a task on mmap is reported to userspace > as -ENOMEM. It can be confused as physical memory shortage of overall > system. > > The vm_unmapped_area can be called to by some drivers or other kernel > core system like filesystem. It can be hard to know which code layer > returns the -ENOMEM. > > Print error log of vm_unmapped_area with rate limited. Without rate > limited, soft lockup ocurrs on infinite mmap sytem call. > > i.e.) > <3>[ 576.024088] [6: mmap_infinite:14251] mmap: vm_unmapped_area err:-12 total_vm:0xfee08 flags:0x1 len:0xa00000 low:0x8000 high:0xf3f63000 > > Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> > --- > include/linux/mm.h | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 52269e56c514..ee822d65ebb7 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2379,10 +2379,19 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); > static inline unsigned long > vm_unmapped_area(struct vm_unmapped_area_info *info) > { > + unsigned long addr; > + > if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) > - return unmapped_area_topdown(info); > + addr = unmapped_area_topdown(info); > else > - return unmapped_area(info); > + addr = unmapped_area(info); > + > + if (IS_ERR_VALUE(addr) && printk_ratelimit()) { > + pr_err("%s err:%d total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx\n", > + __func__, addr, current->mm->total_vm, info->flags, > + info->length, info->low_limit, info->high_limit); > + } > + return addr; > } > > /* truncate.c */
On Thu, 5 Mar 2020 10:35:05 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > Hello > > sorry for build warning. > I've changed %d to %ld reported by kbuild. > Let me attach full patch again below. > -------------------------------------------------- > > > Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > Virtual memory space shortage of a task on mmap is reported to userspace > as -ENOMEM. It can be confused as physical memory shortage of overall > system. > > The vm_unmapped_area can be called to by some drivers or other kernel > core system like filesystem. It can be hard to know which code layer > returns the -ENOMEM. > > Print error log of vm_unmapped_area with rate limited. Without rate > limited, soft lockup ocurrs on infinite mmap sytem call. > > i.e.) > <3>[ 576.024088] [6: mmap_infinite:14251] mmap: vm_unmapped_area err:-12 total_vm:0xfee08 flags:0x1 len:0xa00000 low:0x8000 high:0xf3f63000 > hm, I suppose that could be useful. Although the choice of which info to display could be a source of dispute. Why did you choose this info and omit other things? Perhaps a stack trace could also be useful? > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2379,10 +2379,19 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); > static inline unsigned long > vm_unmapped_area(struct vm_unmapped_area_info *info) > { > + unsigned long addr; > + > if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) > - return unmapped_area_topdown(info); > + addr = unmapped_area_topdown(info); > else > - return unmapped_area(info); > + addr = unmapped_area(info); > + > + if (IS_ERR_VALUE(addr) && printk_ratelimit()) { Please avoid using printk_ratelimit(). See the comment at the printk_ratelimit() definition site: /* * Please don't use printk_ratelimit(), because it shares ratelimiting state * with all other unrelated printk_ratelimit() callsites. Instead use * printk_ratelimited() or plain old __ratelimit(). */ > + pr_err("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx\n", > + __func__, addr, current->mm->total_vm, info->flags, > + info->length, info->low_limit, info->high_limit); > + } > + return addr; > }
On 2020년 03월 06일 13:24, Andrew Morton wrote: > On Thu, 5 Mar 2020 10:35:05 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > >> Hello >> >> sorry for build warning. >> I've changed %d to %ld reported by kbuild. >> Let me attach full patch again below. >> -------------------------------------------------- >> >> >> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. >> Virtual memory space shortage of a task on mmap is reported to userspace >> as -ENOMEM. It can be confused as physical memory shortage of overall >> system. >> >> The vm_unmapped_area can be called to by some drivers or other kernel >> core system like filesystem. It can be hard to know which code layer >> returns the -ENOMEM. >> >> Print error log of vm_unmapped_area with rate limited. Without rate >> limited, soft lockup ocurrs on infinite mmap sytem call. >> >> i.e.) >> <3>[ 576.024088] [6: mmap_infinite:14251] mmap: vm_unmapped_area err:-12 total_vm:0xfee08 flags:0x1 len:0xa00000 low:0x8000 high:0xf3f63000 >> > hm, I suppose that could be useful. Although the choice of which info > to display could be a source of dispute. Why did you choose this info > and omit other things? Perhaps a stack trace could also be useful? I thought info->align_mask, info->align_offset are not important. But I added them too now. And I think whole stack trace is not essential. In my opinion, higher layer like userspace mmap or other driver calling to vm_unmapped_area also should report the error if they need. >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -2379,10 +2379,19 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); >> static inline unsigned long >> vm_unmapped_area(struct vm_unmapped_area_info *info) >> { >> + unsigned long addr; >> + >> if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) >> - return unmapped_area_topdown(info); >> + addr = unmapped_area_topdown(info); >> else >> - return unmapped_area(info); >> + addr = unmapped_area(info); >> + >> + if (IS_ERR_VALUE(addr) && printk_ratelimit()) { > Please avoid using printk_ratelimit(). See the comment at the > printk_ratelimit() definition site: Thank you for your comment. I changed printk_ratelimit to pr_warn_ratelimited. To use pr_warn_ratelimited I included <linux/ratelimit.h> > > /* > * Please don't use printk_ratelimit(), because it shares ratelimiting state > * with all other unrelated printk_ratelimit() callsites. Instead use > * printk_ratelimited() or plain old __ratelimit(). > */ > >> + pr_err("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx\n", >> + __func__, addr, current->mm->total_vm, info->flags, >> + info->length, info->low_limit, info->high_limit); >> + } >> + return addr; >> } > > Let me attach changed whole patch below. ------------------- Subject: [PATCH] mm: mmap: show vm_unmapped_area error log Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. Virtual memory space shortage of a task on mmap is reported to userspace as -ENOMEM. It can be confused as physical memory shortage of overall system. The vm_unmapped_area can be called to by some drivers or other kernel core system like filesystem. It can be hard to know which code layer returns the -ENOMEM. Print error log of vm_unmapped_area with rate limited. Without rate limited, soft lockup ocurrs on infinite mmap sytem call. i.e.) <4>[ 68.556470] [2: mmap_infinite:12363] mmap: vm_unmapped_area err:-12 total_vm:0xf4c08 flags:0x1 len:0x100000 low:0x8000 high:0xf4583000 mask:0x0 offset:0x0 Fixed type mismatching on previous patch which is reported by kbuild. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> --- include/linux/mm.h | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 52269e56c514..114055d70752 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -27,6 +27,7 @@ #include <linux/memremap.h> #include <linux/overflow.h> #include <linux/sizes.h> +#include <linux/ratelimit.h> struct mempolicy; struct anon_vma; @@ -2379,10 +2380,20 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); static inline unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) { + unsigned long addr; + if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) - return unmapped_area_topdown(info); + addr = unmapped_area_topdown(info); else - return unmapped_area(info); + addr = unmapped_area(info); + + if (IS_ERR_VALUE(addr)) { + pr_warn_ratelimited("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx mask:0x%lx offset:0x%lx\n", + __func__, addr, current->mm->total_vm, info->flags, + info->length, info->low_limit, info->high_limit, + info->align_mask, info->align_offset); + } + return addr; }
On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > > Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > Virtual memory space shortage of a task on mmap is reported to userspace > as -ENOMEM. It can be confused as physical memory shortage of overall > system. > > The vm_unmapped_area can be called to by some drivers or other kernel > core system like filesystem. It can be hard to know which code layer > returns the -ENOMEM. > > Print error log of vm_unmapped_area with rate limited. Without rate > limited, soft lockup ocurrs on infinite mmap sytem call. > > i.e.) > <4>[ 68.556470] [2: mmap_infinite:12363] mmap: vm_unmapped_area err:-12 total_vm:0xf4c08 flags:0x1 len:0x100000 low:0x8000 high:0xf4583000 mask:0x0 offset:0x0 > > ... > > --- a/include/linux/mm.h > +++ b/include/linux/mm.h This patch was messed up by your email client (tabs expanded to spaces). > @@ -27,6 +27,7 @@ > #include <linux/memremap.h> > #include <linux/overflow.h> > #include <linux/sizes.h> > +#include <linux/ratelimit.h> > > struct mempolicy; > struct anon_vma; > @@ -2379,10 +2380,20 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); > static inline unsigned long > vm_unmapped_area(struct vm_unmapped_area_info *info) > { > + unsigned long addr; > + > if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) > - return unmapped_area_topdown(info); > + addr = unmapped_area_topdown(info); > else > - return unmapped_area(info); > + addr = unmapped_area(info); > + > + if (IS_ERR_VALUE(addr)) { > + pr_warn_ratelimited("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx mask:0x%lx offset:0x%lx\n", > + __func__, addr, current->mm->total_vm, info->flags, > + info->length, info->low_limit, info->high_limit, > + info->align_mask, info->align_offset); > + } > + return addr; > } pr_warn_ratelimited() contains static state. Using it in an inlined function means that each callsite gets its own copy of that state, so we're ratelimiting the vm_unmapped_area() output on a per-callsite basis, not on a kernelwide basis. Maybe that's what we want, maybe it's not. But I think vm_unmapped_area() has become too large to be inlined anyway, so I suggest making it a regular out-of-line function in mmap.c. I don't believe that function needs to be exported to modules.
On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: > On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > > Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > > Virtual memory space shortage of a task on mmap is reported to userspace > > as -ENOMEM. It can be confused as physical memory shortage of overall > > system. But userspace can trigger this printk. We don't usually allow printks under those circumstances, even ratelimited.
On 2020년 03월 08일 10:58, Matthew Wilcox wrote: > On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: >> On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: >>> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. >>> Virtual memory space shortage of a task on mmap is reported to userspace >>> as -ENOMEM. It can be confused as physical memory shortage of overall >>> system. > But userspace can trigger this printk. We don't usually allow printks > under those circumstances, even ratelimited. Hello thank you your comment. Yes, userspace can trigger printk, but this was useful for to know why a userspace task was crashed. There seems to be still many userspace applications which did not check error of mmap and access invalid address. Additionally in my AARCH64 Android environment, display driver tries to get userspace address to map its display memory. The display driver report -ENOMEM from vm_unmapped_area and but also from GPU related address space. Please let me know your comment again if this debug is now allowed in that userspace triggered perspective. I may change to pr_debug or drop this change. Thank you > >
On 2020년 03월 08일 08:47, Andrew Morton wrote: > On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > >> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. >> Virtual memory space shortage of a task on mmap is reported to userspace >> as -ENOMEM. It can be confused as physical memory shortage of overall >> system. >> >> The vm_unmapped_area can be called to by some drivers or other kernel >> core system like filesystem. It can be hard to know which code layer >> returns the -ENOMEM. >> >> Print error log of vm_unmapped_area with rate limited. Without rate >> limited, soft lockup ocurrs on infinite mmap sytem call. >> >> i.e.) >> <4>[ 68.556470] [2: mmap_infinite:12363] mmap: vm_unmapped_area err:-12 total_vm:0xf4c08 flags:0x1 len:0x100000 low:0x8000 high:0xf4583000 mask:0x0 offset:0x0 >> >> ... >> >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h > This patch was messed up by your email client (tabs expanded to spaces). Sorry for this. Let me fix when I resubmit. >> @@ -27,6 +27,7 @@ >> #include <linux/memremap.h> >> #include <linux/overflow.h> >> #include <linux/sizes.h> >> +#include <linux/ratelimit.h> >> >> struct mempolicy; >> struct anon_vma; >> @@ -2379,10 +2380,20 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); >> static inline unsigned long >> vm_unmapped_area(struct vm_unmapped_area_info *info) >> { >> + unsigned long addr; >> + >> if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) >> - return unmapped_area_topdown(info); >> + addr = unmapped_area_topdown(info); >> else >> - return unmapped_area(info); >> + addr = unmapped_area(info); >> + >> + if (IS_ERR_VALUE(addr)) { >> + pr_warn_ratelimited("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx mask:0x%lx offset:0x%lx\n", >> + __func__, addr, current->mm->total_vm, info->flags, >> + info->length, info->low_limit, info->high_limit, >> + info->align_mask, info->align_offset); >> + } >> + return addr; >> } > pr_warn_ratelimited() contains static state. Using it in an inlined > function means that each callsite gets its own copy of that state, so > we're ratelimiting the vm_unmapped_area() output on a per-callsite > basis, not on a kernelwide basis. > > Maybe that's what we want, maybe it's not. But I think > vm_unmapped_area() has become too large to be inlined anyway, so I > suggest making it a regular out-of-line function in mmap.c. I don't > believe that function needs to be exported to modules. Thank you for your comment. Though, on v5.6-rc4, I just found couple of code which calls to vm_unmapped_area, I may be able to move this to out-of-line function on next patch version. By the way, I need to discuss userspace triggered printk with Matthew Wilcox. If possible, I'd like to hear your opinion for this. Thank you > > >
On Sun, Mar 08, 2020 at 06:58:47PM +0900, Jaewon Kim wrote: > On 2020년 03월 08일 10:58, Matthew Wilcox wrote: > > On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: > >> On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > >>> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > >>> Virtual memory space shortage of a task on mmap is reported to userspace > >>> as -ENOMEM. It can be confused as physical memory shortage of overall > >>> system. > > But userspace can trigger this printk. We don't usually allow printks > > under those circumstances, even ratelimited. > Hello thank you your comment. > > Yes, userspace can trigger printk, but this was useful for to know why > a userspace task was crashed. There seems to be still many userspace > applications which did not check error of mmap and access invalid address. > > Additionally in my AARCH64 Android environment, display driver tries to > get userspace address to map its display memory. The display driver > report -ENOMEM from vm_unmapped_area and but also from GPU related > address space. > > Please let me know your comment again if this debug is now allowed > in that userspace triggered perspective. The scenario that worries us is an attacker being able to fill the log files and so also fill (eg) the /var partition. Once it's full, future kernel messages cannot be stored anywhere and so there will be no traces of their privilege escalation. Maybe a tracepoint would be a better idea? Usually they are disabled, but they can be enabled by a sysadmin to gain insight into why an application is crashing.
On 2020년 03월 08일 21:36, Matthew Wilcox wrote: > On Sun, Mar 08, 2020 at 06:58:47PM +0900, Jaewon Kim wrote: >> On 2020년 03월 08일 10:58, Matthew Wilcox wrote: >>> On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: >>>> On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: >>>>> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. >>>>> Virtual memory space shortage of a task on mmap is reported to userspace >>>>> as -ENOMEM. It can be confused as physical memory shortage of overall >>>>> system. >>> But userspace can trigger this printk. We don't usually allow printks >>> under those circumstances, even ratelimited. >> Hello thank you your comment. >> >> Yes, userspace can trigger printk, but this was useful for to know why >> a userspace task was crashed. There seems to be still many userspace >> applications which did not check error of mmap and access invalid address. >> >> Additionally in my AARCH64 Android environment, display driver tries to >> get userspace address to map its display memory. The display driver >> report -ENOMEM from vm_unmapped_area and but also from GPU related >> address space. >> >> Please let me know your comment again if this debug is now allowed >> in that userspace triggered perspective. > The scenario that worries us is an attacker being able to fill the log > files and so also fill (eg) the /var partition. Once it's full, future > kernel messages cannot be stored anywhere and so there will be no traces > of their privilege escalation. Although up to 10 times within 5 secs is not many, I think those log may remove other important log in log buffer if it is the system which produces very few log. In my Android phone device system, there seems to be much more kernel log though. > Maybe a tracepoint would be a better idea? Usually they are disabled, > but they can be enabled by a sysadmin to gain insight into why an > application is crashing. In Android phone device system, we cannot get sysadmin permission if it is built for end user. And it is not easy to reproduce this symptom because it is an user's app. Anyway let me try pr_devel_ratelimited which is disabled by default. I hope this is good enough. Additionally I moved the code from mm.h to mmap.c. --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2069,7 +2069,7 @@ unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) addr = unmapped_area(info); if (IS_ERR_VALUE(addr)) { - pr_warn_ratelimited("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx mask:0x%lx offset:0x%lx\n", + pr_devel_ratelimited("%s err:%ld total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx mask:0x%lx offset:0x%lx\n", __func__, addr, current->mm->total_vm, info->flags, info->length, info->low_limit, info->high_limit, info->align_mask, info->align_offset); > >
On Mon, Mar 09, 2020 at 06:12:03PM +0900, Jaewon Kim wrote: > On 2020년 03월 08일 21:36, Matthew Wilcox wrote: > > On Sun, Mar 08, 2020 at 06:58:47PM +0900, Jaewon Kim wrote: > >> On 2020년 03월 08일 10:58, Matthew Wilcox wrote: > >>> On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: > >>>> On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: > >>>>> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > >>>>> Virtual memory space shortage of a task on mmap is reported to userspace > >>>>> as -ENOMEM. It can be confused as physical memory shortage of overall > >>>>> system. > >>> But userspace can trigger this printk. We don't usually allow printks > >>> under those circumstances, even ratelimited. > >> Hello thank you your comment. > >> > >> Yes, userspace can trigger printk, but this was useful for to know why > >> a userspace task was crashed. There seems to be still many userspace > >> applications which did not check error of mmap and access invalid address. > >> > >> Additionally in my AARCH64 Android environment, display driver tries to > >> get userspace address to map its display memory. The display driver > >> report -ENOMEM from vm_unmapped_area and but also from GPU related > >> address space. > >> > >> Please let me know your comment again if this debug is now allowed > >> in that userspace triggered perspective. > > The scenario that worries us is an attacker being able to fill the log > > files and so also fill (eg) the /var partition. Once it's full, future > > kernel messages cannot be stored anywhere and so there will be no traces > > of their privilege escalation. > Although up to 10 times within 5 secs is not many, I think those log may remove > other important log in log buffer if it is the system which produces very few log. > In my Android phone device system, there seems to be much more kernel log though. I've never seen the logs on my android phone. All that a ratelimit is going to do is make the attacker be more patient. > > Maybe a tracepoint would be a better idea? Usually they are disabled, > > but they can be enabled by a sysadmin to gain insight into why an > > application is crashing. > In Android phone device system, we cannot get sysadmin permission if it is built > for end user. And it is not easy to reproduce this symptom because it is an user's app. > > Anyway let me try pr_devel_ratelimited which is disabled by default. I hope this is > good enough. Additionally I moved the code from mm.h to mmap.c. https://source.android.com/devices/tech/debug/ftrace
On 2020년 03월 09일 23:01, Matthew Wilcox wrote: > On Mon, Mar 09, 2020 at 06:12:03PM +0900, Jaewon Kim wrote: >> On 2020년 03월 08일 21:36, Matthew Wilcox wrote: >>> On Sun, Mar 08, 2020 at 06:58:47PM +0900, Jaewon Kim wrote: >>>> On 2020년 03월 08일 10:58, Matthew Wilcox wrote: >>>>> On Sat, Mar 07, 2020 at 03:47:44PM -0800, Andrew Morton wrote: >>>>>> On Fri, 6 Mar 2020 15:16:22 +0900 Jaewon Kim <jaewon31.kim@samsung.com> wrote: >>>>>>> Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. >>>>>>> Virtual memory space shortage of a task on mmap is reported to userspace >>>>>>> as -ENOMEM. It can be confused as physical memory shortage of overall >>>>>>> system. >>>>> But userspace can trigger this printk. We don't usually allow printks >>>>> under those circumstances, even ratelimited. >>>> Hello thank you your comment. >>>> >>>> Yes, userspace can trigger printk, but this was useful for to know why >>>> a userspace task was crashed. There seems to be still many userspace >>>> applications which did not check error of mmap and access invalid address. >>>> >>>> Additionally in my AARCH64 Android environment, display driver tries to >>>> get userspace address to map its display memory. The display driver >>>> report -ENOMEM from vm_unmapped_area and but also from GPU related >>>> address space. >>>> >>>> Please let me know your comment again if this debug is now allowed >>>> in that userspace triggered perspective. >>> The scenario that worries us is an attacker being able to fill the log >>> files and so also fill (eg) the /var partition. Once it's full, future >>> kernel messages cannot be stored anywhere and so there will be no traces >>> of their privilege escalation. >> Although up to 10 times within 5 secs is not many, I think those log may remove >> other important log in log buffer if it is the system which produces very few log. >> In my Android phone device system, there seems to be much more kernel log though. > I've never seen the logs on my android phone. All that a ratelimit is > going to do is make the attacker be more patient. > >>> Maybe a tracepoint would be a better idea? Usually they are disabled, >>> but they can be enabled by a sysadmin to gain insight into why an >>> application is crashing. >> In Android phone device system, we cannot get sysadmin permission if it is built >> for end user. And it is not easy to reproduce this symptom because it is an user's app. >> >> Anyway let me try pr_devel_ratelimited which is disabled by default. I hope this is >> good enough. Additionally I moved the code from mm.h to mmap.c. > https://source.android.com/devices/tech/debug/ftrace I am not sure if an end user can enable a trace point which is not writable. Anyway I created trace mmap.h file and changed printk to trace_vm_unmapped_area without ratelimited. If there is no objection, I am going to resubmit whole patch as v2. Thank you for your comment. --- a/mm/mmap.c +++ b/mm/mmap.c @@ -47,6 +47,8 @@ #include <linux/pkeys.h> #include <linux/oom.h> #include <linux/sched/mm.h> +#define CREATE_TRACE_POINTS +#include <trace/events/mmap.h> #include <linux/uaccess.h> #include <asm/cacheflush.h> @@ -2061,10 +2063,15 @@ unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info) */ unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) { + unsigned long addr; + if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) - return unmapped_area_topdown(info); + addr = unmapped_area_topdown(info); else - return unmapped_area(info); + addr = unmapped_area(info); + + trace_vm_unmapped_area(addr, info); + return addr; } >
diff --git a/include/linux/mm.h b/include/linux/mm.h index 52269e56c514..ee822d65ebb7 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2379,10 +2379,19 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); static inline unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) { + unsigned long addr; + if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) - return unmapped_area_topdown(info); + addr = unmapped_area_topdown(info); else - return unmapped_area(info); + addr = unmapped_area(info); + + if (IS_ERR_VALUE(addr) && printk_ratelimit()) { + pr_err("%s err:%d total_vm:0x%lx flags:0x%lx len:0x%lx low:0x%lx high:0x%lx\n", + __func__, addr, current->mm->total_vm, info->flags, + info->length, info->low_limit, info->high_limit); + } + return addr; } /* truncate.c */
Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. Virtual memory space shortage of a task on mmap is reported to userspace as -ENOMEM. It can be confused as physical memory shortage of overall system. The vm_unmapped_area can be called to by some drivers or other kernel core system like filesystem. It can be hard to know which code layer returns the -ENOMEM. Print error log of vm_unmapped_area with rate limited. Without rate limited, soft lockup ocurrs on infinite mmap sytem call. i.e.) <3>[ 576.024088] [6: mmap_infinite:14251] mmap: vm_unmapped_area err:-12 total_vm:0xfee08 flags:0x1 len:0xa00000 low:0x8000 high:0xf3f63000 Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> --- include/linux/mm.h | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)