Message ID | 1534358990-85530-3-git-send-email-yang.shi@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: zap pages with read mmap_sem in munmap for large mapping | expand |
On 08/15/2018 08:49 PM, Yang Shi wrote: > We need check if mm or vma has uprobes in the following patch to check > if a vma could be unmapped with holding read mmap_sem. The checks and > pre-conditions used by uprobe_munmap() look just suitable for this > purpose. > > Extracting those checks into a helper function, has_uprobes(). > > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Vlastimil Babka <vbabka@suse.cz> > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > --- > include/linux/uprobes.h | 7 +++++++ > kernel/events/uprobes.c | 23 ++++++++++++++++------- > 2 files changed, 23 insertions(+), 7 deletions(-) > > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h > index 0a294e9..418764e 100644 > --- a/include/linux/uprobes.h > +++ b/include/linux/uprobes.h > @@ -149,6 +149,8 @@ struct uprobes_state { > extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs); > extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, > void *src, unsigned long len); > +extern bool has_uprobes(struct vm_area_struct *vma, unsigned long start, > + unsigned long end); > #else /* !CONFIG_UPROBES */ > struct uprobes_state { > }; > @@ -203,5 +205,10 @@ static inline void uprobe_copy_process(struct task_struct *t, unsigned long flag > static inline void uprobe_clear_state(struct mm_struct *mm) > { > } > +static inline bool has_uprobes(struct vm_area_struct *vma, unsigned long start, > + unsgined long end) > +{ > + return false; > +} > #endif /* !CONFIG_UPROBES */ > #endif /* _LINUX_UPROBES_H */ > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c > index aed1ba5..568481c 100644 > --- a/kernel/events/uprobes.c > +++ b/kernel/events/uprobes.c > @@ -1114,22 +1114,31 @@ int uprobe_mmap(struct vm_area_struct *vma) > return !!n; > } > > -/* > - * Called in context of a munmap of a vma. > - */ > -void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) > +bool > +has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long end) The name is not really great... > { > if (no_uprobe_events() || !valid_vma(vma, false)) > - return; > + return false; > > if (!atomic_read(&vma->vm_mm->mm_users)) /* called by mmput() ? */ > - return; > + return false; > > if (!test_bit(MMF_HAS_UPROBES, &vma->vm_mm->flags) || > test_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags)) This means that vma might have uprobes, but since RECALC is already set, we don't need to set it again. That's different from "has uprobes". Perhaps something like vma_needs_recalc_uprobes() ? But I also worry there might be a race where we initially return false because of MMF_RECALC_UPROBES, then the flag is cleared while vma's still have uprobes, then we downgrade mmap_sem and skip uprobe_munmap(). Should be checked if e.g. mmap_sem and vma visibility changes protects this case from happening. > - return; > + return false; > > if (vma_has_uprobes(vma, start, end)) > + return true; > + > + return false; Simpler: return vma_has_uprobes(vma, start, end); > +} > + > +/* > + * Called in context of a munmap of a vma. > + */ > +void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) > +{ > + if (has_uprobes(vma, start, end)) > set_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags); > } > >
* Vlastimil Babka <vbabka@suse.cz> [2018-08-22 12:55:59]: > On 08/15/2018 08:49 PM, Yang Shi wrote: > > We need check if mm or vma has uprobes in the following patch to check > > if a vma could be unmapped with holding read mmap_sem. The checks and > > pre-conditions used by uprobe_munmap() look just suitable for this > > purpose. > > > > Extracting those checks into a helper function, has_uprobes(). > > > > Cc: Peter Zijlstra <peterz@infradead.org> > > Cc: Ingo Molnar <mingo@redhat.com> > > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > > Cc: Jiri Olsa <jolsa@redhat.com> > > Cc: Namhyung Kim <namhyung@kernel.org> > > Cc: Vlastimil Babka <vbabka@suse.cz> > > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > > --- > > include/linux/uprobes.h | 7 +++++++ > > kernel/events/uprobes.c | 23 ++++++++++++++++------- > > 2 files changed, 23 insertions(+), 7 deletions(-) > > > > diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h > > index 0a294e9..418764e 100644 > > --- a/include/linux/uprobes.h > > +++ b/include/linux/uprobes.h > > @@ -149,6 +149,8 @@ struct uprobes_state { > > extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs); > > extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, > > void *src, unsigned long len); > > +extern bool has_uprobes(struct vm_area_struct *vma, unsigned long start, > > + unsigned long end); > > #else /* !CONFIG_UPROBES */ > > struct uprobes_state { > > }; > > @@ -203,5 +205,10 @@ static inline void uprobe_copy_process(struct task_struct *t, unsigned long flag > > static inline void uprobe_clear_state(struct mm_struct *mm) > > { > > } > > +static inline bool has_uprobes(struct vm_area_struct *vma, unsigned long start, > > + unsgined long end) > > +{ > > + return false; > > +} > > #endif /* !CONFIG_UPROBES */ > > #endif /* _LINUX_UPROBES_H */ > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c > > index aed1ba5..568481c 100644 > > --- a/kernel/events/uprobes.c > > +++ b/kernel/events/uprobes.c > > @@ -1114,22 +1114,31 @@ int uprobe_mmap(struct vm_area_struct *vma) > > return !!n; > > } > > > > -/* > > - * Called in context of a munmap of a vma. > > - */ > > -void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) > > +bool > > +has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long end) > > The name is not really great... I too feel the name is not apt. Can you make this vma_has_uprobes and convert the current vma_has_uprobes to __vma_has_uprobes? > > > { > > if (no_uprobe_events() || !valid_vma(vma, false)) > > - return; > > + return false; > > > > if (!atomic_read(&vma->vm_mm->mm_users)) /* called by mmput() ? */ > > - return; > > + return false; > > > > if (!test_bit(MMF_HAS_UPROBES, &vma->vm_mm->flags) || > > test_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags)) > > This means that vma might have uprobes, but since RECALC is already set, > we don't need to set it again. That's different from "has uprobes". > > Perhaps something like vma_needs_recalc_uprobes() ? > > But I also worry there might be a race where we initially return false > because of MMF_RECALC_UPROBES, then the flag is cleared while vma's > still have uprobes, then we downgrade mmap_sem and skip uprobe_munmap(). > Should be checked if e.g. mmap_sem and vma visibility changes protects > this case from happening. That is a very good observation. One think we can probably do is pass an extra parameter to has_uprobes(), depending on which we should skip this check. such that when we call from uprobes_munmap(), we continue as is but when calling from do_munmap_zap_rlock(), we skip the check. > > > - return; > > + return false; > > > > if (vma_has_uprobes(vma, start, end)) > > + return true; > > + > > + return false; > > Simpler: > return vma_has_uprobes(vma, start, end); > > > +} > > + > > +/* > > + * Called in context of a munmap of a vma. > > + */ > > +void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) > > +{ > > + if (has_uprobes(vma, start, end)) > > set_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags); > > }
On 8/22/18 8:07 AM, Srikar Dronamraju wrote: > * Vlastimil Babka <vbabka@suse.cz> [2018-08-22 12:55:59]: > >> On 08/15/2018 08:49 PM, Yang Shi wrote: >>> We need check if mm or vma has uprobes in the following patch to check >>> if a vma could be unmapped with holding read mmap_sem. The checks and >>> pre-conditions used by uprobe_munmap() look just suitable for this >>> purpose. >>> >>> Extracting those checks into a helper function, has_uprobes(). >>> >>> Cc: Peter Zijlstra <peterz@infradead.org> >>> Cc: Ingo Molnar <mingo@redhat.com> >>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >>> Cc: Jiri Olsa <jolsa@redhat.com> >>> Cc: Namhyung Kim <namhyung@kernel.org> >>> Cc: Vlastimil Babka <vbabka@suse.cz> >>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> >>> --- >>> include/linux/uprobes.h | 7 +++++++ >>> kernel/events/uprobes.c | 23 ++++++++++++++++------- >>> 2 files changed, 23 insertions(+), 7 deletions(-) >>> >>> diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h >>> index 0a294e9..418764e 100644 >>> --- a/include/linux/uprobes.h >>> +++ b/include/linux/uprobes.h >>> @@ -149,6 +149,8 @@ struct uprobes_state { >>> extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs); >>> extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, >>> void *src, unsigned long len); >>> +extern bool has_uprobes(struct vm_area_struct *vma, unsigned long start, >>> + unsigned long end); >>> #else /* !CONFIG_UPROBES */ >>> struct uprobes_state { >>> }; >>> @@ -203,5 +205,10 @@ static inline void uprobe_copy_process(struct task_struct *t, unsigned long flag >>> static inline void uprobe_clear_state(struct mm_struct *mm) >>> { >>> } >>> +static inline bool has_uprobes(struct vm_area_struct *vma, unsigned long start, >>> + unsgined long end) >>> +{ >>> + return false; >>> +} >>> #endif /* !CONFIG_UPROBES */ >>> #endif /* _LINUX_UPROBES_H */ >>> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c >>> index aed1ba5..568481c 100644 >>> --- a/kernel/events/uprobes.c >>> +++ b/kernel/events/uprobes.c >>> @@ -1114,22 +1114,31 @@ int uprobe_mmap(struct vm_area_struct *vma) >>> return !!n; >>> } >>> >>> -/* >>> - * Called in context of a munmap of a vma. >>> - */ >>> -void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) >>> +bool >>> +has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long end) >> The name is not really great... > I too feel the name is not apt. > Can you make this vma_has_uprobes and convert the current > vma_has_uprobes to __vma_has_uprobes? It sounds good to me. > >>> { >>> if (no_uprobe_events() || !valid_vma(vma, false)) >>> - return; >>> + return false; >>> >>> if (!atomic_read(&vma->vm_mm->mm_users)) /* called by mmput() ? */ >>> - return; >>> + return false; >>> >>> if (!test_bit(MMF_HAS_UPROBES, &vma->vm_mm->flags) || >>> test_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags)) >> This means that vma might have uprobes, but since RECALC is already set, >> we don't need to set it again. That's different from "has uprobes". >> >> Perhaps something like vma_needs_recalc_uprobes() ? >> >> But I also worry there might be a race where we initially return false >> because of MMF_RECALC_UPROBES, then the flag is cleared while vma's >> still have uprobes, then we downgrade mmap_sem and skip uprobe_munmap(). >> Should be checked if e.g. mmap_sem and vma visibility changes protects >> this case from happening. > That is a very good observation. > > One think we can probably do is pass an extra parameter to > has_uprobes(), depending on which we should skip this check. > such that when we call from uprobes_munmap(), we continue as is > but when calling from do_munmap_zap_rlock(), we skip the check. Yes, it sounds good to solve the race issue. When we need decide if we should jump to regular path for uprobes mapping, we don't check if MMF_RECALC_UPROBES is set or not. It looks not harmful to set this flag twice. Thanks, Yang > > >>> - return; >>> + return false; >>> >>> if (vma_has_uprobes(vma, start, end)) >>> + return true; >>> + >>> + return false; >> Simpler: >> return vma_has_uprobes(vma, start, end); >> >>> +} >>> + >>> +/* >>> + * Called in context of a munmap of a vma. >>> + */ >>> +void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) >>> +{ >>> + if (has_uprobes(vma, start, end)) >>> set_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags); >>> }
On 08/22, Srikar Dronamraju wrote: > > * Vlastimil Babka <vbabka@suse.cz> [2018-08-22 12:55:59]: > > > On 08/15/2018 08:49 PM, Yang Shi wrote: > > > We need check if mm or vma has uprobes in the following patch to check > > > if a vma could be unmapped with holding read mmap_sem. Confused... why can't we call uprobe_munmap() under read_lock(mmap_sem) ? OK, it can race with find_active_uprobe() but I do not see anything really wrong, and a false-positive MMF_RECALC_UPROBES is fine. Again, I think we should simply kill uprobe_munmap(), but this needs another discussion. Oleg.
On 8/23/18 8:15 AM, Oleg Nesterov wrote: > On 08/22, Srikar Dronamraju wrote: >> * Vlastimil Babka <vbabka@suse.cz> [2018-08-22 12:55:59]: >> >>> On 08/15/2018 08:49 PM, Yang Shi wrote: >>>> We need check if mm or vma has uprobes in the following patch to check >>>> if a vma could be unmapped with holding read mmap_sem. > Confused... why can't we call uprobe_munmap() under read_lock(mmap_sem) ? I'm not sure if it is safe or not because it is not recommended and not safe to update vma's vm flags with read mmap_sem. uprobe_munmap() may update mm flags (MMF_RECALC_UPROBES). So, it sounds safer to not call it under read mmap_sem. > > OK, it can race with find_active_uprobe() but I do not see anything really > wrong, and a false-positive MMF_RECALC_UPROBES is fine. Thanks for confirming this. If it is ok to have such race, we don't have to have has_uprobes() helper anymore since it can be just called under read mmap_sem without any special handling. Yang > > Again, I think we should simply kill uprobe_munmap(), but this needs another > discussion. > > Oleg.
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 0a294e9..418764e 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -149,6 +149,8 @@ struct uprobes_state { extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs); extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, void *src, unsigned long len); +extern bool has_uprobes(struct vm_area_struct *vma, unsigned long start, + unsigned long end); #else /* !CONFIG_UPROBES */ struct uprobes_state { }; @@ -203,5 +205,10 @@ static inline void uprobe_copy_process(struct task_struct *t, unsigned long flag static inline void uprobe_clear_state(struct mm_struct *mm) { } +static inline bool has_uprobes(struct vm_area_struct *vma, unsigned long start, + unsgined long end) +{ + return false; +} #endif /* !CONFIG_UPROBES */ #endif /* _LINUX_UPROBES_H */ diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index aed1ba5..568481c 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -1114,22 +1114,31 @@ int uprobe_mmap(struct vm_area_struct *vma) return !!n; } -/* - * Called in context of a munmap of a vma. - */ -void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) +bool +has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long end) { if (no_uprobe_events() || !valid_vma(vma, false)) - return; + return false; if (!atomic_read(&vma->vm_mm->mm_users)) /* called by mmput() ? */ - return; + return false; if (!test_bit(MMF_HAS_UPROBES, &vma->vm_mm->flags) || test_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags)) - return; + return false; if (vma_has_uprobes(vma, start, end)) + return true; + + return false; +} + +/* + * Called in context of a munmap of a vma. + */ +void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) +{ + if (has_uprobes(vma, start, end)) set_bit(MMF_RECALC_UPROBES, &vma->vm_mm->flags); }
We need check if mm or vma has uprobes in the following patch to check if a vma could be unmapped with holding read mmap_sem. The checks and pre-conditions used by uprobe_munmap() look just suitable for this purpose. Extracting those checks into a helper function, has_uprobes(). Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> --- include/linux/uprobes.h | 7 +++++++ kernel/events/uprobes.c | 23 ++++++++++++++++------- 2 files changed, 23 insertions(+), 7 deletions(-)