Message ID | 20240625002144.3485799-3-andrii@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | uprobes: add batched register/unregister APIs and per-CPU RW semaphore | expand |
On Mon, 24 Jun 2024 17:21:34 -0700 Andrii Nakryiko <andrii@kernel.org> wrote: > Given unapply_uprobe() can call remove_breakpoint() which eventually > calls uprobe_write_opcode(), which can modify a set of memory pages and > expects mm->mmap_lock held for write, it needs to have writer lock. > > Fix this by switching to mmap_write_lock()/mmap_write_unlock(). > Oops, it is an actual bug, right? Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Thanks, > Fixes: da1816b1caec ("uprobes: Teach handler_chain() to filter out the probed task") > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > kernel/events/uprobes.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c > index 197fbe4663b5..e896eeecb091 100644 > --- a/kernel/events/uprobes.c > +++ b/kernel/events/uprobes.c > @@ -1235,7 +1235,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm) > struct vm_area_struct *vma; > int err = 0; > > - mmap_read_lock(mm); > + mmap_write_lock(mm); > for_each_vma(vmi, vma) { > unsigned long vaddr; > loff_t offset; > @@ -1252,7 +1252,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm) > vaddr = offset_to_vaddr(vma, uprobe->offset); > err |= remove_breakpoint(uprobe, mm, vaddr); > } > - mmap_read_unlock(mm); > + mmap_write_unlock(mm); > > return err; > } > -- > 2.43.0 >
I don't think I can review, I forgot everything, but I'll try to read this series when I have time to (try to) understand what it does... On 06/24, Andrii Nakryiko wrote: > > Given unapply_uprobe() can call remove_breakpoint() which eventually > calls uprobe_write_opcode(), which can modify a set of memory pages and > expects mm->mmap_lock held for write, it needs to have writer lock. > > Fix this by switching to mmap_write_lock()/mmap_write_unlock(). > > Fixes: da1816b1caec ("uprobes: Teach handler_chain() to filter out the probed task") > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > kernel/events/uprobes.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c > index 197fbe4663b5..e896eeecb091 100644 > --- a/kernel/events/uprobes.c > +++ b/kernel/events/uprobes.c > @@ -1235,7 +1235,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm) > struct vm_area_struct *vma; > int err = 0; > > - mmap_read_lock(mm); > + mmap_write_lock(mm); Can you explain what exactly is wrong? FOLL_FORCE/etc is fine under mmap_read_lock(), __replace_page() seems too... I recall that initially uprobes.c always took mmap_sem for reading, then register_for_each_vma() was changed by 77fc4af1b59d12 but there was other reasons for this change... Again, I don't understand this code today, quite possibly I missed something, I am just trying to understand. Well, it seems there is another reason for this change... Currently 2 unapply_uprobe()'s can race with each other if they try to update the same page. But in this case we can rely on -EAGAIN from __replace_page() ? Oleg.
On 06/25, Masami Hiramatsu wrote: > > On Mon, 24 Jun 2024 17:21:34 -0700 > Andrii Nakryiko <andrii@kernel.org> wrote: > > > Given unapply_uprobe() can call remove_breakpoint() which eventually > > calls uprobe_write_opcode(), which can modify a set of memory pages and > > expects mm->mmap_lock held for write, it needs to have writer lock. > > Oops, it is an actual bug, right? Why? So far I don't understand this change. Quite possibly I missed something, but in this case the changelog should explain the problem more clearly. Oleg.
On Tue, Jun 25, 2024 at 7:51 AM Oleg Nesterov <oleg@redhat.com> wrote: > > On 06/25, Masami Hiramatsu wrote: > > > > On Mon, 24 Jun 2024 17:21:34 -0700 > > Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > Given unapply_uprobe() can call remove_breakpoint() which eventually > > > calls uprobe_write_opcode(), which can modify a set of memory pages and > > > expects mm->mmap_lock held for write, it needs to have writer lock. > > > > Oops, it is an actual bug, right? > > Why? > > So far I don't understand this change. Quite possibly I missed something, > but in this case the changelog should explain the problem more clearly. > I just went off of "Called with mm->mmap_lock held for write." comment in uprobe_write_opcode(), tbh. If we don't actually need writer mmap_lock, we should probably update at least that comment. There is a lot going on in uprobe_write_opcode(), and I don't understand all the requirements there. > Oleg. >
On 06/25, Andrii Nakryiko wrote: > > On Tue, Jun 25, 2024 at 7:51 AM Oleg Nesterov <oleg@redhat.com> wrote: > > > > Why? > > > > So far I don't understand this change. Quite possibly I missed something, > > but in this case the changelog should explain the problem more clearly. > > > > I just went off of "Called with mm->mmap_lock held for write." comment > in uprobe_write_opcode(), tbh. Ah, indeed... and git blame makes me sad ;) I _think_ that 29dedee0e693a updated this comment without any thinking, but today I can't recall. In any case, today this nothing to do with mem_cgroup_charge(). Not sure __replace_page() is correct (in this respect) when it returns -EAGAIN but this is another story. > If we don't actually need writer > mmap_lock, we should probably update at least that comment. Agreed. > There is a > lot going on in uprobe_write_opcode(), and I don't understand all the > requirements there. Heh. Neither me today ;) Oleg.
On Tue, Jun 25, 2024 at 12:09 PM Oleg Nesterov <oleg@redhat.com> wrote: > > On 06/25, Andrii Nakryiko wrote: > > > > On Tue, Jun 25, 2024 at 7:51 AM Oleg Nesterov <oleg@redhat.com> wrote: > > > > > > Why? > > > > > > So far I don't understand this change. Quite possibly I missed something, > > > but in this case the changelog should explain the problem more clearly. > > > > > > > I just went off of "Called with mm->mmap_lock held for write." comment > > in uprobe_write_opcode(), tbh. > > Ah, indeed... and git blame makes me sad ;) > > I _think_ that 29dedee0e693a updated this comment without any thinking, > but today I can't recall. In any case, today this nothing to do with > mem_cgroup_charge(). Not sure __replace_page() is correct (in this respect) > when it returns -EAGAIN but this is another story. > > > If we don't actually need writer > > mmap_lock, we should probably update at least that comment. > > Agreed. Ok, I'll drop this change and will just update the comment. > > > There is a > > lot going on in uprobe_write_opcode(), and I don't understand all the > > requirements there. > > Heh. Neither me today ;) > > Oleg. >
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 197fbe4663b5..e896eeecb091 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -1235,7 +1235,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm) struct vm_area_struct *vma; int err = 0; - mmap_read_lock(mm); + mmap_write_lock(mm); for_each_vma(vmi, vma) { unsigned long vaddr; loff_t offset; @@ -1252,7 +1252,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm) vaddr = offset_to_vaddr(vma, uprobe->offset); err |= remove_breakpoint(uprobe, mm, vaddr); } - mmap_read_unlock(mm); + mmap_write_unlock(mm); return err; }
Given unapply_uprobe() can call remove_breakpoint() which eventually calls uprobe_write_opcode(), which can modify a set of memory pages and expects mm->mmap_lock held for write, it needs to have writer lock. Fix this by switching to mmap_write_lock()/mmap_write_unlock(). Fixes: da1816b1caec ("uprobes: Teach handler_chain() to filter out the probed task") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> --- kernel/events/uprobes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)