Message ID: 20190625235325.2096441-3-songliubraving@fb.com (mailing list archive)
State: New, archived
Series: THP aware uprobe
* Song Liu <songliubraving@fb.com> [2019-06-25 16:53:23]:

> Currently, uprobe swaps the target page with an anonymous page in both
> install_breakpoint() and remove_breakpoint(). When all uprobes on a page
> are removed, the given mm is still using an anonymous page (not the
> original page).
>
> This patch allows uprobe to use the original page when possible (all
> uprobes on the page are already removed).
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
On 06/25, Song Liu wrote:
>
> This patch allows uprobe to use original page when possible (all uprobes
> on the page are already removed).

I can't review. I do not understand vm enough.

> +	if (!is_register) {
> +		struct page *orig_page;
> +		pgoff_t index;
> +
> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
> +					  index);
> +
> +		if (orig_page) {
> +			if (pages_identical(new_page, orig_page)) {

Shouldn't we at least check PageUptodate?

and I am a bit surprised there is no simple way to unmap the old page
in this case...

Oleg.
> On Jul 15, 2019, at 8:25 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 06/25, Song Liu wrote:
>>
>> This patch allows uprobe to use original page when possible (all uprobes
>> on the page are already removed).
>
> I can't review. I do not understand vm enough.
>
>> +	if (!is_register) {
>> +		struct page *orig_page;
>> +		pgoff_t index;
>> +
>> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
>> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
>> +					  index);
>> +
>> +		if (orig_page) {
>> +			if (pages_identical(new_page, orig_page)) {
>
> Shouldn't we at least check PageUptodate?

For page cache, we only do ClearPageUptodate() on read failures, so this
should be a really rare case. But I guess we can check anyway.

> and I am a bit surprised there is no simple way to unmap the old page
> in this case...

The easiest way I have found requires flush_cache_page() plus a few
mmu_notifier calls around it. I think the current solution is better than
that, as it saves a page fault.

Thanks,
Song
On 07/24, Song Liu wrote:
>
> > On Jul 15, 2019, at 8:25 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> >> +	if (!is_register) {
> >> +		struct page *orig_page;
> >> +		pgoff_t index;
> >> +
> >> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
> >> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
> >> +					  index);
> >> +
> >> +		if (orig_page) {
> >> +			if (pages_identical(new_page, orig_page)) {
> >
> > Shouldn't we at least check PageUptodate?
>
> For page cache, we only do ClearPageUptodate() on read failures,

Hmm. I don't think so.

> so this should be really rare case. But I guess we can check anyway.

Can? I think we should or this code is simply wrong...

> > and I am a bit surprised there is no simple way to unmap the old page
> > in this case...
>
> The easiest way I have found requires flush_cache_page() plus a few
> mmu_notifier calls around it.

But we need to do this anyway? At least with your patch replace_page()
still does this after page_add_file_rmap().

> I think current solution is better than that,

perhaps, I won't argue,

> as it saves a page fault.

I don't think it matters in this case.

Oleg.
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 78f61bfc6b79..f7c61a1ef720 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -160,16 +160,19 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	int err;
 	struct mmu_notifier_range range;
 	struct mem_cgroup *memcg;
+	bool orig = new_page->mapping != NULL; /* new_page == orig_page */
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr,
 				addr + PAGE_SIZE);
 
 	VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page);
 
-	err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL, &memcg,
-			false);
-	if (err)
-		return err;
+	if (!orig) {
+		err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL,
+					    &memcg, false);
+		if (err)
+			return err;
+	}
 
 	/* For try_to_free_swap() and munlock_vma_page() below */
 	lock_page(old_page);
@@ -177,15 +180,24 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	mmu_notifier_invalidate_range_start(&range);
 	err = -EAGAIN;
 	if (!page_vma_mapped_walk(&pvmw)) {
-		mem_cgroup_cancel_charge(new_page, memcg, false);
+		if (!orig)
+			mem_cgroup_cancel_charge(new_page, memcg, false);
 		goto unlock;
 	}
 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
 
 	get_page(new_page);
-	page_add_new_anon_rmap(new_page, vma, addr, false);
-	mem_cgroup_commit_charge(new_page, memcg, false, false);
-	lru_cache_add_active_or_unevictable(new_page, vma);
+	if (orig) {
+		lock_page(new_page);	/* for page_add_file_rmap() */
+		page_add_file_rmap(new_page, false);
+		unlock_page(new_page);
+		inc_mm_counter(mm, mm_counter_file(new_page));
+		dec_mm_counter(mm, MM_ANONPAGES);
+	} else {
+		page_add_new_anon_rmap(new_page, vma, addr, false);
+		mem_cgroup_commit_charge(new_page, memcg, false, false);
+		lru_cache_add_active_or_unevictable(new_page, vma);
+	}
 
 	if (!PageAnon(old_page)) {
 		dec_mm_counter(mm, mm_counter_file(old_page));
@@ -501,6 +513,23 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	copy_highpage(new_page, old_page);
 	copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
 
+	if (!is_register) {
+		struct page *orig_page;
+		pgoff_t index;
+
+		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
+		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
+					  index);
+
+		if (orig_page) {
+			if (pages_identical(new_page, orig_page)) {
+				put_page(new_page);
+				new_page = orig_page;
+			} else
+				put_page(orig_page);
+		}
+	}
+
	ret = __replace_page(vma, vaddr, old_page, new_page);
	put_page(new_page);
put_old: