diff mbox series

[v7,2/4] uprobe: use original page when all uprobes are removed

Message ID 20190625235325.2096441-3-songliubraving@fb.com (mailing list archive)
State New, archived
Headers show
Series THP aware uprobe | expand

Commit Message

Song Liu June 25, 2019, 11:53 p.m. UTC
Currently, uprobe swaps the target page with a anonymous page in both
install_breakpoint() and remove_breakpoint(). When all uprobes on a page
are removed, the given mm is still using an anonymous page (not the
original page).

This patch allows uprobe to use original page when possible (all uprobes
on the page are already removed).

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 kernel/events/uprobes.c | 45 +++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 8 deletions(-)

Comments

Srikar Dronamraju June 26, 2019, 6 a.m. UTC | #1
* Song Liu <songliubraving@fb.com> [2019-06-25 16:53:23]:

> Currently, uprobe swaps the target page with a anonymous page in both
> install_breakpoint() and remove_breakpoint(). When all uprobes on a page
> are removed, the given mm is still using an anonymous page (not the
> original page).
> 
> This patch allows uprobe to use original page when possible (all uprobes
> on the page are already removed).
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> 
Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Oleg Nesterov July 15, 2019, 3:25 p.m. UTC | #2
On 06/25, Song Liu wrote:
>
> This patch allows uprobe to use original page when possible (all uprobes
> on the page are already removed).

I can't review. I do not understand vm enough.

> +	if (!is_register) {
> +		struct page *orig_page;
> +		pgoff_t index;
> +
> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
> +					  index);
> +
> +		if (orig_page) {
> +			if (pages_identical(new_page, orig_page)) {

Shouldn't we at least check PageUptodate?


and I am a bit surprised there is no simple way to unmap the old page
in this case... 

Oleg.
Song Liu July 24, 2019, 8:23 a.m. UTC | #3
> On Jul 15, 2019, at 8:25 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 06/25, Song Liu wrote:
>> 
>> This patch allows uprobe to use original page when possible (all uprobes
>> on the page are already removed).
> 
> I can't review. I do not understand vm enough.
> 
>> +	if (!is_register) {
>> +		struct page *orig_page;
>> +		pgoff_t index;
>> +
>> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
>> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
>> +					  index);
>> +
>> +		if (orig_page) {
>> +			if (pages_identical(new_page, orig_page)) {
> 
> Shouldn't we at least check PageUptodate?

For page cache, we only do ClearPageUptodate() on read failures, so 
this should be really rare case. But I guess we can check anyway. 

> 
> and I am a bit surprised there is no simple way to unmap the old page
> in this case... 

The easiest way I have found requires flush_cache_page() plus a few
mmu_notifier calls around it. I think current solution is better than
that, as it saves a page fault. 

Thanks,
Song
Oleg Nesterov July 24, 2019, 8:57 a.m. UTC | #4
On 07/24, Song Liu wrote:
>
>
> > On Jul 15, 2019, at 8:25 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> >> +	if (!is_register) {
> >> +		struct page *orig_page;
> >> +		pgoff_t index;
> >> +
> >> +		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
> >> +		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
> >> +					  index);
> >> +
> >> +		if (orig_page) {
> >> +			if (pages_identical(new_page, orig_page)) {
> >
> > Shouldn't we at least check PageUptodate?
>
> For page cache, we only do ClearPageUptodate() on read failures,

Hmm. I don't think so.

> so
> this should be really rare case. But I guess we can check anyway.

Can? I think we should or this code is simply wrong...

> > and I am a bit surprised there is no simple way to unmap the old page
> > in this case...
>
> The easiest way I have found requires flush_cache_page() plus a few
> mmu_notifier calls around it.

But we need to do this anyway? At least with your patch replace_page() still
does this after page_add_file_rmap().

> I think current solution is better than
> that,

perhaps, I won't argue,

> as it saves a page fault.

I don't think it matters in this case.

Oleg.
diff mbox series

Patch

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 78f61bfc6b79..f7c61a1ef720 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -160,16 +160,19 @@  static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	int err;
 	struct mmu_notifier_range range;
 	struct mem_cgroup *memcg;
+	bool orig = new_page->mapping != NULL;  /* new_page == orig_page */
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr,
 				addr + PAGE_SIZE);
 
 	VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page);
 
-	err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL, &memcg,
-			false);
-	if (err)
-		return err;
+	if (!orig) {
+		err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL,
+					    &memcg, false);
+		if (err)
+			return err;
+	}
 
 	/* For try_to_free_swap() and munlock_vma_page() below */
 	lock_page(old_page);
@@ -177,15 +180,24 @@  static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	mmu_notifier_invalidate_range_start(&range);
 	err = -EAGAIN;
 	if (!page_vma_mapped_walk(&pvmw)) {
-		mem_cgroup_cancel_charge(new_page, memcg, false);
+		if (!orig)
+			mem_cgroup_cancel_charge(new_page, memcg, false);
 		goto unlock;
 	}
 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
 
 	get_page(new_page);
-	page_add_new_anon_rmap(new_page, vma, addr, false);
-	mem_cgroup_commit_charge(new_page, memcg, false, false);
-	lru_cache_add_active_or_unevictable(new_page, vma);
+	if (orig) {
+		lock_page(new_page);  /* for page_add_file_rmap() */
+		page_add_file_rmap(new_page, false);
+		unlock_page(new_page);
+		inc_mm_counter(mm, mm_counter_file(new_page));
+		dec_mm_counter(mm, MM_ANONPAGES);
+	} else {
+		page_add_new_anon_rmap(new_page, vma, addr, false);
+		mem_cgroup_commit_charge(new_page, memcg, false, false);
+		lru_cache_add_active_or_unevictable(new_page, vma);
+	}
 
 	if (!PageAnon(old_page)) {
 		dec_mm_counter(mm, mm_counter_file(old_page));
@@ -501,6 +513,23 @@  int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	copy_highpage(new_page, old_page);
 	copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
 
+	if (!is_register) {
+		struct page *orig_page;
+		pgoff_t index;
+
+		index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
+		orig_page = find_get_page(vma->vm_file->f_inode->i_mapping,
+					  index);
+
+		if (orig_page) {
+			if (pages_identical(new_page, orig_page)) {
+				put_page(new_page);
+				new_page = orig_page;
+			} else
+				put_page(orig_page);
+		}
+	}
+
 	ret = __replace_page(vma, vaddr, old_page, new_page);
 	put_page(new_page);
 put_old: