Message ID | 20180518030316.31019-1-ying.huang@intel.com (mailing list archive)
---|---
State | New, archived
On Fri 18-05-18 11:03:16, Huang, Ying wrote:
[...]
> The patch is a generic optimization which should benefit quite some
> workloads, not for a specific use case. To demonstrate the performance
> benefit of the patch, we tested it with vm-scalability run on
> transparent huge page.

It also adds quite some non-intuitive code. So is this worth it? Does
any _real_ workload benefit from the change?

> include/linux/mm.h |  3 ++-
> mm/huge_memory.c   |  3 ++-
> mm/memory.c        | 43 +++++++++++++++++++++++++++++++++++++++----
> 3 files changed, 43 insertions(+), 6 deletions(-)

On 05/17/2018 11:24 PM, Michal Hocko wrote:
> On Fri 18-05-18 11:03:16, Huang, Ying wrote:
> [...]
>> The patch is a generic optimization which should benefit quite some
>> workloads, not for a specific use case. To demonstrate the performance
>> benefit of the patch, we tested it with vm-scalability run on
>> transparent huge page.
>
> It also adds quite some non-intuitive code. So is this worth it? Does
> any _real_ workload benefit from the change?

One way to 'add less code' would be to create a helper routine that
indicates the order in which sub-pages are to be copied. IIUC, you
added the same algorithm for sub-page ordering to copy_huge_page()
that was previously added to clear_huge_page(). Correct?

If so, then perhaps a common helper could be used by both the clear
and copy huge page routines. It would also make maintenance easier.

Hi, Michal,

Michal Hocko <mhocko@kernel.org> writes:

> On Fri 18-05-18 11:03:16, Huang, Ying wrote:
> [...]
>> The patch is a generic optimization which should benefit quite some
>> workloads, not for a specific use case. To demonstrate the performance
>> benefit of the patch, we tested it with vm-scalability run on
>> transparent huge page.
>
> It also adds quite some non-intuitive code. So is this worth it? Does
> any _real_ workload benefit from the change?

I don't have any _real_ workload which benefits from this. But I think
this is the right way to copy a huge page. It should benefit many
workloads with heavy cache contention, as illustrated in the
micro-benchmark. But the performance benefit may be small or
non-measurable for _real_ workloads.

The code does become less intuitive than before. But fortunately, all
of the non-intuitive code is in copy_user_huge_page(), which is a leaf
function with a well-defined interface and semantics. And with the help
of the code comments, at least the intention of the code is clear.

Best Regards,
Huang, Ying

>> include/linux/mm.h |  3 ++-
>> mm/huge_memory.c   |  3 ++-
>> mm/memory.c        | 43 +++++++++++++++++++++++++++++++++++++++----
>> 3 files changed, 43 insertions(+), 6 deletions(-)

Mike Kravetz <mike.kravetz@oracle.com> writes:

> On 05/17/2018 11:24 PM, Michal Hocko wrote:
>> On Fri 18-05-18 11:03:16, Huang, Ying wrote:
>> [...]
>>> The patch is a generic optimization which should benefit quite some
>>> workloads, not for a specific use case. To demonstrate the performance
>>> benefit of the patch, we tested it with vm-scalability run on
>>> transparent huge page.
>>
>> It also adds quite some non-intuitive code. So is this worth it? Does
>> any _real_ workload benefit from the change?
>
> One way to 'add less code' would be to create a helper routine that
> indicates the order in which sub-pages are to be copied. IIUC, you
> added the same algorithm for sub-page ordering to copy_huge_page()
> that was previously added to clear_huge_page(). Correct?

Yes.

> If so, then perhaps a common helper could be used by both the clear
> and copy huge page routines. It would also make maintenance easier.

That's a good idea. But it may require turning the
copy_user_highpage()/clear_user_highpage() calls in
copy_user_huge_page()/clear_huge_page() from direct calls into indirect
calls. I don't know whether that will incur some overhead; I will try
to measure it.

Best Regards,
Huang, Ying

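A rough sketch of what such a common helper could look like, with the
per-sub-page operation supplied as an indirect call. This is only an
illustration of the idea being discussed, not code from the posted
patch; the name process_huge_page() and the process_subpage callback
are hypothetical.

```c
/*
 * Hypothetical sketch only -- not from the posted patch.  The sub-page
 * ordering lives in one shared helper and the per-sub-page work
 * (clear or copy) is supplied through an indirect call.
 */
static void process_huge_page(unsigned long addr_hint,
			      unsigned int pages_per_huge_page,
			      void (*process_subpage)(unsigned long addr,
						      int idx, void *arg),
			      void *arg)
{
	int i, n, base, l;
	unsigned long addr = addr_hint &
		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);

	might_sleep();
	/* Process the target sub-page last to keep its cache lines hot */
	n = (addr_hint - addr) / PAGE_SIZE;
	if (2 * n <= pages_per_huge_page) {
		/* Target in first half: process the tail of the huge page first */
		base = 0;
		l = n;
		for (i = pages_per_huge_page - 1; i >= 2 * n; i--) {
			cond_resched();
			process_subpage(addr + i * PAGE_SIZE, i, arg);
		}
	} else {
		/* Target in second half: process the head of the huge page first */
		base = pages_per_huge_page - 2 * (pages_per_huge_page - n);
		l = pages_per_huge_page - n;
		for (i = 0; i < base; i++) {
			cond_resched();
			process_subpage(addr + i * PAGE_SIZE, i, arg);
		}
	}
	/* Converge on the target sub-page from both ends */
	for (i = 0; i < l; i++) {
		cond_resched();
		process_subpage(addr + (base + i) * PAGE_SIZE, base + i, arg);
		cond_resched();
		process_subpage(addr + (base + 2 * l - 1 - i) * PAGE_SIZE,
				base + 2 * l - 1 - i, arg);
	}
}
```

copy_user_huge_page() would then pass a small callback wrapping
copy_user_highpage(), and clear_huge_page() one wrapping
clear_user_highpage(), so the ordering logic lives in one place. The
open question raised above is whether the extra indirect call is
measurable in the copy/clear loop.
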
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3fa3b1356c34..a5fae31988e6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2732,7 +2732,8 @@ extern void clear_huge_page(struct page *page,
 			    unsigned long addr_hint,
 			    unsigned int pages_per_huge_page);
 extern void copy_user_huge_page(struct page *dst, struct page *src,
-				unsigned long addr, struct vm_area_struct *vma,
+				unsigned long addr_hint,
+				struct vm_area_struct *vma,
 				unsigned int pages_per_huge_page);
 extern long copy_huge_page_from_user(struct page *dst_page,
 				const void __user *usr_src,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 323acdd14e6e..7e720e92fcd6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1331,7 +1331,8 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 	if (!page)
 		clear_huge_page(new_page, vmf->address, HPAGE_PMD_NR);
 	else
-		copy_user_huge_page(new_page, page, haddr, vma, HPAGE_PMD_NR);
+		copy_user_huge_page(new_page, page, vmf->address,
+				    vma, HPAGE_PMD_NR);
 	__SetPageUptodate(new_page);
 
 	mmun_start = haddr;
diff --git a/mm/memory.c b/mm/memory.c
index 14578158ed20..f8868c94d6ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4654,10 +4654,12 @@ static void copy_user_gigantic_page(struct page *dst, struct page *src,
 }
 
 void copy_user_huge_page(struct page *dst, struct page *src,
-			 unsigned long addr, struct vm_area_struct *vma,
+			 unsigned long addr_hint, struct vm_area_struct *vma,
 			 unsigned int pages_per_huge_page)
 {
-	int i;
+	int i, n, base, l;
+	unsigned long addr = addr_hint &
+		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
 
 	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
 		copy_user_gigantic_page(dst, src, addr, vma,
@@ -4665,10 +4667,43 @@ void copy_user_huge_page(struct page *dst, struct page *src,
 		return;
 	}
 
+	/* Copy sub-page to access last to keep its cache lines hot */
 	might_sleep();
-	for (i = 0; i < pages_per_huge_page; i++) {
+	n = (addr_hint - addr) / PAGE_SIZE;
+	if (2 * n <= pages_per_huge_page) {
+		/* If sub-page to access in first half of huge page */
+		base = 0;
+		l = n;
+		/* Copy sub-pages at the end of huge page */
+		for (i = pages_per_huge_page - 1; i >= 2 * n; i--) {
+			cond_resched();
+			copy_user_highpage(dst + i, src + i,
+					   addr + i * PAGE_SIZE, vma);
+		}
+	} else {
+		/* If sub-page to access in second half of huge page */
+		base = pages_per_huge_page - 2 * (pages_per_huge_page - n);
+		l = pages_per_huge_page - n;
+		/* Copy sub-pages at the begin of huge page */
+		for (i = 0; i < base; i++) {
+			cond_resched();
+			copy_user_highpage(dst + i, src + i,
+					   addr + i * PAGE_SIZE, vma);
+		}
+	}
+	/*
+	 * Copy remaining sub-pages in left-right-left-right pattern
+	 * towards the sub-page to access
+	 */
+	for (i = 0; i < l; i++) {
+		cond_resched();
+		copy_user_highpage(dst + base + i, src + base + i,
+				   addr + (base + i) * PAGE_SIZE, vma);
 		cond_resched();
-		copy_user_highpage(dst + i, src + i, addr + i*PAGE_SIZE, vma);
+		copy_user_highpage(dst + base + 2 * l - 1 - i,
+				   src + base + 2 * l - 1 - i,
+				   addr + (base + 2 * l - 1 - i) * PAGE_SIZE,
+				   vma);
 	}
 }
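
To make the resulting copy order concrete, here is a small stand-alone
illustration (ordinary userspace C, not part of the patch;
print_copy_order() is a made-up name) that reproduces the index
arithmetic of the new copy_user_huge_page() above and prints the order
in which sub-pages would be copied:

```c
/*
 * Stand-alone illustration (not kernel code) of the sub-page copy order
 * produced by the patch above.  With 8 sub-pages and faulting sub-page
 * n = 2 it prints "7 6 5 4 0 3 1 2": the target sub-page is copied last.
 */
#include <stdio.h>

static void print_copy_order(int pages_per_huge_page, int n)
{
	int i, base, l;

	if (2 * n <= pages_per_huge_page) {
		/* Target in first half: copy the tail first */
		base = 0;
		l = n;
		for (i = pages_per_huge_page - 1; i >= 2 * n; i--)
			printf("%d ", i);
	} else {
		/* Target in second half: copy the head first */
		base = pages_per_huge_page - 2 * (pages_per_huge_page - n);
		l = pages_per_huge_page - n;
		for (i = 0; i < base; i++)
			printf("%d ", i);
	}
	/* Converge on the target sub-page from both ends */
	for (i = 0; i < l; i++)
		printf("%d %d ", base + i, base + 2 * l - 1 - i);
	printf("\n");
}

int main(void)
{
	print_copy_order(8, 2);	/* first-half target:  7 6 5 4 0 3 1 2 */
	print_copy_order(8, 6);	/* second-half target: 0 1 2 3 4 7 5 6 */
	return 0;
}
```

For a 2MB THP (512 sub-pages) the same pattern holds: the sub-pages far
from the faulting address are copied first, then the copy converges on
the target from both sides, so the sub-page that triggered the fault is
copied last and its cache lines are still hot when the faulting task
resumes.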