Message ID: 20231006195318.4087158-5-willy@infradead.org (mailing list archive)
State: New
Series: Handle more faults under the VMA lock
On Fri, Oct 6, 2023 at 12:53 PM Matthew Wilcox (Oracle) <willy@infradead.org> wrote:
>
> If the page is not currently present in the page tables, we need to call
> the page fault handler to find out which page we're supposed to COW,
> so we need to both check that there is already an anon_vma and that the
> fault handler doesn't need the mmap_lock.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

> ---
>  mm/memory.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index a9b0c135209a..938f481df0ab 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4639,13 +4639,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
>         struct vm_area_struct *vma = vmf->vma;
>         vm_fault_t ret;
>
> -       if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> -               vma_end_read(vma);
> -               return VM_FAULT_RETRY;
> -       }
> -
> -       if (unlikely(anon_vma_prepare(vma)))
> -               return VM_FAULT_OOM;
> +       ret = vmf_can_call_fault(vmf);
> +       if (!ret)
> +               ret = vmf_anon_prepare(vmf);
> +       if (ret)
> +               return ret;
>
>         vmf->cow_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
>         if (!vmf->cow_page)
> --
> 2.40.1
>
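For context: do_cow_fault() is the path taken when a write faults on a not-yet-present page of a MAP_PRIVATE file mapping. The fault handler asks the filesystem for the source page, then copies it into a freshly allocated anonymous page. The will-it-scale page_fault2 testcase measured below essentially does this in a loop. A minimal userspace trigger, for illustration only (the temp file path and sizes here are arbitrary):

/* Illustrative only: trigger do_cow_fault() from userspace by writing
 * to a private file mapping.
 */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        char path[] = "/tmp/cow-demo.XXXXXX";
        size_t len = 1 << 20;
        int fd = mkstemp(path);

        assert(fd >= 0);
        unlink(path);
        assert(ftruncate(fd, len) == 0);

        /* MAP_PRIVATE: writes must not reach the file, so the kernel
         * copies each written file page into a new anonymous page. */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        assert(p != MAP_FAILED);

        /* The first write to each not-yet-present page faults through
         * do_cow_fault(): ->fault() (shmem_fault() when /tmp is tmpfs)
         * finds the source page, which is copied into vmf->cow_page. */
        for (size_t i = 0; i < len; i += 4096)
                p[i] = 1;

        munmap(p, len);
        close(fd);
        return 0;
}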
Hello,

kernel test robot noticed a 38.7% improvement of will-it-scale.per_thread_ops on:

commit: 90e99527c746cd9ef7ebf0333c9611e45c6e5e1d ("[PATCH v2 4/6] mm: Handle COW faults under the VMA lock")
url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/mm-Make-lock_folio_maybe_drop_mmap-VMA-lock-aware/20231007-035513
base: v6.6-rc4
patch link: https://lore.kernel.org/all/20231006195318.4087158-5-willy@infradead.org/
patch subject: [PATCH v2 4/6] mm: Handle COW faults under the VMA lock

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
  nr_task: 16
  mode: thread
  test: page_fault2
  cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310201702.62f04f91-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault2/will-it-scale

commit:
  c8b329d48e ("mm: Handle shared faults under the VMA lock")
  90e99527c7 ("mm: Handle COW faults under the VMA lock")

c8b329d48e0dac74 90e99527c746cd9ef7ebf0333c9
---------------- ---------------------------
       %stddev      %change       %stddev
           \            |             \
  1.11 ± 2%    +0.4    1.50    mpstat.cpu.all.usr%
  690.67 ± 20%    -35.3%    447.00 ± 6%    perf-c2c.HITM.local
  71432 ± 3%    -10.5%    63958    meminfo.Active
  70468 ± 3%    -10.4%    63142    meminfo.Active(anon)
  5.722e+08 ± 2%    +38.8%    7.942e+08    numa-numastat.node0.local_node
  5.723e+08 ± 2%    +38.8%    7.944e+08    numa-numastat.node0.numa_hit
  4746    -54.0%    2183    vmstat.system.cs
  106237    +1.7%    108086    vmstat.system.in
  69143 ± 4%    -10.2%    62107 ± 2%    numa-meminfo.node1.Active
  68750 ± 3%    -10.1%    61835    numa-meminfo.node1.Active(anon)
  70251 ± 4%    -9.8%    63348    numa-meminfo.node1.Shmem
  1889742 ± 2%    +38.7%    2621754    will-it-scale.16.threads
  118108 ± 2%    +38.7%    163859    will-it-scale.per_thread_ops
  1889742 ± 2%    +38.7%    2621754    will-it-scale.workload
  5.723e+08 ± 2%    +38.8%    7.944e+08    numa-vmstat.node0.numa_hit
  5.722e+08 ± 2%    +38.8%    7.942e+08    numa-vmstat.node0.numa_local
  17189 ± 3%    -10.1%    15458    numa-vmstat.node1.nr_active_anon
  17563 ± 4%    -9.8%    15837    numa-vmstat.node1.nr_shmem
  17189 ± 3%    -10.1%    15458    numa-vmstat.node1.nr_zone_active_anon
  66914 ± 10%    -54.3%    30547 ± 4%    turbostat.C1
  0.07 ± 18%    -0.1    0.02 ± 33%    turbostat.C1%
  513918 ± 3%    -74.2%    132621 ± 2%    turbostat.C1E
  0.54 ± 4%    -0.4    0.16 ± 4%    turbostat.C1E%
  0.11    +18.2%    0.13    turbostat.IPC
  218.42    +2.0%    222.83    turbostat.PkgWatt
  30.47    +13.3%    34.53    turbostat.RAMWatt
  720.36    +24.0%    893.56 ± 4%    sched_debug.cfs_rq:/.runnable_avg.max
  225.47 ± 7%    +16.4%    262.37    sched_debug.cfs_rq:/.runnable_avg.stddev
  713.28    +25.3%    893.53 ± 4%    sched_debug.cfs_rq:/.util_avg.max
  224.87 ± 7%    +16.6%    262.19    sched_debug.cfs_rq:/.util_avg.stddev
  72.59 ± 49%    +63.1%    118.38 ± 11%    sched_debug.cfs_rq:/.util_est_enqueued.avg
  605.14 ± 4%    +40.7%    851.22    sched_debug.cfs_rq:/.util_est_enqueued.max
  151.28 ± 22%    +64.0%    248.15 ± 5%    sched_debug.cfs_rq:/.util_est_enqueued.stddev
  8811    -42.4%    5078    sched_debug.cpu.nr_switches.avg
  17617 ± 3%    -10.4%    15785    proc-vmstat.nr_active_anon
  332941    +4.6%    348206    proc-vmstat.nr_anon_pages
  855626    +1.7%    870502    proc-vmstat.nr_inactive_anon
  17617 ± 3%    -10.4%    15785    proc-vmstat.nr_zone_active_anon
  855626    +1.7%    870502    proc-vmstat.nr_zone_inactive_anon
  5.729e+08 ± 2%    +38.8%    7.95e+08    proc-vmstat.numa_hit
  5.727e+08 ± 2%    +38.8%    7.948e+08    proc-vmstat.numa_local
  16509 ± 4%    -13.0%    14365    proc-vmstat.pgactivate
  5.724e+08 ± 2%    +38.7%    7.94e+08    proc-vmstat.pgalloc_normal
  5.704e+08 ± 2%    +38.8%    7.914e+08    proc-vmstat.pgfault
  5.723e+08 ± 2%    +38.7%    7.94e+08    proc-vmstat.pgfree
  0.00 ± 37%    +164.7%    0.01 ± 6%    perf-sched.sch_delay.avg.ms.__cond_resched.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.isra
  0.02 ± 12%    +26.4%    0.02 ± 10%    perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
  0.00 ±223%    +9466.7%    0.05 ±181%    perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
  0.02 ± 94%    -61.2%    0.01 ± 11%    perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
  0.00 ± 8%    +1068.0%    0.05 ±189%    perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  0.01 ± 14%    +52.6%    0.01 ± 34%    perf-sched.sch_delay.max.ms.__cond_resched.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.isra
  0.01 ± 9%    +10802.8%    0.65 ±212%    perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
  0.00 ±223%    +10533.3%    0.05 ±162%    perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
  62.95 ± 2%    +113.8%    134.58 ± 2%    perf-sched.total_wait_and_delay.average.ms
  13913    -52.2%    6654    perf-sched.total_wait_and_delay.count.ms
  62.87 ± 2%    +113.8%    134.44 ± 2%    perf-sched.total_wait_time.average.ms
  2.95 ± 3%    +1477.8%    46.48 ± 2%    perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
  1.18 ± 7%    +2017.8%    24.99 ± 2%    perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
  2.76 ± 3%    -100.0%    0.00    perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  6894 ± 2%    -94.4%    384.67    perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
  1070 ± 11%    -60.9%    418.33    perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
  112.33 ± 13%    -100.0%    0.00    perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  15.07 ± 30%    +469.9%    85.90 ± 4%    perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
  11.68 ± 17%    +558.0%    76.85 ± 11%    perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
  14.21 ± 27%    -100.0%    0.00    perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  17.20 ± 29%    -69.9%    5.17 ± 7%    perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
  3893 ± 8%    -19.2%    3144 ± 19%    perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
  2.99 ± 28%    +906.8%    30.07 ± 12%    perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
  3.59 ± 49%    +796.7%    32.22 ± 11%    perf-sched.wait_time.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
  1.81 ± 75%    +2169.9%    41.07 ± 29%    perf-sched.wait_time.avg.ms.__cond_resched.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.isra
  3.46 ±101%    +1224.0%    45.81 ± 30%    perf-sched.wait_time.avg.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
  3.15 ± 29%    +943.4%    32.88 ± 7%    perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
  2.88 ± 50%    +922.9%    29.44 ± 11%    perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
  2.94 ± 3%    +1481.0%    46.47 ± 2%    perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
  1.18 ± 7%    +2023.3%    24.96 ± 3%    perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
  2.76 ± 3%    +1449.8%    42.73 ± 9%    perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  10.38 ± 3%    +533.8%    65.76 ± 7%    perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
  9.13 ± 26%    +596.6%    63.59 ± 11%    perf-sched.wait_time.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
  6.77 ± 70%    +843.3%    63.87 ± 30%    perf-sched.wait_time.max.ms.__cond_resched.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.isra
  5.71 ± 64%    +1111.5%    69.19 ± 15%    perf-sched.wait_time.max.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
  10.23 ± 4%    +560.7%    67.56 ± 6%    perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
  8.83 ± 30%    +582.4%    60.23 ± 7%    perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
  15.06 ± 30%    +470.1%    85.89 ± 4%    perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
  11.67 ± 17%    +558.2%    76.84 ± 11%    perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
  14.21 ± 27%    +429.5%    75.22 ± 9%    perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
  17.16 ± 28%    -69.9%    5.16 ± 7%    perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
  3893 ± 8%    -19.2%    3144 ± 19%    perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
  14.12    +16.6%    16.46    perf-stat.i.MPKI
  2.231e+09 ± 2%    +16.6%    2.601e+09    perf-stat.i.branch-instructions
  19953628    +8.8%    21705347    perf-stat.i.branch-misses
  51.96 ± 2%    +13.0    65.01    perf-stat.i.cache-miss-rate%
  1.566e+08 ± 2%    +36.8%    2.142e+08    perf-stat.i.cache-misses
  3.015e+08 ± 3%    +9.2%    3.294e+08    perf-stat.i.cache-references
  4702    -55.0%    2116    perf-stat.i.context-switches
  2.58    -14.2%    2.22    perf-stat.i.cpi
  114.64    -2.2%    112.13    perf-stat.i.cpu-migrations
  183.46    -26.2%    135.46    perf-stat.i.cycles-between-cache-misses
  4280505 ± 3%    +22.7%    5251081 ± 6%    perf-stat.i.dTLB-load-misses
  2.774e+09 ± 2%    +19.1%    3.303e+09    perf-stat.i.dTLB-loads
  0.98 ± 2%    +0.2    1.14    perf-stat.i.dTLB-store-miss-rate%
  15927669 ± 4%    +38.8%    22110291    perf-stat.i.dTLB-store-misses
  1.604e+09 ± 2%    +19.9%    1.923e+09    perf-stat.i.dTLB-stores
  79.86    +3.1    82.95    perf-stat.i.iTLB-load-miss-rate%
  2701759 ± 2%    +19.0%    3214102    perf-stat.i.iTLB-load-misses
  679352    -2.8%    660048    perf-stat.i.iTLB-loads
  1.115e+10 ± 2%    +17.1%    1.305e+10    perf-stat.i.instructions
  0.39    +16.8%    0.45    perf-stat.i.ipc
  0.29 ± 26%    -31.6%    0.20 ± 17%    perf-stat.i.major-faults
  762.98 ± 2%    +39.2%    1062    perf-stat.i.metric.K/sec
  66.44 ± 2%    +18.0%    78.42    perf-stat.i.metric.M/sec
  1890049 ± 2%    +38.5%    2616916    perf-stat.i.minor-faults
  47044113 ± 2%    +41.1%    66393293    perf-stat.i.node-loads
  11825548 ± 2%    +34.0%    15841684    perf-stat.i.node-stores
  1890049 ± 2%    +38.5%    2616917    perf-stat.i.page-faults
  14.05    +16.9%    16.42    perf-stat.overall.MPKI
  0.89    -0.1    0.83    perf-stat.overall.branch-miss-rate%
  51.96 ± 2%    +13.1    65.04    perf-stat.overall.cache-miss-rate%
  2.57    -14.4%    2.20    perf-stat.overall.cpi
  183.08    -26.7%    134.14    perf-stat.overall.cycles-between-cache-misses
  0.98 ± 2%    +0.2    1.14    perf-stat.overall.dTLB-store-miss-rate%
  79.90    +3.1    82.97    perf-stat.overall.iTLB-load-miss-rate%
  0.39    +16.7%    0.45    perf-stat.overall.ipc
  0.22 ± 2%    -0.1    0.15 ± 3%    perf-stat.overall.node-load-miss-rate%
  0.19 ± 8%    -0.1    0.13 ± 16%    perf-stat.overall.node-store-miss-rate%
  1779185    -15.5%    1503815    perf-stat.overall.path-length
  2.224e+09 ± 2%    +16.6%    2.593e+09    perf-stat.ps.branch-instructions
  19885795    +8.8%    21625880    perf-stat.ps.branch-misses
  1.56e+08 ± 2%    +36.8%    2.135e+08    perf-stat.ps.cache-misses
  3.005e+08 ± 3%    +9.2%    3.283e+08    perf-stat.ps.cache-references
  4686    -55.0%    2109    perf-stat.ps.context-switches
  114.35    -2.3%    111.73    perf-stat.ps.cpu-migrations
  4265367 ± 3%    +22.7%    5233761 ± 6%    perf-stat.ps.dTLB-load-misses
  2.765e+09 ± 2%    +19.1%    3.292e+09    perf-stat.ps.dTLB-loads
  15874379 ± 4%    +38.8%    22037238    perf-stat.ps.dTLB-store-misses
  1.598e+09 ± 2%    +19.9%    1.917e+09    perf-stat.ps.dTLB-stores
  2692499 ± 2%    +19.0%    3203465    perf-stat.ps.iTLB-load-misses
  677243    -2.9%    657791    perf-stat.ps.iTLB-loads
  1.111e+10 ± 2%    +17.1%    1.3e+10    perf-stat.ps.instructions
  0.29 ± 26%    -31.6%    0.20 ± 17%    perf-stat.ps.major-faults
  1883712 ± 2%    +38.5%    2608263    perf-stat.ps.minor-faults
  46887454 ± 2%    +41.1%    66175688    perf-stat.ps.node-loads
  11785781 ± 2%    +34.0%    15789100    perf-stat.ps.node-stores
  1883712 ± 2%    +38.5%    2608264    perf-stat.ps.page-faults
  3.362e+12 ± 2%    +17.3%    3.943e+12    perf-stat.total.instructions
  47.03 ± 2%    -8.6    38.45    perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
  47.22 ± 2%    -8.6    38.67    perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
  8.30 ± 6%    -8.3    0.00    perf-profile.calltrace.cycles-pp.lock_mm_and_find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
  7.19 ± 4%    -7.2    0.00    perf-profile.calltrace.cycles-pp.down_read_trylock.lock_mm_and_find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
  57.96 ± 3%    -4.7    53.23    perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
  61.72 ± 3%    -3.3    58.42    perf-profile.calltrace.cycles-pp.testcase
  2.19 ± 13%    -0.6    1.59 ± 6%    perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
  0.91 ± 8%    +0.2    1.09 ± 7%    perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
  0.56 ± 2%    +0.2    0.78 ± 5%    perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
  1.11 ± 4%    +0.2    1.34 ± 4%    perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
  0.86 ± 6%    +0.3    1.13 ± 4%    perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
  1.42 ± 3%    +0.3    1.77 ± 2%    perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
  0.87 ± 6%    +0.4    1.27 ± 3%    perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
  1.66 ± 3%    +0.4    2.10    perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
  0.54 ± 45%    +0.4    0.98 ± 4%    perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
  0.96 ± 6%    +0.4    1.40 ± 2%    perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
  1.23 ± 4%    +0.4    1.68    perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
  0.26 ±100%    +0.5    0.72 ± 8%    perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
  1.74 ± 3%    +0.5    2.22    perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
  0.59 ± 45%    +0.5    1.06 ± 3%    perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
  0.89 ± 5%    +0.5    1.36    perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
  0.60 ± 45%    +0.5    1.08 ± 3%    perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
  0.00    +0.5    0.52 ± 2%    perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
  0.00    +0.5    0.52 ± 2%    perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
  0.08 ±223%    +0.6    0.67    perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
  1.56 ± 4%    +0.6    2.18 ± 2%    perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
  0.00    +0.6    0.63 ± 5%    perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
  0.00    +0.7    0.66 ± 2%    perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
  2.04 ± 8%    +0.9    2.91    perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
  2.16 ± 7%    +0.9    3.10 ± 2%    perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
  2.80 ± 4%    +1.0    3.76    perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
  2.93 ± 5%    +1.1    4.06    perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
  3.11 ± 7%    +1.1    4.24 ± 2%    perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
  3.15 ± 4%    +1.2    4.31    perf-profile.calltrace.cycles-pp.error_entry.testcase
  3.05 ± 5%    +1.2    4.23    perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
  3.21 ± 3%    +1.2    4.41    perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
  2.62 ± 6%    +1.4    3.98    perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
  2.78 ± 6%    +1.4    4.20    perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
  0.70 ± 48%    +1.7    2.38 ± 3%    perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist
  0.71 ± 48%    +1.7    2.39 ± 2%    perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages
  1.98 ± 10%    +1.7    3.66    perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
  2.43 ± 9%    +1.8    4.25    perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
  2.64 ± 8%    +1.9    4.55    perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
  3.07 ± 8%    +2.1    5.13    perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
  3.15 ± 8%    +2.1    5.25    perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
  4.46 ± 5%    +2.3    6.72    perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
  4.47 ± 5%    +2.3    6.74    perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
  4.47 ± 5%    +2.3    6.74    perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
  4.47 ± 5%    +2.3    6.74    perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
  6.38 ± 6%    +2.3    8.65    perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
  3.64 ± 7%    +2.3    5.97    perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
  4.81 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.__munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
  4.79 ± 6%    +2.5    7.27    perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
  4.80 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
  4.80 ± 6%    +2.5    7.28    perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
  31.04 ± 3%    +3.1    34.10    perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
  32.18 ± 3%    +3.2    35.42    perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
  10.32 ± 4%    +3.6    13.90    perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
  23.83 ± 5%    +9.0    32.85    perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
  23.96 ± 5%    +9.0    33.00    perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
  47.11 ± 2%    -8.6    38.50    perf-profile.children.cycles-pp.do_user_addr_fault
  47.25 ± 2%    -8.5    38.70    perf-profile.children.cycles-pp.exc_page_fault
  8.31 ± 6%    -8.3    0.00    perf-profile.children.cycles-pp.lock_mm_and_find_vma
  7.32 ± 4%    -7.1    0.18 ± 9%    perf-profile.children.cycles-pp.down_read_trylock
  54.76 ± 3%    -5.9    48.89    perf-profile.children.cycles-pp.asm_exc_page_fault
  3.55 ± 3%    -3.4    0.18 ± 8%    perf-profile.children.cycles-pp.up_read
  63.31 ± 3%    -2.7    60.56    perf-profile.children.cycles-pp.testcase
  2.19 ± 13%    -0.6    1.59 ± 6%    perf-profile.children.cycles-pp.lock_vma_under_rcu
  0.55 ± 10%    -0.2    0.37 ± 6%    perf-profile.children.cycles-pp.mtree_range_walk
  0.30 ± 11%    -0.1    0.18 ± 10%    perf-profile.children.cycles-pp.handle_pte_fault
  0.20 ± 13%    -0.1    0.12 ± 9%    perf-profile.children.cycles-pp.pte_offset_map_nolock
  0.14 ± 10%    -0.1    0.07 ± 10%    perf-profile.children.cycles-pp.access_error
  0.08 ± 14%    -0.0    0.04 ± 45%    perf-profile.children.cycles-pp.intel_idle
  0.07 ± 11%    +0.0    0.10 ± 12%    perf-profile.children.cycles-pp.xas_start
  0.05 ± 46%    +0.0    0.08 ± 7%    perf-profile.children.cycles-pp.policy_node
  0.11 ± 7%    +0.0    0.14 ± 12%    perf-profile.children.cycles-pp.folio_unlock
  0.15 ± 6%    +0.0    0.20 ± 10%    perf-profile.children.cycles-pp._raw_spin_trylock
  0.11 ± 10%    +0.0    0.15 ± 7%    perf-profile.children.cycles-pp.get_pfnblock_flags_mask
  0.12 ± 12%    +0.0    0.17 ± 6%    perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
  0.13 ± 10%    +0.0    0.18 ± 5%    perf-profile.children.cycles-pp.uncharge_folio
  0.15 ± 8%    +0.0    0.20 ± 7%    perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
  0.11 ± 10%    +0.0    0.16 ± 6%    perf-profile.children.cycles-pp.shmem_get_policy
  0.15 ± 7%    +0.0    0.20 ± 4%    perf-profile.children.cycles-pp.try_charge_memcg
  0.13 ± 9%    +0.0    0.18 ± 9%    perf-profile.children.cycles-pp.cgroup_rstat_updated
  0.01 ±223%    +0.1    0.06 ± 23%    perf-profile.children.cycles-pp.perf_swevent_event
  0.20 ± 10%    +0.1    0.26 ± 4%    perf-profile.children.cycles-pp.__mod_zone_page_state
  0.17 ± 9%    +0.1    0.23 ± 8%    perf-profile.children.cycles-pp.__count_memcg_events
  0.14 ± 11%    +0.1    0.20 ± 2%    perf-profile.children.cycles-pp.free_swap_cache
  0.20 ± 6%    +0.1    0.25 ± 3%    perf-profile.children.cycles-pp.free_unref_page_prepare
  0.04 ± 45%    +0.1    0.10 ± 19%    perf-profile.children.cycles-pp.kthread_blkcg
  0.14 ± 8%    +0.1    0.20 ± 3%    perf-profile.children.cycles-pp.free_pages_and_swap_cache
  0.24 ± 8%    +0.1    0.30 ± 6%    perf-profile.children.cycles-pp.__list_add_valid_or_report
  0.23 ± 9%    +0.1    0.30 ± 4%    perf-profile.children.cycles-pp.free_unref_page_commit
  0.46 ± 4%    +0.1    0.55 ± 2%    perf-profile.children.cycles-pp.xas_load
  0.00    +0.1    0.11 ± 9%    perf-profile.children.cycles-pp.exit_to_user_mode_prepare
  0.34 ± 3%    +0.1    0.47 ± 6%    perf-profile.children.cycles-pp.charge_memcg
  0.32 ± 8%    +0.1    0.47 ± 6%    perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
  0.00    +0.2    0.15 ± 16%    perf-profile.children.cycles-pp.put_page
  0.25 ± 7%    +0.2    0.41 ± 5%    perf-profile.children.cycles-pp.__mod_node_page_state
  0.20 ± 15%    +0.2    0.36 ± 12%    perf-profile.children.cycles-pp.blk_cgroup_congested
  1.42 ± 4%    +0.2    1.58    perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
  0.23 ± 16%    +0.2    0.42 ± 10%    perf-profile.children.cycles-pp.__folio_throttle_swaprate
  0.36 ± 4%    +0.2    0.56 ± 5%    perf-profile.children.cycles-pp.__mod_lruvec_state
  0.91 ± 8%    +0.2    1.11 ± 7%    perf-profile.children.cycles-pp.__mem_cgroup_charge
  0.32 ± 9%    +0.2    0.53 ± 2%    perf-profile.children.cycles-pp.tlb_finish_mmu
  0.45 ± 6%    +0.2    0.68    perf-profile.children.cycles-pp.page_remove_rmap
  1.11 ± 4%    +0.2    1.34 ± 4%    perf-profile.children.cycles-pp.filemap_get_entry
  0.47 ± 12%    +0.2    0.72 ± 8%    perf-profile.children.cycles-pp.folio_add_new_anon_rmap
  0.47 ± 11%    +0.3    0.74 ± 6%    perf-profile.children.cycles-pp.__mod_lruvec_page_state
  0.88 ± 6%    +0.3    1.17 ± 4%    perf-profile.children.cycles-pp.lru_add_fn
  0.85 ± 2%    +0.3    1.16 ± 3%    perf-profile.children.cycles-pp.___perf_sw_event
  1.43 ± 4%    +0.3    1.78 ± 2%    perf-profile.children.cycles-pp.shmem_get_folio_gfp
  1.06 ± 2%    +0.4    1.47 ± 2%    perf-profile.children.cycles-pp.__perf_sw_event
  1.66 ± 3%    +0.4    2.10    perf-profile.children.cycles-pp.shmem_fault
  0.97 ± 6%    +0.5    1.44 ± 3%    perf-profile.children.cycles-pp.__free_one_page
  1.27 ± 4%    +0.5    1.74    perf-profile.children.cycles-pp.sync_regs
  1.75 ± 4%    +0.5    2.22    perf-profile.children.cycles-pp.__do_fault
  1.06 ± 6%    +0.5    1.58 ± 2%    perf-profile.children.cycles-pp.free_pcppages_bulk
  0.92 ± 5%    +0.5    1.45    perf-profile.children.cycles-pp._compound_head
  1.75 ± 5%    +0.6    2.36    perf-profile.children.cycles-pp.native_irq_return_iret
  1.74 ± 4%    +0.7    2.47 ± 2%    perf-profile.children.cycles-pp.free_unref_page_list
  0.83 ± 18%    +0.8    1.65 ± 2%    perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
  2.04 ± 8%    +0.9    2.92    perf-profile.children.cycles-pp.folio_batch_move_lru
  2.17 ± 7%    +0.9    3.11 ± 2%    perf-profile.children.cycles-pp.folio_add_lru_vma
  2.85 ± 4%    +1.0    3.82    perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
  3.01 ± 5%    +1.1    4.14    perf-profile.children.cycles-pp._raw_spin_lock
  3.12 ± 7%    +1.1    4.26 ± 3%    perf-profile.children.cycles-pp.set_pte_range
  3.20 ± 3%    +1.2    4.37    perf-profile.children.cycles-pp.error_entry
  3.06 ± 5%    +1.2    4.24    perf-profile.children.cycles-pp.__pte_offset_map_lock
  3.22 ± 3%    +1.2    4.41    perf-profile.children.cycles-pp.__irqentry_text_end
  3.09 ± 6%    +1.6    4.70    perf-profile.children.cycles-pp.release_pages
  3.09 ± 6%    +1.6    4.72    perf-profile.children.cycles-pp.tlb_batch_pages_flush
  1.98 ± 10%    +1.7    3.67    perf-profile.children.cycles-pp.rmqueue_bulk
  2.44 ± 9%    +1.8    4.27    perf-profile.children.cycles-pp.rmqueue
  2.66 ± 8%    +1.9    4.57    perf-profile.children.cycles-pp.get_page_from_freelist
  3.14 ± 7%    +2.1    5.23    perf-profile.children.cycles-pp.__alloc_pages
  3.17 ± 8%    +2.1    5.28    perf-profile.children.cycles-pp.__folio_alloc
  4.48 ± 5%    +2.3    6.75    perf-profile.children.cycles-pp.unmap_vmas
  4.48 ± 5%    +2.3    6.75    perf-profile.children.cycles-pp.unmap_page_range
  4.48 ± 5%    +2.3    6.75    perf-profile.children.cycles-pp.zap_pmd_range
  4.48 ± 5%    +2.3    6.75    perf-profile.children.cycles-pp.zap_pte_range
  6.39 ± 6%    +2.3    8.68    perf-profile.children.cycles-pp.finish_fault
  3.68 ± 7%    +2.4    6.03    perf-profile.children.cycles-pp.vma_alloc_folio
  1.56 ± 21%    +2.4    3.92    perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
  1.66 ± 19%    +2.4    4.08    perf-profile.children.cycles-pp._raw_spin_lock_irqsave
  4.97 ± 5%    +2.4    7.42    perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
  4.96 ± 5%    +2.4    7.42    perf-profile.children.cycles-pp.do_syscall_64
  4.81 ± 6%    +2.5    7.28    perf-profile.children.cycles-pp.__munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.children.cycles-pp.__x64_sys_munmap
  4.81 ± 6%    +2.5    7.28    perf-profile.children.cycles-pp.__vm_munmap
  4.80 ± 6%    +2.5    7.28    perf-profile.children.cycles-pp.do_vmi_munmap
  4.80 ± 6%    +2.5    7.28    perf-profile.children.cycles-pp.do_vmi_align_munmap
  4.80 ± 6%    +2.5    7.27    perf-profile.children.cycles-pp.unmap_region
  31.08 ± 3%    +3.1    34.13    perf-profile.children.cycles-pp.__handle_mm_fault
  32.25 ± 3%    +3.2    35.50    perf-profile.children.cycles-pp.handle_mm_fault
  10.33 ± 4%    +3.6    13.92    perf-profile.children.cycles-pp.copy_page
  23.97 ± 5%    +9.0    33.01    perf-profile.children.cycles-pp.do_fault
  23.88 ± 5%    +9.1    32.95    perf-profile.children.cycles-pp.do_cow_fault
  7.29 ± 4%    -7.1    0.18 ± 10%    perf-profile.self.cycles-pp.down_read_trylock
  6.77 ± 4%    -5.8    0.93 ± 8%    perf-profile.self.cycles-pp.__handle_mm_fault
  3.51 ± 3%    -3.3    0.18 ± 10%    perf-profile.self.cycles-pp.up_read
  0.54 ± 10%    -0.2    0.36 ± 6%    perf-profile.self.cycles-pp.mtree_range_walk
  0.10 ± 18%    -0.1    0.04 ± 72%    perf-profile.self.cycles-pp.handle_pte_fault
  0.12 ± 7%    -0.1    0.07 ± 10%    perf-profile.self.cycles-pp.access_error
  0.10 ± 18%    -0.1    0.05 ± 47%    perf-profile.self.cycles-pp.pte_offset_map_nolock
  0.08 ± 11%    -0.0    0.04 ± 44%    perf-profile.self.cycles-pp.do_fault
  0.08 ± 14%    -0.0    0.04 ± 45%    perf-profile.self.cycles-pp.intel_idle
  0.09 ± 6%    +0.0    0.11 ± 5%    perf-profile.self.cycles-pp.free_unref_page_prepare
  0.06 ± 7%    +0.0    0.09 ± 4%    perf-profile.self.cycles-pp.free_pcppages_bulk
  0.09 ± 6%    +0.0    0.12 ± 6%    perf-profile.self.cycles-pp.rmqueue_bulk
  0.10 ± 6%    +0.0    0.13 ± 10%    perf-profile.self.cycles-pp.charge_memcg
  0.11 ± 8%    +0.0    0.14 ± 4%    perf-profile.self.cycles-pp.__mod_lruvec_state
  0.08 ± 12%    +0.0    0.11 ± 9%    perf-profile.self.cycles-pp.__pte_offset_map_lock
  0.10 ± 10%    +0.0    0.14 ± 11%    perf-profile.self.cycles-pp.folio_unlock
  0.12 ± 11%    +0.0    0.16 ± 8%    perf-profile.self.cycles-pp.uncharge_folio
  0.12 ± 15%    +0.0    0.16 ± 4%    perf-profile.self.cycles-pp._raw_spin_lock_irqsave
  0.10 ± 9%    +0.0    0.14 ± 5%    perf-profile.self.cycles-pp.get_pfnblock_flags_mask
  0.10 ± 7%    +0.0    0.15 ± 3%    perf-profile.self.cycles-pp.try_charge_memcg
  0.04 ± 71%    +0.0    0.08 ± 16%    perf-profile.self.cycles-pp.__do_fault
  0.15 ± 6%    +0.0    0.20 ± 10%    perf-profile.self.cycles-pp._raw_spin_trylock
  0.11 ± 9%    +0.0    0.16 ± 6%    perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
  0.13 ± 9%    +0.0    0.17 ± 8%    perf-profile.self.cycles-pp.set_pte_range
  0.10 ± 9%    +0.0    0.15 ± 5%    perf-profile.self.cycles-pp.shmem_get_policy
  0.18 ± 9%    +0.0    0.24 ± 4%    perf-profile.self.cycles-pp.__mod_zone_page_state
  0.14 ± 10%    +0.0    0.18 ± 2%    perf-profile.self.cycles-pp.free_swap_cache
  0.10 ± 11%    +0.1    0.16 ± 10%    perf-profile.self.cycles-pp.exc_page_fault
  0.19 ± 11%    +0.1    0.24 ± 5%    perf-profile.self.cycles-pp.free_unref_page_commit
  0.12 ± 12%    +0.1    0.17 ± 10%    perf-profile.self.cycles-pp.cgroup_rstat_updated
  0.14 ± 8%    +0.1    0.20 ± 5%    perf-profile.self.cycles-pp.asm_exc_page_fault
  0.01 ±223%    +0.1    0.06 ± 23%    perf-profile.self.cycles-pp.perf_swevent_event
  0.17 ± 7%    +0.1    0.22 ± 7%    perf-profile.self.cycles-pp.xas_load
  0.16 ± 9%    +0.1    0.22 ± 5%    perf-profile.self.cycles-pp.folio_add_new_anon_rmap
  0.20 ± 8%    +0.1    0.26 ± 4%    perf-profile.self.cycles-pp.free_unref_page_list
  0.06 ± 14%    +0.1    0.13 ± 21%    perf-profile.self.cycles-pp.__mem_cgroup_charge
  0.22 ± 9%    +0.1    0.28 ± 7%    perf-profile.self.cycles-pp.__list_add_valid_or_report
  0.13 ± 6%    +0.1    0.19 ± 7%    perf-profile.self.cycles-pp.folio_add_lru_vma
  0.22 ± 7%    +0.1    0.29 ± 5%    perf-profile.self.cycles-pp.rmqueue
  0.24 ± 5%    +0.1    0.31 ± 6%    perf-profile.self.cycles-pp.shmem_fault
  0.21 ± 6%    +0.1    0.29 ± 4%    perf-profile.self.cycles-pp.get_page_from_freelist
  0.22 ± 7%    +0.1    0.30 ± 4%    perf-profile.self.cycles-pp.__perf_sw_event
  0.00    +0.1    0.10 ± 9%    perf-profile.self.cycles-pp.exit_to_user_mode_prepare
  0.29 ± 4%    +0.1    0.39 ± 6%    perf-profile.self.cycles-pp.shmem_get_folio_gfp
  0.32 ± 7%    +0.1    0.44 ± 4%    perf-profile.self.cycles-pp.zap_pte_range
  0.24 ± 9%    +0.1    0.36 ± 6%    perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
  0.16 ± 15%    +0.1    0.29 ± 11%    perf-profile.self.cycles-pp.blk_cgroup_congested
  0.35 ± 7%    +0.1    0.48 ± 4%    perf-profile.self.cycles-pp.folio_batch_move_lru
  0.39 ± 5%    +0.1    0.53 ± 5%    perf-profile.self.cycles-pp.__alloc_pages
  0.31 ± 8%    +0.1    0.45 ± 2%    perf-profile.self.cycles-pp.vma_alloc_folio
  0.29 ± 9%    +0.1    0.44 ± 4%    perf-profile.self.cycles-pp.page_remove_rmap
  0.65 ± 7%    +0.1    0.80 ± 7%    perf-profile.self.cycles-pp.filemap_get_entry
  0.44 ± 7%    +0.2    0.59 ± 2%    perf-profile.self.cycles-pp.lru_add_fn
  0.00    +0.2    0.15 ± 16%    perf-profile.self.cycles-pp.put_page
  0.24 ± 7%    +0.2    0.39 ± 6%    perf-profile.self.cycles-pp.__mod_node_page_state
  1.41 ± 4%    +0.2    1.57    perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
  0.57 ± 8%    +0.2    0.81 ± 5%    perf-profile.self.cycles-pp.release_pages
  0.75 ± 2%    +0.3    1.03 ± 2%    perf-profile.self.cycles-pp.___perf_sw_event
  0.91 ± 6%    +0.5    1.37 ± 3%    perf-profile.self.cycles-pp.__free_one_page
  1.27 ± 4%    +0.5    1.74    perf-profile.self.cycles-pp.sync_regs
  0.90 ± 5%    +0.5    1.42    perf-profile.self.cycles-pp._compound_head
  1.74 ± 5%    +0.6    2.36    perf-profile.self.cycles-pp.native_irq_return_iret
  2.82 ± 5%    +0.9    3.72    perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
  2.99 ± 5%    +1.1    4.11    perf-profile.self.cycles-pp._raw_spin_lock
  3.18 ± 4%    +1.2    4.34    perf-profile.self.cycles-pp.error_entry
  3.22 ± 3%    +1.2    4.41    perf-profile.self.cycles-pp.__irqentry_text_end
  3.70 ± 4%    +1.3    5.00    perf-profile.self.cycles-pp.testcase
  1.56 ± 21%    +2.4    3.92    perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
  10.29 ± 4%    +3.6    13.86    perf-profile.self.cycles-pp.copy_page

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
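The profile shift above is consistent with more faults completing under the per-VMA lock: lock_mm_and_find_vma, down_read_trylock and up_read all but vanish, and context switches drop by more than half, because the fast path no longer touches the mmap_lock rwsem. For orientation, the x86 fault handler tries the VMA lock first and only falls back to the mmap_lock when the handler returns VM_FAULT_RETRY. An abridged, paraphrased sketch of that logic (from arch/x86/mm/fault.c around v6.6; consult the tree for the authoritative code):

        /* Abridged sketch of do_user_addr_fault(), v6.6-era. */
        vma = lock_vma_under_rcu(mm, address);  /* lockless VMA lookup */
        if (!vma)
                goto lock_mmap;                 /* fall back to mmap_lock */
        if (unlikely(access_error(error_code, vma))) {
                vma_end_read(vma);
                goto lock_mmap;
        }
        fault = handle_mm_fault(vma, address,
                                flags | FAULT_FLAG_VMA_LOCK, regs);
        if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
                vma_end_read(vma);
        if (!(fault & VM_FAULT_RETRY))
                goto done;                      /* handled, no mmap_lock taken */
        /* VM_FAULT_RETRY: the handler needed the mmap_lock after all. */
lock_mmap:
        vma = lock_mm_and_find_vma(mm, address, regs);
        /* ... retry the fault under the mmap_lock ... */

Before this patch, do_cow_fault() always returned VM_FAULT_RETRY when called with FAULT_FLAG_VMA_LOCK, forcing every COW fault through the fallback path; after it, only faults that actually need the mmap_lock retry.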
If the page is not currently present in the page tables, we need to call
the page fault handler to find out which page we're supposed to COW,
so we need to both check that there is already an anon_vma and that the
fault handler doesn't need the mmap_lock.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memory.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a9b0c135209a..938f481df0ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4639,13 +4639,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf)
        struct vm_area_struct *vma = vmf->vma;
        vm_fault_t ret;
 
-       if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
-               vma_end_read(vma);
-               return VM_FAULT_RETRY;
-       }
-
-       if (unlikely(anon_vma_prepare(vma)))
-               return VM_FAULT_OOM;
+       ret = vmf_can_call_fault(vmf);
+       if (!ret)
+               ret = vmf_anon_prepare(vmf);
+       if (ret)
+               return ret;
 
        vmf->cow_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
        if (!vmf->cow_page)
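For reference, the two helpers this patch calls were added by earlier patches in this series ("mm: Call wp_page_copy() under the VMA lock" introduced vmf_anon_prepare(), and "mm: Handle shared faults under the VMA lock" introduced vmf_can_call_fault()). A paraphrased sketch of their behaviour follows; see those patches for the authoritative code:

/* Paraphrased from earlier patches in this series (mm/memory.c). */
static vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
{
        struct vm_area_struct *vma = vmf->vma;

        /* ->fault() implementations that also provide ->map_pages()
         * are considered safe to call under the VMA lock. */
        if (vma->vm_ops->map_pages || !(vmf->flags & FAULT_FLAG_VMA_LOCK))
                return 0;
        vma_end_read(vma);
        return VM_FAULT_RETRY;
}

static vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
{
        struct vm_area_struct *vma = vmf->vma;

        if (likely(vma->anon_vma))
                return 0;
        if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
                /* Allocating the anon_vma needs the mmap_lock; retry. */
                vma_end_read(vma);
                return VM_FAULT_RETRY;
        }
        if (__anon_vma_prepare(vma))
                return VM_FAULT_OOM;
        return 0;
}

Together they implement exactly the two checks the commit message describes: bail out to the mmap_lock path when ->fault() is not VMA-lock safe, and when the anon_vma still has to be set up.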