Message ID | 20231010083143.19593-1-mgorman@techsingularity.net (mailing list archive)
---|---
Series | sched/numa: Complete scanning of partial and inactive VMAs
On 10/10/2023 2:01 PM, Mel Gorman wrote:
> NUMA Balancing currently uses PID fault activity within a VMA to
> determine if it is worth updating PTEs to trap NUMA hinting faults.
> While this reduces overhead, it misses two important corner cases.
> The first is that if Task A partially scans a VMA that is active and
> Task B resumes the scan but is inactive, then the remainder of the VMA
> may be missed. Similarly, if a VMA is inactive for a period of time then
> it may never be scanned again.
>
> Patches 1-3 improve the documentation of the current per-VMA tracking
> and add a trace point for scan activity. Patch 4 addresses a corner
> case where the PID activity information may not be reset after the
> expected timeout. Patches 5-6 complete the scanning of partial and
> inactive VMAs within the scan sequence.
>
> This could be improved further but it would deserve a separate series on
> top with supporting data justifying the change. Otherwise any gain/loss
> due to the additional changes could be masked by this series on its own.

Thank you Mel for the patches. I see Ingo has already taken the series into
sched/core. Here are my testing details FWIW.

SUT:
- 4th Generation EPYC System
- 2 x 128C/256T
- NPS1 mode

base:        6.6-rc4
patch_v1r5:  Mel's initial series with the prev_scan_seq = -1 fix
             Link: https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/ sched-numabselective-v1r5
             (May not be relevant, but I did see the numbers were even better
             for thread_alloc, so ..)
patch_v1r13: current series

numa01_thread_alloc
===================
                        base            patch_v1r5      patch_v1r13
real                    8m46.557s       8m29.040s       8m38.098s
user                    599m6.070s      268m38.140s     404m52.065s
sys                     3655m38.681s    3794m10.079s    3751m36.779s
numa_hit                394964680       396000482       393981391
numa_local              197351688       198242761       197008099
numa_other              197612992       197757721       196973292
numa_pte_updates        1160            790360          812
numa_hint_faults        755             729196          553
numa_hint_faults_local  754             410220          263
numa_pages_migrated     1               318976          290

numa01
======
                        base            patch_v1r5      patch_v1r13
real                    18m26.691s      17m31.770s      17m33.540s
user                    4501m40.194s    2148m7.993s     3295m57.897s
sys                     3483m11.684s    4764m57.876s    4215m35.599s
numa_hit                395473956       395813242       395000242
numa_local              197776626       198188480       197983594
numa_other              197697330       197624762       197016648
numa_pte_updates        1447            4625319         7142774
numa_hint_faults        1390            4947832         10313097
numa_hint_faults_local  1288            2758651         5354895
numa_pages_migrated     102             594803          960422

Thanks and Regards
- Raghu
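[Editor's note] For readers unfamiliar with the mechanism the cover letter
refers to: NUMA balancing tracks, per VMA, which PIDs have recently taken
hinting faults there, and only pays the cost of PTE updates when the
scanning task is one of them. The standalone C sketch below illustrates
that idea and why a scan resumed by an "inactive" task can skip the rest of
a VMA. The type, function, and hash names here are illustrative assumptions,
not the actual kernel code; the real logic lives around vma_is_accessed()
in kernel/sched/fair.c and differs in detail.

/*
 * Standalone sketch (not kernel code) of per-VMA PID activity tracking.
 * Each VMA keeps a small bitmask of hashed PIDs that recently faulted in
 * it; the scanner skips PTE updates for VMAs the current task has not
 * recently touched.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PID_MASK_BITS 64

struct vma_sketch {
	uint64_t pid_mask;	/* one bit per hashed PID seen faulting recently */
};

/* Hash a PID down to one of 64 bits (illustrative hash, not the kernel's). */
static unsigned int pid_bit(int pid)
{
	return ((unsigned int)pid * 2654435761u) % PID_MASK_BITS;
}

/* Record that @pid took a NUMA hinting fault in @vma. */
static void vma_note_fault(struct vma_sketch *vma, int pid)
{
	vma->pid_mask |= 1ull << pid_bit(pid);
}

/*
 * Only treat the VMA as worth scanning if the scanning task has recently
 * faulted in it. This check is what produces the two corner cases the
 * series fixes: a partial scan resumed by a task that never faulted here,
 * and a VMA that stays inactive long enough to never be rescanned.
 */
static bool vma_worth_scanning(const struct vma_sketch *vma, int pid)
{
	return vma->pid_mask & (1ull << pid_bit(pid));
}

int main(void)
{
	struct vma_sketch vma = { 0 };
	int task_a = 100, task_b = 200;

	vma_note_fault(&vma, task_a);	/* Task A is active in this VMA */

	/* Task A would scan the VMA; Task B, which never faulted here,
	 * would not, so a scan resumed by Task B skips the remainder. */
	printf("Task A scans: %d\n", vma_worth_scanning(&vma, task_a));
	printf("Task B scans: %d\n", vma_worth_scanning(&vma, task_b));
	return 0;
}

Patches 5-6 of the series address exactly this by ensuring that partially
scanned and long-inactive VMAs are still completed within a scan sequence.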
* Raghavendra K T <raghavendra.kt@amd.com> wrote:

> On 10/10/2023 2:01 PM, Mel Gorman wrote:
> > NUMA Balancing currently uses PID fault activity within a VMA to
> > determine if it is worth updating PTEs to trap NUMA hinting faults.
> > While this reduces overhead, it misses two important corner cases.
> > The first is that if Task A partially scans a VMA that is active and
> > Task B resumes the scan but is inactive, then the remainder of the VMA
> > may be missed. Similarly, if a VMA is inactive for a period of time then
> > it may never be scanned again.
> >
> > Patches 1-3 improve the documentation of the current per-VMA tracking
> > and add a trace point for scan activity. Patch 4 addresses a corner
> > case where the PID activity information may not be reset after the
> > expected timeout. Patches 5-6 complete the scanning of partial and
> > inactive VMAs within the scan sequence.
> >
> > This could be improved further but it would deserve a separate series on
> > top with supporting data justifying the change. Otherwise any gain/loss
> > due to the additional changes could be masked by this series on its own.
>
> Thank you Mel for the patches. I see Ingo has already taken the series into
> sched/core. Here are my testing details FWIW.

Thank you for testing the series, I've added your Tested-by to the final two
patches that change behavior materially:

  Tested-by: Raghavendra K T <raghavendra.kt@amd.com>

Thanks,

	Ingo