Message ID | 20231129032154.3710765-1-yosryahmed@google.com (mailing list archive) |
---|---|
Series | mm: memcg: subtree stats flushing and thresholds |
On Wed, Nov 29, 2023 at 03:21:48AM +0000, Yosry Ahmed wrote:
> This series attempts to address shortages in today's approach for memcg
> stats flushing, namely occasionally stale or expensive stat reads. The
> series does so by changing the threshold that we use to decide whether
> to trigger a flush to be per memcg instead of global (patch 3), and then
> changing flushing to be per memcg (i.e. subtree flushes) instead of
> global (patch 5).
>
> Patch 3 & 5 are the core of the series, and they include more details
> and testing results. The rest are either cleanups or prep work.
>
> This series replaces the "memcg: more sophisticated stats flushing"
> series [1], which also replaces another series, in a long list of
> attempts to improve memcg stats flushing. It is not a new version of
> the same patchset as it is a completely different approach. This is
> based on collected feedback from discussions on lkml in all previous
> attempts. Hopefully, this is the final attempt.
>
> There was a reported regression in v2 [2] for will-it-scale::fallocate
> benchmark. I believe this regression should not affect production
> workloads. This specific benchmark is allocating and freeing memory
> (using fallocate/ftruncate) at a rate that is much faster to make actual
> use of the memory. Testing this series on 100+ machines running
> production workloads did not show any practical regressions in page
> fault latency or allocation latency, but it showed great improvements in
> stats read time. I do not have numbers about the exact improvements for
> this series, but combined with another optimization for cgroup v1 [3] we
> see 5-10x improvements. A significant chunk of that is coming from the
> cgroup v1 optimization, but this series also made an improvement as
> reported by Domenico [4].
>
> v3 -> v4:
> - Rebased on top of mm-unstable + "workload-specific and memory
>   pressure-driven zswap writeback" series to fix conflicts [5].
>
> v3: https://lore.kernel.org/all/20231116022411.2250072-1-yosryahmed@google.com/
>
> [1]https://lore.kernel.org/lkml/20230913073846.1528938-1-yosryahmed@google.com/
> [2]https://lore.kernel.org/lkml/202310202303.c68e7639-oliver.sang@intel.com/
> [3]https://lore.kernel.org/lkml/20230803185046.1385770-1-yosryahmed@google.com/
> [4]https://lore.kernel.org/lkml/CAFYChMv_kv_KXOMRkrmTN-7MrfgBHMcK3YXv0dPYEL7nK77e2A@mail.gmail.com/
> [5]https://lore.kernel.org/all/20231127234600.2971029-1-nphamcs@gmail.com/
>
> Yosry Ahmed (5):
>   mm: memcg: change flush_next_time to flush_last_time
>   mm: memcg: move vmstats structs definition above flushing code
>   mm: memcg: make stats flushing threshold per-memcg
>   mm: workingset: move the stats flush into workingset_test_recent()
>   mm: memcg: restore subtree stats flushing
>
>  include/linux/memcontrol.h |   8 +-
>  mm/memcontrol.c            | 272 +++++++++++++++++++++----------------
>  mm/vmscan.c                |   2 +-
>  mm/workingset.c            |  42 ++++--
>  4 files changed, 188 insertions(+), 136 deletions(-)
>

No regressions when booting the kernel with this series applied.

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>