Message ID | 20190627115405.255259-1-minchan@kernel.org (mailing list archive) |
---|---|
Series | Introduce MADV_COLD and MADV_PAGEOUT |
On Thu, Jun 27, 2019 at 08:54:00PM +0900, Minchan Kim wrote:
> - Problem
>
> Naturally, cached apps were dominant consumers of memory on the system.
> However, they were not significant consumers of swap even though they are
> good candidates for swap. Under investigation, swapping out only begins
> once the low zone watermark is hit and kswapd wakes up, but the overall
> allocation rate in the system might trip lmkd thresholds and cause a cached
> process to be killed (we measured performance swapping out vs. zapping the
> memory by killing a process. Unsurprisingly, zapping is 10x faster
> even though we use zram which is much faster than real storage) so kill
> from lmkd will often satisfy the high zone watermark, resulting in very
> few pages actually being moved to swap.

Maybe we should check whether we do The Right Thing™ at the system-wide level
before introducing a new API? How does changing swappiness affect your
workloads? What is the swappiness value in your setup?

On Thu, Jun 27, 2019 at 09:06:01PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jun 27, 2019 at 08:54:00PM +0900, Minchan Kim wrote:
> > - Problem
> >
> > Naturally, cached apps were dominant consumers of memory on the system.
> > However, they were not significant consumers of swap even though they are
> > good candidates for swap. Under investigation, swapping out only begins
> > once the low zone watermark is hit and kswapd wakes up, but the overall
> > allocation rate in the system might trip lmkd thresholds and cause a cached
> > process to be killed (we measured performance swapping out vs. zapping the
> > memory by killing a process. Unsurprisingly, zapping is 10x faster
> > even though we use zram which is much faster than real storage) so kill
> > from lmkd will often satisfy the high zone watermark, resulting in very
> > few pages actually being moved to swap.
>
> Maybe we should check whether we do The Right Thing™ at the system-wide level
> before introducing a new API? How does changing swappiness affect your
> workloads? What is the swappiness value in your setup?

It was 100. I even tried 150 and 200 with a simple hack of swappiness.
However, it caused excessive swap-out. Anyway, system-level tuning is
generally good, but if a process can give a hint, that should work better,
and that is why the advise API exists.

Hi Folks,

Do you guys have comments? I think it has been pending long enough.
If there are no further comments, I want to ask for a merge.

Thanks.

On Thu, Jun 27, 2019 at 08:54:00PM +0900, Minchan Kim wrote:
> This patch is part of previous series:
> https://lore.kernel.org/lkml/20190531064313.193437-1-minchan@kernel.org/T/#u
> Originally, it was created for external madvise hinting feature.
>
> https://lkml.org/lkml/2019/5/31/463
> Michal wanted to separate the discussion from external hinting interface
> so this patchset includes only first part of my entire patchset
>
> - introduce MADV_COLD and MADV_PAGEOUT hint to madvise.
>
> However, I keep entire description for others for easier understanding
> why this kind of hint was born.
>
> Thanks.
>
> This patchset is against next-20190530.
>
> Below is description of previous entire patchset.
> ================= &< =====================
>
> - Background
>
> The Android terminology used for forking a new process and starting an app
> from scratch is a cold start, while resuming an existing app is a hot start.
> While we continually try to improve the performance of cold starts, hot
> starts will always be significantly less power hungry as well as faster so
> we are trying to make hot start more likely than cold start.
>
> To increase hot start, Android userspace manages the order that apps should
> be killed in a process called ActivityManagerService. ActivityManagerService
> tracks every Android app or service that the user could be interacting with
> at any time and translates that into a ranked list for lmkd(low memory
> killer daemon). They are likely to be killed by lmkd if the system has to
> reclaim memory. In that sense they are similar to entries in any other cache.
> Those apps are kept alive for opportunistic performance improvements but
> those performance improvements will vary based on the memory requirements of
> individual workloads.
>
> - Problem
>
> Naturally, cached apps were dominant consumers of memory on the system.
> However, they were not significant consumers of swap even though they are
> good candidates for swap. Under investigation, swapping out only begins
> once the low zone watermark is hit and kswapd wakes up, but the overall
> allocation rate in the system might trip lmkd thresholds and cause a cached
> process to be killed (we measured performance swapping out vs. zapping the
> memory by killing a process. Unsurprisingly, zapping is 10x faster
> even though we use zram which is much faster than real storage) so kill
> from lmkd will often satisfy the high zone watermark, resulting in very
> few pages actually being moved to swap.
>
> - Approach
>
> The approach we chose was to use a new interface to allow userspace to
> proactively reclaim entire processes by leveraging platform information.
> This allowed us to bypass the inaccuracy of the kernel’s LRUs for pages
> that are known to be cold from userspace and to avoid races with lmkd
> by reclaiming apps as soon as they entered the cached state. Additionally,
> it could provide many chances for platform to use much information to
> optimize memory efficiency.
>
> To achieve the goal, the patchset introduces two new options for madvise.
> One is MADV_COLD which will deactivate activated pages and the other is
> MADV_PAGEOUT which will reclaim private pages instantly. These new options
> complement MADV_DONTNEED and MADV_FREE by adding non-destructive ways to
> gain some free memory space. MADV_PAGEOUT is similar to MADV_DONTNEED in a way
> that it hints the kernel that memory region is not currently needed and
> should be reclaimed immediately; MADV_COLD is similar to MADV_FREE in a way
> that it hints the kernel that memory region is not currently needed and
> should be reclaimed when memory pressure rises.
>
> Minchan Kim (5):
>   mm: introduce MADV_COLD
>   mm: change PAGEREF_RECLAIM_CLEAN with PAGE_REFRECLAIM
>   mm: account nr_isolated_xxx in [isolate|putback]_lru_page
>   mm: introduce MADV_PAGEOUT
>   mm: factor out pmd young/dirty bit handling and THP split
>
>  include/linux/huge_mm.h                |   3 -
>  include/linux/swap.h                   |   2 +
>  include/uapi/asm-generic/mman-common.h |   2 +
>  mm/compaction.c                        |   2 -
>  mm/gup.c                               |   7 +-
>  mm/huge_memory.c                       |  74 -----
>  mm/internal.h                          |   2 +-
>  mm/khugepaged.c                        |   3 -
>  mm/madvise.c                           | 438 ++++++++++++++++++++++++-
>  mm/memory-failure.c                    |   3 -
>  mm/memory_hotplug.c                    |   4 -
>  mm/mempolicy.c                         |   6 +-
>  mm/migrate.c                           |  37 +--
>  mm/oom_kill.c                          |   2 +-
>  mm/swap.c                              |  42 +++
>  mm/vmscan.c                            |  86 ++++-
>  16 files changed, 566 insertions(+), 147 deletions(-)
>
> --
> 2.22.0.410.gd8fdbe21b5-goog
>

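For readers following the thread, here is a minimal userspace sketch of how the two hints described in the cover letter might be issued. It is an illustration under stated assumptions, not code from the patchset. The numeric fallbacks for MADV_COLD (20) and MADV_PAGEOUT (21) are the values used in mainline asm-generic headers and are assumed here only for libc headers that lack them. The helpers `hint_region` and `demo_nondestructive_hint` are hypothetical names. The demo checks the "non-destructive" property from the description: the data is still readable after the hint, whether or not the kernel acted on it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: mainline asm-generic values, used only if libc lacks them. */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: pass one hint to the kernel for [addr, addr + len).
 * Returns 1 if the kernel accepted the hint, 0 if it rejected the advice
 * value with EINVAL (typically a kernel without these hints), and -1 on
 * any other error.
 */
static int hint_region(void *addr, size_t len, int advice)
{
    assert(addr != NULL && len > 0);
    if (madvise(addr, len, advice) == 0)
        return 1;
    return errno == EINVAL ? 0 : -1;
}

/*
 * Hypothetical demo: fill a private anonymous mapping, hint it, then read
 * it back. Returns 1 if every byte survived, 0 if any byte changed, and
 * -1 if the mapping could not be created. The hint outcome is stored in
 * *hint_result (see hint_region).
 */
static int demo_nondestructive_hint(int advice, int *hint_result)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = 16 * (size_t)page;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0xAB, len);               /* touch every page */
    *hint_result = hint_region(buf, len, advice);

    int intact = 1;
    for (size_t i = 0; i < len; i++) {    /* reading faults pages back in */
        if (buf[i] != 0xAB) {
            intact = 0;
            break;
        }
    }
    munmap(buf, len);
    return intact;
}
```

Calling `demo_nondestructive_hint(MADV_PAGEOUT, &rc)` from a `main` should return 1 in all cases; `rc == 0` only indicates that the running kernel does not recognize the hint. By contrast, the same check after MADV_DONTNEED on this mapping would read back zeroes, which is the destructive behavior the new hints are meant to avoid.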
On Mon 01-07-19 16:38:48, Minchan Kim wrote:
> Hi Folks,
>
> Do you guys have comments? I think it has been pending long enough.
> If there are no further comments, I want to ask for a merge.

This is definitely on my todo list for this week. But please be patient.
It's been _one_ work day since you posted this last version so I do not
think this is stalling for too long.

Sure, the current version is probably not too different from the previous
one, but I haven't had a chance to review it in depth yet. All the code
duplication doesn't make it much easier, but I understand your reasoning
that sharing more code is not really straightforward.