Message ID | 20220214214921.419687-1-hannes@cmpxchg.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: page_io: fix psi memory pressure error on cold swapins | expand |
On Mon, 14 Feb 2022 16:49:21 -0500 Johannes Weiner <hannes@cmpxchg.org> wrote: > Once upon a time, all swapins counted toward memory pressure[1]. Then > Joonsoo introduced workingset detection for anonymous pages and we > gained the ability to distinguish hot from cold swapins[2][3]. But we > failed to update swap_readpage() accordingly, and now we account > partial memory pressure in the swapin path of cold memory. > > Not for all situations - which adds more inconsistency: paths using > the conventional submit_bio() and lock_page() route will not see much > pressure - unless storage itself is heavily congested and the bio > submissions stall. ZRAM and ZSWAP do most of the work directly from > swap_readpage() and will see all swapins reflected as pressure. > > Restore consistency by making all swapin stall accounting conditional > on the page actually being part of the workingset. Does this have any known runtime effects? If not, can we hazard a guess?
On Mon, Feb 14, 2022 at 02:48:05PM -0800, Andrew Morton wrote: > On Mon, 14 Feb 2022 16:49:21 -0500 Johannes Weiner <hannes@cmpxchg.org> wrote: > > > Once upon a time, all swapins counted toward memory pressure[1]. Then > > Joonsoo introduced workingset detection for anonymous pages and we > > gained the ability to distinguish hot from cold swapins[2][3]. But we > > failed to update swap_readpage() accordingly, and now we account > > partial memory pressure in the swapin path of cold memory. > > > > Not for all situations - which adds more inconsistency: paths using > > the conventional submit_bio() and lock_page() route will not see much > > pressure - unless storage itself is heavily congested and the bio > > submissions stall. ZRAM and ZSWAP do most of the work directly from > > swap_readpage() and will see all swapins reflected as pressure. > > > > Restore consistency by making all swapin stall accounting conditional > > on the page actually being part of the workingset. > > Does this have any known runtime effects? If not, can we > hazard a guess? hm, how about this paragrah between "not for all situations" and "restore consistency": IOW, a workload doing cold swapins could see little to no pressure reported with on-disk swap, but potentially high pressure with a zram or zswap backend. That confuses any psi-based health monitoring, load shedding, proactive reclaim, or userspace OOM killing schemes that might be in place for the workload.
On Mon, Feb 14, 2022 at 04:49:21PM -0500, Johannes Weiner wrote: > Once upon a time, all swapins counted toward memory pressure[1]. Then > Joonsoo introduced workingset detection for anonymous pages and we > gained the ability to distinguish hot from cold swapins[2][3]. But we > failed to update swap_readpage() accordingly, and now we account > partial memory pressure in the swapin path of cold memory. > > Not for all situations - which adds more inconsistency: paths using > the conventional submit_bio() and lock_page() route will not see much > pressure - unless storage itself is heavily congested and the bio > submissions stall. ZRAM and ZSWAP do most of the work directly from > swap_readpage() and will see all swapins reflected as pressure. > > Restore consistency by making all swapin stall accounting conditional > on the page actually being part of the workingset. > > [1] commit 937790699be9 ("mm/page_io.c: annotate refault stalls from swap_readpage") > [2] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU") > [3] commit cad8320b4b39 ("mm/swap: don't SetPageWorkingset unconditionally during swapin") > > Reported-by: CGEL <cgel.zte@gmail.com> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Minchan Kim <minchan@kernel.org>
diff --git a/mm/page_io.c b/mm/page_io.c index 61c792f916fa..f6296ee25014 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -359,6 +359,7 @@ int swap_readpage(struct page *page, bool synchronous) struct bio *bio; int ret = 0; struct swap_info_struct *sis = page_swap_info(page); + bool workingset = PageWorkingset(page); unsigned long pflags; VM_BUG_ON_PAGE(!PageSwapCache(page) && !synchronous, page); @@ -370,7 +371,8 @@ int swap_readpage(struct page *page, bool synchronous) * or the submitting cgroup IO-throttled, submission can be a * significant part of overall IO time. */ - psi_memstall_enter(&pflags); + if (workingset) + psi_memstall_enter(&pflags); delayacct_swapin_start(); if (frontswap_load(page) == 0) { @@ -431,7 +433,8 @@ int swap_readpage(struct page *page, bool synchronous) bio_put(bio); out: - psi_memstall_leave(&pflags); + if (workingset) + psi_memstall_leave(&pflags); delayacct_swapin_end(); return ret; }
Once upon a time, all swapins counted toward memory pressure[1]. Then Joonsoo introduced workingset detection for anonymous pages and we gained the ability to distinguish hot from cold swapins[2][3]. But we failed to update swap_readpage() accordingly, and now we account partial memory pressure in the swapin path of cold memory. Not for all situations - which adds more inconsistency: paths using the conventional submit_bio() and lock_page() route will not see much pressure - unless storage itself is heavily congested and the bio submissions stall. ZRAM and ZSWAP do most of the work directly from swap_readpage() and will see all swapins reflected as pressure. Restore consistency by making all swapin stall accounting conditional on the page actually being part of the workingset. [1] commit 937790699be9 ("mm/page_io.c: annotate refault stalls from swap_readpage") [2] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU") [3] commit cad8320b4b39 ("mm/swap: don't SetPageWorkingset unconditionally during swapin") Reported-by: CGEL <cgel.zte@gmail.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Yu Zhao <yuzhao@google.com> --- mm/page_io.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)