Message ID | 20201209012400.1771150-1-yuzhao@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm: don't SetPageWorkingset unconditionally during swapin |
On 12/9/20 2:24 AM, Yu Zhao wrote:
> We are capable of SetPageWorkingset based on refault distances after
> commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> This is done by workingset_refault(), which is right above the
> unconditional SetPageWorkingset deleted by this patch.
>
> The unconditional SetPageWorkingset miscategorizes pages that are
> read ahead or never belonged to the working set (e.g., tmpfs pages
> accessed by fd). When those pages are swapped in (after they were
> swapped out) for the first time, they skew PSI (when using
> async swap). When this happens again, depending on their refault
> distances, they might skew workingset_restore_anon counter in
> addition to PSI because their shadows say they were part of the
> working set.
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Makes sense, especially now that we have anonymous LRU support. The flag setting
in this context seems to go back all the way to 1899ad18c607 ("mm: workingset:
tell cache transitions from workingset thrashing") where I'm not sure why it was
even used on the anonymous page, when workingset was only implemented for the
page cache. Maybe Johannes remembers?

> ---
>  mm/swap_state.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 1a01235156d1..6ecc84448d75 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -536,7 +536,6 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  		workingset_refault(page, shadow);
>
>  	/* Caller will initiate read into locked page */
> -	SetPageWorkingset(page);
>  	lru_cache_add(page);
>  	*new_page_allocated = true;
>  	return page;
On Tue, Dec 08, 2020 at 06:24:00PM -0700, Yu Zhao wrote:
> We are capable of SetPageWorkingset based on refault distances after
> commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> This is done by workingset_refault(), which is right above the
> unconditional SetPageWorkingset deleted by this patch.
>
> The unconditional SetPageWorkingset miscategorizes pages that are
> read ahead or never belonged to the working set (e.g., tmpfs pages
> accessed by fd). When those pages are swapped in (after they were
> swapped out) for the first time, they skew PSI (when using
> async swap). When this happens again, depending on their refault
> distances, they might skew workingset_restore_anon counter in
> addition to PSI because their shadows say they were part of the
> working set.
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On Wed, Dec 09, 2020 at 10:18:22AM +0100, Vlastimil Babka wrote:
> On 12/9/20 2:24 AM, Yu Zhao wrote:
> > We are capable of SetPageWorkingset based on refault distances after
> > commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> > This is done by workingset_refault(), which is right above the
> > unconditional SetPageWorkingset deleted by this patch.
> >
> > The unconditional SetPageWorkingset miscategorizes pages that are
> > read ahead or never belonged to the working set (e.g., tmpfs pages
> > accessed by fd). When those pages are swapped in (after they were
> > swapped out) for the first time, they skew PSI (when using
> > async swap). When this happens again, depending on their refault
> > distances, they might skew workingset_restore_anon counter in
> > addition to PSI because their shadows say they were part of the
> > working set.
> >
> > Signed-off-by: Yu Zhao <yuzhao@google.com>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
> Makes sense, especially now that we have anonymous LRU support. The flag setting
> in this context seems to go back all the way to 1899ad18c607 ("mm: workingset:
> tell cache transitions from workingset thrashing") where I'm not sure why it was
> even used on the anonymous page, when workingset was only implemented for the
> page cache. Maybe Johannes remembers?

I just double checked that commit and the changelog is indeed
incomplete and doesn't mention the swap aspect. :(

That patch was part of the psi series. It was meant to mark incoming
pages under IO with SetPageWorkingset when waiting for them
constituted a memory stall.

On the page cache side, because we HAVE workingset detection, this was
specific to recently evicted pages that had been active in their
previous life. On the anon side, the aging algorithm had no
distinction between workingset and sporadically used pages. Given the
choice between a) no swapin stalls are pressure and b) all swapin
stalls are pressure, I went with the latter in order to detect swap
storms. The false positive case - high rate of swapin without severe
memory pressure - was relatively unlikely, because we tried to avoid
swapping until everything was completely on fire in the first place.

With the lru balancing rework, more prevalent use of proactive reclaim
etc. the distinction between hot and cold swapins became more
important. Thankfully, Joonsoo's patches made that possible.
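To make the a) versus b) trade-off concrete, here is a minimal toy model in C. It is not kernel code. It only assumes what the message above states: waiting on an incoming page under IO is treated as a memory stall when that page carries the workingset flag. Every name in it (`toy_policy`, `toy_swapin`, `toy_flags_workingset`, `toy_count_stalls`) is hypothetical and exists purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for flagging a swapped-in page as workingset. */
enum toy_policy {
	TOY_FLAG_NONE,     /* a) no swapin stall counts as pressure */
	TOY_FLAG_ALL,      /* b) every swapin stall counts as pressure */
	TOY_FLAG_REFAULT,  /* flag only pages that refault detection calls hot */
};

/* One swapin event in the toy model (hypothetical structure). */
struct toy_swapin {
	bool waits_for_io;  /* the faulting task has to wait for the read */
	bool hot;           /* refault detection would call this page workingset */
};

/* Would this policy set the workingset flag on the incoming page? */
static bool toy_flags_workingset(enum toy_policy policy,
				 const struct toy_swapin *ev)
{
	switch (policy) {
	case TOY_FLAG_ALL:
		return true;
	case TOY_FLAG_REFAULT:
		return ev->hot;
	case TOY_FLAG_NONE:
	default:
		return false;
	}
}

/* Count the events that would be accounted as memory stalls. */
static size_t toy_count_stalls(enum toy_policy policy,
			       const struct toy_swapin *events, size_t n)
{
	size_t stalls = 0;

	assert(events != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* A stall needs both a wait and a workingset-flagged page. */
		if (events[i].waits_for_io &&
		    toy_flags_workingset(policy, &events[i]))
			stalls++;
	}
	return stalls;
}
```

In this model, a burst of cold swapins (for example after proactive reclaim) reports no pressure under `TOY_FLAG_NONE`, full pressure under `TOY_FLAG_ALL`, and only the hot share under `TOY_FLAG_REFAULT`, which is the distinction the thread says became important.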
On Tue, Dec 08, 2020 at 06:24:00PM -0700, Yu Zhao wrote:
> We are capable of SetPageWorkingset based on refault distances after
> commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> This is done by workingset_refault(), which is right above the
> unconditional SetPageWorkingset deleted by this patch.
>
> The unconditional SetPageWorkingset miscategorizes pages that are
> read ahead or never belonged to the working set (e.g., tmpfs pages
> accessed by fd). When those pages are swapped in (after they were
> swapped out) for the first time, they skew PSI (when using
> async swap). When this happens again, depending on their refault
> distances, they might skew workingset_restore_anon counter in
> addition to PSI because their shadows say they were part of the
> working set.
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>

Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Thanks
On Thu, Dec 10, 2020 at 06:21:57AM -0500, Johannes Weiner wrote:
> On Wed, Dec 09, 2020 at 10:18:22AM +0100, Vlastimil Babka wrote:
> > On 12/9/20 2:24 AM, Yu Zhao wrote:
> > > We are capable of SetPageWorkingset based on refault distances after
> > > commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> > > This is done by workingset_refault(), which is right above the
> > > unconditional SetPageWorkingset deleted by this patch.
> > >
> > > The unconditional SetPageWorkingset miscategorizes pages that are
> > > read ahead or never belonged to the working set (e.g., tmpfs pages
> > > accessed by fd). When those pages are swapped in (after they were
> > > swapped out) for the first time, they skew PSI (when using
> > > async swap). When this happens again, depending on their refault
> > > distances, they might skew workingset_restore_anon counter in
> > > addition to PSI because their shadows say they were part of the
> > > working set.
> > >
> > > Signed-off-by: Yu Zhao <yuzhao@google.com>
> >
> > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> >
> > Makes sense, especially now that we have anonymous LRU support. The flag setting
> > in this context seems to go back all the way to 1899ad18c607 ("mm: workingset:
> > tell cache transitions from workingset thrashing") where I'm not sure why it was
> > even used on the anonymous page, when workingset was only implemented for the
> > page cache. Maybe Johannes remembers?
>
> I just double checked that commit and the changelog is indeed
> incomplete and doesn't mention the swap aspect. :(
>
> That patch was part of the psi series. It was meant to mark incoming
> pages under IO with SetPageWorkingset when waiting for them
> constituted a memory stall.
>
> On the page cache side, because we HAVE workingset detection, this was
> specific to recently evicted pages that had been active in their
> previous life. On the anon side, the aging algorithm had no
> distinction between workingset and sporadically used pages. Given the
> choice between a) no swapin stalls are pressure and b) all swapin
> stalls are pressure, I went with the latter in order to detect swap
> storms. The false positive case - high rate of swapin without severe
> memory pressure - was relatively unlikely, because we tried to avoid
> swapping until everything was completely on fire in the first place.

This was my guess too -- and it makes sense to go with b) at that
time. Thanks for confirming.

> With the lru balancing rework, more prevalent use of proactive reclaim
> etc. the distinction between hot and cold swapins became more
> important. Thankfully, Joonsoo's patches made that possible.
On Thu 10-12-20 06:21:57, Johannes Weiner wrote:
> On Wed, Dec 09, 2020 at 10:18:22AM +0100, Vlastimil Babka wrote:
> > On 12/9/20 2:24 AM, Yu Zhao wrote:
> > > We are capable of SetPageWorkingset based on refault distances after
> > > commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
> > > This is done by workingset_refault(), which is right above the
> > > unconditional SetPageWorkingset deleted by this patch.
> > >
> > > The unconditional SetPageWorkingset miscategorizes pages that are
> > > read ahead or never belonged to the working set (e.g., tmpfs pages
> > > accessed by fd). When those pages are swapped in (after they were
> > > swapped out) for the first time, they skew PSI (when using
> > > async swap). When this happens again, depending on their refault
> > > distances, they might skew workingset_restore_anon counter in
> > > addition to PSI because their shadows say they were part of the
> > > working set.
> > >
> > > Signed-off-by: Yu Zhao <yuzhao@google.com>
> >
> > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> >
> > Makes sense, especially now that we have anonymous LRU support. The flag setting
> > in this context seems to go back all the way to 1899ad18c607 ("mm: workingset:
> > tell cache transitions from workingset thrashing") where I'm not sure why it was
> > even used on the anonymous page, when workingset was only implemented for the
> > page cache. Maybe Johannes remembers?
>
> I just double checked that commit and the changelog is indeed
> incomplete and doesn't mention the swap aspect. :(
>
> That patch was part of the psi series. It was meant to mark incoming
> pages under IO with SetPageWorkingset when waiting for them
> constituted a memory stall.
>
> On the page cache side, because we HAVE workingset detection, this was
> specific to recently evicted pages that had been active in their
> previous life. On the anon side, the aging algorithm had no
> distinction between workingset and sporadically used pages. Given the
> choice between a) no swapin stalls are pressure and b) all swapin
> stalls are pressure, I went with the latter in order to detect swap
> storms. The false positive case - high rate of swapin without severe
> memory pressure - was relatively unlikely, because we tried to avoid
> swapping until everything was completely on fire in the first place.
>
> With the lru balancing rework, more prevalent use of proactive reclaim
> etc. the distinction between hot and cold swapins became more
> important. Thankfully, Joonsoo's patches made that possible.

This is useful information, thanks! Yu Zhao, can you make it into the
changelog so that we have it for future reference, please?

Feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 1a01235156d1..6ecc84448d75 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -536,7 +536,6 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		workingset_refault(page, shadow);
 
 	/* Caller will initiate read into locked page */
-	SetPageWorkingset(page);
 	lru_cache_add(page);
 	*new_page_allocated = true;
 	return page;
We are capable of SetPageWorkingset based on refault distances after
commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
This is done by workingset_refault(), which is right above the
unconditional SetPageWorkingset deleted by this patch.

The unconditional SetPageWorkingset miscategorizes pages that are
read ahead or never belonged to the working set (e.g., tmpfs pages
accessed by fd). When those pages are swapped in (after they were
swapped out) for the first time, they skew PSI (when using
async swap). When this happens again, depending on their refault
distances, they might skew workingset_restore_anon counter in
addition to PSI because their shadows say they were part of the
working set.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/swap_state.c | 1 -
 1 file changed, 1 deletion(-)
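
The effect described in this changelog can be sketched with a small, self-contained C model. This is a simplified illustration, not the kernel implementation: it assumes a refault check that compares the distance since eviction against a workingset size and restores the workingset flag only when the shadow entry recorded it. The types and functions (`toy_shadow`, `toy_page`, `toy_workingset_refault`, `toy_swapin`) and the `set_unconditionally` switch are hypothetical stand-ins; the switch models the single line the patch deletes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a shadow entry left behind at eviction (hypothetical). */
struct toy_shadow {
	unsigned long evicted_at;  /* eviction counter value when the page left */
	bool was_workingset;       /* page carried the workingset flag back then */
};

/* Toy stand-in for the page flags involved (hypothetical). */
struct toy_page {
	bool active;
	bool workingset;
};

/* Simplified refault check: flag the page only if it came back soon enough. */
static void toy_workingset_refault(struct toy_page *page,
				   const struct toy_shadow *shadow,
				   unsigned long now,
				   unsigned long workingset_size)
{
	unsigned long refault_distance;

	assert(page != NULL && shadow != NULL);
	assert(now >= shadow->evicted_at);

	refault_distance = now - shadow->evicted_at;
	if (refault_distance > workingset_size)
		return;  /* evicted too long ago to count as workingset */

	page->active = true;
	if (shadow->was_workingset)
		page->workingset = true;
}

/* Toy swapin path; set_unconditionally models the line the patch removes. */
static struct toy_page toy_swapin(const struct toy_shadow *shadow,
				  unsigned long now,
				  unsigned long workingset_size,
				  bool set_unconditionally)
{
	struct toy_page page = { false, false };

	if (shadow != NULL)
		toy_workingset_refault(&page, shadow, now, workingset_size);
	if (set_unconditionally)
		page.workingset = true;
	return page;
}
```

With `set_unconditionally` true, a page swapped in without any shadow entry still ends up flagged as workingset, which is the miscategorization the changelog describes. With it false, the flag depends only on the shadow entry and the refault distance.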