Message ID | 20200922020148.3261797-3-riel@surriel.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm,swap: skip swap readahead for instant IO (like zswap) |
On Tue, Sep 22, 2020 at 10:02 AM Rik van Riel <riel@surriel.com> wrote:
>
> Check whether a swap page was obtained instantaneously, for example
> because it is in zswap, or on a very fast IO device which uses busy
> waiting, and we did not wait on IO to swap in this page.
> If no IO was needed to get the swap page we want, kicking off readahead
> on surrounding swap pages is likely to be counterproductive, because the
> extra loads will cause additional latency, use up extra memory, and chances
> are the surrounding pages in swap are just as fast to load as this one,
> making readahead pointless.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  mm/swap_state.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index aacb9ba53f63..6919f9d5fe88 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -637,6 +637,7 @@ static struct page *swap_cluster_read_one(swp_entry_t entry,
>  struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  				struct vm_fault *vmf)

Why not do this for swap_vma_readahead() too?  swap_cluster_read_one()
can be used in swap_vma_readahead() too.

>  {
> +	struct page *page;
>  	unsigned long entry_offset = swp_offset(entry);
>  	unsigned long offset = entry_offset;
>  	unsigned long start_offset, end_offset;
> @@ -668,11 +669,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  		end_offset = si->max - 1;
>
>  	blk_start_plug(&plug);
> +	/* If we read the page without waiting on IO, skip readahead. */
> +	page = swap_cluster_read_one(entry, offset, gfp_mask, vma, addr, false);
> +	if (page && PageUptodate(page))
> +		goto skip_unplug;
> +
> +	/* Ok, do the async read-ahead now. */
>  	for (offset = start_offset; offset <= end_offset ; offset++) {
> -		/* Ok, do the async read-ahead now */
> -		swap_cluster_read_one(entry, offset, gfp_mask, vma, addr,
> -				      offset != entry_offset);
> +		if (offset == entry_offset)
> +			continue;
> +		swap_cluster_read_one(entry, offset, gfp_mask, vma, addr, true);
>  	}
> +skip_unplug:
>  	blk_finish_plug(&plug);
>
>  	lru_add_drain();	/* Push any new pages onto the LRU now */

Best Regards,
Huang, Ying
On Tue, 2020-09-22 at 11:13 +0800, huang ying wrote:
> On Tue, Sep 22, 2020 at 10:02 AM Rik van Riel <riel@surriel.com>
> wrote:
> > Check whether a swap page was obtained instantaneously, for example
> > because it is in zswap, or on a very fast IO device which uses busy
> > waiting, and we did not wait on IO to swap in this page.
> > If no IO was needed to get the swap page we want, kicking off
> > readahead
> > on surrounding swap pages is likely to be counterproductive,
> > because the
> > extra loads will cause additional latency, use up extra memory, and
> > chances
> > are the surrounding pages in swap are just as fast to load as this
> > one,
> > making readahead pointless.
> >
> > Signed-off-by: Rik van Riel <riel@surriel.com>
> > ---
> >  mm/swap_state.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index aacb9ba53f63..6919f9d5fe88 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -637,6 +637,7 @@ static struct page
> > *swap_cluster_read_one(swp_entry_t entry,
> >  struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t
> > gfp_mask,
> >  				struct vm_fault *vmf)
>
> Why not do this for swap_vma_readahead()
> too?  swap_cluster_read_one()
> can be used in swap_vma_readahead() too.

Good point, I should do the same thing for swap_vma_readahead()
as well. Let me do that and send in a version 2 of the series.
On Mon, Sep 21, 2020 at 10:01:48PM -0400, Rik van Riel wrote:
> +	struct page *page;
>  	unsigned long entry_offset = swp_offset(entry);
>  	unsigned long offset = entry_offset;
>  	unsigned long start_offset, end_offset;
> @@ -668,11 +669,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  		end_offset = si->max - 1;
>
>  	blk_start_plug(&plug);
> +	/* If we read the page without waiting on IO, skip readahead. */
> +	page = swap_cluster_read_one(entry, offset, gfp_mask, vma, addr, false);
> +	if (page && PageUptodate(page))
> +		goto skip_unplug;
> +

At least for the normal block device path the plug will prevent the I/O
submission from actually happening and thus PageUptodate from becoming
true.  I think we need to split the different code paths more cleanly.

Btw, what device type and media did you test this with?  What kind of
numbers did you get on what workload?
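The plugging concern raised above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and none of this is kernel code. It assumes a device that completes a read at the moment the request is dispatched, whereas real block I/O completes asynchronously some time after dispatch. It also does not model backends such as zswap that can fill a page without going through the block layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PLUG_MAX 8

/* Hypothetical page with a single "contents are valid" flag. */
struct toy_page {
	bool uptodate;
};

/* Hypothetical plug: holds submitted requests back until it is finished. */
struct toy_plug {
	struct toy_page *queued[TOY_PLUG_MAX];
	size_t nr;
	bool active;
};

static void toy_start_plug(struct toy_plug *plug)
{
	plug->nr = 0;
	plug->active = true;
}

/* Submit a read. While the plug is active the request is only queued. */
static void toy_submit_read(struct toy_plug *plug, struct toy_page *page)
{
	if (plug->active) {
		assert(plug->nr < TOY_PLUG_MAX);
		plug->queued[plug->nr++] = page;
		return;
	}
	page->uptodate = true;	/* toy assumption: completes at dispatch */
}

/* Dispatch everything that was queued while the plug was active. */
static void toy_finish_plug(struct toy_plug *plug)
{
	size_t i;

	plug->active = false;
	for (i = 0; i < plug->nr; i++)
		plug->queued[i]->uptodate = true;
	plug->nr = 0;
}

/* Check the page while the plug is still held, as the patch does. */
static bool toy_check_under_plug(void)
{
	struct toy_plug plug;
	struct toy_page page = { .uptodate = false };
	bool seen;

	toy_start_plug(&plug);
	toy_submit_read(&plug, &page);
	seen = page.uptodate;	/* request is still queued at this point */
	toy_finish_plug(&plug);
	return seen;
}

/* Check the page only after the plug has been finished. */
static bool toy_check_after_unplug(void)
{
	struct toy_plug plug;
	struct toy_page page = { .uptodate = false };

	toy_start_plug(&plug);
	toy_submit_read(&plug, &page);
	toy_finish_plug(&plug);
	return page.uptodate;
}
```

In this model the check made under the plug never sees an up-to-date page, which mirrors the objection for the normal block device path: the test in the patch could only succeed for backends that bypass the plug.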
diff --git a/mm/swap_state.c b/mm/swap_state.c
index aacb9ba53f63..6919f9d5fe88 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -637,6 +637,7 @@ static struct page *swap_cluster_read_one(swp_entry_t entry,
 struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 				struct vm_fault *vmf)
 {
+	struct page *page;
 	unsigned long entry_offset = swp_offset(entry);
 	unsigned long offset = entry_offset;
 	unsigned long start_offset, end_offset;
@@ -668,11 +669,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 		end_offset = si->max - 1;
 
 	blk_start_plug(&plug);
+	/* If we read the page without waiting on IO, skip readahead. */
+	page = swap_cluster_read_one(entry, offset, gfp_mask, vma, addr, false);
+	if (page && PageUptodate(page))
+		goto skip_unplug;
+
+	/* Ok, do the async read-ahead now. */
 	for (offset = start_offset; offset <= end_offset ; offset++) {
-		/* Ok, do the async read-ahead now */
-		swap_cluster_read_one(entry, offset, gfp_mask, vma, addr,
-				      offset != entry_offset);
+		if (offset == entry_offset)
+			continue;
+		swap_cluster_read_one(entry, offset, gfp_mask, vma, addr, true);
 	}
+skip_unplug:
 	blk_finish_plug(&plug);
 
 	lru_add_drain();	/* Push any new pages onto the LRU now */
Check whether a swap page was obtained instantaneously, for example
because it is in zswap, or on a very fast IO device which uses busy
waiting, and we did not wait on IO to swap in this page.
If no IO was needed to get the swap page we want, kicking off readahead
on surrounding swap pages is likely to be counterproductive, because the
extra loads will cause additional latency, use up extra memory, and chances
are the surrounding pages in swap are just as fast to load as this one,
making readahead pointless.

Signed-off-by: Rik van Riel <riel@surriel.com>
---
 mm/swap_state.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
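The control flow described in the commit message can be sketched as a userspace model: read the faulting slot first, and only walk the rest of the readahead window if that read did not complete immediately. All `toy_*` names are hypothetical stand-ins, not kernel APIs, and the single `instant` flag is an assumed simplification of "the page came back without waiting on IO".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a swap backend; not a kernel structure. */
struct toy_swap {
	bool instant;		/* true: reads complete synchronously (zswap-like) */
	size_t nr_slots;	/* number of swap slots on this backend */
	unsigned int reads;	/* total reads issued so far */
};

/* Issue one read; returns true if the page is up to date on return. */
static bool toy_read_one(struct toy_swap *sw, size_t offset)
{
	assert(offset < sw->nr_slots);
	sw->reads++;
	return sw->instant;
}

/*
 * Read the target slot first. If it is up to date immediately, skip the
 * window; otherwise read every other slot in [start, end].
 * Returns the number of reads issued by this call.
 */
static unsigned int toy_cluster_readahead(struct toy_swap *sw, size_t target,
					  size_t start, size_t end)
{
	unsigned int before = sw->reads;
	size_t off;

	assert(sw->nr_slots > 0);
	if (end >= sw->nr_slots)
		end = sw->nr_slots - 1;

	/* Instant backend: the neighbours are presumably just as cheap later. */
	if (toy_read_one(sw, target))
		return sw->reads - before;

	for (off = start; off <= end; off++) {
		if (off == target)
			continue;	/* already read above */
		toy_read_one(sw, off);
	}
	return sw->reads - before;
}
```

With a window of slots 4 to 7 and a fault on slot 5, the instant backend issues one read in this model, while the slow backend issues four (the target plus slots 4, 6 and 7).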