Message ID | 165146746627.24404.2324091720943354711@noble.neil.brown.name (mailing list archive) |
---|---|
State | New |
Series | MM: handle THP in swap_*page_fs() - count_vm_events() |
On Mon, May 02, 2022 at 02:57:46PM +1000, NeilBrown wrote:
> @@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
>  			struct page *page = sio->bvec[p].bv_page;
>
>  			SetPageUptodate(page);
> +			count_swpout_vm_event(page);
>  			unlock_page(page);
>  		}
> -		count_vm_events(PSWPIN, sio->pages);

Surely that should be count_swpIN_vm_event?
On Mon, 02 May 2022, Matthew Wilcox wrote:
> On Mon, May 02, 2022 at 02:57:46PM +1000, NeilBrown wrote:
> > @@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
> >  			struct page *page = sio->bvec[p].bv_page;
> >
> >  			SetPageUptodate(page);
> > +			count_swpout_vm_event(page);
> >  			unlock_page(page);
> >  		}
> > -		count_vm_events(PSWPIN, sio->pages);
>
> Surely that should be count_swpIN_vm_event?
>

I'm not having a good day....

Certainly shouldn't be swpout. There isn't a count_swpin_vm_event().

swap_readpage() only counts once for each page no matter how big it is.
While swap_writepage() counts one for each PAGE_SIZE written.

And we have THP_SWPOUT but not THP_SWPIN

And I cannot find where any of these counters are documented, so I cannot
say what is "correct".

Well.... arch/s390/appldata/appldata_mem.c says

	u64 pswpin;		/* pages swapped in  */
	u64 pswpout;		/* pages swapped out */

but that isn't exactly unambiguous, and is for s390 which doesn't
support THP_SWAP

Ho hum. I guess I put that back as it was.

Thanks for the review!!!

NeilBrown
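To make the asymmetry described above concrete, here is a minimal userspace sketch in C. It is not kernel code: `struct model_page`, the `vm_events` array, and every `model_*` function are hypothetical stand-ins invented for illustration, and the sketch assumes that the write-side helper also bumps THP_SWPOUT once for a huge page. It only models the two counting rules as stated: one PSWPOUT per PAGE_SIZE written, one PSWPIN per page read regardless of its size.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the kernel's vm event counters. */
enum vm_event { PSWPIN, PSWPOUT, THP_SWPOUT, NR_EVENTS };
static unsigned long vm_events[NR_EVENTS];

/* Simplified page: only records how many PAGE_SIZE units it spans. */
struct model_page {
	unsigned int nr_pages;
};

static void model_count_events(enum vm_event ev, unsigned long n)
{
	vm_events[ev] += n;
}

/*
 * Write side as described in the thread: one PSWPOUT per PAGE_SIZE
 * written. Assumption: a huge page additionally counts one THP_SWPOUT.
 */
static void model_count_swpout(const struct model_page *page)
{
	assert(page->nr_pages >= 1);
	if (page->nr_pages > 1)
		model_count_events(THP_SWPOUT, 1);
	model_count_events(PSWPOUT, page->nr_pages);
}

/* Read side as described: one PSWPIN per page, however big it is. */
static void model_count_swpin(const struct model_page *page)
{
	assert(page->nr_pages >= 1);
	model_count_events(PSWPIN, 1);
}
```

Under these rules, passing one 512-page huge page through both helpers would leave PSWPOUT at 512 and PSWPIN at 1, which is the mismatch that makes it hard to say which count is "correct" without documentation.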
On Mon, May 02, 2022 at 03:28:49PM +1000, NeilBrown wrote:
> On Mon, 02 May 2022, Matthew Wilcox wrote:
> > On Mon, May 02, 2022 at 02:57:46PM +1000, NeilBrown wrote:
> > > @@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
> > >  			struct page *page = sio->bvec[p].bv_page;
> > >
> > >  			SetPageUptodate(page);
> > > +			count_swpout_vm_event(page);
> > >  			unlock_page(page);
> > >  		}
> > > -		count_vm_events(PSWPIN, sio->pages);
> >
> > Surely that should be count_swpIN_vm_event?
> >
> I'm not having a good day....
>
> Certainly shouldn't be swpout. There isn't a count_swpin_vm_event().
>
> swap_readpage() only counts once for each page no matter how big it is.
> While swap_writepage() counts one for each PAGE_SIZE written.
>
> And we have THP_SWPOUT but not THP_SWPIN

_If_ I understand the swap-in patch correctly (at least as invoked by
shmem), it won't attempt to swap in an entire THP. Even if it swapped
out an order-9 page, it will bring in order-0 pages from swap, and then
rely on khugepaged to reassemble them.

Someone who actually understands the swap code should check that my
explanation here is correct.
On Sun, May 1, 2022 at 10:32 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, May 02, 2022 at 03:28:49PM +1000, NeilBrown wrote:
> > On Mon, 02 May 2022, Matthew Wilcox wrote:
> > > On Mon, May 02, 2022 at 02:57:46PM +1000, NeilBrown wrote:
> > > > @@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
> > > >  			struct page *page = sio->bvec[p].bv_page;
> > > >
> > > >  			SetPageUptodate(page);
> > > > +			count_swpout_vm_event(page);
> > > >  			unlock_page(page);
> > > >  		}
> > > > -		count_vm_events(PSWPIN, sio->pages);
> > >
> > > Surely that should be count_swpIN_vm_event?
> > >
> > I'm not having a good day....
> >
> > Certainly shouldn't be swpout. There isn't a count_swpin_vm_event().
> >
> > swap_readpage() only counts once for each page no matter how big it is.
> > While swap_writepage() counts one for each PAGE_SIZE written.
> >
> > And we have THP_SWPOUT but not THP_SWPIN
>
> _If_ I understand the swap-in patch correctly (at least as invoked by
> shmem), it won't attempt to swap in an entire THP. Even if it swapped
> out an order-9 page, it will bring in order-0 pages from swap, and then
> rely on khugepaged to reassemble them.

Totally correct. The try_to_unmap() called by vmscan would split PMD
to PTEs then install swap entries for each PTE but keep the huge page
unsplit.

BTW, there were patches adding THP swapin support, but they were never merged.

>
> Someone who actually understands the swap code should check that my
> explanation here is correct.
>
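As a rough illustration of the behaviour confirmed above, the following C sketch models one order-9 page being swapped out as a unit and then read back as individual order-0 pages. Everything here is hypothetical (`struct model_counters`, `MODEL_THP_ORDER`, and both `model_*` functions are invented names, not kernel interfaces), and it assumes the write side counts one PSWPOUT per base page plus one THP_SWPOUT.

```c
#include <assert.h>

#define MODEL_THP_ORDER 9	/* order-9 page: 512 base pages */

/* Hypothetical counters; not the kernel's vm_event machinery. */
struct model_counters {
	unsigned long pswpin;
	unsigned long pswpout;
	unsigned long thp_swpout;
};

/* Swap out one huge page of the given order as a single unit. */
static void model_swap_out_huge(struct model_counters *c, unsigned int order)
{
	assert(order < 32);
	c->thp_swpout += 1;
	c->pswpout += 1UL << order;	/* one count per base page written */
}

/*
 * Swap the same range back in as order-0 pages, one read per base
 * page. Returns the number of reads issued.
 */
static unsigned long model_swap_in_base_pages(struct model_counters *c,
					      unsigned int order)
{
	unsigned long nr = 1UL << order;
	unsigned long i;

	assert(order < 32);
	for (i = 0; i < nr; i++)
		c->pswpin += 1;	/* each read covers exactly one base page */
	return nr;
}
```

In this model PSWPIN and PSWPOUT end up equal after a full round trip, because every read is an order-0 page. The counting rule on the read side would only matter if a huge page were ever read in as a single unit.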
On 2022/5/7 1:26, Yang Shi wrote:
> On Sun, May 1, 2022 at 10:32 PM Matthew Wilcox <willy@infradead.org> wrote:
>>
>> On Mon, May 02, 2022 at 03:28:49PM +1000, NeilBrown wrote:
>>> On Mon, 02 May 2022, Matthew Wilcox wrote:
>>>> On Mon, May 02, 2022 at 02:57:46PM +1000, NeilBrown wrote:
>>>>> @@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
>>>>>  			struct page *page = sio->bvec[p].bv_page;
>>>>>
>>>>>  			SetPageUptodate(page);
>>>>> +			count_swpout_vm_event(page);
>>>>>  			unlock_page(page);
>>>>>  		}
>>>>> -		count_vm_events(PSWPIN, sio->pages);
>>>>
>>>> Surely that should be count_swpIN_vm_event?
>>>>
>>> I'm not having a good day....
>>>
>>> Certainly shouldn't be swpout. There isn't a count_swpin_vm_event().
>>>
>>> swap_readpage() only counts once for each page no matter how big it is.
>>> While swap_writepage() counts one for each PAGE_SIZE written.
>>>
>>> And we have THP_SWPOUT but not THP_SWPIN
>>
>> _If_ I understand the swap-in patch correctly (at least as invoked by
>> shmem), it won't attempt to swap in an entire THP. Even if it swapped
>> out an order-9 page, it will bring in order-0 pages from swap, and then
>> rely on khugepaged to reassemble them.
>
> Totally correct. The try_to_unmap() called by vmscan would split PMD
> to PTEs then install swap entries for each PTE but keep the huge page
> unsplit.
>
> BTW, there were patches adding THP swapin support, but they were never merged.

Could you please tell me where the THP swapin patches are? It would be really
helpful if you can kindly figure that out for me! :)

Thanks a lot!

>
>>
>> Someone who actually understands the swap code should check that my
>> explanation here is correct.
>>
diff --git a/mm/page_io.c b/mm/page_io.c
index d636a3531cad..3e2e9029ce50 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -280,8 +280,10 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
 			set_page_dirty(page);
 			ClearPageReclaim(page);
 		}
-	} else
-		count_vm_events(PSWPOUT, sio->pages);
+	} else {
+		for (p = 0; p < sio->pages; p++)
+			count_swpout_vm_event(sio->bvec[p].bv_page);
+	}
 
 	for (p = 0; p < sio->pages; p++)
 		end_page_writeback(sio->bvec[p].bv_page);
@@ -390,9 +392,9 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
 			struct page *page = sio->bvec[p].bv_page;
 
 			SetPageUptodate(page);
+			count_swpout_vm_event(page);
 			unlock_page(page);
 		}
-		count_vm_events(PSWPIN, sio->pages);
 	} else {
 		for (p = 0; p < sio->pages; p++) {
 			struct page *page = sio->bvec[p].bv_page;
We need to use count_swpout_vm_event() for sio_write_complete() and
sio_read_complete(), to get correct counting.

This patch should be squashed into
    MM: handle THP in swap_*page_fs()

Reported-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: NeilBrown <neilb@suse.de>
---
 mm/page_io.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)