Message ID | 20240826085056.895865-1-zhaoyang.huang@unisoc.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [RFC,1/1] mm: Skip folio with private data during isolation | expand |
On 26.08.24 10:50, zhaoyang.huang wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > Since clean target folio with private data will be given up finally in > __remove_mapping as it has extra refcnt, it is better to skip it during > isolation to save the slot for more qualified folio. Current one could > be the candidate for next round of scanning after the private data gone. > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > --- > mm/vmscan.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index cfa839284b92..755bf3a387f3 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1685,6 +1685,8 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan, > */ > scan += nr_pages; > > + if (folio_test_private(folio) && !folio_test_dirty(folio)) > + goto move; > if (!folio_test_lru(folio)) > goto move; > if (!sc->may_unmap && folio_mapped(folio)) An earlier filemap_release_folio() would have failed if the private data (buffers) cannot get freed, and we went into the activate_locked path. if (folio_needs_release(folio)) { if (!filemap_release_folio(folio, sc->gfp_mask) goto activate_locked; ... if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) { ... } else if (!mapping || !__remove_mapping(mapping, folio, true, } At least on the shrink_folio_list() path, I'm not sure the code you are adding could even trigger. We should not reach __remove_mapping() with folio_test_private().
On Mon, Aug 26, 2024 at 6:36 PM David Hildenbrand <david@redhat.com> wrote: > > On 26.08.24 10:50, zhaoyang.huang wrote: > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > Since clean target folio with private data will be given up finally in > > __remove_mapping as it has extra refcnt, it is better to skip it during > > isolation to save the slot for more qualified folio. Current one could > > be the candidate for next round of scanning after the private data gone. > > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > --- > > mm/vmscan.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index cfa839284b92..755bf3a387f3 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -1685,6 +1685,8 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan, > > */ > > scan += nr_pages; > > > > + if (folio_test_private(folio) && !folio_test_dirty(folio)) > > + goto move; > > if (!folio_test_lru(folio)) > > goto move; > > if (!sc->may_unmap && folio_mapped(folio)) > > An earlier filemap_release_folio() would have failed if the private data > (buffers) cannot get freed, and we went into the activate_locked path. > > > if (folio_needs_release(folio)) { > if (!filemap_release_folio(folio, sc->gfp_mask) > goto activate_locked; > ... > > if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) { > ... > } else if (!mapping || !__remove_mapping(mapping, folio, true, > } > > At least on the shrink_folio_list() path, I'm not sure the code you are > adding could even trigger. We should not reach __remove_mapping() with > folio_test_private(). Thanks for heads up. You are right, the bh is judged if existing before __remove_mapping. ASAIU, the metadata associated with the bh has risk to be freed such as journal data etc or it introduces extra IO. Actually, this patch is inspired by a practical problem we just run across which the bh remains on LRU for a long time since it is attached to a journal_head that can not be freed by jbd2. > > -- > Cheers, > > David / dhildenb >
On 26.08.24 13:10, Zhaoyang Huang wrote: > On Mon, Aug 26, 2024 at 6:36 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 26.08.24 10:50, zhaoyang.huang wrote: >>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> >>> >>> Since clean target folio with private data will be given up finally in >>> __remove_mapping as it has extra refcnt, it is better to skip it during >>> isolation to save the slot for more qualified folio. Current one could >>> be the candidate for next round of scanning after the private data gone. >>> >>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> >>> --- >>> mm/vmscan.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>> index cfa839284b92..755bf3a387f3 100644 >>> --- a/mm/vmscan.c >>> +++ b/mm/vmscan.c >>> @@ -1685,6 +1685,8 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan, >>> */ >>> scan += nr_pages; >>> >>> + if (folio_test_private(folio) && !folio_test_dirty(folio)) >>> + goto move; >>> if (!folio_test_lru(folio)) >>> goto move; >>> if (!sc->may_unmap && folio_mapped(folio)) >> >> An earlier filemap_release_folio() would have failed if the private data >> (buffers) cannot get freed, and we went into the activate_locked path. >> >> >> if (folio_needs_release(folio)) { >> if (!filemap_release_folio(folio, sc->gfp_mask) >> goto activate_locked; >> ... >> >> if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) { >> ... >> } else if (!mapping || !__remove_mapping(mapping, folio, true, >> } >> >> At least on the shrink_folio_list() path, I'm not sure the code you are >> adding could even trigger. We should not reach __remove_mapping() with >> folio_test_private(). > Thanks for heads up. You are right, the bh is judged if existing > before __remove_mapping. ASAIU, the metadata associated with the bh > has risk to be freed such as journal data etc or it introduces extra > IO. Actually, this patch is inspired by a practical problem we just > run across which the bh remains on LRU for a long time since it is > attached to a journal_head that can not be freed by jbd2. Okay, but I assume (as stated) this patch does not have an affect on that, or am I missing something? I assume you have some way to test before/after, if you run into similar problems in practice.
On Mon, Aug 26, 2024 at 9:01 PM David Hildenbrand <david@redhat.com> wrote: > > On 26.08.24 13:10, Zhaoyang Huang wrote: > > On Mon, Aug 26, 2024 at 6:36 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 26.08.24 10:50, zhaoyang.huang wrote: > >>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > >>> > >>> Since clean target folio with private data will be given up finally in > >>> __remove_mapping as it has extra refcnt, it is better to skip it during > >>> isolation to save the slot for more qualified folio. Current one could > >>> be the candidate for next round of scanning after the private data gone. > >>> > >>> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > >>> --- > >>> mm/vmscan.c | 2 ++ > >>> 1 file changed, 2 insertions(+) > >>> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c > >>> index cfa839284b92..755bf3a387f3 100644 > >>> --- a/mm/vmscan.c > >>> +++ b/mm/vmscan.c > >>> @@ -1685,6 +1685,8 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan, > >>> */ > >>> scan += nr_pages; > >>> > >>> + if (folio_test_private(folio) && !folio_test_dirty(folio)) > >>> + goto move; > >>> if (!folio_test_lru(folio)) > >>> goto move; > >>> if (!sc->may_unmap && folio_mapped(folio)) > >> > >> An earlier filemap_release_folio() would have failed if the private data > >> (buffers) cannot get freed, and we went into the activate_locked path. > >> > >> > >> if (folio_needs_release(folio)) { > >> if (!filemap_release_folio(folio, sc->gfp_mask) > >> goto activate_locked; > >> ... > >> > >> if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) { > >> ... > >> } else if (!mapping || !__remove_mapping(mapping, folio, true, > >> } > >> > >> At least on the shrink_folio_list() path, I'm not sure the code you are > >> adding could even trigger. We should not reach __remove_mapping() with > >> folio_test_private(). > > Thanks for heads up. You are right, the bh is judged if existing > > before __remove_mapping. ASAIU, the metadata associated with the bh > > has risk to be freed such as journal data etc or it introduces extra > > IO. Actually, this patch is inspired by a practical problem we just > > run across which the bh remains on LRU for a long time since it is > > attached to a journal_head that can not be freed by jbd2. > > Okay, but I assume (as stated) this patch does not have an affect on > that, or am I missing something? No, this is actually a migration failure issue[1] related to cma_alloc where the bh keeps busy as the journal's transaction can't be launched[2][3][4][5][6]. I am just inspired by this issue to check if there is anything to do in reclaiming. By counting from a ramdump, there are 300MB "lru, private" pages found in a 6GB RAM system which could lower the reclaiming efficiency if the same scenario as above happens. [1] crash_arm64_v8.0.4++> kmem -p|grep ffffff808f0aa150(sb->s_bdev->bd_inode->i_mapping) fffffffe01a51c00 e9470000 ffffff808f0aa150 3 2 8000000008020 lru,private //within CMA area fffffffe03d189c0 174627000 ffffff808f0aa150 4 2 2004000000008020 lru,private fffffffe03d88e00 176238000 ffffff808f0aa150 3f9 2 2008000000008020 lru,private fffffffe03d88e40 176239000 ffffff808f0aa150 6 2 2008000000008020 lru,private fffffffe03d88e80 17623a000 ffffff808f0aa150 5 2 2008000000008020 lru,private fffffffe03d88ec0 17623b000 ffffff808f0aa150 1 2 2008000000008020 lru,private fffffffe03d88f00 17623c000 ffffff808f0aa150 0 2 2008000000008020 lru,private fffffffe040e6540 183995000 ffffff808f0aa150 3f4 2 2004000000008020 lru,private [2] page -> buffer_head crash_arm64_v8.0.4++> struct page.private fffffffe01a51c00 -x private = 0xffffff802fca0c00 [3] buffer_head -> journal_head crash_arm64_v8.0.4++> struct buffer_head.b_private 0xffffff802fca0c00 b_private = 0xffffff8041338e10, [4] journal_head -> b_cp_transaction crash_arm64_v8.0.4++> struct journal_head.b_cp_transaction 0xffffff8041338e10 -x b_cp_transaction = 0xffffff80410f1900, [5] transaction_t -> journal crash_arm64_v8.0.4++> struct transaction_t.t_journal 0xffffff80410f1900 -x t_journal = 0xffffff80e70f3000, [6] j_free & j_max_transaction_buffers (j_free > j_max_transaction_buffers, transaction would NOT be launched) crash_arm64_v8.0.4++> struct journal_t.j_free,j_max_transaction_buffers 0xffffff80e70f3000 -x j_free = 0x3f1, j_max_transaction_buffers = 0x100, > > I assume you have some way to test before/after, if you run into similar > problems in practice. We solve the migration issue by moving the affected file to a none-journal disk. Whereas, I think the buffered folio issue(carrying meta data) could be taken into consideration to do something. > > -- > Cheers, > > David / dhildenb >
On Tue, Aug 27, 2024 at 10:45:11AM +0800, Zhaoyang Huang wrote: > On Mon, Aug 26, 2024 at 9:01 PM David Hildenbrand <david@redhat.com> wrote: > > On 26.08.24 13:10, Zhaoyang Huang wrote: > > > On Mon, Aug 26, 2024 at 6:36 PM David Hildenbrand <david@redhat.com> wrote: > > >> An earlier filemap_release_folio() would have failed if the private data > > >> (buffers) cannot get freed, and we went into the activate_locked path. > > >> > > >> if (folio_needs_release(folio)) { > > >> if (!filemap_release_folio(folio, sc->gfp_mask) > > >> goto activate_locked; > > >> ... > No, this is actually a migration failure issue[1] related to cma_alloc > where the bh keeps busy as the journal's transaction can't be > launched[2][3][4][5][6]. I am just inspired by this issue to check if > there is anything to do in reclaiming. By counting from a ramdump, > there are 300MB "lru, private" pages found in a 6GB RAM system which > could lower the reclaiming efficiency if the same scenario as above > happens. As David said, if we encounter a filesystem folio with private data, we ask the filesystem to strip the private data off. Usually this succeeds, because most folios aren't part of the journal for very long.
On Tue, Aug 27, 2024 at 10:49 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Aug 27, 2024 at 10:45:11AM +0800, Zhaoyang Huang wrote: > > On Mon, Aug 26, 2024 at 9:01 PM David Hildenbrand <david@redhat.com> wrote: > > > On 26.08.24 13:10, Zhaoyang Huang wrote: > > > > On Mon, Aug 26, 2024 at 6:36 PM David Hildenbrand <david@redhat.com> wrote: > > > >> An earlier filemap_release_folio() would have failed if the private data > > > >> (buffers) cannot get freed, and we went into the activate_locked path. > > > >> > > > >> if (folio_needs_release(folio)) { > > > >> if (!filemap_release_folio(folio, sc->gfp_mask) > > > >> goto activate_locked; > > > >> ... > > > No, this is actually a migration failure issue[1] related to cma_alloc > > where the bh keeps busy as the journal's transaction can't be > > launched[2][3][4][5][6]. I am just inspired by this issue to check if > > there is anything to do in reclaiming. By counting from a ramdump, > > there are 300MB "lru, private" pages found in a 6GB RAM system which > > could lower the reclaiming efficiency if the same scenario as above > > happens. > > As David said, if we encounter a filesystem folio with private data, > we ask the filesystem to strip the private data off. Usually this > succeeds, because most folios aren't part of the journal for very long. Got it, thanks for clarifying.
diff --git a/mm/vmscan.c b/mm/vmscan.c index cfa839284b92..755bf3a387f3 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1685,6 +1685,8 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan, */ scan += nr_pages; + if (folio_test_private(folio) && !folio_test_dirty(folio)) + goto move; if (!folio_test_lru(folio)) goto move; if (!sc->may_unmap && folio_mapped(folio))