Message ID | 20240610143037.812955-1-usamaarif642@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: Do not start/end writeback for pages stored in zswap | expand |
On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@gmail.com> wrote: > > start/end writeback combination incorrectly increments NR_WRITTEN > counter, eventhough the pages aren't written to disk. Pages successfully > stored in zswap should just unlock folio and return from writepage. > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> > --- > mm/page_io.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/page_io.c b/mm/page_io.c > index a360857cf75d..501784d79977 100644 > --- a/mm/page_io.c > +++ b/mm/page_io.c > @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > return ret; > } > if (zswap_store(folio)) { > - folio_start_writeback(folio); > folio_unlock(folio); > - folio_end_writeback(folio); Removing these calls will have several effects, I am not really sure it's safe. 1. As you note in the commit log, NR_WRITTEN stats (and apparently others) will no longer be updated. While this may make sense, it's a user-visible change. I am not sure if anyone relies on this. 2. folio_end_writeback() calls folio_rotate_reclaimable() after writeback completes to put a folio that has been marked with PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do we get this call through other paths now? 3. If I remember correctly, there was some sort of state machine where folios go from dirty to writeback to clean. I am not sure what happens if we take the writeback phase out of the equation. Overall, the change seems like it will special case the folios written to zswap vs. to disk further, and we may end up missing important things (like folio_rotate_reclaimable()). I would like to see a much stronger argument for why it is safe and justified tbh :) > return 0; > } > if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) { > -- > 2.43.0 >
On 10/06/2024 18:31, Yosry Ahmed wrote: > On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@gmail.com> wrote: >> start/end writeback combination incorrectly increments NR_WRITTEN >> counter, eventhough the pages aren't written to disk. Pages successfully >> stored in zswap should just unlock folio and return from writepage. >> >> Signed-off-by: Usama Arif <usamaarif642@gmail.com> >> --- >> mm/page_io.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/mm/page_io.c b/mm/page_io.c >> index a360857cf75d..501784d79977 100644 >> --- a/mm/page_io.c >> +++ b/mm/page_io.c >> @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) >> return ret; >> } >> if (zswap_store(folio)) { >> - folio_start_writeback(folio); >> folio_unlock(folio); >> - folio_end_writeback(folio); > Removing these calls will have several effects, I am not really sure it's safe. > > 1. As you note in the commit log, NR_WRITTEN stats (and apparently > others) will no longer be updated. While this may make sense, it's a > user-visible change. I am not sure if anyone relies on this. Thanks for the review. This patch would correct NR_WRITTEN stat, so I think its a good thing? But yeah as you said for people relying on that stat, suddenly this number would be lowered if they pick up a kernel with this patch, not sure how such changes would be dealt with in the kernel. > 2. folio_end_writeback() calls folio_rotate_reclaimable() after > writeback completes to put a folio that has been marked with > PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do > we get this call through other paths now? We could add if (folio_test_reclaim(folio)) { folio_clear_reclaim(folio); folio_rotate_reclaimable(folio); } to if zswap_store is successful to fix this? > 3. If I remember correctly, there was some sort of state machine where > folios go from dirty to writeback to clean. I am not sure what happens > if we take the writeback phase out of the equation. > > Overall, the change seems like it will special case the folios written > to zswap vs. to disk further, and we may end up missing important > things (like folio_rotate_reclaimable()). I would like to see a much > stronger argument for why it is safe and justified tbh :) > The patch came about from zero page swap optimization series (https://lore.kernel.org/all/ZmcITDhdBzUGEHuY@casper.infradead.org/), where Matthew pointed out that NR_WRITTEN would be wrong with the way I was doing it. Overall, I thought it would be good to have consistency with how zeropages and zswap pages would be dealt with, and have a more correct NR_WRITTEN stat. In the next revision of the zero page patch, I will just add folio_rotate_reclaimable after folio_unlock if folio is zero filled. >> return 0; >> } >> if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) { >> -- >> 2.43.0 >>
On Mon, Jun 10, 2024 at 11:11 AM Usama Arif <usamaarif642@gmail.com> wrote: > > > On 10/06/2024 18:31, Yosry Ahmed wrote: > > On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@gmail.com> wrote: > >> start/end writeback combination incorrectly increments NR_WRITTEN > >> counter, eventhough the pages aren't written to disk. Pages successfully > >> stored in zswap should just unlock folio and return from writepage. > >> > >> Signed-off-by: Usama Arif <usamaarif642@gmail.com> > >> --- > >> mm/page_io.c | 2 -- > >> 1 file changed, 2 deletions(-) > >> > >> diff --git a/mm/page_io.c b/mm/page_io.c > >> index a360857cf75d..501784d79977 100644 > >> --- a/mm/page_io.c > >> +++ b/mm/page_io.c > >> @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > >> return ret; > >> } > >> if (zswap_store(folio)) { > >> - folio_start_writeback(folio); > >> folio_unlock(folio); > >> - folio_end_writeback(folio); > > Removing these calls will have several effects, I am not really sure it's safe. > > > > 1. As you note in the commit log, NR_WRITTEN stats (and apparently > > others) will no longer be updated. While this may make sense, it's a > > user-visible change. I am not sure if anyone relies on this. > > Thanks for the review. > > This patch would correct NR_WRITTEN stat, so I think its a good thing? > But yeah as you said for people relying on that stat, suddenly this > number would be lowered if they pick up a kernel with this patch, not > sure how such changes would be dealt with in the kernel. > > > 2. folio_end_writeback() calls folio_rotate_reclaimable() after > > writeback completes to put a folio that has been marked with > > PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do > > we get this call through other paths now? > > We could add > > if (folio_test_reclaim(folio)) { > folio_clear_reclaim(folio); > folio_rotate_reclaimable(folio); > } > > to if zswap_store is successful to fix this? > > > 3. If I remember correctly, there was some sort of state machine where > > folios go from dirty to writeback to clean. I am not sure what happens > > if we take the writeback phase out of the equation. > > > > Overall, the change seems like it will special case the folios written > > to zswap vs. to disk further, and we may end up missing important > > things (like folio_rotate_reclaimable()). I would like to see a much > > stronger argument for why it is safe and justified tbh :) > > > The patch came about from zero page swap optimization series > (https://lore.kernel.org/all/ZmcITDhdBzUGEHuY@casper.infradead.org/), > where Matthew pointed out that NR_WRITTEN would be wrong with the way I > was doing it. > > Overall, I thought it would be good to have consistency with how > zeropages and zswap pages would be dealt with, and have a more correct > NR_WRITTEN stat. > > In the next revision of the zero page patch, I will just add > folio_rotate_reclaimable after folio_unlock if folio is zero filled. I would wait until others weigh in here. I am not sure we can just change the stat handling from under the users, even if it is the right thing to do :/ I also think we need more analysis before we decide it's safe to remove the writeback calls otherwise. I noticed folio_rotate_reclaimable() on a quick look, but there may be other problems. I am not very familiar with the dirty -> writeback -> clean state machine. What's the benefit of this patch beyond making the code (and stats) make more sense semantically? It feels like a significant risk with little reward to me.
On Mon, Jun 10, 2024 at 10:31:36AM GMT, Yosry Ahmed wrote: > On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@gmail.com> wrote: > > > > start/end writeback combination incorrectly increments NR_WRITTEN > > counter, eventhough the pages aren't written to disk. Pages successfully > > stored in zswap should just unlock folio and return from writepage. > > > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> > > --- > > mm/page_io.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/mm/page_io.c b/mm/page_io.c > > index a360857cf75d..501784d79977 100644 > > --- a/mm/page_io.c > > +++ b/mm/page_io.c > > @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > > return ret; > > } > > if (zswap_store(folio)) { > > - folio_start_writeback(folio); > > folio_unlock(folio); > > - folio_end_writeback(folio); > > Removing these calls will have several effects, I am not really sure it's safe. > > 1. As you note in the commit log, NR_WRITTEN stats (and apparently > others) will no longer be updated. While this may make sense, it's a > user-visible change. I am not sure if anyone relies on this. > I couldn't imagine how this stat can be useful for the zswap case and I don't see much risk in changing this stat behavior for such cases. > 2. folio_end_writeback() calls folio_rotate_reclaimable() after > writeback completes to put a folio that has been marked with > PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do > we get this call through other paths now? > The folio_rotate_reclaimable() only makes sense for async writeback pages i.e. not for zswap where we synchronously reclaim the page. > 3. If I remember correctly, there was some sort of state machine where > folios go from dirty to writeback to clean. I am not sure what happens > if we take the writeback phase out of the equation. > Is there really such a state machine? We only trigger writeback if the page is dirty and we have cleared it. The only thing I can think of is the behavior of the waiters on PG_locked bit but the window of PG_writeback is so small that it seems like it does not matter.
On Mon, Jun 10, 2024 at 12:08 PM Shakeel Butt <shakeel.butt@linux.dev> wrote: > > On Mon, Jun 10, 2024 at 10:31:36AM GMT, Yosry Ahmed wrote: > > On Mon, Jun 10, 2024 at 7:31 AM Usama Arif <usamaarif642@gmail.com> wrote: > > > > > > start/end writeback combination incorrectly increments NR_WRITTEN > > > counter, eventhough the pages aren't written to disk. Pages successfully > > > stored in zswap should just unlock folio and return from writepage. > > > > > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> > > > --- > > > mm/page_io.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/mm/page_io.c b/mm/page_io.c > > > index a360857cf75d..501784d79977 100644 > > > --- a/mm/page_io.c > > > +++ b/mm/page_io.c > > > @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > > > return ret; > > > } > > > if (zswap_store(folio)) { > > > - folio_start_writeback(folio); > > > folio_unlock(folio); > > > - folio_end_writeback(folio); > > > > Removing these calls will have several effects, I am not really sure it's safe. > > > > 1. As you note in the commit log, NR_WRITTEN stats (and apparently > > others) will no longer be updated. While this may make sense, it's a > > user-visible change. I am not sure if anyone relies on this. > > > > I couldn't imagine how this stat can be useful for the zswap case and I > don't see much risk in changing this stat behavior for such cases. It seems like NR_WRITTEN is only used in 'global_dirty_state' trace event. NR_WRITEBACK and NR_ZONE_WRITE_PENDING are state counters, not event counters. They are incremented in folio_start_writeback() and decremented in folio_end_writeback(). They are probably just causing noise. I think for both cases it's probably fine and not really visible to userspace. > > > 2. folio_end_writeback() calls folio_rotate_reclaimable() after > > writeback completes to put a folio that has been marked with > > PG_reclaim at the tail of the LRU, to be reclaimed first next time. Do > > we get this call through other paths now? > > > > The folio_rotate_reclaimable() only makes sense for async writeback > pages i.e. not for zswap where we synchronously reclaim the page. Looking at pageout(), it seems like we will clear PG_reclaim if the folio is not under writeback, and in shrink_folio_list() if the folio is not dirty or under writeback, we will reclaim right away. I thought zswap being synchronous was an odd case, but apparently there is wider support for synchronous reclaim. Thanks for pointing this out. > > > 3. If I remember correctly, there was some sort of state machine where > > folios go from dirty to writeback to clean. I am not sure what happens > > if we take the writeback phase out of the equation. > > > > Is there really such a state machine? We only trigger writeback if the > page is dirty and we have cleared it. The only thing I can think of is > the behavior of the waiters on PG_locked bit but the window of > PG_writeback is so small that it seems like it does not matter. I remember Matthew talking about it during LSF/MM this year when he was discussing page flags, but maybe I am misremembering.
On Mon, Jun 10, 2024 at 03:30:37PM +0100, Usama Arif wrote: > start/end writeback combination incorrectly increments NR_WRITTEN > counter, eventhough the pages aren't written to disk. Pages successfully > stored in zswap should just unlock folio and return from writepage. > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> I also don't see anything in those start and end calls that wouldn't be pointless churn for these pages. Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On 2024/6/10 22:30, Usama Arif wrote: > start/end writeback combination incorrectly increments NR_WRITTEN > counter, eventhough the pages aren't written to disk. Pages successfully > stored in zswap should just unlock folio and return from writepage. > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> Looks good to me, thanks. Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> > --- > mm/page_io.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/page_io.c b/mm/page_io.c > index a360857cf75d..501784d79977 100644 > --- a/mm/page_io.c > +++ b/mm/page_io.c > @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > return ret; > } > if (zswap_store(folio)) { > - folio_start_writeback(folio); > folio_unlock(folio); > - folio_end_writeback(folio); > return 0; > } > if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
On 11/06/2024 10:53, Chengming Zhou wrote: > On 2024/6/10 22:30, Usama Arif wrote: >> start/end writeback combination incorrectly increments NR_WRITTEN >> counter, eventhough the pages aren't written to disk. Pages successfully >> stored in zswap should just unlock folio and return from writepage. >> >> Signed-off-by: Usama Arif <usamaarif642@gmail.com> > Looks good to me, thanks. > > Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Fororgot to add: Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> > >> --- >> mm/page_io.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/mm/page_io.c b/mm/page_io.c >> index a360857cf75d..501784d79977 100644 >> --- a/mm/page_io.c >> +++ b/mm/page_io.c >> @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) >> return ret; >> } >> if (zswap_store(folio)) { >> - folio_start_writeback(folio); >> folio_unlock(folio); >> - folio_end_writeback(folio); >> return 0; >> } >> if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
On Mon, Jun 10, 2024 at 03:30:37PM GMT, Usama Arif wrote: > start/end writeback combination incorrectly increments NR_WRITTEN > counter, eventhough the pages aren't written to disk. Pages successfully > stored in zswap should just unlock folio and return from writepage. > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> If Andrew has not picked this up, send a v2 with more detailed commit message, particularly why it is safe apply this change. You can use the explanation given by Yosry in response to my email. Also add text answering the other questions raised by Yosry. If Andrew has already picked it up, just request him to update the commit message. You can add: Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
On Tue, Jun 11, 2024 at 10:16 AM Shakeel Butt <shakeel.butt@linux.dev> wrote: > > On Mon, Jun 10, 2024 at 03:30:37PM GMT, Usama Arif wrote: > > start/end writeback combination incorrectly increments NR_WRITTEN > > counter, eventhough the pages aren't written to disk. Pages successfully > > stored in zswap should just unlock folio and return from writepage. > > > > Signed-off-by: Usama Arif <usamaarif642@gmail.com> > > If Andrew has not picked this up, send a v2 with more detailed commit > message, particularly why it is safe apply this change. You can use the > explanation given by Yosry in response to my email. Also add text > answering the other questions raised by Yosry. > > If Andrew has already picked it up, just request him to update the > commit message. Yes please, thanks Shakeel. I wanted to ask for this but got derailed :) With the updated commit log feel free to add: Acked-by: Yosry Ahmed <yosryahmed@google.com> > > You can add: > > Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> >
diff --git a/mm/page_io.c b/mm/page_io.c index a360857cf75d..501784d79977 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -196,9 +196,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) return ret; } if (zswap_store(folio)) { - folio_start_writeback(folio); folio_unlock(folio); - folio_end_writeback(folio); return 0; } if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
start/end writeback combination incorrectly increments NR_WRITTEN counter, eventhough the pages aren't written to disk. Pages successfully stored in zswap should just unlock folio and return from writepage. Signed-off-by: Usama Arif <usamaarif642@gmail.com> --- mm/page_io.c | 2 -- 1 file changed, 2 deletions(-)