
mm: zswap: fix pool refcount bug around shrink_worker()

Message ID: 20231006160024.170748-1-hannes@cmpxchg.org

Commit Message

Johannes Weiner Oct. 6, 2023, 4 p.m. UTC
When a zswap store fails due to the pool limit, it acquires a pool
reference and queues the shrink worker. When the worker runs, it
drops the reference. However, there can be multiple failed store
attempts before the worker wakes up and runs once. Since queue_work()
on an already-pending work item is a no-op, each additional attempt
takes a reference that is never dropped. This leaks references and
eventually produces saturation warnings for the pool refcount.

Fix this by dropping the reference again right away when queue_work()
reports that the work is already queued. This ensures exactly one
reference is held per worker run.
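For illustration, the general pattern is "take a reference for the
worker, and give it back if the work was already pending". A minimal
sketch of that pattern, not the zswap code itself; obj_get(),
obj_put() and do_shrink() are hypothetical stand-ins for the zswap
helpers:

	static void kick_worker(struct obj *obj)
	{
		obj_get(obj);				/* ref for the worker */
		if (!queue_work(wq, &obj->work))	/* already pending */
			obj_put(obj);			/* give back the extra ref */
	}

	static void worker_fn(struct work_struct *work)
	{
		struct obj *obj = container_of(work, struct obj, work);

		do_shrink(obj);
		obj_put(obj);				/* drop the queueing ref */
	}

queue_work() returns false when the work item was already on the
queue, so the caller learns it lost the race and must not transfer
its reference to the worker.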

Reported-by: Chris Mason <clm@fb.com>
Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
Cc: stable@vger.kernel.org	[5.6+]
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/zswap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Nhat Pham Oct. 6, 2023, 9:40 p.m. UTC | #1
On Fri, Oct 6, 2023 at 9:00 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> When a zswap store fails due to the pool limit, it acquires a pool
> reference and queues the shrink worker. When the worker runs, it
> drops the reference.
[...]
> -       if (pool)
> -               queue_work(shrink_wq, &pool->shrink_work);
> +       if (pool && !queue_work(shrink_wq, &pool->shrink_work))
> +               zswap_pool_put(pool);

Acked-by: Nhat Pham <nphamcs@gmail.com>

Random tangent: this asynchronous writeback mechanism has always
seemed kinda weird to me. We could get quite a bit of memory
inversion before the shrinker finally kicks in and frees up zswap
pool space. But I guess if it isn't broken, don't fix it.

Maybe a shrinker that proactively writes pages back as memory
pressure builds up could help ;) Something along the lines of the
sketch below.
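A rough sketch against the generic shrinker interface, just to make
the idea concrete. zswap_shrink_count(), zswap_shrink_scan() and
zswap_writeback_entries() are made-up names here; only
zswap_stored_pages and the shrinker API itself are real:

	#include <linux/shrinker.h>

	static unsigned long zswap_shrink_count(struct shrinker *s,
						struct shrink_control *sc)
	{
		/* freeable objects: stored entries we could write back */
		return atomic_read(&zswap_stored_pages);
	}

	static unsigned long zswap_shrink_scan(struct shrinker *s,
					       struct shrink_control *sc)
	{
		/* hypothetical helper: write back up to nr_to_scan entries,
		 * oldest (LRU) first, and return how many were freed */
		return zswap_writeback_entries(sc->nr_to_scan);
	}

	static struct shrinker zswap_shrinker = {
		.count_objects	= zswap_shrink_count,
		.scan_objects	= zswap_shrink_scan,
		.seeks		= DEFAULT_SEEKS,
	};

	/* in zswap init: */
	register_shrinker(&zswap_shrinker, "zswap");

That way reclaim pressure, rather than only the hard pool limit,
would drive writeback.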

Patch

diff --git a/mm/zswap.c b/mm/zswap.c
index 083c693602b8..37d2b1cb2ecb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1383,8 +1383,8 @@ bool zswap_store(struct folio *folio)
 
 shrink:
 	pool = zswap_pool_last_get();
-	if (pool)
-		queue_work(shrink_wq, &pool->shrink_work);
+	if (pool && !queue_work(shrink_wq, &pool->shrink_work))
+		zswap_pool_put(pool);
 	goto reject;
 }