Message ID | Z72FJnbA39zWh4zS@gondor.apana.org.au (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead | expand |
February 25, 2025 at 12:53 AM, "Herbert Xu" <herbert@gondor.apana.org.au> wrote: > > On Mon, Feb 24, 2025 at 01:53:21PM -0800, syzbot wrote: > > > syzbot found the following issue on: > > > > HEAD commit: e9a8cac0bf89 Merge tag 'v6.14-rc3-smb3-client-fixes' of gi.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=17b667f8580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=61cbf5ac8a063ad4 > > dashboard link: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82 > > > > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/8441f1b50402/disk-e9a8cac0.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/65b1f8d2f790/vmlinux-e9a8cac0.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/1d6f6d8c3d6b/bzImage-e9a8cac0.xz > > > > ---8<--- > > Call crypto_free_acomp outside of the mutex in zswap_cpu_comp_dead > as otherwise this could dead-lock as the allocation path may lead > back into zswap while holding the same lock. Zap the pointers to > acomp and buffer after freeing. > Also move the NULL check on acomp_ctx so that it takes place before > the mutex dereference. > > Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug") > Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path?
On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote: > > Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path? crypto_free_acomp does not allocate memory. However, it takes the same mutex that is also taken on the allocation path. The specific call path can be seen in the original report: https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82 Cheers,
On Wed, Feb 26, 2025 at 09:25:23AM +0800, Herbert Xu wrote: > On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote: > > > > Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path? > > crypto_free_acomp does not allocate memory. However, it takes > the same mutex that is also taken on the allocation path. > > The specific call path can be seen in the original report: > > https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82 After staring at this for a while I think the following situation could be the problem: Task A running on CPU #1: crypto_alloc_acomp_node() Holds scomp_lock Enters reclaim Reads per_cpu_ptr(pool->acomp_ctx, cpu) Task A is descheduled zswap_cpu_comp_dead(CPU #1) // CPU #1 going offline Holds per_cpu_ptr(pool->acomp_ctx, cpu)) Calls crypto_free_acomp() Waits for scomp_lock Task A running on CPU #2: Waits for per_cpu_ptr(pool->acomp_ctx, cpu) DEADLOCK In this case I think the fix is correct, thanks for looking into it. Could you please: (1) Explain the exact scenario in the commit log, I did not understand it at first, only after looking at the syzbot dashboard for a while (and I am not sure how long this persists). (2) Move all the freeing operations outside the mutex? Right now crypto_free_acomp() was the problematic call but it could be acomp_request_free() next. Something like: static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) { struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node); struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu); struct struct acomp_req *req; struct crypto_acomp *acomp; u8 *buffer; if (IS_ERR_OR_NULL(acomp_ctx)) return 0; mutex_lock(&acomp_ctx->mutex); req = acomp_ctx->req; acomp_ctx->req = NULL; acomp = acomp_ctx->acomp; acomp_ctx->acomp = NULL; buffer = acomp_ctx->buffer; acomp_ctx->buffer = NULL; mutex_unlock(&acomp_ctx->mutex); /* * Do the actual freeing after releasing the mutex to avoid subtle * locking dependencies causing deadlocks */ if (!IS_ERR_OR_NULL(req)) acomp_request_free(req); if (!IS_ERR_OR_NULL(acomp)) crypto_free_acomp(acomp); kfree(acomp_ctx->buffer); return 0; }
On Wed, Feb 26, 2025 at 02:08:14AM +0000, Yosry Ahmed wrote: > > Could you please: > > (1) Explain the exact scenario in the commit log, I did not understand > it at first, only after looking at the syzbot dashboard for a while (and > I am not sure how long this persists). > > (2) Move all the freeing operations outside the mutex? Right now > crypto_free_acomp() was the problematic call but it could be > acomp_request_free() next. > > Something like: Looks good to me. Feel free to send this patch since it is your system after all :) Thanks,
On Wed, Feb 26, 2025 at 10:10:22AM +0800, Herbert Xu wrote: > On Wed, Feb 26, 2025 at 02:08:14AM +0000, Yosry Ahmed wrote: > > > > Could you please: > > > > (1) Explain the exact scenario in the commit log, I did not understand > > it at first, only after looking at the syzbot dashboard for a while (and > > I am not sure how long this persists). > > > > (2) Move all the freeing operations outside the mutex? Right now > > crypto_free_acomp() was the problematic call but it could be > > acomp_request_free() next. > > > > Something like: > > Looks good to me. Feel free to send this patch since it is your > system after all :) Can do :) May I add your Co-developed-by and Signed-off-by since this would be based off your patch?
On Wed, Feb 26, 2025 at 02:46:53AM +0000, Yosry Ahmed wrote: > > Can do :) May I add your Co-developed-by and Signed-off-by since this > would be based off your patch? Sure you can also add my ack: Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Thanks,
diff --git a/mm/zswap.c b/mm/zswap.c index 6504174fbc6a..24d36266a791 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -881,18 +881,23 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) { struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node); struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu); + struct crypto_acomp *acomp = NULL; + + if (IS_ERR_OR_NULL(acomp_ctx)) + return 0; mutex_lock(&acomp_ctx->mutex); - if (!IS_ERR_OR_NULL(acomp_ctx)) { - if (!IS_ERR_OR_NULL(acomp_ctx->req)) - acomp_request_free(acomp_ctx->req); - acomp_ctx->req = NULL; - if (!IS_ERR_OR_NULL(acomp_ctx->acomp)) - crypto_free_acomp(acomp_ctx->acomp); - kfree(acomp_ctx->buffer); - } + if (!IS_ERR_OR_NULL(acomp_ctx->req)) + acomp_request_free(acomp_ctx->req); + acomp_ctx->req = NULL; + acomp = acomp_ctx->acomp; + acomp_ctx->acomp = NULL; + kfree(acomp_ctx->buffer); + acomp_ctx->buffer = NULL; mutex_unlock(&acomp_ctx->mutex); + crypto_free_acomp(acomp); + return 0; }