[v2] mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead()

Message ID: 20250226185625.2672936-1-yosry.ahmed@linux.dev (mailing list archive)
State: New
Series: [v2] mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead()

Commit Message

Yosry Ahmed Feb. 26, 2025, 6:56 p.m. UTC
Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
(through crypto_exit_scomp_ops_async()).

On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
(through crypto_scomp_init_tfm()), and then allocates memory.
If the allocation results in reclaim, we may attempt to hold the per-CPU
acomp_ctx mutex.

The above dependencies can cause an ABBA deadlock. For example in the
following scenario:

(1) Task A running on CPU #1:
    crypto_alloc_acomp_node()
      Holds scomp_lock
      Enters reclaim
      Reads per_cpu_ptr(pool->acomp_ctx, 1)

(2) Task A is descheduled

(3) CPU #1 goes offline
    zswap_cpu_comp_dead(CPU #1)
      Holds per_cpu_ptr(pool->acomp_ctx, 1)
      Calls crypto_free_acomp()
      Waits for scomp_lock

(4) Task A running on CPU #2:
      Waits for per_cpu_ptr(pool->acomp_ctx, 1) // Read on CPU #1
      DEADLOCK
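
For illustration only, the ABBA ordering above boils down to the
following self-contained userspace sketch (pthreads), where lock_a
stands in for scomp_lock and lock_b for the per-CPU acomp_ctx mutex;
the names and code are illustrative, not the actual kernel code:

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* "scomp_lock" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* "acomp_ctx mutex" */

/* Mirrors crypto_alloc_acomp_node(): takes A, then (via reclaim) waits on B. */
static void *alloc_path(void *arg)
{
	pthread_mutex_lock(&lock_a);
	sleep(1);			/* window in which reclaim would run */
	pthread_mutex_lock(&lock_b);	/* blocks: free_path holds B */
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return arg;
}

/* Mirrors zswap_cpu_comp_dead(): takes B, then waits on A via crypto_free_acomp(). */
static void *free_path(void *arg)
{
	pthread_mutex_lock(&lock_b);
	sleep(1);			/* let alloc_path take A first */
	pthread_mutex_lock(&lock_a);	/* blocks: alloc_path holds A */
	pthread_mutex_unlock(&lock_a);
	pthread_mutex_unlock(&lock_b);
	return arg;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, alloc_path, NULL);
	pthread_create(&t2, NULL, free_path, NULL);
	pthread_join(t1, NULL);		/* never returns: ABBA deadlock */
	pthread_join(t2, NULL);
	return 0;
}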

Since there is no requirement to call crypto_free_acomp() with the
per-CPU acomp_ctx mutex held in zswap_cpu_comp_dead(), move it after the
mutex is unlocked. Also move the acomp_request_free() and kfree() calls
for consistency and to avoid any potential subtle locking dependencies
in the future.

With this, only setting acomp_ctx fields to NULL occurs with the mutex
held. This is similar to how zswap_cpu_comp_prepare() only initializes
acomp_ctx fields with the mutex held, after performing all allocations
before holding the mutex.

Opportunistically, move the NULL check on acomp_ctx so that it takes
place before the mutex dereference.

Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67bcea51.050a0220.bbfd1.0096.GAE@google.com/
Cc: <stable@vger.kernel.org>
Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
---

v1 -> v2:
- Explained the problem more clearly in the commit message.
- Moved all freeing calls outside the lock critical section.
v1: https://lore.kernel.org/all/Z72FJnbA39zWh4zS@gondor.apana.org.au/

---
 mm/zswap.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

Comments

Eric Biggers Feb. 26, 2025, 8 p.m. UTC | #1
On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> (through crypto_exit_scomp_ops_async()).
> 
> On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> (through crypto_scomp_init_tfm()), and then allocates memory.
> If the allocation results in reclaim, we may attempt to hold the per-CPU
> acomp_ctx mutex.

The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
allocation.  That is what needs to be fixed.

But really the bounce buffering in acomp (which is what is causing this problem)
should not exist at all.  There is really no practical use case for it; it's
just there because of the Crypto API's insistence on shoehorning everything into
scatterlists for no reason...

- Eric
Yosry Ahmed Feb. 26, 2025, 8:32 p.m. UTC | #2
On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
> On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> > Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> > the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> > (through crypto_exit_scomp_ops_async()).
> > 
> > On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> > (through crypto_scomp_init_tfm()), and then allocates memory.
> > If the allocation results in reclaim, we may attempt to hold the per-CPU
> > acomp_ctx mutex.
> 
> The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
> allocation.  That is what needs to be fixed.

crypto_free_acomp() does not explicitly wait for an allocation, but it
waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
held while allocating memory from crypto_scomp_init_tfm().

Are you suggesting that crypto_exit_scomp_ops_async() should not be
holding scomp_lock?

> 
> But really the bounce buffering in acomp (which is what is causing this problem)
> should not exist at all.  There is really no practical use case for it; it's
> just there because of the Crypto API's insistence on shoehorning everything into
> scatterlists for no reason...

I am assuming this is about the scomp_scratch logic, which is what we
need to hold the scomp_lock for, resulting in this problem.

If this is something that can be done right away I am fine with dropping
this patch for an alternative fix, although it may be nice to reduce the
lock critical section in zswap_cpu_comp_dead() to the bare minimum
anyway.
Eric Biggers Feb. 26, 2025, 9:16 p.m. UTC | #3
On Wed, Feb 26, 2025 at 08:32:22PM +0000, Yosry Ahmed wrote:
> On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
> > On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> > > Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> > > the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> > > (through crypto_exit_scomp_ops_async()).
> > > 
> > > On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> > > (through crypto_scomp_init_tfm()), and then allocates memory.
> > > If the allocation results in reclaim, we may attempt to hold the per-CPU
> > > acomp_ctx mutex.
> > 
> > The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
> > allocation.  That is what needs to be fixed.
> 
> crypto_free_acomp() does not explicitly wait for an allocation, but it
> waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
> held while allocating memory from crypto_scomp_init_tfm().
> 
> Are you suggesting that crypto_exit_scomp_ops_async() should not be
> holding scomp_lock?

I think the solution while keeping the bounce buffer in place would be to do
what the patch
https://lore.kernel.org/linux-crypto/Z6w7Pz8jBeqhijut@gondor.apana.org.au/ does,
i.e. make the actual allocation and free happen outside the lock.
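
For concreteness, a sketch of that shape in plain userspace C. The
names (scratch, scratch_lock, scratch_get/scratch_put) are made up for
illustration and are not the actual scompress.c code:

#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t scratch_lock = PTHREAD_MUTEX_INITIALIZER;
static void *scratch;

/* Allocate outside the lock, publish under it. */
static int scratch_get(size_t size)
{
	void *mem = malloc(size);

	if (!mem)
		return -1;

	pthread_mutex_lock(&scratch_lock);
	if (!scratch) {
		scratch = mem;
		mem = NULL;
	}
	pthread_mutex_unlock(&scratch_lock);

	free(mem);	/* another caller won the race; free outside the lock */
	return 0;
}

/* Detach under the lock, free outside it. */
static void scratch_put(void)
{
	void *mem;

	pthread_mutex_lock(&scratch_lock);
	mem = scratch;
	scratch = NULL;
	pthread_mutex_unlock(&scratch_lock);

	free(mem);
}

int main(void)
{
	scratch_get(4096);
	scratch_put();
	return 0;
}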

> > But really the bounce buffering in acomp (which is what is causing this problem)
> > should not exist at all.  There is really no practical use case for it; it's
> > just there because of the Crypto API's insistence on shoehorning everything into
> > scatterlists for no reason...
> 
> I am assuming this is about the scomp_scratch logic, which is what we
> need to hold the scomp_lock for, resulting in this problem.

Yes.

> If this is something that can be done right away I am fine with dropping
> this patch for an alternative fix, although it may be nice to reduce the
> lock critical section in zswap_cpu_comp_dead() to the bare minimum
> anyway.

Well, unfortunately the whole Crypto API philosophy of having a single interface
for software and for hardware offload doesn't really work.  This is just yet
another example of that; it's a problem caused by shoehorning software
compression into an interface designed for hardware offload.  zcomp really
should just use the compression libs directly (like most users of compression in
the kernel already do), and have an alternate code path specifically for
hardware offload (using acomp) for the few people who really want that.

- Eric
Yosry Ahmed Feb. 26, 2025, 9:23 p.m. UTC | #4
On Wed, Feb 26, 2025 at 09:16:28PM +0000, Eric Biggers wrote:
> On Wed, Feb 26, 2025 at 08:32:22PM +0000, Yosry Ahmed wrote:
> > On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
> > > On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> > > > Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> > > > the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> > > > (through crypto_exit_scomp_ops_async()).
> > > > 
> > > > On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> > > > (through crypto_scomp_init_tfm()), and then allocates memory.
> > > > If the allocation results in reclaim, we may attempt to hold the per-CPU
> > > > acomp_ctx mutex.
> > > 
> > > The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
> > > allocation.  That is what needs to be fixed.
> > 
> > crypto_free_acomp() does not explicitly wait for an allocation, but it
> > waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
> > held while allocating memory from crypto_scomp_init_tfm().
> > 
> > Are you suggesting that crypto_exit_scomp_ops_async() should not be
> > holding scomp_lock?
> 
> I think the solution while keeping the bounce buffer in place would be to do
> what the patch
> https://lore.kernel.org/linux-crypto/Z6w7Pz8jBeqhijut@gondor.apana.org.au/ does,
> i.e. make the actual allocation and free happen outside the lock.

I am fine with a solution like that if Herbert is fine with it. Although
as I mentioned, I think this patch is nice to have anyway.

> 
> > > But really the bounce buffering in acomp (which is what is causing this problem)
> > > should not exist at all.  There is really no practical use case for it; it's
> > > just there because of the Crypto API's insistence on shoehorning everything into
> > > scatterlists for no reason...
> > 
> > I am assuming this is about the scomp_scratch logic, which is what we
> > need to hold the scomp_lock for, resulting in this problem.
> 
> Yes.
> 
> > If this is something that can be done right away I am fine with dropping
> > this patch for an alternative fix, although it may be nice to reduce the
> > lock critical section in zswap_cpu_comp_dead() to the bare minimum
> > anyway.
> 
> Well, unfortunately the whole Crypto API philosophy of having a single interface
> for software and for hardware offload doesn't really work.  This is just yet
> another example of that; it's a problem caused by shoehorning software
> compression into an interface designed for hardware offload.  zcomp really
> should just use the compression libs directly (like most users of compression in
> the kernel already do), and have an alternate code path specifically for
> hardware offload (using acomp) for the few people who really want that.

zcomp is for zram, zswap does not use it. If zswap is not going to use
the crypto API we'll want something like zcomp or maybe reuse zcomp
itself. That's a problem for another day :)
Nhat Pham Feb. 26, 2025, 11:47 p.m. UTC | #5
On Wed, Feb 26, 2025 at 1:23 PM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> On Wed, Feb 26, 2025 at 09:16:28PM +0000, Eric Biggers wrote:
> > On Wed, Feb 26, 2025 at 08:32:22PM +0000, Yosry Ahmed wrote:
> > > On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
> > > > On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> > > > > Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> > > > > the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> > > > > (through crypto_exit_scomp_ops_async()).
> > > > >
> > > > > On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> > > > > (through crypto_scomp_init_tfm()), and then allocates memory.
> > > > > If the allocation results in reclaim, we may attempt to hold the per-CPU
> > > > > acomp_ctx mutex.
> > > >
> > > > The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
> > > > allocation.  That is what needs to be fixed.
> > >
> > > crypto_free_acomp() does not explicitly wait for an allocation, but it
> > > waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
> > > held while allocating memory from crypto_scomp_init_tfm().
> > >
> > > Are you suggesting that crypto_exit_scomp_ops_async() should not be
> > > holding scomp_lock?
> >
> > I think the solution while keeping the bounce buffer in place would be to do
> > what the patch
> > https://lore.kernel.org/linux-crypto/Z6w7Pz8jBeqhijut@gondor.apana.org.au/ does,
> > i.e. make the actual allocation and free happen outside the lock.
>
> I am fine with a solution like that if Herbert is fine with it. Although
> as I mentioned, I think this patch is nice to have anyway.
>
> >
> > > > But really the bounce buffering in acomp (which is what is causing this problem)
> > > > should not exist at all.  There is really no practical use case for it; it's
> > > > just there because of the Crypto API's insistence on shoehorning everything into
> > > > scatterlists for no reason...
> > >
> > > I am assuming this is about the scomp_scratch logic, which is what we
> > > need to hold the scomp_lock for, resulting in this problem.
> >
> > Yes.
> >
> > > If this is something that can be done right away I am fine with dropping
> > > this patch for an alternative fix, although it may be nice to reduce the
> > > lock critical section in zswap_cpu_comp_dead() to the bare minimum
> > > anyway.
> >
> > Well, unfortunately the whole Crypto API philosophy of having a single interface
> > for software and for hardware offload doesn't really work.  This is just yet
> > another example of that; it's a problem caused by shoehorning software
> > compression into an interface designed for hardware offload.  zcomp really
> > should just use the compression libs directly (like most users of compression in
> > the kernel already do), and have an alternate code path specifically for
> > hardware offload (using acomp) for the few people who really want that.
>
> zcomp is for zram, zswap does not use it. If zswap is not going to use
> the crypto API we'll want something like zcomp or maybe reuse zcomp
> itself. That's a problem for another day :)

I'm actually thinking whether we should expose the zcomp API and use
it for zswap. There are a couple of parameters for zstd I wanna play
with, which zcomp/zram seems to already support, but not the crypto
API (zstd level, dictionary, etc.).
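
For reference, a minimal userspace libzstd sketch of those per-context
knobs (the kernel's zstd wrappers differ; this only shows the kind of
parameters the crypto API has no way to express today):

#include <zstd.h>
#include <stdio.h>

int main(void)
{
	static const char dict[] = "example dictionary bytes";
	const char src[] = "hello hello hello hello";
	char dst[256];
	ZSTD_CCtx *cctx = ZSTD_createCCtx();
	size_t n;

	/* Pick a compression level and load a raw-content dictionary. */
	ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 6);
	ZSTD_CCtx_loadDictionary(cctx, dict, sizeof(dict) - 1);

	n = ZSTD_compress2(cctx, dst, sizeof(dst), src, sizeof(src) - 1);
	if (!ZSTD_isError(n))
		printf("compressed %zu -> %zu bytes\n", sizeof(src) - 1, n);

	ZSTD_freeCCtx(cctx);
	return 0;
}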

But yes, a different problem for another day :)
Chengming Zhou Feb. 27, 2025, 2:24 a.m. UTC | #6
On 2025/2/27 02:56, Yosry Ahmed wrote:
> Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> (through crypto_exit_scomp_ops_async()).
> 
> On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> (through crypto_scomp_init_tfm()), and then allocates memory.
> If the allocation results in reclaim, we may attempt to hold the per-CPU
> acomp_ctx mutex.
> 
> The above dependencies can cause an ABBA deadlock. For example in the
> following scenario:
> 
> (1) Task A running on CPU #1:
>      crypto_alloc_acomp_node()
>        Holds scomp_lock
>        Enters reclaim
>        Reads per_cpu_ptr(pool->acomp_ctx, 1)
> 
> (2) Task A is descheduled
> 
> (3) CPU #1 goes offline
>      zswap_cpu_comp_dead(CPU #1)
>        Holds per_cpu_ptr(pool->acomp_ctx, 1)
>        Calls crypto_free_acomp()
>        Waits for scomp_lock
> 
> (4) Task A running on CPU #2:
>        Waits for per_cpu_ptr(pool->acomp_ctx, 1) // Read on CPU #1
>        DEADLOCK
> 
> Since there is no requirement to call crypto_free_acomp() with the
> per-CPU acomp_ctx mutex held in zswap_cpu_comp_dead(), move it after the
> mutex is unlocked. Also move the acomp_request_free() and kfree() calls
> for consistency and to avoid any potential subtle locking dependencies
> in the future.
> 
> With this, only setting acomp_ctx fields to NULL occurs with the mutex
> held. This is similar to how zswap_cpu_comp_prepare() only initializes
> acomp_ctx fields with the mutex held, after performing all allocations
> before holding the mutex.
> 
> Opportunistically, move the NULL check on acomp_ctx so that it takes
> place before the mutex dereference.
> 
> Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
> Reported-by: syzbot+1a517ccfcbc6a7ab0f82@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67bcea51.050a0220.bbfd1.0096.GAE@google.com/
> Cc: <stable@vger.kernel.org>
> Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Looks good to me:

Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>

Thanks!

> ---
> 
> v1 -> v2:
> - Explained the problem more clearly in the commit message.
> - Moved all freeing calls outside the lock critical section.
> v1: https://lore.kernel.org/all/Z72FJnbA39zWh4zS@gondor.apana.org.au/
> 
> ---
>   mm/zswap.c | 30 ++++++++++++++++++++++--------
>   1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index ac9d299e7d0c1..adf745c66aa1d 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -881,18 +881,32 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
>   {
>   	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
>   	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
> +	struct acomp_req *req;
> +	struct crypto_acomp *acomp;
> +	u8 *buffer;
> +
> +	if (IS_ERR_OR_NULL(acomp_ctx))
> +		return 0;
>   
>   	mutex_lock(&acomp_ctx->mutex);
> -	if (!IS_ERR_OR_NULL(acomp_ctx)) {
> -		if (!IS_ERR_OR_NULL(acomp_ctx->req))
> -			acomp_request_free(acomp_ctx->req);
> -		acomp_ctx->req = NULL;
> -		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
> -			crypto_free_acomp(acomp_ctx->acomp);
> -		kfree(acomp_ctx->buffer);
> -	}
> +	req = acomp_ctx->req;
> +	acomp = acomp_ctx->acomp;
> +	buffer = acomp_ctx->buffer;
> +	acomp_ctx->req = NULL;
> +	acomp_ctx->acomp = NULL;
> +	acomp_ctx->buffer = NULL;
>   	mutex_unlock(&acomp_ctx->mutex);
>   
> +	/*
> +	 * Do the actual freeing after releasing the mutex to avoid subtle
> +	 * locking dependencies causing deadlocks.
> +	 */
> +	if (!IS_ERR_OR_NULL(req))
> +		acomp_request_free(req);
> +	if (!IS_ERR_OR_NULL(acomp))
> +		crypto_free_acomp(acomp);
> +	kfree(buffer);
> +
>   	return 0;
>   }
>
Chengming Zhou Feb. 27, 2025, 2:30 a.m. UTC | #7
On 2025/2/27 07:47, Nhat Pham wrote:
> On Wed, Feb 26, 2025 at 1:23 PM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>>
>> On Wed, Feb 26, 2025 at 09:16:28PM +0000, Eric Biggers wrote:
>>> On Wed, Feb 26, 2025 at 08:32:22PM +0000, Yosry Ahmed wrote:
>>>> On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
>>>>> On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
>>>>>> Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
>>>>>> the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
>>>>>> (through crypto_exit_scomp_ops_async()).
>>>>>>
>>>>>> On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
>>>>>> (through crypto_scomp_init_tfm()), and then allocates memory.
>>>>>> If the allocation results in reclaim, we may attempt to hold the per-CPU
>>>>>> acomp_ctx mutex.
>>>>>
>>>>> The bug is in acomp.  crypto_free_acomp() should never have to wait for a memory
>>>>> allocation.  That is what needs to be fixed.
>>>>
>>>> crypto_free_acomp() does not explicitly wait for an allocation, but it
>>>> waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
>>>> held while allocating memory from crypto_scomp_init_tfm().
>>>>
>>>> Are you suggesting that crypto_exit_scomp_ops_async() should not be
>>>> holding scomp_lock?
>>>
>>> I think the solution while keeping the bounce buffer in place would be to do
>>> what the patch
>>> https://lore.kernel.org/linux-crypto/Z6w7Pz8jBeqhijut@gondor.apana.org.au/ does,
>>> i.e. make the actual allocation and free happen outside the lock.
>>
>> I am fine with a solution like that if Herbert is fine with it. Although
>> as I mentioned, I think this patch is nice to have anyway.
>>
>>>
>>>>> But really the bounce buffering in acomp (which is what is causing this problem)
>>>>> should not exist at all.  There is really no practical use case for it; it's
>>>>> just there because of the Crypto API's insistence on shoehorning everything into
>>>>> scatterlists for no reason...
>>>>
>>>> I am assuming this is about the scomp_scratch logic, which is what we
>>>> need to hold the scomp_lock for, resulting in this problem.
>>>
>>> Yes.
>>>
>>>> If this is something that can be done right away I am fine with dropping
>>>> this patch for an alternative fix, although it may be nice to reduce the
>>>> lock critical section in zswap_cpu_comp_dead() to the bare minimum
>>>> anyway.
>>>
>>> Well, unfortunately the whole Crypto API philosophy of having a single interface
>>> for software and for hardware offload doesn't really work.  This is just yet
>>> another example of that; it's a problem caused by shoehorning software
>>> compression into an interface designed for hardware offload.  zcomp really
>>> should just use the compression libs directly (like most users of compression in
>>> the kernel already do), and have an alternate code path specifically for
>>> hardware offload (using acomp) for the few people who really want that.
>>
>> zcomp is for zram, zswap does not use it. If zswap is not going to use
>> the crypto API we'll want something like zcomp or maybe reuse zcomp
>> itself. That's a problem for another day :)
> 
> I'm actually thinking whether we should expose the zcomp API and use
> it for zswap. There are a couple of parameters for zstd I wanna play
> with, which zcomp/zram seems to already support, but not the crypto
> API (zstd level, dictionary, etc.).

Ah, agree! Actually I also think we should use the zcomp API in zswap,
if its API meets our requirements.

> 
> But yes, a different problem for another day :)

Patch

diff --git a/mm/zswap.c b/mm/zswap.c
index ac9d299e7d0c1..adf745c66aa1d 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -881,18 +881,32 @@  static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
+	struct acomp_req *req;
+	struct crypto_acomp *acomp;
+	u8 *buffer;
+
+	if (IS_ERR_OR_NULL(acomp_ctx))
+		return 0;
 
 	mutex_lock(&acomp_ctx->mutex);
-	if (!IS_ERR_OR_NULL(acomp_ctx)) {
-		if (!IS_ERR_OR_NULL(acomp_ctx->req))
-			acomp_request_free(acomp_ctx->req);
-		acomp_ctx->req = NULL;
-		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
-			crypto_free_acomp(acomp_ctx->acomp);
-		kfree(acomp_ctx->buffer);
-	}
+	req = acomp_ctx->req;
+	acomp = acomp_ctx->acomp;
+	buffer = acomp_ctx->buffer;
+	acomp_ctx->req = NULL;
+	acomp_ctx->acomp = NULL;
+	acomp_ctx->buffer = NULL;
 	mutex_unlock(&acomp_ctx->mutex);
 
+	/*
+	 * Do the actual freeing after releasing the mutex to avoid subtle
+	 * locking dependencies causing deadlocks.
+	 */
+	if (!IS_ERR_OR_NULL(req))
+		acomp_request_free(req);
+	if (!IS_ERR_OR_NULL(acomp))
+		crypto_free_acomp(acomp);
+	kfree(buffer);
+
 	return 0;
 }