diff mbox series

[v5,2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails

Message ID 20220310015306.445359-3-ammarfaizi2@gnuweeb.org (mailing list archive)
State New, archived
Headers show
Series Two x86 fixes | expand

Commit Message

Ammar Faizi March 10, 2022, 1:53 a.m. UTC
In mce_threshold_create_device(), if threshold_create_bank() fails, the
@bp will be leaked, because the call to mce_threshold_remove_device()
will not free the @bp. mce_threshold_remove_device() frees
@threshold_banks. At that point, the @bp has not been written to
@threshold_banks, @threshold_banks is NULL, so the call is just a nop.

Fix this by extracting the cleanup part into a new static function
_mce_threshold_remove_device(), then call it from create/remove device
functions.

Also, eliminate the "goto out_err", just early return inside the loop
if the creation fails.

Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org # v5.8+
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")
Link: https://lore.kernel.org/lkml/9dfe087a-f941-1bc4-657d-7e7c198888ff@gnuweeb.org
Co-authored-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Co-authored-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
 arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

Comments

Borislav Petkov March 27, 2022, 10:52 p.m. UTC | #1
On Thu, Mar 10, 2022 at 08:53:06AM +0700, Ammar Faizi wrote:
> In mce_threshold_create_device(), if threshold_create_bank() fails, the
> @bp will be leaked, because the call to mce_threshold_remove_device()
> will not free the @bp. mce_threshold_remove_device() frees
> @threshold_banks. At that point, the @bp has not been written to
> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
> 
> Fix this by extracting the cleanup part into a new static function
> _mce_threshold_remove_device(), then call it from create/remove device
> functions.
> 
> Also, eliminate the "goto out_err", just early return inside the loop
> if the creation fails.
> 
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: stable@vger.kernel.org # v5.8+
> Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")

How did you decide this is the commit that this is fixing?

> Link: https://lore.kernel.org/lkml/9dfe087a-f941-1bc4-657d-7e7c198888ff@gnuweeb.org

That Link tag is not needed.

> Co-authored-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
> Co-authored-by: Yazen Ghannam <yazen.ghannam@amd.com>

There's no "Co-authored-by".

The correct tag is described in

Documentation/process/submitting-patches.rst

Please make sure you've read that file before sending patches.

> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
>  arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)

...

> @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu)
>  		if (!(this_cpu_read(bank_map) & (1 << bank)))
>  			continue;
>  		err = threshold_create_bank(bp, cpu, bank);
> -		if (err)
> -			goto out_err;
> +		if (err) {
> +			_mce_threshold_remove_device(bp, numbanks);
> +			return err;
> +		}
>  	}
>  	this_cpu_write(threshold_banks, bp);

Do I see it correctly that the publishing of the @bp pointer - i.e.,
this line - should be moved right above the for loop?

Then mce_threshold_remove_device() would properly free it in the error
case and your patch turns into a oneliner?

And then your Fixes: tag would be correct too...
Ammar Faizi March 28, 2022, 4:12 a.m. UTC | #2
On 3/28/22 5:52 AM, Borislav Petkov wrote:
[...]
>> Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")
> 
> How did you decide this is the commit that this is fixing?

I examined the history in those lines by git blame. Will recheck after the below
doubt is cleared.

>> Link: https://lore.kernel.org/lkml/9dfe087a-f941-1bc4-657d-7e7c198888ff@gnuweeb.org
> 
> That Link tag is not needed.
> 
>> Co-authored-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
>> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
>> Co-authored-by: Yazen Ghannam <yazen.ghannam@amd.com>
> 
> There's no "Co-authored-by".
> 
> The correct tag is described in
> 
> Documentation/process/submitting-patches.rst

Will fix them in the v6.

> ...
> 
>> @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu)
>>   		if (!(this_cpu_read(bank_map) & (1 << bank)))
>>   			continue;
>>   		err = threshold_create_bank(bp, cpu, bank);
>> -		if (err)
>> -			goto out_err;
>> +		if (err) {
>> +			_mce_threshold_remove_device(bp, numbanks);
>> +			return err;
>> +		}
>>   	}
>>   	this_cpu_write(threshold_banks, bp);
> 
> Do I see it correctly that the publishing of the @bp pointer - i.e.,
> this line - should be moved right above the for loop?
> 
> Then mce_threshold_remove_device() would properly free it in the error
> case and your patch turns into a oneliner?

Previously, in v4 I did that too. But after discussion with Yazen, we got a
conclusion that placing `this_cpu_write(threshold_banks, bp);` before the for loop
is not the right thing to do.

> And then your Fixes: tag would be correct too...
The reason is based on the discussion with Yazen, the full discussion can be read in
the Link tag above.

==================
The point is:

On Wed, 2 Mar 2022 17:26:32 +0000, Yazen Ghannam <yazen.ghannam@amd.com> wrote:
> The threshold interrupt handler uses this pointer. I think the goal here is to
> set this pointer when the list is fully formed and clear this pointer before
> making any changes to the list. Otherwise, the interrupt handler will operate
> on incomplete data if an interrupt comes in the middle of these updates.
==================

Also, looking at the comment in mce_threshold_remove_device() function:

	/*
	 * Clear the pointer before cleaning up, so that the interrupt won't
	 * touch anything of this.
	 */
	this_cpu_write(threshold_banks, NULL);

I think it's reasonable to place `this_cpu_write(threshold_banks, bp);` after
the "for loop" on the creation process for the similar reason. In short, don't
let the interrupt sees incomplete data.

Although, I am not sure if that 100% guarantees mce_threshold_remove_device()
will not mess up with the interrupt (e.g. freeing the data while the interrupt
reading it), unless we're using RCU stuff.

What do you think?
Borislav Petkov March 28, 2022, 8:05 a.m. UTC | #3
On Mon, Mar 28, 2022 at 11:12:53AM +0700, Ammar Faizi wrote:
> Although, I am not sure if that 100% guarantees mce_threshold_remove_device()
> will not mess up with the interrupt (e.g. freeing the data while the interrupt
> reading it), unless we're using RCU stuff.
> 
> What do you think?

I would've said it doesn't matter but that thresholding device creation
is part of hotplug and it can happen multiple times even *after* the
interrupt vector has been set during setup so a potential teardown and
concurrent thresholding interrupt firing might really hit in a not fully
initialized/cleaned up state so yeah, let's do Yazen's thing.

The alternative would be the temporarily re-assign mce_threshold_vector
to default_threshold_interrupt while setup is being done but that's not
really necessary atm.

But call that helper function __threshold_remove_device().

Thx.
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 9f4b508886dd..e492efab68a2 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1293,10 +1293,23 @@  static void threshold_remove_bank(struct threshold_bank *bank)
 	kfree(bank);
 }
 
+static void _mce_threshold_remove_device(struct threshold_bank **bp,
+					 unsigned int numbanks)
+{
+	unsigned int bank;
+
+	for (bank = 0; bank < numbanks; bank++) {
+		if (bp[bank]) {
+			threshold_remove_bank(bp[bank]);
+			bp[bank] = NULL;
+		}
+	}
+	kfree(bp);
+}
+
 int mce_threshold_remove_device(unsigned int cpu)
 {
 	struct threshold_bank **bp = this_cpu_read(threshold_banks);
-	unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
 
 	if (!bp)
 		return 0;
@@ -1307,13 +1320,7 @@  int mce_threshold_remove_device(unsigned int cpu)
 	 */
 	this_cpu_write(threshold_banks, NULL);
 
-	for (bank = 0; bank < numbanks; bank++) {
-		if (bp[bank]) {
-			threshold_remove_bank(bp[bank]);
-			bp[bank] = NULL;
-		}
-	}
-	kfree(bp);
+	_mce_threshold_remove_device(bp, this_cpu_read(mce_num_banks));
 	return 0;
 }
 
@@ -1350,15 +1357,14 @@  int mce_threshold_create_device(unsigned int cpu)
 		if (!(this_cpu_read(bank_map) & (1 << bank)))
 			continue;
 		err = threshold_create_bank(bp, cpu, bank);
-		if (err)
-			goto out_err;
+		if (err) {
+			_mce_threshold_remove_device(bp, numbanks);
+			return err;
+		}
 	}
 	this_cpu_write(threshold_banks, bp);
 
 	if (thresholding_irq_en)
 		mce_threshold_vector = amd_threshold_interrupt;
 	return 0;
-out_err:
-	mce_threshold_remove_device(cpu);
-	return err;
 }