
[1/1] ACPI/nfit: correct the badrange to be reported in nfit_handle_mce()

Message ID 20201118084117.1937-1-thunder.leizhen@huawei.com (mailing list archive)
State New, archived

Commit Message

Leizhen (ThunderTown) Nov. 18, 2020, 8:41 a.m. UTC
The badrange to be reported should always cover mce->addr.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/acpi/nfit/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Leizhen (ThunderTown) Nov. 18, 2020, 8:54 a.m. UTC | #1
On 2020/11/18 16:41, Zhen Lei wrote:
> The badrange to be reported should always cover mce->addr.
Maybe I should change this description to:
Make sure the badrange to be reported can always cover mce->addr.

> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/acpi/nfit/mce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
> index ee8d9973f60b..053e719c7bea 100644
> --- a/drivers/acpi/nfit/mce.c
> +++ b/drivers/acpi/nfit/mce.c
> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>  
>  		/* If this fails due to an -ENOMEM, there is little we can do */
>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
> -				ALIGN(mce->addr, L1_CACHE_BYTES),
> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>  				L1_CACHE_BYTES);
>  		nvdimm_region_notify(nfit_spa->nd_region,
>  				NVDIMM_REVALIDATE_POISON);
>

Dan Williams Nov. 18, 2020, 7:16 p.m. UTC | #2
On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown)
<thunder.leizhen@huawei.com> wrote:
>
>
>
> On 2020/11/18 16:41, Zhen Lei wrote:
> > The badrange to be reported should always cover mce->addr.
> Maybe I should change this description to:
> Make sure the badrange to be reported can always cover mce->addr.

Yes, I like that better. Can you also say a bit more about how you
found this bug? As far as I can see this looks like -stable material.

Verma, Vishal L Nov. 18, 2020, 8:50 p.m. UTC | #3
On Wed, 2020-11-18 at 16:41 +0800, Zhen Lei wrote:
> The badrange to be reported should always cover mce->addr.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/acpi/nfit/mce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Ah good find, agreed with Dan that this is stable material.
Minor nit, I'd recommend rewording the subject line to something like:

"acpi/nfit: fix badrange insertion in nfit_handle_mce()"

Otherwise, looks good to me.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>

> 
> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
> index ee8d9973f60b..053e719c7bea 100644
> --- a/drivers/acpi/nfit/mce.c
> +++ b/drivers/acpi/nfit/mce.c
> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>  
>  		/* If this fails due to an -ENOMEM, there is little we can do */
>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
> -				ALIGN(mce->addr, L1_CACHE_BYTES),
> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>  				L1_CACHE_BYTES);
>  		nvdimm_region_notify(nfit_spa->nd_region,
>  				NVDIMM_REVALIDATE_POISON);

Leizhen (ThunderTown) Nov. 19, 2020, 1:53 a.m. UTC | #4
On 2020/11/19 3:16, Dan Williams wrote:
> On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown)
> <thunder.leizhen@huawei.com> wrote:
>>
>>
>>
>> On 2020/11/18 16:41, Zhen Lei wrote:
>>> The badrange to be reported should always cover mce->addr.
>> Maybe I should change this description to:
>> Make sure the badrange to be reported can always cover mce->addr.
> 
> Yes, I like that better. Can you also say a bit more about how you
> found this bug? As far as I can see this looks like -stable material.

I found it when I was learning the code. I'm looking at it carefully.

> 
>

Leizhen (ThunderTown) Nov. 19, 2020, 1:55 a.m. UTC | #5
On 2020/11/19 4:50, Verma, Vishal L wrote:
> On Wed, 2020-11-18 at 16:41 +0800, Zhen Lei wrote:
>> The badrange to be reported should always cover mce->addr.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/acpi/nfit/mce.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Ah good find, agreed with Dan that this is stable material.
> Minor nit, I'd recommend rewording the subject line to something like:
> 
> "acpi/nfit: fix badrange insertion in nfit_handle_mce()"

OK, I will send v2.

> 
> Otherwise, looks good to me.
> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
> 
>>
>> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
>> index ee8d9973f60b..053e719c7bea 100644
>> --- a/drivers/acpi/nfit/mce.c
>> +++ b/drivers/acpi/nfit/mce.c
>> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>>  
>>  		/* If this fails due to an -ENOMEM, there is little we can do */
>>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
>> -				ALIGN(mce->addr, L1_CACHE_BYTES),
>> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>>  				L1_CACHE_BYTES);
>>  		nvdimm_region_notify(nfit_spa->nd_region,
>>  				NVDIMM_REVALIDATE_POISON);
>

Dan Williams Nov. 19, 2020, 2:06 a.m. UTC | #6
On Wed, Nov 18, 2020 at 5:53 PM Leizhen (ThunderTown)
<thunder.leizhen@huawei.com> wrote:
>
>
>
> On 2020/11/19 3:16, Dan Williams wrote:
> > On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown)
> > <thunder.leizhen@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2020/11/18 16:41, Zhen Lei wrote:
> >>> The badrange to be reported should always cover mce->addr.
> >> Maybe I should change this description to:
> >> Make sure the badrange to be reported can always cover mce->addr.
> >
> > Yes, I like that better. Can you also say a bit more about how you
> > found this bug? As far as I can see this looks like -stable material.
>
> I found it when I was learning the code. I'm looking at it carefully.

Ok, good eye.

The impact of this one is somewhat mitigated by the fact that errors
are expanded to 512 byte blast radius, and error consumption maps 4k
around errors. I suspect few are trying to use the badblock list to do
fine grained recovery so this bug went unnoticed.

Patch

diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index ee8d9973f60b..053e719c7bea 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 
 		/* If this fails due to an -ENOMEM, there is little we can do */
 		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
-				ALIGN(mce->addr, L1_CACHE_BYTES),
+				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
 				L1_CACHE_BYTES);
 		nvdimm_region_notify(nfit_spa->nd_region,
 				NVDIMM_REVALIDATE_POISON);