Message ID | 20201118084117.1937-1-thunder.leizhen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] ACPI/nfit: correct the badrange to be reported in nfit_handle_mce() |
On 2020/11/18 16:41, Zhen Lei wrote:
> The badrange to be reported should always cover mce->addr.

Maybe I should change this description to:
Make sure the badrange to be reported can always cover mce->addr.

>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/acpi/nfit/mce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
> index ee8d9973f60b..053e719c7bea 100644
> --- a/drivers/acpi/nfit/mce.c
> +++ b/drivers/acpi/nfit/mce.c
> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>
>  		/* If this fails due to an -ENOMEM, there is little we can do */
>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
> -				ALIGN(mce->addr, L1_CACHE_BYTES),
> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>  				L1_CACHE_BYTES);
>  		nvdimm_region_notify(nfit_spa->nd_region,
>  				NVDIMM_REVALIDATE_POISON);
>
On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown) <thunder.leizhen@huawei.com> wrote:
>
>
>
> On 2020/11/18 16:41, Zhen Lei wrote:
> > The badrange to be reported should always cover mce->addr.
> Maybe I should change this description to:
> Make sure the badrange to be reported can always cover mce->addr.

Yes, I like that better. Can you also say a bit more about how you
found this bug? As far as I can see this looks like -stable material.
On Wed, 2020-11-18 at 16:41 +0800, Zhen Lei wrote:
> The badrange to be reported should always cover mce->addr.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/acpi/nfit/mce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Ah good find, agreed with Dan that this is stable material.
Minor nit, I'd recommend rewording the subject line to something like:

"acpi/nfit: fix badrange insertion in nfit_handle_mce()"

Otherwise, looks good to me.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>

>
> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
> index ee8d9973f60b..053e719c7bea 100644
> --- a/drivers/acpi/nfit/mce.c
> +++ b/drivers/acpi/nfit/mce.c
> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>
>  		/* If this fails due to an -ENOMEM, there is little we can do */
>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
> -				ALIGN(mce->addr, L1_CACHE_BYTES),
> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>  				L1_CACHE_BYTES);
>  		nvdimm_region_notify(nfit_spa->nd_region,
>  				NVDIMM_REVALIDATE_POISON);
On 2020/11/19 3:16, Dan Williams wrote:
> On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown)
> <thunder.leizhen@huawei.com> wrote:
>>
>>
>>
>> On 2020/11/18 16:41, Zhen Lei wrote:
>>> The badrange to be reported should always cover mce->addr.
>> Maybe I should change this description to:
>> Make sure the badrange to be reported can always cover mce->addr.
>
> Yes, I like that better. Can you also say a bit more about how you
> found this bug? As far as I can see this looks like -stable material.

I found it when I was learning the code. I'm looking at it carefully.

>
>
On 2020/11/19 4:50, Verma, Vishal L wrote:
> On Wed, 2020-11-18 at 16:41 +0800, Zhen Lei wrote:
>> The badrange to be reported should always cover mce->addr.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/acpi/nfit/mce.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Ah good find, agreed with Dan that this is stable material.
> Minor nit, I'd recommend rewording the subject line to something like:
>
> "acpi/nfit: fix badrange insertion in nfit_handle_mce()"

OK, I will send v2.

>
> Otherwise, looks good to me.
> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
>
>>
>> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
>> index ee8d9973f60b..053e719c7bea 100644
>> --- a/drivers/acpi/nfit/mce.c
>> +++ b/drivers/acpi/nfit/mce.c
>> @@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
>>
>>  		/* If this fails due to an -ENOMEM, there is little we can do */
>>  		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
>> -				ALIGN(mce->addr, L1_CACHE_BYTES),
>> +				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
>>  				L1_CACHE_BYTES);
>>  		nvdimm_region_notify(nfit_spa->nd_region,
>>  				NVDIMM_REVALIDATE_POISON);
>
On Wed, Nov 18, 2020 at 5:53 PM Leizhen (ThunderTown) <thunder.leizhen@huawei.com> wrote:
>
>
>
> On 2020/11/19 3:16, Dan Williams wrote:
> > On Wed, Nov 18, 2020 at 12:55 AM Leizhen (ThunderTown)
> > <thunder.leizhen@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2020/11/18 16:41, Zhen Lei wrote:
> >>> The badrange to be reported should always cover mce->addr.
> >> Maybe I should change this description to:
> >> Make sure the badrange to be reported can always cover mce->addr.
> >
> > Yes, I like that better. Can you also say a bit more about how you
> > found this bug? As far as I can see this looks like -stable material.
>
> I found it when I was learning the code. I'm looking at it carefully.

Ok, good eye. The impact of this one is somewhat mitigated by the fact
that errors are expanded to 512 byte blast radius, and error
consumption maps 4k around errors. I suspect few are trying to use the
badblock list to do fine grained recovery so this bug went unnoticed.
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index ee8d9973f60b..053e719c7bea 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -63,7 +63,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 
 		/* If this fails due to an -ENOMEM, there is little we can do */
 		nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
-				ALIGN(mce->addr, L1_CACHE_BYTES),
+				ALIGN_DOWN(mce->addr, L1_CACHE_BYTES),
 				L1_CACHE_BYTES);
 		nvdimm_region_notify(nfit_spa->nd_region,
 				NVDIMM_REVALIDATE_POISON);
The badrange to be reported should always cover mce->addr.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/acpi/nfit/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)