diff mbox series

[v2,3/6] EDAC/fsl_ddr: Fix bad bit shift operations

Message ID 20241011-imx95_edac-v2-3-011b68290951@nxp.com (mailing list archive)
State Superseded
Headers show
Series EDAC/fsl-ddr: Add imx9 support | expand

Commit Message

Frank Li Oct. 11, 2024, 3:31 p.m. UTC
From: Priyanka Singh <priyanka.singh@nxp.com>

Fix undefined behavior caused by left-shifting a negative value in the
expression:

    cap_high ^ (1 << (bad_data_bit - 32))

The variable `bad_data_bit` ranges from 0 to 63. When `bad_data_bit` is
less than 32, `bad_data_bit - 32` becomes negative, and left-shifting by a
negative value in C is undefined behavior.

Fix this by checking the range of `bad_data_bit` before performing the
shift.

Fixes: ea2eb9a8b620 ("EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx")
Signed-off-by: Priyanka Singh <priyanka.singh@nxp.com>
Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/edac/fsl_ddr_edac.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

Comments

Borislav Petkov Oct. 14, 2024, 6:16 p.m. UTC | #1
On Fri, Oct 11, 2024 at 11:31:31AM -0400, Frank Li wrote:
> From: Priyanka Singh <priyanka.singh@nxp.com>
> 
> Fix undefined behavior caused by left-shifting a negative value in the
> expression:
> 
>     cap_high ^ (1 << (bad_data_bit - 32))
> 
> The variable `bad_data_bit` ranges from 0 to 63. When `bad_data_bit` is
> less than 32, `bad_data_bit - 32` becomes negative, and left-shifting by a
> negative value in C is undefined behavior.
> 
> Fix this by checking the range of `bad_data_bit` before performing the
> shift.
> 
> Fixes: ea2eb9a8b620 ("EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx")

Is this an urgent fix which needs to go to stable or someone just caught it
from code review?

Does it trigger in real life, IOW?

> Signed-off-by: Priyanka Singh <priyanka.singh@nxp.com>
> Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
> Signed-off-by: Li Yang <leoyang.li@nxp.com>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/edac/fsl_ddr_edac.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
> index 7a9fb1202f1a0..ccc13c2adfd6f 100644
> --- a/drivers/edac/fsl_ddr_edac.c
> +++ b/drivers/edac/fsl_ddr_edac.c
> @@ -338,11 +338,18 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
>  			fsl_mc_printk(mci, KERN_ERR,
>  				"Faulty ECC bit: %d\n", bad_ecc_bit);
>  
> -		fsl_mc_printk(mci, KERN_ERR,
> -			"Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> -			cap_high ^ (1 << (bad_data_bit - 32)),
> -			cap_low ^ (1 << bad_data_bit),
> -			syndrome ^ (1 << bad_ecc_bit));
> +		if ((bad_data_bit > 0 && bad_data_bit < 32) && bad_ecc_bit > 0) {
> +			fsl_mc_printk(mci, KERN_ERR,
> +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> +				      cap_high, cap_low ^ (1 << bad_data_bit),
> +				      syndrome ^ (1 << bad_ecc_bit));
> +		}
> +		if (bad_data_bit >= 32 && bad_ecc_bit > 0) {
> +			fsl_mc_printk(mci, KERN_ERR,
> +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> +				      cap_high ^ (1 << (bad_data_bit - 32)),
> +				      cap_low, syndrome ^ (1 << bad_ecc_bit));
> +		}

This is getting unnecessarily clumsy than it should be. Please do the
following:

	if (bad_data_bit != 1 && bad_ecc_bit != -1) {

		// prep the values you need to print

		// do an exactly one fsl_mc_printk() with the prepared values.

	}

Not have 4 fsl_mc_printks with a bunch of silly if-checks in front.

Thx.
Frank Li Oct. 15, 2024, 3:31 p.m. UTC | #2
On Mon, Oct 14, 2024 at 08:16:47PM +0200, Borislav Petkov wrote:
> On Fri, Oct 11, 2024 at 11:31:31AM -0400, Frank Li wrote:
> > From: Priyanka Singh <priyanka.singh@nxp.com>
> >
> > Fix undefined behavior caused by left-shifting a negative value in the
> > expression:
> >
> >     cap_high ^ (1 << (bad_data_bit - 32))
> >
> > The variable `bad_data_bit` ranges from 0 to 63. When `bad_data_bit` is
> > less than 32, `bad_data_bit - 32` becomes negative, and left-shifting by a
> > negative value in C is undefined behavior.
> >
> > Fix this by checking the range of `bad_data_bit` before performing the
> > shift.
> >
> > Fixes: ea2eb9a8b620 ("EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx")
>
> Is this an urgent fix which needs to go to stable or someone just caught it
> from code review?

I don't think it is urgent. In most system the return value is 0. I am not
sure who caught it because patch already exist at downstream tree for a
whole.

>
> Does it trigger in real life, IOW?

The problem is triggered. But the output result is correct at our hardware.
The result may change depend on compiler and cpu version.

Frank
>
> > Signed-off-by: Priyanka Singh <priyanka.singh@nxp.com>
> > Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
> > Signed-off-by: Li Yang <leoyang.li@nxp.com>
> > Signed-off-by: Frank Li <Frank.Li@nxp.com>
> > ---
> >  drivers/edac/fsl_ddr_edac.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
> > index 7a9fb1202f1a0..ccc13c2adfd6f 100644
> > --- a/drivers/edac/fsl_ddr_edac.c
> > +++ b/drivers/edac/fsl_ddr_edac.c
> > @@ -338,11 +338,18 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
> >  			fsl_mc_printk(mci, KERN_ERR,
> >  				"Faulty ECC bit: %d\n", bad_ecc_bit);
> >
> > -		fsl_mc_printk(mci, KERN_ERR,
> > -			"Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > -			cap_high ^ (1 << (bad_data_bit - 32)),
> > -			cap_low ^ (1 << bad_data_bit),
> > -			syndrome ^ (1 << bad_ecc_bit));
> > +		if ((bad_data_bit > 0 && bad_data_bit < 32) && bad_ecc_bit > 0) {
> > +			fsl_mc_printk(mci, KERN_ERR,
> > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > +				      cap_high, cap_low ^ (1 << bad_data_bit),
> > +				      syndrome ^ (1 << bad_ecc_bit));
> > +		}
> > +		if (bad_data_bit >= 32 && bad_ecc_bit > 0) {
> > +			fsl_mc_printk(mci, KERN_ERR,
> > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > +				      cap_high ^ (1 << (bad_data_bit - 32)),
> > +				      cap_low, syndrome ^ (1 << bad_ecc_bit));
> > +		}
>
> This is getting unnecessarily clumsy than it should be. Please do the
> following:
>
> 	if (bad_data_bit != 1 && bad_ecc_bit != -1) {
>
> 		// prep the values you need to print
>
> 		// do an exactly one fsl_mc_printk() with the prepared values.
>
> 	}
>
> Not have 4 fsl_mc_printks with a bunch of silly if-checks in front.
>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
Borislav Petkov Oct. 15, 2024, 4:11 p.m. UTC | #3
On Tue, Oct 15, 2024 at 11:31:41AM -0400, Frank Li wrote:
> I don't think it is urgent. In most system the return value is 0. I am not
> sure who caught it because patch already exist at downstream tree for a
> whole.

That current patch looks like it needs rethinking and making it sane - see
below.

> > > diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
> > > index 7a9fb1202f1a0..ccc13c2adfd6f 100644
> > > --- a/drivers/edac/fsl_ddr_edac.c
> > > +++ b/drivers/edac/fsl_ddr_edac.c
> > > @@ -338,11 +338,18 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
> > >  			fsl_mc_printk(mci, KERN_ERR,
> > >  				"Faulty ECC bit: %d\n", bad_ecc_bit);
> > >
> > > -		fsl_mc_printk(mci, KERN_ERR,
> > > -			"Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > -			cap_high ^ (1 << (bad_data_bit - 32)),
> > > -			cap_low ^ (1 << bad_data_bit),
> > > -			syndrome ^ (1 << bad_ecc_bit));
> > > +		if ((bad_data_bit > 0 && bad_data_bit < 32) && bad_ecc_bit > 0) {
> > > +			fsl_mc_printk(mci, KERN_ERR,
> > > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > +				      cap_high, cap_low ^ (1 << bad_data_bit),
> > > +				      syndrome ^ (1 << bad_ecc_bit));
> > > +		}
> > > +		if (bad_data_bit >= 32 && bad_ecc_bit > 0) {
> > > +			fsl_mc_printk(mci, KERN_ERR,
> > > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > +				      cap_high ^ (1 << (bad_data_bit - 32)),
> > > +				      cap_low, syndrome ^ (1 << bad_ecc_bit));
> > > +		}
> >
> > This is getting unnecessarily clumsy than it should be. Please do the
> > following:
> >
> > 	if (bad_data_bit != 1 && bad_ecc_bit != -1) {
> >
> > 		// prep the values you need to print
> >
> > 		// do an exactly one fsl_mc_printk() with the prepared values.
> >
> > 	}
> >
> > Not have 4 fsl_mc_printks with a bunch of silly if-checks in front.

You missed the most important feedback. See above. ^^^^^^^
Frank Li Oct. 15, 2024, 6:55 p.m. UTC | #4
On Tue, Oct 15, 2024 at 06:11:39PM +0200, Borislav Petkov wrote:
> On Tue, Oct 15, 2024 at 11:31:41AM -0400, Frank Li wrote:
> > I don't think it is urgent. In most system the return value is 0. I am not
> > sure who caught it because patch already exist at downstream tree for a
> > whole.
>
> That current patch looks like it needs rethinking and making it sane - see
> below.
>
> > > > diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
> > > > index 7a9fb1202f1a0..ccc13c2adfd6f 100644
> > > > --- a/drivers/edac/fsl_ddr_edac.c
> > > > +++ b/drivers/edac/fsl_ddr_edac.c
> > > > @@ -338,11 +338,18 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
> > > >  			fsl_mc_printk(mci, KERN_ERR,
> > > >  				"Faulty ECC bit: %d\n", bad_ecc_bit);
> > > >
> > > > -		fsl_mc_printk(mci, KERN_ERR,
> > > > -			"Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > > -			cap_high ^ (1 << (bad_data_bit - 32)),
> > > > -			cap_low ^ (1 << bad_data_bit),
> > > > -			syndrome ^ (1 << bad_ecc_bit));
> > > > +		if ((bad_data_bit > 0 && bad_data_bit < 32) && bad_ecc_bit > 0) {
> > > > +			fsl_mc_printk(mci, KERN_ERR,
> > > > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > > +				      cap_high, cap_low ^ (1 << bad_data_bit),
> > > > +				      syndrome ^ (1 << bad_ecc_bit));
> > > > +		}
> > > > +		if (bad_data_bit >= 32 && bad_ecc_bit > 0) {
> > > > +			fsl_mc_printk(mci, KERN_ERR,
> > > > +				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > > > +				      cap_high ^ (1 << (bad_data_bit - 32)),
> > > > +				      cap_low, syndrome ^ (1 << bad_ecc_bit));
> > > > +		}
> > >
> > > This is getting unnecessarily clumsy than it should be. Please do the
> > > following:
> > >
> > > 	if (bad_data_bit != 1 && bad_ecc_bit != -1) {
> > >
> > > 		// prep the values you need to print
> > >
> > > 		// do an exactly one fsl_mc_printk() with the prepared values.
> > >
> > > 	}
> > >
> > > Not have 4 fsl_mc_printks with a bunch of silly if-checks in front.
>
> You missed the most important feedback. See above. ^^^^^^^

Sorry, will fix at next version.

Frank
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
diff mbox series

Patch

diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
index 7a9fb1202f1a0..ccc13c2adfd6f 100644
--- a/drivers/edac/fsl_ddr_edac.c
+++ b/drivers/edac/fsl_ddr_edac.c
@@ -338,11 +338,18 @@  static void fsl_mc_check(struct mem_ctl_info *mci)
 			fsl_mc_printk(mci, KERN_ERR,
 				"Faulty ECC bit: %d\n", bad_ecc_bit);
 
-		fsl_mc_printk(mci, KERN_ERR,
-			"Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
-			cap_high ^ (1 << (bad_data_bit - 32)),
-			cap_low ^ (1 << bad_data_bit),
-			syndrome ^ (1 << bad_ecc_bit));
+		if ((bad_data_bit > 0 && bad_data_bit < 32) && bad_ecc_bit > 0) {
+			fsl_mc_printk(mci, KERN_ERR,
+				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
+				      cap_high, cap_low ^ (1 << bad_data_bit),
+				      syndrome ^ (1 << bad_ecc_bit));
+		}
+		if (bad_data_bit >= 32 && bad_ecc_bit > 0) {
+			fsl_mc_printk(mci, KERN_ERR,
+				      "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
+				      cap_high ^ (1 << (bad_data_bit - 32)),
+				      cap_low, syndrome ^ (1 << bad_ecc_bit));
+		}
 	}
 
 	fsl_mc_printk(mci, KERN_ERR,