diff mbox series

[2/2] x86/numa: Fix the sort compare func used in numa_fill_memblks()

Message ID 99dcb3ae87e04995e9f293f6158dc8fa0749a487.1705085543.git.alison.schofield@intel.com
State Accepted
Commit 6be99530c92c6b8ff7a01903edc42393575ad63b
Headers show
Series x86/numa: Fix NUMA node overlap & init failure | expand

Commit Message

Alison Schofield Jan. 12, 2024, 8:09 p.m. UTC
From: Alison Schofield <alison.schofield@intel.com>

The compare function used to sort memblks into starting address
order fails when the result of its u64 address subtraction gets
truncated to an int upon return.

The impact of the bad sort is that memblks will be filled out
incorrectly. Depending on the set of memblks, a user may see no
errors at all but still have a bad fill, or see messages reporting
a node overlap that leads to numa init failure:

[] node 0 [mem: ] overlaps with node 1 [mem: ]
[] No NUMA configuration found

Replace with a comparison that can only result in: 1, 0, -1.

Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
 arch/x86/mm/numa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dan Williams Jan. 12, 2024, 10:20 p.m. UTC | #1
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> The compare function used to sort memblks into starting address
> order fails when the result of its u64 address subtraction gets
> truncated to an int upon return.
> 
> The impact of the bad sort is that memblks will be filled out
> incorrectly. Depending on the set of memblks, a user may see no
> errors at all but still have a bad fill, or see messages reporting
> a node overlap that leads to numa init failure:
> 
> [] node 0 [mem: ] overlaps with node 1 [mem: ]
> [] No NUMA configuration found
> 
> Replace with a comparison that can only result in: 1, 0, -1.

Good eye!

> Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()")
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
>  arch/x86/mm/numa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 8ada9bbfad58..65e9a6e391c0 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -934,7 +934,7 @@ static int __init cmp_memblk(const void *a, const void *b)
>  	const struct numa_memblk *ma = *(const struct numa_memblk **)a;
>  	const struct numa_memblk *mb = *(const struct numa_memblk **)b;
>  
> -	return ma->start - mb->start;
> +	return (ma->start > mb->start) - (ma->start < mb->start);

Maybe just do the less clever but obviously correct thing like 
other unsigned compares like ktime_compare():

    if (ma->start > mb->start)
        return 1;
    if (ma->start < mb->start)
        return -1;
    return 0;

...but otherwise this looks good to me.
Alison Schofield Jan. 13, 2024, 12:35 a.m. UTC | #2
On Fri, Jan 12, 2024 at 02:20:04PM -0800, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > The compare function used to sort memblks into starting address
> > order fails when the result of its u64 address subtraction gets
> > truncated to an int upon return.
> > 
> > The impact of the bad sort is that memblks will be filled out
> > incorrectly. Depending on the set of memblks, a user may see no
> > errors at all but still have a bad fill, or see messages reporting
> > a node overlap that leads to numa init failure:
> > 
> > [] node 0 [mem: ] overlaps with node 1 [mem: ]
> > [] No NUMA configuration found
> > 
> > Replace with a comparison that can only result in: 1, 0, -1.
> 
> Good eye!
> 
> > Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()")
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> >  arch/x86/mm/numa.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 8ada9bbfad58..65e9a6e391c0 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -934,7 +934,7 @@ static int __init cmp_memblk(const void *a, const void *b)
> >  	const struct numa_memblk *ma = *(const struct numa_memblk **)a;
> >  	const struct numa_memblk *mb = *(const struct numa_memblk **)b;
> >  
> > -	return ma->start - mb->start;
> > +	return (ma->start > mb->start) - (ma->start < mb->start);
> 
> Maybe just do the less clever but obviously correct thing like 
> other unsigned compares like ktime_compare():
> 
>     if (ma->start > mb->start)
>         return 1;
>     if (ma->start < mb->start)
>         return -1;
>     return 0;
> 
> ...but otherwise this looks good to me.

During the upstream march of the original set, I oversimplified and
broke. For this fix, I surveyed the sort compare funcs and lifted this
simple method from commit: 4ac19ead0dfb ("kvm: x86/pmu: Fix the compare
function used by the pmu event filter"). 

Let me see what else I get to trigger a v2.

Thanks for the review!
Dan Williams Jan. 13, 2024, 1:54 a.m. UTC | #3
Alison Schofield wrote:
> On Fri, Jan 12, 2024 at 02:20:04PM -0800, Dan Williams wrote:
> > alison.schofield@ wrote:
> > > From: Alison Schofield <alison.schofield@intel.com>
> > > 
> > > The compare function used to sort memblks into starting address
> > > order fails when the result of its u64 address subtraction gets
> > > truncated to an int upon return.
> > > 
> > > The impact of the bad sort is that memblks will be filled out
> > > incorrectly. Depending on the set of memblks, a user may see no
> > > errors at all but still have a bad fill, or see messages reporting
> > > a node overlap that leads to numa init failure:
> > > 
> > > [] node 0 [mem: ] overlaps with node 1 [mem: ]
> > > [] No NUMA configuration found
> > > 
> > > Replace with a comparison that can only result in: 1, 0, -1.
> > 
> > Good eye!
> > 
> > > Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()")
> > > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > > ---
> > >  arch/x86/mm/numa.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > > index 8ada9bbfad58..65e9a6e391c0 100644
> > > --- a/arch/x86/mm/numa.c
> > > +++ b/arch/x86/mm/numa.c
> > > @@ -934,7 +934,7 @@ static int __init cmp_memblk(const void *a, const void *b)
> > >  	const struct numa_memblk *ma = *(const struct numa_memblk **)a;
> > >  	const struct numa_memblk *mb = *(const struct numa_memblk **)b;
> > >  
> > > -	return ma->start - mb->start;
> > > +	return (ma->start > mb->start) - (ma->start < mb->start);
> > 
> > Maybe just do the less clever but obviously correct thing like 
> > other unsigned compares like ktime_compare():
> > 
> >     if (ma->start > mb->start)
> >         return 1;
> >     if (ma->start < mb->start)
> >         return -1;
> >     return 0;
> > 
> > ...but otherwise this looks good to me.
> 
> During the upstream march of the original set, I oversimplified and
> broke. For this fix, I surveyed the sort compare funcs and lifted this
> simple method from commit: 4ac19ead0dfb ("kvm: x86/pmu: Fix the compare
> function used by the pmu event filter"). 
> 
> Let me see what else I get to trigger a v2.

If nothing else comes in, then don't spin just for this.
diff mbox series

Patch

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 8ada9bbfad58..65e9a6e391c0 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -934,7 +934,7 @@  static int __init cmp_memblk(const void *a, const void *b)
 	const struct numa_memblk *ma = *(const struct numa_memblk **)a;
 	const struct numa_memblk *mb = *(const struct numa_memblk **)b;
 
-	return ma->start - mb->start;
+	return (ma->start > mb->start) - (ma->start < mb->start);
 }
 
 static struct numa_memblk *numa_memblk_list[NR_NODE_MEMBLKS] __initdata;