Message ID | 20230904094251.64022-1-arefev@swemel.ru
---|---
State | Accepted
Commit | ed083b0e22f1396dee3599896249a3f218845298
Series | [v2] The value may overflow
On 9/4/23 05:42, Denis Arefev wrote: > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > is subject to overflow due to a failure to cast operands to a larger > data type before performing arithmetic The patch title should identify more precisely its context, e.g.: "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" Also, as I stated in my reply to the previous version, the patch commit message should describe the impact of the bug it fixes. Thanks, Mathieu > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > Signed-off-by: Denis Arefev <arefev@swemel.ru> > --- > v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > kernel/rcu/srcutree.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 20d7a238d675..6c18e6005ae1 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) > snp->grplo = cpu; > snp->grphi = cpu; > } > - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); > } > smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); > return true; > @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp > int cpu; > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > - if (!(mask & (1 << (cpu - snp->grplo)))) > + if (!(mask & (1UL << (cpu - snp->grplo)))) > continue; > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > }
From: Mathieu Desnoyers > Sent: 04 September 2023 11:24 > > On 9/4/23 05:42, Denis Arefev wrote: > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > is subject to overflow due to a failure to cast operands to a larger > > data type before performing arithmetic > > The patch title should identify more precisely its context, e.g.: > > "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" > > Also, as I stated in my reply to the previous version, the patch commit > message should describe the impact of the bug it fixes. And is the analysis complete? Is 1UL right for 32bit archs?? Is 64 bits even enough?? David > > Thanks, > > Mathieu > > > > > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > > > Signed-off-by: Denis Arefev <arefev@swemel.ru> > > --- > > v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > kernel/rcu/srcutree.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > index 20d7a238d675..6c18e6005ae1 100644 > > --- a/kernel/rcu/srcutree.c > > +++ b/kernel/rcu/srcutree.c > > @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) > > snp->grplo = cpu; > > snp->grphi = cpu; > > } > > - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); > > } > > smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); > > return true; > > @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp > > int cpu; > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > - if (!(mask & (1 << (cpu - snp->grplo)))) > > + if (!(mask & (1UL << (cpu - snp->grplo)))) > > continue; > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > } > > -- > Mathieu Desnoyers > EfficiOS Inc. > https://www.efficios.com - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 9/5/23 05:31, David Laight wrote: > From: Mathieu Desnoyers >> Sent: 04 September 2023 11:24 >> >> On 9/4/23 05:42, Denis Arefev wrote: >>> The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) >>> is subject to overflow due to a failure to cast operands to a larger >>> data type before performing arithmetic >> >> The patch title should identify more precisely its context, e.g.: >> >> "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" >> >> Also, as I stated in my reply to the previous version, the patch commit >> message should describe the impact of the bug it fixes. > > And is the analysis complete? > Is 1UL right for 32bit archs?? > Is 64 bits even enough?? I understand from include/linux/rcu_node_tree.h and kernel/rcu/Kconfig RCU_FANOUT and RCU_FANOUT_LEAF ranges that a 32-bit integer is sufficient to hold the mask on 32-bit architectures, and a 64-bit integer is enough on 64-bit architectures given the current implementation. At least this appears to be the intent. I did not do a thorough analysis of the various parameter limits. Thanks, Mathieu > > David > >> >> Thanks, >> >> Mathieu >> >> >>> >>> Found by Linux Verification Center (linuxtesting.org) with SVACE. >>> >>> Signed-off-by: Denis Arefev <arefev@swemel.ru> >>> --- >>> v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by >>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>> kernel/rcu/srcutree.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c >>> index 20d7a238d675..6c18e6005ae1 100644 >>> --- a/kernel/rcu/srcutree.c >>> +++ b/kernel/rcu/srcutree.c >>> @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) >>> snp->grplo = cpu; >>> snp->grphi = cpu; >>> } >>> - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); >>> + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); >>> } >>> smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); >>> return true; >>> @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp >>> int cpu; >>> >>> for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { >>> - if (!(mask & (1 << (cpu - snp->grplo)))) >>> + if (!(mask & (1UL << (cpu - snp->grplo)))) >>> continue; >>> srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); >>> } >> >> -- >> Mathieu Desnoyers >> EfficiOS Inc. >> https://www.efficios.com > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales)
On Tue, Sep 05, 2023 at 08:26:51AM -0400, Mathieu Desnoyers wrote: > On 9/5/23 05:31, David Laight wrote: > > From: Mathieu Desnoyers > > > Sent: 04 September 2023 11:24 > > > > > > On 9/4/23 05:42, Denis Arefev wrote: > > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > is subject to overflow due to a failure to cast operands to a larger > > > > data type before performing arithmetic > > > > > > The patch title should identify more precisely its context, e.g.: > > > > > > "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" > > > > > > Also, as I stated in my reply to the previous version, the patch commit > > > message should describe the impact of the bug it fixes. > > > > And is the analysis complete? > > Is 1UL right for 32bit archs?? > > Is 64 bits even enough?? > > I understand from include/linux/rcu_node_tree.h and kernel/rcu/Kconfig > RCU_FANOUT and RCU_FANOUT_LEAF ranges that a 32-bit integer is sufficient to > hold the mask on 32-bit architectures, and a 64-bit integer is enough on > 64-bit architectures given the current implementation. > > At least this appears to be the intent. I did not do a thorough analysis of > the various parameter limits. Mathieu has it right. 32-bit kernels are unaffected by this bug. RCU_FANOUT_LEAF defaults to 16, which means that a 64-bit kernel would need more than 32 leaf rcu_node structures for the parent rcu_node structure to need more than 32 bit to track its children. This means that more than 32*16=512 CPUs are required for this bug to affect 64-bit systems. And there really are systems this big, so I am surprised that this has not shown up long ago. But it would not be the first time that objective reality surprised me. ;-) Thanx, Paul > Thanks, > > Mathieu > > > > > David > > > > > > > > Thanks, > > > > > > Mathieu > > > > > > > > > > > > > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > > > > > > > Signed-off-by: Denis Arefev <arefev@swemel.ru> > > > > --- > > > > v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by > > > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > kernel/rcu/srcutree.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > > > index 20d7a238d675..6c18e6005ae1 100644 > > > > --- a/kernel/rcu/srcutree.c > > > > +++ b/kernel/rcu/srcutree.c > > > > @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) > > > > snp->grplo = cpu; > > > > snp->grphi = cpu; > > > > } > > > > - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); > > > > } > > > > smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); > > > > return true; > > > > @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp > > > > int cpu; > > > > > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > > > - if (!(mask & (1 << (cpu - snp->grplo)))) > > > > + if (!(mask & (1UL << (cpu - snp->grplo)))) > > > > continue; > > > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > > > } > > > > > > -- > > > Mathieu Desnoyers > > > EfficiOS Inc. > > > https://www.efficios.com > > > > - > > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > > Registration No: 1397386 (Wales) > > -- > Mathieu Desnoyers > EfficiOS Inc. > https://www.efficios.com >
On 9/5/23 09:26, Paul E. McKenney wrote: > On Tue, Sep 05, 2023 at 08:26:51AM -0400, Mathieu Desnoyers wrote: >> On 9/5/23 05:31, David Laight wrote: >>> From: Mathieu Desnoyers >>>> Sent: 04 September 2023 11:24 >>>> >>>> On 9/4/23 05:42, Denis Arefev wrote: >>>>> The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) >>>>> is subject to overflow due to a failure to cast operands to a larger >>>>> data type before performing arithmetic >>>> >>>> The patch title should identify more precisely its context, e.g.: >>>> >>>> "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" >>>> >>>> Also, as I stated in my reply to the previous version, the patch commit >>>> message should describe the impact of the bug it fixes. >>> >>> And is the analysis complete? >>> Is 1UL right for 32bit archs?? >>> Is 64 bits even enough?? >> >> I understand from include/linux/rcu_node_tree.h and kernel/rcu/Kconfig >> RCU_FANOUT and RCU_FANOUT_LEAF ranges that a 32-bit integer is sufficient to >> hold the mask on 32-bit architectures, and a 64-bit integer is enough on >> 64-bit architectures given the current implementation. >> >> At least this appears to be the intent. I did not do a thorough analysis of >> the various parameter limits. > > Mathieu has it right. > > 32-bit kernels are unaffected by this bug. > > RCU_FANOUT_LEAF defaults to 16, which means that a 64-bit kernel would > need more than 32 leaf rcu_node structures for the parent rcu_node > structure to need more than 32 bit to track its children. This means > that more than 32*16=512 CPUs are required for this bug to affect 64-bit > systems. And there really are systems this big, so I am surprised that > this has not shown up long ago. But it would not be the first time that > objective reality surprised me. ;-) So with a 64-bit kernel, RCU_FANOUT_LEAF=16, if we have exactly 32 leaf rcu_node structures (exactly 512 CPUs), we end up in the situation where the type promotion from signed integer (32-bit) to unsigned long will carry the sign, and thus create a mask of 0xffffffff80000000. So if this weird mask is indeed an issue we should state that configurations _starting from 512 CPUs_ are affected, not just those with more than 512 CPUs. Thanks, Mathieu > > Thanx, Paul > >> Thanks, >> >> Mathieu >> >>> >>> David >>> >>>> >>>> Thanks, >>>> >>>> Mathieu >>>> >>>> >>>>> >>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE. 
>>>>> >>>>> Signed-off-by: Denis Arefev <arefev@swemel.ru> >>>>> --- >>>>> v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by >>>>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>>>> kernel/rcu/srcutree.c | 4 ++-- >>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c >>>>> index 20d7a238d675..6c18e6005ae1 100644 >>>>> --- a/kernel/rcu/srcutree.c >>>>> +++ b/kernel/rcu/srcutree.c >>>>> @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) >>>>> snp->grplo = cpu; >>>>> snp->grphi = cpu; >>>>> } >>>>> - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); >>>>> + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); >>>>> } >>>>> smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); >>>>> return true; >>>>> @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp >>>>> int cpu; >>>>> >>>>> for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { >>>>> - if (!(mask & (1 << (cpu - snp->grplo)))) >>>>> + if (!(mask & (1UL << (cpu - snp->grplo)))) >>>>> continue; >>>>> srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); >>>>> } >>>> >>>> -- >>>> Mathieu Desnoyers >>>> EfficiOS Inc. >>>> https://www.efficios.com >>> >>> - >>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK >>> Registration No: 1397386 (Wales) >> >> -- >> Mathieu Desnoyers >> EfficiOS Inc. >> https://www.efficios.com >>
On Tue, Sep 05, 2023 at 09:34:45AM -0400, Mathieu Desnoyers wrote: > On 9/5/23 09:26, Paul E. McKenney wrote: > > On Tue, Sep 05, 2023 at 08:26:51AM -0400, Mathieu Desnoyers wrote: > > > On 9/5/23 05:31, David Laight wrote: > > > > From: Mathieu Desnoyers > > > > > Sent: 04 September 2023 11:24 > > > > > > > > > > On 9/4/23 05:42, Denis Arefev wrote: > > > > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > > > is subject to overflow due to a failure to cast operands to a larger > > > > > > data type before performing arithmetic > > > > > > > > > > The patch title should identify more precisely its context, e.g.: > > > > > > > > > > "srcu: Fix srcu_struct node grpmask overflow on 64-bit systems" > > > > > > > > > > Also, as I stated in my reply to the previous version, the patch commit > > > > > message should describe the impact of the bug it fixes. > > > > > > > > And is the analysis complete? > > > > Is 1UL right for 32bit archs?? > > > > Is 64 bits even enough?? > > > > > > I understand from include/linux/rcu_node_tree.h and kernel/rcu/Kconfig > > > RCU_FANOUT and RCU_FANOUT_LEAF ranges that a 32-bit integer is sufficient to > > > hold the mask on 32-bit architectures, and a 64-bit integer is enough on > > > 64-bit architectures given the current implementation. > > > > > > At least this appears to be the intent. I did not do a thorough analysis of > > > the various parameter limits. > > > > Mathieu has it right. > > > > 32-bit kernels are unaffected by this bug. > > > > RCU_FANOUT_LEAF defaults to 16, which means that a 64-bit kernel would > > need more than 32 leaf rcu_node structures for the parent rcu_node > > structure to need more than 32 bit to track its children. This means > > that more than 32*16=512 CPUs are required for this bug to affect 64-bit > > systems. And there really are systems this big, so I am surprised that > > this has not shown up long ago. But it would not be the first time that > > objective reality surprised me. ;-) > > So with a 64-bit kernel, RCU_FANOUT_LEAF=16, if we have exactly 32 leaf > rcu_node structures (exactly 512 CPUs), we end up in the situation where the > type promotion from signed integer (32-bit) to unsigned long will carry the > sign, and thus create a mask of 0xffffffff80000000. If there is someplace that does a shift of 1UL, yes, this would be a problem. I don't see one, but I might have missed something. If all the shifts are from 1, then the top 33 bits would be always set and cleared as a unit, as if they were one bit. > So if this weird mask is indeed an issue we should state that configurations > _starting from 512 CPUs_ are affected, not just those with more than 512 > CPUs. That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would only require a 31-bit shift, which should be OK, but 497 would require a 32-bit shift, which would result in sign extension. If it turns out that sign extension is OK, then we should get in trouble at 513 CPUs, which would result in a 33-bit shift (and is that even defined in C?). But this can be tested. Let's try setting RCU_FANOUT_LEAF to 2. Then we "only" need 64 (maybe 62) CPUs to trigger the bug. And I happen to have access to a system with 80 CPUs. (Those with smaller systems can try to trigger this bug by changing the current "1 <<" to (say) "4 <<", which should trigger it on 16-CPU systems.) I just fired off an 80-CPU test with CONFIG_RCU_FANOUT_LEAF=2 and will let you know how it goes. 
Thanx, Paul > Thanks, > > Mathieu > > > > > > Thanx, Paul > > > > > Thanks, > > > > > > Mathieu > > > > > > > > > > > David > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Mathieu > > > > > > > > > > > > > > > > > > > > > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > > > > > > > > > > > Signed-off-by: Denis Arefev <arefev@swemel.ru> > > > > > > --- > > > > > > v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by > > > > > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > > > kernel/rcu/srcutree.c | 4 ++-- > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > > > > > index 20d7a238d675..6c18e6005ae1 100644 > > > > > > --- a/kernel/rcu/srcutree.c > > > > > > +++ b/kernel/rcu/srcutree.c > > > > > > @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) > > > > > > snp->grplo = cpu; > > > > > > snp->grphi = cpu; > > > > > > } > > > > > > - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > > > + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); > > > > > > } > > > > > > smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); > > > > > > return true; > > > > > > @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp > > > > > > int cpu; > > > > > > > > > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > > > > > - if (!(mask & (1 << (cpu - snp->grplo)))) > > > > > > + if (!(mask & (1UL << (cpu - snp->grplo)))) > > > > > > continue; > > > > > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > > > > > } > > > > > > > > > > -- > > > > > Mathieu Desnoyers > > > > > EfficiOS Inc. > > > > > https://www.efficios.com > > > > > > > > - > > > > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > > > > Registration No: 1397386 (Wales) > > > > > > -- > > > Mathieu Desnoyers > > > EfficiOS Inc. > > > https://www.efficios.com > > > > > -- > Mathieu Desnoyers > EfficiOS Inc. > https://www.efficios.com >
... > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > only require a 31-bit shift, which should be OK, but 497 would require > a 32-bit shift, which would result in sign extension. If it turns out > that sign extension is OK, then we should get in trouble at 513 CPUs, > which would result in a 33-bit shift (and is that even defined in C?). Not quite right :-) (1 << 31) is int and negative, that gets sign extended before being converted to 'unsigned long' - so has the top 33 bits set. (1 << 32) is undefined, the current x86 cpu ignore the high shift bits so it is (1 << 0). If the mask is being used to optimise a search the code might just happen to work! David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
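A minimal userspace demonstration of the conversion David describes, assuming an LP64 target (this is an illustrative sketch, not something posted in the thread); `1 << 32` is left out of the program because it is undefined in C, with x86 merely happening to mask the shift count as noted above:

```c
#include <stdio.h>

int main(void)
{
	/* Assumes an LP64 target (64-bit Linux): 32-bit int, 64-bit long. */
	unsigned long buggy = 1 << 31;   /* int shift; formally undefined for a
					  * signed 32-bit int, but GCC/Clang
					  * produce INT_MIN, which then
					  * sign-extends on conversion. */
	unsigned long fixed = 1UL << 31; /* unsigned long shift, well defined */

	printf("1   << 31 -> %#lx\n", buggy); /* 0xffffffff80000000: top 33 bits set */
	printf("1UL << 31 -> %#lx\n", fixed); /* 0x80000000 */
	return 0;
}
```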
On 9/5/23 10:15, David Laight wrote: > ... >> That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would >> only require a 31-bit shift, which should be OK, but 497 would require >> a 32-bit shift, which would result in sign extension. If it turns out >> that sign extension is OK, then we should get in trouble at 513 CPUs, >> which would result in a 33-bit shift (and is that even defined in C?). > > Not quite right :-) > > (1 << 31) is int and negative, that gets sign extended before > being converted to 'unsigned long' - so has the top 33 bits set. > > (1 << 32) is undefined, the current x86 cpu ignore the high > shift bits so it is (1 << 0). Yes, I was about to reply the same thing. A shift of 31 is buggy, because shifting 1 << 31 raises the sign bit, which sets the top 33 bits when cast to unsigned long. A shift of 1 << 32 is undefined, with for instance x86 choosing to ignore the top bit. But in order to have a 1 << 31 shift from this expression: sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); I *think* we need the group to have 32 cpus or more (indexed within the group from grplo to grplo + 31 (both inclusive)). So as soon as we have one group with 32 cpus, the problem should show up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 group of 32 cpus, for: 15*31 + 32 = 497 cpus. AFAIU, this would be the minimum value for smp_processor_id()+1 which triggers this issue. Thanks, Mathieu > > If the mask is being used to optimise a search the code might > just happen to work! > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales) >
On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: > On 9/5/23 10:15, David Laight wrote: > > ... > > > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > > > only require a 31-bit shift, which should be OK, but 497 would require > > > a 32-bit shift, which would result in sign extension. If it turns out > > > that sign extension is OK, then we should get in trouble at 513 CPUs, > > > which would result in a 33-bit shift (and is that even defined in C?). > > > > Not quite right :-) > > > > (1 << 31) is int and negative, that gets sign extended before > > being converted to 'unsigned long' - so has the top 33 bits set. Good point, thank you for the correction. > > (1 << 32) is undefined, the current x86 cpu ignore the high > > shift bits so it is (1 << 0). Which would cause it to attempt to invoke SRCU callbacks on the lowest-numbered CPU associated with that srcu_node structure. > Yes, I was about to reply the same thing. A shift of 31 is buggy, > because shifting 1 << 31 raises the sign bit, which sets the top 33 > bits when cast to unsigned long. A shift of 1 << 32 is undefined, > with for instance x86 choosing to ignore the top bit. > > But in order to have a 1 << 31 shift from this expression: > > sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > I *think* we need the group to have 32 cpus or more (indexed within > the group from grplo to grplo + 31 (both inclusive)). > > So as soon as we have one group with 32 cpus, the problem should show > up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 > group of 32 cpus, for: > > 15*31 + 32 = 497 cpus. > > AFAIU, this would be the minimum value for smp_processor_id()+1 which > triggers this issue. By default, there are 16 CPUs per leaf srcu_node structure. Each leaf srcu_node structure takes up one bit in the parent srcu_node structures' srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, and 32 bits and larger erroneous. This is the situation as I see it (assuming dense CPU numbering): # Leaf srcu_node Largest structures #CPUs CPU # Status 0-30 0-480 479 Good 31 481-496 495 Questionable 32- 497- 496+ Bad. Tree SRCU differs from Tree RCU in its operation, so this bug should not hold up SRCU grace periods. It might instead cause SRCU callbacks to be ignored (which would admittedly look quite similar to the user). My attempts to cheat the numbering system ran up against the limited height of the srcu_node tree. But there is still the question of why this has not been seen in the wild, given that there really are systems with more than 479 CPUs out there. One possibility is the call to srcu_schedule_cbs_sdp() from srcu_funnel_gp_start(). But it does not seem likely that this would happen all that often, as it requires back-to-back grace periods and then some. Maybe people with large systems boot with srcutree.convert_to_big=0? Adding Laurent for his thoughts. Aha! Here is what makes it work, given David's description of 1<<32: static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, unsigned long mask, unsigned long delay) { int cpu; for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { if (!(mask & (1 << (cpu - snp->grplo)))) continue; srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); } } As long as at least one bit is set in the result of 1<<N, and as long as the compiler always does the same thing for a given N, then this loop will make the right thing happen. 
But even that is not relied upon, because the calling code looks like this: spin_lock_irq_rcu_node(snp); cbs = false; last_lvl = snp >= sup->level[rcu_num_lvls - 1]; if (last_lvl) cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; snp->srcu_have_cbs[idx] = gpseq; rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); sgsne = snp->srcu_gp_seq_needed_exp; if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); if (ss_state < SRCU_SIZE_BIG) mask = ~0; else mask = snp->srcu_data_have_cbs[idx]; snp->srcu_data_have_cbs[idx] = 0; spin_unlock_irq_rcu_node(snp); if (cbs) srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); So that last_lvl check means that the srcu_schedule_cbs_snp() function is invoked only for leaf srcu_node structures. Which by default limit the shift to 16. So this bug appears to have no effect for default RCU setups, even on very large 64-bit systems, which is consistent with field experience. Even if this is the case, it still should be fixed, to avoid confusion if nothing else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. Which actually happened the other day due to someone trusting ChatGPT's opinion about RCU Kconfig options... Thanx, Paul > Thanks, > > Mathieu > > > > > If the mask is being used to optimise a search the code might > > just happen to work! > > > > David > > > > - > > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > > Registration No: 1397386 (Wales) > > > > -- > Mathieu Desnoyers > EfficiOS Inc. > https://www.efficios.com >
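To make the "still happens to work" argument concrete, here is a small userspace sketch. It is explicitly not the kernel code: it models the x86 shift-count masking David mentioned (and the off == 31 sign extension) with explicit casts so the sketch itself stays well defined, and the group offsets used are hypothetical.

```c
#include <stdio.h>

/*
 * Userspace model of the consistency argument for srcu_schedule_cbs_snp():
 * the stored mask bits and the per-CPU test bits come from the same buggy
 * "1 << off" expression, so the intended CPU is never missed; at worst
 * extra CPUs get looked at.
 *
 * buggy_bit() models what "1 << off" tends to produce on LP64 x86 once the
 * int result is converted to unsigned long: the hardware masks the shift
 * count to 5 bits (the C standard calls off >= 32 undefined), and off == 31
 * sign-extends into the upper 32 bits.
 */
static unsigned long buggy_bit(int off)
{
	return (unsigned long)(int)(1U << (off & 31));
}

int main(void)
{
	/* Pretend offsets 5 and 32 within one (hypothetically large) group
	 * have callbacks queued. */
	unsigned long mask = buggy_bit(5) | buggy_bit(32);
	int off;

	for (off = 0; off <= 33; off++)
		if (mask & buggy_bit(off))
			printf("offset %2d gets scheduled\n", off);
	/* Prints offsets 0, 5 and 32: offset 32 aliases bit 0, so offset 0 is
	 * scheduled spuriously, but nothing that needed attention is skipped. */
	return 0;
}
```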
On Tue, Sep 05, 2023 at 09:40:51AM -0700, Paul E. McKenney wrote: > On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: > > On 9/5/23 10:15, David Laight wrote: > > > ... > > > > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > > > > only require a 31-bit shift, which should be OK, but 497 would require > > > > a 32-bit shift, which would result in sign extension. If it turns out > > > > that sign extension is OK, then we should get in trouble at 513 CPUs, > > > > which would result in a 33-bit shift (and is that even defined in C?). > > > > > > Not quite right :-) > > > > > > (1 << 31) is int and negative, that gets sign extended before > > > being converted to 'unsigned long' - so has the top 33 bits set. > > Good point, thank you for the correction. > > > > (1 << 32) is undefined, the current x86 cpu ignore the high > > > shift bits so it is (1 << 0). > > Which would cause it to attempt to invoke SRCU callbacks on the > lowest-numbered CPU associated with that srcu_node structure. > > > Yes, I was about to reply the same thing. A shift of 31 is buggy, > > because shifting 1 << 31 raises the sign bit, which sets the top 33 > > bits when cast to unsigned long. A shift of 1 << 32 is undefined, > > with for instance x86 choosing to ignore the top bit. > > > > But in order to have a 1 << 31 shift from this expression: > > > > sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > I *think* we need the group to have 32 cpus or more (indexed within > > the group from grplo to grplo + 31 (both inclusive)). > > > > So as soon as we have one group with 32 cpus, the problem should show > > up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 > > group of 32 cpus, for: > > > > 15*31 + 32 = 497 cpus. > > > > AFAIU, this would be the minimum value for smp_processor_id()+1 which > > triggers this issue. > > By default, there are 16 CPUs per leaf srcu_node structure. Each leaf > srcu_node structure takes up one bit in the parent srcu_node structures' > srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, > and 32 bits and larger erroneous. > > This is the situation as I see it (assuming dense CPU numbering): > > # Leaf srcu_node Largest > structures #CPUs CPU # Status > > 0-30 0-480 479 Good > 31 481-496 495 Questionable > 32- 497- 496+ Bad. > > Tree SRCU differs from Tree RCU in its operation, so this bug should > not hold up SRCU grace periods. It might instead cause SRCU callbacks > to be ignored (which would admittedly look quite similar to the user). > > My attempts to cheat the numbering system ran up against the limited > height of the srcu_node tree. > > But there is still the question of why this has not been seen in the > wild, given that there really are systems with more than 479 CPUs > out there. One possibility is the call to srcu_schedule_cbs_sdp() > from srcu_funnel_gp_start(). But it does not seem likely that this > would happen all that often, as it requires back-to-back grace periods > and then some. > > Maybe people with large systems boot with srcutree.convert_to_big=0? > > Adding Laurent for his thoughts. > > Aha! 
> > Here is what makes it work, given David's description of 1<<32: > > static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, > unsigned long mask, unsigned long delay) > { > int cpu; > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > if (!(mask & (1 << (cpu - snp->grplo)))) > continue; > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > } > } > > As long as at least one bit is set in the result of 1<<N, and as long > as the compiler always does the same thing for a given N, then this loop > will make the right thing happen. > > But even that is not relied upon, because the calling code looks like > this: > > spin_lock_irq_rcu_node(snp); > cbs = false; > last_lvl = snp >= sup->level[rcu_num_lvls - 1]; > if (last_lvl) > cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; > snp->srcu_have_cbs[idx] = gpseq; > rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); > sgsne = snp->srcu_gp_seq_needed_exp; > if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) > WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); > if (ss_state < SRCU_SIZE_BIG) > mask = ~0; > else > mask = snp->srcu_data_have_cbs[idx]; > snp->srcu_data_have_cbs[idx] = 0; > spin_unlock_irq_rcu_node(snp); > if (cbs) > srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); > > So that last_lvl check means that the srcu_schedule_cbs_snp() function > is invoked only for leaf srcu_node structures. Which by default limit > the shift to 16. > > So this bug appears to have no effect for default RCU setups, even on very > large 64-bit systems, which is consistent with field experience. Even if > this is the case, it still should be fixed, to avoid confusion if nothing > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > Which actually happened the other day due to someone trusting ChatGPT's > opinion about RCU Kconfig options... And I therefore queued Denis's v3 patch with an edited commit log. Of course, if anyone sees some other way that the bug could manifest other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater than 30 on a system with at least 31 CPUs, please let me know so that I can adjust. Thanx, Paul ------------------------------------------------------------------------ commit ed083b0e22f1396dee3599896249a3f218845298 Author: Denis Arefev <arefev@swemel.ru> Date: Mon Sep 4 15:21:14 2023 +0300 Fix srcu_struct node grpmask overflow on 64-bit systems The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) is subject to overflow due to a failure to cast operands to a larger data type before performing arithmetic. The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF Kconfig option, which on 64-bit systems defaults to 16 (resulting in a maximum shift of 15), but which can be set up as high as 64 (resulting in a maximum shift of 63). A value of 31 can result in sign extension, resulting in 0xffffffff80000000 instead of the desired 0x80000000. A value of 31 or greater triggers undefined behavior per the C standard. This bug has not been known to cause issues because almost all kernels take the default CONFIG_RCU_FANOUT_LEAF=16. Furthermore, as long as a given compiler gives a deterministic result for 1<<N for N>=32, the code correctly invokes all SRCU callbacks, albeit wasting CPU time along the way. This commit therefore substitutes the correct 1UL for the buggy 1. Found by Linux Verification Center (linuxtesting.org) with SVACE. 
Signed-off-by: Denis Arefev <arefev@swemel.ru> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: David Laight <David.Laight@aculab.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 833a8f848a90..5602042856b1 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) snp->grplo = cpu; snp->grphi = cpu; } - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); } smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); return true; @@ -835,7 +835,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp int cpu; for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { - if (!(mask & (1 << (cpu - snp->grplo)))) + if (!(mask & (1UL << (cpu - snp->grplo)))) continue; srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); }
On 9/5/23 15:27, Paul E. McKenney wrote: > On Tue, Sep 05, 2023 at 09:40:51AM -0700, Paul E. McKenney wrote: >> On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: >>> On 9/5/23 10:15, David Laight wrote: >>>> ... >>>>> That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would >>>>> only require a 31-bit shift, which should be OK, but 497 would require >>>>> a 32-bit shift, which would result in sign extension. If it turns out >>>>> that sign extension is OK, then we should get in trouble at 513 CPUs, >>>>> which would result in a 33-bit shift (and is that even defined in C?). >>>> >>>> Not quite right :-) >>>> >>>> (1 << 31) is int and negative, that gets sign extended before >>>> being converted to 'unsigned long' - so has the top 33 bits set. >> >> Good point, thank you for the correction. >> >>>> (1 << 32) is undefined, the current x86 cpu ignore the high >>>> shift bits so it is (1 << 0). >> >> Which would cause it to attempt to invoke SRCU callbacks on the >> lowest-numbered CPU associated with that srcu_node structure. >> >>> Yes, I was about to reply the same thing. A shift of 31 is buggy, >>> because shifting 1 << 31 raises the sign bit, which sets the top 33 >>> bits when cast to unsigned long. A shift of 1 << 32 is undefined, >>> with for instance x86 choosing to ignore the top bit. >>> >>> But in order to have a 1 << 31 shift from this expression: >>> >>> sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); >>> >>> I *think* we need the group to have 32 cpus or more (indexed within >>> the group from grplo to grplo + 31 (both inclusive)). >>> >>> So as soon as we have one group with 32 cpus, the problem should show >>> up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 >>> group of 32 cpus, for: >>> >>> 15*31 + 32 = 497 cpus. >>> >>> AFAIU, this would be the minimum value for smp_processor_id()+1 which >>> triggers this issue. >> >> By default, there are 16 CPUs per leaf srcu_node structure. Each leaf >> srcu_node structure takes up one bit in the parent srcu_node structures' >> srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, >> and 32 bits and larger erroneous. >> >> This is the situation as I see it (assuming dense CPU numbering): >> >> # Leaf srcu_node Largest >> structures #CPUs CPU # Status >> >> 0-30 0-480 479 Good >> 31 481-496 495 Questionable >> 32- 497- 496+ Bad. >> >> Tree SRCU differs from Tree RCU in its operation, so this bug should >> not hold up SRCU grace periods. It might instead cause SRCU callbacks >> to be ignored (which would admittedly look quite similar to the user). >> >> My attempts to cheat the numbering system ran up against the limited >> height of the srcu_node tree. >> >> But there is still the question of why this has not been seen in the >> wild, given that there really are systems with more than 479 CPUs >> out there. One possibility is the call to srcu_schedule_cbs_sdp() >> from srcu_funnel_gp_start(). But it does not seem likely that this >> would happen all that often, as it requires back-to-back grace periods >> and then some. >> >> Maybe people with large systems boot with srcutree.convert_to_big=0? >> >> Adding Laurent for his thoughts. >> >> Aha! 
>> >> Here is what makes it work, given David's description of 1<<32: >> >> static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, >> unsigned long mask, unsigned long delay) >> { >> int cpu; >> >> for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { >> if (!(mask & (1 << (cpu - snp->grplo)))) >> continue; >> srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); >> } >> } >> >> As long as at least one bit is set in the result of 1<<N, and as long >> as the compiler always does the same thing for a given N, then this loop >> will make the right thing happen. >> >> But even that is not relied upon, because the calling code looks like >> this: >> >> spin_lock_irq_rcu_node(snp); >> cbs = false; >> last_lvl = snp >= sup->level[rcu_num_lvls - 1]; >> if (last_lvl) >> cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; >> snp->srcu_have_cbs[idx] = gpseq; >> rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); >> sgsne = snp->srcu_gp_seq_needed_exp; >> if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) >> WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); >> if (ss_state < SRCU_SIZE_BIG) >> mask = ~0; >> else >> mask = snp->srcu_data_have_cbs[idx]; >> snp->srcu_data_have_cbs[idx] = 0; >> spin_unlock_irq_rcu_node(snp); >> if (cbs) >> srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); >> >> So that last_lvl check means that the srcu_schedule_cbs_snp() function >> is invoked only for leaf srcu_node structures. Which by default limit >> the shift to 16. >> >> So this bug appears to have no effect for default RCU setups, even on very >> large 64-bit systems, which is consistent with field experience. Even if >> this is the case, it still should be fixed, to avoid confusion if nothing >> else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. >> Which actually happened the other day due to someone trusting ChatGPT's >> opinion about RCU Kconfig options... > > And I therefore queued Denis's v3 patch with an edited commit log. > Of course, if anyone sees some other way that the bug could manifest > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > than 30 on a system with at least 31 CPUs, please let me know so that > I can adjust. > > Thanx, Paul > > ------------------------------------------------------------------------ > > commit ed083b0e22f1396dee3599896249a3f218845298 > Author: Denis Arefev <arefev@swemel.ru> > Date: Mon Sep 4 15:21:14 2023 +0300 > > Fix srcu_struct node grpmask overflow on 64-bit systems > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) AFAIU, the overflow resides in the "bitwise expression" and not the arithmetic expression. Other than this, please add my Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > is subject to overflow due to a failure to cast operands to a larger > data type before performing arithmetic. > > The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF > Kconfig option, which on 64-bit systems defaults to 16 (resulting in a > maximum shift of 15), but which can be set up as high as 64 (resulting > in a maximum shift of 63). A value of 31 can result in sign extension, > resulting in 0xffffffff80000000 instead of the desired 0x80000000. > A value of 31 or greater triggers undefined behavior per the C standard. > > This bug has not been known to cause issues because almost all kernels > take the default CONFIG_RCU_FANOUT_LEAF=16. 
Furthermore, as long as > a given compiler gives a deterministic result for 1<<N for N>=32, the > code correctly invokes all SRCU callbacks, albeit wasting CPU time along > the way. > > This commit therefore substitutes the correct 1UL for the buggy 1. > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > Signed-off-by: Denis Arefev <arefev@swemel.ru> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: David Laight <David.Laight@aculab.com> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 833a8f848a90..5602042856b1 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) > snp->grplo = cpu; > snp->grphi = cpu; > } > - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); > } > smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); > return true; > @@ -835,7 +835,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp > int cpu; > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > - if (!(mask & (1 << (cpu - snp->grplo)))) > + if (!(mask & (1UL << (cpu - snp->grplo)))) > continue; > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > }
On Tue, Sep 05, 2023 at 03:36:40PM -0400, Mathieu Desnoyers wrote: > On 9/5/23 15:27, Paul E. McKenney wrote: > > On Tue, Sep 05, 2023 at 09:40:51AM -0700, Paul E. McKenney wrote: > > > On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: > > > > On 9/5/23 10:15, David Laight wrote: > > > > > ... > > > > > > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > > > > > > only require a 31-bit shift, which should be OK, but 497 would require > > > > > > a 32-bit shift, which would result in sign extension. If it turns out > > > > > > that sign extension is OK, then we should get in trouble at 513 CPUs, > > > > > > which would result in a 33-bit shift (and is that even defined in C?). > > > > > > > > > > Not quite right :-) > > > > > > > > > > (1 << 31) is int and negative, that gets sign extended before > > > > > being converted to 'unsigned long' - so has the top 33 bits set. > > > > > > Good point, thank you for the correction. > > > > > > > > (1 << 32) is undefined, the current x86 cpu ignore the high > > > > > shift bits so it is (1 << 0). > > > > > > Which would cause it to attempt to invoke SRCU callbacks on the > > > lowest-numbered CPU associated with that srcu_node structure. > > > > > > > Yes, I was about to reply the same thing. A shift of 31 is buggy, > > > > because shifting 1 << 31 raises the sign bit, which sets the top 33 > > > > bits when cast to unsigned long. A shift of 1 << 32 is undefined, > > > > with for instance x86 choosing to ignore the top bit. > > > > > > > > But in order to have a 1 << 31 shift from this expression: > > > > > > > > sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > > > > > I *think* we need the group to have 32 cpus or more (indexed within > > > > the group from grplo to grplo + 31 (both inclusive)). > > > > > > > > So as soon as we have one group with 32 cpus, the problem should show > > > > up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 > > > > group of 32 cpus, for: > > > > > > > > 15*31 + 32 = 497 cpus. > > > > > > > > AFAIU, this would be the minimum value for smp_processor_id()+1 which > > > > triggers this issue. > > > > > > By default, there are 16 CPUs per leaf srcu_node structure. Each leaf > > > srcu_node structure takes up one bit in the parent srcu_node structures' > > > srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, > > > and 32 bits and larger erroneous. > > > > > > This is the situation as I see it (assuming dense CPU numbering): > > > > > > # Leaf srcu_node Largest > > > structures #CPUs CPU # Status > > > > > > 0-30 0-480 479 Good > > > 31 481-496 495 Questionable > > > 32- 497- 496+ Bad. > > > > > > Tree SRCU differs from Tree RCU in its operation, so this bug should > > > not hold up SRCU grace periods. It might instead cause SRCU callbacks > > > to be ignored (which would admittedly look quite similar to the user). > > > > > > My attempts to cheat the numbering system ran up against the limited > > > height of the srcu_node tree. > > > > > > But there is still the question of why this has not been seen in the > > > wild, given that there really are systems with more than 479 CPUs > > > out there. One possibility is the call to srcu_schedule_cbs_sdp() > > > from srcu_funnel_gp_start(). But it does not seem likely that this > > > would happen all that often, as it requires back-to-back grace periods > > > and then some. > > > > > > Maybe people with large systems boot with srcutree.convert_to_big=0? > > > > > > Adding Laurent for his thoughts. 
> > > > > > Aha! > > > > > > Here is what makes it work, given David's description of 1<<32: > > > > > > static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, > > > unsigned long mask, unsigned long delay) > > > { > > > int cpu; > > > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > > if (!(mask & (1 << (cpu - snp->grplo)))) > > > continue; > > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > > } > > > } > > > > > > As long as at least one bit is set in the result of 1<<N, and as long > > > as the compiler always does the same thing for a given N, then this loop > > > will make the right thing happen. > > > > > > But even that is not relied upon, because the calling code looks like > > > this: > > > > > > spin_lock_irq_rcu_node(snp); > > > cbs = false; > > > last_lvl = snp >= sup->level[rcu_num_lvls - 1]; > > > if (last_lvl) > > > cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; > > > snp->srcu_have_cbs[idx] = gpseq; > > > rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); > > > sgsne = snp->srcu_gp_seq_needed_exp; > > > if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) > > > WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); > > > if (ss_state < SRCU_SIZE_BIG) > > > mask = ~0; > > > else > > > mask = snp->srcu_data_have_cbs[idx]; > > > snp->srcu_data_have_cbs[idx] = 0; > > > spin_unlock_irq_rcu_node(snp); > > > if (cbs) > > > srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); > > > > > > So that last_lvl check means that the srcu_schedule_cbs_snp() function > > > is invoked only for leaf srcu_node structures. Which by default limit > > > the shift to 16. > > > > > > So this bug appears to have no effect for default RCU setups, even on very > > > large 64-bit systems, which is consistent with field experience. Even if > > > this is the case, it still should be fixed, to avoid confusion if nothing > > > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > > > Which actually happened the other day due to someone trusting ChatGPT's > > > opinion about RCU Kconfig options... > > > > And I therefore queued Denis's v3 patch with an edited commit log. > > Of course, if anyone sees some other way that the bug could manifest > > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > > than 30 on a system with at least 31 CPUs, please let me know so that > > I can adjust. > > > > Thanx, Paul > > > > ------------------------------------------------------------------------ > > > > commit ed083b0e22f1396dee3599896249a3f218845298 > > Author: Denis Arefev <arefev@swemel.ru> > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > AFAIU, the overflow resides in the "bitwise expression" and not > the arithmetic expression. Rather than quibble about exactly what constitutes arithmetic, I updated the first paragraph and added your Reviewed-by as shown below. ;-) > Other than this, please add my > > Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Thank you all!!! 
------------------------------------------------------------------------ commit 50477ff756ab99402b1523b7c6be8b5d790d05e7 Author: Denis Arefev <arefev@swemel.ru> Date: Mon Sep 4 15:21:14 2023 +0300 Fix srcu_struct node grpmask overflow on 64-bit systems The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) is subject to overflow due to a failure to cast operands to a larger data type before performing the bitwise operation. The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF Kconfig option, which on 64-bit systems defaults to 16 (resulting in a maximum shift of 15), but which can be set up as high as 64 (resulting in a maximum shift of 63). A value of 31 can result in sign extension, resulting in 0xffffffff80000000 instead of the desired 0x80000000. A value of 31 or greater triggers undefined behavior per the C standard. This bug has not been known to cause issues because almost all kernels take the default CONFIG_RCU_FANOUT_LEAF=16. Furthermore, as long as a given compiler gives a deterministic result for 1<<N for N>=32, the code correctly invokes all SRCU callbacks, albeit wasting CPU time along the way. This commit therefore substitutes the correct 1UL for the buggy 1. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Denis Arefev <arefev@swemel.ru> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: David Laight <David.Laight@aculab.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 833a8f848a90..5602042856b1 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) snp->grplo = cpu; snp->grphi = cpu; } - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); } smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); return true; @@ -835,7 +835,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp int cpu; for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { - if (!(mask & (1 << (cpu - snp->grplo)))) + if (!(mask & (1UL << (cpu - snp->grplo)))) continue; srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); }
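As a quick userspace sanity check of the pattern the queued fix relies on (again a sketch, not kernel code), the 1UL form produces exactly one well-defined bit for every shift count reachable with CONFIG_RCU_FANOUT_LEAF as high as 64:

```c
#include <assert.h>
#include <stdio.h>

int main(void)
{
	int off;

	/* The fixed expression yields a single bit for every shift count
	 * that a 64-CPU leaf group could produce. */
	for (off = 0; off < 64; off++) {
		unsigned long m = 1UL << off;

		assert(m != 0 && (m & (m - 1)) == 0); /* exactly one bit set */
	}
	printf("1UL << 0..63 all yield single-bit masks\n");
	return 0;
}
```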
On Tue, Sep 5, 2023 at 4:13 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Sep 05, 2023 at 03:36:40PM -0400, Mathieu Desnoyers wrote: > > On 9/5/23 15:27, Paul E. McKenney wrote: > > > On Tue, Sep 05, 2023 at 09:40:51AM -0700, Paul E. McKenney wrote: > > > > On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: > > > > > On 9/5/23 10:15, David Laight wrote: > > > > > > ... > > > > > > > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > > > > > > > only require a 31-bit shift, which should be OK, but 497 would require > > > > > > > a 32-bit shift, which would result in sign extension. If it turns out > > > > > > > that sign extension is OK, then we should get in trouble at 513 CPUs, > > > > > > > which would result in a 33-bit shift (and is that even defined in C?). > > > > > > > > > > > > Not quite right :-) > > > > > > > > > > > > (1 << 31) is int and negative, that gets sign extended before > > > > > > being converted to 'unsigned long' - so has the top 33 bits set. > > > > > > > > Good point, thank you for the correction. > > > > > > > > > > (1 << 32) is undefined, the current x86 cpu ignore the high > > > > > > shift bits so it is (1 << 0). > > > > > > > > Which would cause it to attempt to invoke SRCU callbacks on the > > > > lowest-numbered CPU associated with that srcu_node structure. > > > > > > > > > Yes, I was about to reply the same thing. A shift of 31 is buggy, > > > > > because shifting 1 << 31 raises the sign bit, which sets the top 33 > > > > > bits when cast to unsigned long. A shift of 1 << 32 is undefined, > > > > > with for instance x86 choosing to ignore the top bit. > > > > > > > > > > But in order to have a 1 << 31 shift from this expression: > > > > > > > > > > sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > > > > > > > I *think* we need the group to have 32 cpus or more (indexed within > > > > > the group from grplo to grplo + 31 (both inclusive)). > > > > > > > > > > So as soon as we have one group with 32 cpus, the problem should show > > > > > up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 > > > > > group of 32 cpus, for: > > > > > > > > > > 15*31 + 32 = 497 cpus. > > > > > > > > > > AFAIU, this would be the minimum value for smp_processor_id()+1 which > > > > > triggers this issue. > > > > > > > > By default, there are 16 CPUs per leaf srcu_node structure. Each leaf > > > > srcu_node structure takes up one bit in the parent srcu_node structures' > > > > srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, > > > > and 32 bits and larger erroneous. > > > > > > > > This is the situation as I see it (assuming dense CPU numbering): > > > > > > > > # Leaf srcu_node Largest > > > > structures #CPUs CPU # Status > > > > > > > > 0-30 0-480 479 Good > > > > 31 481-496 495 Questionable > > > > 32- 497- 496+ Bad. > > > > > > > > Tree SRCU differs from Tree RCU in its operation, so this bug should > > > > not hold up SRCU grace periods. It might instead cause SRCU callbacks > > > > to be ignored (which would admittedly look quite similar to the user). > > > > > > > > My attempts to cheat the numbering system ran up against the limited > > > > height of the srcu_node tree. > > > > > > > > But there is still the question of why this has not been seen in the > > > > wild, given that there really are systems with more than 479 CPUs > > > > out there. One possibility is the call to srcu_schedule_cbs_sdp() > > > > from srcu_funnel_gp_start(). 
But it does not seem likely that this > > > > would happen all that often, as it requires back-to-back grace periods > > > > and then some. > > > > > > > > Maybe people with large systems boot with srcutree.convert_to_big=0? > > > > > > > > Adding Laurent for his thoughts. > > > > > > > > Aha! > > > > > > > > Here is what makes it work, given David's description of 1<<32: > > > > > > > > static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, > > > > unsigned long mask, unsigned long delay) > > > > { > > > > int cpu; > > > > > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > > > if (!(mask & (1 << (cpu - snp->grplo)))) > > > > continue; > > > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > > > } > > > > } > > > > > > > > As long as at least one bit is set in the result of 1<<N, and as long > > > > as the compiler always does the same thing for a given N, then this loop > > > > will make the right thing happen. > > > > > > > > But even that is not relied upon, because the calling code looks like > > > > this: > > > > > > > > spin_lock_irq_rcu_node(snp); > > > > cbs = false; > > > > last_lvl = snp >= sup->level[rcu_num_lvls - 1]; > > > > if (last_lvl) > > > > cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; > > > > snp->srcu_have_cbs[idx] = gpseq; > > > > rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); > > > > sgsne = snp->srcu_gp_seq_needed_exp; > > > > if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) > > > > WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); > > > > if (ss_state < SRCU_SIZE_BIG) > > > > mask = ~0; > > > > else > > > > mask = snp->srcu_data_have_cbs[idx]; > > > > snp->srcu_data_have_cbs[idx] = 0; > > > > spin_unlock_irq_rcu_node(snp); > > > > if (cbs) > > > > srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); > > > > > > > > So that last_lvl check means that the srcu_schedule_cbs_snp() function > > > > is invoked only for leaf srcu_node structures. Which by default limit > > > > the shift to 16. > > > > > > > > So this bug appears to have no effect for default RCU setups, even on very > > > > large 64-bit systems, which is consistent with field experience. Even if > > > > this is the case, it still should be fixed, to avoid confusion if nothing > > > > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > > > > Which actually happened the other day due to someone trusting ChatGPT's > > > > opinion about RCU Kconfig options... > > > > > > And I therefore queued Denis's v3 patch with an edited commit log. > > > Of course, if anyone sees some other way that the bug could manifest > > > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > > > than 30 on a system with at least 31 CPUs, please let me know so that > > > I can adjust. > > > > > > Thanx, Paul > > > > > > ------------------------------------------------------------------------ > > > > > > commit ed083b0e22f1396dee3599896249a3f218845298 > > > Author: Denis Arefev <arefev@swemel.ru> > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > AFAIU, the overflow resides in the "bitwise expression" and not > > the arithmetic expression. > > Rather than quibble about exactly what constitutes arithmetic, I > updated the first paragraph and added your Reviewed-by as shown > below. 
;-) > > > Other than this, please add my > > > > Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > Thank you all!!! > > ------------------------------------------------------------------------ > > commit 50477ff756ab99402b1523b7c6be8b5d790d05e7 > Author: Denis Arefev <arefev@swemel.ru> > Date: Mon Sep 4 15:21:14 2023 +0300 > > Fix srcu_struct node grpmask overflow on 64-bit systems > > The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) > is subject to overflow due to a failure to cast operands to a larger > data type before performing the bitwise operation. > > The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF > Kconfig option, which on 64-bit systems defaults to 16 (resulting in a > maximum shift of 15), but which can be set up as high as 64 (resulting > in a maximum shift of 63). A value of 31 can result in sign extension, > resulting in 0xffffffff80000000 instead of the desired 0x80000000. > A value of 31 or greater triggers undefined behavior per the C standard. Do you mean here "A value of 32 or greater"? Only N >= 32 throws warning for: unsigned long foo = (1 << N); N=31 does undesirable sign extension but no warning. thanks, - Joel
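To make the shift behaviour being discussed concrete, here is a minimal stand-alone user-space sketch (not kernel code; it assumes an LP64 target such as x86-64 built with gcc or clang, where int is 32 bits and unsigned long is 64 bits):

#include <stdio.h>

int main(void)
{
	int shift = 31;	/* offset (cpu - grplo) of the 32nd CPU in a leaf */

	/* The shift is done in int, so bit 31 is the sign bit and the
	 * conversion to unsigned long sign-extends (with no warning). */
	unsigned long bad = 1 << shift;

	/* Doing the shift in unsigned long sets exactly one bit. */
	unsigned long good = 1UL << shift;

	printf("1   << 31 -> %#lx\n", bad);	/* 0xffffffff80000000 */
	printf("1UL << 31 -> %#lx\n", good);	/* 0x80000000 */
	return 0;
}

For shift counts of 32 and above, the plain "1 << shift" form is undefined behaviour outright, since the promoted type int is only 32 bits wide, while "1UL << shift" stays well defined up to a count of 63 on such targets.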
On Tue, Sep 05, 2023 at 04:40:46PM -0400, Joel Fernandes wrote: > On Tue, Sep 5, 2023 at 4:13 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Tue, Sep 05, 2023 at 03:36:40PM -0400, Mathieu Desnoyers wrote: > > > On 9/5/23 15:27, Paul E. McKenney wrote: > > > > On Tue, Sep 05, 2023 at 09:40:51AM -0700, Paul E. McKenney wrote: > > > > > On Tue, Sep 05, 2023 at 10:34:43AM -0400, Mathieu Desnoyers wrote: > > > > > > On 9/5/23 10:15, David Laight wrote: > > > > > > > ... > > > > > > > > That would instead be more than 512-16=496 CPUs, correct? 496 CPUs would > > > > > > > > only require a 31-bit shift, which should be OK, but 497 would require > > > > > > > > a 32-bit shift, which would result in sign extension. If it turns out > > > > > > > > that sign extension is OK, then we should get in trouble at 513 CPUs, > > > > > > > > which would result in a 33-bit shift (and is that even defined in C?). > > > > > > > > > > > > > > Not quite right :-) > > > > > > > > > > > > > > (1 << 31) is int and negative, that gets sign extended before > > > > > > > being converted to 'unsigned long' - so has the top 33 bits set. > > > > > > > > > > Good point, thank you for the correction. > > > > > > > > > > > > (1 << 32) is undefined, the current x86 cpu ignore the high > > > > > > > shift bits so it is (1 << 0). > > > > > > > > > > Which would cause it to attempt to invoke SRCU callbacks on the > > > > > lowest-numbered CPU associated with that srcu_node structure. > > > > > > > > > > > Yes, I was about to reply the same thing. A shift of 31 is buggy, > > > > > > because shifting 1 << 31 raises the sign bit, which sets the top 33 > > > > > > bits when cast to unsigned long. A shift of 1 << 32 is undefined, > > > > > > with for instance x86 choosing to ignore the top bit. > > > > > > > > > > > > But in order to have a 1 << 31 shift from this expression: > > > > > > > > > > > > sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); > > > > > > > > > > > > I *think* we need the group to have 32 cpus or more (indexed within > > > > > > the group from grplo to grplo + 31 (both inclusive)). > > > > > > > > > > > > So as soon as we have one group with 32 cpus, the problem should show > > > > > > up. With FANOUT_LEAF=16, we can have 15 groups of 31 cpus and 1 > > > > > > group of 32 cpus, for: > > > > > > > > > > > > 15*31 + 32 = 497 cpus. > > > > > > > > > > > > AFAIU, this would be the minimum value for smp_processor_id()+1 which > > > > > > triggers this issue. > > > > > > > > > > By default, there are 16 CPUs per leaf srcu_node structure. Each leaf > > > > > srcu_node structure takes up one bit in the parent srcu_node structures' > > > > > srcu_data_have_cbs[] array. Up to 30 bits is OK, 31 bits is questionable, > > > > > and 32 bits and larger erroneous. > > > > > > > > > > This is the situation as I see it (assuming dense CPU numbering): > > > > > > > > > > # Leaf srcu_node Largest > > > > > structures #CPUs CPU # Status > > > > > > > > > > 0-30 0-480 479 Good > > > > > 31 481-496 495 Questionable > > > > > 32- 497- 496+ Bad. > > > > > > > > > > Tree SRCU differs from Tree RCU in its operation, so this bug should > > > > > not hold up SRCU grace periods. It might instead cause SRCU callbacks > > > > > to be ignored (which would admittedly look quite similar to the user). > > > > > > > > > > My attempts to cheat the numbering system ran up against the limited > > > > > height of the srcu_node tree. 
> > > > > > > > > > But there is still the question of why this has not been seen in the > > > > > wild, given that there really are systems with more than 479 CPUs > > > > > out there. One possibility is the call to srcu_schedule_cbs_sdp() > > > > > from srcu_funnel_gp_start(). But it does not seem likely that this > > > > > would happen all that often, as it requires back-to-back grace periods > > > > > and then some. > > > > > > > > > > Maybe people with large systems boot with srcutree.convert_to_big=0? > > > > > > > > > > Adding Laurent for his thoughts. > > > > > > > > > > Aha! > > > > > > > > > > Here is what makes it work, given David's description of 1<<32: > > > > > > > > > > static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp, > > > > > unsigned long mask, unsigned long delay) > > > > > { > > > > > int cpu; > > > > > > > > > > for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { > > > > > if (!(mask & (1 << (cpu - snp->grplo)))) > > > > > continue; > > > > > srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); > > > > > } > > > > > } > > > > > > > > > > As long as at least one bit is set in the result of 1<<N, and as long > > > > > as the compiler always does the same thing for a given N, then this loop > > > > > will make the right thing happen. > > > > > > > > > > But even that is not relied upon, because the calling code looks like > > > > > this: > > > > > > > > > > spin_lock_irq_rcu_node(snp); > > > > > cbs = false; > > > > > last_lvl = snp >= sup->level[rcu_num_lvls - 1]; > > > > > if (last_lvl) > > > > > cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq; > > > > > snp->srcu_have_cbs[idx] = gpseq; > > > > > rcu_seq_set_state(&snp->srcu_have_cbs[idx], 1); > > > > > sgsne = snp->srcu_gp_seq_needed_exp; > > > > > if (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, gpseq)) > > > > > WRITE_ONCE(snp->srcu_gp_seq_needed_exp, gpseq); > > > > > if (ss_state < SRCU_SIZE_BIG) > > > > > mask = ~0; > > > > > else > > > > > mask = snp->srcu_data_have_cbs[idx]; > > > > > snp->srcu_data_have_cbs[idx] = 0; > > > > > spin_unlock_irq_rcu_node(snp); > > > > > if (cbs) > > > > > srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); > > > > > > > > > > So that last_lvl check means that the srcu_schedule_cbs_snp() function > > > > > is invoked only for leaf srcu_node structures. Which by default limit > > > > > the shift to 16. > > > > > > > > > > So this bug appears to have no effect for default RCU setups, even on very > > > > > large 64-bit systems, which is consistent with field experience. Even if > > > > > this is the case, it still should be fixed, to avoid confusion if nothing > > > > > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > > > > > Which actually happened the other day due to someone trusting ChatGPT's > > > > > opinion about RCU Kconfig options... > > > > > > > > And I therefore queued Denis's v3 patch with an edited commit log. > > > > Of course, if anyone sees some other way that the bug could manifest > > > > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > > > > than 30 on a system with at least 31 CPUs, please let me know so that > > > > I can adjust. 
> > > > > > > > Thanx, Paul > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > commit ed083b0e22f1396dee3599896249a3f218845298 > > > > Author: Denis Arefev <arefev@swemel.ru> > > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > > > AFAIU, the overflow resides in the "bitwise expression" and not > > > the arithmetic expression. > > > > Rather than quibble about exactly what constitutes arithmetic, I > > updated the first paragraph and added your Reviewed-by as shown > > below. ;-) > > > > > Other than this, please add my > > > > > > Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > Thank you all!!! > > > > ------------------------------------------------------------------------ > > > > commit 50477ff756ab99402b1523b7c6be8b5d790d05e7 > > Author: Denis Arefev <arefev@swemel.ru> > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) > > is subject to overflow due to a failure to cast operands to a larger > > data type before performing the bitwise operation. > > > > The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF > > Kconfig option, which on 64-bit systems defaults to 16 (resulting in a > > maximum shift of 15), but which can be set up as high as 64 (resulting > > in a maximum shift of 63). A value of 31 can result in sign extension, > > resulting in 0xffffffff80000000 instead of the desired 0x80000000. > > A value of 31 or greater triggers undefined behavior per the C standard. > > Do you mean here "A value of 32 or greater"? > > Only N >= 32 throws warning for: > unsigned long foo = (1 << N); > > N=31 does undesirable sign extension but no warning. Good catch, thank you, and I will update this on my next rebase. Thanx, Paul
On Tue, Sep 5, 2023 at 4:52 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Sep 05, 2023 at 04:40:46PM -0400, Joel Fernandes wrote: [...] > > > > > > So this bug appears to have no effect for default RCU setups, even on very > > > > > > large 64-bit systems, which is consistent with field experience. Even if > > > > > > this is the case, it still should be fixed, to avoid confusion if nothing > > > > > > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > > > > > > Which actually happened the other day due to someone trusting ChatGPT's > > > > > > opinion about RCU Kconfig options... > > > > > > > > > > And I therefore queued Denis's v3 patch with an edited commit log. > > > > > Of course, if anyone sees some other way that the bug could manifest > > > > > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > > > > > than 30 on a system with at least 31 CPUs, please let me know so that > > > > > I can adjust. > > > > > > > > > > Thanx, Paul > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > commit ed083b0e22f1396dee3599896249a3f218845298 > > > > > Author: Denis Arefev <arefev@swemel.ru> > > > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > > > > > AFAIU, the overflow resides in the "bitwise expression" and not > > > > the arithmetic expression. > > > > > > Rather than quibble about exactly what constitutes arithmetic, I > > > updated the first paragraph and added your Reviewed-by as shown > > > below. ;-) > > > > > > > Other than this, please add my > > > > > > > > Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > > > Thank you all!!! > > > > > > ------------------------------------------------------------------------ > > > > > > commit 50477ff756ab99402b1523b7c6be8b5d790d05e7 > > > Author: Denis Arefev <arefev@swemel.ru> > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > > > The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) > > > is subject to overflow due to a failure to cast operands to a larger > > > data type before performing the bitwise operation. > > > > > > The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF > > > Kconfig option, which on 64-bit systems defaults to 16 (resulting in a > > > maximum shift of 15), but which can be set up as high as 64 (resulting > > > in a maximum shift of 63). A value of 31 can result in sign extension, > > > resulting in 0xffffffff80000000 instead of the desired 0x80000000. > > > A value of 31 or greater triggers undefined behavior per the C standard. > > > > Do you mean here "A value of 32 or greater"? > > > > Only N >= 32 throws warning for: > > unsigned long foo = (1 << N); > > > > N=31 does undesirable sign extension but no warning. > > Good catch, thank you, and I will update this on my next rebase. Thanks, and with that the patch looks good to me: Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> - Joel
On Tue, Sep 05, 2023 at 04:56:12PM -0400, Joel Fernandes wrote: > On Tue, Sep 5, 2023 at 4:52 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Tue, Sep 05, 2023 at 04:40:46PM -0400, Joel Fernandes wrote: > [...] > > > > > > > So this bug appears to have no effect for default RCU setups, even on very > > > > > > > large 64-bit systems, which is consistent with field experience. Even if > > > > > > > this is the case, it still should be fixed, to avoid confusion if nothing > > > > > > > else. Or just in case someone decides to set CONFIG_RCU_FANOUT_LEAF=32. > > > > > > > Which actually happened the other day due to someone trusting ChatGPT's > > > > > > > opinion about RCU Kconfig options... > > > > > > > > > > > > And I therefore queued Denis's v3 patch with an edited commit log. > > > > > > Of course, if anyone sees some other way that the bug could manifest > > > > > > other than in a 64-bit kernel built with CONFIG_RCU_FANOUT_LEAF greater > > > > > > than 30 on a system with at least 31 CPUs, please let me know so that > > > > > > I can adjust. > > > > > > > > > > > > Thanx, Paul > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > commit ed083b0e22f1396dee3599896249a3f218845298 > > > > > > Author: Denis Arefev <arefev@swemel.ru> > > > > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > > > The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo) > > > > > > > > > > AFAIU, the overflow resides in the "bitwise expression" and not > > > > > the arithmetic expression. > > > > > > > > Rather than quibble about exactly what constitutes arithmetic, I > > > > updated the first paragraph and added your Reviewed-by as shown > > > > below. ;-) > > > > > > > > > Other than this, please add my > > > > > > > > > > Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > > > > > Thank you all!!! > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > commit 50477ff756ab99402b1523b7c6be8b5d790d05e7 > > > > Author: Denis Arefev <arefev@swemel.ru> > > > > Date: Mon Sep 4 15:21:14 2023 +0300 > > > > > > > > Fix srcu_struct node grpmask overflow on 64-bit systems > > > > > > > > The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) > > > > is subject to overflow due to a failure to cast operands to a larger > > > > data type before performing the bitwise operation. > > > > > > > > The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF > > > > Kconfig option, which on 64-bit systems defaults to 16 (resulting in a > > > > maximum shift of 15), but which can be set up as high as 64 (resulting > > > > in a maximum shift of 63). A value of 31 can result in sign extension, > > > > resulting in 0xffffffff80000000 instead of the desired 0x80000000. > > > > A value of 31 or greater triggers undefined behavior per the C standard. > > > > > > Do you mean here "A value of 32 or greater"? > > > > > > Only N >= 32 throws warning for: > > > unsigned long foo = (1 << N); > > > > > > N=31 does undesirable sign extension but no warning. > > > > Good catch, thank you, and I will update this on my next rebase. > > Thanks, and with that the patch looks good to me: > Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Thank you very much, and I will apply this on the next rebase. Thanx, Paul
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 20d7a238d675..6c18e6005ae1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 				snp->grplo = cpu;
 			snp->grphi = cpu;
 		}
-		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
+		sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo);
 	}
 	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
 	return true;
@@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp
 	int cpu;
 
 	for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
-		if (!(mask & (1 << (cpu - snp->grplo))))
+		if (!(mask & (1UL << (cpu - snp->grplo))))
 			continue;
 		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay);
 	}
The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo)
is subject to overflow due to a failure to cast operands to a larger
data type before performing arithmetic

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Denis Arefev <arefev@swemel.ru>
---
v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

 kernel/rcu/srcutree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
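As a sanity check of the 1UL conversion in the diff above, this stand-alone sketch (not kernel code; the leaf layout and offsets are made up for illustration) mimics the grpmask computation for the per-leaf CPU offsets that become reachable when RCU_FANOUT_LEAF is raised toward 64, again assuming an LP64 target:

#include <stdio.h>

int main(void)
{
	/* Per-leaf CPU offsets (cpu - grplo) that a leaf srcu_node can
	 * produce once RCU_FANOUT_LEAF is configured as high as 64. */
	int offsets[] = { 15, 31, 32, 63 };

	for (unsigned int i = 0; i < sizeof(offsets) / sizeof(offsets[0]); i++) {
		/* Fixed form from the patch: one well-defined bit per CPU. */
		unsigned long grpmask = 1UL << offsets[i];

		printf("offset %2d -> grpmask %#018lx\n", offsets[i], grpmask);
	}

	/*
	 * With the old "1 <<" form, offset 31 sign-extends to
	 * 0xffffffff80000000 when stored in an unsigned long, and
	 * offsets 32..63 are undefined behaviour.
	 */
	return 0;
}

Each offset maps to exactly one set bit, so a mask test of the kind done in srcu_schedule_cbs_snp() selects only the intended CPUs.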