Message ID | a17ba988-2850-fced-d225-97e1d11f6576@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | timer: fix NR_CPUS=1 build with gcc13 | expand |
On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote: > > Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu" > is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)" > exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a > configuration). Make the code conditional upon there being at least 2 > CPUs configured (otherwise there simply is nothing to migrate [to]). Hmm, without digging into it, migrate_timer() doesn't seem like very robust code: It doesn't check to make sure that new_cpu is valid, nor does it give the option of returning an error if anything fails. Making migrate_timer() just do nothing on 1-cpu systems seems will remove the warning, but not really make things safer. Is this a super hot path? Would it make more sense to add `|| (new_cpu > CONFIG_NR_CPUS)` to the early-return conditional at the top of the first `for (; ; )` loop? I guess if we don't expect it ever to be called, it might be better to get rid of the code entirely; but maybe in that case we should add something like the following? ``` #else WARN_ONCE("migrate_timer: Request to move to %u on a single-core system!", new_cpu); ASSERT_UNREACHABLE(); #endif ``` Thoughts? -George
On 13.09.2023 11:44, George Dunlap wrote: > On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote: >> >> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu" >> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)" >> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a >> configuration). Make the code conditional upon there being at least 2 >> CPUs configured (otherwise there simply is nothing to migrate [to]). > > Hmm, without digging into it, migrate_timer() doesn't seem like very > robust code: It doesn't check to make sure that new_cpu is valid, nor > does it give the option of returning an error if anything fails. Question is - what do you expect the callers to do upon getting back failure? > Making migrate_timer() just do nothing on 1-cpu systems seems will > remove the warning, but not really make things safer. > > Is this a super hot path? I don't think it is. > Would it make more sense to add `|| > (new_cpu > CONFIG_NR_CPUS)` to the early-return conditional at the > top of the first `for (; ; )` loop? But that would mean not doing what was requested without any indication to the caller. An out-of-range CPU passed in is generally very likely to result in a crash, I think. > I guess if we don't expect it ever to be called, it might be better to > get rid of the code entirely; but maybe in that case we should add > something like the following? > > ``` > #else > WARN_ONCE("migrate_timer: Request to move to %u on a single-core > system!", new_cpu); > ASSERT_UNREACHABLE(); > #endif > ``` With the old_cpu == new_cpu case explicitly permitted (and that being the only legal case when NR_CPUS=1, which arguably is an aspect which makes gcc's diagnostic questionable), perhaps only #else old_cpu = ...; if ( old_cpu != TIMER_CPU_status_killed ) WARN_ON(new_cpu != old_cpu); #endif (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())? Jan
On Wed, Sep 13, 2023 at 11:05 AM Jan Beulich <jbeulich@suse.com> wrote: > > On 13.09.2023 11:44, George Dunlap wrote: > > On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote: > >> > >> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu" > >> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)" > >> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a > >> configuration). Make the code conditional upon there being at least 2 > >> CPUs configured (otherwise there simply is nothing to migrate [to]). > > > > Hmm, without digging into it, migrate_timer() doesn't seem like very > > robust code: It doesn't check to make sure that new_cpu is valid, nor > > does it give the option of returning an error if anything fails. > > Question is - what do you expect the callers to do upon getting back > failure? [snip] > > Would it make more sense to add `|| > > (new_cpu > CONFIG_NR_CPUS)` to the early-return conditional at the > > top of the first `for (; ; )` loop? > > But that would mean not doing what was requested without any indication > to the caller. An out-of-range CPU passed in is generally very likely > to result in a crash, I think. If it's only off by a little bit, there's a good chance it might just corrupt some other data, causing a crash further down the line, where it's not obvious what went wrong. Generally speaking, passing an error up the stack, explicitly crashing, or explicitly doing nothing with a warning to the console are all better options. > > I guess if we don't expect it ever to be called, it might be better to > > get rid of the code entirely; but maybe in that case we should add > > something like the following? > > > > ``` > > #else > > WARN_ONCE("migrate_timer: Request to move to %u on a single-core > > system!", new_cpu); > > ASSERT_UNREACHABLE(); > > #endif > > ``` > > With the old_cpu == new_cpu case explicitly permitted (and that being > the only legal case when NR_CPUS=1, which arguably is an aspect which > makes gcc's diagnostic questionable), perhaps only > > #else > old_cpu = ...; > if ( old_cpu != TIMER_CPU_status_killed ) > WARN_ON(new_cpu != old_cpu); > #endif > > (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())? I think I was looking for `printk_once`. If there's no reasonable way to fail more gracefully (or no real point in making the effort to do so), what if we add the following to the top of the function? Does that make gcc13 happy? ``` if ( new_cpu >= CONFIG_NR_CPUS ) { printk_once(/* whatever */); ASSERT_UNREACHABLE(); return; } ``` Or, if we feel like being passed an invalid cpu means the state is so bad it would be better to just crash and have done with it: ``` BUG_ON(new_cpu >= CONFIG_NR_CPUS); ``` -George
On 13.09.2023 12:25, George Dunlap wrote: > On Wed, Sep 13, 2023 at 11:05 AM Jan Beulich <jbeulich@suse.com> wrote: >> On 13.09.2023 11:44, George Dunlap wrote: >>> On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote: >>>> >>>> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu" >>>> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)" >>>> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a >>>> configuration). Make the code conditional upon there being at least 2 >>>> CPUs configured (otherwise there simply is nothing to migrate [to]). >>> >>> Hmm, without digging into it, migrate_timer() doesn't seem like very >>> robust code: It doesn't check to make sure that new_cpu is valid, nor >>> does it give the option of returning an error if anything fails. >> >> Question is - what do you expect the callers to do upon getting back >> failure? > > [snip] > >>> Would it make more sense to add `|| >>> (new_cpu > CONFIG_NR_CPUS)` to the early-return conditional at the >>> top of the first `for (; ; )` loop? >> >> But that would mean not doing what was requested without any indication >> to the caller. An out-of-range CPU passed in is generally very likely >> to result in a crash, I think. > > If it's only off by a little bit, there's a good chance it might just > corrupt some other data, causing a crash further down the line, where > it's not obvious what went wrong. In general I would agree. but __per_cpu_offset[] is quite special in the values it holds. The data immediately following it would therefore also need to have unusual values within relatively narrow a range for a crash to not occur right away. > Generally speaking, passing an > error up the stack, explicitly crashing, or explicitly doing nothing > with a warning to the console are all better options. I guess I'll go that route then, since ... >>> I guess if we don't expect it ever to be called, it might be better to >>> get rid of the code entirely; but maybe in that case we should add >>> something like the following? >>> >>> ``` >>> #else >>> WARN_ONCE("migrate_timer: Request to move to %u on a single-core >>> system!", new_cpu); >>> ASSERT_UNREACHABLE(); >>> #endif >>> ``` >> >> With the old_cpu == new_cpu case explicitly permitted (and that being >> the only legal case when NR_CPUS=1, which arguably is an aspect which >> makes gcc's diagnostic questionable), perhaps only >> >> #else >> old_cpu = ...; >> if ( old_cpu != TIMER_CPU_status_killed ) >> WARN_ON(new_cpu != old_cpu); >> #endif >> >> (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())? > > I think I was looking for `printk_once`. > > If there's no reasonable way to fail more gracefully (or no real point > in making the effort to do so), what if we add the following to the > top of the function? Does that make gcc13 happy? > > ``` > if ( new_cpu >= CONFIG_NR_CPUS ) > { > printk_once(/* whatever */); > ASSERT_UNREACHABLE(); > return; > } > ``` ... this actually makes things worse (then the compiler complains about old_cpu uses as array index), ... > Or, if we feel like being passed an invalid cpu means the state is so > bad it would be better to just crash and have done with it: > > ``` > BUG_ON(new_cpu >= CONFIG_NR_CPUS); > ``` ... and this, while it helps when then also done for old_cpu, seems too hefty to me. Just to mention it, 'asm volatile ( "" : "+g" (new_cpu) );' placed at the right location also helps. That's effectively RELOC_HIDE(), which we use to work around a gcc11 issue in the same area - see gcc11_wrap(). Jan
--- a/xen/common/timer.c +++ b/xen/common/timer.c @@ -356,6 +356,7 @@ bool timer_expires_before(struct timer * void migrate_timer(struct timer *timer, unsigned int new_cpu) { +#if CONFIG_NR_CPUS > 1 unsigned int old_cpu; bool_t active; unsigned long flags; @@ -404,6 +405,7 @@ void migrate_timer(struct timer *timer, spin_unlock(&per_cpu(timers, old_cpu).lock); spin_unlock_irqrestore(&per_cpu(timers, new_cpu).lock, flags); +#endif /* CONFIG_NR_CPUS > 1 */ }
Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu" is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)" exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a configuration). Make the code conditional upon there being at least 2 CPUs configured (otherwise there simply is nothing to migrate [to]). Signed-off-by: Jan Beulich <jbeulich@suse.com>