diff mbox series

timer: fix NR_CPUS=1 build with gcc13

Message ID a17ba988-2850-fced-d225-97e1d11f6576@suse.com (mailing list archive)
State Superseded
Headers show
Series timer: fix NR_CPUS=1 build with gcc13 | expand

Commit Message

Jan Beulich Sept. 13, 2023, 7:31 a.m. UTC
Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
configuration). Make the code conditional upon there being at least 2
CPUs configured (otherwise there simply is nothing to migrate [to]).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

George Dunlap Sept. 13, 2023, 9:44 a.m. UTC | #1
On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
> configuration). Make the code conditional upon there being at least 2
> CPUs configured (otherwise there simply is nothing to migrate [to]).

Hmm, without digging into it, migrate_timer() doesn't seem like very
robust code: It doesn't check to make sure that new_cpu is valid, nor
does it give the option of returning an error if anything fails.
Making migrate_timer() just do nothing on 1-cpu systems seems will
remove the warning, but not really make things safer.

Is this a super hot path?  Would it make more sense to add `||
(new_cpu > CONFIG_NR_CPUS)` to the early-return  conditional at the
top of the first `for (; ; )` loop?

I guess if we don't expect it ever to be called, it might be better to
get rid of the code entirely; but maybe in that case we should add
something like the following?

```
#else
    WARN_ONCE("migrate_timer: Request to move to %u on a single-core
system!", new_cpu);
    ASSERT_UNREACHABLE();
#endif
```

Thoughts?

 -George
Jan Beulich Sept. 13, 2023, 10:05 a.m. UTC | #2
On 13.09.2023 11:44, George Dunlap wrote:
> On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
>> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
>> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
>> configuration). Make the code conditional upon there being at least 2
>> CPUs configured (otherwise there simply is nothing to migrate [to]).
> 
> Hmm, without digging into it, migrate_timer() doesn't seem like very
> robust code: It doesn't check to make sure that new_cpu is valid, nor
> does it give the option of returning an error if anything fails.

Question is - what do you expect the callers to do upon getting back
failure?

> Making migrate_timer() just do nothing on 1-cpu systems seems will
> remove the warning, but not really make things safer.
> 
> Is this a super hot path?

I don't think it is.

>  Would it make more sense to add `||
> (new_cpu > CONFIG_NR_CPUS)` to the early-return  conditional at the
> top of the first `for (; ; )` loop?

But that would mean not doing what was requested without any indication
to the caller. An out-of-range CPU passed in is generally very likely
to result in a crash, I think.

> I guess if we don't expect it ever to be called, it might be better to
> get rid of the code entirely; but maybe in that case we should add
> something like the following?
> 
> ```
> #else
>     WARN_ONCE("migrate_timer: Request to move to %u on a single-core
> system!", new_cpu);
>     ASSERT_UNREACHABLE();
> #endif
> ```

With the old_cpu == new_cpu case explicitly permitted (and that being
the only legal case when NR_CPUS=1, which arguably is an aspect which
makes gcc's diagnostic questionable), perhaps only

#else
    old_cpu = ...;
    if ( old_cpu != TIMER_CPU_status_killed )
        WARN_ON(new_cpu != old_cpu);
#endif

(I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())?

Jan
George Dunlap Sept. 13, 2023, 10:25 a.m. UTC | #3
On Wed, Sep 13, 2023 at 11:05 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 13.09.2023 11:44, George Dunlap wrote:
> > On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
> >> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
> >> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
> >> configuration). Make the code conditional upon there being at least 2
> >> CPUs configured (otherwise there simply is nothing to migrate [to]).
> >
> > Hmm, without digging into it, migrate_timer() doesn't seem like very
> > robust code: It doesn't check to make sure that new_cpu is valid, nor
> > does it give the option of returning an error if anything fails.
>
> Question is - what do you expect the callers to do upon getting back
> failure?

[snip]

> >  Would it make more sense to add `||
> > (new_cpu > CONFIG_NR_CPUS)` to the early-return  conditional at the
> > top of the first `for (; ; )` loop?
>
> But that would mean not doing what was requested without any indication
> to the caller. An out-of-range CPU passed in is generally very likely
> to result in a crash, I think.

If it's only off by a little bit, there's a good chance it might just
corrupt some other data, causing a crash further down the line, where
it's not obvious what went wrong.  Generally speaking, passing an
error up the stack, explicitly crashing, or explicitly doing nothing
with a warning to the console are all better options.

> > I guess if we don't expect it ever to be called, it might be better to
> > get rid of the code entirely; but maybe in that case we should add
> > something like the following?
> >
> > ```
> > #else
> >     WARN_ONCE("migrate_timer: Request to move to %u on a single-core
> > system!", new_cpu);
> >     ASSERT_UNREACHABLE();
> > #endif
> > ```
>
> With the old_cpu == new_cpu case explicitly permitted (and that being
> the only legal case when NR_CPUS=1, which arguably is an aspect which
> makes gcc's diagnostic questionable), perhaps only
>
> #else
>     old_cpu = ...;
>     if ( old_cpu != TIMER_CPU_status_killed )
>         WARN_ON(new_cpu != old_cpu);
> #endif
>
> (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())?

I think I was looking for `printk_once`.

If there's no reasonable way to fail more gracefully (or no real point
in making the effort to do so), what if we add the following to the
top of the function?  Does that make gcc13 happy?

```
if ( new_cpu >= CONFIG_NR_CPUS )
{
    printk_once(/* whatever */);
    ASSERT_UNREACHABLE();
    return;
}
```

Or, if we feel like being passed an invalid cpu means the state is so
bad it would be better to just crash and have done with it:

```
  BUG_ON(new_cpu >= CONFIG_NR_CPUS);
```

 -George
Jan Beulich Sept. 14, 2023, 12:36 p.m. UTC | #4
On 13.09.2023 12:25, George Dunlap wrote:
> On Wed, Sep 13, 2023 at 11:05 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 13.09.2023 11:44, George Dunlap wrote:
>>> On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
>>>> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
>>>> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
>>>> configuration). Make the code conditional upon there being at least 2
>>>> CPUs configured (otherwise there simply is nothing to migrate [to]).
>>>
>>> Hmm, without digging into it, migrate_timer() doesn't seem like very
>>> robust code: It doesn't check to make sure that new_cpu is valid, nor
>>> does it give the option of returning an error if anything fails.
>>
>> Question is - what do you expect the callers to do upon getting back
>> failure?
> 
> [snip]
> 
>>>  Would it make more sense to add `||
>>> (new_cpu > CONFIG_NR_CPUS)` to the early-return  conditional at the
>>> top of the first `for (; ; )` loop?
>>
>> But that would mean not doing what was requested without any indication
>> to the caller. An out-of-range CPU passed in is generally very likely
>> to result in a crash, I think.
> 
> If it's only off by a little bit, there's a good chance it might just
> corrupt some other data, causing a crash further down the line, where
> it's not obvious what went wrong.

In general I would agree. but __per_cpu_offset[] is quite special in
the values it holds. The data immediately following it would therefore
also need to have unusual values within relatively narrow a range for
a crash to not occur right away.

>  Generally speaking, passing an
> error up the stack, explicitly crashing, or explicitly doing nothing
> with a warning to the console are all better options.

I guess I'll go that route then, since ...

>>> I guess if we don't expect it ever to be called, it might be better to
>>> get rid of the code entirely; but maybe in that case we should add
>>> something like the following?
>>>
>>> ```
>>> #else
>>>     WARN_ONCE("migrate_timer: Request to move to %u on a single-core
>>> system!", new_cpu);
>>>     ASSERT_UNREACHABLE();
>>> #endif
>>> ```
>>
>> With the old_cpu == new_cpu case explicitly permitted (and that being
>> the only legal case when NR_CPUS=1, which arguably is an aspect which
>> makes gcc's diagnostic questionable), perhaps only
>>
>> #else
>>     old_cpu = ...;
>>     if ( old_cpu != TIMER_CPU_status_killed )
>>         WARN_ON(new_cpu != old_cpu);
>> #endif
>>
>> (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())?
> 
> I think I was looking for `printk_once`.
> 
> If there's no reasonable way to fail more gracefully (or no real point
> in making the effort to do so), what if we add the following to the
> top of the function?  Does that make gcc13 happy?
> 
> ```
> if ( new_cpu >= CONFIG_NR_CPUS )
> {
>     printk_once(/* whatever */);
>     ASSERT_UNREACHABLE();
>     return;
> }
> ```

... this actually makes things worse (then the compiler complains about
old_cpu uses as array index), ...

> Or, if we feel like being passed an invalid cpu means the state is so
> bad it would be better to just crash and have done with it:
> 
> ```
>   BUG_ON(new_cpu >= CONFIG_NR_CPUS);
> ```

... and this, while it helps when then also done for old_cpu, seems too
hefty to me.

Just to mention it, 'asm volatile ( "" : "+g" (new_cpu) );' placed at
the right location also helps. That's effectively RELOC_HIDE(), which
we use to work around a gcc11 issue in the same area - see gcc11_wrap().

Jan
diff mbox series

Patch

--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -356,6 +356,7 @@  bool timer_expires_before(struct timer *
 
 void migrate_timer(struct timer *timer, unsigned int new_cpu)
 {
+#if CONFIG_NR_CPUS > 1
     unsigned int old_cpu;
     bool_t active;
     unsigned long flags;
@@ -404,6 +405,7 @@  void migrate_timer(struct timer *timer,
 
     spin_unlock(&per_cpu(timers, old_cpu).lock);
     spin_unlock_irqrestore(&per_cpu(timers, new_cpu).lock, flags);
+#endif /* CONFIG_NR_CPUS > 1 */
 }