diff mbox series

[1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Message ID 20230818134441.45586-2-julien@xen.org (mailing list archive)
State New, archived
Headers show
Series xen/x86: Optimize timer_irq_works() | expand

Commit Message

Julien Grall Aug. 18, 2023, 1:44 p.m. UTC
From: Julien Grall <jgrall@amazon.com>

Currently, Xen will spend ~100ms to check if the timer works. If the
Admin knows their platform have a working timer, then it would be
handy to be able to bypass the check.

Introduce a command line option 'no_timer_check' (the name is
matching the Linux parameter) for this purpose.

Signed-off-by: Julien Grall <jgrall@amazon.com>
---
 docs/misc/xen-command-line.pandoc |  7 +++++++
 xen/arch/x86/io_apic.c            | 11 +++++++++++
 2 files changed, 18 insertions(+)

Comments

Jan Beulich Sept. 7, 2023, 2:09 p.m. UTC | #1
On 18.08.2023 15:44, Julien Grall wrote:
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>  ### nr_irqs (x86)
>  > `= <integer>`
>  
> +### no_timer_works (x86)
> +> `=<boolean>`
> +
> +> Default: `true`
> +
> +Disables the code which tests for broken timer IRQ sources.

In description and code it's "check", but here it's "works". Likely
just a typo. But I'd prefer if we didn't introduce any new "no*"
options which then can be negated to "no-no*". Make it "timer-check"
(also avoiding the underscore, no matter that Linux uses it), or
alternatively make it a truly positive option, e.g. "timer-irq-works".

I also think it wants emphasizing that if this option is used and then
something doesn't work, people are on their own.

Jan
Julien Grall Sept. 15, 2023, 1:18 p.m. UTC | #2
Hi,

On 07/09/2023 15:09, Jan Beulich wrote:
> On 18.08.2023 15:44, Julien Grall wrote:
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>>   ### nr_irqs (x86)
>>   > `= <integer>`
>>   
>> +### no_timer_works (x86)
>> +> `=<boolean>`
>> +
>> +> Default: `true`
>> +
>> +Disables the code which tests for broken timer IRQ sources.
> 
> In description and code it's "check", but here it's "works". Likely
> just a typo. But I'd prefer if we didn't introduce any new "no*"
> options which then can be negated to "no-no*". Make it "timer-check"
> (also avoiding the underscore, no matter that Linux uses it), or
> alternatively make it a truly positive option, e.g. "timer-irq-works".

I don't mind too much about using - over _ but it is never clear why you 
strongly push for it (and whether the others agrees). Is this documented 
somewhere? If not, can you do it so everyone can apply it consistently? 
(At least I would not remember to ask for it because I am happy with the _).

I will go for 'timer-irq-works'.

> 
> I also think it wants emphasizing that if this option is used and then
> something doesn't work, people are on their own.

I will do that.

Note that I will only resend a new version after the tree as branched 
because this is not meant for 4.18.

Cheers,
George Dunlap Sept. 15, 2023, 1:49 p.m. UTC | #3
On Fri, Sep 15, 2023 at 2:18 PM Julien Grall <julien@xen.org> wrote:

> I don't mind too much about using - over _ but it is never clear why you
> strongly push for it (and whether the others agrees). Is this documented
> somewhere? If not, can you do it so everyone can apply it consistently?
> (At least I would not remember to ask for it because I am happy with the _).

FWIW I prefer `-` over `_`; I'd Ack a patch that documented it
somewhere (if it's not documented already).

 -George
Roger Pau Monné Sept. 15, 2023, 1:54 p.m. UTC | #4
On Fri, Aug 18, 2023 at 02:44:40PM +0100, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Currently, Xen will spend ~100ms to check if the timer works. If the
> Admin knows their platform have a working timer, then it would be
> handy to be able to bypass the check.

I'm of the opinion that the current code is at least dubious.

Specifically attempts to test the PIT timer, even when the hypervisor
might be using a different timer.  At which point it mostly attempts
to test that the interrupt routing from the PIT (or it's replacement)
is correct, and that Xen can receive such interrupts.

We go to great lengths in order to attempt to please this piece of
code, even when no PIT is available, at which point we switch the HPET
to legacy replacement mode in order to satisfy the checks.

I think the current code is not useful, and I would be fine if it was
just removed.

> 
> Introduce a command line option 'no_timer_check' (the name is
> matching the Linux parameter) for this purpose.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> ---
>  docs/misc/xen-command-line.pandoc |  7 +++++++
>  xen/arch/x86/io_apic.c            | 11 +++++++++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 4872b9098e83..1f9d3106383f 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>  ### nr_irqs (x86)
>  > `= <integer>`
>  
> +### no_timer_works (x86)

I find the naming confusing, and I would rather avoid negative named
booleans.

Maybe it should be `check_pit_intr` and default to enabled for the
time being?

> +> `=<boolean>`
> +
> +> Default: `true`

I think the default is wrong here?  AFAICT no_timer_check will default
to false.

> +
> +Disables the code which tests for broken timer IRQ sources.

Hm, yes, it does check for one specific source, which doesn't need to
be the currently selected timer.

"Disables the code which tests interrupt injection from the legacy
i8259 timer."

Even that is not fully true, as it would resort to testing the HPET
legacy mode if PIT doesn't work, but one could argue the HPET legacy
mode is just a replacement for the i8254.

> +
>  ### irq-max-guests (x86)
>  > `= <integer>`
>  
> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index b3afef8933d7..4875bb97003f 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>  int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>  int __read_mostly nr_ioapics;
>  
> +/*
> + * The logic to check if the timer is working is expensive. So allow
> + * the admin to bypass it if they know their platform doesn't have
> + * a buggy timer.

I would mention i8254 here, as said this is quite likely not testing
the currently in use timer.

Note that if you don't want to run the test in timer_irq_works() you
can possibly avoid the code in preinit_pit(), as there no need to
setup channel 0 in periodic mode then.

Thanks, Roger.
Julien Grall Sept. 15, 2023, 2:27 p.m. UTC | #5
Hi Roger,

On 15/09/2023 14:54, Roger Pau Monné wrote:
> On Fri, Aug 18, 2023 at 02:44:40PM +0100, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently, Xen will spend ~100ms to check if the timer works. If the
>> Admin knows their platform have a working timer, then it would be
>> handy to be able to bypass the check.
> 
> I'm of the opinion that the current code is at least dubious.
> 
> Specifically attempts to test the PIT timer, even when the hypervisor
> might be using a different timer.  At which point it mostly attempts
> to test that the interrupt routing from the PIT (or it's replacement)
> is correct, and that Xen can receive such interrupts.
> 
> We go to great lengths in order to attempt to please this piece of
> code, even when no PIT is available, at which point we switch the HPET
> to legacy replacement mode in order to satisfy the checks.
> 
> I think the current code is not useful, and I would be fine if it was
> just removed.

I am afraid I don't know enough the code to be able to say if it can be 
removed.

I also have no idea how common are such platforms nowadays (I suspect it 
is rather small). Which I why I went with a command line option.

> 
>>
>> Introduce a command line option 'no_timer_check' (the name is
>> matching the Linux parameter) for this purpose.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>> ---
>>   docs/misc/xen-command-line.pandoc |  7 +++++++
>>   xen/arch/x86/io_apic.c            | 11 +++++++++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>> index 4872b9098e83..1f9d3106383f 100644
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>>   ### nr_irqs (x86)
>>   > `= <integer>`
>>   
>> +### no_timer_works (x86)
> 
> I find the naming confusing, and I would rather avoid negative named
> booleans.
> 
> Maybe it should be `check_pit_intr` and default to enabled for the
> time being?
Jan suggested to use timer-irq-works. Would you be happy with the name?

> 
>> +> `=<boolean>`
>> +
>> +> Default: `true`
> 
> I think the default is wrong here?  AFAICT no_timer_check will default
> to false.

Ah yes. In the downstream version, I went with setting to true by 
default as we don't support any platform with broken timer. I forgot to 
update the patch before sending.

> 
>> +
>> +Disables the code which tests for broken timer IRQ sources.
> 
> Hm, yes, it does check for one specific source, which doesn't need to
> be the currently selected timer.
> 
> "Disables the code which tests interrupt injection from the legacy
> i8259 timer."

I can update the comment.

> 
> Even that is not fully true, as it would resort to testing the HPET
> legacy mode if PIT doesn't work, but one could argue the HPET legacy
> mode is just a replacement for the i8254.
> 
>> +
>>   ### irq-max-guests (x86)
>>   > `= <integer>`
>>   
>> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
>> index b3afef8933d7..4875bb97003f 100644
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>>   int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>>   int __read_mostly nr_ioapics;
>>   
>> +/*
>> + * The logic to check if the timer is working is expensive. So allow
>> + * the admin to bypass it if they know their platform doesn't have
>> + * a buggy timer.
> 
> I would mention i8254 here, as said this is quite likely not testing
> the currently in use timer.

Sure.

> Note that if you don't want to run the test in timer_irq_works() you
> can possibly avoid the code in preinit_pit(), as there no need to
> setup channel 0 in periodic mode then.

The channel also seems to be used as a fallback method for calibrating 
the APIC (see calibrate_APIC_clock()). AFAICT, the fallback method 
should only be used when the PIT is enabled.

I think it would still be feasible to avoid running the IRQ tests even 
when PIT is used. So it means, we cannot skip preinit_pit().

As a side note, I noticed that preinit_pit() is called during resume. 
But I can't find any use of channel 0 after boot. So maybe the call 
could be removed?

Cheers,
Roger Pau Monné Sept. 18, 2023, 7:54 a.m. UTC | #6
On Fri, Sep 15, 2023 at 03:27:40PM +0100, Julien Grall wrote:
> Hi Roger,
> 
> On 15/09/2023 14:54, Roger Pau Monné wrote:
> > On Fri, Aug 18, 2023 at 02:44:40PM +0100, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > Currently, Xen will spend ~100ms to check if the timer works. If the
> > > Admin knows their platform have a working timer, then it would be
> > > handy to be able to bypass the check.
> > 
> > I'm of the opinion that the current code is at least dubious.
> > 
> > Specifically attempts to test the PIT timer, even when the hypervisor
> > might be using a different timer.  At which point it mostly attempts
> > to test that the interrupt routing from the PIT (or it's replacement)
> > is correct, and that Xen can receive such interrupts.
> > 
> > We go to great lengths in order to attempt to please this piece of
> > code, even when no PIT is available, at which point we switch the HPET
> > to legacy replacement mode in order to satisfy the checks.
> > 
> > I think the current code is not useful, and I would be fine if it was
> > just removed.
> 
> I am afraid I don't know enough the code to be able to say if it can be
> removed.
> 
> I also have no idea how common are such platforms nowadays (I suspect it is
> rather small). Which I why I went with a command line option.

It was more of a grumble rather than a request for you to remove the
code.  I've been meaning to look into this myself for some time, as we
just keep accumulating bodges in order to keep the test happy.

We don't seem to perform such tests for any other interrupt sources
that Xen uses (or timer), so I find it kind of odd.  I guess at one
point the PIT was always used, and hence it was relevant to test for
it unconditionally, but that's not the case anymore.

> 
> > 
> > > 
> > > Introduce a command line option 'no_timer_check' (the name is
> > > matching the Linux parameter) for this purpose.
> > > 
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > > ---
> > >   docs/misc/xen-command-line.pandoc |  7 +++++++
> > >   xen/arch/x86/io_apic.c            | 11 +++++++++++
> > >   2 files changed, 18 insertions(+)
> > > 
> > > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> > > index 4872b9098e83..1f9d3106383f 100644
> > > --- a/docs/misc/xen-command-line.pandoc
> > > +++ b/docs/misc/xen-command-line.pandoc
> > > @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
> > >   ### nr_irqs (x86)
> > >   > `= <integer>`
> > > +### no_timer_works (x86)
> > 
> > I find the naming confusing, and I would rather avoid negative named
> > booleans.
> > 
> > Maybe it should be `check_pit_intr` and default to enabled for the
> > time being?
> Jan suggested to use timer-irq-works. Would you be happy with the name?

Hm, pit_irq_works might be better IMO, but I could live with
timer_irq_works (with either _ or -).

> > Note that if you don't want to run the test in timer_irq_works() you
> > can possibly avoid the code in preinit_pit(), as there no need to
> > setup channel 0 in periodic mode then.
> 
> The channel also seems to be used as a fallback method for calibrating the
> APIC (see calibrate_APIC_clock()). AFAICT, the fallback method should only
> be used when the PIT is enabled.

Yes, I see.  I think most systems from the last decade will have TSC
deadline timer support, and hence don't engage in the calibration.  We
should see about switching the calibration to use the selected timer,
instead of forcing the usage of the PIT.

> I think it would still be feasible to avoid running the IRQ tests even when
> PIT is used. So it means, we cannot skip preinit_pit().

Yeah, so we seem to have a couple of usages of the Channel 0 counter
that don't rely on the interrupt at all.

> As a side note, I noticed that preinit_pit() is called during resume. But I
> can't find any use of channel 0 after boot. So maybe the call could be
> removed?

See _disable_pit_irq(), there's a relation between the PIT and
cpuidle.

Thanks, Roger.
Jan Beulich Sept. 18, 2023, 10:29 a.m. UTC | #7
On 15.09.2023 15:18, Julien Grall wrote:
> On 07/09/2023 15:09, Jan Beulich wrote:
>> On 18.08.2023 15:44, Julien Grall wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>>>   ### nr_irqs (x86)
>>>   > `= <integer>`
>>>   
>>> +### no_timer_works (x86)
>>> +> `=<boolean>`
>>> +
>>> +> Default: `true`
>>> +
>>> +Disables the code which tests for broken timer IRQ sources.
>>
>> In description and code it's "check", but here it's "works". Likely
>> just a typo. But I'd prefer if we didn't introduce any new "no*"
>> options which then can be negated to "no-no*". Make it "timer-check"
>> (also avoiding the underscore, no matter that Linux uses it), or
>> alternatively make it a truly positive option, e.g. "timer-irq-works".
> 
> I don't mind too much about using - over _ but it is never clear why you 
> strongly push for it (and whether the others agrees).

Informal feedback suggested that various people agree and no-one strongly
disagrees to the argument of underscore really only being an auxiliary
separator character, when no better one can be used, and it also being
two keypresses to type on most keyboards, when dash is just one.

> Is this documented 
> somewhere? If not, can you do it so everyone can apply it consistently? 
> (At least I would not remember to ask for it because I am happy with the _).

As to documenting - it's not clear to me where such documentation ought
to go. In a way this is coding style, so it could be ./CODING_STYLE, but
then my experience with proposing changes there has been at best mixed.

Jan
diff mbox series

Patch

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 4872b9098e83..1f9d3106383f 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1896,6 +1896,13 @@  This option is ignored in **pv-shim** mode.
 ### nr_irqs (x86)
 > `= <integer>`
 
+### no_timer_works (x86)
+> `=<boolean>`
+
+> Default: `true`
+
+Disables the code which tests for broken timer IRQ sources.
+
 ### irq-max-guests (x86)
 > `= <integer>`
 
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index b3afef8933d7..4875bb97003f 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -57,6 +57,14 @@  bool __initdata ioapic_ack_forced;
 int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
 int __read_mostly nr_ioapics;
 
+/*
+ * The logic to check if the timer is working is expensive. So allow
+ * the admin to bypass it if they know their platform doesn't have
+ * a buggy timer.
+ */
+static bool __initdata no_timer_check;
+boolean_param("no_timer_check", no_timer_check);
+
 /*
  * Rough estimation of how many shared IRQs there are, can
  * be changed anytime.
@@ -1502,6 +1510,9 @@  static int __init timer_irq_works(void)
 {
     unsigned long t1, flags;
 
+    if ( no_timer_check )
+        return 1;
+
     t1 = ACCESS_ONCE(pit0_ticks);
 
     local_save_flags(flags);