[1/2] ns16550: reject IRQ above nr_irqs

Message ID 20220310143403.50944-1-marmarek@invisiblethingslab.com (mailing list archive)
State New, archived

Commit Message

Marek Marczykowski-Górecki March 10, 2022, 2:34 p.m. UTC
Intel LPSS has INTERRUPT_LINE set to 0xff by default, which can't
possibly work. While a proper IRQ configuration may be useful,
validating the value retrieved from the hardware is still necessary. If
validation fails, use the device in poll mode.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
 xen/drivers/char/ns16550.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jan Beulich March 10, 2022, 3:23 p.m. UTC | #1
On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>                              pci_conf_read8(PCI_SBDF(0, b, d, f),
>                                             PCI_INTERRUPT_LINE) : 0;
>  
> +                if (uart->irq >= nr_irqs)
> +                    uart->irq = 0;

Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
immediately inside the parentheses.

Jan
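
The two points raised in this comment (which bound to compare against, and the blanks immediately inside the parentheses) can be sketched in standalone C. This is only an illustration under assumptions: `struct uart_sketch`, `sanitize_irq()` and the `limit` parameter are hypothetical names, with `limit` standing in for whichever of nr_irqs or nr_irqs_gsi ends up being used. It is not the actual ns16550.c code.

```c
/* Hypothetical stand-in for the relevant part of struct ns16550. */
struct uart_sketch {
    int irq; /* 0 is treated here as "no IRQ, use polling" */
};

/*
 * Reset an out-of-range IRQ to 0 so the driver falls back to polling.
 * 'limit' stands in for nr_irqs or nr_irqs_gsi, whichever is chosen.
 */
static void sanitize_irq(struct uart_sketch *uart, unsigned int limit)
{
    /* Blanks immediately inside the parentheses, as requested in review. */
    if ( (unsigned int)uart->irq >= limit )
        uart->irq = 0;
}
```

With an assumed limit of 48, a value of 0xff read from the device would be reset to 0, while an in-range value such as 10 would be left alone.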
Roger Pau Monné March 10, 2022, 3:47 p.m. UTC | #2
On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > --- a/xen/drivers/char/ns16550.c
> > +++ b/xen/drivers/char/ns16550.c
> > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> >                              pci_conf_read8(PCI_SBDF(0, b, d, f),
> >                                             PCI_INTERRUPT_LINE) : 0;
> >  
> > +                if (uart->irq >= nr_irqs)
> > +                    uart->irq = 0;
> 
> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> immediately inside the parentheses.

If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.

Thanks, Roger.
Jan Beulich March 10, 2022, 4:08 p.m. UTC | #3
On 10.03.2022 16:47, Roger Pau Monné wrote:
> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>> --- a/xen/drivers/char/ns16550.c
>>> +++ b/xen/drivers/char/ns16550.c
>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>                              pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>                                             PCI_INTERRUPT_LINE) : 0;
>>>  
>>> +                if (uart->irq >= nr_irqs)
>>> +                    uart->irq = 0;
>>
>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>> immediately inside the parentheses.
> 
> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.

Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
right away. After all Arm wants to have an equivalent check here then,
not merely checking against nr_irqs instead. So putting a conditional
here right away would hide the need for putting in place an Arm-specific
alternative.

Jan
Roger Pau Monné March 10, 2022, 4:12 p.m. UTC | #4
On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> On 10.03.2022 16:47, Roger Pau Monné wrote:
> > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> >> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> >>> --- a/xen/drivers/char/ns16550.c
> >>> +++ b/xen/drivers/char/ns16550.c
> >>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> >>>                              pci_conf_read8(PCI_SBDF(0, b, d, f),
> >>>                                             PCI_INTERRUPT_LINE) : 0;
> >>>  
> >>> +                if (uart->irq >= nr_irqs)
> >>> +                    uart->irq = 0;
> >>
> >> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> >> immediately inside the parentheses.
> > 
> > If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
> 
> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
> right away. After all Arm wants to have an equivalent check here then,
> not merely checking against nr_irqs instead. So putting a conditional
> here right away would hide the need for putting in place an Arm-specific
> alternative.

Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.

Roger.
Julien Grall March 10, 2022, 4:21 p.m. UTC | #5
Hi,

On 10/03/2022 16:12, Roger Pau Monné wrote:
> On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
>> On 10.03.2022 16:47, Roger Pau Monné wrote:
>>> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>>>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>>>> --- a/xen/drivers/char/ns16550.c
>>>>> +++ b/xen/drivers/char/ns16550.c
>>>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>>>                               pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>>>                                              PCI_INTERRUPT_LINE) : 0;
>>>>>   
>>>>> +                if (uart->irq >= nr_irqs)
>>>>> +                    uart->irq = 0;
>>>>
>>>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>>>> immediately inside the parentheses.
>>>
>>> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
>>
>> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
>> right away. After all Arm wants to have an equivalent check here then,
>> not merely checking against nr_irqs instead. So putting a conditional
>> here right away would hide the need for putting in place an Arm-specific
>> alternative.
> 
> Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I
am not sure we will ever see support for PCI UART cards in Xen on Arm.

However, if it ever happens then neither nr_irqs nor nr_irqs_gsi would
help here because from the interrupt controller PoV 0xff may be a valid
interrupt (GICv2 supports up to 1024 interrupts).

Is there any reason we can't explicitly check for 0xff?

Cheers,
Jan Beulich March 10, 2022, 4:34 p.m. UTC | #6
On 10.03.2022 17:21, Julien Grall wrote:
> On 10/03/2022 16:12, Roger Pau Monné wrote:
>> On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
>>> On 10.03.2022 16:47, Roger Pau Monné wrote:
>>>> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>>>>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>>>>> --- a/xen/drivers/char/ns16550.c
>>>>>> +++ b/xen/drivers/char/ns16550.c
>>>>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>>>>                               pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>>>>                                              PCI_INTERRUPT_LINE) : 0;
>>>>>>   
>>>>>> +                if (uart->irq >= nr_irqs)
>>>>>> +                    uart->irq = 0;
>>>>>
>>>>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>>>>> immediately inside the parentheses.
>>>>
>>>> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
>>>
>>> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
>>> right away. After all Arm wants to have an equivalent check here then,
>>> not merely checking against nr_irqs instead. So putting a conditional
>>> here right away would hide the need for putting in place an Arm-specific
>>> alternative.
>>
>> Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
> The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I 
> am not sure we will ever see a support for PCI UART card in Xen on Arm.
> 
> However, if it evers happens then neither nr_irqs or nr_irqs_gsi would 
> help here because from the interrupt controller PoV 0xff may be a valid 
> (GICv2 supports up to 1024 interrupts).
> 
> Is there any reason we can't explicitely check 0xff?

FF isn't called out by the spec as having a special meaning. Contrary to
what I think Andrew said somewhere, FF does not indicate "none". That's
instead indicated by PIN returning zero. That's my reading of the spec,
at least.

Jan
Marek Marczykowski-Górecki March 10, 2022, 4:37 p.m. UTC | #7
On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
> Hi,
> 
> On 10/03/2022 16:12, Roger Pau Monné wrote:
> > On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> > > On 10.03.2022 16:47, Roger Pau Monné wrote:
> > > > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> > > > > On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > > > > > --- a/xen/drivers/char/ns16550.c
> > > > > > +++ b/xen/drivers/char/ns16550.c
> > > > > > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> > > > > >                               pci_conf_read8(PCI_SBDF(0, b, d, f),
> > > > > >                                              PCI_INTERRUPT_LINE) : 0;
> > > > > > +                if (uart->irq >= nr_irqs)
> > > > > > +                    uart->irq = 0;
> > > > > 
> > > > > Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> > > > > immediately inside the parentheses.
> > > > 
> > > > If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
> > > 
> > > Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
> > > right away. After all Arm wants to have an equivalent check here then,
> > > not merely checking against nr_irqs instead. So putting a conditional
> > > here right away would hide the need for putting in place an Arm-specific
> > > alternative.
> > 
> > Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
> The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
> not sure we will ever see a support for PCI UART card in Xen on Arm.
> 
> However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
> here because from the interrupt controller PoV 0xff may be a valid (GICv2
> supports up to 1024 interrupts).
> 
> Is there any reason we can't explicitely check 0xff?

That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
because the value is later used (on x86) to access irq_desc array (via
irq_to_desc), which has nr_irqs size.
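
To illustrate why the bound matters, here is a minimal standalone sketch of a descriptor table with a fixed number of entries. The names `NR_IRQS_SKETCH`, `irq_desc_sketch` and `irq_to_desc_checked()` are hypothetical and the table size of 48 is an assumption; Xen's real irq_to_desc() is not reproduced here and may well look different.

```c
#include <stddef.h>

#define NR_IRQS_SKETCH 48u /* assumed table size, playing the role of nr_irqs */

struct irq_desc_sketch {
    unsigned int status;
};

static struct irq_desc_sketch irq_desc_sketch[NR_IRQS_SKETCH];

/*
 * Bounds-checked lookup: returns NULL instead of indexing past the table.
 * An unchecked &irq_desc_sketch[irq] with irq == 0xff would point outside
 * the array, which is the kind of access the patch tries to prevent.
 */
static struct irq_desc_sketch *irq_to_desc_checked(unsigned int irq)
{
    if ( irq >= NR_IRQS_SKETCH )
        return NULL;

    return &irq_desc_sketch[irq];
}
```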
Julien Grall March 11, 2022, 10:23 a.m. UTC | #8
Hi Marek,

On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
> On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
>> Hi,
>>
>> On 10/03/2022 16:12, Roger Pau Monné wrote:
>>> On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
>>>> On 10.03.2022 16:47, Roger Pau Monné wrote:
>>>>> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>>>>>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>>>>>> --- a/xen/drivers/char/ns16550.c
>>>>>>> +++ b/xen/drivers/char/ns16550.c
>>>>>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>>>>>                                pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>>>>>                                               PCI_INTERRUPT_LINE) : 0;
>>>>>>> +                if (uart->irq >= nr_irqs)
>>>>>>> +                    uart->irq = 0;
>>>>>>
>>>>>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>>>>>> immediately inside the parentheses.
>>>>>
>>>>> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
>>>>
>>>> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
>>>> right away. After all Arm wants to have an equivalent check here then,
>>>> not merely checking against nr_irqs instead. So putting a conditional
>>>> here right away would hide the need for putting in place an Arm-specific
>>>> alternative.
>>>
>>> Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
>> The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
>> not sure we will ever see a support for PCI UART card in Xen on Arm.
>>
>> However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
>> here because from the interrupt controller PoV 0xff may be a valid (GICv2
>> supports up to 1024 interrupts).
>>
>> Is there any reason we can't explicitely check 0xff?
> 
> That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
> because the value is later used (on x86) to access irq_desc array (via
> irq_to_desc), which has nr_irqs size.

I think it would be better if that check were closer to whoever accesses
irq_desc. This would be helpful for other users (I am sure this is not
the only potential place where the IRQ may be wrong). So how about
moving it into setup_irq()?

Cheers,
Marek Marczykowski-Górecki March 11, 2022, 10:52 a.m. UTC | #9
On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
> Hi Marek,
> 
> On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
> > On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 10/03/2022 16:12, Roger Pau Monné wrote:
> > > > On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> > > > > On 10.03.2022 16:47, Roger Pau Monné wrote:
> > > > > > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> > > > > > > On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > > > > > > > --- a/xen/drivers/char/ns16550.c
> > > > > > > > +++ b/xen/drivers/char/ns16550.c
> > > > > > > > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> > > > > > > >                                pci_conf_read8(PCI_SBDF(0, b, d, f),
> > > > > > > >                                               PCI_INTERRUPT_LINE) : 0;
> > > > > > > > +                if (uart->irq >= nr_irqs)
> > > > > > > > +                    uart->irq = 0;
> > > > > > > 
> > > > > > > Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> > > > > > > immediately inside the parentheses.
> > > > > > 
> > > > > > If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
> > > > > 
> > > > > Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
> > > > > right away. After all Arm wants to have an equivalent check here then,
> > > > > not merely checking against nr_irqs instead. So putting a conditional
> > > > > here right away would hide the need for putting in place an Arm-specific
> > > > > alternative.
> > > > 
> > > > Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
> > > The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
> > > not sure we will ever see a support for PCI UART card in Xen on Arm.
> > > 
> > > However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
> > > here because from the interrupt controller PoV 0xff may be a valid (GICv2
> > > supports up to 1024 interrupts).
> > > 
> > > Is there any reason we can't explicitely check 0xff?
> > 
> > That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
> > because the value is later used (on x86) to access irq_desc array (via
> > irq_to_desc), which has nr_irqs size.
> 
> I think it would be better if that check is closer to who access the
> irq_desc. This would be helpful for other users (I am sure this is not the
> only potential place where the IRQ may be wrong). So how about moving it in
> setup_irq()?

I don't like it, it's a rather fragile approach (at least in the current
code base, without some refactoring). There are a bunch of places using
uart->irq (even if just checking whether it's -1 or 0) before the
setup_irq() call. This includes smp_intr_init(), which was the first
thing to crash with 0xff set there.
Julien Grall March 11, 2022, 11:15 a.m. UTC | #10
Hi,

On 11/03/2022 10:52, Marek Marczykowski-Górecki wrote:
> On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
>> Hi Marek,
>>
>> On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
>>> On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 10/03/2022 16:12, Roger Pau Monné wrote:
>>>>> On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
>>>>>> On 10.03.2022 16:47, Roger Pau Monné wrote:
>>>>>>> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>>>>>>>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>>>>>>>> --- a/xen/drivers/char/ns16550.c
>>>>>>>>> +++ b/xen/drivers/char/ns16550.c
>>>>>>>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>>>>>>>                                 pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>>>>>>>                                                PCI_INTERRUPT_LINE) : 0;
>>>>>>>>> +                if (uart->irq >= nr_irqs)
>>>>>>>>> +                    uart->irq = 0;
>>>>>>>>
>>>>>>>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>>>>>>>> immediately inside the parentheses.
>>>>>>>
>>>>>>> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
>>>>>>
>>>>>> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
>>>>>> right away. After all Arm wants to have an equivalent check here then,
>>>>>> not merely checking against nr_irqs instead. So putting a conditional
>>>>>> here right away would hide the need for putting in place an Arm-specific
>>>>>> alternative.
>>>>>
>>>>> Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
>>>> The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
>>>> not sure we will ever see a support for PCI UART card in Xen on Arm.
>>>>
>>>> However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
>>>> here because from the interrupt controller PoV 0xff may be a valid (GICv2
>>>> supports up to 1024 interrupts).
>>>>
>>>> Is there any reason we can't explicitely check 0xff?
>>>
>>> That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
>>> because the value is later used (on x86) to access irq_desc array (via
>>> irq_to_desc), which has nr_irqs size.
>>
>> I think it would be better if that check is closer to who access the
>> irq_desc. This would be helpful for other users (I am sure this is not the
>> only potential place where the IRQ may be wrong). So how about moving it in
>> setup_irq()?
> 
> I don't like it, it's rather fragile approach (at least in the current
> code base, without some refactor). There are a bunch of places using
> uart->irq (even if just checking if its -1 or 0) before setup_irq()
> call. This includes smp_intr_init(), which is what was the first thing
> crashing with 0xff set there.

Even if the code is gated with !CONFIG_X86, it sounds wrong to me to
have such a check in a UART driver. It only prevents us from doing an
out-of-bounds access. There is no guarantee the interrupt will be usable
(on Arm 256 is a valid interrupt).

As I wrote, I don't expect the code to be used any time soon on Arm. So
I am not going to argue too much about the approach. However, we should at
least clarify in the commit message/title that this is x86 and PCI only.

Cheers,
Roger Pau Monné March 11, 2022, 3:04 p.m. UTC | #11
On Fri, Mar 11, 2022 at 11:15:13AM +0000, Julien Grall wrote:
> Hi,
> 
> On 11/03/2022 10:52, Marek Marczykowski-Górecki wrote:
> > On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
> > > Hi Marek,
> > > 
> > > On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
> > > > On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 10/03/2022 16:12, Roger Pau Monné wrote:
> > > > > > On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> > > > > > > On 10.03.2022 16:47, Roger Pau Monné wrote:
> > > > > > > > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> > > > > > > > > On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > --- a/xen/drivers/char/ns16550.c
> > > > > > > > > > +++ b/xen/drivers/char/ns16550.c
> > > > > > > > > > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> > > > > > > > > >                                 pci_conf_read8(PCI_SBDF(0, b, d, f),
> > > > > > > > > >                                                PCI_INTERRUPT_LINE) : 0;
> > > > > > > > > > +                if (uart->irq >= nr_irqs)
> > > > > > > > > > +                    uart->irq = 0;
> > > > > > > > > 
> > > > > > > > > Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> > > > > > > > > immediately inside the parentheses.
> > > > > > > > 
> > > > > > > > If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
> > > > > > > 
> > > > > > > Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
> > > > > > > right away. After all Arm wants to have an equivalent check here then,
> > > > > > > not merely checking against nr_irqs instead. So putting a conditional
> > > > > > > here right away would hide the need for putting in place an Arm-specific
> > > > > > > alternative.
> > > > > > 
> > > > > > Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
> > > > > The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
> > > > > not sure we will ever see a support for PCI UART card in Xen on Arm.
> > > > > 
> > > > > However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
> > > > > here because from the interrupt controller PoV 0xff may be a valid (GICv2
> > > > > supports up to 1024 interrupts).
> > > > > 
> > > > > Is there any reason we can't explicitely check 0xff?
> > > > 
> > > > That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
> > > > because the value is later used (on x86) to access irq_desc array (via
> > > > irq_to_desc), which has nr_irqs size.
> > > 
> > > I think it would be better if that check is closer to who access the
> > > irq_desc. This would be helpful for other users (I am sure this is not the
> > > only potential place where the IRQ may be wrong). So how about moving it in
> > > setup_irq()?
> > 
> > I don't like it, it's rather fragile approach (at least in the current
> > code base, without some refactor). There are a bunch of places using
> > uart->irq (even if just checking if its -1 or 0) before setup_irq()
> > call. This includes smp_intr_init(), which is what was the first thing
> > crashing with 0xff set there.
> 
> Even if the code is gated with !CONFIG_X86, it sounds wrong to me to have
> such check in an UART driver. It only prevents us to do an out-of-bound
> access. There are no guarantee the interrupt will be usable (on Arm 256 is a
> valid interrupt).

It's a sanity check of a value we get from the hardware, I don't think
it's that strange. It's mostly similar to doing sanity checks of input
values we get from users.

Could you add an error message to note that an incorrect irq to use
was reported by hardware?

Thanks, Roger.
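
A validation step that also reports the rejected value might look roughly like the sketch below. It is standalone C, so it logs with fprintf() to stderr rather than whatever logging call the hypervisor would use; `checked_irq()`, the `limit` parameter and the message wording are all assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Return the IRQ to use: 'irq' itself if it is below 'limit', otherwise 0
 * (poll mode) after logging the rejected value. 'limit' stands in for
 * nr_irqs.
 */
static unsigned int checked_irq(unsigned int irq, unsigned int limit)
{
    if ( irq >= limit )
    {
        fprintf(stderr,
                "ns16550: hardware reported invalid IRQ %u (limit %u), "
                "falling back to polling\n", irq, limit);
        return 0;
    }

    return irq;
}
```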
Julien Grall March 11, 2022, 3:19 p.m. UTC | #12
Hi Roger,

On 11/03/2022 15:04, Roger Pau Monné wrote:
> On Fri, Mar 11, 2022 at 11:15:13AM +0000, Julien Grall wrote:
>> Hi,
>>
>> On 11/03/2022 10:52, Marek Marczykowski-Górecki wrote:
>>> On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
>>>> Hi Marek,
>>>>
>>>> On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
>>>>> On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/03/2022 16:12, Roger Pau Monné wrote:
>>>>>>> On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
>>>>>>>> On 10.03.2022 16:47, Roger Pau Monné wrote:
>>>>>>>>> On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
>>>>>>>>>> On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
>>>>>>>>>>> --- a/xen/drivers/char/ns16550.c
>>>>>>>>>>> +++ b/xen/drivers/char/ns16550.c
>>>>>>>>>>> @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
>>>>>>>>>>>                                  pci_conf_read8(PCI_SBDF(0, b, d, f),
>>>>>>>>>>>                                                 PCI_INTERRUPT_LINE) : 0;
>>>>>>>>>>> +                if (uart->irq >= nr_irqs)
>>>>>>>>>>> +                    uart->irq = 0;
>>>>>>>>>>
>>>>>>>>>> Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
>>>>>>>>>> immediately inside the parentheses.
>>>>>>>>>
>>>>>>>>> If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
>>>>>>>>
>>>>>>>> Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
>>>>>>>> right away. After all Arm wants to have an equivalent check here then,
>>>>>>>> not merely checking against nr_irqs instead. So putting a conditional
>>>>>>>> here right away would hide the need for putting in place an Arm-specific
>>>>>>>> alternative.
>>>>>>>
>>>>>>> Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
>>>>>> The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
>>>>>> not sure we will ever see a support for PCI UART card in Xen on Arm.
>>>>>>
>>>>>> However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
>>>>>> here because from the interrupt controller PoV 0xff may be a valid (GICv2
>>>>>> supports up to 1024 interrupts).
>>>>>>
>>>>>> Is there any reason we can't explicitely check 0xff?
>>>>>
>>>>> That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
>>>>> because the value is later used (on x86) to access irq_desc array (via
>>>>> irq_to_desc), which has nr_irqs size.
>>>>
>>>> I think it would be better if that check is closer to who access the
>>>> irq_desc. This would be helpful for other users (I am sure this is not the
>>>> only potential place where the IRQ may be wrong). So how about moving it in
>>>> setup_irq()?
>>>
>>> I don't like it, it's rather fragile approach (at least in the current
>>> code base, without some refactor). There are a bunch of places using
>>> uart->irq (even if just checking if its -1 or 0) before setup_irq()
>>> call. This includes smp_intr_init(), which is what was the first thing
>>> crashing with 0xff set there.
>>
>> Even if the code is gated with !CONFIG_X86, it sounds wrong to me to have
>> such check in an UART driver. It only prevents us to do an out-of-bound
>> access. There are no guarantee the interrupt will be usable (on Arm 256 is a
>> valid interrupt).
> 
> It's a sanity check of a value we get from the hardware, I don't think
> it's that strange.

I think it is strange because the behavior would be different between 
the architectures. On x86, we would reject the interrupt and poll. On 
Arm, we would accept the interrupt and the UART would be unusable.

> It's mostly similar to doing sanity checks of input
> values we get from users.
I am a bit concerned that we are using an unrelated check (see above
for why) to catch the "misconfiguration".

I think it would be good to understand why the interrupt line is 0xff 
and properly fix it. Is it a misconfiguration?  Is it intended to 
indicate "no IRQ"? Can we actually trust the value for the Intel LPSS?

Cheers,
Roger Pau Monné March 11, 2022, 3:43 p.m. UTC | #13
On Fri, Mar 11, 2022 at 03:19:22PM +0000, Julien Grall wrote:
> Hi Roger,
> 
> On 11/03/2022 15:04, Roger Pau Monné wrote:
> > On Fri, Mar 11, 2022 at 11:15:13AM +0000, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 11/03/2022 10:52, Marek Marczykowski-Górecki wrote:
> > > > On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
> > > > > Hi Marek,
> > > > > 
> > > > > On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
> > > > > > On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 10/03/2022 16:12, Roger Pau Monné wrote:
> > > > > > > > On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> > > > > > > > > On 10.03.2022 16:47, Roger Pau Monné wrote:
> > > > > > > > > > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> > > > > > > > > > > On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > --- a/xen/drivers/char/ns16550.c
> > > > > > > > > > > > +++ b/xen/drivers/char/ns16550.c
> > > > > > > > > > > > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
> > > > > > > > > > > >                                  pci_conf_read8(PCI_SBDF(0, b, d, f),
> > > > > > > > > > > >                                                 PCI_INTERRUPT_LINE) : 0;
> > > > > > > > > > > > +                if (uart->irq >= nr_irqs)
> > > > > > > > > > > > +                    uart->irq = 0;
> > > > > > > > > > > 
> > > > > > > > > > > Don't you mean nr_irqs_gsi here? Also (nit) please add the missing blanks
> > > > > > > > > > > immediately inside the parentheses.
> > > > > > > > > > 
> > > > > > > > > > If we use nr_irqs_gsi we will need to make the check x86 only AFAICT.
> > > > > > > > > 
> > > > > > > > > Down the road (when Arm wants to select HAS_PCI) - yes. Not necessarily
> > > > > > > > > right away. After all Arm wants to have an equivalent check here then,
> > > > > > > > > not merely checking against nr_irqs instead. So putting a conditional
> > > > > > > > > here right away would hide the need for putting in place an Arm-specific
> > > > > > > > > alternative.
> > > > > > > > 
> > > > > > > > Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled just yet.
> > > > > > > The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and CONFIG_X86. I am
> > > > > > > not sure we will ever see a support for PCI UART card in Xen on Arm.
> > > > > > > 
> > > > > > > However, if it evers happens then neither nr_irqs or nr_irqs_gsi would help
> > > > > > > here because from the interrupt controller PoV 0xff may be a valid (GICv2
> > > > > > > supports up to 1024 interrupts).
> > > > > > > 
> > > > > > > Is there any reason we can't explicitely check 0xff?
> > > > > > 
> > > > > > That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
> > > > > > because the value is later used (on x86) to access irq_desc array (via
> > > > > > irq_to_desc), which has nr_irqs size.
> > > > > 
> > > > > I think it would be better if that check is closer to who access the
> > > > > irq_desc. This would be helpful for other users (I am sure this is not the
> > > > > only potential place where the IRQ may be wrong). So how about moving it in
> > > > > setup_irq()?
> > > > 
> > > > I don't like it, it's rather fragile approach (at least in the current
> > > > code base, without some refactor). There are a bunch of places using
> > > > uart->irq (even if just checking if its -1 or 0) before setup_irq()
> > > > call. This includes smp_intr_init(), which is what was the first thing
> > > > crashing with 0xff set there.
> > > 
> > > Even if the code is gated with !CONFIG_X86, it sounds wrong to me to have
> > > such check in an UART driver. It only prevents us to do an out-of-bound
> > > access. There are no guarantee the interrupt will be usable (on Arm 256 is a
> > > valid interrupt).
> > 
> > It's a sanity check of a value we get from the hardware, I don't think
> > it's that strange.
> 
> I think it is strange because the behavior would be different between the
> architectures. On x86, we would reject the interrupt and poll. On Arm, we
> would accept the interrupt and the UART would be unusable.
> 
> > It's mostly similar to doing sanity checks of input
> > values we get from users.
> I am a bit concerned that we are using an unrelated check (see above
> why) to catch the "misconfiguration".
> 
> I think it would be good to understand why the interrupt line is 0xff and
> properly fix it. Is it a misconfiguration?  Is it intended to indicate "no
> IRQ"? Can we actually trust the value for the Intel LPSS?

Sorry, maybe this wasn't clear. My suggestion was not to just do this
fix and call it done, but rather to add this check for sanity and then
figure out how to properly handle this specific device.

So adding the check here is not a workaround in order to support Intel
LPSS, but rather a generic fix to ns16550 for an issue which happens
to be triggered by Intel LPSS. We would still need to figure out how to
handle that specific Line value. I haven't looked at the docs, will do
on Monday hopefully.

Thanks, Roger.
Marek Marczykowski-Górecki March 11, 2022, 4:32 p.m. UTC | #14
On Fri, Mar 11, 2022 at 04:43:22PM +0100, Roger Pau Monné wrote:
> Sorry, maybe this wasn't clear. My suggestion was not to just do this
> fix and call it done, but rather to add this check for sanity and then
> figure out how to properly handle this specific device.

Yes, I agree. Having it properly configured is preferred. Linux manages
to do that, but I'm not sure how exactly. But ...

> So adding the check here is not a workaround in order to support Intel
> LPSS, but rather a generic fix to ns16550 for an issue which happens
> to be triggered by Intel LPSS. We would still need to figure how to
> handle that specific Line value. I haven't looked at the docs, will do
> on Monday hopefully.

... having a fallback to poll mode is still better than crashing the
hypervisor or not using such a console at all.
Roger Pau Monné March 15, 2022, 10:02 a.m. UTC | #15
On Fri, Mar 11, 2022 at 05:32:45PM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Mar 11, 2022 at 04:43:22PM +0100, Roger Pau Monné wrote:
> > Sorry, maybe this wasn't clear. My suggestion was not to just do this
> > fix and call it done, but rather to add this check for sanity and then
> > figure out how to properly handle this specific device.
> 
> Yes, I agree. Having it properly configured is preferred. Linux manages
> to do that, but I'm not sure how exactly. But ...

I think it might get the interrupt from ACPI data, which is likely out
of scope for Xen. Can you take a look at ACPI data from the box and
see whether the interrupt is reported there? (search for a _CRS method
belonging to the LPSS device)

Sadly the LPSS spec doesn't contain any help regarding the usage of
0xff in the Interrupt Line register. Out of curiosity, can you print
what's in the Interrupt Pin register? (PCI_INTERRUPT_PIN)

Thanks, Roger.
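
Printing both registers could be sketched as below. The sketch works on a plain byte array holding a copy of the device's configuration space, so it is independent of any hypervisor accessor; `uses_intx()` and `dump_intr_regs()` are hypothetical helper names. The offsets 0x3c and 0x3d are the standard Interrupt Line and Interrupt Pin locations in the PCI configuration header, and a Pin value of zero is taken to mean the function uses no INTx pin, in line with the reading of the spec given earlier in the thread.

```c
#include <stdint.h>
#include <stdio.h>

#define PCI_INTERRUPT_LINE 0x3c /* standard config-space offsets */
#define PCI_INTERRUPT_PIN  0x3d

/* Non-zero if the function claims to use an INTx pin (Pin register != 0). */
static int uses_intx(const uint8_t *cfg)
{
    return cfg[PCI_INTERRUPT_PIN] != 0;
}

/* Print both interrupt registers from a copy of a device's config space. */
static void dump_intr_regs(const uint8_t *cfg)
{
    printf("Interrupt Line: 0x%02x, Interrupt Pin: 0x%02x%s\n",
           (unsigned int)cfg[PCI_INTERRUPT_LINE],
           (unsigned int)cfg[PCI_INTERRUPT_PIN],
           uses_intx(cfg) ? "" : " (no INTx pin used)");
}
```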

Patch

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index e5b4a9085516..2d7c8c11bc69 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -1221,6 +1221,9 @@  pci_uart_config(struct ns16550 *uart, bool_t skip_amt, unsigned int idx)
                             pci_conf_read8(PCI_SBDF(0, b, d, f),
                                            PCI_INTERRUPT_LINE) : 0;
 
+                if (uart->irq >= nr_irqs)
+                    uart->irq = 0;
+
                 return 0;
             }
         }