diff mbox series

[v3,1/8] serial: fake IRQ-regs context in poll handlers

Message ID 893be03d-22cc-4b8c-8a54-6479961c5aa2@suse.com (mailing list archive)
State Superseded
Headers show
Series limit passing around of cpu_user_regs | expand

Commit Message

Jan Beulich Feb. 5, 2024, 1:27 p.m. UTC
In preparation of dropping the register parameters from
serial_[rt]x_interrupt() and in turn from IRQ handler functions,
register state needs making available another way for the few key
handlers which need it. Fake IRQ-like state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
other console poll functions we have, and it's unclear whether that's
actually generally correct.

Andrew suggested to move set_irq_regs() to BUGFRAME_run_fn handling;
it's not clear to me whether that would be (a) correct from an abstract
pov (that's exception, not interrupt context after all) and (b) really
beneficial.
---
v2: New.

Comments

Julien Grall Feb. 8, 2024, 10 p.m. UTC | #1
Hi Jan,

On 05/02/2024 13:27, Jan Beulich wrote:
> In preparation of dropping the register parameters from
> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> register state needs making available another way for the few key
> handlers which need it. Fake IRQ-like state.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> other console poll functions we have, and it's unclear whether that's
> actually generally correct.

Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
run_in_exception() doesn't exist. But looking at the caller, no-on seems 
to care about the 'regs'. So is this just a latent bug?

BTW, do you have an idea why the poll function is not run in an 
exception handler?

> 
> Andrew suggested to move set_irq_regs() to BUGFRAME_run_fn handling;
> it's not clear to me whether that would be (a) correct from an abstract
> pov (that's exception, not interrupt context after all) 

I agree with that.

> and (b) really beneficial.

I guess this could help to reduce the amount of churn. I can't really 
make my mind whether this is worth it or not. So I would keep it as you did.

Cheers,
Jan Beulich Feb. 12, 2024, 9:04 a.m. UTC | #2
On 08.02.2024 23:00, Julien Grall wrote:
> On 05/02/2024 13:27, Jan Beulich wrote:
>> In preparation of dropping the register parameters from
>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
>> register state needs making available another way for the few key
>> handlers which need it. Fake IRQ-like state.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
>> other console poll functions we have, and it's unclear whether that's
>> actually generally correct.
> 
> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
> to care about the 'regs'. So is this just a latent bug?

What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
And I can spot any use of guest_user_regs() on the respective generic
or Arm-specific bug.c paths.

> BTW, do you have an idea why the poll function is not run in an 
> exception handler?

"The poll function" being which one? If you mean the one in xhci-dbc.c
then that's why I had Cc-ed Marek. Moving him to To: - maybe that
manages to finally catch his attention.

>> Andrew suggested to move set_irq_regs() to BUGFRAME_run_fn handling;
>> it's not clear to me whether that would be (a) correct from an abstract
>> pov (that's exception, not interrupt context after all) 
> 
> I agree with that.
> 
>> and (b) really beneficial.
> 
> I guess this could help to reduce the amount of churn. I can't really 
> make my mind whether this is worth it or not. So I would keep it as you did.

Good, thanks.

Jan
Marek Marczykowski-Górecki Feb. 13, 2024, 3:43 a.m. UTC | #3
On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
> On 08.02.2024 23:00, Julien Grall wrote:
> > On 05/02/2024 13:27, Jan Beulich wrote:
> >> In preparation of dropping the register parameters from
> >> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> >> register state needs making available another way for the few key
> >> handlers which need it. Fake IRQ-like state.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> >> other console poll functions we have, and it's unclear whether that's
> >> actually generally correct.
> > 
> > Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
> > run_in_exception() doesn't exist. But looking at the caller, no-on seems 
> > to care about the 'regs'. So is this just a latent bug?
> 
> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
> And I can spot any use of guest_user_regs() on the respective generic
> or Arm-specific bug.c paths.
> 
> > BTW, do you have an idea why the poll function is not run in an 
> > exception handler?
> 
> "The poll function" being which one? If you mean the one in xhci-dbc.c
> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
> manages to finally catch his attention.

TBH, I don't know. That's part of the original xue patch at
https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
and it works for me as it is.

> >> Andrew suggested to move set_irq_regs() to BUGFRAME_run_fn handling;
> >> it's not clear to me whether that would be (a) correct from an abstract
> >> pov (that's exception, not interrupt context after all) 
> > 
> > I agree with that.
> > 
> >> and (b) really beneficial.
> > 
> > I guess this could help to reduce the amount of churn. I can't really 
> > make my mind whether this is worth it or not. So I would keep it as you did.
> 
> Good, thanks.
> 
> Jan
Jan Beulich Feb. 13, 2024, 7:45 a.m. UTC | #4
On 13.02.2024 04:43, Marek Marczykowski wrote:
> On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
>> On 08.02.2024 23:00, Julien Grall wrote:
>>> On 05/02/2024 13:27, Jan Beulich wrote:
>>>> In preparation of dropping the register parameters from
>>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
>>>> register state needs making available another way for the few key
>>>> handlers which need it. Fake IRQ-like state.
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
>>>> other console poll functions we have, and it's unclear whether that's
>>>> actually generally correct.
>>>
>>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
>>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
>>> to care about the 'regs'. So is this just a latent bug?
>>
>> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
>> And I can spot any use of guest_user_regs() on the respective generic
>> or Arm-specific bug.c paths.
>>
>>> BTW, do you have an idea why the poll function is not run in an 
>>> exception handler?
>>
>> "The poll function" being which one? If you mean the one in xhci-dbc.c
>> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
>> manages to finally catch his attention.
> 
> TBH, I don't know. That's part of the original xue patch at
> https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
> and it works for me as it is.

"Works" meaning what? Doesn't crash on you? Or does also provide
sensible output in _all_ cases (i.e. including when e.g. the poll
happens to run on an idle vCPU)?

Jan
Marek Marczykowski-Górecki Feb. 13, 2024, 2:53 p.m. UTC | #5
On Tue, Feb 13, 2024 at 08:45:54AM +0100, Jan Beulich wrote:
> On 13.02.2024 04:43, Marek Marczykowski wrote:
> > On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
> >> On 08.02.2024 23:00, Julien Grall wrote:
> >>> On 05/02/2024 13:27, Jan Beulich wrote:
> >>>> In preparation of dropping the register parameters from
> >>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> >>>> register state needs making available another way for the few key
> >>>> handlers which need it. Fake IRQ-like state.
> >>>>
> >>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >>>> ---
> >>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> >>>> other console poll functions we have, and it's unclear whether that's
> >>>> actually generally correct.
> >>>
> >>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
> >>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
> >>> to care about the 'regs'. So is this just a latent bug?
> >>
> >> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
> >> And I can spot any use of guest_user_regs() on the respective generic
> >> or Arm-specific bug.c paths.
> >>
> >>> BTW, do you have an idea why the poll function is not run in an 
> >>> exception handler?
> >>
> >> "The poll function" being which one? If you mean the one in xhci-dbc.c
> >> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
> >> manages to finally catch his attention.
> > 
> > TBH, I don't know. That's part of the original xue patch at
> > https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
> > and it works for me as it is.
> 
> "Works" meaning what? Doesn't crash on you? Or does also provide
> sensible output in _all_ cases (i.e. including when e.g. the poll
> happens to run on an idle vCPU)?

Generally provides sensible output, for example during boot (it is using
idle vCPU then, right?).
Jan Beulich Feb. 13, 2024, 3 p.m. UTC | #6
On 13.02.2024 15:53, Marek Marczykowski wrote:
> On Tue, Feb 13, 2024 at 08:45:54AM +0100, Jan Beulich wrote:
>> On 13.02.2024 04:43, Marek Marczykowski wrote:
>>> On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
>>>> On 08.02.2024 23:00, Julien Grall wrote:
>>>>> On 05/02/2024 13:27, Jan Beulich wrote:
>>>>>> In preparation of dropping the register parameters from
>>>>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
>>>>>> register state needs making available another way for the few key
>>>>>> handlers which need it. Fake IRQ-like state.
>>>>>>
>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>>> ---
>>>>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
>>>>>> other console poll functions we have, and it's unclear whether that's
>>>>>> actually generally correct.
>>>>>
>>>>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
>>>>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
>>>>> to care about the 'regs'. So is this just a latent bug?
>>>>
>>>> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
>>>> And I can spot any use of guest_user_regs() on the respective generic
>>>> or Arm-specific bug.c paths.
>>>>
>>>>> BTW, do you have an idea why the poll function is not run in an 
>>>>> exception handler?
>>>>
>>>> "The poll function" being which one? If you mean the one in xhci-dbc.c
>>>> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
>>>> manages to finally catch his attention.
>>>
>>> TBH, I don't know. That's part of the original xue patch at
>>> https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
>>> and it works for me as it is.
>>
>> "Works" meaning what? Doesn't crash on you? Or does also provide
>> sensible output in _all_ cases (i.e. including when e.g. the poll
>> happens to run on an idle vCPU)?
> 
> Generally provides sensible output, for example during boot (it is using
> idle vCPU then, right?).

Before Dom0 is started: Yes. With the exception of the phase where PV
Dom0's page tables are constructed, albeit in that time window
guest_cpu_user_regs() shouldn't yield sensible data either. I can only
say I'm surprised; since I have no way to properly test with an XHCI
debug port, I'd have to see about faking something to convince myself
(unless you were to supply example output).

Jan
Marek Marczykowski-Górecki Feb. 13, 2024, 3:11 p.m. UTC | #7
On Tue, Feb 13, 2024 at 04:00:32PM +0100, Jan Beulich wrote:
> On 13.02.2024 15:53, Marek Marczykowski wrote:
> > On Tue, Feb 13, 2024 at 08:45:54AM +0100, Jan Beulich wrote:
> >> On 13.02.2024 04:43, Marek Marczykowski wrote:
> >>> On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
> >>>> On 08.02.2024 23:00, Julien Grall wrote:
> >>>>> On 05/02/2024 13:27, Jan Beulich wrote:
> >>>>>> In preparation of dropping the register parameters from
> >>>>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> >>>>>> register state needs making available another way for the few key
> >>>>>> handlers which need it. Fake IRQ-like state.
> >>>>>>
> >>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >>>>>> ---
> >>>>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> >>>>>> other console poll functions we have, and it's unclear whether that's
> >>>>>> actually generally correct.
> >>>>>
> >>>>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
> >>>>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
> >>>>> to care about the 'regs'. So is this just a latent bug?
> >>>>
> >>>> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
> >>>> And I can spot any use of guest_user_regs() on the respective generic
> >>>> or Arm-specific bug.c paths.
> >>>>
> >>>>> BTW, do you have an idea why the poll function is not run in an 
> >>>>> exception handler?
> >>>>
> >>>> "The poll function" being which one? If you mean the one in xhci-dbc.c
> >>>> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
> >>>> manages to finally catch his attention.
> >>>
> >>> TBH, I don't know. That's part of the original xue patch at
> >>> https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
> >>> and it works for me as it is.
> >>
> >> "Works" meaning what? Doesn't crash on you? Or does also provide
> >> sensible output in _all_ cases (i.e. including when e.g. the poll
> >> happens to run on an idle vCPU)?
> > 
> > Generally provides sensible output, for example during boot (it is using
> > idle vCPU then, right?).
> 
> Before Dom0 is started: Yes. With the exception of the phase where PV
> Dom0's page tables are constructed, albeit in that time window
> guest_cpu_user_regs() shouldn't yield sensible data either. I can only
> say I'm surprised; since I have no way to properly test with an XHCI
> debug port, I'd have to see about faking something to convince myself
> (unless you were to supply example output).

Would you like me to test this series with xhci console? Or maybe add
some extra debug prints and include their output? But note, printk from
inside console code generally leads to deadlocks. What I did for some
debugging was to log into some separate buffer and dump it later.
Jan Beulich Feb. 13, 2024, 3:44 p.m. UTC | #8
On 13.02.2024 16:11, Marek Marczykowski wrote:
> On Tue, Feb 13, 2024 at 04:00:32PM +0100, Jan Beulich wrote:
>> On 13.02.2024 15:53, Marek Marczykowski wrote:
>>> On Tue, Feb 13, 2024 at 08:45:54AM +0100, Jan Beulich wrote:
>>>> On 13.02.2024 04:43, Marek Marczykowski wrote:
>>>>> On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
>>>>>> On 08.02.2024 23:00, Julien Grall wrote:
>>>>>>> On 05/02/2024 13:27, Jan Beulich wrote:
>>>>>>>> In preparation of dropping the register parameters from
>>>>>>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
>>>>>>>> register state needs making available another way for the few key
>>>>>>>> handlers which need it. Fake IRQ-like state.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>>>>> ---
>>>>>>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
>>>>>>>> other console poll functions we have, and it's unclear whether that's
>>>>>>>> actually generally correct.
>>>>>>>
>>>>>>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
>>>>>>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
>>>>>>> to care about the 'regs'. So is this just a latent bug?
>>>>>>
>>>>>> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
>>>>>> And I can spot any use of guest_user_regs() on the respective generic
>>>>>> or Arm-specific bug.c paths.
>>>>>>
>>>>>>> BTW, do you have an idea why the poll function is not run in an 
>>>>>>> exception handler?
>>>>>>
>>>>>> "The poll function" being which one? If you mean the one in xhci-dbc.c
>>>>>> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
>>>>>> manages to finally catch his attention.
>>>>>
>>>>> TBH, I don't know. That's part of the original xue patch at
>>>>> https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
>>>>> and it works for me as it is.
>>>>
>>>> "Works" meaning what? Doesn't crash on you? Or does also provide
>>>> sensible output in _all_ cases (i.e. including when e.g. the poll
>>>> happens to run on an idle vCPU)?
>>>
>>> Generally provides sensible output, for example during boot (it is using
>>> idle vCPU then, right?).
>>
>> Before Dom0 is started: Yes. With the exception of the phase where PV
>> Dom0's page tables are constructed, albeit in that time window
>> guest_cpu_user_regs() shouldn't yield sensible data either. I can only
>> say I'm surprised; since I have no way to properly test with an XHCI
>> debug port, I'd have to see about faking something to convince myself
>> (unless you were to supply example output).
> 
> Would you like me to test this series with xhci console?

The behavior shouldn't really be connected to this series. But yes, 'd'
debug key output (just the part for the CPU the key handling was
actually invoked from) with the xhci debug console would be of
interest, for the case where that CPU at that time runs an idle vCPU.

> Or maybe add
> some extra debug prints and include their output? But note, printk from
> inside console code generally leads to deadlocks. What I did for some
> debugging was to log into some separate buffer and dump it later.

Right, this would be more involved.

Jan
Marek Marczykowski-Górecki Feb. 15, 2024, 2:19 a.m. UTC | #9
On Tue, Feb 13, 2024 at 04:44:04PM +0100, Jan Beulich wrote:
> On 13.02.2024 16:11, Marek Marczykowski wrote:
> > On Tue, Feb 13, 2024 at 04:00:32PM +0100, Jan Beulich wrote:
> >> On 13.02.2024 15:53, Marek Marczykowski wrote:
> >>> On Tue, Feb 13, 2024 at 08:45:54AM +0100, Jan Beulich wrote:
> >>>> On 13.02.2024 04:43, Marek Marczykowski wrote:
> >>>>> On Mon, Feb 12, 2024 at 10:04:38AM +0100, Jan Beulich wrote:
> >>>>>> On 08.02.2024 23:00, Julien Grall wrote:
> >>>>>>> On 05/02/2024 13:27, Jan Beulich wrote:
> >>>>>>>> In preparation of dropping the register parameters from
> >>>>>>>> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> >>>>>>>> register state needs making available another way for the few key
> >>>>>>>> handlers which need it. Fake IRQ-like state.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >>>>>>>> ---
> >>>>>>>> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> >>>>>>>> other console poll functions we have, and it's unclear whether that's
> >>>>>>>> actually generally correct.
> >>>>>>>
> >>>>>>> Is it? Looking at ns16550_poll() we would pass guest_user_regs() if 
> >>>>>>> run_in_exception() doesn't exist. But looking at the caller, no-on seems 
> >>>>>>> to care about the 'regs'. So is this just a latent bug?
> >>>>>>
> >>>>>> What do you mean by "doesn't exist"? ns16550_poll() assumes it exists.
> >>>>>> And I can spot any use of guest_user_regs() on the respective generic
> >>>>>> or Arm-specific bug.c paths.
> >>>>>>
> >>>>>>> BTW, do you have an idea why the poll function is not run in an 
> >>>>>>> exception handler?
> >>>>>>
> >>>>>> "The poll function" being which one? If you mean the one in xhci-dbc.c
> >>>>>> then that's why I had Cc-ed Marek. Moving him to To: - maybe that
> >>>>>> manages to finally catch his attention.
> >>>>>
> >>>>> TBH, I don't know. That's part of the original xue patch at
> >>>>> https://github.com/connojd/xue/blob/master/patches/xen-xue-dbgp.patch
> >>>>> and it works for me as it is.
> >>>>
> >>>> "Works" meaning what? Doesn't crash on you? Or does also provide
> >>>> sensible output in _all_ cases (i.e. including when e.g. the poll
> >>>> happens to run on an idle vCPU)?
> >>>
> >>> Generally provides sensible output, for example during boot (it is using
> >>> idle vCPU then, right?).
> >>
> >> Before Dom0 is started: Yes. With the exception of the phase where PV
> >> Dom0's page tables are constructed, albeit in that time window
> >> guest_cpu_user_regs() shouldn't yield sensible data either. I can only
> >> say I'm surprised; since I have no way to properly test with an XHCI
> >> debug port, I'd have to see about faking something to convince myself
> >> (unless you were to supply example output).
> > 
> > Would you like me to test this series with xhci console?
> 
> The behavior shouldn't really be connected to this series. But yes, 'd'
> debug key output (just the part for the CPU the key handling was
> actually invoked from) with the xhci debug console would be of
> interest, for the case where that CPU at that time runs an idle vCPU.

I managed to press 'd' before dom0 started. Full output at
https://gist.github.com/marmarek/a495cd666f4aafed3a5cfcb8393f515b (it's
Alder Lake, and smt=off, so CPU numbering is weird).
Interestingly, I do _not_ see output for CPU0, where I'd expect the
key handler to run... I see all the idle ones, plus one doing memory
scrubbing.
But also, I don't see info about the handling CPU when doing `xl
debug-key d`. At one time, with `xl debug-key d` I got this:

(XEN) *** Dumping CPU6 guest state (d0v7): ***
(XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
(XEN) CPU:    6
(XEN) RIP:    e033:[<ffffffff81e1546a>]
(XEN) RFLAGS: 0000000000000286   EM: 0   CONTEXT: pv guest (d0v7)
(XEN) rax: 0000000000000023   rbx: 0000000000000005   rcx: ffffffff81e1546a
(XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: 000079147611e010
(XEN) rbp: ffff88810db53200   rsp: ffffc90041c6bde0   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000286
(XEN) r12: 0000000000305000   r13: 00007ffc61097f40   r14: ffff88810db53200
(XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000b526e0
(XEN) cr3: 00000004ae2a7000   cr2: 00000000006d3118
(XEN) fsb: 0000791475b8a380   gsb: ffff8881897c0000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffc90041c6bde0:
(XEN)    0000000000000001 0000000000000000 ffffffffc02905a6 0000000000000023
(XEN)    000079147611e010 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 8064129fc747f100 ffffffffc0291568 0000000000305000
(XEN)    8064129fc747f100 0000000000000005 ffffffff813f7d4d ffffc90041c6bf58
(XEN)    ffffc90041c6bf48 0000000000000000 0000000000000000 0000000000000000
(XEN)    ffffffff81e16158 00000000006d3118 ffffc90041c6bf58 0000000000000040
(XEN)    ffffffff8132f6bb 0000000000000006 ffffc90041c6bf58 00000000006d3118
(XEN)    0000000000000255 ffff888102cf8880 ffff888102cf88f0 ffffffff8108746f
(XEN)    0000000000000000 0000000000000002 ffffc90041c6bf58 ffffc90041c6bf58
(XEN)    00000000006d3118 0000000000000000 0000000000000006 0000000000000000
(XEN)    0000000000000000 ffffffff81e1a975 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffffffff8200009b 000000000043d9b0
(XEN)    000000000043d990 00007ffc61097f90 00007ffc61097fc0 00007ffc61099d16
(XEN)    00000000006cab40 0000000000000246 0000000000000001 0000000000000000
(XEN)    0000000000000006 ffffffffffffffda 0000791475f1ed6f 00007ffc61097f40
(XEN)    0000000000305000 0000000000000005 0000000000000010 0000791475f1ed6f
(XEN)    0000000000000033 0000000000000246 00007ffc61097ed0 000000000000002b
(XEN)     Fault while accessing guest memory.
(XEN) 
(XEN) *** Dumping CPU0 host state: ***
(XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d04022c07e>] _spin_unlock_irqrestore+0x21/0x27
(XEN) RFLAGS: 0000000000000206   CONTEXT: hypervisor
(XEN) rax: ffff82d0405c1038   rbx: 0000000000000200   rcx: 0000000000000008
(XEN) rdx: ffff830856d07fff   rsi: ffff8308529d5b28   rdi: ffff8308529d5b20
(XEN) rbp: ffff830856d07dc8   rsp: ffff830856d07dc0   r8:  0000000000000001
(XEN) r9:  ffff8308529d5b20   r10: ffff82d0405c13a0   r11: 000000d091e62221
(XEN) r12: ffff82d040476898   r13: 0000000000000296   r14: ffff82d040476918
(XEN) r15: ffff82cffff04700   cr0: 0000000080050033   cr4: 0000000000b526e0
(XEN) cr3: 000000082e7ff000   cr2: ffff888109538618
(XEN) fsb: 0000000000000000   gsb: ffff888189600000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d04022c07e> (_spin_unlock_irqrestore+0x21/0x27):
(XEN)  fd ff ff 48 09 1c 24 9d <48> 8b 5d f8 c9 c3 0f b7 47 04 66 25 ff 0f 66 3d
(XEN) Xen stack trace from rsp=ffff830856d07dc0:
(XEN)    ffff82d0405d0c80 ffff830856d07e08 ffff82d040257c3f 0000000040476898
(XEN)    ffff82d0405c1280 ffff82d040257bca ffff82d040476898 000000d0911fcbc4
(XEN)    0000000000000000 ffff830856d07e30 ffff82d04022d55c ffff82d0405c1280
(XEN)    ffff8308529d5f00 ffff82d0405d0d68 ffff830856d07e70 ffff82d04022de59
(XEN)    ffff830856d07ef8 ffff82d0405c7f00 ffffffffffffffff ffff82d0405c7f00
(XEN)    ffff830856d07fff 0000000000000000 ffff830856d07ea8 ffff82d04022b53e
(XEN)    0000000000000000 0000000000007fff ffff82d0405c7f00 ffff82d0405c11d0
(XEN)    ffff82d0405db2a0 ffff830856d07eb8 ffff82d04022b5d1 ffff830856d07ef0
(XEN)    ffff82d0402fcd15 ffff82d0402fcc88 ffff8308528cb000 ffff830856d07ef8
(XEN)    ffff830856ce2000 0000000000000000 ffff830856d07e18 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff82c1aa40
(XEN)    0000000000000000 0000000000000246 0000000000007ff0 0000000000000000
(XEN)    000000000fd109eb 0000000000000000 ffffffff81e153aa 4000000000000000
(XEN)    deadbeefdeadf00d deadbeefdeadf00d 0000010000000000 ffffffff81e153aa
(XEN)    000000000000e033 0000000000000246 ffffffff82c03dd0 000000000000e02b
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000e01000000000 ffff830856ce1000 0000000000000000 0000000000b526e0
(XEN)    0000000000000000 0000000000000000 0006030300000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d04022c07e>] R _spin_unlock_irqrestore+0x21/0x27
(XEN)    [<ffff82d040257c3f>] F xhci-dbc.c#dbc_uart_poll+0x75/0x17c
(XEN)    [<ffff82d04022d55c>] F timer.c#execute_timer+0x45/0x5c
(XEN)    [<ffff82d04022de59>] F timer.c#timer_softirq_action+0x71/0x275
(XEN)    [<ffff82d04022b53e>] F softirq.c#__do_softirq+0x94/0xbe
(XEN)    [<ffff82d04022b5d1>] F do_softirq+0x13/0x15
(XEN)    [<ffff82d0402fcd15>] F domain.c#idle_loop+0x8d/0xe6

(other CPUs in mwait-idle)

> > Or maybe add
> > some extra debug prints and include their output? But note, printk from
> > inside console code generally leads to deadlocks. What I did for some
> > debugging was to log into some separate buffer and dump it later.
> 
> Right, this would be more involved.
> 
> Jan
Jan Beulich Feb. 15, 2024, 8:39 a.m. UTC | #10
On 15.02.2024 03:19, Marek Marczykowski wrote:
> On Tue, Feb 13, 2024 at 04:44:04PM +0100, Jan Beulich wrote:
>> On 13.02.2024 16:11, Marek Marczykowski wrote:
>>> On Tue, Feb 13, 2024 at 04:00:32PM +0100, Jan Beulich wrote:
>>>> On 13.02.2024 15:53, Marek Marczykowski wrote:
>>>>> Generally provides sensible output, for example during boot (it is using
>>>>> idle vCPU then, right?).
>>>>
>>>> Before Dom0 is started: Yes. With the exception of the phase where PV
>>>> Dom0's page tables are constructed, albeit in that time window
>>>> guest_cpu_user_regs() shouldn't yield sensible data either. I can only
>>>> say I'm surprised; since I have no way to properly test with an XHCI
>>>> debug port, I'd have to see about faking something to convince myself
>>>> (unless you were to supply example output).
>>>
>>> Would you like me to test this series with xhci console?
>>
>> The behavior shouldn't really be connected to this series. But yes, 'd'
>> debug key output (just the part for the CPU the key handling was
>> actually invoked from) with the xhci debug console would be of
>> interest, for the case where that CPU at that time runs an idle vCPU.
> 
> I managed to press 'd' before dom0 started. Full output at
> https://gist.github.com/marmarek/a495cd666f4aafed3a5cfcb8393f515b (it's
> Alder Lake, and smt=off, so CPU numbering is weird).
> Interestingly, I do _not_ see output for CPU0, where I'd expect the
> key handler to run... I see all the idle ones, plus one doing memory
> scrubbing.

Which is precisely the problem, just in not exactly the manifestation
I expected. In dump_execstate() we dump host state only if the
incoming regs don't indicate guest state. Yet for the idle vCPU they
(wrongly) do here - see how guest_mode() calculates the delta to what
guest_cpu_user_regs() returns, i.e. 0 when what guest_cpu_user_regs()
returned is passed in.

Guest state dumping is suppressed for idle vCPU-s. Hence no output
at all for the CPU where the key processing was actually invoked
from.

> But also, I don't see info about the handling CPU when doing `xl
> debug-key d`.

I'm afraid I'm confused, ...

> At one time, with `xl debug-key d` I got this:
> 
> (XEN) *** Dumping CPU6 guest state (d0v7): ***
> (XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
> (XEN) CPU:    6
> (XEN) RIP:    e033:[<ffffffff81e1546a>]
> (XEN) RFLAGS: 0000000000000286   EM: 0   CONTEXT: pv guest (d0v7)
> (XEN) rax: 0000000000000023   rbx: 0000000000000005   rcx: ffffffff81e1546a
> (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: 000079147611e010
> (XEN) rbp: ffff88810db53200   rsp: ffffc90041c6bde0   r8:  0000000000000000
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000286
> (XEN) r12: 0000000000305000   r13: 00007ffc61097f40   r14: ffff88810db53200
> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000b526e0
> (XEN) cr3: 00000004ae2a7000   cr2: 00000000006d3118
> (XEN) fsb: 0000791475b8a380   gsb: ffff8881897c0000   gss: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
> (XEN) Guest stack trace from rsp=ffffc90041c6bde0:
> (XEN)    0000000000000001 0000000000000000 ffffffffc02905a6 0000000000000023
> (XEN)    000079147611e010 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 8064129fc747f100 ffffffffc0291568 0000000000305000
> (XEN)    8064129fc747f100 0000000000000005 ffffffff813f7d4d ffffc90041c6bf58
> (XEN)    ffffc90041c6bf48 0000000000000000 0000000000000000 0000000000000000
> (XEN)    ffffffff81e16158 00000000006d3118 ffffc90041c6bf58 0000000000000040
> (XEN)    ffffffff8132f6bb 0000000000000006 ffffc90041c6bf58 00000000006d3118
> (XEN)    0000000000000255 ffff888102cf8880 ffff888102cf88f0 ffffffff8108746f
> (XEN)    0000000000000000 0000000000000002 ffffc90041c6bf58 ffffc90041c6bf58
> (XEN)    00000000006d3118 0000000000000000 0000000000000006 0000000000000000
> (XEN)    0000000000000000 ffffffff81e1a975 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 ffffffff8200009b 000000000043d9b0
> (XEN)    000000000043d990 00007ffc61097f90 00007ffc61097fc0 00007ffc61099d16
> (XEN)    00000000006cab40 0000000000000246 0000000000000001 0000000000000000
> (XEN)    0000000000000006 ffffffffffffffda 0000791475f1ed6f 00007ffc61097f40
> (XEN)    0000000000305000 0000000000000005 0000000000000010 0000791475f1ed6f
> (XEN)    0000000000000033 0000000000000246 00007ffc61097ed0 000000000000002b
> (XEN)     Fault while accessing guest memory.
> (XEN) 
> (XEN) *** Dumping CPU0 host state: ***
> (XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d04022c07e>] _spin_unlock_irqrestore+0x21/0x27
> (XEN) RFLAGS: 0000000000000206   CONTEXT: hypervisor
> (XEN) rax: ffff82d0405c1038   rbx: 0000000000000200   rcx: 0000000000000008
> (XEN) rdx: ffff830856d07fff   rsi: ffff8308529d5b28   rdi: ffff8308529d5b20
> (XEN) rbp: ffff830856d07dc8   rsp: ffff830856d07dc0   r8:  0000000000000001
> (XEN) r9:  ffff8308529d5b20   r10: ffff82d0405c13a0   r11: 000000d091e62221
> (XEN) r12: ffff82d040476898   r13: 0000000000000296   r14: ffff82d040476918
> (XEN) r15: ffff82cffff04700   cr0: 0000000080050033   cr4: 0000000000b526e0
> (XEN) cr3: 000000082e7ff000   cr2: ffff888109538618
> (XEN) fsb: 0000000000000000   gsb: ffff888189600000   gss: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen code around <ffff82d04022c07e> (_spin_unlock_irqrestore+0x21/0x27):
> (XEN)  fd ff ff 48 09 1c 24 9d <48> 8b 5d f8 c9 c3 0f b7 47 04 66 25 ff 0f 66 3d
> (XEN) Xen stack trace from rsp=ffff830856d07dc0:
> (XEN)    ffff82d0405d0c80 ffff830856d07e08 ffff82d040257c3f 0000000040476898
> (XEN)    ffff82d0405c1280 ffff82d040257bca ffff82d040476898 000000d0911fcbc4
> (XEN)    0000000000000000 ffff830856d07e30 ffff82d04022d55c ffff82d0405c1280
> (XEN)    ffff8308529d5f00 ffff82d0405d0d68 ffff830856d07e70 ffff82d04022de59
> (XEN)    ffff830856d07ef8 ffff82d0405c7f00 ffffffffffffffff ffff82d0405c7f00
> (XEN)    ffff830856d07fff 0000000000000000 ffff830856d07ea8 ffff82d04022b53e
> (XEN)    0000000000000000 0000000000007fff ffff82d0405c7f00 ffff82d0405c11d0
> (XEN)    ffff82d0405db2a0 ffff830856d07eb8 ffff82d04022b5d1 ffff830856d07ef0
> (XEN)    ffff82d0402fcd15 ffff82d0402fcc88 ffff8308528cb000 ffff830856d07ef8
> (XEN)    ffff830856ce2000 0000000000000000 ffff830856d07e18 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff82c1aa40
> (XEN)    0000000000000000 0000000000000246 0000000000007ff0 0000000000000000
> (XEN)    000000000fd109eb 0000000000000000 ffffffff81e153aa 4000000000000000
> (XEN)    deadbeefdeadf00d deadbeefdeadf00d 0000010000000000 ffffffff81e153aa
> (XEN)    000000000000e033 0000000000000246 ffffffff82c03dd0 000000000000e02b
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000e01000000000 ffff830856ce1000 0000000000000000 0000000000b526e0
> (XEN)    0000000000000000 0000000000000000 0006030300000000 0000000000000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82d04022c07e>] R _spin_unlock_irqrestore+0x21/0x27
> (XEN)    [<ffff82d040257c3f>] F xhci-dbc.c#dbc_uart_poll+0x75/0x17c
> (XEN)    [<ffff82d04022d55c>] F timer.c#execute_timer+0x45/0x5c
> (XEN)    [<ffff82d04022de59>] F timer.c#timer_softirq_action+0x71/0x275
> (XEN)    [<ffff82d04022b53e>] F softirq.c#__do_softirq+0x94/0xbe
> (XEN)    [<ffff82d04022b5d1>] F do_softirq+0x13/0x15
> (XEN)    [<ffff82d0402fcd15>] F domain.c#idle_loop+0x8d/0xe6

... this looks to be the issuing CPU. What I don't understand is why we
are in _spin_unlock_irqrestore() here, called out of dbc_uart_poll().

Btw - was any/all of this with or without the series here applied?

Jan
Jan Beulich Feb. 15, 2024, 8:45 a.m. UTC | #11
On 05.02.2024 14:27, Jan Beulich wrote:
> In preparation of dropping the register parameters from
> serial_[rt]x_interrupt() and in turn from IRQ handler functions,
> register state needs making available another way for the few key
> handlers which need it. Fake IRQ-like state.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> The use of guest_cpu_user_regs() in dbc_uart_poll() is inconsistent with
> other console poll functions we have, and it's unclear whether that's
> actually generally correct.

It occurs to me that due to this behavior, ...

> --- a/xen/drivers/char/xhci-dbc.c
> +++ b/xen/drivers/char/xhci-dbc.c
> @@ -1164,6 +1164,7 @@ static void cf_check dbc_uart_poll(void
>      struct dbc_uart *uart = port->uart;
>      struct dbc *dbc = &uart->dbc;
>      unsigned long flags = 0;
> +    struct cpu_user_regs *old_regs;
>  
>      if ( spin_trylock_irqsave(&port->tx_lock, flags) )
>      {
> @@ -1175,10 +1176,15 @@ static void cf_check dbc_uart_poll(void
>          spin_unlock_irqrestore(&port->tx_lock, flags);
>      }
>  
> +    /* Mimic interrupt context. */
> +    old_regs = set_irq_regs(guest_cpu_user_regs());
> +
>      while ( dbc_work_ring_size(&dbc->dbc_iwork) )
>          serial_rx_interrupt(port, guest_cpu_user_regs());
>  
>      serial_tx_interrupt(port, guest_cpu_user_regs());
> +
> +    set_irq_regs(old_regs);
>      set_timer(&uart->timer, NOW() + MICROSECS(DBC_POLL_INTERVAL));
>  }
>  
> 

... this ought to be unnecessary, considering what dump_registers()
is changed to in patch 2.

Jan
Marek Marczykowski-Górecki Feb. 15, 2024, 11:08 a.m. UTC | #12
On Thu, Feb 15, 2024 at 09:39:41AM +0100, Jan Beulich wrote:
> On 15.02.2024 03:19, Marek Marczykowski wrote:
> > On Tue, Feb 13, 2024 at 04:44:04PM +0100, Jan Beulich wrote:
> >> On 13.02.2024 16:11, Marek Marczykowski wrote:
> >>> On Tue, Feb 13, 2024 at 04:00:32PM +0100, Jan Beulich wrote:
> >>>> On 13.02.2024 15:53, Marek Marczykowski wrote:
> >>>>> Generally provides sensible output, for example during boot (it is using
> >>>>> idle vCPU then, right?).
> >>>>
> >>>> Before Dom0 is started: Yes. With the exception of the phase where PV
> >>>> Dom0's page tables are constructed, albeit in that time window
> >>>> guest_cpu_user_regs() shouldn't yield sensible data either. I can only
> >>>> say I'm surprised; since I have no way to properly test with an XHCI
> >>>> debug port, I'd have to see about faking something to convince myself
> >>>> (unless you were to supply example output).
> >>>
> >>> Would you like me to test this series with xhci console?
> >>
> >> The behavior shouldn't really be connected to this series. But yes, 'd'
> >> debug key output (just the part for the CPU the key handling was
> >> actually invoked from) with the xhci debug console would be of
> >> interest, for the case where that CPU at that time runs an idle vCPU.
> > 
> > I managed to press 'd' before dom0 started. Full output at
> > https://gist.github.com/marmarek/a495cd666f4aafed3a5cfcb8393f515b (it's
> > Alder Lake, and smt=off, so CPU numbering is weird).
> > Interestingly, I do _not_ see output for CPU0, where I'd expect the
> > key handler to run... I see all the idle ones, plus one doing memory
> > scrubbing.
> 
> Which is precisely the problem, just in not exactly the manifestation
> I expected. In dump_execstate() we dump host state only if the
> incoming regs don't indicate guest state. Yet for the idle vCPU they
> (wrongly) do here - see how guest_mode() calculates the delta to what
> guest_cpu_user_regs() returns, i.e. 0 when what guest_cpu_user_regs()
> returned is passed in.
> 
> Guest state dumping is suppressed for idle vCPU-s. Hence no output
> at all for the CPU where the key processing was actually invoked
> from.
> 
> > But also, I don't see info about the handling CPU when doing `xl
> > debug-key d`.
> 
> I'm afraid I'm confused, ...
> 
> > At one time, with `xl debug-key d` I got this:
> > 
> > (XEN) *** Dumping CPU6 guest state (d0v7): ***
> > (XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
> > (XEN) CPU:    6
> > (XEN) RIP:    e033:[<ffffffff81e1546a>]
> > (XEN) RFLAGS: 0000000000000286   EM: 0   CONTEXT: pv guest (d0v7)
> > (XEN) rax: 0000000000000023   rbx: 0000000000000005   rcx: ffffffff81e1546a
> > (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: 000079147611e010
> > (XEN) rbp: ffff88810db53200   rsp: ffffc90041c6bde0   r8:  0000000000000000
> > (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000286
> > (XEN) r12: 0000000000305000   r13: 00007ffc61097f40   r14: ffff88810db53200
> > (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000b526e0
> > (XEN) cr3: 00000004ae2a7000   cr2: 00000000006d3118
> > (XEN) fsb: 0000791475b8a380   gsb: ffff8881897c0000   gss: 0000000000000000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
> > (XEN) Guest stack trace from rsp=ffffc90041c6bde0:
> > (XEN)    0000000000000001 0000000000000000 ffffffffc02905a6 0000000000000023
> > (XEN)    000079147611e010 0000000000000000 0000000000000000 0000000000000000
> > (XEN)    0000000000000000 8064129fc747f100 ffffffffc0291568 0000000000305000
> > (XEN)    8064129fc747f100 0000000000000005 ffffffff813f7d4d ffffc90041c6bf58
> > (XEN)    ffffc90041c6bf48 0000000000000000 0000000000000000 0000000000000000
> > (XEN)    ffffffff81e16158 00000000006d3118 ffffc90041c6bf58 0000000000000040
> > (XEN)    ffffffff8132f6bb 0000000000000006 ffffc90041c6bf58 00000000006d3118
> > (XEN)    0000000000000255 ffff888102cf8880 ffff888102cf88f0 ffffffff8108746f
> > (XEN)    0000000000000000 0000000000000002 ffffc90041c6bf58 ffffc90041c6bf58
> > (XEN)    00000000006d3118 0000000000000000 0000000000000006 0000000000000000
> > (XEN)    0000000000000000 ffffffff81e1a975 0000000000000000 0000000000000000
> > (XEN)    0000000000000000 0000000000000000 ffffffff8200009b 000000000043d9b0
> > (XEN)    000000000043d990 00007ffc61097f90 00007ffc61097fc0 00007ffc61099d16
> > (XEN)    00000000006cab40 0000000000000246 0000000000000001 0000000000000000
> > (XEN)    0000000000000006 ffffffffffffffda 0000791475f1ed6f 00007ffc61097f40
> > (XEN)    0000000000305000 0000000000000005 0000000000000010 0000791475f1ed6f
> > (XEN)    0000000000000033 0000000000000246 00007ffc61097ed0 000000000000002b
> > (XEN)     Fault while accessing guest memory.
> > (XEN) 
> > (XEN) *** Dumping CPU0 host state: ***
> > (XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:  M     ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82d04022c07e>] _spin_unlock_irqrestore+0x21/0x27
> > (XEN) RFLAGS: 0000000000000206   CONTEXT: hypervisor
> > (XEN) rax: ffff82d0405c1038   rbx: 0000000000000200   rcx: 0000000000000008
> > (XEN) rdx: ffff830856d07fff   rsi: ffff8308529d5b28   rdi: ffff8308529d5b20
> > (XEN) rbp: ffff830856d07dc8   rsp: ffff830856d07dc0   r8:  0000000000000001
> > (XEN) r9:  ffff8308529d5b20   r10: ffff82d0405c13a0   r11: 000000d091e62221
> > (XEN) r12: ffff82d040476898   r13: 0000000000000296   r14: ffff82d040476918
> > (XEN) r15: ffff82cffff04700   cr0: 0000000080050033   cr4: 0000000000b526e0
> > (XEN) cr3: 000000082e7ff000   cr2: ffff888109538618
> > (XEN) fsb: 0000000000000000   gsb: ffff888189600000   gss: 0000000000000000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> > (XEN) Xen code around <ffff82d04022c07e> (_spin_unlock_irqrestore+0x21/0x27):
> > (XEN)  fd ff ff 48 09 1c 24 9d <48> 8b 5d f8 c9 c3 0f b7 47 04 66 25 ff 0f 66 3d
> > (XEN) Xen stack trace from rsp=ffff830856d07dc0:
> > (XEN)    ffff82d0405d0c80 ffff830856d07e08 ffff82d040257c3f 0000000040476898
> > (XEN)    ffff82d0405c1280 ffff82d040257bca ffff82d040476898 000000d0911fcbc4
> > (XEN)    0000000000000000 ffff830856d07e30 ffff82d04022d55c ffff82d0405c1280
> > (XEN)    ffff8308529d5f00 ffff82d0405d0d68 ffff830856d07e70 ffff82d04022de59
> > (XEN)    ffff830856d07ef8 ffff82d0405c7f00 ffffffffffffffff ffff82d0405c7f00
> > (XEN)    ffff830856d07fff 0000000000000000 ffff830856d07ea8 ffff82d04022b53e
> > (XEN)    0000000000000000 0000000000007fff ffff82d0405c7f00 ffff82d0405c11d0
> > (XEN)    ffff82d0405db2a0 ffff830856d07eb8 ffff82d04022b5d1 ffff830856d07ef0
> > (XEN)    ffff82d0402fcd15 ffff82d0402fcc88 ffff8308528cb000 ffff830856d07ef8
> > (XEN)    ffff830856ce2000 0000000000000000 ffff830856d07e18 0000000000000000
> > (XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff82c1aa40
> > (XEN)    0000000000000000 0000000000000246 0000000000007ff0 0000000000000000
> > (XEN)    000000000fd109eb 0000000000000000 ffffffff81e153aa 4000000000000000
> > (XEN)    deadbeefdeadf00d deadbeefdeadf00d 0000010000000000 ffffffff81e153aa
> > (XEN)    000000000000e033 0000000000000246 ffffffff82c03dd0 000000000000e02b
> > (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN)    0000e01000000000 ffff830856ce1000 0000000000000000 0000000000b526e0
> > (XEN)    0000000000000000 0000000000000000 0006030300000000 0000000000000000
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82d04022c07e>] R _spin_unlock_irqrestore+0x21/0x27
> > (XEN)    [<ffff82d040257c3f>] F xhci-dbc.c#dbc_uart_poll+0x75/0x17c
> > (XEN)    [<ffff82d04022d55c>] F timer.c#execute_timer+0x45/0x5c
> > (XEN)    [<ffff82d04022de59>] F timer.c#timer_softirq_action+0x71/0x275
> > (XEN)    [<ffff82d04022b53e>] F softirq.c#__do_softirq+0x94/0xbe
> > (XEN)    [<ffff82d04022b5d1>] F do_softirq+0x13/0x15
> > (XEN)    [<ffff82d0402fcd15>] F domain.c#idle_loop+0x8d/0xe6
> 
> ... this looks to be the issuing CPU. What I don't understand is why we
> are in _spin_unlock_irqrestore() here, called out of dbc_uart_poll().
> 
> Btw - was any/all of this with or without the series here applied?

This was without this series applied.
diff mbox series

Patch

--- a/xen/drivers/char/ehci-dbgp.c
+++ b/xen/drivers/char/ehci-dbgp.c
@@ -1253,6 +1253,7 @@  static void cf_check _ehci_dbgp_poll(str
     unsigned long flags;
     unsigned int timeout = MICROSECS(DBGP_CHECK_INTERVAL);
     bool empty = false;
+    struct cpu_user_regs *old_regs;
 
     if ( !dbgp->ehci_debug )
         return;
@@ -1268,12 +1269,17 @@  static void cf_check _ehci_dbgp_poll(str
         spin_unlock_irqrestore(&port->tx_lock, flags);
     }
 
+    /* Mimic interrupt context. */
+    old_regs = set_irq_regs(regs);
+
     if ( dbgp->in.chunk )
         serial_rx_interrupt(port, regs);
 
     if ( empty )
         serial_tx_interrupt(port, regs);
 
+    set_irq_regs(old_regs);
+
     if ( spin_trylock_irqsave(&port->tx_lock, flags) )
     {
         if ( dbgp->state == dbgp_idle && !dbgp->in.chunk &&
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -211,10 +211,14 @@  static void cf_check __ns16550_poll(stru
 {
     struct serial_port *port = this_cpu(poll_port);
     struct ns16550 *uart = port->uart;
+    struct cpu_user_regs *old_regs;
 
     if ( uart->intr_works )
         return; /* Interrupts work - no more polling */
 
+    /* Mimic interrupt context. */
+    old_regs = set_irq_regs(regs);
+
     while ( ns_read_reg(uart, UART_LSR) & UART_LSR_DR )
     {
         if ( ns16550_ioport_invalid(uart) )
@@ -227,6 +231,7 @@  static void cf_check __ns16550_poll(stru
         serial_tx_interrupt(port, regs);
 
 out:
+    set_irq_regs(old_regs);
     set_timer(&uart->timer, NOW() + MILLISECS(uart->timeout_ms));
 }
 
--- a/xen/drivers/char/xhci-dbc.c
+++ b/xen/drivers/char/xhci-dbc.c
@@ -1164,6 +1164,7 @@  static void cf_check dbc_uart_poll(void
     struct dbc_uart *uart = port->uart;
     struct dbc *dbc = &uart->dbc;
     unsigned long flags = 0;
+    struct cpu_user_regs *old_regs;
 
     if ( spin_trylock_irqsave(&port->tx_lock, flags) )
     {
@@ -1175,10 +1176,15 @@  static void cf_check dbc_uart_poll(void
         spin_unlock_irqrestore(&port->tx_lock, flags);
     }
 
+    /* Mimic interrupt context. */
+    old_regs = set_irq_regs(guest_cpu_user_regs());
+
     while ( dbc_work_ring_size(&dbc->dbc_iwork) )
         serial_rx_interrupt(port, guest_cpu_user_regs());
 
     serial_tx_interrupt(port, guest_cpu_user_regs());
+
+    set_irq_regs(old_regs);
     set_timer(&uart->timer, NOW() + MICROSECS(DBC_POLL_INTERVAL));
 }