Message ID | de07d56188f13e222ddaa1b963c20f8d7d61350e.camel@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ns16550c: avoid crash in ns16550_endboot in PV shim mode | expand |
On 19.10.2023 18:21, David Woodhouse wrote: > --- a/xen/drivers/char/ns16550.c > +++ b/xen/drivers/char/ns16550.c > @@ -586,6 +586,8 @@ static void __init cf_check ns16550_endboot(struct serial_port *port) > > if ( uart->remapped_io_base ) > return; > + if (!hardware_domain) > + return; > rv = ioports_deny_access(hardware_domain, uart->io_base, uart->io_base + 7); > if ( rv != 0 ) > BUG(); Considering this is the only use of hardware_domain in the driver, fixing it like this (with style adjusted) is certainly an option. I'd like to ask though whether it makes sense for execution to make it here in shim mode, where there's no serial port anyway (unless we start supporting pass-through [of possibly even non-PCI devices] for the shim). I wonder whether we wouldn't want to bypass ns16550_init() - likely as well as the EHCI and XHCI ones - instead, until (if ever) such passing through was actually deemed necessary to support. Jan
On 19/10/2023 5:21 pm, David Woodhouse wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > > In shim mode there is no hardware_domain. Dereferencing the pointer > doesn't end well. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- > This is about as far as I got in my abortive attempt to use the PV shim > without an actual PV console being provided by the HVM hosting > environment. It still doesn't pass the guest's console through to > serial; that only seems to shim to an actual PV console. There's no such thing as a Xen VM without a PV console. And yes, this is an error, but that horse bolted 2 decades ago. It would be nice if having a "real" serial didn't crash like this, but PV Shim is specialised to transplant one normal-looking PV guest, and the interposition logic is tied to the PV console. ~Andrew
On Fri, 2023-10-20 at 11:14 +0100, Andrew Cooper wrote: > On 19/10/2023 5:21 pm, David Woodhouse wrote: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > In shim mode there is no hardware_domain. Dereferencing the pointer > > doesn't end well. > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > --- > > This is about as far as I got in my abortive attempt to use the PV shim > > without an actual PV console being provided by the HVM hosting > > environment. It still doesn't pass the guest's console through to > > serial; that only seems to shim to an actual PV console. > > There's no such thing as a Xen VM without a PV console. Huh? There are literally millions of them. Every EC2 Xen HVM instance boots with a serial device but no Xen console. That's true of the ones running on true Xen, as well as the newer launches which are running on top of Linux/KVM. We implemented Xen console support in the Nitro hypervisor, then had to disable it because it wasn't faithful to the production environment that guests previously experienced on Xen, and eventually ripped it out because it was dead code. Likewise, upstream Qemu's Xen emulation mode doesn't currently have console support (although I did just add it to get the shim working). > And yes, this is an error, but that horse bolted 2 decades ago. > > > It would be nice if having a "real" serial didn't crash like this, but > PV Shim is specialised to transplant one normal-looking PV guest, and > the interposition logic is tied to the PV console. That's a nicer model. When Spectre/Meltdown broke and we posted the 'Vixen' version of a shim, it actually implemented a console backend and would output to the serial port. But I do agree it's nicer not to. Even with a PV console though, it might still be useful to have the shim's output going to a serial port while the guest's output goes to the console though, to keep them separate.
On 20/10/2023 11:29 am, David Woodhouse wrote: > On Fri, 2023-10-20 at 11:14 +0100, Andrew Cooper wrote: >> On 19/10/2023 5:21 pm, David Woodhouse wrote: >>> From: David Woodhouse <dwmw@amazon.co.uk> >>> >>> In shim mode there is no hardware_domain. Dereferencing the pointer >>> doesn't end well. >>> >>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> >>> --- >>> This is about as far as I got in my abortive attempt to use the PV shim >>> without an actual PV console being provided by the HVM hosting >>> environment. It still doesn't pass the guest's console through to >>> serial; that only seems to shim to an actual PV console. >> There's no such thing as a Xen VM without a PV console. > Huh? There are literally millions of them. I'm very prepared to believe there are millions which don't overtly malfunction when you don't fill in the HVM Params. Which is very different from saying "there's a way in the Xen guest ABI to express 'you don't have a PV console' ". ~Andrew
On Fri, Oct 20, 2023 at 02:25:35PM +0100, Andrew Cooper wrote: > On 20/10/2023 11:29 am, David Woodhouse wrote: > > On Fri, 2023-10-20 at 11:14 +0100, Andrew Cooper wrote: > >> On 19/10/2023 5:21 pm, David Woodhouse wrote: > >>> From: David Woodhouse <dwmw@amazon.co.uk> > >>> > >>> In shim mode there is no hardware_domain. Dereferencing the pointer > >>> doesn't end well. > >>> > >>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > >>> --- > >>> This is about as far as I got in my abortive attempt to use the PV shim > >>> without an actual PV console being provided by the HVM hosting > >>> environment. It still doesn't pass the guest's console through to > >>> serial; that only seems to shim to an actual PV console. > >> There's no such thing as a Xen VM without a PV console. > > Huh? There are literally millions of them. > > I'm very prepared to believe there are millions which don't overtly > malfunction when you don't fill in the HVM Params. > > Which is very different from saying "there's a way in the Xen guest ABI > to express 'you don't have a PV console' ". FWIW, Linux assumes that either the console page or the event channel being 0 implies no console available [0], so I guess that's the ABI now. Roger. [0] https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258
On Fri, 2023-10-20 at 15:29 +0200, Roger Pau Monné wrote: > On Fri, Oct 20, 2023 at 02:25:35PM +0100, Andrew Cooper wrote: > > On 20/10/2023 11:29 am, David Woodhouse wrote: > > > On Fri, 2023-10-20 at 11:14 +0100, Andrew Cooper wrote: > > > > On 19/10/2023 5:21 pm, David Woodhouse wrote: > > > > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > > > > > > > In shim mode there is no hardware_domain. Dereferencing the pointer > > > > > doesn't end well. > > > > > > > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > > > > --- > > > > > This is about as far as I got in my abortive attempt to use the PV shim > > > > > without an actual PV console being provided by the HVM hosting > > > > > environment. It still doesn't pass the guest's console through to > > > > > serial; that only seems to shim to an actual PV console. > > > > There's no such thing as a Xen VM without a PV console. > > > Huh? There are literally millions of them. > > > > I'm very prepared to believe there are millions which don't overtly > > malfunction when you don't fill in the HVM Params. > > > > Which is very different from saying "there's a way in the Xen guest ABI > > to express 'you don't have a PV console' ". > > FWIW, Linux assumes that either the console page or the event channel > being 0 implies no console available [0], so I guess that's the ABI > now. Or if the HVMOP_get_param call returns an error. > Roger. > > [0] https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 I'm not convinced I believe what the comment says there about evtchn 0 being theoretically valid. I don't believe zero is a valid evtchn#, is it?
On 20/10/2023 14:37, David Woodhouse wrote: [snip] >> >> [0] https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 > > I'm not convinced I believe what the comment says there about evtchn 0 > being theoretically valid. I don't believe zero is a valid evtchn#, is > it? gfn 0 might be valid, but I'm also pretty sure evtchn 0 is not valid. Paul
On 20/10/2023 3:50 pm, Durrant, Paul wrote: > On 20/10/2023 14:37, David Woodhouse wrote: > [snip] >>> >>> [0] >>> https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 >> >> I'm not convinced I believe what the comment says there about evtchn 0 >> being theoretically valid. I don't believe zero is a valid evtchn#, is >> it? > > gfn 0 might be valid, but I'm also pretty sure evtchn 0 is not valid. GFN 0 very much is valid. evtchn 0 OTOH is explicitly not valid. From evtchn_init(): evtchn_from_port(d, 0)->state = ECS_RESERVED; However, the fields being 0 doesn't mean not available. That's the signal to saying "not connected yet", because that's what dom0 gets before xenconsoled starts up. ~Andrew
On Fri, Oct 20, 2023 at 04:16:16PM +0100, Andrew Cooper wrote: > On 20/10/2023 3:50 pm, Durrant, Paul wrote: > > On 20/10/2023 14:37, David Woodhouse wrote: > > [snip] > >>> > >>> [0] > >>> https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 > >> > >> I'm not convinced I believe what the comment says there about evtchn 0 > >> being theoretically valid. I don't believe zero is a valid evtchn#, is > >> it? > > > > gfn 0 might be valid, but I'm also pretty sure evtchn 0 is not valid. > > GFN 0 very much is valid. > > evtchn 0 OTOH is explicitly not valid. From evtchn_init(): > > evtchn_from_port(d, 0)->state = ECS_RESERVED; > > > However, the fields being 0 doesn't mean not available. That's the > signal to saying "not connected yet", because that's what dom0 gets > before xenconsoled starts up. Someone asked me the same a while back, and IIRC we don't state anywhere in the public headers that event channel 0 is reserved, however that has always? been part of the implementation. If we intend this to be reliable, we should add a define to the public headers in order to signal that 0 will always be reserved. Roger.
On 23.10.2023 09:52, Roger Pau Monné wrote: > On Fri, Oct 20, 2023 at 04:16:16PM +0100, Andrew Cooper wrote: >> On 20/10/2023 3:50 pm, Durrant, Paul wrote: >>> On 20/10/2023 14:37, David Woodhouse wrote: >>> [snip] >>>>> >>>>> [0] >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 >>>> >>>> I'm not convinced I believe what the comment says there about evtchn 0 >>>> being theoretically valid. I don't believe zero is a valid evtchn#, is >>>> it? >>> >>> gfn 0 might be valid, but I'm also pretty sure evtchn 0 is not valid. >> >> GFN 0 very much is valid. >> >> evtchn 0 OTOH is explicitly not valid. From evtchn_init(): >> >> evtchn_from_port(d, 0)->state = ECS_RESERVED; >> >> >> However, the fields being 0 doesn't mean not available. That's the >> signal to saying "not connected yet", because that's what dom0 gets >> before xenconsoled starts up. > > Someone asked me the same a while back, and IIRC we don't state > anywhere in the public headers that event channel 0 is reserved, > however that has always? been part of the implementation. > > If we intend this to be reliable, we should add a define to the public > headers in order to signal that 0 will always be reserved. I agree a comment should have been there; it's not clear to me what useful #define we could add. Jan
On Mon, Oct 23, 2023 at 10:05:48AM +0200, Jan Beulich wrote: > On 23.10.2023 09:52, Roger Pau Monné wrote: > > On Fri, Oct 20, 2023 at 04:16:16PM +0100, Andrew Cooper wrote: > >> On 20/10/2023 3:50 pm, Durrant, Paul wrote: > >>> On 20/10/2023 14:37, David Woodhouse wrote: > >>> [snip] > >>>>> > >>>>> [0] > >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/tty/hvc/hvc_xen.c#L258 > >>>> > >>>> I'm not convinced I believe what the comment says there about evtchn 0 > >>>> being theoretically valid. I don't believe zero is a valid evtchn#, is > >>>> it? > >>> > >>> gfn 0 might be valid, but I'm also pretty sure evtchn 0 is not valid. > >> > >> GFN 0 very much is valid. > >> > >> evtchn 0 OTOH is explicitly not valid. From evtchn_init(): > >> > >> evtchn_from_port(d, 0)->state = ECS_RESERVED; > >> > >> > >> However, the fields being 0 doesn't mean not available. That's the > >> signal to saying "not connected yet", because that's what dom0 gets > >> before xenconsoled starts up. > > > > Someone asked me the same a while back, and IIRC we don't state > > anywhere in the public headers that event channel 0 is reserved, > > however that has always? been part of the implementation. > > > > If we intend this to be reliable, we should add a define to the public > > headers in order to signal that 0 will always be reserved. > > I agree a comment should have been there; it's not clear to me what > useful #define we could add. `EVTCHN_PORT_INVALID 0` or some such, but a comment would also be fine, the point is to be part of the public interface. Thanks, Roger.
diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c index 28ddedd50d..0818f5578c 100644 --- a/xen/drivers/char/ns16550.c +++ b/xen/drivers/char/ns16550.c @@ -586,6 +586,8 @@ static void __init cf_check ns16550_endboot(struct serial_port *port) if ( uart->remapped_io_base ) return; + if (!hardware_domain) + return; rv = ioports_deny_access(hardware_domain, uart->io_base, uart->io_base + 7); if ( rv != 0 ) BUG();