Message ID | 20200605110240.52545-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [for-4.14,v2] x86/rtc: provide mediated access to RTC for PVH dom0 | expand |
On Fri, Jun 05, 2020 at 01:02:39PM +0200, Roger Pau Monne wrote: > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > code paths (guest_io_{write/read}), but those accesses where never > implemented for PVHv2 dom0. This patch provides such mediated accesses > to the RTC for PVH dom0, just like it's provided for a classic PV > dom0. > > Pull out some of the RTC logic from guest_io_{read/write} into > specific helpers that can be used by both PV and HVM guests. The > setup of the handlers for PVH is done in rtc_init, which is already > used to initialize the fully emulated RTC. > > Without this a Linux PVH dom0 will read garbage when trying to access > the RTC, and one vCPU will be constantly looping in > rtc_timer_do_work. > > Note that such issue doesn't happen on domUs because the ACPI > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > accesses are not emulated but rather forwarded to the physical > hardware. > > No functional change expected for classic PV dom0. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wl@xen.org>
> -----Original Message----- > From: Roger Pau Monne <roger.pau@citrix.com> > Sent: 05 June 2020 12:03 > To: xen-devel@lists.xenproject.org > Cc: paul@xen.org; Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew > Cooper <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org> > Subject: [PATCH for-4.14 v2] x86/rtc: provide mediated access to RTC for PVH dom0 > > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > code paths (guest_io_{write/read}), but those accesses where never > implemented for PVHv2 dom0. This patch provides such mediated accesses > to the RTC for PVH dom0, just like it's provided for a classic PV > dom0. > > Pull out some of the RTC logic from guest_io_{read/write} into > specific helpers that can be used by both PV and HVM guests. The > setup of the handlers for PVH is done in rtc_init, which is already > used to initialize the fully emulated RTC. > > Without this a Linux PVH dom0 will read garbage when trying to access > the RTC, and one vCPU will be constantly looping in > rtc_timer_do_work. > > Note that such issue doesn't happen on domUs because the ACPI > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > accesses are not emulated but rather forwarded to the physical > hardware. > > No functional change expected for classic PV dom0. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Paul Durrant <paul@xen.org> > --- > for-4.14 reasoning: the fix is mostly isolated to PVH dom0, and as > such the risk is very low of causing issues to other guests types, but > without this fix one vCPU when using a Linux dom0 will be constantly > looping over rtc_timer_do_work with 100% CPU usage, at least when > using Linux 4.19 or newer. > --- > Changes since v1: > - Share the code with PV. > - Add access checks to the IO ports. > --- > xen/arch/x86/hvm/rtc.c | 34 ++++++++++++++++++ > xen/arch/x86/pv/emul-priv-op.c | 28 +++------------ > xen/arch/x86/time.c | 59 +++++++++++++++++++++++++++++++ > xen/include/asm-x86/mc146818rtc.h | 3 ++ > 4 files changed, 100 insertions(+), 24 deletions(-) > > diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c > index 5bbbdc0e0f..9f304a0fb4 100644 > --- a/xen/arch/x86/hvm/rtc.c > +++ b/xen/arch/x86/hvm/rtc.c > @@ -22,6 +22,7 @@ > * IN THE SOFTWARE. > */ > > +#include <asm/iocap.h> > #include <asm/mc146818rtc.h> > #include <asm/hvm/vpt.h> > #include <asm/hvm/io.h> > @@ -808,10 +809,43 @@ void rtc_reset(struct domain *d) > s->pt.source = PTSRC_isa; > } > > +/* RTC mediator for HVM hardware domain. */ > +static int hw_rtc_io(int dir, unsigned int port, unsigned int size, > + uint32_t *val) > +{ > + if ( dir == IOREQ_READ ) > + *val = ~0; > + > + if ( size != 1 ) > + { > + gdprintk(XENLOG_WARNING, "bad RTC access size (%u)\n", size); > + return X86EMUL_OKAY; > + } > + if ( !ioports_access_permitted(current->domain, port, port) ) > + { > + gdprintk(XENLOG_WARNING, "access to RTC port %x not allowed\n", port); > + return X86EMUL_OKAY; > + } > + > + if ( dir == IOREQ_WRITE ) > + rtc_guest_write(port, *val); > + else > + *val = rtc_guest_read(port); > + > + return X86EMUL_OKAY; > +} > + > void rtc_init(struct domain *d) > { > RTCState *s = domain_vrtc(d); > > + if ( is_hardware_domain(d) ) > + { > + /* Hardware domain gets mediated access to the physical RTC. */ > + register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); > + return; > + } > + > if ( !has_vrtc(d) ) > return; > > diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c > index fad6c754ad..1597f27ed9 100644 > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -280,19 +280,10 @@ static uint32_t guest_io_read(unsigned int port, unsigned int bytes, > { > sub_data = pv_pit_handler(port, 0, 0); > } > - else if ( port == RTC_PORT(0) ) > - { > - sub_data = currd->arch.cmos_idx; > - } > - else if ( (port == RTC_PORT(1)) && > + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && > ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) > { > - unsigned long flags; > - > - spin_lock_irqsave(&rtc_lock, flags); > - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > - sub_data = inb(RTC_PORT(1)); > - spin_unlock_irqrestore(&rtc_lock, flags); > + sub_data = rtc_guest_read(port); > } > else if ( (port == 0xcf8) && (bytes == 4) ) > { > @@ -413,21 +404,10 @@ static void guest_io_write(unsigned int port, unsigned int bytes, > { > pv_pit_handler(port, (uint8_t)data, 1); > } > - else if ( port == RTC_PORT(0) ) > - { > - currd->arch.cmos_idx = data; > - } > - else if ( (port == RTC_PORT(1)) && > + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && > ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) > { > - unsigned long flags; > - > - if ( pv_rtc_handler ) > - pv_rtc_handler(currd->arch.cmos_idx & 0x7f, data); > - spin_lock_irqsave(&rtc_lock, flags); > - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > - outb(data, RTC_PORT(1)); > - spin_unlock_irqrestore(&rtc_lock, flags); > + rtc_guest_write(port, data); > } > else if ( (port == 0xcf8) && (bytes == 4) ) > { > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index bbaea3aa65..e1c81067ce 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -27,6 +27,7 @@ > #include <xen/keyhandler.h> > #include <xen/guest_access.h> > #include <asm/io.h> > +#include <asm/iocap.h> > #include <asm/msr.h> > #include <asm/mpspec.h> > #include <asm/processor.h> > @@ -1110,6 +1111,64 @@ static unsigned long get_cmos_time(void) > return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); > } > > +/* Helpers for guest accesses to the physical RTC. */ > +unsigned int rtc_guest_read(unsigned int port) > +{ > + const struct domain *currd = current->domain; > + unsigned long flags; > + unsigned int data = ~0; > + > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); > + if ( !ioports_access_permitted(currd, port, port) ) > + { > + ASSERT_UNREACHABLE(); > + return data; > + } > + > + switch ( port ) > + { > + case RTC_PORT(0): > + data = currd->arch.cmos_idx; > + break; > + > + case RTC_PORT(1): > + spin_lock_irqsave(&rtc_lock, flags); > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > + data = inb(RTC_PORT(1)); > + spin_unlock_irqrestore(&rtc_lock, flags); > + break; > + } > + > + return data; > +} > + > +void rtc_guest_write(unsigned int port, unsigned int data) > +{ > + struct domain *currd = current->domain; > + unsigned long flags; > + > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); > + if ( !ioports_access_permitted(currd, port, port) ) > + { > + ASSERT_UNREACHABLE(); > + return; > + } > + > + switch ( port ) > + { > + case RTC_PORT(0): > + currd->arch.cmos_idx = data; > + break; > + > + case RTC_PORT(1): > + spin_lock_irqsave(&rtc_lock, flags); > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > + outb(data, RTC_PORT(1)); > + spin_unlock_irqrestore(&rtc_lock, flags); > + break; > + } > +} > + > static unsigned long get_wallclock_time(void) > { > #ifdef CONFIG_XEN_GUEST > diff --git a/xen/include/asm-x86/mc146818rtc.h b/xen/include/asm-x86/mc146818rtc.h > index 8758528f7c..803b236c0a 100644 > --- a/xen/include/asm-x86/mc146818rtc.h > +++ b/xen/include/asm-x86/mc146818rtc.h > @@ -110,4 +110,7 @@ outb_p((val),RTC_PORT(1)); \ > > #define RTC_IRQ 8 > > +unsigned int rtc_guest_read(unsigned int port); > +void rtc_guest_write(unsigned int port, unsigned int data); > + > #endif /* _ASM_MC146818RTC_H */ > -- > 2.26.2
On 05.06.2020 13:02, Roger Pau Monne wrote: > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > code paths (guest_io_{write/read}), but those accesses where never > implemented for PVHv2 dom0. This patch provides such mediated accesses > to the RTC for PVH dom0, just like it's provided for a classic PV > dom0. > > Pull out some of the RTC logic from guest_io_{read/write} into > specific helpers that can be used by both PV and HVM guests. The > setup of the handlers for PVH is done in rtc_init, which is already > used to initialize the fully emulated RTC. > > Without this a Linux PVH dom0 will read garbage when trying to access > the RTC, and one vCPU will be constantly looping in > rtc_timer_do_work. > > Note that such issue doesn't happen on domUs because the ACPI > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > accesses are not emulated but rather forwarded to the physical > hardware. > > No functional change expected for classic PV dom0. But there is, in whether (virtual) port 0x71 can be read/written (even by a DomU). I'm afraid of being called guilty in splitting hair, though. > @@ -808,10 +809,43 @@ void rtc_reset(struct domain *d) > s->pt.source = PTSRC_isa; > } > > +/* RTC mediator for HVM hardware domain. */ > +static int hw_rtc_io(int dir, unsigned int port, unsigned int size, > + uint32_t *val) > +{ > + if ( dir == IOREQ_READ ) > + *val = ~0; > + > + if ( size != 1 ) > + { > + gdprintk(XENLOG_WARNING, "bad RTC access size (%u)\n", size); > + return X86EMUL_OKAY; > + } > + if ( !ioports_access_permitted(current->domain, port, port) ) This wants to move into the helper, such that the PV side can have it moved too. > void rtc_init(struct domain *d) > { > RTCState *s = domain_vrtc(d); > > + if ( is_hardware_domain(d) ) > + { > + /* Hardware domain gets mediated access to the physical RTC. */ > + register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); > + return; Any reason for this explicit return, rather than ... > + } > + > if ( !has_vrtc(d) ) > return; ... making use of this one? In fact wouldn't it be more correct to have if ( !has_vrtc(d) ) { /* Hardware domain gets mediated access to the physical RTC. */ if ( is_hardware_domain(d) ) register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); return; } such that eventual (perhaps optional) enabling of vRTC for hwdom would have it properly work without changing this function again? > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -280,19 +280,10 @@ static uint32_t guest_io_read(unsigned int port, unsigned int bytes, > { > sub_data = pv_pit_handler(port, 0, 0); > } > - else if ( port == RTC_PORT(0) ) > - { > - sub_data = currd->arch.cmos_idx; Note how there was no permission check here. Having one or more I/O ports that can be used to simply latch a value can, as I've learned, be quite valuable as a debugging vehicle, and there aren't many (if any) ports beyond this one that a PV DomU might use for such a purpose. Arguably the value is somewhat limited here, as the value wouldn't survive a crash, but I'd still prefer if we could retain prior functionality. > @@ -1110,6 +1111,64 @@ static unsigned long get_cmos_time(void) > return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); > } > > +/* Helpers for guest accesses to the physical RTC. */ > +unsigned int rtc_guest_read(unsigned int port) > +{ > + const struct domain *currd = current->domain; > + unsigned long flags; > + unsigned int data = ~0; > + > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); Instead of this, how about ... > + if ( !ioports_access_permitted(currd, port, port) ) > + { > + ASSERT_UNREACHABLE(); > + return data; > + } > + > + switch ( port ) > + { > + case RTC_PORT(0): > + data = currd->arch.cmos_idx; > + break; > + > + case RTC_PORT(1): > + spin_lock_irqsave(&rtc_lock, flags); > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > + data = inb(RTC_PORT(1)); > + spin_unlock_irqrestore(&rtc_lock, flags); > + break; default: ASSERT_UNREACHABLE(); break; ? Jan
On Fri, Jun 05, 2020 at 04:44:32PM +0200, Jan Beulich wrote: > On 05.06.2020 13:02, Roger Pau Monne wrote: > > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > > code paths (guest_io_{write/read}), but those accesses where never > > implemented for PVHv2 dom0. This patch provides such mediated accesses > > to the RTC for PVH dom0, just like it's provided for a classic PV > > dom0. > > > > Pull out some of the RTC logic from guest_io_{read/write} into > > specific helpers that can be used by both PV and HVM guests. The > > setup of the handlers for PVH is done in rtc_init, which is already > > used to initialize the fully emulated RTC. > > > > Without this a Linux PVH dom0 will read garbage when trying to access > > the RTC, and one vCPU will be constantly looping in > > rtc_timer_do_work. > > > > Note that such issue doesn't happen on domUs because the ACPI > > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > > accesses are not emulated but rather forwarded to the physical > > hardware. > > > > No functional change expected for classic PV dom0. > > But there is, in whether (virtual) port 0x71 can be read/written (even > by a DomU). I'm afraid of being called guilty in splitting hair, though. Urg, OK, I realized that but considered it a harmless mistake. > > @@ -808,10 +809,43 @@ void rtc_reset(struct domain *d) > > s->pt.source = PTSRC_isa; > > } > > > > +/* RTC mediator for HVM hardware domain. */ > > +static int hw_rtc_io(int dir, unsigned int port, unsigned int size, > > + uint32_t *val) > > +{ > > + if ( dir == IOREQ_READ ) > > + *val = ~0; > > + > > + if ( size != 1 ) > > + { > > + gdprintk(XENLOG_WARNING, "bad RTC access size (%u)\n", size); > > + return X86EMUL_OKAY; > > + } > > + if ( !ioports_access_permitted(current->domain, port, port) ) > > This wants to move into the helper, such that the PV side can have > it moved too. > > > void rtc_init(struct domain *d) > > { > > RTCState *s = domain_vrtc(d); > > > > + if ( is_hardware_domain(d) ) > > + { > > + /* Hardware domain gets mediated access to the physical RTC. */ > > + register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); > > + return; > > Any reason for this explicit return, rather than ... > > > + } > > + > > if ( !has_vrtc(d) ) > > return; > > ... making use of this one? In fact wouldn't it be more correct > to have > > if ( !has_vrtc(d) ) > { > /* Hardware domain gets mediated access to the physical RTC. */ > if ( is_hardware_domain(d) ) > register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); > return; > } > > such that eventual (perhaps optional) enabling of vRTC for hwdom > would have it properly work without changing this function again? Right, that seems fine to me. > > --- a/xen/arch/x86/pv/emul-priv-op.c > > +++ b/xen/arch/x86/pv/emul-priv-op.c > > @@ -280,19 +280,10 @@ static uint32_t guest_io_read(unsigned int port, unsigned int bytes, > > { > > sub_data = pv_pit_handler(port, 0, 0); > > } > > - else if ( port == RTC_PORT(0) ) > > - { > > - sub_data = currd->arch.cmos_idx; > > Note how there was no permission check here. Having one or more > I/O ports that can be used to simply latch a value can, as I've > learned, be quite valuable as a debugging vehicle, and there > aren't many (if any) ports beyond this one that a PV DomU might > use for such a purpose. Arguably the value is somewhat limited > here, as the value wouldn't survive a crash, but I'd still > prefer if we could retain prior functionality. OK, as said above I considered this a harmless mistake, but seeing as you find it valuable I will make sure to keep the behavior. > > @@ -1110,6 +1111,64 @@ static unsigned long get_cmos_time(void) > > return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); > > } > > > > +/* Helpers for guest accesses to the physical RTC. */ > > +unsigned int rtc_guest_read(unsigned int port) > > +{ > > + const struct domain *currd = current->domain; > > + unsigned long flags; > > + unsigned int data = ~0; > > + > > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); > > Instead of this, how about ... > > > + if ( !ioports_access_permitted(currd, port, port) ) > > + { > > + ASSERT_UNREACHABLE(); > > + return data; > > + } > > + > > + switch ( port ) > > + { > > + case RTC_PORT(0): > > + data = currd->arch.cmos_idx; > > + break; > > + > > + case RTC_PORT(1): > > + spin_lock_irqsave(&rtc_lock, flags); > > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > > + data = inb(RTC_PORT(1)); > > + spin_unlock_irqrestore(&rtc_lock, flags); > > + break; > > default: > ASSERT_UNREACHABLE(); > break; > > ? Sure. Thanks, Roger.
On Fri, Jun 5, 2020 at 4:03 AM Roger Pau Monne <roger.pau@citrix.com> wrote: > > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > code paths (guest_io_{write/read}), but those accesses where never > implemented for PVHv2 dom0. This patch provides such mediated accesses > to the RTC for PVH dom0, just like it's provided for a classic PV > dom0. > > Pull out some of the RTC logic from guest_io_{read/write} into > specific helpers that can be used by both PV and HVM guests. The > setup of the handlers for PVH is done in rtc_init, which is already > used to initialize the fully emulated RTC. > > Without this a Linux PVH dom0 will read garbage when trying to access > the RTC, and one vCPU will be constantly looping in > rtc_timer_do_work. > > Note that such issue doesn't happen on domUs because the ACPI > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > accesses are not emulated but rather forwarded to the physical > hardware. > > No functional change expected for classic PV dom0. For the dense guys like me: what is the user-visible feature that is now being offered with this? Would really appreciate a pointer or two. Thanks, Roman. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > for-4.14 reasoning: the fix is mostly isolated to PVH dom0, and as > such the risk is very low of causing issues to other guests types, but > without this fix one vCPU when using a Linux dom0 will be constantly > looping over rtc_timer_do_work with 100% CPU usage, at least when > using Linux 4.19 or newer. > --- > Changes since v1: > - Share the code with PV. > - Add access checks to the IO ports. > --- > xen/arch/x86/hvm/rtc.c | 34 ++++++++++++++++++ > xen/arch/x86/pv/emul-priv-op.c | 28 +++------------ > xen/arch/x86/time.c | 59 +++++++++++++++++++++++++++++++ > xen/include/asm-x86/mc146818rtc.h | 3 ++ > 4 files changed, 100 insertions(+), 24 deletions(-) > > diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c > index 5bbbdc0e0f..9f304a0fb4 100644 > --- a/xen/arch/x86/hvm/rtc.c > +++ b/xen/arch/x86/hvm/rtc.c > @@ -22,6 +22,7 @@ > * IN THE SOFTWARE. > */ > > +#include <asm/iocap.h> > #include <asm/mc146818rtc.h> > #include <asm/hvm/vpt.h> > #include <asm/hvm/io.h> > @@ -808,10 +809,43 @@ void rtc_reset(struct domain *d) > s->pt.source = PTSRC_isa; > } > > +/* RTC mediator for HVM hardware domain. */ > +static int hw_rtc_io(int dir, unsigned int port, unsigned int size, > + uint32_t *val) > +{ > + if ( dir == IOREQ_READ ) > + *val = ~0; > + > + if ( size != 1 ) > + { > + gdprintk(XENLOG_WARNING, "bad RTC access size (%u)\n", size); > + return X86EMUL_OKAY; > + } > + if ( !ioports_access_permitted(current->domain, port, port) ) > + { > + gdprintk(XENLOG_WARNING, "access to RTC port %x not allowed\n", port); > + return X86EMUL_OKAY; > + } > + > + if ( dir == IOREQ_WRITE ) > + rtc_guest_write(port, *val); > + else > + *val = rtc_guest_read(port); > + > + return X86EMUL_OKAY; > +} > + > void rtc_init(struct domain *d) > { > RTCState *s = domain_vrtc(d); > > + if ( is_hardware_domain(d) ) > + { > + /* Hardware domain gets mediated access to the physical RTC. */ > + register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); > + return; > + } > + > if ( !has_vrtc(d) ) > return; > > diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c > index fad6c754ad..1597f27ed9 100644 > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -280,19 +280,10 @@ static uint32_t guest_io_read(unsigned int port, unsigned int bytes, > { > sub_data = pv_pit_handler(port, 0, 0); > } > - else if ( port == RTC_PORT(0) ) > - { > - sub_data = currd->arch.cmos_idx; > - } > - else if ( (port == RTC_PORT(1)) && > + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && > ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) > { > - unsigned long flags; > - > - spin_lock_irqsave(&rtc_lock, flags); > - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > - sub_data = inb(RTC_PORT(1)); > - spin_unlock_irqrestore(&rtc_lock, flags); > + sub_data = rtc_guest_read(port); > } > else if ( (port == 0xcf8) && (bytes == 4) ) > { > @@ -413,21 +404,10 @@ static void guest_io_write(unsigned int port, unsigned int bytes, > { > pv_pit_handler(port, (uint8_t)data, 1); > } > - else if ( port == RTC_PORT(0) ) > - { > - currd->arch.cmos_idx = data; > - } > - else if ( (port == RTC_PORT(1)) && > + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && > ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) > { > - unsigned long flags; > - > - if ( pv_rtc_handler ) > - pv_rtc_handler(currd->arch.cmos_idx & 0x7f, data); > - spin_lock_irqsave(&rtc_lock, flags); > - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > - outb(data, RTC_PORT(1)); > - spin_unlock_irqrestore(&rtc_lock, flags); > + rtc_guest_write(port, data); > } > else if ( (port == 0xcf8) && (bytes == 4) ) > { > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index bbaea3aa65..e1c81067ce 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -27,6 +27,7 @@ > #include <xen/keyhandler.h> > #include <xen/guest_access.h> > #include <asm/io.h> > +#include <asm/iocap.h> > #include <asm/msr.h> > #include <asm/mpspec.h> > #include <asm/processor.h> > @@ -1110,6 +1111,64 @@ static unsigned long get_cmos_time(void) > return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); > } > > +/* Helpers for guest accesses to the physical RTC. */ > +unsigned int rtc_guest_read(unsigned int port) > +{ > + const struct domain *currd = current->domain; > + unsigned long flags; > + unsigned int data = ~0; > + > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); > + if ( !ioports_access_permitted(currd, port, port) ) > + { > + ASSERT_UNREACHABLE(); > + return data; > + } > + > + switch ( port ) > + { > + case RTC_PORT(0): > + data = currd->arch.cmos_idx; > + break; > + > + case RTC_PORT(1): > + spin_lock_irqsave(&rtc_lock, flags); > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > + data = inb(RTC_PORT(1)); > + spin_unlock_irqrestore(&rtc_lock, flags); > + break; > + } > + > + return data; > +} > + > +void rtc_guest_write(unsigned int port, unsigned int data) > +{ > + struct domain *currd = current->domain; > + unsigned long flags; > + > + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); > + if ( !ioports_access_permitted(currd, port, port) ) > + { > + ASSERT_UNREACHABLE(); > + return; > + } > + > + switch ( port ) > + { > + case RTC_PORT(0): > + currd->arch.cmos_idx = data; > + break; > + > + case RTC_PORT(1): > + spin_lock_irqsave(&rtc_lock, flags); > + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); > + outb(data, RTC_PORT(1)); > + spin_unlock_irqrestore(&rtc_lock, flags); > + break; > + } > +} > + > static unsigned long get_wallclock_time(void) > { > #ifdef CONFIG_XEN_GUEST > diff --git a/xen/include/asm-x86/mc146818rtc.h b/xen/include/asm-x86/mc146818rtc.h > index 8758528f7c..803b236c0a 100644 > --- a/xen/include/asm-x86/mc146818rtc.h > +++ b/xen/include/asm-x86/mc146818rtc.h > @@ -110,4 +110,7 @@ outb_p((val),RTC_PORT(1)); \ > > #define RTC_IRQ 8 > > +unsigned int rtc_guest_read(unsigned int port); > +void rtc_guest_write(unsigned int port, unsigned int data); > + > #endif /* _ASM_MC146818RTC_H */ > -- > 2.26.2 > >
On 06.06.2020 01:43, Roman Shaposhnik wrote: > On Fri, Jun 5, 2020 at 4:03 AM Roger Pau Monne <roger.pau@citrix.com> wrote: >> >> Mediated access to the RTC was provided for PVHv1 dom0 using the PV >> code paths (guest_io_{write/read}), but those accesses where never >> implemented for PVHv2 dom0. This patch provides such mediated accesses >> to the RTC for PVH dom0, just like it's provided for a classic PV >> dom0. >> >> Pull out some of the RTC logic from guest_io_{read/write} into >> specific helpers that can be used by both PV and HVM guests. The >> setup of the handlers for PVH is done in rtc_init, which is already >> used to initialize the fully emulated RTC. >> >> Without this a Linux PVH dom0 will read garbage when trying to access >> the RTC, and one vCPU will be constantly looping in >> rtc_timer_do_work. >> >> Note that such issue doesn't happen on domUs because the ACPI >> NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing >> the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the >> accesses are not emulated but rather forwarded to the physical >> hardware. >> >> No functional change expected for classic PV dom0. > > For the dense guys like me: what is the user-visible feature that is now being > offered with this? Would really appreciate a pointer or two. PV Dom0 has always been permitted direct access to the hardware RTC. This change makes PVH Dom0 follow suit. Jan
On Fri, Jun 05, 2020 at 04:43:12PM -0700, Roman Shaposhnik wrote: > On Fri, Jun 5, 2020 at 4:03 AM Roger Pau Monne <roger.pau@citrix.com> wrote: > > > > Mediated access to the RTC was provided for PVHv1 dom0 using the PV > > code paths (guest_io_{write/read}), but those accesses where never > > implemented for PVHv2 dom0. This patch provides such mediated accesses > > to the RTC for PVH dom0, just like it's provided for a classic PV > > dom0. > > > > Pull out some of the RTC logic from guest_io_{read/write} into > > specific helpers that can be used by both PV and HVM guests. The > > setup of the handlers for PVH is done in rtc_init, which is already > > used to initialize the fully emulated RTC. > > > > Without this a Linux PVH dom0 will read garbage when trying to access > > the RTC, and one vCPU will be constantly looping in > > rtc_timer_do_work. > > > > Note that such issue doesn't happen on domUs because the ACPI > > NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing > > the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the > > accesses are not emulated but rather forwarded to the physical > > hardware. > > > > No functional change expected for classic PV dom0. > > For the dense guys like me: what is the user-visible feature that is now being > offered with this? Would really appreciate a pointer or two. Without this dom0 is not able to change the date. Note that XENPF_settime{32/64} doesn't write the changes to the RTC (at least on x86), so dom0 needs to write such changes to the RTC so they can survive a poweroff. However dom0 cannot be given direct access to the registers, since the RTC uses an indirect access interface using two IO registers, so Xen needs to trap such accesses by dom0 in order to serialize them and prevent conflicts with Xen accesses to the RTC. Roger.
diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c index 5bbbdc0e0f..9f304a0fb4 100644 --- a/xen/arch/x86/hvm/rtc.c +++ b/xen/arch/x86/hvm/rtc.c @@ -22,6 +22,7 @@ * IN THE SOFTWARE. */ +#include <asm/iocap.h> #include <asm/mc146818rtc.h> #include <asm/hvm/vpt.h> #include <asm/hvm/io.h> @@ -808,10 +809,43 @@ void rtc_reset(struct domain *d) s->pt.source = PTSRC_isa; } +/* RTC mediator for HVM hardware domain. */ +static int hw_rtc_io(int dir, unsigned int port, unsigned int size, + uint32_t *val) +{ + if ( dir == IOREQ_READ ) + *val = ~0; + + if ( size != 1 ) + { + gdprintk(XENLOG_WARNING, "bad RTC access size (%u)\n", size); + return X86EMUL_OKAY; + } + if ( !ioports_access_permitted(current->domain, port, port) ) + { + gdprintk(XENLOG_WARNING, "access to RTC port %x not allowed\n", port); + return X86EMUL_OKAY; + } + + if ( dir == IOREQ_WRITE ) + rtc_guest_write(port, *val); + else + *val = rtc_guest_read(port); + + return X86EMUL_OKAY; +} + void rtc_init(struct domain *d) { RTCState *s = domain_vrtc(d); + if ( is_hardware_domain(d) ) + { + /* Hardware domain gets mediated access to the physical RTC. */ + register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io); + return; + } + if ( !has_vrtc(d) ) return; diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c index fad6c754ad..1597f27ed9 100644 --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -280,19 +280,10 @@ static uint32_t guest_io_read(unsigned int port, unsigned int bytes, { sub_data = pv_pit_handler(port, 0, 0); } - else if ( port == RTC_PORT(0) ) - { - sub_data = currd->arch.cmos_idx; - } - else if ( (port == RTC_PORT(1)) && + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) { - unsigned long flags; - - spin_lock_irqsave(&rtc_lock, flags); - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); - sub_data = inb(RTC_PORT(1)); - spin_unlock_irqrestore(&rtc_lock, flags); + sub_data = rtc_guest_read(port); } else if ( (port == 0xcf8) && (bytes == 4) ) { @@ -413,21 +404,10 @@ static void guest_io_write(unsigned int port, unsigned int bytes, { pv_pit_handler(port, (uint8_t)data, 1); } - else if ( port == RTC_PORT(0) ) - { - currd->arch.cmos_idx = data; - } - else if ( (port == RTC_PORT(1)) && + else if ( (port == RTC_PORT(0) || port == RTC_PORT(1)) && ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) ) { - unsigned long flags; - - if ( pv_rtc_handler ) - pv_rtc_handler(currd->arch.cmos_idx & 0x7f, data); - spin_lock_irqsave(&rtc_lock, flags); - outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); - outb(data, RTC_PORT(1)); - spin_unlock_irqrestore(&rtc_lock, flags); + rtc_guest_write(port, data); } else if ( (port == 0xcf8) && (bytes == 4) ) { diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index bbaea3aa65..e1c81067ce 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -27,6 +27,7 @@ #include <xen/keyhandler.h> #include <xen/guest_access.h> #include <asm/io.h> +#include <asm/iocap.h> #include <asm/msr.h> #include <asm/mpspec.h> #include <asm/processor.h> @@ -1110,6 +1111,64 @@ static unsigned long get_cmos_time(void) return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); } +/* Helpers for guest accesses to the physical RTC. */ +unsigned int rtc_guest_read(unsigned int port) +{ + const struct domain *currd = current->domain; + unsigned long flags; + unsigned int data = ~0; + + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); + if ( !ioports_access_permitted(currd, port, port) ) + { + ASSERT_UNREACHABLE(); + return data; + } + + switch ( port ) + { + case RTC_PORT(0): + data = currd->arch.cmos_idx; + break; + + case RTC_PORT(1): + spin_lock_irqsave(&rtc_lock, flags); + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); + data = inb(RTC_PORT(1)); + spin_unlock_irqrestore(&rtc_lock, flags); + break; + } + + return data; +} + +void rtc_guest_write(unsigned int port, unsigned int data) +{ + struct domain *currd = current->domain; + unsigned long flags; + + ASSERT(port == RTC_PORT(0) || port == RTC_PORT(1)); + if ( !ioports_access_permitted(currd, port, port) ) + { + ASSERT_UNREACHABLE(); + return; + } + + switch ( port ) + { + case RTC_PORT(0): + currd->arch.cmos_idx = data; + break; + + case RTC_PORT(1): + spin_lock_irqsave(&rtc_lock, flags); + outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0)); + outb(data, RTC_PORT(1)); + spin_unlock_irqrestore(&rtc_lock, flags); + break; + } +} + static unsigned long get_wallclock_time(void) { #ifdef CONFIG_XEN_GUEST diff --git a/xen/include/asm-x86/mc146818rtc.h b/xen/include/asm-x86/mc146818rtc.h index 8758528f7c..803b236c0a 100644 --- a/xen/include/asm-x86/mc146818rtc.h +++ b/xen/include/asm-x86/mc146818rtc.h @@ -110,4 +110,7 @@ outb_p((val),RTC_PORT(1)); \ #define RTC_IRQ 8 +unsigned int rtc_guest_read(unsigned int port); +void rtc_guest_write(unsigned int port, unsigned int data); + #endif /* _ASM_MC146818RTC_H */
Mediated access to the RTC was provided for PVHv1 dom0 using the PV code paths (guest_io_{write/read}), but those accesses where never implemented for PVHv2 dom0. This patch provides such mediated accesses to the RTC for PVH dom0, just like it's provided for a classic PV dom0. Pull out some of the RTC logic from guest_io_{read/write} into specific helpers that can be used by both PV and HVM guests. The setup of the handlers for PVH is done in rtc_init, which is already used to initialize the fully emulated RTC. Without this a Linux PVH dom0 will read garbage when trying to access the RTC, and one vCPU will be constantly looping in rtc_timer_do_work. Note that such issue doesn't happen on domUs because the ACPI NO_CMOS_RTC flag is set in FADT, which prevents the OS from accessing the RTC. Also the X86_EMU_RTC flag is not set for PVH dom0, as the accesses are not emulated but rather forwarded to the physical hardware. No functional change expected for classic PV dom0. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- for-4.14 reasoning: the fix is mostly isolated to PVH dom0, and as such the risk is very low of causing issues to other guests types, but without this fix one vCPU when using a Linux dom0 will be constantly looping over rtc_timer_do_work with 100% CPU usage, at least when using Linux 4.19 or newer. --- Changes since v1: - Share the code with PV. - Add access checks to the IO ports. --- xen/arch/x86/hvm/rtc.c | 34 ++++++++++++++++++ xen/arch/x86/pv/emul-priv-op.c | 28 +++------------ xen/arch/x86/time.c | 59 +++++++++++++++++++++++++++++++ xen/include/asm-x86/mc146818rtc.h | 3 ++ 4 files changed, 100 insertions(+), 24 deletions(-)