diff mbox series

[v4] x86: detect CMOS aliasing on ports other than 0x70/0x71

Message ID 367d0223-d481-b7a3-3b79-a9bc6f00531e@suse.com (mailing list archive)
State Superseded
Headers show
Series [v4] x86: detect CMOS aliasing on ports other than 0x70/0x71 | expand

Commit Message

Jan Beulich March 20, 2023, 8:32 a.m. UTC
... in order to also intercept Dom0 accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Also conditionally mask top bit for guest index port accesses. Add
    missing adjustments to rtc_init(). Re-work to avoid recursive
    read_lock(). Also adjust guest_io_{read,write}(). Re-base.
v3: Re-base over change to earlier patch.
v2: Re-base.

Comments

Roger Pau Monné March 21, 2023, 2:12 p.m. UTC | #1
On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.

I'm trying to get some documentation about this aliasing, but so far I
haven't been able to find any.  Do you have any references of where I
might be able to find it?

> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC.

Could this create any concerns with the ability to disable NMIs if we
no longer filter accesses to the RTC?

Thanks, Roger.
Jan Beulich March 22, 2023, 9:55 a.m. UTC | #2
On 21.03.2023 15:12, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> I'm trying to get some documentation about this aliasing, but so far I
> haven't been able to find any.  Do you have any references of where I
> might be able to find it?

I think several ICH datasheet documents mention this. Right now I'm
looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
Address Map" under "Real Time Clock Registers").

But such aliasing (really: lack of decoding) has been present on
various of the low 1024 ports from the very early days of x86. So we
may want to take care of such elsewhere as well, e.g. for the PIC
(where aforementioned doc also explicitly mentions the aliases).

>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC.
> 
> Could this create any concerns with the ability to disable NMIs if we
> no longer filter accesses to the RTC?

Hmm, that's a valid concern, but I'm not sure in how far we need to
be worried about giving Dom0 this level of control. As long as we
don't use it ourselves of course (I'm unaware of us using this
anywhere). If we're worried, we could continue to intercept port
0x70 alone, just to mask off the top bit for writes.

Jan
Roger Pau Monné March 23, 2023, 12:29 p.m. UTC | #3
On Wed, Mar 22, 2023 at 10:55:42AM +0100, Jan Beulich wrote:
> On 21.03.2023 15:12, Roger Pau Monné wrote:
> > On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> >> ... in order to also intercept Dom0 accesses through the alias ports.
> > 
> > I'm trying to get some documentation about this aliasing, but so far I
> > haven't been able to find any.  Do you have any references of where I
> > might be able to find it?
> 
> I think several ICH datasheet documents mention this. Right now I'm
> looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
> Address Map" under "Real Time Clock Registers").

Thanks, I had to fetch this from elsewhere as I haven't been able to
find it on the Intel documentation site, maybe it's too old?

> But such aliasing (really: lack of decoding) has been present on
> various of the low 1024 ports from the very early days of x86. So we
> may want to take care of such elsewhere as well, e.g. for the PIC
> (where aforementioned doc also explicitly mentions the aliases).

I wonder how relevant those aliases are for OSes, do we know of any OS
that uses them?

For example we don't seem to provide them to HVM guests at all, and we
seem to get away with it.

> >> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> >> use the CMOS RTC.
> > 
> > Could this create any concerns with the ability to disable NMIs if we
> > no longer filter accesses to the RTC?
> 
> Hmm, that's a valid concern, but I'm not sure in how far we need to
> be worried about giving Dom0 this level of control. As long as we
> don't use it ourselves of course (I'm unaware of us using this
> anywhere). If we're worried, we could continue to intercept port
> 0x70 alone, just to mask off the top bit for writes.

I would be mostly worried about dom0 disabling NMI and thus causing
the Xen watchdog to trigger for example.  I don't think we should
allow dom0 to disable NMIs at all.

Thanks, Roger.
Jan Beulich March 23, 2023, 2:26 p.m. UTC | #4
On 23.03.2023 13:29, Roger Pau Monné wrote:
> On Wed, Mar 22, 2023 at 10:55:42AM +0100, Jan Beulich wrote:
>> On 21.03.2023 15:12, Roger Pau Monné wrote:
>>> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>>>> ... in order to also intercept Dom0 accesses through the alias ports.
>>>
>>> I'm trying to get some documentation about this aliasing, but so far I
>>> haven't been able to find any.  Do you have any references of where I
>>> might be able to find it?
>>
>> I think several ICH datasheet documents mention this. Right now I'm
>> looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
>> Address Map" under "Real Time Clock Registers").
> 
> Thanks, I had to fetch this from elsewhere as I haven't been able to
> find it on the Intel documentation site, maybe it's too old?
> 
>> But such aliasing (really: lack of decoding) has been present on
>> various of the low 1024 ports from the very early days of x86. So we
>> may want to take care of such elsewhere as well, e.g. for the PIC
>> (where aforementioned doc also explicitly mentions the aliases).
> 
> I wonder how relevant those aliases are for OSes, do we know of any OS
> that uses them?
> 
> For example we don't seem to provide them to HVM guests at all, and we
> seem to get away with it.

There are two aspects here: One is the functionality that becomes available
specifically via using the aliases here (and I'm not 100% certain this isn't
chipset dependent in the first place), allowing access to the full 256 bytes
of CMOS storage (i.e. no parts clipped off for the RTC registers). The other
aspect is simply disallowing access to ports we mean Dom0 to not have access
to. That would be the sole purpose e.g. for the PIC port ranges. Otherwise
there's little point disallowing access to the base ranges, imo.

>>>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>>>> use the CMOS RTC.
>>>
>>> Could this create any concerns with the ability to disable NMIs if we
>>> no longer filter accesses to the RTC?
>>
>> Hmm, that's a valid concern, but I'm not sure in how far we need to
>> be worried about giving Dom0 this level of control. As long as we
>> don't use it ourselves of course (I'm unaware of us using this
>> anywhere). If we're worried, we could continue to intercept port
>> 0x70 alone, just to mask off the top bit for writes.
> 
> I would be mostly worried about dom0 disabling NMI and thus causing
> the Xen watchdog to trigger for example.  I don't think we should
> allow dom0 to disable NMIs at all.

I'll see what I can do, preferably without keeping the intercepts fully
engaged.

Jan
Roger Pau Monné March 23, 2023, 2:49 p.m. UTC | #5
On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v4: Also conditionally mask top bit for guest index port accesses. Add
>     missing adjustments to rtc_init(). Re-work to avoid recursive
>     read_lock(). Also adjust guest_io_{read,write}(). Re-base.
> v3: Re-base over change to earlier patch.
> v2: Re-base.
> 
> --- a/xen/arch/x86/hvm/rtc.c
> +++ b/xen/arch/x86/hvm/rtc.c
> @@ -27,7 +27,7 @@
>  #include <asm/hvm/vpt.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/save.h>
> -#include <asm/current.h>
> +#include <asm/iocap.h>
>  #include <xen/trace.h>
>  #include <public/hvm/params.h>
>  
> @@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
>  
>      if ( !has_vrtc(d) )
>      {
> -        if ( is_hardware_domain(d) )
> -            /* Hardware domain gets mediated access to the physical RTC. */
> -            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
> -        return;
> +        unsigned int port;
> +
> +        if ( !is_hardware_domain(d) )
> +            return;
> +
> +        /*
> +         * Hardware domain gets mediated access to the physical RTC/CMOS
> +         * (of course unless we don't use it ourselves).
> +         */
> +        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
> +            if ( is_cmos_port(port, 2, d) )
> +                register_portio_handler(d, port, 2, hw_rtc_io);
>      }
>  
>      spin_lock_init(&s->lock);
> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> @@ -9,6 +9,10 @@
>  
>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>  
> +struct domain;
> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> +                  const struct domain *d);

We seem to usually name this rtc rather than cmos, any reason to use
cmos for the helper naming rather than rtc?

If not I would rather use is_rtc_port(), so that we can keep it in
sync with the existing RTC_PORT() macros and the handler names
rtc_guest_{read,write}, hw_rtc_io.

> +
>  /**********************************************************************
>   * register summary
>   **********************************************************************/
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
>          return false;
>  
>      /* We also never permit direct access to the RTC/CMOS registers. */
> -    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
> +    if ( is_cmos_port(port, bytes, d) )
>          return false;
>  
>      return ioports_access_permitted(d, port, port + bytes - 1);
> @@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
>          {
>              sub_data = pv_pit_handler(port, 0, 0);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              sub_data = rtc_guest_read(port);
>          }
> @@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
>          {
>              pv_pit_handler(port, (uint8_t)data, 1);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              rtc_guest_write(port, data);
>          }
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>  static int __hwdom_init cf_check io_bitmap_cb(
>      unsigned long s, unsigned long e, void *ctx)
>  {
> -    struct domain *d = ctx;
> +    const struct domain *d = ctx;
>      unsigned int i;
>  
>      ASSERT(e <= INT_MAX);
>      for ( i = s; i <= e; i++ )
> -        __clear_bit(i, d->arch.hvm.io_bitmap);
> +        /*
> +         * Accesses to RTC ports also need to be trapped in order to keep
> +         * consistency with PV.
> +         */

More than to keep consistency with PV, don't we need to trap accesses
to that concurrent accesses between dom0 and Xen (when also using the
device) don't overlap, as the RTC/CMOS space uses indirect indexing.

And likely to avoid dom0 from disabling NMIs.

I see that you copied the existing comment, but not sure it's fully
accurate?

> +        if ( !is_cmos_port(i, 1, d) )
> +            __clear_bit(i, d->arch.hvm.io_bitmap);
>  
>      return 0;
>  }
>  
>  void __hwdom_init setup_io_bitmap(struct domain *d)
>  {
> -    int rc;
> +    if ( !is_hvm_domain(d) )
> +        return;
>  
> -    if ( is_hvm_domain(d) )
> -    {
> -        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> -        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> -                                    io_bitmap_cb, d);
> -        BUG_ON(rc);
> -        /*
> -         * NB: we need to trap accesses to 0xcf8 in order to intercept
> -         * 4 byte accesses, that need to be handled by Xen in order to
> -         * keep consistency.
> -         * Access to 1 byte RTC ports also needs to be trapped in order
> -         * to keep consistency with PV.
> -         */
> -        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
> -    }
> +    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> +    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> +                                io_bitmap_cb, d) )
> +        BUG();
> +
> +    /*
> +     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
> +     * guest_io_read(), and guest_io_write()).
> +     */
> +    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
>  }
>  
>  /*
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
>          if ( seconds < 60 )
>          {
>              if ( rtc.sec != seconds )
> +            {
>                  cmos_rtc_probe = false;
> +                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
> +            }
>              break;
>          }
>  
> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>  }
>  
> +static unsigned int __ro_after_init cmos_alias_mask;
> +
> +static int __init cf_check probe_cmos_alias(void)
> +{
> +    unsigned int i, offs;
> +
> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
> +        return 0;
> +
> +    for ( offs = 2; offs < 8; offs <<= 1 )
> +    {
> +        bool read = true;

You can limit the scope of i to the inner for loop.

> +
> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
> +        {
> +            uint8_t normal, alt;
> +            unsigned long flags;
> +
> +            if ( i == acpi_gbl_FADT.century )
> +                continue;
> +
> +            spin_lock_irqsave(&rtc_lock, flags);
> +
> +            normal = CMOS_READ(i);
> +            if ( inb(RTC_PORT(offs)) != i )
> +                read = false;
> +
> +            alt = inb(RTC_PORT(offs + 1));
> +
> +            spin_unlock_irqrestore(&rtc_lock, flags);
> +
> +            if ( normal != alt )
> +                break;
> +
> +            process_pending_softirqs();

You adding a call to process pending softirqs for every loop
iteration makes me wonder how long is each of those accesses expected
to take, since we could be performing a lot of them (0x80 * 3).

I don't think so, but there can not be any side effects from reading
from the CMOS RAM I would assume, even for cases where the CMOS ports
are not aliases?

I would assume ports to be either aliased to the CMOS, or otherwise
reserved.  What makes me wonder if it wouldn't be simpler to just
passthough accesses to all the possible CMOS alias ports.

> +        }
> +        if ( i == 0x80 )
> +        {
> +            cmos_alias_mask |= offs;
> +            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
> +                   RTC_PORT(offs), read ? "r/w" : "w/o");
> +        }
> +    }
> +
> +    return 0;
> +}
> +__initcall(probe_cmos_alias);
> +
> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> +{
> +    if ( !is_hardware_domain(d) )
> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> +
> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
> +    {
> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
> +
> +        while ( cmos_alias_mask & nr )
> +            nr <<= 1;
> +        for ( step = nr << 1;
> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
> +            step <<= 1;
> +        do {
> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
> +                 port <= cmos + 1 && port + bytes > cmos )
> +                return true;
> +            cmos += step;
> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );

I would use a for loop similar to the one used in probe_cmos_alias()
to check for aliased accesses?

if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
    return true;

for ( offs = 2; offs < 8; offs <<= 1 )
{
    if ( !(offs & cmos_alias_mask) )
        continue;
    if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
        return true;
}

return false;

So that you can also optimize for the more common case RTC_PORT(0) and
RTC_PORT(1) are used?

Or there's something I'm missing?

> +    }
> +
> +    return false;
> +}
> +
>  /* Helpers for guest accesses to the physical RTC. */
>  unsigned int rtc_guest_read(unsigned int port)
>  {
> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>      unsigned long flags;
>      unsigned int data = ~0;
>  
> -    switch ( port )
> +    switch ( port & ~cmos_alias_mask )
>      {
>      case RTC_PORT(0):
>          /*
> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>           * of the first RTC port, as there's no access to the physical IO
>           * ports.
>           */
> -        data = currd->arch.cmos_idx;
> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));

We do allow read access to alias ports even when the underling
hardware does do so, which I think is fine, but might be worth a
comment (since we already detect whether the RTC_PORT(0) alias is also
readable.

Thanks, Roger.
Jan Beulich March 23, 2023, 4:08 p.m. UTC | #6
On 23.03.2023 15:49, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> --- a/xen/arch/x86/include/asm/mc146818rtc.h
>> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
>> @@ -9,6 +9,10 @@
>>  
>>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>>  
>> +struct domain;
>> +bool is_cmos_port(unsigned int port, unsigned int bytes,
>> +                  const struct domain *d);
> 
> We seem to usually name this rtc rather than cmos, any reason to use
> cmos for the helper naming rather than rtc?
> 
> If not I would rather use is_rtc_port(), so that we can keep it in
> sync with the existing RTC_PORT() macros and the handler names
> rtc_guest_{read,write}, hw_rtc_io.

Already when talking about just ports 70 and 71 there's more CMOS
behind them than RTC. With extended CMOS accesses the ratio further
shifts. So I view using "rtc" here simply as increasingly
inappropriate.

>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>>  static int __hwdom_init cf_check io_bitmap_cb(
>>      unsigned long s, unsigned long e, void *ctx)
>>  {
>> -    struct domain *d = ctx;
>> +    const struct domain *d = ctx;
>>      unsigned int i;
>>  
>>      ASSERT(e <= INT_MAX);
>>      for ( i = s; i <= e; i++ )
>> -        __clear_bit(i, d->arch.hvm.io_bitmap);
>> +        /*
>> +         * Accesses to RTC ports also need to be trapped in order to keep
>> +         * consistency with PV.
>> +         */
> 
> More than to keep consistency with PV, don't we need to trap accesses
> to that concurrent accesses between dom0 and Xen (when also using the
> device) don't overlap, as the RTC/CMOS space uses indirect indexing.

That's what I read "consistency" to mean.

> And likely to avoid dom0 from disabling NMIs.

I can add that to the comment, but as you say ...

> I see that you copied the existing comment, but not sure it's fully
> accurate?

... I've merely moved it.

>> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>>  }
>>  
>> +static unsigned int __ro_after_init cmos_alias_mask;
>> +
>> +static int __init cf_check probe_cmos_alias(void)
>> +{
>> +    unsigned int i, offs;
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return 0;
>> +
>> +    for ( offs = 2; offs < 8; offs <<= 1 )
>> +    {
>> +        bool read = true;
> 
> You can limit the scope of i to the inner for loop.

Sure. It has been long ago when I wrote this, so I guess I was after it
being one line less of code.

>> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
>> +        {
>> +            uint8_t normal, alt;
>> +            unsigned long flags;
>> +
>> +            if ( i == acpi_gbl_FADT.century )
>> +                continue;
>> +
>> +            spin_lock_irqsave(&rtc_lock, flags);
>> +
>> +            normal = CMOS_READ(i);
>> +            if ( inb(RTC_PORT(offs)) != i )
>> +                read = false;
>> +
>> +            alt = inb(RTC_PORT(offs + 1));
>> +
>> +            spin_unlock_irqrestore(&rtc_lock, flags);
>> +
>> +            if ( normal != alt )
>> +                break;
>> +
>> +            process_pending_softirqs();
> 
> You adding a call to process pending softirqs for every loop
> iteration makes me wonder how long is each of those accesses expected
> to take, since we could be performing a lot of them (0x80 * 3).

It seemed best to me to keep things simple here, at the expense at a
few too many calls.

> I don't think so, but there can not be any side effects from reading
> from the CMOS RAM I would assume, even for cases where the CMOS ports
> are not aliases?

Well, one of the fundamental assumptions is that these read attempts
won't have side effects. Without that assumption we simply can't do
such probing.

> I would assume ports to be either aliased to the CMOS, or otherwise
> reserved.  What makes me wonder if it wouldn't be simpler to just
> passthough accesses to all the possible CMOS alias ports.

But we need to intercept writes to 70 to track the index. IOW we
cannot simply pass through all of them, and we also cannot simply
intercept them all and treat them all the same.

>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>> +{
>> +    if ( !is_hardware_domain(d) )
>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>> +
>> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
>> +    {
>> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
>> +
>> +        while ( cmos_alias_mask & nr )
>> +            nr <<= 1;
>> +        for ( step = nr << 1;
>> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
>> +            step <<= 1;
>> +        do {
>> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
>> +                 port <= cmos + 1 && port + bytes > cmos )
>> +                return true;
>> +            cmos += step;
>> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
> 
> I would use a for loop similar to the one used in probe_cmos_alias()
> to check for aliased accesses?
> 
> if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
>     return true;
> 
> for ( offs = 2; offs < 8; offs <<= 1 )
> {
>     if ( !(offs & cmos_alias_mask) )
>         continue;
>     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
>         return true;
> }
> 
> return false;
> 
> So that you can also optimize for the more common case RTC_PORT(0) and
> RTC_PORT(1) are used?
> 
> Or there's something I'm missing?

I'll have to check carefully, but to be honest I would prefer to not
touch this code again unless there's clearly something wrong with it.

>> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>>      unsigned long flags;
>>      unsigned int data = ~0;
>>  
>> -    switch ( port )
>> +    switch ( port & ~cmos_alias_mask )
>>      {
>>      case RTC_PORT(0):
>>          /*
>> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>>           * of the first RTC port, as there's no access to the physical IO
>>           * ports.
>>           */
>> -        data = currd->arch.cmos_idx;
>> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> 
> We do allow read access to alias ports even when the underling
> hardware does do so,

I'm afraid I don't understand this, so ...

> which I think is fine, but might be worth a
> comment (since we already detect whether the RTC_PORT(0) alias is also
> readable.

... I can't really derive what kind of information you're after to put
in a comment.

Jan
Roger Pau Monné March 23, 2023, 4:40 p.m. UTC | #7
On Thu, Mar 23, 2023 at 05:08:43PM +0100, Jan Beulich wrote:
> On 23.03.2023 15:49, Roger Pau Monné wrote:
> > On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> >> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> >> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> >> @@ -9,6 +9,10 @@
> >>  
> >>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
> >>  
> >> +struct domain;
> >> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> >> +                  const struct domain *d);
> > 
> > We seem to usually name this rtc rather than cmos, any reason to use
> > cmos for the helper naming rather than rtc?
> > 
> > If not I would rather use is_rtc_port(), so that we can keep it in
> > sync with the existing RTC_PORT() macros and the handler names
> > rtc_guest_{read,write}, hw_rtc_io.
> 
> Already when talking about just ports 70 and 71 there's more CMOS
> behind them than RTC. With extended CMOS accesses the ratio further
> shifts. So I view using "rtc" here simply as increasingly
> inappropriate.

Hm, it's your patch at the end, and such decision would likely fall
under the same bag as other style related questions.

I would prefer to keep the naming consistent, as to not confuse
readers with code dealing with the same underlying IO ports using both
RTC and CMOS, but that's just my taste.

> >> --- a/xen/arch/x86/setup.c
> >> +++ b/xen/arch/x86/setup.c
> >> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
> >>  static int __hwdom_init cf_check io_bitmap_cb(
> >>      unsigned long s, unsigned long e, void *ctx)
> >>  {
> >> -    struct domain *d = ctx;
> >> +    const struct domain *d = ctx;
> >>      unsigned int i;
> >>  
> >>      ASSERT(e <= INT_MAX);
> >>      for ( i = s; i <= e; i++ )
> >> -        __clear_bit(i, d->arch.hvm.io_bitmap);
> >> +        /*
> >> +         * Accesses to RTC ports also need to be trapped in order to keep
> >> +         * consistency with PV.
> >> +         */
> > 
> > More than to keep consistency with PV, don't we need to trap accesses
> > to that concurrent accesses between dom0 and Xen (when also using the
> > device) don't overlap, as the RTC/CMOS space uses indirect indexing.
> 
> That's what I read "consistency" to mean.

But consistency with PV?  We need to keep consistency with concurrent
Xen (hypervisor) accesses I would think.

I would s/PV/hypervisor accesses/ in the comment above while moving
it.

> >> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
> >> +        {
> >> +            uint8_t normal, alt;
> >> +            unsigned long flags;
> >> +
> >> +            if ( i == acpi_gbl_FADT.century )
> >> +                continue;
> >> +
> >> +            spin_lock_irqsave(&rtc_lock, flags);
> >> +
> >> +            normal = CMOS_READ(i);
> >> +            if ( inb(RTC_PORT(offs)) != i )
> >> +                read = false;
> >> +
> >> +            alt = inb(RTC_PORT(offs + 1));
> >> +
> >> +            spin_unlock_irqrestore(&rtc_lock, flags);
> >> +
> >> +            if ( normal != alt )
> >> +                break;
> >> +
> >> +            process_pending_softirqs();
> > 
> > You adding a call to process pending softirqs for every loop
> > iteration makes me wonder how long is each of those accesses expected
> > to take, since we could be performing a lot of them (0x80 * 3).
> 
> It seemed best to me to keep things simple here, at the expense at a
> few too many calls.
> 
> > I don't think so, but there can not be any side effects from reading
> > from the CMOS RAM I would assume, even for cases where the CMOS ports
> > are not aliases?
> 
> Well, one of the fundamental assumptions is that these read attempts
> won't have side effects. Without that assumption we simply can't do
> such probing.
> 
> > I would assume ports to be either aliased to the CMOS, or otherwise
> > reserved.  What makes me wonder if it wouldn't be simpler to just
> > passthough accesses to all the possible CMOS alias ports.
> 
> But we need to intercept writes to 70 to track the index. IOW we
> cannot simply pass through all of them, and we also cannot simply
> intercept them all and treat them all the same.

Why couldn't we intercept all the possible alias port and passthrough
all of them?  As long as there's nothing else there's no risk in doing
so?

> >> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> >> +{
> >> +    if ( !is_hardware_domain(d) )
> >> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> >> +
> >> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> >> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
> >> +    {
> >> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
> >> +
> >> +        while ( cmos_alias_mask & nr )
> >> +            nr <<= 1;
> >> +        for ( step = nr << 1;
> >> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
> >> +            step <<= 1;
> >> +        do {
> >> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
> >> +                 port <= cmos + 1 && port + bytes > cmos )
> >> +                return true;
> >> +            cmos += step;
> >> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
> > 
> > I would use a for loop similar to the one used in probe_cmos_alias()
> > to check for aliased accesses?
> > 
> > if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
> >     return true;
> > 
> > for ( offs = 2; offs < 8; offs <<= 1 )
> > {
> >     if ( !(offs & cmos_alias_mask) )
> >         continue;
> >     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
> >         return true;
> > }
> > 
> > return false;
> > 
> > So that you can also optimize for the more common case RTC_PORT(0) and
> > RTC_PORT(1) are used?
> > 
> > Or there's something I'm missing?
> 
> I'll have to check carefully, but to be honest I would prefer to not
> touch this code again unless there's clearly something wrong with it.

TBH, I think the proposed code is extremely difficult to follow, there
are 3 loops in a row which gives me a headache when thinking about all
the possible combinations.

I think my proposed alternative is much easier to follow because it
has a single loop, and it's using the same bounds used to fill the
cmos_alias_mask in the first place.  But maybe that's just my taste.

> >> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
> >>      unsigned long flags;
> >>      unsigned int data = ~0;
> >>  
> >> -    switch ( port )
> >> +    switch ( port & ~cmos_alias_mask )
> >>      {
> >>      case RTC_PORT(0):
> >>          /*
> >> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
> >>           * of the first RTC port, as there's no access to the physical IO
> >>           * ports.
> >>           */
> >> -        data = currd->arch.cmos_idx;
> >> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> > 
> > We do allow read access to alias ports even when the underling
> > hardware does do so,
> 
> I'm afraid I don't understand this, so ...
> 
> > which I think is fine, but might be worth a
> > comment (since we already detect whether the RTC_PORT(0) alias is also
> > readable.
> 
> ... I can't really derive what kind of information you're after to put
> in a comment.

Reading from ports that alias RTC_PORT(0) might not always return the
value written to RTC_PORT(0) (you have a check for that in
probe_cmos_alias()).  Yet in rtc_guest_read() Xen does always return
the cached CMOS index.  Which is likely to be all fine, but needs a
comment to note this behavior might not match what the underlying
hardware would return.

Thanks, Roger.
Jan Beulich March 27, 2023, 3:44 p.m. UTC | #8
On 21.03.2023 15:12, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> I'm trying to get some documentation about this aliasing, but so far I
> haven't been able to find any.  Do you have any references of where I
> might be able to find it?
> 
>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC.
> 
> Could this create any concerns with the ability to disable NMIs if we
> no longer filter accesses to the RTC?

I've added "..., because of there being none." If there's no RTC, I don't
think it's specified what function (if any) port 70 bit 7 has.

Jan
Jan Beulich March 27, 2023, 3:44 p.m. UTC | #9
On 23.03.2023 15:49, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>>  }
>>  
>> +static unsigned int __ro_after_init cmos_alias_mask;
>> +
>> +static int __init cf_check probe_cmos_alias(void)
>> +{
>> +    unsigned int i, offs;
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return 0;
>> +
>> +    for ( offs = 2; offs < 8; offs <<= 1 )
>> +    {
>> +        bool read = true;
> 
> You can limit the scope of i to the inner for loop.
> 
>> +
>> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
>> +        {
>> +            uint8_t normal, alt;
>> +            unsigned long flags;
>> +
>> +            if ( i == acpi_gbl_FADT.century )
>> +                continue;
>> +
>> +            spin_lock_irqsave(&rtc_lock, flags);
>> +
>> +            normal = CMOS_READ(i);
>> +            if ( inb(RTC_PORT(offs)) != i )
>> +                read = false;
>> +
>> +            alt = inb(RTC_PORT(offs + 1));
>> +
>> +            spin_unlock_irqrestore(&rtc_lock, flags);
>> +
>> +            if ( normal != alt )
>> +                break;
>> +
>> +            process_pending_softirqs();
> 
> You adding a call to process pending softirqs for every loop
> iteration makes me wonder how long is each of those accesses expected
> to take, since we could be performing a lot of them (0x80 * 3).
> 
> I don't think so, but there can not be any side effects from reading
> from the CMOS RAM I would assume, even for cases where the CMOS ports
> are not aliases?
> 
> I would assume ports to be either aliased to the CMOS, or otherwise
> reserved.  What makes me wonder if it wouldn't be simpler to just
> passthough accesses to all the possible CMOS alias ports.

I'm afraid this assumption doesn't hold, as can be seen from the ICH10
datasheet that I did point you at the other day. There ports 72/73
serve a purpose different from ports 70/71 (and their aliases at 74/75).
Unless (as the datasheet calls it) U128E is clear (wherever that bit
lives), in which case 72/73 (and 76/77) will also be aliases of 70/71.
So we won't get away without probing, and if we deem probing too risky,
then all I can do is drop this patch.

Jan
Jan Beulich March 27, 2023, 3:46 p.m. UTC | #10
On 23.03.2023 17:40, Roger Pau Monné wrote:
> On Thu, Mar 23, 2023 at 05:08:43PM +0100, Jan Beulich wrote:
>> On 23.03.2023 15:49, Roger Pau Monné wrote:
>>> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/setup.c
>>>> +++ b/xen/arch/x86/setup.c
>>>> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>>>>  static int __hwdom_init cf_check io_bitmap_cb(
>>>>      unsigned long s, unsigned long e, void *ctx)
>>>>  {
>>>> -    struct domain *d = ctx;
>>>> +    const struct domain *d = ctx;
>>>>      unsigned int i;
>>>>  
>>>>      ASSERT(e <= INT_MAX);
>>>>      for ( i = s; i <= e; i++ )
>>>> -        __clear_bit(i, d->arch.hvm.io_bitmap);
>>>> +        /*
>>>> +         * Accesses to RTC ports also need to be trapped in order to keep
>>>> +         * consistency with PV.
>>>> +         */
>>>
>>> More than to keep consistency with PV, don't we need to trap accesses
>>> to that concurrent accesses between dom0 and Xen (when also using the
>>> device) don't overlap, as the RTC/CMOS space uses indirect indexing.
>>
>> That's what I read "consistency" to mean.
> 
> But consistency with PV?  We need to keep consistency with concurrent
> Xen (hypervisor) accesses I would think.
> 
> I would s/PV/hypervisor accesses/ in the comment above while moving
> it.

Hmm, yes, changed. (Still I'm not normally very happy about changing
comments I don't really understand the way they're written.)

>>>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>>>> +{
>>>> +    if ( !is_hardware_domain(d) )
>>>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>>>> +
>>>> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>>>> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
>>>> +    {
>>>> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
>>>> +
>>>> +        while ( cmos_alias_mask & nr )
>>>> +            nr <<= 1;
>>>> +        for ( step = nr << 1;
>>>> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
>>>> +            step <<= 1;
>>>> +        do {
>>>> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
>>>> +                 port <= cmos + 1 && port + bytes > cmos )
>>>> +                return true;
>>>> +            cmos += step;
>>>> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
>>>
>>> I would use a for loop similar to the one used in probe_cmos_alias()
>>> to check for aliased accesses?
>>>
>>> if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
>>>     return true;
>>>
>>> for ( offs = 2; offs < 8; offs <<= 1 )
>>> {
>>>     if ( !(offs & cmos_alias_mask) )
>>>         continue;
>>>     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
>>>         return true;
>>> }
>>>
>>> return false;
>>>
>>> So that you can also optimize for the more common case RTC_PORT(0) and
>>> RTC_PORT(1) are used?
>>>
>>> Or there's something I'm missing?
>>
>> I'll have to check carefully, but to be honest I would prefer to not
>> touch this code again unless there's clearly something wrong with it.
> 
> TBH, I think the proposed code is extremely difficult to follow, there
> are 3 loops in a row which gives me a headache when thinking about all
> the possible combinations.
> 
> I think my proposed alternative is much easier to follow because it
> has a single loop, and it's using the same bounds used to fill the
> cmos_alias_mask in the first place.  But maybe that's just my taste.

Upon re-consideration I've adjusted the code. Iirc probe_cmos_alias()
originally had something similar, so changing it in the other place as
well is only logical.

>>>> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>>>>      unsigned long flags;
>>>>      unsigned int data = ~0;
>>>>  
>>>> -    switch ( port )
>>>> +    switch ( port & ~cmos_alias_mask )
>>>>      {
>>>>      case RTC_PORT(0):
>>>>          /*
>>>> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>>>>           * of the first RTC port, as there's no access to the physical IO
>>>>           * ports.
>>>>           */
>>>> -        data = currd->arch.cmos_idx;
>>>> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
>>>
>>> We do allow read access to alias ports even when the underling
>>> hardware does do so,
>>
>> I'm afraid I don't understand this, so ...
>>
>>> which I think is fine, but might be worth a
>>> comment (since we already detect whether the RTC_PORT(0) alias is also
>>> readable.
>>
>> ... I can't really derive what kind of information you're after to put
>> in a comment.
> 
> Reading from ports that alias RTC_PORT(0) might not always return the
> value written to RTC_PORT(0) (you have a check for that in
> probe_cmos_alias()).  Yet in rtc_guest_read() Xen does always return
> the cached CMOS index.  Which is likely to be all fine, but needs a
> comment to note this behavior might not match what the underlying
> hardware would return.

You do realize that this is pre-existing behavior? On many chipsets
you cannot read the index value from port 70. Yet we've always
returned it, presumably on the grounds of the value either being
rubbish (often 0xff) or the index. And rather than trying to figure
out whether to return rubbish (explicitly of via accessing the port)
I guess it was deemed easier and more helpful to always return the
index value. (I've amended the comment there nevertheless.)

Jan
diff mbox series

Patch

--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -27,7 +27,7 @@ 
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/save.h>
-#include <asm/current.h>
+#include <asm/iocap.h>
 #include <xen/trace.h>
 #include <public/hvm/params.h>
 
@@ -836,10 +836,18 @@  void rtc_init(struct domain *d)
 
     if ( !has_vrtc(d) )
     {
-        if ( is_hardware_domain(d) )
-            /* Hardware domain gets mediated access to the physical RTC. */
-            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
-        return;
+        unsigned int port;
+
+        if ( !is_hardware_domain(d) )
+            return;
+
+        /*
+         * Hardware domain gets mediated access to the physical RTC/CMOS
+         * (of course unless we don't use it ourselves).
+         */
+        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
+            if ( is_cmos_port(port, 2, d) )
+                register_portio_handler(d, port, 2, hw_rtc_io);
     }
 
     spin_lock_init(&s->lock);
--- a/xen/arch/x86/include/asm/mc146818rtc.h
+++ b/xen/arch/x86/include/asm/mc146818rtc.h
@@ -9,6 +9,10 @@ 
 
 extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
 
+struct domain;
+bool is_cmos_port(unsigned int port, unsigned int bytes,
+                  const struct domain *d);
+
 /**********************************************************************
  * register summary
  **********************************************************************/
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -220,7 +220,7 @@  static bool admin_io_okay(unsigned int p
         return false;
 
     /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+    if ( is_cmos_port(port, bytes, d) )
         return false;
 
     return ioports_access_permitted(d, port, port + bytes - 1);
@@ -290,7 +290,7 @@  static uint32_t guest_io_read(unsigned i
         {
             sub_data = pv_pit_handler(port, 0, 0);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             sub_data = rtc_guest_read(port);
         }
@@ -436,7 +436,7 @@  static void guest_io_write(unsigned int
         {
             pv_pit_handler(port, (uint8_t)data, 1);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             rtc_guest_write(port, data);
         }
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2072,37 +2072,36 @@  int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init cf_check io_bitmap_cb(
     unsigned long s, unsigned long e, void *ctx)
 {
-    struct domain *d = ctx;
+    const struct domain *d = ctx;
     unsigned int i;
 
     ASSERT(e <= INT_MAX);
     for ( i = s; i <= e; i++ )
-        __clear_bit(i, d->arch.hvm.io_bitmap);
+        /*
+         * Accesses to RTC ports also need to be trapped in order to keep
+         * consistency with PV.
+         */
+        if ( !is_cmos_port(i, 1, d) )
+            __clear_bit(i, d->arch.hvm.io_bitmap);
 
     return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-    int rc;
+    if ( !is_hvm_domain(d) )
+        return;
 
-    if ( is_hvm_domain(d) )
-    {
-        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
-        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
-                                    io_bitmap_cb, d);
-        BUG_ON(rc);
-        /*
-         * NB: we need to trap accesses to 0xcf8 in order to intercept
-         * 4 byte accesses, that need to be handled by Xen in order to
-         * keep consistency.
-         * Access to 1 byte RTC ports also needs to be trapped in order
-         * to keep consistency with PV.
-         */
-        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-    }
+    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
+    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
+                                io_bitmap_cb, d) )
+        BUG();
+
+    /*
+     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
+     * guest_io_read(), and guest_io_write()).
+     */
+    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
 }
 
 /*
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1234,7 +1234,10 @@  static unsigned long get_cmos_time(void)
         if ( seconds < 60 )
         {
             if ( rtc.sec != seconds )
+            {
                 cmos_rtc_probe = false;
+                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
+            }
             break;
         }
 
@@ -1249,6 +1252,80 @@  static unsigned long get_cmos_time(void)
     return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
 }
 
+static unsigned int __ro_after_init cmos_alias_mask;
+
+static int __init cf_check probe_cmos_alias(void)
+{
+    unsigned int i, offs;
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return 0;
+
+    for ( offs = 2; offs < 8; offs <<= 1 )
+    {
+        bool read = true;
+
+        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
+        {
+            uint8_t normal, alt;
+            unsigned long flags;
+
+            if ( i == acpi_gbl_FADT.century )
+                continue;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+
+            normal = CMOS_READ(i);
+            if ( inb(RTC_PORT(offs)) != i )
+                read = false;
+
+            alt = inb(RTC_PORT(offs + 1));
+
+            spin_unlock_irqrestore(&rtc_lock, flags);
+
+            if ( normal != alt )
+                break;
+
+            process_pending_softirqs();
+        }
+        if ( i == 0x80 )
+        {
+            cmos_alias_mask |= offs;
+            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
+                   RTC_PORT(offs), read ? "r/w" : "w/o");
+        }
+    }
+
+    return 0;
+}
+__initcall(probe_cmos_alias);
+
+bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
+{
+    if ( !is_hardware_domain(d) )
+        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
+
+    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
+         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
+    {
+        unsigned int cmos = RTC_PORT(0), nr = 2, step;
+
+        while ( cmos_alias_mask & nr )
+            nr <<= 1;
+        for ( step = nr << 1;
+              step < cmos_alias_mask && !(cmos_alias_mask & step); )
+            step <<= 1;
+        do {
+            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
+                 port <= cmos + 1 && port + bytes > cmos )
+                return true;
+            cmos += step;
+        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
+    }
+
+    return false;
+}
+
 /* Helpers for guest accesses to the physical RTC. */
 unsigned int rtc_guest_read(unsigned int port)
 {
@@ -1256,7 +1333,7 @@  unsigned int rtc_guest_read(unsigned int
     unsigned long flags;
     unsigned int data = ~0;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
     case RTC_PORT(0):
         /*
@@ -1264,15 +1341,16 @@  unsigned int rtc_guest_read(unsigned int
          * of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        data = currd->arch.cmos_idx;
+        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        data = inb(RTC_PORT(1));
+        outb(currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1))),
+             port - 1);
+        data = inb(port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
@@ -1288,9 +1366,10 @@  void rtc_guest_write(unsigned int port,
     struct domain *currd = current->domain;
     unsigned long flags;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
         typeof(pv_rtc_handler) hook;
+        unsigned int idx;
 
     case RTC_PORT(0):
         /*
@@ -1298,20 +1377,22 @@  void rtc_guest_write(unsigned int port,
          * value of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        currd->arch.cmos_idx = data;
+        currd->arch.cmos_idx = data & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
 
+        idx = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1)));
+
         hook = ACCESS_ONCE(pv_rtc_handler);
         if ( hook )
-            hook(currd->arch.cmos_idx & 0x7f, data);
+            hook(idx, data);
 
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        outb(data, RTC_PORT(1));
+        outb(idx, port - 1);
+        outb(data, port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;