diff mbox

[15/33] docs: update ACPI CPU hotplug spec with new protocol

Message ID 1463496205-251412-16-git-send-email-imammedo@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Igor Mammedov May 17, 2016, 2:43 p.m. UTC
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------
 1 file changed, 76 insertions(+), 12 deletions(-)

Comments

Michael S. Tsirkin May 31, 2016, 4:49 a.m. UTC | #1
On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote:
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------
>  1 file changed, 76 insertions(+), 12 deletions(-)
> 
> diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> index 340b751..c5bce6a 100644
> --- a/docs/specs/acpi_cpu_hotplug.txt
> +++ b/docs/specs/acpi_cpu_hotplug.txt
> @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface
>  QEMU supports CPU hotplug via ACPI. This document
>  describes the interface between QEMU and the ACPI BIOS.
>  
> -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> ------------------------------------------
> -
> -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
> -hot-add/remove event to ACPI BIOS, via SCI interrupt.
> +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
> +and hot-remove events.
>  
> +============================================
> +Legacy ACPI CPU hotplug interface registers:
> +--------------------------------------------
>  CPU present bitmap for:
> +  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
>    ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
>    PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
>  ---------------------------------------------------------------
> -One bit per CPU. Bit position reflects corresponding CPU APIC ID.
> -Read-only.
> +QEMU sets corresponding CPU bit on hot-add event and issues SCI
> +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
> +to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
> +
> +=====================================
> +ACPI CPU hotplug interface registers:
> +-------------------------------------
> +Register block base address:
> +    ICH9-LPC IO port 0x0cd8
> +    PIIX-PM  IO port 0xaf00

OK but this means we either use legacy or new one,
bot both, which is problematic for people using old seabios
without acpi loading support and with -M pc.

I don't say we must support them with >255 CPUs,
but I do say we should make an effort for simple
setups with <255 CPUs.


> +Register block size:
> +    ACPI_CPU_HOTPLUG_REG_LEN = 12
> +
> +read access:

So this implies acpi must scan all cpus on each event, and
this seems too aggressive.

I think we need something hierarchical where
you read one level and know which cpus to probe.


> +    offset:
> +    [0x0-0x3] reserved
> +    [0x4] CPU device status fields: (1 byte access)
> +        bits:
> +           0: Device is enabled and may be used by guest
> +           1: Device insert event, used to distinguish device for which
> +              no device check event to OSPM was issued.
> +              It's valid only when bit 0 is set.
> +           2: Device remove event, used to distinguish device for which
> +              no device eject request to OSPM was issued.
> +           3-7: reserved and should be ignored by OSPM
> +    [0x5-0x7] reserved
> +    [0x8] Command data: (DWORD access)
> +          in case of error or unsupported command reads is 0xFFFFFFFF
> +          current 'Command field' value:
> +              0: returns PXM value corresponding to device
> +
> +write access:
> +    offset:
> +    [0x0-0x3] CPU selector: (DWORD access)
> +              selects active CPU device. All following accesses to other
> +              registers will read/store data from/to selected CPU.

I've been thinking - is it time to add an EmbeddedControl interface?
This way we have another namespace.
Not insisting on it, just an idea.

> +    [0x4] CPU device control fields: (1 byte access)
> +        bits:
> +            0: reserved, OSPM must clear it before writing to register.
> +            1: if set to 1 clears device insert event, set by OSPM
> +               after it has emitted device check event for the
> +               selected CPU device
> +            2: if set to 1 clears device remove event, set by OSPM
> +               after it has emitted device eject request for the
> +               selected CPU device
> +            3: if set to 1 initiates device eject, set by OSPM when it
> +               triggers CPU device removal and calls _EJ0 method
> +            4-7: reserved, OSPM must clear them before writing to register
> +    [0x5] Command field: (1 byte access)
> +          value:
> +            0: following reads from 'Command data' register returns PXM
> +               value of device
> +            1: following writes to 'Command data' register set OST event
> +               register in QEMU
> +            2: following writes to 'Command data' register set OST status
> +               register in QEMU
> +            other values: reserved
> +    [0x6-0x7] reserved
> +    [0x8] Command data: (DWORD access)
> +          current 'Command field' value:
> +              1: stores value into OST event register
> +              2: stores value into OST status register, triggers
> +                 ACPI_DEVICE_OST QMP event from QEMU to external applications
> +                 with current values of OST event and status registers.
> +            other values: reserved
>  
> -CPU hot-add/remove notification:
> ------------------------------------------------------
> -QEMU sets/clears corresponding CPU bit on hot-add/remove event.
> -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
> -hot-(un)plug events.
> +Selecting CPU device beyond possible range has no effect on platform:
> +   - write accesses to CPU hot-plug registers not documented above are
> +     ignored
> +   - read accesses to CPU hot-plug registers not documented above return
> +     all bits set to 1.

why not 0?

> -- 
> 1.8.3.1
Igor Mammedov May 31, 2016, 3:07 p.m. UTC | #2
On Tue, 31 May 2016 07:49:16 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote:
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> >  docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------
> >  1 file changed, 76 insertions(+), 12 deletions(-)
> > 
> > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> > index 340b751..c5bce6a 100644
> > --- a/docs/specs/acpi_cpu_hotplug.txt
> > +++ b/docs/specs/acpi_cpu_hotplug.txt
> > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface
> >  QEMU supports CPU hotplug via ACPI. This document
> >  describes the interface between QEMU and the ACPI BIOS.
> >  
> > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> > ------------------------------------------
> > -
> > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
> > -hot-add/remove event to ACPI BIOS, via SCI interrupt.
> > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
> > +and hot-remove events.
> >  
> > +============================================
> > +Legacy ACPI CPU hotplug interface registers:
> > +--------------------------------------------
> >  CPU present bitmap for:
> > +  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
> >    ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
> >    PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
> >  ---------------------------------------------------------------
> > -One bit per CPU. Bit position reflects corresponding CPU APIC ID.
> > -Read-only.
> > +QEMU sets corresponding CPU bit on hot-add event and issues SCI
> > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
> > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
> > +
> > +=====================================
> > +ACPI CPU hotplug interface registers:
> > +-------------------------------------
> > +Register block base address:
> > +    ICH9-LPC IO port 0x0cd8
> > +    PIIX-PM  IO port 0xaf00  
> 
> OK but this means we either use legacy or new one,
> bot both, which is problematic for people using old seabios
> without acpi loading support and with -M pc.
> 
> I don't say we must support them with >255 CPUs,
> but I do say we should make an effort for simple
> setups with <255 CPUs.


it works with QEMU provided(shipped) BIOS,
it works for migration case as legacy stays enables because of -M src_legacy_machine

more that 255 cpus will break old BIOS in different
ways /corrupt/hang/ depending on BIOS version.

I'm not sure we should care about using old BIOS
with new QEMU+new machine type though and allow
expectations to be beyond what hw vendors usually set.
It's the same as real hw does i.e. new hardware
shipped with new BIOS. 
If user insist on old BIOS+new machine type he still has
property knob to force legacy mode.

on the fist glance, it's probably not that very hard
to switch IO ports handling from legacy to new interface
by sending from new cpu-hotplug AML a command to do so,
my concerns here is:
 * +1 more state to migrate
 * probably issues with migration as target started with
   different IO layout
 * IO window freed after switching from legacy to new,
   will not be available to guest as it started with
   legacy window consumed by CPUS.CRS.
 * that legacy switching business is only PC specific
    means having a knob to turn it on so it won't pollute ARM

all in all it's probably too much headache to make sure
that improbable usecase would work, so after considering
this idea I've dropped it and did it the way it's now.

> > +Register block size:
> > +    ACPI_CPU_HOTPLUG_REG_LEN = 12
> > +
> > +read access:  
> 
> So this implies acpi must scan all cpus on each event, and
> this seems too aggressive.
> I think we need something hierarchical where
> you read one level and know which cpus to probe.
That's what we do for mem-hotplug as it's not
performance critical path.

In addition to that depending on guest OS/version
it will anyway do enumeration of all CPUs after
our hotplug AML method scanned all CPUs and
sent notifies.

not that I'm in favor of complicating this protocol,
but I wouldn't do it hierarchical,
that's what Notify(BUS_CHECK) is supposed to do
but it's broken on some guests.

So if I'd do a more complicated protocol,
I'd do polling from AML side telling QEMU
  if (cpu = give_me_next_cpu_with_event())
     switch_to_cpu(cpu)
     ...

that means adding extra state to migrate,
probably ther might be other issues I'm not aware of yet.

On the other hand mem-hotplug like interface
so far hasn't caused any issues
(that of cause doesn't mean that there aren't any).

> > +    offset:
> > +    [0x0-0x3] reserved
> > +    [0x4] CPU device status fields: (1 byte access)
> > +        bits:
> > +           0: Device is enabled and may be used by guest
> > +           1: Device insert event, used to distinguish device for which
> > +              no device check event to OSPM was issued.
> > +              It's valid only when bit 0 is set.
> > +           2: Device remove event, used to distinguish device for which
> > +              no device eject request to OSPM was issued.
> > +           3-7: reserved and should be ignored by OSPM
> > +    [0x5-0x7] reserved
> > +    [0x8] Command data: (DWORD access)
> > +          in case of error or unsupported command reads is 0xFFFFFFFF
> > +          current 'Command field' value:
> > +              0: returns PXM value corresponding to device
> > +
> > +write access:
> > +    offset:
> > +    [0x0-0x3] CPU selector: (DWORD access)
> > +              selects active CPU device. All following accesses to other
> > +              registers will read/store data from/to selected CPU.  
> 
> I've been thinking - is it time to add an EmbeddedControl interface?
> This way we have another namespace.

There were patches on list with it some time ago

    "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC"

but we dropped it as bloatware because there weren't any
real need for it as current IO port interfaces aren't
going away any time soon and work just fine without EC.
And if we were to move current interfaces into EC namespace
we would have to do it only for new machines types
and keep old code in place as well and a bunch of compat
code for a company.

> Not insisting on it, just an idea.
I wouldn't make it a necessary dependency and block this series.
Might be worth to look at again when there is usecases for it
without legacy stuff attached.

> 
> > +    [0x4] CPU device control fields: (1 byte access)
> > +        bits:
> > +            0: reserved, OSPM must clear it before writing to register.
> > +            1: if set to 1 clears device insert event, set by OSPM
> > +               after it has emitted device check event for the
> > +               selected CPU device
> > +            2: if set to 1 clears device remove event, set by OSPM
> > +               after it has emitted device eject request for the
> > +               selected CPU device
> > +            3: if set to 1 initiates device eject, set by OSPM when it
> > +               triggers CPU device removal and calls _EJ0 method
> > +            4-7: reserved, OSPM must clear them before writing to register
> > +    [0x5] Command field: (1 byte access)
> > +          value:
> > +            0: following reads from 'Command data' register returns PXM
> > +               value of device
> > +            1: following writes to 'Command data' register set OST event
> > +               register in QEMU
> > +            2: following writes to 'Command data' register set OST status
> > +               register in QEMU
> > +            other values: reserved
> > +    [0x6-0x7] reserved
> > +    [0x8] Command data: (DWORD access)
> > +          current 'Command field' value:
> > +              1: stores value into OST event register
> > +              2: stores value into OST status register, triggers
> > +                 ACPI_DEVICE_OST QMP event from QEMU to external applications
> > +                 with current values of OST event and status registers.
> > +            other values: reserved
> >  
> > -CPU hot-add/remove notification:
> > ------------------------------------------------------
> > -QEMU sets/clears corresponding CPU bit on hot-add/remove event.
> > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
> > -hot-(un)plug events.
> > +Selecting CPU device beyond possible range has no effect on platform:
> > +   - write accesses to CPU hot-plug registers not documented above are
> > +     ignored
> > +   - read accesses to CPU hot-plug registers not documented above return
> > +     all bits set to 1.  
> 
> why not 0?
isn't hardware usually return 1s on unhandled IO reads?
(I even thought that's it was you who told me it)

that's what has been done for memory hotplug, so I'm doing the same here.

> 
> > -- 
> > 1.8.3.1  
>
Michael S. Tsirkin May 31, 2016, 9:09 p.m. UTC | #3
On Tue, May 31, 2016 at 05:07:41PM +0200, Igor Mammedov wrote:
> On Tue, 31 May 2016 07:49:16 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote:
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > ---
> > >  docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 76 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> > > index 340b751..c5bce6a 100644
> > > --- a/docs/specs/acpi_cpu_hotplug.txt
> > > +++ b/docs/specs/acpi_cpu_hotplug.txt
> > > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface
> > >  QEMU supports CPU hotplug via ACPI. This document
> > >  describes the interface between QEMU and the ACPI BIOS.
> > >  
> > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> > > ------------------------------------------
> > > -
> > > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
> > > -hot-add/remove event to ACPI BIOS, via SCI interrupt.
> > > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
> > > +and hot-remove events.
> > >  
> > > +============================================
> > > +Legacy ACPI CPU hotplug interface registers:
> > > +--------------------------------------------
> > >  CPU present bitmap for:
> > > +  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
> > >    ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
> > >    PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
> > >  ---------------------------------------------------------------
> > > -One bit per CPU. Bit position reflects corresponding CPU APIC ID.
> > > -Read-only.
> > > +QEMU sets corresponding CPU bit on hot-add event and issues SCI
> > > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
> > > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
> > > +
> > > +=====================================
> > > +ACPI CPU hotplug interface registers:
> > > +-------------------------------------
> > > +Register block base address:
> > > +    ICH9-LPC IO port 0x0cd8
> > > +    PIIX-PM  IO port 0xaf00  
> > 
> > OK but this means we either use legacy or new one,
> > bot both, which is problematic for people using old seabios
> > without acpi loading support and with -M pc.
> > 
> > I don't say we must support them with >255 CPUs,
> > but I do say we should make an effort for simple
> > setups with <255 CPUs.
> 
> 
> it works with QEMU provided(shipped) BIOS,
> it works for migration case as legacy stays enables because of -M src_legacy_machine
> 
> more that 255 cpus will break old BIOS in different
> ways /corrupt/hang/ depending on BIOS version.
> 
> I'm not sure we should care about using old BIOS
> with new QEMU+new machine type though and allow
> expectations to be beyond what hw vendors usually set.
> It's the same as real hw does i.e. new hardware
> shipped with new BIOS. 
> If user insist on old BIOS+new machine type he still has
> property knob to force legacy mode.
> 
> on the fist glance, it's probably not that very hard
> to switch IO ports handling from legacy to new interface
> by sending from new cpu-hotplug AML a command to do so,
> my concerns here is:
>  * +1 more state to migrate
>  * probably issues with migration as target started with
>    different IO layout
>  * IO window freed after switching from legacy to new,
>    will not be available to guest as it started with
>    legacy window consumed by CPUS.CRS.
>  * that legacy switching business is only PC specific
>     means having a knob to turn it on so it won't pollute ARM
> 
> all in all it's probably too much headache to make sure
> that improbable usecase would work, so after considering
> this idea I've dropped it and did it the way it's now.

Well I think it's worth the effort. I agree it's tricky
to implement but we do maintain compatibility for years.
Being orthogonal with bios version is very helpful for
a variety of reasons such as debugging.


> > > +Register block size:
> > > +    ACPI_CPU_HOTPLUG_REG_LEN = 12
> > > +
> > > +read access:  
> > 
> > So this implies acpi must scan all cpus on each event, and
> > this seems too aggressive.
> > I think we need something hierarchical where
> > you read one level and know which cpus to probe.
> That's what we do for mem-hotplug as it's not
> performance critical path.
> 
> In addition to that depending on guest OS/version
> it will anyway do enumeration of all CPUs after
> our hotplug AML method scanned all CPUs and
> sent notifies.
> 
> not that I'm in favor of complicating this protocol,
> but I wouldn't do it hierarchical,
> that's what Notify(BUS_CHECK) is supposed to do
> but it's broken on some guests.

Interesting. Which ones? Would they be easy to detect?

> So if I'd do a more complicated protocol,
> I'd do polling from AML side telling QEMU
>   if (cpu = give_me_next_cpu_with_event())
>      switch_to_cpu(cpu)
>      ...
> 
> that means adding extra state to migrate,
> probably ther might be other issues I'm not aware of yet.
> 
> On the other hand mem-hotplug like interface
> so far hasn't caused any issues
> (that of cause doesn't mean that there aren't any).
> 
> > > +    offset:
> > > +    [0x0-0x3] reserved
> > > +    [0x4] CPU device status fields: (1 byte access)
> > > +        bits:
> > > +           0: Device is enabled and may be used by guest
> > > +           1: Device insert event, used to distinguish device for which
> > > +              no device check event to OSPM was issued.
> > > +              It's valid only when bit 0 is set.
> > > +           2: Device remove event, used to distinguish device for which
> > > +              no device eject request to OSPM was issued.
> > > +           3-7: reserved and should be ignored by OSPM
> > > +    [0x5-0x7] reserved
> > > +    [0x8] Command data: (DWORD access)
> > > +          in case of error or unsupported command reads is 0xFFFFFFFF
> > > +          current 'Command field' value:
> > > +              0: returns PXM value corresponding to device
> > > +
> > > +write access:
> > > +    offset:
> > > +    [0x0-0x3] CPU selector: (DWORD access)
> > > +              selects active CPU device. All following accesses to other
> > > +              registers will read/store data from/to selected CPU.  
> > 
> > I've been thinking - is it time to add an EmbeddedControl interface?
> > This way we have another namespace.
> 
> There were patches on list with it some time ago
> 
>     "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC"
> 
> but we dropped it as bloatware because there weren't any
> real need for it as current IO port interfaces aren't
> going away any time soon and work just fine without EC.

Right. But now you are removing them for >255 CPUs
anyway.

> And if we were to move current interfaces into EC namespace
> we would have to do it only for new machines types
> and keep old code in place as well and a bunch of compat
> code for a company.

Hmm. Exactly like this current one?

> > Not insisting on it, just an idea.
> I wouldn't make it a necessary dependency and block this series.
> Might be worth to look at again when there is usecases for it
> without legacy stuff attached.

and then we'll have to maintain 3 interfaces?

> > 
> > > +    [0x4] CPU device control fields: (1 byte access)
> > > +        bits:
> > > +            0: reserved, OSPM must clear it before writing to register.
> > > +            1: if set to 1 clears device insert event, set by OSPM
> > > +               after it has emitted device check event for the
> > > +               selected CPU device
> > > +            2: if set to 1 clears device remove event, set by OSPM
> > > +               after it has emitted device eject request for the
> > > +               selected CPU device
> > > +            3: if set to 1 initiates device eject, set by OSPM when it
> > > +               triggers CPU device removal and calls _EJ0 method
> > > +            4-7: reserved, OSPM must clear them before writing to register
> > > +    [0x5] Command field: (1 byte access)
> > > +          value:
> > > +            0: following reads from 'Command data' register returns PXM
> > > +               value of device
> > > +            1: following writes to 'Command data' register set OST event
> > > +               register in QEMU
> > > +            2: following writes to 'Command data' register set OST status
> > > +               register in QEMU
> > > +            other values: reserved
> > > +    [0x6-0x7] reserved
> > > +    [0x8] Command data: (DWORD access)
> > > +          current 'Command field' value:
> > > +              1: stores value into OST event register
> > > +              2: stores value into OST status register, triggers
> > > +                 ACPI_DEVICE_OST QMP event from QEMU to external applications
> > > +                 with current values of OST event and status registers.
> > > +            other values: reserved
> > >  
> > > -CPU hot-add/remove notification:
> > > ------------------------------------------------------
> > > -QEMU sets/clears corresponding CPU bit on hot-add/remove event.
> > > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
> > > -hot-(un)plug events.
> > > +Selecting CPU device beyond possible range has no effect on platform:
> > > +   - write accesses to CPU hot-plug registers not documented above are
> > > +     ignored
> > > +   - read accesses to CPU hot-plug registers not documented above return
> > > +     all bits set to 1.  
> > 
> > why not 0?
> isn't hardware usually return 1s on unhandled IO reads?
> (I even thought that's it was you who told me it)

PCI does but it's special.

> 
> that's what has been done for memory hotplug, so I'm doing the same here.
> 
> > 
> > > -- 
> > > 1.8.3.1  
> >
Igor Mammedov June 6, 2016, 9:57 a.m. UTC | #4
On Wed, 1 Jun 2016 00:09:57 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, May 31, 2016 at 05:07:41PM +0200, Igor Mammedov wrote:
> > On Tue, 31 May 2016 07:49:16 +0300
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote:  
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > ---
> > > >  docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------
> > > >  1 file changed, 76 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> > > > index 340b751..c5bce6a 100644
> > > > --- a/docs/specs/acpi_cpu_hotplug.txt
> > > > +++ b/docs/specs/acpi_cpu_hotplug.txt
> > > > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface
> > > >  QEMU supports CPU hotplug via ACPI. This document
> > > >  describes the interface between QEMU and the ACPI BIOS.
> > > >  
> > > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> > > > ------------------------------------------
> > > > -
> > > > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
> > > > -hot-add/remove event to ACPI BIOS, via SCI interrupt.
> > > > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
> > > > +and hot-remove events.
> > > >  
> > > > +============================================
> > > > +Legacy ACPI CPU hotplug interface registers:
> > > > +--------------------------------------------
> > > >  CPU present bitmap for:
> > > > +  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
> > > >    ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
> > > >    PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
> > > >  ---------------------------------------------------------------
> > > > -One bit per CPU. Bit position reflects corresponding CPU APIC ID.
> > > > -Read-only.
> > > > +QEMU sets corresponding CPU bit on hot-add event and issues SCI
> > > > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
> > > > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
> > > > +
> > > > +=====================================
> > > > +ACPI CPU hotplug interface registers:
> > > > +-------------------------------------
> > > > +Register block base address:
> > > > +    ICH9-LPC IO port 0x0cd8
> > > > +    PIIX-PM  IO port 0xaf00    
> > > 
> > > OK but this means we either use legacy or new one,
> > > bot both, which is problematic for people using old seabios
> > > without acpi loading support and with -M pc.
> > > 
> > > I don't say we must support them with >255 CPUs,
> > > but I do say we should make an effort for simple
> > > setups with <255 CPUs.  
> > 
> > 
> > it works with QEMU provided(shipped) BIOS,
> > it works for migration case as legacy stays enables because of -M src_legacy_machine
> > 
> > more that 255 cpus will break old BIOS in different
> > ways /corrupt/hang/ depending on BIOS version.
> > 
> > I'm not sure we should care about using old BIOS
> > with new QEMU+new machine type though and allow
> > expectations to be beyond what hw vendors usually set.
> > It's the same as real hw does i.e. new hardware
> > shipped with new BIOS. 
> > If user insist on old BIOS+new machine type he still has
> > property knob to force legacy mode.
> > 
> > on the fist glance, it's probably not that very hard
> > to switch IO ports handling from legacy to new interface
> > by sending from new cpu-hotplug AML a command to do so,
> > my concerns here is:
> >  * +1 more state to migrate
> >  * probably issues with migration as target started with
> >    different IO layout
> >  * IO window freed after switching from legacy to new,
> >    will not be available to guest as it started with
> >    legacy window consumed by CPUS.CRS.
> >  * that legacy switching business is only PC specific
> >     means having a knob to turn it on so it won't pollute ARM
> > 
> > all in all it's probably too much headache to make sure
> > that improbable usecase would work, so after considering
> > this idea I've dropped it and did it the way it's now.  
> 
> Well I think it's worth the effort. I agree it's tricky
> to implement but we do maintain compatibility for years.
That's what I'm concerned about, with your suggestion
QEMU maintenance efforts will increase and we will be
stuck with ABI switching forever for giving ability to
run (unlikely) by default pre acpi-linker BIOS on
new machine types.
While my current approach makes a clean ABI switch
for new machine types with old machines supporting
legacy ABI.

I can try to rework series but even with  reviews and
testing on list there is a non zero chance of migration
related regressions.

Do you insist on implementing switching from legacy ABI
to a new one in runtime?

> Being orthogonal with bios version is very helpful for
> a variety of reasons such as debugging.
debugging and other unconventional uses can force legacy
hotplug with a property while leaving default new machine
type with a clean straightforward implementation.


> > > > +Register block size:
> > > > +    ACPI_CPU_HOTPLUG_REG_LEN = 12
> > > > +
> > > > +read access:    
> > > 
> > > So this implies acpi must scan all cpus on each event, and
> > > this seems too aggressive.
> > > I think we need something hierarchical where
> > > you read one level and know which cpus to probe.  
> > That's what we do for mem-hotplug as it's not
> > performance critical path.
> > 
> > In addition to that depending on guest OS/version
> > it will anyway do enumeration of all CPUs after
> > our hotplug AML method scanned all CPUs and
> > sent notifies.
> > 
> > not that I'm in favor of complicating this protocol,
> > but I wouldn't do it hierarchical,
> > that's what Notify(BUS_CHECK) is supposed to do
> > but it's broken on some guests.  
> 
> Interesting. Which ones? Would they be easy to detect?
I've experimented with it/retested some more.
It was RHEL6, I don't think we can detect it.

Anyway BUS_CHECK would only simplify code by eleminating
need for CPU_SCAN_METHOD and doing check from
SAT method. We would still need to keep big
custom notifier method to make hot remove work
(I've spent some time again trying to make it work
without if/else case switch but no success so far).

Using polling to get CPUs with hotplug events
won't save us much (~1K of 32bit reads per 512 CPUs).
STA checks that OSPM does will still do full scan.

So do you insist on redoing current full scan on
hotplug approach using polling?

> > So if I'd do a more complicated protocol,
> > I'd do polling from AML side telling QEMU
> >   if (cpu = give_me_next_cpu_with_event())
> >      switch_to_cpu(cpu)
> >      ...
> > 
> > that means adding extra state to migrate,
> > probably ther might be other issues I'm not aware of yet.
> > 
> > On the other hand mem-hotplug like interface
> > so far hasn't caused any issues
> > (that of cause doesn't mean that there aren't any).
> >   
> > > > +    offset:
> > > > +    [0x0-0x3] reserved
> > > > +    [0x4] CPU device status fields: (1 byte access)
> > > > +        bits:
> > > > +           0: Device is enabled and may be used by guest
> > > > +           1: Device insert event, used to distinguish device for which
> > > > +              no device check event to OSPM was issued.
> > > > +              It's valid only when bit 0 is set.
> > > > +           2: Device remove event, used to distinguish device for which
> > > > +              no device eject request to OSPM was issued.
> > > > +           3-7: reserved and should be ignored by OSPM
> > > > +    [0x5-0x7] reserved
> > > > +    [0x8] Command data: (DWORD access)
> > > > +          in case of error or unsupported command reads is 0xFFFFFFFF
> > > > +          current 'Command field' value:
> > > > +              0: returns PXM value corresponding to device
> > > > +
> > > > +write access:
> > > > +    offset:
> > > > +    [0x0-0x3] CPU selector: (DWORD access)
> > > > +              selects active CPU device. All following accesses to other
> > > > +              registers will read/store data from/to selected CPU.    
> > > 
> > > I've been thinking - is it time to add an EmbeddedControl interface?
> > > This way we have another namespace.  
> > 
> > There were patches on list with it some time ago
> > 
> >     "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC"
> > 
> > but we dropped it as bloatware because there weren't any
> > real need for it as current IO port interfaces aren't
> > going away any time soon and work just fine without EC.  
> 
> Right. But now you are removing them for >255 CPUs
> anyway.
it still occupies the same IO region that we allocated for
legacy hotplug, I'm just shrinking it.

On top of that EC protocol will channel all accesses via
1-byte channel making access even slower.
So as far as we are able to allocate some IO ports
without pain, I wouldn't use EC.

> > And if we were to move current interfaces into EC namespace
> > we would have to do it only for new machines types
> > and keep old code in place as well and a bunch of compat
> > code for a company.  
> 
> Hmm. Exactly like this current one?
memhotplug/ACPI PCI hotplug/cpuhotplug (legacy/new)

> > > Not insisting on it, just an idea.  
> > I wouldn't make it a necessary dependency and block this series.
> > Might be worth to look at again when there is usecases for it
> > without legacy stuff attached.  
> 
> and then we'll have to maintain 3 interfaces?
nope, I'd leave current/implemented usecases where they are,
only new stuff would go through EC.

> 
> > >   
> > > > +    [0x4] CPU device control fields: (1 byte access)
> > > > +        bits:
> > > > +            0: reserved, OSPM must clear it before writing to register.
> > > > +            1: if set to 1 clears device insert event, set by OSPM
> > > > +               after it has emitted device check event for the
> > > > +               selected CPU device
> > > > +            2: if set to 1 clears device remove event, set by OSPM
> > > > +               after it has emitted device eject request for the
> > > > +               selected CPU device
> > > > +            3: if set to 1 initiates device eject, set by OSPM when it
> > > > +               triggers CPU device removal and calls _EJ0 method
> > > > +            4-7: reserved, OSPM must clear them before writing to register
> > > > +    [0x5] Command field: (1 byte access)
> > > > +          value:
> > > > +            0: following reads from 'Command data' register returns PXM
> > > > +               value of device
> > > > +            1: following writes to 'Command data' register set OST event
> > > > +               register in QEMU
> > > > +            2: following writes to 'Command data' register set OST status
> > > > +               register in QEMU
> > > > +            other values: reserved
> > > > +    [0x6-0x7] reserved
> > > > +    [0x8] Command data: (DWORD access)
> > > > +          current 'Command field' value:
> > > > +              1: stores value into OST event register
> > > > +              2: stores value into OST status register, triggers
> > > > +                 ACPI_DEVICE_OST QMP event from QEMU to external applications
> > > > +                 with current values of OST event and status registers.
> > > > +            other values: reserved
> > > >  
> > > > -CPU hot-add/remove notification:
> > > > ------------------------------------------------------
> > > > -QEMU sets/clears corresponding CPU bit on hot-add/remove event.
> > > > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
> > > > -hot-(un)plug events.
> > > > +Selecting CPU device beyond possible range has no effect on platform:
> > > > +   - write accesses to CPU hot-plug registers not documented above are
> > > > +     ignored
> > > > +   - read accesses to CPU hot-plug registers not documented above return
> > > > +     all bits set to 1.    
> > > 
> > > why not 0?  
> > isn't hardware usually return 1s on unhandled IO reads?
> > (I even thought that's it was you who told me it)  
> 
> PCI does but it's special.
Ok, it doesn't matter whether it's 1s or 0s as far as it's something defined.
I've used 1s as in memhotplug.

> 
> > 
> > that's what has been done for memory hotplug, so I'm doing the same here.
> >   
> > >   
> > > > -- 
> > > > 1.8.3.1    
> > >
diff mbox

Patch

diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
index 340b751..c5bce6a 100644
--- a/docs/specs/acpi_cpu_hotplug.txt
+++ b/docs/specs/acpi_cpu_hotplug.txt
@@ -4,21 +4,85 @@  QEMU<->ACPI BIOS CPU hotplug interface
 QEMU supports CPU hotplug via ACPI. This document
 describes the interface between QEMU and the ACPI BIOS.
 
-ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
------------------------------------------
-
-Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
-hot-add/remove event to ACPI BIOS, via SCI interrupt.
+ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
+and hot-remove events.
 
+============================================
+Legacy ACPI CPU hotplug interface registers:
+--------------------------------------------
 CPU present bitmap for:
+  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
   ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
   PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
 ---------------------------------------------------------------
-One bit per CPU. Bit position reflects corresponding CPU APIC ID.
-Read-only.
+QEMU sets corresponding CPU bit on hot-add event and issues SCI
+with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
+to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
+
+=====================================
+ACPI CPU hotplug interface registers:
+-------------------------------------
+Register block base address:
+    ICH9-LPC IO port 0x0cd8
+    PIIX-PM  IO port 0xaf00
+Register block size:
+    ACPI_CPU_HOTPLUG_REG_LEN = 12
+
+read access:
+    offset:
+    [0x0-0x3] reserved
+    [0x4] CPU device status fields: (1 byte access)
+        bits:
+           0: Device is enabled and may be used by guest
+           1: Device insert event, used to distinguish device for which
+              no device check event to OSPM was issued.
+              It's valid only when bit 0 is set.
+           2: Device remove event, used to distinguish device for which
+              no device eject request to OSPM was issued.
+           3-7: reserved and should be ignored by OSPM
+    [0x5-0x7] reserved
+    [0x8] Command data: (DWORD access)
+          in case of error or unsupported command reads is 0xFFFFFFFF
+          current 'Command field' value:
+              0: returns PXM value corresponding to device
+
+write access:
+    offset:
+    [0x0-0x3] CPU selector: (DWORD access)
+              selects active CPU device. All following accesses to other
+              registers will read/store data from/to selected CPU.
+    [0x4] CPU device control fields: (1 byte access)
+        bits:
+            0: reserved, OSPM must clear it before writing to register.
+            1: if set to 1 clears device insert event, set by OSPM
+               after it has emitted device check event for the
+               selected CPU device
+            2: if set to 1 clears device remove event, set by OSPM
+               after it has emitted device eject request for the
+               selected CPU device
+            3: if set to 1 initiates device eject, set by OSPM when it
+               triggers CPU device removal and calls _EJ0 method
+            4-7: reserved, OSPM must clear them before writing to register
+    [0x5] Command field: (1 byte access)
+          value:
+            0: following reads from 'Command data' register returns PXM
+               value of device
+            1: following writes to 'Command data' register set OST event
+               register in QEMU
+            2: following writes to 'Command data' register set OST status
+               register in QEMU
+            other values: reserved
+    [0x6-0x7] reserved
+    [0x8] Command data: (DWORD access)
+          current 'Command field' value:
+              1: stores value into OST event register
+              2: stores value into OST status register, triggers
+                 ACPI_DEVICE_OST QMP event from QEMU to external applications
+                 with current values of OST event and status registers.
+            other values: reserved
 
-CPU hot-add/remove notification:
------------------------------------------------------
-QEMU sets/clears corresponding CPU bit on hot-add/remove event.
-CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
-hot-(un)plug events.
+Selecting CPU device beyond possible range has no effect on platform:
+   - write accesses to CPU hot-plug registers not documented above are
+     ignored
+   - read accesses to CPU hot-plug registers not documented above return
+     all bits set to 1.