Message ID | 1463496205-251412-16-git-send-email-imammedo@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote: > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------ > 1 file changed, 76 insertions(+), 12 deletions(-) > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt > index 340b751..c5bce6a 100644 > --- a/docs/specs/acpi_cpu_hotplug.txt > +++ b/docs/specs/acpi_cpu_hotplug.txt > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface > QEMU supports CPU hotplug via ACPI. This document > describes the interface between QEMU and the ACPI BIOS. > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access): > ------------------------------------------ > - > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU > -hot-add/remove event to ACPI BIOS, via SCI interrupt. > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add > +and hot-remove events. > > +============================================ > +Legacy ACPI CPU hotplug interface registers: > +-------------------------------------------- > CPU present bitmap for: > + One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only. > ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access) > PIIX-PM (IO port 0xaf00-0xaf1f, 1-byte access) > --------------------------------------------------------------- > -One bit per CPU. Bit position reflects corresponding CPU APIC ID. > -Read-only. > +QEMU sets corresponding CPU bit on hot-add event and issues SCI > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported. > + > +===================================== > +ACPI CPU hotplug interface registers: > +------------------------------------- > +Register block base address: > + ICH9-LPC IO port 0x0cd8 > + PIIX-PM IO port 0xaf00 OK but this means we either use legacy or new one, bot both, which is problematic for people using old seabios without acpi loading support and with -M pc. I don't say we must support them with >255 CPUs, but I do say we should make an effort for simple setups with <255 CPUs. > +Register block size: > + ACPI_CPU_HOTPLUG_REG_LEN = 12 > + > +read access: So this implies acpi must scan all cpus on each event, and this seems too aggressive. I think we need something hierarchical where you read one level and know which cpus to probe. > + offset: > + [0x0-0x3] reserved > + [0x4] CPU device status fields: (1 byte access) > + bits: > + 0: Device is enabled and may be used by guest > + 1: Device insert event, used to distinguish device for which > + no device check event to OSPM was issued. > + It's valid only when bit 0 is set. > + 2: Device remove event, used to distinguish device for which > + no device eject request to OSPM was issued. > + 3-7: reserved and should be ignored by OSPM > + [0x5-0x7] reserved > + [0x8] Command data: (DWORD access) > + in case of error or unsupported command reads is 0xFFFFFFFF > + current 'Command field' value: > + 0: returns PXM value corresponding to device > + > +write access: > + offset: > + [0x0-0x3] CPU selector: (DWORD access) > + selects active CPU device. All following accesses to other > + registers will read/store data from/to selected CPU. I've been thinking - is it time to add an EmbeddedControl interface? This way we have another namespace. Not insisting on it, just an idea. > + [0x4] CPU device control fields: (1 byte access) > + bits: > + 0: reserved, OSPM must clear it before writing to register. > + 1: if set to 1 clears device insert event, set by OSPM > + after it has emitted device check event for the > + selected CPU device > + 2: if set to 1 clears device remove event, set by OSPM > + after it has emitted device eject request for the > + selected CPU device > + 3: if set to 1 initiates device eject, set by OSPM when it > + triggers CPU device removal and calls _EJ0 method > + 4-7: reserved, OSPM must clear them before writing to register > + [0x5] Command field: (1 byte access) > + value: > + 0: following reads from 'Command data' register returns PXM > + value of device > + 1: following writes to 'Command data' register set OST event > + register in QEMU > + 2: following writes to 'Command data' register set OST status > + register in QEMU > + other values: reserved > + [0x6-0x7] reserved > + [0x8] Command data: (DWORD access) > + current 'Command field' value: > + 1: stores value into OST event register > + 2: stores value into OST status register, triggers > + ACPI_DEVICE_OST QMP event from QEMU to external applications > + with current values of OST event and status registers. > + other values: reserved > > -CPU hot-add/remove notification: > ------------------------------------------------------ > -QEMU sets/clears corresponding CPU bit on hot-add/remove event. > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU > -hot-(un)plug events. > +Selecting CPU device beyond possible range has no effect on platform: > + - write accesses to CPU hot-plug registers not documented above are > + ignored > + - read accesses to CPU hot-plug registers not documented above return > + all bits set to 1. why not 0? > -- > 1.8.3.1
On Tue, 31 May 2016 07:49:16 +0300 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote: > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > --- > > docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------ > > 1 file changed, 76 insertions(+), 12 deletions(-) > > > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt > > index 340b751..c5bce6a 100644 > > --- a/docs/specs/acpi_cpu_hotplug.txt > > +++ b/docs/specs/acpi_cpu_hotplug.txt > > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface > > QEMU supports CPU hotplug via ACPI. This document > > describes the interface between QEMU and the ACPI BIOS. > > > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access): > > ------------------------------------------ > > - > > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU > > -hot-add/remove event to ACPI BIOS, via SCI interrupt. > > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add > > +and hot-remove events. > > > > +============================================ > > +Legacy ACPI CPU hotplug interface registers: > > +-------------------------------------------- > > CPU present bitmap for: > > + One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only. > > ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access) > > PIIX-PM (IO port 0xaf00-0xaf1f, 1-byte access) > > --------------------------------------------------------------- > > -One bit per CPU. Bit position reflects corresponding CPU APIC ID. > > -Read-only. > > +QEMU sets corresponding CPU bit on hot-add event and issues SCI > > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler > > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported. > > + > > +===================================== > > +ACPI CPU hotplug interface registers: > > +------------------------------------- > > +Register block base address: > > + ICH9-LPC IO port 0x0cd8 > > + PIIX-PM IO port 0xaf00 > > OK but this means we either use legacy or new one, > bot both, which is problematic for people using old seabios > without acpi loading support and with -M pc. > > I don't say we must support them with >255 CPUs, > but I do say we should make an effort for simple > setups with <255 CPUs. it works with QEMU provided(shipped) BIOS, it works for migration case as legacy stays enables because of -M src_legacy_machine more that 255 cpus will break old BIOS in different ways /corrupt/hang/ depending on BIOS version. I'm not sure we should care about using old BIOS with new QEMU+new machine type though and allow expectations to be beyond what hw vendors usually set. It's the same as real hw does i.e. new hardware shipped with new BIOS. If user insist on old BIOS+new machine type he still has property knob to force legacy mode. on the fist glance, it's probably not that very hard to switch IO ports handling from legacy to new interface by sending from new cpu-hotplug AML a command to do so, my concerns here is: * +1 more state to migrate * probably issues with migration as target started with different IO layout * IO window freed after switching from legacy to new, will not be available to guest as it started with legacy window consumed by CPUS.CRS. * that legacy switching business is only PC specific means having a knob to turn it on so it won't pollute ARM all in all it's probably too much headache to make sure that improbable usecase would work, so after considering this idea I've dropped it and did it the way it's now. > > +Register block size: > > + ACPI_CPU_HOTPLUG_REG_LEN = 12 > > + > > +read access: > > So this implies acpi must scan all cpus on each event, and > this seems too aggressive. > I think we need something hierarchical where > you read one level and know which cpus to probe. That's what we do for mem-hotplug as it's not performance critical path. In addition to that depending on guest OS/version it will anyway do enumeration of all CPUs after our hotplug AML method scanned all CPUs and sent notifies. not that I'm in favor of complicating this protocol, but I wouldn't do it hierarchical, that's what Notify(BUS_CHECK) is supposed to do but it's broken on some guests. So if I'd do a more complicated protocol, I'd do polling from AML side telling QEMU if (cpu = give_me_next_cpu_with_event()) switch_to_cpu(cpu) ... that means adding extra state to migrate, probably ther might be other issues I'm not aware of yet. On the other hand mem-hotplug like interface so far hasn't caused any issues (that of cause doesn't mean that there aren't any). > > + offset: > > + [0x0-0x3] reserved > > + [0x4] CPU device status fields: (1 byte access) > > + bits: > > + 0: Device is enabled and may be used by guest > > + 1: Device insert event, used to distinguish device for which > > + no device check event to OSPM was issued. > > + It's valid only when bit 0 is set. > > + 2: Device remove event, used to distinguish device for which > > + no device eject request to OSPM was issued. > > + 3-7: reserved and should be ignored by OSPM > > + [0x5-0x7] reserved > > + [0x8] Command data: (DWORD access) > > + in case of error or unsupported command reads is 0xFFFFFFFF > > + current 'Command field' value: > > + 0: returns PXM value corresponding to device > > + > > +write access: > > + offset: > > + [0x0-0x3] CPU selector: (DWORD access) > > + selects active CPU device. All following accesses to other > > + registers will read/store data from/to selected CPU. > > I've been thinking - is it time to add an EmbeddedControl interface? > This way we have another namespace. There were patches on list with it some time ago "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC" but we dropped it as bloatware because there weren't any real need for it as current IO port interfaces aren't going away any time soon and work just fine without EC. And if we were to move current interfaces into EC namespace we would have to do it only for new machines types and keep old code in place as well and a bunch of compat code for a company. > Not insisting on it, just an idea. I wouldn't make it a necessary dependency and block this series. Might be worth to look at again when there is usecases for it without legacy stuff attached. > > > + [0x4] CPU device control fields: (1 byte access) > > + bits: > > + 0: reserved, OSPM must clear it before writing to register. > > + 1: if set to 1 clears device insert event, set by OSPM > > + after it has emitted device check event for the > > + selected CPU device > > + 2: if set to 1 clears device remove event, set by OSPM > > + after it has emitted device eject request for the > > + selected CPU device > > + 3: if set to 1 initiates device eject, set by OSPM when it > > + triggers CPU device removal and calls _EJ0 method > > + 4-7: reserved, OSPM must clear them before writing to register > > + [0x5] Command field: (1 byte access) > > + value: > > + 0: following reads from 'Command data' register returns PXM > > + value of device > > + 1: following writes to 'Command data' register set OST event > > + register in QEMU > > + 2: following writes to 'Command data' register set OST status > > + register in QEMU > > + other values: reserved > > + [0x6-0x7] reserved > > + [0x8] Command data: (DWORD access) > > + current 'Command field' value: > > + 1: stores value into OST event register > > + 2: stores value into OST status register, triggers > > + ACPI_DEVICE_OST QMP event from QEMU to external applications > > + with current values of OST event and status registers. > > + other values: reserved > > > > -CPU hot-add/remove notification: > > ------------------------------------------------------ > > -QEMU sets/clears corresponding CPU bit on hot-add/remove event. > > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU > > -hot-(un)plug events. > > +Selecting CPU device beyond possible range has no effect on platform: > > + - write accesses to CPU hot-plug registers not documented above are > > + ignored > > + - read accesses to CPU hot-plug registers not documented above return > > + all bits set to 1. > > why not 0? isn't hardware usually return 1s on unhandled IO reads? (I even thought that's it was you who told me it) that's what has been done for memory hotplug, so I'm doing the same here. > > > -- > > 1.8.3.1 >
On Tue, May 31, 2016 at 05:07:41PM +0200, Igor Mammedov wrote: > On Tue, 31 May 2016 07:49:16 +0300 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote: > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > > --- > > > docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------ > > > 1 file changed, 76 insertions(+), 12 deletions(-) > > > > > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt > > > index 340b751..c5bce6a 100644 > > > --- a/docs/specs/acpi_cpu_hotplug.txt > > > +++ b/docs/specs/acpi_cpu_hotplug.txt > > > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface > > > QEMU supports CPU hotplug via ACPI. This document > > > describes the interface between QEMU and the ACPI BIOS. > > > > > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access): > > > ------------------------------------------ > > > - > > > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU > > > -hot-add/remove event to ACPI BIOS, via SCI interrupt. > > > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add > > > +and hot-remove events. > > > > > > +============================================ > > > +Legacy ACPI CPU hotplug interface registers: > > > +-------------------------------------------- > > > CPU present bitmap for: > > > + One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only. > > > ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access) > > > PIIX-PM (IO port 0xaf00-0xaf1f, 1-byte access) > > > --------------------------------------------------------------- > > > -One bit per CPU. Bit position reflects corresponding CPU APIC ID. > > > -Read-only. > > > +QEMU sets corresponding CPU bit on hot-add event and issues SCI > > > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler > > > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported. > > > + > > > +===================================== > > > +ACPI CPU hotplug interface registers: > > > +------------------------------------- > > > +Register block base address: > > > + ICH9-LPC IO port 0x0cd8 > > > + PIIX-PM IO port 0xaf00 > > > > OK but this means we either use legacy or new one, > > bot both, which is problematic for people using old seabios > > without acpi loading support and with -M pc. > > > > I don't say we must support them with >255 CPUs, > > but I do say we should make an effort for simple > > setups with <255 CPUs. > > > it works with QEMU provided(shipped) BIOS, > it works for migration case as legacy stays enables because of -M src_legacy_machine > > more that 255 cpus will break old BIOS in different > ways /corrupt/hang/ depending on BIOS version. > > I'm not sure we should care about using old BIOS > with new QEMU+new machine type though and allow > expectations to be beyond what hw vendors usually set. > It's the same as real hw does i.e. new hardware > shipped with new BIOS. > If user insist on old BIOS+new machine type he still has > property knob to force legacy mode. > > on the fist glance, it's probably not that very hard > to switch IO ports handling from legacy to new interface > by sending from new cpu-hotplug AML a command to do so, > my concerns here is: > * +1 more state to migrate > * probably issues with migration as target started with > different IO layout > * IO window freed after switching from legacy to new, > will not be available to guest as it started with > legacy window consumed by CPUS.CRS. > * that legacy switching business is only PC specific > means having a knob to turn it on so it won't pollute ARM > > all in all it's probably too much headache to make sure > that improbable usecase would work, so after considering > this idea I've dropped it and did it the way it's now. Well I think it's worth the effort. I agree it's tricky to implement but we do maintain compatibility for years. Being orthogonal with bios version is very helpful for a variety of reasons such as debugging. > > > +Register block size: > > > + ACPI_CPU_HOTPLUG_REG_LEN = 12 > > > + > > > +read access: > > > > So this implies acpi must scan all cpus on each event, and > > this seems too aggressive. > > I think we need something hierarchical where > > you read one level and know which cpus to probe. > That's what we do for mem-hotplug as it's not > performance critical path. > > In addition to that depending on guest OS/version > it will anyway do enumeration of all CPUs after > our hotplug AML method scanned all CPUs and > sent notifies. > > not that I'm in favor of complicating this protocol, > but I wouldn't do it hierarchical, > that's what Notify(BUS_CHECK) is supposed to do > but it's broken on some guests. Interesting. Which ones? Would they be easy to detect? > So if I'd do a more complicated protocol, > I'd do polling from AML side telling QEMU > if (cpu = give_me_next_cpu_with_event()) > switch_to_cpu(cpu) > ... > > that means adding extra state to migrate, > probably ther might be other issues I'm not aware of yet. > > On the other hand mem-hotplug like interface > so far hasn't caused any issues > (that of cause doesn't mean that there aren't any). > > > > + offset: > > > + [0x0-0x3] reserved > > > + [0x4] CPU device status fields: (1 byte access) > > > + bits: > > > + 0: Device is enabled and may be used by guest > > > + 1: Device insert event, used to distinguish device for which > > > + no device check event to OSPM was issued. > > > + It's valid only when bit 0 is set. > > > + 2: Device remove event, used to distinguish device for which > > > + no device eject request to OSPM was issued. > > > + 3-7: reserved and should be ignored by OSPM > > > + [0x5-0x7] reserved > > > + [0x8] Command data: (DWORD access) > > > + in case of error or unsupported command reads is 0xFFFFFFFF > > > + current 'Command field' value: > > > + 0: returns PXM value corresponding to device > > > + > > > +write access: > > > + offset: > > > + [0x0-0x3] CPU selector: (DWORD access) > > > + selects active CPU device. All following accesses to other > > > + registers will read/store data from/to selected CPU. > > > > I've been thinking - is it time to add an EmbeddedControl interface? > > This way we have another namespace. > > There were patches on list with it some time ago > > "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC" > > but we dropped it as bloatware because there weren't any > real need for it as current IO port interfaces aren't > going away any time soon and work just fine without EC. Right. But now you are removing them for >255 CPUs anyway. > And if we were to move current interfaces into EC namespace > we would have to do it only for new machines types > and keep old code in place as well and a bunch of compat > code for a company. Hmm. Exactly like this current one? > > Not insisting on it, just an idea. > I wouldn't make it a necessary dependency and block this series. > Might be worth to look at again when there is usecases for it > without legacy stuff attached. and then we'll have to maintain 3 interfaces? > > > > > + [0x4] CPU device control fields: (1 byte access) > > > + bits: > > > + 0: reserved, OSPM must clear it before writing to register. > > > + 1: if set to 1 clears device insert event, set by OSPM > > > + after it has emitted device check event for the > > > + selected CPU device > > > + 2: if set to 1 clears device remove event, set by OSPM > > > + after it has emitted device eject request for the > > > + selected CPU device > > > + 3: if set to 1 initiates device eject, set by OSPM when it > > > + triggers CPU device removal and calls _EJ0 method > > > + 4-7: reserved, OSPM must clear them before writing to register > > > + [0x5] Command field: (1 byte access) > > > + value: > > > + 0: following reads from 'Command data' register returns PXM > > > + value of device > > > + 1: following writes to 'Command data' register set OST event > > > + register in QEMU > > > + 2: following writes to 'Command data' register set OST status > > > + register in QEMU > > > + other values: reserved > > > + [0x6-0x7] reserved > > > + [0x8] Command data: (DWORD access) > > > + current 'Command field' value: > > > + 1: stores value into OST event register > > > + 2: stores value into OST status register, triggers > > > + ACPI_DEVICE_OST QMP event from QEMU to external applications > > > + with current values of OST event and status registers. > > > + other values: reserved > > > > > > -CPU hot-add/remove notification: > > > ------------------------------------------------------ > > > -QEMU sets/clears corresponding CPU bit on hot-add/remove event. > > > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU > > > -hot-(un)plug events. > > > +Selecting CPU device beyond possible range has no effect on platform: > > > + - write accesses to CPU hot-plug registers not documented above are > > > + ignored > > > + - read accesses to CPU hot-plug registers not documented above return > > > + all bits set to 1. > > > > why not 0? > isn't hardware usually return 1s on unhandled IO reads? > (I even thought that's it was you who told me it) PCI does but it's special. > > that's what has been done for memory hotplug, so I'm doing the same here. > > > > > > -- > > > 1.8.3.1 > >
On Wed, 1 Jun 2016 00:09:57 +0300 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Tue, May 31, 2016 at 05:07:41PM +0200, Igor Mammedov wrote: > > On Tue, 31 May 2016 07:49:16 +0300 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote: > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > > > --- > > > > docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------ > > > > 1 file changed, 76 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt > > > > index 340b751..c5bce6a 100644 > > > > --- a/docs/specs/acpi_cpu_hotplug.txt > > > > +++ b/docs/specs/acpi_cpu_hotplug.txt > > > > @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface > > > > QEMU supports CPU hotplug via ACPI. This document > > > > describes the interface between QEMU and the ACPI BIOS. > > > > > > > > -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access): > > > > ------------------------------------------ > > > > - > > > > -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU > > > > -hot-add/remove event to ACPI BIOS, via SCI interrupt. > > > > +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add > > > > +and hot-remove events. > > > > > > > > +============================================ > > > > +Legacy ACPI CPU hotplug interface registers: > > > > +-------------------------------------------- > > > > CPU present bitmap for: > > > > + One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only. > > > > ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access) > > > > PIIX-PM (IO port 0xaf00-0xaf1f, 1-byte access) > > > > --------------------------------------------------------------- > > > > -One bit per CPU. Bit position reflects corresponding CPU APIC ID. > > > > -Read-only. > > > > +QEMU sets corresponding CPU bit on hot-add event and issues SCI > > > > +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler > > > > +to notify OS about CPU hot-add events. CPU hot-remove isn't supported. > > > > + > > > > +===================================== > > > > +ACPI CPU hotplug interface registers: > > > > +------------------------------------- > > > > +Register block base address: > > > > + ICH9-LPC IO port 0x0cd8 > > > > + PIIX-PM IO port 0xaf00 > > > > > > OK but this means we either use legacy or new one, > > > bot both, which is problematic for people using old seabios > > > without acpi loading support and with -M pc. > > > > > > I don't say we must support them with >255 CPUs, > > > but I do say we should make an effort for simple > > > setups with <255 CPUs. > > > > > > it works with QEMU provided(shipped) BIOS, > > it works for migration case as legacy stays enables because of -M src_legacy_machine > > > > more that 255 cpus will break old BIOS in different > > ways /corrupt/hang/ depending on BIOS version. > > > > I'm not sure we should care about using old BIOS > > with new QEMU+new machine type though and allow > > expectations to be beyond what hw vendors usually set. > > It's the same as real hw does i.e. new hardware > > shipped with new BIOS. > > If user insist on old BIOS+new machine type he still has > > property knob to force legacy mode. > > > > on the fist glance, it's probably not that very hard > > to switch IO ports handling from legacy to new interface > > by sending from new cpu-hotplug AML a command to do so, > > my concerns here is: > > * +1 more state to migrate > > * probably issues with migration as target started with > > different IO layout > > * IO window freed after switching from legacy to new, > > will not be available to guest as it started with > > legacy window consumed by CPUS.CRS. > > * that legacy switching business is only PC specific > > means having a knob to turn it on so it won't pollute ARM > > > > all in all it's probably too much headache to make sure > > that improbable usecase would work, so after considering > > this idea I've dropped it and did it the way it's now. > > Well I think it's worth the effort. I agree it's tricky > to implement but we do maintain compatibility for years. That's what I'm concerned about, with your suggestion QEMU maintenance efforts will increase and we will be stuck with ABI switching forever for giving ability to run (unlikely) by default pre acpi-linker BIOS on new machine types. While my current approach makes a clean ABI switch for new machine types with old machines supporting legacy ABI. I can try to rework series but even with reviews and testing on list there is a non zero chance of migration related regressions. Do you insist on implementing switching from legacy ABI to a new one in runtime? > Being orthogonal with bios version is very helpful for > a variety of reasons such as debugging. debugging and other unconventional uses can force legacy hotplug with a property while leaving default new machine type with a clean straightforward implementation. > > > > +Register block size: > > > > + ACPI_CPU_HOTPLUG_REG_LEN = 12 > > > > + > > > > +read access: > > > > > > So this implies acpi must scan all cpus on each event, and > > > this seems too aggressive. > > > I think we need something hierarchical where > > > you read one level and know which cpus to probe. > > That's what we do for mem-hotplug as it's not > > performance critical path. > > > > In addition to that depending on guest OS/version > > it will anyway do enumeration of all CPUs after > > our hotplug AML method scanned all CPUs and > > sent notifies. > > > > not that I'm in favor of complicating this protocol, > > but I wouldn't do it hierarchical, > > that's what Notify(BUS_CHECK) is supposed to do > > but it's broken on some guests. > > Interesting. Which ones? Would they be easy to detect? I've experimented with it/retested some more. It was RHEL6, I don't think we can detect it. Anyway BUS_CHECK would only simplify code by eleminating need for CPU_SCAN_METHOD and doing check from SAT method. We would still need to keep big custom notifier method to make hot remove work (I've spent some time again trying to make it work without if/else case switch but no success so far). Using polling to get CPUs with hotplug events won't save us much (~1K of 32bit reads per 512 CPUs). STA checks that OSPM does will still do full scan. So do you insist on redoing current full scan on hotplug approach using polling? > > So if I'd do a more complicated protocol, > > I'd do polling from AML side telling QEMU > > if (cpu = give_me_next_cpu_with_event()) > > switch_to_cpu(cpu) > > ... > > > > that means adding extra state to migrate, > > probably ther might be other issues I'm not aware of yet. > > > > On the other hand mem-hotplug like interface > > so far hasn't caused any issues > > (that of cause doesn't mean that there aren't any). > > > > > > + offset: > > > > + [0x0-0x3] reserved > > > > + [0x4] CPU device status fields: (1 byte access) > > > > + bits: > > > > + 0: Device is enabled and may be used by guest > > > > + 1: Device insert event, used to distinguish device for which > > > > + no device check event to OSPM was issued. > > > > + It's valid only when bit 0 is set. > > > > + 2: Device remove event, used to distinguish device for which > > > > + no device eject request to OSPM was issued. > > > > + 3-7: reserved and should be ignored by OSPM > > > > + [0x5-0x7] reserved > > > > + [0x8] Command data: (DWORD access) > > > > + in case of error or unsupported command reads is 0xFFFFFFFF > > > > + current 'Command field' value: > > > > + 0: returns PXM value corresponding to device > > > > + > > > > +write access: > > > > + offset: > > > > + [0x0-0x3] CPU selector: (DWORD access) > > > > + selects active CPU device. All following accesses to other > > > > + registers will read/store data from/to selected CPU. > > > > > > I've been thinking - is it time to add an EmbeddedControl interface? > > > This way we have another namespace. > > > > There were patches on list with it some time ago > > > > "[PATCH RFC v2 0/7] coordinate cpu hotplug/unplug bewteen QEMU and kernel by EC" > > > > but we dropped it as bloatware because there weren't any > > real need for it as current IO port interfaces aren't > > going away any time soon and work just fine without EC. > > Right. But now you are removing them for >255 CPUs > anyway. it still occupies the same IO region that we allocated for legacy hotplug, I'm just shrinking it. On top of that EC protocol will channel all accesses via 1-byte channel making access even slower. So as far as we are able to allocate some IO ports without pain, I wouldn't use EC. > > And if we were to move current interfaces into EC namespace > > we would have to do it only for new machines types > > and keep old code in place as well and a bunch of compat > > code for a company. > > Hmm. Exactly like this current one? memhotplug/ACPI PCI hotplug/cpuhotplug (legacy/new) > > > Not insisting on it, just an idea. > > I wouldn't make it a necessary dependency and block this series. > > Might be worth to look at again when there is usecases for it > > without legacy stuff attached. > > and then we'll have to maintain 3 interfaces? nope, I'd leave current/implemented usecases where they are, only new stuff would go through EC. > > > > > > > > + [0x4] CPU device control fields: (1 byte access) > > > > + bits: > > > > + 0: reserved, OSPM must clear it before writing to register. > > > > + 1: if set to 1 clears device insert event, set by OSPM > > > > + after it has emitted device check event for the > > > > + selected CPU device > > > > + 2: if set to 1 clears device remove event, set by OSPM > > > > + after it has emitted device eject request for the > > > > + selected CPU device > > > > + 3: if set to 1 initiates device eject, set by OSPM when it > > > > + triggers CPU device removal and calls _EJ0 method > > > > + 4-7: reserved, OSPM must clear them before writing to register > > > > + [0x5] Command field: (1 byte access) > > > > + value: > > > > + 0: following reads from 'Command data' register returns PXM > > > > + value of device > > > > + 1: following writes to 'Command data' register set OST event > > > > + register in QEMU > > > > + 2: following writes to 'Command data' register set OST status > > > > + register in QEMU > > > > + other values: reserved > > > > + [0x6-0x7] reserved > > > > + [0x8] Command data: (DWORD access) > > > > + current 'Command field' value: > > > > + 1: stores value into OST event register > > > > + 2: stores value into OST status register, triggers > > > > + ACPI_DEVICE_OST QMP event from QEMU to external applications > > > > + with current values of OST event and status registers. > > > > + other values: reserved > > > > > > > > -CPU hot-add/remove notification: > > > > ------------------------------------------------------ > > > > -QEMU sets/clears corresponding CPU bit on hot-add/remove event. > > > > -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU > > > > -hot-(un)plug events. > > > > +Selecting CPU device beyond possible range has no effect on platform: > > > > + - write accesses to CPU hot-plug registers not documented above are > > > > + ignored > > > > + - read accesses to CPU hot-plug registers not documented above return > > > > + all bits set to 1. > > > > > > why not 0? > > isn't hardware usually return 1s on unhandled IO reads? > > (I even thought that's it was you who told me it) > > PCI does but it's special. Ok, it doesn't matter whether it's 1s or 0s as far as it's something defined. I've used 1s as in memhotplug. > > > > > that's what has been done for memory hotplug, so I'm doing the same here. > > > > > > > > > -- > > > > 1.8.3.1 > > >
diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt index 340b751..c5bce6a 100644 --- a/docs/specs/acpi_cpu_hotplug.txt +++ b/docs/specs/acpi_cpu_hotplug.txt @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface QEMU supports CPU hotplug via ACPI. This document describes the interface between QEMU and the ACPI BIOS. -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access): ------------------------------------------ - -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU -hot-add/remove event to ACPI BIOS, via SCI interrupt. +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add +and hot-remove events. +============================================ +Legacy ACPI CPU hotplug interface registers: +-------------------------------------------- CPU present bitmap for: + One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only. ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access) PIIX-PM (IO port 0xaf00-0xaf1f, 1-byte access) --------------------------------------------------------------- -One bit per CPU. Bit position reflects corresponding CPU APIC ID. -Read-only. +QEMU sets corresponding CPU bit on hot-add event and issues SCI +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler +to notify OS about CPU hot-add events. CPU hot-remove isn't supported. + +===================================== +ACPI CPU hotplug interface registers: +------------------------------------- +Register block base address: + ICH9-LPC IO port 0x0cd8 + PIIX-PM IO port 0xaf00 +Register block size: + ACPI_CPU_HOTPLUG_REG_LEN = 12 + +read access: + offset: + [0x0-0x3] reserved + [0x4] CPU device status fields: (1 byte access) + bits: + 0: Device is enabled and may be used by guest + 1: Device insert event, used to distinguish device for which + no device check event to OSPM was issued. + It's valid only when bit 0 is set. + 2: Device remove event, used to distinguish device for which + no device eject request to OSPM was issued. + 3-7: reserved and should be ignored by OSPM + [0x5-0x7] reserved + [0x8] Command data: (DWORD access) + in case of error or unsupported command reads is 0xFFFFFFFF + current 'Command field' value: + 0: returns PXM value corresponding to device + +write access: + offset: + [0x0-0x3] CPU selector: (DWORD access) + selects active CPU device. All following accesses to other + registers will read/store data from/to selected CPU. + [0x4] CPU device control fields: (1 byte access) + bits: + 0: reserved, OSPM must clear it before writing to register. + 1: if set to 1 clears device insert event, set by OSPM + after it has emitted device check event for the + selected CPU device + 2: if set to 1 clears device remove event, set by OSPM + after it has emitted device eject request for the + selected CPU device + 3: if set to 1 initiates device eject, set by OSPM when it + triggers CPU device removal and calls _EJ0 method + 4-7: reserved, OSPM must clear them before writing to register + [0x5] Command field: (1 byte access) + value: + 0: following reads from 'Command data' register returns PXM + value of device + 1: following writes to 'Command data' register set OST event + register in QEMU + 2: following writes to 'Command data' register set OST status + register in QEMU + other values: reserved + [0x6-0x7] reserved + [0x8] Command data: (DWORD access) + current 'Command field' value: + 1: stores value into OST event register + 2: stores value into OST status register, triggers + ACPI_DEVICE_OST QMP event from QEMU to external applications + with current values of OST event and status registers. + other values: reserved -CPU hot-add/remove notification: ------------------------------------------------------ -QEMU sets/clears corresponding CPU bit on hot-add/remove event. -CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU -hot-(un)plug events. +Selecting CPU device beyond possible range has no effect on platform: + - write accesses to CPU hot-plug registers not documented above are + ignored + - read accesses to CPU hot-plug registers not documented above return + all bits set to 1.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- docs/specs/acpi_cpu_hotplug.txt | 88 +++++++++++++++++++++++++++++++++++------ 1 file changed, 76 insertions(+), 12 deletions(-)