[V11,0/8] add pvpanic mmio support

Message ID	1543865209-50994-1-git-send-email-peng.hao2@zte.com.cn (mailing list archive)
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: Peng Hao <peng.hao2@zte.com.cn> To: peter.maydell@linaro.org, drjones@redhat.com, philmd@redhat.com Date: Tue, 4 Dec 2018 03:26:41 +0800 Message-Id: <1543865209-50994-1-git-send-email-peng.hao2@zte.com.cn> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: [Qemu-devel] [PATCH V11 0/8] add pvpanic mmio support Precedence: list Cc: Peng Hao <peng.hao2@zte.com.cn>, qemu-arm@nongnu.org, qemu-devel@nongnu.org Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	add pvpanic mmio support \| expand [V11,0/8] add pvpanic mmio support [V11,1/8] hw/misc/pvpanic: Build the pvpanic device in $(common-obj) [V11,2/8] hw/misc/pvpanic: Cosmetic renaming [V11,3/8] hw/misc/pvpanic: Add the MMIO interface [V11,4/8] hw/arm/virt: Use the pvpanic device [V11,5/8] hw/arm/virt: add pvpanic device in virt acpi table [V11,6/8] hw/arm/virt: add configure interface for pvpanic-mmio [V11,7/8] hw/arm/virt: use the configure interface [V11,8/8] pvpanic : update pvpanic document

Peng Hao Dec. 3, 2018, 7:26 p.m. UTC

The first patches are simple cleanups:
- patch 1 move the pvpanic device with the 'ocmmon objects' so we compile
it once for the x86/arm/aarch64 archs,
- patch 2 simply renames ISA fields/definitions to generic ones.

Then instead of add/use the MMIO pvpanic device in the virt machine in an
unique patch, I split it in two distinct patches:
- patch 3 uses Peng Hao's work, but add the MMIO interface to the existing
device (no logical change).
- patch 4 is Peng Hao's work in the virt machine (no logical change).
- patch 5 add pvpanic device in acpi table in virt machine
v2 from Peng Hao is:
https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03433.html

v3 --> v4
patch 1,2 no modification.
patch 3, add TYPE_PANIC_MMIO for distinguishing different bus device,
virt + isa_pvpanic will abnormally terminate virtual machine.
patch 4, "pvpanic,mmio" --> "qemu,pvpanic-mmio".
patch 5, newly added.

v4 --> v5
patch 1,2 no modification.
patch 3 delete PvpanicCommonState structure.
patch 4 VIRT_PVPANIC_MMIO --> VIRT_PVPANIC
correct VIRT_PVPANIC's overlap start address
patch 5 no modification.

v5 --> v6
add document.

v6 --> v7
patch 5 modify device name from "PANC" to "PEVT".
patch 6 modify document description.

v7 --> v8
add configure interface for pvpanic-mmio

v8 --> v9
revert "moving structure definition to header file"
because of compile error in x86.

v9 --> v10
Modify document.
Repair missing header files.

v10 --> v11
change configure interface in virt machine configure parameters.

the kernel part of the series:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/log/?h=char-misc-testing
misc/pvpanic: remove a redundant comma
misc/pvpanic: convert to SPDX license tags
misc/pvpanic: change header file sort style
misc/pvpanic: remove unnecessary header file
misc/pvpanic : break dependency on ACPI
misc/pvpanic : grouping ACPI related stuff
misc/pvpanic: add support to get pvpanic device info FDT
dt-bindings: misc/pvpanic: add document for pvpanic-mmio
misc/pvpanic: add MMIO support
misc/pvpanic: simplify the code using acpi_dev_resource_io
pvpanic: move pvpanic to misc as common driver

Philippe Mathieu-Daudé (2):
hw/misc/pvpanic: Build the pvpanic device in $(common-obj)
hw/misc/pvpanic: Cosmetic renaming

Peng Hao (6):
hw/misc/pvpanic: Add the MMIO interface
hw/arm/virt: Use the pvpanic device
hw/arm/virt: add pvpanic device in virt acpi table
hw/misc/pvpanic: add configure interface for pvpanic-mmio
hw/misc/pvpanic: realize the configure interface
pvpanic : update pvpanic document

Peter Maydell Dec. 3, 2018, 6:50 p.m. UTC | #1

On Mon, 3 Dec 2018 at 11:04, Peng Hao <peng.hao2@zte.com.cn> wrote:
>
> The first patches are simple cleanups:
>      - patch 1 move the pvpanic device with the 'ocmmon objects' so we compile
>                it once for the x86/arm/aarch64 archs,
>      - patch 2 simply renames ISA fields/definitions to generic ones.
>
>      Then instead of add/use the MMIO pvpanic device in the virt machine in an
>      unique patch, I split it in two distinct patches:
>      - patch 3 uses Peng Hao's work, but add the MMIO interface to the existing
>         device (no logical change).
>      - patch 4 is Peng Hao's work in the virt machine (no logical change).
>      - patch 5 add pvpanic device in acpi table in virt machine
>      v2 from Peng Hao is:
>      https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03433.html

I would still prefer to see a more detailed examination of whether
we can do this with a PCI device before we commit to taking the
MMIO version into the virt board.

thanks
-- PMM

Peng Hao Dec. 4, 2018, 12:41 a.m. UTC | #2

>On Mon, 3 Dec 2018 at 11:04, Peng Hao <peng.hao2@zte.com.cn> wrote:
>>
>> The first patches are simple cleanups:
>>      - patch 1 move the pvpanic device with the 'ocmmon objects' so we compile
>>                it once for the x86/arm/aarch64 archs,
>>      - patch 2 simply renames ISA fields/definitions to generic ones.
>>
>>      Then instead of add/use the MMIO pvpanic device in the virt machine in an
>>      unique patch, I split it in two distinct patches:
>>      - patch 3 uses Peng Hao's work, but add the MMIO interface to the existing
>>         device (no logical change).
>>      - patch 4 is Peng Hao's work in the virt machine (no logical change).
>>      - patch 5 add pvpanic device in acpi table in virt machine
>>      v2 from Peng Hao is:
>>      https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03433.html
>
>I would still prefer to see a more detailed examination of whether
>we can do this with a PCI device before we commit to taking the
>MMIO version into the virt board.

I'm sorry I thought I had sent an email. yesterday when I wrote an email to 
explain the reason, I was interrupted and forgot to send it out.

Now the pvpanic device is implemented as a mmio device or an ACPI device in the kernel, 
and only one device can be seen at the same time. If the kernel parses FDT first, then pvpanic  
is a mmio device. The kernel parses ACPI table first(and virtual machine is configured with ACPI), 
and pvpanic  is an ACPI device.
If pvpanic is implemented as a PCI device, then the PCI device must still be seen when the ACPI table 
is first parsed by the kernel, because ACPI device relies on the mmio space of the PCI device.
Mmio devices can be thought of as just an address space rather than a device in the strict sense.

Secondly, I don't want it to be a pluggable device. If the user deletes the device by mistake, it may lead to unpredictable results.

thanks.
>thanks
>-- PMM

Peter Maydell Dec. 4, 2018, 9:40 a.m. UTC | #3

On Tue, 4 Dec 2018 at 00:41, <peng.hao2@zte.com.cn> wrote:
>
> >I would still prefer to see a more detailed examination of whether
> >we can do this with a PCI device before we commit to taking the
> >MMIO version into the virt board.
>
> I'm sorry I thought I had sent an email. yesterday when I wrote an email to
> explain the reason, I was interrupted and forgot to send it out.
>
> Now the pvpanic device is implemented as a mmio device or an ACPI device in the kernel,
> and only one device can be seen at the same time. If the kernel parses FDT first, then pvpanic
> is a mmio device. The kernel parses ACPI table first(and virtual machine is configured with ACPI),
> and pvpanic  is an ACPI device.
> If pvpanic is implemented as a PCI device, then the PCI device must still be seen when the ACPI table
> is first parsed by the kernel, because ACPI device relies on the mmio space of the PCI device.
> Mmio devices can be thought of as just an address space rather than a device in the strict sense.

I'm afraid I don't understand. If it's a PCI device then
it does not need to be listed in the device tree or the
ACPI tables at all, because it is probeable by the guest.
This also significantly simplifies the changes needed in QEMU.

> Secondly, I don't want it to be a pluggable device. If the user
> deletes the device by mistake, it may lead to unpredictable results.

If the user deletes the PCI device they're using for their
disk or networking this will also lead to unpredictable
results. We expect users not to randomly unplug things from
their system if they want it to continue to work. In any
case your guest driver can easily handle the unplug: the
guest would then just lose the ability to notify on panic,
falling back to as if the pvpanic device had never been
present.

thanks
-- PMM

Andrew Jones Dec. 4, 2018, 12:05 p.m. UTC | #4

On Tue, Dec 04, 2018 at 09:40:07AM +0000, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 00:41, <peng.hao2@zte.com.cn> wrote:
> >
> > >I would still prefer to see a more detailed examination of whether
> > >we can do this with a PCI device before we commit to taking the
> > >MMIO version into the virt board.
> >
> > I'm sorry I thought I had sent an email. yesterday when I wrote an email to
> > explain the reason, I was interrupted and forgot to send it out.
> >
> > Now the pvpanic device is implemented as a mmio device or an ACPI device in the kernel,
> > and only one device can be seen at the same time. If the kernel parses FDT first, then pvpanic
> > is a mmio device. The kernel parses ACPI table first(and virtual machine is configured with ACPI),
> > and pvpanic  is an ACPI device.
> > If pvpanic is implemented as a PCI device, then the PCI device must still be seen when the ACPI table
> > is first parsed by the kernel, because ACPI device relies on the mmio space of the PCI device.
> > Mmio devices can be thought of as just an address space rather than a device in the strict sense.
> 
> I'm afraid I don't understand. If it's a PCI device then
> it does not need to be listed in the device tree or the
> ACPI tables at all, because it is probeable by the guest.
> This also significantly simplifies the changes needed in QEMU.
> 
> > Secondly, I don't want it to be a pluggable device. If the user
> > deletes the device by mistake, it may lead to unpredictable results.
> 
> If the user deletes the PCI device they're using for their
> disk or networking this will also lead to unpredictable
> results. We expect users not to randomly unplug things from
> their system if they want it to continue to work. In any
> case your guest driver can easily handle the unplug: the
> guest would then just lose the ability to notify on panic,
> falling back to as if the pvpanic device had never been
> present.
>

To muddy the waters a bit more, while I'm not opposed to this device
being a PCI device, there is a chance that someone will still want a
platform-mmio version as well. I'm not sure how everything will
eventually fall into place, but I've seen some super minimal guest
configs proposed for the VMs-used-like-containers use cases, even
configs that choose to use virtio-mmio over virtio-pci, and then not
provide a PCI bus at all to the vm.

Maybe this series and the current kernel series can be allowed to
continue as they are, and if later there's a demand for a pci version,
it could just be yet another variant added later?

Thanks,
drew

Daniel P. Berrangé Dec. 4, 2018, 12:47 p.m. UTC | #5

On Mon, Dec 03, 2018 at 06:50:03PM +0000, Peter Maydell wrote:
> On Mon, 3 Dec 2018 at 11:04, Peng Hao <peng.hao2@zte.com.cn> wrote:
> >
> > The first patches are simple cleanups:
> >      - patch 1 move the pvpanic device with the 'ocmmon objects' so we compile
> >                it once for the x86/arm/aarch64 archs,
> >      - patch 2 simply renames ISA fields/definitions to generic ones.
> >
> >      Then instead of add/use the MMIO pvpanic device in the virt machine in an
> >      unique patch, I split it in two distinct patches:
> >      - patch 3 uses Peng Hao's work, but add the MMIO interface to the existing
> >         device (no logical change).
> >      - patch 4 is Peng Hao's work in the virt machine (no logical change).
> >      - patch 5 add pvpanic device in acpi table in virt machine
> >      v2 from Peng Hao is:
> >      https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03433.html
> 
> I would still prefer to see a more detailed examination of whether
> we can do this with a PCI device before we commit to taking the
> MMIO version into the virt board.

The original design rational for using an I/O port is mentioned here,
though it is quite brief:

  http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04361.html

[quote]
We have three solutions to implement this feature:
1. use vmcall
2. use I/O port
3. use virtio-serial.

We have decided to avoid touching hypervisor. The reason why I choose
choose the I/O port is:
1. it is easier to implememt
2. it does not depend any virtual device
3. it can work when starting the kernel
[/quote]

Later postings then exposed the port via ACPI

  http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg02293.html

After it had merged there were some changes and the question of turning
it into a PCI device was raised. Paolo was concerned that the guest OS
is in an unknown state (arbitrary locks held, data structures corrupt,
etc) when panic is fired, so simplicity of the I/O port was desirable:

  https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03309.html

Anthony countered that even a PCI device could simply do an outb() in
config space:

  https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03325.html

So it is not clear using a PCI device is in fact a problem in terms of
reliability at time of firing.

Perhaps more relevant is the question of how easily it can be initialized,
as that affects whether it can be used for panics during very early boot,
or from firmware:

  https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03296.html

Finally, there is also the point that PCI slots are precious, and this
is something to be enabled out of the box on all VMs, so you'd be removing
one extra PCI slot from general usage. Thus mgmt apps would need to start
adding PCI bridges sooner. Not a blocker but something to bear in mind
when weighing up options.

Regards,
Daniel

Peter Maydell Dec. 4, 2018, 12:48 p.m. UTC | #6

On Tue, 4 Dec 2018 at 12:05, Andrew Jones <drjones@redhat.com> wrote:
> To muddy the waters a bit more, while I'm not opposed to this device
> being a PCI device, there is a chance that someone will still want a
> platform-mmio version as well. I'm not sure how everything will
> eventually fall into place, but I've seen some super minimal guest
> configs proposed for the VMs-used-like-containers use cases, even
> configs that choose to use virtio-mmio over virtio-pci, and then not
> provide a PCI bus at all to the vm.
>
> Maybe this series and the current kernel series can be allowed to
> continue as they are, and if later there's a demand for a pci version,
> it could just be yet another variant added later?

I like the PCI device idea because it avoids adding yet
another complication to the virt board. Adding the MMIO
variant now and the PCI device later doesn't let us avoid
the complication or even remove it in the future.

The last I heard we were still thinking about deprecating
virtio-mmio entirely.

thanks
-- PMM

Peter Maydell Dec. 4, 2018, 12:59 p.m. UTC | #7

On Tue, 4 Dec 2018 at 12:47, Daniel P. Berrangé <berrange@redhat.com> wrote:
> After it had merged there were some changes and the question of turning
> it into a PCI device was raised. Paolo was concerned that the guest OS
> is in an unknown state (arbitrary locks held, data structures corrupt,
> etc) when panic is fired, so simplicity of the I/O port was desirable:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03309.html
>
> Anthony countered that even a PCI device could simply do an outb() in
> config space:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03325.html
>
> So it is not clear using a PCI device is in fact a problem in terms of
> reliability at time of firing.

...and if we'd done it that way in the first place for x86 then
we wouldn't be having to do anything at all now for Arm.
That suggests to me that we should do it that way now, and then we
can avoid having to do a bunch of extra development work for the
next architecture, or the next interesting Arm board model.

> Perhaps more relevant is the question of how easily it can be initialized,
> as that affects whether it can be used for panics during very early boot,
> or from firmware:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03296.html

Mmm, firmware and early boot panic is potentially an interesting
use case. Does UEFI actually make use of it, though? A quick look
at the kernel source for the x86 pvpanic driver suggests it doesn't
take any particular steps to ensure that it is initialized early,
though I guess acpi drivers probably init before PCI.

I notice also that there's a mention in that thread that the pvpanic
ACPI table entry on x86 resulted in unhelpful Windows notifications
about new devices it didn't understand. Is that going to be an issue
for Arm with this mmio pvpanic ?

> Finally, there is also the point that PCI slots are precious, and this
> is something to be enabled out of the box on all VMs, so you'd be removing
> one extra PCI slot from general usage. Thus mgmt apps would need to start
> adding PCI bridges sooner. Not a blocker but something to bear in mind
> when weighing up options.

Space in the 'virt' memory map for random MMIO devices is also a limited
resource, especially if you want something in the <4GB space.

thanks
-- PMM

Paolo Bonzini Dec. 4, 2018, 1:30 p.m. UTC | #8

On 04/12/18 13:59, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 12:47, Daniel P. Berrangé <berrange@redhat.com> wrote:
>> After it had merged there were some changes and the question of turning
>> it into a PCI device was raised. Paolo was concerned that the guest OS
>> is in an unknown state (arbitrary locks held, data structures corrupt,
>> etc) when panic is fired, so simplicity of the I/O port was desirable:
>>
>>   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03309.html

(Actually that was Marcelo, I was just relaying the message).

>> Anthony countered that even a PCI device could simply do an outb() in
>> config space:
>>
>>   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03325.html
>>
>> So it is not clear using a PCI device is in fact a problem in terms of
>> reliability at time of firing.
> 
> ...and if we'd done it that way in the first place for x86 then
> we wouldn't be having to do anything at all now for Arm.
> That suggests to me that we should do it that way now, and then we
> can avoid having to do a bunch of extra development work for the
> next architecture, or the next interesting Arm board model.

Is there any case where we have anything but ISA and MMIO?  (We have
8250 which is ISA, PCI and MMIO, but that's kind of special because PCI
is only there for hotpluggability.  pvpanic hotplug is not interesting).

Also, while reusing code in general is nice, sometimes there are
platform-specific ways to do it.  For ARM, for example, would it make
sense to use an HVC/SMC that "extends" the PSCI, and pass the number in
the PSCI device tree node?

Related to this, is there a more or less "standard" watchdog device on
ARM that could be added to virt?  There is the SBSA watchdog, but it's
ugly for implementation in KVM because it counts down with frequency
equal to CNTFRQ (which I'm not sure if QEMU has access too, and also it
doesn't play well with live migration).

> I notice also that there's a mention in that thread that the pvpanic
> ACPI table entry on x86 resulted in unhelpful Windows notifications
> about new devices it didn't understand. Is that going to be an issue
> for Arm with this mmio pvpanic ?

Yes, it is probably the same as for x86.

Paolo

Peter Maydell Dec. 4, 2018, 1:43 p.m. UTC | #9

On Tue, 4 Dec 2018 at 13:30, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/12/18 13:59, Peter Maydell wrote:
> > ...and if we'd done it that way in the first place for x86 then
> > we wouldn't be having to do anything at all now for Arm.
> > That suggests to me that we should do it that way now, and then we
> > can avoid having to do a bunch of extra development work for the
> > next architecture, or the next interesting Arm board model.
>
> Is there any case where we have anything but ISA and MMIO?  (We have
> 8250 which is ISA, PCI and MMIO, but that's kind of special because PCI
> is only there for hotpluggability.  pvpanic hotplug is not interesting).

The point about PCI is that it is the same everywhere and
discoverable, and easy for the user to add to the system or not.
MMIO requires extra work for every board model that we want to
put the device into, plus extra on both kernel and QEMU side
for every system description mechanism (ACPI, dtb, whatever
some future architecture might use), even if we have the basic
"mmio pvpanic" device code already.

> Also, while reusing code in general is nice, sometimes there are
> platform-specific ways to do it.  For ARM, for example, would it make
> sense to use an HVC/SMC that "extends" the PSCI, and pass the number in
> the PSCI device tree node?

If you want a hypercall then these days the arm HVC calling convention
includes mechanisms for discoverably determining whether a particular
hypercall is supported, so you wouldn't need to pass anything in the
ACPI or dtb. But I didn't get the impression that anybody wanted a
hypercall for this particularly.

> Related to this, is there a more or less "standard" watchdog device on
> ARM that could be added to virt?  There is the SBSA watchdog, but it's
> ugly for implementation in KVM because it counts down with frequency
> equal to CNTFRQ (which I'm not sure if QEMU has access too, and also it
> doesn't play well with live migration).

The i6300esb is PCI, presumably that would work?

> > I notice also that there's a mention in that thread that the pvpanic
> > ACPI table entry on x86 resulted in unhelpful Windows notifications
> > about new devices it didn't understand. Is that going to be an issue
> > for Arm with this mmio pvpanic ?
>
> Yes, it is probably the same as for x86.

I guess we need to find out if that is a problem before we can
merge this, then.

thanks
-- PMM

Daniel P. Berrangé Dec. 4, 2018, 1:48 p.m. UTC | #10

On Tue, Dec 04, 2018 at 12:59:51PM +0000, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 12:47, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > After it had merged there were some changes and the question of turning
> > it into a PCI device was raised. Paolo was concerned that the guest OS
> > is in an unknown state (arbitrary locks held, data structures corrupt,
> > etc) when panic is fired, so simplicity of the I/O port was desirable:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03309.html
> >
> > Anthony countered that even a PCI device could simply do an outb() in
> > config space:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2013-08/msg03325.html
> >
> > So it is not clear using a PCI device is in fact a problem in terms of
> > reliability at time of firing.
> 
> ...and if we'd done it that way in the first place for x86 then
> we wouldn't be having to do anything at all now for Arm.
> That suggests to me that we should do it that way now, and then we
> can avoid having to do a bunch of extra development work for the
> next architecture, or the next interesting Arm board model.

On s390 there's always a panic notifier mechanism as it is a
integral part of the architecture.

On PowerPC, pSeries guests have access to a panic notifier provided
by the firmware.

On x86, as well as pvpanic, there is also a paravirtualized
option defined by the HyperV extensions "hv_crash"

IIUC a PCI based solution would be usable on x86, s390, powerpc (pseries),
aarch64 (virt) and eventually riscv (virt). Of those, it is only
aarch64 and riscv that lack a panic notifier solution today.

I feel like we've already lost from the pov of a standardized solution,
but that doesn't mean we shouldn't still consider using PCI if it does
look like the best otion for arm/riscv.

Regards,
Daniel

Paolo Bonzini Dec. 4, 2018, 1:53 p.m. UTC | #11

On 04/12/18 14:43, Peter Maydell wrote:
> The point about PCI is that it is the same everywhere and
> discoverable, and easy for the user to add to the system or not.
> MMIO requires extra work for every board model that we want to
> put the device into, plus extra on both kernel and QEMU side
> for every system description mechanism (ACPI, dtb, whatever
> some future architecture might use), even if we have the basic
> "mmio pvpanic" device code already.

Looks like dtb is becoming a standard?  Even RISC-V switched from their
own system description to device tree.  Anyway, this is not too
important.  I agree with you about discoverability, on the other hand if
we could have something defined by the vendor rather than QEMU it would
be even better.  (Even better would be something that distro kernels
already have support for, but that would be asking too much probably).

>> Also, while reusing code in general is nice, sometimes there are
>> platform-specific ways to do it.  For ARM, for example, would it make
>> sense to use an HVC/SMC that "extends" the PSCI, and pass the number in
>> the PSCI device tree node?
> 
> If you want a hypercall then these days the arm HVC calling convention
> includes mechanisms for discoverably determining whether a particular
> hypercall is supported, so you wouldn't need to pass anything in the
> ACPI or dtb. But I didn't get the impression that anybody wanted a
> hypercall for this particularly.

Not for x86, where each hypervisor has its own hypercall number and even
its calling convention.  But for ARM it already makes more sense.

>> Related to this, is there a more or less "standard" watchdog device on
>> ARM that could be added to virt?  There is the SBSA watchdog, but it's
>> ugly for implementation in KVM because it counts down with frequency
>> equal to CNTFRQ (which I'm not sure if QEMU has access too, and also it
>> doesn't play well with live migration).
> 
> The i6300esb is PCI, presumably that would work?

Yeah, I was wondering if there was something in PrimeCell.  I found
SP805 exists now.

>>> I notice also that there's a mention in that thread that the pvpanic
>>> ACPI table entry on x86 resulted in unhelpful Windows notifications
>>> about new devices it didn't understand. Is that going to be an issue
>>> for Arm with this mmio pvpanic ?
>>
>> Yes, it is probably the same as for x86.
> 
> I guess we need to find out if that is a problem before we can
> merge this, then.

As long as the pvpanic device is not added by default it's okay.

Paolo

Michael S. Tsirkin Dec. 4, 2018, 7:10 p.m. UTC | #12

On Tue, Dec 04, 2018 at 02:53:26PM +0100, Paolo Bonzini wrote:
> On 04/12/18 14:43, Peter Maydell wrote:
> > The point about PCI is that it is the same everywhere and
> > discoverable, and easy for the user to add to the system or not.
> > MMIO requires extra work for every board model that we want to
> > put the device into, plus extra on both kernel and QEMU side
> > for every system description mechanism (ACPI, dtb, whatever
> > some future architecture might use), even if we have the basic
> > "mmio pvpanic" device code already.
> 
> Looks like dtb is becoming a standard?  Even RISC-V switched from their
> own system description to device tree.  Anyway, this is not too
> important.  I agree with you about discoverability, on the other hand if
> we could have something defined by the vendor rather than QEMU it would
> be even better.  (Even better would be something that distro kernels
> already have support for, but that would be asking too much probably).
> 
> >> Also, while reusing code in general is nice, sometimes there are
> >> platform-specific ways to do it.  For ARM, for example, would it make
> >> sense to use an HVC/SMC that "extends" the PSCI, and pass the number in
> >> the PSCI device tree node?
> > 
> > If you want a hypercall then these days the arm HVC calling convention
> > includes mechanisms for discoverably determining whether a particular
> > hypercall is supported, so you wouldn't need to pass anything in the
> > ACPI or dtb. But I didn't get the impression that anybody wanted a
> > hypercall for this particularly.
> 
> Not for x86, where each hypervisor has its own hypercall number and even
> its calling convention.  But for ARM it already makes more sense.
> 
> >> Related to this, is there a more or less "standard" watchdog device on
> >> ARM that could be added to virt?  There is the SBSA watchdog, but it's
> >> ugly for implementation in KVM because it counts down with frequency
> >> equal to CNTFRQ (which I'm not sure if QEMU has access too, and also it
> >> doesn't play well with live migration).
> > 
> > The i6300esb is PCI, presumably that would work?
> 
> Yeah, I was wondering if there was something in PrimeCell.  I found
> SP805 exists now.
> 
> >>> I notice also that there's a mention in that thread that the pvpanic
> >>> ACPI table entry on x86 resulted in unhelpful Windows notifications
> >>> about new devices it didn't understand. Is that going to be an issue
> >>> for Arm with this mmio pvpanic ?
> >>
> >> Yes, it is probably the same as for x86.
> > 
> > I guess we need to find out if that is a problem before we can
> > merge this, then.
> 
> As long as the pvpanic device is not added by default it's okay.
> 
> Paolo

It isn't too bad, you just include a driver and then it's happy.

Isn't a device just for panic going too far though?
Should it be some kind of PV device that we
can use to add more pv stuff in the future, with
a panic option?

Peng Hao Dec. 5, 2018, 12:27 a.m. UTC | #13

>On Tue, 4 Dec 2018 at 00:41, <peng.hao2@zte.com.cn> wrote:
>>
>> >I would still prefer to see a more detailed examination of whether
>> >we can do this with a PCI device before we commit to taking the
>> >MMIO version into the virt board.
>>
>> I'm sorry I thought I had sent an email. yesterday when I wrote an email to
>> explain the reason, I was interrupted and forgot to send it out.
>>
>> Now the pvpanic device is implemented as a mmio device or an ACPI device in the kernel,
>> and only one device can be seen at the same time. If the kernel parses FDT first, then pvpanic
>> is a mmio device. The kernel parses ACPI table first(and virtual machine is configured with ACPI),
>> and pvpanic  is an ACPI device.
>> If pvpanic is implemented as a PCI device, then the PCI device must still be seen when the ACPI table
>> is first parsed by the kernel, because ACPI device relies on the mmio space of the PCI device.
>> Mmio devices can be thought of as just an address space rather than a device in the strict sense.
>
>I'm afraid I don't understand. If it's a PCI device then
>it does not need to be listed in the device tree or the
>ACPI tables at all, because it is probeable by the guest.
>This also significantly simplifies the changes needed in QEMU.
>

It is precisely because PCI devices can not be controlled by FDT or ACPI tables, 
I do not want to implement it as a pci device.
X86/pvpanic is implemented as ISA device in QEMU and ACPI device in kernel. 
My implementation extends the implementation of x86/pvpanic, and a large of x86/pvpanic 
codes are reused.If PCI devices are implemented in qemu, then ACPI devices and PCI 
devices may appear simultaneously in the kernel. This would add both devices to the 
crash notifier list, which is odd. I want to see only one device at any time. Of course, many
architectures can use PCI devices, but we are currently reusing x86/pvpanic code as much 
as possible in qemu and kernel , rather than reimplementing it. At the same time, 
backward compatibility also needs to be considered.

                                     pvpanic in guest kernel
ARM:   ACPI table         acpi device
            FDT                  mmio device  (start guest bypassing uefi)
x86      ACPI table         acpi device

>> Secondly, I don't want it to be a pluggable device. If the user
>> deletes the device by mistake, it may lead to unpredictable results.
>
>If the user deletes the PCI device they're using for their
>disk or networking this will also lead to unpredictable
>results. We expect users not to randomly unplug things from
>their system if they want it to continue to work. In any
>case your guest driver can easily handle the unplug: the
>guest would then just lose the ability to notify on panic,
>falling back to as if the pvpanic device had never been
>present.

If two devices can exist simultaneously by modifying the code,
 then because ACPI devices rely on a PCI device, if PCI devices are dynamically
 unplugged, ACPI device will not work when panic is triggered.

thanks.
>
>thanks
>-- PMM

Peter Maydell Dec. 5, 2018, 2:54 p.m. UTC | #14

On Wed, 5 Dec 2018 at 00:28, <peng.hao2@zte.com.cn> wrote:
>
> >I'm afraid I don't understand. If it's a PCI device then
> >it does not need to be listed in the device tree or the
> >ACPI tables at all, because it is probeable by the guest.
> >This also significantly simplifies the changes needed in QEMU.
> >
>
> It is precisely because PCI devices can not be controlled by FDT or ACPI tables,
> I do not want to implement it as a pci device.
> X86/pvpanic is implemented as ISA device in QEMU and ACPI device in kernel.
> My implementation extends the implementation of x86/pvpanic, and a large of x86/pvpanic
> codes are reused.If PCI devices are implemented in qemu, then ACPI devices and PCI
> devices may appear simultaneously in the kernel. This would add both devices to the
> crash notifier list, which is odd. I want to see only one device at any time.

Yes, certainly we only need one pvpanic device. If it's implemented
as a PCI device, then that's what appears. We don't need and
would not implement the MMIO version. On x86 a user could
in theory use the command line to request both ISA and PCI
pvpanic devices. That would not be very sensible, but there
are lots of QEMU command lines the user can request that
don't make sense.

> Of course, many
> architectures can use PCI devices, but we are currently reusing x86/pvpanic code as much
> as possible in qemu and kernel , rather than reimplementing it. At the same time,
> backward compatibility also needs to be considered.
>
>                                      pvpanic in guest kernel
> ARM:   ACPI table         acpi device
>             FDT                  mmio device  (start guest bypassing uefi)
> x86      ACPI table         acpi device

For Arm, there is no backward compatibility issue, as we have
not yet implemented or shipped anything.

> >> Secondly, I don't want it to be a pluggable device. If the user
> >> deletes the device by mistake, it may lead to unpredictable results.
> >
> >If the user deletes the PCI device they're using for their
> >disk or networking this will also lead to unpredictable
> >results. We expect users not to randomly unplug things from
> >their system if they want it to continue to work. In any
> >case your guest driver can easily handle the unplug: the
> >guest would then just lose the ability to notify on panic,
> >falling back to as if the pvpanic device had never been
> >present.
>
> If two devices can exist simultaneously by modifying the code,
>  then because ACPI devices rely on a PCI device, if PCI devices are dynamically
>  unplugged, ACPI device will not work when panic is triggered.

If somebody modifies the code to QEMU or the guest kernel
such that something breaks, that's their issue to deal with.
My proposal is that we would ship:
 * a QEMU with a PCI pvpanic device (which you could plug in
   if you wanted it)
 * no changes to the Arm virt board, so nothing in the ACPI
   or device tree
 * no "mmio pvpanic" device

thanks
-- PMM

Peng Hao Dec. 6, 2018, 2:02 a.m. UTC | #15

>On Wed, 5 Dec 2018 at 00:28, <peng.hao2@zte.com.cn> wrote:
>>
>> >I'm afraid I don't understand. If it's a PCI device then
>> >it does not need to be listed in the device tree or the
>> >ACPI tables at all, because it is probeable by the guest.
>> >This also significantly simplifies the changes needed in QEMU.
>> >
>>
>> It is precisely because PCI devices can not be controlled by FDT or ACPI tables,
>> I do not want to implement it as a pci device.
>> X86/pvpanic is implemented as ISA device in QEMU and ACPI device in kernel.
>> My implementation extends the implementation of x86/pvpanic, and a large of x86/pvpanic
>> codes are reused.If PCI devices are implemented in qemu, then ACPI devices and PCI
>> devices may appear simultaneously in the kernel. This would add both devices to the
>> crash notifier list, which is odd. I want to see only one device at any time.
>
>Yes, certainly we only need one pvpanic device. If it's implemented
>as a PCI device, then that's what appears. We don't need and
>would not implement the MMIO version. On x86 a user could
>in theory use the command line to request both ISA and PCI
>pvpanic devices. That would not be very sensible, but there
>are lots of QEMU command lines the user can request that
>don't make sense.
>
>> Of course, many
>> architectures can use PCI devices, but we are currently reusing x86/pvpanic code as much
>> as possible in qemu and kernel , rather than reimplementing it. At the same time,
>> backward compatibility also needs to be considered.
>>
>>                                      pvpanic in guest kernel
>> ARM:   ACPI table         acpi device
>>             FDT                  mmio device  (start guest bypassing uefi)
>> x86      ACPI table         acpi device
>
>For Arm, there is no backward compatibility issue, as we have
>not yet implemented or shipped anything.
>

Sorry, the expression is not clear enough. I want to say that x86 needs backward 
compatibility if we intend to reuse the code of x86/pvpanic.

>> >> Secondly, I don't want it to be a pluggable device. If the user
>> >> deletes the device by mistake, it may lead to unpredictable results.
>> >
>> >If the user deletes the PCI device they're using for their
>> >disk or networking this will also lead to unpredictable
>> >results. We expect users not to randomly unplug things from
>> >their system if they want it to continue to work. In any
>> >case your guest driver can easily handle the unplug: the
>> >guest would then just lose the ability to notify on panic,
>> >falling back to as if the pvpanic device had never been
>> >present.
>>
>> If two devices can exist simultaneously by modifying the code,
>>  then because ACPI devices rely on a PCI device, if PCI devices are dynamically
>>  unplugged, ACPI device will not work when panic is triggered.
>
>If somebody modifies the code to QEMU or the guest kernel
>such that something breaks, that's their issue to deal with.
>My proposal is that we would ship:
>* a QEMU with a PCI pvpanic device (which you could plug in
>if you wanted it)
>* no changes to the Arm virt board, so nothing in the ACPI
>or device tree
>* no "mmio pvpanic" device

ok, I will try it.
thanks.
>
>thanks
>-- PMM

[V11,0/8] add pvpanic mmio support

Message

Comments