[v16,00/14] vfio-pci: pass the aer error to guest

On 01/17/2016 02:34 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 12, 2016 at 10:43:01AM +0800, Cao jin wrote:
>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>
>> For now, for vfio pci passthough devices when qemu receives
>> an error from host aer report, currentlly just terminate the guest,
>> but usually user want to know what error occurred but stopping the
>> guest, so this patches add aer capability support for vfio device,
>> and pass the error to guest, and have guest driver to recover
>> from the error.
> I would like to see a version of this patchset that doesn't
> depend on pci core changes.
> I think that if you make this simplifying assumption:
>
> - all devices on same bus in guest are on same bus in host
>
> then you can handle both reset and hotplug simply in function 0
> since it will belong to vfio.
>
> So we can have a version without pci core changes that simply assumes
> this, and things will just work.
>
>
> Now, if we wanted to enforce this limitation, I think the
> cleanest way would be to add a callback in struct PCIDevice:
>
> 	bool is_valid_function(PCIDevice *newfunction)
>
> and call it as each function is added.
> This way aer function can validate that each function
> added shares the same bus.
> And this way issues will be detected directly and not when
> function 0 is added.
>
> I would prefer this validation code to be a patch on top so we can merge
> the functionality directly and avoid blocking it while we figure out the
> best api to validate things.
>
> I don't see why making guest topology match host would
> ever be a problem, but if it's required to support
> configurations where these differ, I'd like to see
> an attempt to address that be split out, after aer
> is supported.
Hi Michael,

Just think about this more,  I think we also should check the vfio
devices whether on the same bus at the time of function 0 is added.
because we don't know the affected devices by a bus reset have
already all been assigned to VM. for example, the multi-function's hotplug.
devices on same bus in host are added to VM one by one. when we
test one device, we haven't yet added the other devices. so I think
the patch should like below. then we could add a vfio_is_valid_function 
in vfio
to test each device whether the affected devices on the same bus.

Thanks,
Chen

>
>
>
>> v15-v16:
>>     10/14, 11/14 are new to introduce a reset sequence id to specify the
>>     vfio devices has been reset for that reset. other patches aren't modified.
>>
>> v14-v15:
>>     1. add device hot reset callback
>>     2. add bus_in_reset for vfio device to avoid multi do host bus reset
>>
>> v13-v14:
>>     1. for multifunction device, requiring all functions enable AER.(9/13)
>>     2. due to all affected functions receive error signal, ignore no
>>        error occurred function. (12/13)
>>
>> v12-v13:
>>     1. since support multifuncion hotplug, here add callback to enable aer.
>>     2. add pci device pre+post reset for aer host reset.
>>
>> Chen Fan (14):
>>    vfio: extract vfio_get_hot_reset_info as a single function
>>    vfio: squeeze out vfio_pci_do_hot_reset for support bus reset
>>    pcie: modify the capability size assert
>>    vfio: make the 4 bytes aligned for capability size
>>    vfio: add pcie extanded capability support
>>    aer: impove pcie_aer_init to support vfio device
>>    vfio: add aer support for vfio device
>>    vfio: add check host bus reset is support or not
>>    add check reset mechanism when hotplug vfio device
>>    pci: introduce pci bus pre reset
>>    vfio: introduce last reset sequence id
>>    pcie_aer: expose pcie_aer_msg() interface
>>    vfio-pci: pass the aer error to guest
>>    vfio: add 'aer' property to expose aercap
>>
>>   hw/pci-bridge/ioh3420.c            |   2 +-
>>   hw/pci-bridge/xio3130_downstream.c |   2 +-
>>   hw/pci-bridge/xio3130_upstream.c   |   2 +-
>>   hw/pci/pci.c                       |  42 +++
>>   hw/pci/pci_bridge.c                |   3 +
>>   hw/pci/pcie.c                      |   2 +-
>>   hw/pci/pcie_aer.c                  |   6 +-
>>   hw/vfio/pci.c                      | 616 +++++++++++++++++++++++++++++++++----
>>   hw/vfio/pci.h                      |   9 +
>>   include/hw/pci/pci.h               |   1 +
>>   include/hw/pci/pci_bus.h           |   8 +
>>   include/hw/pci/pcie_aer.h          |   3 +-
>>   12 files changed, 624 insertions(+), 72 deletions(-)
>>
>> -- 
>> 1.9.3
>>
>>
>
> .
>

[v16,00/14] vfio-pci: pass the aer error to guest

Commit Message

Comments

Patch