[v2] virtio-bus: Plug devices after features are negotiated

Message ID	1473773430-19616-1-git-send-email-maxime.coquelin@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: Maxime Coquelin <maxime.coquelin@redhat.com> To: mst@redhat.com, qemu-devel@nongnu.org Date: Tue, 13 Sep 2016 15:30:30 +0200 Message-Id: <1473773430-19616-1-git-send-email-maxime.coquelin@redhat.com> Subject: [Qemu-devel] [PATCH v2] virtio-bus: Plug devices after features are negotiated Precedence: list Cc: cornelia.huck@de.ibm.com, marcel@redhat.com, Maxime Coquelin <maxime.coquelin@redhat.com>, qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Maxime Coquelin Sept. 13, 2016, 1:30 p.m. UTC

Currently, devices are plugged before features are negotiated.
If the backend doesn't support VIRTIO_F_VERSION_1, the transport
needs to rewind some settings.

This is the case for CCW, for which a post_plugged callback had
been introduced, where max_rev field is just updated if
VIRTIO_F_VERSION_1 is not supported by the backend.
For PCI, implementing post_plugged would be much more
complicated, so it needs to know whether the backend supports
VIRTIO_F_VERSION_1 at plug time.

Currently, nothing is done for PCI. Modern capabilities get
exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
by the backend, which confuses the guest.

This patch replaces existing post_plugged solution with an
approach that fits with both transports.
Features negotiation is performed before ->device_plugged() call.
A pre_plugged callback is introduced so that the transports can
set their supported features.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

The patch applies on top of Michael's pci branch.

Changes from v1:
----------------
 - Fix commit message typos (Cornelia, Eric)

 hw/s390x/virtio-ccw.c          | 30 +++++++++++++++---------------
 hw/virtio/virtio-bus.c         |  9 +++++----
 hw/virtio/virtio-pci.c         | 36 ++++++++++++++++++++++++++++++++----
 hw/virtio/virtio-pci.h         |  5 +++++
 include/hw/virtio/virtio-bus.h | 10 +++++-----
 5 files changed, 62 insertions(+), 28 deletions(-)

Michael Roth Dec. 13, 2016, 9:27 p.m. UTC | #1

Quoting Maxime Coquelin (2016-09-13 08:30:30)
> Currently, devices are plugged before features are negotiated.
> If the backend doesn't support VIRTIO_F_VERSION_1, the transport
> needs to rewind some settings.
> 
> This is the case for CCW, for which a post_plugged callback had
> been introduced, where max_rev field is just updated if
> VIRTIO_F_VERSION_1 is not supported by the backend.
> For PCI, implementing post_plugged would be much more
> complicated, so it needs to know whether the backend supports
> VIRTIO_F_VERSION_1 at plug time.
> 
> Currently, nothing is done for PCI. Modern capabilities get
> exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
> by the backend, which confuses the guest.
> 
> This patch replaces existing post_plugged solution with an
> approach that fits with both transports.
> Features negotiation is performed before ->device_plugged() call.
> A pre_plugged callback is introduced so that the transports can
> set their supported features.
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: qemu-stable@nongnu.org
> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>


It looks like this patch breaks migration under certain circumstances.
One such scenario is migrating 2.7 -> 2.8.0-rc3 as detailed below on a
host that doesn't have support for virtio-1 in vhost (which was
introduced via 41e3e42 in kernel 3.18. In my case, I'm using Ubuntu
14.04, kernel 3.13):

 source (2.7.0):
 
   sudo build/v2.7.0/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L
   build/v2.7.0-bios -M pc-i440fx-2.7 -m 512M -drive
   file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
   cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
   tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
   unix:/tmp/vm-hmp.sock,server,nowait -qmp
   unix:/tmp/vm-qmp.sock,server,nowait -vnc :200
  
 target (2.8.0-rc3):
   
   sudo build/v2.8.0-rc3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm
   -L build/v2.8.0-rc3-bios -M pc-i440fx-2.7 -m 512M -drive
   file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
   cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
   tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
   unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp
   unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :201
   -incoming unix:/tmp/migrate.sock
    
 error on target after migration:
     
   qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x34
   read: 98 device: 40 cmask: ff wmask: 0 w1cmask:0
   qemu-system-x86_64: Failed to load PCIDevice:config
   qemu-system-x86_64: Failed to load virtio-net:virtio
   qemu-system-x86_64: error while loading state for instance 0x0 of
   device '0000:00:03.0/virtio-net'
   qemu-system-x86_64: load of migration failed: Invalid argument


Based on discussion on IRC (excerpts below), I think the new handling needs
to be controllable via a machine option that can be disabled by default
for older machine types. This is being considered a release blocker for
2.8 since there are still pre-3.18 LTS kernels in the wild.


root-cause:

14:35 < stefanha> v3.13 will not work
14:35 < stefanha>         VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_EVENT_IDX) |
14:35 < stefanha>                          (1ULL << VHOST_F_LOG_ALL),
14:35 < stefanha> The kernel side only knows about these guys
14:35 < davidgiluk> our downstream 3.10.0-524 seems to work - but that's probably got all the vhost stuff backported
14:35 < stefanha> plus these guys:
14:35 < stefanha> F        VHOST_NET_FEATURES = VHOST_FEATURES |
14:35 < stefanha>                          (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
14:35 < stefanha>                          (1ULL << VIRTIO_NET_F_MRG_RXBUF),
14:36 < stefanha> mdroth: So QEMU is going to check a list of feature bits including VERSION_1
14:36 < stefanha> and it will see that vhost doesn't support them.
14:36 < stefanha> So we're kind of doing the right thing now.
14:36 < stefanha> Before userspace was overriding the fact that vhost cannot handle VERSION_1.
14:36 < stefanha> ...except we broke migration
14:36 < davidgiluk> stefanha: But shouldn't it refuse to startup ?
14:36 < stefanha> Everything is perfect*
14:36 < stefanha> * except we broke migration
14:36 < stefanha> :)

suggestions on how to fix this:

14:40 < stefanha> My understanding is this bug is vhost-specific.  If you have an old kernel that doesn't support VERSION_1 vhost then migration will break.
14:41 < stefanha> The actual bug is in QEMU though, not vhost
14:42 < stefanha> vhost is reporting the truth.  It's QEMU that has changed behavior.
14:44 < stefanha> mdroth: I think a lame fix would be to make virtio_pci_device_plugged() dependent on a flag that says: use old broken behavior instead of reporting the truth to the guest.
14:44 < stefanha> The flag could be enabled for all old machine types
14:44 < stefanha> and new machine types will report the truth.
14:44 < stefanha> That way migration continues to work.
14:44 < stefanha> Not sure if anyone can think of a nicer solution.
14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
14:45 < stefanha> The key change in behavior with the patch you identified is:
14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
14:46 < stefanha> virtio_pci_disable_modern(proxy);
14:46 < stefanha> Previously it didn't care about vdev->host_features.  It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate
14:49 < stefanha> We can delay the release in the meantime.
14:50 < stefanha> mdroth: I don't think a workaround is going to be smooth in this case
14:50 < stefanha> People will only notice once migration fails,
14:50 < stefanha> and that's when you lose your VM state!


> ---
> 
> The patch applies on top of Michael's pci branch.
> 
> Changes from v1:
> ----------------
>  - Fix commit message typos (Cornelia, Eric)
> 
>  hw/s390x/virtio-ccw.c          | 30 +++++++++++++++---------------
>  hw/virtio/virtio-bus.c         |  9 +++++----
>  hw/virtio/virtio-pci.c         | 36 ++++++++++++++++++++++++++++++++----
>  hw/virtio/virtio-pci.h         |  5 +++++
>  include/hw/virtio/virtio-bus.h | 10 +++++-----
>  5 files changed, 62 insertions(+), 28 deletions(-)
> 
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index 9678956..9f3f386 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -1261,6 +1261,16 @@ static int virtio_ccw_load_config(DeviceState *d, QEMUFile *f)
>      return 0;
>  }
> 
> +static void virtio_ccw_pre_plugged(DeviceState *d, Error **errp)
> +{
> +   VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
> +   VirtIODevice *vdev = virtio_bus_get_device(&dev->bus);
> +
> +    if (dev->max_rev >= 1) {
> +        virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
> +    }
> +}
> +
>  /* This is called by virtio-bus just after the device is plugged. */
>  static void virtio_ccw_device_plugged(DeviceState *d, Error **errp)
>  {
> @@ -1270,6 +1280,10 @@ static void virtio_ccw_device_plugged(DeviceState *d, Error **errp)
>      SubchDev *sch = ccw_dev->sch;
>      int n = virtio_get_num_queues(vdev);
> 
> +    if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> +        dev->max_rev = 0;
> +    }
> +
>      if (virtio_get_num_queues(vdev) > VIRTIO_CCW_QUEUE_MAX) {
>          error_setg(errp, "The number of virtqueues %d "
>                     "exceeds ccw limit %d", n,
> @@ -1283,25 +1297,11 @@ static void virtio_ccw_device_plugged(DeviceState *d, Error **errp)
> 
>      sch->id.cu_model = virtio_bus_get_vdev_id(&dev->bus);
> 
> -    if (dev->max_rev >= 1) {
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
> -    }
> 
>      css_generate_sch_crws(sch->cssid, sch->ssid, sch->schid,
>                            d->hotplugged, 1);
>  }
> 
> -static void virtio_ccw_post_plugged(DeviceState *d, Error **errp)
> -{
> -   VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
> -   VirtIODevice *vdev = virtio_bus_get_device(&dev->bus);
> -
> -   if (!virtio_host_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> -        /* A backend didn't support modern virtio. */
> -       dev->max_rev = 0;
> -   }
> -}
> -
>  static void virtio_ccw_device_unplugged(DeviceState *d)
>  {
>      VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
> @@ -1593,8 +1593,8 @@ static void virtio_ccw_bus_class_init(ObjectClass *klass, void *data)
>      k->load_queue = virtio_ccw_load_queue;
>      k->save_config = virtio_ccw_save_config;
>      k->load_config = virtio_ccw_load_config;
> +    k->pre_plugged = virtio_ccw_pre_plugged;
>      k->device_plugged = virtio_ccw_device_plugged;
> -    k->post_plugged = virtio_ccw_post_plugged;
>      k->device_unplugged = virtio_ccw_device_unplugged;
>      k->ioeventfd_started = virtio_ccw_ioeventfd_started;
>      k->ioeventfd_set_started = virtio_ccw_ioeventfd_set_started;
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 1492793..11f65bd 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -49,16 +49,17 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> 
>      DPRINTF("%s: plug device.\n", qbus->name);
> 
> -    if (klass->device_plugged != NULL) {
> -        klass->device_plugged(qbus->parent, errp);
> +    if (klass->pre_plugged != NULL) {
> +        klass->pre_plugged(qbus->parent, errp);
>      }
> 
>      /* Get the features of the plugged device. */
>      assert(vdc->get_features != NULL);
>      vdev->host_features = vdc->get_features(vdev, vdev->host_features,
>                                              errp);
> -    if (klass->post_plugged != NULL) {
> -        klass->post_plugged(qbus->parent, errp);
> +
> +    if (klass->device_plugged != NULL) {
> +        klass->device_plugged(qbus->parent, errp);
>      }
>  }
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index dde71a5..2d60a00 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -1576,18 +1576,48 @@ static void virtio_pci_modern_io_region_unmap(VirtIOPCIProxy *proxy,
>                                  &region->mr);
>  }
> 
> +static void virtio_pci_pre_plugged(DeviceState *d, Error **errp)
> +{
> +    VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
> +    VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
> +
> +    if (virtio_pci_modern(proxy)) {
> +        virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
> +    }
> +
> +    virtio_add_feature(&vdev->host_features, VIRTIO_F_BAD_FEATURE);
> +}
> +
>  /* This is called by virtio-bus just after the device is plugged. */
>  static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
>  {
>      VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
>      VirtioBusState *bus = &proxy->bus;
>      bool legacy = virtio_pci_legacy(proxy);
> -    bool modern = virtio_pci_modern(proxy);
> +    bool modern;
>      bool modern_pio = proxy->flags & VIRTIO_PCI_FLAG_MODERN_PIO_NOTIFY;
>      uint8_t *config;
>      uint32_t size;
>      VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
> 
> +    /*
> +     * Virtio capabilities present without
> +     * VIRTIO_F_VERSION_1 confuses guests
> +     */
> +    if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> +        virtio_pci_disable_modern(proxy);
> +
> +        if (!legacy) {
> +            error_setg(errp, "Device doesn't support modern mode, and legacy"
> +                             " mode is disabled");
> +            error_append_hint(errp, "Set disable-legacy to off\n");
> +
> +            return;
> +        }
> +    }
> +
> +    modern = virtio_pci_modern(proxy);
> +
>      config = proxy->pci_dev.config;
>      if (proxy->class_code) {
>          pci_config_set_class(config, proxy->class_code);
> @@ -1629,7 +1659,6 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
> 
>          struct virtio_pci_cfg_cap *cfg_mask;
> 
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
>          virtio_pci_modern_regions_init(proxy);
> 
>          virtio_pci_modern_mem_region_map(proxy, &proxy->common, &cap);
> @@ -1694,8 +1723,6 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
>      if (!kvm_has_many_ioeventfds()) {
>          proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
>      }
> -
> -    virtio_add_feature(&vdev->host_features, VIRTIO_F_BAD_FEATURE);
>  }
> 
>  static void virtio_pci_device_unplugged(DeviceState *d)
> @@ -2508,6 +2535,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, void *data)
>      k->query_guest_notifiers = virtio_pci_query_guest_notifiers;
>      k->set_guest_notifiers = virtio_pci_set_guest_notifiers;
>      k->vmstate_change = virtio_pci_vmstate_change;
> +    k->pre_plugged = virtio_pci_pre_plugged;
>      k->device_plugged = virtio_pci_device_plugged;
>      k->device_unplugged = virtio_pci_device_unplugged;
>      k->query_nvectors = virtio_pci_query_nvectors;
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 0698157..541cbdb 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -181,6 +181,11 @@ static inline void virtio_pci_force_virtio_1(VirtIOPCIProxy *proxy)
>      proxy->disable_legacy = ON_OFF_AUTO_ON;
>  }
> 
> +static inline void virtio_pci_disable_modern(VirtIOPCIProxy *proxy)
> +{
> +    proxy->disable_modern = true;
> +}
> +
>  /*
>   * virtio-scsi-pci: This extends VirtioPCIProxy.
>   */
> diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
> index f3e5ef3..24caa0a 100644
> --- a/include/hw/virtio/virtio-bus.h
> +++ b/include/hw/virtio/virtio-bus.h
> @@ -54,16 +54,16 @@ typedef struct VirtioBusClass {
>      int (*set_guest_notifiers)(DeviceState *d, int nvqs, bool assign);
>      void (*vmstate_change)(DeviceState *d, bool running);
>      /*
> +     * Expose the features the transport layer supports before
> +     * the negotiation takes place.
> +     */
> +    void (*pre_plugged)(DeviceState *d, Error **errp);
> +    /*
>       * transport independent init function.
>       * This is called by virtio-bus just after the device is plugged.
>       */
>      void (*device_plugged)(DeviceState *d, Error **errp);
>      /*
> -     * Re-evaluate setup after feature bits have been validated
> -     * by the device backend.
> -     */
> -    void (*post_plugged)(DeviceState *d, Error **errp);
> -    /*
>       * transport independent exit function.
>       * This is called by virtio-bus just before the device is unplugged.
>       */
> -- 
> 2.7.4
> 
>

Cornelia Huck Dec. 14, 2016, 7:44 a.m. UTC | #2

On Tue, 13 Dec 2016 15:27:45 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:

> Quoting Maxime Coquelin (2016-09-13 08:30:30)
> > Currently, devices are plugged before features are negotiated.
> > If the backend doesn't support VIRTIO_F_VERSION_1, the transport
> > needs to rewind some settings.
> > 
> > This is the case for CCW, for which a post_plugged callback had
> > been introduced, where max_rev field is just updated if
> > VIRTIO_F_VERSION_1 is not supported by the backend.
> > For PCI, implementing post_plugged would be much more
> > complicated, so it needs to know whether the backend supports
> > VIRTIO_F_VERSION_1 at plug time.
> > 
> > Currently, nothing is done for PCI. Modern capabilities get
> > exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
> > by the backend, which confuses the guest.
> > 
> > This patch replaces existing post_plugged solution with an
> > approach that fits with both transports.
> > Features negotiation is performed before ->device_plugged() call.
> > A pre_plugged callback is introduced so that the transports can
> > set their supported features.
> > 
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: qemu-stable@nongnu.org
> > Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
> > Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> > Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> 
> It looks like this patch breaks migration under certain circumstances.
> One such scenario is migrating 2.7 -> 2.8.0-rc3 as detailed below on a
> host that doesn't have support for virtio-1 in vhost (which was
> introduced via 41e3e42 in kernel 3.18. In my case, I'm using Ubuntu
> 14.04, kernel 3.13):
> 
>  source (2.7.0):
>  
>    sudo build/v2.7.0/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L
>    build/v2.7.0-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp.sock,server,nowait -vnc :200
>   
>  target (2.8.0-rc3):
>    
>    sudo build/v2.8.0-rc3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm
>    -L build/v2.8.0-rc3-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :201
>    -incoming unix:/tmp/migrate.sock
>     
>  error on target after migration:
>      
>    qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x34
>    read: 98 device: 40 cmask: ff wmask: 0 w1cmask:0
>    qemu-system-x86_64: Failed to load PCIDevice:config
>    qemu-system-x86_64: Failed to load virtio-net:virtio
>    qemu-system-x86_64: error while loading state for instance 0x0 of
>    device '0000:00:03.0/virtio-net'
>    qemu-system-x86_64: load of migration failed: Invalid argument
> 

Ugh. Let me double-check what happens for ccw. I could have sworn I
tested this...

> 
> Based on discussion on IRC (excerpts below), I think the new handling needs
> to be controllable via a machine option that can be disabled by default
> for older machine types. This is being considered a release blocker for
> 2.8 since there are still pre-3.18 LTS kernels in the wild.
> 
> 
> root-cause:
> 
> 14:35 < stefanha> v3.13 will not work
> 14:35 < stefanha>         VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> 14:35 < stefanha>                          (1ULL << VHOST_F_LOG_ALL),
> 14:35 < stefanha> The kernel side only knows about these guys
> 14:35 < davidgiluk> our downstream 3.10.0-524 seems to work - but that's probably got all the vhost stuff backported
> 14:35 < stefanha> plus these guys:
> 14:35 < stefanha> F        VHOST_NET_FEATURES = VHOST_FEATURES |
> 14:35 < stefanha>                          (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_NET_F_MRG_RXBUF),
> 14:36 < stefanha> mdroth: So QEMU is going to check a list of feature bits including VERSION_1
> 14:36 < stefanha> and it will see that vhost doesn't support them.
> 14:36 < stefanha> So we're kind of doing the right thing now.
> 14:36 < stefanha> Before userspace was overriding the fact that vhost cannot handle VERSION_1.
> 14:36 < stefanha> ...except we broke migration
> 14:36 < davidgiluk> stefanha: But shouldn't it refuse to startup ?
> 14:36 < stefanha> Everything is perfect*
> 14:36 < stefanha> * except we broke migration
> 14:36 < stefanha> :)
> 
> suggestions on how to fix this:
> 
> 14:40 < stefanha> My understanding is this bug is vhost-specific.  If you have an old kernel that doesn't support VERSION_1 vhost then migration will break.
> 14:41 < stefanha> The actual bug is in QEMU though, not vhost
> 14:42 < stefanha> vhost is reporting the truth.  It's QEMU that has changed behavior.
> 14:44 < stefanha> mdroth: I think a lame fix would be to make virtio_pci_device_plugged() dependent on a flag that says: use old broken behavior instead of reporting the truth to the guest.
> 14:44 < stefanha> The flag could be enabled for all old machine types
> 14:44 < stefanha> and new machine types will report the truth.
> 14:44 < stefanha> That way migration continues to work.

I'll check whether we would need something for ccw as well.

> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> 14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
> 14:45 < stefanha> The key change in behavior with the patch you identified is:
> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> 14:46 < stefanha> Previously it didn't care about vdev->host_features.  It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate

Another thought: Maybe this bug only surfaced right now because older
qemus defaulted virtio-pci to legacy?

(I think modern virtio-pci with old vhost resulted in a config that was
rejected at least by Linux guests. Because pci defaulted to legacy, we
only had the post-plugged workaround for ccw before.)

> 14:49 < stefanha> We can delay the release in the meantime.
> 14:50 < stefanha> mdroth: I don't think a workaround is going to be smooth in this case
> 14:50 < stefanha> People will only notice once migration fails,
> 14:50 < stefanha> and that's when you lose your VM state!

Maxime Coquelin Dec. 14, 2016, 8:20 a.m. UTC | #3

On 12/13/2016 10:27 PM, Michael Roth wrote:
> Quoting Maxime Coquelin (2016-09-13 08:30:30)
>> > Currently, devices are plugged before features are negotiated.
>> > If the backend doesn't support VIRTIO_F_VERSION_1, the transport
>> > needs to rewind some settings.
>> >
>> > This is the case for CCW, for which a post_plugged callback had
>> > been introduced, where max_rev field is just updated if
>> > VIRTIO_F_VERSION_1 is not supported by the backend.
>> > For PCI, implementing post_plugged would be much more
>> > complicated, so it needs to know whether the backend supports
>> > VIRTIO_F_VERSION_1 at plug time.
>> >
>> > Currently, nothing is done for PCI. Modern capabilities get
>> > exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
>> > by the backend, which confuses the guest.
>> >
>> > This patch replaces existing post_plugged solution with an
>> > approach that fits with both transports.
>> > Features negotiation is performed before ->device_plugged() call.
>> > A pre_plugged callback is introduced so that the transports can
>> > set their supported features.
>> >
>> > Cc: Michael S. Tsirkin <mst@redhat.com>
>> > Cc: qemu-stable@nongnu.org
>> > Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
>> > Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>> > Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
>> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> It looks like this patch breaks migration under certain circumstances.
> One such scenario is migrating 2.7 -> 2.8.0-rc3 as detailed below on a
> host that doesn't have support for virtio-1 in vhost (which was
> introduced via 41e3e42 in kernel 3.18. In my case, I'm using Ubuntu
> 14.04, kernel 3.13):
>
>  source (2.7.0):
>
>    sudo build/v2.7.0/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L
>    build/v2.7.0-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp.sock,server,nowait -vnc :200
>
>  target (2.8.0-rc3):
>
>    sudo build/v2.8.0-rc3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm
>    -L build/v2.8.0-rc3-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :201
>    -incoming unix:/tmp/migrate.sock
>
>  error on target after migration:
>
>    qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x34
>    read: 98 device: 40 cmask: ff wmask: 0 w1cmask:0
>    qemu-system-x86_64: Failed to load PCIDevice:config
>    qemu-system-x86_64: Failed to load virtio-net:virtio
>    qemu-system-x86_64: error while loading state for instance 0x0 of
>    device '0000:00:03.0/virtio-net'
>    qemu-system-x86_64: load of migration failed: Invalid argument
>
>
> Based on discussion on IRC (excerpts below), I think the new handling needs
> to be controllable via a machine option that can be disabled by default
> for older machine types. This is being considered a release blocker for
> 2.8 since there are still pre-3.18 LTS kernels in the wild.


First, thanks for reporting this bad regression with a detailed
analysis.

>
>
> root-cause:
>
> 14:35 < stefanha> v3.13 will not work
> 14:35 < stefanha>         VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> 14:35 < stefanha>                          (1ULL << VHOST_F_LOG_ALL),
> 14:35 < stefanha> The kernel side only knows about these guys
> 14:35 < davidgiluk> our downstream 3.10.0-524 seems to work - but that's probably got all the vhost stuff backported
> 14:35 < stefanha> plus these guys:
> 14:35 < stefanha> F        VHOST_NET_FEATURES = VHOST_FEATURES |
> 14:35 < stefanha>                          (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_NET_F_MRG_RXBUF),
> 14:36 < stefanha> mdroth: So QEMU is going to check a list of feature bits including VERSION_1
> 14:36 < stefanha> and it will see that vhost doesn't support them.
> 14:36 < stefanha> So we're kind of doing the right thing now.
> 14:36 < stefanha> Before userspace was overriding the fact that vhost cannot handle VERSION_1.
> 14:36 < stefanha> ...except we broke migration
> 14:36 < davidgiluk> stefanha: But shouldn't it refuse to startup ?
> 14:36 < stefanha> Everything is perfect*
> 14:36 < stefanha> * except we broke migration
> 14:36 < stefanha> :)
>
> suggestions on how to fix this:
>
> 14:40 < stefanha> My understanding is this bug is vhost-specific.  If you have an old kernel that doesn't support VERSION_1 vhost then migration will break.
> 14:41 < stefanha> The actual bug is in QEMU though, not vhost
> 14:42 < stefanha> vhost is reporting the truth.  It's QEMU that has changed behavior.
> 14:44 < stefanha> mdroth: I think a lame fix would be to make virtio_pci_device_plugged() dependent on a flag that says: use old broken behavior instead of reporting the truth to the guest.
> 14:44 < stefanha> The flag could be enabled for all old machine types
> 14:44 < stefanha> and new machine types will report the truth.
> 14:44 < stefanha> That way migration continues to work.
Right.
The problem doing this however is that it would re-introduce initial
bug for old kernel on host and new kernel on guest.

Indeed, in recent Kernels, virtio-pci device probe fails if modern 
interface is exposed but VERSION_1 is not advertised.

> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> 14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
> 14:45 < stefanha> The key change in behavior with the patch you identified is:
> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> 14:46 < stefanha> Previously it didn't care about vdev->host_features.  It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate
> 14:49 < stefanha> We can delay the release in the meantime.
> 14:50 < stefanha> mdroth: I don't think a workaround is going to be smooth in this case
> 14:50 < stefanha> People will only notice once migration fails,
> 14:50 < stefanha> and that's when you lose your VM state!
>
>

Thanks,
Maxime

Maxime Coquelin Dec. 14, 2016, 8:28 a.m. UTC | #4

On 12/14/2016 08:44 AM, Cornelia Huck wrote:
>> > 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
>> > 14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
>> > 14:45 < stefanha> The key change in behavior with the patch you identified is:
>> > 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
>> > 14:46 < stefanha> virtio_pci_disable_modern(proxy);
>> > 14:46 < stefanha> Previously it didn't care about vdev->host_features.  It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
>> > 14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
>> > 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
>> > 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
>> > 14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate
> Another thought: Maybe this bug only surfaced right now because older
> qemus defaulted virtio-pci to legacy?
>
> (I think modern virtio-pci with old vhost resulted in a config that was
> rejected at least by Linux guests. Because pci defaulted to legacy, we
> only had the post-plugged workaround for ccw before.)

Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
we get this failure at virtio-pci probe time:

virtio_net virtio0: virtio: device uses modern interface but does not 
have VIRTIO_F_VERSION_1.

  - Maxime

Stefan Hajnoczi Dec. 14, 2016, 8:41 a.m. UTC | #5

On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
>
> On 12/14/2016 08:44 AM, Cornelia Huck wrote:
>>>
>>> > 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
>>> > 14:45 < stefanha> But we're going to have to keep lying to the guest if
>>> > we want to preserve migration compatibility
>>> > 14:45 < stefanha> The key change in behavior with the patch you
>>> > identified is:
>>> > 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
>>> > VIRTIO_F_VERSION_1)) {
>>> > 14:46 < stefanha> virtio_pci_disable_modern(proxy);
>>> > 14:46 < stefanha> Previously it didn't care about vdev->host_features.
>>> > It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
>>> > 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
>>> > disable-modern=true
>>> > 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
>>> > definitely still around so I don't think we can ship QEMU 2.8 like this.
>>> > 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
>>> > see what Michael Tsirkin and Maxime Coquelin think.
>>> > 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
>>> > tell users to set disable-modern= to match their vhost capabilities, but
>>> > it's hard for them to apply that retroactively if they're looking to migrate
>>
>> Another thought: Maybe this bug only surfaced right now because older
>> qemus defaulted virtio-pci to legacy?
>>
>> (I think modern virtio-pci with old vhost resulted in a config that was
>> rejected at least by Linux guests. Because pci defaulted to legacy, we
>> only had the post-plugged workaround for ccw before.)
>
>
> Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
> we get this failure at virtio-pci probe time:
>
> virtio_net virtio0: virtio: device uses modern interface but does not have
> VIRTIO_F_VERSION_1.

Is this error a regression in QEMU 2.8?

It's better to ship with an existing issue still open than with a new
regression.  We must not break existing users' setups.

A solution for the next QEMU version is to use a flag in the machine
type version telling virtio whether or not allow devices (e.g.
vhost-net) to influence the host feature bits.  Old machine types will
say no, new machine types will say yes.

In the meantime I would revert your patch for QEMU 2.8.

Maxime, Cornelia, Michael: Do you agree?

Stefan

Cornelia Huck Dec. 14, 2016, 8:59 a.m. UTC | #6

On Wed, 14 Dec 2016 08:41:55 +0000
Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
> >
> >
> > On 12/14/2016 08:44 AM, Cornelia Huck wrote:
> >>>
> >>> > 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> >>> > 14:45 < stefanha> But we're going to have to keep lying to the guest if
> >>> > we want to preserve migration compatibility
> >>> > 14:45 < stefanha> The key change in behavior with the patch you
> >>> > identified is:
> >>> > 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
> >>> > VIRTIO_F_VERSION_1)) {
> >>> > 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> >>> > 14:46 < stefanha> Previously it didn't care about vdev->host_features.
> >>> > It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> >>> > 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
> >>> > disable-modern=true
> >>> > 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
> >>> > definitely still around so I don't think we can ship QEMU 2.8 like this.
> >>> > 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
> >>> > see what Michael Tsirkin and Maxime Coquelin think.
> >>> > 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
> >>> > tell users to set disable-modern= to match their vhost capabilities, but
> >>> > it's hard for them to apply that retroactively if they're looking to migrate
> >>
> >> Another thought: Maybe this bug only surfaced right now because older
> >> qemus defaulted virtio-pci to legacy?
> >>
> >> (I think modern virtio-pci with old vhost resulted in a config that was
> >> rejected at least by Linux guests. Because pci defaulted to legacy, we
> >> only had the post-plugged workaround for ccw before.)
> >
> >
> > Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
> > we get this failure at virtio-pci probe time:
> >
> > virtio_net virtio0: virtio: device uses modern interface but does not have
> > VIRTIO_F_VERSION_1.
> 
> Is this error a regression in QEMU 2.8?

I think it pokes up because modern virtio-pci is now by default on. It
was broken before if the user wanted a modern virtio-pci device
explicitly.

(ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
solution that this patch replaced and which is basically the same for
ccw.)

> 
> It's better to ship with an existing issue still open than with a new
> regression.  We must not break existing users' setups.
> 
> A solution for the next QEMU version is to use a flag in the machine
> type version telling virtio whether or not allow devices (e.g.
> vhost-net) to influence the host feature bits.  Old machine types will
> say no, new machine types will say yes.
> 
> In the meantime I would revert your patch for QEMU 2.8.
> 
> Maxime, Cornelia, Michael: Do you agree?
> 
> Stefan

Reverting the patch should be fine for ccw. What about the virtio-pci
with old vhost mess, though? Defaulting to modern would mean that users
get unusable devices in that setup.

Maxime Coquelin Dec. 14, 2016, 9:44 a.m. UTC | #7

On 12/14/2016 09:59 AM, Cornelia Huck wrote:
> On Wed, 14 Dec 2016 08:41:55 +0000
> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
>> On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
>> <maxime.coquelin@redhat.com> wrote:
>>>
>>>
>>> On 12/14/2016 08:44 AM, Cornelia Huck wrote:
>>>>>
>>>>>> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
>>>>>> 14:45 < stefanha> But we're going to have to keep lying to the guest if
>>>>>> we want to preserve migration compatibility
>>>>>> 14:45 < stefanha> The key change in behavior with the patch you
>>>>>> identified is:
>>>>>> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
>>>>>> VIRTIO_F_VERSION_1)) {
>>>>>> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
>>>>>> 14:46 < stefanha> Previously it didn't care about vdev->host_features.
>>>>>> It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
>>>>>> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
>>>>>> disable-modern=true
>>>>>> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
>>>>>> definitely still around so I don't think we can ship QEMU 2.8 like this.
>>>>>> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
>>>>>> see what Michael Tsirkin and Maxime Coquelin think.
>>>>>> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
>>>>>> tell users to set disable-modern= to match their vhost capabilities, but
>>>>>> it's hard for them to apply that retroactively if they're looking to migrate
>>>>
>>>> Another thought: Maybe this bug only surfaced right now because older
>>>> qemus defaulted virtio-pci to legacy?
>>>>
>>>> (I think modern virtio-pci with old vhost resulted in a config that was
>>>> rejected at least by Linux guests. Because pci defaulted to legacy, we
>>>> only had the post-plugged workaround for ccw before.)
>>>
>>>
>>> Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
>>> we get this failure at virtio-pci probe time:
>>>
>>> virtio_net virtio0: virtio: device uses modern interface but does not have
>>> VIRTIO_F_VERSION_1.
>>
>> Is this error a regression in QEMU 2.8?
>
> I think it pokes up because modern virtio-pci is now by default on. It
> was broken before if the user wanted a modern virtio-pci device
> explicitly.
>
> (ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
> solution that this patch replaced and which is basically the same for
> ccw.)
>
>>
>> It's better to ship with an existing issue still open than with a new
>> regression.  We must not break existing users' setups.
>>
>> A solution for the next QEMU version is to use a flag in the machine
>> type version telling virtio whether or not allow devices (e.g.
>> vhost-net) to influence the host feature bits.  Old machine types will
>> say no, new machine types will say yes.
>>
>> In the meantime I would revert your patch for QEMU 2.8.
>>
>> Maxime, Cornelia, Michael: Do you agree?
>>
>> Stefan
>
> Reverting the patch should be fine for ccw. What about the virtio-pci
> with old vhost mess, though? Defaulting to modern would mean that users
> get unusable devices in that setup.

Just did some tests, and can confirm that reverting the patch would
re-introduce initial bug, which is breaking virtio-pci when host does
not support VERSION_1.

Note that this problem is present in v2.7.0 since:

commit 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
Author: Marcel Apfelbaum <marcel@redhat.com>
Date:   Wed Jul 20 18:28:21 2016 +0300

     hw/virtio-pci: fix virtio behaviour

     Enable transitional virtio devices by default.
     Enable virtio-1.0 for devices plugged into
     PCIe ports (Root ports or Downstream ports).

     Using the virtio-1 mode will remove the limitation
     of the number of devices that can be attached to a machine
     by removing the need for the IO BAR.

     Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
     Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
     Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>


Maybe better to implement the workaround you proposed Stefan?

Thanks,
Maxime

Dr. David Alan Gilbert Dec. 14, 2016, 9:50 a.m. UTC | #8

* Maxime Coquelin (maxime.coquelin@redhat.com) wrote:
> 
> 
> On 12/14/2016 09:59 AM, Cornelia Huck wrote:
> > On Wed, 14 Dec 2016 08:41:55 +0000
> > Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > 
> > > On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
> > > <maxime.coquelin@redhat.com> wrote:
> > > > 
> > > > 
> > > > On 12/14/2016 08:44 AM, Cornelia Huck wrote:
> > > > > > 
> > > > > > > 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> > > > > > > 14:45 < stefanha> But we're going to have to keep lying to the guest if
> > > > > > > we want to preserve migration compatibility
> > > > > > > 14:45 < stefanha> The key change in behavior with the patch you
> > > > > > > identified is:
> > > > > > > 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
> > > > > > > VIRTIO_F_VERSION_1)) {
> > > > > > > 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> > > > > > > 14:46 < stefanha> Previously it didn't care about vdev->host_features.
> > > > > > > It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> > > > > > > 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
> > > > > > > disable-modern=true
> > > > > > > 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
> > > > > > > definitely still around so I don't think we can ship QEMU 2.8 like this.
> > > > > > > 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
> > > > > > > see what Michael Tsirkin and Maxime Coquelin think.
> > > > > > > 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
> > > > > > > tell users to set disable-modern= to match their vhost capabilities, but
> > > > > > > it's hard for them to apply that retroactively if they're looking to migrate
> > > > > 
> > > > > Another thought: Maybe this bug only surfaced right now because older
> > > > > qemus defaulted virtio-pci to legacy?
> > > > > 
> > > > > (I think modern virtio-pci with old vhost resulted in a config that was
> > > > > rejected at least by Linux guests. Because pci defaulted to legacy, we
> > > > > only had the post-plugged workaround for ccw before.)
> > > > 
> > > > 
> > > > Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
> > > > we get this failure at virtio-pci probe time:
> > > > 
> > > > virtio_net virtio0: virtio: device uses modern interface but does not have
> > > > VIRTIO_F_VERSION_1.
> > > 
> > > Is this error a regression in QEMU 2.8?
> > 
> > I think it pokes up because modern virtio-pci is now by default on. It
> > was broken before if the user wanted a modern virtio-pci device
> > explicitly.
> > 
> > (ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
> > solution that this patch replaced and which is basically the same for
> > ccw.)
> > 
> > > 
> > > It's better to ship with an existing issue still open than with a new
> > > regression.  We must not break existing users' setups.
> > > 
> > > A solution for the next QEMU version is to use a flag in the machine
> > > type version telling virtio whether or not allow devices (e.g.
> > > vhost-net) to influence the host feature bits.  Old machine types will
> > > say no, new machine types will say yes.
> > > 
> > > In the meantime I would revert your patch for QEMU 2.8.
> > > 
> > > Maxime, Cornelia, Michael: Do you agree?
> > > 
> > > Stefan
> > 
> > Reverting the patch should be fine for ccw. What about the virtio-pci
> > with old vhost mess, though? Defaulting to modern would mean that users
> > get unusable devices in that setup.
> 
> Just did some tests, and can confirm that reverting the patch would
> re-introduce initial bug, which is breaking virtio-pci when host does
> not support VERSION_1.

Can you modify it so it only fixes the problem on new machine types?
Either that or *fail* if you attempt to bring up a virtio interface
using version 1 with a kernel that doesn't support it (with a note
saying that you could use an option to disable it).

Dave

> Note that this problem is present in v2.7.0 since:
> 
> commit 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
> Author: Marcel Apfelbaum <marcel@redhat.com>
> Date:   Wed Jul 20 18:28:21 2016 +0300
> 
>     hw/virtio-pci: fix virtio behaviour
> 
>     Enable transitional virtio devices by default.
>     Enable virtio-1.0 for devices plugged into
>     PCIe ports (Root ports or Downstream ports).
> 
>     Using the virtio-1 mode will remove the limitation
>     of the number of devices that can be attached to a machine
>     by removing the need for the IO BAR.
> 
>     Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>     Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> 
> 
> Maybe better to implement the workaround you proposed Stefan?
> 
> Thanks,
> Maxime
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Maxime Coquelin Dec. 14, 2016, 10:02 a.m. UTC | #9

On 12/14/2016 10:50 AM, Dr. David Alan Gilbert wrote:
> * Maxime Coquelin (maxime.coquelin@redhat.com) wrote:
>>
>>
>> On 12/14/2016 09:59 AM, Cornelia Huck wrote:
>>> On Wed, 14 Dec 2016 08:41:55 +0000
>>> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>>
>>>> On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
>>>> <maxime.coquelin@redhat.com> wrote:
>>>>>
>>>>>
>>>>> On 12/14/2016 08:44 AM, Cornelia Huck wrote:
>>>>>>>
>>>>>>>> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
>>>>>>>> 14:45 < stefanha> But we're going to have to keep lying to the guest if
>>>>>>>> we want to preserve migration compatibility
>>>>>>>> 14:45 < stefanha> The key change in behavior with the patch you
>>>>>>>> identified is:
>>>>>>>> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
>>>>>>>> VIRTIO_F_VERSION_1)) {
>>>>>>>> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
>>>>>>>> 14:46 < stefanha> Previously it didn't care about vdev->host_features.
>>>>>>>> It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
>>>>>>>> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
>>>>>>>> disable-modern=true
>>>>>>>> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
>>>>>>>> definitely still around so I don't think we can ship QEMU 2.8 like this.
>>>>>>>> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
>>>>>>>> see what Michael Tsirkin and Maxime Coquelin think.
>>>>>>>> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
>>>>>>>> tell users to set disable-modern= to match their vhost capabilities, but
>>>>>>>> it's hard for them to apply that retroactively if they're looking to migrate
>>>>>>
>>>>>> Another thought: Maybe this bug only surfaced right now because older
>>>>>> qemus defaulted virtio-pci to legacy?
>>>>>>
>>>>>> (I think modern virtio-pci with old vhost resulted in a config that was
>>>>>> rejected at least by Linux guests. Because pci defaulted to legacy, we
>>>>>> only had the post-plugged workaround for ccw before.)
>>>>>
>>>>>
>>>>> Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
>>>>> we get this failure at virtio-pci probe time:
>>>>>
>>>>> virtio_net virtio0: virtio: device uses modern interface but does not have
>>>>> VIRTIO_F_VERSION_1.
>>>>
>>>> Is this error a regression in QEMU 2.8?
>>>
>>> I think it pokes up because modern virtio-pci is now by default on. It
>>> was broken before if the user wanted a modern virtio-pci device
>>> explicitly.
>>>
>>> (ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
>>> solution that this patch replaced and which is basically the same for
>>> ccw.)
>>>
>>>>
>>>> It's better to ship with an existing issue still open than with a new
>>>> regression.  We must not break existing users' setups.
>>>>
>>>> A solution for the next QEMU version is to use a flag in the machine
>>>> type version telling virtio whether or not allow devices (e.g.
>>>> vhost-net) to influence the host feature bits.  Old machine types will
>>>> say no, new machine types will say yes.
>>>>
>>>> In the meantime I would revert your patch for QEMU 2.8.
>>>>
>>>> Maxime, Cornelia, Michael: Do you agree?
>>>>
>>>> Stefan
>>>
>>> Reverting the patch should be fine for ccw. What about the virtio-pci
>>> with old vhost mess, though? Defaulting to modern would mean that users
>>> get unusable devices in that setup.
>>
>> Just did some tests, and can confirm that reverting the patch would
>> re-introduce initial bug, which is breaking virtio-pci when host does
>> not support VERSION_1.
>
> Can you modify it so it only fixes the problem on new machine types?
> Either that or *fail* if you attempt to bring up a virtio interface
> using version 1 with a kernel that doesn't support it (with a note
> saying that you could use an option to disable it).
Yes, fixing the problem for new machine types only was the workaround
suggested by Stefan.
I think this is a better solution than failing at bring-up the virtio
interface if the kernel doesn't support version 1 and modern is not
disabled.

I start working on the implementation.

Thanks,
Maxime
>
> Dave
>
>> Note that this problem is present in v2.7.0 since:
>>
>> commit 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
>> Author: Marcel Apfelbaum <marcel@redhat.com>
>> Date:   Wed Jul 20 18:28:21 2016 +0300
>>
>>     hw/virtio-pci: fix virtio behaviour
>>
>>     Enable transitional virtio devices by default.
>>     Enable virtio-1.0 for devices plugged into
>>     PCIe ports (Root ports or Downstream ports).
>>
>>     Using the virtio-1 mode will remove the limitation
>>     of the number of devices that can be attached to a machine
>>     by removing the need for the IO BAR.
>>
>>     Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>>     Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>     Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>>
>>
>> Maybe better to implement the workaround you proposed Stefan?
>>
>> Thanks,
>> Maxime
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

Cornelia Huck Dec. 14, 2016, 10:08 a.m. UTC | #10

On Wed, 14 Dec 2016 10:44:05 +0100
Maxime Coquelin <maxime.coquelin@redhat.com> wrote:

> On 12/14/2016 09:59 AM, Cornelia Huck wrote:
> > On Wed, 14 Dec 2016 08:41:55 +0000
> > Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >
> >> On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
> >> <maxime.coquelin@redhat.com> wrote:
> >>>
> >>>
> >>> On 12/14/2016 08:44 AM, Cornelia Huck wrote:
> >>>>>
> >>>>>> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> >>>>>> 14:45 < stefanha> But we're going to have to keep lying to the guest if
> >>>>>> we want to preserve migration compatibility
> >>>>>> 14:45 < stefanha> The key change in behavior with the patch you
> >>>>>> identified is:
> >>>>>> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
> >>>>>> VIRTIO_F_VERSION_1)) {
> >>>>>> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> >>>>>> 14:46 < stefanha> Previously it didn't care about vdev->host_features.
> >>>>>> It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> >>>>>> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
> >>>>>> disable-modern=true
> >>>>>> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
> >>>>>> definitely still around so I don't think we can ship QEMU 2.8 like this.
> >>>>>> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
> >>>>>> see what Michael Tsirkin and Maxime Coquelin think.
> >>>>>> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
> >>>>>> tell users to set disable-modern= to match their vhost capabilities, but
> >>>>>> it's hard for them to apply that retroactively if they're looking to migrate
> >>>>
> >>>> Another thought: Maybe this bug only surfaced right now because older
> >>>> qemus defaulted virtio-pci to legacy?
> >>>>
> >>>> (I think modern virtio-pci with old vhost resulted in a config that was
> >>>> rejected at least by Linux guests. Because pci defaulted to legacy, we
> >>>> only had the post-plugged workaround for ccw before.)
> >>>
> >>>
> >>> Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
> >>> we get this failure at virtio-pci probe time:
> >>>
> >>> virtio_net virtio0: virtio: device uses modern interface but does not have
> >>> VIRTIO_F_VERSION_1.
> >>
> >> Is this error a regression in QEMU 2.8?
> >
> > I think it pokes up because modern virtio-pci is now by default on. It
> > was broken before if the user wanted a modern virtio-pci device
> > explicitly.
> >
> > (ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
> > solution that this patch replaced and which is basically the same for
> > ccw.)

FWIW, I played around a bit with virsh managedsave and the 2.7 ccw
machine on a non-virtio-1 vhost kernel. Migrating to/from a 2.7 qemu
works fine for ccw both with current master and with this patch
reverted. Feature handling and friends are simpler on ccw...

> >
> >>
> >> It's better to ship with an existing issue still open than with a new
> >> regression.  We must not break existing users' setups.
> >>
> >> A solution for the next QEMU version is to use a flag in the machine
> >> type version telling virtio whether or not allow devices (e.g.
> >> vhost-net) to influence the host feature bits.  Old machine types will
> >> say no, new machine types will say yes.
> >>
> >> In the meantime I would revert your patch for QEMU 2.8.
> >>
> >> Maxime, Cornelia, Michael: Do you agree?
> >>
> >> Stefan
> >
> > Reverting the patch should be fine for ccw. What about the virtio-pci
> > with old vhost mess, though? Defaulting to modern would mean that users
> > get unusable devices in that setup.
> 
> Just did some tests, and can confirm that reverting the patch would
> re-introduce initial bug, which is breaking virtio-pci when host does
> not support VERSION_1.
> 
> Note that this problem is present in v2.7.0 since:
> 
> commit 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
> Author: Marcel Apfelbaum <marcel@redhat.com>
> Date:   Wed Jul 20 18:28:21 2016 +0300
> 
>      hw/virtio-pci: fix virtio behaviour
> 
>      Enable transitional virtio devices by default.
>      Enable virtio-1.0 for devices plugged into
>      PCIe ports (Root ports or Downstream ports).
> 
>      Using the virtio-1 mode will remove the limitation
>      of the number of devices that can be attached to a machine
>      by removing the need for the IO BAR.
> 
>      Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> 
> 
> Maybe better to implement the workaround you proposed Stefan?

Let's summarize a bit:

- current master: default modern, takes vhost capabilities into account
  -> usable device in all cases, but migration broken with old vhost
- reverting this commit: default modern, ignore vhost capabilities
  -> unusable transitional devices with old vhost, but migration works
- lie about features on old machines: default modern, ignore vhost
  capabilites on old machines
  -> unusable transitional devices with old vhost _and_ old machines,
     but migration should work

The third option sounds best right now, but it's not perfect, either.
It's basically the 2.7 machine where it is most likely to bite people,
as older machines defaulted to legacy.

Marcel Apfelbaum Dec. 14, 2016, 11:48 a.m. UTC | #11

On 12/14/2016 12:08 PM, Cornelia Huck wrote:
> On Wed, 14 Dec 2016 10:44:05 +0100
> Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
>
>> On 12/14/2016 09:59 AM, Cornelia Huck wrote:
>>> On Wed, 14 Dec 2016 08:41:55 +0000
>>> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>>
>>>> On Wed, Dec 14, 2016 at 8:28 AM, Maxime Coquelin
>>>> <maxime.coquelin@redhat.com> wrote:
>>>>>
>>>>>
>>>>> On 12/14/2016 08:44 AM, Cornelia Huck wrote:
>>>>>>>
>>>>>>>> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
>>>>>>>> 14:45 < stefanha> But we're going to have to keep lying to the guest if
>>>>>>>> we want to preserve migration compatibility
>>>>>>>> 14:45 < stefanha> The key change in behavior with the patch you
>>>>>>>> identified is:
>>>>>>>> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features,
>>>>>>>> VIRTIO_F_VERSION_1)) {
>>>>>>>> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
>>>>>>>> 14:46 < stefanha> Previously it didn't care about vdev->host_features.
>>>>>>>> It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
>>>>>>>> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with
>>>>>>>> disable-modern=true
>>>>>>>> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is
>>>>>>>> definitely still around so I don't think we can ship QEMU 2.8 like this.
>>>>>>>> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and
>>>>>>>> see what Michael Tsirkin and Maxime Coquelin think.
>>>>>>>> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to
>>>>>>>> tell users to set disable-modern= to match their vhost capabilities, but
>>>>>>>> it's hard for them to apply that retroactively if they're looking to migrate
>>>>>>
>>>>>> Another thought: Maybe this bug only surfaced right now because older
>>>>>> qemus defaulted virtio-pci to legacy?
>>>>>>
>>>>>> (I think modern virtio-pci with old vhost resulted in a config that was
>>>>>> rejected at least by Linux guests. Because pci defaulted to legacy, we
>>>>>> only had the post-plugged workaround for ccw before.)
>>>>>
>>>>>
>>>>> Yes, for PCI with old vhost, modern enabled and recent kernel on guest,
>>>>> we get this failure at virtio-pci probe time:
>>>>>
>>>>> virtio_net virtio0: virtio: device uses modern interface but does not have
>>>>> VIRTIO_F_VERSION_1.
>>>>
>>>> Is this error a regression in QEMU 2.8?
>>>
>>> I think it pokes up because modern virtio-pci is now by default on. It
>>> was broken before if the user wanted a modern virtio-pci device
>>> explicitly.
>>>
>>> (ccw defaulted to virtio 1.0 much earlier, so we had the post-plugged
>>> solution that this patch replaced and which is basically the same for
>>> ccw.)
>
> FWIW, I played around a bit with virsh managedsave and the 2.7 ccw
> machine on a non-virtio-1 vhost kernel. Migrating to/from a 2.7 qemu
> works fine for ccw both with current master and with this patch
> reverted. Feature handling and friends are simpler on ccw...
>
>>>
>>>>
>>>> It's better to ship with an existing issue still open than with a new
>>>> regression.  We must not break existing users' setups.
>>>>
>>>> A solution for the next QEMU version is to use a flag in the machine
>>>> type version telling virtio whether or not allow devices (e.g.
>>>> vhost-net) to influence the host feature bits.  Old machine types will
>>>> say no, new machine types will say yes.
>>>>
>>>> In the meantime I would revert your patch for QEMU 2.8.
>>>>
>>>> Maxime, Cornelia, Michael: Do you agree?
>>>>
>>>> Stefan
>>>
>>> Reverting the patch should be fine for ccw. What about the virtio-pci
>>> with old vhost mess, though? Defaulting to modern would mean that users
>>> get unusable devices in that setup.
>>
>> Just did some tests, and can confirm that reverting the patch would
>> re-introduce initial bug, which is breaking virtio-pci when host does
>> not support VERSION_1.
>>
>> Note that this problem is present in v2.7.0 since:
>>
>> commit 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
>> Author: Marcel Apfelbaum <marcel@redhat.com>
>> Date:   Wed Jul 20 18:28:21 2016 +0300
>>
>>      hw/virtio-pci: fix virtio behaviour
>>
>>      Enable transitional virtio devices by default.
>>      Enable virtio-1.0 for devices plugged into
>>      PCIe ports (Root ports or Downstream ports).
>>
>>      Using the virtio-1 mode will remove the limitation
>>      of the number of devices that can be attached to a machine
>>      by removing the need for the IO BAR.
>>
>>      Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>>      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>>
>>
>> Maybe better to implement the workaround you proposed Stefan?
>
> Let's summarize a bit:
>
> - current master: default modern, takes vhost capabilities into account
>   -> usable device in all cases, but migration broken with old vhost
> - reverting this commit: default modern, ignore vhost capabilities
>   -> unusable transitional devices with old vhost, but migration works
> - lie about features on old machines: default modern, ignore vhost
>   capabilites on old machines
>   -> unusable transitional devices with old vhost _and_ old machines,
>      but migration should work
>

Hi,

> The third option sounds best right now, but it's not perfect, either.
> It's basically the 2.7 machine where it is most likely to bite people,
> as older machines defaulted to legacy.
>

Agreed, the third option is much better than reverting Maxime's patch.

Adding a property like "x-modern-broke" or "x-modern-with-old-vhost-broken"
would be the best thing IMHO.

Thanks,
Marcel

[v2] virtio-bus: Plug devices after features are negotiated

Commit Message

Comments

Patch