Message ID | 20191211152956.5168-5-pdurrant@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen-blkback: support live update | expand |
On 11.12.19 16:29, Paul Durrant wrote: > By simply re-attaching to shared rings during connect_ring() rather than > assuming they are freshly allocated (i.e assuming the counters are zero) > it is possible for vbd instances to be unbound and re-bound from and to > (respectively) a running guest. > > This has been tested by running: > > while true; > do fio --name=randwrite --ioengine=libaio --iodepth=16 \ > --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32; > done > > in a PV guest whilst running: > > while true; > do echo vbd-$DOMID-$VBD >unbind; > echo unbound; > sleep 5; > echo vbd-$DOMID-$VBD >bind; > echo bound; > sleep 3; > done > > in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and > re-bind its system disk image. > > This is a highly useful feature for a backend module as it allows it to be > unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. > This was also tested by running: > > while true; > do echo vbd-$DOMID-$VBD >unbind; > echo unbound; > sleep 5; > rmmod xen-blkback; > echo unloaded; > sleep 1; > modprobe xen-blkback; > echo bound; > cd $(pwd); > sleep 3; > done > > in dom0 whilst running the same loop as above in the (single) PV guest. > > Some (less stressful) testing has also been done using a Windows HVM guest > with the latest 9.0 PV drivers installed. > > Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Juergen
On Wed, Dec 11, 2019 at 03:29:56PM +0000, Paul Durrant wrote: > By simply re-attaching to shared rings during connect_ring() rather than > assuming they are freshly allocated (i.e assuming the counters are zero) > it is possible for vbd instances to be unbound and re-bound from and to > (respectively) a running guest. > > This has been tested by running: > > while true; > do fio --name=randwrite --ioengine=libaio --iodepth=16 \ > --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32; > done > > in a PV guest whilst running: > > while true; > do echo vbd-$DOMID-$VBD >unbind; > echo unbound; > sleep 5; > echo vbd-$DOMID-$VBD >bind; > echo bound; > sleep 3; > done > > in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and > re-bind its system disk image. > > This is a highly useful feature for a backend module as it allows it to be > unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. > This was also tested by running: > > while true; > do echo vbd-$DOMID-$VBD >unbind; > echo unbound; > sleep 5; > rmmod xen-blkback; > echo unloaded; > sleep 1; > modprobe xen-blkback; > echo bound; > cd $(pwd); > sleep 3; > done > > in dom0 whilst running the same loop as above in the (single) PV guest. > > Some (less stressful) testing has also been done using a Windows HVM guest > with the latest 9.0 PV drivers installed. > > Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Thanks! Juergen: I guess you will also pick this series and merge it from the Xen tree instead of the block one? Roger.
On 12.12.19 12:46, Roger Pau Monné wrote: > On Wed, Dec 11, 2019 at 03:29:56PM +0000, Paul Durrant wrote: >> By simply re-attaching to shared rings during connect_ring() rather than >> assuming they are freshly allocated (i.e assuming the counters are zero) >> it is possible for vbd instances to be unbound and re-bound from and to >> (respectively) a running guest. >> >> This has been tested by running: >> >> while true; >> do fio --name=randwrite --ioengine=libaio --iodepth=16 \ >> --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32; >> done >> >> in a PV guest whilst running: >> >> while true; >> do echo vbd-$DOMID-$VBD >unbind; >> echo unbound; >> sleep 5; >> echo vbd-$DOMID-$VBD >bind; >> echo bound; >> sleep 3; >> done >> >> in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and >> re-bind its system disk image. >> >> This is a highly useful feature for a backend module as it allows it to be >> unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. >> This was also tested by running: >> >> while true; >> do echo vbd-$DOMID-$VBD >unbind; >> echo unbound; >> sleep 5; >> rmmod xen-blkback; >> echo unloaded; >> sleep 1; >> modprobe xen-blkback; >> echo bound; >> cd $(pwd); >> sleep 3; >> done >> >> in dom0 whilst running the same loop as above in the (single) PV guest. >> >> Some (less stressful) testing has also been done using a Windows HVM guest >> with the latest 9.0 PV drivers installed. >> >> Signed-off-by: Paul Durrant <pdurrant@amazon.com> > > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> > > Thanks! > > Juergen: I guess you will also pick this series and merge it from the > Xen tree instead of the block one? Yes. Juergen
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 59d576d27ca7..0d4097bdff3f 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -190,6 +190,9 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref, { int err; struct xen_blkif *blkif = ring->blkif; + const struct blkif_common_sring *sring_common; + RING_IDX rsp_prod, req_prod; + unsigned int size; /* Already connected through? */ if (ring->irq) @@ -200,46 +203,62 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref, if (err < 0) return err; + sring_common = (struct blkif_common_sring *)ring->blk_ring; + rsp_prod = READ_ONCE(sring_common->rsp_prod); + req_prod = READ_ONCE(sring_common->req_prod); + switch (blkif->blk_protocol) { case BLKIF_PROTOCOL_NATIVE: { - struct blkif_sring *sring; - sring = (struct blkif_sring *)ring->blk_ring; - BACK_RING_INIT(&ring->blk_rings.native, sring, - XEN_PAGE_SIZE * nr_grefs); + struct blkif_sring *sring_native = + (struct blkif_sring *)ring->blk_ring; + + BACK_RING_ATTACH(&ring->blk_rings.native, sring_native, + rsp_prod, XEN_PAGE_SIZE * nr_grefs); + size = __RING_SIZE(sring_native, XEN_PAGE_SIZE * nr_grefs); break; } case BLKIF_PROTOCOL_X86_32: { - struct blkif_x86_32_sring *sring_x86_32; - sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring; - BACK_RING_INIT(&ring->blk_rings.x86_32, sring_x86_32, - XEN_PAGE_SIZE * nr_grefs); + struct blkif_x86_32_sring *sring_x86_32 = + (struct blkif_x86_32_sring *)ring->blk_ring; + + BACK_RING_ATTACH(&ring->blk_rings.x86_32, sring_x86_32, + rsp_prod, XEN_PAGE_SIZE * nr_grefs); + size = __RING_SIZE(sring_x86_32, XEN_PAGE_SIZE * nr_grefs); break; } case BLKIF_PROTOCOL_X86_64: { - struct blkif_x86_64_sring *sring_x86_64; - sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring; - BACK_RING_INIT(&ring->blk_rings.x86_64, sring_x86_64, - XEN_PAGE_SIZE * nr_grefs); + struct blkif_x86_64_sring *sring_x86_64 = + (struct blkif_x86_64_sring *)ring->blk_ring; + + BACK_RING_ATTACH(&ring->blk_rings.x86_64, sring_x86_64, + rsp_prod, XEN_PAGE_SIZE * nr_grefs); + size = __RING_SIZE(sring_x86_64, XEN_PAGE_SIZE * nr_grefs); break; } default: BUG(); } + err = -EIO; + if (req_prod - rsp_prod > size) + goto fail; + err = bind_interdomain_evtchn_to_irqhandler(blkif->domid, evtchn, xen_blkif_be_int, 0, "blkif-backend", ring); - if (err < 0) { - xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring); - ring->blk_rings.common.sring = NULL; - return err; - } + if (err < 0) + goto fail; ring->irq = err; return 0; + +fail: + xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring); + ring->blk_rings.common.sring = NULL; + return err; } static int xen_blkif_disconnect(struct xen_blkif *blkif) @@ -1131,7 +1150,8 @@ static struct xenbus_driver xen_blkbk_driver = { .ids = xen_blkbk_ids, .probe = xen_blkbk_probe, .remove = xen_blkbk_remove, - .otherend_changed = frontend_changed + .otherend_changed = frontend_changed, + .allow_rebind = true, }; int xen_blkif_xenbus_init(void)
By simply re-attaching to shared rings during connect_ring() rather than assuming they are freshly allocated (i.e assuming the counters are zero) it is possible for vbd instances to be unbound and re-bound from and to (respectively) a running guest. This has been tested by running: while true; do fio --name=randwrite --ioengine=libaio --iodepth=16 \ --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32; done in a PV guest whilst running: while true; do echo vbd-$DOMID-$VBD >unbind; echo unbound; sleep 5; echo vbd-$DOMID-$VBD >bind; echo bound; sleep 3; done in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and re-bind its system disk image. This is a highly useful feature for a backend module as it allows it to be unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. This was also tested by running: while true; do echo vbd-$DOMID-$VBD >unbind; echo unbound; sleep 5; rmmod xen-blkback; echo unloaded; sleep 1; modprobe xen-blkback; echo bound; cd $(pwd); sleep 3; done in dom0 whilst running the same loop as above in the (single) PV guest. Some (less stressful) testing has also been done using a Windows HVM guest with the latest 9.0 PV drivers installed. Signed-off-by: Paul Durrant <pdurrant@amazon.com> --- Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> v3: - Constify sring_common and re-work error handling in xen_blkif_map() v2: - Apply a sanity check to the value of rsp_prod and fail the re-attach if it is implausible - Set allow_rebind to prevent ring from being closed on unbind - Update test workload from dd to fio (with verification) --- drivers/block/xen-blkback/xenbus.c | 56 ++++++++++++++++++++---------- 1 file changed, 38 insertions(+), 18 deletions(-)