Message ID: cover.1589193717.git.lukasstraub2@web.de
Series: Introduce 'yank' oob qmp command to recover from hanging qemu
On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > Hello Everyone, > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > to some other server and that server dies or hangs, qemu hangs too. If qemu as a whole hangs due to a stalled network connection, that is a bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking I/O in general, such that if the network connection or remote server stalls, we simply stop sending I/O - we shouldn't ever hang the QEMU process or main loop. There are places in QEMU code which are not well behaved in this respect, but many are, and others are getting fixed where found to be important. Arguably any place in QEMU code which can result in a hang of QEMU in the event of a stalled network should be considered a security flaw, because the network is untrusted in general. > These patches introduce the new 'yank' out-of-band qmp command to recover from > these kinds of hangs. The different subsystems register callbacks which get > executed with the yank command. For example the callback can shutdown() a > socket. This is intended for the colo use-case, but it can be used for other > things too of course. IIUC, invoking the "yank" command unconditionally kills every single network connection in QEMU that has registered with the "yank" subsystem. IMHO this is way too big of a hammer, even if we accept there are bugs in QEMU not handling stalled networking well. eg if a chardev hangs QEMU, and we tear down everything, killing the NBD connection used for the guest disk, we needlessly break I/O. eg doing this in the chardev backend is not desirable, because the bugs with hanging QEMU are typically caused by the way the frontend device uses the chardev blocking I/O calls, instead of non-blocking I/O calls. Regards, Daniel
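Daniel's argument - that with non-blocking I/O a stalled peer simply stops receiving data rather than wedging the process - can be illustrated with a minimal sketch. This is Python for brevity, not QEMU's actual I/O layer, and the stalled "server" is simulated by a socketpair peer that never reads:

```python
import socket

# A connected pair; 'peer' never reads, simulating a stalled remote server.
sock, peer = socket.socketpair()
sock.setblocking(False)  # non-blocking: send() errors out instead of hanging

sent = 0
blocked = False
try:
    while True:  # fill the kernel socket buffers
        sent += sock.send(b"x" * 65536)
except BlockingIOError:
    # A blocking socket would hang here indefinitely; a non-blocking one
    # returns control, so an event loop can simply stop sending.
    blocked = True

assert blocked and sent > 0
```

With a blocking socket the same loop would be exactly the kind of hang the thread is discussing: the process stuck in send() until the remote side drains the buffer.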
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > Hello Everyone, > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > to some other server and that server dies or hangs, qemu hangs too. > > If qemu as a whole hangs due to a stalled network connection, that is a > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > I/O in general, such that if the network connection or remote server stalls, > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > loop. > > There are places in QEMU code which are not well behaved in this respect, > but many are, and others are getting fixed where found to be important. > > Arguably any place in QEMU code which can result in a hang of QEMU in the > event of a stalled network should be considered a security flaw, because > the network is untrusted in general. That's not really true of the 'management network' - people trust that and I don't see a lot of the qemu code getting fixed safely for all of them. > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > these kinds of hangs. The different subsystems register callbacks which get > > executed with the yank command. For example the callback can shutdown() a > > socket. This is intended for the colo use-case, but it can be used for other > > things too of course. > > IIUC, invoking the "yank" command unconditionally kills every single > network connection in QEMU that has registered with the "yank" subsystem. > IMHO this is way too big of a hammer, even if we accept there are bugs in > QEMU not handling stalled networking well. But isn't this hammer conditional - I see that it's a migration capability for the migration socket, and a flag in nbd - so it only yanks things you've told it to.
> eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > connection used for the guest disk, we needlessly break I/O. > > eg doing this in the chardev backend is not desirable, because the bugs > with hanging QEMU are typically caused by the way the frontend device > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > Having a way to get out of any of these problems from a single point is quite nice. To be useful in COLO you need to know for sure you can get out of any network screwup. We already use shutdown(2) in migrate_cancel and migrate-pause for basically the same reason; I don't think we've got anything similar for NBD, and we probably should have (I think I asked for it fairly recently). Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
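The shutdown(2) technique Dave mentions (and which 'yank' generalizes) works because, unlike close(), shutdown() may safely be issued from another thread while I/O is in flight, and it forces a blocked recv()/send() on that socket to return immediately. A minimal sketch of that behaviour - Python, not the actual migrate_cancel code:

```python
import socket
import threading

sock, peer = socket.socketpair()

def yank():
    # shutdown() can be called from another thread while sock is blocked
    # inside recv(); close() would not be safe to use this way.
    sock.shutdown(socket.SHUT_RDWR)

t = threading.Timer(0.1, yank)
t.start()

data = sock.recv(4096)  # would block forever without the yank
t.join()
assert data == b""      # recv() returns EOF once the socket is shut down
```

This is why shutdown() is the right primitive for recovering from a stalled connection: it needs no cooperation from the thread that is stuck.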
On Mon, May 11, 2020 at 01:07:18PM +0100, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > Hello Everyone, > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > to some other server and that server dies or hangs, qemu hangs too. > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > I/O in general, such that if the network connection or remote server stalls, > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > loop. > > > > There are places in QEMU code which are not well behaved in this respect, > > but many are, and others are getting fixed where found to be important. > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > event of a stalled network should be considered a security flaw, because > > the network is untrusted in general. > > That's not really true of the 'management network' - people trust that > and I don't see a lot of the qemu code getting fixed safely for all of > them. It depends on the user / app / deployment scenario. In OpenStack a lot of work was done to beef up security between services on the mgmt network, with TLS encryption as standard to reduce attack vectors. > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > these kinds of hangs. The different subsystems register callbacks which get > > > executed with the yank command. For example the callback can shutdown() a > > > socket. This is intended for the colo use-case, but it can be used for other > > > things too of course. > > > > IIUC, invoking the "yank" command unconditionally kills every single > > network connection in QEMU that has registered with the "yank" subsystem.
> > IMHO this is way too big of a hammer, even if we accept there are bugs in > > QEMU not handling stalled networking well. > > But isn't this hammer conditional - I see that it's a migration > capability for the migration socket, and a flag in nbd - so it only > yanks things you've told it to. IIUC, you have to set these flags upfront when you launch QEMU, or hotplug the device using the feature. When something gets stuck, and you issue the "yank" command, then everything that has the flag enabled gets torn down. So in practice it looks like the flag will get enabled for everything at QEMU startup, and yanking would tear down everything. > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > connection used for the guest disk, we needlessly break I/O. > > > > eg doing this in the chardev backend is not desirable, because the bugs > > with hanging QEMU are typically caused by the way the frontend device > > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > > > > Having a way to get out of any of these problems from a single point is > quite nice. To be useful in COLO you need to know for sure you can get > out of any network screwup. > > We already use shutdown(2) in migrate_cancel and migrate-pause for > basically the same reason; I don't think we've got anything similar for > NBD, and we probably should have (I think I asked for it fairly > recently). Yes, the migrate_cancel is an example of a more fine-grained way to recover. I was thinking that we need an equivalent fine control knob for NBD too. That way if QEMU does get stuck, you can start by tearing down the least disruptive channel. eg try tearing down the migration connection first (which shouldn't negatively impact the guest), and only if that doesn't work, then move on to tear down the NBD connection (which risks data loss). Regards, Daniel
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, May 11, 2020 at 01:07:18PM +0100, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > > Hello Everyone, > > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > > to some other server and that server dies or hangs, qemu hangs too. > > > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > > I/O in general, such that if the network connection or remote server stalls, > > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > > loop. > > > > > > There are places in QEMU code which are not well behaved in this respect, > > > but many are, and others are getting fixed where found to be important. > > > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > > event of a stalled network should be considered a security flaw, because > > > the network is untrusted in general. > > > > That's not really true of the 'management network' - people trust that > > and I don't see a lot of the qemu code getting fixed safely for all of > > them. > > It depends on the user / app / deployment scenario. In OpenStack alot of > work was done to beef up security between services on the mgmt network, > with TLS encryption as standard to reduce attack vectors. > > > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > > these kinds of hangs. The different subsystems register callbacks which get > > > > executed with the yank command. For example the callback can shutdown() a > > > > socket. This is intended for the colo use-case, but it can be used for other > > > > things too of course. 
> > > > > > IIUC, invoking the "yank" command unconditionally kills every single > > > network connection in QEMU that has registered with the "yank" subsystem. > > > IMHO this is way too big of a hammer, even if we accept there are bugs in > > > QEMU not handling stalled networking well. > > > > But isn't this hammer conditional - I see that it's a migration > > capabiltiy for the migration socket, and a flag in nbd - so it only > > yanks things you've told it to. > > IIUC, you have to set these flags upfront when you launch QEMU, or > hotplug the device using the feature. When something gets stuck, > and you issue the "yank" command, then everything that has the flag > enabled gets torn down. So in practice it looks like the flag will > get enabled for everything at QEMU startup, and yanking down tear > down everything. For COLO I really expect it for the migration stream, the disk mirroring stream and probably the network comparison/forwarding streams. > > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > > connection used for the guest disk, we needlessly break I/O. > > > > > > eg doing this in the chardev backend is not desirable, because the bugs > > > with hanging QEMU are typically caused by the way the frontend device > > > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > > > > > > > Having a way to get out of any of these problems from a single point is > > quite nice. To be useful in COLO you need to know for sure you can get > > out of any network screwup. > > > > We already use shutdown(2) in migrate_cancel and migrate-pause for > > basically the same reason; I don't think we've got anything similar for > > NBD, and we probably should have (I think I asked for it fairly > > recently). > > Yes, the migrate_cancel is an example of a more fine grained way to > recover. I was thinking that we need an equivalent fine control knob > for NBD too. 
I feel it might be nice not to have to create so many separate knobs. > That way if QEMU does get stuck, you can start by tearing down the > least distruptive channel. eg try tearing down the migration connection > first (which shouldn't negatively impact the guest), and only if that > doesn't work then, move on to tear down the NBD connection (which risks > data loss) I wonder if a different way would be to make all network connections register with yank, but then make yank take a list of connections to shutdown(2). Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Mon, 11 May 2020 12:49:47 +0100 Daniel P. Berrangé <berrange@redhat.com> wrote: > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > Hello Everyone, > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > to some other server and that server dies or hangs, qemu hangs too. > > If qemu as a whole hangs due to a stalled network connection, that is a > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > I/O in general, such that if the network connection or remote server stalls, > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > loop. > > There are places in QEMU code which are not well behaved in this respect, > but many are, and others are getting fixed where found to be important. > > Arguably any place in QEMU code which can result in a hang of QEMU in the > event of a stalled network should be considered a security flaw, because > the network is untrusted in general. The fact that out-of-band qmp commands exist at all shows that we have to make tradeoffs of developer time vs. doing things right. Sure, the migration code can be rewritten to use non-blocking I/O and fine-grained locks. But as a hobbyist I don't have time to fix this. > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > these kinds of hangs. The different subsystems register callbacks which get > > executed with the yank command. For example the callback can shutdown() a > > socket. This is intended for the colo use-case, but it can be used for other > > things too of course. > > IIUC, invoking the "yank" command unconditionally kills every single > network connection in QEMU that has registered with the "yank" subsystem. > IMHO this is way too big of a hammer, even if we accept there are bugs in > QEMU not handling stalled networking well.
> > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > connection used for the guest disk, we needlessly break I/O. Yeah, these patches are intended to solve the problems with the colo use-case where all external connections (migration, chardevs, nbd) are just for replication. In other use-cases you'd enable the yank feature only on the non-essential connections. > eg doing this in the chardev backend is not desirable, because the bugs > with hanging QEMU are typically caused by the way the frontend device > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > > > Regards, > Daniel
On Mon, 11 May 2020 13:17:14 +0100 Daniel P. Berrangé <berrange@redhat.com> wrote: > On Mon, May 11, 2020 at 01:07:18PM +0100, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > > Hello Everyone, > > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > > to some other server and that server dies or hangs, qemu hangs too. > > > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > > I/O in general, such that if the network connection or remote server stalls, > > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > > loop. > > > > > > There are places in QEMU code which are not well behaved in this respect, > > > but many are, and others are getting fixed where found to be important. > > > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > > event of a stalled network should be considered a security flaw, because > > > the network is untrusted in general. > > > > That's not really true of the 'management network' - people trust that > > and I don't see a lot of the qemu code getting fixed safely for all of > > them. > > It depends on the user / app / deployment scenario. In OpenStack alot of > work was done to beef up security between services on the mgmt network, > with TLS encryption as standard to reduce attack vectors. > > > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > > these kinds of hangs. The different subsystems register callbacks which get > > > > executed with the yank command. For example the callback can shutdown() a > > > > socket. This is intended for the colo use-case, but it can be used for other > > > > things too of course. 
> > > > > > IIUC, invoking the "yank" command unconditionally kills every single > > > network connection in QEMU that has registered with the "yank" subsystem. > > > IMHO this is way too big of a hammer, even if we accept there are bugs in > > > QEMU not handling stalled networking well. > > > > But isn't this hammer conditional - I see that it's a migration > > capabiltiy for the migration socket, and a flag in nbd - so it only > > yanks things you've told it to. > > IIUC, you have to set these flags upfront when you launch QEMU, or > hotplug the device using the feature. When something gets stuck, > and you issue the "yank" command, then everything that has the flag > enabled gets torn down. So in practice it looks like the flag will > get enabled for everything at QEMU startup, and yanking down tear > down everything. > > > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > > connection used for the guest disk, we needlessly break I/O. > > > > > > eg doing this in the chardev backend is not desirable, because the bugs > > > with hanging QEMU are typically caused by the way the frontend device > > > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > > > > > > > Having a way to get out of any of these problems from a single point is > > quite nice. To be useful in COLO you need to know for sure you can get > > out of any network screwup. > > > > We already use shutdown(2) in migrate_cancel and migrate-pause for > > basically the same reason; I don't think we've got anything similar for > > NBD, and we probably should have (I think I asked for it fairly > > recently). > > Yes, the migrate_cancel is an example of a more fine grained way to > recover. I was thinking that we need an equivalent fine control knob > for NBD too. One reason why the yank feature is done this way is that the management application may not know in what state qemu is and so it doesn't know what to yank. 
Poking in the dark would work too in my case, but it's not that nice. Regards, Lukas Straub > That way if QEMU does get stuck, you can start by tearing down the > least distruptive channel. eg try tearing down the migration connection > first (which shouldn't negatively impact the guest), and only if that > doesn't work then, move on to tear down the NBD connection (which risks > data loss) > > Regards, > Daniel
On Mon, May 11, 2020 at 08:12:18PM +0200, Lukas Straub wrote: > On Mon, 11 May 2020 12:49:47 +0100 > Daniel P. Berrangé <berrange@redhat.com> wrote: > > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > Hello Everyone, > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > to some other server and that server dies or hangs, qemu hangs too. > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > I/O in general, such that if the network connection or remote server stalls, > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > loop. > > > > There are places in QEMU code which are not well behaved in this respect, > > but many are, and others are getting fixed where found to be important. > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > event of a stalled network should be considered a security flaw, because > > the network is untrusted in general. > > The fact that out-of-band qmp commands exist at all shows that we have to make tradeoffs of developer time vs. doing things right. Sure, the migration code can be rewritten to use non-blocking i/o and finegrained locks. But as a hobbyist I don't have time to fix this. > > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > these kinds of hangs. The different subsystems register callbacks which get > > > executed with the yank command. For example the callback can shutdown() a > > > socket. This is intended for the colo use-case, but it can be used for other > > > things too of course. > > > > IIUC, invoking the "yank" command unconditionally kills every single > > network connection in QEMU that has registered with the "yank" subsystem. 
> > IMHO this is way too big of a hammer, even if we accept there are bugs in > > QEMU not handling stalled networking well. > > > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > connection used for the guest disk, we needlessly break I/O. > > Yeah, these patches are intended to solve the problems with the colo > use-case where all external connections (migration, chardevs, nbd) > are just for replication. In other use-cases you'd enable the yank > feature only on the non-essential connections. That is a pretty inflexible design for other use cases though, as "non-essential" is not a black & white list in general. There are varying levels of importance to the different channels. We can afford to lose migration without any user visible effects. If that doesn't solve it, a serial device chardev, or VNC connection can be dropped at the inconvenience of losing the interactive console, which is an end-user-visible impact, so it may only want to be yanked if the migration yank didn't fix it. Regards, Daniel
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, May 11, 2020 at 08:12:18PM +0200, Lukas Straub wrote: > > On Mon, 11 May 2020 12:49:47 +0100 > > Daniel P. Berrangé <berrange@redhat.com> wrote: > > > > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > > Hello Everyone, > > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > > to some other server and that server dies or hangs, qemu hangs too. > > > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > > I/O in general, such that if the network connection or remote server stalls, > > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > > loop. > > > > > > There are places in QEMU code which are not well behaved in this respect, > > > but many are, and others are getting fixed where found to be important. > > > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > > event of a stalled network should be considered a security flaw, because > > > the network is untrusted in general. > > > > The fact that out-of-band qmp commands exist at all shows that we have to make tradeoffs of developer time vs. doing things right. Sure, the migration code can be rewritten to use non-blocking i/o and finegrained locks. But as a hobbyist I don't have time to fix this. > > > > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > > these kinds of hangs. The different subsystems register callbacks which get > > > > executed with the yank command. For example the callback can shutdown() a > > > > socket. This is intended for the colo use-case, but it can be used for other > > > > things too of course. > > > > > > IIUC, invoking the "yank" command unconditionally kills every single > > > network connection in QEMU that has registered with the "yank" subsystem. 
> > > IMHO this is way too big of a hammer, even if we accept there are bugs in > > > QEMU not handling stalled networking well. > > > > > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > > connection used for the guest disk, we needlessly break I/O. > > > > Yeah, these patches are intended to solve the problems with the colo > > use-case where all external connections (migration, chardevs, nbd) > > are just for replication. In other use-cases you'd enable the yank > > feature only on the non-essential connections. > > That is a pretty inflexible design for other use cases though, > as "non-essential" is not a black & white list in general. There > are varying levels of importance to the different channels. We > can afford to loose migration without any user visible effects. > If that doesn't solve it, a serial device chardev, or VNC connection > can be dropped at the inconvenience of loosing interactive console > which is end user visible impact, so may only be want to be yanked > if the migration yank didn't fix it. In the case of COLO that's not the case though - here we explicitly want to kill the migration to be able to ensure that we can recover - and we're under time pressure to get the other member of the pair running again. Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Lukas Straub (lukasstraub2@web.de) wrote: > On Mon, 11 May 2020 12:49:47 +0100 > Daniel P. Berrangé <berrange@redhat.com> wrote: > > > On Mon, May 11, 2020 at 01:14:34PM +0200, Lukas Straub wrote: > > > Hello Everyone, > > > In many cases, if qemu has a network connection (qmp, migration, chardev, etc.) > > > to some other server and that server dies or hangs, qemu hangs too. > > > > If qemu as a whole hangs due to a stalled network connection, that is a > > bug in QEMU that we should be fixing IMHO. QEMU should be doing non-blocking > > I/O in general, such that if the network connection or remote server stalls, > > we simply stop sending I/O - we shouldn't ever hang the QEMU process or main > > loop. > > > > There are places in QEMU code which are not well behaved in this respect, > > but many are, and others are getting fixed where found to be important. > > > > Arguably any place in QEMU code which can result in a hang of QEMU in the > > event of a stalled network should be considered a security flaw, because > > the network is untrusted in general. > > The fact that out-of-band qmp commands exist at all shows that we have to make tradeoffs of developer time vs. doing things right. Sure, the migration code can be rewritten to use non-blocking I/O and fine-grained locks. But as a hobbyist I don't have time to fix this. If it was just a hobbyist vs full-time thing then I'd say that was a bad way to make the decision; however the reality is that even those who are paid to watch this code don't have the feeling we can make it fail quickly/non-blocking - and for COLO you need to be absolutely sure you nail every case, so you'd somehow have to audit the whole lot, and keep watching it. (and thank you for taking your time to do this!) Dave > > > These patches introduce the new 'yank' out-of-band qmp command to recover from > > > these kinds of hangs. The different subsystems register callbacks which get > > > executed with the yank command.
For example the callback can shutdown() a > > > socket. This is intended for the colo use-case, but it can be used for other > > > things too of course. > > > > IIUC, invoking the "yank" command unconditionally kills every single > > network connection in QEMU that has registered with the "yank" subsystem. > > IMHO this is way too big of a hammer, even if we accept there are bugs in > > QEMU not handling stalled networking well. > > > > eg if a chardev hangs QEMU, and we tear down everything, killing the NBD > > connection used for the guest disk, we needlessly break I/O. > > Yeah, these patches are intended to solve the problems with the colo use-case where all external connections (migration, chardevs, nbd) are just for replication. In other use-cases you'd enable the yank feature only on the non-essential connections. > > > eg doing this in the chardev backend is not desirable, because the bugs > > with hanging QEMU are typically caused by the way the frontend device > > uses the chardev blocking I/O calls, instead of non-blocking I/O calls. > > > > > > Regards, > > Daniel > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Mon, 11 May 2020 16:46:45 +0100 "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > ... > > That way if QEMU does get stuck, you can start by tearing down the > > least disruptive channel. eg try tearing down the migration connection > > first (which shouldn't negatively impact the guest), and only if that > > doesn't work then, move on to tear down the NBD connection (which risks > > data loss) > > I wonder if a different way would be to make all network connections > register with yank, but then make yank take a list of connections to > shutdown(2). Good idea. We could name the connections (/yank callbacks) in the form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" (and add "netdev:...", etc. in the future). Then make yank take a list of connection names as you suggest and silently ignore connections that don't exist. And maybe even add a 'query-yank' oob command returning a list of registered connections so the management application can do pattern matching if it wants. Comments? Regards, Lukas Straub > Dave > > > Regards, > > Daniel > > -- > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > > |: https://libvirt.org -o- https://fstop138.berrange.com :| > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK >
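The semantics Lukas proposes here can be sketched as a small registry. This is a hypothetical Python model of the discussed interface, not the eventual QEMU implementation; the instance names and the silently-ignore behaviour are taken from the proposal above:

```python
# Hypothetical model: connections register a teardown callback under a name
# like "nbd:<node-name>", "chardev:<chardev-name>" or "migration"; yank
# takes a list of names, and query-yank lists what is registered.
yank_callbacks = {}

def yank_register(name, callback):
    yank_callbacks[name] = callback

def yank_unregister(name):
    yank_callbacks.pop(name, None)

def qmp_yank(instances):
    for name in instances:
        cb = yank_callbacks.get(name)
        if cb is not None:   # unknown names are silently ignored
            cb()

def qmp_query_yank():
    return sorted(yank_callbacks)

# Example: register two connections, then yank one plus a name that was
# never registered.
yanked = []
yank_register("migration", lambda: yanked.append("migration"))
yank_register("chardev:console0", lambda: yanked.append("chardev:console0"))
qmp_yank(["migration", "nbd:disk0"])   # "nbd:disk0" not registered: ignored
assert yanked == ["migration"]
assert qmp_query_yank() == ["chardev:console0", "migration"]
```

In QEMU the callbacks would be shutdown(2)-style operations that are safe to run from the OOB monitor thread; the registry itself only decides which of them fire.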
On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > On Mon, 11 May 2020 16:46:45 +0100 > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > ... > > > That way if QEMU does get stuck, you can start by tearing down the > > > least disruptive channel. eg try tearing down the migration connection > > > first (which shouldn't negatively impact the guest), and only if that > > > doesn't work then, move on to tear down the NBD connection (which risks > > > data loss) > > > > I wonder if a different way would be to make all network connections > > register with yank, but then make yank take a list of connections to > > shutdown(2). > > Good Idea. We could name the connections (/yank callbacks) in the > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > (and add "netdev:...", etc. in the future). Then make yank take a > list of connection names as you suggest and silently ignore connections > that don't exist. And maybe even add a 'query-yank' oob command returning > a list of registered connections so the management application can do > pattern matching if it wants. Yes, that would make the yank command much more flexible in how it can be used. As an alternative to using formatted strings like this, it could be modelled more explicitly in QAPI { 'struct': 'YankChannels', 'data': { 'chardev': [ 'string' ], 'nbd': ['string'], 'migration': 'bool' } } In this example, 'chardev' would accept a list of chardev IDs which have it enabled, 'nbd' would accept a list of block node IDs which have it enabled, and migration is a singleton on/off. The benefit of this modelling is that you can introspect QEMU to discover what classes of channels support being yanked by this QEMU build, as well as what instances are configured to be yanked.
ie you can distinguish between a QEMU that doesn't support yanking network devices, from a QEMU that does support yanking network devices, but doesn't have it enabled for any network device instances. Regards, Daniel
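For reference, actual QAPI schema syntax spells the built-in string type 'str' and quotes even built-in type names, so a compilable version of this sketch might look like the following (the 'YankChannels' wrapper and the command shape are still assumptions from this thread):

```
{ 'struct': 'YankChannels',
  'data': { 'chardev': [ 'str' ],
            'nbd': [ 'str' ],
            'migration': 'bool' } }

{ 'command': 'yank',
  'data': { 'channels': 'YankChannels' },
  'allow-oob': true }
```

Because the struct is part of the introspectable schema, adding e.g. a 'nic' field later is discoverable via query-qmp-schema, which is the introspection benefit described above.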
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > On Mon, 11 May 2020 16:46:45 +0100 > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > ... > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > least distruptive channel. eg try tearing down the migration connection > > > > first (which shouldn't negatively impact the guest), and only if that > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > data loss) > > > > > > I wonder if a different way would be to make all network connections > > > register with yank, but then make yank take a list of connections to > > > shutdown(2). > > > > Good Idea. We could name the connections (/yank callbacks) in the > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > (and add "netdev:...", etc. in the future). Then make yank take a > > list of connection names as you suggest and silently ignore connections > > that don't exist. And maybe even add a 'query-yank' oob command returning > > a list of registered connections so the management application can do > > pattern matching if it wants. > > Yes, that would make the yank command much more flexible in how it can > be used. > > As an alternative to using formatted strings like this, it could be > modelled more explicitly in QAPI > > { 'struct': 'YankChannels', > 'data': { 'chardev': [ 'string' ], > 'nbd': ['string'], > 'migration': bool } } > > In this example, 'chardev' would accept a list of chardev IDs which > have it enabled, 'nbd' would accept a list of block node IDs which > have it enabled, and migration is a singleton on/off. Do we already have a QOM object name for each of these things? Is that nbd/blockdevice unique - i.e. can you have multiple nbd clients on the same node? 
> The benefit of this modelling is that you can introspect QEMU > to discover what classes of channels support being yanked by > this QEMU build, as well as what instances are configured to > be yanked. ie you can distinguish between a QEMU that doesn't > support yanking network devices, from a QEMU that does support > yanking network devices, but doesn't have it enabled for any > network device instances. What do we need to make it introspectable like that? Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Tue, 12 May 2020 10:43:37 +0100 Daniel P. Berrangé <berrange@redhat.com> wrote: > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > ... > > > > Good Idea. We could name the connections (/yank callbacks) in the > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > (and add "netdev:...", etc. in the future). Then make yank take a > > list of connection names as you suggest and silently ignore connections > > that don't exist. And maybe even add a 'query-yank' oob command returning > > a list of registered connections so the management application can do > > pattern matching if it wants. > > Yes, that would make the yank command much more flexible in how it can > be used. > > As an alternative to using formatted strings like this, it could be > modelled more explicitly in QAPI > > { 'struct': 'YankChannels', > 'data': { 'chardev': [ 'string' ], > 'nbd': ['string'], > 'migration': bool } } > > In this example, 'chardev' would accept a list of chardev IDs which > have it enabled, 'nbd' would accept a list of block node IDs which > have it enabled, and migration is a singleton on/off. With the new command, the yank feature can then be enabled unconditionally on the instances. > The benefit of this modelling is that you can introspect QEMU > to discover what classes of channels support being yanked by > this QEMU build, as well as what instances are configured to > be yanked. ie you can distinguish between a QEMU that doesn't > support yanking network devices, from a QEMU that does support > yanking network devices, but doesn't have it enabled for any > network device instances. > > Regards, > Daniel
On Tue, May 12, 2020 at 07:58:17PM +0100, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > On Mon, 11 May 2020 16:46:45 +0100 > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > ... > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > data loss) > > > > > > > > I wonder if a different way would be to make all network connections > > > > register with yank, but then make yank take a list of connections to > > > > shutdown(2). > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > list of connection names as you suggest and silently ignore connections > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > a list of registered connections so the management application can do > > > pattern matching if it wants. > > > > Yes, that would make the yank command much more flexible in how it can > > be used. > > > > As an alternative to using formatted strings like this, it could be > > modelled more explicitly in QAPI > > > > { 'struct': 'YankChannels', > > 'data': { 'chardev': [ 'string' ], > > 'nbd': ['string'], > > 'migration': bool } } > > > > In this example, 'chardev' would accept a list of chardev IDs which > > have it enabled, 'nbd' would accept a list of block node IDs which > > have it enabled, and migration is a singleton on/off. > > Do we already have a QOM object name for each of these things? 
> Is that nbd/blockdevice unique - i.e. can you have multiple nbd clients > on the same node? > > > The benefit of this modelling is that you can introspect QEMU > > to discover what classes of channels support being yanked by > > this QEMU build, as well as what instances are configured to > > be yanked. ie you can distinguish between a QEMU that doesn't > > support yanking network devices, from a QEMU that does support > > yanking network devices, but doesn't have it enabled for any > > network device instances. > > What do we need to make it introspectable like that? The model I describe above would work, because you can introspect the QAPI schema to see what fields are in the "YankChannels" struct. So if we added a "nic" field later, apps can discover it. Regards, Daniel
Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > On Mon, 11 May 2020 16:46:45 +0100 > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > ... > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > least distruptive channel. eg try tearing down the migration connection > > > > first (which shouldn't negatively impact the guest), and only if that > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > data loss) > > > > > > I wonder if a different way would be to make all network connections > > > register with yank, but then make yank take a list of connections to > > > shutdown(2). > > > > Good Idea. We could name the connections (/yank callbacks) in the > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > (and add "netdev:...", etc. in the future). Then make yank take a > > list of connection names as you suggest and silently ignore connections > > that don't exist. And maybe even add a 'query-yank' oob command returning > > a list of registered connections so the management application can do > > pattern matching if it wants. I'm generally not a big fan of silently ignoring things. Is there a specific requirement to do it in this case, or can management applications be expected to know which connections exist? > Yes, that would make the yank command much more flexible in how it can > be used. > > As an alternative to using formatted strings like this, it could be > modelled more explicitly in QAPI > > { 'struct': 'YankChannels', > 'data': { 'chardev': [ 'string' ], > 'nbd': ['string'], > 'migration': bool } } > > In this example, 'chardev' would accept a list of chardev IDs which > have it enabled, 'nbd' would accept a list of block node IDs which > have it enabled, and migration is a singleton on/off. 
Of course, it also means that the yank code needs to know about every single object that supports the operation, whereas if you only have strings, the objects could keep registering their connection with a generic function like yank_register_function() in this version. I'm not sure if the additional complexity is worth the benefits. > The benefit of this modelling is that you can introspect QEMU > to discover what classes of channels support being yanked by > this QEMU build, as well as what instances are configured to > be yanked. ie you can distinguish between a QEMU that doesn't > support yanking network devices, from a QEMU that does support > yanking network devices, but doesn't have it enabled for any > network device instances. This is true, though I think Lukas' suggestion with query-yank should be as good in practice (you can't check before creating the NBD device then, but would you actually want to do this?). And if all else fails, we can still add a few more feature flags to the schema... Kevin
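A minimal sketch of the generic, string-keyed registration approach Kevin describes: objects register a shutdown callback under a namespaced name, and yank looks names up without knowing anything about the object classes. All identifiers here (yank_register_function, the table layout, the skip-unknown-names behaviour) are assumptions for illustration, not QEMU's actual implementation:

```c
#include <string.h>

#define YANK_MAX 16

typedef void (*YankFn)(void *opaque);

typedef struct YankEntry {
    char name[64];  /* e.g. "chardev:serial0", "nbd:node0", "migration" */
    YankFn fn;
    void *opaque;
} YankEntry;

static YankEntry yank_table[YANK_MAX];
static int yank_count;

/* Register a shutdown callback under a namespaced connection name. */
static void yank_register_function(const char *name, YankFn fn, void *opaque)
{
    if (yank_count < YANK_MAX) {
        strncpy(yank_table[yank_count].name, name,
                sizeof(yank_table[0].name) - 1);
        yank_table[yank_count].fn = fn;
        yank_table[yank_count].opaque = opaque;
        yank_count++;
    }
}

/* Invoke the callbacks for a list of connection names; returns how many
 * matched.  Unknown names are skipped silently here, which is exactly the
 * behaviour debated above. */
static int yank(const char **names, int n)
{
    int hit = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < yank_count; j++) {
            if (strcmp(names[i], yank_table[j].name) == 0) {
                yank_table[j].fn(yank_table[j].opaque);
                hit++;
            }
        }
    }
    return hit;
}

/* Example callback: records that the connection was shut down; a real
 * one would shutdown(2) a socket. */
static void demo_shutdown(void *opaque)
{
    *(int *)opaque = 1;
}
```

The trade-off is visible in the code: the registry needs no per-subsystem knowledge, but in exchange nothing here is introspectable through the QAPI schema.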
* Kevin Wolf (kwolf@redhat.com) wrote: > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > On Mon, 11 May 2020 16:46:45 +0100 > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > ... > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > data loss) > > > > > > > > I wonder if a different way would be to make all network connections > > > > register with yank, but then make yank take a list of connections to > > > > shutdown(2). > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > list of connection names as you suggest and silently ignore connections > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > a list of registered connections so the management application can do > > > pattern matching if it wants. > > I'm generally not a big fan of silently ignoring things. Is there a > specific requirement to do it in this case, or can management > applications be expected to know which connections exist? > > > Yes, that would make the yank command much more flexible in how it can > > be used. 
> > > > As an alternative to using formatted strings like this, it could be > > modelled more explicitly in QAPI > > > > { 'struct': 'YankChannels', > > 'data': { 'chardev': [ 'string' ], > > 'nbd': ['string'], > > 'migration': bool } } > > > > In this example, 'chardev' would accept a list of chardev IDs which > > have it enabled, 'nbd' would accept a list of block node IDs which > > have it enabled, and migration is a singleton on/off. > > Of course, it also means that the yank code needs to know about every > single object that supports the operation, whereas if you only have > strings, the objects could keep registering their connection with a > generic function like yank_register_function() in this version. > > I'm not sure if the additional complexity is worth the benefits. I tend to agree; although we do have to ensure we either use an existing naming scheme (e.g. QOM object names?) or make sure we've got a well defined list of prefixes. Dave > > > The benefit of this modelling is that you can introspect QEMU > > to discover what classes of channels support being yanked by > > this QEMU build, as well as what instances are configured to > > be yanked. ie you can distinguish between a QEMU that doesn't > > support yanking network devices, from a QEMU that does support > > yanking network devices, but doesn't have it enabled for any > > network device instances. > > This is true, though I think Lukas' suggestion with query-yank should be > as good in practice (you can't check before creating the NBD device > then, but would you actually want to do this?). > > And if all else fails, we can still add a few more feature flags to the > schema... > > Kevin -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Am 13.05.2020 um 12:53 hat Dr. David Alan Gilbert geschrieben: > * Kevin Wolf (kwolf@redhat.com) wrote: > > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > > On Mon, 11 May 2020 16:46:45 +0100 > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > > ... > > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > > data loss) > > > > > > > > > > I wonder if a different way would be to make all network connections > > > > > register with yank, but then make yank take a list of connections to > > > > > shutdown(2). > > > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > > list of connection names as you suggest and silently ignore connections > > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > > a list of registered connections so the management application can do > > > > pattern matching if it wants. > > > > I'm generally not a big fan of silently ignoring things. Is there a > > specific requirement to do it in this case, or can management > > applications be expected to know which connections exist? > > > > > Yes, that would make the yank command much more flexible in how it can > > > be used. 
> > > > > > As an alternative to using formatted strings like this, it could be > > > modelled more explicitly in QAPI > > > > > > { 'struct': 'YankChannels', > > > 'data': { 'chardev': [ 'string' ], > > > 'nbd': ['string'], > > > 'migration': bool } } > > > > > > In this example, 'chardev' would accept a list of chardev IDs which > > > have it enabled, 'nbd' would accept a list of block node IDs which > > > have it enabled, and migration is a singleton on/off. > > > > Of course, it also means that the yank code needs to know about every > > single object that supports the operation, whereas if you only have > > strings, the objects could keep registering their connection with a > > generic function like yank_register_function() in this version. > > > > I'm not sure if the additional complexity is worth the benefits. > > I tend to agree; although we do have to ensure we either use an existing > naming scheme (e.g. QOM object names?) or make sure we've got a well > defined list of prefixes. Not everything that has a network connection is a QOM object (in fact, neither migration nor chardev nor nbd are QOM objects). I guess it would be nice to have a single namespace for everything in QEMU, but the reality is that we have a few separate ones. As long as we consistently add a prefix that identifies the namespace in question, I think that would work. This means that if we're using node-name to identify the NBD connection, the namespace should be 'block' rather than 'nbd'. One more thing to consider is, what if a single object has multiple connections? In the case of node-names, we have a limited set of allowed characters, so we can use one of the remaining characters as a separator and then suffix a counter. In other places, the identifier isn't restricted, so suffixing doesn't work. Maybe prefixing does, but it would have to be there from the beginning then. And another thing: Do we really want to document this as limited to network connections? 
Another common cause of hangs is when you have image files on an NFS mount and the connection goes away. Of course, in the end this is still networking, but inside of QEMU it looks like accessing any other file. I'm not sure that we'll allow yanking access to image files anytime soon, but it might not hurt to keep it at the back of our mind as a potential option we might want the design to allow. Kevin
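The naming concern above could be handled by always composing names as "<namespace>:<id>", with a reserved counter suffix for objects that own several connections; a minimal sketch (the ':' and '#' separator choices are assumptions, relying on '#' being outside the restricted node-name character set):

```c
#include <stdio.h>

/* Compose "<ns>:<id>" for a single connection, or "<ns>:<id>#<n>" for
 * the n-th connection of an object that owns several.  For identifiers
 * whose character set is unrestricted, the suffix could collide with a
 * user-chosen name, which is why Kevin suggests it would have to be part
 * of the scheme from the beginning. */
static void yank_format_name(char *buf, size_t len,
                             const char *ns, const char *id, int conn)
{
    if (conn < 0) {
        snprintf(buf, len, "%s:%s", ns, id);
    } else {
        snprintf(buf, len, "%s:%s#%d", ns, id, conn);
    }
}
```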
On Wed, May 13, 2020 at 01:13:20PM +0200, Kevin Wolf wrote: > Am 13.05.2020 um 12:53 hat Dr. David Alan Gilbert geschrieben: > > * Kevin Wolf (kwolf@redhat.com) wrote: > > > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > > > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > > > On Mon, 11 May 2020 16:46:45 +0100 > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > > > ... > > > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > > > data loss) > > > > > > > > > > > > I wonder if a different way would be to make all network connections > > > > > > register with yank, but then make yank take a list of connections to > > > > > > shutdown(2). > > > > > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > > > list of connection names as you suggest and silently ignore connections > > > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > > > a list of registered connections so the management application can do > > > > > pattern matching if it wants. > > > > > > I'm generally not a big fan of silently ignoring things. Is there a > > > specific requirement to do it in this case, or can management > > > applications be expected to know which connections exist? > > > > > > > Yes, that would make the yank command much more flexible in how it can > > > > be used. 
> > > > > > > > As an alternative to using formatted strings like this, it could be > > > > modelled more explicitly in QAPI > > > > > > > > { 'struct': 'YankChannels', > > > > 'data': { 'chardev': [ 'string' ], > > > > 'nbd': ['string'], > > > > 'migration': bool } } > > > > > > > > In this example, 'chardev' would accept a list of chardev IDs which > > > > have it enabled, 'nbd' would accept a list of block node IDs which > > > > have it enabled, and migration is a singleton on/off. > > > > > > Of course, it also means that the yank code needs to know about every > > > single object that supports the operation, whereas if you only have > > > strings, the objects could keep registering their connection with a > > > generic function like yank_register_function() in this version. > > > > > > I'm not sure if the additional complexity is worth the benefits. > > > > I tend to agree; although we do have to ensure we either use an existing > > naming scheme (e.g. QOM object names?) or make sure we've got a well > > defined list of prefixes. > > Not everything that has a network connection is a QOM object (in fact, > neither migration nor chardev nor nbd are QOM objects). > > I guess it would be nice to have a single namespace for everything in > QEMU, but the reality is that we have a few separate ones. As long as we > consistently add a prefix that identifies the namespace in question, I > think that would work. > > This means that if we're using node-name to identify the NBD connection, > the namespace should be 'block' rather than 'nbd'. > > One more thing to consider is, what if a single object has multiple > connections? In the case of node-names, we have a limited set of allowed > characters, so we can use one of the remaining characters as a separator > and then suffix a counter. In other places, the identifier isn't > restricted, so suffixing doesn't work. Maybe prefixing does, but it > would have to be there from the beginning then. 
> > And another thing: Do we really want to document this as limited to > network connections? Another common cause of hangs is when you have > image files on an NFS mount and the connection goes away. Of course, in > the end this is still networking, but inside of QEMU it looks like > accessing any other file. I'm not sure that we'll allow yanking access > to image files anytime soon, but it might not hurt to keep it at the > back of our mind as a potential option we might want the design to > allow. Are you referring to the in-kernel NFS client hangs here ? AFAIK, it is impossible to do anything to get out of those hangs from userspace, because the thread is stuck in an uninterruptable sleep in kernel space. If using the in-QEMU NFS client, then there is a network connection that can be yanked just like the NBD client. Regards, Daniel
On 13/05/20 13:26, Daniel P. Berrangé wrote: > Are you referring to the in-kernel NFS client hangs here ? AFAIK, it is > impossible to do anything to get out of those hangs from userspace, because > the thread is stuck in an uninterruptable sleep in kernel space. > > If using the in-QEMU NFS client, then there is a network connection that > can be yanked just like the NBD client. But it's a bad idea to yank it (and also the NBD client) because you're not sure which writes have made it to the server (and to the medium) and which haven't. Effectively, the in-QEMU NFS client and NBD client are always operating in "soft" mode, but we should always treat that as a bug (which cannot be fixed) and not a feature for read-write images. Paolo
Am 13.05.2020 um 13:26 hat Daniel P. Berrangé geschrieben: > On Wed, May 13, 2020 at 01:13:20PM +0200, Kevin Wolf wrote: > > Am 13.05.2020 um 12:53 hat Dr. David Alan Gilbert geschrieben: > > > * Kevin Wolf (kwolf@redhat.com) wrote: > > > > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > > > > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > > > > On Mon, 11 May 2020 16:46:45 +0100 > > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > > > > ... > > > > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > > > > data loss) > > > > > > > > > > > > > > I wonder if a different way would be to make all network connections > > > > > > > register with yank, but then make yank take a list of connections to > > > > > > > shutdown(2). > > > > > > > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > > > > list of connection names as you suggest and silently ignore connections > > > > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > > > > a list of registered connections so the management application can do > > > > > > pattern matching if it wants. > > > > > > > > I'm generally not a big fan of silently ignoring things. Is there a > > > > specific requirement to do it in this case, or can management > > > > applications be expected to know which connections exist? 
> > > > > > > > > Yes, that would make the yank command much more flexible in how it can > > > > > be used. > > > > > > > > > > As an alternative to using formatted strings like this, it could be > > > > > modelled more explicitly in QAPI > > > > > > > > > > { 'struct': 'YankChannels', > > > > > 'data': { 'chardev': [ 'string' ], > > > > > 'nbd': ['string'], > > > > > 'migration': bool } } > > > > > > > > > > In this example, 'chardev' would accept a list of chardev IDs which > > > > > have it enabled, 'nbd' would accept a list of block node IDs which > > > > > have it enabled, and migration is a singleton on/off. > > > > > > > > Of course, it also means that the yank code needs to know about every > > > > single object that supports the operation, whereas if you only have > > > > strings, the objects could keep registering their connection with a > > > > generic function like yank_register_function() in this version. > > > > > > > > I'm not sure if the additional complexity is worth the benefits. > > > > > > I tend to agree; although we do have to ensure we either use an existing > > > naming scheme (e.g. QOM object names?) or make sure we've got a well > > > defined list of prefixes. > > > > Not everything that has a network connection is a QOM object (in fact, > > neither migration nor chardev nor nbd are QOM objects). > > > > I guess it would be nice to have a single namespace for everything in > > QEMU, but the reality is that we have a few separate ones. As long as we > > consistently add a prefix that identifies the namespace in question, I > > think that would work. > > > > This means that if we're using node-name to identify the NBD connection, > > the namespace should be 'block' rather than 'nbd'. > > > > One more thing to consider is, what if a single object has multiple > > connections? In the case of node-names, we have a limited set of allowed > > characters, so we can use one of the remaining characters as a separator > > and then suffix a counter. 
In other places, the identifier isn't > > restricted, so suffixing doesn't work. Maybe prefixing does, but it > > would have to be there from the beginning then. > > > > And another thing: Do we really want to document this as limited to > > network connections? Another common cause of hangs is when you have > > image files on an NFS mount and the connection goes away. Of course, in > > the end this is still networking, but inside of QEMU it looks like > > accessing any other file. I'm not sure that we'll allow yanking access > > to image files anytime soon, but it might not hurt to keep it at the > > back of our mind as a potential option we might want the design to > > allow. > > Are you referring to the in-kernel NFS client hangs here ? AFAIK, it is > impossible to do anything to get out of those hangs from userspace, because > the thread is stuck in an uninterruptable sleep in kernel space. I'm not sure if it's always uninterruptible, but yes, in general you would be sacrificing the block node and some of its worker threads to get the VM unstuck. The next thing you should probably do is migrate off to another QEMU instance without a hanging thread. But at least it would let you migrate off instead of getting stuck while trying to wait for all running requests to complete. > If using the in-QEMU NFS client, then there is a network connection > that can be yanked just like the NBD client. Yes, that's the same case as NBD. Kevin
* Paolo Bonzini (pbonzini@redhat.com) wrote: > On 13/05/20 13:26, Daniel P. Berrangé wrote: > > Are you referring to the in-kernel NFS client hangs here ? AFAIK, it is > > impossible to do anything to get out of those hangs from userspace, because > > the thread is stuck in an uninterruptable sleep in kernel space. > > > > If using the in-QEMU NFS client, then there is a network connection that > > can be yanked just like the NBD client. > > But it's a bad idea to yank it (and also the NBD client) because you're > not sure which wites have made it to the server (and to the medium) and > which haven't. No, that's OK - if you look at the COLO case, and some other cases, you've got a dead storage device but your redundant pair might be OK; so it's OK to yank it. Another similar storage case is trying to migrate a VM that has one dead disk: even if you know and accept that it's dead and unresponsive, you often can't kill it off if the device is hung. > Effectively, the in-QEMU NFS client and NBD client are always operating > in "soft" mode, but we should always treat that as a bug (which cannot > be fixed) and not a feature for read-write images. Dave > > Paolo > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Am 13.05.2020 um 13:58 hat Paolo Bonzini geschrieben: > On 13/05/20 13:26, Daniel P. Berrangé wrote: > > Are you referring to the in-kernel NFS client hangs here ? AFAIK, it is > > impossible to do anything to get out of those hangs from userspace, because > > the thread is stuck in an uninterruptable sleep in kernel space. > > > > If using the in-QEMU NFS client, then there is a network connection that > > can be yanked just like the NBD client. > > But it's a bad idea to yank it (and also the NBD client) because you're > not sure which wites have made it to the server (and to the medium) and > which haven't. This (undefined content) is generally what guests should expect when a write request returns an error. > Effectively, the in-QEMU NFS client and NBD client are always operating > in "soft" mode, but we should always treat that as a bug (which cannot > be fixed) and not a feature for read-write images. It certainly means that you can't continue as if nothing had happened. However, writing to the same area again (and successfully) restores a valid state, so with werror=stop you would get into a state where you can later reconnect and continue where you stopped, but the monitor would stay responsive during that time. Or if the disk isn't critical for the guest, you could replace the NBD block node with a node that always returns I/O errors, so the guest would see a broken disk (actually, just reporting the errors from yanking to the guest might be enough to make it give up on the disk and stop sending new requests). Then you can resume the VM right away. Kevin
* Kevin Wolf (kwolf@redhat.com) wrote: > Am 13.05.2020 um 12:53 hat Dr. David Alan Gilbert geschrieben: > > * Kevin Wolf (kwolf@redhat.com) wrote: > > > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: > > > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: > > > > > On Mon, 11 May 2020 16:46:45 +0100 > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: > > > > > > > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > > > > > ... > > > > > > > That way if QEMU does get stuck, you can start by tearing down the > > > > > > > least distruptive channel. eg try tearing down the migration connection > > > > > > > first (which shouldn't negatively impact the guest), and only if that > > > > > > > doesn't work then, move on to tear down the NBD connection (which risks > > > > > > > data loss) > > > > > > > > > > > > I wonder if a different way would be to make all network connections > > > > > > register with yank, but then make yank take a list of connections to > > > > > > shutdown(2). > > > > > > > > > > Good Idea. We could name the connections (/yank callbacks) in the > > > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" > > > > > (and add "netdev:...", etc. in the future). Then make yank take a > > > > > list of connection names as you suggest and silently ignore connections > > > > > that don't exist. And maybe even add a 'query-yank' oob command returning > > > > > a list of registered connections so the management application can do > > > > > pattern matching if it wants. > > > > > > I'm generally not a big fan of silently ignoring things. Is there a > > > specific requirement to do it in this case, or can management > > > applications be expected to know which connections exist? > > > > > > > Yes, that would make the yank command much more flexible in how it can > > > > be used. 
> > > > > > > > As an alternative to using formatted strings like this, it could be > > > > modelled more explicitly in QAPI > > > > > > > > { 'struct': 'YankChannels', > > > > 'data': { 'chardev': [ 'string' ], > > > > 'nbd': ['string'], > > > > 'migration': bool } } > > > > > > > > In this example, 'chardev' would accept a list of chardev IDs which > > > > have it enabled, 'nbd' would accept a list of block node IDs which > > > > have it enabled, and migration is a singleton on/off. > > > > > > Of course, it also means that the yank code needs to know about every > > > single object that supports the operation, whereas if you only have > > > strings, the objects could keep registering their connection with a > > > generic function like yank_register_function() in this version. > > > > > > I'm not sure if the additional complexity is worth the benefits. > > > > I tend to agree; although we do have to ensure we either use an existing > > naming scheme (e.g. QOM object names?) or make sure we've got a well > > defined list of prefixes. > > Not everything that has a network connection is a QOM object (in fact, > neither migration nor chardev nor nbd are QOM objects). Hmm, migrationstate is a qdev object. > I guess it would be nice to have a single namespace for everything in > QEMU, but the reality is that we have a few separate ones. As long as we > consistently add a prefix that identifies the namespace in question, I > think that would work. > This means that if we're using node-name to identify the NBD connection, > the namespace should be 'block' rather than 'nbd'. > > One more thing to consider is, what if a single object has multiple > connections? In the case of node-names, we have a limited set of allowed > characters, so we can use one of the remaining characters as a separator > and then suffix a counter. In other places, the identifier isn't > restricted, so suffixing doesn't work. 
Maybe prefixing does, but it > would have to be there from the beginning then. Yeh I worry about whether on nbd if you can have multiple nbd connections to one block device. > And another thing: Do we really want to document this as limited to > network connections? Another common cause of hangs is when you have > image files on an NFS mount and the connection goes away. Of course, in > the end this is still networking, but inside of QEMU it looks like > accessing any other file. I'm not sure that we'll allow yanking access > to image files anytime soon, but it might not hurt to keep it at the > back of our mind as a potential option we might want the design to > allow. Yep. Dave > Kevin -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Wed, May 13, 2020 at 01:56:24PM +0100, Dr. David Alan Gilbert wrote: > * Kevin Wolf (kwolf@redhat.com) wrote: > > I guess it would be nice to have a single namespace for everything in > > QEMU, but the reality is that we have a few separate ones. As long as we > > consistently add a prefix that identifies the namespace in question, I > > think that would work. > > > This means that if we're using node-name to identify the NBD connection, > > the namespace should be 'block' rather than 'nbd'. > > > > One more thing to consider is, what if a single object has multiple > > connections? In the case of node-names, we have a limited set of allowed > > characters, so we can use one of the remaining characters as a separator > > and then suffix a counter. In other places, the identifier isn't > > restricted, so suffixing doesn't work. Maybe prefixing does, but it > > would have to be there from the beginning then. > > Yeh I worry about whether on nbd if you can have multiple nbd > connections to one block device. The kernel NBD driver now supports multiple parallel connections. QEMU hasn't implemented this in its NBD code yet, but I certainly see that being in scope for future. Regards, Daniel
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Wed, May 13, 2020 at 01:56:24PM +0100, Dr. David Alan Gilbert wrote: > > * Kevin Wolf (kwolf@redhat.com) wrote: > > > I guess it would be nice to have a single namespace for everything in > > > QEMU, but the reality is that we have a few separate ones. As long as we > > > consistently add a prefix that identifies the namespace in question, I > > > think that would work. > > > > > This means that if we're using node-name to identify the NBD connection, > > > the namespace should be 'block' rather than 'nbd'. > > > > > > One more thing to consider is, what if a single object has multiple > > > connections? In the case of node-names, we have a limited set of allowed > > > characters, so we can use one of the remaining characters as a separator > > > and then suffix a counter. In other places, the identifier isn't > > > restricted, so suffixing doesn't work. Maybe prefixing does, but it > > > would have to be there from the beginning then. > > > > Yeh I worry about whether on nbd if you can have multiple nbd > > connections to one block device. > > The kernel NBD driver now supports multiple parallel connections. > QEMU hasn't implemented this in its NBD code yet, but I certainly > see that being in scope for future. It's not parallel for performance that worries me, it's more about separate connections for separate uses - e.g. if we're serving the same read-only disk to multiple separate things. Dave > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 5/13/20 8:48 AM, Dr. David Alan Gilbert wrote: >>> Yeh I worry about whether on nbd if you can have multiple nbd >>> connections to one block device. >> >> The kernel NBD driver now supports multiple parallel connections. >> QEMU hasn't implemented this in its NBD code yet, but I certainly >> see that being in scope for future. > > It's not parallel for performance that worries me, it's more about > separateq connections for separate uses - e.g. if we're serving the same > read-only disk to multiple separate things. qemu allows multiple simultaneous NBD clients, but does not promise cache consistency between them. As long as all the clients are read-only, you can't tell the difference (and thus, for read-only connections, qemu even advertises NBD_FLAG_CAN_MULTI_CONN per the NBD spec). See also 'qemu-nbd -e' for controlling how many clients qemu-nbd will permit at once (when qemu proper is serving NBD, we don't currently expose a knob over QMP to control client count, but instead just blindly allow multiple connections). But as soon as one of the clients wants to write, that is where we would need to improve code before making it safe, so there, we do not advertise MULTI_CONN support. As for cases with multiple simultaneous clients: we already have those. In the case of incremental backups with an NBD export, my demo code for copying out the incremental backup involved two parallel NBD clients - one using 'qemu-img map' to probe which portions of the image were dirty, and a second client actually performing the reads. But incremental backups are read-only situations (the clients are not modifying qemu state).
Am 13.05.2020 um 15:48 hat Dr. David Alan Gilbert geschrieben: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Wed, May 13, 2020 at 01:56:24PM +0100, Dr. David Alan Gilbert wrote: > > > * Kevin Wolf (kwolf@redhat.com) wrote: > > > > I guess it would be nice to have a single namespace for everything in > > > > QEMU, but the reality is that we have a few separate ones. As long as we > > > > consistently add a prefix that identifies the namespace in question, I > > > > think that would work. > > > > > > > This means that if we're using node-name to identify the NBD connection, > > > > the namespace should be 'block' rather than 'nbd'. > > > > > > > > One more thing to consider is, what if a single object has multiple > > > > connections? In the case of node-names, we have a limited set of allowed > > > > characters, so we can use one of the remaining characters as a separator > > > > and then suffix a counter. In other places, the identifier isn't > > > > restricted, so suffixing doesn't work. Maybe prefixing does, but it > > > > would have to be there from the beginning then. > > > > > > Yeh I worry about whether on nbd if you can have multiple nbd > > > connections to one block device. > > > > The kernel NBD driver now supports multiple parallel connections. > > QEMU hasn't implemented this in its NBD code yet, but I certainly > > see that being in scope for future. > > It's not parallel for performance that worries me, it's more about > separate connections for separate uses - e.g. if we're serving the same > read-only disk to multiple separate things. That would be a concern for the NBD server. I'm not sure if anything in QEMU ever waits for NBD servers (except for the client on the other side of the connection, of course), so there might be no use case for yanking their connections. Anyway, here we were talking about the NBD client, which always accesses one disk. If you access a second disk, you have a second NBD block node. Kevin
On 5/13/20 9:06 AM, Kevin Wolf wrote: >>>>> One more thing to consider is, what if a single object has multiple >>>>> connections? In the case of node-names, we have a limited set of allowed >>>>> characters, so we can use one of the remaining characters as a separator >>>>> and then suffix a counter. In other places, the identifier isn't >>>>> restricted, so suffixing doesn't work. Maybe prefixing does, but it >>>>> would have to be there from the beginning then. >>>> >>>> Yeh I worry about whether on nbd if you can have multiple nbd >>>> connections to one block device. >>> >>> The kernel NBD driver now supports multiple parallel connections. >>> QEMU hasn't implemented this in its NBD code yet, but I certainly >>> see that being in scope for future. >> >> It's not parallel for performance that worries me, it's more about >> separate connections for separate uses - e.g. if we're serving the same >> read-only disk to multiple separate things. > > That would be a concern for the NBD server. I'm not sure if anything in > QEMU ever waits for NBD servers (except for the client on the other side > of the connection, of course), so there might be no use case for yanking > their connections. > > Anyway, here we were talking about the NBD client, which always accesses > one disk. If you access a second disk, you have a second NBD block node. Ah, right, that's the other direction. No, we do not currently support a single qemu block node backed by multiple NBD clients to one (or more, if they are serving identical content in a failover scenario) NBD servers. Performance could indeed be potentially improved by doing that, but for now, every time qemu behaves as a new NBD client, it is via an additional block node.
Terminology: instance = one (nbd) blockdev/one chardev/the single migrationstate connection = one TCP connection Hello Everyone, Having read all the comments, here is proposal v2: Every instance registers itself with a unique name in the form "blockdev:<node-name>", "chardev:<chardev-name>" and "migration" using yank_register_instance which will do some sanity checks like checking if the same name exists already. Then (multiple) yank functions can be registered as needed with that single name. When the instance exits/is removed, it unregisters all yank functions and unregisters its name with yank_unregister_instance which will check if all yank functions were unregistered. Every instance that supports the yank feature will register itself and the yank functions unconditionally (No extra 'yank' option per instance). The 'query-yank' oob qmp command lists the names of all registered instances. The 'yank' oob qmp command takes a list of names and for every name calls all yank functions registered with that name. Before doing anything, it will check that all names exist. If the instance has multiple connections (say, migration with multifd), I don't think it makes much sense to just shutdown one connection. Calling 'yank' on an instance will always shutdown all connections of that instance. Yank functions are generic and in no way limited to connections. Say, if migration is started to an 'exec:' address, migration could register a yank function to kill that external command on yank (Won't be implemented yet though). Regards, Lukas Straub
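The registry described in proposal v2 could be sketched roughly as below. Only yank_register_instance and the "check all names, then call all yank functions" semantics come from the proposal; everything else (yank_find, yank_call, the fixed-size arrays, demo_yank_fn) is a hypothetical illustration, not QEMU's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*YankFn)(void *opaque);

enum { MAX_INSTANCES = 16, MAX_FUNCS = 4 };

struct YankInstance {
    const char *name;            /* e.g. "blockdev:drive0", "migration" */
    YankFn funcs[MAX_FUNCS];
    void *opaques[MAX_FUNCS];
    size_t nfuncs;
    bool used;
};

static struct YankInstance instances[MAX_INSTANCES];

static struct YankInstance *yank_find(const char *name)
{
    for (size_t i = 0; i < MAX_INSTANCES; i++) {
        if (instances[i].used && !strcmp(instances[i].name, name)) {
            return &instances[i];
        }
    }
    return NULL;
}

/* Register a unique instance name; the sanity check rejects duplicates. */
static bool yank_register_instance(const char *name)
{
    if (yank_find(name)) {
        return false;
    }
    for (size_t i = 0; i < MAX_INSTANCES; i++) {
        if (!instances[i].used) {
            instances[i] = (struct YankInstance){ .name = name, .used = true };
            return true;
        }
    }
    return false;
}

/* Attach one of possibly several yank functions to a registered name. */
static bool yank_register_function(const char *name, YankFn fn, void *opaque)
{
    struct YankInstance *inst = yank_find(name);
    if (!inst || inst->nfuncs == MAX_FUNCS) {
        return false;
    }
    inst->funcs[inst->nfuncs] = fn;
    inst->opaques[inst->nfuncs] = opaque;
    inst->nfuncs++;
    return true;
}

/* The 'yank' command: first check that every requested name exists,
 * then call all yank functions of every named instance. */
static bool yank_call(const char *const names[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!yank_find(names[i])) {
            return false;        /* unknown name: yank nothing at all */
        }
    }
    for (size_t i = 0; i < n; i++) {
        struct YankInstance *inst = yank_find(names[i]);
        for (size_t k = 0; k < inst->nfuncs; k++) {
            inst->funcs[k](inst->opaques[k]);
        }
    }
    return true;
}

/* Demo yank function; a real one would e.g. shutdown(fd, SHUT_RDWR). */
static int demo_yanks;
static void demo_yank_fn(void *opaque)
{
    (void)opaque;
    demo_yanks++;
}
```

Note that refusing to yank anything when one name is unknown matches the "before doing anything, it will check that all names exist" rule, rather than the silently-ignore behavior discussed earlier in the thread.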
Am 13.05.2020 um 21:12 hat Lukas Straub geschrieben: > Terminology: > instance = one (nbd) blockdev/one chardev/the single migrationstate > connection = one TCP connection > > Hello Everyone, > Having read all the comments, here is proposal v2: Looks quite good to me. > Every instance registers itself with a unique name in the form > "blockdev:<node-name>", "chardev:<chardev-name>" and "migration" using > yank_register_instance which will do some sanity checks like checking > if the same name exists already. Then (multiple) yank functions can be > registered as needed with that single name. When the instance exits/is > removed, it unregisters all yank functions and unregisters its name > with yank_unregister_instance which will check if all yank functions > were unregistered. Feels like nitpicking, but you say that functions are registered with a name that you have previously registered. If you mean literally passing a string, could this lead to callers forgetting to register it first? Maybe yank_register_instance() should return something like a YankInstance object that must be passed to yank_register_function(). Then you would probably also have a list of all functions registered for a single instance so that yank_unregister_instance() could automatically remove all of them instead of requiring the instance to do so. > Every instance that supports the yank feature will register itself and > the yank functions unconditionally (No extra 'yank' option per > instance). > The 'query-yank' oob qmp command lists the names of all registered > instances. > The 'yank' oob qmp command takes a list of names and for every name > calls all yank functions registered with that name. Before doing > anything, it will check that all names exist. > > If the instance has multiple connections (say, migration with > multifd), I don't think it makes much sense to just shutdown one > connection. Calling 'yank' on an instance will always shutdown all > connections of that instance. 
I think it depends. If shutting down one connection basically kills the functionality of the whole instance, there is no reason not to kill all connections. But if the instance could continue working on the second connection, maybe it shouldn't be killed. Anyway, we can extend things as needed later. I already mentioned elsewhere in this thread that block node-names have a restricted set of allowed characters, so having a suffix to distinguish multiple yankable things is possible. I just checked chardev and it also restricts the allowed set of characters, so the same applies. 'migration' is only a fixed string, so it's trivially extensible. So we know a path forward if we ever need to yank individual connections, which is good enough for me. > Yank functions are generic and in no way limited to connections. Say, > if migration is started to an 'exec:' address, migration could > register a yank function to kill that external command on yank (Won't > be implemented yet though). Yes, this makes sense as a potential future use case. Kevin
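Kevin's handle-based suggestion could look something like the following minimal sketch. The YankInstance layout, the fixed-size function array, and all function names besides those mentioned in the thread are hypothetical, not a real QEMU interface:

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*YankFn)(void *opaque);

enum { MAX_FUNCS = 8 };

typedef struct YankInstance {
    const char *name;            /* e.g. "blockdev:drive0" */
    YankFn funcs[MAX_FUNCS];
    void *opaques[MAX_FUNCS];
    size_t nfuncs;
} YankInstance;

/* Hand back an opaque handle instead of letting callers pass free-form
 * strings later: functions can only be registered on an instance that
 * was actually registered first. */
static YankInstance *yank_register_instance(const char *name)
{
    YankInstance *inst = calloc(1, sizeof(*inst));
    inst->name = name;
    return inst;
}

static void yank_register_function(YankInstance *inst, YankFn fn, void *opaque)
{
    assert(inst->nfuncs < MAX_FUNCS);
    inst->funcs[inst->nfuncs] = fn;
    inst->opaques[inst->nfuncs] = opaque;
    inst->nfuncs++;
}

/* Yanking an instance invokes every function registered on it. */
static void yank_instance_call(YankInstance *inst)
{
    for (size_t i = 0; i < inst->nfuncs; i++) {
        inst->funcs[i](inst->opaques[i]);
    }
}

/* Dropping the instance implicitly unregisters all of its functions,
 * so callers cannot forget one. */
static void yank_unregister_instance(YankInstance *inst)
{
    free(inst);
}

/* Demo yank function standing in for e.g. a socket shutdown. */
static int demo_yanks;
static void demo_yank_fn(void *opaque) { (void)opaque; demo_yanks++; }
```

The design point is that the handle ties the lifetime of the yank functions to the lifetime of the instance, addressing the "could callers forget to register it first" concern.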
* Lukas Straub (lukasstraub2@web.de) wrote: > Terminology: > instance = one (nbd) blockdev/one chardev/the single migrationstate > connection = one TCP connection > > Hello Everyone, > Having read all the comments, here is proposal v2: > Every instance registers itself with a unique name in the form "blockdev:<node-name>", "chardev:<chardev-name>" and "migration" using yank_register_instance which will do some sanity checks like checking if the same name exists already. Then (multiple) yank functions can be registered as needed with that single name. When the instance exits/is removed, it unregisters all yank functions and unregisters its name with yank_unregister_instance which will check if all yank functions were unregistered. > Every instance that supports the yank feature will register itself and the yank functions unconditionally (No extra 'yank' option per instance). > The 'query-yank' oob qmp command lists the names of all registered instances. > The 'yank' oob qmp command takes a list of names and for every name calls all yank functions registered with that name. Before doing anything, it will check that all names exist. > > If the instance has multiple connections (say, migration with multifd), I don't think it makes much sense to just shutdown one connection. Calling 'yank' on an instance will always shutdown all connections of that instance. Agreed. > Yank functions are generic and in no way limited to connections. Say, if migration is started to an 'exec:' address, migration could register a yank function to kill that external command on yank (Won't be implemented yet though). One thing we need to be careful about is that the yank functions stay suitable for OOB calling. Dave > Regards, > Lukas Straub -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Kevin Wolf <kwolf@redhat.com> writes: > Am 13.05.2020 um 12:53 hat Dr. David Alan Gilbert geschrieben: >> * Kevin Wolf (kwolf@redhat.com) wrote: >> > Am 12.05.2020 um 11:43 hat Daniel P. Berrangé geschrieben: >> > > On Tue, May 12, 2020 at 11:32:06AM +0200, Lukas Straub wrote: >> > > > On Mon, 11 May 2020 16:46:45 +0100 >> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote: >> > > > >> > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> > > > > > ... >> > > > > > That way if QEMU does get stuck, you can start by tearing down the >> > > > > > least disruptive channel. eg try tearing down the migration connection >> > > > > > first (which shouldn't negatively impact the guest), and only if that >> > > > > > doesn't work then, move on to tear down the NBD connection (which risks >> > > > > > data loss) >> > > > > >> > > > > I wonder if a different way would be to make all network connections >> > > > > register with yank, but then make yank take a list of connections to >> > > > > shutdown(2). >> > > > >> > > > Good Idea. We could name the connections (/yank callbacks) in the >> > > > form "nbd:<node-name>", "chardev:<chardev-name>" and "migration" >> > > > (and add "netdev:...", etc. in the future). Then make yank take a >> > > > list of connection names as you suggest and silently ignore connections >> > > > that don't exist. And maybe even add a 'query-yank' oob command returning >> > > > a list of registered connections so the management application can do >> > > > pattern matching if it wants. [...] >> > > Yes, that would make the yank command much more flexible in how it can >> > > be used. 
>> > > >> > > As an alternative to using formatted strings like this, it could be >> > > modelled more explicitly in QAPI >> > > >> > > { 'struct': 'YankChannels', >> > > 'data': { 'chardev': [ 'string' ], >> > > 'nbd': ['string'], >> > > 'migration': bool } } >> > > >> > > In this example, 'chardev' would accept a list of chardev IDs which >> > > have it enabled, 'nbd' would accept a list of block node IDs which >> > > have it enabled, and migration is a singleton on/off. >> > >> > Of course, it also means that the yank code needs to know about every >> > single object that supports the operation, whereas if you only have >> > strings, the objects could keep registering their connection with a >> > generic function like yank_register_function() in this version. >> > >> > I'm not sure if the additional complexity is worth the benefits. >> >> I tend to agree; although we do have to ensure we either use an existing >> naming scheme (e.g. QOM object names?) or make sure we've got a well >> defined list of prefixes. > > Not everything that has a network connection is a QOM object (in fact, > neither migration nor chardev nor nbd are QOM objects). > > I guess it would be nice to have a single namespace for everything in > QEMU, but the reality is that we have a few separate ones. As long as we > consistently add a prefix that identifies the namespace in question, I > think that would work. > > This means that if we're using node-name to identify the NBD connection, > the namespace should be 'block' rather than 'nbd'. > > One more thing to consider is, what if a single object has multiple > connections? In the case of node-names, we have a limited set of allowed > characters, so we can use one of the remaining characters as a separator > and then suffix a counter. In other places, the identifier isn't > restricted, so suffixing doesn't work. Maybe prefixing does, but it > would have to be there from the beginning then. 
If I understand it correctly, the argument for encoding the structured "what to yank" information into a string is ease of implementation. The yank core sees only strings, and doesn't care about their contents. Code registering "yankables" can build strings however it likes, as long as they're all distinct. QMP clients need to understand how the strings are built to be able to yank specific "yankables" (as opposed to blindly yanking stuff until QEMU gets unstuck). I disagree with this argument for a number of reasons. 1. Use of strings doesn't reduce complexity, it moves it. String: * QMP clients may need to parse the strings they get back from command query-yank into structured data. * QMP clients may need to format structured data into a string for command yank. * How structured data is parsed from / formatted to a string is part of the external interface, and needs to be documented, in addition to the data structure. * The strings need to be kept backward compatible. We could inadvertently create problems like the one you described above, where a format like UNIQUE-PREFIX:ID with an unrestricted ID is not extensible. * Code providing "yankables" needs to somehow ensure their strings are distinct. Complex type: * The result of query-yank is already structured data, no need for QMP clients to parse. * The argument of yank is structured data, no need for QMP clients to format it into a string first. * Documenting just the data structure suffices. * Backward compatibility of complex QMP types is a well-understood problem. * Defining the structure in the QAPI schema as union trivially ensures the union branches are distinct. 2. Even with structured yank arguments, the yank core doesn't *need* to understand the structure. The yank core manages a set of instances. Each instance associates a key with a list of YankFuncAndParam. This is basically a dictionary. All the yank core needs to do with the dictionary keys is looking them up. 
The proposed implementation uses a list of instances (which is just fine). All the lookup needs from the keys is an "is equal" relation. Checking two QAPI objects for structural equality is admittedly not as simple as comparing strings. It does not, however, require understanding their structure. Two stupid solutions come to mind: * Hand-written compare that keeps specifics out of the yank core Whatever uses a variant of YankInstance gets to provide a comparison function of the variant members. Compare the common members (initially just @type, which is the discriminator). Return false if any of them differs. Else both instances use the same variant. Return the value of the variant comparison function. * Visitors exist Convert both YankInstances to QObject, compare with qobject_is_equal(), done. 3. In QMP, encoding structured data in strings is anathema :) [...]
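Markus' first solution - compare the common members in the core, delegate variant equality to whoever owns the variant - could be sketched as follows. The enum values, struct layout, and function names here are all hypothetical stand-ins for what QAPI-generated code would provide:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum { YANK_BLOCK, YANK_CHARDEV, YANK_MIGRATION } YankKind;

typedef struct YankInstance {
    YankKind type;                       /* common member: discriminator */
    union {
        struct { const char *node_name; } block;
        struct { const char *id; } chardev;
        /* migration is a singleton and carries no variant data */
    } u;
} YankInstance;

/* Per-variant comparison, supplied by the code that owns the variant;
 * the core never looks inside the union itself. */
typedef bool (*YankVariantEq)(const YankInstance *, const YankInstance *);

static bool block_eq(const YankInstance *a, const YankInstance *b)
{
    return !strcmp(a->u.block.node_name, b->u.block.node_name);
}

static bool chardev_eq(const YankInstance *a, const YankInstance *b)
{
    return !strcmp(a->u.chardev.id, b->u.chardev.id);
}

static bool migration_eq(const YankInstance *a, const YankInstance *b)
{
    (void)a; (void)b;
    return true;                         /* singleton, nothing to compare */
}

static const YankVariantEq variant_eq[] = {
    [YANK_BLOCK]     = block_eq,
    [YANK_CHARDEV]   = chardev_eq,
    [YANK_MIGRATION] = migration_eq,
};

/* The core's "is equal": compare the common members (here just the
 * discriminator), then hand off to the variant's own comparison. */
static bool yank_instance_equal(const YankInstance *a, const YankInstance *b)
{
    if (a->type != b->type) {
        return false;
    }
    return variant_eq[a->type](a, b);
}
```

This keeps the lookup the yank core needs while leaving knowledge of each variant's contents with the code that registers it, which is the property Markus argues for.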