Message ID | 20190709203806.17550-1-dmitry.fomichev@wdc.com (mailing list archive) |
---|---|
Headers | show |
Series | virtio: handle zoned backing devices | expand |
On 09/07/19 22:38, Dmitry Fomichev wrote: > Currently, attaching zoned block devices (i.e. storage devices > compliant to ZAC/ZBC standards) using several virtio methods doesn't > work - the zoned devices appear as regular block devices at the guest. > This may cause unexpected i/o errors and, potentially, some data > corruption. > > To be more precise, attaching a zoned device via virtio-pci-blk, > virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the > above behavior. A simple fix is needed to make > virtio-scsi-pci/scsi-block work and this is covered by a different > patch. The virtio-scsi-pci/scsi-generic method appears to handle zoned > devices without problems. The problem with this approach is that other devices (e.g. ide-hd or sd card) also break with zoned devices and the only way to fix it would be to add code denying zoned block devices to all of them. The question then becomes how to define a whitelist. One possiblity is to add a QOM interface (for example TYPE_ZONED_BLOCK_SUPPORT) to scsi-block and scsi-generic. In do_parse_drive you can query the BlockBackend with bdrv_get_zoned_info, and return an error if the backend is a zoned block device and the device does not implement TYPE_ZONED_BLOCK_SUPPORT. (Commit 6b1566c is an example of adding a new QOM interface; in your case, it would be simpler as the interface would not have any method). Kevin, what do you think? Also, why deny passing Host Aware devices? Paolo
Am 10.07.2019 um 12:09 hat Paolo Bonzini geschrieben: > On 09/07/19 22:38, Dmitry Fomichev wrote: > > Currently, attaching zoned block devices (i.e. storage devices > > compliant to ZAC/ZBC standards) using several virtio methods doesn't > > work - the zoned devices appear as regular block devices at the guest. > > This may cause unexpected i/o errors and, potentially, some data > > corruption. > > > > To be more precise, attaching a zoned device via virtio-pci-blk, > > virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the > > above behavior. A simple fix is needed to make > > virtio-scsi-pci/scsi-block work and this is covered by a different > > patch. The virtio-scsi-pci/scsi-generic method appears to handle zoned > > devices without problems. > > The problem with this approach is that other devices (e.g. ide-hd or sd > card) also break with zoned devices and the only way to fix it would be > to add code denying zoned block devices to all of them. > > The question then becomes how to define a whitelist. One possiblity is > to add a QOM interface (for example TYPE_ZONED_BLOCK_SUPPORT) to > scsi-block and scsi-generic. In do_parse_drive you can query the > BlockBackend with bdrv_get_zoned_info, and return an error if the > backend is a zoned block device and the device does not implement > TYPE_ZONED_BLOCK_SUPPORT. (Commit 6b1566c is an example of adding a new > QOM interface; in your case, it would be simpler as the interface would > not have any method). Kevin, what do you think? What about non-device users such as block jobs or (NBD) exports? Won't they have to special-case such devices, too? In fact, what about image format drivers or even filters? I feel that this needs to be managed at the BDS level somehow. Not sure which mechanism to use, though. Permissions would be suitable for a blacklist approach, but I agree with you that we need a whitelist instead. Hm... Actually, file-posix implements .bdrv_check_perm and could just refuse attaching a parent there if it doesn't request a specific permission like BLK_PERM_SUPPORT_ZONED. That should give us the whitelist semantics through existing infrastructure. Kevin
On 10/07/19 13:02, Kevin Wolf wrote: > Hm... Actually, file-posix implements .bdrv_check_perm and could just > refuse attaching a parent there if it doesn't request a specific > permission like BLK_PERM_SUPPORT_ZONED. That should give us the > whitelist semantics through existing infrastructure. I'd like Dmitry to have something more precise to base his work on. The permissions system is really complicated and I never really wrapped my head around it, so I need your help. IIUC, blkconf_apply_backend_options would grow a new argument (like "resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the perm that blkconf_apply_backend_options passes to blk_set_perm. On the other side raw_check_perm would say something like if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) { error_setg(....); return -ENOTSUP; } Is this correct? In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared permission, since it's possible to assign the same block device to multiple scsi-block devices. So BLK_PERM_SUPPORT_ZONED would be added unconditionally to shared_perm. Paolo ps: I have always thought that shared_perm is expressed the wrong way and should have been "denied_perm". How hard would it be to change that now?
Am 10.07.2019 um 13:33 hat Paolo Bonzini geschrieben: > On 10/07/19 13:02, Kevin Wolf wrote: > > Hm... Actually, file-posix implements .bdrv_check_perm and could just > > refuse attaching a parent there if it doesn't request a specific > > permission like BLK_PERM_SUPPORT_ZONED. That should give us the > > whitelist semantics through existing infrastructure. > > I'd like Dmitry to have something more precise to base his work on. The > permissions system is really complicated and I never really wrapped my > head around it, so I need your help. > > IIUC, blkconf_apply_backend_options would grow a new argument (like > "resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the > perm that blkconf_apply_backend_options passes to blk_set_perm. On the > other side raw_check_perm would say something like > > if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) { > error_setg(....); > return -ENOTSUP; > } > > Is this correct? Yes, I think this is how you'd best implement it. > In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared > permission, since it's possible to assign the same block device to > multiple scsi-block devices. So BLK_PERM_SUPPORT_ZONED would be added > unconditionally to shared_perm. Right, this part shows that we're kind of abusing the permission system here. I think unconditionally adding BLK_PERM_SUPPORT_ZONED to the set of shared permissions could probably happen centrally in bdrv_child_perm(). > ps: I have always thought that shared_perm is expressed the wrong way > and should have been "denied_perm". How hard would it be to change that > now? I'm not sure it would be better. It is shared_perm because that means that the default is restrictive (error mode: operation refused, clearly reported, easy to fix) rather than permissive (error mode: image corruption, hard to figure out where we were too permissive). Basically, whitelist instead of blacklist, once again. But if we did indeed decide to change it, the only way to find out how hard it is would be doing it. I suspect not too hard in principle, but making sure that we converted all callers and don't introduce wrong callers later through (silent) merge conflicts is the more concerning part. Kevin
On Wed, 2019-07-10 at 23:09 +0200, Kevin Wolf wrote: > Am 10.07.2019 um 13:33 hat Paolo Bonzini geschrieben: > > On 10/07/19 13:02, Kevin Wolf wrote: > > > Hm... Actually, file-posix implements .bdrv_check_perm and could just > > > refuse attaching a parent there if it doesn't request a specific > > > permission like BLK_PERM_SUPPORT_ZONED. That should give us the > > > whitelist semantics through existing infrastructure. > > > > I'd like Dmitry to have something more precise to base his work on. The > > permissions system is really complicated and I never really wrapped my > > head around it, so I need your help. > > > > IIUC, blkconf_apply_backend_options would grow a new argument (like > > "resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the > > perm that blkconf_apply_backend_options passes to blk_set_perm. On the > > other side raw_check_perm would say something like > > > > if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) { > > error_setg(....); > > return -ENOTSUP; > > } > > > > Is this correct? > > Yes, I think this is how you'd best implement it. > > > In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared > > permission, since it's possible to assign the same block device to > > multiple scsi-block devices. So BLK_PERM_SUPPORT_ZONED would be added > > unconditionally to shared_perm. > > Right, this part shows that we're kind of abusing the permission system > here. I think unconditionally adding BLK_PERM_SUPPORT_ZONED to the set > of shared permissions could probably happen centrally in > bdrv_child_perm(). > > > ps: I have always thought that shared_perm is expressed the wrong way > > and should have been "denied_perm". How hard would it be to change that > > now? > > I'm not sure it would be better. It is shared_perm because that means > that the default is restrictive (error mode: operation refused, clearly > reported, easy to fix) rather than permissive (error mode: image > corruption, hard to figure out where we were too permissive). Basically, > whitelist instead of blacklist, once again. > > But if we did indeed decide to change it, the only way to find out how > hard it is would be doing it. I suspect not too hard in principle, but > making sure that we converted all callers and don't introduce wrong > callers later through (silent) merge conflicts is the more concerning > part. > > Kevin Thanks for the feedback, the permissions based approach indeed looks cleaner. I am looking into modifying the patchset to do the check in bdrv_check_perm and will send a V2. Paolo, WRT to Host Aware drives, these MAY work, but we don't have any of these available for testing and are not able to verify which drivers do work with them and which do not. This is the reason for not letting them pass thru. If you prefer, I can enable passing HA drives in V2. Dmitry
On 11/07/19 02:52, Dmitry Fomichev wrote: > Paolo, > WRT to Host Aware drives, these MAY work, but we don't have any of these > available for testing and are not able to verify which drivers do work > with them and which do not. This is the reason for not letting them pass > thru. If you prefer, I can enable passing HA drives in V2. In theory host aware should work the same as drive managed if the host is oblivious of zones, so I prefer enabling them indeed. Thanks, Paolo