Message ID | 56BA0BA0.2060302@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
On 02/09/2016 10:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 09.02.2016 00:14, John Snow wrote:
>>
>> On 02/06/2016 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> On 05.02.2016 22:48, John Snow wrote:
>>>> On 01/22/2016 12:07 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Hi all.
>>>>>
>>>>> This is the early begin of the series which aims to add external
>>>>> backup api. This is needed to allow backup software use our dirty
>>>>> bitmaps.
>>>>>
>>>>> Vmware and Parallels Cloud Server have this feature.
>>>>>
>>>> Have a link to the equivalent feature that VMWare exposes? (Or
>>>> Parallels Cloud Server) ... I'm curious about what the API there
>>>> looks like.
>>> For VMware you need their Virtual Disk Api Programming Guide
>>> http://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vddk60_programming.pdf
>>>
>>>
>> Great, thanks!
>>
>>> Look at Changed Block Tracking (CBT) , Backup and Restore.
>>>
>>> For PCS here is part of SDK header, related to the topic:
>>>
>>> ====================================
>>> /*
>>>  * Builds a map of the disk contents changes between 2 PITs.
>>>    Parameters
>>>    hDisk     : A handle of type PHT_VIRTUAL_DISK identifying
>>>                the virtual disk.
>>>    sPit1Uuid : Uuid of the older PIT.
>>>    sPit2Uuid : Uuid of the later PIT.
>>>    phMap     : A pointer to a variable which receives the
>>>                result (a handle of type PHT_VIRTUAL_DISK_MAP).
>>>    Returns
>>>    PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDisk_GetChangesMap_Local, (
>>>                  PRL_HANDLE hDisk,
>>>                  PRL_CONST_STR sPit1Uuid,
>>>                  PRL_CONST_STR sPit2Uuid,
>>>                  PRL_HANDLE_PTR phMap) );
>>>
>> Effectively giving you a dirty bitmap diff between two snapshots.
>> Something we don't currently genuinely support in QEMU.
>
> Just start dirty bitmap at point a and stop at point b..
>
>>
>>> /*
>>>  * Reports the number of significant bits in the map.
>>>    Parameters
>>>    hMap      : A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                the changes map.
>>>    phSize    : A pointer to a variable which receives the
>>>                result.
>>>    Returns
>>>    PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_GetSize, (
>>>                  PRL_HANDLE hMap,
>>>                  PRL_UINT32_PTR pnSize) );
>>>
>> I assume this is roughly the dirty bit count, for us, this would be
>> dirty clusters. (Or whatever granularity you specified, but usually
>> clusters.)
>>
>>> /*
>>>  * Reports the size (in bytes) of a block mapped by a single bit
>>>  * in the map.
>>>    Parameters
>>>    hMap      : A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                the changes map.
>>>    phSize    : A pointer to a variable which receives the
>>>                result.
>>>    Returns
>>>    PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_GetGranularity, (
>>>                  PRL_HANDLE hMap,
>>>                  PRL_UINT32_PTR pnSize) );
>>>
>> Basically a granularity query.
>>
>>> /*
>>>  * Returns bits from the blocks map.
>>>    Parameters
>>>    hMap       : A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                 the changes map.
>>>    pBuffer    : A pointer to a store.
>>>    pnCapacity : A pointer to a variable holding the size
>>>                 of the buffer and receiving the number of
>>>                 bytes actually written.
>>>    Returns
>>>    PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_Read, (
>>>                  PRL_HANDLE hMap,
>>>                  PRL_VOID_PTR pBuffer,
>>>                  PRL_UINT32_PTR pnCapacity) );
>>>
>> And this would be a direct bitmap query.
>>
>> Is the expected usage here that the third party client will use this
>> bitmap to read the source image? Or do you query for the data from API?
>
> - from API.
>
>>
>> I think the thought among block devs would be to opt for more of the
>> second option, and less allowing clients to directly interface with the
>> image files.
>>
>>> =======================================
>>>
>>>
>>>>> There is only one patch here, about querying dirty bitmap from qemu by
>>>>> qmp command.
>>>>> It is just an updated and clipped (hmp command removed) old
>>>>> my patch "[PATCH RFC v3 01/14] qmp: add query-block-dirty-bitmap".
>>>>>
>>>>> Before writing the whole thing I'd like to discuss the details. Or,
>>>>> may be there are existing plans on this topic, or may be someone
>>>>> already works on it?
>>>>>
>>>>> I see it like this:
>>>>>
>>>>> =====
>>>>>
>>>>> - add qmp commands for dirty-bitmap functions: create_successor,
>>>>>   abdicate, reclaim.
>>>> Hm, why do we need such low-level control over splitting and merging
>>>> bitmaps from an external client?
>>>>
>>>>> - make create-successor command transaction-able
>>>>> - add query-block-dirty-bitmap qmp command
>>>>>
>>>>> then, external backup:
>>>>>
>>>>> qmp transaction {
>>>>>     external-snapshot
>>>>>     bitmap-create-successor
>>>>> }
>>>>>
>>>>> qmp query frozen bitmap, not acquiring aio context.
>>>>>
>>>>> do external backup, using snapshot and bitmap
>>>>>
>>>>> if (success backup)
>>>>>     qmp bitmap-abdicate
>>>>> else
>>>>>     qmp bitmap-reclaim
>>>>>
>>>>> qmp merge snapshot
>>>>> =====
>>>>>
>>>> Hm, I see -- so you're hoping to manage the backup *entirely*
>>>> externally, so you want to be able to reach inside of QEMU and control
>>>> some status conditions to guarantee it'll be safe.
>>>>
>>>> I'm not convinced QEMU can guarantee such things -- due to various
>>>> flush properties, race conditions on write, etc. QEMU handles all of
>>>> this internally in a non-public way at the moment.
>>> Hm, can you be more concrete? What operations are dangerous? We can do
>>> them in paused state for example.
>>>
>> I suppose if you're going to pause the VM, then it should be reasonably
>> safe, but recently there have been endeavors to augment the .qcow2
>> format to prohibit concurrent access, which might include a paused VM as
>> well, I'm not clear on the implementation.
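
As an illustration of the "do external backup, using snapshot and bitmap" step quoted above, here is a minimal sketch of how a client might turn a queried bitmap plus its granularity into byte ranges to copy. It is not taken from QEMU or the PCS SDK: DirtyExtent, bit_is_set and bitmap_to_extents are hypothetical names, and the LSB-first bit order within each byte is an assumption that a real API would have to document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one contiguous dirty byte range on the disk. */
typedef struct {
    uint64_t offset;   /* byte offset of the first dirty block */
    uint64_t length;   /* length of the run in bytes */
} DirtyExtent;

/* Assumption: bit i lives in byte i / 8, least significant bit first. */
static int bit_is_set(const uint8_t *map, uint64_t i)
{
    return (map[i / 8] >> (i % 8)) & 1;
}

/*
 * Walk the first nbits bits of map and coalesce runs of set bits into
 * byte extents, where each bit covers `granularity` bytes.
 * At most max_out extents are stored in out; the return value is the
 * total number of runs found, which may exceed max_out.
 */
size_t bitmap_to_extents(const uint8_t *map, uint64_t nbits,
                         uint32_t granularity,
                         DirtyExtent *out, size_t max_out)
{
    size_t n = 0;
    uint64_t i = 0;

    assert(map != NULL && granularity > 0);
    assert(out != NULL || max_out == 0);

    while (i < nbits) {
        if (!bit_is_set(map, i)) {
            i++;
            continue;
        }
        /* Found the start of a run; extend it while bits stay set. */
        uint64_t start = i;
        while (i < nbits && bit_is_set(map, i)) {
            i++;
        }
        if (n < max_out) {
            out[n].offset = start * granularity;
            out[n].length = (i - start) * granularity;
        }
        n++;
    }
    return n;
}
```

With a 64 KiB granularity, a 16-bit map holding the bytes 0x06, 0x81 would describe three ranges under these assumptions: blocks 1-2, block 8 and block 15.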
>>
>> If you do it via paused only, then you also don't need to expose the
>> freeze/rollback mechanisms: the existing clear mechanism alone is
>> sufficient:
>>
>> (A) The frozen backup fails. Nothing new has been written, so we don't
>>     need to adjust anything, we can just try again.
>> (B) The frozen backup succeeds. We can just clear the bitmap before
>>     unfreezing.
>
> We can't query bitmap in paused state - it may take too much time.
>

And I think it is currently unsafe to fetch the data from disk while the
VM is running, so you'll have to solve one or the other problem...

>>
>> I definitely have reservations about using this as a live fleecing
>> mechanism -- the backup block job uses a write-notifier to make
>> just-in-time backups of data before it is altered, leaving it the only
>> "safe" live backup mechanism in QEMU currently. (Alongside mirror.)
>>
>> I actually have some patches from Fam to introduce a live fleecing
>> mechanism into QEMU (The idea being you create a point-in-time drive you
>> can get data from via NBD, then delete it when done) that might be more
>> appropriate, but I ran into a lot of problems with the patch. I'll post
>> the WIP for that patch to try to solicit comments on the best way
>> forward.
>
> After adding
> =============
> --- a/block.c
> +++ b/block.c
> @@ -1276,6 +1276,9 @@ void bdrv_set_backing_hd(BlockDriverState *bs,
> BlockDriverState *backing_hd)
>      /* Otherwise we won't be able to commit due to check in bdrv_commit */
>      bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
>                      bs->backing_blocker);
> +
> +    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
> +                    bs->backing_blocker);
>  out:
>      bdrv_refresh_limits(bs, NULL);
>  }
> ==============
> and tiny fix for qemu_io interface in iotest
>
> Fam's "qemu-iotests: Image fleecing test case 089" works for me. Isn't
> it enough?
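
The write-notifier idea mentioned above (preserve the old data just before a guest write lands) can be sketched with in-memory buffers. This is a toy model under stated assumptions, not QEMU's backup job: CbwState, cbw_guest_write and cbw_background_copy are hypothetical names, and the tiny block size is chosen only to keep the example readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy geometry (assumption): 4 blocks of 4 bytes each. */
enum { CBW_BLOCK_SIZE = 4, CBW_NUM_BLOCKS = 4 };

typedef struct {
    uint8_t *disk;                    /* live image the guest writes to */
    uint8_t *backup;                  /* point-in-time copy target */
    uint8_t copied[CBW_NUM_BLOCKS];   /* 1 once a block is preserved */
} CbwState;

/* Copy one block to the backup unless it has already been preserved. */
static void cbw_preserve(CbwState *s, size_t blk)
{
    if (!s->copied[blk]) {
        memcpy(s->backup + blk * CBW_BLOCK_SIZE,
               s->disk + blk * CBW_BLOCK_SIZE, CBW_BLOCK_SIZE);
        s->copied[blk] = 1;
    }
}

/*
 * Guest write path: before the new data is applied, every block the
 * write touches is saved to the backup, so the backup keeps the
 * contents as of the moment the backup started.
 */
void cbw_guest_write(CbwState *s, size_t offset,
                     const uint8_t *buf, size_t len)
{
    assert(len > 0);
    assert(offset + len <= (size_t)CBW_NUM_BLOCKS * CBW_BLOCK_SIZE);

    size_t first = offset / CBW_BLOCK_SIZE;
    size_t last = (offset + len - 1) / CBW_BLOCK_SIZE;
    for (size_t blk = first; blk <= last; blk++) {
        cbw_preserve(s, blk);
    }
    memcpy(s->disk + offset, buf, len);
}

/* Background pass: copy whatever the write path has not preserved yet. */
void cbw_background_copy(CbwState *s)
{
    for (size_t blk = 0; blk < CBW_NUM_BLOCKS; blk++) {
        cbw_preserve(s, blk);
    }
}
```

In this model the backup ends up with the original contents even if the guest overwrites blocks before the background pass reaches them, which is the property an external reader of a live image does not get for free.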
>
>
>
>>
>> Otherwise, My biggest question here is:
>> "What does fleecing a backup externally provide as a benefit over
>> backing up to an NBD target?"
>
> Look at our answers on v2 of these series:
>
> On 05.02.2016 11:28, Denis V. Lunev wrote:
>> On 02/03/2016 11:14 AM, Fam Zheng wrote:
>>> On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi all.
>>>>
>>>> These series which aims to add external backup api. This is needed
>>>> to allow backup software use our dirty bitmaps.
>>>>
>>>> Vmware and Parallels Cloud Server have this feature.
>>> What is the advantage of this approach over "drive-backup
>>> sync=incremental ..."?
>>
>> This will allow third-party vendors to backup QEMU VMs into
>> their own formats or to the cloud etc.
>
>
>>
>> You can already today perform incremental backups to an NBD target to
>> copy the data out via an external mechanism, is this not sufficient for
>> Parallels? If not, why?
>>
>>>>> In the following patch query-bitmap acquires aio context. This must be
>>>>> of course dropped for frozen bitmap.
>>>>> But to make it in true way, I think, I should check somehow that
>>>>> this is not just frozen bitmap, but the bitmap frozen by qmp command,
>>>>> to avoid incorrect querying of bitmap frozen by internal backup (or
>>>>> other mechanism).. May be, it is not necessary.
>>>>>
>>>>>
>>>>>
>>>>
>
>
=============
--- a/block.c
+++ b/block.c
@@ -1276,6 +1276,9 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
     /* Otherwise we won't be able to commit due to check in bdrv_commit */
     bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
                     bs->backing_blocker);
+
+    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
+                    bs->backing_blocker);
 out:
     bdrv_refresh_limits(bs, NULL);