Message ID | 1456178827-6419-3-git-send-email-jsnow@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, 02/22 17:07, John Snow wrote:
> During incremental backups, if the target has a cluster size that is
> larger than the backup cluster size and we are backing up to a target
> that cannot (for whichever reason) pull clusters up from a backing image,
> we may inadvertently create unusable incremental backup images.
>
> For example:
>
> If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
> of data at a time but the target uses a 128KB cluster size, it is
> possible that only half of a target cluster will be recognized as dirty
> by the backup block job. When the cluster is allocated on the target
> image but only half populated with data, we lose the ability to
> distinguish between zero padding and uninitialized data.
>
> This does not happen if the target image has a backing file that points
> to the last known good backup.
>
> Even if we have a backing file, though, it's likely going to be faster
> to just buffer the redundant data ourselves from the live image than
> fetching it from the backing file, so let's just always round up to the
> target granularity.
>
> The same logic applies to backup modes top, none, and full. Copying
> fractional clusters without the guarantee of COW is dangerous, but even
> if we can rely on COW, it's likely better to just re-copy the data.
>
> Reported-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  block/backup.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/backup.c b/block/backup.c
> index 76addef..a9a4d5c 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -501,6 +501,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
> +    BlockDriverInfo bdi;
>
>      assert(bs);
>      assert(target);
> @@ -578,7 +579,14 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>      job->sync_mode = sync_mode;
>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>                         sync_bitmap : NULL;
> -    job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
> +
> +    /* If there is no backing file on the target, we cannot rely on COW if our
> +     * backup cluster size is smaller than the target cluster size. Instead of
> +     * checking for a backing file, we assume that just copying the data in the
> +     * backup loop is comparable to the unreliable COW. */
> +    bdrv_get_info(job->target, &bdi);

bdrv_get_info can fail and bdi fields are uninitialized. Please test the return
value and handle the error.

Fam

> +    job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
> +
>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run);
>      block_job_txn_add_job(txn, &job->common);
> --
> 2.4.3
>
On 02/23/2016 12:08 AM, Fam Zheng wrote:
> On Mon, 02/22 17:07, John Snow wrote:
>> During incremental backups, if the target has a cluster size that is
>> larger than the backup cluster size and we are backing up to a target
>> that cannot (for whichever reason) pull clusters up from a backing image,
>> we may inadvertently create unusable incremental backup images.
>>
>> For example:
>>
>> If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
>> of data at a time but the target uses a 128KB cluster size, it is
>> possible that only half of a target cluster will be recognized as dirty
>> by the backup block job. When the cluster is allocated on the target
>> image but only half populated with data, we lose the ability to
>> distinguish between zero padding and uninitialized data.
>>
>> This does not happen if the target image has a backing file that points
>> to the last known good backup.
>>
>> Even if we have a backing file, though, it's likely going to be faster
>> to just buffer the redundant data ourselves from the live image than
>> fetching it from the backing file, so let's just always round up to the
>> target granularity.
>>
>> The same logic applies to backup modes top, none, and full. Copying
>> fractional clusters without the guarantee of COW is dangerous, but even
>> if we can rely on COW, it's likely better to just re-copy the data.
>>
>> Reported-by: Fam Zheng <famz@redhat.com>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>  block/backup.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 76addef..a9a4d5c 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -501,6 +501,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>>                    BlockJobTxn *txn, Error **errp)
>>  {
>>      int64_t len;
>> +    BlockDriverInfo bdi;
>>
>>      assert(bs);
>>      assert(target);
>> @@ -578,7 +579,14 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>>      job->sync_mode = sync_mode;
>>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>>                         sync_bitmap : NULL;
>> -    job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
>> +
>> +    /* If there is no backing file on the target, we cannot rely on COW if our
>> +     * backup cluster size is smaller than the target cluster size. Instead of
>> +     * checking for a backing file, we assume that just copying the data in the
>> +     * backup loop is comparable to the unreliable COW. */
>> +    bdrv_get_info(job->target, &bdi);
>
> bdrv_get_info can fail and bdi fields are uninitialized. Please test the return
> value and handle the error.
>
> Fam
>

You're right. I thought it always did the memset, but it does have a
failure route that is open prior to the memset.

>> +    job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
>> +
>>      job->common.len = len;
>>      job->common.co = qemu_coroutine_create(backup_run);
>>      block_job_txn_add_job(txn, &job->common);
>> --
>> 2.4.3
>>
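For illustration only, here is a minimal standalone sketch of the pattern the review asks for: check the return value before reading the info struct. It does not use QEMU code. SketchInfo and sketch_get_info() are hypothetical stand-ins for BlockDriverInfo and bdrv_get_info(), modelled on the behaviour described in this thread (an error path that returns before the memset). Returning the error to the caller is just one possible way to "handle the error"; falling back to the default cluster size would be another.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the 64KB backup cluster size discussed in the thread. */
#define SKETCH_CLUSTER_SIZE_DEFAULT (64 * 1024)

/* Hypothetical stand-in for BlockDriverInfo; only the field used here. */
typedef struct SketchInfo {
    int cluster_size;
} SketchInfo;

/*
 * Hypothetical stand-in for bdrv_get_info(). As described in the thread,
 * it can fail on a path that is taken before *info is zeroed, so the
 * caller must not read *info when a negative value is returned.
 */
int sketch_get_info(int driver_supports_info, int target_cluster_size,
                    SketchInfo *info)
{
    assert(info != NULL);
    if (!driver_supports_info) {
        return -ENOTSUP;            /* early exit: *info is left untouched */
    }
    memset(info, 0, sizeof(*info));
    info->cluster_size = target_cluster_size;
    return 0;
}

/*
 * Returns the cluster size the job should use, or a negative errno value
 * if the target's cluster size could not be queried.
 */
int64_t sketch_pick_cluster_size(int driver_supports_info,
                                 int target_cluster_size)
{
    SketchInfo info;
    int ret = sketch_get_info(driver_supports_info, target_cluster_size,
                              &info);

    if (ret < 0) {
        return ret;                 /* never touch info on failure */
    }
    return info.cluster_size > SKETCH_CLUSTER_SIZE_DEFAULT
           ? info.cluster_size : SKETCH_CLUSTER_SIZE_DEFAULT;
}
```

With a 128KB target this returns 128KB, with a 4KB target it returns the 64KB default, and when the query fails it returns the error instead of a value computed from uninitialized memory.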
diff --git a/block/backup.c b/block/backup.c
index 76addef..a9a4d5c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -501,6 +501,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockJobTxn *txn, Error **errp)
 {
     int64_t len;
+    BlockDriverInfo bdi;
 
     assert(bs);
     assert(target);
@@ -578,7 +579,14 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
     job->sync_mode = sync_mode;
     job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
                        sync_bitmap : NULL;
-    job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
+
+    /* If there is no backing file on the target, we cannot rely on COW if our
+     * backup cluster size is smaller than the target cluster size. Instead of
+     * checking for a backing file, we assume that just copying the data in the
+     * backup loop is comparable to the unreliable COW. */
+    bdrv_get_info(job->target, &bdi);
+    job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
+
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run);
     block_job_txn_add_job(txn, &job->common);
During incremental backups, if the target has a cluster size that is
larger than the backup cluster size and we are backing up to a target
that cannot (for whichever reason) pull clusters up from a backing image,
we may inadvertently create unusable incremental backup images.

For example:

If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
of data at a time but the target uses a 128KB cluster size, it is
possible that only half of a target cluster will be recognized as dirty
by the backup block job. When the cluster is allocated on the target
image but only half populated with data, we lose the ability to
distinguish between zero padding and uninitialized data.

This does not happen if the target image has a backing file that points
to the last known good backup.

Even if we have a backing file, though, it's likely going to be faster
to just buffer the redundant data ourselves from the live image than
fetching it from the backing file, so let's just always round up to the
target granularity.

The same logic applies to backup modes top, none, and full. Copying
fractional clusters without the guarantee of COW is dangerous, but even
if we can rely on COW, it's likely better to just re-copy the data.

Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/backup.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
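The 64KB/128KB example in the commit message can be worked through with a small standalone sketch. This is not the backup job's code: sketch_job_cluster_size() and sketch_align_to_cluster() are hypothetical helpers that only illustrate the arithmetic, assuming the job copies whole clusters of its chosen size.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the 64KB backup cluster size from the commit message. */
#define SKETCH_BACKUP_CLUSTER_DEFAULT (64 * 1024)

/* Hypothetical helper: the larger of the default and the target's size. */
int64_t sketch_job_cluster_size(int64_t target_cluster_size)
{
    return target_cluster_size > SKETCH_BACKUP_CLUSTER_DEFAULT
           ? target_cluster_size : SKETCH_BACKUP_CLUSTER_DEFAULT;
}

/*
 * Hypothetical helper: expand the dirty byte range [offset, offset + bytes)
 * outward to cluster boundaries, giving the range [*start, *end) that a
 * job copying whole clusters would transfer.
 */
void sketch_align_to_cluster(int64_t cluster_size, int64_t offset,
                             int64_t bytes, int64_t *start, int64_t *end)
{
    assert(cluster_size > 0 && offset >= 0 && bytes > 0);
    *start = offset / cluster_size * cluster_size;
    *end = (offset + bytes + cluster_size - 1) / cluster_size * cluster_size;
}
```

Take a single dirty 64KB chunk at offset 192KB on a target with 128KB clusters. With a 64KB job cluster size, only [192KB, 256KB) is copied, which is the second half of the target cluster [128KB, 256KB). With the job cluster size rounded up to 128KB, the copied range becomes [128KB, 256KB), so the whole target cluster is populated.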