Message ID | 20200901125129.6398-1-pl@kamp.de (mailing list archive)
State      | New, archived
Series     | qemu-img: avoid unaligned read requests during convert
On 01.09.20 14:51, Peter Lieven wrote:
> in case of large continuous areas that share the same allocation status
> it happens that the value of s->sector_next_status is unaligned to the
> cluster size or even request alignment of the source. Avoid this by
> stripping down the s->sector_next_status position to cluster boundaries.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  qemu-img.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block
On 01.09.20 14:51, Peter Lieven wrote:
> in case of large continuous areas that share the same allocation status
> it happens that the value of s->sector_next_status is unaligned to the
> cluster size or even request alignment of the source. Avoid this by
> stripping down the s->sector_next_status position to cluster boundaries.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  qemu-img.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)

I've just noticed that with this patch, the iotest 251 fails for vhdx.
Would you be OK with squashing this in?

Max

diff --git a/tests/qemu-iotests/251 b/tests/qemu-iotests/251
index 7918ba3559..294773bdc1 100755
--- a/tests/qemu-iotests/251
+++ b/tests/qemu-iotests/251
@@ -46,8 +46,11 @@ if [ "$IMGOPTSSYNTAX" = "true" ]; then
     # We use json:{} filenames here, so we cannot work with additional options.
     _unsupported_fmt $IMGFMT
 else
-    # With VDI, the output is ordered differently. Just disable it.
-    _unsupported_fmt vdi
+    # - With VDI, the output is ordered differently. Just disable it.
+    # - VHDX has large clusters; because qemu-img convert tries to
+    #   align the requests to the cluster size, the output is ordered
+    #   differently, so disable it, too.
+    _unsupported_fmt vdi vhdx
 fi
> On 10.09.2020 at 18:58, Max Reitz <mreitz@redhat.com> wrote:
>
> On 01.09.20 14:51, Peter Lieven wrote:
>> in case of large continuous areas that share the same allocation status
>> it happens that the value of s->sector_next_status is unaligned to the
>> cluster size or even request alignment of the source. Avoid this by
>> stripping down the s->sector_next_status position to cluster boundaries.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> qemu-img.c | 22 ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>
> I've just noticed that with this patch, the iotest 251 fails for vhdx.
> Would you be OK with squashing this in?

Sure, squash it in.

Thanks
Peter
diff --git a/qemu-img.c b/qemu-img.c
index 5308773811..ed17238c36 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1665,6 +1665,7 @@ enum ImgConvertBlockStatus {
 typedef struct ImgConvertState {
     BlockBackend **src;
     int64_t *src_sectors;
+    int *src_alignment;
     int src_num;
     int64_t total_sectors;
     int64_t allocated_sectors;
@@ -1731,6 +1732,7 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
     if (s->sector_next_status <= sector_num) {
         uint64_t offset = (sector_num - src_cur_offset) * BDRV_SECTOR_SIZE;
         int64_t count;
+        int tail;

         do {
             count = n * BDRV_SECTOR_SIZE;
@@ -1769,6 +1771,16 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)

         n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);

+        /*
+         * Avoid that s->sector_next_status becomes unaligned to the source
+         * request alignment and/or cluster size to avoid unnecessary read
+         * cycles.
+         */
+        tail = (sector_num - src_cur_offset + n) % s->src_alignment[src_cur];
+        if (n > tail) {
+            n -= tail;
+        }
+
         if (ret & BDRV_BLOCK_ZERO) {
             s->status = post_backing_zero ? BLK_BACKING_FILE : BLK_ZERO;
         } else if (ret & BDRV_BLOCK_DATA) {
@@ -2407,8 +2419,10 @@ static int img_convert(int argc, char **argv)

     s.src = g_new0(BlockBackend *, s.src_num);
     s.src_sectors = g_new(int64_t, s.src_num);
+    s.src_alignment = g_new(int, s.src_num);

     for (bs_i = 0; bs_i < s.src_num; bs_i++) {
+        BlockDriverState *src_bs;
         s.src[bs_i] = img_open(image_opts, argv[optind + bs_i],
                                fmt, src_flags, src_writethrough, s.quiet,
                                force_share);
@@ -2423,6 +2437,13 @@ static int img_convert(int argc, char **argv)
             ret = -1;
             goto out;
         }
+        src_bs = blk_bs(s.src[bs_i]);
+        s.src_alignment[bs_i] = DIV_ROUND_UP(src_bs->bl.request_alignment,
+                                             BDRV_SECTOR_SIZE);
+        if (!bdrv_get_info(src_bs, &bdi)) {
+            s.src_alignment[bs_i] = MAX(s.src_alignment[bs_i],
+                                        bdi.cluster_size / BDRV_SECTOR_SIZE);
+        }
         s.total_sectors += s.src_sectors[bs_i];
     }

@@ -2704,6 +2725,7 @@ out:
         g_free(s.src);
     }
     g_free(s.src_sectors);
+    g_free(s.src_alignment);

 fail_getopt:
     g_free(options);
in case of large continuous areas that share the same allocation status
it happens that the value of s->sector_next_status is unaligned to the
cluster size or even request alignment of the source. Avoid this by
stripping down the s->sector_next_status position to cluster boundaries.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 qemu-img.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
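The trimming the patch adds is simple modular arithmetic, but it is easy to
misread in diff form. Below is a minimal, standalone sketch; it is not part
of the patch or of QEMU, and the helper names (source_alignment,
trim_to_alignment, SECTOR_SIZE) and the example numbers are hypothetical. It
mirrors the two steps in the diff above: derive a per-source alignment in
sectors from the request alignment and cluster size, then cut the tail of a
block-status range so the next query starts on an alignment boundary.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512

/*
 * Per-source alignment in sectors, analogous to what the patch derives
 * from the driver's request alignment and cluster size (the values used
 * here are made up, not queried from a real image).
 */
static int source_alignment(int request_alignment, int cluster_size)
{
    int align = (request_alignment + SECTOR_SIZE - 1) / SECTOR_SIZE;

    if (cluster_size / SECTOR_SIZE > align) {
        align = cluster_size / SECTOR_SIZE;
    }
    return align;
}

/*
 * Trim a block-status range "n" (in sectors, starting at "pos" relative to
 * the start of the source) so that the next status query begins on an
 * alignment boundary; this is the arithmetic added to
 * convert_iteration_sectors().
 */
static int64_t trim_to_alignment(int64_t pos, int64_t n, int alignment)
{
    int64_t tail = (pos + n) % alignment;  /* sectors past the last boundary */

    if (n > tail) {                        /* never trim the range to zero */
        n -= tail;
    }
    return n;
}

int main(void)
{
    /* Hypothetical VHDX-like source: 4 KiB requests, 2 MiB clusters. */
    int align = source_alignment(4096, 2 * 1024 * 1024);          /* 4096 */

    /* A range ending 100 sectors past a cluster boundary is trimmed... */
    printf("%" PRId64 "\n", trim_to_alignment(0, 8292, align));   /* 8192 */

    /* ...while a range shorter than the tail is left untouched. */
    printf("%" PRId64 "\n", trim_to_alignment(8192, 100, align)); /* 100 */

    return 0;
}

With a 2 MiB cluster the first call trims the range to 8192 sectors, so the
following status query (and the reads derived from it) starts on a cluster
boundary; the second call returns 100 unchanged, which corresponds to the
"if (n > tail)" guard in the patch that keeps short ranges intact.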