diff mbox

[v3,1/4] block: Pass bdrv_* methods to bs->file by default

Message ID 20170704084059.6278-2-el13635@mail.ntua.gr (mailing list archive)
State New, archived
Headers show

Commit Message

Manos Pitsidianakis July 4, 2017, 8:40 a.m. UTC
The following functions fail if bs->drv does not implement them:

bdrv_probe_blocksizes
bdrv_probe_geometry
bdrv_truncate
bdrv_has_zero_init
bdrv_get_info
bdrv_media_changed
bdrv_eject
bdrv_lock_medium
bdrv_co_ioctl

Instead, the call should be passed to bs->file if it exists, to allow
filter drivers to support those methods without implementing them.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
---
 block.c    | 27 +++++++++++++++++++++++++--
 block/io.c |  5 +++++
 2 files changed, 30 insertions(+), 2 deletions(-)

Comments

Kevin Wolf July 4, 2017, 9:44 a.m. UTC | #1
Am 04.07.2017 um 10:40 hat Manos Pitsidianakis geschrieben:
> The following functions fail if bs->drv does not implement them:
> 
> bdrv_probe_blocksizes
> bdrv_probe_geometry
> bdrv_truncate
> bdrv_has_zero_init
> bdrv_get_info
> bdrv_media_changed
> bdrv_eject
> bdrv_lock_medium
> bdrv_co_ioctl
> 
> Instead, the call should be passed to bs->file if it exists, to allow
> filter drivers to support those methods without implementing them.
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
> ---
>  block.c    | 27 +++++++++++++++++++++++++--
>  block/io.c |  5 +++++
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 69439628..8a122142 100644
> --- a/block.c
> +++ b/block.c
> @@ -494,6 +494,8 @@ int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
>  
>      if (drv && drv->bdrv_probe_blocksizes) {
>          return drv->bdrv_probe_blocksizes(bs, bsz);
> +    } else if (drv && bs->file && bs->file->bs) {

The check for bs->file->bs isn't really necessary, we never have a
BdrvChild with a NULL bs.

> +        return bdrv_probe_blocksizes(bs->file->bs, bsz);
>      }
>  
>      return -ENOTSUP;

This change actually changes existing behaviour. Basically, all of your
additions also apply for qcow2 images or other image formats, and we
need to check whether they make sense there.

bdrv_probe_blocksizes() is used for the default values for the physical
and logical block sizes for block devices if no value is explicitly
specified in -device. This makes sense as it will allow the guest to
optimise its requests so that they align with the block size that is
optimal for the storage that contains the image. (I also think that we
probably should implement .bdrv_probe_blocksizes() not only for
host_device, but also for regular files to get optimal performance by
default.)

Changing the defaults can be a problem for cross-version live migration
if the default values are used. Management tools that want to do live
migration are supposed to explicitly provide all options they can, so
this should be okay.

Eric, does libvirt set the physical/logical block size explicitly? If
not, it would be good if it started to do so.

This hunk is non-trivial enough that it's probably worth a patch of its
own with a detailed commit message explaining why we want this change
and why it is safe.

> @@ -511,6 +513,8 @@ int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
>  
>      if (drv && drv->bdrv_probe_geometry) {
>          return drv->bdrv_probe_geometry(bs, geo);
> +    } else if (drv && bs->file && bs->file->bs) {
> +        return bdrv_probe_geometry(bs->file->bs, geo);
>      }
>  
>      return -ENOTSUP;

The probed geometry would refer to the physical image file, not to the
actually presented image. So propagating this to bs->file is wrong for
image formats.

Possibly checking .is_filter in addition would be enough.

If you decide that this is the right field, please add some note to the
comment for the .is_filter declaration that tells about the assumptions
that are made when the field is set. I'm not completely sure what they
are, but maybe something like this: If bs->file != NULL, then all data
exposed by the filter node is mapped at the same offset in bs->file.

> @@ -3406,11 +3410,15 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, Error **errp)
>  
>      assert(child->perm & BLK_PERM_RESIZE);
>  
> +    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
>      if (!drv) {
>          error_setg(errp, "No medium inserted");
>          return -ENOMEDIUM;
>      }
>      if (!drv->bdrv_truncate) {
> +        if (bs->file && bs->file->bs) {
> +            return bdrv_truncate(bs->file, offset, errp);
> +        }
>          error_setg(errp, "Image format driver does not support resize");
>          return -ENOTSUP;
>      }

This is actively dangerous without an .is_filter check!

All image formats that don't support image resizing would instead just
truncate the image file without considering the format at all.

> @@ -3778,6 +3786,9 @@ int bdrv_has_zero_init(BlockDriverState *bs)
>      if (bs->drv->bdrv_has_zero_init) {
>          return bs->drv->bdrv_has_zero_init(bs);
>      }
> +    if (bs->file && bs->file->bs) {
> +        return bdrv_has_zero_init(bs->file->bs);
> +    }
>  
>      /* safe default */
>      return 0;

This is wrong for image formats.

> @@ -3832,10 +3843,16 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
>  int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
>  {
>      BlockDriver *drv = bs->drv;
> -    if (!drv)
> +    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
> +    if (!drv) {
>          return -ENOMEDIUM;
> -    if (!drv->bdrv_get_info)
> +    }
> +    if (!drv->bdrv_get_info) {
> +        if (bs->file && bs->file->bs) {
> +            return bdrv_get_info(bs->file->bs, bdi);
> +        }
>          return -ENOTSUP;
> +    }
>      memset(bdi, 0, sizeof(*bdi));
>      return drv->bdrv_get_info(bs, bdi);
>  }

Same thing.

> @@ -4205,6 +4222,8 @@ int bdrv_media_changed(BlockDriverState *bs)
>  
>      if (drv && drv->bdrv_media_changed) {
>          return drv->bdrv_media_changed(bs);
> +    } else if (drv && bs->file && bs->file->bs) {
> +        return bdrv_media_changed(bs->file->bs);
>      }
>      return -ENOTSUP;
>  }

This function appears to be unused, so instead of discussing whether
this actually makes sense, we can simply remove it. :-)

> @@ -4218,6 +4237,8 @@ void bdrv_eject(BlockDriverState *bs, bool eject_flag)
>  
>      if (drv && drv->bdrv_eject) {
>          drv->bdrv_eject(bs, eject_flag);
> +    } else if (drv && bs->file && bs->file->bs) {
> +        bdrv_eject(bs->file->bs, eject_flag);
>      }
>  }
>  
> @@ -4233,6 +4254,8 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked)
>  
>      if (drv && drv->bdrv_lock_medium) {
>          drv->bdrv_lock_medium(bs, locked);
> +    } else if (drv && bs->file && bs->file->bs) {
> +        bdrv_lock_medium(bs->file->bs, locked);
>      }
>  }

I'm not sure if media change can be handled transparently without
explicit support in all drivers in the chain. This is probably even true
for filter drivers.

So I think I would just drop these two hunks from the patch.

> diff --git a/block/io.c b/block/io.c
> index 5c146b5a..c22a9bf2 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2425,6 +2425,11 @@ int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf)
>      };
>      BlockAIOCB *acb;
>  
> +    if (drv && !drv->bdrv_aio_ioctl && !drv->bdrv_co_ioctl &&
> +        bs->file && bs->file->bs) {
> +        return bdrv_co_ioctl(bs->file->bs, req, buf);
> +    }
> +
>      bdrv_inc_in_flight(bs);
>      if (!drv || (!drv->bdrv_aio_ioctl && !drv->bdrv_co_ioctl)) {
>          co.ret = -ENOTSUP;

This makes me nervous. Basically, SCSI passthrough doesn't work well
with anything that isn't just a file-posix BDS. Bypassing block nodes
for it, whether image formats or filter drivers, might get these drivers
into surprising states and might even break the image.

I'd rather not support bdrv_co_ioctl() for anything that can't
explicitly handle it.

Kevin
Eric Blake July 5, 2017, 12:40 p.m. UTC | #2
[adding libvirt for block size discussion]

On 07/04/2017 04:44 AM, Kevin Wolf wrote:
> Am 04.07.2017 um 10:40 hat Manos Pitsidianakis geschrieben:
>> The following functions fail if bs->drv does not implement them:
>>
>> bdrv_probe_blocksizes
>> bdrv_probe_geometry
>> bdrv_truncate
>> bdrv_has_zero_init
>> bdrv_get_info
>> bdrv_media_changed
>> bdrv_eject
>> bdrv_lock_medium
>> bdrv_co_ioctl
>>
>> Instead, the call should be passed to bs->file if it exists, to allow
>> filter drivers to support those methods without implementing them.
>>

> 
>> +        return bdrv_probe_blocksizes(bs->file->bs, bsz);
>>      }
>>  
>>      return -ENOTSUP;
> 
> This change actually changes existing behaviour. Basically, all of your
> additions also apply for qcow2 images or other image formats, and we
> need to check whether they make sense there.
> 
> bdrv_probe_blocksizes() is used for the default values for the physical
> and logical block sizes for block devices if no value is explicitly
> specified in -device. This makes sense as it will allow the guest to
> optimise its requests so that they align with the block size that is
> optimal for the storage that contains the image. (I also think that we
> probably should implement .bdrv_probe_blocksizes() not only for
> host_device, but also for regular files to get optimal performance by
> default.)
> 
> Changing the defaults can be a problem for cross-version live migration
> if the default values are used. Management tools that want to do live
> migration are supposed to explicitly provide all options they can, so
> this should be okay.
> 
> Eric, does libvirt set the physical/logical block size explicitly? If
> not, it would be good if it started to do so.

Libvirt supports user control of block size via this XML in <disk>:
    <blockio logical_block_size='512' physical_block_size='4096'/>
which it translates to command-line logical_block_size= and
physical_block_size=.  But I didn't see anywhere that libvirt tries to
do anything with block size in QMP, which makes me wonder if that is a
libvirt bug for not allowing a block size when hot-plugging a drive, or
a weakness in QMP.

Your point that if libvirt isn't always specifying block size, then we
risk the block size changing on migration (that is, libvirt should be
patched to specify block size always, and not just when the XML
specifies it).
Eric Blake July 5, 2017, 12:43 p.m. UTC | #3
On 07/04/2017 04:44 AM, Kevin Wolf wrote:
> Am 04.07.2017 um 10:40 hat Manos Pitsidianakis geschrieben:
>> The following functions fail if bs->drv does not implement them:
>>

>> @@ -511,6 +513,8 @@ int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
>>  
>>      if (drv && drv->bdrv_probe_geometry) {
>>          return drv->bdrv_probe_geometry(bs, geo);
>> +    } else if (drv && bs->file && bs->file->bs) {
>> +        return bdrv_probe_geometry(bs->file->bs, geo);
>>      }
>>  
>>      return -ENOTSUP;
> 
> The probed geometry would refer to the physical image file, not to the
> actually presented image. So propagating this to bs->file is wrong for
> image formats.
> 
> Possibly checking .is_filter in addition would be enough.

In fact, this made me think: if we define .is_filter as an enum instead
of a bool, we could implement 0 for no filter, 1 for file filter, and 2
for backing filter.  Then instead of having to have 2 separate default
bdrv_get_block_status_for_*() generic functions in patch 4, we could
instead patch bdrv_co_get_block_status() to check .is_filter, and
automatically fall back to bs->file or bs->backing according to the enum
value stored in .is_filter.
Kevin Wolf July 5, 2017, 1:37 p.m. UTC | #4
Am 05.07.2017 um 14:43 hat Eric Blake geschrieben:
> On 07/04/2017 04:44 AM, Kevin Wolf wrote:
> > Am 04.07.2017 um 10:40 hat Manos Pitsidianakis geschrieben:
> >> The following functions fail if bs->drv does not implement them:
> >>
> 
> >> @@ -511,6 +513,8 @@ int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
> >>  
> >>      if (drv && drv->bdrv_probe_geometry) {
> >>          return drv->bdrv_probe_geometry(bs, geo);
> >> +    } else if (drv && bs->file && bs->file->bs) {
> >> +        return bdrv_probe_geometry(bs->file->bs, geo);
> >>      }
> >>  
> >>      return -ENOTSUP;
> > 
> > The probed geometry would refer to the physical image file, not to the
> > actually presented image. So propagating this to bs->file is wrong for
> > image formats.
> > 
> > Possibly checking .is_filter in addition would be enough.
> 
> In fact, this made me think: if we define .is_filter as an enum instead
> of a bool, we could implement 0 for no filter, 1 for file filter, and 2
> for backing filter.  Then instead of having to have 2 separate default
> bdrv_get_block_status_for_*() generic functions in patch 4, we could
> instead patch bdrv_co_get_block_status() to check .is_filter, and
> automatically fall back to bs->file or bs->backing according to the enum
> value stored in .is_filter.

After reviewing the rest of the series, I'm not completely convinced of
using .is_filter any more. If you look at the patch for the raw format
driver, it actually turns out that even though we would intuitively call
it a filter, it has different properties than for example a throttle
filter and passing through function calls to bs->file makes sense for a
different set of functions.

Kevin
diff mbox

Patch

diff --git a/block.c b/block.c
index 69439628..8a122142 100644
--- a/block.c
+++ b/block.c
@@ -494,6 +494,8 @@  int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
 
     if (drv && drv->bdrv_probe_blocksizes) {
         return drv->bdrv_probe_blocksizes(bs, bsz);
+    } else if (drv && bs->file && bs->file->bs) {
+        return bdrv_probe_blocksizes(bs->file->bs, bsz);
     }
 
     return -ENOTSUP;
@@ -511,6 +513,8 @@  int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
 
     if (drv && drv->bdrv_probe_geometry) {
         return drv->bdrv_probe_geometry(bs, geo);
+    } else if (drv && bs->file && bs->file->bs) {
+        return bdrv_probe_geometry(bs->file->bs, geo);
     }
 
     return -ENOTSUP;
@@ -3406,11 +3410,15 @@  int bdrv_truncate(BdrvChild *child, int64_t offset, Error **errp)
 
     assert(child->perm & BLK_PERM_RESIZE);
 
+    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
     if (!drv) {
         error_setg(errp, "No medium inserted");
         return -ENOMEDIUM;
     }
     if (!drv->bdrv_truncate) {
+        if (bs->file && bs->file->bs) {
+            return bdrv_truncate(bs->file, offset, errp);
+        }
         error_setg(errp, "Image format driver does not support resize");
         return -ENOTSUP;
     }
@@ -3778,6 +3786,9 @@  int bdrv_has_zero_init(BlockDriverState *bs)
     if (bs->drv->bdrv_has_zero_init) {
         return bs->drv->bdrv_has_zero_init(bs);
     }
+    if (bs->file && bs->file->bs) {
+        return bdrv_has_zero_init(bs->file->bs);
+    }
 
     /* safe default */
     return 0;
@@ -3832,10 +3843,16 @@  void bdrv_get_backing_filename(BlockDriverState *bs,
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
     BlockDriver *drv = bs->drv;
-    if (!drv)
+    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
+    if (!drv) {
         return -ENOMEDIUM;
-    if (!drv->bdrv_get_info)
+    }
+    if (!drv->bdrv_get_info) {
+        if (bs->file && bs->file->bs) {
+            return bdrv_get_info(bs->file->bs, bdi);
+        }
         return -ENOTSUP;
+    }
     memset(bdi, 0, sizeof(*bdi));
     return drv->bdrv_get_info(bs, bdi);
 }
@@ -4205,6 +4222,8 @@  int bdrv_media_changed(BlockDriverState *bs)
 
     if (drv && drv->bdrv_media_changed) {
         return drv->bdrv_media_changed(bs);
+    } else if (drv && bs->file && bs->file->bs) {
+        return bdrv_media_changed(bs->file->bs);
     }
     return -ENOTSUP;
 }
@@ -4218,6 +4237,8 @@  void bdrv_eject(BlockDriverState *bs, bool eject_flag)
 
     if (drv && drv->bdrv_eject) {
         drv->bdrv_eject(bs, eject_flag);
+    } else if (drv && bs->file && bs->file->bs) {
+        bdrv_eject(bs->file->bs, eject_flag);
     }
 }
 
@@ -4233,6 +4254,8 @@  void bdrv_lock_medium(BlockDriverState *bs, bool locked)
 
     if (drv && drv->bdrv_lock_medium) {
         drv->bdrv_lock_medium(bs, locked);
+    } else if (drv && bs->file && bs->file->bs) {
+        bdrv_lock_medium(bs->file->bs, locked);
     }
 }
 
diff --git a/block/io.c b/block/io.c
index 5c146b5a..c22a9bf2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2425,6 +2425,11 @@  int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf)
     };
     BlockAIOCB *acb;
 
+    if (drv && !drv->bdrv_aio_ioctl && !drv->bdrv_co_ioctl &&
+        bs->file && bs->file->bs) {
+        return bdrv_co_ioctl(bs->file->bs, req, buf);
+    }
+
     bdrv_inc_in_flight(bs);
     if (!drv || (!drv->bdrv_aio_ioctl && !drv->bdrv_co_ioctl)) {
         co.ret = -ENOTSUP;