block: fix write with zero flag set and iovector provided

Message ID 1517494591-39369-1-git-send-email-anton.nefedov@virtuozzo.com (mailing list archive)
State New, archived

Commit Message

Anton Nefedov Feb. 1, 2018, 2:16 p.m. UTC
The normal bdrv_co_pwritev() use is either
  - BDRV_REQ_ZERO_WRITE reset and iovector provided
  - BDRV_REQ_ZERO_WRITE set and iovector == NULL

while
  - the flag reset and iovector == NULL is an assertion failure
    in bdrv_co_do_zero_pwritev()
  - the flag set and iovector provided is in fact allowed
    (the flag prevails and zeroes are written)

However the alignment logic does not support the latter case so the padding
areas get overwritten with zeroes.

Solution could be to forbid such case or just use bdrv_co_do_zero_pwritev()
alignment for it which also makes the code a bit more obvious anyway.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
---
 block/io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
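
The four flag/iovector combinations described above can be modelled with a small standalone sketch. It is not QEMU code: `MODEL_REQ_ZERO_WRITE`, `ModelPath` and both `model_path_*` helpers are hypothetical names, and the flag value is a stand-in. It only mirrors the branch condition before and after the one-line change.

```c
#include <stddef.h>

/* Stand-in for BDRV_REQ_ZERO_WRITE; the numeric value is illustrative only. */
#define MODEL_REQ_ZERO_WRITE 0x2

typedef enum {
    PATH_ZERO_WRITE,   /* models the bdrv_co_do_zero_pwritev() branch */
    PATH_DATA_WRITE    /* models the generic write branch that follows it */
} ModelPath;

/* Branch condition before the patch: keyed on the iovector pointer. */
static ModelPath model_path_before(const void *qiov, int flags)
{
    (void)flags;
    return qiov == NULL ? PATH_ZERO_WRITE : PATH_DATA_WRITE;
}

/* Branch condition after the patch: keyed on the flag. */
static ModelPath model_path_after(const void *qiov, int flags)
{
    (void)qiov;
    return (flags & MODEL_REQ_ZERO_WRITE) ? PATH_ZERO_WRITE : PATH_DATA_WRITE;
}
```

In this model the two "normal" combinations pick the same branch before and after. Only the mixed ones move: flag set with an iovector now takes the zero-write branch, and flag clear with a NULL iovector now takes the data branch.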

Comments

Alberto Garcia Feb. 1, 2018, 2:29 p.m. UTC | #1
On Thu 01 Feb 2018 03:16:31 PM CET, Anton Nefedov wrote:
> The normal bdrv_co_pwritev() use is either
>   - BDRV_REQ_ZERO_WRITE reset and iovector provided
>   - BDRV_REQ_ZERO_WRITE set and iovector == NULL
>
> while
>   - the flag reset and iovector == NULL is an assertion failure
>     in bdrv_co_do_zero_pwritev()

Where is that assertion?

Berto

Eric Blake Feb. 1, 2018, 2:36 p.m. UTC | #2
On 02/01/2018 08:16 AM, Anton Nefedov wrote:
> The normal bdrv_co_pwritev() use is either
>   - BDRV_REQ_ZERO_WRITE reset and iovector provided

s/reset/clear/

>   - BDRV_REQ_ZERO_WRITE set and iovector == NULL
> 
> while
>   - the flag reset and iovector == NULL is an assertion failure

again

>     in bdrv_co_do_zero_pwritev()
>   - the flag set and iovector provided is in fact allowed
>     (the flag prevails and zeroes are written)
> 
> However the alignment logic does not support the latter case so the padding
> areas get overwritten with zeroes.
> 
> Solution could be to forbid such case or just use bdrv_co_do_zero_pwritev()
> alignment for it which also makes the code a bit more obvious anyway.
> 
> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 7ea4023..cf63fd0 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
>       */
>      tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
>  
> -    if (!qiov) {
> +    if (flags & BDRV_REQ_ZERO_WRITE) {
>          ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, &req);

So now, the flag rules, but we assert that !qiov (so it would only break
a caller that passed the flag but used qiov, which you argued shouldn't
exist).

Reviewed-by: Eric Blake <eblake@redhat.com>

Anton Nefedov Feb. 1, 2018, 2:38 p.m. UTC | #3
On 1/2/2018 5:29 PM, Alberto Garcia wrote:
> On Thu 01 Feb 2018 03:16:31 PM CET, Anton Nefedov wrote:
>> The normal bdrv_co_pwritev() use is either
>>    - BDRV_REQ_ZERO_WRITE reset and iovector provided
>>    - BDRV_REQ_ZERO_WRITE set and iovector == NULL
>>
>> while
>>    - the flag reset and iovector == NULL is an assertion failure
>>      in bdrv_co_do_zero_pwritev()
> 
> Where is that assertion?
> 
> Berto
> 

beginning of bdrv_co_do_zero_pwritev():

     assert(flags & BDRV_REQ_ZERO_WRITE);

and bdrv_co_do_zero_pwritev() was only called with qiov==NULL.

Now this case will instead segfault at some point.
Don't know if it needs a separate assertion.

/Anton

Eric Blake Feb. 1, 2018, 2:40 p.m. UTC | #4
On 02/01/2018 08:36 AM, Eric Blake wrote:
> On 02/01/2018 08:16 AM, Anton Nefedov wrote:
>> The normal bdrv_co_pwritev() use is either
>>   - BDRV_REQ_ZERO_WRITE reset and iovector provided
> 
> s/reset/clear/
> 
>>   - BDRV_REQ_ZERO_WRITE set and iovector == NULL
>>
>> while
>>   - the flag reset and iovector == NULL is an assertion failure
> 
> again
> 
>>     in bdrv_co_do_zero_pwritev()
>>   - the flag set and iovector provided is in fact allowed
>>     (the flag prevails and zeroes are written)
>>
>> However the alignment logic does not support the latter case so the padding
>> areas get overwritten with zeroes.
>>
>> Solution could be to forbid such case or just use bdrv_co_do_zero_pwritev()
>> alignment for it which also makes the code a bit more obvious anyway.
>>
>> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
>> ---
>>  block/io.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index 7ea4023..cf63fd0 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
>>       */
>>      tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
>>  
>> -    if (!qiov) {
>> +    if (flags & BDRV_REQ_ZERO_WRITE) {
>>          ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, &req);
> 
> So now, the flag rules, but we assert that !qiov (so it would only break
> a caller that passed the flag but used qiov, which you argued shouldn't
> exist).

Sorry, I hit send too soon.  I'm asking if we should have assert(!qiov)
right before calling bdrv_co_do_zero_pwritev (it would break a caller
that passed the flag and qiov, but you were arguing that such callers
previously misbehaved, so we don't want such callers).

But adding such an assertion may trigger failures that we'd have to fix,
while leaving things without the assertion conservatively seems okay.

> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
>

Alberto Garcia Feb. 1, 2018, 3:19 p.m. UTC | #5
On Thu 01 Feb 2018 03:40:51 PM CET, Eric Blake wrote:
>>> --- a/block/io.c
>>> +++ b/block/io.c
>>> @@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
>>>       */
>>>      tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
>>>  
>>> -    if (!qiov) {
>>> +    if (flags & BDRV_REQ_ZERO_WRITE) {
>>>          ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, &req);
>> 
>> So now, the flag rules, but we assert that !qiov (so it would only break
>> a caller that passed the flag but used qiov, which you argued shouldn't
>> exist).
>
> Sorry, I hit send too soon.  I'm asking if we should have
> assert(!qiov) right before calling bdrv_co_do_zero_pwritev (it would
> break a caller that passed the flag and qiov, but you were arguing
> that such callers previously misbehaved, so we don't want such
> callers).

Those callers do exist as a matter of fact: bdrv_rw_co_entry() always
passes a qiov to bdrv_co_pwritev() regardless of the flags (the request
size is actually taken from the very qiov).

bdrv_pwrite_zeroes() is one example:

$ qemu-img create -f qcow2 base.img 100M
$ qemu-img create -f qcow2 -b base.img active.img
$ qemu-io -c 'write -z 0 128k' -f qcow2 active.img 
$ qemu-img amend -o compat=0.10 active.img 

It even uses an iovec with iov_base = NULL but iov_len != 0, which looks
like an abuse of the data structure.
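
To make the "iov_base = NULL but iov_len != 0" point concrete, here is a minimal sketch using the POSIX `struct iovec` rather than QEMU's own QEMUIOVector wrapper. The helper `model_iov_size()` is a hypothetical name. It shows that a length can be derived from such a vector without ever touching the data pointer, which is presumably why this pattern works for carrying only a size.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Sum the lengths of all elements. Only iov_len is read, so an element
 * with iov_base == NULL is harmless here; it would only be a problem
 * if something later dereferenced the base pointer.
 */
static size_t model_iov_size(const struct iovec *iov, int niov)
{
    size_t total = 0;

    for (int i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}
```

A single element with `.iov_base = NULL` and `.iov_len = 128 * 1024` yields a size of 131072 bytes in this model, matching the 128k zero write in the commands above.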

Berto

Stefan Hajnoczi Feb. 6, 2018, 4:11 p.m. UTC | #6
On Thu, Feb 01, 2018 at 05:16:31PM +0300, Anton Nefedov wrote:
> The normal bdrv_co_pwritev() use is either
>   - BDRV_REQ_ZERO_WRITE reset and iovector provided
>   - BDRV_REQ_ZERO_WRITE set and iovector == NULL
> 
> while
>   - the flag reset and iovector == NULL is an assertion failure
>     in bdrv_co_do_zero_pwritev()
>   - the flag set and iovector provided is in fact allowed
>     (the flag prevails and zeroes are written)
> 
> However the alignment logic does not support the latter case so the padding
> areas get overwritten with zeroes.

Please include a test case.  Berto mentioned that bdrv_pwrite_zeroes()
hits this issue, that might be one way to test it.
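
When choosing offsets for such a test, it may help to work out which bytes count as padding for an unaligned request. The following is a rough sketch of the usual round-down/round-up arithmetic, not the code in block/io.c. `ModelAligned` and `model_align()` are hypothetical names, and the alignment is assumed to be a power of two.

```c
#include <stdint.h>

typedef struct {
    uint64_t start;  /* aligned start offset */
    uint64_t end;    /* aligned end offset (exclusive) */
    uint64_t head;   /* padding bytes between start and the request */
    uint64_t tail;   /* padding bytes between the request and end */
} ModelAligned;

/* Expand [offset, offset + bytes) to alignment boundaries. */
static ModelAligned model_align(uint64_t offset, uint64_t bytes, uint64_t align)
{
    ModelAligned a;

    a.start = offset & ~(align - 1);                       /* round down */
    a.end = (offset + bytes + align - 1) & ~(align - 1);   /* round up */
    a.head = offset - a.start;
    a.tail = a.end - (offset + bytes);
    return a;
}
```

For example, a 100-byte request at offset 1000 with 512-byte alignment expands to [512, 1536), with 488 bytes of head padding and 436 bytes of tail padding. A test would want non-zero data in those padding bytes beforehand and would check that they survive the zero write.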

> Solution could be to forbid such case or just use bdrv_co_do_zero_pwritev()
> alignment for it which also makes the code a bit more obvious anyway.
> 
> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 7ea4023..cf63fd0 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
>       */
>      tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
>  
> -    if (!qiov) {
> +    if (flags & BDRV_REQ_ZERO_WRITE) {
>          ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, &req);
>          goto out;
>      }

Looks good.

Patch

diff --git a/block/io.c b/block/io.c
index 7ea4023..cf63fd0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
      */
     tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);
 
-    if (!qiov) {
+    if (flags & BDRV_REQ_ZERO_WRITE) {
         ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, &req);
         goto out;
     }