diff mbox series

[v7,09/11] qcow2: don't reallocate host clusters under guest operation

Message ID 20210904162428.222008-10-vsementsov@virtuozzo.com (mailing list archive)
State New, archived
Headers show
Series qcow2: fix parallel rewrite and discard (reqlist) | expand

Commit Message

Vladimir Sementsov-Ogievskiy Sept. 4, 2021, 4:24 p.m. UTC
We have the following bug:

1. Start write to qcow2. Assume guest cluster G and corresponding host
   cluster is H.

2. The write requests come to the point of data writing to .file. The
   write to .file is started and qcow2 mutex is unlocked.

3. At this time refcount of H becomes 0. For example, it may be due to
   discard operation on qcow2 node, or rewriting compressed data by
   normal write, or some operation with snapshots..

4. Next, some operations occurs and leads to allocation of H for some
   other needs: it may be another write-to-qcow2-node operation, or
   allocation of L2 table or some other data or metadata cluster
   allocation.

5. So, at this point H is used for something other. Assume, L2 table is
   written into H.

6. And now, our write from [2] finishes. And pollutes L2 table in H.
   That's a bug.

To fix the bug we now have guest operations tracking. The remaining
thing to do is to handle these in-flight operations in
is_cluster_free() function which is the core of allocating host
clusters.

iotest qcow2-discard-during-rewrite is enabled, as it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-refcount.c                                | 7 +++++--
 tests/qemu-iotests/tests/qcow2-discard-during-rewrite | 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 13b1fed43e..e71bb9b089 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -962,7 +962,7 @@  int qcow2_update_cluster_refcount(BlockDriverState *bs,
 
 
 /*
- * Cluster is free when its refcount is 0
+ * Cluster is free when its refcount is 0 and there is no in-flight writes
  *
  * Return < 0 if failed to get refcount
  *          0 if cluster is not free
@@ -970,6 +970,7 @@  int qcow2_update_cluster_refcount(BlockDriverState *bs,
  */
 static int is_cluster_free(BlockDriverState *bs, int64_t cluster_index)
 {
+    BDRVQcow2State *s = bs->opaque;
     int ret;
     uint64_t refcount;
 
@@ -978,7 +979,9 @@  static int is_cluster_free(BlockDriverState *bs, int64_t cluster_index)
         return ret;
     }
 
-    return refcount == 0;
+    return refcount == 0 &&
+        !reqlist_find_conflict(&s->guest_reqs, cluster_index * s->cluster_size,
+                               s->cluster_size);
 }
 
 /* return < 0 if error */
diff --git a/tests/qemu-iotests/tests/qcow2-discard-during-rewrite b/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
index 7f0d8a107a..2e2e0d2cb0 100755
--- a/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
+++ b/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
@@ -1,5 +1,5 @@ 
 #!/usr/bin/env bash
-# group: quick disabled
+# group: quick
 #
 # Test discarding (and reusing) host cluster during writing data to it.
 #