Message ID | 6ee75696d3eaed56b46e91fe242fdfab51feb066.1593342067.git.berto@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add subcluster allocation to qcow2 |
On 28.06.20 13:02, Alberto Garcia wrote:
> This patch adds QCow2SubclusterType, which is the subcluster-level
> version of QCow2ClusterType. All QCOW2_SUBCLUSTER_* values have the
> same meaning as their QCOW2_CLUSTER_* equivalents (when they
> exist). See below for details and caveats.
>
> In images without extended L2 entries clusters are treated as having
> exactly one subcluster so it is possible to replace one data type with
> the other while keeping the exact same semantics.
>
> With extended L2 entries there are new possible values, and every
> subcluster in the same cluster can obviously have a different
> QCow2SubclusterType so functions need to be adapted to work on the
> subcluster level.
>
> There are several things that have to be taken into account:
>
>   a) QCOW2_SUBCLUSTER_COMPRESSED means that the whole cluster is
>      compressed. We do not support compression at the subcluster
>      level.
>
>   b) There are two different values for unallocated subclusters:
>      QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN which means that the whole
>      cluster is unallocated, and QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
>      which means that the cluster is allocated but the subcluster is
>      not. The latter can only happen in images with extended L2
>      entries.
>
>   c) QCOW2_SUBCLUSTER_INVALID is used to detect the cases where an L2
>      entry has a value that violates the specification. The caller is
>      responsible for handling these situations.
>
>      To prevent compatibility problems with images that have invalid
>      values but are currently being read by QEMU without causing side
>      effects, QCOW2_SUBCLUSTER_INVALID is only returned for images
>      with extended L2 entries.
>
> qcow2_cluster_to_subcluster_type() is added as a separate function
> from qcow2_get_subcluster_type(), but this is only temporary and both
> will be merged in a subsequent patch.
>
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h | 126 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 125 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 82b86f6cec..3aec6f452a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h

[...]

> @@ -634,9 +686,11 @@ static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
>  static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
>                                                        uint64_t l2_entry)
>  {
> +    BDRVQcow2State *s = bs->opaque;
> +
>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>          return QCOW2_CLUSTER_COMPRESSED;
> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {

OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
or unallocated clusters when there are subclusters.  Seems weird to me,
because zero clusters are invalid clusters then.

I preferred just reporting them as zero clusters and letting the caller
deal with it, because it does mean an error in the image and so it
should be reported.

So...

>          if (l2_entry & L2E_OFFSET_MASK) {
>              return QCOW2_CLUSTER_ZERO_ALLOC;
>          }

[...]

> +/*
> + * In an image without subsclusters @l2_bitmap is ignored and
> + * @sc_index must be 0.
> + * Return QCOW2_SUBCLUSTER_INVALID if an invalid l2 entry is detected
> + * (this checks the whole entry and bitmap, not only the bits related
> + * to subcluster @sc_index).
> + */
> +static inline
> +QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
> +                                              uint64_t l2_entry,
> +                                              uint64_t l2_bitmap,
> +                                              unsigned sc_index)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry);
> +    assert(sc_index < s->subclusters_per_cluster);
> +
> +    if (has_subclusters(s)) {
> +        switch (type) {
> +        case QCOW2_CLUSTER_COMPRESSED:
> +            return QCOW2_SUBCLUSTER_COMPRESSED;
> +        case QCOW2_CLUSTER_NORMAL:
> +            if ((l2_bitmap >> 32) & l2_bitmap) {
> +                return QCOW2_SUBCLUSTER_INVALID;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
> +                return QCOW2_SUBCLUSTER_ZERO_ALLOC;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ALLOC(sc_index)) {
> +                return QCOW2_SUBCLUSTER_NORMAL;
> +            } else {
> +                return QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC;
> +            }
> +        case QCOW2_CLUSTER_UNALLOCATED:
> +            if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) {
> +                return QCOW2_SUBCLUSTER_INVALID;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
> +                return QCOW2_SUBCLUSTER_ZERO_PLAIN;
> +            } else {
> +                return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
> +            }

...consequentially, this function no longer reports clusters which have
the zero flag set as invalid (which it did in v4, when I last looked at
it).

I see this was a conscious choice of yours in v5.  I don’t really see
the justification for it, though.  As far as I can understand, you seem
to argue that no corruption detection is better than incomplete
corruption detection, and that a complete and thorough detection would
warrant its own series, do I understand that correctly?

Max

> +        default:
> +            g_assert_not_reached();
> +        }
> +    } else {
> +        return qcow2_cluster_to_subcluster_type(type);
> +    }
> +}
> +
>  /* Check whether refcounts are eager or lazy */
>  static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
>  {
>
On Wed 01 Jul 2020 02:52:14 PM CEST, Max Reitz wrote:
>>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>>          return QCOW2_CLUSTER_COMPRESSED;
>> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
>> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
>
> OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
> or unallocated clusters when there are subclusters.  Seems weird to
> me, because zero clusters are invalid clusters then.

I'm actually hesitant about this.

In extended L2 entries QCOW_OFLAG_ZERO does not have any meaning so
technically it doesn't need to be checked any more than the other
reserved bits (1 to 8).

The reason why we would want to check it is, of course, because that bit
does have a meaning in regular L2 entries.

But that bit is ignored in images with subclusters so the only reason
why we would check it is to report corruption, not because we need to
know its value.

It's true that we do check it in v2 images, although in that case the
entries are otherwise identical and there is a way to convert between
both types.

> I preferred just reporting them as zero clusters and letting the
> caller deal with it, because it does mean an error in the image and so
> it should be reported.

Another alternative would be to add QCOW2_CLUSTER_INVALID and we could
even include there other cases like unaligned offsets and things like
that. But that would also affect the code that repairs corrupted images.

Berto
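The behaviour change debated above can be tried out in isolation. The following is only a sketch, not the QEMU function: the image state is reduced to a single boolean, the enum and function names are hypothetical, the three flag values are assumptions that do not appear in this thread (check block/qcow2.h), and the part of qcow2_get_cluster_type() that the quote elides is approximated by "offset present means normal, otherwise unallocated".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, not shown in this thread; verify against block/qcow2.h. */
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)
#define QCOW_OFLAG_ZERO       (1ULL << 0)
#define L2E_OFFSET_MASK       0x00fffffffffffe00ULL

/* Hypothetical stand-in for QCow2ClusterType. */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_ZERO_PLAIN,
    CLUSTER_ZERO_ALLOC,
    CLUSTER_NORMAL,
    CLUSTER_COMPRESSED,
} ClusterType;

/*
 * Simplified classification. With subclusters the zero flag is skipped,
 * so the entry falls through to the offset check.
 */
static inline ClusterType classify_cluster(uint64_t l2_entry,
                                           bool has_subclusters)
{
    if (l2_entry & QCOW_OFLAG_COMPRESSED) {
        return CLUSTER_COMPRESSED;
    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters) {
        return (l2_entry & L2E_OFFSET_MASK) ? CLUSTER_ZERO_ALLOC
                                            : CLUSTER_ZERO_PLAIN;
    } else if (l2_entry & L2E_OFFSET_MASK) {
        /* Assumption: stands in for the elided tail of the real function. */
        return CLUSTER_NORMAL;
    } else {
        return CLUSTER_UNALLOCATED;
    }
}

/* The same entry is classified differently depending on the image type. */
static inline void zero_flag_examples(void)
{
    uint64_t with_offset = 0x10000ULL | QCOW_OFLAG_ZERO;

    assert(classify_cluster(with_offset, false) == CLUSTER_ZERO_ALLOC);
    assert(classify_cluster(with_offset, true) == CLUSTER_NORMAL);
    assert(classify_cluster(QCOW_OFLAG_ZERO, false) == CLUSTER_ZERO_PLAIN);
    assert(classify_cluster(QCOW_OFLAG_ZERO, true) == CLUSTER_UNALLOCATED);
}
```

Under these assumptions, an entry with the zero flag set is reported as normal or unallocated once has_subclusters is true, which is the effect Max describes.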
On 01.07.20 18:26, Alberto Garcia wrote:
> On Wed 01 Jul 2020 02:52:14 PM CEST, Max Reitz wrote:
>>>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>>>          return QCOW2_CLUSTER_COMPRESSED;
>>> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
>>> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
>>
>> OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
>> or unallocated clusters when there are subclusters.  Seems weird to
>> me, because zero clusters are invalid clusters then.
>
> I'm actually hesitant about this.
>
> In extended L2 entries QCOW_OFLAG_ZERO does not have any meaning so
> technically it doesn't need to be checked any more than the other
> reserved bits (1 to 8).

Good point.  That convinces me.

> The reason why we would want to check it is, of course, because that bit
> does have a meaning in regular L2 entries.
>
> But that bit is ignored in images with subclusters so the only reason
> why we would check it is to report corruption, not because we need to
> know its value.

Sure.  But isn’t that the whole point of having
QCOW2_SUBCLUSTER_INVALID in the first place?

> It's true that we do check it in v2 images, although in that case the
> entries are otherwise identical and there is a way to convert between
> both types.
>
>> I preferred just reporting them as zero clusters and letting the
>> caller deal with it, because it does mean an error in the image and so
>> it should be reported.
>
> Another alternative would be to add QCOW2_CLUSTER_INVALID and we could
> even include there other cases like unaligned offsets and things like
> that. But that would also affect the code that repairs corrupted images.

Interesting.  Well, and that’d be definitely too much for this series,
as you already said.

So:

Reviewed-by: Max Reitz <mreitz@redhat.com>
On Thu 02 Jul 2020 11:57:46 AM CEST, Max Reitz wrote:
>> The reason why we would want to check it is, of course, because that
>> bit does have a meaning in regular L2 entries.
>>
>> But that bit is ignored in images with subclusters so the only reason
>> why we would check it is to report corruption, not because we need to
>> know its value.
>
> Sure.  But isn’t that the whole point of having
> QCOW2_SUBCLUSTER_INVALID in the first place?

At the moment we're only returning QCOW2_SUBCLUSTER_INVALID in cases
where there is no way to interpret the entry correctly: a) the
allocation and zero bits are set for the same subcluster, and b) the
allocation bit is set but the entry has no valid offset.

It doesn't mean that we cannot use _SUBCLUSTER_INVALID for cases like
the one we're discussing, but this one is different from the other two.

Berto
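Cases a) and b) correspond to the two bitmap checks in the quoted qcow2_get_subcluster_type(). A minimal standalone sketch follows, assuming the bitmap layout from the patch (allocation bits 0-31, zero bits 32-63). The short macro names, the helper names and the has_offset parameter are hypothetical; has_offset stands in for "the cluster type is NORMAL rather than UNALLOCATED".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bitmap layout as in the patch; shortened, hypothetical macro names. */
#define SUB_ALLOC(x)   (1ULL << (x))
#define SUB_ZERO(x)    (SUB_ALLOC(x) << 32)
#define ALL_ALLOC_BITS 0xffffffffULL

/* Case a): some subcluster has both its allocation bit and its zero bit. */
static inline bool alloc_and_zero_overlap(uint64_t l2_bitmap)
{
    /* Shifting moves the zero bits onto the allocation bits. */
    return ((l2_bitmap >> 32) & l2_bitmap) != 0;
}

/* Case b): an allocation bit is set but the entry has no host offset. */
static inline bool alloc_without_offset(uint64_t l2_bitmap, bool has_offset)
{
    return !has_offset && (l2_bitmap & ALL_ALLOC_BITS) != 0;
}

/* Combined check, following the branch structure of the quoted code. */
static inline bool l2_entry_is_invalid(uint64_t l2_bitmap, bool has_offset)
{
    return has_offset ? alloc_and_zero_overlap(l2_bitmap)
                      : alloc_without_offset(l2_bitmap, has_offset);
}

static inline void invalid_entry_examples(void)
{
    /* a) subcluster 3 is both allocated and zero */
    assert(l2_entry_is_invalid(SUB_ALLOC(3) | SUB_ZERO(3), true));
    /* different subclusters, so no conflict */
    assert(!l2_entry_is_invalid(SUB_ALLOC(3) | SUB_ZERO(4), true));
    /* b) allocated subcluster in a cluster without an offset */
    assert(l2_entry_is_invalid(SUB_ALLOC(0), false));
    /* zero bits alone are fine without an offset */
    assert(!l2_entry_is_invalid(SUB_ZERO(0), false));
}
```

Both checks look at the whole bitmap, not only at the bits of one subcluster, which matches the comment above qcow2_get_subcluster_type() in the patch.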
On 03.07.20 00:00, Alberto Garcia wrote:
> On Thu 02 Jul 2020 11:57:46 AM CEST, Max Reitz wrote:
>>> The reason why we would want to check it is, of course, because that
>>> bit does have a meaning in regular L2 entries.
>>>
>>> But that bit is ignored in images with subclusters so the only reason
>>> why we would check it is to report corruption, not because we need to
>>> know its value.
>>
>> Sure.  But isn’t that the whole point of having
>> QCOW2_SUBCLUSTER_INVALID in the first place?
>
> At the moment we're only returning QCOW2_SUBCLUSTER_INVALID in cases
> where there is no way to interpret the entry correctly: a) the
> allocation and zero bits are set for the same subcluster, and b) the
> allocation bit is set but the entry has no valid offset.
>
> It doesn't mean that we cannot use _SUBCLUSTER_INVALID for cases like
> the one we're discussing, but this one is different from the other two.

OK, that makes sense.

Max
diff --git a/block/qcow2.h b/block/qcow2.h
index 82b86f6cec..3aec6f452a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -80,6 +80,21 @@
 
 #define QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER 32
 
+/* The subcluster X [0..31] is allocated */
+#define QCOW_OFLAG_SUB_ALLOC(X)   (1ULL << (X))
+/* The subcluster X [0..31] reads as zeroes */
+#define QCOW_OFLAG_SUB_ZERO(X)    (QCOW_OFLAG_SUB_ALLOC(X) << 32)
+/* Subclusters [X, Y) (0 <= X <= Y <= 32) are allocated */
+#define QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) \
+    (QCOW_OFLAG_SUB_ALLOC(Y) - QCOW_OFLAG_SUB_ALLOC(X))
+/* Subclusters [X, Y) (0 <= X <= Y <= 32) read as zeroes */
+#define QCOW_OFLAG_SUB_ZERO_RANGE(X, Y) \
+    (QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) << 32)
+/* L2 entry bitmap with all allocation bits set */
+#define QCOW_L2_BITMAP_ALL_ALLOC  (QCOW_OFLAG_SUB_ALLOC_RANGE(0, 32))
+/* L2 entry bitmap with all "read as zeroes" bits set */
+#define QCOW_L2_BITMAP_ALL_ZEROES (QCOW_OFLAG_SUB_ZERO_RANGE(0, 32))
+
 /* Size of normal and extended L2 entries */
 #define L2E_SIZE_NORMAL   (sizeof(uint64_t))
 #define L2E_SIZE_EXTENDED (sizeof(uint64_t) * 2)
@@ -462,6 +477,33 @@ typedef struct QCowL2Meta
     QLIST_ENTRY(QCowL2Meta) next_in_flight;
 } QCowL2Meta;
 
+/*
+ * In images with standard L2 entries all clusters are treated as if
+ * they had one subcluster so QCow2ClusterType and QCow2SubclusterType
+ * can be mapped to each other and have the exact same meaning
+ * (QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC cannot happen in these images).
+ *
+ * In images with extended L2 entries QCow2ClusterType refers to the
+ * complete cluster and QCow2SubclusterType to each of the individual
+ * subclusters, so there are several possible combinations:
+ *
+ *     |--------------+---------------------------|
+ *     | Cluster type | Possible subcluster types |
+ *     |--------------+---------------------------|
+ *     | UNALLOCATED  |         UNALLOCATED_PLAIN |
+ *     |              |                ZERO_PLAIN |
+ *     |--------------+---------------------------|
+ *     | NORMAL       |         UNALLOCATED_ALLOC |
+ *     |              |                ZERO_ALLOC |
+ *     |              |                    NORMAL |
+ *     |--------------+---------------------------|
+ *     | COMPRESSED   |                COMPRESSED |
+ *     |--------------+---------------------------|
+ *
+ * QCOW2_SUBCLUSTER_INVALID means that the L2 entry is incorrect and
+ * the image should be marked corrupt.
+ */
+
 typedef enum QCow2ClusterType {
     QCOW2_CLUSTER_UNALLOCATED,
     QCOW2_CLUSTER_ZERO_PLAIN,
@@ -470,6 +512,16 @@ typedef enum QCow2ClusterType {
     QCOW2_CLUSTER_COMPRESSED,
 } QCow2ClusterType;
 
+typedef enum QCow2SubclusterType {
+    QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN,
+    QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC,
+    QCOW2_SUBCLUSTER_ZERO_PLAIN,
+    QCOW2_SUBCLUSTER_ZERO_ALLOC,
+    QCOW2_SUBCLUSTER_NORMAL,
+    QCOW2_SUBCLUSTER_COMPRESSED,
+    QCOW2_SUBCLUSTER_INVALID,
+} QCow2SubclusterType;
+
 typedef enum QCow2MetadataOverlap {
     QCOW2_OL_MAIN_HEADER_BITNR      = 0,
     QCOW2_OL_ACTIVE_L1_BITNR        = 1,
@@ -634,9 +686,11 @@ static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
 static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
                                                       uint64_t l2_entry)
 {
+    BDRVQcow2State *s = bs->opaque;
+
     if (l2_entry & QCOW_OFLAG_COMPRESSED) {
         return QCOW2_CLUSTER_COMPRESSED;
-    } else if (l2_entry & QCOW_OFLAG_ZERO) {
+    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
         if (l2_entry & L2E_OFFSET_MASK) {
             return QCOW2_CLUSTER_ZERO_ALLOC;
         }
@@ -656,6 +710,76 @@ static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
     }
 }
 
+/*
+ * For an image without extended L2 entries, return the
+ * QCow2SubclusterType equivalent of a given QCow2ClusterType.
+ */
+static inline
+QCow2SubclusterType qcow2_cluster_to_subcluster_type(QCow2ClusterType type)
+{
+    switch (type) {
+    case QCOW2_CLUSTER_COMPRESSED:
+        return QCOW2_SUBCLUSTER_COMPRESSED;
+    case QCOW2_CLUSTER_ZERO_PLAIN:
+        return QCOW2_SUBCLUSTER_ZERO_PLAIN;
+    case QCOW2_CLUSTER_ZERO_ALLOC:
+        return QCOW2_SUBCLUSTER_ZERO_ALLOC;
+    case QCOW2_CLUSTER_NORMAL:
+        return QCOW2_SUBCLUSTER_NORMAL;
+    case QCOW2_CLUSTER_UNALLOCATED:
+        return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/*
+ * In an image without subsclusters @l2_bitmap is ignored and
+ * @sc_index must be 0.
+ * Return QCOW2_SUBCLUSTER_INVALID if an invalid l2 entry is detected
+ * (this checks the whole entry and bitmap, not only the bits related
+ * to subcluster @sc_index).
+ */
+static inline
+QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
+                                              uint64_t l2_entry,
+                                              uint64_t l2_bitmap,
+                                              unsigned sc_index)
+{
+    BDRVQcow2State *s = bs->opaque;
+    QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry);
+    assert(sc_index < s->subclusters_per_cluster);
+
+    if (has_subclusters(s)) {
+        switch (type) {
+        case QCOW2_CLUSTER_COMPRESSED:
+            return QCOW2_SUBCLUSTER_COMPRESSED;
+        case QCOW2_CLUSTER_NORMAL:
+            if ((l2_bitmap >> 32) & l2_bitmap) {
+                return QCOW2_SUBCLUSTER_INVALID;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
+                return QCOW2_SUBCLUSTER_ZERO_ALLOC;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ALLOC(sc_index)) {
+                return QCOW2_SUBCLUSTER_NORMAL;
+            } else {
+                return QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC;
+            }
+        case QCOW2_CLUSTER_UNALLOCATED:
+            if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) {
+                return QCOW2_SUBCLUSTER_INVALID;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
+                return QCOW2_SUBCLUSTER_ZERO_PLAIN;
+            } else {
+                return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
+            }
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        return qcow2_cluster_to_subcluster_type(type);
+    }
+}
+
 /* Check whether refcounts are eager or lazy */
 static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
 {
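
The range macros added in the first hunk build masks by subtracting two single-bit values, which is easiest to see with concrete numbers. The sketch below copies the four macros from the patch. The ScType enum, example_bitmap() and sc_type_in_normal_cluster() are hypothetical helpers that mirror only the QCOW2_CLUSTER_NORMAL branch of qcow2_get_subcluster_type(), with the hard-coded 32 taken from QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER.

```c
#include <assert.h>
#include <stdint.h>

/* Macros copied from the patch. */
#define QCOW_OFLAG_SUB_ALLOC(X)   (1ULL << (X))
#define QCOW_OFLAG_SUB_ZERO(X)    (QCOW_OFLAG_SUB_ALLOC(X) << 32)
#define QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) \
    (QCOW_OFLAG_SUB_ALLOC(Y) - QCOW_OFLAG_SUB_ALLOC(X))
#define QCOW_OFLAG_SUB_ZERO_RANGE(X, Y) \
    (QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) << 32)

/* Hypothetical, trimmed-down result type for this sketch. */
typedef enum {
    SC_UNALLOCATED_ALLOC,
    SC_ZERO_ALLOC,
    SC_NORMAL,
    SC_INVALID,
} ScType;

/*
 * Example bitmap: subclusters 0-3 hold data, 4-7 read as zeroes and
 * 8-31 are unallocated. (1 << 4) - (1 << 0) yields the low mask 0xf.
 */
static inline uint64_t example_bitmap(void)
{
    return QCOW_OFLAG_SUB_ALLOC_RANGE(0, 4) | QCOW_OFLAG_SUB_ZERO_RANGE(4, 8);
}

/* Per-subcluster lookup for a cluster that has a host offset. */
static inline ScType sc_type_in_normal_cluster(uint64_t l2_bitmap,
                                               unsigned sc_index)
{
    assert(sc_index < 32);

    if ((l2_bitmap >> 32) & l2_bitmap) {
        return SC_INVALID;
    } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
        return SC_ZERO_ALLOC;
    } else if (l2_bitmap & QCOW_OFLAG_SUB_ALLOC(sc_index)) {
        return SC_NORMAL;
    } else {
        return SC_UNALLOCATED_ALLOC;
    }
}
```

With this bitmap, index 0 maps to SC_NORMAL, index 5 to SC_ZERO_ALLOC and index 20 to SC_UNALLOCATED_ALLOC. The ranges are half-open, so QCOW_OFLAG_SUB_ALLOC_RANGE(X, X) is an empty mask.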