Message ID | 560f62fb-f6fb-42ae-2fd9-b10c9c478350@kamp.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 06/27/2017 09:49 AM, Peter Lieven wrote: >> > > Before I continue, can you please give feedback on the following spec > change: > > diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt > index 80cdfd0..f1428e9 100644 > --- a/docs/interop/qcow2.txt > +++ b/docs/interop/qcow2.txt > @@ -85,7 +85,11 @@ in the description of a field. > be written to (unless for regaining > consistency). > > - Bits 2-63: Reserved (set to 0) > + Bit 2: Compression format bit. Iff this bit I know what this means, but spelling it "If and only if" or "When" might make more sense to other readers, as "Iff" is not common in English. > is set then > + the compression format extension MUST > be present > + and MUST be parsed and checked for > compatibility. > + > + Bits 3-63: Reserved (set to 0) > > 80 - 87: compatible_features > Bitmask of compatible features. An implementation can > @@ -135,6 +139,7 @@ be stored. Each extension has a structure like the > following: > 0xE2792ACA - Backing file format name > 0x6803f857 - Feature name table > 0x23852875 - Bitmaps extension > + 0xC0318300 - Compression format extension Now that you aren't burning 256 magic numbers, it may make sense to have the last two hex digits be non-zero. > +== Compression format extension == > + > +The compression format extension is an optional header extension. It > provides Inline pasting created interesting wrapping, but the actual patch will obviously read better. > +the ability to specify the compression algorithm and compression > parameters > +that are used for compressed clusters. This new header MUST be present if > +the incompatible-feature bit "compression format bit" is set and MUST > be absent > +otherwise. > + > +The fields of the compression format extension are: > + > + Byte 0 - 15: compression_format_name (padded with zeros, but not > + necessarily null terminated if it has full length) Do we really want arbitrary names of formats, or do we want to specify specific algorithms (gzip, lzo, zstd) as an enum? Which way gives us maximum likelihood of interoperability? > + > + 16: compression_level (uint8_t) > + 0 = default compression level > + 1 = lowest compression level > + x = highest compression level (the highest compression > + level may vary for different compression formats) > + > + 17 - 23: Reserved for future use, must be zero. Feels pretty limited - you don't have a length field for variable-length extension of additional parameters, but have to fit all additions in the next 8 bytes. Yes, all extension headers are already paired with a length parameter outside of the struct, sent alongside the header magic number, but embedding a length directly in the header (while redundant) makes it easier to keep information local to the header. See extra_data_size under Bitmap directory, for example. Of course, we may turn those 8 bytes INTO a length field, that then describe the rest of the variable length parameters, but why not do it up front? If we go with an enum mapping of supported compression formats, then you can go into further details on exactly what extra parameters are supports for each algorithm; while leaving it as a free-form text string makes it harder to interpret what any additional payload will represent.
Am 27.06.2017 um 17:04 schrieb Eric Blake: > On 06/27/2017 09:49 AM, Peter Lieven wrote: > >> Before I continue, can you please give feedback on the following spec >> change: >> >> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt >> index 80cdfd0..f1428e9 100644 >> --- a/docs/interop/qcow2.txt >> +++ b/docs/interop/qcow2.txt >> @@ -85,7 +85,11 @@ in the description of a field. >> be written to (unless for regaining >> consistency). >> >> - Bits 2-63: Reserved (set to 0) >> + Bit 2: Compression format bit. Iff this bit > I know what this means, but spelling it "If and only if" or "When" might > make more sense to other readers, as "Iff" is not common in English. > >> is set then >> + the compression format extension MUST >> be present >> + and MUST be parsed and checked for >> compatibility. >> + >> + Bits 3-63: Reserved (set to 0) >> >> 80 - 87: compatible_features >> Bitmask of compatible features. An implementation can >> @@ -135,6 +139,7 @@ be stored. Each extension has a structure like the >> following: >> 0xE2792ACA - Backing file format name >> 0x6803f857 - Feature name table >> 0x23852875 - Bitmaps extension >> + 0xC0318300 - Compression format extension > Now that you aren't burning 256 magic numbers, it may make sense to have > the last two hex digits be non-zero. > > >> +== Compression format extension == >> + >> +The compression format extension is an optional header extension. It >> provides > Inline pasting created interesting wrapping, but the actual patch will > obviously read better. > >> +the ability to specify the compression algorithm and compression >> parameters >> +that are used for compressed clusters. This new header MUST be present if >> +the incompatible-feature bit "compression format bit" is set and MUST >> be absent >> +otherwise. >> + >> +The fields of the compression format extension are: >> + >> + Byte 0 - 15: compression_format_name (padded with zeros, but not >> + necessarily null terminated if it has full length) > Do we really want arbitrary names of formats, or do we want to specify > specific algorithms (gzip, lzo, zstd) as an enum? Which way gives us > maximum likelihood of interoperability? At some point I have to parse the name and convert it into a number. This could be the same routine for the parsing of the compress.format parameter and the compression_format_name in the header field. The idea was that an accidently change in the enum cannot break anything as it is local to the qemu binary. But I am ok to have an enum if the values for each format a fixed. In this case it might be sufficient to have an uint8_t for the compression format + an uint8_t for the compression level. Then 2 bytes reserved + an uint32_t for the extra data size. So currently (the extra data is not necessary at the moment for any format) the header would be limited to 8 bytes. Anyway if we glue this into a QAPI scheme, I would appreciate to get some help with it as I am not that familiar with it yet. Thanks, Peter
Am 27.06.2017 um 17:11 schrieb Peter Lieven: > Am 27.06.2017 um 17:04 schrieb Eric Blake: >> On 06/27/2017 09:49 AM, Peter Lieven wrote: >> >>> Before I continue, can you please give feedback on the following spec >>> change: >>> >>> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt >>> index 80cdfd0..f1428e9 100644 >>> --- a/docs/interop/qcow2.txt >>> +++ b/docs/interop/qcow2.txt >>> @@ -85,7 +85,11 @@ in the description of a field. >>> be written to (unless for regaining >>> consistency). >>> >>> - Bits 2-63: Reserved (set to 0) >>> + Bit 2: Compression format bit. Iff this bit >> I know what this means, but spelling it "If and only if" or "When" might >> make more sense to other readers, as "Iff" is not common in English. >> >>> is set then >>> + the compression format extension MUST >>> be present >>> + and MUST be parsed and checked for >>> compatibility. >>> + >>> + Bits 3-63: Reserved (set to 0) >>> >>> 80 - 87: compatible_features >>> Bitmask of compatible features. An implementation can >>> @@ -135,6 +139,7 @@ be stored. Each extension has a structure like the >>> following: >>> 0xE2792ACA - Backing file format name >>> 0x6803f857 - Feature name table >>> 0x23852875 - Bitmaps extension >>> + 0xC0318300 - Compression format extension >> Now that you aren't burning 256 magic numbers, it may make sense to have >> the last two hex digits be non-zero. >> >> >>> +== Compression format extension == >>> + >>> +The compression format extension is an optional header extension. It >>> provides >> Inline pasting created interesting wrapping, but the actual patch will >> obviously read better. >> >>> +the ability to specify the compression algorithm and compression >>> parameters >>> +that are used for compressed clusters. This new header MUST be present if >>> +the incompatible-feature bit "compression format bit" is set and MUST >>> be absent >>> +otherwise. >>> + >>> +The fields of the compression format extension are: >>> + >>> + Byte 0 - 15: compression_format_name (padded with zeros, but not >>> + necessarily null terminated if it has full length) >> Do we really want arbitrary names of formats, or do we want to specify >> specific algorithms (gzip, lzo, zstd) as an enum? Which way gives us >> maximum likelihood of interoperability? > > At some point I have to parse the name and convert it into a number. > This could be the same routine for the parsing of the compress.format > parameter and the compression_format_name in the header field. > The idea was that an accidently change in the enum cannot break > anything as it is local to the qemu binary. But I am ok to have an enum > if the values for each format a fixed. In this case it might be sufficient > to have an uint8_t for the compression format + an uint8_t for the compression > level. Then 2 bytes reserved + an uint32_t for the extra data size. So currently (the extra > data is not necessary at the moment for any format) the header would > be limited to 8 bytes. > > Anyway if we glue this into a QAPI scheme, I would appreciate to get > some help with it as I am not that familiar with it yet. Forgot to mention, the idea to store the name here was to be able to display an appropiate error message if a format is not supported (like compression format xyz is not supported). Peter
Am 27.06.2017 um 17:04 schrieb Eric Blake: > On 06/27/2017 09:49 AM, Peter Lieven wrote: > >> Before I continue, can you please give feedback on the following spec >> change: >> >> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt >> index 80cdfd0..f1428e9 100644 >> --- a/docs/interop/qcow2.txt >> +++ b/docs/interop/qcow2.txt >> @@ -85,7 +85,11 @@ in the description of a field. >> be written to (unless for regaining >> consistency). >> >> - Bits 2-63: Reserved (set to 0) >> + Bit 2: Compression format bit. Iff this bit > I know what this means, but spelling it "If and only if" or "When" might > make more sense to other readers, as "Iff" is not common in English. > >> is set then >> + the compression format extension MUST >> be present >> + and MUST be parsed and checked for >> compatibility. >> + >> + Bits 3-63: Reserved (set to 0) >> >> 80 - 87: compatible_features >> Bitmask of compatible features. An implementation can >> @@ -135,6 +139,7 @@ be stored. Each extension has a structure like the >> following: >> 0xE2792ACA - Backing file format name >> 0x6803f857 - Feature name table >> 0x23852875 - Bitmaps extension >> + 0xC0318300 - Compression format extension > Now that you aren't burning 256 magic numbers, it may make sense to have > the last two hex digits be non-zero. > > >> +== Compression format extension == >> + >> +The compression format extension is an optional header extension. It >> provides > Inline pasting created interesting wrapping, but the actual patch will > obviously read better. > >> +the ability to specify the compression algorithm and compression >> parameters >> +that are used for compressed clusters. This new header MUST be present if >> +the incompatible-feature bit "compression format bit" is set and MUST >> be absent >> +otherwise. >> + >> +The fields of the compression format extension are: >> + >> + Byte 0 - 15: compression_format_name (padded with zeros, but not >> + necessarily null terminated if it has full length) > Do we really want arbitrary names of formats, or do we want to specify > specific algorithms (gzip, lzo, zstd) as an enum? Which way gives us > maximum likelihood of interoperability? > >> + >> + 16: compression_level (uint8_t) >> + 0 = default compression level >> + 1 = lowest compression level >> + x = highest compression level (the highest compression >> + level may vary for different compression formats) >> + >> + 17 - 23: Reserved for future use, must be zero. > Feels pretty limited - you don't have a length field for variable-length > extension of additional parameters, but have to fit all additions in the > next 8 bytes. Yes, all extension headers are already paired with a > length parameter outside of the struct, sent alongside the header magic > number, but embedding a length directly in the header (while redundant) > makes it easier to keep information local to the header. See > extra_data_size under Bitmap directory, for example. Of course, we may > turn those 8 bytes INTO a length field, that then describe the rest of > the variable length parameters, but why not do it up front? > > If we go with an enum mapping of supported compression formats, then you > can go into further details on exactly what extra parameters are > supports for each algorithm; while leaving it as a free-form text string > makes it harder to interpret what any additional payload will represent. > I send a V2 of the series including the update of the spec last week. Maybe you can have a look if this version is better. Thanks, Peter
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index 80cdfd0..f1428e9 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -85,7 +85,11 @@ in the description of a field. be written to (unless for regaining consistency). - Bits 2-63: Reserved (set to 0) + Bit 2: Compression format bit. Iff this bit is set then + the compression format extension MUST be present + and MUST be parsed and checked for compatibility. + + Bits 3-63: Reserved (set to 0) 80 - 87: compatible_features Bitmask of compatible features. An implementation can @@ -135,6 +139,7 @@ be stored. Each extension has a structure like the following: 0xE2792ACA - Backing file format name 0x6803f857 - Feature name table 0x23852875 - Bitmaps extension + 0xC0318300 - Compression format extension other - Unknown header extension, can be safely ignored @@ -208,6 +213,28 @@ The fields of the bitmaps extension are: starts. Must be aligned to a cluster boundary. +== Compression format extension == + +The compression format extension is an optional header extension. It provides +the ability to specify the compression algorithm and compression parameters +that are used for compressed clusters. This new header MUST be present if +the incompatible-feature bit "compression format bit" is set and MUST be absent +otherwise. + +The fields of the compression format extension are: + + Byte 0 - 15: compression_format_name (padded with zeros, but not + necessarily null terminated if it has full length) + + 16: compression_level (uint8_t) + 0 = default compression level + 1 = lowest compression level + x = highest compression level (the highest compression + level may vary for different compression formats) + + 17 - 23: Reserved for future use, must be zero. + + == Host cluster management == qcow2 manages the allocation of host clusters by maintaining a reference count