Message ID | 20201008185735.29875-2-paul@xen.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | domain context infrastructure | expand |
On 08.10.2020 20:57, Paul Durrant wrote: > --- /dev/null > +++ b/xen/include/public/save.h > @@ -0,0 +1,66 @@ > +/* > + * save.h > + * > + * Structure definitions for common PV/HVM domain state that is held by Xen. > + * > + * Copyright Amazon.com Inc. or its affiliates. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a copy > + * of this software and associated documentation files (the "Software"), to > + * deal in the Software without restriction, including without limitation the > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or > + * sell copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. > + */ > + > +#ifndef XEN_PUBLIC_SAVE_H > +#define XEN_PUBLIC_SAVE_H > + > +#if defined(__XEN__) || defined(__XEN_TOOLS__) > + > +#include "xen.h" > + > +/* > + * C structures for the Domain Context v1 format. > + * See docs/specs/domain-context.md > + */ > + > +struct domain_context_record { > + uint32_t type; > + uint32_t instance; > + uint64_t length; Should this be uint64_aligned_t, such that alignof() will produce consistent values regardless of bitness of the invoking domain? Jan
On 08/10/2020 19:57, Paul Durrant wrote: > diff --git a/xen/include/public/save.h b/xen/include/public/save.h > new file mode 100644 > index 0000000000..c4be9f570c > --- /dev/null > +++ b/xen/include/public/save.h > @@ -0,0 +1,66 @@ > +/* > + * save.h > + * > + * Structure definitions for common PV/HVM domain state that is held by Xen. What exactly is, and is not in scope, for this new stream? The PV above I think refers to "paravirtual state", not PV guests. > +#define _DOMAIN_CONTEXT_RECORD_ALIGN 3 > +#define DOMAIN_CONTEXT_RECORD_ALIGN (1U << _DOMAIN_CONTEXT_RECORD_ALIGN) Do we need the logarithm version? > + > +enum { > + DOMAIN_CONTEXT_END, > + DOMAIN_CONTEXT_START, > + /* New types go here */ > + DOMAIN_CONTEXT_NR_TYPES > +}; Does this enum ever end up in an API? We might be ok as we're inside __XEN_TOOLS__, but enums normally cannot be used in ABI's because their size is implementation defined, and not always 4 bytes. ~Andrew
On 19/10/2020 14:46, Jan Beulich wrote: > On 08.10.2020 20:57, Paul Durrant wrote: >> --- /dev/null >> +++ b/xen/include/public/save.h >> @@ -0,0 +1,66 @@ >> +/* >> + * save.h >> + * >> + * Structure definitions for common PV/HVM domain state that is held by Xen. >> + * >> + * Copyright Amazon.com Inc. or its affiliates. >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a copy >> + * of this software and associated documentation files (the "Software"), to >> + * deal in the Software without restriction, including without limitation the >> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or >> + * sell copies of the Software, and to permit persons to whom the Software is >> + * furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE >> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> + * DEALINGS IN THE SOFTWARE. >> + */ >> + >> +#ifndef XEN_PUBLIC_SAVE_H >> +#define XEN_PUBLIC_SAVE_H >> + >> +#if defined(__XEN__) || defined(__XEN_TOOLS__) >> + >> +#include "xen.h" >> + >> +/* >> + * C structures for the Domain Context v1 format. >> + * See docs/specs/domain-context.md >> + */ >> + >> +struct domain_context_record { >> + uint32_t type; >> + uint32_t instance; >> + uint64_t length; > Should this be uint64_aligned_t, such that alignof() will > produce consistent values regardless of bitness of the invoking > domain? Does it matter? Its just a bitstream, and can appear in the migration fd at any arbitrary alignment. What matters is that the structure is aligned appropriately for the bitness of code operating on these fields. Even with the tools ABI fixed to allow a 32-on-64-on-64 toolstack to function, I'm not sure that excess alignment would be appropriate. Sure - it would be more efficient for 32bit code to align to the 8 byte boundary for the benefit of a 64bit Xen's copy_from_user(), but this alignment happens anyway because of how hypercall buffers work. ~Andrew
On 25.01.2021 19:25, Andrew Cooper wrote: > On 19/10/2020 14:46, Jan Beulich wrote: >> On 08.10.2020 20:57, Paul Durrant wrote: >>> --- /dev/null >>> +++ b/xen/include/public/save.h >>> @@ -0,0 +1,66 @@ >>> +/* >>> + * save.h >>> + * >>> + * Structure definitions for common PV/HVM domain state that is held by Xen. >>> + * >>> + * Copyright Amazon.com Inc. or its affiliates. >>> + * >>> + * Permission is hereby granted, free of charge, to any person obtaining a copy >>> + * of this software and associated documentation files (the "Software"), to >>> + * deal in the Software without restriction, including without limitation the >>> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or >>> + * sell copies of the Software, and to permit persons to whom the Software is >>> + * furnished to do so, subject to the following conditions: >>> + * >>> + * The above copyright notice and this permission notice shall be included in >>> + * all copies or substantial portions of the Software. >>> + * >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE >>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >>> + * DEALINGS IN THE SOFTWARE. >>> + */ >>> + >>> +#ifndef XEN_PUBLIC_SAVE_H >>> +#define XEN_PUBLIC_SAVE_H >>> + >>> +#if defined(__XEN__) || defined(__XEN_TOOLS__) >>> + >>> +#include "xen.h" >>> + >>> +/* >>> + * C structures for the Domain Context v1 format. >>> + * See docs/specs/domain-context.md >>> + */ >>> + >>> +struct domain_context_record { >>> + uint32_t type; >>> + uint32_t instance; >>> + uint64_t length; >> Should this be uint64_aligned_t, such that alignof() will >> produce consistent values regardless of bitness of the invoking >> domain? > > Does it matter? Its just a bitstream, and can appear in the migration > fd at any arbitrary alignment. > > What matters is that the structure is aligned appropriately for the > bitness of code operating on these fields. > > Even with the tools ABI fixed to allow a 32-on-64-on-64 toolstack to > function, I'm not sure that excess alignment would be appropriate. Sure > - it would be more efficient for 32bit code to align to the 8 byte > boundary for the benefit of a 64bit Xen's copy_from_user(), but this > alignment happens anyway because of how hypercall buffers work. It _probably_ doesn't matter (hence me having put this as a question), but a struct in the ABI doesn't mandate how it is going to be used. I don't expect this to become the (or part of an) argument to a hypercall. I also don't expect anyone needing to use alignof() on it. But by not following the common model for the tools-only parts of the public interface we'd outright exclude such uses down the road. Jan
On 25.01.2021 19:18, Andrew Cooper wrote: >> +enum { >> + DOMAIN_CONTEXT_END, >> + DOMAIN_CONTEXT_START, >> + /* New types go here */ >> + DOMAIN_CONTEXT_NR_TYPES >> +}; > > Does this enum ever end up in an API? > > We might be ok as we're inside __XEN_TOOLS__, but enums normally cannot > be used in ABI's because their size is implementation defined, and not > always 4 bytes. The only way to use this as a type (e.g. to declare a struct field) would be via typeof(), which isn't something we'd allow in the public interface (for being an extension). Hence without a tag I would think an enum is fine to have here? Jan
diff --git a/docs/specs/domain-context.md b/docs/specs/domain-context.md new file mode 100644 index 0000000000..f177cf24b3 --- /dev/null +++ b/docs/specs/domain-context.md @@ -0,0 +1,137 @@ +# Domain Context (v1) + +## Background + +The design for *Non-Cooperative Migration of Guests*[1] explains that extra +information is required in the migration stream to allow a guest running PV +drivers to be migrated without its co-operation. This information includes +hypervisor state such as the set of event channels in operation and the +content of the grant table. + +There already exists in Xen a mechanism to save and restore *HVM context*[2]. +This specification describes a new framework that will replace that +mechanism and be suitable for transferring additional *PV state* records +conveying the information mentioned above. There is also on-going work to +implement *live update* of Xen where hypervisor state must be transferred in +memory from one incarnation to the next. The framework described is designed +to also be suitable for this purpose. + +## Image format + +The format will read from or written to the hypervisor in a single virtually +contiguous buffer segmented into **context records** specified in the following +sections. + +Fields within the records will always be aligned to their size. Padding and +reserved fields will always be set to zero when the context buffer is read +from the hypervisor and will be verified when written. +The endianness of numerical values will the native endianness of the +hypervisor. In the case of migration, that endianness is specified in the +*libxenctrl (libxc) Domain Image Format*[3]. + +### Record format + +All records have the following format: + +``` + 0 1 2 3 4 5 6 7 octet ++-------+-------+-------+-------+-------+-------+-------+-------+ +| type | instance | ++-------------------------------+-------------------------------+ +| length | ++---------------------------------------------------------------+ +| body +... +| | padding (0 to 7 octets) | ++-------+-------------------------------------------------------+ +``` + +\pagebreak +The fields are defined as follows: + + +| Field | Description | +|------------|--------------------------------------------------| +| `type` | A code which determines the layout and semantics | +| | of `body` | +| | | +| `instance` | The instance of the record | +| | | +| `length` | The length (in octets) of `body` | +| | | +| `body` | Zero or more octets of record data | +| | | +| `padding` | Zero to seven octets of zero-filled padding to | +| | bring the total record length up to the next | +| | 64-bit boundary | + +The `instance` field is present to distinguish multiple occurences of +a record. E.g. state that is per-vcpu may need to be described in multiple +records. + +The first record in the image is always a **START** record. The version of +the image format can be inferred from the `type` of this record. + +## Image content + +The following records are defined for the v1 image format. This set may be +extended in newer versions of the hypervisor. It is not expected that an image +saved on a newer version of Xen will need to be restored on an older version. +Therefore an image containing unrecognized record types should be rejected. + +### START + +``` + 0 1 2 3 4 5 6 7 octet ++-------+-------+-------+-------+-------+-------+-------+-------+ +| type == 1 | instance == 0 | ++-------------------------------+-------------------------------+ +| length == 8 | ++-------------------------------+-------------------------------+ +| xen_major | xen_minor | ++-------------------------------+-------------------------------+ +``` + +A type 1 **START** record implies a v1 image. If a new image format version +is needed in future then this can be indicated by a new type value for this +(first) record in the image. + +\pagebreak +The record body contains the following fields: + +| Field | Description | +|-------------|-------------------------------------------------| +| `xen_major` | The major version of Xen that created this | +| | image | +| | | +| `xen_minor` | The minor version of Xen that created this | +| | image | + +The version of Xen that created the image can be useful to the version that +is restoring the image to determine whether certain records are expected to +be present in the image. For example, a version of Xen prior to X.Y may not +generate a FOO record but Xen X.Y+ can infer its content. But Xen X.Y+1 +**must** generate the FOO record as, from that version onward, its content +can no longer be safely inferred. + +### END + +``` + 0 1 2 3 4 5 6 7 octet ++-------+-------+-------+-------+-------+-------+-------+-------+ +| type == 0 | instance == 0 | ++-------------------------------+-------------------------------+ +| length == 0 | ++---------------------------------------------------------------+ +``` + +A record of this type terminates the image. No further data from the buffer +should be consumed. + +* * * + +[1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md + +[2] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/hvm/save.h + +[3] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/specs/libxc-migration-stream.pandoc diff --git a/xen/include/public/arch-arm/hvm/save.h b/xen/include/public/arch-arm/hvm/save.h index 75b8e65bcb..d5b0c15203 100644 --- a/xen/include/public/arch-arm/hvm/save.h +++ b/xen/include/public/arch-arm/hvm/save.h @@ -26,6 +26,11 @@ #ifndef __XEN_PUBLIC_HVM_SAVE_ARM_H__ #define __XEN_PUBLIC_HVM_SAVE_ARM_H__ +/* + * Further use of HVM state is deprecated. New state records should only + * be added to the domain state header: public/save.h + */ + #endif /* diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h index 773a380bc2..e61e2dbcd7 100644 --- a/xen/include/public/arch-x86/hvm/save.h +++ b/xen/include/public/arch-x86/hvm/save.h @@ -648,6 +648,11 @@ struct hvm_msr { */ #define HVM_SAVE_CODE_MAX 20 +/* + * Further use of HVM state is deprecated. New state records should only + * be added to the domain state header: public/save.h + */ + #endif /* __XEN_PUBLIC_HVM_SAVE_X86_H__ */ /* diff --git a/xen/include/public/save.h b/xen/include/public/save.h new file mode 100644 index 0000000000..c4be9f570c --- /dev/null +++ b/xen/include/public/save.h @@ -0,0 +1,66 @@ +/* + * save.h + * + * Structure definitions for common PV/HVM domain state that is held by Xen. + * + * Copyright Amazon.com Inc. or its affiliates. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#ifndef XEN_PUBLIC_SAVE_H +#define XEN_PUBLIC_SAVE_H + +#if defined(__XEN__) || defined(__XEN_TOOLS__) + +#include "xen.h" + +/* + * C structures for the Domain Context v1 format. + * See docs/specs/domain-context.md + */ + +struct domain_context_record { + uint32_t type; + uint32_t instance; + uint64_t length; + uint8_t body[XEN_FLEX_ARRAY_DIM]; +}; + +#define _DOMAIN_CONTEXT_RECORD_ALIGN 3 +#define DOMAIN_CONTEXT_RECORD_ALIGN (1U << _DOMAIN_CONTEXT_RECORD_ALIGN) + +enum { + DOMAIN_CONTEXT_END, + DOMAIN_CONTEXT_START, + /* New types go here */ + DOMAIN_CONTEXT_NR_TYPES +}; + +/* Initial entry */ +struct domain_context_start { + uint32_t xen_major, xen_minor; +}; + +/* Terminating entry */ +struct domain_context_end {}; + +#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */ + +#endif /* XEN_PUBLIC_SAVE_H */