Message ID | 20200128122823.12920-3-pdurrant@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Series | docs: Migration design documents |
On 28.01.20 13:28, Paul Durrant wrote:
> This patch details proposed extra migration data and xenstore protocol
> extensions to support non-cooperative live migration of guests.
>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Wei Liu <wl@xen.org>
>
> v3:
>  - New in v3
> ---
>  docs/designs/xenstore-migration.md | 122 +++++++++++++++++++++++++++++
>  1 file changed, 122 insertions(+)
>  create mode 100644 docs/designs/xenstore-migration.md
>
> diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
> new file mode 100644
> index 0000000000..9020b6ff9a
> --- /dev/null
> +++ b/docs/designs/xenstore-migration.md
> @@ -0,0 +1,122 @@
> +**watch data**
> +
> +
> +`<path>|<token>|`
> +
> +`<path>` again is considered relative and, together with `<token>`, should
> +be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
> +Note that `<path>` must not be a *special* value (beginning with `@`).

Why not? This is possible for a guest, so it should be possible to
migrate such a watch.

Today it might not make any sense, but we should not block any future
special values which might make sense for a guest to use.

> +
> +
> +### Protocol Extension
> +
> +The `WATCH` operation does not allow specification of a `<domid>`; it is
> +assumed that the watch pertains to the domain that owns the shared ring
> +over which the operation is passed. Hence, for the tool-stack to be able
> +to register a watch on behalf of a domain a new operation is needed:
> +
> +```
> +ADD_DOMAIN_WATCHES      <domid>|<watch>|+
> +
> +Adds watches on behalf of the specified domain.
> +
> +<watch> is a NUL separated tuple of <path>|<token>. <path> must not be
> +@<wspecial>, otherwise the semantics of this operation are identical to
> +the domain issuing WATCH <path>|<token>|.

I wouldn't exclude @<wspecial>.


Juergen
> -----Original Message-----
> From: Jürgen Groß <jgross@suse.com>
> Sent: 28 January 2020 13:36
> To: Durrant, Paul <pdurrant@amazon.co.uk>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall
> <julien@xen.org>; Wei Liu <wl@xen.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; George Dunlap <George.Dunlap@eu.citrix.com>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson
> <ian.jackson@eu.citrix.com>
> Subject: Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design
> document for migration of xenstore data
>
> On 28.01.20 13:28, Paul Durrant wrote:
> > This patch details proposed extra migration data and xenstore protocol
> > extensions to support non-cooperative live migration of guests.
> >
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien@xen.org>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Wei Liu <wl@xen.org>
> >
> > v3:
> >  - New in v3
> > ---
> >  docs/designs/xenstore-migration.md | 122 +++++++++++++++++++++++++++++
> >  1 file changed, 122 insertions(+)
> >  create mode 100644 docs/designs/xenstore-migration.md
> >
> > diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
> > new file mode 100644
> > index 0000000000..9020b6ff9a
> > --- /dev/null
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -0,0 +1,122 @@
> > +**watch data**
> > +
> > +
> > +`<path>|<token>|`
> > +
> > +`<path>` again is considered relative and, together with `<token>`, should
> > +be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
> > +Note that `<path>` must not be a *special* value (beginning with `@`).
>
> Why not? This is possible for a guest, so it should be possible to
> migrate such a watch.
>
> Today it might not make any sense, but we should not block any future
> special values which might make sense for a guest to use.
>

Ok. I was just limiting things to what is actually needed. If you think
there is merit in allowing them then I have no objection. I'll remove the
exclusions.

  Paul
On Tue, Jan 28, 2020 at 12:28:23PM +0000, Paul Durrant wrote:
> +```
> +0       1       2       3       4       5       6       7    octet
> ++------------------------+------------------------+
> +| type                   | record specific data   |
> ++------------------------+                        |
> +...
> ++-------------------------------------------------+
> +```
> +
> +
> +| Field | Description |
> +|---|---|
> +| `type` | 0x00000000: invalid |
> +|        | 0x00000001: node data |
> +|        | 0x00000002: watch data |
> +|        | 0x00000003 - 0xFFFFFFFF: reserved for future use |
> +
> +
> +where data is always in the form of a NUL separated and terminated tuple
> +as follows
> +

Is there any padding requirement for a record? I take it there isn't?

Wei.
> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei Liu
> Sent: 28 January 2020 13:46
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall
> <julien@xen.org>; Wei Liu <wl@xen.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; George Dunlap <George.Dunlap@eu.citrix.com>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson
> <ian.jackson@eu.citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design
> document for migration of xenstore data
>
> On Tue, Jan 28, 2020 at 12:28:23PM +0000, Paul Durrant wrote:
> > +```
> > +0       1       2       3       4       5       6       7    octet
> > ++------------------------+------------------------+
> > +| type                   | record specific data   |
> > ++------------------------+                        |
> > +...
> > ++-------------------------------------------------+
> > +```
> > +
> > +
> > +| Field | Description |
> > +|---|---|
> > +| `type` | 0x00000000: invalid |
> > +|        | 0x00000001: node data |
> > +|        | 0x00000002: watch data |
> > +|        | 0x00000003 - 0xFFFFFFFF: reserved for future use |
> > +
> > +
> > +where data is always in the form of a NUL separated and terminated tuple
> > +as follows
> > +
>
> Is there any padding requirement for a record? I take it there isn't?
>

No, the padding is at the generic record level (already part of the spec).

  Paul

> Wei.
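As a rough illustration of the record layout discussed above, the following Python sketch packs and unpacks the record-specific part of a `DOMAIN_XENSTORE_DATA` record: a 32-bit `type` followed by a NUL separated and terminated tuple. The helper names `encode_record` and `decode_record`, the little-endian byte order, and the example path, value and permission string are assumptions made for the example and do not come from the patch. Padding is left out because, per the reply above, it is handled at the generic record level.

```python
import struct

# Type values taken from the table in the proposal.
NODE_DATA = 0x00000001
WATCH_DATA = 0x00000002


def encode_record(rec_type, fields):
    """Pack a 32-bit type followed by NUL separated and terminated fields.

    Little-endian byte order is an assumption of this sketch.
    """
    body = b"".join(field + b"\0" for field in fields)
    return struct.pack("<I", rec_type) + body


def decode_record(buf):
    """Split a record body back into (type, [field, ...])."""
    (rec_type,) = struct.unpack_from("<I", buf, 0)
    data = buf[4:]
    if not data.endswith(b"\0"):
        raise ValueError("tuple is not NUL terminated")
    # Drop the final terminator, then split on the separators.
    return rec_type, data[:-1].split(b"\0")


# Hypothetical node record: <path>|<value>|<perm-as-string>|
node = encode_record(NODE_DATA, [b"device/vif/0/state", b"4", b"n0"])
# Hypothetical watch record: <path>|<token>|
watch = encode_record(WATCH_DATA, [b"device/vif/0/state", b"token-1"])
```

The path in each tuple is relative to `/local/domain/$domid`, so a receiver would prepend the domain path before issuing the corresponding `WRITE` and `SET_PERMS` operations.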
diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
new file mode 100644
index 0000000000..9020b6ff9a
--- /dev/null
+++ b/docs/designs/xenstore-migration.md
@@ -0,0 +1,122 @@
+# Xenstore Migration
+
+## Background
+
+The design for *Non-Cooperative Migration of Guests*[1] explains that extra
+save records are required in the migration stream to allow a guest running
+PV drivers to be migrated without its co-operation. Moreover the save
+records must include details of registered xenstore watches as well as
+content; information that cannot currently be recovered from `xenstored`,
+and hence some extension to the xenstore protocol[2] will also be required.
+
+The *libxenlight Domain Image Format* specification[3] already defines a
+record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
+transferring xenstore data pertaining to the domain directly as it is
+specified such that keys are relative to the path
+`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
+define at least one new save record type.
+
+## Proposal
+
+### New Save Record
+
+A new mandatory record type should be defined within the libxenlight Domain
+Image Format:
+
+`0x00000007: DOMAIN_XENSTORE_DATA`
+
+The format of each of these new records should be as follows:
+
+
+```
+0       1       2       3       4       5       6       7    octet
++------------------------+------------------------+
+| type                   | record specific data   |
++------------------------+                        |
+...
++-------------------------------------------------+
+```
+
+
+| Field | Description |
+|---|---|
+| `type` | 0x00000000: invalid |
+|        | 0x00000001: node data |
+|        | 0x00000002: watch data |
+|        | 0x00000003 - 0xFFFFFFFF: reserved for future use |
+
+
+where data is always in the form of a NUL separated and terminated tuple
+as follows
+
+
+**node data**
+
+
+`<path>|<value>|<perm-as-string>|`
+
+
+`<path>` is considered relative to the domain path `/local/domain/$domid`
+and hence must not begin with `/`.
+`<path>` and `<value>` should be suitable to formulate a `WRITE` operation
+to the receiving xenstore and `<perm-as-string>` should be similarly suitable
+to formulate a subsequent `SET_PERMS` operation.
+
+**watch data**
+
+
+`<path>|<token>|`
+
+`<path>` again is considered relative and, together with `<token>`, should
+be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
+Note that `<path>` must not be a *special* value (beginning with `@`).
+
+
+### Protocol Extension
+
+The `WATCH` operation does not allow specification of a `<domid>`; it is
+assumed that the watch pertains to the domain that owns the shared ring
+over which the operation is passed. Hence, for the tool-stack to be able
+to register a watch on behalf of a domain a new operation is needed:
+
+```
+ADD_DOMAIN_WATCHES      <domid>|<watch>|+
+
+Adds watches on behalf of the specified domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. <path> must not be
+@<wspecial>, otherwise the semantics of this operation are identical to
+the domain issuing WATCH <path>|<token>|.
+```
+
+The watch information for a domain also needs to be extracted from the
+sending xenstored so the following operation is also needed:
+
+```
+GET_DOMAIN_WATCHES      <domid>|<index>   <gencnt>|<watch>|*
+
+Gets the list of watches that are currently registered for the domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. The sub-list returned
+will start at <index> into the overall list of watches and may be
+truncated such that the returned data fits within XENSTORE_PAYLOAD_MAX.
+If <index> is beyond the end of the overall list then the returned
+sub-list will be empty. If the value of <gencnt> changes then it indicates
+that the overall watch list has changed and thus it may be necessary
+to re-issue the operation for previous values of <index>.
+```
+
+It may also be desirable to state in the protocol specification that
+the `INTRODUCE` operation should not clear the `<mfn>` specified such that
+a `RELEASE` operation followed by an `INTRODUCE` operation form an
+idempotent pair. The current implementation of *C xenstored* does this
+(in the `domain_conn_reset()` function) but this could be dropped as this
+behaviour is not currently specified and the page will always be zeroed
+for a newly created domain.
+
+
+* * *
+
+[1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md
+[2] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore.txt
+[3] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/specs/libxl-migration-stream.pandoc
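
The `<index>`/`<gencnt>` semantics proposed for `GET_DOMAIN_WATCHES` imply a paging loop on the caller's side. A minimal Python sketch of that loop follows, assuming a hypothetical callable `get_domain_watches(domid, index)` that performs one request and returns `(gencnt, sub_list)`; neither that callable nor the `max_restarts` bound is part of the proposal. The loop advances `<index>` by the number of watches received so far, stops on an empty sub-list, and starts over if `<gencnt>` changes part-way through.

```python
def fetch_all_watches(get_domain_watches, domid, max_restarts=8):
    """Collect the full watch list for domid, one sub-list at a time.

    get_domain_watches is a hypothetical transport function returning
    (gencnt, [(path, token), ...]) for a single GET_DOMAIN_WATCHES request.
    """
    for _ in range(max_restarts):
        watches = []
        first_gencnt, chunk = get_domain_watches(domid, 0)
        gencnt = first_gencnt
        # Keep asking for the next sub-list until it comes back empty
        # or the generation count shows the list changed underneath us.
        while chunk and gencnt == first_gencnt:
            watches.extend(chunk)
            gencnt, chunk = get_domain_watches(domid, len(watches))
        if gencnt == first_gencnt:
            return watches
        # The list changed while paging: discard and start again.
    raise RuntimeError("watch list kept changing; giving up")
```

Restarting from index 0 is the simplest reading of "re-issue the operation for previous values of `<index>`"; a caller that tracks which sub-lists were affected could presumably do less work.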
This patch details proposed extra migration data and xenstore protocol
extensions to support non-cooperative live migration of guests.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>

v3:
 - New in v3
---
 docs/designs/xenstore-migration.md | 122 +++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100644 docs/designs/xenstore-migration.md