Message ID | 20220617121932.249381-3-victortoso@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | qapi: add generator for Golang interface | expand |
Sorry it took me a while to find the time to look into this! Overall this second iteration is a significant improvement over the initial one. There are still a few things that I think should be changed, so for the time being I'm going to comment mostly on the generated Go code and leave the details of the implementation for later. On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > This patch handles QAPI alternate types and generates data structures > in Go that handles it. > > At this moment, there are 5 alternates in qemu/qapi, they are: > * BlockDirtyBitmapMergeSource > * Qcow2OverlapChecks > * BlockdevRef > * BlockdevRefOrNull > * StrOrNull I personally don't think it's very useful to list all the alternate types in the commit message, or even mention how many there are. You do this for all other types too, and it seems to me that it's just an opportunity for incosistent information to make their way to the git repository - what if a new type is introduced between the time your series is queued and merged? I'd say just drop this part. > Example: > > qapi: > | { 'alternate': 'BlockdevRef', > | 'data': { 'definition': 'BlockdevOptions', > | 'reference': 'str' } } > > go: > | type BlockdevRef struct { > | Definition *BlockdevOptions > | Reference *string > | } > > usage: > | input := `{"driver":"qcow2","data-file":"/some/place/my-image"}` > | k := BlockdevRef{} > | err := json.Unmarshal([]byte(input), &k) > | if err != nil { > | panic(err) > | } > | // *k.Definition.Qcow2.DataFile.Reference == "/some/place/my-image" Let me just say that I absolutely *love* how you've added these bits comparing the QAPI / Go representations as well as usage. They really help a lot understanding what the generator is trying to achieve :) > +// Creates a decoder that errors on unknown Fields > +// Returns true if successfully decoded @from string @into type > +// Returns false without error is failed with "unknown field" > +// Returns false with error is a different error was found > +func StrictDecode(into interface{}, from []byte) error { > + dec := json.NewDecoder(strings.NewReader(string(from))) > + dec.DisallowUnknownFields() > + > + if err := dec.Decode(into); err != nil { > + return err > + } > + return nil > +} The documentation doesn't seem to be consistent with how the function actually works: AFAICT it returns false *with an error* for any failure, including those caused by unknown fields being present in the input JSON. Looking at the generated code: > type BlockdevRef struct { > Definition *BlockdevOptions > Reference *string > } > > func (s BlockdevRef) MarshalJSON() ([]byte, error) { > if s.Definition != nil { > return json.Marshal(s.Definition) > } else if s.Reference != nil { > return json.Marshal(s.Reference) > } > return nil, errors.New("BlockdevRef has empty fields") Returning an error if no field is set is good. Can we be more strict and returning one if more than one is set as well? That feels better than just picking the first one. > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { > // Check for json-null first > if string(data) == "null" { > return errors.New(`null not supported for BlockdevRef`) > } > // Check for BlockdevOptions > { > s.Definition = new(BlockdevOptions) > if err := StrictDecode(s.Definition, data); err == nil { > return nil > } The use of StrictDecode() here means that we won't be able to parse an alternate produced by a version of QEMU where BlockdevOptions has gained additional fields, doesn't it? Considering that we will happily parse such a BlockdevOptions outside of the context of BlockdevRef, I think we should be consistent and allow the same to happen here. > s.Definition = nil > } > // Check for string > { > s.Reference = new(string) > if err := StrictDecode(s.Reference, data); err == nil { > return nil > } > s.Reference = nil > } > > return errors.New(fmt.Sprintf("Can't convert to BlockdevRef: %s", string(data))) On a similar note to the MarshalJSON comment above, I'm not sure this is right. If we find that more than one field of the alternate is set, we should error out - that's just invalid JSON we're dealing with. But if we couldn't find any, that might be valid JSON that's been produced by a version of QEMU that introduced additional options to the alternate. In the spirit of "be liberal in what you accept", I think we should probably not reject that? Of course then the client code will have to look like if r.Definition != nil { // do something with r.Definition } else if r.Reference != nil { // do something with r.Reference } else { // we don't recognize this - error out } but the same is going to be true for enums, events and everything else that can be extended in a backwards-compatible fashion.
Hi, On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote: > Sorry it took me a while to find the time to look into this! (1.5 month later.. what can I say!) :) > Overall this second iteration is a significant improvement over the > initial one. There are still a few things that I think should be > changed, so for the time being I'm going to comment mostly on the > generated Go code and leave the details of the implementation for > later. Sure, and thanks. > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > > This patch handles QAPI alternate types and generates data structures > > in Go that handles it. > > > > At this moment, there are 5 alternates in qemu/qapi, they are: > > * BlockDirtyBitmapMergeSource > > * Qcow2OverlapChecks > > * BlockdevRef > > * BlockdevRefOrNull > > * StrOrNull > > I personally don't think it's very useful to list all the alternate > types in the commit message, or even mention how many there are. You > do this for all other types too, and it seems to me that it's just an > opportunity for incosistent information to make their way to the git > repository - what if a new type is introduced between the time your > series is queued and merged? I'd say just drop this part. No issue on my side in dropping this bits. > > Example: > > > > qapi: > > | { 'alternate': 'BlockdevRef', > > | 'data': { 'definition': 'BlockdevOptions', > > | 'reference': 'str' } } > > > > go: > > | type BlockdevRef struct { > > | Definition *BlockdevOptions > > | Reference *string > > | } > > > > usage: > > | input := `{"driver":"qcow2","data-file":"/some/place/my-image"}` > > | k := BlockdevRef{} > > | err := json.Unmarshal([]byte(input), &k) > > | if err != nil { > > | panic(err) > > | } > > | // *k.Definition.Qcow2.DataFile.Reference == "/some/place/my-image" > > Let me just say that I absolutely *love* how you've added these bits > comparing the QAPI / Go representations as well as usage. They really > help a lot understanding what the generator is trying to achieve :) Thanks! > > +// Creates a decoder that errors on unknown Fields > > +// Returns true if successfully decoded @from string @into type > > +// Returns false without error is failed with "unknown field" > > +// Returns false with error is a different error was found > > +func StrictDecode(into interface{}, from []byte) error { > > + dec := json.NewDecoder(strings.NewReader(string(from))) > > + dec.DisallowUnknownFields() > > + > > + if err := dec.Decode(into); err != nil { > > + return err > > + } > > + return nil > > +} > > The documentation doesn't seem to be consistent with how the > function actually works: AFAICT it returns false *with an > error* for any failure, including those caused by unknown > fields being present in the input JSON. You are correct, documentation of this helper function needs to be fixed if we keep the helper function, as you made a good point about backwards-compatible decoding a struct that might have introduced extra fields in a newer version. > Looking at the generated code: > > > type BlockdevRef struct { > > Definition *BlockdevOptions > > Reference *string > > } > > > > func (s BlockdevRef) MarshalJSON() ([]byte, error) { > > if s.Definition != nil { > > return json.Marshal(s.Definition) > > } else if s.Reference != nil { > > return json.Marshal(s.Reference) > > } > > return nil, errors.New("BlockdevRef has empty fields") > > Returning an error if no field is set is good. Can we be more > strict and returning one if more than one is set as well? That > feels better than just picking the first one. Sure. > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { > > // Check for json-null first > > if string(data) == "null" { > > return errors.New(`null not supported for BlockdevRef`) > > } > > // Check for BlockdevOptions > > { > > s.Definition = new(BlockdevOptions) > > if err := StrictDecode(s.Definition, data); err == nil { > > return nil > > } > > The use of StrictDecode() here means that we won't be able to > parse an alternate produced by a version of QEMU where > BlockdevOptions has gained additional fields, doesn't it? That's correct. This means that with this RFCv2 proposal, qapi-go based on qemu version 7.1 might not be able to decode a qmp message from qemu version 7.2 if it has introduced a new field. This needs fixing, not sure yet the way to go. > Considering that we will happily parse such a BlockdevOptions > outside of the context of BlockdevRef, I think we should be > consistent and allow the same to happen here. StrictDecode is only used with alternates because, unlike unions, Alternate types don't have a 'discriminator' field that would allow us to know what data type to expect. With this in mind, theoretically speaking, we could have very similar struct types as Alternate fields and we have to find on runtime which type is that underlying byte stream. So, to reply to your suggestion, if we allow BlockdevRef without StrictDecode we might find ourselves in a situation that it matched a few fields of BlockdevOptions but it the byte stream was actually another type. So, I acknowledge that current version of this patch is not correct but we still need to look into the fields and see if there is a perfect match with the information that we have. > > s.Definition = nil > > } > > // Check for string > > { > > s.Reference = new(string) > > if err := StrictDecode(s.Reference, data); err == nil { > > return nil > > } > > s.Reference = nil > > } > > > > return errors.New(fmt.Sprintf("Can't convert to BlockdevRef: %s", string(data))) > > On a similar note to the MarshalJSON comment above, I'm not > sure this is right. > > If we find that more than one field of the alternate is set, we > should error out - that's just invalid JSON we're dealing with. With StrictDecode (or something similar) there shouldn't be multiple fields being set on the Go structure. Once it finds a match, it returns nil (no error) > But if we couldn't find any, that might be valid JSON that's > been produced by a version of QEMU that introduced additional > options to the alternate. In the spirit of "be liberal in what > you accept", I think we should probably not reject that? Of > course then the client code will have to look like > > if r.Definition != nil { > // do something with r.Definition > } else if r.Reference != nil { > // do something with r.Reference > } else { > // we don't recognize this - error out > } > > but the same is going to be true for enums, events and everything > else that can be extended in a backwards-compatible fashion. The client code looking like above is not much different than if err := json.Unmarshal([]byte(input), &r); err != nil { // error out: bad json? incompatibility issue? } else if r.Definition != nil { // do something with r.Definition } else if r.Reference != nil { // do something with r.Reference } else { // empty might be valid though, e.g: JSON null } My main concern with alternate logic is not putting data that should go to r.Reference into r.Definition and vice versa. I took note of all your suggestions, I'll be reworking this. I'll revisit the problem with StrictDecode together with addressing my reply to Daniel: https://lists.gnu.org/archive/html/qemu-devel/2022-06/msg03140.html Cheers, Victor
On Wed, Aug 17, 2022 at 04:04:19PM +0200, Victor Toso wrote: > On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote: > > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { > > > // Check for json-null first > > > if string(data) == "null" { > > > return errors.New(`null not supported for BlockdevRef`) > > > } > > > // Check for BlockdevOptions > > > { > > > s.Definition = new(BlockdevOptions) > > > if err := StrictDecode(s.Definition, data); err == nil { > > > return nil > > > } > > > > The use of StrictDecode() here means that we won't be able to > > parse an alternate produced by a version of QEMU where > > BlockdevOptions has gained additional fields, doesn't it? > > That's correct. This means that with this RFCv2 proposal, qapi-go > based on qemu version 7.1 might not be able to decode a qmp > message from qemu version 7.2 if it has introduced a new field. > > This needs fixing, not sure yet the way to go. > > > Considering that we will happily parse such a BlockdevOptions > > outside of the context of BlockdevRef, I think we should be > > consistent and allow the same to happen here. > > StrictDecode is only used with alternates because, unlike unions, > Alternate types don't have a 'discriminator' field that would > allow us to know what data type to expect. > > With this in mind, theoretically speaking, we could have very > similar struct types as Alternate fields and we have to find on > runtime which type is that underlying byte stream. > > So, to reply to your suggestion, if we allow BlockdevRef without > StrictDecode we might find ourselves in a situation that it > matched a few fields of BlockdevOptions but it the byte stream > was actually another type. IIUC your concern is that the QAPI schema could gain a new type, TotallyNotBlockdevOptions, which looks exactly like BlockdevOptions except for one or more extra fields. If QEMU then produced a JSON like { "description": { /* a TotallyNotBlockdevOptions here */ } } and we'd try to deserialize it with Go code like ref := BlockdevRef{} json.Unmarsal(&ref) we'd end up mistakenly parsing the TotallyNotBlockdevOptions as a valid BlockdevOptions, dropping the extra fields in the process. Does that correctly describe the reason why you feel that the use of StrictDecode is necessary? If so, I respectfully disagree :) If the client code is expecting a BlockdevRef as the return value of a command and QEMU is producing something that is *not* a BlockdevRef instead, that's an obvious bug in QEMU. If the client code is expecting a BlockdevRef as the return value of a command that is specified *not* to return a BlockdevRef, that's an obvious bug in the client code. In neither case it should be the responsibility of the SDK to second-guess the declared intent, especially when it's perfectly valid for a type to be extended in a backwards-compatible way by adding fields to it.
Hi, On Fri, Aug 19, 2022 at 11:27:13AM -0500, Andrea Bolognani wrote: > On Wed, Aug 17, 2022 at 04:04:19PM +0200, Victor Toso wrote: > > On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote: > > > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > > > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { > > > > // Check for json-null first > > > > if string(data) == "null" { > > > > return errors.New(`null not supported for BlockdevRef`) > > > > } > > > > // Check for BlockdevOptions > > > > { > > > > s.Definition = new(BlockdevOptions) > > > > if err := StrictDecode(s.Definition, data); err == nil { > > > > return nil > > > > } > > > > > > The use of StrictDecode() here means that we won't be able to > > > parse an alternate produced by a version of QEMU where > > > BlockdevOptions has gained additional fields, doesn't it? > > > > That's correct. This means that with this RFCv2 proposal, qapi-go > > based on qemu version 7.1 might not be able to decode a qmp > > message from qemu version 7.2 if it has introduced a new field. > > > > This needs fixing, not sure yet the way to go. > > > > > Considering that we will happily parse such a BlockdevOptions > > > outside of the context of BlockdevRef, I think we should be > > > consistent and allow the same to happen here. > > > > StrictDecode is only used with alternates because, unlike unions, > > Alternate types don't have a 'discriminator' field that would > > allow us to know what data type to expect. > > > > With this in mind, theoretically speaking, we could have very > > similar struct types as Alternate fields and we have to find on > > runtime which type is that underlying byte stream. > > > > So, to reply to your suggestion, if we allow BlockdevRef without > > StrictDecode we might find ourselves in a situation that it > > matched a few fields of BlockdevOptions but it the byte stream > > was actually another type. > > IIUC your concern is that the QAPI schema could gain a new > type, TotallyNotBlockdevOptions, which looks exactly like > BlockdevOptions except for one or more extra fields. > > If QEMU then produced a JSON like > > { "description": { /* a TotallyNotBlockdevOptions here */ } } > > and we'd try to deserialize it with Go code like > > ref := BlockdevRef{} > json.Unmarsal(&ref) > > we'd end up mistakenly parsing the TotallyNotBlockdevOptions as a > valid BlockdevOptions, dropping the extra fields in the process. > > Does that correctly describe the reason why you feel that the use of > StrictDecode is necessary? Not quite. The problem here is related to the Alternate types of the QAPI specification [0], just to name a simple in-use example, BlockdevRefOrNul [1]. [0] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/qapi-code-gen.rst?plain=1#L387 [1] https://gitlab.com/qemu-project/qemu/-/blob/master/qapi/block-core.json#L4349 To exemplify the problem that I try to solve with StrictDecode, let's say there is a DeviceRef alternate type that looks like: { 'alternate': 'DeviceRef', 'data': { 'memory': 'BlockdevRefInMemory', 'disk': 'BlockdevRefInDisk', 'cloud': 'BlockdevRefInCloud' } } Just a quick recap, at runtime we don't have data's payload name (e.g: disk). We need to check the actual data and try to find what is the payload type. type BlockdevRefInMemory struct { Name *string Size uint64 Start uint64 End uint64 } type BlockdevRefInDisk struct { Name *string Size uint64 Path *string } type BlockdevRefInCloud struct { Name *string Size uint64 Uri *string } All types have unique fields but they all share some fields too. Golang, without StrictDecode would happily decode a "disk" payload into either "memory" or "cloud" fields, matching only what the json provides and ignoring the rest. StrictDecode would error if the payload had fields that don't belong to that Type so we could try to find a perfect match. While this should work reliably with current version of QEMU's qapi-schema.json, it is not future error proof... It is just a bit better than not checking at all. The overall expectations is that, if the fields have too much in common, it is likely they should have been tagged as 'union' instead of 'alternate'. Yet, because alternate types have this flexibility, we need to be extra careful. I'm still thinking about this in a way that would not give too much hassle when considering a generated code that talks with older/newer qemu where some fields might have been added/removed. > If so, I respectfully disagree :) > > If the client code is expecting a BlockdevRef as the return > value of a command and QEMU is producing something that is > *not* a BlockdevRef instead, that's an obvious bug in QEMU. If > the client code is expecting a BlockdevRef as the return value > of a command that is specified *not* to return a BlockdevRef, > that's an obvious bug in the client code. > > In neither case it should be the responsibility of the SDK to > second-guess the declared intent, especially when it's > perfectly valid for a type to be extended in a > backwards-compatible way by adding fields to it. Yes, the SDK should consider valid QMP messages. This specific case is just a bit more complex qapi type that SDK needs to worry. Thanks for your input! Victor
Victor Toso <victortoso@redhat.com> writes: > Hi, > > On Fri, Aug 19, 2022 at 11:27:13AM -0500, Andrea Bolognani wrote: >> On Wed, Aug 17, 2022 at 04:04:19PM +0200, Victor Toso wrote: >> > On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote: >> > > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: >> > > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { >> > > > // Check for json-null first >> > > > if string(data) == "null" { >> > > > return errors.New(`null not supported for BlockdevRef`) >> > > > } >> > > > // Check for BlockdevOptions >> > > > { >> > > > s.Definition = new(BlockdevOptions) >> > > > if err := StrictDecode(s.Definition, data); err == nil { >> > > > return nil >> > > > } >> > > >> > > The use of StrictDecode() here means that we won't be able to >> > > parse an alternate produced by a version of QEMU where >> > > BlockdevOptions has gained additional fields, doesn't it? >> > >> > That's correct. This means that with this RFCv2 proposal, qapi-go >> > based on qemu version 7.1 might not be able to decode a qmp >> > message from qemu version 7.2 if it has introduced a new field. >> > >> > This needs fixing, not sure yet the way to go. >> > >> > > Considering that we will happily parse such a BlockdevOptions >> > > outside of the context of BlockdevRef, I think we should be >> > > consistent and allow the same to happen here. >> > >> > StrictDecode is only used with alternates because, unlike unions, >> > Alternate types don't have a 'discriminator' field that would >> > allow us to know what data type to expect. >> > >> > With this in mind, theoretically speaking, we could have very >> > similar struct types as Alternate fields and we have to find on >> > runtime which type is that underlying byte stream. >> > >> > So, to reply to your suggestion, if we allow BlockdevRef without >> > StrictDecode we might find ourselves in a situation that it >> > matched a few fields of BlockdevOptions but it the byte stream >> > was actually another type. >> >> IIUC your concern is that the QAPI schema could gain a new >> type, TotallyNotBlockdevOptions, which looks exactly like >> BlockdevOptions except for one or more extra fields. >> >> If QEMU then produced a JSON like >> >> { "description": { /* a TotallyNotBlockdevOptions here */ } } >> >> and we'd try to deserialize it with Go code like >> >> ref := BlockdevRef{} >> json.Unmarsal(&ref) >> >> we'd end up mistakenly parsing the TotallyNotBlockdevOptions as a >> valid BlockdevOptions, dropping the extra fields in the process. >> >> Does that correctly describe the reason why you feel that the use of >> StrictDecode is necessary? > > Not quite. The problem here is related to the Alternate types of > the QAPI specification [0], just to name a simple in-use example, > BlockdevRefOrNul [1]. > > [0] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/qapi-code-gen.rst?plain=1#L387 > [1] https://gitlab.com/qemu-project/qemu/-/blob/master/qapi/block-core.json#L4349 > > To exemplify the problem that I try to solve with StrictDecode, > let's say there is a DeviceRef alternate type that looks like: > > { 'alternate': 'DeviceRef', > 'data': { 'memory': 'BlockdevRefInMemory', > 'disk': 'BlockdevRefInDisk', > 'cloud': 'BlockdevRefInCloud' } } > > Just a quick recap, at runtime we don't have data's payload name > (e.g: disk). We need to check the actual data and try to find > what is the payload type. > > type BlockdevRefInMemory struct { > Name *string > Size uint64 > Start uint64 > End uint64 > } > > type BlockdevRefInDisk struct { > Name *string > Size uint64 > Path *string > } > > type BlockdevRefInCloud struct { > Name *string > Size uint64 > Uri *string > } > > All types have unique fields but they all share some fields too. Quick intercession (I merely skimmed the review thread; forgive me if it's not useful or not new): An alternate type is like a union type, except there is no discriminator on the wire. Instead, the branch to use is inferred from the value. An alternate can only express a choice between types represented differently on the wire. This is docs/devel/qapi-code-gen.rst. Implied there: the inference is based on the JSON type *only*, i.e. no two branches can have the same JSON type on the wire. Since all complex types (struct or union) are JSON object on the wire, at most one alternate branch can be of complex type. More sophisticated inference would be possible if we need it. So far we haven't. > Golang, without StrictDecode would happily decode a "disk" > payload into either "memory" or "cloud" fields, matching only > what the json provides and ignoring the rest. StrictDecode would > error if the payload had fields that don't belong to that Type so > we could try to find a perfect match. > > While this should work reliably with current version of QEMU's > qapi-schema.json, it is not future error proof... It is just a > bit better than not checking at all. > > The overall expectations is that, if the fields have too much in > common, it is likely they should have been tagged as 'union' > instead of 'alternate'. Yet, because alternate types have this > flexibility, we need to be extra careful. > > I'm still thinking about this in a way that would not give too > much hassle when considering a generated code that talks with > older/newer qemu where some fields might have been added/removed. > >> If so, I respectfully disagree :) >> >> If the client code is expecting a BlockdevRef as the return >> value of a command and QEMU is producing something that is >> *not* a BlockdevRef instead, that's an obvious bug in QEMU. If >> the client code is expecting a BlockdevRef as the return value >> of a command that is specified *not* to return a BlockdevRef, >> that's an obvious bug in the client code. >> >> In neither case it should be the responsibility of the SDK to >> second-guess the declared intent, especially when it's >> perfectly valid for a type to be extended in a >> backwards-compatible way by adding fields to it. > > Yes, the SDK should consider valid QMP messages. This specific > case is just a bit more complex qapi type that SDK needs to > worry. > > Thanks for your input! > Victor
Hi, On Mon, Aug 29, 2022 at 01:27:06PM +0200, Markus Armbruster wrote: > Victor Toso <victortoso@redhat.com> writes: > > > Hi, > > > > On Fri, Aug 19, 2022 at 11:27:13AM -0500, Andrea Bolognani wrote: > >> On Wed, Aug 17, 2022 at 04:04:19PM +0200, Victor Toso wrote: > >> > On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote: > >> > > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > >> > > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error { > >> > > > // Check for json-null first > >> > > > if string(data) == "null" { > >> > > > return errors.New(`null not supported for BlockdevRef`) > >> > > > } > >> > > > // Check for BlockdevOptions > >> > > > { > >> > > > s.Definition = new(BlockdevOptions) > >> > > > if err := StrictDecode(s.Definition, data); err == nil { > >> > > > return nil > >> > > > } > >> > > > >> > > The use of StrictDecode() here means that we won't be able to > >> > > parse an alternate produced by a version of QEMU where > >> > > BlockdevOptions has gained additional fields, doesn't it? > >> > > >> > That's correct. This means that with this RFCv2 proposal, qapi-go > >> > based on qemu version 7.1 might not be able to decode a qmp > >> > message from qemu version 7.2 if it has introduced a new field. > >> > > >> > This needs fixing, not sure yet the way to go. > >> > > >> > > Considering that we will happily parse such a BlockdevOptions > >> > > outside of the context of BlockdevRef, I think we should be > >> > > consistent and allow the same to happen here. > >> > > >> > StrictDecode is only used with alternates because, unlike unions, > >> > Alternate types don't have a 'discriminator' field that would > >> > allow us to know what data type to expect. > >> > > >> > With this in mind, theoretically speaking, we could have very > >> > similar struct types as Alternate fields and we have to find on > >> > runtime which type is that underlying byte stream. > >> > > >> > So, to reply to your suggestion, if we allow BlockdevRef without > >> > StrictDecode we might find ourselves in a situation that it > >> > matched a few fields of BlockdevOptions but it the byte stream > >> > was actually another type. > >> > >> IIUC your concern is that the QAPI schema could gain a new > >> type, TotallyNotBlockdevOptions, which looks exactly like > >> BlockdevOptions except for one or more extra fields. > >> > >> If QEMU then produced a JSON like > >> > >> { "description": { /* a TotallyNotBlockdevOptions here */ } } > >> > >> and we'd try to deserialize it with Go code like > >> > >> ref := BlockdevRef{} > >> json.Unmarsal(&ref) > >> > >> we'd end up mistakenly parsing the TotallyNotBlockdevOptions as a > >> valid BlockdevOptions, dropping the extra fields in the process. > >> > >> Does that correctly describe the reason why you feel that the use of > >> StrictDecode is necessary? > > > > Not quite. The problem here is related to the Alternate types of > > the QAPI specification [0], just to name a simple in-use example, > > BlockdevRefOrNul [1]. > > > > [0] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/qapi-code-gen.rst?plain=1#L387 > > [1] https://gitlab.com/qemu-project/qemu/-/blob/master/qapi/block-core.json#L4349 > > > > To exemplify the problem that I try to solve with StrictDecode, > > let's say there is a DeviceRef alternate type that looks like: > > > > { 'alternate': 'DeviceRef', > > 'data': { 'memory': 'BlockdevRefInMemory', > > 'disk': 'BlockdevRefInDisk', > > 'cloud': 'BlockdevRefInCloud' } } > > > > Just a quick recap, at runtime we don't have data's payload name > > (e.g: disk). We need to check the actual data and try to find > > what is the payload type. > > > > type BlockdevRefInMemory struct { > > Name *string > > Size uint64 > > Start uint64 > > End uint64 > > } > > > > type BlockdevRefInDisk struct { > > Name *string > > Size uint64 > > Path *string > > } > > > > type BlockdevRefInCloud struct { > > Name *string > > Size uint64 > > Uri *string > > } > > > > All types have unique fields but they all share some fields too. > > Quick intercession (I merely skimmed the review thread; forgive me if > it's not useful or not new): > > An alternate type is like a union type, except there is no > discriminator on the wire. Instead, the branch to use is inferred > from the value. An alternate can only express a choice between types > represented differently on the wire. > > This is docs/devel/qapi-code-gen.rst. Implied there: the inference is > based on the JSON type *only*, i.e. no two branches can have the same > JSON type on the wire. Since all complex types (struct or union) are > JSON object on the wire, at most one alternate branch can be of complex > type. Ah, I've missed this bit. Thank you, it does make it much simpler. > More sophisticated inference would be possible if we need it. > So far we haven't. Ack. Cheers, Victor > > Golang, without StrictDecode would happily decode a "disk" > > payload into either "memory" or "cloud" fields, matching only > > what the json provides and ignoring the rest. StrictDecode would > > error if the payload had fields that don't belong to that Type so > > we could try to find a perfect match. > > > > While this should work reliably with current version of QEMU's > > qapi-schema.json, it is not future error proof... It is just a > > bit better than not checking at all. > > > > The overall expectations is that, if the fields have too much in > > common, it is likely they should have been tagged as 'union' > > instead of 'alternate'. Yet, because alternate types have this > > flexibility, we need to be extra careful. > > > > I'm still thinking about this in a way that would not give too > > much hassle when considering a generated code that talks with > > older/newer qemu where some fields might have been added/removed. > > > >> If so, I respectfully disagree :) > >> > >> If the client code is expecting a BlockdevRef as the return > >> value of a command and QEMU is producing something that is > >> *not* a BlockdevRef instead, that's an obvious bug in QEMU. If > >> the client code is expecting a BlockdevRef as the return value > >> of a command that is specified *not* to return a BlockdevRef, > >> that's an obvious bug in the client code. > >> > >> In neither case it should be the responsibility of the SDK to > >> second-guess the declared intent, especially when it's > >> perfectly valid for a type to be extended in a > >> backwards-compatible way by adding fields to it. > > > > Yes, the SDK should consider valid QMP messages. This specific > > case is just a bit more complex qapi type that SDK needs to > > worry. > > > > Thanks for your input! > > Victor >
Hi, On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote: > This patch handles QAPI alternate types and generates data structures > in Go that handles it. > > At this moment, there are 5 alternates in qemu/qapi, they are: > * BlockDirtyBitmapMergeSource > * Qcow2OverlapChecks > * BlockdevRef > * BlockdevRefOrNull > * StrOrNull So, things got a little bit complicated due the fact that BlockdevRefOrNull and StrOrNull can take JSON NULL value and that's completely different than an omitted field. The last reply from Markus in another patch series make this clear with a good deal of examples too. https://lists.gnu.org/archive/html/qemu-devel/2022-09/msg00113.html I'll put the struggle that I'm having just in case someone have suggestions. Let's get an example: In qemu/qapi/block-core.json: { 'struct': 'BlockdevOptionsQcow2', 'base': 'BlockdevOptionsGenericCOWFormat', 'data': { '*lazy-refcounts': 'bool', '*pass-discard-request': 'bool', '*pass-discard-snapshot': 'bool', '*pass-discard-other': 'bool', '*overlap-check': 'Qcow2OverlapChecks', '*cache-size': 'int', '*l2-cache-size': 'int', '*l2-cache-entry-size': 'int', '*refcount-cache-size': 'int', '*cache-clean-interval': 'int', '*encrypt': 'BlockdevQcow2Encryption', '*data-file': 'BlockdevRef' } } Generates in qapi-go/pkg/qapi/structs.go: type BlockdevOptionsQcow2 struct { // Base fields Backing *BlockdevRefOrNull `json:"backing,omitempty"` LazyRefcounts *bool `json:"lazy-refcounts,omitempty"` PassDiscardRequest *bool `json:"pass-discard-request,omitempty"` PassDiscardSnapshot *bool `json:"pass-discard-snapshot,omitempty"` PassDiscardOther *bool `json:"pass-discard-other,omitempty"` OverlapCheck *Qcow2OverlapChecks `json:"overlap-check,omitempty"` CacheSize *int64 `json:"cache-size,omitempty"` L2CacheSize *int64 `json:"l2-cache-size,omitempty"` L2CacheEntrySize *int64 `json:"l2-cache-entry-size,omitempty"` RefcountCacheSize *int64 `json:"refcount-cache-size,omitempty"` CacheCleanInterval *int64 `json:"cache-clean-interval,omitempty"` Encrypt *BlockdevQcow2Encryption `json:"encrypt,omitempty"` DataFile *BlockdevRef `json:"data-file,omitempty"` } One thing that we assumed for all types is that an optional type can be simply ignored. For that to work with Golang's encoding/json, we make do by making every optional field a pointer to the type it represents, plus we add the "omitempty" tag. This way, if the user did not set it, it is simply ignored. If we didn't receive anything, all should be good. The problem happens when we receive a JSON Null value, which is one possible value type for the BlockdevRefOrNull above. A message like { "backing" : null } does not trigger UnmarshalJSON for BlockdevRefOrNull and this is silently ignored. In Go's encoding/json, in order to { "backing" : null } trigger UnmarshalJSON, we need to remove omitempty and also the pointer from Backing. Now, on the marshalling side, we will always have 'backing' being set and in order to omit it from the output (which we might want indeed) we need to rework the byte array, which is something we were trying to avoid :) I've looked up and there has been proposals to address this kind of issue, including an omitnil tag [0] that could be used here to workaround it but that's unlikely to move forward [1]. The first thing to do is to define when to omit *StrOrNull and *BlockdevRefOrNull and when to set it to null. The solution I thought would make sense is to have a Null boolean field that user would need to set, so: type BlockdevRefOrNull struct { Definition *BlockdevOptions Reference *string Null bool } func (s BlockdevRefOrNull) MarshalJSON() ([]byte, error) { if s.Definition != nil { return json.Marshal(s.Definition) } else if s.Reference != nil { return json.Marshal(s.Reference) } else if s.Null { return []byte("null"), nil } // If only QAPI spec accepted "backing": {} as the same // as ommited. return []byte("qapi-go-remove-this-object"), nil } func (s *BlockdevRefOrNull) UnmarshalJSON(data []byte) error { // Check for json-null first if string(data) == "null" { s.Null = true return nil } // Check for BlockdevOptions ... } Setting BlockdevRefOrNull to null and to StrOrNull could have different meanings which is explained in the documentation itself. That's why I think Null as a boolean is fine, the user sets for a specific context when it knows what it is doing... Not having a pointer for optional fields of this two types breaks the coherence we had that all optional members are pointers in Go. This hurts a bit but this two are truly special as they can have the magic null value. Now, I'm thinking on a reasonable way to remove the object that wraps "qapi-go-remove-this-object" as well, likely using the json's decoder [2]... [0] https://github.com/golang/go/issues/22480 [1] https://github.com/golang/go/issues/34701#issuecomment-544710311 [2] https://pkg.go.dev/encoding/json#Decoder.Token Cheers, Victor
diff --git a/scripts/qapi/golang.py b/scripts/qapi/golang.py index f2776520a1..37d7c062c9 100644 --- a/scripts/qapi/golang.py +++ b/scripts/qapi/golang.py @@ -29,11 +29,32 @@ from .source import QAPISourceInfo +TEMPLATE_HELPER = ''' +// Alias for go version lower than 1.18 +type Any = interface{} + +// Creates a decoder that errors on unknown Fields +// Returns true if successfully decoded @from string @into type +// Returns false without error is failed with "unknown field" +// Returns false with error is a different error was found +func StrictDecode(into interface{}, from []byte) error { + dec := json.NewDecoder(strings.NewReader(string(from))) + dec.DisallowUnknownFields() + + if err := dec.Decode(into); err != nil { + return err + } + return nil +} +''' + + class QAPISchemaGenGolangVisitor(QAPISchemaVisitor): def __init__(self, prefix: str): super().__init__() - self.target = {name: "" for name in ["enum"]} + self.target = {name: "" for name in ["alternate", "enum", "helper"]} + self.objects_seen = {} self.schema = None self.golang_package_name = "qapi" @@ -44,6 +65,8 @@ def visit_begin(self, schema): for target in self.target: self.target[target] = f"package {self.golang_package_name}\n" + self.target["helper"] += TEMPLATE_HELPER + def visit_end(self): self.schema = None @@ -65,7 +88,69 @@ def visit_alternate_type(self: QAPISchemaGenGolangVisitor, features: List[QAPISchemaFeature], variants: QAPISchemaVariants ) -> None: - pass + assert name not in self.objects_seen + self.objects_seen[name] = True + + marshal_return_default = f'nil, errors.New("{name} has empty fields")' + marshal_check_fields = "" + unmarshal_check_fields = "" + variant_fields = "" + + # We need to check if the Alternate type supports NULL as that + # means that JSON to Go would allow all fields to be empty. + # Alternate that don't support NULL, would fail to convert + # to JSON if all fields were empty. + return_on_null = f"errors.New(`null not supported for {name}`)" + + # Assembly the fields and all the checks for Marshal and + # Unmarshal methods + for var in variants.variants: + # Nothing to generate on null types. We update some + # variables to handle json-null on marshalling methods. + if var.type.name == "null": + marshal_return_default = '[]byte("null"), nil' + return_on_null = "nil" + continue + + var_name = qapi_to_field_name(var.name) + var_type = qapi_schema_type_to_go_type(var.type.name) + variant_fields += f"\t{var_name} *{var_type}\n" + + if len(marshal_check_fields) > 0: + marshal_check_fields += "} else " + + marshal_check_fields += f'''if s.{var_name} != nil {{ + return json.Marshal(s.{var_name}) + ''' + + unmarshal_check_fields += f'''// Check for {var_type} + {{ + s.{var_name} = new({var_type}) + if err := StrictDecode(s.{var_name}, data); err == nil {{ + return nil + }} + s.{var_name} = nil + }} +''' + + marshal_check_fields += "}" + + self.target["alternate"] += generate_struct_type(name, variant_fields) + self.target["alternate"] += f''' +func (s {name}) MarshalJSON() ([]byte, error) {{ + {marshal_check_fields} + return {marshal_return_default} +}} + +func (s *{name}) UnmarshalJSON(data []byte) error {{ + // Check for json-null first + if string(data) == "null" {{ + return {return_on_null} + }} + {unmarshal_check_fields} + return errors.New(fmt.Sprintf("Can't convert to {name}: %s", string(data))) +}} +''' def visit_enum_type(self: QAPISchemaGenGolangVisitor, name: str, @@ -130,5 +215,35 @@ def gen_golang(schema: QAPISchema, vis.write(output_dir) +# Helper function for boxed or self contained structures. +def generate_struct_type(type_name, args="") -> str: + args = args if len(args) == 0 else f"\n{args}\n" + return f''' +type {type_name} struct {{{args}}} +''' + + +def qapi_schema_type_to_go_type(type: str) -> str: + schema_types_to_go = { + 'str': 'string', 'null': 'nil', 'bool': 'bool', 'number': + 'float64', 'size': 'uint64', 'int': 'int64', 'int8': 'int8', + 'int16': 'int16', 'int32': 'int32', 'int64': 'int64', 'uint8': + 'uint8', 'uint16': 'uint16', 'uint32': 'uint32', 'uint64': + 'uint64', 'any': 'Any', 'QType': 'QType', + } + + prefix = "" + if type.endswith("List"): + prefix = "[]" + type = type[:-4] + + type = schema_types_to_go.get(type, type) + return prefix + type + + def qapi_to_field_name_enum(name: str) -> str: return name.title().replace("-", "") + + +def qapi_to_field_name(name: str) -> str: + return name.title().replace("_", "").replace("-", "")
This patch handles QAPI alternate types and generates data structures in Go that handles it. At this moment, there are 5 alternates in qemu/qapi, they are: * BlockDirtyBitmapMergeSource * Qcow2OverlapChecks * BlockdevRef * BlockdevRefOrNull * StrOrNull Alternate types are similar to Union but without a discriminator that can be used to identify the underlying value on the wire. It is needed to infer it. In Go, all the types are mapped as optional fields and Marshal and Unmarshal methods will be handling the data checks. Example: qapi: | { 'alternate': 'BlockdevRef', | 'data': { 'definition': 'BlockdevOptions', | 'reference': 'str' } } go: | type BlockdevRef struct { | Definition *BlockdevOptions | Reference *string | } usage: | input := `{"driver":"qcow2","data-file":"/some/place/my-image"}` | k := BlockdevRef{} | err := json.Unmarshal([]byte(input), &k) | if err != nil { | panic(err) | } | // *k.Definition.Qcow2.DataFile.Reference == "/some/place/my-image" Signed-off-by: Victor Toso <victortoso@redhat.com> --- scripts/qapi/golang.py | 119 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 117 insertions(+), 2 deletions(-)