diff mbox

[v4,03/11] public: xen.h: add definitions for UUID handling

Message ID 1503347275-13039-4-git-send-email-volodymyr_babchuk@epam.com (mailing list archive)
State New, archived
Headers show

Commit Message

Volodymyr Babchuk Aug. 21, 2017, 8:27 p.m. UTC
Added type xen_uuid_t. This type represents UUID as an array of 16
bytes in big endian format.

Added macro XEN_DEFINE_UUID that constructs UUID in the usual way:

 XEN_DEFINE_UUID(00112233, 4455, 6677, 8899, aabbccddeeff)

will construct UUID 00112233-4455-6677-8899-aabbccddeeff presented as
 {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88,
  0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}

NB: This is compatible with Linux kernel and with libuuid, but it is not
compatible with Microsoft, as they use mixed-endian encoding (some
components are little-endian, some are big-endian).

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
 xen/include/public/xen.h | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Jan Beulich Aug. 22, 2017, 7:26 a.m. UTC | #1
>>> On 21.08.17 at 22:27, <volodymyr_babchuk@epam.com> wrote:
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -930,6 +930,15 @@ __DEFINE_XEN_GUEST_HANDLE(uint16, uint16_t);
>  __DEFINE_XEN_GUEST_HANDLE(uint32, uint32_t);
>  __DEFINE_XEN_GUEST_HANDLE(uint64, uint64_t);
>  
> +typedef uint8_t xen_uuid_t[16];

As expressed before, I'm opposed to this being a plain array. I've
pointed you at the EFI representation as an example; at the very
least I'd expect a wrapper structure around the array (which is
_not_ to say that I would ack such a patch, but at least I wouldn't
nak it).

Jan
Volodymyr Babchuk Aug. 22, 2017, 2:37 p.m. UTC | #2
Hi Jan,

On 22.08.17 10:26, Jan Beulich wrote:
>>>> On 21.08.17 at 22:27, <volodymyr_babchuk@epam.com> wrote:
>> --- a/xen/include/public/xen.h
>> +++ b/xen/include/public/xen.h
>> @@ -930,6 +930,15 @@ __DEFINE_XEN_GUEST_HANDLE(uint16, uint16_t);
>>   __DEFINE_XEN_GUEST_HANDLE(uint32, uint32_t);
>>   __DEFINE_XEN_GUEST_HANDLE(uint64, uint64_t);
>>   
>> +typedef uint8_t xen_uuid_t[16];
> 
> As expressed before, I'm opposed to this being a plain array. I've
> pointed you at the EFI representation as an example; at the very
> least I'd expect a wrapper structure around the array (which is
> _not_ to say that I would ack such a patch, but at least I wouldn't
> nak it).

EFI code uses GUID, which is product of Microsoft's NIH syndrome.

Let me cite some parts of RFC 4122:

4.1.  Format

    *The UUID format is 16 octets*; some bits of the eight octet variant
    field specified below determine finer structure.
.....

4.1.2.  Layout and Byte Order
.....
    In the absence of explicit application or presentation protocol
    specification to the contrary, a UUID is encoded as a 128-bit object,
    as follows:

    The fields are encoded as 16 octets, with the sizes and order of the
    fields defined above, and with each field encoded with the Most
    Significant Byte first (known as network byte order).  Note that the
    field names, particularly for multiplexed fields, follow historical
    practice.

    0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          time_low                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       time_mid                |         time_hi_and_version   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         node (2-5)                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.....


BTW, GUID handling is incompatible with this RFC, because in GUID
first three fields are stored in LE format, while other fields are 
stored in BE format.

I can't see why you want to map UUID to a certain structure. I can 
create such wrapper, but it will be just dead code, because there are no 
users for it. Frankly, I can't imagine why someone will want to read, 
say, clk_seq_hi_res field of UUID.
Jan Beulich Aug. 23, 2017, 8:10 a.m. UTC | #3
>>> On 22.08.17 at 16:37, <volodymyr_babchuk@epam.com> wrote:
> I can't see why you want to map UUID to a certain structure.

This is so that the type cannot mistakenly be passed to a function
taking unsigned char *, or be assigned to a variable of that type.
Please see our TYPE_SAFE() macro which we use to specifically
enclose certain scalar types in a structure to that they won't be
compatible with other types deriving from the same scalar base type.

Jan
Volodymyr Babchuk Aug. 23, 2017, 11:08 a.m. UTC | #4
Hello Jan

On 23.08.17 11:10, Jan Beulich wrote:
>>>> On 22.08.17 at 16:37, <volodymyr_babchuk@epam.com> wrote:
>> I can't see why you want to map UUID to a certain structure.
> 
> This is so that the type cannot mistakenly be passed to a function
> taking unsigned char *, or be assigned to a variable of that type.
Right, I see the point there.

> Please see our TYPE_SAFE() macro which we use to specifically
> enclose certain scalar types in a structure to that they won't be
> compatible with other types deriving from the same scalar base type.
I see. So what about

struct xen_uuid_t
{
      uint8_t a[16];
};

then?

One can convert it to union with different representations (array, 
RFC4122 struct, etc) later if there will be need for this.
Jan Beulich Aug. 23, 2017, 11:29 a.m. UTC | #5
>>> On 23.08.17 at 13:08, <volodymyr_babchuk@epam.com> wrote:
> On 23.08.17 11:10, Jan Beulich wrote:
>>>>> On 22.08.17 at 16:37, <volodymyr_babchuk@epam.com> wrote:
>>> I can't see why you want to map UUID to a certain structure.
>> 
>> This is so that the type cannot mistakenly be passed to a function
>> taking unsigned char *, or be assigned to a variable of that type.
> Right, I see the point there.
> 
>> Please see our TYPE_SAFE() macro which we use to specifically
>> enclose certain scalar types in a structure to that they won't be
>> compatible with other types deriving from the same scalar base type.
> I see. So what about
> 
> struct xen_uuid_t
> {
>       uint8_t a[16];
> };
> 
> then?

Yes, that's what I had asked for as the minimal solution. That
would be in line with (but better than) xen_domain_handle_t,
which I've just realized we also have.

> One can convert it to union with different representations (array, 
> RFC4122 struct, etc) later if there will be need for this.

Well, why don't you make it a union but stick to just the array
for now if you dislike making it similar to the EFI one? That way
we can add further representations if needed/desired without
breaking existing consumers.

Jan
Volodymyr Babchuk Aug. 30, 2017, 4:20 p.m. UTC | #6
Hi Jan,

On 23.08.17 14:29, Jan Beulich wrote:
>>>> On 23.08.17 at 13:08, <volodymyr_babchuk@epam.com> wrote:
>> On 23.08.17 11:10, Jan Beulich wrote:
>>>>>> On 22.08.17 at 16:37, <volodymyr_babchuk@epam.com> wrote:
>>>> I can't see why you want to map UUID to a certain structure.
>>>
>>> This is so that the type cannot mistakenly be passed to a function
>>> taking unsigned char *, or be assigned to a variable of that type.
>> Right, I see the point there.
>>
>>> Please see our TYPE_SAFE() macro which we use to specifically
>>> enclose certain scalar types in a structure to that they won't be
>>> compatible with other types deriving from the same scalar base type.
>> I see. So what about
>>
>> struct xen_uuid_t
>> {
>>        uint8_t a[16];
>> };
>>
>> then?
> 
> Yes, that's what I had asked for as the minimal solution. That
> would be in line with (but better than) xen_domain_handle_t,
> which I've just realized we also have.
> 
>> One can convert it to union with different representations (array,
>> RFC4122 struct, etc) later if there will be need for this.
> 
> Well, why don't you make it a union but stick to just the array
> for now if you dislike making it similar to the EFI one? That way
> we can add further representations if needed/desired without
> breaking existing consumers.
My first intention was to declare union with all possible 
representations, so it would be possible to access the same UUID as an 
array of bytes or, for example, as Microsoft GUID. Like this:

typedef union {
     /* UUID represented as a 128-bit object */
     uint8_t obj[16];

     /* Representation according to RFC 4122 */
     struct {
         __be32  time_low;
         __be16  time_mid;
         __be16  time_hi_and_version;
         __u8    clock_seq_hi_and_reserved;
         __u8    clock_seq_low;
         __u8    node[6];
     } rfc4122;

     /* Microsoft/Intel style GUID representation */
     struct {
         __le32  Data1;
         __le16  Data2;
         __le16  Data3;
         __u8    Data4[8];
     } guid;

     /* SMCCC compatible format */
     struct {
         __le32 r0;
         __le32 r1;
         __le32 r2;
         __le32 r3;
     } smccc;
} xen_uuid_t;


But looks like we can't use something like __packed or 
__attribute__((__packed__)) in the public header. This means that we 
can't rely on right overlapping and users of this union should take care 
to read and write only to one chosen substructure. I think, this is 
error-prone, so it is better to stick to

typedef struct
{
        uint8_t a[16];
} xen_uuid_t;

BTW, I'm very interested how it can be guaranteed that structures 
defined in xen.h will have the same size and alignment on both sides of 
communication channel, taking into account, then we rely only on C89 
standard.
Jan Beulich Aug. 31, 2017, 7:34 a.m. UTC | #7
>>> On 30.08.17 at 18:20, <volodymyr_babchuk@epam.com> wrote:
> My first intention was to declare union with all possible 
> representations, so it would be possible to access the same UUID as an 
> array of bytes or, for example, as Microsoft GUID. Like this:
> 
> typedef union {
>      /* UUID represented as a 128-bit object */
>      uint8_t obj[16];
> 
>      /* Representation according to RFC 4122 */
>      struct {
>          __be32  time_low;
>          __be16  time_mid;
>          __be16  time_hi_and_version;
>          __u8    clock_seq_hi_and_reserved;
>          __u8    clock_seq_low;
>          __u8    node[6];
>      } rfc4122;
> 
>      /* Microsoft/Intel style GUID representation */
>      struct {
>          __le32  Data1;
>          __le16  Data2;
>          __le16  Data3;
>          __u8    Data4[8];
>      } guid;
> 
>      /* SMCCC compatible format */
>      struct {
>          __le32 r0;
>          __le32 r1;
>          __le32 r2;
>          __le32 r3;
>      } smccc;
> } xen_uuid_t;
> 
> 
> But looks like we can't use something like __packed or 
> __attribute__((__packed__)) in the public header. This means that we 
> can't rely on right overlapping and users of this union should take care 
> to read and write only to one chosen substructure.

I don't see any use of that attribute in the structure definition
above, nor any need to add one - all fields are suitably aligned
anyway. You can't use __be* and __le* types in the public
headers, though - these will need to be uint*_t.

> BTW, I'm very interested how it can be guaranteed that structures 
> defined in xen.h will have the same size and alignment on both sides of 
> communication channel, taking into account, then we rely only on C89 
> standard.

I don't understand this comment.

Jan
Volodymyr Babchuk Aug. 31, 2017, 12:24 p.m. UTC | #8
Hi Jan,

On 31.08.17 10:34, Jan Beulich wrote:
>>>> On 30.08.17 at 18:20, <volodymyr_babchuk@epam.com> wrote:
>> My first intention was to declare union with all possible
>> representations, so it would be possible to access the same UUID as an
>> array of bytes or, for example, as Microsoft GUID. Like this:
>>
>> typedef union {
>>       /* UUID represented as a 128-bit object */
>>       uint8_t obj[16];
>>
>>       /* Representation according to RFC 4122 */
>>       struct {
>>           __be32  time_low;
>>           __be16  time_mid;
>>           __be16  time_hi_and_version;
>>           __u8    clock_seq_hi_and_reserved;
>>           __u8    clock_seq_low;
>>           __u8    node[6];
>>       } rfc4122;
>>
>>       /* Microsoft/Intel style GUID representation */
>>       struct {
>>           __le32  Data1;
>>           __le16  Data2;
>>           __le16  Data3;
>>           __u8    Data4[8];
>>       } guid;
>>
>>       /* SMCCC compatible format */
>>       struct {
>>           __le32 r0;
>>           __le32 r1;
>>           __le32 r2;
>>           __le32 r3;
>>       } smccc;
>> } xen_uuid_t;
>>
>>
>> But looks like we can't use something like __packed or
>> __attribute__((__packed__)) in the public header. This means that we
>> can't rely on right overlapping and users of this union should take care
>> to read and write only to one chosen substructure.
> 
> I don't see any use of that attribute in the structure definition
> above, nor any need to add one - all fields are suitably aligned
> anyway. You can't use __be* and __le* types in the public
> headers, though - these will need to be uint*_t.
This is a public header. As I understand it can be used by different 
compilers (gcc, icc, msvc, llvm, etc...). C89 have no restrictions to 
padding or alignment of fields in structures. No one can guarantee that

sizeof(xen_uuid_t.rfc422) == sizeof(xen_uuid_t.guid) == 
sizeof(xen_uuid_t.smccc)  == 16

On all platforms. Using any compiler. With any compiler options.

This is implementation defined ([1]). Standard says "This should present 
no problem unless binary data written by one implementation are read by 
another.". But in case of public headers, this structures can be written 
by one implementation and read by another.

>> BTW, I'm very interested how it can be guaranteed that structures
>> defined in xen.h will have the same size and alignment on both sides of
>> communication channel, taking into account, then we rely only on C89
>> standard.
> 
> I don't understand this comment.
See my reply above. There absolutely no guarantees, that MSVC compiler 
will not add extra padding somewhere. This will impair interoperability.
I think, this is why (amongst other reasons) WinAPI structures has 
dwSize as the first field. And this is why _IO* macros in Linux kernel 
use sizeof() to create ioctl number.
But, as I can see, XEN code checks only magic or version.

[1] http://port70.net/~nsz/c/c89/c89-draft.html#A.6.3.8
Jan Beulich Aug. 31, 2017, 12:53 p.m. UTC | #9
>>> On 31.08.17 at 14:24, <volodymyr_babchuk@epam.com> wrote:
> Hi Jan,
> 
> On 31.08.17 10:34, Jan Beulich wrote:
>>>>> On 30.08.17 at 18:20, <volodymyr_babchuk@epam.com> wrote:
>>> My first intention was to declare union with all possible
>>> representations, so it would be possible to access the same UUID as an
>>> array of bytes or, for example, as Microsoft GUID. Like this:
>>>
>>> typedef union {
>>>       /* UUID represented as a 128-bit object */
>>>       uint8_t obj[16];
>>>
>>>       /* Representation according to RFC 4122 */
>>>       struct {
>>>           __be32  time_low;
>>>           __be16  time_mid;
>>>           __be16  time_hi_and_version;
>>>           __u8    clock_seq_hi_and_reserved;
>>>           __u8    clock_seq_low;
>>>           __u8    node[6];
>>>       } rfc4122;
>>>
>>>       /* Microsoft/Intel style GUID representation */
>>>       struct {
>>>           __le32  Data1;
>>>           __le16  Data2;
>>>           __le16  Data3;
>>>           __u8    Data4[8];
>>>       } guid;
>>>
>>>       /* SMCCC compatible format */
>>>       struct {
>>>           __le32 r0;
>>>           __le32 r1;
>>>           __le32 r2;
>>>           __le32 r3;
>>>       } smccc;
>>> } xen_uuid_t;
>>>
>>>
>>> But looks like we can't use something like __packed or
>>> __attribute__((__packed__)) in the public header. This means that we
>>> can't rely on right overlapping and users of this union should take care
>>> to read and write only to one chosen substructure.
>> 
>> I don't see any use of that attribute in the structure definition
>> above, nor any need to add one - all fields are suitably aligned
>> anyway. You can't use __be* and __le* types in the public
>> headers, though - these will need to be uint*_t.
> This is a public header. As I understand it can be used by different 
> compilers (gcc, icc, msvc, llvm, etc...). C89 have no restrictions to 
> padding or alignment of fields in structures. No one can guarantee that
> 
> sizeof(xen_uuid_t.rfc422) == sizeof(xen_uuid_t.guid) == 
> sizeof(xen_uuid_t.smccc)  == 16
> 
> On all platforms. Using any compiler. With any compiler options.
> 
> This is implementation defined ([1]). Standard says "This should present 
> no problem unless binary data written by one implementation are read by 
> another.". But in case of public headers, this structures can be written 
> by one implementation and read by another.

My reference to C89 was to tell you what language constructs
you're allowed to use. For binary layout, conventions also
matter (like gABI and processor specific ABIs). Without that we
wouldn't be able to write any C header in compatible manner.
What helps us greatly is that we're not needing interfaces for
cross-host communication - the entire public interface assumes
that producer and consumer run on the same physical system
(or, for the parts concerning migration, on similar ones).

Jan
Volodymyr Babchuk Aug. 31, 2017, 1:21 p.m. UTC | #10
On 31.08.17 15:53, Jan Beulich wrote:
>>>> On 31.08.17 at 14:24, <volodymyr_babchuk@epam.com> wrote:
>> Hi Jan,
>>
>> On 31.08.17 10:34, Jan Beulich wrote:
>>>>>> On 30.08.17 at 18:20, <volodymyr_babchuk@epam.com> wrote:
>>>> My first intention was to declare union with all possible
>>>> representations, so it would be possible to access the same UUID as an
>>>> array of bytes or, for example, as Microsoft GUID. Like this:
>>>>
>>>> typedef union {
>>>>        /* UUID represented as a 128-bit object */
>>>>        uint8_t obj[16];
>>>>
>>>>        /* Representation according to RFC 4122 */
>>>>        struct {
>>>>            __be32  time_low;
>>>>            __be16  time_mid;
>>>>            __be16  time_hi_and_version;
>>>>            __u8    clock_seq_hi_and_reserved;
>>>>            __u8    clock_seq_low;
>>>>            __u8    node[6];
>>>>        } rfc4122;
>>>>
>>>>        /* Microsoft/Intel style GUID representation */
>>>>        struct {
>>>>            __le32  Data1;
>>>>            __le16  Data2;
>>>>            __le16  Data3;
>>>>            __u8    Data4[8];
>>>>        } guid;
>>>>
>>>>        /* SMCCC compatible format */
>>>>        struct {
>>>>            __le32 r0;
>>>>            __le32 r1;
>>>>            __le32 r2;
>>>>            __le32 r3;
>>>>        } smccc;
>>>> } xen_uuid_t;
>>>>
>>>>
>>>> But looks like we can't use something like __packed or
>>>> __attribute__((__packed__)) in the public header. This means that we
>>>> can't rely on right overlapping and users of this union should take care
>>>> to read and write only to one chosen substructure.
>>>
>>> I don't see any use of that attribute in the structure definition
>>> above, nor any need to add one - all fields are suitably aligned
>>> anyway. You can't use __be* and __le* types in the public
>>> headers, though - these will need to be uint*_t.
>> This is a public header. As I understand it can be used by different
>> compilers (gcc, icc, msvc, llvm, etc...). C89 have no restrictions to
>> padding or alignment of fields in structures. No one can guarantee that
>>
>> sizeof(xen_uuid_t.rfc422) == sizeof(xen_uuid_t.guid) ==
>> sizeof(xen_uuid_t.smccc)  == 16
>>
>> On all platforms. Using any compiler. With any compiler options.
>>
>> This is implementation defined ([1]). Standard says "This should present
>> no problem unless binary data written by one implementation are read by
>> another.". But in case of public headers, this structures can be written
>> by one implementation and read by another.
> 
> My reference to C89 was to tell you what language constructs
> you're allowed to use. For binary layout, conventions also
> matter (like gABI and processor specific ABIs). Without that we
> wouldn't be able to write any C header in compatible manner.
> What helps us greatly is that we're not needing interfaces for
> cross-host communication - the entire public interface assumes
> that producer and consumer run on the same physical system
> (or, for the parts concerning migration, on similar ones).

So, will it be acceptable to use my approach with that union?

Do you have any ideas how to indicate endianess of the fields, then? I 
can just write it in the comments. But I fear of misuse.
Ian Jackson Aug. 31, 2017, 2:34 p.m. UTC | #11
Volodymyr Babchuk writes ("Re: [PATCH v4 03/11] public: xen.h: add definitions for UUID handling"):
> Do you have any ideas how to indicate endianess of the fields, then? I 
> can just write it in the comments. But I fear of misuse.

I definitely prefer your approach of providing only an array.  (It
needs to be in a struct for typesafety, as discussed.)  At least that
way the semantics are clear.

No-one is likely to want to access these individual fields.  If they
do so they are doing something very wrong.

Ian.
Jan Beulich Aug. 31, 2017, 3:12 p.m. UTC | #12
>>> On 31.08.17 at 15:21, <volodymyr_babchuk@epam.com> wrote:
> So, will it be acceptable to use my approach with that union?

As per Ian's reply, go with just the containerized uint8_t[].

Jan
diff mbox

Patch

diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 2ac6b1e..d1a4765 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -930,6 +930,15 @@  __DEFINE_XEN_GUEST_HANDLE(uint16, uint16_t);
 __DEFINE_XEN_GUEST_HANDLE(uint32, uint32_t);
 __DEFINE_XEN_GUEST_HANDLE(uint64, uint64_t);
 
+typedef uint8_t xen_uuid_t[16];
+#define XEN_DEFINE_UUID(a, b, c, d, e1, e2, e3, e4, e5, e6)             \
+    {((a) >> 24) & 0xFF, ((a) >> 16) & 0xFF,                            \
+     ((a) >>  8) & 0xFF, ((a) >>  0) & 0xFF,                            \
+     ((b) >>  8) & 0xFF, ((b) >>  0) & 0xFF,                            \
+     ((c) >>  8) & 0xFF, ((c) >>  0) & 0xFF,                            \
+     ((d) >>  8) & 0xFF, ((d) >>  0) & 0xFF,                            \
+            e1, e2, e3, e4, e5, e6}
+
 #endif /* !__ASSEMBLY__ */
 
 /* Default definitions for macros used by domctl/sysctl. */