diff mbox series

[v3,02/10] xen/version: Introduce non-truncating deterministically-signed XENVER_* subops

Message ID 20230815210650.2735671-3-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series Non-truncating XENVER_* subops | expand

Commit Message

Andrew Cooper Aug. 15, 2023, 9:06 p.m. UTC
Recently in XenServer, we have encountered problems caused by both
XENVER_extraversion and XENVER_commandline having fixed bounds.

More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
array, and using an unqualified 'char' which has implementation-specific
signed-ness.

Provide brand new ops, which are capable of expressing variable length
strings, and mark the older ops as broken.

This fixes all issues around XENVER_extraversion being longer than 15 chars.
Further work beyond just this API is needed to remove other assumptions about
XENVER_commandline being 1023 chars long.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Daniel De Graaf <dgdegra@tycho.nsa.gov>
CC: Daniel Smith <dpsmith@apertussolutions.com>
CC: Jason Andryuk <jandryuk@gmail.com>
CC: Henry Wang <Henry.Wang@arm.com>

v3:
 * Modify dummy.h's xsm_xen_version() in the same way as flask.
v2:
 * Remove xen_capabilities_info_t from the stack now that arch_get_xen_caps()
   has gone.
 * Use an arbitrary limit check much lower than INT_MAX.
 * Use "buf" rather than "string" terminology.
 * Expand the API comment.

Tested by forcing XENVER_extraversion to be 20 chars long, and confirming that
an untruncated version can be obtained.
---
 xen/common/kernel.c          | 62 +++++++++++++++++++++++++++++++++++
 xen/include/public/version.h | 63 ++++++++++++++++++++++++++++++++++--
 xen/include/xlat.lst         |  1 +
 xen/include/xsm/dummy.h      |  3 ++
 xen/xsm/flask/hooks.c        |  4 +++
 5 files changed, 131 insertions(+), 2 deletions(-)

Comments

Daniel P. Smith Aug. 15, 2023, 10:32 p.m. UTC | #1
On 8/15/23 17:06, Andrew Cooper wrote:
> Recently in XenServer, we have encountered problems caused by both
> XENVER_extraversion and XENVER_commandline having fixed bounds.
> 
> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
> array, and using an unqualified 'char' which has implementation-specific
> signed-ness.
> 
> Provide brand new ops, which are capable of expressing variable length
> strings, and mark the older ops as broken.
> 
> This fixes all issues around XENVER_extraversion being longer than 15 chars.
> Further work beyond just this API is needed to remove other assumptions about
> XENVER_commandline being 1023 chars long.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>

Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Stefano Stabellini Aug. 16, 2023, 12:19 a.m. UTC | #2
On Tue, 15 Aug 2023, Andrew Cooper wrote:
> Recently in XenServer, we have encountered problems caused by both
> XENVER_extraversion and XENVER_commandline having fixed bounds.
> 
> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
> array, and using an unqualified 'char' which has implementation-specific
> signed-ness.
> 
> Provide brand new ops, which are capable of expressing variable length
> strings, and mark the older ops as broken.
> 
> This fixes all issues around XENVER_extraversion being longer than 15 chars.
> Further work beyond just this API is needed to remove other assumptions about
> XENVER_commandline being 1023 chars long.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> CC: Daniel Smith <dpsmith@apertussolutions.com>
> CC: Jason Andryuk <jandryuk@gmail.com>
> CC: Henry Wang <Henry.Wang@arm.com>
> 
> v3:
>  * Modify dummy.h's xsm_xen_version() in the same way as flask.
> v2:
>  * Remove xen_capabilities_info_t from the stack now that arch_get_xen_caps()
>    has gone.
>  * Use an arbitrary limit check much lower than INT_MAX.
>  * Use "buf" rather than "string" terminology.
>  * Expand the API comment.
> 
> Tested by forcing XENVER_extraversion to be 20 chars long, and confirming that
> an untruncated version can be obtained.
> ---
>  xen/common/kernel.c          | 62 +++++++++++++++++++++++++++++++++++
>  xen/include/public/version.h | 63 ++++++++++++++++++++++++++++++++++--
>  xen/include/xlat.lst         |  1 +
>  xen/include/xsm/dummy.h      |  3 ++
>  xen/xsm/flask/hooks.c        |  4 +++
>  5 files changed, 131 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/kernel.c b/xen/common/kernel.c
> index f822480a8ef3..79c008c7ee5f 100644
> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -24,6 +24,7 @@
>  CHECK_build_id;
>  CHECK_compile_info;
>  CHECK_feature_info;
> +CHECK_varbuf;
>  #endif
>  
>  enum system_state system_state = SYS_STATE_early_boot;
> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
>  __initcall(param_init);
>  #endif
>  
> +static long xenver_varbuf_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct xen_varbuf user_str;
> +    const char *str = NULL;
> +    size_t sz;
> +
> +    switch ( cmd )
> +    {
> +    case XENVER_extraversion2:
> +        str = xen_extra_version();
> +        break;
> +
> +    case XENVER_changeset2:
> +        str = xen_changeset();
> +        break;
> +
> +    case XENVER_commandline2:
> +        str = saved_cmdline;
> +        break;
> +
> +    case XENVER_capabilities2:
> +        str = xen_cap_info;
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        return -ENODATA;
> +    }
> +
> +    sz = strlen(str);
> +
> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
> +        return -E2BIG;

Realistically do we want this buffer to cross page boundaries? We could
use KB(4) here or even KB(4)-4 (size of len).



> +    if ( guest_handle_is_null(arg) ) /* Length request */
> +        return sz;
> +
> +    if ( copy_from_guest(&user_str, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( user_str.len == 0 )
> +        return -EINVAL;
> +
> +    if ( sz > user_str.len )
> +        return -ENOBUFS;
> +
> +    if ( copy_to_guest_offset(arg, offsetof(struct xen_varbuf, buf),
> +                              str, sz) )
> +        return -EFAULT;
> +
> +    return sz;
> +}
> +
>  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      bool_t deny = !!xsm_xen_version(XSM_OTHER, cmd);
> @@ -711,6 +765,14 @@ long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          return sz;
>      }
> +
> +    case XENVER_extraversion2:
> +    case XENVER_capabilities2:
> +    case XENVER_changeset2:
> +    case XENVER_commandline2:
> +        if ( deny )
> +            return -EPERM;
> +        return xenver_varbuf_op(cmd, arg);
>      }
>  
>      return -ENOSYS;
> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
> index cbc4ef7a46e6..0dd6bbcb43cc 100644
> --- a/xen/include/public/version.h
> +++ b/xen/include/public/version.h
> @@ -19,12 +19,20 @@
>  /* arg == NULL; returns major:minor (16:16). */
>  #define XENVER_version      0
>  
> -/* arg == xen_extraversion_t. */
> +/*
> + * arg == xen_extraversion_t.
> + *
> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.

Like Jan and Julien I also don't like the word "broken" especially
without explanation of why it is broken next to it.

Instead, I would say:

"XENVER_extraversion is deprecated. Please use XENVER_extraversion2."

If you want to convey the message that the API has problems, then I would
say:

"XENVER_extraversion might cause truncation. Please use XENVER_extraversion2."

Or even:

"XENVER_extraversion has problems. Please use XENVER_extraversion2."



> + */
>  #define XENVER_extraversion 1
>  typedef char xen_extraversion_t[16];
>  #define XEN_EXTRAVERSION_LEN (sizeof(xen_extraversion_t))
>  
> -/* arg == xen_compile_info_t. */
> +/*
> + * arg == xen_compile_info_t.
> + *
> + * This API/ABI is broken and truncates data.

"XENVER_compile_info is deprecated and can truncate data."


> + */
>  #define XENVER_compile_info 2
>  struct xen_compile_info {
>      char compiler[64];
> @@ -34,10 +42,20 @@ struct xen_compile_info {
>  };
>  typedef struct xen_compile_info xen_compile_info_t;
>  
> +/*
> + * arg == xen_capabilities_info_t.
> + *
> + * This API/ABI is broken.  Use XENVER_capabilities2 where possible.

"XENVER_capabilities is deprecated. Please use XENVER_capabilities2."


> + */
>  #define XENVER_capabilities 3
>  typedef char xen_capabilities_info_t[1024];
>  #define XEN_CAPABILITIES_INFO_LEN (sizeof(xen_capabilities_info_t))
>  
> +/*
> + * arg == xen_changeset_info_t.
> + *
> + * This API/ABI is broken.  Use XENVER_changeset2 where possible.

"XENVER_changeset is deprecated. Please use XENVER_changeset2."


> + */
>  #define XENVER_changeset 4
>  typedef char xen_changeset_info_t[64];
>  #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
> @@ -95,6 +113,11 @@ typedef struct xen_feature_info xen_feature_info_t;
>   */
>  #define XENVER_guest_handle 8
>  
> +/*
> + * arg == xen_commandline_t.
> + *
> + * This API/ABI is broken.  Use XENVER_commandline2 where possible.

"XENVER_commandline is deprecated. Please use XENVER_commandline2."


> + */
>  #define XENVER_commandline 9
>  typedef char xen_commandline_t[1024];
>  
> @@ -110,6 +133,42 @@ struct xen_build_id {
>  };
>  typedef struct xen_build_id xen_build_id_t;
>  
> +/*
> + * Container for an arbitrary variable length buffer.
> + */
> +struct xen_varbuf {
> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */

I realize that you just copied struct xen_build_id but I recall from
MISRA C training that we should use plain "char" for strings for good
reasons, not "unsigned char"?

If this is meant to be generic data, I think we should use uint8_t
instead.


> +};
> +typedef struct xen_varbuf xen_varbuf_t;
> +
> +/*
> + * arg == xen_varbuf_t
> + *
> + * Equivalent to the original ops, but with a non-truncating API/ABI.
> + *
> + * These hypercalls can fail for a number of reasons.  All callers must handle
> + * -XEN_xxx return values appropriately.
> + *
> + * Passing arg == NULL is a request for size, which will be signalled with a
> + * non-negative return value.  Note: a return size of 0 may be legitimate for
> + * the requested subop.
> + *
> + * Otherwise, the input xen_varbuf_t provides the size of the following
> + * buffer.  Xen will fill the buffer, and return the number of bytes written
> + * (e.g. if the input buffer was longer than necessary).
> + *
> + * Some subops may return binary data.  Some subops may be expected to return
> + * textural data.  These are returned without a NUL terminator, and while the
> + * contents is expected to be ASCII/UTF-8, Xen makes no guarentees to this
> + * effect.  e.g. Xen has no control over the formatting used for the command
> + * line.
> + */
> +#define XENVER_extraversion2 11
> +#define XENVER_capabilities2 12
> +#define XENVER_changeset2    13
> +#define XENVER_commandline2  14
> +
>  #endif /* __XEN_PUBLIC_VERSION_H__ */
>  
>  /*
Andrew Cooper Aug. 16, 2023, 2:58 a.m. UTC | #3
On 16/08/2023 1:19 am, Stefano Stabellini wrote:
> On Tue, 15 Aug 2023, Andrew Cooper wrote:
>> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
>>
>> +    sz = strlen(str);
>> +
>> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
>> +        return -E2BIG;
> Realistically do we want this buffer to cross page boundaries?

A 1-byte answer can cross a page boundary, even if the hypercall
argument is correctly aligned (and even that is unspecified in the Xen ABI).

But importantly, this series is also prep work to fixing the cmdline
limit.  1k is already causing problems, and 64k is a whole lot less bad
than 4k when we enter such corner cases.

>> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>> index cbc4ef7a46e6..0dd6bbcb43cc 100644
>> --- a/xen/include/public/version.h
>> +++ b/xen/include/public/version.h
>> @@ -19,12 +19,20 @@
>>  /* arg == NULL; returns major:minor (16:16). */
>>  #define XENVER_version      0
>>  
>> -/* arg == xen_extraversion_t. */
>> +/*
>> + * arg == xen_extraversion_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
> Like Jan and Julien I also don't like the word "broken" especially
> without explanation of why it is broken next to it.

The word "broken" is an accurate and appropriate word in the English
language to describe the situation.  I'm sorry you don't like it, but
unless any of you can articulate why you think it is inappropriate
phraseology, the complaint is unactionable.

Specifically ...

> Instead, I would say:
>
> "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."

... depreciated is misleading.

It would be acceptable for a case where we'd introduced a foo2 to add a
new feature to the interface, but we're being forced to make the new
interface to fix two bugs and a mis-feature in existing interface.

> If you want to convey the message that the API has problems, then I would
> say:
>
> "XENVER_extraversion might cause truncation. Please use XENVER_extraversion2."
>
> Or even:
>
> "XENVER_extraversion has problems. Please use XENVER_extraversion2."

If "broken" is too nondescript, then "might" is bad too and "has
problems" is out of the question.


There is a partial description of the ABI problems in the comment block
beside the new ops.  I could be persuaded to extend it to be a full list
of the ABI breakages.

These header files are a succinct technical reference for proficient
programmers to interact with Xen.  They are not a tutorial on writing C,
nor are they appropriate places to sentences of no useful value.

Through this series, I have done the hard work of updating the
commonly-used interfaces such that downstreams can stop working around
real problems caused by real errors in these APIs.  For everyone else
re-syncing the headers, it is important that the message come across as
an instruction and not a suggestion ...

... People will probably ignore it irrespective, but that's then firmly
on them, and not on Xen trying to downplay problems in the public interface.

>> + */
>>  #define XENVER_commandline 9
>>  typedef char xen_commandline_t[1024];
>>  
>> @@ -110,6 +133,42 @@ struct xen_build_id {
>>  };
>>  typedef struct xen_build_id xen_build_id_t;
>>  
>> +/*
>> + * Container for an arbitrary variable length buffer.
>> + */
>> +struct xen_varbuf {
>> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
>> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
> I realize that you just copied struct xen_build_id

No - it was originally an ambiguously-signed char in v1.  It became
unsigned through review.

But being unsigned char is relevant to the non-ABI-changingness of later
patches in the series.

> but I recall from
> MISRA C training that we should use plain "char" for strings for good
> reasons, not "unsigned char"?

I don't recall anything to that effect, nor can I see anything obvious
when scanning through the standard.

MISRA can't plausibly prohibit the use of char for arbitrary data.  It's
the one and only thing the C spec states is safe for the task.

~Andrew
Jan Beulich Aug. 16, 2023, 6:39 a.m. UTC | #4
On 16.08.2023 04:58, Andrew Cooper wrote:
> On 16/08/2023 1:19 am, Stefano Stabellini wrote:
>> On Tue, 15 Aug 2023, Andrew Cooper wrote:
>>> --- a/xen/include/public/version.h
>>> +++ b/xen/include/public/version.h
>>> @@ -19,12 +19,20 @@
>>>  /* arg == NULL; returns major:minor (16:16). */
>>>  #define XENVER_version      0
>>>  
>>> -/* arg == xen_extraversion_t. */
>>> +/*
>>> + * arg == xen_extraversion_t.
>>> + *
>>> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
>> Like Jan and Julien I also don't like the word "broken" especially
>> without explanation of why it is broken next to it.
> 
> The word "broken" is an accurate and appropriate word in the English
> language to describe the situation.  I'm sorry you don't like it, but
> unless any of you can articulate why you think it is inappropriate
> phraseology, the complaint is unactionable.

That's simply not true. A wording change is very well possible, and
what you regard as "broken" may not be viewed as such by others. IOW
while "broken" may be "an accurate and appropriate word in the English
language to describe" your perspective of the situation, it may not be
for others. Take the extraversion case: It was clear from the
beginning that it means to limit what to use as XEN_EXTRAVERSION. No
bug, just a limitation. Similarly for the command line: Besides the
full command line (supposedly) being of use for informational purposes
only anyway (and the full one can be taken from the log), no-one could
ever predict at that time that we'd accumulate such a massive amount
of command line options. Again a limitation (and yes, I understand
that the information may disappear from the ring buffer, so "xl dmesg"
after the system was up for a while may not have that data anymore,
yet of course an agent in the system could make the boot messaged
persistent), but not really a bug.

> Specifically ...
> 
>> Instead, I would say:
>>
>> "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."
> 
> ... depreciated is misleading.
> 
> It would be acceptable for a case where we'd introduced a foo2 to add a
> new feature to the interface, but we're being forced to make the new
> interface to fix two bugs and a mis-feature in existing interface.

See above. The existing interfaces are sufficient for the common case.
The extended versions therefore are an enhancement, allowing to call
the original ones deprecated, but not really anything more.

>> If you want to convey the message that the API has problems, then I would
>> say:
>>
>> "XENVER_extraversion might cause truncation. Please use XENVER_extraversion2."
>>
>> Or even:
>>
>> "XENVER_extraversion has problems. Please use XENVER_extraversion2."
> 
> If "broken" is too nondescript, then "might" is bad too and "has
> problems" is out of the question.
> 
> 
> There is a partial description of the ABI problems in the comment block
> beside the new ops.  I could be persuaded to extend it to be a full list
> of the ABI breakages.
> 
> These header files are a succinct technical reference for proficient
> programmers to interact with Xen.  They are not a tutorial on writing C,
> nor are they appropriate places to sentences of no useful value.
> 
> Through this series, I have done the hard work of updating the
> commonly-used interfaces such that downstreams can stop working around
> real problems caused by real errors in these APIs.

Or, as I view it, working around limitations of the original ABI. Hence
a new feature, not so much a bug fix.

I'm afraid the main reason why this series hasn't made it in yet is your
apparent unwillingness to accept that other people may look at things
differently than you do, resulting in you not being willing to make a
compromise on the chosen wording. While you may view this as others
blocking your work, I don't think it realistically can be called this
way.

Jan
George Dunlap Aug. 16, 2023, 9:55 a.m. UTC | #5
On Wed, Aug 16, 2023 at 3:58 AM Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 16/08/2023 1:19 am, Stefano Stabellini wrote:
> > On Tue, 15 Aug 2023, Andrew Cooper wrote:
>
> >> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
> >> index cbc4ef7a46e6..0dd6bbcb43cc 100644
> >> --- a/xen/include/public/version.h
> >> +++ b/xen/include/public/version.h
> >> @@ -19,12 +19,20 @@
> >>  /* arg == NULL; returns major:minor (16:16). */
> >>  #define XENVER_version      0
> >>
> >> -/* arg == xen_extraversion_t. */
> >> +/*
> >> + * arg == xen_extraversion_t.
> >> + *
> >> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
> > Like Jan and Julien I also don't like the word "broken" especially
> > without explanation of why it is broken next to it.
>
> The word "broken" is an accurate and appropriate word in the English
> language to describe the situation.  I'm sorry you don't like it, but
> unless any of you can articulate why you think it is inappropriate
> phraseology, the complaint is unactionable.
>
> Specifically ...
>
> > Instead, I would say:
> >
> > "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."
>
> ... depreciated is misleading.
>

Deprecated means, "It is the official position of the developers of the
project that for the moment, you *can* use it, but you *shouldn't*"; it
also implies that support for it might go away at some point.  The fact
that we're saying "you shouldn't use it" by itself implies that it is
lacking somehow.  It's a factual statement that gives you useful
information.

Neither "broken" nor "has problems" actually tell you anything above
"deprecated", other than the opinion of the developer writing the
documentation; and they are both (to different levels) emotionally
charged.  You don't deprecate things unless there's something wrong with
them.  In this particular case, I don't think anyone has an emotional
attachment to the existing hypercall, so nobody is going to be insulted;
but it's a good habit to avoid it.  (See [1] for more about this.)

[1] http://xenbits.xenproject.org/governance/communication-practice.html ,
the "Avoid inflammatory and negatively charged language" section

If I have a backlog of a million things to do, how do I prioritize
switching to and testing extraversion2?  The situation as I understand it
is: "If it works for you now with the current value you're fine, but it may
break later when you change it, in a way that's obvious".  In that
situation, you can safely put off fixing it until your build breaks.  If,
on the other hand, the situation is "It may randomly work or not work with
any given build", or "It may seem to work for you now but is actually
failing in a not-very-obvious way", then you probably need to prioritize
fixing it.

Saying "May cause truncation" gives you some the information you need to
know. "Will silently truncate past N characters" gives you even more.

We should at very least say it's deprecated.  If we can summarize the
issues briefly, then that would be helpful.

 -George
Daniel P. Smith Aug. 16, 2023, 11:21 a.m. UTC | #6
---- On Wed, 16 Aug 2023 05:55:07 -0400 George Dunlap  wrote ---

 > 
 > 
 > On Wed, Aug 16, 2023 at 3:58 AM Andrew Cooper andrew.cooper3@citrix.com> wrote:
 > On 16/08/2023 1:19 am, Stefano Stabellini wrote:
 >  > On Tue, 15 Aug 2023, Andrew Cooper wrote:
 > 
 >  >> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
 >  >> index cbc4ef7a46e6..0dd6bbcb43cc 100644
 >  >> --- a/xen/include/public/version.h
 >  >> +++ b/xen/include/public/version.h
 >  >> @@ -19,12 +19,20 @@
 >  >>  /* arg == NULL; returns major:minor (16:16). */
 >  >>  #define XENVER_version      0
 >  >>  
 >  >> -/* arg == xen_extraversion_t. */
 >  >> +/*
 >  >> + * arg == xen_extraversion_t.
 >  >> + *
 >  >> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
 >  > Like Jan and Julien I also don't like the word "broken" especially
 >  > without explanation of why it is broken next to it.
 >  
 >  The word "broken" is an accurate and appropriate word in the English
 >  language to describe the situation.  I'm sorry you don't like it, but
 >  unless any of you can articulate why you think it is inappropriate
 >  phraseology, the complaint is unactionable.
 >  
 >  Specifically ...
 >  
 >  > Instead, I would say:
 >  >
 >  > "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."
 >  
 >  ... depreciated is misleading.
 > 
 > Deprecated means, "It is the official position of the developers of the project that for the moment, you *can* use it, but you *shouldn't*"; it also implies that support for it might go away at some point.  The fact that we're saying "you shouldn't use it" by itself implies that it is lacking somehow.  It's a factual statement that gives you useful information.
 > 
 > Neither "broken" nor "has problems" actually tell you anything above "deprecated", other than the opinion of the developer writing the documentation; and they are both (to different levels) emotionally charged.  You don't deprecate things unless there's something wrong with them.  In this particular case, I don't think anyone has an emotional attachment to the existing hypercall, so nobody is going to be insulted; but it's a good habit to avoid it.  (See [1] for more about this.)

With all due respect George but deprecated and broken are fundamentally different. Deprecation means that an interface is being retired in the future for any number of reasons and that it can continue to be used until its retirement without risk of unintended consequences. Labeling an interface as broken, which has been acceptable to do so in other much larger communities, is a stronger sentiment that the interface should stop being used immediately because it can lead to unintended consequences, not because it will be retired in some distant future.

 > [1] http://xenbits.xenproject.org/governance/communication-practice.html , the "Avoid inflammatory and negatively charged language" section
 > 
 > If I have a backlog of a million things to do, how do I prioritize switching to and testing extraversion2?  The situation as I understand it is: "If it works for you now with the current value you're fine, but it may break later when you change it, in a way that's obvious".  In that situation, you can safely put off fixing it until your build breaks.  If, on the other hand, the situation is "It may randomly work or not work with any given build", or "It may seem to work for you now but is actually failing in a not-very-obvious way", then you probably need to prioritize fixing it.
 > 
 > Saying "May cause truncation" gives you some the information you need to know. "Will silently truncate past N characters" gives you even more.  
 >  
 > We should at very least say it's deprecated.  If we can summarize the issues briefly, then that would be helpful.

In general I would concur here, except that these are externally consumed files for which the consuming projects just need to be told don't use this interface. The why is not so important to them as it is to the Xen community. But for the latter, any one working on Xen code can do a git blame on the comment line to find the originating commit and read the commit message for the details as to why.

v/r,
dps
Stefano Stabellini Aug. 16, 2023, 7:07 p.m. UTC | #7
On Wed, 16 Aug 2023, Andrew Cooper wrote:
> On 16/08/2023 1:19 am, Stefano Stabellini wrote:
> > On Tue, 15 Aug 2023, Andrew Cooper wrote:
> >> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
> >>
> >> +    sz = strlen(str);
> >> +
> >> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
> >> +        return -E2BIG;
> > Realistically do we want this buffer to cross page boundaries?
> 
> A 1-byte answer can cross a page boundary, even if the hypercall
> argument is correctly aligned (and even that is unspecified in the Xen ABI).
> 
> But importantly, this series is also prep work to fixing the cmdline
> limit.  1k is already causing problems, and 64k is a whole lot less bad
> than 4k when we enter such corner cases.

OK. It is just that 64K seems *a lot* in this context. But if you have
reasons to believe that 64K is a good number for this check then OK.


> >> + */
> >>  #define XENVER_commandline 9
> >>  typedef char xen_commandline_t[1024];
> >>  
> >> @@ -110,6 +133,42 @@ struct xen_build_id {
> >>  };
> >>  typedef struct xen_build_id xen_build_id_t;
> >>  
> >> +/*
> >> + * Container for an arbitrary variable length buffer.
> >> + */
> >> +struct xen_varbuf {
> >> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
> >> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
> > I realize that you just copied struct xen_build_id
> 
> No - it was originally an ambiguously-signed char in v1.  It became
> unsigned through review.
> 
> But being unsigned char is relevant to the non-ABI-changingness of later
> patches in the series.
> 
> > but I recall from
> > MISRA C training that we should use plain "char" for strings for good
> > reasons, not "unsigned char"?
> 
> I don't recall anything to that effect, nor can I see anything obvious
> when scanning through the standard.
> 
> MISRA can't plausibly prohibit the use of char for arbitrary data.  It's
> the one and only thing the C spec states is safe for the task.

After this morning's call with Roberto, I take this comment back.
unsigned char is OK from MISRA POV.
Stefano Stabellini Aug. 16, 2023, 7:09 p.m. UTC | #8
On Wed, 16 Aug 2023, George Dunlap wrote:
> On Wed, Aug 16, 2023 at 3:58 AM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>       On 16/08/2023 1:19 am, Stefano Stabellini wrote:
>       > On Tue, 15 Aug 2023, Andrew Cooper wrote:
> 
>       >> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>       >> index cbc4ef7a46e6..0dd6bbcb43cc 100644
>       >> --- a/xen/include/public/version.h
>       >> +++ b/xen/include/public/version.h
>       >> @@ -19,12 +19,20 @@
>       >>  /* arg == NULL; returns major:minor (16:16). */
>       >>  #define XENVER_version      0
>       >> 
>       >> -/* arg == xen_extraversion_t. */
>       >> +/*
>       >> + * arg == xen_extraversion_t.
>       >> + *
>       >> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
>       > Like Jan and Julien I also don't like the word "broken" especially
>       > without explanation of why it is broken next to it.
> 
>       The word "broken" is an accurate and appropriate word in the English
>       language to describe the situation.  I'm sorry you don't like it, but
>       unless any of you can articulate why you think it is inappropriate
>       phraseology, the complaint is unactionable.
> 
>       Specifically ...
> 
>       > Instead, I would say:
>       >
>       > "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."
> 
>       ... depreciated is misleading.
> 
> 
> Deprecated means, "It is the official position of the developers of the project that for the moment, you *can* use it, but you
> *shouldn't*"; it also implies that support for it might go away at some point.  The fact that we're saying "you shouldn't use it" by itself
> implies that it is lacking somehow.  It's a factual statement that gives you useful information.
> 
> Neither "broken" nor "has problems" actually tell you anything above "deprecated", other than the opinion of the developer writing the
> documentation; and they are both (to different levels) emotionally charged.  You don't deprecate things unless there's something wrong with
> them.  In this particular case, I don't think anyone has an emotional attachment to the existing hypercall, so nobody is going to be
> insulted; but it's a good habit to avoid it.  (See [1] for more about this.)
> 
> [1] http://xenbits.xenproject.org/governance/communication-practice.html , the "Avoid inflammatory and negatively charged language" section
> 
> If I have a backlog of a million things to do, how do I prioritize switching to and testing extraversion2?  The situation as I understand
> it is: "If it works for you now with the current value you're fine, but it may break later when you change it, in a way that's obvious". 
> In that situation, you can safely put off fixing it until your build breaks.  If, on the other hand, the situation is "It may randomly work
> or not work with any given build", or "It may seem to work for you now but is actually failing in a not-very-obvious way", then you
> probably need to prioritize fixing it.
> 
> Saying "May cause truncation" gives you some the information you need to know. "Will silently truncate past N characters" gives you even
> more.  
>  
> We should at very least say it's deprecated.  If we can summarize the issues briefly, then that would be helpful.

I strongly agree with this: we should say it is deprecated and if we can
summarize the issues briefly that's even better.
Stefano Stabellini Aug. 16, 2023, 7:18 p.m. UTC | #9
On Wed, 16 Aug 2023, Daniel Smith wrote:
>  > On Wed, Aug 16, 2023 at 3:58 AM Andrew Cooper andrew.cooper3@citrix.com> wrote:
>  > On 16/08/2023 1:19 am, Stefano Stabellini wrote:
>  >  > On Tue, 15 Aug 2023, Andrew Cooper wrote:
>  > 
>  >  >> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>  >  >> index cbc4ef7a46e6..0dd6bbcb43cc 100644
>  >  >> --- a/xen/include/public/version.h
>  >  >> +++ b/xen/include/public/version.h
>  >  >> @@ -19,12 +19,20 @@
>  >  >>  /* arg == NULL; returns major:minor (16:16). */
>  >  >>  #define XENVER_version      0
>  >  >>  
>  >  >> -/* arg == xen_extraversion_t. */
>  >  >> +/*
>  >  >> + * arg == xen_extraversion_t.
>  >  >> + *
>  >  >> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
>  >  > Like Jan and Julien I also don't like the word "broken" especially
>  >  > without explanation of why it is broken next to it.
>  >  
>  >  The word "broken" is an accurate and appropriate word in the English
>  >  language to describe the situation.  I'm sorry you don't like it, but
>  >  unless any of you can articulate why you think it is inappropriate
>  >  phraseology, the complaint is unactionable.
>  >  
>  >  Specifically ...
>  >  
>  >  > Instead, I would say:
>  >  >
>  >  > "XENVER_extraversion is deprecated. Please use XENVER_extraversion2."
>  >  
>  >  ... depreciated is misleading.
>  > 
>  > Deprecated means, "It is the official position of the developers of the project that for the moment, you *can* use it, but you *shouldn't*"; it also implies that support for it might go away at some point.  The fact that we're saying "you shouldn't use it" by itself implies that it is lacking somehow.  It's a factual statement that gives you useful information.
>  > 
>  > Neither "broken" nor "has problems" actually tell you anything above "deprecated", other than the opinion of the developer writing the documentation; and they are both (to different levels) emotionally charged.  You don't deprecate things unless there's something wrong with them.  In this particular case, I don't think anyone has an emotional attachment to the existing hypercall, so nobody is going to be insulted; but it's a good habit to avoid it.  (See [1] for more about this.)
> 
> With all due respect George but deprecated and broken are fundamentally different. Deprecation means that an interface is being retired in the future for any number of reasons and that it can continue to be used until its retirement without risk of unintended consequences. Labeling an interface as broken, which has been acceptable to do so in other much larger communities, is a stronger sentiment that the interface should stop being used immediately because it can lead to unintended consequences, not because it will be retired in some distant future.

No matter what the Merriam Webster would say for the word "deprecated"
and "broken", each individual "absorbs" each word and their meanings
differently.

There are 6 people involved in this discussion including Andrew, and 4
out of 6 didn't react well to the usage of the word "broken" in this
context. For sure in Andrew's mind "broken" triggers the right
meaning and expectations straight away. But for Julien, Jan, George and
me it did not.

We cannot run a representative survey of the developers population to
know which exact word triggers the most appropriate reaction in this
case. Even if we could, it would not be worth the effort.

Instead, I suggest to just go with the majority interpretation among the
participant in this thread and drop "broken" in favor of one of the
alternatives. Such as something along these lines:

"XENVER_extraversion is deprecated and causes truncation. Please use
XENVER_extraversion2.
Andrew Cooper Aug. 16, 2023, 7:25 p.m. UTC | #10
On 16/08/2023 8:07 pm, Stefano Stabellini wrote:
> On Wed, 16 Aug 2023, Andrew Cooper wrote:
>> On 16/08/2023 1:19 am, Stefano Stabellini wrote:
>>> On Tue, 15 Aug 2023, Andrew Cooper wrote:
>>>> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
>>>>
>>>> +    sz = strlen(str);
>>>> +
>>>> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
>>>> +        return -E2BIG;
>>> Realistically do we want this buffer to cross page boundaries?
>> A 1-byte answer can cross a page boundary, even if the hypercall
>> argument is correctly aligned (and even that is unspecified in the Xen ABI).
>>
>> But importantly, this series is also prep work to fixing the cmdline
>> limit.  1k is already causing problems, and 64k is a whole lot less bad
>> than 4k when we enter such corner cases.
> OK. It is just that 64K seems *a lot* in this context. But if you have
> reasons to believe that 64K is a good number for this check then OK.

Remember that this is all Xen cross-checking (or reporting to the user)
the length of Xen-owned data.

If the system was only booted only 10 bytes of cmdline, this will only
be 10.

If OTOH there really was 5k of data, handing E2BIG to the user when they
ask for it is slightly rude but is infinitely better than just
truncating it and returning success.

64k is just the upper bounds sanity check for "get a developer to look
at what's going on because this probably isn't what you thought it
was".  Eventually when someone does find a legitimate use for a cmdline
longer than 64k, it will need upping, someone (else) can figure out the
long-running-ness of this scenario.

~Andrew
Stefano Stabellini Nov. 30, 2023, 10:30 p.m. UTC | #11
Hi everyone following this thread,

please see:
https://marc.info/?l=xen-devel&m=170135718323946
https://cryptpad.fr/form/#/2/form/view/7ByH95Vd7KiDOvN4wjV5iUGlMuZbkVdwk7cYpZdluWo/

For a vote on the usage of the word "broken"


On Tue, 15 Aug 2023, Andrew Cooper wrote:
> Recently in XenServer, we have encountered problems caused by both
> XENVER_extraversion and XENVER_commandline having fixed bounds.
> 
> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
> array, and using an unqualified 'char' which has implementation-specific
> signed-ness.
> 
> Provide brand new ops, which are capable of expressing variable length
> strings, and mark the older ops as broken.
> 
> This fixes all issues around XENVER_extraversion being longer than 15 chars.
> Further work beyond just this API is needed to remove other assumptions about
> XENVER_commandline being 1023 chars long.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
> ---
> CC: George Dunlap <George.Dunlap@eu.citrix.com>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Wei Liu <wl@xen.org>
> CC: Julien Grall <julien@xen.org>
> CC: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> CC: Daniel Smith <dpsmith@apertussolutions.com>
> CC: Jason Andryuk <jandryuk@gmail.com>
> CC: Henry Wang <Henry.Wang@arm.com>
> 
> v3:
>  * Modify dummy.h's xsm_xen_version() in the same way as flask.
> v2:
>  * Remove xen_capabilities_info_t from the stack now that arch_get_xen_caps()
>    has gone.
>  * Use an arbitrary limit check much lower than INT_MAX.
>  * Use "buf" rather than "string" terminology.
>  * Expand the API comment.
> 
> Tested by forcing XENVER_extraversion to be 20 chars long, and confirming that
> an untruncated version can be obtained.
> ---
>  xen/common/kernel.c          | 62 +++++++++++++++++++++++++++++++++++
>  xen/include/public/version.h | 63 ++++++++++++++++++++++++++++++++++--
>  xen/include/xlat.lst         |  1 +
>  xen/include/xsm/dummy.h      |  3 ++
>  xen/xsm/flask/hooks.c        |  4 +++
>  5 files changed, 131 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/kernel.c b/xen/common/kernel.c
> index f822480a8ef3..79c008c7ee5f 100644
> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -24,6 +24,7 @@
>  CHECK_build_id;
>  CHECK_compile_info;
>  CHECK_feature_info;
> +CHECK_varbuf;
>  #endif
>  
>  enum system_state system_state = SYS_STATE_early_boot;
> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
>  __initcall(param_init);
>  #endif
>  
> +static long xenver_varbuf_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct xen_varbuf user_str;
> +    const char *str = NULL;
> +    size_t sz;
> +
> +    switch ( cmd )
> +    {
> +    case XENVER_extraversion2:
> +        str = xen_extra_version();
> +        break;
> +
> +    case XENVER_changeset2:
> +        str = xen_changeset();
> +        break;
> +
> +    case XENVER_commandline2:
> +        str = saved_cmdline;
> +        break;
> +
> +    case XENVER_capabilities2:
> +        str = xen_cap_info;
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        return -ENODATA;
> +    }
> +
> +    sz = strlen(str);
> +
> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
> +        return -E2BIG;
> +
> +    if ( guest_handle_is_null(arg) ) /* Length request */
> +        return sz;
> +
> +    if ( copy_from_guest(&user_str, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( user_str.len == 0 )
> +        return -EINVAL;
> +
> +    if ( sz > user_str.len )
> +        return -ENOBUFS;
> +
> +    if ( copy_to_guest_offset(arg, offsetof(struct xen_varbuf, buf),
> +                              str, sz) )
> +        return -EFAULT;
> +
> +    return sz;
> +}
> +
>  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      bool_t deny = !!xsm_xen_version(XSM_OTHER, cmd);
> @@ -711,6 +765,14 @@ long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          return sz;
>      }
> +
> +    case XENVER_extraversion2:
> +    case XENVER_capabilities2:
> +    case XENVER_changeset2:
> +    case XENVER_commandline2:
> +        if ( deny )
> +            return -EPERM;
> +        return xenver_varbuf_op(cmd, arg);
>      }
>  
>      return -ENOSYS;
> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
> index cbc4ef7a46e6..0dd6bbcb43cc 100644
> --- a/xen/include/public/version.h
> +++ b/xen/include/public/version.h
> @@ -19,12 +19,20 @@
>  /* arg == NULL; returns major:minor (16:16). */
>  #define XENVER_version      0
>  
> -/* arg == xen_extraversion_t. */
> +/*
> + * arg == xen_extraversion_t.
> + *
> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
> + */
>  #define XENVER_extraversion 1
>  typedef char xen_extraversion_t[16];
>  #define XEN_EXTRAVERSION_LEN (sizeof(xen_extraversion_t))
>  
> -/* arg == xen_compile_info_t. */
> +/*
> + * arg == xen_compile_info_t.
> + *
> + * This API/ABI is broken and truncates data.
> + */
>  #define XENVER_compile_info 2
>  struct xen_compile_info {
>      char compiler[64];
> @@ -34,10 +42,20 @@ struct xen_compile_info {
>  };
>  typedef struct xen_compile_info xen_compile_info_t;
>  
> +/*
> + * arg == xen_capabilities_info_t.
> + *
> + * This API/ABI is broken.  Use XENVER_capabilities2 where possible.
> + */
>  #define XENVER_capabilities 3
>  typedef char xen_capabilities_info_t[1024];
>  #define XEN_CAPABILITIES_INFO_LEN (sizeof(xen_capabilities_info_t))
>  
> +/*
> + * arg == xen_changeset_info_t.
> + *
> + * This API/ABI is broken.  Use XENVER_changeset2 where possible.
> + */
>  #define XENVER_changeset 4
>  typedef char xen_changeset_info_t[64];
>  #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
> @@ -95,6 +113,11 @@ typedef struct xen_feature_info xen_feature_info_t;
>   */
>  #define XENVER_guest_handle 8
>  
> +/*
> + * arg == xen_commandline_t.
> + *
> + * This API/ABI is broken.  Use XENVER_commandline2 where possible.
> + */
>  #define XENVER_commandline 9
>  typedef char xen_commandline_t[1024];
>  
> @@ -110,6 +133,42 @@ struct xen_build_id {
>  };
>  typedef struct xen_build_id xen_build_id_t;
>  
> +/*
> + * Container for an arbitrary variable length buffer.
> + */
> +struct xen_varbuf {
> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
> +};
> +typedef struct xen_varbuf xen_varbuf_t;
> +
> +/*
> + * arg == xen_varbuf_t
> + *
> + * Equivalent to the original ops, but with a non-truncating API/ABI.
> + *
> + * These hypercalls can fail for a number of reasons.  All callers must handle
> + * -XEN_xxx return values appropriately.
> + *
> + * Passing arg == NULL is a request for size, which will be signalled with a
> + * non-negative return value.  Note: a return size of 0 may be legitimate for
> + * the requested subop.
> + *
> + * Otherwise, the input xen_varbuf_t provides the size of the following
> + * buffer.  Xen will fill the buffer, and return the number of bytes written
> + * (e.g. if the input buffer was longer than necessary).
> + *
> + * Some subops may return binary data.  Some subops may be expected to return
> + * textural data.  These are returned without a NUL terminator, and while the
> + * contents is expected to be ASCII/UTF-8, Xen makes no guarentees to this
> + * effect.  e.g. Xen has no control over the formatting used for the command
> + * line.
> + */
> +#define XENVER_extraversion2 11
> +#define XENVER_capabilities2 12
> +#define XENVER_changeset2    13
> +#define XENVER_commandline2  14
> +
>  #endif /* __XEN_PUBLIC_VERSION_H__ */
>  
>  /*
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index 9c41948514bf..a61ba85ed0ca 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -173,6 +173,7 @@
>  ?	build_id                        version.h
>  ?	compile_info                    version.h
>  ?	feature_info                    version.h
> +?	varbuf                          version.h
>  ?	xenoprof_init			xenoprof.h
>  ?	xenoprof_passive		xenoprof.h
>  ?	flask_access			xsm/flask_op.h
> diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
> index 8671af1ba4d3..a4a920f74e6e 100644
> --- a/xen/include/xsm/dummy.h
> +++ b/xen/include/xsm/dummy.h
> @@ -828,9 +828,12 @@ static XSM_INLINE int cf_check xsm_xen_version(XSM_DEFAULT_ARG uint32_t op)
>          block_speculation();
>          return 0;
>      case XENVER_extraversion:
> +    case XENVER_extraversion2:
>      case XENVER_compile_info:
>      case XENVER_capabilities:
> +    case XENVER_capabilities2:
>      case XENVER_changeset:
> +    case XENVER_changeset2:
>      case XENVER_pagesize:
>      case XENVER_guest_handle:
>          /* These MUST always be accessible to any guest by default. */
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 78225f68c15c..a671dcd0322e 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -1777,15 +1777,18 @@ static int cf_check flask_xen_version(uint32_t op)
>          /* These sub-ops ignore the permission checks and return data. */
>          return 0;
>      case XENVER_extraversion:
> +    case XENVER_extraversion2:
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_EXTRAVERSION, NULL);
>      case XENVER_compile_info:
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_COMPILE_INFO, NULL);
>      case XENVER_capabilities:
> +    case XENVER_capabilities2:
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_CAPABILITIES, NULL);
>      case XENVER_changeset:
> +    case XENVER_changeset2:
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_CHANGESET, NULL);
>      case XENVER_pagesize:
> @@ -1795,6 +1798,7 @@ static int cf_check flask_xen_version(uint32_t op)
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_GUEST_HANDLE, NULL);
>      case XENVER_commandline:
> +    case XENVER_commandline2:
>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>                              VERSION__XEN_COMMANDLINE, NULL);
>      case XENVER_build_id:
> -- 
> 2.30.2
>
Jan Beulich Dec. 1, 2023, 7:04 a.m. UTC | #12
On 30.11.2023 23:30, Stefano Stabellini wrote:
> Hi everyone following this thread,
> 
> please see:
> https://marc.info/?l=xen-devel&m=170135718323946
> https://cryptpad.fr/form/#/2/form/view/7ByH95Vd7KiDOvN4wjV5iUGlMuZbkVdwk7cYpZdluWo/
> 
> For a vote on the usage of the word "broken"

So I did vote before becoming aware of this context. I would have voted
differently if I had known that this _alone_ is the context. Yet then
I'm also not going to change my vote, because as written _there_ it is
intended to be more general. If the wording of the text describing what
to vote on changed, things would be different.

Jan

> On Tue, 15 Aug 2023, Andrew Cooper wrote:
>> Recently in XenServer, we have encountered problems caused by both
>> XENVER_extraversion and XENVER_commandline having fixed bounds.
>>
>> More than just the invariant size, the APIs/ABIs also broken by typedef-ing an
>> array, and using an unqualified 'char' which has implementation-specific
>> signed-ness.
>>
>> Provide brand new ops, which are capable of expressing variable length
>> strings, and mark the older ops as broken.
>>
>> This fixes all issues around XENVER_extraversion being longer than 15 chars.
>> Further work beyond just this API is needed to remove other assumptions about
>> XENVER_commandline being 1023 chars long.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
>> ---
>> CC: George Dunlap <George.Dunlap@eu.citrix.com>
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Wei Liu <wl@xen.org>
>> CC: Julien Grall <julien@xen.org>
>> CC: Daniel De Graaf <dgdegra@tycho.nsa.gov>
>> CC: Daniel Smith <dpsmith@apertussolutions.com>
>> CC: Jason Andryuk <jandryuk@gmail.com>
>> CC: Henry Wang <Henry.Wang@arm.com>
>>
>> v3:
>>  * Modify dummy.h's xsm_xen_version() in the same way as flask.
>> v2:
>>  * Remove xen_capabilities_info_t from the stack now that arch_get_xen_caps()
>>    has gone.
>>  * Use an arbitrary limit check much lower than INT_MAX.
>>  * Use "buf" rather than "string" terminology.
>>  * Expand the API comment.
>>
>> Tested by forcing XENVER_extraversion to be 20 chars long, and confirming that
>> an untruncated version can be obtained.
>> ---
>>  xen/common/kernel.c          | 62 +++++++++++++++++++++++++++++++++++
>>  xen/include/public/version.h | 63 ++++++++++++++++++++++++++++++++++--
>>  xen/include/xlat.lst         |  1 +
>>  xen/include/xsm/dummy.h      |  3 ++
>>  xen/xsm/flask/hooks.c        |  4 +++
>>  5 files changed, 131 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/common/kernel.c b/xen/common/kernel.c
>> index f822480a8ef3..79c008c7ee5f 100644
>> --- a/xen/common/kernel.c
>> +++ b/xen/common/kernel.c
>> @@ -24,6 +24,7 @@
>>  CHECK_build_id;
>>  CHECK_compile_info;
>>  CHECK_feature_info;
>> +CHECK_varbuf;
>>  #endif
>>  
>>  enum system_state system_state = SYS_STATE_early_boot;
>> @@ -498,6 +499,59 @@ static int __init cf_check param_init(void)
>>  __initcall(param_init);
>>  #endif
>>  
>> +static long xenver_varbuf_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>> +{
>> +    struct xen_varbuf user_str;
>> +    const char *str = NULL;
>> +    size_t sz;
>> +
>> +    switch ( cmd )
>> +    {
>> +    case XENVER_extraversion2:
>> +        str = xen_extra_version();
>> +        break;
>> +
>> +    case XENVER_changeset2:
>> +        str = xen_changeset();
>> +        break;
>> +
>> +    case XENVER_commandline2:
>> +        str = saved_cmdline;
>> +        break;
>> +
>> +    case XENVER_capabilities2:
>> +        str = xen_cap_info;
>> +        break;
>> +
>> +    default:
>> +        ASSERT_UNREACHABLE();
>> +        return -ENODATA;
>> +    }
>> +
>> +    sz = strlen(str);
>> +
>> +    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
>> +        return -E2BIG;
>> +
>> +    if ( guest_handle_is_null(arg) ) /* Length request */
>> +        return sz;
>> +
>> +    if ( copy_from_guest(&user_str, arg, 1) )
>> +        return -EFAULT;
>> +
>> +    if ( user_str.len == 0 )
>> +        return -EINVAL;
>> +
>> +    if ( sz > user_str.len )
>> +        return -ENOBUFS;
>> +
>> +    if ( copy_to_guest_offset(arg, offsetof(struct xen_varbuf, buf),
>> +                              str, sz) )
>> +        return -EFAULT;
>> +
>> +    return sz;
>> +}
>> +
>>  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  {
>>      bool_t deny = !!xsm_xen_version(XSM_OTHER, cmd);
>> @@ -711,6 +765,14 @@ long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  
>>          return sz;
>>      }
>> +
>> +    case XENVER_extraversion2:
>> +    case XENVER_capabilities2:
>> +    case XENVER_changeset2:
>> +    case XENVER_commandline2:
>> +        if ( deny )
>> +            return -EPERM;
>> +        return xenver_varbuf_op(cmd, arg);
>>      }
>>  
>>      return -ENOSYS;
>> diff --git a/xen/include/public/version.h b/xen/include/public/version.h
>> index cbc4ef7a46e6..0dd6bbcb43cc 100644
>> --- a/xen/include/public/version.h
>> +++ b/xen/include/public/version.h
>> @@ -19,12 +19,20 @@
>>  /* arg == NULL; returns major:minor (16:16). */
>>  #define XENVER_version      0
>>  
>> -/* arg == xen_extraversion_t. */
>> +/*
>> + * arg == xen_extraversion_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
>> + */
>>  #define XENVER_extraversion 1
>>  typedef char xen_extraversion_t[16];
>>  #define XEN_EXTRAVERSION_LEN (sizeof(xen_extraversion_t))
>>  
>> -/* arg == xen_compile_info_t. */
>> +/*
>> + * arg == xen_compile_info_t.
>> + *
>> + * This API/ABI is broken and truncates data.
>> + */
>>  #define XENVER_compile_info 2
>>  struct xen_compile_info {
>>      char compiler[64];
>> @@ -34,10 +42,20 @@ struct xen_compile_info {
>>  };
>>  typedef struct xen_compile_info xen_compile_info_t;
>>  
>> +/*
>> + * arg == xen_capabilities_info_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_capabilities2 where possible.
>> + */
>>  #define XENVER_capabilities 3
>>  typedef char xen_capabilities_info_t[1024];
>>  #define XEN_CAPABILITIES_INFO_LEN (sizeof(xen_capabilities_info_t))
>>  
>> +/*
>> + * arg == xen_changeset_info_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_changeset2 where possible.
>> + */
>>  #define XENVER_changeset 4
>>  typedef char xen_changeset_info_t[64];
>>  #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
>> @@ -95,6 +113,11 @@ typedef struct xen_feature_info xen_feature_info_t;
>>   */
>>  #define XENVER_guest_handle 8
>>  
>> +/*
>> + * arg == xen_commandline_t.
>> + *
>> + * This API/ABI is broken.  Use XENVER_commandline2 where possible.
>> + */
>>  #define XENVER_commandline 9
>>  typedef char xen_commandline_t[1024];
>>  
>> @@ -110,6 +133,42 @@ struct xen_build_id {
>>  };
>>  typedef struct xen_build_id xen_build_id_t;
>>  
>> +/*
>> + * Container for an arbitrary variable length buffer.
>> + */
>> +struct xen_varbuf {
>> +    uint32_t len;                          /* IN:  size of buf[] in bytes. */
>> +    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
>> +};
>> +typedef struct xen_varbuf xen_varbuf_t;
>> +
>> +/*
>> + * arg == xen_varbuf_t
>> + *
>> + * Equivalent to the original ops, but with a non-truncating API/ABI.
>> + *
>> + * These hypercalls can fail for a number of reasons.  All callers must handle
>> + * -XEN_xxx return values appropriately.
>> + *
>> + * Passing arg == NULL is a request for size, which will be signalled with a
>> + * non-negative return value.  Note: a return size of 0 may be legitimate for
>> + * the requested subop.
>> + *
>> + * Otherwise, the input xen_varbuf_t provides the size of the following
>> + * buffer.  Xen will fill the buffer, and return the number of bytes written
>> + * (e.g. if the input buffer was longer than necessary).
>> + *
>> + * Some subops may return binary data.  Some subops may be expected to return
>> + * textural data.  These are returned without a NUL terminator, and while the
>> + * contents is expected to be ASCII/UTF-8, Xen makes no guarentees to this
>> + * effect.  e.g. Xen has no control over the formatting used for the command
>> + * line.
>> + */
>> +#define XENVER_extraversion2 11
>> +#define XENVER_capabilities2 12
>> +#define XENVER_changeset2    13
>> +#define XENVER_commandline2  14
>> +
>>  #endif /* __XEN_PUBLIC_VERSION_H__ */
>>  
>>  /*
>> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
>> index 9c41948514bf..a61ba85ed0ca 100644
>> --- a/xen/include/xlat.lst
>> +++ b/xen/include/xlat.lst
>> @@ -173,6 +173,7 @@
>>  ?	build_id                        version.h
>>  ?	compile_info                    version.h
>>  ?	feature_info                    version.h
>> +?	varbuf                          version.h
>>  ?	xenoprof_init			xenoprof.h
>>  ?	xenoprof_passive		xenoprof.h
>>  ?	flask_access			xsm/flask_op.h
>> diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
>> index 8671af1ba4d3..a4a920f74e6e 100644
>> --- a/xen/include/xsm/dummy.h
>> +++ b/xen/include/xsm/dummy.h
>> @@ -828,9 +828,12 @@ static XSM_INLINE int cf_check xsm_xen_version(XSM_DEFAULT_ARG uint32_t op)
>>          block_speculation();
>>          return 0;
>>      case XENVER_extraversion:
>> +    case XENVER_extraversion2:
>>      case XENVER_compile_info:
>>      case XENVER_capabilities:
>> +    case XENVER_capabilities2:
>>      case XENVER_changeset:
>> +    case XENVER_changeset2:
>>      case XENVER_pagesize:
>>      case XENVER_guest_handle:
>>          /* These MUST always be accessible to any guest by default. */
>> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
>> index 78225f68c15c..a671dcd0322e 100644
>> --- a/xen/xsm/flask/hooks.c
>> +++ b/xen/xsm/flask/hooks.c
>> @@ -1777,15 +1777,18 @@ static int cf_check flask_xen_version(uint32_t op)
>>          /* These sub-ops ignore the permission checks and return data. */
>>          return 0;
>>      case XENVER_extraversion:
>> +    case XENVER_extraversion2:
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_EXTRAVERSION, NULL);
>>      case XENVER_compile_info:
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_COMPILE_INFO, NULL);
>>      case XENVER_capabilities:
>> +    case XENVER_capabilities2:
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_CAPABILITIES, NULL);
>>      case XENVER_changeset:
>> +    case XENVER_changeset2:
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_CHANGESET, NULL);
>>      case XENVER_pagesize:
>> @@ -1795,6 +1798,7 @@ static int cf_check flask_xen_version(uint32_t op)
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_GUEST_HANDLE, NULL);
>>      case XENVER_commandline:
>> +    case XENVER_commandline2:
>>          return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
>>                              VERSION__XEN_COMMANDLINE, NULL);
>>      case XENVER_build_id:
>> -- 
>> 2.30.2
>>
diff mbox series

Patch

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index f822480a8ef3..79c008c7ee5f 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -24,6 +24,7 @@ 
 CHECK_build_id;
 CHECK_compile_info;
 CHECK_feature_info;
+CHECK_varbuf;
 #endif
 
 enum system_state system_state = SYS_STATE_early_boot;
@@ -498,6 +499,59 @@  static int __init cf_check param_init(void)
 __initcall(param_init);
 #endif
 
+static long xenver_varbuf_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct xen_varbuf user_str;
+    const char *str = NULL;
+    size_t sz;
+
+    switch ( cmd )
+    {
+    case XENVER_extraversion2:
+        str = xen_extra_version();
+        break;
+
+    case XENVER_changeset2:
+        str = xen_changeset();
+        break;
+
+    case XENVER_commandline2:
+        str = saved_cmdline;
+        break;
+
+    case XENVER_capabilities2:
+        str = xen_cap_info;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        return -ENODATA;
+    }
+
+    sz = strlen(str);
+
+    if ( sz > KB(64) ) /* Arbitrary limit.  Avoid long-running operations. */
+        return -E2BIG;
+
+    if ( guest_handle_is_null(arg) ) /* Length request */
+        return sz;
+
+    if ( copy_from_guest(&user_str, arg, 1) )
+        return -EFAULT;
+
+    if ( user_str.len == 0 )
+        return -EINVAL;
+
+    if ( sz > user_str.len )
+        return -ENOBUFS;
+
+    if ( copy_to_guest_offset(arg, offsetof(struct xen_varbuf, buf),
+                              str, sz) )
+        return -EFAULT;
+
+    return sz;
+}
+
 long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     bool_t deny = !!xsm_xen_version(XSM_OTHER, cmd);
@@ -711,6 +765,14 @@  long do_xen_version(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         return sz;
     }
+
+    case XENVER_extraversion2:
+    case XENVER_capabilities2:
+    case XENVER_changeset2:
+    case XENVER_commandline2:
+        if ( deny )
+            return -EPERM;
+        return xenver_varbuf_op(cmd, arg);
     }
 
     return -ENOSYS;
diff --git a/xen/include/public/version.h b/xen/include/public/version.h
index cbc4ef7a46e6..0dd6bbcb43cc 100644
--- a/xen/include/public/version.h
+++ b/xen/include/public/version.h
@@ -19,12 +19,20 @@ 
 /* arg == NULL; returns major:minor (16:16). */
 #define XENVER_version      0
 
-/* arg == xen_extraversion_t. */
+/*
+ * arg == xen_extraversion_t.
+ *
+ * This API/ABI is broken.  Use XENVER_extraversion2 where possible.
+ */
 #define XENVER_extraversion 1
 typedef char xen_extraversion_t[16];
 #define XEN_EXTRAVERSION_LEN (sizeof(xen_extraversion_t))
 
-/* arg == xen_compile_info_t. */
+/*
+ * arg == xen_compile_info_t.
+ *
+ * This API/ABI is broken and truncates data.
+ */
 #define XENVER_compile_info 2
 struct xen_compile_info {
     char compiler[64];
@@ -34,10 +42,20 @@  struct xen_compile_info {
 };
 typedef struct xen_compile_info xen_compile_info_t;
 
+/*
+ * arg == xen_capabilities_info_t.
+ *
+ * This API/ABI is broken.  Use XENVER_capabilities2 where possible.
+ */
 #define XENVER_capabilities 3
 typedef char xen_capabilities_info_t[1024];
 #define XEN_CAPABILITIES_INFO_LEN (sizeof(xen_capabilities_info_t))
 
+/*
+ * arg == xen_changeset_info_t.
+ *
+ * This API/ABI is broken.  Use XENVER_changeset2 where possible.
+ */
 #define XENVER_changeset 4
 typedef char xen_changeset_info_t[64];
 #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
@@ -95,6 +113,11 @@  typedef struct xen_feature_info xen_feature_info_t;
  */
 #define XENVER_guest_handle 8
 
+/*
+ * arg == xen_commandline_t.
+ *
+ * This API/ABI is broken.  Use XENVER_commandline2 where possible.
+ */
 #define XENVER_commandline 9
 typedef char xen_commandline_t[1024];
 
@@ -110,6 +133,42 @@  struct xen_build_id {
 };
 typedef struct xen_build_id xen_build_id_t;
 
+/*
+ * Container for an arbitrary variable length buffer.
+ */
+struct xen_varbuf {
+    uint32_t len;                          /* IN:  size of buf[] in bytes. */
+    unsigned char buf[XEN_FLEX_ARRAY_DIM]; /* OUT: requested data.         */
+};
+typedef struct xen_varbuf xen_varbuf_t;
+
+/*
+ * arg == xen_varbuf_t
+ *
+ * Equivalent to the original ops, but with a non-truncating API/ABI.
+ *
+ * These hypercalls can fail for a number of reasons.  All callers must handle
+ * -XEN_xxx return values appropriately.
+ *
+ * Passing arg == NULL is a request for size, which will be signalled with a
+ * non-negative return value.  Note: a return size of 0 may be legitimate for
+ * the requested subop.
+ *
+ * Otherwise, the input xen_varbuf_t provides the size of the following
+ * buffer.  Xen will fill the buffer, and return the number of bytes written
+ * (e.g. if the input buffer was longer than necessary).
+ *
+ * Some subops may return binary data.  Some subops may be expected to return
+ * textural data.  These are returned without a NUL terminator, and while the
+ * contents is expected to be ASCII/UTF-8, Xen makes no guarentees to this
+ * effect.  e.g. Xen has no control over the formatting used for the command
+ * line.
+ */
+#define XENVER_extraversion2 11
+#define XENVER_capabilities2 12
+#define XENVER_changeset2    13
+#define XENVER_commandline2  14
+
 #endif /* __XEN_PUBLIC_VERSION_H__ */
 
 /*
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9c41948514bf..a61ba85ed0ca 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -173,6 +173,7 @@ 
 ?	build_id                        version.h
 ?	compile_info                    version.h
 ?	feature_info                    version.h
+?	varbuf                          version.h
 ?	xenoprof_init			xenoprof.h
 ?	xenoprof_passive		xenoprof.h
 ?	flask_access			xsm/flask_op.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 8671af1ba4d3..a4a920f74e6e 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -828,9 +828,12 @@  static XSM_INLINE int cf_check xsm_xen_version(XSM_DEFAULT_ARG uint32_t op)
         block_speculation();
         return 0;
     case XENVER_extraversion:
+    case XENVER_extraversion2:
     case XENVER_compile_info:
     case XENVER_capabilities:
+    case XENVER_capabilities2:
     case XENVER_changeset:
+    case XENVER_changeset2:
     case XENVER_pagesize:
     case XENVER_guest_handle:
         /* These MUST always be accessible to any guest by default. */
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 78225f68c15c..a671dcd0322e 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1777,15 +1777,18 @@  static int cf_check flask_xen_version(uint32_t op)
         /* These sub-ops ignore the permission checks and return data. */
         return 0;
     case XENVER_extraversion:
+    case XENVER_extraversion2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_EXTRAVERSION, NULL);
     case XENVER_compile_info:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_COMPILE_INFO, NULL);
     case XENVER_capabilities:
+    case XENVER_capabilities2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_CAPABILITIES, NULL);
     case XENVER_changeset:
+    case XENVER_changeset2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_CHANGESET, NULL);
     case XENVER_pagesize:
@@ -1795,6 +1798,7 @@  static int cf_check flask_xen_version(uint32_t op)
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_GUEST_HANDLE, NULL);
     case XENVER_commandline:
+    case XENVER_commandline2:
         return avc_has_perm(dsid, SECINITSID_XEN, SECCLASS_VERSION,
                             VERSION__XEN_COMMANDLINE, NULL);
     case XENVER_build_id: