Message ID | alpine.DEB.2.22.394.2403141516550.853156@ubuntu-linux-20-04-desktop (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] docs/misra: document the expected sizes of integer types |
On 14.03.2024 23:17, Stefano Stabellini wrote:
> Xen makes assumptions about the size of integer types on the various
> architectures. Document these assumptions.

My prior reservation wrt exact vs minimum sizes remains.

Additionally, is it really meaningful to document x86-32 as an architecture, when it's been many years that the hypervisor cannot be built anymore for that target? If it's not (just) the hypervisor build that's intended to be covered here (the file living under docs/misra/, after all), can that further purpose please be mentioned?

Jan
On Fri, 15 Mar 2024, Jan Beulich wrote:
> On 14.03.2024 23:17, Stefano Stabellini wrote:
> > Xen makes assumptions about the size of integer types on the various
> > architectures. Document these assumptions.
>
> My prior reservation wrt exact vs minimum sizes remains.

We have to specify the exact size. In practice the size is predetermined and exact with all our supported compilers given an architecture.

Most importantly, unfortunately we use non-fixed-size integer types in C hypercall entry points and public ABIs. In my opinion, that is not acceptable.

We have two options:

1) we go with this document, and we clarify that even if we specify "unsigned int", we actually mean a 32-bit integer

2) we change all our public ABIs and C hypercall entry points to use fixed-size types (e.g. s/unsigned int/uint32_t/g)

2) is preferred because it is clearer but it is more work. So I went with 1). I also thought you would like 1) more.

> Additionally, is it really meaningful to document x86-32 as an
> architecture, when it's been many years that the hypervisor cannot be
> built anymore for that target?

You are right. I should take x86_32 out. I'll do it in the next version.
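For illustration only (this is not part of the patch): a minimal C11 sketch of how the two options above might look in code. Option 1 keeps `unsigned int` and pins its width with a build-time check; option 2 puts the width into the type itself. `TYPE_BITS`, `example_flags_plain`, `example_flags_fixed` and `example_unsigned_int_bits` are made-up names, and the exact-width checks assume a target where `unsigned int` is 32 bits, as is the case for the compilers discussed in this thread.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>

/* Storage width of a type in bits. */
#define TYPE_BITS(t) (sizeof(t) * CHAR_BIT)

/* Option 1: keep plain C types, but record the exact width being assumed. */
static_assert(TYPE_BITS(unsigned short) == 16,
              "unsigned short assumed to be 16 bits");
static_assert(TYPE_BITS(unsigned int) == 32,
              "unsigned int assumed to be 32 bits");
static_assert(TYPE_BITS(unsigned long long) == 64,
              "unsigned long long assumed to be 64 bits");

/* Hypothetical interface value, written in the option 1 style. */
typedef unsigned int example_flags_plain;

/* The same value in the option 2 style: the width is part of the type. */
typedef uint32_t example_flags_fixed;

/* With the checks above in place, both spellings have the same size. */
static_assert(sizeof(example_flags_plain) == sizeof(example_flags_fixed),
              "plain and fixed-width spellings differ in size");

/* Returns the width, in bits, that this build uses for unsigned int. */
unsigned int example_unsigned_int_bits(void)
{
    return (unsigned int)TYPE_BITS(unsigned int);
}
```

On a target where the assumption does not hold, the build stops at the first failing `static_assert` instead of producing a binary with a silently different layout.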
On 16.03.2024 01:07, Stefano Stabellini wrote:
> On Fri, 15 Mar 2024, Jan Beulich wrote:
>> On 14.03.2024 23:17, Stefano Stabellini wrote:
>>> Xen makes assumptions about the size of integer types on the various
>>> architectures. Document these assumptions.
>>
>> My prior reservation wrt exact vs minimum sizes remains.
>
> We have to specify the exact size. In practice the size is predetermined
> and exact with all our supported compilers given an architecture.

But that's not the purpose of this document; if it was down to what compilers offer, we could refer to compiler documentation (and iirc we already do for various aspects). The purpose of this document, aiui, is to document assumptions we make in hypervisor code. And those should be >=, not ==.

> Most importantly, unfortunately we use non-fixed-size integer types in
> C hypercall entry points and public ABIs. In my opinion, that is not
> acceptable.

The problem is that I can't see the reason for you thinking so. The C entry points sit past assembly code doing (required to do) necessary adjustments, if any. If there was no assembly layer, whether to use fixed width types for such parameters would depend on what the architecture guarantees.

As to public ABIs - that's structure definitions, and I agree we ought to uniformly use fixed-width types there. We largely do; a few things still require fixing.

> We have two options:
>
> 1) we go with this document, and we clarify that even if we specify
> "unsigned int", we actually mean a 32-bit integer
>
> 2) we change all our public ABIs and C hypercall entry points to use
> fixed-size types (e.g. s/unsigned int/uint32_t/g)
>
> 2) is preferred because it is clearer but it is more work. So I went
> with 1). I also thought you would like 1) more.

For ABIs (i.e. structures) we ought to be making that change anyway. Leaving basic types in there is latently buggy.

I'm happy to see a document like this added, for the purpose described above. But to me 1) and 2) are largely independent of one another.

Jan
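A hedged sketch of the point about ABI structures, for readers following along: with fixed-width members and explicit padding the layout can be checked at build time, whereas a structure built from plain types follows whatever the compiler's ABI dictates. Both structures below are hypothetical and are not taken from the Xen public headers.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical ABI structure using fixed-width types throughout. */
struct example_abi_op {
    uint32_t cmd;   /* operation code */
    uint32_t pad;   /* explicit padding, so no compiler-chosen hole exists */
    uint64_t addr;  /* address carried as a fixed-width integer */
};

/* The layout is spelled out and verified when the code is compiled. */
static_assert(offsetof(struct example_abi_op, cmd) == 0, "cmd at byte 0");
static_assert(offsetof(struct example_abi_op, addr) == 8, "addr at byte 8");
static_assert(sizeof(struct example_abi_op) == 16, "structure is 16 bytes");

/*
 * The same fields with plain types. Typically this is 16 bytes on an LP64
 * target but 8 bytes on an ILP32 one, which is the latent bug.
 */
struct example_abi_op_plain {
    unsigned int cmd;
    unsigned long addr;
};

/* Size of the fixed-width variant as seen by this build. */
size_t example_abi_op_size(void)
{
    return sizeof(struct example_abi_op);
}
```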
On Mon, 18 Mar 2024, Jan Beulich wrote:
> On 16.03.2024 01:07, Stefano Stabellini wrote:
> > On Fri, 15 Mar 2024, Jan Beulich wrote:
> >> On 14.03.2024 23:17, Stefano Stabellini wrote:
> >>> Xen makes assumptions about the size of integer types on the various
> >>> architectures. Document these assumptions.
> >>
> >> My prior reservation wrt exact vs minimum sizes remains.
> >
> > We have to specify the exact size. In practice the size is predetermined
> > and exact with all our supported compilers given an architecture.
>
> But that's not the purpose of this document; if it was down to what
> compilers offer, we could refer to compiler documentation (and iirc we
> already do for various aspects). The purpose of this document, aiui,
> is to document assumptions we make in hypervisor code. And those should
> be >=, not ==.

Well... I guess the two of us are making different assumptions then :-)

Which is the reason why documenting assumptions is so important. More at the bottom.

> > Most importantly, unfortunately we use non-fixed-size integer types in
> > C hypercall entry points and public ABIs. In my opinion, that is not
> > acceptable.
>
> The problem is that I can't see the reason for you thinking so. The C
> entry points sit past assembly code doing (required to do) necessary
> adjustments, if any. If there was no assembly layer, whether to use
> fixed width types for such parameters would depend on what the
> architecture guarantees.

This could be the source of the disagreement. I see the little assembly code as not important; I consider it just a little trampoline. As we describe the hypercalls in C header files, I consider the C functions the "official" hypercall entry points.

Also, as this is an ABI, I consider it mandatory to use clear width definitions of all the types (whether with this document or with fixed-width types, and fixed-width types are clearer and better) in both the C header files that describe the ABI interfaces and the C entry points that correspond to them. E.g. I think we have to use the same types in both do_sched_op and the hypercall description in xen/include/public/sched.h

> As to public ABIs - that's structure definitions, and I agree we ought
> to uniformly use fixed-width types there. We largely do; a few things
> still require fixing.

+1

> > We have two options:
> >
> > 1) we go with this document, and we clarify that even if we specify
> > "unsigned int", we actually mean a 32-bit integer
> >
> > 2) we change all our public ABIs and C hypercall entry points to use
> > fixed-size types (e.g. s/unsigned int/uint32_t/g)
> >
> > 2) is preferred because it is clearer but it is more work. So I went
> > with 1). I also thought you would like 1) more.
>
> For ABIs (i.e. structures) we ought to be making that change anyway.
> Leaving basic types in there is latently buggy.

I am glad we agree :-)

It is just that I also consider the C hypercall entry points as part of the ABI

> I'm happy to see a document like this added, for the purpose described
> above. But to me 1) and 2) are largely independent of one another.

Good that you are also happy with a document like this.

The remaining question is: what about the rest of the C functions in Xen that are certainly not part of an ABI?

Those are less critical, but this document should still apply uniformly to them too. I don't understand why you are making the >= width assumption you mentioned at the top of the file when actually it is impossible to exercise or test this assumption on any compiler or any architecture that works with Xen. If it cannot be enabled, it hasn't been tested, and it probably won't work.
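To make the layering under debate a little more concrete, here is a small sketch. It is not Xen's actual entry path; `example_entry` merely stands in for whatever code sits between the guest and the C handler, and `example_do_op` is a hypothetical handler. Arguments arrive as register-width values, and the width the interface describes (32 bits here, by assumption) has to be applied somewhere before or at the handler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C handler: the interface describes cmd as a 32-bit value. */
static long example_do_op(uint32_t cmd, unsigned long arg)
{
    (void)arg;         /* unused in this sketch */
    return (long)cmd;  /* echo the command so the narrowing is observable */
}

/*
 * Hypothetical stand-in for the entry layer: arguments arrive as
 * register-width values and are narrowed explicitly before the handler runs.
 */
long example_entry(unsigned long reg0, unsigned long reg1)
{
    return example_do_op((uint32_t)reg0, reg1);
}
```

With a 64-bit `unsigned long`, a call such as `example_entry(0x100000005UL, 0)` hands `5` to the handler: the upper register bits are dropped by the explicit conversion. The question in the thread is whether that conversion is the handler's concern or the entry layer's.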
On 19.03.2024 04:37, Stefano Stabellini wrote: > On Mon, 18 Mar 2024, Jan Beulich wrote: >> On 16.03.2024 01:07, Stefano Stabellini wrote: >>> On Fri, 15 Mar 2024, Jan Beulich wrote: >>>> On 14.03.2024 23:17, Stefano Stabellini wrote: >>>>> Xen makes assumptions about the size of integer types on the various >>>>> architectures. Document these assumptions. >>>> >>>> My prior reservation wrt exact vs minimum sizes remains. >>> >>> We have to specify the exact size. In practice the size is predetermined >>> and exact with all our supported compilers given a architecture. >> >> But that's not the purpose of this document; if it was down to what >> compilers offer, we could refer to compiler documentation (and iirc we >> already do for various aspects). The purpose of this document, aiui, >> is to document assumption we make in hypervisor code. And those should >> be >=, not ==. > > Well... I guess the two of us are making different assumptions then :-) > > Which is the reason why documenting assumptions is so important. More at > the bottom. > > >>> Most importantly, unfortunately we use non-fixed-size integer types in >>> C hypercall entry points and public ABIs. In my opinion, that is not >>> acceptable. >> >> The problem is that I can't see the reason for you thinking so. The C >> entry points sit past assembly code doing (required to do) necessary >> adjustments, if any. If there was no assembly layer, whether to use >> fixed with types for such parameters would depend on what the >> architecture guarantees. > > This could be the source of the disagreement. I see the little assembly > code as not important, I consider it just like a little trampoline to > me. As we describe the hypercalls in C header files, I consider the C > functions the "official" hypercall entry points. Why would that be? Any code we execute in Xen is relevant. 
> Also, as this is an ABI, I consider mandatory to use clear width > definitions of all the types (whether with this document or with > fixed-width types, and fixed-width types are clearer and better) in both > the C header files that describe the ABI interfaces, as well as the C > entry points that corresponds to it. E.g. I think we have to use > the same types in both do_sched_op and the hypercall description in > xen/include/public/sched.h There are two entirely separate aspects to the ABI: One is what we document towards consumers of it. The other is entirely internal, i.e. an implementation detail - how we actually consume the data. Documenting fixed-width types towards consumers is probably okay, albeit (see below) imo still not strictly necessary (for being needlessly limiting). >> As to public ABIs - that's structure definitions, and I agree we ought >> to uniformly use fixed-width types there. We largely do; a few things >> still require fixing. > > +1 > > >>> We have two options: >>> >>> 1) we go with this document, and we clarify that even if we specify >>> "unsigned int", we actually mean a 32-bit integer >>> >>> 2) we change all our public ABIs and C hypercall entry points to use >>> fixed-size types (e.g. s/unsigned int/uint32_t/g) >>> >>> 2) is preferred because it is clearer but it is more work. So I went >>> with 1). I also thought you would like 1) more. >> >> For ABIs (i.e. structures) we ought to be making that change anyway. >> Leaving basic types in there is latently buggy. > > I am glad we agree :-) > > It is just that I also consinder the C hypercall entry points as part of > the ABI > > >> I'm happy to see a document like this added, for the purpose described >> above. But to me 1) and 2) and largely independent of one another. > > Good that you are also happy with a document like this. > > The remaining question is: what about the rest of the C functions in Xen > that are certainly not part of an ABI? 
As per above - anything internal isn't part of the ABI, C entry points for hypercall handlers included. All we need to ensure is that we consume the data according to what the ABI sets forth. To use wording from George when he criticized my supposed lack of actual arguments: While there's nothing technically wrong with using fixed width types there (or in fact everywhere), there's also nothing technically wrong with using plain C types there and almost everywhere else (ABI structures excluded). With both technically equal, ./CODING_STYLE has the only criteria to pick between the two. IOW that's what I view wrong in George's argumentation: Demanding that I provide technical arguments when the desire to use fixed width types for the purpose under discussion also isn't backed by any. > Those are less critical, still this document should apply uniformily to > them too. I don't understand why you are making the >= width assumption > you mentioned at the top of the file when actually it is impossible to > exercise or test this assumption on any compiler or any architecture > that works with Xen. If it cannot be enabled, it hasn't been tested, and > it probably won't work. Hmm, yes, that's one way to look at it. My perspective is different though: By writing down assumptions that are more strict than necessary, we'd be excluding ports to environments meeting the >= assumption, but not meeting the == one. Unless of course you can point me at any place where - not just by mistake / by being overly lax - we truly depend on the == that you want to put in place. IOW yes, there likely would need to be adjustments to code if such a port was to happen. Yet we shouldn't further harden requirements that were never meant to be there. Note that by writing down anything more strict than necessary, you'd also encourage people to further wrongly treat e.g. uint32_t and unsigned int as identical. 
Such wrong assumptions had been a severe hindrance in doing ports from 32- to 64-bit processors some 20 years ago. I would have hoped that we'd learn from such mistakes. Jan
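A small sketch of what treating `unsigned int` and `uint32_t` as identical can look like in practice. The function names are hypothetical. The first function wraps at 2^32 only where `unsigned int` happens to be exactly 32 bits wide; the second states the 32-bit wrap in its types and therefore behaves the same on a port with a wider `unsigned int`.

```c
#include <assert.h>
#include <stdint.h>

/* Wraps to 0 after 0xFFFFFFFF only if unsigned int is exactly 32 bits wide. */
unsigned int example_next_seq_plain(unsigned int seq)
{
    return seq + 1u;
}

/* Wraps to 0 after 0xFFFFFFFF regardless of how wide unsigned int is. */
uint32_t example_next_seq_fixed(uint32_t seq)
{
    return (uint32_t)(seq + UINT32_C(1));
}
```

Code that needs the wrap should use the second form; code that only needs "at least 32 bits" can keep the first one and does not depend on the exact width.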
On Tue, 19 Mar 2024, Jan Beulich wrote: > On 19.03.2024 04:37, Stefano Stabellini wrote: > > On Mon, 18 Mar 2024, Jan Beulich wrote: > >> On 16.03.2024 01:07, Stefano Stabellini wrote: > >>> On Fri, 15 Mar 2024, Jan Beulich wrote: > >>>> On 14.03.2024 23:17, Stefano Stabellini wrote: > >>>>> Xen makes assumptions about the size of integer types on the various > >>>>> architectures. Document these assumptions. > >>>> > >>>> My prior reservation wrt exact vs minimum sizes remains. > >>> > >>> We have to specify the exact size. In practice the size is predetermined > >>> and exact with all our supported compilers given a architecture. > >> > >> But that's not the purpose of this document; if it was down to what > >> compilers offer, we could refer to compiler documentation (and iirc we > >> already do for various aspects). The purpose of this document, aiui, > >> is to document assumption we make in hypervisor code. And those should > >> be >=, not ==. > > > > Well... I guess the two of us are making different assumptions then :-) > > > > Which is the reason why documenting assumptions is so important. More at > > the bottom. > > > > > >>> Most importantly, unfortunately we use non-fixed-size integer types in > >>> C hypercall entry points and public ABIs. In my opinion, that is not > >>> acceptable. > >> > >> The problem is that I can't see the reason for you thinking so. The C > >> entry points sit past assembly code doing (required to do) necessary > >> adjustments, if any. If there was no assembly layer, whether to use > >> fixed with types for such parameters would depend on what the > >> architecture guarantees. > > > > This could be the source of the disagreement. I see the little assembly > > code as not important, I consider it just like a little trampoline to > > me. As we describe the hypercalls in C header files, I consider the C > > functions the "official" hypercall entry points. > > Why would that be? Any code we execute in Xen is relevant. 
There are a few reasons: - the public interface is described in a C header so it makes sense for the corresponding implementation to be in C - the C entry point is often both the entry point in C and also common code - depending on the architecture, there is typically always some minimal assembly entry code to prepare the environment before we can jump into C-land; still one wouldn't consider those minimal and routine assembly operations to be a meaningful hypercall entry point corresponding to the C declaration in the public headers - as per MISRA and also general good practice, we need the declaration in the public header files to match the definition in C > > Also, as this is an ABI, I consider mandatory to use clear width > > definitions of all the types (whether with this document or with > > fixed-width types, and fixed-width types are clearer and better) in both > > the C header files that describe the ABI interfaces, as well as the C > > entry points that corresponds to it. E.g. I think we have to use > > the same types in both do_sched_op and the hypercall description in > > xen/include/public/sched.h > > There are two entirely separate aspects to the ABI: One is what we > document towards consumers of it. The other is entirely internal, i.e. > an implementation detail - how we actually consume the data. > Documenting fixed-width types towards consumers is probably okay, > albeit (see below) imo still not strictly necessary (for being > needlessly limiting). I don't see it this way. As the Xen public interface description is in C and used during the build, my opinion is that the public description and the C definition need to match. Also, I don't understand how you can say that public interfaces don't strictly necessarily have to use fixed-width types. Imagine that you use native types with different compilers that can actually output different width interger sizes (which is not possible today with gcc or clang). 
Imagine that a guest is written in a language other than C (e.g. Java) based on the public interface description. It cannot work correctly, can it? I don't see how we can possibly have a public interface with anything other than fixed-width integers. > >> As to public ABIs - that's structure definitions, and I agree we ought > >> to uniformly use fixed-width types there. We largely do; a few things > >> still require fixing. > > > > +1 > > > > > >>> We have two options: > >>> > >>> 1) we go with this document, and we clarify that even if we specify > >>> "unsigned int", we actually mean a 32-bit integer > >>> > >>> 2) we change all our public ABIs and C hypercall entry points to use > >>> fixed-size types (e.g. s/unsigned int/uint32_t/g) > >>> > >>> 2) is preferred because it is clearer but it is more work. So I went > >>> with 1). I also thought you would like 1) more. > >> > >> For ABIs (i.e. structures) we ought to be making that change anyway. > >> Leaving basic types in there is latently buggy. > > > > I am glad we agree :-) > > > > It is just that I also consinder the C hypercall entry points as part of > > the ABI > > > > > >> I'm happy to see a document like this added, for the purpose described > >> above. But to me 1) and 2) and largely independent of one another. > > > > Good that you are also happy with a document like this. > > > > The remaining question is: what about the rest of the C functions in Xen > > that are certainly not part of an ABI? > > As per above - anything internal isn't part of the ABI, C entry points > for hypercall handlers included. All we need to ensure is that we consume > the data according to what the ABI sets forth. It doesn't look like we'll convince one another on this point. But let me try another way. 
In my view, having mismatched types between declaration and definition and having non-fixed-width types in C hypercall entry points is really bad for a number of reasons, among them: - correctness - risk of ABI breakage - mismatch of declaration and definition In your view, the drawback is not following the CODING_STYLE. The two points of views on this subject don't have the same to lose. If I were you, I would probably not invest my energy to defend the CODING_STYLE. > To use wording from George when he criticized my supposed lack of actual > arguments: While there's nothing technically wrong with using fixed > width types there (or in fact everywhere), there's also nothing technically > wrong with using plain C types there and almost everywhere else (ABI > structures excluded). With both technically equal, ./CODING_STYLE has the > only criteria to pick between the two. IOW that's what I view wrong in > George's argumentation: Demanding that I provide technical arguments when > the desire to use fixed width types for the purpose under discussion also > isn't backed by any. I don't think we are in violation of the CODING_STYLE as it explicitly accounts for exceptions. Public interfaces declarations and definitions (hypercalls C entry points included) are an exception. In my opinion, using fixed-width integers in public headers and C definitions (including C hypercall entry points) is top priority for correctness. Correctness is more important than style. So, if we need to change the CODING_STYLE to get there, let's change the CODING_STYLE. > > Those are less critical, still this document should apply uniformily to > > them too. I don't understand why you are making the >= width assumption > > you mentioned at the top of the file when actually it is impossible to > > exercise or test this assumption on any compiler or any architecture > > that works with Xen. If it cannot be enabled, it hasn't been tested, and > > it probably won't work. 
> > Hmm, yes, that's one way to look at it. My perspective is different though: > By writing down assumptions that are more strict than necessary, we'd be > excluding ports to environments meeting the >= assumption, but not meeting > the == one. Unless of course you can point me at any place where - not > just by mistake / by being overly lax - we truly depend on the == that you > want to put in place. IOW yes, there likely would need to be adjustments > to code if such a port was to happen. Yet we shouldn't further harden > requirements that were never meant to be there. I have already shown that all the current implementations and tests only check for ==. In my opinion, this is sufficient evidence that >= is not supported. If you admit it probably wouldn't work without fixes today, would you security-support such a configuration? Would you safety-support it? I wouldn't want to buy a car running Xen compiled with a compiler using integer sizes different from the ones written in this document. Let me summarize our positions on these topics. Agreed points: - public interfaces should use fixed-width types - it is a good idea to have a document describing our assumptions about integer types Open decision points and misalignments: - Should the C hypercall entry points match the public header declarations and ideally use fixed-width integer types? I'd say yes and I would argue for it - Should the document describing our assumptions about integer types specify == (unsigned int == uint32_t) or >= (unsigned int >= uint32_t)? I'd say specify == and I would argue for it
On 20.03.2024 07:01, Stefano Stabellini wrote: > On Tue, 19 Mar 2024, Jan Beulich wrote: >> On 19.03.2024 04:37, Stefano Stabellini wrote: >>> On Mon, 18 Mar 2024, Jan Beulich wrote: >>>> On 16.03.2024 01:07, Stefano Stabellini wrote: >>>>> On Fri, 15 Mar 2024, Jan Beulich wrote: >>>>>> On 14.03.2024 23:17, Stefano Stabellini wrote: >>>>>>> Xen makes assumptions about the size of integer types on the various >>>>>>> architectures. Document these assumptions. >>>>>> >>>>>> My prior reservation wrt exact vs minimum sizes remains. >>>>> >>>>> We have to specify the exact size. In practice the size is predetermined >>>>> and exact with all our supported compilers given a architecture. >>>> >>>> But that's not the purpose of this document; if it was down to what >>>> compilers offer, we could refer to compiler documentation (and iirc we >>>> already do for various aspects). The purpose of this document, aiui, >>>> is to document assumption we make in hypervisor code. And those should >>>> be >=, not ==. >>> >>> Well... I guess the two of us are making different assumptions then :-) >>> >>> Which is the reason why documenting assumptions is so important. More at >>> the bottom. >>> >>> >>>>> Most importantly, unfortunately we use non-fixed-size integer types in >>>>> C hypercall entry points and public ABIs. In my opinion, that is not >>>>> acceptable. >>>> >>>> The problem is that I can't see the reason for you thinking so. The C >>>> entry points sit past assembly code doing (required to do) necessary >>>> adjustments, if any. If there was no assembly layer, whether to use >>>> fixed with types for such parameters would depend on what the >>>> architecture guarantees. >>> >>> This could be the source of the disagreement. I see the little assembly >>> code as not important, I consider it just like a little trampoline to >>> me. As we describe the hypercalls in C header files, I consider the C >>> functions the "official" hypercall entry points. >> >> Why would that be? 
Any code we execute in Xen is relevant. > > There are a few reasons: > > - the public interface is described in a C header so it makes sense for > the corresponding implementation to be in C > > - the C entry point is often both the entry point in C and also common > code > > - depending on the architecture, there is typically always some minimal > assembly entry code to prepare the environment before we can jump into > C-land; still one wouldn't consider those minimal and routine assembly > operations to be a meaningful hypercall entry point corresponding to > the C declaration in the public headers > > - as per MISRA and also general good practice, we need the declaration > in the public header files to match the definition in C Throughout, but especially with this last point, I feel there's confusion (not sure on which side): There are no declarations of hypercall functions in the public headers. Adding declarations there for the C entry points in Xen would actually be wrong, as we don't provide such functions anywhere (to consumers of the ABI). >>> Also, as this is an ABI, I consider mandatory to use clear width >>> definitions of all the types (whether with this document or with >>> fixed-width types, and fixed-width types are clearer and better) in both >>> the C header files that describe the ABI interfaces, as well as the C >>> entry points that corresponds to it. E.g. I think we have to use >>> the same types in both do_sched_op and the hypercall description in >>> xen/include/public/sched.h >> >> There are two entirely separate aspects to the ABI: One is what we >> document towards consumers of it. The other is entirely internal, i.e. >> an implementation detail - how we actually consume the data. >> Documenting fixed-width types towards consumers is probably okay, >> albeit (see below) imo still not strictly necessary (for being >> needlessly limiting). > > I don't see it this way. 
> > As the Xen public interface description is in C and used during the > build, my opinion is that the public description and the C definition > need to match. > > Also, I don't understand how you can say that public interfaces don't > strictly necessarily have to use fixed-width types. > > Imagine that you use native types with different compilers that can > actually output different width interger sizes (which is not possible > today with gcc or clang). Imagine that a guest is written in a language > other than C (e.g. Java) based on the public interface description. It > cannot work correctly, can it? They'd need to write appropriate hypercall invocation functions. As per above - we don't provide these in the public headers, not even for C consumers. > I don't see how we can possibly have a public interface with anything > other than fixed-width integers. That's the consumer side of the ABI. It says nothing about the internal implementation details in Xen. All we need to do there is respect the ABI. That has no influence whatsoever on the C entry points when those aren't the actual hypercall entrypoints into the hypervisor. >>>> As to public ABIs - that's structure definitions, and I agree we ought >>>> to uniformly use fixed-width types there. We largely do; a few things >>>> still require fixing. >>> >>> +1 >>> >>> >>>>> We have two options: >>>>> >>>>> 1) we go with this document, and we clarify that even if we specify >>>>> "unsigned int", we actually mean a 32-bit integer >>>>> >>>>> 2) we change all our public ABIs and C hypercall entry points to use >>>>> fixed-size types (e.g. s/unsigned int/uint32_t/g) >>>>> >>>>> 2) is preferred because it is clearer but it is more work. So I went >>>>> with 1). I also thought you would like 1) more. >>>> >>>> For ABIs (i.e. structures) we ought to be making that change anyway. >>>> Leaving basic types in there is latently buggy. 
>>> >>> I am glad we agree :-) >>> >>> It is just that I also consinder the C hypercall entry points as part of >>> the ABI >>> >>> >>>> I'm happy to see a document like this added, for the purpose described >>>> above. But to me 1) and 2) and largely independent of one another. >>> >>> Good that you are also happy with a document like this. >>> >>> The remaining question is: what about the rest of the C functions in Xen >>> that are certainly not part of an ABI? >> >> As per above - anything internal isn't part of the ABI, C entry points >> for hypercall handlers included. All we need to ensure is that we consume >> the data according to what the ABI sets forth. > > It doesn't look like we'll convince one another on this point. But let > me try another way. > > In my view, having mismatched types between declaration and definition > and having non-fixed-width types in C hypercall entry points is really > bad for a number of reasons, among them: > - correctness > - risk of ABI breakage > - mismatch of declaration and definition What mismatches are you talking about? There's nothing mismatched now, and there cannot be any mismatch, because the consumers of the ABI don't call Xen functions directly. > In your view, the drawback is not following the CODING_STYLE. > > The two points of views on this subject don't have the same to lose. If > I were you, I would probably not invest my energy to defend the > CODING_STYLE. > > >> To use wording from George when he criticized my supposed lack of actual >> arguments: While there's nothing technically wrong with using fixed >> width types there (or in fact everywhere), there's also nothing technically >> wrong with using plain C types there and almost everywhere else (ABI >> structures excluded). With both technically equal, ./CODING_STYLE has the >> only criteria to pick between the two. 
IOW that's what I view wrong in >> George's argumentation: Demanding that I provide technical arguments when >> the desire to use fixed width types for the purpose under discussion also >> isn't backed by any. > > I don't think we are in violation of the CODING_STYLE as it explicitly > accounts for exceptions. Public interfaces declarations and definitions > (hypercalls C entry points included) are an exception. If that was technically necessary, I would surely agree to there being an exception here. > In my opinion, using fixed-width integers in public headers and C > definitions (including C hypercall entry points) is top priority for > correctness. Correctness is more important than style. So, if we need to > change the CODING_STYLE to get there, let's change the CODING_STYLE. > > >>> Those are less critical, still this document should apply uniformily to >>> them too. I don't understand why you are making the >= width assumption >>> you mentioned at the top of the file when actually it is impossible to >>> exercise or test this assumption on any compiler or any architecture >>> that works with Xen. If it cannot be enabled, it hasn't been tested, and >>> it probably won't work. >> >> Hmm, yes, that's one way to look at it. My perspective is different though: >> By writing down assumptions that are more strict than necessary, we'd be >> excluding ports to environments meeting the >= assumption, but not meeting >> the == one. Unless of course you can point me at any place where - not >> just by mistake / by being overly lax - we truly depend on the == that you >> want to put in place. IOW yes, there likely would need to be adjustments >> to code if such a port was to happen. Yet we shouldn't further harden >> requirements that were never meant to be there. > > I have already shown that all the current implementations and tests only > check for ==. In my opinion, this is sufficient evidence that >= is not > supported. 
> > If you admit it probably wouldn't work without fixes today, would you > security-support such a configuration? Would you safety-support it? I > wouldn't want to buy a car running Xen compiled with a compiler using > integer sizes different from the ones written in this document. > > Let me summarize our positions on these topics. > > Agreed points: > - public interfaces should use fixed-width types > - it is a good idea to have a document describing our assumptions about > integer types > > Open decision points and misalignments: > - Should the C hypercall entry points match the public header > declarations and ideally use fixed-width integer types? As per above, this question just cannot be validly raised. There are no public header declarations to match. > I'd say yes and I would argue for it > > - Should the document describing our assumptions about integer types > specify == (unsigned int == uint32_t) or >= (unsigned int >= > uint32_t)? > > I'd say specify == and I would argue for it Actually, I had a further thought here in the meantime: For particular ports, using == is likely okay - they're conforming to particular psABI-s, after all (and that's what the compilers used also implement). I'd nevertheless expect >= to be used in common assumptions. That way for existing ports you get what you want, and there would still be provisions for new ports using, say, an ILP64 ABI. Common code would need to adhere to the common assumptions only. Arch-specific code can work from the more tight assumptions. (If future sub-arch variants are to be expected, like RV128, arch-code may still be well advised to try to avoid the more tight assumptions where possible, just to limit eventual porting effort.) Jan
On Wed, 20 Mar 2024, Jan Beulich wrote: > > - the public interface is described in a C header so it makes sense for > > the corresponding implementation to be in C > > > > - the C entry point is often both the entry point in C and also common > > code > > > > - depending on the architecture, there is typically always some minimal > > assembly entry code to prepare the environment before we can jump into > > C-land; still one wouldn't consider those minimal and routine assembly > > operations to be a meaningful hypercall entry point corresponding to > > the C declaration in the public headers > > > > - as per MISRA and also general good practice, we need the declaration > > in the public header files to match the definition in C > > Throughout, but especially with this last point, I feel there's confusion > (not sure on which side): There are no declarations of hypercall functions > in the public headers. Adding declarations there for the C entry points in > Xen would actually be wrong, as we don't provide such functions anywhere > (to consumers of the ABI). I am copy/pasting text from sched.h: * The prototype for this hypercall is: * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...) * * @cmd == SCHEDOP_??? (scheduler operation). * @arg == Operation-specific extra argument(s), as described below. * ... == Additional Operation-specific extra arguments, described below. * from event_channel.h: * ` enum neg_errnoval * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args) * ` * @cmd == EVTCHNOP_* (event-channel operation). * @args == struct evtchn_* Operation-specific extra arguments (NULL if none). These are the hypercall declarations in public headers. Although they are comments, they are the only description of the ABI that we have (as far as I know). They are in C and use C types. 
> >>> Also, as this is an ABI, I consider mandatory to use clear width > >>> definitions of all the types (whether with this document or with > >>> fixed-width types, and fixed-width types are clearer and better) in both > >>> the C header files that describe the ABI interfaces, as well as the C > >>> entry points that corresponds to it. E.g. I think we have to use > >>> the same types in both do_sched_op and the hypercall description in > >>> xen/include/public/sched.h > >> > >> There are two entirely separate aspects to the ABI: One is what we > >> document towards consumers of it. The other is entirely internal, i.e. > >> an implementation detail - how we actually consume the data. > >> Documenting fixed-width types towards consumers is probably okay, > >> albeit (see below) imo still not strictly necessary (for being > >> needlessly limiting). > > > > I don't see it this way. > > > > As the Xen public interface description is in C and used during the > > build, my opinion is that the public description and the C definition > > need to match. > > > > Also, I don't understand how you can say that public interfaces don't > > strictly necessarily have to use fixed-width types. > > > > Imagine that you use native types with different compilers that can > > actually output different width interger sizes (which is not possible > > today with gcc or clang). Imagine that a guest is written in a language > > other than C (e.g. Java) based on the public interface description. It > > cannot work correctly, can it? > > They'd need to write appropriate hypercall invocation functions. As per > above - we don't provide these in the public headers, not even for C > consumers. See above > > I don't see how we can possibly have a public interface with anything > > other than fixed-width integers. > > That's the consumer side of the ABI. It says nothing about the internal > implementation details in Xen. All we need to do there is respect the > ABI. 
That has no influence whatsoever on the C entry points when those > aren't the actual hypercall entrypoints into the hypervisor. If we go by the strictest definition, nothing is actually called directly except for the target of a "b" instruction. When you call a function in C, you are not actually calling a function. Assembly is generated to save variables and do other things before "b". Still, typically it is still considered a "direct" call. It is not exactly the same thing with hypercall, but I hope I conveyed the idea why I consider the C hypercall entry points part of the ABI. > >>>> As to public ABIs - that's structure definitions, and I agree we ought > >>>> to uniformly use fixed-width types there. We largely do; a few things > >>>> still require fixing. > >>> > >>> +1 > >>> > >>> > >>>>> We have two options: > >>>>> > >>>>> 1) we go with this document, and we clarify that even if we specify > >>>>> "unsigned int", we actually mean a 32-bit integer > >>>>> > >>>>> 2) we change all our public ABIs and C hypercall entry points to use > >>>>> fixed-size types (e.g. s/unsigned int/uint32_t/g) > >>>>> > >>>>> 2) is preferred because it is clearer but it is more work. So I went > >>>>> with 1). I also thought you would like 1) more. > >>>> > >>>> For ABIs (i.e. structures) we ought to be making that change anyway. > >>>> Leaving basic types in there is latently buggy. > >>> > >>> I am glad we agree :-) > >>> > >>> It is just that I also consinder the C hypercall entry points as part of > >>> the ABI > >>> > >>> > >>>> I'm happy to see a document like this added, for the purpose described > >>>> above. But to me 1) and 2) and largely independent of one another. > >>> > >>> Good that you are also happy with a document like this. > >>> > >>> The remaining question is: what about the rest of the C functions in Xen > >>> that are certainly not part of an ABI? 
> >> > >> As per above - anything internal isn't part of the ABI, C entry points > >> for hypercall handlers included. All we need to ensure is that we consume > >> the data according to what the ABI sets forth. > > > > It doesn't look like we'll convince one another on this point. But let > > me try another way. > > > > In my view, having mismatched types between declaration and definition > > and having non-fixed-width types in C hypercall entry points is really > > bad for a number of reasons, among them: > > - correctness > > - risk of ABI breakage > > - mismatch of declaration and definition > > What mismatches are you talking about? There's nothing mismatched now, > and there cannot be any mismatch, because the consumers of the ABI don't > call Xen functions directly. Let me make an example: - public header saying enum event_channel_op cmd - <assembly> - do_event_channel_op(int cmd, ...) Do you think this is all good? There are two pretty serious problems here: - enum and int are not the same type - enum and int are not fixed-width Don't you think it should be: - public header saying uint32_t cmd in a comment - <assembly> - do_something_op(uint32_t cmd, ...) Or possibly unsigned long depending on the parameter. ? > > In your view, the drawback is not following the CODING_STYLE. > > > > The two points of views on this subject don't have the same to lose. If > > I were you, I would probably not invest my energy to defend the > > CODING_STYLE. > > > > > >> To use wording from George when he criticized my supposed lack of actual > >> arguments: While there's nothing technically wrong with using fixed > >> width types there (or in fact everywhere), there's also nothing technically > >> wrong with using plain C types there and almost everywhere else (ABI > >> structures excluded). With both technically equal, ./CODING_STYLE has the > >> only criteria to pick between the two. 
IOW that's what I view wrong in > >> George's argumentation: Demanding that I provide technical arguments when > >> the desire to use fixed width types for the purpose under discussion also > >> isn't backed by any. > > > > I don't think we are in violation of the CODING_STYLE as it explicitly > > accounts for exceptions. Public interfaces declarations and definitions > > (hypercalls C entry points included) are an exception. > > If that was technically necessary, I would surely agree to there being an > exception here. Great, that's a start > > In my opinion, using fixed-width integers in public headers and C > > definitions (including C hypercall entry points) is top priority for > > correctness. Correctness is more important than style. So, if we need to > > change the CODING_STYLE to get there, let's change the CODING_STYLE. > > > > > >>> Those are less critical, still this document should apply uniformily to > >>> them too. I don't understand why you are making the >= width assumption > >>> you mentioned at the top of the file when actually it is impossible to > >>> exercise or test this assumption on any compiler or any architecture > >>> that works with Xen. If it cannot be enabled, it hasn't been tested, and > >>> it probably won't work. > >> > >> Hmm, yes, that's one way to look at it. My perspective is different though: > >> By writing down assumptions that are more strict than necessary, we'd be > >> excluding ports to environments meeting the >= assumption, but not meeting > >> the == one. Unless of course you can point me at any place where - not > >> just by mistake / by being overly lax - we truly depend on the == that you > >> want to put in place. IOW yes, there likely would need to be adjustments > >> to code if such a port was to happen. Yet we shouldn't further harden > >> requirements that were never meant to be there. > > > > I have already shown that all the current implementations and tests only > > check for ==. 
In my opinion, this is sufficient evidence that >= is not > > supported. > > > > If you admit it probably wouldn't work without fixes today, would you > > security-support such a configuration? Would you safety-support it? I > > wouldn't want to buy a car running Xen compiled with a compiler using > > integer sizes different from the ones written in this document. > > > > Let me summarize our positions on these topics. > > > > Agreed points: > > - public interfaces should use fixed-width types > > - it is a good idea to have a document describing our assumptions about > > integer types > > > > Open decision points and misalignments: > > - Should the C hypercall entry points match the public header > > declarations and ideally use fixed-width integer types? > > As per above, this question just cannot be validly raised. There are > no public header declarations to match. I clarified > > I'd say yes and I would argue for it > > > > - Should the document describing our assumptions about integer types > > specify == (unsigned int == uint32_t) or >= (unsigned int >= > > uint32_t)? > > > > I'd say specify == and I would argue for it > > Actually, I had a further thought here in the meantime: For particular > ports, using == is likely okay - they're conforming to particular > psABI-s, after all (and that's what the compilers used also implement). > I'd nevertheless expect >= to be used in common assumptions. That way > for existing ports you get what you want, and there would still be > provisions for new ports using, say, an ILP64 ABI. Common code would > need to adhere to the common assumptions only. Arch-specific code can > work from the more tight assumptions. (If future sub-arch variants are > to be expected, like RV128, arch-code may still be well advised to try > to avoid the more tight assumptions where possible, just to limit > eventual porting effort.) 
I understand the aspirational goal of supporting >=, but in reality it is not tested. If it is not tested, it cannot be assumed to work; and if it cannot be assumed to work, we cannot support it. If someone creates a compiler or other tool to check for >=, I would be happy to discuss expanding the document. Without any tests, I don't think it would be useful to write down >=, not even as an aspirational goal. A goal must be actionable.
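To make the difference between the two formulations concrete, here is a minimal sketch of how each could be written as a build-time check in C11. The macro names (BITS_OF, ASSUME_EXACT_BITS, ASSUME_MIN_BITS) and the helper function are hypothetical and not taken from the Xen tree; this only illustrates the idea, it is not a proposal for the actual implementation.

```c
#include <assert.h>   /* static_assert (C11) and assert */
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical helper: width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* "==" formulation: the type must be exactly this wide. */
#define ASSUME_EXACT_BITS(type, bits) \
    static_assert(BITS_OF(type) == (bits), \
                  #type " must be exactly " #bits " bits")

/* ">=" formulation: the type must be at least this wide. */
#define ASSUME_MIN_BITS(type, bits) \
    static_assert(BITS_OF(type) >= (bits), \
                  #type " must be at least " #bits " bits")

/* A check that common code could rely on under the ">=" reading. */
ASSUME_MIN_BITS(unsigned int, 32);

/* A check matching the "==" reading; the build fails if it does not hold. */
ASSUME_EXACT_BITS(unsigned int, 32);

/* Small runtime helper so the result can also be inspected or tested. */
static int demo_unsigned_int_bits(void)
{
    return (int)BITS_OF(unsigned int);
}
```

On any target where the "==" check passes, the ">=" check passes too. The reverse does not hold, and that is exactly the case for which no test environment exists today.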
On 21.03.2024 02:46, Stefano Stabellini wrote: > On Wed, 20 Mar 2024, Jan Beulich wrote: >>> - the public interface is described in a C header so it makes sense for >>> the corresponding implementation to be in C >>> >>> - the C entry point is often both the entry point in C and also common >>> code >>> >>> - depending on the architecture, there is typically always some minimal >>> assembly entry code to prepare the environment before we can jump into >>> C-land; still one wouldn't consider those minimal and routine assembly >>> operations to be a meaningful hypercall entry point corresponding to >>> the C declaration in the public headers >>> >>> - as per MISRA and also general good practice, we need the declaration >>> in the public header files to match the definition in C >> >> Throughout, but especially with this last point, I feel there's confusion >> (not sure on which side): There are no declarations of hypercall functions >> in the public headers. Adding declarations there for the C entry points in >> Xen would actually be wrong, as we don't provide such functions anywhere >> (to consumers of the ABI). > > I am copy/pasting text from sched.h: > > * The prototype for this hypercall is: > * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...) > * > * @cmd == SCHEDOP_??? (scheduler operation). > * @arg == Operation-specific extra argument(s), as described below. > * ... == Additional Operation-specific extra arguments, described below. > * > > from event_channel.h: > > * ` enum neg_errnoval > * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args) > * ` > * @cmd == EVTCHNOP_* (event-channel operation). > * @args == struct evtchn_* Operation-specific extra arguments (NULL if none). > > These are the hypercall declarations in public headers. Although they > are comments, they are the only description of the ABI that we have (as > far as I know). They are in C and use C types. 
From their use of enum alone they don't qualify as declarations. They're imo merely meant to provide minimal guidelines. >>>>>>> We have two options: >>>>>>> >>>>>>> 1) we go with this document, and we clarify that even if we specify >>>>>>> "unsigned int", we actually mean a 32-bit integer >>>>>>> >>>>>>> 2) we change all our public ABIs and C hypercall entry points to use >>>>>>> fixed-size types (e.g. s/unsigned int/uint32_t/g) >>>>>>> >>>>>>> 2) is preferred because it is clearer but it is more work. So I went >>>>>>> with 1). I also thought you would like 1) more. >>>>>> >>>>>> For ABIs (i.e. structures) we ought to be making that change anyway. >>>>>> Leaving basic types in there is latently buggy. >>>>> >>>>> I am glad we agree :-) >>>>> >>>>> It is just that I also consinder the C hypercall entry points as part of >>>>> the ABI >>>>> >>>>> >>>>>> I'm happy to see a document like this added, for the purpose described >>>>>> above. But to me 1) and 2) and largely independent of one another. >>>>> >>>>> Good that you are also happy with a document like this. >>>>> >>>>> The remaining question is: what about the rest of the C functions in Xen >>>>> that are certainly not part of an ABI? >>>> >>>> As per above - anything internal isn't part of the ABI, C entry points >>>> for hypercall handlers included. All we need to ensure is that we consume >>>> the data according to what the ABI sets forth. >>> >>> It doesn't look like we'll convince one another on this point. But let >>> me try another way. >>> >>> In my view, having mismatched types between declaration and definition >>> and having non-fixed-width types in C hypercall entry points is really >>> bad for a number of reasons, among them: >>> - correctness >>> - risk of ABI breakage >>> - mismatch of declaration and definition >> >> What mismatches are you talking about? There's nothing mismatched now, >> and there cannot be any mismatch, because the consumers of the ABI don't >> call Xen functions directly. 
> > Let me make an example: > > - public header saying enum event_channel_op cmd > - <assembly> > - do_event_channel_op(int cmd, ...) > > Do you think this is all good? > > There are two pretty serious problems here: > - enum and int are not the same type See above. The issue I have with this is use of plain "int". Technically that's not a problem either, but aiui we're aiming to use "unsigned int" when negative values aren't possible. And note that it was in 2012 when "int" there was changed to "enum", in an effort to document things better. > - enum and int are not fixed-width Which I don't view as a problem, thanks to the assembly sitting in between. > Don't you think it should be: > > - public header saying uint32_t cmd in a comment > - <assembly> > - do_something_op(uint32_t cmd, ...) The public header should say whatever is best suited to not misguide people writing actual prototypes for their functions. I wouldn't mind uint32_t being stated there. That has no influence whatsoever on do_<something>_op(), though. > Or possibly unsigned long depending on the parameter. You're contradicting yourself: You mean to advocate for fixed-width types, yet then you suggest "unsigned long". Perhaps because you realized that there's no single fixed-width type fitting "unsigned long" for all architectures. xen_ulong_t would likely come closest, but would - aiui - still not be suitable for Arm32 when used in hypercall (handler) prototypes; it's suitable for use (again) only in structure definitions. Jan
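As an illustration of the "uint32_t cmd" variant discussed above, here is a minimal sketch of a handler that takes the command as a fixed-width value and converts it to an enum only after validating it. The enum, its values and the function name are hypothetical and do not correspond to any real Xen hypercall; whether such a conversion belongs in the C entry point is precisely what is being debated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command set; not an actual Xen enumeration. */
enum demo_op {
    DEMO_OP_SEND = 0,
    DEMO_OP_CLOSE = 1,
    DEMO_OP_COUNT          /* number of valid operations */
};

/*
 * Accept the command as a fixed-width value, check its range, and only
 * then treat it as an enum demo_op.
 * Returns the decoded operation as a non-negative int, or -1 if the raw
 * value does not name a valid operation.
 */
static int demo_decode_op(uint32_t raw)
{
    if (raw >= (uint32_t)DEMO_OP_COUNT)
        return -1;

    return (int)(enum demo_op)raw;
}
```

The width of the incoming value is then independent of how the compiler chooses to represent the enum.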
On Thu, 21 Mar 2024, Jan Beulich wrote: > On 21.03.2024 02:46, Stefano Stabellini wrote: > > On Wed, 20 Mar 2024, Jan Beulich wrote: > >>> - the public interface is described in a C header so it makes sense for > >>> the corresponding implementation to be in C > >>> > >>> - the C entry point is often both the entry point in C and also common > >>> code > >>> > >>> - depending on the architecture, there is typically always some minimal > >>> assembly entry code to prepare the environment before we can jump into > >>> C-land; still one wouldn't consider those minimal and routine assembly > >>> operations to be a meaningful hypercall entry point corresponding to > >>> the C declaration in the public headers > >>> > >>> - as per MISRA and also general good practice, we need the declaration > >>> in the public header files to match the definition in C > >> > >> Throughout, but especially with this last point, I feel there's confusion > >> (not sure on which side): There are no declarations of hypercall functions > >> in the public headers. Adding declarations there for the C entry points in > >> Xen would actually be wrong, as we don't provide such functions anywhere > >> (to consumers of the ABI). > > > > I am copy/pasting text from sched.h: > > > > * The prototype for this hypercall is: > > * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...) > > * > > * @cmd == SCHEDOP_??? (scheduler operation). > > * @arg == Operation-specific extra argument(s), as described below. > > * ... == Additional Operation-specific extra arguments, described below. > > * > > > > from event_channel.h: > > > > * ` enum neg_errnoval > > * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args) > > * ` > > * @cmd == EVTCHNOP_* (event-channel operation). > > * @args == struct evtchn_* Operation-specific extra arguments (NULL if none). > > > > These are the hypercall declarations in public headers. 
Although they > > are comments, they are the only description of the ABI that we have (as > > far as I know). They are in C and use C types. > > > From their use of enum alone they don't qualify as declarations. They're > imo merely meant to provide minimal guidelines. Even if we call them "minimal guidelines", my opinion is unchanged: - they need to use fixed-width types - they should match the C hypercall entry point types > >>>>>>> We have two options: > >>>>>>> > >>>>>>> 1) we go with this document, and we clarify that even if we specify > >>>>>>> "unsigned int", we actually mean a 32-bit integer > >>>>>>> > >>>>>>> 2) we change all our public ABIs and C hypercall entry points to use > >>>>>>> fixed-size types (e.g. s/unsigned int/uint32_t/g) > >>>>>>> > >>>>>>> 2) is preferred because it is clearer but it is more work. So I went > >>>>>>> with 1). I also thought you would like 1) more. > >>>>>> > >>>>>> For ABIs (i.e. structures) we ought to be making that change anyway. > >>>>>> Leaving basic types in there is latently buggy. > >>>>> > >>>>> I am glad we agree :-) > >>>>> > >>>>> It is just that I also consider the C hypercall entry points as part of > >>>>> the ABI > >>>>> > >>>>> > >>>>>> I'm happy to see a document like this added, for the purpose described > >>>>>> above. But to me 1) and 2) are largely independent of one another. > >>>>> > >>>>> Good that you are also happy with a document like this. > >>>>> > >>>>> The remaining question is: what about the rest of the C functions in Xen > >>>>> that are certainly not part of an ABI? > >>>> > >>>> As per above - anything internal isn't part of the ABI, C entry points > >>>> for hypercall handlers included. All we need to ensure is that we consume > >>>> the data according to what the ABI sets forth. > >>> > >>> It doesn't look like we'll convince one another on this point. But let > >>> me try another way.
> >>> > >>> In my view, having mismatched types between declaration and definition > >>> and having non-fixed-width types in C hypercall entry points is really > >>> bad for a number of reasons, among them: > >>> - correctness > >>> - risk of ABI breakage > >>> - mismatch of declaration and definition > >> > >> What mismatches are you talking about? There's nothing mismatched now, > >> and there cannot be any mismatch, because the consumers of the ABI don't > >> call Xen functions directly. > > > > Let me make an example: > > > > - public header saying enum event_channel_op cmd > > - <assembly> > > - do_event_channel_op(int cmd, ...) > > > > Do you think this is all good? > > > > There are two pretty serious problems here: > > - enum and int are not the same type > > See above. The issue I have with this is use of plain "int". Technically > that's not a problem either, but aiui we're aiming to use "unsigned int" > when negative values aren't possible. Yeah that is also a problem > And note that it was in 2012 when "int" there was changed to "enum", in an > effort to document things better. > > > - enum and int are not fixed-width > > Which I don't view as a problem, thanks to the assembly sitting in between. I disagree. I view this as risky and error prone. We worked for hours and hours on security issues and MISRA improvements. All this experience is also meant to teach us what good code looks like, code that is resilient to attacks, poses fewer safety issues, and it is clearer for others to read and modify. After all of the above, I am surprised we are not aligned on this issue. I understand your point of view, as I think you understand mine. We are not going to be able to convince each other. Having explored the technical aspects in all their details, I think we need more opinions from others to move forward. I'll conclude with this. 
One doesn't have to agree with me to see that my suggestions aim to make the code and public interfaces clearer, more consistent, and less error prone. Your suggestions aim to make the code follow CODING_STYLE? I have made clear the value proposition of what I am suggesting, and I fail to see yours. > > Don't you think it should be: > > > > - public header saying uint32_t cmd in a comment > > - <assembly> > > - do_something_op(uint32_t cmd, ...) > > The public header should say whatever is best suited to not misguide > people writing actual prototypes for their functions. I wouldn't mind > uint32_t being stated there. That has no influence whatsoever on > do_<something>_op(), though. I understand what you are saying but I disagree. It is risky and error prone. As above, I think we understand each other's points of view but we won't be able to convince each other. > > Or possibly unsigned long depending on the parameter. > > You're contradicting yourself: You mean to advocate for fixed-width types, > yet then you suggest "unsigned long". No. I explained it in another thread a couple of days ago. There are cases where we have fixed-width types but the type changes by architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs. Rather than having #ifdefs, which is also an option, that is the one case where using "unsigned long" could be a decent compromise. In this context "unsigned long" means register size (on ARM we even have register_t). Once you pick an architecture, the size is actually meant to be fixed. In fact, it is specified in this document. Which is one of the reasons why we have to use == in this document and not >=. In general, fixed-width types like uint32_t are better because they are clearer and unambiguous. When possible I think they should be our first choice in ABIs.
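For reference, the "#ifdefs" alternative mentioned above could look roughly like the following sketch, which selects an explicitly sized type per target instead of relying on "unsigned long". The names demo_reg_t and demo_echo_arg are hypothetical, and pointer width is used as a stand-in for register width, which is an assumption that need not hold for every ABI.

```c
#include <assert.h>   /* static_assert (C11) and assert */
#include <stdint.h>

/*
 * Hypothetical register-width type chosen with the preprocessor.
 * Assumption: pointer width approximates register width on the target.
 */
#if UINTPTR_MAX == UINT64_MAX
typedef uint64_t demo_reg_t;
#elif UINTPTR_MAX == UINT32_MAX
typedef uint32_t demo_reg_t;
#else
#error "unsupported pointer width in this sketch"
#endif

/* The chosen type is pinned down at build time. */
static_assert(sizeof(demo_reg_t) == sizeof(void *),
              "demo_reg_t must match the pointer width");

/* Hypothetical handler using the explicit type for a register-sized argument. */
static demo_reg_t demo_echo_arg(demo_reg_t arg)
{
    return arg;
}
```

The trade-off is the one described above: the width is spelled out for each target, at the cost of a preprocessor conditional.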
Hi Stefano,

I haven't fully read the thread. But I wanted to clarify something.

On 21/03/2024 19:03, Stefano Stabellini wrote:
>>> Or possibly unsigned long depending on the parameter.
>>
>> You're contradicting yourself: You mean to advocate for fixed-width types,
>> yet then you suggest "unsigned long".
>
> No. I explained it in another thread a couple of days ago. There are
> cases where we have fixed-width types but the type changes by
> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
> Rather than having #ifdefs, which is also an option, that is the one
> case where using "unsigned long" could be a decent compromise. In this
> context "unsigned long" means register size (on ARM we even have
> register_t). Once you pick an architecture, the size is actually meant
> to be fixed. In fact, it is specified in this document. Which is one of
> the reasons why we have to use == in this document and not >=. In
> general, fixed-width types like uint32_t are better because they are
> clearer and unambiguous. When possible I think they should be our first
> choice in ABIs.

"unsigned long" is not fixed in a given architecture. It will change based on the data model used by the OS. For instance, for Arm 64-bit, we have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.

So effectively unsigned long can't be used in the ABI.

As a side note, Xen will use LP64, hence why we tend to use 'unsigned long' to describe 32-bit for Arm32 and 64-bit for Arm64.

Cheers,
On 22.03.2024 00:17, Julien Grall wrote:
> Hi Stefano,
>
> I haven't fully read the thread. But I wanted to clarify something.
>
> On 21/03/2024 19:03, Stefano Stabellini wrote:
>>>> Or possibly unsigned long depending on the parameter.
>>>
>>> You're contradicting yourself: You mean to advocate for fixed-width types,
>>> yet then you suggest "unsigned long".
>>
>> No. I explained it in another thread a couple of days ago. There are
>> cases where we have fixed-width types but the type changes by
>> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
>> Rather than having #ifdefs, which is also an option, that is the one
>> case where using "unsigned long" could be a decent compromise. In this
>> context "unsigned long" means register size (on ARM we even have
>> register_t). Once you pick an architecture, the size is actually meant
>> to be fixed. In fact, it is specified in this document. Which is one of
>> the reasons why we have to use == in this document and not >=. In
>> general, fixed-width types like uint32_t are better because they are
>> clearer and unambiguous. When possible I think they should be our first
>> choice in ABIs.
>
> "unsigned long" is not fixed in a given architecture. It will change
> based on the data model used by the OS. For instance, for Arm 64-bit, we
> have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.

"... is 64-bit" you mean?

Jan

> So effectively unsigned long can't be used in the ABI.
>
> As a side note, Xen will use LP64, hence why we tend to use 'unsigned
> long' to describe 32-bit for Arm32 and 64-bit for Arm64.
>
> Cheers,
>
Hi Jan,

On 25/03/2024 11:16, Jan Beulich wrote:
> On 22.03.2024 00:17, Julien Grall wrote:
>> Hi Stefano,
>>
>> I haven't fully read the thread. But I wanted to clarify something.
>>
>> On 21/03/2024 19:03, Stefano Stabellini wrote:
>>>>> Or possibly unsigned long depending on the parameter.
>>>>
>>>> You're contradicting yourself: You mean to advocate for fixed-width types,
>>>> yet then you suggest "unsigned long".
>>>
>>> No. I explained it in another thread a couple of days ago. There are
>>> cases where we have fixed-width types but the type changes by
>>> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
>>> Rather than having #ifdefs, which is also an option, that is the one
>>> case where using "unsigned long" could be a decent compromise. In this
>>> context "unsigned long" means register size (on ARM we even have
>>> register_t). Once you pick an architecture, the size is actually meant
>>> to be fixed. In fact, it is specified in this document. Which is one of
>>> the reasons why we have to use == in this document and not >=. In
>>> general, fixed-width types like uint32_t are better because they are
>>> clearer and unambiguous. When possible I think they should be our first
>>> choice in ABIs.
>>
>> "unsigned long" is not fixed in a given architecture. It will change
>> based on the data model used by the OS. For instance, for Arm 64-bit, we
>> have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.
>
> "... is 64-bit" you mean?

Whoops. Yes!

Cheers,
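With the correction above in place (on LP64, 'unsigned long' is 64-bit), the three data models can be told apart by the sizes of int, long and pointers. The following is a minimal sketch, assuming 8-bit bytes; the function and macro names are hypothetical and serve only as illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Classify a C data model from the sizes (in bytes, assuming 8-bit bytes)
 * of int, long and pointers:
 *   ILP32: int, long and pointers are 32-bit
 *   LP64:  long and pointers are 64-bit, int is 32-bit
 *   LLP64: pointers (and long long) are 64-bit, int and long are 32-bit
 */
static const char *demo_data_model(size_t int_bytes, size_t long_bytes,
                                   size_t ptr_bytes)
{
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4)
        return "ILP32";
    if (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8)
        return "LP64";
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 8)
        return "LLP64";
    return "other";
}

/* Convenience wrapper: the data model of the compiler building this code. */
#define DEMO_THIS_MODEL() \
    demo_data_model(sizeof(int), sizeof(long), sizeof(void *))
```

The same 64-bit architecture can therefore yield different widths for 'unsigned long' depending on which model the toolchain targets.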
On Thu, 21 Mar 2024, Julien Grall wrote:
> Hi Stefano,
>
> I haven't fully read the thread. But I wanted to clarify something.
>
> On 21/03/2024 19:03, Stefano Stabellini wrote:
> > > > Or possibly unsigned long depending on the parameter.
> > >
> > > You're contradicting yourself: You mean to advocate for fixed-width types,
> > > yet then you suggest "unsigned long".
> >
> > No. I explained it in another thread a couple of days ago. There are
> > cases where we have fixed-width types but the type changes by
> > architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
> > Rather than having #ifdefs, which is also an option, that is the one
> > case where using "unsigned long" could be a decent compromise. In this
> > context "unsigned long" means register size (on ARM we even have
> > register_t). Once you pick an architecture, the size is actually meant
> > to be fixed. In fact, it is specified in this document. Which is one of
> > the reasons why we have to use == in this document and not >=. In
> > general, fixed-width types like uint32_t are better because they are
> > clearer and unambiguous. When possible I think they should be our first
> > choice in ABIs.
>
> "unsigned long" is not fixed in a given architecture. It will change based on
> the data model used by the OS. For instance, for Arm 64-bit, we have 3 models:
> ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.
>
> So effectively unsigned long can't be used in the ABI.

If someone sees "unsigned long" in the public headers defining a public Xen ABI, they would need to refer to this document, which specifies the size and alignment of "unsigned long" per architecture, to understand what "unsigned long" really means. In other words, this document mandates LP64 (at least for safety configurations, given that nothing else is tested).

This is the reason why ideally we wouldn't have any "unsigned long" in the Xen ABI at all. It is not as clear as explicitly-sized integers (e.g. uint32_t). In an ideal world, we would use explicitly-sized integers for everything in public ABIs.
diff --git a/docs/misra/C-language-toolchain.rst b/docs/misra/C-language-toolchain.rst
index b7c2000992..24d3c1cac6 100644
--- a/docs/misra/C-language-toolchain.rst
+++ b/docs/misra/C-language-toolchain.rst
@@ -480,4 +480,62 @@ The table columns are as follows:
      - See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section
        "11.1 Implementation-defined behavior" of CPP_MANUAL.
 
+Sizes of Integer types
+______________________
+
+.. list-table::
+   :widths: 10 10 10 45
+   :header-rows: 1
+
+   * - Type
+     - Size
+     - Alignment
+     - Architectures
+
+   * - char
+     - 8 bits
+     - 8 bits
+     - all architectures
+
+   * - short
+     - 16 bits
+     - 16 bits
+     - all architectures
+
+   * - int
+     - 32 bits
+     - 32 bits
+     - all architectures
+
+   * - long
+     - 32 bits
+     - 32 bits
+     - 32-bit architectures (x86_32, ARMv8-A AArch32, ARMv8-R AArch32)
+
+   * - long
+     - 64 bits
+     - 64 bits
+     - 64-bit architectures (x86_64, ARMv8-A AArch64, RV64, PPC64)
+
+   * - long long
+     - 64-bit
+     - 32-bit
+     - x86_32
+
+   * - long long
+     - 64-bit
+     - 64-bit
+     - 64-bit architectures, ARMv8-A AArch32, ARMv8-R AArch32
+
+   * - pointer
+     - 32-bit
+     - 32-bit
+     - 32-bit architectures (x86_32, ARMv8-A AArch32, ARMv8-R AArch32)
+
+   * - pointer
+     - 64-bit
+     - 64-bit
+     - 64-bit architectures (x86_64, ARMv8-A AArch64, RV64, PPC64)
+
+
 END OF DOCUMENT.
Xen makes assumptions about the size of integer types on the various
architectures. Document these assumptions.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
---
Changes in v2:
- add alignment info
---
 docs/misra/C-language-toolchain.rst | 58 +++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)