Message ID | 20231205100756.18920-2-jgross@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xen: have a more generic unaligned.h header | expand |
Hi Juergen, On 05/12/2023 10:07, Juergen Gross wrote: > Instead of defining get_unaligned() and put_unaligned() in a way that > is only supporting architectures allowing unaligned accesses, use the > same approach as the Linux kernel and let the compiler do the > decision how to generate the code for probably unaligned data accesses. > > Update include/xen/unaligned.h from include/asm-generic/unaligned.h of > the Linux kernel. > > The generated code has been checked to be the same on x86. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 803f4e1eab7a > Signed-off-by: Juergen Gross <jgross@suse.com> Can you outline your end goal? At least on arm32, I believe this will result to abort because event if the architecture support unaligned access, we are preventing them on Arm32. Cheers,
On 05.12.23 12:53, Julien Grall wrote: > Hi Juergen, > > On 05/12/2023 10:07, Juergen Gross wrote: >> Instead of defining get_unaligned() and put_unaligned() in a way that >> is only supporting architectures allowing unaligned accesses, use the >> same approach as the Linux kernel and let the compiler do the >> decision how to generate the code for probably unaligned data accesses. >> >> Update include/xen/unaligned.h from include/asm-generic/unaligned.h of >> the Linux kernel. >> >> The generated code has been checked to be the same on x86. >> >> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >> 803f4e1eab7a >> Signed-off-by: Juergen Gross <jgross@suse.com> > > Can you outline your end goal? At least on arm32, I believe this will result to > abort because event if the architecture support unaligned access, we are > preventing them on Arm32. I need something like that in Xen tools for supporting packed data accesses on the 9pfs ring page, so I looked into the hypervisor for related support. I guess for arm32 using -mno-unaligned-access when building should avoid any unaligned accesses? Juergen
Hi Juergen, On 05/12/2023 12:39, Juergen Gross wrote: > On 05.12.23 12:53, Julien Grall wrote: >> Hi Juergen, >> >> On 05/12/2023 10:07, Juergen Gross wrote: >>> Instead of defining get_unaligned() and put_unaligned() in a way that >>> is only supporting architectures allowing unaligned accesses, use the >>> same approach as the Linux kernel and let the compiler do the >>> decision how to generate the code for probably unaligned data accesses. >>> >>> Update include/xen/unaligned.h from include/asm-generic/unaligned.h of >>> the Linux kernel. >>> >>> The generated code has been checked to be the same on x86. >>> >>> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >>> Origin: >>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >>> 803f4e1eab7a >>> Signed-off-by: Juergen Gross <jgross@suse.com> >> >> Can you outline your end goal? At least on arm32, I believe this will >> result to abort because event if the architecture support unaligned >> access, we are preventing them on Arm32. > > I need something like that in Xen tools for supporting packed data accesses > on the 9pfs ring page, so I looked into the hypervisor for related support. Did we really introduce an ABI requiring unaligned access??? Or is this something you are coming up with? Anyway, IIRC Linux allows unaligned access. So the problem I am describing is only for the hypervisor. Although, I would like to point out that unaligned access has no atomicity guarantee. I assume this is not going to be a concern for you? > I guess for arm32 using -mno-unaligned-access when building should avoid > any > unaligned accesses? I am not sure. This is implies the compiler will be able to infer that the access will be unaligned. Is this always the case? Anyway, given you don't seem to have a use-case yet, I would simply to consider to surround the declaration with an a config which can be selected if unaligned access is supported. Cheers,
On 05.12.23 14:31, Julien Grall wrote: > Hi Juergen, > > On 05/12/2023 12:39, Juergen Gross wrote: >> On 05.12.23 12:53, Julien Grall wrote: >>> Hi Juergen, >>> >>> On 05/12/2023 10:07, Juergen Gross wrote: >>>> Instead of defining get_unaligned() and put_unaligned() in a way that >>>> is only supporting architectures allowing unaligned accesses, use the >>>> same approach as the Linux kernel and let the compiler do the >>>> decision how to generate the code for probably unaligned data accesses. >>>> >>>> Update include/xen/unaligned.h from include/asm-generic/unaligned.h of >>>> the Linux kernel. >>>> >>>> The generated code has been checked to be the same on x86. >>>> >>>> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >>>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >>>> 803f4e1eab7a >>>> Signed-off-by: Juergen Gross <jgross@suse.com> >>> >>> Can you outline your end goal? At least on arm32, I believe this will result >>> to abort because event if the architecture support unaligned access, we are >>> preventing them on Arm32. >> >> I need something like that in Xen tools for supporting packed data accesses >> on the 9pfs ring page, so I looked into the hypervisor for related support. > > Did we really introduce an ABI requiring unaligned access??? Or is this > something you are coming up with? This is the 9pfs protocol (see [1]). > Anyway, IIRC Linux allows unaligned access. So the problem I am describing is > only for the hypervisor. Although, I would like to point out that unaligned > access has no atomicity guarantee. I assume this is not going to be a concern > for you? Correct. > >> I guess for arm32 using -mno-unaligned-access when building should avoid any >> unaligned accesses? > > I am not sure. This is implies the compiler will be able to infer that the > access will be unaligned. Is this always the case? This should happen through the "__packed" attribute on the access macros. As e.g. MIPS doesn't support unaligned accesses, but is working with those access macros in the Linux kernel, I suspect the attribute is doing its job. > Anyway, given you don't seem to have a use-case yet, I would simply to consider > to surround the declaration with an a config which can be selected if unaligned > access is supported. Like in xen/common/lzo.c et al? Those are compiled with CONFIG_X86 only today, but I guess other archs might need the decompressors in future, too. Juergen [1]: http://ericvh.github.io/9p-rfc/rfc9p2000.html
On 05/12/2023 13:41, Juergen Gross wrote: > On 05.12.23 14:31, Julien Grall wrote: >> Hi Juergen, >> >> On 05/12/2023 12:39, Juergen Gross wrote: >>> On 05.12.23 12:53, Julien Grall wrote: >>>> Hi Juergen, >>>> >>>> On 05/12/2023 10:07, Juergen Gross wrote: >>>>> Instead of defining get_unaligned() and put_unaligned() in a way that >>>>> is only supporting architectures allowing unaligned accesses, use the >>>>> same approach as the Linux kernel and let the compiler do the >>>>> decision how to generate the code for probably unaligned data >>>>> accesses. >>>>> >>>>> Update include/xen/unaligned.h from include/asm-generic/unaligned.h of >>>>> the Linux kernel. >>>>> >>>>> The generated code has been checked to be the same on x86. >>>>> >>>>> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >>>>> Origin: >>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >>>>> 803f4e1eab7a >>>>> Signed-off-by: Juergen Gross <jgross@suse.com> >>>> >>>> Can you outline your end goal? At least on arm32, I believe this >>>> will result to abort because event if the architecture support >>>> unaligned access, we are preventing them on Arm32. >>> >>> I need something like that in Xen tools for supporting packed data >>> accesses >>> on the 9pfs ring page, so I looked into the hypervisor for related >>> support. >> >> Did we really introduce an ABI requiring unaligned access??? Or is >> this something you are coming up with? > > This is the 9pfs protocol (see [1]). Urgh :(. > >> Anyway, IIRC Linux allows unaligned access. So the problem I am >> describing is only for the hypervisor. Although, I would like to point >> out that unaligned access has no atomicity guarantee. I assume this is >> not going to be a concern for you? > > Correct. > >> >>> I guess for arm32 using -mno-unaligned-access when building should >>> avoid any >>> unaligned accesses? >> >> I am not sure. This is implies the compiler will be able to infer that >> the access will be unaligned. Is this always the case? > > This should happen through the "__packed" attribute on the access > macros. As > e.g. MIPS doesn't support unaligned accesses, but is working with those > access > macros in the Linux kernel, I suspect the attribute is doing its job. Someone will need to dig a bit deeper to confirm and also the impact on the rest of the hypervisor. > >> Anyway, given you don't seem to have a use-case yet, I would simply to >> consider to surround the declaration with an a config which can be >> selected if unaligned access is supported. > > Like in xen/common/lzo.c et al? Just to clarify, I am suggesting to add in unaligned.h: #ifdef CONFIG_HAS_UNALIGNED_ACCESS your definitions #endif And then for X86, select CONFIG_HAS_UNALIGNED_ACCESS. Those are compiled with CONFIG_X86 only > today, > but I guess other archs might need the decompressors in future, too. Possibly yes. But my point is that you don't have to solve the problem today. Yet I don't think it is wise to allow the header to be used on arm32 until we have done some investigation. And to clarify, I am not asking you to do the investigation. Cheers,
On 05.12.2023 11:07, Juergen Gross wrote: > @@ -15,67 +7,126 @@ > #include <asm/byteorder.h> > #endif > > -#define get_unaligned(p) (*(p)) > -#define put_unaligned(val, p) (*(p) = (val)) > +/* > + * This is the most generic implementation of unaligned accesses > + * and should work almost anywhere. > + */ > + > +#define __get_unaligned_t(type, ptr) ({ \ > + const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ > + __pptr->x; \ > +}) > + > +#define __put_unaligned_t(type, val, ptr) do { \ > + struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ > + __pptr->x = (val); \ > +} while (0) Please can we avoid gaining new (even double) leading underscore symbols, especially when they're in helper (i.e. not intended to be used directly) constructs? No reason to introduce further issues wrt Misra adoption. > +#define get_unaligned(ptr) __get_unaligned_t(typeof(*(ptr)), (ptr)) > +#define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr)) May I ask to omit excess parentheses where possible? > +static inline u16 get_unaligned_le16(const void *p) > +{ > + return le16_to_cpu(__get_unaligned_t(__le16, p)); > +} > + > +static inline u32 get_unaligned_le32(const void *p) > +{ > + return le32_to_cpu(__get_unaligned_t(__le32, p)); > +} > + > +static inline u64 get_unaligned_le64(const void *p) > +{ > + return le64_to_cpu(__get_unaligned_t(__le64, p)); > +} > + > +static inline void put_unaligned_le16(u16 val, void *p) > +{ > + __put_unaligned_t(__le16, cpu_to_le16(val), p); > +} > + > +static inline void put_unaligned_le32(u32 val, void *p) > +{ > + __put_unaligned_t(__le32, cpu_to_le32(val), p); > +} > + > +static inline void put_unaligned_le64(u64 val, void *p) > +{ > + __put_unaligned_t(__le64, cpu_to_le64(val), p); > +} > + > +static inline u16 get_unaligned_be16(const void *p) > +{ > + return be16_to_cpu(__get_unaligned_t(__be16, p)); > +} > + > +static inline u32 get_unaligned_be32(const void *p) > +{ > + return be32_to_cpu(__get_unaligned_t(__be32, p)); > +} > > -static inline uint16_t get_unaligned_be16(const void *p) > +static inline u64 get_unaligned_be64(const void *p) > { > - return be16_to_cpup(p); > + return be64_to_cpu(__get_unaligned_t(__be64, p)); > } > > -static inline void put_unaligned_be16(uint16_t val, void *p) > +static inline void put_unaligned_be16(u16 val, void *p) > { > - *(__force __be16*)p = cpu_to_be16(val); > + __put_unaligned_t(__be16, cpu_to_be16(val), p); > } > > -static inline uint32_t get_unaligned_be32(const void *p) > +static inline void put_unaligned_be32(u32 val, void *p) > { > - return be32_to_cpup(p); > + __put_unaligned_t(__be32, cpu_to_be32(val), p); > } > > -static inline void put_unaligned_be32(uint32_t val, void *p) > +static inline void put_unaligned_be64(u64 val, void *p) > { > - *(__force __be32*)p = cpu_to_be32(val); > + __put_unaligned_t(__be64, cpu_to_be64(val), p); > } > > -static inline uint64_t get_unaligned_be64(const void *p) > +static inline u32 __get_unaligned_be24(const u8 *p) Here and below - do you foresee use of these 24-bit helpers? They weren't there before, and the description also doesn't justify their introduction. Jan
On 05.12.2023 14:46, Julien Grall wrote: > On 05/12/2023 13:41, Juergen Gross wrote: >> On 05.12.23 14:31, Julien Grall wrote: >>> Anyway, given you don't seem to have a use-case yet, I would simply to >>> consider to surround the declaration with an a config which can be >>> selected if unaligned access is supported. >> >> Like in xen/common/lzo.c et al? > > Just to clarify, I am suggesting to add in unaligned.h: > > #ifdef CONFIG_HAS_UNALIGNED_ACCESS > > your definitions > > #endif But that would be wrong: HAS_UNALIGNED_ACCESS would be there to indicate one does _not_ need any special accessors. Jan
Hi, On 05/12/2023 13:59, Jan Beulich wrote: > On 05.12.2023 14:46, Julien Grall wrote: >> On 05/12/2023 13:41, Juergen Gross wrote: >>> On 05.12.23 14:31, Julien Grall wrote: >>>> Anyway, given you don't seem to have a use-case yet, I would simply to >>>> consider to surround the declaration with an a config which can be >>>> selected if unaligned access is supported. >>> >>> Like in xen/common/lzo.c et al? >> >> Just to clarify, I am suggesting to add in unaligned.h: >> >> #ifdef CONFIG_HAS_UNALIGNED_ACCESS >> >> your definitions >> >> #endif > > But that would be wrong: HAS_UNALIGNED_ACCESS would be there to indicate > one does _not_ need any special accessors. I am guessing you are disagreeing on the name rather than the concept? If so, what about CONFIG_UNALIGNED_ACCESS_ALLOWED? Cheers,
On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote: > On 05/12/2023 13:59, Jan Beulich wrote: >> On 05.12.2023 14:46, Julien Grall wrote: >>> On 05/12/2023 13:41, Juergen Gross wrote: >>>> On 05.12.23 14:31, Julien Grall wrote: >>>>> Anyway, given you don't seem to have a use-case yet, I would simply to >>>>> consider to surround the declaration with an a config which can be >>>>> selected if unaligned access is supported. >>>> >>>> Like in xen/common/lzo.c et al? >>> >>> Just to clarify, I am suggesting to add in unaligned.h: >>> >>> #ifdef CONFIG_HAS_UNALIGNED_ACCESS >>> >>> your definitions >>> >>> #endif >> >> But that would be wrong: HAS_UNALIGNED_ACCESS would be there to indicate >> one does _not_ need any special accessors. > > I am guessing you are disagreeing on the name rather than the concept? > If so, what about CONFIG_UNALIGNED_ACCESS_ALLOWED? This would repeat the mistake that we had in Linux in the past (and still do in some drivers): Simply dereferencing a misaligned pointer is always a bug, even on architectures that have efficient unaligned load/store instructions, because C makes this undefined behavior and gcc has optimizations that assume e.g. 'int *' to never have the lower two bits set [1]. The helpers that Jürgen copied from Linux is what we use to abstract accesses to objects that we know may be misaligned. If the underlying ISA allows a direct access (e.g. on arm64 and x86), this is as efficient as a normal pointer access but prevents the dangerous optimizations in gcc. On architectures without these instructions, the access will be turned into safe bytewise access. This is similar to a __packed annotation on a data structure, but also works in cases where such an annotation wouldn't work for other reasons. Arnd [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
On 05.12.23 14:55, Jan Beulich wrote: > On 05.12.2023 11:07, Juergen Gross wrote: >> @@ -15,67 +7,126 @@ >> #include <asm/byteorder.h> >> #endif >> >> -#define get_unaligned(p) (*(p)) >> -#define put_unaligned(val, p) (*(p) = (val)) >> +/* >> + * This is the most generic implementation of unaligned accesses >> + * and should work almost anywhere. >> + */ >> + >> +#define __get_unaligned_t(type, ptr) ({ \ >> + const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ >> + __pptr->x; \ >> +}) >> + >> +#define __put_unaligned_t(type, val, ptr) do { \ >> + struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ >> + __pptr->x = (val); \ >> +} while (0) > > Please can we avoid gaining new (even double) leading underscore symbols, > especially when they're in helper (i.e. not intended to be used directly) > constructs? No reason to introduce further issues wrt Misra adoption. > >> +#define get_unaligned(ptr) __get_unaligned_t(typeof(*(ptr)), (ptr)) >> +#define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr)) > > May I ask to omit excess parentheses where possible? > >> +static inline u16 get_unaligned_le16(const void *p) >> +{ >> + return le16_to_cpu(__get_unaligned_t(__le16, p)); >> +} >> + >> +static inline u32 get_unaligned_le32(const void *p) >> +{ >> + return le32_to_cpu(__get_unaligned_t(__le32, p)); >> +} >> + >> +static inline u64 get_unaligned_le64(const void *p) >> +{ >> + return le64_to_cpu(__get_unaligned_t(__le64, p)); >> +} >> + >> +static inline void put_unaligned_le16(u16 val, void *p) >> +{ >> + __put_unaligned_t(__le16, cpu_to_le16(val), p); >> +} >> + >> +static inline void put_unaligned_le32(u32 val, void *p) >> +{ >> + __put_unaligned_t(__le32, cpu_to_le32(val), p); >> +} >> + >> +static inline void put_unaligned_le64(u64 val, void *p) >> +{ >> + __put_unaligned_t(__le64, cpu_to_le64(val), p); >> +} >> + >> +static inline u16 get_unaligned_be16(const void *p) >> +{ >> + return be16_to_cpu(__get_unaligned_t(__be16, p)); >> +} >> + >> +static inline u32 get_unaligned_be32(const void *p) >> +{ >> + return be32_to_cpu(__get_unaligned_t(__be32, p)); >> +} >> >> -static inline uint16_t get_unaligned_be16(const void *p) >> +static inline u64 get_unaligned_be64(const void *p) >> { >> - return be16_to_cpup(p); >> + return be64_to_cpu(__get_unaligned_t(__be64, p)); >> } >> >> -static inline void put_unaligned_be16(uint16_t val, void *p) >> +static inline void put_unaligned_be16(u16 val, void *p) >> { >> - *(__force __be16*)p = cpu_to_be16(val); >> + __put_unaligned_t(__be16, cpu_to_be16(val), p); >> } >> >> -static inline uint32_t get_unaligned_be32(const void *p) >> +static inline void put_unaligned_be32(u32 val, void *p) >> { >> - return be32_to_cpup(p); >> + __put_unaligned_t(__be32, cpu_to_be32(val), p); >> } >> >> -static inline void put_unaligned_be32(uint32_t val, void *p) >> +static inline void put_unaligned_be64(u64 val, void *p) >> { >> - *(__force __be32*)p = cpu_to_be32(val); >> + __put_unaligned_t(__be64, cpu_to_be64(val), p); >> } >> >> -static inline uint64_t get_unaligned_be64(const void *p) >> +static inline u32 __get_unaligned_be24(const u8 *p) > > Here and below - do you foresee use of these 24-bit helpers? They weren't > there before, and the description also doesn't justify their introduction. I have just applied the patch from the kernel (see Origin: tag). I can change the patch according to your comments, of course. Juergen
On 05.12.2023 15:01, Julien Grall wrote: > On 05/12/2023 13:59, Jan Beulich wrote: >> On 05.12.2023 14:46, Julien Grall wrote: >>> On 05/12/2023 13:41, Juergen Gross wrote: >>>> On 05.12.23 14:31, Julien Grall wrote: >>>>> Anyway, given you don't seem to have a use-case yet, I would simply to >>>>> consider to surround the declaration with an a config which can be >>>>> selected if unaligned access is supported. >>>> >>>> Like in xen/common/lzo.c et al? >>> >>> Just to clarify, I am suggesting to add in unaligned.h: >>> >>> #ifdef CONFIG_HAS_UNALIGNED_ACCESS >>> >>> your definitions >>> >>> #endif >> >> But that would be wrong: HAS_UNALIGNED_ACCESS would be there to indicate >> one does _not_ need any special accessors. > > I am guessing you are disagreeing on the name rather than the concept? > If so, what about CONFIG_UNALIGNED_ACCESS_ALLOWED? Not really, no. Of course the name needs to properly express the purpose. But I don't see why a Kconfig control would be appropriate here. You simply want to provide accessors to unaligned data. Nobody needs to use them, but when you have to, they ought to be there. A Kconfig control (of the name you suggested first) would be helpful to not penalize architectures which can do unaligned accesses without any helpers (in case the code generated through the helpers turned out sub-optimal). Jan
On 05.12.23 14:46, Julien Grall wrote: > > > On 05/12/2023 13:41, Juergen Gross wrote: >> On 05.12.23 14:31, Julien Grall wrote: >>> Hi Juergen, >>> >>> On 05/12/2023 12:39, Juergen Gross wrote: >>>> On 05.12.23 12:53, Julien Grall wrote: >>>>> Hi Juergen, >>>>> >>>>> On 05/12/2023 10:07, Juergen Gross wrote: >>>>>> Instead of defining get_unaligned() and put_unaligned() in a way that >>>>>> is only supporting architectures allowing unaligned accesses, use the >>>>>> same approach as the Linux kernel and let the compiler do the >>>>>> decision how to generate the code for probably unaligned data accesses. >>>>>> >>>>>> Update include/xen/unaligned.h from include/asm-generic/unaligned.h of >>>>>> the Linux kernel. >>>>>> >>>>>> The generated code has been checked to be the same on x86. >>>>>> >>>>>> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >>>>>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >>>>>> 803f4e1eab7a >>>>>> Signed-off-by: Juergen Gross <jgross@suse.com> >>>>> >>>>> Can you outline your end goal? At least on arm32, I believe this will >>>>> result to abort because event if the architecture support unaligned access, >>>>> we are preventing them on Arm32. >>>> >>>> I need something like that in Xen tools for supporting packed data accesses >>>> on the 9pfs ring page, so I looked into the hypervisor for related support. >>> >>> Did we really introduce an ABI requiring unaligned access??? Or is this >>> something you are coming up with? >> >> This is the 9pfs protocol (see [1]). > > Urgh :(. > >> >>> Anyway, IIRC Linux allows unaligned access. So the problem I am describing is >>> only for the hypervisor. Although, I would like to point out that unaligned >>> access has no atomicity guarantee. I assume this is not going to be a concern >>> for you? >> >> Correct. >> >>> >>>> I guess for arm32 using -mno-unaligned-access when building should avoid any >>>> unaligned accesses? >>> >>> I am not sure. This is implies the compiler will be able to infer that the >>> access will be unaligned. Is this always the case? >> >> This should happen through the "__packed" attribute on the access macros. As >> e.g. MIPS doesn't support unaligned accesses, but is working with those access >> macros in the Linux kernel, I suspect the attribute is doing its job. > > Someone will need to dig a bit deeper to confirm and also the impact on the rest > of the hypervisor. > >> >>> Anyway, given you don't seem to have a use-case yet, I would simply to >>> consider to surround the declaration with an a config which can be selected >>> if unaligned access is supported. >> >> Like in xen/common/lzo.c et al? > > Just to clarify, I am suggesting to add in unaligned.h: > > #ifdef CONFIG_HAS_UNALIGNED_ACCESS > > your definitions > > #endif > > And then for X86, select CONFIG_HAS_UNALIGNED_ACCESS. > > Those are compiled with CONFIG_X86 only >> today, >> but I guess other archs might need the decompressors in future, too. > Possibly yes. But my point is that you don't have to solve the problem today. > Yet I don't think it is wise to allow the header to be used on arm32 until we > have done some investigation. > > And to clarify, I am not asking you to do the investigation. I've done a quick verification using gcc 7.5. Using -mno-unaligned-access for 32-bit Arm seems to do the job: #include <xen/unaligned.h> int tst(const unsigned short *in) { return get_unaligned(in); } results in: 00000000 <tst>: 0: e52db004 push {fp} @ (str fp, [sp, #-4]!) 4: e28db000 add fp, sp, #0 8: e5d03000 ldrb r3, [r0] c: e5d00001 ldrb r0, [r0, #1] 10: e1830400 orr r0, r3, r0, lsl #8 14: e28bd000 add sp, fp, #0 18: e49db004 pop {fp} @ (ldr fp, [sp], #4) 1c: e12fff1e bx lr Without the -mno-unaligned-access I'm getting: 00000000 <tst>: 0: e52db004 push {fp} @ (str fp, [sp, #-4]!) 4: e28db000 add fp, sp, #0 8: e1d000b0 ldrh r0, [r0] c: e28bd000 add sp, fp, #0 10: e49db004 pop {fp} @ (ldr fp, [sp], #4) 14: e12fff1e bx lr Juergen
Hi Arnd, Thanks for the answer. On 05/12/2023 14:10, Arnd Bergmann wrote: > On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote: >> On 05/12/2023 13:59, Jan Beulich wrote: >>> On 05.12.2023 14:46, Julien Grall wrote: >>>> On 05/12/2023 13:41, Juergen Gross wrote: >>>>> On 05.12.23 14:31, Julien Grall wrote: >>>>>> Anyway, given you don't seem to have a use-case yet, I would simply to >>>>>> consider to surround the declaration with an a config which can be >>>>>> selected if unaligned access is supported. >>>>> >>>>> Like in xen/common/lzo.c et al? >>>> >>>> Just to clarify, I am suggesting to add in unaligned.h: >>>> >>>> #ifdef CONFIG_HAS_UNALIGNED_ACCESS >>>> >>>> your definitions >>>> >>>> #endif >>> >>> But that would be wrong: HAS_UNALIGNED_ACCESS would be there to indicate >>> one does _not_ need any special accessors. >> >> I am guessing you are disagreeing on the name rather than the concept? >> If so, what about CONFIG_UNALIGNED_ACCESS_ALLOWED? > > This would repeat the mistake that we had in Linux in the > past (and still do in some drivers): Simply dereferencing > a misaligned pointer is always a bug, even on architectures > that have efficient unaligned load/store instructions, > because C makes this undefined behavior and gcc has > optimizations that assume e.g. 'int *' to never have > the lower two bits set [1]. Just to clarify, I haven't suggested to use 'int *'. My point was more that I don't think that the helpers would work as-is on arm32 because even if the ISA allows a direct access, we are setting the bit in SCTLR to disable unaligned access. As Juergen is proposing a common header, then I could ask him to do the work to confirm that the helpers properly work on arm32. But I think this is unfair. I also don't have the time to chase to look at it and this is not yet used in code reached by Arm. Hence my suggestion for now to protect the code so if someone tries to use them, then they know that it doesn't (yet) work on Arm. Cheers,
On Tue, Dec 5, 2023, at 15:19, Julien Grall wrote: > On 05/12/2023 14:10, Arnd Bergmann wrote: >> On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote: >>> On 05/12/2023 13:59, Jan Beulich wrote: >>>> On 05.12.2023 14:46, Julien Grall wrote: >> This would repeat the mistake that we had in Linux in the >> past (and still do in some drivers): Simply dereferencing >> a misaligned pointer is always a bug, even on architectures >> that have efficient unaligned load/store instructions, >> because C makes this undefined behavior and gcc has >> optimizations that assume e.g. 'int *' to never have >> the lower two bits set [1]. > > Just to clarify, I haven't suggested to use 'int *'. My point was more > that I don't think that the helpers would work as-is on arm32 because > even if the ISA allows a direct access, we are setting the bit in SCTLR > to disable unaligned access. > > As Juergen is proposing a common header, then I could ask him to do the > work to confirm that the helpers properly work on arm32. But I think > this is unfair. When I introduced the helpers in Linux, I showed that these produce the best output on all modern compilers (at least gcc-5, probably earlier) for both architectures that allow unaligned access and for those that don't. We used to have architecture specific helpers depending on what each architecture could do, but all the other variants we had were either wrong or less efficient. If for some reason an Arm system is configured to trap all unaligned access, then you must already pass -mno-unaligned-access to the compiler to prevent certain optimizations, and then the helpers will still behave correctly (the same way they do on armv5, which never has unaligned access). On armv7 with -munaligned-access, the same functions only prevent the use of stm/ldm and strd/ldrd but still use ldr/str. Arnd
Hi Arnd, On 05/12/2023 14:37, Arnd Bergmann wrote: > On Tue, Dec 5, 2023, at 15:19, Julien Grall wrote: >> On 05/12/2023 14:10, Arnd Bergmann wrote: >>> On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote: >>>> On 05/12/2023 13:59, Jan Beulich wrote: >>>>> On 05.12.2023 14:46, Julien Grall wrote: >>> This would repeat the mistake that we had in Linux in the >>> past (and still do in some drivers): Simply dereferencing >>> a misaligned pointer is always a bug, even on architectures >>> that have efficient unaligned load/store instructions, >>> because C makes this undefined behavior and gcc has >>> optimizations that assume e.g. 'int *' to never have >>> the lower two bits set [1]. >> >> Just to clarify, I haven't suggested to use 'int *'. My point was more >> that I don't think that the helpers would work as-is on arm32 because >> even if the ISA allows a direct access, we are setting the bit in SCTLR >> to disable unaligned access. >> >> As Juergen is proposing a common header, then I could ask him to do the >> work to confirm that the helpers properly work on arm32. But I think >> this is unfair. > > When I introduced the helpers in Linux, I showed that these > produce the best output on all modern compilers (at least gcc-5, > probably earlier) for both architectures that allow unaligned > access and for those that don't. We used to have architecture > specific helpers depending on what each architecture could > do, but all the other variants we had were either wrong or > less efficient. > > If for some reason an Arm system is configured to trap > all unaligned access, then you must already pass > -mno-unaligned-access to the compiler to prevent certain > optimizations, and then the helpers will still behave > correctly (the same way they do on armv5, which never has > unaligned access). On armv7 with -munaligned-access, the > same functions only prevent the use of stm/ldm and strd/ldrd > but still use ldr/str. Unfortunately we don't explicitely do. This would explain why I saw some issues with certain compiler [1]. So I agree that adding -mno-unaligned-access for arm32 makes sense. @Juergen, do you want me to send a patch? Cheers, [1] c71163f6-2646-6fae-cb22-600eb0486539@xen.org
On 05.12.23 17:29, Julien Grall wrote: > Hi Arnd, > > On 05/12/2023 14:37, Arnd Bergmann wrote: >> On Tue, Dec 5, 2023, at 15:19, Julien Grall wrote: >>> On 05/12/2023 14:10, Arnd Bergmann wrote: >>>> On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote: >>>>> On 05/12/2023 13:59, Jan Beulich wrote: >>>>>> On 05.12.2023 14:46, Julien Grall wrote: >>>> This would repeat the mistake that we had in Linux in the >>>> past (and still do in some drivers): Simply dereferencing >>>> a misaligned pointer is always a bug, even on architectures >>>> that have efficient unaligned load/store instructions, >>>> because C makes this undefined behavior and gcc has >>>> optimizations that assume e.g. 'int *' to never have >>>> the lower two bits set [1]. >>> >>> Just to clarify, I haven't suggested to use 'int *'. My point was more >>> that I don't think that the helpers would work as-is on arm32 because >>> even if the ISA allows a direct access, we are setting the bit in SCTLR >>> to disable unaligned access. >>> >>> As Juergen is proposing a common header, then I could ask him to do the >>> work to confirm that the helpers properly work on arm32. But I think >>> this is unfair. >> >> When I introduced the helpers in Linux, I showed that these >> produce the best output on all modern compilers (at least gcc-5, >> probably earlier) for both architectures that allow unaligned >> access and for those that don't. We used to have architecture >> specific helpers depending on what each architecture could >> do, but all the other variants we had were either wrong or >> less efficient. >> >> If for some reason an Arm system is configured to trap >> all unaligned access, then you must already pass >> -mno-unaligned-access to the compiler to prevent certain >> optimizations, and then the helpers will still behave >> correctly (the same way they do on armv5, which never has >> unaligned access). On armv7 with -munaligned-access, the >> same functions only prevent the use of stm/ldm and strd/ldrd >> but still use ldr/str. > > Unfortunately we don't explicitely do. This would explain why I saw some issues > with certain compiler [1]. > > So I agree that adding -mno-unaligned-access for arm32 makes sense. > > @Juergen, do you want me to send a patch? Yes, will do. Juergen
diff --git a/xen/include/xen/unaligned.h b/xen/include/xen/unaligned.h index 0a2b16d05d..325d9f875f 100644 --- a/xen/include/xen/unaligned.h +++ b/xen/include/xen/unaligned.h @@ -1,12 +1,4 @@ -/* - * This header can be used by architectures where unaligned accesses work - * without faulting, and at least reasonably efficiently. Other architectures - * will need to have a custom asm/unaligned.h. - */ -#ifndef __ASM_UNALIGNED_H__ -#error "xen/unaligned.h should not be included directly - include asm/unaligned.h instead" -#endif - +/* SPDX-License-Identifier: GPL-2.0 */ #ifndef __XEN_UNALIGNED_H__ #define __XEN_UNALIGNED_H__ @@ -15,67 +7,126 @@ #include <asm/byteorder.h> #endif -#define get_unaligned(p) (*(p)) -#define put_unaligned(val, p) (*(p) = (val)) +/* + * This is the most generic implementation of unaligned accesses + * and should work almost anywhere. + */ + +#define __get_unaligned_t(type, ptr) ({ \ + const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x; \ +}) + +#define __put_unaligned_t(type, val, ptr) do { \ + struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x = (val); \ +} while (0) + +#define get_unaligned(ptr) __get_unaligned_t(typeof(*(ptr)), (ptr)) +#define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr)) + +static inline u16 get_unaligned_le16(const void *p) +{ + return le16_to_cpu(__get_unaligned_t(__le16, p)); +} + +static inline u32 get_unaligned_le32(const void *p) +{ + return le32_to_cpu(__get_unaligned_t(__le32, p)); +} + +static inline u64 get_unaligned_le64(const void *p) +{ + return le64_to_cpu(__get_unaligned_t(__le64, p)); +} + +static inline void put_unaligned_le16(u16 val, void *p) +{ + __put_unaligned_t(__le16, cpu_to_le16(val), p); +} + +static inline void put_unaligned_le32(u32 val, void *p) +{ + __put_unaligned_t(__le32, cpu_to_le32(val), p); +} + +static inline void put_unaligned_le64(u64 val, void *p) +{ + __put_unaligned_t(__le64, cpu_to_le64(val), p); +} + +static inline u16 get_unaligned_be16(const void *p) +{ + return be16_to_cpu(__get_unaligned_t(__be16, p)); +} + +static inline u32 get_unaligned_be32(const void *p) +{ + return be32_to_cpu(__get_unaligned_t(__be32, p)); +} -static inline uint16_t get_unaligned_be16(const void *p) +static inline u64 get_unaligned_be64(const void *p) { - return be16_to_cpup(p); + return be64_to_cpu(__get_unaligned_t(__be64, p)); } -static inline void put_unaligned_be16(uint16_t val, void *p) +static inline void put_unaligned_be16(u16 val, void *p) { - *(__force __be16*)p = cpu_to_be16(val); + __put_unaligned_t(__be16, cpu_to_be16(val), p); } -static inline uint32_t get_unaligned_be32(const void *p) +static inline void put_unaligned_be32(u32 val, void *p) { - return be32_to_cpup(p); + __put_unaligned_t(__be32, cpu_to_be32(val), p); } -static inline void put_unaligned_be32(uint32_t val, void *p) +static inline void put_unaligned_be64(u64 val, void *p) { - *(__force __be32*)p = cpu_to_be32(val); + __put_unaligned_t(__be64, cpu_to_be64(val), p); } -static inline uint64_t get_unaligned_be64(const void *p) +static inline u32 __get_unaligned_be24(const u8 *p) { - return be64_to_cpup(p); + return p[0] << 16 | p[1] << 8 | p[2]; } -static inline void put_unaligned_be64(uint64_t val, void *p) +static inline u32 get_unaligned_be24(const void *p) { - *(__force __be64*)p = cpu_to_be64(val); + return __get_unaligned_be24(p); } -static inline uint16_t get_unaligned_le16(const void *p) +static inline u32 __get_unaligned_le24(const u8 *p) { - return le16_to_cpup(p); + return p[0] | p[1] << 8 | p[2] << 16; } -static inline void put_unaligned_le16(uint16_t val, void *p) +static inline u32 get_unaligned_le24(const void *p) { - *(__force __le16*)p = cpu_to_le16(val); + return __get_unaligned_le24(p); } -static inline uint32_t get_unaligned_le32(const void *p) +static inline void __put_unaligned_be24(const u32 val, u8 *p) { - return le32_to_cpup(p); + *p++ = val >> 16; + *p++ = val >> 8; + *p++ = val; } -static inline void put_unaligned_le32(uint32_t val, void *p) +static inline void put_unaligned_be24(const u32 val, void *p) { - *(__force __le32*)p = cpu_to_le32(val); + __put_unaligned_be24(val, p); } -static inline uint64_t get_unaligned_le64(const void *p) +static inline void __put_unaligned_le24(const u32 val, u8 *p) { - return le64_to_cpup(p); + *p++ = val; + *p++ = val >> 8; + *p++ = val >> 16; } -static inline void put_unaligned_le64(uint64_t val, void *p) +static inline void put_unaligned_le24(const u32 val, void *p) { - *(__force __le64*)p = cpu_to_le64(val); + __put_unaligned_le24(val, p); } #endif /* __XEN_UNALIGNED_H__ */