[v5,12/23] xen/riscv: introduce io.h

Message ID	dd7c95b5197dfd0cca0edf9c0ada631336eb60d7.1708962629.git.oleksii.kurochko@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org> From: Oleksii Kurochko <oleksii.kurochko@gmail.com> To: xen-devel@lists.xenproject.org Cc: Oleksii Kurochko <oleksii.kurochko@gmail.com>, Alistair Francis <alistair.francis@wdc.com>, Bob Eshleman <bobbyeshleman@gmail.com>, Connor Davis <connojdavis@gmail.com>, Andrew Cooper <andrew.cooper3@citrix.com>, George Dunlap <george.dunlap@citrix.com>, Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>, Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org> Subject: [PATCH v5 12/23] xen/riscv: introduce io.h Date: Mon, 26 Feb 2024 18:38:54 +0100 Message-ID: <dd7c95b5197dfd0cca0edf9c0ada631336eb60d7.1708962629.git.oleksii.kurochko@gmail.com> In-Reply-To: <cover.1708962629.git.oleksii.kurochko@gmail.com> References: <cover.1708962629.git.oleksii.kurochko@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	Enable build of full Xen for RISC-V \| expand [v5,00/23,v4,00/30] Enable build of full Xen for RISC-V [v5,01/23] xen/riscv: disable unnecessary configs [v5,02/23] xen/riscv: use some asm-generic headers [v5,03/23] xen/riscv: introduce nospec.h [v5,04/23] xen/asm-generic: introduce generic fls() and flsl() functions [v5,05/23] xen/asm-generic: introduce generic find first set bit functions [v5,06/23] xen/asm-generic: introduce generic ffz() [v5,07/23] xen/asm-generic: introduce generic hweight64() [v5,08/23] xen/asm-generic: introduce generic non-atomic test_bit() [v5,09/23] xen/riscv: introduce bitops.h [v5,10/23] xen/riscv: introduces acrquire, release and full barriers [v5,11/23] xen/riscv: introduce cmpxchg.h [v5,12/23] xen/riscv: introduce io.h [v5,13/23] xen/riscv: introduce atomic.h [v5,14/23] xen/riscv: introduce monitor.h [v5,15/23] xen/riscv: add definition of __read_mostly [v5,16/23] xen/riscv: add required things to current.h [v5,17/23] xen/riscv: add minimal stuff to page.h to build full Xen [v5,18/23] xen/riscv: add minimal stuff to processor.h to build full Xen [v5,19/23] xen/riscv: add minimal stuff to mm.h to build full Xen [v5,20/23] xen/riscv: introduce vm_event_() functions [v5,21/23] xen/rirscv: add minimal amount of stubs to build full Xen [v5,22/23] xen/riscv: enable full Xen build [v5,23/23] xen/README: add compiler and binutils versions for RISC-V64

Oleksii Kurochko Feb. 26, 2024, 5:38 p.m. UTC

The header taken form Linux 6.4.0-rc1 and is based on
arch/riscv/include/asm/mmio.h with the following changes:
- drop forcing of endianess for read*(), write*() functions as
  no matter what CPU endianness, what endianness a particular device
  (and hence its MMIO region(s)) is using is entirely independent.
  Hence conversion, where necessary, needs to occur at a layer up.
  Another one reason to drop endianess conversion here is:
  https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
  One of the answers of the author of the commit:
    And we don't know if Linux will be around if that ever changes.
    The point is:
     a) the current RISC-V spec is LE only
     b) the current linux port is LE only except for this little bit
    There is no point in leaving just this bitrotting code around.  It
    just confuses developers, (very very slightly) slows down compiles
    and will bitrot.  It also won't be any significant help to a future
    developer down the road doing a hypothetical BE RISC-V Linux port.
- drop unused argument of __io_ar() macros.
- drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
  as they are unnessary.
- Adopt the Xen code style for this header, considering that significant changes
  are not anticipated in the future.
  In the event of any issues, adapting them to Xen style should be easily
  manageable.
- drop unnessary __r variables in macros read*_cpu()

Addionally, to the header was added definions of ioremap_*().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Xen code style related fixes
 - drop #define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}
 - drop cpu_to_le16()
 - remove unuused argument in _io_ar()
 - update the commit message 
 - drop unnessary __r variables in macros read*_cpu()
 - update the comments at the top of the header.
---
Changes in V4:
 - delete inner parentheses in macros.
 - s/u<N>/uint<N>.
---
Changes in V3:
 - re-sync with linux kernel
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/io.h | 157 ++++++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/io.h

Jan Beulich March 6, 2024, 2:13 p.m. UTC | #1

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/io.h
> @@ -0,0 +1,157 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + *  The header taken form Linux 6.4.0-rc1 and is based on
> + *  arch/riscv/include/asm/mmio.h with the following changes:
> + *   - drop forcing of endianess for read*(), write*() functions as
> + *     no matter what CPU endianness, what endianness a particular device
> + *     (and hence its MMIO region(s)) is using is entirely independent.
> + *     Hence conversion, where necessary, needs to occur at a layer up.
> + *     Another one reason to drop endianess conversion is:
> + *     https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
> + *     One of the answers of the author of the commit:
> + *       And we don't know if Linux will be around if that ever changes.
> + *       The point is:
> + *        a) the current RISC-V spec is LE only
> + *        b) the current linux port is LE only except for this little bit
> + *       There is no point in leaving just this bitrotting code around.  It
> + *       just confuses developers, (very very slightly) slows down compiles
> +  *      and will bitrot.  It also won't be any significant help to a future

Nit: Stray extra leading blank.

> + *       developer down the road doing a hypothetical BE RISC-V Linux port.
> + *   - drop unused argument of __io_ar() macros.
> + *   - drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
> + *     as they are unnessary.

Nit: unnecessary (also ...

> + *   - Adopt the Xen code style for this header, considering that significant changes
> + *     are not anticipated in the future.
> + *     In the event of any issues, adapting them to Xen style should be easily
> + *     manageable.
> + *   - drop unnessary __r variables in macros read*_cpu()

... again here)

> + * Copyright (C) 1996-2000 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2014 Regents of the University of California
> + * Copyright (C) 2024 Vates
> + */
> +
> +#ifndef _ASM_RISCV_IO_H
> +#define _ASM_RISCV_IO_H
> +
> +#include <asm/byteorder.h>
> +
> +/*
> + * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> + * change the properties of memory regions.  This should be fixed by the
> + * upcoming platform spec.
> + */
> +#define ioremap_nocache(addr, size) ioremap(addr, size)
> +#define ioremap_wc(addr, size) ioremap(addr, size)
> +#define ioremap_wt(addr, size) ioremap(addr, size)
> +
> +/* Generic IO read/write.  These perform native-endian accesses. */
> +static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}

I realize this is like Linux has it, but how is the compiler to know that
*addr is being access here? If the omission of respective constraints here
and below is intentional, I think a comment (covering all instances) is
needed. Note that while supposedly cloned from Arm code, Arm variants do
have such constraints in Linux.

I'm sorry for not having paid (enough) attention earlier.

> +static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sh %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +
> +static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +
> +#ifdef CONFIG_64BIT
> +static inline void __raw_writeq(u64 val, volatile void __iomem *addr)

uint64_t please

> +{
> +    asm volatile ( "sd %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +#endif
> +
> +static inline uint8_t __raw_readb(const volatile void __iomem *addr)
> +{
> +    uint8_t val;
> +
> +    asm volatile ( "lb %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +static inline uint16_t __raw_readw(const volatile void __iomem *addr)
> +{
> +    uint16_t val;
> +
> +    asm volatile ( "lh %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +static inline uint32_t __raw_readl(const volatile void __iomem *addr)
> +{
> +    uint32_t val;
> +
> +    asm volatile ( "lw %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +#ifdef CONFIG_64BIT
> +static inline u64 __raw_readq(const volatile void __iomem *addr)

uint64_t please

> +{
> +    u64 val;

and again

> +    asm volatile ( "ld %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +#endif
> +
> +/*
> + * Unordered I/O memory access primitives.  These are even more relaxed than
> + * the relaxed versions, as they don't even order accesses between successive
> + * operations to the I/O regions.
> + */
> +#define readb_cpu(c)        __raw_readb(c)
> +#define readw_cpu(c)        __raw_readw(c)
> +#define readl_cpu(c)        __raw_readl(c)
> +
> +#define writeb_cpu(v, c)    __raw_writeb(v, c)
> +#define writew_cpu(v, c)    __raw_writew(v, c)
> +#define writel_cpu(v, c)    __raw_writel(v, c)
> +
> +#ifdef CONFIG_64BIT
> +#define readq_cpu(c)        __raw_readq(c)
> +#define writeq_cpu(v, c)    __raw_writeq(v, c)
> +#endif
> +
> +/*
> + * I/O memory access primitives. Reads are ordered relative to any
> + * following Normal memory access. Writes are ordered relative to any prior
> + * Normal memory access.  The memory barriers here are necessary as RISC-V
> + * doesn't define any ordering between the memory space and the I/O space.
> + */
> +#define __io_br()   do { } while (0)
> +#define __io_ar()   asm volatile ( "fence i,r" : : : "memory" );
> +#define __io_bw()   asm volatile ( "fence w,o" : : : "memory" );
> +#define __io_aw()   do { } while (0)
> +
> +#define readb(c)    ({ uint8_t  v; __io_br(); v = readb_cpu(c); __io_ar(); v; })
> +#define readw(c)    ({ uint16_t v; __io_br(); v = readw_cpu(c); __io_ar(); v; })
> +#define readl(c)    ({ uint32_t v; __io_br(); v = readl_cpu(c); __io_ar(); v; })
> +
> +#define writeb(v, c)    ({ __io_bw(); writeb_cpu(v, c); __io_aw(); })
> +#define writew(v, c)    ({ __io_bw(); writew_cpu(v, c); __io_aw(); })
> +#define writel(v, c)    ({ __io_bw(); writel_cpu(v, c); __io_aw(); })
> +
> +#ifdef CONFIG_64BIT
> +#define readq(c)        ({ uint64_t v; __io_br(); v = readq_cpu(c); __io_ar(); v; })
> +#define writeq(v, c)    ({ __io_bw(); writeq_cpu(v, c); __io_aw(); })
> +#endif

Overall looks much tidier now, thanks.

Jan

Oleksii Kurochko March 7, 2024, 1:01 p.m. UTC | #2

On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > +/* Generic IO read/write.  These perform native-endian accesses.
> > */
> > +static inline void __raw_writeb(uint8_t val, volatile void __iomem
> > *addr)
> > +{
> > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> > +}
> 
> I realize this is like Linux has it, but how is the compiler to know
> that
> *addr is being access here? 
Assembler syntax told compiler that. 0(%1) - means that the memory
location pointed to by the address in register %1.

> If the omission of respective constraints here
> and below is intentional, I think a comment (covering all instances)
> is
> needed. Note that while supposedly cloned from Arm code, Arm variants
> do
> have such constraints in Linux.
> 
It uses this constains only in arm32:
#define __raw_writeb __raw_writeb
static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
{
	asm volatile("strb %1, %0"
		     : : "Qo" (*(volatile u8 __force *)addr), "r"
(val));
}

But in case of arm64:

#define __raw_writeb __raw_writeb
static __always_inline void __raw_writeb(u8 val, volatile void __iomem
*addr)
{
	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
}

And again looking at the defintion they use different option of strb
instruction, and in case of strb they use [%1] which tells compiler
that %1 is addressed which should be dereferenced.

~ Oleksii

Jan Beulich March 7, 2024, 1:24 p.m. UTC | #3

On 07.03.2024 14:01, Oleksii wrote:
> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>> +/* Generic IO read/write.  These perform native-endian accesses.
>>> */
>>> +static inline void __raw_writeb(uint8_t val, volatile void __iomem
>>> *addr)
>>> +{
>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
>>> +}
>>
>> I realize this is like Linux has it, but how is the compiler to know
>> that
>> *addr is being access here? 
> Assembler syntax told compiler that. 0(%1) - means that the memory
> location pointed to by the address in register %1.

No, the compiler doesn't decompose the string to figure how operands
are used. That's what the constraints are for. The only two things the
compiler does with the string is replace % operators and count line
separators.

>> If the omission of respective constraints here
>> and below is intentional, I think a comment (covering all instances)
>> is
>> needed. Note that while supposedly cloned from Arm code, Arm variants
>> do
>> have such constraints in Linux.
>>
> It uses this constains only in arm32:
> #define __raw_writeb __raw_writeb
> static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> {
> 	asm volatile("strb %1, %0"
> 		     : : "Qo" (*(volatile u8 __force *)addr), "r"
> (val));
> }
> 
> But in case of arm64:
> 
> #define __raw_writeb __raw_writeb
> static __always_inline void __raw_writeb(u8 val, volatile void __iomem
> *addr)
> {
> 	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
> }
> 
> And again looking at the defintion they use different option of strb
> instruction, and in case of strb they use [%1] which tells compiler
> that %1 is addressed which should be dereferenced.

Same bug here then; I happened to look at Arm32 only. As mentioned in
the other patch using what's provided here, the problem becomes more
than latent only there. And I can't spot such use in Arm64 code, so it
is likely only a latent bug there.

Jan

Oleksii Kurochko March 7, 2024, 1:44 p.m. UTC | #4

On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
> On 07.03.2024 14:01, Oleksii wrote:
> > On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > > > +/* Generic IO read/write.  These perform native-endian
> > > > accesses.
> > > > */
> > > > +static inline void __raw_writeb(uint8_t val, volatile void
> > > > __iomem
> > > > *addr)
> > > > +{
> > > > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> > > > +}
> > > 
> > > I realize this is like Linux has it, but how is the compiler to
> > > know
> > > that
> > > *addr is being access here? 
> > Assembler syntax told compiler that. 0(%1) - means that the memory
> > location pointed to by the address in register %1.
> 
> No, the compiler doesn't decompose the string to figure how operands
> are used. That's what the constraints are for. The only two things
> the
> compiler does with the string is replace % operators and count line
> separators.
It looks like I am missing something.

addr -> a some register ( because of "r" contraint ).
val -> is also register ( because of "r" contraint ).

So the compiler will update instert an instruction:
 sb reg1, 0(reg2)

what means *(uint_8 *)reg2 = (uint8_t)reg1.

What am I missing?

~ Oleksii

Jan Beulich March 7, 2024, 3:32 p.m. UTC | #5

On 07.03.2024 14:44, Oleksii wrote:
> On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
>> On 07.03.2024 14:01, Oleksii wrote:
>>> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>>>> +/* Generic IO read/write.  These perform native-endian
>>>>> accesses.
>>>>> */
>>>>> +static inline void __raw_writeb(uint8_t val, volatile void
>>>>> __iomem
>>>>> *addr)
>>>>> +{
>>>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
>>>>> +}
>>>>
>>>> I realize this is like Linux has it, but how is the compiler to
>>>> know
>>>> that
>>>> *addr is being access here? 
>>> Assembler syntax told compiler that. 0(%1) - means that the memory
>>> location pointed to by the address in register %1.
>>
>> No, the compiler doesn't decompose the string to figure how operands
>> are used. That's what the constraints are for. The only two things
>> the
>> compiler does with the string is replace % operators and count line
>> separators.
> It looks like I am missing something.
> 
> addr -> a some register ( because of "r" contraint ).
> val -> is also register ( because of "r" contraint ).
> 
> So the compiler will update instert an instruction:
>  sb reg1, 0(reg2)
> 
> what means *(uint_8 *)reg2 = (uint8_t)reg1.
> 
> What am I missing?

The fact that the compiler will not know that *(uint_8 *)reg2 actually
changes across this asm(). It may therefore continue to hold a cached
value in a register, without knowing that its contents went stale.

Jan

Oleksii Kurochko March 7, 2024, 4:21 p.m. UTC | #6

On Thu, 2024-03-07 at 16:32 +0100, Jan Beulich wrote:
> On 07.03.2024 14:44, Oleksii wrote:
> > On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
> > > On 07.03.2024 14:01, Oleksii wrote:
> > > > On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > > > > > +/* Generic IO read/write.  These perform native-endian
> > > > > > accesses.
> > > > > > */
> > > > > > +static inline void __raw_writeb(uint8_t val, volatile void
> > > > > > __iomem
> > > > > > *addr)
> > > > > > +{
> > > > > > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r"
> > > > > > (addr) );
> > > > > > +}
> > > > > 
> > > > > I realize this is like Linux has it, but how is the compiler
> > > > > to
> > > > > know
> > > > > that
> > > > > *addr is being access here? 
> > > > Assembler syntax told compiler that. 0(%1) - means that the
> > > > memory
> > > > location pointed to by the address in register %1.
> > > 
> > > No, the compiler doesn't decompose the string to figure how
> > > operands
> > > are used. That's what the constraints are for. The only two
> > > things
> > > the
> > > compiler does with the string is replace % operators and count
> > > line
> > > separators.
> > It looks like I am missing something.
> > 
> > addr -> a some register ( because of "r" contraint ).
> > val -> is also register ( because of "r" contraint ).
> > 
> > So the compiler will update instert an instruction:
> >  sb reg1, 0(reg2)
> > 
> > what means *(uint_8 *)reg2 = (uint8_t)reg1.
> > 
> > What am I missing?
> 
> The fact that the compiler will not know that *(uint_8 *)reg2
> actually
> changes across this asm(). It may therefore continue to hold a cached
> value in a register, without knowing that its contents went stale.
Then it makes sense to me. Thanks. It explains why it is needed +Q, but
I don't understand why constraint 'o' isn't used for __raw_writew, but
was used for __raw_writeb:

   static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
   {
           asm volatile("strb %1, %0"
                        : "+Qo" (*(volatile u8 __force *)addr)
                        : "r" (val));
   }
   
   static inline void __raw_writew(u16 val, volatile void __iomem *addr)
   {
           asm volatile("strh %1, %0"
                        : "+Q" (*(volatile u16 __force *)addr)
                        : "r" (val));
   } 
   
If I understand correctly 'o' that the address is offsettable, so why
addr can not be offsettable for everyone case?

And one more thing, in Xen constraint "+" is used, but in Linux it was
dropped:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/1426958753-26903-1-git-send-email-peter@hurleysoftware.com/

To me it looks like constraints should be always "+Qo".

~ Oleksii

Jan Beulich March 7, 2024, 5:14 p.m. UTC | #7

On 07.03.2024 17:21, Oleksii wrote:
> On Thu, 2024-03-07 at 16:32 +0100, Jan Beulich wrote:
>> On 07.03.2024 14:44, Oleksii wrote:
>>> On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
>>>> On 07.03.2024 14:01, Oleksii wrote:
>>>>> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>>>>>> +/* Generic IO read/write.  These perform native-endian
>>>>>>> accesses.
>>>>>>> */
>>>>>>> +static inline void __raw_writeb(uint8_t val, volatile void
>>>>>>> __iomem
>>>>>>> *addr)
>>>>>>> +{
>>>>>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r"
>>>>>>> (addr) );
>>>>>>> +}
>>>>>>
>>>>>> I realize this is like Linux has it, but how is the compiler
>>>>>> to
>>>>>> know
>>>>>> that
>>>>>> *addr is being access here? 
>>>>> Assembler syntax told compiler that. 0(%1) - means that the
>>>>> memory
>>>>> location pointed to by the address in register %1.
>>>>
>>>> No, the compiler doesn't decompose the string to figure how
>>>> operands
>>>> are used. That's what the constraints are for. The only two
>>>> things
>>>> the
>>>> compiler does with the string is replace % operators and count
>>>> line
>>>> separators.
>>> It looks like I am missing something.
>>>
>>> addr -> a some register ( because of "r" contraint ).
>>> val -> is also register ( because of "r" contraint ).
>>>
>>> So the compiler will update instert an instruction:
>>>  sb reg1, 0(reg2)
>>>
>>> what means *(uint_8 *)reg2 = (uint8_t)reg1.
>>>
>>> What am I missing?
>>
>> The fact that the compiler will not know that *(uint_8 *)reg2
>> actually
>> changes across this asm(). It may therefore continue to hold a cached
>> value in a register, without knowing that its contents went stale.
> Then it makes sense to me. Thanks.

FTAOD similar considerations apply to memory reads. The compiler may need
to know that values held in registers first need writing back to memory
before an asm() can be invoked.

> It explains why it is needed +Q, but
> I don't understand why constraint 'o' isn't used for __raw_writew, but
> was used for __raw_writeb:
> 
>    static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>    {
>            asm volatile("strb %1, %0"
>                         : "+Qo" (*(volatile u8 __force *)addr)
>                         : "r" (val));
>    }
>    
>    static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>    {
>            asm volatile("strh %1, %0"
>                         : "+Q" (*(volatile u16 __force *)addr)
>                         : "r" (val));
>    } 
>    
> If I understand correctly 'o' that the address is offsettable, so why
> addr can not be offsettable for everyone case?

I don't know; there may be peculiarities in RISC-V specific constraints.

> And one more thing, in Xen constraint "+" is used, but in Linux it was
> dropped:
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/1426958753-26903-1-git-send-email-peter@hurleysoftware.com/
> 
> To me it looks like constraints should be always "+Qo".

For plain writes it should at least be "=Qo" then, yes. To me making those
input operands on Arm can't have been quite right.

Jan

Oleksii Kurochko March 7, 2024, 8:49 p.m. UTC | #8

On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> For plain writes it should at least be "=Qo" then, yes.
Constraints Q is a machine specific constraint, and I am not sure that
it makes sense to use "=o" only and probably it is a reason why it is
enough only "r". Does it make sense?

>  To me making those
> input operands on Arm can't have been quite right.
I  don't understand why they both are input, logically it looks like an
address should be an output.

~ Oleksii

Oleksii Kurochko March 7, 2024, 8:54 p.m. UTC | #9

On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > For plain writes it should at least be "=Qo" then, yes.
> Constraints Q is a machine specific constraint, and I am not sure
> that
> it makes sense to use "=o" only and probably it is a reason why it is
> enough only "r". Does it make sense?
Probably for RISC-V can be used:
RISC-V—config/riscv/constraints.md
   ...
   A
       An address that is held in a general-purpose register.
   ...

AArch64 family—config/aarch64/constraints.md:
   ...
   Q
       A memory address which uses a single base register with no
   offset
   ...
Also 'no offset' explains why it was added 'o' constraint for Arm
additionally.

~ Oleksii

> 
> >  To me making those
> > input operands on Arm can't have been quite right.
> I  don't understand why they both are input, logically it looks like
> an
> address should be an output.
> 
> ~ Oleksii
>

Jan Beulich March 8, 2024, 7:18 a.m. UTC | #10

On 07.03.2024 21:49, Oleksii wrote:
> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>> For plain writes it should at least be "=Qo" then, yes.
> Constraints Q is a machine specific constraint, and I am not sure that
> it makes sense to use "=o" only and probably it is a reason why it is
> enough only "r". Does it make sense?

Especially the 'only "r"' part doesn't, no. I also don't see why "=o"
would make no sense - that's fundamentally no different than "=m".
Unless the immediates used in the ultimate insns are large enough to
cover the full range of possible values, my take is that "o" is never
appropriate to use. With one exception in a case like we have here:
If the operand isn't used in the actual instruction(s), then that's
fine, as the specific value of the adjusted immediate wouldn't matter
at all.

As to "Q" - that's an Arm constraint anyway, not a RISC-V one? If so,
I'm not sure why we're discussing it here. In any event, I'd be curious
to understand in how far the combination "Qo" makes sense.

>>  To me making those
>> input operands on Arm can't have been quite right.
> I  don't understand why they both are input, logically it looks like an
> address should be an output.

How would an address be an output, when that's needed to be calculated
for the memory access to take place? It would be an output (and
presumably a dummy/fake one) only if the address calculation was done
in the asm() itself (and there I don't mean the operands, but the
actual assembly instruction(s)).

Jan

Jan Beulich March 8, 2024, 7:26 a.m. UTC | #11

On 07.03.2024 21:54, Oleksii wrote:
> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>> For plain writes it should at least be "=Qo" then, yes.
>> Constraints Q is a machine specific constraint, and I am not sure
>> that
>> it makes sense to use "=o" only and probably it is a reason why it is
>> enough only "r". Does it make sense?
> Probably for RISC-V can be used:
> RISC-V—config/riscv/constraints.md
>    ...
>    A
>        An address that is held in a general-purpose register.
>    ...

Just from the description I would have said no, but looking at what "A"
actually expands to it is indeed RISC-V's counterpart of Arm's "Q". So
yes, this looks like what amo* want to use, and then as a real operand,
not just a fake one.

> AArch64 family—config/aarch64/constraints.md:
>    ...
>    Q
>        A memory address which uses a single base register with no
>    offset
>    ...
> Also 'no offset' explains why it was added 'o' constraint for Arm
> additionally.

I don't think it does.

Jan

Oleksii Kurochko March 8, 2024, 10:14 a.m. UTC | #12

On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
> On 07.03.2024 21:54, Oleksii wrote:
> > On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> > > On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > > > For plain writes it should at least be "=Qo" then, yes.
> > > Constraints Q is a machine specific constraint, and I am not sure
> > > that
> > > it makes sense to use "=o" only and probably it is a reason why
> > > it is
> > > enough only "r". Does it make sense?
> > Probably for RISC-V can be used:
> > RISC-V—config/riscv/constraints.md
> >    ...
> >    A
> >        An address that is held in a general-purpose register.
> >    ...
> 
> Just from the description I would have said no, but looking at what
> "A"
> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
> So
> yes, this looks like what amo* want to use, and then as a real
> operand,
> not just a fake one.
I am not sure that I know how to check correctly how "A" expands, but I
tried to look at code which will be generated with and without
constraints and it is the same:
   // static inline void __raw_writel(uint32_t val, volatile void
   __iomem *addr)
   // {
   //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
   // }
   
   static inline void __raw_writel(uint32_t val, volatile void __iomem
   *addr)
   {
       asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
   uint32_t __force *)addr) );
   }
   
   ffffffffc003d774:       aabbd7b7                lui     a5,0xaabbd
   ffffffffc003d778:       cdd78793                add     a5,a5,-803 #
   ffffffffaabbccdd <start-0x15443323>
   ffffffffc003d77c:       f8f42423                sw      a5,-120(s0)
   ffffffffc003d780:       0140000f                fence   w,o
   

> 
> > AArch64 family—config/aarch64/constraints.md:
> >    ...
> >    Q
> >        A memory address which uses a single base register with no
> >    offset
> >    ...
> > Also 'no offset' explains why it was added 'o' constraint for Arm
> > additionally.
> 
> I don't think it does.
> 
> Jan

Jan Beulich March 8, 2024, 11:49 a.m. UTC | #13

On 08.03.2024 11:14, Oleksii wrote:
> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>> On 07.03.2024 21:54, Oleksii wrote:
>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>> Constraints Q is a machine specific constraint, and I am not sure
>>>> that
>>>> it makes sense to use "=o" only and probably it is a reason why
>>>> it is
>>>> enough only "r". Does it make sense?
>>> Probably for RISC-V can be used:
>>> RISC-V—config/riscv/constraints.md
>>>    ...
>>>    A
>>>        An address that is held in a general-purpose register.
>>>    ...
>>
>> Just from the description I would have said no, but looking at what
>> "A"
>> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
>> So
>> yes, this looks like what amo* want to use, and then as a real
>> operand,
>> not just a fake one.
> I am not sure that I know how to check correctly how "A" expands, but I
> tried to look at code which will be generated with and without
> constraints and it is the same:

As expected.

>    // static inline void __raw_writel(uint32_t val, volatile void
>    __iomem *addr)
>    // {
>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
>    // }
>    
>    static inline void __raw_writel(uint32_t val, volatile void __iomem
>    *addr)
>    {
>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>    uint32_t __force *)addr) );

You want just "A" here though; adding an offset (as "o" permits) would
yield an insn which the assembler would reject. Also just to remind
you: In write functions you need "=A" (and in amo ones "+A"), i.e. the
memory operand then needs to be an output, not an input.

Jan

Jan Beulich March 8, 2024, 11:52 a.m. UTC | #14

On 08.03.2024 12:49, Jan Beulich wrote:
> On 08.03.2024 11:14, Oleksii wrote:
>> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>>> On 07.03.2024 21:54, Oleksii wrote:
>>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>>> Constraints Q is a machine specific constraint, and I am not sure
>>>>> that
>>>>> it makes sense to use "=o" only and probably it is a reason why
>>>>> it is
>>>>> enough only "r". Does it make sense?
>>>> Probably for RISC-V can be used:
>>>> RISC-V—config/riscv/constraints.md
>>>>    ...
>>>>    A
>>>>        An address that is held in a general-purpose register.
>>>>    ...
>>>
>>> Just from the description I would have said no, but looking at what
>>> "A"
>>> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
>>> So
>>> yes, this looks like what amo* want to use, and then as a real
>>> operand,
>>> not just a fake one.
>> I am not sure that I know how to check correctly how "A" expands, but I
>> tried to look at code which will be generated with and without
>> constraints and it is the same:
> 
> As expected.
> 
>>    // static inline void __raw_writel(uint32_t val, volatile void
>>    __iomem *addr)
>>    // {
>>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
>>    // }
>>    
>>    static inline void __raw_writel(uint32_t val, volatile void __iomem
>>    *addr)
>>    {
>>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>>    uint32_t __force *)addr) );
> 
> You want just "A" here though; adding an offset (as "o" permits) would
> yield an insn which the assembler would reject.

Wait - this is plain SW, so can't it even be the more generic "m" then?
(As said, I'm uncertain about "o"; in general I think it's risky to use.)

Jan

> Also just to remind
> you: In write functions you need "=A" (and in amo ones "+A"), i.e. the
> memory operand then needs to be an output, not an input.
> 
> Jan

Oleksii Kurochko March 8, 2024, 12:17 p.m. UTC | #15

On Fri, 2024-03-08 at 12:52 +0100, Jan Beulich wrote:
> On 08.03.2024 12:49, Jan Beulich wrote:
> > On 08.03.2024 11:14, Oleksii wrote:
> > > On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
> > > > On 07.03.2024 21:54, Oleksii wrote:
> > > > > On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> > > > > > On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > > > > > > For plain writes it should at least be "=Qo" then, yes.
> > > > > > Constraints Q is a machine specific constraint, and I am
> > > > > > not sure
> > > > > > that
> > > > > > it makes sense to use "=o" only and probably it is a reason
> > > > > > why
> > > > > > it is
> > > > > > enough only "r". Does it make sense?
> > > > > Probably for RISC-V can be used:
> > > > > RISC-V—config/riscv/constraints.md
> > > > >    ...
> > > > >    A
> > > > >        An address that is held in a general-purpose register.
> > > > >    ...
> > > > 
> > > > Just from the description I would have said no, but looking at
> > > > what
> > > > "A"
> > > > actually expands to it is indeed RISC-V's counterpart of Arm's
> > > > "Q".
> > > > So
> > > > yes, this looks like what amo* want to use, and then as a real
> > > > operand,
> > > > not just a fake one.
> > > I am not sure that I know how to check correctly how "A" expands,
> > > but I
> > > tried to look at code which will be generated with and without
> > > constraints and it is the same:
> > 
> > As expected.
But if it is epxected and generated code is the same, do we really need
constraints then?

> > 
> > >    // static inline void __raw_writel(uint32_t val, volatile void
> > >    __iomem *addr)
> > >    // {
> > >    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr)
> > > );
> > >    // }
> > >    
> > >    static inline void __raw_writel(uint32_t val, volatile void
> > > __iomem
> > >    *addr)
> > >    {
> > >        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
> > >    uint32_t __force *)addr) );
> > 
> > You want just "A" here though; adding an offset (as "o" permits)
> > would
> > yield an insn which the assembler would reject.
> 
> Wait - this is plain SW, so can't it even be the more generic "m"
> then?
> (As said, I'm uncertain about "o"; in general I think it's risky to
> use.)
What do you mean by "plain SW"?

Are you suggesting changing 'm' to 'o' so that the final result will be
"Am"? Based on the descriptions of 'A' and 'm', it seems to me that we
can just use 'A' alone because both constraints indicate that the
operand is in memory, and 'A' specifically denotes that an address is
held in a register.
> 
> 
> > Also just to remind
> > you: In write functions you need "=A" (and in amo ones "+A"), i.e.
> > the
> > memory operand then needs to be an output, not an input.
Could you please clarify about which one amo you are speaking? That one
who are defined by ATOMIC_OP and ATOMIC_FETCH_OP? They are already
using +A constraints:
    __asm__ __volatile__ (                                          \
        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
        : "+A" (v->counter), "=r" (ret)                             \
        : "r" (I)                                                   \
        : "memory" );                                               \

~ Oleksii

Jan Beulich March 8, 2024, 12:54 p.m. UTC | #16

On 08.03.2024 13:17, Oleksii wrote:
> On Fri, 2024-03-08 at 12:52 +0100, Jan Beulich wrote:
>> On 08.03.2024 12:49, Jan Beulich wrote:
>>> On 08.03.2024 11:14, Oleksii wrote:
>>>> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>>>>> On 07.03.2024 21:54, Oleksii wrote:
>>>>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>>>>> Constraints Q is a machine specific constraint, and I am
>>>>>>> not sure
>>>>>>> that
>>>>>>> it makes sense to use "=o" only and probably it is a reason
>>>>>>> why
>>>>>>> it is
>>>>>>> enough only "r". Does it make sense?
>>>>>> Probably for RISC-V can be used:
>>>>>> RISC-V—config/riscv/constraints.md
>>>>>>    ...
>>>>>>    A
>>>>>>        An address that is held in a general-purpose register.
>>>>>>    ...
>>>>>
>>>>> Just from the description I would have said no, but looking at
>>>>> what
>>>>> "A"
>>>>> actually expands to it is indeed RISC-V's counterpart of Arm's
>>>>> "Q".
>>>>> So
>>>>> yes, this looks like what amo* want to use, and then as a real
>>>>> operand,
>>>>> not just a fake one.
>>>> I am not sure that I know how to check correctly how "A" expands,
>>>> but I
>>>> tried to look at code which will be generated with and without
>>>> constraints and it is the same:
>>>
>>> As expected.
> But if it is epxected and generated code is the same, do we really need
> constraints then?

Yes. Again: Proper constraints are the only way for the compiler to know
everything it needs to know to generate correct code around an asm().

>>>>    // static inline void __raw_writel(uint32_t val, volatile void
>>>>    __iomem *addr)
>>>>    // {
>>>>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr)
>>>> );
>>>>    // }
>>>>    
>>>>    static inline void __raw_writel(uint32_t val, volatile void
>>>> __iomem
>>>>    *addr)
>>>>    {
>>>>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>>>>    uint32_t __force *)addr) );
>>>
>>> You want just "A" here though; adding an offset (as "o" permits)
>>> would
>>> yield an insn which the assembler would reject.
>>
>> Wait - this is plain SW, so can't it even be the more generic "m"
>> then?
>> (As said, I'm uncertain about "o"; in general I think it's risky to
>> use.)
> What do you mean by "plain SW"?

The plain store instruction, i.e. not SC. That one permits wider addressing
modes iirc, which we ought to permit where possible.

> Are you suggesting changing 'm' to 'o' so that the final result will be
> "Am"? Based on the descriptions of 'A' and 'm', it seems to me that we
> can just use 'A' alone because both constraints indicate that the
> operand is in memory, and 'A' specifically denotes that an address is
> held in a register.

No, no "A" at all. Just "m", which is a superset of "A" anyway.

>>> Also just to remind
>>> you: In write functions you need "=A" (and in amo ones "+A"), i.e.
>>> the
>>> memory operand then needs to be an output, not an input.
> Could you please clarify about which one amo you are speaking? That one
> who are defined by ATOMIC_OP and ATOMIC_FETCH_OP?

All. They're all read-modify-write operations if I'm not mistaken.

> They are already
> using +A constraints:
>     __asm__ __volatile__ (                                          \
>         "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
>         : "+A" (v->counter), "=r" (ret)                             \
>         : "r" (I)                                                   \
>         : "memory" );                                               \

Good. I merely thought I'd mention that aspect for completeness.

Jan

[v5,12/23] xen/riscv: introduce io.h

Commit Message

Comments

Patch