mbox series

[RFC,0/2] RDMA/rxe: Add RDMA Atomic Write operation

Message ID 20211230121423.1919550-1-yangx.jy@fujitsu.com (mailing list archive)
Headers show
Series RDMA/rxe: Add RDMA Atomic Write operation | expand

Message

Xiao Yang Dec. 30, 2021, 12:14 p.m. UTC
The IB SPEC v1.5[1][2] added new RDMA operations (Atomic Write and Flush).
This patchset makes SoftRoCE support new RDMA Atomic Write on RC service.

I added RDMA Atomic Write API and a rdma_atomic_write example on my
rdma-core repository[3].  You can verify the patchset by building and
running the rdma_atomic_write example.
server:
$ ./rdma_atomic_write_server -s [server_address] -p [port_number]
client:
$ ./rdma_atomic_write_client -s [server_address] -p [port_number]

[1]: https://www.infinibandta.org/ibta-specification/ # login required
[2]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
[3]: https://github.com/yangx-jy/rdma-core

BTW: This patchset also needs the following fix.
https://www.spinics.net/lists/linux-rdma/msg107838.html

Xiao Yang (2):
  RDMA/rxe: Rename send_atomic_ack() and atomic member of struct
    resp_res
  RDMA/rxe: Add RDMA Atomic Write operation

 drivers/infiniband/sw/rxe/rxe_comp.c   |  4 ++
 drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++
 drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
 drivers/infiniband/sw/rxe/rxe_qp.c     |  5 ++-
 drivers/infiniband/sw/rxe/rxe_req.c    | 10 +++--
 drivers/infiniband/sw/rxe/rxe_resp.c   | 59 ++++++++++++++++++++------
 drivers/infiniband/sw/rxe/rxe_verbs.h  |  2 +-
 include/rdma/ib_pack.h                 |  2 +
 include/rdma/ib_verbs.h                |  2 +
 include/uapi/rdma/ib_user_verbs.h      |  2 +
 10 files changed, 88 insertions(+), 19 deletions(-)

Comments

Gromadzki, Tomasz Dec. 30, 2021, 7:21 p.m. UTC | #1
1)
> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl,
>			int nsge, int flags, uint64_t remote_addr, uint32_t rkey)

Do we need this API at all?
Other atomic operations (compare_swap/add) do not use struct ibv_sge at all but have a dedicated place in 
struct ib_send_wr {
...
		struct {
			uint64_t	remote_addr;
			uint64_t	compare_add;
			uint64_t	swap;
			uint32_t	rkey;
		} atomic;
...
}

Would it be better to reuse (extend) any existing field?
i.e.
		struct {
			uint64_t	remote_addr;
			uint64_t	compare_add;
			uint64_t	swap_write;
			uint32_t	rkey;
		} atomic;

Thanks
Tomasz

> -----Original Message-----
> From: Xiao Yang <yangx.jy@fujitsu.com>
> Sent: Thursday, December 30, 2021 1:14 PM
> To: linux-rdma@vger.kernel.org
> Cc: yanjun.zhu@linux.dev; rpearsonhpe@gmail.com; jgg@nvidia.com; y-
> goto@fujitsu.com; lizhijian@fujitsu.com; Gromadzki, Tomasz
> <tomasz.gromadzki@intel.com>; Xiao Yang <yangx.jy@fujitsu.com>
> Subject: [RFC PATCH 0/2] RDMA/rxe: Add RDMA Atomic Write operation
> 
> The IB SPEC v1.5[1][2] added new RDMA operations (Atomic Write and
> Flush).
> This patchset makes SoftRoCE support new RDMA Atomic Write on RC
> service.
> 
> I added RDMA Atomic Write API and a rdma_atomic_write example on my
> rdma-core repository[3].  You can verify the patchset by building and running
> the rdma_atomic_write example.
> server:
> $ ./rdma_atomic_write_server -s [server_address] -p [port_number]
> client:
> $ ./rdma_atomic_write_client -s [server_address] -p [port_number]
> 
> [1]: https://www.infinibandta.org/ibta-specification/ # login required
> [2]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-
> Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
> [3]: https://github.com/yangx-jy/rdma-core
> 
> BTW: This patchset also needs the following fix.
> https://www.spinics.net/lists/linux-rdma/msg107838.html
> 
> Xiao Yang (2):
>   RDMA/rxe: Rename send_atomic_ack() and atomic member of struct
>     resp_res
>   RDMA/rxe: Add RDMA Atomic Write operation
> 
>  drivers/infiniband/sw/rxe/rxe_comp.c   |  4 ++
>  drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++
> drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
>  drivers/infiniband/sw/rxe/rxe_qp.c     |  5 ++-
>  drivers/infiniband/sw/rxe/rxe_req.c    | 10 +++--
>  drivers/infiniband/sw/rxe/rxe_resp.c   | 59 ++++++++++++++++++++------
>  drivers/infiniband/sw/rxe/rxe_verbs.h  |  2 +-
>  include/rdma/ib_pack.h                 |  2 +
>  include/rdma/ib_verbs.h                |  2 +
>  include/uapi/rdma/ib_user_verbs.h      |  2 +
>  10 files changed, 88 insertions(+), 19 deletions(-)
> 
> --
> 2.23.0
> 
> 

---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
Tom Talpey Dec. 30, 2021, 9:42 p.m. UTC | #2
On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
> 1)
>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl,
>> 			int nsge, int flags, uint64_t remote_addr, uint32_t rkey)
> 
> Do we need this API at all?
> Other atomic operations (compare_swap/add) do not use struct ibv_sge at all but have a dedicated place in
> struct ib_send_wr {
> ...
> 		struct {
> 			uint64_t	remote_addr;
> 			uint64_t	compare_add;
> 			uint64_t	swap;
> 			uint32_t	rkey;
> 		} atomic;
> ...
> }
> 
> Would it be better to reuse (extend) any existing field?
> i.e.
> 		struct {
> 			uint64_t	remote_addr;
> 			uint64_t	compare_add;
> 			uint64_t	swap_write;
> 			uint32_t	rkey;
> 		} atomic;

Agreed. Passing the data to be written as an SGE is unnatural
since it is always exactly 64 bits. Tweaking the existing atomic
parameter block as Tomasz suggests is the best approach.

Tom.

> Thanks
> Tomasz
> 
>> -----Original Message-----
>> From: Xiao Yang <yangx.jy@fujitsu.com>
>> Sent: Thursday, December 30, 2021 1:14 PM
>> To: linux-rdma@vger.kernel.org
>> Cc: yanjun.zhu@linux.dev; rpearsonhpe@gmail.com; jgg@nvidia.com; y-
>> goto@fujitsu.com; lizhijian@fujitsu.com; Gromadzki, Tomasz
>> <tomasz.gromadzki@intel.com>; Xiao Yang <yangx.jy@fujitsu.com>
>> Subject: [RFC PATCH 0/2] RDMA/rxe: Add RDMA Atomic Write operation
>>
>> The IB SPEC v1.5[1][2] added new RDMA operations (Atomic Write and
>> Flush).
>> This patchset makes SoftRoCE support new RDMA Atomic Write on RC
>> service.
>>
>> I added RDMA Atomic Write API and a rdma_atomic_write example on my
>> rdma-core repository[3].  You can verify the patchset by building and running
>> the rdma_atomic_write example.
>> server:
>> $ ./rdma_atomic_write_server -s [server_address] -p [port_number]
>> client:
>> $ ./rdma_atomic_write_client -s [server_address] -p [port_number]
>>
>> [1]: https://www.infinibandta.org/ibta-specification/ # login required
>> [2]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-
>> Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
>> [3]: https://github.com/yangx-jy/rdma-core
>>
>> BTW: This patchset also needs the following fix.
>> https://www.spinics.net/lists/linux-rdma/msg107838.html
>>
>> Xiao Yang (2):
>>    RDMA/rxe: Rename send_atomic_ack() and atomic member of struct
>>      resp_res
>>    RDMA/rxe: Add RDMA Atomic Write operation
>>
>>   drivers/infiniband/sw/rxe/rxe_comp.c   |  4 ++
>>   drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++
>> drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
>>   drivers/infiniband/sw/rxe/rxe_qp.c     |  5 ++-
>>   drivers/infiniband/sw/rxe/rxe_req.c    | 10 +++--
>>   drivers/infiniband/sw/rxe/rxe_resp.c   | 59 ++++++++++++++++++++------
>>   drivers/infiniband/sw/rxe/rxe_verbs.h  |  2 +-
>>   include/rdma/ib_pack.h                 |  2 +
>>   include/rdma/ib_verbs.h                |  2 +
>>   include/uapi/rdma/ib_user_verbs.h      |  2 +
>>   10 files changed, 88 insertions(+), 19 deletions(-)
>>
>> --
>> 2.23.0
>>
>>
> 
> ---------------------------------------------------------------------
> Intel Technology Poland sp. z o.o.
> ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
> Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
> This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
> 
>
Xiao Yang Dec. 31, 2021, 6:30 a.m. UTC | #3
On 2021/12/31 5:42, Tom Talpey wrote:
> On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
>> 1)
>>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct 
>>> ibv_sge *sgl,
>>>             int nsge, int flags, uint64_t remote_addr, uint32_t rkey)
>>
>> Do we need this API at all?
>> Other atomic operations (compare_swap/add) do not use struct ibv_sge 
>> at all but have a dedicated place in
>> struct ib_send_wr {
>> ...
>>         struct {
>>             uint64_t    remote_addr;
>>             uint64_t    compare_add;
>>             uint64_t    swap;
>>             uint32_t    rkey;
>>         } atomic;
>> ...
>> }
>>
>> Would it be better to reuse (extend) any existing field?
>> i.e.
>>         struct {
>>             uint64_t    remote_addr;
>>             uint64_t    compare_add;
>>             uint64_t    swap_write;
>>             uint32_t    rkey;
>>         } atomic;
>
> Agreed. Passing the data to be written as an SGE is unnatural
> since it is always exactly 64 bits. Tweaking the existing atomic
> parameter block as Tomasz suggests is the best approach.
Hi Tomasz, Tom

Thanks for your quick reply.

If we want to pass the 8-byte value by tweaking struct atomic on user 
space, why don't we
tranfer the 8-byte value by ATOMIC Extended Transport Header (AtomicETH) 
on kernel space?
PS: IBTA defines that the 8-byte value is tranfered by RDMA Extended 
Transport Heade(RETH) + payload.

Is it inconsistent to use struct atomic on user space and RETH + payload 
on kernel space?

Best Regards,
Xiao Yang
>
> Tom.
Xiao Yang Jan. 4, 2022, 9:28 a.m. UTC | #4
On 2021/12/31 14:30, yangx.jy@fujitsu.com wrote:
> On 2021/12/31 5:42, Tom Talpey wrote:
>> On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
>>> 1)
>>>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct
>>>> ibv_sge *sgl,
>>>>              int nsge, int flags, uint64_t remote_addr, uint32_t rkey)
>>> Do we need this API at all?
>>> Other atomic operations (compare_swap/add) do not use struct ibv_sge
>>> at all but have a dedicated place in
>>> struct ib_send_wr {
>>> ...
>>>          struct {
>>>              uint64_t    remote_addr;
>>>              uint64_t    compare_add;
>>>              uint64_t    swap;
>>>              uint32_t    rkey;
>>>          } atomic;
>>> ...
>>> }
>>>
>>> Would it be better to reuse (extend) any existing field?
>>> i.e.
>>>          struct {
>>>              uint64_t    remote_addr;
>>>              uint64_t    compare_add;
>>>              uint64_t    swap_write;
>>>              uint32_t    rkey;
>>>          } atomic;
>> Agreed. Passing the data to be written as an SGE is unnatural
>> since it is always exactly 64 bits. Tweaking the existing atomic
>> parameter block as Tomasz suggests is the best approach.
> Hi Tomasz, Tom
>
> Thanks for your quick reply.
>
> If we want to pass the 8-byte value by tweaking struct atomic on user
> space, why don't we
> tranfer the 8-byte value by ATOMIC Extended Transport Header (AtomicETH)
> on kernel space?
> PS: IBTA defines that the 8-byte value is tranfered by RDMA Extended
> Transport Heade(RETH) + payload.
>
> Is it inconsistent to use struct atomic on user space and RETH + payload
> on kernel space?
Hi Tomasz, Tom

I think the following rules are right:
RDMA READ/WRITE should use struct rdma in libverbs and RETH + payload in 
kernel.
RDMA Atomic should use struct atomic in libverbs and AtomicETH in kernel.

According to IBTA's definition, RDMA Atomic Write should use struct rdma 
in libibverbs.
How about adding a member in struct rdma? for example:
struct {
     uint64_t    remote_addr;
     uint32_t    rkey;
     uint64_t    wr_value:
} rdma;

Best Regards,
Xiao Yang
> Best Regards,
> Xiao Yang
>> Tom.
Tom Talpey Jan. 4, 2022, 3:17 p.m. UTC | #5
On 1/4/2022 4:28 AM, yangx.jy@fujitsu.com wrote:
> On 2021/12/31 14:30, yangx.jy@fujitsu.com wrote:
>> On 2021/12/31 5:42, Tom Talpey wrote:
>>> On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
>>>> 1)
>>>>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct
>>>>> ibv_sge *sgl,
>>>>>               int nsge, int flags, uint64_t remote_addr, uint32_t rkey)
>>>> Do we need this API at all?
>>>> Other atomic operations (compare_swap/add) do not use struct ibv_sge
>>>> at all but have a dedicated place in
>>>> struct ib_send_wr {
>>>> ...
>>>>           struct {
>>>>               uint64_t    remote_addr;
>>>>               uint64_t    compare_add;
>>>>               uint64_t    swap;
>>>>               uint32_t    rkey;
>>>>           } atomic;
>>>> ...
>>>> }
>>>>
>>>> Would it be better to reuse (extend) any existing field?
>>>> i.e.
>>>>           struct {
>>>>               uint64_t    remote_addr;
>>>>               uint64_t    compare_add;
>>>>               uint64_t    swap_write;
>>>>               uint32_t    rkey;
>>>>           } atomic;
>>> Agreed. Passing the data to be written as an SGE is unnatural
>>> since it is always exactly 64 bits. Tweaking the existing atomic
>>> parameter block as Tomasz suggests is the best approach.
>> Hi Tomasz, Tom
>>
>> Thanks for your quick reply.
>>
>> If we want to pass the 8-byte value by tweaking struct atomic on user
>> space, why don't we
>> tranfer the 8-byte value by ATOMIC Extended Transport Header (AtomicETH)
>> on kernel space?
>> PS: IBTA defines that the 8-byte value is tranfered by RDMA Extended
>> Transport Heade(RETH) + payload.
>>
>> Is it inconsistent to use struct atomic on user space and RETH + payload
>> on kernel space?
> Hi Tomasz, Tom
> 
> I think the following rules are right:
> RDMA READ/WRITE should use struct rdma in libverbs and RETH + payload in
> kernel.
> RDMA Atomic should use struct atomic in libverbs and AtomicETH in kernel.
> 
> According to IBTA's definition, RDMA Atomic Write should use struct rdma
> in libibverbs.

I don't quite understand this statement, the IBTA defines the protocol
but does not define the API at such a level.

I do however agree with this:

> How about adding a member in struct rdma? for example:
> struct {
>       uint64_t    remote_addr;
>       uint32_t    rkey;
>       uint64_t    wr_value:
> } rdma;

Yes, that's what Tomasz and I were suggesting - a new template for the
ATOMIC_WRITE request payload. The three fields are to be supplied by
the verb consumer when posting the work request.

Tom.
Xiao Yang Jan. 5, 2022, 1 a.m. UTC | #6
On 2022/1/4 23:17, Tom Talpey wrote:
>
> On 1/4/2022 4:28 AM, yangx.jy@fujitsu.com wrote:
>> On 2021/12/31 14:30, yangx.jy@fujitsu.com wrote:
>>> On 2021/12/31 5:42, Tom Talpey wrote:
>>>> On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
>>>>> 1)
>>>>>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context, struct
>>>>>> ibv_sge *sgl,
>>>>>>               int nsge, int flags, uint64_t remote_addr, uint32_t 
>>>>>> rkey)
>>>>> Do we need this API at all?
>>>>> Other atomic operations (compare_swap/add) do not use struct ibv_sge
>>>>> at all but have a dedicated place in
>>>>> struct ib_send_wr {
>>>>> ...
>>>>>           struct {
>>>>>               uint64_t    remote_addr;
>>>>>               uint64_t    compare_add;
>>>>>               uint64_t    swap;
>>>>>               uint32_t    rkey;
>>>>>           } atomic;
>>>>> ...
>>>>> }
>>>>>
>>>>> Would it be better to reuse (extend) any existing field?
>>>>> i.e.
>>>>>           struct {
>>>>>               uint64_t    remote_addr;
>>>>>               uint64_t    compare_add;
>>>>>               uint64_t    swap_write;
>>>>>               uint32_t    rkey;
>>>>>           } atomic;
>>>> Agreed. Passing the data to be written as an SGE is unnatural
>>>> since it is always exactly 64 bits. Tweaking the existing atomic
>>>> parameter block as Tomasz suggests is the best approach.
>>> Hi Tomasz, Tom
>>>
>>> Thanks for your quick reply.
>>>
>>> If we want to pass the 8-byte value by tweaking struct atomic on user
>>> space, why don't we
>>> tranfer the 8-byte value by ATOMIC Extended Transport Header 
>>> (AtomicETH)
>>> on kernel space?
>>> PS: IBTA defines that the 8-byte value is tranfered by RDMA Extended
>>> Transport Heade(RETH) + payload.
>>>
>>> Is it inconsistent to use struct atomic on user space and RETH + 
>>> payload
>>> on kernel space?
>> Hi Tomasz, Tom
>>
>> I think the following rules are right:
>> RDMA READ/WRITE should use struct rdma in libverbs and RETH + payload in
>> kernel.
>> RDMA Atomic should use struct atomic in libverbs and AtomicETH in 
>> kernel.
>>
>> According to IBTA's definition, RDMA Atomic Write should use struct rdma
>> in libibverbs.
>
> I don't quite understand this statement, the IBTA defines the protocol
> but does not define the API at such a level.
Hi Tom,

1) In kernel, current SoftRoCE copies the content of struct rdma to RETH 
and copies the content of struct atomic to AtomicETH.
2) IBTA defines that RDMA Atomic Write uses RETH + payload.
According to these two reasons, I perfer to tweak the existing struct rdma.

>
> I do however agree with this:
>
>> How about adding a member in struct rdma? for example:
>> struct {
>>       uint64_t    remote_addr;
>>       uint32_t    rkey;
>>       uint64_t    wr_value:
>> } rdma;
>
> Yes, that's what Tomasz and I were suggesting - a new template for the
> ATOMIC_WRITE request payload. The three fields are to be supplied by
> the verb consumer when posting the work request.

OK, I will update the patch in this way.

Best Regards,
Xiao Yang
>
> Tom.
Jason Gunthorpe Jan. 6, 2022, 12:01 a.m. UTC | #7
On Wed, Jan 05, 2022 at 01:00:42AM +0000, yangx.jy@fujitsu.com wrote:

> 1) In kernel, current SoftRoCE copies the content of struct rdma to RETH 
> and copies the content of struct atomic to AtomicETH.
> 2) IBTA defines that RDMA Atomic Write uses RETH + payload.
> According to these two reasons, I perfer to tweak the existing struct rdma.

No this is basically meaningless

The wr struct is designed as a 'tagged union' where the op specified
which union is in effect.

It turns out that the op generally specifies the network headers as
well, but that is just a side effect.

> >> How about adding a member in struct rdma? for example:
> >> struct {
> >>       uint64_t    remote_addr;
> >>       uint32_t    rkey;
> >>       uint64_t    wr_value:
> >> } rdma;
> >
> > Yes, that's what Tomasz and I were suggesting - a new template for the
> > ATOMIC_WRITE request payload. The three fields are to be supplied by
> > the verb consumer when posting the work request.
> 
> OK, I will update the patch in this way.

We are not extending the ib_send_wr anymore anyhow.

You should implement new ops inside struct ibv_qp_ex as function
calls.

Jason
Xiao Yang Jan. 6, 2022, 1:54 a.m. UTC | #8
On 2022/1/6 8:01, Jason Gunthorpe wrote:
> On Wed, Jan 05, 2022 at 01:00:42AM +0000, yangx.jy@fujitsu.com wrote:
>
>> 1) In kernel, current SoftRoCE copies the content of struct rdma to RETH
>> and copies the content of struct atomic to AtomicETH.
>> 2) IBTA defines that RDMA Atomic Write uses RETH + payload.
>> According to these two reasons, I perfer to tweak the existing struct rdma.
> No this is basically meaningless
>
> The wr struct is designed as a 'tagged union' where the op specified
> which union is in effect.
>
> It turns out that the op generally specifies the network headers as
> well, but that is just a side effect.
>
>>>> How about adding a member in struct rdma? for example:
>>>> struct {
>>>>        uint64_t    remote_addr;
>>>>        uint32_t    rkey;
>>>>        uint64_t    wr_value:
>>>> } rdma;
>>> Yes, that's what Tomasz and I were suggesting - a new template for the
>>> ATOMIC_WRITE request payload. The three fields are to be supplied by
>>> the verb consumer when posting the work request.
>> OK, I will update the patch in this way.
> We are not extending the ib_send_wr anymore anyhow.
>
> You should implement new ops inside struct ibv_qp_ex as function
> calls.

Hi Jason,

For SoftRoCE, do you mean that I only need to extend struct rxe_send_wr 
and add  ibv_wr_rdma_atomic_write() ?
----------------------------------------------------------------------------------------------------------------------
struct rxe_send_wr {
     ...
         struct {
             __aligned_u64 remote_addr;
+           __aligned_u64 atomic_wr;
             __u32   rkey;
             __u32   reserved;
         } rdma;
     ...
}

static inline void ibv_wr_rdma_atomic_write(struct ibv_qp_ex *qp, 
uint32_t rkey,
                                      uint64_t remote_addr)
{
         qp->wr_rdma_atomic_write(qp, rkey, remote_addr);
}
--------------------------------------------------------------------------------------------------------------------

Besides, could you tell me why we cannot extend struct ibv_send_wr for 
ibv_post_send()?

Best Regards,
Xiao Yang
> Jason
Jason Gunthorpe Jan. 10, 2022, 3:42 p.m. UTC | #9
On Thu, Jan 06, 2022 at 01:54:30AM +0000, yangx.jy@fujitsu.com wrote:
> On 2022/1/6 8:01, Jason Gunthorpe wrote:
> > On Wed, Jan 05, 2022 at 01:00:42AM +0000, yangx.jy@fujitsu.com wrote:
> >
> >> 1) In kernel, current SoftRoCE copies the content of struct rdma to RETH
> >> and copies the content of struct atomic to AtomicETH.
> >> 2) IBTA defines that RDMA Atomic Write uses RETH + payload.
> >> According to these two reasons, I perfer to tweak the existing struct rdma.
> > No this is basically meaningless
> >
> > The wr struct is designed as a 'tagged union' where the op specified
> > which union is in effect.
> >
> > It turns out that the op generally specifies the network headers as
> > well, but that is just a side effect.
> >
> >>>> How about adding a member in struct rdma? for example:
> >>>> struct {
> >>>>        uint64_t    remote_addr;
> >>>>        uint32_t    rkey;
> >>>>        uint64_t    wr_value:
> >>>> } rdma;
> >>> Yes, that's what Tomasz and I were suggesting - a new template for the
> >>> ATOMIC_WRITE request payload. The three fields are to be supplied by
> >>> the verb consumer when posting the work request.
> >> OK, I will update the patch in this way.
> > We are not extending the ib_send_wr anymore anyhow.
> >
> > You should implement new ops inside struct ibv_qp_ex as function
> > calls.
> 
> Hi Jason,
> 
> For SoftRoCE, do you mean that I only need to extend struct rxe_send_wr 
> and add  ibv_wr_rdma_atomic_write() ?
> struct rxe_send_wr {
>      ...
>          struct {
>              __aligned_u64 remote_addr;
> +           __aligned_u64 atomic_wr;
>              __u32   rkey;
>              __u32   reserved;
>          } rdma;

You can't make a change like this to anything in
include/uapi/rdma/rdma_user_rxe.h, it has to remain compatiable.

 
> static inline void ibv_wr_rdma_atomic_write(struct ibv_qp_ex *qp, 
> uint32_t rkey,
>                                       uint64_t remote_addr)
> {
>          qp->wr_rdma_atomic_write(qp, rkey, remote_addr);
> }

Yes, something like that
 
> Besides, could you tell me why we cannot extend struct ibv_send_wr for 
> ibv_post_send()?

The ABI is not allowed to change so what is doable with it is very
limited.

Jason
Xiao Yang Jan. 11, 2022, 2:34 a.m. UTC | #10
On 2022/1/10 23:42, Jason Gunthorpe wrote:
> On Thu, Jan 06, 2022 at 01:54:30AM +0000, yangx.jy@fujitsu.com wrote:
>> On 2022/1/6 8:01, Jason Gunthorpe wrote:
>>> On Wed, Jan 05, 2022 at 01:00:42AM +0000, yangx.jy@fujitsu.com wrote:
>>>
>>>> 1) In kernel, current SoftRoCE copies the content of struct rdma to RETH
>>>> and copies the content of struct atomic to AtomicETH.
>>>> 2) IBTA defines that RDMA Atomic Write uses RETH + payload.
>>>> According to these two reasons, I perfer to tweak the existing struct rdma.
>>> No this is basically meaningless
>>>
>>> The wr struct is designed as a 'tagged union' where the op specified
>>> which union is in effect.
>>>
>>> It turns out that the op generally specifies the network headers as
>>> well, but that is just a side effect.
>>>
>>>>>> How about adding a member in struct rdma? for example:
>>>>>> struct {
>>>>>>         uint64_t    remote_addr;
>>>>>>         uint32_t    rkey;
>>>>>>         uint64_t    wr_value:
>>>>>> } rdma;
>>>>> Yes, that's what Tomasz and I were suggesting - a new template for the
>>>>> ATOMIC_WRITE request payload. The three fields are to be supplied by
>>>>> the verb consumer when posting the work request.
>>>> OK, I will update the patch in this way.
>>> We are not extending the ib_send_wr anymore anyhow.
>>>
>>> You should implement new ops inside struct ibv_qp_ex as function
>>> calls.
>> Hi Jason,
>>
>> For SoftRoCE, do you mean that I only need to extend struct rxe_send_wr
>> and add  ibv_wr_rdma_atomic_write() ?
>> struct rxe_send_wr {
>>       ...
>>           struct {
>>               __aligned_u64 remote_addr;
>> +           __aligned_u64 atomic_wr;
>>               __u32   rkey;
>>               __u32   reserved;
>>           } rdma;
> You can't make a change like this to anything in
> include/uapi/rdma/rdma_user_rxe.h, it has to remain compatiable.
Hi Jason,

How about adding atomic_wr member at the end of struct rdma? like this:
------------------------------------------------------
struct rxe_send_wr {
     ...
         struct {
             __aligned_u64 remote_addr;
             __u32   rkey;
             __u32   reserved;
+          __aligned_u64 atomic_wr;
         } rdma;
     ...
}
------------------------------------------------------
If it is also wrong, how to extend struct rdma?
>
>> static inline void ibv_wr_rdma_atomic_write(struct ibv_qp_ex *qp,
>> uint32_t rkey,
>>                                        uint64_t remote_addr)
>> {
>>           qp->wr_rdma_atomic_write(qp, rkey, remote_addr);
>> }
> Yes, something like that
>
>> Besides, could you tell me why we cannot extend struct ibv_send_wr for
>> ibv_post_send()?
> The ABI is not allowed to change so what is doable with it is very
> limited.
Thanks for your explanation.

Best Regards,
Xiao Yang
> Jason
Jason Gunthorpe Jan. 11, 2022, 11:29 p.m. UTC | #11
On Tue, Jan 11, 2022 at 02:34:51AM +0000, yangx.jy@fujitsu.com wrote:

> struct rxe_send_wr {
>      ...
>          struct {
>              __aligned_u64 remote_addr;
>              __u32   rkey;
>              __u32   reserved;
> +          __aligned_u64 atomic_wr;
>          } rdma;
>      ...
> }
> If it is also wrong, how to extend struct rdma?

I think that coudl be OK, you need to confirm that the size of
rxe_send_wr didn't change

Jason
Gromadzki, Tomasz Feb. 11, 2022, 1:18 p.m. UTC | #12
Hi

I have one question related to atomic write API and ibv_wr extension:

> > void (*wr_rdma_atomic_write)(struct ibv_qp_ex *qp, uint32_t rkey,
> >			     uint64_t remote_addr, uint64_t atomic_wr);

> > struct {
> >       uint64_t    remote_addr;
> >       uint32_t    rkey;
> >       uint64_t    wr_value:
> > } rdma;


In both places, atomic value is defined as uint64 but the IBTA spec defines it as 8 bytes array.

"The message size must be 8 bytes and is written to a contiguous range of the destination QP's virtual address space"

Would it be better to have 
uint8_t atomic_wr in wr_rdma_atomic_write(...) declaration

and 

uint8_t wr_value[8] in struct { ... } rdma?

I have checked CmpSwap and FetchAdd and in these cases, IBTA Spec says explicitly about 64 bits values.

Regards
Tomasz


> -----Original Message-----
> From: Tom Talpey <tom@talpey.com>
> Sent: Tuesday, January 4, 2022 4:17 PM
> To: yangx.jy@fujitsu.com; Gromadzki, Tomasz
> <tomasz.gromadzki@intel.com>
> Cc: linux-rdma@vger.kernel.org; yanjun.zhu@linux.dev;
> rpearsonhpe@gmail.com; jgg@nvidia.com; y-goto@fujitsu.com;
> lizhijian@fujitsu.com
> Subject: Re: [RFC PATCH 0/2] RDMA/rxe: Add RDMA Atomic Write operation
> 
> 
> On 1/4/2022 4:28 AM, yangx.jy@fujitsu.com wrote:
> > On 2021/12/31 14:30, yangx.jy@fujitsu.com wrote:
> >> On 2021/12/31 5:42, Tom Talpey wrote:
> >>> On 12/30/2021 2:21 PM, Gromadzki, Tomasz wrote:
> >>>> 1)
> >>>>> rdma_post_atomic_writev(struct rdma_cm_id *id, void *context,
> >>>>> struct ibv_sge *sgl,
> >>>>>               int nsge, int flags, uint64_t remote_addr, uint32_t
> >>>>> rkey)
> >>>> Do we need this API at all?
> >>>> Other atomic operations (compare_swap/add) do not use struct
> >>>> ibv_sge at all but have a dedicated place in struct ib_send_wr {
> >>>> ...
> >>>>           struct {
> >>>>               uint64_t    remote_addr;
> >>>>               uint64_t    compare_add;
> >>>>               uint64_t    swap;
> >>>>               uint32_t    rkey;
> >>>>           } atomic;
> >>>> ...
> >>>> }
> >>>>
> >>>> Would it be better to reuse (extend) any existing field?
> >>>> i.e.
> >>>>           struct {
> >>>>               uint64_t    remote_addr;
> >>>>               uint64_t    compare_add;
> >>>>               uint64_t    swap_write;
> >>>>               uint32_t    rkey;
> >>>>           } atomic;
> >>> Agreed. Passing the data to be written as an SGE is unnatural since
> >>> it is always exactly 64 bits. Tweaking the existing atomic parameter
> >>> block as Tomasz suggests is the best approach.
> >> Hi Tomasz, Tom
> >>
> >> Thanks for your quick reply.
> >>
> >> If we want to pass the 8-byte value by tweaking struct atomic on user
> >> space, why don't we tranfer the 8-byte value by ATOMIC Extended
> >> Transport Header (AtomicETH) on kernel space?
> >> PS: IBTA defines that the 8-byte value is tranfered by RDMA Extended
> >> Transport Heade(RETH) + payload.
> >>
> >> Is it inconsistent to use struct atomic on user space and RETH +
> >> payload on kernel space?
> > Hi Tomasz, Tom
> >
> > I think the following rules are right:
> > RDMA READ/WRITE should use struct rdma in libverbs and RETH + payload
> > in kernel.
> > RDMA Atomic should use struct atomic in libverbs and AtomicETH in kernel.
> >
> > According to IBTA's definition, RDMA Atomic Write should use struct
> > rdma in libibverbs.
> 
> I don't quite understand this statement, the IBTA defines the protocol but
> does not define the API at such a level.
> 
> I do however agree with this:
> 
> > How about adding a member in struct rdma? for example:
> > struct {
> >       uint64_t    remote_addr;
> >       uint32_t    rkey;
> >       uint64_t    wr_value:
> > } rdma;
> 
> Yes, that's what Tomasz and I were suggesting - a new template for the
> ATOMIC_WRITE request payload. The three fields are to be supplied by the
> verb consumer when posting the work request.
> 
> Tom.
---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
Xiao Yang Feb. 17, 2022, 3:50 a.m. UTC | #13
On 2021/12/30 20:14, Xiao Yang wrote:
> The IB SPEC v1.5[1][2] added new RDMA operations (Atomic Write and Flush).
> This patchset makes SoftRoCE support new RDMA Atomic Write on RC service.
>
> I added RDMA Atomic Write API and a rdma_atomic_write example on my
> rdma-core repository[3].  You can verify the patchset by building and
> running the rdma_atomic_write example.
> server:
> $ ./rdma_atomic_write_server -s [server_address] -p [port_number]
> client:
> $ ./rdma_atomic_write_client -s [server_address] -p [port_number]

Hi Leon,

Do you think when I should post the userspace patchset to rdma-core?

I'm not sure if I should post it to rdma-core as a PR until the kernel 
patchset is merged?

Best Regards,

Xiao Yang

>
> [1]: https://www.infinibandta.org/ibta-specification/ # login required
> [2]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
> [3]: https://github.com/yangx-jy/rdma-core
>
> BTW: This patchset also needs the following fix.
> https://www.spinics.net/lists/linux-rdma/msg107838.html
>
> Xiao Yang (2):
>    RDMA/rxe: Rename send_atomic_ack() and atomic member of struct
>      resp_res
>    RDMA/rxe: Add RDMA Atomic Write operation
>
>   drivers/infiniband/sw/rxe/rxe_comp.c   |  4 ++
>   drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++
>   drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
>   drivers/infiniband/sw/rxe/rxe_qp.c     |  5 ++-
>   drivers/infiniband/sw/rxe/rxe_req.c    | 10 +++--
>   drivers/infiniband/sw/rxe/rxe_resp.c   | 59 ++++++++++++++++++++------
>   drivers/infiniband/sw/rxe/rxe_verbs.h  |  2 +-
>   include/rdma/ib_pack.h                 |  2 +
>   include/rdma/ib_verbs.h                |  2 +
>   include/uapi/rdma/ib_user_verbs.h      |  2 +
>   10 files changed, 88 insertions(+), 19 deletions(-)
>
Leon Romanovsky Feb. 19, 2022, 10:37 a.m. UTC | #14
On Thu, Feb 17, 2022 at 03:50:02AM +0000, yangx.jy@fujitsu.com wrote:
> On 2021/12/30 20:14, Xiao Yang wrote:
> > The IB SPEC v1.5[1][2] added new RDMA operations (Atomic Write and Flush).
> > This patchset makes SoftRoCE support new RDMA Atomic Write on RC service.
> >
> > I added RDMA Atomic Write API and a rdma_atomic_write example on my
> > rdma-core repository[3].  You can verify the patchset by building and
> > running the rdma_atomic_write example.
> > server:
> > $ ./rdma_atomic_write_server -s [server_address] -p [port_number]
> > client:
> > $ ./rdma_atomic_write_client -s [server_address] -p [port_number]
> 
> Hi Leon,
> 
> Do you think when I should post the userspace patchset to rdma-core?
> 
> I'm not sure if I should post it to rdma-core as a PR until the kernel 
> patchset is merged?

For any UAPI changes, we require to present actual user space part. So
the PR to rdma-core is needed in order to merge the kernel series.

In this PR, the first patch is created with "kernel-headers/update --not-final"
script against local version of kernel headers. Once the kernel part is merged,
you will update that first patch and force push the rdma-core PR.

Thanks

> 
> Best Regards,
> 
> Xiao Yang
> 
> >
> > [1]: https://www.infinibandta.org/ibta-specification/ # login required
> > [2]: https://www.infinibandta.org/wp-content/uploads/2021/08/IBTA-Overview-of-IBTA-Volume-1-Release-1.5-and-MPE-2021-08-17-Secure.pptx
> > [3]: https://github.com/yangx-jy/rdma-core
> >
> > BTW: This patchset also needs the following fix.
> > https://www.spinics.net/lists/linux-rdma/msg107838.html
> >
> > Xiao Yang (2):
> >    RDMA/rxe: Rename send_atomic_ack() and atomic member of struct
> >      resp_res
> >    RDMA/rxe: Add RDMA Atomic Write operation
> >
> >   drivers/infiniband/sw/rxe/rxe_comp.c   |  4 ++
> >   drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++
> >   drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
> >   drivers/infiniband/sw/rxe/rxe_qp.c     |  5 ++-
> >   drivers/infiniband/sw/rxe/rxe_req.c    | 10 +++--
> >   drivers/infiniband/sw/rxe/rxe_resp.c   | 59 ++++++++++++++++++++------
> >   drivers/infiniband/sw/rxe/rxe_verbs.h  |  2 +-
> >   include/rdma/ib_pack.h                 |  2 +
> >   include/rdma/ib_verbs.h                |  2 +
> >   include/uapi/rdma/ib_user_verbs.h      |  2 +
> >   10 files changed, 88 insertions(+), 19 deletions(-)
> >