Message ID | 1543925069-8838-2-git-send-email-galpress@amazon.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | Elastic Fabric Adapter (EFA) driver |
On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
> Add EFA node type, transport type and protocol type to core code.
> EFA relies on underlying implementation similar to reliable datagram, so
> we also define a new QP type named Scalable Reliable Datagram (SRD).
>
> EFA reliable datagram transport provides reliable out-of-order delivery,
> transparently utilizing multiple network paths to reduce network tail
> latency. Its interface is similar to UD, in particular it supports
> message size up to MTU, with error handling extended to support reliable
> communication.
>
> Signed-off-by: Gal Pressman <galpress@amazon.com>
> ---
>  drivers/infiniband/core/verbs.c | 2 ++
>  include/rdma/ib_verbs.h         | 9 +++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>

Do you have any specification/documentation for that?

I'm afraid that awesome press release [1] is not enough.

[1] https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/

Thanks
On 04-Dec-18 14:44, Leon Romanovsky wrote:
> On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
>> Add EFA node type, transport type and protocol type to core code.
>> EFA relies on underlying implementation similar to reliable datagram, so
>> we also define a new QP type named Scalable Reliable Datagram (SRD).
>>
>> EFA reliable datagram transport provides reliable out-of-order delivery,
>> transparently utilizing multiple network paths to reduce network tail
>> latency. Its interface is similar to UD, in particular it supports
>> message size up to MTU, with error handling extended to support reliable
>> communication.
>>
>> Signed-off-by: Gal Pressman <galpress@amazon.com>
>> ---
>>  drivers/infiniband/core/verbs.c | 2 ++
>>  include/rdma/ib_verbs.h         | 9 +++++++--
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>
>
> Do you have any specification/documentation for that?
>
> I'm afraid that awesome press release [1] is not enough.
>
> [1]
> https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
>
> Thanks
>

Hey Leon,
The commit message (and part of the cover letter) contains a description of SRD.
It is similar to UD in most ways, with the addition of reliable out-of-order
delivery. Work request usage is the same as UD, with a minor difference in
the completion's status codes.

The exact specification is internal to AWS, but if you have any more questions
I'll be more than happy to answer.
On Tue, Dec 04, 2018 at 04:38:52PM +0200, Gal Pressman wrote:
> On 04-Dec-18 14:44, Leon Romanovsky wrote:
> > On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
> >> Add EFA node type, transport type and protocol type to core code.
> >> EFA relies on underlying implementation similar to reliable datagram, so
> >> we also define a new QP type named Scalable Reliable Datagram (SRD).
> >>
> >> EFA reliable datagram transport provides reliable out-of-order delivery,
> >> transparently utilizing multiple network paths to reduce network tail
> >> latency. Its interface is similar to UD, in particular it supports
> >> message size up to MTU, with error handling extended to support reliable
> >> communication.
> >>
> >> Signed-off-by: Gal Pressman <galpress@amazon.com>
> >> ---
> >>  drivers/infiniband/core/verbs.c | 2 ++
> >>  include/rdma/ib_verbs.h         | 9 +++++++--
> >>  2 files changed, 9 insertions(+), 2 deletions(-)
> >>
> >
> > Do you have any specification/documentation for that?
> >
> > I'm afraid that awesome press release [1] is not enough.
> >
> > [1]
> > https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
> >
> > Thanks
> >
>
> Hey Leon,
> The commit message (and part of the cover letter) contains a description of SRD.
> It is similar to UD in most ways, with the addition of reliable out-of-order
> delivery. Work request usage is the same as UD, with a minor difference in
> the completion's status codes.
>
> The exact specification is internal to AWS, but if you have any more questions
> I'll be more than happy to answer.

All structures which you extended are backed by IBTA, and everything
that is "internal to ..." is supposed to be implemented by the various extensions
which we already have in RDMA/core. For example, in the case of SRD, we have
IB_QPT_DRIVER exactly for that.

Otherwise, please provide the full semantics of this SRD type: out-of-order
semantics, error handling, state diagram, retransmission, etc.

Thanks
On 04-Dec-18 17:45, Leon Romanovsky wrote:
> On Tue, Dec 04, 2018 at 04:38:52PM +0200, Gal Pressman wrote:
>> On 04-Dec-18 14:44, Leon Romanovsky wrote:
>>> On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
>>>> Add EFA node type, transport type and protocol type to core code.
>>>> EFA relies on underlying implementation similar to reliable datagram, so
>>>> we also define a new QP type named Scalable Reliable Datagram (SRD).
>>>>
>>>> EFA reliable datagram transport provides reliable out-of-order delivery,
>>>> transparently utilizing multiple network paths to reduce network tail
>>>> latency. Its interface is similar to UD, in particular it supports
>>>> message size up to MTU, with error handling extended to support reliable
>>>> communication.
>>>>
>>>> Signed-off-by: Gal Pressman <galpress@amazon.com>
>>>> ---
>>>>  drivers/infiniband/core/verbs.c | 2 ++
>>>>  include/rdma/ib_verbs.h         | 9 +++++++--
>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>
>>> Do you have any specification/documentation for that?
>>>
>>> I'm afraid that awesome press release [1] is not enough.
>>>
>>> [1]
>>> https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
>>>
>>> Thanks
>>>
>>
>> Hey Leon,
>> The commit message (and part of the cover letter) contains a description of SRD.
>> It is similar to UD in most ways, with the addition of reliable out-of-order
>> delivery. Work request usage is the same as UD, with a minor difference in
>> the completion's status codes.
>>
>> The exact specification is internal to AWS, but if you have any more questions
>> I'll be more than happy to answer.
>
> All structures which you extended are backed by IBTA, and everything
> that is "internal to ..." is supposed to be implemented by the various extensions
> which we already have in RDMA/core. For example, in the case of SRD, we have
> IB_QPT_DRIVER exactly for that.
>
> Otherwise, please provide the full semantics of this SRD type: out-of-order
> semantics, error handling, state diagram, retransmission, etc.
>
> Thanks
>

We can use IB_QPT_DRIVER. If I understand correctly, the only downside is that
kernel QPs will not be able to utilize SRD.
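The IB_QPT_DRIVER route discussed above can be pictured with a small userspace sketch: the core sees only a generic "driver" QP type, and a driver-private field selects SRD. Only the value 0xFF for IB_QPT_DRIVER comes from the patch context. Every `demo_*` name, the sub-type field, and the other enum values are assumptions made for illustration and are not the EFA driver's actual ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for core QP types. Only DRIVER = 0xFF mirrors the patch
 * context; the value of UD here is arbitrary. */
enum demo_qp_type {
	DEMO_QPT_UD,
	DEMO_QPT_DRIVER = 0xFF,
};

/* Hypothetical driver-private sub-type, invisible to the core. */
enum demo_drv_qp_type {
	DEMO_DRV_QPT_SRD = 0,
};

/* Hypothetical create-QP command as the driver would receive it. */
struct demo_create_qp_cmd {
	enum demo_qp_type core_type;
	uint32_t drv_type;	/* only read when core_type is DRIVER */
};

enum demo_hw_qp {
	DEMO_HW_QP_INVALID = -1,
	DEMO_HW_QP_UD,
	DEMO_HW_QP_SRD,
};

/* Resolve the hardware QP flavour from the core type plus the
 * driver-private sub-type. */
enum demo_hw_qp demo_resolve_qp(const struct demo_create_qp_cmd *cmd)
{
	assert(cmd);

	switch (cmd->core_type) {
	case DEMO_QPT_UD:
		return DEMO_HW_QP_UD;
	case DEMO_QPT_DRIVER:
		/* The core cannot tell SRD apart; only the driver can. */
		return cmd->drv_type == DEMO_DRV_QPT_SRD ?
			DEMO_HW_QP_SRD : DEMO_HW_QP_INVALID;
	default:
		return DEMO_HW_QP_INVALID;
	}
}
```

In this sketch the SRD selection lives entirely in the driver-private field, so a caller that can only name core QP types has no way to request it. That matches the downside raised in the reply above.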
>  enum {
> @@ -119,14 +120,16 @@ enum rdma_transport_type {
>  	RDMA_TRANSPORT_IB,
>  	RDMA_TRANSPORT_IWARP,
>  	RDMA_TRANSPORT_USNIC,
> -	RDMA_TRANSPORT_USNIC_UDP
> +	RDMA_TRANSPORT_USNIC_UDP,
> +	RDMA_TRANSPORT_EFA,
>  };
>
>  enum rdma_protocol_type {
>  	RDMA_PROTOCOL_IB,
>  	RDMA_PROTOCOL_IBOE,
>  	RDMA_PROTOCOL_IWARP,
> -	RDMA_PROTOCOL_USNIC_UDP
> +	RDMA_PROTOCOL_USNIC_UDP,
> +	RDMA_PROTOCOL_EFA,

EFA is the (marketing?) name of the NIC, not really the transport or protocol.
You called the protocol SRD in the cover letter. I'm not sure if that would
apply to both the transport and the protocol, but it seems a better option than EFA.

- Sean
On 05-Dec-18 21:23, Hefty, Sean wrote:
>>  enum {
>> @@ -119,14 +120,16 @@ enum rdma_transport_type {
>>  	RDMA_TRANSPORT_IB,
>>  	RDMA_TRANSPORT_IWARP,
>>  	RDMA_TRANSPORT_USNIC,
>> -	RDMA_TRANSPORT_USNIC_UDP
>> +	RDMA_TRANSPORT_USNIC_UDP,
>> +	RDMA_TRANSPORT_EFA,
>>  };
>>
>>  enum rdma_protocol_type {
>>  	RDMA_PROTOCOL_IB,
>>  	RDMA_PROTOCOL_IBOE,
>>  	RDMA_PROTOCOL_IWARP,
>> -	RDMA_PROTOCOL_USNIC_UDP
>> +	RDMA_PROTOCOL_USNIC_UDP,
>> +	RDMA_PROTOCOL_EFA,
>
> EFA is the (marketing?) name of the NIC, not really the transport or protocol.
> You called the protocol SRD in the cover letter. I'm not sure if that would
> apply to both the transport and the protocol, but it seems a better option than EFA.

We support both SRD and UD; we consider EFA a family of protocols.

>
> - Sean
>
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 178899e3ce73..970744ffbf33 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -206,6 +206,8 @@ rdma_node_get_transport(enum rdma_node_type node_type)
 		return RDMA_TRANSPORT_USNIC_UDP;
 	if (node_type == RDMA_NODE_RNIC)
 		return RDMA_TRANSPORT_IWARP;
+	if (node_type == RDMA_NODE_EFA)
+		return RDMA_TRANSPORT_EFA;
 
 	return RDMA_TRANSPORT_IB;
 }
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 92633c15125b..8d4b07b346b7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -108,6 +108,7 @@ enum rdma_node_type {
 	RDMA_NODE_RNIC,
 	RDMA_NODE_USNIC,
 	RDMA_NODE_USNIC_UDP,
+	RDMA_NODE_EFA,
 };
 
 enum {
@@ -119,14 +120,16 @@ enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
 	RDMA_TRANSPORT_IWARP,
 	RDMA_TRANSPORT_USNIC,
-	RDMA_TRANSPORT_USNIC_UDP
+	RDMA_TRANSPORT_USNIC_UDP,
+	RDMA_TRANSPORT_EFA,
 };
 
 enum rdma_protocol_type {
 	RDMA_PROTOCOL_IB,
 	RDMA_PROTOCOL_IBOE,
 	RDMA_PROTOCOL_IWARP,
-	RDMA_PROTOCOL_USNIC_UDP
+	RDMA_PROTOCOL_USNIC_UDP,
+	RDMA_PROTOCOL_EFA,
 };
 
 __attribute_const__ enum rdma_transport_type
@@ -538,6 +541,7 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
 #define RDMA_CORE_CAP_PROT_RAW_PACKET   0x01000000
 #define RDMA_CORE_CAP_PROT_USNIC        0x02000000
+#define RDMA_CORE_CAP_PROT_EFA          0x04000000
 
 #define RDMA_CORE_PORT_IB_GRH_REQUIRED (RDMA_CORE_CAP_IB_GRH_REQUIRED \
 					| RDMA_CORE_CAP_PROT_ROCE     \
@@ -1095,6 +1099,7 @@ enum ib_qp_type {
 	IB_QPT_RAW_PACKET = 8,
 	IB_QPT_XRC_INI = 9,
 	IB_QPT_XRC_TGT,
+	IB_QPT_SRD,
 	IB_QPT_MAX,
 	IB_QPT_DRIVER = 0xFF,
 	/* Reserve a range for qp types internal to the low level driver.
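To make the effect of the verbs.c hunk easier to check outside the kernel tree, here is a minimal userspace sketch of the same mapping. The `demo_*` names are stand-ins rather than the kernel's identifiers, the enum values are arbitrary, and the condition guarding the USNIC_UDP return is inferred because it lies outside the hunk. The capability bit values are the ones shown in the ib_verbs.h hunk.

```c
#include <assert.h>

/* Stand-in node types; DEMO_NODE_OTHER covers every type the hunk
 * does not mention. */
enum demo_node_type {
	DEMO_NODE_OTHER,
	DEMO_NODE_RNIC,
	DEMO_NODE_USNIC_UDP,
	DEMO_NODE_EFA,		/* added by the patch */
};

enum demo_transport_type {
	DEMO_TRANSPORT_IB,
	DEMO_TRANSPORT_IWARP,
	DEMO_TRANSPORT_USNIC_UDP,
	DEMO_TRANSPORT_EFA,	/* added by the patch */
};

/* Protocol capability bits, values as they appear in the hunk. */
#define DEMO_CAP_PROT_ROCE_UDP_ENCAP	0x00800000u
#define DEMO_CAP_PROT_RAW_PACKET	0x01000000u
#define DEMO_CAP_PROT_USNIC		0x02000000u
#define DEMO_CAP_PROT_EFA		0x04000000u	/* added by the patch */

/* Compile-time check that the new bit overlaps none of its neighbours. */
static_assert((DEMO_CAP_PROT_EFA &
	       (DEMO_CAP_PROT_ROCE_UDP_ENCAP | DEMO_CAP_PROT_RAW_PACKET |
		DEMO_CAP_PROT_USNIC)) == 0,
	      "capability bits must not overlap");

/* Same shape as the patched function: explicit cases first, then the
 * IB default for everything else. */
enum demo_transport_type demo_node_get_transport(enum demo_node_type node_type)
{
	if (node_type == DEMO_NODE_USNIC_UDP)	/* inferred condition */
		return DEMO_TRANSPORT_USNIC_UDP;
	if (node_type == DEMO_NODE_RNIC)
		return DEMO_TRANSPORT_IWARP;
	if (node_type == DEMO_NODE_EFA)
		return DEMO_TRANSPORT_EFA;

	return DEMO_TRANSPORT_IB;
}
```

Because the function falls through to the IB transport by default, a node type without its own branch would be reported as InfiniBand, which is presumably why the patch adds an explicit EFA case.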
Add EFA node type, transport type and protocol type to core code.
EFA relies on underlying implementation similar to reliable datagram, so
we also define a new QP type named Scalable Reliable Datagram (SRD).

EFA reliable datagram transport provides reliable out-of-order delivery,
transparently utilizing multiple network paths to reduce network tail
latency. Its interface is similar to UD, in particular it supports
message size up to MTU, with error handling extended to support reliable
communication.

Signed-off-by: Gal Pressman <galpress@amazon.com>
---
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 9 +++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)
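The commit message describes delivery as reliable but out of order. As a rough illustration of what that could mean for a consumer that needs ordering, the sketch below reorders arrivals using an application-level sequence number. This is an assumption about how an application might cope, not part of SRD, the patch, or any EFA interface. The `demo_*` names and the window size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_WINDOW 8u	/* hypothetical reorder window, in messages */

struct demo_reorder {
	uint32_t next_seq;		/* next sequence number to deliver */
	bool present[DEMO_WINDOW];	/* slot = seq % DEMO_WINDOW */
	int payload[DEMO_WINDOW];
};

/* Record one arrival. Returns false when seq lies outside the current
 * window, which also rejects sequence numbers already delivered. */
bool demo_reorder_push(struct demo_reorder *r, uint32_t seq, int payload)
{
	assert(r);

	/* Unsigned subtraction wraps, so an old seq yields a large value. */
	if (seq - r->next_seq >= DEMO_WINDOW)
		return false;

	r->present[seq % DEMO_WINDOW] = true;
	r->payload[seq % DEMO_WINDOW] = payload;
	return true;
}

/* Hand out the next in-order payload, if it has arrived yet. */
bool demo_reorder_pop(struct demo_reorder *r, int *out)
{
	uint32_t slot;

	assert(r && out);

	slot = r->next_seq % DEMO_WINDOW;
	if (!r->present[slot])
		return false;

	*out = r->payload[slot];
	r->present[slot] = false;
	r->next_seq++;
	return true;
}
```

A zero-initialised `struct demo_reorder` starts at sequence number 0. Pushing message 1 before message 0 leaves `demo_reorder_pop()` returning false until message 0 arrives, after which both are handed out in order.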