Message ID | 20220608104320.53066-1-chengyou@linux.alibaba.com (mailing list archive) |
---|---|
Series | Elastic RDMA Adapter (ERDMA) driver |
On Wed, Jun 08, 2022 at 06:43:09PM +0800, Cheng Xu wrote:
> Hello all,
>
> This v10 patch set introduces the Elastic RDMA Adapter (ERDMA) driver,
> which released in Apsara Conference 2021 by Alibaba. The PR of ERDMA
> userspace provider has already been created [1].
>
> ERDMA enables large-scale RDMA acceleration capability in Alibaba ECS
> environment, initially offered in g7re instance. It can improve the
> efficiency of large-scale distributed computing and communication
> significantly and expand dynamically with the cluster scale of Alibaba
> Cloud.
>
> ERDMA is a RDMA networking adapter based on the Alibaba MOC hardware. It
> works in the VPC network environment (overlay network), and uses iWarp
> transport protocol. ERDMA supports reliable connection (RC). ERDMA also
> supports both kernel space and user space verbs. Now we have already
> supported HPC/AI applications with libfabric, NoF and some other internal
> verbs libraries, such as xrdma, epsl, etc,.
>
> For the ECS instance with RDMA enabled, our MOC hardware generates two
> kinds of PCI devices: one for ERDMA, and one for the original net device
> (virtio-net). They are separated PCI devices.
>
> Fixed issues in v10:
> - Remove unneeded semicolon in erdma_qp.c reported by Abcci Robot.
> - Remove duplicated include in erdma_cm.c reported by Abcci Robot.
> - Fix return value check in erdma_alloc_ucontext() reported by Hulk
>   Robot.
> - Sort the include headers.

I updated it, but please wait longer before sending v11.

Jason
On 6/8/22 7:54 PM, Jason Gunthorpe wrote:
> On Wed, Jun 08, 2022 at 06:43:09PM +0800, Cheng Xu wrote:
>> Hello all,
>>
>> This v10 patch set introduces the Elastic RDMA Adapter (ERDMA) driver,
>> which released in Apsara Conference 2021 by Alibaba. The PR of ERDMA
>> userspace provider has already been created [1].
>>
>> ERDMA enables large-scale RDMA acceleration capability in Alibaba ECS
>> environment, initially offered in g7re instance. It can improve the
>> efficiency of large-scale distributed computing and communication
>> significantly and expand dynamically with the cluster scale of Alibaba
>> Cloud.
>>
>> ERDMA is a RDMA networking adapter based on the Alibaba MOC hardware. It
>> works in the VPC network environment (overlay network), and uses iWarp
>> transport protocol. ERDMA supports reliable connection (RC). ERDMA also
>> supports both kernel space and user space verbs. Now we have already
>> supported HPC/AI applications with libfabric, NoF and some other internal
>> verbs libraries, such as xrdma, epsl, etc,.
>>
>> For the ECS instance with RDMA enabled, our MOC hardware generates two
>> kinds of PCI devices: one for ERDMA, and one for the original net device
>> (virtio-net). They are separated PCI devices.
>>
>> Fixed issues in v10:
>> - Remove unneeded semicolon in erdma_qp.c reported by Abcci Robot.
>> - Remove duplicated include in erdma_cm.c reported by Abcci Robot.
>> - Fix return value check in erdma_alloc_ucontext() reported by Hulk
>>   Robot.
>> - Sort the include headers.
>
> I updated it, but please wait longer before sending v11.
>
> Jason

Got it, and I will wait longer.

Thanks,
Cheng Xu