Message ID | 20200210131223.87776.21339.stgit@awfm-01.aw.intel.com (mailing list archive) |
---|---|
Series | New hfi1 feature: Accelerated IP |
On Mon, Feb 10, 2020 at 08:18:05AM -0500, Dennis Dalessandro wrote:
> This patch series is an accelerated ipoib using the rdma netdev mechanism
> already present in ipoib. A new device capability bit,
> IB_DEVICE_RDMA_NETDEV_OPA, triggers ipoib to create a datagram QP using the
> IB_QP_CREATE_NETDEV_USE.
>
> The highlights include:
> - Sharing send and receive resources with VNIC
> - Allows for switching between connected mode and datagram mode

There is still value in connected mode?

> - Increases the maximum datagram MTU for opa devices to 10k
>
> The same spreading capability exploited by VNIC is used here to vary
> the receive context that receives the packet.
>
> The patches are fully bisectable and stepwise implement the capability.

This is a lot of code to send without a performance justification.. What is it? Is it worthwhile?

> Gary Leshner (6):
>   IB/hfi1: Add functions to transmit datagram ipoib packets
>   IB/hfi1: Add the transmit side of a datagram ipoib RDMA netdev
>   IB/hfi1: Remove module parameter for KDETH qpns
>   IB/{rdmavt,hfi1}: Implement creation of accelerated UD QPs
>   IB/{hfi1,ipoib,rdma}: Broadcast ping sent packets which exceeded mtu size
>   IB/ipoib: Add capability to switch between datagram and connected mode
>
> Grzegorz Andrejczuk (7):
>   IB/hfi1: RSM rules for AIP
>   IB/hfi1: Rename num_vnic_contexts as num_netdev_contexts
>   IB/hfi1: Add functions to receive accelerated ipoib packets
>   IB/hfi1: Add interrupt handler functions for accelerated ipoib
>   IB/hfi1: Add rx functions for dummy netdev

This dummy netdev thing seemed very strange

Jason

On 2/10/2020 8:31 AM, Jason Gunthorpe wrote:
> On Mon, Feb 10, 2020 at 08:18:05AM -0500, Dennis Dalessandro wrote:
>> This patch series is an accelerated ipoib using the rdma netdev mechanism
>> already present in ipoib. A new device capability bit,
>> IB_DEVICE_RDMA_NETDEV_OPA, triggers ipoib to create a datagram QP using the
>> IB_QP_CREATE_NETDEV_USE.
>>
>> The highlights include:
>> - Sharing send and receive resources with VNIC
>> - Allows for switching between connected mode and datagram mode
>
> There is still value in connected mode?

It's really a compatibility thing. If someone wants to change modes, that will work. There won't be any benefit to connected mode though. The goal is just to not break.

>
>> - Increases the maximum datagram MTU for opa devices to 10k
>>
>> The same spreading capability exploited by VNIC is used here to vary
>> the receive context that receives the packet.
>>
>> The patches are fully bisectable and stepwise implement the capability.
>
> This is a lot of code to send without a performance
> justification.. What is it? Is it worthwhile?

It avoids the scalability problem of connected mode, the number of QPs. Incoming packets are spread into multiple receive contexts, increasing parallelism. The MTU is increased to allow 10K. It also reduces/removes the verbs TX overhead by allowing packets to be sent through the SDMA engines directly.

>> Gary Leshner (6):
>>   IB/hfi1: Add functions to transmit datagram ipoib packets
>>   IB/hfi1: Add the transmit side of a datagram ipoib RDMA netdev
>>   IB/hfi1: Remove module parameter for KDETH qpns
>>   IB/{rdmavt,hfi1}: Implement creation of accelerated UD QPs
>>   IB/{hfi1,ipoib,rdma}: Broadcast ping sent packets which exceeded mtu size
>>   IB/ipoib: Add capability to switch between datagram and connected mode
>>
>> Grzegorz Andrejczuk (7):
>>   IB/hfi1: RSM rules for AIP
>>   IB/hfi1: Rename num_vnic_contexts as num_netdev_contexts
>>   IB/hfi1: Add functions to receive accelerated ipoib packets
>>   IB/hfi1: Add interrupt handler functions for accelerated ipoib
>>   IB/hfi1: Add rx functions for dummy netdev
>
> This dummy netdev thing seemed very strange

One of the existing uses of dummy netdev seems to be to tie multiple hardware interfaces together. We are using a similar concept for two software interfaces, VNIC and AIP. The dummy netdev here will own the receiving resources, which are shared.

-Denny

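The receive-side spreading mentioned in this reply can be pictured with a small model. What follows is a minimal sketch in Python, not the hfi1 implementation: the flow fields (`src_lid`, `src_qpn`), the CRC32 hash, and the context count of 8 are all assumptions made for illustration, since the thread does not say which packet fields the hardware rules match on or how many receive contexts are used.

```python
import zlib
from collections import Counter

NUM_RX_CONTEXTS = 8  # assumed value; the thread does not state a count


def pick_rx_context(src_lid: int, src_qpn: int,
                    num_contexts: int = NUM_RX_CONTEXTS) -> int:
    """Map a flow to a receive context index (hypothetical selection rule).

    The same (src_lid, src_qpn) pair always maps to the same context,
    so packets of one flow stay on one context while different flows
    can be handled in parallel on different contexts.
    """
    key = src_lid.to_bytes(4, "big") + src_qpn.to_bytes(4, "big")
    return zlib.crc32(key) % num_contexts


def spread(flows, num_contexts: int = NUM_RX_CONTEXTS) -> Counter:
    """Count how many flows land on each receive context."""
    return Counter(pick_rx_context(lid, qpn, num_contexts)
                   for lid, qpn in flows)


if __name__ == "__main__":
    # 64 made-up flows, one per hypothetical sender.
    flows = [(lid, 0x100 + lid) for lid in range(1, 65)]
    for ctx, count in sorted(spread(flows).items()):
        print(f"context {ctx}: {count} flows")
```

With a single receive context every flow would be processed in one place; with several, the per-context counts printed above show how the load could be divided. How evenly it divides depends on the hash and on the traffic, which this sketch does not try to model.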
On Mon, Feb 10, 2020 at 12:36:02PM -0500, Dennis Dalessandro wrote:
> On 2/10/2020 8:31 AM, Jason Gunthorpe wrote:
> > On Mon, Feb 10, 2020 at 08:18:05AM -0500, Dennis Dalessandro wrote:
> > > This patch series is an accelerated ipoib using the rdma netdev mechanism
> > > already present in ipoib. A new device capability bit,
> > > IB_DEVICE_RDMA_NETDEV_OPA, triggers ipoib to create a datagram QP using the
> > > IB_QP_CREATE_NETDEV_USE.
> > >
> > > The highlights include:
> > > - Sharing send and receive resources with VNIC
> > > - Allows for switching between connected mode and datagram mode
> >
> > There is still value in connected mode?
>
> It's really a compatibility thing. If someone wants to change modes, that
> will work. There won't be any benefit to connected mode though. The goal is
> just to not break.

I am a bit confused by this.. I thought the mlx5 implementation already could select connected mode?

Why were core ipoib changes needed?

> > > The patches are fully bisectable and stepwise implement the capability.
> >
> > This is a lot of code to send without a performance
> > justification.. What is it? Is it worthwhile?
>
> It avoids the scalability problem of connected mode, the number of QPs.
> Incoming packets are spread into multiple receive contexts, increasing
> parallelism. The MTU is increased to allow 10K. It also reduces/removes the
> verbs TX overhead by allowing packets to be sent through the SDMA engines
> directly.

No numbers to share?

Jason

On 2/10/2020 1:32 PM, Jason Gunthorpe wrote:
> On Mon, Feb 10, 2020 at 12:36:02PM -0500, Dennis Dalessandro wrote:
>> On 2/10/2020 8:31 AM, Jason Gunthorpe wrote:
>>> On Mon, Feb 10, 2020 at 08:18:05AM -0500, Dennis Dalessandro wrote:
>>>> This patch series is an accelerated ipoib using the rdma netdev mechanism
>>>> already present in ipoib. A new device capability bit,
>>>> IB_DEVICE_RDMA_NETDEV_OPA, triggers ipoib to create a datagram QP using the
>>>> IB_QP_CREATE_NETDEV_USE.
>>>>
>>>> The highlights include:
>>>> - Sharing send and receive resources with VNIC
>>>> - Allows for switching between connected mode and datagram mode
>>>
>>> There is still value in connected mode?
>>
>> It's really a compatibility thing. If someone wants to change modes, that
>> will work. There won't be any benefit to connected mode though. The goal is
>> just to not break.
>
> I am a bit confused by this.. I thought the mlx5 implementation
> already could select connected mode?
>
> Why were core ipoib changes needed?

I don't think so. Patch 15/16 seemed to be necessary to get connected mode to work with the rdma netdev.

>
>>>> The patches are fully bisectable and stepwise implement the capability.
>>>
>>> This is a lot of code to send without a performance
>>> justification.. What is it? Is it worthwhile?
>>
>> It avoids the scalability problem of connected mode, the number of QPs.
>> Incoming packets are spread into multiple receive contexts, increasing
>> parallelism. The MTU is increased to allow 10K. It also reduces/removes the
>> verbs TX overhead by allowing packets to be sent through the SDMA engines
>> directly.
>
> No numbers to share?

No numbers directly, but I can say that AIP enables line-rate performance between two nodes with Datagram Mode. It also provides IPoFabric latency improvements relative to standard Datagram Mode without AIP.

-Denny
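
The scalability point raised in this thread (the number of QPs in connected mode) can be illustrated with a rough count. This is a simplified model in Python and is not taken from the patches: it assumes a connected-mode interface uses two QPs per peer it exchanges traffic with (one for each direction), and that a datagram-mode interface uses a single UD QP regardless of the number of peers. Both function names are hypothetical.

```python
def connected_mode_qps(num_peers: int, qps_per_peer: int = 2) -> int:
    """QPs needed on one node under the assumed per-peer model."""
    if num_peers < 0 or qps_per_peer < 0:
        raise ValueError("counts must be non-negative")
    return num_peers * qps_per_peer


def datagram_mode_qps(num_peers: int) -> int:
    """QPs needed on one node when a single UD QP serves all peers (assumed)."""
    if num_peers < 0:
        raise ValueError("num_peers must be non-negative")
    return 1


if __name__ == "__main__":
    print(f"{'peers':>8} {'connected':>10} {'datagram':>9}")
    for peers in (10, 100, 1000, 10000):
        print(f"{peers:>8} {connected_mode_qps(peers):>10} "
              f"{datagram_mode_qps(peers):>9}")
```

Under these assumptions the connected-mode count grows linearly with the number of peers while the datagram-mode count stays constant, which is the shape of the problem the reply describes. Real per-peer costs depend on the driver and configuration.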