[v2] IB/ipoib: improve latency in ipoib/cm connection formation

Message ID 1618338965-16717-1-git-send-email-manjunath.b.patil@oracle.com (mailing list archive)
State Accepted
Delegated to: Jason Gunthorpe

Commit Message

Manjunath Patil April 13, 2021, 6:36 p.m. UTC
Currently ipoib connected mode queries the device[HCA] to get pkey table
entry during connection formation. This will increase the time taken to
form the connection, especially when limited pkeys are in use.  This
gets worse when multiple connection attempts are done in parallel.

Since ipoib interfaces are locked to a single pkey, use the pkey index
that was determined at link up time instead of searching anything.

This improved the latency from 500ms to 1ms on an internal setup.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
---
v2: v1 used the cached version [ib_find_cached_pkey()] to get the pkey table
entry. Following Jason's comments on v1, I switched to the pkey index that was
determined at link up time in v2.

 drivers/infiniband/ulp/ipoib/ipoib_cm.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)
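
A minimal userspace sketch of the trade-off described in the commit message, not kernel code. The search routine is modelled loosely on how ib_find_pkey() is understood to behave: it stops at the first full-member match, otherwise it remembers the first limited-member match and scans to the end of the table. The names struct port_model, struct iface_model, query_pkey(), find_pkey() and tx_init_pkey_index(), and the table size, are assumptions introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_TBL_LEN     8      /* hypothetical table size */
#define PKEY_FULL_MEMBER 0x8000 /* top bit marks full membership */
#define PKEY_BASE_MASK   0x7fff

/* Hypothetical stand-in for one port's pkey table on the HCA. */
struct port_model {
	uint16_t pkey_tbl[PKEY_TBL_LEN];
	unsigned int queries; /* number of simulated device queries */
};

/* Hypothetical stand-in for the per-interface private data. */
struct iface_model {
	uint16_t pkey;
	uint16_t pkey_index; /* resolved once, at link up time */
};

/* Every table read is modelled as one (slow) device query. */
static uint16_t query_pkey(struct port_model *port, size_t i)
{
	port->queries++;
	return port->pkey_tbl[i];
}

/*
 * Table search: return at the first full-member match, otherwise keep
 * the first limited-member match and continue to the end of the table.
 * Returns 0 on success, -1 if the pkey is not in the table.
 */
static int find_pkey(struct port_model *port, uint16_t pkey, uint16_t *index)
{
	int partial_ix = -1;

	assert(port != NULL && index != NULL);
	for (size_t i = 0; i < PKEY_TBL_LEN; i++) {
		uint16_t tmp = query_pkey(port, i);

		if ((pkey & PKEY_BASE_MASK) != (tmp & PKEY_BASE_MASK))
			continue;
		if (tmp & PKEY_FULL_MEMBER) {
			*index = (uint16_t)i;
			return 0;
		}
		if (partial_ix < 0)
			partial_ix = (int)i;
	}
	if (partial_ix >= 0) {
		*index = (uint16_t)partial_ix;
		return 0;
	}
	return -1;
}

/* Approach taken by the patch: no search, reuse the stored index. */
static uint16_t tx_init_pkey_index(const struct iface_model *iface)
{
	assert(iface != NULL);
	return iface->pkey_index;
}
```

In this model a limited-member pkey forces the loop to read every table entry before returning, which is consistent with the remark that the search is slowest when limited pkeys are in use. The cached path performs no table reads at all.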

Comments

Jason Gunthorpe April 13, 2021, 6:42 p.m. UTC | #1
On Tue, Apr 13, 2021 at 11:36:05AM -0700, Manjunath Patil wrote:
> Currently ipoib connected mode queries the device[HCA] to get pkey table
> entry during connection formation. This will increase the time taken to
> form the connection, especially when limited pkeys are in use.  This
> gets worse when multiple connection attempts are done in parallel.
> 
> Since ipoib interfaces are locked to a single pkey, use the pkey index
> that was determined at link up time instead of searching anything.
> 
> This improved the latency from 500ms to 1ms on an internal setup.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
> ---
> v2: v1 used the cached version [ib_find_cached_pkey()] to get the pkey table
> entry. Following Jason's comments on v1, I switched to the pkey index that was
> determined at link up time in v2.

Can you confirm that the pkey index does get updated if the SM changes
the pkey table? (and if so how is the locking done for reading the pkey_index?)

That is about the only reason to have had a dedicated query here

Jason
Haakon Bugge April 14, 2021, 10:01 a.m. UTC | #2
> On 13 Apr 2021, at 20:42, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> 
> On Tue, Apr 13, 2021 at 11:36:05AM -0700, Manjunath Patil wrote:
>> Currently ipoib connected mode queries the device[HCA] to get pkey table
>> entry during connection formation. This will increase the time taken to
>> form the connection, especially when limited pkeys are in use.  This
>> gets worse when multiple connection attempts are done in parallel.
>> 
>> Since ipoib interfaces are locked to a single pkey, use the pkey index
>> that was determined at link up time instead of searching anything.
>> 
>> This improved the latency from 500ms to 1ms on an internal setup.
>> 
>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
>> ---
>> v2: v1 used the cached version [ib_find_cached_pkey()] to get the pkey table
>> entry. Following Jason's comments on v1, I switched to the pkey index that was
>> determined at link up time in v2.
> 
> Can you confirm that the pkey index does get updated if the SM changes
> the pkey table? (and if so how is the locking done for reading the pkey_index?)
> 
> That is about the only reason to have had a dedicated query here

Unless I misunderstood you completely, one instance of a netdev has only a single pkey for its lifetime:

> On 27 Jan 2021, at 01:16, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> Yes, and each new netdev that spawns has a fixed pkey that doesn't
> change for the life of the netdev

... and, if you anticipate that the UD QP is using pkey1 at indexX and the pkey table gets updated by the SM so that the new entry at indexX becomes pkey2, then the old pkey1 is now at a new position in the table (or no longer in the table, which is another case). Let's say pkey1 is now found at indexY. Now, the connected mode QP will use pkey1 at indexY if a dedicated query is performed.

Then we end up in a split brain: the UD QP uses pkey2 and the RC QPs use pkey1. With Manju's patch, they will at least use the same pkey.
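
The scenario above can be sketched as a small userspace model, under the assumption that a QP stores only an index and effectively uses whatever pkey the table holds at that index. The names struct qp_model, qp_effective_pkey(), lookup_index() and split_brain(), and the table size, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_LEN 4 /* hypothetical table size */

/* Hypothetical model: a QP is programmed with an index, not a pkey. */
struct qp_model {
	uint16_t pkey_index;
};

/* The pkey a QP effectively uses is the table entry at its index. */
static uint16_t qp_effective_pkey(const uint16_t *tbl,
				  const struct qp_model *qp)
{
	assert(tbl != NULL && qp != NULL && qp->pkey_index < TBL_LEN);
	return tbl[qp->pkey_index];
}

/* Exact-match lookup (the "dedicated query"); returns index or -1. */
static int lookup_index(const uint16_t *tbl, uint16_t pkey)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < TBL_LEN; i++)
		if (tbl[i] == pkey)
			return (int)i;
	return -1;
}

/* Returns 1 when the UD and RC QPs resolve to different pkeys. */
static int split_brain(const uint16_t *tbl, const struct qp_model *ud,
		       const struct qp_model *rc)
{
	return qp_effective_pkey(tbl, ud) != qp_effective_pkey(tbl, rc);
}
```

If the table is rewritten so that indexX now holds pkey2 and pkey1 moves to indexY, an RC QP created from a fresh lookup lands on indexY while the UD QP still sits on indexX, and split_brain() reports a mismatch. An RC QP created from the same stored index as the UD QP never disagrees with it in this model, although both may then be using pkey2 until the index is refreshed.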

Not related to this commit; I find it strange that the return value of update_child_pkey() is not used in __ipoib_ib_dev_flush().


Thxs, Håkon

> 
> Jason
Jason Gunthorpe April 19, 2021, 5:55 p.m. UTC | #3
On Wed, Apr 14, 2021 at 10:01:43AM +0000, Haakon Bugge wrote:

> ... and, if you anticipate that the UD QP is using pkey1 at indexX
> and the pkey table gets updated by the SM so that the new entry at
> indexX becomes pkey2, then the old pkey1 is now at a new position in
> the table (or no longer in the table, which is another case). Let's
> say pkey1 is now found at indexY. Now, the connected mode QP will use
> pkey1 at indexY if a dedicated query is performed.

This is the concern. The SM is really supposed to keep the pkey table
stable; I think if it changes it should trigger some heavy flush.

So just confirm that the heavy flush causes a new pkey index to be
loaded and the UD side gets resynced, and we are good.

> Then we end up in a split brain, the UD QP uses pkey2 and the RC QPs
> use pkey1. With Manju's patch, they will at least use the same pkey.

Well, as you pointed out, it goes through the heavy flush and triggers
ipoib_pkey_dev_check_presence(), which does update the pkey_index, so
it seems fine.
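
A rough userspace sketch of the resync described above, assuming only what the thread states: a heavy flush re-resolves the interface's pkey and updates the stored pkey_index. The functions check_presence() and heavy_flush() and struct iface_model are hypothetical stand-ins, not the kernel's ipoib_pkey_dev_check_presence() or __ipoib_ib_dev_flush().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-interface private data. */
struct iface_model {
	uint16_t pkey;        /* fixed for the life of the interface */
	uint16_t pkey_index;  /* refreshed whenever the table changes */
	bool pkey_present;
};

/* Re-resolve the interface's pkey and refresh the stored index. */
static void check_presence(struct iface_model *iface, const uint16_t *tbl,
			   size_t len)
{
	assert(iface != NULL && tbl != NULL);
	iface->pkey_present = false;
	for (size_t i = 0; i < len; i++) {
		if (tbl[i] == iface->pkey) {
			iface->pkey_index = (uint16_t)i;
			iface->pkey_present = true;
			return;
		}
	}
}

/*
 * Model of a heavy flush after a pkey table change. Returns true when
 * the stored index moved, meaning QPs built from the old index would
 * have to be torn down and recreated.
 */
static bool heavy_flush(struct iface_model *iface, const uint16_t *tbl,
			size_t len)
{
	uint16_t old_index = iface->pkey_index;

	check_presence(iface, tbl, len);
	return iface->pkey_present && iface->pkey_index != old_index;
}
```

Because connection setup after the patch only reads the stored index, its correctness in this model depends on the flush running after every table change, which is the point being confirmed in the reply above.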

Applied to for-next

> Not related to this commit; I find it strange that the return value
> of update_child_pkey() is not used in __ipoib_ib_dev_flush().

The second callsite uses it

Jason

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index d5d592b..9dbc85a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1122,12 +1122,8 @@  static int ipoib_cm_modify_tx_init(struct net_device *dev,
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
 	struct ib_qp_attr qp_attr;
 	int qp_attr_mask, ret;
-	ret = ib_find_pkey(priv->ca, priv->port, priv->pkey, &qp_attr.pkey_index);
-	if (ret) {
-		ipoib_warn(priv, "pkey 0x%x not found: %d\n", priv->pkey, ret);
-		return ret;
-	}
 
+	qp_attr.pkey_index = priv->pkey_index;
 	qp_attr.qp_state = IB_QPS_INIT;
 	qp_attr.qp_access_flags = IB_ACCESS_LOCAL_WRITE;
 	qp_attr.port_num = priv->port;