
[V2,2/4] RDMA/core: Introduce shared CQ pool API

Message ID 1589370763-81205-3-git-send-email-yaminf@mellanox.com (mailing list archive)
State Superseded
Series Introducing RDMA shared CQ pool

Commit Message

Yamin Friedman May 13, 2020, 11:52 a.m. UTC
Allow a ULP to ask the core to provide a completion queue based on a
least-used search on per-device CQ pools. The device CQ pools grow in a
lazy fashion when more CQs are requested.

This feature reduces the number of interrupts when using many QPs.
Using shared CQs allows for more efficient completion handling. It also
reduces the overhead needed for CQ contexts.
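
For reference, a minimal sketch of how a ULP might consume the proposed
pool API (illustrative only, not taken from this series; ibdev, nr_cqe and
the poll context are placeholders):

	struct ib_cq *cq;

	/* Claim nr_cqe entries on a pool CQ; a hint of -1 lets the core pick a vector */
	cq = ib_cq_pool_get(ibdev, nr_cqe, -1, IB_POLL_SOFTIRQ);
	if (IS_ERR(cq))
		return PTR_ERR(cq);

	/* ... use cq as the send/recv CQ when creating QPs ... */

	/* On teardown, return the claimed entries instead of freeing the CQ */
	ib_cq_pool_put(cq, nr_cqe);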

Test setup:
Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers.
Running NVMeoF 4KB read IOs over ConnectX-5EX across Spectrum switch.
TX-depth = 32. The patch was applied in the nvme driver on both the target
and initiator. Four controllers are accessed from each core. In the
current test case we have exposed sixteen NVMe namespaces using four
different subsystems (four namespaces per subsystem) from one NVM port.
Each controller allocated X queues (RDMA QPs) and attached to Y CQs.
Before this series we had X == Y, i.e., for four controllers we created a
total of 4X QPs and 4X CQs. In the shared case, we create 4X QPs and
only X CQs, which means that four controllers share a completion queue
per core. Up to fourteen cores there is no significant change in
performance, and the number of interrupts per second is less than a
million in the current case.
==================================================
|Cores|Current KIOPs  |Shared KIOPs  |improvement|
|-----|---------------|--------------|-----------|
|14   |2332           |2723          |16.7%      |
|-----|---------------|--------------|-----------|
|20   |2086           |2712          |30%        |
|-----|---------------|--------------|-----------|
|28   |1971           |2669          |35.4%      |
|=================================================
|Cores|Current avg lat|Shared avg lat|improvement|
|-----|---------------|--------------|-----------|
|14   |767us          |657us         |14.3%      |
|-----|---------------|--------------|-----------|
|20   |1225us         |943us         |23%        |
|-----|---------------|--------------|-----------|
|28   |1816us         |1341us        |26.1%      |
========================================================
|Cores|Current interrupts|Shared interrupts|improvement|
|-----|------------------|-----------------|-----------|
|14   |1.6M/sec          |0.4M/sec         |72%        |
|-----|------------------|-----------------|-----------|
|20   |2.8M/sec          |0.6M/sec         |72.4%      |
|-----|------------------|-----------------|-----------|
|28   |2.9M/sec          |0.8M/sec         |63.4%      |
====================================================================
|Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement|
|-----|------------------------|-----------------------|-----------|
|14   |67ms                    |6ms                    |90.9%      |
|-----|------------------------|-----------------------|-----------|
|20   |5ms                     |6ms                    |-10%       |
|-----|------------------------|-----------------------|-----------|
|28   |8.7ms                   |6ms                    |25.9%      |
|===================================================================

Performance improvement with sixteen disks (sixteen CQs per core) is
comparable.

Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/core/core_priv.h |   4 ++
 drivers/infiniband/core/cq.c        | 137 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/device.c    |   2 +
 include/rdma/ib_verbs.h             |  35 +++++++++
 4 files changed, 178 insertions(+)

Comments

Leon Romanovsky May 18, 2020, 8:30 a.m. UTC | #1
On Wed, May 13, 2020 at 02:52:41PM +0300, Yamin Friedman wrote:
> [...]
> diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
> index cf42acc..fa3151b 100644
> --- a/drivers/infiniband/core/core_priv.h
> +++ b/drivers/infiniband/core/core_priv.h
> @@ -414,4 +414,8 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
>  			 struct vm_area_struct *vma,
>  			 struct rdma_user_mmap_entry *entry);
>
> +void ib_cq_pool_init(struct ib_device *dev);
> +
> +void ib_cq_pool_destroy(struct ib_device *dev);
> +
>  #endif /* _CORE_PRIV_H */
> diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> index 04046eb..5319c14 100644
> --- a/drivers/infiniband/core/cq.c
> +++ b/drivers/infiniband/core/cq.c
> @@ -7,7 +7,11 @@
>  #include <linux/slab.h>
>  #include <rdma/ib_verbs.h>
>
> +#include "core_priv.h"
> +
>  #include <trace/events/rdma_core.h>
> +/* Max size for shared CQ, may require tuning */
> +#define IB_MAX_SHARED_CQ_SZ		4096
>
>  /* # of WCs to poll for with a single call to ib_poll_cq */
>  #define IB_POLL_BATCH			16
> @@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
>  	cq->cq_context = private;
>  	cq->poll_ctx = poll_ctx;
>  	atomic_set(&cq->usecnt, 0);
> +	cq->comp_vector = comp_vector;
>
>  	cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
>  	if (!cq->wc)
> @@ -304,6 +309,8 @@ static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
>  {
>  	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
>  		return;
> +	if (WARN_ON_ONCE(cq->cqe_used != 0))

Let's do WARN_ON_ONCE(cq->cqe_used)

> +		return;
>
>  	switch (cq->poll_ctx) {
>  	case IB_POLL_DIRECT:
> @@ -340,3 +347,133 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
>  		_ib_free_cq_user(cq, udata);
>  }
>  EXPORT_SYMBOL(ib_free_cq_user);
> +
> +void ib_cq_pool_init(struct ib_device *dev)
> +{
> +	int i;
> +
> +	spin_lock_init(&dev->cq_pools_lock);
> +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
> +		INIT_LIST_HEAD(&dev->cq_pools[i]);
> +}
> +
> +void ib_cq_pool_destroy(struct ib_device *dev)
> +{
> +	struct ib_cq *cq, *n;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
> +		list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
> +			_ib_free_cq_user(cq, NULL);
> +	}
> +
> +}
> +
> +static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
> +			enum ib_poll_context poll_ctx)
> +{
> +	LIST_HEAD(tmp_list);
> +	struct ib_cq *cq;
> +	unsigned long flags;
> +	int nr_cqs, ret, i;
> +
> +	/*
> +	 * Allocated at least as many CQEs as requested, and otherwise
> +	 * a reasonable batch size so that we can share CQs between
> +	 * multiple users instead of allocating a larger number of CQs.
> +	 */
> +	nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
> +	nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
> +	for (i = 0; i < nr_cqs; i++) {
> +		cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
> +		if (IS_ERR(cq)) {
> +			ret = PTR_ERR(cq);
> +			goto out_free_cqs;
> +		}
> +		cq->shared = true;
> +		list_add_tail(&cq->pool_entry, &tmp_list);
> +	}
> +
> +	spin_lock_irqsave(&dev->cq_pools_lock, flags);
> +	list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
> +	spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> +
> +	return 0;
> +
> +out_free_cqs:
> +	list_for_each_entry(cq, &tmp_list, pool_entry)
> +		ib_free_cq(cq);
> +	return ret;
> +}
> +
> +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
> +			     int comp_vector_hint,
> +			     enum ib_poll_context poll_ctx)
> +{
> +	static unsigned int default_comp_vector;
> +	int vector, ret, num_comp_vectors;
> +	struct ib_cq *cq, *found = NULL;
> +	unsigned long flags;
> +
> +	if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
> +		return ERR_PTR(-EINVAL);
> +
> +	num_comp_vectors = min_t(int, dev->num_comp_vectors,
> +				 num_online_cpus());
> +	/* Project the affinty to the device completion vector range */
> +	if (comp_vector_hint < 0)
> +		vector = default_comp_vector++ % num_comp_vectors;
> +	else
> +		vector = comp_vector_hint % num_comp_vectors;
> +
> +	/*
> +	 * Find the least used CQ with correct affinity and
> +	 * enough free CQ entries
> +	 */
> +	while (!found) {
> +		spin_lock_irqsave(&dev->cq_pools_lock, flags);
> +		list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
> +				    pool_entry) {
> +			if (vector != cq->comp_vector)

I think this check is worth a comment.
At least for me, it is not clear whether it will work correctly if
comp_vector == 0.

> +				continue;
> +			if (cq->cqe_used + nr_cqe > cq->cqe)
> +				continue;
> +			if (found && cq->cqe_used >= found->cqe_used)
> +				continue;
> +			found = cq;
> +			break;
> +		}
> +
> +		if (found) {
> +			found->cqe_used += nr_cqe;
> +			spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> +
> +			return found;
> +		}
> +		spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> +
> +		/*
> +		 * Didn't find a match or ran out of CQs in the device
> +		 * pool, allocate a new array of CQs.
> +		 */
> +		ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
> +		if (ret)
> +			return ERR_PTR(ret);
> +	}
> +
> +	return found;
> +}
> +EXPORT_SYMBOL(ib_cq_pool_get);
> +
> +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
> +{
> +	unsigned long flags;
> +
> +	if (nr_cqe > cq->cqe_used)
> +		return;

Is it possible?
1. It is racy
2. It is a bug in the ib_cq_pool_put() caller.

> +
> +	spin_lock_irqsave(&cq->device->cq_pools_lock, flags);
> +	cq->cqe_used -= nr_cqe;
> +	spin_unlock_irqrestore(&cq->device->cq_pools_lock, flags);
> +}
> +EXPORT_SYMBOL(ib_cq_pool_put);
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index d9f565a..abd7cd0 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -600,6 +600,7 @@ struct ib_device *_ib_alloc_device(size_t size)
>  	mutex_init(&device->compat_devs_mutex);
>  	init_completion(&device->unreg_completion);
>  	INIT_WORK(&device->unregistration_work, ib_unregister_work);
> +	ib_cq_pool_init(device);

Why did you add this call in _ib_alloc_device() and not
in ib_register_device()?

>
>  	return device;
>  }
> @@ -1455,6 +1456,7 @@ static void __ib_unregister_device(struct ib_device *ib_dev)
>  	device_del(&ib_dev->dev);
>  	ib_device_unregister_rdmacg(ib_dev);
>  	ib_cache_cleanup_one(ib_dev);
> +	ib_cq_pool_destroy(ib_dev);

It is not symmetric to the registration flow.

>
>  	/*
>  	 * Drivers using the new flow may not call ib_dealloc_device except
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index b79737b..0ca6617 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1555,6 +1555,7 @@ enum ib_poll_context {
>  	IB_POLL_SOFTIRQ,	   /* poll from softirq context */
>  	IB_POLL_WORKQUEUE,	   /* poll from workqueue */
>  	IB_POLL_UNBOUND_WORKQUEUE, /* poll from unbound workqueue */
> +	IB_POLL_LAST,
>  };
>
>  struct ib_cq {
> @@ -1564,9 +1565,12 @@ struct ib_cq {
>  	void                  (*event_handler)(struct ib_event *, void *);
>  	void                   *cq_context;
>  	int               	cqe;
> +	int			cqe_used;
>  	atomic_t          	usecnt; /* count number of work queues */
>  	enum ib_poll_context	poll_ctx;
> +	int                     comp_vector;
>  	struct ib_wc		*wc;
> +	struct list_head        pool_entry;
>  	union {
>  		struct irq_poll		iop;
>  		struct work_struct	work;
> @@ -2695,6 +2699,10 @@ struct ib_device {
>  #endif
>
>  	u32                          index;
> +
> +	spinlock_t                   cq_pools_lock;
> +	struct list_head             cq_pools[IB_POLL_LAST - 1];
> +
>  	struct rdma_restrack_root *res;
>
>  	const struct uapi_definition   *driver_def;
> @@ -3957,6 +3965,33 @@ static inline int ib_req_notify_cq(struct ib_cq *cq,
>  	return cq->device->ops.req_notify_cq(cq, flags);
>  }
>
> +/*
> + * ib_cq_pool_get() - Find the least used completion queue that matches
> + *     a given cpu hint (or least used for wild card affinity)
> + *     and fits nr_cqe
> + * @dev: rdma device
> + * @nr_cqe: number of needed cqe entries
> + * @comp_vector_hint: completion vector hint (-1) for the driver to assign
> + *   a comp vector based on internal counter
> + * @poll_ctx: cq polling context
> + *
> + * Finds a cq that satisfies @comp_vector_hint and @nr_cqe requirements and
> + * claim entries in it for us. In case there is no available cq, allocate
> + * a new cq with the requirements and add it to the device pool.
> + * IB_POLL_DIRECT cannot be used for shared cqs so it is not a valid value
> + * for @poll_ctx.
> + */
> +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
> +			     int comp_vector_hint,
> +			     enum ib_poll_context poll_ctx);
> +
> +/**
> + * ib_cq_pool_put - Return a CQ taken from a shared pool.
> + * @cq: The CQ to return.
> + * @nr_cqe: The max number of cqes that the user had requested.
> + */
> +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe);
> +
>  /**
>   * ib_req_ncomp_notif - Request completion notification when there are
>   *   at least the specified number of unreaped completions on the CQ.
> --
> 1.8.3.1
>
Yamin Friedman May 18, 2020, 1:16 p.m. UTC | #2
On 5/18/2020 11:30 AM, Leon Romanovsky wrote:
> On Wed, May 13, 2020 at 02:52:41PM +0300, Yamin Friedman wrote:
>> [...]
>> diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
>> index cf42acc..fa3151b 100644
>> --- a/drivers/infiniband/core/core_priv.h
>> +++ b/drivers/infiniband/core/core_priv.h
>> @@ -414,4 +414,8 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
>>   			 struct vm_area_struct *vma,
>>   			 struct rdma_user_mmap_entry *entry);
>>
>> +void ib_cq_pool_init(struct ib_device *dev);
>> +
>> +void ib_cq_pool_destroy(struct ib_device *dev);
>> +
>>   #endif /* _CORE_PRIV_H */
>> diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
>> index 04046eb..5319c14 100644
>> --- a/drivers/infiniband/core/cq.c
>> +++ b/drivers/infiniband/core/cq.c
>> @@ -7,7 +7,11 @@
>>   #include <linux/slab.h>
>>   #include <rdma/ib_verbs.h>
>>
>> +#include "core_priv.h"
>> +
>>   #include <trace/events/rdma_core.h>
>> +/* Max size for shared CQ, may require tuning */
>> +#define IB_MAX_SHARED_CQ_SZ		4096
>>
>>   /* # of WCs to poll for with a single call to ib_poll_cq */
>>   #define IB_POLL_BATCH			16
>> @@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
>>   	cq->cq_context = private;
>>   	cq->poll_ctx = poll_ctx;
>>   	atomic_set(&cq->usecnt, 0);
>> +	cq->comp_vector = comp_vector;
>>
>>   	cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
>>   	if (!cq->wc)
>> @@ -304,6 +309,8 @@ static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
>>   {
>>   	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
>>   		return;
>> +	if (WARN_ON_ONCE(cq->cqe_used != 0))
> Let's do WARN_ON_ONCE(cq->cqe_used)
>
>> +		return;
>>
>>   	switch (cq->poll_ctx) {
>>   	case IB_POLL_DIRECT:
>> @@ -340,3 +347,133 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
>>   		_ib_free_cq_user(cq, udata);
>>   }
>>   EXPORT_SYMBOL(ib_free_cq_user);
>> +
>> +void ib_cq_pool_init(struct ib_device *dev)
>> +{
>> +	int i;
>> +
>> +	spin_lock_init(&dev->cq_pools_lock);
>> +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
>> +		INIT_LIST_HEAD(&dev->cq_pools[i]);
>> +}
>> +
>> +void ib_cq_pool_destroy(struct ib_device *dev)
>> +{
>> +	struct ib_cq *cq, *n;
>> +	int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
>> +		list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
>> +			_ib_free_cq_user(cq, NULL);
>> +	}
>> +
>> +}
>> +
>> +static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
>> +			enum ib_poll_context poll_ctx)
>> +{
>> +	LIST_HEAD(tmp_list);
>> +	struct ib_cq *cq;
>> +	unsigned long flags;
>> +	int nr_cqs, ret, i;
>> +
>> +	/*
>> +	 * Allocated at least as many CQEs as requested, and otherwise
>> +	 * a reasonable batch size so that we can share CQs between
>> +	 * multiple users instead of allocating a larger number of CQs.
>> +	 */
>> +	nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
>> +	nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
>> +	for (i = 0; i < nr_cqs; i++) {
>> +		cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
>> +		if (IS_ERR(cq)) {
>> +			ret = PTR_ERR(cq);
>> +			goto out_free_cqs;
>> +		}
>> +		cq->shared = true;
>> +		list_add_tail(&cq->pool_entry, &tmp_list);
>> +	}
>> +
>> +	spin_lock_irqsave(&dev->cq_pools_lock, flags);
>> +	list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
>> +	spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
>> +
>> +	return 0;
>> +
>> +out_free_cqs:
>> +	list_for_each_entry(cq, &tmp_list, pool_entry)
>> +		ib_free_cq(cq);
>> +	return ret;
>> +}
>> +
>> +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
>> +			     int comp_vector_hint,
>> +			     enum ib_poll_context poll_ctx)
>> +{
>> +	static unsigned int default_comp_vector;
>> +	int vector, ret, num_comp_vectors;
>> +	struct ib_cq *cq, *found = NULL;
>> +	unsigned long flags;
>> +
>> +	if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	num_comp_vectors = min_t(int, dev->num_comp_vectors,
>> +				 num_online_cpus());
>> +	/* Project the affinty to the device completion vector range */
>> +	if (comp_vector_hint < 0)
>> +		vector = default_comp_vector++ % num_comp_vectors;
>> +	else
>> +		vector = comp_vector_hint % num_comp_vectors;
>> +
>> +	/*
>> +	 * Find the least used CQ with correct affinity and
>> +	 * enough free CQ entries
>> +	 */
>> +	while (!found) {
>> +		spin_lock_irqsave(&dev->cq_pools_lock, flags);
>> +		list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
>> +				    pool_entry) {
>> +			if (vector != cq->comp_vector)
> I think that this check worth to have a comment.
> At least for me, it is not clear if it will work correctly if
> comp_vector == 0.
>
>> +				continue;
>> +			if (cq->cqe_used + nr_cqe > cq->cqe)
>> +				continue;
>> +			if (found && cq->cqe_used >= found->cqe_used)
>> +				continue;
>> +			found = cq;
>> +			break;
>> +		}
>> +
>> +		if (found) {
>> +			found->cqe_used += nr_cqe;
>> +			spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
>> +
>> +			return found;
>> +		}
>> +		spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
>> +
>> +		/*
>> +		 * Didn't find a match or ran out of CQs in the device
>> +		 * pool, allocate a new array of CQs.
>> +		 */
>> +		ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
>> +		if (ret)
>> +			return ERR_PTR(ret);
>> +	}
>> +
>> +	return found;
>> +}
>> +EXPORT_SYMBOL(ib_cq_pool_get);
>> +
>> +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
>> +{
>> +	unsigned long flags;
>> +
>> +	if (nr_cqe > cq->cqe_used)
>> +		return;
> Is it possible?
> 1. It is racy
> 2. It is a bug in the ib_cq_pool_put() caller.

It is possible; the pool doesn't track the number of CQEs used per user.

I think that to make it really robust I would have to never reduce the
CQEs used, track the number of active users, and have some form of garbage
collection for used-up CQs, but that seems like a lot for something that
should not occur during proper use.

Would it be better to just have a WARN for this case?
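
For illustration, a rough sketch of what that could look like if the check
simply becomes a WARN (assumed shape, not the posted code):

void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
{
	unsigned long flags;

	/* Returning more CQEs than were claimed is a bug in the caller */
	if (WARN_ON_ONCE(nr_cqe > cq->cqe_used))
		return;

	spin_lock_irqsave(&cq->device->cq_pools_lock, flags);
	cq->cqe_used -= nr_cqe;
	spin_unlock_irqrestore(&cq->device->cq_pools_lock, flags);
}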

>
>> +
>> +	spin_lock_irqsave(&cq->device->cq_pools_lock, flags);
>> +	cq->cqe_used -= nr_cqe;
>> +	spin_unlock_irqrestore(&cq->device->cq_pools_lock, flags);
>> +}
>> +EXPORT_SYMBOL(ib_cq_pool_put);
>> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
>> index d9f565a..abd7cd0 100644
>> --- a/drivers/infiniband/core/device.c
>> +++ b/drivers/infiniband/core/device.c
>> @@ -600,6 +600,7 @@ struct ib_device *_ib_alloc_device(size_t size)
>>   	mutex_init(&device->compat_devs_mutex);
>>   	init_completion(&device->unreg_completion);
>>   	INIT_WORK(&device->unregistration_work, ib_unregister_work);
>> +	ib_cq_pool_init(device);
> Why did you add this function in _ib_alloc_device() and not
> to the ib_register_device()?
I will move it to ib_register_device().
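
A sketch of the symmetric placement being discussed, with the surrounding
registration code elided (assumed shape, not the posted patch):

int ib_register_device(struct ib_device *device, const char *name)
{
	/* ... existing registration steps ... */
	ib_cq_pool_init(device);
	/* ... */
}

static void __ib_unregister_device(struct ib_device *ib_dev)
{
	/* ... */
	ib_cq_pool_destroy(ib_dev);
	/* ... existing unregistration steps ... */
}
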
>
>>   	return device;
>>   }
>> @@ -1455,6 +1456,7 @@ static void __ib_unregister_device(struct ib_device *ib_dev)
>>   	device_del(&ib_dev->dev);
>>   	ib_device_unregister_rdmacg(ib_dev);
>>   	ib_cache_cleanup_one(ib_dev);
>> +	ib_cq_pool_destroy(ib_dev);
> It is not symmetric to the registration flow.
>
>>   	/*
>>   	 * Drivers using the new flow may not call ib_dealloc_device except
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index b79737b..0ca6617 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1555,6 +1555,7 @@ enum ib_poll_context {
>>   	IB_POLL_SOFTIRQ,	   /* poll from softirq context */
>>   	IB_POLL_WORKQUEUE,	   /* poll from workqueue */
>>   	IB_POLL_UNBOUND_WORKQUEUE, /* poll from unbound workqueue */
>> +	IB_POLL_LAST,
>>   };
>>
>>   struct ib_cq {
>> @@ -1564,9 +1565,12 @@ struct ib_cq {
>>   	void                  (*event_handler)(struct ib_event *, void *);
>>   	void                   *cq_context;
>>   	int               	cqe;
>> +	int			cqe_used;
>>   	atomic_t          	usecnt; /* count number of work queues */
>>   	enum ib_poll_context	poll_ctx;
>> +	int                     comp_vector;
>>   	struct ib_wc		*wc;
>> +	struct list_head        pool_entry;
>>   	union {
>>   		struct irq_poll		iop;
>>   		struct work_struct	work;
>> @@ -2695,6 +2699,10 @@ struct ib_device {
>>   #endif
>>
>>   	u32                          index;
>> +
>> +	spinlock_t                   cq_pools_lock;
>> +	struct list_head             cq_pools[IB_POLL_LAST - 1];
>> +
>>   	struct rdma_restrack_root *res;
>>
>>   	const struct uapi_definition   *driver_def;
>> @@ -3957,6 +3965,33 @@ static inline int ib_req_notify_cq(struct ib_cq *cq,
>>   	return cq->device->ops.req_notify_cq(cq, flags);
>>   }
>>
>> +/*
>> + * ib_cq_pool_get() - Find the least used completion queue that matches
>> + *     a given cpu hint (or least used for wild card affinity)
>> + *     and fits nr_cqe
>> + * @dev: rdma device
>> + * @nr_cqe: number of needed cqe entries
>> + * @comp_vector_hint: completion vector hint (-1) for the driver to assign
>> + *   a comp vector based on internal counter
>> + * @poll_ctx: cq polling context
>> + *
>> + * Finds a cq that satisfies @comp_vector_hint and @nr_cqe requirements and
>> + * claim entries in it for us. In case there is no available cq, allocate
>> + * a new cq with the requirements and add it to the device pool.
>> + * IB_POLL_DIRECT cannot be used for shared cqs so it is not a valid value
>> + * for @poll_ctx.
>> + */
>> +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
>> +			     int comp_vector_hint,
>> +			     enum ib_poll_context poll_ctx);
>> +
>> +/**
>> + * ib_cq_pool_put - Return a CQ taken from a shared pool.
>> + * @cq: The CQ to return.
>> + * @nr_cqe: The max number of cqes that the user had requested.
>> + */
>> +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe);
>> +
>>   /**
>>    * ib_req_ncomp_notif - Request completion notification when there are
>>    *   at least the specified number of unreaped completions on the CQ.
>> --
>> 1.8.3.1
>>
Leon Romanovsky May 18, 2020, 5:48 p.m. UTC | #3
On Mon, May 18, 2020 at 04:16:05PM +0300, Yamin Friedman wrote:
>
> On 5/18/2020 11:30 AM, Leon Romanovsky wrote:
> > On Wed, May 13, 2020 at 02:52:41PM +0300, Yamin Friedman wrote:
> > > [...]
> > > diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
> > > index cf42acc..fa3151b 100644
> > > --- a/drivers/infiniband/core/core_priv.h
> > > +++ b/drivers/infiniband/core/core_priv.h
> > > @@ -414,4 +414,8 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
> > >   			 struct vm_area_struct *vma,
> > >   			 struct rdma_user_mmap_entry *entry);
> > >
> > > +void ib_cq_pool_init(struct ib_device *dev);
> > > +
> > > +void ib_cq_pool_destroy(struct ib_device *dev);
> > > +
> > >   #endif /* _CORE_PRIV_H */
> > > diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> > > index 04046eb..5319c14 100644
> > > --- a/drivers/infiniband/core/cq.c
> > > +++ b/drivers/infiniband/core/cq.c
> > > @@ -7,7 +7,11 @@
> > >   #include <linux/slab.h>
> > >   #include <rdma/ib_verbs.h>
> > >
> > > +#include "core_priv.h"
> > > +
> > >   #include <trace/events/rdma_core.h>
> > > +/* Max size for shared CQ, may require tuning */
> > > +#define IB_MAX_SHARED_CQ_SZ		4096
> > >
> > >   /* # of WCs to poll for with a single call to ib_poll_cq */
> > >   #define IB_POLL_BATCH			16
> > > @@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
> > >   	cq->cq_context = private;
> > >   	cq->poll_ctx = poll_ctx;
> > >   	atomic_set(&cq->usecnt, 0);
> > > +	cq->comp_vector = comp_vector;
> > >
> > >   	cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
> > >   	if (!cq->wc)
> > > @@ -304,6 +309,8 @@ static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > >   {
> > >   	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> > >   		return;
> > > +	if (WARN_ON_ONCE(cq->cqe_used != 0))
> > Let's do WARN_ON_ONCE(cq->cqe_used)
> >
> > > +		return;
> > >
> > >   	switch (cq->poll_ctx) {
> > >   	case IB_POLL_DIRECT:
> > > @@ -340,3 +347,133 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > >   		_ib_free_cq_user(cq, udata);
> > >   }
> > >   EXPORT_SYMBOL(ib_free_cq_user);
> > > +
> > > +void ib_cq_pool_init(struct ib_device *dev)
> > > +{
> > > +	int i;
> > > +
> > > +	spin_lock_init(&dev->cq_pools_lock);
> > > +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
> > > +		INIT_LIST_HEAD(&dev->cq_pools[i]);
> > > +}
> > > +
> > > +void ib_cq_pool_destroy(struct ib_device *dev)
> > > +{
> > > +	struct ib_cq *cq, *n;
> > > +	int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
> > > +		list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
> > > +			_ib_free_cq_user(cq, NULL);
> > > +	}
> > > +
> > > +}
> > > +
> > > +static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
> > > +			enum ib_poll_context poll_ctx)
> > > +{
> > > +	LIST_HEAD(tmp_list);
> > > +	struct ib_cq *cq;
> > > +	unsigned long flags;
> > > +	int nr_cqs, ret, i;
> > > +
> > > +	/*
> > > +	 * Allocated at least as many CQEs as requested, and otherwise
> > > +	 * a reasonable batch size so that we can share CQs between
> > > +	 * multiple users instead of allocating a larger number of CQs.
> > > +	 */
> > > +	nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
> > > +	nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
> > > +	for (i = 0; i < nr_cqs; i++) {
> > > +		cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
> > > +		if (IS_ERR(cq)) {
> > > +			ret = PTR_ERR(cq);
> > > +			goto out_free_cqs;
> > > +		}
> > > +		cq->shared = true;
> > > +		list_add_tail(&cq->pool_entry, &tmp_list);
> > > +	}
> > > +
> > > +	spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > +	list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
> > > +	spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > +
> > > +	return 0;
> > > +
> > > +out_free_cqs:
> > > +	list_for_each_entry(cq, &tmp_list, pool_entry)
> > > +		ib_free_cq(cq);
> > > +	return ret;
> > > +}
> > > +
> > > +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
> > > +			     int comp_vector_hint,
> > > +			     enum ib_poll_context poll_ctx)
> > > +{
> > > +	static unsigned int default_comp_vector;
> > > +	int vector, ret, num_comp_vectors;
> > > +	struct ib_cq *cq, *found = NULL;
> > > +	unsigned long flags;
> > > +
> > > +	if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
> > > +		return ERR_PTR(-EINVAL);
> > > +
> > > +	num_comp_vectors = min_t(int, dev->num_comp_vectors,
> > > +				 num_online_cpus());
> > > +	/* Project the affinty to the device completion vector range */
> > > +	if (comp_vector_hint < 0)
> > > +		vector = default_comp_vector++ % num_comp_vectors;
> > > +	else
> > > +		vector = comp_vector_hint % num_comp_vectors;
> > > +
> > > +	/*
> > > +	 * Find the least used CQ with correct affinity and
> > > +	 * enough free CQ entries
> > > +	 */
> > > +	while (!found) {
> > > +		spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > +		list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
> > > +				    pool_entry) {
> > > +			if (vector != cq->comp_vector)
> > I think that this check worth to have a comment.
> > At least for me, it is not clear if it will work correctly if
> > comp_vector == 0.
> >
> > > +				continue;
> > > +			if (cq->cqe_used + nr_cqe > cq->cqe)
> > > +				continue;
> > > +			if (found && cq->cqe_used >= found->cqe_used)
> > > +				continue;
> > > +			found = cq;
> > > +			break;
> > > +		}
> > > +
> > > +		if (found) {
> > > +			found->cqe_used += nr_cqe;
> > > +			spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > +
> > > +			return found;
> > > +		}
> > > +		spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > +
> > > +		/*
> > > +		 * Didn't find a match or ran out of CQs in the device
> > > +		 * pool, allocate a new array of CQs.
> > > +		 */
> > > +		ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
> > > +		if (ret)
> > > +			return ERR_PTR(ret);
> > > +	}
> > > +
> > > +	return found;
> > > +}
> > > +EXPORT_SYMBOL(ib_cq_pool_get);
> > > +
> > > +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
> > > +{
> > > +	unsigned long flags;
> > > +
> > > +	if (nr_cqe > cq->cqe_used)
> > > +		return;
> > Is it possible?
> > 1. It is racy
> > 2. It is a bug in the ib_cq_pool_put() caller.
>
> It is possible, the pool doesn't save the amount of cqes used per user.

So, #2 from the list above.

>
> I think to make it really secure I would have to never reduce the cqes used,
> save the number of active users, and have some form of garbage collection
> for used up CQs but that seems to me a lot for something that should not
> occur during proper use.
>
> Would it be better to just have a WARN for this case?

I think so.

Thanks
Devesh Sharma May 19, 2020, 4:27 a.m. UTC | #4
On Mon, May 18, 2020 at 11:18 PM Leon Romanovsky <leonro@mellanox.com> wrote:
>
> On Mon, May 18, 2020 at 04:16:05PM +0300, Yamin Friedman wrote:
> >
> > On 5/18/2020 11:30 AM, Leon Romanovsky wrote:
> > > On Wed, May 13, 2020 at 02:52:41PM +0300, Yamin Friedman wrote:
> > > > [...]
> > > > diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
> > > > index cf42acc..fa3151b 100644
> > > > --- a/drivers/infiniband/core/core_priv.h
> > > > +++ b/drivers/infiniband/core/core_priv.h
> > > > @@ -414,4 +414,8 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
> > > >                            struct vm_area_struct *vma,
> > > >                            struct rdma_user_mmap_entry *entry);
> > > >
> > > > +void ib_cq_pool_init(struct ib_device *dev);
> > > > +
> > > > +void ib_cq_pool_destroy(struct ib_device *dev);
> > > > +
> > > >   #endif /* _CORE_PRIV_H */
> > > > diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> > > > index 04046eb..5319c14 100644
> > > > --- a/drivers/infiniband/core/cq.c
> > > > +++ b/drivers/infiniband/core/cq.c
> > > > @@ -7,7 +7,11 @@
> > > >   #include <linux/slab.h>
> > > >   #include <rdma/ib_verbs.h>
> > > >
> > > > +#include "core_priv.h"
> > > > +
> > > >   #include <trace/events/rdma_core.h>
> > > > +/* Max size for shared CQ, may require tuning */
> > > > +#define IB_MAX_SHARED_CQ_SZ              4096
> > > >
> > > >   /* # of WCs to poll for with a single call to ib_poll_cq */
> > > >   #define IB_POLL_BATCH                   16
> > > > @@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
> > > >           cq->cq_context = private;
> > > >           cq->poll_ctx = poll_ctx;
> > > >           atomic_set(&cq->usecnt, 0);
> > > > + cq->comp_vector = comp_vector;
> > > >
> > > >           cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
> > > >           if (!cq->wc)
> > > > @@ -304,6 +309,8 @@ static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > > >   {
> > > >           if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> > > >                   return;
> > > > + if (WARN_ON_ONCE(cq->cqe_used != 0))
> > > Let's do WARN_ON_ONCE(cq->cqe_used)
> > >
> > > > +         return;
> > > >
> > > >           switch (cq->poll_ctx) {
> > > >           case IB_POLL_DIRECT:
> > > > @@ -340,3 +347,133 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > > >                   _ib_free_cq_user(cq, udata);
> > > >   }
> > > >   EXPORT_SYMBOL(ib_free_cq_user);
> > > > +
> > > > +void ib_cq_pool_init(struct ib_device *dev)
> > > > +{
> > > > + int i;
> > > > +
> > > > + spin_lock_init(&dev->cq_pools_lock);
> > > > + for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
> > > > +         INIT_LIST_HEAD(&dev->cq_pools[i]);
> > > > +}
> > > > +
> > > > +void ib_cq_pool_destroy(struct ib_device *dev)
> > > > +{
> > > > + struct ib_cq *cq, *n;
> > > > + int i;
> > > > +
> > > > + for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
> > > > +         list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
> > > > +                 _ib_free_cq_user(cq, NULL);
> > > > + }
> > > > +
> > > > +}
> > > > +
> > > > +static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
> > > > +                 enum ib_poll_context poll_ctx)
> > > > +{
> > > > + LIST_HEAD(tmp_list);
> > > > + struct ib_cq *cq;
> > > > + unsigned long flags;
> > > > + int nr_cqs, ret, i;
> > > > +
> > > > + /*
> > > > +  * Allocated at least as many CQEs as requested, and otherwise
> > > > +  * a reasonable batch size so that we can share CQs between
> > > > +  * multiple users instead of allocating a larger number of CQs.
> > > > +  */
> > > > + nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
> > > > + nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
> > > > + for (i = 0; i < nr_cqs; i++) {
> > > > +         cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
> > > > +         if (IS_ERR(cq)) {
> > > > +                 ret = PTR_ERR(cq);
> > > > +                 goto out_free_cqs;
> > > > +         }
> > > > +         cq->shared = true;
> > > > +         list_add_tail(&cq->pool_entry, &tmp_list);
> > > > + }
> > > > +
> > > > + spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > > + list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
> > > > + spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > +
> > > > + return 0;
> > > > +
> > > > +out_free_cqs:
> > > > + list_for_each_entry(cq, &tmp_list, pool_entry)
> > > > +         ib_free_cq(cq);
> > > > + return ret;
> > > > +}
> > > > +
> > > > +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
> > > > +                      int comp_vector_hint,
> > > > +                      enum ib_poll_context poll_ctx)
> > > > +{
> > > > + static unsigned int default_comp_vector;
> > > > + int vector, ret, num_comp_vectors;
> > > > + struct ib_cq *cq, *found = NULL;
> > > > + unsigned long flags;
> > > > +
> > > > + if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
> > > > +         return ERR_PTR(-EINVAL);
> > > > +
> > > > + num_comp_vectors = min_t(int, dev->num_comp_vectors,
> > > > +                          num_online_cpus());
> > > > + /* Project the affinty to the device completion vector range */
> > > > + if (comp_vector_hint < 0)
> > > > +         vector = default_comp_vector++ % num_comp_vectors;
> > > > + else
> > > > +         vector = comp_vector_hint % num_comp_vectors;
> > > > +
> > > > + /*
> > > > +  * Find the least used CQ with correct affinity and
> > > > +  * enough free CQ entries
> > > > +  */
> > > > + while (!found) {
> > > > +         spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > > +         list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
> > > > +                             pool_entry) {
> > > > +                 if (vector != cq->comp_vector)
> > > I think that this check worth to have a comment.
> > > At least for me, it is not clear if it will work correctly if
> > > comp_vector == 0.
> > >
> > > > +                         continue;
> > > > +                 if (cq->cqe_used + nr_cqe > cq->cqe)
> > > > +                         continue;
> > > > +                 if (found && cq->cqe_used >= found->cqe_used)
> > > > +                         continue;
> > > > +                 found = cq;
> > > > +                 break;
> > > > +         }
> > > > +
> > > > +         if (found) {
> > > > +                 found->cqe_used += nr_cqe;
> > > > +                 spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > +
> > > > +                 return found;
> > > > +         }
> > > > +         spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > +
> > > > +         /*
> > > > +          * Didn't find a match or ran out of CQs in the device
> > > > +          * pool, allocate a new array of CQs.
> > > > +          */
> > > > +         ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
> > > > +         if (ret)
> > > > +                 return ERR_PTR(ret);
> > > > + }
> > > > +
> > > > + return found;
> > > > +}
> > > > +EXPORT_SYMBOL(ib_cq_pool_get);
> > > > +
> > > > +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
> > > > +{
> > > > + unsigned long flags;
> > > > +
> > > > + if (nr_cqe > cq->cqe_used)
> > > > +         return;
> > > Is it possible?
> > > 1. It is racy
> > > 2. It is a bug in the ib_cq_pool_put() caller.
> >
> > It is possible, the pool doesn't save the amount of cqes used per user.
>
> So, #2 from the list above.
>
> >
> > I think to make it really secure I would have to never reduce the cqes used,
> > save the number of active users, and have some form of garbage collection
> > for used up CQs but that seems to me a lot for something that should not
> > occur during proper use.
> >
> > Would it be better to just have a WARN for this case?
>
> I think so.
>
It would be better to fail the pool creation with a WARN, because there
may not be any benefit to having pooled CQs if comp-vectors == 0 for
any given provider.
What strategy is used to pull CQs from the pool? Is it FCFS? I have not
checked the complete patch, but is it possible to balance the load
across all the CQs present in the pool?
> Thanks
Leon Romanovsky May 19, 2020, 4:33 a.m. UTC | #5
On Tue, May 19, 2020 at 09:57:58AM +0530, Devesh Sharma wrote:
> On Mon, May 18, 2020 at 11:18 PM Leon Romanovsky <leonro@mellanox.com> wrote:
> >
> > On Mon, May 18, 2020 at 04:16:05PM +0300, Yamin Friedman wrote:
> > >
> > > On 5/18/2020 11:30 AM, Leon Romanovsky wrote:
> > > > On Wed, May 13, 2020 at 02:52:41PM +0300, Yamin Friedman wrote:
> > > > > [...]
> > > > > diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
> > > > > index cf42acc..fa3151b 100644
> > > > > --- a/drivers/infiniband/core/core_priv.h
> > > > > +++ b/drivers/infiniband/core/core_priv.h
> > > > > @@ -414,4 +414,8 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
> > > > >                            struct vm_area_struct *vma,
> > > > >                            struct rdma_user_mmap_entry *entry);
> > > > >
> > > > > +void ib_cq_pool_init(struct ib_device *dev);
> > > > > +
> > > > > +void ib_cq_pool_destroy(struct ib_device *dev);
> > > > > +
> > > > >   #endif /* _CORE_PRIV_H */
> > > > > diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> > > > > index 04046eb..5319c14 100644
> > > > > --- a/drivers/infiniband/core/cq.c
> > > > > +++ b/drivers/infiniband/core/cq.c
> > > > > @@ -7,7 +7,11 @@
> > > > >   #include <linux/slab.h>
> > > > >   #include <rdma/ib_verbs.h>
> > > > >
> > > > > +#include "core_priv.h"
> > > > > +
> > > > >   #include <trace/events/rdma_core.h>
> > > > > +/* Max size for shared CQ, may require tuning */
> > > > > +#define IB_MAX_SHARED_CQ_SZ              4096
> > > > >
> > > > >   /* # of WCs to poll for with a single call to ib_poll_cq */
> > > > >   #define IB_POLL_BATCH                   16
> > > > > @@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
> > > > >           cq->cq_context = private;
> > > > >           cq->poll_ctx = poll_ctx;
> > > > >           atomic_set(&cq->usecnt, 0);
> > > > > + cq->comp_vector = comp_vector;
> > > > >
> > > > >           cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
> > > > >           if (!cq->wc)
> > > > > @@ -304,6 +309,8 @@ static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > > > >   {
> > > > >           if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> > > > >                   return;
> > > > > + if (WARN_ON_ONCE(cq->cqe_used != 0))
> > > > Let's do WARN_ON_ONCE(cq->cqe_used)
> > > >
> > > > > +         return;
> > > > >
> > > > >           switch (cq->poll_ctx) {
> > > > >           case IB_POLL_DIRECT:
> > > > > @@ -340,3 +347,133 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > > > >                   _ib_free_cq_user(cq, udata);
> > > > >   }
> > > > >   EXPORT_SYMBOL(ib_free_cq_user);
> > > > > +
> > > > > +void ib_cq_pool_init(struct ib_device *dev)
> > > > > +{
> > > > > + int i;
> > > > > +
> > > > > + spin_lock_init(&dev->cq_pools_lock);
> > > > > + for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
> > > > > +         INIT_LIST_HEAD(&dev->cq_pools[i]);
> > > > > +}
> > > > > +
> > > > > +void ib_cq_pool_destroy(struct ib_device *dev)
> > > > > +{
> > > > > + struct ib_cq *cq, *n;
> > > > > + int i;
> > > > > +
> > > > > + for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
> > > > > +         list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
> > > > > +                 _ib_free_cq_user(cq, NULL);
> > > > > + }
> > > > > +
> > > > > +}
> > > > > +
> > > > > +static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
> > > > > +                 enum ib_poll_context poll_ctx)
> > > > > +{
> > > > > + LIST_HEAD(tmp_list);
> > > > > + struct ib_cq *cq;
> > > > > + unsigned long flags;
> > > > > + int nr_cqs, ret, i;
> > > > > +
> > > > > + /*
> > > > > +  * Allocated at least as many CQEs as requested, and otherwise
> > > > > +  * a reasonable batch size so that we can share CQs between
> > > > > +  * multiple users instead of allocating a larger number of CQs.
> > > > > +  */
> > > > > + nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
> > > > > + nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
> > > > > + for (i = 0; i < nr_cqs; i++) {
> > > > > +         cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
> > > > > +         if (IS_ERR(cq)) {
> > > > > +                 ret = PTR_ERR(cq);
> > > > > +                 goto out_free_cqs;
> > > > > +         }
> > > > > +         cq->shared = true;
> > > > > +         list_add_tail(&cq->pool_entry, &tmp_list);
> > > > > + }
> > > > > +
> > > > > + spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > > > + list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
> > > > > + spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > > +
> > > > > + return 0;
> > > > > +
> > > > > +out_free_cqs:
> > > > > + list_for_each_entry(cq, &tmp_list, pool_entry)
> > > > > +         ib_free_cq(cq);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
> > > > > +                      int comp_vector_hint,
> > > > > +                      enum ib_poll_context poll_ctx)
> > > > > +{
> > > > > + static unsigned int default_comp_vector;
> > > > > + int vector, ret, num_comp_vectors;
> > > > > + struct ib_cq *cq, *found = NULL;
> > > > > + unsigned long flags;
> > > > > +
> > > > > + if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
> > > > > +         return ERR_PTR(-EINVAL);
> > > > > +
> > > > > + num_comp_vectors = min_t(int, dev->num_comp_vectors,
> > > > > +                          num_online_cpus());
> > > > > + /* Project the affinty to the device completion vector range */
> > > > > + if (comp_vector_hint < 0)
> > > > > +         vector = default_comp_vector++ % num_comp_vectors;
> > > > > + else
> > > > > +         vector = comp_vector_hint % num_comp_vectors;
> > > > > +
> > > > > + /*
> > > > > +  * Find the least used CQ with correct affinity and
> > > > > +  * enough free CQ entries
> > > > > +  */
> > > > > + while (!found) {
> > > > > +         spin_lock_irqsave(&dev->cq_pools_lock, flags);
> > > > > +         list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
> > > > > +                             pool_entry) {
> > > > > +                 if (vector != cq->comp_vector)
> > > > I think that this check worth to have a comment.
> > > > At least for me, it is not clear if it will work correctly if
> > > > comp_vector == 0.
> > > >
> > > > > +                         continue;
> > > > > +                 if (cq->cqe_used + nr_cqe > cq->cqe)
> > > > > +                         continue;
> > > > > +                 if (found && cq->cqe_used >= found->cqe_used)
> > > > > +                         continue;
> > > > > +                 found = cq;
> > > > > +                 break;
> > > > > +         }
> > > > > +
> > > > > +         if (found) {
> > > > > +                 found->cqe_used += nr_cqe;
> > > > > +                 spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > > +
> > > > > +                 return found;
> > > > > +         }
> > > > > +         spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
> > > > > +
> > > > > +         /*
> > > > > +          * Didn't find a match or ran out of CQs in the device
> > > > > +          * pool, allocate a new array of CQs.
> > > > > +          */
> > > > > +         ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
> > > > > +         if (ret)
> > > > > +                 return ERR_PTR(ret);
> > > > > + }
> > > > > +
> > > > > + return found;
> > > > > +}
> > > > > +EXPORT_SYMBOL(ib_cq_pool_get);
> > > > > +
> > > > > +void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
> > > > > +{
> > > > > + unsigned long flags;
> > > > > +
> > > > > + if (nr_cqe > cq->cqe_used)
> > > > > +         return;
> > > > Is it possible?
> > > > 1. It is racy
> > > > 2. It is a bug in the ib_cq_pool_put() caller.
> > >
> > > It is possible, the pool doesn't save the amount of cqes used per user.
> >
> > So, #2 from the list above.
> >
> > >
> > > I think to make it really secure I would have to never reduce the cqes used,
> > > save the number of active users, and have some form of garbage collection
> > > for used up CQs but that seems to me a lot for something that should not
> > > occur during proper use.
> > >
> > > Would it be better to just have a WARN for this case?
> >
> > I think so.
> >
> It would be better to fail the pool creation with a WARN, because there
> may not be any benefit to having pooled CQs if comp-vectors = 0 for
> any given provider.
> What strategy is used to pull CQs from the pool? Is it FCFS? I have not
> checked the complete patch, but is it possible to balance the load
> across all the CQs present in the pool?

Yes, this is what Yamin is doing; he chooses the least occupied CQ from the pool.

Thanks

> > Thanks
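
To make the intended usage concrete, here is a hedged sketch of how a ULP
might drive the pool API described in this series; ulp_create_queue() and
ulp_destroy_queue() are hypothetical helpers, and only ib_cq_pool_get() and
ib_cq_pool_put() come from the patch itself. The get call claims nr_cqe
entries on the least-used shared CQ for the given poll context, projecting
the CPU hint onto the device's completion vectors, and the put call must
return exactly the same number of entries:

#include <rdma/ib_verbs.h>

static int ulp_create_queue(struct ib_device *dev, int cpu,
			    unsigned int queue_depth, struct ib_cq **cq_out)
{
	struct ib_cq *cq;

	/* cpu acts as a completion vector hint; passing -1 lets the core
	 * spread queues across vectors using its internal counter.
	 */
	cq = ib_cq_pool_get(dev, queue_depth, cpu, IB_POLL_SOFTIRQ);
	if (IS_ERR(cq))
		return PTR_ERR(cq);

	*cq_out = cq;
	return 0;
}

static void ulp_destroy_queue(struct ib_cq *cq, unsigned int queue_depth)
{
	/* Return exactly the number of CQEs claimed by ib_cq_pool_get() */
	ib_cq_pool_put(cq, queue_depth);
}
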
diff mbox series

Patch

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index cf42acc..fa3151b 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -414,4 +414,8 @@  void rdma_umap_priv_init(struct rdma_umap_priv *priv,
 			 struct vm_area_struct *vma,
 			 struct rdma_user_mmap_entry *entry);
 
+void ib_cq_pool_init(struct ib_device *dev);
+
+void ib_cq_pool_destroy(struct ib_device *dev);
+
 #endif /* _CORE_PRIV_H */
diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index 04046eb..5319c14 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -7,7 +7,11 @@ 
 #include <linux/slab.h>
 #include <rdma/ib_verbs.h>
 
+#include "core_priv.h"
+
 #include <trace/events/rdma_core.h>
+/* Max size for shared CQ, may require tuning */
+#define IB_MAX_SHARED_CQ_SZ		4096
 
 /* # of WCs to poll for with a single call to ib_poll_cq */
 #define IB_POLL_BATCH			16
@@ -218,6 +222,7 @@  struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
 	cq->cq_context = private;
 	cq->poll_ctx = poll_ctx;
 	atomic_set(&cq->usecnt, 0);
+	cq->comp_vector = comp_vector;
 
 	cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
 	if (!cq->wc)
@@ -304,6 +309,8 @@  static void _ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
 {
 	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
 		return;
+	if (WARN_ON_ONCE(cq->cqe_used != 0))
+		return;
 
 	switch (cq->poll_ctx) {
 	case IB_POLL_DIRECT:
@@ -340,3 +347,133 @@  void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
 		_ib_free_cq_user(cq, udata);
 }
 EXPORT_SYMBOL(ib_free_cq_user);
+
+void ib_cq_pool_init(struct ib_device *dev)
+{
+	int i;
+
+	spin_lock_init(&dev->cq_pools_lock);
+	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
+		INIT_LIST_HEAD(&dev->cq_pools[i]);
+}
+
+void ib_cq_pool_destroy(struct ib_device *dev)
+{
+	struct ib_cq *cq, *n;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
+		list_for_each_entry_safe(cq, n, &dev->cq_pools[i], pool_entry)
+			_ib_free_cq_user(cq, NULL);
+	}
+
+}
+
+static int ib_alloc_cqs(struct ib_device *dev, int nr_cqes,
+			enum ib_poll_context poll_ctx)
+{
+	LIST_HEAD(tmp_list);
+	struct ib_cq *cq;
+	unsigned long flags;
+	int nr_cqs, ret, i;
+
+	/*
+	 * Allocate at least as many CQEs as requested, and otherwise
+	 * a reasonable batch size so that we can share CQs between
+	 * multiple users instead of allocating a larger number of CQs.
+	 */
+	nr_cqes = min(dev->attrs.max_cqe, max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
+	nr_cqs = min_t(int, dev->num_comp_vectors, num_online_cpus());
+	for (i = 0; i < nr_cqs; i++) {
+		cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
+		if (IS_ERR(cq)) {
+			ret = PTR_ERR(cq);
+			goto out_free_cqs;
+		}
+		cq->shared = true;
+		list_add_tail(&cq->pool_entry, &tmp_list);
+	}
+
+	spin_lock_irqsave(&dev->cq_pools_lock, flags);
+	list_splice(&tmp_list, &dev->cq_pools[poll_ctx - 1]);
+	spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
+
+	return 0;
+
+out_free_cqs:
+	list_for_each_entry(cq, &tmp_list, pool_entry)
+		ib_free_cq(cq);
+	return ret;
+}
+
+struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
+			     int comp_vector_hint,
+			     enum ib_poll_context poll_ctx)
+{
+	static unsigned int default_comp_vector;
+	int vector, ret, num_comp_vectors;
+	struct ib_cq *cq, *found = NULL;
+	unsigned long flags;
+
+	if (poll_ctx > ARRAY_SIZE(dev->cq_pools) || poll_ctx == IB_POLL_DIRECT)
+		return ERR_PTR(-EINVAL);
+
+	num_comp_vectors = min_t(int, dev->num_comp_vectors,
+				 num_online_cpus());
+	/* Project the affinity to the device completion vector range */
+	if (comp_vector_hint < 0)
+		vector = default_comp_vector++ % num_comp_vectors;
+	else
+		vector = comp_vector_hint % num_comp_vectors;
+
+	/*
+	 * Find the least used CQ with correct affinity and
+	 * enough free CQ entries
+	 */
+	while (!found) {
+		spin_lock_irqsave(&dev->cq_pools_lock, flags);
+		list_for_each_entry(cq, &dev->cq_pools[poll_ctx - 1],
+				    pool_entry) {
+			if (vector != cq->comp_vector)
+				continue;
+			if (cq->cqe_used + nr_cqe > cq->cqe)
+				continue;
+			if (found && cq->cqe_used >= found->cqe_used)
+				continue;
+			found = cq;
+			break;
+		}
+
+		if (found) {
+			found->cqe_used += nr_cqe;
+			spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
+
+			return found;
+		}
+		spin_unlock_irqrestore(&dev->cq_pools_lock, flags);
+
+		/*
+		 * Didn't find a match or ran out of CQs in the device
+		 * pool, allocate a new array of CQs.
+		 */
+		ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+
+	return found;
+}
+EXPORT_SYMBOL(ib_cq_pool_get);
+
+void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
+{
+	unsigned long flags;
+
+	if (nr_cqe > cq->cqe_used)
+		return;
+
+	spin_lock_irqsave(&cq->device->cq_pools_lock, flags);
+	cq->cqe_used -= nr_cqe;
+	spin_unlock_irqrestore(&cq->device->cq_pools_lock, flags);
+}
+EXPORT_SYMBOL(ib_cq_pool_put);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index d9f565a..abd7cd0 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -600,6 +600,7 @@  struct ib_device *_ib_alloc_device(size_t size)
 	mutex_init(&device->compat_devs_mutex);
 	init_completion(&device->unreg_completion);
 	INIT_WORK(&device->unregistration_work, ib_unregister_work);
+	ib_cq_pool_init(device);
 
 	return device;
 }
@@ -1455,6 +1456,7 @@  static void __ib_unregister_device(struct ib_device *ib_dev)
 	device_del(&ib_dev->dev);
 	ib_device_unregister_rdmacg(ib_dev);
 	ib_cache_cleanup_one(ib_dev);
+	ib_cq_pool_destroy(ib_dev);
 
 	/*
 	 * Drivers using the new flow may not call ib_dealloc_device except
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index b79737b..0ca6617 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1555,6 +1555,7 @@  enum ib_poll_context {
 	IB_POLL_SOFTIRQ,	   /* poll from softirq context */
 	IB_POLL_WORKQUEUE,	   /* poll from workqueue */
 	IB_POLL_UNBOUND_WORKQUEUE, /* poll from unbound workqueue */
+	IB_POLL_LAST,
 };
 
 struct ib_cq {
@@ -1564,9 +1565,12 @@  struct ib_cq {
 	void                  (*event_handler)(struct ib_event *, void *);
 	void                   *cq_context;
 	int               	cqe;
+	int			cqe_used;
 	atomic_t          	usecnt; /* count number of work queues */
 	enum ib_poll_context	poll_ctx;
+	int                     comp_vector;
 	struct ib_wc		*wc;
+	struct list_head        pool_entry;
 	union {
 		struct irq_poll		iop;
 		struct work_struct	work;
@@ -2695,6 +2699,10 @@  struct ib_device {
 #endif
 
 	u32                          index;
+
+	spinlock_t                   cq_pools_lock;
+	struct list_head             cq_pools[IB_POLL_LAST - 1];
+
 	struct rdma_restrack_root *res;
 
 	const struct uapi_definition   *driver_def;
@@ -3957,6 +3965,33 @@  static inline int ib_req_notify_cq(struct ib_cq *cq,
 	return cq->device->ops.req_notify_cq(cq, flags);
 }
 
+/*
+ * ib_cq_pool_get() - Find the least used completion queue that matches
+ *     a given cpu hint (or least used for wild card affinity)
+ *     and fits nr_cqe
+ * @dev: rdma device
+ * @nr_cqe: number of needed cqe entries
+ * @comp_vector_hint: completion vector hint (-1) for the driver to assign
+ *   a comp vector based on internal counter
+ * @poll_ctx: cq polling context
+ *
+ * Finds a cq that satisfies @comp_vector_hint and @nr_cqe requirements and
+ * claims entries in it for us. If no cq is available, a new cq is
+ * allocated with the requirements and added to the device pool.
+ * IB_POLL_DIRECT cannot be used for shared cqs so it is not a valid value
+ * for @poll_ctx.
+ */
+struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
+			     int comp_vector_hint,
+			     enum ib_poll_context poll_ctx);
+
+/**
+ * ib_cq_pool_put - Return a CQ taken from a shared pool.
+ * @cq: The CQ to return.
+ * @nr_cqe: The max number of cqes that the user had requested.
+ */
+void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe);
+
 /**
  * ib_req_ncomp_notif - Request completion notification when there are
  *   at least the specified number of unreaped completions on the CQ.