diff mbox

[37/37] IB/rdmavt: Add support for new memory registration API

Message ID 20151207204540.8144.94303.stgit@phlsvslse11.ph.intel.com (mailing list archive)
State Superseded
Headers show

Commit Message

Dennis Dalessandro Dec. 7, 2015, 8:45 p.m. UTC
commit 38071a461f0a ("IB/qib: Support the new memory registration API")"
added support for map_mr_sg to qib. This patch brings that
support into rdmavt, it also adds a register mr function that will be
called from the post send routine which is inline with the qib patch.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/sw/rdmavt/mr.c |   91 +++++++++++++++++++++++++++++++++++++
 drivers/infiniband/sw/rdmavt/mr.h |    5 ++
 drivers/infiniband/sw/rdmavt/vt.c |    1 
 include/rdma/rdma_vt.h            |    2 -
 4 files changed, 98 insertions(+), 1 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Sagi Grimberg Dec. 10, 2015, 3:02 p.m. UTC | #1
> commit 38071a461f0a ("IB/qib: Support the new memory registration API")"
> added support for map_mr_sg to qib. This patch brings that
> support into rdmavt, it also adds a register mr function that will be
> called from the post send routine which is inline with the qib patch.

Hi Dennis and Ira,

This question is not directly related to this patch, but given that
this is a copy-paste from the qib driver I'll go ahead and take it
anyway. How does qib (and rvt now) do memory key invalidation? I didn't
see any reference to IB_WR_LOCAL_INV anywhere in the qib driver...

What am I missing?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Dec. 14, 2015, 4:18 p.m. UTC | #2
> Hi Dennis and Ira,
>
> This question is not directly related to this patch, but given that
> this is a copy-paste from the qib driver I'll go ahead and take it
> anyway. How does qib (and rvt now) do memory key invalidation? I didn't
> see any reference to IB_WR_LOCAL_INV anywhere in the qib driver...
>
> What am I missing?

ping?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dennis Dalessandro Dec. 14, 2015, 5:14 p.m. UTC | #3
On Mon, Dec 14, 2015 at 06:18:48PM +0200, Sagi Grimberg wrote:
>>This question is not directly related to this patch, but given that
>>this is a copy-paste from the qib driver I'll go ahead and take it
>>anyway. How does qib (and rvt now) do memory key invalidation? I didn't
>>see any reference to IB_WR_LOCAL_INV anywhere in the qib driver...
>>
>>What am I missing?
>
>ping?

In short, it doesn't look like qib or hfi1 support this.

That doesn't mean it can't be added to rdmavt as a future enhancement though 
if there is a need. Are you asking because soft-roce will need it?

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Dec. 16, 2015, 1:21 p.m. UTC | #4
>>> This question is not directly related to this patch, but given that
>>> this is a copy-paste from the qib driver I'll go ahead and take it
>>> anyway. How does qib (and rvt now) do memory key invalidation? I didn't
>>> see any reference to IB_WR_LOCAL_INV anywhere in the qib driver...
>>>
>>> What am I missing?
>>
>> ping?
>
> In short, it doesn't look like qib or hfi1 support this.

Oh, I'm surprised to learn that. At least I see that
qib is not exposing IB_DEVICE_MEM_MGT_EXTENSIONS. But whats
the point in doing something with a IB_WR_REG_MR at all?

Given that this is not supported anyway, why does this patch
exist?

> That doesn't mean it can't be added to rdmavt as a future enhancement
> though if there is a need.

Well, given that we're trying to consolidate on post send registration
interface it's kind of a must I'd say.

> Are you asking because soft-roce will need it?

I was asking in general, but in specific soft-roce as a consumer will
need to support that yes.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dennis Dalessandro Dec. 16, 2015, 2:22 p.m. UTC | #5
On Wed, Dec 16, 2015 at 03:21:02PM +0200, Sagi Grimberg wrote:
>
>>>>This question is not directly related to this patch, but given that
>>>>this is a copy-paste from the qib driver I'll go ahead and take it
>>>>anyway. How does qib (and rvt now) do memory key invalidation? I didn't
>>>>see any reference to IB_WR_LOCAL_INV anywhere in the qib driver...
>>>>
>>>>What am I missing?
>>>
>>>ping?
>>
>>In short, it doesn't look like qib or hfi1 support this.
>
>Oh, I'm surprised to learn that. At least I see that
>qib is not exposing IB_DEVICE_MEM_MGT_EXTENSIONS. But whats
>the point in doing something with a IB_WR_REG_MR at all?

>Given that this is not supported anyway, why does this patch
>exist?

This patch exists to provide parity for what is in qib. Should we not have 
it?  If not, why do we have:

commit 38071a461f0a ("IB/qib: Support the new memory registration API")

>>That doesn't mean it can't be added to rdmavt as a future enhancement
>>though if there is a need.
>
>Well, given that we're trying to consolidate on post send registration
>interface it's kind of a must I'd say.
>
>>Are you asking because soft-roce will need it?
>
>I was asking in general, but in specific soft-roce as a consumer will
>need to support that yes.

I think it makes sense to revisit when soft-roce comes in, since qib/hfi do 
not need IB_WR_LOCAL_INV.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Dec. 16, 2015, 2:37 p.m. UTC | #6
> This patch exists to provide parity for what is in qib. Should we not
> have it?  If not, why do we have:
>
> commit 38071a461f0a ("IB/qib: Support the new memory registration API")

That was done by me because I saw this in qib and assumed that it was
supported. Now that I found out that it isn't, I'd say it should be
removed altogether shouldn't it?


>>> That doesn't mean it can't be added to rdmavt as a future enhancement
>>> though if there is a need.
>>
>> Well, given that we're trying to consolidate on post send registration
>> interface it's kind of a must I'd say.
>>
>>> Are you asking because soft-roce will need it?
>>
>> I was asking in general, but in specific soft-roce as a consumer will
>> need to support that yes.
>
> I think it makes sense to revisit when soft-roce comes in,

I agree.

> since qib/hfi do not need IB_WR_LOCAL_INV.

Can you explain? Does qib/hfi have a magic way to invalidate memory
regions?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dennis Dalessandro Dec. 17, 2015, 3:52 p.m. UTC | #7
On Wed, Dec 16, 2015 at 04:37:39PM +0200, Sagi Grimberg wrote:
>
>>This patch exists to provide parity for what is in qib. Should we not
>>have it?  If not, why do we have:
>>
>>commit 38071a461f0a ("IB/qib: Support the new memory registration API")
>
>That was done by me because I saw this in qib and assumed that it was
>supported. Now that I found out that it isn't, I'd say it should be
>removed altogether shouldn't it?

I am not opposed to leaving the code in rdmavt. It gets removed from qib in 
the other patch series I posted. My preference is to leave it in rdmavt 
since it will be needed down the road. However I can go either way here, its 
easy to add back later.

>
>>>>That doesn't mean it can't be added to rdmavt as a future enhancement
>>>>though if there is a need.
>>>
>>>Well, given that we're trying to consolidate on post send registration
>>>interface it's kind of a must I'd say.
>>>
>>>>Are you asking because soft-roce will need it?
>>>
>>>I was asking in general, but in specific soft-roce as a consumer will
>>>need to support that yes.
>>
>>I think it makes sense to revisit when soft-roce comes in,
>
>I agree.
>
>>since qib/hfi do not need IB_WR_LOCAL_INV.
>
>Can you explain? Does qib/hfi have a magic way to invalidate memory
>regions?

Hfi1 does not currently have any support for memory registration in its post 
send. Qib had the "old FRWR API" support in post_send, which you removed 
since according to the commit message, is not used anymore.

I suppose a follow on patch to your "new memory registration API" patch 
would be needed to add the invalidate. This is the piece I think can be 
added later in rdmavt.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Dec. 17, 2015, 4:35 p.m. UTC | #8
On Thu, Dec 17, 2015 at 10:52:29AM -0500, Dennis Dalessandro wrote:
> I am not opposed to leaving the code in rdmavt. It gets removed from qib in
> the other patch series I posted. My preference is to leave it in rdmavt
> since it will be needed down the road. However I can go either way here, its
> easy to add back later.

Without setting IB_DEVICE_MEM_MGT_EXTENSIONS and implementing all the
features required for it it's dead code.  There is no point to keep
it around.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dennis Dalessandro Dec. 17, 2015, 4:49 p.m. UTC | #9
On Thu, Dec 17, 2015 at 08:35:53AM -0800, Christoph Hellwig wrote:
>On Thu, Dec 17, 2015 at 10:52:29AM -0500, Dennis Dalessandro wrote:
>> I am not opposed to leaving the code in rdmavt. It gets removed from qib in
>> the other patch series I posted. My preference is to leave it in rdmavt
>> since it will be needed down the road. However I can go either way here, its
>> easy to add back later.
>
>Without setting IB_DEVICE_MEM_MGT_EXTENSIONS and implementing all the
>features required for it it's dead code.  There is no point to keep
>it around.

Ok, sounds good to me. I will drop this patch when I post the v2.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index e809aaa..3ca8a6b 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -474,6 +474,7 @@  int rvt_dereg_mr(struct ib_mr *ibmr)
 	int ret = 0;
 	unsigned long timeout;
 
+	kfree(mr->pages);
 	rvt_free_lkey(&mr->mr);
 
 	rvt_put_mr(&mr->mr); /* will set completion if last */
@@ -515,7 +516,38 @@  struct ib_mr *rvt_alloc_mr(struct ib_pd *pd,
 	if (IS_ERR(mr))
 		return (struct ib_mr *)mr;
 
+	mr->pages = kcalloc(max_num_sg, sizeof(u64), GFP_KERNEL);
+	if (!mr->pages)
+		goto err;
+
 	return &mr->ibmr;
+
+err:
+	rvt_dereg_mr(&mr->ibmr);
+	return ERR_PTR(-ENOMEM);
+}
+
+static int rvt_set_page(struct ib_mr *ibmr, u64 addr)
+{
+	struct rvt_mr *mr = to_imr(ibmr);
+
+	if (unlikely(mr->npages == mr->mr.max_segs))
+		return -ENOMEM;
+
+	mr->pages[mr->npages++] = addr;
+
+	return 0;
+}
+
+int rvt_map_mr_sg(struct ib_mr *ibmr,
+		  struct scatterlist *sg,
+		  int sg_nents)
+{
+	struct rvt_mr *mr = to_imr(ibmr);
+
+	mr->npages = 0;
+
+	return ib_sg_to_pages(ibmr, sg, sg_nents, rvt_set_page);
 }
 
 /**
@@ -867,3 +899,62 @@  bail:
 	return 0;
 }
 EXPORT_SYMBOL(rvt_rkey_ok);
+
+/*
+ * Initialize the memory region specified by the work request.
+ * This is only called in the post send.
+ */
+int rvt_reg_mr(struct rvt_qp *qp, struct ib_reg_wr *wr)
+{
+	struct rvt_dev_info *rdi = ib_to_rvt(qp->ibqp.device);
+	struct rvt_lkey_table *rkt = &rdi->lk_table;
+	struct rvt_pd *pd = ibpd_to_rvtpd(qp->ibqp.pd);
+	struct rvt_mr *mr = to_imr(wr->mr);
+	struct rvt_mregion *mrg;
+	u32 key = wr->key;
+	unsigned i, n, m;
+	int ret = -EINVAL;
+	unsigned long flags;
+	u64 *page_list;
+	size_t ps;
+
+	spin_lock_irqsave(&rkt->lock, flags);
+	if (pd->user || key == 0)
+		goto bail;
+
+	mrg = rcu_dereference_protected(
+		rkt->table[(key >> (32 - rdi->dparms.lkey_table_size))],
+		lockdep_is_held(&rkt->lock));
+	if (unlikely(!mrg || qp->ibqp.pd != mrg->pd))
+		goto bail;
+
+	if (mr->npages > mrg->max_segs)
+		goto bail;
+
+	ps = mr->ibmr.page_size;
+	if (mr->ibmr.length > ps * mr->npages)
+		goto bail;
+
+	mrg->user_base = mr->ibmr.iova;
+	mrg->iova = mr->ibmr.iova;
+	mrg->lkey = key;
+	mrg->length = mr->ibmr.length;
+	mrg->access_flags = wr->access;
+	page_list = mr->pages;
+	m = 0;
+	n = 0;
+	for (i = 0; i < mr->npages; i++) {
+		mrg->map[m]->segs[n].vaddr = (void *)page_list[i];
+		mrg->map[m]->segs[n].length = ps;
+		if (++n == RVT_SEGSZ) {
+			m++;
+			n = 0;
+		}
+	}
+
+	ret = 0;
+bail:
+	spin_unlock_irqrestore(&rkt->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(rvt_reg_mr);
diff --git a/drivers/infiniband/sw/rdmavt/mr.h b/drivers/infiniband/sw/rdmavt/mr.h
index 3b43278..3bb3306 100644
--- a/drivers/infiniband/sw/rdmavt/mr.h
+++ b/drivers/infiniband/sw/rdmavt/mr.h
@@ -60,6 +60,8 @@  struct rvt_fmr {
 struct rvt_mr {
 	struct ib_mr ibmr;
 	struct ib_umem *umem;
+	u64 *pages;
+	u32 npages;
 	struct rvt_mregion mr;  /* must be last */
 };
 
@@ -88,6 +90,9 @@  int rvt_dereg_mr(struct ib_mr *ibmr);
 struct ib_mr *rvt_alloc_mr(struct ib_pd *pd,
 			   enum ib_mr_type mr_type,
 			   u32 max_num_sg);
+int rvt_map_mr_sg(struct ib_mr *ibmr,
+		  struct scatterlist *sg,
+		  int sg_nents);
 struct ib_fmr *rvt_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
 			     struct ib_fmr_attr *fmr_attr);
 int rvt_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index d541b05..3b38dd3 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -306,6 +306,7 @@  int rvt_register_device(struct rvt_dev_info *rdi)
 	CDR(rdi, unmap_fmr);
 	CDR(rdi, dealloc_fmr);
 	CDR(rdi, mmap);
+	CDR(rdi, map_mr_sg);
 
 	/* Completion queues */
 	CDR(rdi, create_cq);
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index afcc819..9faacdf 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -335,5 +335,5 @@  struct rvt_mmap_info *rvt_create_mmap_info(struct rvt_dev_info *rdi,
 					   void *obj);
 void rvt_update_mmap_info(struct rvt_dev_info *rdi, struct rvt_mmap_info *ip,
 			  u32 size, void *obj);
-
+int rvt_reg_mr(struct rvt_qp *qp, struct ib_reg_wr *wr);
 #endif          /* DEF_RDMA_VT_H */