Message ID | 1828884A29C6694DAF28B7E6B8A8237346A8E7FB@ORSMSX101.amr.corp.intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Roland Dreier |
Headers | show |
On Thu, Sep 20, 2012 at 09:43:06PM +0000, Hefty, Sean wrote: > +struct ibv_xrcd *__ibv_open_xrcd(struct ibv_context *context, int fd, int oflags) > +{ > + struct verbs_context *context_ex = verbs_get_ctx(context); > + struct ibv_xrcd *xrcd; This patch series seems to be missing the checks against a null return of verbs_get_ctx - if the driver is not extended these functions should return ENOSYS, not crash.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > +struct ibv_xrcd *__ibv_open_xrcd(struct ibv_context *context, int fd, int > oflags) > > +{ > > + struct verbs_context *context_ex = verbs_get_ctx(context); > > + struct ibv_xrcd *xrcd; > > This patch series seems to be missing the checks against a null return > of verbs_get_ctx - if the driver is not extended these functions > should return ENOSYS, not crash.. Patch 2 extends ibv_context for providers that don't by adding a translation layer for older providers. This seemed to keep things simpler, since the core verbs code could rely on extension support. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 24, 2012 at 08:45:10PM +0000, Hefty, Sean wrote: > > > +struct ibv_xrcd *__ibv_open_xrcd(struct ibv_context *context, int fd, int > > oflags) > > > +{ > > > + struct verbs_context *context_ex = verbs_get_ctx(context); > > > + struct ibv_xrcd *xrcd; > > > > This patch series seems to be missing the checks against a null return > > of verbs_get_ctx - if the driver is not extended these functions > > should return ENOSYS, not crash.. > > Patch 2 extends ibv_context for providers that don't by adding a > translation layer for older providers. This seemed to keep things > simpler, since the core verbs code could rely on extension support. Well, that patch is pretty complex and it slows down common case high speed calls (eg ibv_poll_cq) unless the provider is (fully?) upgraded. That is something I wanted to see avoided. Plus I don't see that we should be trying to make extended structures for many of the things in patch 2 (ibv_qp, ibv_srq, ibv_ah, ibv_mr, ibv_cq, ibv_pd, ibv_mw). Additional functionality for those objects is better served through new, optional, function entry points than by allowing the consumer to muck about directl in those structures. Those structs, in particular, should have been opaque from day 1, IMHO. So at first blush, adding null checks in _ex functions seems like a lessor evil, IHMO... Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Well, that patch is pretty complex and it slows down common case high > speed calls (eg ibv_poll_cq) unless the provider is (fully?) > upgraded. That is something I wanted to see avoided. The provider does need to support extensions. They don't need to support any extra calls. The cost should really be minimal and goes away as soon as the provider is updated, which I would expect all of them to be before the next release of verbs or OFED. > Plus I don't see that we should be trying to make extended structures > for many of the things in patch 2 (ibv_qp, ibv_srq, ibv_ah, ibv_mr, > ibv_cq, ibv_pd, ibv_mw). Additional functionality for those objects is > better served through new, optional, function entry points than by > allowing the consumer to muck about directl in those structures. Those > structs, in particular, should have been opaque from day 1, IMHO. There is data sharing that's required between verbs and the provider for verbs to do anything useful. We need to add an xrcd to the QP and xrcd, cq, and srq_num to the SRQ. Verbs uses the xrcd's and cq, but the user needs the srq_num. Verbs doesn't allocate the structures, so it can't store anything. Adding a bunch of get/set calls seems worse to me than just placing the items in a structure. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 24, 2012 at 09:54:22PM +0000, Hefty, Sean wrote: > The cost should really be minimal and goes away as soon as the > provider is updated, which I would expect all of them to be before > the next release of verbs or OFED. Donno, many don't need to change because they don't support any extensions.. > > Plus I don't see that we should be trying to make extended structures > > for many of the things in patch 2 (ibv_qp, ibv_srq, ibv_ah, ibv_mr, > > ibv_cq, ibv_pd, ibv_mw). Additional functionality for those objects is > > better served through new, optional, function entry points than by > > allowing the consumer to muck about directl in those structures. Those > > structs, in particular, should have been opaque from day 1, IMHO. > There is data sharing that's required between verbs and the provider > for verbs to do anything useful. We need to add an xrcd to the QP > and xrcd, cq, and srq_num to the SRQ. Verbs uses the xrcd's and cq, > but the user needs the srq_num. Internally verbs can learn if the cq is extended via inspecting the context structure, so I don't see the verbs<->provider interface as the problem here. Exposing this mess to users should be done at little as possible.. > Verbs doesn't allocate the structures, so it can't store anything. Well, that is the point, verbs shouldn't be storing things, any change to the object state must go through the provider. > Adding a bunch of get/set calls seems worse to me than just placing > the items in a structure. Really? if (cq->comp_mask & IBV_CQ_COMP_MASK_SRQ_NUM) srq_num = cq->srq_num vs ibv_cq_get_srq_num(cq); // Returns 0 on error Betcha at least 50% of apps would do the first one wrong.... Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Verbs doesn't allocate the structures, so it can't store anything. > > Well, that is the point, verbs shouldn't be storing things, any change > to the object state must go through the provider. Verbs does stuff like this: qp = context->ops.create_qp(pd, (struct ibv_qp_init_attr *) qp_init_attr_ex); if (qp) { qp->context = context; qp->qp_context = qp_init_attr_ex->qp_context; qp->pd = pd; qp->send_cq = qp_init_attr_ex->send_cq; qp->recv_cq = qp_init_attr_ex->recv_cq; qp->srq = qp_init_attr_ex->srq; qp->qp_type = qp_init_attr_ex->qp_type; qp->state = IBV_QPS_RESET; qp->events_completed = 0; pthread_mutex_init(&qp->mutex, NULL); pthread_cond_init(&qp->cond, NULL); and this: if (write(context->cmd_fd, cmd, cmd_size) != cmd_size) return errno; (void) VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); qp->handle = resp->qp_handle; qp->qp_num = resp->qpn; qp->context = context; > > > Adding a bunch of get/set calls seems worse to me than just placing > > the items in a structure. > > Really? > > if (cq->comp_mask & IBV_CQ_COMP_MASK_SRQ_NUM) > srq_num = cq->srq_num > > vs > > ibv_cq_get_srq_num(cq); // Returns 0 on error > > Betcha at least 50% of apps would do the first one wrong.... An app which uses XRC just needs to access srq->srq_num. No other checks are needed. If they have an XRC SRQ, the SRQ number must be valid. The mask seems pointless to me. If you're suggesting that verbs.h remains untouched (for patch 1 of this series), but that new structures are defined between the verbs and the provider only, that seems reasonable. It would likely result in a bunch of typecasts everywhere. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 24, 2012 at 11:00:00PM +0000, Hefty, Sean wrote: > > > Verbs doesn't allocate the structures, so it can't store anything. > > > > Well, that is the point, verbs shouldn't be storing things, any change > > to the object state must go through the provider. > > Verbs does stuff like this: This is all just internal bookkeeping between verbs and the provider, it doesn't need to be exposed to the user. Verbs and the provider can negotiate what the internal layout of these object allocations will be through the context. > > > Adding a bunch of get/set calls seems worse to me than just placing > > > the items in a structure. > > > > Really? > > > > if (cq->comp_mask & IBV_CQ_COMP_MASK_SRQ_NUM) > > srq_num = cq->srq_num > > > > vs > > > > ibv_cq_get_srq_num(cq); // Returns 0 on error > > > > Betcha at least 50% of apps would do the first one wrong.... > > An app which uses XRC just needs to access srq->srq_num. No other > checks are needed. If they have an XRC SRQ, the SRQ number must be > valid. The mask seems pointless to me. The mask is part of the API, and not all cases in the future are guarenteed to be like XRC. Sometimes (like with this XRC example) you can omit the comp_mask check, others times you can't. It is just too complex for something low speed. A function call can't be done wrong ;) > If you're suggesting that verbs.h remains untouched (for patch 1 of > this series), but that new structures are defined between the verbs > and the provider only, that seems reasonable. It would likely > result in a bunch of typecasts everywhere. I'm suggesting this group of provider-allocated structures remain opaque to the user and we don't add new user-visible things to them at all (having them non-opaque to start with was a mistake). Use accessors to implement new functionality on these objects. Accessors are an overall better strategy for keeping ABI stability, and should be used whenever possible. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/infiniband/driver.h b/include/infiniband/driver.h index 5af0d7f..193b0dd 100644 --- a/include/infiniband/driver.h +++ b/include/infiniband/driver.h @@ -76,6 +76,11 @@ int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd, struct ibv_alloc_pd *cmd, size_t cmd_size, struct ibv_alloc_pd_resp *resp, size_t resp_size); int ibv_cmd_dealloc_pd(struct ibv_pd *pd); +int ibv_cmd_open_xrcd(struct ibv_context *context, struct ibv_xrcd *xrcd, + int fd, int oflags, + struct ibv_open_xrcd *cmd, size_t cmd_size, + struct ibv_open_xrcd_resp *resp, size_t resp_size); +int ibv_cmd_close_xrcd(struct ibv_xrcd *xrcd); #define IBV_CMD_REG_MR_HAS_RESP_PARAMS int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access, diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h index 619ea7e..d7c673f 100644 --- a/include/infiniband/kern-abi.h +++ b/include/infiniband/kern-abi.h @@ -85,7 +85,9 @@ enum { IB_USER_VERBS_CMD_MODIFY_SRQ, IB_USER_VERBS_CMD_QUERY_SRQ, IB_USER_VERBS_CMD_DESTROY_SRQ, - IB_USER_VERBS_CMD_POST_SRQ_RECV + IB_USER_VERBS_CMD_POST_SRQ_RECV, + IB_USER_VERBS_CMD_OPEN_XRCD, + IB_USER_VERBS_CMD_CLOSE_XRCD, }; /* @@ -246,6 +248,27 @@ struct ibv_dealloc_pd { __u32 pd_handle; }; +struct ibv_open_xrcd { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u32 fd; + __u32 oflags; + __u64 driver_data[0]; +}; + +struct ibv_open_xrcd_resp { + __u32 xrcd_handle; +}; + +struct ibv_close_xrcd { + __u32 command; + __u16 in_words; + __u16 out_words; + __u32 xrcd_handle; +}; + struct ibv_reg_mr { __u32 command; __u16 in_words; @@ -804,6 +827,8 @@ enum { * trick opcodes in IBV_INIT_CMD() doesn't break. */ IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL_V2 = -1, + IB_USER_VERBS_CMD_OPEN_XRCD_V2 = -1, + IB_USER_VERBS_CMD_CLOSE_XRCD_V2 = -1, }; struct ibv_destroy_cq_v1 { diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h index f3cb2fc..6e02c9a 100644 --- a/include/infiniband/verbs.h +++ b/include/infiniband/verbs.h @@ -1,6 +1,6 @@ /* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. - * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004, 2011-2012 Intel Corporation. All rights reserved. * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * @@ -317,6 +317,18 @@ struct ibv_pd { uint32_t comp_mask; }; +enum ibv_xrcd_mask { + IBV_XRCD_CONTEXT = 1 << 0, + IBV_XRCD_HANDLE = 1 << 1, + IBV_XRCD_RESERVED = 1 << 2 +}; + +struct ibv_xrcd { + uint64_t comp_mask; + struct ibv_context *context; + uint32_t handle; +}; + enum ibv_rereg_mr_flags { IBV_REREG_MR_CHANGE_TRANSLATION = (1 << 0), IBV_REREG_MR_CHANGE_PD = (1 << 1), @@ -774,11 +786,13 @@ struct ibv_context { }; struct verbs_context { - /* "grows up" - new fields go here int (*drv_new_func1) (); new corresponding provider call of func1 int (*lib_new_func1) (); New library call func1 */ + struct ibv_xrcd * (*open_xrcd)(struct ibv_context *context, + int fd, int oflags); + int (*close_xrcd)(struct ibv_xrcd *xrcd); size_t sz; /* Set by library on struct allocation,must be * located right before struct ibv_context */ @@ -873,7 +887,7 @@ static inline int ___ibv_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr) { - /* For compatability when running with old libibverbs */ + /* For compatibility when running with old libibverbs */ port_attr->link_layer = IBV_LINK_LAYER_UNSPECIFIED; port_attr->reserved = 0; @@ -906,6 +920,16 @@ struct ibv_pd *ibv_alloc_pd(struct ibv_context *context); int ibv_dealloc_pd(struct ibv_pd *pd); /** + * ibv_open_xrcd - Open an extended connection domain + */ +struct ibv_xrcd *ibv_open_xrcd(struct ibv_context *context, int fd, int oflags); + +/** + * ibv_close_xrcd - Close an extended connection domain + */ +int ibv_close_xrcd(struct ibv_xrcd *xrcd); + +/** * ibv_reg_mr - Register a memory region */ struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr, diff --git a/src/cmd.c b/src/cmd.c index dab8930..4a98c64 100644 --- a/src/cmd.c +++ b/src/cmd.c @@ -194,6 +194,40 @@ int ibv_cmd_dealloc_pd(struct ibv_pd *pd) return 0; } +int ibv_cmd_open_xrcd(struct ibv_context *context, struct ibv_xrcd *xrcd, + int fd, int oflags, + struct ibv_open_xrcd *cmd, size_t cmd_size, + struct ibv_open_xrcd_resp *resp, size_t resp_size) +{ + IBV_INIT_CMD_RESP(cmd, cmd_size, OPEN_XRCD, resp, resp_size); + + cmd->fd = fd; + cmd->oflags = oflags; + if (write(context->cmd_fd, cmd, cmd_size) != cmd_size) + return errno; + + VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); + + xrcd->comp_mask = IBV_XRCD_CONTEXT | IBV_XRCD_HANDLE; + xrcd->context = context; + xrcd->handle = resp->xrcd_handle; + + return 0; +} + +int ibv_cmd_close_xrcd(struct ibv_xrcd *xrcd) +{ + struct ibv_close_xrcd cmd; + + IBV_INIT_CMD(&cmd, sizeof cmd, CLOSE_XRCD); + cmd.xrcd_handle = xrcd->handle; + + if (write(xrcd->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + + return 0; +} + int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access, struct ibv_mr *mr, struct ibv_reg_mr *cmd, diff --git a/src/libibverbs.map b/src/libibverbs.map index ee9adea..f21bffe 100644 --- a/src/libibverbs.map +++ b/src/libibverbs.map @@ -97,4 +97,10 @@ IBVERBS_1.1 { ibv_port_state_str; ibv_event_type_str; ibv_wc_status_str; + + ibv_open_xrcd; + ibv_close_xrcd; + ibv_cmd_open_xrcd; + ibv_cmd_close_xrcd; + } IBVERBS_1.0; diff --git a/src/verbs.c b/src/verbs.c index 7973afc..ea99ab0 100644 --- a/src/verbs.c +++ b/src/verbs.c @@ -154,6 +154,32 @@ int __ibv_dealloc_pd(struct ibv_pd *pd) } default_symver(__ibv_dealloc_pd, ibv_dealloc_pd); +struct ibv_xrcd *__ibv_open_xrcd(struct ibv_context *context, int fd, int oflags) +{ + struct verbs_context *context_ex = verbs_get_ctx(context); + struct ibv_xrcd *xrcd; + + if (!context_ex->open_xrcd) { + errno = ENOSYS; + return NULL; + } + + xrcd = context_ex->open_xrcd(context, fd, oflags); + if (xrcd) + xrcd->context = context; + + return xrcd; +} +default_symver(__ibv_open_xrcd, ibv_open_xrcd); + +int __ibv_close_xrcd(struct ibv_xrcd *xrcd) +{ + struct verbs_context *context_ex = verbs_get_ctx(xrcd->context); + + return context_ex->close_xrcd(xrcd); +} +default_symver(__ibv_close_xrcd, ibv_close_xrcd); + struct ibv_mr *__ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, int access) {
XRC introduces several new concepts and structures, one of which is the XRC domain. XRC domains: xrcd's are a type of protection domain used to associate shared receive queues with xrc queue pairs. Since xrcd are meant to be shared among multiple processes, we introduce new APIs to open/close xrcd's. The user to kernel ABI is extended to account for opening/ closing the xrcd. Signed-off-by: Sean Hefty <sean.hefty@intel.com> --- include/infiniband/driver.h | 5 +++++ include/infiniband/kern-abi.h | 27 ++++++++++++++++++++++++++- include/infiniband/verbs.h | 30 +++++++++++++++++++++++++++--- src/cmd.c | 34 ++++++++++++++++++++++++++++++++++ src/libibverbs.map | 6 ++++++ src/verbs.c | 26 ++++++++++++++++++++++++++ 6 files changed, 124 insertions(+), 4 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html