Message ID | 20150714202943.GB26927@obsidianresearch.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
> -----Original Message----- > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Jason Gunthorpe > Sent: Tuesday, July 14, 2015 3:30 PM > To: 'Christoph Hellwig' > Cc: Steve Wise; 'Sagi Grimberg'; 'Steve Wise'; 'Tom Talpey'; 'Doug Ledford'; sagig@mellanox.com; ogerlitz@mellanox.com; > roid@mellanox.com; linux-rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; > trond.myklebust@primarydata.com; bfields@fieldses.org; 'Oren Duer' > Subject: Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags > > On Tue, Jul 14, 2015 at 12:55:11PM -0700, 'Christoph Hellwig' wrote: > > On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote: > > > You mean "should not", yea? > > > > > > Ok. I'll check for iWARP. But don't tell me to remove the transport-specific hacks in this series when I post it! ;) > > > > Just curious if there are any holes in this little scheme to deal with > > the lkey mess: > > > > (1) make sure all drivers that currently do not set > > IB_DEVICE_LOCAL_DMA_LKEY but which can safely use ib_get_dma_mr > > call it underneath at device setup time, and tear it down before > > removal. > > Yes, I'd like this. > > local_dma_lkey appears to be global, it works with any PD. > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > the struct device level. > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > device->alloc_pd directly.. So how about the below patch as a starting > >point? > > (Steve the goal of step #1 would be to remove ib_get_dma_mr from ULPs > Follow on patches would be to convert all ULPs to use this API change.) > > In the long run this also makes it easier to develop helper wrappers > around posting because a local_dma_lkey is now always accessible to > the core code. > I'm not seeing the benefit of adding pd->local_dma_lkey? pd->device->local_dma_lkey is there for core and ULP use, and we could have old drivers that don't currently have support for local_dma_lkey allocate their own private pd/dma_mr (via their private functions for doing this) with only LOCAL_WRITE access flags, and export that lkey as the device->local_dma_lkey. Wouldn't that be simpler? > > (2) now ULD can check for IB_DEVICE_LOCAL_DMA_LKEY and use local_dma_lkey > > in that case, or will have to perform a per-IO MR with LOCAL and > > REMOTE flags if not > > If we do the above all ULPs simply use pd->local_dma_lkey and we drop > correct uses of ib_get_dma_mr completely ? > > > (3) remove insecure remote uses of ib_get_dma_mr from ULDs > > (4) remove ib_get_dma_mr from the public API > > Sure would be nice! > > > This should help to sort out the lkey side of the memory registrations, > > and isn't too hard. For rkeys I'd love to go with something like Sagis > > proposal as a first step, and then we can see if we can refine it in > > another iteration. > > Agree. I'd love to see FMR fit under that, but even if we cannot, it > is appears to be a sane way to advance indirect MR.. > > Jason > > >From 5a9799bf487d822daae5a8b8f3b197f1dda1db45 Mon Sep 17 00:00:00 2001 > From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> > Date: Tue, 14 Jul 2015 14:27:37 -0600 > Subject: [PATCH] IB/core: Guarantee that a local_dma_lkey is available > > Every single ULP requires a local_dma_lkey to do anything with > a QP, so lets us ensure one exists for every PD created. > > If the driver can supply a global local_dma_lkey then use that, otherwise > ask the driver to create a local use all physical memory MR associated > with the new PD. > > Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> > --- > drivers/infiniband/core/verbs.c | 40 +++++++++++++++++++++++++++++++++++----- > include/rdma/ib_verbs.h | 2 ++ > 2 files changed, 37 insertions(+), 5 deletions(-) > > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index bac3fb406a74..1ddf06314f36 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -213,24 +213,54 @@ EXPORT_SYMBOL(rdma_port_get_link_layer); > > /* Protection domains */ > > +/* Return a pd for in-kernel use that has a local_dma_lkey which provides > + local access to all physical memory. */ > struct ib_pd *ib_alloc_pd(struct ib_device *device) > { > struct ib_pd *pd; > + struct ib_device_attr devattr; > + int rc; > + > + rc = ib_query_device(device, &devattr); > + if (rc) > + return ERR_PTR(rc); > > pd = device->alloc_pd(device, NULL, NULL); > + if (IS_ERR(pd)) > + return pd; > + > + pd->device = device; > + pd->uobject = NULL; > + pd->local_mr = NULL; > + atomic_set(&pd->usecnt, 0); > + > + if (devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) > + pd->local_dma_lkey = device->local_dma_lkey; > + else { > + struct ib_mr *mr; > + > + mr = ib_get_dma_mr(pd, IB_ACCESS_LOCAL_WRITE); > + if (IS_ERR(mr)) { > + ib_dealloc_pd(pd); > + return (struct ib_pd *)mr; > + } > > - if (!IS_ERR(pd)) { > - pd->device = device; > - pd->uobject = NULL; > - atomic_set(&pd->usecnt, 0); > + pd->local_mr = mr; > + pd->local_dma_lkey = pd->local_mr->lkey; > } > - > return pd; > } > + > EXPORT_SYMBOL(ib_alloc_pd); > > int ib_dealloc_pd(struct ib_pd *pd) > { > + if (pd->local_mr) { > + if (ib_dereg_mr(pd->local_mr)) > + return -EBUSY; > + pd->local_mr = NULL; > + } > + > if (atomic_read(&pd->usecnt)) > return -EBUSY; > > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > index 986fddb08579..cfda95d7b067 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -1255,6 +1255,8 @@ struct ib_pd { > struct ib_device *device; > struct ib_uobject *uobject; > atomic_t usecnt; /* count all resources */ > + struct ib_mr *local_mr; > + u32 local_dma_lkey; > }; > > struct ib_xrcd { > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 14, 2015 at 03:40:49PM -0500, Steve Wise wrote: > > local_dma_lkey appears to be global, it works with any PD. > > > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > > the struct device level. > > > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > > device->alloc_pd directly.. So how about the below patch as a starting > > >point? > > > > (Steve the goal of step #1 would be to remove ib_get_dma_mr from ULPs > > Follow on patches would be to convert all ULPs to use this API change.) > I'm not seeing the benefit of adding pd->local_dma_lkey? > pd->device->local_dma_lkey is there for core and ULP use, and we > could have old drivers that don't currently have support for > local_dma_lkey allocate their own private pd/dma_mr (via their > private functions for doing this) with only LOCAL_WRITE access > flags, and export that lkey as the device->local_dma_lkey. Wouldn't > that be simpler? It would be, but AFAIK that can't work? My understanding is if you create a QP against a PD then only lkeys and rkeys (and local_dma_rkey) created against that PD are valid for use with that QP. I can't use an lkey from a PD not associated with the QP. Am I wrong on this? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/14/2015 4:29 PM, Jason Gunthorpe wrote: > On Tue, Jul 14, 2015 at 12:55:11PM -0700, 'Christoph Hellwig' wrote: >> On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote: >>> You mean "should not", yea? >>> >>> Ok. I'll check for iWARP. But don't tell me to remove the transport-specific hacks in this series when I post it! ;) >> >> Just curious if there are any holes in this little scheme to deal with >> the lkey mess: >> >> (1) make sure all drivers that currently do not set >> IB_DEVICE_LOCAL_DMA_LKEY but which can safely use ib_get_dma_mr >> call it underneath at device setup time, and tear it down before >> removal. > > Yes, I'd like this. > > local_dma_lkey appears to be global, it works with any PD. Only if it's supported, right? There's an attribute that a provider uses to expose it. For example, I would not expect a virtualized provider to be able to support this. > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > the struct device level. Correct, and by design, in fact. In most implementations, a different token is returned for each call, in fact. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Jason Gunthorpe > Sent: Tuesday, July 14, 2015 3:45 PM > To: Steve Wise > Cc: 'Christoph Hellwig'; 'Sagi Grimberg'; 'Tom Talpey'; 'Doug Ledford'; sagig@mellanox.com; ogerlitz@mellanox.com; > roid@mellanox.com; linux-rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; > trond.myklebust@primarydata.com; bfields@fieldses.org; 'Oren Duer' > Subject: Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags > > On Tue, Jul 14, 2015 at 03:40:49PM -0500, Steve Wise wrote: > > > local_dma_lkey appears to be global, it works with any PD. > > > > > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > > > the struct device level. > > > > > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > > > device->alloc_pd directly.. So how about the below patch as a starting > > > >point? > > > > > > (Steve the goal of step #1 would be to remove ib_get_dma_mr from ULPs > > > Follow on patches would be to convert all ULPs to use this API change.) > > > I'm not seeing the benefit of adding pd->local_dma_lkey? > > pd->device->local_dma_lkey is there for core and ULP use, and we > > could have old drivers that don't currently have support for > > local_dma_lkey allocate their own private pd/dma_mr (via their > > private functions for doing this) with only LOCAL_WRITE access > > flags, and export that lkey as the device->local_dma_lkey. Wouldn't > > that be simpler? > > It would be, but AFAIK that can't work? > > My understanding is if you create a QP against a PD then only lkeys > and rkeys (and local_dma_rkey) created against that PD are valid for > use with that QP. > > I can't use an lkey from a PD not associated with the QP. > > Am I wrong on this? Kernel users can use the local_dma_lkey for all lkey IO on all QPs (ignoring the iwarp read issue). Look at sc_dma_lkey in the NFSRDMA server. > > Jason > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 14, 2015 at 03:54:15PM -0500, Steve Wise wrote: > > > I'm not seeing the benefit of adding pd->local_dma_lkey? > > > pd->device->local_dma_lkey is there for core and ULP use, and we > > > could have old drivers that don't currently have support for > > > local_dma_lkey allocate their own private pd/dma_mr (via their > > > private functions for doing this) with only LOCAL_WRITE access > > > flags, and export that lkey as the device->local_dma_lkey. Wouldn't > > > that be simpler? > > > > It would be, but AFAIK that can't work? > > > > My understanding is if you create a QP against a PD then only lkeys > > and rkeys (and local_dma_rkey) created against that PD are valid for > > use with that QP. > > > > I can't use an lkey from a PD not associated with the QP. > > > > Am I wrong on this? > > Kernel users can use the local_dma_lkey for all lkey IO on all QPs > (ignoring the iwarp read issue). Look at sc_dma_lkey in the NFSRDMA > server. Read the exchange again? You asked this: > > > could have old drivers that don't currently have support for > > > local_dma_lkey allocate their own private pd/dma_mr (via their The answer to why that doesn't work is: In the generic case of old drivers every PD has a different local_dma_lkey value. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 14, 2015 at 02:29:43PM -0600, Jason Gunthorpe wrote: > local_dma_lkey appears to be global, it works with any PD. > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > the struct device level. > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > device->alloc_pd directly.. So how about the below patch as a starting > point? This looks perfect to me. After this we can get rid of the ib_get_dma_mr calls outside of ib_alloc_pd, and eventuall move setting up ->local_dma_lkey into the HW driver and kill of ib_get_dma_mr, IB_DEVICE_LOCAL_DMA_LKEY and device->local_dma_lkey. > > (2) now ULD can check for IB_DEVICE_LOCAL_DMA_LKEY and use local_dma_lkey > > in that case, or will have to perform a per-IO MR with LOCAL and > > REMOTE flags if not > > If we do the above all ULPs simply use pd->local_dma_lkey and we drop > correct uses of ib_get_dma_mr completely ? Yes, please, although: > > (3) remove insecure remote uses of ib_get_dma_mr from ULDs I kept this separate as the I suspect the "optimizations" using ib_get_dma_mr with remote flags will warrant a separate discussion. > >From 5a9799bf487d822daae5a8b8f3b197f1dda1db45 Mon Sep 17 00:00:00 2001 > From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> > Date: Tue, 14 Jul 2015 14:27:37 -0600 > Subject: [PATCH] IB/core: Guarantee that a local_dma_lkey is available > > Every single ULP requires a local_dma_lkey to do anything with > a QP, so lets us ensure one exists for every PD created. > > If the driver can supply a global local_dma_lkey then use that, otherwise > ask the driver to create a local use all physical memory MR associated > with the new PD. > > Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Christoph Hellwig <hch@lst.de> (especially together with patches removing all other callsites of "ib_get_dma_mr(pd, IB_ACCESS_LOCAL_WRITE)". -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/14/2015 11:29 PM, Jason Gunthorpe wrote: > On Tue, Jul 14, 2015 at 12:55:11PM -0700, 'Christoph Hellwig' wrote: >> On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote: >>> You mean "should not", yea? >>> >>> Ok. I'll check for iWARP. But don't tell me to remove the transport-specific hacks in this series when I post it! ;) >> >> Just curious if there are any holes in this little scheme to deal with >> the lkey mess: >> >> (1) make sure all drivers that currently do not set >> IB_DEVICE_LOCAL_DMA_LKEY but which can safely use ib_get_dma_mr >> call it underneath at device setup time, and tear it down before >> removal. > > Yes, I'd like this. > > local_dma_lkey appears to be global, it works with any PD. > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > the struct device level. > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > device->alloc_pd directly.. So how about the below patch as a starting > >point? > > (Steve the goal of step #1 would be to remove ib_get_dma_mr from ULPs > Follow on patches would be to convert all ULPs to use this API change.) > > In the long run this also makes it easier to develop helper wrappers > around posting because a local_dma_lkey is now always accessible to > the core code. > >> (2) now ULD can check for IB_DEVICE_LOCAL_DMA_LKEY and use local_dma_lkey >> in that case, or will have to perform a per-IO MR with LOCAL and >> REMOTE flags if not > > If we do the above all ULPs simply use pd->local_dma_lkey and we drop > correct uses of ib_get_dma_mr completely ? > >> (3) remove insecure remote uses of ib_get_dma_mr from ULDs >> (4) remove ib_get_dma_mr from the public API > > Sure would be nice! > >> This should help to sort out the lkey side of the memory registrations, >> and isn't too hard. For rkeys I'd love to go with something like Sagis >> proposal as a first step, and then we can see if we can refine it in >> another iteration. > > Agree. I'd love to see FMR fit under that, but even if we cannot, it > is appears to be a sane way to advance indirect MR.. > > Jason > > From 5a9799bf487d822daae5a8b8f3b197f1dda1db45 Mon Sep 17 00:00:00 2001 > From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> > Date: Tue, 14 Jul 2015 14:27:37 -0600 > Subject: [PATCH] IB/core: Guarantee that a local_dma_lkey is available > > Every single ULP requires a local_dma_lkey to do anything with > a QP, so lets us ensure one exists for every PD created. > > If the driver can supply a global local_dma_lkey then use that, otherwise > ask the driver to create a local use all physical memory MR associated > with the new PD. > > Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> > --- > drivers/infiniband/core/verbs.c | 40 +++++++++++++++++++++++++++++++++++----- > include/rdma/ib_verbs.h | 2 ++ > 2 files changed, 37 insertions(+), 5 deletions(-) > > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index bac3fb406a74..1ddf06314f36 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -213,24 +213,54 @@ EXPORT_SYMBOL(rdma_port_get_link_layer); > > /* Protection domains */ > > +/* Return a pd for in-kernel use that has a local_dma_lkey which provides > + local access to all physical memory. */ > struct ib_pd *ib_alloc_pd(struct ib_device *device) > { > struct ib_pd *pd; > + struct ib_device_attr devattr; > + int rc; > + > + rc = ib_query_device(device, &devattr); > + if (rc) > + return ERR_PTR(rc); Unrelated question, Can we just cache the dev_attr in ib_device already? It's pretty obvious that each consumer that opens a device will automatically query its attributes... > > pd = device->alloc_pd(device, NULL, NULL); > + if (IS_ERR(pd)) > + return pd; > + > + pd->device = device; > + pd->uobject = NULL; > + pd->local_mr = NULL; > + atomic_set(&pd->usecnt, 0); > + > + if (devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) > + pd->local_dma_lkey = device->local_dma_lkey; > + else { > + struct ib_mr *mr; > + > + mr = ib_get_dma_mr(pd, IB_ACCESS_LOCAL_WRITE); > + if (IS_ERR(mr)) { > + ib_dealloc_pd(pd); > + return (struct ib_pd *)mr; > + } > > - if (!IS_ERR(pd)) { > - pd->device = device; > - pd->uobject = NULL; > - atomic_set(&pd->usecnt, 0); > + pd->local_mr = mr; > + pd->local_dma_lkey = pd->local_mr->lkey; > } > - > return pd; > } > + > EXPORT_SYMBOL(ib_alloc_pd); > > int ib_dealloc_pd(struct ib_pd *pd) > { > + if (pd->local_mr) { > + if (ib_dereg_mr(pd->local_mr)) > + return -EBUSY; > + pd->local_mr = NULL; > + } > + > if (atomic_read(&pd->usecnt)) > return -EBUSY; > > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > index 986fddb08579..cfda95d7b067 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -1255,6 +1255,8 @@ struct ib_pd { > struct ib_device *device; > struct ib_uobject *uobject; > atomic_t usecnt; /* count all resources */ > + struct ib_mr *local_mr; > + u32 local_dma_lkey; > }; > > struct ib_xrcd { > The patch itself looks good. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 15, 2015 at 11:47:52AM +0300, Sagi Grimberg wrote: > > struct ib_pd *ib_alloc_pd(struct ib_device *device) > > { > > struct ib_pd *pd; > >+ struct ib_device_attr devattr; > >+ int rc; > >+ > >+ rc = ib_query_device(device, &devattr); > >+ if (rc) > >+ return ERR_PTR(rc); > > Unrelated question, > Can we just cache the dev_attr in ib_device already? It's pretty > obvious that each consumer that opens a device will automatically > query its attributes... Instead of caching it I'd suggest to kill ib_device_attr and move the field into struct ib_device, similar to how all major kernel subsystems work. Otherwise fully agreed. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 14, 2015 at 11:50:57PM -0700, 'Christoph Hellwig' wrote: > On Tue, Jul 14, 2015 at 02:29:43PM -0600, Jason Gunthorpe wrote: > > local_dma_lkey appears to be global, it works with any PD. > > > > ib_get_dma_mr is tied to a PD, so it cannot replace local_dma_lkey at > > the struct device level. > > > > ib_alloc_pd is the in-kernel entry point, the UAPI calls > > device->alloc_pd directly.. So how about the below patch as a starting > > point? > > This looks perfect to me. After this we can get rid of the > ib_get_dma_mr calls outside of ib_alloc_pd, and eventuall move > setting up ->local_dma_lkey into the HW driver and kill of > ib_get_dma_mr, IB_DEVICE_LOCAL_DMA_LKEY and device->local_dma_lkey. Just for clarity, again, we can never do this. device->local_dma_lkey requires dedicated hardware support. We cannot create it in software on old hardware. The only option I see is the different-for-every-PD solution in my patch. > (especially together with patches removing all other callsites of > "ib_get_dma_mr(pd, IB_ACCESS_LOCAL_WRITE)". I might find time to type this in, but I won't be able to find time to do any testing on the ULPs.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 15, 2015 at 05:19:26AM -0700, 'Christoph Hellwig' wrote: > On Wed, Jul 15, 2015 at 11:47:52AM +0300, Sagi Grimberg wrote: > > > struct ib_pd *ib_alloc_pd(struct ib_device *device) > > > { > > > struct ib_pd *pd; > > >+ struct ib_device_attr devattr; > > >+ int rc; > > >+ > > >+ rc = ib_query_device(device, &devattr); > > >+ if (rc) > > >+ return ERR_PTR(rc); > > > > Unrelated question, > > Can we just cache the dev_attr in ib_device already? It's pretty > > obvious that each consumer that opens a device will automatically > > query its attributes... > > Instead of caching it I'd suggest to kill ib_device_attr and move > the field into struct ib_device, similar to how all major kernel > subsystems work. Otherwise fully agreed. Indeed, I think there are quite a few weird cases like this floating around for someone to work on :) Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 15, 2015 at 01:12:57PM -0600, Jason Gunthorpe wrote: > I might find time to type this in, but I won't be able to find time to > do any testing on the ULPs.. Here is the typing, I'll look more carefully at it later and send it via email: https://github.com/jgunthorpe/linux/commits/remove-ib_get_dma_mr Jason Gunthorpe (9): IB/core: Guarantee that a local_dma_lkey is available IB/mad: Remove ib_get_dma_mr calls IB/ipoib: Remove ib_get_dma_mr calls IB/mlx4: Remove ib_get_dma_mr calls IB/mlx5: Remove ib_get_dma_mr calls iser-target: Remove ib_get_dma_mr calls ib_srpt: Remove ib_get_dma_mr calls net/9p: Remove ib_get_dma_mr calls rds/ib: Remove ib_get_dma_mr calls The remaining calls are more complex as they are opening remote writable physical windows. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 15, 2015 at 01:12:57PM -0600, Jason Gunthorpe wrote: > > This looks perfect to me. After this we can get rid of the > > ib_get_dma_mr calls outside of ib_alloc_pd, and eventuall move > > setting up ->local_dma_lkey into the HW driver and kill of > > ib_get_dma_mr, IB_DEVICE_LOCAL_DMA_LKEY and device->local_dma_lkey. > > Just for clarity, again, we can never do this. > > device->local_dma_lkey requires dedicated hardware support. We cannot > create it in software on old hardware. The only option I see is the > different-for-every-PD solution in my patch. I don't see how my sentence above contradicts this. One we use pd->local_dma_lkey everywhere, we can kill of device->local_dma_lkey as an API - drivers either stick it straight into pd->local_dma_lkey or do the internal equivalent of ib_get_dma_mr. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 16, 2015 at 01:04:02AM -0700, 'Christoph Hellwig' wrote: > On Wed, Jul 15, 2015 at 01:12:57PM -0600, Jason Gunthorpe wrote: > > > This looks perfect to me. After this we can get rid of the > > > ib_get_dma_mr calls outside of ib_alloc_pd, and eventuall move > > > setting up ->local_dma_lkey into the HW driver and kill of > > > ib_get_dma_mr, IB_DEVICE_LOCAL_DMA_LKEY and device->local_dma_lkey. > > > > Just for clarity, again, we can never do this. > > > > device->local_dma_lkey requires dedicated hardware support. We cannot > > create it in software on old hardware. The only option I see is the > > different-for-every-PD solution in my patch. > > I don't see how my sentence above contradicts this. > > One we use pd->local_dma_lkey everywhere, we can kill of > device->local_dma_lkey as an API - drivers either stick it straight > into pd->local_dma_lkey or do the internal equivalent of > ib_get_dma_mr. Right, I misread your message Thanks, Jason -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index bac3fb406a74..1ddf06314f36 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -213,24 +213,54 @@ EXPORT_SYMBOL(rdma_port_get_link_layer); /* Protection domains */ +/* Return a pd for in-kernel use that has a local_dma_lkey which provides + local access to all physical memory. */ struct ib_pd *ib_alloc_pd(struct ib_device *device) { struct ib_pd *pd; + struct ib_device_attr devattr; + int rc; + + rc = ib_query_device(device, &devattr); + if (rc) + return ERR_PTR(rc); pd = device->alloc_pd(device, NULL, NULL); + if (IS_ERR(pd)) + return pd; + + pd->device = device; + pd->uobject = NULL; + pd->local_mr = NULL; + atomic_set(&pd->usecnt, 0); + + if (devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) + pd->local_dma_lkey = device->local_dma_lkey; + else { + struct ib_mr *mr; + + mr = ib_get_dma_mr(pd, IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(mr)) { + ib_dealloc_pd(pd); + return (struct ib_pd *)mr; + } - if (!IS_ERR(pd)) { - pd->device = device; - pd->uobject = NULL; - atomic_set(&pd->usecnt, 0); + pd->local_mr = mr; + pd->local_dma_lkey = pd->local_mr->lkey; } - return pd; } + EXPORT_SYMBOL(ib_alloc_pd); int ib_dealloc_pd(struct ib_pd *pd) { + if (pd->local_mr) { + if (ib_dereg_mr(pd->local_mr)) + return -EBUSY; + pd->local_mr = NULL; + } + if (atomic_read(&pd->usecnt)) return -EBUSY; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 986fddb08579..cfda95d7b067 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1255,6 +1255,8 @@ struct ib_pd { struct ib_device *device; struct ib_uobject *uobject; atomic_t usecnt; /* count all resources */ + struct ib_mr *local_mr; + u32 local_dma_lkey; }; struct ib_xrcd {