Message ID | 20180228234006.21093-9-logang@deltatee.com (mailing list archive)
---|---
State | Not Applicable |
> For P2P requests we must use the pci_p2pdma_[un]map_sg() functions
> instead of the dma_map_sg functions.
>
> With that, we can then indicate PCI P2PDMA support in the request
> queue. For this, we create an NVME_F_PCI_P2PDMA flag which tells the
> core to set QUEUE_FLAG_PCI_P2PDMA in the request queue.

This looks fine to me,

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

Any plans to add the capability to nvme-rdma? Should be
straightforward... In theory, the use case would be an RDMA backend
fabric behind it. Shouldn't be hard to test either...
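For what it's worth, the nvme-rdma change would presumably mirror the branch the PCIe driver gains below. A rough sketch, not part of the posted series: the helper name is made up, REQ_IS_PCI_P2PDMA() comes from an earlier patch in this series, and ib_dma_map_sg()/ibdev->dma_device are the existing RDMA-side equivalents of the PCIe driver's dma_map_sg_attrs()/dev->dev.

/*
 * Sketch only (not part of the posted series): how nvme-rdma's data
 * mapping path could grow the same P2P branch as the PCIe driver below.
 * REQ_IS_PCI_P2PDMA() is introduced earlier in this series;
 * ib_dma_map_sg() is the existing RDMA mapping helper and
 * ibdev->dma_device is the struct device that actually performs the DMA.
 */
static int nvme_rdma_map_sg_maybe_p2p(struct ib_device *ibdev,
				      struct request *rq,
				      struct scatterlist *sgl, int nents)
{
	enum dma_data_direction dir = rq_data_dir(rq) == WRITE ?
				      DMA_TO_DEVICE : DMA_FROM_DEVICE;

	if (REQ_IS_PCI_P2PDMA(rq))
		return pci_p2pdma_map_sg(ibdev->dma_device, sgl, nents, dir);

	return ib_dma_map_sg(ibdev, sgl, nents, dir);
}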
> Any plans to add the capability to nvme-rdma? Should be
> straightforward... In theory, the use case would be an RDMA backend
> fabric behind it. Shouldn't be hard to test either...

Nice idea Sagi. Yes, we have started to look at that, though again we
would probably want to impose the "attached to the same PCIe switch"
rule, which might be less common to satisfy in initiator systems.

Down the road I would also like to discuss the best way to use this
P2P framework to facilitate copies between NVMe namespaces (both PCIe-
and fabric-attached) without having to expose the CMB up to user
space. Wasn't something like that done in the SCSI world at some
point, Martin?

Stephen
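As a rough illustration of the "attached to the same PCIe switch" rule mentioned above, a deliberately naive check could compare the upstream ports above each endpoint's downstream port. This is a sketch only: the helper name is hypothetical, and the p2pdma core in this series performs a more thorough walk of the upstream path than this.

/*
 * Illustrative only: a naive form of the "same PCIe switch" rule --
 * both endpoints must sit on downstream ports that hang off the same
 * upstream (switch) port.  Helper name is hypothetical.
 */
static bool p2p_devices_share_switch(struct pci_dev *a, struct pci_dev *b)
{
	struct pci_dev *dsp_a = pci_upstream_bridge(a);	/* downstream port */
	struct pci_dev *dsp_b = pci_upstream_bridge(b);

	if (!dsp_a || !dsp_b)
		return false;

	/* same upstream port above both downstream ports => same switch */
	return pci_upstream_bridge(dsp_a) &&
	       pci_upstream_bridge(dsp_a) == pci_upstream_bridge(dsp_b);
}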
On Thu, 2018-03-01 at 15:58 +0000, Stephen Bates wrote:
> > Any plans to add the capability to nvme-rdma? Should be
> > straightforward... In theory, the use case would be an RDMA backend
> > fabric behind it. Shouldn't be hard to test either...
>
> Nice idea Sagi. Yes, we have started to look at that, though again we
> would probably want to impose the "attached to the same PCIe switch"
> rule, which might be less common to satisfy in initiator systems.
>
> Down the road I would also like to discuss the best way to use this
> P2P framework to facilitate copies between NVMe namespaces (both PCIe-
> and fabric-attached) without having to expose the CMB up to user
> space. Wasn't something like that done in the SCSI world at some
> point, Martin?

Are you perhaps referring to the following patch series: "Copy Offload"
(https://www.spinics.net/lists/linux-scsi/msg74680.html /
https://lwn.net/Articles/592094/)? I will contact Martin off-list, and
in case he doesn't have the time to revive that patch series himself, I
will free up some time to work on this.

Bart.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0fe7ea35c221..4dd564f175fa 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2954,7 +2954,11 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	ns->queue = blk_mq_init_queue(ctrl->tagset);
 	if (IS_ERR(ns->queue))
 		goto out_free_ns;
+
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, ns->queue);
+	if (ctrl->ops->flags & NVME_F_PCI_P2PDMA)
+		queue_flag_set_unlocked(QUEUE_FLAG_PCI_P2PDMA, ns->queue);
+
 	ns->queue->queuedata = ns;
 	ns->ctrl = ctrl;
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 0521e4707d1c..e38beb1917e2 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -290,6 +290,7 @@ struct nvme_ctrl_ops {
 	unsigned int flags;
 #define NVME_F_FABRICS			(1 << 0)
 #define NVME_F_METADATA_SUPPORTED	(1 << 1)
+#define NVME_F_PCI_P2PDMA		(1 << 2)
 	int (*reg_read32)(struct nvme_ctrl *ctrl, u32 off, u32 *val);
 	int (*reg_write32)(struct nvme_ctrl *ctrl, u32 off, u32 val);
 	int (*reg_read64)(struct nvme_ctrl *ctrl, u32 off, u64 *val);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 56ca79be8476..b896a78225cb 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -796,8 +796,13 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 		goto out;
 
 	ret = BLK_STS_RESOURCE;
-	nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, dma_dir,
-			DMA_ATTR_NO_WARN);
+
+	if (REQ_IS_PCI_P2PDMA(req))
+		nr_mapped = pci_p2pdma_map_sg(dev->dev, iod->sg, iod->nents,
+					      dma_dir);
+	else
+		nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
+					     dma_dir, DMA_ATTR_NO_WARN);
 	if (!nr_mapped)
 		goto out;
 
@@ -842,7 +847,12 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req)
 			DMA_TO_DEVICE : DMA_FROM_DEVICE;
 
 	if (iod->nents) {
-		dma_unmap_sg(dev->dev, iod->sg, iod->nents, dma_dir);
+		if (REQ_IS_PCI_P2PDMA(req))
+			pci_p2pdma_unmap_sg(dev->dev, iod->sg, iod->nents,
+					    dma_dir);
+		else
+			dma_unmap_sg(dev->dev, iod->sg, iod->nents, dma_dir);
+
 		if (blk_integrity_rq(req)) {
 			if (req_op(req) == REQ_OP_READ)
 				nvme_dif_remap(req, nvme_dif_complete);
@@ -2422,7 +2432,8 @@ static int nvme_pci_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
 	.name			= "pcie",
 	.module			= THIS_MODULE,
-	.flags			= NVME_F_METADATA_SUPPORTED,
+	.flags			= NVME_F_METADATA_SUPPORTED |
+				  NVME_F_PCI_P2PDMA,
 	.reg_read32		= nvme_pci_reg_read32,
 	.reg_write32		= nvme_pci_reg_write32,
 	.reg_read64		= nvme_pci_reg_read64,
For P2P requests we must use the pci_p2pdma_[un]map_sg() functions
instead of the dma_map_sg functions.

With that, we can then indicate PCI P2PDMA support in the request
queue. For this, we create an NVME_F_PCI_P2PDMA flag which tells the
core to set QUEUE_FLAG_PCI_P2PDMA in the request queue.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/nvme/host/core.c |  4 ++++
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/pci.c  | 19 +++++++++++++++----
 3 files changed, 20 insertions(+), 4 deletions(-)
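For context on what the new queue flag enables: a block-layer consumer that wants to hand peer-to-peer memory to a namespace can gate that decision on QUEUE_FLAG_PCI_P2PDMA, roughly as in the sketch below. The helper name is hypothetical; pci_alloc_p2pmem() and QUEUE_FLAG_PCI_P2PDMA come from earlier patches in this series.

/*
 * Sketch of a consumer-side check (helper name hypothetical): only
 * queues whose driver advertised NVME_F_PCI_P2PDMA -- and therefore had
 * QUEUE_FLAG_PCI_P2PDMA set by the core in the patch above -- should
 * ever be handed peer-to-peer memory.
 */
static void *try_alloc_p2pmem_for_queue(struct request_queue *q,
					struct pci_dev *p2p_dev, size_t size)
{
	if (!test_bit(QUEUE_FLAG_PCI_P2PDMA, &q->queue_flags))
		return NULL;

	return pci_alloc_p2pmem(p2p_dev, size);
}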