From patchwork Thu Oct 15 22:02:45 2020
X-Patchwork-Submitter: "Xiong, Jianxin"
X-Patchwork-Id: 11840223
X-Patchwork-Delegate: jgg@ziepe.ca
From: Jianxin Xiong
To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org
Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal, Christian Koenig, Daniel Vetter
Subject: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as user memory region
Date: Thu, 15 Oct 2020 15:02:45 -0700
Message-Id: <1602799365-138199-1-git-send-email-jianxin.xiong@intel.com>

Dma-buf is a standard cross-driver buffer sharing mechanism that can be used to support peer-to-peer access from RDMA devices. Device memory exported via dma-buf is associated with a file descriptor. This is passed to the user space as a property associated with the buffer allocation. When the buffer is registered as a memory region, the file descriptor is passed to the RDMA driver along with other parameters. Implement the common code for importing dma-buf object and mapping dma-buf pages.

Signed-off-by: Jianxin Xiong Reviewed-by: Sean Hefty Acked-by: Michael J.
Ruhl Acked-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/infiniband/core/Makefile | 2 +- drivers/infiniband/core/umem.c | 4 + drivers/infiniband/core/umem_dmabuf.c | 206 ++++++++++++++++++++++++++++++++++ drivers/infiniband/core/umem_dmabuf.h | 11 ++ include/rdma/ib_umem.h | 32 +++++- 5 files changed, 253 insertions(+), 2 deletions(-) create mode 100644 drivers/infiniband/core/umem_dmabuf.c create mode 100644 drivers/infiniband/core/umem_dmabuf.h diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile index ccf2670..8ab4eea 100644 --- a/drivers/infiniband/core/Makefile +++ b/drivers/infiniband/core/Makefile @@ -40,5 +40,5 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \ uverbs_std_types_srq.o \ uverbs_std_types_wq.o \ uverbs_std_types_qp.o -ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o +ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o umem_dmabuf.o ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index e9fecbd..8c608a5 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -2,6 +2,7 @@ * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Cisco Systems. All rights reserved. * Copyright (c) 2005 Mellanox Technologies. All rights reserved. + * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -43,6 +44,7 @@ #include #include "uverbs.h" +#include "umem_dmabuf.h" static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty) { @@ -269,6 +271,8 @@ void ib_umem_release(struct ib_umem *umem) { if (!umem) return; + if (umem->is_dmabuf) + return ib_umem_dmabuf_release(umem); if (umem->is_odp) return ib_umem_odp_release(to_ib_umem_odp(umem)); diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c new file mode 100644 index 0000000..4d6d6f3 --- /dev/null +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -0,0 +1,206 @@ +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) +/* + * Copyright (c) 2020 Intel Corporation. All rights reserved. + */ + +#include +#include +#include + +#include "uverbs.h" +#include "umem_dmabuf.h" + +struct ib_umem_dmabuf { + struct ib_umem umem; + struct dma_buf_attachment *attach; + struct sg_table *sgt; + const struct ib_umem_dmabuf_ops *ops; + void *device_context; + struct work_struct work; +}; + +static inline struct ib_umem_dmabuf *to_ib_umem_dmabuf(struct ib_umem *umem) +{ + return container_of(umem, struct ib_umem_dmabuf, umem); +} + +static int ib_umem_dmabuf_map_pages(struct ib_umem *umem, bool first) +{ + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem); + struct sg_table *sgt; + struct dma_fence *fence; + int err; + + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); + + sgt = dma_buf_map_attachment(umem_dmabuf->attach, + DMA_BIDIRECTIONAL); + + if (IS_ERR(sgt)) { + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); + return PTR_ERR(sgt); + } + + umem_dmabuf->umem.sg_head = *sgt; + umem_dmabuf->umem.nmap = sgt->nents; + umem_dmabuf->sgt = sgt; + + /* + * Although the sg list is valid now, the content of the pages + * may be not up-to-date. Wait for the exporter to finish + * the migration. 
+ */ + fence = dma_resv_get_excl(umem_dmabuf->attach->dmabuf->resv); + if (fence) + dma_fence_wait(fence, false); + + if (first) + err = umem_dmabuf->ops->init(umem, + umem_dmabuf->device_context); + else + err = umem_dmabuf->ops->update(umem, + umem_dmabuf->device_context); + + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); + return err; +} + +int ib_umem_dmabuf_init_mapping(struct ib_umem *umem, void *device_context) +{ + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem); + + umem_dmabuf->device_context = device_context; + return ib_umem_dmabuf_map_pages(umem, true); +} +EXPORT_SYMBOL(ib_umem_dmabuf_init_mapping); + +bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem) +{ + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem); + bool ret; + + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); + ret = !!umem_dmabuf->sgt; + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); + return ret; +} +EXPORT_SYMBOL(ib_umem_dmabuf_mapping_ready); + +static void ib_umem_dmabuf_unmap_pages(struct ib_umem *umem, bool do_invalidate) +{ + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem); + + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); + + if (!umem_dmabuf->sgt) + return; + + if (do_invalidate) + umem_dmabuf->ops->invalidate(umem, umem_dmabuf->device_context); + + dma_buf_unmap_attachment(umem_dmabuf->attach, umem_dmabuf->sgt, + DMA_BIDIRECTIONAL); + umem_dmabuf->sgt = NULL; +} + +static void ib_umem_dmabuf_work(struct work_struct *work) +{ + struct ib_umem_dmabuf *umem_dmabuf; + int ret; + + umem_dmabuf = container_of(work, struct ib_umem_dmabuf, work); + ret = ib_umem_dmabuf_map_pages(&umem_dmabuf->umem, false); + if (ret) + pr_debug("%s: failed to update dmabuf mapping, error %d\n", + __func__, ret); +} + +static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment *attach) +{ + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv; + + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); + + ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true); + queue_work(ib_wq, &umem_dmabuf->work); +} + +static struct dma_buf_attach_ops ib_umem_dmabuf_attach_ops = { + .allow_peer2peer = 1, + .move_notify = ib_umem_dmabuf_invalidate_cb, +}; + +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device, + unsigned long addr, size_t size, + int dmabuf_fd, int access, + const struct ib_umem_dmabuf_ops *ops) +{ + struct dma_buf *dmabuf; + struct ib_umem_dmabuf *umem_dmabuf; + struct ib_umem *umem; + unsigned long end; + long ret; + + if (check_add_overflow(addr, (unsigned long)size, &end)) + return ERR_PTR(-EINVAL); + + if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE)) + return ERR_PTR(-EINVAL); + + if (unlikely(!ops || !ops->invalidate || !ops->update)) + return ERR_PTR(-EINVAL); + + umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL); + if (!umem_dmabuf) + return ERR_PTR(-ENOMEM); + + umem_dmabuf->ops = ops; + INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work); + + umem = &umem_dmabuf->umem; + umem->ibdev = device; + umem->length = size; + umem->address = addr; + umem->writable = ib_access_writable(access); + umem->is_dmabuf = 1; + + dmabuf = dma_buf_get(dmabuf_fd); + if (IS_ERR(dmabuf)) { + ret = PTR_ERR(dmabuf); + goto out_free_umem; + } + + umem_dmabuf->attach = dma_buf_dynamic_attach( + dmabuf, + device->dma_device, + &ib_umem_dmabuf_attach_ops, + umem_dmabuf); + if (IS_ERR(umem_dmabuf->attach)) { + ret = PTR_ERR(umem_dmabuf->attach); + goto out_release_dmabuf; + } + + return umem; + +out_release_dmabuf: + dma_buf_put(dmabuf); + +out_free_umem: 
+ kfree(umem_dmabuf); + return ERR_PTR(ret); +} +EXPORT_SYMBOL(ib_umem_dmabuf_get); + +void ib_umem_dmabuf_release(struct ib_umem *umem) +{ + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem); + struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf; + + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); + ib_umem_dmabuf_unmap_pages(umem, false); + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); + + dma_buf_detach(dmabuf, umem_dmabuf->attach); + dma_buf_put(dmabuf); + kfree(umem_dmabuf); +} diff --git a/drivers/infiniband/core/umem_dmabuf.h b/drivers/infiniband/core/umem_dmabuf.h new file mode 100644 index 0000000..485f653 --- /dev/null +++ b/drivers/infiniband/core/umem_dmabuf.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */ +/* + * Copyright (c) 2020 Intel Corporation. All rights reserved. + */ + +#ifndef UMEM_DMABUF_H +#define UMEM_DMABUF_H + +void ib_umem_dmabuf_release(struct ib_umem *umem); + +#endif /* UMEM_DMABUF_H */ diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h index 7059750..fac8553 100644 --- a/include/rdma/ib_umem.h +++ b/include/rdma/ib_umem.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* * Copyright (c) 2007 Cisco Systems. All rights reserved. + * Copyright (c) 2020 Intel Corporation. All rights reserved. */ #ifndef IB_UMEM_H @@ -22,12 +23,19 @@ struct ib_umem { unsigned long address; u32 writable : 1; u32 is_odp : 1; + u32 is_dmabuf : 1; struct work_struct work; struct sg_table sg_head; int nmap; unsigned int sg_nents; }; +struct ib_umem_dmabuf_ops { + int (*init)(struct ib_umem *umem, void *context); + int (*update)(struct ib_umem *umem, void *context); + int (*invalidate)(struct ib_umem *umem, void *context); +}; + /* Returns the offset of the umem start relative to the first page. 
*/ static inline int ib_umem_offset(struct ib_umem *umem) { @@ -79,6 +87,12 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset, unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem, unsigned long pgsz_bitmap, unsigned long virt); +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device, + unsigned long addr, size_t size, + int dmabuf_fd, int access, + const struct ib_umem_dmabuf_ops *ops); +int ib_umem_dmabuf_init_mapping(struct ib_umem *umem, void *device_context); +bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem); #else /* CONFIG_INFINIBAND_USER_MEM */ @@ -101,7 +115,23 @@ static inline unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem, { return 0; } +static inline struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device, + unsigned long addr, + size_t size, int dmabuf_fd, + int access, + struct ib_umem_dmabuf_ops *ops) +{ + return ERR_PTR(-EINVAL); +} +static inline int ib_umem_dmabuf_init_mapping(struct ib_umem *umem, + void *device_context) +{ + return -EINVAL; +} +static inline bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem) +{ + return false; +} #endif /* CONFIG_INFINIBAND_USER_MEM */ - #endif /* IB_UMEM_H */ From patchwork Thu Oct 15 22:02:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Xiong, Jianxin" X-Patchwork-Id: 11840225 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 90E9E17CA for ; Thu, 15 Oct 2020 21:48:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 78D1B2076E for ; Thu, 15 Oct 2020 21:48:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731634AbgJOVsj (ORCPT ); Thu, 15 Oct 2020 17:48:39 -0400 Received: from mga11.intel.com ([192.55.52.93]:48800 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731632AbgJOVsj (ORCPT ); Thu, 15 Oct 2020 17:48:39 -0400 IronPort-SDR: uUKYaD49SCtX1gaC9z4WqeJOFwaRh4sGVdr/krubTmPTeHkUpj586unVdshXtDqbPbMCwS4baS RQ+pihPLde3g== X-IronPort-AV: E=McAfee;i="6000,8403,9775"; a="162995490" X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="162995490" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2020 14:48:39 -0700 IronPort-SDR: GptQWj41RzybP8sQrXIC1goAgkGOLxWeJddmHzIEFPm0JhiybKXFxoqMrIpjDuTxHb+glL2Jyb Ox8VgvoEbhGg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="319188322" Received: from cst-dev.jf.intel.com ([10.23.221.69]) by orsmga006.jf.intel.com with ESMTP; 15 Oct 2020 14:48:39 -0700 From: Jianxin Xiong To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org Cc: Jianxin Xiong , Doug Ledford , Jason Gunthorpe , Leon Romanovsky , Sumit Semwal , Christian Koenig , Daniel Vetter Subject: [PATCH v5 2/5] RDMA/core: Add device method for registering dma-buf base memory region Date: Thu, 15 Oct 2020 15:02:51 -0700 Message-Id: <1602799371-138238-1-git-send-email-jianxin.xiong@intel.com> X-Mailer: git-send-email 1.8.3.1 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Dma-buf based memory region requires one extra parameter and is processed quite differently. 
Adding a separate method allows clean separation from regular memory regions. Signed-off-by: Jianxin Xiong Reviewed-by: Sean Hefty Acked-by: Michael J. Ruhl Acked-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/infiniband/core/device.c | 1 + include/rdma/ib_verbs.h | 6 +++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index feaec8d..d6cd0ac 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -2653,6 +2653,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) SET_DEVICE_OP(dev_ops, read_counters); SET_DEVICE_OP(dev_ops, reg_dm_mr); SET_DEVICE_OP(dev_ops, reg_user_mr); + SET_DEVICE_OP(dev_ops, reg_user_mr_dmabuf); SET_DEVICE_OP(dev_ops, req_ncomp_notif); SET_DEVICE_OP(dev_ops, req_notify_cq); SET_DEVICE_OP(dev_ops, rereg_user_mr); diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 9bf6c31..48bab74 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2,7 +2,7 @@ /* * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2004 Infinicon Corporation. All rights reserved. - * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004, 2020 Intel Corporation. All rights reserved. * Copyright (c) 2004 Topspin Corporation. All rights reserved. * Copyright (c) 2004 Voltaire Corporation. All rights reserved. * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved. @@ -2429,6 +2429,10 @@ struct ib_device_ops { struct ib_mr *(*reg_user_mr)(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int mr_access_flags, struct ib_udata *udata); + struct ib_mr *(*reg_user_mr_dmabuf)(struct ib_pd *pd, u64 start, + u64 length, u64 virt_addr, int dmabuf_fd, + int mr_access_flags, + struct ib_udata *udata); int (*rereg_user_mr)(struct ib_mr *mr, int flags, u64 start, u64 length, u64 virt_addr, int mr_access_flags, struct ib_pd *pd, struct ib_udata *udata); From patchwork Thu Oct 15 22:02:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Xiong, Jianxin" X-Patchwork-Id: 11840227 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 24AD914B4 for ; Thu, 15 Oct 2020 21:48:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0DDCE20776 for ; Thu, 15 Oct 2020 21:48:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731632AbgJOVso (ORCPT ); Thu, 15 Oct 2020 17:48:44 -0400 Received: from mga02.intel.com ([134.134.136.20]:11194 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727684AbgJOVso (ORCPT ); Thu, 15 Oct 2020 17:48:44 -0400 IronPort-SDR: YRUW0/dZbC0u+4LWd2y5NOqgddgkscpbHUHksUGYkgz4PjrYi4Lc1CGTWIaPJeikilmcneYCJ3 KJopDWOBtXMw== X-IronPort-AV: E=McAfee;i="6000,8403,9775"; a="153389425" X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="153389425" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2020 14:48:42 -0700 IronPort-SDR: 0FA0/3+IeibxsYcogA3VhlR/OXYiiCyNvLbgLDuKbONAojg/s68SkHIIQygSKq2rC7mk/++56i 0waijpAHLglw== X-ExtLoop1: 1 
X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="357144616" Received: from cst-dev.jf.intel.com ([10.23.221.69]) by FMSMGA003.fm.intel.com with ESMTP; 15 Oct 2020 14:48:41 -0700 From: Jianxin Xiong To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org Cc: Jianxin Xiong , Doug Ledford , Jason Gunthorpe , Leon Romanovsky , Sumit Semwal , Christian Koenig , Daniel Vetter Subject: [PATCH v5 3/5] RDMA/uverbs: Add uverbs command for dma-buf based MR registration Date: Thu, 15 Oct 2020 15:02:55 -0700 Message-Id: <1602799375-138277-1-git-send-email-jianxin.xiong@intel.com> X-Mailer: git-send-email 1.8.3.1 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Implement a new uverbs ioctl method for memory registration with file descriptor as an extra parameter. Signed-off-by: Jianxin Xiong Reviewed-by: Sean Hefty Acked-by: Michael J. Ruhl Acked-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/infiniband/core/uverbs_std_types_mr.c | 112 ++++++++++++++++++++++++++ include/uapi/rdma/ib_user_ioctl_cmds.h | 14 ++++ 2 files changed, 126 insertions(+) diff --git a/drivers/infiniband/core/uverbs_std_types_mr.c b/drivers/infiniband/core/uverbs_std_types_mr.c index 9b22bb5..e54459f 100644 --- a/drivers/infiniband/core/uverbs_std_types_mr.c +++ b/drivers/infiniband/core/uverbs_std_types_mr.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -178,6 +179,85 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_MR)( return IS_UVERBS_COPY_ERR(ret) ? ret : 0; } +static int UVERBS_HANDLER(UVERBS_METHOD_REG_DMABUF_MR)( + struct uverbs_attr_bundle *attrs) +{ + struct ib_uobject *uobj = + uverbs_attr_get_uobject(attrs, UVERBS_ATTR_REG_DMABUF_MR_HANDLE); + struct ib_pd *pd = + uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE); + struct ib_device *ib_dev = pd->device; + + u64 start, length, virt_addr; + u32 fd, access_flags; + struct ib_mr *mr; + int ret; + + if (!ib_dev->ops.reg_user_mr_dmabuf) + return -EOPNOTSUPP; + + ret = uverbs_copy_from(&start, attrs, + UVERBS_ATTR_REG_DMABUF_MR_ADDR); + if (ret) + return ret; + + ret = uverbs_copy_from(&length, attrs, + UVERBS_ATTR_REG_DMABUF_MR_LENGTH); + if (ret) + return ret; + + ret = uverbs_copy_from(&virt_addr, attrs, + UVERBS_ATTR_REG_DMABUF_MR_HCA_VA); + if (ret) + return ret; + + ret = uverbs_copy_from(&fd, attrs, + UVERBS_ATTR_REG_DMABUF_MR_FD); + if (ret) + return ret; + + ret = uverbs_get_flags32(&access_flags, attrs, + UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, + IB_ACCESS_SUPPORTED); + if (ret) + return ret; + + ret = ib_check_mr_access(access_flags); + if (ret) + return ret; + + mr = pd->device->ops.reg_user_mr_dmabuf(pd, start, length, virt_addr, + fd, access_flags, + &attrs->driver_udata); + if (IS_ERR(mr)) + return PTR_ERR(mr); + + mr->device = pd->device; + mr->pd = pd; + mr->type = IB_MR_TYPE_USER; + mr->uobject = uobj; + atomic_inc(&pd->usecnt); + + uobj->object = mr; + + ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, + &mr->lkey, sizeof(mr->lkey)); + if (ret) + goto err_dereg; + + ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, + &mr->rkey, sizeof(mr->rkey)); + if (ret) + goto err_dereg; + + return 0; + +err_dereg: + ib_dereg_mr_user(mr, uverbs_get_cleared_udata(attrs)); + + return ret; +} + DECLARE_UVERBS_NAMED_METHOD( 
UVERBS_METHOD_ADVISE_MR, UVERBS_ATTR_IDR(UVERBS_ATTR_ADVISE_MR_PD_HANDLE, @@ -243,6 +323,37 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_MR)( UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); +DECLARE_UVERBS_NAMED_METHOD( + UVERBS_METHOD_REG_DMABUF_MR, + UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, + UVERBS_OBJECT_MR, + UVERBS_ACCESS_NEW, + UA_MANDATORY), + UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, + UVERBS_OBJECT_PD, + UVERBS_ACCESS_READ, + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_ADDR, + UVERBS_ATTR_TYPE(u64), + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_LENGTH, + UVERBS_ATTR_TYPE(u64), + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_HCA_VA, + UVERBS_ATTR_TYPE(u64), + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_FD, + UVERBS_ATTR_TYPE(u32), + UA_MANDATORY), + UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, + enum ib_access_flags), + UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, + UVERBS_ATTR_TYPE(u32), + UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, + UVERBS_ATTR_TYPE(u32), + UA_MANDATORY)); + DECLARE_UVERBS_NAMED_METHOD_DESTROY( UVERBS_METHOD_MR_DESTROY, UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_MR_HANDLE, @@ -253,6 +364,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_MR)( DECLARE_UVERBS_NAMED_OBJECT( UVERBS_OBJECT_MR, UVERBS_TYPE_ALLOC_IDR(uverbs_free_mr), + &UVERBS_METHOD(UVERBS_METHOD_REG_DMABUF_MR), &UVERBS_METHOD(UVERBS_METHOD_DM_MR_REG), &UVERBS_METHOD(UVERBS_METHOD_MR_DESTROY), &UVERBS_METHOD(UVERBS_METHOD_ADVISE_MR), diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h index 7968a18..7e07f52 100644 --- a/include/uapi/rdma/ib_user_ioctl_cmds.h +++ b/include/uapi/rdma/ib_user_ioctl_cmds.h @@ -1,5 +1,6 @@ /* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU @@ -251,6 +252,7 @@ enum uverbs_methods_mr { UVERBS_METHOD_MR_DESTROY, UVERBS_METHOD_ADVISE_MR, UVERBS_METHOD_QUERY_MR, + UVERBS_METHOD_REG_DMABUF_MR, }; enum uverbs_attrs_mr_destroy_ids { @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids { UVERBS_ATTR_QUERY_MR_RESP_IOVA, }; +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids { + UVERBS_ATTR_REG_DMABUF_MR_HANDLE, + UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, + UVERBS_ATTR_REG_DMABUF_MR_ADDR, + UVERBS_ATTR_REG_DMABUF_MR_LENGTH, + UVERBS_ATTR_REG_DMABUF_MR_HCA_VA, + UVERBS_ATTR_REG_DMABUF_MR_FD, + UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, + UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, + UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, +}; + enum uverbs_attrs_create_counters_cmd_attr_ids { UVERBS_ATTR_CREATE_COUNTERS_HANDLE, }; From patchwork Thu Oct 15 22:02:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Xiong, Jianxin" X-Patchwork-Id: 11840229 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D132A17CA for ; Thu, 15 Oct 2020 21:48:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B96142078A for ; Thu, 15 Oct 2020 21:48:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731635AbgJOVsp (ORCPT ); Thu, 15 Oct 2020 17:48:45 -0400 Received: from mga04.intel.com ([192.55.52.120]:32841 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727684AbgJOVsp (ORCPT ); Thu, 15 Oct 2020 17:48:45 -0400 IronPort-SDR: DhnF47/D2WwfemuXo1bzu9RGKkjPl0h0/kiwz3kEqgR36x4lXT0GXIRKyJP7g2BOCEg1ntz4GK GbZB1zAoc0KQ== X-IronPort-AV: E=McAfee;i="6000,8403,9775"; a="163843398" X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="163843398" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2020 14:48:44 -0700 IronPort-SDR: vok97SB612IKNty+hKFTeLPWscotwbyZk51Iq2gspUz7IAHX8YxE4FZsG57biXWy3xZcCiBCRS 8VqJ2jZ7JZng== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,380,1596524400"; d="scan'208";a="357866565" Received: from cst-dev.jf.intel.com ([10.23.221.69]) by orsmga007.jf.intel.com with ESMTP; 15 Oct 2020 14:48:44 -0700 From: Jianxin Xiong To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org Cc: Jianxin Xiong , Doug Ledford , Jason Gunthorpe , Leon Romanovsky , Sumit Semwal , Christian Koenig , Daniel Vetter Subject: [PATCH v5 4/5] RDMA/mlx5: Support dma-buf based userspace memory region Date: Thu, 15 Oct 2020 15:02:58 -0700 Message-Id: <1602799378-138316-1-git-send-email-jianxin.xiong@intel.com> X-Mailer: git-send-email 1.8.3.1 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Implement the new driver method 'reg_user_mr_dmabuf'. Utilize the core functions to import dma-buf based memory region and update the mappings. Add code to handle dma-buf related page fault. Signed-off-by: Jianxin Xiong Reviewed-by: Sean Hefty Acked-by: Michael J. 
Ruhl Acked-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/infiniband/hw/mlx5/main.c | 2 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 5 ++ drivers/infiniband/hw/mlx5/mr.c | 119 +++++++++++++++++++++++++++++++++++ drivers/infiniband/hw/mlx5/odp.c | 42 +++++++++++++ 4 files changed, 168 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 89e04ca..ec4ad2f 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. */ #include @@ -4060,6 +4061,7 @@ static int mlx5_ib_enable_driver(struct ib_device *dev) .query_srq = mlx5_ib_query_srq, .query_ucontext = mlx5_ib_query_ucontext, .reg_user_mr = mlx5_ib_reg_user_mr, + .reg_user_mr_dmabuf = mlx5_ib_reg_user_mr_dmabuf, .req_notify_cq = mlx5_ib_arm_cq, .rereg_user_mr = mlx5_ib_rereg_user_mr, .resize_cq = mlx5_ib_resize_cq, diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index b1f2b34..65fcc18 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. */ #ifndef MLX5_IB_H @@ -1174,6 +1175,10 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_udata *udata); +struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start, + u64 length, u64 virt_addr, + int dmabuf_fd, int access_flags, + struct ib_udata *udata); int mlx5_ib_advise_mr(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice, u32 flags, diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index b261797..24750f1 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU @@ -1462,6 +1463,124 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, return ERR_PTR(err); } +static int mlx5_ib_umem_dmabuf_xlt_init(struct ib_umem *umem, void *context) +{ + struct mlx5_ib_mr *mr = context; + int flags = MLX5_IB_UPD_XLT_ENABLE; + + if (!mr) + return -EINVAL; + + return mlx5_ib_update_xlt(mr, 0, mr->npages, PAGE_SHIFT, flags); +} + +static int mlx5_ib_umem_dmabuf_xlt_update(struct ib_umem *umem, void *context) +{ + struct mlx5_ib_mr *mr = context; + int flags = MLX5_IB_UPD_XLT_ATOMIC; + + if (!mr) + return -EINVAL; + + return mlx5_ib_update_xlt(mr, 0, mr->npages, PAGE_SHIFT, flags); +} + +static int mlx5_ib_umem_dmabuf_xlt_invalidate(struct ib_umem *umem, void *context) +{ + struct mlx5_ib_mr *mr = context; + int flags = MLX5_IB_UPD_XLT_ZAP | MLX5_IB_UPD_XLT_ATOMIC; + + if (!mr) + return -EINVAL; + + return mlx5_ib_update_xlt(mr, 0, mr->npages, PAGE_SHIFT, flags); +} + +static struct ib_umem_dmabuf_ops mlx5_ib_umem_dmabuf_ops = { + .init = mlx5_ib_umem_dmabuf_xlt_init, + .update = mlx5_ib_umem_dmabuf_xlt_update, + .invalidate = mlx5_ib_umem_dmabuf_xlt_invalidate, +}; + +struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start, + u64 length, u64 virt_addr, + int dmabuf_fd, int access_flags, + struct ib_udata *udata) +{ + struct mlx5_ib_dev *dev = to_mdev(pd->device); + struct mlx5_ib_mr *mr = NULL; + struct ib_umem *umem; + int page_shift; + int npages; + int ncont; + int order; + int err; + + if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM)) + return ERR_PTR(-EOPNOTSUPP); + + mlx5_ib_dbg(dev, + "start 0x%llx, virt_addr 0x%llx, length 0x%llx, fd %d, access_flags 0x%x\n", + start, virt_addr, length, dmabuf_fd, access_flags); + + if (!mlx5_ib_can_load_pas_with_umr(dev, length)) + return ERR_PTR(-EINVAL); + + umem = ib_umem_dmabuf_get(&dev->ib_dev, start, length, dmabuf_fd, + access_flags, &mlx5_ib_umem_dmabuf_ops); + if (IS_ERR(umem)) { + mlx5_ib_dbg(dev, "umem get failed (%ld)\n", PTR_ERR(umem)); + return ERR_PTR(PTR_ERR(umem)); + } + + npages = ib_umem_num_pages(umem); + if (!npages) { + mlx5_ib_warn(dev, "avoid zero region\n"); + ib_umem_release(umem); + return ERR_PTR(-EINVAL); + } + + page_shift = PAGE_SHIFT; + ncont = npages; + order = ilog2(roundup_pow_of_two(ncont)); + + mlx5_ib_dbg(dev, "npages %d, ncont %d, order %d, page_shift %d\n", + npages, ncont, order, page_shift); + + mr = alloc_mr_from_cache(pd, umem, virt_addr, length, ncont, + page_shift, order, access_flags); + if (IS_ERR(mr)) + mr = NULL; + + if (!mr) { + mutex_lock(&dev->slow_path_mutex); + mr = reg_create(NULL, pd, virt_addr, length, umem, ncont, + page_shift, access_flags, false); + mutex_unlock(&dev->slow_path_mutex); + } + + if (IS_ERR(mr)) { + err = PTR_ERR(mr); + goto error; + } + + mlx5_ib_dbg(dev, "mkey 0x%x\n", mr->mmkey.key); + + mr->umem = umem; + set_mr_fields(dev, mr, npages, length, access_flags); + + err = ib_umem_dmabuf_init_mapping(umem, mr); + if (err) { + dereg_mr(dev, mr); + return ERR_PTR(err); + } + + return &mr->ibmr; +error: + ib_umem_release(umem); + return ERR_PTR(err); +} + /** * mlx5_mr_cache_invalidate - Fence all DMA on the MR * @mr: The MR to fence diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 5c853ec..16e2e51 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -801,6 +801,44 @@ static int pagefault_implicit_mr(struct mlx5_ib_mr *imr, * Returns: * -EFAULT: The io_virt->bcnt is not within the MR, it covers 
pages that are * not accessible, or the MR is no longer valid. + * -EAGAIN: The operation should be retried + * + * >0: Number of pages mapped + */ +static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, struct ib_umem *umem, + u64 io_virt, size_t bcnt, u32 *bytes_mapped, + u32 flags) +{ + u64 user_va; + u64 end; + int npages; + + if (unlikely(io_virt < mr->mmkey.iova)) + return -EFAULT; + if (check_add_overflow(io_virt - mr->mmkey.iova, + (u64)umem->address, &user_va)) + return -EFAULT; + + /* Overflow has already been checked at the umem creation time */ + end = umem->address + umem->length; + if (unlikely(user_va >= end || end - user_va < bcnt)) + return -EFAULT; + + if (!ib_umem_dmabuf_mapping_ready(umem)) + return -EAGAIN; + + if (bytes_mapped) + *bytes_mapped += bcnt; + + npages = (ALIGN(user_va + bcnt, PAGE_SIZE) - + ALIGN_DOWN(user_va, PAGE_SIZE)) >> PAGE_SHIFT; + return npages; +} + +/* + * Returns: + * -EFAULT: The io_virt->bcnt is not within the MR, it covers pages that are + * not accessible, or the MR is no longer valid. + * -EAGAIN/-ENOMEM: The operation should be retried * * -EINVAL/others: General internal malfunction @@ -811,6 +849,10 @@ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt, { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem); + if (mr->umem->is_dmabuf) + return pagefault_dmabuf_mr(mr, mr->umem, io_virt, bcnt, + bytes_mapped, flags); + lockdep_assert_held(&mr->dev->odp_srcu); if (unlikely(io_virt < mr->mmkey.iova)) return -EFAULT;
From patchwork Thu Oct 15 22:03:00 2020
X-Patchwork-Submitter: "Xiong, Jianxin"
X-Patchwork-Id: 11840231
X-Patchwork-Delegate: jgg@ziepe.ca
From: Jianxin Xiong
To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org
Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal, Christian Koenig, Daniel Vetter
Subject: [PATCH v5 5/5] dma-buf: Clarify that dma-buf sg lists are
page aligned
Date: Thu, 15 Oct 2020 15:03:00 -0700
Message-Id: <1602799380-138355-1-git-send-email-jianxin.xiong@intel.com>

The dma-buf API has been used under the assumption that the sg lists returned from dma_buf_map_attachment() are fully page aligned. Lots of stuff can break otherwise all over the place. Clarify this in the documentation and add a check when DMA API debug is enabled.

Signed-off-by: Jianxin Xiong Reviewed-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/dma-buf/dma-buf.c | 21 +++++++++++++++++++++ include/linux/dma-buf.h | 3 ++- 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 844967f..7309c83 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -851,6 +851,9 @@ void dma_buf_unpin(struct dma_buf_attachment *attach) * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR * on error. May return -EINTR if it is interrupted by a signal. * + * On success, the DMA addresses and lengths in the returned scatterlist are + * PAGE_SIZE aligned. + * * A mapping must be unmapped by using dma_buf_unmap_attachment(). Note that * the underlying backing storage is pinned for as long as a mapping exists, * therefore users/importers should not hold onto a mapping for undue amounts of @@ -904,6 +907,24 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, attach->dir = direction; } +#ifdef CONFIG_DMA_API_DEBUG + { + struct scatterlist *sg; + u64 addr; + int len; + int i; + + for_each_sgtable_dma_sg(sg_table, sg, i) { + addr = sg_dma_address(sg); + len = sg_dma_len(sg); + if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(len)) { + pr_debug("%s: addr %llx or len %x is not page aligned!\n", + __func__, addr, len); + } + } + } +#endif /* CONFIG_DMA_API_DEBUG */ + return sg_table; } EXPORT_SYMBOL_GPL(dma_buf_map_attachment); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index a2ca294e..4a5fa70 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -145,7 +145,8 @@ struct dma_buf_ops { * * A &sg_table scatter list of or the backing storage of the DMA buffer, * already mapped into the device address space of the &device attached - with the provided &dma_buf_attachment. + with the provided &dma_buf_attachment. The addresses and lengths in + the scatter list are PAGE_SIZE aligned. * * On failure, returns a negative error value wrapped into a pointer. * May also return -EINTR when a signal was received while being