From patchwork Mon Nov  6 06:15:40 2023
X-Patchwork-Submitter: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
X-Patchwork-Id: 13446262
From: Vivek Kasireddy <vivek.kasireddy@intel.com>
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Vivek Kasireddy, David Hildenbrand, Daniel Vetter, Mike Kravetz,
    Hugh Dickins, Peter Xu, Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim,
    Junxiao Chang
Subject: [PATCH v2 2/3] udmabuf: Pin the pages using pin_user_pages_fd() API (v2)
Date: Sun, 5 Nov 2023 22:15:40 -0800
Message-Id: <20231106061541.507116-3-vivek.kasireddy@intel.com>
In-Reply-To: <20231106061541.507116-1-vivek.kasireddy@intel.com>
References: <20231106061541.507116-1-vivek.kasireddy@intel.com>
MIME-Version: 1.0
Using pin_user_pages_fd() ensures that the pages are pinned correctly
with FOLL_PIN. This also ensures that we don't accidentally break
features such as memory hotunplug, as it does not allow pinning pages
in the movable zone.

This patch also adds back support for mapping hugetlbfs pages by noting
the subpage offsets within the huge pages and using this information
while populating the scatterlist.

v2:
- Adjust to the change in signature of pin_user_pages_fd() by passing
  in a file * instead of an fd.
Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/dma-buf/udmabuf.c | 81 +++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 820c993c8659..aa47af2b547f 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -28,6 +29,7 @@ struct udmabuf {
 	struct page **pages;
 	struct sg_table *sg;
 	struct miscdevice *device;
+	pgoff_t *subpgoff;
 };

 static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
@@ -90,23 +92,31 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
 {
 	struct udmabuf *ubuf = buf->priv;
 	struct sg_table *sg;
+	struct scatterlist *sgl;
+	pgoff_t offset;
+	unsigned long i = 0;
 	int ret;

 	sg = kzalloc(sizeof(*sg), GFP_KERNEL);
 	if (!sg)
 		return ERR_PTR(-ENOMEM);
-	ret = sg_alloc_table_from_pages(sg, ubuf->pages, ubuf->pagecount,
-					0, ubuf->pagecount << PAGE_SHIFT,
-					GFP_KERNEL);
+
+	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
 	if (ret < 0)
-		goto err;
+		goto err_alloc;
+
+	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i) {
+		offset = ubuf->subpgoff ? ubuf->subpgoff[i] : 0;
+		sg_set_page(sgl, ubuf->pages[i], PAGE_SIZE, offset);
+	}

 	ret = dma_map_sgtable(dev, sg, direction, 0);
 	if (ret < 0)
-		goto err;
+		goto err_map;
 	return sg;
-err:
+err_map:
 	sg_free_table(sg);
+err_alloc:
 	kfree(sg);
 	return ERR_PTR(ret);
 }
@@ -142,7 +152,9 @@ static void release_udmabuf(struct dma_buf *buf)
 		put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);

 	for (pg = 0; pg < ubuf->pagecount; pg++)
-		put_page(ubuf->pages[pg]);
+		unpin_user_page(ubuf->pages[pg]);
+
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 }
@@ -202,12 +214,13 @@ static long udmabuf_create(struct miscdevice *device,
 {
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	struct file *memfd = NULL;
-	struct address_space *mapping = NULL;
 	struct udmabuf *ubuf;
 	struct dma_buf *buf;
-	pgoff_t pgoff, pgcnt, pgidx, pgbuf = 0, pglimit;
-	struct page *page;
-	int seals, ret = -EINVAL;
+	pgoff_t pgoff, pgcnt, pgbuf = 0, pglimit, nr_pages;
+	pgoff_t subpgoff, maxsubpgs;
+	struct hstate *hpstate;
+	long ret = -EINVAL;
+	int seals;
 	u32 i, flags;

 	ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
@@ -241,8 +254,7 @@ static long udmabuf_create(struct miscdevice *device,
 		memfd = fget(list[i].memfd);
 		if (!memfd)
 			goto err;
-		mapping = memfd->f_mapping;
-		if (!shmem_mapping(mapping))
+		if (!shmem_file(memfd) && !is_file_hugepages(memfd))
 			goto err;
 		seals = memfd_fcntl(memfd, F_GET_SEALS, 0);
 		if (seals == -EINVAL)
@@ -253,14 +265,40 @@ static long udmabuf_create(struct miscdevice *device,
 			goto err;
 		pgoff = list[i].offset >> PAGE_SHIFT;
 		pgcnt = list[i].size >> PAGE_SHIFT;
-		for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-			page = shmem_read_mapping_page(mapping, pgoff + pgidx);
-			if (IS_ERR(page)) {
-				ret = PTR_ERR(page);
+		if (is_file_hugepages(memfd)) {
+			if (!ubuf->subpgoff) {
+				ubuf->subpgoff = kmalloc_array(ubuf->pagecount,
+							       sizeof(*ubuf->subpgoff),
+							       GFP_KERNEL);
+				if (!ubuf->subpgoff) {
+					ret = -ENOMEM;
+					goto err;
+				}
+			}
+			hpstate = hstate_file(memfd);
+			pgoff = list[i].offset >> huge_page_shift(hpstate);
+			subpgoff = (list[i].offset &
+				    ~huge_page_mask(hpstate)) >> PAGE_SHIFT;
+			maxsubpgs = huge_page_size(hpstate) >> PAGE_SHIFT;
+		}
+
+		do {
+			nr_pages = shmem_file(memfd) ? pgcnt : 1;
+			ret = pin_user_pages_fd(memfd, pgoff, nr_pages,
+						ubuf->pages + pgbuf);
+			if (ret < 0)
 				goto err;
+
+			if (is_file_hugepages(memfd)) {
+				ubuf->subpgoff[pgbuf] = subpgoff << PAGE_SHIFT;
+				if (++subpgoff == maxsubpgs) {
+					subpgoff = 0;
+					pgoff++;
+				}
 			}
-			ubuf->pages[pgbuf++] = page;
-		}
+			pgbuf += nr_pages;
+			pgcnt -= nr_pages;
+		} while (pgcnt > 0);
 		fput(memfd);
 		memfd = NULL;
 	}
@@ -283,10 +321,11 @@ static long udmabuf_create(struct miscdevice *device,
 	return dma_buf_fd(buf, flags);

 err:
-	while (pgbuf > 0)
-		put_page(ubuf->pages[--pgbuf]);
+	while (pgbuf > 0 && ubuf->pages[--pgbuf])
+		unpin_user_page(ubuf->pages[pgbuf]);
 	if (memfd)
 		fput(memfd);
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 	return ret;