From patchwork Tue Oct 3 07:44:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Kasireddy
X-Patchwork-Id: 13406982
From: Vivek Kasireddy
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Vivek Kasireddy, David Hildenbrand, Daniel Vetter, Mike Kravetz,
    Hugh Dickins, Peter Xu, Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim,
    Junxiao Chang
Subject: [PATCH v1 2/3] udmabuf: Pin the pages using pin_user_pages_fd() API
Date: Tue, 3 Oct 2023 00:44:46 -0700
Message-Id: <20231003074447.3245729-3-vivek.kasireddy@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
References: <20231003074447.3245729-1-vivek.kasireddy@intel.com>
MIME-Version: 1.0
Using pin_user_pages_fd() will ensure that the pages are pinned
correctly using FOLL_PIN. This also ensures that we don't accidentally
break features such as memory hotunplug, as it would not allow pinning
pages in the movable zone.

This patch also adds back support for mapping hugetlbfs pages by
noting the subpage offsets within the huge pages and using this
information while populating the scatterlist.

Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Signed-off-by: Vivek Kasireddy
---
 drivers/dma-buf/udmabuf.c | 82 +++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 21 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 820c993c8659..9ef1eaf4df4b 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -28,6 +29,7 @@ struct udmabuf {
 	struct page **pages;
 	struct sg_table *sg;
 	struct miscdevice *device;
+	pgoff_t *subpgoff;
 };
 
 static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
@@ -90,23 +92,31 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
 {
 	struct udmabuf *ubuf = buf->priv;
 	struct sg_table *sg;
+	struct scatterlist *sgl;
+	pgoff_t offset;
+	unsigned long i = 0;
 	int ret;
 
 	sg = kzalloc(sizeof(*sg), GFP_KERNEL);
 	if (!sg)
 		return ERR_PTR(-ENOMEM);
-	ret = sg_alloc_table_from_pages(sg, ubuf->pages, ubuf->pagecount,
-					0, ubuf->pagecount << PAGE_SHIFT,
-					GFP_KERNEL);
+
+	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
 	if (ret < 0)
-		goto err;
+		goto err_alloc;
+
+	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i) {
+		offset = ubuf->subpgoff ? ubuf->subpgoff[i] : 0;
+		sg_set_page(sgl, ubuf->pages[i], PAGE_SIZE, offset);
+	}
 
 	ret = dma_map_sgtable(dev, sg, direction, 0);
 	if (ret < 0)
-		goto err;
+		goto err_map;
 	return sg;
 
-err:
+err_map:
 	sg_free_table(sg);
+err_alloc:
 	kfree(sg);
 	return ERR_PTR(ret);
 }
@@ -142,7 +152,9 @@ static void release_udmabuf(struct dma_buf *buf)
 	put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
 
 	for (pg = 0; pg < ubuf->pagecount; pg++)
-		put_page(ubuf->pages[pg]);
+		unpin_user_page(ubuf->pages[pg]);
+
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 }
@@ -202,12 +214,13 @@ static long udmabuf_create(struct miscdevice *device,
 {
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	struct file *memfd = NULL;
-	struct address_space *mapping = NULL;
 	struct udmabuf *ubuf;
 	struct dma_buf *buf;
-	pgoff_t pgoff, pgcnt, pgidx, pgbuf = 0, pglimit;
-	struct page *page;
-	int seals, ret = -EINVAL;
+	pgoff_t pgoff, pgcnt, pgbuf = 0, pglimit, nr_pages;
+	pgoff_t subpgoff, maxsubpgs;
+	struct hstate *hpstate;
+	long ret = -EINVAL;
+	int seals;
 	u32 i, flags;
 
 	ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
@@ -241,8 +254,7 @@ static long udmabuf_create(struct miscdevice *device,
 		memfd = fget(list[i].memfd);
 		if (!memfd)
 			goto err;
-		mapping = memfd->f_mapping;
-		if (!shmem_mapping(mapping))
+		if (!shmem_file(memfd) && !is_file_hugepages(memfd))
 			goto err;
 		seals = memfd_fcntl(memfd, F_GET_SEALS, 0);
 		if (seals == -EINVAL)
@@ -253,14 +265,41 @@ static long udmabuf_create(struct miscdevice *device,
 			goto err;
 		pgoff = list[i].offset >> PAGE_SHIFT;
 		pgcnt = list[i].size >> PAGE_SHIFT;
-		for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-			page = shmem_read_mapping_page(mapping, pgoff + pgidx);
-			if (IS_ERR(page)) {
-				ret = PTR_ERR(page);
+		if (is_file_hugepages(memfd)) {
+			if (!ubuf->subpgoff) {
+				ubuf->subpgoff = kmalloc_array(ubuf->pagecount,
+							       sizeof(*ubuf->subpgoff),
+							       GFP_KERNEL);
+				if (!ubuf->subpgoff) {
+					ret = -ENOMEM;
+					goto err;
+				}
+			}
+			hpstate = hstate_file(memfd);
+			pgoff = list[i].offset >> huge_page_shift(hpstate);
+			subpgoff = (list[i].offset &
+				    ~huge_page_mask(hpstate)) >> PAGE_SHIFT;
+			maxsubpgs = huge_page_size(hpstate) >> PAGE_SHIFT;
+		}
+
+		do {
+			nr_pages = shmem_file(memfd) ? pgcnt : 1;
+			ret = pin_user_pages_fd(list[i].memfd, pgoff,
+						nr_pages, FOLL_LONGTERM,
+						ubuf->pages + pgbuf);
+			if (ret < 0)
 				goto err;
+
+			if (is_file_hugepages(memfd)) {
+				ubuf->subpgoff[pgbuf] = subpgoff << PAGE_SHIFT;
+				if (++subpgoff == maxsubpgs) {
+					subpgoff = 0;
+					pgoff++;
+				}
 			}
-			ubuf->pages[pgbuf++] = page;
-		}
+			pgbuf += nr_pages;
+			pgcnt -= nr_pages;
+		} while (pgcnt > 0);
 		fput(memfd);
 		memfd = NULL;
 	}
@@ -283,10 +322,11 @@ static long udmabuf_create(struct miscdevice *device,
 	return dma_buf_fd(buf, flags);
 
 err:
-	while (pgbuf > 0)
-		put_page(ubuf->pages[--pgbuf]);
+	while (pgbuf > 0 && ubuf->pages[--pgbuf])
+		unpin_user_page(ubuf->pages[pgbuf]);
 	if (memfd)
 		fput(memfd);
+	kfree(ubuf->subpgoff);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 	return ret;
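For reference, the hugetlb bookkeeping in the hunk above can be sketched in plain C. For a range described by (offset, size) in a hugetlbfs memfd, the patch derives a huge-page index (pgoff), a starting subpage index within that huge page (subpgoff), and the number of subpages per huge page (maxsubpgs), then records one byte offset per pinned PAGE_SIZE chunk in ubuf->subpgoff[]. The constants (4 KiB base pages, 2 MiB huge pages) and the standalone function name below are illustrative assumptions for a userspace model, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: 4 KiB base pages, 2 MiB huge pages. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define HPAGE_SHIFT	21
#define HPAGE_SIZE	(1UL << HPAGE_SHIFT)
#define HPAGE_MASK	(~(HPAGE_SIZE - 1))

/*
 * Model of the patch's bookkeeping: for each PAGE_SIZE chunk of the
 * requested range, record the byte offset of that subpage within its
 * huge page (what the patch stores in ubuf->subpgoff[] and later feeds
 * to sg_set_page()).
 */
static void fill_subpg_offsets(uint64_t offset, uint64_t size,
			       uint64_t *subpgoffs)
{
	uint64_t pgoff = offset >> HPAGE_SHIFT;		/* huge-page index */
	uint64_t subpgoff = (offset & ~HPAGE_MASK) >> PAGE_SHIFT;
	uint64_t maxsubpgs = HPAGE_SIZE >> PAGE_SHIFT;	/* subpages per huge page */
	uint64_t pgcnt = size >> PAGE_SHIFT;

	for (uint64_t pgbuf = 0; pgbuf < pgcnt; pgbuf++) {
		subpgoffs[pgbuf] = subpgoff << PAGE_SHIFT;
		if (++subpgoff == maxsubpgs) {
			subpgoff = 0;	/* wrap to the start of ... */
			pgoff++;	/* ... the next huge page */
		}
	}
}
```

For example, a 3-page range starting one base page into a huge page yields offsets PAGE_SIZE, 2*PAGE_SIZE, 3*PAGE_SIZE, and a range that crosses a huge-page boundary wraps subpgoff back to 0 while advancing pgoff.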