From patchwork Thu Jun 13 21:42:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Kasireddy
X-Patchwork-Id: 13697527
From: Vivek Kasireddy
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Vivek Kasireddy, David Hildenbrand, Matthew Wilcox, Daniel Vetter,
 Hugh Dickins, Peter Xu, Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim,
 Junxiao Chang, Dave Airlie
Subject: [PATCH v15 8/9] udmabuf: Pin the pages using memfd_pin_folios() API
Date: Thu, 13 Jun 2024 14:42:10 -0700
Message-ID: <20240613214741.1029446-9-vivek.kasireddy@intel.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240613214741.1029446-1-vivek.kasireddy@intel.com>
References: <20240613214741.1029446-1-vivek.kasireddy@intel.com>

Using memfd_pin_folios() will ensure that the pages are pinned
correctly using FOLL_PIN. And, this also ensures that we don't
accidentally break features such as memory hotunplug as it would
not allow pinning pages in the movable zone.

Using this new API also simplifies the code as we no longer have
to deal with extracting individual pages from their mappings or
handle shmem and hugetlb cases separately.

Cc: David Hildenbrand
Cc: Matthew Wilcox
Cc: Daniel Vetter
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Acked-by: Dave Airlie
Acked-by: Gerd Hoffmann
Signed-off-by: Vivek Kasireddy
---
 drivers/dma-buf/udmabuf.c | 155 ++++++++++++++++++++------------------
 1 file changed, 80 insertions(+), 75 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index e67515808ed3..047c3cd2ceff 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -30,6 +30,12 @@ struct udmabuf {
 	struct sg_table *sg;
 	struct miscdevice *device;
 	pgoff_t *offsets;
+	struct list_head unpin_list;
+};
+
+struct udmabuf_folio {
+	struct folio *folio;
+	struct list_head list;
 };
 
 static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
@@ -153,17 +159,43 @@ static void unmap_udmabuf(struct dma_buf_attachment *at,
 	return put_sg_table(at->dev, sg, direction);
 }
 
+static void unpin_all_folios(struct list_head *unpin_list)
+{
+	struct udmabuf_folio *ubuf_folio;
+
+	while (!list_empty(unpin_list)) {
+		ubuf_folio = list_first_entry(unpin_list,
+					      struct udmabuf_folio, list);
+		unpin_folio(ubuf_folio->folio);
+
+		list_del(&ubuf_folio->list);
+		kfree(ubuf_folio);
+	}
+}
+
+static int add_to_unpin_list(struct list_head *unpin_list,
+			     struct folio *folio)
+{
+	struct udmabuf_folio *ubuf_folio;
+
+	ubuf_folio = kzalloc(sizeof(*ubuf_folio), GFP_KERNEL);
+	if (!ubuf_folio)
+		return -ENOMEM;
+
+	ubuf_folio->folio = folio;
+	list_add_tail(&ubuf_folio->list, unpin_list);
+	return 0;
+}
+
 static void release_udmabuf(struct dma_buf *buf)
 {
 	struct udmabuf *ubuf = buf->priv;
 	struct device *dev = ubuf->device->this_device;
-	pgoff_t pg;
 
 	if (ubuf->sg)
 		put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
 
-	for (pg = 0; pg < ubuf->pagecount; pg++)
-		folio_put(ubuf->folios[pg]);
+	unpin_all_folios(&ubuf->unpin_list);
 	kfree(ubuf->offsets);
 	kfree(ubuf->folios);
 	kfree(ubuf);
@@ -218,64 +250,6 @@ static const struct dma_buf_ops udmabuf_ops = {
 #define SEALS_WANTED (F_SEAL_SHRINK)
 #define SEALS_DENIED (F_SEAL_WRITE)
 
-static int handle_hugetlb_pages(struct udmabuf *ubuf, struct file *memfd,
-				pgoff_t offset, pgoff_t pgcnt,
-				pgoff_t *pgbuf)
-{
-	struct hstate *hpstate = hstate_file(memfd);
-	pgoff_t mapidx = offset >> huge_page_shift(hpstate);
-	pgoff_t subpgoff = (offset & ~huge_page_mask(hpstate)) >> PAGE_SHIFT;
-	pgoff_t maxsubpgs = huge_page_size(hpstate) >> PAGE_SHIFT;
-	struct folio *folio = NULL;
-	pgoff_t pgidx;
-
-	mapidx <<= huge_page_order(hpstate);
-	for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-		if (!folio) {
-			folio = __filemap_get_folio(memfd->f_mapping,
-						    mapidx,
-						    FGP_ACCESSED, 0);
-			if (IS_ERR(folio))
-				return PTR_ERR(folio);
-		}
-
-		folio_get(folio);
-		ubuf->folios[*pgbuf] = folio;
-		ubuf->offsets[*pgbuf] = subpgoff << PAGE_SHIFT;
-		(*pgbuf)++;
-		if (++subpgoff == maxsubpgs) {
-			folio_put(folio);
-			folio = NULL;
-			subpgoff = 0;
-			mapidx += pages_per_huge_page(hpstate);
-		}
-	}
-
-	if (folio)
-		folio_put(folio);
-
-	return 0;
-}
-
-static int handle_shmem_pages(struct udmabuf *ubuf, struct file *memfd,
-			      pgoff_t offset, pgoff_t pgcnt,
-			      pgoff_t *pgbuf)
-{
-	pgoff_t pgidx, pgoff = offset >> PAGE_SHIFT;
-	struct folio *folio = NULL;
-
-	for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-		folio = shmem_read_folio(memfd->f_mapping, pgoff + pgidx);
-		if (IS_ERR(folio))
-			return PTR_ERR(folio);
-
-		ubuf->folios[*pgbuf] = folio;
-		(*pgbuf)++;
-	}
-
-	return 0;
-}
-
 static int check_memfd_seals(struct file *memfd)
 {
 	int seals;
@@ -321,16 +295,19 @@ static long udmabuf_create(struct miscdevice *device,
 			   struct udmabuf_create_list *head,
 			   struct udmabuf_create_item *list)
 {
-	pgoff_t pgcnt, pgbuf = 0, pglimit;
+	pgoff_t pgoff, pgcnt, pglimit, pgbuf = 0;
+	long nr_folios, ret = -EINVAL;
 	struct file *memfd = NULL;
+	struct folio **folios;
 	struct udmabuf *ubuf;
-	int ret = -EINVAL;
-	u32 i, flags;
+	u32 i, j, k, flags;
+	loff_t end;
 
 	ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
 	if (!ubuf)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&ubuf->unpin_list);
 	pglimit = (size_limit_mb * 1024 * 1024) >> PAGE_SHIFT;
 	for (i = 0; i < head->count; i++) {
 		if (!IS_ALIGNED(list[i].offset, PAGE_SIZE))
@@ -366,17 +343,46 @@ static long udmabuf_create(struct miscdevice *device,
 			goto err;
 
 		pgcnt = list[i].size >> PAGE_SHIFT;
-		if (is_file_hugepages(memfd))
-			ret = handle_hugetlb_pages(ubuf, memfd,
-						   list[i].offset,
-						   pgcnt, &pgbuf);
-		else
-			ret = handle_shmem_pages(ubuf, memfd,
-						 list[i].offset,
-						 pgcnt, &pgbuf);
-		if (ret < 0)
+		folios = kmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
+		if (!folios) {
+			ret = -ENOMEM;
 			goto err;
+		}
 
+		end = list[i].offset + (pgcnt << PAGE_SHIFT) - 1;
+		ret = memfd_pin_folios(memfd, list[i].offset, end,
+				       folios, pgcnt, &pgoff);
+		if (ret <= 0) {
+			kfree(folios);
+			if (!ret)
+				ret = -EINVAL;
+			goto err;
+		}
+
+		nr_folios = ret;
+		pgoff >>= PAGE_SHIFT;
+		for (j = 0, k = 0; j < pgcnt; j++) {
+			ubuf->folios[pgbuf] = folios[k];
+			ubuf->offsets[pgbuf] = pgoff << PAGE_SHIFT;
+
+			if (j == 0 || ubuf->folios[pgbuf-1] != folios[k]) {
+				ret = add_to_unpin_list(&ubuf->unpin_list,
+							folios[k]);
+				if (ret < 0) {
+					kfree(folios);
+					goto err;
+				}
+			}
+
+			pgbuf++;
+			if (++pgoff == folio_nr_pages(folios[k])) {
+				pgoff = 0;
+				if (++k == nr_folios)
+					break;
+			}
+		}
+
+		kfree(folios);
 		fput(memfd);
 		memfd = NULL;
 	}
@@ -389,10 +395,9 @@ static long udmabuf_create(struct miscdevice *device,
 	return ret;
 
 err:
-	while (pgbuf > 0)
-		folio_put(ubuf->folios[--pgbuf]);
 	if (memfd)
 		fput(memfd);
+	unpin_all_folios(&ubuf->unpin_list);
 	kfree(ubuf->offsets);
 	kfree(ubuf->folios);
 	kfree(ubuf);