From patchwork Sat Jan 13 06:52:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kasireddy, Vivek"
X-Patchwork-Id: 13518879
From: Vivek Kasireddy
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Vivek Kasireddy, David Hildenbrand, Matthew Wilcox, Christoph Hellwig,
 Daniel Vetter, Mike Kravetz, Hugh Dickins, Peter Xu, Jason Gunthorpe,
 Gerd Hoffmann, Dongwon Kim, Junxiao Chang
Subject: [PATCH v11 0/8] mm/gup: Introduce memfd_pin_folios() for pinning memfd folios
Date: Fri, 12 Jan 2024 22:52:15 -0800
Message-Id: <20240113065223.1532987-1-vivek.kasireddy@intel.com>
X-Mailer: git-send-email 2.39.2

Currently, some drivers (e.g., Udmabuf) that want to longterm-pin the
pages/folios associated with a memfd do so by simply taking a
reference on them. This is not desirable because the pages/folios may
reside in the Movable zone or in a CMA block, where a bare reference
does not trigger the migration that a longterm pin requires. Having
drivers use the memfd_pin_folios() API instead ensures that the folios
are appropriately pinned via FOLL_PIN for longterm DMA.

This patchset also introduces a few helpers and converts the Udmabuf
driver to use folios and the memfd_pin_folios() API to longterm-pin
the folios for DMA. Two new Udmabuf selftests are also included to
test the driver and the new API.

---

Patchset overview:

Patch 1-2: GUP helpers to migrate and unpin one or more folios
Patch 3:   Introduce memfd_pin_folios() API
Patch 4-5: Udmabuf driver bug fixes for Qemu + hugetlb=on, blob=true case
Patch 6-8: Convert Udmabuf to use memfd_pin_folios() and add selftests

This series is based on drm-tip (6.7.0) and was tested using the
following methods:
- Run the subtests added in Patch 8
- Run Qemu (master) with the following options and a few additional
  patches to Spice:
  qemu-system-x86_64 -m 4096m....
  -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
  -spice port=3001,gl=on,disable-ticketing=on,preferred-codec=gstreamer:h264
  -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
  -machine memory-backend=mem1
- Run source ./run_vmtests.sh -t gup_test -a to check GUP regressions

Changelog:

v10 -> v11:
- Remove the version string from the patch subject (Andrew)
- Move the changelog from the patches into the cover letter
- Rearrange the patchset to have GUP patches at the beginning

v9 -> v10:
- Introduce and use unpin_folio(), unpin_folios() and
  check_and_migrate_movable_folios() helpers
- Use a list to track the folios that need to be unpinned in udmabuf

v8 -> v9: (suggestions from Matthew)
- Drop the extern while declaring memfd_alloc_folio()
- Fix memfd_alloc_folio() declaration to have it return struct folio *
  instead of struct page * when CONFIG_MEMFD_CREATE is not defined
- Use folio_pfn() on the folio instead of page_to_pfn() on head page
  in udmabuf
- Don't split the arguments to shmem_read_folio() on multiple lines
  in udmabuf

v7 -> v8: (suggestions from David)
- Have caller pass [start, end], max_folios instead of start, nr_pages
- Replace offsets array with just offset into the first page
- Add comments explaining the need for next_idx
- Pin (and return) the folio (via FOLL_PIN) only once

v6 -> v7:
- Rename this API to memfd_pin_folios() and make it return folios
  and offsets instead of pages (David)
- Don't continue processing the folios in the batch returned by
  filemap_get_folios_contig() if they do not have correct next_idx
- Add the R-b tag from Christoph

v5 -> v6: (suggestions from Christoph)
- Rename this API to memfd_pin_user_pages() to make it clear that it
  is intended for memfds
- Move the memfd page allocation helper from gup.c to memfd.c
- Fix indentation errors in memfd_pin_user_pages()
- For contiguous ranges of folios, use a helper such as
  filemap_get_folios_contig() to lookup the page cache in batches
- Split the processing
  of hugetlb or shmem pages into helpers to simplify the code in
  udmabuf_create()

v4 -> v5: (suggestions from David)
- For hugetlb case, ensure that we only obtain head pages from the
  mapping by using __filemap_get_folio() instead of
  find_get_page_flags()
- Handle -EEXIST when two or more potential users try to
  simultaneously add a huge page to the mapping by forcing them to
  retry on failure

v3 -> v4:
- Remove the local variable "page" and instead use 3 return statements
  in alloc_file_page() (David)
- Add the R-b tag from David

v2 -> v3: (suggestions from David)
- Enclose the huge page allocation code with #ifdef CONFIG_HUGETLB_PAGE
  (Build error reported by kernel test robot)
- Don't forget memalloc_pin_restore() on non-migration related errors
- Improve the readability of the cleanup code associated with
  non-migration related errors
- Augment the comments by describing FOLL_LONGTERM like behavior
- Include the R-b tag from Jason

v1 -> v2:
- Drop gup_flags and improve comments and commit message (David)
- Allocate a page if we cannot find in page cache for the hugetlbfs
  case as well (David)
- Don't unpin pages if there is a migration related failure (David)
- Drop the unnecessary nr_pages <= 0 check (Jason)
- Have the caller of the API pass in file * instead of fd (Jason)

Cc: David Hildenbrand
Cc: Matthew Wilcox (Oracle)
Cc: Christoph Hellwig
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang

Vivek Kasireddy (8):
  mm/gup: Introduce unpin_folio/unpin_folios helpers
  mm/gup: Introduce check_and_migrate_movable_folios()
  mm/gup: Introduce memfd_pin_folios() for pinning memfd folios
  udmabuf: Use vmf_insert_pfn and VM_PFNMAP for handling mmap
  udmabuf: Add back support for mapping hugetlb pages
  udmabuf: Convert udmabuf driver to use folios
  udmabuf: Pin the pages using memfd_pin_folios() API
  selftests/udmabuf: Add tests to verify data after page migration

 drivers/dma-buf/udmabuf.c                     | 231 +++++++++---
 include/linux/memfd.h                         |   5 +
 include/linux/mm.h                            |   5 +
 mm/gup.c                                      | 346 +++++++++++++++---
 mm/memfd.c                                    |  34 ++
 .../selftests/drivers/dma-buf/udmabuf.c       | 151 +++++++-
 6 files changed, 662 insertions(+), 110 deletions(-)
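
Illustration (not part of the series): the calling convention noted in
the v7 -> v8 changelog, where the caller passes [start, end] and
max_folios and gets back a single offset into the first folio, can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions and not the kernel implementation: struct pin_plan
and plan_pin_range() are hypothetical names, every folio is assumed to
have the same size, and end is assumed to be an inclusive byte offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the range arithmetic only.
 * Nothing here pins memory or touches the page cache.
 */
struct pin_plan {
	uint64_t first_index;	/* index of the first folio in the range */
	uint64_t offset;	/* byte offset of start within that folio */
	uint64_t nr_folios;	/* folios covered, clamped to max_folios */
};

static struct pin_plan plan_pin_range(uint64_t start, uint64_t end,
				      uint64_t folio_size,
				      uint64_t max_folios)
{
	struct pin_plan plan;
	uint64_t last_index;

	/* Assumptions of this sketch: uniform folio size, start <= end. */
	assert(folio_size > 0);
	assert(start <= end);

	plan.first_index = start / folio_size;
	plan.offset = start % folio_size;

	/* end is treated as inclusive, so its folio is part of the range. */
	last_index = end / folio_size;
	plan.nr_folios = last_index - plan.first_index + 1;

	/* Never report more folios than the caller has room for. */
	if (plan.nr_folios > max_folios)
		plan.nr_folios = max_folios;

	return plan;
}
```

Under these assumptions, a range that starts 512 bytes into the second
2 MiB folio and ends at the last byte of the third one gives
first_index 1, offset 512 and nr_folios 2. Only one offset is needed
because every folio after the first is consumed from its beginning.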