From patchwork Thu Aug 17 06:49:33 2023
X-Patchwork-Submitter: Vivek Kasireddy <vivek.kasireddy@intel.com>
X-Patchwork-Id: 13356024
From: Vivek Kasireddy
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Cc: Vivek Kasireddy, David Hildenbrand, Daniel Vetter, Mike Kravetz,
 Hugh Dickins, Peter Xu, Jason Gunthorpe, Gerd Hoffmann, Dongwon Kim,
 Junxiao Chang
Subject: [PATCH v1 2/3] udmabuf: Add support for page migration out of movable zone or CMA
Date: Wed, 16 Aug 2023 23:49:33 -0700
Message-Id: <20230817064934.3424431-3-vivek.kasireddy@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230817064934.3424431-1-vivek.kasireddy@intel.com>
References: <20230817064934.3424431-1-vivek.kasireddy@intel.com>
MIME-Version: 1.0

Since udmabuf can pin pages that reside in the movable zone or CMA, and
thereby break features such as memory hot-unplug, it makes sense to
migrate those pages out of these areas. To accomplish this, we note the
mapping and the index of each page and then call
check_and_migrate_movable_pages(). Because
check_and_migrate_movable_pages() unpins all the pages (and also
replaces the migrated pages in the mapping) upon successful migration,
we then retrieve each page from its associated mapping using the index
we noted down earlier and pin it again.
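For context, the long-term pins addressed here are taken in the
UDMABUF_CREATE ioctl path. A minimal userspace sketch of that flow
(illustrative only; error handling omitted, size assumed page-aligned):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

static int create_udmabuf(size_t size)
{
	struct udmabuf_create create = { 0 };
	int devfd, memfd, buffd;

	/* Backing memory comes from a memfd; udmabuf requires F_SEAL_SHRINK. */
	memfd = memfd_create("udmabuf-backing", MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	devfd = open("/dev/udmabuf", O_RDWR);

	create.memfd  = memfd;
	create.offset = 0;	/* must be page-aligned */
	create.size   = size;	/* must be a multiple of the page size */

	/*
	 * The ioctl is where the driver pins the memfd pages; with this
	 * patch, pages in the movable zone or CMA are migrated out first.
	 * On success it returns the new dma-buf fd.
	 */
	buffd = ioctl(devfd, UDMABUF_CREATE, &create);

	close(devfd);
	close(memfd);
	return buffd;
}
```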
Cc: David Hildenbrand
Cc: Daniel Vetter
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Peter Xu
Cc: Jason Gunthorpe
Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Junxiao Chang
Suggested-by: David Hildenbrand
Signed-off-by: Vivek Kasireddy
---
 drivers/dma-buf/udmabuf.c | 106 +++++++++++++++++++++++++++++++++++---
 1 file changed, 100 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 1a41c4a069ea..63912c73d122 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -30,6 +30,12 @@ struct udmabuf {
 	struct sg_table *sg;
 	struct miscdevice *device;
 	pgoff_t *subpgoff;
+	struct udmabuf_backing_info *backing;
+};
+
+struct udmabuf_backing_info {
+	struct address_space *mapping;
+	pgoff_t mapidx;
 };
 
 static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
@@ -156,8 +162,10 @@ static void release_udmabuf(struct dma_buf *buf)
 	put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
 
 	for (pg = 0; pg < ubuf->pagecount; pg++)
-		put_page(ubuf->pages[pg]);
+		unpin_user_page(ubuf->pages[pg]);
+
 	kfree(ubuf->subpgoff);
+	kfree(ubuf->backing);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 }
@@ -211,6 +219,76 @@ static const struct dma_buf_ops udmabuf_ops = {
 #define SEALS_WANTED (F_SEAL_SHRINK)
 #define SEALS_DENIED (F_SEAL_WRITE)
 
+static int udmabuf_pin_pages(struct udmabuf *ubuf)
+{
+	struct address_space *mapping;
+	struct folio *folio;
+	struct page *page;
+	pgoff_t pg, mapidx;
+	int ret;
+
+	for (pg = 0; pg < ubuf->pagecount; pg++) {
+		mapping = ubuf->backing[pg].mapping;
+		mapidx = ubuf->backing[pg].mapidx;
+
+		if (!ubuf->pages[pg]) {
+			page = find_get_page_flags(mapping, mapidx,
+						   FGP_ACCESSED);
+			if (!page) {
+				if (!shmem_mapping(mapping)) {
+					ret = -EINVAL;
+					goto err;
+				}
+
+				page = shmem_read_mapping_page(mapping,
+							       mapidx);
+				if (IS_ERR(page)) {
+					ret = PTR_ERR(page);
+					goto err;
+				}
+			}
+			ubuf->pages[pg] = page;
+		}
+
+		folio = page_folio(ubuf->pages[pg]);
+		if (folio_test_large(folio))
+			atomic_add(1, &folio->_pincount);
+		else
+			folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
+
+		/* Since we are doing the equivalent of FOLL_PIN above, we can
+		 * go ahead and release our (udmabuf) reference on the pages.
+		 * Otherwise, migrate_pages() will fail as it doesn't like the
+		 * extra reference.
+		 */
+		put_page(ubuf->pages[pg]);
+	}
+	return 0;
+
+err:
+	while (pg > 0 && ubuf->pages[--pg]) {
+		unpin_user_page(ubuf->pages[pg]);
+		ubuf->pages[pg] = NULL;
+	}
+	return ret;
+}
+
+static long udmabuf_migrate_pages(struct udmabuf *ubuf)
+{
+	long ret;
+
+	do {
+		ret = udmabuf_pin_pages(ubuf);
+		if (ret < 0)
+			break;
+
+		ret = check_and_migrate_movable_pages(ubuf->pagecount,
+						      ubuf->pages);
+	} while (ret == -EAGAIN);
+
+	return ret;
+}
+
 static long udmabuf_create(struct miscdevice *device,
 			   struct udmabuf_create_list *head,
 			   struct udmabuf_create_item *list)
@@ -224,7 +302,8 @@ static long udmabuf_create(struct miscdevice *device,
 	struct page *page, *hpage = NULL;
 	pgoff_t mapidx, chunkoff, maxchunks;
 	struct hstate *hpstate;
-	int seals, ret = -EINVAL;
+	long ret = -EINVAL;
+	int seals;
 	u32 i, flags;
 
 	ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
@@ -252,6 +331,13 @@ static long udmabuf_create(struct miscdevice *device,
 		goto err;
 	}
 
+	ubuf->backing = kmalloc_array(ubuf->pagecount, sizeof(*ubuf->backing),
+				      GFP_KERNEL);
+	if (!ubuf->backing) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
 	pgbuf = 0;
 	for (i = 0; i < head->count; i++) {
 		ret = -EBADFD;
@@ -298,7 +384,8 @@ static long udmabuf_create(struct miscdevice *device,
 			}
 			get_page(hpage);
 			ubuf->pages[pgbuf] = hpage;
-			ubuf->subpgoff[pgbuf++] = chunkoff << PAGE_SHIFT;
+			ubuf->subpgoff[pgbuf] = chunkoff << PAGE_SHIFT;
+			ubuf->backing[pgbuf].mapidx = mapidx;
 			if (++chunkoff == maxchunks) {
 				put_page(hpage);
 				hpage = NULL;
@@ -312,8 +399,10 @@ static long udmabuf_create(struct miscdevice *device,
 				ret = PTR_ERR(page);
 				goto err;
 			}
-			ubuf->pages[pgbuf++] = page;
+			ubuf->pages[pgbuf] = page;
+			ubuf->backing[pgbuf].mapidx = mapidx;
 		}
+		ubuf->backing[pgbuf++].mapping = mapping;
 	}
 	fput(memfd);
 	memfd = NULL;
@@ -323,6 +412,10 @@ static long udmabuf_create(struct miscdevice *device,
 		}
 	}
 
+	ret = udmabuf_migrate_pages(ubuf);
+	if (ret < 0)
+		goto err;
+
 	exp_info.ops = &udmabuf_ops;
 	exp_info.size = ubuf->pagecount << PAGE_SHIFT;
 	exp_info.priv = ubuf;
@@ -341,11 +434,12 @@ static long udmabuf_create(struct miscdevice *device,
 	return dma_buf_fd(buf, flags);
 
 err:
-	while (pgbuf > 0)
-		put_page(ubuf->pages[--pgbuf]);
+	while (pgbuf > 0 && ubuf->pages[--pgbuf])
+		put_page(ubuf->pages[pgbuf]);
 	if (memfd)
 		fput(memfd);
 	kfree(ubuf->subpgoff);
+	kfree(ubuf->backing);
 	kfree(ubuf->pages);
 	kfree(ubuf);
 	return ret;
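A note for reviewers on the folio_test_large()/GUP_PIN_COUNTING_BIAS
logic in udmabuf_pin_pages(): it open-codes the FOLL_PIN accounting
convention from mm/gup.c, which is what makes unpin_user_page() the
matching release primitive in release_udmabuf() and in the error path.
A rough sketch of the two sides of that convention, simplified from
try_grab_folio()/gup_put_folio() (illustrative only, not part of this
patch):

```c
/* Illustrative only -- simplified from mm/gup.c, not part of this patch. */

static void sketch_pin_folio(struct folio *folio)
{
	if (folio_test_large(folio)) {
		/* Large folios: a normal reference plus a _pincount tick. */
		folio_ref_add(folio, 1);
		atomic_add(1, &folio->_pincount);
	} else {
		/* Small folios: pins live in the refcount itself, in units
		 * of GUP_PIN_COUNTING_BIAS.
		 */
		folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
	}
}

static void sketch_unpin_folio(struct folio *folio)
{
	/* Mirrors gup_put_folio(folio, 1, FOLL_PIN), which is what
	 * unpin_user_page() boils down to.
	 */
	if (folio_test_large(folio)) {
		atomic_sub(1, &folio->_pincount);
		folio_put(folio);
	} else {
		folio_put_refs(folio, GUP_PIN_COUNTING_BIAS);
	}
}
```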