From patchwork Sat Mar 11 00:38:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Doug Berger
X-Patchwork-Id: 13170537
From: Doug Berger
To: Andrew Morton
Cc: Jonathan Corbet, Mike Rapoport, Borislav Petkov,
 "Paul E. McKenney", Randy Dunlap, Neeraj Upadhyay, Damien Le Moal,
 Kim Phillips, "Steven Rostedt (Google)", Michal Hocko,
 Johannes Weiner, Vlastimil Babka, KOSAKI Motohiro, Mel Gorman,
 Muchun Song, Mike Kravetz, Florian Fainelli, David Hildenbrand,
 Oscar Salvador, Joonsoo Kim, Sukadev Bhattiprolu, Rik van Riel,
 Roman Gushchin, Minchan Kim, Chris Goldsworthy, "Georgi Djakov",
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Doug Berger
Subject: [PATCH v4 0/9] mm: introduce Designated Movable Blocks
Date: Fri, 10 Mar 2023 16:38:46 -0800
Message-Id: <20230311003855.645684-1-opendmb@gmail.com>
X-Mailer: git-send-email 2.34.1

This is essentially a resubmission of v3 rebased with a rewritten
cover letter to hopefully clarify the submission based on feedback
and follow-on discussion. The individual patches have not
materially changed.

The Linux Memory Management system (MM) has long supported the
concept of movable memory. It takes advantage of address
abstraction to allow the data held in physical memory to be
moved to a different physical address or other form of storage
without the user of the abstracted (i.e. virtual) address
needing to be aware. This is generally the foundation of user
space memory and the basic service the kernel provides to
applications.

On the other hand, the kernel itself is generally not tolerant of
the movement of data that it accesses so most of its usage is
unmovable memory.
It may be useful to understand that this
terminology is relative to the kernel's perspective and that what
the kernel considers unmovable memory may in fact be moved by a
hypervisor that hosts the kernel, but an additional address
abstraction must exist to keep the kernel unaware of such
movement.

The MM supports the conversion of free memory between MOVABLE
and UNMOVABLE (and other) migration types to allow better
sharing of memory resources. More recently, the MM introduced
"movablecore" memory that should never be made UNMOVABLE. As an
implementation detail, "movablecore" memory introduced the
ZONE_MOVABLE zone to manage this type of memory, and significant
progress has been made to ensure the movability of memory in this
zone with the few exceptions now documented in
include/linux/mmzone.h.

"Movablecore" memory can support multiple use cases including
dynamic allocation of hugetlbfs pages, but an imbalance of
"movablecore" memory and kernel memory can lead to serious
consequences for kernel operation, which is why the kernel
parameter includes the warning "the administrator must be careful
that the amount of memory usable for all allocations is not too
small."

Designated Movable Blocks represent a generic extension of the
"movablecore" concept to allow specific blocks of memory to be
designated part of the "movablecore" to provide support for
additional use cases. For example, it could have been/could
still be used to support hot unplugging of memory. A very similar
concept was proposed in [1] for that purpose, and revised in [2],
but ultimately a more use case specific implementation of the
movable_node parameter was accepted. That implementation is
dependent on NUMA, ACPI, and SRAT tables, which narrows its
usefulness. Designated Movable Blocks allow for the same type of
discontiguous and non-monotonic configuration of ZONE_MOVABLE for
systems whether or not they support NUMA, ACPI, or SRAT tables.
Specifically, this feature is desired by users of the arm64
Android GKI common kernel on Broadcom SoCs where NUMA is not
available. These patches make minimal additions to existing code
to offer a controllable "movablecore" feature to those systems.

Like all "movablecore" memory, there are no Designated Movable
Blocks created by default. They are only created when specified
and the warning on the "movablecore" kernel parameter remains
just as relevant.

The key feature of "movablecore" memory is that any allocations
of the memory by the kernel page allocator must be movable, and
this has the follow-on effect that GFP_MOVABLE allocation
requests look to "movablecore" memory first. This prioritizes the
use of "movablecore" memory by user processes, though the kernel
can conceivably use the memory as long as movability can be
preserved.

One use case of interest to customers of Broadcom SoCs with
multiple memory controllers is improved memory bandwidth
utilization for multi-threaded user space dominant workloads.
Designated Movable Blocks can be located on each memory
controller and the page_alloc.shuffle=1 kernel parameter can be
applied to provide a simplistic software-based memory channel
interleaving of accesses from user space across the multiple
memory controllers. Experiments using this approach with a dummy
workload [3] on a BCM7278 dual memory controller system with 1GB
of RAM on each controller (i.e. 2GB total RAM) and using the
kernel parameters
"movablecore=300M@0x60000000,300M@0x320000000 page_alloc.shuffle=1"
showed a more than 20% performance improvement over a system
without this feature using either "movablecore=600M" or no
"movablecore" kernel parameter.

Another use case of interest is to add broader support for the
"reusable" parameter for reserved-memory device tree nodes. The
Designated Movable Block extension of movablecore would allow
designation of the location as well as ownership of the block.
A device driver that owns a reusable reserved-memory would own
the underlying portion of a Designated Movable Block and could
reclaim memory from the OS for use exclusively by the device on
demand in a manner similar to memory hot unplugging. The existing
alloc/free_contig_range functions could be used to support this
or a different API could be developed. This use case is mentioned
for consideration, but an implementation is not part of this
submission.

There have also been efforts to reduce the amounts of memory CMA
holds in reserve (e.g. [4]). Adding the ability to place a CMA
pool in a Designated Movable Block could offer an option to
improve memory utilization when increased allocation latency can
be tolerated, but again such an implementation is not part of
this submission.

Changes in v4:
  - rewrote the cover letter in an attempt to provide clarity and
    encourage review.
  - rebased to akpm-mm/master (i.e. Linux 6.3-rc1).

Changes in v3:
  - removed OTHER OPPORTUNITIES and NOTES from the cover letter.
  - prevent the creation of empty zones instead of adding extra
    info to zoneinfo.
  - size the ZONE_MOVABLE span to the minimum necessary to cover
    pages within the zone to be more intuitive.
  - removed "real" from variable names that were consolidated.
  - rebased to akpm-mm/master (i.e. Linux 6.1-rc1).

Changes in v2:
  - first three commits upstreamed separately.
  - commits 04-06 submitted separately.
  - Corrected errors "Reported-by: kernel test robot"
  - Deferred commits after 15 to simplify review of the base
    functionality.
  - minor reorganization of commit 13.

v3: https://lore.kernel.org/lkml/20221020215318.4193269-1-opendmb@gmail.com/
v2: https://lore.kernel.org/linux-mm/20220928223301.375229-1-opendmb@gmail.com/
v1: https://lore.kernel.org/linux-mm/20220913195508.3511038-1-opendmb@gmail.com/

[1] https://lwn.net/Articles/543790/
[2] https://lore.kernel.org/all/1374220774-29974-1-git-send-email-tangchen@cn.fujitsu.com/
[3] https://lore.kernel.org/lkml/342da4ea-d04a-996c-85c4-3065dd4dc01f@gmail.com/
[4] https://lore.kernel.org/linux-mm/20230131071052.GB19285@hu-sbhattip-lv.qualcomm.com/

Doug Berger (9):
  lib/show_mem.c: display MovableOnly
  mm/page_alloc: calculate node_spanned_pages from pfns
  mm/page_alloc: prevent creation of empty zones
  mm/page_alloc.c: allow oversized movablecore
  mm/page_alloc: introduce init_reserved_pageblock()
  memblock: introduce MEMBLOCK_MOVABLE flag
  mm/dmb: Introduce Designated Movable Blocks
  mm/page_alloc: make alloc_contig_pages DMB aware
  mm/page_alloc: allow base for movablecore

 .../admin-guide/kernel-parameters.txt |  14 +-
 include/linux/dmb.h                   |  29 +++
 include/linux/gfp.h                   |   5 +-
 include/linux/memblock.h              |   8 +
 lib/show_mem.c                        |   2 +-
 mm/Kconfig                            |  12 ++
 mm/Makefile                           |   1 +
 mm/cma.c                              |  15 +-
 mm/dmb.c                              |  91 +++++++++
 mm/memblock.c                         |  30 ++-
 mm/page_alloc.c                       | 188 +++++++++++++-----
 11 files changed, 338 insertions(+), 57 deletions(-)
 create mode 100644 include/linux/dmb.h
 create mode 100644 mm/dmb.c
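
For readers unfamiliar with the "size@base" form used in the
experiment's "movablecore=300M@0x60000000,300M@0x320000000"
parameter, the following is a minimal userspace sketch of how such
a string could be split into block descriptions. It is not the code
from these patches: struct dmb_range, parse_size() and
parse_movablecore() are hypothetical names chosen for illustration,
and parse_size() is only a rough stand-in for the kernel's
memparse() helper. The sketch assumes every entry carries an
explicit base, so the plain "movablecore=600M" form is rejected
here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of one designated block. */
struct dmb_range {
	uint64_t base;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * Userspace stand-in for the kernel's memparse(): parse a number
 * (decimal, or hex with a 0x prefix) followed by an optional
 * K, M or G suffix. *end is left pointing past what was consumed.
 */
static uint64_t parse_size(const char *s, char **end)
{
	uint64_t v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse "size@base[,size@base...]" into out[].
 * Returns the number of ranges found, or -1 on malformed input
 * or if more than max entries are present.
 */
static int parse_movablecore(const char *arg, struct dmb_range *out, int max)
{
	const char *p = arg;
	int n = 0;

	assert(arg != NULL && out != NULL);

	while (*p) {
		char *end;
		uint64_t size, base;

		if (n == max)
			return -1;

		size = parse_size(p, &end);
		if (end == p || *end != '@')
			return -1;	/* this sketch requires an explicit base */

		p = end + 1;
		base = parse_size(p, &end);
		if (end == p)
			return -1;

		out[n].size = size;
		out[n].base = base;
		n++;

		if (*end == ',') {
			p = end + 1;
			if (*p == '\0')
				return -1;	/* reject a trailing comma */
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;
		}
	}
	return n;
}
```

Passing the experiment's string to parse_movablecore() yields two
ranges of 300MB each, one per memory controller, at the bases
0x60000000 and 0x320000000.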