From patchwork Fri Sep 25 19:11:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 11800633
Subject: [PATCH v5 00/17] device-dax: support sub-dividing soft-reserved ranges
From: Dan Williams
To: akpm@linux-foundation.org
Cc: David Hildenbrand, Ira Weiny, Bjorn Helgaas, Vishal Verma, Dave Hansen,
 David Airlie, Vivek Goyal, Joao Martins, Dave Jiang, Jonathan Cameron,
 Greg Kroah-Hartman, Pavel Tatashin, Hulk Robot, Ben Skeggs,
 Benjamin Herrenschmidt, Jia He, Jérôme Glisse, Jason Yan, Paul Mackerras, Boris
 Ostrovsky, Brice Goglin, Stefano Stabellini, Michael Ellerman, Juergen Gross,
 Daniel Vetter, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Fri, 25 Sep 2020 12:11:39 -0700
Message-ID: <160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-3-g996c

Changes since v4 [1]:
- Rebased on
  device-dax-move-instance-creation-parameters-to-struct-dev_dax_data.patch
  in -mm [2]. I.e. patches that did not need fixups from v4 are not
  included.

- Folded all fixes

- Replaced "device-dax: kill dax_kmem_res" with:

      device-dax/kmem: introduce dax_kmem_range()
      device-dax/kmem: move resource name tracking to drvdata
      device-dax/kmem: replace release_resource() with release_mem_region()

  ...to address David's request to make those cleanups easier to review.
  Note that I dropped changes to how IORESOURCE_BUSY is manipulated since
  David and I are still debating the best way forward there.

- Broke out some of dax-bus reworks in "device-dax: introduce 'seed'
  devices" to a new "device-dax: introduce 'struct dev_dax' typed-driver
  operations"

- Added a conversion of xen_alloc_unallocated_pages() from pgmap.res to
  pgmap.range. I found it odd that there is no corresponding
  memunmap_pages() triggered by xen_free_unallocated_pages()?

- Not included, a conversion of virtio_fs to use pgmap.range for its new
  usage of devm_memremap_pages(). It appears the virtio_fs changes are
  merged after -mm? My mental model of -mm was that it applies on top of
  linux-next? In any event, Vivek, you will need to coordinate a
  conversion to pgmap.range for the virtio_fs dax-support merge. Maybe
  that should go through Andrew as well?
- Lowercase all the subject lines per akpm's preference

- Received a 0day robot build-success notification over 122 configs

- Thanks to Joao for looking after this set while I was out.

[1]: http://lore.kernel.org/r/159625229779.3040297.11363509688097221416.stgit@dwillia2-desk3.amr.corp.intel.com
[2]: https://ozlabs.org/~akpm/mmots/broken-out/device-dax-move-instance-creation-parameters-to-struct-dev_dax_data.patch

---

Andrew, this series replaces

    device-dax-make-pgmap-optional-for-instance-creation.patch

...through...

    dax-hmem-introduce-dax_hmemregion_idle-parameter.patch

...in your stack.

Let me know if there is a different / preferred way to refresh a bulk of
patches in your queue when only a subset need updates.

---

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool, i.e. the current
Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [3]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [4] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [5]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges.
   A follow-on for the VM use case is to teach device-dax to dynamically
   allocate 'struct page' at runtime to reduce the duplication of 'struct
   page' space in both the guest and the host kernel for the same physical
   pages.

[3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com
[4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
[5]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com

---

Dan Williams (14):
      device-dax: make pgmap optional for instance creation
      device-dax/kmem: introduce dax_kmem_range()
      device-dax/kmem: move resource name tracking to drvdata
      device-dax/kmem: replace release_resource() with release_mem_region()
      device-dax: add an allocation interface for device-dax instances
      device-dax: introduce 'struct dev_dax' typed-driver operations
      device-dax: introduce 'seed' devices
      drivers/base: make device_find_child_by_name() compatible with sysfs inputs
      device-dax: add resize support
      mm/memremap_pages: convert to 'struct range'
      mm/memremap_pages: support multiple ranges per invocation
      device-dax: add dis-contiguous resource support
      device-dax: introduce 'mapping' devices
      device-dax: add an 'align' attribute

Joao Martins (3):
      device-dax: make align a per-device property
      dax/hmem: introduce dax_hmem.region_idle parameter
      device-dax: add a range mapping allocation attribute


 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 
 drivers/base/core.c                    |    2 
 drivers/dax/bus.c                      | 1039 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   11 
 drivers/dax/dax-private.h              |   58 ++
 drivers/dax/device.c                   |  112 ++-
 drivers/dax/hmem/hmem.c                |   17 -
 drivers/dax/kmem.c                     |  178 +++--
 drivers/dax/pmem/compat.c              |    2 
 drivers/dax/pmem/core.c                |   14 
 drivers/gpu/drm/nouveau/nouveau_dmem.c |   15 
 drivers/nvdimm/badrange.c              |   26 -
 drivers/nvdimm/claim.c                 |   13 
 drivers/nvdimm/nd.h                    |    3 
 drivers/nvdimm/pfn_devs.c              |   13 
 drivers/nvdimm/pmem.c                  |   27 -
 drivers/nvdimm/region.c                |   21 -
 drivers/pci/p2pdma.c                   |   12 
 drivers/xen/unpopulated-alloc.c        |   45 +
 include/linux/memremap.h               |   11 
 include/linux/range.h                  |    6 
 lib/test_hmm.c                         |   15 
 mm/memremap.c                          |  299 +++++---
 tools/testing/nvdimm/dax-dev.c         |   22 -
 tools/testing/nvdimm/test/iomap.c      |    2 
 25 files changed, 1557 insertions(+), 420 deletions(-)

base-commit: 6764736525f27a411ba2c0c430aaa2df7375f3ac
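
For readers new to the idea of sub-dividing a soft-reserved range along
address boundaries, here is a rough userspace sketch of the arithmetic
involved. It is not code from this series: the Range type, the carve()
helper, and the 2 MiB alignment are all assumptions made for
illustration. The only thing borrowed from the kernel is the convention
that a range is an inclusive [start, end] pair, as in 'struct range'.

```python
from typing import NamedTuple, Optional, Tuple


class Range(NamedTuple):
    """Inclusive [start, end] span of physical addresses (illustrative only)."""
    start: int
    end: int

    def length(self) -> int:
        # Both ends are inclusive, hence the +1.
        return self.end - self.start + 1


def carve(avail: Range, size: int, align: int) -> Tuple[Optional[Range], Optional[Range]]:
    """Hypothetical helper: take `size` bytes from the front of `avail`.

    Returns (allocated, remaining). `allocated` is None when the request
    does not fit; `remaining` is None when the request consumes all of
    `avail`. The request size and the start of `avail` must both be
    multiples of `align`.
    """
    if align <= 0 or size <= 0 or size % align or avail.start % align:
        raise ValueError("size and start must be positive multiples of align")
    if size > avail.length():
        return None, avail
    alloc = Range(avail.start, avail.start + size - 1)
    rest = Range(alloc.end + 1, avail.end) if size < avail.length() else None
    return alloc, rest


# Example: an 8 MiB region handed out in 2 MiB-aligned pieces (assumed values).
SZ_2M = 2 << 20
region = Range(0x1_0000_0000, 0x1_0000_0000 + 4 * SZ_2M - 1)
first, rest = carve(region, SZ_2M, SZ_2M)
second, rest_after = carve(rest, 3 * SZ_2M, SZ_2M)
```

The sketch only allocates from the front of a single free span. Tracking
several discontiguous spans per device, as the "dis-contiguous resource
support" patch title suggests the series does, would need a list of such
ranges rather than one.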