From patchwork Wed Jan 16 18:19:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10766687 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 049E413BF for ; Wed, 16 Jan 2019 18:25:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E9A912F3E1 for ; Wed, 16 Jan 2019 18:25:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DD2C72F3F4; Wed, 16 Jan 2019 18:25:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 75CCE2F3E8 for ; Wed, 16 Jan 2019 18:25:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1D4698E0014; Wed, 16 Jan 2019 13:25:41 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 15FB68E0004; Wed, 16 Jan 2019 13:25:41 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04DCD8E0014; Wed, 16 Jan 2019 13:25:41 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pf1-f199.google.com (mail-pf1-f199.google.com [209.85.210.199]) by kanga.kvack.org (Postfix) with ESMTP id B1AD68E0004 for ; Wed, 16 Jan 2019 13:25:40 -0500 (EST) Received: by mail-pf1-f199.google.com with SMTP id t2so5276780pfj.15 for ; Wed, 16 Jan 2019 10:25:40 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:to:cc :from:date:references:in-reply-to:message-id; bh=v9C2bsdbZKN4wfrid2R3BaL4d2Hl91AmARUTxgFlVyE=; b=TckQWhVSiboq7w54iRCv869vihps2PvRaqdWkjyPdLNkLuUkJ0xD3vbP35YP29Bf1w GVj6mwFkChHQHyQsxk7t5DguSeNkP93+6l0f8cZ0UEnxpAqJ/FMd50JP1nyWRMQ/+H0t fMJyLecn8wLjPGGSfDw3eLlGNuOEOBiUNLR4GzB3Zlh01iKV0TcLIn76ooNrqYMJWtWw l1DvMqNK7jE1reNGh0C16qGArF/xehs5RyTqIejXk+SxlwN/O47ORw42e8E1p1ua3sIv W6dnuY+a2CKWcpWfFNLSCbaX7MyiMEsrhbY1O7j4kd0Tdb3B8DtBM7yhh4JvXqb9hklw 1EBA== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.20 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: AJcUukcN2CqLfLxSAJWsrcKvHDpEtM0dxt7sdZ3cf1A2TieMGSZkC7JQ zkFt3supHSRYLNZPMGoXVVmkPHyRm2e3Tp78ZmVXOXrzvjSA5LcFi01QltWI3vI17c0OCJRBD8w KCyijvUz21xuHn/rvUXn7FZVwyuNbAuN7TUpj0NQvvcSmvJHJBYzmmhYxgZjgbAGgPA== X-Received: by 2002:a17:902:848d:: with SMTP id c13mr11237536plo.257.1547663140402; Wed, 16 Jan 2019 10:25:40 -0800 (PST) X-Google-Smtp-Source: ALg8bN6TfulnRb7+MYR9RqgHPOTVORcgC/TArzA2Pq4uQrCBYDI0xz/Ourqc/FsAjANTM8g2Nt4r X-Received: by 2002:a17:902:848d:: with SMTP id c13mr11237487plo.257.1547663139679; Wed, 16 Jan 2019 10:25:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1547663139; cv=none; d=google.com; s=arc-20160816; b=vsJc+4aIDCfuEBDcsyfZ3hGXoYnvMBQa78gSpi4lAh3XTY5WDyRXqqY6sr7iMG1KzD jpUtd6bZHB+Nht5aEwVWbk5RAyK3gO8cxkZ/VIUN7o2QuLb54E2uhnO5tsQc/NB04i84 4yHEfDlcFFKUPuO1pxXB54dhX6Q5a2o7vdGoQ5+WRPFuDqv4vWfP74/toLyMcDUGQrtR RRBJLspr6+2KT8M6mSOK6sA9URqCXXAxTcFv9EAK/ecewpUuYIo/f5h8ktLJDeWaUTIR KsTSf/ebP+nl5z5R9gPF04mFqBc5xz3g1B1cJNv8HOt2zf7IAQEyxGI7A568J7/cbU64 fxVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:in-reply-to:references:date:from:cc:to:subject; 
bh=v9C2bsdbZKN4wfrid2R3BaL4d2Hl91AmARUTxgFlVyE=; b=lGvXuLuf9tPjNqyfcLEgXEwxdYTUWP5S73JENMxpGdJQGJ25v07mfFeKdmURqvjJ3d ogjwazh9C7p1Kmg5PUy9a0ZRuRt9tfxERz8otpOR98+j4TdMpld1CGUuOu+dl9TI9eS7 XkAgxHgqkW0ta9Wvtu/h6NplgyXPbsy2QfXDHZ28r4+69uu1zlQkmVYnZ5JpVcXCGA8i KZPltSfmbPiv0EyLQ38eZhn/TMdG879O83rjn6m4LM7ETfyM/OaedTD0TQujEkRURWJE 5Q/jzbUaiSlyzNv1e2XWHiHm3QN9OpDoLpR5bYm4Ir/k6n/4sOAafrJKP1n0lTF5n7Qc yPcQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.20 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga02.intel.com (mga02.intel.com. [134.134.136.20]) by mx.google.com with ESMTPS id l5si7335471pls.423.2019.01.16.10.25.39 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Jan 2019 10:25:39 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.20 as permitted sender) client-ip=134.134.136.20; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.20 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jan 2019 10:25:39 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,487,1539673200"; d="scan'208";a="126559961" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga002.jf.intel.com with ESMTP; 16 Jan 2019 10:25:38 -0800 Subject: [PATCH 1/4] mm/resource: return real error codes from walk failures To: dave@sr71.net Cc: Dave Hansen 
,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-kernel@vger.kernel.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de From: Dave Hansen Date: Wed, 16 Jan 2019 10:19:01 -0800 References: <20190116181859.D1504459@viggo.jf.intel.com> In-Reply-To: <20190116181859.D1504459@viggo.jf.intel.com> Message-Id: <20190116181901.CAF85066@viggo.jf.intel.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

From: Dave Hansen

walk_system_ram_range() can return an error code either because
*it* failed, or because the 'func' that it calls returned an
error.  The memory hotplug code does the following:

	ret = walk_system_ram_range(..., func);
	if (ret)
		return ret;

and 'ret' makes it out to userspace, eventually.  The problem
is, walk_system_ram_range() failures that result from *it*
failing (as opposed to 'func') return -1.  That leads to a
very odd -EPERM (-1) return code out to userspace.

Make walk_system_ram_range() return -EINVAL for internal
failures to keep userspace less confused.

This return code is compatible with all the callers that I
audited.
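To illustrate the distinction, here is a userspace sketch (not the
kernel code; walk_ranges(), struct range, count_cb() and fail_cb()
are made-up names used only for illustration).  The walker starts
from -EINVAL so that "nothing was walked" can not be mistaken for
-EPERM, while errors from 'func' pass through untouched:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource entry: [start, end) */
struct range {
	unsigned long start, end;
};

typedef int (*walk_fn)(unsigned long start, unsigned long end, void *arg);

/*
 * Call 'func' on every piece of r[] that overlaps [start, end).
 * Returns -EINVAL if the walker itself found nothing to call
 * 'func' on, otherwise whatever the last 'func' call returned.
 */
static int walk_ranges(const struct range *r, size_t n,
		       unsigned long start, unsigned long end,
		       void *arg, walk_fn func)
{
	int ret = -EINVAL;	/* internal failure, not -1 (== -EPERM) */
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long s = r[i].start > start ? r[i].start : start;
		unsigned long e = r[i].end < end ? r[i].end : end;

		if (s >= e)
			continue;	/* no overlap with this entry */
		ret = func(s, e, arg);
		if (ret)
			break;		/* pass the callback's error through */
	}
	return ret;
}

/* Example callback: counts how many ranges were visited */
static int count_cb(unsigned long start, unsigned long end, void *arg)
{
	(void)start;
	(void)end;
	(*(int *)arg)++;
	return 0;
}

/* Example callback: always fails with its own error code */
static int fail_cb(unsigned long start, unsigned long end, void *arg)
{
	(void)start;
	(void)end;
	(void)arg;
	return -ENOMEM;
}
```

A caller that propagates the return value then sees -EINVAL when
the requested window overlaps nothing, and -ENOMEM when fail_cb()
is used, so the two failure sources stay distinguishable.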
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Signed-off-by: Dave Hansen
---

 b/kernel/resource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1 kernel/resource.c
--- a/kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2018-12-20 11:48:41.810771934 -0800
+++ b/kernel/resource.c	2018-12-20 11:48:41.814771934 -0800
@@ -375,7 +375,7 @@ static int __walk_iomem_res_desc(resourc
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
@@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long
 	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	start = (u64) start_pfn << PAGE_SHIFT;
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;

From patchwork Wed Jan 16 18:19:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10766689 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7EC9B13BF for ; Wed, 16 Jan 2019 18:26:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6FCC52F3E1 for ; Wed, 16 Jan 2019 18:26:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 633AB2F3F4; Wed, 16 Jan 2019 18:26:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0
tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D6F092F3E1 for ; Wed, 16 Jan 2019 18:26:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 30CD88E0015; Wed, 16 Jan 2019 13:25:43 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 2E21E8E0004; Wed, 16 Jan 2019 13:25:43 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1D7D48E0015; Wed, 16 Jan 2019 13:25:43 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pg1-f198.google.com (mail-pg1-f198.google.com [209.85.215.198]) by kanga.kvack.org (Postfix) with ESMTP id BAACC8E0004 for ; Wed, 16 Jan 2019 13:25:42 -0500 (EST) Received: by mail-pg1-f198.google.com with SMTP id r16so4392966pgr.15 for ; Wed, 16 Jan 2019 10:25:42 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:to:cc :from:date:references:in-reply-to:message-id; bh=gIacLjiL8a4UDzqlDrZFybYaz7jh5hj/jgaplHVPsf4=; b=gNjR/Ux/sXZFa5n/gYq/kXn5FdASb9QQxc+k9arYrkFuyX7JXnQz+GAEhXaJGcw4Q8 E5Y6mD4FLW4UmdvNI/yxZbCtXBcj2SKTsLoEx4v9VRTuO14Zh+qGTp0j9suJXwQ3Dk7K HJt6sWUBSph4tJn6+VqwL8wuYibHaa9ckOYJRJC2EzFX1C0Na2QcTJpOLSE5D47fVgqG UPc0JcM93Z8Dgm1W40oBCVcjdfF59SxtuXn/w+y3rRZ/cvrk9+lCGTHwBXsBde+GVLxy H9pNouXnxOib3MTz8CX8cxgvgnEGGD5+xdOThpFqAv1qCYPHWZoXXQmrI4rHdpEW3aVB OVjw== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.43 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: 
AJcUukf+vj5i3CQDJDwUv5XMGd+Th8tizOauU/xFirDJVXoaaFeHHDmn mcgTHVKSPoxLKe264k9AbpVUWuKowNZ/2MZVomPvsdZgafKMIhbgKYtEQGjPhxuxff0fsm83LYf JtwkaU2FKxOTPbY7HWRU1GcwJuid1dtbdGRwmrrZS62dXtegyH/wzeTBKl7Mou1beNA== X-Received: by 2002:a63:3703:: with SMTP id e3mr9970368pga.348.1547663142374; Wed, 16 Jan 2019 10:25:42 -0800 (PST) X-Google-Smtp-Source: ALg8bN4Ibzy9eNFhrmnzcmrrSUL+cMlHp5jL0ccXcrGOTmOffFRkPs8ZFQLVgK2zf55bVd18gHaW X-Received: by 2002:a63:3703:: with SMTP id e3mr9970318pga.348.1547663141469; Wed, 16 Jan 2019 10:25:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1547663141; cv=none; d=google.com; s=arc-20160816; b=AGVPCFZCHr/s2IUZmP1qDWwCvTdU33dvs956H5oKux9m96yUYEvwHztUfZf60IlMlK 1LBEatBm9aNkF+7nwyPqhesoBb1Avc1Y/EWyAHYRD4Y4RW1txEtJteC7CTerwgtv6Dgf jTLvIcm/IOrLhabfCsPJazoUxVcSDuDH8LWUnI7/Mxm40h7r+arcCn+5rLPiZdPVZyyt uD8KioeWexI76AnJMOQB/vUDE083YyS6FpsNOGE7filSdYrMTusA2SaHF7/AYtssiw5O 8z3DQyjMHEkYzjNidtGfGk3730a/dfE5iNQqL2bfd5yT9Co+XpjgkMUU7lHPx5pzP3lF 8P+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:in-reply-to:references:date:from:cc:to:subject; bh=gIacLjiL8a4UDzqlDrZFybYaz7jh5hj/jgaplHVPsf4=; b=Sd6MClKvRscmJpc2nC575Wsi6eSEKAVdxGzf8mzS0sqdnpka1oEAkVXkjSGJ61Gosq TcX+L0DuSAx0CeLAnngygs3pR9JsZ5z6xanC9O2RqYdnDE0ESjZA41JyYKopCYjGUO32 s6ZH9D1Ye/o6yQdVjWAVqjOfJsCsuKXX040aJNXCTa4b59dNrZ2Vfp+D/9pZnfqDfp5p rR0fvzhI++dLEFamdsu4JaW3yy8eGRAMfx33A1Z3wtRW7ampuvaNT6GnzWRZgIBe/nF+ /4IPQwVYPqqdqiTTPAaP1kjGo9WU6LPEsD109eEwYRq7dHAOMJ5RREMP5mm4j7x/tGwS fayA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.43 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga05.intel.com (mga05.intel.com. 
[192.55.52.43]) by mx.google.com with ESMTPS id t77si6728408pgb.51.2019.01.16.10.25.41 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Jan 2019 10:25:41 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.43 as permitted sender) client-ip=192.55.52.43; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.43 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jan 2019 10:25:40 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,487,1539673200"; d="scan'208";a="119025562" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga003.jf.intel.com with ESMTP; 16 Jan 2019 10:25:40 -0800 Subject: [PATCH 2/4] mm/memory-hotplug: allow memory resources to be children To: dave@sr71.net Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-kernel@vger.kernel.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de From: Dave Hansen Date: Wed, 16 Jan 2019 10:19:02 -0800 References: <20190116181859.D1504459@viggo.jf.intel.com> In-Reply-To: <20190116181859.D1504459@viggo.jf.intel.com> Message-Id: <20190116181902.670EEBC3@viggo.jf.intel.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Hansen The 
kernel/resource.c code is used to manage the physical address
space.  We can view the current resource configuration in
/proc/iomem.  An example of this is at the bottom of this
description.

The nvdimm subsystem "owns" the physical address resources which
map to persistent memory and has resources inserted for them as
"Persistent Memory".

We want to use this persistent memory, but as volatile memory,
just like RAM.  The best way to do this is to leave the existing
resource in place, but add a "System RAM" resource underneath it.
This clearly communicates the ownership relationship of this
memory.

The request_resource_conflict() API only deals with the top-level
resources.  Replace it with __request_region() which will search
for !IORESOURCE_BUSY areas lower in the resource tree than the
top level.  We also rework the old error message a bit since we
do not get the conflicting entry back: only an indication that
we *had* a conflict.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space.  But, this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.

This should have no real effect on the existing collision
detection because the areas that truly conflict should be marked
IORESOURCE_BUSY.
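The descent into non-busy resources can be pictured with a toy
userspace model (a sketch only: struct toy_res and
toy_request_region() are hypothetical names, and the real
__request_region() also deals with locking, allocation and muxed
regions, none of which are modeled here):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified resource node; 'end' is inclusive */
struct toy_res {
	const char *name;
	uint64_t start, end;
	bool busy;
	struct toy_res *child, *sibling;
};

/*
 * Try to link 'new' somewhere beneath 'parent'.  A conflicting
 * entry that is *not* busy and fully contains 'new' becomes the
 * next parent to try; a busy or partially overlapping entry is a
 * real conflict.  Returns true on success.
 */
static bool toy_request_region(struct toy_res *parent, struct toy_res *new)
{
	struct toy_res **p = &parent->child;

	if (new->start < parent->start || new->end > parent->end)
		return false;

	for (;;) {
		struct toy_res *cur = *p;

		if (!cur || cur->start > new->end) {
			/* free gap between siblings: link in here */
			new->sibling = cur;
			*p = new;
			return true;
		}
		if (cur->end < new->start) {
			p = &cur->sibling;
			continue;
		}
		/* 'cur' overlaps 'new' */
		if (!cur->busy &&
		    cur->start <= new->start && cur->end >= new->end)
			return toy_request_region(cur, new);
		return false;
	}
}
```

With a non-busy "Persistent Memory (legacy)" entry covering
a0000000-afffffff, a busy "System RAM" request for
a0000000-a7ffffff ends up as its child, while a request that
overlaps an existing busy "System RAM" entry is refused.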
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Signed-off-by: Dave Hansen
---

 b/mm/memory_hotplug.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child	2018-12-20 11:48:42.317771933 -0800
+++ b/mm/memory_hotplug.c	2018-12-20 11:48:42.322771933 -0800
@@ -98,24 +98,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	char *resource_name = "System RAM";
 
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict = request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
-		pr_debug("System RAM resource %pR cannot be added\n", res);
-		kfree(res);
+	/*
+	 * Request ownership of the new memory range.  This might be
+	 * a child of an existing resource that was present but
+	 * not marked as busy.
+	 */
+	res = __request_region(&iomem_resource, start, size,
+			       resource_name, flags);
+
+	if (!res) {
+		pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n",
+				start, start + size);
 		return ERR_PTR(-EEXIST);
 	}
 	return res;

From patchwork Wed Jan 16 18:19:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10766693 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 556C41880 for ; Wed, 16 Jan 2019 18:26:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 47CF72F3E1 for ; Wed, 16 Jan 2019 18:26:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3BBFC2F3F5; Wed, 16 Jan 2019 18:26:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C1C772F3E1 for ; Wed, 16 Jan 2019 18:26:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 18B3F8E0016; Wed, 16 Jan 2019 13:25:44 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 10ED08E0004; Wed, 16 Jan 2019 13:25:44 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id
EA2818E0016; Wed, 16 Jan 2019 13:25:43 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pg1-f200.google.com (mail-pg1-f200.google.com [209.85.215.200]) by kanga.kvack.org (Postfix) with ESMTP id A8DBE8E0004 for ; Wed, 16 Jan 2019 13:25:43 -0500 (EST) Received: by mail-pg1-f200.google.com with SMTP id q62so4395703pgq.9 for ; Wed, 16 Jan 2019 10:25:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:to:cc :from:date:references:in-reply-to:message-id; bh=Cw56qJD87uy/pg1Cu85XBjBCwAnxWxaIy0eAzSm3c2Q=; b=GtMc/KHbdMjTrqsuPy1LxkL2x3fOM3zrwW0zLLPtgvZ2lK71afEhWlxY3D1U0VYiyS prFcn7/GLpfp6lypfE3I3vh5azjE6GhB9Uo+fRGzk7kH8eGpHnMOzZhlngp/SyuZJTdY MWK2bmlUQufvdBwtU4VxJzicl0q/Dzw2f1SBx+pz6xocESmWTPDXef3zxblbstZiZs9i 9gaTkjP/qrGItIbrC3uBWrunBMeY0nvmuZzv9ar3MsUIse80WxWjhjLH1gWHwuCDUZ1D a9BJ2B4Pj0b1CURPKzwV1Ryn13SVNed7mwrNAXJ72Avhhlvc70AYilFEZZODYtCcbsZM AXzQ== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: AJcUukcYsONShj/ZAsXLJuhmXNOGLs4l3ldeQK5cZXpZriAsJeQjEySV SkSGIs1hfC99RwYORMv6KyzvCY4eSyxjXkUuYjntgc3dn9BgQnj7jdj9ohPqlDuUb3cQNQ+B6a1 r2BRJXooF1em3FS4OV38+O1A0cnOI8Rt6Ih+lP6rls3ZkWZjwbyLwra7ISpEo+8oVjg== X-Received: by 2002:a65:448a:: with SMTP id l10mr10055659pgq.387.1547663143334; Wed, 16 Jan 2019 10:25:43 -0800 (PST) X-Google-Smtp-Source: ALg8bN6xZiRVyx0HcxbV1FdnxIELbr4jg9BigWRJnMMx7JitARLAJ6m19cssJ/FF7I/Rc9Ws1zjS X-Received: by 2002:a65:448a:: with SMTP id l10mr10055613pgq.387.1547663142418; Wed, 16 Jan 2019 10:25:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1547663142; cv=none; d=google.com; s=arc-20160816; 
b=G7mF2EVHRS+51kVZlVMw+F/7sMESPheRIHgHoCwCzzn6dknjp17jN/wSoApOUYGiBd lM596a29824q1HsvqBgHTG0T2bNzDdLZDcBns7eUObouBQTGNL+RfMuW3XBwKFbKO/Nn JcgsLvy/YOsAAvfk+fA4WGMGfoURnKIzpr6Pia9ynsjvJ7Oo71rJWSbZCCTznU4BXYdH ZATKdyiOuKjEygxyy46+fW7jm1IUYFo0kIPtp5FoEprPihWmavYFkdlD9nFnRy1O2kdu AkFKMCMb/0Z84/pA3swltLQn10dM5r+IHnRvswdE8gMNzBAnpj4sHYaAymhMX62f3Kk+ oYoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:in-reply-to:references:date:from:cc:to:subject; bh=Cw56qJD87uy/pg1Cu85XBjBCwAnxWxaIy0eAzSm3c2Q=; b=Lz/KDA4yElljhMgghx0VgMxWzSlEVc4LjolwuHMSz4MgJs2y/Z9WByCWBpDtdnc3KB P2put/K08wja0zFf2/suaolw2gGP808gTzgzZeCYjl/am0fcvQphQuPH8RhJx21dLNy2 FMWY0PHqEmxpxFnUONdlFWXIBuQDWG2U4yXummPm33JxO519tYyXF7yxbXPkTDK1a1KL OEu8dv6F1h7gRG2w3te8Kw7GtYebCjzBKms8rkhCiG/Qyp5TQppmEuhDHfeyUN5L8UXq 21zezuNSOXb+0X/Y5BCWS4Vuo38RGUyUP/FB5IQPe2mah7ENCgM7ZRmfhySxbv6dkFPa 5P5w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga06.intel.com (mga06.intel.com. 
[134.134.136.31]) by mx.google.com with ESMTPS id v20si6754250pgk.103.2019.01.16.10.25.42 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Jan 2019 10:25:42 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.31 as permitted sender) client-ip=134.134.136.31; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jan 2019 10:25:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,487,1539673200"; d="scan'208";a="292082790" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga005.jf.intel.com with ESMTP; 16 Jan 2019 10:25:41 -0800 Subject: [PATCH 3/4] dax/kmem: let walk_system_ram_range() search child resources To: dave@sr71.net Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-kernel@vger.kernel.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de From: Dave Hansen Date: Wed, 16 Jan 2019 10:19:04 -0800 References: <20190116181859.D1504459@viggo.jf.intel.com> In-Reply-To: <20190116181859.D1504459@viggo.jf.intel.com> Message-Id: <20190116181904.D24AF5FE@viggo.jf.intel.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Dave 
Hansen

In the process of onlining memory, we use walk_system_ram_range()
to find the actual RAM areas inside of the area being onlined.

However, it currently only finds memory resources which are
"top-level" iomem_resources.  Children are not currently
searched, which causes it to skip System RAM in areas like this
(in the format of /proc/iomem):

a0000000-bfffffff : Persistent Memory (legacy)
  a0000000-afffffff : System RAM

Passing false instead of true for the 'first_lvl' argument of
find_next_iomem_res() allows children to be searched as well.
We need this because we add a new "System RAM" resource
underneath the "persistent memory" resource when we use
persistent memory in a volatile mode.

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Signed-off-by: Dave Hansen
---

 b/kernel/resource.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff -puN kernel/resource.c~mm-walk_system_ram_range-search-child-resources kernel/resource.c
--- a/kernel/resource.c~mm-walk_system_ram_range-search-child-resources	2018-12-20 11:48:42.824771932 -0800
+++ b/kernel/resource.c	2018-12-20 11:48:42.827771932 -0800
@@ -445,6 +445,9 @@ int walk_mem_res(u64 start, u64 end, voi
  * This function calls the @func callback against all memory ranges of type
  * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
  * It is to be used only for System RAM.
+ *
+ * This will find System RAM ranges that are children of top-level resources
+ * in addition to top-level System RAM resources.
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 			  void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -460,7 +463,7 @@ int walk_system_ram_range(unsigned long
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    true, &res)) {
+				    false, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)

From patchwork Wed Jan 16 18:19:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10766701 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1B94186E for ; Wed, 16 Jan 2019 18:26:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D26182F3E8 for ; Wed, 16 Jan 2019 18:26:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C65172F3F9; Wed, 16 Jan 2019 18:26:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 28BEA2F3E8 for ; Wed, 16 Jan 2019 18:26:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 68D738E0017; Wed, 16 Jan 2019 13:25:46 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6680A8E0004; Wed, 16 Jan 2019 13:25:46 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id
50D7E8E0017; Wed, 16 Jan 2019 13:25:46 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pf1-f199.google.com (mail-pf1-f199.google.com [209.85.210.199]) by kanga.kvack.org (Postfix) with ESMTP id F05298E0004 for ; Wed, 16 Jan 2019 13:25:45 -0500 (EST) Received: by mail-pf1-f199.google.com with SMTP id q63so5262773pfi.19 for ; Wed, 16 Jan 2019 10:25:45 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:to:cc :from:date:references:in-reply-to:message-id; bh=zigwy9nXqDX3uU3GGBYYlKvactAnm1EGulzF+98tmsg=; b=s42OYGMaAX/eKX5eRQPjVlcynph1QimF7koLVcDe4fxTMC4W3c3y4hMFFlkdLMlcO1 RBCtRwVaCyKyRlHXXMPZM9fKM4KbuQ+uoMuXgdRCWXi81suvxoj2XHxzClRFCASzkN2a qudCET9lT4CULQ+c62V3IzxFviCCYbkbao/jziRqouKwyCSq7WTiWm7nznlXYFBpQ6Yn kHVK1JIu+ESaiV9KXYDfJVmk9XoPBz5S/QLBKiipkW1nnhMi4KnqxHKrQH0r0NwoMI0m IAYfHU4Vcp/5lVgaZpGS3dhGC1I15PxLhs19/dMUAytKse7psSqbZurYskq/gGTOEXKM TKvA== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.120 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: AJcUuke6MBnmQJE2BOBl9oNl9FOD43sslJdNyuZu70dKr/SQG9Iqe+5W gUt8MQR5z6g5Uy0z9vV/4jTNPcc3cNsimevwSwrh8czQgSX+m3Eb1/uFBBEuQOpWr7wtGhFpZ3/ kyJDjQhm8AKtZonnwfCN+nY/6qqbc566avErVwUVzo22QcOCWLueiKWKVNsttQMdfAQ== X-Received: by 2002:a17:902:7296:: with SMTP id d22mr11411410pll.265.1547663145625; Wed, 16 Jan 2019 10:25:45 -0800 (PST) X-Google-Smtp-Source: ALg8bN7nQLAxtMtu7Vy6Bu9+olAJx//+4cGwrdaDvsPUab3hGnGoCFzZxbAmFtCHEXq99LDQTfe6 X-Received: by 2002:a17:902:7296:: with SMTP id d22mr11411351pll.265.1547663144722; Wed, 16 Jan 2019 10:25:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1547663144; cv=none; d=google.com; s=arc-20160816; 
b=k/V9RkjysvE1qJtJpMBHNSe9TX/n7T3w5xqUgc4CD5NKR20StONZEVxhXqgBB0qC0I rwl2E8Pp79wKgUYym6ZHaQyu1uc5pYvfWvcLH1VxxPcksecfk0mKBMeSWKq4DGmAMFa5 8myC3nJqPf8OyDPuo9/jhB6mA0o807AXHN/rqoUbis00COy65Ncyl3Vi/f0S9ymCTZ+8 FDv1NMbKKq4EQvdq9aOuBQTLbZvSUM5kWRF6HQFfdg+3BQRhOn8JTEWf4NlAuz0MBod7 GzKTEo2e9QIfdI2AwV07Uvd6GLR1YC/b7UJq8dASB8iGYI6giJW6PfhWNIw32hKu9qLj sTyQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:in-reply-to:references:date:from:cc:to:subject; bh=zigwy9nXqDX3uU3GGBYYlKvactAnm1EGulzF+98tmsg=; b=xrc3Fr8lEyzqrVUX6sWRsNErXMz+yS3rMKCYl0hI0M8aeCz1jEsnylcwvime6fSVzR xqvmdJc0FsDfewdNjnJPHfrn5eZA9jeUXmCwQzg/aTcdq8tMX4lK7jKRWnPxsm4zXFS7 53/YrI6OKEafPvTSVfR85pc86eyy8T+V+OTENJX0BemkU/xyABr72kyNMAN+GdoecIbv PbUQcThHchWTiBtTZs6RxCny42Ubh1SpPZXjsSB+yjehtUdbimhTPgiipBAWvgXNnvcu pY8zCjB/8z1OSq53Sr/gUAeqhCWaL6s5spMat/7S4BLNhbUaIreIPQVLjrYrZlqxQN++ P2CQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.120 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga04.intel.com (mga04.intel.com. 
[192.55.52.120]) by mx.google.com with ESMTPS id e6si1674985pgd.428.2019.01.16.10.25.44 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Jan 2019 10:25:44 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.120 as permitted sender) client-ip=192.55.52.120; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of dave.hansen@linux.intel.com designates 192.55.52.120 as permitted sender) smtp.mailfrom=dave.hansen@linux.intel.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jan 2019 10:25:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,487,1539673200"; d="scan'208";a="108759249" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga006.jf.intel.com with ESMTP; 16 Jan 2019 10:25:43 -0800 Subject: [PATCH 4/4] dax: "Hotplug" persistent memory for use like normal RAM To: dave@sr71.net Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-kernel@vger.kernel.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de From: Dave Hansen Date: Wed, 16 Jan 2019 10:19:05 -0800 References: <20190116181859.D1504459@viggo.jf.intel.com> In-Reply-To: <20190116181859.D1504459@viggo.jf.intel.com> Message-Id: <20190116181905.12E102B4@viggo.jf.intel.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Hansen 
Currently, a persistent memory region is "owned" by a device driver,
either the "Device DAX" or "Filesystem DAX" drivers.  These drivers
allow applications to explicitly use persistent memory, generally
by being modified to use special, new libraries.

However, this limits persistent memory use to applications which
*have* been modified.  To make it more broadly usable, this driver
"hotplugs" memory into the kernel, to be managed and used just like
normal RAM would be.

To make this work, management software must remove the device from
being controlled by the "Device DAX" infrastructure:

	echo -n dax0.0 > /sys/bus/dax/drivers/device_dax/remove_id
	echo -n dax0.0 > /sys/bus/dax/drivers/device_dax/unbind

and then bind it to this new driver:

	echo -n dax0.0 > /sys/bus/dax/drivers/kmem/new_id
	echo -n dax0.0 > /sys/bus/dax/drivers/kmem/bind

After this, there will be a number of new memory sections visible
in sysfs that can be onlined, or that may get onlined by existing
udev-initiated memory hotplug rules.

Note: this inherits any existing NUMA information for the newly-added
memory from the persistent memory device that came from the
firmware.  On Intel platforms, the firmware has guarantees that
require each socket's persistent memory to be in a separate
memory-only NUMA node.  That means that this patch is not expected
to create NUMA nodes, but will simply hotplug memory into existing
nodes.

There is currently some metadata at the beginning of pmem regions.
The section-size memory hotplug restrictions, plus this small
reserved area, can cause the "loss" of a section or two of capacity.
This should be fixable in follow-on patches.  But, as a first step,
losing 256MB of memory (worst case) out of hundreds of gigabytes
is a good tradeoff vs. the required code to fix this up precisely.

Signed-off-by: Dave Hansen
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
---

 b/drivers/dax/Kconfig  |    5 ++
 b/drivers/dax/Makefile |    1
 b/drivers/dax/kmem.c   |   93 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+)

diff -puN drivers/dax/Kconfig~dax-kmem-try-4 drivers/dax/Kconfig
--- a/drivers/dax/Kconfig~dax-kmem-try-4	2019-01-08 09:54:44.051694874 -0800
+++ b/drivers/dax/Kconfig	2019-01-08 09:54:44.056694874 -0800
@@ -32,6 +32,11 @@ config DEV_DAX_PMEM
 
 	  Say M if unsure
 
+config DEV_DAX_KMEM
+	def_bool y
+	depends on DEV_DAX_PMEM   # Needs DEV_DAX_PMEM infrastructure
+	depends on MEMORY_HOTPLUG # for add_memory() and friends
+
 config DEV_DAX_PMEM_COMPAT
 	tristate "PMEM DAX: support the deprecated /sys/class/dax interface"
 	depends on DEV_DAX_PMEM
diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2018-12-03 08:41:47.355756491 -0800
+++ b/drivers/dax/kmem.c	2019-01-08 09:54:44.056694874 -0800
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-2018 Intel Corporation. All rights reserved. */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "dax-private.h"
+#include "bus.h"
+
+int dev_dax_kmem_probe(struct device *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct resource *res = &dev_dax->region->res;
+	resource_size_t kmem_start;
+	resource_size_t kmem_size;
+	struct resource *new_res;
+	int numa_node;
+	int rc;
+
+	/* Hotplug starting at the beginning of the next block: */
+	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+
+	kmem_size = resource_size(res);
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~(memory_block_size_bytes() - 1);
+
+	new_res = devm_request_mem_region(dev, kmem_start, kmem_size,
+					  dev_name(dev));
+
+	if (!new_res) {
+		printk("could not reserve region %016llx -> %016llx\n",
+				kmem_start, kmem_start+kmem_size);
+		return -EBUSY;
+	}
+
+	/*
+	 * Set flags appropriate for System RAM.  Leave ..._BUSY clear
+	 * so that add_memory() can add a child resource.
+	 */
+	new_res->flags = IORESOURCE_SYSTEM_RAM;
+	new_res->name = dev_name(dev);
+
+	numa_node = dev_dax->target_node;
+	if (numa_node < 0) {
+		pr_warn_once("bad numa_node: %d, forcing to 0\n", numa_node);
+		numa_node = 0;
+	}
+
+	rc = add_memory(numa_node, new_res->start, resource_size(new_res));
+	if (rc)
+		return rc;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dev_dax_kmem_probe);
+
+static int dev_dax_kmem_remove(struct device *dev)
+{
+	/* Assume that hot-remove will fail for now */
+	return -EBUSY;
+}
+
+static struct dax_device_driver device_dax_kmem_driver = {
+	.drv = {
+		.probe = dev_dax_kmem_probe,
+		.remove = dev_dax_kmem_remove,
+	},
+};
+
+static int __init dax_kmem_init(void)
+{
+	return dax_driver_register(&device_dax_kmem_driver);
+}
+
+static void __exit dax_kmem_exit(void)
+{
+	dax_driver_unregister(&device_dax_kmem_driver);
+}
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL v2");
+module_init(dax_kmem_init);
+module_exit(dax_kmem_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff -puN drivers/dax/Makefile~dax-kmem-try-4 drivers/dax/Makefile
--- a/drivers/dax/Makefile~dax-kmem-try-4	2019-01-08 09:54:44.053694874 -0800
+++ b/drivers/dax/Makefile	2019-01-08 09:54:44.056694874 -0800
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
+obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 
 dax-y := super.o
 dax-y += bus.o
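
For illustration only, the capacity "loss" mentioned in the changelog can be
reproduced with a small user-space sketch of the alignment arithmetic at the
top of dev_dax_kmem_probe().  This is not part of the patch.  BLOCK_SIZE is an
assumed 128MB stand-in for memory_block_size_bytes(), and struct range,
align_up(), trim_to_blocks() and lost_bytes() are hypothetical names.  The
guard for regions smaller than the skipped prefix is also an addition that the
patch itself does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for memory_block_size_bytes(); the real value is arch-dependent. */
#define BLOCK_SIZE (128ULL << 20)

/* Hypothetical helper type: the part of a region that can be hotplugged. */
struct range {
	uint64_t start;
	uint64_t size;
};

/* Round x up to the next multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Trim [start, start + size) so that it covers only complete blocks. */
static struct range trim_to_blocks(uint64_t start, uint64_t size, uint64_t block)
{
	struct range r;
	uint64_t skipped;

	/* Hotplug starting at the beginning of the next block. */
	r.start = align_up(start, block);
	skipped = r.start - start;

	/* Not in the patch: avoid underflow when nothing usable remains. */
	if (skipped >= size) {
		r.size = 0;
		return r;
	}

	/* Shrink by the skipped prefix, then align down to whole blocks. */
	r.size = (size - skipped) & ~(block - 1);
	return r;
}

/* Bytes of the original region that end up not being hotplugged. */
static uint64_t lost_bytes(uint64_t start, uint64_t size, uint64_t block)
{
	return size - trim_to_blocks(start, size, block).size;
}
```

Under these assumptions, a 16GB region that starts 2MB past a block boundary
loses 128MB: 126MB is skipped at the front to reach the next boundary, and the
2MB tail that no longer fills a block is dropped.  With both ends badly
misaligned the loss approaches two blocks, which is consistent with the 256MB
worst case quoted in the changelog.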