From patchwork Wed May 23 15:11:49 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10421759
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, David Hildenbrand, Greg Kroah-Hartman,
 Boris Ostrovsky, Juergen Gross, Ingo Molnar, Andrew Morton, Pavel Tatashin,
 Vlastimil Babka, Michal Hocko, Dan Williams, Joonsoo Kim, Reza Arbab,
 Thomas Gleixner
Subject: [PATCH v1 08/10] mm/memory_hotplug: allow to control onlining/offlining of memory by a driver
Date: Wed, 23 May 2018 17:11:49 +0200
Message-Id: <20180523151151.6730-9-david@redhat.com>
In-Reply-To: <20180523151151.6730-1-david@redhat.com>
References: <20180523151151.6730-1-david@redhat.com>

Some devices (esp. paravirtualized) might want to control
- when to online/offline a memory block
- how to online memory (MOVABLE/NORMAL)
- in which granularity to online/offline memory

So let's add a new flag "driver_managed" and disallow changing the state
from user space. Device onlining/offlining will still work; however, the
memory will not actually be onlined/offlined. That has to be handled by
the device driver that owns the memory.

Please note that we still have to create user-visible memory blocks, since
this is required to trigger the right udev events in order to reload
kexec/kdump. It also allows user space to see what is going on in the
system (e.g. which memory blocks are still around).

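The intended behaviour can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not kernel code: every
model_* name below is hypothetical, and the three functions merely mimic
what memory_block_action(), the device online/offline path, and
store_mem_state() do for a block with driver_managed set in this patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's struct memory_block. */
enum model_state { MODEL_OFFLINE, MODEL_ONLINE };

struct model_block {
	enum model_state state;	/* what the memory block device reports */
	bool pages_online;	/* whether the pages were really onlined */
	bool driver_managed;	/* mirrors the flag added by this patch */
};

/* Mimics memory_block_action(): a no-op for driver-managed blocks. */
static int model_block_action(struct model_block *mem, enum model_state to)
{
	if (mem->driver_managed)
		return 0;
	mem->pages_online = (to == MODEL_ONLINE);
	return 0;
}

/* Mimics device onlining/offlining: the state changes, the pages may not. */
static int model_device_set_state(struct model_block *mem, enum model_state to)
{
	int ret = model_block_action(mem, to);

	if (!ret)
		mem->state = to;
	return ret;
}

/* Mimics store_mem_state(): user space is refused for driver-managed blocks. */
static int model_sysfs_store_state(struct model_block *mem, enum model_state to)
{
	if (mem->driver_managed)
		return -EINVAL;
	return model_device_set_state(mem, to);
}
```

In this model, a write from user space to a driver-managed block fails with
-EINVAL, while onlining the device succeeds and flips the reported state
without touching the pages; the owning driver is expected to do that part.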
Cc: Greg Kroah-Hartman
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Ingo Molnar
Cc: Andrew Morton
Cc: Pavel Tatashin
Cc: Vlastimil Babka
Cc: Michal Hocko
Cc: Dan Williams
Cc: Joonsoo Kim
Cc: Reza Arbab
Cc: Thomas Gleixner
Signed-off-by: David Hildenbrand
---
 drivers/base/memory.c          | 22 ++++++++++++++--------
 drivers/xen/balloon.c          |  2 +-
 include/linux/memory.h         |  1 +
 include/linux/memory_hotplug.h |  4 +++-
 mm/memory_hotplug.c            | 34 ++++++++++++++++++++++++++++++++--
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index bffe8616bd55..3b8616551561 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -231,27 +231,28 @@ static bool pages_correctly_probed(unsigned long start_pfn)
  * Must already be protected by mem_hotplug_begin().
  */
 static int
-memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
+memory_block_action(struct memory_block *mem, unsigned long action)
 {
-	unsigned long start_pfn;
+	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
-	int ret;
+	int ret = 0;
 
-	start_pfn = section_nr_to_pfn(phys_index);
+	if (mem->driver_managed)
+		return 0;
 
 	switch (action) {
 	case MEM_ONLINE:
 		if (!pages_correctly_probed(start_pfn))
 			return -EBUSY;
 
-		ret = online_pages(start_pfn, nr_pages, online_type);
+		ret = online_pages(start_pfn, nr_pages, mem->online_type);
 		break;
 	case MEM_OFFLINE:
 		ret = offline_pages(start_pfn, nr_pages);
 		break;
 	default:
 		WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
-		     "%ld\n", __func__, phys_index, action, action);
+		     "%ld\n", __func__, mem->start_section_nr, action, action);
 		ret = -EINVAL;
 	}
 
@@ -269,8 +270,7 @@ static int memory_block_change_state(struct memory_block *mem,
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
-	ret = memory_block_action(mem->start_section_nr, to_state,
-				mem->online_type);
+	ret = memory_block_action(mem, to_state);
 
 	mem->state = ret ? from_state_req : to_state;
 
@@ -350,6 +350,11 @@ store_mem_state(struct device *dev,
 	 */
 	mem_hotplug_begin();
 
+	if (mem->driver_managed) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
@@ -364,6 +369,7 @@ store_mem_state(struct device *dev,
 		ret = -EINVAL; /* should never happen */
 	}
 
+out:
 	mem_hotplug_done();
 err:
 	unlock_device_hotplug();
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 065f0b607373..89981d573c06 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -401,7 +401,7 @@ static enum bp_state reserve_additional_memory(void)
 	 * callers drop the mutex before trying again.
 	 */
 	mutex_unlock(&balloon_mutex);
-	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	rc = add_memory_resource(nid, resource, memhp_auto_online, false);
 	mutex_lock(&balloon_mutex);
 
 	if (rc) {
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 9f8cd856ca1e..018c5e5ecde1 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -29,6 +29,7 @@ struct memory_block {
 	unsigned long state;		/* serialized by the dev->lock */
 	int section_count;		/* serialized by mem_sysfs_mutex */
 	int online_type;		/* for passing data to online routine */
+	bool driver_managed;		/* driver handles online/offline */
 	int phys_device;		/* to which fru does this belong? */
 	void *hw;			/* optional pointer to fw/hw data */
 	int (*phys_callback)(struct memory_block *);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index d71829d54360..497e28f5b000 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -326,7 +326,9 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
 extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource, bool online);
+extern int add_memory_driver_managed(int nid, u64 start, u64 size);
+extern int add_memory_resource(int nid, struct resource *resource, bool online,
+			       bool driver_managed);
 extern int arch_add_memory(int nid, u64 start, u64 size,
 		struct vmem_altmap *altmap, bool want_memblock);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 27f7c27f57ac..1610e214bfc8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1124,8 +1124,15 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
+static int mark_memory_block_driver_managed(struct memory_block *mem, void *arg)
+{
+	mem->driver_managed = true;
+	return 0;
+}
+
 /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
-int __ref add_memory_resource(int nid, struct resource *res, bool online)
+int __ref add_memory_resource(int nid, struct resource *res, bool online,
+			      bool driver_managed)
 {
 	u64 start, size;
 	pg_data_t *pgdat = NULL;
@@ -1133,6 +1140,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	bool new_node;
 	int ret;
 
+	if (online && driver_managed)
+		return -EINVAL;
+
 	start = res->start;
 	size = resource_size(res);
 
@@ -1204,6 +1214,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	if (online)
 		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
 				  NULL, online_memory_block);
+	else if (driver_managed)
+		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
+				  NULL, mark_memory_block_driver_managed);
 
 	goto out;
 
@@ -1228,13 +1241,30 @@ int __ref add_memory(int nid, u64 start, u64 size)
 	if (IS_ERR(res))
 		return PTR_ERR(res);
 
-	ret = add_memory_resource(nid, res, memhp_auto_online);
+	ret = add_memory_resource(nid, res, memhp_auto_online, false);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(add_memory);
 
+int __ref add_memory_driver_managed(int nid, u64 start, u64 size)
+{
+	struct resource *res;
+	int ret;
+
+	res = register_memory_resource(start, size);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	ret = add_memory_resource(nid, res, false, true);
+	if (ret < 0)
+		release_memory_resource(res);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(add_memory_driver_managed);
+
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy