From patchwork Sun Dec 1 01:54:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11268343
Date: Sat, 30 Nov 2019 17:54:17 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, anshuman.khandual@arm.com, dan.j.williams@intel.com,
 david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, nao.horiguchi@gmail.com, osalvador@suse.de, pasha.tatashin@soleen.com, torvalds@linux-foundation.org
Subject: [patch 083/158] mm/memory_hotplug.c: don't allow to online/offline memory blocks with holes
Message-ID: <20191201015417.c9-W09fyc%akpm@linux-foundation.org>

From: David Hildenbrand
Subject: mm/memory_hotplug.c: don't allow to online/offline memory blocks with holes

Our onlining/offlining code is unnecessarily complicated.  Only memory
blocks added during boot can have holes (a range that is not
IORESOURCE_SYSTEM_RAM).  Hotplugged memory never has holes (e.g., see
add_memory_resource()).  All memory blocks that belong to boot memory are
already online.

Note that boot memory can have holes and the memmap of the holes is marked
PG_reserved.  However, memory allocated early during boot is also
PG_reserved - basically every page of boot memory that is not given to the
buddy is PG_reserved.

Therefore, when we stop allowing memory blocks with holes to be offlined,
we implicitly no longer have to deal with onlining memory blocks with
holes.  E.g., online_pages() will do a
walk_system_ram_range(..., online_pages_range), whereby
online_pages_range() will effectively only free to the buddy the pages
that do not fall into a hole.  The other pages (holes) are kept
PG_reserved (via move_pfn_range_to_zone()->memmap_init_zone()).

This allows us to simplify the code.  For example, we no longer have to
worry about marking pages that fall into memory holes PG_reserved when
onlining memory.  We can stop setting pages PG_reserved completely in
memmap_init_zone().

Offlining memory blocks added during boot is usually not guaranteed to
work either way (unmovable data might easily have ended up on that memory
during boot).
So no longer allowing that should not really hurt.  Also, nobody is even
aware of a setup where onlining/offlining of memory blocks with holes used
to work reliably (see [1] and [2], especially regarding the hotplug path) -
I doubt it ever worked reliably.

For the use case of offlining memory to unplug DIMMs, we should see no
change (holes on DIMMs would be weird).

Please note that hardware errors (PG_hwpoison) are not memory holes and
are not affected by this change when offlining.

[1] https://lkml.org/lkml/2019/10/22/135
[2] https://lkml.org/lkml/2019/8/14/1365

Link: http://lkml.kernel.org/r/20191119115237.6662-1-david@redhat.com
Reviewed-by: Dan Williams
Signed-off-by: David Hildenbrand
Acked-by: Michal Hocko
Cc: Oscar Salvador
Cc: Pavel Tatashin
Cc: Dan Williams
Cc: Anshuman Khandual
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
---

 mm/memory_hotplug.c |   28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-dont-allow-to-online-offline-memory-blocks-with-holes
+++ a/mm/memory_hotplug.c
@@ -1485,10 +1485,19 @@ static void node_states_clear_node(int n
 	node_clear_state(node, N_MEMORY);
 }
 
+static int count_system_ram_pages_cb(unsigned long start_pfn,
+				     unsigned long nr_pages, void *data)
+{
+	unsigned long *nr_system_ram_pages = data;
+
+	*nr_system_ram_pages += nr_pages;
+	return 0;
+}
+
 static int __ref __offline_pages(unsigned long start_pfn,
 		  unsigned long end_pfn)
 {
-	unsigned long pfn, nr_pages;
+	unsigned long pfn, nr_pages = 0;
 	unsigned long offlined_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
@@ -1499,6 +1508,22 @@ static int __ref __offline_pages(unsigne
 
 	mem_hotplug_begin();
 
+	/*
+	 * Don't allow to offline memory blocks that contain holes.
+	 * Consequently, memory blocks with holes can never get onlined
+	 * via the hotplug path - online_pages() - as hotplugged memory has
+	 * no holes. This way, we e.g., don't have to worry about marking
+	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
+	 * avoid using walk_system_ram_range() later.
+	 */
+	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
+			      count_system_ram_pages_cb);
+	if (nr_pages != end_pfn - start_pfn) {
+		ret = -EINVAL;
+		reason = "memory holes";
+		goto failed_removal;
+	}
+
 	/* This makes hotplug much easier...and readable.
 	   we assume this for now. .*/
 	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
@@ -1510,7 +1535,6 @@ static int __ref __offline_pages(unsigne
 
 	zone = page_zone(pfn_to_page(valid_start));
 	node = zone_to_nid(zone);
-	nr_pages = end_pfn - start_pfn;
 
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,