From patchwork Wed Sep 19 03:17:44 2018
X-Patchwork-Submitter: Pingfan Liu
X-Patchwork-Id: 10605205
From: Pingfan Liu <kernelfans@gmail.com>
To: linux-mm@kvack.org
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman, Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao, Dan Williams, "H. Peter Anvin", "Kirill A. Shutemov"
Subject: [PATCH 1/3] mm/isolation: separate the isolation and migration ops in offline memblock
Date: Wed, 19 Sep 2018 11:17:44 +0800
Message-Id: <1537327066-27852-2-git-send-email-kernelfans@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>
References: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>

The current design of start_isolate_page_range() relies on MIGRATE_ISOLATE to guard against other threads, so its callers must do the isolation by themselves. This series suggests a memory-offline sequence that splits a memblock's pageblock isolation from its migration:

  1. call start_isolate_page_range() on a batch of memblocks
  2. call __offline_pages() on each memblock

This requires allowing __offline_pages() to reuse an isolation performed earlier.

As for recording the isolation, doing it at the memblock level is not preferable: isolation operates on pageblocks, and the memblock should stay hidden at that level. On the other hand, since isolation and compaction cannot run in parallel, the PB_migrate_skip bit can be reused to mark the result of a previous isolation, which is what this patch does.
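As an illustrative aside (not part of the patch itself), the two-step sequence maps onto the memory sysfs interface proposed later in this series. The block range 32..35 below is a made-up example, and the commands are only printed as a dry run rather than executed:

```shell
# Dry run of the proposed two-phase offline sequence: first isolate a
# batch of memory blocks, then offline each one.  The range 32..35 is
# a hypothetical example; commands are printed, not executed.
s=32; e=35
for i in $(seq "$s" "$e"); do
  printf 'echo isolate > /sys/devices/system/memory/memory%d/state\n' "$i"
done
for i in $(seq "$s" "$e"); do
  printf 'echo offline > /sys/devices/system/memory/memory%d/state\n' "$i"
done
```

Batching all the isolations first means the migration step can no longer pick its destination pages from a block that is itself about to be offlined.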
The prototype of start_isolate_page_range() is also changed so that the __offline_pages() case can be told apart from temporary isolation, e.g. alloc_contig_range().

Signed-off-by: Pingfan Liu
Cc: Andrew Morton
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Greg Kroah-Hartman
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Bharata B Rao
Cc: Dan Williams
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
---
 include/linux/page-isolation.h  |  4 ++--
 include/linux/pageblock-flags.h |  2 ++
 mm/memory_hotplug.c             |  6 +++---
 mm/page_alloc.c                 |  4 ++--
 mm/page_isolation.c             | 28 +++++++++++++++++++++++-----
 5 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ae347c..dcc2bd1 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -47,7 +47,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			 unsigned migratetype, bool skip_hwpoisoned_pages);
+			 unsigned int migratetype, bool skip_hwpoisoned_pages, bool reuse);

 /*
  * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
@@ -55,7 +55,7 @@ start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  */
 int
 undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			unsigned migratetype);
+			unsigned int migratetype, bool reuse);

 /*
  * Test all pages in [start_pfn, end_pfn) are isolated or not.
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index 9132c5c..80c5341 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -31,6 +31,8 @@ enum pageblock_bits {
 	PB_migrate_end = PB_migrate + 3 - 1,
 			/* 3 bits required for migrate types */
 	PB_migrate_skip,/* If set the block is skipped by compaction */
+	PB_isolate_skip = PB_migrate_skip,
+	/* isolation and compaction do not concur */

 	/*
 	 * Assume the bits will always align on a word. If this assumption
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9eea6e8..228de4d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1616,7 +1616,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
-				       MIGRATE_MOVABLE, true);
+				       MIGRATE_MOVABLE, true, true);
 	if (ret)
 		return ret;
@@ -1662,7 +1662,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	   We cannot do rollback at this point. */
 	offline_isolated_pages(start_pfn, end_pfn);
 	/* reset pagetype flags and makes migrate type to be MOVABLE */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
 	/* removal success */
 	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
 	zone->present_pages -= offlined_pages;
@@ -1697,7 +1697,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 		((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
 	return ret;
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05e983f..a0ae259 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7882,7 +7882,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	ret = start_isolate_page_range(pfn_max_align_down(start),
 				       pfn_max_align_up(end), migratetype,
-				       false);
+				       false, false);
 	if (ret)
 		return ret;
@@ -7967,7 +7967,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 done:
 	undo_isolate_page_range(pfn_max_align_down(start),
-				pfn_max_align_up(end), migratetype);
+				pfn_max_align_up(end), migratetype, false);
 	return ret;
 }
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 43e0856..36858ab 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -15,8 +15,18 @@
 #define CREATE_TRACE_POINTS
 #include

+#define get_pageblock_isolate_skip(page) \
+	get_pageblock_flags_group(page, PB_isolate_skip, \
+			PB_isolate_skip)
+#define clear_pageblock_isolate_skip(page) \
+	set_pageblock_flags_group(page, 0, PB_isolate_skip, \
+			PB_isolate_skip)
+#define set_pageblock_isolate_skip(page) \
+	set_pageblock_flags_group(page, 1, PB_isolate_skip, \
+			PB_isolate_skip)
+
 static int set_migratetype_isolate(struct page *page, int migratetype,
-	bool skip_hwpoisoned_pages)
+	bool skip_hwpoisoned_pages, bool reuse)
 {
 	struct zone *zone;
 	unsigned long flags, pfn;
@@ -33,8 +43,11 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
 	 * If it is already set, then someone else must have raced and
 	 * set it before us. Return -EBUSY
 	 */
-	if (is_migrate_isolate_page(page))
+	if (is_migrate_isolate_page(page)) {
+		if (reuse && get_pageblock_isolate_skip(page))
+			ret = 0;
 		goto out;
+	}

 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
@@ -75,6 +88,8 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
 		int mt = get_pageblock_migratetype(page);

 		set_pageblock_migratetype(page, MIGRATE_ISOLATE);
+		if (reuse)
+			set_pageblock_isolate_skip(page);
 		zone->nr_isolate_pageblock++;
 		nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE,
 						NULL);
@@ -185,7 +200,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
 * prevents two threads from simultaneously working on overlapping ranges.
 */
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			unsigned migratetype, bool skip_hwpoisoned_pages)
+			unsigned int migratetype, bool skip_hwpoisoned_pages, bool reuse)
 {
 	unsigned long pfn;
 	unsigned long undo_pfn;
@@ -199,7 +214,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 			pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (page &&
-		    set_migratetype_isolate(page, migratetype, skip_hwpoisoned_pages)) {
+		    set_migratetype_isolate(page, migratetype,
+					    skip_hwpoisoned_pages, reuse)) {
 			undo_pfn = pfn;
 			goto undo;
 		}
@@ -222,7 +238,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 * Make isolated pages available again.
 */
int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			unsigned migratetype)
+			unsigned int migratetype, bool reuse)
 {
 	unsigned long pfn;
 	struct page *page;
@@ -236,6 +252,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (!page || !is_migrate_isolate_page(page))
 			continue;
+		if (reuse)
+			clear_pageblock_isolate_skip(page);
 		unset_migratetype_isolate(page, migratetype);
 	}
 	return 0;

From patchwork Wed Sep 19 03:17:45 2018
X-Patchwork-Submitter: Pingfan Liu
X-Patchwork-Id: 10605207
From: Pingfan Liu <kernelfans@gmail.com>
To: linux-mm@kvack.org
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman, Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao, Dan Williams, "H. Peter Anvin", "Kirill A. Shutemov"
Subject: [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock
Date: Wed, 19 Sep 2018 11:17:45 +0800
Message-Id: <1537327066-27852-3-git-send-email-kernelfans@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>
References: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>

Currently, pages are offlined in units of a memblock, and normally this is done one memblock at a time. If there is only one NUMA node, the migration destination pages may come from the next memblock to be offlined, which wastes time during memory offline. On a system with multiple NUMA nodes, if only part of the memory on a node is replaced and the migration destination pages can be allocated from the local node (which is done by patch [3/3]), the same issue arises.

This patch suggests introducing a new state named 'isolate'; the state transition is isolate -> online, or the reverse. Another slight benefit of the 'isolated' state is that no further allocation happens on the memblock, which prevents a potentially unmovable page from being allocated from it again for a long time.

After this patch, the suggested sequence to offline pages looks like:

  for i in {s..e}; do echo isolate > memory$i/state; done
  for i in {s..e}; do echo offline > memory$i/state; done

Since this patch does not change the original offline path,

  for i in {s..e}; do echo offline > memory$i/state; done

still works.

Signed-off-by: Pingfan Liu
Cc: Andrew Morton
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Greg Kroah-Hartman
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Bharata B Rao
Cc: Dan Williams
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
---
 drivers/base/memory.c  | 31 ++++++++++++++++++++++++++++++-
 include/linux/memory.h |  1 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c8a1cb0..3b714be 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -166,6 +167,9 @@ static ssize_t show_mem_state(struct device *dev,
 	case MEM_GOING_OFFLINE:
 		len = sprintf(buf, "going-offline\n");
 		break;
+	case MEM_ISOLATED:
+		len = sprintf(buf, "isolated\n");
+		break;
 	default:
 		len = sprintf(buf, "ERROR-UNKNOWN-%ld\n",
 				mem->state);
@@ -323,6 +327,9 @@ store_mem_state(struct device *dev,
 {
 	struct memory_block *mem = to_memory_block(dev);
 	int ret, online_type;
+	int isolated = 0;
+	unsigned long start_pfn;
+	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;

 	ret = lock_device_hotplug_sysfs();
 	if (ret)
@@ -336,7 +343,13 @@ store_mem_state(struct device *dev,
 		online_type = MMOP_ONLINE_KEEP;
 	else if (sysfs_streq(buf, "offline"))
 		online_type = MMOP_OFFLINE;
-	else {
+	else if (sysfs_streq(buf, "isolate")) {
+		isolated = 1;
+		goto memblock_isolated;
+	} else if (sysfs_streq(buf, "unisolate")) {
+		isolated = -1;
+		goto memblock_isolated;
+	} else {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -366,6 +379,20 @@ store_mem_state(struct device *dev,
 	mem_hotplug_done();
 err:
+memblock_isolated:
+	if (isolated == 1 && mem->state == MEM_ONLINE) {
+		start_pfn = section_nr_to_pfn(mem->start_section_nr);
+		ret = start_isolate_page_range(start_pfn, start_pfn + nr_pages,
+				MIGRATE_MOVABLE, true, true);
+		if (!ret)
+			mem->state = MEM_ISOLATED;
+	} else if (isolated == -1 && mem->state == MEM_ISOLATED) {
+		start_pfn = section_nr_to_pfn(mem->start_section_nr);
+		ret = undo_isolate_page_range(start_pfn, start_pfn + nr_pages,
+				MIGRATE_MOVABLE, true);
+		if (!ret)
+			mem->state = MEM_ONLINE;
+	}
 	unlock_device_hotplug();

 	if (ret < 0)
@@ -455,6 +482,7 @@ static DEVICE_ATTR(phys_index, 0444, show_mem_start_phys_index, NULL);
 static DEVICE_ATTR(state, 0644, show_mem_state, store_mem_state);
 static DEVICE_ATTR(phys_device, 0444, show_phys_device, NULL);
 static DEVICE_ATTR(removable, 0444, show_mem_removable, NULL);
+//static DEVICE_ATTR(isolate, 0600, show_mem_isolate, store_mem_isolate);

 /*
  * Block size attribute stuff
@@ -631,6 +659,7 @@ static struct attribute *memory_memblk_attrs[] = {
 #ifdef CONFIG_MEMORY_HOTREMOVE
 	&dev_attr_valid_zones.attr,
 #endif
+	//&dev_attr_isolate.attr,
 	NULL
 };
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a6ddefc..e00f22c 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -47,6 +47,7 @@ int set_memory_block_size_order(unsigned int order);
 #define MEM_GOING_ONLINE	(1<<3)
 #define MEM_CANCEL_ONLINE	(1<<4)
 #define MEM_CANCEL_OFFLINE	(1<<5)
+#define MEM_ISOLATED		(1<<6)

 struct memory_notify {
 	unsigned long start_pfn;

From patchwork Wed Sep 19 03:17:46 2018
X-Patchwork-Submitter: Pingfan Liu
X-Patchwork-Id: 10605209
From: Pingfan Liu <kernelfans@gmail.com>
To: linux-mm@kvack.org
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman, Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao, Dan Williams, "H. Peter Anvin", "Kirill A. Shutemov"
Subject: [PATCH 3/3] drivers/base/node: create a partial offline hints under each node
Date: Wed, 19 Sep 2018 11:17:46 +0800
Message-Id: <1537327066-27852-4-git-send-email-kernelfans@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>
References: <1537327066-27852-1-git-send-email-kernelfans@gmail.com>

When offlining memory, there are two cases: first, offlining all of the memblocks under a node; second, offlining and replacing only part of the memory under a node. In the second case, there is no need to allocate new pages from other nodes, which may incur extra NUMA faults to resolve the misplacement and put unnecessary memory pressure on other nodes. This patch suggests introducing an interface, /sys/../node/nodeX/partial_offline, to let the user decide how a new page is allocated, i.e. from the local node or from other nodes.

Signed-off-by: Pingfan Liu
Cc: Andrew Morton
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Greg Kroah-Hartman
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Bharata B Rao
Cc: Dan Williams
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
---
 drivers/base/node.c    | 33 +++++++++++++++++++++++++++++++++
 include/linux/mmzone.h |  1 +
 mm/memory_hotplug.c    | 31 +++++++++++++++++++------------
 3 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 1ac4c36..64b0cb8 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -25,6 +25,36 @@ static struct bus_type node_subsys = {
 	.dev_name = "node",
 };

+static ssize_t read_partial_offline(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	int nid = dev->id;
+	struct pglist_data *pgdat = NODE_DATA(nid);
+	ssize_t len = 0;
+
+	if (pgdat->partial_offline)
+		len = sprintf(buf, "1\n");
+	else
+		len = sprintf(buf, "0\n");
+
+	return len;
+}
+
+static ssize_t write_partial_offline(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t count)
+{
+	int nid = dev->id;
+	struct pglist_data *pgdat = NODE_DATA(nid);
+
+	if (sysfs_streq(buf, "1"))
+		pgdat->partial_offline = true;
+	else if (sysfs_streq(buf, "0"))
+		pgdat->partial_offline = false;
+	else
+		return -EINVAL;
+
+	return strlen(buf);
+}

 static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
 {
@@ -56,6 +86,8 @@ static inline ssize_t node_read_cpulist(struct device *dev,
 	return node_read_cpumap(dev, true, buf);
 }

+static DEVICE_ATTR(partial_offline, 0600, read_partial_offline,
+		write_partial_offline);
 static DEVICE_ATTR(cpumap, S_IRUGO, node_read_cpumask, NULL);
 static DEVICE_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);
@@ -235,6 +267,7 @@ static struct attribute *node_dev_attrs[] = {
 	&dev_attr_numastat.attr,
 	&dev_attr_distance.attr,
 	&dev_attr_vmstat.attr,
+	&dev_attr_partial_offline.attr,
 	NULL
 };
 ATTRIBUTE_GROUPS(node_dev);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1e22d96..80c44c8 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -722,6 +722,7 @@ typedef struct pglist_data {
 	/* Per-node vmstats */
 	struct per_cpu_nodestat __percpu
*per_cpu_nodestats; atomic_long_t vm_stat[NR_VM_NODE_STAT_ITEMS]; + bool partial_offline; } pg_data_t; #define node_present_pages(nid) (NODE_DATA(nid)->node_present_pages) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 228de4d..3c66075 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1346,18 +1346,10 @@ static unsigned long scan_movable_pages(unsigned long start, unsigned long end) static struct page *new_node_page(struct page *page, unsigned long private) { - int nid = page_to_nid(page); - nodemask_t nmask = node_states[N_MEMORY]; - - /* - * try to allocate from a different node but reuse this node if there - * are no other online nodes to be used (e.g. we are offlining a part - * of the only existing node) - */ - node_clear(nid, nmask); - if (nodes_empty(nmask)) - node_set(nid, nmask); + nodemask_t nmask = *(nodemask_t *)private; + int nid; + nid = page_to_nid(page); return new_page_nodemask(page, nid, &nmask); } @@ -1371,6 +1363,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn) int not_managed = 0; int ret = 0; LIST_HEAD(source); + int nid; + nodemask_t nmask = node_states[N_MEMORY]; for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) { if (!pfn_valid(pfn)) @@ -1430,8 +1424,21 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn) goto out; } + page = list_entry(source.next, struct page, lru); + nid = page_to_nid(page); + if (!NODE_DATA(nid)->partial_offline) { + /* + * try to allocate from a different node but reuse this + * node if there are no other online nodes to be used + * (e.g. we are offlining a part of the only existing + * node) + */ + node_clear(nid, nmask); + if (nodes_empty(nmask)) + node_set(nid, nmask); + } /* Allocate a new page from the nearest neighbor node */ - ret = migrate_pages(&source, new_node_page, NULL, 0, + ret = migrate_pages(&source, new_node_page, NULL, &nmask, MIGRATE_SYNC, MR_MEMORY_HOTPLUG); if (ret) putback_movable_pages(&source);