From patchwork Wed Nov 7 10:18:29 2018
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 10672121
From: Michal Hocko
Cc: Andrew Morton, Oscar Salvador, Baoquan He, LKML, Michal Hocko
Subject: [RFC PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
Date: Wed, 7 Nov 2018 11:18:29 +0100
Message-Id: <20181107101830.17405-5-mhocko@kernel.org>
In-Reply-To: <20181107101830.17405-1-mhocko@kernel.org>
References: <20181107101830.17405-1-mhocko@kernel.org>

From: Michal Hocko

The memory offlining failure reporting is inconsistent and insufficient.
Some error paths simply do not report the failure to the log at all.
When we do report, there are no details about the reason for the failure,
and since there are several possible reasons, memory offlining failures
are hard to debug.

Make sure that the
	memory offlining [mem %#010llx-%#010llx] failed
message is printed for all failures and also provide a short textual
reason for the failure, e.g.

[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

This tells us that the offlining has failed because of a pending signal,
i.e. user intervention.

Signed-off-by: Michal Hocko
---
 mm/memory_hotplug.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a92b1b8f6218..1badac89c58e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1553,6 +1553,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	unsigned long valid_start, valid_end;
 	struct zone *zone;
 	struct memory_notify arg;
+	char *reason;
 
 	mem_hotplug_begin();
 
@@ -1561,7 +1562,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
 				  &valid_end)) {
 		mem_hotplug_done();
-		return -EINVAL;
+		ret = -EINVAL;
+		reason = "multizone range";
+		goto failed_removal;
 	}
 
 	zone = page_zone(pfn_to_page(valid_start));
@@ -1573,7 +1576,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 				       MIGRATE_MOVABLE, true);
 	if (ret) {
 		mem_hotplug_done();
-		return ret;
+		reason = "failed to isolate range";
+		goto failed_removal;
 	}
 
 	arg.start_pfn = start_pfn;
@@ -1582,15 +1586,19 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
 	ret = notifier_to_errno(ret);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "notifiers failure";
+		goto failed_removal_isolated;
+	}
 
 	pfn = start_pfn;
 repeat:
 	/* start memory hot removal */
 	ret = -EINTR;
-	if (signal_pending(current))
-		goto failed_removal;
+	if (signal_pending(current)) {
+		reason = "signal backoff";
+		goto failed_removal_isolated;
+	}
 
 	cond_resched();
 	lru_add_drain_all();
@@ -1607,8 +1615,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * actually in order to make hugetlbfs's object counting consistent.
 	 */
 	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "fails to dissolve hugetlb pages";
+		goto failed_removal_isolated;
+	}
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0)
@@ -1648,13 +1658,15 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_done();
 	return 0;
 
+failed_removal_isolated:
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 failed_removal:
-	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
+	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
-		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
+		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
+		 reason);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	mem_hotplug_done();
 	return ret;
 }
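
For readers who want to try the error-path layout outside the kernel, here is a minimal userspace sketch of the pattern used in this patch: each failure site records a short reason string, then jumps to one of two stacked labels depending on whether the range has already been isolated. It is only an illustration, not the kernel code. offline_range_sketch(), enum fail_point and range_isolated are hypothetical names, PAGE_SHIFT is assumed to be 12 (4 KiB pages), the error codes for the isolation and notifier cases are placeholders, and snprintf() stands in for pr_debug().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical failure points, used only to drive the sketch. */
enum fail_point {
	FAIL_NONE,
	FAIL_MULTIZONE,
	FAIL_ISOLATE,
	FAIL_NOTIFIER,
	FAIL_SIGNAL,
};

/* Stands in for the isolation state that undo_isolate_page_range() reverts. */
int range_isolated;

/*
 * Simulate offlining [start_pfn, end_pfn). On failure, write the log message
 * into msg and return a negative errno; on success return 0 and leave msg
 * empty.
 */
int offline_range_sketch(unsigned long start_pfn, unsigned long end_pfn,
			 enum fail_point fail, char *msg, size_t len)
{
	const char *reason;
	int ret;

	msg[0] = '\0';

	/* Failures before isolation: nothing to undo yet. */
	if (fail == FAIL_MULTIZONE) {
		ret = -EINVAL;
		reason = "multizone range";
		goto failed_removal;
	}
	if (fail == FAIL_ISOLATE) {
		ret = -EBUSY;	/* placeholder error code */
		reason = "failed to isolate range";
		goto failed_removal;
	}
	range_isolated = 1;

	/* Failures after isolation: the isolation has to be reverted. */
	if (fail == FAIL_NOTIFIER) {
		ret = -ENOMEM;	/* placeholder error code */
		reason = "notifiers failure";
		goto failed_removal_isolated;
	}
	if (fail == FAIL_SIGNAL) {
		ret = -EINTR;
		reason = "signal backoff";
		goto failed_removal_isolated;
	}
	return 0;

failed_removal_isolated:
	range_isolated = 0;	/* undo the isolation, then fall through */
failed_removal:
	/* The range is printed as inclusive physical addresses. */
	snprintf(msg, len,
		 "memory offlining [mem %#010llx-%#010llx] failed due to %s",
		 (unsigned long long)start_pfn << PAGE_SHIFT,
		 ((unsigned long long)end_pfn << PAGE_SHIFT) - 1,
		 reason);
	return ret;
}
```

Calling offline_range_sketch(0x82600000UL, 0x82680000UL, FAIL_SIGNAL, buf, sizeof(buf)) returns -EINTR and fills buf with the same range and reason as the log line quoted in the changelog. Stacking the labels keeps the undo step on exactly the paths that need it while sharing a single reporting site.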