From patchwork Tue Jun 26 14:54:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Nathan Fontenot
X-Patchwork-Id: 10489259
Subject: Re: [powerpc/powervm]kernel BUG at mm/memory_hotplug.c:1864!
To: Balbir Singh, vrbagal1, Oscar Salvador
Cc: sachinp, Linuxppc-dev, linux-mm@kvack.org, linux-next, linuxppc-dev
References: <6826dab0e4382380db8d11b047272bda@linux.vnet.ibm.com>
 <20180608112823.GA20395@techadventures.net>
 <3d1e7740df56ed35c8b56941acdb7079@linux.vnet.ibm.com>
 <20180608121553.GA20774@techadventures.net>
 <0aac625ee724d877b87c69bba5ac9a0e@linux.vnet.ibm.com>
 <605b4df2-4cf1-2dda-3661-68b78845f8ec@gmail.com>
From: Nathan Fontenot
Date: Tue, 26 Jun 2018 09:54:53 -0500
In-Reply-To: <605b4df2-4cf1-2dda-3661-68b78845f8ec@gmail.com>
Message-Id: <345785ef-5da2-b2e8-78b8-2391b54c6141@linux.vnet.ibm.com>

On 06/12/2018 05:28 AM, Balbir Singh wrote:
>
>
> On 11/06/18 17:41, vrbagal1 wrote:
>> On 2018-06-08 17:45, Oscar Salvador wrote:
>>> On Fri, Jun 08, 2018 at 05:11:24PM +0530, vrbagal1 wrote:
>>>> On 2018-06-08 16:58, Oscar Salvador wrote:
>>>>> On Fri, Jun 08, 2018 at 04:44:24PM +0530, vrbagal1 wrote:
>>>>>> Greetings!!!
>>>>>>
>>>>>> I am seeing kernel bug followed by oops message and system reboots,
>>>>>> while running dlpar memory hotplug test.
>>>>>>
>>>>>> Machine Details: Power6 PowerVM Platform
>>>>>> GCC version: (gcc version 4.8.3 20140911 (Red Hat 4.8.3-7) (GCC))
>>>>>> Test case: dlpar memory hotplug test (https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py)
>>>>>> Kernel Version: Linux version 4.17.0-autotest
>>>>>>
>>>>>> I am seeing this bug on rc7 as well.
>>
>> Observing similar traces on linux next kernel: 4.17.0-next-20180608-autotest
>>
>> Block size [0x4000000] unaligned hotplug range: start 0x220000000, size 0x1000000
>
> size < block_size in this case, why? how? Could you confirm that the block size is 64MB and your trying to remove 16MB
>

I was not able to re-create this failure exactly (I don't have a Power6
system), but I was able to get a similar failure on a Power9 system with
a few modifications.

I think the issue you're seeing is due to a change in the validation
done in remove_memory, which now ensures that the range being removed
spans entire memory blocks. The pseries memory remove code (see
pseries_remove_memblock) tries to remove each section of a memory block
individually instead of removing the entire memory block.

Could you try the patch below, which updates the pseries code to remove
the entire memory block instead of doing it one section at a time?

-Nathan

---
 arch/powerpc/platforms/pseries/hotplug-memory.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index c1578f54c626..6072efc793e1 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -316,11 +316,11 @@ static int dlpar_offline_lmb(struct drmem_lmb *lmb)
 	return dlpar_change_lmb_state(lmb, false);
 }
 
-static int pseries_remove_memblock(unsigned long base, unsigned int memblock_size)
+static int pseries_remove_memblock(unsigned long base,
+				   unsigned int memblock_sz)
 {
-	unsigned long block_sz, start_pfn;
-	int sections_per_block;
-	int i, nid;
+	unsigned long start_pfn;
+	int nid;
 
 	start_pfn = base >> PAGE_SHIFT;
 
@@ -329,18 +329,12 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
 	if (!pfn_valid(start_pfn))
 		goto out;
 
-	block_sz = pseries_memory_block_size();
-	sections_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
 	nid = memory_add_physaddr_to_nid(base);
-
-	for (i = 0; i < sections_per_block; i++) {
-		remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
-		base += MIN_MEMORY_BLOCK_SIZE;
-	}
+	remove_memory(nid, base, memblock_sz);
 
 out:
 	/* Update memory regions for memory remove */
-	memblock_remove(base, memblock_size);
+	memblock_remove(base, memblock_sz);
 	unlock_device_hotplug();
 	return 0;
 }
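
For reference, the arithmetic behind the reported log line can be sketched
in user-space C. This is only an illustration, not the kernel code: the
helper names range_is_block_aligned and rejected_section_removals are
hypothetical, and the sketch assumes (as the log message suggests) that
the validation rejects any range whose start or size is not a multiple of
the memory block size. The constants are the block size (0x4000000, 64MB)
and the per-call size (0x1000000, 16MB) from the trace quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the log line quoted in this thread. */
#define BLOCK_SIZE   0x4000000ULL	/* 64MB memory block */
#define SECTION_SIZE 0x1000000ULL	/* 16MB, the size passed per call */

/*
 * Hypothetical stand-in for the range validation (an assumption, not the
 * kernel function): a range is accepted only if it is non-empty and both
 * its start and its size are multiples of the memory block size.
 */
static bool range_is_block_aligned(uint64_t start, uint64_t size,
				   uint64_t block_size)
{
	/* The mask test below requires a power-of-two block size. */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	return size != 0 &&
	       (start & (block_size - 1)) == 0 &&
	       (size & (block_size - 1)) == 0;
}

/*
 * Walk one memory block section by section, the way the old loop did,
 * and count how many of those per-section ranges would be rejected.
 */
static unsigned int rejected_section_removals(uint64_t base)
{
	unsigned int rejected = 0;

	for (uint64_t off = 0; off < BLOCK_SIZE; off += SECTION_SIZE)
		if (!range_is_block_aligned(base + off, SECTION_SIZE, BLOCK_SIZE))
			rejected++;

	return rejected;
}
```

Under these assumptions, the start address 0x220000000 is itself a
multiple of 64MB, so it is the 16MB size that fails the check. Every
per-section call is rejected, while a single call covering the whole 64MB
block passes, which is what the patch switches to.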