From patchwork Fri Jul 13 20:40:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 10524157
Date: Fri, 13 Jul 2018 13:40:02 -0700
From: Andrew Morton
To: Naoya Horiguchi
Cc: linux-mm@kvack.org, Michal Hocko, xishi.qiuxishi@alibaba-inc.com,
 zy.zhengyi@alibaba-inc.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] mm: soft-offline: close the race against page
 allocation
Message-Id: <20180713134002.a365049a79d41be3c28916cc@linux-foundation.org>
In-Reply-To: <1531452366-11661-3-git-send-email-n-horiguchi@ah.jp.nec.com>
References: <1531452366-11661-1-git-send-email-n-horiguchi@ah.jp.nec.com>
 <1531452366-11661-3-git-send-email-n-horiguchi@ah.jp.nec.com>

On Fri, 13 Jul 2018 12:26:06 +0900 Naoya Horiguchi wrote:

> A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
> allocate a page that was just freed on the way of soft-offline.
> This is undesirable because soft-offline (which is about corrected error)
> is less aggressive than hard-offline (which is about uncorrected error),
> and we can make soft-offline fail and keep using the page for good reason
> like "system is busy."
>
> Two main changes of this patch are:
>
> - setting migrate type of the target page to MIGRATE_ISOLATE. As done
>   in free_unref_page_commit(), this makes kernel bypass pcplist when
>   freeing the page. So we can assume that the page is in freelist just
>   after put_page() returns,
>
> - setting PG_hwpoison on free page under zone->lock which protects
>   freelists, so this allows us to avoid setting PG_hwpoison on a page
>   that is decided to be allocated soon.
>
>
> ...
>
> +
> +#ifdef CONFIG_MEMORY_FAILURE
> +/*
> + * Set PG_hwpoison flag if a given page is confirmed to be a free page
> + * within zone lock, which prevents the race against page allocation.
> + */

I think this is clearer?

> +bool set_hwpoison_free_buddy_page(struct page *page)
> +{
> +	struct zone *zone = page_zone(page);
> +	unsigned long pfn = page_to_pfn(page);
> +	unsigned long flags;
> +	unsigned int order;
> +	bool hwpoisoned = false;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +	for (order = 0; order < MAX_ORDER; order++) {
> +		struct page *page_head = page - (pfn & ((1 << order) - 1));
> +
> +		if (PageBuddy(page_head) && page_order(page_head) >= order) {
> +			if (!TestSetPageHWPoison(page))
> +				hwpoisoned = true;
> +			break;
> +		}
> +	}
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +
> +	return hwpoisoned;
> +}
> +#endif

--- a/mm/page_alloc.c~mm-soft-offline-close-the-race-against-page-allocation-fix
+++ a/mm/page_alloc.c
@@ -8039,8 +8039,9 @@ bool is_free_buddy_page(struct page *pag
 
 #ifdef CONFIG_MEMORY_FAILURE
 /*
- * Set PG_hwpoison flag if a given page is confirmed to be a free page
- * within zone lock, which prevents the race against page allocation.
+ * Set PG_hwpoison flag if a given page is confirmed to be a free page.  This
+ * test is performed under the zone lock to prevent a race against page
+ * allocation.
  */
 bool set_hwpoison_free_buddy_page(struct page *page)
 {
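
For anyone following the loop in set_hwpoison_free_buddy_page(), here is a
rough userspace model of the per-order head lookup. It is only a sketch, not
kernel code: the arrays, MODEL_MAX_ORDER, MODEL_NPAGES and the model_*()
helpers are all made up for illustration, and there is no locking because the
model is single-threaded (the real function does this under zone->lock).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_ORDER	11	/* assumed; stands in for MAX_ORDER */
#define MODEL_NPAGES	64	/* hypothetical tiny "zone" */

/* free_order[pfn] >= 0 means a free buddy block of that order starts at pfn. */
int free_order[MODEL_NPAGES];
bool poisoned[MODEL_NPAGES];

/* Mark every page as allocated and unpoisoned. */
void model_reset(void)
{
	for (unsigned long pfn = 0; pfn < MODEL_NPAGES; pfn++) {
		free_order[pfn] = -1;
		poisoned[pfn] = false;
	}
}

/* Record a free block of the given order; head must be order-aligned. */
void model_free_block(unsigned long head, int order)
{
	assert(head < MODEL_NPAGES && (head & ((1UL << order) - 1)) == 0);
	free_order[head] = order;
}

/*
 * For each order, mask off the low bits of pfn to find the candidate block
 * head, and accept it only if a free block starting there is large enough
 * to cover pfn. Returns true only if this call newly set the poison flag.
 */
bool model_set_poison_if_free(unsigned long pfn)
{
	assert(pfn < MODEL_NPAGES);
	for (unsigned int order = 0; order < MODEL_MAX_ORDER; order++) {
		unsigned long head = pfn & ~((1UL << order) - 1);

		if (free_order[head] >= (int)order) {
			bool was_poisoned = poisoned[pfn];

			poisoned[pfn] = true;
			return !was_poisoned;
		}
	}
	return false;
}
```

With an order-3 block freed at pfn 8 (covering pfns 8..15), calling
model_set_poison_if_free(13) returns true the first time and false the second
time, and model_set_poison_if_free(20) returns false because no free block
covers that pfn.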