From patchwork Mon Nov 26 23:29:17 2018
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 10699355
Date: Mon, 26 Nov 2018 15:29:17 -0800 (PST)
From: Hugh Dickins
To: Andrew Morton
Cc: "Kirill A. Shutemov", Matthew Wilcox, linux-mm@kvack.org
Subject: [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page

khugepaged's collapse_shmem() does almost all of its work, to assemble
the huge new_page from 512 scattered old pages, with the new_page's
refcount frozen to 0 (and the refcounts of all old pages so far also
frozen to 0). That work includes shmem_getpage() to read in any pages
which were out on swap, memory reclaim if necessary to allocate their
intermediate pages, and copying over all the data from old to new.

Imagine the frozen refcount as a spinlock held, but without any lock
debugging to highlight the abuse: it's not good, and under serious load
it heads into lockups - speculative getters of the page are not
expecting to spin while khugepaged is rescheduled.

One can get a little further under load by hacking around elsewhere;
but fortunately, freezing the new_page turns out to have been entirely
unnecessary, with no hacks needed elsewhere.

The huge new_page lock is already held throughout, and guards all its
subpages as they are brought one by one into the page cache tree; and
anything reading the data in that page, without the lock, before it has
been marked PageUptodate, would already be in the wrong. So simply
eliminate the freezing of the new_page.

Each of the old pages remains frozen with refcount 0 after it has been
replaced by a new_page subpage in the page cache tree, until they are
all unfrozen on success or failure: just as before. They could be
unfrozen sooner, but they cause no problem once they are no longer
visible to find_get_entry(), filemap_map_pages() and other speculative
lookups.

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins
Cc: Kirill A. Shutemov
Cc: stable@vger.kernel.org # 4.8+
Acked-by: Kirill A. Shutemov
---
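To make the failure mode above concrete, here is a minimal userspace
sketch of the "frozen refcount as spinlock" behaviour. It is an
analogue, not kernel code: refcount, get_unless_zero(),
speculative_lookup() and the one-second sleep are hypothetical
stand-ins for the page refcount, page_cache_get_speculative(),
find_get_entry() and khugepaged being rescheduled.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

static atomic_int refcount = 1;

/* Analogue of page_cache_get_speculative(): take a reference only if
 * the count is not (frozen to) zero. */
static bool get_unless_zero(void)
{
        int old = atomic_load(&refcount);

        while (old != 0) {
                if (atomic_compare_exchange_weak(&refcount, &old, old + 1))
                        return true;
        }
        return false;
}

/* Analogue of a speculative lookup: retry until the get succeeds.
 * While another thread holds the count frozen to 0, this loop spins,
 * exactly like waiting on a spinlock that lock debugging cannot see. */
static void *speculative_lookup(void *arg)
{
        while (!get_unless_zero())
                ;
        return NULL;
}

int main(void)
{
        pthread_t t;
        int one = 1;

        /* Analogue of page_ref_freeze(page, 1): 1 -> 0. */
        atomic_compare_exchange_strong(&refcount, &one, 0);
        pthread_create(&t, NULL, speculative_lookup, NULL);
        sleep(1);                       /* "khugepaged is rescheduled" */
        atomic_store(&refcount, 1);     /* analogue of unfreeze */
        pthread_join(t, NULL);          /* returns only after unfreeze */
        return 0;
}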
 mm/khugepaged.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 9d4e9ff1af95..55930cbed3fd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1287,7 +1287,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
- *  - allocate and freeze a new huge page;
+ *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
  *    + swap in pages if necessary;
  *    + fill in gaps;
@@ -1295,11 +1295,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *  - if replacing succeeds:
  *    + copy data over;
  *    + free old pages;
- *    + unfreeze huge page;
+ *    + unlock huge page;
  *  - if replacing failed;
  *    + put all pages back and unfreeze them;
  *    + restore gaps in the page cache;
- *    + free huge page;
+ *    + unlock and free huge page;
  */
 static void collapse_shmem(struct mm_struct *mm,
 		struct address_space *mapping, pgoff_t start,
@@ -1333,13 +1333,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	BUG_ON(!page_ref_freeze(new_page, 1));
 
 	/*
-	 * At this point the new_page is 'frozen' (page_count() is zero),
-	 * locked and not up-to-date. It's safe to insert it into the page
-	 * cache, because nobody would be able to map it or use it in other
-	 * way until we unfreeze it.
+	 * At this point the new_page is locked and not up-to-date.
+	 * It's safe to insert it into the page cache, because nobody would
+	 * be able to map it or use it in another way until we unlock it.
 	 */
 
 	/* This will be less messy when we use multi-index entries */
@@ -1491,9 +1489,8 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		/* Everything is ready, let's unfreeze the new_page */
 		SetPageUptodate(new_page);
-		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		page_ref_add(new_page, HPAGE_PMD_NR - 1);
 		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
@@ -1541,8 +1538,6 @@ static void collapse_shmem(struct mm_struct *mm,
 		VM_BUG_ON(nr_none);
 		xas_unlock_irq(&xas);
 
-		/* Unfreeze new_page, caller would take care about freeing it */
-		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
 		new_page->mapping = NULL;
 	}
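As a worked sketch of the refcount arithmetic in the hunks above
(assuming new_page enters collapse_shmem() holding the single
reference taken when it was allocated):

/*
 * Success path:
 *   before: page_ref_freeze(new_page, 1)               1 -> 0
 *           page_ref_unfreeze(new_page, HPAGE_PMD_NR)  0 -> 512
 *   after:  page_ref_add(new_page, HPAGE_PMD_NR - 1)   1 -> 512
 *
 * Failure path:
 *   before: page_ref_freeze(new_page, 1)               1 -> 0
 *           page_ref_unfreeze(new_page, 1)             0 -> 1
 *   after:  refcount simply never leaves 1             1 -> 1
 *
 * Both versions end at the same count (HPAGE_PMD_NR is 512 with 4K
 * pages and 2M PMDs), so dropping the freeze preserves the final
 * refcounts while removing the window in which the count is 0.
 */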