From patchwork Mon Jun 11 14:06:09 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Matthew Wilcox (Oracle)"
X-Patchwork-Id: 10457957
From: Matthew Wilcox <willy@infradead.org>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org
Cc: Matthew Wilcox, Jan Kara, Jeff Layton, Lukas Czerner, Ross Zwisler,
 Christoph Hellwig, Goldwyn Rodrigues, Nicholas Piggin, Ryusuke Konishi,
 linux-nilfs@vger.kernel.org, Jaegeuk Kim, Chao Yu,
 linux-f2fs-devel@lists.sourceforge.net
Subject: [PATCH v13 42/72] mm: Convert collapse_shmem to XArray
Date: Mon, 11 Jun 2018 07:06:09 -0700
Message-Id: <20180611140639.17215-43-willy@infradead.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180611140639.17215-1-willy@infradead.org>
References: <20180611140639.17215-1-willy@infradead.org>

From: Matthew Wilcox <willy@infradead.org>

I found another victim of the radix tree being hard to use.
Because there was no call to radix_tree_preload(), khugepaged was
allocating radix_tree_nodes using GFP_ATOMIC.

I also converted a local_irq_save()/restore() pair to
disable()/enable().

Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
 mm/khugepaged.c | 158 ++++++++++++++++++++----------------------
 1 file changed, 65 insertions(+), 93 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d6992ba1f604..43598cc5998b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1287,17 +1287,17 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and freeze a new huge page;
- *  - scan over radix tree replacing old pages the new one
+ *  - scan page cache replacing old pages with the new one
  *    + swap in pages if necessary;
  *    + fill in gaps;
- *    + keep old pages around in case if rollback is required;
- *  - if replacing succeed:
+ *    + keep old pages around in case rollback is required;
+ *  - if replacing succeeds:
  *    + copy data over;
  *    + free old pages;
  *    + unfreeze huge page;
  *  - if replacing failed;
  *    + put all pages back and unfreeze them;
- *    + restore gaps in the radix-tree;
+ *    + restore gaps in the page cache;
  *    + free huge page;
  */
 static void collapse_shmem(struct mm_struct *mm,
@@ -1305,12 +1305,11 @@ static void collapse_shmem(struct mm_struct *mm,
                 struct page **hpage, int node)
 {
         gfp_t gfp;
-        struct page *page, *new_page, *tmp;
+        struct page *new_page;
         struct mem_cgroup *memcg;
         pgoff_t index, end = start + HPAGE_PMD_NR;
         LIST_HEAD(pagelist);
-        struct radix_tree_iter iter;
-        void **slot;
+        XA_STATE(xas, &mapping->i_pages, start);
         int nr_none = 0, result = SCAN_SUCCEED;
 
         VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
@@ -1335,48 +1334,48 @@ static void collapse_shmem(struct mm_struct *mm,
         __SetPageLocked(new_page);
         BUG_ON(!page_ref_freeze(new_page, 1));
-
         /*
-         * At this point the new_page is 'frozen' (page_count() is zero), locked
-         * and not up-to-date. It's safe to insert it into radix tree, because
-         * nobody would be able to map it or use it in other way until we
-         * unfreeze it.
+         * At this point the new_page is 'frozen' (page_count() is zero),
+         * locked and not up-to-date. It's safe to insert it into the page
+         * cache, because nobody would be able to map it or use it in other
+         * way until we unfreeze it.
          */
 
-        index = start;
-        xa_lock_irq(&mapping->i_pages);
-        radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
-                int n = min(iter.index, end) - index;
-
-                /*
-                 * Handle holes in the radix tree: charge it from shmem and
-                 * insert relevant subpage of new_page into the radix-tree.
-                 */
-                if (n && !shmem_charge(mapping->host, n)) {
-                        result = SCAN_FAIL;
+        /* This will be less messy when we use multi-index entries */
+        do {
+                xas_lock_irq(&xas);
+                xas_create_range(&xas, end - 1);
+                if (!xas_error(&xas))
                         break;
-                }
-                nr_none += n;
-                for (; index < min(iter.index, end); index++) {
-                        radix_tree_insert(&mapping->i_pages, index,
-                                        new_page + (index % HPAGE_PMD_NR));
-                }
+                xas_unlock_irq(&xas);
+                if (!xas_nomem(&xas, GFP_KERNEL))
+                        goto out;
+        } while (1);
 
-                /* We are done. */
-                if (index >= end)
-                        break;
+        for (index = start; index < end; index++) {
+                struct page *page = xas_next(&xas);
+
+                VM_BUG_ON(index != xas.xa_index);
+                if (!page) {
+                        if (!shmem_charge(mapping->host, 1)) {
+                                result = SCAN_FAIL;
+                                break;
+                        }
+                        xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
+                        nr_none++;
+                        continue;
+                }
 
-                page = radix_tree_deref_slot_protected(slot,
-                                &mapping->i_pages.xa_lock);
                 if (xa_is_value(page) || !PageUptodate(page)) {
-                        xa_unlock_irq(&mapping->i_pages);
+                        xas_unlock_irq(&xas);
                         /* swap in or instantiate fallocated page */
                         if (shmem_getpage(mapping->host, index, &page,
                                                 SGP_NOHUGE)) {
                                 result = SCAN_FAIL;
-                                goto tree_unlocked;
+                                goto xa_unlocked;
                         }
-                        xa_lock_irq(&mapping->i_pages);
+                        xas_lock_irq(&xas);
+                        xas_set(&xas, index);
                 } else if (trylock_page(page)) {
                         get_page(page);
                 } else {
@@ -1396,7 +1395,7 @@ static void collapse_shmem(struct mm_struct *mm,
                         result = SCAN_TRUNCATED;
                         goto out_unlock;
                 }
-                xa_unlock_irq(&mapping->i_pages);
+                xas_unlock_irq(&xas);
 
                 if (isolate_lru_page(page)) {
                         result = SCAN_DEL_PAGE_LRU;
@@ -1406,17 +1405,16 @@ static void collapse_shmem(struct mm_struct *mm,
                 if (page_mapped(page))
                         unmap_mapping_pages(mapping, index, 1, false);
 
-                xa_lock_irq(&mapping->i_pages);
+                xas_lock(&xas);
+                xas_set(&xas, index);
 
-                slot = radix_tree_lookup_slot(&mapping->i_pages, index);
-                VM_BUG_ON_PAGE(page != radix_tree_deref_slot_protected(slot,
-                                        &mapping->i_pages.xa_lock), page);
+                VM_BUG_ON_PAGE(page != xas_load(&xas), page);
                 VM_BUG_ON_PAGE(page_mapped(page), page);
 
                 /*
                  * The page is expected to have page_count() == 3:
                  *  - we hold a pin on it;
-                 *  - one reference from radix tree;
+                 *  - one reference from page cache;
                  *  - one from isolate_lru_page;
                  */
                 if (!page_ref_freeze(page, 3)) {
@@ -1431,56 +1429,30 @@ static void collapse_shmem(struct mm_struct *mm,
                 list_add_tail(&page->lru, &pagelist);
 
                 /* Finally, replace with the new page. */
-                radix_tree_replace_slot(&mapping->i_pages, slot,
-                                new_page + (index % HPAGE_PMD_NR));
-
-                slot = radix_tree_iter_resume(slot, &iter);
-                index++;
+                xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
                 continue;
 out_lru:
-                xa_unlock_irq(&mapping->i_pages);
+                xas_unlock_irq(&xas);
                 putback_lru_page(page);
 out_isolate_failed:
                 unlock_page(page);
                 put_page(page);
-                goto tree_unlocked;
+                goto xa_unlocked;
 out_unlock:
                 unlock_page(page);
                 put_page(page);
                 break;
         }
+        xas_unlock_irq(&xas);
 
-        /*
-         * Handle hole in radix tree at the end of the range.
-         * This code only triggers if there's nothing in radix tree
-         * beyond 'end'.
-         */
-        if (result == SCAN_SUCCEED && index < end) {
-                int n = end - index;
-
-                if (!shmem_charge(mapping->host, n)) {
-                        result = SCAN_FAIL;
-                        goto tree_locked;
-                }
-
-                for (; index < end; index++) {
-                        radix_tree_insert(&mapping->i_pages, index,
-                                        new_page + (index % HPAGE_PMD_NR));
-                }
-                nr_none += n;
-        }
-
-tree_locked:
-        xa_unlock_irq(&mapping->i_pages);
-tree_unlocked:
-
+xa_unlocked:
         if (result == SCAN_SUCCEED) {
-                unsigned long flags;
+                struct page *page, *tmp;
                 struct zone *zone = page_zone(new_page);
 
                 /*
-                 * Replacing old pages with new one has succeed, now we need to
-                 * copy the content and free old pages.
+                 * Replacing old pages with new one has succeeded, now we
+                 * need to copy the content and free the old pages.
                  */
                 list_for_each_entry_safe(page, tmp, &pagelist, lru) {
                         copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
@@ -1494,16 +1466,16 @@ static void collapse_shmem(struct mm_struct *mm,
                         put_page(page);
                 }
 
-                local_irq_save(flags);
+                local_irq_disable();
                 __inc_node_page_state(new_page, NR_SHMEM_THPS);
                 if (nr_none) {
                         __mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
                         __mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
                 }
-                local_irq_restore(flags);
+                local_irq_enable();
 
                 /*
-                 * Remove pte page tables, so we can re-faulti
+                 * Remove pte page tables, so we can re-fault
                  * the page as huge.
                  */
                 retract_page_tables(mapping, start);
@@ -1518,37 +1490,37 @@ static void collapse_shmem(struct mm_struct *mm,
 
                 *hpage = NULL;
         } else {
-                /* Something went wrong: rollback changes to the radix-tree */
+                struct page *page;
+                /* Something went wrong: roll back page cache changes */
                 shmem_uncharge(mapping->host, nr_none);
-                xa_lock_irq(&mapping->i_pages);
-                radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
-                        if (iter.index >= end)
-                                break;
+                xas_lock_irq(&xas);
+                xas_set(&xas, start);
+                xas_for_each(&xas, page, end - 1) {
                         page = list_first_entry_or_null(&pagelist,
                                         struct page, lru);
-                        if (!page || iter.index < page->index) {
+                        if (!page || xas.xa_index < page->index) {
                                 if (!nr_none)
                                         break;
                                 nr_none--;
                                 /* Put holes back where they were */
-                                radix_tree_delete(&mapping->i_pages, iter.index);
+                                xas_store(&xas, NULL);
                                 continue;
                         }
 
-                        VM_BUG_ON_PAGE(page->index != iter.index, page);
+                        VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 
                         /* Unfreeze the page. */
                         list_del(&page->lru);
                         page_ref_unfreeze(page, 2);
-                        radix_tree_replace_slot(&mapping->i_pages, slot, page);
-                        slot = radix_tree_iter_resume(slot, &iter);
-                        xa_unlock_irq(&mapping->i_pages);
+                        xas_store(&xas, page);
+                        xas_pause(&xas);
+                        xas_unlock_irq(&xas);
                         putback_lru_page(page);
                         unlock_page(page);
-                        xa_lock_irq(&mapping->i_pages);
+                        xas_lock_irq(&xas);
                 }
                 VM_BUG_ON(nr_none);
-                xa_unlock_irq(&mapping->i_pages);
+                xas_unlock_irq(&xas);
 
                 /* Unfreeze new_page, caller would take care about freeing it */
                 page_ref_unfreeze(new_page, 1);
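
A note for readers new to the XArray advanced API (this is not part of the
patch itself): the do/while loop added above is the idiomatic replacement for
radix_tree_preload().  xas_create_range() and xas_store() record an ENOMEM
error in the xa_state if they cannot allocate a node while the lock is held,
and xas_nomem() then allocates with GFP_KERNEL outside the lock and returns
true to request a retry.  A minimal stand-alone sketch of the same pattern
(store_one() is a hypothetical helper, not something in this series):

        #include <linux/gfp.h>
        #include <linux/xarray.h>

        /* Store @entry at @index, allocating any needed nodes with GFP_KERNEL. */
        static int store_one(struct xarray *xa, unsigned long index, void *entry)
        {
                XA_STATE(xas, xa, index);

                do {
                        xas_lock_irq(&xas);
                        /* May record -ENOMEM in @xas if node allocation fails. */
                        xas_store(&xas, entry);
                        xas_unlock_irq(&xas);
                        /* Allocate outside the lock; returns true to retry. */
                } while (xas_nomem(&xas, GFP_KERNEL));

                return xas_error(&xas);
        }

This is why the conversion no longer needs GFP_ATOMIC node allocations in
khugepaged, as noted in the commit message.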