From patchwork Mon Jun 11 14:06:30 2018
X-Patchwork-Submitter: "Matthew Wilcox (Oracle)" <willy@infradead.org>
X-Patchwork-Id: 10458025
From: Matthew Wilcox <willy@infradead.org>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Cc: Matthew Wilcox, Jan Kara, Jeff Layton, Lukas Czerner, Ross Zwisler,
    Christoph Hellwig, Goldwyn Rodrigues, Nicholas Piggin,
    Ryusuke Konishi, linux-nilfs@vger.kernel.org, Jaegeuk Kim, Chao Yu,
    linux-f2fs-devel@lists.sourceforge.net
Subject: [PATCH v13 63/72] dax: Convert dax_insert_pfn_mkwrite to XArray
Date: Mon, 11 Jun 2018 07:06:30 -0700
Message-Id: <20180611140639.17215-64-willy@infradead.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180611140639.17215-1-willy@infradead.org>
References: <20180611140639.17215-1-willy@infradead.org>

From: Matthew Wilcox <willy@infradead.org>

Add some XArray-based helper functions to replace the radix tree based
metaphors currently in
use.  The biggest change is that converted code doesn't see its own lock
bit; get_unlocked_entry() always returns an entry with the lock bit
clear, and locking the entry now returns void.  So we don't have to mess
around loading the current entry and clearing the lock bit; we can just
store the entry that we were using.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
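In outline, the conversion changes the calling pattern like this (a
condensed, illustrative sketch based on the dax_insert_pfn_mkwrite()
hunk below; the dirty-tag update and error paths are omitted, and "xas"
stands for the function's XA_STATE):

	/* Before: radix tree; the caller sees and manages the lock bit */
	xa_lock_irq(&mapping->i_pages);
	entry = get_unlocked_mapping_entry(mapping, index, &slot);
	entry = lock_slot(mapping, slot);	/* returns the locked form */
	xa_unlock_irq(&mapping->i_pages);
	/* ... use entry ... */
	put_locked_mapping_entry(mapping, index);

	/* After: XArray; the lock bit never escapes the helpers */
	xas_lock_irq(&xas);
	entry = get_unlocked_entry(&xas);	/* lock bit always clear */
	dax_lock_entry(&xas, entry);		/* sets DAX_LOCKED, returns void */
	xas_unlock_irq(&xas);
	/* ... use entry ... */
	put_locked_entry(&xas, entry);		/* stores 'entry' back as-is */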
 fs/dax.c | 145 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 113 insertions(+), 32 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 7d0712c45da5..1c9b736147d3 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -38,6 +38,17 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/fs_dax.h>
 
+static inline unsigned int pe_order(enum page_entry_size pe_size)
+{
+	if (pe_size == PE_SIZE_PTE)
+		return PAGE_SHIFT - PAGE_SHIFT;
+	if (pe_size == PE_SIZE_PMD)
+		return PMD_SHIFT - PAGE_SHIFT;
+	if (pe_size == PE_SIZE_PUD)
+		return PUD_SHIFT - PAGE_SHIFT;
+	return ~0;
+}
+
 /* We choose 4096 entries - same as per-zone page wait tables */
 #define DAX_WAIT_TABLE_BITS 12
 #define DAX_WAIT_TABLE_ENTRIES (1 << DAX_WAIT_TABLE_BITS)
@@ -46,6 +57,9 @@
 #define PG_PMD_COLOUR	((PMD_SIZE >> PAGE_SHIFT) - 1)
 #define PG_PMD_NR	(PMD_SIZE >> PAGE_SHIFT)
 
+/* The order of a PMD entry */
+#define PMD_ORDER	(PMD_SHIFT - PAGE_SHIFT)
+
 static wait_queue_head_t wait_table[DAX_WAIT_TABLE_ENTRIES];
 
 static int __init init_dax_wait_table(void)
@@ -85,10 +99,15 @@ static void *dax_make_locked(unsigned long pfn, unsigned long flags)
 			DAX_LOCKED);
 }
 
+static bool dax_is_locked(void *entry)
+{
+	return xa_to_value(entry) & DAX_LOCKED;
+}
+
 static unsigned int dax_entry_order(void *entry)
 {
 	if (xa_to_value(entry) & DAX_PMD)
-		return PMD_SHIFT - PAGE_SHIFT;
+		return PMD_ORDER;
 	return 0;
 }
 
@@ -181,6 +200,77 @@ static void dax_wake_mapping_entry_waiter(struct xarray *xa,
 	__wake_up(wq, TASK_NORMAL, wake_all ? 0 : 1, &key);
 }
 
+static void dax_wake_entry(struct xa_state *xas, void *entry, bool wake_all)
+{
+	return dax_wake_mapping_entry_waiter(xas->xa, xas->xa_index, entry,
+								wake_all);
+}
+
+/*
+ * Look up entry in page cache, wait for it to become unlocked if it
+ * is a DAX entry and return it.  The caller must subsequently call
+ * put_unlocked_entry() if it did not lock the entry or put_locked_entry()
+ * if it did.
+ *
+ * Must be called with the i_pages lock held.
+ */
+static void *get_unlocked_entry(struct xa_state *xas)
+{
+	void *entry;
+	struct wait_exceptional_entry_queue ewait;
+	wait_queue_head_t *wq;
+
+	init_wait(&ewait.wait);
+	ewait.wait.func = wake_exceptional_entry_func;
+
+	for (;;) {
+		entry = xas_load(xas);
+		if (!entry || WARN_ON_ONCE(!xa_is_value(entry)) ||
+				!dax_is_locked(entry))
+			return entry;
+
+		wq = dax_entry_waitqueue(xas->xa, xas->xa_index, entry,
+				&ewait.key);
+		prepare_to_wait_exclusive(wq, &ewait.wait,
+					  TASK_UNINTERRUPTIBLE);
+		xas_unlock_irq(xas);
+		xas_reset(xas);
+		schedule();
+		finish_wait(wq, &ewait.wait);
+		xas_lock_irq(xas);
+	}
+}
+
+static void put_unlocked_entry(struct xa_state *xas, void *entry)
+{
+	/* If we were the only waiter woken, wake the next one */
+	if (entry)
+		dax_wake_entry(xas, entry, false);
+}
+
+/*
+ * We used the xa_state to get the entry, but then we locked the entry and
+ * dropped the xa_lock, so we know the xa_state is stale and must be reset
+ * before use.
+ */
+static void put_locked_entry(struct xa_state *xas, void *entry)
+{
+	void *old;
+
+	xas_reset(xas);
+	xas_lock_irq(xas);
+	old = xas_store(xas, entry);
+	xas_unlock_irq(xas);
+	BUG_ON(!dax_is_locked(old));
+	dax_wake_entry(xas, entry, false);
+}
+
+static void dax_lock_entry(struct xa_state *xas, void *entry)
+{
+	unsigned long v = xa_to_value(entry);
+	xas_store(xas, xa_mk_value(v | DAX_LOCKED));
+}
+
 /*
  * Check whether the given slot is locked.  Must be called with the i_pages
  * lock held.
@@ -1629,50 +1719,46 @@ EXPORT_SYMBOL_GPL(dax_iomap_fault);
 /*
  * dax_insert_pfn_mkwrite - insert PTE or PMD entry into page tables
  * @vmf: The description of the fault
- * @pe_size: Size of entry to be inserted
  * @pfn: PFN to insert
+ * @order: Order of entry to insert.
  *
 * This function inserts a writeable PTE or PMD entry into the page tables
 * for an mmaped DAX file.  It also marks the page cache entry as dirty.
 */
-static vm_fault_t dax_insert_pfn_mkwrite(struct vm_fault *vmf,
-				  enum page_entry_size pe_size,
-				  pfn_t pfn)
+static vm_fault_t
+dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
-	void *entry, **slot;
-	pgoff_t index = vmf->pgoff;
+	XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, order);
+	void *entry;
 	vm_fault_t ret;
 
-	xa_lock_irq(&mapping->i_pages);
-	entry = get_unlocked_mapping_entry(mapping, index, &slot);
+	xas_lock_irq(&xas);
+	entry = get_unlocked_entry(&xas);
 	/* Did we race with someone splitting entry or so? */
 	if (!entry ||
-	    (pe_size == PE_SIZE_PTE && !dax_is_pte_entry(entry)) ||
-	    (pe_size == PE_SIZE_PMD && !dax_is_pmd_entry(entry))) {
-		put_unlocked_mapping_entry(mapping, index, entry);
-		xa_unlock_irq(&mapping->i_pages);
+	    (order == 0 && !dax_is_pte_entry(entry)) ||
+	    (order == PMD_ORDER && (xa_is_internal(entry) ||
+				    !dax_is_pmd_entry(entry)))) {
+		put_unlocked_entry(&xas, entry);
+		xas_unlock_irq(&xas);
 		trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf,
 						      VM_FAULT_NOPAGE);
 		return VM_FAULT_NOPAGE;
 	}
-	radix_tree_tag_set(&mapping->i_pages, index, PAGECACHE_TAG_DIRTY);
-	entry = lock_slot(mapping, slot);
-	xa_unlock_irq(&mapping->i_pages);
-	switch (pe_size) {
-	case PE_SIZE_PTE:
+	xas_set_tag(&xas, PAGECACHE_TAG_DIRTY);
+	dax_lock_entry(&xas, entry);
+	xas_unlock_irq(&xas);
+	if (order == 0)
 		ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn);
-		break;
 #ifdef CONFIG_FS_DAX_PMD
-	case PE_SIZE_PMD:
+	else if (order == PMD_ORDER)
 		ret = vmf_insert_pfn_pmd(vmf->vma, vmf->address, vmf->pmd,
 			pfn, true);
-		break;
 #endif
-	default:
+	else
 		ret = VM_FAULT_FALLBACK;
-	}
-	put_locked_mapping_entry(mapping, index);
+	put_locked_entry(&xas, entry);
 	trace_dax_insert_pfn_mkwrite(mapping->host, vmf, ret);
 	return ret;
 }
@@ -1692,17 +1778,12 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf,
 {
 	int err;
 	loff_t start = ((loff_t)vmf->pgoff) << PAGE_SHIFT;
-	size_t len = 0;
+	unsigned int order = pe_order(pe_size);
+	size_t len = PAGE_SIZE << order;
 
-	if (pe_size == PE_SIZE_PTE)
-		len = PAGE_SIZE;
-	else if (pe_size == PE_SIZE_PMD)
-		len = PMD_SIZE;
-	else
-		WARN_ON_ONCE(1);
 	err = vfs_fsync_range(vmf->vma->vm_file, start, start + len - 1, 1);
 	if (err)
 		return VM_FAULT_SIGBUS;
-	return dax_insert_pfn_mkwrite(vmf, pe_size, pfn);
+	return dax_insert_pfn_mkwrite(vmf, pfn, order);
 }
 EXPORT_SYMBOL_GPL(dax_finish_sync_fault);
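
For concreteness, assuming the usual x86-64 shifts (PAGE_SHIFT == 12,
PMD_SHIFT == 21, PUD_SHIFT == 30), the new pe_order() helper yields:

	pe_order(PE_SIZE_PTE) == 0	/* len = PAGE_SIZE << 0  = 4KiB */
	pe_order(PE_SIZE_PMD) == 9	/* len = PAGE_SIZE << 9  = 2MiB */
	pe_order(PE_SIZE_PUD) == 18	/* len = PAGE_SIZE << 18 = 1GiB */

so the single "len = PAGE_SIZE << order" computation in
dax_finish_sync_fault() reproduces the PTE and PMD cases of the deleted
if/else chain, and additionally gives PUD a sensible length instead of
warning.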