From patchwork Tue May 22 14:40:03 2018
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10418779
Subject: [PATCH 06/11] filesystem-dax: perform __dax_invalidate_mapping_entry() under the page lock
From: Dan Williams <dan.j.williams@intel.com>
To: linux-nvdimm@lists.01.org
Cc: Jan Kara, Christoph Hellwig, Matthew Wilcox, Ross Zwisler,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, tony.luck@intel.com
Date: Tue, 22 May 2018 07:40:03 -0700
Message-ID: <152700000355.24093.14726378287214432782.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <152699997165.24093.12194490924829406111.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <152699997165.24093.12194490924829406111.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Hold the page lock while invalidating mapping entries to prevent races
between rmap using the address_space and the filesystem freeing the
address_space.

This is more complicated than the simple description implies because
dev_pagemap pages that fsdax uses do not have any concept of page size.
Size information is stored in the radix and can only be safely read
while holding the xa_lock. Since lock_page() cannot be taken while
holding xa_lock, drop xa_lock and speculatively lock all the associated
pages. Once all the pages are locked, re-take the xa_lock and revalidate
that the radix entry did not change.
Cc: Jan Kara
Cc: Christoph Hellwig
Cc: Matthew Wilcox
Cc: Ross Zwisler
Signed-off-by: Dan Williams
---
 fs/dax.c |   91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 85 insertions(+), 6 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 2e4682cd7c69..e6d44d336283 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -319,6 +319,13 @@ static unsigned long dax_radix_end_pfn(void *entry)
 	for (pfn = dax_radix_pfn(entry); \
 			pfn < dax_radix_end_pfn(entry); pfn++)
 
+#define for_each_mapped_pfn_reverse(entry, pfn) \
+	for (pfn = dax_radix_end_pfn(entry) - 1; \
+			dax_entry_size(entry) \
+			&& pfn >= dax_radix_pfn(entry); \
+			pfn--)
+
+
 static void dax_associate_entry(void *entry, struct address_space *mapping,
 		struct vm_area_struct *vma, unsigned long address)
 {
@@ -497,6 +504,80 @@ static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
 	return entry;
 }
 
+static bool dax_lock_pages(struct address_space *mapping, pgoff_t index,
+		void **entry)
+{
+	struct radix_tree_root *pages = &mapping->i_pages;
+	unsigned long pfn;
+	void *entry2;
+
+	xa_lock_irq(pages);
+	*entry = get_unlocked_mapping_entry(mapping, index, NULL);
+	if (!*entry || WARN_ON_ONCE(!radix_tree_exceptional_entry(*entry))) {
+		put_unlocked_mapping_entry(mapping, index, *entry);
+		xa_unlock_irq(pages);
+		return false;
+	}
+
+	/*
+	 * In the limited case there are no races to prevent with rmap,
+	 * because rmap can not perform pfn_to_page().
+	 */
+	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
+		return true;
+
+	/*
+	 * Now, drop the xa_lock, grab all the page locks then validate
+	 * that the entry has not changed and return with the xa_lock
+	 * held.
+	 */
+	xa_unlock_irq(pages);
+
+	/*
+	 * Retry until the entry stabilizes or someone else invalidates
+	 * the entry.
+	 */
+	for (;;) {
+		for_each_mapped_pfn(*entry, pfn)
+			lock_page(pfn_to_page(pfn));
+
+		xa_lock_irq(pages);
+		entry2 = get_unlocked_mapping_entry(mapping, index, NULL);
+		if (!entry2 || WARN_ON_ONCE(!radix_tree_exceptional_entry(entry2))
+				|| entry2 != *entry) {
+			put_unlocked_mapping_entry(mapping, index, entry2);
+			xa_unlock_irq(pages);
+
+			for_each_mapped_pfn_reverse(*entry, pfn)
+				unlock_page(pfn_to_page(pfn));
+
+			if (!entry2 || !radix_tree_exceptional_entry(entry2))
+				return false;
+			*entry = entry2;
+			continue;
+		}
+		break;
+	}
+
+	return true;
+}
+
+static void dax_unlock_pages(struct address_space *mapping, pgoff_t index,
+		void *entry)
+{
+	struct radix_tree_root *pages = &mapping->i_pages;
+	unsigned long pfn;
+
+	put_unlocked_mapping_entry(mapping, index, entry);
+	xa_unlock_irq(pages);
+
+	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
+		return;
+
+	for_each_mapped_pfn_reverse(entry, pfn)
+		unlock_page(pfn_to_page(pfn));
+}
+
 static int __dax_invalidate_mapping_entry(struct address_space *mapping,
 					  pgoff_t index, bool trunc)
 {
@@ -504,10 +585,8 @@ static int __dax_invalidate_mapping_entry(struct address_space *mapping,
 	void *entry;
 	struct radix_tree_root *pages = &mapping->i_pages;
 
-	xa_lock_irq(pages);
-	entry = get_unlocked_mapping_entry(mapping, index, NULL);
-	if (!entry || WARN_ON_ONCE(!radix_tree_exceptional_entry(entry)))
-		goto out;
+	if (!dax_lock_pages(mapping, index, &entry))
+		return ret;
 	if (!trunc &&
 	    (radix_tree_tag_get(pages, index, PAGECACHE_TAG_DIRTY) ||
 	     radix_tree_tag_get(pages, index, PAGECACHE_TAG_TOWRITE)))
@@ -517,8 +596,8 @@ static int __dax_invalidate_mapping_entry(struct address_space *mapping,
 	mapping->nrexceptional--;
 	ret = 1;
 out:
-	put_unlocked_mapping_entry(mapping, index, entry);
-	xa_unlock_irq(pages);
+	dax_unlock_pages(mapping, index, entry);
+
 	return ret;
 }
 
 /*