From patchwork Sat May 19 01:35:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 10412483
Subject: [PATCH v11 6/7] xfs: prepare xfs_break_layouts() for another layout type
From: Dan Williams
To: linux-nvdimm@lists.01.org
Cc: Ross Zwisler, "Darrick J. Wong", Dave Chinner, Christoph Hellwig,
 Christoph Hellwig, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Date: Fri, 18 May 2018 18:35:24 -0700
Message-ID: <152669372402.34337.768265797419905210.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <152669369110.34337.14271778212195820353.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <152669369110.34337.14271778212195820353.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

When xfs is operating as the back-end of a pNFS block server, it
prevents collisions between local and remote operations by requiring a
lease to be held for remotely accessed blocks. Local filesystem
operations break those leases before writing or mutating the extent map
of the file. A similar mechanism is needed to prevent operations on
pinned dax mappings, like device-DMA, from colliding with extent unmap
operations.

BREAK_WRITE and BREAK_UNMAP are introduced as two distinct levels of
layout breaking. Layouts are broken in the BREAK_WRITE case to ensure
that layout-holders do not collide with local writes. Additionally,
layouts are broken in the BREAK_UNMAP case to make sure the
layout-holder has a consistent view of the file's extent map. While
BREAK_WRITE breaks can be satisfied by recalling FL_LAYOUT leases,
BREAK_UNMAP breaks additionally require waiting for busy dax-pages to
go idle while holding XFS_MMAPLOCK_EXCL.

After this refactoring xfs_break_layouts() becomes the entry point for
coordinating both types of breaks. Finally, xfs_break_leased_layouts()
becomes just the BREAK_WRITE handler.

Note that the unlock tracking is needed in a follow-on change.
That will coordinate retrying either break handler until both
successfully test for a lease break while maintaining the lock state.

Cc: Ross Zwisler
Cc: "Darrick J. Wong"
Reported-by: Dave Chinner
Reported-by: Christoph Hellwig
Reviewed-by: Christoph Hellwig
Signed-off-by: Dan Williams
---
 fs/xfs/xfs_file.c  |   26 ++++++++++++++++++++++++--
 fs/xfs/xfs_inode.h |   16 ++++++++++++++++
 fs/xfs/xfs_ioctl.c |    3 +--
 fs/xfs/xfs_iops.c  |    6 +++---
 fs/xfs/xfs_pnfs.c  |   12 ++++++------
 fs/xfs/xfs_pnfs.h  |    5 +++--
 6 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 35309bd046be..4774c7172ef4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -312,7 +312,7 @@ xfs_file_aio_write_checks(
 	if (error <= 0)
 		return error;
 
-	error = xfs_break_layouts(inode, iolock);
+	error = xfs_break_layouts(inode, iolock, BREAK_WRITE);
 	if (error)
 		return error;
 
@@ -718,6 +718,28 @@ xfs_file_write_iter(
 	return ret;
 }
 
+int
+xfs_break_layouts(
+	struct inode		*inode,
+	uint			*iolock,
+	enum layout_break_reason reason)
+{
+	bool			retry;
+
+	ASSERT(xfs_isilocked(XFS_I(inode), XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
+
+	switch (reason) {
+	case BREAK_UNMAP:
+		ASSERT(xfs_isilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL));
+		/* fall through */
+	case BREAK_WRITE:
+		return xfs_break_leased_layouts(inode, iolock, &retry);
+	default:
+		WARN_ON_ONCE(1);
+		return -EINVAL;
+	}
+}
+
 #define	XFS_FALLOC_FL_SUPPORTED						\
 		(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |		\
 		 FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |	\
@@ -744,7 +766,7 @@ xfs_file_fallocate(
 		return -EOPNOTSUPP;
 
 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock);
+	error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 1eebc53df7d7..e5b849815ce1 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -379,6 +379,20 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
 				 >> XFS_ILOCK_SHIFT)
 
 /*
+ * Layouts are broken in the BREAK_WRITE case to ensure that
+ * layout-holders do not collide with local writes. Additionally,
+ * layouts are broken in the BREAK_UNMAP case to make sure the
+ * layout-holder has a consistent view of the file's extent map. While
+ * BREAK_WRITE breaks can be satisfied by recalling FL_LAYOUT leases,
+ * BREAK_UNMAP breaks additionally require waiting for busy dax-pages to
+ * go idle.
+ */
+enum layout_break_reason {
+	BREAK_WRITE,
+	BREAK_UNMAP,
+};
+
+/*
  * For multiple groups support: if S_ISGID bit is set in the parent
  * directory, group of new file is set to that of the parent, and
  * new subdirectory gets S_ISGID bit from parent.
@@ -443,6 +457,8 @@ enum xfs_prealloc_flags {
 
 int	xfs_update_prealloc_flags(struct xfs_inode *ip,
 				  enum xfs_prealloc_flags flags);
+int	xfs_break_layouts(struct inode *inode, uint *iolock,
+		enum layout_break_reason reason);
 
 /* from xfs_iops.c */
 extern void xfs_setup_inode(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 4151fade4bb1..91e73d663099 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -39,7 +39,6 @@
 #include "xfs_icache.h"
 #include "xfs_symlink.h"
 #include "xfs_trans.h"
-#include "xfs_pnfs.h"
 #include "xfs_acl.h"
 #include "xfs_btree.h"
 #include
@@ -644,7 +643,7 @@ xfs_ioc_space(
 		return error;
 
 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock);
+	error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 138fb36ca875..ce0c1f9466a8 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -37,7 +37,6 @@
 #include "xfs_da_btree.h"
 #include "xfs_dir2.h"
 #include "xfs_trans_space.h"
-#include "xfs_pnfs.h"
 #include "xfs_iomap.h"
 
 #include
@@ -1030,13 +1029,14 @@ xfs_vn_setattr(
 	int			error;
 
 	if (iattr->ia_valid & ATTR_SIZE) {
-		struct xfs_inode	*ip = XFS_I(d_inode(dentry));
+		struct inode		*inode = d_inode(dentry);
+		struct xfs_inode	*ip = XFS_I(inode);
 		uint			iolock;
 
 		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
 		iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
 
-		error = xfs_break_layouts(d_inode(dentry), &iolock);
+		error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
 		if (error) {
 			xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
 			return error;
diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c
index 6ea7b0b55d02..f44c3599527d 100644
--- a/fs/xfs/xfs_pnfs.c
+++ b/fs/xfs/xfs_pnfs.c
@@ -31,17 +31,17 @@
  * rules in the page fault path we don't bother.
  */
 int
-xfs_break_layouts(
+xfs_break_leased_layouts(
 	struct inode		*inode,
-	uint			*iolock)
+	uint			*iolock,
+	bool			*did_unlock)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	int			error;
 
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
-
 	while ((error = break_layout(inode, false) == -EWOULDBLOCK)) {
 		xfs_iunlock(ip, *iolock);
+		*did_unlock = true;
 		error = break_layout(inode, true);
 		*iolock &= ~XFS_IOLOCK_SHARED;
 		*iolock |= XFS_IOLOCK_EXCL;
@@ -121,8 +121,8 @@ xfs_fs_map_blocks(
 	 * Lock out any other I/O before we flush and invalidate the pagecache,
 	 * and then hand out a layout to the remote system.  This is very
 	 * similar to direct I/O, except that the synchronization is much more
-	 * complicated.  See the comment near xfs_break_layouts for a detailed
-	 * explanation.
+	 * complicated.  See the comment near xfs_break_leased_layouts
+	 * for a detailed explanation.
 	 */
 	xfs_ilock(ip, XFS_IOLOCK_EXCL);
 
diff --git a/fs/xfs/xfs_pnfs.h b/fs/xfs/xfs_pnfs.h
index bf45951e28fe..940c6c2ad88c 100644
--- a/fs/xfs/xfs_pnfs.h
+++ b/fs/xfs/xfs_pnfs.h
@@ -9,10 +9,11 @@ int xfs_fs_map_blocks(struct inode *inode, loff_t offset, u64 length,
 int xfs_fs_commit_blocks(struct inode *inode, struct iomap *maps, int nr_maps,
 		struct iattr *iattr);
 
-int xfs_break_layouts(struct inode *inode, uint *iolock);
+int xfs_break_leased_layouts(struct inode *inode, uint *iolock,
+		bool *did_unlock);
 #else
 static inline int
-xfs_break_layouts(struct inode *inode, uint *iolock)
+xfs_break_leased_layouts(struct inode *inode, uint *iolock, bool *did_unlock)
 {
 	return 0;
 }
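
Illustrative note, not part of the patch: the dispatch that
xfs_break_layouts() performs can be modelled in userspace roughly as
below. This is only a sketch under stated assumptions. Every model_* and
MODEL_* name is hypothetical, the inode is reduced to a lease counter,
and assert() stands in for the kernel's ASSERT(xfs_isilocked(...))
checks. Only enum layout_break_reason and the shape of the switch are
taken from the diff above.

```c
/* Userspace model of the xfs_break_layouts() dispatch; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum layout_break_reason { BREAK_WRITE, BREAK_UNMAP };

/* Hypothetical lock bits standing in for XFS_IOLOCK_* and XFS_MMAPLOCK_EXCL. */
#define MODEL_IOLOCK_SHARED	0x1u
#define MODEL_IOLOCK_EXCL	0x2u
#define MODEL_MMAPLOCK_EXCL	0x4u

/* Hypothetical inode: just a count of outstanding layout leases. */
struct model_inode {
	int leases;
};

/*
 * Stand-in for xfs_break_leased_layouts(): while a lease is outstanding,
 * record that the lock was dropped, recall the lease, and upgrade the
 * caller's I/O lock mode from shared to exclusive.
 */
static int model_break_leased_layouts(struct model_inode *inode,
				      unsigned int *iolock, bool *did_unlock)
{
	while (inode->leases > 0) {
		*did_unlock = true;
		inode->leases--;
		*iolock &= ~MODEL_IOLOCK_SHARED;
		*iolock |= MODEL_IOLOCK_EXCL;
	}
	return 0;
}

/* Stand-in for xfs_break_layouts(): one entry point, two break levels. */
static int model_break_layouts(struct model_inode *inode, unsigned int *iolock,
			       enum layout_break_reason reason)
{
	bool retry = false;

	/* Callers must hold the I/O lock in some mode. */
	assert(*iolock & (MODEL_IOLOCK_SHARED | MODEL_IOLOCK_EXCL));

	switch (reason) {
	case BREAK_UNMAP:
		/* Unmap-level breaks also require the mmap lock. */
		assert(*iolock & MODEL_MMAPLOCK_EXCL);
		/* fall through */
	case BREAK_WRITE:
		return model_break_leased_layouts(inode, iolock, &retry);
	default:
		return -EINVAL;
	}
}
```

In this model a BREAK_WRITE caller holding the shared I/O lock ends up
holding it exclusive once a lease has been recalled, mirroring the
*iolock updates in the patched loop. A BREAK_UNMAP caller must already
hold the mmap lock, as xfs_vn_setattr() does in the hunk above.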