From patchwork Mon Apr 13 05:40:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11484989 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A6B3912 for ; Mon, 13 Apr 2020 05:41:18 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E809920735 for ; Mon, 13 Apr 2020 05:41:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E809920735 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727933AbgDMFlR (ORCPT ); Mon, 13 Apr 2020 01:41:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727818AbgDMFlQ (ORCPT ); Mon, 13 Apr 2020 01:41:16 -0400 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10CF2C008675; Sun, 12 Apr 2020 22:41:16 -0700 (PDT) IronPort-SDR: yZTdJtDILV00IEyjMkoXG9n6HQFzFEc0P1ivVP5cZ96llsRcUbHyCzpoCmCDcIkV05eXsGPfzj YwuU4zVoz/lw== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:15 -0700 IronPort-SDR: FIn3r0wUoMZh08t8zPbDByULCakW4jMHVbPRtY8f7MRuMkH5YsOlo6v8dt+jYNf/L5MzhRsY7S uKuTalPqniDA== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="276800915" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 
Apr 2020 22:41:15 -0700 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dave Chinner , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 1/9] fs/xfs: Remove unnecessary initialization of i_rwsem Date: Sun, 12 Apr 2020 22:40:38 -0700 Message-Id: <20200413054046.1560106-2-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny An earlier call of xfs_reinit_inode() from xfs_iget_cache_hit() already handles initialization of i_rwsem. Doing so again is unneeded. Reviewed-by: Dave Chinner Signed-off-by: Ira Weiny Reviewed-by: Darrick J. Wong --- Changes from V4: Update commit message to make it clear the xfs_iget_cache_hit() is actually doing the initialization via xfs_reinit_inode() New for V4: NOTE: This was found while ensuring the new i_aops_sem was properly handled. It seems like this is a layering violation so I think it is worth cleaning up so as to not confuse others. 
--- fs/xfs/xfs_icache.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 8dc2e5414276..836a1f09be03 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -419,6 +419,7 @@ xfs_iget_cache_hit( spin_unlock(&ip->i_flags_lock); rcu_read_unlock(); + ASSERT(!rwsem_is_locked(&inode->i_rwsem)); error = xfs_reinit_inode(mp, inode); if (error) { bool wake; @@ -452,9 +453,6 @@ xfs_iget_cache_hit( ip->i_sick = 0; ip->i_checked = 0; - ASSERT(!rwsem_is_locked(&inode->i_rwsem)); - init_rwsem(&inode->i_rwsem); - spin_unlock(&ip->i_flags_lock); spin_unlock(&pag->pag_ici_lock); } else { From patchwork Mon Apr 13 05:40:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11485021 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03F98912 for ; Mon, 13 Apr 2020 05:42:08 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D8487206A1 for ; Mon, 13 Apr 2020 05:42:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D8487206A1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728087AbgDMFlU (ORCPT ); Mon, 13 Apr 2020 01:41:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727818AbgDMFlU (ORCPT ); Mon, 13 Apr 2020 01:41:20 -0400 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20D54C008675; Sun, 12 Apr 2020 22:41:19 -0700 (PDT) 
IronPort-SDR: QGd3aujlu3k3FyTMLQPY4dHyS4rMp0l6SB+KUyHJJQtmYvh/xqINor5Tp6c163Mf5m/y5RGalY 0wr9Wm7jmRhw== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:19 -0700 IronPort-SDR: wBlWos2q9mmgP/qv0DjuwneNSDm6hCHcGF3JxM1lR5Yl1qi2CmEQZtNOQx0RVdBznHv8D6T4d8 /jG+GkL4ppAg== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="256078717" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:17 -0700 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 2/9] fs: Remove unneeded IS_DAX() check in io_is_direct() Date: Sun, 12 Apr 2020 22:40:39 -0700 Message-Id: <20200413054046.1560106-3-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny Remove the check because DAX now has its own read/write methods, and file systems that support DAX check IS_DAX() prior to IOCB_DIRECT on their own. Therefore, it does not matter if the file state is DAX when the iocb flags are created. Also remove io_is_direct() as it is just a simple flag check. Signed-off-by: Ira Weiny Reviewed-by: Christoph Hellwig --- Changes from v6: remove io_is_direct() as well. Remove Reviews since this is quite a bit different. Changes from v3: Reword commit message. 
Reordered to be a 'pre-cleanup' patch --- drivers/block/loop.c | 6 +++--- include/linux/fs.h | 7 +------ 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 739b372a5112..9a9af78974ac 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -631,8 +631,8 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) static inline void loop_update_dio(struct loop_device *lo) { - __loop_update_dio(lo, io_is_direct(lo->lo_backing_file) | - lo->use_dio); + __loop_update_dio(lo, (lo->lo_backing_file->f_flags & O_DIRECT) | + lo->use_dio); } static void loop_reread_partitions(struct loop_device *lo, @@ -1006,7 +1006,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode, if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync) blk_queue_write_cache(lo->lo_queue, true, false); - if (io_is_direct(lo->lo_backing_file) && inode->i_sb->s_bdev) { + if ((lo->lo_backing_file->f_flags & O_DIRECT) && inode->i_sb->s_bdev) { /* In case of direct I/O, match underlying block size */ unsigned short bsize = bdev_logical_block_size( inode->i_sb->s_bdev); diff --git a/include/linux/fs.h b/include/linux/fs.h index abedbffe2c9e..a818ced22961 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3387,11 +3387,6 @@ extern void setattr_copy(struct inode *inode, const struct iattr *attr); extern int file_update_time(struct file *file); -static inline bool io_is_direct(struct file *filp) -{ - return (filp->f_flags & O_DIRECT) || IS_DAX(filp->f_mapping->host); -} - static inline bool vma_is_dax(struct vm_area_struct *vma) { return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host); @@ -3416,7 +3411,7 @@ static inline int iocb_flags(struct file *file) int res = 0; if (file->f_flags & O_APPEND) res |= IOCB_APPEND; - if (io_is_direct(file)) + if (file->f_flags & O_DIRECT) res |= IOCB_DIRECT; if ((file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host)) res |= IOCB_DSYNC; From patchwork Mon Apr 13 
05:40:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11485017 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 16AEA912 for ; Mon, 13 Apr 2020 05:42:05 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ECF832076B for ; Mon, 13 Apr 2020 05:42:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org ECF832076B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728945AbgDMFlV (ORCPT ); Mon, 13 Apr 2020 01:41:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728937AbgDMFlV (ORCPT ); Mon, 13 Apr 2020 01:41:21 -0400 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E082C008675; Sun, 12 Apr 2020 22:41:21 -0700 (PDT) IronPort-SDR: t82WJKNNGdaAB9s3pTuHFApf+pK0c6lMopgyniiGQLut1EDQEbD/m2SP3fmymyGQ1/1cu6el7O qucxLogtMvGA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:20 -0700 IronPort-SDR: ULcGxWKVpO/9gwA8D9OIXGbDGTtp97f4PYwqjIz3RTgBMd7gaHefmyTgY2+Pp88WphykpH+dJJ iMfYwKjQqJBA== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="252783834" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:20 -0700 From: 
ira.weiny@intel.com To: linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dave Chinner , Jan Kara , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 3/9] fs/stat: Define DAX statx attribute Date: Sun, 12 Apr 2020 22:40:40 -0700 Message-Id: <20200413054046.1560106-4-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny So that users can determine whether a file is currently operating in the DAX state (effective DAX), define a statx attribute value and set that attribute if the effective DAX flag is set. To go along with this, we propose the following addition to the statx man page: STATX_ATTR_DAX The file is in the DAX (CPU direct access) state. DAX state attempts to minimize software cache effects for both I/O and memory mappings of this file. It requires a file system which has been configured to support DAX. DAX generally assumes all accesses are via CPU load / store instructions, which can minimize overhead for small accesses but may adversely affect CPU utilization for large transfers. File I/O is done directly to/from user-space buffers, and memory mapped I/O may be performed with direct memory mappings that bypass the kernel page cache. While the DAX property tends to result in data being transferred synchronously, it does not give the same guarantees as O_SYNC, where data and the necessary metadata are transferred together. A DAX file may support being mapped with the MAP_SYNC flag, which enables a program to use CPU cache flush instructions to persist CPU store operations without an explicit fsync(2). See mmap(2) for more information. 
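For illustration only (this is not part of the patch): a minimal userspace sketch of how a program might test the attribute described above. It assumes a glibc that provides statx(2). The helper names attrs_report_dax() and file_is_dax() are hypothetical. The fallback value 0x00002000 is the one defined in this posting; a released kernel header may define STATX_ATTR_DAX differently, so the sketch prefers the header's definition when one exists.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* AT_FDCWD */
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>   /* statx(), struct statx, STATX_BASIC_STATS */

/* Assumption: fall back to the value from this posting if the
 * installed headers do not define the attribute. */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00002000
#endif

/* Hypothetical helper: true if the attribute word has the DAX bit set. */
bool attrs_report_dax(uint64_t attributes)
{
	return (attributes & STATX_ATTR_DAX) != 0;
}

/* Hypothetical helper: 1 if the file is in the DAX state, 0 if not,
 * -1 if statx() failed (errno is left as set by statx()). */
int file_is_dax(const char *path)
{
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;
	return attrs_report_dax(stx.stx_attributes) ? 1 : 0;
}
```

A caller would treat a return of 0 as "not currently DAX", which is distinct from "DAX not supported"; the attribute reports only the effective state.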
Reviewed-by: Dave Chinner Reviewed-by: Jan Kara Reviewed-by: Darrick J. Wong Signed-off-by: Ira Weiny --- Changes from V2: Update man page text with comments from Darrick, Jan, Dan, and Dave. --- fs/stat.c | 3 +++ include/uapi/linux/stat.h | 1 + 2 files changed, 4 insertions(+) diff --git a/fs/stat.c b/fs/stat.c index 030008796479..894699c74dde 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, if (IS_AUTOMOUNT(inode)) stat->attributes |= STATX_ATTR_AUTOMOUNT; + if (IS_DAX(inode)) + stat->attributes |= STATX_ATTR_DAX; + if (inode->i_op->getattr) return inode->i_op->getattr(path, stat, request_mask, query_flags); diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h index ad80a5c885d5..e5f9d5517f6b 100644 --- a/include/uapi/linux/stat.h +++ b/include/uapi/linux/stat.h @@ -169,6 +169,7 @@ struct statx { #define STATX_ATTR_ENCRYPTED 0x00000800 /* [I] File requires key to decrypt in fs */ #define STATX_ATTR_AUTOMOUNT 0x00001000 /* Dir: Automount trigger */ #define STATX_ATTR_VERITY 0x00100000 /* [I] Verity protected file */ +#define STATX_ATTR_DAX 0x00002000 /* [I] File is DAX */ #endif /* _UAPI_LINUX_STAT_H */ From patchwork Mon Apr 13 05:40:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11485009 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 59A91913 for ; Mon, 13 Apr 2020 05:41:56 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 392D32073E for ; Mon, 13 Apr 2020 05:41:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 392D32073E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none 
smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728964AbgDMFlY (ORCPT ); Mon, 13 Apr 2020 01:41:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728937AbgDMFlX (ORCPT ); Mon, 13 Apr 2020 01:41:23 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B65FBC008675; Sun, 12 Apr 2020 22:41:22 -0700 (PDT) IronPort-SDR: ahUkUDZrEuG/JLqhltsZOBHLYg/wG4xtPjnbsfL1dXkrwlBJV8005yrCVf4+qWCmjOUpYcppn+ du/aHv82SbHQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:22 -0700 IronPort-SDR: wEBpDRPRWyCAj4NDCtjdos6QylEJUxIrU/jmF80eOnAiZ4bYm4gzg3owB27RvNPo+Nlfe64/3O ninQAMFKnTHg== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="454109300" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:22 -0700 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. 
Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 4/9] fs/xfs: Make DAX mount option a tri-state Date: Sun, 12 Apr 2020 22:40:41 -0700 Message-Id: <20200413054046.1560106-5-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny As agreed upon[1], make the dax mount option a tri-state. '-o dax' continues to operate as before (equivalent to 'always'). Add the values 'always', 'never', and 'inode' (default). [1] https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/ Signed-off-by: Ira Weiny Reviewed-by: Darrick J. Wong --- Changes from v6: Use 2 flag bits rather than a field. Change iflag to inode. Changes from v5: New Patch --- fs/xfs/xfs_mount.h | 3 ++- fs/xfs/xfs_super.c | 44 ++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 42 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 88ab09ed29e7..d581b990e59a 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -233,7 +233,8 @@ typedef struct xfs_mount { allocator */ #define XFS_MOUNT_NOATTR2 (1ULL << 25) /* disable use of attr2 format */ -#define XFS_MOUNT_DAX (1ULL << 62) /* TEST ONLY! 
*/ +#define XFS_MOUNT_DAX (1ULL << 62) +#define XFS_MOUNT_NODAX (1ULL << 63) /* * Max and min values for mount-option defined I/O diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 2094386af8ac..d7bd8f5e00c9 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -47,6 +47,32 @@ static struct kset *xfs_kset; /* top-level xfs sysfs dir */ static struct xfs_kobj xfs_dbg_kobj; /* global debug sysfs attrs */ #endif +enum { + XFS_DAX_INODE = 0, + XFS_DAX_ALWAYS = 1, + XFS_DAX_NEVER = 2, +}; + +static void xfs_mount_set_dax_mode(struct xfs_mount *mp, u32 val) +{ + if (val == XFS_DAX_INODE) { + mp->m_flags &= ~(XFS_MOUNT_DAX | XFS_MOUNT_NODAX); + } else if (val == XFS_DAX_ALWAYS) { + mp->m_flags &= ~XFS_MOUNT_NODAX; + mp->m_flags |= XFS_MOUNT_DAX; + } else if (val == XFS_DAX_NEVER) { + mp->m_flags &= ~XFS_MOUNT_DAX; + mp->m_flags |= XFS_MOUNT_NODAX; + } +} + +static const struct constant_table dax_param_enums[] = { + {"inode", XFS_DAX_INODE }, + {"always", XFS_DAX_ALWAYS }, + {"never", XFS_DAX_NEVER }, + {} +}; + /* * Table driven mount option parser. 
*/ @@ -59,7 +85,7 @@ enum { Opt_filestreams, Opt_quota, Opt_noquota, Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_uquota, Opt_gquota, Opt_pquota, Opt_uqnoenforce, Opt_gqnoenforce, Opt_pqnoenforce, Opt_qnoenforce, - Opt_discard, Opt_nodiscard, Opt_dax, + Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, }; static const struct fs_parameter_spec xfs_fs_parameters[] = { @@ -103,6 +129,7 @@ static const struct fs_parameter_spec xfs_fs_parameters[] = { fsparam_flag("discard", Opt_discard), fsparam_flag("nodiscard", Opt_nodiscard), fsparam_flag("dax", Opt_dax), + fsparam_enum("dax", Opt_dax_enum, dax_param_enums), {} }; @@ -129,7 +156,6 @@ xfs_fs_show_options( { XFS_MOUNT_GRPID, ",grpid" }, { XFS_MOUNT_DISCARD, ",discard" }, { XFS_MOUNT_LARGEIO, ",largeio" }, - { XFS_MOUNT_DAX, ",dax" }, { 0, NULL } }; struct xfs_mount *mp = XFS_M(root->d_sb); @@ -185,6 +211,13 @@ xfs_fs_show_options( if (!(mp->m_qflags & XFS_ALL_QUOTA_ACCT)) seq_puts(m, ",noquota"); + if (mp->m_flags & XFS_MOUNT_DAX) + seq_puts(m, ",dax=always"); + else if (mp->m_flags & XFS_MOUNT_NODAX) + seq_puts(m, ",dax=never"); + else + seq_puts(m, ",dax=inode"); + return 0; } @@ -1244,7 +1277,10 @@ xfs_fc_parse_param( return 0; #ifdef CONFIG_FS_DAX case Opt_dax: - mp->m_flags |= XFS_MOUNT_DAX; + xfs_mount_set_dax_mode(mp, XFS_DAX_ALWAYS); + return 0; + case Opt_dax_enum: + xfs_mount_set_dax_mode(mp, result.uint_32); return 0; #endif default: @@ -1451,7 +1487,7 @@ xfs_fc_fill_super( if (!rtdev_is_dax && !datadev_is_dax) { xfs_alert(mp, "DAX unsupported by block device. 
Turning off DAX."); - mp->m_flags &= ~XFS_MOUNT_DAX; + xfs_mount_set_dax_mode(mp, XFS_DAX_NEVER); } if (xfs_sb_version_hasreflink(&mp->m_sb)) { xfs_alert(mp, From patchwork Mon Apr 13 05:40:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11485011 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0BC55913 for ; Mon, 13 Apr 2020 05:41:57 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E16302073E for ; Mon, 13 Apr 2020 05:41:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E16302073E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728180AbgDMFlz (ORCPT ); Mon, 13 Apr 2020 01:41:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728959AbgDMFlY (ORCPT ); Mon, 13 Apr 2020 01:41:24 -0400 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BDE62C008675; Sun, 12 Apr 2020 22:41:23 -0700 (PDT) IronPort-SDR: NtVqXWVbaOWU1Sey6v5WS+G4xBRUEcXZtlzHLrDURbSLgOB6DDreyFH8vTqmC0fJEZmRYC1CK1 oEV0nA3i8z7Q== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:23 -0700 IronPort-SDR: BT1E+ILAe2C4ptsLWsGonCojt8RaCm6AtKnYMF3NgLn4kqcZKs6ePQJ/47mD+vd1Q/2gKr59ph jCJ5H3E/V+SA== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; 
d="scan'208";a="298328580" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:23 -0700 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 5/9] fs/xfs: Create function xfs_inode_enable_dax() Date: Sun, 12 Apr 2020 22:40:42 -0700 Message-Id: <20200413054046.1560106-6-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny xfs_inode_supports_dax() should reflect whether the inode can support DAX, not whether it is enabled for DAX. Change xfs_inode_supports_dax() to reflect only whether the inode and the underlying storage support DAX. Add a new function, xfs_inode_enable_dax(), which reflects whether the inode should be enabled for DAX. Signed-off-by: Ira Weiny --- Changes from v6: Change enable checks to be sequential logic. Update for 2 bit tri-state option. Make 'static' consistent. 
Don't set S_DAX if !CONFIG_FS_DAX Changes from v5: Update to reflect the new tri-state mount option Changes from v3: Update functions and names to be more clear Update commit message Merge with 'fs/xfs: Clean up DAX support check' don't allow IS_DAX() on a directory use STATIC macro for static make xfs_inode_supports_dax() static --- fs/xfs/xfs_iops.c | 34 +++++++++++++++++++++++++++++----- 1 file changed, 29 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 81f2f93caec0..37bd15b55878 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -1244,12 +1244,11 @@ xfs_inode_supports_dax( struct xfs_mount *mp = ip->i_mount; /* Only supported on non-reflinked files. */ - if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip)) + if (xfs_is_reflink_inode(ip)) return false; - /* DAX mount option or DAX iflag must be set. */ - if (!(mp->m_flags & XFS_MOUNT_DAX) && - !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) + /* Only supported on regular files. */ + if (!S_ISREG(VFS_I(ip)->i_mode)) return false; /* Block size must match page size */ @@ -1260,6 +1259,31 @@ xfs_inode_supports_dax( return xfs_inode_buftarg(ip)->bt_daxdev != NULL; } +#ifdef CONFIG_FS_DAX +static bool +xfs_inode_enable_dax( + struct xfs_inode *ip) +{ + if (ip->i_mount->m_flags & XFS_MOUNT_NODAX) + return false; + if (!xfs_inode_supports_dax(ip)) + return false; + if (ip->i_mount->m_flags & XFS_MOUNT_DAX) + return true; + if (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX) + return true; + return false; +} +#else /* !CONFIG_FS_DAX */ +static bool +xfs_inode_enable_dax( + struct xfs_inode *ip) +{ + return false; +} +#endif /* CONFIG_FS_DAX */ + + STATIC void xfs_diflags_to_iflags( struct inode *inode, @@ -1278,7 +1302,7 @@ xfs_diflags_to_iflags( inode->i_flags |= S_SYNC; if (flags & XFS_DIFLAG_NOATIME) inode->i_flags |= S_NOATIME; - if (xfs_inode_supports_dax(ip)) + if (xfs_inode_enable_dax(ip)) inode->i_flags |= S_DAX; } From patchwork Mon Apr 13 05:40:43 2020 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11485007 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 97EEE912 for ; Mon, 13 Apr 2020 05:41:54 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 790D32073E for ; Mon, 13 Apr 2020 05:41:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 790D32073E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728937AbgDMFl1 (ORCPT ); Mon, 13 Apr 2020 01:41:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728974AbgDMFl0 (ORCPT ); Mon, 13 Apr 2020 01:41:26 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A53EBC008675; Sun, 12 Apr 2020 22:41:25 -0700 (PDT) IronPort-SDR: +MzHVRby3pbGUIjNpBf9JNfPkUnLyXgNQxHKke/BLsNwC69ncUlthU2YJ+nYHcEDi4kAkQrgnu 8twZ6FdJX3tQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:25 -0700 IronPort-SDR: o7EWjfV6nqBbvdw1rbYYURTtjAq5djiV8GM1NNGWNwZyd5h+1J0iU9Fj8BG9UaEfXm8PL45DV9 Xr/w3aOAPh7A== X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="241572490" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:24 -0700 From: ira.weiny@intel.com To: 
linux-kernel@vger.kernel.org, "Darrick J. Wong" Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH V7 6/9] fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags() Date: Sun, 12 Apr 2020 22:40:43 -0700 Message-Id: <20200413054046.1560106-7-ira.weiny@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com> References: <20200413054046.1560106-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Ira Weiny The functionality in xfs_diflags_to_linux() and xfs_diflags_to_iflags() is nearly identical. The only difference is that *_to_linux() is called after inode setup and disallows changing the DAX flag. Combining them can be done with a flag which indicates whether this is the initial setup, so that the DAX flag is set only at init time. So remove xfs_diflags_to_linux() and call the modified xfs_diflags_to_iflags() directly. While we are here, simplify xfs_diflags_to_iflags() to take struct xfs_inode and use xfs_ip2xflags() to ensure future diflags are included correctly. Signed-off-by: Ira Weiny Reviewed-by: Darrick J. Wong --- Changes from V6: Move unrelated hunk to previous patch. Change logic for better code generation. Changes from V5: The functions are no longer identical so we can only combine them rather than deleting one completely. This is reflected in the new init parameter. 
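For illustration only (this is not part of the patch): the behavior of the combined helper described above can be modeled in plain C outside the kernel. This is a sketch under stated assumptions. The MODEL_* constants and model_diflags_to_iflags() are hypothetical stand-ins rather than kernel symbols, the bit values are arbitrary, and the dax_wanted argument stands in for the result of xfs_inode_enable_dax(). It shows the one thing the init parameter controls: the DAX bit is computed only on the initial call and is left untouched on later calls.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the on-disk derived xflags (input). */
enum {
	MODEL_X_IMMUTABLE = 1 << 0,
	MODEL_X_APPEND    = 1 << 1,
	MODEL_X_SYNC      = 1 << 2,
	MODEL_X_NOATIME   = 1 << 3,
};

/* Hypothetical stand-ins for the VFS inode flags (output). */
enum {
	MODEL_S_IMMUTABLE = 1 << 0,
	MODEL_S_APPEND    = 1 << 1,
	MODEL_S_SYNC      = 1 << 2,
	MODEL_S_NOATIME   = 1 << 3,
	MODEL_S_DAX       = 1 << 4,
};

/* Returns the new inode flag word given the current one. */
unsigned int model_diflags_to_iflags(unsigned int i_flags,
				     unsigned int xflags,
				     bool dax_wanted, bool init)
{
	unsigned int flags = 0;

	if (xflags & MODEL_X_IMMUTABLE)
		flags |= MODEL_S_IMMUTABLE;
	if (xflags & MODEL_X_APPEND)
		flags |= MODEL_S_APPEND;
	if (xflags & MODEL_X_SYNC)
		flags |= MODEL_S_SYNC;
	if (xflags & MODEL_X_NOATIME)
		flags |= MODEL_S_NOATIME;
	/* DAX is decided only during initial inode setup. */
	if (init && dax_wanted)
		flags |= MODEL_S_DAX;

	/* Clear everything except DAX, so later calls never change it. */
	i_flags &= ~(MODEL_S_IMMUTABLE | MODEL_S_APPEND |
		     MODEL_S_SYNC | MODEL_S_NOATIME);
	return i_flags | flags;
}
```

In this model, a later (non-init) call can flip the other four flags freely, while an inode that was set up with DAX keeps it and one that was set up without DAX never gains it.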
--- fs/xfs/xfs_inode.h | 1 + fs/xfs/xfs_ioctl.c | 33 +-------------------------------- fs/xfs/xfs_iops.c | 42 +++++++++++++++++++++++------------------- 3 files changed, 25 insertions(+), 51 deletions(-) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 492e53992fa9..e76ed9ca17f7 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -466,6 +466,7 @@ int xfs_break_layouts(struct inode *inode, uint *iolock, /* from xfs_iops.c */ extern void xfs_setup_inode(struct xfs_inode *ip); extern void xfs_setup_iops(struct xfs_inode *ip); +extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init); /* * When setting up a newly allocated inode, we need to call diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index d42de92cb283..c6cd92ef4a05 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1100,37 +1100,6 @@ xfs_flags2diflags2( return di_flags2; } -STATIC void -xfs_diflags_to_linux( - struct xfs_inode *ip) -{ - struct inode *inode = VFS_I(ip); - unsigned int xflags = xfs_ip2xflags(ip); - - if (xflags & FS_XFLAG_IMMUTABLE) - inode->i_flags |= S_IMMUTABLE; - else - inode->i_flags &= ~S_IMMUTABLE; - if (xflags & FS_XFLAG_APPEND) - inode->i_flags |= S_APPEND; - else - inode->i_flags &= ~S_APPEND; - if (xflags & FS_XFLAG_SYNC) - inode->i_flags |= S_SYNC; - else - inode->i_flags &= ~S_SYNC; - if (xflags & FS_XFLAG_NOATIME) - inode->i_flags |= S_NOATIME; - else - inode->i_flags &= ~S_NOATIME; -#if 0 /* disabled until the flag switching races are sorted out */ - if (xflags & FS_XFLAG_DAX) - inode->i_flags |= S_DAX; - else - inode->i_flags &= ~S_DAX; -#endif -} - static int xfs_ioctl_setattr_xflags( struct xfs_trans *tp, @@ -1168,7 +1137,7 @@ xfs_ioctl_setattr_xflags( ip->i_d.di_flags = xfs_flags2diflags(ip, fa->fsx_xflags); ip->i_d.di_flags2 = di_flags2; - xfs_diflags_to_linux(ip); + xfs_diflags_to_iflags(ip, false); xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); XFS_STATS_INC(mp, xs_ig_attrchg); diff --git 
a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 37bd15b55878..856ad823e754 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -1284,26 +1284,30 @@ xfs_inode_enable_dax( #endif /* CONFIG_FS_DAX */ -STATIC void +void xfs_diflags_to_iflags( - struct inode *inode, - struct xfs_inode *ip) + struct xfs_inode *ip, + bool init) { - uint16_t flags = ip->i_d.di_flags; - - inode->i_flags &= ~(S_IMMUTABLE | S_APPEND | S_SYNC | - S_NOATIME | S_DAX); - - if (flags & XFS_DIFLAG_IMMUTABLE) - inode->i_flags |= S_IMMUTABLE; - if (flags & XFS_DIFLAG_APPEND) - inode->i_flags |= S_APPEND; - if (flags & XFS_DIFLAG_SYNC) - inode->i_flags |= S_SYNC; - if (flags & XFS_DIFLAG_NOATIME) - inode->i_flags |= S_NOATIME; - if (xfs_inode_enable_dax(ip)) - inode->i_flags |= S_DAX; + struct inode *inode = VFS_I(ip); + unsigned int xflags = xfs_ip2xflags(ip); + unsigned int flags = 0; + + ASSERT(!(IS_DAX(inode) && init)); + + if (xflags & FS_XFLAG_IMMUTABLE) + flags |= S_IMMUTABLE; + if (xflags & FS_XFLAG_APPEND) + flags |= S_APPEND; + if (xflags & FS_XFLAG_SYNC) + flags |= S_SYNC; + if (xflags & FS_XFLAG_NOATIME) + flags |= S_NOATIME; + if (init && xfs_inode_enable_dax(ip)) + flags |= S_DAX; + + inode->i_flags &= ~(S_IMMUTABLE | S_APPEND | S_SYNC | S_NOATIME); + inode->i_flags |= flags; } /* @@ -1332,7 +1336,7 @@ xfs_setup_inode( inode->i_gid = xfs_gid_to_kgid(ip->i_d.di_gid); i_size_write(inode, ip->i_d.di_size); - xfs_diflags_to_iflags(inode, ip); + xfs_diflags_to_iflags(ip, true); if (S_ISDIR(inode->i_mode)) { /* From patchwork Mon Apr 13 05:40:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11484995 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EB34A913 for ; Mon, 13 Apr 2020 05:41:37 +0000 (UTC) Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org 
	(Postfix) with ESMTP id C303B20776 for ; Mon, 13 Apr 2020 05:41:37 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C303B20776
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729005AbgDMFlb (ORCPT ); Mon, 13 Apr 2020 01:41:31 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42680 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729002AbgDMFla (ORCPT ); Mon, 13 Apr 2020 01:41:30 -0400
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E15DC008675; Sun, 12 Apr 2020 22:41:30 -0700 (PDT)
IronPort-SDR: QhjFJJJ1RMjaeYf8q4wFcxZHvlXCA+RMs1tys42YUZ9Ayn/yhuDQ0g7wQOGuFNwbPFJ7mJoGu7 RzlGCRTeyE+Q==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:26 -0700
IronPort-SDR: Hcd+XAaWBlBPrMyHf4SMn5E5Q7Ywj8yI+KmEfJr3A8P8T/wc7H4UB7zHJ74tNgFP9CiuvYhgdP DQ8rvnEyDPHQ==
X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="270977630"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:26 -0700
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org, "Darrick J. Wong"
Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH V7 7/9] fs: Define I_DONTCACHE in VFS layer
Date: Sun, 12 Apr 2020 22:40:44 -0700
Message-Id: <20200413054046.1560106-8-ira.weiny@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com>
References: <20200413054046.1560106-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Ira Weiny

DAX effective mode changes (setting of S_DAX) require inode eviction.

Define a flag which can be set to inform the VFS layer that inodes
should not be cached. This will expedite the eviction of those inodes
requiring reload.

Signed-off-by: Ira Weiny
---
 include/linux/fs.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index a818ced22961..e2db71d150c3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2151,6 +2151,8 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
  *
  * I_CREATING		New object's inode in the middle of setting up.
  *
+ * I_DONTCACHE		Do not cache the inode
+ *
  * Q: What is the difference between I_WILL_FREE and I_FREEING?
  */
 #define I_DIRTY_SYNC		(1 << 0)
@@ -2173,6 +2175,7 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
 #define I_WB_SWITCH		(1 << 13)
 #define I_OVL_INUSE		(1 << 14)
 #define I_CREATING		(1 << 15)
+#define I_DONTCACHE		(1 << 16)
 
 #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
 #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)
@@ -3042,7 +3045,8 @@ extern int inode_needs_sync(struct inode *inode);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
-	return !inode->i_nlink || inode_unhashed(inode);
+	return !inode->i_nlink || inode_unhashed(inode) ||
+		(inode->i_state & I_DONTCACHE);
 }
 
 extern struct inode *ilookup5_nowait(struct super_block *sb,

From patchwork Mon Apr 13 05:40:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11485003
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 51130913 for ; Mon, 13 Apr 2020 05:41:52 +0000 (UTC)
Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2B09C20737 for ; Mon, 13 Apr 2020 05:41:52 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2B09C20737
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728992AbgDMFl3 (ORCPT ); Mon, 13 Apr 2020 01:41:29 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728987AbgDMFl2 (ORCPT ); Mon, 13 Apr 2020 01:41:28 -0400
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by
	lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9381CC008675; Sun, 12 Apr 2020 22:41:28 -0700 (PDT)
IronPort-SDR: L3Cx1cVaoHPl+ZsIohQ0LxhvxtyO/fKcneqS5ZYPmRRNwTmmyuykues87rK7cro8NLVuMaVtZi 4YY3QlomNygA==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:28 -0700
IronPort-SDR: jlLNzIgrE5WxXjJwzEZr7q+aUTLJHr5nlnZyt88RsBzLXQMvc4ZefvjNFaSdW5IopL58tv28gw arUv7sIQamtQ==
X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="243452863"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:28 -0700
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org, "Darrick J. Wong"
Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH V7 8/9] fs/xfs: Change xfs_ioctl_setattr_dax_invalidate()
Date: Sun, 12 Apr 2020 22:40:45 -0700
Message-Id: <20200413054046.1560106-9-ira.weiny@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com>
References: <20200413054046.1560106-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Ira Weiny

We only support changing FS_XFLAG_DAX on directories. Files get their
flag from the parent directory on creation only. So no data
invalidation needs to happen.

Alter xfs_ioctl_setattr_dax_invalidate() to be
xfs_ioctl_setattr_dax_validate(). xfs_ioctl_setattr_dax_validate() now
validates that any FS_XFLAG_DAX change is ok.

This also allows us to remove the join_flags logic.

Signed-off-by: Ira Weiny
---
Changes from v6:
	Fix completely broken implementation and update commit message.
	Use the new VFS layer I_DONTCACHE to facilitate inode eviction
	and S_DAX changing on drop_caches

Changes from v5:
	New patch
---
 fs/xfs/xfs_ioctl.c | 102 +++++++--------------------------------------
 1 file changed, 16 insertions(+), 86 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index c6cd92ef4a05..ba42a5fb5b05 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1145,63 +1145,23 @@ xfs_ioctl_setattr_xflags(
 }
 
 /*
- * If we are changing DAX flags, we have to ensure the file is clean and any
- * cached objects in the address space are invalidated and removed. This
- * requires us to lock out other IO and page faults similar to a truncate
- * operation. The locks need to be held until the transaction has been committed
- * so that the cache invalidation is atomic with respect to the DAX flag
- * manipulation.
+ * Mark inodes with a changing FS_XFLAG_DAX, I_DONTCACHE
  */
-static int
+static void
 xfs_ioctl_setattr_dax_invalidate(
 	struct xfs_inode	*ip,
-	struct fsxattr		*fa,
-	int			*join_flags)
+	struct fsxattr		*fa)
 {
-	struct inode		*inode = VFS_I(ip);
-	struct super_block	*sb = inode->i_sb;
-	int			error;
-
-	*join_flags = 0;
-
-	/*
-	 * It is only valid to set the DAX flag on regular files and
-	 * directories on filesystems where the block size is equal to the page
-	 * size. On directories it serves as an inherited hint so we don't
-	 * have to check the device for dax support or flush pagecache.
-	 */
-	if (fa->fsx_xflags & FS_XFLAG_DAX) {
-		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
-
-		if (!bdev_dax_supported(target->bt_bdev, sb->s_blocksize))
-			return -EINVAL;
-	}
-
-	/* If the DAX state is not changing, we have nothing to do here. */
-	if ((fa->fsx_xflags & FS_XFLAG_DAX) && IS_DAX(inode))
-		return 0;
-	if (!(fa->fsx_xflags & FS_XFLAG_DAX) && !IS_DAX(inode))
-		return 0;
+	struct inode		*inode = VFS_I(ip);
 
 	if (S_ISDIR(inode->i_mode))
-		return 0;
-
-	/* lock, flush and invalidate mapping in preparation for flag change */
-	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
-	error = filemap_write_and_wait(inode->i_mapping);
-	if (error)
-		goto out_unlock;
-	error = invalidate_inode_pages2(inode->i_mapping);
-	if (error)
-		goto out_unlock;
-
-	*join_flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
-	return 0;
-
-out_unlock:
-	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
-	return error;
+		return;
 
+	if (((fa->fsx_xflags & FS_XFLAG_DAX) &&
+	    !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) ||
+	    (!(fa->fsx_xflags & FS_XFLAG_DAX) &&
+	     (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)))
+		inode->i_state |= I_DONTCACHE;
 }
 
 /*
@@ -1209,17 +1169,10 @@ xfs_ioctl_setattr_dax_invalidate(
  * have permission to do so. On success, return a clean transaction and the
  * inode locked exclusively ready for further operation specific checks. On
  * failure, return an error without modifying or locking the inode.
- *
- * The inode might already be IO locked on call. If this is the case, it is
- * indicated in @join_flags and we take full responsibility for ensuring they
- * are unlocked from now on. Hence if we have an error here, we still have to
- * unlock them. Otherwise, once they are joined to the transaction, they will
- * be unlocked on commit/cancel.
  */
 static struct xfs_trans *
 xfs_ioctl_setattr_get_trans(
-	struct xfs_inode	*ip,
-	int			join_flags)
+	struct xfs_inode	*ip)
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
@@ -1236,8 +1189,7 @@ xfs_ioctl_setattr_get_trans(
 		goto out_unlock;
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
-	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL | join_flags);
-	join_flags = 0;
+	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
 	/*
 	 * CAP_FOWNER overrides the following restrictions:
@@ -1258,8 +1210,6 @@ xfs_ioctl_setattr_get_trans(
 out_cancel:
 	xfs_trans_cancel(tp);
 out_unlock:
-	if (join_flags)
-		xfs_iunlock(ip, join_flags);
 	return ERR_PTR(error);
 }
 
@@ -1386,7 +1336,6 @@ xfs_ioctl_setattr(
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_dquot	*olddquot = NULL;
 	int			code;
-	int			join_flags = 0;
 
 	trace_xfs_ioctl_setattr(ip);
 
@@ -1410,18 +1359,9 @@ xfs_ioctl_setattr(
 			return code;
 	}
 
-	/*
-	 * Changing DAX config may require inode locking for mapping
-	 * invalidation. These need to be held all the way to transaction commit
-	 * or cancel time, so need to be passed through to
-	 * xfs_ioctl_setattr_get_trans() so it can apply them to the join call
-	 * appropriately.
-	 */
-	code = xfs_ioctl_setattr_dax_invalidate(ip, fa, &join_flags);
-	if (code)
-		goto error_free_dquots;
+	xfs_ioctl_setattr_dax_invalidate(ip, fa);
 
-	tp = xfs_ioctl_setattr_get_trans(ip, join_flags);
+	tp = xfs_ioctl_setattr_get_trans(ip);
 	if (IS_ERR(tp)) {
 		code = PTR_ERR(tp);
 		goto error_free_dquots;
@@ -1552,7 +1492,6 @@ xfs_ioc_setxflags(
 	struct fsxattr		fa;
 	struct fsxattr		old_fa;
 	unsigned int		flags;
-	int			join_flags = 0;
 	int			error;
 
 	if (copy_from_user(&flags, arg, sizeof(flags)))
@@ -1569,18 +1508,9 @@ xfs_ioc_setxflags(
 	if (error)
 		return error;
 
-	/*
-	 * Changing DAX config may require inode locking for mapping
-	 * invalidation. These need to be held all the way to transaction commit
-	 * or cancel time, so need to be passed through to
-	 * xfs_ioctl_setattr_get_trans() so it can apply them to the join call
-	 * appropriately.
-	 */
-	error = xfs_ioctl_setattr_dax_invalidate(ip, &fa, &join_flags);
-	if (error)
-		goto out_drop_write;
+	xfs_ioctl_setattr_dax_invalidate(ip, &fa);
 
-	tp = xfs_ioctl_setattr_get_trans(ip, join_flags);
+	tp = xfs_ioctl_setattr_get_trans(ip);
 	if (IS_ERR(tp)) {
 		error = PTR_ERR(tp);
 		goto out_drop_write;

From patchwork Mon Apr 13 05:40:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11484997
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8746615AB for ; Mon, 13 Apr 2020 05:41:38 +0000 (UTC)
Received: from vger.kernel.org (unknown [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6080A20776 for ; Mon, 13 Apr 2020 05:41:38 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6080A20776
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-xfs-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729028AbgDMFlh (ORCPT ); Mon, 13 Apr 2020 01:41:37 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.18]:42696 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729002AbgDMFlg (ORCPT ); Mon, 13 Apr 2020 01:41:36 -0400
Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC312C008675; Sun, 12 Apr 2020 22:41:35 -0700 (PDT)
IronPort-SDR: qv/pxfLZ/HiCM/UDJuyqRNZn1LfDrc957UpmHn1+hno0Sm94PSMF+nKsWRjJ4aYaOeLQ0XgEDB AlcZzW9SOfUA==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:35 -0700
IronPort-SDR:
	k1xBGaNaSd5P7dZprcTC2u8wKucMbrDzgnDEDnE71QeEf2ZIbeLQM700C9tUDWax/o7dmkAzCb uAClvpQdOTug==
X-IronPort-AV: E=Sophos;i="5.72,377,1580803200"; d="scan'208";a="287786805"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.147]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Apr 2020 22:41:34 -0700
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org, "Darrick J. Wong"
Cc: Ira Weiny , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , Jeff Moyer , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH V7 9/9] Documentation/dax: Update Usage section
Date: Sun, 12 Apr 2020 22:40:46 -0700
Message-Id: <20200413054046.1560106-10-ira.weiny@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200413054046.1560106-1-ira.weiny@intel.com>
References: <20200413054046.1560106-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Ira Weiny

Update the Usage section to reflect the new individual dax selection
functionality.

Signed-off-by: Ira Weiny
---
Changes from V6:
	Update to allow setting FS_XFLAG_DAX any time.
	Update with list of behaviors from Darrick
	https://lore.kernel.org/lkml/20200409165927.GD6741@magnolia/

Changes from V5:
	Update to reflect the agreed upon semantics
	https://lore.kernel.org/lkml/20200405061945.GA94792@iweiny-DESK2.sc.intel.com/
---
 Documentation/filesystems/dax.txt | 166 +++++++++++++++++++++++++++++-
 1 file changed, 163 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index 679729442fd2..af14c1b330a9 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -17,11 +17,171 @@ For file mappings, the storage device is mapped directly into userspace.
 Usage
 -----
 
-If you have a block device which supports DAX, you can make a filesystem
+If you have a block device which supports DAX, you can make a file system
 on it as usual. The DAX code currently only supports files with a block
 size equal to your kernel's PAGE_SIZE, so you may need to specify a block
-size when creating the filesystem. When mounting it, use the "-o dax"
-option on the command line or add 'dax' to the options in /etc/fstab.
+size when creating the file system.
+
+Currently 2 filesystems support DAX, ext4 and xfs. Enabling DAX on them is
+different at this time.
+
+Enabling DAX on ext4
+--------------------
+
+When mounting the filesystem, use the "-o dax" option on the command line or
+add 'dax' to the options in /etc/fstab.
+
+
+Enabling DAX on xfs
+-------------------
+
+Summary
+-------
+
+ 1. There exists an in-kernel access mode flag S_DAX that is set when
+    file accesses go directly to persistent memory, bypassing the page
+    cache. Applications must call statx to discover the current S_DAX
+    state (STATX_ATTR_DAX).
+
+ 2. There exists an advisory file inode flag FS_XFLAG_DAX that is
+    inherited from the parent directory FS_XFLAG_DAX inode flag at file
+    creation time. This advisory flag can be set or cleared at any
+    time, but doing so does not immediately affect the S_DAX state.
+
+    Unless overridden by mount options (see (3)), if FS_XFLAG_DAX is set
+    and the fs is on pmem then it will enable S_DAX at inode load time;
+    if FS_XFLAG_DAX is not set, it will not enable S_DAX.
+
+ 3. There exists a dax= mount option.
+
+    "-o dax=never"  means "never set S_DAX, ignore FS_XFLAG_DAX."
+
+    "-o dax=always" means "always set S_DAX (at least on pmem),
+                    and ignore FS_XFLAG_DAX."
+
+    "-o dax"        is an alias for "dax=always".
+
+    "-o dax=inode"  means "follow FS_XFLAG_DAX" and is the default.
+
+ 4. There exists an advisory directory inode flag FS_XFLAG_DAX that can
+    be set or cleared at any time. The flag state is inherited by any files or
+    subdirectories when they are created within that directory.
+
+ 5. Programs that require a specific file access mode (DAX or not DAX)
+    can do one of the following:
+
+    (a) Create files in directories that have the FS_XFLAG_DAX flag set as
+        needed; or
+
+    (b) Have the administrator set an override via mount option; or
+
+    (c) Set or clear the file's FS_XFLAG_DAX flag as needed. Programs
+        must then cause the kernel to evict the inode from memory. This
+        can be done by:
+
+        i>  Closing the file and re-opening the file and using statx to
+            see if the fs has changed the S_DAX flag; and
+
+        ii> If the file still does not have the desired S_DAX access
+            mode, either unmount and remount the filesystem, or close
+            the file and use drop_caches.
+
+ 6. It is expected that users who want to squeeze every last bit of performance
+    out of the particular rough and tumble bits of their storage will also be
+    exposed to the difficulties of what happens when the operating system can't
+    totally virtualize those hardware capabilities. DAX is such a feature.
+    Basically, Formula-1 cars require a bit more care and feeding than your
+    average Toyota minivan, as it were.
+
+
+Details
+-------
+
+There are 2 per-file dax flags. One is a physical inode setting (FS_XFLAG_DAX)
+and the other a currently enabled state (S_DAX).
+
+FS_XFLAG_DAX is maintained, on disk, on individual inodes. It is preserved
+within the file system. This 'physical' config setting can be set using an
+ioctl and/or an application such as "xfs_io -c 'chattr [-+]x'". Files and
+directories automatically inherit FS_XFLAG_DAX from their parent directory
+_when_ _created_. Therefore, setting FS_XFLAG_DAX at directory creation time
+can be used to set a default behavior for an entire sub-tree. (Doing so on the
+root directory acts to set a default for the entire file system.)
+
+To clarify inheritance here are 3 examples:
+
+Example A:
+
+mkdir -p a/b/c
+xfs_io -c 'chattr +x' a
+mkdir a/b/c/d
+mkdir a/e
+
+	dax: a,e
+	no dax: b,c,d
+
+Example B:
+
+mkdir a
+xfs_io -c 'chattr +x' a
+mkdir -p a/b/c/d
+
+	dax: a,b,c,d
+	no dax:
+
+Example C:
+
+mkdir -p a/b/c
+xfs_io -c 'chattr +x' a/b/c
+mkdir a/b/c/d
+
+	dax: c,d
+	no dax: a,b
+
+
+The current enabled state (S_DAX) is set when a file inode is _loaded_ based on
+the underlying media support, the value of FS_XFLAG_DAX, and the file system's
+dax mount option setting. See below.
+
+statx can be used to query S_DAX. NOTE that a directory will never have S_DAX
+set and therefore statx will always return false on directories.
+
+NOTE: Setting the FS_XFLAG_DAX (specifically or through inheritance) occurs
+even if the underlying media does not support dax and/or the file system is
+overridden with a mount option.
+
+
+Overriding FS_XFLAG_DAX (dax= mount option)
+-------------------------------------------
+
+There exists a dax mount option. Using the mount option does not change the
+physical configured state of individual files but overrides the S_DAX operating
+state when inodes are loaded.
+
+Given underlying media support, the dax mount option is a tri-state option
+(never, always, inode) with the following meanings:
+
+   "-o dax=never" means "never set S_DAX, ignore FS_XFLAG_DAX"
+   "-o dax=always" means "always set S_DAX, ignore FS_XFLAG_DAX"
+   "-o dax" by itself means "dax=always" to remain compatible with older
+            kernels
+   "-o dax=inode" means "follow FS_XFLAG_DAX"
+
+The default state is 'inode'. Given underlying media support, the following
+algorithm is used to determine the effective mode of the file S_DAX on a
+capable device.
+
+	S_DAX = FS_XFLAG_DAX;
+
+	if (dax_mount == "always")
+		S_DAX = true;
+	else if (dax_mount == "never")
+		S_DAX = false;
+
+To reiterate: Setting, and inheritance, continues to affect FS_XFLAG_DAX even
+while the file system is mounted with a dax override. However, the file's enabled
+state, S_DAX, will continue to be overridden until the file system is
+remounted with dax=inode.
 
 
 Implementation Tips for Block Driver Writers