From patchwork Thu Mar 24 23:17:30 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Verma, Vishal L"
X-Patchwork-Id: 8665461
Return-Path:
X-Original-To: patchwork-linux-block@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 51B739F44D
	for ; Thu, 24 Mar 2016 23:19:02 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 361402025A
	for ; Thu, 24 Mar 2016 23:19:01 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 029FF20268
	for ; Thu, 24 Mar 2016 23:19:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751728AbcCXXSo (ORCPT );
	Thu, 24 Mar 2016 19:18:44 -0400
Received: from mga11.intel.com ([192.55.52.93]:40303 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751668AbcCXXSH (ORCPT );
	Thu, 24 Mar 2016 19:18:07 -0400
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by fmsmga102.fm.intel.com with ESMTP; 24 Mar 2016 16:18:06 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,387,1455004800"; d="scan'208";a="944535754"
Received: from omniknight.lm.intel.com ([10.232.112.171])
	by fmsmga002.fm.intel.com with ESMTP; 24 Mar 2016 16:18:06 -0700
From: Vishal Verma
To: linux-nvdimm@lists.01.org
Cc: Vishal Verma, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, xfs@oss.sgi.com,
	linux-ext4@vger.kernel.org, linux-mm@kvack.org, Matthew Wilcox,
	Ross Zwisler, Dan Williams, Dave Chinner, Jan Kara, Jens Axboe,
	Al Viro, Andrew Morton
Subject: [PATCH 5/5] dax: handle media errors in dax_do_io
Date: Thu, 24 Mar 2016 17:17:30 -0600
Message-Id: <1458861450-17705-6-git-send-email-vishal.l.verma@intel.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1458861450-17705-1-git-send-email-vishal.l.verma@intel.com>
References: <1458861450-17705-1-git-send-email-vishal.l.verma@intel.com>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

dax_do_io (called for read() or write() for a dax file system) may fail
in the presence of bad blocks or media errors. Since we expect that a
write should clear media errors on nvdimms, make dax_do_io fall back to
the direct_IO path, which will send down a bio to the driver, which can
then attempt to clear the error.

Cc: Matthew Wilcox
Cc: Dan Williams
Cc: Ross Zwisler
Cc: Dave Chinner
Cc: Jan Kara
Cc: Jens Axboe
Cc: Al Viro
Signed-off-by: Vishal Verma
---
 fs/block_dev.c      |  5 +++--
 fs/dax.c            | 34 ++++++++++++++++++++++++++++++++--
 fs/ext2/inode.c     |  5 +++--
 fs/ext4/indirect.c  | 11 +++++++----
 fs/ext4/inode.c     |  5 +++--
 fs/xfs/xfs_aops.c   |  7 ++++---
 include/linux/dax.h |  6 +++++-
 7 files changed, 57 insertions(+), 16 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9c0765b..f3873ab 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -168,8 +168,9 @@ blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, loff_t offset)
 	struct inode *inode = bdev_file_inode(file);
 
 	if (IS_DAX(inode))
-		return dax_do_io(iocb, inode, iter, offset, blkdev_get_block,
-				NULL, DIO_SKIP_DIO_COUNT);
+		return dax_do_io(iocb, inode, I_BDEV(inode), iter, offset,
+				blkdev_get_block, blkdev_get_block,
+				NULL, NULL, DIO_SKIP_DIO_COUNT);
 	return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter, offset,
 				    blkdev_get_block, NULL, NULL,
 				    DIO_SKIP_DIO_COUNT);
diff --git a/fs/dax.c b/fs/dax.c
index a30481e..b90c8e9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -208,7 +208,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 }
 
 /**
- * dax_do_io - Perform I/O to a DAX file
+ * __dax_do_io - Perform I/O to a DAX file
  * @iocb: The control block for this I/O
  * @inode: The file which the I/O is directed at
  * @iter: The addresses to do I/O from or to
@@ -224,7 +224,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
  * As with do_blockdev_direct_IO(), we increment i_dio_count while the I/O
  * is in progress.
  */
-ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode,
+ssize_t __dax_do_io(struct kiocb *iocb, struct inode *inode,
 		  struct iov_iter *iter, loff_t pos, get_block_t get_block,
 		  dio_iodone_t end_io, int flags)
 {
@@ -262,8 +262,38 @@ ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode,
  out:
 	return retval;
 }
+EXPORT_SYMBOL_GPL(__dax_do_io);
+
+/*
+ * This is a library function for use by file systems. It will perform a
+ * fallback to direct_io semantics if the dax_io fails due to a media error.
+ */
+ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode,
+		struct block_device *bdev, struct iov_iter *iter, loff_t pos,
+		get_block_t dax_get_block, get_block_t dio_get_block,
+		dio_iodone_t end_io, dio_submit_t submit_io, int flags)
+{
+	ssize_t retval;
+
+	retval = __dax_do_io(iocb, inode, iter, pos, dax_get_block, end_io,
+			flags);
+	if (iov_iter_rw(iter) == WRITE && retval == -EIO) {
+		/*
+		 * __dax_do_io may have failed a write due to a bad block.
+		 * Retry with direct_io, and if the direct_IO also fails,
+		 * return -EIO as that was the original error that led us
+		 * down the direct_IO path.
+		 */
+		retval = __blockdev_direct_IO(iocb, inode, bdev, iter, pos,
+				dio_get_block, end_io, submit_io, flags);
+		if (retval < 0)
+			return -EIO;
+	}
+	return retval;
+}
 EXPORT_SYMBOL_GPL(dax_do_io);
 
+
 /*
  * The user has performed a load from a hole in the file.  Allocating
  * a new page in the file would cause excessive storage usage for
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 824f249..8a307cf 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -862,8 +862,9 @@ ext2_direct_IO(struct kiocb *iocb, struct iov_iter *iter, loff_t offset)
 	ssize_t ret;
 
 	if (IS_DAX(inode))
-		ret = dax_do_io(iocb, inode, iter, offset, ext2_get_block, NULL,
-				DIO_LOCKING);
+		ret = dax_do_io(iocb, inode, inode->i_sb->s_bdev, iter,
+				offset, ext2_get_block, ext2_get_block,
+				NULL, NULL, DIO_LOCKING | DIO_SKIP_HOLES);
 	else
 		ret = blockdev_direct_IO(iocb, inode, iter, offset,
 					 ext2_get_block);
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index 355ef9c..4b087b7 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -692,8 +692,9 @@ retry:
 			goto locked;
 		}
 		if (IS_DAX(inode))
-			ret = dax_do_io(iocb, inode, iter, offset,
-					ext4_get_block, NULL, 0);
+			ret = dax_do_io(iocb, inode, inode->i_sb->s_bdev, iter,
+					offset, ext4_get_block, ext4_get_block,
+					NULL, NULL, 0);
 		else
 			ret = __blockdev_direct_IO(iocb, inode,
 					inode->i_sb->s_bdev, iter,
@@ -703,8 +704,10 @@ retry:
 	} else {
 locked:
 		if (IS_DAX(inode))
-			ret = dax_do_io(iocb, inode, iter, offset,
-					ext4_get_block, NULL, DIO_LOCKING);
+			ret = dax_do_io(iocb, inode, inode->i_sb->s_bdev, iter,
+					offset, ext4_get_block, ext4_get_block,
+					NULL, NULL, DIO_LOCKING |
+					DIO_SKIP_HOLES);
 		else
 			ret = blockdev_direct_IO(iocb, inode, iter, offset,
 						 ext4_get_block);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index aee960b..4220dac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3315,8 +3315,9 @@ static ssize_t ext4_ext_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 	BUG_ON(ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode));
 #endif
 	if (IS_DAX(inode))
-		ret = dax_do_io(iocb, inode, iter, offset, get_block_func,
-				ext4_end_io_dio, dio_flags);
+		ret = dax_do_io(iocb, inode, inode->i_sb->s_bdev, iter, offset,
+				get_block_func, get_block_func,
+				ext4_end_io_dio, NULL, dio_flags);
 	else
 		ret = __blockdev_direct_IO(iocb, inode,
 					   inode->i_sb->s_bdev, iter, offset,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a9ebabfe..dc4e088 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1682,11 +1682,12 @@ xfs_vm_do_dio(
 						void *private),
 	int			flags)
 {
-	struct block_device	*bdev;
+	struct block_device	*bdev = xfs_find_bdev_for_inode(inode);
 
 	if (IS_DAX(inode))
-		return dax_do_io(iocb, inode, iter, offset,
-				 xfs_get_blocks_direct, endio, 0);
+		return dax_do_io(iocb, inode, bdev, iter, offset,
+				xfs_get_blocks_direct, xfs_get_blocks_direct,
+				endio, NULL, flags);
 
 	bdev = xfs_find_bdev_for_inode(inode);
 	return __blockdev_direct_IO(iocb, inode, bdev, iter, offset,
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 933198a..6981076 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -5,8 +5,12 @@
 #include
 #include
 
-ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t,
+ssize_t __dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t,
 		  get_block_t, dio_iodone_t, int flags);
+ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode,
+		struct block_device *bdev, struct iov_iter *iter, loff_t pos,
+		get_block_t dax_get_block, get_block_t dio_get_block,
+		dio_iodone_t end_io, dio_submit_t submit_io, int flags);
 int dax_zero_page_range(struct inode *, loff_t from, unsigned len, get_block_t);
 int dax_truncate_page(struct inode *, loff_t from, get_block_t);
 int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t,