From patchwork Fri Apr 7 09:32:43 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Rajendra X-Patchwork-Id: 9669179 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id C9454602A0 for ; Fri, 7 Apr 2017 09:33:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B8F9F285D2 for ; Fri, 7 Apr 2017 09:33:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ADC52285EF; Fri, 7 Apr 2017 09:33:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C17D9285C5 for ; Fri, 7 Apr 2017 09:33:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753485AbdDGJdH (ORCPT ); Fri, 7 Apr 2017 05:33:07 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:57788 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751759AbdDGJdG (ORCPT ); Fri, 7 Apr 2017 05:33:06 -0400 Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v379O0BL112124 for ; Fri, 7 Apr 2017 05:33:05 -0400 Received: from e15.ny.us.ibm.com (e15.ny.us.ibm.com [129.33.205.205]) by mx0a-001b2d01.pphosted.com with ESMTP id 29p2h60d3d-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 07 Apr 2017 05:33:05 -0400 Received: from localhost by e15.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 7 Apr 2017 05:33:04 -0400 Received: from b01cxnp23034.gho.pok.ibm.com (9.57.198.29) by e15.ny.us.ibm.com (146.89.104.202) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 7 Apr 2017 05:33:01 -0400 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp23034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v379X06e45416480; Fri, 7 Apr 2017 09:33:00 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5687FAC05B; Fri, 7 Apr 2017 05:32:50 -0400 (EDT) Received: from localhost.in.ibm.com (unknown [9.124.35.185]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTP id C21A0AC03A; Fri, 7 Apr 2017 05:32:48 -0400 (EDT) From: Chandan Rajendra To: hch@lst.de Cc: Chandan Rajendra , linux-xfs@vger.kernel.org, darrick.wong@oracle.com, sandeen@sandeen.net, linux-fsdevel@vger.kernel.org Subject: [PATCH] iomap_dio_rw: Prevent reading file data beyond iomap_dio->i_size Date: Fri, 7 Apr 2017 15:02:43 +0530 X-Mailer: git-send-email 2.5.5 X-TM-AS-GCONF: 00 x-cbid: 17040709-0036-0000-0000-000001D40FCF X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00006892; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000208; SDB=6.00844203; UDB=6.00416107; IPR=6.00622501; BA=6.00005274; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00014946; XFM=3.00000013; UTC=2017-04-07 09:33:02 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17040709-0037-0000-0000-00003F8511BF Message-Id: <1491557563-3278-1-git-send-email-chandan@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-04-07_08:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1702020001 definitions=main-1704070082 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On a ppc64 machine executing overlayfs/019 with xfs as the lower and upper filesystem causes the following call trace, WARNING: CPU: 2 PID: 8034 at /root/repos/linux/fs/iomap.c:765 .iomap_dio_actor+0xcc/0x420 Modules linked in: CPU: 2 PID: 8034 Comm: fsstress Tainted: G L 4.11.0-rc5-next-20170405 #100 task: c000000631314880 task.stack: c0000003915d4000 NIP: c00000000035a72c LR: c00000000035a6f4 CTR: c00000000035a660 REGS: c0000003915d7570 TRAP: 0700 Tainted: G L (4.11.0-rc5-next-20170405) MSR: 800000000282b032 CR: 24004284 XER: 00000000 CFAR: c0000000006f7190 SOFTE: 1 GPR00: c00000000035a6f4 c0000003915d77f0 c0000000015a3f00 000000007c22f600 GPR04: 000000000022d000 0000000000002600 c0000003b2d56360 c0000003915d7960 GPR08: c0000003915d7cd0 0000000000000002 0000000000002600 c000000000521cc0 GPR12: 0000000024004284 c00000000fd80a00 000000004b04ae64 ffffffffffffffff GPR16: 000000001000ca70 0000000000000000 c0000003b2d56380 c00000000153d2b8 GPR20: 0000000000000010 c0000003bc87bac8 0000000000223000 000000000022f5ff GPR24: c0000003b2d56360 000000000000000c 0000000000002600 000000000022d000 GPR28: 0000000000000000 c0000003915d7960 c0000003b2d56360 00000000000001ff NIP [c00000000035a72c] .iomap_dio_actor+0xcc/0x420 LR [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 Call Trace: [c0000003915d77f0] [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 (unreliable) [c0000003915d78f0] [c00000000035b9f4] .iomap_apply+0xf4/0x1f0 [c0000003915d79d0] [c00000000035c320] .iomap_dio_rw+0x230/0x420 [c0000003915d7ae0] [c000000000512a14] .xfs_file_dio_aio_read+0x84/0x160 [c0000003915d7b80] [c000000000512d24] .xfs_file_read_iter+0x104/0x130 [c0000003915d7c10] [c0000000002d6234] .__vfs_read+0x114/0x1a0 [c0000003915d7cf0] [c0000000002d7a8c] .vfs_read+0xac/0x1a0 [c0000003915d7d90] [c0000000002d96b8] .SyS_read+0x58/0x100 [c0000003915d7e30] [c00000000000b8e0] system_call+0x38/0xfc Instruction dump: 78630020 7f831b78 7ffc07b4 7c7ce039 40820360 a13d0018 2f890003 419e0288 2f890004 419e00a0 2f890001 419e02a8 <0fe00000> 3b80fffb 38210100 7f83e378 The above problem can also be recreated on a regular xfs filesystem using the command, $ fsstress -d /mnt -l 1000 -n 1000 -p 1000 The reason for the call trace is, 1. When 'reserving' blocks for delayed allocation , XFS reserves more blocks (i.e. past file's current EOF) than required. This is done because XFS assumes that userspace might write more data and hence 'reserving' more blocks might lead to the file's new data being stored contiguously on disk. 2. The in-memory 'struct xfs_bmbt_irec' mapping the file's last extent would then cover the prealloc-ed EOF blocks in addition to the regular blocks. 3. When flushing the dirty blocks to disk, we only flush data till the file's EOF. But before writing out the dirty data, we allocate blocks on the disk for holding the file's new data. This allocation includes the blocks that are part of the 'prealloc EOF blocks'. 4. Later, when the last reference to the inode is being closed, XFS frees the unused 'prealloc EOF blocks' in xfs_inactive(). In step 3 above, When allocating space on disk for the delayed allocation range, the space allocator might sometimes allocate less blocks than required. If such an allocation ends right at the current EOF of the file, We will not be able to clear the "delayed allocation" flag for the 'prealloc EOF blocks', since we won't have dirty buffer heads associated with that range of the file. In such a situation if a Direct I/O read operation is performed on file range [X, Y] (where X < EOF and Y > EOF), we flush dirty data in the range [X, Y] and invalidate page cache for that range (Refer to iomap_dio_rw()). Later for performing the Direct I/O read, XFS obtains the extent items (which are still cached in memory) for the file range. When doing so we are not supposed to get an extent item with IOMAP_DELALLOC flag set, since the previous "flush" operation should have converted any delayed allocation data in the range [X, Y]. Hence we end up hitting a WARN_ON_ONCE(1) statement in iomap_dio_actor(). This commit fixes the bug by preventing the read operation from going beyond iomap_dio->i_size. Reported-by: Santhosh G Signed-off-by: Chandan Rajendra Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- Changelog: RFC->V1: Made trivial changes to address Christoph's review comments. I will send a patch to add a corresponding test to xfstests soon. fs/iomap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/iomap.c b/fs/iomap.c index 0b457ff..1ff0c9c 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -904,6 +904,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, break; } pos += ret; + + if (iov_iter_rw(iter) == READ && pos >= dio->i_size) + break; } while ((count = iov_iter_count(iter)) > 0); blk_finish_plug(&plug);