From patchwork Sat Oct 13 00:06:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10639429 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5E67F1869 for ; Sat, 13 Oct 2018 00:06:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A5D42B79F for ; Sat, 13 Oct 2018 00:06:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3E7B32B7A3; Sat, 13 Oct 2018 00:06:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6551D2B79F for ; Sat, 13 Oct 2018 00:06:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6CDCB6B0010; Fri, 12 Oct 2018 20:06:10 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6A32B6B0266; Fri, 12 Oct 2018 20:06:10 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 56BC36B0269; Fri, 12 Oct 2018 20:06:10 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by kanga.kvack.org (Postfix) with ESMTP id 2C69D6B0010 for ; Fri, 12 Oct 2018 20:06:10 -0400 (EDT) Received: by mail-qt1-f198.google.com with SMTP id y23-v6so13683610qtc.7 for ; Fri, 12 Oct 2018 17:06:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:subject:from:to:cc:date :message-id:in-reply-to:references:user-agent:mime-version :content-transfer-encoding; bh=zrdKK22K4yPwdg97C40ayFoftWNqhnfImZiJAlzmShI=; b=uIvbBm7dUk1mS8avt8qlnFLO5hYwT9iqvtHhlwhNxD1aTCOVXOtbXyeHALlf3rVdnO HRpK3UQ1VefKRxY8RyTgUz/wbYWRwpcwdTSxQXs0xnPQIJSEH/PvLmKWvyvwIitxDHF3 /aGJeH5snDjp7SAc9Rwo0viQbRw9//zYUqKlcEl0i4jYzdvsMC1VJQw0LCDKqQfYHbBc 3c0GQaycXXbyTN+lDKlgo7o8Dg7Bwt7caJoQHdNW1E78985UhDtroFE7PVunP/bTpoWp sQ4iTDm6qaPgJgHCzVK1svj1toVVukN4SUIMf1kZYSp0hvNCS4PyofN4NsTCFnoWhLMF EiAg== X-Gm-Message-State: ABuFfojAZPavqeG3zkHtPZgtxPUMHZbQRkw+2tkRf1G2hQ2CZsAP+FZ4 Ts/UVmnekh0puZ406qy6BeuLXS6OjrWw6oFf7YyOKOxbVqBoNnu7TdyfJv9bn1fHRyinj3cAsof dPA21u2TCUnJBk4krXxXCgz9ekcWLjjQghu6QckvxmcwfrPVFZaw0TjI3HYUrOfzzZA== X-Received: by 2002:a37:35c8:: with SMTP id c191-v6mr8031652qka.100.1539389169925; Fri, 12 Oct 2018 17:06:09 -0700 (PDT) X-Google-Smtp-Source: ACcGV61x6uQtJQfYPXv7va9ZINCrn3Etf7HmdIFojt3O2QiiOFcfpwvR9WqlmI6GJOynElaazW88 X-Received: by 2002:a37:35c8:: with SMTP id c191-v6mr8031613qka.100.1539389169047; Fri, 12 Oct 2018 17:06:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1539389169; cv=none; d=google.com; s=arc-20160816; b=Un72W7zLTBM/D02nph+H+06C/q3lC0q6egcrsE7z8UihmAsjLOhbQqCT2jd7IpdtiZ cx3CfOefOJMabc6oQXJnRRY1iLsEN0G01/fti0fGgG7eTQHB2OUrMe/IjWIkXE3hmKcx kRK+Mm3zLve12rZLaOnwPbhCX4noW9+qxZoifBK/ArRWQy1PhT0gobO2PRWPEyJxXLhX oC+fhvfsAj4Kss3YBTPpa0B0u8Zwo+EuTUoL2pfGcLjPphXTmEafO5CqzIdGFVIWXOIV 2USInVNWsBtUU9H5i5iu6auK8/J5VkS91K2S2u3yzUTJ2VzTJ8EFn2i2UeX/9JYiTXAL hHzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:user-agent:references :in-reply-to:message-id:date:cc:to:from:subject:dkim-signature; bh=zrdKK22K4yPwdg97C40ayFoftWNqhnfImZiJAlzmShI=; b=xZjwEuYR6YMQrCeiZxECI/B0Qf0aLJsGPCk4zJpTlxUIWmN5fUqRiOwzbhP07EAQDJ NfuOocICa7u/A1ZowOjbAppwTSdV6Tf5H/Egy8rCvgwXb2pJ1YneMoFyg2PvMB1LTpnV boEFMIDpb2dT9NyLXporNSj9uFk96sjve1ZNHl1yg9GMK/ZfVzMmBd/PStc0778Ku/gp u2o9pev6tuvxY3hxoHj/K4D2/hwGb6qEe/Z21db3Zgo+3F+kvEZWg1uJQBNjqUGLQF76 jyENYTKgvMXVzQu7anX181O6w4IT6BYdOsk6ISHQokTRKJg9CZtKwm6e3x/DqBaMZrj0 /tWg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=DF9v2Nsw; spf=pass (google.com: domain of darrick.wong@oracle.com designates 156.151.31.86 as permitted sender) smtp.mailfrom=darrick.wong@oracle.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from userp2130.oracle.com (userp2130.oracle.com. [156.151.31.86]) by mx.google.com with ESMTPS id f60-v6si2340902qtd.93.2018.10.12.17.06.08 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 12 Oct 2018 17:06:09 -0700 (PDT) Received-SPF: pass (google.com: domain of darrick.wong@oracle.com designates 156.151.31.86 as permitted sender) client-ip=156.151.31.86; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=DF9v2Nsw; spf=pass (google.com: domain of darrick.wong@oracle.com designates 156.151.31.86 as permitted sender) smtp.mailfrom=darrick.wong@oracle.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w9D042TG143609; Sat, 13 Oct 2018 00:06:07 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=zrdKK22K4yPwdg97C40ayFoftWNqhnfImZiJAlzmShI=; b=DF9v2Nswvy3iyf5vr38RMZSkx5ZJQfksNjmfi4vuZTTwwAWaYcqVDwFRJSkF1oBraPFp to35tL2cQfhK+ltivNPDW5a4Lp+kc9D7R3Hc9iZJhmz6aOtfpQJ5kYb1VmDneH7ahFr9 OuPumC9LII4iIsS1ZAmJYSST7+Sl/UdJXNLDsOzF3BWQ4B141I8CBZYJ7Zk7N8h20dT9 /ZbdAOmq5V0NNdFC38Bf7tG4pcTcoac88t4/gASzCmCpl2Cs2lZM7J3nBDPdkXaiigMb 1uBE2prs+j4x3s8f1TnZVOqnUYy1wdPkBneZYdlSrKvP+zeaSfVwCeQfCLny2gUNvE+4 VA== Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp2130.oracle.com with ESMTP id 2mxmfudpv8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sat, 13 Oct 2018 00:06:06 +0000 Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id w9D066wh001331 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sat, 13 Oct 2018 00:06:06 GMT Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w9D065dc018643; Sat, 13 Oct 2018 00:06:05 GMT Received: from localhost (/10.159.251.254) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Sat, 13 Oct 2018 00:06:05 +0000 Subject: [PATCH 03/25] vfs: check file ranges before cloning files From: "Darrick J. Wong" To: david@fromorbit.com, darrick.wong@oracle.com Cc: sandeen@redhat.com, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, Amir Goldstein , linux-unionfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Christoph Hellwig , ocfs2-devel@oss.oracle.com Date: Fri, 12 Oct 2018 17:06:03 -0700 Message-ID: <153938916339.8361.14543916522029520891.stgit@magnolia> In-Reply-To: <153938912912.8361.13446310416406388958.stgit@magnolia> References: <153938912912.8361.13446310416406388958.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9044 signatures=668706 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=978 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810130000 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Move the file range checks from vfs_clone_file_prep into a separate generic_remap_checks function so that all the checks are collected in a central location. This forms the basis for adding more checks from generic_write_checks that will make cloning's input checking more consistent with write input checking. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Reviewed-by: Amir Goldstein --- fs/ocfs2/refcounttree.c | 2 + fs/read_write.c | 55 +++++++++---------------------------- fs/xfs/xfs_reflink.c | 2 + include/linux/fs.h | 9 ++++-- mm/filemap.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 90 insertions(+), 47 deletions(-) diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c index 7a5ee145c733..19e03936c5e1 100644 --- a/fs/ocfs2/refcounttree.c +++ b/fs/ocfs2/refcounttree.c @@ -4850,7 +4850,7 @@ int ocfs2_reflink_remap_range(struct file *file_in, (OCFS2_I(inode_out)->ip_flags & OCFS2_INODE_SYSTEM_FILE)) goto out_unlock; - ret = vfs_clone_file_prep_inodes(inode_in, pos_in, inode_out, pos_out, + ret = vfs_clone_file_prep(file_in, pos_in, file_out, pos_out, &len, is_dedupe); if (ret <= 0) goto out_unlock; diff --git a/fs/read_write.c b/fs/read_write.c index 260797b01851..d6e8e242a15f 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1717,13 +1717,12 @@ static int clone_verify_area(struct file *file, loff_t pos, u64 len, bool write) * Returns: 0 for "nothing to clone", 1 for "something to clone", or * the usual negative error code. */ -int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in, - struct inode *inode_out, loff_t pos_out, - u64 *len, bool is_dedupe) +int vfs_clone_file_prep(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, + u64 *len, bool is_dedupe) { - loff_t bs = inode_out->i_sb->s_blocksize; - loff_t blen; - loff_t isize; + struct inode *inode_in = file_inode(file_in); + struct inode *inode_out = file_inode(file_out); bool same_inode = (inode_in == inode_out); int ret; @@ -1740,10 +1739,10 @@ int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in, if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) return -EINVAL; - isize = i_size_read(inode_in); - /* Zero length dedupe exits immediately; reflink goes to EOF. */ if (*len == 0) { + loff_t isize = i_size_read(inode_in); + if (is_dedupe || pos_in == isize) return 0; if (pos_in > isize) @@ -1751,36 +1750,11 @@ int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in, *len = isize - pos_in; } - /* Ensure offsets don't wrap and the input is inside i_size */ - if (pos_in + *len < pos_in || pos_out + *len < pos_out || - pos_in + *len > isize) - return -EINVAL; - - /* Don't allow dedupe past EOF in the dest file */ - if (is_dedupe) { - loff_t disize; - - disize = i_size_read(inode_out); - if (pos_out >= disize || pos_out + *len > disize) - return -EINVAL; - } - - /* If we're linking to EOF, continue to the block boundary. */ - if (pos_in + *len == isize) - blen = ALIGN(isize, bs) - pos_in; - else - blen = *len; - - /* Only reflink if we're aligned to block boundaries */ - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - return -EINVAL; - - /* Don't allow overlapped reflink within the same file */ - if (same_inode) { - if (pos_out + blen > pos_in && pos_out < pos_in + blen) - return -EINVAL; - } + /* Check that we don't violate system file offset limits. */ + ret = generic_remap_checks(file_in, pos_in, file_out, pos_out, len, + is_dedupe); + if (ret) + return ret; /* Wait for the completion of any pending IOs on both files */ inode_dio_wait(inode_in); @@ -1813,7 +1787,7 @@ int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in, return 1; } -EXPORT_SYMBOL(vfs_clone_file_prep_inodes); +EXPORT_SYMBOL(vfs_clone_file_prep); int do_clone_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, u64 len) @@ -1851,9 +1825,6 @@ int do_clone_file_range(struct file *file_in, loff_t pos_in, if (ret) return ret; - if (pos_in + len > i_size_read(inode_in)) - return -EINVAL; - ret = file_in->f_op->clone_file_range(file_in, pos_in, file_out, pos_out, len); if (!ret) { diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 42ea7bab9144..281d5f53f2ec 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1326,7 +1326,7 @@ xfs_reflink_remap_prep( if (IS_DAX(inode_in) || IS_DAX(inode_out)) goto out_unlock; - ret = vfs_clone_file_prep_inodes(inode_in, pos_in, inode_out, pos_out, + ret = vfs_clone_file_prep(file_in, pos_in, file_out, pos_out, len, is_dedupe); if (ret <= 0) goto out_unlock; diff --git a/include/linux/fs.h b/include/linux/fs.h index 897eae8faee1..ba93a6e7dac4 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1825,9 +1825,9 @@ extern ssize_t vfs_readv(struct file *, const struct iovec __user *, unsigned long, loff_t *, rwf_t); extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *, loff_t, size_t, unsigned int); -extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in, - struct inode *inode_out, loff_t pos_out, - u64 *len, bool is_dedupe); +extern int vfs_clone_file_prep(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, + u64 *count, bool is_dedupe); extern int do_clone_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, u64 len); extern int vfs_clone_file_range(struct file *file_in, loff_t pos_in, @@ -2967,6 +2967,9 @@ extern int sb_min_blocksize(struct super_block *, int); extern int generic_file_mmap(struct file *, struct vm_area_struct *); extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *); extern ssize_t generic_write_checks(struct kiocb *, struct iov_iter *); +extern int generic_remap_checks(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, + uint64_t *count, bool is_dedupe); extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *); extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *); extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *); diff --git a/mm/filemap.c b/mm/filemap.c index 52517f28e6f4..47e6bfd45a91 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2974,6 +2974,75 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) } EXPORT_SYMBOL(generic_write_checks); +/* + * Performs necessary checks before doing a clone. + * + * Can adjust amount of bytes to clone. + * Returns appropriate error code that caller should return or + * zero in case the clone should be allowed. + */ +int generic_remap_checks(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, + uint64_t *req_count, bool is_dedupe) +{ + struct inode *inode_in = file_in->f_mapping->host; + struct inode *inode_out = file_out->f_mapping->host; + uint64_t count = *req_count; + uint64_t bcount; + loff_t size_in, size_out; + loff_t bs = inode_out->i_sb->s_blocksize; + + /* The start of both ranges must be aligned to an fs block. */ + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_out, bs)) + return -EINVAL; + + /* Ensure offsets don't wrap. */ + if (pos_in + count < pos_in || pos_out + count < pos_out) + return -EINVAL; + + size_in = i_size_read(inode_in); + size_out = i_size_read(inode_out); + + /* Dedupe requires both ranges to be within EOF. */ + if (is_dedupe && + (pos_in >= size_in || pos_in + count > size_in || + pos_out >= size_out || pos_out + count > size_out)) + return -EINVAL; + + /* Ensure the infile range is within the infile. */ + if (pos_in >= size_in) + return -EINVAL; + count = min(count, size_in - (uint64_t)pos_in); + + /* + * If the user wanted us to link to the infile's EOF, round up to the + * next block boundary for this check. + * + * Otherwise, make sure the count is also block-aligned, having + * already confirmed the starting offsets' block alignment. + */ + if (pos_in + count == size_in) { + bcount = ALIGN(size_in, bs) - pos_in; + } else { + if (!IS_ALIGNED(count, bs)) + return -EINVAL; + + bcount = count; + } + + /* Don't allow overlapped cloning within the same file. */ + if (inode_in == inode_out && + pos_out + bcount > pos_in && + pos_out < pos_in + bcount) + return -EINVAL; + + /* For now we don't support changing the length. */ + if (*req_count != count) + return -EINVAL; + + return 0; +} + int pagecache_write_begin(struct file *file, struct address_space *mapping, loff_t pos, unsigned len, unsigned flags, struct page **pagep, void **fsdata)