From patchwork Wed Nov 16 16:09:17 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9432103
Subject: Re: [PATCHSET] Add support for simplified async direct-io
To: Christoph Hellwig
From: Jens Axboe
Date: Wed, 16 Nov 2016 09:09:17 -0700
References: <1479144519-15738-1-git-send-email-axboe@fb.com>
 <20161114173728.GA22167@infradead.org>
 <71ce9ae3-214d-b248-3507-cc96bea41a9b@fb.com>
 <20161114180052.GA24476@infradead.org>
 <4f30a528-9996-4c6a-9513-8aa2054e4d4b@fb.com>
 <20161114180503.GA31126@infradead.org>
 <20161114181119.GA9396@infradead.org>
In-Reply-To: <20161114181119.GA9396@infradead.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

On 11/14/2016 11:11 AM, Christoph Hellwig wrote:
> On Mon, Nov 14, 2016 at 11:08:46AM -0700, Jens Axboe wrote:
>> It'd be cleaner to loop one level out, and avoid all that 'dio' stuff
>> instead. And then still retain the separate parts of the sync and async.
>> There's nothing to share there imho, and it just makes the code harder
>> to read.
>
> How do you avoid it for the async case?  We can only call ki_complete
> once all bios have finished, which means we need a tracking structure
> for it.
> For the synchronous case we could in theory wait for the
> previous bio before sending the next, but there are plenty of RAID
> arrays that would prefer > 1MB I/O.  And we can pretty much reuse the
> async case for this anyway.

Another idea - just limit the max we can support in the simplified
version as a single bio. If the user is requesting more, fall back to
the old slow path. Doesn't really matter, since the transfer size will
be large enough to hopefully overcome any deficiencies in the old path.

That avoids having the atomic inc/dec in the fast path, and having to
loop for both aio/sync O_DIRECT.

Example below.


diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7c3ec60..becc78e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -176,9 +176,68 @@ static struct inode *bdev_file_inode(struct file *file)
 	return file->f_mapping->host;
 }
 
+static void blkdev_bio_end_io_async(struct bio *bio)
+{
+	struct kiocb *iocb = bio->bi_private;
+
+	iocb->ki_complete(iocb, bio->bi_error, 0);
+
+	if (bio_op(bio) == REQ_OP_READ)
+		bio_check_pages_dirty(bio);
+	else
+		bio_put(bio);
+}
+
+static ssize_t
+__blkdev_direct_IO_async(struct kiocb *iocb, struct iov_iter *iter,
+			 int nr_pages)
+{
+	struct file *file = iocb->ki_filp;
+	struct block_device *bdev = I_BDEV(bdev_file_inode(file));
+	unsigned blkbits = blksize_bits(bdev_logical_block_size(bdev));
+	loff_t pos = iocb->ki_pos;
+	struct bio *bio;
+	ssize_t ret;
+
+	if ((pos | iov_iter_alignment(iter)) & ((1 << blkbits) - 1))
+		return -EINVAL;
+
+	bio = bio_alloc(GFP_KERNEL, nr_pages);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_bdev = bdev;
+	bio->bi_iter.bi_sector = pos >> blkbits;
+	bio->bi_private = iocb;
+	bio->bi_end_io = blkdev_bio_end_io_async;
+
+	ret = bio_iov_iter_get_pages(bio, iter);
+	if (unlikely(ret))
+		return ret;
+
+	/*
+	 * Overload bio size in error. If it gets set, we lose the
+	 * size, but we don't need the size for that case. IO is limited
+	 * to BIO_MAX_PAGES, so we can't overflow.
+	 */
+	ret = bio->bi_error = bio->bi_iter.bi_size;
+
+	if (iov_iter_rw(iter) == READ) {
+		bio_set_op_attrs(bio, REQ_OP_READ, 0);
+		bio_set_pages_dirty(bio);
+	} else {
+		bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+		task_io_account_write(ret);
+	}
+
+	submit_bio(bio);
+	iocb->ki_pos += ret;
+	return -EIOCBQUEUED;
+}
+
 #define DIO_INLINE_BIO_VECS 4
 
-static void blkdev_bio_end_io_simple(struct bio *bio)
+static void blkdev_bio_end_io_sync(struct bio *bio)
 {
 	struct task_struct *waiter = bio->bi_private;
 
@@ -187,13 +246,12 @@ static void blkdev_bio_end_io_simple(struct bio *bio)
 }
 
 static ssize_t
-__blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
-		int nr_pages)
+__blkdev_direct_IO_sync(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 {
 	struct file *file = iocb->ki_filp;
 	struct block_device *bdev = I_BDEV(bdev_file_inode(file));
 	unsigned blkbits = blksize_bits(bdev_logical_block_size(bdev));
-	struct bio_vec inline_vecs[DIO_INLINE_BIO_VECS], *bvec;
+	struct bio_vec inline_vecs[DIO_INLINE_BIO_VECS], *vecs, *bvec;
 	loff_t pos = iocb->ki_pos;
 	bool should_dirty = false;
 	struct bio bio;
@@ -204,13 +262,21 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	if ((pos | iov_iter_alignment(iter)) & ((1 << blkbits) - 1))
 		return -EINVAL;
 
+	if (nr_pages <= DIO_INLINE_BIO_VECS)
+		vecs = inline_vecs;
+	else {
+		vecs = kmalloc(nr_pages * sizeof(struct bio_vec), GFP_KERNEL);
+		if (!vecs)
+			return -ENOMEM;
+	}
+
 	bio_init(&bio);
 	bio.bi_max_vecs = nr_pages;
-	bio.bi_io_vec = inline_vecs;
+	bio.bi_io_vec = vecs;
 	bio.bi_bdev = bdev;
 	bio.bi_iter.bi_sector = pos >> blkbits;
 	bio.bi_private = current;
-	bio.bi_end_io = blkdev_bio_end_io_simple;
+	bio.bi_end_io = blkdev_bio_end_io_sync;
 
 	ret = bio_iov_iter_get_pages(&bio, iter);
 	if (unlikely(ret))
@@ -243,6 +309,9 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 		put_page(bvec->bv_page);
 	}
 
+	if (vecs != inline_vecs)
+		kfree(vecs);
+
 	if (unlikely(bio.bi_error))
 		return bio.bi_error;
 	iocb->ki_pos += ret;
@@ -252,18 +321,25 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 static ssize_t
 blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 {
-	struct file *file = iocb->ki_filp;
-	struct inode *inode = bdev_file_inode(file);
 	int nr_pages;
 
-	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
+	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
 	if (!nr_pages)
 		return 0;
-	if (is_sync_kiocb(iocb) && nr_pages <= DIO_INLINE_BIO_VECS)
-		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
-	return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
-				    blkdev_get_block, NULL, NULL,
-				    DIO_SKIP_DIO_COUNT);
+
+	if (nr_pages > BIO_MAX_PAGES) {
+		struct file *file = iocb->ki_filp;
+		struct inode *inode = bdev_file_inode(file);
+
+		return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
+					    blkdev_get_block, NULL, NULL,
+					    DIO_SKIP_DIO_COUNT);
+	}
+
+	if (is_sync_kiocb(iocb))
+		return __blkdev_direct_IO_sync(iocb, iter, nr_pages);
+
+	return __blkdev_direct_IO_async(iocb, iter, nr_pages);
 }
 
 int __sync_blockdev(struct block_device *bdev, int wait)
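
For readers following the thread, the path selection that blkdev_direct_IO()
makes in the patch can be modelled in user space roughly as below. This is
only a sketch, not kernel code: SKETCH_BIO_MAX_PAGES, enum dio_path,
clamp_npages() and pick_dio_path() are hypothetical names invented for
illustration, and the value 256 is an assumption about BIO_MAX_PAGES in
kernels of this era. It shows why the page count is clamped at one past the
limit: that is enough to tell "fits in one bio" apart from "needs the old
path" without counting every page.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stands in for the kernel's BIO_MAX_PAGES (believed to be 256). */
#define SKETCH_BIO_MAX_PAGES 256

/* Hypothetical labels for the three paths the patch chooses between. */
enum dio_path {
	DIO_PATH_NONE,		/* zero pages: nothing to do, return 0 */
	DIO_PATH_LEGACY,	/* too big for one bio: __blockdev_direct_IO() */
	DIO_PATH_SYNC,		/* one bio, sync: __blkdev_direct_IO_sync() */
	DIO_PATH_ASYNC,		/* one bio, aio: __blkdev_direct_IO_async() */
};

/*
 * Models the clamping of iov_iter_npages(iter, BIO_MAX_PAGES + 1):
 * the result never exceeds the limit plus one.
 */
static int clamp_npages(long pages)
{
	if (pages > SKETCH_BIO_MAX_PAGES + 1)
		return SKETCH_BIO_MAX_PAGES + 1;
	return (int)pages;
}

/* Mirrors the order of the checks in the patched blkdev_direct_IO(). */
static enum dio_path pick_dio_path(int nr_pages, bool is_sync)
{
	assert(nr_pages >= 0 && nr_pages <= SKETCH_BIO_MAX_PAGES + 1);

	if (nr_pages == 0)
		return DIO_PATH_NONE;
	if (nr_pages > SKETCH_BIO_MAX_PAGES)
		return DIO_PATH_LEGACY;
	return is_sync ? DIO_PATH_SYNC : DIO_PATH_ASYNC;
}
```

For example, pick_dio_path(clamp_npages(100000), false) selects the legacy
path, while pick_dio_path(clamp_npages(8), false) selects the single-bio
async path, so neither fast path needs a per-request completion counter.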