From patchwork Thu Jul 19 21:01:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Wilck
X-Patchwork-Id: 10535359
From: Martin Wilck
To: Jens Axboe, Ming Lei, Jan Kara
Cc: Hannes Reinecke, Johannes Thumshirn, Christoph Hellwig, Al Viro,
 Kent Overstreet, linux-block@vger.kernel.org, Martin Wilck
Subject: [PATCH v3 3/3] blkdev: __blkdev_direct_IO_simple: make sure to fill
 up the bio
Date: Thu, 19 Jul 2018 23:01:58 +0200
Message-Id: <20180719210158.25923-4-mwilck@suse.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180719210158.25923-1-mwilck@suse.com>
References: <20180719210158.25923-1-mwilck@suse.com>
X-Mailing-List: linux-block@vger.kernel.org

When bio_iov_iter_get_pages() is called from __blkdev_direct_IO_simple(),
we already know that the content of the input iov_iter fits into a single
bio, so we expect iov_iter_count(iter) to drop to 0. But in a single
invocation, bio_iov_iter_get_pages() may add fewer bytes than we expect.
For iov_iters with multiple segments (generated e.g. by writev()), it
behaves like an iterator's next() function, taking only one step (segment)
at a time. Furthermore, MM may fail or refuse to pin all requested pages.
The latter may signify an error condition (in which case an error code
will eventually be returned); the former does not.

Call bio_iov_iter_get_pages() repeatedly to avoid short reads or writes.
Otherwise, __generic_file_write_iter() falls back to buffered writes, which
has been observed to cause data corruption in certain workloads.

Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io")
Signed-off-by: Martin Wilck
---
 fs/block_dev.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index aba2541..561c34e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -222,6 +222,24 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	ret = bio_iov_iter_get_pages(&bio, iter);
 	if (unlikely(ret))
 		goto out;
+
+	/*
+	 * bio_iov_iter_get_pages() may add fewer bytes than we expect:
+	 * - for multi-segment iov_iters, as it only adds one segment at a time
+	 * - if MM refuses or fails to pin all requested pages. In this case,
+	 *   an error is returned eventually if no progress can be made.
+	 */
+	while (iov_iter_count(iter) > 0 && bio.bi_vcnt < bio.bi_max_vecs) {
+		ret = bio_iov_iter_get_pages(&bio, iter);
+		if (unlikely(ret))
+			goto out;
+	}
+	/*
+	 * Our bi_io_vec should be big enough to hold all data from the
+	 * iov_iter, as this has been checked before calling this function.
+	 */
+	WARN_ON_ONCE(iov_iter_count(iter) > 0);
+
 	ret = bio.bi_iter.bi_size;
 
 	if (iov_iter_rw(iter) == READ) {
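
Illustration (not part of the patch): the hunk above relies on bio_iov_iter_get_pages() consuming only one segment per call. The userspace sketch below models that behaviour to show why a single call yields a short transfer while the loop drains the iterator. Everything in it is a hypothetical stand-in: struct mock_iter, struct mock_bio, mock_get_pages(), fill_once() and fill_until_done() are invented for this example and are not kernel API. It also assumes, as a simplification, that each segment occupies exactly one vector slot, whereas the kernel accounts one slot per page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iov_iter: a list of segment lengths. */
struct mock_iter {
	const size_t *seg_len;	/* byte length of each segment */
	size_t nr_segs;		/* number of segments */
	size_t cur;		/* index of the next unconsumed segment */
};

/* Hypothetical stand-in for struct bio: only the fields the loop inspects. */
struct mock_bio {
	size_t size;		/* bytes added so far (cf. bi_iter.bi_size) */
	unsigned int vcnt;	/* slots in use (cf. bi_vcnt) */
	unsigned int max_vecs;	/* slot capacity (cf. bi_max_vecs) */
};

/* Bytes still left in the iterator (cf. iov_iter_count()). */
static size_t mock_iter_count(const struct mock_iter *it)
{
	size_t left = 0;

	for (size_t i = it->cur; i < it->nr_segs; i++)
		left += it->seg_len[i];
	return left;
}

/*
 * Models the one-segment-per-call behaviour described in the commit
 * message. Returns 0 on success, -1 if there is nothing to add or the
 * mock bio is full. Assumption: one segment uses one slot.
 */
static int mock_get_pages(struct mock_bio *bio, struct mock_iter *it)
{
	if (it->cur >= it->nr_segs || bio->vcnt >= bio->max_vecs)
		return -1;
	bio->size += it->seg_len[it->cur++];
	bio->vcnt++;
	return 0;
}

/* Single call: may leave data behind for multi-segment iterators. */
static long fill_once(struct mock_bio *bio, struct mock_iter *it)
{
	if (mock_get_pages(bio, it))
		return -1;
	return (long)bio->size;
}

/* Same shape as the loop added by the patch: repeat until drained or full. */
static long fill_until_done(struct mock_bio *bio, struct mock_iter *it)
{
	if (mock_get_pages(bio, it))
		return -1;
	while (mock_iter_count(it) > 0 && bio->vcnt < bio->max_vecs) {
		if (mock_get_pages(bio, it))
			return -1;
	}
	return (long)bio->size;
}
```

With three segments of 512, 1024 and 2048 bytes and room for eight slots, fill_once() reports 512 bytes and leaves 3072 bytes in the iterator, while fill_until_done() reports all 3584 bytes and leaves the iterator empty. In this model, the leftover after a single call corresponds to the short transfer that the commit message says triggers the buffered-write fallback.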