From patchwork Wed Apr  6 03:43:32 2016
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 8757911
From: Ming Lei
To: Jens Axboe, linux-kernel@vger.kernel.org
Cc: linux-block@vger.kernel.org, kent.overstreet@gmail.com,
    Christoph Hellwig, Eric Wheeler, Sebastian Roesner,
    Ming Lei, stable@vger.kernel.org (4.3+), Shaohua Li
Subject: [PATCH v1] block: make sure a big bio is split into at most 256 bvecs
Date: Wed, 6 Apr 2016 11:43:32 +0800
Message-Id: <1459914212-9330-1-git-send-email-ming.lei@canonical.com>
X-Mailer: git-send-email 1.9.1

Now that arbitrary bio sizes are supported, an incoming bio may be very
big. For safety reasons, such as bio_clone(), we have to split it into
smaller bios so that each holds at most BIO_MAX_PAGES bvecs.

This patch fixes the following kernel crash:

> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [  172.660229] IP: [] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813] []  ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846] []  ? blk_queue_split+0x377/0x3d4
> [  172.664880] []  ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912] []  ? generic_make_request+0xb5/0x155
> [  172.664947] []  ? prio_io+0x85/0x95 [bcache]
> [  172.664981] []  ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016] []  ? register_bcache+0x1006/0x1174 [bcache]

Fixes: 54efd50 ("block: make generic_make_request handle arbitrarily sized bios")
Reported-by: Sebastian Roesner
Reported-by: Eric Wheeler
Cc: stable@vger.kernel.org (4.3+)
Cc: Shaohua Li
Cc: Kent Overstreet
Signed-off-by: Ming Lei
Acked-by: Kent Overstreet
---
V1:
	- Kent pointed out that limiting the max io size cannot cover
	  the case of non-full bvecs/pages

The issue can be reproduced with the following steps:
	- create one raid1 over two virtio-blk devices
	- build a bcache device over the above raid1 and another cache
	  device, with the bucket size set to 2 Mbytes
	- set the cache mode to writeback
	- run random writes over ext4 on the bcache device
	- the crash is then triggered

 block/blk-merge.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..7b96471 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -94,8 +94,10 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	bool do_split = true;
 	struct bio *new = NULL;
 	const unsigned max_sectors = get_max_io_size(q, bio);
+	unsigned bvecs = 0;
 
 	bio_for_each_segment(bv, bio, iter) {
+		bvecs++;
 		/*
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
@@ -103,6 +105,23 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
 			goto split;
 
+		/*
+		 * With arbitrary bio size, the incoming bio may be very
+		 * big. We have to split the bio into small bios so that
+		 * each holds at most BIO_MAX_PAGES bvecs, because
+		 * bio_clone() can fail to allocate big bvecs.
+		 *
+		 * It would be better to apply the limit per request
+		 * queue in which bio_clone() is involved, instead
+		 * of globally. The biggest blocker is bio_clone()
+		 * in the bio bounce code.
+		 *
+		 * TODO: deal with bio bounce's bio_clone() gracefully
+		 * and convert the global limit into a per-queue limit.
+		 */
+		if (bvecs >= BIO_MAX_PAGES)
+			goto split;
+
 		if (sectors + (bv.bv_len >> 9) > max_sectors) {
 			/*
 			 * Consider this a new segment if we're splitting in