From patchwork Mon Mar 9 21:32:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428073 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5CCEF14B7 for ; Mon, 9 Mar 2020 21:32:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3C32824654 for ; Mon, 9 Mar 2020 21:32:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="h3l8JCHa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726698AbgCIVc6 (ORCPT ); Mon, 9 Mar 2020 17:32:58 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:41659 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726454AbgCIVc6 (ORCPT ); Mon, 9 Mar 2020 17:32:58 -0400 Received: by mail-pl1-f193.google.com with SMTP id t14so4526881plr.8 for ; Mon, 09 Mar 2020 14:32:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tZmUeitYEXdJfEIJvvQFQzayStsz1CC03NPPcdyYXT4=; b=h3l8JCHayM7rNYWpv044Lmdcw3HGiF7gfao0ibdjohT0AnJeeKVzw86mDcVN33TYof ovqB9KuhZKNrsDBwaDWUeL1L3Onn5F4shBPi2mrDe1hNR2UTxcl0N0lQcEXcLlWnGydT DZNO+7YkuYM/+6xnoulHK+8bE2BwwIbsAtakWgWrdqj1bFRjeBQ2zPf/Vt8s5Xo86wLR NXiqpUhCWcP+zGH7oTlB6Pz0y4G1aPzL9qymBanlMq2V711Mhzr9a4uTAH+EM35pgQp+ CYl0yUyBYgCPDTrlOTFiUrAOnMzL5b4SzXrYQ7uJVyy/SOCehr0pYMqygklUZdxKNxuA tfxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=tZmUeitYEXdJfEIJvvQFQzayStsz1CC03NPPcdyYXT4=; b=G/h/67BO54lqVpHUcwA1/pxDGedmRCWQ540pxSB4fIJIRJwK9ygm2dsCQjT2vN3xlE Xpuaw/d3q/6a/ZS+xeR3vIncfmGFmJpnRydCr/V6QLZ//QPJZK3OEYZJMP0eL5j53lUj JWYwgcmO+Te925jyeHYuwXyJAgvW1VTt+63iaQcC8yXYt2gGQjwXowItUdmENqn2VgMk p75ikNflmajUHsO9cPNX+x5QR3/Gu9MVZWVnX8PWYFF3MKFLFYuoe9b5PsgRAlna4Z7g tJJEHjwEV+tCCFcGDXQiQMLdDpCd6nm2NRKavf95UkZ1Fx51RnFbycSRU4CaM2fnSkW9 Ag3A== X-Gm-Message-State: ANhLgQ0GvfRFFGQ7wCyJYb31FwauH4vKoKoEIaZnP/64mDH8/8ZAqa4F zg6IlfFk+scb+mC8sxnRu8lhiGKPxqs= X-Google-Smtp-Source: ADFU+vtvuPpXsp000CUUjOx4NYbglKvJR9aVFTWW9DHEyjRmtQB4ACmJ56YwL0JoiQ502857pT7ing== X-Received: by 2002:a17:90a:bf83:: with SMTP id d3mr1382792pjs.77.1583789575760; Mon, 09 Mar 2020 14:32:55 -0700 (PDT) Received: from vader.tfbnw.net ([2620:10d:c090:400::5:fe90]) by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.32.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Mar 2020 14:32:55 -0700 (PDT) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 01/15] btrfs: fix error handling when submitting direct I/O bio Date: Mon, 9 Mar 2020 14:32:27 -0700 Message-Id: <4481393496a9dfe99c9432193407ebdaa27d0753.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval If we submit orig_bio in btrfs_submit_direct_hook(), we never increment pending_bios. Then, if btrfs_submit_dio_bio() fails, we decrement pending_bios to -1, and we never complete orig_bio. Fix it by initializing pending_bios to 1 instead of incrementing later. Fixing this exposes another bug: we put orig_bio prematurely and then put it again from end_io. Fix it by not putting orig_bio. 
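
Illustration only, not part of the patch: a minimal userspace sketch of the counting scheme described above, assuming C11 atomics in place of the kernel's atomic_t. The names dio_model, dio_init(), dio_get(), dio_put() and the completed field are hypothetical stand-ins and do not exist in btrfs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for btrfs_dio_private; not the kernel structure. */
struct dio_model {
	atomic_int pending_bios;	/* behaves like a reference count */
	int completed;			/* times the original bio was completed */
};

/* Start at 1: the submitter's reference covers the original bio. */
static void dio_init(struct dio_model *dip)
{
	atomic_init(&dip->pending_bios, 1);
	dip->completed = 0;
}

/* Take one reference for each additional (split) bio that is submitted. */
static void dio_get(struct dio_model *dip)
{
	atomic_fetch_add(&dip->pending_bios, 1);
}

/* Drop a reference; only the drop that reaches zero completes the bio. */
static bool dio_put(struct dio_model *dip)
{
	if (atomic_fetch_sub(&dip->pending_bios, 1) == 1) {
		dip->completed++;
		return true;
	}
	return false;
}
```

With the counter starting at 1, a failed submission of the unsplit bio drops it to 0 and completes the original bio once. Starting at 0, the same drop would go to -1 and the completion would never run.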
After this change, pending_bios is really more of a reference count, but
I'll leave that cleanup separate to keep the fix small.

Fixes: e65e15355429 ("btrfs: fix panic caused by direct IO")
Signed-off-by: Omar Sandoval
Reviewed-by: Josef Bacik
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8a3bc19d83ff..d48a2010f24a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7948,7 +7948,6 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 
 	/* bio split */
 	ASSERT(geom.len <= INT_MAX);
-	atomic_inc(&dip->pending_bios);
 	do {
 		clone_len = min_t(int, submit_len, geom.len);
 
@@ -7998,7 +7997,8 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 	if (!status)
 		return 0;
 
-	bio_put(bio);
+	if (bio != orig_bio)
+		bio_put(bio);
 out_err:
 	dip->errors = 1;
 	/*
@@ -8039,7 +8039,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 	bio->bi_private = dip;
 	dip->orig_bio = bio;
 	dip->dio_bio = dio_bio;
-	atomic_set(&dip->pending_bios, 0);
+	atomic_set(&dip->pending_bios, 1);
 	io_bio = btrfs_io_bio(bio);
 	io_bio->logical = file_offset;
 

From patchwork Mon Mar 9 21:32:28 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 11428075
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 21479139A
	for ; Mon, 9 Mar 2020 21:33:00 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id EE67A24654
	for ; Mon, 9 Mar 2020 21:32:59 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="m6K6/9B4"
Received: (majordomo@vger.kernel.org) by
vger.kernel.org via listexpand id S1726656AbgCIVc6 (ORCPT ); Mon, 9 Mar 2020 17:32:58 -0400 Received: from mail-pj1-f66.google.com ([209.85.216.66]:54214 "EHLO mail-pj1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726168AbgCIVc6 (ORCPT ); Mon, 9 Mar 2020 17:32:58 -0400 Received: by mail-pj1-f66.google.com with SMTP id l36so454531pjb.3 for ; Mon, 09 Mar 2020 14:32:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=PQc3aFeIrpusXZKg49W5UBHhGHHY0qVGeT8e5E9a2kY=; b=m6K6/9B4OXtjItDRbM4S2FH5bbJSEMD61c9fPyG4gcsHY9/PMdblSnVOvqtbiAVPa1 7H/AGNc5uAPLgyF7g8Sf/zNIAjDMU+28+sDKqc0/wlp/dWWVIPkixrC3fJMMk7KR+GZ1 mFJLhAgQIeXeU1DWVWXBFYkoPQa6ka/0ZmEg+ajLGcD20hLjwtGPokdnwH4OVJo5GpyF GclRl//IMZyUzcwN64GXq5VUgYafvbdqw62XzmEONSxSxKzYH4WIzW9bXvHMhFG/ycn2 NzNxJPZlBtGypwUp+AjZX6SH53Z93VRz89G2YCntRcMaMi0jZyXLgqxs5BA4kiMFTCNB 0obQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=PQc3aFeIrpusXZKg49W5UBHhGHHY0qVGeT8e5E9a2kY=; b=mzWL3343YXAoIa6ntAzpRfyIdWESuknL9uV3f1tDlCbLgWQ1YNhareTlTvCrjjMxbf ScCRFRlAEEh5haR6rYNDnLVGmzYs5tezMDScR6rXpIEi9Gy3yy0LXQsLqsUkK0aqyLz6 SXhvyJ20ep3DgsYyE9HpbaaBnxhW+AiMGdpVheUG5BVQioqsFvc/2sqbi/V0tAbJSY7x odKy1Tth+3MMToC7Iws+sZI4BTkNv/dPZ0gAMdpVroTfsrPqim8G0Chfaq80CM2hmXUj YsjtcVECEs7kPnJ1iW5qq5zuF9suBphLFVtbWBSuO8k3C15eY3GwSe10SDvUzkC2SA6u SFUQ== X-Gm-Message-State: ANhLgQ1VFp2yin/HzFxbSUacqVK+Rzkhmy1hcClLfL34YqkIHehcgF8a BzZL8JW+HAV/daXJCxrvk00tpRFqDrg= X-Google-Smtp-Source: ADFU+vshQ355ZZQbguOuznK1w83WQpNaFhAijRnMsQ1bff2rkHI2CQqYqWLxTb2jjovGvEB773Wq+w== X-Received: by 2002:a17:90a:e012:: with SMTP id u18mr556881pjy.190.1583789576811; Mon, 09 Mar 2020 14:32:56 -0700 (PDT) Received: from vader.tfbnw.net 
([2620:10d:c090:400::5:fe90]) by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.32.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Mar 2020 14:32:56 -0700 (PDT) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 02/15] btrfs: fix double __endio_write_update_ordered in direct I/O Date: Mon, 9 Mar 2020 14:32:28 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval In btrfs_submit_direct(), if we fail to allocate the btrfs_dio_private, we complete the ordered extent range. However, we don't mark that the range doesn't need to be cleaned up from btrfs_direct_IO() until later. Therefore, if we fail to allocate the btrfs_dio_private, we complete the ordered extent range twice. We could fix this by updating unsubmitted_oe_range earlier, but it's simpler to always clean up via the bio once the btrfs_dio_private is allocated and leave it for btrfs_direct_IO() before that. 
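
Illustration only, not part of the patch: a simplified sketch of the ownership rule described above, in which exactly one party completes the ordered extent range. All names here (dio_data_model, submit_direct_model(), dip_endio_model(), direct_io_cleanup_model()) are hypothetical, and the model only collapses the range instead of recomputing its end as the real code does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-call state kept by btrfs_direct_IO(). */
struct dio_data_model {
	uint64_t unsubmitted_start;
	uint64_t unsubmitted_end;
	int ordered_completions;	/* times the range was completed */
};

/*
 * Model of btrfs_submit_direct(). Returns true if the private structure was
 * "allocated" and has taken over cleanup of the range.
 */
static bool submit_direct_model(struct dio_data_model *d, bool alloc_ok)
{
	if (!alloc_ok)
		return false;	/* range untouched: the caller cleans up */

	/* Collapse the range to zero length so the caller skips its cleanup. */
	d->unsubmitted_start = d->unsubmitted_end;
	return true;
}

/* Model of the bio end_io path, which completes the range when it owns it. */
static void dip_endio_model(struct dio_data_model *d)
{
	d->ordered_completions++;
}

/* Model of the caller: completes whatever is still marked unsubmitted. */
static void direct_io_cleanup_model(struct dio_data_model *d)
{
	if (d->unsubmitted_start < d->unsubmitted_end) {
		d->ordered_completions++;
		d->unsubmitted_start = d->unsubmitted_end;
	}
}
```

In both the allocation-failure path and the normal path the range ends up completed once. The bug described above corresponds to completing it in the failure path and then again from the caller.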

Fixes: f28a49287817 ("Btrfs: fix leaking of ordered extents after direct IO write error")
Signed-off-by: Omar Sandoval
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/inode.c | 92 ++++++++++++++----------------------------------
 1 file changed, 26 insertions(+), 66 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d48a2010f24a..8e986056be3c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7912,7 +7912,7 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio,
 	return ret;
 }
 
-static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
+static void btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 {
 	struct inode *inode = dip->inode;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -7932,7 +7932,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 	ret = btrfs_get_io_geometry(fs_info, btrfs_op(orig_bio),
 				    start_sector << 9, submit_len, &geom);
 	if (ret)
-		return -EIO;
+		goto out_err;
 
 	if (geom.len >= submit_len) {
 		bio = orig_bio;
@@ -7995,7 +7995,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 submit:
 	status = btrfs_submit_dio_bio(bio, inode, file_offset, async_submit);
 	if (!status)
-		return 0;
+		return;
 
 	if (bio != orig_bio)
 		bio_put(bio);
@@ -8009,9 +8009,6 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 	 */
 	if (atomic_dec_and_test(&dip->pending_bios))
 		bio_io_error(dip->orig_bio);
-
-	/* bio_end_io() will handle error, so we needn't return it */
-	return 0;
 }
 
 static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
@@ -8021,14 +8018,24 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 	struct bio *bio = NULL;
 	struct btrfs_io_bio *io_bio;
 	bool write = (bio_op(dio_bio) == REQ_OP_WRITE);
-	int ret = 0;
 
 	bio = btrfs_bio_clone(dio_bio);
 
 	dip = kzalloc(sizeof(*dip), GFP_NOFS);
 	if (!dip) {
-		ret = -ENOMEM;
-		goto free_ordered;
+		if (!write) {
+			unlock_extent(&BTRFS_I(inode)->io_tree, file_offset,
+				      file_offset + dio_bio->bi_iter.bi_size - 1);
+		}
+
+		dio_bio->bi_status = BLK_STS_RESOURCE;
+		/*
+		 * Releases and cleans up our dio_bio, no need to bio_put() nor
+		 * bio_endio()/bio_io_error() against dio_bio.
+		 */
+		dio_end_io(dio_bio);
+		bio_put(bio);
+		return;
 	}
 
 	dip->private = dio_bio->bi_private;
@@ -8044,72 +8051,25 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 	io_bio->logical = file_offset;
 
 	if (write) {
-		bio->bi_end_io = btrfs_endio_direct_write;
-	} else {
-		bio->bi_end_io = btrfs_endio_direct_read;
-		dip->subio_endio = btrfs_subio_endio_read;
-	}
-
-	/*
-	 * Reset the range for unsubmitted ordered extents (to a 0 length range)
-	 * even if we fail to submit a bio, because in such case we do the
-	 * corresponding error handling below and it must not be done a second
-	 * time by btrfs_direct_IO().
-	 */
-	if (write) {
+		/*
+		 * At this point, the btrfs_dio_private is responsible for
+		 * cleaning up the ordered extents whether or not we submit any
+		 * bios.
+		 */
 		struct btrfs_dio_data *dio_data = current->journal_info;
 
 		dio_data->unsubmitted_oe_range_end = dip->logical_offset +
 			dip->bytes;
 		dio_data->unsubmitted_oe_range_start =
 			dio_data->unsubmitted_oe_range_end;
-	}
-
-	ret = btrfs_submit_direct_hook(dip);
-	if (!ret)
-		return;
-
-	btrfs_io_bio_free_csum(io_bio);
 
-free_ordered:
-	/*
-	 * If we arrived here it means either we failed to submit the dip
-	 * or we either failed to clone the dio_bio or failed to allocate the
-	 * dip. If we cloned the dio_bio and allocated the dip, we can just
-	 * call bio_endio against our io_bio so that we get proper resource
-	 * cleanup if we fail to submit the dip, otherwise, we must do the
-	 * same as btrfs_endio_direct_[write|read] because we can't call these
-	 * callbacks - they require an allocated dip and a clone of dio_bio.
-	 */
-	if (bio && dip) {
-		bio_io_error(bio);
-		/*
-		 * The end io callbacks free our dip, do the final put on bio
-		 * and all the cleanup and final put for dio_bio (through
-		 * dio_end_io()).
-		 */
-		dip = NULL;
-		bio = NULL;
+		bio->bi_end_io = btrfs_endio_direct_write;
 	} else {
-		if (write)
-			__endio_write_update_ordered(inode,
-						file_offset,
-						dio_bio->bi_iter.bi_size,
-						false);
-		else
-			unlock_extent(&BTRFS_I(inode)->io_tree, file_offset,
-			      file_offset + dio_bio->bi_iter.bi_size - 1);
-
-		dio_bio->bi_status = BLK_STS_IOERR;
-		/*
-		 * Releases and cleans up our dio_bio, no need to bio_put()
-		 * nor bio_endio()/bio_io_error() against dio_bio.
-		 */
-		dio_end_io(dio_bio);
+		bio->bi_end_io = btrfs_endio_direct_read;
+		dip->subio_endio = btrfs_subio_endio_read;
 	}
-	if (bio)
-		bio_put(bio);
-	kfree(dip);
+
+	btrfs_submit_direct_hook(dip);
 }
 
 static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info,

From patchwork Mon Mar 9 21:32:29 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 11428077
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 94EBE14B7
	for ; Mon, 9 Mar 2020 21:33:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 758F224649
	for ; Mon, 9 Mar 2020 21:33:02 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="HRiSjGnm"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726859AbgCIVdB (ORCPT ); Mon, 9 Mar 2020 17:33:01 -0400
Received: from mail-pl1-f193.google.com ([209.85.214.193]:45601 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726454AbgCIVdB (ORCPT ); Mon, 9 Mar 2020 17:33:01 -0400
Received: by mail-pl1-f193.google.com with SMTP id b22so4520040pls.12
	for ; Mon, 09 Mar 2020 14:32:58 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9PGvqJ2BUvDQxTmPLiiPTrQxtqUzNKxmoj31iqotjZM=; b=HRiSjGnmE4GTiups6Z+JVIwXeHKK3V72HEmHnBC83Xtj94OuIQQlL1FZynuiDjWTto s7U0nGxrFMwi9EGg6VtYC7JmS+fguDAdK+kBlag5Vp6/0rC9rD0liGT8J9qkn0DlQCka Rl4KjIh9/Q0vkaS35Aa5ChORG5nOf0+YIDMOoxh8X3ROOZRtfweNqJzngzygr4bt6ZWx oJIcerNfGfHxCp+sGjC3l4PHLi7siMqApg3QKo5+C65XKpK4my0FLrJtNEncbVjXRe2o S/OVrE/C7mbEJ441DYC1q8T2HYBtjNP5bYwLQa7SFvu8bfmYjJX6v9PXvNuzGR3Oo0Bf TjNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9PGvqJ2BUvDQxTmPLiiPTrQxtqUzNKxmoj31iqotjZM=; b=KmDR3vY9jZJFV980xAZmE+AeC1dveSH+dZLbYqoSQNGYaOYvpy10DRVCdxT6tkkI9c OD7fJ0yt8JMg/jsg0lDyMQYw2/mHdKj6SldDZ1RjG035A3r0uXgF03Ax71ILn9XJM2O+ PK0O8CO73QibgUL8CJ6k6dUYOjIbwNFnwc/mp49uFMB9By68ha4YvWgT/qMC4laQTku8 ovFzMD/QN+LucuY6j2Me3bzOObIxa23N06O+XjWoAPtEjqBwDW7IQZocER8l6JzAnvAc 6KVs/k6Cvb3JzjY4BCo67z5uE1DpAK7fFrjde16gE2qvMMOvPm5kLrz76ett9APHVnhM w3OA== X-Gm-Message-State: ANhLgQ3lh9UzgN6DmhWurp99qn3jEydY+RiM1QI168p4p9rpZU6TKKCS V4+IsEzCE9TRsBhr+kmcnJAELXm1bBA= X-Google-Smtp-Source: ADFU+vukP0f/foyDhE128pHDbI9Q6bQYnxdbA1Slkn2i5ugrfRNmeUjhgmqVkBu0URC/wPtm98V+sA== X-Received: by 2002:a17:902:be03:: with SMTP id r3mr17654461pls.137.1583789577980; Mon, 09 Mar 2020 14:32:57 -0700 (PDT) Received: from vader.tfbnw.net ([2620:10d:c090:400::5:fe90]) by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.32.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Mar 2020 14:32:57 -0700 (PDT) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 03/15] btrfs: look at full bi_io_vec for repair decision Date: Mon, 9 Mar 2020 14:32:29 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: 
References:
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Omar Sandoval

Read repair does two things: it finds a good copy of data to return to
the reader, and it corrects the bad copy on disk. If a read of multiple
sectors has an I/O error, repair does an extra "validation" step that
issues a separate read for each sector. This allows us to find the exact
failing sectors and only rewrite those.

This heuristic is implemented in
bio_readpage_error()/btrfs_check_repairable() as:

	failed_bio_pages = failed_bio->bi_iter.bi_size >> PAGE_SHIFT;
	if (failed_bio_pages > 1)
		do validation

However, at this point, bi_iter may have already been advanced. This
means that we'll skip the validation step and rewrite the entire failed
read. Fix it by getting the actual size from the biovec (which we can do
because this is only called for non-cloned bios, although that will
change in a later commit).

Fixes: 8a2ee44a371c ("btrfs: look at bi_size for repair decisions")
Signed-off-by: Omar Sandoval
---
 fs/btrfs/extent_io.c | 28 ++++++++++++++++++++++------
 fs/btrfs/extent_io.h | 5 +++--
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 837262d54e28..279731bff0a8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2528,8 +2528,9 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 	return 0;
 }
 
-bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
-			   struct io_failure_record *failrec, int failed_mirror)
+bool btrfs_check_repairable(struct inode *inode, bool need_validation,
+			    struct io_failure_record *failrec,
+			    int failed_mirror)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	int num_copies;
@@ -2552,7 +2553,7 @@ bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
 	 *	a) deliver good data to the caller
 	 *	b) correct the bad sectors on disk
 	 */
-	if (failed_bio_pages > 1) {
+	if (need_validation) {
 		/*
 		 * to fulfill b), we need to know the exact failing sectors, as
 		 * we don't want to rewrite any more than the failed ones. thus,
@@ -2638,11 +2639,13 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	struct inode *inode = page->mapping->host;
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
+	bool need_validation = false;
+	u64 len;
+	int i;
 	struct bio *bio;
 	int read_mode = 0;
 	blk_status_t status;
 	int ret;
-	unsigned failed_bio_pages = failed_bio->bi_iter.bi_size >> PAGE_SHIFT;
 
 	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
 
@@ -2650,13 +2653,26 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	if (ret)
 		return ret;
 
-	if (!btrfs_check_repairable(inode, failed_bio_pages, failrec,
+	/*
+	 * We need to validate each sector individually if the I/O was for
+	 * multiple sectors.
+	 */
+	len = 0;
+	for (i = 0; i < failed_bio->bi_vcnt; i++) {
+		len += failed_bio->bi_io_vec[i].bv_len;
+		if (len > inode->i_sb->s_blocksize) {
+			need_validation = true;
+			break;
+		}
+	}
+
+	if (!btrfs_check_repairable(inode, need_validation, failrec,
 				    failed_mirror)) {
 		free_io_failure(failure_tree, tree, failrec);
 		return -EIO;
 	}
 
-	if (failed_bio_pages > 1)
+	if (need_validation)
 		read_mode |= REQ_FAILFAST_DEV;
 
 	phy_offset >>= inode->i_sb->s_blocksize_bits;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 234622101230..64e176995af2 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -312,8 +312,9 @@ struct io_failure_record {
 };
 
 
-bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
-			    struct io_failure_record *failrec, int fail_mirror);
+bool btrfs_check_repairable(struct inode *inode, bool need_validation,
+			    struct io_failure_record *failrec,
+			    int failed_mirror);
 struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
 				    struct io_failure_record *failrec,
 				    struct
page *page, int pg_offset, int icsum, From patchwork Mon Mar 9 21:32:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428081 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 13CEE139A for ; Mon, 9 Mar 2020 21:33:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E860024649 for ; Mon, 9 Mar 2020 21:33:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="o5jlimRF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726902AbgCIVdC (ORCPT ); Mon, 9 Mar 2020 17:33:02 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:41782 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726742AbgCIVdC (ORCPT ); Mon, 9 Mar 2020 17:33:02 -0400 Received: by mail-pg1-f196.google.com with SMTP id b1so5299995pgm.8 for ; Mon, 09 Mar 2020 14:32:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qxYA4f0TuPQvGE8I8jGJR/1USh/D4L87np7YSieLKvw=; b=o5jlimRF/5SO3LREAA0ogBcStk92LjU2DTqaAHN1ruKrk33PUUNqPDjCiGZ9hl8zGe 2HXw7zMVSbZC7HgMUXG/eRTv5yeGWTdjd/B8N+hz2LoG2jGY3UVHiYwHFM7Mdx0Brxd7 eK31EdmoXPsC5tXAN/5gZIb2ysqTTGaoMHrHumxopUX8Ad3mNH7T7YKph/UuFWR5Hyr0 DHAmMCBJaKexmzx9S3edPr1l42agJEAtSO5Z+gItT7DPFBWz/xe9s5S3BmIjuZqnU1Kb nIbfKV1SMsUkGrVgMRTDrAn2gdN/IrXWQvwHUtVS5s0lVXDyLEkZC+fbYtlf3QdD6EoU MywQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qxYA4f0TuPQvGE8I8jGJR/1USh/D4L87np7YSieLKvw=; b=V5pr6EpjnNA8hTqY67Z8iqDsmXRxvELo/AalSF+RyrDoXIfVAG7VQM7zwUdU4qkUXo wN8Bn1HBk+x/MMMPx/t8oZ64PKPvMCnA3zmH3JauEq4NDjp6yTKPlZoHLDGjQ6Fc9mJF s7Dy2+SSkSzoHU0KjqN5opKLBWnsbQWWzhy16gbh7w5AQ3Zm735TE47ZP4PGTOgPcRYe S6kCTz5CMa5Y+MxQHXNV0VbzYCiuWKHetyktpRxkjTezjQjMU6K0o9T0NyXce4+3LZYO wmpRfYTLU1b18dFY/JETpfzd5D7ALBnY3hVv9s4aZuyrx3Y8YSEkXtNGefSCHPArpZXR rg/A== X-Gm-Message-State: ANhLgQ22vs9w/0tCW/3Vnax1C8OD+Wlh5nYCuwWxxbu2gENjj0gP+0kC fbWP6CQik+gXD3dD7xGyKc+gHRhn2CM= X-Google-Smtp-Source: ADFU+vsS8C9I71toE89a9e1pIehBuDOyFNzJKIrXL40iWn/qEQmyNgLse6qqFccsAwU3ppw1RrEy9g== X-Received: by 2002:a63:d10c:: with SMTP id k12mr17348081pgg.392.1583789578991; Mon, 09 Mar 2020 14:32:58 -0700 (PDT) Received: from vader.tfbnw.net ([2620:10d:c090:400::5:fe90]) by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.32.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Mar 2020 14:32:58 -0700 (PDT) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 04/15] btrfs: don't do repair validation for checksum errors Date: Mon, 9 Mar 2020 14:32:30 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval The purpose of the validation step is to distinguish between good and bad sectors in a failed multi-sector read. If a multi-sector read succeeded but some of those sectors had checksum errors, we don't need to validate anything; we know the sectors with bad checksums need to be repaired. 
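
Illustration only, not part of the patch: a minimal userspace sketch of the resulting decision, assuming the segment lengths of the full bi_io_vec are available as a plain array. need_validation_model() and its parameters are hypothetical names; io_error stands in for bi_status != BLK_STS_OK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function. seg_len[] stands in for the
 * bv_len fields of the failed bio's complete bi_io_vec.
 */
static bool need_validation_model(bool io_error, const uint32_t *seg_len,
				  size_t nr_segs, uint32_t sectorsize)
{
	uint64_t len = 0;
	size_t i;

	/* A checksum failure on a successful read already names the bad sector. */
	if (!io_error)
		return false;

	/* Sum the whole vector; stop once it covers more than one sector. */
	for (i = 0; i < nr_segs; i++) {
		len += seg_len[i];
		if (len > sectorsize)
			return true;
	}
	return false;
}
```

Only a read that both failed with an I/O error and spanned more than one sector asks for the per-sector validation reads.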

Signed-off-by: Omar Sandoval
Reviewed-by: Josef Bacik
---
 fs/btrfs/extent_io.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 279731bff0a8..104374854cf1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2640,8 +2640,6 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
 	bool need_validation = false;
-	u64 len;
-	int i;
 	struct bio *bio;
 	int read_mode = 0;
 	blk_status_t status;
@@ -2654,15 +2652,19 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 		return ret;
 
 	/*
-	 * We need to validate each sector individually if the I/O was for
-	 * multiple sectors.
+	 * If there was an I/O error and the I/O was for multiple sectors, we
+	 * need to validate each sector individually.
 	 */
-	len = 0;
-	for (i = 0; i < failed_bio->bi_vcnt; i++) {
-		len += failed_bio->bi_io_vec[i].bv_len;
-		if (len > inode->i_sb->s_blocksize) {
-			need_validation = true;
-			break;
+	if (failed_bio->bi_status != BLK_STS_OK) {
+		u64 len = 0;
+		int i;
+
+		for (i = 0; i < failed_bio->bi_vcnt; i++) {
+			len += failed_bio->bi_io_vec[i].bv_len;
+			if (len > inode->i_sb->s_blocksize) {
+				need_validation = true;
+				break;
+			}
 		}
 	}
 

From patchwork Mon Mar 9 21:32:31 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 11428083
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 70CD918E8
	for ; Mon, 9 Mar 2020 21:33:04 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 51E0F24654
	for ; Mon, 9 Mar 2020 21:33:04 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass
(2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="DzkIu350" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726954AbgCIVdD (ORCPT ); Mon, 9 Mar 2020 17:33:03 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:46273 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726861AbgCIVdD (ORCPT ); Mon, 9 Mar 2020 17:33:03 -0400 Received: by mail-pl1-f193.google.com with SMTP id w12so4517173pll.13 for ; Mon, 09 Mar 2020 14:33:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KlnSratDuh7Mb0mrIUd1+nwdt40J3jArN9hk+Bj7v9c=; b=DzkIu350t8f/ofqj5930gqwF7wK/krmohkS8Oy6r6l3JgCEUugCFUMZrWKxk5yVjMp wmDEURrsj+CIBw/jn0NxsZBE1h6beJ7ktpRVY3ex0ZWqnIUyeMb7GbdlrrDWE3f2hzcQ cjI/egEDXxjsK7VFAAhU0jIo95etjtVtJo/TeMkicy2ccPvUB8GSVJ6Kxj0ZPvbaWwrx hD0+k3k4TNy3+PDMRljr4kkwibxD+OYwqqXMF6mji9Ghs/YMBtagfsjTvrN/13m28U1D EpePusH48BD3d+WstOJuAt1dV9m3mBuqMgGpulUHuWKYYO25+FViOuVdmebZOtnNXCQP XlPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KlnSratDuh7Mb0mrIUd1+nwdt40J3jArN9hk+Bj7v9c=; b=kNkmrUYvlRST1Tt+d+vKr165L5hlmmXENFNxhXCslkoqHxnqrm2wkOWhS2i9Tln10y dCmqk589p8TvfBybqtL842AaVklTR/rY+FgpllBCo8OhbfDluqgMyzSK5W92Z5+A8pE2 Zzbej8gaB0ZrgUTnHnphFeW6AzExTvjfHKTnyzYpruWkqTcKcdDmo/c7MaiaAVgo/tIU ZpV+5mAimlazhZVrb7DoKTVgGee75Ao/Fbkfu2+GiSaWQ6szZWUdITK9q5jQEVfTgG1w qQy7E24F5mZFBPzZy8m3ggZ+fEESzjk+odZItUYcrcy5XejbsETJtiTGWEdfMV9y6gEK B/pA== X-Gm-Message-State: ANhLgQ0dNfbOk9Wxthy4CQBIOmJuDU8B+U5pwC717h8ZejxP3QA9su3o dbH3IAl0DWtbGyLAWWikckaa/5fKSHE= X-Google-Smtp-Source: 
ADFU+vth9RrrI1G1khrQZNV/VBAr31eFtnS4Uc4v/P2Elb5LoXXdVTSJqS70dH2KjrsqoTq59OdYSA==
X-Received: by 2002:a17:902:a516:: with SMTP id s22mr17266308plq.271.1583789579993;
	Mon, 09 Mar 2020 14:32:59 -0700 (PDT)
Received: from vader.tfbnw.net ([2620:10d:c090:400::5:fe90])
	by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.32.59
	(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
	Mon, 09 Mar 2020 14:32:59 -0700 (PDT)
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, Christoph Hellwig
Subject: [PATCH 05/15] btrfs: clarify btrfs_lookup_bio_sums documentation
Date: Mon, 9 Mar 2020 14:32:31 -0700
Message-Id: <2ee5f090b52dc23569bf94a5a2609dfc49ac4a4b.1583789410.git.osandov@fb.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Omar Sandoval

Fix a couple of issues in the btrfs_lookup_bio_sums documentation:

* The bio doesn't need to be a btrfs_io_bio if dst was provided. Move
  the declaration in the code to make that clear, too.
* dst must be large enough to hold nblocks * csum_size, not just
  csum_size.

Signed-off-by: Omar Sandoval
Reviewed-by: Josef Bacik
---
 fs/btrfs/file-item.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 6c849e8fd5a1..fa9f4a92f74d 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -242,11 +242,13 @@ int btrfs_lookup_file_extent(struct btrfs_trans_handle *trans,
 /**
  * btrfs_lookup_bio_sums - Look up checksums for a bio.
  * @inode: inode that the bio is for.
- * @bio: bio embedded in btrfs_io_bio.
+ * @bio: bio to look up.
  * @offset: Unless (u64)-1, look up checksums for this offset in the file.
  *          If (u64)-1, use the page offsets from the bio instead.
- * @dst: Buffer of size btrfs_super_csum_size() used to return checksum. If
- *       NULL, the checksum is returned in btrfs_io_bio(bio)->csum instead.
+ * @dst: Buffer of size nblocks * btrfs_super_csum_size() used to return
+ *       checksum (nblocks = bio->bi_iter.bi_size / sectorsize). If NULL, the
+ *       checksum buffer is allocated and returned in btrfs_io_bio(bio)->csum
+ *       instead.
  *
  * Return: BLK_STS_RESOURCE if allocating memory fails, BLK_STS_OK otherwise.
  */
@@ -256,7 +258,6 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct bio_vec bvec;
 	struct bvec_iter iter;
-	struct btrfs_io_bio *btrfs_bio = btrfs_io_bio(bio);
 	struct btrfs_csum_item *item = NULL;
 	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
 	struct btrfs_path *path;
@@ -277,6 +278,8 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
 
 	nblocks = bio->bi_iter.bi_size >> inode->i_sb->s_blocksize_bits;
 	if (!dst) {
+		struct btrfs_io_bio *btrfs_bio = btrfs_io_bio(bio);
+
 		if (nblocks * csum_size > BTRFS_BIO_INLINE_CSUM_SIZE) {
 			btrfs_bio->csum = kmalloc_array(nblocks, csum_size,
 							GFP_NOFS);

From patchwork Mon Mar 9 21:32:32 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 11428079
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC74914B7
	for ; Mon, 9 Mar 2020 21:33:03 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id AD6F124654
	for ; Mon, 9 Mar 2020 21:33:03 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (2048-bit key) header.d=osandov-com.20150623.gappssmtp.com header.i=@osandov-com.20150623.gappssmtp.com header.b="MX2IoLP/"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726937AbgCIVdD (ORCPT ); Mon, 9 Mar 2020 17:33:03 -0400
Received: from
mail-pj1-f68.google.com ([209.85.216.68]:34258 "EHLO mail-pj1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726454AbgCIVdC (ORCPT ); Mon, 9 Mar 2020 17:33:02 -0400 Received: by mail-pj1-f68.google.com with SMTP id 39so360644pjo.1 for ; Mon, 09 Mar 2020 14:33:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=aTJ/i6VyAf03sKqOP0yfj+YRFyIZuvyftpW9EALlG5U=; b=MX2IoLP/byRIN5Yo1YW7AcE0A0TLqNoNJ+NcYXkN5DmHTGr1jyVWtViKhVBvW0MxNz +Fo5qmZXGIduPGn8SSiJl+iYokGXnzn1fxl0tw9CCK9t4vDmglfbZvSD9KBcLW4tx5XX /C2w6nPpI2xhqkq3GCvG1IFepf4hTRmTJCgFAYjG8ux3x5JXNbDSzygsnGcD/vNH5pd8 VG/Ghxg8CnH1lxU9LpTa4yZsTff/5/7CgNkO4xauG727mwbAxHAb9KlCqXdbbAI6gVEO EndIBEl+7JwZp7jT6kD8nEiiCPGhj1DPaYMS6DYDm8adDN1EYtWq8z8JEkyiLUTUA8gE hNIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aTJ/i6VyAf03sKqOP0yfj+YRFyIZuvyftpW9EALlG5U=; b=E46FYxRstQW6S6dyFODF52WPYBluaJ3BuYgZWI2YQCMdBV2Wf8/ra3payMi6Z1t+9H om+OFkwJpwmyW6bn0IKCqj8HfzEaSyVTzqkv5mUfsPUi882RpdDHozBQ/SqGj8Z9n2CE M5afjsxej28czG6EssqzFyH2fOLdudenApOB+xeVvQKfHG/Xq9dL/0K5yQOjyD3VNpRc OOv1z8EK3JcbjbdBTDpYpiEsEwWAGxQG0SGRZ3mSUFlLjagrrGVgxxt+H18xOg6O0w8d PNN0A0RQOOeJDNV24ELLn6LqSPlsgcgVmKA0VpCyk00keLfDd04uxMFQ3VrVI62F7YfM rL3Q== X-Gm-Message-State: ANhLgQ0YaAw+RK9QL/hQO3IhHqaIWrhHg1nrSThfrTHSSZIvnJytCXJa YeiTkL1vG+Bvk36oXGUu3a12jvIR3yk= X-Google-Smtp-Source: ADFU+vuznB8ivUMrnqlY6Z4DqNfjyrdhr2G7J/DR6OudYwtf/SqctKa/9ocFE0LR2qFteaQLAMebiA== X-Received: by 2002:a17:902:9f87:: with SMTP id g7mr11022992plq.32.1583789581217; Mon, 09 Mar 2020 14:33:01 -0700 (PDT) Received: from vader.tfbnw.net ([2620:10d:c090:400::5:fe90]) by smtp.gmail.com with ESMTPSA id 13sm44221683pgo.13.2020.03.09.14.33.00 
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Mar 2020 14:33:00 -0700 (PDT) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 06/15] btrfs: rename __readpage_endio_check to check_data_csum Date: Mon, 9 Mar 2020 14:32:32 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval __readpage_endio_check() is also used from the direct I/O read code, so give it a more descriptive name. Signed-off-by: Omar Sandoval Reviewed-by: Johannes Thumshirn Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov --- fs/btrfs/inode.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8e986056be3c..50476ae96552 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2725,10 +2725,9 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, btrfs_queue_work(wq, &ordered_extent->work); } -static int __readpage_endio_check(struct inode *inode, - struct btrfs_io_bio *io_bio, - int icsum, struct page *page, - int pgoff, u64 start, size_t len) +static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, + int icsum, struct page *page, int pgoff, u64 start, + size_t len) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); @@ -2789,8 +2788,8 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio, } phy_offset >>= inode->i_sb->s_blocksize_bits; - return __readpage_endio_check(inode, io_bio, phy_offset, page, offset, - start, (size_t)(end - start + 1)); + return check_data_csum(inode, io_bio, phy_offset, page, offset, start, + (size_t)(end - start + 1)); } /* @@ -7593,9 +7592,9 @@ static void btrfs_retry_endio(struct bio *bio) ASSERT(!bio_flagged(bio, BIO_CLONED)); 
bio_for_each_segment_all(bvec, bio, iter_all) { - ret = __readpage_endio_check(inode, io_bio, i, bvec->bv_page, - bvec->bv_offset, done->start, - bvec->bv_len); + ret = check_data_csum(inode, io_bio, i, bvec->bv_page, + bvec->bv_offset, done->start, + bvec->bv_len); if (!ret) clean_io_failure(BTRFS_I(inode)->root->fs_info, failure_tree, io_tree, done->start, @@ -7645,8 +7644,9 @@ static blk_status_t __btrfs_subio_endio_read(struct inode *inode, next_block: if (uptodate) { csum_pos = BTRFS_BYTES_TO_BLKS(fs_info, offset); - ret = __readpage_endio_check(inode, io_bio, csum_pos, - bvec.bv_page, pgoff, start, sectorsize); + ret = check_data_csum(inode, io_bio, csum_pos, + bvec.bv_page, pgoff, start, + sectorsize); if (likely(!ret)) goto next; }
From patchwork Mon Mar 9 21:32:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428085 Return-Path: From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 07/15] btrfs: make btrfs_check_repairable() static Date: Mon, 9 Mar 2020 14:32:33 -0700 Message-Id: <1ba159f3930fca7d11350f798ba140e1a2176358.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Since its introduction in commit 2fe6303e7cd0 ("Btrfs: split bio_readpage_error into several functions"), btrfs_check_repairable() has only been used from extent_io.c where it is defined. Signed-off-by: Omar Sandoval Reviewed-by: Johannes Thumshirn Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 7 ++++--- fs/btrfs/extent_io.h | 3 --- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 104374854cf1..aee35d431f91 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2528,9 +2528,10 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end, return 0; } -bool btrfs_check_repairable(struct inode *inode, bool need_validation, - struct io_failure_record *failrec, - int failed_mirror) +static bool btrfs_check_repairable(struct inode *inode, + bool need_validation, + struct io_failure_record *failrec, + int failed_mirror) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); int num_copies; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 64e176995af2..11341a430007 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -312,9 +312,6 @@ struct io_failure_record { }; -bool btrfs_check_repairable(struct inode *inode, bool need_validation, - struct io_failure_record *failrec, - int failed_mirror); struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio, struct io_failure_record *failrec, struct page *page, int pg_offset, int icsum,
From patchwork Mon Mar 9 21:32:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428087 Return-Path:
From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 08/15] btrfs: move btrfs_dio_private to inode.c Date: Mon, 9 Mar 2020 14:32:34 -0700 Message-Id: <7cb31cf9673d1d232e770145924ef779d3681058.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval This hasn't been needed outside of inode.c since commit 23ea8e5a0767 ("Btrfs: load checksum data once when submitting a direct read io").
Signed-off-by: Omar Sandoval --- fs/btrfs/btrfs_inode.h | 30 ------------------------------ fs/btrfs/inode.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 27a1fefce508..ade5c6adec06 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -293,36 +293,6 @@ static inline int btrfs_inode_in_log(struct btrfs_inode *inode, u64 generation) return ret; } -#define BTRFS_DIO_ORIG_BIO_SUBMITTED 0x1 - -struct btrfs_dio_private { - struct inode *inode; - unsigned long flags; - u64 logical_offset; - u64 disk_bytenr; - u64 bytes; - void *private; - - /* number of bios pending for this dio */ - atomic_t pending_bios; - - /* IO errors */ - int errors; - - /* orig_bio is our btrfs_io_bio */ - struct bio *orig_bio; - - /* dio_bio came from fs/direct-io.c */ - struct bio *dio_bio; - - /* - * The original bio may be split to several sub-bios, this is - * done during endio of sub-bios - */ - blk_status_t (*subio_endio)(struct inode *, struct btrfs_io_bio *, - blk_status_t); -}; - /* * Disable DIO read nolock optimization, so new dio readers will be forced * to grab i_mutex. 
It is used to avoid the endless truncate due to diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 50476ae96552..9d3a275ef253 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -54,6 +54,36 @@ struct btrfs_iget_args { struct btrfs_root *root; }; +#define BTRFS_DIO_ORIG_BIO_SUBMITTED 0x1 + +struct btrfs_dio_private { + struct inode *inode; + unsigned long flags; + u64 logical_offset; + u64 disk_bytenr; + u64 bytes; + void *private; + + /* number of bios pending for this dio */ + atomic_t pending_bios; + + /* IO errors */ + int errors; + + /* orig_bio is our btrfs_io_bio */ + struct bio *orig_bio; + + /* dio_bio came from fs/direct-io.c */ + struct bio *dio_bio; + + /* + * The original bio may be split to several sub-bios, this is + * done during endio of sub-bios + */ + blk_status_t (*subio_endio)(struct inode *, struct btrfs_io_bio *, + blk_status_t); +}; + struct btrfs_dio_data { u64 reserve; u64 unsubmitted_oe_range_start;
From patchwork Mon Mar 9 21:32:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428091 Return-Path: From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 09/15] btrfs: kill btrfs_dio_private->private Date: Mon, 9 Mar 2020 14:32:35 -0700 Message-Id: <432c19b74bb13191a04550b630d2db1f998ba3be.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval We haven't used this since commit 9be3395bcd4a ("Btrfs: use a btrfs bioset instead of abusing bio internals"). Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov --- fs/btrfs/inode.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 9d3a275ef253..8cc8741b3fec 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -62,7 +62,6 @@ struct btrfs_dio_private { u64 logical_offset; u64 disk_bytenr; u64 bytes; - void *private; /* number of bios pending for this dio */ atomic_t pending_bios; @@ -8068,7 +8067,6 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, return; } - dip->private = dio_bio->bi_private; dip->inode = inode; dip->logical_offset = file_offset; dip->bytes = dio_bio->bi_iter.bi_size;
From patchwork Mon Mar 9 21:32:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428089 Return-Path: From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 10/15] btrfs: convert btrfs_dio_private->pending_bios to refcount_t Date: Mon, 9 Mar 2020 14:32:36 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval This is really a reference count now, so convert it to refcount_t and rename it to refs. Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov --- fs/btrfs/inode.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8cc8741b3fec..a7fb0ba8cde4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -63,8 +63,11 @@ struct btrfs_dio_private { u64 disk_bytenr; u64 bytes; - /* number of bios pending for this dio */ - atomic_t pending_bios; + /* + * References to this structure. There is one reference per in-flight + * bio plus one while we're still setting up. + */ + refcount_t refs; /* IO errors */ int errors; @@ -7849,7 +7852,7 @@ static void btrfs_end_dio_bio(struct bio *bio) } /* if there are more bios still pending for this dio, just exit */ - if (!atomic_dec_and_test(&dip->pending_bios)) + if (!refcount_dec_and_test(&dip->refs)) goto out; if (dip->errors) { @@ -8001,13 +8004,13 @@ static void btrfs_submit_direct_hook(struct btrfs_dio_private *dip) * count. Otherwise, the dip might get freed before we're * done setting it up.
*/ - atomic_inc(&dip->pending_bios); + refcount_inc(&dip->refs); status = btrfs_submit_dio_bio(bio, inode, file_offset, async_submit); if (status) { bio_put(bio); - atomic_dec(&dip->pending_bios); + refcount_dec(&dip->refs); goto out_err; } @@ -8036,7 +8039,7 @@ static void btrfs_submit_direct_hook(struct btrfs_dio_private *dip) * atomic operations with a return value are fully ordered as per * atomic_t.txt */ - if (atomic_dec_and_test(&dip->pending_bios)) + if (refcount_dec_and_test(&dip->refs)) bio_io_error(dip->orig_bio); } @@ -8074,7 +8077,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, bio->bi_private = dip; dip->orig_bio = bio; dip->dio_bio = dio_bio; - atomic_set(&dip->pending_bios, 1); + refcount_set(&dip->refs, 1); io_bio = btrfs_io_bio(bio); io_bio->logical = file_offset;
From patchwork Mon Mar 9 21:32:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428093 Return-Path: From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 11/15] btrfs: put direct I/O checksums in btrfs_dio_private instead of bio
Date: Mon, 9 Mar 2020 14:32:37 -0700 Message-Id: <95b275ed47f1e4bdaba53040fe6de9eefdf3a5fd.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval The next commit will get rid of btrfs_dio_private->orig_bio. The only thing we really need it for is containing all of the checksums, but we can easily put those in btrfs_dio_private and get rid of the awkward logic that looks up the checksums for orig_bio when the first split bio is submitted. (Interestingly, btrfs_dio_private did contain the checksums before commit 23ea8e5a0767 ("Btrfs: load checksum data once when submitting a direct read io"), but it didn't look them up up front.) Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov --- fs/btrfs/inode.c | 79 ++++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 40 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a7fb0ba8cde4..4a2e44f3e66e 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -84,6 +84,9 @@ struct btrfs_dio_private { */ blk_status_t (*subio_endio)(struct inode *, struct btrfs_io_bio *, blk_status_t); + + /* Checksums. 
*/ + u8 sums[]; }; struct btrfs_dio_data { @@ -7753,7 +7756,6 @@ static void btrfs_endio_direct_read(struct bio *bio) dio_bio->bi_status = err; dio_end_io(dio_bio); - btrfs_io_bio_free_csum(io_bio); bio_put(bio); } @@ -7865,39 +7867,6 @@ static void btrfs_end_dio_bio(struct bio *bio) bio_put(bio); } -static inline blk_status_t btrfs_lookup_and_bind_dio_csum(struct inode *inode, - struct btrfs_dio_private *dip, - struct bio *bio, - u64 file_offset) -{ - struct btrfs_io_bio *io_bio = btrfs_io_bio(bio); - struct btrfs_io_bio *orig_io_bio = btrfs_io_bio(dip->orig_bio); - u16 csum_size; - blk_status_t ret; - - /* - * We load all the csum data we need when we submit - * the first bio to reduce the csum tree search and - * contention. - */ - if (dip->logical_offset == file_offset) { - ret = btrfs_lookup_bio_sums(inode, dip->orig_bio, file_offset, - NULL); - if (ret) - return ret; - } - - if (bio == dip->orig_bio) - return 0; - - file_offset -= dip->logical_offset; - file_offset >>= inode->i_sb->s_blocksize_bits; - csum_size = btrfs_super_csum_size(btrfs_sb(inode->i_sb)->super_copy); - io_bio->csum = orig_io_bio->csum + csum_size * file_offset; - - return 0; -} - static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, u64 file_offset, int async_submit) { @@ -7933,10 +7902,12 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, if (ret) goto err; } else { - ret = btrfs_lookup_and_bind_dio_csum(inode, dip, bio, - file_offset); - if (ret) - goto err; + u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); + size_t csum_offset; + + csum_offset = ((file_offset - dip->logical_offset) >> + inode->i_sb->s_blocksize_bits) * csum_size; + btrfs_io_bio(bio)->csum = dip->sums + csum_offset; } map: ret = btrfs_map_bio(fs_info, bio, 0); @@ -8047,13 +8018,25 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, loff_t file_offset) { struct btrfs_dio_private *dip = NULL; + size_t dip_size; struct bio *bio = NULL; 
struct btrfs_io_bio *io_bio; bool write = (bio_op(dio_bio) == REQ_OP_WRITE); + const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM); bio = btrfs_bio_clone(dio_bio); - dip = kzalloc(sizeof(*dip), GFP_NOFS); + dip_size = sizeof(*dip); + if (!write && csum) { + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); + size_t nblocks = (dio_bio->bi_iter.bi_size >> + inode->i_sb->s_blocksize_bits); + + dip_size += csum_size * nblocks; + } + + dip = kzalloc(dip_size, GFP_NOFS); if (!dip) { if (!write) { unlock_extent(&BTRFS_I(inode)->io_tree, file_offset, @@ -8093,11 +8076,27 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dip->bytes; dio_data->unsubmitted_oe_range_start = dio_data->unsubmitted_oe_range_end; - bio->bi_end_io = btrfs_endio_direct_write; } else { bio->bi_end_io = btrfs_endio_direct_read; dip->subio_endio = btrfs_subio_endio_read; + + if (csum) { + blk_status_t status; + + /* + * Load the csums up front to reduce csum tree searches + * and contention when submitting bios. 
+ */ + status = btrfs_lookup_bio_sums(inode, dio_bio, + file_offset, dip->sums); + if (status != BLK_STS_OK) { + dip->errors = 1; + if (refcount_dec_and_test(&dip->refs)) + bio_io_error(dip->orig_bio); + return; + } + } } btrfs_submit_direct_hook(dip); From patchwork Mon Mar 9 21:32:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428095
From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 12/15] btrfs: get rid of one layer of bios in direct I/O Date: Mon, 9 Mar 2020 14:32:38 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval In the worst case, there are _4_ layers of bios in the Btrfs direct I/O path: 1. The bio created by the generic direct I/O code (dio_bio). 2. A clone of dio_bio we create in btrfs_submit_direct() to represent the entire direct I/O range (orig_bio). 3. 
A partial clone of orig_bio limited to the size of a RAID stripe that we create in btrfs_submit_direct_hook(). 4. Clones of each of those split bios for each RAID stripe that we create in btrfs_map_bio(). As of the previous commit, the second layer (orig_bio) is no longer needed for anything: we can split dio_bio instead, and complete dio_bio directly when all of the cloned bios complete. This lets us clean up a bunch of cruft, including dip->subio_endio and dip->errors (we can use dio_bio->bi_status instead). It also enables the next big cleanup of direct I/O read repair. Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik --- fs/btrfs/inode.c | 213 +++++++++++++++-------------------------------- 1 file changed, 66 insertions(+), 147 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4a2e44f3e66e..40c1562704e9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -54,11 +54,8 @@ struct btrfs_iget_args { struct btrfs_root *root; }; -#define BTRFS_DIO_ORIG_BIO_SUBMITTED 0x1 - struct btrfs_dio_private { struct inode *inode; - unsigned long flags; u64 logical_offset; u64 disk_bytenr; u64 bytes; @@ -69,22 +66,9 @@ struct btrfs_dio_private { */ refcount_t refs; - /* IO errors */ - int errors; - - /* orig_bio is our btrfs_io_bio */ - struct bio *orig_bio; - /* dio_bio came from fs/direct-io.c */ struct bio *dio_bio; - /* - * The original bio may be split to several sub-bios, this is - * done during endio of sub-bios - */ - blk_status_t (*subio_endio)(struct inode *, struct btrfs_io_bio *, - blk_status_t); - /* Checksums. */ u8 sums[]; }; @@ -7400,6 +7384,29 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, return ret; } +static void btrfs_dio_private_put(struct btrfs_dio_private *dip) +{ + /* + * This implies a barrier so that stores to dio_bio->bi_status before + * this and the following load are fully ordered. 
+ */ + if (!refcount_dec_and_test(&dip->refs)) + return; + + if (bio_op(dip->dio_bio) == REQ_OP_WRITE) { + __endio_write_update_ordered(dip->inode, dip->logical_offset, + dip->bytes, + !dip->dio_bio->bi_status); + } else { + unlock_extent(&BTRFS_I(dip->inode)->io_tree, + dip->logical_offset, + dip->logical_offset + dip->bytes - 1); + } + + dio_end_io(dip->dio_bio); + kfree(dip); +} + static inline blk_status_t submit_dio_repair_bio(struct inode *inode, struct bio *bio, int mirror_num) @@ -7722,8 +7729,9 @@ static blk_status_t __btrfs_subio_endio_read(struct inode *inode, return err; } -static blk_status_t btrfs_subio_endio_read(struct inode *inode, - struct btrfs_io_bio *io_bio, blk_status_t err) +static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, + struct btrfs_io_bio *io_bio, + blk_status_t err) { bool skip_csum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM; @@ -7737,28 +7745,6 @@ static blk_status_t btrfs_subio_endio_read(struct inode *inode, } } -static void btrfs_endio_direct_read(struct bio *bio) -{ - struct btrfs_dio_private *dip = bio->bi_private; - struct inode *inode = dip->inode; - struct bio *dio_bio; - struct btrfs_io_bio *io_bio = btrfs_io_bio(bio); - blk_status_t err = bio->bi_status; - - if (dip->flags & BTRFS_DIO_ORIG_BIO_SUBMITTED) - err = btrfs_subio_endio_read(inode, io_bio, err); - - unlock_extent(&BTRFS_I(inode)->io_tree, dip->logical_offset, - dip->logical_offset + dip->bytes - 1); - dio_bio = dip->dio_bio; - - kfree(dip); - - dio_bio->bi_status = err; - dio_end_io(dio_bio); - bio_put(bio); -} - static void __endio_write_update_ordered(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) @@ -7802,21 +7788,6 @@ static void __endio_write_update_ordered(struct inode *inode, } } -static void btrfs_endio_direct_write(struct bio *bio) -{ - struct btrfs_dio_private *dip = bio->bi_private; - struct bio *dio_bio = dip->dio_bio; - - __endio_write_update_ordered(dip->inode, dip->logical_offset, - dip->bytes, 
!bio->bi_status); - - kfree(dip); - - dio_bio->bi_status = bio->bi_status; - dio_end_io(dio_bio); - bio_put(bio); -} - static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, struct bio *bio, u64 offset) { @@ -7840,31 +7811,16 @@ static void btrfs_end_dio_bio(struct bio *bio) (unsigned long long)bio->bi_iter.bi_sector, bio->bi_iter.bi_size, err); - if (dip->subio_endio) - err = dip->subio_endio(dip->inode, btrfs_io_bio(bio), err); - - if (err) { - /* - * We want to perceive the errors flag being set before - * decrementing the reference count. We don't need a barrier - * since atomic operations with a return value are fully - * ordered as per atomic_t.txt - */ - dip->errors = 1; + if (bio_op(bio) == REQ_OP_READ) { + err = btrfs_check_read_dio_bio(dip->inode, btrfs_io_bio(bio), + err); } - /* if there are more bios still pending for this dio, just exit */ - if (!refcount_dec_and_test(&dip->refs)) - goto out; + if (err) + dip->dio_bio->bi_status = err; - if (dip->errors) { - bio_io_error(dip->orig_bio); - } else { - dip->dio_bio->bi_status = BLK_STS_OK; - bio_endio(dip->orig_bio); - } -out: bio_put(bio); + btrfs_dio_private_put(dip); } static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, @@ -7920,98 +7876,77 @@ static void btrfs_submit_direct_hook(struct btrfs_dio_private *dip) struct inode *inode = dip->inode; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct bio *bio; - struct bio *orig_bio = dip->orig_bio; - u64 start_sector = orig_bio->bi_iter.bi_sector; + struct bio *dio_bio = dip->dio_bio; + u64 start_sector = dio_bio->bi_iter.bi_sector; u64 file_offset = dip->logical_offset; int async_submit = 0; - u64 submit_len; + u64 submit_len = dio_bio->bi_iter.bi_size; int clone_offset = 0; int clone_len; int ret; blk_status_t status; struct btrfs_io_geometry geom; - submit_len = orig_bio->bi_iter.bi_size; - ret = btrfs_get_io_geometry(fs_info, btrfs_op(orig_bio), - start_sector << 9, submit_len, &geom); - if (ret) - goto out_err; 
- - if (geom.len >= submit_len) { - bio = orig_bio; - dip->flags |= BTRFS_DIO_ORIG_BIO_SUBMITTED; - goto submit; - } - /* async crcs make it difficult to collect full stripe writes. */ if (btrfs_data_alloc_profile(fs_info) & BTRFS_BLOCK_GROUP_RAID56_MASK) async_submit = 0; else async_submit = 1; - /* bio split */ ASSERT(geom.len <= INT_MAX); do { + ret = btrfs_get_io_geometry(fs_info, btrfs_op(dio_bio), + start_sector << 9, submit_len, + &geom); + if (ret) { + status = errno_to_blk_status(ret); + goto out_err; + } + clone_len = min_t(int, submit_len, geom.len); /* * This will never fail as it's passing GPF_NOFS and * the allocation is backed by btrfs_bioset. */ - bio = btrfs_bio_clone_partial(orig_bio, clone_offset, - clone_len); + bio = btrfs_bio_clone_partial(dio_bio, clone_offset, clone_len); bio->bi_private = dip; bio->bi_end_io = btrfs_end_dio_bio; btrfs_io_bio(bio)->logical = file_offset; ASSERT(submit_len >= clone_len); submit_len -= clone_len; - if (submit_len == 0) - break; /* * Increase the count before we submit the bio so we know * the end IO handler won't happen before we increase the * count. Otherwise, the dip might get freed before we're * done setting it up. + * + * We transfer the initial reference to the last bio, so we + * don't need to increment the reference count for the last one. 
*/ - refcount_inc(&dip->refs); + if (submit_len > 0) + refcount_inc(&dip->refs); status = btrfs_submit_dio_bio(bio, inode, file_offset, async_submit); if (status) { bio_put(bio); - refcount_dec(&dip->refs); + if (submit_len > 0) + refcount_dec(&dip->refs); goto out_err; } clone_offset += clone_len; start_sector += clone_len >> 9; file_offset += clone_len; - - ret = btrfs_get_io_geometry(fs_info, btrfs_op(orig_bio), - start_sector << 9, submit_len, &geom); - if (ret) - goto out_err; } while (submit_len > 0); + return; -submit: - status = btrfs_submit_dio_bio(bio, inode, file_offset, async_submit); - if (!status) - return; - - if (bio != orig_bio) - bio_put(bio); out_err: - dip->errors = 1; - /* - * Before atomic variable goto zero, we must make sure dip->errors is - * perceived to be set. This ordering is ensured by the fact that an - * atomic operations with a return value are fully ordered as per - * atomic_t.txt - */ - if (refcount_dec_and_test(&dip->refs)) - bio_io_error(dip->orig_bio); + dip->dio_bio->bi_status = status; + btrfs_dio_private_put(dip); } static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, @@ -8019,13 +7954,9 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, { struct btrfs_dio_private *dip = NULL; size_t dip_size; - struct bio *bio = NULL; - struct btrfs_io_bio *io_bio; bool write = (bio_op(dio_bio) == REQ_OP_WRITE); const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM); - bio = btrfs_bio_clone(dio_bio); - dip_size = sizeof(*dip); if (!write && csum) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); @@ -8049,7 +7980,6 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, * bio_endio()/bio_io_error() against dio_bio. 
*/ dio_end_io(dio_bio); - bio_put(bio); return; } @@ -8057,12 +7987,8 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dip->logical_offset = file_offset; dip->bytes = dio_bio->bi_iter.bi_size; dip->disk_bytenr = (u64)dio_bio->bi_iter.bi_sector << 9; - bio->bi_private = dip; - dip->orig_bio = bio; dip->dio_bio = dio_bio; refcount_set(&dip->refs, 1); - io_bio = btrfs_io_bio(bio); - io_bio->logical = file_offset; if (write) { /* @@ -8076,26 +8002,19 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dip->bytes; dio_data->unsubmitted_oe_range_start = dio_data->unsubmitted_oe_range_end; - bio->bi_end_io = btrfs_endio_direct_write; - } else { - bio->bi_end_io = btrfs_endio_direct_read; - dip->subio_endio = btrfs_subio_endio_read; + } else if (csum) { + blk_status_t status; - if (csum) { - blk_status_t status; - - /* - * Load the csums up front to reduce csum tree searches - * and contention when submitting bios. - */ - status = btrfs_lookup_bio_sums(inode, dio_bio, - file_offset, dip->sums); - if (status != BLK_STS_OK) { - dip->errors = 1; - if (refcount_dec_and_test(&dip->refs)) - bio_io_error(dip->orig_bio); - return; - } + /* + * Load the csums up front to reduce csum tree searches and + * contention when submitting bios. 
+ */ + status = btrfs_lookup_bio_sums(inode, dio_bio, file_offset, + dip->sums); + if (status != BLK_STS_OK) { + dip->dio_bio->bi_status = status; + btrfs_dio_private_put(dip); + return; } } From patchwork Mon Mar 9 21:32:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428097
From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 13/15] btrfs: simplify direct I/O read repair Date: Mon, 9 Mar 2020 14:32:39 -0700 Message-Id: <38cea444fa3f88ca514d161bd979d004c254e969.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Direct I/O read repair is an over-complicated mess.
There is major code duplication between __btrfs_subio_endio_read() (checks checksums and handles I/O errors for files with checksums), __btrfs_correct_data_nocsum() (handles I/O errors for files without checksums), btrfs_retry_endio() (checks checksums and handles I/O errors for retries of files with checksums), and btrfs_retry_endio_nocsum() (handles I/O errors for retries of files without checksums). If it sounds like these should be one function, that's because they should. After the previous commit getting rid of orig_bio, we can reuse the same endio callback for repair I/O and the original I/O; we just need to track the file offset and original iterator in the repair bio. We can also unify the handling of files with and without checksums and replace the atrocity that was probably the inspiration for "Go To Statement Considered Harmful" with normal loops. We also no longer have to wait for each repair I/O to complete one by one. Signed-off-by: Omar Sandoval Reviewed-by: Josef Bacik --- fs/btrfs/extent_io.c | 2 + fs/btrfs/inode.c | 268 +++++++------------------------------------ 2 files changed, 44 insertions(+), 226 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index aee35d431f91..fad86ef4d09d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2622,6 +2622,8 @@ struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio, } bio_add_page(bio, page, failrec->len, pg_offset); + btrfs_io_bio(bio)->logical = failrec->start; + btrfs_io_bio(bio)->iter = bio->bi_iter; return bio; } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 40c1562704e9..ef302b7c6c2d 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7463,19 +7463,17 @@ static int btrfs_check_dio_repairable(struct inode *inode, static blk_status_t dio_read_error(struct inode *inode, struct bio *failed_bio, struct page *page, unsigned int pgoff, - u64 start, u64 end, int failed_mirror, - bio_end_io_t *repair_endio, void *repair_arg) + u64 start, 
u64 end, int failed_mirror) { + struct btrfs_dio_private *dip = failed_bio->bi_private; struct io_failure_record *failrec; struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree; struct bio *bio; int isector; unsigned int read_mode = 0; - int segs; int ret; blk_status_t status; - struct bio_vec bvec; BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE); @@ -7490,261 +7488,79 @@ static blk_status_t dio_read_error(struct inode *inode, struct bio *failed_bio, return BLK_STS_IOERR; } - segs = bio_segments(failed_bio); - bio_get_first_bvec(failed_bio, &bvec); - if (segs > 1 || - (bvec.bv_len > btrfs_inode_sectorsize(inode))) + if (btrfs_io_bio(failed_bio)->iter.bi_size > inode->i_sb->s_blocksize) read_mode |= REQ_FAILFAST_DEV; isector = start - btrfs_io_bio(failed_bio)->logical; isector >>= inode->i_sb->s_blocksize_bits; - bio = btrfs_create_repair_bio(inode, failed_bio, failrec, page, - pgoff, isector, repair_endio, repair_arg); + bio = btrfs_create_repair_bio(inode, failed_bio, failrec, page, pgoff, + isector, failed_bio->bi_end_io, dip); bio->bi_opf = REQ_OP_READ | read_mode; btrfs_debug(BTRFS_I(inode)->root->fs_info, "repair DIO read error: submitting new dio read[%#x] to this_mirror=%d, in_validation=%d", read_mode, failrec->this_mirror, failrec->in_validation); + refcount_inc(&dip->refs); status = submit_dio_repair_bio(inode, bio, failrec->this_mirror); if (status) { free_io_failure(failure_tree, io_tree, failrec); bio_put(bio); + refcount_dec(&dip->refs); } return status; } -struct btrfs_retry_complete { - struct completion done; - struct inode *inode; - u64 start; - int uptodate; -}; - -static void btrfs_retry_endio_nocsum(struct bio *bio) -{ - struct btrfs_retry_complete *done = bio->bi_private; - struct inode *inode = done->inode; - struct bio_vec *bvec; - struct extent_io_tree *io_tree, *failure_tree; - struct bvec_iter_all iter_all; - - if (bio->bi_status) - goto end; - - ASSERT(bio->bi_vcnt == 
1); - io_tree = &BTRFS_I(inode)->io_tree; - failure_tree = &BTRFS_I(inode)->io_failure_tree; - ASSERT(bio_first_bvec_all(bio)->bv_len == btrfs_inode_sectorsize(inode)); - - done->uptodate = 1; - ASSERT(!bio_flagged(bio, BIO_CLONED)); - bio_for_each_segment_all(bvec, bio, iter_all) - clean_io_failure(BTRFS_I(inode)->root->fs_info, failure_tree, - io_tree, done->start, bvec->bv_page, - btrfs_ino(BTRFS_I(inode)), 0); -end: - complete(&done->done); - bio_put(bio); -} - -static blk_status_t __btrfs_correct_data_nocsum(struct inode *inode, - struct btrfs_io_bio *io_bio) +static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, + struct btrfs_io_bio *io_bio, + const bool uptodate) { - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; + u32 sectorsize = fs_info->sectorsize; + struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree; + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM); struct bio_vec bvec; struct bvec_iter iter; - struct btrfs_retry_complete done; - u64 start; - unsigned int pgoff; - u32 sectorsize; - int nr_sectors; - blk_status_t ret; + u64 start = io_bio->logical; + int icsum = 0; blk_status_t err = BLK_STS_OK; - fs_info = BTRFS_I(inode)->root->fs_info; - sectorsize = fs_info->sectorsize; - - start = io_bio->logical; - done.inode = inode; - io_bio->bio.bi_iter = io_bio->iter; + __bio_for_each_segment(bvec, &io_bio->bio, iter, io_bio->iter) { + unsigned int i, nr_sectors, pgoff; - bio_for_each_segment(bvec, &io_bio->bio, iter) { nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info, bvec.bv_len); pgoff = bvec.bv_offset; - -next_block_or_try_again: - done.uptodate = 0; - done.start = start; - init_completion(&done.done); - - ret = dio_read_error(inode, &io_bio->bio, bvec.bv_page, - pgoff, start, start + sectorsize - 1, - io_bio->mirror_num, - btrfs_retry_endio_nocsum, &done); - if (ret) { - err = ret; - goto next; - } - - 
wait_for_completion_io(&done.done); - - if (!done.uptodate) { - /* We might have another mirror, so try again */ - goto next_block_or_try_again; - } - -next: - start += sectorsize; - - nr_sectors--; - if (nr_sectors) { - pgoff += sectorsize; - ASSERT(pgoff < PAGE_SIZE); - goto next_block_or_try_again; - } - } - - return err; -} - -static void btrfs_retry_endio(struct bio *bio) -{ - struct btrfs_retry_complete *done = bio->bi_private; - struct btrfs_io_bio *io_bio = btrfs_io_bio(bio); - struct extent_io_tree *io_tree, *failure_tree; - struct inode *inode = done->inode; - struct bio_vec *bvec; - int uptodate; - int ret; - int i = 0; - struct bvec_iter_all iter_all; - - if (bio->bi_status) - goto end; - - uptodate = 1; - - ASSERT(bio->bi_vcnt == 1); - ASSERT(bio_first_bvec_all(bio)->bv_len == btrfs_inode_sectorsize(done->inode)); - - io_tree = &BTRFS_I(inode)->io_tree; - failure_tree = &BTRFS_I(inode)->io_failure_tree; - - ASSERT(!bio_flagged(bio, BIO_CLONED)); - bio_for_each_segment_all(bvec, bio, iter_all) { - ret = check_data_csum(inode, io_bio, i, bvec->bv_page, - bvec->bv_offset, done->start, - bvec->bv_len); - if (!ret) - clean_io_failure(BTRFS_I(inode)->root->fs_info, - failure_tree, io_tree, done->start, - bvec->bv_page, - btrfs_ino(BTRFS_I(inode)), - bvec->bv_offset); - else - uptodate = 0; - i++; - } - - done->uptodate = uptodate; -end: - complete(&done->done); - bio_put(bio); -} - -static blk_status_t __btrfs_subio_endio_read(struct inode *inode, - struct btrfs_io_bio *io_bio, blk_status_t err) -{ - struct btrfs_fs_info *fs_info; - struct bio_vec bvec; - struct bvec_iter iter; - struct btrfs_retry_complete done; - u64 start; - u64 offset = 0; - u32 sectorsize; - int nr_sectors; - unsigned int pgoff; - int csum_pos; - bool uptodate = (err == 0); - int ret; - blk_status_t status; - - fs_info = BTRFS_I(inode)->root->fs_info; - sectorsize = fs_info->sectorsize; - - err = BLK_STS_OK; - start = io_bio->logical; - done.inode = inode; - io_bio->bio.bi_iter = 
io_bio->iter; - - bio_for_each_segment(bvec, &io_bio->bio, iter) { - nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info, bvec.bv_len); - - pgoff = bvec.bv_offset; -next_block: - if (uptodate) { - csum_pos = BTRFS_BYTES_TO_BLKS(fs_info, offset); - ret = check_data_csum(inode, io_bio, csum_pos, - bvec.bv_page, pgoff, start, - sectorsize); - if (likely(!ret)) - goto next; - } -try_again: - done.uptodate = 0; - done.start = start; - init_completion(&done.done); - - status = dio_read_error(inode, &io_bio->bio, bvec.bv_page, - pgoff, start, start + sectorsize - 1, - io_bio->mirror_num, btrfs_retry_endio, - &done); - if (status) { - err = status; - goto next; - } - - wait_for_completion_io(&done.done); - - if (!done.uptodate) { - /* We might have another mirror, so try again */ - goto try_again; - } -next: - offset += sectorsize; - start += sectorsize; - - ASSERT(nr_sectors); - - nr_sectors--; - if (nr_sectors) { + for (i = 0; i < nr_sectors; i++) { + if (uptodate && + (!csum || !check_data_csum(inode, io_bio, icsum, + bvec.bv_page, pgoff, + start, sectorsize))) { + clean_io_failure(fs_info, failure_tree, io_tree, + start, bvec.bv_page, + btrfs_ino(BTRFS_I(inode)), + pgoff); + } else { + blk_status_t status; + + status = dio_read_error(inode, &io_bio->bio, + bvec.bv_page, pgoff, + start, + start + sectorsize - 1, + io_bio->mirror_num); + if (status) + err = status; + } + start += sectorsize; + icsum++; pgoff += sectorsize; ASSERT(pgoff < PAGE_SIZE); - goto next_block; } } - return err; } -static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, - struct btrfs_io_bio *io_bio, - blk_status_t err) -{ - bool skip_csum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM; - - if (skip_csum) { - if (unlikely(err)) - return __btrfs_correct_data_nocsum(inode, io_bio); - else - return BLK_STS_OK; - } else { - return __btrfs_subio_endio_read(inode, io_bio, err); - } -} - static void __endio_write_update_ordered(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) 
@@ -7813,7 +7629,7 @@ static void btrfs_end_dio_bio(struct bio *bio) if (bio_op(bio) == REQ_OP_READ) { err = btrfs_check_read_dio_bio(dip->inode, btrfs_io_bio(bio), - err); + !err); } if (err) From patchwork Mon Mar 9 21:32:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 11428099
From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com, Christoph Hellwig Subject: [PATCH 14/15] btrfs: get rid of endio_repair_workers Date: Mon, 9 Mar 2020 14:32:40 -0700 Message-Id: <222e3f12f3a9130ec95d0c52be44b497989f8370.1583789410.git.osandov@fb.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval This was originally added in commit 8b110e393c5a ("Btrfs: implement repair function when direct read fails") because the original bio waited for the repair bio to complete, so the repair I/O couldn't go through the same workqueue. As of the previous commit, this is no longer true, so this separate workqueue is unnecessary.
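The reasoning above can be illustrated with a toy userspace model. This is a sketch, not kernel code: it assumes a pool with a fixed number of workers and that every failed read needs exactly one repair completion processed on that same pool. The function names are hypothetical. It only captures the counting argument: if every worker is blocked waiting for its repair, no worker is left to run a repair completion.

```c
#include <stdbool.h>

/*
 * Toy model (NOT kernel code, names are hypothetical): a pool with
 * `workers` threads handles `parents` failed read completions, each of
 * which needs one repair completion processed on the same pool.
 */

/* Old scheme: each parent holds its worker until its repair has completed. */
static bool blocking_scheme_finishes(int workers, int parents)
{
	/* At most `workers` parents can be running (and blocked) at once. */
	int blocked = parents < workers ? parents : workers;

	/* A repair completion needs at least one worker that is not blocked. */
	return parents == 0 || blocked < workers;
}

/* New scheme: a parent takes a reference and gives its worker back at once. */
static bool refcount_scheme_finishes(int workers, int parents)
{
	int refs = 1;	/* initial reference held by the submitter */
	int i;

	if (workers < 1)
		return false;

	for (i = 0; i < parents; i++)
		refs++;		/* one reference per in-flight repair bio */
	for (i = 0; i < parents; i++)
		refs--;		/* each repair completion drops its reference */

	/* Dropping the initial reference completes the request. */
	return --refs == 0;
}
```

In this model, blocking_scheme_finishes(1, 1) is false while refcount_scheme_finishes(1, 1) is true: once the completion handler no longer waits, sharing a queue with the repair I/O cannot starve it.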

Signed-off-by: Omar Sandoval
Reviewed-by: Josef Bacik
---
 fs/btrfs/ctree.h   | 1 -
 fs/btrfs/disk-io.c | 8 +-------
 fs/btrfs/disk-io.h | 1 -
 fs/btrfs/inode.c   | 2 +-
 4 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ecd016f7dab1..91c7ea587fcd 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -759,7 +759,6 @@ struct btrfs_fs_info {
 	struct btrfs_workqueue *endio_workers;
 	struct btrfs_workqueue *endio_meta_workers;
 	struct btrfs_workqueue *endio_raid56_workers;
-	struct btrfs_workqueue *endio_repair_workers;
 	struct btrfs_workqueue *rmw_workers;
 	struct btrfs_workqueue *endio_meta_write_workers;
 	struct btrfs_workqueue *endio_write_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6b00ddea0b48..e2d7915f5b03 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -709,9 +709,7 @@ static void end_workqueue_bio(struct bio *bio)
 		else
 			wq = fs_info->endio_write_workers;
 	} else {
-		if (unlikely(end_io_wq->metadata == BTRFS_WQ_ENDIO_DIO_REPAIR))
-			wq = fs_info->endio_repair_workers;
-		else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56)
+		if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56)
 			wq = fs_info->endio_raid56_workers;
 		else if (end_io_wq->metadata)
 			wq = fs_info->endio_meta_workers;
@@ -1955,7 +1953,6 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 	btrfs_destroy_workqueue(fs_info->workers);
 	btrfs_destroy_workqueue(fs_info->endio_workers);
 	btrfs_destroy_workqueue(fs_info->endio_raid56_workers);
-	btrfs_destroy_workqueue(fs_info->endio_repair_workers);
 	btrfs_destroy_workqueue(fs_info->rmw_workers);
 	btrfs_destroy_workqueue(fs_info->endio_write_workers);
 	btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
@@ -2141,8 +2138,6 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info,
 	fs_info->endio_raid56_workers =
 		btrfs_alloc_workqueue(fs_info, "endio-raid56", flags,
 				      max_active, 4);
-	fs_info->endio_repair_workers =
-		btrfs_alloc_workqueue(fs_info, "endio-repair", flags, 1, 0);
 	fs_info->rmw_workers =
 		btrfs_alloc_workqueue(fs_info, "rmw", flags, max_active, 2);
 	fs_info->endio_write_workers =
@@ -2166,7 +2161,6 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info,
 	      fs_info->flush_workers &&
 	      fs_info->endio_workers && fs_info->endio_meta_workers &&
 	      fs_info->endio_meta_write_workers &&
-	      fs_info->endio_repair_workers &&
 	      fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
 	      fs_info->endio_freespace_worker && fs_info->rmw_workers &&
 	      fs_info->caching_workers && fs_info->readahead_workers &&
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 59c885860bf8..aef643f26d0c 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -25,7 +25,6 @@ enum btrfs_wq_endio_type {
 	BTRFS_WQ_ENDIO_METADATA,
 	BTRFS_WQ_ENDIO_FREE_SPACE,
 	BTRFS_WQ_ENDIO_RAID56,
-	BTRFS_WQ_ENDIO_DIO_REPAIR,
 };
 
 static inline u64 btrfs_sb_offset(int mirror)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ef302b7c6c2d..7f00fee5169b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7416,7 +7416,7 @@ static inline blk_status_t submit_dio_repair_bio(struct inode *inode,
 
 	BUG_ON(bio_op(bio) == REQ_OP_WRITE);
 
-	ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DIO_REPAIR);
+	ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA);
 	if (ret)
 		return ret;
 

From patchwork Mon Mar 9 21:32:41 2020
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 11428101
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, Christoph Hellwig
Subject: [PATCH 15/15] btrfs: unify buffered and direct I/O read repair
Date: Mon, 9 Mar 2020 14:32:41 -0700
Message-Id: <7c593decda73deb58515d94e979db6a68527970b.1583789410.git.osandov@fb.com>

From: Omar Sandoval

Currently, direct I/O has its own versions of bio_readpage_error() and
btrfs_check_repairable() (dio_read_error() and
btrfs_check_dio_repairable(), respectively). The main difference is that
the direct I/O version doesn't do read validation. The rework of direct
I/O repair makes it possible to do validation, so we can get rid of
btrfs_check_dio_repairable() and combine bio_readpage_error() and
dio_read_error() into a new helper, btrfs_submit_read_repair().
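
As a rough userspace illustration of the shape of the unified helper
(not the kernel code itself): one function picks the next mirror and
decides whether validation is needed, and the caller supplies the
submission step as a callback. All names here (struct fail_record,
submit_hook_t, submit_read_repair(), the two hooks) are hypothetical.
The mirror-advance rule is modeled on the removed
btrfs_check_dio_repairable() in the diff below; the in_validation
handling of the real btrfs_check_repairable() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Caller-supplied submission step; returns 0 on success. */
typedef int (*submit_hook_t)(int mirror_num, bool failfast, void *ctx);

/* Hypothetical, trimmed-down failure record. */
struct fail_record {
	int this_mirror;
	int failed_mirror;
};

/* Advance to the next mirror, skipping the one that just failed. */
static bool next_mirror(struct fail_record *rec, int failed_mirror,
			int num_copies)
{
	if (num_copies == 1)
		return false;	/* single copy: nothing to retry from */

	rec->failed_mirror = failed_mirror;
	rec->this_mirror++;
	if (rec->this_mirror == failed_mirror)
		rec->this_mirror++;

	return rec->this_mirror <= num_copies;
}

/* Shared helper: same logic for both callers, only the hook differs. */
static int submit_read_repair(struct fail_record *rec, int failed_mirror,
			      int num_copies, unsigned int failed_len,
			      unsigned int sectorsize,
			      submit_hook_t hook, void *ctx)
{
	/* A failed multi-sector read has to be validated per sector. */
	bool need_validation = failed_len > sectorsize;

	assert(hook != NULL);
	if (!next_mirror(rec, failed_mirror, num_copies))
		return -1;

	return hook(rec->this_mirror, need_validation, ctx);
}

/* Hypothetical buffered-read hook: just records the mirror used. */
static int buffered_hook(int mirror_num, bool failfast, void *ctx)
{
	int *last_mirror = ctx;

	(void)failfast;
	*last_mirror = mirror_num;
	return 0;
}

/* Hypothetical direct I/O hook: also takes a reference for the repair. */
struct dio_ctx {
	int refs;
	int last_mirror;
};

static int dio_hook(int mirror_num, bool failfast, void *ctx)
{
	struct dio_ctx *d = ctx;

	(void)failfast;
	d->refs++;
	d->last_mirror = mirror_num;
	return 0;
}
```

The point of the callback is that the direct I/O specifics (here, the
extra reference) stay in the hook, so the retry and validation logic is
written once.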

Signed-off-by: Omar Sandoval
Reviewed-by: Josef Bacik
---
 fs/btrfs/extent_io.c | 126 +++++++++++++++++++------------------------
 fs/btrfs/extent_io.h |  17 +++---
 fs/btrfs/inode.c     | 103 ++++-------------------------------
 3 files changed, 76 insertions(+), 170 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fad86ef4d09d..a5cbe04da803 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2593,80 +2593,52 @@ static bool btrfs_check_repairable(struct inode *inode,
 	return true;
 }
 
-
-struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
-				    struct io_failure_record *failrec,
-				    struct page *page, int pg_offset, int icsum,
-				    bio_end_io_t *endio_func, void *data)
-{
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct bio *bio;
-	struct btrfs_io_bio *btrfs_failed_bio;
-	struct btrfs_io_bio *btrfs_bio;
-
-	bio = btrfs_io_bio_alloc(1);
-	bio->bi_end_io = endio_func;
-	bio->bi_iter.bi_sector = failrec->logical >> 9;
-	bio->bi_iter.bi_size = 0;
-	bio->bi_private = data;
-
-	btrfs_failed_bio = btrfs_io_bio(failed_bio);
-	if (btrfs_failed_bio->csum) {
-		u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
-
-		btrfs_bio = btrfs_io_bio(bio);
-		btrfs_bio->csum = btrfs_bio->csum_inline;
-		icsum *= csum_size;
-		memcpy(btrfs_bio->csum, btrfs_failed_bio->csum + icsum,
-		       csum_size);
-	}
-
-	bio_add_page(bio, page, failrec->len, pg_offset);
-	btrfs_io_bio(bio)->logical = failrec->start;
-	btrfs_io_bio(bio)->iter = bio->bi_iter;
-
-	return bio;
-}
-
-/*
- * This is a generic handler for readpage errors. If other copies exist, read
- * those and write back good data to the failed position. Does not investigate
- * in remapping the failed extent elsewhere, hoping the device will be smart
- * enough to do this as needed
- */
-static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
-			      struct page *page, u64 start, u64 end,
-			      int failed_mirror)
+blk_status_t btrfs_submit_read_repair(struct inode *inode,
+				      struct bio *failed_bio, u64 phy_offset,
+				      struct page *page, unsigned int pgoff,
+				      u64 start, u64 end, int failed_mirror,
+				      submit_bio_hook_t *submit_bio_hook)
 {
 	struct io_failure_record *failrec;
-	struct inode *inode = page->mapping->host;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
+	struct btrfs_io_bio *failed_io_bio = btrfs_io_bio(failed_bio);
+	struct btrfs_io_bio *io_bio;
+	int icsum = phy_offset >> inode->i_sb->s_blocksize_bits;
 	bool need_validation = false;
 	struct bio *bio;
-	int read_mode = 0;
 	blk_status_t status;
 	int ret;
 
+	btrfs_info(btrfs_sb(inode->i_sb),
+		   "Repair Read Error: read error at %llu", start);
+
 	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
 
 	ret = btrfs_get_io_failure_record(inode, start, end, &failrec);
 	if (ret)
-		return ret;
+		return errno_to_blk_status(ret);
 
 	/*
 	 * If there was an I/O error and the I/O was for multiple sectors, we
 	 * need to validate each sector individually.
 	 */
 	if (failed_bio->bi_status != BLK_STS_OK) {
-		u64 len = 0;
-		int i;
-
-		for (i = 0; i < failed_bio->bi_vcnt; i++) {
-			len += failed_bio->bi_io_vec[i].bv_len;
-			if (len > inode->i_sb->s_blocksize) {
+		if (bio_flagged(failed_bio, BIO_CLONED)) {
+			if (failed_io_bio->iter.bi_size >
+			    inode->i_sb->s_blocksize)
 				need_validation = true;
-				break;
+		} else {
+			u64 len = 0;
+			int i;
+
+			for (i = 0; i < failed_bio->bi_vcnt; i++) {
+				len += failed_bio->bi_io_vec[i].bv_len;
+				if (len > inode->i_sb->s_blocksize) {
+					need_validation = true;
+					break;
+				}
 			}
 		}
 	}
@@ -2674,32 +2646,41 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	if (!btrfs_check_repairable(inode, need_validation, failrec,
 				    failed_mirror)) {
 		free_io_failure(failure_tree, tree, failrec);
-		return -EIO;
+		return BLK_STS_IOERR;
 	}
 
+	bio = btrfs_io_bio_alloc(1);
+	io_bio = btrfs_io_bio(bio);
+	bio->bi_opf = REQ_OP_READ;
 	if (need_validation)
-		read_mode |= REQ_FAILFAST_DEV;
+		bio->bi_opf |= REQ_FAILFAST_DEV;
+	bio->bi_end_io = failed_bio->bi_end_io;
+	bio->bi_iter.bi_sector = failrec->logical >> 9;
+	bio->bi_private = failed_bio->bi_private;
 
-	phy_offset >>= inode->i_sb->s_blocksize_bits;
-	bio = btrfs_create_repair_bio(inode, failed_bio, failrec, page,
-				      start - page_offset(page),
-				      (int)phy_offset, failed_bio->bi_end_io,
-				      NULL);
-	bio->bi_opf = REQ_OP_READ | read_mode;
+	if (failed_io_bio->csum) {
+		u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
+
+		io_bio->csum = io_bio->csum_inline;
+		memcpy(io_bio->csum, failed_io_bio->csum + csum_size * icsum,
+		       csum_size);
+	}
+
+	bio_add_page(bio, page, failrec->len, pgoff);
+	io_bio->logical = failrec->start;
+	io_bio->iter = bio->bi_iter;
 
 	btrfs_debug(btrfs_sb(inode->i_sb),
-		"Repair Read Error: submitting new read[%#x] to this_mirror=%d, in_validation=%d",
-		read_mode, failrec->this_mirror, failrec->in_validation);
+"Repair Read Error: submitting new read to this_mirror=%d, in_validation=%d",
+		    failrec->this_mirror, failrec->in_validation);
 
-	status = tree->ops->submit_bio_hook(tree->private_data, bio, failrec->this_mirror,
-					 failrec->bio_flags);
+	status = submit_bio_hook(inode, bio, failrec->this_mirror,
+				 failrec->bio_flags);
 	if (status) {
 		free_io_failure(failure_tree, tree, failrec);
 		bio_put(bio);
-		ret = blk_status_to_errno(status);
 	}
-
-	return ret;
+	return status;
 }
 
 /* lots and lots of room for performance fixes in the end_bio funcs */
@@ -2871,9 +2852,10 @@ static void end_bio_extent_readpage(struct bio *bio)
 			 * If it can't handle the error it will return -EIO and
 			 * we remain responsible for that page.
 			 */
-			ret = bio_readpage_error(bio, offset, page, start, end,
-						 mirror);
-			if (ret == 0) {
+			if (!btrfs_submit_read_repair(inode, bio, offset, page,
+						start - page_offset(page),
+						start, end, mirror,
+						tree->ops->submit_bio_hook)) {
 				uptodate = !bio->bi_status;
 				offset += len;
 				continue;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 11341a430007..f269a4847d8b 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -66,6 +66,10 @@ struct btrfs_io_bio;
 struct io_failure_record;
 struct extent_io_tree;
 
+typedef blk_status_t (submit_bio_hook_t)(struct inode *inode, struct bio *bio,
+					 int mirror_num,
+					 unsigned long bio_flags);
+
 typedef blk_status_t (extent_submit_bio_start_t)(void *private_data,
 		struct bio *bio, u64 bio_offset);
 
@@ -74,8 +78,7 @@ struct extent_io_ops {
 	 * The following callbacks must be always defined, the function
 	 * pointer will be called unconditionally.
 	 */
-	blk_status_t (*submit_bio_hook)(struct inode *inode, struct bio *bio,
-					int mirror_num, unsigned long bio_flags);
+	submit_bio_hook_t *submit_bio_hook;
 	int (*readpage_end_io_hook)(struct btrfs_io_bio *io_bio, u64 phy_offset,
 				    struct page *page, u64 start, u64 end,
 				    int mirror);
@@ -312,10 +315,12 @@ struct io_failure_record {
 };
 
 
-struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
-				    struct io_failure_record *failrec,
-				    struct page *page, int pg_offset, int icsum,
-				    bio_end_io_t *endio_func, void *data);
+blk_status_t btrfs_submit_read_repair(struct inode *inode,
+				      struct bio *failed_bio, u64 phy_offset,
+				      struct page *page, unsigned int pgoff,
+				      u64 start, u64 end, int failed_mirror,
+				      submit_bio_hook_t *submit_bio_hook);
+
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 bool find_lock_delalloc_range(struct inode *inode,
 			     struct page *locked_page, u64 *start,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7f00fee5169b..d555f9bf5bbf 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7407,10 +7407,11 @@ static void btrfs_dio_private_put(struct btrfs_dio_private *dip)
 	kfree(dip);
 }
 
-static inline blk_status_t submit_dio_repair_bio(struct inode *inode,
-						 struct bio *bio,
-						 int mirror_num)
+static blk_status_t submit_dio_repair_bio(struct inode *inode, struct bio *bio,
+					  int mirror_num,
+					  unsigned long bio_flags)
 {
+	struct btrfs_dio_private *dip = bio->bi_private;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	blk_status_t ret;
 
@@ -7420,96 +7421,11 @@ static inline blk_status_t submit_dio_repair_bio(struct inode *inode,
 	if (ret)
 		return ret;
 
+	refcount_inc(&dip->refs);
 	ret = btrfs_map_bio(fs_info, bio, mirror_num);
-
-	return ret;
-}
-
-static int btrfs_check_dio_repairable(struct inode *inode,
-				      struct bio *failed_bio,
-				      struct io_failure_record *failrec,
-				      int failed_mirror)
-{
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	int num_copies;
-
-	num_copies = btrfs_num_copies(fs_info, failrec->logical, failrec->len);
-	if (num_copies == 1) {
-		/*
-		 * we only have a single copy of the data, so don't bother with
-		 * all the retry and error correction code that follows. no
-		 * matter what the error is, it is very likely to persist.
-		 */
-		btrfs_debug(fs_info,
-			"Check DIO Repairable: cannot repair, num_copies=%d, next_mirror %d, failed_mirror %d",
-			num_copies, failrec->this_mirror, failed_mirror);
-		return 0;
-	}
-
-	failrec->failed_mirror = failed_mirror;
-	failrec->this_mirror++;
-	if (failrec->this_mirror == failed_mirror)
-		failrec->this_mirror++;
-
-	if (failrec->this_mirror > num_copies) {
-		btrfs_debug(fs_info,
-			"Check DIO Repairable: (fail) num_copies=%d, next_mirror %d, failed_mirror %d",
-			num_copies, failrec->this_mirror, failed_mirror);
-		return 0;
-	}
-
-	return 1;
-}
-
-static blk_status_t dio_read_error(struct inode *inode, struct bio *failed_bio,
-				   struct page *page, unsigned int pgoff,
-				   u64 start, u64 end, int failed_mirror)
-{
-	struct btrfs_dio_private *dip = failed_bio->bi_private;
-	struct io_failure_record *failrec;
-	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
-	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
-	struct bio *bio;
-	int isector;
-	unsigned int read_mode = 0;
-	int ret;
-	blk_status_t status;
-
-	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
-
-	ret = btrfs_get_io_failure_record(inode, start, end, &failrec);
 	if (ret)
-		return errno_to_blk_status(ret);
-
-	ret = btrfs_check_dio_repairable(inode, failed_bio, failrec,
-					 failed_mirror);
-	if (!ret) {
-		free_io_failure(failure_tree, io_tree, failrec);
-		return BLK_STS_IOERR;
-	}
-
-	if (btrfs_io_bio(failed_bio)->iter.bi_size > inode->i_sb->s_blocksize)
-		read_mode |= REQ_FAILFAST_DEV;
-
-	isector = start - btrfs_io_bio(failed_bio)->logical;
-	isector >>= inode->i_sb->s_blocksize_bits;
-	bio = btrfs_create_repair_bio(inode, failed_bio, failrec, page, pgoff,
-				isector, failed_bio->bi_end_io, dip);
-	bio->bi_opf = REQ_OP_READ | read_mode;
-
-	btrfs_debug(BTRFS_I(inode)->root->fs_info,
-		    "repair DIO read error: submitting new dio read[%#x] to this_mirror=%d, in_validation=%d",
-		    read_mode, failrec->this_mirror, failrec->in_validation);
-
-	refcount_inc(&dip->refs);
-	status = submit_dio_repair_bio(inode, bio, failrec->this_mirror);
-	if (status) {
-		free_io_failure(failure_tree, io_tree, failrec);
-		bio_put(bio);
 		refcount_dec(&dip->refs);
-	}
-
-	return status;
+	return ret;
 }
 
 static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
@@ -7544,11 +7460,14 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
 			} else {
 				blk_status_t status;
 
-				status = dio_read_error(inode, &io_bio->bio,
+				status = btrfs_submit_read_repair(inode,
+							&io_bio->bio,
+							start - io_bio->logical,
 							bvec.bv_page, pgoff,
 							start,
 							start + sectorsize - 1,
-							io_bio->mirror_num);
+							io_bio->mirror_num,
+							submit_dio_repair_bio);
 				if (status)
 					err = status;
 			}