From patchwork Tue Feb 28 15:41:36 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9596049
From: Ming Lei
To: Shaohua Li, Jens Axboe, linux-raid@vger.kernel.org,
 linux-block@vger.kernel.org, Christoph Hellwig
Cc: Ming Lei
Subject: [PATCH v2 06/13] md: raid1: don't use bio's vec table to manage
 resync pages
Date: Tue, 28 Feb 2017 23:41:36 +0800
Message-Id: <1488296503-4987-7-git-send-email-tom.leiming@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1488296503-4987-1-git-send-email-tom.leiming@gmail.com>
References: <1488296503-4987-1-git-send-email-tom.leiming@gmail.com>

Now we allocate one page array to manage resync pages, instead of
using the bio's vec table to do that. The old way is hacky and won't
work any more once multipage bvecs are enabled.

The introduced cost is that we need to allocate (128 + 16) * raid_disks
bytes per r1_bio, which is fine because the number of in-flight r1_bios
for resync shouldn't be large, as pointed out by Shaohua.

Also the bio_reset() in raid1_sync_request() is removed because all
bios are freshly allocated now and there is no need to reset them any
more.

This patch can be thought of as a cleanup, too.

Suggested-by: Shaohua Li
Signed-off-by: Ming Lei
---
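
Note for readers without the earlier patches of this series at hand:
the diff below builds on the struct resync_pages helpers introduced
there (resync_alloc_pages(), resync_free_pages(), resync_fetch_page(),
resync_page_available(), ...). The sketch below is only illustrative;
the field names match the ones used in the diff, but the exact layout
is inferred from the size figure in the changelog:

	/*
	 * One resync_pages per resync bio; r1buf_pool_alloc() below
	 * kmalloc()s an array of pi->raid_disks of these per r1_bio.
	 */
	struct resync_pages {
		unsigned	idx;	/* next page to fetch from the pool */
		void		*raid_bio;	/* owning r1_bio, see get_resync_r1bio() */
		struct page	*pages[RESYNC_PAGES];
	};

	/*
	 * Each resync bio now reaches its pages and its r1_bio through
	 * ->bi_private instead of through the bio's vec table:
	 *
	 *	bio->bi_private -> struct resync_pages -> raid_bio
	 *	                                       -> pages[]
	 */

With 4KiB pages and a 64KiB resync unit, RESYNC_PAGES is 16: the page
pointer array is 128 bytes, and idx plus raid_bio take 16 bytes with
padding, which is where the (128 + 16) * raid_disks figure above comes
from.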
 drivers/md/raid1.c | 83 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 53 insertions(+), 30 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index c442b4657e2f..900144f39630 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -77,6 +77,16 @@ static void lower_barrier(struct r1conf *conf, sector_t sector_nr);
 #define raid1_log(md, fmt, args...) \
 	do { if ((md)->queue) blk_add_trace_msg((md)->queue, "raid1 " fmt, ##args); } while (0)
 
+static inline struct resync_pages *get_resync_pages(struct bio *bio)
+{
+	return bio->bi_private;
+}
+
+static inline struct r1bio *get_resync_r1bio(struct bio *bio)
+{
+	return get_resync_pages(bio)->raid_bio;
+}
+
 static void * r1bio_pool_alloc(gfp_t gfp_flags, void *data)
 {
 	struct pool_info *pi = data;
@@ -104,12 +114,18 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 	struct r1bio *r1_bio;
 	struct bio *bio;
 	int need_pages;
-	int i, j;
+	int j;
+	struct resync_pages *rps;
 
 	r1_bio = r1bio_pool_alloc(gfp_flags, pi);
 	if (!r1_bio)
 		return NULL;
 
+	rps = kmalloc(sizeof(struct resync_pages) * pi->raid_disks,
+		      gfp_flags);
+	if (!rps)
+		goto out_free_r1bio;
+
 	/*
 	 * Allocate bios : 1 for reading, n-1 for writing
 	 */
@@ -129,22 +145,22 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 		need_pages = pi->raid_disks;
 	else
 		need_pages = 1;
-	for (j = 0; j < need_pages; j++) {
+	for (j = 0; j < pi->raid_disks; j++) {
+		struct resync_pages *rp = &rps[j];
+
 		bio = r1_bio->bios[j];
-		bio->bi_vcnt = RESYNC_PAGES;
-
-		if (bio_alloc_pages(bio, gfp_flags))
-			goto out_free_pages;
-	}
-	/* If not user-requested, copy the page pointers to all bios */
-	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
-		for (i=0; i<RESYNC_PAGES ; i++)
-			for (j=1; j<pi->raid_disks; j++) {
-				struct page *page =
-					r1_bio->bios[0]->bi_io_vec[i].bv_page;
-				get_page(page);
-				r1_bio->bios[j]->bi_io_vec[i].bv_page = page;
-			}
+
+		if (j < need_pages) {
+			if (resync_alloc_pages(rp, gfp_flags))
+				goto out_free_pages;
+		} else {
+			memcpy(rp, &rps[0], sizeof(*rp));
+			resync_get_all_pages(rp);
+		}
+
+		rp->idx = 0;
+		rp->raid_bio = r1_bio;
+		bio->bi_private = rp;
 	}
 
 	r1_bio->master_bio = NULL;
@@ -153,11 +169,14 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 
 out_free_pages:
 	while (--j >= 0)
-		bio_free_pages(r1_bio->bios[j]);
+		resync_free_pages(&rps[j]);
 
 out_free_bio:
 	while (++j < pi->raid_disks)
 		bio_put(r1_bio->bios[j]);
+	kfree(rps);
+
+out_free_r1bio:
 	r1bio_pool_free(r1_bio, data);
 	return NULL;
 }
@@ -165,14 +184,18 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 static void r1buf_pool_free(void *__r1_bio, void *data)
 {
 	struct pool_info *pi = data;
-	int i,j;
+	int i;
 	struct r1bio *r1bio = __r1_bio;
+	struct resync_pages *rp = NULL;
 
-	for (i = 0; i < RESYNC_PAGES; i++)
-		for (j = pi->raid_disks; j-- ;)
-			safe_put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
-	for (i=0 ; i < pi->raid_disks; i++)
+	for (i = pi->raid_disks; i--; ) {
+		rp = get_resync_pages(r1bio->bios[i]);
+		resync_free_pages(rp);
 		bio_put(r1bio->bios[i]);
+	}
+
+	/* resync pages array stored in the 1st bio's .bi_private */
+	kfree(rp);
 
 	r1bio_pool_free(r1bio, data);
 }
@@ -1849,7 +1872,7 @@ static int raid1_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 
 static void end_sync_read(struct bio *bio)
 {
-	struct r1bio *r1_bio = bio->bi_private;
+	struct r1bio *r1_bio = get_resync_r1bio(bio);
 
 	update_head_pos(r1_bio->read_disk, r1_bio);
 
@@ -1868,7 +1891,7 @@ static void end_sync_read(struct bio *bio)
 static void end_sync_write(struct bio *bio)
 {
 	int uptodate = !bio->bi_error;
-	struct r1bio *r1_bio = bio->bi_private;
+	struct r1bio *r1_bio = get_resync_r1bio(bio);
 	struct mddev *mddev = r1_bio->mddev;
 	struct r1conf *conf = mddev->private;
 	sector_t first_bad;
@@ -2085,6 +2108,7 @@ static void process_checks(struct r1bio *r1_bio)
 		int size;
 		int error;
 		struct bio *b = r1_bio->bios[i];
+		struct resync_pages *rp = get_resync_pages(b);
 		if (b->bi_end_io != end_sync_read)
 			continue;
 		/* fixup the bio for reuse, but preserve errno */
@@ -2097,7 +2121,8 @@ static void process_checks(struct r1bio *r1_bio)
 			conf->mirrors[i].rdev->data_offset;
 		b->bi_bdev = conf->mirrors[i].rdev->bdev;
 		b->bi_end_io = end_sync_read;
-		b->bi_private = r1_bio;
+		rp->raid_bio = r1_bio;
+		b->bi_private = rp;
 
 		size = b->bi_iter.bi_size;
 		for (j = 0; j < vcnt ; j++) {
@@ -2755,7 +2780,6 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 	for (i = 0; i < conf->raid_disks * 2; i++) {
 		struct md_rdev *rdev;
 		bio = r1_bio->bios[i];
-		bio_reset(bio);
 
 		rdev = rcu_dereference(conf->mirrors[i].rdev);
 		if (rdev == NULL ||
@@ -2811,7 +2835,6 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 			atomic_inc(&rdev->nr_pending);
 			bio->bi_iter.bi_sector = sector_nr + rdev->data_offset;
 			bio->bi_bdev = rdev->bdev;
-			bio->bi_private = r1_bio;
 			if (test_bit(FailFast, &rdev->flags))
 				bio->bi_opf |= MD_FAILFAST;
 		}
@@ -2899,7 +2922,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		for (i = 0 ; i < conf->raid_disks * 2; i++) {
 			bio = r1_bio->bios[i];
 			if (bio->bi_end_io) {
-				page = bio->bi_io_vec[bio->bi_vcnt].bv_page;
+				page = resync_fetch_page(get_resync_pages(bio));
 
 				/*
 				 * won't fail because the vec table is big
@@ -2911,8 +2934,8 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		nr_sectors += len>>9;
 		sector_nr += len>>9;
 		sync_blocks -= (len>>9);
-	} while (r1_bio->bios[disk]->bi_vcnt < RESYNC_PAGES);
- bio_full:
+	} while (resync_page_available(r1_bio->bios[disk]->bi_private));
+
 	r1_bio->sectors = nr_sectors;
 
 	if (mddev_is_clustered(mddev) &&