From patchwork Wed Jan 17 20:21:39 2018
X-Patchwork-Submitter: "Matthew Wilcox (Oracle)"
X-Patchwork-Id: 10172749
From: Matthew Wilcox
To: linux-kernel@vger.kernel.org
Date: Wed, 17 Jan 2018 12:21:39 -0800
Message-Id: <20180117202203.19756-76-willy@infradead.org>
In-Reply-To: <20180117202203.19756-1-willy@infradead.org>
References: <20180117202203.19756-1-willy@infradead.org>
Cc: linux-s390@vger.kernel.org, David Howells, linux-nilfs@vger.kernel.org,
 Matthew Wilcox, linux-sh@vger.kernel.org, intel-gfx@lists.freedesktop.org,
 linux-usb@vger.kernel.org, linux-remoteproc@vger.kernel.org,
 linux-f2fs-devel@lists.sourceforge.net, linux-xfs@vger.kernel.org,
 linux-mm@kvack.org, iommu@lists.linux-foundation.org, Stefano Stabellini,
 linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org, Bjorn Andersson,
 linux-btrfs@vger.kernel.org
Subject: [Intel-gfx] [PATCH v6 75/99] md: Convert raid5-cache to XArray

From: Matthew Wilcox

This is the first user of the radix tree I've converted which was
storing numbers rather than pointers.
I'm fairly pleased with how well it came out.  There's less boiler-plate
involved than there was with the radix tree, so that's a win.  It does
use the advanced API, and I think that's a signal that there needs to
be a separate API for using the XArray for only integers.

Signed-off-by: Matthew Wilcox
---
 drivers/md/raid5-cache.c | 119 ++++++++++++++++-------------------------------
 1 file changed, 40 insertions(+), 79 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 39f31f07ffe9..2c8ad0ed9b48 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -158,9 +158,8 @@ struct r5l_log {
 	/* to disable write back during in degraded mode */
 	struct work_struct disable_writeback_work;
 
-	/* to for chunk_aligned_read in writeback mode, details below */
-	spinlock_t tree_lock;
-	struct radix_tree_root big_stripe_tree;
+	/* for chunk_aligned_read in writeback mode, details below */
+	struct xarray big_stripe;
 };
 
 /*
@@ -170,9 +169,8 @@ struct r5l_log {
  * chunk contains 64 4kB-page, so this chunk contain 64 stripes). For
  * chunk_aligned_read, these stripes are grouped into one "big_stripe".
  * For each big_stripe, we count how many stripes of this big_stripe
- * are in the write back cache. These data are tracked in a radix tree
- * (big_stripe_tree). We use radix_tree item pointer as the counter.
- * r5c_tree_index() is used to calculate keys for the radix tree.
+ * are in the write back cache. This counter is tracked in an xarray
+ * (big_stripe). r5c_index() is used to calculate the index.
  *
  * chunk_aligned_read() calls r5c_big_stripe_cached() to look up
  * big_stripe of each chunk in the tree. If this big_stripe is in the
@@ -180,9 +178,9 @@ struct r5l_log {
  * rcu_read_lock().
  *
  * It is necessary to remember whether a stripe is counted in
- * big_stripe_tree. Instead of adding new flag, we reuses existing flags:
+ * big_stripe. Instead of adding new flag, we reuses existing flags:
  * STRIPE_R5C_PARTIAL_STRIPE and STRIPE_R5C_FULL_STRIPE. If either of these
- * two flags are set, the stripe is counted in big_stripe_tree. This
+ * two flags are set, the stripe is counted in big_stripe. This
  * requires moving set_bit(STRIPE_R5C_PARTIAL_STRIPE) to
  * r5c_try_caching_write(); and moving clear_bit of
  * STRIPE_R5C_PARTIAL_STRIPE and STRIPE_R5C_FULL_STRIPE to
@@ -190,23 +188,13 @@ struct r5l_log {
  */
 
 /*
- * radix tree requests lowest 2 bits of data pointer to be 2b'00.
- * So it is necessary to left shift the counter by 2 bits before using it
- * as data pointer of the tree.
- */
-#define R5C_RADIX_COUNT_SHIFT 2
-
-/*
- * calculate key for big_stripe_tree
+ * calculate key for big_stripe
  *
  * sect: align_bi->bi_iter.bi_sector or sh->sector
  */
-static inline sector_t r5c_tree_index(struct r5conf *conf,
-				      sector_t sect)
+static inline sector_t r5c_index(struct r5conf *conf, sector_t sect)
 {
-	sector_t offset;
-
-	offset = sector_div(sect, conf->chunk_sectors);
+	sector_div(sect, conf->chunk_sectors);
 	return sect;
 }
 
@@ -2646,10 +2634,6 @@ int r5c_try_caching_write(struct r5conf *conf,
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
-	void **pslot;
-	sector_t tree_index;
-	int ret;
-	uintptr_t refcount;
 
 	BUG_ON(!r5c_is_writeback(log));
 
@@ -2697,39 +2681,29 @@ int r5c_try_caching_write(struct r5conf *conf,
 		}
 	}
 
-	/* if the stripe is not counted in big_stripe_tree, add it now */
+	/* if the stripe is not counted in big_stripe, add it now */
 	if (!test_bit(STRIPE_R5C_PARTIAL_STRIPE, &sh->state) &&
 	    !test_bit(STRIPE_R5C_FULL_STRIPE, &sh->state)) {
-		tree_index = r5c_tree_index(conf, sh->sector);
-		spin_lock(&log->tree_lock);
-		pslot = radix_tree_lookup_slot(&log->big_stripe_tree,
-					       tree_index);
-		if (pslot) {
-			refcount = (uintptr_t)radix_tree_deref_slot_protected(
-				pslot, &log->tree_lock) >>
-				R5C_RADIX_COUNT_SHIFT;
-			radix_tree_replace_slot(
-				&log->big_stripe_tree, pslot,
-				(void *)((refcount + 1) << R5C_RADIX_COUNT_SHIFT));
-		} else {
-			/*
-			 * this radix_tree_insert can fail safely, so no
-			 * need to call radix_tree_preload()
-			 */
-			ret = radix_tree_insert(
-				&log->big_stripe_tree, tree_index,
-				(void *)(1 << R5C_RADIX_COUNT_SHIFT));
-			if (ret) {
-				spin_unlock(&log->tree_lock);
-				r5c_make_stripe_write_out(sh);
-				return -EAGAIN;
-			}
+		XA_STATE(xas, &log->big_stripe, r5c_index(conf, sh->sector));
+		void *entry;
+
+		/* Caller would rather handle failures than supply GFP flags */
+		xas_lock(&xas);
+		entry = xas_create(&xas);
+		if (entry)
+			entry = xa_mk_value(xa_to_value(entry) + 1);
+		else
+			entry = xa_mk_value(1);
+		xas_store(&xas, entry);
+		xas_unlock(&xas);
+		if (xas_error(&xas)) {
+			r5c_make_stripe_write_out(sh);
+			return -EAGAIN;
 		}
-		spin_unlock(&log->tree_lock);
 
 		/*
 		 * set STRIPE_R5C_PARTIAL_STRIPE, this shows the stripe is
-		 * counted in the radix tree
+		 * counted in big_stripe
 		 */
 		set_bit(STRIPE_R5C_PARTIAL_STRIPE, &sh->state);
 		atomic_inc(&conf->r5c_cached_partial_stripes);
@@ -2812,9 +2786,6 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 	struct r5l_log *log = conf->log;
 	int i;
 	int do_wakeup = 0;
-	sector_t tree_index;
-	void **pslot;
-	uintptr_t refcount;
 
 	if (!log || !test_bit(R5_InJournal, &sh->dev[sh->pd_idx].flags))
 		return;
@@ -2852,24 +2823,21 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 	atomic_dec(&log->stripe_in_journal_count);
 	r5c_update_log_state(log);
 
-	/* stop counting this stripe in big_stripe_tree */
+	/* stop counting this stripe in big_stripe */
 	if (test_bit(STRIPE_R5C_PARTIAL_STRIPE, &sh->state) ||
 	    test_bit(STRIPE_R5C_FULL_STRIPE, &sh->state)) {
-		tree_index = r5c_tree_index(conf, sh->sector);
-		spin_lock(&log->tree_lock);
-		pslot = radix_tree_lookup_slot(&log->big_stripe_tree,
-					       tree_index);
-		BUG_ON(pslot == NULL);
-		refcount = (uintptr_t)radix_tree_deref_slot_protected(
-			pslot, &log->tree_lock) >>
-			R5C_RADIX_COUNT_SHIFT;
-		if (refcount == 1)
-			radix_tree_delete(&log->big_stripe_tree, tree_index);
+		XA_STATE(xas, &log->big_stripe, r5c_index(conf, sh->sector));
+		void *entry;
+
+		xas_lock(&xas);
+		entry = xas_load(&xas);
+		BUG_ON(!entry);
+		if (entry == xa_mk_value(1))
+			entry = NULL;
 		else
-			radix_tree_replace_slot(
-				&log->big_stripe_tree, pslot,
-				(void *)((refcount - 1) << R5C_RADIX_COUNT_SHIFT));
-		spin_unlock(&log->tree_lock);
+			entry = xa_mk_value(xa_to_value(entry) - 1);
+		xas_store(&xas, entry);
+		xas_unlock(&xas);
 	}
 
 	if (test_and_clear_bit(STRIPE_R5C_PARTIAL_STRIPE, &sh->state)) {
@@ -2949,16 +2917,10 @@ int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 {
 	struct r5l_log *log = conf->log;
-	sector_t tree_index;
-	void *slot;
 
 	if (!log)
 		return false;
-
-	WARN_ON_ONCE(!rcu_read_lock_held());
-	tree_index = r5c_tree_index(conf, sect);
-	slot = radix_tree_lookup(&log->big_stripe_tree, tree_index);
-	return slot != NULL;
+	return xa_load(&log->big_stripe, r5c_index(conf, sect)) != NULL;
 }
 
 static int r5l_load_log(struct r5l_log *log)
@@ -3112,8 +3074,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	if (!log->meta_pool)
 		goto out_mempool;
 
-	spin_lock_init(&log->tree_lock);
-	INIT_RADIX_TREE(&log->big_stripe_tree, GFP_NOWAIT | __GFP_NOWARN);
+	xa_init(&log->big_stripe);
 
 	log->reclaim_thread = md_register_thread(r5l_reclaim_thread,
 						 log->rdev->mddev, "reclaim");
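
For readers unfamiliar with value entries, the per-big_stripe counter
logic in the r5c_try_caching_write() and r5c_finish_stripe_write_out()
hunks can be sketched in user space.  This is an illustration only and
not part of the patch: mk_value() and to_value() are hypothetical
stand-ins that assume the tagging used by xa_mk_value() and
xa_to_value() in mainline (shift left by one and set the low bit), and
count_get() and count_put() are made-up names that operate on a single
slot with no locking and no allocation-failure handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xa_mk_value(): tag an integer as a value entry. */
static void *mk_value(unsigned long v)
{
	return (void *)(uintptr_t)((v << 1) | 1);
}

/* Hypothetical stand-in for xa_to_value(): recover the integer. */
static unsigned long to_value(const void *entry)
{
	return (unsigned long)(uintptr_t)entry >> 1;
}

/*
 * Count one more stripe in the slot.  A NULL slot means the big_stripe
 * is not present yet, so the count starts at 1.
 */
static void *count_get(void *entry)
{
	if (entry)
		return mk_value(to_value(entry) + 1);
	return mk_value(1);
}

/*
 * Stop counting one stripe.  Dropping the last reference returns NULL,
 * which corresponds to storing NULL and thereby erasing the entry.
 */
static void *count_put(void *entry)
{
	assert(entry != NULL);	/* the patch uses BUG_ON(!entry) here */
	if (entry == mk_value(1))
		return NULL;
	return mk_value(to_value(entry) - 1);
}
```

Because a present counter is always a tagged, non-NULL value, the
lookup side only needs to test for NULL, which is what
r5c_big_stripe_cached() does with xa_load().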