From patchwork Mon Apr 6 19:57:53 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 6164121 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 41856BF4A6 for ; Mon, 6 Apr 2015 19:59:26 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 51B01202F8 for ; Mon, 6 Apr 2015 19:59:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0AEFB20357 for ; Mon, 6 Apr 2015 19:59:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753169AbbDFT7S (ORCPT ); Mon, 6 Apr 2015 15:59:18 -0400 Received: from mail-qg0-f44.google.com ([209.85.192.44]:33480 "EHLO mail-qg0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753007AbbDFT66 (ORCPT ); Mon, 6 Apr 2015 15:58:58 -0400 Received: by qgdy78 with SMTP id y78so14932818qgd.0; Mon, 06 Apr 2015 12:58:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=xB1cZjUbKSnNvdaJvG7G84oThtGgEZJhn/Uny/40+ZQ=; b=ywwrbvzjXIQMpaDX/ClHJ1TzlIVf93uexX8Tn/McJP3fvpa0OJr4pbwmybkbFrGfWO 1NImkT0pSZbn+NLPhMuP5/WLQM3zhHEOWSiC+MJEJeHTZvHXTH5qvSXKqnW7ndIsKhEP jfYMYhmJFTvy7f03qPRCYPqItir2xYRG/Wn2WVH9zdP/hWSR9FLBAsMeCzDLzs5Uf+uS jQSKdXmzY949KV4O3dnpr9I8MnqBn8Ay2Ph8ApKRyOh2yz7xGzTP970AQOIsYLpxPxdV wYcEPgXPL/+IAgLluHAhi81IEqtPGnKNc9wDR+UZTSx5gWuoZhYDdGICzgrajBHchuou 6n1Q== X-Received: by 10.140.232.68 with SMTP id d65mr19646725qhc.99.1428350337679; Mon, 06 Apr 2015 12:58:57 -0700 (PDT) Received: from htj.duckdns.org.lan (207-38-238-8.c3-0.wsd-ubr1.qens-wsd.ny.cable.rcn.com. [207.38.238.8]) by mx.google.com with ESMTPSA id z67sm3914418qgz.10.2015.04.06.12.58.56 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 06 Apr 2015 12:58:56 -0700 (PDT) From: Tejun Heo To: axboe@kernel.dk Cc: linux-kernel@vger.kernel.org, jack@suse.cz, hch@infradead.org, hannes@cmpxchg.org, linux-fsdevel@vger.kernel.org, vgoyal@redhat.com, lizefan@huawei.com, cgroups@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.cz, clm@fb.com, fengguang.wu@intel.com, david@fromorbit.com, gthelen@google.com, Tejun Heo Subject: [PATCH 04/49] blkcg: always create the blkcg_gq for the root blkcg Date: Mon, 6 Apr 2015 15:57:53 -0400 Message-Id: <1428350318-8215-5-git-send-email-tj@kernel.org> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1428350318-8215-1-git-send-email-tj@kernel.org> References: <1428350318-8215-1-git-send-email-tj@kernel.org> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently, blkcg does a minor optimization where the root blkcg is created when the first blkcg policy is activated on a queue and destroyed on the deactivation of the last. On systems where blkcg is configured but not used, this saves one blkcg_gq struct per queue. On systems where blkcg is actually used, there's no difference. The only case where this can lead to any meaninful, albeit still minute, save in memory consumption is when all blkcg policies are deactivated after being widely used in the system, which is a hihgly unlikely scenario. The conditional existence of root blkcg_gq has already created several bugs in blkcg and became an issue once again for the new per-cgroup wb_congested mechanism for cgroup writeback support leading to a NULL dereference when no blkcg policy is active. This is really not worth bothering with. This patch makes blkcg always allocate and link the root blkcg_gq and release it only on queue destruction. Signed-off-by: Tejun Heo Reported-by: Fengguang Wu --- block/blk-cgroup.c | 96 +++++++++++++++++++++++------------------------------- 1 file changed, 41 insertions(+), 55 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index c3226ce..2a4f77f 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -235,13 +235,8 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, blkg->online = true; spin_unlock(&blkcg->lock); - if (!ret) { - if (blkcg == &blkcg_root) { - q->root_blkg = blkg; - q->root_rl.blkg = blkg; - } + if (!ret) return blkg; - } /* @blkg failed fully initialized, use the usual release path */ blkg_put(blkg); @@ -340,15 +335,6 @@ static void blkg_destroy(struct blkcg_gq *blkg) rcu_assign_pointer(blkcg->blkg_hint, NULL); /* - * If root blkg is destroyed. Just clear the pointer since root_rl - * does not take reference on root blkg. - */ - if (blkcg == &blkcg_root) { - blkg->q->root_blkg = NULL; - blkg->q->root_rl.blkg = NULL; - } - - /* * Put the reference taken at the time of creation so that when all * queues are gone, group can be destroyed. */ @@ -855,9 +841,45 @@ done: */ int blkcg_init_queue(struct request_queue *q) { - might_sleep(); + struct blkcg_gq *new_blkg, *blkg; + bool preloaded; + int ret; + + new_blkg = blkg_alloc(&blkcg_root, q, GFP_KERNEL); + if (!new_blkg) + return -ENOMEM; + + preloaded = !radix_tree_preload(GFP_KERNEL); - return blk_throtl_init(q); + /* + * Make sure the root blkg exists and count the existing blkgs. As + * @q is bypassing at this point, blkg_lookup_create() can't be + * used. Open code insertion. + */ + rcu_read_lock(); + spin_lock_irq(q->queue_lock); + blkg = blkg_create(&blkcg_root, q, new_blkg); + spin_unlock_irq(q->queue_lock); + rcu_read_unlock(); + + if (preloaded) + radix_tree_preload_end(); + + if (IS_ERR(blkg)) { + kfree(new_blkg); + return PTR_ERR(blkg); + } + + q->root_blkg = blkg; + q->root_rl.blkg = blkg; + + ret = blk_throtl_init(q); + if (ret) { + spin_lock_irq(q->queue_lock); + blkg_destroy_all(q); + spin_unlock_irq(q->queue_lock); + } + return ret; } /** @@ -958,52 +980,20 @@ int blkcg_activate_policy(struct request_queue *q, const struct blkcg_policy *pol) { LIST_HEAD(pds); - struct blkcg_gq *blkg, *new_blkg; + struct blkcg_gq *blkg; struct blkg_policy_data *pd, *n; int cnt = 0, ret; - bool preloaded; if (blkcg_policy_enabled(q, pol)) return 0; - /* preallocations for root blkg */ - new_blkg = blkg_alloc(&blkcg_root, q, GFP_KERNEL); - if (!new_blkg) - return -ENOMEM; - + /* count and allocate policy_data for all existing blkgs */ blk_queue_bypass_start(q); - - preloaded = !radix_tree_preload(GFP_KERNEL); - - /* - * Make sure the root blkg exists and count the existing blkgs. As - * @q is bypassing at this point, blkg_lookup_create() can't be - * used. Open code it. - */ spin_lock_irq(q->queue_lock); - - rcu_read_lock(); - blkg = __blkg_lookup(&blkcg_root, q, false); - if (blkg) - blkg_free(new_blkg); - else - blkg = blkg_create(&blkcg_root, q, new_blkg); - rcu_read_unlock(); - - if (preloaded) - radix_tree_preload_end(); - - if (IS_ERR(blkg)) { - ret = PTR_ERR(blkg); - goto out_unlock; - } - list_for_each_entry(blkg, &q->blkg_list, q_node) cnt++; - spin_unlock_irq(q->queue_lock); - /* allocate policy_data for all existing blkgs */ while (cnt--) { pd = kzalloc_node(pol->pd_size, GFP_KERNEL, q->node); if (!pd) { @@ -1072,10 +1062,6 @@ void blkcg_deactivate_policy(struct request_queue *q, __clear_bit(pol->plid, q->blkcg_pols); - /* if no policy is left, no need for blkgs - shoot them down */ - if (bitmap_empty(q->blkcg_pols, BLKCG_MAX_POLS)) - blkg_destroy_all(q); - list_for_each_entry(blkg, &q->blkg_list, q_node) { /* grab blkcg lock too while removing @pd from @blkg */ spin_lock(&blkg->blkcg->lock);