From patchwork Fri May 16 17:09:59 2014
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 4194571
Date: Fri, 16 May 2014 13:09:59 -0400
From: Tejun Heo
To: Stephen Warren
Cc: "linux-tegra@vger.kernel.org", cgroups@vger.kernel.org,
 lizefan@huawei.com, linux-kernel@vger.kernel.org,
 ARM kernel mailing list
Subject: [PATCH v2 cgroup/for-3.16] cgroup: introduce CSS_NO_REF and skip refcnting on normal root csses
Message-ID: <20140516170959.GG5379@htj.dyndns.org>
References: <1399670015-23463-1-git-send-email-tj@kernel.org>
 <1399670015-23463-10-git-send-email-tj@kernel.org>
 <53751062.2050401@wwwdotorg.org> <53753823.2090402@wwwdotorg.org>
 <20140516143718.GA5379@htj.dyndns.org>
 <20140516154330.GB5379@htj.dyndns.org> <537643F9.1030303@wwwdotorg.org>
In-Reply-To: <537643F9.1030303@wwwdotorg.org>
User-Agent: Mutt/1.5.23 (2014-03-12)

9395a4500404 ("cgroup: enable refcnting for root csses") enabled
reference counting for root csses (cgroup_subsys_states) so that
cgroup's self csses can be used to manage the lifetime of the
containing cgroups.

Unfortunately, this change was incorrect. During early init,
cgrp_dfl_root self css refcnt is used. percpu_ref can't be
initialized during early init and its initialization is deferred till
cgroup_init() time. This means that cpu was using percpu_ref which
wasn't properly initialized. Due to the way percpu variables are laid
out on x86, this didn't blow up immediately on x86 but ended up
incrementing and decrementing the percpu variable at offset zero,
whatever it may be; however, on other archs, this caused a fault and
early boot failure.

As cgroup self csses for root cgroups of non-dfl hierarchies need
working refcounting, we can't revert 9395a4500404. This patch adds
CSS_NO_REF, which explicitly inhibits reference counting on the css,
and sets it on all normal (non-self) root csses and on the
cgrp_dfl_root self css.

v2: cgrp_dfl_root.self is the offending one. Set the flag on it.

Signed-off-by: Tejun Heo
Reported-by: Stephen Warren
Fixes: 9395a4500404 ("cgroup: enable refcnting for root csses")
Tested-by: Stephen Warren
---
Hello,

Can you try this one instead?

Thanks.

 include/linux/cgroup.h |   11 ++++++++---
 kernel/cgroup.c        |   11 +++++++++--
 2 files changed, 17 insertions(+), 5 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -77,6 +77,7 @@ struct cgroup_subsys_state {
 
 /* bits in struct cgroup_subsys_state flags field */
 enum {
+	CSS_NO_REF	= (1 << 0), /* no reference counting for this css */
 	CSS_ONLINE	= (1 << 1), /* between ->css_online() and ->css_offline() */
 };
 
@@ -88,7 +89,8 @@ enum {
  */
 static inline void css_get(struct cgroup_subsys_state *css)
 {
-	percpu_ref_get(&css->refcnt);
+	if (!(css->flags & CSS_NO_REF))
+		percpu_ref_get(&css->refcnt);
 }
 
 /**
@@ -103,7 +105,9 @@ static inline void css_get(struct cgroup
  */
 static inline bool css_tryget_online(struct cgroup_subsys_state *css)
 {
-	return percpu_ref_tryget_live(&css->refcnt);
+	if (!(css->flags & CSS_NO_REF))
+		return percpu_ref_tryget_live(&css->refcnt);
+	return true;
 }
 
 /**
@@ -114,7 +118,8 @@ static inline bool css_tryget_online(str
  */
 static inline void css_put(struct cgroup_subsys_state *css)
 {
-	percpu_ref_put(&css->refcnt);
+	if (!(css->flags & CSS_NO_REF))
+		percpu_ref_put(&css->refcnt);
 }
 
 /* bits in struct cgroup flags field */
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4593,11 +4593,17 @@ static void __init cgroup_init_subsys(st
 	/* We don't handle early failures gracefully */
 	BUG_ON(IS_ERR(css));
 	init_and_link_css(css, ss, &cgrp_dfl_root.cgrp);
+
+	/*
+	 * Root csses are never destroyed and we can't initialize
+	 * percpu_ref during early init. Disable refcnting.
+	 */
+	css->flags |= CSS_NO_REF;
+
 	if (early) {
 		/* allocation can't be done safely during early init */
 		css->id = 1;
 	} else {
-		BUG_ON(percpu_ref_init(&css->refcnt, css_release));
 		css->id = cgroup_idr_alloc(&ss->css_idr, css, 1, 2, GFP_KERNEL);
 		BUG_ON(css->id < 0);
 	}
@@ -4636,6 +4642,8 @@ int __init cgroup_init_early(void)
 	int i;
 
 	init_cgroup_root(&cgrp_dfl_root, &opts);
+	cgrp_dfl_root.cgrp.self.flags |= CSS_NO_REF;
+
 	RCU_INIT_POINTER(init_task.cgroups, &init_css_set);
 
 	for_each_subsys(ss, i) {
@@ -4684,7 +4692,6 @@ int __init cgroup_init(void)
 			struct cgroup_subsys_state *css =
 				init_css_set.subsys[ss->id];
 
-			BUG_ON(percpu_ref_init(&css->refcnt, css_release));
 			css->id = cgroup_idr_alloc(&ss->css_idr, css, 1, 2,
 						   GFP_KERNEL);
 			BUG_ON(css->id < 0);
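
For illustration only, here is a minimal userspace sketch of the guard this patch adds to css_get(), css_tryget_online() and css_put(). It is not kernel code. struct mock_ref, struct mock_css and the mock_ref_*() helpers are hypothetical stand-ins for percpu_ref and cgroup_subsys_state; only the CSS_NO_REF and CSS_ONLINE bit values and the shape of the three css helpers are taken from the patch. The mock counter sits behind a pointer that stays NULL until it is initialized, which roughly models a ref that cannot be set up during early init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for percpu_ref: the counter is reached through a
 * pointer that stays NULL until mock_ref_init() runs.
 */
struct mock_ref {
	long *count;
};

/* flag bits, same values as in the patch */
enum {
	CSS_NO_REF	= (1 << 0),	/* no reference counting for this css */
	CSS_ONLINE	= (1 << 1),
};

/* Hypothetical, heavily reduced stand-in for cgroup_subsys_state. */
struct mock_css {
	struct mock_ref refcnt;
	unsigned int flags;
};

/* Attach backing storage and take the initial reference. */
static void mock_ref_init(struct mock_ref *ref, long *storage)
{
	*storage = 1;
	ref->count = storage;
}

static void mock_ref_get(struct mock_ref *ref)
{
	assert(ref->count != NULL);	/* touching an uninitialized ref is the bug */
	++*ref->count;
}

/* Simplification: "live" here only means the count is still positive. */
static bool mock_ref_tryget_live(struct mock_ref *ref)
{
	assert(ref->count != NULL);
	if (*ref->count <= 0)
		return false;
	++*ref->count;
	return true;
}

static void mock_ref_put(struct mock_ref *ref)
{
	assert(ref->count != NULL);
	--*ref->count;
}

/* The three helpers below mirror the guarded versions in the patch. */
static void css_get(struct mock_css *css)
{
	if (!(css->flags & CSS_NO_REF))
		mock_ref_get(&css->refcnt);
}

static bool css_tryget_online(struct mock_css *css)
{
	if (!(css->flags & CSS_NO_REF))
		return mock_ref_tryget_live(&css->refcnt);
	return true;
}

static void css_put(struct mock_css *css)
{
	if (!(css->flags & CSS_NO_REF))
		mock_ref_put(&css->refcnt);
}
```

With CSS_NO_REF set, the helpers never dereference the ref, so a css whose ref was never initialized is safe to get and put. Without the flag, the counter moves as usual.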