From patchwork Thu Dec 15 03:31:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 13073877
From: Waiman Long
To: Jens Axboe, Tejun Heo, Josef Bacik, Zefan Li, Johannes Weiner,
 Andrew Morton
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Koutný,
 "Dennis Zhou (Facebook)", Waiman Long, Yi Zhang
Subject: [PATCH v4 1/2] bdi, blk-cgroup: Fix potential UAF of blkcg
Date: Wed, 14 Dec 2022 22:31:31 -0500
Message-Id: <20221215033132.230023-2-longman@redhat.com>
In-Reply-To: <20221215033132.230023-1-longman@redhat.com>
References: <20221215033132.230023-1-longman@redhat.com>

Commit 59b57717fff8 ("blkcg: delay blkg destruction until after
writeback has finished") delayed the call to blkcg_destroy_blkgs() to
cgwb_release_workfn(). However, the call is made after a css_put() of
the blkcg, which may be the final put that causes the blkcg to be freed,
as the RCU read lock isn't held.

Another place where blkcg_destroy_blkgs() can be called indirectly via
blkcg_unpin_online() is the offline_css() function called from
css_killed_work_fn(). There, the potentially final css_put() call is
issued after offline_css().

By adding a css_tryget() into blkcg_destroy_blkgs() and warning on its
failure, the following stack trace was produced in a test system on
bootup.

[ 34.254240] RIP: 0010:blkcg_destroy_blkgs+0x16a/0x1a0
     :
[ 34.339943] Call Trace:
[ 34.344510] blkcg_unpin_online+0x38/0x60
[ 34.348523] cgwb_release_workfn+0x6a/0x200
[ 34.352708] process_one_work+0x1e5/0x3b0
[ 34.360758] worker_thread+0x50/0x3a0
[ 34.368447] kthread+0xd9/0x100
[ 34.376386] ret_from_fork+0x22/0x30

This confirms that a potential UAF situation can really happen in
cgwb_release_workfn().

Fix that by delaying the css_put() until after the blkcg_unpin_online()
call. Also use percpu_ref_is_zero() in blkcg_destroy_blkgs() and issue
a warning if reference count is zero.

The reproducing system can no longer produce a warning with this patch.
All the runnable block/0* tests including block/027 were run successfully
without failure.

Fixes: 59b57717fff8 ("blkcg: delay blkg destruction until after writeback has finished")
Suggested-by: Michal Koutný
Reported-by: Yi Zhang
Signed-off-by: Waiman Long
Acked-by: Tejun Heo
---
 block/blk-cgroup.c | 7 +++++++
 mm/backing-dev.c   | 8 ++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1bb939d3b793..ca28306aa1b1 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1084,6 +1084,13 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
  */
 static void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
+	/*
+	 * blkcg_destroy_blkgs() shouldn't be called with all the blkcg
+	 * references gone.
+	 */
+	if (WARN_ON_ONCE(percpu_ref_is_zero(&blkcg->css.refcnt)))
+		return;
+
 	might_sleep();
 
 	spin_lock_irq(&blkcg->lock);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c30419a5e119..36f75b072325 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -390,11 +390,15 @@ static void cgwb_release_workfn(struct work_struct *work)
 	wb_shutdown(wb);
 
 	css_put(wb->memcg_css);
-	css_put(wb->blkcg_css);
 	mutex_unlock(&wb->bdi->cgwb_release_mutex);
 
-	/* triggers blkg destruction if no online users left */
+	/*
+	 * Triggers blkg destruction if no online users left
+	 * The final blkcg css_put() has to be done after blkcg_unpin_online()
+	 * to avoid use-after-free.
+	 */
 	blkcg_unpin_online(wb->blkcg_css);
+	css_put(wb->blkcg_css);
 
 	fprop_local_destroy_percpu(&wb->memcg_completions);

From patchwork Thu Dec 15 03:31:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 13073876
From: Waiman Long
To: Jens Axboe, Tejun Heo, Josef Bacik, Zefan Li, Johannes Weiner,
 Andrew Morton
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Koutný,
 "Dennis Zhou (Facebook)", Waiman Long
Subject: [PATCH v4 2/2] blk-cgroup: Flush stats at blkgs destruction path
Date: Wed, 14 Dec 2022 22:31:32 -0500
Message-Id: <20221215033132.230023-3-longman@redhat.com>
In-Reply-To: <20221215033132.230023-1-longman@redhat.com>
References: <20221215033132.230023-1-longman@redhat.com>

As noted by Michal, the blkg_iostat_set's in the lockless list hold
references to blkgs to protect against their removal. Those blkgs, in
turn, hold references to the blkcg. When a cgroup is being destroyed,
cgroup_rstat_flush() is only called from css_release_work_fn(), which
runs when the blkcg reference count reaches 0. This circular dependency
prevents the blkcg and some blkgs from being freed after they are made
offline.

It is less of a problem if the cgroup to be destroyed also has other
controllers, like memory, that call cgroup_rstat_flush() and so clean
up the reference count. If block is the only controller that uses
rstat, these offline blkcg and blkgs may never be freed, leaking more
and more memory over time.

To prevent this potential memory leak, a new cgroup_rstat_css_cpu_flush()
function is added to flush stats for a given css and cpu.
This new function is called in the blkg destruction path,
blkcg_destroy_blkgs(), whenever there are still pending stats to be
flushed. This releases the references to the blkgs, allowing them to be
freed and, indirectly, allowing the blkcg to be freed.

Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
Signed-off-by: Waiman Long
Acked-by: Tejun Heo
---
 block/blk-cgroup.c     | 16 ++++++++++++++++
 include/linux/cgroup.h |  1 +
 kernel/cgroup/rstat.c  | 18 ++++++++++++++++++
 3 files changed, 35 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index ca28306aa1b1..a2a1081d9d1d 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1084,6 +1084,8 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
  */
 static void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
+	int cpu;
+
 	/*
 	 * blkcg_destroy_blkgs() shouldn't be called with all the blkcg
 	 * references gone.
@@ -1093,6 +1095,20 @@ static void blkcg_destroy_blkgs(struct blkcg *blkcg)
 
 	might_sleep();
 
+	/*
+	 * Flush all the non-empty percpu lockless lists to release the
+	 * blkg references held by those lists which, in turn, will
+	 * allow those blkgs to be freed and release their references to
+	 * blkcg. Otherwise, they may not be freed at all because of this
+	 * circular dependency resulting in memory leak.
+	 */
+	for_each_possible_cpu(cpu) {
+		struct llist_head *lhead = per_cpu_ptr(blkcg->lhead, cpu);
+
+		if (!llist_empty(lhead))
+			cgroup_rstat_css_cpu_flush(&blkcg->css, cpu);
+	}
+
 	spin_lock_irq(&blkcg->lock);
 
 	while (!hlist_empty(&blkcg->blkg_list)) {
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 528bd44b59e2..6c4e66b3fa84 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -766,6 +766,7 @@ void cgroup_rstat_flush(struct cgroup *cgrp);
 void cgroup_rstat_flush_irqsafe(struct cgroup *cgrp);
 void cgroup_rstat_flush_hold(struct cgroup *cgrp);
 void cgroup_rstat_flush_release(void);
+void cgroup_rstat_css_cpu_flush(struct cgroup_subsys_state *css, int cpu);
 
 /*
  * Basic resource stats.
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 793ecff29038..2e44be44351f 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -281,6 +281,24 @@ void cgroup_rstat_flush_release(void)
 	spin_unlock_irq(&cgroup_rstat_lock);
 }
 
+/**
+ * cgroup_rstat_css_cpu_flush - flush stats for the given css and cpu
+ * @css: target css to be flushed
+ * @cpu: the cpu that holds the stats to be flushed
+ *
+ * A lightweight rstat flush operation for a given css and cpu.
+ * Only the cpu_lock is being held for mutual exclusion, the cgroup_rstat_lock
+ * isn't used.
+ */
+void cgroup_rstat_css_cpu_flush(struct cgroup_subsys_state *css, int cpu)
+{
+	raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu);
+
+	raw_spin_lock_irq(cpu_lock);
+	css->ss->css_rstat_flush(css, cpu);
+	raw_spin_unlock_irq(cpu_lock);
+}
+
 int cgroup_rstat_init(struct cgroup *cgrp)
 {
 	int cpu;
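
Illustration only, not part of the patch: the reference cycle that the
2/2 commit message describes can be modelled in a few lines of userspace
C. This is a minimal sketch under stated assumptions. Every name in it
(model_blkcg, model_blkg, model_stat, MODEL_NR_CPUS, model_teardown) is
hypothetical, plain integers stand in for percpu_ref, singly linked
lists stand in for the per-cpu llists, and there is no locking or
concurrency. It only shows why a pending stat entry keeps the blkcg
pinned unless the lists are flushed in the destruction path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

struct model_blkcg;

struct model_blkg {
	int refcnt;
	struct model_blkcg *blkcg;	/* each blkg pins its blkcg */
};

struct model_stat {
	struct model_blkg *blkg;	/* each pending entry pins its blkg */
	struct model_stat *next;
};

struct model_blkcg {
	int refcnt;
	bool released;	/* set where the real code would free the blkcg */
	struct model_stat *pending[MODEL_NR_CPUS];
};

static void model_blkcg_put(struct model_blkcg *blkcg)
{
	assert(blkcg->refcnt > 0);
	if (--blkcg->refcnt == 0)
		blkcg->released = true;
}

static void model_blkg_put(struct model_blkg *blkg)
{
	assert(blkg->refcnt > 0);
	if (--blkg->refcnt == 0) {
		model_blkcg_put(blkg->blkcg);	/* drop the blkg's pin */
		free(blkg);
	}
}

static struct model_blkg *model_blkg_create(struct model_blkcg *blkcg)
{
	struct model_blkg *blkg = calloc(1, sizeof(*blkg));

	if (!blkg)
		abort();
	blkg->refcnt = 1;
	blkg->blkcg = blkcg;
	blkcg->refcnt++;
	return blkg;
}

/* Queue a pending stat update on @cpu; the entry takes a blkg reference. */
static void model_stat_queue(struct model_blkg *blkg, int cpu)
{
	struct model_stat *st = calloc(1, sizeof(*st));

	if (!st)
		abort();
	st->blkg = blkg;
	blkg->refcnt++;
	st->next = blkg->blkcg->pending[cpu];
	blkg->blkcg->pending[cpu] = st;
}

/* Drain one per-cpu list, dropping the blkg reference of each entry. */
static void model_flush_cpu(struct model_blkcg *blkcg, int cpu)
{
	struct model_stat *st = blkcg->pending[cpu];

	blkcg->pending[cpu] = NULL;
	while (st) {
		struct model_stat *next = st->next;

		model_blkg_put(st->blkg);
		free(st);
		st = next;
	}
}

/* Tear down one blkcg with one blkg; report whether the blkcg was released. */
bool model_teardown(bool flush_on_destroy)
{
	struct model_blkcg blkcg = { .refcnt = 1 };
	struct model_blkg *blkg = model_blkg_create(&blkcg);
	bool released;
	int cpu;

	model_stat_queue(blkg, 2);	/* one un-flushed update on cpu 2 */

	if (flush_on_destroy) {
		for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			if (blkcg.pending[cpu])
				model_flush_cpu(&blkcg, cpu);
	}

	model_blkg_put(blkg);		/* destroy path drops the base blkg ref */
	model_blkcg_put(&blkcg);	/* cgroup removal drops the base blkcg ref */
	released = blkcg.released;

	/* Clean up so the model itself does not leak. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_flush_cpu(&blkcg, cpu);
	return released;
}
```

In this model, model_teardown(false) reports that the blkcg is still
pinned after both base references are dropped, because the pending entry
on cpu 2 still holds the blkg. model_teardown(true) drains the non-empty
lists first and reports the blkcg as released.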