From patchwork Mon Jun 11 19:29:51 2018
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 10458685
From: Shakeel Butt <shakeelb@google.com>
To: Michal Hocko, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Greg Thelen, Johannes Weiner,
	Vladimir Davydov, Tejun Heo
Cc: Linux MM, Cgroups, LKML, Shakeel Butt
Subject: [PATCH v4] mm: fix race between kmem_cache destroy, create and deactivate
Date: Mon, 11 Jun 2018 12:29:51 -0700
Message-Id: <20180611192951.195727-1-shakeelb@google.com>

Memcg kmem cache creation and (SLUB-only) deactivation are asynchronous.
If a root kmem cache is destroyed while its memcg cache is still being
created or deactivated, the kernel may crash. Example of one such crash:

	general protection fault: 0000 [#1] SMP PTI
	CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp
	...
	Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn
	RIP: 0010:has_cpu_slab
	...
	Call Trace:
	 ? on_each_cpu_cond
	 __kmem_cache_shrink
	 kmemcg_cache_deact_after_rcu
	 kmemcg_deactivate_workfn
	 process_one_work
	 worker_thread
	 kthread
	 ret_from_fork+0x35/0x40

To fix this race, on root kmem cache destruction, mark the cache as
dying and flush the workqueue used for memcg kmem cache creation and
deactivation. SLUB's memcg kmem cache deactivation also goes through an
RCU callback, so make sure all previously registered RCU callbacks have
completed as well.
Signed-off-by: Shakeel Butt
Acked-by: Vladimir Davydov
---
Changelog since v3:
- Handle the RCU callbacks for SLUB deactivation

Changelog since v2:
- Rewrote the patch and used workqueue flushing instead of refcount

Changelog since v1:
- Added more documentation to the code
- Renamed fields to be more readable
---
 include/linux/slab.h |  1 +
 mm/slab_common.c     | 33 ++++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 9ebe659bd4a5..71c5467d99c1 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -658,6 +658,7 @@ struct memcg_cache_params {
 			struct memcg_cache_array __rcu *memcg_caches;
 			struct list_head __root_caches_node;
 			struct list_head children;
+			bool dying;
 		};
 		struct {
 			struct mem_cgroup *memcg;
diff --git a/mm/slab_common.c b/mm/slab_common.c
index b0dd9db1eb2f..890b1f04a03a 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -136,6 +136,7 @@ void slab_init_memcg_params(struct kmem_cache *s)
 	s->memcg_params.root_cache = NULL;
 	RCU_INIT_POINTER(s->memcg_params.memcg_caches, NULL);
 	INIT_LIST_HEAD(&s->memcg_params.children);
+	s->memcg_params.dying = false;
 }
 
 static int init_memcg_params(struct kmem_cache *s,
@@ -608,7 +609,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	 * The memory cgroup could have been offlined while the cache
 	 * creation work was pending.
	 */
-	if (memcg->kmem_state != KMEM_ONLINE)
+	if (memcg->kmem_state != KMEM_ONLINE || root_cache->memcg_params.dying)
 		goto out_unlock;
 
 	idx = memcg_cache_id(memcg);
@@ -712,6 +713,9 @@ void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
 	    WARN_ON_ONCE(s->memcg_params.deact_fn))
 		return;
 
+	if (s->memcg_params.root_cache->memcg_params.dying)
+		return;
+
 	/* pin memcg so that @s doesn't get destroyed in the middle */
 	css_get(&s->memcg_params.memcg->css);
 
@@ -823,11 +827,36 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
 		return -EBUSY;
 	return 0;
 }
+
+static void flush_memcg_workqueue(struct kmem_cache *s)
+{
+	mutex_lock(&slab_mutex);
+	s->memcg_params.dying = true;
+	mutex_unlock(&slab_mutex);
+
+	/*
+	 * SLUB deactivates the kmem_caches through call_rcu_sched. Make
+	 * sure all registered rcu callbacks have been invoked.
+	 */
+	if (IS_ENABLED(CONFIG_SLUB))
+		rcu_barrier_sched();
+
+	/*
+	 * SLAB and SLUB create memcg kmem_caches through workqueue and SLUB
+	 * deactivates the memcg kmem_caches through workqueue. Make sure all
+	 * previous workitems on workqueue are processed.
+	 */
+	flush_workqueue(memcg_kmem_cache_wq);
+}
 #else
 static inline int shutdown_memcg_caches(struct kmem_cache *s)
 {
 	return 0;
 }
+
+static inline void flush_memcg_workqueue(struct kmem_cache *s)
+{
+}
 #endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
@@ -845,6 +874,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	if (unlikely(!s))
 		return;
 
+	flush_memcg_workqueue(s);
+
 	get_online_cpus();
 	get_online_mems();