From patchwork Tue Mar 6 23:46:55 2018
X-Patchwork-Submitter: Matt Roper
X-Patchwork-Id: 10263123
From: Matt Roper
To: dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, cgroups@vger.kernel.org
Cc: Tejun Heo
Subject: [PATCH v3 1/6] cgroup: Allow registration and lookup of cgroup private data
Date: Tue, 6 Mar 2018 15:46:55 -0800
Message-Id: <20180306234700.6562-2-matthew.d.roper@intel.com>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180306234700.6562-1-matthew.d.roper@intel.com>
References: <20180306234700.6562-1-matthew.d.roper@intel.com>

There are cases where other parts of the kernel may wish to store data
associated with individual cgroups without building a full cgroup
controller.  Let's add interfaces to allow them to register and look up
this private data for individual cgroups.

A kernel system (e.g., a driver) that wishes to register private data for
a cgroup will do so by subclassing the 'struct cgroup_priv' structure to
describe the necessary data to store.  Before registering a private data
structure to a cgroup, the caller should fill in the 'key' and 'free'
fields of the base cgroup_priv structure.

 * 'key' should be a unique void* that will act as a key for future
   privdata lookups/removals.  Note that this allows drivers to store
   per-device private data for a cgroup by using a device pointer as a
   key.

 * 'free' should be a function pointer to a function that may be used to
   destroy the private data.  This function will be called automatically
   if the underlying cgroup is destroyed.
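
The subclassing pattern described above can be sketched in plain userspace C.
This is only a model under stated assumptions, not the kernel code: 'struct
cgroup' here is a toy stand-in whose table is an array of singly linked
buckets instead of the kernel hashtable, there is no privdata_mutex, and the
names bucket_of, model_priv_install, model_priv_lookup, struct
drv_cgroup_data, to_drv_data, drv_data_free and drv_get_data are all
hypothetical. It shows a consumer embedding the base structure, filling in
'key' and 'free', and checking for an existing entry before installing a new
one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 16 /* same bucket count as DECLARE_HASHTABLE(privdata, 4) */

struct cgroup;

/* Userspace model of the base structure; 'next' stands in for hlist_node. */
struct cgroup_priv {
	struct cgroup *cgroup;
	const void *key;
	void (*free)(struct cgroup_priv *priv);
	struct cgroup_priv *next;
};

/* Toy cgroup (assumption): only the private-data table, no locking. */
struct cgroup {
	struct cgroup_priv *privdata[NBUCKETS];
};

/* Hypothetical hash: drop the low pointer bits, then pick a bucket. */
static size_t bucket_of(const void *key)
{
	return ((uintptr_t)key >> 4) % NBUCKETS;
}

static void model_priv_install(struct cgroup *cgrp, struct cgroup_priv *priv)
{
	size_t b = bucket_of(priv->key);

	/* The patch uses WARN_ON for these checks; the model asserts. */
	assert(priv->key && priv->free && !priv->cgroup);
	priv->cgroup = cgrp;
	priv->next = cgrp->privdata[b];
	cgrp->privdata[b] = priv;
}

static struct cgroup_priv *model_priv_lookup(struct cgroup *cgrp,
					     const void *key)
{
	struct cgroup_priv *p;

	for (p = cgrp->privdata[bucket_of(key)]; p; p = p->next)
		if (p->key == key)
			return p;
	return NULL;
}

/* Hypothetical driver data that "subclasses" cgroup_priv by embedding it. */
struct drv_cgroup_data {
	struct cgroup_priv base;
	int priority;
};

/* container_of-style conversion from the base back to the driver data. */
static struct drv_cgroup_data *to_drv_data(struct cgroup_priv *priv)
{
	return (struct drv_cgroup_data *)
		((char *)priv - offsetof(struct drv_cgroup_data, base));
}

static void drv_data_free(struct cgroup_priv *priv)
{
	free(to_drv_data(priv));
}

/* Return the data keyed by 'dev', creating and installing it if absent. */
static struct drv_cgroup_data *drv_get_data(struct cgroup *cgrp,
					    const void *dev)
{
	struct cgroup_priv *priv = model_priv_lookup(cgrp, dev);
	struct drv_cgroup_data *data;

	if (priv)
		return to_drv_data(priv);

	data = calloc(1, sizeof(*data));
	if (!data)
		return NULL;
	data->base.key = dev;		/* e.g. a device pointer */
	data->base.free = drv_data_free;
	model_priv_install(cgrp, &data->base);
	return data;
}
```

In this model a second drv_get_data() call with the same key returns the
entry installed by the first call, so no key is ever registered twice.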

Cc: Tejun Heo
Cc: cgroups@vger.kernel.org
Signed-off-by: Matt Roper
---
 include/linux/cgroup-defs.h | 38 ++++++++++++++++++++++
 include/linux/cgroup.h      | 78 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/cgroup/cgroup.c      | 14 ++++++++
 3 files changed, 130 insertions(+)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 9f242b876fde..17c679a7b5de 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -8,6 +8,7 @@
 #ifndef _LINUX_CGROUP_DEFS_H
 #define _LINUX_CGROUP_DEFS_H
 
+#include
 #include
 #include
 #include
@@ -307,6 +308,36 @@ struct cgroup_stat {
 	struct prev_cputime prev_cputime;
 };
 
+/*
+ * Private data associated with a cgroup by an independent (non-controller) part
+ * of the kernel.  This is useful for things like drivers that may wish to track
+ * their own cgroup-specific data.
+ *
+ * If an individual cgroup is destroyed, the cgroups framework will
+ * automatically free all associated private data.  If cgroup private data is
+ * registered by a kernel module, then it is the module's responsibility to
+ * manually free its own private data upon unload.
+ */
+struct cgroup_priv {
+	/* cgroup this private data is associated with */
+	struct cgroup *cgroup;
+
+	/*
+	 * Lookup key that defines the in-kernel consumer of this private
+	 * data.
+	 */
+	const void *key;
+
+	/*
+	 * Function to release private data.  This will be automatically called
+	 * if/when the cgroup is destroyed.
+	 */
+	void (*free)(struct cgroup_priv *priv);
+
+	/* Hashlist node in cgroup's privdata hashtable */
+	struct hlist_node hnode;
+};
+
 struct cgroup {
 	/* self css with NULL ->ss, points back to this cgroup */
 	struct cgroup_subsys_state self;
@@ -427,6 +458,13 @@ struct cgroup {
 	/* used to store eBPF programs */
 	struct cgroup_bpf bpf;
 
+	/*
+	 * cgroup private data registered by other non-controller parts of the
+	 * kernel
+	 */
+	DECLARE_HASHTABLE(privdata, 4);
+	struct mutex privdata_mutex;
+
 	/* ids of the ancestors at each level including self */
 	int ancestor_ids[];
 };
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 473e0c0abb86..a3604b005417 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -833,4 +833,82 @@ static inline void put_cgroup_ns(struct cgroup_namespace *ns)
 		free_cgroup_ns(ns);
 }
 
+/**
+ * cgroup_priv_install - install new cgroup private data
+ * @key: Key uniquely identifying kernel owner of private data
+ *
+ * Allows non-controller kernel subsystems to register their own private data
+ * associated with a cgroup.  This will often be used by drivers which wish to
+ * track their own per-cgroup data without building a full cgroup controller.
+ *
+ * Callers should ensure that no existing private data exists for the given key
+ * before adding new private data.  If two sets of private data are registered
+ * with the same key, it is undefined which will be returned by future calls
+ * to cgroup_priv_lookup.
+ *
+ * Kernel modules that register private data with this function should take
+ * care to free their private data when unloaded to prevent leaks.
+ */
+static inline void
+cgroup_priv_install(struct cgroup *cgrp,
+		    struct cgroup_priv *priv)
+{
+	WARN_ON(!mutex_is_locked(&cgrp->privdata_mutex));
+	WARN_ON(!priv->key);
+	WARN_ON(!priv->free);
+	WARN_ON(priv->cgroup);
+
+	priv->cgroup = cgrp;
+	hash_add(cgrp->privdata, &priv->hnode,
+		 (unsigned long)priv->key);
+}
+
+/**
+ * cgroup_priv_lookup - looks up cgroup private data
+ * @key: Key uniquely identifying owner of private data to lookup
+ *
+ * Looks up the private data associated with a key.
+ *
+ * Returns:
+ * Previously registered cgroup private data associated with the given key, or
+ * NULL if no private data has been registered.
+ */
+static inline struct cgroup_priv *
+cgroup_priv_lookup(struct cgroup *cgrp,
+		   const void *key)
+{
+	struct cgroup_priv *priv;
+
+	WARN_ON(!mutex_is_locked(&cgrp->privdata_mutex));
+
+	hash_for_each_possible(cgrp->privdata, priv, hnode,
+			       (unsigned long)key)
+		if (priv->key == key)
+			return priv;
+
+	return NULL;
+}
+
+/**
+ * cgroup_priv_free - free cgroup private data
+ * @key: Key uniquely identifying owner of private data to free
+ */
+static inline void
+cgroup_priv_free(struct cgroup *cgrp, const void *key)
+{
+	struct cgroup_priv *priv;
+	struct hlist_node *tmp;
+
+	mutex_lock(&cgrp->privdata_mutex);
+
+	hash_for_each_possible_safe(cgrp->privdata, priv, tmp, hnode,
+				    (unsigned long)key) {
+		hash_del(&priv->hnode);
+		if (priv->key == key && !WARN_ON(priv->free == NULL))
+			priv->free(priv);
+	}
+
+	mutex_unlock(&cgrp->privdata_mutex);
+}
+
 #endif /* _LINUX_CGROUP_H */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 8cda3bc3ae22..9e576dc8b566 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1839,6 +1839,8 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	INIT_LIST_HEAD(&cgrp->cset_links);
 	INIT_LIST_HEAD(&cgrp->pidlists);
 	mutex_init(&cgrp->pidlist_mutex);
+	hash_init(cgrp->privdata);
+	mutex_init(&cgrp->privdata_mutex);
 	cgrp->self.cgroup = cgrp;
 	cgrp->self.flags |= CSS_ONLINE;
 	cgrp->dom_cgrp = cgrp;
@@ -4578,6 +4580,9 @@ static void css_release_work_fn(struct work_struct *work)
 		container_of(work, struct cgroup_subsys_state, destroy_work);
 	struct cgroup_subsys *ss = css->ss;
 	struct cgroup *cgrp = css->cgroup;
+	struct cgroup_priv *priv;
+	struct hlist_node *tmp;
+	int i;
 
 	mutex_lock(&cgroup_mutex);
 
@@ -4617,6 +4622,15 @@ static void css_release_work_fn(struct work_struct *work)
 					 NULL);
 
 		cgroup_bpf_put(cgrp);
+
+		/* Any private data must be released automatically */
+		mutex_lock(&cgrp->privdata_mutex);
+		hash_for_each_safe(cgrp->privdata, i, tmp, priv, hnode) {
+			hash_del(&priv->hnode);
+			if (!WARN_ON(!priv->free))
+				priv->free(priv);
+		}
+		mutex_unlock(&cgrp->privdata_mutex);
 	}
 
 	mutex_unlock(&cgroup_mutex);
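
The two release paths in this patch (per-key removal in cgroup_priv_free and
release of every entry in css_release_work_fn) can be modelled in userspace C.
The sketch below is an illustration under assumptions, not the kernel code:
buckets are plain singly linked lists, there is no locking, and struct priv,
struct table, bucket_of, table_add, table_free_key, table_free_all and
count_free are hypothetical names. Because one bucket may hold entries
registered under different keys, the per-key path in this model checks the key
before unlinking an entry. That differs from the loop in cgroup_priv_free
above, which calls hash_del() on every entry in the bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16 /* same bucket count as DECLARE_HASHTABLE(privdata, 4) */

/* Userspace model of one private-data entry. */
struct priv {
	const void *key;
	void (*free)(struct priv *p);
	struct priv *next;
};

struct table {
	struct priv *buckets[NBUCKETS];
};

/* Counts release callbacks so the behaviour can be observed. */
static int free_calls;

static void count_free(struct priv *p)
{
	(void)p;
	free_calls++;
}

/* Hypothetical hash: drop the low pointer bits, then pick a bucket. */
static size_t bucket_of(const void *key)
{
	return ((uintptr_t)key >> 4) % NBUCKETS;
}

static void table_add(struct table *t, struct priv *p)
{
	size_t b = bucket_of(p->key);

	p->next = t->buckets[b];
	t->buckets[b] = p;
}

/* Per-key release: unlink and free only entries whose key matches. */
static int table_free_key(struct table *t, const void *key)
{
	struct priv **link = &t->buckets[bucket_of(key)];
	int freed = 0;

	while (*link) {
		struct priv *p = *link;

		if (p->key != key) {
			link = &p->next;	/* leave other owners alone */
			continue;
		}
		*link = p->next;	/* unlink first: free() may release p */
		p->free(p);
		freed++;
	}
	return freed;
}

/* Teardown: release every remaining entry exactly once. */
static int table_free_all(struct table *t)
{
	int freed = 0;

	for (size_t b = 0; b < NBUCKETS; b++) {
		struct priv *p = t->buckets[b];

		t->buckets[b] = NULL;
		while (p) {
			struct priv *next = p->next; /* saved before free() */

			p->free(p);
			freed++;
			p = next;
		}
	}
	return freed;
}
```

Saving the next pointer before invoking the callback plays the same role as
the _safe iterator variants used in the patch, since the callback is allowed
to release the entry.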