From patchwork Wed Mar 1 19:34:03 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13156362
Date: Wed, 1 Mar 2023 11:34:03 -0800
Message-ID: <20230301193403.1507484-1-surenb@google.com>
Subject: [PATCH 1/1] psi: remove 500ms min window size limitation for triggers
From: Suren Baghdasaryan
To: tj@kernel.org
Cc: hannes@cmpxchg.org, lizefan.x@bytedance.com, peterz@infradead.org,
 johunt@akamai.com, mhocko@suse.com, keescook@chromium.org,
 quic_sudaraja@quicinc.com, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, surenb@google.com

Current 500ms min window size for psi triggers limits polling interval to 50ms to
prevent polling threads from using too much cpu bandwidth by polling too
frequently. However the number of cgroups with triggers is unlimited, so
this protection can be defeated by creating multiple cgroups with psi
triggers (triggers in each cgroup are served by a single "psimon" kernel
thread).
Instead of limiting min polling period, which also limits the latency of
psi events, it's better to limit psi trigger creation to authorized users
only, like we do for system-wide psi triggers (/proc/pressure/* files can
be written only by processes with CAP_SYS_RESOURCE capability). This also
makes access rules for cgroup psi files consistent with system-wide ones.
Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
remove the psi window min size limitation.

Suggested-by: Sudarshan Rajagopalan
Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
Signed-off-by: Suren Baghdasaryan
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
---
 kernel/cgroup/cgroup.c | 10 ++++++++++
 kernel/sched/psi.c     |  4 +---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 935e8121b21e..b600a6baaeca 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
 	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
 }
 
+static int cgroup_pressure_open(struct kernfs_open_file *of)
+{
+	return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
+		-EPERM : 0;
+}
+
 static void cgroup_pressure_release(struct kernfs_open_file *of)
 {
 	struct cgroup_file_ctx *ctx = of->priv;
@@ -5266,6 +5272,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "io.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_io_pressure_show,
 		.write = cgroup_io_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5274,6 +5281,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "memory.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_memory_pressure_show,
 		.write = cgroup_memory_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5282,6 +5290,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "cpu.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_cpu_pressure_show,
 		.write = cgroup_cpu_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5291,6 +5300,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "irq.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_irq_pressure_show,
 		.write = cgroup_irq_pressure_write,
 		.poll = cgroup_pressure_poll,
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 02e011cabe91..9c02b27052bb 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -160,7 +160,6 @@ __setup("psi=", setup_psi);
 #define EXP_300s	2034		/* 1/exp(2s/300s) */
 
 /* PSI trigger definitions */
-#define WINDOW_MIN_US 500000	/* Min window size is 500ms */
 #define WINDOW_MAX_US 10000000	/* Max window size is 10s */
 #define UPDATES_PER_WINDOW 10	/* 10 updates per window */
 
@@ -1278,8 +1277,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 	if (state >= PSI_NONIDLE)
 		return ERR_PTR(-EINVAL);
 
-	if (window_us < WINDOW_MIN_US ||
-	    window_us > WINDOW_MAX_US)
+	if (window_us <= 0 || window_us > WINDOW_MAX_US)
 		return ERR_PTR(-EINVAL);
 
 	/* Check threshold */
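
For illustration only, here is a rough userspace sketch of how a monitor might register a short-window trigger on a cgroup pressure file once this change is applied. The helper names psi_format_trigger() and psi_register_trigger() are hypothetical and not part of the patch. The "<some|full> <threshold_us> <window_us>" line follows the documented psi trigger format. The window bounds mirror the check in the hunk above, while the threshold check is an assumption about what follows the "Check threshold" comment. The file path (for example a memory.pressure file under a cgroup v2 mount) is also an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define WINDOW_MAX_US 10000000u /* mirrors the kernel's 10s upper bound */

/*
 * Hypothetical helper: build a trigger line such as "some 5000 50000".
 * Returns the string length, or a negative errno if the values would be
 * rejected (window of 0 or above 10s; threshold of 0 or above the window,
 * the latter being an assumed kernel-side rule).
 */
int psi_format_trigger(char *buf, size_t len, const char *kind,
                       unsigned int threshold_us, unsigned int window_us)
{
    int n;

    assert(buf != NULL && kind != NULL);

    if (strcmp(kind, "some") != 0 && strcmp(kind, "full") != 0)
        return -EINVAL;
    if (window_us == 0 || window_us > WINDOW_MAX_US)
        return -EINVAL;
    if (threshold_us == 0 || threshold_us > window_us)
        return -EINVAL;

    n = snprintf(buf, len, "%s %u %u", kind, threshold_us, window_us);
    if (n < 0 || (size_t)n >= len)
        return -ENOSPC;
    return n;
}

/*
 * Hypothetical helper: open a pressure file for writing and register the
 * trigger. Returns a file descriptor that can be polled for POLLPRI, or a
 * negative errno. With the patch applied, the open itself is expected to
 * fail with EPERM when the caller lacks CAP_SYS_RESOURCE.
 */
int psi_register_trigger(const char *path, const char *kind,
                         unsigned int threshold_us, unsigned int window_us)
{
    char line[64];
    int fd, n;

    n = psi_format_trigger(line, sizeof(line), kind, threshold_us, window_us);
    if (n < 0)
        return n;

    fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -errno;

    /* The trigger string is written including its terminating NUL. */
    if (write(fd, line, (size_t)n + 1) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}
```

A caller would then poll() the returned descriptor for POLLPRI. With a 50000us window and UPDATES_PER_WINDOW at 10, the polling period would work out to about 5ms, which is the kind of latency the old 500ms minimum ruled out.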