From patchwork Fri Mar 3 01:13:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13158188
Date: Thu, 2 Mar 2023 17:13:46 -0800
X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog
Message-ID: <20230303011346.3342233-1-surenb@google.com>
Subject: [PATCH v2 1/1] psi: remove 500ms min window size limitation for triggers
From: Suren Baghdasaryan
To: tj@kernel.org
Cc: hannes@cmpxchg.org, lizefan.x@bytedance.com, peterz@infradead.org,
 johunt@akamai.com, mhocko@suse.com, keescook@chromium.org,
 quic_sudaraja@quicinc.com, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, surenb@google.com

Current 500ms min window size for psi triggers limits polling interval
to 50ms to prevent polling threads from using too much cpu bandwidth by
polling too frequently. However the number of cgroups with triggers is
unlimited, so this protection can be defeated by creating multiple
cgroups with psi triggers (triggers in each cgroup are served by a single
"psimon" kernel thread).
Instead of limiting min polling period, which also limits the latency of
psi events, it's better to limit psi trigger creation to authorized users
only, like we do for system-wide psi triggers (/proc/pressure/* files can
be written only by processes with CAP_SYS_RESOURCE capability). This also
makes access rules for cgroup psi files consistent with system-wide ones.
Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
remove the psi window min size limitation.

Suggested-by: Sudarshan Rajagopalan
Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
Signed-off-by: Suren Baghdasaryan
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
---
 kernel/cgroup/cgroup.c | 10 ++++++++++
 kernel/sched/psi.c     |  4 +---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 935e8121b21e..b600a6baaeca 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
 	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
 }
 
+static int cgroup_pressure_open(struct kernfs_open_file *of)
+{
+	return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
+		-EPERM : 0;
+}
+
 static void cgroup_pressure_release(struct kernfs_open_file *of)
 {
 	struct cgroup_file_ctx *ctx = of->priv;
@@ -5266,6 +5272,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "io.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_io_pressure_show,
 		.write = cgroup_io_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5274,6 +5281,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "memory.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_memory_pressure_show,
 		.write = cgroup_memory_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5282,6 +5290,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "cpu.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_cpu_pressure_show,
 		.write = cgroup_cpu_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5291,6 +5300,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "irq.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_irq_pressure_show,
 		.write = cgroup_irq_pressure_write,
 		.poll = cgroup_pressure_poll,
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 02e011cabe91..0945f956bf80 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -160,7 +160,6 @@ __setup("psi=", setup_psi);
 #define EXP_300s	2034		/* 1/exp(2s/300s) */
 
 /* PSI trigger definitions */
-#define WINDOW_MIN_US 500000	/* Min window size is 500ms */
 #define WINDOW_MAX_US 10000000	/* Max window size is 10s */
 #define UPDATES_PER_WINDOW 10	/* 10 updates per window */
 
@@ -1278,8 +1277,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 	if (state >= PSI_NONIDLE)
 		return ERR_PTR(-EINVAL);
 
-	if (window_us < WINDOW_MIN_US ||
-		window_us > WINDOW_MAX_US)
+	if (window_us == 0 || window_us > WINDOW_MAX_US)
 		return ERR_PTR(-EINVAL);
 
 	/* Check threshold */
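
For illustration only, and not part of the patch: a minimal userspace sketch
of how a trigger with a sub-500ms window might be registered once this change
is applied. The helper names psi_format_trigger() and psi_register_trigger()
are hypothetical. The window check mirrors the hunk above; the threshold
check (non-zero and no larger than the window) and the "some"/"full" prefix
are assumptions about what psi_trigger_create() accepts. With
UPDATES_PER_WINDOW at 10, a 100ms window would imply a 10ms polling interval.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Same value as WINDOW_MAX_US in kernel/sched/psi.c (10s). */
#define WINDOW_MAX_US 10000000u

/*
 * Hypothetical helper: build "<some|full> <threshold_us> <window_us>".
 * Returns the string length, or a negative errno value.
 */
static int psi_format_trigger(char *buf, size_t len, const char *kind,
			      unsigned int threshold_us,
			      unsigned int window_us)
{
	int n;

	if (strcmp(kind, "some") != 0 && strcmp(kind, "full") != 0)
		return -EINVAL;
	/* Post-patch window rule: non-zero and at most 10s. */
	if (window_us == 0 || window_us > WINDOW_MAX_US)
		return -EINVAL;
	/* Assumed threshold rule: non-zero and no larger than the window. */
	if (threshold_us == 0 || threshold_us > window_us)
		return -EINVAL;

	n = snprintf(buf, len, "%s %u %u", kind, threshold_us, window_us);
	if (n < 0 || (size_t)n >= len)
		return -ENOSPC;
	return n;
}

/*
 * Hypothetical helper: open a pressure file for writing and register a
 * trigger. Opening for write is where the new CAP_SYS_RESOURCE check
 * would return -EPERM. On success the returned fd can be passed to
 * poll() and watched for POLLPRI.
 */
static int psi_register_trigger(const char *path, const char *kind,
				unsigned int threshold_us,
				unsigned int window_us)
{
	char buf[64];
	int len, fd, err;

	len = psi_format_trigger(buf, sizeof(buf), kind, threshold_us,
				 window_us);
	if (len < 0)
		return len;

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		return -errno;

	/* Include the terminating NUL, as a C string write would. */
	if (write(fd, buf, (size_t)len + 1) < 0) {
		err = -errno;
		close(fd);
		return err;
	}
	return fd;
}
```

A caller would pass the path of a cgroup pressure file (for example a
memory.pressure file under the cgroup2 mount point; the exact path depends on
the system) with a threshold of 10000 and a window of 100000, then poll() the
returned descriptor. A caller without CAP_SYS_RESOURCE should expect -EPERM.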