From patchwork Sat Jan 5 00:19:18 2019
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10749015
From: Yang Shi
To: mhocko@suse.com, hannes@cmpxchg.org, shakeelb@google.com,
 akpm@linux-foundation.org
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [v2 PATCH 3/5] mm: memcontrol: introduce wipe_on_offline interface
Date: Sat, 5 Jan 2019 08:19:18 +0800
Message-Id: <1546647560-40026-4-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1546647560-40026-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1546647560-40026-1-git-send-email-yang.shi@linux.alibaba.com>

We have some use cases which create and remove memcgs very frequently,
and the tasks in such a memcg may only access files which are unlikely
to be accessed by anyone else. So we prefer to force_empty the memcg
before rmdir'ing it, reclaiming the page cache so that it does not
accumulate and cause unnecessary memory pressure. That pressure may
trigger direct reclaim, which harms latency-sensitive applications.

Force empty would help in such use cases; however, it reclaims memory
synchronously when memory.force_empty is written. It may take some time
to return, and subsequent operations are blocked by it. Although this
can be done in the background, some use cases may need to create a new
memcg with the same name right after the old one is deleted, so the
creation might get blocked by the preceding reclaim/remove operation.

Delaying memory reclaim to cgroup offline sounds reasonable for such
use cases. Introduce a new interface, called wipe_on_offline, for both
the default and legacy hierarchies, which does the memory reclaim in
the css offline kworker.

Writing 1 enables it; writing 0 disables it.

Suggested-by: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Yang Shi
---
 include/linux/memcontrol.h |  3 +++
 mm/memcontrol.c            | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 83ae11c..2f1258a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -311,6 +311,9 @@ struct mem_cgroup {
 	struct list_head event_list;
 	spinlock_t event_list_lock;

+	/* Reclaim as much memory as possible in the offline kworker */
+	bool wipe_on_offline;
+
 	struct mem_cgroup_per_node *nodeinfo[0];
 	/* WARNING: nodeinfo must be the last member here */
 };
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 75208a2..5a13c6b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2918,6 +2918,35 @@ static ssize_t mem_cgroup_force_empty_write(struct kernfs_open_file *of,
 	return mem_cgroup_force_empty(memcg) ?: nbytes;
 }

+static int wipe_on_offline_show(struct seq_file *m, void *v)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
+
+	seq_printf(m, "%lu\n", (unsigned long)memcg->wipe_on_offline);
+
+	return 0;
+}
+
+static int wipe_on_offline_write(struct cgroup_subsys_state *css,
+				 struct cftype *cft, u64 val)
+{
+	int ret = 0;
+
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+
+	if (mem_cgroup_is_root(memcg))
+		return -EINVAL;
+
+	if (val == 0)
+		memcg->wipe_on_offline = false;
+	else if (val == 1)
+		memcg->wipe_on_offline = true;
+	else
+		ret = -EINVAL;
+
+	return ret;
+}
+
 static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
 				     struct cftype *cft)
 {
@@ -4283,6 +4312,11 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 		.write = mem_cgroup_reset,
 		.read_u64 = mem_cgroup_read_u64,
 	},
+	{
+		.name = "wipe_on_offline",
+		.seq_show = wipe_on_offline_show,
+		.write_u64 = wipe_on_offline_write,
+	},
 	{ },	/* terminate */
 };

@@ -4569,6 +4603,15 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 	page_counter_set_min(&memcg->memory, 0);
 	page_counter_set_low(&memcg->memory, 0);

+	/*
+	 * Reclaim as much memory as possible when offlining.
+	 *
+	 * Do it after min/low is reset, otherwise some memory might
+	 * be protected by min/low.
+	 */
+	if (memcg->wipe_on_offline)
+		mem_cgroup_force_empty(memcg);
+
 	memcg_offline_kmem(memcg);
 	wb_memcg_offline(memcg);

@@ -5694,6 +5737,12 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 		.seq_show = memory_oom_group_show,
 		.write = memory_oom_group_write,
 	},
+	{
+		.name = "wipe_on_offline",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = wipe_on_offline_show,
+		.write_u64 = wipe_on_offline_write,
+	},
 	{ }	/* terminate */
 };

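
For illustration only, a minimal userspace sketch of how a job launcher
might set the new knob before removing a memcg. It is not part of the
patch. The helper name set_wipe_on_offline() is hypothetical, and the
caller is assumed to pass the path of the group's memory.wipe_on_offline
file, which depends on where the cgroup hierarchy is mounted. The helper
rejects values other than 0 and 1 up front, mirroring the -EINVAL that
wipe_on_offline_write() returns for them.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "0" or "1" to a wipe_on_offline control
 * file. Returns 0 on success or a negative errno value on failure.
 */
int set_wipe_on_offline(const char *path, int enable)
{
	const char *val;
	ssize_t n;
	int fd, err = 0;

	/* The kernel handler accepts only 0 and 1; fail early otherwise. */
	if (enable != 0 && enable != 1)
		return -EINVAL;
	val = enable ? "1" : "0";

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, val, strlen(val));
	if (n < 0)
		err = -errno;		/* e.g. -EINVAL on the root memcg */
	else if ((size_t)n != strlen(val))
		err = -EIO;		/* short write */

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

A caller would invoke it once per group, for example
set_wipe_on_offline("/sys/fs/cgroup/memory/job0/memory.wipe_on_offline", 1)
(the mount point and group name here are assumptions), and then rmdir
the group without writing to memory.force_empty first.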