From patchwork Mon Apr 19 22:50:47 2021
X-Patchwork-Submitter: Dennis Zhou
X-Patchwork-Id: 12212783
From: Dennis Zhou
To: Tejun Heo, Christoph Lameter, Roman Gushchin
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dennis Zhou
Subject: [PATCH 4/4] percpu: use reclaim threshold instead of running for every page
Date: Mon, 19 Apr 2021 22:50:47 +0000
Message-Id: <20210419225047.3415425-5-dennis@kernel.org>
X-Mailer: git-send-email 2.31.1.368.gbe11c130af-goog
In-Reply-To: <20210419225047.3415425-1-dennis@kernel.org>
References: <20210419225047.3415425-1-dennis@kernel.org>

The last patch implements reclaim by adding 2 additional lists, where a chunk's
lifecycle is:
  active_slot -> to_depopulate_slot -> sidelined_slot

This worked well because we were able to nicely converge paths into isolation.
However, it is a bit aggressive to run reclaim for every freed page. Let's
accumulate a few free pages before doing so. With that, the new lifecycle is:
  active_slot -> sidelined_slot -> to_depopulate_slot -> sidelined_slot

The transition from sidelined_slot -> to_depopulate_slot now happens once a
threshold of sidelined empty populated pages is reached, instead of
immediately as before, when a chunk went straight to the to_depopulate_slot.
pcpu_nr_isolated_empty_pop_pages[] is introduced to aid with this.
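To illustrate the gating, here is a minimal userspace sketch (not the kernel
code; the constant mirrors PCPU_EMPTY_POP_RECLAIM_THRESHOLD added below, and
the helper names are hypothetical):

	#include <stdbool.h>
	#include <stdio.h>

	/* mirrors PCPU_EMPTY_POP_RECLAIM_THRESHOLD from the patch below */
	#define EMPTY_POP_RECLAIM_THRESHOLD	4

	/* stand-in for pcpu_nr_isolated_empty_pop_pages[] of one chunk type */
	static int nr_isolated_empty_pop_pages;

	/* sidelining a chunk moves its empty pages to the isolated counter */
	static void sideline_chunk(int nr_empty_pop_pages)
	{
		nr_isolated_empty_pop_pages += nr_empty_pop_pages;
	}

	/* depopulation is scheduled only once enough pages are sidelined */
	static bool should_depopulate(void)
	{
		return nr_isolated_empty_pop_pages >= EMPTY_POP_RECLAIM_THRESHOLD;
	}

	int main(void)
	{
		for (int i = 1; i <= 6; i++) {
			sideline_chunk(1);
			printf("sidelined empty pages: %d -> depopulate: %s\n",
			       nr_isolated_empty_pop_pages,
			       should_depopulate() ? "yes" : "no");
		}
		return 0;
	}

Chunks still sit in the sidelined_slot until the counter crosses the
threshold; only then are they spliced onto the to_depopulate_slot and the
balance work scheduled.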
Suggested-by: Roman Gushchin
Signed-off-by: Dennis Zhou
Acked-by: Roman Gushchin
---
 mm/percpu-internal.h |  1 +
 mm/percpu-stats.c    |  8 ++++++--
 mm/percpu.c          | 44 +++++++++++++++++++++++++++++++++++++-------
 3 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/mm/percpu-internal.h b/mm/percpu-internal.h
index 10604dce806f..b3e43b016276 100644
--- a/mm/percpu-internal.h
+++ b/mm/percpu-internal.h
@@ -92,6 +92,7 @@ extern int pcpu_nr_slots;
 extern int pcpu_sidelined_slot;
 extern int pcpu_to_depopulate_slot;
 extern int pcpu_nr_empty_pop_pages[];
+extern int pcpu_nr_isolated_empty_pop_pages[];
 
 extern struct pcpu_chunk *pcpu_first_chunk;
 extern struct pcpu_chunk *pcpu_reserved_chunk;
diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c
index 2125981acfb9..facc804eb86c 100644
--- a/mm/percpu-stats.c
+++ b/mm/percpu-stats.c
@@ -145,7 +145,7 @@ static int percpu_stats_show(struct seq_file *m, void *v)
 	int slot, max_nr_alloc;
 	int *buffer;
 	enum pcpu_chunk_type type;
-	int nr_empty_pop_pages;
+	int nr_empty_pop_pages, nr_isolated_empty_pop_pages;
 
 alloc_buffer:
 	spin_lock_irq(&pcpu_lock);
@@ -167,8 +167,11 @@ static int percpu_stats_show(struct seq_file *m, void *v)
 	}
 
 	nr_empty_pop_pages = 0;
-	for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++)
+	nr_isolated_empty_pop_pages = 0;
+	for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) {
 		nr_empty_pop_pages += pcpu_nr_empty_pop_pages[type];
+		nr_isolated_empty_pop_pages += pcpu_nr_isolated_empty_pop_pages[type];
+	}
 
 #define PL(X)								\
 	seq_printf(m, "  %-20s: %12lld\n", #X, (long long int)pcpu_stats_ai.X)
@@ -202,6 +205,7 @@ static int percpu_stats_show(struct seq_file *m, void *v)
 	PU(min_alloc_size);
 	PU(max_alloc_size);
 	P("empty_pop_pages", nr_empty_pop_pages);
+	P("iso_empty_pop_pages", nr_isolated_empty_pop_pages);
 	seq_putc(m, '\n');
 
 #undef PU
diff --git a/mm/percpu.c b/mm/percpu.c
index 79eebc80860d..ba13e683d022 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -110,6 +110,9 @@
 #define PCPU_EMPTY_POP_PAGES_LOW	2
 #define PCPU_EMPTY_POP_PAGES_HIGH	4
 
+/* only schedule reclaim if there are at least N empty pop pages sidelined */
+#define PCPU_EMPTY_POP_RECLAIM_THRESHOLD	4
+
 #ifdef CONFIG_SMP
 /* default addr <-> pcpu_ptr mapping, override in asm/percpu.h if necessary */
 #ifndef __addr_to_pcpu_ptr
@@ -183,6 +186,7 @@ static LIST_HEAD(pcpu_map_extend_chunks);
  * The reserved chunk doesn't contribute to the count.
  */
 int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
+int pcpu_nr_isolated_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
 
 /*
  * The number of populated pages in use by the allocator, protected by
@@ -582,8 +586,10 @@ static void pcpu_isolate_chunk(struct pcpu_chunk *chunk)
 	if (!chunk->isolated) {
 		chunk->isolated = true;
 		pcpu_nr_empty_pop_pages[type] -= chunk->nr_empty_pop_pages;
+		pcpu_nr_isolated_empty_pop_pages[type] +=
+			chunk->nr_empty_pop_pages;
+		list_move(&chunk->list, &pcpu_slot[pcpu_sidelined_slot]);
 	}
-	list_move(&chunk->list, &pcpu_slot[pcpu_to_depopulate_slot]);
 }
 
 static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
@@ -595,6 +601,8 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
 	if (chunk->isolated) {
 		chunk->isolated = false;
 		pcpu_nr_empty_pop_pages[type] += chunk->nr_empty_pop_pages;
+		pcpu_nr_isolated_empty_pop_pages[type] -=
+			chunk->nr_empty_pop_pages;
 		pcpu_chunk_relocate(chunk, -1);
 	}
 }
@@ -610,9 +618,15 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
  */
 static inline void pcpu_update_empty_pages(struct pcpu_chunk *chunk, int nr)
 {
+	enum pcpu_chunk_type type = pcpu_chunk_type(chunk);
+
 	chunk->nr_empty_pop_pages += nr;
-	if (chunk != pcpu_reserved_chunk && !chunk->isolated)
-		pcpu_nr_empty_pop_pages[pcpu_chunk_type(chunk)] += nr;
+	if (chunk != pcpu_reserved_chunk) {
+		if (chunk->isolated)
+			pcpu_nr_isolated_empty_pop_pages[type] += nr;
+		else
+			pcpu_nr_empty_pop_pages[type] += nr;
+	}
 }
 
 /*
@@ -2138,10 +2152,13 @@ static void pcpu_reclaim_populated(enum pcpu_chunk_type type)
 	struct list_head *pcpu_slot = pcpu_chunk_list(type);
 	struct pcpu_chunk *chunk;
 	struct pcpu_block_md *block;
+	LIST_HEAD(to_depopulate);
 	int i, end;
 
 	spin_lock_irq(&pcpu_lock);
 
+	list_splice_init(&pcpu_slot[pcpu_to_depopulate_slot], &to_depopulate);
+
 restart:
 	/*
 	 * Once a chunk is isolated to the to_depopulate list, the chunk is no
@@ -2149,9 +2166,9 @@ static void pcpu_reclaim_populated(enum pcpu_chunk_type type)
 	 * other accessor is the free path which only returns area back to the
 	 * allocator not touching the populated bitmap.
 	 */
-	while (!list_empty(&pcpu_slot[pcpu_to_depopulate_slot])) {
-		chunk = list_first_entry(&pcpu_slot[pcpu_to_depopulate_slot],
-					 struct pcpu_chunk, list);
+	while (!list_empty(&to_depopulate)) {
+		chunk = list_first_entry(&to_depopulate, struct pcpu_chunk,
+					 list);
 		WARN_ON(chunk->immutable);
 
 		/*
@@ -2208,6 +2225,13 @@ static void pcpu_reclaim_populated(enum pcpu_chunk_type type)
 			  &pcpu_slot[pcpu_sidelined_slot]);
 	}
 
+	if (pcpu_nr_isolated_empty_pop_pages[type] >=
+	    PCPU_EMPTY_POP_RECLAIM_THRESHOLD) {
+		list_splice_tail_init(&pcpu_slot[pcpu_sidelined_slot],
+				      &pcpu_slot[pcpu_to_depopulate_slot]);
+		pcpu_schedule_balance_work();
+	}
+
 	spin_unlock_irq(&pcpu_lock);
 }
 
@@ -2291,7 +2315,13 @@ void free_percpu(void __percpu *ptr)
 		}
 	} else if (pcpu_should_reclaim_chunk(chunk)) {
 		pcpu_isolate_chunk(chunk);
-		need_balance = true;
+		if (chunk->free_bytes == pcpu_unit_size ||
+		    pcpu_nr_isolated_empty_pop_pages[pcpu_chunk_type(chunk)] >=
+		    PCPU_EMPTY_POP_RECLAIM_THRESHOLD) {
+			list_splice_tail_init(&pcpu_slot[pcpu_sidelined_slot],
+					      &pcpu_slot[pcpu_to_depopulate_slot]);
+			need_balance = true;
+		}
 	}
 
 	trace_percpu_free_percpu(chunk->base_addr, off, ptr);