From patchwork Wed Mar 24 19:06:23 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 12162053
From: Roman Gushchin
To: Dennis Zhou
CC: Tejun Heo, Christoph Lameter, Andrew Morton, Roman Gushchin
Subject: [PATCH rfc 1/4] percpu: implement partial chunk depopulation
Date: Wed, 24 Mar 2021 12:06:23 -0700
Message-ID: <20210324190626.564297-2-guro@fb.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210324190626.564297-1-guro@fb.com>
References: <20210324190626.564297-1-guro@fb.com>

This patch implements partial depopulation of percpu chunks.

Currently, a chunk can be depopulated only as part of its final
destruction, when there are no more outstanding allocations. However,
to minimize memory waste, it can be useful to depopulate a partially
filled chunk when a small number of outstanding allocations prevents
the chunk from being reclaimed.

This patch implements the following depopulation process: it scans
over the chunk pages, looks for a range of empty and populated pages,
and depopulates it. To avoid races with new allocations, the chunk is
isolated first. After the depopulation the chunk is returned to its
original slot (but is appended to the tail of the list to minimize the
chances of it being populated again).

Because pcpu_lock is dropped while calling pcpu_depopulate_chunk(),
the chunk can be concurrently moved to a different slot, so it has to
be isolated again on each step. pcpu_alloc_mutex is held, so the chunk
can't be populated/depopulated asynchronously.

Signed-off-by: Roman Gushchin
---
 mm/percpu.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/mm/percpu.c b/mm/percpu.c
index 6596a0a4286e..78c55c73fa28 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2055,6 +2055,96 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type type)
 	mutex_unlock(&pcpu_alloc_mutex);
 }
 
+/**
+ * pcpu_shrink_populated - scan chunks and release unused pages to the system
+ * @type: chunk type
+ *
+ * Scan over all chunks, find those marked with the depopulate flag and
+ * try to release unused pages to the system. On every attempt clear the
+ * chunk's depopulate flag to avoid wasting CPU by scanning the same
+ * chunk again and again.
+ */
+static void pcpu_shrink_populated(enum pcpu_chunk_type type)
+{
+	struct list_head *pcpu_slot = pcpu_chunk_list(type);
+	struct pcpu_chunk *chunk;
+	int slot, i, off, start;
+
+	spin_lock_irq(&pcpu_lock);
+	for (slot = pcpu_nr_slots - 1; slot >= 0; slot--) {
+restart:
+		list_for_each_entry(chunk, &pcpu_slot[slot], list) {
+			bool isolated = false;
+
+			if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_HIGH)
+				break;
+
+			for (i = 0, start = -1; i < chunk->nr_pages; i++) {
+				if (!chunk->nr_empty_pop_pages)
+					break;
+
+				/*
+				 * If the page is empty and populated, start or
+				 * extend the [start, i) range.
+				 */
+				if (test_bit(i, chunk->populated)) {
+					off = find_first_bit(
+						pcpu_index_alloc_map(chunk, i),
+						PCPU_BITMAP_BLOCK_BITS);
+					if (off >= PCPU_BITMAP_BLOCK_BITS) {
+						if (start == -1)
+							start = i;
+						continue;
+					}
+				}
+
+				/*
+				 * Otherwise check if there is an active range,
+				 * and if yes, depopulate it.
+				 */
+				if (start == -1)
+					continue;
+
+				/*
+				 * Isolate the chunk, so new allocations
+				 * wouldn't be served using this chunk.
+				 * Async releases can still happen.
+				 */
+				if (!list_empty(&chunk->list)) {
+					list_del_init(&chunk->list);
+					isolated = true;
+				}
+
+				spin_unlock_irq(&pcpu_lock);
+				pcpu_depopulate_chunk(chunk, start, i);
+				cond_resched();
+				spin_lock_irq(&pcpu_lock);
+
+				pcpu_chunk_depopulated(chunk, start, i);
+
+				/*
+				 * Reset the range and continue.
+				 */
+				start = -1;
+			}
+
+			if (isolated) {
+				/*
+				 * The chunk could have been moved while
+				 * pcpu_lock wasn't held. Make sure we put
+				 * the chunk back into the slot and restart
+				 * the scanning.
+				 */
+				if (list_empty(&chunk->list))
+					list_add_tail(&chunk->list,
+						      &pcpu_slot[slot]);
+				goto restart;
+			}
+		}
+	}
+	spin_unlock_irq(&pcpu_lock);
+}
+
 /**
  * pcpu_balance_workfn - manage the amount of free chunks and populated pages
  * @work: unused
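
For readers who want to see the range scan in isolation, here is a
minimal userspace sketch of the "[start, i)" logic the commit message
describes. It is not kernel code and makes several assumptions: struct
page_state, struct page_range and find_empty_populated_ranges() are
hypothetical names invented for this illustration, the two booleans
stand in for the chunk's populated bitmap and per-page allocation map,
and there is no locking or chunk isolation. One deliberate difference
from the loop in the patch as posted: the sketch also reports a range
that runs up to the last page, whereas the posted loop only acts on a
range once it reaches a page that ends it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one chunk page; not a kernel structure. */
struct page_state {
	bool populated;	/* page is backed by memory */
	bool in_use;	/* at least one allocation lives on the page */
};

/* Hypothetical record of one half-open page range [start, end). */
struct page_range {
	int start;
	int end;
};

/*
 * Collect maximal runs of populated-but-unused pages. At most max_out
 * ranges are written to out; the return value is the number stored.
 */
size_t find_empty_populated_ranges(const struct page_state *pages,
				   int nr_pages,
				   struct page_range *out,
				   size_t max_out)
{
	size_t n = 0;
	int start = -1;
	int i;

	assert(pages && out && nr_pages >= 0);

	for (i = 0; i < nr_pages; i++) {
		/* Empty and populated: start or extend [start, i). */
		if (pages[i].populated && !pages[i].in_use) {
			if (start == -1)
				start = i;
			continue;
		}

		/* Otherwise close the active range, if there is one. */
		if (start == -1)
			continue;

		if (n < max_out)
			out[n++] = (struct page_range){ start, i };
		start = -1;
	}

	/* Assumption of this sketch: also flush a trailing range. */
	if (start != -1 && n < max_out)
		out[n++] = (struct page_range){ start, nr_pages };

	return n;
}
```

With six pages where page 0 is in use, pages 1-2 and 4-5 are populated
but unused, and page 3 is unpopulated, the function stores the ranges
[1, 3) and [4, 6). In the patch, each such range is what gets handed to
pcpu_depopulate_chunk() with pcpu_lock dropped.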