From patchwork Mon Apr 15 13:19:26 2024
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13630026
From: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Yunsheng Lin, Andrew Morton, Alexander Duyck
Subject: [PATCH net-next v2 01/15] mm: page_frag: add a test module for page_frag
Date: Mon, 15 Apr 2024 21:19:26 +0800
Message-ID: <20240415131941.51153-2-linyunsheng@huawei.com>
In-Reply-To: <20240415131941.51153-1-linyunsheng@huawei.com>
References: <20240415131941.51153-1-linyunsheng@huawei.com>

Based on lib/objpool.c, implement something like a ptrpool, so that it
can be used to test the correctness and performance of page_frag.

The test works by having a kthread bound to the first CPU allocate
fragments from a page_frag_cache instance and push them into a ptrpool
instance, while a kthread bound to the current node pops the fragments
from the ptrpool and calls page_frag_free() to free them.

We may refactor out the common part between objpool and ptrpool if this
ptrpool turns out to be helpful elsewhere.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 mm/Kconfig.debug    |   8 +
 mm/Makefile         |   1 +
 mm/page_frag_test.c | 364 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 373 insertions(+)
 create mode 100644 mm/page_frag_test.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index afc72fde0f03..1ebcd45f47d4 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -142,6 +142,14 @@ config DEBUG_PAGE_REF
 	  kernel code. However the runtime performance overhead is virtually
 	  nil until the tracepoints are actually enabled.
 
+config DEBUG_PAGE_FRAG_TEST
+	tristate "Test module for page_frag"
+	default n
+	depends on m && DEBUG_KERNEL
+	help
+	  This builds the "page_frag_test" module that is used to test the
+	  correctness and performance of page_frag's implementation.
+
 config DEBUG_RODATA_TEST
 	bool "Testcase for the marking rodata read-only"
 	depends on STRICT_KERNEL_RWX
diff --git a/mm/Makefile b/mm/Makefile
index 4abb40b911ec..5a14e6992f44 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
+obj-$(CONFIG_DEBUG_PAGE_FRAG_TEST) += page_frag_test.o
 obj-$(CONFIG_DEBUG_VM_PGTABLE) += debug_vm_pgtable.o
 obj-$(CONFIG_PAGE_OWNER) += page_owner.o
 obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
diff --git a/mm/page_frag_test.c b/mm/page_frag_test.c
new file mode 100644
index 000000000000..6743db672dad
--- /dev/null
+++ b/mm/page_frag_test.c
@@ -0,0 +1,364 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Test module for page_frag cache
+ *
+ * Copyright: linyunsheng@huawei.com
+ */
+
+#include <linux/module.h>
+#include <linux/cpumask.h>
+#include <linux/completion.h>
+#include <linux/kthread.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/random.h>
+#include <linux/ktime.h>
+#include <linux/gfp.h>
+
+#define OBJPOOL_NR_OBJECT_MAX	BIT(24)
+
+struct objpool_slot {
+	u32 head;
+	u32 tail;
+	u32 last;
+	u32 mask;
+	void *entries[];
+} __packed;
+
+struct objpool_head {
+	int nr_cpus;
+	int capacity;
+	struct objpool_slot **cpu_slots;
+};
+
+/* initialize percpu objpool_slot */
+static void objpool_init_percpu_slot(struct objpool_head *pool,
+				     struct objpool_slot *slot)
+{
+	/* initialize elements of percpu objpool_slot */
+	slot->mask = pool->capacity - 1;
+}
+
+/* allocate and initialize percpu slots */
+static int objpool_init_percpu_slots(struct objpool_head *pool,
+				     int nr_objs, gfp_t gfp)
+{
+	int i;
+
+	for (i = 0; i < pool->nr_cpus; i++) {
+		struct objpool_slot *slot;
+		int size;
+
+		/* skip the cpu node which could never be present */
+		if (!cpu_possible(i))
+			continue;
+
+		size = struct_size(slot, entries, pool->capacity);
+
+		/*
+		 * here we allocate percpu-slot & objs together in a single
+		 * allocation to make it more compact, taking advantage of
+		 * warm caches and TLB hits. by default vmalloc is used to
+		 * reduce the pressure on the kernel slab allocator. note that
+		 * the minimal size of a vmalloc allocation is one page, since
+		 * vmalloc always aligns the requested size to page size
+		 */
+		if (gfp & GFP_ATOMIC)
+			slot = kmalloc_node(size, gfp, cpu_to_node(i));
+		else
+			slot = __vmalloc_node(size, sizeof(void *), gfp,
+					      cpu_to_node(i),
+					      __builtin_return_address(0));
+		if (!slot)
+			return -ENOMEM;
+
+		memset(slot, 0, size);
+		pool->cpu_slots[i] = slot;
+
+		objpool_init_percpu_slot(pool, slot);
+	}
+
+	return 0;
+}
+
+/* cleanup all percpu slots of the object pool */
+static void objpool_fini_percpu_slots(struct objpool_head *pool)
+{
+	int i;
+
+	if (!pool->cpu_slots)
+		return;
+
+	for (i = 0; i < pool->nr_cpus; i++)
+		kvfree(pool->cpu_slots[i]);
+	kfree(pool->cpu_slots);
+}
+
+/* initialize object pool and pre-allocate objects */
+static int objpool_init(struct objpool_head *pool, int nr_objs, gfp_t gfp)
+{
+	int rc, capacity, slot_size;
+
+	/* check input parameters */
+	if (nr_objs <= 0 || nr_objs > OBJPOOL_NR_OBJECT_MAX)
+		return -EINVAL;
+
+	/* calculate capacity of percpu objpool_slot */
+	capacity = roundup_pow_of_two(nr_objs);
+	if (!capacity)
+		return -EINVAL;
+
+	gfp = gfp & ~__GFP_ZERO;
+
+	/* initialize objpool pool */
+	memset(pool, 0, sizeof(struct objpool_head));
+	pool->nr_cpus = nr_cpu_ids;
+	pool->capacity = capacity;
+	slot_size = pool->nr_cpus * sizeof(struct objpool_slot *);
+	pool->cpu_slots = kzalloc(slot_size, gfp);
+	if (!pool->cpu_slots)
+		return -ENOMEM;
+
+	/* initialize per-cpu slots */
+	rc = objpool_init_percpu_slots(pool, nr_objs, gfp);
+	if (rc)
+		objpool_fini_percpu_slots(pool);
+
+	return rc;
+}
+
+/* adding object to slot, abort if the slot was already full */
+static int objpool_try_add_slot(void *obj, struct objpool_head *pool, int cpu)
+{
+	struct objpool_slot *slot = pool->cpu_slots[cpu];
+	u32 head, tail;
+
+	/* loading tail and head as a local snapshot, tail first */
+	tail = READ_ONCE(slot->tail);
+
+	do {
+		head = READ_ONCE(slot->head);
+		/* fault caught: something must be wrong */
+		if (unlikely(tail - head >= pool->capacity))
+			return -ENOSPC;
+	} while (!try_cmpxchg_acquire(&slot->tail, &tail, tail + 1));
+
+	/* now the tail position is reserved for the given obj */
+	WRITE_ONCE(slot->entries[tail & slot->mask], obj);
+	/* update sequence to make this obj available for pop() */
+	smp_store_release(&slot->last, tail + 1);
+
+	return 0;
+}
+
+/* reclaim an object to object pool */
+static int objpool_push(void *obj, struct objpool_head *pool)
+{
+	unsigned long flags;
+	int rc;
+
+	/* disable local irq to avoid preemption & interruption */
+	raw_local_irq_save(flags);
+	rc = objpool_try_add_slot(obj, pool, raw_smp_processor_id());
+	raw_local_irq_restore(flags);
+
+	return rc;
+}
+
+/* try to retrieve object from slot */
+static void *objpool_try_get_slot(struct objpool_head *pool, int cpu)
+{
+	struct objpool_slot *slot = pool->cpu_slots[cpu];
+	/* load head snapshot, other cpus may change it */
+	u32 head = smp_load_acquire(&slot->head);
+
+	while (head != READ_ONCE(slot->last)) {
+		void *obj;
+
+		/*
+		 * data visibility of 'last' and 'head' could be out of
+		 * order since memory updating of 'last' and 'head' are
+		 * performed in push() and pop() independently
+		 *
+		 * before any retrieving attempts, pop() must guarantee
+		 * 'last' is behind 'head', that is to say, there must
+		 * be available objects in slot, which could be ensured
+		 * by condition 'last != head && last - head <= nr_objs'
+		 * that is equivalent to 'last - head - 1 < nr_objs' as
+		 * 'last' and 'head' are both unsigned int32
+		 */
+		if (READ_ONCE(slot->last) - head - 1 >= pool->capacity) {
+			head = READ_ONCE(slot->head);
+			continue;
+		}
+
+		/* obj must be retrieved before moving forward head */
+		obj = READ_ONCE(slot->entries[head & slot->mask]);
+
+		/* move head forward to mark its consumption */
+		if (try_cmpxchg_release(&slot->head, &head, head + 1))
+			return obj;
+	}
+
+	return NULL;
+}
+
+/* allocate an object from object pool */
+static void *objpool_pop(struct objpool_head *pool)
+{
+	void *obj = NULL;
+	unsigned long flags;
+	int i, cpu;
+
+	/* disable local irq to avoid preemption & interruption */
+	raw_local_irq_save(flags);
+
+	cpu = raw_smp_processor_id();
+	for (i = 0; i < num_possible_cpus(); i++) {
+		obj = objpool_try_get_slot(pool, cpu);
+		if (obj)
+			break;
+		cpu = cpumask_next_wrap(cpu, cpu_possible_mask, -1, 1);
+	}
+	raw_local_irq_restore(flags);
+
+	return obj;
+}
+
+/* release whole objpool forcibly */
+static void objpool_free(struct objpool_head *pool)
+{
+	if (!pool->cpu_slots)
+		return;
+
+	/* release percpu slots */
+	objpool_fini_percpu_slots(pool);
+}
+
+static struct objpool_head ptr_pool;
+static int nr_objs = 512;
+static int nr_test = 5120000;
+static atomic_t nthreads;
+static struct completion wait;
+static struct page_frag_cache test_frag;
+
+module_param(nr_test, int, 0600);
+MODULE_PARM_DESC(nr_test, "number of iterations to test");
+
+static int page_frag_pop_thread(void *arg)
+{
+	struct objpool_head *pool = arg;
+	int nr = nr_test;
+
+	pr_info("page_frag pop test thread begins on cpu %d\n",
+		smp_processor_id());
+
+	while (nr > 0) {
+		void *obj = objpool_pop(pool);
+
+		if (obj) {
+			nr--;
+			page_frag_free(obj);
+		} else {
+			cond_resched();
+		}
+	}
+
+	if (atomic_dec_and_test(&nthreads))
+		complete(&wait);
+
+	pr_info("page_frag pop test thread exits on cpu %d\n",
+		smp_processor_id());
+
+	return 0;
+}
+
+static int page_frag_push_thread(void *arg)
+{
+	struct objpool_head *pool = arg;
+	int nr = nr_test;
+
+	pr_info("page_frag push test thread begins on cpu %d\n",
+		smp_processor_id());
+
+	while (nr > 0) {
+		unsigned int size = get_random_u32();
+		void *va;
+		int ret;
+
+		size = clamp(size, 4U, 4096U);
+		va = page_frag_alloc(&test_frag, size, GFP_KERNEL);
+		if (!va)
+			continue;
+
+		ret = objpool_push(va, pool);
+		if (ret) {
+			page_frag_free(va);
+			cond_resched();
+		} else {
+			nr--;
+		}
+	}
+
+	pr_info("page_frag push test thread exits on cpu %d\n",
+		smp_processor_id());
+
+	if (atomic_dec_and_test(&nthreads))
+		complete(&wait);
+
+	return 0;
+}
+
+static int __init page_frag_test_init(void)
+{
+	struct task_struct *tsk_push, *tsk_pop;
+	ktime_t start;
+	u64 duration;
+	int ret;
+
+	test_frag.va = NULL;
+	atomic_set(&nthreads, 2);
+	init_completion(&wait);
+
+	ret = objpool_init(&ptr_pool, nr_objs, GFP_KERNEL);
+	if (ret)
+		return ret;
+
+	tsk_push = kthread_create_on_cpu(page_frag_push_thread, &ptr_pool,
+					 cpumask_first(cpu_online_mask),
+					 "page_frag_push");
+	if (IS_ERR(tsk_push))
+		return PTR_ERR(tsk_push);
+
+	tsk_pop = kthread_create(page_frag_pop_thread, &ptr_pool,
+				 "page_frag_pop");
+	if (IS_ERR(tsk_pop)) {
+		kthread_stop(tsk_push);
+		return PTR_ERR(tsk_pop);
+	}
+
+	start = ktime_get();
+	wake_up_process(tsk_push);
+	wake_up_process(tsk_pop);
+
+	pr_info("waiting for test to complete\n");
+	wait_for_completion(&wait);
+
+	duration = (u64)ktime_us_delta(ktime_get(), start);
+	pr_info("%d iterations took: %lluus\n", nr_test, duration);
+
+	objpool_free(&ptr_pool);
+	page_frag_cache_drain(&test_frag);
+
+	return -EAGAIN;
+}
+
+static void __exit page_frag_test_exit(void)
+{
+}
+
+module_init(page_frag_test_init);
+module_exit(page_frag_test_exit);
+
+MODULE_LICENSE("GPL");
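
The ptrpool slots above rely on free-running u32 counters rather than
wrapped indices, which is what the long comment in objpool_try_get_slot()
is describing. Below is a minimal userspace sketch (not part of the patch)
of the same index arithmetic, assuming a single producer and consumer; the
CAPACITY constant and the two helper names are made up for illustration,
and only the "tail - head >= capacity" and "last - head - 1 >= capacity"
checks are taken from the code above.

/*
 * Sketch of the free-running u32 counters used by
 * objpool_try_add_slot()/objpool_try_get_slot(): the fill level is
 * always "tail - head" in modulo-2^32 arithmetic, and
 * "last - head - 1 >= capacity" rejects both the empty case
 * (last == head) and an inconsistent snapshot.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CAPACITY 8u	/* power of two, like pool->capacity */

static int ring_full(uint32_t head, uint32_t tail)
{
	return tail - head >= CAPACITY;
}

static int ring_empty_or_stale(uint32_t head, uint32_t last)
{
	return last - head - 1 >= CAPACITY;
}

int main(void)
{
	/* start just below the u32 wrap point to exercise wraparound */
	uint32_t head = UINT32_MAX - 2, tail = UINT32_MAX - 2, last;
	unsigned int i;

	/* "push" CAPACITY objects: the counters wrap, the math does not care */
	for (i = 0; i < CAPACITY; i++) {
		assert(!ring_full(head, tail));
		tail++;		/* reserve entries[tail & (CAPACITY - 1)] */
	}
	last = tail;		/* publish, like smp_store_release(&slot->last, ...) */
	assert(ring_full(head, tail));	/* a further push would return -ENOSPC */

	/* "pop" everything back */
	while (!ring_empty_or_stale(head, last))
		head++;		/* consume entries[head & (CAPACITY - 1)] */

	assert(head == tail);
	printf("cycled %u objects across the u32 wrap point\n", CAPACITY);
	return 0;
}

As a usage note grounded in the patch itself: the Kconfig entry is
tristate and depends on m, so the test is built as a module, the run is
kicked off from page_frag_test_init() at load time, the iteration count
can be adjusted through the nr_test module parameter, and the elapsed
time is printed to the kernel log.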