From patchwork Fri Sep 18 19:48:14 2020
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, Andrew Morton, "Paul E. McKenney"
Cc: Peter Zijlstra, Michal Hocko, Vlastimil Babka, Thomas Gleixner,
 "Theodore Y. Ts'o", Joel Fernandes, Sebastian Andrzej Siewior,
 Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH 1/4] rcu/tree: Add a work to allocate pages from regular context
Date: Fri, 18 Sep 2020 21:48:14 +0200
Message-Id: <20200918194817.48921-2-urezki@gmail.com>
In-Reply-To: <20200918194817.48921-1-urezki@gmail.com>
References: <20200918194817.48921-1-urezki@gmail.com>

The current memory-allocation interface presents the following
difficulties that this patch is designed to overcome:

a) If built with CONFIG_PROVE_RAW_LOCK_NESTING, lockdep complains
   about a violation ("BUG: Invalid wait context") of the nesting
   rules. It performs raw_spinlock vs. spinlock nesting checks,
   i.e. it is not legal to acquire a spinlock_t while holding a
   raw_spinlock_t. Internally kfree_rcu() uses raw_spinlock_t,
   whereas the page allocator internally deals with spinlock_t to
   access its zones. The code can also be broken from a higher
   level of view:

       raw_spin_lock(&some_lock);
       kfree_rcu(some_pointer, some_field_offset);

b) If built with CONFIG_PREEMPT_RT. Please note that in that case
   spinlock_t is converted into a sleepable variant. Invoking the
   page allocator from atomic contexts then leads to "BUG:
   scheduling while atomic".

c) call_rcu() is invoked from raw atomic context, and kfree_rcu()
   and kvfree_rcu() are expected to be callable from raw atomic
   context as well.
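To make (a) concrete, here is a slightly fuller sketch of the nesting
that lockdep rejects; the lock name and the structure are hypothetical:

	static DEFINE_RAW_SPINLOCK(some_lock);

	struct foo {
		int data;
		struct rcu_head rcu;
	};

	static void broken_path(struct foo *p)
	{
		raw_spin_lock(&some_lock);
		/*
		 * kfree_rcu() may need a page for its pointer array, so it
		 * can reach the page allocator, which takes a spinlock_t
		 * (zone->lock). A spinlock_t inside a raw_spinlock_t is the
		 * invalid wait context flagged in (a), and on PREEMPT_RT
		 * it can sleep, giving the splat in (b).
		 */
		kfree_rcu(p, rcu);
		raw_spin_unlock(&some_lock);
	}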
Move the page allocation out of the contexts that trigger the
kvfree_rcu() path and into a separate worker. When a k[v]free_rcu()
per-CPU page cache is empty, a fallback mechanism is used and a special
job is scheduled to refill the per-CPU cache.

As a side effect, maintaining the bulk arrays from a separate worker
thread, rather than on demand, introduces other drawbacks:

a) There is an extra latency window: the time during which the fallback
   mechanism is used until pages are obtained via the special worker
   for further pointer collecting over arrays.

b) It is impossible to predict how many pages will be required to cover
   a demand that is controlled by different workloads on various
   systems.

c) There is memory overhead, since we do not know how many pages should
   be preloaded.

The above three concerns should be fixed by introducing a lock-free
page allocation interface.

Signed-off-by: Uladzislau Rezki (Sony)
---
 kernel/rcu/tree.c | 91 +++++++++++++++++++++++++----------------------
 1 file changed, 48 insertions(+), 43 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 548404489c04..4bfc46a1e9d1 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -177,7 +177,7 @@ module_param(rcu_unlock_delay, int, 0444);
  * per-CPU. Object size is equal to one page. This value
  * can be changed at boot time.
  */
-static int rcu_min_cached_objs = 2;
+static int rcu_min_cached_objs = 5;
 module_param(rcu_min_cached_objs, int, 0444);
 
 /* Retrieve RCU kthreads priority for rcutorture */
@@ -3100,6 +3100,8 @@ struct kfree_rcu_cpu {
 	 * lockless an access has to be protected by the
 	 * per-cpu lock.
 	 */
+	struct work_struct page_cache_work;
+	atomic_t work_in_progress;
 	struct llist_head bkvcache;
 	int nr_bkv_objs;
 };
@@ -3217,10 +3219,10 @@ static void kfree_rcu_work(struct work_struct *work)
 			}
 			rcu_lock_release(&rcu_callback_map);
 
-			krcp = krc_this_cpu_lock(&flags);
+			raw_spin_lock_irqsave(&krcp->lock, flags);
 			if (put_cached_bnode(krcp, bkvhead[i]))
 				bkvhead[i] = NULL;
-			krc_this_cpu_unlock(krcp, flags);
+			raw_spin_unlock_irqrestore(&krcp->lock, flags);
 
 			if (bkvhead[i])
 				free_page((unsigned long) bkvhead[i]);
@@ -3347,6 +3349,42 @@ static void kfree_rcu_monitor(struct work_struct *work)
 		raw_spin_unlock_irqrestore(&krcp->lock, flags);
 }
 
+static void fill_page_cache_func(struct work_struct *work)
+{
+	struct kvfree_rcu_bulk_data *bnode;
+	struct kfree_rcu_cpu *krcp =
+		container_of(work, struct kfree_rcu_cpu,
+			page_cache_work);
+	unsigned long flags;
+	bool pushed;
+	int i;
+
+	for (i = 0; i < rcu_min_cached_objs; i++) {
+		/*
+		 * We would like to minimize a reclaiming process,
+		 * that is why GFP_NOWAIT is here. It can wakeup a
+		 * kswapd, what is fine, because somebody soon or
+		 * later will kick it to get the freelist back to
+		 * the watermarks.
+		 */
+		bnode = (struct kvfree_rcu_bulk_data *)
+			__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
+
+		if (bnode) {
+			raw_spin_lock_irqsave(&krcp->lock, flags);
+			pushed = put_cached_bnode(krcp, bnode);
+			raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
+			if (!pushed) {
+				free_page((unsigned long) bnode);
+				break;
+			}
+		}
+	}
+
+	atomic_set(&krcp->work_in_progress, 0);
+}
+
 static inline bool
 kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 {
@@ -3363,32 +3401,8 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 	if (!krcp->bkvhead[idx] ||
 			krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
 		bnode = get_cached_bnode(krcp);
-		if (!bnode) {
-			/*
-			 * To keep this path working on raw non-preemptible
-			 * sections, prevent the optional entry into the
-			 * allocator as it uses sleeping locks. In fact, even
-			 * if the caller of kfree_rcu() is preemptible, this
-			 * path still is not, as krcp->lock is a raw spinlock.
-			 * With additional page pre-allocation in the works,
-			 * hitting this return is going to be much less likely.
-			 */
-			if (IS_ENABLED(CONFIG_PREEMPT_RT))
-				return false;
-
-			/*
-			 * NOTE: For one argument of kvfree_rcu() we can
-			 * drop the lock and get the page in sleepable
-			 * context. That would allow to maintain an array
-			 * for the CONFIG_PREEMPT_RT as well if no cached
-			 * pages are available.
-			 */
-			bnode = (struct kvfree_rcu_bulk_data *)
-				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
-		}
-
 		/* Switch to emergency path. */
-		if (unlikely(!bnode))
+		if (!bnode)
 			return false;
 
 		/* Initialize the new block. */
@@ -3422,6 +3436,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
 	unsigned long flags;
 	struct kfree_rcu_cpu *krcp;
+	bool irq_disabled = irqs_disabled();
 	bool success;
 	void *ptr;
 
@@ -3452,12 +3467,12 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		goto unlock_return;
 	}
 
-	/*
-	 * Under high memory pressure GFP_NOWAIT can fail,
-	 * in that case the emergency path is maintained.
-	 */
 	success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
 	if (!success) {
+		// TODO: schedule the work from the hrtimer.
+		if (!irq_disabled && !atomic_xchg(&krcp->work_in_progress, 1))
+			queue_work(system_highpri_wq, &krcp->page_cache_work);
+
 		if (head == NULL)
 			// Inline if kvfree_rcu(one_arg) call.
			goto unlock_return;

@@ -4449,24 +4464,14 @@ static void __init kfree_rcu_batch_init(void)
 
 	for_each_possible_cpu(cpu) {
 		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
-		struct kvfree_rcu_bulk_data *bnode;
 
 		for (i = 0; i < KFREE_N_BATCHES; i++) {
 			INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
 			krcp->krw_arr[i].krcp = krcp;
 		}
 
-		for (i = 0; i < rcu_min_cached_objs; i++) {
-			bnode = (struct kvfree_rcu_bulk_data *)
-				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
-
-			if (bnode)
-				put_cached_bnode(krcp, bnode);
-			else
-				pr_err("Failed to preallocate for %d CPU!\n", cpu);
-		}
-
 		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
+		INIT_WORK(&krcp->page_cache_work, fill_page_cache_func);
 		krcp->initialized = true;
 	}
 	if (register_shrinker(&kfree_rcu_shrinker))
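A note for readers of the diffs in this series: get_cached_bnode() and
put_cached_bnode() are the pre-existing per-CPU cache helpers the series
builds on. A rough, simplified sketch of them, based on kernel/rcu/tree.c
of this era (the caller holds krcp->lock; a cached free page doubles as
an llist node):

	static struct kvfree_rcu_bulk_data *
	get_cached_bnode(struct kfree_rcu_cpu *krcp)
	{
		if (!krcp->nr_bkv_objs)
			return NULL;

		krcp->nr_bkv_objs--;
		return (struct kvfree_rcu_bulk_data *)
			llist_del_first(&krcp->bkvcache);
	}

	static bool
	put_cached_bnode(struct kfree_rcu_cpu *krcp,
			 struct kvfree_rcu_bulk_data *bnode)
	{
		/* Check the limit. */
		if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
			return false;

		llist_add((struct llist_node *) bnode, &krcp->bkvcache);
		krcp->nr_bkv_objs++;
		return true;
	}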
From patchwork Fri Sep 18 19:48:15 2020

From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, Andrew Morton, "Paul E. McKenney"
Cc: Peter Zijlstra, Michal Hocko, Vlastimil Babka, Thomas Gleixner,
 "Theodore Y. Ts'o", Joel Fernandes, Sebastian Andrzej Siewior,
 Uladzislau Rezki, Oleksiy Avramchenko
Subject: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.
Date: Fri, 18 Sep 2020 21:48:15 +0200
Message-Id: <20200918194817.48921-3-urezki@gmail.com>
In-Reply-To: <20200918194817.48921-1-urezki@gmail.com>
References: <20200918194817.48921-1-urezki@gmail.com>

Some background and kfree_rcu()
===============================
The pointers to be freed are stored in a per-CPU array to improve
performance, to enable an easier-to-use API, to accommodate vmalloc
memory and to support the single-argument form of kfree_rcu() where
only a pointer is passed. More details are below.

In order to maintain such per-CPU arrays there is a need for dynamic
allocation when the current array is fully populated and a new block is
required. See the example below:

     0 1 2 3        0 1 2 3
    |p|p|p|p| -> |p|p|p|p| -> NULL

There are two pointer blocks; each one can store 4 addresses, which
will be freed after a grace period has passed. (In reality we store
PAGE_SIZE / sizeof(void *).) To maintain such blocks, a single page is
obtained via the page allocator:

    bnode = (struct kvfree_rcu_bulk_data *)
        __get_free_page(GFP_NOWAIT | __GFP_NOWARN);

After that it is attached to the "head" and its "next" pointer is set
to the previous "head", so the list of blocks can be maintained and
grown dynamically until it gets drained by the reclaiming thread.

Please note that there is always a fallback if an allocation fails. For
the single-argument case this is a call to synchronize_rcu(); for the
two-argument case it is to use the rcu_head structure embedded in the
object being freed, paying the cache-miss penalty and invoking kfree()
per object instead of kfree_bulk() for groups of objects.

Why do we maintain arrays/blocks instead of linking objects by the
regular "struct rcu_head" technique? The main reasons are:

a) Memory can be reclaimed by invoking the kfree_bulk() interface,
   which requires passing an array and the number of entries in it.
   That avoids the per-object overhead of calling kfree() per object
   and so reduces the reclamation time.

b) It improves locality and reduces the number of cache misses caused
   by "pointer chasing" between objects, which can be spread far apart
   from each other.

c) It supports the "single argument" form of kvfree_rcu():

       void *ptr = kvmalloc(some_bytes, GFP_KERNEL);
       if (ptr)
           kvfree_rcu(ptr);

   We need it when an "rcu_head" is not embedded into a structure but
   the object must still be freed after a grace period. For the single
   argument, such objects cannot be queued on a linked list. Nowadays,
   since we do not have the single argument yet but see the demand for
   it, people work around it with a simple but inefficient sequence:

       synchronize_rcu(); /* Can be long and blocks the current context */
       kfree(p);

   More details are here: https://lkml.org/lkml/2020/4/28/1626

d) It distinguishes vmalloc pointers from SLAB ones, making it possible
   to invoke the right freeing API for the right kind of pointer:
   kfree_bulk() or, TBD, vmalloc_bulk().

e) Speeding up the post-grace-period freeing reduces the chance of a
   flood of callbacks OOMing the system. Also, please have a look here:
   https://lkml.org/lkml/2020/7/30/1166
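For reference, one pointer block from the diagram above corresponds
roughly to the following layout; this is a simplified sketch of the
kernel's kvfree_rcu_bulk_data of this era, shown for orientation, not
something this patch defines:

	/* One block occupies exactly one page. */
	struct kvfree_rcu_bulk_data {
		unsigned long nr_records;		/* Pointers currently stored. */
		struct kvfree_rcu_bulk_data *next;	/* Link to the previous "head". */
		void *records[];			/* Fills the rest of the page. */
	};

	/* How many addresses fit into a single block/page. */
	#define KVFREE_BULK_MAX_ENTR \
		((PAGE_SIZE - sizeof(struct kvfree_rcu_bulk_data)) / sizeof(void *))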
Proposal
========
Introduce a lock-free function that obtains a page from the
per-cpu-lists on the current CPU. It returns NULL rather than acquiring
any non-raw spinlock.

Description
===========
The page allocator has two phases, a fast path and a slow one. We are
interested in the fast path and order-0 allocations. The fast path is
in turn divided into two phases, lock-less and not:

1) As a first step the page allocator tries to obtain a page from the
   per-cpu-list, of which each CPU has its own. That is why this step
   is lock-less and fast. Basically it disables irqs on the current CPU
   in order to access per-cpu data and removes the first element from
   the pcp-list. The element/page is then returned to the user.

2) If there is no available page in the per-cpu-list, the second step
   is involved. It removes a specified number of elements from the
   buddy allocator, transferring them to the
   "supplied-list/per-cpu-list" described in [1].

Summarizing: __rcu_alloc_page_lockless() covers only [1] and cannot
perform step [2], due to the fact that [2] requires access to
zone->lock. This implies that it is super fast, but a higher rate of
failures is also expected.

Usage: __rcu_alloc_page_lockless();
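A hedged caller sketch of the above (the only API assumed here is
__rcu_alloc_page_lockless() itself; the wrapper name is hypothetical):

	static bool try_lockless_page(void **out)
	{
		unsigned long addr;

		/* Step [1] only: take a ready page from this CPU's pcplist. */
		addr = __rcu_alloc_page_lockless();
		if (!addr)
			return false;	/* pcplist empty: use the caller's own fallback. */

		/* An order-0 page, obtained without touching zone->lock. */
		*out = (void *) addr;
		return true;
	}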
Link: https://lore.kernel.org/lkml/20200814215206.GL3982@worktop.programming.kicks-ass.net/
Not-signed-off-by: Peter Zijlstra
Signed-off-by: Uladzislau Rezki (Sony)
---
 include/linux/gfp.h |  1 +
 mm/page_alloc.c     | 82 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 67a0774e080b..c065031b4403 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -565,6 +565,7 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
 
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
+extern unsigned long __rcu_alloc_page_lockless(void);
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
 void free_pages_exact(void *virt, size_t size);

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e2bab486fea..360c68ea3491 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4908,6 +4908,88 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
 }
 EXPORT_SYMBOL(__alloc_pages_nodemask);
 
+static struct page *__rmqueue_lockless(struct zone *zone, struct per_cpu_pages *pcp)
+{
+	struct list_head *list;
+	struct page *page;
+	int migratetype;
+
+	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++) {
+		list = &pcp->lists[migratetype];
+		page = list_first_entry_or_null(list, struct page, lru);
+		if (page && !check_new_pcp(page)) {
+			list_del(&page->lru);
+			pcp->count--;
+			return page;
+		}
+	}
+
+	return NULL;
+}
+
+/*
+ * Semantic of this function illustrates that a page
+ * is obtained in lock-free maneer. Instead of going
+ * deeper in the page allocator, it uses the pcplists
+ * only. Such way provides lock-less allocation method.
+ *
+ * Some notes are below:
+ *	- intended to use for RCU code only;
+ *	- it does not use any atomic reserves.
+ */
+unsigned long __rcu_alloc_page_lockless(void)
+{
+	struct zonelist *zonelist =
+		node_zonelist(numa_node_id(), GFP_KERNEL);
+	struct zoneref *z, *preferred_zoneref;
+	struct per_cpu_pages *pcp;
+	struct page *page;
+	unsigned long flags;
+	struct zone *zone;
+
+	/*
+	 * If DEBUG_PAGEALLOC is enabled, the post_alloc_hook()
+	 * in the prep_new_page() function also does some extra
+	 * page mappings via __kernel_map_pages(), what is arch
+	 * specific. It is for debug purpose only.
+	 *
+	 * For example, powerpc variant of __kernel_map_pages()
+	 * uses sleep-able locks. Thus a lock-less access can
+	 * not be provided if debug option is activated. In that
+	 * case it is fine to revert and return NULL, since RCU
+	 * code has a fallback mechanism. It is OK if it is used
+	 * for debug kernel.
+	 */
+	if (IS_ENABLED(CONFIG_DEBUG_PAGEALLOC))
+		return 0;
+
+	/*
+	 * Preferred zone is a first one in the zonelist.
+	 */
+	preferred_zoneref = NULL;
+
+	for_each_zone_zonelist(zone, z, zonelist, ZONE_NORMAL) {
+		if (!preferred_zoneref)
+			preferred_zoneref = z;
+
+		local_irq_save(flags);
+		pcp = &this_cpu_ptr(zone->pageset)->pcp;
+		page = __rmqueue_lockless(zone, pcp);
+		if (page) {
+			__count_zid_vm_events(PGALLOC, page_zonenum(page), 1);
+			zone_statistics(preferred_zoneref->zone, zone);
+		}
+		local_irq_restore(flags);
+
+		if (page) {
+			prep_new_page(page, 0, 0, 0);
+			return (unsigned long) page_address(page);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Common helper functions. Never use with __GFP_HIGHMEM because the returned
  * address cannot represent highmem pages. Use alloc_pages and then kmap if

From patchwork Fri Sep 18 19:48:16 2020
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, Andrew Morton, "Paul E. McKenney"
Cc: Peter Zijlstra, Michal Hocko, Vlastimil Babka, Thomas Gleixner,
 "Theodore Y. Ts'o", Joel Fernandes, Sebastian Andrzej Siewior,
 Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH 3/4] rcu/tree: use __rcu_alloc_page_lockless() func.
Date: Fri, 18 Sep 2020 21:48:16 +0200
Message-Id: <20200918194817.48921-4-urezki@gmail.com>
In-Reply-To: <20200918194817.48921-1-urezki@gmail.com>
References: <20200918194817.48921-1-urezki@gmail.com>

Use the newly introduced __rcu_alloc_page_lockless() function directly
in the k[v]free_rcu() path: a new pointer array can be obtained on
demand, which reduces the memory footprint, without any delay and just
in time.

Please note that we still keep the worker approach introduced earlier,
because the lock-less page allocation uses a per-cpu-list cache that
can be depleted, which is absolutely normal behaviour. In that case the
worker, by requesting a new page, will also initiate an internal
process that prefetches a specified number of elements from the buddy
allocator, populating the "pcplist" with fresh new pages.

The number of prefetched elements can be controlled via a sysctl
attribute. Please see /proc/sys/vm/percpu_pagelist_fraction.

Signed-off-by: Uladzislau Rezki (Sony)
---
 kernel/rcu/tree.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4bfc46a1e9d1..d51209343029 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3401,6 +3401,10 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 	if (!krcp->bkvhead[idx] ||
 			krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
 		bnode = get_cached_bnode(krcp);
+		if (!bnode)
+			bnode = (struct kvfree_rcu_bulk_data *)
+				__rcu_alloc_page_lockless();
+
 		/* Switch to emergency path. */
 		if (!bnode)
 			return false;
From patchwork Fri Sep 18 19:48:17 2020
McKenney" Cc: Peter Zijlstra , Michal Hocko , Vlastimil Babka , Thomas Gleixner , "Theodore Y . Ts'o" , Joel Fernandes , Sebastian Andrzej Siewior , Uladzislau Rezki , Oleksiy Avramchenko Subject: [PATCH 4/4] rcu/tree: Use schedule_delayed_work() instead of WQ_HIGHPRI queue Date: Fri, 18 Sep 2020 21:48:17 +0200 Message-Id: <20200918194817.48921-5-urezki@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200918194817.48921-1-urezki@gmail.com> References: <20200918194817.48921-1-urezki@gmail.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Recently the separate worker thread has been introduced to maintain the local page cache from the regular kernel context, instead of kvfree_rcu() contexts. That was done because a caller of the k[v]free_rcu() can be any context type what is a problem from the allocation point of view. From the other hand, the lock-less way of obtaining a page has been introduced and directly injected to the k[v]free_rcu() path. Therefore it is not important anymore to use a high priority "wq" for the external job that used to fill a page cache ASAP when it was empty. Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index d51209343029..f2b4215631f7 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3100,7 +3100,7 @@ struct kfree_rcu_cpu { * lockless an access has to be protected by the * per-cpu lock. */ - struct work_struct page_cache_work; + struct delayed_work page_cache_work; atomic_t work_in_progress; struct llist_head bkvcache; int nr_bkv_objs; @@ -3354,7 +3354,7 @@ static void fill_page_cache_func(struct work_struct *work) struct kvfree_rcu_bulk_data *bnode; struct kfree_rcu_cpu *krcp = container_of(work, struct kfree_rcu_cpu, - page_cache_work); + page_cache_work.work); unsigned long flags; bool pushed; int i; @@ -3440,7 +3440,6 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func) { unsigned long flags; struct kfree_rcu_cpu *krcp; - bool irq_disabled = irqs_disabled(); bool success; void *ptr; @@ -3473,9 +3472,9 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func) success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr); if (!success) { - // TODO: schedule the work from the hrtimer. - if (!irq_disabled && !atomic_xchg(&krcp->work_in_progress, 1)) - queue_work(system_highpri_wq, &krcp->page_cache_work); + // Use delayed work, so we do not deadlock with rq->lock. + if (!atomic_xchg(&krcp->work_in_progress, 1)) + schedule_delayed_work(&krcp->page_cache_work, 1); if (head == NULL) // Inline if kvfree_rcu(one_arg) call. @@ -4475,7 +4474,7 @@ static void __init kfree_rcu_batch_init(void) } INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor); - INIT_WORK(&krcp->page_cache_work, fill_page_cache_func); + INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func); krcp->initialized = true; } if (register_shrinker(&kfree_rcu_shrinker))