From patchwork Sun Mar 5 13:37:08 2023
X-Patchwork-Submitter: Marcelo Tosatti
X-Patchwork-Id: 13160144
Message-ID: <20230305134053.687304254@redhat.com>
User-Agent: quilt/0.67
Date: Sun, 05 Mar 2023 10:37:08 -0300
From: Marcelo Tosatti
To: Christoph Lameter
Cc: Aaron Tomlin, Frederic Weisbecker, Andrew Morton,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Russell King,
 Huacai Chen, Heiko Carstens, x86@kernel.org, Marcelo Tosatti
Subject: [PATCH v4 11/12] mm/vmstat: switch vmstat shepherd to flush per-CPU counters remotely
References: <20230305133657.255737580@redhat.com>

Now that the counters are modified via cmpxchg both CPU locally
(via the account functions), and remotely (via cpu_vm_stats_fold),
it's possible to switch vmstat_shepherd to perform the per-CPU
vmstats folding remotely.

This fixes the following two problems:

1. A customer provided evidence indicating that the idle tick was
   stopped while CPU-specific vmstat counters still remained
   populated. Thus one can only assume quiet_vmstat() was not
   invoked on return to the idle loop. If I understand correctly,
   this divergence might erroneously prevent a reclaim attempt by
   kswapd.
   If the number of zone-specific free pages falls below the
   per-CPU drift value, then zone_page_state_snapshot() is used to
   compute a more accurate view of the aforementioned statistic.
   Thus any task blocked on the NUMA node specific pfmemalloc_wait
   queue will be unable to make significant progress via direct
   reclaim unless it is killed after being woken up by kswapd
   (see throttle_direct_reclaim()).

2. With a SCHED_FIFO task that busy loops on a given CPU, and the
   kworker for that CPU at SCHED_OTHER priority, queuing work to
   sync per-CPU vmstats will either cause that work to never
   execute, or stalld (i.e. the stall daemon) boosts the kworker
   priority, which causes a latency violation.

Signed-off-by: Marcelo Tosatti

Index: linux-vmstat-remote/mm/vmstat.c
===================================================================
--- linux-vmstat-remote.orig/mm/vmstat.c
+++ linux-vmstat-remote/mm/vmstat.c
@@ -2004,6 +2004,23 @@ static void vmstat_shepherd(struct work_
 
 static DECLARE_DEFERRABLE_WORK(shepherd, vmstat_shepherd);
 
+#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
+/* Flush counters remotely if CPU uses cmpxchg to update its per-CPU counters */
+static void vmstat_shepherd(struct work_struct *w)
+{
+	int cpu;
+
+	cpus_read_lock();
+	for_each_online_cpu(cpu) {
+		cpu_vm_stats_fold(cpu);
+		cond_resched();
+	}
+	cpus_read_unlock();
+
+	schedule_delayed_work(&shepherd,
+		round_jiffies_relative(sysctl_stat_interval));
+}
+#else
 static void vmstat_shepherd(struct work_struct *w)
 {
 	int cpu;
@@ -2023,6 +2040,7 @@ static void vmstat_shepherd(struct work_
 	schedule_delayed_work(&shepherd,
 		round_jiffies_relative(sysctl_stat_interval));
 }
+#endif
 
 static void __init start_shepherd_timer(void)
 {