From patchwork Wed Jun 7 20:28:07 2023
X-Patchwork-Submitter: Marcelo Tosatti
X-Patchwork-Id: 13271500
Date: Wed, 7 Jun 2023 17:28:07 -0300
From: Marcelo Tosatti
To: Andrew Morton
Cc: Frederic Weisbecker, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Vlastimil Babka, Michal Hocko
Subject: [PATCH] vmstat: skip periodic vmstat update for isolated CPUs

Problem: The interruption caused by vmstat_update is undesirable
for certain applications, in particular workloads running on isolated
CPUs in nohz_full mode to shield them from any kernel interruption.
For example, a VM running a time-sensitive application with a 50us
maximum acceptable interruption (use case: soft PLC).

oslat   1094.456862: sys_mlock(start: 7f7ed0000b60, len: 1000)
oslat   1094.456971: workqueue_queue_work: ... function=vmstat_update ...
oslat   1094.456974: sched_switch: prev_comm=oslat ... ==> next_comm=kworker/5:1 ...
kworker 1094.456978: sched_switch: prev_comm=kworker/5:1 ==> next_comm=oslat ...

The example above shows an additional 7us for the

	oslat -> kworker -> oslat

switches. In the case of a virtualized CPU, where the vmstat_update
interruption occurs in the host (of a qemu-kvm vcpu), the latency penalty
observed in the guest is higher than 50us, violating the acceptable
latency threshold.

The isolated vCPU can perform operations that modify per-CPU page counters,
for example to complete I/O operations:

      CPU 11/KVM-9540    [001] dNh1.  2314.248584: mod_zone_page_state <-__folio_end_writeback
      CPU 11/KVM-9540    [001] dNh1.  2314.248585:
 => 0xffffffffc042b083
 => mod_zone_page_state
 => __folio_end_writeback
 => folio_end_writeback
 => iomap_finish_ioend
 => blk_mq_end_request_batch
 => nvme_irq
 => __handle_irq_event_percpu
 => handle_irq_event
 => handle_edge_irq
 => __common_interrupt
 => common_interrupt
 => asm_common_interrupt
 => vmx_do_interrupt_nmi_irqoff
 => vmx_handle_exit_irqoff
 => vcpu_enter_guest
 => vcpu_run
 => kvm_arch_vcpu_ioctl_run
 => kvm_vcpu_ioctl
 => __x64_sys_ioctl
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe

In-kernel users of vmstat counters either require the precise value, in
which case they use the zone_page_state_snapshot interface, or they can
live with an imprecision, since the regular flushing can happen at an
arbitrary time and the cumulative error can grow (see
calculate_normal_threshold).

From that POV the regular flushing can be postponed for CPUs that have
been isolated from kernel interference without critical infrastructure
ever noticing. Skip regular flushing from vmstat_shepherd for all
isolated CPUs to avoid interference with the isolated workload.

Suggested by Michal Hocko.

Acked-by: Michal Hocko
Signed-off-by: Marcelo Tosatti

---

v3: improve changelog (Michal Hocko)
v2: use cpu_is_isolated (Michal Hocko)

Index: linux-vmstat-remote/mm/vmstat.c
===================================================================
--- linux-vmstat-remote.orig/mm/vmstat.c
+++ linux-vmstat-remote/mm/vmstat.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include

 #include "internal.h"

@@ -2022,6 +2023,20 @@ static void vmstat_shepherd(struct work_
 	for_each_online_cpu(cpu) {
 		struct delayed_work *dw = &per_cpu(vmstat_work, cpu);

+		/*
+		 * In kernel users of vmstat counters either require the precise value and
+		 * they are using zone_page_state_snapshot interface or they can live with
+		 * an imprecision as the regular flushing can happen at arbitrary time and
+		 * cumulative error can grow (see calculate_normal_threshold).
+		 *
+		 * From that POV the regular flushing can be postponed for CPUs that have
+		 * been isolated from the kernel interference without critical
+		 * infrastructure ever noticing. Skip regular flushing from vmstat_shepherd
+		 * for all isolated CPUs to avoid interference with the isolated workload.
+		 */
+		if (cpu_is_isolated(cpu))
+			continue;
+
 		if (!delayed_work_pending(dw) && need_update(cpu))
 			queue_delayed_work_on(cpu, mm_percpu_wq, dw, 0);
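
For readers who want to reason about the resulting loop outside the kernel,
the decision made per CPU can be modelled in plain userspace C. This is only
a sketch under stated assumptions, not kernel code: struct model_cpu and
model_shepherd() are hypothetical names, and the three boolean fields merely
stand in for the results of cpu_is_isolated(), delayed_work_pending() and
need_update() from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one CPU as seen by the shepherd. */
struct model_cpu {
	bool isolated;		/* stands in for cpu_is_isolated(cpu) */
	bool work_pending;	/* stands in for delayed_work_pending(dw) */
	bool needs_update;	/* stands in for need_update(cpu) */
	bool queued;		/* set when the model "queues" the flush */
};

/*
 * Model of the patched loop body: isolated CPUs are skipped before any
 * other check, so no flush work is ever queued on them from here.
 * Returns the number of CPUs for which work was queued.
 */
static size_t model_shepherd(struct model_cpu *cpus, size_t ncpus)
{
	size_t queued = 0;

	assert(cpus != NULL || ncpus == 0);

	for (size_t i = 0; i < ncpus; i++) {
		struct model_cpu *c = &cpus[i];

		if (c->isolated)
			continue;	/* leave the isolated workload alone */

		if (!c->work_pending && c->needs_update) {
			c->queued = true;
			queued++;
		}
	}

	return queued;
}
```

In this model a CPU with dirty counters is flushed only if it is not isolated
and has no flush already pending; an isolated CPU keeps its per-CPU deltas
until some other path folds them.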