From patchwork Thu Dec 17 14:45:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Gruzdev
X-Patchwork-Id: 11979933
X-Patchwork-Original-From: Andrey Gruzdev via
From: Andrey Gruzdev
To: qemu-devel@nongnu.org
Cc: Juan Quintela, "Dr . David Alan Gilbert", Peter Xu, Markus Armbruster,
 Paolo Bonzini, Den Lunev, Andrey Gruzdev
Subject: [PATCH v9 5/5] migration: introduce 'userfaultfd-wrlat.py' script
Date: Thu, 17 Dec 2020 17:45:40 +0300
Message-Id: <20201217144540.365903-6-andrey.gruzdev@virtuozzo.com>
In-Reply-To: <20201217144540.365903-1-andrey.gruzdev@virtuozzo.com>
References: <20201217144540.365903-1-andrey.gruzdev@virtuozzo.com>
X-Mailer: git-send-email 2.25.1
Reply-to: Andrey Gruzdev

Add BCC/eBPF script to analyze userfaultfd write fault latency distribution.

Signed-off-by: Andrey Gruzdev
---
 scripts/userfaultfd-wrlat.py | 148 +++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100755 scripts/userfaultfd-wrlat.py

diff --git a/scripts/userfaultfd-wrlat.py b/scripts/userfaultfd-wrlat.py
new file mode 100755
index 0000000000..5ffd3c6c9a
--- /dev/null
+++ b/scripts/userfaultfd-wrlat.py
@@ -0,0 +1,148 @@
+#!/usr/bin/python3
+#
+# userfaultfd-wrlat Summarize userfaultfd write fault latencies.
+#                   Events are continuously accumulated for the
+#                   run, while latency distribution histogram is
+#                   dumped each 'interval' seconds.
+#
+# For Linux, uses BCC, eBPF.
+#
+# USAGE: userfaultfd-wrlat [interval [count]]
+#
+# Copyright Virtuozzo GmbH, 2020
+#
+# Authors:
+#   Andrey Gruzdev
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later. See the COPYING file in the top-level directory.
+
+from __future__ import print_function
+from bcc import BPF
+from ctypes import c_ushort, c_int, c_ulonglong
+from time import sleep
+from sys import argv, exit
+
+def usage():
+    print("USAGE: %s [interval [count]]" % argv[0])
+    exit()
+
+# define BPF program
+bpf_text = """
+#include <uapi/linux/ptrace.h>
+#include <linux/mm.h>
+
+/*
+ * UFFD page fault event descriptor.
+ * Used as a key to BPF_HASH table.
+ */
+struct ev_desc {
+    u64 pid;
+    u64 addr;
+};
+
+BPF_HASH(ev_start, struct ev_desc, u64);
+BPF_HASH(ctx_handle_userfault, u64, u64);
+BPF_HISTOGRAM(ev_delta_hist, u64);
+
+/* Trace UFFD page fault start event. */
+static void do_event_start(u64 pid, u64 address)
+{
+    struct ev_desc desc = { .pid = pid, .addr = address };
+    u64 ts = bpf_ktime_get_ns();
+
+    ev_start.insert(&desc, &ts);
+}
+
+/* Trace UFFD page fault end event. */
+static void do_event_end(u64 pid, u64 address)
+{
+    struct ev_desc desc = { .pid = pid, .addr = address };
+    u64 ts = bpf_ktime_get_ns();
+    u64 *tsp;
+
+    tsp = ev_start.lookup(&desc);
+    if (tsp) {
+        u64 delta = ts - (*tsp);
+        /* Transform time delta to milliseconds */
+        ev_delta_hist.increment(bpf_log2l(delta / 1000000));
+        ev_start.delete(&desc);
+    }
+}
+
+/* KPROBE for handle_userfault(). */
+int probe_handle_userfault(struct pt_regs *ctx, struct vm_fault *vmf,
+        unsigned long reason)
+{
+    /* Trace only UFFD write faults. */
+    if (reason & VM_UFFD_WP) {
+        u64 pid = (u32) bpf_get_current_pid_tgid();
+        u64 addr = vmf->address;
+
+        do_event_start(pid, addr);
+        ctx_handle_userfault.update(&pid, &addr);
+    }
+    return 0;
+}
+
+/* KRETPROBE for handle_userfault(). */
+int retprobe_handle_userfault(struct pt_regs *ctx)
+{
+    u64 pid = (u32) bpf_get_current_pid_tgid();
+    u64 *addr_p;
+
+    /*
+     * Here we just ignore the return value. In case of spurious wakeup
+     * or pending signal we'll still get (at least for v5.8.0 kernel)
+     * VM_FAULT_RETRY or (VM_FAULT_RETRY | VM_FAULT_MAJOR) here.
+     * Anyhow, handle_userfault() would be re-entered if such case happens,
+     * keeping initial timestamp unchanged for the faulting thread.
+     */
+    addr_p = ctx_handle_userfault.lookup(&pid);
+    if (addr_p) {
+        do_event_end(pid, *addr_p);
+        ctx_handle_userfault.delete(&pid);
+    }
+    return 0;
+}
+"""
+
+# arguments
+interval = 10
+count = -1
+if len(argv) > 1:
+    try:
+        interval = int(argv[1])
+        if interval <= 0:
+            raise ValueError
+        if len(argv) > 2:
+            count = int(argv[2])
+    except ValueError:  # also catches -h, --help
+        usage()
+
+# load BPF program
+b = BPF(text=bpf_text)
+# attach KPROBEs
+b.attach_kprobe(event="handle_userfault", fn_name="probe_handle_userfault")
+b.attach_kretprobe(event="handle_userfault", fn_name="retprobe_handle_userfault")
+
+# header
+print("Tracing UFFD-WP write fault latency... Hit Ctrl-C to end.")
+
+# output
+loop = 0
+do_exit = 0
+while (1):
+    if count > 0:
+        loop += 1
+        if loop > count:
+            exit()
+    try:
+        sleep(interval)
+    except KeyboardInterrupt:
+        do_exit = 1
+
+    print()
+    b["ev_delta_hist"].print_log2_hist("msecs")
+    if do_exit:
+        exit()
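
A note on reading the output: do_event_end() divides the nanosecond delta by 1000000 and increments the slot given by bpf_log2l(), so each histogram row covers a power-of-two range of whole milliseconds. The pure-Python sketch below models that bucketing without needing BCC or root. It is only an illustration: log2_slot() and slot_range() are hypothetical helper names, and the model assumes bpf_log2l(v) behaves like max(v.bit_length(), 1) and that print_log2_hist() labels slot i as ((1 << i) >> 1) -> ((1 << i) - 1), which may not hold for every BCC version.

```python
# Pure-Python model of the latency bucketing done by the BPF program.
# Assumption (not taken from the patch): bpf_log2l(v) == max(v.bit_length(), 1).

NSEC_PER_MSEC = 1000000


def log2_slot(delta_ns):
    """Return the histogram slot for a latency given in nanoseconds."""
    msecs = delta_ns // NSEC_PER_MSEC  # integer division, as in the BPF code
    return max(msecs.bit_length(), 1)


def slot_range(slot):
    """Return the inclusive (low, high) millisecond range labelled for a slot."""
    low = (1 << slot) >> 1
    high = (1 << slot) - 1
    if low == high:
        low -= 1  # the first row is labelled "0 -> 1"
    return low, high


if __name__ == "__main__":
    for delta_ns in (300000, 1500000, 2000000, 5000000, 70000000):
        slot = log2_slot(delta_ns)
        print("%10d ns -> slot %d, row '%d -> %d' msecs"
              % ((delta_ns, slot) + slot_range(slot)))
```

Under these assumptions every fault resolved in less than 2 ms lands in the first row, because the division truncates sub-millisecond detail. If finer resolution is needed, dividing by 1000 and labelling the histogram "usecs" would be one option, though that is not what this patch does.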