From patchwork Thu Nov 3 20:44:21 2022
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 13030996
Date: Thu, 3 Nov 2022 13:44:21 -0700
Message-ID: <20221103204421.1146958-1-dmatlack@google.com>
Subject: [PATCH v2] KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages
From: David Matlack <dmatlack@google.com>
To: Paolo Bonzini
Cc: Sean Christopherson, kvm@vger.kernel.org, David Matlack
X-Mailing-List: kvm@vger.kernel.org

Do not recover (i.e. zap) an NX Huge Page that is being dirty tracked,
as it will just be faulted back in at the same 4KiB granularity when
accessed by a vCPU.

This may need to be changed if KVM ever supports 2MiB (or larger) dirty
tracking granularity, or faulting huge pages during dirty tracking for
reads/executes. However, for now, these zaps are entirely wasteful.

This commit does nominally increase the CPU usage of the NX recovery
worker, by about 1% when testing with a VM with 16 slots.

Signed-off-by: David Matlack <dmatlack@google.com>
---
In order to check whether this commit increases the CPU usage of the NX
recovery worker thread, I used a modified version of execute_perf_test
[1] that supports splitting guest memory into multiple slots and
reports /proc/pid/schedstat:se.sum_exec_runtime for the NX recovery
worker just before tearing down the VM. The goal was to force a large
number of NX Huge Page recoveries and see if the recovery worker used
any more CPU.

Test Setup:

  echo 1000 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
  echo 10 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio

Test Command:

  ./execute_perf_test -v64 -s anonymous_hugetlb_1gb -x 16 -o
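For completeness, below is a minimal standalone sketch of just the
measurement step; it is my illustration, not the actual selftest code
(see [1] for that). It assumes the worker kthread can be matched by its
comm, which the kernel truncates to 15 characters ("kvm-nx-lpage-re"),
and that se.sum_exec_runtime is visible in /proc/<pid>/sched (as on
typical CONFIG_SCHED_DEBUG kernels):

/*
 * Sketch only: find the per-VM NX recovery kthread by its truncated
 * comm and print its cumulative runtime from /proc/<pid>/sched.
 */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	char path[64], comm[32], line[256];
	struct dirent *de;
	DIR *proc = opendir("/proc");
	FILE *f;
	int pid;

	if (!proc)
		return 1;

	while ((de = readdir(proc))) {
		/* Only numeric /proc entries are task directories. */
		if (sscanf(de->d_name, "%d", &pid) != 1)
			continue;

		snprintf(path, sizeof(path), "/proc/%d/comm", pid);
		f = fopen(path, "r");
		if (!f)
			continue;
		if (!fgets(comm, sizeof(comm), f)) {
			fclose(f);
			continue;
		}
		fclose(f);

		/* comms are truncated to 15 chars + NUL (TASK_COMM_LEN). */
		if (strncmp(comm, "kvm-nx-lpage-re", 15))
			continue;

		snprintf(path, sizeof(path), "/proc/%d/sched", pid);
		f = fopen(path, "r");
		if (!f)
			continue;
		while (fgets(line, sizeof(line), f))
			if (strstr(line, "se.sum_exec_runtime"))
				printf("pid %d: %s", pid, line);
		fclose(f);
	}
	closedir(proc);
	return 0;
}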
Results:

             | kvm-nx-lpage-re:se.sum_exec_runtime      |
             | ---------------------------------------- |
Run          | Before             | After               |
------------ | ------------------ | ------------------- |
1            | 730.084105         | 724.375314          |
2            | 728.751339         | 740.581988          |
3            | 736.264720         | 757.078163          |

Comparing the median results, this commit results in about a 1% increase
in CPU usage of the NX recovery worker.

[1] https://lore.kernel.org/kvm/20221019234050.3919566-2-dmatlack@google.com/

v2:
- Only skip NX Huge Pages that are actively being dirty tracked [Paolo]

v1: https://lore.kernel.org/kvm/20221027200316.2221027-1-dmatlack@google.com/

 arch/x86/kvm/mmu/mmu.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 82bc6321e58e..1c443f9aeb4b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6831,6 +6831,7 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
 static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 {
 	unsigned long nx_lpage_splits = kvm->stat.nx_lpage_splits;
+	struct kvm_memory_slot *slot;
 	int rcu_idx;
 	struct kvm_mmu_page *sp;
 	unsigned int ratio;
@@ -6865,7 +6866,21 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 				      struct kvm_mmu_page,
 				      possible_nx_huge_page_link);
 		WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
-		if (is_tdp_mmu_page(sp))
+		WARN_ON_ONCE(!sp->role.direct);
+
+		slot = gfn_to_memslot(kvm, sp->gfn);
+		WARN_ON_ONCE(!slot);
+
+		/*
+		 * Unaccount and do not attempt to recover any NX Huge Pages
+		 * that are being dirty tracked, as they would just be faulted
+		 * back in as 4KiB pages. The NX Huge Pages in this slot will be
+		 * recovered, along with all the other huge pages in the slot,
+		 * when dirty logging is disabled.
+		 */
+		if (slot && kvm_slot_dirty_track_enabled(slot))
+			unaccount_nx_huge_page(kvm, sp);
+		else if (is_tdp_mmu_page(sp))
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
 		else
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
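For reviewers' reference, the dirty-tracking predicate used above is an
existing helper that this patch does not touch. From memory (treat this
as a sketch rather than a verbatim quote), kvm_slot_dirty_track_enabled()
in include/linux/kvm_host.h reduces to a memslot flag check:

static inline bool kvm_slot_dirty_track_enabled(struct kvm_memory_slot *slot)
{
	/*
	 * Dirty logging is enabled iff userspace set KVM_MEM_LOG_DIRTY_PAGES
	 * on the memslot, so this is the condition under which the recovery
	 * worker now unaccounts the page instead of zapping it.
	 */
	return slot->flags & KVM_MEM_LOG_DIRTY_PAGES;
}

In other words, a disallowed NX Huge Page is skipped only while its
memslot has dirty logging enabled; once userspace clears
KVM_MEM_LOG_DIRTY_PAGES, subsequent faults can map the slot huge again,
recovering those pages without the worker having to zap anything.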