Message ID | 20230125182311.2022303-3-bgardon@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | selftests: KVM: Add a test for eager page splitting | expand |
On Wed, Jan 25, 2023 at 10:23 AM Ben Gardon <bgardon@google.com> wrote: > +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) > +{ > + int i; > + > + iteration++; > + for (i = 0; i < VCPUS; i++) { > + while (READ_ONCE(vcpu_last_completed_iteration[i]) != > + iteration) > + ; > + } > + > + get_page_stats(vm, stats, stage); get_page_stats() is already called in run_test() explicitly for other stats. I think it's better to split this function and make the flow like: run_vcpus_till_iteration(iteration++); get_page_stats(vm, &stats_populated, "populating memory"); This makes it easy to follow run_test_till_iteration() and easy to see where stats are collected. run_test_till_iteration() can also be a library function used by other tests like dirty_log_perf_test > + dirty_log_manual_caps = 0; > + for_each_guest_mode(run_test, NULL); > + > + dirty_log_manual_caps = > + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); > + > + if (dirty_log_manual_caps) { > + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | > + KVM_DIRTY_LOG_INITIALLY_SET); > + for_each_guest_mode(run_test, NULL); > + } Should there be a message to show that this capability is not tested as it is not available? Or, there can be a command line option to explicitly provide intent of testing combined, split modes, or both? Then test can error out accordingly.
On Thu, Jan 26, 2023 at 12:06 PM Vipin Sharma <vipinsh@google.com> wrote: > > On Wed, Jan 25, 2023 at 10:23 AM Ben Gardon <bgardon@google.com> wrote: > > > +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) > > +{ > > + int i; > > + > > + iteration++; > > + for (i = 0; i < VCPUS; i++) { > > + while (READ_ONCE(vcpu_last_completed_iteration[i]) != > > + iteration) > > + ; > > + } > > + > > + get_page_stats(vm, stats, stage); > > get_page_stats() is already called in run_test() explicitly for other > stats. I think it's better to split this function and make the flow > like: > > run_vcpus_till_iteration(iteration++); > get_page_stats(vm, &stats_populated, "populating memory"); > > This makes it easy to follow run_test_till_iteration() and easy to see > where stats are collected. run_test_till_iteration() can also be a > library function used by other tests like dirty_log_perf_test Yeah, either way works. We can do it all in the run_tests function as I originally had or we can have the run vcpus and get stats in a helper as David suggested or we can separate run_vcpus and get_stats helpers as you're suggesting. I don't think it makes much of a difference. If you feel strongly I can send out another iteration of this test. > > > > + dirty_log_manual_caps = 0; > > + for_each_guest_mode(run_test, NULL); > > + > > + dirty_log_manual_caps = > > + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); > > + > > + if (dirty_log_manual_caps) { > > + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | > > + KVM_DIRTY_LOG_INITIALLY_SET); > > + for_each_guest_mode(run_test, NULL); > > + } > > Should there be a message to show that this capability is not tested > as it is not available? > Or, there can be a command line option to explicitly provide intent of > testing combined, split modes, or both? Then test can error out > accordingly. Sure, that would work too. If I send another version of this series I can add a skip message, but I don't want to re-add an option to specify whether to run with MANUAL_PROTECT, because that's what I had originally and then David suggested I remove it and just always run both.
On Thu, Jan 26, 2023 at 2:52 PM Ben Gardon <bgardon@google.com> wrote: > > On Thu, Jan 26, 2023 at 12:06 PM Vipin Sharma <vipinsh@google.com> wrote: > > > > On Wed, Jan 25, 2023 at 10:23 AM Ben Gardon <bgardon@google.com> wrote: > > > > > +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) > > > +{ > > > + int i; > > > + > > > + iteration++; > > > + for (i = 0; i < VCPUS; i++) { > > > + while (READ_ONCE(vcpu_last_completed_iteration[i]) != > > > + iteration) > > > + ; > > > + } > > > + > > > + get_page_stats(vm, stats, stage); > > > > get_page_stats() is already called in run_test() explicitly for other > > stats. I think it's better to split this function and make the flow > > like: > > > > run_vcpus_till_iteration(iteration++); > > get_page_stats(vm, &stats_populated, "populating memory"); > > > > This makes it easy to follow run_test_till_iteration() and easy to see > > where stats are collected. run_test_till_iteration() can also be a > > library function used by other tests like dirty_log_perf_test > > Yeah, either way works. We can do it all in the run_tests function as > I originally had or we can have the run vcpus and get stats in a > helper as David suggested or we can separate run_vcpus and get_stats > helpers as you're suggesting. I don't think it makes much of a > difference. > If you feel strongly I can send out another iteration of this test. > I should have read David's comment and responded in that version. No strong feelings. It is up to you. > > > > > > > + dirty_log_manual_caps = 0; > > > + for_each_guest_mode(run_test, NULL); > > > + > > > + dirty_log_manual_caps = > > > + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); > > > + > > > + if (dirty_log_manual_caps) { > > > + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | > > > + KVM_DIRTY_LOG_INITIALLY_SET); > > > + for_each_guest_mode(run_test, NULL); > > > + } > > > > Should there be a message to show that this capability is not tested > > as it is not available? > > Or, there can be a command line option to explicitly provide intent of > > testing combined, split modes, or both? Then test can error out > > accordingly. > > Sure, that would work too. If I send another version of this series I > can add a skip message, but I don't want to re-add an option to > specify whether to run with MANUAL_PROTECT, because that's what I had > originally and then David suggested I remove it and just always run > both. Sounds good. Reviewed-By: Vipin Sharma <vipinsh@google.com>
On Thu, Jan 26, 2023 at 3:04 PM Vipin Sharma <vipinsh@google.com> wrote: > > On Thu, Jan 26, 2023 at 2:52 PM Ben Gardon <bgardon@google.com> wrote: > > > > On Thu, Jan 26, 2023 at 12:06 PM Vipin Sharma <vipinsh@google.com> wrote: > > > > > > On Wed, Jan 25, 2023 at 10:23 AM Ben Gardon <bgardon@google.com> wrote: > > > > > > > +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) > > > > +{ > > > > + int i; > > > > + > > > > + iteration++; > > > > + for (i = 0; i < VCPUS; i++) { > > > > + while (READ_ONCE(vcpu_last_completed_iteration[i]) != > > > > + iteration) > > > > + ; > > > > + } > > > > + > > > > + get_page_stats(vm, stats, stage); > > > > > > get_page_stats() is already called in run_test() explicitly for other > > > stats. I think it's better to split this function and make the flow > > > like: > > > > > > run_vcpus_till_iteration(iteration++); > > > get_page_stats(vm, &stats_populated, "populating memory"); > > > > > > This makes it easy to follow run_test_till_iteration() and easy to see > > > where stats are collected. run_test_till_iteration() can also be a > > > library function used by other tests like dirty_log_perf_test > > > > Yeah, either way works. We can do it all in the run_tests function as > > I originally had or we can have the run vcpus and get stats in a > > helper as David suggested or we can separate run_vcpus and get_stats > > helpers as you're suggesting. I don't think it makes much of a > > difference. > > If you feel strongly I can send out another iteration of this test. > > > > I should have read David's comment and responded in that version. > No strong feelings. It is up to you. No worries, it probably would have been easier to track down if I had links in the cover letter. :) > > > > > > > > > > > + dirty_log_manual_caps = 0; > > > > + for_each_guest_mode(run_test, NULL); > > > > + > > > > + dirty_log_manual_caps = > > > > + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); > > > > + > > > > + if (dirty_log_manual_caps) { > > > > + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | > > > > + KVM_DIRTY_LOG_INITIALLY_SET); > > > > + for_each_guest_mode(run_test, NULL); > > > > + } > > > > > > Should there be a message to show that this capability is not tested > > > as it is not available? > > > Or, there can be a command line option to explicitly provide intent of > > > testing combined, split modes, or both? Then test can error out > > > accordingly. > > > > Sure, that would work too. If I send another version of this series I > > can add a skip message, but I don't want to re-add an option to > > specify whether to run with MANUAL_PROTECT, because that's what I had > > originally and then David suggested I remove it and just always run > > both. > > Sounds good. > > Reviewed-By: Vipin Sharma <vipinsh@google.com> Thanks!
On Thu, Jan 26, 2023 at 3:12 PM Ben Gardon <bgardon@google.com> wrote: > > On Thu, Jan 26, 2023 at 3:04 PM Vipin Sharma <vipinsh@google.com> wrote: > > > > On Thu, Jan 26, 2023 at 2:52 PM Ben Gardon <bgardon@google.com> wrote: > > > > > > On Thu, Jan 26, 2023 at 12:06 PM Vipin Sharma <vipinsh@google.com> wrote: > > > > > > > > On Wed, Jan 25, 2023 at 10:23 AM Ben Gardon <bgardon@google.com> wrote: > > > > > > > > > +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) > > > > > +{ > > > > > + int i; > > > > > + > > > > > + iteration++; > > > > > + for (i = 0; i < VCPUS; i++) { > > > > > + while (READ_ONCE(vcpu_last_completed_iteration[i]) != > > > > > + iteration) > > > > > + ; > > > > > + } > > > > > + > > > > > + get_page_stats(vm, stats, stage); > > > > > > > > get_page_stats() is already called in run_test() explicitly for other > > > > stats. I think it's better to split this function and make the flow > > > > like: > > > > > > > > run_vcpus_till_iteration(iteration++); > > > > get_page_stats(vm, &stats_populated, "populating memory"); > > > > > > > > This makes it easy to follow run_test_till_iteration() and easy to see > > > > where stats are collected. run_test_till_iteration() can also be a > > > > library function used by other tests like dirty_log_perf_test > > > > > > Yeah, either way works. We can do it all in the run_tests function as > > > I originally had or we can have the run vcpus and get stats in a > > > helper as David suggested or we can separate run_vcpus and get_stats > > > helpers as you're suggesting. I don't think it makes much of a > > > difference. > > > If you feel strongly I can send out another iteration of this test. > > > > > > > I should have read David's comment and responded in that version. > > No strong feelings. It is up to you. > > No worries, it probably would have been easier to track down if I had > links in the cover letter. :) I mostly wanted the loop to go in a helper and threw in the stats without really thinking about it. Now that I see it I like Vipin's idea of having the stats call separate.
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 1750f91dd9362..9a8022918c3ea 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -61,6 +61,7 @@ TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh # Compiled test targets TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test +TEST_GEN_PROGS_x86_64 += x86_64/dirty_log_page_splitting_test TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features TEST_GEN_PROGS_x86_64 += x86_64/exit_on_emulation_failure_test TEST_GEN_PROGS_x86_64 += x86_64/fix_hypercall_test diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index fbc2a79369b8b..a089c356f354e 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -213,6 +213,7 @@ extern const struct vm_guest_mode_params vm_guest_mode_params[]; int open_path_or_exit(const char *path, int flags); int open_kvm_dev_path_or_exit(void); +bool get_kvm_param_bool(const char *param); bool get_kvm_intel_param_bool(const char *param); bool get_kvm_amd_param_bool(const char *param); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 56d5ea949cbbe..fa6d69f731990 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -80,6 +80,11 @@ static bool get_module_param_bool(const char *module_name, const char *param) TEST_FAIL("Unrecognized value '%c' for boolean module param", value); } +bool get_kvm_param_bool(const char *param) +{ + return get_module_param_bool("kvm", param); +} + bool get_kvm_intel_param_bool(const char *param) { return get_module_param_bool("kvm_intel", param); diff --git a/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c new file mode 100644 index 0000000000000..240e7d220afa8 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c @@ -0,0 +1,257 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KVM dirty logging page splitting test + * + * Based on dirty_log_perf.c + * + * Copyright (C) 2018, Red Hat, Inc. + * Copyright (C) 2023, Google, Inc. + */ + +#include <stdio.h> +#include <stdlib.h> +#include <pthread.h> +#include <linux/bitmap.h> + +#include "kvm_util.h" +#include "test_util.h" +#include "memstress.h" +#include "guest_modes.h" + +#define VCPUS 2 +#define SLOTS 2 +#define ITERATIONS 2 + +static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE; + +static enum vm_mem_backing_src_type backing_src = VM_MEM_SRC_ANONYMOUS_HUGETLB; + +static u64 dirty_log_manual_caps; +static bool host_quit; +static int iteration; +static int vcpu_last_completed_iteration[KVM_MAX_VCPUS]; + +struct kvm_page_stats { + uint64_t pages_4k; + uint64_t pages_2m; + uint64_t pages_1g; + uint64_t hugepages; +}; + +static void get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) +{ + stats->pages_4k = vm_get_stat(vm, "pages_4k"); + stats->pages_2m = vm_get_stat(vm, "pages_2m"); + stats->pages_1g = vm_get_stat(vm, "pages_1g"); + stats->hugepages = stats->pages_2m + stats->pages_1g; + + pr_debug("\nPage stats after %s: 4K: %ld 2M: %ld 1G: %ld huge: %ld\n", + stage, stats->pages_4k, stats->pages_2m, stats->pages_1g, + stats->hugepages); +} + +static void run_vcpus_get_page_stats(struct kvm_vm *vm, struct kvm_page_stats *stats, const char *stage) +{ + int i; + + iteration++; + for (i = 0; i < VCPUS; i++) { + while (READ_ONCE(vcpu_last_completed_iteration[i]) != + iteration) + ; + } + + get_page_stats(vm, stats, stage); +} + +static void vcpu_worker(struct memstress_vcpu_args *vcpu_args) +{ + struct kvm_vcpu *vcpu = vcpu_args->vcpu; + int vcpu_idx = vcpu_args->vcpu_idx; + + while (!READ_ONCE(host_quit)) { + int current_iteration = READ_ONCE(iteration); + + vcpu_run(vcpu); + + ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_SYNC); + + vcpu_last_completed_iteration[vcpu_idx] = current_iteration; + + /* Wait for the start of the next iteration to be signaled. */ + while (current_iteration == READ_ONCE(iteration) && + READ_ONCE(iteration) >= 0 && + !READ_ONCE(host_quit)) + ; + } +} + +static void run_test(enum vm_guest_mode mode, void *unused) +{ + struct kvm_vm *vm; + unsigned long **bitmaps; + uint64_t guest_num_pages; + uint64_t host_num_pages; + uint64_t pages_per_slot; + int i; + uint64_t total_4k_pages; + struct kvm_page_stats stats_populated; + struct kvm_page_stats stats_dirty_logging_enabled; + struct kvm_page_stats stats_dirty_pass[ITERATIONS]; + struct kvm_page_stats stats_clear_pass[ITERATIONS]; + struct kvm_page_stats stats_dirty_logging_disabled; + struct kvm_page_stats stats_repopulated; + + vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size, + SLOTS, backing_src, false); + + guest_num_pages = (VCPUS * guest_percpu_mem_size) >> vm->page_shift; + guest_num_pages = vm_adjust_num_guest_pages(mode, guest_num_pages); + host_num_pages = vm_num_host_pages(mode, guest_num_pages); + pages_per_slot = host_num_pages / SLOTS; + + bitmaps = memstress_alloc_bitmaps(SLOTS, pages_per_slot); + + if (dirty_log_manual_caps) + vm_enable_cap(vm, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, + dirty_log_manual_caps); + + /* Start the iterations */ + iteration = -1; + host_quit = false; + + for (i = 0; i < VCPUS; i++) + vcpu_last_completed_iteration[i] = -1; + + memstress_start_vcpu_threads(VCPUS, vcpu_worker); + + run_vcpus_get_page_stats(vm, &stats_populated, "populating memory"); + + /* Enable dirty logging */ + memstress_enable_dirty_logging(vm, SLOTS); + + get_page_stats(vm, &stats_dirty_logging_enabled, "enabling dirty logging"); + + while (iteration < ITERATIONS) { + run_vcpus_get_page_stats(vm, &stats_dirty_pass[iteration - 1], + "dirtying memory"); + + memstress_get_dirty_log(vm, bitmaps, SLOTS); + + if (dirty_log_manual_caps) { + memstress_clear_dirty_log(vm, bitmaps, SLOTS, pages_per_slot); + + get_page_stats(vm, &stats_clear_pass[iteration - 1], "clearing dirty log"); + } + } + + /* Disable dirty logging */ + memstress_disable_dirty_logging(vm, SLOTS); + + get_page_stats(vm, &stats_dirty_logging_disabled, "disabling dirty logging"); + + /* Run vCPUs again to fault pages back in. */ + run_vcpus_get_page_stats(vm, &stats_repopulated, "repopulating memory"); + + /* + * Tell the vCPU threads to quit. No need to manually check that vCPUs + * have stopped running after disabling dirty logging, the join will + * wait for them to exit. + */ + host_quit = true; + memstress_join_vcpu_threads(VCPUS); + + memstress_free_bitmaps(bitmaps, SLOTS); + memstress_destroy_vm(vm); + + /* Make assertions about the page counts. */ + total_4k_pages = stats_populated.pages_4k; + total_4k_pages += stats_populated.pages_2m * 512; + total_4k_pages += stats_populated.pages_1g * 512 * 512; + + /* + * Check that all huge pages were split. Since large pages can only + * exist in the data slot, and the vCPUs should have dirtied all pages + * in the data slot, there should be no huge pages left after splitting. + * Splitting happens at dirty log enable time without + * KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 and after the first clear pass + * with that capability. + */ + if (dirty_log_manual_caps) { + ASSERT_EQ(stats_clear_pass[0].hugepages, 0); + ASSERT_EQ(stats_clear_pass[0].pages_4k, total_4k_pages); + ASSERT_EQ(stats_dirty_logging_enabled.hugepages, stats_populated.hugepages); + } else { + ASSERT_EQ(stats_dirty_logging_enabled.hugepages, 0); + ASSERT_EQ(stats_dirty_logging_enabled.pages_4k, total_4k_pages); + } + + /* + * Once dirty logging is disabled and the vCPUs have touched all their + * memory again, the page counts should be the same as they were + * right after initial population of memory. + */ + ASSERT_EQ(stats_populated.pages_4k, stats_repopulated.pages_4k); + ASSERT_EQ(stats_populated.pages_2m, stats_repopulated.pages_2m); + ASSERT_EQ(stats_populated.pages_1g, stats_repopulated.pages_1g); +} + +static void help(char *name) +{ + puts(""); + printf("usage: %s [-h] [-b vcpu bytes] [-s mem type]\n", + name); + puts(""); + printf(" -b: specify the size of the memory region which should be\n" + " dirtied by each vCPU. e.g. 10M or 3G.\n" + " (default: 1G)\n"); + backing_src_help("-s"); + puts(""); +} + +int main(int argc, char *argv[]) +{ + int opt; + + TEST_REQUIRE(get_kvm_param_bool("eager_page_split")); + TEST_REQUIRE(get_kvm_param_bool("tdp_mmu")); + + while ((opt = getopt(argc, argv, "b:hs:")) != -1) { + switch (opt) { + case 'b': + guest_percpu_mem_size = parse_size(optarg); + break; + case 'h': + help(argv[0]); + exit(0); + case 's': + backing_src = parse_backing_src_type(optarg); + break; + default: + help(argv[0]); + exit(1); + } + } + + if (!is_backing_src_hugetlb(backing_src)) { + pr_info("This test will only work reliably with HugeTLB memory. " + "It can work with THP, but that is best effort."); + return KSFT_SKIP; + } + + guest_modes_append_default(); + + dirty_log_manual_caps = 0; + for_each_guest_mode(run_test, NULL); + + dirty_log_manual_caps = + kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2); + + if (dirty_log_manual_caps) { + dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | + KVM_DIRTY_LOG_INITIALLY_SET); + for_each_guest_mode(run_test, NULL); + } + + return 0; +}
Add a test for page splitting during dirty logging and for hugepage recovery after dirty logging. Page splitting represents non-trivial behavior, which is complicated by MANUAL_PROTECT mode, which causes pages to be split on the first clear, instead of when dirty logging is enabled. Add a test which makes assertions about page counts to help define the expected behavior of page splitting and to provide needed coverage of the behavior. This also helps ensure that a failure in eager page splitting is not covered up by splitting in the vCPU path. Tested by running the test on an Intel Haswell machine w/wo MANUAL_PROTECT. Signed-off-by: Ben Gardon <bgardon@google.com> --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/include/kvm_util_base.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c | 5 + .../x86_64/dirty_log_page_splitting_test.c | 257 ++++++++++++++++++ 4 files changed, 264 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c