diff mbox series

[V8,2/2] KVM: selftests: Add test for configure of x86 APIC bus frequency

Message ID 09b11d24e957056c621e3bf2d6c9d78bd4f7461b.1718043121.git.reinette.chatre@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Make bus clock frequency for vAPIC timer configurable | expand

Commit Message

Reinette Chatre June 10, 2024, 6:25 p.m. UTC
From: Isaku Yamahata <isaku.yamahata@intel.com>

Test if KVM emulates the APIC bus clock at the expected frequency when
userspace configures the frequency via KVM_CAP_X86_APIC_BUS_CYCLES_NS.

Set APIC timer's initial count to the maximum value and busy wait for 100
msec (largely arbitrary) using the TSC. Read the APIC timer's "current
count" to calculate the actual APIC bus clock frequency based on TSC
frequency.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes v8:
- With TSC frequency switching to unsigned int type the implicit
  type promotion during bus frequency computation results in
  overflow. Use explicit unsigned long type within computation to
  avoid overflow.

Changes v7:
- Drop Maxim Levitsky's Reviewed-by because of significant changes.
- Remove redefine of _GNU_SOURCE. (kernel test robot)
- Rewrite changelog and test description. (Sean)
- Enable user space to set APIC bus frequency. (Sean)
- Use GUEST_ASSERT() from guest instead of TEST_ASSERT() from host. (Sean)
- Test xAPIC as well as x2APIC. (Sean)
- Add check that KVM_CAP_X86_APIC_BUS_CYCLES_NS cannot be set
  after vCPU created. (Sean)
- Use new udelay() utility. (Sean)
- Drop CLI. Mask LVT timer independently. (Sean)
- Use correct LVT timer entry (0x350 -> 0x320) to enable oneshot operation.
- Remove unnecessary static functions from single file test.
- Be consistent in types by using uint32_t/uint64_t instead of
  u32/u64.

Changes v6:
- Use vm_create() wrapper instead of open coding it. (Zide)
- Improve grammar of test description. (Zide)

Changes v5:
- Update to new name of capability KVM_CAP_X86_APIC_BUS_FREQUENCY ->
  KVM_CAP_X86_APIC_BUS_CYCLES_NS. (Xiaoyao Li)

Changes v4:
- Rework changelog.
- Add Sean's "Suggested-by" to acknowledge guidance received in
  https://lore.kernel.org/all/ZU0BASXWcck85r90@google.com/
- Add copyright.
- Add test description to file header.
- Consistent capitalization for acronyms.
- Rebase to kvm-x86/next.
- Update to v4 change of providing bus clock rate in nanoseconds.
- Add a "TEST_REQUIRE()" for the new capability so that the test can
  work on kernels that do not support the new capability.
- Address checkpatch warnings and use tabs instead of spaces in header
  file to match existing code.

Changes v3:
- Use 1.5GHz instead of 1GHz as frequency.

Changes v2:
- Newly added.

fixup of test
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/apic.h       |   8 +
 .../kvm/x86_64/apic_bus_clock_test.c          | 219 ++++++++++++++++++
 3 files changed, 228 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c

Comments

Sean Christopherson June 11, 2024, 12:51 a.m. UTC | #1
On Mon, Jun 10, 2024, Reinette Chatre wrote:
> diff --git a/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
> new file mode 100644
> index 000000000000..602cec91d8ee
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024 Intel Corporation
> + *
> + * Verify KVM correctly emulates the APIC bus frequency when the VMM configures
> + * the frequency via KVM_CAP_X86_APIC_BUS_CYCLES_NS.  Start the APIC timer by
> + * programming TMICT (timer initial count) to the largest value possible (so
> + * that the timer will not expire during the test).  Then, after an arbitrary
> + * amount of time has elapsed, verify TMCCT (timer current count) is within 1%
> + * of the expected value based on the time elapsed, the APIC bus frequency, and
> + * the programmed TDCR (timer divide configuration register).
> + */
> +
> +#include "apic.h"
> +#include "test_util.h"
> +
> +/*
> + * Pick 25MHz for APIC bus frequency. Different enough from the default 1GHz.
> + * User can override via command line.
> + */
> +unsigned long apic_hz = 25 * 1000 * 1000;

static, and maybe a uint64_t to match the other stuff?

> +/*
> + * Possible TDCR values with matching divide count. Used to modify APIC
> + * timer frequency.
> + */
> +struct {
> +	uint32_t tdcr;
> +	uint32_t divide_count;
> +} tdcrs[] = {
> +	{0x0, 2},
> +	{0x1, 4},
> +	{0x2, 8},
> +	{0x3, 16},
> +	{0x8, 32},
> +	{0x9, 64},
> +	{0xa, 128},
> +	{0xb, 1},
> +};
> +
> +void guest_verify(uint64_t tsc_cycles, uint32_t apic_cycles, uint32_t divide_count)

uin64_t for apic_cycles?  And maybe something like guest_check_apic_count(), to
make it more obvious what is being verified?  Actually, it should be quite easy
to have the two flavors share the bulk of the code.

> +{
> +	unsigned long tsc_hz = tsc_khz * 1000;
> +	uint64_t freq;
> +
> +	GUEST_ASSERT(tsc_cycles > 0);

Is this necessary?  Won't the "freq < ..." check fail?  I love me some paranoia,
but this seems unnecessary.

> +	freq = apic_cycles * divide_count * tsc_hz / tsc_cycles;
> +	/* Check if measured frequency is within 1% of configured frequency. */
> +	GUEST_ASSERT(freq < apic_hz * 101 / 100);
> +	GUEST_ASSERT(freq > apic_hz * 99 / 100);
> +}
> +
> +void x2apic_guest_code(void)
> +{
> +	uint32_t tmict, tmcct;
> +	uint64_t tsc0, tsc1;
> +	int i;
> +
> +	x2apic_enable();
> +
> +	/*
> +	 * Setup one-shot timer.  The vector does not matter because the
> +	 * interrupt should not fire.
> +	 */
> +	x2apic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
> +
> +	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
> +		x2apic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
> +
> +		/* Set the largest value to not trigger the interrupt. */

Nit, the goal isn't to avoid triggering the interrupt, e.g. the above masking
takes care of that.  The goal is to prevent the timer from expiring, because if
the timer expires it stops counting and will trigger a false failure because the
TSC doesn't stop counting.

Honestly, I would just delete the comment.  Same with the "busy wait for 100 msec"
and "Read APIC timer and TSC" comments.  They state the obvious.  Loading the max
TMICT is mildly interesting, but that's covered by the file-level comment.

> +		tmict = ~0;

This really belongs outside of the loop, e.g.

	const uint32_t tmict = ~0u;

> +		x2apic_write_reg(APIC_TMICT, tmict);
> +
> +		/* Busy wait for 100 msec. */

Hmm, should this be configurable?

> +		tsc0 = rdtsc();
> +		udelay(100000);
> +		/* Read APIC timer and TSC. */
> +		tmcct = x2apic_read_reg(APIC_TMCCT);
> +		tsc1 = rdtsc();
> +
> +		/* Stop timer. */

This comment is a bit more interesting, as readers might not know writing '0'
stops the timer.  But that's even more interesting is the ordering, e.g. it's
not at all unreasonable to think that the timer should be stopped _before_ reading
the current count.  E.g. something like:

		/*
		 * Stop the timer _after_ reading the current, final count, as
		 * writing the initial counter also modifies the current count.
		 */

> +		x2apic_write_reg(APIC_TMICT, 0);
> +
> +		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);
> +	}
> +
> +	GUEST_DONE();
> +}
> +
> +void xapic_guest_code(void)
> +{
> +	uint32_t tmict, tmcct;
> +	uint64_t tsc0, tsc1;
> +	int i;
> +
> +	xapic_enable();
> +
> +	/*
> +	 * Setup one-shot timer.  The vector does not matter because the
> +	 * interrupt should not fire.
> +	 */
> +	xapic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
> +
> +	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
> +		xapic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
> +
> +		/* Set the largest value to not trigger the interrupt. */
> +		tmict = ~0;
> +		xapic_write_reg(APIC_TMICT, tmict);
> +
> +		/* Busy wait for 100 msec. */
> +		tsc0 = rdtsc();
> +		udelay(100000);
> +		/* Read APIC timer and TSC. */
> +		tmcct = xapic_read_reg(APIC_TMCCT);
> +		tsc1 = rdtsc();
> +
> +		/* Stop timer. */
> +		xapic_write_reg(APIC_TMICT, 0);
> +
> +		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);

That's some nice copy+paste :-)

This test isn't writing ICR, so the whole 32-bit vs. 64-bit weirdness with xAPIC
vs X2APIC is irrevelant.  Two tiny helpers, a global flag, and you can avoid a
pile of copy+paste, and the need to find a better name than guest_verify().

static bool is_x2apic;

static uint32_t apic_read_reg(unsigned int reg)
{
	return is_x2apic ? x2apic_read_reg(reg) : xapic_read_reg(reg);
}

static void apic_read_write(unsigned int reg, uint32_t val)
{
	if (is_x2apic)
		x2apic_write_reg(reg, val);
	else
		xapic_write_reg(reg, val);
	return is_x2apic ? x2apic_read_reg(reg) : xapic_read_reg(reg);
}

> +	}
> +
> +	GUEST_DONE();
> +}
> +
> +void test_apic_bus_clock(struct kvm_vcpu *vcpu)
> +{
> +	bool done = false;
> +	struct ucall uc;
> +
> +	while (!done) {
> +		vcpu_run(vcpu);
> +		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
> +
> +		switch (get_ucall(vcpu, &uc)) {
> +		case UCALL_DONE:
> +			done = true;
> +			break;
> +		case UCALL_ABORT:
> +			REPORT_GUEST_ASSERT(uc);
> +			break;
> +		default:
> +			TEST_FAIL("Unknown ucall %lu", uc.cmd);
> +			break;
> +		}
> +	}
> +}
> +
> +void run_apic_bus_clock_test(bool xapic)
> +{
> +	struct kvm_vcpu *vcpu;
> +	struct kvm_vm *vm;
> +	int ret;
> +
> +	vm = vm_create(1);
> +
> +	sync_global_to_guest(vm, apic_hz);
> +
> +	vm_enable_cap(vm, KVM_CAP_X86_APIC_BUS_CYCLES_NS,
> +		      NSEC_PER_SEC / apic_hz);
> +
> +	vcpu = vm_vcpu_add(vm, 0, xapic ? xapic_guest_code : x2apic_guest_code);
> +
> +	ret = __vm_enable_cap(vm, KVM_CAP_X86_APIC_BUS_CYCLES_NS,
> +			      NSEC_PER_SEC / apic_hz);
> +	TEST_ASSERT(ret < 0 && errno == EINVAL,
> +		    "Setting of APIC bus frequency after vCPU is created should fail.");
> +
> +	if (xapic)
> +		virt_pg_map(vm, APIC_DEFAULT_GPA, APIC_DEFAULT_GPA);
> +
> +	test_apic_bus_clock(vcpu);
> +	kvm_vm_free(vm);
> +}
> +
> +void run_xapic_bus_clock_test(void)
> +{
> +	run_apic_bus_clock_test(true);
> +}
> +
> +void run_x2apic_bus_clock_test(void)
> +{
> +	run_apic_bus_clock_test(false);
> +}
> +
> +void help(char *name)
> +{
> +	puts("");
> +	printf("usage: %s [-h] [-a APIC bus freq]\n", name);
> +	puts("");
> +	printf("-a: The APIC bus frequency (in Hz) to be configured for the guest.\n");
> +	puts("");
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	int opt;
> +
> +	TEST_REQUIRE(kvm_has_cap(KVM_CAP_X86_APIC_BUS_CYCLES_NS));
> +	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GET_TSC_KHZ));
> +
> +	while ((opt = getopt(argc, argv, "a:h")) != -1) {
> +		switch (opt) {
> +		case 'a':

Maybe -f for frequency instead of -a for APIC?  And if we make the delay
configurable, -d (delay)?

> +			apic_hz = atol(optarg);
> +			break;
> +		case 'h':
> +			help(argv[0]);
> +			exit(0);
> +		default:
> +			help(argv[0]);
> +			exit(1);
> +		}
> +	}
> +
> +	run_xapic_bus_clock_test();
> +	run_x2apic_bus_clock_test();
> +}
> -- 
> 2.34.1
>
Reinette Chatre June 11, 2024, 9:34 p.m. UTC | #2
Hi Sean,

On 6/10/24 5:51 PM, Sean Christopherson wrote:
> On Mon, Jun 10, 2024, Reinette Chatre wrote:
>> diff --git a/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
>> new file mode 100644
>> index 000000000000..602cec91d8ee
>> --- /dev/null
>> +++ b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
>> @@ -0,0 +1,219 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2024 Intel Corporation
>> + *
>> + * Verify KVM correctly emulates the APIC bus frequency when the VMM configures
>> + * the frequency via KVM_CAP_X86_APIC_BUS_CYCLES_NS.  Start the APIC timer by
>> + * programming TMICT (timer initial count) to the largest value possible (so
>> + * that the timer will not expire during the test).  Then, after an arbitrary
>> + * amount of time has elapsed, verify TMCCT (timer current count) is within 1%
>> + * of the expected value based on the time elapsed, the APIC bus frequency, and
>> + * the programmed TDCR (timer divide configuration register).
>> + */
>> +
>> +#include "apic.h"
>> +#include "test_util.h"
>> +
>> +/*
>> + * Pick 25MHz for APIC bus frequency. Different enough from the default 1GHz.
>> + * User can override via command line.
>> + */
>> +unsigned long apic_hz = 25 * 1000 * 1000;
> 
> static, and maybe a uint64_t to match the other stuff?

Sure. Also moved all other globals and functions back to static.

> 
>> +/*
>> + * Possible TDCR values with matching divide count. Used to modify APIC
>> + * timer frequency.
>> + */
>> +struct {
>> +	uint32_t tdcr;
>> +	uint32_t divide_count;
>> +} tdcrs[] = {
>> +	{0x0, 2},
>> +	{0x1, 4},
>> +	{0x2, 8},
>> +	{0x3, 16},
>> +	{0x8, 32},
>> +	{0x9, 64},
>> +	{0xa, 128},
>> +	{0xb, 1},
>> +};
>> +
>> +void guest_verify(uint64_t tsc_cycles, uint32_t apic_cycles, uint32_t divide_count)
> 
> uin64_t for apic_cycles?  And maybe something like guest_check_apic_count(), to
> make it more obvious what is being verified?  Actually, it should be quite easy
> to have the two flavors share the bulk of the code.

I now plan to drop this function and instead just open code the
checks in what has now become a shared function between xAPIC and x2APIC.

> 
>> +{
>> +	unsigned long tsc_hz = tsc_khz * 1000;
>> +	uint64_t freq;
>> +
>> +	GUEST_ASSERT(tsc_cycles > 0);
> 
> Is this necessary?  Won't the "freq < ..." check fail?  I love me some paranoia,
> but this seems unnecessary.

Sure. After needing to field reports from static checkers not able to determine
that a denominator can never be zero I do tend to add these checks just to
pre-emptively placate them. I did run the code through a static checker after making
all planned changes and it had no complaints so it is now gone.

> 
>> +	freq = apic_cycles * divide_count * tsc_hz / tsc_cycles;
>> +	/* Check if measured frequency is within 1% of configured frequency. */
>> +	GUEST_ASSERT(freq < apic_hz * 101 / 100);
>> +	GUEST_ASSERT(freq > apic_hz * 99 / 100);
>> +}
>> +
>> +void x2apic_guest_code(void)
>> +{
>> +	uint32_t tmict, tmcct;
>> +	uint64_t tsc0, tsc1;
>> +	int i;
>> +
>> +	x2apic_enable();
>> +
>> +	/*
>> +	 * Setup one-shot timer.  The vector does not matter because the
>> +	 * interrupt should not fire.
>> +	 */
>> +	x2apic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
>> +
>> +	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
>> +		x2apic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
>> +
>> +		/* Set the largest value to not trigger the interrupt. */
> 
> Nit, the goal isn't to avoid triggering the interrupt, e.g. the above masking
> takes care of that.  The goal is to prevent the timer from expiring, because if
> the timer expires it stops counting and will trigger a false failure because the
> TSC doesn't stop counting.
> 
> Honestly, I would just delete the comment.  Same with the "busy wait for 100 msec"
> and "Read APIC timer and TSC" comments.  They state the obvious.  Loading the max
> TMICT is mildly interesting, but that's covered by the file-level comment.
> 
>> +		tmict = ~0;
> 
> This really belongs outside of the loop, e.g.
> 
> 	const uint32_t tmict = ~0u;
> 
>> +		x2apic_write_reg(APIC_TMICT, tmict);
>> +
>> +		/* Busy wait for 100 msec. */
> 
> Hmm, should this be configurable?

Will do.

> 
>> +		tsc0 = rdtsc();
>> +		udelay(100000);
>> +		/* Read APIC timer and TSC. */
>> +		tmcct = x2apic_read_reg(APIC_TMCCT);
>> +		tsc1 = rdtsc();
>> +
>> +		/* Stop timer. */
> 
> This comment is a bit more interesting, as readers might not know writing '0'
> stops the timer.  But that's even more interesting is the ordering, e.g. it's
> not at all unreasonable to think that the timer should be stopped _before_ reading
> the current count.  E.g. something like:
> 
> 		/*
> 		 * Stop the timer _after_ reading the current, final count, as
> 		 * writing the initial counter also modifies the current count.
> 		 */
> 
>> +		x2apic_write_reg(APIC_TMICT, 0);
>> +
>> +		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);
>> +	}
>> +
>> +	GUEST_DONE();
>> +}
>> +
>> +void xapic_guest_code(void)
>> +{
>> +	uint32_t tmict, tmcct;
>> +	uint64_t tsc0, tsc1;
>> +	int i;
>> +
>> +	xapic_enable();
>> +
>> +	/*
>> +	 * Setup one-shot timer.  The vector does not matter because the
>> +	 * interrupt should not fire.
>> +	 */
>> +	xapic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
>> +
>> +	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
>> +		xapic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
>> +
>> +		/* Set the largest value to not trigger the interrupt. */
>> +		tmict = ~0;
>> +		xapic_write_reg(APIC_TMICT, tmict);
>> +
>> +		/* Busy wait for 100 msec. */
>> +		tsc0 = rdtsc();
>> +		udelay(100000);
>> +		/* Read APIC timer and TSC. */
>> +		tmcct = xapic_read_reg(APIC_TMCCT);
>> +		tsc1 = rdtsc();
>> +
>> +		/* Stop timer. */
>> +		xapic_write_reg(APIC_TMICT, 0);
>> +
>> +		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);
> 
> That's some nice copy+paste :-)
> 
> This test isn't writing ICR, so the whole 32-bit vs. 64-bit weirdness with xAPIC
> vs X2APIC is irrevelant.  Two tiny helpers, a global flag, and you can avoid a
> pile of copy+paste, and the need to find a better name than guest_verify().

Will do. Thank you very much for your detailed and valuable feedback.

Reinette
diff mbox series

Patch

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ce8ff8e8ce3a..ad8b5d15f2bd 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -112,6 +112,7 @@  TEST_GEN_PROGS_x86_64 += x86_64/vmx_invalid_nested_guest_state
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_nested_tsc_scaling_test
+TEST_GEN_PROGS_x86_64 += x86_64/apic_bus_clock_test
 TEST_GEN_PROGS_x86_64 += x86_64/xapic_ipi_test
 TEST_GEN_PROGS_x86_64 += x86_64/xapic_state_test
 TEST_GEN_PROGS_x86_64 += x86_64/xcr0_cpuid_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/apic.h b/tools/testing/selftests/kvm/include/x86_64/apic.h
index bed316fdecd5..0f268b55fa06 100644
--- a/tools/testing/selftests/kvm/include/x86_64/apic.h
+++ b/tools/testing/selftests/kvm/include/x86_64/apic.h
@@ -60,6 +60,14 @@ 
 #define		APIC_VECTOR_MASK	0x000FF
 #define	APIC_ICR2	0x310
 #define		SET_APIC_DEST_FIELD(x)	((x) << 24)
+#define APIC_LVTT	0x320
+#define		APIC_LVT_TIMER_ONESHOT		(0 << 17)
+#define		APIC_LVT_TIMER_PERIODIC		(1 << 17)
+#define		APIC_LVT_TIMER_TSCDEADLINE	(2 << 17)
+#define		APIC_LVT_MASKED			(1 << 16)
+#define	APIC_TMICT	0x380
+#define	APIC_TMCCT	0x390
+#define	APIC_TDCR	0x3E0
 
 void apic_disable(void);
 void xapic_enable(void);
diff --git a/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
new file mode 100644
index 000000000000..602cec91d8ee
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c
@@ -0,0 +1,219 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024 Intel Corporation
+ *
+ * Verify KVM correctly emulates the APIC bus frequency when the VMM configures
+ * the frequency via KVM_CAP_X86_APIC_BUS_CYCLES_NS.  Start the APIC timer by
+ * programming TMICT (timer initial count) to the largest value possible (so
+ * that the timer will not expire during the test).  Then, after an arbitrary
+ * amount of time has elapsed, verify TMCCT (timer current count) is within 1%
+ * of the expected value based on the time elapsed, the APIC bus frequency, and
+ * the programmed TDCR (timer divide configuration register).
+ */
+
+#include "apic.h"
+#include "test_util.h"
+
+/*
+ * Pick 25MHz for APIC bus frequency. Different enough from the default 1GHz.
+ * User can override via command line.
+ */
+unsigned long apic_hz = 25 * 1000 * 1000;
+
+/*
+ * Possible TDCR values with matching divide count. Used to modify APIC
+ * timer frequency.
+ */
+struct {
+	uint32_t tdcr;
+	uint32_t divide_count;
+} tdcrs[] = {
+	{0x0, 2},
+	{0x1, 4},
+	{0x2, 8},
+	{0x3, 16},
+	{0x8, 32},
+	{0x9, 64},
+	{0xa, 128},
+	{0xb, 1},
+};
+
+void guest_verify(uint64_t tsc_cycles, uint32_t apic_cycles, uint32_t divide_count)
+{
+	unsigned long tsc_hz = tsc_khz * 1000;
+	uint64_t freq;
+
+	GUEST_ASSERT(tsc_cycles > 0);
+	freq = apic_cycles * divide_count * tsc_hz / tsc_cycles;
+	/* Check if measured frequency is within 1% of configured frequency. */
+	GUEST_ASSERT(freq < apic_hz * 101 / 100);
+	GUEST_ASSERT(freq > apic_hz * 99 / 100);
+}
+
+void x2apic_guest_code(void)
+{
+	uint32_t tmict, tmcct;
+	uint64_t tsc0, tsc1;
+	int i;
+
+	x2apic_enable();
+
+	/*
+	 * Setup one-shot timer.  The vector does not matter because the
+	 * interrupt should not fire.
+	 */
+	x2apic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
+
+	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
+		x2apic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
+
+		/* Set the largest value to not trigger the interrupt. */
+		tmict = ~0;
+		x2apic_write_reg(APIC_TMICT, tmict);
+
+		/* Busy wait for 100 msec. */
+		tsc0 = rdtsc();
+		udelay(100000);
+		/* Read APIC timer and TSC. */
+		tmcct = x2apic_read_reg(APIC_TMCCT);
+		tsc1 = rdtsc();
+
+		/* Stop timer. */
+		x2apic_write_reg(APIC_TMICT, 0);
+
+		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);
+	}
+
+	GUEST_DONE();
+}
+
+void xapic_guest_code(void)
+{
+	uint32_t tmict, tmcct;
+	uint64_t tsc0, tsc1;
+	int i;
+
+	xapic_enable();
+
+	/*
+	 * Setup one-shot timer.  The vector does not matter because the
+	 * interrupt should not fire.
+	 */
+	xapic_write_reg(APIC_LVTT, APIC_LVT_TIMER_ONESHOT | APIC_LVT_MASKED);
+
+	for (i = 0; i < ARRAY_SIZE(tdcrs); i++) {
+		xapic_write_reg(APIC_TDCR, tdcrs[i].tdcr);
+
+		/* Set the largest value to not trigger the interrupt. */
+		tmict = ~0;
+		xapic_write_reg(APIC_TMICT, tmict);
+
+		/* Busy wait for 100 msec. */
+		tsc0 = rdtsc();
+		udelay(100000);
+		/* Read APIC timer and TSC. */
+		tmcct = xapic_read_reg(APIC_TMCCT);
+		tsc1 = rdtsc();
+
+		/* Stop timer. */
+		xapic_write_reg(APIC_TMICT, 0);
+
+		guest_verify(tsc1 - tsc0, tmict - tmcct, tdcrs[i].divide_count);
+	}
+
+	GUEST_DONE();
+}
+
+void test_apic_bus_clock(struct kvm_vcpu *vcpu)
+{
+	bool done = false;
+	struct ucall uc;
+
+	while (!done) {
+		vcpu_run(vcpu);
+		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_DONE:
+			done = true;
+			break;
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+			break;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+			break;
+		}
+	}
+}
+
+void run_apic_bus_clock_test(bool xapic)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	int ret;
+
+	vm = vm_create(1);
+
+	sync_global_to_guest(vm, apic_hz);
+
+	vm_enable_cap(vm, KVM_CAP_X86_APIC_BUS_CYCLES_NS,
+		      NSEC_PER_SEC / apic_hz);
+
+	vcpu = vm_vcpu_add(vm, 0, xapic ? xapic_guest_code : x2apic_guest_code);
+
+	ret = __vm_enable_cap(vm, KVM_CAP_X86_APIC_BUS_CYCLES_NS,
+			      NSEC_PER_SEC / apic_hz);
+	TEST_ASSERT(ret < 0 && errno == EINVAL,
+		    "Setting of APIC bus frequency after vCPU is created should fail.");
+
+	if (xapic)
+		virt_pg_map(vm, APIC_DEFAULT_GPA, APIC_DEFAULT_GPA);
+
+	test_apic_bus_clock(vcpu);
+	kvm_vm_free(vm);
+}
+
+void run_xapic_bus_clock_test(void)
+{
+	run_apic_bus_clock_test(true);
+}
+
+void run_x2apic_bus_clock_test(void)
+{
+	run_apic_bus_clock_test(false);
+}
+
+void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-a APIC bus freq]\n", name);
+	puts("");
+	printf("-a: The APIC bus frequency (in Hz) to be configured for the guest.\n");
+	puts("");
+}
+
+int main(int argc, char *argv[])
+{
+	int opt;
+
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_X86_APIC_BUS_CYCLES_NS));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GET_TSC_KHZ));
+
+	while ((opt = getopt(argc, argv, "a:h")) != -1) {
+		switch (opt) {
+		case 'a':
+			apic_hz = atol(optarg);
+			break;
+		case 'h':
+			help(argv[0]);
+			exit(0);
+		default:
+			help(argv[0]);
+			exit(1);
+		}
+	}
+
+	run_xapic_bus_clock_test();
+	run_x2apic_bus_clock_test();
+}