[kvmtool,9/9] arm64: Add support for KVM_ARM_VCPU_PMU_V3_SET_PMU

Message ID 20211115165705.195736-10-alexandru.elisei@arm.com (mailing list archive)
State: New, archived
Series: arm64: Improve PMU support on heterogeneous systems

Commit Message

Alexandru Elisei Nov. 15, 2021, 4:57 p.m. UTC
The KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU ioctl is
used to assign a physical PMU to the events that KVM creates when emulating
the PMU for that VCPU. This is useful on heterogeneous systems, when there
is more than one hardware PMU present.

The assumption that is made in the implementation is that the user will
pin the kvmtool process on a set of CPUs that share the same PMU. This
allows kvmtool to set the same PMU for all VCPUs from the main thread,
instead of in the individual VCPU threads.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/aarch64/pmu.c | 148 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 146 insertions(+), 2 deletions(-)

Comments

Marc Zyngier Jan. 4, 2022, 2:39 p.m. UTC | #1
On Mon, 15 Nov 2021 16:57:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> The KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU ioctl is
> used to assign a physical PMU to the events that KVM creates when emulating
> the PMU for that VCPU. This is useful on heterogeneous systems, when there
> is more than one hardware PMU present.
> 
> The assumption that is made in the implementation is that the user will
> pin the kvmtool process on a set of CPUs that share the same PMU. This
> allows kvmtool to set the same PMU for all VCPUs from the main thread,
> instead of in the individual VCPU threads.

May I suggest a slightly different use model? Ideally, you'd be able
to run the vcpu threads on the CPUs matching the PMU affinity, and
leave all the other threads to roam on other CPUs.

With your implementation, the whole of kvmtool gets stuck to a given
CPU type, which can be problematic.

Thanks,

	M.
Alexandru Elisei Jan. 7, 2022, 12:10 p.m. UTC | #2
Hi Marc,

On Tue, Jan 04, 2022 at 02:39:59PM +0000, Marc Zyngier wrote:
> On Mon, 15 Nov 2021 16:57:05 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > The KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU ioctl is
> > used to assign a physical PMU to the events that KVM creates when emulating
> > the PMU for that VCPU. This is useful on heterogeneous systems, when there
> > is more than one hardware PMU present.
> > 
> > The assumption that is made in the implementation is that the user will
> > pin the kvmtool process on a set of CPUs that share the same PMU. This
> > allows kvmtool to set the same PMU for all VCPUs from the main thread,
> > instead of in the individual VCPU threads.
> 
> May I suggest a slightly different use model? Ideally, you'd be able
> to run the vcpu threads on the CPUs matching the PMU affinity, and
> leave all the other threads to roam on other CPUs.

Right now, the only way for userspace to make kvmtool run on a particular
set of CPUs in a heterogeneous configuration is to use taskset, which means
the entire kvmtool process ends up being pinned on a subset of CPUs which
have the same PMU. I would like to keep this approach, as it's simple and
straightforward to implement in kvmtool, and it's easy to change in the
future if there's an incentive to do so.

It's also not clear to me how your suggestion would work. Add a command
line argument to pin all the VCPUs to the specified cpumask?

> 
> With your implementation, the whole of kvmtool gets stuck to a given
> CPU type, which can be problematic.

Do you have a specific use case in mind? Or is it more like a general
concern regarding, for example, the virtio-blk-io or virtio-net-* threads
competing with the VCPU threads if the VM is doing lots of I/O?

Thanks,
Alex
Marc Zyngier Jan. 8, 2022, 1:27 p.m. UTC | #3
On Fri, 07 Jan 2022 12:10:05 +0000,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Jan 04, 2022 at 02:39:59PM +0000, Marc Zyngier wrote:
> > On Mon, 15 Nov 2021 16:57:05 +0000,
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > 
> > > The KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU ioctl is
> > > used to assign a physical PMU to the events that KVM creates when emulating
> > > the PMU for that VCPU. This is useful on heterogeneous systems, when there
> > > is more than one hardware PMU present.
> > > 
> > > The assumption that is made in the implementation is that the user will
> > > pin the kvmtool process on a set of CPUs that share the same PMU. This
> > > allows kvmtool to set the same PMU for all VCPUs from the main thread,
> > > instead of in the individual VCPU threads.
> > 
> > May I suggest a slightly different use model? Ideally, you'd be able
> > to run the vcpu threads on the CPUs matching the PMU affinity, and
> > leave all the other threads to roam on other CPUs.
> 
> Right now, the only way for userspace to make kvmtool run on a particular
> set of CPUs in a heterogeneous configuration is to use taskset, which means
> the entire kvmtool process ends up being pinned on a subset of CPUs which
> have the same PMU. I would like to keep this approach, as it's simple and
> straightforward to implement in kvmtool, and it's easy to change in the
> future if there's an incentive to do so.
> 
> It's also not clear to me how your suggestion would work. Add a command
> line argument to pin all the VCPUs to the specified cpumask?
> 
> > 
> > With your implementation, the whole of kvmtool gets stuck to a given
> > CPU type, which can be problematic.
> 
> Do you have a specific use case in mind? Or is it more like a general
> concern regarding, for example, the virtio-blk-io or virtio-net-* threads
> competing with the VCPU threads if the VM is doing lots of I/O?

Exactly that. The real requirement is that the vcpu thread affinities
are that of the PMU, but not that of any other thread. Maybe that's
just another parameter, independent of the PMU setup. Something like:

	lkvm run ... --vcpu-affinity $(< /sys/devices/armv8_pmuv3_0/cpus)

for example.

Thanks,

	M.
Alexandru Elisei Jan. 8, 2022, 4:03 p.m. UTC | #4
Hi Marc,

On Sat, Jan 08, 2022 at 01:27:17PM +0000, Marc Zyngier wrote:
> On Fri, 07 Jan 2022 12:10:05 +0000,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Tue, Jan 04, 2022 at 02:39:59PM +0000, Marc Zyngier wrote:
> > > On Mon, 15 Nov 2021 16:57:05 +0000,
> > > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > > > 
> > > > The KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU ioctl is
> > > > used to assign a physical PMU to the events that KVM creates when emulating
> > > > the PMU for that VCPU. This is useful on heterogeneous systems, when there
> > > > is more than one hardware PMU present.
> > > > 
> > > > The assumption that is made in the implementation is that the user will
> > > > pin the kvmtool process on a set of CPUs that share the same PMU. This
> > > > allows kvmtool to set the same PMU for all VCPUs from the main thread,
> > > > instead of in the individual VCPU threads.
> > > 
> > > May I suggest a slightly different use model? Ideally, you'd be able
> > > to run the vcpu threads on the CPUs matching the PMU affinity, and
> > > leave all the other threads to roam on other CPUs.
> > 
> > Right now, the only way for userspace to make kvmtool run on a particular
> > set of CPUs in a heterogeneous configuration is to use taskset, which means
> > the entire kvmtool process ends up being pinned on a subset of CPUs which
> > have the same PMU. I would like to keep this approach, as it's simple and
> > straightforward to implement in kvmtool, and it's easy to change in the
> > future if there's an incentive to do so.
> > 
> > It's also not clear to me how your suggestion would work. Add a command
> > line argument to pin all the VCPUs to the specified cpumask?
> > 
> > > 
> > > With your implementation, the whole of kvmtool gets stuck to a given
> > > CPU type, which can be problematic.
> > 
> > Do you have a specific use case in mind? Or is it more like a general
> > concern regarding, for example, the virtio-blk-io or virtio-net-* threads
> > competing with the VCPU threads if the VM is doing lots of I/O?
> 
> Exactly that. The real requirement is that the vcpu thread affinities
> are that of the PMU, but not that of any other thread. Maybe that's
> just another parameter, independent of the PMU setup. Something like:
> 
> 	lkvm run ... --vcpu-affinity $(< /sys/devices/armv8_pmuv3_0/cpus)
> 
> for example.

That should be easy (famous last words?) to implement as a separate patch
on top of this series, I'll give that a go in the next iteration.

Thanks for the suggestion.

Alex

Patch

diff --git a/arm/aarch64/pmu.c b/arm/aarch64/pmu.c
index ac5b7bcd6ca9..c0fc95ca01c4 100644
--- a/arm/aarch64/pmu.c
+++ b/arm/aarch64/pmu.c
@@ -1,3 +1,9 @@ 
+#include <dirent.h>
+#include <sched.h>
+
+#include "linux/cpumask.h"
+#include "linux/err.h"
+
 #include "kvm/fdt.h"
 #include "kvm/kvm.h"
 #include "kvm/kvm-cpu.h"
@@ -7,6 +13,18 @@ 
 
 #include "asm/pmu.h"
 
+static bool pmu_has_attr(struct kvm_cpu *vcpu, u64 attr)
+{
+	struct kvm_device_attr pmu_attr = {
+		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
+		.attr	= attr,
+	};
+	int ret;
+
+	ret = ioctl(vcpu->vcpu_fd, KVM_HAS_DEVICE_ATTR, &pmu_attr);
+	return ret == 0;
+}
+
 static void set_pmu_attr(struct kvm_cpu *vcpu, void *addr, u64 attr)
 {
 	struct kvm_device_attr pmu_attr = {
@@ -16,8 +34,7 @@  static void set_pmu_attr(struct kvm_cpu *vcpu, void *addr, u64 attr)
 	};
 	int ret;
 
-	ret = ioctl(vcpu->vcpu_fd, KVM_HAS_DEVICE_ATTR, &pmu_attr);
-	if (!ret) {
+	if (pmu_has_attr(vcpu, attr)) {
 		ret = ioctl(vcpu->vcpu_fd, KVM_SET_DEVICE_ATTR, &pmu_attr);
 		if (ret)
 			die_perror("PMU KVM_SET_DEVICE_ATTR");
@@ -26,11 +43,126 @@  static void set_pmu_attr(struct kvm_cpu *vcpu, void *addr, u64 attr)
 	}
 }
 
+#define SYS_EVENT_SOURCE	"/sys/bus/event_source/devices/"
+/*
+ * int is 32 bits, so INT_MAX has at most 10 decimal digits.
+ * Make room for the trailing newline and NUL.
+ */
+#define PMU_ID_MAXLEN		12
+
+/*
+ * In the case of homogeneous systems, there is only one hardware PMU, and all
+ * VCPUs will use the same PMU, regardless of where the attribute gets set.
+ *
+ * For heterogeneous systems, the assumption is that the user has pinned the VM
+ * (via taskset or similar) to a set of CPUs that share the same hardware PMU.
+ * This simplifies things for kvmtool, as correctness is not affected by setting
+ * the PMU for each VCPU from the main thread, instead of setting it from each
+ * individual VCPU thread.
+ */
+static int find_pmu(void)
+{
+	char buf[PMU_ID_MAXLEN];
+	struct dirent *dirent;
+	char *cpulist, *path;
+	int pmu_id = -ENXIO;
+	unsigned long val;
+	cpumask_t cpumask;
+	ssize_t fd_sz;
+	int this_cpu;
+	int fd, ret;
+	DIR *dir;
+
+	memset(buf, 0, PMU_ID_MAXLEN);
+
+	this_cpu = sched_getcpu();
+	if (this_cpu < 0)
+		return -errno;
+
+	cpulist = calloc(1, PAGE_SIZE);
+	if (!cpulist)
+		return -ENOMEM;
+
+	path = calloc(1, PAGE_SIZE);
+	if (!path) {
+		pmu_id = -ENOMEM;
+		goto out_free_cpulist;
+	}
+	/* Make the compiler happy by copying the NUL terminating byte. */
+	strncpy(path, SYS_EVENT_SOURCE, strlen(SYS_EVENT_SOURCE) + 1);
+
+	dir = opendir(SYS_EVENT_SOURCE);
+	if (!dir) {
+		pmu_id = -errno;
+		goto out_free;
+	}
+
+	while ((dirent = readdir(dir))) {
+		if (dirent->d_type != DT_LNK)
+			continue;
+
+		strcat(path, dirent->d_name);
+		strcat(path, "/cpus");
+		fd = open(path, O_RDONLY);
+		if (fd < 0)
+			goto next_dir;
+
+		fd_sz = read_file(fd, cpulist, PAGE_SIZE);
+		if (fd_sz < 0) {
+			pmu_id = -errno;
+			goto out_free;
+		}
+		close(fd);
+
+		ret = cpulist_parse(cpulist, &cpumask);
+		if (ret) {
+			pmu_id = ret;
+			goto out_free;
+		}
+
+		if (!cpumask_test_cpu(this_cpu, &cpumask))
+			goto next_dir;
+
+		strcpy(&path[strlen(path) - 4], "type");
+		fd = open(path, O_RDONLY);
+		if (fd < 0)
+			goto next_dir;
+
+		fd_sz = read_file(fd, buf, PMU_ID_MAXLEN - 1);
+		if (fd_sz < 0) {
+			pmu_id = -errno;
+			goto out_free;
+		}
+		close(fd);
+
+		val = strtoul(buf, NULL, 10);
+		if (val > INT_MAX) {
+			pmu_id = -EOVERFLOW;
+			goto out_free;
+		}
+		pmu_id = (int)val;
+		pr_debug("Using PMU: %s (id: %d)", dirent->d_name, pmu_id);
+		break;
+
+next_dir:
+		/* Reset path. */
+		memset(&path[strlen(SYS_EVENT_SOURCE)], '\0',
+		       strlen(path) - strlen(SYS_EVENT_SOURCE));
+	}
+
+out_free:
+	free(path);
+out_free_cpulist:
+	free(cpulist);
+	return pmu_id;
+}
+
 void pmu__generate_fdt_nodes(void *fdt, struct kvm *kvm)
 {
 	const char compatible[] = "arm,armv8-pmuv3";
 	int irq = KVM_ARM_PMUv3_PPI;
 	struct kvm_cpu *vcpu;
+	int pmu_id = -ENXIO;
 	int i;
 
 	u32 cpu_mask = (((1 << kvm->nrcpus) - 1) << GIC_FDT_IRQ_PPI_CPU_SHIFT) \
@@ -44,9 +176,21 @@  void pmu__generate_fdt_nodes(void *fdt, struct kvm *kvm)
 	if (!kvm->cfg.arch.has_pmuv3)
 		return;
 
+	if (pmu_has_attr(kvm->cpus[0], KVM_ARM_VCPU_PMU_V3_SET_PMU)) {
+		pmu_id = find_pmu();
+		if (pmu_id < 0)
+			pr_debug("Failed to find a PMU (errno = %d)", -pmu_id);
+	}
+
 	for (i = 0; i < kvm->nrcpus; i++) {
 		vcpu = kvm->cpus[i];
 		set_pmu_attr(vcpu, &irq, KVM_ARM_VCPU_PMU_V3_IRQ);
+		/*
+		 * PMU IDs 0-5 are reserved; a positive value means a PMU was
+		 * found.
+		 */
+		if (pmu_id > 0)
+			set_pmu_attr(vcpu, &pmu_id, KVM_ARM_VCPU_PMU_V3_SET_PMU);
 		set_pmu_attr(vcpu, NULL, KVM_ARM_VCPU_PMU_V3_INIT);
 	}