Message ID | 20220110210441.2074798-4-jingzhangos@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | ARM64: Guest performance improvement during dirty |
On Mon, Jan 10, 2022 at 09:04:41PM +0000, Jing Zhang wrote:
> For ARM64, if no vgic is set up before the dirty log perf test, the
> userspace irqchip would be used, which would affect the dirty log perf
> test result.
>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 1954b964d1cf..b501338d9430 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -18,6 +18,12 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#ifdef __aarch64__
> +#include "aarch64/vgic.h"
> +
> +#define GICD_BASE_GPA 0x8000000ULL
> +#define GICR_BASE_GPA 0x80A0000ULL
> +#endif
>
>  /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
>  #define TEST_HOST_LOOP_N 2UL
> @@ -200,6 +206,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>                  vm_enable_cap(vm, &cap);
>          }
>
> +#ifdef __aarch64__
> +        vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
                                       ^^ extra parameter

Thanks,
drew

> +#endif
> +
>          /* Start the iterations */
>          iteration = 0;
>          host_quit = false;
> --
> 2.34.1.575.g55b058a8bb-goog
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>

On Mon, 10 Jan 2022 21:04:41 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
>
> For ARM64, if no vgic is set up before the dirty log perf test, the
> userspace irqchip would be used, which would affect the dirty log perf
> test result.

Doesn't it affect *all* performance tests? How much does this change
contribute to the performance numbers you give in the cover letter?

>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 1954b964d1cf..b501338d9430 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -18,6 +18,12 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#ifdef __aarch64__
> +#include "aarch64/vgic.h"
> +
> +#define GICD_BASE_GPA 0x8000000ULL
> +#define GICR_BASE_GPA 0x80A0000ULL

How did you pick these values?

Thanks,

        M.

On Tue, Jan 11, 2022 at 1:55 AM Andrew Jones <drjones@redhat.com> wrote:
>
> On Mon, Jan 10, 2022 at 09:04:41PM +0000, Jing Zhang wrote:
> > For ARM64, if no vgic is set up before the dirty log perf test, the
> > userspace irqchip would be used, which would affect the dirty log perf
> > test result.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > index 1954b964d1cf..b501338d9430 100644
> > --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > @@ -18,6 +18,12 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#ifdef __aarch64__
> > +#include "aarch64/vgic.h"
> > +
> > +#define GICD_BASE_GPA 0x8000000ULL
> > +#define GICR_BASE_GPA 0x80A0000ULL
> > +#endif
> >
> >  /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
> >  #define TEST_HOST_LOOP_N 2UL
> > @@ -200,6 +206,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >                  vm_enable_cap(vm, &cap);
> >          }
> >
> > +#ifdef __aarch64__
> > +        vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
>                                        ^^ extra parameter

The patch is based on kvm/queue, which has a patch adding an extra
parameter, nr_irqs.

>
> Thanks,
> drew
>
> > +#endif
> > +
> >          /* Start the iterations */
> >          iteration = 0;
> >          host_quit = false;
> > --
> > 2.34.1.575.g55b058a8bb-goog
> >
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> >
>

Thanks,
Jing

On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 10 Jan 2022 21:04:41 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > For ARM64, if no vgic is set up before the dirty log perf test, the
> > userspace irqchip would be used, which would affect the dirty log perf
> > test result.
>
> Doesn't it affect *all* performance tests? How much does this change
> contribute to the performance numbers you give in the cover letter?
>

This bottleneck showed up after adding the fast path patch. I didn't
try other performance tests with this, but I think it is a good idea
to add a vgic setup for all performance tests. I can post another
patch later to make it available for all performance tests after
finishing this one and verifying all other performance tests.
Below is the test result without adding the vgic setup. It shows
20~30% improvement for the different number of vCPUs.

+-------+------------------------+
| #vCPU | dirty memory time (ms) |
+-------+------------------------+
| 1     | 965                    |
+-------+------------------------+
| 2     | 1006                   |
+-------+------------------------+
| 4     | 1128                   |
+-------+------------------------+
| 8     | 2005                   |
+-------+------------------------+
| 16    | 3903                   |
+-------+------------------------+
| 32    | 7595                   |
+-------+------------------------+
| 64    | 15783                  |
+-------+------------------------+

> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > index 1954b964d1cf..b501338d9430 100644
> > --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > @@ -18,6 +18,12 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#ifdef __aarch64__
> > +#include "aarch64/vgic.h"
> > +
> > +#define GICD_BASE_GPA 0x8000000ULL
> > +#define GICR_BASE_GPA 0x80A0000ULL
>
> How did you pick these values?

I used the same values as the other tests. I talked with Raghavendra
about them: the values could be arbitrary, and he chose these from
QEMU's configuration.

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Jing

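Since the base addresses above are described as arbitrary, the one property that clearly matters is that the distributor and redistributor regions do not collide in guest physical address space. The following is a minimal sketch of that check, not code from the patch or from the selftests library. The helper names are hypothetical, and the region sizes (one 64 KiB distributor frame, two 64 KiB frames per redistributor) are assumptions based on the GICv3 layout rather than anything stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base addresses taken from the patch under discussion. */
#define GICD_BASE_GPA 0x8000000ULL
#define GICR_BASE_GPA 0x80A0000ULL

/*
 * Assumed sizes (not stated in the thread): a 64 KiB distributor frame
 * and two 64 KiB frames per redistributor, one redistributor per vCPU.
 */
#define GICD_SIZE         0x10000ULL
#define GICR_SIZE_PER_CPU 0x20000ULL

/* Hypothetical helper: first GPA past the redistributor region. */
static uint64_t gicr_region_end(uint64_t gicr_base, unsigned int nr_vcpus)
{
	assert(nr_vcpus > 0);
	return gicr_base + (uint64_t)nr_vcpus * GICR_SIZE_PER_CPU;
}

/* Hypothetical helper: do half-open ranges [a, a_end) and [b, b_end) overlap? */
static bool ranges_overlap(uint64_t a, uint64_t a_end,
			   uint64_t b, uint64_t b_end)
{
	return a < b_end && b < a_end;
}

/* True if the distributor and redistributor regions stay disjoint. */
static bool gic_layout_ok(unsigned int nr_vcpus)
{
	return !ranges_overlap(GICD_BASE_GPA, GICD_BASE_GPA + GICD_SIZE,
			       GICR_BASE_GPA,
			       gicr_region_end(GICR_BASE_GPA, nr_vcpus));
}
```

Under these assumptions, calling gic_layout_ok(64) for the largest vCPU count in the table reports a disjoint layout, with the redistributor region ending at 0x88A0000. Whether that range also stays clear of the test's guest memory is a separate question that this sketch does not answer.
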
On Tue, 11 Jan 2022 22:16:01 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
>
> On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Mon, 10 Jan 2022 21:04:41 +0000,
> > Jing Zhang <jingzhangos@google.com> wrote:
> > >
> > > For ARM64, if no vgic is set up before the dirty log perf test, the
> > > userspace irqchip would be used, which would affect the dirty log perf
> > > test result.
> >
> > Doesn't it affect *all* performance tests? How much does this change
> > contribute to the performance numbers you give in the cover letter?
> >
> This bottleneck showed up after adding the fast path patch. I didn't
> try other performance tests with this, but I think it is a good idea
> to add a vgic setup for all performance tests. I can post another
> patch later to make it available for all performance tests after
> finishing this one and verifying all other performance tests.
> Below is the test result without adding the vgic setup. It shows
> 20~30% improvement for the different number of vCPUs.
> +-------+------------------------+
> | #vCPU | dirty memory time (ms) |
> +-------+------------------------+
> | 1     | 965                    |
> +-------+------------------------+
> | 2     | 1006                   |
> +-------+------------------------+
> | 4     | 1128                   |
> +-------+------------------------+
> | 8     | 2005                   |
> +-------+------------------------+
> | 16    | 3903                   |
> +-------+------------------------+
> | 32    | 7595                   |
> +-------+------------------------+
> | 64    | 15783                  |
> +-------+------------------------+

So please use these numbers in your cover letter when you repost your
series, as the improvement you'd observe on actual workloads is likely
to be less than what you claim due to this change in the test itself
(in other words, if you are going to benchmark something, don't
change the benchmark halfway).

        M.

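The point about not changing the benchmark halfway can be made concrete with a little arithmetic. The sketch below is illustrative only: the function and struct names are hypothetical, and it assumes three measurements of the same test, namely a baseline, a run with only the kernel change (the configuration the table in this thread is described as covering), and a run with both the kernel change and the vgic setup. It contains no numbers from the thread.

```c
#include <assert.h>

/* Relative improvement in percent when a run takes after_ms instead of before_ms. */
static double improvement_pct(double before_ms, double after_ms)
{
	assert(before_ms > 0.0);
	return (before_ms - after_ms) / before_ms * 100.0;
}

/* Hypothetical container for the two figures one might be tempted to report. */
struct improvement_breakdown {
	double kernel_only_pct; /* same benchmark, kernel change only */
	double combined_pct;    /* kernel change plus the modified benchmark */
};

/*
 * Compare both variants against the same baseline so that the effect
 * of modifying the test is not folded into the kernel improvement.
 */
static struct improvement_breakdown
split_improvement(double baseline_ms, double kernel_only_ms,
		  double kernel_and_vgic_ms)
{
	struct improvement_breakdown b = {
		.kernel_only_pct = improvement_pct(baseline_ms, kernel_only_ms),
		.combined_pct = improvement_pct(baseline_ms, kernel_and_vgic_ms),
	};
	return b;
}
```

Following the review comment, kernel_only_pct is the figure that belongs in a cover letter about the kernel change. The gap between the two fields comes from the benchmark modification and says little about real workloads.
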
On Wed, Jan 12, 2022 at 3:37 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 11 Jan 2022 22:16:01 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Mon, 10 Jan 2022 21:04:41 +0000,
> > > Jing Zhang <jingzhangos@google.com> wrote:
> > > >
> > > > For ARM64, if no vgic is set up before the dirty log perf test, the
> > > > userspace irqchip would be used, which would affect the dirty log perf
> > > > test result.
> > >
> > > Doesn't it affect *all* performance tests? How much does this change
> > > contribute to the performance numbers you give in the cover letter?
> > >
> > This bottleneck showed up after adding the fast path patch. I didn't
> > try other performance tests with this, but I think it is a good idea
> > to add a vgic setup for all performance tests. I can post another
> > patch later to make it available for all performance tests after
> > finishing this one and verifying all other performance tests.
> > Below is the test result without adding the vgic setup. It shows
> > 20~30% improvement for the different number of vCPUs.
> > +-------+------------------------+
> > | #vCPU | dirty memory time (ms) |
> > +-------+------------------------+
> > | 1     | 965                    |
> > +-------+------------------------+
> > | 2     | 1006                   |
> > +-------+------------------------+
> > | 4     | 1128                   |
> > +-------+------------------------+
> > | 8     | 2005                   |
> > +-------+------------------------+
> > | 16    | 3903                   |
> > +-------+------------------------+
> > | 32    | 7595                   |
> > +-------+------------------------+
> > | 64    | 15783                  |
> > +-------+------------------------+
>
> So please use these numbers in your cover letter when you repost your
> series, as the improvement you'd observe on actual workloads is likely
> to be less than what you claim due to this change in the test itself
> (in other words, if you are going to benchmark something, don't
> change the benchmark halfway).

Sure. I will clarify this in the cover letter in future posts.

Thanks,
Jing

>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 1954b964d1cf..b501338d9430 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -18,6 +18,12 @@
 #include "test_util.h"
 #include "perf_test_util.h"
 #include "guest_modes.h"
+#ifdef __aarch64__
+#include "aarch64/vgic.h"
+
+#define GICD_BASE_GPA 0x8000000ULL
+#define GICR_BASE_GPA 0x80A0000ULL
+#endif
 
 /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
 #define TEST_HOST_LOOP_N 2UL
@@ -200,6 +206,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
                 vm_enable_cap(vm, &cap);
         }
 
+#ifdef __aarch64__
+        vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+#endif
+
         /* Start the iterations */
         iteration = 0;
         host_quit = false;

For ARM64, if no vgic is set up before the dirty log perf test, the
userspace irqchip would be used, which would affect the dirty log perf
test result.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
 1 file changed, 10 insertions(+)