[RFC,3/3] KVM: selftests: Add vgic initialization for dirty log perf test for ARM

Message ID 20220110210441.2074798-4-jingzhangos@google.com (mailing list archive)
State New, archived
Series ARM64: Guest performance improvement during dirty

Commit Message

Jing Zhang Jan. 10, 2022, 9:04 p.m. UTC
For ARM64, if no vgic is set up before the dirty log perf test, the
userspace irqchip would be used, which would affect the dirty log perf
test result.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Andrew Jones Jan. 11, 2022, 9:55 a.m. UTC | #1
On Mon, Jan 10, 2022 at 09:04:41PM +0000, Jing Zhang wrote:
> For ARM64, if no vgic is set up before the dirty log perf test, the
> userspace irqchip would be used, which would affect the dirty log perf
> test result.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 1954b964d1cf..b501338d9430 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -18,6 +18,12 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#ifdef __aarch64__
> +#include "aarch64/vgic.h"
> +
> +#define GICD_BASE_GPA			0x8000000ULL
> +#define GICR_BASE_GPA			0x80A0000ULL
> +#endif
>  
>  /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
>  #define TEST_HOST_LOOP_N		2UL
> @@ -200,6 +206,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  		vm_enable_cap(vm, &cap);
>  	}
>  
> +#ifdef __aarch64__
> +	vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
                                    ^^ extra parameter

Thanks,
drew

> +#endif
> +
>  	/* Start the iterations */
>  	iteration = 0;
>  	host_quit = false;
> -- 
> 2.34.1.575.g55b058a8bb-goog
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>
Marc Zyngier Jan. 11, 2022, 10:30 a.m. UTC | #2
On Mon, 10 Jan 2022 21:04:41 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> For ARM64, if no vgic is set up before the dirty log perf test, the
> userspace irqchip would be used, which would affect the dirty log perf
> test result.

Doesn't it affect *all* performance tests? How much does this change
contribute to the performance numbers you give in the cover letter?

> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 1954b964d1cf..b501338d9430 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -18,6 +18,12 @@
>  #include "test_util.h"
>  #include "perf_test_util.h"
>  #include "guest_modes.h"
> +#ifdef __aarch64__
> +#include "aarch64/vgic.h"
> +
> +#define GICD_BASE_GPA			0x8000000ULL
> +#define GICR_BASE_GPA			0x80A0000ULL

How did you pick these values?

Thanks,

	M.
Jing Zhang Jan. 11, 2022, 10:12 p.m. UTC | #3
On Tue, Jan 11, 2022 at 1:55 AM Andrew Jones <drjones@redhat.com> wrote:
>
> On Mon, Jan 10, 2022 at 09:04:41PM +0000, Jing Zhang wrote:
> > For ARM64, if no vgic is set up before the dirty log perf test, the
> > userspace irqchip would be used, which would affect the dirty log perf
> > test result.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > index 1954b964d1cf..b501338d9430 100644
> > --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > @@ -18,6 +18,12 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#ifdef __aarch64__
> > +#include "aarch64/vgic.h"
> > +
> > +#define GICD_BASE_GPA                        0x8000000ULL
> > +#define GICR_BASE_GPA                        0x80A0000ULL
> > +#endif
> >
> >  /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
> >  #define TEST_HOST_LOOP_N             2UL
> > @@ -200,6 +206,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
> >               vm_enable_cap(vm, &cap);
> >       }
> >
> > +#ifdef __aarch64__
> > +     vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
>                                     ^^ extra parameter
The patch is based on kvm/queue, which has a patch adding an extra
parameter nr_irqs.

>
> Thanks,
> drew
>
> > +#endif
> > +
> >       /* Start the iterations */
> >       iteration = 0;
> >       host_quit = false;
> > --
> > 2.34.1.575.g55b058a8bb-goog
> >
>

Thanks,
Jing
Jing Zhang Jan. 11, 2022, 10:16 p.m. UTC | #4
On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 10 Jan 2022 21:04:41 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > For ARM64, if no vgic is set up before the dirty log perf test, the
> > userspace irqchip would be used, which would affect the dirty log perf
> > test result.
>
> Doesn't it affect *all* performance tests? How much does this change
> contribute to the performance numbers you give in the cover letter?
>
This bottleneck showed up after adding the fast path patch. I didn't
try other performance tests with this, but I think it is a good idea
to add a vgic setup for all performance tests. I can post another
patch later to make it available for all performance tests after
finishing this one and verifying all other performance tests.
Below are the test results without the vgic setup. They show a
20~30% improvement across the different vCPU counts.
    +-------+------------------------+
    | #vCPU | dirty memory time (ms) |
    +-------+------------------------+
    | 1     | 965                    |
    +-------+------------------------+
    | 2     | 1006                   |
    +-------+------------------------+
    | 4     | 1128                   |
    +-------+------------------------+
    | 8     | 2005                   |
    +-------+------------------------+
    | 16    | 3903                   |
    +-------+------------------------+
    | 32    | 7595                   |
    +-------+------------------------+
    | 64    | 15783                  |
    +-------+------------------------+
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > index 1954b964d1cf..b501338d9430 100644
> > --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> > @@ -18,6 +18,12 @@
> >  #include "test_util.h"
> >  #include "perf_test_util.h"
> >  #include "guest_modes.h"
> > +#ifdef __aarch64__
> > +#include "aarch64/vgic.h"
> > +
> > +#define GICD_BASE_GPA                        0x8000000ULL
> > +#define GICR_BASE_GPA                        0x80A0000ULL
>
> How did you pick these values?
I used the same values as the other tests.
I talked with Raghavendra about them: the values could be arbitrary,
and he chose these from QEMU's configuration.
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
Thanks,
Jing
Marc Zyngier Jan. 12, 2022, 11:37 a.m. UTC | #5
On Tue, 11 Jan 2022 22:16:01 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Mon, 10 Jan 2022 21:04:41 +0000,
> > Jing Zhang <jingzhangos@google.com> wrote:
> > >
> > > For ARM64, if no vgic is set up before the dirty log perf test, the
> > > userspace irqchip would be used, which would affect the dirty log perf
> > > test result.
> >
> > Doesn't it affect *all* performance tests? How much does this change
> > contribute to the performance numbers you give in the cover letter?
> >
> This bottleneck showed up after adding the fast path patch. I didn't
> try other performance tests with this, but I think it is a good idea
> to add a vgic setup for all performance tests. I can post another
> patch later to make it available for all performance tests after
> finishing this one and verifying all other performance tests.
> Below is the test result without adding the vgic setup. It shows
> 20~30% improvement for the different number of vCPUs.
>     +-------+------------------------+
>     | #vCPU | dirty memory time (ms) |
>     +-------+------------------------+
>     | 1     | 965                    |
>     +-------+------------------------+
>     | 2     | 1006                   |
>     +-------+------------------------+
>     | 4     | 1128                   |
>     +-------+------------------------+
>     | 8     | 2005                   |
>     +-------+------------------------+
>     | 16    | 3903                   |
>     +-------+------------------------+
>     | 32    | 7595                   |
>     +-------+------------------------+
>     | 64    | 15783                  |
>     +-------+------------------------+

So please use these numbers in your cover letter when you repost your
series, as the improvement you'd observe on actual workloads is likely
to be less than what you claim due to this change in the test itself
(in other words, if you are going to benchmark something, don't
change the benchmark halfway).

	M.
Jing Zhang Jan. 12, 2022, 5:40 p.m. UTC | #6
On Wed, Jan 12, 2022 at 3:37 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 11 Jan 2022 22:16:01 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Mon, 10 Jan 2022 21:04:41 +0000,
> > > Jing Zhang <jingzhangos@google.com> wrote:
> > > >
> > > > For ARM64, if no vgic is set up before the dirty log perf test, the
> > > > userspace irqchip would be used, which would affect the dirty log perf
> > > > test result.
> > >
> > > Doesn't it affect *all* performance tests? How much does this change
> > > contribute to the performance numbers you give in the cover letter?
> > >
> > This bottleneck showed up after adding the fast path patch. I didn't
> > try other performance tests with this, but I think it is a good idea
> > to add a vgic setup for all performance tests. I can post another
> > patch later to make it available for all performance tests after
> > finishing this one and verifying all other performance tests.
> > Below is the test result without adding the vgic setup. It shows
> > 20~30% improvement for the different number of vCPUs.
> >     +-------+------------------------+
> >     | #vCPU | dirty memory time (ms) |
> >     +-------+------------------------+
> >     | 1     | 965                    |
> >     +-------+------------------------+
> >     | 2     | 1006                   |
> >     +-------+------------------------+
> >     | 4     | 1128                   |
> >     +-------+------------------------+
> >     | 8     | 2005                   |
> >     +-------+------------------------+
> >     | 16    | 3903                   |
> >     +-------+------------------------+
> >     | 32    | 7595                   |
> >     +-------+------------------------+
> >     | 64    | 15783                  |
> >     +-------+------------------------+
>
> So please use these numbers in your cover letter when you repost your
> series, as the improvement you'd observe on actual workloads is likely
> to be less than what you claim due to this change in the test itself
> (in other words, if you are going to benchmark something, don't
> change the benchmark halfway).
Sure. Will clarify this in the cover letter in future posts.
Thanks,
Jing
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
Patch

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 1954b964d1cf..b501338d9430 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -18,6 +18,12 @@ 
 #include "test_util.h"
 #include "perf_test_util.h"
 #include "guest_modes.h"
+#ifdef __aarch64__
+#include "aarch64/vgic.h"
+
+#define GICD_BASE_GPA			0x8000000ULL
+#define GICR_BASE_GPA			0x80A0000ULL
+#endif
 
 /* How many host loops to run by default (one KVM_GET_DIRTY_LOG for each loop)*/
 #define TEST_HOST_LOOP_N		2UL
@@ -200,6 +206,10 @@  static void run_test(enum vm_guest_mode mode, void *arg)
 		vm_enable_cap(vm, &cap);
 	}
 
+#ifdef __aarch64__
+	vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+#endif
+
 	/* Start the iterations */
 	iteration = 0;
 	host_quit = false;
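
For reference, the GICD_BASE_GPA and GICR_BASE_GPA values used in the patch can be sanity-checked against the GICv3 frame sizes. The following is a minimal standalone sketch, not part of the posted patch. It assumes the region sizes KVM uses for a vgic-v3 (one 64 KiB frame for the distributor, two 64 KiB frames per redistributor), and the helper names gicr_region_size() and gic_regions_overlap() are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base guest physical addresses taken from the patch. */
#define GICD_BASE_GPA		0x8000000ULL
#define GICR_BASE_GPA		0x80A0000ULL

/*
 * Assumed region sizes for a GICv3 without GICv4 extensions:
 * the distributor is one 64 KiB frame, and each redistributor
 * is two 64 KiB frames (RD_base and SGI_base).
 */
#define GICD_SIZE		0x10000ULL
#define GICR_SIZE_PER_VCPU	0x20000ULL

/* Hypothetical helper: bytes of redistributor space for nr_vcpus vCPUs. */
static inline uint64_t gicr_region_size(uint32_t nr_vcpus)
{
	return (uint64_t)nr_vcpus * GICR_SIZE_PER_VCPU;
}

/*
 * Hypothetical helper: true if the distributor range
 * [GICD_BASE_GPA, GICD_BASE_GPA + GICD_SIZE) intersects the
 * redistributor range that starts at GICR_BASE_GPA.
 */
static inline bool gic_regions_overlap(uint32_t nr_vcpus)
{
	uint64_t gicd_end = GICD_BASE_GPA + GICD_SIZE;
	uint64_t gicr_end = GICR_BASE_GPA + gicr_region_size(nr_vcpus);

	return GICD_BASE_GPA < gicr_end && GICR_BASE_GPA < gicd_end;
}
```

Under these assumptions the distributor ends at 0x8010000, below GICR_BASE_GPA, so the two regions stay disjoint for any vCPU count, and the 64-vCPU case from the results table needs 0x800000 bytes of redistributor space. Whether the ranges collide with guest memory used by the test is a separate question that this sketch does not check.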