
[v2,0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0

Message ID 20230408034759.2369068-1-reijiw@google.com (mailing list archive)

Message

Reiji Watanabe April 8, 2023, 3:47 a.m. UTC
This series will fix bugs in KVM's handling of PMUSERENR_EL0.

With PMU access support from EL0 [1], the perf subsystem sets the
CR and ER bits of PMUSERENR_EL0 as needed to give EL0 direct
access to PMU counters.  However, KVM appears to assume that the
register value is always zero for the host EL0, which leads to
the following two problems in handling the register.

[A] The host EL0 might lose direct access to PMU counters, as
    KVM always clears PMUSERENR_EL0 before returning to userspace.

[B] With VHE, the guest EL0 access to PMU counters might be trapped
    to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
    indicates that the guest EL0 has access to the counters).
    This is because, with VHE, KVM sets the ER, CR, SW and EN bits
    of PMUSERENR_EL0 to 1 on vcpu_load() to ensure that PMU accesses
    from the guest EL0 are trapped to EL2, but those bits might be
    cleared by the perf subsystem after vcpu_load() (when PMU
    counters are programmed for the vPMU emulation).
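
For illustration only (not code from the patches): a minimal user-space
sketch of problem [A], in which a plain variable stands in for
PMUSERENR_EL0. All toy_* names are hypothetical. The "old" pair models
KVM assuming that the host value is zero, and the "fixed" pair models
saving the host value and restoring it afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* PMUSERENR_EL0 control bits (bit positions per the Arm architecture). */
#define PMUSERENR_EN   (UINT64_C(1) << 0)
#define PMUSERENR_SW   (UINT64_C(1) << 1)
#define PMUSERENR_CR   (UINT64_C(1) << 2)
#define PMUSERENR_ER   (UINT64_C(1) << 3)
#define PMUSERENR_MASK (PMUSERENR_EN | PMUSERENR_SW | \
			PMUSERENR_CR | PMUSERENR_ER)

/* Hypothetical stand-ins: a variable plays the role of the register. */
static uint64_t toy_pmuserenr;
static uint64_t toy_saved_host;

/* Problem [A]: the host value is assumed to be zero and is never saved. */
void toy_enter_guest_old(void) { toy_pmuserenr = PMUSERENR_MASK; }
void toy_exit_guest_old(void)  { toy_pmuserenr = 0; }

/* Save/restore variant: remember the host value across the guest run. */
void toy_enter_guest_fixed(void)
{
	toy_saved_host = toy_pmuserenr;
	toy_pmuserenr = PMUSERENR_MASK;
}

void toy_exit_guest_fixed(void) { toy_pmuserenr = toy_saved_host; }

/* Returns the value the host sees after one guest entry/exit cycle. */
uint64_t toy_round_trip(uint64_t host_value, int use_fix)
{
	toy_pmuserenr = host_value;
	if (use_fix) {
		toy_enter_guest_fixed();
		toy_exit_guest_fixed();
	} else {
		toy_enter_guest_old();
		toy_exit_guest_old();
	}
	return toy_pmuserenr;
}

/* The host's CR|ER setting is lost without the save/restore step. */
void toy_self_check(void)
{
	const uint64_t host = PMUSERENR_CR | PMUSERENR_ER;

	assert(toy_round_trip(host, 0) == 0);
	assert(toy_round_trip(host, 1) == host);
}
```

Calling toy_self_check() from a test program exercises both variants.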

Patch-1 fixes [A], and Patch-2 fixes [B].
The series is based on v6.3-rc5.

v2:
 - Save the PMUSERENR_EL0 for the host in the sysreg array of
   kvm_host_data. [Marc]
 - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
   is loaded; instead, have KVM update the saved shadow register
   value for the host. [Marc, Mark]
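
For illustration only (not code from the patches): a user-space sketch
of the second v2 item, under the assumption that a single flag tracks
whether a vCPU is loaded. All toy_* names are hypothetical. While the
vCPU is loaded, a write from the perf side goes to the saved host copy
instead of the live register, so the bits set on load stay intact and
the host value takes effect once the vCPU is put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PMUSERENR_EL0 control bits (bit positions per the Arm architecture). */
#define PMUSERENR_EN   (UINT64_C(1) << 0)
#define PMUSERENR_SW   (UINT64_C(1) << 1)
#define PMUSERENR_CR   (UINT64_C(1) << 2)
#define PMUSERENR_ER   (UINT64_C(1) << 3)
#define PMUSERENR_MASK (PMUSERENR_EN | PMUSERENR_SW | \
			PMUSERENR_CR | PMUSERENR_ER)

/* Hypothetical stand-ins for the register and the saved host value. */
static uint64_t toy_pmuserenr;
static uint64_t toy_host_shadow;
static bool toy_vcpu_loaded;

/* On load: save the host value, then set all four bits as in [B]. */
void toy_vcpu_load(void)
{
	toy_host_shadow = toy_pmuserenr;
	toy_pmuserenr = PMUSERENR_MASK;
	toy_vcpu_loaded = true;
}

/* On put: restore whatever the host value is at that point. */
void toy_vcpu_put(void)
{
	toy_pmuserenr = toy_host_shadow;
	toy_vcpu_loaded = false;
}

/* Perf-side write: redirected to the shadow while a vCPU is loaded. */
void toy_perf_write_userenr(uint64_t val)
{
	if (toy_vcpu_loaded)
		toy_host_shadow = val;
	else
		toy_pmuserenr = val;
}

/* Walks through load, a perf write while loaded, and put. */
void toy_self_check(void)
{
	toy_pmuserenr = 0;
	toy_vcpu_load();
	toy_perf_write_userenr(PMUSERENR_CR | PMUSERENR_ER);
	assert(toy_pmuserenr == PMUSERENR_MASK);	/* live bits untouched */
	toy_vcpu_put();
	assert(toy_pmuserenr == (PMUSERENR_CR | PMUSERENR_ER));
}
```

Calling toy_self_check() from a test program walks through the sequence.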

v1: https://lore.kernel.org/all/20230329002136.2463442-1-reijiw@google.com/

[1] https://github.com/torvalds/linux/commit/83a7a4d643d33a8b74a42229346b7ed7139fcef9

Reiji Watanabe (2):
  KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
  KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded

 arch/arm64/include/asm/kvm_host.h       |  5 +++++
 arch/arm64/kernel/perf_event.c          | 21 ++++++++++++++++++---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 13 +++++++++++--
 arch/arm64/kvm/pmu.c                    | 20 ++++++++++++++++++++
 4 files changed, 54 insertions(+), 5 deletions(-)


base-commit: 7e364e56293bb98cae1b55fd835f5991c4e96e7d

Comments

Marc Zyngier April 8, 2023, 9:04 a.m. UTC | #1
On Sat, 08 Apr 2023 04:47:57 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> This series will fix bugs in KVM's handling of PMUSERENR_EL0.
> 
> With PMU access support from EL0 [1], the perf subsystem would
> set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> a direct access to PMU counters.  However, KVM appears to assume
> that the register value is always zero for the host EL0, and has
> the following two problems in handling the register.
> 
> [A] The host EL0 might lose the direct access to PMU counters, as
>     KVM always clears PMUSERENR_EL0 before returning to userspace.
> 
> [B] With VHE, the guest EL0 access to PMU counters might be trapped
>     to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
>     indicates that the guest EL0 has an access to the counters).
>     This is because, with VHE, KVM sets ER, CR, SW and EN bits of
>     PMUSERENR_EL0 to 1 on vcpu_load() to ensure to trap PMU access
>     from the guest EL0 to EL2, but those bits might be cleared by
>     the perf subsystem after vcpu_load() (when PMU counters are
>     programmed for the vPMU emulation).
> 
> Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> The series is based on v6.3-rc5.
> 
> v2:
>  - Save the PMUSERENR_EL0 for the host in the sysreg array of
>    kvm_host_data. [Marc]
>  - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
>    is loaded, instead have KVM update the saved shadow register
>    value for the host. [Marc, Mark]

This looks much better to me. If Mark is OK with it, I'm happy to take
it in 6.4.

Speaking of which, this will clash with the queued move of the PMUv3
code into drivers/perf, and probably break on 32bit. I can either take
a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
the PMUv3 driver on 32bit ARM")), or wait until -rc1.

Will, what do you prefer?

	M.

Will Deacon April 11, 2023, 11:24 a.m. UTC | #2
On Sat, Apr 08, 2023 at 10:04:19AM +0100, Marc Zyngier wrote:
> On Sat, 08 Apr 2023 04:47:57 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> > 
> > This series will fix bugs in KVM's handling of PMUSERENR_EL0.
> > 
> > With PMU access support from EL0 [1], the perf subsystem would
> > set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> > a direct access to PMU counters.  However, KVM appears to assume
> > that the register value is always zero for the host EL0, and has
> > the following two problems in handling the register.
> > 
> > [A] The host EL0 might lose the direct access to PMU counters, as
> >     KVM always clears PMUSERENR_EL0 before returning to userspace.
> > 
> > [B] With VHE, the guest EL0 access to PMU counters might be trapped
> >     to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
> >     indicates that the guest EL0 has an access to the counters).
> >     This is because, with VHE, KVM sets ER, CR, SW and EN bits of
> >     PMUSERENR_EL0 to 1 on vcpu_load() to ensure to trap PMU access
> >     from the guest EL0 to EL2, but those bits might be cleared by
> >     the perf subsystem after vcpu_load() (when PMU counters are
> >     programmed for the vPMU emulation).
> > 
> > Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> > The series is based on v6.3-rc5.
> > 
> > v2:
> >  - Save the PMUSERENR_EL0 for the host in the sysreg array of
> >    kvm_host_data. [Marc]
> >  - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
> >    is loaded, instead have KVM update the saved shadow register
> >    value for the host. [Marc, Mark]
> 
> This looks much better to me. If Mark is OK with it, I'm happy to take
> it in 6.4.
> 
> Speaking of which, this will clash with the queued move of the PMUv3
> code into drivers/perf, and probably break on 32bit. I can either take
> a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
> the PMUv3 driver on 32bit ARM")), or wait until -rc1.
> 
> Will, what do you prefer?

I'd be inclined to wait until -rc1, but for-next/perf is stable if you
decide to take it anyway.

Will

Marc Zyngier April 12, 2023, 10:29 a.m. UTC | #3
On Tue, 11 Apr 2023 12:24:59 +0100,
Will Deacon <will@kernel.org> wrote:
> 
> On Sat, Apr 08, 2023 at 10:04:19AM +0100, Marc Zyngier wrote:
> > On Sat, 08 Apr 2023 04:47:57 +0100,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > > 
> > > This series will fix bugs in KVM's handling of PMUSERENR_EL0.
> > > 
> > > With PMU access support from EL0 [1], the perf subsystem would
> > > set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> > > a direct access to PMU counters.  However, KVM appears to assume
> > > that the register value is always zero for the host EL0, and has
> > > the following two problems in handling the register.
> > > 
> > > [A] The host EL0 might lose the direct access to PMU counters, as
> > >     KVM always clears PMUSERENR_EL0 before returning to userspace.
> > > 
> > > [B] With VHE, the guest EL0 access to PMU counters might be trapped
> > >     to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
> > >     indicates that the guest EL0 has an access to the counters).
> > >     This is because, with VHE, KVM sets ER, CR, SW and EN bits of
> > >     PMUSERENR_EL0 to 1 on vcpu_load() to ensure to trap PMU access
> > >     from the guest EL0 to EL2, but those bits might be cleared by
> > >     the perf subsystem after vcpu_load() (when PMU counters are
> > >     programmed for the vPMU emulation).
> > > 
> > > Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> > > The series is based on v6.3-rc5.
> > > 
> > > v2:
> > >  - Save the PMUSERENR_EL0 for the host in the sysreg array of
> > >    kvm_host_data. [Marc]
> > >  - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
> > >    is loaded, instead have KVM update the saved shadow register
> > >    value for the host. [Marc, Mark]
> > 
> > This looks much better to me. If Mark is OK with it, I'm happy to take
> > it in 6.4.
> > 
> > Speaking of which, this will clash with the queued move of the PMUv3
> > code into drivers/perf, and probably break on 32bit. I can either take
> > a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
> > the PMUv3 driver on 32bit ARM")), or wait until -rc1.
> > 
> > Will, what do you prefer?
> 
> I'd be inclined to wait until -rc1, but for-next/perf is stable if you
> decide to take it anyway.

Given that Mark and Reiji are still working out some of the corner
cases, -rc1 feels like the right target.

Thanks,

	M.