[V9,00/18] KVM: x86/pmu: Add *basic* support to enable guest PEBS via DS

Message ID 20210722054159.4459-1-lingshan.zhu@intel.com (mailing list archive)

Message

Zhu, Lingshan July 22, 2021, 5:41 a.m. UTC
The guest Precise Event Based Sampling (PEBS) feature can provide the
architectural state of the instruction executed after the guest instruction
that exactly caused the event. It needs a new hardware facility that is only
available on Intel Ice Lake Server platforms. This patch set enables the basic
PEBS feature for KVM guests on ICX.

We can use the PEBS feature in a Linux guest just as on native hardware:

   # echo 0 > /proc/sys/kernel/watchdog (on the host)
   # perf record -e instructions:ppp ./br_instr a
   # perf record -c 100000 -e instructions:pp ./br_instr a

To emulate the guest PEBS facility for the perf usages above,
we need to implement two code paths:

1) Fast path

This is when the host-assigned physical PMC has the same index as the
virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
This path is used in the most common use cases.

2) Slow path

This is when the host-assigned physical PMC has a different index from the
virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0). In this case,
KVM needs to rewrite the PEBS records so that the applicable counter indexes,
which would otherwise be the physical counter indexes written by the PEBS
facility, become the virtual PMC indexes, and to move the counter reset values
to the offsets in the DS data structure that correspond to the physical
counter indexes.
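
As an illustration only, the fast/slow path split described above can be
modelled in a few lines of standalone C. This is a simplified sketch, not code
from the series: the hw_idx[] array, the helper names and the choice to key
the mask by virtual PMC index are all assumptions made for the example, and
the kernel's own bookkeeping may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: hw_idx[i] is the physical PMC index that the host
 * assigned to back virtual PMC i, or -1 if virtual PMC i is unused.
 * Returns a bitmask of the virtual PMCs that would need the slow path.
 */
static uint64_t cross_mapped_mask(const int *hw_idx, int nr_vpmc)
{
	uint64_t mask = 0;

	assert(nr_vpmc >= 0 && nr_vpmc <= 64);
	for (int i = 0; i < nr_vpmc; i++) {
		/* Same index: fast path. Different index: cross-mapped. */
		if (hw_idx[i] >= 0 && hw_idx[i] != i)
			mask |= 1ULL << i;
	}
	return mask;
}

/* With the slow path disabled, guest PEBS stays on only for fast-path PMCs. */
static uint64_t fast_path_pebs_enable(uint64_t guest_pebs_enable,
				      const int *hw_idx, int nr_vpmc)
{
	return guest_pebs_enable & ~cross_mapped_mask(hw_idx, nr_vpmc);
}
```

For example, with hw_idx = {0, 2, -1, 3} only virtual PMC1 is cross-mapped, so
a guest PEBS enable value of 0xF would be reduced to 0xD in this model.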

The previous version [0] enabled both the fast path and the slow path, which
seems too complex for a first step. In this patch set, we start with the fast
path to get basic guest PEBS enabled while keeping the slow path disabled.
A more focused discussion of the slow path [1] is planned for a separate
patch set as the next step.

Compared to later versions in subsequent steps, this patch set lacks two
pieces of functionality: support for having host and guest PEBS enabled at
the same time, and emulation of guest PEBS when the counter is cross-mapped
(neither of these is a typical scenario).

With the basic support, the guest can retrieve the correct PEBS information from
its own PEBS records on Ice Lake servers. We expect it to keep working after
migration to another Ice Lake host, and we expect no regression in host perf.

Here are the results of the PEBS test on the guest and on the host for the same workload:

perf report on guest:
# Samples: 2K of event 'instructions:ppp'
# Event count (approx.): 1473377250
# Overhead  Command   Shared Object      Symbol
   57.74%  br_instr  br_instr           [.] lfsr_cond
   41.40%  br_instr  br_instr           [.] cmp_end
    0.21%  br_instr  [kernel.kallsyms]  [k] __lock_acquire

perf report on host:
# Samples: 2K of event 'instructions:ppp'
# Event count (approx.): 1462721386
# Overhead  Command   Shared Object     Symbol
   57.90%  br_instr  br_instr          [.] lfsr_cond
   41.95%  br_instr  br_instr          [.] cmp_end
    0.05%  br_instr  [kernel.vmlinux]  [k] lock_acquire

Conclusion: the profiling results on the guest are similar to those on the host.

The guest kernel may need to be at least v5.4, or a backport that supports
Ice Lake Server PEBS.

Please see each commit for more details, and feel free to comment.

Previous:
https://lkml.org/lkml/2021/7/16/214

[0]
https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
[1]
https://lore.kernel.org/kvm/20210115191113.nktlnmivc3edstiv@two.firstfloor.org/

V8 -> V9 Changelog:
- fix a bracket error in xen_guest_state()

V7 -> V8 Changelog:
- fix coding style, add {} around single statements that span multiple lines (Peter Z)
- fix coding style in xen_guest_state() (Boris Ostrovsky)
- s/pmu/kvm_pmu/ in intel_guest_get_msrs() (Peter Z)
- put the lower-cost branch first in x86_pmu_handle_guest_pebs() (Peter Z)

V6 -> V7 Changelog:
- Fix the order of conditions and call x86_pmu_handle_guest_pebs() unconditionally; (PeterZ)
- Add a new patch to make all that perf_guest_cbs stuff suck less; (PeterZ)
- Document that the IA32_MISC_ENABLE[7] behavior matches bare metal; (Sean & Venkatesh)
- Update commit message for fixed counter mask refactoring; (PeterZ)
- Clarify comments about {.host and .guest} for intel_guest_get_msrs(); (PeterZ)
- Add pebs_capable to store valid PEBS_COUNTER_MASK value; (PeterZ)
- Add more comments for perf's precise_ip field; (Andi & PeterZ)
- Refactor perf_overflow_handler_t and make it more legible; (PeterZ)
- Use "(unsigned long)cpuc->ds" instead of __this_cpu_read(cpu_hw_events.ds); (PeterZ)
- Keep using "(struct kvm_pmu *)data" to follow K&R; (Andi)

Like Xu (17):
  perf/core: Use static_call to optimize perf_guest_info_callbacks
  perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
  perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
  perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
  KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
  KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
  KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
  KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
  KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
  KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
  KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
  KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
  KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
  KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
  KVM: x86/cpuid: Refactor host/guest CPU model consistency check
  KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64

Peter Zijlstra (Intel) (1):
  x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value

 arch/arm/kernel/perf_callchain.c   |  16 +--
 arch/arm64/kernel/perf_callchain.c |  29 +++--
 arch/arm64/kvm/perf.c              |  22 ++--
 arch/csky/kernel/perf_callchain.c  |   4 +-
 arch/nds32/kernel/perf_event_cpu.c |  16 +--
 arch/riscv/kernel/perf_callchain.c |   4 +-
 arch/x86/events/core.c             |  44 ++++++--
 arch/x86/events/intel/core.c       | 165 +++++++++++++++++++++++------
 arch/x86/events/perf_event.h       |   6 +-
 arch/x86/include/asm/kvm_host.h    |  18 +++-
 arch/x86/include/asm/msr-index.h   |   6 ++
 arch/x86/include/asm/perf_event.h  |   5 +-
 arch/x86/kvm/cpuid.c               |  24 ++---
 arch/x86/kvm/cpuid.h               |   5 +
 arch/x86/kvm/pmu.c                 |  60 ++++++++---
 arch/x86/kvm/pmu.h                 |  38 +++++++
 arch/x86/kvm/vmx/capabilities.h    |  26 +++--
 arch/x86/kvm/vmx/pmu_intel.c       | 116 ++++++++++++++++----
 arch/x86/kvm/vmx/vmx.c             |  24 ++++-
 arch/x86/kvm/vmx/vmx.h             |   2 +-
 arch/x86/kvm/x86.c                 |  51 +++++----
 arch/x86/xen/pmu.c                 |  33 +++---
 include/linux/perf_event.h         |  12 ++-
 kernel/events/core.c               |   9 ++
 24 files changed, 545 insertions(+), 190 deletions(-)

Comments

Peter Zijlstra July 28, 2021, 3:45 p.m. UTC | #1
On Thu, Jul 22, 2021 at 01:41:41PM +0800, Zhu Lingshan wrote:
> The guest Precise Event Based Sampling (PEBS) feature can provide an
> architectural state of the instruction executed after the guest instruction
> that exactly caused the event. It needs new hardware facility only available
> on Intel Ice Lake Server platforms. This patch set enables the basic PEBS
> feature for KVM guests on ICX.
> 
> We can use PEBS feature on the Linux guest like native:
> 
>    # echo 0 > /proc/sys/kernel/watchdog (on the host)
>    # perf record -e instructions:ppp ./br_instr a
>    # perf record -c 100000 -e instructions:pp ./br_instr a

Why does the host need to disable the watchdog? IIRC ICL has multiple
PEBS capable counters. Also, I think the watchdog ends up on a fixed
counter by default anyway.

> Like Xu (17):
>   perf/core: Use static_call to optimize perf_guest_info_callbacks
>   perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
>   perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
>   perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
>   KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
>   KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
>   KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
>   KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
>   KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
>   KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
>   KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
>   KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
>   KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
>   KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
>   KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
>   KVM: x86/cpuid: Refactor host/guest CPU model consistency check
>   KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
> 
> Peter Zijlstra (Intel) (1):
>   x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value

Looks good:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

How do we want to route this, all through the KVM tree?

One little nit I had; would something like the below (on top perhaps)
make the code easier to read?

---
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3921,9 +3921,12 @@ static struct perf_guest_switch_msr *int
 	struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
 	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
 	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
+	int global_ctrl, pebs_enable;
 
 	*nr = 0;
-	arr[(*nr)++] = (struct perf_guest_switch_msr){
+
+	global_ctrl = (*nr)++;
+	arr[global_ctrl] = (struct perf_guest_switch_msr){
 		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
 		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
 		.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
@@ -3966,23 +3969,23 @@ static struct perf_guest_switch_msr *int
 		};
 	}
 
-	arr[*nr] = (struct perf_guest_switch_msr){
+	pebs_enable = (*nr)++;
+	arr[pebs_enable] = (struct perf_guest_switch_msr){
 		.msr = MSR_IA32_PEBS_ENABLE,
 		.host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
 		.guest = pebs_mask & ~cpuc->intel_ctrl_host_mask,
 	};
 
-	if (arr[*nr].host) {
+	if (arr[pebs_enable].host) {
 		/* Disable guest PEBS if host PEBS is enabled. */
-		arr[*nr].guest = 0;
+		arr[pebs_enable].guest = 0;
 	} else {
 		/* Disable guest PEBS for cross-mapped PEBS counters. */
-		arr[*nr].guest &= ~kvm_pmu->host_cross_mapped_mask;
+		arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
 		/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
-		arr[0].guest |= arr[*nr].guest;
+		arr[global_ctrl].guest |= arr[pebs_enable].guest;
 	}
 
-	++(*nr);
 	return arr;
 }
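
To make the effect of the named indices easier to follow outside the kernel
tree, here is a standalone C model of the MSR_IA32_PEBS_ENABLE logic in the
hunk above. It is only a sketch under assumptions: struct switch_msr, struct
pmu_state and fill_pebs_enable() are hypothetical stand-ins for struct
perf_guest_switch_msr and the cpuc/kvm_pmu fields, flattened so the example
compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

struct switch_msr {		/* stand-in for struct perf_guest_switch_msr */
	uint64_t host;
	uint64_t guest;
};

struct pmu_state {		/* hypothetical flattening of cpuc/kvm_pmu */
	uint64_t pebs_enabled;
	uint64_t pebs_capable;
	uint64_t guest_mask;		/* models intel_ctrl_guest_mask */
	uint64_t host_mask;		/* models intel_ctrl_host_mask */
	uint64_t cross_mapped_mask;	/* models host_cross_mapped_mask */
};

static void fill_pebs_enable(const struct pmu_state *s,
			     struct switch_msr *global_ctrl,
			     struct switch_msr *pebs_enable)
{
	uint64_t pebs_mask = s->pebs_enabled & s->pebs_capable;

	/* The two entries are distinct array slots in the real function. */
	assert(global_ctrl != pebs_enable);

	pebs_enable->host = s->pebs_enabled & ~s->guest_mask;
	pebs_enable->guest = pebs_mask & ~s->host_mask;

	if (pebs_enable->host) {
		/* Disable guest PEBS if host PEBS is enabled. */
		pebs_enable->guest = 0;
	} else {
		/* Disable guest PEBS for cross-mapped PEBS counters. */
		pebs_enable->guest &= ~s->cross_mapped_mask;
		/* Mirror the remaining PEBS bits into guest GLOBAL_CTRL. */
		global_ctrl->guest |= pebs_enable->guest;
	}
}
```

Passing the two entries by name, rather than as arr[0] and arr[*nr], is the
same readability point as the global_ctrl and pebs_enable indices in the diff.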
Like Xu July 28, 2021, 4:40 p.m. UTC | #2
On Wed, Jul 28, 2021 at 11:46 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jul 22, 2021 at 01:41:41PM +0800, Zhu Lingshan wrote:
> > The guest Precise Event Based Sampling (PEBS) feature can provide an
> > architectural state of the instruction executed after the guest instruction
> > that exactly caused the event. It needs new hardware facility only available
> > on Intel Ice Lake Server platforms. This patch set enables the basic PEBS
> > feature for KVM guests on ICX.
> >
> > We can use PEBS feature on the Linux guest like native:
> >
> >    # echo 0 > /proc/sys/kernel/watchdog (on the host)
> >    # perf record -e instructions:ppp ./br_instr a
> >    # perf record -c 100000 -e instructions:pp ./br_instr a
>
> Why does the host need to disable the watchdog? IIRC ICL has multiple
> PEBS capable counters. Also, I think the watchdog ends up on a fixed
> counter by default anyway.

The watchdog counter blocks the KVM PEBS request on the same (fixed) counter.
This restriction will be lifted when we have cross-mapping support later in KVM.

>
> > Like Xu (17):
> >   perf/core: Use static_call to optimize perf_guest_info_callbacks
> >   perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
> >   perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
> >   perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
> >   KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
> >   KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
> >   KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
> >   KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
> >   KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
> >   KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
> >   KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
> >   KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
> >   KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
> >   KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
> >   KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
> >   KVM: x86/cpuid: Refactor host/guest CPU model consistency check
> >   KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
> >
> > Peter Zijlstra (Intel) (1):
> >   x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value
>
> Looks good:
>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks for your time and support of the guest PMU features.

> How do we want to route this, all through the KVM tree?

As a prerequisite, the perf tree may apply the first three patches.
Hi Paolo, do you have any preference?

>
> One little nit I had; would something like the below (on top perhaps)
> make the code easier to read?

Fine with me; I may provide a follow-up patch.

>
> ---
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3921,9 +3921,12 @@ static struct perf_guest_switch_msr *int
>         struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
>         u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
>         u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> +       int global_ctrl, pebs_enable;
>
>         *nr = 0;
> -       arr[(*nr)++] = (struct perf_guest_switch_msr){
> +
> +       global_ctrl = (*nr)++;
> +       arr[global_ctrl] = (struct perf_guest_switch_msr){
>                 .msr = MSR_CORE_PERF_GLOBAL_CTRL,
>                 .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
>                 .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> @@ -3966,23 +3969,23 @@ static struct perf_guest_switch_msr *int
>                 };
>         }
>
> -       arr[*nr] = (struct perf_guest_switch_msr){
> +       pebs_enable = (*nr)++;
> +       arr[pebs_enable] = (struct perf_guest_switch_msr){
>                 .msr = MSR_IA32_PEBS_ENABLE,
>                 .host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
>                 .guest = pebs_mask & ~cpuc->intel_ctrl_host_mask,
>         };
>
> -       if (arr[*nr].host) {
> +       if (arr[pebs_enable].host) {
>                 /* Disable guest PEBS if host PEBS is enabled. */
> -               arr[*nr].guest = 0;
> +               arr[pebs_enable].guest = 0;
>         } else {
>                 /* Disable guest PEBS for cross-mapped PEBS counters. */
> -               arr[*nr].guest &= ~kvm_pmu->host_cross_mapped_mask;
> +               arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
>                 /* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
> -               arr[0].guest |= arr[*nr].guest;
> +               arr[global_ctrl].guest |= arr[pebs_enable].guest;
>         }
>
> -       ++(*nr);
>         return arr;
>  }
Zhu, Lingshan Aug. 4, 2021, 3:03 a.m. UTC | #3
On 7/28/2021 11:45 PM, Peter Zijlstra wrote:
> On Thu, Jul 22, 2021 at 01:41:41PM +0800, Zhu Lingshan wrote:
>> The guest Precise Event Based Sampling (PEBS) feature can provide an
>> architectural state of the instruction executed after the guest instruction
>> that exactly caused the event. It needs new hardware facility only available
>> on Intel Ice Lake Server platforms. This patch set enables the basic PEBS
>> feature for KVM guests on ICX.
>>
>> We can use PEBS feature on the Linux guest like native:
>>
>>     # echo 0 > /proc/sys/kernel/watchdog (on the host)
>>     # perf record -e instructions:ppp ./br_instr a
>>     # perf record -c 100000 -e instructions:pp ./br_instr a
> Why does the host need to disable the watchdog? IIRC ICL has multiple
> PEBS capable counters. Also, I think the watchdog ends up on a fixed
> counter by default anyway.
>
>> Like Xu (17):
>>    perf/core: Use static_call to optimize perf_guest_info_callbacks
>>    perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
>>    perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
>>    perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
>>    KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
>>    KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
>>    KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
>>    KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
>>    KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
>>    KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
>>    KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
>>    KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
>>    KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
>>    KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
>>    KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
>>    KVM: x86/cpuid: Refactor host/guest CPU model consistency check
>>    KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
>>
>> Peter Zijlstra (Intel) (1):
>>    x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value
> Looks good:
>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> How do we want to route this, all through the KVM tree?
I will send a V10 patchset then ping Paolo.
>
> One little nit I had; would something like the below (on top perhaps)
> make the code easier to read?
V10 will include this change.

Thanks,
Zhu Lingshan
>
> ---
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3921,9 +3921,12 @@ static struct perf_guest_switch_msr *int
>   	struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
>   	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
>   	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> +	int global_ctrl, pebs_enable;
>   
>   	*nr = 0;
> -	arr[(*nr)++] = (struct perf_guest_switch_msr){
> +
> +	global_ctrl = (*nr)++;
> +	arr[global_ctrl] = (struct perf_guest_switch_msr){
>   		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
>   		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
>   		.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> @@ -3966,23 +3969,23 @@ static struct perf_guest_switch_msr *int
>   		};
>   	}
>   
> -	arr[*nr] = (struct perf_guest_switch_msr){
> +	pebs_enable = (*nr)++;
> +	arr[pebs_enable] = (struct perf_guest_switch_msr){
>   		.msr = MSR_IA32_PEBS_ENABLE,
>   		.host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
>   		.guest = pebs_mask & ~cpuc->intel_ctrl_host_mask,
>   	};
>   
> -	if (arr[*nr].host) {
> +	if (arr[pebs_enable].host) {
>   		/* Disable guest PEBS if host PEBS is enabled. */
> -		arr[*nr].guest = 0;
> +		arr[pebs_enable].guest = 0;
>   	} else {
>   		/* Disable guest PEBS for cross-mapped PEBS counters. */
> -		arr[*nr].guest &= ~kvm_pmu->host_cross_mapped_mask;
> +		arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
>   		/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
> -		arr[0].guest |= arr[*nr].guest;
> +		arr[global_ctrl].guest |= arr[pebs_enable].guest;
>   	}
>   
> -	++(*nr);
>   	return arr;
>   }
Like Xu Aug. 12, 2021, 1:20 p.m. UTC | #4
Hi Paolo,

On 28/7/2021 11:45 pm, Peter Zijlstra wrote:
>> Like Xu (17):
>>    perf/core: Use static_call to optimize perf_guest_info_callbacks
>>    perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
>>    perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
>>    perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
>>    KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
>>    KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
>>    KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
>>    KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
>>    KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
>>    KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
>>    KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
>>    KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
>>    KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
>>    KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
>>    KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
>>    KVM: x86/cpuid: Refactor host/guest CPU model consistency check
>>    KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
>>
>> Peter Zijlstra (Intel) (1):
>>    x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value
> Looks good:
> 
> Acked-by: Peter Zijlstra (Intel)<peterz@infradead.org>
> 
> How do we want to route this, all through the KVM tree?

Do you have any comments on the latest version [1],
or is there a chance to get it queued for mainline?

I would really like to ease Lingshan's burden of maintaining
this feature. Building on this work, the guest BTS
(Branch Trace Store) support is also ready to go.

Thanks,
Like Xu

[1] https://lore.kernel.org/kvm/20210806133802.3528-1-lingshan.zhu@intel.com/