Message ID | 20221125040604.5051-15-weijiang.yang@intel.com |
---|---|
State | New, archived |
Series | Introduce Architectural LBR for vPMU |
On Thu, Nov 24, 2022, Yang Weijiang wrote:
> Arch LBR MSRs are xsave-supported, but they're operated as "independent"
> xsave feature by PMU code, i.e., during thread/process context switch,
> the MSRs are saved/restored with perf_event_task_sched_{in|out} instead
> of generic kernel fpu switch code, i.e., save_fpregs_to_fpstate() and
> restore_fpregs_from_fpstate(). When vcpu guest/host fpu state swap happens,
> Arch LBR MSRs are retained so they can be accessed directly.
>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index b57944d5e7d8..241128972776 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
>  		}
>  		return 0;
> +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
> +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
> +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
> +		rdmsrl(msr_info->index, msr_info->data);

I don't see how this is correct. As called out in patch 5:

: If for some magical reason it's safe to access arch LBR MSRs without disabling
: IRQs and confirming perf event ownership, I want to see a very detailed changelog
: explaining exactly how that magic works.

> +		return 0;
>  	default:
>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>  		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> @@ -528,6 +533,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		    (data & ARCH_LBR_CTL_LBREN))
>  			intel_pmu_create_guest_lbr_event(vcpu);
>  		return 0;
> +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
> +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
> +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
> +		wrmsrl(msr_info->index, msr_info->data);
> +		return 0;
>  	default:
>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>  		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> --
> 2.27.0
>
On 1/28/2023 6:13 AM, Sean Christopherson wrote:
> On Thu, Nov 24, 2022, Yang Weijiang wrote:
>> Arch LBR MSRs are xsave-supported, but they're operated as "independent"
>> xsave feature by PMU code, i.e., during thread/process context switch,
>> the MSRs are saved/restored with perf_event_task_sched_{in|out} instead
>> of generic kernel fpu switch code, i.e., save_fpregs_to_fpstate() and
>> restore_fpregs_from_fpstate(). When vcpu guest/host fpu state swap happens,
>> Arch LBR MSRs are retained so they can be accessed directly.
>>
>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>  arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index b57944d5e7d8..241128972776 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>  			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
>>  		}
>>  		return 0;
>> +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
>> +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
>> +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
>> +		rdmsrl(msr_info->index, msr_info->data);
> I don't see how this is correct. As called out in patch 5:
>
> : If for some magical reason it's safe to access arch LBR MSRs without disabling
> : IRQs and confirming perf event ownership, I want to see a very detailed changelog
> : explaining exactly how that magic works.

The MSR lists here are just for live migration. When arch-lbr is active,
these MSRs are passed through to the guest.

>
>> +		return 0;
>>  	default:
>>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>  		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>> @@ -528,6 +533,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>  		    (data & ARCH_LBR_CTL_LBREN))
>>  			intel_pmu_create_guest_lbr_event(vcpu);
>>  		return 0;
>> +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
>> +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
>> +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
>> +		wrmsrl(msr_info->index, msr_info->data);
>> +		return 0;
>>  	default:
>>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>  		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>> --
>> 2.27.0
>>
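(For reference, "passed through" here means KVM disables the MSR read/write
intercepts while arch LBR is active, so guest accesses don't VM-exit. A minimal
sketch of such wiring using KVM's vmx_set_intercept_for_msr() helper; the fixed
bound of 32 is an assumption taken from the case ranges above, not code from
this series:)

	/*
	 * Sketch only: disable read/write intercepts for the arch LBR
	 * FROM/TO/INFO MSR triples so guest accesses go straight to the
	 * hardware.  Real code would use the LBR depth enumerated via
	 * CPUID rather than the hard-coded maximum of 32.
	 */
	for (i = 0; i < 32; i++) {
		vmx_set_intercept_for_msr(vcpu, MSR_ARCH_LBR_FROM_0 + i,
					  MSR_TYPE_RW, false);
		vmx_set_intercept_for_msr(vcpu, MSR_ARCH_LBR_TO_0 + i,
					  MSR_TYPE_RW, false);
		vmx_set_intercept_for_msr(vcpu, MSR_ARCH_LBR_INFO_0 + i,
					  MSR_TYPE_RW, false);
	}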
On Mon, Jan 30, 2023, Yang, Weijiang wrote:
>
> On 1/28/2023 6:13 AM, Sean Christopherson wrote:
> > On Thu, Nov 24, 2022, Yang Weijiang wrote:
> > > Arch LBR MSRs are xsave-supported, but they're operated as "independent"
> > > xsave feature by PMU code, i.e., during thread/process context switch,
> > > the MSRs are saved/restored with perf_event_task_sched_{in|out} instead
> > > of generic kernel fpu switch code, i.e., save_fpregs_to_fpstate() and
> > > restore_fpregs_from_fpstate(). When vcpu guest/host fpu state swap happens,
> > > Arch LBR MSRs are retained so they can be accessed directly.
> > >
> > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > > Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> > > ---
> > >  arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > index b57944d5e7d8..241128972776 100644
> > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > @@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > >  			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
> > >  		}
> > >  		return 0;
> > > +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
> > > +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
> > > +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
> > > +		rdmsrl(msr_info->index, msr_info->data);
> > I don't see how this is correct. As called out in patch 5:
> >
> > : If for some magical reason it's safe to access arch LBR MSRs without disabling
> > : IRQs and confirming perf event ownership, I want to see a very detailed changelog
> > : explaining exactly how that magic works.
>
> The MSR lists here are just for live migration. When arch-lbr is active,
> these MSRs are passed through to the guest.

None of that explains how the guest's MSR values are guaranteed to be resident
in hardware.
On 1/31/2023 1:30 AM, Sean Christopherson wrote:
> On Mon, Jan 30, 2023, Yang, Weijiang wrote:
>> On 1/28/2023 6:13 AM, Sean Christopherson wrote:
>>> On Thu, Nov 24, 2022, Yang Weijiang wrote:
>>>> Arch LBR MSRs are xsave-supported, but they're operated as "independent"
>>>> xsave feature by PMU code, i.e., during thread/process context switch,
>>>> the MSRs are saved/restored with perf_event_task_sched_{in|out} instead
>>>> of generic kernel fpu switch code, i.e., save_fpregs_to_fpstate() and
>>>> restore_fpregs_from_fpstate(). When vcpu guest/host fpu state swap happens,
>>>> Arch LBR MSRs are retained so they can be accessed directly.
>>>>
>>>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>>>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
>>>> ---
>>>>  arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>> index b57944d5e7d8..241128972776 100644
>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>> @@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>>  			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
>>>>  		}
>>>>  		return 0;
>>>> +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
>>>> +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
>>>> +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
>>>> +		rdmsrl(msr_info->index, msr_info->data);
>>> I don't see how this is correct. As called out in patch 5:
>>>
>>> : If for some magical reason it's safe to access arch LBR MSRs without disabling
>>> : IRQs and confirming perf event ownership, I want to see a very detailed changelog
>>> : explaining exactly how that magic works.
>> The MSR lists here are just for live migration. When arch-lbr is active,
>> these MSRs are passed through to the guest.
> None of that explains how the guest's MSR values are guaranteed to be resident
> in hardware.

I ignored the host *event* scheduling case in the commit log.

My understanding is that a host LBR *event* could break in at any point while
the vCPU is running; in that case, disabling IRQs before reading/writing the
MSRs is pointless because the HW context could already have been swapped out.
I need to do more investigation into this issue.
On Tue, Jan 31, 2023, Yang, Weijiang wrote:
>
> On 1/31/2023 1:30 AM, Sean Christopherson wrote:
> > On Mon, Jan 30, 2023, Yang, Weijiang wrote:
> > > On 1/28/2023 6:13 AM, Sean Christopherson wrote:
> > > > On Thu, Nov 24, 2022, Yang Weijiang wrote:
> > > > > Arch LBR MSRs are xsave-supported, but they're operated as "independent"
> > > > > xsave feature by PMU code, i.e., during thread/process context switch,
> > > > > the MSRs are saved/restored with perf_event_task_sched_{in|out} instead
> > > > > of generic kernel fpu switch code, i.e., save_fpregs_to_fpstate() and
> > > > > restore_fpregs_from_fpstate(). When vcpu guest/host fpu state swap happens,
> > > > > Arch LBR MSRs are retained so they can be accessed directly.
> > > > >
> > > > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > > > > Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> > > > > ---
> > > > >  arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
> > > > >  1 file changed, 10 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > index b57944d5e7d8..241128972776 100644
> > > > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > @@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > > >  			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
> > > > >  		}
> > > > >  		return 0;
> > > > > +	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
> > > > > +	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
> > > > > +	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
> > > > > +		rdmsrl(msr_info->index, msr_info->data);
> > > > I don't see how this is correct. As called out in patch 5:
> > > >
> > > > : If for some magical reason it's safe to access arch LBR MSRs without disabling
> > > > : IRQs and confirming perf event ownership, I want to see a very detailed changelog
> > > > : explaining exactly how that magic works.
> > > The MSR lists here are just for live migration. When arch-lbr is active,
> > > these MSRs are passed through to the guest.
> > None of that explains how the guest's MSR values are guaranteed to be resident
> > in hardware.
>
> I ignored the host *event* scheduling case in the commit log.
>
> My understanding is that a host LBR *event* could break in at any point while
> the vCPU is running; in that case, disabling IRQs before reading/writing the
> MSRs is pointless because the HW context could already have been swapped out.
> I need to do more investigation into this issue.

Which is presumably why intel_pmu_handle_lbr_msrs_access() checks that the LBR
perf event is active prior to accessing the MSRs, with IRQs disabled...

	/*
	 * Disable irq to ensure the LBR feature doesn't get reclaimed by the
	 * host at the time the value is read from the msr, and this avoids the
	 * host LBR value to be leaked to the guest. If LBR has been reclaimed,
	 * return 0 on guest reads.
	 */
	local_irq_disable();
	if (lbr_desc->event->state == PERF_EVENT_STATE_ACTIVE) {
		if (read)
			rdmsrl(index, msr_info->data);
		else
			wrmsrl(index, msr_info->data);
		__set_bit(INTEL_PMC_IDX_FIXED_VLBR, vcpu_to_pmu(vcpu)->pmc_in_use);
		local_irq_enable();
		return true;
	}
	clear_bit(INTEL_PMC_IDX_FIXED_VLBR, vcpu_to_pmu(vcpu)->pmc_in_use);
	local_irq_enable();
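(For illustration, a minimal sketch of how the arch LBR case arms could reuse
that guarded pattern. The helper name arch_lbr_msr_access() and its placement
in pmu_intel.c are hypothetical assumptions, not code from this series:)

	/*
	 * Hypothetical helper: access an arch LBR MSR only while the vCPU's
	 * LBR perf event still owns the hardware.  IRQs are disabled across
	 * the ownership check and the MSR access so a host LBR event can't
	 * reclaim the MSRs in between.
	 */
	static bool arch_lbr_msr_access(struct kvm_vcpu *vcpu,
					struct msr_data *msr_info, bool read)
	{
		struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
		bool ret = false;

		local_irq_disable();
		if (lbr_desc->event &&
		    lbr_desc->event->state == PERF_EVENT_STATE_ACTIVE) {
			if (read)
				rdmsrl(msr_info->index, msr_info->data);
			else
				wrmsrl(msr_info->index, msr_info->data);
			ret = true;
		} else if (read) {
			/* LBRs were reclaimed by the host; don't leak host values. */
			msr_info->data = 0;
		}
		local_irq_enable();

		return ret;
	}

As with the vLBR path quoted above, a read that loses the race would return 0
to the guest rather than leaking whatever the host loaded into the MSRs.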
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b57944d5e7d8..241128972776 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -410,6 +410,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
 		}
 		return 0;
+	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
+	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
+	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
+		rdmsrl(msr_info->index, msr_info->data);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -528,6 +533,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		    (data & ARCH_LBR_CTL_LBREN))
 			intel_pmu_create_guest_lbr_event(vcpu);
 		return 0;
+	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
+	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
+	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
+		wrmsrl(msr_info->index, msr_info->data);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
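(Continuing the sketch from the thread above: the new case arms in this diff
could delegate to the hypothetical arch_lbr_msr_access() instead of touching
the MSRs unguarded. For the intel_pmu_get_msr() side, for example:)

	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
		/* Guarded read; data is zeroed if the LBRs were reclaimed. */
		arch_lbr_msr_access(vcpu, msr_info, true);
		return 0;

The intel_pmu_set_msr() side would pass false, silently dropping the write if
the hardware has already been reclaimed by a host LBR event.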