Message ID | 20231130074609.58668-4-yangyicong@huawei.com
---|---
State | New, archived
Series | Enable the profiling of EL0&1 translation regime of ARM SPE
On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> and we're actually enabling E0SPE and E2SPE in the driver. This means
> the data from EL0&1 translation regime of a VM will not be profiled.
> So this patch tries to add the support of profiling EL0 and EL1 of
> a VM. Users can filter data of different exception level by using
> the perf's exclude_* attributes. The exclude_* decision is referred
> to Documentation/arch/arm64/perf.rst and the implementation of
> arm_pmuv3.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>  1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index 09570d4d63cd..a647d625f359 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>  {
>  	struct perf_event_attr *attr = &event->attr;
> -	u64 reg = 0;
> +	u64 pmscr_el1, pmscr_el12;
>
> -	reg = arm_spe_event_to_pmscr(event);
> -	if (!attr->exclude_user)
> -		reg |= PMSCR_EL1x_E0SPE;
> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> +
> +	/*
> +	 * Map the exclude_* descision to ELx according to
> +	 * Documentation/arch/arm64/perf.rst.
> +	 */
> +	if (is_kernel_in_hyp_mode()) {
> +		if (!attr->exclude_kernel && !attr->exclude_host)
> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>
> -	if (!attr->exclude_kernel)
> -		reg |= PMSCR_EL1x_E1SPE;
> +		if (!attr->exclude_kernel && !attr->exclude_guest)
> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
> +
> +		if (!attr->exclude_user && !attr->exclude_host) {
> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
> +		}

Hmm, I don't understand this part. Doesn't this mean that setting
'exclude_host' to true will also exclude userspace (EL0) profiling for
the guest?

Will
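To make the question concrete, here is a hypothetical userspace sketch (not part of the patch or this thread) that opens a guest-only SPE session by setting exclude_host. With the hunk quoted above, E0SPE is only enabled when exclude_host is clear, so this configuration would not profile guest EL0 either. The PMU type value and sample period are placeholders; the real type comes from /sys/bus/event_source/devices/arm_spe_0/type.

```c
/* Hypothetical example, not from the thread: request guest-only SPE
 * profiling (all ELs of the guest) by excluding the host. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = 8;			/* placeholder: arm_spe_0 type from sysfs */
	attr.sample_period = 1024;	/* placeholder sampling interval */
	attr.exclude_host = 1;		/* guest only ... */
	attr.exclude_user = 0;		/* ... but both guest EL0 ... */
	attr.exclude_kernel = 0;	/* ... and guest EL1 */

	/* SPE events are opened per CPU; CPU 0 here, not tied to a task. */
	fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	if (fd < 0)
		perror("perf_event_open");
	else
		printf("guest-only SPE event: fd=%d\n", fd);
	return 0;
}
```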
On 2024/4/11 22:28, Will Deacon wrote:
> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>> the data from EL0&1 translation regime of a VM will not be profiled.
>> So this patch tries to add the support of profiling EL0 and EL1 of
>> a VM. Users can filter data of different exception level by using
>> the perf's exclude_* attributes. The exclude_* decision is referred
>> to Documentation/arch/arm64/perf.rst and the implementation of
>> arm_pmuv3.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>>  1 file changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>> index 09570d4d63cd..a647d625f359 100644
>> --- a/drivers/perf/arm_spe_pmu.c
>> +++ b/drivers/perf/arm_spe_pmu.c
>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>>  {
>>  	struct perf_event_attr *attr = &event->attr;
>> -	u64 reg = 0;
>> +	u64 pmscr_el1, pmscr_el12;
>>
>> -	reg = arm_spe_event_to_pmscr(event);
>> -	if (!attr->exclude_user)
>> -		reg |= PMSCR_EL1x_E0SPE;
>> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>> +
>> +	/*
>> +	 * Map the exclude_* descision to ELx according to
>> +	 * Documentation/arch/arm64/perf.rst.
>> +	 */
>> +	if (is_kernel_in_hyp_mode()) {
>> +		if (!attr->exclude_kernel && !attr->exclude_host)
>> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>
>> -	if (!attr->exclude_kernel)
>> -		reg |= PMSCR_EL1x_E1SPE;
>> +		if (!attr->exclude_kernel && !attr->exclude_guest)
>> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
>> +
>> +		if (!attr->exclude_user && !attr->exclude_host) {
>> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
>> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
>> +		}
>
> Hmm, I don't understand this part. Doesn't this mean that setting
> 'exclude_host' to true will also exclude userspace (EL0) profiling for
> the guest?
>

I may have misunderstood 'exclude_host' in the doc. Yes, we won't include EL0 here
in the driver but we should handle it on guest enter/exit, which is missed
in this patch. I'll see how to handle it properly on guest enter/exit and
respin a v3.

Thanks.
On Fri, 12 Apr 2024 10:22:28 +0100,
Yicong Yang <yangyicong@huawei.com> wrote:
>
> On 2024/4/11 22:28, Will Deacon wrote:
> > On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
> >> From: Yicong Yang <yangyicong@hisilicon.com>
> >>
> >> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
> >> and we're actually enabling E0SPE and E2SPE in the driver. This means
> >> the data from EL0&1 translation regime of a VM will not be profiled.
> >> So this patch tries to add the support of profiling EL0 and EL1 of
> >> a VM. Users can filter data of different exception level by using
> >> the perf's exclude_* attributes. The exclude_* decision is referred
> >> to Documentation/arch/arm64/perf.rst and the implementation of
> >> arm_pmuv3.
> >>
> >> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> >> ---
> >>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
> >>  1 file changed, 30 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> >> index 09570d4d63cd..a647d625f359 100644
> >> --- a/drivers/perf/arm_spe_pmu.c
> >> +++ b/drivers/perf/arm_spe_pmu.c
> >> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
> >>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
> >>  {
> >>  	struct perf_event_attr *attr = &event->attr;
> >> -	u64 reg = 0;
> >> +	u64 pmscr_el1, pmscr_el12;
> >>
> >> -	reg = arm_spe_event_to_pmscr(event);
> >> -	if (!attr->exclude_user)
> >> -		reg |= PMSCR_EL1x_E0SPE;
> >> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
> >> +
> >> +	/*
> >> +	 * Map the exclude_* descision to ELx according to
> >> +	 * Documentation/arch/arm64/perf.rst.
> >> +	 */
> >> +	if (is_kernel_in_hyp_mode()) {
> >> +		if (!attr->exclude_kernel && !attr->exclude_host)
> >> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
> >>
> >> -	if (!attr->exclude_kernel)
> >> -		reg |= PMSCR_EL1x_E1SPE;
> >> +		if (!attr->exclude_kernel && !attr->exclude_guest)
> >> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
> >> +
> >> +		if (!attr->exclude_user && !attr->exclude_host) {
> >> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
> >> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
> >> +		}
> >
> > Hmm, I don't understand this part. Doesn't this mean that setting
> > 'exclude_host' to true will also exclude userspace (EL0) profiling for
> > the guest?
> >
>
> I may have misunderstood 'exclude_host' in the doc. Yes, we won't include EL0 here
> in the driver but we should handle it on guest enter/exit, which is missed
> in this patch. I'll see how to handle it properly on guest enter/exit and
> respin a v3.

Why should you handle this on guest entry/exit? It should be enough to
deal with PMSCR_EL12 at the point where the vcpu is scheduled, and
not on the fast path (i.e. it should get hooked into load/put). The
PMU does something vaguely similar.

Another thing is that it shouldn't get in the way of a future SPE
support for the guest itself.

Thanks,

M.
On 2024/4/12 17:59, Marc Zyngier wrote:
> On Fri, 12 Apr 2024 10:22:28 +0100,
> Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> On 2024/4/11 22:28, Will Deacon wrote:
>>> On Thu, Nov 30, 2023 at 03:46:09PM +0800, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> On a VHE enabled host, the PMSCR_EL1 will be redirect to PMSCR_EL2
>>>> and we're actually enabling E0SPE and E2SPE in the driver. This means
>>>> the data from EL0&1 translation regime of a VM will not be profiled.
>>>> So this patch tries to add the support of profiling EL0 and EL1 of
>>>> a VM. Users can filter data of different exception level by using
>>>> the perf's exclude_* attributes. The exclude_* decision is referred
>>>> to Documentation/arch/arm64/perf.rst and the implementation of
>>>> arm_pmuv3.
>>>>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>>  drivers/perf/arm_spe_pmu.c | 37 ++++++++++++++++++++++++++++++-------
>>>>  1 file changed, 30 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
>>>> index 09570d4d63cd..a647d625f359 100644
>>>> --- a/drivers/perf/arm_spe_pmu.c
>>>> +++ b/drivers/perf/arm_spe_pmu.c
>>>> @@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>>>>  static void arm_spe_pmu_set_pmscr(struct perf_event *event)
>>>>  {
>>>>  	struct perf_event_attr *attr = &event->attr;
>>>> -	u64 reg = 0;
>>>> +	u64 pmscr_el1, pmscr_el12;
>>>>
>>>> -	reg = arm_spe_event_to_pmscr(event);
>>>> -	if (!attr->exclude_user)
>>>> -		reg |= PMSCR_EL1x_E0SPE;
>>>> +	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
>>>> +
>>>> +	/*
>>>> +	 * Map the exclude_* descision to ELx according to
>>>> +	 * Documentation/arch/arm64/perf.rst.
>>>> +	 */
>>>> +	if (is_kernel_in_hyp_mode()) {
>>>> +		if (!attr->exclude_kernel && !attr->exclude_host)
>>>> +			pmscr_el1 |= PMSCR_EL1x_E1SPE;
>>>>
>>>> -	if (!attr->exclude_kernel)
>>>> -		reg |= PMSCR_EL1x_E1SPE;
>>>> +		if (!attr->exclude_kernel && !attr->exclude_guest)
>>>> +			pmscr_el12 |= PMSCR_EL1x_E1SPE;
>>>> +
>>>> +		if (!attr->exclude_user && !attr->exclude_host) {
>>>> +			pmscr_el1 |= PMSCR_EL1x_E0SPE;
>>>> +			pmscr_el12 |= PMSCR_EL1x_E0SPE;
>>>> +		}
>>>
>>> Hmm, I don't understand this part. Doesn't this mean that setting
>>> 'exclude_host' to true will also exclude userspace (EL0) profiling for
>>> the guest?
>>>
>>
>> I may have misunderstood 'exclude_host' in the doc. Yes, we won't include EL0 here
>> in the driver but we should handle it on guest enter/exit, which is missed
>> in this patch. I'll see how to handle it properly on guest enter/exit and
>> respin a v3.
>
> Why should you handle this on guest entry/exit? It should be enough to
> deal with PMSCR_EL12 at the point where the vcpu is scheduled, and
> not on the fast path (i.e. it should get hooked into load/put). The
> PMU does something vaguely similar.
>

Yes. I meant kvm_vcpu_pmu_restore_{guest,host}(), which also handles
the PMU counters. We should be referring to the same place.

> Another thing is that it shouldn't get in the way of a future SPE
> support for the guest itself.
>
> Thanks,
>
> M.
>
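For illustration only, here is a minimal sketch of the direction Marc suggests: the SPE driver stashes the PMSCR value it wants active while a guest's EL1&0 regime runs, and KVM applies/clears it from the same vcpu load/put path that already calls kvm_vcpu_pmu_restore_{guest,host}(), keeping the register writes off the fast entry/exit path. The helper names and the per-CPU variable are invented; this is not code from the thread or from mainline KVM, and it assumes the SYS_PMSCR_EL12 definition used by the patch.

```c
#include <linux/kvm_host.h>
#include <linux/percpu.h>
#include <asm/barrier.h>
#include <asm/sysreg.h>
#include <asm/virt.h>

/* Hypothetical: PMSCR bits the host SPE session wants active only while
 * this CPU is running a guest (e.g. E0SPE/E1SPE with exclude_host set). */
static DEFINE_PER_CPU(u64, spe_guest_pmscr);

/* Would be called from vcpu load, next to kvm_vcpu_pmu_restore_guest(). */
void kvm_spe_restore_guest(struct kvm_vcpu *vcpu)
{
	if (!is_kernel_in_hyp_mode())
		return;

	write_sysreg_s(__this_cpu_read(spe_guest_pmscr), SYS_PMSCR_EL12);
	isb();
}

/* Would be called from vcpu put, next to kvm_vcpu_pmu_restore_host(). */
void kvm_spe_restore_host(struct kvm_vcpu *vcpu)
{
	if (!is_kernel_in_hyp_mode())
		return;

	write_sysreg_s(0, SYS_PMSCR_EL12);
	isb();
}
```

A real version would also have to coordinate with the guest's own saved PMSCR_EL1 state once SPE is exposed to guests, which is the second point Marc raises.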
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 09570d4d63cd..a647d625f359 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -316,21 +316,44 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 static void arm_spe_pmu_set_pmscr(struct perf_event *event)
 {
 	struct perf_event_attr *attr = &event->attr;
-	u64 reg = 0;
+	u64 pmscr_el1, pmscr_el12;
 
-	reg = arm_spe_event_to_pmscr(event);
-	if (!attr->exclude_user)
-		reg |= PMSCR_EL1x_E0SPE;
+	pmscr_el1 = pmscr_el12 = arm_spe_event_to_pmscr(event);
+
+	/*
+	 * Map the exclude_* descision to ELx according to
+	 * Documentation/arch/arm64/perf.rst.
+	 */
+	if (is_kernel_in_hyp_mode()) {
+		if (!attr->exclude_kernel && !attr->exclude_host)
+			pmscr_el1 |= PMSCR_EL1x_E1SPE;
 
-	if (!attr->exclude_kernel)
-		reg |= PMSCR_EL1x_E1SPE;
+		if (!attr->exclude_kernel && !attr->exclude_guest)
+			pmscr_el12 |= PMSCR_EL1x_E1SPE;
+
+		if (!attr->exclude_user && !attr->exclude_host) {
+			pmscr_el1 |= PMSCR_EL1x_E0SPE;
+			pmscr_el12 |= PMSCR_EL1x_E0SPE;
+		}
+	} else {
+		if (!attr->exclude_kernel)
+			pmscr_el1 |= PMSCR_EL1x_E1SPE;
+
+		if (!attr->exclude_user)
+			pmscr_el1 |= PMSCR_EL1x_E0SPE;
+	}
 
 	isb();
-	write_sysreg_s(reg, SYS_PMSCR_EL1);
+	write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
+	if (is_kernel_in_hyp_mode())
+		write_sysreg_s(pmscr_el12, SYS_PMSCR_EL12);
 }
 
 static void arm_spe_pmu_clr_pmscr(void)
 {
+	if (is_kernel_in_hyp_mode())
+		write_sysreg_s(0, SYS_PMSCR_EL12);
+
 	write_sysreg_s(0, SYS_PMSCR_EL1);
 	isb();
 }
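As a quick way to read the resulting filter behaviour, the standalone sketch below (plain userspace C, not kernel code) mirrors the VHE branch of arm_spe_pmu_set_pmscr() with symbolic bit values. For the exclude_host-only case discussed above it enables guest EL1 via the PMSCR_EL12 image but leaves E0SPE clear in both registers, which is the corner Will queried.

```c
/* Standalone sketch mirroring the VHE branch of the patch; the bit values
 * are symbolic stand-ins, not the real PMSCR_EL1 field encodings. */
#include <stdbool.h>
#include <stdio.h>

#define E0SPE 0x1	/* EL0 profiling enable (symbolic) */
#define E1SPE 0x2	/* EL1 profiling enable (symbolic) */

struct excludes { bool user, kernel, host, guest; };

static void map_vhe(struct excludes x, unsigned int *el1, unsigned int *el12)
{
	*el1 = *el12 = 0;
	if (!x.kernel && !x.host)
		*el1 |= E1SPE;		/* host kernel (EL2 on VHE) */
	if (!x.kernel && !x.guest)
		*el12 |= E1SPE;		/* guest kernel (EL1) */
	if (!x.user && !x.host) {	/* guest EL0 is tied to exclude_host here */
		*el1 |= E0SPE;		/* host EL0 */
		*el12 |= E0SPE;		/* guest EL0 */
	}
}

int main(void)
{
	/* The case discussed in the thread: guest-only profiling. */
	struct excludes guest_only = { .host = true };
	unsigned int el1, el12;

	map_vhe(guest_only, &el1, &el12);
	/* Expected output: PMSCR_EL1=0 PMSCR_EL12=0x2 (guest EL1, no EL0). */
	printf("exclude_host=1: PMSCR_EL1=%#x PMSCR_EL12=%#x\n", el1, el12);
	return 0;
}
```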