Message ID | a4833c8168e287f0caf1dc6f16ec5c054bd88b0a.1593974333.git.michal.leszczynski@cert.pl (mailing list archive)
---|---
State | Superseded
Series | Implement support for external IPT monitoring
----- On 5 Jul 2020, at 20:54, Michał Leszczyński michal.leszczynski@cert.pl wrote:

> From: Michal Leszczynski <michal.leszczynski@cert.pl>
>
> Implement necessary changes in common code/HVM to support
> processor trace features. Define vmtrace_pt_* API and
> implement trace buffer allocation/deallocation in common
> code.
>
> Signed-off-by: Michal Leszczynski <michal.leszczynski@cert.pl>
> ---
>  xen/arch/x86/domain.c         | 19 +++++++++++++++++++
>  xen/common/domain.c           | 19 +++++++++++++++++++
>  xen/include/asm-x86/hvm/hvm.h | 20 ++++++++++++++++++++
>  xen/include/xen/sched.h       |  4 ++++
>  4 files changed, 62 insertions(+)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index fee6c3931a..79c9794408 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
>              altp2m_vcpu_disable_ve(v);
>      }
>
> +    for_each_vcpu ( d, v )
> +    {
> +        unsigned int i;
> +
> +        if ( !v->vmtrace.pt_buf )
> +            continue;
> +
> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
> +        {
> +            struct page_info *pg = mfn_to_page(
> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
> +            if ( (pg->count_info & PGC_count_mask) != 1 )
> +                return -EBUSY;
> +        }
> +
> +        free_domheap_pages(v->vmtrace.pt_buf,
> +            get_order_from_bytes(v->domain->vmtrace_pt_size));

While this works, I don't feel that this is a good solution, with this loop
returning -EBUSY here. I would like to kindly ask for suggestions regarding
this topic.

Best regards,
Michał Leszczyński
CERT Polska
On 05.07.2020 21:11, Michał Leszczyński wrote:
> ----- On 5 Jul 2020, at 20:54, Michał Leszczyński michal.leszczynski@cert.pl wrote:
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
>>              altp2m_vcpu_disable_ve(v);
>>      }
>>
>> +    for_each_vcpu ( d, v )
>> +    {
>> +        unsigned int i;
>> +
>> +        if ( !v->vmtrace.pt_buf )
>> +            continue;
>> +
>> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
>> +        {
>> +            struct page_info *pg = mfn_to_page(
>> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
>> +            if ( (pg->count_info & PGC_count_mask) != 1 )
>> +                return -EBUSY;
>> +        }
>> +
>> +        free_domheap_pages(v->vmtrace.pt_buf,
>> +            get_order_from_bytes(v->domain->vmtrace_pt_size));
>
> While this works, I don't feel that this is a good solution with this loop
> returning -EBUSY here. I would like to kindly ask for suggestions regarding
> this topic.

I'm sorry to ask, but with the previously given suggestions to mirror
existing code, why do you still need to play with this function? You
really shouldn't have a need to, just like e.g. the ioreq server page
handling code didn't.

Jan
On Sun, Jul 05, 2020 at 08:54:59PM +0200, Michał Leszczyński wrote:
> From: Michal Leszczynski <michal.leszczynski@cert.pl>
>
> Implement necessary changes in common code/HVM to support
> processor trace features. Define vmtrace_pt_* API and
> implement trace buffer allocation/deallocation in common
> code.
>
> Signed-off-by: Michal Leszczynski <michal.leszczynski@cert.pl>
> ---
>  xen/arch/x86/domain.c         | 19 +++++++++++++++++++
>  xen/common/domain.c           | 19 +++++++++++++++++++
>  xen/include/asm-x86/hvm/hvm.h | 20 ++++++++++++++++++++
>  xen/include/xen/sched.h       |  4 ++++
>  4 files changed, 62 insertions(+)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index fee6c3931a..79c9794408 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
>              altp2m_vcpu_disable_ve(v);
>      }
>
> +    for_each_vcpu ( d, v )
> +    {
> +        unsigned int i;
> +
> +        if ( !v->vmtrace.pt_buf )
> +            continue;
> +
> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
> +        {
> +            struct page_info *pg = mfn_to_page(
> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
> +            if ( (pg->count_info & PGC_count_mask) != 1 )
> +                return -EBUSY;
> +        }
> +
> +        free_domheap_pages(v->vmtrace.pt_buf,
> +            get_order_from_bytes(v->domain->vmtrace_pt_size));

This is racy as a control domain could take a reference between the
check and the freeing.

> +    }
> +
>      if ( is_pv_domain(d) )
>      {
>          for_each_vcpu ( d, v )
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 25d3359c5b..f480c4e033 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -137,6 +137,21 @@ static void vcpu_destroy(struct vcpu *v)
>      free_vcpu_struct(v);
>  }
>
> +static int vmtrace_alloc_buffers(struct vcpu *v)
> +{
> +    struct page_info *pg;
> +    uint64_t size = v->domain->vmtrace_pt_size;
> +
> +    pg = alloc_domheap_pages(v->domain, get_order_from_bytes(size),
> +                             MEMF_no_refcount);
> +
> +    if ( !pg )
> +        return -ENOMEM;
> +
> +    v->vmtrace.pt_buf = pg;
> +    return 0;
> +}

I think we already agreed that you would use the same model as ioreq
servers, where a reference is taken on allocation and then the pages
are not explicitly freed on domain destruction and put_page_and_type
is used. Is there some reason why that model doesn't work in this
case?

If not, please see hvm_alloc_ioreq_mfn and hvm_free_ioreq_mfn.

Roger.
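[Editorial note: the buffer in vmtrace_alloc_buffers() is sized via get_order_from_bytes(), i.e. rounded up to a power-of-two number of pages. A minimal stand-alone model of that computation, assuming 4 KiB pages; the toy_* names are illustrative, not Xen's implementation:]

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                      /* assume 4 KiB pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Smallest order such that (1 << order) pages cover `size` bytes,
 * i.e. what a get_order_from_bytes()-style helper has to compute. */
static unsigned int toy_get_order_from_bytes(uint64_t size)
{
    uint64_t pages = (size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;
    unsigned int order = 0;

    while ( (1UL << order) < pages )
        order++;

    return order;
}
```

[Sizes that are not a power-of-two number of pages get rounded up: e.g. a 12 KiB request (3 pages) allocates at order 2, i.e. 4 pages.]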
----- On 6 Jul 2020, at 10:42, Roger Pau Monné roger.pau@citrix.com wrote:

> On Sun, Jul 05, 2020 at 08:54:59PM +0200, Michał Leszczyński wrote:
>> From: Michal Leszczynski <michal.leszczynski@cert.pl>
>>
>> Implement necessary changes in common code/HVM to support
>> processor trace features. Define vmtrace_pt_* API and
>> implement trace buffer allocation/deallocation in common
>> code.
>>
>> Signed-off-by: Michal Leszczynski <michal.leszczynski@cert.pl>
>> ---
>>  xen/arch/x86/domain.c         | 19 +++++++++++++++++++
>>  xen/common/domain.c           | 19 +++++++++++++++++++
>>  xen/include/asm-x86/hvm/hvm.h | 20 ++++++++++++++++++++
>>  xen/include/xen/sched.h       |  4 ++++
>>  4 files changed, 62 insertions(+)
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index fee6c3931a..79c9794408 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
>>              altp2m_vcpu_disable_ve(v);
>>      }
>>
>> +    for_each_vcpu ( d, v )
>> +    {
>> +        unsigned int i;
>> +
>> +        if ( !v->vmtrace.pt_buf )
>> +            continue;
>> +
>> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
>> +        {
>> +            struct page_info *pg = mfn_to_page(
>> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
>> +            if ( (pg->count_info & PGC_count_mask) != 1 )
>> +                return -EBUSY;
>> +        }
>> +
>> +        free_domheap_pages(v->vmtrace.pt_buf,
>> +            get_order_from_bytes(v->domain->vmtrace_pt_size));
>
> This is racy as a control domain could take a reference between the
> check and the freeing.
>
>> +    }
>> +
>>      if ( is_pv_domain(d) )
>>      {
>>          for_each_vcpu ( d, v )
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index 25d3359c5b..f480c4e033 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -137,6 +137,21 @@ static void vcpu_destroy(struct vcpu *v)
>>      free_vcpu_struct(v);
>>  }
>>
>> +static int vmtrace_alloc_buffers(struct vcpu *v)
>> +{
>> +    struct page_info *pg;
>> +    uint64_t size = v->domain->vmtrace_pt_size;
>> +
>> +    pg = alloc_domheap_pages(v->domain, get_order_from_bytes(size),
>> +                             MEMF_no_refcount);
>> +
>> +    if ( !pg )
>> +        return -ENOMEM;
>> +
>> +    v->vmtrace.pt_buf = pg;
>> +    return 0;
>> +}
>
> I think we already agreed that you would use the same model as ioreq
> servers, where a reference is taken on allocation and then the pages
> are not explicitly freed on domain destruction and put_page_and_type
> is used. Is there some reason why that model doesn't work in this
> case?
>
> If not, please see hvm_alloc_ioreq_mfn and hvm_free_ioreq_mfn.
>
> Roger.

Ok, I've got it, will do. Thanks for pointing out the examples.

One thing that is confusing to me is that I don't get
the meaning of the MEMF_no_refcount flag.

In hvm_{alloc,free}_ioreq_mfn, the memory is allocated
explicitly but freed just by putting out the reference, so
I guess it's automatically detected that the refcount dropped to 0
and the page should be freed? If so, why is the flag named
"no refcount"?

Best regards,
Michał Leszczyński
CERT Polska
----- On 6 Jul 2020, at 10:31, Jan Beulich jbeulich@suse.com wrote:

> On 05.07.2020 21:11, Michał Leszczyński wrote:
>> ----- On 5 Jul 2020, at 20:54, Michał Leszczyński michal.leszczynski@cert.pl wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
>>>              altp2m_vcpu_disable_ve(v);
>>>      }
>>>
>>> +    for_each_vcpu ( d, v )
>>> +    {
>>> +        unsigned int i;
>>> +
>>> +        if ( !v->vmtrace.pt_buf )
>>> +            continue;
>>> +
>>> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
>>> +        {
>>> +            struct page_info *pg = mfn_to_page(
>>> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
>>> +            if ( (pg->count_info & PGC_count_mask) != 1 )
>>> +                return -EBUSY;
>>> +        }
>>> +
>>> +        free_domheap_pages(v->vmtrace.pt_buf,
>>> +            get_order_from_bytes(v->domain->vmtrace_pt_size));
>>
>> While this works, I don't feel that this is a good solution with this loop
>> returning -EBUSY here. I would like to kindly ask for suggestions regarding
>> this topic.
>
> I'm sorry to ask, but with the previously given suggestions to mirror
> existing code, why do you still need to play with this function? You
> really shouldn't have a need to, just like e.g. the ioreq server page
> handling code didn't.
>
> Jan

Ok, sorry. I think I've finally got it after Roger's latest suggestions :P
This will be fixed in the next version.

Best regards,
Michał Leszczyński
CERT Polska
On Mon, Jul 06, 2020 at 12:09:02PM +0200, Michał Leszczyński wrote:
> ----- On 6 Jul 2020, at 10:42, Roger Pau Monné roger.pau@citrix.com wrote:
>
> > On Sun, Jul 05, 2020 at 08:54:59PM +0200, Michał Leszczyński wrote:
> >> From: Michal Leszczynski <michal.leszczynski@cert.pl>
> >>
> >> Implement necessary changes in common code/HVM to support
> >> processor trace features. Define vmtrace_pt_* API and
> >> implement trace buffer allocation/deallocation in common
> >> code.
> >>
> >> Signed-off-by: Michal Leszczynski <michal.leszczynski@cert.pl>
> >> ---
> >>  xen/arch/x86/domain.c         | 19 +++++++++++++++++++
> >>  xen/common/domain.c           | 19 +++++++++++++++++++
> >>  xen/include/asm-x86/hvm/hvm.h | 20 ++++++++++++++++++++
> >>  xen/include/xen/sched.h       |  4 ++++
> >>  4 files changed, 62 insertions(+)
> >>
> >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> >> index fee6c3931a..79c9794408 100644
> >> --- a/xen/arch/x86/domain.c
> >> +++ b/xen/arch/x86/domain.c
> >> @@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
> >>              altp2m_vcpu_disable_ve(v);
> >>      }
> >>
> >> +    for_each_vcpu ( d, v )
> >> +    {
> >> +        unsigned int i;
> >> +
> >> +        if ( !v->vmtrace.pt_buf )
> >> +            continue;
> >> +
> >> +        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
> >> +        {
> >> +            struct page_info *pg = mfn_to_page(
> >> +                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
> >> +            if ( (pg->count_info & PGC_count_mask) != 1 )
> >> +                return -EBUSY;
> >> +        }
> >> +
> >> +        free_domheap_pages(v->vmtrace.pt_buf,
> >> +            get_order_from_bytes(v->domain->vmtrace_pt_size));
> >
> > This is racy as a control domain could take a reference between the
> > check and the freeing.
> >
> >> +    }
> >> +
> >>      if ( is_pv_domain(d) )
> >>      {
> >>          for_each_vcpu ( d, v )
> >> diff --git a/xen/common/domain.c b/xen/common/domain.c
> >> index 25d3359c5b..f480c4e033 100644
> >> --- a/xen/common/domain.c
> >> +++ b/xen/common/domain.c
> >> @@ -137,6 +137,21 @@ static void vcpu_destroy(struct vcpu *v)
> >>      free_vcpu_struct(v);
> >>  }
> >>
> >> +static int vmtrace_alloc_buffers(struct vcpu *v)
> >> +{
> >> +    struct page_info *pg;
> >> +    uint64_t size = v->domain->vmtrace_pt_size;
> >> +
> >> +    pg = alloc_domheap_pages(v->domain, get_order_from_bytes(size),
> >> +                             MEMF_no_refcount);
> >> +
> >> +    if ( !pg )
> >> +        return -ENOMEM;
> >> +
> >> +    v->vmtrace.pt_buf = pg;
> >> +    return 0;
> >> +}
> >
> > I think we already agreed that you would use the same model as ioreq
> > servers, where a reference is taken on allocation and then the pages
> > are not explicitly freed on domain destruction and put_page_and_type
> > is used. Is there some reason why that model doesn't work in this
> > case?
> >
> > If not, please see hvm_alloc_ioreq_mfn and hvm_free_ioreq_mfn.
> >
> > Roger.
>
> Ok, I've got it, will do. Thanks for pointing out the examples.
>
> One thing that is confusing to me is that I don't get
> the meaning of the MEMF_no_refcount flag.

That flag prevents the memory from being counted towards the amount of
memory assigned to the domain. You want it that way so that trace
buffers are not accounted as part of the memory assigned to the
domain.

You then need to get an (extra) reference to the pages (there's always
the 'allocated' reference AFAICT) so that when the last reference is
dropped (either by the domain being destroyed or the memory being
unmapped from the control domain) it will be freed.

> In hvm_{alloc,free}_ioreq_mfn, the memory is allocated
> explicitly but freed just by putting out the reference, so
> I guess it's automatically detected that the refcount dropped to 0
> and the page should be freed?

Yes, put_page_alloc_ref will remove the allocated flag and then when
the last reference is dropped the page will be freed.

> If so, why is the flag named "no refcount"?

I'm not sure about the naming, but you can get references to pages
allocated as MEMF_no_refcount, and change their types. They are
however not accounted towards the memory usage of a domain.

Roger.
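[Editorial note: the lifetime model Roger describes — every holder drops its reference, and the page is freed when the last one goes away — can be contrasted with the patch's check-then-free loop in a stand-alone toy model. The toy_* helpers below are illustrative only; real Xen uses get_page()/put_page() and put_page_alloc_ref() with atomic count_info updates, not these:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page_info's reference count. */
struct toy_page {
    unsigned int count;   /* analogous to count_info & PGC_count_mask */
    bool freed;
};

/* The patch's scheme: check the count, then free in a separate step.
 * A control domain could take a reference in between, so the checked
 * value may already be stale by the time the free happens. */
static int racy_try_free(struct toy_page *pg)
{
    if ( pg->count != 1 )
        return -1;                    /* the -EBUSY path in the patch */
    /* ... race window: another CPU may do pg->count++ here ... */
    pg->freed = true;
    return 0;
}

/* The ioreq-server scheme: every holder (the allocation itself, a
 * control-domain mapping) owns one reference; dropping the last one
 * frees the page, with no separate check-then-free step. */
static void toy_put_ref(struct toy_page *pg)
{
    if ( --pg->count == 0 )
        pg->freed = true;
}
```

[The toy keeps the decrement plain for brevity; on real hardware it has to be atomic.]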
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fee6c3931a..79c9794408 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2199,6 +2199,25 @@ int domain_relinquish_resources(struct domain *d)
             altp2m_vcpu_disable_ve(v);
     }
 
+    for_each_vcpu ( d, v )
+    {
+        unsigned int i;
+
+        if ( !v->vmtrace.pt_buf )
+            continue;
+
+        for ( i = 0; i < (v->domain->vmtrace_pt_size >> PAGE_SHIFT); i++ )
+        {
+            struct page_info *pg = mfn_to_page(
+                mfn_add(page_to_mfn(v->vmtrace.pt_buf), i));
+            if ( (pg->count_info & PGC_count_mask) != 1 )
+                return -EBUSY;
+        }
+
+        free_domheap_pages(v->vmtrace.pt_buf,
+            get_order_from_bytes(v->domain->vmtrace_pt_size));
+    }
+
     if ( is_pv_domain(d) )
     {
         for_each_vcpu ( d, v )
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 25d3359c5b..f480c4e033 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -137,6 +137,21 @@ static void vcpu_destroy(struct vcpu *v)
     free_vcpu_struct(v);
 }
 
+static int vmtrace_alloc_buffers(struct vcpu *v)
+{
+    struct page_info *pg;
+    uint64_t size = v->domain->vmtrace_pt_size;
+
+    pg = alloc_domheap_pages(v->domain, get_order_from_bytes(size),
+                             MEMF_no_refcount);
+
+    if ( !pg )
+        return -ENOMEM;
+
+    v->vmtrace.pt_buf = pg;
+    return 0;
+}
+
 struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
 {
     struct vcpu *v;
@@ -162,6 +177,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
     v->vcpu_id = vcpu_id;
     v->dirty_cpu = VCPU_CPU_CLEAN;
 
+    if ( d->vmtrace_pt_size && vmtrace_alloc_buffers(v) != 0 )
+        return NULL;
+
     spin_lock_init(&v->virq_lock);
 
     tasklet_init(&v->continue_hypercall_tasklet, NULL, NULL);
@@ -422,6 +440,7 @@ struct domain *domain_create(domid_t domid,
     d->shutdown_code = SHUTDOWN_CODE_INVALID;
 
     spin_lock_init(&d->pbuf_lock);
+    spin_lock_init(&d->vmtrace_lock);
 
     rwlock_init(&d->vnuma_rwlock);
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 1eb377dd82..2d474a4c50 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -214,6 +214,10 @@ struct hvm_function_table {
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
     int (*altp2m_vcpu_emulate_vmfunc)(const struct cpu_user_regs *regs);
 
+    /* vmtrace */
+    int (*vmtrace_control_pt)(struct vcpu *v, bool enable);
+    int (*vmtrace_get_pt_offset)(struct vcpu *v, uint64_t *offset);
+
     /*
      * Parameters and callbacks for hardware-assisted TSC scaling,
      * which are valid only when the hardware feature is available.
@@ -655,6 +659,22 @@ static inline bool altp2m_vcpu_emulate_ve(struct vcpu *v)
     return false;
 }
 
+static inline int vmtrace_control_pt(struct vcpu *v, bool enable)
+{
+    if ( hvm_funcs.vmtrace_control_pt )
+        return hvm_funcs.vmtrace_control_pt(v, enable);
+
+    return -EOPNOTSUPP;
+}
+
+static inline int vmtrace_get_pt_offset(struct vcpu *v, uint64_t *offset)
+{
+    if ( hvm_funcs.vmtrace_get_pt_offset )
+        return hvm_funcs.vmtrace_get_pt_offset(v, offset);
+
+    return -EOPNOTSUPP;
+}
+
 /*
  * This must be defined as a macro instead of an inline function,
  * because it uses 'struct vcpu' and 'struct domain' which have
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 48f0a61bbd..95ebab0d30 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -253,6 +253,10 @@ struct vcpu
     /* vPCI per-vCPU area, used to store data for long running operations. */
     struct vpci_vcpu vpci;
 
+    struct {
+        struct page_info *pt_buf;
+    } vmtrace;
+
     struct arch_vcpu arch;
 };
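[Editorial note: the vmtrace_control_pt()/vmtrace_get_pt_offset() inlines in the patch follow the usual hvm_funcs pattern — invoke the hook if the platform code registered one, otherwise report the operation as unsupported. A stand-alone sketch of that dispatch pattern; the toy_* names and the pretend backend are illustrative, not Xen's:]

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EOPNOTSUPP 95   /* illustrative errno value */

/* Minimal stand-in for struct hvm_function_table. */
struct toy_hvm_funcs {
    int (*vmtrace_control_pt)(int vcpu_id, bool enable);
};

static struct toy_hvm_funcs toy_funcs;   /* hooks default to NULL */

/* Pretend platform backend that does support the feature. */
static int toy_vmx_control_pt(int vcpu_id, bool enable)
{
    (void)vcpu_id;
    (void)enable;
    return 0;   /* pretend tracing was (de)activated */
}

/* Wrapper mirroring the patch: call the hook when present, else
 * fall back to "operation not supported". */
static int toy_vmtrace_control_pt(int vcpu_id, bool enable)
{
    if ( toy_funcs.vmtrace_control_pt )
        return toy_funcs.vmtrace_control_pt(vcpu_id, enable);

    return -TOY_EOPNOTSUPP;
}
```

[This is why the hooks can stay unset on platforms without processor-trace support: callers get a clean error instead of a NULL dereference.]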