diff mbox series

x86/vm_event: add gdtr_base to the vm_event structure

Message ID 20190501235731.1486-1-tamas@tklengyel.com (mailing list archive)
State Superseded
Headers show
Series x86/vm_event: add gdtr_base to the vm_event structure | expand

Commit Message

Tamas K Lengyel May 1, 2019, 11:57 p.m. UTC
Receiving this register is useful for introspecting 32-bit Windows when the
event being trapped happened while in ring3.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/arch/x86/vm_event.c       | 5 +++++
 xen/include/public/vm_event.h | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

Comments

Razvan Cojocaru May 2, 2019, 8:25 a.m. UTC | #1
On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
> Receiving this register is useful for introspecting 32-bit Windows when the
> event being trapped happened while in ring3.
> 
> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Roger Pau Monne <roger.pau@citrix.com>
> ---
>   xen/arch/x86/vm_event.c       | 5 +++++
>   xen/include/public/vm_event.h | 3 ++-
>   2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
> index 51c3493b1d..873788e32f 100644
> --- a/xen/arch/x86/vm_event.c
> +++ b/xen/arch/x86/vm_event.c
> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment,
>           reg->es_sel = seg.sel;
>           break;
>   
> +    case x86_seg_gdtr:
> +        reg->gdtr_base = seg.base;
> +        break;

Please also add limit, ar, sel, like the others do. In addition to 
making this modification look less strange (since, in contrast to the 
function name, nothing is packed for gdtr_base), it will also save 
future work for applications wanting to use gdtr which also require 
backwards compatibility with previous vm_event versions.

As you know, for each such modification we need to have a separate 
vm_event_vX header in our applications so that we're ready to create a 
ring buffer using requests and replies of the right size, and also to be 
able to properly interpret the ring buffer data, so the least frequent 
changes to the vm_event struct, the better.

Petre is currently working on the vm_event changes that will hopefully 
enable us to just cache everything that the getcontext_partial hypercall 
retrieves.


Thanks,
Razvan
Andrew Cooper May 2, 2019, 8:52 a.m. UTC | #2
On 02/05/2019 09:25, Razvan Cojocaru wrote:
> On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
>> Receiving this register is useful for introspecting 32-bit Windows
>> when the
>> event being trapped happened while in ring3.
>>
>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Roger Pau Monne <roger.pau@citrix.com>
>> ---
>>   xen/arch/x86/vm_event.c       | 5 +++++
>>   xen/include/public/vm_event.h | 3 ++-
>>   2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
>> index 51c3493b1d..873788e32f 100644
>> --- a/xen/arch/x86/vm_event.c
>> +++ b/xen/arch/x86/vm_event.c
>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum
>> x86_segment segment,
>>           reg->es_sel = seg.sel;
>>           break;
>>   +    case x86_seg_gdtr:
>> +        reg->gdtr_base = seg.base;
>> +        break;
>
> Please also add limit, ar, sel, like the others do.

In Xen, we model GDTR/IDTR just like all other segments in the segment
cache for convenience/consistency, including faking up of some default
attributes.

However, there is no such thing as a selector or access rights for them,
and the VMCS lacks the fields, while the VMCB marks the fields as
reserved.   It is almost certainly not worth wasting the space in the ring.

~Andrew

P.S. If you consult the state-after-reset table in both SDMs, it does
state that Access Rights exist, and default to Present and R/W, but this
isn't a field which can be changed at any point.  I suspect real
pipelines model GDTR/LDTR just like the other segments for
simplification of the segmentation logic, which is why I chose to do the
same in Xen.
Jan Beulich May 2, 2019, 8:57 a.m. UTC | #3
>>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote:
> On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
>> Receiving this register is useful for introspecting 32-bit Windows when the
>> event being trapped happened while in ring3.
>> 
>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Roger Pau Monne <roger.pau@citrix.com>
>> ---
>>   xen/arch/x86/vm_event.c       | 5 +++++
>>   xen/include/public/vm_event.h | 3 ++-
>>   2 files changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
>> index 51c3493b1d..873788e32f 100644
>> --- a/xen/arch/x86/vm_event.c
>> +++ b/xen/arch/x86/vm_event.c
>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum 
> x86_segment segment,
>>           reg->es_sel = seg.sel;
>>           break;
>>   
>> +    case x86_seg_gdtr:
>> +        reg->gdtr_base = seg.base;
>> +        break;
> 
> Please also add limit, ar, sel, like the others do.

There's no ar or sel for GDT (and IDT). Instead, because of ...

> In addition to 
> making this modification look less strange (since, in contrast to the 
> function name, nothing is packed for gdtr_base), it will also save 
> future work for applications wanting to use gdtr which also require 
> backwards compatibility with previous vm_event versions.
> 
> As you know, for each such modification we need to have a separate 
> vm_event_vX header in our applications so that we're ready to create a 
> ring buffer using requests and replies of the right size, and also to be 
> able to properly interpret the ring buffer data, so the least frequent 
> changes to the vm_event struct, the better.

... this I wonder whether the IDT details shouldn't get added at
the same time.

Jan
Razvan Cojocaru May 2, 2019, 9:15 a.m. UTC | #4
On 5/2/19 11:52 AM, Andrew Cooper wrote:
> On 02/05/2019 09:25, Razvan Cojocaru wrote:
>> On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
>>> Receiving this register is useful for introspecting 32-bit Windows
>>> when the
>>> event being trapped happened while in ring3.
>>>
>>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>> Cc: Roger Pau Monne <roger.pau@citrix.com>
>>> ---
>>>    xen/arch/x86/vm_event.c       | 5 +++++
>>>    xen/include/public/vm_event.h | 3 ++-
>>>    2 files changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
>>> index 51c3493b1d..873788e32f 100644
>>> --- a/xen/arch/x86/vm_event.c
>>> +++ b/xen/arch/x86/vm_event.c
>>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum
>>> x86_segment segment,
>>>            reg->es_sel = seg.sel;
>>>            break;
>>>    +    case x86_seg_gdtr:
>>> +        reg->gdtr_base = seg.base;
>>> +        break;
>>
>> Please also add limit, ar, sel, like the others do.
> 
> In Xen, we model GDTR/IDTR just like all other segments in the segment
> cache for convenience/consistency, including faking up of some default
> attributes.
> 
> However, there is no such thing as a selector or access rights for them,
> and the VMCS lacks the fields, while the VMCB marks the fields as
> reserved.   It is almost certainly not worth wasting the space in the ring.

Right, I got carried away there: I saw gdtr_limit and didn't check that 
if the rest is available.


Thanks,
Razvan
Tamas K Lengyel May 2, 2019, 1:09 p.m. UTC | #5
On Thu, May 2, 2019 at 2:57 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote:
> > On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
> >> Receiving this register is useful for introspecting 32-bit Windows when the
> >> event being trapped happened while in ring3.
> >>
> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> >> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> >> Cc: Jan Beulich <jbeulich@suse.com>
> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >> Cc: Wei Liu <wei.liu2@citrix.com>
> >> Cc: Roger Pau Monne <roger.pau@citrix.com>
> >> ---
> >>   xen/arch/x86/vm_event.c       | 5 +++++
> >>   xen/include/public/vm_event.h | 3 ++-
> >>   2 files changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
> >> index 51c3493b1d..873788e32f 100644
> >> --- a/xen/arch/x86/vm_event.c
> >> +++ b/xen/arch/x86/vm_event.c
> >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum
> > x86_segment segment,
> >>           reg->es_sel = seg.sel;
> >>           break;
> >>
> >> +    case x86_seg_gdtr:
> >> +        reg->gdtr_base = seg.base;
> >> +        break;
> >
> > Please also add limit, ar, sel, like the others do.
>
> There's no ar or sel for GDT (and IDT). Instead, because of ...
>
> > In addition to
> > making this modification look less strange (since, in contrast to the
> > function name, nothing is packed for gdtr_base), it will also save
> > future work for applications wanting to use gdtr which also require
> > backwards compatibility with previous vm_event versions.
> >
> > As you know, for each such modification we need to have a separate
> > vm_event_vX header in our applications so that we're ready to create a
> > ring buffer using requests and replies of the right size, and also to be
> > able to properly interpret the ring buffer data, so the least frequent
> > changes to the vm_event struct, the better.
>
> ... this I wonder whether the IDT details shouldn't get added at
> the same time.

The churn of the header is not that big of a deal - I do the same in
LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add
(see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8).
So that should not be a factor in deciding whether we add a needed
extra piece or not. Since the vm_event ABI is changing for Xen 4.13
now the ABI version will only be bumped once for 4.13. So we should be
able to add/remove/shuffle whatever we want while 4.13 merge window is
open.

That said I don't have a use for idt and gdtr_limit that warrants
having to receive it via the vm_event structure - those pieces are
always just a hypercall away if needed. So in the vm_event structure I
tend to just put the registers needed often enough to warrant avoiding
that extra hypercall. But if Razvan says he has a use for them, I can
add them here.

Tamas
Razvan Cojocaru May 2, 2019, 1:23 p.m. UTC | #6
On 5/2/19 4:09 PM, Tamas K Lengyel wrote:
> On Thu, May 2, 2019 at 2:57 AM Jan Beulich <JBeulich@suse.com> wrote:
>>
>>>>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote:
>>> On 5/2/19 2:57 AM, Tamas K Lengyel wrote:
>>>> Receiving this register is useful for introspecting 32-bit Windows when the
>>>> event being trapped happened while in ring3.
>>>>
>>>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
>>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>>> Cc: Roger Pau Monne <roger.pau@citrix.com>
>>>> ---
>>>>    xen/arch/x86/vm_event.c       | 5 +++++
>>>>    xen/include/public/vm_event.h | 3 ++-
>>>>    2 files changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
>>>> index 51c3493b1d..873788e32f 100644
>>>> --- a/xen/arch/x86/vm_event.c
>>>> +++ b/xen/arch/x86/vm_event.c
>>>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum
>>> x86_segment segment,
>>>>            reg->es_sel = seg.sel;
>>>>            break;
>>>>
>>>> +    case x86_seg_gdtr:
>>>> +        reg->gdtr_base = seg.base;
>>>> +        break;
>>>
>>> Please also add limit, ar, sel, like the others do.
>>
>> There's no ar or sel for GDT (and IDT). Instead, because of ...
>>
>>> In addition to
>>> making this modification look less strange (since, in contrast to the
>>> function name, nothing is packed for gdtr_base), it will also save
>>> future work for applications wanting to use gdtr which also require
>>> backwards compatibility with previous vm_event versions.
>>>
>>> As you know, for each such modification we need to have a separate
>>> vm_event_vX header in our applications so that we're ready to create a
>>> ring buffer using requests and replies of the right size, and also to be
>>> able to properly interpret the ring buffer data, so the least frequent
>>> changes to the vm_event struct, the better.
>>
>> ... this I wonder whether the IDT details shouldn't get added at
>> the same time.
> 
> The churn of the header is not that big of a deal - I do the same in
> LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add
> (see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8).
> So that should not be a factor in deciding whether we add a needed
> extra piece or not. Since the vm_event ABI is changing for Xen 4.13
> now the ABI version will only be bumped once for 4.13. So we should be
> able to add/remove/shuffle whatever we want while 4.13 merge window is
> open.
> 
> That said I don't have a use for idt and gdtr_limit that warrants
> having to receive it via the vm_event structure - those pieces are
> always just a hypercall away if needed. So in the vm_event structure I
> tend to just put the registers needed often enough to warrant avoiding
> that extra hypercall. But if Razvan says he has a use for them, I can
> add them here.

We do use them both - idtr and gdtr, both base and limit, but we are 
indeed getting them via hypercall now. Considering that, since we did 
add gdtr_base I think it would be great if we could also add gdtr_limit.

Adding idtr as well would _theoretically_ be a nice speed optimization 
(removing the need for a hypercall), but it might also be a speed 
pessimization generally applicable to increasing the vm_event size 
(since a VCPU with no more space left in the vm_event ring buffer will 
be blocked more and go through more locking logic).

My point is, at the moment it's fine to skip idtr if it's not required 
by Jan, but if we do add either then let's please add both _base and _limit.

In a hopefully near future we'll be able to use as much memory as 
necessary for storing vm_event data and just cache everything in the 
ring buffer, avoing all the "get context" hypercalls.


Thanks,
Razvan
Jan Beulich May 2, 2019, 1:30 p.m. UTC | #7
>>> On 02.05.19 at 15:09, <tamas@tklengyel.com> wrote:
> That said I don't have a use for idt and gdtr_limit that warrants
> having to receive it via the vm_event structure

So what use if the GDT base without the limit? Are you silently
assuming all presently loaded selectors are (still) within limits?

Jan
Jan Beulich May 2, 2019, 1:32 p.m. UTC | #8
>>> On 02.05.19 at 15:23, <rcojocaru@bitdefender.com> wrote:
> My point is, at the moment it's fine to skip idtr if it's not required 
> by Jan, but if we do add either then let's please add both _base and _limit.

No, it's not a requirement I mean to put up (and I'm not in the
position to either, as I'm not the maintainer of the interface).
It just seems inconsistent to me to have one but not the other.

Jan
Tamas K Lengyel May 2, 2019, 1:53 p.m. UTC | #9
On Thu, May 2, 2019 at 7:30 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 02.05.19 at 15:09, <tamas@tklengyel.com> wrote:
> > That said I don't have a use for idt and gdtr_limit that warrants
> > having to receive it via the vm_event structure
>
> So what use if the GDT base without the limit? Are you silently
> assuming all presently loaded selectors are (still) within limits?

On 32-bit Windows the KPCR's address is cached at gdtr_base + 0x30
while in ring3. In ring0 we can just use fs_base for that. At the
moment I still just cache the KPCR location on every MOV-TO-CR3 but
that became an issue with recent versions of Windows10 implementing
Meltdown mitigations because it leads to extreme performance
degradation in the guest (opening an app takes ~20s). So now I just
try to find the KPCR based on the registers reported in each vm_event.
We use the KPCR to quickly find thread/process base addresses to
gather info relevant to introspection.

Tamas

Tamas
diff mbox series

Patch

diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 51c3493b1d..873788e32f 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -179,6 +179,10 @@  static void vm_event_pack_segment_register(enum x86_segment segment,
         reg->es_sel = seg.sel;
         break;
 
+    case x86_seg_gdtr:
+        reg->gdtr_base = seg.base;
+        break;
+
     default:
         ASSERT_UNREACHABLE();
     }
@@ -238,6 +242,7 @@  void vm_event_fill_regs(vm_event_request_t *req)
     vm_event_pack_segment_register(x86_seg_ss, &req->data.regs.x86);
     vm_event_pack_segment_register(x86_seg_ds, &req->data.regs.x86);
     vm_event_pack_segment_register(x86_seg_es, &req->data.regs.x86);
+    vm_event_pack_segment_register(x86_seg_gdtr, &req->data.regs.x86);
 
     req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
     req->data.regs.x86.dr6 = ctxt.dr6;
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index b2bafc0d77..fd020c5ea7 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -29,7 +29,7 @@ 
 
 #include "xen.h"
 
-#define VM_EVENT_INTERFACE_VERSION 0x00000004
+#define VM_EVENT_INTERFACE_VERSION 0x00000005
 
 #if defined(__XEN__) || defined(__XEN_TOOLS__)
 
@@ -198,6 +198,7 @@  struct vm_event_regs_x86 {
     uint64_t msr_efer;
     uint64_t msr_star;
     uint64_t msr_lstar;
+    uint64_t gdtr_base;
     uint32_t cs_base;
     uint32_t ss_base;
     uint32_t ds_base;