Message ID | 20190501235731.1486-1-tamas@tklengyel.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86/vm_event: add gdtr_base to the vm_event structure | expand |
On 5/2/19 2:57 AM, Tamas K Lengyel wrote: > Receiving this register is useful for introspecting 32-bit Windows when the > event being trapped happened while in ring3. > > Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> > Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> > Cc: Tamas K Lengyel <tamas@tklengyel.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Roger Pau Monne <roger.pau@citrix.com> > --- > xen/arch/x86/vm_event.c | 5 +++++ > xen/include/public/vm_event.h | 3 ++- > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c > index 51c3493b1d..873788e32f 100644 > --- a/xen/arch/x86/vm_event.c > +++ b/xen/arch/x86/vm_event.c > @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment, > reg->es_sel = seg.sel; > break; > > + case x86_seg_gdtr: > + reg->gdtr_base = seg.base; > + break; Please also add limit, ar, sel, like the others do. In addition to making this modification look less strange (since, in contrast to the function name, nothing is packed for gdtr_base), it will also save future work for applications wanting to use gdtr which also require backwards compatibility with previous vm_event versions. As you know, for each such modification we need to have a separate vm_event_vX header in our applications so that we're ready to create a ring buffer using requests and replies of the right size, and also to be able to properly interpret the ring buffer data, so the least frequent changes to the vm_event struct, the better. Petre is currently working on the vm_event changes that will hopefully enable us to just cache everything that the getcontext_partial hypercall retrieves. Thanks, Razvan
On 02/05/2019 09:25, Razvan Cojocaru wrote: > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >> Receiving this register is useful for introspecting 32-bit Windows >> when the >> event being trapped happened while in ring3. >> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> >> Cc: Tamas K Lengyel <tamas@tklengyel.com> >> Cc: Jan Beulich <jbeulich@suse.com> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >> Cc: Wei Liu <wei.liu2@citrix.com> >> Cc: Roger Pau Monne <roger.pau@citrix.com> >> --- >> xen/arch/x86/vm_event.c | 5 +++++ >> xen/include/public/vm_event.h | 3 ++- >> 2 files changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >> index 51c3493b1d..873788e32f 100644 >> --- a/xen/arch/x86/vm_event.c >> +++ b/xen/arch/x86/vm_event.c >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum >> x86_segment segment, >> reg->es_sel = seg.sel; >> break; >> + case x86_seg_gdtr: >> + reg->gdtr_base = seg.base; >> + break; > > Please also add limit, ar, sel, like the others do. In Xen, we model GDTR/IDTR just like all other segments in the segment cache for convenience/consistency, including faking up of some default attributes. However, there is no such thing as a selector or access rights for them, and the VMCS lacks the fields, while the VMCB marks the fields as reserved. It is almost certainly not worth wasting the space in the ring. ~Andrew P.S. If you consult the state-after-reset table in both SDMs, it does state that Access Rights exist, and default to Present and R/W, but this isn't a field which can be changed at any point. I suspect real pipelines model GDTR/LDTR just like the other segments for simplification of the segmentation logic, which is why I chose to do the same in Xen.
>>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote: > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >> Receiving this register is useful for introspecting 32-bit Windows when the >> event being trapped happened while in ring3. >> >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> >> Cc: Tamas K Lengyel <tamas@tklengyel.com> >> Cc: Jan Beulich <jbeulich@suse.com> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >> Cc: Wei Liu <wei.liu2@citrix.com> >> Cc: Roger Pau Monne <roger.pau@citrix.com> >> --- >> xen/arch/x86/vm_event.c | 5 +++++ >> xen/include/public/vm_event.h | 3 ++- >> 2 files changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >> index 51c3493b1d..873788e32f 100644 >> --- a/xen/arch/x86/vm_event.c >> +++ b/xen/arch/x86/vm_event.c >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum > x86_segment segment, >> reg->es_sel = seg.sel; >> break; >> >> + case x86_seg_gdtr: >> + reg->gdtr_base = seg.base; >> + break; > > Please also add limit, ar, sel, like the others do. There's no ar or sel for GDT (and IDT). Instead, because of ... > In addition to > making this modification look less strange (since, in contrast to the > function name, nothing is packed for gdtr_base), it will also save > future work for applications wanting to use gdtr which also require > backwards compatibility with previous vm_event versions. > > As you know, for each such modification we need to have a separate > vm_event_vX header in our applications so that we're ready to create a > ring buffer using requests and replies of the right size, and also to be > able to properly interpret the ring buffer data, so the least frequent > changes to the vm_event struct, the better. ... this I wonder whether the IDT details shouldn't get added at the same time. Jan
On 5/2/19 11:52 AM, Andrew Cooper wrote: > On 02/05/2019 09:25, Razvan Cojocaru wrote: >> On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >>> Receiving this register is useful for introspecting 32-bit Windows >>> when the >>> event being trapped happened while in ring3. >>> >>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> >>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> >>> Cc: Tamas K Lengyel <tamas@tklengyel.com> >>> Cc: Jan Beulich <jbeulich@suse.com> >>> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >>> Cc: Wei Liu <wei.liu2@citrix.com> >>> Cc: Roger Pau Monne <roger.pau@citrix.com> >>> --- >>> xen/arch/x86/vm_event.c | 5 +++++ >>> xen/include/public/vm_event.h | 3 ++- >>> 2 files changed, 7 insertions(+), 1 deletion(-) >>> >>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >>> index 51c3493b1d..873788e32f 100644 >>> --- a/xen/arch/x86/vm_event.c >>> +++ b/xen/arch/x86/vm_event.c >>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum >>> x86_segment segment, >>> reg->es_sel = seg.sel; >>> break; >>> + case x86_seg_gdtr: >>> + reg->gdtr_base = seg.base; >>> + break; >> >> Please also add limit, ar, sel, like the others do. > > In Xen, we model GDTR/IDTR just like all other segments in the segment > cache for convenience/consistency, including faking up of some default > attributes. > > However, there is no such thing as a selector or access rights for them, > and the VMCS lacks the fields, while the VMCB marks the fields as > reserved. It is almost certainly not worth wasting the space in the ring. Right, I got carried away there: I saw gdtr_limit and didn't check that if the rest is available. Thanks, Razvan
On Thu, May 2, 2019 at 2:57 AM Jan Beulich <JBeulich@suse.com> wrote: > > >>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote: > > On 5/2/19 2:57 AM, Tamas K Lengyel wrote: > >> Receiving this register is useful for introspecting 32-bit Windows when the > >> event being trapped happened while in ring3. > >> > >> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> > >> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> > >> Cc: Tamas K Lengyel <tamas@tklengyel.com> > >> Cc: Jan Beulich <jbeulich@suse.com> > >> Cc: Andrew Cooper <andrew.cooper3@citrix.com> > >> Cc: Wei Liu <wei.liu2@citrix.com> > >> Cc: Roger Pau Monne <roger.pau@citrix.com> > >> --- > >> xen/arch/x86/vm_event.c | 5 +++++ > >> xen/include/public/vm_event.h | 3 ++- > >> 2 files changed, 7 insertions(+), 1 deletion(-) > >> > >> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c > >> index 51c3493b1d..873788e32f 100644 > >> --- a/xen/arch/x86/vm_event.c > >> +++ b/xen/arch/x86/vm_event.c > >> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum > > x86_segment segment, > >> reg->es_sel = seg.sel; > >> break; > >> > >> + case x86_seg_gdtr: > >> + reg->gdtr_base = seg.base; > >> + break; > > > > Please also add limit, ar, sel, like the others do. > > There's no ar or sel for GDT (and IDT). Instead, because of ... > > > In addition to > > making this modification look less strange (since, in contrast to the > > function name, nothing is packed for gdtr_base), it will also save > > future work for applications wanting to use gdtr which also require > > backwards compatibility with previous vm_event versions. > > > > As you know, for each such modification we need to have a separate > > vm_event_vX header in our applications so that we're ready to create a > > ring buffer using requests and replies of the right size, and also to be > > able to properly interpret the ring buffer data, so the least frequent > > changes to the vm_event struct, the better. > > ... this I wonder whether the IDT details shouldn't get added at > the same time. The churn of the header is not that big of a deal - I do the same in LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add (see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8). So that should not be a factor in deciding whether we add a needed extra piece or not. Since the vm_event ABI is changing for Xen 4.13 now the ABI version will only be bumped once for 4.13. So we should be able to add/remove/shuffle whatever we want while 4.13 merge window is open. That said I don't have a use for idt and gdtr_limit that warrants having to receive it via the vm_event structure - those pieces are always just a hypercall away if needed. So in the vm_event structure I tend to just put the registers needed often enough to warrant avoiding that extra hypercall. But if Razvan says he has a use for them, I can add them here. Tamas
On 5/2/19 4:09 PM, Tamas K Lengyel wrote: > On Thu, May 2, 2019 at 2:57 AM Jan Beulich <JBeulich@suse.com> wrote: >> >>>>> On 02.05.19 at 10:25, <rcojocaru@bitdefender.com> wrote: >>> On 5/2/19 2:57 AM, Tamas K Lengyel wrote: >>>> Receiving this register is useful for introspecting 32-bit Windows when the >>>> event being trapped happened while in ring3. >>>> >>>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> >>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> >>>> Cc: Tamas K Lengyel <tamas@tklengyel.com> >>>> Cc: Jan Beulich <jbeulich@suse.com> >>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com> >>>> Cc: Wei Liu <wei.liu2@citrix.com> >>>> Cc: Roger Pau Monne <roger.pau@citrix.com> >>>> --- >>>> xen/arch/x86/vm_event.c | 5 +++++ >>>> xen/include/public/vm_event.h | 3 ++- >>>> 2 files changed, 7 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c >>>> index 51c3493b1d..873788e32f 100644 >>>> --- a/xen/arch/x86/vm_event.c >>>> +++ b/xen/arch/x86/vm_event.c >>>> @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum >>> x86_segment segment, >>>> reg->es_sel = seg.sel; >>>> break; >>>> >>>> + case x86_seg_gdtr: >>>> + reg->gdtr_base = seg.base; >>>> + break; >>> >>> Please also add limit, ar, sel, like the others do. >> >> There's no ar or sel for GDT (and IDT). Instead, because of ... >> >>> In addition to >>> making this modification look less strange (since, in contrast to the >>> function name, nothing is packed for gdtr_base), it will also save >>> future work for applications wanting to use gdtr which also require >>> backwards compatibility with previous vm_event versions. >>> >>> As you know, for each such modification we need to have a separate >>> vm_event_vX header in our applications so that we're ready to create a >>> ring buffer using requests and replies of the right size, and also to be >>> able to properly interpret the ring buffer data, so the least frequent >>> changes to the vm_event struct, the better. >> >> ... this I wonder whether the IDT details shouldn't get added at >> the same time. > > The churn of the header is not that big of a deal - I do the same in > LibVMI and it's a largely copy-paste job that takes perhaps ~5m to add > (see https://github.com/tklengyel/libvmi/commit/7509ce56d0408dbec4e374b8780da69a7bd212e8). > So that should not be a factor in deciding whether we add a needed > extra piece or not. Since the vm_event ABI is changing for Xen 4.13 > now the ABI version will only be bumped once for 4.13. So we should be > able to add/remove/shuffle whatever we want while 4.13 merge window is > open. > > That said I don't have a use for idt and gdtr_limit that warrants > having to receive it via the vm_event structure - those pieces are > always just a hypercall away if needed. So in the vm_event structure I > tend to just put the registers needed often enough to warrant avoiding > that extra hypercall. But if Razvan says he has a use for them, I can > add them here. We do use them both - idtr and gdtr, both base and limit, but we are indeed getting them via hypercall now. Considering that, since we did add gdtr_base I think it would be great if we could also add gdtr_limit. Adding idtr as well would _theoretically_ be a nice speed optimization (removing the need for a hypercall), but it might also be a speed pessimization generally applicable to increasing the vm_event size (since a VCPU with no more space left in the vm_event ring buffer will be blocked more and go through more locking logic). My point is, at the moment it's fine to skip idtr if it's not required by Jan, but if we do add either then let's please add both _base and _limit. In a hopefully near future we'll be able to use as much memory as necessary for storing vm_event data and just cache everything in the ring buffer, avoing all the "get context" hypercalls. Thanks, Razvan
>>> On 02.05.19 at 15:09, <tamas@tklengyel.com> wrote: > That said I don't have a use for idt and gdtr_limit that warrants > having to receive it via the vm_event structure So what use if the GDT base without the limit? Are you silently assuming all presently loaded selectors are (still) within limits? Jan
>>> On 02.05.19 at 15:23, <rcojocaru@bitdefender.com> wrote: > My point is, at the moment it's fine to skip idtr if it's not required > by Jan, but if we do add either then let's please add both _base and _limit. No, it's not a requirement I mean to put up (and I'm not in the position to either, as I'm not the maintainer of the interface). It just seems inconsistent to me to have one but not the other. Jan
On Thu, May 2, 2019 at 7:30 AM Jan Beulich <JBeulich@suse.com> wrote: > > >>> On 02.05.19 at 15:09, <tamas@tklengyel.com> wrote: > > That said I don't have a use for idt and gdtr_limit that warrants > > having to receive it via the vm_event structure > > So what use if the GDT base without the limit? Are you silently > assuming all presently loaded selectors are (still) within limits? On 32-bit Windows the KPCR's address is cached at gdtr_base + 0x30 while in ring3. In ring0 we can just use fs_base for that. At the moment I still just cache the KPCR location on every MOV-TO-CR3 but that became an issue with recent versions of Windows10 implementing Meltdown mitigations because it leads to extreme performance degradation in the guest (opening an app takes ~20s). So now I just try to find the KPCR based on the registers reported in each vm_event. We use the KPCR to quickly find thread/process base addresses to gather info relevant to introspection. Tamas Tamas
diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 51c3493b1d..873788e32f 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -179,6 +179,10 @@ static void vm_event_pack_segment_register(enum x86_segment segment, reg->es_sel = seg.sel; break; + case x86_seg_gdtr: + reg->gdtr_base = seg.base; + break; + default: ASSERT_UNREACHABLE(); } @@ -238,6 +242,7 @@ void vm_event_fill_regs(vm_event_request_t *req) vm_event_pack_segment_register(x86_seg_ss, &req->data.regs.x86); vm_event_pack_segment_register(x86_seg_ds, &req->data.regs.x86); vm_event_pack_segment_register(x86_seg_es, &req->data.regs.x86); + vm_event_pack_segment_register(x86_seg_gdtr, &req->data.regs.x86); req->data.regs.x86.shadow_gs = ctxt.shadow_gs; req->data.regs.x86.dr6 = ctxt.dr6; diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h index b2bafc0d77..fd020c5ea7 100644 --- a/xen/include/public/vm_event.h +++ b/xen/include/public/vm_event.h @@ -29,7 +29,7 @@ #include "xen.h" -#define VM_EVENT_INTERFACE_VERSION 0x00000004 +#define VM_EVENT_INTERFACE_VERSION 0x00000005 #if defined(__XEN__) || defined(__XEN_TOOLS__) @@ -198,6 +198,7 @@ struct vm_event_regs_x86 { uint64_t msr_efer; uint64_t msr_star; uint64_t msr_lstar; + uint64_t gdtr_base; uint32_t cs_base; uint32_t ss_base; uint32_t ds_base;
Receiving this register is useful for introspecting 32-bit Windows when the event being trapped happened while in ring3. Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com> Cc: Tamas K Lengyel <tamas@tklengyel.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Roger Pau Monne <roger.pau@citrix.com> --- xen/arch/x86/vm_event.c | 5 +++++ xen/include/public/vm_event.h | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-)