Message ID | 20240509075423.156858-1-weijiang.yang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC,1/2] KVM: x86: Introduce KVM_{G,S}ET_ONE_REG uAPIs support |
On Thu, May 09, 2024, Yang Weijiang wrote:
> @@ -5859,6 +5884,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
>  	}
>  }
>
> +static int kvm_translate_synthetic_msr(u32 *index)
> +{
> +	return 0;

This needs to be -EINVAL.

> +}
> +
>  long kvm_arch_vcpu_ioctl(struct file *filp,
>  			 unsigned int ioctl, unsigned long arg)
>  {
On 6/11/2024 9:04 AM, Sean Christopherson wrote:
> On Thu, May 09, 2024, Yang Weijiang wrote:
>> @@ -5859,6 +5884,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
>>  	}
>>  }
>>
>> +static int kvm_translate_synthetic_msr(u32 *index)
>> +{
>> +	return 0;
> This needs to be -EINVAL.

OK, I'll change it, thanks!
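Editor's note on the review comment above: with no synthetic MSRs defined yet, a translation stub that returns 0 would let any synthetic index fall through and be used as a hardware MSR index. The following is a minimal userspace sketch of the fail-closed behavior, not the kernel code. The names `sk_translate_synthetic_msr` and `sk_lookup_reg` are hypothetical, and the type tags simply reuse the RFC's values.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical type tags for this sketch, mirroring the RFC's values. */
enum { SK_REG_MSR = 1 << 2, SK_REG_SYNTHETIC_MSR = 1 << 3 };

/*
 * No synthetic MSRs are defined yet, so every index is unknown.
 * Returning -EINVAL keeps the caller from reusing *index as a
 * hardware MSR number.
 */
static inline int sk_translate_synthetic_msr(uint32_t *index)
{
	(void)index;
	return -EINVAL;
}

/* Returns 0 and stores the hardware MSR index, or a negative errno. */
static inline int sk_lookup_reg(uint8_t type, uint32_t index,
				uint32_t *hw_index)
{
	int r;

	if (type != SK_REG_MSR && type != SK_REG_SYNTHETIC_MSR)
		return -EINVAL;

	if (type == SK_REG_SYNTHETIC_MSR) {
		r = sk_translate_synthetic_msr(&index);
		if (r)
			return r;
	}

	*hw_index = index;
	return 0;
}
```

With the stub failing closed, `sk_lookup_reg(SK_REG_SYNTHETIC_MSR, 0, &hw)` is rejected instead of silently resolving to hardware MSR 0.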
On Thu May 9, 2024 at 09:54 AM UTC+0200, Yang Weijiang wrote:
> Enable KVM_{G,S}ET_ONE_REG uAPIs so that userspace can access HW MSR or
> KVM synthetic MSR through it.
>
> In CET KVM series [*], KVM "steals" an MSR from PV MSR space and accesses
> it via KVM_{G,S}ET_MSRs uAPIs, but the approach pollutes PV MSR space
> and hides the difference of synthetic MSRs and normal HW defined MSRs.
>
> Now carve out a separate room in KVM-customized MSR address space for
> synthetic MSRs. The synthetic MSRs are not exposed to userspace via
> KVM_GET_MSR_INDEX_LIST, instead userspace complies with KVM's setup and
> composes the uAPI params. KVM synthetic MSR indices start from 0 and
> increase linearly. Userspace caller should tag MSR type correctly in
> order to access intended HW or synthetic MSR.
>
> [*]:
> https://lore.kernel.org/all/20240219074733.122080-18-weijiang.yang@intel.com/
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>

Having this API, and specifically having a definite kvm_one_reg structure
for x86 registers, would be interesting for register pinning/intercepts.
With one_reg for x86 the API could be platform agnostic and possibly even
replace MSR filters for x86.

I do have a couple of questions about these patches.

> ---
>  arch/x86/include/uapi/asm/kvm.h | 10 ++++++
>  arch/x86/kvm/x86.c              | 62 +++++++++++++++++++++++++++++++++
>  2 files changed, 72 insertions(+)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index ef11aa4cab42..ca2a47a85fa1 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -410,6 +410,16 @@ struct kvm_xcrs {
>  	__u64 padding[16];
>  };
>
> +#define KVM_X86_REG_MSR			(1 << 2)
> +#define KVM_X86_REG_SYNTHETIC_MSR	(1 << 3)

Why is this a bitfield? As opposed to just counting up?

#define KVM_X86_REG_MSR			2
#define KVM_X86_REG_SYNTHETIC_MSR	3

> +
> +struct kvm_x86_reg_id {
> +	__u32 index;
> +	__u8 type;
> +	__u8 rsvd;
> +	__u16 rsvd16;
> +};

This struct is opposite to what other architectures do, where they have
an architecture ID in the upper 32 bits, and the lower 32 bits actually
identify the register. This would probably make sense for x86 too, to
avoid conflicts with other IDs (I think MIPS core registers can have IDs
with the lower 32 bits all zero) so that the IDs are actually unique,
right?

Best,
Nikolas

Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
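Editor's note, for context on the ID layout mentioned above: the generic `KVM_REG_ARCH_MASK` and `KVM_REG_SIZE_MASK` definitions in `include/uapi/linux/kvm.h` place an architecture tag in the top byte of the 64-bit register ID and a log2 size field directly below it, leaving the low bits to each architecture. The sketch below mirrors that layout under local `SK_`/`sk_` names. Those names and the helper functions are hypothetical, and the numeric values are copied from the uAPI header as best I know it, so check them against your kernel tree.

```c
#include <stdint.h>

/* Local mirrors of the generic one_reg masks (values are assumptions). */
#define SK_REG_ARCH_MASK	0xff00000000000000ULL
#define SK_REG_SIZE_SHIFT	52
#define SK_REG_SIZE_MASK	0x00f0000000000000ULL
#define SK_REG_SIZE_U64		0x0030000000000000ULL

/* Architecture tags as used by arm64 and MIPS (values are assumptions). */
#define SK_REG_ARM64		0x6000000000000000ULL
#define SK_REG_MIPS		0x7000000000000000ULL

/* Architecture tag: the top byte of the ID. */
static inline uint64_t sk_reg_arch(uint64_t id)
{
	return id & SK_REG_ARCH_MASK;
}

/* Register size in bytes: the size field stores log2 of the byte count. */
static inline unsigned int sk_reg_size_bytes(uint64_t id)
{
	return 1u << (unsigned int)((id & SK_REG_SIZE_MASK) >> SK_REG_SIZE_SHIFT);
}

/* Architecture-specific part carried in the low 32 bits. */
static inline uint32_t sk_reg_low32(uint64_t id)
{
	return (uint32_t)id;
}
```

Because the architecture tag is part of the ID, two architectures can reuse the same low 32 bits without the full 64-bit IDs colliding.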
On Wed, Sep 11, 2024, Nikolas Wipper wrote:
> On Thu May 9, 2024 at 09:54 AM UTC+0200, Yang Weijiang wrote:
> > Enable KVM_{G,S}ET_ONE_REG uAPIs so that userspace can access HW MSR or
> > KVM synthetic MSR through it.
> >
> > In CET KVM series [*], KVM "steals" an MSR from PV MSR space and accesses
> > it via KVM_{G,S}ET_MSRs uAPIs, but the approach pollutes PV MSR space
> > and hides the difference of synthetic MSRs and normal HW defined MSRs.
> >
> > Now carve out a separate room in KVM-customized MSR address space for
> > synthetic MSRs. The synthetic MSRs are not exposed to userspace via
> > KVM_GET_MSR_INDEX_LIST, instead userspace complies with KVM's setup and
> > composes the uAPI params. KVM synthetic MSR indices start from 0 and
> > increase linearly. Userspace caller should tag MSR type correctly in
> > order to access intended HW or synthetic MSR.
> >
> > [*]:
> > https://lore.kernel.org/all/20240219074733.122080-18-weijiang.yang@intel.com/
> >
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>
> Having this API, and specifically having a definite kvm_one_reg structure
> for x86 registers, would be interesting for register pinning/intercepts.
> With one_reg for x86 the API could be platform agnostic and possibly even
> replace MSR filters for x86.

I don't follow. MSR filters let userspace intercept accesses for a variety of
reasons, these APIs simply provide a way to read/write a register value that is
stored in KVM. I don't see how this could replace MSR filters.

> I do have a couple of questions about these patches.
>
> > ---
> >  arch/x86/include/uapi/asm/kvm.h | 10 ++++++
> >  arch/x86/kvm/x86.c              | 62 +++++++++++++++++++++++++++++++++
> >  2 files changed, 72 insertions(+)
> >
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index ef11aa4cab42..ca2a47a85fa1 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -410,6 +410,16 @@ struct kvm_xcrs {
> >  	__u64 padding[16];
> >  };
> >
> > +#define KVM_X86_REG_MSR			(1 << 2)
> > +#define KVM_X86_REG_SYNTHETIC_MSR	(1 << 3)
>
> Why is this a bitfield? As opposed to just counting up?

Hmm, good question. This came from my initial sketch, and it would seem that I
had something specific in mind since starting at (1 << 2) is oddly specific, but
for the life of me I can't remember what the plan was. Best guess is that I was
leaving space for '0' and '1' to be regs and sregs? But that still doesn't
explain/justify using a bitfield.

[*] https://lore.kernel.org/all/ZjLE7giCsEI4Sftp@google.com

>
> #define KVM_X86_REG_MSR			2
> #define KVM_X86_REG_SYNTHETIC_MSR	3
>
> > +
> > +struct kvm_x86_reg_id {
> > +	__u32 index;
> > +	__u8 type;
> > +	__u8 rsvd;
> > +	__u16 rsvd16;
> > +};
>
> This struct is opposite to what other architectures do, where they have
> an architecture ID in the upper 32 bits, and the lower 32 bits actually
> identify the register. This would probably make sense for x86 too, to
> avoid conflicts with other IDs (I think MIPS core registers can have IDs
> with the lower 32 bits all zero) so that the IDs are actually unique,
> right?

It's not the opposite, it's just missing fields for the arch and the size. Ugh,
the size is unaligned. That's annoying. Something like this?

  struct kvm_x86_reg_id {
	__u32 index;
	__u8 type;
	__u8 rsvd;
	__u8 rsvd4:4;
	__u8 size:4;
	__u8 x86;
  };

Though looking at this with fresh eyes, I don't think the above structure should
be exposed to userspace. Userspace will only ever want to encode a register; the
exact register may not be hardcoded, but I would expect the type to always be
known ahead of time, if not outright hardcoded. The struct is really only useful
for the kernel, e.g. to easily switch on the type, extract the index, etc.

As annoying as it can be for a human to decipher the final value, the arm64/riscv
approach of providing builders is probably the way to go, though I think x86 can
be much simpler (less stuff to encode).

Oh! Another thing I think we should do is make KVM_{G,S}ET_ONE_REG 64-bit only
so that we don't have to deal with 32-bit vs. 64-bit GPRs. 32-bit userspace would
need to manually encode the register id, but I have no problem making life
difficult for such setups. Or KVM could reject the ioctl for .compat_ioctl(), but
that seems unnecessary.

E.g. since IIUC switch() and if() statements are off-limits in uapi headers...

  #define KVM_X86_REG_TYPE_MSR	2ull

  #define KVM_X86_REG_TYPE_SIZE(type)						\
  ({										\
	__u64 type_size = type;							\
										\
	type_size |= type == KVM_X86_REG_TYPE_MSR ? KVM_REG_SIZE_U64 :		\
		     type == KVM_X86_REG_TYPE_SYNTHETIC_MSR ? KVM_REG_SIZE_U64 :\
		     0;								\
	type_size;								\
  })

  #define KVM_X86_REG_ENCODE(type, index)				\
	(KVM_REG_X86 | KVM_X86_REG_TYPE_SIZE(type) | index)

  #define KVM_X86_REG_MSR(index) KVM_X86_REG_ENCODE(KVM_X86_REG_TYPE_MSR, index)
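Editor's note: the builder idea sketched in the macros above can be tried out in plain userspace C. This is a minimal sketch under stated assumptions, not the eventual uAPI: all `SK_`/`sk_` names are hypothetical, the type value 3 for synthetic MSRs follows the counting-up suggestion earlier in the thread, the type is assumed to live in bits 32-39 (directly above the 32-bit index), and the x86 architecture tag and U64 size values are copied from `include/uapi/linux/kvm.h` as best I know them. Inline functions are used here because the restriction on if() applies to uapi headers, not to a standalone sketch.

```c
#include <stdint.h>

#define SK_REG_X86		0x2000000000000000ULL	/* assumed arch tag */
#define SK_REG_SIZE_U64		0x0030000000000000ULL	/* assumed size field */

#define SK_X86_TYPE_SHIFT	32	/* assumption: type sits above the index */
#define SK_X86_TYPE_MSR		2ULL
#define SK_X86_TYPE_SYNTH_MSR	3ULL	/* assumption: counting up from MSR */

/* Combine the type field with the size implied by that type. */
static inline uint64_t sk_x86_type_size(uint64_t type)
{
	uint64_t v = type << SK_X86_TYPE_SHIFT;

	/* Both MSR flavors are 64-bit registers; unknown types get no size. */
	if (type == SK_X86_TYPE_MSR || type == SK_X86_TYPE_SYNTH_MSR)
		v |= SK_REG_SIZE_U64;
	return v;
}

/* Build a full 64-bit register ID from a type and a 32-bit index. */
static inline uint64_t sk_x86_reg_encode(uint64_t type, uint32_t index)
{
	return SK_REG_X86 | sk_x86_type_size(type) | index;
}

/* Convenience builder for a hardware MSR. */
static inline uint64_t sk_x86_reg_msr(uint32_t index)
{
	return sk_x86_reg_encode(SK_X86_TYPE_MSR, index);
}
```

Under these assumptions, userspace would pass the result of `sk_x86_reg_msr(index)` as the `id` of a `struct kvm_one_reg` and never touch the individual fields.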
On Wed Sep 11, 2024 at 04:36 PM UTC+0200, Sean Christopherson wrote:
> On Wed, Sep 11, 2024, Nikolas Wipper wrote:
>> Having this API, and specifically having a definite kvm_one_reg structure
>> for x86 registers, would be interesting for register pinning/intercepts.
>> With one_reg for x86 the API could be platform agnostic and possibly even
>> replace MSR filters for x86.
>
> I don't follow. MSR filters let userspace intercept accesses for a variety of
> reasons, these APIs simply provide a way to read/write a register value that is
> stored in KVM. I don't see how this could replace MSR filters.

Nope, that would be an entirely different API, but if that uses one reg IDs it
could be unified to cover CRs and MSRs all in one.
On Wed, Sep 11, 2024, Nikolas Wipper wrote:
> On Wed Sep 11, 2024 at 04:36 PM UTC+0200, Sean Christopherson wrote:
> > On Wed, Sep 11, 2024, Nikolas Wipper wrote:
> >> Having this API, and specifically having a definite kvm_one_reg structure
> >> for x86 registers, would be interesting for register pinning/intercepts.
> >> With one_reg for x86 the API could be platform agnostic and possibly even
> >> replace MSR filters for x86.
> >
> > I don't follow. MSR filters let userspace intercept accesses for a variety of
> > reasons, these APIs simply provide a way to read/write a register value that is
> > stored in KVM. I don't see how this could replace MSR filters.
>
> Nope, that would be an entirely different API, but if that uses one reg IDs it
> could be unified to cover CRs and MSRs all in one.

Oooh, gotcha. Yeah, uniquely identifiable registers would allow for a generic
filtering API, though I'm not entirely sure that's actually a good idea in the
long run. Most x86 registers can't be intercepted; having a generic filtering
API might incur an annoyingly high maintenance cost.

Hmm, though it should be easy enough to explicitly allow only MSR and CR types,
so if/when we get to the point where CR pinning/filtering is desirable/ready,
then a unified API probably does make sense.
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index ef11aa4cab42..ca2a47a85fa1 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -410,6 +410,16 @@ struct kvm_xcrs {
 	__u64 padding[16];
 };
 
+#define KVM_X86_REG_MSR			(1 << 2)
+#define KVM_X86_REG_SYNTHETIC_MSR	(1 << 3)
+
+struct kvm_x86_reg_id {
+	__u32 index;
+	__u8 type;
+	__u8 rsvd;
+	__u16 rsvd16;
+};
+
 #define KVM_SYNC_X86_REGS      (1UL << 0)
 #define KVM_SYNC_X86_SREGS     (1UL << 1)
 #define KVM_SYNC_X86_EVENTS    (1UL << 2)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91478b769af0..d0054c52f24b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2244,6 +2244,31 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
 }
 
+static int kvm_get_one_msr(struct kvm_vcpu *vcpu, u32 msr, u64 __user *value)
+{
+	u64 val;
+	int r;
+
+	r = do_get_msr(vcpu, msr, &val);
+	if (r)
+		return r;
+
+	if (put_user(val, value))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int kvm_set_one_msr(struct kvm_vcpu *vcpu, u32 msr, u64 __user *value)
+{
+	u64 val;
+
+	if (get_user(val, value))
+		return -EFAULT;
+
+	return do_set_msr(vcpu, msr, &val);
+}
+
 #ifdef CONFIG_X86_64
 struct pvclock_clock {
 	int vclock_mode;
@@ -5859,6 +5884,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	}
 }
 
+static int kvm_translate_synthetic_msr(u32 *index)
+{
+	return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -5976,6 +6006,38 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		srcu_read_unlock(&vcpu->kvm->srcu, idx);
 		break;
 	}
+	case KVM_GET_ONE_REG:
+	case KVM_SET_ONE_REG: {
+		struct kvm_x86_reg_id *id;
+		struct kvm_one_reg reg;
+		u64 __user *value;
+
+		r = -EFAULT;
+		if (copy_from_user(&reg, argp, sizeof(reg)))
+			break;
+
+		r = -EINVAL;
+		id = (struct kvm_x86_reg_id *)&reg.id;
+		if (id->rsvd || id->rsvd16)
+			break;
+
+		if (id->type != KVM_X86_REG_MSR &&
+		    id->type != KVM_X86_REG_SYNTHETIC_MSR)
+			break;
+
+		if (id->type == KVM_X86_REG_SYNTHETIC_MSR) {
+			r = kvm_translate_synthetic_msr(&id->index);
+			if (r)
+				break;
+		}
+
+		value = u64_to_user_ptr(reg.addr);
+		if (ioctl == KVM_GET_ONE_REG)
+			r = kvm_get_one_msr(vcpu, id->index, value);
+		else
+			r = kvm_set_one_msr(vcpu, id->index, value);
+		break;
+	}
 	case KVM_TPR_ACCESS_REPORTING: {
 		struct kvm_tpr_access_ctl tac;
 
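Editor's note: the ioctl path in the diff overlays `struct kvm_x86_reg_id` on the 64-bit `reg.id`. On little-endian x86 that places `index` in bits 0-31, `type` in bits 32-39, and the two reserved fields in bits 40-63. The sketch below decodes the same layout with explicit shifts and applies the same validation, so it can be compiled and tested in userspace. The `SK_`/`sk_` names are hypothetical and the type values are the RFC's; this is an illustration of the checks, not the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Type tags as defined in the RFC patch. */
#define SK_X86_REG_MSR			(1 << 2)
#define SK_X86_REG_SYNTHETIC_MSR	(1 << 3)

/* Decoded view of a register ID (hypothetical helper type). */
struct sk_reg_fields {
	uint32_t index;
	uint8_t type;
};

/*
 * Decode and validate a register ID laid out as in the RFC:
 * bits 0-31 index, bits 32-39 type, bits 40-63 reserved.
 * Returns 0 on success or -EINVAL, matching the ioctl's checks.
 */
static inline int sk_decode_reg_id(uint64_t id, struct sk_reg_fields *out)
{
	uint32_t rsvd = (uint32_t)(id >> 40);	/* rsvd and rsvd16 together */
	uint8_t type = (uint8_t)(id >> 32);

	if (rsvd)
		return -EINVAL;

	if (type != SK_X86_REG_MSR && type != SK_X86_REG_SYNTHETIC_MSR)
		return -EINVAL;

	out->index = (uint32_t)id;
	out->type = type;
	return 0;
}
```

Note that with the bitfield-style type values, an ID carrying both bits (type 12) is rejected by the equality checks, which is one reason the thread questions whether a bitfield is the right encoding.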
Enable KVM_{G,S}ET_ONE_REG uAPIs so that userspace can access HW MSR or
KVM synthetic MSR through it.

In CET KVM series [*], KVM "steals" an MSR from PV MSR space and accesses
it via KVM_{G,S}ET_MSRs uAPIs, but the approach pollutes PV MSR space
and hides the difference of synthetic MSRs and normal HW defined MSRs.

Now carve out a separate room in KVM-customized MSR address space for
synthetic MSRs. The synthetic MSRs are not exposed to userspace via
KVM_GET_MSR_INDEX_LIST, instead userspace complies with KVM's setup and
composes the uAPI params. KVM synthetic MSR indices start from 0 and
increase linearly. Userspace caller should tag MSR type correctly in
order to access intended HW or synthetic MSR.

[*]:
https://lore.kernel.org/all/20240219074733.122080-18-weijiang.yang@intel.com/

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/uapi/asm/kvm.h | 10 ++++++
 arch/x86/kvm/x86.c              | 62 +++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)