Message ID | 20241101185031.1799556-2-seanjc@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | KVM: VMX: Mark Intel PT virtualization as BROKEN | expand |
On 11/2/2024 2:50 AM, Sean Christopherson wrote: > Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support > for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are > myriad bugs in the implementation, some of which are fatal to the guest, > and others which put the stability and health of the host at risk. > > For guest fatalities, the most glaring issue is that KVM fails to ensure > tracing is disabled, and *stays* disabled prior to VM-Enter, which is > necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing > is enabled (enforced via a VMX consistency check). Per the SDM: > > If the logical processor is operating with Intel PT enabled (if > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load > IA32_RTIT_CTL" VM-entry control must be 0. > > On the host side, KVM doesn't validate the guest CPUID configuration > provided by userspace, and even worse, uses the guest configuration to > decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring > guest CPUID to enumerate more address ranges than are supported in hardware > will result in KVM trying to passthrough, save, and load non-existent MSRs, > which generates a variety of WARNs, ToPA ERRORs in the host, a potential > deadlock, etc. > > Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") > Cc: stable@vger.kernel.org > Cc: Adrian Hunter <adrian.hunter@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/vmx/vmx.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 6ed801ffe33f..087504fb1589 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); > static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; > module_param(ple_window_max, uint, 0444); > > -/* Default is SYSTEM mode, 1 for host-guest mode */ > +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ > int __read_mostly pt_mode = PT_MODE_SYSTEM; > +#ifdef CONFIG_BROKEN > module_param(pt_mode, int, S_IRUGO); > +#endif I like the patch, but I didn't find any other usercase of CONFIG_BROKEN in current Linux. > struct x86_pmu_lbr __ro_after_init vmx_lbr_caps; >
On 1/11/24 20:50, Sean Christopherson wrote: > Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support > for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are > myriad bugs in the implementation, some of which are fatal to the guest, > and others which put the stability and health of the host at risk. > > For guest fatalities, the most glaring issue is that KVM fails to ensure > tracing is disabled, and *stays* disabled prior to VM-Enter, which is > necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing > is enabled (enforced via a VMX consistency check). Per the SDM: > > If the logical processor is operating with Intel PT enabled (if > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load > IA32_RTIT_CTL" VM-entry control must be 0. > > On the host side, KVM doesn't validate the guest CPUID configuration > provided by userspace, and even worse, uses the guest configuration to > decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring > guest CPUID to enumerate more address ranges than are supported in hardware > will result in KVM trying to passthrough, save, and load non-existent MSRs, > which generates a variety of WARNs, ToPA ERRORs in the host, a potential > deadlock, etc. > > Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") > Cc: stable@vger.kernel.org > Cc: Adrian Hunter <adrian.hunter@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/vmx/vmx.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 6ed801ffe33f..087504fb1589 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); > static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; > module_param(ple_window_max, uint, 0444); > > -/* Default is SYSTEM mode, 1 for host-guest mode */ > +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ > int __read_mostly pt_mode = PT_MODE_SYSTEM; > +#ifdef CONFIG_BROKEN > module_param(pt_mode, int, S_IRUGO); > +#endif Side effects are: 1. If pt_mode is passed via modprobe, there will be a warning in kernel messages: kvm_intel: unknown parameter 'pt_mode' ignored 2. The sysfs module parameter file pt_mode will be gone: # cat /sys/module/kvm_intel/parameters/pt_mode cat: /sys/module/kvm_intel/parameters/pt_mode: No such file or directory Nevertheless: Tested-by: Adrian Hunter <adrian.hunter@intel.com> > > struct x86_pmu_lbr __ro_after_init vmx_lbr_caps; >
On Mon, Nov 04, 2024, Xiaoyao Li wrote: > On 11/2/2024 2:50 AM, Sean Christopherson wrote: > > Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support > > for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are > > myriad bugs in the implementation, some of which are fatal to the guest, > > and others which put the stability and health of the host at risk. > > > > For guest fatalities, the most glaring issue is that KVM fails to ensure > > tracing is disabled, and *stays* disabled prior to VM-Enter, which is > > necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing > > is enabled (enforced via a VMX consistency check). Per the SDM: > > > > If the logical processor is operating with Intel PT enabled (if > > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load > > IA32_RTIT_CTL" VM-entry control must be 0. > > > > On the host side, KVM doesn't validate the guest CPUID configuration > > provided by userspace, and even worse, uses the guest configuration to > > decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring > > guest CPUID to enumerate more address ranges than are supported in hardware > > will result in KVM trying to passthrough, save, and load non-existent MSRs, > > which generates a variety of WARNs, ToPA ERRORs in the host, a potential > > deadlock, etc. > > > > Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") > > Cc: stable@vger.kernel.org > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > Signed-off-by: Sean Christopherson <seanjc@google.com> > > --- > > arch/x86/kvm/vmx/vmx.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index 6ed801ffe33f..087504fb1589 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); > > static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; > > module_param(ple_window_max, uint, 0444); > > -/* Default is SYSTEM mode, 1 for host-guest mode */ > > +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ > > int __read_mostly pt_mode = PT_MODE_SYSTEM; > > +#ifdef CONFIG_BROKEN > > module_param(pt_mode, int, S_IRUGO); > > +#endif > > I like the patch, but I didn't find any other usercase of CONFIG_BROKEN in > current Linux. Ya, BROKEN is typically used directly in Kconfigs, e.g. "depends on BROKEN". But I can't think of any reason using it in this way would be problematic.
On Mon, Nov 04, 2024, Adrian Hunter wrote: > On 1/11/24 20:50, Sean Christopherson wrote: > > Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support > > for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are > > myriad bugs in the implementation, some of which are fatal to the guest, > > and others which put the stability and health of the host at risk. > > > > For guest fatalities, the most glaring issue is that KVM fails to ensure > > tracing is disabled, and *stays* disabled prior to VM-Enter, which is > > necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing > > is enabled (enforced via a VMX consistency check). Per the SDM: > > > > If the logical processor is operating with Intel PT enabled (if > > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load > > IA32_RTIT_CTL" VM-entry control must be 0. > > > > On the host side, KVM doesn't validate the guest CPUID configuration > > provided by userspace, and even worse, uses the guest configuration to > > decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring > > guest CPUID to enumerate more address ranges than are supported in hardware > > will result in KVM trying to passthrough, save, and load non-existent MSRs, > > which generates a variety of WARNs, ToPA ERRORs in the host, a potential > > deadlock, etc. > > > > Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") > > Cc: stable@vger.kernel.org > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > Signed-off-by: Sean Christopherson <seanjc@google.com> > > --- > > arch/x86/kvm/vmx/vmx.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index 6ed801ffe33f..087504fb1589 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); > > static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; > > module_param(ple_window_max, uint, 0444); > > > > -/* Default is SYSTEM mode, 1 for host-guest mode */ > > +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ > > int __read_mostly pt_mode = PT_MODE_SYSTEM; > > +#ifdef CONFIG_BROKEN > > module_param(pt_mode, int, S_IRUGO); > > +#endif > > Side effects are: > 1. If pt_mode is passed via modprobe, there will be a warning in kernel messages: > kvm_intel: unknown parameter 'pt_mode' ignored This is more of a feature in this case, as it's a non-fatal way of alerting the user that trying to enable PT virtualization won't work. > 2. The sysfs module parameter file pt_mode will be gone: > # cat /sys/module/kvm_intel/parameters/pt_mode > cat: /sys/module/kvm_intel/parameters/pt_mode: No such file or directory Hrm, this could be slightly more problematic, e.g. if userspace were asserting on the state of the parameter. But AFAIK, module params aren't considered ABI. Paolo, any thoughts on how best to handle this?
On 11/5/2024 6:46 AM, Sean Christopherson wrote: > On Mon, Nov 04, 2024, Xiaoyao Li wrote: >> On 11/2/2024 2:50 AM, Sean Christopherson wrote: >>> Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support >>> for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are >>> myriad bugs in the implementation, some of which are fatal to the guest, >>> and others which put the stability and health of the host at risk. >>> >>> For guest fatalities, the most glaring issue is that KVM fails to ensure >>> tracing is disabled, and *stays* disabled prior to VM-Enter, which is >>> necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing >>> is enabled (enforced via a VMX consistency check). Per the SDM: >>> >>> If the logical processor is operating with Intel PT enabled (if >>> IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load >>> IA32_RTIT_CTL" VM-entry control must be 0. >>> >>> On the host side, KVM doesn't validate the guest CPUID configuration >>> provided by userspace, and even worse, uses the guest configuration to >>> decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring >>> guest CPUID to enumerate more address ranges than are supported in hardware >>> will result in KVM trying to passthrough, save, and load non-existent MSRs, >>> which generates a variety of WARNs, ToPA ERRORs in the host, a potential >>> deadlock, etc. >>> >>> Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") >>> Cc: stable@vger.kernel.org >>> Cc: Adrian Hunter <adrian.hunter@intel.com> >>> Signed-off-by: Sean Christopherson <seanjc@google.com> >>> --- >>> arch/x86/kvm/vmx/vmx.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c >>> index 6ed801ffe33f..087504fb1589 100644 >>> --- a/arch/x86/kvm/vmx/vmx.c >>> +++ b/arch/x86/kvm/vmx/vmx.c >>> @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); >>> static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; >>> module_param(ple_window_max, uint, 0444); >>> -/* Default is SYSTEM mode, 1 for host-guest mode */ >>> +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ >>> int __read_mostly pt_mode = PT_MODE_SYSTEM; >>> +#ifdef CONFIG_BROKEN >>> module_param(pt_mode, int, S_IRUGO); >>> +#endif >> >> I like the patch, but I didn't find any other usercase of CONFIG_BROKEN in >> current Linux. > > Ya, BROKEN is typically used directly in Kconfigs, e.g. "depends on BROKEN". But > I can't think of any reason using it in this way would be problematic. I see. Thanks for the information! Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
On 11/1/24 19:50, Sean Christopherson wrote: > Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support > for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are > myriad bugs in the implementation, some of which are fatal to the guest, > and others which put the stability and health of the host at risk. > > For guest fatalities, the most glaring issue is that KVM fails to ensure > tracing is disabled, and *stays* disabled prior to VM-Enter, which is > necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing > is enabled (enforced via a VMX consistency check). Per the SDM: > > If the logical processor is operating with Intel PT enabled (if > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load > IA32_RTIT_CTL" VM-entry control must be 0. > > On the host side, KVM doesn't validate the guest CPUID configuration > provided by userspace, and even worse, uses the guest configuration to > decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring > guest CPUID to enumerate more address ranges than are supported in hardware > will result in KVM trying to passthrough, save, and load non-existent MSRs, > which generates a variety of WARNs, ToPA ERRORs in the host, a potential > deadlock, etc. > > Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") > Cc: stable@vger.kernel.org > Cc: Adrian Hunter <adrian.hunter@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/vmx/vmx.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 6ed801ffe33f..087504fb1589 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); > static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; > module_param(ple_window_max, uint, 0444); > > -/* Default is SYSTEM mode, 1 for host-guest mode */ > +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ > int __read_mostly pt_mode = PT_MODE_SYSTEM; > +#ifdef CONFIG_BROKEN > module_param(pt_mode, int, S_IRUGO); > +#endif > > struct x86_pmu_lbr __ro_after_init vmx_lbr_caps; > Applied, thanks. Paolo
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6ed801ffe33f..087504fb1589 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -217,9 +217,11 @@ module_param(ple_window_shrink, uint, 0444); static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; module_param(ple_window_max, uint, 0444); -/* Default is SYSTEM mode, 1 for host-guest mode */ +/* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ int __read_mostly pt_mode = PT_MODE_SYSTEM; +#ifdef CONFIG_BROKEN module_param(pt_mode, int, S_IRUGO); +#endif struct x86_pmu_lbr __ro_after_init vmx_lbr_caps;
Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are myriad bugs in the implementation, some of which are fatal to the guest, and others which put the stability and health of the host at risk. For guest fatalities, the most glaring issue is that KVM fails to ensure tracing is disabled, and *stays* disabled prior to VM-Enter, which is necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing is enabled (enforced via a VMX consistency check). Per the SDM: If the logical processor is operating with Intel PT enabled (if IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load IA32_RTIT_CTL" VM-entry control must be 0. On the host side, KVM doesn't validate the guest CPUID configuration provided by userspace, and even worse, uses the guest configuration to decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring guest CPUID to enumerate more address ranges than are supported in hardware will result in KVM trying to passthrough, save, and load non-existent MSRs, which generates a variety of WARNs, ToPA ERRORs in the host, a potential deadlock, etc. Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode") Cc: stable@vger.kernel.org Cc: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/vmx/vmx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)