diff mbox series

x86: kvm: Demote level of already loaded message from error to info

Message ID 20210818114956.7171-1-pmenzel@molgen.mpg.de (mailing list archive)
State New, archived
Headers show
Series x86: kvm: Demote level of already loaded message from error to info | expand

Commit Message

Paul Menzel Aug. 18, 2021, 11:49 a.m. UTC
In scripts, running

    modprobe kvm_amd     2>/dev/null
    modprobe kvm_intel   2>/dev/null

to ensure the modules are loaded causes Linux to log errors.

    $ dmesg --level=err
    [    0.641747] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x3a (or later)
    [   40.196868] kvm: already loaded the other module
    [   40.219857] kvm: already loaded the other module
    [   55.501362] kvm [1177]: vcpu0, guest rIP: 0xffffffff96e5b644 disabled perfctr wrmsr: 0xc2 data 0xffff
    [   56.397974] kvm [1418]: vcpu0, guest rIP: 0xffffffff81046158 disabled perfctr wrmsr: 0xc1 data 0xabcd
    [1007981.827781] kvm: already loaded the other module
    [1008000.394089] kvm: already loaded the other module
    [1008030.706999] kvm: already loaded the other module
    [1020396.054470] kvm: already loaded the other module
    [1020405.614774] kvm: already loaded the other module
    [1020410.140069] kvm: already loaded the other module
    [1020704.049231] kvm: already loaded the other module

As one of the two KVM modules is already loaded, KVM is functioning, and
their is no error condition. Therefore, demote the log message level to
informational.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Maciej S. Szmigiero Aug. 18, 2021, 1:36 p.m. UTC | #1
On 18.08.2021 13:49, Paul Menzel wrote:
> In scripts, running
> 
>      modprobe kvm_amd     2>/dev/null
>      modprobe kvm_intel   2>/dev/null
> 
> to ensure the modules are loaded causes Linux to log errors.
> 
>      $ dmesg --level=err
>      [    0.641747] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x3a (or later)
>      [   40.196868] kvm: already loaded the other module
>      [   40.219857] kvm: already loaded the other module
>      [   55.501362] kvm [1177]: vcpu0, guest rIP: 0xffffffff96e5b644 disabled perfctr wrmsr: 0xc2 data 0xffff
>      [   56.397974] kvm [1418]: vcpu0, guest rIP: 0xffffffff81046158 disabled perfctr wrmsr: 0xc1 data 0xabcd
>      [1007981.827781] kvm: already loaded the other module
>      [1008000.394089] kvm: already loaded the other module
>      [1008030.706999] kvm: already loaded the other module
>      [1020396.054470] kvm: already loaded the other module
>      [1020405.614774] kvm: already loaded the other module
>      [1020410.140069] kvm: already loaded the other module
>      [1020704.049231] kvm: already loaded the other module
> 
> As one of the two KVM modules is already loaded, KVM is functioning, and
> their is no error condition. Therefore, demote the log message level to
> informational.
> 

Shouldn't this return ENODEV when loading one of these modules instead
as there is no hardware that supports both VMX and SVM?

Thanks,
Maciej
Sean Christopherson Aug. 18, 2021, 10:23 p.m. UTC | #2
On Wed, Aug 18, 2021, Maciej S. Szmigiero wrote:
> On 18.08.2021 13:49, Paul Menzel wrote:
> > In scripts, running
> > 
> >      modprobe kvm_amd     2>/dev/null
> >      modprobe kvm_intel   2>/dev/null
> > 
> > to ensure the modules are loaded causes Linux to log errors.
> > 
> >      $ dmesg --level=err
> >      [    0.641747] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x3a (or later)
> >      [   40.196868] kvm: already loaded the other module
> >      [   40.219857] kvm: already loaded the other module
> >      [   55.501362] kvm [1177]: vcpu0, guest rIP: 0xffffffff96e5b644 disabled perfctr wrmsr: 0xc2 data 0xffff
> >      [   56.397974] kvm [1418]: vcpu0, guest rIP: 0xffffffff81046158 disabled perfctr wrmsr: 0xc1 data 0xabcd
> >      [1007981.827781] kvm: already loaded the other module
> >      [1008000.394089] kvm: already loaded the other module
> >      [1008030.706999] kvm: already loaded the other module
> >      [1020396.054470] kvm: already loaded the other module
> >      [1020405.614774] kvm: already loaded the other module
> >      [1020410.140069] kvm: already loaded the other module
> >      [1020704.049231] kvm: already loaded the other module
> > 
> > As one of the two KVM modules is already loaded, KVM is functioning, and
> > their is no error condition. Therefore, demote the log message level to
> > informational.

Hrm, but there is an error condition.  Userspace explicitly requested something
and KVM couldn't satisfy the request.

KVM is also going to complain at level=err one way or another, e.g. if a script
probes kvm_amd before kvm_intel on an Intel CPU it's going to get "kvm: no hardware
support", so this isn't truly fixing the problem.  Is the issue perhaps that this
particular message isn't ratelimited?

It's also easy for the script to grep /proc/cpuinfo, so it's hard to feel too
bad about the kludgy message, e.g. look for a specific vendor, 'vmx' or 'svm', etc...

if [[ -z $kvm ]]; then
    grep vendor_id "/proc/cpuinfo" | grep -q AuthenticAMD
    if [[ $? -eq 0 ]]; then
        kvm=kvm_amd
    else
        kvm=kvm_intel
    fi
fi


> Shouldn't this return ENODEV when loading one of these modules instead
> as there is no hardware that supports both VMX and SVM?

Probably not, as KVM would effectively be speculating, e.g. someone could load an
out-of-tree variant of kvm_{intel,amd}.  Maybe instead of switching to ENODEV,
reword the comment, make it ratelimited, and shove it down?  That way the message
and -EEXIST fires iff the vendor module actually has some chance of being loaded.

From 3528e66bd5107d5ac4f6a6ae50503cf64446866a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 18 Aug 2021 15:17:43 -0700
Subject: [PATCH] KVM: x86: Tweak handling and message when vendor module is
 already loaded

Reword KVM's error message if a vendor module is already loaded to state
exactly that instead of assuming "the other" module is loaded, ratelimit
said message to match the other errors, and move the check down below the
basic functionality checks so that attempting to load an unsupported
module provides the same result regardless of whether or not a supported
vendor module is already loaded.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fdc0c18339fb..15bd4bd3c81d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8357,12 +8357,6 @@ int kvm_arch_init(void *opaque)
 	struct kvm_x86_init_ops *ops = opaque;
 	int r;

-	if (kvm_x86_ops.hardware_enable) {
-		printk(KERN_ERR "kvm: already loaded the other module\n");
-		r = -EEXIST;
-		goto out;
-	}
-
 	if (!ops->cpu_has_kvm_support()) {
 		pr_err_ratelimited("kvm: no hardware support\n");
 		r = -EOPNOTSUPP;
@@ -8374,6 +8368,12 @@ int kvm_arch_init(void *opaque)
 		goto out;
 	}

+	if (kvm_x86_ops.hardware_enable) {
+		pr_err_ratelimited("kvm: already loaded a vendor module\n");
+		r = -EEXIST;
+		goto out;
+	}
+
 	/*
 	 * KVM explicitly assumes that the guest has an FPU and
 	 * FXSAVE/FXRSTOR. For example, the KVM_GET_FPU explicitly casts the
--
2.33.0.rc2.250.ged5fa647cd-goog
Paul Menzel Aug. 19, 2021, 6:39 a.m. UTC | #3
Dear Sean,


Am 19.08.21 um 00:23 schrieb Sean Christopherson:
> On Wed, Aug 18, 2021, Maciej S. Szmigiero wrote:
>> On 18.08.2021 13:49, Paul Menzel wrote:
>>> In scripts, running
>>>
>>>       modprobe kvm_amd     2>/dev/null
>>>       modprobe kvm_intel   2>/dev/null
>>>
>>> to ensure the modules are loaded causes Linux to log errors.
>>>
>>>       $ dmesg --level=err
>>>       [    0.641747] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x3a (or later)
>>>       [   40.196868] kvm: already loaded the other module
>>>       [   40.219857] kvm: already loaded the other module
>>>       [   55.501362] kvm [1177]: vcpu0, guest rIP: 0xffffffff96e5b644 disabled perfctr wrmsr: 0xc2 data 0xffff
>>>       [   56.397974] kvm [1418]: vcpu0, guest rIP: 0xffffffff81046158 disabled perfctr wrmsr: 0xc1 data 0xabcd
>>>       [1007981.827781] kvm: already loaded the other module
>>>       [1008000.394089] kvm: already loaded the other module
>>>       [1008030.706999] kvm: already loaded the other module
>>>       [1020396.054470] kvm: already loaded the other module
>>>       [1020405.614774] kvm: already loaded the other module
>>>       [1020410.140069] kvm: already loaded the other module
>>>       [1020704.049231] kvm: already loaded the other module
>>>
>>> As one of the two KVM modules is already loaded, KVM is functioning, and
>>> their is no error condition. Therefore, demote the log message level to
>>> informational.
> 
> Hrm, but there is an error condition.  Userspace explicitly requested something
> and KVM couldn't satisfy the request.

Yes, that’s the other perspective. ;-) I’d argue, as the Intel/AMD 
module can’t work on AMD/Intel, the load failure is expected and error. 
But as “error condition” is not well defined:

     $ dmesg -h
     […]
     Supported log levels (priorities):
        emerg - system is unusable
        alert - action must be taken immediately
         crit - critical conditions
          err - error conditions
         warn - warning conditions
       notice - normal but significant condition
         info - informational
        debug - debug-level messages

> KVM is also going to complain at level=err one way or another, e.g. if a script
> probes kvm_amd before kvm_intel on an Intel CPU it's going to get "kvm: no hardware
> support", so this isn't truly fixing the problem.

In my case, modprobe already errors out in that case, which is fine for me.

     $ lsmod | grep kvm
     kvm_intel             249856  0
     kvm                   851968  1 kvm_intel
     irqbypass              16384  1 kvm
     $ sudo modprobe -r kvm_intel
     $ sudo modprobe kvm_amd
     modprobe: ERROR: could not insert 'kvm_amd': Operation not supported
     $ dmesg | tail -2
     [212685.034278] has_svm: not amd or hygon
     [212685.037998] kvm: no hardware support

> Is the issue perhaps that this particular message isn't ratelimited?

It would help my use case, as I am not interested in the error, and 
would be another solution than just changing the log levle. But for your 
viewpoint “Userspace explicitly requested something and KVM couldn't 
satisfy the request”, the user wouldn’t see the immediate error at the 
end of the output of `dmesg`.

> It's also easy for the script to grep /proc/cpuinfo, so it's hard to feel too
> bad about the kludgy message, e.g. look for a specific vendor, 'vmx' or 'svm', etc...
> 
> if [[ -z $kvm ]]; then
>      grep vendor_id "/proc/cpuinfo" | grep -q AuthenticAMD
>      if [[ $? -eq 0 ]]; then
>          kvm=kvm_amd
>      else
>          kvm=kvm_intel
>      fi
> fi

Yes, it could be worked around.

>> Shouldn't this return ENODEV when loading one of these modules instead
>> as there is no hardware that supports both VMX and SVM?
> 
> Probably not, as KVM would effectively be speculating, e.g. someone could load an
> out-of-tree variant of kvm_{intel,amd}.  Maybe instead of switching to ENODEV,
> reword the comment, make it ratelimited, and shove it down?  That way the message
> and -EEXIST fires iff the vendor module actually has some chance of being loaded.
> 
>  From 3528e66bd5107d5ac4f6a6ae50503cf64446866a Mon Sep 17 00:00:00 2001
> From: Sean Christopherson <seanjc@google.com>
> Date: Wed, 18 Aug 2021 15:17:43 -0700
> Subject: [PATCH] KVM: x86: Tweak handling and message when vendor module is
>   already loaded
> 
> Reword KVM's error message if a vendor module is already loaded to state
> exactly that instead of assuming "the other" module is loaded,

The rewording is definitely an improvement.

> ratelimit
> said message to match the other errors, and move the check down below the
> basic functionality checks so that attempting to load an unsupported
> module provides the same result regardless of whether or not a supported
> vendor module is already loaded.

Maybe add an example, how it would log the error before, and how it’s 
done now.

> Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/x86.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fdc0c18339fb..15bd4bd3c81d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8357,12 +8357,6 @@ int kvm_arch_init(void *opaque)
>   	struct kvm_x86_init_ops *ops = opaque;
>   	int r;
> 
> -	if (kvm_x86_ops.hardware_enable) {
> -		printk(KERN_ERR "kvm: already loaded the other module\n");
> -		r = -EEXIST;
> -		goto out;
> -	}
> -
>   	if (!ops->cpu_has_kvm_support()) {
>   		pr_err_ratelimited("kvm: no hardware support\n");
>   		r = -EOPNOTSUPP;
> @@ -8374,6 +8368,12 @@ int kvm_arch_init(void *opaque)
>   		goto out;
>   	}
> 
> +	if (kvm_x86_ops.hardware_enable) {
> +		pr_err_ratelimited("kvm: already loaded a vendor module\n");
> +		r = -EEXIST;
> +		goto out;
> +	}
> +
>   	/*
>   	 * KVM explicitly assumes that the guest has an FPU and
>   	 * FXSAVE/FXRSTOR. For example, the KVM_GET_FPU explicitly casts the
> --
> 2.33.0.rc2.250.ged5fa647cd-goog

Sounds also good at first sight. No idea, if monitoring scripts in 
userspace would get confused now.


Kind regards,

Paul
Maciej S. Szmigiero Aug. 19, 2021, 2:05 p.m. UTC | #4
On 19.08.2021 08:39, Paul Menzel wrote:
> Am 19.08.21 um 00:23 schrieb Sean Christopherson:
>> On Wed, Aug 18, 2021, Maciej S. Szmigiero wrote:
>>> On 18.08.2021 13:49, Paul Menzel wrote:
>>>> In scripts, running
>>>>
>>>>       modprobe kvm_amd     2>/dev/null
>>>>       modprobe kvm_intel   2>/dev/null
>>>>
>>>> to ensure the modules are loaded causes Linux to log errors.
>>>>
>>>>       $ dmesg --level=err
>>>>       [    0.641747] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x3a (or later)
>>>>       [   40.196868] kvm: already loaded the other module
>>>>       [   40.219857] kvm: already loaded the other module
>>>>       [   55.501362] kvm [1177]: vcpu0, guest rIP: 0xffffffff96e5b644 disabled perfctr wrmsr: 0xc2 data 0xffff
>>>>       [   56.397974] kvm [1418]: vcpu0, guest rIP: 0xffffffff81046158 disabled perfctr wrmsr: 0xc1 data 0xabcd
>>>>       [1007981.827781] kvm: already loaded the other module
>>>>       [1008000.394089] kvm: already loaded the other module
>>>>       [1008030.706999] kvm: already loaded the other module
>>>>       [1020396.054470] kvm: already loaded the other module
>>>>       [1020405.614774] kvm: already loaded the other module
>>>>       [1020410.140069] kvm: already loaded the other module
>>>>       [1020704.049231] kvm: already loaded the other module
>>>>
>>>> As one of the two KVM modules is already loaded, KVM is functioning, and
>>>> their is no error condition. Therefore, demote the log message level to
>>>> informational.
>>
>> Hrm, but there is an error condition.  Userspace explicitly requested something
>> and KVM couldn't satisfy the request.
> 
> Yes, that’s the other perspective. ;-) I’d argue, as the Intel/AMD module can’t work on AMD/Intel, the load failure is expected and error. But as “error condition” is not well defined:
> 
>      $ dmesg -h
>      […]
>      Supported log levels (priorities):
>         emerg - system is unusable
>         alert - action must be taken immediately
>          crit - critical conditions
>           err - error conditions
>          warn - warning conditions
>        notice - normal but significant condition
>          info - informational
>         debug - debug-level messages
> 

Why is that script repeatably trying to load kvm_amd and kvm_intel
modules?
I would assume these would be loaded once at system boot time (either
manually or based on their modalias).
If your script absolutely has to load them manually, it could check first
whether /dev/kvm already exists.

>>> Shouldn't this return ENODEV when loading one of these modules instead
>>> as there is no hardware that supports both VMX and SVM?
>>
>> Probably not, as KVM would effectively be speculating, e.g. someone could load an
>> out-of-tree variant of kvm_{intel,amd}.  Maybe instead of switching to ENODEV,
>> reword the comment, make it ratelimited, and shove it down?  That way the message
>> and -EEXIST fires iff the vendor module actually has some chance of being loaded.
>>
>>  From 3528e66bd5107d5ac4f6a6ae50503cf64446866a Mon Sep 17 00:00:00 2001
>> From: Sean Christopherson <seanjc@google.com>
>> Date: Wed, 18 Aug 2021 15:17:43 -0700
>> Subject: [PATCH] KVM: x86: Tweak handling and message when vendor module is
>>   already loaded
>>
>> Reword KVM's error message if a vendor module is already loaded to state
>> exactly that instead of assuming "the other" module is loaded,
> 
> The rewording is definitely an improvement.
> 
>> ratelimit
>> said message to match the other errors, and move the check down below the
>> basic functionality checks so that attempting to load an unsupported
>> module provides the same result regardless of whether or not a supported
>> vendor module is already loaded.
> 
> Maybe add an example, how it would log the error before, and how it’s done now.
> 
>> Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
>> Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> ---
>>   arch/x86/kvm/x86.c | 12 ++++++------
>>   1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index fdc0c18339fb..15bd4bd3c81d 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -8357,12 +8357,6 @@ int kvm_arch_init(void *opaque)
>>       struct kvm_x86_init_ops *ops = opaque;
>>       int r;
>>
>> -    if (kvm_x86_ops.hardware_enable) {
>> -        printk(KERN_ERR "kvm: already loaded the other module\n");
>> -        r = -EEXIST;
>> -        goto out;
>> -    }
>> -
>>       if (!ops->cpu_has_kvm_support()) {
>>           pr_err_ratelimited("kvm: no hardware support\n");
>>           r = -EOPNOTSUPP;
>> @@ -8374,6 +8368,12 @@ int kvm_arch_init(void *opaque)
>>           goto out;
>>       }
>>
>> +    if (kvm_x86_ops.hardware_enable) {
>> +        pr_err_ratelimited("kvm: already loaded a vendor module\n");
>> +        r = -EEXIST;
>> +        goto out;
>> +    }
>> +
>>       /*
>>        * KVM explicitly assumes that the guest has an FPU and
>>        * FXSAVE/FXRSTOR. For example, the KVM_GET_FPU explicitly casts the
>> -- 
>> 2.33.0.rc2.250.ged5fa647cd-goog
> 
> Sounds also good at first sight. No idea, if monitoring scripts in userspace would get confused now.

This definitely looks more informative than the existing message.

It would be even better if it wasn't "kvm: no hardware support" (as this
message is technically incorrect), but something like
"kvm: no VMX hardware support" or "kvm: no SVM hardware support".

> Kind regards,
> 
> Paul

Thanks,
Maciej
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5d5c5ed7dd4..411c58ae0c97 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8372,7 +8372,7 @@  int kvm_arch_init(void *opaque)
 	int r;
 
 	if (kvm_x86_ops.hardware_enable) {
-		printk(KERN_ERR "kvm: already loaded the other module\n");
+		printk(KERN_INFO "kvm: already loaded the other module\n");
 		r = -EEXIST;
 		goto out;
 	}