diff mbox series

[1/1] xen/gnttab: print log at level XENLOG_ERR before panic

Message ID 1555936799-15675-1-git-send-email-dongli.zhang@oracle.com (mailing list archive)
State New, archived
Headers show
Series [1/1] xen/gnttab: print log at level XENLOG_ERR before panic | expand

Commit Message

Dongli Zhang April 22, 2019, 12:39 p.m. UTC
Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
indicates there is fatal error.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 xen/common/grant_table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andrew Cooper April 23, 2019, 10:37 a.m. UTC | #1
On 22/04/2019 13:39, Dongli Zhang wrote:
> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
> indicates there is fatal error.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  xen/common/grant_table.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 80728ea..725c618 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1282,7 +1282,7 @@ unmap_common(
>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>      {
>          /* This can happen when a grant is implicitly unmapped. */
> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);

While I agree with the change in principle, what does this actually do? 
The entire printk() is compiled out in release builds, and debug builds
log all messages.

~Andrew

>          domain_crash(ld); /* naughty... */
>          return;
>      }
George Dunlap April 23, 2019, 11:47 a.m. UTC | #2
On 4/23/19 11:37 AM, Andrew Cooper wrote:
> On 22/04/2019 13:39, Dongli Zhang wrote:
>> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
>> indicates there is fatal error.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  xen/common/grant_table.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>> index 80728ea..725c618 100644
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -1282,7 +1282,7 @@ unmap_common(
>>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>>      {
>>          /* This can happen when a grant is implicitly unmapped. */
>> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
>> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
> 
> While I agree with the change in principle, what does this actually do? 
> The entire printk() is compiled out in release builds, and debug builds
> log all messages.

...by default; but presumably you can turn that down, right?

This also changes the prefix in the log output, right?  That would make
it easier to identify which message is important for undertanding why
your domain just crashed.

Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper April 23, 2019, 4 p.m. UTC | #3
On 23/04/2019 12:47, George Dunlap wrote:
> On 4/23/19 11:37 AM, Andrew Cooper wrote:
>> On 22/04/2019 13:39, Dongli Zhang wrote:
>>> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
>>> indicates there is fatal error.
>>>
>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>> ---
>>>  xen/common/grant_table.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>> index 80728ea..725c618 100644
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -1282,7 +1282,7 @@ unmap_common(
>>>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>>>      {
>>>          /* This can happen when a grant is implicitly unmapped. */
>>> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
>>> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
>> While I agree with the change in principle, what does this actually do? 
>> The entire printk() is compiled out in release builds, and debug builds
>> log all messages.
> ...by default; but presumably you can turn that down, right?

Yes you can, but a) noone does, because these are debug builds and b)
that isn't the point.  If we care about this message being printed, it
mustn’t be a gdprintk().

It either wants to turn into a gprintk() at XENLOG_INFO, or wants
deleting entirely, but simply shuffling the exact level for a gdprintk()
is tantamount to useless.

>
> This also changes the prefix in the log output, right?  That would make
> it easier to identify which message is important for undertanding why
> your domain just crashed.

The prefix is completely internal to Xen.  By the time it ends up in the
console ring, the data is lost.

~Andrew
Jan Beulich April 25, 2019, 1:10 p.m. UTC | #4
>>> On 23.04.19 at 18:00, <andrew.cooper3@citrix.com> wrote:
> On 23/04/2019 12:47, George Dunlap wrote:
>> On 4/23/19 11:37 AM, Andrew Cooper wrote:
>>> On 22/04/2019 13:39, Dongli Zhang wrote:
>>>> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
>>>> indicates there is fatal error.
>>>>
>>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>>> ---
>>>>  xen/common/grant_table.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>>> index 80728ea..725c618 100644
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -1282,7 +1282,7 @@ unmap_common(
>>>>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>>>>      {
>>>>          /* This can happen when a grant is implicitly unmapped. */
>>>> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
>>>> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
>>> While I agree with the change in principle, what does this actually do? 
>>> The entire printk() is compiled out in release builds, and debug builds
>>> log all messages.
>> ...by default; but presumably you can turn that down, right?
> 
> Yes you can, but a) noone does, because these are debug builds and b)
> that isn't the point.  If we care about this message being printed, it
> mustn’t be a gdprintk().
> 
> It either wants to turn into a gprintk() at XENLOG_INFO, or wants
> deleting entirely, but simply shuffling the exact level for a gdprintk()
> is tantamount to useless.

FWIW I agree, and I think there are quite a few more candidates of
debug messages that would better go away, and should probably
have been stripped off before the original commit introducing them.

The one reason not to drop the message here, but to switch to
gprintk(), is the adjacent domain_crash(). This, however, then is
a reason to give it a log level above INFO.

Jan
Andrew Cooper April 25, 2019, 1:17 p.m. UTC | #5
On 25/04/2019 14:10, Jan Beulich wrote:
>>>> On 23.04.19 at 18:00, <andrew.cooper3@citrix.com> wrote:
>> On 23/04/2019 12:47, George Dunlap wrote:
>>> On 4/23/19 11:37 AM, Andrew Cooper wrote:
>>>> On 22/04/2019 13:39, Dongli Zhang wrote:
>>>>> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
>>>>> indicates there is fatal error.
>>>>>
>>>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>>>> ---
>>>>>  xen/common/grant_table.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>>>>> index 80728ea..725c618 100644
>>>>> --- a/xen/common/grant_table.c
>>>>> +++ b/xen/common/grant_table.c
>>>>> @@ -1282,7 +1282,7 @@ unmap_common(
>>>>>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>>>>>      {
>>>>>          /* This can happen when a grant is implicitly unmapped. */
>>>>> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
>>>>> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
>>>> While I agree with the change in principle, what does this actually do? 
>>>> The entire printk() is compiled out in release builds, and debug builds
>>>> log all messages.
>>> ...by default; but presumably you can turn that down, right?
>> Yes you can, but a) noone does, because these are debug builds and b)
>> that isn't the point.  If we care about this message being printed, it
>> mustn’t be a gdprintk().
>>
>> It either wants to turn into a gprintk() at XENLOG_INFO, or wants
>> deleting entirely, but simply shuffling the exact level for a gdprintk()
>> is tantamount to useless.
> FWIW I agree, and I think there are quite a few more candidates of
> debug messages that would better go away, and should probably
> have been stripped off before the original commit introducing them.
>
> The one reason not to drop the message here, but to switch to
> gprintk(), is the adjacent domain_crash(). This, however, then is
> a reason to give it a log level above INFO.

I still have a series I need to refresh and repost which causes
domain_crash() to take a XENLOG_G_ERR print message, specifically so it
is not possible to ever get a domain_crash() without at least a hint of
what went wrong.

~Andrew
Dongli Zhang April 25, 2019, 5:21 p.m. UTC | #6
Hi Andrew,

On 04/23/2019 06:37 PM, Andrew Cooper wrote:
> On 22/04/2019 13:39, Dongli Zhang wrote:
>> Print log at level XENLOG_ERR (instead XENLOG_INFO) as domain_crash()
>> indicates there is fatal error.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  xen/common/grant_table.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>> index 80728ea..725c618 100644
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -1282,7 +1282,7 @@ unmap_common(
>>      if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
>>      {
>>          /* This can happen when a grant is implicitly unmapped. */
>> -        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
>> +        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
> 
> While I agree with the change in principle, what does this actually do? 
> The entire printk() is compiled out in release builds, and debug builds
> log all messages.

The objective is for below case.

If user sets "guest_loglvl=err", this line would not be printed. Once xen panic
by domain_crash, it would take effort to know the domid leading to the panic.

To verify if the vmcore analysis is correct, the user needs to set
"guest_loglvl=all" and wait for another panic again.

I agree it would be better to turn it into something like gprintk. If xen always
panic after a log, we should guarantee that the log is always printed to the
console in order to help analyze the root cause.

Dongli Zhang

> 
> ~Andrew
> 
>>          domain_crash(ld); /* naughty... */
>>          return;
>>      }
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>
diff mbox series

Patch

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 80728ea..725c618 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1282,7 +1282,7 @@  unmap_common(
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
     {
         /* This can happen when a grant is implicitly unmapped. */
-        gdprintk(XENLOG_INFO, "Could not find domain %d\n", dom);
+        gdprintk(XENLOG_ERR, "Could not find domain %d\n", dom);
         domain_crash(ld); /* naughty... */
         return;
     }