diff mbox series

[v2] SVM: Add union intstat_t for offset 68h in vmcb struct

Message ID 20200324103726.3406-1-puwen@hygon.cn (mailing list archive)
State Superseded
Headers show
Series [v2] SVM: Add union intstat_t for offset 68h in vmcb struct | expand

Commit Message

Pu Wen March 24, 2020, 10:37 a.m. UTC
According to chapter "Appendix B Layout of VMCB" in the new version
(v3.32) AMD64 APM[1], bit 1 of the VMCB offset 68h is defined as
GUEST_INTERRUPT_MASK.

In current xen codes, it use whole u64 interrupt_shadow to setup
interrupt shadow, which will misuse other bit in VMCB offset 68h
as part of interrupt_shadow.

Add union intstat_t for VMCB offset 68h and fix codes to only use
bit 0 as intr_shadow according to the new APM description.

Reference:
[1] https://www.amd.com/system/files/TechDocs/24593.pdf

Signed-off-by: Pu Wen <puwen@hygon.cn>
---
v1->v2:
  - Copy the whole int_stat in nsvm_vmcb_prepare4vmrun() and
    nsvm_vmcb_prepare4vmexit().
  - Dump all 64 bits of int_stat in svm_vmcb_dump().

 xen/arch/x86/hvm/svm/nestedsvm.c   |  8 ++++----
 xen/arch/x86/hvm/svm/svm.c         |  8 ++++----
 xen/arch/x86/hvm/svm/svmdebug.c    |  4 ++--
 xen/include/asm-x86/hvm/svm/vmcb.h | 13 ++++++++++++-
 4 files changed, 22 insertions(+), 11 deletions(-)

Comments

Jan Beulich March 24, 2020, 11:55 a.m. UTC | #1
On 24.03.2020 11:37, Pu Wen wrote:
> According to chapter "Appendix B Layout of VMCB" in the new version
> (v3.32) AMD64 APM[1], bit 1 of the VMCB offset 68h is defined as
> GUEST_INTERRUPT_MASK.
> 
> In current xen codes, it use whole u64 interrupt_shadow to setup
> interrupt shadow, which will misuse other bit in VMCB offset 68h
> as part of interrupt_shadow.
> 
> Add union intstat_t for VMCB offset 68h and fix codes to only use
> bit 0 as intr_shadow according to the new APM description.
> 
> Reference:
> [1] https://www.amd.com/system/files/TechDocs/24593.pdf
> 
> Signed-off-by: Pu Wen <puwen@hygon.cn>

Looks okay now to me (with one further nit, see below), but ...

> v1->v2:
>   - Copy the whole int_stat in nsvm_vmcb_prepare4vmrun() and
>     nsvm_vmcb_prepare4vmexit().

... in particular this part I'd prefer to wait a little to
whether Andrew or anyone else has a specific opinion one or
the other way.

> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
> @@ -316,6 +316,17 @@ typedef union
>      uint64_t raw;
>  } intinfo_t;
>  
> +typedef union
> +{
> +    struct
> +    {

Nit: The brace should be on the same line as "struct"; can be
taken care of while committing.

Jan
Andrew Cooper March 24, 2020, 12:28 p.m. UTC | #2
On 24/03/2020 10:37, Pu Wen wrote:
> According to chapter "Appendix B Layout of VMCB" in the new version
> (v3.32) AMD64 APM[1], bit 1 of the VMCB offset 68h is defined as
> GUEST_INTERRUPT_MASK.
>
> In current xen codes, it use whole u64 interrupt_shadow to setup
> interrupt shadow, which will misuse other bit in VMCB offset 68h
> as part of interrupt_shadow.
>
> Add union intstat_t for VMCB offset 68h and fix codes to only use
> bit 0 as intr_shadow according to the new APM description.
>
> Reference:
> [1] https://www.amd.com/system/files/TechDocs/24593.pdf
>
> Signed-off-by: Pu Wen <puwen@hygon.cn>

Hmm - this field doesn't appear to be part of AVIC, which makes me
wonder what we're doing without it.

It appears to be a shadow copy of EFLAGS.IF which is only written on
vmexit, and never consumed, but this is based on Appendix B which is the
only reference I can find to the field at all.  Neither the
VMRUN/#VMEXIT descriptions discuss it at all.

Given its position next to the (ambiguous) INTERRUPT_SHADOW, it just
might actually distinguish the STI shadow from the MovSS shadow, but it
could only do that by not behaving as described, and being asymmetric
with EFLAGS.  I don't have time to investigate this right now.

We need the field described in Xen to set it appropriately for virtual
vmexit, but I think that is the extent of what we need to do.

~Andrew
Roger Pau Monné March 25, 2020, 10:30 a.m. UTC | #3
On Tue, Mar 24, 2020 at 06:37:26PM +0800, Pu Wen wrote:
> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
> index b9e389d481..d8a3285752 100644
> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
> @@ -316,6 +316,17 @@ typedef union
>      uint64_t raw;
>  } intinfo_t;
>  
> +typedef union
> +{
> +    struct
> +    {
> +        u64 intr_shadow:    1;
> +        u64 guest_intr_mask:1;
> +        u64 resvd:          62;

Could you also use uint64_t for the fields, like you do below for
raw?

Thanks, Roger.
Pu Wen March 25, 2020, 3:21 p.m. UTC | #4
On 2020/3/24 19:55, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
>> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
>> @@ -316,6 +316,17 @@ typedef union
>>       uint64_t raw;
>>   } intinfo_t;
>>   
>> +typedef union
>> +{
>> +    struct
>> +    {
> 
> Nit: The brace should be on the same line as "struct"; can be
> taken care of while committing.

Ok, thanks.
Pu Wen March 25, 2020, 3:22 p.m. UTC | #5
On 2020/3/24 20:28, Andrew Cooper wrote:
> Hmm - this field doesn't appear to be part of AVIC, which makes me
> wonder what we're doing without it.
> 
> It appears to be a shadow copy of EFLAGS.IF which is only written on
> vmexit, and never consumed, but this is based on Appendix B which is the
> only reference I can find to the field at all.  Neither the
> VMRUN/#VMEXIT descriptions discuss it at all.
> 
> Given its position next to the (ambiguous) INTERRUPT_SHADOW, it just
> might actually distinguish the STI shadow from the MovSS shadow, but it
> could only do that by not behaving as described, and being asymmetric
> with EFLAGS.  I don't have time to investigate this right now.
> 
> We need the field described in Xen to set it appropriately for virtual
> vmexit, but I think that is the extent of what we need to do.

We encountered problem while running xen with new firmware which
implement the bit[1] of the VMCB offset 68h: the DomU stopped when
running seabios. We debugged the seabios code and found that the
seabios hung after call16_int10().

Then we debugged the xen code, and found the cause at this place in
svm_get_interrupt_shadow():
    if ( vmcb->interrupt_shadow )
         intr_shadow |= HVM_INTR_SHADOW_MOV_SS | HVM_INTR_SHADOW_STI;
the conditional is true if bit[1] is 1 even though bit[0] is zero.
If just only use bit[0] in the conditional, the problem is solved, DomU
will run successfully.
Pu Wen March 25, 2020, 3:23 p.m. UTC | #6
On 2020/3/25 18:30, Roger Pau Monné wrote:
> On Tue, Mar 24, 2020 at 06:37:26PM +0800, Pu Wen wrote:
>> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
>> index b9e389d481..d8a3285752 100644
>> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
>> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
>> @@ -316,6 +316,17 @@ typedef union
>>       uint64_t raw;
>>   } intinfo_t;
>>   
>> +typedef union
>> +{
>> +    struct
>> +    {
>> +        u64 intr_shadow:    1;
>> +        u64 guest_intr_mask:1;
>> +        u64 resvd:          62;
> 
> Could you also use uint64_t for the fields, like you do below for
> raw?

Ok, thanks. Maybe bool for intr_shadow and guest_intr_mask is better?
Andrew Cooper March 25, 2020, 3:56 p.m. UTC | #7
On 25/03/2020 15:22, Pu Wen wrote:
> On 2020/3/24 20:28, Andrew Cooper wrote:
>> Hmm - this field doesn't appear to be part of AVIC, which makes me
>> wonder what we're doing without it.
>>
>> It appears to be a shadow copy of EFLAGS.IF which is only written on
>> vmexit, and never consumed, but this is based on Appendix B which is the
>> only reference I can find to the field at all.  Neither the
>> VMRUN/#VMEXIT descriptions discuss it at all.
>>
>> Given its position next to the (ambiguous) INTERRUPT_SHADOW, it just
>> might actually distinguish the STI shadow from the MovSS shadow, but it
>> could only do that by not behaving as described, and being asymmetric
>> with EFLAGS.  I don't have time to investigate this right now.
>>
>> We need the field described in Xen to set it appropriately for virtual
>> vmexit, but I think that is the extent of what we need to do.
> We encountered problem while running xen with new firmware which
> implement the bit[1] of the VMCB offset 68h: the DomU stopped when
> running seabios. We debugged the seabios code and found that the
> seabios hung after call16_int10().
>
> Then we debugged the xen code, and found the cause at this place in
> svm_get_interrupt_shadow():
>     if ( vmcb->interrupt_shadow )
>          intr_shadow |= HVM_INTR_SHADOW_MOV_SS | HVM_INTR_SHADOW_STI;
> the conditional is true if bit[1] is 1 even though bit[0] is zero.
> If just only use bit[0] in the conditional, the problem is solved, DomU
> will run successfully.

Oh - now you point this out, the issue is obvious.

The above content would make a far more informative commit message.  How
about extending the middle paragraph with:

"...part of interrupt_shadow, causing svm_get_interrupt_shadow() to
mistake the guest having interrupts enabled as being in an interrupt
shadow.  This has been observed to cause SeaBIOS to hang on boot."

or words to that effect.  The "it definitely breaks a guest" is the most
important piece of information here.

Do you happen to know call16_int10() was doing, exactly?  We've
presumably trapped for emulation to be using svm_get_interrupt_shadow()
in the first place.

~Andrew
Andrew Cooper March 25, 2020, 4:03 p.m. UTC | #8
On 25/03/2020 15:23, Pu Wen wrote:
> On 2020/3/25 18:30, Roger Pau Monné wrote:
>> On Tue, Mar 24, 2020 at 06:37:26PM +0800, Pu Wen wrote:
>>> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h
>>> b/xen/include/asm-x86/hvm/svm/vmcb.h
>>> index b9e389d481..d8a3285752 100644
>>> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
>>> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
>>> @@ -316,6 +316,17 @@ typedef union
>>>       uint64_t raw;
>>>   } intinfo_t;
>>>   +typedef union
>>> +{
>>> +    struct
>>> +    {
>>> +        u64 intr_shadow:    1;
>>> +        u64 guest_intr_mask:1;
>>> +        u64 resvd:          62;
>>
>> Could you also use uint64_t for the fields, like you do below for
>> raw?
>
> Ok, thanks. Maybe bool for intr_shadow and guest_intr_mask is better?

Bool would be better if you're willing to change them.

There is a subtle truncation bug with can occur, e.g.

foo->intr_shadow = bar & MASK;

turns to 0 if MASK isn't the bottom bit, and intr_shadow is not bool.

The traditional way to fix this is with !!(bar & MASK), but bools are
safer because you can't get it wrong accidentally.

Its also fine to drop the resvd entirely.  No need for the field.

~Andrew
Pu Wen March 26, 2020, 10:16 a.m. UTC | #9
On 2020/3/26 0:03, Andrew Cooper wrote:
> On 25/03/2020 15:23, Pu Wen wrote:
>> On 2020/3/25 18:30, Roger Pau Monné wrote:
>>> On Tue, Mar 24, 2020 at 06:37:26PM +0800, Pu Wen wrote:
>>>> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h
>>>> b/xen/include/asm-x86/hvm/svm/vmcb.h
>>>> index b9e389d481..d8a3285752 100644
>>>> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
>>>> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
>>>> @@ -316,6 +316,17 @@ typedef union
>>>>        uint64_t raw;
>>>>    } intinfo_t;
>>>>    +typedef union
>>>> +{
>>>> +    struct
>>>> +    {
>>>> +        u64 intr_shadow:    1;
>>>> +        u64 guest_intr_mask:1;
>>>> +        u64 resvd:          62;
>>>
>>> Could you also use uint64_t for the fields, like you do below for
>>> raw?
>>
>> Ok, thanks. Maybe bool for intr_shadow and guest_intr_mask is better?
> 
> Bool would be better if you're willing to change them.
> 
> There is a subtle truncation bug with can occur, e.g.
> 
> foo->intr_shadow = bar & MASK;
> 
> turns to 0 if MASK isn't the bottom bit, and intr_shadow is not bool.
> 
> The traditional way to fix this is with !!(bar & MASK), but bools are
> safer because you can't get it wrong accidentally.
> 
> Its also fine to drop the resvd entirely.  No need for the field.

Thanks for the explanation. I'm fine to use bool for intr_shadow and
drop the resvd field.
Pu Wen March 26, 2020, 10:16 a.m. UTC | #10
On 2020/3/25 23:56, Andrew Cooper wrote:
> On 25/03/2020 15:22, Pu Wen wrote:
>> On 2020/3/24 20:28, Andrew Cooper wrote:
>>> Hmm - this field doesn't appear to be part of AVIC, which makes me
>>> wonder what we're doing without it.
>>>
>>> It appears to be a shadow copy of EFLAGS.IF which is only written on
>>> vmexit, and never consumed, but this is based on Appendix B which is the
>>> only reference I can find to the field at all.  Neither the
>>> VMRUN/#VMEXIT descriptions discuss it at all.
>>>
>>> Given its position next to the (ambiguous) INTERRUPT_SHADOW, it just
>>> might actually distinguish the STI shadow from the MovSS shadow, but it
>>> could only do that by not behaving as described, and being asymmetric
>>> with EFLAGS.  I don't have time to investigate this right now.
>>>
>>> We need the field described in Xen to set it appropriately for virtual
>>> vmexit, but I think that is the extent of what we need to do.
>> We encountered problem while running xen with new firmware which
>> implement the bit[1] of the VMCB offset 68h: the DomU stopped when
>> running seabios. We debugged the seabios code and found that the
>> seabios hung after call16_int10().
>>
>> Then we debugged the xen code, and found the cause at this place in
>> svm_get_interrupt_shadow():
>>      if ( vmcb->interrupt_shadow )
>>           intr_shadow |= HVM_INTR_SHADOW_MOV_SS | HVM_INTR_SHADOW_STI;
>> the conditional is true if bit[1] is 1 even though bit[0] is zero.
>> If just only use bit[0] in the conditional, the problem is solved, DomU
>> will run successfully.
> 
> Oh - now you point this out, the issue is obvious.
> 
> The above content would make a far more informative commit message. How
> about extending the middle paragraph with:
> 
> "...part of interrupt_shadow, causing svm_get_interrupt_shadow() to
> mistake the guest having interrupts enabled as being in an interrupt
> shadow.  This has been observed to cause SeaBIOS to hang on boot."
> 
> or words to that effect.  The "it definitely breaks a guest" is the most
> important piece of information here.

Thanks for the suggestion, will add it to the patch description.

> Do you happen to know call16_int10() was doing, exactly?  We've
> presumably trapped for emulation to be using svm_get_interrupt_shadow()
> in the first place.

call16_int10() is used to set VGA mode and we see no trap operation in log.
We suspected there is something wrong in the interrupt handling process,
after we changed to use interrupt_shadow bit, we found the problem is solved.
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 3bd2a119d3..bbd06e342e 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -507,8 +507,8 @@  static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
         n2vmcb->_vintr.fields.intr_masking = 1;
     }
 
-    /* Shadow Mode */
-    n2vmcb->interrupt_shadow = ns_vmcb->interrupt_shadow;
+    /* Interrupt state */
+    n2vmcb->int_stat = ns_vmcb->int_stat;
 
     /* Exit codes */
     n2vmcb->exitcode = ns_vmcb->exitcode;
@@ -1057,8 +1057,8 @@  nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     if (!(svm->ns_hostflags.fields.vintrmask))
         ns_vmcb->_vintr.fields.intr_masking = 0;
 
-    /* Shadow mode */
-    ns_vmcb->interrupt_shadow = n2vmcb->interrupt_shadow;
+    /* Interrupt state */
+    ns_vmcb->int_stat = n2vmcb->int_stat;
 
     /* Exit codes */
     ns_vmcb->exitcode = n2vmcb->exitcode;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 32d8d847f2..888f504a94 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -116,7 +116,7 @@  void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len)
     regs->rip += inst_len;
     regs->eflags &= ~X86_EFLAGS_RF;
 
-    curr->arch.hvm.svm.vmcb->interrupt_shadow = 0;
+    curr->arch.hvm.svm.vmcb->int_stat.intr_shadow = 0;
 
     if ( regs->eflags & X86_EFLAGS_TF )
         hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
@@ -432,7 +432,7 @@  static unsigned int svm_get_interrupt_shadow(struct vcpu *v)
     struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
     unsigned int intr_shadow = 0;
 
-    if ( vmcb->interrupt_shadow )
+    if ( vmcb->int_stat.intr_shadow )
         intr_shadow |= HVM_INTR_SHADOW_MOV_SS | HVM_INTR_SHADOW_STI;
 
     if ( vmcb_get_general1_intercepts(vmcb) & GENERAL1_INTERCEPT_IRET )
@@ -446,7 +446,7 @@  static void svm_set_interrupt_shadow(struct vcpu *v, unsigned int intr_shadow)
     struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
     u32 general1_intercepts = vmcb_get_general1_intercepts(vmcb);
 
-    vmcb->interrupt_shadow =
+    vmcb->int_stat.intr_shadow =
         !!(intr_shadow & (HVM_INTR_SHADOW_MOV_SS|HVM_INTR_SHADOW_STI));
 
     general1_intercepts &= ~GENERAL1_INTERCEPT_IRET;
@@ -2945,7 +2945,7 @@  void svm_vmexit_handler(struct cpu_user_regs *regs)
          * retired.
          */
         general1_intercepts &= ~GENERAL1_INTERCEPT_IRET;
-        vmcb->interrupt_shadow = 1;
+        vmcb->int_stat.intr_shadow = 1;
 
         vmcb_set_general1_intercepts(vmcb, general1_intercepts);
         break;
diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c
index 366a003f21..5aa9d410ba 100644
--- a/xen/arch/x86/hvm/svm/svmdebug.c
+++ b/xen/arch/x86/hvm/svm/svmdebug.c
@@ -51,9 +51,9 @@  void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb)
     printk("iopm_base_pa = %#"PRIx64" msrpm_base_pa = %#"PRIx64" tsc_offset = %#"PRIx64"\n",
            vmcb_get_iopm_base_pa(vmcb), vmcb_get_msrpm_base_pa(vmcb),
            vmcb_get_tsc_offset(vmcb));
-    printk("tlb_control = %#x vintr = %#"PRIx64" interrupt_shadow = %#"PRIx64"\n",
+    printk("tlb_control = %#x vintr = %#"PRIx64" int_stat = %#"PRIx64"\n",
            vmcb->tlb_control, vmcb_get_vintr(vmcb).bytes,
-           vmcb->interrupt_shadow);
+           vmcb->int_stat.raw);
     printk("event_inj %016"PRIx64", valid? %d, ec? %d, type %u, vector %#x\n",
            vmcb->event_inj.raw, vmcb->event_inj.v,
            vmcb->event_inj.ev, vmcb->event_inj.type,
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index b9e389d481..d8a3285752 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -316,6 +316,17 @@  typedef union
     uint64_t raw;
 } intinfo_t;
 
+typedef union
+{
+    struct
+    {
+        u64 intr_shadow:    1;
+        u64 guest_intr_mask:1;
+        u64 resvd:          62;
+    };
+    uint64_t raw;
+} intstat_t;
+
 typedef union
 {
     u64 bytes;
@@ -414,7 +425,7 @@  struct vmcb_struct {
     u8  tlb_control;            /* offset 0x5C */
     u8  res07[3];
     vintr_t _vintr;             /* offset 0x60 - cleanbit 3 */
-    u64 interrupt_shadow;       /* offset 0x68 */
+    intstat_t int_stat;         /* offset 0x68 */
     u64 exitcode;               /* offset 0x70 */
     union {
         struct {