Message ID | 1398905152-18091-2-git-send-email-mihai.caraman@freescale.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 05/01/2014 02:45 AM, Mihai Caraman wrote: > The commit 1d628af7 "add load inst fixup" made an attempt to handle > failures generated by reading the guest current instruction. The fixup > code that was added works by chance hiding the real issue. > > Load external pid (lwepx) instruction, used by KVM to read guest > instructions, is executed in a subsituted guest translation context > (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage > interrupts need to be handled by KVM, even though these interrupts > are generated from host context (MSR[GS] = 0). > > Currently, KVM hooks only interrupts generated from guest context > (MSR[GS] = 1), doing minimal checks on the fast path to avoid host > performance degradation. As a result, the host kernel handles lwepx > faults searching the faulting guest data address (loaded in DEAR) in > its own Logical Partition ID (LPID) 0 context. In case a host translation > is found the execution returns to the lwepx instruction instead of the > fixup, the host ending up in an infinite loop. > > Revert the commit "add load inst fixup". lwepx issue will be addressed > in a subsequent patch without needing fixup code. > > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Just a random idea: Could we just switch IVOR2 during the critical lwepx phase? In fact, we could even do that later when we're already in C code and try to recover the last instruction. The code IVOR2 would point to would simply set the register we're trying to read to as LAST_INST_FAIL and rfi. Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Alexander Graf <agraf@suse.de> > Sent: Friday, May 2, 2014 12:24 PM > To: Caraman Mihai Claudiu-B02008 > Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org > Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup" > > On 05/01/2014 02:45 AM, Mihai Caraman wrote: > > The commit 1d628af7 "add load inst fixup" made an attempt to handle > > failures generated by reading the guest current instruction. The fixup > > code that was added works by chance hiding the real issue. > > > > Load external pid (lwepx) instruction, used by KVM to read guest > > instructions, is executed in a subsituted guest translation context > > (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage > > interrupts need to be handled by KVM, even though these interrupts > > are generated from host context (MSR[GS] = 0). > > > > Currently, KVM hooks only interrupts generated from guest context > > (MSR[GS] = 1), doing minimal checks on the fast path to avoid host > > performance degradation. As a result, the host kernel handles lwepx > > faults searching the faulting guest data address (loaded in DEAR) in > > its own Logical Partition ID (LPID) 0 context. In case a host translation > > is found the execution returns to the lwepx instruction instead of the > > fixup, the host ending up in an infinite loop. > > > > Revert the commit "add load inst fixup". lwepx issue will be addressed > > in a subsequent patch without needing fixup code. > > > > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> > > Just a random idea: Could we just switch IVOR2 during the critical lwepx > phase? In fact, we could even do that later when we're already in C code > and try to recover the last instruction. The code IVOR2 would point to > would simply set the register we're trying to read to as LAST_INST_FAIL > and rfi. This was the first idea that sprang to my mind inspired from how DO_KVM is hooked on PR. I actually did a simple POC for e500mc/e5500, but this will not work on e6500 which has shared IVORs between HW threads. -Mike-- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com" <mihai.caraman@freescale.com>: >> From: Alexander Graf <agraf@suse.de> >> Sent: Friday, May 2, 2014 12:24 PM >> To: Caraman Mihai Claudiu-B02008 >> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org >> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup" >> >>> On 05/01/2014 02:45 AM, Mihai Caraman wrote: >>> The commit 1d628af7 "add load inst fixup" made an attempt to handle >>> failures generated by reading the guest current instruction. The fixup >>> code that was added works by chance hiding the real issue. >>> >>> Load external pid (lwepx) instruction, used by KVM to read guest >>> instructions, is executed in a subsituted guest translation context >>> (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage >>> interrupts need to be handled by KVM, even though these interrupts >>> are generated from host context (MSR[GS] = 0). >>> >>> Currently, KVM hooks only interrupts generated from guest context >>> (MSR[GS] = 1), doing minimal checks on the fast path to avoid host >>> performance degradation. As a result, the host kernel handles lwepx >>> faults searching the faulting guest data address (loaded in DEAR) in >>> its own Logical Partition ID (LPID) 0 context. In case a host translation >>> is found the execution returns to the lwepx instruction instead of the >>> fixup, the host ending up in an infinite loop. >>> >>> Revert the commit "add load inst fixup". lwepx issue will be addressed >>> in a subsequent patch without needing fixup code. >>> >>> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> >> >> Just a random idea: Could we just switch IVOR2 during the critical lwepx >> phase? In fact, we could even do that later when we're already in C code >> and try to recover the last instruction. The code IVOR2 would point to >> would simply set the register we're trying to read to as LAST_INST_FAIL >> and rfi. > > This was the first idea that sprang to my mind inspired from how DO_KVM > is hooked on PR. I actually did a simple POC for e500mc/e5500, but this will > not work on e6500 which has shared IVORs between HW threads. What if we combine the ideas? On read we flip the IVOR to a separate handler that checks for a field in the PACA. Only if that field is set, we treat the fault as kvm fault, otherwise we jump into the normal handler. I suppose we'd have to also take a lock to make sure we don't race with the other thread when it wants to also read a guest instruction, but you get the idea. I have no idea whether this would be any faster, it's more of a brainstorming thing really. But regardless this patch set would be a move into the right direction. Btw, do we have any guarantees that we don't get scheduled away before we run kvmppc_get_last_inst()? If we run on a different core we can't read the inst anymore. Hrm. Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Alexander Graf [mailto:agraf@suse.de] > Sent: Sunday, May 04, 2014 1:14 AM > To: Caraman Mihai Claudiu-B02008 > Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc- > dev@lists.ozlabs.org > Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst > fixup" > > > > Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com" > <mihai.caraman@freescale.com>: > > >> From: Alexander Graf <agraf@suse.de> > >> Sent: Friday, May 2, 2014 12:24 PM > > This was the first idea that sprang to my mind inspired from how DO_KVM > > is hooked on PR. I actually did a simple POC for e500mc/e5500, but this > will > > not work on e6500 which has shared IVORs between HW threads. > > What if we combine the ideas? On read we flip the IVOR to a separate > handler that checks for a field in the PACA. Only if that field is set, > we treat the fault as kvm fault, otherwise we jump into the normal > handler. > > I suppose we'd have to also take a lock to make sure we don't race with > the other thread when it wants to also read a guest instruction, but you > get the idea. This might be a solution for TLB eviction but not for execute-but-not-read entries which requires access from host context. > > I have no idea whether this would be any faster, it's more of a > brainstorming thing really. But regardless this patch set would be a move > into the right direction. > > Btw, do we have any guarantees that we don't get scheduled away before we > run kvmppc_get_last_inst()? If we run on a different core we can't read > the inst anymore. Hrm. It was your suggestion to move the logic from kvmppc_handle_exit() irq disabled area to kvmppc_get_last_inst(): http://git.freescale.com/git/cgit.cgi/ppc/sdk/linux.git/tree/arch/powerpc/kvm/booke.c Still, what is wrong if we get scheduled on another core? We will emulate again and the guest will populate the TLB on the new core. -Mike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/06/2014 05:48 PM, mihai.caraman@freescale.com wrote: >> -----Original Message----- >> From: Alexander Graf [mailto:agraf@suse.de] >> Sent: Sunday, May 04, 2014 1:14 AM >> To: Caraman Mihai Claudiu-B02008 >> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc- >> dev@lists.ozlabs.org >> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst >> fixup" >> >> >> >> Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com" >> <mihai.caraman@freescale.com>: >> >>>> From: Alexander Graf <agraf@suse.de> >>>> Sent: Friday, May 2, 2014 12:24 PM >>> This was the first idea that sprang to my mind inspired from how DO_KVM >>> is hooked on PR. I actually did a simple POC for e500mc/e5500, but this >> will >>> not work on e6500 which has shared IVORs between HW threads. >> What if we combine the ideas? On read we flip the IVOR to a separate >> handler that checks for a field in the PACA. Only if that field is set, >> we treat the fault as kvm fault, otherwise we jump into the normal >> handler. >> >> I suppose we'd have to also take a lock to make sure we don't race with >> the other thread when it wants to also read a guest instruction, but you >> get the idea. > This might be a solution for TLB eviction but not for execute-but-not-read > entries which requires access from host context. Good point :). > >> I have no idea whether this would be any faster, it's more of a >> brainstorming thing really. But regardless this patch set would be a move >> into the right direction. >> >> Btw, do we have any guarantees that we don't get scheduled away before we >> run kvmppc_get_last_inst()? If we run on a different core we can't read >> the inst anymore. Hrm. > It was your suggestion to move the logic from kvmppc_handle_exit() irq > disabled area to kvmppc_get_last_inst(): > > http://git.freescale.com/git/cgit.cgi/ppc/sdk/linux.git/tree/arch/powerpc/kvm/booke.c > > Still, what is wrong if we get scheduled on another core? We will emulate > again and the guest will populate the TLB on the new core. Yes, it means we have to get the EMULATE_AGAIN code paths correct :). It also means we lose some performance with preemptive kernel configurations. Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index a1712b8..925da71 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -164,32 +164,9 @@ PPC_STL r30, VCPU_GPR(R30)(r4) PPC_STL r31, VCPU_GPR(R31)(r4) mtspr SPRN_EPLC, r8 - - /* disable preemption, so we are sure we hit the fixup handler */ - CURRENT_THREAD_INFO(r8, r1) - li r7, 1 - stw r7, TI_PREEMPT(r8) - isync - - /* - * In case the read goes wrong, we catch it and write an invalid value - * in LAST_INST instead. - */ -1: lwepx r9, 0, r5 -2: -.section .fixup, "ax" -3: li r9, KVM_INST_FETCH_FAILED - b 2b -.previous -.section __ex_table,"a" - PPC_LONG_ALIGN - PPC_LONG 1b,3b -.previous - + lwepx r9, 0, r5 mtspr SPRN_EPLC, r3 - li r7, 0 - stw r7, TI_PREEMPT(r8) stw r9, VCPU_LAST_INST(r4) .endif
The commit 1d628af7 "add load inst fixup" made an attempt to handle failures generated by reading the guest current instruction. The fixup code that was added works by chance hiding the real issue. Load external pid (lwepx) instruction, used by KVM to read guest instructions, is executed in a subsituted guest translation context (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage interrupts need to be handled by KVM, even though these interrupts are generated from host context (MSR[GS] = 0). Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. As a result, the host kernel handles lwepx faults searching the faulting guest data address (loaded in DEAR) in its own Logical Partition ID (LPID) 0 context. In case a host translation is found the execution returns to the lwepx instruction instead of the fixup, the host ending up in an infinite loop. Revert the commit "add load inst fixup". lwepx issue will be addressed in a subsequent patch without needing fixup code. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> --- v2: - reworked patch description arch/powerpc/kvm/bookehv_interrupts.S | 25 +------------------------ 1 file changed, 1 insertion(+), 24 deletions(-)