Message ID | 1590541308-11317-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
---|---
State | New, archived
Series | [v2] x86/svm: retry after unhandled NPT fault if gfn was marked for recalculation
On Wed, May 27, 2020 at 02:01:48AM +0100, Igor Druzhinin wrote:
> If a recalculation NPT fault hasn't been handled explicitly in
> hvm_hap_nested_page_fault() then it's potentially safe to retry -
> US bit has been re-instated in PTE and any real fault would be correctly
> re-raised next time. Do it by allowing hvm_hap_nested_page_fault to
> fall through in that case.
>
> This covers a specific case of migration with vGPU assigned on AMD:
> global log-dirty is enabled and causes immediate recalculation NPT
> fault in MMIO area upon access. This type of fault isn't described
> explicitly in hvm_hap_nested_page_fault (this isn't called on
> EPT misconfig exit on Intel) which results in domain crash.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> ---
> Changes in v2:
> - don't gamble with retrying every recal fault and instead let
>   hvm_hap_nested_page_fault know it's allowed to fall through in default case
> ---
>  xen/arch/x86/hvm/hvm.c        | 6 +++---
>  xen/arch/x86/hvm/svm/svm.c    | 7 ++++++-
>  xen/arch/x86/hvm/vmx/vmx.c    | 2 +-
>  xen/include/asm-x86/hvm/hvm.h | 2 +-
>  4 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 74c9f84..42bd720 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1731,7 +1731,7 @@ void hvm_inject_event(const struct x86_event *event)
>  }
>
>  int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
> -                              struct npfec npfec)
> +                              struct npfec npfec, bool fall_through)
>  {
>      unsigned long gfn = gpa >> PAGE_SHIFT;
>      p2m_type_t p2mt;
> @@ -1740,7 +1740,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      struct vcpu *curr = current;
>      struct domain *currd = curr->domain;
>      struct p2m_domain *p2m, *hostp2m;
> -    int rc, fall_through = 0, paged = 0;
> +    int rc, paged = 0;
>      bool sharing_enomem = false;
>      vm_event_request_t *req_ptr = NULL;
>      bool sync = false;

I would assert that the parameter is never set when running on Intel, since that code path is not supposed to use it.

I also wonder whether it would be possible to avoid passing a parameter, and instead check whether the guest is in logdirty mode on AMD and whether the fault is actually a logdirty-triggered one. That would IMO make it more robust, since the caller doesn't need to care about whether it's a recalc fault or not.

> @@ -1905,7 +1905,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          sync = p2m_mem_access_check(gpa, gla, npfec, &req_ptr);
>
>          if ( !sync )
> -            fall_through = 1;
> +            fall_through = true;
>          else
>          {
>              /* Rights not promoted (aka. sync event), work here is done */
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 46a1aac..8ef3fed 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1695,7 +1695,12 @@ static void svm_do_nested_pgfault(struct vcpu *v,
>      else if ( pfec & NPT_PFEC_in_gpt )
>          npfec.kind = npfec_kind_in_gpt;
>
> -    ret = hvm_hap_nested_page_fault(gpa, ~0ul, npfec);
> +    /*
> +     * US bit being set in error code indicates P2M type recalculation has
> +     * just been done meaning that it's possible there is nothing else to handle
> +     * and we can just fall through and retry.
> +     */
> +    ret = hvm_hap_nested_page_fault(gpa, ~0ul, npfec, !!(pfec & PFEC_user_mode));
>
>      if ( tb_init_done )
>      {
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 11a4dd9..10f1eeb 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3398,7 +3398,7 @@ static void ept_handle_violation(ept_qual_t q, paddr_t gpa)
>      else
>          gla = ~0ull;
>
> -    ret = hvm_hap_nested_page_fault(gpa, gla, npfec);
> +    ret = hvm_hap_nested_page_fault(gpa, gla, npfec, false);
>      switch ( ret )
>      {
>      case 0:         // Unhandled L1 EPT violation
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 1eb377d..03e5f1d 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -329,7 +329,7 @@ void hvm_fast_singlestep(struct vcpu *v, uint16_t p2midx);
>
>  struct npfec;
>  int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
> -                              struct npfec npfec);
> +                              struct npfec npfec, bool fall_through);

I would rename fall_through to recalc, recalculate or misconfig. It's not easy to understand the meaning of the parameter when looking at the function prototype.

Thanks,

Roger.