From patchwork Mon Aug 21 15:57:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jinoh Kang X-Patchwork-Id: 13359610 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2FEB2EE49A6 for ; Mon, 21 Aug 2023 16:02:45 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.587695.919061 (Exim 4.92) (envelope-from ) id 1qY7MC-00037f-6p; Mon, 21 Aug 2023 16:02:20 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 587695.919061; Mon, 21 Aug 2023 16:02:20 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1qY7MC-00037Y-45; Mon, 21 Aug 2023 16:02:20 +0000 Received: by outflank-mailman (input) for mailman id 587695; Mon, 21 Aug 2023 16:02:18 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1qY7MA-00037S-Mc for xen-devel@lists.xenproject.org; Mon, 21 Aug 2023 16:02:18 +0000 Received: from mail-qv1-xf30.google.com (mail-qv1-xf30.google.com [2607:f8b0:4864:20::f30]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 191a2662-403c-11ee-8782-cb3800f73035; Mon, 21 Aug 2023 18:02:16 +0200 (CEST) Received: by mail-qv1-xf30.google.com with SMTP id 6a1803df08f44-64f383be0d4so11292386d6.3 for ; Mon, 21 Aug 2023 09:02:16 -0700 (PDT) Received: from [10.137.0.57] ([14.33.99.107]) by smtp.gmail.com with ESMTPSA id w24-20020aa78598000000b0068892a66b5csm6326109pfn.77.2023.08.21.08.57.01 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 21 Aug 2023 08:57:04 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 191a2662-403c-11ee-8782-cb3800f73035 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1692633735; x=1693238535; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=rcvI6BZjybNsQpnXFA/kOhrlUplNszr0VzIfXd6lzuc=; b=o0AEj9FV+uPHAhr5gDF077SdH56OuNRVmYupSAxDBzuFAMrSv7QmEDSgbdavflm1Po 1RZXL00JNYwdUsmPudsPhG09nqrfZ1ipfaSL8gKztUPa6SiBkIHHVOSh67C1qAnE8Rnj aunrj4pb1EQiNfdJHWv5I818JFS0gFYi8E0G/51tWaYKrecjZGsGupK2OiN/Zgto4ToQ LINvd0KwyOFmpUQceQhoWpseFe1R35sgdt0DGd4AILC2cZ8EaMoo4oGqXwJC7BpobiX2 VpXken/E6+41S0Sn6JzO/IuUFQYdRXT+C/k9WI+tc+YtvNiZTyxFA5FBZgoRVOER3aLs 1eDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692633735; x=1693238535; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:subject:from:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=rcvI6BZjybNsQpnXFA/kOhrlUplNszr0VzIfXd6lzuc=; b=RHdbw0a2hI94xhTiG5qoH8x2ShlXaKZkViu+EyXWiBNcd18+uFAhoLefZfJ5pgj94N iXgfZfHNHFYG1zLNoQ9UljrCNQEPAfEfNeQuVzr6oqnhvzi4CvV2CAPdep85fQmiIioz 5cow7LXDgnZLfn2SjlF9UC3aVa7hIlpqR9hE/4xxwRy/RvLveKt2rmyKaKO2e08orht+ TBNve4B2ok4lZ8yA+Ogh6AIropV/MP1C0W47evmNPs4O1+xj7cnJR+zA2X/S3czW34IM F+MsMyqp29pAbf8v7wutXEkgHCsAqr4PTeCqTvboJWbcoPLpkVst7m0upPoosP0QdWGD SN6w== X-Gm-Message-State: AOJu0YxRT2nakYuGF3A1jrPoiOhP79d2oboLf5yW7fV0c7NAY2k8+rlH phUZt713FqWk+Tb7zA7hZYx0/DeQs64Ghw== X-Google-Smtp-Source: AGHT+IG9nzyYO7XWBCTfZ/MQCvZngajnf3c6sN3apWr2WuD5z1MrMNP2t6ssgvevRZYfN+Lfykthmw== X-Received: by 2002:a05:6870:470e:b0:1c0:1caf:3324 with SMTP id b14-20020a056870470e00b001c01caf3324mr9797661oaq.44.1692633425056; Mon, 21 Aug 2023 08:57:05 -0700 (PDT) Message-ID: <49f8d2f3-c468-774c-11d1-04a08169620d@gmail.com> Date: Tue, 22 Aug 2023 00:57:00 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 From: Jinoh Kang Subject: [PATCH 2/5] x86/emul: Add pending_dbg field to x86_event To: xen-devel@lists.xenproject.org, Andrew Cooper Cc: Jan Beulich , Wei Liu , =?utf-8?q?Roger_P?= =?utf-8?q?au_Monn=C3=A9?= , Jun Nakajima , Kevin Tian , Tim Deegan References: <5fa229fc-9514-abc6-5e72-2447a2c637d0@gmail.com> Content-Language: en-US In-Reply-To: <5fa229fc-9514-abc6-5e72-2447a2c637d0@gmail.com> From: Andrew Cooper All #DB exceptions result in an update of %dr6, but this isn't captured in Xen's handling. PV guests generally work by modifying %dr6 before raising #DB, whereas HVM guests do nothing and have a single-step special case in the lowest levels of {vmx,svm}_inject_event(). All of this is buggy, but in particular, task switches with the trace flag never end up signalling BT in %dr6. To begin resolving this issue, add a new pending_dbg field to x86_event (unioned with cr2 to avoid taking any extra space), and introduce {pv,hvm}_inject_debug_exn() helpers to replace the current callers using {pv,hvm}_inject_hw_exception(). A key property is that pending_dbg is taken with positive polarity to deal with RTM sensibly. Most callers pass in a constant, but callers passing in a hardware %dr6 value need to xor the value with X86_DR6_DEFAULT to flip the polarity of RTM and reserved fields. For PV guests, move the ad-hoc updating of %dr6 into pv_inject_event(). This in principle breaks the handing of RTM in do_debug(), but PV guests can't actually enable MSR_DEBUGCTL.RTM yet, so this doesn't matter in practice. Signed-off-by: Jinoh Kang --- CC: Jan Beulich CC: Wei Liu CC: Roger Pau Monné CC: Jun Nakajima CC: Kevin Tian CC: Tim Deegan --- xen/arch/x86/hvm/emulate.c | 3 ++- xen/arch/x86/hvm/hvm.c | 2 +- xen/arch/x86/hvm/svm/svm.c | 9 ++++++--- xen/arch/x86/hvm/vmx/vmx.c | 13 ++++++++----- xen/arch/x86/include/asm/debugreg.h | 3 +++ xen/arch/x86/include/asm/domain.h | 12 ++++++++++++ xen/arch/x86/include/asm/hvm/hvm.h | 15 ++++++++++++++- xen/arch/x86/mm/shadow/multi.c | 5 +++-- xen/arch/x86/pv/emul-priv-op.c | 11 +++++------ xen/arch/x86/pv/emulate.c | 6 ++---- xen/arch/x86/pv/ro-page-fault.c | 3 ++- xen/arch/x86/pv/traps.c | 16 ++++++++++++---- xen/arch/x86/traps.c | 6 +----- xen/arch/x86/x86_emulate/x86_emulate.h | 5 ++++- 14 files changed, 75 insertions(+), 34 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 9b6e4c8bc6..129403ad90 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -2673,7 +2674,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt, } if ( hvmemul_ctxt->ctxt.retire.singlestep ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BS); new_intr_shadow = hvmemul_ctxt->intr_shadow; diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 66ead0b878..c3d46841c2 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3229,7 +3229,7 @@ void hvm_task_switch( } if ( (tss.trace & 1) && !exn_raised ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BT); out: hvm_unmap_entry(optss_desc); diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index 01dd592d9b..ac3123c56f 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -96,7 +96,7 @@ void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len) curr->arch.hvm.svm.vmcb->int_stat.intr_shadow = 0; if ( regs->eflags & X86_EFLAGS_TF ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BS); } static void cf_check svm_cpu_down(void) @@ -2755,7 +2755,10 @@ void svm_vmexit_handler(void) goto unexpected_exit_type; if ( !rc ) hvm_inject_exception(X86_EXC_DB, - trap_type, insn_len, X86_EVENT_NO_EC); + trap_type, insn_len, X86_EVENT_NO_EC, + exit_reason == VMEXIT_ICEBP ? 0 : + /* #DB - Hardware already updated dr6. */ + vmcb_get_dr6(vmcb) ^ X86_DR6_DEFAULT); } else domain_pause_for_debugger(); @@ -2785,7 +2788,7 @@ void svm_vmexit_handler(void) if ( !rc ) hvm_inject_exception(X86_EXC_BP, X86_EVENTTYPE_SW_EXCEPTION, - insn_len, X86_EVENT_NO_EC); + insn_len, X86_EVENT_NO_EC, 0 /* N/A */); } break; diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 7ec44018d4..716115332b 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -3068,7 +3068,7 @@ void update_guest_eip(void) } if ( regs->eflags & X86_EFLAGS_TF ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BS); } static void cf_check vmx_fpu_dirty_intercept(void) @@ -3911,7 +3911,7 @@ static int vmx_handle_eoi_write(void) * It is the callers responsibility to ensure that this function is only used * in the context of an appropriate vmexit. */ -static void vmx_propagate_intr(unsigned long intr) +static void vmx_propagate_intr(unsigned long intr, unsigned long pending_dbg) { struct x86_event event = { .vector = MASK_EXTR(intr, INTR_INFO_VECTOR_MASK), @@ -3935,6 +3935,9 @@ static void vmx_propagate_intr(unsigned long intr) else event.insn_len = 0; + if ( event.vector == X86_EXC_DB ) + event.pending_dbg = pending_dbg; + hvm_inject_event(&event); } @@ -4300,7 +4303,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) if ( rc < 0 ) goto exit_and_crash; if ( !rc ) - vmx_propagate_intr(intr_info); + vmx_propagate_intr(intr_info, exit_qualification); } else domain_pause_for_debugger(); @@ -4321,7 +4324,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) if ( rc < 0 ) goto exit_and_crash; if ( !rc ) - vmx_propagate_intr(intr_info); + vmx_propagate_intr(intr_info, 0 /* N/A */); } else { @@ -4361,7 +4364,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) break; case X86_EXC_AC: HVMTRACE_1D(TRAP, vector); - vmx_propagate_intr(intr_info); + vmx_propagate_intr(intr_info, 0 /* N/A */); break; case X86_EXC_NMI: if ( MASK_EXTR(intr_info, INTR_INFO_INTR_TYPE_MASK) != diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h index 8be60910b4..a1df74c02e 100644 --- a/xen/arch/x86/include/asm/debugreg.h +++ b/xen/arch/x86/include/asm/debugreg.h @@ -83,6 +83,9 @@ asm volatile ( "mov %%db" #reg ",%0" : "=r" (__val) ); \ __val; \ }) + +struct vcpu; + long set_debugreg(struct vcpu *, unsigned int reg, unsigned long value); void activate_debugregs(const struct vcpu *); diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h index c2d9fc333b..eba11adf33 100644 --- a/xen/arch/x86/include/asm/domain.h +++ b/xen/arch/x86/include/asm/domain.h @@ -729,6 +729,18 @@ static inline void pv_inject_hw_exception(unsigned int vector, int errcode) pv_inject_event(&event); } +static inline void pv_inject_debug_exn(unsigned long pending_dbg) +{ + struct x86_event event = { + .vector = X86_EXC_DB, + .type = X86_EVENTTYPE_HW_EXCEPTION, + .error_code = X86_EVENT_NO_EC, + .pending_dbg = pending_dbg, + }; + + pv_inject_event(&event); +} + static inline void pv_inject_page_fault(int errcode, unsigned long cr2) { const struct x86_event event = { diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h index 6d53713fc3..43989f1681 100644 --- a/xen/arch/x86/include/asm/hvm/hvm.h +++ b/xen/arch/x86/include/asm/hvm/hvm.h @@ -503,13 +503,14 @@ hvm_get_cpl(struct vcpu *v) static inline void hvm_inject_exception( unsigned int vector, unsigned int type, - unsigned int insn_len, int error_code) + unsigned int insn_len, int error_code, unsigned long extra) { struct x86_event event = { .vector = vector, .type = type, .insn_len = insn_len, .error_code = error_code, + .cr2 = extra, /* Any union field will do. */ }; hvm_inject_event(&event); @@ -526,6 +527,18 @@ static inline void hvm_inject_hw_exception(unsigned int vector, int errcode) hvm_inject_event(&event); } +static inline void hvm_inject_debug_exn(unsigned long pending_dbg) +{ + struct x86_event event = { + .vector = X86_EXC_DB, + .type = X86_EVENTTYPE_HW_EXCEPTION, + .error_code = X86_EVENT_NO_EC, + .pending_dbg = pending_dbg, + }; + + hvm_inject_event(&event); +} + static inline void hvm_inject_page_fault(int errcode, unsigned long cr2) { struct x86_event event = { diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c index cf74fdf5dd..6056626912 100644 --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -2788,7 +2789,7 @@ static int cf_check sh_page_fault( #endif if ( emul_ctxt.ctxt.retire.singlestep ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BS); #if GUEST_PAGING_LEVELS == 3 /* PAE guest */ /* @@ -2829,7 +2830,7 @@ static int cf_check sh_page_fault( TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED); if ( emul_ctxt.ctxt.retire.singlestep ) - hvm_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + hvm_inject_debug_exn(X86_DR6_BS); break; /* Don't emulate again if we failed! */ } diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c index 142bc4818c..72d0514e74 100644 --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1360,12 +1360,11 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs) case X86EMUL_OKAY: if ( ctxt.ctxt.retire.singlestep ) ctxt.bpmatch |= DR_STEP; - if ( ctxt.bpmatch ) - { - curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE; - if ( !(curr->arch.pv.trap_bounce.flags & TBF_EXCEPTION) ) - pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); - } + + if ( ctxt.bpmatch && + !(curr->arch.pv.trap_bounce.flags & TBF_EXCEPTION) ) + pv_inject_debug_exn(ctxt.bpmatch); + /* fall through */ case X86EMUL_RETRY: return EXCRET_fault_fixed; diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c index e7a1c0a2cc..aa8af96c30 100644 --- a/xen/arch/x86/pv/emulate.c +++ b/xen/arch/x86/pv/emulate.c @@ -71,11 +71,9 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip) { regs->rip = rip; regs->eflags &= ~X86_EFLAGS_RF; + if ( regs->eflags & X86_EFLAGS_TF ) - { - current->arch.dr6 |= DR_STEP | DR_STATUS_RESERVED_ONE; - pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); - } + pv_inject_debug_exn(X86_DR6_BS); } uint64_t pv_get_reg(struct vcpu *v, unsigned int reg) diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c index cad28ef928..50c37fbd25 100644 --- a/xen/arch/x86/pv/ro-page-fault.c +++ b/xen/arch/x86/pv/ro-page-fault.c @@ -9,6 +9,7 @@ */ #include +#include #include #include "emulate.h" @@ -390,7 +391,7 @@ int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs) /* Fallthrough */ case X86EMUL_OKAY: if ( ctxt.retire.singlestep ) - pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + pv_inject_debug_exn(X86_DR6_BS); /* Fallthrough */ case X86EMUL_RETRY: diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c index 74f333da7e..4f5641a47c 100644 --- a/xen/arch/x86/pv/traps.c +++ b/xen/arch/x86/pv/traps.c @@ -13,6 +13,7 @@ #include #include +#include #include #include #include @@ -50,9 +51,9 @@ void pv_inject_event(const struct x86_event *event) tb->cs = ti->cs; tb->eip = ti->address; - if ( event->type == X86_EVENTTYPE_HW_EXCEPTION && - vector == X86_EXC_PF ) + switch ( vector | -(event->type == X86_EVENTTYPE_SW_INTERRUPT) ) { + case X86_EXC_PF: curr->arch.pv.ctrlreg[2] = event->cr2; arch_set_cr2(curr, event->cr2); @@ -62,9 +63,16 @@ void pv_inject_event(const struct x86_event *event) error_code |= PFEC_user_mode; trace_pv_page_fault(event->cr2, error_code); - } - else + break; + + case X86_EXC_DB: + curr->arch.dr6 |= event->pending_dbg; + /* Fallthrough */ + + default: trace_pv_trap(vector, regs->rip, use_error_code, error_code); + break; + } if ( use_error_code ) { diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index a898e1f2d7..640b60e0e5 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -1989,17 +1989,13 @@ void do_debug(struct cpu_user_regs *regs) return; } - /* Save debug status register where guest OS can peek at it */ - v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT); - v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT); - if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached ) { domain_pause_for_debugger(); return; } - pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); + pv_inject_debug_exn(dr6 ^ X86_DR6_DEFAULT); } void do_entry_CP(struct cpu_user_regs *regs) diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h index 698750267a..e348e3c1d3 100644 --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -78,7 +78,10 @@ struct x86_event { uint8_t type; /* X86_EVENTTYPE_* */ uint8_t insn_len; /* Instruction length */ int32_t error_code; /* X86_EVENT_NO_EC if n/a */ - unsigned long cr2; /* Only for X86_EXC_PF h/w exception */ + union { + unsigned long cr2; /* #PF */ + unsigned long pending_dbg; /* #DB (new DR6 bits, positive polarity) */ + }; }; /*