From patchwork Thu Jun 29 12:29:38 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Llu=C3=ADs_Vilanova?= X-Patchwork-Id: 9816649 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 131446020A for ; Thu, 29 Jun 2017 12:30:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0643126E51 for ; Thu, 29 Jun 2017 12:30:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EE8C4286F3; Thu, 29 Jun 2017 12:30:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3961E28703 for ; Thu, 29 Jun 2017 12:30:51 +0000 (UTC) Received: from localhost ([::1]:38974 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dQYaw-0006hh-FY for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Jun 2017 08:30:50 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59156) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dQYa3-0006fk-H0 for qemu-devel@nongnu.org; Thu, 29 Jun 2017 08:29:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dQYZx-0007F4-1l for qemu-devel@nongnu.org; Thu, 29 Jun 2017 08:29:55 -0400 Received: from roura.ac.upc.edu ([147.83.33.10]:37937 helo=roura.ac.upc.es) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dQYZw-0007EK-Gb for qemu-devel@nongnu.org; Thu, 29 Jun 2017 08:29:48 -0400 Received: from correu-2.ac.upc.es (correu-2.ac.upc.es [147.83.30.92]) by roura.ac.upc.es (8.13.8/8.13.8) with ESMTP id v5TCTjgr028011; Thu, 29 Jun 2017 14:29:45 +0200 Received: from localhost (unknown [132.68.50.243]) by correu-2.ac.upc.es (Postfix) with ESMTPSA id DAE785EE; Thu, 29 Jun 2017 14:29:39 +0200 (CEST) From: =?utf-8?b?TGx1w61z?= Vilanova To: qemu-devel@nongnu.org Date: Thu, 29 Jun 2017 15:29:38 +0300 Message-Id: <149873937840.9180.2571251407615859104.stgit@frigg.lan> X-Mailer: git-send-email 2.11.0 In-Reply-To: <149873841036.9180.16600465902334229930.stgit@frigg.lan> References: <149873841036.9180.16600465902334229930.stgit@frigg.lan> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-MIME-Autoconverted: from 8bit to quoted-printable by roura.ac.upc.es id v5TCTjgr028011 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x [fuzzy] X-Received-From: 147.83.33.10 Subject: [Qemu-devel] [PATCH v10 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , Peter Crosthwaite , "Emilio G. Cota" , Stefan Hajnoczi , Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: LluĂ­s Vilanova Reviewed-by: Richard Henderson Reviewed-by: Emilio G. Cota --- accel/tcg/cpu-exec.c | 8 ++++++-- accel/tcg/translate-all.c | 11 +++++++++-- include/exec/exec-all.h | 6 ++++++ include/exec/tb-hash-xx.h | 7 +++++-- include/exec/tb-hash.h | 5 +++-- tcg/tcg-runtime.c | 3 ++- tests/qht-bench.c | 2 +- trace/control-target.c | 1 + trace/control.h | 3 +++ 9 files changed, 36 insertions(+), 10 deletions(-) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 3581618bc0..d84b01d1b8 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -280,6 +280,7 @@ struct tb_desc { CPUArchState *env; tb_page_addr_t phys_page1; uint32_t flags; + uint32_t trace_vcpu_dstate; }; static bool tb_cmp(const void *p, const void *d) @@ -291,6 +292,7 @@ static bool tb_cmp(const void *p, const void *d) tb->page_addr[0] == desc->phys_page1 && tb->cs_base == desc->cs_base && tb->flags == desc->flags && + tb->trace_vcpu_dstate == desc->trace_vcpu_dstate && !atomic_read(&tb->invalid)) { /* check next page if needed */ if (tb->page_addr[1] == -1) { @@ -319,10 +321,11 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc, desc.env = (CPUArchState *)cpu->env_ptr; desc.cs_base = cs_base; desc.flags = flags; + desc.trace_vcpu_dstate = *cpu->trace_dstate; desc.pc = pc; phys_pc = get_page_addr_code(desc.env, pc); desc.phys_page1 = phys_pc & TARGET_PAGE_MASK; - h = tb_hash_func(phys_pc, pc, flags); + h = tb_hash_func(phys_pc, pc, flags, *cpu->trace_dstate); return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h); } @@ -342,7 +345,8 @@ static inline TranslationBlock *tb_find(CPUState *cpu, cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags); tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]); if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base || - tb->flags != flags)) { + tb->flags != flags || + tb->trace_vcpu_dstate != *cpu->trace_dstate)) { tb = tb_htable_lookup(cpu, pc, cs_base, flags); if (!tb) { diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c index 03820e3aeb..71badaaa6b 100644 --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -54,6 +54,7 @@ #include "exec/tb-hash.h" #include "translate-all.h" #include "qemu/bitmap.h" +#include "qemu/error-report.h" #include "qemu/timer.h" #include "qemu/main-loop.h" #include "exec/log.h" @@ -112,6 +113,11 @@ typedef struct PageDesc { #define V_L2_BITS 10 #define V_L2_SIZE (1 << V_L2_BITS) +/* Make sure all possible CPU event bits fit in tb->trace_ds */ +QEMU_BUILD_BUG_ON(CPU_TRACE_DSTATE_MAX_EVENTS > + sizeof(((TranslationBlock *)0)->trace_vcpu_dstate) + * BITS_PER_BYTE); + uintptr_t qemu_host_page_size; intptr_t qemu_host_page_mask; @@ -1102,7 +1108,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr) /* remove the TB from the hash list */ phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK); - h = tb_hash_func(phys_pc, tb->pc, tb->flags); + h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->trace_vcpu_dstate); qht_remove(&tcg_ctx.tb_ctx.htable, tb, h); /* remove the TB from the page list */ @@ -1247,7 +1253,7 @@ static void tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc, } /* add in the hash table */ - h = tb_hash_func(phys_pc, tb->pc, tb->flags); + h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->trace_vcpu_dstate); qht_insert(&tcg_ctx.tb_ctx.htable, tb, h); #ifdef DEBUG_TB_CHECK @@ -1293,6 +1299,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu, tb->cs_base = cs_base; tb->flags = flags; tb->cflags = cflags; + tb->trace_vcpu_dstate = *cpu->trace_dstate; tb->invalid = false; #ifdef CONFIG_PROFILER diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index b0281b000f..d918410944 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -310,6 +310,10 @@ static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, #define USE_DIRECT_JUMP #endif +/** + * TranslationBlock: + * @trace_vcpu_dstate: Per-vCPU dynamic tracing state used to generate this TB. + */ struct TranslationBlock { target_ulong pc; /* simulated PC corresponding to this block (EIP + CS base) */ target_ulong cs_base; /* CS base for this block */ @@ -324,6 +328,8 @@ struct TranslationBlock { #define CF_USE_ICOUNT 0x20000 #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */ + uint32_t trace_vcpu_dstate; + uint16_t invalid; void *tc_ptr; /* pointer to the translated code */ diff --git a/include/exec/tb-hash-xx.h b/include/exec/tb-hash-xx.h index 2c40b5c466..6cd3022c07 100644 --- a/include/exec/tb-hash-xx.h +++ b/include/exec/tb-hash-xx.h @@ -49,7 +49,7 @@ * contiguous in memory. */ static inline -uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e) +uint32_t tb_hash_func6(uint64_t a0, uint64_t b0, uint32_t e, uint32_t f) { uint32_t v1 = TB_HASH_XX_SEED + PRIME32_1 + PRIME32_2; uint32_t v2 = TB_HASH_XX_SEED + PRIME32_2; @@ -78,11 +78,14 @@ uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e) v4 *= PRIME32_1; h32 = rol32(v1, 1) + rol32(v2, 7) + rol32(v3, 12) + rol32(v4, 18); - h32 += 20; + h32 += 24; h32 += e * PRIME32_3; h32 = rol32(h32, 17) * PRIME32_4; + h32 += f * PRIME32_3; + h32 = rol32(h32, 17) * PRIME32_4; + h32 ^= h32 >> 15; h32 *= PRIME32_2; h32 ^= h32 >> 13; diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h index b1fe2d0161..17b5ee0edf 100644 --- a/include/exec/tb-hash.h +++ b/include/exec/tb-hash.h @@ -58,9 +58,10 @@ static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc) #endif /* CONFIG_SOFTMMU */ static inline -uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags) +uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags, + uint32_t trace_vcpu_dstate) { - return tb_hash_func5(phys_pc, pc, flags); + return tb_hash_func6(phys_pc, pc, flags, trace_vcpu_dstate); } #endif diff --git a/tcg/tcg-runtime.c b/tcg/tcg-runtime.c index ec3a34e461..3e23649dd7 100644 --- a/tcg/tcg-runtime.c +++ b/tcg/tcg-runtime.c @@ -158,7 +158,8 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env, target_ulong addr) if (unlikely(!(tb && tb->pc == addr && tb->cs_base == cs_base - && tb->flags == flags))) { + && tb->flags == flags + && tb->trace_vcpu_dstate == *cpu->trace_dstate))) { tb = tb_htable_lookup(cpu, addr, cs_base, flags); if (!tb) { return tcg_ctx.code_gen_epilogue; diff --git a/tests/qht-bench.c b/tests/qht-bench.c index 2afa09d859..11c1cec766 100644 --- a/tests/qht-bench.c +++ b/tests/qht-bench.c @@ -103,7 +103,7 @@ static bool is_equal(const void *obj, const void *userp) static inline uint32_t h(unsigned long v) { - return tb_hash_func5(v, 0, 0); + return tb_hash_func6(v, 0, 0, 0); } /* diff --git a/trace/control-target.c b/trace/control-target.c index d30fa5df75..bc4308021a 100644 --- a/trace/control-target.c +++ b/trace/control-target.c @@ -62,6 +62,7 @@ static void trace_event_synchronize_vcpu_state_dynamic( { bitmap_copy(vcpu->trace_dstate, vcpu->trace_dstate_delayed, CPU_TRACE_DSTATE_MAX_EVENTS); + tb_flush_jmp_cache_all(vcpu); } void trace_event_set_vcpu_state_dynamic(CPUState *vcpu, diff --git a/trace/control.h b/trace/control.h index 4ea53e2986..b931824d60 100644 --- a/trace/control.h +++ b/trace/control.h @@ -165,6 +165,9 @@ void trace_event_set_state_dynamic(TraceEvent *ev, bool state); * Set the dynamic tracing state of an event for the given vCPU. * * Pre-condition: trace_event_get_vcpu_state_static(ev) == true + * + * Note: Changes for execution-time events with the 'tcg' property will not be + * propagated until the next TB is executed (iff executing in TCG mode). */ void trace_event_set_vcpu_state_dynamic(CPUState *vcpu, TraceEvent *ev, bool state);