From patchwork Tue Apr 5 15:32:24 2016
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 8753151
From: Alex Bennée
To: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com,
    a.rigo@virtualopensystems.com, serge.fdrv@gmail.com, cota@braap.org
Cc: peter.maydell@linaro.org, claudio.fontana@huawei.com, Peter Crosthwaite,
    jan.kiszka@siemens.com, mark.burton@greensocs.com, qemu-devel@nongnu.org,
    pbonzini@redhat.com, Alex Bennée, rth@twiddle.net
Date: Tue, 5 Apr 2016 16:32:24 +0100
Message-Id: <1459870344-16773-12-git-send-email-alex.bennee@linaro.org>
In-Reply-To: <1459870344-16773-1-git-send-email-alex.bennee@linaro.org>
References: <1459870344-16773-1-git-send-email-alex.bennee@linaro.org>
Subject: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

From: KONRAD Frederic

This allows the user to switch on multi-threaded behaviour and spawn a
thread per vCPU. A simple test like:

  ./arm/run ./arm/locking-test.flat -smp 4 -tcg mttcg=on

will now use 4 vCPU threads and produce an expected FAIL (instead of the
unexpected PASS), as the default mode of the test has no protection when
incrementing a shared variable.

However, we still default to a single thread for all vCPUs, as the
individual front-ends and back-ends need additional fixes to safely
support:
  - atomic behaviour
  - tb invalidation
  - memory ordering

The function default_mttcg_enabled can be tweaked as support is added.

As the assumptions about tcg_current_cpu are no longer relevant to the
single-threaded kick routine, we need to save the current CPU information
somewhere else for the timer to use.

Signed-off-by: KONRAD Frederic
Signed-off-by: Paolo Bonzini
[AJB: Some fixes, conditionally, commit rewording]
Signed-off-by: Alex Bennée
---
v1 (ajb):
  - fix merge conflicts
  - maintain single-thread approach
v2:
  - re-base fixes (no longer has tb_find_fast lock tweak ahead)
  - remove bogus break condition on cpu->stop/stopped
  - only process exiting cpus' exit_request
  - handle the all-cpus-idle case (fixes shutdown issues)
  - sleep on EXCP_HALTED in mttcg mode (prevents crash on start-up)
  - move icount timer into helper
---
 cpu-exec-common.c       |   1 -
 cpu-exec.c              |  15 ----
 cpus.c                  | 216 +++++++++++++++++++++++++++++++++---------------
 include/exec/exec-all.h |   4 -
 translate-all.c         |   8 --
 5 files changed, 150 insertions(+), 94 deletions(-)

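For reviewers following along: qemu_tcg_mttcg_enabled() and default_mttcg_enabled
are not defined by this patch; they are assumed to come from the option handling
added elsewhere in the series (the mttcg=on switch used above). A minimal sketch
of the shape this patch assumes they have -- the names are real, but the bodies
below are illustrative only, not part of the patch:

    #include <stdbool.h>   /* QEMU code gets bool via qemu/osdep.h */

    /* Illustrative sketch only -- the real helpers live in the option
     * patch elsewhere in this series. */
    static bool mttcg_enabled;

    /* Default: stay single-threaded until front-ends and back-ends have
     * been audited for atomics, TB invalidation and memory ordering. */
    static bool default_mttcg_enabled(void)
    {
        return false;
    }

    bool qemu_tcg_mttcg_enabled(void)
    {
        return mttcg_enabled;
    }

qemu_tcg_init_vcpu() below keys off qemu_tcg_mttcg_enabled() to pick between
one shared TCG thread and one thread per vCPU.
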
diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index 1b1731c..3d7eaa3 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -23,7 +23,6 @@
 #include "exec/memory-internal.h"
 
 bool exit_request;
-CPUState *tcg_current_cpu;
 
 /* exit the current TB from a signal handler. The host registers are
    restored in a state compatible with the CPU emulator
diff --git a/cpu-exec.c b/cpu-exec.c
index f558508..42cec05 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -292,7 +292,6 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
         goto found;
     }
 
-#ifdef CONFIG_USER_ONLY
     /* mmap_lock is needed by tb_gen_code, and mmap_lock must be
      * taken outside tb_lock. Since we're momentarily dropping
      * tb_lock, there's a chance that our desired tb has been
@@ -306,15 +305,12 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
         mmap_unlock();
         goto found;
     }
-#endif
 
     /* if no translated code available, then translate it now */
     cpu->tb_invalidated_flag = false;
     tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
-#ifdef CONFIG_USER_ONLY
     mmap_unlock();
-#endif
 
 found:
     /* we add the TB in the virtual pc hash table */
@@ -388,13 +384,8 @@ int cpu_exec(CPUState *cpu)
         cpu->halted = 0;
     }
 
-    atomic_mb_set(&tcg_current_cpu, cpu);
     rcu_read_lock();
 
-    if (unlikely(atomic_mb_read(&exit_request))) {
-        cpu->exit_request = 1;
-    }
-
     cc->cpu_exec_enter(cpu);
 
     /* Calculate difference between guest clock and host clock.
@@ -515,7 +506,6 @@ int cpu_exec(CPUState *cpu)
             }
             if (unlikely(cpu->exit_request || replay_has_interrupt())) {
-                cpu->exit_request = 0;
                 cpu->exception_index = EXCP_INTERRUPT;
                 cpu_loop_exit(cpu);
             }
@@ -629,10 +619,5 @@ int cpu_exec(CPUState *cpu)
     cc->cpu_exec_exit(cpu);
     rcu_read_unlock();
 
-    /* fail safe : never use current_cpu outside cpu_exec() */
-    current_cpu = NULL;
-
-    /* Does not need atomic_mb_set because a spurious wakeup is okay. */
-    atomic_set(&tcg_current_cpu, NULL);
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index 02fae13..f7c7359 100644
--- a/cpus.c
+++ b/cpus.c
@@ -966,10 +966,7 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
     qemu_cpu_kick(cpu);
     while (!atomic_mb_read(&wi.done)) {
-        CPUState *self_cpu = current_cpu;
-
         qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
-        current_cpu = self_cpu;
     }
 }
@@ -1031,13 +1028,13 @@ static void flush_queued_work(CPUState *cpu)
 
 static void qemu_wait_io_event_common(CPUState *cpu)
 {
+    atomic_mb_set(&cpu->thread_kicked, false);
     if (cpu->stop) {
         cpu->stop = false;
         cpu->stopped = true;
         qemu_cond_broadcast(&qemu_pause_cond);
     }
     flush_queued_work(cpu);
-    cpu->thread_kicked = false;
 }
 
 static void qemu_tcg_wait_io_event(CPUState *cpu)
@@ -1046,9 +1043,7 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
-    CPU_FOREACH(cpu) {
-        qemu_wait_io_event_common(cpu);
-    }
+    qemu_wait_io_event_common(cpu);
 }
 
 static void qemu_kvm_wait_io_event(CPUState *cpu)
@@ -1115,6 +1110,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;
+    current_cpu = cpu;
 
     sigemptyset(&waitset);
     sigaddset(&waitset, SIG_IPI);
@@ -1123,9 +1119,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     cpu->created = true;
     qemu_cond_signal(&qemu_cpu_cond);
 
-    current_cpu = cpu;
     while (1) {
-        current_cpu = NULL;
         qemu_mutex_unlock_iothread();
         do {
             int sig;
@@ -1136,7 +1130,6 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
             exit(1);
         }
         qemu_mutex_lock_iothread();
-        current_cpu = cpu;
         qemu_wait_io_event_common(cpu);
     }
 
@@ -1153,32 +1146,52 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
  * elsewhere.
  */
 static int tcg_cpu_exec(CPUState *cpu);
-static void qemu_cpu_kick_no_halt(void);
+
+struct kick_info {
+    QEMUTimer *timer;
+    CPUState *cpu;
+};
 
 static void kick_tcg_thread(void *opaque)
 {
-    QEMUTimer *self = *(QEMUTimer **) opaque;
-    timer_mod(self,
+    struct kick_info *info = (struct kick_info *) opaque;
+    CPUState *cpu = atomic_mb_read(&info->cpu);
+
+    timer_mod(info->timer,
               qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
               NANOSECONDS_PER_SECOND / 10);
-    qemu_cpu_kick_no_halt();
+
+    if (cpu) {
+        cpu_exit(cpu);
+    }
 }
 
-static void *qemu_tcg_cpu_thread_fn(void *arg)
+static void handle_icount_deadline(void)
+{
+    if (use_icount) {
+        int64_t deadline =
+            qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+
+        if (deadline == 0) {
+            qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+        }
+    }
+}
+
+static void *qemu_tcg_single_cpu_thread_fn(void *arg)
 {
+    struct kick_info info;
     CPUState *cpu = arg;
-    QEMUTimer *kick_timer;
 
     rcu_register_thread();
 
     qemu_mutex_lock_iothread();
     qemu_thread_get_self(cpu->thread);
 
-    CPU_FOREACH(cpu) {
-        cpu->thread_id = qemu_get_thread_id();
-        cpu->created = true;
-        cpu->can_do_io = 1;
-    }
+    cpu->thread_id = qemu_get_thread_id();
+    cpu->created = true;
+    cpu->can_do_io = 1;
+    current_cpu = cpu;
     qemu_cond_signal(&qemu_cpu_cond);
 
     /* wait for initial kick-off after machine start */
@@ -1193,18 +1206,24 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
     /* Set to kick if we have to do more than one vCPU */
     if (CPU_NEXT(first_cpu)) {
-        kick_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, kick_tcg_thread, &kick_timer);
-        timer_mod(kick_timer,
+        info.timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, kick_tcg_thread, &info);
+        info.cpu = NULL;
+        smp_wmb();
+        timer_mod(info.timer,
                   qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
                   NANOSECONDS_PER_SECOND / 10);
     }
 
     /* process any pending work */
-    atomic_mb_set(&exit_request, 1);
+    CPU_FOREACH(cpu) {
+        atomic_mb_set(&cpu->exit_request, 1);
+    }
 
     cpu = first_cpu;
 
     while (1) {
+        bool nothing_ran = true;
+
         /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
         qemu_account_warp_timer();
 
@@ -1212,34 +1231,107 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             cpu = first_cpu;
         }
 
-        for (; cpu != NULL && !exit_request; cpu = CPU_NEXT(cpu)) {
+        for (; cpu != NULL && !cpu->exit_request; cpu = CPU_NEXT(cpu)) {
+            atomic_mb_set(&info.cpu, cpu);
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
                               (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
 
             if (cpu_can_run(cpu)) {
                 int r = tcg_cpu_exec(cpu);
+                nothing_ran = false;
                 if (r == EXCP_DEBUG) {
                     cpu_handle_guest_debug(cpu);
                     break;
                 }
-            } else if (cpu->stop || cpu->stopped) {
-                break;
             }
         } /* for cpu.. */
 
-        /* Pairs with smp_wmb in qemu_cpu_kick.  */
-        atomic_mb_set(&exit_request, 0);
+        atomic_mb_set(&info.cpu, NULL);
+
+        handle_icount_deadline();
 
-        if (use_icount) {
-            int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+        /* We exit in one of three conditions:
+         *  - cpu is set, because exit_request is true
+         *  - cpu is not set, because we have looped around
+         *  - cpu is not set and nothing ran
+         */
 
-            if (deadline == 0) {
-                qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+        if (cpu) {
+            g_assert(cpu->exit_request);
+            /* Pairs with smp_wmb in qemu_cpu_kick. */
+            atomic_mb_set(&cpu->exit_request, 0);
+            qemu_tcg_wait_io_event(cpu);
+        } else if (nothing_ran) {
+            while (all_cpu_threads_idle()) {
+                qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex);
             }
         }
-        qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
+    }
+
+    return NULL;
+}
+
+/* Multi-threaded TCG
+ *
+ * In the multi-threaded case each vCPU has its own thread. The TLS
+ * variable current_cpu can be used deep in the code to find the
+ * current CPUState for a given thread.
+ */
+
+static void *qemu_tcg_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu = arg;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id = qemu_get_thread_id();
+    cpu->created = true;
+    cpu->can_do_io = 1;
+    current_cpu = cpu;
+    qemu_cond_signal(&qemu_cpu_cond);
+
+    /* process any pending work */
+    atomic_mb_set(&cpu->exit_request, 1);
+
+    while (1) {
+        bool sleep = false;
+
+        if (cpu_can_run(cpu)) {
+            int r = tcg_cpu_exec(cpu);
+            switch (r) {
+            case EXCP_DEBUG:
+                cpu_handle_guest_debug(cpu);
+                break;
+            case EXCP_HALTED:
+                /* during start-up the vCPU is reset and the thread is
+                 * kicked several times. If we don't ensure we go back
+                 * to sleep in the halted state we won't cleanly
+                 * start-up when the vCPU is enabled.
+                 */
+                sleep = true;
+                break;
+            default:
+                /* Ignore everything else? */
+                break;
+            }
+        } else {
+            sleep = true;
+        }
+
+        handle_icount_deadline();
+
+        if (sleep) {
+            qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
+        }
+
+        atomic_mb_set(&cpu->exit_request, 0);
+        qemu_tcg_wait_io_event(cpu);
     }
 
     return NULL;
@@ -1264,24 +1356,11 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 #endif
 }
 
-static void qemu_cpu_kick_no_halt(void)
-{
-    CPUState *cpu;
-    /* Ensure whatever caused the exit has reached the CPU threads before
-     * writing exit_request.
-     */
-    atomic_mb_set(&exit_request, 1);
-    cpu = atomic_mb_read(&tcg_current_cpu);
-    if (cpu) {
-        cpu_exit(cpu);
-    }
-}
-
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
     if (tcg_enabled()) {
-        qemu_cpu_kick_no_halt();
+        cpu_exit(cpu);
     } else {
         qemu_cpu_kick_thread(cpu);
     }
@@ -1347,13 +1426,6 @@ void pause_all_vcpus(void)
 
     if (qemu_in_vcpu_thread()) {
         cpu_stop_current();
-        if (!kvm_enabled()) {
-            CPU_FOREACH(cpu) {
-                cpu->stop = false;
-                cpu->stopped = true;
-            }
-            return;
-        }
     }
 
     while (!all_vcpus_paused()) {
@@ -1387,29 +1459,41 @@ void resume_all_vcpus(void)
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
-    static QemuCond *tcg_halt_cond;
-    static QemuThread *tcg_cpu_thread;
+    static QemuCond *single_tcg_halt_cond;
+    static QemuThread *single_tcg_cpu_thread;
 
-    /* share a single thread for all cpus with TCG */
-    if (!tcg_cpu_thread) {
+    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
         cpu->thread = g_malloc0(sizeof(QemuThread));
         cpu->halt_cond = g_malloc0(sizeof(QemuCond));
         qemu_cond_init(cpu->halt_cond);
-        tcg_halt_cond = cpu->halt_cond;
-        snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+
+        if (qemu_tcg_mttcg_enabled()) {
+            /* create a thread per vCPU with TCG (MTTCG) */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
-        qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
-                           cpu, QEMU_THREAD_JOINABLE);
+
+            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+        } else {
+            /* share a single thread for all cpus with TCG */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
+            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_single_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+            single_tcg_halt_cond = cpu->halt_cond;
+            single_tcg_cpu_thread = cpu->thread;
+        }
 #ifdef _WIN32
         cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
         while (!cpu->created) {
            qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
         }
-        tcg_cpu_thread = cpu->thread;
     } else {
-        cpu->thread = tcg_cpu_thread;
-        cpu->halt_cond = tcg_halt_cond;
+        /* For non-MTTCG cases we share the thread */
+        cpu->thread = single_tcg_cpu_thread;
+        cpu->halt_cond = single_tcg_halt_cond;
     }
 }
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 60716ae..cde4b7a 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -479,8 +479,4 @@ bool memory_region_is_unassigned(MemoryRegion *mr);
 /* vl.c */
 extern int singlestep;
 
-/* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
-extern CPUState *tcg_current_cpu;
-extern bool exit_request;
-
 #endif
diff --git a/translate-all.c b/translate-all.c
index 0c377bb..8e70583 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -122,36 +122,28 @@ static void *l1_map[V_L1_SIZE];
 TCGContext tcg_ctx;
 
 /* translation block context */
-#ifdef CONFIG_USER_ONLY
 __thread int have_tb_lock;
-#endif
 
 void tb_lock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert(!have_tb_lock);
     qemu_mutex_lock(&tcg_ctx.tb_ctx.tb_lock);
     have_tb_lock++;
-#endif
 }
 
 void tb_unlock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert(have_tb_lock);
     have_tb_lock--;
     qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
-#endif
 }
 
 void tb_lock_reset(void)
 {
-#ifdef CONFIG_USER_ONLY
     if (have_tb_lock) {
         qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
         have_tb_lock = 0;
     }
-#endif
 }
 
 static TranslationBlock *tb_find_pc(uintptr_t tc_ptr);
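
A closing note on the kick-timer change: with tcg_current_cpu gone, the
single-threaded loop and kick_tcg_thread() communicate only through the
kick_info slot -- publish the vCPU about to run, then kick whatever is
currently published. A stripped-down, QEMU-free analogue of that handshake
(C11 atomics standing in for atomic_mb_set/atomic_mb_read; purely
illustrative, not part of the patch):

    #include <stdatomic.h>
    #include <stddef.h>

    /* Stand-in for CPUState; only pointer identity matters here. */
    typedef struct vcpu vcpu;

    struct kick_slot {
        vcpu *_Atomic cpu;              /* vCPU currently executing, or NULL */
    };

    /* Round-robin loop side: publish before running, clear afterwards. */
    void run_one(struct kick_slot *slot, vcpu *cpu, void (*exec_fn)(vcpu *))
    {
        atomic_store(&slot->cpu, cpu);  /* like atomic_mb_set(&info.cpu, cpu) */
        exec_fn(cpu);                   /* tcg_cpu_exec() in the patch */
        atomic_store(&slot->cpu, NULL);
    }

    /* Timer side: kick whichever vCPU is currently published, if any. */
    void kick(struct kick_slot *slot, void (*exit_fn)(vcpu *))
    {
        vcpu *cpu = atomic_load(&slot->cpu);  /* like atomic_mb_read() */
        if (cpu) {
            exit_fn(cpu);               /* cpu_exit() in the patch */
        }
    }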