From patchwork Tue Mar 21 03:34:32 2017
X-Patchwork-Submitter: "Herongguang (Stephen)"
X-Patchwork-Id: 9635939
To: Paolo Bonzini, "kvm@vger.kernel.org", "weidong.huang@huawei.com >> Huangweidong (C)"
References: <58CFE56E.9090303@huawei.com>
From: "Herongguang (Stephen)"
Message-ID: <58D09F48.9010809@huawei.com>
Date: Tue, 21 Mar 2017 11:34:32 +0800
In-Reply-To: <58CFE56E.9090303@huawei.com>
Subject: Re: [Qemu-devel] [BUG/RFC] INIT IPI lost when VM starts

Let me clarify. The time sequence is as follows: qemu handles the
‘query-cpus’ qmp command, and vcpu 1 (and vcpu 0) gets its registers from
kvm-kmod (qmp_query_cpus -> cpu_synchronize_state ->
kvm_cpu_synchronize_state -> do_kvm_cpu_synchronize_state ->
kvm_arch_get_registers). Then vcpu 0 (BSP) sends INIT-SIPI to vcpu 1 (AP),
so in kvm-kmod the KVM_APIC_INIT bit in vcpu 1’s pending_events is set.
Then vcpu 1 continues running, and the vcpu 1 thread in qemu calls
kvm_arch_put_registers -> kvm_put_vcpu_events, so the KVM_APIC_INIT bit in
vcpu 1’s pending_events gets cleared, i.e., lost.

In kvm-kmod, besides pending_events, sipi_vector may also be overwritten,
so I am not sure whether other fields/registers are in danger too, i.e.,
those that may be modified asynchronously with respect to the vcpu thread
itself.

BTW, adding a sleep like the following reliably reproduces the problem,
provided the VM has more than 2 vcpus and is started via libvirtd.

On 2017/3/20 22:21, Herongguang (Stephen) wrote:
> Hi,
> We encountered a problem that when a domain starts, seabios failed to online a vCPU.
>
> After investigation, we found that the reason is in kvm-kmod, KVM_APIC_INIT bit in
> vcpu->arch.apic->pending_events was overwritten by qemu, and thus an INIT IPI sent
> to AP was lost. Qemu does this since libvirtd sends a ‘query-cpus’ qmp command to qemu
> on VM start.
>
> In qemu, qmp_query_cpus-> cpu_synchronize_state-> kvm_cpu_synchronize_state->
> do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from kvm-kmod and
> sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call
> kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus pending_events is
> overwritten by qemu.
>
> I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true after ‘query-cpus’,
> and kvm-kmod should not clear KVM_APIC_INIT unconditionally. And I am not sure whether
> it is OK for qemu to set cpu->kvm_vcpu_dirty in do_kvm_cpu_synchronize_state in each caller.
>
> What’s your opinion?
>

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 55865db..5099290 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2534,6 +2534,11 @@ static int kvm_put_vcpu_events(X86CPU *cpu, int level)
             KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
     }
 
+    if (CPU(cpu)->cpu_index == 1) {
+        fprintf(stderr, "vcpu 1 sleep!!!!\n");
+        sleep(10);
+    }
+
     return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
 }
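
To make the ordering concrete, here is a minimal standalone sketch of the
sequence described above. It is only a toy model under my own assumptions,
not qemu or kvm-kmod code: every toy_* name, the two structs, and the bit
position of TOY_APIC_INIT are made up for illustration. It just shows that
a snapshot taken before the INIT arrives, when written back wholesale,
wipes the bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the INIT bit; the bit position is illustrative. */
#define TOY_APIC_INIT (1UL << 0)

/* Toy model of the kernel-side per-vcpu state. */
struct toy_kernel_vcpu {
    unsigned long pending_events;
};

/* Toy model of the userspace-side cached copy plus its dirty flag. */
struct toy_user_vcpu {
    unsigned long cached_events;
    bool dirty;
};

/* Step 1: userspace snapshots the kernel state and marks itself dirty. */
static void toy_get_registers(struct toy_user_vcpu *u,
                              const struct toy_kernel_vcpu *k)
{
    assert(u && k);
    u->cached_events = k->pending_events;
    u->dirty = true;
}

/* Step 2: another vcpu delivers INIT; only the kernel side sees it. */
static void toy_send_init(struct toy_kernel_vcpu *k)
{
    assert(k);
    k->pending_events |= TOY_APIC_INIT;
}

/* Step 3: the vcpu thread resumes and writes the stale snapshot back. */
static void toy_put_registers(struct toy_user_vcpu *u,
                              struct toy_kernel_vcpu *k)
{
    assert(u && k);
    if (u->dirty) {
        k->pending_events = u->cached_events; /* overwrites the INIT bit */
        u->dirty = false;
    }
}

/* Runs steps 1-3 in order; returns true if INIT is still pending. */
static bool toy_run_sequence(void)
{
    struct toy_kernel_vcpu k = { 0 };
    struct toy_user_vcpu u = { 0, false };

    toy_get_registers(&u, &k);
    toy_send_init(&k);
    toy_put_registers(&u, &k);

    return (k.pending_events & TOY_APIC_INIT) != 0;
}
```

In this model toy_run_sequence() returns false, i.e. the INIT is gone after
the write-back. If step 2 happens before step 1, or if step 1 does not set
the dirty flag, the bit survives, which is why the sleep in the diff above
widens the window so reliably.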