From patchwork Wed Mar 11 16:37:39 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin O'Connor
X-Patchwork-Id: 5987191
Date: Wed, 11 Mar 2015 12:37:39 -0400
From: Kevin O'Connor
To: "Dr. David Alan Gilbert"
Cc: Bandan Das, Paolo Bonzini, kraxel@redhat.com, Andrey Korolyov,
	"qemu-devel@nongnu.org", "kvm@vger.kernel.org"
Subject: Re: [Qemu-devel] E5-2620v2 - emulation stop error
Message-ID: <20150311163739.GA29522@morn.localdomain>
References: <20150310165755.GL2338@work-vm> <54FF337A.1010202@redhat.com>
	<54FF4541.9080608@redhat.com> <20150310202958.GR2338@work-vm>
	<20150311134556.GH2334@work-vm> <20150311154220.GA26463@morn.localdomain>
	<20150311155306.GK2334@work-vm>
Content-Disposition: inline
In-Reply-To: <20150311155306.GK2334@work-vm>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

On Wed, Mar 11, 2015 at 03:53:07PM +0000, Dr. David Alan Gilbert wrote:
> * Kevin O'Connor (kevin@koconnor.net) wrote:
> > On Wed, Mar 11, 2015 at 01:45:57PM +0000, Dr. David Alan Gilbert wrote:
> > > * Bandan Das (bsd@redhat.com) wrote:
> > > > "Dr. David Alan Gilbert" writes:
> > > > > while true; do (sleep 5; echo -e '\001cq\n')|/opt/qemu-try-world3/bin/qemu-system-x86_64 -machine pc-i440fx-2.0,accel=kvm -m 1024 -smp 128 -nographic -device sga 2>&1 | tee /tmp/qemu.op; grep "internal error" /tmp/qemu.op -q && break; done

That is a truly impressive command line, BTW.

> > > > > [root@virtlab413 qemu-world3]# git bisect bad
> > > > > 21f5826a04d38e19488f917e1eef22751490c769 is the first bad commit
> > > >
> > > > I can reproduce this on E5-2620 v2 with David's "while true" test.
> > > > (The emulation failure I mean, not the suberror 2 that Andrey is seeing)
> > > > The commit that seems to have introduced this is -
> > > >
> > > > commit 0673b7870063a3affbad9046fb6d385a4e734c19
> > > > Author: Kevin O'Connor
> > > > Date:   Sat May 24 10:49:50 2014 -0400
> > > >
> > > >     smp: Replace QEMU SMP init assembler code with C; run only in 32bit mode.
> > [...]
> > > Turning on debug logging
> > > ( -chardev file,id=log,path=/tmp/debugcon.$$ -device isa-debugcon,chardev=log,iobase=0x402 )
> > >
> > > SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
> > [...]
> > > Found 1 cpu(s) max supported 128 cpu(s)
> >
> > Something is very odd here.  When I run the above command (on an older
> > AMD machine) I get:
> >
> > Found 128 cpu(s) max supported 128 cpu(s)
> >
> > That first value (1 vs 128) comes from QEMU (via cmos index 0x5f).
> > That is, during smp init, SeaBIOS expects QEMU to tell it how many
> > cpus are active, and SeaBIOS waits until that many CPUs check in from
> > its SIPI request before proceeding.
> >
> > I wonder if QEMU reported only 1 active cpu via that cmos register,
> > but more were actually active.  If that was the case, it could
> > certainly explain the failure - as multiple cpus could be running
> > without the sipi trampoline in place.
> >
> > What does the log look like on a non-failure case?
> >
> I had to drop down from 128 to get a working run with debug; here
> are two runs with -smp 20 the first one worked, the second one
> failed.
[...]
> =========== Working ===========
>
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 20 cpu(s) max supported 20 cpu(s)
[...]
> =========== Broken ===========
>
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 1 cpu(s) max supported 20 cpu(s)

So, I couldn't get this to fail on my older AMD machine at all with
the default SeaBIOS code.  But, when I changed the code with the patch
below, it failed right away.

KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000fd2b8
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=000fd2c1 EFL=00000007 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008300 DPL=0 TSS16-busy
GDT= 000f6a50 00000037
IDT= 000f6a8e 00000000
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=66 ba b8 d2 0f 00 e9 a2 fe f3 90 f0 0f ba 2d 04 ff fb 3f 00 <72> f3 8b 25 00 ff fb 3f e8 d2 65 ff ff c7 05 04 ff fb 3f 00 00 00 00 f4 eb fd fa fc 66 b8

And the failed debug output looks like:

SeaBIOS (version rel-1.8.0-7-gd23eba6-dirty-20150311_121819-morn.localdomain)
[...]
cmos_smp_count0=20
[...]
cmos_smp_count=1
cmos_smp_count2=1/20
Found 1 cpu(s) max supported 20 cpu(s)

I'm going to check the assembly for a compiler error, but is it
possible QEMU is returning incorrect data in cmos index 0x5f?

David, any chance you can recompile seabios and double check your
output?

-Kevin

--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -128,6 +128,7 @@ smp_setup(void)
 
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
     while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
@@ -140,6 +141,8 @@ smp_setup(void)
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
     yield();
+    dprintf(1, "cmos_smp_count2=%d/%d\n", cmos_smp_count
+            , rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
 
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;
diff --git a/src/post.c b/src/post.c
index 9ea5620..dc11c72 100644
--- a/src/post.c
+++ b/src/post.c
@@ -170,6 +170,7 @@ platform_hardware_setup(void)
     clock_setup();
 
     // Platform specific setup
+    dprintf(1, "cmos_smp_count0=%d\n", rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
     qemu_platform_setup();
     coreboot_platform_setup();
 }