From patchwork Mon Jul 27 03:23:31 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiao Guangrong
X-Patchwork-Id: 6868191
Return-Path:
X-Original-To: patchwork-kvm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id A2DE5C05AC
	for ; Mon, 27 Jul 2015 03:28:58 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 96EF220592
	for ; Mon, 27 Jul 2015 03:28:57 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 667672058C
	for ; Mon, 27 Jul 2015 03:28:56 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755260AbbG0D2x (ORCPT ); Sun, 26 Jul 2015 23:28:53 -0400
Received: from mga02.intel.com ([134.134.136.20]:52066 "EHLO mga02.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754730AbbG0D2w (ORCPT ); Sun, 26 Jul 2015 23:28:52 -0400
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by orsmga101.jf.intel.com with ESMTP; 26 Jul 2015 20:28:52 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.15,550,1432623600";
	d="scan'208";a="770207929"
Received: from xiao.sh.intel.com ([10.239.159.86])
	by fmsmga002.fm.intel.com with ESMTP; 26 Jul 2015 20:28:51 -0700
Message-ID: <55B5A433.1090107@linux.intel.com>
Date: Mon, 27 Jul 2015 11:23:31 +0800
From: Xiao Guangrong
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101
	Thunderbird/31.7.0
MIME-Version: 1.0
To: Pavel Shirshov, kvm@vger.kernel.org
Subject: Re: KVM: unknown exit, hardware reason 31
References:
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY
	autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

I guess it happened in this scenario:

1. QEMU drops the mmio region
2. invalidate all mmio sptes
3.
   VCPU 0                        KVM                VCPU 1
   access the invalid mmio spte
                                 page reclaim
                                 zap shadow page
                                                    access the region that
                                                    was MMIO before
                                                    set the spte to the
                                                    normal ram mapping
   mmio #PF
   check the spte and see that it has
   become a normal ram mapping !!!

The issue is caused by the fast invalidation of mmio sptes, which increases
the generation number instead of zapping the mmio sptes (SRCU can ensure
that the vcpu sees either the mmio spte or an spte that is zapped / being
zapped).

The simple fix is to just drop check_direct_spte_mmio_pf() and let the VCPU
access the address again, as in the patch below.

Pavel, could you please check if it works for you?

I will think the case through fully and post the right fix...

On 07/25/2015 03:25 AM, Pavel Shirshov wrote:
> Hello,
>
> I'm running a lot of identical VMs under KVM. Sometimes (one time per
> 2000-3000 runs) I got following:
>
> 1. VM is paused in libvirt. It can't be just resumed. I can just reset
> it and resume.
> 2. In VM log file I see following: "KVM: unknown exit, hardware reason
> 31" with a CPU dump.
> 3. In dmesg I see following:
> [84245.284948] EPT: Misconfiguration.
> [84245.285056] EPT: GPA: 0xfeda848
> [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
> [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
> [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
> [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1
>
> OS. 3.16.0-44-generic #59~14.04.1-Ubuntu SMP
> QEMU: QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.14),
> Copyright (c) 2003-2008 Fabrice Bellard
>
> Is it linux kvm bug or CPU bug? How can I fix that?
>
> I can reproduce the bug in one-two days. Is it possible to enable
> deeper debug for the issue?
>
> Thanks
>
---
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4417146..299a5da 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3299,21 +3299,6 @@ static bool quickly_check_mmio_pf(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	return vcpu_match_mmio_gva(vcpu, addr);
 }
 
-
-/*
- * On direct hosts, the last spte is only allows two states
- * for mmio page fault:
- *   - It is the mmio spte
- *   - It is zapped or it is being zapped.
- *
- * This function completely checks the spte when the last spte
- * is not the mmio spte.
- */
-static bool check_direct_spte_mmio_pf(u64 spte)
-{
-	return __check_direct_spte_mmio_pf(spte);
-}
-
 static u64 walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr)
 {
 	struct kvm_shadow_walk_iterator iterator;
@@ -3356,13 +3341,6 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	}
 
 	/*
-	 * It's ok if the gva is remapped by other cpus on shadow guest,
-	 * it's a BUG if the gfn is not a mmio page.
-	 */
-	if (direct && !check_direct_spte_mmio_pf(spte))
-		return RET_MMIO_PF_BUG;
-
-	/*
 	 * If the page table is zapped by other cpus, let CPU fault again on
 	 * the address.
 	 */
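
For readers who want the idea behind "increase the generation number
instead of zapping" spelled out, here is a minimal user-space sketch. It is
only a toy model and not the KVM code: the bit layout, the toy_* names and
the TOY_EMULATE / TOY_RETRY results are all made up for illustration. It
shows why a fault handler that finds a stale or non-MMIO entry can simply
let the VCPU fault again rather than report a bug, which is the behaviour
the patch above falls back to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for this toy only; NOT the real KVM spte encoding. */
#define TOY_MMIO_TAG   (1ULL << 63)               /* marks an entry as MMIO   */
#define TOY_GEN_SHIFT  52                         /* generation in bits 52-61 */
#define TOY_GEN_MASK   0x3ffULL
#define TOY_GFN_MASK   ((1ULL << TOY_GEN_SHIFT) - 1)

struct toy_mmu {
	uint64_t generation;    /* bumped instead of zapping every MMIO entry */
};

enum toy_fault_result { TOY_EMULATE, TOY_RETRY };

/* Build an MMIO entry stamped with the generation current at creation. */
static uint64_t toy_make_mmio_spte(const struct toy_mmu *mmu, uint64_t gfn)
{
	uint64_t gen = mmu->generation & TOY_GEN_MASK;

	assert((gfn & ~TOY_GFN_MASK) == 0);
	return TOY_MMIO_TAG | (gen << TOY_GEN_SHIFT) | gfn;
}

static bool toy_is_mmio_spte(uint64_t spte)
{
	return (spte & TOY_MMIO_TAG) != 0;
}

static uint64_t toy_spte_generation(uint64_t spte)
{
	return (spte >> TOY_GEN_SHIFT) & TOY_GEN_MASK;
}

/* "Fast invalidation": O(1), every existing MMIO entry becomes stale. */
static void toy_invalidate_all_mmio(struct toy_mmu *mmu)
{
	mmu->generation++;
}

/*
 * Toy MMIO fault handler. A current-generation MMIO entry is emulated.
 * Anything else (stale generation, or an entry that another VCPU has
 * meanwhile replaced with a normal mapping) is answered with "retry".
 */
static enum toy_fault_result toy_handle_mmio_fault(const struct toy_mmu *mmu,
						   uint64_t spte)
{
	if (toy_is_mmio_spte(spte) &&
	    toy_spte_generation(spte) == (mmu->generation & TOY_GEN_MASK))
		return TOY_EMULATE;

	return TOY_RETRY;
}
```

In this model an entry created before toy_invalidate_all_mmio() is still
present afterwards but no longer matches the current generation, so the
handler returns TOY_RETRY for it, and it does the same for an entry without
the MMIO tag.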