From patchwork Thu Dec 18 00:46:48 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 5509851 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 1CC1B9F434 for ; Thu, 18 Dec 2014 00:48:06 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 2F75C209EA for ; Thu, 18 Dec 2014 00:48:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1D76B209E5 for ; Thu, 18 Dec 2014 00:48:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751846AbaLRAr6 (ORCPT ); Wed, 17 Dec 2014 19:47:58 -0500 Received: from cantor2.suse.de ([195.135.220.15]:59250 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751233AbaLRArE (ORCPT ); Wed, 17 Dec 2014 19:47:04 -0500 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 3B351AC55; Thu, 18 Dec 2014 00:47:02 +0000 (UTC) From: Alexander Graf To: kvm-ppc@vger.kernel.org Cc: kvm@vger.kernel.org, pbonzini@redhat.com, Paul Mackerras Subject: [PULL 05/18] KVM: PPC: Book3S HV: Fix KSM memory corruption Date: Thu, 18 Dec 2014 01:46:48 +0100 Message-Id: <1418863621-6630-6-git-send-email-agraf@suse.de> X-Mailer: git-send-email 1.8.1.4 In-Reply-To: <1418863621-6630-1-git-send-email-agraf@suse.de> References: <1418863621-6630-1-git-send-email-agraf@suse.de> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Paul Mackerras Testing with KSM active in the host showed occasional corruption of guest memory. Typically a page that should have contained zeroes would contain values that look like the contents of a user process stack (values such as 0x0000_3fff_xxxx_xxx). Code inspection in kvmppc_h_protect revealed that there was a race condition with the possibility of granting write access to a page which is read-only in the host page tables. The code attempts to keep the host mapping read-only if the host userspace PTE is read-only, but if that PTE had been temporarily made invalid for any reason, the read-only check would not trigger and the host HPTE could end up read-write. Examination of the guest HPT in the failure situation revealed that there were indeed shared pages which should have been read-only that were mapped read-write. To close this race, we don't let a page go from being read-only to being read-write, as far as the real HPTE mapping the page is concerned (the guest view can go to read-write, but the actual mapping stays read-only). When the guest tries to write to the page, we take an HDSI and let kvmppc_book3s_hv_page_fault take care of providing a writable HPTE for the page. This eliminates the occasional corruption of shared pages that was previously seen with KSM active. Signed-off-by: Paul Mackerras Signed-off-by: Alexander Graf --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 44 ++++++++++++++----------------------- 1 file changed, 17 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 084ad54..411720f 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -667,40 +667,30 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags, rev->guest_rpte = r; note_hpte_modification(kvm, rev); } - r = (be64_to_cpu(hpte[1]) & ~mask) | bits; /* Update HPTE */ if (v & HPTE_V_VALID) { - rb = compute_tlbie_rb(v, r, pte_index); - hpte[0] = cpu_to_be64(v & ~HPTE_V_VALID); - do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true); /* - * If the host has this page as readonly but the guest - * wants to make it read/write, reduce the permissions. - * Checking the host permissions involves finding the - * memslot and then the Linux PTE for the page. + * If the page is valid, don't let it transition from + * readonly to writable. If it should be writable, we'll + * take a trap and let the page fault code sort it out. */ - if (hpte_is_writable(r) && kvm->arch.using_mmu_notifiers) { - unsigned long psize, gfn, hva; - struct kvm_memory_slot *memslot; - pgd_t *pgdir = vcpu->arch.pgdir; - pte_t pte; - - psize = hpte_page_size(v, r); - gfn = ((r & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT; - memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn); - if (memslot) { - hva = __gfn_to_hva_memslot(memslot, gfn); - pte = lookup_linux_pte_and_update(pgdir, hva, - 1, &psize); - if (pte_present(pte) && !pte_write(pte)) - r = hpte_make_readonly(r); - } + pte = be64_to_cpu(hpte[1]); + r = (pte & ~mask) | bits; + if (hpte_is_writable(r) && kvm->arch.using_mmu_notifiers && + !hpte_is_writable(pte)) + r = hpte_make_readonly(r); + /* If the PTE is changing, invalidate it first */ + if (r != pte) { + rb = compute_tlbie_rb(v, r, pte_index); + hpte[0] = cpu_to_be64((v & ~HPTE_V_VALID) | + HPTE_V_ABSENT); + do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), + true); + hpte[1] = cpu_to_be64(r); } } - hpte[1] = cpu_to_be64(r); - eieio(); - hpte[0] = cpu_to_be64(v & ~HPTE_V_HVLOCK); + unlock_hpte(hpte, v & ~HPTE_V_HVLOCK); asm volatile("ptesync" : : : "memory"); return H_SUCCESS; }