From patchwork Fri Nov 10 07:18:05 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Zhang
X-Patchwork-Id: 10052497
From: Yu Zhang
To: xen-devel@lists.xen.org
Date: Fri, 10 Nov 2017 15:18:05 +0800
Message-Id: <1510298286-30952-1-git-send-email-yu.c.zhang@linux.intel.com>
X-Mailer: git-send-email 1.9.1
Cc: Andrew Cooper, min.he@intel.com, Jan Beulich, yi.z.zhang@intel.com
Subject: [Xen-devel] [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
List-Id: Xen developer discussion

From: Min He

In map_pages_to_xen(), an L2 page table entry may be reset to point to
a superpage, and its corresponding L1 page table needs to be freed in
that case, when the L1 page table entries map consecutive page frames
and have the same mapping flags.

However, variable `pl1e` is not protected by the lock before the L1
page table is enumerated. A race condition may occur if this code path
is invoked simultaneously on different CPUs.

For example, `pl1e` on CPU0 may hold an obsolete value, pointing to a
page which has just been freed on CPU1. Besides, before this page is
reused, it will still hold the old PTEs, referencing consecutive page
frames. Consequently `free_xen_pagetable(l2e_to_l1e(ol2e))` will be
triggered on CPU0, resulting in the unexpected freeing of a normal
page.

This patch fixes the potential race condition by protecting `pl1e`
with the lock, and by checking the PSE flag of `pl2e`.

Note: the PSE flag of `pl3e` is also checked before its
re-consolidation, for the same reason as for `pl2e`: we cannot presume
the contents of the target superpage.

Signed-off-by: Min He
Signed-off-by: Yi Zhang
Signed-off-by: Yu Zhang
Reviewed-by: Jan Beulich
---
Cc: Jan Beulich
Cc: Andrew Cooper

Changes in v2:
According to comments from Jan Beulich,
  - check PSE of pl2e and pl3e, and skip the re-consolidation if set.
  - commit message changes, e.g. add "From :" tag etc.
  - code style changes.
  - introduce a separate patch to resolve the similar issue in
    modify_xen_mappings().
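For readers who want the locking pattern in isolation, here is a minimal
user-space sketch of "take the lock, re-check PSE, and only then read the
L1 pointer". It is not Xen code: every `toy_`/`TOY_` name is hypothetical,
the spinlock is a plain C11 `atomic_flag` standing in for
`map_pgdir_lock`, and the page table is reduced to an array of frame
numbers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_L1_ENTRIES 512u
#define TOY_PSE        0x1u          /* stands in for _PAGE_PSE */

/* Toy L2 entry: a superpage (TOY_PSE set) or a pointer to an L1 table. */
struct toy_l2e {
    unsigned int flags;
    uint64_t base_pfn;               /* meaningful when TOY_PSE is set */
    uint64_t *l1;                    /* meaningful when TOY_PSE is clear */
};

static atomic_flag toy_lock = ATOMIC_FLAG_INIT;  /* toy map_pgdir_lock */
static int toy_l1_frees;             /* number of L1 tables freed so far */

static void toy_spin_lock(void)
{
    while ( atomic_flag_test_and_set_explicit(&toy_lock,
                                              memory_order_acquire) )
        ;                            /* spin until the flag is clear */
}

static void toy_spin_unlock(void)
{
    atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

/* Build an L2 entry whose L1 table maps frames first_pfn, first_pfn+1, ... */
static struct toy_l2e toy_make_l2e(uint64_t first_pfn)
{
    struct toy_l2e e = { 0, 0, malloc(TOY_L1_ENTRIES * sizeof(uint64_t)) };

    assert(e.l1 != NULL);
    for ( unsigned int i = 0; i < TOY_L1_ENTRIES; i++ )
        e.l1[i] = first_pfn + i;
    return e;
}

/* Returns true only if this caller replaced the L1 table by a superpage. */
static bool toy_reconsolidate(struct toy_l2e *l2e)
{
    uint64_t *l1, base;
    unsigned int i;

    toy_spin_lock();

    /* Another CPU may already have consolidated: do not touch the L1. */
    if ( l2e->flags & TOY_PSE )
    {
        toy_spin_unlock();
        return false;
    }

    /* The L1 pointer is read only while the lock is held. */
    l1 = l2e->l1;
    base = l1[0] & ~(uint64_t)(TOY_L1_ENTRIES - 1);
    for ( i = 0; i < TOY_L1_ENTRIES; i++ )
        if ( l1[i] != base + i )
            break;

    if ( i < TOY_L1_ENTRIES )        /* not contiguous and aligned */
    {
        toy_spin_unlock();
        return false;
    }

    l2e->flags |= TOY_PSE;
    l2e->base_pfn = base;
    l2e->l1 = NULL;
    toy_l1_frees++;
    toy_spin_unlock();

    free(l1);                        /* exactly one caller reaches this */
    return true;
}
```

Calling `toy_reconsolidate()` twice on the same entry models the second
CPU arriving late: the first call frees the L1 table, the second sees
`TOY_PSE` and returns without dereferencing the stale pointer.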
---
 xen/arch/x86/mm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a20fdca..47855fb 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4844,9 +4844,19 @@ int map_pages_to_xen(
             {
                 unsigned long base_mfn;
 
-                pl1e = l2e_to_l1e(*pl2e);
                 if ( locking )
                     spin_lock(&map_pgdir_lock);
+
+                /* Skip the re-consolidation if it's done on another CPU. */
+                if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
+                {
+                    if ( locking )
+                        spin_unlock(&map_pgdir_lock);
+                    goto check_l3;
+                }
+
+                ol2e = *pl2e;
+                pl1e = l2e_to_l1e(ol2e);
                 base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
                     if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||
@@ -4854,7 +4864,6 @@ int map_pages_to_xen(
                         break;
                 if ( i == L1_PAGETABLE_ENTRIES )
                 {
-                    ol2e = *pl2e;
                     l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
                                                         l1f_to_lNf(flags)));
                     if ( locking )
@@ -4880,6 +4889,15 @@ int map_pages_to_xen(
 
             if ( locking )
                 spin_lock(&map_pgdir_lock);
+
+            /* Skip the re-consolidation if it's done on another CPU. */
+            if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
+            {
+                if ( locking )
+                    spin_unlock(&map_pgdir_lock);
+                continue;
+            }
+
             ol3e = *pl3e;
             pl2e = l3e_to_l2e(ol3e);
             base_mfn = l2e_get_pfn(*pl2e) & ~(L2_PAGETABLE_ENTRIES *