From patchwork Tue Mar 14 17:05:08 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Garnier
X-Patchwork-Id: 9623951
From: Thomas Garnier
To: Thomas Gleixner, Ingo Molnar, "H . Peter Anvin", Jonathan Corbet,
 Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Thomas Garnier,
 Lorenzo Stoakes, Kees Cook, Juergen Gross, Andy Lutomirski,
 Paul Gortmaker, Andrew Morton, Michal Hocko, zijun_hu, Chris Wilson,
 Andy Lutomirski, "Rafael J . Wysocki", Len Brown, Pavel Machek,
 Jiri Kosina, Matt Fleming, Ard Biesheuvel, Boris Ostrovsky,
 Rusty Russell, Paolo Bonzini, Borislav Petkov, Christian Borntraeger,
 Frederic Weisbecker, "Luis R .
 Rodriguez", Stanislaw Gruszka, Peter Zijlstra, Josh Poimboeuf,
 Vitaly Kuznetsov, Tim Chen, Joerg Roedel, Radim Krčmář
Date: Tue, 14 Mar 2017 10:05:08 -0700
Message-Id: <20170314170508.100882-3-thgarnie@google.com>
X-Mailer: git-send-email 2.12.0.367.g23dc2f6d3c-goog
In-Reply-To: <20170314170508.100882-1-thgarnie@google.com>
References: <20170314170508.100882-1-thgarnie@google.com>
Cc: linux-efi@vger.kernel.org, kvm@vger.kernel.org,
 linux-pm@vger.kernel.org, x86@kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
 linux-mm@kvack.org, lguest@lists.ozlabs.org,
 kernel-hardening@lists.openwall.com, xen-devel@lists.xenproject.org
Subject: [Xen-devel] [PATCH v7 3/3] x86: Make the GDT remapping read-only on
 64-bit

This patch makes the GDT remapped pages read-only to prevent corruption.
This change is done only on 64-bit.

The native_load_tr_desc function was adapted to correctly handle a
read-only GDT. The LTR instruction always writes to the GDT TSS entry,
which generates a page fault if the GDT is read-only. This change checks
whether the current GDT is the remapping and swaps GDTs as needed. The
function was tested by booting multiple machines and checking that
hibernation works properly.

KVM SVM and VMX were adapted to use the writeable GDT. On VMX, the
per-cpu variable was removed in favor of functions that fetch the
original GDT. Instead of reloading the previous GDT, VMX now reloads the
fixmap GDT as expected. For testing, VMs were started and restored on
multiple configurations.

Signed-off-by: Thomas Garnier
---
Based on next-20170308
---
 arch/x86/include/asm/desc.h      | 106 +++++++++++++++++++++++++--------------
 arch/x86/include/asm/processor.h |   1 +
 arch/x86/kernel/cpu/common.c     |  28 ++++++++---
 arch/x86/kvm/svm.c               |   4 +-
 arch/x86/kvm/vmx.c               |  12 ++---
 5 files changed, 96 insertions(+), 55 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 4b5ef0c64291..ec05f9c1a62c 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -248,9 +248,77 @@ static inline void native_set_ldt(const void *addr, unsigned int entries)
 	}
 }
 
+static inline void native_load_gdt(const struct desc_ptr *dtr)
+{
+	asm volatile("lgdt %0"::"m" (*dtr));
+}
+
+static inline void native_load_idt(const struct desc_ptr *dtr)
+{
+	asm volatile("lidt %0"::"m" (*dtr));
+}
+
+static inline void native_store_gdt(struct desc_ptr *dtr)
+{
+	asm volatile("sgdt %0":"=m" (*dtr));
+}
+
+static inline void native_store_idt(struct desc_ptr *dtr)
+{
+	asm volatile("sidt %0":"=m" (*dtr));
+}
+
+/*
+ * The LTR instruction marks the TSS GDT entry as busy. On 64-bit, the GDT is
+ * a read-only remapping. To prevent a page fault, the GDT is switched to the
+ * original writeable version when needed.
+ */
+#ifdef CONFIG_X86_64
 static inline void native_load_tr_desc(void)
 {
+	struct desc_ptr gdt;
+	int cpu = raw_smp_processor_id();
+	bool restore = 0;
+	struct desc_struct *fixmap_gdt;
+
+	native_store_gdt(&gdt);
+	fixmap_gdt = get_cpu_gdt_ro(cpu);
+
+	/*
+	 * If the current GDT is the read-only fixmap, swap to the original
+	 * writeable version. Swap back at the end.
+	 */
+	if (gdt.address == (unsigned long)fixmap_gdt) {
+		load_direct_gdt(cpu);
+		restore = 1;
+	}
 	asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+	if (restore)
+		load_fixmap_gdt(cpu);
+}
+#else
+static inline void native_load_tr_desc(void)
+{
+	asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+}
+#endif
+
+static inline unsigned long native_store_tr(void)
+{
+	unsigned long tr;
+
+	asm volatile("str %0":"=r" (tr));
+
+	return tr;
+}
+
+static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
+{
+	struct desc_struct *gdt = get_cpu_gdt_rw(cpu);
+	unsigned int i;
+
+	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
+		gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i];
 }
 
 DECLARE_PER_CPU(bool, __tss_limit_invalid);
@@ -305,44 +373,6 @@ static inline void invalidate_tss_limit(void)
 	this_cpu_write(__tss_limit_invalid, true);
 }
 
-static inline void native_load_gdt(const struct desc_ptr *dtr)
-{
-	asm volatile("lgdt %0"::"m" (*dtr));
-}
-
-static inline void native_load_idt(const struct desc_ptr *dtr)
-{
-	asm volatile("lidt %0"::"m" (*dtr));
-}
-
-static inline void native_store_gdt(struct desc_ptr *dtr)
-{
-	asm volatile("sgdt %0":"=m" (*dtr));
-}
-
-static inline void native_store_idt(struct desc_ptr *dtr)
-{
-	asm volatile("sidt %0":"=m" (*dtr));
-}
-
-static inline unsigned long native_store_tr(void)
-{
-	unsigned long tr;
-
-	asm volatile("str %0":"=r" (tr));
-
-	return tr;
-}
-
-static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
-{
-	struct desc_struct *gdt = get_cpu_gdt_rw(cpu);
-	unsigned int i;
-
-	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
-		gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i];
-}
-
 /* This intentionally ignores lm, since 32-bit apps don't have that field. */
 #define LDT_empty(info)					\
 	((info)->base_addr		== 0	&&	\
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2ec4d2dc559b..28828f1f99a4 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -716,6 +716,7 @@ extern struct desc_ptr early_gdt_descr;
 
 extern void cpu_set_gdt(int);
 extern void switch_to_new_gdt(int);
+extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
 extern void load_percpu_segment(int);
 extern void cpu_init(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3cf1590ec9ce..f8e22dbad86c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -448,8 +448,15 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-/* Used by XEN to force the GDT read-only when required */
+/*
+ * On 64-bit the GDT remapping is read-only.
+ * A global is used for Xen to change the default when required.
+ */
+#ifdef CONFIG_X86_64
+pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
+#else
 pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
+#endif
 
 /* Setup the fixmap mapping only once per-processor */
 static inline void setup_fixmap_gdt(int cpu)
@@ -458,6 +465,17 @@ static inline void setup_fixmap_gdt(int cpu)
 		     __pa(get_cpu_gdt_rw(cpu)), pg_fixmap_gdt_flags);
 }
 
+/* Load the original GDT from the per-cpu structure */
+void load_direct_gdt(int cpu)
+{
+	struct desc_ptr gdt_descr;
+
+	gdt_descr.address = (long)get_cpu_gdt_rw(cpu);
+	gdt_descr.size = GDT_SIZE - 1;
+	load_gdt(&gdt_descr);
+}
+EXPORT_SYMBOL_GPL(load_direct_gdt);
+
 /* Load a fixmap remapping of the per-cpu GDT */
 void load_fixmap_gdt(int cpu)
 {
@@ -467,6 +485,7 @@ void load_fixmap_gdt(int cpu)
 	gdt_descr.size = GDT_SIZE - 1;
 	load_gdt(&gdt_descr);
 }
+EXPORT_SYMBOL_GPL(load_fixmap_gdt);
 
 /*
  * Current gdt points %fs at the "master" per-cpu area: after this,
@@ -474,11 +493,8 @@ void load_fixmap_gdt(int cpu)
  */
 void switch_to_new_gdt(int cpu)
 {
-	struct desc_ptr gdt_descr;
-
-	gdt_descr.address = (long)get_cpu_gdt_rw(cpu);
-	gdt_descr.size = GDT_SIZE - 1;
-	load_gdt(&gdt_descr);
+	/* Load the original GDT */
+	load_direct_gdt(cpu);
 	/* Reload the per-cpu base */
 	load_percpu_segment(cpu);
 }
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d1efe2c62b3f..c02b9af2056a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -741,7 +741,6 @@ static int svm_hardware_enable(void)
 
 	struct svm_cpu_data *sd;
 	uint64_t efer;
-	struct desc_ptr gdt_descr;
 	struct desc_struct *gdt;
 	int me = raw_smp_processor_id();
 
@@ -763,8 +762,7 @@ static int svm_hardware_enable(void)
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	sd->next_asid = sd->max_asid + 1;
 
-	native_store_gdt(&gdt_descr);
-	gdt = (struct desc_struct *)gdt_descr.address;
+	gdt = get_current_gdt_rw();
 	sd->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
 
 	wrmsrl(MSR_EFER, efer | EFER_SVME);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 283aa8601833..cfed1fff43ec 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -935,7 +935,6 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
  * when a CPU is brought down, and we need to VMCLEAR all VMCSs loaded on it.
  */
 static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
-static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
 
 /*
  * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we
@@ -2052,14 +2051,13 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
  */
 static unsigned long segment_base(u16 selector)
 {
-	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *table;
 	unsigned long v;
 
 	if (!(selector & ~SEGMENT_RPL_MASK))
 		return 0;
 
-	table = (struct desc_struct *)gdt->address;
+	table = get_current_gdt_ro();
 
 	if ((selector & SEGMENT_TI_MASK) == SEGMENT_LDT) {
 		u16 ldt_selector = kvm_read_ldt();
@@ -2164,7 +2162,7 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #endif
 	if (vmx->host_state.msr_host_bndcfgs)
 		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
-	load_gdt(this_cpu_ptr(&host_gdt));
+	load_fixmap_gdt(raw_smp_processor_id());
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -2266,7 +2264,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	if (!already_loaded) {
-		struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
+		unsigned long gdt = get_current_gdt_ro_vaddr();
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2277,7 +2275,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		 */
 		vmcs_writel(HOST_TR_BASE,
 			    (unsigned long)this_cpu_ptr(&cpu_tss));
-		vmcs_writel(HOST_GDTR_BASE, gdt->address);
+		vmcs_writel(HOST_GDTR_BASE, gdt);   /* 22.2.4 */
 
 		/*
 		 * VM exits change the host TR limit to 0x67 after a VM
@@ -3465,8 +3463,6 @@ static int hardware_enable(void)
 		ept_sync_global();
 	}
 
-	native_store_gdt(this_cpu_ptr(&host_gdt));
-
 	return 0;
 }
 
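
Illustrative note, not part of the patch: the swap-and-restore logic that
the 64-bit native_load_tr_desc() performs around LTR can be modeled in
user space. The sketch below is only a model under stated assumptions.
Every name in it (cpu_model, model_lgdt, model_ltr, model_load_tr_desc)
is hypothetical and does not exist in the kernel. "Loading" a GDT just
records which view is active, and a write through the read-only view is
counted as a fault instead of trapping.

```c
#include <stdbool.h>

/* Hypothetical model: two views of the same per-CPU GDT. */
enum gdt_view {
	GDT_DIRECT_RW,	/* original writeable per-cpu GDT */
	GDT_FIXMAP_RO	/* read-only fixmap remapping */
};

struct cpu_model {
	enum gdt_view loaded;	/* view the modeled GDTR points at */
	bool tss_busy;		/* busy bit that LTR sets in the TSS entry */
	int faults;		/* writes attempted through the RO view */
	int gdt_loads;		/* number of modeled LGDT executions */
};

/* Models LGDT: switch the active view. */
static void model_lgdt(struct cpu_model *c, enum gdt_view v)
{
	c->loaded = v;
	c->gdt_loads++;
}

/* Models LTR: it writes the busy bit, which fails on a read-only view. */
static void model_ltr(struct cpu_model *c)
{
	if (c->loaded == GDT_FIXMAP_RO) {
		c->faults++;	/* a real CPU would take a page fault here */
		return;
	}
	c->tss_busy = true;
}

/*
 * Mirrors the control flow of the patched 64-bit native_load_tr_desc():
 * if the read-only view is active, swap to the writeable one, run LTR,
 * then swap back.
 */
static void model_load_tr_desc(struct cpu_model *c)
{
	bool restore = false;

	if (c->loaded == GDT_FIXMAP_RO) {
		model_lgdt(c, GDT_DIRECT_RW);
		restore = true;
	}
	model_ltr(c);
	if (restore)
		model_lgdt(c, GDT_FIXMAP_RO);
}
```

Calling model_ltr() directly on a model whose loaded view is
GDT_FIXMAP_RO records a fault and leaves tss_busy clear, while
model_load_tr_desc() on the same starting state sets tss_busy with no
fault and ends with the read-only view active again. That is the
property the commit message describes.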