From patchwork Tue Sep 11 17:20:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chen Yu X-Patchwork-Id: 10595983 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 97ED114E5 for ; Tue, 11 Sep 2018 17:14:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C87B29AD7 for ; Tue, 11 Sep 2018 17:14:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7097D29AE4; Tue, 11 Sep 2018 17:14:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EE2B229AD7 for ; Tue, 11 Sep 2018 17:14:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728073AbeIKWOh (ORCPT ); Tue, 11 Sep 2018 18:14:37 -0400 Received: from mga12.intel.com ([192.55.52.136]:27586 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726943AbeIKWOg (ORCPT ); Tue, 11 Sep 2018 18:14:36 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Sep 2018 10:14:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,361,1531810800"; d="scan'208";a="262579534" Received: from chenyu-desktop.sh.intel.com ([10.239.160.116]) by fmsmga006.fm.intel.com with ESMTP; 11 Sep 2018 10:14:17 -0700 From: Chen Yu To: Thomas Gleixner , "Rafael J. Wysocki" Cc: Pavel Machek , Rui Zhang , Chen Yu , Zhimin Gu , Len Brown , linux-kernel@vger.kernel.org, x86@kernel.org, linux-pm@vger.kernel.org Subject: [PATCH 1/4][v2] x86, hibernate: Fix nosave_regions setup for hibernation Date: Wed, 12 Sep 2018 01:20:19 +0800 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Zhimin Gu On 32bit systems, nosave_regions(non RAM areas) located between max_low_pfn and max_pfn are not excluded from hibernation snapshot currently, which may result in a machine check exception when trying to access these unsafe regions during hibernation: [ 612.800453] Disabling lock debugging due to kernel taint [ 612.805786] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: fe00000000801136 [ 612.814344] mce: [Hardware Error]: RIP !INEXACT! 60:<00000000d90be566> {swsusp_save+0x436/0x560} [ 612.823167] mce: [Hardware Error]: TSC 1f5939fe276 ADDR dd000000 MISC 30e0000086 [ 612.830677] mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1529487426 SOCKET 0 APIC 0 microcode 24 [ 612.839581] mce: [Hardware Error]: Run the above through 'mcelog --ascii' [ 612.846394] mce: [Hardware Error]: Machine check: Processor context corrupt [ 612.853380] Kernel panic - not syncing: Fatal machine check [ 612.858978] Kernel Offset: 0x18000000 from 0xc1000000 (relocation range: 0xc0000000-0xf7ffdfff) This is because on 32bit systems, pages above max_low_pfn are regarded as high memeory, and accessing unsafe pages might cause expected MCE. On the problematic 32bit system, there are reserved memory above low memory, which triggered the MCE: e820 memory mapping: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000d160cfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d160d000-0x00000000d1613fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000d1614000-0x00000000d1a44fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d1a45000-0x00000000d1ecffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d1ed0000-0x00000000d7eeafff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d7eeb000-0x00000000d7ffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d8000000-0x00000000d875ffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d8760000-0x00000000d87fffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d8800000-0x00000000d8fadfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000d8fae000-0x00000000d8ffffff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000d9000000-0x00000000da71bfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000da71c000-0x00000000da7fffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000da800000-0x00000000dbb8bfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000dbb8c000-0x00000000dbffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000dd000000-0x00000000df1fffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000041edfffff] usable Fix this problem by changing pfn limit from max_low_pfn to max_pfn. This fix does not impact 64bit system because on 64bit max_low_pfn is the same as max_pfn. Acked-by: Chen Yu Acked-by: Pavel Machek Cc: "Rafael J. Wysocki" Cc: Thomas Gleixner Signed-off-by: Zhimin Gu --- arch/x86/kernel/setup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index b4866badb235..90ecc108bc8a 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1251,7 +1251,7 @@ void __init setup_arch(char **cmdline_p) x86_init.hyper.guest_late_init(); e820__reserve_resources(); - e820__register_nosave_regions(max_low_pfn); + e820__register_nosave_regions(max_pfn); x86_init.resources.reserve_resources(); From patchwork Tue Sep 11 17:20:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chen Yu X-Patchwork-Id: 10595985 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4406714E5 for ; Tue, 11 Sep 2018 17:14:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2904629AD7 for ; Tue, 11 Sep 2018 17:14:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1D19529AE4; Tue, 11 Sep 2018 17:14:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A4B4D29AD7 for ; Tue, 11 Sep 2018 17:14:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728124AbeIKWOw (ORCPT ); Tue, 11 Sep 2018 18:14:52 -0400 Received: from mga09.intel.com ([134.134.136.24]:19786 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726943AbeIKWOw (ORCPT ); Tue, 11 Sep 2018 18:14:52 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Sep 2018 10:14:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,361,1531810800"; d="scan'208";a="262579603" Received: from chenyu-desktop.sh.intel.com ([10.239.160.116]) by fmsmga006.fm.intel.com with ESMTP; 11 Sep 2018 10:14:32 -0700 From: Chen Yu To: Thomas Gleixner , "Rafael J. Wysocki" Cc: Pavel Machek , Rui Zhang , Chen Yu , Zhimin Gu , Len Brown , linux-kernel@vger.kernel.org, x86@kernel.org, linux-pm@vger.kernel.org, Chen Yu Subject: [PATCH 2/4][v2] PM / hibernate: Check the success of generating md5 digest before hibernation Date: Wed, 12 Sep 2018 01:20:36 +0800 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently if get_e820_md5() fails, then it will hibernate nevertheless. Actually the error code should be propagated to upper caller so that the hibernation could be aware of the result and terminates the process if md5 digest fails. Suggested-by: Thomas Gleixner Cc: "Rafael J. Wysocki" Signed-off-by: Chen Yu --- arch/x86/power/hibernate_64.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c index f8e3b668d20b..dc12dc57889e 100644 --- a/arch/x86/power/hibernate_64.c +++ b/arch/x86/power/hibernate_64.c @@ -265,9 +265,9 @@ static int get_e820_md5(struct e820_table *table, void *buf) return ret; } -static void hibernation_e820_save(void *buf) +static int hibernation_e820_save(void *buf) { - get_e820_md5(e820_table_firmware, buf); + return get_e820_md5(e820_table_firmware, buf); } static bool hibernation_e820_mismatch(void *buf) @@ -287,8 +287,9 @@ static bool hibernation_e820_mismatch(void *buf) return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false; } #else -static void hibernation_e820_save(void *buf) +static int hibernation_e820_save(void *buf) { + return 0; } static bool hibernation_e820_mismatch(void *buf) @@ -306,6 +307,7 @@ static bool hibernation_e820_mismatch(void *buf) int arch_hibernation_header_save(void *addr, unsigned int max_size) { struct restore_data_record *rdr = addr; + int ret = -EINVAL; if (max_size < sizeof(struct restore_data_record)) return -EOVERFLOW; @@ -333,7 +335,9 @@ int arch_hibernation_header_save(void *addr, unsigned int max_size) rdr->magic = RESTORE_MAGIC; - hibernation_e820_save(rdr->e820_digest); + ret = hibernation_e820_save(rdr->e820_digest); + if (ret) + return ret; return 0; } From patchwork Tue Sep 11 17:20:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chen Yu X-Patchwork-Id: 10595987 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D647213B8 for ; Tue, 11 Sep 2018 17:14:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B819C29AD7 for ; Tue, 11 Sep 2018 17:14:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ABA9229AE4; Tue, 11 Sep 2018 17:14:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9168C29AD7 for ; Tue, 11 Sep 2018 17:14:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728201AbeIKWPH (ORCPT ); Tue, 11 Sep 2018 18:15:07 -0400 Received: from mga03.intel.com ([134.134.136.65]:44672 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728199AbeIKWPH (ORCPT ); Tue, 11 Sep 2018 18:15:07 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Sep 2018 10:14:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,361,1531810800"; d="scan'208";a="262579618" Received: from chenyu-desktop.sh.intel.com ([10.239.160.116]) by fmsmga006.fm.intel.com with ESMTP; 11 Sep 2018 10:14:47 -0700 From: Chen Yu To: Thomas Gleixner , "Rafael J. Wysocki" Cc: Pavel Machek , Rui Zhang , Chen Yu , Zhimin Gu , Len Brown , linux-kernel@vger.kernel.org, x86@kernel.org, linux-pm@vger.kernel.org Subject: [PATCH 3/4][v2] x86, hibernate: Extract the common code of 64/32 bit system Date: Wed, 12 Sep 2018 01:20:46 +0800 Message-Id: <7266f9b93000c57edd41fcf2cee0dd4d6252a8ef.1536685746.git.yu.c.chen@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Zhimin Gu Reduce the hibernation code duplication between x86-32 and x86-64 by extracting the common code into hibernate.c. No functional change. Acked-by: Chen Yu Acked-by: Pavel Machek Cc: "Rafael J. Wysocki" Cc: Thomas Gleixner Signed-off-by: Zhimin Gu --- arch/x86/power/hibernate.c | 265 ++++++++++++++++++++++++++++++ arch/x86/power/hibernate_32.c | 23 +-- arch/x86/power/hibernate_64.c | 246 +-------------------------- arch/x86/power/hibernate_asm_64.S | 2 +- 4 files changed, 269 insertions(+), 267 deletions(-) create mode 100644 arch/x86/power/hibernate.c diff --git a/arch/x86/power/hibernate.c b/arch/x86/power/hibernate.c new file mode 100644 index 000000000000..6aeac4d3c9df --- /dev/null +++ b/arch/x86/power/hibernate.c @@ -0,0 +1,265 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Hibernation support for x86 + * + * Copyright (c) 2007 Rafael J. Wysocki + * Copyright (c) 2002 Pavel Machek + * Copyright (c) 2001 Patrick Mochel + */ + +#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* Defined in hibernate_asm_64.S or hibernate_asm_32.S */ +extern asmlinkage __visible int restore_image(void); + +/* + * Address to jump to in the last phase of restore in order to get to the image + * kernel's text (this value is passed in the image header). + */ +unsigned long restore_jump_address __visible; +unsigned long jump_address_phys; + +/* + * Value of the cr3 register from before the hibernation (this value is passed + * in the image header). + */ +unsigned long restore_cr3 __visible; +unsigned long temp_pgt __visible; +unsigned long relocated_restore_code __visible; + +#ifdef CONFIG_X86_32 + #define RESTORE_MAGIC 0x12345678UL +#else + #define RESTORE_MAGIC 0x23456789ABCDEF01UL +#endif + +/** + * pfn_is_nosave - check if given pfn is in the 'nosave' section + * @pfn: the page frame number to be checked + * + * returns non-zero if the pfn is in the 'nosave' section, + * otherwise returns zero + */ +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn; + unsigned long nosave_end_pfn; + + nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; + nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; + + return pfn >= nosave_begin_pfn && pfn < nosave_end_pfn; +} + +#ifdef CONFIG_X86_64 +static int relocate_restore_code(void) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + relocated_restore_code = get_safe_page(GFP_ATOMIC); + if (!relocated_restore_code) + return -ENOMEM; + + memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE); + + /* Make the page containing the relocated code executable */ + pgd = (pgd_t *)__va(read_cr3_pa()) + + pgd_index(relocated_restore_code); + p4d = p4d_offset(pgd, relocated_restore_code); + if (p4d_large(*p4d)) { + set_p4d(p4d, __p4d(p4d_val(*p4d) & ~_PAGE_NX)); + goto out; + } + pud = pud_offset(p4d, relocated_restore_code); + if (pud_large(*pud)) { + set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX)); + goto out; + } + pmd = pmd_offset(pud, relocated_restore_code); + if (pmd_large(*pmd)) { + set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX)); + goto out; + } + pte = pte_offset_kernel(pmd, relocated_restore_code); + set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX)); +out: + __flush_tlb_all(); + return 0; +} + +#define MD5_DIGEST_SIZE 16 + +struct restore_data_record { + unsigned long jump_address; + unsigned long jump_address_phys; + unsigned long cr3; + unsigned long magic; + u8 e820_digest[MD5_DIGEST_SIZE]; +}; + +#if IS_BUILTIN(CONFIG_CRYPTO_MD5) +/** + * get_e820_md5 - calculate md5 according to given e820 table + * + * @table: the e820 table to be calculated + * @buf: the md5 result to be stored to + */ +static int get_e820_md5(struct e820_table *table, void *buf) +{ + struct crypto_shash *tfm; + struct shash_desc *desc; + int size; + int ret = 0; + + tfm = crypto_alloc_shash("md5", 0, 0); + if (IS_ERR(tfm)) + return -ENOMEM; + + desc = kmalloc(sizeof(struct shash_desc) + crypto_shash_descsize(tfm), + GFP_KERNEL); + if (!desc) { + ret = -ENOMEM; + goto free_tfm; + } + + desc->tfm = tfm; + desc->flags = 0; + + size = offsetof(struct e820_table, entries) + + sizeof(struct e820_entry) * table->nr_entries; + + if (crypto_shash_digest(desc, (u8 *)table, size, buf)) + ret = -EINVAL; + + kzfree(desc); + +free_tfm: + crypto_free_shash(tfm); + return ret; +} + +static int hibernation_e820_save(void *buf) +{ + return get_e820_md5(e820_table_firmware, buf); +} + +static bool hibernation_e820_mismatch(void *buf) +{ + int ret; + u8 result[MD5_DIGEST_SIZE]; + + memset(result, 0, MD5_DIGEST_SIZE); + /* If there is no digest in suspend kernel, let it go. */ + if (!memcmp(result, buf, MD5_DIGEST_SIZE)) + return false; + + ret = get_e820_md5(e820_table_firmware, result); + if (ret) + return true; + + return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false; +} +#else +static int hibernation_e820_save(void *buf) +{ + return 0; +} + +static bool hibernation_e820_mismatch(void *buf) +{ + /* If md5 is not builtin for restore kernel, let it go. */ + return false; +} +#endif + +/** + * arch_hibernation_header_save - populate the architecture specific part + * of a hibernation image header + * @addr: address to save the data at + */ +int arch_hibernation_header_save(void *addr, unsigned int max_size) +{ + struct restore_data_record *rdr = addr; + int ret = -EINVAL; + + if (max_size < sizeof(struct restore_data_record)) + return -EOVERFLOW; + rdr->jump_address = (unsigned long)restore_registers; + rdr->jump_address_phys = __pa_symbol(restore_registers); + + /* + * The restore code fixes up CR3 and CR4 in the following sequence: + * + * [in hibernation asm] + * 1. CR3 <= temporary page tables + * 2. CR4 <= mmu_cr4_features (from the kernel that restores us) + * 3. CR3 <= rdr->cr3 + * 4. CR4 <= mmu_cr4_features (from us, i.e. the image kernel) + * [in restore_processor_state()] + * 5. CR4 <= saved CR4 + * 6. CR3 <= saved CR3 + * + * Our mmu_cr4_features has CR4.PCIDE=0, and toggling + * CR4.PCIDE while CR3's PCID bits are nonzero is illegal, so + * rdr->cr3 needs to point to valid page tables but must not + * have any of the PCID bits set. + */ + rdr->cr3 = restore_cr3 &(~CR3_PCID_MASK); + + rdr->magic = RESTORE_MAGIC; + + ret = hibernation_e820_save(rdr->e820_digest); + if (ret) + return ret; + + return 0; +} + +/** + * arch_hibernation_header_restore - read the architecture specific data + * from the hibernation image header + * @addr: address to read the data from + */ +int arch_hibernation_header_restore(void *addr) +{ + struct restore_data_record *rdr = addr; + + restore_jump_address = rdr->jump_address; + jump_address_phys = rdr->jump_address_phys; + restore_cr3 = rdr->cr3; + + if (rdr->magic != RESTORE_MAGIC) { + pr_crit("Unrecognized hibernate image header format!\n"); + return -EINVAL; + } + + if (hibernation_e820_mismatch(rdr->e820_digest)) { + pr_crit("Hibernate inconsistent memory map detected!\n"); + return -ENODEV; + } + + return 0; +} +#endif diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c index afc4ed7b1578..e0e7b9aea22a 100644 --- a/arch/x86/power/hibernate_32.c +++ b/arch/x86/power/hibernate_32.c @@ -6,17 +6,7 @@ * Copyright (c) 2006 Rafael J. Wysocki */ -#include -#include -#include - -#include -#include -#include -#include - -/* Defined in hibernate_asm_32.S */ -extern int restore_image(void); +#include "hibernate.c" /* Pointer to the temporary resume page tables */ pgd_t *resume_pg_dir; @@ -162,14 +152,3 @@ asmlinkage int swsusp_arch_resume(void) restore_image(); return 0; } - -/* - * pfn_is_nosave - check if given pfn is in the 'nosave' section - */ - -int pfn_is_nosave(unsigned long pfn) -{ - unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; - unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; - return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); -} diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c index dc12dc57889e..25946bb34f5a 100644 --- a/arch/x86/power/hibernate_64.c +++ b/arch/x86/power/hibernate_64.c @@ -8,43 +8,7 @@ * Copyright (c) 2001 Patrick Mochel */ -#include -#include -#include -#include -#include - -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* Defined in hibernate_asm_64.S */ -extern asmlinkage __visible int restore_image(void); - -/* - * Address to jump to in the last phase of restore in order to get to the image - * kernel's text (this value is passed in the image header). - */ -unsigned long restore_jump_address __visible; -unsigned long jump_address_phys; - -/* - * Value of the cr3 register from before the hibernation (this value is passed - * in the image header). - */ -unsigned long restore_cr3 __visible; - -unsigned long temp_level4_pgt __visible; - -unsigned long relocated_restore_code __visible; +#include "hibernate.c" static int set_up_temporary_text_mapping(pgd_t *pgd) { @@ -141,46 +105,7 @@ static int set_up_temporary_mappings(void) return result; } - temp_level4_pgt = __pa(pgd); - return 0; -} - -static int relocate_restore_code(void) -{ - pgd_t *pgd; - p4d_t *p4d; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - - relocated_restore_code = get_safe_page(GFP_ATOMIC); - if (!relocated_restore_code) - return -ENOMEM; - - memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE); - - /* Make the page containing the relocated code executable */ - pgd = (pgd_t *)__va(read_cr3_pa()) + - pgd_index(relocated_restore_code); - p4d = p4d_offset(pgd, relocated_restore_code); - if (p4d_large(*p4d)) { - set_p4d(p4d, __p4d(p4d_val(*p4d) & ~_PAGE_NX)); - goto out; - } - pud = pud_offset(p4d, relocated_restore_code); - if (pud_large(*pud)) { - set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX)); - goto out; - } - pmd = pmd_offset(pud, relocated_restore_code); - if (pmd_large(*pmd)) { - set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX)); - goto out; - } - pte = pte_offset_kernel(pmd, relocated_restore_code); - set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX)); -out: - __flush_tlb_all(); + temp_pgt = __pa(pgd); return 0; } @@ -200,170 +125,3 @@ asmlinkage int swsusp_arch_resume(void) restore_image(); return 0; } - -/* - * pfn_is_nosave - check if given pfn is in the 'nosave' section - */ - -int pfn_is_nosave(unsigned long pfn) -{ - unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; - unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; - return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); -} - -#define MD5_DIGEST_SIZE 16 - -struct restore_data_record { - unsigned long jump_address; - unsigned long jump_address_phys; - unsigned long cr3; - unsigned long magic; - u8 e820_digest[MD5_DIGEST_SIZE]; -}; - -#define RESTORE_MAGIC 0x23456789ABCDEF01UL - -#if IS_BUILTIN(CONFIG_CRYPTO_MD5) -/** - * get_e820_md5 - calculate md5 according to given e820 table - * - * @table: the e820 table to be calculated - * @buf: the md5 result to be stored to - */ -static int get_e820_md5(struct e820_table *table, void *buf) -{ - struct crypto_shash *tfm; - struct shash_desc *desc; - int size; - int ret = 0; - - tfm = crypto_alloc_shash("md5", 0, 0); - if (IS_ERR(tfm)) - return -ENOMEM; - - desc = kmalloc(sizeof(struct shash_desc) + crypto_shash_descsize(tfm), - GFP_KERNEL); - if (!desc) { - ret = -ENOMEM; - goto free_tfm; - } - - desc->tfm = tfm; - desc->flags = 0; - - size = offsetof(struct e820_table, entries) + - sizeof(struct e820_entry) * table->nr_entries; - - if (crypto_shash_digest(desc, (u8 *)table, size, buf)) - ret = -EINVAL; - - kzfree(desc); - -free_tfm: - crypto_free_shash(tfm); - return ret; -} - -static int hibernation_e820_save(void *buf) -{ - return get_e820_md5(e820_table_firmware, buf); -} - -static bool hibernation_e820_mismatch(void *buf) -{ - int ret; - u8 result[MD5_DIGEST_SIZE]; - - memset(result, 0, MD5_DIGEST_SIZE); - /* If there is no digest in suspend kernel, let it go. */ - if (!memcmp(result, buf, MD5_DIGEST_SIZE)) - return false; - - ret = get_e820_md5(e820_table_firmware, result); - if (ret) - return true; - - return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false; -} -#else -static int hibernation_e820_save(void *buf) -{ - return 0; -} - -static bool hibernation_e820_mismatch(void *buf) -{ - /* If md5 is not builtin for restore kernel, let it go. */ - return false; -} -#endif - -/** - * arch_hibernation_header_save - populate the architecture specific part - * of a hibernation image header - * @addr: address to save the data at - */ -int arch_hibernation_header_save(void *addr, unsigned int max_size) -{ - struct restore_data_record *rdr = addr; - int ret = -EINVAL; - - if (max_size < sizeof(struct restore_data_record)) - return -EOVERFLOW; - rdr->jump_address = (unsigned long)restore_registers; - rdr->jump_address_phys = __pa_symbol(restore_registers); - - /* - * The restore code fixes up CR3 and CR4 in the following sequence: - * - * [in hibernation asm] - * 1. CR3 <= temporary page tables - * 2. CR4 <= mmu_cr4_features (from the kernel that restores us) - * 3. CR3 <= rdr->cr3 - * 4. CR4 <= mmu_cr4_features (from us, i.e. the image kernel) - * [in restore_processor_state()] - * 5. CR4 <= saved CR4 - * 6. CR3 <= saved CR3 - * - * Our mmu_cr4_features has CR4.PCIDE=0, and toggling - * CR4.PCIDE while CR3's PCID bits are nonzero is illegal, so - * rdr->cr3 needs to point to valid page tables but must not - * have any of the PCID bits set. - */ - rdr->cr3 = restore_cr3 & ~CR3_PCID_MASK; - - rdr->magic = RESTORE_MAGIC; - - ret = hibernation_e820_save(rdr->e820_digest); - if (ret) - return ret; - - return 0; -} - -/** - * arch_hibernation_header_restore - read the architecture specific data - * from the hibernation image header - * @addr: address to read the data from - */ -int arch_hibernation_header_restore(void *addr) -{ - struct restore_data_record *rdr = addr; - - restore_jump_address = rdr->jump_address; - jump_address_phys = rdr->jump_address_phys; - restore_cr3 = rdr->cr3; - - if (rdr->magic != RESTORE_MAGIC) { - pr_crit("Unrecognized hibernate image header format!\n"); - return -EINVAL; - } - - if (hibernation_e820_mismatch(rdr->e820_digest)) { - pr_crit("Hibernate inconsistent memory map detected!\n"); - return -ENODEV; - } - - return 0; -} diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S index fd369a6e9ff8..3008baa2fa95 100644 --- a/arch/x86/power/hibernate_asm_64.S +++ b/arch/x86/power/hibernate_asm_64.S @@ -59,7 +59,7 @@ ENTRY(restore_image) movq restore_cr3(%rip), %r9 /* prepare to switch to temporary page tables */ - movq temp_level4_pgt(%rip), %rax + movq temp_pgt(%rip), %rax movq mmu_cr4_features(%rip), %rbx /* prepare to copy image data to their original locations */ From patchwork Tue Sep 11 17:21:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chen Yu X-Patchwork-Id: 10595989 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9097A112B for ; Tue, 11 Sep 2018 17:15:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7408B29AD7 for ; Tue, 11 Sep 2018 17:15:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 680E129AE4; Tue, 11 Sep 2018 17:15:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BEFEE29AD7 for ; Tue, 11 Sep 2018 17:15:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728298AbeIKWPZ (ORCPT ); Tue, 11 Sep 2018 18:15:25 -0400 Received: from mga02.intel.com ([134.134.136.20]:31116 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728299AbeIKWPQ (ORCPT ); Tue, 11 Sep 2018 18:15:16 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Sep 2018 10:15:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,361,1531810800"; d="scan'208";a="262579658" Received: from chenyu-desktop.sh.intel.com ([10.239.160.116]) by fmsmga006.fm.intel.com with ESMTP; 11 Sep 2018 10:14:56 -0700 From: Chen Yu To: Thomas Gleixner , "Rafael J. Wysocki" Cc: Pavel Machek , Rui Zhang , Chen Yu , Zhimin Gu , Len Brown , linux-kernel@vger.kernel.org, x86@kernel.org, linux-pm@vger.kernel.org Subject: [PATCH 4/4][v2] x86, hibernate: Backport several fixes from 64bits to 32bits hibernation Date: Wed, 12 Sep 2018 01:21:00 +0800 Message-Id: <6c1b31e24b78b0366c7ca9fa6ac75bdb463d8fc0.1536685746.git.yu.c.chen@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Zhimin Gu Currently there are mainly three bugs in 32bits system when doing hibernation: 1. The page copy code is not running in safe page, which might cause hang during resume. 2. There's no text mapping for the final jump address of the original kernel, which might cause the system jumping into illegal address and causes system hang during resume. 3. The restore kernel switches to its own kernel page table(swapper_pg_dir) rather than the original kernel page table after all the pages been copied back, which might cause invalid virtual-physical mapping issue during resume. To solve these problems: 1. Copy the code core_restore_code to a safe page, to avoid the instruction code been overwritten when image kernel pages are being copied. 2. Set up temporary text mapping for the image kernel's jump address, so that after all the pages have been copied back, the system could jump to this address. 3. Switch to the original kernel page table during resume. Furthermore, MD5 hash check for e820 map is also backported from 64bits system. Acked-by: Chen Yu Cc: "Rafael J. Wysocki" Cc: Thomas Gleixner Signed-off-by: Zhimin Gu --- arch/x86/Kconfig | 2 +- arch/x86/include/asm/suspend_32.h | 4 +++ arch/x86/power/hibernate.c | 2 -- arch/x86/power/hibernate_32.c | 37 +++++++++++++++++++++++ arch/x86/power/hibernate_asm_32.S | 49 +++++++++++++++++++++++++------ 5 files changed, 82 insertions(+), 12 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 1a0be022f91d..e8de5de1057f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2422,7 +2422,7 @@ menu "Power management and ACPI options" config ARCH_HIBERNATION_HEADER def_bool y - depends on X86_64 && HIBERNATION + depends on HIBERNATION source "kernel/power/Kconfig" diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h index 8be6afb58471..fdbd9d7b7bca 100644 --- a/arch/x86/include/asm/suspend_32.h +++ b/arch/x86/include/asm/suspend_32.h @@ -32,4 +32,8 @@ struct saved_context { unsigned long return_address; } __attribute__((packed)); +/* routines for saving/restoring kernel state */ +extern char core_restore_code[]; +extern char restore_registers[]; + #endif /* _ASM_X86_SUSPEND_32_H */ diff --git a/arch/x86/power/hibernate.c b/arch/x86/power/hibernate.c index 6aeac4d3c9df..d719d156114b 100644 --- a/arch/x86/power/hibernate.c +++ b/arch/x86/power/hibernate.c @@ -69,7 +69,6 @@ int pfn_is_nosave(unsigned long pfn) return pfn >= nosave_begin_pfn && pfn < nosave_end_pfn; } -#ifdef CONFIG_X86_64 static int relocate_restore_code(void) { pgd_t *pgd; @@ -262,4 +261,3 @@ int arch_hibernation_header_restore(void *addr) return 0; } -#endif diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c index e0e7b9aea22a..d692700047bf 100644 --- a/arch/x86/power/hibernate_32.c +++ b/arch/x86/power/hibernate_32.c @@ -135,6 +135,32 @@ static inline void resume_init_first_level_page_table(pgd_t *pg_dir) #endif } +static int set_up_temporary_text_mapping(pgd_t *pgd_base) +{ + pgd_t *pgd; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_base + pgd_index(restore_jump_address); + + pmd = resume_one_md_table_init(pgd); + if (!pmd) + return -ENOMEM; + + if (boot_cpu_has(X86_FEATURE_PSE)) { + set_pmd(pmd + pmd_index(restore_jump_address), + __pmd((jump_address_phys & PMD_MASK) | pgprot_val(PAGE_KERNEL_LARGE_EXEC))); + } else { + pte = resume_one_page_table_init(pmd); + if (!pte) + return -ENOMEM; + set_pte(pte + pte_index(restore_jump_address), + __pte((jump_address_phys & PAGE_MASK) | pgprot_val(PAGE_KERNEL_EXEC))); + } + + return 0; +} + asmlinkage int swsusp_arch_resume(void) { int error; @@ -144,10 +170,21 @@ asmlinkage int swsusp_arch_resume(void) return -ENOMEM; resume_init_first_level_page_table(resume_pg_dir); + + error = set_up_temporary_text_mapping(resume_pg_dir); + if (error) + return error; + error = resume_physical_mapping_init(resume_pg_dir); if (error) return error; + temp_pgt = __pa(resume_pg_dir); + + error = relocate_restore_code(); + if (error) + return error; + /* We have got enough memory and from now on we cannot recover */ restore_image(); return 0; diff --git a/arch/x86/power/hibernate_asm_32.S b/arch/x86/power/hibernate_asm_32.S index 6e56815e13a0..a53b4a41e09a 100644 --- a/arch/x86/power/hibernate_asm_32.S +++ b/arch/x86/power/hibernate_asm_32.S @@ -24,21 +24,40 @@ ENTRY(swsusp_arch_suspend) pushfl popl saved_context_eflags + /* save cr3 */ + movl %cr3, %eax + movl %eax, restore_cr3 + call swsusp_save ret +ENDPROC(swsusp_arch_suspend) ENTRY(restore_image) - movl mmu_cr4_features, %ecx - movl resume_pg_dir, %eax - subl $__PAGE_OFFSET, %eax + /* prepare to jump to the image kernel */ + movl restore_jump_address, %ebx + movl restore_cr3, %ebp + + movl mmu_cr4_features, %edx + + /* jump to relocated restore code */ + movl relocated_restore_code, %eax + jmpl *%eax + + /* code below has been relocated to a safe page */ +ENTRY(core_restore_code) + movl temp_pgt, %eax movl %eax, %cr3 + /* flush TLB */ + movl %edx, %ecx jecxz 1f # cr4 Pentium and higher, skip if zero andl $~(X86_CR4_PGE), %ecx movl %ecx, %cr4; # turn off PGE movl %cr3, %eax; # flush TLB movl %eax, %cr3 + movl %edx, %cr4; # turn PGE back on 1: + /* prepare to copy image data to their original locations */ movl restore_pblist, %edx .p2align 4,,7 @@ -49,7 +68,7 @@ copy_loop: movl pbe_address(%edx), %esi movl pbe_orig_address(%edx), %edi - movl $1024, %ecx + movl $(PAGE_SIZE >> 2), %ecx rep movsl @@ -58,13 +77,22 @@ copy_loop: .p2align 4,,7 done: + jmpl *%ebx + .align PAGE_SIZE + +ENTRY(restore_registers) /* go back to the original page tables */ - movl $swapper_pg_dir, %eax - subl $__PAGE_OFFSET, %eax - movl %eax, %cr3 - movl mmu_cr4_features, %ecx + movl %ebp, %cr3 + + /* flush TLB */ + movl mmu_cr4_features, %edx + movl %edx, %ecx jecxz 1f # cr4 Pentium and higher, skip if zero - movl %ecx, %cr4; # turn PGE back on + andl $~(X86_CR4_PGE), %ecx + movl %ecx, %cr4; # turn off PGE + movl %cr3, %ecx; # flush TLB + movl %ecx, %cr3; + movl %edx, %cr4; # turn PGE back on 1: movl saved_context_esp, %esp @@ -82,4 +110,7 @@ done: xorl %eax, %eax + movl %eax, in_suspend + ret +ENDPROC(restore_registers)