From patchwork Fri May 20 21:59:30 2016
X-Patchwork-Submitter: Kees Cook
X-Patchwork-Id: 9130099
References: <573DF82D.50006@deltatee.com> <20160520071517.GB14191@gmail.com> <7b865a03-484f-2d10-aa3e-d9c0d04caecb@tycho.nsa.gov>
From: Kees Cook
Date: Fri, 20 May 2016 14:59:30 -0700
Subject: Re: PROBLEM: Resume form hibernate broken by setting NX on gap
To: "Rafael J. Wysocki"
Cc: Stephen Smalley, Ingo Molnar, Logan Gunthorpe, Ingo Molnar, "the arch/x86 maintainers", "linux-pm@vger.kernel.org", Linux Kernel Mailing List, Andy Lutomirski, Borislav Petkov, Denys Vlasenko, Brian Gerst
X-Mailing-List: linux-pm@vger.kernel.org

On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki wrote:
> On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley wrote:
>> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
>>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar wrote:
>>>>
>>>> * Logan Gunthorpe wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have been working on a bug that causes my laptop to freeze during
>>>>> resume from hibernation. I did a bisect to find the offending commit:
>>>>>
>>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
>>>>>
>>>>> There is more information in the bugzilla report [1] that
>>>>> I've been working on but I will summarize things below.
>>>>>
>>>>> I've experienced intermittent but reproducible freezes when resuming
>>>>> from hibernation since about kernel version 3.19. The freeze was
>>>>> significantly more reproducible when a few applications were loaded
>>>>> before hibernation and would largely not happen if hibernated
>>>>> immediately after booting to a desktop. I did some tracing work to find
>>>>> that the kernel gets as far as the resume_image call in
>>>>> swsusp_arch_resume and I could not find any response from the image
>>>>> kernel when I hit the bug. I also did testing that seemed to rule out
>>>>> this being caused by a problematic driver.
>>>>>
>>>>> I did a successful bisect between 3.18 and 3.19 which found a bug in
>>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in 4.4.
>>>>> Then, I did a second bisect with a ported version of the fix to the
>>>>> first bug and found commit ab76f7b4ab in 4.3 to also break hibernation
>>>>> with what appears to be the exact same symptoms. Reverting that commit
>>>>> in recent kernels up to and including 4.6 fixes the issue and restores
>>>>> reliable hibernation. However, it's not at all clear to me why that
>>>>> commit would cause this issue or how to fix the issue without reverting.
>>>>
>>>> I've attached that commit below and also Cc:-ed a few more people who might have
>>>> an idea about why this regressed. Worst-case we'll have to revert it.
>>>
>>> Without looking deep into mm, my theory would be that after this patch
>>> the final jump from the boot kernel to the image kernel's trampoline
>>> code during resume may crash the kernel if the trampoline page turns
>>> out to be NX in the boot kernel (it has to be executable in both the
>>> boot and the image kernels).
>>
>> So, pardon my ignorance, but where is this trampoline page placed in
>> kernel memory?
>
> On 32-bit its location has to be the same in both the boot and the
> image kernels and that's within kernel text in both cases, so that
> shouldn't be a problem.
>
> On 64-bit its location depends on the image kernel and specifically on
> the location of the restore_registers routine in it. The (virtual)
> address of that routine is stored in the restore_jump_address
> variable, so the page containing it (the trampoline page) can be found
> with the help of that.
>
> swsusp_arch_resume() sets up a temporary kernel mapping to finalize
> the image restoration and that page must not be NX in that mapping for
> things to work.

It looks like nothing in the swsusp_arch_resume() ->
get_safe_page() -> get_image_page() path sets the page executable...

Untested, but I wonder if this would work in swsusp_arch_resume() before
the memcpy? (apologies for any gmail-based whitespace mangling...)

-Kees

diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index 009947d419a6..c2f3ecc45bd4 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -12,6 +12,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -89,6 +90,7 @@ int swsusp_arch_resume(void)
 	relocated_restore_code = (void *)get_safe_page(GFP_ATOMIC);
 	if (!relocated_restore_code)
 		return -ENOMEM;
+	set_memory_x((unsigned long)relocated_restore_code, 1);
 	memcpy(relocated_restore_code, &core_restore_code,
 	       &restore_registers - &core_restore_code);
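
For anyone checking the arguments to set_memory_x() above (a start
address plus a page count, here 1): the userspace sketch below shows the
page arithmetic involved, i.e. rounding an address down to its page and
counting how many pages a byte range touches. It is only an
illustration, not kernel code. The SKETCH_PAGE_SIZE constant and both
sketch_* helpers are hypothetical names made up for this example, and
4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (the usual base page size on x86). */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/* Hypothetical helper: round an address down to the start of its page. */
uintptr_t sketch_page_base(uintptr_t addr)
{
	/* The mask trick below only works for power-of-two page sizes. */
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Hypothetical helper: number of pages touched by [addr, addr + len). */
size_t sketch_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = sketch_page_base(addr);
	last = sketch_page_base(addr + len - 1);
	return (size_t)((last - first) / SKETCH_PAGE_SIZE) + 1;
}
```

With these helpers, a page-aligned buffer holding at most one page of
code spans exactly one page, which would match the count of 1 passed in
the patch as long as the copied restore code fits in the single page
returned by get_safe_page().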