diff mbox

PROBLEM: Resume form hibernate broken by setting NX on gap

Message ID CAGXu5jJ+hRAVNfa=xWd9c9On5vC6k+kZKkaXGSXS5WfjTK2QcA@mail.gmail.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Kees Cook May 20, 2016, 9:59 p.m. UTC
On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
>> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
>>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>>
>>>> * Logan Gunthorpe <logang@deltatee.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have been working on a bug that causes my laptop to freeze during
>>>>> resume from hibernation. I did a bisect to find the offending commit:
>>>>>
>>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
>>>>>
>>>>> There is more information in the bugzilla report [1] that
>>>>> I've been working on but I will summarize things below.
>>>>>
>>>>> I've experienced intermittent but reproducible freezes when resuming
>>>>> from hibernation since about kernel version 3.19. The freeze was
>>>>> significantly more reproducible when a few applications were loaded
>>>>> before hibernation and would largely not happen if hibernated
>>>>> immediately after booting to a desktop. I did some tracing work to find
>>>>> that the kernel gets as far as the resume_image call in
>>>>> swsusp_arch_resume and I could not find any response from the image
>>>>> kernel when I hit the bug. I also did testing that seemed to rule out
>>>>> this being caused by a problematic driver.
>>>>>
>>>>> I did a successful bisect between 3.18 and 3.19 which found a bug in
>>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in 4.4.
>>>>> Then, I did a second bisect with a ported version of the fix to the
>>>>> first bug and found commit ab76f7b4ab in 4.3 to also break hibernation
>>>>> with what appears to be the exact same symptoms. Reverting that commit
>>>>> in recent kernels up to and including 4.6 fixes the issue and restores
>>>>> reliable hibernation. However, it's not at all clear to me why that
>>>>> commit would cause this issue or how to fix the issue without reverting.
>>>>
>>>> I've attached that commit below and also Cc:-ed a few more people who might have
>>>> an idea about why this regressed. Worst-case we'll have to revert it.
>>>
>>> Without looking deep into mm, my theory would be that after this patch
>>> the final jump from the boot kernel to the image kernel's trampoline
>>> code during resume may crash the kernel if the trampoline page turns
>>> out to be NX in the boot kernel (it has to be executable in both the
>>> boot and the image kernels).
>>
>> So, pardon my ignorance, but where is this trampoline page placed in
>> kernel memory?
>
> On 32-bit its location has to be the same in both the boot and the
> image kernels and that's within kernel text in both cases, so that
> shouldn't be a problem.
>
> On 64-bit its location depends on the image kernel and specifically on
> the location of the restore_registers routine in it.  The (virtual)
> address of that routine is stored in the restore_jump_address
> variable, so the page containing it (the trampoline page) can be found
> with the help of that.
>
> swsusp_arch_resume() sets up a temporary kernel mapping to finalize
> the image restoration and that page must not be NX in that mapping for
> things to work.

It looks like nothing in the swsusp_arch_resume() -> get_safe_page()
-> get_image_page() path sets the page executable...

Untested, but I wonder if this work work in swsusp_arch_resume()
before the memcpy?

(apologies for any gmail-based whitespace mangling...)


-Kees

Comments

Kees Cook May 20, 2016, 10:16 p.m. UTC | #1
On Fri, May 20, 2016 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
>>> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
>>>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>>>
>>>>> * Logan Gunthorpe <logang@deltatee.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been working on a bug that causes my laptop to freeze during
>>>>>> resume from hibernation. I did a bisect to find the offending commit:
>>>>>>
>>>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
>>>>>>
>>>>>> There is more information in the bugzilla report [1] that
>>>>>> I've been working on but I will summarize things below.
>>>>>>
>>>>>> I've experienced intermittent but reproducible freezes when resuming
>>>>>> from hibernation since about kernel version 3.19. The freeze was
>>>>>> significantly more reproducible when a few applications were loaded
>>>>>> before hibernation and would largely not happen if hibernated
>>>>>> immediately after booting to a desktop. I did some tracing work to find
>>>>>> that the kernel gets as far as the resume_image call in
>>>>>> swsusp_arch_resume and I could not find any response from the image
>>>>>> kernel when I hit the bug. I also did testing that seemed to rule out
>>>>>> this being caused by a problematic driver.
>>>>>>
>>>>>> I did a successful bisect between 3.18 and 3.19 which found a bug in
>>>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in 4.4.
>>>>>> Then, I did a second bisect with a ported version of the fix to the
>>>>>> first bug and found commit ab76f7b4ab in 4.3 to also break hibernation
>>>>>> with what appears to be the exact same symptoms. Reverting that commit
>>>>>> in recent kernels up to and including 4.6 fixes the issue and restores
>>>>>> reliable hibernation. However, it's not at all clear to me why that
>>>>>> commit would cause this issue or how to fix the issue without reverting.
>>>>>
>>>>> I've attached that commit below and also Cc:-ed a few more people who might have
>>>>> an idea about why this regressed. Worst-case we'll have to revert it.
>>>>
>>>> Without looking deep into mm, my theory would be that after this patch
>>>> the final jump from the boot kernel to the image kernel's trampoline
>>>> code during resume may crash the kernel if the trampoline page turns
>>>> out to be NX in the boot kernel (it has to be executable in both the
>>>> boot and the image kernels).
>>>
>>> So, pardon my ignorance, but where is this trampoline page placed in
>>> kernel memory?
>>
>> On 32-bit its location has to be the same in both the boot and the
>> image kernels and that's within kernel text in both cases, so that
>> shouldn't be a problem.
>>
>> On 64-bit its location depends on the image kernel and specifically on
>> the location of the restore_registers routine in it.  The (virtual)
>> address of that routine is stored in the restore_jump_address
>> variable, so the page containing it (the trampoline page) can be found
>> with the help of that.
>>
>> swsusp_arch_resume() sets up a temporary kernel mapping to finalize
>> the image restoration and that page must not be NX in that mapping for
>> things to work.
>
> It looks like nothing in the swsusp_arch_resume() -> get_safe_page()
> -> get_image_page() path sets the page executable...
>
> Untested, but I wonder if this work work in swsusp_arch_resume()
> before the memcpy?

I can't type today, it seems. It should read "... if this would work ..."

If you can test this and it works for you, I'll send a proper patch... :P

-Kees

>
> (apologies for any gmail-based whitespace mangling...)
>
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 009947d419a6..c2f3ecc45bd4 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -12,6 +12,7 @@
>  #include <linux/smp.h>
>  #include <linux/suspend.h>
>
> +#include <asm/cacheflush.h>
>  #include <asm/init.h>
>  #include <asm/proto.h>
>  #include <asm/page.h>
> @@ -89,6 +90,7 @@ int swsusp_arch_resume(void)
>         relocated_restore_code = (void *)get_safe_page(GFP_ATOMIC);
>         if (!relocated_restore_code)
>                 return -ENOMEM;
> +       set_memory_x((unsigned long)relocated_restore_code, 1);
>         memcpy(relocated_restore_code, &core_restore_code,
>                &restore_registers - &core_restore_code);
>
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security
Kees Cook May 21, 2016, 4:39 p.m. UTC | #2
On Fri, May 20, 2016 at 6:57 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 20/05/16 04:16 PM, Kees Cook wrote:
>>
>> On Fri, May 20, 2016 at 2:59 PM, Kees Cook <keescook@chromium.org> wrote:
>>>
>>> On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki <rafael@kernel.org>
>>> wrote:
>>>>
>>>> On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley <sds@tycho.nsa.gov>
>>>> wrote:
>>>>>
>>>>> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
>>>>>>
>>>>>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> * Logan Gunthorpe <logang@deltatee.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have been working on a bug that causes my laptop to freeze during
>>>>>>>> resume from hibernation. I did a bisect to find the offending
>>>>>>>> commit:
>>>>>>>>
>>>>>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
>>>>>>>>
>>>>>>>> There is more information in the bugzilla report [1] that
>>>>>>>> I've been working on but I will summarize things below.
>>>>>>>>
>>>>>>>> I've experienced intermittent but reproducible freezes when resuming
>>>>>>>> from hibernation since about kernel version 3.19. The freeze was
>>>>>>>> significantly more reproducible when a few applications were loaded
>>>>>>>> before hibernation and would largely not happen if hibernated
>>>>>>>> immediately after booting to a desktop. I did some tracing work to
>>>>>>>> find
>>>>>>>> that the kernel gets as far as the resume_image call in
>>>>>>>> swsusp_arch_resume and I could not find any response from the image
>>>>>>>> kernel when I hit the bug. I also did testing that seemed to rule
>>>>>>>> out
>>>>>>>> this being caused by a problematic driver.
>>>>>>>>
>>>>>>>> I did a successful bisect between 3.18 and 3.19 which found a bug in
>>>>>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in
>>>>>>>> 4.4.
>>>>>>>> Then, I did a second bisect with a ported version of the fix to the
>>>>>>>> first bug and found commit ab76f7b4ab in 4.3 to also break
>>>>>>>> hibernation
>>>>>>>> with what appears to be the exact same symptoms. Reverting that
>>>>>>>> commit
>>>>>>>> in recent kernels up to and including 4.6 fixes the issue and
>>>>>>>> restores
>>>>>>>> reliable hibernation. However, it's not at all clear to me why that
>>>>>>>> commit would cause this issue or how to fix the issue without
>>>>>>>> reverting.
>>>>>>>
>>>>>>>
>>>>>>> I've attached that commit below and also Cc:-ed a few more people who
>>>>>>> might have
>>>>>>> an idea about why this regressed. Worst-case we'll have to revert it.
>>>>>>
>>>>>>
>>>>>> Without looking deep into mm, my theory would be that after this patch
>>>>>> the final jump from the boot kernel to the image kernel's trampoline
>>>>>> code during resume may crash the kernel if the trampoline page turns
>>>>>> out to be NX in the boot kernel (it has to be executable in both the
>>>>>> boot and the image kernels).
>>>>>
>>>>>
>>>>> So, pardon my ignorance, but where is this trampoline page placed in
>>>>> kernel memory?
>>>>
>>>>
>>>> On 32-bit its location has to be the same in both the boot and the
>>>> image kernels and that's within kernel text in both cases, so that
>>>> shouldn't be a problem.
>>>>
>>>> On 64-bit its location depends on the image kernel and specifically on
>>>> the location of the restore_registers routine in it.  The (virtual)
>>>> address of that routine is stored in the restore_jump_address
>>>> variable, so the page containing it (the trampoline page) can be found
>>>> with the help of that.
>>>>
>>>> swsusp_arch_resume() sets up a temporary kernel mapping to finalize
>>>> the image restoration and that page must not be NX in that mapping for
>>>> things to work.
>>>
>>>
>>> It looks like nothing in the swsusp_arch_resume() -> get_safe_page()
>>> -> get_image_page() path sets the page executable...
>>>
>>> Untested, but I wonder if this work work in swsusp_arch_resume()
>>> before the memcpy?
>>
>>
>> I can't type today, it seems. It should read "... if this would work ..."
>>
>> If you can test this and it works for you, I'll send a proper patch... :P
>>
>> -Kees
>>
>
> Hi Kees,
>
> Thanks. I tried the patch but it only resulted in a kernel warning and
> freeze. I've attached a photo showing as much of the messages as I could
> get.
>
> Logan

Ah, dang, ok, thanks for trying it. I'll let Rafael try to figure this one out.

-Kees
Kees Cook June 10, 2016, 6:09 p.m. UTC | #3
On Thu, Jun 9, 2016 at 9:14 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Hey,
>
> I've still be trying to figure this out as I have time.
>
> I tried printing a couple restore addresses and nothing I can find seems
> anywhere near the rodata/ex_table boundary.
>
> I tried with the (badly formatted) below and got the following. Nothing too
> surprising. I've attached a kallsyms that matches the kernel for reference.
>
> restore_code: ffff880157c3b000
> jump_addr: ffffffff81446be0
>
>
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 009947d..6efedb7 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -92,6 +92,9 @@ int swsusp_arch_resume(void)
>         memcpy(relocated_restore_code, &core_restore_code,
>                &restore_registers - &core_restore_code);
>
> +       pr_info("restore_code: %p\n", relocated_restore_code);
> +       pr_info("jump_addr: %lx\n", restore_jump_address);
> +

Also interesting would be the "relocated_restore_code" address, as
well as a dump of /sys/kernel/debug/kernel_page_tables (from
CONFIG_X86_PTDUMP).

I'm baffled by the problem, but the best I can understand is the the
relocated_restore_code range isn't executable (which should be visible
from finding it in /sys/kernel/debug/kernel_page_tables), but I don't
see how to solve that since my original patch didn't work.

Rafael, is this something you have time to look at quickly?

-Kees

>         restore_image();
>         return 0;
>  }
>
>
> Thanks,
>
> Logan
>
>
>
> On 21/05/16 10:39 AM, Kees Cook wrote:
>>
>> On Fri, May 20, 2016 at 6:57 PM, Logan Gunthorpe <logang@deltatee.com>
>> wrote:
>>>
>>> On 20/05/16 04:16 PM, Kees Cook wrote:
>>>>
>>>>
>>>> On Fri, May 20, 2016 at 2:59 PM, Kees Cook <keescook@chromium.org>
>>>> wrote:
>>>>>
>>>>>
>>>>> On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki <rafael@kernel.org>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley <sds@tycho.nsa.gov>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar <mingo@kernel.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> * Logan Gunthorpe <logang@deltatee.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have been working on a bug that causes my laptop to freeze
>>>>>>>>>> during
>>>>>>>>>> resume from hibernation. I did a bisect to find the offending
>>>>>>>>>> commit:
>>>>>>>>>>
>>>>>>>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
>>>>>>>>>>
>>>>>>>>>> There is more information in the bugzilla report [1] that
>>>>>>>>>> I've been working on but I will summarize things below.
>>>>>>>>>>
>>>>>>>>>> I've experienced intermittent but reproducible freezes when
>>>>>>>>>> resuming
>>>>>>>>>> from hibernation since about kernel version 3.19. The freeze was
>>>>>>>>>> significantly more reproducible when a few applications were
>>>>>>>>>> loaded
>>>>>>>>>> before hibernation and would largely not happen if hibernated
>>>>>>>>>> immediately after booting to a desktop. I did some tracing work to
>>>>>>>>>> find
>>>>>>>>>> that the kernel gets as far as the resume_image call in
>>>>>>>>>> swsusp_arch_resume and I could not find any response from the
>>>>>>>>>> image
>>>>>>>>>> kernel when I hit the bug. I also did testing that seemed to rule
>>>>>>>>>> out
>>>>>>>>>> this being caused by a problematic driver.
>>>>>>>>>>
>>>>>>>>>> I did a successful bisect between 3.18 and 3.19 which found a bug
>>>>>>>>>> in
>>>>>>>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in
>>>>>>>>>> 4.4.
>>>>>>>>>> Then, I did a second bisect with a ported version of the fix to
>>>>>>>>>> the
>>>>>>>>>> first bug and found commit ab76f7b4ab in 4.3 to also break
>>>>>>>>>> hibernation
>>>>>>>>>> with what appears to be the exact same symptoms. Reverting that
>>>>>>>>>> commit
>>>>>>>>>> in recent kernels up to and including 4.6 fixes the issue and
>>>>>>>>>> restores
>>>>>>>>>> reliable hibernation. However, it's not at all clear to me why
>>>>>>>>>> that
>>>>>>>>>> commit would cause this issue or how to fix the issue without
>>>>>>>>>> reverting.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've attached that commit below and also Cc:-ed a few more people
>>>>>>>>> who
>>>>>>>>> might have
>>>>>>>>> an idea about why this regressed. Worst-case we'll have to revert
>>>>>>>>> it.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Without looking deep into mm, my theory would be that after this
>>>>>>>> patch
>>>>>>>> the final jump from the boot kernel to the image kernel's trampoline
>>>>>>>> code during resume may crash the kernel if the trampoline page turns
>>>>>>>> out to be NX in the boot kernel (it has to be executable in both the
>>>>>>>> boot and the image kernels).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So, pardon my ignorance, but where is this trampoline page placed in
>>>>>>> kernel memory?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 32-bit its location has to be the same in both the boot and the
>>>>>> image kernels and that's within kernel text in both cases, so that
>>>>>> shouldn't be a problem.
>>>>>>
>>>>>> On 64-bit its location depends on the image kernel and specifically on
>>>>>> the location of the restore_registers routine in it.  The (virtual)
>>>>>> address of that routine is stored in the restore_jump_address
>>>>>> variable, so the page containing it (the trampoline page) can be found
>>>>>> with the help of that.
>>>>>>
>>>>>> swsusp_arch_resume() sets up a temporary kernel mapping to finalize
>>>>>> the image restoration and that page must not be NX in that mapping for
>>>>>> things to work.
>>>>>
>>>>>
>>>>>
>>>>> It looks like nothing in the swsusp_arch_resume() -> get_safe_page()
>>>>> -> get_image_page() path sets the page executable...
>>>>>
>>>>> Untested, but I wonder if this work work in swsusp_arch_resume()
>>>>> before the memcpy?
>>>>
>>>>
>>>>
>>>> I can't type today, it seems. It should read "... if this would work
>>>> ..."
>>>>
>>>> If you can test this and it works for you, I'll send a proper patch...
>>>> :P
>>>>
>>>> -Kees
>>>>
>>>
>>> Hi Kees,
>>>
>>> Thanks. I tried the patch but it only resulted in a kernel warning and
>>> freeze. I've attached a photo showing as much of the messages as I could
>>> get.
>>>
>>> Logan
>>
>>
>> Ah, dang, ok, thanks for trying it. I'll let Rafael try to figure this one
>> out.
>>
>> -Kees
>>
>
Logan Gunthorpe June 10, 2016, 6:16 p.m. UTC | #4
Hey,

On 10/06/16 12:09 PM, Kees Cook wrote:
>> restore_code: ffff880157c3b000
>> jump_addr: ffffffff81446be0
>>
>>
>> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
>> index 009947d..6efedb7 100644
>> --- a/arch/x86/power/hibernate_64.c
>> +++ b/arch/x86/power/hibernate_64.c
>> @@ -92,6 +92,9 @@ int swsusp_arch_resume(void)
>>         memcpy(relocated_restore_code, &core_restore_code,
>>                &restore_registers - &core_restore_code);
>>
>> +       pr_info("restore_code: %p\n", relocated_restore_code);
>> +       pr_info("jump_addr: %lx\n", restore_jump_address);
>> +
> 
> Also interesting would be the "relocated_restore_code" address, as
> well as a dump of /sys/kernel/debug/kernel_page_tables (from
> CONFIG_X86_PTDUMP).

Is that not what I printed? If not, can you give me a better hint as to
what you're looking for so I can spin another kernel? I'll also provide
the kernel_page_tables once I do that.

> I'm baffled by the problem, but the best I can understand is the the
> relocated_restore_code range isn't executable (which should be visible
> from finding it in /sys/kernel/debug/kernel_page_tables), but I don't
> see how to solve that since my original patch didn't work.

Yeah this is definitely a baffling problem.

Thanks,

Logan
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kees Cook June 10, 2016, 6:18 p.m. UTC | #5
On Fri, Jun 10, 2016 at 11:16 AM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Hey,
>
> On 10/06/16 12:09 PM, Kees Cook wrote:
>>> restore_code: ffff880157c3b000
>>> jump_addr: ffffffff81446be0
>>>
>>>
>>> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
>>> index 009947d..6efedb7 100644
>>> --- a/arch/x86/power/hibernate_64.c
>>> +++ b/arch/x86/power/hibernate_64.c
>>> @@ -92,6 +92,9 @@ int swsusp_arch_resume(void)
>>>         memcpy(relocated_restore_code, &core_restore_code,
>>>                &restore_registers - &core_restore_code);
>>>
>>> +       pr_info("restore_code: %p\n", relocated_restore_code);
>>> +       pr_info("jump_addr: %lx\n", restore_jump_address);
>>> +
>>
>> Also interesting would be the "relocated_restore_code" address, as
>> well as a dump of /sys/kernel/debug/kernel_page_tables (from
>> CONFIG_X86_PTDUMP).
>
> Is that not what I printed? If not, can you give me a better hint as to

Oh, whoops, sorry, I saw "restore_code" in the pr_info and
"relocate_restore_code" in the memcpy and didn't scan the right thing
in the pr_info line. :)

> what you're looking for so I can spin another kernel? I'll also provide
> the kernel_page_tables once I do that.

Cool, thanks.

>
>> I'm baffled by the problem, but the best I can understand is the the
>> relocated_restore_code range isn't executable (which should be visible
>> from finding it in /sys/kernel/debug/kernel_page_tables), but I don't
>> see how to solve that since my original patch didn't work.
>
> Yeah this is definitely a baffling problem.

-Kees
Rafael J. Wysocki June 10, 2016, 9:27 p.m. UTC | #6
On Friday, June 10, 2016 11:09:22 AM Kees Cook wrote:
> On Thu, Jun 9, 2016 at 9:14 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> > Hey,
> >
> > I've still be trying to figure this out as I have time.
> >
> > I tried printing a couple restore addresses and nothing I can find seems
> > anywhere near the rodata/ex_table boundary.
> >
> > I tried with the (badly formatted) below and got the following. Nothing too
> > surprising. I've attached a kallsyms that matches the kernel for reference.
> >
> > restore_code: ffff880157c3b000
> > jump_addr: ffffffff81446be0
> >
> >
> > diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> > index 009947d..6efedb7 100644
> > --- a/arch/x86/power/hibernate_64.c
> > +++ b/arch/x86/power/hibernate_64.c
> > @@ -92,6 +92,9 @@ int swsusp_arch_resume(void)
> >         memcpy(relocated_restore_code, &core_restore_code,
> >                &restore_registers - &core_restore_code);
> >
> > +       pr_info("restore_code: %p\n", relocated_restore_code);
> > +       pr_info("jump_addr: %lx\n", restore_jump_address);
> > +
> 
> Also interesting would be the "relocated_restore_code" address, as
> well as a dump of /sys/kernel/debug/kernel_page_tables (from
> CONFIG_X86_PTDUMP).
> 
> I'm baffled by the problem, but the best I can understand is the the
> relocated_restore_code range isn't executable (which should be visible
> from finding it in /sys/kernel/debug/kernel_page_tables), but I don't
> see how to solve that since my original patch didn't work.
> 
> Rafael, is this something you have time to look at quickly?

Well, not really, but I'll do my best to look at it in the next few days.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki June 10, 2016, 10:11 p.m. UTC | #7
On Friday, May 20, 2016 02:59:30 PM Kees Cook wrote:
> On Fri, May 20, 2016 at 2:46 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Fri, May 20, 2016 at 3:56 PM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
> >> On 05/20/2016 07:34 AM, Rafael J. Wysocki wrote:
> >>> On Fri, May 20, 2016 at 9:15 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >>>>
> >>>> * Logan Gunthorpe <logang@deltatee.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I have been working on a bug that causes my laptop to freeze during
> >>>>> resume from hibernation. I did a bisect to find the offending commit:
> >>>>>
> >>>>> [ab76f7b4ab] x86/mm: Set NX on gap between __ex_table and rodata
> >>>>>
> >>>>> There is more information in the bugzilla report [1] that
> >>>>> I've been working on but I will summarize things below.
> >>>>>
> >>>>> I've experienced intermittent but reproducible freezes when resuming
> >>>>> from hibernation since about kernel version 3.19. The freeze was
> >>>>> significantly more reproducible when a few applications were loaded
> >>>>> before hibernation and would largely not happen if hibernated
> >>>>> immediately after booting to a desktop. I did some tracing work to find
> >>>>> that the kernel gets as far as the resume_image call in
> >>>>> swsusp_arch_resume and I could not find any response from the image
> >>>>> kernel when I hit the bug. I also did testing that seemed to rule out
> >>>>> this being caused by a problematic driver.
> >>>>>
> >>>>> I did a successful bisect between 3.18 and 3.19 which found a bug in
> >>>>> commit f5b2831d6 that was then later fixed by commit 55696b1f66 in 4.4.
> >>>>> Then, I did a second bisect with a ported version of the fix to the
> >>>>> first bug and found commit ab76f7b4ab in 4.3 to also break hibernation
> >>>>> with what appears to be the exact same symptoms. Reverting that commit
> >>>>> in recent kernels up to and including 4.6 fixes the issue and restores
> >>>>> reliable hibernation. However, it's not at all clear to me why that
> >>>>> commit would cause this issue or how to fix the issue without reverting.
> >>>>
> >>>> I've attached that commit below and also Cc:-ed a few more people who might have
> >>>> an idea about why this regressed. Worst-case we'll have to revert it.
> >>>
> >>> Without looking deep into mm, my theory would be that after this patch
> >>> the final jump from the boot kernel to the image kernel's trampoline
> >>> code during resume may crash the kernel if the trampoline page turns
> >>> out to be NX in the boot kernel (it has to be executable in both the
> >>> boot and the image kernels).
> >>
> >> So, pardon my ignorance, but where is this trampoline page placed in
> >> kernel memory?
> >
> > On 32-bit its location has to be the same in both the boot and the
> > image kernels and that's within kernel text in both cases, so that
> > shouldn't be a problem.
> >
> > On 64-bit its location depends on the image kernel and specifically on
> > the location of the restore_registers routine in it.  The (virtual)
> > address of that routine is stored in the restore_jump_address
> > variable, so the page containing it (the trampoline page) can be found
> > with the help of that.
> >
> > swsusp_arch_resume() sets up a temporary kernel mapping to finalize
> > the image restoration and that page must not be NX in that mapping for
> > things to work.
> 
> It looks like nothing in the swsusp_arch_resume() -> get_safe_page()
> -> get_image_page() path sets the page executable...
> 
> Untested, but I wonder if this work work in swsusp_arch_resume()
> before the memcpy?
> 
> (apologies for any gmail-based whitespace mangling...)
> 
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 009947d419a6..c2f3ecc45bd4 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -12,6 +12,7 @@
>  #include <linux/smp.h>
>  #include <linux/suspend.h>
> 
> +#include <asm/cacheflush.h>
>  #include <asm/init.h>
>  #include <asm/proto.h>
>  #include <asm/page.h>
> @@ -89,6 +90,7 @@ int swsusp_arch_resume(void)
>         relocated_restore_code = (void *)get_safe_page(GFP_ATOMIC);
>         if (!relocated_restore_code)
>                 return -ENOMEM;
> +       set_memory_x((unsigned long)relocated_restore_code, 1);
>         memcpy(relocated_restore_code, &core_restore_code,
>                &restore_registers - &core_restore_code);
> 
> 

We may not be on the right track with this I'm afraid.

When the temporary kernel mapping is set up in swsusp_arch_resume(), it passes
__PAGE_KERNEL_LARGE_EXEC as pmd_flag in the x86_mapping_info, so all memory
should be executable after we switch to that which happens right at the
beginning of restore_image().

So restore_image() and the code it jumps to should be fine from the
executability perspective (including the final jump to restore_jump_address).

Or am I missingy anything?

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Logan Gunthorpe June 10, 2016, 10:28 p.m. UTC | #8
On 10/06/16 04:29 PM, Rafael J. Wysocki wrote:
> OK, I have a theory, but I need a bit of help.
> 
> This may be a dumb question, but I don't quite remember the answer readily.
> 
> Given a physical address, how do I get the corresponding virtual one under
> the kernel identity mapping?

Is that not just 'phys_to_virt(x)'??

Logan
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki June 10, 2016, 10:29 p.m. UTC | #9
On Friday, June 10, 2016 11:27:29 PM Rafael J. Wysocki wrote:
> On Friday, June 10, 2016 11:09:22 AM Kees Cook wrote:
> > On Thu, Jun 9, 2016 at 9:14 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> > > Hey,
> > >
> > > I've still be trying to figure this out as I have time.
> > >
> > > I tried printing a couple restore addresses and nothing I can find seems
> > > anywhere near the rodata/ex_table boundary.
> > >
> > > I tried with the (badly formatted) below and got the following. Nothing too
> > > surprising. I've attached a kallsyms that matches the kernel for reference.
> > >
> > > restore_code: ffff880157c3b000
> > > jump_addr: ffffffff81446be0
> > >
> > >
> > > diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> > > index 009947d..6efedb7 100644
> > > --- a/arch/x86/power/hibernate_64.c
> > > +++ b/arch/x86/power/hibernate_64.c
> > > @@ -92,6 +92,9 @@ int swsusp_arch_resume(void)
> > >         memcpy(relocated_restore_code, &core_restore_code,
> > >                &restore_registers - &core_restore_code);
> > >
> > > +       pr_info("restore_code: %p\n", relocated_restore_code);
> > > +       pr_info("jump_addr: %lx\n", restore_jump_address);
> > > +
> > 
> > Also interesting would be the "relocated_restore_code" address, as
> > well as a dump of /sys/kernel/debug/kernel_page_tables (from
> > CONFIG_X86_PTDUMP).
> > 
> > I'm baffled by the problem, but the best I can understand is the the
> > relocated_restore_code range isn't executable (which should be visible
> > from finding it in /sys/kernel/debug/kernel_page_tables), but I don't
> > see how to solve that since my original patch didn't work.
> > 
> > Rafael, is this something you have time to look at quickly?
> 
> Well, not really, but I'll do my best to look at it in the next few days.

OK, I have a theory, but I need a bit of help.

This may be a dumb question, but I don't quite remember the answer readily.

Given a physical address, how do I get the corresponding virtual one under
the kernel identity mapping?

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki June 10, 2016, 10:33 p.m. UTC | #10
On Friday, June 10, 2016 04:28:01 PM Logan Gunthorpe wrote:
> 
> On 10/06/16 04:29 PM, Rafael J. Wysocki wrote:
> > OK, I have a theory, but I need a bit of help.
> > 
> > This may be a dumb question, but I don't quite remember the answer readily.
> > 
> > Given a physical address, how do I get the corresponding virtual one under
> > the kernel identity mapping?
> 
> Is that not just 'phys_to_virt(x)'??

Ah that.  Thanks!

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index 009947d419a6..c2f3ecc45bd4 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -12,6 +12,7 @@ 
 #include <linux/smp.h>
 #include <linux/suspend.h>

+#include <asm/cacheflush.h>
 #include <asm/init.h>
 #include <asm/proto.h>
 #include <asm/page.h>
@@ -89,6 +90,7 @@  int swsusp_arch_resume(void)
        relocated_restore_code = (void *)get_safe_page(GFP_ATOMIC);
        if (!relocated_restore_code)
                return -ENOMEM;
+       set_memory_x((unsigned long)relocated_restore_code, 1);
        memcpy(relocated_restore_code, &core_restore_code,
               &restore_registers - &core_restore_code);