[RFC] Current status, suspend-to-disk support on ARM

Message ID alpine.DEB.2.00.1104121636460.29883@localhost6.localdomain6 (mailing list archive)
State Not Applicable, archived

Commit Message

Frank Hofmann April 12, 2011, 4:30 p.m. UTC
On Tue, 12 Apr 2011, Pavel Machek wrote:

> Hi!
>
>> There's one ugly thing in this set - I've changed a generic kernel
>> header, <linux/suspend.h> to #define save/restore_processor_state()
>> on ARM so that it only does preempt_disable/enable(). It's
>> surprising that this isn't the default behaviour; all platforms need
>> swsusp_arch_suspend/resume anyway so why force the existance of
>> _two_ arch-specific hooks ?
>
> Can you submit separate patch cleaning it up?

How about the attached ?

It's for discussion, and hence not tested (admittedly, I need an x86 test 
system ...).
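
For reference, the #define approach described in the quote above boiled down
to roughly this on the ARM side of <linux/suspend.h> (sketch of the earlier
RFC, not part of the diff attached here):

        #define save_processor_state()          preempt_disable()
        #define restore_processor_state()       preempt_enable()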

The diff is against RMK's devel-stable tree, as of commit 
5f183860d5007ec76ea36bfa6c36d66e37f0dbcf


There's a few hurdles here:

a) who knows the assembly calling conventions for all architectures
    supported by Linux ?

    This applies to SH and S390; save/restore_processor_state on those
    are primitive and should be inlined within swsusp_arch_suspend/resume,
    just like the attached patch does for the powerpc variants.

    I've done a bounce call within S390, but I don't know enough about SH3
    assembly to do the same in arch/sh/kernel/cpu/sh3/swsusp.S - i.e. add a
    call to init_fpu(current) there and then ditch save_processor_state.

b) some architectures do things _beyond_ register/state saving in those
    two functions, and hence rely on them being called paired.

c) CONFIG_KEXEC_JUMP (ab-)uses them. I don't know that subsystem at all,
    but this sort of re-use means the symbols can't simply be ditched.
    They do become arch-internal though; no non-arch/ code would call them
    anymore (see the sketch below).
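
    For illustration, this is roughly how the preserve-context kexec path
    on x86 uses the pair (reconstructed from memory, not a verbatim quote
    of arch/x86/kernel/machine_kexec_32.c):

        if (image->preserve_context)
                save_processor_state();
        /* ... relocate and jump into the new kernel, possibly come back ... */
        if (image->preserve_context)
                restore_processor_state();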

What I've attached might break SH3 swsusp, as save_processor_state 
is no longer called there (it stores FPU state).




I've also been wondering:
The comments in kernel/power/hibernate.c mention the need to get the preempt 
count "right" as the major motivation for calling save/restore_processor_state().

But then, on all architectures, in kernels as far back as 2.6.32, it doesn't 
look like save/restore_processor_state actually adjust the preempt count; 
nowhere do they call preempt_enable/disable.

What is and what is not necessary, really ?


Finally, regarding things like the floating point context save: wouldn't 
this happen as part of freezing processes (in the early stages of 
suspend), and/or as part of device quiesce ?
I wonder why e.g. powerpc does it explicitly; not doing so on ARM doesn't 
seem to cause problems.



>
>> @@ -195,6 +195,14 @@ config VECTORS_BASE
>>  	help
>>  	  The base address of exception vectors.
>>
>> +config ARCH_HIBERNATION_POSSIBLE
>> +	bool
>> +	help
>> +	  If the machine architecture supports suspend-to-disk
>> +	  it should select this automatically for you.
>> +	  Otherwise, say 'Y' at your own peril.
>> +
>
> Good for debugging, but not good for mainline.

How would this be done better ?

FrankH.

>
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>

Comments

Pavel Machek April 12, 2011, 6:32 p.m. UTC | #1
Hi!

> >>There's one ugly thing in this set - I've changed a generic kernel
> >>header, <linux/suspend.h> to #define save/restore_processor_state()
> >>on ARM so that it only does preempt_disable/enable(). It's
> >>surprising that this isn't the default behaviour; all platforms need
> >>swsusp_arch_suspend/resume anyway so why force the existance of
> >>_two_ arch-specific hooks ?
> >
> >Can you submit separate patch cleaning it up?
> 
> How about the attached ?
> 
> It's for discussion, and hence not tested (admittedly, I need an x86
> test system ...).
...

> I've also been wondering:
> The comments in kernel/power/hibernate.c mention the need to get the
> preempt count "right" as the major motivation for calling
> save/restore_processor_state().
> 
> But then on all architectures in kernels as far back as 2.6.32 it
> doesn't look like save/restore_processor state are actually
> adjusting the preempt count; nowhere do they call
> preempt_enable/disable.

You might need to dig a bit more. IIRC it manipulated the FPU or something
a long time ago.

> Finally, on things like e.g. the floating point context saves:
> Wouldn't this happen as part of freezing processes (in the early
> stages of suspend), and/or as part of device quiesce ?

Are you sure? I thought we have (had?) a concept of lazy FPU
saving... and occasionally the kernel uses the FPU too (with some
precautions).

> >>@@ -195,6 +195,14 @@ config VECTORS_BASE
> >> 	help
> >> 	  The base address of exception vectors.
> >>
> >>+config ARCH_HIBERNATION_POSSIBLE
> >>+	bool
> >>+	help
> >>+	  If the machine architecture supports suspend-to-disk
> >>+	  it should select this automatically for you.
> >>+	  Otherwise, say 'Y' at your own peril.
> >>+
> >
> >Good for debugging, but not good for mainline.
> 
> How would this be done better ?

Just set the ARCH_HIBERNATION_POSSIBLE=y in the ARM code now that ARM
can do it. No need to ask user.

> @@ -412,8 +413,7 @@ static int resume_target_kernel(bool platform_mode)
>  	if (error)
>  		goto Enable_irqs;
>  
> -	/* We'll ignore saved state, but this gets preempt count (etc) right */
> -	save_processor_state();
> +	preempt_disable();
>  	error = restore_highmem();
>  	if (!error) {
>  		error = swsusp_arch_resume();

Yep. save_processor_state() on x86 does kernel_fpu_begin(), and that
one needs to be balanced IIRC.

									Pavel
Frank Hofmann April 13, 2011, 1:36 p.m. UTC | #2
On Tue, 12 Apr 2011, Pavel Machek wrote:

> Hi!
[ ... ]
>> But then on all architectures in kernels as far back as 2.6.32 it
>> doesn't look like save/restore_processor state are actually
>> adjusting the preempt count; nowhere do they call
>> preempt_enable/disable.
>
> You might need to dig a bit more. IIRC it manipulated FPU or something
> long time ago.

Actually, it's the other way round - kernel_fpu_begin/end are users of 
preempt_disable/enable, quote Documentation/preempt-locking.txt:

 	Note, some FPU functions are already explicitly preempt safe.  For example,
 	kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
 	However, math_state_restore must be called with preemption disabled.

But kernel_fpu_begin/end functions are x86-specific. At best, that makes 
the comment in hibernate.c (which refers to "needing" preempt_dis/enable) 
misleading.
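
Illustrative only (not a verbatim quote of the x86 headers), the 
relationship is simply:

        kernel_fpu_begin();     /* does preempt_disable() before touching the FPU */
        /* ... kernel-mode FPU/SSE use ... */
        kernel_fpu_end();       /* does preempt_enable() afterwards */

i.e. the preempt count manipulation is a side effect of the x86 FPU 
bracketing, not a goal in itself.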

x86 is the odd one out in this respect, though; on ARM, for example, saving 
the FP context happens through a PM notifier.

In addition, it seems that the code (at least in hibernate.c, 
create_image) does anyway:

 	- go single-cpu (disables all but current)
 	- switch interrupts off (local_irq_disable)

before going into save_processor_state - wouldn't that mean all conditions 
for non-preemptability (or for a stable FP state) are already met by that 
time ?



>
>> Finally, on things like e.g. the floating point context saves:
>> Wouldn't this happen as part of freezing processes (in the early
>> stages of suspend), and/or as part of device quiesce ?
>
> Are you sure? I thought we have (had?) concept of lazy FPU
> saving... and occassionaly kernel uses FPU too (with some
> precautions).

Speaking for ARM in particular, I'd deflect the answer to:

 	http://www.spinics.net/lists/linux-pm/msg22839.html

It's lazy but it does happen ... without any specific need to have the 
"bracketing" that save/restore_processor_state are doing.


Food for thought:

I mean, the code in hibernate.c really looks like this (quote):


static int create_image(int platform_mode)
{
         int error;

         error = arch_prepare_suspend();				<===== platform hook 1
[ ... ]
         error = dpm_suspend_noirq(PMSG_FREEZE);			<===== platform hook 2
[ ... ]
         error = platform_pre_snapshot(platform_mode);		<===== platform hook 3
[ ... ]
         error = disable_nonboot_cpus();				<===== platform hook 4
[ ... ]
         error = sysdev_suspend(PMSG_FREEZE);			<===== ...
         if (!error)
                 error = syscore_suspend();			<===== ...
[ ... ]
         in_suspend = 1;
         save_processor_state();					<===== platform hook N-1
         error = swsusp_arch_suspend();				<===== platform hook N
[ ... ]
         restore_processor_state();				<===== platform hook N+1
[ ... ]

And that's _without_ the platform hibernation ops hooks code in 
hibernate(), which calls this one.

This code is full of places to plug save/restore code (or any sort of 
"bracketing" begin/finish hooks) into.
It's surprising that, with all these hooks existing already, the "bracketing" 
of swsusp_arch_suspend() by save/restore_processor_state is so sacrosanct.

These are _three_ arch-dependent hooks in three consecutive lines of code.

If a platform requires the bracketing, why can it not do that part in 
syscore_suspend() ? Or why not simply embed it in swsusp_arch_suspend() ?
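
To make the "embed it" option concrete, an architecture that wants the 
bracketing could keep it entirely private - hypothetical code, not part of 
the attached diff, with __swsusp_arch_suspend() standing in for whatever 
low-level entry point the architecture has today:

        int swsusp_arch_suspend(void)
        {
                int ret;

                save_processor_state();         /* arch-private from now on */
                ret = __swsusp_arch_suspend();  /* register save + swsusp_save() */
                restore_processor_state();
                return ret;
        }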

As said, with respect to save/restore_processor_state, x86 is definitely the 
odd one out (and not a good example), with its monstrosity of code; all other 
architectures have either trivial or completely empty implementations of
save/restore_processor_state().

Should we really force everyone to provide an (as good as) empty hook just 
because x86 at one point had chosen to have it ?

>
>>>> @@ -195,6 +195,14 @@ config VECTORS_BASE
>>>> 	help
>>>> 	  The base address of exception vectors.
>>>>
>>>> +config ARCH_HIBERNATION_POSSIBLE
>>>> +	bool
>>>> +	help
>>>> +	  If the machine architecture supports suspend-to-disk
>>>> +	  it should select this automatically for you.
>>>> +	  Otherwise, say 'Y' at your own peril.
>>>> +
>>>
>>> Good for debugging, but not good for mainline.
>>
>> How would this be done better ?
>
> Just set the ARCH_HIBERNATION_POSSIBLE=y in the ARM code now that ARM
> can do it. No need to ask user.

I.e. preferably:

 	config ARCH_HIBERNATION_POSSIBLE
 		def_bool n

plus the "select" within the actual platform config subsection for those 
that have it.

To stress the point: "ARM can do it" would be an exaggeration even if 
those changes went in, because only some ARM types will have it.

The situation on ARM is identical to e.g. SuperH, see arch/sh/Kconfig,

config SUPERH32
         def_bool ARCH = "sh"
[ ... various ... ]
         select ARCH_HIBERNATION_POSSIBLE if MMU

config ARCH_HIBERNATION_POSSIBLE
         def_bool n


>
>> @@ -412,8 +413,7 @@ static int resume_target_kernel(bool platform_mode)
>>  	if (error)
>>  		goto Enable_irqs;
>>
>> -	/* We'll ignore saved state, but this gets preempt count (etc) right */
>> -	save_processor_state();
>> +	preempt_disable();
>>  	error = restore_highmem();
>>  	if (!error) {
>>  		error = swsusp_arch_resume();
>
> Yep. save_processor_state() on x86 does kernel_fpu_begin(), and than
> one needs to be balanced IIRC.


The needs of x86 force generic kernel changes ;-)

FrankH.

>
> 									Pavel
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>
Rafael Wysocki April 13, 2011, 9:38 p.m. UTC | #3
On Wednesday, April 13, 2011, Frank Hofmann wrote:
> 
> On Tue, 12 Apr 2011, Pavel Machek wrote:
> 
> > Hi!
> [ ... ]
> >> But then on all architectures in kernels as far back as 2.6.32 it
> >> doesn't look like save/restore_processor state are actually
> >> adjusting the preempt count; nowhere do they call
> >> preempt_enable/disable.
> >
> > You might need to dig a bit more. IIRC it manipulated FPU or something
> > long time ago.
> 
> Actually, it's the other way round - kernel_fpu_begin/end are users of 
> preempt_disable/enable, quote Documentation/preempt-locking.txt:
> 
>  	Note, some FPU functions are already explicitly preempt safe.  For example,
>  	kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
>  	However, math_state_restore must be called with preemption disabled.
> 
> But kernel_fpu_begin/end functions are x86-specific. At best, that makes 
> the comment in hibernate.c (which refers to "needing" preempt_dis/enable) 
> misleading.

This comment is outright wrong in fact, sorry about that.  Evidently,
restore_processor_state() is called right after the "restore control flow
magically appears here" comment and it _does_ use the saved state.

I'll remove that comment.

> x86 is the odd one out in this respect, though; on ARM, for example, saving
> the FP context happens through a PM notifier.
> 
> In addition, it seems that the code (at least in hibernate.c, 
> create_image) does anyway:
> 
>  	- go single-cpu (disables all but current)
>  	- switch interrupts off (local_irq_disable)
> 
> before going into save_processor_state - wouldn't that mean all conditions 
> for nonpreemptable (or all conditions for a stable FP state) are already 
> met by that time ?

Yes, it does.

> >> Finally, on things like e.g. the floating point context saves:
> >> Wouldn't this happen as part of freezing processes (in the early
> >> stages of suspend), and/or as part of device quiesce ?
> >
> > Are you sure? I thought we have (had?) concept of lazy FPU
> > saving... and occassionaly kernel uses FPU too (with some
> > precautions).
> 
> speaking in particular for ARM, I'd deflect the answer to:
> 
>  	http://www.spinics.net/lists/linux-pm/msg22839.html
> 
> It's lazy but it does happen ... without any specific need to have the 
> "bracketing" that save/restore_processor_state are doing.

save/restore_processor_state() are arch-specific and do whatever is necessary
or useful for the given architecture.  The saving/restoring of the FPU state
at those points happens to be useful for x86 IIRC.

> Food for thought:
> 
> I mean, the code in hibernate.c really looks like this (quote):
> 
> 
> static int create_image(int platform_mode)
> {
>          int error;
> 
>          error = arch_prepare_suspend();				<===== platform hook 1

An architecture hook, rather - and it would only have become a real one if 
any architecture had actually provided something other than a NOP.  IOW, to 
be removed (I forgot about it the last time I did a code cleanup).

> [ ... ]
>          error = dpm_suspend_noirq(PMSG_FREEZE);			<===== platform hook 2

Not really.  This is a device core callback for the last stage of device freeze.

> [ ... ]
>          error = platform_pre_snapshot(platform_mode);		<===== platform hook 3

This is a platform hook.

> [ ... ]
>          error = disable_nonboot_cpus();				<===== platform hook 4

Not a platform hook.  This one disables the nonboot CPUs using hotplug and this
mechanism is supposed to work that way on all platforms.

> [ ... ]
>          error = sysdev_suspend(PMSG_FREEZE);			<===== ...
>          if (!error)
>                  error = syscore_suspend();			<===== ...
> [ ... ]

These suspend the things that are "below" devices (like interrupt controllers
and such) and are by no means platform hooks.

>          in_suspend = 1;
>          save_processor_state();					<===== platform hook N-1

This is an architecture hook for saving the state of the boot CPU.

>          error = swsusp_arch_suspend();				<===== platform hook N

This one is needed on x86 so that we can point the CPU to the original
page tables after we've restored the target kernel from the image.

> [ ... ]
>          restore_processor_state();				<===== platform hook N+1
> [ ... ]
> 
> And that's _without_ the platform hibernation ops hooks code in 
> hibernate(), which calls this one.

Actually, not without - platform_pre_snapshot() is one of those.

> This code is full of places where to plug save/restore code (or any sort 
> of "bracketing" begin/finish hooks) into.
> It's surprising that with all these hooks existing already, "bracketing" 
> of swsusp_arch_suspend() by save/restore_processor state is so sacrosanct.
> 
> These are _three_ arch-dependent hooks in three consecutive lines of code.
> 
> If a platform requires the bracketing, why can it not do that part in 
> syscore_suspend() ? Or why not simply embed it in swsusp_arch_suspend() ?

save/restore_processor_state() are also used by x86's suspend-to-RAM code.
Of course, you can argue that they should be kept x86-specific, but since
those features were originally developed on x86, the code still remains
x86-centric in that respect.

> As said, wrt. to save/restore_processor_state, x86 is definitely the odd 
> one out (and not a good example) with the monstrosity of code; all other 
> architectures either have trivial or even completely empty code for
> save/restore_processor_state().
> 
> Should we really force everyone to provide an (as good as) empty hook just 
> because x86 at one point had chosen to have it ?

I don't see much reason for that, but since none of the guys who implemented
those empty hooks has ever complained and/or attempted to change things, there
has not been much reason to work in that direction. :-)

> >>>> @@ -195,6 +195,14 @@ config VECTORS_BASE
> >>>> 	help
> >>>> 	  The base address of exception vectors.
> >>>>
> >>>> +config ARCH_HIBERNATION_POSSIBLE
> >>>> +	bool
> >>>> +	help
> >>>> +	  If the machine architecture supports suspend-to-disk
> >>>> +	  it should select this automatically for you.
> >>>> +	  Otherwise, say 'Y' at your own peril.
> >>>> +
> >>>
> >>> Good for debugging, but not good for mainline.
> >>
> >> How would this be done better ?
> >
> > Just set the ARCH_HIBERNATION_POSSIBLE=y in the ARM code now that ARM
> > can do it. No need to ask user.
> 
> I.e. preferrably:
> 
>  	config ARCH_HIBERNATION_POSSIBLE
>  		def_bool n

You don't need to use "def_bool n"; "bool" alone will do just fine.

> plus the "select" within the actual platform config subsection for those 
> that have it.
> 
> To stress the point: "ARM can do it" would be an exaggeration even if 
> those changes went in, because only some ARM types will have it.
> 
> The situation on ARM is identical to e.g. SuperH, see arch/sh/Kconfig,
> 
> config SUPERH32
>          def_bool ARCH = "sh"
> [ ... various ... ]
>          select ARCH_HIBERNATION_POSSIBLE if MMU
> 
> config ARCH_HIBERNATION_POSSIBLE
>          def_bool n
> 

Yes, something like this.

> >> @@ -412,8 +413,7 @@ static int resume_target_kernel(bool platform_mode)
> >>  	if (error)
> >>  		goto Enable_irqs;
> >>
> >> -	/* We'll ignore saved state, but this gets preempt count (etc) right */
> >> -	save_processor_state();
> >> +	preempt_disable();

Why do you need that preempt_disable() here?

> >>  	error = restore_highmem();
> >>  	if (!error) {
> >>  		error = swsusp_arch_resume();
> >
> > Yep. save_processor_state() on x86 does kernel_fpu_begin(), and than
> > one needs to be balanced IIRC.
> 
> 
> The needs of x86 force generic kernel changes ;-)

Not necessarily.  In fact, implementing that feature on different architectures
is a good opportunity to clean up the code if necessary.

Thanks,
Rafael
Frank Hofmann April 14, 2011, 5:06 p.m. UTC | #4
Hi Rafael, hi Pavel,


thanks for your comments, this makes the history and purpose of the 
save/restore_processor_state functions a lot clearer.
Yes, x86 does a lot more in those than other architectures do, but even on 
non-x86 they have a purpose, even if one decides to do the "core" work 
inside swsusp_arch_suspend/resume().


I'll go back to the drawing board for the ARM patch again for a bit.

It might be useful/necessary to delegate some of the functionality into 
save/restore_processor_state after all, like powerpc does (call flush_thread() 
to store lazy-save state and maybe, if necessary, do some of the things the 
cpu_idle code on ARM does before going down into cpu_suspend) - rough sketch 
below.
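
A rough sketch of that direction - untested, and the VFP helper name is an 
assumption on my part rather than a reference to existing hibernation code:

        /* hypothetical arch/arm save/restore hooks */
        void save_processor_state(void)
        {
                /* push lazily-held VFP state into the thread struct,
                 * analogous to powerpc's flush_fp_to_thread() */
                vfp_sync_hwstate(current_thread_info());
        }

        void restore_processor_state(void)
        {
                local_flush_tlb_all();
        }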

John reported a crash on resume (on OMAP4) related to FP state; I've done 
some tracing here to see whether the PM notifiers for storing the VFP/FPU 
state on ARM are actually triggering with the suggested change and they're 
not, which means there's still some work to do here.


The last thing in my setup that's obviously not suspending / resuming 
correctly is the OMAP display driver; I'm not the only one with that 
problem - Matt's previous report:

http://blog.gmane.org/gmane.linux.swsusp.devel/month=20101201

has the same thing, "omapdss DISPC error: SYNC_LOST, disabling LCD".


I'll provide an update before Easter.

Thanks for the help so far!
FrankH.




Patch

 arch/mips/power/hibernate.S        |    8 ++++++-
 arch/powerpc/kernel/swsusp.c       |   39 ------------------------------------
 arch/powerpc/kernel/swsusp_32.S    |   18 ++++++++++++++++
 arch/powerpc/kernel/swsusp_asm64.S |   21 +++++++++++++++++++
 arch/powerpc/kernel/swsusp_booke.S |   19 +++++++++++++++++
 arch/s390/kernel/swsusp_asm64.S    |    6 +++++
 arch/sh/kernel/swsusp.c            |    5 ----
 arch/unicore32/kernel/hibernate.c  |    9 --------
 arch/x86/power/hibernate_asm_32.S  |    4 ++-
 arch/x86/power/hibernate_asm_64.S  |    2 +
 include/linux/suspend.h            |    2 +
 kernel/power/hibernate.c           |   10 ++++----
 12 files changed, 83 insertions(+), 60 deletions(-)

diff --git a/arch/mips/power/hibernate.S b/arch/mips/power/hibernate.S
index dbb5c7b..1b12ae1 100644
--- a/arch/mips/power/hibernate.S
+++ b/arch/mips/power/hibernate.S
@@ -27,6 +27,9 @@  LEAF(swsusp_arch_suspend)
 	PTR_S s5, PT_R21(t0)
 	PTR_S s6, PT_R22(t0)
 	PTR_S s7, PT_R23(t0)
+	jal save_processor_state
+	PTR_LA t0, saved_regs
+	PTR_L ra, PT_R31(t0)
 	j swsusp_save
 END(swsusp_arch_suspend)
 
@@ -45,7 +48,6 @@  LEAF(swsusp_arch_resume)
 	PTR_L t0, PBE_NEXT(t0)
 	bnez t0, 0b
 	PTR_LA t0, saved_regs
-	PTR_L ra, PT_R31(t0)
 	PTR_L sp, PT_R29(t0)
 	PTR_L fp, PT_R30(t0)
 	PTR_L gp, PT_R28(t0)
@@ -57,6 +59,10 @@  LEAF(swsusp_arch_resume)
 	PTR_L s5, PT_R21(t0)
 	PTR_L s6, PT_R22(t0)
 	PTR_L s7, PT_R23(t0)
+	jal restore_processor_state
+	PTR_LA t0, saved_regs
+	PTR_L ra, PT_R31(t0)
+
 	PTR_LI v0, 0x0
 	jr ra
 END(swsusp_arch_resume)
diff --git a/arch/powerpc/kernel/swsusp.c b/arch/powerpc/kernel/swsusp.c
deleted file mode 100644
index 560c961..0000000
--- a/arch/powerpc/kernel/swsusp.c
+++ /dev/null
@@ -1,39 +0,0 @@ 
-/*
- * Common powerpc suspend code for 32 and 64 bits
- *
- * Copyright 2007	Johannes Berg <johannes@sipsolutions.net>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#include <linux/sched.h>
-#include <asm/suspend.h>
-#include <asm/system.h>
-#include <asm/current.h>
-#include <asm/mmu_context.h>
-
-void save_processor_state(void)
-{
-	/*
-	 * flush out all the special registers so we don't need
-	 * to save them in the snapshot
-	 */
-	flush_fp_to_thread(current);
-	flush_altivec_to_thread(current);
-	flush_spe_to_thread(current);
-
-#ifdef CONFIG_PPC64
-	hard_irq_disable();
-#endif
-
-}
-
-void restore_processor_state(void)
-{
-#ifdef CONFIG_PPC32
-	switch_mmu_context(NULL, current->active_mm);
-#endif
-}
diff --git a/arch/powerpc/kernel/swsusp_32.S b/arch/powerpc/kernel/swsusp_32.S
index b0754e2..08137c5 100644
--- a/arch/powerpc/kernel/swsusp_32.S
+++ b/arch/powerpc/kernel/swsusp_32.S
@@ -116,6 +116,16 @@  _GLOBAL(swsusp_arch_suspend)
 	/* Backup various CPU config stuffs */
 	bl	__save_cpu_setup
 #endif
+	/* flush out all the special registers so we don't need
+	 * to save them in the snapshot
+	 */
+	tophys(r4,r2);
+	bl	flush_fp_to_thread
+	tophys(r4,r2);
+	bl	flush_altivec_to_thread
+	tophys(r4,r2);
+	bl	flush_spe_to_thread
+
 	/* Call the low level suspend stuff (we should probably have made
 	 * a stackframe...
 	 */
@@ -323,6 +333,14 @@  END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
 	li	r0,1
 	mtdec	r0
 
+	/* restore MMU context - switch_mmu_context(NULL, current->mm); */
+	li	r3,0
+	/* 'current->mm' needs to be in r4 */
+	tophys(r4, r2)
+	lwz     r4, MM(r4)
+	tophys(r4, r4)
+	bl      switch_mmu_context
+
 	/* Restore the callee-saved registers and return */
 	lwz	r0,SL_CR(r11)
 	mtcr	r0
diff --git a/arch/powerpc/kernel/swsusp_asm64.S b/arch/powerpc/kernel/swsusp_asm64.S
index 86ac1d9..e46786d 100644
--- a/arch/powerpc/kernel/swsusp_asm64.S
+++ b/arch/powerpc/kernel/swsusp_asm64.S
@@ -76,8 +76,29 @@  restore_pblist_ptr:
 	.section .text
 	.align  5
 _GLOBAL(swsusp_arch_suspend)
+	/* Save LR first so that we can make function calls */
 	ld	r11,swsusp_save_area_ptr@toc(r2)
 	SAVE_SPECIAL(LR)
+
+	/* flush out all the special registers so we don't need
+	 * to save them in the snapshot
+	 */
+	ld	r3,PACACURRENT(r13)
+	bl flush_fp_to_thread
+	ld	r3,PACACURRENT(r13)
+	bl flush_altivec_to_thread
+	ld	r3,PACACURRENT(r13)
+	bl flush_spe_to_thread
+	/* Disable interrupts now */
+	mfmsr	r4		/* Get current interrupt state */
+	rldicu	r3,r4,48,1	/* clear MSR_EE */
+	rotldi	r3,r3,16
+	mtmsrd	r3,1		/* Update machine state */
+	li	r3,0
+	stb	r3,PACASOFTIRQEN(r13)
+	stb	r3,PACAHARDIRQEN(r13)
+
+	ld	r11,swsusp_save_area_ptr@toc(r2)
 	SAVE_REGISTER(r1)
 	SAVE_SPECIAL(CR)
 	SAVE_SPECIAL(TB)
diff --git a/arch/powerpc/kernel/swsusp_booke.S b/arch/powerpc/kernel/swsusp_booke.S
index 11a3930..529e95b 100644
--- a/arch/powerpc/kernel/swsusp_booke.S
+++ b/arch/powerpc/kernel/swsusp_booke.S
@@ -50,8 +50,27 @@  _GLOBAL(swsusp_arch_suspend)
 	lis	r11,swsusp_save_area@h
 	ori	r11,r11,swsusp_save_area@l
 
+	/* save LR first so we can make function calls */
 	mflr	r0
 	stw	r0,SL_LR(r11)
+
+	/* flush out all the special registers so we don't need
+	 * to save them in the snapshot
+	 */
+	ld	r3,PACACURRENT(r13)
+	bl flush_fp_to_thread
+	ld	r3,PACACURRENT(r13)
+	bl flush_altivec_to_thread
+	ld	r3,PACACURRENT(r13)
+	bl flush_spe_to_thread
+	/* Disable interrupts now */
+	wrteei	0
+	li	r0,0
+	stb	r0,PACASOFTIRQEN(r13)
+	stb	r0,PACAHARDIRQEN(r13)
+
+	lis	r11,swsusp_save_area@h
+	ori	r11,r11,swsusp_save_area@l
 	mfcr	r0
 	stw	r0,SL_CR(r11)
 	stw	r1,SL_SP(r11)
diff --git a/arch/s390/kernel/swsusp_asm64.S b/arch/s390/kernel/swsusp_asm64.S
index 1f066e4..00cceff 100644
--- a/arch/s390/kernel/swsusp_asm64.S
+++ b/arch/s390/kernel/swsusp_asm64.S
@@ -25,6 +25,8 @@ 
 	.align	4
 	.globl swsusp_arch_suspend
 swsusp_arch_suspend:
+	brasl	%r14,save_processor_state	/* disables lowcore protection */
+
 	stmg	%r6,%r15,__SF_GPRS(%r15)
 	lgr	%r1,%r15
 	aghi	%r15,-STACK_FRAME_OVERHEAD
@@ -100,6 +102,8 @@  swsusp_arch_suspend:
 	/* Save image */
 	brasl	%r14,swsusp_save
 
+	brasl	%r14,restore_processor_state
+
 	/* Restore prefix register and return */
 	lghi	%r1,0x1000
 	spx	0x318(%r1)
@@ -259,6 +263,8 @@  restore_registers:
 	/* Reinitialize the channel subsystem */
 	brasl	%r14,channel_subsystem_reinit
 
+	/* reenable lowcore protection */
+	brasl	%r14,restore_processor_state
 	/* Return 0 */
 	lmg	%r6,%r15,STACK_FRAME_OVERHEAD + __SF_GPRS(%r15)
 	lghi	%r2,0
diff --git a/arch/sh/kernel/swsusp.c b/arch/sh/kernel/swsusp.c
index 12b64a0..65a964e 100644
--- a/arch/sh/kernel/swsusp.c
+++ b/arch/sh/kernel/swsusp.c
@@ -31,8 +31,3 @@  void save_processor_state(void)
 {
 	init_fpu(current);
 }
-
-void restore_processor_state(void)
-{
-	local_flush_tlb_all();
-}
diff --git a/arch/unicore32/kernel/hibernate.c b/arch/unicore32/kernel/hibernate.c
index 7d0f0b7..aba4ad4 100644
--- a/arch/unicore32/kernel/hibernate.c
+++ b/arch/unicore32/kernel/hibernate.c
@@ -149,12 +149,3 @@  int pfn_is_nosave(unsigned long pfn)
 
 	return (pfn >= begin_pfn) && (pfn < end_pfn);
 }
-
-void save_processor_state(void)
-{
-}
-
-void restore_processor_state(void)
-{
-	local_flush_tlb_all();
-}
diff --git a/arch/x86/power/hibernate_asm_32.S b/arch/x86/power/hibernate_asm_32.S
index ad47dae..6827e1b 100644
--- a/arch/x86/power/hibernate_asm_32.S
+++ b/arch/x86/power/hibernate_asm_32.S
@@ -15,6 +15,7 @@ 
 .text
 
 ENTRY(swsusp_arch_suspend)
+	call save_processor_state
 	movl %esp, saved_context_esp
 	movl %ebx, saved_context_ebx
 	movl %ebp, saved_context_ebp
@@ -22,7 +23,6 @@  ENTRY(swsusp_arch_suspend)
 	movl %edi, saved_context_edi
 	pushfl
 	popl saved_context_eflags
-
 	call swsusp_save
 	ret
 
@@ -75,6 +75,8 @@  done:
 	pushl saved_context_eflags
 	popfl
 
+	call restore_processor_state
+
 	xorl	%eax, %eax
 
 	ret
diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index 9356547..1318398 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -23,6 +23,7 @@ 
 #include <asm/processor-flags.h>
 
 ENTRY(swsusp_arch_suspend)
+	call save_processor_state
 	movq	$saved_context, %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
@@ -139,6 +140,7 @@  ENTRY(restore_registers)
 	pushq	pt_regs_flags(%rax)
 	popfq
 
+	call restore_processor_state
 	xorq	%rax, %rax
 
 	/* tell the hibernation core that we've just restored the memory */
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 5a89e36..145d9a4 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -259,8 +259,10 @@  static inline bool system_entering_hibernation(void) { return false; }
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_KEXEC_JUMP
 void save_processor_state(void);
 void restore_processor_state(void);
+#endif
 
 /* kernel/power/main.c */
 extern int register_pm_notifier(struct notifier_block *nb);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index aeabd26..554b741 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -285,13 +285,14 @@  static int create_image(int platform_mode)
 		goto Power_up;
 
 	in_suspend = 1;
-	save_processor_state();
+	preempt_disable();
 	error = swsusp_arch_suspend();
 	if (error)
 		printk(KERN_ERR "PM: Error %d creating hibernation image\n",
 			error);
 	/* Restore control flow magically appears here */
-	restore_processor_state();
+	flush_tlb_all();
+	preempt_enable();
 	if (!in_suspend) {
 		events_check_enabled = false;
 		platform_leave(platform_mode);
@@ -412,8 +413,7 @@  static int resume_target_kernel(bool platform_mode)
 	if (error)
 		goto Enable_irqs;
 
-	/* We'll ignore saved state, but this gets preempt count (etc) right */
-	save_processor_state();
+	preempt_disable();
 	error = restore_highmem();
 	if (!error) {
 		error = swsusp_arch_resume();
@@ -432,7 +432,7 @@  static int resume_target_kernel(bool platform_mode)
 	 * subsequent failures
 	 */
 	swsusp_free();
-	restore_processor_state();
+	preempt_enable();
 	touch_softlockup_watchdog();
 
 	syscore_resume();