diff mbox

[2/2] memory: ti-emif-sram: introduce relocatable suspend/resume handlers

Message ID 20170328205511.21166-3-d-gerlach@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Gerlach March 28, 2017, 8:55 p.m. UTC
Certain SoCs like Texas Instruments AM335x and AM437x require parts
of the EMIF PM code to run late in the suspend sequence from SRAM,
such as saving and restoring the EMIF context and placing the memory
into self-refresh.

One requirement for these SoCs to suspend and enter its lowest power
mode, called DeepSleep0, is that the PER power domain must be shut off.
Because the EMIF (DDR Controller) resides within this power domain, it
will lose context during a suspend operation, so we must save it so we
can restore once we resume. However, we cannot execute this code from
external memory, as it is not available at this point, so the code must
be executed late in the suspend path from SRAM.

This patch introduces a ti-emif-sram driver that includes several
functions written in ARM ASM that are relocatable so the PM SRAM
code can use them. It also allocates a region of writable SRAM to
be used by the code running in the executable region of SRAM to save
and restore the EMIF context. It can export a table containing the
absolute addresses of the available PM functions so that other SRAM
code can branch to them. This code is required for suspend/resume on
AM335x and AM437x to work.

In addition to this, to be able to share data structures between C and
the ti-emif-sram-pm assembly code, we can automatically generate all of
the C struct member offsets and sizes as macros by making use of the ARM
asm-offsets file. In the same header that we define our data structures
in we also define all the macros in an inline function and by adding a
call to this in the asm_offsets file all macros are properly generated
and available to the assembly code without cluttering up the asm-offsets
file.

Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
---
 arch/arm/kernel/asm-offsets.c    |   6 +
 drivers/memory/Kconfig           |  10 ++
 drivers/memory/Makefile          |   4 +
 drivers/memory/emif.h            |  17 ++
 drivers/memory/ti-emif-pm.c      | 295 ++++++++++++++++++++++++++++++++++
 drivers/memory/ti-emif-sram-pm.S | 334 +++++++++++++++++++++++++++++++++++++++
 include/linux/ti-emif-sram.h     | 143 +++++++++++++++++
 7 files changed, 809 insertions(+)
 create mode 100644 drivers/memory/ti-emif-pm.c
 create mode 100644 drivers/memory/ti-emif-sram-pm.S
 create mode 100644 include/linux/ti-emif-sram.h

Comments

Tony Lindgren April 4, 2017, 4:11 p.m. UTC | #1
Russell,

* Dave Gerlach <d-gerlach@ti.com> [170328 13:57]:
> Certain SoCs like Texas Instruments AM335x and AM437x require parts
> of the EMIF PM code to run late in the suspend sequence from SRAM,
> such as saving and restoring the EMIF context and placing the memory
> into self-refresh.
> 
> One requirement for these SoCs to suspend and enter its lowest power
> mode, called DeepSleep0, is that the PER power domain must be shut off.
> Because the EMIF (DDR Controller) resides within this power domain, it
> will lose context during a suspend operation, so we must save it so we
> can restore once we resume. However, we cannot execute this code from
> external memory, as it is not available at this point, so the code must
> be executed late in the suspend path from SRAM.
> 
> This patch introduces a ti-emif-sram driver that includes several
> functions written in ARM ASM that are relocatable so the PM SRAM
> code can use them. It also allocates a region of writable SRAM to
> be used by the code running in the executable region of SRAM to save
> and restore the EMIF context. It can export a table containing the
> absolute addresses of the available PM functions so that other SRAM
> code can branch to them. This code is required for suspend/resume on
> AM335x and AM437x to work.
> 
> In addition to this, to be able to share data structures between C and
> the ti-emif-sram-pm assembly code, we can automatically generate all of
> the C struct member offsets and sizes as macros by making use of the ARM
> asm-offsets file. In the same header that we define our data structures
> in we also define all the macros in an inline function and by adding a
> call to this in the asm_offsets file all macros are properly generated
> and available to the assembly code without cluttering up the asm-offsets
> file.
> 
> Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
> ---
>  arch/arm/kernel/asm-offsets.c    |   6 +
>  drivers/memory/Kconfig           |  10 ++
>  drivers/memory/Makefile          |   4 +
>  drivers/memory/emif.h            |  17 ++
>  drivers/memory/ti-emif-pm.c      | 295 ++++++++++++++++++++++++++++++++++
>  drivers/memory/ti-emif-sram-pm.S | 334 +++++++++++++++++++++++++++++++++++++++
>  include/linux/ti-emif-sram.h     | 143 +++++++++++++++++
>  7 files changed, 809 insertions(+)
>  create mode 100644 drivers/memory/ti-emif-pm.c
>  create mode 100644 drivers/memory/ti-emif-sram-pm.S
>  create mode 100644 include/linux/ti-emif-sram.h
> 
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 608008229c7d..d728b5660e36 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -28,6 +28,7 @@
>  #include <asm/vdso_datapage.h>
>  #include <asm/hardware/cache-l2x0.h>
>  #include <linux/kbuild.h>
> +#include <linux/ti-emif-sram.h>
>  
>  /*
>   * Make sure that the compiler and target are compatible.
> @@ -183,5 +184,10 @@ int main(void)
>  #ifdef CONFIG_VDSO
>    DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
>  #endif
> +#if defined(CONFIG_SOC_AM33XX) || defined(CONFIG_SOC_AM43XX)
> +  BLANK();
> +  ti_emif_offsets();
> +#endif
> +
>    return 0; 
>  }

Does the above look OK to you?

Regards,

Tony
Russell King (Oracle) April 5, 2017, 1:59 p.m. UTC | #2
On Tue, Apr 04, 2017 at 09:11:52AM -0700, Tony Lindgren wrote:
> Russell,
> 
> * Dave Gerlach <d-gerlach@ti.com> [170328 13:57]:
> > Certain SoCs like Texas Instruments AM335x and AM437x require parts
> > of the EMIF PM code to run late in the suspend sequence from SRAM,
> > such as saving and restoring the EMIF context and placing the memory
> > into self-refresh.
> > 
> > One requirement for these SoCs to suspend and enter its lowest power
> > mode, called DeepSleep0, is that the PER power domain must be shut off.
> > Because the EMIF (DDR Controller) resides within this power domain, it
> > will lose context during a suspend operation, so we must save it so we
> > can restore once we resume. However, we cannot execute this code from
> > external memory, as it is not available at this point, so the code must
> > be executed late in the suspend path from SRAM.
> > 
> > This patch introduces a ti-emif-sram driver that includes several
> > functions written in ARM ASM that are relocatable so the PM SRAM
> > code can use them. It also allocates a region of writable SRAM to
> > be used by the code running in the executable region of SRAM to save
> > and restore the EMIF context. It can export a table containing the
> > absolute addresses of the available PM functions so that other SRAM
> > code can branch to them. This code is required for suspend/resume on
> > AM335x and AM437x to work.
> > 
> > In addition to this, to be able to share data structures between C and
> > the ti-emif-sram-pm assembly code, we can automatically generate all of
> > the C struct member offsets and sizes as macros by making use of the ARM
> > asm-offsets file. In the same header that we define our data structures
> > in we also define all the macros in an inline function and by adding a
> > call to this in the asm_offsets file all macros are properly generated
> > and available to the assembly code without cluttering up the asm-offsets
> > file.
> > 
> > Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
> > ---
> >  arch/arm/kernel/asm-offsets.c    |   6 +
> >  drivers/memory/Kconfig           |  10 ++
> >  drivers/memory/Makefile          |   4 +
> >  drivers/memory/emif.h            |  17 ++
> >  drivers/memory/ti-emif-pm.c      | 295 ++++++++++++++++++++++++++++++++++
> >  drivers/memory/ti-emif-sram-pm.S | 334 +++++++++++++++++++++++++++++++++++++++
> >  include/linux/ti-emif-sram.h     | 143 +++++++++++++++++
> >  7 files changed, 809 insertions(+)
> >  create mode 100644 drivers/memory/ti-emif-pm.c
> >  create mode 100644 drivers/memory/ti-emif-sram-pm.S
> >  create mode 100644 include/linux/ti-emif-sram.h
> > 
> > diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> > index 608008229c7d..d728b5660e36 100644
> > --- a/arch/arm/kernel/asm-offsets.c
> > +++ b/arch/arm/kernel/asm-offsets.c
> > @@ -28,6 +28,7 @@
> >  #include <asm/vdso_datapage.h>
> >  #include <asm/hardware/cache-l2x0.h>
> >  #include <linux/kbuild.h>
> > +#include <linux/ti-emif-sram.h>
> >  
> >  /*
> >   * Make sure that the compiler and target are compatible.
> > @@ -183,5 +184,10 @@ int main(void)
> >  #ifdef CONFIG_VDSO
> >    DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
> >  #endif
> > +#if defined(CONFIG_SOC_AM33XX) || defined(CONFIG_SOC_AM43XX)
> > +  BLANK();
> > +  ti_emif_offsets();
> > +#endif
> > +
> >    return 0; 
> >  }
> 
> Does the above look OK to you?

I'm not going to comment on this yet, but I'll instead comment on the
newly appeared sram_exec_copy() stuff.

So, a few years ago, we went to significant effort in ARM land to come
up with a way to _safely_ copy assembler from the kernel into SRAM,
because copying code to SRAM that is compiled in thumb mode and then
executing it is _not_ as simple as memcpy(), cast the pointer to a
function pointer, and then call the function pointer.

The SRAM stuff throws all that out, instead preferring the dumb memcpy()
approach.

This needs resolving, and I'd like to see it resolved to the satisfaction
of architecture maintainers before we progress any further down this
route.
Tony Lindgren April 5, 2017, 2:33 p.m. UTC | #3
* Russell King - ARM Linux <linux@armlinux.org.uk> [170405 07:02]:
> On Tue, Apr 04, 2017 at 09:11:52AM -0700, Tony Lindgren wrote:
> > Russell,
> > 
> > * Dave Gerlach <d-gerlach@ti.com> [170328 13:57]:
> > > Certain SoCs like Texas Instruments AM335x and AM437x require parts
> > > of the EMIF PM code to run late in the suspend sequence from SRAM,
> > > such as saving and restoring the EMIF context and placing the memory
> > > into self-refresh.
> > > 
> > > One requirement for these SoCs to suspend and enter its lowest power
> > > mode, called DeepSleep0, is that the PER power domain must be shut off.
> > > Because the EMIF (DDR Controller) resides within this power domain, it
> > > will lose context during a suspend operation, so we must save it so we
> > > can restore once we resume. However, we cannot execute this code from
> > > external memory, as it is not available at this point, so the code must
> > > be executed late in the suspend path from SRAM.
> > > 
> > > This patch introduces a ti-emif-sram driver that includes several
> > > functions written in ARM ASM that are relocatable so the PM SRAM
> > > code can use them. It also allocates a region of writable SRAM to
> > > be used by the code running in the executable region of SRAM to save
> > > and restore the EMIF context. It can export a table containing the
> > > absolute addresses of the available PM functions so that other SRAM
> > > code can branch to them. This code is required for suspend/resume on
> > > AM335x and AM437x to work.
> > > 
> > > In addition to this, to be able to share data structures between C and
> > > the ti-emif-sram-pm assembly code, we can automatically generate all of
> > > the C struct member offsets and sizes as macros by making use of the ARM
> > > asm-offsets file. In the same header that we define our data structures
> > > in we also define all the macros in an inline function and by adding a
> > > call to this in the asm_offsets file all macros are properly generated
> > > and available to the assembly code without cluttering up the asm-offsets
> > > file.
> > > 
> > > Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
> > > ---
> > >  arch/arm/kernel/asm-offsets.c    |   6 +
> > >  drivers/memory/Kconfig           |  10 ++
> > >  drivers/memory/Makefile          |   4 +
> > >  drivers/memory/emif.h            |  17 ++
> > >  drivers/memory/ti-emif-pm.c      | 295 ++++++++++++++++++++++++++++++++++
> > >  drivers/memory/ti-emif-sram-pm.S | 334 +++++++++++++++++++++++++++++++++++++++
> > >  include/linux/ti-emif-sram.h     | 143 +++++++++++++++++
> > >  7 files changed, 809 insertions(+)
> > >  create mode 100644 drivers/memory/ti-emif-pm.c
> > >  create mode 100644 drivers/memory/ti-emif-sram-pm.S
> > >  create mode 100644 include/linux/ti-emif-sram.h
> > > 
> > > diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> > > index 608008229c7d..d728b5660e36 100644
> > > --- a/arch/arm/kernel/asm-offsets.c
> > > +++ b/arch/arm/kernel/asm-offsets.c
> > > @@ -28,6 +28,7 @@
> > >  #include <asm/vdso_datapage.h>
> > >  #include <asm/hardware/cache-l2x0.h>
> > >  #include <linux/kbuild.h>
> > > +#include <linux/ti-emif-sram.h>
> > >  
> > >  /*
> > >   * Make sure that the compiler and target are compatible.
> > > @@ -183,5 +184,10 @@ int main(void)
> > >  #ifdef CONFIG_VDSO
> > >    DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
> > >  #endif
> > > +#if defined(CONFIG_SOC_AM33XX) || defined(CONFIG_SOC_AM43XX)
> > > +  BLANK();
> > > +  ti_emif_offsets();
> > > +#endif
> > > +
> > >    return 0; 
> > >  }
> > 
> > Does the above look OK to you?
> 
> I'm not going to comment on this yet, but I'll instead comment on the
> newly appeared sram_exec_copy() stuff.
> 
> So, a few years ago, we went to significant effort in ARM land to come
> up with a way to _safely_ copy assembler from the kernel into SRAM,
> because copying code to SRAM that is compiled in thumb mode and then
> executing it is _not_ as simple as memcpy(), cast the pointer to a
> function pointer, and then call the function pointer.
> 
> The SRAM stuff throws all that out, instead preferring the dumb memcpy()
> approach.
> 
> This needs resolving, and I'd like to see it resolved to the satisfaction
> of architecture maintainers before we progress any further down this
> route.

OK thanks, will wait until that is sorted out before merging any
of the SRAM code.

Regards,

Tony
Dave Gerlach April 5, 2017, 2:48 p.m. UTC | #4
Russell,
On 04/05/2017 09:33 AM, Tony Lindgren wrote:
> * Russell King - ARM Linux <linux@armlinux.org.uk> [170405 07:02]:
>> On Tue, Apr 04, 2017 at 09:11:52AM -0700, Tony Lindgren wrote:
>>> Russell,
>>>
>>> * Dave Gerlach <d-gerlach@ti.com> [170328 13:57]:
>>>> Certain SoCs like Texas Instruments AM335x and AM437x require parts
>>>> of the EMIF PM code to run late in the suspend sequence from SRAM,
>>>> such as saving and restoring the EMIF context and placing the memory
>>>> into self-refresh.
>>>>
>>>> One requirement for these SoCs to suspend and enter its lowest power
>>>> mode, called DeepSleep0, is that the PER power domain must be shut off.
>>>> Because the EMIF (DDR Controller) resides within this power domain, it
>>>> will lose context during a suspend operation, so we must save it so we
>>>> can restore once we resume. However, we cannot execute this code from
>>>> external memory, as it is not available at this point, so the code must
>>>> be executed late in the suspend path from SRAM.
>>>>
>>>> This patch introduces a ti-emif-sram driver that includes several
>>>> functions written in ARM ASM that are relocatable so the PM SRAM
>>>> code can use them. It also allocates a region of writable SRAM to
>>>> be used by the code running in the executable region of SRAM to save
>>>> and restore the EMIF context. It can export a table containing the
>>>> absolute addresses of the available PM functions so that other SRAM
>>>> code can branch to them. This code is required for suspend/resume on
>>>> AM335x and AM437x to work.
>>>>
>>>> In addition to this, to be able to share data structures between C and
>>>> the ti-emif-sram-pm assembly code, we can automatically generate all of
>>>> the C struct member offsets and sizes as macros by making use of the ARM
>>>> asm-offsets file. In the same header that we define our data structures
>>>> in we also define all the macros in an inline function and by adding a
>>>> call to this in the asm_offsets file all macros are properly generated
>>>> and available to the assembly code without cluttering up the asm-offsets
>>>> file.
>>>>
>>>> Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
>>>> ---
>>>>  arch/arm/kernel/asm-offsets.c    |   6 +
>>>>  drivers/memory/Kconfig           |  10 ++
>>>>  drivers/memory/Makefile          |   4 +
>>>>  drivers/memory/emif.h            |  17 ++
>>>>  drivers/memory/ti-emif-pm.c      | 295 ++++++++++++++++++++++++++++++++++
>>>>  drivers/memory/ti-emif-sram-pm.S | 334 +++++++++++++++++++++++++++++++++++++++
>>>>  include/linux/ti-emif-sram.h     | 143 +++++++++++++++++
>>>>  7 files changed, 809 insertions(+)
>>>>  create mode 100644 drivers/memory/ti-emif-pm.c
>>>>  create mode 100644 drivers/memory/ti-emif-sram-pm.S
>>>>  create mode 100644 include/linux/ti-emif-sram.h
>>>>
>>>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>>>> index 608008229c7d..d728b5660e36 100644
>>>> --- a/arch/arm/kernel/asm-offsets.c
>>>> +++ b/arch/arm/kernel/asm-offsets.c
>>>> @@ -28,6 +28,7 @@
>>>>  #include <asm/vdso_datapage.h>
>>>>  #include <asm/hardware/cache-l2x0.h>
>>>>  #include <linux/kbuild.h>
>>>> +#include <linux/ti-emif-sram.h>
>>>>
>>>>  /*
>>>>   * Make sure that the compiler and target are compatible.
>>>> @@ -183,5 +184,10 @@ int main(void)
>>>>  #ifdef CONFIG_VDSO
>>>>    DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
>>>>  #endif
>>>> +#if defined(CONFIG_SOC_AM33XX) || defined(CONFIG_SOC_AM43XX)
>>>> +  BLANK();
>>>> +  ti_emif_offsets();
>>>> +#endif
>>>> +
>>>>    return 0;
>>>>  }
>>>
>>> Does the above look OK to you?
>>
>> I'm not going to comment on this yet, but I'll instead comment on the
>> newly appeared sram_exec_copy() stuff.
>>
>> So, a few years ago, we went to significant effort in ARM land to come
>> up with a way to _safely_ copy assembler from the kernel into SRAM,
>> because copying code to SRAM that is compiled in thumb mode and then
>> executing it is _not_ as simple as memcpy(), cast the pointer to a
>> function pointer, and then call the function pointer.
>>
>> The SRAM stuff throws all that out, instead preferring the dumb memcpy()
>> approach.
>>
>> This needs resolving, and I'd like to see it resolved to the satisfaction
>> of architecture maintainers before we progress any further down this
>> route.

I'm sure you are referring to fncpy, correct? This is what we used before with 
ARM specific code to do the copy, but we've moved into drivers now. What are 
your thoughts on exposing fncpy outside of arch/arm?

Regards,
Dave

>
> OK thanks, will wait until that is sorted out before merging any
> of the SRAM code.
>
> Regards,
>
> Tony
>
Russell King (Oracle) April 6, 2017, 7 p.m. UTC | #5
On Wed, Apr 05, 2017 at 09:48:26AM -0500, Dave Gerlach wrote:
> Russell,
> On 04/05/2017 09:33 AM, Tony Lindgren wrote:
> >* Russell King - ARM Linux <linux@armlinux.org.uk> [170405 07:02]:
> >>I'm not going to comment on this yet, but I'll instead comment on the
> >>newly appeared sram_exec_copy() stuff.
> >>
> >>So, a few years ago, we went to significant effort in ARM land to come
> >>up with a way to _safely_ copy assembler from the kernel into SRAM,
> >>because copying code to SRAM that is compiled in thumb mode and then
> >>executing it is _not_ as simple as memcpy(), cast the pointer to a
> >>function pointer, and then call the function pointer.
> >>
> >>The SRAM stuff throws all that out, instead preferring the dumb memcpy()
> >>approach.
> >>
> >>This needs resolving, and I'd like to see it resolved to the satisfaction
> >>of architecture maintainers before we progress any further down this
> >>route.
> 
> I'm sure you are referring to fncpy, correct? This is what we used before
> with ARM specific code to do the copy, but we've moved into drivers now.

Right, and as I explained above, fncpy() exists with very good reason.
The following does not work on ARM:

	sram = alloc(function_size);

	memcpy(sram, function, function_size);

	sram_ptr = (function_cast_t)sram;

	sram_ptr(args);

when the function is Thumb.  There are two problems with the above code
that fncpy() solves, both stemming from the same root cause:

1. The address of "function" will be offset by one byte, so the memcpy()
   will miss copying the first byte of the function.

2. sram_ptr will not be offset by one byte.

This is because, with Thumb functions, the "address" of the function is
offset by one byte - by the architecture requirements - to indicate that
it is to be called in Thumb mode.

> What are your thoughts on exposing fncpy outside of arch/arm?

You may use it by including asm/fncpy.h, but you may not move it out of
that file.  fncpy() is there exactly because it's _architecture_ specific.

If you're looking to make this generic, then we need cross-arch agreement
on how we can copy functions, and I'd recommend that fncpy() becomes that
generic copy function.  fncpy() has advantages over memcpy() besides
encoding the architecture specific knowledge - the biggest one is that
it guarantees type safety as well.  It ensures that the function pointer
that it's returning conforms with the function it's being asked to copy.

It strikes me, looking at the SRAM stuff, that the baby has been
completely thrown out with the bath water...

And really, this SRAM stuff _should_ have been through architecture
maintainer review before being merged into mainline so that these issues
could have been highlighted before hand.

This looks to me like yet another huge big review failure in kernel land,
because people are insistant on continually dividing stuff up by
sub-directory.  This has got to stop.
Dave Gerlach April 6, 2017, 7:09 p.m. UTC | #6
On 04/06/2017 02:00 PM, Russell King - ARM Linux wrote:
> On Wed, Apr 05, 2017 at 09:48:26AM -0500, Dave Gerlach wrote:
>> Russell,
>> On 04/05/2017 09:33 AM, Tony Lindgren wrote:
>>> * Russell King - ARM Linux <linux@armlinux.org.uk> [170405 07:02]:
>>>> I'm not going to comment on this yet, but I'll instead comment on the
>>>> newly appeared sram_exec_copy() stuff.
>>>>
>>>> So, a few years ago, we went to significant effort in ARM land to come
>>>> up with a way to _safely_ copy assembler from the kernel into SRAM,
>>>> because copying code to SRAM that is compiled in thumb mode and then
>>>> executing it is _not_ as simple as memcpy(), cast the pointer to a
>>>> function pointer, and then call the function pointer.
>>>>
>>>> The SRAM stuff throws all that out, instead preferring the dumb memcpy()
>>>> approach.
>>>>
>>>> This needs resolving, and I'd like to see it resolved to the satisfaction
>>>> of architecture maintainers before we progress any further down this
>>>> route.
>>
>> I'm sure you are referring to fncpy, correct? This is what we used before
>> with ARM specific code to do the copy, but we've moved into drivers now.
>
> Right, and as I explained above, fncpy() exists with very good reason.
> The following does not work on ARM:
>
> 	sram = alloc(function_size);
>
> 	memcpy(sram, function, function_size);
>
> 	sram_ptr = (function_cast_t)sram;
>
> 	sram_ptr(args);
>
> when the function is Thumb.  There are two problems with the above code
> that fncpy() solves, both stemming from the same root cause:
>
> 1. The address of "function" will be offset by one byte, so the memcpy()
>    will miss copying the first byte of the function.
>
> 2. sram_ptr will not be offset by one byte.
>
> This is because, with Thumb functions, the "address" of the function is
> offset by one byte - by the architecture requirements - to indicate that
> it is to be called in Thumb mode.

Thanks for the explanation.

>
>> What are your thoughts on exposing fncpy outside of arch/arm?
>
> You may use it by including asm/fncpy.h, but you may not move it out of
> that file.  fncpy() is there exactly because it's _architecture_ specific.

Yes agreed, and I already include asm/cacheflush.h for ARM arch to make use of 
the set_memory_* APIs you helped extend.

>
> If you're looking to make this generic, then we need cross-arch agreement
> on how we can copy functions, and I'd recommend that fncpy() becomes that
> generic copy function.  fncpy() has advantages over memcpy() besides
> encoding the architecture specific knowledge - the biggest one is that
> it guarantees type safety as well.  It ensures that the function pointer
> that it's returning conforms with the function it's being asked to copy.

I agree with this, I sent a patch yesterday here [1] that uses fncpy instead of 
memcpy. I also laid out the constraint that an arch will just need to define 
fncpy that guarantees safe copy of a function in order to make use of the 
sram_exec functionality. It is already only selected for CONFIG_ARM in Kconfig 
because of the dependency on set_memory_* APIs.

[1] http://www.spinics.net/lists/linux-omap/msg136517.html

>
> It strikes me, looking at the SRAM stuff, that the baby has been
> completely thrown out with the bath water...
>
> And really, this SRAM stuff _should_ have been through architecture
> maintainer review before being merged into mainline so that these issues
> could have been highlighted before hand.
>
> This looks to me like yet another huge big review failure in kernel land,
> because people are insistant on continually dividing stuff up by
> sub-directory.  This has got to stop.
>

This just seemed like a logical place for me to add this as executable SRAM is a 
shared resource and if the generic sram driver is handling allocation of memory 
regions it should handle the page attributes of these regions as well.

Regards,
Dave
diff mbox

Patch

diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 608008229c7d..d728b5660e36 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -28,6 +28,7 @@ 
 #include <asm/vdso_datapage.h>
 #include <asm/hardware/cache-l2x0.h>
 #include <linux/kbuild.h>
+#include <linux/ti-emif-sram.h>
 
 /*
  * Make sure that the compiler and target are compatible.
@@ -183,5 +184,10 @@  int main(void)
 #ifdef CONFIG_VDSO
   DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
 #endif
+#if defined(CONFIG_SOC_AM33XX) || defined(CONFIG_SOC_AM43XX)
+  BLANK();
+  ti_emif_offsets();
+#endif
+
   return 0; 
 }
diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index ec80e35c8dfe..c29505fdb5cd 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -83,6 +83,16 @@  config OMAP_GPMC_DEBUG
 	  bootloader or else the GPMC timings won't be identical with the
 	  bootloader timings.
 
+config TI_EMIF_SRAM
+	tristate "Texas Instruments EMIF SRAM driver"
+	depends on (SOC_AM33XX || SOC_AM43XX) && SRAM
+	help
+	  This driver is for the EMIF module available on Texas Instruments
+	  AM33XX and AM43XX SoCs and is required for PM. Certain parts of
+	  the EMIF PM code must run from on-chip SRAM late in the suspend
+	  sequence so this driver provides several relocatable PM functions
+	  for the SoC PM code to use.
+
 config MVEBU_DEVBUS
 	bool "Marvell EBU Device Bus Controller"
 	default y
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index e88097fbc085..6194b37a4265 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -21,3 +21,7 @@  obj-$(CONFIG_DA8XX_DDRCTL)	+= da8xx-ddrctl.o
 
 obj-$(CONFIG_SAMSUNG_MC)	+= samsung/
 obj-$(CONFIG_TEGRA_MC)		+= tegra/
+obj-$(CONFIG_TI_EMIF_SRAM)	+= ti-emif-sram.o
+ti-emif-sram-objs		:= ti-emif-pm.o ti-emif-sram-pm.o
+
+AFLAGS_ti-emif-sram-pm.o	:=-Wa,-march=armv7-a
diff --git a/drivers/memory/emif.h b/drivers/memory/emif.h
index bfe08bae961a..9e9f8037955d 100644
--- a/drivers/memory/emif.h
+++ b/drivers/memory/emif.h
@@ -555,6 +555,9 @@ 
 #define READ_LATENCY_SHDW_SHIFT				0
 #define READ_LATENCY_SHDW_MASK				(0x1f << 0)
 
+#define EMIF_SRAM_AM33_REG_LAYOUT			0x00000000
+#define EMIF_SRAM_AM43_REG_LAYOUT			0x00000001
+
 #ifndef __ASSEMBLY__
 /*
  * Structure containing shadow of important registers in EMIF
@@ -585,5 +588,19 @@  struct emif_regs {
 	u32 ext_phy_ctrl_3_shdw;
 	u32 ext_phy_ctrl_4_shdw;
 };
+
+struct ti_emif_pm_functions;
+
+extern unsigned int ti_emif_sram;
+extern unsigned int ti_emif_sram_sz;
+extern struct ti_emif_pm_data ti_emif_pm_sram_data;
+extern struct emif_regs_amx3 ti_emif_regs_amx3;
+
+void ti_emif_save_context(void);
+void ti_emif_restore_context(void);
+void ti_emif_enter_sr(void);
+void ti_emif_exit_sr(void);
+void ti_emif_abort_sr(void);
+
 #endif /* __ASSEMBLY__ */
 #endif /* __EMIF_H */
diff --git a/drivers/memory/ti-emif-pm.c b/drivers/memory/ti-emif-pm.c
new file mode 100644
index 000000000000..5fbdbcc85de5
--- /dev/null
+++ b/drivers/memory/ti-emif-pm.c
@@ -0,0 +1,295 @@ 
+/*
+ * TI AM33XX SRAM EMIF Driver
+ *
+ * Copyright (C) 2016-2017 Texas Instruments Inc.
+ *	Dave Gerlach
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/genalloc.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/sram.h>
+#include <linux/ti-emif-sram.h>
+
+#include "emif.h"
+
+#define TI_EMIF_SRAM_SYMBOL_OFFSET(sym) ((unsigned long)(sym) - \
+					 (unsigned long)&ti_emif_sram)
+
+#define EMIF_POWER_MGMT_WAIT_SELF_REFRESH_8192_CYCLES		0x00a0
+
+static phys_addr_t ti_emif_sram_phys, ti_emif_sram_data_phys;
+static unsigned long ti_emif_sram_virt, ti_emif_sram_data_virt;
+static struct gen_pool *sram_pool_code, *sram_pool_data;
+static struct ti_emif_pm_data pm_data;
+static struct ti_emif_pm_functions pm_functions;
+
+static u32 sram_suspend_address(unsigned long addr)
+{
+	return (ti_emif_sram_virt + TI_EMIF_SRAM_SYMBOL_OFFSET(addr));
+}
+
+static phys_addr_t sram_resume_address(unsigned long addr)
+{
+	return ((unsigned long)ti_emif_sram_phys +
+		TI_EMIF_SRAM_SYMBOL_OFFSET(addr));
+}
+
+static void ti_emif_free_sram(void)
+{
+	gen_pool_free(sram_pool_code, ti_emif_sram_virt, ti_emif_sram_sz);
+	gen_pool_free(sram_pool_data, ti_emif_sram_data_virt,
+		      sizeof(struct emif_regs_amx3));
+}
+
+static int ti_emif_prepare_push_sram(struct device *dev)
+{
+	struct device_node *np = dev->of_node;
+	int ret;
+
+	sram_pool_code = of_gen_pool_get(np, "sram", 0);
+	if (!sram_pool_code) {
+		dev_err(dev, "Unable to get sram pool for ocmcram code\n");
+		return -ENODEV;
+	}
+
+	ti_emif_sram_virt = gen_pool_alloc(sram_pool_code, ti_emif_sram_sz);
+	if (!ti_emif_sram_virt) {
+		dev_err(dev, "Unable to allocate code memory from ocmcram\n");
+		return -ENOMEM;
+	}
+
+	/* Save physical address to calculate resume offset during pm init */
+	ti_emif_sram_phys = gen_pool_virt_to_phys(sram_pool_code,
+						  ti_emif_sram_virt);
+
+	/* Get sram pool for data section and allocate space */
+	sram_pool_data = of_gen_pool_get(np, "sram", 1);
+	if (!sram_pool_data) {
+		dev_err(dev, "Unable to get sram pool for ocmcram data\n");
+		ret = -ENODEV;
+		goto err_free_sram_code;
+	}
+
+	ti_emif_sram_data_virt = gen_pool_alloc(sram_pool_data,
+						sizeof(struct emif_regs_amx3));
+	if (!ti_emif_sram_data_virt) {
+		dev_err(dev, "Unable to allocate data memory from ocmcram\n");
+		return -ENOMEM;
+		goto err_free_sram_code;
+	}
+
+	/* Save physical address to calculate resume offset during pm init */
+	ti_emif_sram_data_phys = gen_pool_virt_to_phys(sram_pool_data,
+						       ti_emif_sram_data_virt);
+	/*
+	 * These functions are called during suspend path while MMU is
+	 * still on so add virtual base to offset for absolute address
+	 */
+	pm_functions.save_context =
+		sram_suspend_address((unsigned long)ti_emif_save_context);
+	pm_functions.enter_sr =
+		sram_suspend_address((unsigned long)ti_emif_enter_sr);
+	pm_functions.abort_sr =
+		sram_suspend_address((unsigned long)ti_emif_abort_sr);
+
+	/*
+	 * These are called during resume path when MMU is not enabled
+	 * so physical address is used instead
+	 */
+	pm_functions.restore_context =
+		sram_resume_address((unsigned long)ti_emif_restore_context);
+	pm_functions.exit_sr =
+		sram_resume_address((unsigned long)ti_emif_exit_sr);
+
+	pm_data.regs_virt = (struct emif_regs_amx3 *)ti_emif_sram_data_virt;
+	pm_data.regs_phys = ti_emif_sram_data_phys;
+
+	return 0;
+
+err_free_sram_code:
+	gen_pool_free(sram_pool_code, ti_emif_sram_virt, ti_emif_sram_sz);
+	return ret;
+}
+
+static int ti_emif_push_sram(struct device *dev)
+{
+	int ret;
+
+	ret = sram_exec_copy(sram_pool_code, (void *)ti_emif_sram_virt,
+			     &ti_emif_sram, ti_emif_sram_sz);
+	if (ret) {
+		dev_err(dev, "Cannot copy emif code to sram\n");
+		return ret;
+	}
+
+	ret = sram_exec_copy(sram_pool_code,
+			     (void *)sram_suspend_address((unsigned long)&ti_emif_pm_sram_data),
+			     &pm_data, sizeof(pm_data));
+	if (ret) {
+		dev_err(dev, "Cannot copy emif data to code sram\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+/*
+ * Due to Usage Note 3.1.2 "DDR3: JEDEC Compliance for Maximum
+ * Self-Refresh Command Limit" found in AM335x Silicon Errata
+ * (Document SPRZ360F Revised November 2013) we must configure
+ * the self refresh delay timer to 0xA (8192 cycles) to avoid
+ * generating too many refresh command from the EMIF.
+ */
+static void ti_emif_configure_sr_delay(void)
+{
+	writel(EMIF_POWER_MGMT_WAIT_SELF_REFRESH_8192_CYCLES,
+	       (pm_data.ti_emif_base_addr_virt +
+		EMIF_POWER_MANAGEMENT_CONTROL));
+
+	writel(EMIF_POWER_MGMT_WAIT_SELF_REFRESH_8192_CYCLES,
+	       (pm_data.ti_emif_base_addr_virt +
+		EMIF_POWER_MANAGEMENT_CTRL_SHDW));
+}
+
+/**
+ * ti_emif_copy_pm_function_table - copy mapping of pm funcs in sram
+ * @sram_pool: pointer to struct gen_pool where dst resides
+ * @dst: void * to address that table should be copied
+ *
+ * Returns 0 if success other error code if table is not available
+ */
+int ti_emif_copy_pm_function_table(struct gen_pool *sram_pool, void *dst)
+{
+	int ret;
+
+	if (!ti_emif_sram_virt)
+		return -EINVAL;
+
+	ret = sram_exec_copy(sram_pool, dst, &pm_functions,
+			     sizeof(pm_functions));
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(ti_emif_copy_pm_function_table);
+
+/**
+ * ti_emif_get_mem_type - return type for memory type in use
+ *
+ * Returns memory type value read from EMIF or error code if fails
+ */
+int ti_emif_get_mem_type(void)
+{
+	unsigned long temp;
+
+	if (!pm_data.ti_emif_base_addr_virt ||
+	    IS_ERR(pm_data.ti_emif_base_addr_virt))
+		return -ENODEV;
+
+	temp = readl(pm_data.ti_emif_base_addr_virt +
+		     EMIF_SDRAM_CONFIG);
+
+	temp = (temp & SDRAM_TYPE_MASK) >> SDRAM_TYPE_SHIFT;
+	return temp;
+}
+EXPORT_SYMBOL_GPL(ti_emif_get_mem_type);
+
+static const struct of_device_id ti_emif_of_match[] = {
+	{ .compatible = "ti,emif-am3352", .data =
+					(void *)EMIF_SRAM_AM33_REG_LAYOUT, },
+	{ .compatible = "ti,emif-am4372", .data =
+					(void *)EMIF_SRAM_AM43_REG_LAYOUT, },
+	{},
+};
+MODULE_DEVICE_TABLE(of, ti_emif_of_match);
+
+static int ti_emif_probe(struct platform_device *pdev)
+{
+	int ret = -ENODEV;
+	struct resource *res;
+	struct device *dev = &pdev->dev;
+	const struct of_device_id *match;
+
+	match = of_match_device(ti_emif_of_match, &pdev->dev);
+	if (!match)
+		return -ENODEV;
+
+	pm_data.ti_emif_sram_config = (u32)match->data;
+
+	pm_runtime_enable(dev);
+	ret = pm_runtime_get_sync(dev);
+
+	if (IS_ERR_VALUE(ret)) {
+		dev_err(dev, "pm_runtime_get_sync() failed\n");
+		goto fail_runtime_put;
+	}
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	pm_data.ti_emif_base_addr_virt = devm_ioremap_resource(dev, res);
+	if (IS_ERR(pm_data.ti_emif_base_addr_virt)) {
+		dev_err(dev, "could not ioremap emif mem\n");
+		ret =  PTR_ERR(pm_data.ti_emif_base_addr_virt);
+		goto fail_runtime_put;
+	}
+
+	pm_data.ti_emif_base_addr_phys = res->start;
+
+	ti_emif_configure_sr_delay();
+
+	ret = ti_emif_prepare_push_sram(dev);
+	if (ret)
+		goto fail_runtime_put;
+
+	ret = ti_emif_push_sram(dev);
+	if (ret)
+		goto fail_free_sram;
+
+	return 0;
+
+fail_free_sram:
+	ti_emif_free_sram();
+fail_runtime_put:
+	pm_runtime_put_sync(dev);
+	return ret;
+}
+
+static int ti_emif_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+
+	pm_runtime_put_sync(dev);
+	pm_runtime_disable(dev);
+
+	ti_emif_free_sram();
+
+	return 0;
+}
+
+static struct platform_driver ti_emif_driver = {
+	.probe = ti_emif_probe,
+	.remove = ti_emif_remove,
+	.driver = {
+		.name = KBUILD_MODNAME,
+		.of_match_table = of_match_ptr(ti_emif_of_match),
+	},
+};
+module_platform_driver(ti_emif_driver);
+
+MODULE_AUTHOR("Dave Gerlach <d-gerlach@ti.com>");
+MODULE_DESCRIPTION("Texas Instruments SRAM EMIF driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/memory/ti-emif-sram-pm.S b/drivers/memory/ti-emif-sram-pm.S
new file mode 100644
index 000000000000..75888a22eb7f
--- /dev/null
+++ b/drivers/memory/ti-emif-sram-pm.S
@@ -0,0 +1,334 @@ 
+/*
+ * Low level PM code for TI EMIF
+ *
+ * Copyright (C) 2016-2017 Texas Instruments Incorporated - http://www.ti.com/
+ *	Dave Gerlach
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+#include <asm/memory.h>
+
+#include "emif.h"
+
+#define EMIF_POWER_MGMT_WAIT_SELF_REFRESH_8192_CYCLES	0x00a0
+#define EMIF_POWER_MGMT_SR_TIMER_MASK			0x00f0
+#define EMIF_POWER_MGMT_SELF_REFRESH_MODE		0x0200
+#define EMIF_POWER_MGMT_SELF_REFRESH_MODE_MASK		0x0700
+
+#define EMIF_SDCFG_TYPE_DDR2				0x2 << SDRAM_TYPE_SHIFT
+#define EMIF_STATUS_READY				0x4
+
+#define AM43XX_EMIF_PHY_CTRL_REG_COUNT                  0x120
+
+#define EMIF_AM437X_REGISTERS				0x1
+
+	.data
+	.align 3
+
+ENTRY(ti_emif_sram)
+
+/*
+ * void ti_emif_save_context(void)
+ *
+ * Used during suspend to save the context of all required EMIF registers
+ * to local memory if the EMIF is going to lose context during the sleep
+ * transition. Operates on the VIRTUAL address of the EMIF.
+ */
+ENTRY(ti_emif_save_context)
+	stmfd   sp!, {r4 - r11, lr}     @ save registers on stack
+
+	adr	r4, ti_emif_pm_sram_data
+	ldr	r0, [r4, #EMIF_PM_BASE_ADDR_VIRT_OFFSET]
+	ldr	r2, [r4, #EMIF_PM_REGS_VIRT_OFFSET]
+
+	/* Save EMIF configuration */
+	ldr	r1, [r0, #EMIF_SDRAM_CONFIG]
+	str	r1, [r2, #EMIF_SDCFG_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_SDRAM_REFRESH_CONTROL]
+	str	r1, [r2, #EMIF_REF_CTRL_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_SDRAM_TIMING_1]
+	str     r1, [r2, #EMIF_TIMING1_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_SDRAM_TIMING_2]
+	str     r1, [r2, #EMIF_TIMING2_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_SDRAM_TIMING_3]
+	str     r1, [r2, #EMIF_TIMING3_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+	str     r1, [r2, #EMIF_PMCR_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_POWER_MANAGEMENT_CTRL_SHDW]
+	str     r1, [r2, #EMIF_PMCR_SHDW_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_SDRAM_OUTPUT_IMPEDANCE_CALIBRATION_CONFIG]
+	str     r1, [r2, #EMIF_ZQCFG_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_DDR_PHY_CTRL_1]
+	str     r1, [r2, #EMIF_DDR_PHY_CTLR_1_OFFSET]
+
+	ldr	r1, [r0, #EMIF_COS_CONFIG]
+	str     r1, [r2, #EMIF_COS_CONFIG_OFFSET]
+
+	ldr	r1, [r0, #EMIF_PRIORITY_TO_CLASS_OF_SERVICE_MAPPING]
+	str     r1, [r2, #EMIF_PRIORITY_TO_COS_MAPPING_OFFSET]
+
+	ldr	r1, [r0, #EMIF_CONNECTION_ID_TO_CLASS_OF_SERVICE_1_MAPPING]
+	str     r1, [r2, #EMIF_CONNECT_ID_SERV_1_MAP_OFFSET]
+
+	ldr	r1, [r0, #EMIF_CONNECTION_ID_TO_CLASS_OF_SERVICE_2_MAPPING]
+	str     r1, [r2, #EMIF_CONNECT_ID_SERV_2_MAP_OFFSET]
+
+	ldr	r1, [r0, #EMIF_OCP_CONFIG]
+	str     r1, [r2, #EMIF_OCP_CONFIG_VAL_OFFSET]
+
+	ldr	r5, [r4, #EMIF_PM_CONFIG_OFFSET]
+	cmp	r5, #EMIF_SRAM_AM43_REG_LAYOUT
+	bne	emif_skip_save_extra_regs
+
+	ldr	r1, [r0, #EMIF_READ_WRITE_LEVELING_RAMP_CONTROL]
+	str     r1, [r2, #EMIF_RD_WR_LEVEL_RAMP_CTRL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_READ_WRITE_EXECUTION_THRESHOLD]
+	str     r1, [r2, #EMIF_RD_WR_EXEC_THRESH_OFFSET]
+
+	ldr	r1, [r0, #EMIF_LPDDR2_NVM_TIMING]
+	str     r1, [r2, #EMIF_LPDDR2_NVM_TIM_OFFSET]
+
+	ldr	r1, [r0, #EMIF_LPDDR2_NVM_TIMING_SHDW]
+	str     r1, [r2, #EMIF_LPDDR2_NVM_TIM_SHDW_OFFSET]
+
+	ldr	r1, [r0, #EMIF_DLL_CALIB_CTRL]
+	str     r1, [r2, #EMIF_DLL_CALIB_CTRL_VAL_OFFSET]
+
+	ldr	r1, [r0, #EMIF_DLL_CALIB_CTRL_SHDW]
+	str     r1, [r2, #EMIF_DLL_CALIB_CTRL_VAL_SHDW_OFFSET]
+
+	/* Loop and save entire block of emif phy regs */
+	mov	r5, #0x0
+	add	r4, r2, #EMIF_EXT_PHY_CTRL_VALS_OFFSET
+	add	r3, r0, #EMIF_EXT_PHY_CTRL_1
+ddr_phy_ctrl_save:
+	ldr	r1, [r3, r5]
+	str	r1, [r4, r5]
+	add	r5, r5, #0x4
+	cmp	r5, #AM43XX_EMIF_PHY_CTRL_REG_COUNT
+	bne	ddr_phy_ctrl_save
+
+emif_skip_save_extra_regs:
+	ldmfd	sp!, {r4 - r11, pc}	@ restore regs and return
+ENDPROC(ti_emif_save_context)
+
+/*
+ * void ti_emif_restore_context(void)
+ *
+ * Used during resume to restore the context of all required EMIF registers
+ * from local memory after the EMIF has lost context during a sleep transition.
+ * Operates on the PHYSICAL address of the EMIF.
+ */
+ENTRY(ti_emif_restore_context)
+	adr	r4, ti_emif_pm_sram_data
+	ldr	r0, [r4, #EMIF_PM_BASE_ADDR_PHYS_OFFSET]
+	ldr	r2, [r4, #EMIF_PM_REGS_PHYS_OFFSET]
+
+	/* Config EMIF Timings */
+	ldr     r1, [r2, #EMIF_DDR_PHY_CTLR_1_OFFSET]
+	str	r1, [r0, #EMIF_DDR_PHY_CTRL_1]
+	str	r1, [r0, #EMIF_DDR_PHY_CTRL_1_SHDW]
+
+	ldr     r1, [r2, #EMIF_TIMING1_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_1]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_1_SHDW]
+
+	ldr     r1, [r2, #EMIF_TIMING2_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_2]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_2_SHDW]
+
+	ldr     r1, [r2, #EMIF_TIMING3_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_3]
+	str	r1, [r0, #EMIF_SDRAM_TIMING_3_SHDW]
+
+	ldr     r1, [r2, #EMIF_REF_CTRL_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_REFRESH_CONTROL]
+	str	r1, [r0, #EMIF_SDRAM_REFRESH_CTRL_SHDW]
+
+	ldr     r1, [r2, #EMIF_PMCR_VAL_OFFSET]
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CTRL_SHDW]
+
+	ldr     r1, [r2, #EMIF_PMCR_SHDW_VAL_OFFSET]
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CTRL_SHDW]
+
+	ldr     r1, [r2, #EMIF_COS_CONFIG_OFFSET]
+	str	r1, [r0, #EMIF_COS_CONFIG]
+
+	ldr     r1, [r2, #EMIF_PRIORITY_TO_COS_MAPPING_OFFSET]
+	str	r1, [r0, #EMIF_PRIORITY_TO_CLASS_OF_SERVICE_MAPPING]
+
+	ldr	r1, [r2, #EMIF_CONNECT_ID_SERV_1_MAP_OFFSET]
+	str	r1, [r0, #EMIF_CONNECTION_ID_TO_CLASS_OF_SERVICE_1_MAPPING]
+
+	ldr     r1, [r2, #EMIF_CONNECT_ID_SERV_2_MAP_OFFSET]
+	str	r1, [r0, #EMIF_CONNECTION_ID_TO_CLASS_OF_SERVICE_2_MAPPING]
+
+	ldr     r1, [r2, #EMIF_OCP_CONFIG_VAL_OFFSET]
+	str	r1, [r0, #EMIF_OCP_CONFIG]
+
+	ldr	r5, [r4, #EMIF_PM_CONFIG_OFFSET]
+	cmp	r5, #EMIF_SRAM_AM43_REG_LAYOUT
+	bne	emif_skip_restore_extra_regs
+
+	ldr     r1, [r2, #EMIF_RD_WR_LEVEL_RAMP_CTRL_OFFSET]
+	str	r1, [r0, #EMIF_READ_WRITE_LEVELING_RAMP_CONTROL]
+
+	ldr     r1, [r2, #EMIF_RD_WR_EXEC_THRESH_OFFSET]
+	str	r1, [r0, #EMIF_READ_WRITE_EXECUTION_THRESHOLD]
+
+	ldr     r1, [r2, #EMIF_LPDDR2_NVM_TIM_OFFSET]
+	str	r1, [r0, #EMIF_LPDDR2_NVM_TIMING]
+
+	ldr     r1, [r2, #EMIF_LPDDR2_NVM_TIM_SHDW_OFFSET]
+	str	r1, [r0, #EMIF_LPDDR2_NVM_TIMING_SHDW]
+
+	ldr     r1, [r2, #EMIF_DLL_CALIB_CTRL_VAL_OFFSET]
+	str	r1, [r0, #EMIF_DLL_CALIB_CTRL]
+
+	ldr     r1, [r2, #EMIF_DLL_CALIB_CTRL_VAL_SHDW_OFFSET]
+	str	r1, [r0, #EMIF_DLL_CALIB_CTRL_SHDW]
+
+	ldr     r1, [r2, #EMIF_ZQCFG_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_OUTPUT_IMPEDANCE_CALIBRATION_CONFIG]
+
+	/* Loop and restore entire block of emif phy regs */
+	mov	r5, #0x0
+	/* Load ti_emif_regs_amx3 + EMIF_EXT_PHY_CTRL_VALS_OFFSET for address
+	 * to phy register save space
+	 */
+	add	r3, r2, #EMIF_EXT_PHY_CTRL_VALS_OFFSET
+	add	r4, r0, #EMIF_EXT_PHY_CTRL_1
+ddr_phy_ctrl_restore:
+	ldr	r1, [r3, r5]
+	str	r1, [r4, r5]
+	add	r5, r5, #0x4
+	cmp	r5, #AM43XX_EMIF_PHY_CTRL_REG_COUNT
+	bne	ddr_phy_ctrl_restore
+
+emif_skip_restore_extra_regs:
+	/*
+	 * Output impedence calib needed only for DDR3
+	 * but since the initial state of this will be
+	 * disabled for DDR2 no harm in restoring the
+	 * old configuration
+	 */
+	ldr     r1, [r2, #EMIF_ZQCFG_VAL_OFFSET]
+	str	r1, [r0, #EMIF_SDRAM_OUTPUT_IMPEDANCE_CALIBRATION_CONFIG]
+
+	/* Write to sdcfg last for DDR2 only */
+	ldr	r1, [r2, #EMIF_SDCFG_VAL_OFFSET]
+	and	r2, r1, #SDRAM_TYPE_MASK
+	cmp	r2, #EMIF_SDCFG_TYPE_DDR2
+	streq	r1, [r0, #EMIF_SDRAM_CONFIG]
+
+	mov	pc, lr
+ENDPROC(ti_emif_restore_context)
+
+/*
+ * void ti_emif_enter_sr(void)
+ *
+ * Programs the EMIF to tell the SDRAM to enter into self-refresh
+ * mode during a sleep transition. Operates on the VIRTUAL address
+ * of the EMIF.
+ */
+ENTRY(ti_emif_enter_sr)
+	stmfd   sp!, {r4 - r11, lr}     @ save registers on stack
+
+	adr	r4, ti_emif_pm_sram_data
+	ldr	r0, [r4, #EMIF_PM_BASE_ADDR_VIRT_OFFSET]
+	ldr	r2, [r4, #EMIF_PM_REGS_VIRT_OFFSET]
+
+	ldr	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+	bic	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE_MASK
+	orr	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+
+	ldmfd	sp!, {r4 - r11, pc}	@ restore regs and return
+ENDPROC(ti_emif_enter_sr)
+
+/*
+ * void ti_emif_exit_sr(void)
+ *
+ * Programs the EMIF to tell the SDRAM to exit self-refresh mode
+ * after a sleep transition. Operates on the PHYSICAL address of
+ * the EMIF.
+ */
+ENTRY(ti_emif_exit_sr)
+	adr	r4, ti_emif_pm_sram_data
+	ldr	r0, [r4, #EMIF_PM_BASE_ADDR_PHYS_OFFSET]
+	ldr	r2, [r4, #EMIF_PM_REGS_PHYS_OFFSET]
+
+	/*
+	 * Toggle EMIF to exit refresh mode:
+	 * if EMIF lost context, PWR_MGT_CTRL is currently 0, writing disable
+	 *   (0x0), wont do diddly squat! so do a toggle from SR(0x2) to disable
+	 *   (0x0) here.
+	 * *If* EMIF did not lose context, nothing broken as we write the same
+	 *   value(0x2) to reg before we write a disable (0x0).
+	 */
+	ldr	r1, [r2, #EMIF_PMCR_VAL_OFFSET]
+	bic	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE_MASK
+	orr	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+	bic	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE_MASK
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+
+        /* Wait for EMIF to become ready */
+1:	ldr     r1, [r0, #EMIF_STATUS]
+	tst     r1, #EMIF_STATUS_READY
+	beq     1b
+
+	mov	pc, lr
+ENDPROC(ti_emif_exit_sr)
+
+/*
+ * void ti_emif_abort_sr(void)
+ *
+ * Disables self-refresh after a failed transition to a low-power
+ * state so the kernel can jump back to DDR and follow abort path.
+ * Operates on the VIRTUAL address of the EMIF.
+ */
+ENTRY(ti_emif_abort_sr)
+	stmfd   sp!, {r4 - r11, lr}     @ save registers on stack
+
+	adr	r4, ti_emif_pm_sram_data
+	ldr	r0, [r4, #EMIF_PM_BASE_ADDR_VIRT_OFFSET]
+	ldr	r2, [r4, #EMIF_PM_REGS_VIRT_OFFSET]
+
+	ldr	r1, [r2, #EMIF_PMCR_VAL_OFFSET]
+	bic	r1, r1, #EMIF_POWER_MGMT_SELF_REFRESH_MODE_MASK
+	str	r1, [r0, #EMIF_POWER_MANAGEMENT_CONTROL]
+
+	/* Wait for EMIF to become ready */
+1:	ldr     r1, [r0, #EMIF_STATUS]
+	tst     r1, #EMIF_STATUS_READY
+	beq     1b
+
+	ldmfd	sp!, {r4 - r11, pc}	@ restore regs and return
+ENDPROC(ti_emif_abort_sr)
+
+	.align 3
+ENTRY(ti_emif_pm_sram_data)
+	.space EMIF_PM_DATA_SIZE
+ENTRY(ti_emif_sram_sz)
+        .word   . - ti_emif_save_context
diff --git a/include/linux/ti-emif-sram.h b/include/linux/ti-emif-sram.h
new file mode 100644
index 000000000000..4e15e4cc5363
--- /dev/null
+++ b/include/linux/ti-emif-sram.h
@@ -0,0 +1,143 @@ 
+/*
+ * TI AM33XX EMIF Routines
+ *
+ * Copyright (C) 2016-2017 Texas Instruments Inc.
+ *	Dave Gerlach
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef __LINUX_TI_EMIF_H
+#define __LINUX_TI_EMIF_H
+
+#include <linux/kbuild.h>
+#ifndef __ASSEMBLY__
+
+struct emif_regs_amx3 {
+	u32 emif_sdcfg_val;
+	u32 emif_timing1_val;
+	u32 emif_timing2_val;
+	u32 emif_timing3_val;
+	u32 emif_ref_ctrl_val;
+	u32 emif_zqcfg_val;
+	u32 emif_pmcr_val;
+	u32 emif_pmcr_shdw_val;
+	u32 emif_rd_wr_level_ramp_ctrl;
+	u32 emif_rd_wr_exec_thresh;
+	u32 emif_cos_config;
+	u32 emif_priority_to_cos_mapping;
+	u32 emif_connect_id_serv_1_map;
+	u32 emif_connect_id_serv_2_map;
+	u32 emif_ocp_config_val;
+	u32 emif_lpddr2_nvm_tim;
+	u32 emif_lpddr2_nvm_tim_shdw;
+	u32 emif_dll_calib_ctrl_val;
+	u32 emif_dll_calib_ctrl_val_shdw;
+	u32 emif_ddr_phy_ctlr_1;
+	u32 emif_ext_phy_ctrl_vals[120];
+};
+
+struct ti_emif_pm_data {
+	void __iomem *ti_emif_base_addr_virt;
+	phys_addr_t ti_emif_base_addr_phys;
+	unsigned long ti_emif_sram_config;
+	struct emif_regs_amx3 *regs_virt;
+	phys_addr_t regs_phys;
+} __packed;
+
+struct ti_emif_pm_functions {
+	u32 save_context;
+	u32 restore_context;
+	u32 enter_sr;
+	u32 exit_sr;
+	u32 abort_sr;
+} __packed;
+
+extern inline void ti_emif_offsets(void)
+{
+	DEFINE(EMIF_SDCFG_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_sdcfg_val));
+	DEFINE(EMIF_TIMING1_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_timing1_val));
+	DEFINE(EMIF_TIMING2_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_timing2_val));
+	DEFINE(EMIF_TIMING3_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_timing3_val));
+	DEFINE(EMIF_REF_CTRL_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_ref_ctrl_val));
+	DEFINE(EMIF_ZQCFG_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_zqcfg_val));
+	DEFINE(EMIF_PMCR_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_pmcr_val));
+	DEFINE(EMIF_PMCR_SHDW_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_pmcr_shdw_val));
+	DEFINE(EMIF_RD_WR_LEVEL_RAMP_CTRL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_rd_wr_level_ramp_ctrl));
+	DEFINE(EMIF_RD_WR_EXEC_THRESH_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_rd_wr_exec_thresh));
+	DEFINE(EMIF_COS_CONFIG_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_cos_config));
+	DEFINE(EMIF_PRIORITY_TO_COS_MAPPING_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_priority_to_cos_mapping));
+	DEFINE(EMIF_CONNECT_ID_SERV_1_MAP_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_connect_id_serv_1_map));
+	DEFINE(EMIF_CONNECT_ID_SERV_2_MAP_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_connect_id_serv_2_map));
+	DEFINE(EMIF_OCP_CONFIG_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_ocp_config_val));
+	DEFINE(EMIF_LPDDR2_NVM_TIM_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_lpddr2_nvm_tim));
+	DEFINE(EMIF_LPDDR2_NVM_TIM_SHDW_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_lpddr2_nvm_tim_shdw));
+	DEFINE(EMIF_DLL_CALIB_CTRL_VAL_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_dll_calib_ctrl_val));
+	DEFINE(EMIF_DLL_CALIB_CTRL_VAL_SHDW_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_dll_calib_ctrl_val_shdw));
+	DEFINE(EMIF_DDR_PHY_CTLR_1_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_ddr_phy_ctlr_1));
+	DEFINE(EMIF_EXT_PHY_CTRL_VALS_OFFSET,
+	       offsetof(struct emif_regs_amx3, emif_ext_phy_ctrl_vals));
+	DEFINE(EMIF_REGS_AMX3_SIZE, sizeof(struct emif_regs_amx3));
+
+	BLANK();
+
+	DEFINE(EMIF_PM_BASE_ADDR_VIRT_OFFSET,
+	       offsetof(struct ti_emif_pm_data, ti_emif_base_addr_virt));
+	DEFINE(EMIF_PM_BASE_ADDR_PHYS_OFFSET,
+	       offsetof(struct ti_emif_pm_data, ti_emif_base_addr_phys));
+	DEFINE(EMIF_PM_CONFIG_OFFSET,
+	       offsetof(struct ti_emif_pm_data, ti_emif_sram_config));
+	DEFINE(EMIF_PM_REGS_VIRT_OFFSET,
+	       offsetof(struct ti_emif_pm_data, regs_virt));
+	DEFINE(EMIF_PM_REGS_PHYS_OFFSET,
+	       offsetof(struct ti_emif_pm_data, regs_phys));
+	DEFINE(EMIF_PM_DATA_SIZE, sizeof(struct ti_emif_pm_data));
+
+	BLANK();
+
+	DEFINE(EMIF_PM_SAVE_CONTEXT_OFFSET,
+	       offsetof(struct ti_emif_pm_functions, save_context));
+	DEFINE(EMIF_PM_RESTORE_CONTEXT_OFFSET,
+	       offsetof(struct ti_emif_pm_functions, restore_context));
+	DEFINE(EMIF_PM_ENTER_SR_OFFSET,
+	       offsetof(struct ti_emif_pm_functions, enter_sr));
+	DEFINE(EMIF_PM_EXIT_SR_OFFSET,
+	       offsetof(struct ti_emif_pm_functions, exit_sr));
+	DEFINE(EMIF_PM_ABORT_SR_OFFSET,
+	       offsetof(struct ti_emif_pm_functions, abort_sr));
+	DEFINE(EMIF_PM_FUNCTIONS_SIZE, sizeof(struct ti_emif_pm_functions));
+}
+
+struct gen_pool;
+
+int ti_emif_copy_pm_function_table(struct gen_pool *sram_pool, void *dst);
+int ti_emif_get_mem_type(void);
+
+#endif
+#endif /* __LINUX_TI_EMIF_H */