diff mbox

[3/4] MTD: pxa3xx_nand: enable multiple chip select support

Message ID 1309771536-10597-4-git-send-email-leiwen@marvell.com (mailing list archive)
State New, archived
Headers show

Commit Message

Lei Wen July 4, 2011, 9:25 a.m. UTC
Current pxa3xx_nand controller has two chip select which
both be workable. This patch enable this feature.

Update platform driver to support this feature.

Another notice should be taken that:
When you want to use this feature, you should not enable the
keep configuration feature, for two chip select could be
attached with different nand chip. The different page size
and timing requirement make the keep configuration impossible.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 arch/arm/plat-pxa/include/plat/pxa3xx_nand.h |   20 +++-
 drivers/mtd/nand/pxa3xx_nand.c               |  152 ++++++++++++++++++--------
 2 files changed, 124 insertions(+), 48 deletions(-)

Comments

Artem Bityutskiy July 6, 2011, 6:53 a.m. UTC | #1
The problem is that in this patch you change  struct pxa3xx_nand_platform_data,
and in the next patch you amend the platform data declarations, which means that
the kernel will not compile between patches 3 and 4...

On Mon, 2011-07-04 at 02:25 -0700, Lei Wen wrote:
> +               dev_info(&info->pdev->dev, "There is no nand chip on cs %d!\n"
> +                               , info->cs); 

Nitpick - keep the comma at the first line.
Lei Wen July 6, 2011, 6:54 a.m. UTC | #2
On Wed, Jul 6, 2011 at 2:53 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> The problem is that in this patch you change  struct pxa3xx_nand_platform_data,
> and in the next patch you amend the platform data declarations, which means that
> the kernel will not compile between patches 3 and 4...

So... how about merge the patch 3 and 4 together?

>
> On Mon, 2011-07-04 at 02:25 -0700, Lei Wen wrote:
>> +               dev_info(&info->pdev->dev, "There is no nand chip on cs %d!\n"
>> +                               , info->cs);
>
> Nitpick - keep the comma at the first line.

Opps, sorry for this.

Thanks,
Lei
Artem Bityutskiy July 6, 2011, 7:07 a.m. UTC | #3
On Wed, 2011-07-06 at 14:54 +0800, Lei Wen wrote:
> On Wed, Jul 6, 2011 at 2:53 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> > The problem is that in this patch you change  struct pxa3xx_nand_platform_data,
> > and in the next patch you amend the platform data declarations, which means that
> > the kernel will not compile between patches 3 and 4...
> 
> So... how about merge the patch 3 and 4 together?

IMO this should be OK.
Igor Grinberg July 6, 2011, 7:41 a.m. UTC | #4
On 07/04/11 12:25, Lei Wen wrote:

>  #ifdef CONFIG_PM
> @@ -1203,8 +1259,12 @@ static int pxa3xx_nand_resume(struct platform_device *pdev)
>  {
>  	struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
>  
> -	nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
> -	nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
> +	/*
> +	 * Directly set the chip select to a invalid value,
> +	 * then the driver would reset the timing according
> +	 * to current chip select at the beginning of cmdfunc
> +	 */
> +	info->cs = 0xff;

Thinking of this for the second (or third) time,
If you have keep config enabled and have only one nand chip,
this will brake the keep config...

Daniel,

have you tested the suspend/resume with this patch?
(and keep_config on?)
Lei Wen July 7, 2011, 6:26 a.m. UTC | #5
Hi Igor && Daniel,

On Wed, Jul 6, 2011 at 3:41 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
> On 07/04/11 12:25, Lei Wen wrote:
>
>>  #ifdef CONFIG_PM
>> @@ -1203,8 +1259,12 @@ static int pxa3xx_nand_resume(struct platform_device *pdev)
>>  {
>>       struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
>>
>> -     nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
>> -     nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
>> +     /*
>> +      * Directly set the chip select to a invalid value,
>> +      * then the driver would reset the timing according
>> +      * to current chip select at the beginning of cmdfunc
>> +      */
>> +     info->cs = 0xff;
>
> Thinking of this for the second (or third) time,
> If you have keep config enabled and have only one nand chip,
> this will brake the keep config...
>
> Daniel,
>
> have you tested the suspend/resume with this patch?
> (and keep_config on?)
>

Do you still have concern with this change?
If not, I would push the next round of patch set including merging
patch 3 and 4.

Thanks,
Lei
Igor Grinberg July 7, 2011, 8:59 a.m. UTC | #6
On 07/07/11 09:26, Lei Wen wrote:

> Hi Igor && Daniel,
>
> On Wed, Jul 6, 2011 at 3:41 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>> On 07/04/11 12:25, Lei Wen wrote:
>>
>>>  #ifdef CONFIG_PM
>>> @@ -1203,8 +1259,12 @@ static int pxa3xx_nand_resume(struct platform_device *pdev)
>>>  {
>>>       struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
>>>
>>> -     nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
>>> -     nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
>>> +     /*
>>> +      * Directly set the chip select to a invalid value,
>>> +      * then the driver would reset the timing according
>>> +      * to current chip select at the beginning of cmdfunc
>>> +      */
>>> +     info->cs = 0xff;
>> Thinking of this for the second (or third) time,
>> If you have keep config enabled and have only one nand chip,
>> this will brake the keep config...
>>
>> Daniel,
>>
>> have you tested the suspend/resume with this patch?
>> (and keep_config on?)
>>
> Do you still have concern with this change?
> If not, I would push the next round of patch set including merging
> patch 3 and 4.

Though I can't test it right now, but yes it looks like it breaks
the keep_config after the resume (actually it disables the keep_config silently).

I think you need here some kind of check if keep_config is enabled.
keep_config is enabled means that only one chip select is used for NAND
and you don't need to reset the timings.
Lei Wen July 7, 2011, 9:06 a.m. UTC | #7
Hi Igor,

On Thu, Jul 7, 2011 at 4:59 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
> On 07/07/11 09:26, Lei Wen wrote:
>
>> Hi Igor && Daniel,
>>
>> On Wed, Jul 6, 2011 at 3:41 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>>> On 07/04/11 12:25, Lei Wen wrote:
>>>
>>>>  #ifdef CONFIG_PM
>>>> @@ -1203,8 +1259,12 @@ static int pxa3xx_nand_resume(struct platform_device *pdev)
>>>>  {
>>>>       struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
>>>>
>>>> -     nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
>>>> -     nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
>>>> +     /*
>>>> +      * Directly set the chip select to a invalid value,
>>>> +      * then the driver would reset the timing according
>>>> +      * to current chip select at the beginning of cmdfunc
>>>> +      */
>>>> +     info->cs = 0xff;
>>> Thinking of this for the second (or third) time,
>>> If you have keep config enabled and have only one nand chip,
>>> this will brake the keep config...
>>>
>>> Daniel,
>>>
>>> have you tested the suspend/resume with this patch?
>>> (and keep_config on?)
>>>
>> Do you still have concern with this change?
>> If not, I would push the next round of patch set including merging
>> patch 3 and 4.
>
> Though I can't test it right now, but yes it looks like it breaks
> the keep_config after the resume (actually it disables the keep_config silently).
>
> I think you need here some kind of check if keep_config is enabled.
> keep_config is enabled means that only one chip select is used for NAND
> and you don't need to reset the timings.

It would not break anything.
The value rewrite to timing register is the one that save in the
pxa3xx_nand_detect_config.
And it is the same behavior before this patch apply.

Best regards,
Lei
Igor Grinberg July 7, 2011, 11:23 a.m. UTC | #8
On 07/07/11 12:06, Lei Wen wrote:

> Hi Igor,
>
> On Thu, Jul 7, 2011 at 4:59 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>> On 07/07/11 09:26, Lei Wen wrote:
>>
>>> Hi Igor && Daniel,
>>>
>>> On Wed, Jul 6, 2011 at 3:41 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>>>> On 07/04/11 12:25, Lei Wen wrote:
>>>>
>>>>>  #ifdef CONFIG_PM
>>>>> @@ -1203,8 +1259,12 @@ static int pxa3xx_nand_resume(struct platform_device *pdev)
>>>>>  {
>>>>>       struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
>>>>>
>>>>> -     nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
>>>>> -     nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
>>>>> +     /*
>>>>> +      * Directly set the chip select to a invalid value,
>>>>> +      * then the driver would reset the timing according
>>>>> +      * to current chip select at the beginning of cmdfunc
>>>>> +      */
>>>>> +     info->cs = 0xff;
>>>> Thinking of this for the second (or third) time,
>>>> If you have keep config enabled and have only one nand chip,
>>>> this will brake the keep config...
>>>>
>>>> Daniel,
>>>>
>>>> have you tested the suspend/resume with this patch?
>>>> (and keep_config on?)
>>>>
>>> Do you still have concern with this change?
>>> If not, I would push the next round of patch set including merging
>>> patch 3 and 4.
>> Though I can't test it right now, but yes it looks like it breaks
>> the keep_config after the resume (actually it disables the keep_config silently).
>>
>> I think you need here some kind of check if keep_config is enabled.
>> keep_config is enabled means that only one chip select is used for NAND
>> and you don't need to reset the timings.
> It would not break anything.
> The value rewrite to timing register is the one that save in the
> pxa3xx_nand_detect_config.
> And it is the same behavior before this patch apply.

Right. I've missed this. Thanks for the explanation.
Daniel Mack July 11, 2011, 2:49 p.m. UTC | #9
On Wed, Jul 6, 2011 at 9:41 AM, Igor Grinberg <grinberg@compulab.co.il> wrote:
> Daniel,
>
> have you tested the suspend/resume with this patch?
> (and keep_config on?)

We did now test this with keep_config=1 and it seems that we indeed
have a problem with suspend/resume after this patch. Affected machines
show the following after wake up:


[ 1295.458958] PM: resume of devices complete after 296.228 msecs
[ 1295.717773] Restarting tasks ...
[ 1295.729639] UBI error: ubi_io_write: error -5 while writing 2048
bytes to PEB 19:59392, written 0 bytes
[ 1295.761672] done.
[ 1295.772706] UBI warning: ubi_eba_write_leb: failed to write data to PEB 19
[ 1295.782492] UBI: recover PEB 19, move data to PEB 405
[ 1295.849084] mmc0: new SDIO card at address 0001
[ 1295.978045] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1295.983153] UBI warning: ubi_io_read_vid_hdr: no VID header found
at PEB 19, only 0xFF bytes
[ 1296.015382] UBI: run torture test for PEB 405
[ 1296.024375] UBI warning: ubi_ro_mode: switch to read-only mode
[ 1296.032321] UBI error: do_sync_erase: read-only mode
[ 1296.037296] UBI error: erase_worker: failed to erase PEB 405, error -30
[ 1296.058363] UBIFS error (pid 1303): ubifs_wbuf_write_nolock: cannot
write 424 bytes to LEB 317:55296, error -5
[ 1296.080917] UBIFS error (pid 1092): ubifs_wbuf_write_nolock: cannot
write 416 bytes to LEB 317:55296, error -30
[ 1296.094470] UBIFS warning (pid 1303): ubifs_ro_mode: switched to
read-only mode, error -5
[ 1296.106183] UBIFS error (pid 1092): ubifs_create: cannot create
regular file, error -30
[ 1296.118007] UBI error: do_work: work failed with error code -30
[ 1296.123915] UBI error: ubi_thread: ubi_bgt0d: work failed with error code -30
[ 1296.742934] libertas_sdio mmc0:0001:1: (unregistered net_device):
00:19:88:11:db:67, fw 9.70.7p0, cap 0x00000303
[ 1296.779845] libertas_sdio mmc0:0001:1: wlan0: Marvell WLAN 802.11 adapter
[ 1296.847141] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 1298.041988] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 1300.957951] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1301.157810] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1301.357874] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1301.557924] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1301.563064] UBIFS error (pid 1260): ubifs_read_node: bad node type
(255 but expected 1)
[ 1301.573526] UBIFS error (pid 1260): ubifs_read_node: bad node at
LEB 112:18808, LEB mapping status 1
[ 1301.583586] UBIFS error (pid 1260): do_readpage: cannot read page 3
of inode 10295, error -22
[ 1301.787877] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1301.987891] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1302.187878] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1302.387878] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1302.393866] UBIFS error (pid 1260): ubifs_read_node: bad node type
(255 but expected 1)
[ 1302.403452] UBIFS error (pid 1260): ubifs_read_node: bad node at
LEB 112:18808, LEB mapping status 1
[ 1302.413507] UBIFS error (pid 1260): do_readpage: cannot read page 3
of inode 10295, error -22


The UBIFS is very likely toast after this.

Lei, any ideas, can you reproduce this? To avoid further regressions
of this driver, you really should have a system on which you can test
suspend and resume after each commit.


Thanks,
Daniel
Igor Grinberg July 11, 2011, 6:19 p.m. UTC | #10
On 07/11/11 17:49, Daniel Mack wrote:
> On Wed, Jul 6, 2011 at 9:41 AM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>> Daniel,
>>
>> have you tested the suspend/resume with this patch?
>> (and keep_config on?)
> We did now test this with keep_config=1 and it seems that we indeed
> have a problem with suspend/resume after this patch. Affected machines
> show the following after wake up:
>
>
> [ 1295.458958] PM: resume of devices complete after 296.228 msecs
> [ 1295.717773] Restarting tasks ...
> [ 1295.729639] UBI error: ubi_io_write: error -5 while writing 2048
> bytes to PEB 19:59392, written 0 bytes
> [ 1295.761672] done.
> [ 1295.772706] UBI warning: ubi_eba_write_leb: failed to write data to PEB 19
> [ 1295.782492] UBI: recover PEB 19, move data to PEB 405
> [ 1295.849084] mmc0: new SDIO card at address 0001
> [ 1295.978045] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1295.983153] UBI warning: ubi_io_read_vid_hdr: no VID header found
> at PEB 19, only 0xFF bytes
> [ 1296.015382] UBI: run torture test for PEB 405
> [ 1296.024375] UBI warning: ubi_ro_mode: switch to read-only mode
> [ 1296.032321] UBI error: do_sync_erase: read-only mode
> [ 1296.037296] UBI error: erase_worker: failed to erase PEB 405, error -30
> [ 1296.058363] UBIFS error (pid 1303): ubifs_wbuf_write_nolock: cannot
> write 424 bytes to LEB 317:55296, error -5
> [ 1296.080917] UBIFS error (pid 1092): ubifs_wbuf_write_nolock: cannot
> write 416 bytes to LEB 317:55296, error -30
> [ 1296.094470] UBIFS warning (pid 1303): ubifs_ro_mode: switched to
> read-only mode, error -5
> [ 1296.106183] UBIFS error (pid 1092): ubifs_create: cannot create
> regular file, error -30
> [ 1296.118007] UBI error: do_work: work failed with error code -30
> [ 1296.123915] UBI error: ubi_thread: ubi_bgt0d: work failed with error code -30
> [ 1296.742934] libertas_sdio mmc0:0001:1: (unregistered net_device):
> 00:19:88:11:db:67, fw 9.70.7p0, cap 0x00000303
> [ 1296.779845] libertas_sdio mmc0:0001:1: wlan0: Marvell WLAN 802.11 adapter
> [ 1296.847141] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [ 1298.041988] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> [ 1300.957951] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1301.157810] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1301.357874] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1301.557924] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1301.563064] UBIFS error (pid 1260): ubifs_read_node: bad node type
> (255 but expected 1)
> [ 1301.573526] UBIFS error (pid 1260): ubifs_read_node: bad node at
> LEB 112:18808, LEB mapping status 1
> [ 1301.583586] UBIFS error (pid 1260): do_readpage: cannot read page 3
> of inode 10295, error -22
> [ 1301.787877] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1301.987891] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1302.187878] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1302.387878] pxa3xx-nand pxa3xx-nand: Wait time out!!!
> [ 1302.393866] UBIFS error (pid 1260): ubifs_read_node: bad node type
> (255 but expected 1)
> [ 1302.403452] UBIFS error (pid 1260): ubifs_read_node: bad node at
> LEB 112:18808, LEB mapping status 1
> [ 1302.413507] UBIFS error (pid 1260): do_readpage: cannot read page 3
> of inode 10295, error -22
>
>
> The UBIFS is very likely toast after this.

Heh, so I'm still sane... ;)

> Lei, any ideas, can you reproduce this? To avoid further regressions
> of this driver, you really should have a system on which you can test
> suspend and resume after each commit.

I think the problem is that on PXA3xx resume from S2/S3 involves bootloader.
If I remember correctly, the BootROM reads the bootloader from the boot device
regardless of the reset cause (S2/S3 resume is kind a reset for PXA3xx).
Then the bootloader must decide what should be done according to the reset cause.
This means, that the BootROM already configures the NAND flash
(if it is the boot device) and pxa3xx_nand driver should just get on with it
and don't try to reconfigure?

Am I still sane? ;)
Daniel Mack July 11, 2011, 6:53 p.m. UTC | #11
On Mon, Jul 11, 2011 at 8:19 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
> On 07/11/11 17:49, Daniel Mack wrote:
>> Lei, any ideas, can you reproduce this? To avoid further regressions
>> of this driver, you really should have a system on which you can test
>> suspend and resume after each commit.
>
> I think the problem is that on PXA3xx resume from S2/S3 involves bootloader.
> If I remember correctly, the BootROM reads the bootloader from the boot device
> regardless of the reset cause (S2/S3 resume is kind a reset for PXA3xx).

At least not in our case. The first level bootloader is entered on
resume just as it is on a POR event, with the exception that the D3S
bit in the ASCR register is set in this case. We unconditionally
initialize the static and dynamic memory controllers and then either
jump to the routine that initializes the NAND controller, read the 2nd
level loader and pass control to it. Or (in the resume case) we just
jump to the address stored at the RAM address 0x80000000 (which has
been set to cpu_resume previously) and thus enter the kernel again. No
NAND operations in the game in this case, and this has always worked
fine.

> Then the bootloader must decide what should be done according to the reset cause.
> This means, that the BootROM already configures the NAND flash
> (if it is the boot device) and pxa3xx_nand driver should just get on with it
> and don't try to reconfigure?

I think the problem is just the opposite. The driver expects the setup
registers to retain their state over suspend, and doesn't write them
again, and this causes operations to fail after resume.


Daniel
Igor Grinberg July 11, 2011, 7:25 p.m. UTC | #12
On 07/11/11 21:53, Daniel Mack wrote:

> On Mon, Jul 11, 2011 at 8:19 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>> On 07/11/11 17:49, Daniel Mack wrote:
>>> Lei, any ideas, can you reproduce this? To avoid further regressions
>>> of this driver, you really should have a system on which you can test
>>> suspend and resume after each commit.
>> I think the problem is that on PXA3xx resume from S2/S3 involves bootloader.
>> If I remember correctly, the BootROM reads the bootloader from the boot device
>> regardless of the reset cause (S2/S3 resume is kind a reset for PXA3xx).
> At least not in our case. The first level bootloader is entered on
> resume just as it is on a POR event, with the exception that the D3S
> bit in the ASCR register is set in this case.

You say "first level boot loader is entered"... who is loading it and where from?
Indeed, it should happen as in POR case, but again, it resides on some non-volatile
storage, isn't it? What storage does it reside on?

> We unconditionally
> initialize the static and dynamic memory controllers and then either
> jump to the routine that initializes the NAND controller, read the 2nd
> level loader and pass control to it. Or (in the resume case) we just
> jump to the address stored at the RAM address 0x80000000 (which has
> been set to cpu_resume previously) and thus enter the kernel again. No
> NAND operations in the game in this case, and this has always worked
> fine.

Except those to read the first stage boot loader... (if it resides on NAND).
Well, our case is working similarly and both the first stage bootloader (OBM)
and the second stage bootloader (U-Boot) reside on the NAND flash,
therefore BootROM has to reconfigure the NAND flash, so it can boot (resume).


>> Then the bootloader must decide what should be done according to the reset cause.
>> This means, that the BootROM already configures the NAND flash
>> (if it is the boot device) and pxa3xx_nand driver should just get on with it
>> and don't try to reconfigure?
> I think the problem is just the opposite. The driver expects the setup
> registers to retain their state over suspend, and doesn't write them
> again, and this causes operations to fail after resume.

Well, according to Lei, the driver writes back the configuration that is stored prior
to suspend and that's why I agreed that it should work...

http://www.spinics.net/lists/arm-kernel/msg131942.html
Lei Wen July 12, 2011, 9:40 a.m. UTC | #13
On Tue, Jul 12, 2011 at 3:25 AM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>
>
> On 07/11/11 21:53, Daniel Mack wrote:
>
>> On Mon, Jul 11, 2011 at 8:19 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>>> On 07/11/11 17:49, Daniel Mack wrote:
>>>> Lei, any ideas, can you reproduce this? To avoid further regressions
>>>> of this driver, you really should have a system on which you can test
>>>> suspend and resume after each commit.
>>> I think the problem is that on PXA3xx resume from S2/S3 involves bootloader.
>>> If I remember correctly, the BootROM reads the bootloader from the boot device
>>> regardless of the reset cause (S2/S3 resume is kind a reset for PXA3xx).
>> At least not in our case. The first level bootloader is entered on
>> resume just as it is on a POR event, with the exception that the D3S
>> bit in the ASCR register is set in this case.
>
> You say "first level boot loader is entered"... who is loading it and where from?
> Indeed, it should happen as in POR case, but again, it resides on some non-volatile
> storage, isn't it? What storage does it reside on?
>
>> We unconditionally
>> initialize the static and dynamic memory controllers and then either
>> jump to the routine that initializes the NAND controller, read the 2nd
>> level loader and pass control to it. Or (in the resume case) we just
>> jump to the address stored at the RAM address 0x80000000 (which has
>> been set to cpu_resume previously) and thus enter the kernel again. No
>> NAND operations in the game in this case, and this has always worked
>> fine.
>
> Except those to read the first stage boot loader... (if it resides on NAND).
> Well, our case is working similarly and both the first stage bootloader (OBM)
> and the second stage bootloader (U-Boot) reside on the NAND flash,
> therefore BootROM has to reconfigure the NAND flash, so it can boot (resume).
>
>
>>> Then the bootloader must decide what should be done according to the reset cause.
>>> This means, that the BootROM already configures the NAND flash
>>> (if it is the boot device) and pxa3xx_nand driver should just get on with it
>>> and don't try to reconfigure?
>> I think the problem is just the opposite. The driver expects the setup
>> registers to retain their state over suspend, and doesn't write them
>> again, and this causes operations to fail after resume.
>
> Well, according to Lei, the driver writes back the configuration that is stored prior
> to suspend and that's why I agreed that it should work...
>
> http://www.spinics.net/lists/arm-kernel/msg131942.html
>
>
Daniel,

I retest with suspend&resume on my board, each time that board would
reset the timing in the
begining of pxa3xx_cmdfunc as expected. So this should be ok as it is
the same behavior
with original code.

However, it certainly has some potential risk in the suspend&resume process.
That is in pxa3xx_nand_suspend and pxa3xx_nand_resume, it should call
mtd->suspend and mtd->resume
respectively.

This implement has already existed in our code release for years, but
have get into the mainline yet.
So I tend to merged into the third patch, and you could have test
again whether could fix your issue.

Thanks,
Lei
Daniel Mack July 12, 2011, 9:57 a.m. UTC | #14
On Tue, Jul 12, 2011 at 11:40 AM, Lei Wen <adrian.wenl@gmail.com> wrote:
> I retest with suspend&resume on my board, each time that board would
> reset the timing in the
> begining of pxa3xx_cmdfunc as expected. So this should be ok as it is
> the same behavior
> with original code.

Ok, just to be sure - as there have been quite some iterations fot
this now - can you send me a link to your most current version of this
patch again, please? Just to avoid confusion which version we're
talking about.


Thanks,
Daniel
Lei Wen July 12, 2011, 10:29 a.m. UTC | #15
Hi Daniel,

On Tue, Jul 12, 2011 at 5:57 PM, Daniel Mack <zonque@gmail.com> wrote:
> On Tue, Jul 12, 2011 at 11:40 AM, Lei Wen <adrian.wenl@gmail.com> wrote:
>> I retest with suspend&resume on my board, each time that board would
>> reset the timing in the
>> begining of pxa3xx_cmdfunc as expected. So this should be ok as it is
>> the same behavior
>> with original code.
>
> Ok, just to be sure - as there have been quite some iterations fot
> this now - can you send me a link to your most current version of this
> patch again, please? Just to avoid confusion which version we're
> talking about.
>
>

Please refer to V6 patch set I just send out.

Thanks,
Lei
Daniel Mack July 12, 2011, 12:05 p.m. UTC | #16
On Tue, Jul 12, 2011 at 12:29 PM, Lei Wen <adrian.wenl@gmail.com> wrote:
> Hi Daniel,
>
> On Tue, Jul 12, 2011 at 5:57 PM, Daniel Mack <zonque@gmail.com> wrote:
>> On Tue, Jul 12, 2011 at 11:40 AM, Lei Wen <adrian.wenl@gmail.com> wrote:
>>> I retest with suspend&resume on my board, each time that board would
>>> reset the timing in the
>>> begining of pxa3xx_cmdfunc as expected. So this should be ok as it is
>>> the same behavior
>>> with original code.
>>
>> Ok, just to be sure - as there have been quite some iterations fot
>> this now - can you send me a link to your most current version of this
>> patch again, please? Just to avoid confusion which version we're
>> talking about.
>>
>>
>
> Please refer to V6 patch set I just send out.

I did. With all 4 patches applied, the system doesn't wake up again,
as I posted as reply to this particular patch. With only the first 3
applied, we see the following messages after resume:

[ 1297.173920] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1297.373887] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1297.379014] UBIFS error (pid 1245): ubifs_read_node: bad node type
(255 but expected 2)
[ 1297.388521] UBIFS error (pid 1245): ubifs_read_node: bad node at
LEB 435:102784, LEB mapping status 1
[ 1297.593895] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1297.793884] pxa3xx-nand pxa3xx-nand: Wait time out!!!
[ 1297.994294] pxa3xx-nand pxa3xx-nand: Wait time out!!!



Daniel
Daniel Mack July 12, 2011, 12:48 p.m. UTC | #17
On Mon, Jul 11, 2011 at 9:25 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
> On 07/11/11 21:53, Daniel Mack wrote:
>> At least not in our case. The first level bootloader is entered on
>> resume just as it is on a POR event, with the exception that the D3S
>> bit in the ASCR register is set in this case.
>
> You say "first level boot loader is entered"... who is loading it and where from?
> Indeed, it should happen as in POR case, but again, it resides on some non-volatile
> storage, isn't it? What storage does it reside on?

That is true. I was assuming that the first-stage loader remains in
SRAM and is re-used later, but I think you're right, and it is loaded
from NAND again after resume.

>> We unconditionally
>> initialize the static and dynamic memory controllers and then either
>> jump to the routine that initializes the NAND controller, read the 2nd
>> level loader and pass control to it. Or (in the resume case) we just
>> jump to the address stored at the RAM address 0x80000000 (which has
>> been set to cpu_resume previously) and thus enter the kernel again. No
>> NAND operations in the game in this case, and this has always worked
>> fine.
>
> Except those to read the first stage boot loader... (if it resides on NAND).
> Well, our case is working similarly and both the first stage bootloader (OBM)
> and the second stage bootloader (U-Boot) reside on the NAND flash,
> therefore BootROM has to reconfigure the NAND flash, so it can boot (resume).

Well, the load of the first NAND page (where the OBM resides) is out
of control for our software, as the code in PXA's internal ROM mask
takes care for this. We do not, however, re-initialize the NAND
controller in the resume case, just as we don't restore any other
pheripheral's registers. And this has worked fine all the time, and it
still does if Lei's latest patch set is not applied.


Daniel
Igor Grinberg July 12, 2011, 3:49 p.m. UTC | #18
On 07/12/11 15:48, Daniel Mack wrote:

> On Mon, Jul 11, 2011 at 9:25 PM, Igor Grinberg <grinberg@compulab.co.il> wrote:
>> On 07/11/11 21:53, Daniel Mack wrote:
>>> At least not in our case. The first level bootloader is entered on
>>> resume just as it is on a POR event, with the exception that the D3S
>>> bit in the ASCR register is set in this case.
>> You say "first level boot loader is entered"... who is loading it and where from?
>> Indeed, it should happen as in POR case, but again, it resides on some non-volatile
>> storage, isn't it? What storage does it reside on?
> That is true. I was assuming that the first-stage loader remains in
> SRAM and is re-used later, but I think you're right, and it is loaded
> from NAND again after resume.
>
>>> We unconditionally
>>> initialize the static and dynamic memory controllers and then either
>>> jump to the routine that initializes the NAND controller, read the 2nd
>>> level loader and pass control to it. Or (in the resume case) we just
>>> jump to the address stored at the RAM address 0x80000000 (which has
>>> been set to cpu_resume previously) and thus enter the kernel again. No
>>> NAND operations in the game in this case, and this has always worked
>>> fine.
>> Except those to read the first stage boot loader... (if it resides on NAND).
>> Well, our case is working similarly and both the first stage bootloader (OBM)
>> and the second stage bootloader (U-Boot) reside on the NAND flash,
>> therefore BootROM has to reconfigure the NAND flash, so it can boot (resume).
> Well, the load of the first NAND page (where the OBM resides)

You mean block, page is too small for the OBM to fit...

>  is out
> of control for our software, as the code in PXA's internal ROM mask
> takes care for this.

Exactly... the BootROM is configuring the NAND controller
and (I think) that is one of the reasons we have the keep_config in first place...

> We do not, however, re-initialize the NAND
> controller in the resume case, just as we don't restore any other
> pheripheral's registers.

Yes, the NAND initialization is done by the BootROM, so you don't need to reinitialize it.
Just the DRAM controller (may be SRAM also), so you will be able to jump back to Linux.

> And this has worked fine all the time, and it
> still does if Lei's latest patch set is not applied.
diff mbox

Patch

diff --git a/arch/arm/plat-pxa/include/plat/pxa3xx_nand.h b/arch/arm/plat-pxa/include/plat/pxa3xx_nand.h
index 442301f..c42f39f 100644
--- a/arch/arm/plat-pxa/include/plat/pxa3xx_nand.h
+++ b/arch/arm/plat-pxa/include/plat/pxa3xx_nand.h
@@ -41,6 +41,19 @@  struct pxa3xx_nand_flash {
 	struct pxa3xx_nand_timing *timing;	/* NAND Flash timing */
 };
 
+/*
+ * Current pxa3xx_nand controller has two chip select which
+ * both be workable.
+ *
+ * Notice should be taken that:
+ * When you want to use this feature, you should not enable the
+ * keep configuration feature, for two chip select could be
+ * attached with different nand chip. The different page size
+ * and timing requirement make the keep configuration impossible.
+ */
+
+/* The max num of chip select current support */
+#define NUM_CHIP_SELECT		(2)
 struct pxa3xx_nand_platform_data {
 
 	/* the data flash bus is shared between the Static Memory
@@ -52,8 +65,11 @@  struct pxa3xx_nand_platform_data {
 	/* allow platform code to keep OBM/bootloader defined NFC config */
 	int	keep_config;
 
-	const struct mtd_partition		*parts;
-	unsigned int				nr_parts;
+	/* indicate how many chip selects will be used */
+	int	num_cs;
+
+	const struct mtd_partition		*parts[NUM_CHIP_SELECT];
+	unsigned int				nr_parts[NUM_CHIP_SELECT];
 
 	const struct pxa3xx_nand_flash * 	flash;
 	size_t					num_flash;
diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
index a16328a..66f1709 100644
--- a/drivers/mtd/nand/pxa3xx_nand.c
+++ b/drivers/mtd/nand/pxa3xx_nand.c
@@ -130,6 +130,7 @@  struct pxa3xx_nand_host {
 	/* page size of attached chip */
 	unsigned int		page_size;
 	int			use_ecc;
+	int			cs;
 
 	/* calculated from pxa3xx_nand_flash data */
 	unsigned int		col_addr_cycles;
@@ -165,9 +166,10 @@  struct pxa3xx_nand_info {
 	struct pxa_dma_desc	*data_desc;
 	dma_addr_t 		data_desc_addr;
 
-	struct pxa3xx_nand_host *host;
+	struct pxa3xx_nand_host *host[NUM_CHIP_SELECT];
 	unsigned int		state;
 
+	int			cs;
 	int			use_ecc;	/* use HW ECC ? */
 	int			use_dma;	/* use DMA ? */
 	int			is_ready;
@@ -226,7 +228,7 @@  static struct pxa3xx_nand_flash builtin_flash_types[] = {
 /* Define a default flash type setting serve as flash detecting only */
 #define DEFAULT_FLASH_TYPE (&builtin_flash_types[0])
 
-const char *mtd_names[] = {"pxa3xx_nand-0", NULL};
+const char *mtd_names[] = {"pxa3xx_nand-0", "pxa3xx_nand-1", NULL};
 
 #define NDTR0_tCH(c)	(min((c), 7) << 19)
 #define NDTR0_tCS(c)	(min((c), 7) << 16)
@@ -268,7 +270,7 @@  static void pxa3xx_nand_set_timing(struct pxa3xx_nand_host *host,
 
 static void pxa3xx_set_datasize(struct pxa3xx_nand_info *info)
 {
-	struct pxa3xx_nand_host *host = info->host;
+	struct pxa3xx_nand_host *host = info->host[info->cs];
 	int oob_enable = host->reg_ndcr & NDCR_SPARE_EN;
 
 	info->data_size = host->page_size;
@@ -295,7 +297,7 @@  static void pxa3xx_set_datasize(struct pxa3xx_nand_info *info)
  */
 static void pxa3xx_nand_start(struct pxa3xx_nand_info *info)
 {
-	struct pxa3xx_nand_host *host = info->host;
+	struct pxa3xx_nand_host *host = info->host[info->cs];
 	uint32_t ndcr;
 
 	ndcr = host->reg_ndcr;
@@ -420,6 +422,15 @@  static irqreturn_t pxa3xx_nand_irq(int irq, void *devid)
 {
 	struct pxa3xx_nand_info *info = devid;
 	unsigned int status, is_completed = 0;
+	unsigned int ready, cmd_done;
+
+	if (info->cs == 0) {
+		ready           = NDSR_FLASH_RDY;
+		cmd_done        = NDSR_CS0_CMDD;
+	} else {
+		ready           = NDSR_RDY;
+		cmd_done        = NDSR_CS1_CMDD;
+	}
 
 	status = nand_readl(info, NDSR);
 
@@ -441,11 +452,11 @@  static irqreturn_t pxa3xx_nand_irq(int irq, void *devid)
 			handle_data_pio(info);
 		}
 	}
-	if (status & NDSR_CS0_CMDD) {
+	if (status & cmd_done) {
 		info->state = STATE_CMD_DONE;
 		is_completed = 1;
 	}
-	if (status & NDSR_FLASH_RDY) {
+	if (status & ready) {
 		info->is_ready = 1;
 		info->state = STATE_READY;
 	}
@@ -480,9 +491,11 @@  static int prepare_command_pool(struct pxa3xx_nand_info *info, int command,
 {
 	uint16_t cmd;
 	int addr_cycle, exec_cmd;
-	struct pxa3xx_nand_host *host = info->host;
-	struct mtd_info *mtd = host->mtd;
+	struct pxa3xx_nand_host *host;
+	struct mtd_info *mtd;
 
+	host = info->host[info->cs];
+	mtd = host->mtd;
 	addr_cycle = 0;
 	exec_cmd = 1;
 
@@ -492,8 +505,11 @@  static int prepare_command_pool(struct pxa3xx_nand_info *info, int command,
 	info->oob_size		= 0;
 	info->use_ecc		= 0;
 	info->is_ready		= 0;
-	info->ndcb0		= 0;
 	info->retcode		= ERR_NONE;
+	if (info->cs != 0)
+		info->ndcb0 = NDCB0_CSEL;
+	else
+		info->ndcb0 = 0;
 
 	switch (command) {
 	case NAND_CMD_READ0:
@@ -637,6 +653,17 @@  static void pxa3xx_nand_cmdfunc(struct mtd_info *mtd, unsigned command,
 	if (host->reg_ndcr & NDCR_DWIDTH_M)
 		column /= 2;
 
+	/*
+	 * There may be different NAND chip hooked to
+	 * different chip select, so check whether
+	 * chip select has been changed, if yes, reset the timing
+	 */
+	if (info->cs != host->cs) {
+		info->cs = host->cs;
+		nand_writel(info, NDTR0CS0, host->ndtr0cs0);
+		nand_writel(info, NDTR1CS0, host->ndtr1cs0);
+	}
+
 	info->state = STATE_PREPARED;
 	exec_cmd = prepare_command_pool(info, command, column, page_addr);
 	if (exec_cmd) {
@@ -778,7 +805,7 @@  static int pxa3xx_nand_config_flash(struct pxa3xx_nand_info *info,
 {
 	struct platform_device *pdev = info->pdev;
 	struct pxa3xx_nand_platform_data *pdata = pdev->dev.platform_data;
-	struct pxa3xx_nand_host *host = info->host;
+	struct pxa3xx_nand_host *host = info->host[info->cs];
 	uint32_t ndcr = 0x0; /* enable all interrupts */
 
 	if (f->page_size != 2048 && f->page_size != 512) {
@@ -822,7 +849,11 @@  static int pxa3xx_nand_config_flash(struct pxa3xx_nand_info *info,
 
 static int pxa3xx_nand_detect_config(struct pxa3xx_nand_info *info)
 {
-	struct pxa3xx_nand_host *host = info->host;
+	/*
+	 * We set 0 by hard coding here, for we don't support keep_config
+	 * when there is more than one chip attached to the controller
+	 */
+	struct pxa3xx_nand_host *host = info->host[0];
 	uint32_t ndcr = nand_readl(info, NDCR);
 
 	if (ndcr & NDCR_PAGE_SZ) {
@@ -883,9 +914,9 @@  static int pxa3xx_nand_init_buff(struct pxa3xx_nand_info *info)
 
 static int pxa3xx_nand_sensing(struct pxa3xx_nand_info *info)
 {
-	struct mtd_info *mtd = info->host->mtd;
+	struct mtd_info *mtd;
 	int ret;
-
+	mtd = info->host[info->cs]->mtd;
 	/* use the common timing to make a try */
 	ret = pxa3xx_nand_config_flash(info, &builtin_flash_types[0]);
 	if (ret)
@@ -917,7 +948,8 @@  static int pxa3xx_nand_scan(struct mtd_info *mtd)
 	ret = pxa3xx_nand_sensing(info);
 	if (ret) {
 		host->mtd = NULL;
-		dev_info(&info->pdev->dev, "There is no nand chip on cs 0!\n");
+		dev_info(&info->pdev->dev, "There is no nand chip on cs %d!\n"
+				, info->cs);
 
 		return ret;
 	}
@@ -996,41 +1028,47 @@  KEEP_CONFIG:
 
 static int alloc_nand_resource(struct platform_device *pdev)
 {
+	struct pxa3xx_nand_platform_data *pdata;
 	struct pxa3xx_nand_info *info;
 	struct pxa3xx_nand_host *host;
 	struct nand_chip *chip;
 	struct mtd_info *mtd;
 	struct resource *r;
-	int ret, irq;
+	int ret, irq, cs;
 
-	info = kzalloc(sizeof(*info) + sizeof(*mtd) + sizeof(*host),
-			GFP_KERNEL);
+	pdata = pdev->dev.platform_data;
+	info = kzalloc(sizeof(*info) + (sizeof(*mtd) +
+		       sizeof(*host)) * pdata->num_cs, GFP_KERNEL);
 	if (!info) {
 		dev_err(&pdev->dev, "failed to allocate memory\n");
 		return -ENOMEM;
 	}
 
-	mtd = (struct mtd_info *)(&info[1]);
-	chip = (struct nand_chip *)(&mtd[1]);
-	host = (struct pxa3xx_nand_host *)chip;
 	info->pdev = pdev;
-	info->host = host;
-	host->mtd = mtd;
-	host->info_data = info;
-	mtd->priv = host;
-	mtd->owner = THIS_MODULE;
-
-	chip->ecc.read_page	= pxa3xx_nand_read_page_hwecc;
-	chip->ecc.write_page	= pxa3xx_nand_write_page_hwecc;
-	chip->controller        = &info->controller;
-	chip->waitfunc		= pxa3xx_nand_waitfunc;
-	chip->select_chip	= pxa3xx_nand_select_chip;
-	chip->cmdfunc		= pxa3xx_nand_cmdfunc;
-	chip->read_word		= pxa3xx_nand_read_word;
-	chip->read_byte		= pxa3xx_nand_read_byte;
-	chip->read_buf		= pxa3xx_nand_read_buf;
-	chip->write_buf		= pxa3xx_nand_write_buf;
-	chip->verify_buf	= pxa3xx_nand_verify_buf;
+	for (cs = 0; cs < pdata->num_cs; cs++) {
+		mtd = (struct mtd_info *)((unsigned int)&info[1] +
+		      (sizeof(*mtd) + sizeof(*host)) * cs);
+		chip = (struct nand_chip *)(&mtd[1]);
+		host = (struct pxa3xx_nand_host *)chip;
+		info->host[cs] = host;
+		host->mtd = mtd;
+		host->cs = cs;
+		host->info_data = info;
+		mtd->priv = host;
+		mtd->owner = THIS_MODULE;
+
+		chip->ecc.read_page	= pxa3xx_nand_read_page_hwecc;
+		chip->ecc.write_page	= pxa3xx_nand_write_page_hwecc;
+		chip->controller        = &info->controller;
+		chip->waitfunc		= pxa3xx_nand_waitfunc;
+		chip->select_chip	= pxa3xx_nand_select_chip;
+		chip->cmdfunc		= pxa3xx_nand_cmdfunc;
+		chip->read_word		= pxa3xx_nand_read_word;
+		chip->read_byte		= pxa3xx_nand_read_byte;
+		chip->read_buf		= pxa3xx_nand_read_buf;
+		chip->write_buf		= pxa3xx_nand_write_buf;
+		chip->verify_buf	= pxa3xx_nand_verify_buf;
+	}
 
 	spin_lock_init(&chip->controller->lock);
 	init_waitqueue_head(&chip->controller->wq);
@@ -1128,11 +1166,14 @@  fail_free_mtd:
 static int pxa3xx_nand_remove(struct platform_device *pdev)
 {
 	struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
+	struct pxa3xx_nand_platform_data *pdata;
 	struct resource *r;
-	int irq;
+	int irq, cs;
 
 	if (!info)
 		return 0;
+
+	pdata = pdev->dev.platform_data;
 	platform_set_drvdata(pdev, NULL);
 
 	irq = platform_get_irq(pdev, 0);
@@ -1152,7 +1193,8 @@  static int pxa3xx_nand_remove(struct platform_device *pdev)
 	clk_disable(info->clk);
 	clk_put(info->clk);
 
-	nand_release(info->host->mtd);
+	for (cs = 0; cs < pdata->num_cs; cs++)
+		nand_release(info->host[cs]->mtd);
 	kfree(info);
 	return 0;
 }
@@ -1161,7 +1203,7 @@  static int pxa3xx_nand_probe(struct platform_device *pdev)
 {
 	struct pxa3xx_nand_platform_data *pdata;
 	struct pxa3xx_nand_info *info;
-	int ret;
+	int ret, cs, probe_success;
 
 	pdata = pdev->dev.platform_data;
 	if (!pdata) {
@@ -1176,14 +1218,28 @@  static int pxa3xx_nand_probe(struct platform_device *pdev)
 	}
 
 	info = platform_get_drvdata(pdev);
-	if (pxa3xx_nand_scan(info->host->mtd)) {
-		dev_err(&pdev->dev, "failed to scan nand\n");
+	probe_success = 0;
+	for (cs = 0; cs < pdata->num_cs; cs++) {
+		info->cs = cs;
+		ret = pxa3xx_nand_scan(info->host[cs]->mtd);
+		if (ret) {
+			dev_warn(&pdev->dev, "failed to scan nand at cs %d\n",
+				cs);
+			continue;
+		}
+
+		ret = mtd_device_parse_register(info->host[cs]->mtd, NULL, 0,
+				pdata->parts[cs], pdata->nr_parts[cs]);
+		if (!ret)
+			probe_success = 1;
+	}
+
+	if (!probe_success) {
 		pxa3xx_nand_remove(pdev);
 		return -ENODEV;
 	}
 
-	return mtd_device_parse_register(info->host->mtd, NULL, 0,
-			pdata->parts, pdata->nr_parts);
+	return 0;
 }
 
 #ifdef CONFIG_PM
@@ -1203,8 +1259,12 @@  static int pxa3xx_nand_resume(struct platform_device *pdev)
 {
 	struct pxa3xx_nand_info *info = platform_get_drvdata(pdev);
 
-	nand_writel(info, NDTR0CS0, info->host->ndtr0cs0);
-	nand_writel(info, NDTR1CS0, info->host->ndtr1cs0);
+	/*
+	 * Directly set the chip select to a invalid value,
+	 * then the driver would reset the timing according
+	 * to current chip select at the beginning of cmdfunc
+	 */
+	info->cs = 0xff;
 	clk_enable(info->clk);
 
 	return 0;