
eMMC boot problem: switch to bus width 8 ddr failed

Message ID 1754f0ae-fe6f-00a2-a0d6-9fd55c2b866f@rock-chips.com (mailing list archive)
State New, archived

Commit Message

Shawn Lin Jan. 6, 2017, 2:33 a.m. UTC
On 2017/1/6 8:41, Clemens Gruber wrote:
> Hi,
>
> with the current mainline 4.10-rc2 kernel, I can no longer boot from
> the eMMC on my i.MX6Q board.
>
> Details:
> The eMMC is a Micron MTFC4GACAJCN-1M WT but as the i.MX6Q only supports
> eMMC 4.41 features and we did not implement voltage switching from 3.3V
> to 1.8V or lower, I did add no-1-8-v; (but none of the mmc-ddr or mmc-hs
> options) to the device tree. The bus-width is 8.
>
> With 4.9 the board booted fine, now with the current mainline 4.10 tree,
> I get the following (repeating) errors at boot:
>
> [    4.326834] Waiting for root device /dev/mmcblk0p2...
> [   14.563861] mmc0: Timeout waiting for hardware cmd interrupt.
> [   14.569619] sdhci: =========== REGISTER DUMP (mmc0)===========
> [   14.575461] sdhci: Sys addr: 0x4e726000 | Version:  0x00000002
> [   14.581300] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
> [   14.587140] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
> [   14.592979] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
> [   14.598816] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
> [   14.604654] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
> [   14.610493] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
> [   14.616332] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
> [   14.622168] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
> [   14.628007] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
> [   14.633845] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff

It shows that you always fail to get a response to the send-status
command (CMD13) within the expected period of time.
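
For anyone decoding the dump by hand, here is a minimal sketch of how the Cmd and Argument values lead to that reading. It assumes the standard SDHCI command register layout (command index in bits 13:8, flags in the low byte); `struct decoded_cmd`, `decode_sdhci_cmd()`, `rca_from_arg()` and `print_cmd()` are made-up names for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Flag bits in the low byte of the standard SDHCI command register. */
#define CMD_RESP_MASK   0x03u  /* 0 = none, 1 = 136-bit, 2 = 48-bit, 3 = 48-bit + busy */
#define CMD_CRC_CHECK   0x08u
#define CMD_INDEX_CHECK 0x10u
#define CMD_DATA        0x20u

/* Hypothetical helper type, not a kernel structure. */
struct decoded_cmd {
	unsigned int index;      /* MMC command number */
	unsigned int resp_type;  /* response type select */
	int crc_check;
	int index_check;
	int has_data;
};

static struct decoded_cmd decode_sdhci_cmd(uint16_t reg)
{
	struct decoded_cmd d;

	d.index = (reg >> 8) & 0x3fu;
	d.resp_type = reg & CMD_RESP_MASK;
	d.crc_check = (reg & CMD_CRC_CHECK) != 0;
	d.index_check = (reg & CMD_INDEX_CHECK) != 0;
	d.has_data = (reg & CMD_DATA) != 0;
	return d;
}

/* For CMD13 the argument carries the card's RCA in bits 31:16. */
static unsigned int rca_from_arg(uint32_t arg)
{
	return arg >> 16;
}

static void print_cmd(uint16_t reg, uint32_t arg)
{
	struct decoded_cmd d = decode_sdhci_cmd(reg);

	assert(d.index < 64);
	printf("CMD%u resp=%u crc=%d idx=%d data=%d rca=%u\n",
	       d.index, d.resp_type, d.crc_check, d.index_check,
	       d.has_data, rca_from_arg(arg));
}
```

Feeding it the values from the dump, `print_cmd(0x0d1a, 0x00010000)`, reports command 13 with a 48-bit response, no data phase, addressed to RCA 1.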


> [   14.639682] sdhci: Host ctl2: 0x00000000
> [   14.643611] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x4e6f7208
> [   14.649447] sdhci: ===========================================
>
> This repeats a few times, then more information is shown at the bottom:
>
> [   86.893859] mmc0: Timeout waiting for hardware cmd interrupt.
> [   86.899615] sdhci: =========== REGISTER DUMP (mmc0)===========
> [   86.905453] sdhci: Sys addr: 0x00000000 | Version:  0x00000002
> [   86.911291] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
> [   86.917129] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
> [   86.922967] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
> [   86.928804] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
> [   86.934642] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
> [   86.940479] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
> [   86.946316] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
> [   86.952154] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
> [   86.957992] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
> [   86.963830] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff
> [   86.969668] sdhci: Host ctl2: 0x00000000
> [   86.973596] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
> [   86.979433] sdhci: ===========================================
> [   86.986356] mmc0: switch to bus width 8 ddr failed
> [   86.991163] mmc0: error -110 whilst initialising MMC card
> [   97.773859] mmc0: Timeout waiting for hardware cmd interrupt.
>
> --
>
> After looking through the latest commits to mmc/core, I found the
> culprit:
> Commit e173f8911f091fa50ccf8cc1fa316dd5569bc470 ("mmc: core: Update
> CMD13 polling policy when switch to HS DDR mode")
>
> Reverting it fixes the problem. But I am unsure if that's the right
> course of action?
>
> Feel free to send me patches for testing!

Looking at the change itself, it should be fine from the spec's point
of view. Maybe you could try the patch below, but don't beat me if that
doesn't help at all. :)

--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1074,7 +1074,7 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
                           EXT_CSD_BUS_WIDTH,
                           ext_csd_bits,
                           card->ext_csd.generic_cmd6_time,
-                          MMC_TIMING_MMC_DDR52,
+                          0,
                           true, true, true);
        if (err) {
                pr_err("%s: switch to bus width %d ddr failed\n",
@@ -1118,6 +1118,9 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
        if (err)
                err = __mmc_set_signal_voltage(host, MMC_SIGNAL_VOLTAGE_330);

+       if (!err)
+               mmc_set_timing(host, MMC_TIMING_MMC_DDR52);
+


>
> Regards,
> Clemens
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Comments

Clemens Gruber Jan. 6, 2017, 3:56 p.m. UTC | #1
On Fri, Jan 06, 2017 at 10:33:49AM +0800, Shawn Lin wrote:
> By looking the changes itself, it should be good from the view of spec.
> Maybe you could try the patch below, but don't beat me if that doesn't
> help at all. :)
> 
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1074,7 +1074,7 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
>                            EXT_CSD_BUS_WIDTH,
>                            ext_csd_bits,
>                            card->ext_csd.generic_cmd6_time,
> -                          MMC_TIMING_MMC_DDR52,
> +                          0,
>                            true, true, true);
>         if (err) {
>                 pr_err("%s: switch to bus width %d ddr failed\n",
> @@ -1118,6 +1118,9 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
>         if (err)
>                 err = __mmc_set_signal_voltage(host, MMC_SIGNAL_VOLTAGE_330);
> 
> +       if (!err)
> +               mmc_set_timing(host, MMC_TIMING_MMC_DDR52);
> +
> 
> 

Hi,

thank you. This patch solves the problem! :)

Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>

Regards,
Clemens
Shawn Lin Jan. 9, 2017, 7:09 a.m. UTC | #2
On 2017/1/6 23:56, Clemens Gruber wrote:
>
> Hi,
>
> thank you. This patch solves the problem! :)

Again, from the spec's point of view, the code is fine.

So I *guess* your host driver doesn't work properly in DDR52 mode
when vccq is at 3.3V. But I don't have real hardware to debug it.
Could the Freescale guys help check it?
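
To make the difference between the two orderings concrete, here is a toy model in plain C. It is not kernel code: `struct toy_host`, `toy_send_status()` and, above all, the rule that CMD13 times out whenever the host is already in DDR52 timing before the signal-voltage setup has run are assumptions chosen to mimic what Clemens observed. They are not a description of how the i.MX6Q controller actually behaves.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETIMEDOUT 110  /* same number as the -110 in the log */

enum toy_timing { TOY_TIMING_HS, TOY_TIMING_DDR52 };

/* Hypothetical host state; not struct mmc_host. */
struct toy_host {
	enum toy_timing timing;
	int voltage_settled;  /* set once signal-voltage setup has run */
};

/* Assumed failure rule: status polling in DDR52 timing only works
 * after the signal voltage has been (re)configured. */
static int toy_send_status(const struct toy_host *h)
{
	if (h->timing == TOY_TIMING_DDR52 && !h->voltage_settled)
		return -TOY_ETIMEDOUT;
	return 0;
}

static int toy_set_voltage(struct toy_host *h)
{
	h->voltage_settled = 1;
	return 0;
}

/* Ordering without the patch: change timing, then poll with CMD13. */
static int toy_select_ddr_old(struct toy_host *h)
{
	int err;

	assert(h != NULL);
	h->timing = TOY_TIMING_DDR52;
	err = toy_send_status(h);
	if (err)
		return err;
	return toy_set_voltage(h);
}

/* Ordering with the patch: poll in the old timing, set up the
 * voltage, and only then move the host to DDR52. */
static int toy_select_ddr_new(struct toy_host *h)
{
	int err;

	assert(h != NULL);
	err = toy_send_status(h);
	if (err)
		return err;
	err = toy_set_voltage(h);
	if (!err)
		h->timing = TOY_TIMING_DDR52;
	return err;
}
```

Under that assumed rule, `toy_select_ddr_old()` fails with -110 on a fresh host while `toy_select_ddr_new()` succeeds and ends in DDR52 timing. That would be consistent with the patch helping even if the original ordering is correct per the spec.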



>
> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
>
> Regards,
> Clemens
Bough Chen Jan. 10, 2017, 8:32 a.m. UTC | #3
> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Shawn Lin
> Sent: Monday, January 09, 2017 3:10 PM
> To: Clemens Gruber <clemens.gruber@pqgruber.com>; Shawn Lin
> <shawn.lin@rock-chips.com>; linux-mmc@vger.kernel.org
> Cc: shawn.lin@rock-chips.com; Ulf Hansson <ulf.hansson@linaro.org>; Linus
> Walleij <linus.walleij@linaro.org>; Adrian Hunter <adrian.hunter@intel.com>;
> A.S. Dong <aisheng.dong@nxp.com>; linux-kernel@vger.kernel.org
> Subject: Re: eMMC boot problem: switch to bus width 8 ddr failed
> 
> Again, from the view of spec, the code is fine.
> 
> So I *guess* that seems your host driver doesn't work fine under
> DDR52 when it's in 3V3 vccq. But I don't have a real hardware to debug it.  Could
> freescale guys help to check it?
> 
> 

Hi All,

Sorry for missing these emails. I did some simple debugging and found that the
property "no-1-8-v" has no effect on MMC DDR52 mode. After making sure the I/O
voltage does not change (stays at 3.3V), the issue is gone. This was only a quick
debug and I will give a detailed analysis later, but I can already confirm that
commit e173f8911f ("mmc: core: Update CMD13 polling policy when switch to HS DDR
mode") is not the root cause.

Regards, 
Haibo Chen
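
A rough sketch of the kind of voltage selection this finding concerns, written as a stand-alone toy model. The 3.3V fallback is visible in the context lines of the patch; the 1.2V and 1.8V attempts before it, and every name below (`struct toy_vhost`, `toy_set_signal_voltage()`, `toy_pick_ddr_voltage()`, the `TOY_*` constants), are assumptions for illustration and not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

enum toy_vccq { TOY_VCCQ_330, TOY_VCCQ_180, TOY_VCCQ_120 };

#define TOY_CARD_DDR_1_8V 0x1u  /* card advertises DDR52 at 1.8 V or 3.3 V I/O */
#define TOY_CARD_DDR_1_2V 0x2u  /* card advertises DDR52 at 1.2 V I/O */

/* Hypothetical host model; not struct mmc_host. */
struct toy_vhost {
	int honours_no_1_8_v;  /* does the DDR52 path respect "no-1-8-v"? */
	enum toy_vccq vccq;    /* I/O voltage the host is driving */
};

/* Returns 0 on success, -1 if the host refuses the requested voltage. */
static int toy_set_signal_voltage(struct toy_vhost *h, enum toy_vccq v)
{
	assert(h != NULL);
	if (v != TOY_VCCQ_330 && h->honours_no_1_8_v)
		return -1;
	h->vccq = v;
	return 0;
}

/* Try the lowest voltage the card advertises, fall back to 3.3 V. */
static int toy_pick_ddr_voltage(struct toy_vhost *h, unsigned int card_types)
{
	int err = -1;

	if (card_types & TOY_CARD_DDR_1_2V)
		err = toy_set_signal_voltage(h, TOY_VCCQ_120);
	if (err && (card_types & TOY_CARD_DDR_1_8V))
		err = toy_set_signal_voltage(h, TOY_VCCQ_180);
	if (err)
		err = toy_set_signal_voltage(h, TOY_VCCQ_330);
	return err;
}
```

In this model a host that ignores "no-1-8-v" ends up driving 1.8V I/O on a board wired for 3.3V only, while a host that honours it stays at 3.3V.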
Ulf Hansson Jan. 12, 2017, 4:51 p.m. UTC | #4
+ Haibo, Gary, Fabio, Shawn Guo

On 6 January 2017 at 16:56, Clemens Gruber <clemens.gruber@pqgruber.com> wrote:
> Hi,
>
> thank you. This patch solves the problem! :)
>
> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
>
> Regards,
> Clemens

Everybody involved, thanks for looking into this!

I think the above approach seems like a reasonable fix for the 4.10
rcs. Shawn Lin, would you mind re-posting a proper patch with a
change-log?

In the meantime, I will follow the progress of Haibo Chen's debugging
of the voltage switch issue and look into what Dong may suggest
around this.

Just to be clear, I would definitely prefer a fix in the sdhci driver,
if that can be done. So I will give Haibo, Dong and the others a couple
more days to investigate before applying Shawn Lin's fix for the core.
I hope that approach is okay with all of you?

Kind regards
Uffe
Shawn Lin Jan. 13, 2017, 2:10 a.m. UTC | #5
On 2017/1/13 0:51, Ulf Hansson wrote:
> + Haibo, Gary, Fabio, Shawn Gou
>
> Everybody involved, thanks for looking into this!
>
> I think the above approach seems like a reasonable fix for the 4.10
> rcs. Shawn Lin, would you mind re-posting a proper patch with a
> change-log?

Sure.

>
> In the meantime, I will follow the process of Haibo Chen's debugging
> around the voltage switch issue and look into what Dong's suggesting
> around this may be.
>
> Just to be clear, I would definitely prefer a fix in the sdhci driver,

Yup, I would also prefer to fix this in sdhci*, and given that it's just -rc3
now, we should still have some days for Haibo & Dong to help debug it.
Once the fix is settled, we could drop the core fix from the -next branch.

> if that can be done. So I will give Haibo/Dong etc a couple of more
> days to investigate, before applying Shawn Lin's fix for the core.
> Hope that approach is okay with all of you?
>
> Kind regards
> Uffe
>
>
>
Bough Chen Jan. 13, 2017, 3:12 a.m. UTC | #6
> -----Original Message-----
> From: Shawn Lin [mailto:shawn.lin@rock-chips.com]
> Sent: Friday, January 13, 2017 10:11 AM
> To: Ulf Hansson <ulf.hansson@linaro.org>; Clemens Gruber
> <clemens.gruber@pqgruber.com>
> Cc: shawn.lin@rock-chips.com; linux-mmc@vger.kernel.org; Linus Walleij
> <linus.walleij@linaro.org>; Adrian Hunter <adrian.hunter@intel.com>; A.S.
> Dong <aisheng.dong@nxp.com>; linux-kernel@vger.kernel.org; Bough Chen
> <haibo.chen@nxp.com>; Gary Bisson <gary.bisson@boundarydevices.com>;
> Fabio Estevam <festevam@gmail.com>; Shawn Guo <shawnguo@kernel.org>
> Subject: Re: eMMC boot problem: switch to bus width 8 ddr failed
>
> On 2017/1/13 0:51, Ulf Hansson wrote:
> > + Haibo, Gary, Fabio, Shawn Gou
> >
> > On 6 January 2017 at 16:56, Clemens Gruber <clemens.gruber@pqgruber.com> wrote:
> >> On Fri, Jan 06, 2017 at 10:33:49AM +0800, Shawn Lin wrote:
> >>> On 2017/1/6 8:41, Clemens Gruber wrote:
> >>>> Hi,
> >>>>
> >>>> with the current mainline 4.10-rc2 kernel, I can no longer boot
> >>>> from the eMMC on my i.MX6Q board.
> >>>>
> >>>> Details:
> >>>> The eMMC is a Micron MTFC4GACAJCN-1M WT but as the i.MX6Q only
> >>>> supports eMMC 4.41 features and we did not implement voltage
> >>>> switching from 3.3V to 1.8V or lower, I did add no-1-8-v; (but none
> >>>> of the mmc-ddr or mmc-hs
> >>>> options) to the device tree. The bus-width is 8.
> >>>>
> >>>> With 4.9 the board booted fine, now with the current mainline 4.10
> >>>> tree, I get the following (repeating) errors at boot:
> >>>>
> >>>> [    4.326834] Waiting for root device /dev/mmcblk0p2...
> >>>> [   14.563861] mmc0: Timeout waiting for hardware cmd interrupt.
> >>>> [   14.569619] sdhci: =========== REGISTER DUMP (mmc0)===========
> >>>> [   14.575461] sdhci: Sys addr: 0x4e726000 | Version:  0x00000002
> >>>> [   14.581300] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
> >>>> [   14.587140] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
> >>>> [   14.592979] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
> >>>> [   14.598816] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
> >>>> [   14.604654] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
> >>>> [   14.610493] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
> >>>> [   14.616332] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
> >>>> [   14.622168] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
> >>>> [   14.628007] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
> >>>> [   14.633845] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff
> >>>
> >>> it shows you always fail to get resp of sending status within the
> >>> expected period of time.
> >>>
> >>>
> >>>> [   14.639682] sdhci: Host ctl2: 0x00000000
> >>>> [   14.643611] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x4e6f7208
> >>>> [   14.649447] sdhci: ===========================================
> >>>>
> >>>> This repeats a few times, then more information is shown at the bottom:
> >>>>
> >>>> [   86.893859] mmc0: Timeout waiting for hardware cmd interrupt.
> >>>> [   86.899615] sdhci: =========== REGISTER DUMP (mmc0)===========
> >>>> [   86.905453] sdhci: Sys addr: 0x00000000 | Version:  0x00000002
> >>>> [   86.911291] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
> >>>> [   86.917129] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
> >>>> [   86.922967] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
> >>>> [   86.928804] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
> >>>> [   86.934642] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
> >>>> [   86.940479] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
> >>>> [   86.946316] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
> >>>> [   86.952154] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
> >>>> [   86.957992] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
> >>>> [   86.963830] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff
> >>>> [   86.969668] sdhci: Host ctl2: 0x00000000
> >>>> [   86.973596] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
> >>>> [   86.979433] sdhci: ===========================================
> >>>> [   86.986356] mmc0: switch to bus width 8 ddr failed
> >>>> [   86.991163] mmc0: error -110 whilst initialising MMC card
> >>>> [   97.773859] mmc0: Timeout waiting for hardware cmd interrupt.
> >>>>
> >>>> --
> >>>>
> >>>> After looking through the latest commits to mmc/core, I found the
> >>>> culprit:
> >>>> Commit e173f8911f091fa50ccf8cc1fa316dd5569bc470 ("mmc: core: Update
> >>>> CMD13 polling policy when switch to HS DDR mode")
> >>>>
> >>>> Reverting it fixes the problem. But I am unsure if that's the right
> >>>> course of action?
> >>>>
> >>>> Feel free to send me patches for testing!
> >>>
> >>> By looking the changes itself, it should be good from the view of spec.
> >>> Maybe you could try the patch below, but don't beat me if that
> >>> doesn't help at all. :)
> >>>
> >>> --- a/drivers/mmc/core/mmc.c
> >>> +++ b/drivers/mmc/core/mmc.c
> >>> @@ -1074,7 +1074,7 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
> >>>                            EXT_CSD_BUS_WIDTH,
> >>>                            ext_csd_bits,
> >>>                            card->ext_csd.generic_cmd6_time,
> >>> -                          MMC_TIMING_MMC_DDR52,
> >>> +                          0,
> >>>                            true, true, true);
> >>>         if (err) {
> >>>                 pr_err("%s: switch to bus width %d ddr failed\n",
> >>> @@ -1118,6 +1118,9 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
> >>>         if (err)
> >>>                 err = __mmc_set_signal_voltage(host,
> >>> MMC_SIGNAL_VOLTAGE_330);
> >>>
> >>> +       if (!err)
> >>> +               mmc_set_timing(host, MMC_TIMING_MMC_DDR52);
> >>> +
> >>>
> >>>
> >>
> >> Hi,
> >>
> >> thank you. This patch solves the problem! :)
> >>
> >> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
> >>
> >> Regards,
> >> Clemens
> >
> > Everybody involved, thanks for looking into this!
> >
> > I think the above approach seems like a reasonable fix for the 4.10
> > rcs. Shawn Lin, would you mind re-posting a proper patch with a
> > change-log?
>
> Sure.
>
> >
> > In the meantime, I will follow the process of Haibo Chen's debugging
> > around the voltage switch issue and look into what Dong's suggesting
> > around this may be.
> >
> > Just to be clear, I would definitely prefer a fix in the sdhci driver,
>
> yup, I prefer to fix the sdhci* either, and given that it's juct -rc3 now, we should
> still have some days for Haibo & Dong to help debug it.
> Once the fix is settled, we could drop the core fix from -next branch.
>


Hi Ulf and Shawn,

Aisheng and I have been debugging this issue over the past few days, and we have found the
root cause. There are two things to describe.

1) Voltage switch issue. The property "no-1-8-v" does not work for MMC_TIMING_MMC_DDR52.
This is a separate bug that we need to fix, but it is unrelated to the current one.

2) Root cause. In __mmc_switch(), the sequence is: send CMD6 --> set DDR52 timing --> poll for busy.
To set the DDR52 timing we call set_ios(), which first sets DDR_EN to put the SDHC into DDR mode
and then configures the SD clock again. The problem is that after CMD6 completes, DATA0 is still low,
which means the card is busy. Setting DDR_EN at this point is risky: on the i.MX uSDHC, the DDR_EN
setting only takes effect when the DATA and CMD lines are idle. So the hardware has not activated DDR_EN,
but software believes it is already active and sets the clock to 49.5MHz, while the hardware actually
outputs a 198MHz clock. The result is a clock glitch.
This is the root cause: DDR_EN is set while the card is still busy.
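
As an illustration only, the ordering problem in 2) can be modelled with a small
toy program. This is a sketch under stated assumptions, not kernel code: the
struct and every toy_* name below are hypothetical, and the "hardware" is reduced
to the single behaviour described above, namely that a DDR_EN write only takes
effect if the bus is idle when the write happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_host {
	bool card_busy;      /* DATA0 still held low by the card after CMD6 */
	bool ddr_en_written; /* what software believes it has configured */
	bool ddr_en_active;  /* what the modelled hardware actually latched */
};

/* Assumed behaviour: DDR_EN is latched only if the bus is idle at write time. */
void toy_set_ddr_timing(struct toy_host *h)
{
	assert(h);
	h->ddr_en_written = true;
	if (!h->card_busy)
		h->ddr_en_active = true;
}

/* Stand-in for polling until the card releases DATA0. */
void toy_poll_for_busy(struct toy_host *h)
{
	assert(h);
	h->card_busy = false;
}

/*
 * Runs one switch sequence starting from a busy card.
 * Returns true if software and hardware agree on the DDR state afterwards.
 */
bool toy_switch(bool timing_before_poll)
{
	struct toy_host h = { .card_busy = true };

	if (timing_before_poll) {
		/* CMD6 --> set timing --> poll: the failing order */
		toy_set_ddr_timing(&h);
		toy_poll_for_busy(&h);
	} else {
		/* CMD6 --> poll --> set timing */
		toy_poll_for_busy(&h);
		toy_set_ddr_timing(&h);
	}
	return h.ddr_en_written == h.ddr_en_active;
}
```

In this model, toy_switch(true) leaves software and hardware disagreeing about
DDR_EN, which is the state in which the clock would then be programmed with the
wrong assumption, while toy_switch(false) keeps them consistent.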

Any of the following methods can fix this issue:
a) Change the HW behavior so that the DDR_EN setting takes effect immediately, regardless of the state
of the DATA and CMD lines. This would fix the issue, but our IC guys do not favor it, because this
method is still not safe enough.

b) Add a 1ms delay before setting DDR_EN to wait for the bus to become idle. But we still do not know
whether 1ms is an appropriate value. It would be better to poll for busy before setting DDR_EN.

c) Set the DDR52 timing after CMD6 and the poll for busy. This is what Shawn's patch does.
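
To make the difference between a fixed delay (option b) and polling concrete,
here is a minimal sketch of a bounded poll loop. It is only an illustration: the
callback type, the toy_* names, and the -1 return value (standing in for a
timeout error such as the -110 seen in the log) are assumptions and do not
correspond to any existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback: returns true while the card still holds DATA0 low. */
typedef bool (*toy_busy_fn)(void *ctx);

/*
 * Poll until the bus is idle or max_polls attempts are used up.
 * Returns 0 once idle, -1 on timeout.
 */
int toy_wait_idle(toy_busy_fn busy, void *ctx, unsigned int max_polls)
{
	assert(busy);
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!busy(ctx))
			return 0;
		/* a real driver would sleep or delay between polls here */
	}
	return -1;
}

/* Example busy source: reports busy for a fixed number of reads. */
struct toy_card {
	unsigned int busy_reads_left;
};

bool toy_card_busy(void *ctx)
{
	struct toy_card *c = ctx;

	if (c->busy_reads_left == 0)
		return false;
	c->busy_reads_left--;
	return true;
}
```

Unlike a fixed 1ms delay, the loop returns as soon as the card goes idle and
reports an error if it never does, so the caller can skip the DDR_EN write when
the bus is still busy.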

Hi Aisheng,
Correct me if anything is wrong.

My suggestion is to move the mmc_set_timing() call after mmc_poll_for_busy() in __mmc_switch().


Best Regards
Haibo Chen



> > if that can be done. So I will give Haibo/Dong etc a couple of more
> > days to investigate, before applying Shawn Lin's fix for the core.
> > Hope that approach is okay with all of you?
> >
> > Kind regards
> > Uffe
> >
> >
> >
>
>
> --
> Best Regards
> Shawn Lin
Shawn Lin Jan. 13, 2017, 4:03 a.m. UTC | #7
On 2017/1/13 11:12, Bough Chen wrote:
>> -----Original Message-----
>> From: Shawn Lin [mailto:shawn.lin@rock-chips.com]
>> Sent: Friday, January 13, 2017 10:11 AM
>> To: Ulf Hansson <ulf.hansson@linaro.org>; Clemens Gruber
>> <clemens.gruber@pqgruber.com>
>> Cc: shawn.lin@rock-chips.com; linux-mmc@vger.kernel.org; Linus Walleij
>> <linus.walleij@linaro.org>; Adrian Hunter <adrian.hunter@intel.com>; A.S.
>> Dong <aisheng.dong@nxp.com>; linux-kernel@vger.kernel.org; Bough Chen
>> <haibo.chen@nxp.com>; Gary Bisson <gary.bisson@boundarydevices.com>;
>> Fabio Estevam <festevam@gmail.com>; Shawn Guo <shawnguo@kernel.org>
>> Subject: Re: eMMC boot problem: switch to bus width 8 ddr failed
>>
>> On 2017/1/13 0:51, Ulf Hansson wrote:
>>> + Haibo, Gary, Fabio, Shawn Gou
>>>
>>> On 6 January 2017 at 16:56, Clemens Gruber
>> <clemens.gruber@pqgruber.com> wrote:
>>>> On Fri, Jan 06, 2017 at 10:33:49AM +0800, Shawn Lin wrote:
>>>>> On 2017/1/6 8:41, Clemens Gruber wrote:
>>>>>> Hi,
>>>>>>
>>>>>> with the current mainline 4.10-rc2 kernel, I can no longer boot
>>>>>> from the eMMC on my i.MX6Q board.
>>>>>>
>>>>>> Details:
>>>>>> The eMMC is a Micron MTFC4GACAJCN-1M WT but as the i.MX6Q only
>>>>>> supports eMMC 4.41 features and we did not implement voltage
>>>>>> switching from 3.3V to 1.8V or lower, I did add no-1-8-v; (but none
>>>>>> of the mmc-ddr or mmc-hs
>>>>>> options) to the device tree. The bus-width is 8.
>>>>>>
>>>>>> With 4.9 the board booted fine, now with the current mainline 4.10
>>>>>> tree, I get the following (repeating) errors at boot:
>>>>>>
>>>>>> [    4.326834] Waiting for root device /dev/mmcblk0p2...
>>>>>> [   14.563861] mmc0: Timeout waiting for hardware cmd interrupt.
>>>>>> [   14.569619] sdhci: =========== REGISTER DUMP
>> (mmc0)===========
>>>>>> [   14.575461] sdhci: Sys addr: 0x4e726000 | Version:  0x00000002
>>>>>> [   14.581300] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
>>>>>> [   14.587140] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
>>>>>> [   14.592979] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
>>>>>> [   14.598816] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
>>>>>> [   14.604654] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
>>>>>> [   14.610493] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
>>>>>> [   14.616332] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
>>>>>> [   14.622168] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
>>>>>> [   14.628007] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
>>>>>> [   14.633845] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff
>>>>>
>>>>> it shows you always fail to get resp of sending status within the
>>>>> expected period of time.
>>>>>
>>>>>
>>>>>> [   14.639682] sdhci: Host ctl2: 0x00000000
>>>>>> [   14.643611] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x4e6f7208
>>>>>> [   14.649447] sdhci:
>> ===========================================
>>>>>>
>>>>>> This repeats a few times, then more information is shown at the bottom:
>>>>>>
>>>>>> [   86.893859] mmc0: Timeout waiting for hardware cmd interrupt.
>>>>>> [   86.899615] sdhci: =========== REGISTER DUMP
>> (mmc0)===========
>>>>>> [   86.905453] sdhci: Sys addr: 0x00000000 | Version:  0x00000002
>>>>>> [   86.911291] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000001
>>>>>> [   86.917129] sdhci: Argument: 0x00010000 | Trn mode: 0x00000013
>>>>>> [   86.922967] sdhci: Present:  0x01fd8009 | Host ctl: 0x00000031
>>>>>> [   86.928804] sdhci: Power:    0x00000002 | Blk gap:  0x00000080
>>>>>> [   86.934642] sdhci: Wake-up:  0x00000008 | Clock:    0x0000001f
>>>>>> [   86.940479] sdhci: Timeout:  0x0000008f | Int stat: 0x00000000
>>>>>> [   86.946316] sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
>>>>>> [   86.952154] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000003
>>>>>> [   86.957992] sdhci: Caps:     0x07eb0000 | Caps_1:   0x0000a007
>>>>>> [   86.963830] sdhci: Cmd:      0x00000d1a | Max curr: 0x00ffffff
>>>>>> [   86.969668] sdhci: Host ctl2: 0x00000000
>>>>>> [   86.973596] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
>>>>>> [   86.979433] sdhci:
>> ===========================================
>>>>>> [   86.986356] mmc0: switch to bus width 8 ddr failed
>>>>>> [   86.991163] mmc0: error -110 whilst initialising MMC card
>>>>>> [   97.773859] mmc0: Timeout waiting for hardware cmd interrupt.
>>>>>>
>>>>>> --
>>>>>>
>>>>>> After looking through the latest commits to mmc/core, I found the
>>>>>> culprit:
>>>>>> Commit e173f8911f091fa50ccf8cc1fa316dd5569bc470 ("mmc: core:
>> Update
>>>>>> CMD13 polling policy when switch to HS DDR mode")
>>>>>>
>>>>>> Reverting it fixes the problem. But I am unsure if that's the right
>>>>>> course of action?
>>>>>>
>>>>>> Feel free to send me patches for testing!
>>>>>
>>>>> By looking the changes itself, it should be good from the view of spec.
>>>>> Maybe you could try the patch below, but don't beat me if that
>>>>> doesn't help at all. :)
>>>>>
>>>>> --- a/drivers/mmc/core/mmc.c
>>>>> +++ b/drivers/mmc/core/mmc.c
>>>>> @@ -1074,7 +1074,7 @@ static int mmc_select_hs_ddr(struct mmc_card
>> *card)
>>>>>                            EXT_CSD_BUS_WIDTH,
>>>>>                            ext_csd_bits,
>>>>>                            card->ext_csd.generic_cmd6_time,
>>>>> -                          MMC_TIMING_MMC_DDR52,
>>>>> +                          0,
>>>>>                            true, true, true);
>>>>>         if (err) {
>>>>>                 pr_err("%s: switch to bus width %d ddr failed\n", @@
>>>>> -1118,6 +1118,9 @@ static int mmc_select_hs_ddr(struct mmc_card *card)
>>>>>         if (err)
>>>>>                 err = __mmc_set_signal_voltage(host,
>>>>> MMC_SIGNAL_VOLTAGE_330);
>>>>>
>>>>> +       if (!err)
>>>>> +               mmc_set_timing(host, MMC_TIMING_MMC_DDR52);
>>>>> +
>>>>>
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> thank you. This patch solves the problem! :)
>>>>
>>>> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
>>>>
>>>> Regards,
>>>> Clemens
>>>
>>> Everybody involved, thanks for looking into this!
>>>
>>> I think the above approach seems like a reasonable fix for the 4.10
>>> rcs. Shawn Lin, would you mind re-posting a proper patch with a
>>> change-log?
>>
>> Sure.
>>
>>>
>>> In the meantime, I will follow the process of Haibo Chen's debugging
>>> around the voltage switch issue and look into what Dong's suggesting
>>> around this may be.
>>>
>>> Just to be clear, I would definitely prefer a fix in the sdhci driver,
>>
>> yup, I prefer to fix the sdhci* either, and given that it's juct -rc3 now, we should
>> still have some days for Haibo & Dong to help debug it.
>> Once the fix is settled, we could drop the core fix from -next branch.
>>
>
> Hi Ulf and Shawn,
>
> Aisheng and I debug this issue these days, and we find the root cause. There are two things
> to describe.
>

Good to know.

> 1) voltage switch issue.  The properity "no-1-8-v" do not work for  MMC_TIMING_MMC_DDR52.
> This is another bug, we need to fix, but has no relation with the current bug.
>

yup, please.

> 2) root cause, in __mmc_switch, the process is   send CMD6 --> set DDR52 timing --> polling for busy.
> For the DDR52 timing setting, we call set_ios(), in the set_ios, we first set DDR_EN to config sdhc in ddr mode,
> and then config the sd clock again.   Here it is, after CMD6 complete, we find data0 still low, which means card
> busy. At this time, if we set DDR_EN, there is a risk. For i.MX usdhc, DDR_EN setting becomes active only when
> the DATA and CMD line are idle. So, at this time for HW, DDR_EN do not active, but software think DDR_EN already
> active, and set the clock again to 49.5MHz, but actually the HW out put the clock as 198MHz. So there is clock glitch.
> This is the root cause--set DDR_EN when card is still busy.
>

Makes sense. But it makes me more worried about the problem.
Could changing the timing settings while the card is still busy
affect other controllers as well? That seems quite possible. So I'm
afraid that, although for now we have only broken hs_ddr mode on your
platform, your case exposes a potential risk for others. Thoughts?

> The following method can fix this issue
> a) change the HW behavior, DDR_EN setting becomes active at once no matter what the state of the DATA and
> CMD line are.   This can fix this issue, but our IC guys do not prefer this, this method still not safe enough.
>
> b) add 1ms delay before DDR_EN to wait bus idle.  But we still not know whether the time 1ms is appropriate. Better
> to poll for busy before set DDR_EN.
>
> c) set DDR52 timing after CMD6 and pull for busy. This is what Shawn's patch do.
>
> Hi Aisheng,
> Correct me if anything wrong.
>
> My suggestion is that,  in __mmc_switch(), move the mmc_set_timing() after the function mmc_poll_for_busy().
>
>
> Best Regards
> Haibo Chen
>
>
>
>>> if that can be done. So I will give Haibo/Dong etc a couple of more
>>> days to investigate, before applying Shawn Lin's fix for the core.
>>> Hope that approach is okay with all of you?
>>>
>>> Kind regards
>>> Uffe
>>>
>>>
>>>
>>
>>
>> --
>> Best Regards
>> Shawn Lin
>
Dong Aisheng Jan. 13, 2017, 4:45 a.m. UTC | #8
Hi Shawn,

On Fri, Jan 13, 2017 at 12:03 PM, Shawn Lin <shawn.lin@rock-chips.com> wrote:

[...]

>
>> 2) root cause, in __mmc_switch, the process is   send CMD6 --> set DDR52
>> timing --> polling for busy.
>> For the DDR52 timing setting, we call set_ios(), in the set_ios, we first
>> set DDR_EN to config sdhc in ddr mode,
>> and then config the sd clock again.   Here it is, after CMD6 complete, we
>> find data0 still low, which means card
>> busy. At this time, if we set DDR_EN, there is a risk. For i.MX usdhc,
>> DDR_EN setting becomes active only when
>> the DATA and CMD line are idle. So, at this time for HW, DDR_EN do not
>> active, but software think DDR_EN already
>> active, and set the clock again to 49.5MHz, but actually the HW out put
>> the clock as 198MHz. So there is clock glitch.
>> This is the root cause--set DDR_EN when card is still busy.
>>
>
> Make sense. But it makes me more worried about the problem.
> Does it impact other controllers if changing timing settings when
> it's in busy state? It seems very likely possible. So I'm afraid
> that we now just break hs_ddr mode for your platform but on the
> contrary your case exposes this potention risk here. Thought?
>

Yes, I have the same concern, as I said in my last email.
I am not sure whether any other controllers are exposed to the same
issue, since the kernel with this behaviour is still quite new.

Regards
Dong Aisheng

>> The following method can fix this issue
>> a) change the HW behavior, DDR_EN setting becomes active at once no matter
>> what the state of the DATA and
>> CMD line are.   This can fix this issue, but our IC guys do not prefer
>> this, this method still not safe enough.
>>
>> b) add 1ms delay before DDR_EN to wait bus idle.  But we still not know
>> whether the time 1ms is appropriate. Better
>> to poll for busy before set DDR_EN.
>>
>> c) set DDR52 timing after CMD6 and pull for busy. This is what Shawn's
>> patch do.
>>
>> Hi Aisheng,
>> Correct me if anything wrong.
>>
>> My suggestion is that,  in __mmc_switch(), move the mmc_set_timing() after
>> the function mmc_poll_for_busy().
>>
>>
>> Best Regards
>> Haibo Chen
>>
>>
>>
>>>> if that can be done. So I will give Haibo/Dong etc a couple of more
>>>> days to investigate, before applying Shawn Lin's fix for the core.
>>>> Hope that approach is okay with all of you?
>>>>
>>>> Kind regards
>>>> Uffe
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>> Shawn Lin
>>
>>
>
>
> --
> Best Regards
> Shawn Lin
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Jan. 13, 2017, 11:23 a.m. UTC | #9
[...]

> Hi Ulf and Shawn,
>
> Aisheng and I debug this issue these days, and we find the root cause. There are two things
> to describe.
>
> 1) voltage switch issue.  The properity "no-1-8-v" do not work for  MMC_TIMING_MMC_DDR52.
> This is another bug, we need to fix, but has no relation with the current bug.

I am working on a patch which introduces MMC_CAP_3_3V_DDR together with
a corresponding DT binding, "mmc-ddr-3_3v".
Give me a day or so, then I will post it.

It should likely help to resolve your issue, don't you think?

[...]

Kind regards
Uffe
Bough Chen Jan. 16, 2017, 3:12 a.m. UTC | #10
> -----Original Message-----
> From: Ulf Hansson [mailto:ulf.hansson@linaro.org]
> Sent: Friday, January 13, 2017 7:23 PM
> To: Bough Chen <haibo.chen@nxp.com>
> Cc: Shawn Lin <shawn.lin@rock-chips.com>; Clemens Gruber
> <clemens.gruber@pqgruber.com>; linux-mmc@vger.kernel.org; Linus Walleij
> <linus.walleij@linaro.org>; Adrian Hunter <adrian.hunter@intel.com>; A.S.
> Dong <aisheng.dong@nxp.com>; linux-kernel@vger.kernel.org; Gary Bisson
> <gary.bisson@boundarydevices.com>; Fabio Estevam <festevam@gmail.com>;
> Shawn Guo <shawnguo@kernel.org>
> Subject: Re: eMMC boot problem: switch to bus width 8 ddr failed
>
> [...]
>
> > Hi Ulf and Shawn,
> >
> > Aisheng and I debug this issue these days, and we find the root cause.
> > There are two things to describe.
> >
> > 1) voltage switch issue.  The properity "no-1-8-v" do not work for
> MMC_TIMING_MMC_DDR52.
> > This is another bug, we need to fix, but has no relation with the current bug.
>
> I am working on a patch which invents MMC_CAP_3_3V_DDR and which has a
> corresponding DT binding "mmc-ddr-3_3v".
> Give me a day or so, then I will post it.
>
> Likely it should help to resolve your issue, don't you think?

Seems Yes, I will test the patch when you post.

Best Regards,
Haibo Chen

>
> [...]
>
> Kind regards
> Uffe
Patch

--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1074,7 +1074,7 @@  static int mmc_select_hs_ddr(struct mmc_card *card)
                            EXT_CSD_BUS_WIDTH,
                            ext_csd_bits,
                            card->ext_csd.generic_cmd6_time,
-                          MMC_TIMING_MMC_DDR52,
+                          0,
                            true, true, true);
         if (err) {
                 pr_err("%s: switch to bus width %d ddr failed\n",
@@ -1118,6 +1118,9 @@  static int mmc_select_hs_ddr(struct mmc_card *card)
         if (err)
                 err = __mmc_set_signal_voltage(host, MMC_SIGNAL_VOLTAGE_330);

+       if (!err)
+               mmc_set_timing(host, MMC_TIMING_MMC_DDR52);
+