Message ID | 1397529703-21165-2-git-send-email-shawn.guo@freescale.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Apr 14, 2014 at 11:41 PM, Shawn Guo <shawn.guo@freescale.com> wrote:
> When suspending imx6q systems which have rootfs on SATA, the following
> error will likely be seen in resume.  The SATA link will fail to come
> up, and it results in an unusable system across the suspend/resume
> cycle.
>
> $ echo mem > /sys/power/state
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... (elapsed 0.002 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
> PM: Entering mem sleep
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> sd 0:0:0:0: [sda] Stopping disk
> PM: suspend of devices complete after 61.914 msecs
> PM: suspend devices took 0.070 seconds
> PM: late suspend of devices complete after 4.906 msecs
> PM: noirq suspend of devices complete after 4.521 msecs
> Disabling non-boot CPUs ...
> CPU1: shutdown
> CPU2: shutdown
> CPU3: shutdown
> Enabling non-boot CPUs ...
> CPU1: Booted secondary processor
> CPU1 is up
> CPU2: Booted secondary processor
> CPU2 is up
> CPU3: Booted secondary processor
> CPU3 is up
> PM: noirq resume of devices complete after 10.486 msecs
> PM: early resume of devices complete after 4.679 msecs
> sd 0:0:0:0: [sda] Starting disk
> PM: resume of devices complete after 22.674 msecs
> PM: resume devices took 0.030 seconds
> PM: Finishing wakeup.
> Restarting tasks ... done.
> $ ata1: SATA link down (SStatus 1 SControl 300)
> ata1: SATA link down (SStatus 1 SControl 300)
> ata1: limiting SATA link speed to 1.5 Gbps
> ata1: SATA link down (SStatus 1 SControl 310)
> ata1.00: disabled
> ata1: exception Emask 0x10 SAct 0x0 SErr 0x4040000 action 0xe frozen t4
> ata1: irq_stat 0x00000040, connection status changed
> ata1: SError: { CommWake DevExch }
> ata1: hard resetting link
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: killing request
> sd 0:0:0:0: rejecting I/O to offline device
> Aborting journal on device sda2-8.
> sd 0:0:0:0: rejecting I/O to offline device
> EXT4-fs warning (device sda2): ext4_end_bio:317: I/O error writing to inode 132577 (offset 0 size 0 starting block 26235)
> Buffer I/O error on device sda2, logical block 10169
> ...
>
> It's caused by a silicon issue that SATA phy does not get reset by
> controller when coming back from LPM.  The patch adds a software
> workaround for this issue.  It enforces a software reset on SATA phy
> in imx_sata_enable() function, so that we can ensure SATA link will
> come up properly in both power-on and resume.
>
> The software reset is implemented by writing phy reset register through
> the phy control register bus interface.  Functions
> imx_phy_reg_[addressing|write|read]() implement this bus interface, while
> imx_sata_phy_reset() performs the actually reset operation.
>
> Signed-off-by: Richard Zhu <r65037@freescale.com>
> Signed-off-by: Shawn Guo <shawn.guo@freescale.com>

Tested-by: Fabio Estevam <fabio.estevam@freescale.com>

This fixes the warning I reported at:
http://www.spinics.net/lists/arm-kernel/msg318745.html

Thanks,

Fabio Estevam
On Tue, Apr 15, 2014 at 10:41:43AM +0800, Shawn Guo wrote:
> +static int imx_phy_ack_polling(void __iomem *mmio, bool assert)
> +{
> +	int timeout = 100;
> +	u32 srval;
> +
> +	do {
> +		srval = readl(mmio + IMX_SATA_P0PHYSR);
> +		if ((assert ? srval : ~srval) & P0PHYSR_CR_ACK)
> +			break;
> +		usleep_range(100, 200);

What's up with usleep_range() these days?  There's no point in polling
in sub-msec intervals.  Let's please stick to msleep().

> +static int imx_phy_reg_addressing(u16 addr, void __iomem *mmio)
> +{
> +	u32 crval = addr;
> +	int ret;
> +
> +	/* 1. Supply the address on cr_data_in */
> +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> +
> +	/* 2. Assert the cr_cap_addr signal */
> +	crval |= P0PHYCR_CR_CAP_ADDR;
> +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> +
> +	/* 3. Wait for the cr_ack signal to be asserted */
> +	ret = imx_phy_ack_polling(mmio, true);
> +	if (ret)
> +		return ret;
> +
> +	/* 4. Deassert cr_cap_addr */
> +	crval &= ~P0PHYCR_CR_CAP_ADDR;
> +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> +
> +	/* 5. Wait for cr_ack to be deasserted */
> +	ret = imx_phy_ack_polling(mmio, false);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}

Wouldn't folding comment 3 and 5 into 2 and 4 respectively make it
easier on the eyes?  They're single operations anyway.

> +static int imx_phy_reg_write(u16 val, void __iomem *mmio)
> +{
...
> +}

Ditto.  Also, maybe it'd be better to create a wrapper to assert/clear
and wait for ack?

...
> +static int imx_sata_phy_reset(struct ahci_host_priv *hpriv)
> +{
> +	void __iomem *mmio = hpriv->mmio;
> +	int timeout = 10;
> +	u16 val;
> +	int ret;
> +
> +	/* Reset SATA PHY by setting RESET bit of PHY register CLOCK_RESET */
> +	ret = imx_phy_reg_addressing(IMX_PHY_CLOCK_RESET, mmio);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * For phy reset operation, we skip the timeout checking, because phy
> +	 * will be unable to acknowledge in this case.
> +	 */
> +	imx_phy_reg_write(CLOCK_RESET_RESET, mmio);
> +
> +	usleep_range(100, 200);
> +
> +	/* Wait for PHY RX_PLL to be stable */
> +	do {
> +		ret = imx_phy_reg_addressing(IMX_PHY_LANE0_OUT_STAT, mmio);
> +		if (ret)
> +			return ret;
> +		ret = imx_phy_reg_read(&val, mmio);
> +		if (ret)
> +			return ret;
> +		if (val & LANE0_OUT_STAT_RX_PLL_STATE)
> +			break;
> +		usleep_range(100, 200);

Ditto with above.

Thanks.
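
The wrapper suggested in this review could look roughly like the sketch below. It is a userspace illustration only, not code from the patch or from any later revision: `struct fake_phy`, `fake_writel()`, `phy_strobe_and_wait()` and the `stuck` flag are hypothetical names, and the fake PHY is assumed to raise cr_ack whenever a strobe bit is set. The bit positions are the ones defined in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the patch. */
#define P0PHYCR_CR_CAP_ADDR	(1u << 16)
#define P0PHYSR_CR_ACK		(1u << 18)

/* Hypothetical stand-in for the PHY: plain fields instead of MMIO. */
struct fake_phy {
	uint32_t cr;	/* models IMX_SATA_P0PHYCR */
	uint32_t sr;	/* models IMX_SATA_P0PHYSR */
	bool stuck;	/* when true, the fake PHY never acknowledges */
};

/* Mock of writel(): cr_ack follows the strobe bit unless the PHY is stuck. */
static void fake_writel(struct fake_phy *phy, uint32_t val)
{
	phy->cr = val;
	if (phy->stuck)
		return;
	if (val & P0PHYCR_CR_CAP_ADDR)
		phy->sr |= P0PHYSR_CR_ACK;
	else
		phy->sr &= ~P0PHYSR_CR_ACK;
}

/* Same loop shape as imx_phy_ack_polling(), minus the sleep. */
static int ack_polling(struct fake_phy *phy, bool assert)
{
	int timeout = 100;

	do {
		uint32_t srval = phy->sr;

		if ((assert ? srval : ~srval) & P0PHYSR_CR_ACK)
			break;
		/* the driver sleeps here between reads */
	} while (--timeout);

	return timeout ? 0 : -ETIMEDOUT;
}

/* Hypothetical wrapper: set or clear one strobe bit, then wait for cr_ack. */
static int phy_strobe_and_wait(struct fake_phy *phy, uint32_t *crval,
			       uint32_t bit, bool assert)
{
	assert(bit != 0);

	if (assert)
		*crval |= bit;
	else
		*crval &= ~bit;
	fake_writel(phy, *crval);

	return ack_polling(phy, assert);
}

static int phy_reg_addressing(struct fake_phy *phy, uint16_t addr)
{
	uint32_t crval = addr;
	int ret;

	/* Supply the address on cr_data_in */
	fake_writel(phy, crval);

	/* Assert cr_cap_addr and wait for cr_ack */
	ret = phy_strobe_and_wait(phy, &crval, P0PHYCR_CR_CAP_ADDR, true);
	if (ret)
		return ret;

	/* Deassert cr_cap_addr and wait for cr_ack to drop */
	return phy_strobe_and_wait(phy, &crval, P0PHYCR_CR_CAP_ADDR, false);
}
```

With a wrapper of this shape, each numbered pair of steps in the quoted functions (assert then wait, deassert then wait) collapses into a single call, which is the readability gain the review asks for.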
On Tue, Apr 15, 2014 at 12:10:18PM -0400, Tejun Heo wrote:
> On Tue, Apr 15, 2014 at 10:41:43AM +0800, Shawn Guo wrote:
> > +static int imx_phy_ack_polling(void __iomem *mmio, bool assert)
> > +{
> > +	int timeout = 100;
> > +	u32 srval;
> > +
> > +	do {
> > +		srval = readl(mmio + IMX_SATA_P0PHYSR);
> > +		if ((assert ? srval : ~srval) & P0PHYSR_CR_ACK)
> > +			break;
> > +		usleep_range(100, 200);
>
> What's up with usleep_range() these days?  There's no point in polling
> in sub-msec intervals.  Let's please stick to msleep().

The hardware manual suggests that an ACK will normally be given within
100us.  So using msleep() will have it sleep longer than necessary for
normal case.  Basically, we're following the suggestion from
Documentation/timers/timers-howto.txt to use usleep_range() over
msleep().

SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms):
	* Use usleep_range

	- Why not msleep for (1ms - 20ms)?
		Explained originally here:
			http://lkml.org/lkml/2007/8/3/250
		msleep(1~20) may not do what the caller intends, and
		will often sleep longer (~20 ms actual sleep for any
		value given in the 1~20ms range). In many cases this
		is not the desired behavior.

>
> > +static int imx_phy_reg_addressing(u16 addr, void __iomem *mmio)
> > +{
> > +	u32 crval = addr;
> > +	int ret;
> > +
> > +	/* 1. Supply the address on cr_data_in */
> > +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> > +
> > +	/* 2. Assert the cr_cap_addr signal */
> > +	crval |= P0PHYCR_CR_CAP_ADDR;
> > +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> > +
> > +	/* 3. Wait for the cr_ack signal to be asserted */
> > +	ret = imx_phy_ack_polling(mmio, true);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* 4. Deassert cr_cap_addr */
> > +	crval &= ~P0PHYCR_CR_CAP_ADDR;
> > +	writel(crval, mmio + IMX_SATA_P0PHYCR);
> > +
> > +	/* 5. Wait for cr_ack to be deasserted */
> > +	ret = imx_phy_ack_polling(mmio, false);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return 0;
> > +}
>
> Wouldn't folding comment 3 and 5 into 2 and 4 respectively make it
> easier on the eyes?  They're single operations anyway.
>
> > +static int imx_phy_reg_write(u16 val, void __iomem *mmio)
> > +{
> ...
> > +}
>
> Ditto.  Also, maybe it'd be better to create a wrapper to assert/clear
> and wait for ack?

I was writing the code in the exact steps documented in the hardware
manual.  But your comment makes a lot of sense to me, so I will create
such a wrapper to make it easier on the eyes.  Thanks, Tejun.

Shawn

>
> ...
> > +static int imx_sata_phy_reset(struct ahci_host_priv *hpriv)
> > +{
> > +	void __iomem *mmio = hpriv->mmio;
> > +	int timeout = 10;
> > +	u16 val;
> > +	int ret;
> > +
> > +	/* Reset SATA PHY by setting RESET bit of PHY register CLOCK_RESET */
> > +	ret = imx_phy_reg_addressing(IMX_PHY_CLOCK_RESET, mmio);
> > +	if (ret)
> > +		return ret;
> > +	/*
> > +	 * For phy reset operation, we skip the timeout checking, because phy
> > +	 * will be unable to acknowledge in this case.
> > +	 */
> > +	imx_phy_reg_write(CLOCK_RESET_RESET, mmio);
> > +
> > +	usleep_range(100, 200);
> > +
> > +	/* Wait for PHY RX_PLL to be stable */
> > +	do {
> > +		ret = imx_phy_reg_addressing(IMX_PHY_LANE0_OUT_STAT, mmio);
> > +		if (ret)
> > +			return ret;
> > +		ret = imx_phy_reg_read(&val, mmio);
> > +		if (ret)
> > +			return ret;
> > +		if (val & LANE0_OUT_STAT_RX_PLL_STATE)
> > +			break;
> > +		usleep_range(100, 200);
>
> Ditto with above.
>
> Thanks.
>
> --
> tejun
Hello,

On Wed, Apr 16, 2014 at 03:38:24PM +0800, Shawn Guo wrote:
> The hardware manual suggests that an ACK will normally be given within
> 100us.  So using msleep() will have it sleep longer than necessary for
> normal case.  Basically, we're following the suggestion from
> Documentation/timers/timers-howto.txt to use usleep_range() over
> msleep().
>
> SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms):
> 	* Use usleep_range
>
> 	- Why not msleep for (1ms - 20ms)?
> 		Explained originally here:
> 			http://lkml.org/lkml/2007/8/3/250
> 		msleep(1~20) may not do what the caller intends, and
> 		will often sleep longer (~20 ms actual sleep for any
> 		value given in the 1~20ms range). In many cases this
> 		is not the desired behavior.

But we don't care whether the minimum wait is actually 100us here, do
we?  It's a reset path.  Let's please just be simple.  msleep(1) not
actually sleeping 1ms is an implementation detail which may change in
the future.  If this were in the command execution path, sure, but not
here.

Thanks.
On Wed, Apr 16, 2014 at 10:01:12AM -0400, Tejun Heo wrote:
> Hello,
>
> On Wed, Apr 16, 2014 at 03:38:24PM +0800, Shawn Guo wrote:
> > The hardware manual suggests that an ACK will normally be given within
> > 100us.  So using msleep() will have it sleep longer than necessary for
> > normal case.  Basically, we're following the suggestion from
> > Documentation/timers/timers-howto.txt to use usleep_range() over
> > msleep().
> >
> > SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms):
> > 	* Use usleep_range
> >
> > 	- Why not msleep for (1ms - 20ms)?
> > 		Explained originally here:
> > 			http://lkml.org/lkml/2007/8/3/250
> > 		msleep(1~20) may not do what the caller intends, and
> > 		will often sleep longer (~20 ms actual sleep for any
> > 		value given in the 1~20ms range). In many cases this
> > 		is not the desired behavior.
>
> But we don't care whether the minimum wait is actually 100us here, do
> we?  It's a reset path.  Let's please just be simple.  msleep(1) not
> actually sleeping 1ms is an implementation detail which may change in
> the future.  If this were in the command execution path, sure, but not
> here.

Functions imx_phy_reg_[addressing|write|read]() implement the phy
control register bus interface.  Though the interface is used only for
phy reset right now, it's not limited to phy reset but anything that
needs to access phy.

Even for reset path, we still care about the time.  I just quickly
tested it with replacing all those usleep_range() calls with msleep(1),
and found that the resume time increases dramatically, from 10.880 msecs
to 69.431 msecs.  The system resume speed is quite critical to some of
our customers.  If every device sleeps longer than necessary like that,
the system resume speed will be unacceptably slow.  So, please, allow
the use of usleep_range() here.

Shawn

=== usleep_range ===

$ echo mem > /sys/power/state
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Entering mem sleep
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
PM: suspend of devices complete after 51.209 msecs
PM: suspend devices took 0.060 seconds
PM: late suspend of devices complete after 4.651 msecs
PM: noirq suspend of devices complete after 4.869 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
CPU1 is up
CPU2: Booted secondary processor
CPU2 is up
CPU3: Booted secondary processor
CPU3 is up
PM: noirq resume of devices complete after 4.010 msecs
PM: early resume of devices complete after 4.619 msecs
sd 0:0:0:0: [sda] Starting disk
PM: resume of devices complete after 10.880 msecs
PM: resume devices took 0.020 seconds
PM: Finishing wakeup.
Restarting tasks ... done.
$ ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/133

=== msleep(1) ===

$ echo mem > /sys/power/state
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... (elapsed 0.002 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Entering mem sleep
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
PM: suspend of devices complete after 35.751 msecs
PM: suspend devices took 0.040 seconds
PM: late suspend of devices complete after 4.190 msecs
PM: noirq suspend of devices complete after 4.450 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
CPU1 is up
CPU2: Booted secondary processor
CPU2 is up
CPU3: Booted secondary processor
CPU3 is up
PM: noirq resume of devices complete after 2.328 msecs
PM: early resume of devices complete after 2.774 msecs
sd 0:0:0:0: [sda] Starting disk
PM: resume of devices complete after 69.431 msecs
PM: resume devices took 0.070 seconds
PM: Finishing wakeup.
Restarting tasks ... done.
$ ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/133
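
As a rough companion to the measurements above, the sketch below works out how long imx_phy_ack_polling() can sleep in total before it gives up. The try count and the usleep_range() bounds come from the patch; `poll_budget_us()` is a hypothetical helper, and the 1000 us figure for msleep(1) is only a lower bound, since msleep() sleeps at least the requested time and often longer. These are timeout ceilings, not the resume times reported in the logs.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from imx_phy_ack_polling() in the patch. */
#define ACK_POLL_TRIES		100u
#define ACK_SLEEP_MIN_US	100u	/* usleep_range() lower bound */
#define ACK_SLEEP_MAX_US	200u	/* usleep_range() upper bound */

/* Lower bound for msleep(1); the real sleep may be longer. */
#define MSLEEP_1_MIN_US		1000u

/* Hypothetical helper: total sleep time if every try sleeps per_try_us. */
static uint32_t poll_budget_us(uint32_t tries, uint32_t per_try_us)
{
	assert(tries > 0);
	return tries * per_try_us;
}
```

Under these assumptions a single ack wait times out after 10 to 20 ms with usleep_range(100, 200), and after no less than 100 ms if each sleep were msleep(1).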
diff --git a/drivers/ata/ahci_imx.c b/drivers/ata/ahci_imx.c
index 39629b4..7d4756d 100644
--- a/drivers/ata/ahci_imx.c
+++ b/drivers/ata/ahci_imx.c
@@ -31,6 +31,18 @@
 #define IMX_SATA_TIMER1MS			0x00e0
 #define IMX_SATA_P0PHYCR			0x0178
 #define P0PHYCR_TEST_PDDQ			(1 << 20)
+#define P0PHYCR_CR_READ				(1 << 19)
+#define P0PHYCR_CR_WRITE			(1 << 18)
+#define P0PHYCR_CR_CAP_DATA			(1 << 17)
+#define P0PHYCR_CR_CAP_ADDR			(1 << 16)
+#define IMX_SATA_P0PHYSR			0x017c
+#define P0PHYSR_CR_ACK				(1 << 18)
+#define P0PHYSR_CR_DATA_OUT			0xffff
+
+#define IMX_PHY_LANE0_OUT_STAT			0x2003
+#define LANE0_OUT_STAT_RX_PLL_STATE		(1 << 1)
+#define IMX_PHY_CLOCK_RESET			0x7f3f
+#define CLOCK_RESET_RESET			(1 << 0)
 
 enum ahci_imx_type {
 	AHCI_IMX53,
@@ -52,9 +64,165 @@ MODULE_PARM_DESC(hotplug, "AHCI IMX hot-plug support (0=Don't support, 1=support
 
 static void ahci_imx_host_stop(struct ata_host *host);
 
+static int imx_phy_ack_polling(void __iomem *mmio, bool assert)
+{
+	int timeout = 100;
+	u32 srval;
+
+	do {
+		srval = readl(mmio + IMX_SATA_P0PHYSR);
+		if ((assert ? srval : ~srval) & P0PHYSR_CR_ACK)
+			break;
+		usleep_range(100, 200);
+	} while (--timeout);
+
+	return timeout ? 0 : -ETIMEDOUT;
+}
+
+static int imx_phy_reg_addressing(u16 addr, void __iomem *mmio)
+{
+	u32 crval = addr;
+	int ret;
+
+	/* 1. Supply the address on cr_data_in */
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 2. Assert the cr_cap_addr signal */
+	crval |= P0PHYCR_CR_CAP_ADDR;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 3. Wait for the cr_ack signal to be asserted */
+	ret = imx_phy_ack_polling(mmio, true);
+	if (ret)
+		return ret;
+
+	/* 4. Deassert cr_cap_addr */
+	crval &= ~P0PHYCR_CR_CAP_ADDR;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 5. Wait for cr_ack to be deasserted */
+	ret = imx_phy_ack_polling(mmio, false);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int imx_phy_reg_write(u16 val, void __iomem *mmio)
+{
+	u32 crval = val;
+	int ret;
+
+	/* 1. Supply the data on cr_data_in */
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 2. Assert the cr_cap_data signal */
+	crval |= P0PHYCR_CR_CAP_DATA;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 3. Wait for the cr_ack signal to be asserted */
+	ret = imx_phy_ack_polling(mmio, true);
+	if (ret)
+		return ret;
+
+	/* 4. Deassert cr_cap_data */
+	crval &= ~P0PHYCR_CR_CAP_DATA;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 5. Wait for cr_ack to be deasserted */
+	ret = imx_phy_ack_polling(mmio, false);
+	if (ret)
+		return ret;
+
+	/* 6. Assert the cr_write signal */
+	crval |= P0PHYCR_CR_WRITE;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 7. Wait for cr_ack to be asserted */
+	ret = imx_phy_ack_polling(mmio, true);
+	if (ret)
+		return ret;
+
+	/* 8. Deassert cr_write */
+	crval &= ~P0PHYCR_CR_WRITE;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 9. Wait for cr_ack to be deasserted */
+	ret = imx_phy_ack_polling(mmio, false);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int imx_phy_reg_read(u16 *val, void __iomem *mmio)
+{
+	u32 crval = 0;
+	int ret;
+
+	/* 1. Assert the cr_read signal */
+	crval |= P0PHYCR_CR_READ;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 2. Wait for the cr_ack signal to be asserted */
+	ret = imx_phy_ack_polling(mmio, true);
+	if (ret)
+		return ret;
+
+	/* 3. Capture the data from cr_data_out[] */
+	*val = readl(mmio + IMX_SATA_P0PHYSR) & P0PHYSR_CR_DATA_OUT;
+
+	/* 4. Deassert cr_read */
+	crval &= ~P0PHYCR_CR_READ;
+	writel(crval, mmio + IMX_SATA_P0PHYCR);
+
+	/* 5. Wait for cr_ack to be deasserted */
+	ret = imx_phy_ack_polling(mmio, false);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int imx_sata_phy_reset(struct ahci_host_priv *hpriv)
+{
+	void __iomem *mmio = hpriv->mmio;
+	int timeout = 10;
+	u16 val;
+	int ret;
+
+	/* Reset SATA PHY by setting RESET bit of PHY register CLOCK_RESET */
+	ret = imx_phy_reg_addressing(IMX_PHY_CLOCK_RESET, mmio);
+	if (ret)
+		return ret;
+	/*
+	 * For phy reset operation, we skip the timeout checking, because phy
+	 * will be unable to acknowledge in this case.
+	 */
+	imx_phy_reg_write(CLOCK_RESET_RESET, mmio);
+
+	usleep_range(100, 200);
+
+	/* Wait for PHY RX_PLL to be stable */
+	do {
+		ret = imx_phy_reg_addressing(IMX_PHY_LANE0_OUT_STAT, mmio);
+		if (ret)
+			return ret;
+		ret = imx_phy_reg_read(&val, mmio);
+		if (ret)
+			return ret;
+		if (val & LANE0_OUT_STAT_RX_PLL_STATE)
+			break;
+		usleep_range(100, 200);
+	} while (--timeout);
+
+	return timeout ? 0 : -ETIMEDOUT;
+}
+
 static int imx_sata_enable(struct ahci_host_priv *hpriv)
 {
 	struct imx_ahci_priv *imxpriv = hpriv->plat_data;
+	struct device *dev = &imxpriv->ahci_pdev->dev;
 	int ret;
 
 	if (imxpriv->no_device)
@@ -99,6 +267,12 @@ static int imx_sata_enable(struct ahci_host_priv *hpriv)
 		regmap_update_bits(imxpriv->gpr, IOMUXC_GPR13,
 				   IMX6Q_GPR13_SATA_MPLL_CLK_EN,
 				   IMX6Q_GPR13_SATA_MPLL_CLK_EN);
+
+		ret = imx_sata_phy_reset(hpriv);
+		if (ret) {
+			dev_err(dev, "failed to reset phy: %d\n", ret);
+			goto disable_regulator;
+		}
 	}
 
 	usleep_range(1000, 2000);
@@ -215,6 +389,7 @@ static int imx_ahci_probe(struct platform_device *pdev)
 	if (!imxpriv)
 		return -ENOMEM;
 
+	imxpriv->ahci_pdev = pdev;
 	imxpriv->no_device = false;
 	imxpriv->first_time = true;
 	imxpriv->type = (enum ahci_imx_type)of_id->data;
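
Two idioms in imx_phy_ack_polling() are easy to misread: the `(assert ? srval : ~srval) & P0PHYSR_CR_ACK` test, which checks the same bit for either polarity, and the `do { } while (--timeout)` loop, which performs at most 100 reads. The userspace sketch below isolates both. It is an illustration, not driver code: `ack_at_level()`, `struct fake_status` and `fake_read()` are hypothetical names that stand in for the readl() of P0PHYSR.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define P0PHYSR_CR_ACK	(1u << 18)	/* from the patch */

/* True when the cr_ack bit in srval has the wanted level. */
static bool ack_at_level(uint32_t srval, bool assert)
{
	/* Inverting srval turns "bit is clear" into "bit is set". */
	return ((assert ? srval : ~srval) & P0PHYSR_CR_ACK) != 0;
}

/* Hypothetical status source that replaces readl() and counts calls. */
struct fake_status {
	uint32_t value;	/* what every read returns */
	int reads;	/* how many reads were issued */
};

static uint32_t fake_read(struct fake_status *st)
{
	assert(st);
	st->reads++;
	return st->value;
}

/* Same loop shape as imx_phy_ack_polling(), minus the sleep. */
static int ack_polling(struct fake_status *st, bool assert)
{
	int timeout = 100;

	do {
		if (ack_at_level(fake_read(st), assert))
			break;
		/* the driver calls usleep_range(100, 200) here */
	} while (--timeout);

	/* timeout only reaches 0 when every read missed. */
	return timeout ? 0 : -ETIMEDOUT;
}
```

A source that never acknowledges is read exactly 100 times before -ETIMEDOUT is returned, while a source that is already at the wanted level is read once.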