[v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"

Message ID 20110514102804.GB23665@htj.dyndns.org (mailing list archive)
State Not Applicable, archived

Commit Message

Tejun Heo May 14, 2011, 10:28 a.m. UTC
This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.

The commit causes command timeouts on AC plug/unplug.  It isn't yet
clear why.  As the commit was for a single rather obscure controller,
revert the change for now.

The problem was reported and bisected by Gu Rui in bug#34692.

 https://bugzilla.kernel.org/show_bug.cgi?id=34692

Also, reported by Rafael and Michael in the following thread.

 http://thread.gmane.org/gmane.linux.kernel/1138771

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Gu Rui <chaos.proton@gmail.com>
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Michael Leun <lkml20100708@newton.leun.net>
Cc: Jian Peng <jipeng2005@gmail.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
---
As we're already in -rc7, I'm sending the revert patch to both Jeff
and Linus.

Thank you.

 drivers/ata/libahci.c |   21 ---------------------
 1 files changed, 0 insertions(+), 21 deletions(-)

Comments

Linus Torvalds May 14, 2011, 5:47 p.m. UTC | #1
On Sat, May 14, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:
>
> As we're already in -rc7, I'm sending the revert patch to both Jeff
> and Linus.

Applied,

                   Linus
Jeff Garzik May 14, 2011, 6:49 p.m. UTC | #2
On 05/14/2011 01:47 PM, Linus Torvalds wrote:
> On Sat, May 14, 2011 at 3:28 AM, Tejun Heo<tj@kernel.org>  wrote:
>>
>> As we're already in -rc7, I'm sending the revert patch to both Jeff
>> and Linus.
>
> Applied,

ACK, thanks
Jian Peng May 15, 2011, 6:19 a.m. UTC | #3
Hi, Michael/Rafael/Gu,

Could you help me understand the testing environment? I would like to
reproduce it if possible and debug it. The host controller this patch deals
with is not used in PCs, so my testing environment is quite different.

Thanks,
Jian

On Sat, May 14, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:

> This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
>
> The commit causes command timeouts on AC plug/unplug.  It isn't yet
> clear why.  As the commit was for a single rather obscure controller,
> revert the change for now.
>
> The problem was reported and bisected by Gu Rui in bug#34692.
>
>  https://bugzilla.kernel.org/show_bug.cgi?id=34692
>
> Also, reported by Rafael and Michael in the following thread.
>
>  http://thread.gmane.org/gmane.linux.kernel/1138771
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Gu Rui <chaos.proton@gmail.com>
> Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
> Reported-by: Michael Leun <lkml20100708@newton.leun.net>
> Cc: Jian Peng <jipeng2005@gmail.com>
> Cc: Jeff Garzik <jgarzik@pobox.com>
> ---
> As we're already in -rc7, I'm sending the revert patch to both Jeff
> and Linus.
>
> Thank you.
>
>  drivers/ata/libahci.c |   21 ---------------------
>  1 files changed, 0 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index ff9d832..d38c40f 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -561,27 +561,6 @@ void ahci_start_engine(struct ata_port *ap)
>  {
>        void __iomem *port_mmio = ahci_port_base(ap);
>        u32 tmp;
> -       u8 status;
> -
> -       status = readl(port_mmio + PORT_TFDATA) & 0xFF;
> -
> -       /*
> -        * At end of section 10.1 of AHCI spec (rev 1.3), it states
> -        * Software shall not set PxCMD.ST to 1 until it is determined
> -        * that a functoinal device is present on the port as determined by
> -        * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
> -        *
> -        * Even though most AHCI host controllers work without this check,
> -        * specific controller will fail under this condition
> -        */
> -       if (status & (ATA_BUSY | ATA_DRQ))
> -               return;
> -       else {
> -               ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
> -
> -               if ((tmp & 0xf) != 0x3)
> -                       return;
> -       }
>
>        /* start DMA */
>        tmp = readl(port_mmio + PORT_CMD);
> --
> 1.7.1
>
>
Rafael Wysocki May 15, 2011, 9:25 a.m. UTC | #4
On Sunday, May 15, 2011, Jian Peng wrote:
> Hi, Michael/Rafael/Gu,
> 
> Could you help me understand the testing environment? I would like to
> reproduce it if possible and debug it. The host controller this patch deals
> with is not used in PCs, so my testing environment is quite different.

Well, the system I can reproduce it on is a production one, so I can't
really test too much on it.  What exactly would you like to do?

Rafael
lkml20100708@newton.leun.net May 15, 2011, 12:20 p.m. UTC | #5
On Sun, 15 May 2011 11:25:19 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, May 15, 2011, Jian Peng wrote:
> > Hi, Michael/Rafael/Gu,
> > 
> > Could you help me understand the testing environment? I would like to
> > reproduce it if possible and debug it. The host controller this
> > patch deals with is not used in PCs, so my testing environment
> > is quite different.

We have seen it on at least 3 different controllers from at least 2
different vendors (I do not know what hardware Gu has).

So, for now, it looks like it happens on more controllers than not.

Testing environment - uhm, what shall I say? Each of the two affected
notebooks runs 64-bit openSuSE 11.4 in a more or less default
configuration (apart from the kernel, of course).

Just run "ls -lR /" and watch the kernel log while plugging/unplugging
power.

Maybe running a 64-bit system/kernel is part of the picture? As far as I
can see, all affected machines run 64-bit.

> Well, the system I can reproduce it on is a production one, so I can't
> really test too much on it.  What exactly would you like to do?

I've at least one machine available where I can test. So, if there is
anything (preferably non destructive ;-) ) you want me to test this
should be possible.

Maybe I'll have a look if my netbook (running 32bit) is also affected.
Valdis Klētnieks May 16, 2011, 5:02 p.m. UTC | #6
On Sat, 14 May 2011 12:28:04 +0200, Tejun Heo said:
> This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
> 
> The commit causes command timeouts on AC plug/unplug.  It isn't yet
> clear why.  As the commit was for a single rather obscure controller,
> revert the change for now.
> 
> The problem was reported and bisected by Gu Rui in bug#34692.
> 
>  https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> Also, reported by Rafael and Michael in the following thread.
> 
>  http://thread.gmane.org/gmane.linux.kernel/1138771

This also fixes the issue I had with a 10-second pause on my Dell Latitude E6500
laptop at boot while detecting the CD/DVD drive.
Jian Peng May 18, 2011, 7:23 p.m. UTC | #7
Hi, Valdis/Rafael/Michael,

Could you help me test the following change?

After reverting 81ca7e4, add a 5 ms delay as follows, since that also seems
to fix the issue on my SATA host controller that requires 81ca7e4.

In drivers/ata/libahci.c, inside ahci_hardreset() function,


	ahci_start_engine(ap);

	msleep(5);

	if (online)
		*class = ahci_dev_classify(ap);

My host controller needs some time to switch its internal state before it is
ready. Please let me know your test results.

Thanks,
Jian
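
The ordering proposed above can be modeled in userspace. This is only a
sketch under stated assumptions: fake_port, fake_start_engine(),
fake_msleep(), fake_classify() and hardreset_tail() are hypothetical
stand-ins, not libata functions. The clock is simulated, and the premise
that classification fails until the controller has settled is an assumption
based on the description of a controller that needs time to switch its
internal state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a port whose controller needs settle time. */
struct fake_port {
	uint64_t now_ms;        /* simulated clock, advanced by fake_msleep() */
	uint64_t started_at_ms; /* simulated time at which the engine started */
	uint64_t settle_ms;     /* assumed time the controller needs to be ready */
	bool started;
};

enum dev_class { DEV_UNKNOWN = 0, DEV_ATA = 1 };

/* Stand-in for ahci_start_engine(): only records when it was called. */
static void fake_start_engine(struct fake_port *p)
{
	p->started = true;
	p->started_at_ms = p->now_ms;
}

/* Stand-in for msleep(): advances the simulated clock, does not block. */
static void fake_msleep(struct fake_port *p, unsigned int ms)
{
	p->now_ms += ms;
}

/* Stand-in for ahci_dev_classify(): assumed to fail before settling. */
static enum dev_class fake_classify(const struct fake_port *p)
{
	if (!p->started)
		return DEV_UNKNOWN;
	if (p->now_ms - p->started_at_ms < p->settle_ms)
		return DEV_UNKNOWN;
	return DEV_ATA;
}

/* Models the proposed tail of the reset path: start, delay, classify. */
enum dev_class hardreset_tail(struct fake_port *p, bool online,
			      unsigned int delay_ms)
{
	enum dev_class cls = DEV_UNKNOWN;

	assert(p != NULL);

	fake_start_engine(p);
	fake_msleep(p, delay_ms);

	if (online)
		cls = fake_classify(p);

	return cls;
}
```

In this model, a port with settle_ms = 3 is classified as DEV_UNKNOWN with
delay_ms = 0 and as DEV_ATA with delay_ms = 5. It says nothing about the
timing of real hardware.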



On Mon, May 16, 2011 at 10:02 AM, <Valdis.Kletnieks@vt.edu> wrote:

> On Sat, 14 May 2011 12:28:04 +0200, Tejun Heo said:
> > This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
> >
> > The commit causes command timeouts on AC plug/unplug.  It isn't yet
> > clear why.  As the commit was for a single rather obscure controller,
> > revert the change for now.
> >
> > The problem was reported and bisected by Gu Rui in bug#34692.
> >
> >  https://bugzilla.kernel.org/show_bug.cgi?id=34692
> >
> > Also, reported by Rafael and Michael in the following thread.
> >
> >  http://thread.gmane.org/gmane.linux.kernel/1138771
>
> This also fixes the issue I had with a 10-second pause on my Dell Latitude
> E6500
> laptop at boot while detecting the CD/DVD drive.
>
Rafael Wysocki May 18, 2011, 7:44 p.m. UTC | #8
On Wednesday, May 18, 2011, Jian Peng wrote:
> Hi, Valdis/Rafael/Michael,
> 
> Could you help me test the following change?
> 
> After reverting 81ca7e4, add a 5 ms delay as follows, since that also seems
> to fix the issue on my SATA host controller that requires 81ca7e4.
> 
> In drivers/ata/libahci.c, inside ahci_hardreset() function,
> 
> 
> 	ahci_start_engine(ap);
>
> 	msleep(5);
>
> 	if (online)
> 		*class = ahci_dev_classify(ap);
> 
> My host controller needs some time to switch its internal state before it is
> ready. Please let me know your test results.

Could you simply post a patch?

Rafael
Patch

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ff9d832..d38c40f 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -561,27 +561,6 @@  void ahci_start_engine(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	u32 tmp;
-	u8 status;
-
-	status = readl(port_mmio + PORT_TFDATA) & 0xFF;
-
-	/*
-	 * At end of section 10.1 of AHCI spec (rev 1.3), it states
-	 * Software shall not set PxCMD.ST to 1 until it is determined
-	 * that a functoinal device is present on the port as determined by
-	 * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
-	 *
-	 * Even though most AHCI host controllers work without this check,
-	 * specific controller will fail under this condition
-	 */
-	if (status & (ATA_BUSY | ATA_DRQ))
-		return;
-	else {
-		ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
-
-		if ((tmp & 0xf) != 0x3)
-			return;
-	}
 
 	/* start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
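
The precondition that the removed hunk enforced (PxTFD.STS.BSY=0,
PxTFD.STS.DRQ=0 and PxSSTS.DET=3h) can be sketched outside the kernel as a
pure function over register snapshots. This is a minimal illustration, not
the driver code: struct port_snapshot, device_ready_for_start() and the
STS_*/DET_* names are hypothetical. The bit positions follow libata's
ATA_BUSY (bit 7) and ATA_DRQ (bit 3), and the DET field is the low four
bits of PxSSTS, as in the removed "tmp & 0xf" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STS_BSY    0x80u /* same bit as libata's ATA_BUSY */
#define STS_DRQ    0x08u /* same bit as libata's ATA_DRQ */
#define DET_MASK   0x0fu /* PxSSTS.DET occupies bits 3:0 */
#define DET_PHY_UP 0x03u /* device detected, Phy communication established */

/* Hypothetical snapshot of the two registers the removed check read. */
struct port_snapshot {
	uint32_t tfd;  /* PxTFD: status byte in bits 7:0 */
	uint32_t ssts; /* PxSSTS: DET in bits 3:0 */
};

/*
 * Returns true when the snapshot satisfies the condition quoted from
 * section 10.1 of the AHCI spec in the removed comment: BSY=0, DRQ=0
 * and DET=3h.
 */
bool device_ready_for_start(const struct port_snapshot *s)
{
	uint8_t status;

	assert(s != NULL);

	status = (uint8_t)(s->tfd & 0xffu);

	if (status & (STS_BSY | STS_DRQ))
		return false;

	return (s->ssts & DET_MASK) == DET_PHY_UP;
}
```

The reverted commit returned from ahci_start_engine() without setting
PxCMD.ST whenever this condition was false at the moment of the call. The
thread does not establish why that led to the reported timeouts.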