diff mbox series

[4/5] PCI: only return true when dev io state is really changed

Message ID 20200925023423.42675-5-haifeng.zhao@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Bjorn Helgaas
Series Fix DPC hotplug race and enhance error hanlding

Commit Message

Zhao, Haifeng Sept. 25, 2020, 2:34 a.m. UTC
When an uncorrectable error happens, the AER and DPC interrupt handlers
are both likely to call

   pcie_do_recovery() ->
     pci_walk_bus() ->
       report_frozen_detected()

with pci_channel_io_frozen at the same time.

If pci_dev_set_io_state() returns true even when the original state is
already pci_channel_io_frozen, the AER or DPC handler re-enters the
error detection and recovery procedure one after another, and the
recovery flow ends up mixed between AER and DPC.

So simplify pci_dev_set_io_state() to return true only when
dev->error_state actually changes.

Signed-off-by: Ethan Zhao <haifeng.zhao@intel.com>
Tested-by: Wen jin <wen.jin@intel.com>
Tested-by: Shanshan Zhang <ShanshanX.Zhang@intel.com>
---
 drivers/pci/pci.h | 31 +++----------------------------
 1 file changed, 3 insertions(+), 28 deletions(-)
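
To make the behavioural difference described in the commit message concrete, here is a minimal userspace sketch of the old and new transition rules. It is not kernel code: `enum io_state`, `struct model_dev`, `old_set_io_state()`, `new_set_io_state()` and `check_frozen_twice()` are hypothetical stand-ins, and the `device_lock_assert()` call from the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three pci_channel_io_* states in the patch. */
enum io_state { IO_NORMAL, IO_FROZEN, IO_PERM_FAILURE };

/* Hypothetical miniature of struct pci_dev: only the field being modelled. */
struct model_dev { enum io_state error_state; };

/* Model of the removed nested switch, condensed to the pairs it accepted. */
static bool old_set_io_state(struct model_dev *dev, enum io_state new)
{
	bool changed = false;

	switch (new) {
	case IO_PERM_FAILURE:
		changed = true;	/* accepted from every current state */
		break;
	case IO_FROZEN:
	case IO_NORMAL:
		/* accepted only from frozen or normal, including "no change" */
		changed = dev->error_state != IO_PERM_FAILURE;
		break;
	}
	if (changed)
		dev->error_state = new;
	return changed;
}

/* Model of the replacement: report true only if the state really changes. */
static bool new_set_io_state(struct model_dev *dev, enum io_state new)
{
	bool changed = false;

	if (dev->error_state != new) {
		dev->error_state = new;
		changed = true;
	}
	return changed;
}

/* Reporting "frozen" on an already frozen device, under both models. */
static void check_frozen_twice(void)
{
	struct model_dev a = { IO_FROZEN }, b = { IO_FROZEN };

	assert(old_set_io_state(&a, IO_FROZEN));	/* old: still "changed" */
	assert(!new_set_io_state(&b, IO_FROZEN));	/* new: not changed */
}
```

In this model the frozen -> frozen case is the one the commit message targets. Note that the simplified check also accepts transitions out of the permanent-failure state (for example perm_failure -> normal), which the removed switch rejected; whether that is intended is not stated in the commit message.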

Comments

Andy Shevchenko Sept. 25, 2020, 12:38 p.m. UTC | #1
On Thu, Sep 24, 2020 at 10:34:22PM -0400, Ethan Zhao wrote:
> When uncorrectable error happens, AER driver and DPC driver interrupt
> handlers likely call
>    pcie_do_recovery()->pci_walk_bus()->report_frozen_detected() with
> pci_channel_io_frozen the same time.

Call chains are easier to read if they are split like

   foo() ->
     bar() ->
       baz()

>    If pci_dev_set_io_state() return true even if the original state is
> pci_channel_io_frozen, that will cause AER or DPC handler re-enter
> the error detecting and recovery procedure one after another.
>    The result is the recovery flow mixed between AER and DPC.
> So simplify the pci_dev_set_io_state() function to only return true
> when dev->error_state is changed.

...

> +	if (dev->error_state != new) {
>  		dev->error_state = new;
> +		changed = true;
> +	}
>  	return changed;

Perhaps
	if (dev->error_state == new)
		return changed;

	dev->error_state = new;
	return true;

?
Alex G. Sept. 25, 2020, 1:56 p.m. UTC | #2
Hi Ethan,

On 9/24/20 9:34 PM, Ethan Zhao wrote:
> When uncorrectable error happens, AER driver and DPC driver interrupt
> handlers likely call
>     pcie_do_recovery()->pci_walk_bus()->report_frozen_detected() with
> pci_channel_io_frozen the same time.
>     If pci_dev_set_io_state() return true even if the original state is
> pci_channel_io_frozen, that will cause AER or DPC handler re-enter
> the error detecting and recovery procedure one after another.
>     The result is the recovery flow mixed between AER and DPC.
> So simplify the pci_dev_set_io_state() function to only return true
> when dev->error_state is changed.
> 
> Signed-off-by: Ethan Zhao <haifeng.zhao@intel.com>
> Tested-by: Wen jin <wen.jin@intel.com>
> Tested-by: Shanshan Zhang <ShanshanX.Zhang@intel.com>
> ---
>   drivers/pci/pci.h | 31 +++----------------------------
>   1 file changed, 3 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index fa12f7cbc1a0..d420bb977f3b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -362,35 +362,10 @@ static inline bool pci_dev_set_io_state(struct pci_dev *dev,
>   	bool changed = false;
>   
>   	device_lock_assert(&dev->dev);
> -	switch (new) {
> -	case pci_channel_io_perm_failure:
> -		switch (dev->error_state) {
> -		case pci_channel_io_frozen:
> -		case pci_channel_io_normal:
> -		case pci_channel_io_perm_failure:
> -			changed = true;
> -			break;
> -		}
> -		break;
> -	case pci_channel_io_frozen:
> -		switch (dev->error_state) {
> -		case pci_channel_io_frozen:
> -		case pci_channel_io_normal:
> -			changed = true;
> -			break;
> -		}
> -		break;
> -	case pci_channel_io_normal:
> -		switch (dev->error_state) {
> -		case pci_channel_io_frozen:
> -		case pci_channel_io_normal:
> -			changed = true;
> -			break;
> -		}
> -		break;
> -	}
> -	if (changed)
> +	if (dev->error_state != new) {
>   		dev->error_state = new;
> +		changed = true;
> +	}
>   	return changed;
>   }

The flow is a lot easier to follow now. Thank you.

Reviewed-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Zhao, Haifeng Sept. 27, 2020, 1:28 a.m. UTC | #3
Yes, better!

-----Original Message-----
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com> 
Sent: Friday, September 25, 2020 8:38 PM
To: Zhao, Haifeng <haifeng.zhao@intel.com>
Cc: bhelgaas@google.com; oohall@gmail.com; ruscur@russell.cc; lukas@wunner.de; stuart.w.hayes@gmail.com; mr.nuke.me@gmail.com; mika.westerberg@linux.intel.com; linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; Jia, Pei P <pei.p.jia@intel.com>
Subject: Re: [PATCH 4/5] PCI: only return true when dev io state is really changed

On Thu, Sep 24, 2020 at 10:34:22PM -0400, Ethan Zhao wrote:
> When uncorrectable error happens, AER driver and DPC driver interrupt 
> handlers likely call
>    pcie_do_recovery()->pci_walk_bus()->report_frozen_detected() with 
> pci_channel_io_frozen the same time.

Call chains are better to read if they split like

   foo() ->
     bar() ->
       baz()

>    If pci_dev_set_io_state() return true even if the original state is 
> pci_channel_io_frozen, that will cause AER or DPC handler re-enter the 
> error detecting and recovery procedure one after another.
>    The result is the recovery flow mixed between AER and DPC.
> So simplify the pci_dev_set_io_state() function to only return true 
> when dev->error_state is changed.

...

> +	if (dev->error_state != new) {
>  		dev->error_state = new;
> +		changed = true;
> +	}
>  	return changed;

Perhaps
	if (dev->error_state == new)
		return changed;

	dev->error_state = new;
	return true;

?


--
With Best Regards,
Andy Shevchenko

Patch

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fa12f7cbc1a0..d420bb977f3b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -362,35 +362,10 @@  static inline bool pci_dev_set_io_state(struct pci_dev *dev,
 	bool changed = false;
 
 	device_lock_assert(&dev->dev);
-	switch (new) {
-	case pci_channel_io_perm_failure:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-		case pci_channel_io_perm_failure:
-			changed = true;
-			break;
-		}
-		break;
-	case pci_channel_io_frozen:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-			changed = true;
-			break;
-		}
-		break;
-	case pci_channel_io_normal:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-			changed = true;
-			break;
-		}
-		break;
-	}
-	if (changed)
+	if (dev->error_state != new) {
 		dev->error_state = new;
+		changed = true;
+	}
 	return changed;
 }
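
As a rough illustration of the effect the commit message is after, the sketch below models two handlers (standing in for AER and DPC) that report a frozen device one after the other. It assumes the calls are serialized, as the `device_lock_assert()` in the patch implies, and uses the early-return form suggested in review comment #1. `struct model_dev`, `set_io_state()`, `handler_report_frozen()` and the `recoveries_started` counter are hypothetical names for this userspace model, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the pci_channel_io_* states. */
enum io_state { IO_NORMAL, IO_FROZEN, IO_PERM_FAILURE };

/* Hypothetical miniature device: state plus a counter for the demo. */
struct model_dev {
	enum io_state error_state;
	int recoveries_started;
};

/* Early-return variant of the simplified setter. */
static bool set_io_state(struct model_dev *dev, enum io_state new)
{
	if (dev->error_state == new)
		return false;	/* nothing changed */

	dev->error_state = new;
	return true;
}

/*
 * Hypothetical handler body: start recovery only if this caller is the
 * one that actually moved the device into the frozen state.
 */
static bool handler_report_frozen(struct model_dev *dev)
{
	if (!set_io_state(dev, IO_FROZEN))
		return false;	/* another handler already froze the device */

	dev->recoveries_started++;
	return true;
}
```

Calling `handler_report_frozen()` twice on a device that starts in the normal state starts recovery once in this model: the second caller sees no state change and backs off instead of re-entering the recovery flow.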