[net,6/7] net: hns3: fix VF reset fail issue

Message ID	20231028025917.314305-7-shaojijie@huawei.com (mailing list archive)
State	Superseded
Delegated to:	Netdev Maintainers
Headers
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0D16610EA
	for <netdev@vger.kernel.org>; Sat, 28 Oct 2023 03:02:42 +0000 (UTC)
From: Jijie Shao <shaojijie@huawei.com>
To: <yisen.zhuang@huawei.com>, <salil.mehta@huawei.com>,
	<davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
	<pabeni@redhat.com>
CC: <shenjian15@huawei.com>, <wangjie125@huawei.com>,
	<liuyonglong@huawei.com>, <shaojijie@huawei.com>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: [PATCH net 6/7] net: hns3: fix VF reset fail issue
Date: Sat, 28 Oct 2023 10:59:16 +0800
Message-ID: <20231028025917.314305-7-shaojijie@huawei.com>
In-Reply-To: <20231028025917.314305-1-shaojijie@huawei.com>
References: <20231028025917.314305-1-shaojijie@huawei.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

Series	There are some bugfix for the HNS3 ethernet driver
[net,0/7] There are some bugfix for the HNS3 ethernet driver
[net,1/7] net: hns3: fix add VLAN fail issue
[net,2/7] net: hns3: add barrier in vf mailbox reply process
[net,3/7] net: hns3: fix incorrect capability bit display for copper port
[net,4/7] net: hns3: fix out-of-bounds access may occur when coalesce info is read via debugfs
[net,5/7] net: hns3: fix variable may not initialized problem in hns3_init_mac_addr()
[net,6/7] net: hns3: fix VF reset fail issue
[net,7/7] net: hns3: fix VF wrong speed and duplex issue

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net
netdev/fixes_present	success	Fixes tag present in non-next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1362 this patch: 1362
netdev/cc_maintainers	fail	2 blamed authors not CCed: zhangjiaran@huawei.com huangguangbin2@huawei.com; 3 maintainers not CCed: zhangjiaran@huawei.com lanhao@huawei.com huangguangbin2@huawei.com
netdev/build_clang	success	Errors and warnings before: 1386 this patch: 1386
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	Fixes tag looks correct
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1386 this patch: 1386
netdev/checkpatch	warning	WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Commit Message

Jijie Shao Oct. 28, 2023, 2:59 a.m. UTC

Currently, the reset process in hns3 and the firmware watchdog
initialization process are asynchronous. The driver assumes that firmware
watchdog initialization completes before the VF clears the interrupt
source. However, firmware initialization may not complete that early, so
the VF can receive multiple reset interrupts and fail to reset.

So add a delay before the VF clears the interrupt source; a 5 ms delay
is enough to avoid a second reset interrupt.

Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c   | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Paolo Abeni Nov. 2, 2023, 10:45 a.m. UTC | #1

On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
> Currently the reset process in hns3 and firmware watchdog init process is
> asynchronous. We think firmware watchdog initialization is completed
> before VF clear the interrupt source. However, firmware initialization
> may not complete early. So VF will receive multiple reset interrupts
> and fail to reset.
> 
> So we add delay before VF interrupt source and 5 ms delay
> is enough to avoid second reset interrupt.
> 
> Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
> ---
>  .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c   | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> index 1c62e58ff6d8..7b87da031be6 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> @@ -1924,8 +1924,14 @@ static void hclgevf_service_task(struct work_struct *work)
>  	hclgevf_mailbox_service_task(hdev);
>  }
>  
> -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
> +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
> +				      bool need_dalay)
>  {
> +#define HCLGEVF_RESET_DELAY		5
> +
> +	if (need_dalay)
> +		mdelay(HCLGEVF_RESET_DELAY);

5ms delay in an interrupt handler is quite a lot. What about scheduling
a timer from the IH to clear the register when such delay is needed?

Thanks!

Paolo

Jijie Shao Nov. 2, 2023, 12:16 p.m. UTC | #2

on 2023/11/2 18:45, Paolo Abeni wrote:
> On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
>>   
>> -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
>> +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
>> +				      bool need_dalay)
>>   {
>> +#define HCLGEVF_RESET_DELAY		5
>> +
>> +	if (need_dalay)
>> +		mdelay(HCLGEVF_RESET_DELAY);
> 5ms delay in an interrupt handler is quite a lot. What about scheduling
> a timer from the IH to clear the register when such delay is needed?
>
> Thanks!
>
> Paolo

Using a timer in this case will complicate the code and make maintenance difficult.

We are considering reducing the delay by polling instead. For example,
the code checks every 50 us whether the register write has taken effect.
If it has, the function returns immediately; otherwise it keeps polling
for up to 5 ms.

Is this method appropriate?

Thanks!
Jijie
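
The polling scheme proposed above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not driver code: `fake_reg`, `fake_write_clear()`, `fake_udelay()` and `clear_event_cause_poll()` are hypothetical names, the register is simulated (it ignores a configurable number of writes), and the delay only advances a simulated clock. In the driver, the write would go through its own register accessor and the wait would be a real busy-wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POLL_INTERVAL_US 50U   /* polling step taken from the proposal */
#define POLL_TIMEOUT_US  5000U /* 5 ms upper bound from the original patch */

/* Hypothetical stand-in for the VF event-source register. */
struct fake_reg {
	uint32_t value;      /* source bits still reported as pending */
	uint32_t busy_polls; /* number of clear writes that are ignored */
};

/* Hypothetical stand-in for a busy-wait: only advances a simulated clock. */
static void fake_udelay(uint32_t *elapsed_us, uint32_t us)
{
	*elapsed_us += us;
}

/* Simulated clear write: has no effect while the device is "busy". */
static void fake_write_clear(struct fake_reg *reg, uint32_t clr_bits)
{
	if (reg->busy_polls > 0) {
		reg->busy_polls--;
		return;
	}
	reg->value &= ~clr_bits;
}

/*
 * Write the clear value, then re-check every POLL_INTERVAL_US until the
 * bits read back as zero or POLL_TIMEOUT_US has elapsed.
 * Returns true if the clear took effect; *waited_us is the simulated wait.
 */
static bool clear_event_cause_poll(struct fake_reg *reg, uint32_t clr_bits,
				   uint32_t *waited_us)
{
	uint32_t elapsed = 0;

	assert(reg && waited_us);

	for (;;) {
		fake_write_clear(reg, clr_bits);
		if ((reg->value & clr_bits) == 0) {
			*waited_us = elapsed;
			return true;
		}
		if (elapsed >= POLL_TIMEOUT_US)
			break;
		fake_udelay(&elapsed, POLL_INTERVAL_US);
	}

	*waited_us = elapsed;
	return false;
}
```

In this model the common case returns after a few 50 us steps, but the worst case still spends the full 5 ms inside the caller, which in the patch is the interrupt handler.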

Paolo Abeni Nov. 2, 2023, 4:24 p.m. UTC | #3

On Thu, 2023-11-02 at 20:16 +0800, Jijie Shao wrote:
> on 2023/11/2 18:45, Paolo Abeni wrote:
> > On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
> > >   
> > > -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
> > > +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
> > > +				      bool need_dalay)
> > >   {
> > > +#define HCLGEVF_RESET_DELAY		5
> > > +
> > > +	if (need_dalay)
> > > +		mdelay(HCLGEVF_RESET_DELAY);
> > 5ms delay in an interrupt handler is quite a lot. What about scheduling
> > a timer from the IH to clear the register when such delay is needed?
> > 
> > Thanks!
> > 
> > Paolo
> 
> Using timer in this case will complicate the code and make maintenance difficult.

Why? 

Would something like the following be OK? (plus reset_timer
initialization at VF creation and cleanup at VF removal time):

---
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index a4d68fb216fb..626bc67065fc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1974,6 +1974,14 @@ static enum hclgevf_evt_cause hclgevf_check_evt_cause(struct hclgevf_dev *hdev,
 	return HCLGEVF_VECTOR0_EVENT_OTHER;
 }
 
+static void hclgevf_reset_timer(struct timer_list *t)
+{
+	struct hclgevf_dev *hdev = from_timer(hdev, t, reset_timer);
+
+	hclgevf_clear_event_cause(hdev, HCLGEVF_VECTOR0_EVENT_RST);
+	hclgevf_reset_task_schedule(hdev);
+}
+
 static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 {
 	enum hclgevf_evt_cause event_cause;
@@ -1982,13 +1990,13 @@ static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 
 	hclgevf_enable_vector(&hdev->misc_vector, false);
 	event_cause = hclgevf_check_evt_cause(hdev, &clearval);
+	if (event_cause == HCLGEVF_VECTOR0_EVENT_RST)
+		mod_timer(&hdev->reset_timer, jiffies + msecs_to_jiffies(5));
+
 	if (event_cause != HCLGEVF_VECTOR0_EVENT_OTHER)
 		hclgevf_clear_event_cause(hdev, clearval);
 
 	switch (event_cause) {
-	case HCLGEVF_VECTOR0_EVENT_RST:
-		hclgevf_reset_task_schedule(hdev);
-		break;
 	case HCLGEVF_VECTOR0_EVENT_MBX:
 		hclgevf_mbx_handler(hdev);
 		break;
---

> We consider reducing the delay time by polling. For example,
> the code cycles every 50 us to check whether the write register takes effect.
> If yes, the function returns immediately. or the code cycles until 5 ms.
> 
> Is this method appropriate?

IMHO such a solution will not remove the problem. How frequently is
the IRQ that incurs this delay expected to fire?

Thanks

Paolo
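
The control flow of the timer-based suggestion above can be modelled in user space as follows. This is a rough sketch under stated assumptions, not kernel code: `vf_model`, `misc_irq()`, `reset_timer_fn()` and `advance_ms()` are hypothetical names, time is a simulated millisecond counter, and re-arming is modelled as simply moving the expiry, in the spirit of mod_timer(). The point it illustrates is that the interrupt handler never waits; the clear and the reset scheduling happen later, from the timer callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESET_CLEAR_DELAY_MS 5U /* delay value taken from the thread */

enum evt_cause { EVT_RST, EVT_MBX, EVT_OTHER };

/* Hypothetical model of per-VF state; not the driver's structure. */
struct vf_model {
	uint64_t now_ms;          /* simulated clock */
	bool     timer_armed;     /* is the reset timer pending? */
	uint64_t timer_expires;   /* absolute expiry time in ms */
	bool     rst_pending;     /* reset source not yet cleared */
	unsigned reset_scheduled; /* how often the reset task was queued */
};

/* Timer callback model: clear the reset source, then queue the reset. */
static void reset_timer_fn(struct vf_model *vf)
{
	vf->rst_pending = false;
	vf->reset_scheduled++;
}

/* IRQ handler model: for a reset event it only (re)arms the timer. */
static void misc_irq(struct vf_model *vf, enum evt_cause cause)
{
	assert(vf);

	if (cause == EVT_RST) {
		vf->rst_pending = true;
		vf->timer_armed = true;
		vf->timer_expires = vf->now_ms + RESET_CLEAR_DELAY_MS;
	}
	/* Other causes would be handled immediately; omitted here. */
}

/* Advance simulated time and run the timer callback if it has expired. */
static void advance_ms(struct vf_model *vf, uint64_t ms)
{
	assert(vf);

	vf->now_ms += ms;
	if (vf->timer_armed && vf->now_ms >= vf->timer_expires) {
		vf->timer_armed = false;
		reset_timer_fn(vf);
	}
}
```

In this model, a second reset interrupt that arrives inside the 5 ms window only pushes the expiry out, so the two interrupts result in a single clear and a single scheduled reset.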

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 1c62e58ff6d8..7b87da031be6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1924,8 +1924,14 @@  static void hclgevf_service_task(struct work_struct *work)
 	hclgevf_mailbox_service_task(hdev);
 }
 
-static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
+static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
+				      bool need_dalay)
 {
+#define HCLGEVF_RESET_DELAY		5
+
+	if (need_dalay)
+		mdelay(HCLGEVF_RESET_DELAY);
+
 	hclgevf_write_dev(&hdev->hw, HCLGE_COMM_VECTOR0_CMDQ_SRC_REG, regclr);
 }
 
@@ -1990,7 +1996,8 @@  static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 	hclgevf_enable_vector(&hdev->misc_vector, false);
 	event_cause = hclgevf_check_evt_cause(hdev, &clearval);
 	if (event_cause != HCLGEVF_VECTOR0_EVENT_OTHER)
-		hclgevf_clear_event_cause(hdev, clearval);
+		hclgevf_clear_event_cause(hdev, clearval,
+					  event_cause == HCLGEVF_VECTOR0_EVENT_RST);
 
 	switch (event_cause) {
 	case HCLGEVF_VECTOR0_EVENT_RST:
@@ -2340,7 +2347,7 @@  static int hclgevf_misc_irq_init(struct hclgevf_dev *hdev)
 		return ret;
 	}
 
-	hclgevf_clear_event_cause(hdev, 0);
+	hclgevf_clear_event_cause(hdev, 0, false);
 
 	/* enable misc. vector(vector 0) */
 	hclgevf_enable_vector(&hdev->misc_vector, true);
