[net-next,2/2] bonding: fix link recovery in mode 2 when updelay is nonzero

Message ID: cb89b92af89973ee049a696c362b4a2abfdd9b82.1668800711.git.jtoppins@redhat.com (mailing list archive)
State: Superseded
Delegated to: Netdev Maintainers
Series: bonding: fix bond recovery in mode 2

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 3 this patch: 3
netdev/cc_maintainers           success  CCed 8 of 8 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 3 this patch: 3
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 17 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Jonathan Toppins Nov. 18, 2022, 8:30 p.m. UTC
Before this change, when a bond in mode 2 lost link because all of its
slaves lost link, the bonding device would never recover, even after
updelay expired. This change ignores updelay when the bond currently
has no usable links, conforming to bonding.txt section 13.1,
paragraph 4.

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
---
 drivers/net/bonding/bond_main.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
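
The decision described in the commit message can be sketched in plain C, outside the kernel. This is only an illustration under stated assumptions: `struct sketch_bond`, `enum sketch_mode`, and `sketch_ignore_updelay()` are hypothetical stand-ins invented here, not kernel types or functions, and the real code reads RCU-protected pointers rather than plain fields.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the bonding driver's state. */
enum sketch_mode {
	SKETCH_MODE_ACTIVEBACKUP,
	SKETCH_MODE_OTHER,	/* e.g. balance-xor */
};

struct sketch_bond {
	enum sketch_mode mode;
	bool has_active_slave;		/* stands in for bond->curr_active_slave */
	bool has_usable_array;		/* stands in for bond->usable_slaves != NULL */
	unsigned int usable_count;	/* stands in for usable_slaves->count */
};

/*
 * Return true when updelay should be skipped for a link that is coming
 * back up: in active-backup mode when there is no active slave, in the
 * other modes when the usable-slaves array exists and is empty.
 */
static bool sketch_ignore_updelay(const struct sketch_bond *bond)
{
	if (bond->mode == SKETCH_MODE_ACTIVEBACKUP)
		return !bond->has_active_slave;

	return bond->has_usable_array && bond->usable_count == 0;
}
```

The helper returns a value on every path, so a caller never sees an indeterminate result when the bond still has usable links.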

Comments

Paolo Abeni Nov. 22, 2022, 10:59 a.m. UTC | #1
Hello,

Why are you targeting net-next? This looks like something suitable for
the -net tree to me. If so, could you please include a Fixes tag?

Note that we can add new self-tests even via the -net tree.

Thanks,

Paolo
Jonathan Toppins Nov. 22, 2022, 1:36 p.m. UTC | #2
I could not find a reasonable Fixes tag for this, which is why I
targeted the net-next tree.

-Jon
Paolo Abeni Nov. 22, 2022, 2:45 p.m. UTC | #3
When in doubt, I think it's preferable to point out a commit surely
affected by the issue, even if that is possibly not the one
introducing the issue, than to have no Fixes tag at all. The lack of a
tag will make the work more difficult for the stable teams.

In this specific case I think that:

Fixes: 41f891004063 ("bonding: ignore updelay param when there is no active slave")

should be ok, WDYT? If you agree, would you mind reposting for -net?

Thanks,

Paolo
Jonathan Toppins Nov. 22, 2022, 3:37 p.m. UTC | #4
Yes, that looks like a good one. I will repost a v2 to -net that also
reduces the number of ICMP echoes sent before the test fails.

Thanks,
-Jon
Nikolay Aleksandrov Nov. 22, 2022, 9:12 p.m. UTC | #5
One minor nit: could you please change "mode 2" to "mode balance-xor"?
It saves reviewers some grepping around the code to see what mode 2 is.
Obviously one has to dig in the code to see how it's affected, but it
is still a bit more understandable. It would also be nice to say more
about why the link is not recovered. I get it after reading the code,
but a more detailed explanation in the commit message would help.

Thanks,
 Nik
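
For readers without the source at hand, the numeric modes map to names as listed in the kernel bonding documentation. The lookup below is a small illustrative sketch, not kernel code: `sketch_bond_mode_names` and `sketch_bond_mode_name()` are hypothetical helpers introduced only for this example.

```c
#include <stddef.h>

/* Bonding mode numbers and their documented names, indexed by mode. */
static const char *const sketch_bond_mode_names[] = {
	"balance-rr",		/* mode 0 */
	"active-backup",	/* mode 1 */
	"balance-xor",		/* mode 2 */
	"broadcast",		/* mode 3 */
	"802.3ad",		/* mode 4 */
	"balance-tlb",		/* mode 5 */
	"balance-alb",		/* mode 6 */
};

/* Translate a numeric mode into its name, or "unknown" if out of range. */
static const char *sketch_bond_mode_name(unsigned int mode)
{
	size_t count = sizeof(sketch_bond_mode_names) /
		       sizeof(sketch_bond_mode_names[0]);

	return mode < count ? sketch_bond_mode_names[mode] : "unknown";
}
```

With this table, mode 2 resolves to balance-xor, the name requested for the commit message.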
Nikolay Aleksandrov Nov. 22, 2022, 9:15 p.m. UTC | #6
Ah, I just noticed I'm late to the party. :)
Never mind my comments, no need for a v3.
Jonathan Toppins Nov. 22, 2022, 9:17 p.m. UTC | #7
If there are other issues with v2, I will gladly include these comments
in a v3.

Thanks,
-Jon
Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1cd4e71916f8..6c4348245d1f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2529,7 +2529,16 @@  static int bond_miimon_inspect(struct bonding *bond)
 	struct slave *slave;
 	bool ignore_updelay;
 
-	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) {
+		ignore_updelay = !rcu_dereference(bond->curr_active_slave);
+	} else {
+		struct bond_up_slave *usable_slaves;
+
+		usable_slaves = rcu_dereference(bond->usable_slaves);
+
+		ignore_updelay = usable_slaves &&
+				 usable_slaves->count == 0;
+	}
 
 	bond_for_each_slave_rcu(bond, slave, iter) {
 		bond_propose_link_state(slave, BOND_LINK_NOCHANGE);