mbox series

[RFC,0/6] BOND TLS flags fixes

Message ID 20210526095747.22446-1-tariqt@nvidia.com (mailing list archive)
Headers show
Series BOND TLS flags fixes | expand

Message

Tariq Toukan May 26, 2021, 9:57 a.m. UTC
Hi,

This RFC series suggests a solution for the following problem:

Bond interface and lower interface are both up with TLS RX/TX offloads on.
TX/RX csum offload is turned off for the upper, hence RX/TX TLS is turned off
for it as well.
Yet, although it indicates that feature is disabled, new connections are still
offloaded by the lower, as Bond has no way to impact that:
Return value of bond_sk_get_lower_dev() is agnostic to this change.

One way to solve this issue, is to bring back the Bond TLS operations callbacks,
i.e. provide implementation for struct tlsdev_ops in Bond.
This gives full control for the Bond over its features, making it aware of every
new TLS connection offload request.
This direction was proposed in the original Bond TLS implementation, but dropped
during ML review. Probably it's right to re-consider now.

Here I suggest another solution, which requires generic changes out of the bond
driver.

Fixes in patches 1 and 4 are needed anyway, independently to which solution
we choose. I'll probably submit them separately soon.

Regards,
Tariq

Tariq Toukan (6):
  net: Fix features skip in for_each_netdev_feature()
  net: Disable TX TLS device offload on lower devices if disabled on the
    upper
  net: Disable RX TLS device offload on lower devices if disabled on the
    upper
  net/bond: Enable RXCSUM feature for bond
  net/bond: Allow explicit control of the TLS device offload features
  net/bond: Do not turn on TLS features in bond_fix_features()

 drivers/net/bonding/bond_main.c | 6 +++---
 include/linux/netdev_features.h | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

Jakub Kicinski May 27, 2021, 12:47 a.m. UTC | #1
On Wed, 26 May 2021 12:57:41 +0300 Tariq Toukan wrote:
> This RFC series suggests a solution for the following problem:
> 
> Bond interface and lower interface are both up with TLS RX/TX offloads on.
> TX/RX csum offload is turned off for the upper, hence RX/TX TLS is turned off
> for it as well.
> Yet, although it indicates that feature is disabled, new connections are still
> offloaded by the lower, as Bond has no way to impact that:
> Return value of bond_sk_get_lower_dev() is agnostic to this change.
> 
> One way to solve this issue, is to bring back the Bond TLS operations callbacks,
> i.e. provide implementation for struct tlsdev_ops in Bond.
> This gives full control for the Bond over its features, making it aware of every
> new TLS connection offload request.
> This direction was proposed in the original Bond TLS implementation, but dropped
> during ML review. Probably it's right to re-consider now.
> 
> Here I suggest another solution, which requires generic changes out of the bond
> driver.
> 
> Fixes in patches 1 and 4 are needed anyway, independently to which solution
> we choose. I'll probably submit them separately soon.

No opinions here, semantics of bond features were always clear 
as mud to me. What does it mean that bond survived 20 years without
rx-csum? And it so why would TLS offload be different from what one
may presume the semantics of rx-csum are today? 
Tariq Toukan May 27, 2021, 2:07 p.m. UTC | #2
On 5/27/2021 3:47 AM, Jakub Kicinski wrote:
> On Wed, 26 May 2021 12:57:41 +0300 Tariq Toukan wrote:
>> This RFC series suggests a solution for the following problem:
>>
>> Bond interface and lower interface are both up with TLS RX/TX offloads on.
>> TX/RX csum offload is turned off for the upper, hence RX/TX TLS is turned off
>> for it as well.
>> Yet, although it indicates that feature is disabled, new connections are still
>> offloaded by the lower, as Bond has no way to impact that:
>> Return value of bond_sk_get_lower_dev() is agnostic to this change.
>>
>> One way to solve this issue, is to bring back the Bond TLS operations callbacks,
>> i.e. provide implementation for struct tlsdev_ops in Bond.
>> This gives full control for the Bond over its features, making it aware of every
>> new TLS connection offload request.
>> This direction was proposed in the original Bond TLS implementation, but dropped
>> during ML review. Probably it's right to re-consider now.
>>
>> Here I suggest another solution, which requires generic changes out of the bond
>> driver.
>>
>> Fixes in patches 1 and 4 are needed anyway, independently to which solution
>> we choose. I'll probably submit them separately soon.
> 
> No opinions here, semantics of bond features were always clear
> as mud to me. What does it mean that bond survived 20 years without
> rx-csum? And it so why would TLS offload be different from what one
> may presume the semantics of rx-csum are today? 
Jakub Kicinski May 27, 2021, 5:56 p.m. UTC | #3
On Thu, 27 May 2021 17:07:06 +0300 Tariq Toukan wrote:
> On 5/27/2021 3:47 AM, Jakub Kicinski wrote:
> > On Wed, 26 May 2021 12:57:41 +0300 Tariq Toukan wrote:  
> >> This RFC series suggests a solution for the following problem:
> >>
> >> Bond interface and lower interface are both up with TLS RX/TX offloads on.
> >> TX/RX csum offload is turned off for the upper, hence RX/TX TLS is turned off
> >> for it as well.
> >> Yet, although it indicates that feature is disabled, new connections are still
> >> offloaded by the lower, as Bond has no way to impact that:
> >> Return value of bond_sk_get_lower_dev() is agnostic to this change.
> >>
> >> One way to solve this issue, is to bring back the Bond TLS operations callbacks,
> >> i.e. provide implementation for struct tlsdev_ops in Bond.
> >> This gives full control for the Bond over its features, making it aware of every
> >> new TLS connection offload request.
> >> This direction was proposed in the original Bond TLS implementation, but dropped
> >> during ML review. Probably it's right to re-consider now.
> >>
> >> Here I suggest another solution, which requires generic changes out of the bond
> >> driver.
> >>
> >> Fixes in patches 1 and 4 are needed anyway, independently to which solution
> >> we choose. I'll probably submit them separately soon.  
> > 
> > No opinions here, semantics of bond features were always clear
> > as mud to me. What does it mean that bond survived 20 years without
> > rx-csum? And it so why would TLS offload be different from what one
> > may presume the semantics of rx-csum are today?
> 
> Advanced device offloads have basic logical dependencies, that are 
> applied for all kind of netdevs, agnostic to internal details of each 
> netdev.
> 
> Nothing special with TLS really.
> TLS device offload behaves similarly to TSO (needs HW_CSUM), and GRO_HW 
> (needs RXCSUM).
> [...]

Right, the inter-dependency between features is obvious enough. 
What makes a feature be part of UPPER_DISABLES though?
Tariq Toukan May 30, 2021, 10:49 a.m. UTC | #4
On 5/27/2021 8:56 PM, Jakub Kicinski wrote:
> On Thu, 27 May 2021 17:07:06 +0300 Tariq Toukan wrote:
>> On 5/27/2021 3:47 AM, Jakub Kicinski wrote:
>>> On Wed, 26 May 2021 12:57:41 +0300 Tariq Toukan wrote:
>>>> This RFC series suggests a solution for the following problem:
>>>>
>>>> Bond interface and lower interface are both up with TLS RX/TX offloads on.
>>>> TX/RX csum offload is turned off for the upper, hence RX/TX TLS is turned off
>>>> for it as well.
>>>> Yet, although it indicates that feature is disabled, new connections are still
>>>> offloaded by the lower, as Bond has no way to impact that:
>>>> Return value of bond_sk_get_lower_dev() is agnostic to this change.
>>>>
>>>> One way to solve this issue, is to bring back the Bond TLS operations callbacks,
>>>> i.e. provide implementation for struct tlsdev_ops in Bond.
>>>> This gives full control for the Bond over its features, making it aware of every
>>>> new TLS connection offload request.
>>>> This direction was proposed in the original Bond TLS implementation, but dropped
>>>> during ML review. Probably it's right to re-consider now.
>>>>
>>>> Here I suggest another solution, which requires generic changes out of the bond
>>>> driver.
>>>>
>>>> Fixes in patches 1 and 4 are needed anyway, independently to which solution
>>>> we choose. I'll probably submit them separately soon.
>>>
>>> No opinions here, semantics of bond features were always clear
>>> as mud to me. What does it mean that bond survived 20 years without
>>> rx-csum? And it so why would TLS offload be different from what one
>>> may presume the semantics of rx-csum are today?
>>
>> Advanced device offloads have basic logical dependencies, that are
>> applied for all kind of netdevs, agnostic to internal details of each
>> netdev.
>>
>> Nothing special with TLS really.
>> TLS device offload behaves similarly to TSO (needs HW_CSUM), and GRO_HW
>> (needs RXCSUM).
>> [...]
> 
> Right, the inter-dependency between features is obvious enough.
> What makes a feature be part of UPPER_DISABLES though?
> 

Regarding UPPER_DISABLES:
I propose using it here as an attempt to give the bond device some 
control over kTLS offloaded connections, to avoid cases where:
(*) UPPER.ktls_device_offload==OFF
(*) LOWER.ktls_device_offload==ON
(*) Newly created connections are offloaded!! Simply ignoring and 
bypassing the UPPER device state (this is how .ndo_sk_get_lower_dev works).

This is not my preferred solution though.
I think we should reconsider introducing bond implementation for "struct 
tlsdev_ops" callbacks, which gives bond interface full control and 
awareness to new TLS connections.
Tariq Toukan June 6, 2021, 2:02 p.m. UTC | #5
On 5/30/2021 1:49 PM, Tariq Toukan wrote:
> 
> 
> On 5/27/2021 8:56 PM, Jakub Kicinski wrote:
>> On Thu, 27 May 2021 17:07:06 +0300 Tariq Toukan wrote:
>>> On 5/27/2021 3:47 AM, Jakub Kicinski wrote:
>>>> On Wed, 26 May 2021 12:57:41 +0300 Tariq Toukan wrote:
>>>>> This RFC series suggests a solution for the following problem:
>>>>>
>>>>> Bond interface and lower interface are both up with TLS RX/TX 
>>>>> offloads on.
>>>>> TX/RX csum offload is turned off for the upper, hence RX/TX TLS is 
>>>>> turned off
>>>>> for it as well.
>>>>> Yet, although it indicates that feature is disabled, new 
>>>>> connections are still
>>>>> offloaded by the lower, as Bond has no way to impact that:
>>>>> Return value of bond_sk_get_lower_dev() is agnostic to this change.
>>>>>
>>>>> One way to solve this issue, is to bring back the Bond TLS 
>>>>> operations callbacks,
>>>>> i.e. provide implementation for struct tlsdev_ops in Bond.
>>>>> This gives full control for the Bond over its features, making it 
>>>>> aware of every
>>>>> new TLS connection offload request.
>>>>> This direction was proposed in the original Bond TLS 
>>>>> implementation, but dropped
>>>>> during ML review. Probably it's right to re-consider now.
>>>>>
>>>>> Here I suggest another solution, which requires generic changes out 
>>>>> of the bond
>>>>> driver.
>>>>>
>>>>> Fixes in patches 1 and 4 are needed anyway, independently to which 
>>>>> solution
>>>>> we choose. I'll probably submit them separately soon.
>>>>
>>>> No opinions here, semantics of bond features were always clear
>>>> as mud to me. What does it mean that bond survived 20 years without
>>>> rx-csum? And it so why would TLS offload be different from what one
>>>> may presume the semantics of rx-csum are today?
>>>
>>> Advanced device offloads have basic logical dependencies, that are
>>> applied for all kind of netdevs, agnostic to internal details of each
>>> netdev.
>>>
>>> Nothing special with TLS really.
>>> TLS device offload behaves similarly to TSO (needs HW_CSUM), and GRO_HW
>>> (needs RXCSUM).
>>> [...]
>>
>> Right, the inter-dependency between features is obvious enough.
>> What makes a feature be part of UPPER_DISABLES though?
>>
> 
> Regarding UPPER_DISABLES:
> I propose using it here as an attempt to give the bond device some 
> control over kTLS offloaded connections, to avoid cases where:
> (*) UPPER.ktls_device_offload==OFF
> (*) LOWER.ktls_device_offload==ON
> (*) Newly created connections are offloaded!! Simply ignoring and 
> bypassing the UPPER device state (this is how .ndo_sk_get_lower_dev works).
> 
> This is not my preferred solution though.
> I think we should reconsider introducing bond implementation for "struct 
> tlsdev_ops" callbacks, which gives bond interface full control and 
> awareness to new TLS connections.

If you're fine with the direction, I can prepare a new series that adds 
"struct tlsdev_ops" callbacks implementation for bond, instead of the 
"UPPER_DISABLES solution" above.

What do you think?
Jakub Kicinski June 7, 2021, 7:37 p.m. UTC | #6
On Sun, 6 Jun 2021 17:02:49 +0300 Tariq Toukan wrote:
> > Regarding UPPER_DISABLES:
> > I propose using it here as an attempt to give the bond device some 
> > control over kTLS offloaded connections, to avoid cases where:
> > (*) UPPER.ktls_device_offload==OFF
> > (*) LOWER.ktls_device_offload==ON
> > (*) Newly created connections are offloaded!! Simply ignoring and 
> > bypassing the UPPER device state (this is how .ndo_sk_get_lower_dev works).
> > 
> > This is not my preferred solution though.
> > I think we should reconsider introducing bond implementation for "struct 
> > tlsdev_ops" callbacks, which gives bond interface full control and 
> > awareness to new TLS connections.  
> 
> If you're fine with the direction, I can prepare a new series that adds 
> "struct tlsdev_ops" callbacks implementation for bond, instead of the 
> "UPPER_DISABLES solution" above.
> 
> What do you think?

I think a design which requires teaching middle layer drivers about
individual hardware offloads for ULPs is poor, brittle and hard to
reason about.

But historically we seem to have acted in an ad-hoc fashion, so I 
won't hold it against you if you prefer to keep the whack-a-mole going.