ath10k: add modparam 'hw_csum' to make HW checksum configurable

Message ID 1450290051-15593-1-git-send-email-poh@qca.qualcomm.com (mailing list archive)
State Changes Requested

Commit Message

Peter Oh Dec. 16, 2015, 6:20 p.m. UTC
Some hardware, such as QCA988X and QCA99X0, cannot offload checksum
calculation when the frame format is not suitable for it, as is the
case for Mesh frames.
Hence add a module parameter, hw_csum, to make checksum offload
configurable at module registration time.

Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
---
 drivers/net/wireless/ath/ath10k/core.c | 6 ++++++
 drivers/net/wireless/ath/ath10k/core.h | 3 +++
 drivers/net/wireless/ath/ath10k/mac.c  | 3 ++-
 3 files changed, 11 insertions(+), 1 deletion(-)

Comments

Felix Fietkau Dec. 16, 2015, 6:27 p.m. UTC | #1
On 2015-12-16 19:20, Peter Oh wrote:
> Some hardwares such as QCA988X and QCA99X0 doesn't have
> capability of checksum offload when frame formats are not
> suitable for it such as Mesh frame.
> Hence add a module parameter, hw_csum, to make checksum offload
> configurable during module registration time.
> 
> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
How about, instead of inventing yet another crappy module parameter, you
call skb_checksum_help() in the driver in cases where the hardware is
unable to offload the checksum calculation?

That way the user has to worry about less driver-specific hackery ;)

- Felix
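
For readers unfamiliar with what a software fallback computes: the IP, TCP
and UDP checksums are built on the one's-complement sum described in
RFC 1071. The following is a userspace sketch of that sum only. It is not
the kernel's skb_checksum_help() implementation, and inet_checksum() is a
made-up name used purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: RFC 1071 one's-complement checksum over a byte buffer.
 * inet_checksum() is a hypothetical name, not a kernel function.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	/* Add the buffer as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is the high byte of a zero-padded word. */
	if (i < len)
		sum += (uint32_t)data[i] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	/* The checksum is the one's complement of the folded sum. */
	return (uint16_t)~sum;
}
```

For the eight-byte example in RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the folded
sum is 0xddf2, so the function returns 0x220d.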
Peter Oh Dec. 16, 2015, 8:29 p.m. UTC | #2
On 12/16/2015 10:27 AM, Felix Fietkau wrote:
> On 2015-12-16 19:20, Peter Oh wrote:
>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>> capability of checksum offload when frame formats are not
>> suitable for it such as Mesh frame.
>> Hence add a module parameter, hw_csum, to make checksum offload
>> configurable during module registration time.
>>
>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
> How about instead of inventing yet another crappy module parameter, you
> call skb_checksum_help() in the driver in cases where the hardware is
> unable to offload the checksum calculation.
>
> That way the user has to worry about less driver specific hackery ;)
That would be a good option for hardware that does not support HW
checksum, but my concern is that using the function adds extra work for
every packet on the critical data path when the HW does support
checksum, which would reduce throughput.
> - Felix
Thanks,
Peter
Felix Fietkau Dec. 16, 2015, 8:35 p.m. UTC | #3
On 2015-12-16 21:29, Peter Oh wrote:
> 
> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>> On 2015-12-16 19:20, Peter Oh wrote:
>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>> capability of checksum offload when frame formats are not
>>> suitable for it such as Mesh frame.
>>> Hence add a module parameter, hw_csum, to make checksum offload
>>> configurable during module registration time.
>>>
>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>> How about instead of inventing yet another crappy module parameter, you
>> call skb_checksum_help() in the driver in cases where the hardware is
>> unable to offload the checksum calculation.
>>
>> That way the user has to worry about less driver specific hackery ;)
> That will be good option for hardware not supporting HW checksum, but I 
> mind that using the function will add more workload per every packet on 
> critical data path when HW supports checksum resulting in throughput down.
I didn't mean calling it for every single frame in the data path.
What I'm suggesting is calling it selectively only for mesh frames, or
any other frames that the hardware cannot offload, and leaving the rest
for the hardware to process.

There should be no performance difference between disabling checksum
offload and calling skb_checksum_help from the driver.

- Felix
Peter Oh Dec. 16, 2015, 8:46 p.m. UTC | #4
On 12/16/2015 12:35 PM, Felix Fietkau wrote:
> On 2015-12-16 21:29, Peter Oh wrote:
>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>> capability of checksum offload when frame formats are not
>>>> suitable for it such as Mesh frame.
>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>> configurable during module registration time.
>>>>
>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>> How about instead of inventing yet another crappy module parameter, you
>>> call skb_checksum_help() in the driver in cases where the hardware is
>>> unable to offload the checksum calculation.
>>>
>>> That way the user has to worry about less driver specific hackery ;)
>> That will be good option for hardware not supporting HW checksum, but I
>> mind that using the function will add more workload per every packet on
>> critical data path when HW supports checksum resulting in throughput down.
> I didn't mean calling it for every single frame in the data path.
> What I'm suggesting is calling it selectively only for mesh frames, or
> any other frames that the hardware cannot offload, and leaving the rest
> for the hardware to process.
>
> There should be no performance difference between disabling checksum
> offload and calling skb_checksum_help from the driver.
To call it selectively for Mesh frames or interfaces, we would need to
add it in the mac80211 layer, e.g. in ieee80211_build_hdr(), since the
driver layer does not look at the interface type in the data path.
In that case it would also degrade throughput on HW that supports HW
checksum for Mesh.
> - Felix
Thanks,
Peter
Felix Fietkau Dec. 16, 2015, 8:53 p.m. UTC | #5
On 2015-12-16 21:46, Peter Oh wrote:
> 
> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>> On 2015-12-16 21:29, Peter Oh wrote:
>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>> capability of checksum offload when frame formats are not
>>>>> suitable for it such as Mesh frame.
>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>> configurable during module registration time.
>>>>>
>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>> How about instead of inventing yet another crappy module parameter, you
>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>> unable to offload the checksum calculation.
>>>>
>>>> That way the user has to worry about less driver specific hackery ;)
>>> That will be good option for hardware not supporting HW checksum, but I
>>> mind that using the function will add more workload per every packet on
>>> critical data path when HW supports checksum resulting in throughput down.
>> I didn't mean calling it for every single frame in the data path.
>> What I'm suggesting is calling it selectively only for mesh frames, or
>> any other frames that the hardware cannot offload, and leaving the rest
>> for the hardware to process.
>>
>> There should be no performance difference between disabling checksum
>> offload and calling skb_checksum_help from the driver.
> To call it selectively for Mesh frame or interface, we need to add it on 
> mac80211 layer such as ieee80211_build_hdr() since driver layer does not 
> care the interface type in data path.
No need to change mac80211 - it only touches the headers, and
skb_checksum_help does not care about that. The skb has enough
information for it to find the right range to calculate the checksum and
the place to store it.

> In that case it will also introduce throughput degrade to HW that 
> supports HW checksum for Mesh.
This doesn't make any sense to me. Are you saying that there's no way
for the driver to detect the cases where the hardware cannot do checksum
offloading? How is the user supposed to know when to change that module
parameter? Trial and error?

- Felix
Peter Oh Dec. 16, 2015, 9:19 p.m. UTC | #6
On 12/16/2015 12:53 PM, Felix Fietkau wrote:
> On 2015-12-16 21:46, Peter Oh wrote:
>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>> capability of checksum offload when frame formats are not
>>>>>> suitable for it such as Mesh frame.
>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>> configurable during module registration time.
>>>>>>
>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>> unable to offload the checksum calculation.
>>>>>
>>>>> That way the user has to worry about less driver specific hackery ;)
>>>> That will be good option for hardware not supporting HW checksum, but I
>>>> mind that using the function will add more workload per every packet on
>>>> critical data path when HW supports checksum resulting in throughput down.
>>> I didn't mean calling it for every single frame in the data path.
>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>> any other frames that the hardware cannot offload, and leaving the rest
>>> for the hardware to process.
>>>
>>> There should be no performance difference between disabling checksum
>>> offload and calling skb_checksum_help from the driver.
>> To call it selectively for Mesh frame or interface, we need to add it on
>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>> care the interface type in data path.
> No need to change mac80211 - it only touches the headers, and
> skb_checksum_help does not care about that. The skb has enough
> information for it to find the right range to calculate the checksum and
> the place to store it.
If you mean using the function for mesh frames only, without touching
mac80211, how do you suggest applying it only to mesh frames without
interfering with other data frames?
Can you share an example?
>
>> In that case it will also introduce throughput degrade to HW that
>> supports HW checksum for Mesh.
> This doesn't make any sense to me. Are you saying that there's no way
> for the driver to detect the cases where the hardware cannot do checksum
> offloading?
I mean the case where the HW supports checksum except for specific
frames such as Mesh. To make the driver handle both cases dynamically in
code, it needs extra code that checks whether each frame is Mesh or not.
Since this adds work in the data path, it will degrade the driver's
performance.
>   How is the user supposed to know when to change that module
> parameter? Trial and error?
>
> - Felix
Felix Fietkau Dec. 16, 2015, 9:54 p.m. UTC | #7
On 2015-12-16 22:19, Peter Oh wrote:
> 
> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>> On 2015-12-16 21:46, Peter Oh wrote:
>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>> capability of checksum offload when frame formats are not
>>>>>>> suitable for it such as Mesh frame.
>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>> configurable during module registration time.
>>>>>>>
>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>> unable to offload the checksum calculation.
>>>>>>
>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>> mind that using the function will add more workload per every packet on
>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>> I didn't mean calling it for every single frame in the data path.
>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>> for the hardware to process.
>>>>
>>>> There should be no performance difference between disabling checksum
>>>> offload and calling skb_checksum_help from the driver.
>>> To call it selectively for Mesh frame or interface, we need to add it on
>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>> care the interface type in data path.
>> No need to change mac80211 - it only touches the headers, and
>> skb_checksum_help does not care about that. The skb has enough
>> information for it to find the right range to calculate the checksum and
>> the place to store it.
> If mentioned to use the function to mesh frame only without touching 
> mac80211, then how do you suggest it to apply it only to mesh frame 
> without interfere other data frames?
> Can you share your example?
It's trivial - in ath10k_tx you do this:

if (vif->type == NL80211_IFTYPE_MESH_POINT &&
    skb->ip_summed == CHECKSUM_PARTIAL)
	skb_checksum_help(skb);

>>> In that case it will also introduce throughput degrade to HW that
>>> supports HW checksum for Mesh.
>> This doesn't make any sense to me. Are you saying that there's no way
>> for the driver to detect the cases where the hardware cannot do checksum
>> offloading?
> I'm saying the case that HW supports checksum except for specific frame 
> such as Mesh and to make driver support both case dynamically at code 
> level, it requires extra codes which need to check if the frame is Mesh 
> or not. Since this approach requires extra workload especially in data 
> path, it will degrade driver's performance.
The check is cheap enough that it will not have any visible impact. And
the improved user experience is certainly worth it ;)

- Felix
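
The check proposed above can be modelled outside the kernel to see its
behaviour in isolation. This is a userspace sketch under stated
assumptions: every model_* type and function is a stand-in invented here,
not the ath10k or mac80211 API, and the stub helper only records that a
software checksum would have been computed and clears the "partial" state.

```c
#include <stdbool.h>

/* Stand-ins for the interface type and checksum state; names are hypothetical. */
enum model_iftype { MODEL_IFTYPE_STATION, MODEL_IFTYPE_AP, MODEL_IFTYPE_MESH_POINT };
enum model_csum { MODEL_CSUM_NONE, MODEL_CSUM_PARTIAL };

struct model_vif {
	enum model_iftype type;
};

struct model_skb {
	enum model_csum ip_summed;
	unsigned int sw_csum_calls;	/* how often the fallback ran */
};

/* Stub for the software fallback: count the call and mark the checksum done. */
static void model_checksum_help(struct model_skb *skb)
{
	skb->sw_csum_calls++;
	skb->ip_summed = MODEL_CSUM_NONE;
}

/*
 * Run the software fallback only for mesh interfaces whose frame still
 * expects a checksum to be filled in. Returns true if the fallback ran.
 */
bool model_tx_fixup_csum(const struct model_vif *vif, struct model_skb *skb)
{
	if (vif->type == MODEL_IFTYPE_MESH_POINT &&
	    skb->ip_summed == MODEL_CSUM_PARTIAL) {
		model_checksum_help(skb);
		return true;
	}
	return false;
}
```

In this model, a frame on an AP or station interface passes through
untouched and is left for the hardware, while a mesh frame is completed in
software before it is handed on.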
Peter Oh Dec. 16, 2015, 11:50 p.m. UTC | #8
On 12/16/2015 01:54 PM, Felix Fietkau wrote:
> On 2015-12-16 22:19, Peter Oh wrote:
>> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>>> On 2015-12-16 21:46, Peter Oh wrote:
>>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>>> capability of checksum offload when frame formats are not
>>>>>>>> suitable for it such as Mesh frame.
>>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>>> configurable during module registration time.
>>>>>>>>
>>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>>> unable to offload the checksum calculation.
>>>>>>>
>>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>>> mind that using the function will add more workload per every packet on
>>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>>> I didn't mean calling it for every single frame in the data path.
>>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>>> for the hardware to process.
>>>>>
>>>>> There should be no performance difference between disabling checksum
>>>>> offload and calling skb_checksum_help from the driver.
>>>> To call it selectively for Mesh frame or interface, we need to add it on
>>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>>> care the interface type in data path.
>>> No need to change mac80211 - it only touches the headers, and
>>> skb_checksum_help does not care about that. The skb has enough
>>> information for it to find the right range to calculate the checksum and
>>> the place to store it.
>> If mentioned to use the function to mesh frame only without touching
>> mac80211, then how do you suggest it to apply it only to mesh frame
>> without interfere other data frames?
>> Can you share your example?
> It's trivial - in ath10k_tx you do this:
>
> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>      skb->ip_summed == CHECKSUM_PARTIAL)
> 	skb_checksum_help(skb);
Thank you, Felix, for the quick response.
I agree with your point about user experience,
but what would you do when ath10k gets a new chip that supports HW
checksum for Mesh?
>>>> In that case it will also introduce throughput degrade to HW that
>>>> supports HW checksum for Mesh.
>>> This doesn't make any sense to me. Are you saying that there's no way
>>> for the driver to detect the cases where the hardware cannot do checksum
>>> offloading?
>> I'm saying the case that HW supports checksum except for specific frame
>> such as Mesh and to make driver support both case dynamically at code
>> level, it requires extra codes which need to check if the frame is Mesh
>> or not. Since this approach requires extra workload especially in data
>> path, it will degrade driver's performance.
> The check is cheap enough that it will not have any visible impact. And
> the improved user experience is certainly worth it ;)
>
> - Felix
Thanks,
Peter
Felix Fietkau Dec. 16, 2015, 11:59 p.m. UTC | #9
On 2015-12-17 00:50, Peter Oh wrote:
> 
> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>> On 2015-12-16 22:19, Peter Oh wrote:
>>> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>>>> On 2015-12-16 21:46, Peter Oh wrote:
>>>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>>>> capability of checksum offload when frame formats are not
>>>>>>>>> suitable for it such as Mesh frame.
>>>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>>>> configurable during module registration time.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>>>> unable to offload the checksum calculation.
>>>>>>>>
>>>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>>>> mind that using the function will add more workload per every packet on
>>>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>>>> I didn't mean calling it for every single frame in the data path.
>>>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>>>> for the hardware to process.
>>>>>>
>>>>>> There should be no performance difference between disabling checksum
>>>>>> offload and calling skb_checksum_help from the driver.
>>>>> To call it selectively for Mesh frame or interface, we need to add it on
>>>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>>>> care the interface type in data path.
>>>> No need to change mac80211 - it only touches the headers, and
>>>> skb_checksum_help does not care about that. The skb has enough
>>>> information for it to find the right range to calculate the checksum and
>>>> the place to store it.
>>> If mentioned to use the function to mesh frame only without touching
>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>> without interfere other data frames?
>>> Can you share your example?
>> It's trivial - in ath10k_tx you do this:
>>
>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>      skb->ip_summed == CHECKSUM_PARTIAL)
>> 	skb_checksum_help(skb);
> Thank you Felix for the quick response.
> I agree on your user experience opinion,
> but what do you think when ath10k has a new chip supporting HW checksum 
> for Mesh?
Then you simply update the checks. What's the big deal?

- Felix
Michal Kazior Dec. 17, 2015, 7:29 a.m. UTC | #10
On 17 December 2015 at 00:50, Peter Oh <poh@codeaurora.org> wrote:
> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>> On 2015-12-16 22:19, Peter Oh wrote:
[...]
>>> If mentioned to use the function to mesh frame only without touching
>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>> without interfere other data frames?
>>> Can you share your example?
>>
>> It's trivial - in ath10k_tx you do this:
>>
>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>      skb->ip_summed == CHECKSUM_PARTIAL)
>>         skb_checksum_help(skb);
>
> Thank you Felix for the quick response.
> I agree on your user experience opinion,
> but what do you think when ath10k has a new chip supporting HW checksum for
> Mesh?

You can simply introduce a fw-feature flag saying
"supports_mesh_csum_offload" later and skip the skb_checksum_help() if
it's set.


Michał
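
A feature-flag gate of the kind suggested above could look roughly like
the following userspace sketch. The flag name, the bitmap layout and all
model_* identifiers are assumptions made for illustration; they are not
taken from ath10k's actual firmware feature list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit; not an existing ath10k firmware feature. */
enum model_fw_feature {
	MODEL_FW_FEATURE_MESH_CSUM_OFFLOAD = 0,
};

struct model_dev {
	uint32_t fw_features;	/* one bit per advertised feature */
};

/* Test a single feature bit in the device's feature bitmap. */
static bool model_fw_has(const struct model_dev *dev, enum model_fw_feature f)
{
	return (dev->fw_features >> f) & 1u;
}

/*
 * Decide whether the driver must checksum a frame in software:
 * only mesh frames that still need a checksum are candidates, and
 * they are skipped when the firmware advertises mesh checksum offload.
 */
bool model_needs_sw_csum(const struct model_dev *dev, bool is_mesh,
			 bool csum_partial)
{
	if (!is_mesh || !csum_partial)
		return false;
	return !model_fw_has(dev, MODEL_FW_FEATURE_MESH_CSUM_OFFLOAD);
}
```

With this shape, non-mesh frames never reach the flag test, so hardware
offload for AP and station traffic does not depend on the flag.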
Peter Oh Dec. 17, 2015, 9:55 p.m. UTC | #11
On 12/16/2015 11:29 PM, Michal Kazior wrote:
> On 17 December 2015 at 00:50, Peter Oh <poh@codeaurora.org> wrote:
>> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>>> On 2015-12-16 22:19, Peter Oh wrote:
> [...]
>>>> If mentioned to use the function to mesh frame only without touching
>>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>>> without interfere other data frames?
>>>> Can you share your example?
>>> It's trivial - in ath10k_tx you do this:
>>>
>>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>>       skb->ip_summed == CHECKSUM_PARTIAL)
>>>          skb_checksum_help(skb);

>> Thank you Felix for the quick response.
>> I agree on your user experience opinion,
>> but what do you think when ath10k has a new chip supporting HW checksum for
>> Mesh?
> You can simply introduce a fw-feature flag saying
> "supports_mesh_csum_offload" later and skip the skb_checksum_help() if
> it's set.
If we rely on a fw-feature flag, then we cannot use HW checksum at all,
even for AP/STA interfaces.

>
> Michał
Thanks,
Peter
Peter Oh Dec. 17, 2015, 10:01 p.m. UTC | #12
On 12/16/2015 03:59 PM, Felix Fietkau wrote:
> On 2015-12-17 00:50, Peter Oh wrote:
>> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>>> On 2015-12-16 22:19, Peter Oh wrote:
>>>> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>>>>> On 2015-12-16 21:46, Peter Oh wrote:
>>>>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>>>>> capability of checksum offload when frame formats are not
>>>>>>>>>> suitable for it such as Mesh frame.
>>>>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>>>>> configurable during module registration time.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>>>>> unable to offload the checksum calculation.
>>>>>>>>>
>>>>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>>>>> mind that using the function will add more workload per every packet on
>>>>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>>>>> I didn't mean calling it for every single frame in the data path.
>>>>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>>>>> for the hardware to process.
>>>>>>>
>>>>>>> There should be no performance difference between disabling checksum
>>>>>>> offload and calling skb_checksum_help from the driver.
>>>>>> To call it selectively for Mesh frame or interface, we need to add it on
>>>>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>>>>> care the interface type in data path.
>>>>> No need to change mac80211 - it only touches the headers, and
>>>>> skb_checksum_help does not care about that. The skb has enough
>>>>> information for it to find the right range to calculate the checksum and
>>>>> the place to store it.
>>>> If mentioned to use the function to mesh frame only without touching
>>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>>> without interfere other data frames?
>>>> Can you share your example?
>>> It's trivial - in ath10k_tx you do this:
>>>
>>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>>       skb->ip_summed == CHECKSUM_PARTIAL)
>>> 	skb_checksum_help(skb);
>> Thank you Felix for the quick response.
>> I agree on your user experience opinion,
>> but what do you think when ath10k has a new chip supporting HW checksum
>> for Mesh?
> Then you simply update the checks. What's the big deal?
Continually adding conditions to such a data path is not a good option.
I also reconsidered the user experience and concluded that this patch
won't disturb it, since products will ship with the proper module
settings. For instance, the parameter will be turned on if the product
supports it and otherwise turned off at shipping, so users don't need
to touch it.
In addition, enterprise customers do care about even a very small
performance drop or gain, especially when they are running BMT among
vendors.
So, in my opinion, we need to avoid adding extra code to the data path.
> - Felix
Felix Fietkau Dec. 17, 2015, 10:57 p.m. UTC | #13
On 2015-12-17 23:01, Peter Oh wrote:
> 
> On 12/16/2015 03:59 PM, Felix Fietkau wrote:
>> On 2015-12-17 00:50, Peter Oh wrote:
>>> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>>>> On 2015-12-16 22:19, Peter Oh wrote:
>>>>> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>>>>>> On 2015-12-16 21:46, Peter Oh wrote:
>>>>>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>>>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>>>>>> capability of checksum offload when frame formats are not
>>>>>>>>>>> suitable for it such as Mesh frame.
>>>>>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>>>>>> configurable during module registration time.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>>>>>> unable to offload the checksum calculation.
>>>>>>>>>>
>>>>>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>>>>>> mind that using the function will add more workload per every packet on
>>>>>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>>>>>> I didn't mean calling it for every single frame in the data path.
>>>>>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>>>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>>>>>> for the hardware to process.
>>>>>>>>
>>>>>>>> There should be no performance difference between disabling checksum
>>>>>>>> offload and calling skb_checksum_help from the driver.
>>>>>>> To call it selectively for Mesh frame or interface, we need to add it on
>>>>>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>>>>>> care the interface type in data path.
>>>>>> No need to change mac80211 - it only touches the headers, and
>>>>>> skb_checksum_help does not care about that. The skb has enough
>>>>>> information for it to find the right range to calculate the checksum and
>>>>>> the place to store it.
>>>>> If mentioned to use the function to mesh frame only without touching
>>>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>>>> without interfere other data frames?
>>>>> Can you share your example?
>>>> It's trivial - in ath10k_tx you do this:
>>>>
>>>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>>>       skb->ip_summed == CHECKSUM_PARTIAL)
>>>> 	skb_checksum_help(skb);
>>> Thank you Felix for the quick response.
>>> I agree on your user experience opinion,
>>> but what do you think when ath10k has a new chip supporting HW checksum
>>> for Mesh?
>> Then you simply update the checks. What's the big deal?
> keep adding condition to such data path is not a good option.
> I also considered again about user experiences and reached to that this 
> patch won't disturb user experience since the products will ship with 
> proper module settings. for instance the parameter will be turned on if 
> product support it other wise will be turned off as they shipped, so 
> that users don't need to touch it.
I think the point you were missing is the one that there is no such
thing as a proper setting for this module parameter, since it doesn't
really depend much on the hardware or the product, but on the wifi mode
that you are using.

> In addition, for enterprise customers, they do care even a very small 
> performance drop or enhancement especially when they are running BMT 
> among vendors.
> So we need to avoid adding extra codes in data path in my opinion.
The regular data tx path already checks ar->dev_flags to decide whether
to use raw mode or not. This means that this part of the data structure
is already cached. The vif type is also cached, since it's accessed in
the same part of the function.
Because of that, the impact of adding an extra check even for a hardware
capability will be so low, that I'm pretty sure you will not be able to
measure it. And even if it were measurable, it's probably quite easy to
find a few places to optimize.

I find the tradeoff you are making very odd: For users that don't know
about the module parameter (depending on the default value) it either
just randomly doesn't work in mesh or always runs with degraded
performance. All this to save adding a check that will be completely
irrelevant for performance, since it won't result in any extra cache
stalls (which are the typical bottleneck in the data path).

- Felix
Peter Oh Dec. 17, 2015, 11:16 p.m. UTC | #14
On 12/17/2015 02:57 PM, Felix Fietkau wrote:
> On 2015-12-17 23:01, Peter Oh wrote:
>> On 12/16/2015 03:59 PM, Felix Fietkau wrote:
>>> On 2015-12-17 00:50, Peter Oh wrote:
>>>> On 12/16/2015 01:54 PM, Felix Fietkau wrote:
>>>>> On 2015-12-16 22:19, Peter Oh wrote:
>>>>>> On 12/16/2015 12:53 PM, Felix Fietkau wrote:
>>>>>>> On 2015-12-16 21:46, Peter Oh wrote:
>>>>>>>> On 12/16/2015 12:35 PM, Felix Fietkau wrote:
>>>>>>>>> On 2015-12-16 21:29, Peter Oh wrote:
>>>>>>>>>> On 12/16/2015 10:27 AM, Felix Fietkau wrote:
>>>>>>>>>>> On 2015-12-16 19:20, Peter Oh wrote:
>>>>>>>>>>>> Some hardwares such as QCA988X and QCA99X0 doesn't have
>>>>>>>>>>>> capability of checksum offload when frame formats are not
>>>>>>>>>>>> suitable for it such as Mesh frame.
>>>>>>>>>>>> Hence add a module parameter, hw_csum, to make checksum offload
>>>>>>>>>>>> configurable during module registration time.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
>>>>>>>>>>> How about instead of inventing yet another crappy module parameter, you
>>>>>>>>>>> call skb_checksum_help() in the driver in cases where the hardware is
>>>>>>>>>>> unable to offload the checksum calculation.
>>>>>>>>>>>
>>>>>>>>>>> That way the user has to worry about less driver specific hackery ;)
>>>>>>>>>> That will be good option for hardware not supporting HW checksum, but I
>>>>>>>>>> mind that using the function will add more workload per every packet on
>>>>>>>>>> critical data path when HW supports checksum resulting in throughput down.
>>>>>>>>> I didn't mean calling it for every single frame in the data path.
>>>>>>>>> What I'm suggesting is calling it selectively only for mesh frames, or
>>>>>>>>> any other frames that the hardware cannot offload, and leaving the rest
>>>>>>>>> for the hardware to process.
>>>>>>>>>
>>>>>>>>> There should be no performance difference between disabling checksum
>>>>>>>>> offload and calling skb_checksum_help from the driver.
>>>>>>>> To call it selectively for Mesh frame or interface, we need to add it on
>>>>>>>> mac80211 layer such as ieee80211_build_hdr() since driver layer does not
>>>>>>>> care the interface type in data path.
>>>>>>> No need to change mac80211 - it only touches the headers, and
>>>>>>> skb_checksum_help does not care about that. The skb has enough
>>>>>>> information for it to find the right range to calculate the checksum and
>>>>>>> the place to store it.
>>>>>> If mentioned to use the function to mesh frame only without touching
>>>>>> mac80211, then how do you suggest it to apply it only to mesh frame
>>>>>> without interfere other data frames?
>>>>>> Can you share your example?
>>>>> It's trivial - in ath10k_tx you do this:
>>>>>
>>>>> if (vif->type == NL80211_IFTYPE_MESH_POINT &&
>>>>>        skb->ip_summed == CHECKSUM_PARTIAL)
>>>>> 	skb_checksum_help(skb);
>>>> Thank you Felix for the quick response.
>>>> I agree on your user experience opinion,
>>>> but what do you think when ath10k has a new chip supporting HW checksum
>>>> for Mesh?
>>> Then you simply update the checks. What's the big deal?
>> keep adding condition to such data path is not a good option.
>> I also considered again about user experiences and reached to that this
>> patch won't disturb user experience since the products will ship with
>> proper module settings. for instance the parameter will be turned on if
>> product support it other wise will be turned off as they shipped, so
>> that users don't need to touch it.
> I think the point you were missing is the one that there is no such
> thing as a proper setting for this module parameter, since it doesn't
> really depend much on the hardware or the product, but on the wifi mode
> that you are using.
>
>> In addition, for enterprise customers, they do care even a very small
>> performance drop or enhancement especially when they are running BMT
>> among vendors.
>> So we need to avoid adding extra codes in data path in my opinion.
> The regular data tx path already checks ar->dev_flags to decide whether
> to use raw mode or not. This means that this part of the data structure
> is already cached. The vif type is also cached, since it's accessed in
> the same part of the function.
> Because of that, the impact of adding an extra check even for a hardware
> capability will be so low, that I'm pretty sure you will not be able to
> measure it. And even if it were measurable, it's probably quite easy to
> find a few places to optimize.
>
> I find the tradeoff you are making very odd: For users that don't know
> about the module parameter (depending on the default value) it either
> just randomly doesn't work in mesh or always runs with degraded
> performance. All this to save adding a check that will be completely
> irrelevant for performance, since it won't result in any extra cache
> stalls (which are the typical bottleneck in the data path).
Thank you for your comments and ideas.
I'll spend more time working out a better solution based on your and
Michal's feedback.
> - Felix
Peter
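
For reference, the per-frame fallback Felix proposes in the thread can be
modelled in plain userspace C. This is only a sketch under stated
assumptions: the model_* types and functions are hypothetical stand-ins
(not kernel or ath10k APIs), the checksum is the RFC 1071 Internet checksum
over the whole payload, and the result is stored in a struct field rather
than written into the packet at an offset as the kernel's
skb_checksum_help() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the interface type and skb->ip_summed. */
enum model_iftype { MODEL_IFTYPE_STATION, MODEL_IFTYPE_MESH_POINT };
enum model_csum { MODEL_CSUM_NONE, MODEL_CSUM_PARTIAL };

struct model_frame {
	enum model_csum ip_summed; /* MODEL_CSUM_PARTIAL: checksum still pending */
	const uint8_t *payload;
	size_t len;
	uint16_t csum;             /* filled in by the software fallback */
};

/* RFC 1071 Internet checksum: one's-complement sum of big-endian 16-bit words. */
static uint16_t model_inet_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8; /* pad the odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);  /* fold carries back in */
	return (uint16_t)~sum;
}

/*
 * Model of the selective check: only mesh frames with a pending checksum
 * take the software path; everything else is left for the hardware.
 * Returns 1 if the checksum was computed in software, 0 otherwise.
 */
static int model_tx_fixup(enum model_iftype type, struct model_frame *f)
{
	assert(f != NULL && f->payload != NULL);

	if (type == MODEL_IFTYPE_MESH_POINT &&
	    f->ip_summed == MODEL_CSUM_PARTIAL) {
		f->csum = model_inet_csum(f->payload, f->len);
		f->ip_summed = MODEL_CSUM_NONE; /* nothing left to offload */
		return 1;
	}
	return 0;
}
```

In this model a station frame passes through untouched, so the only added
cost on the common path is the two comparisons, which is the point Felix
makes about the check being cheap.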

Patch

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index fca702c..fcfccd8 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -35,18 +35,21 @@  static unsigned int ath10k_cryptmode_param;
 static bool uart_print;
 static bool skip_otp;
 static bool rawmode;
+static bool hw_csum = true;
 
 module_param_named(debug_mask, ath10k_debug_mask, uint, 0644);
 module_param_named(cryptmode, ath10k_cryptmode_param, uint, 0644);
 module_param(uart_print, bool, 0644);
 module_param(skip_otp, bool, 0644);
 module_param(rawmode, bool, 0644);
+module_param(hw_csum, bool, 0644);
 
 MODULE_PARM_DESC(debug_mask, "Debugging mask");
 MODULE_PARM_DESC(uart_print, "Uart target debugging");
 MODULE_PARM_DESC(skip_otp, "Skip otp failure for calibration in testmode");
 MODULE_PARM_DESC(cryptmode, "Crypto mode: 0-hardware, 1-software");
 MODULE_PARM_DESC(rawmode, "Use raw 802.11 frame datapath");
+MODULE_PARM_DESC(hw_csum, "Enable HW checksum offload (default: on)");
 
 static const struct ath10k_hw_params ath10k_hw_params_list[] = {
 	{
@@ -1405,6 +1408,9 @@  static int ath10k_core_init_firmware_features(struct ath10k *ar)
 		ar->htt.max_num_amsdu = 1;
 	}
 
+	if (!hw_csum)
+		set_bit(ATH10K_FLAG_HW_CSUM_DISABLED, &ar->dev_flags);
+
 	/* Backwards compatibility for firmwares without
 	 * ATH10K_FW_IE_WMI_OP_VERSION.
 	 */
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index 3c8a510..1972439 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -535,6 +535,9 @@  enum ath10k_dev_flags {
 
 	/* Bluetooth coexistance enabled */
 	ATH10K_FLAG_BTCOEX,
+
+	/* Do not use checksum offload */
+	ATH10K_FLAG_HW_CSUM_DISABLED,
 };
 
 enum ath10k_cal_mode {
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index a4c5c1d..f87f521 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -7332,7 +7332,8 @@  int ath10k_mac_register(struct ath10k *ar)
 		goto err_free;
 	}
 
-	if (!test_bit(ATH10K_FLAG_RAW_MODE, &ar->dev_flags))
+	if (!test_bit(ATH10K_FLAG_RAW_MODE, &ar->dev_flags) &&
+	    !test_bit(ATH10K_FLAG_HW_CSUM_DISABLED, &ar->dev_flags))
 		ar->hw->netdev_features = NETIF_F_HW_CSUM;
 
 	if (config_enabled(CONFIG_ATH10K_DFS_CERTIFIED)) {
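
As a rough illustration of what the patch above changes, the gating in
ath10k_mac_register() can be modelled in userspace C. This is a sketch, not
driver code: the model_* names and bit positions are hypothetical, and the
two booleans stand in for the ATH10K_FLAG_RAW_MODE device flag and the
hw_csum module parameter.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the real enum ath10k_dev_flags has more entries. */
enum model_dev_flags {
	MODEL_FLAG_RAW_MODE,
	MODEL_FLAG_HW_CSUM_DISABLED,
};

#define MODEL_F_HW_CSUM 0x1u /* stand-in for NETIF_F_HW_CSUM */

/*
 * Returns the feature bits the model would advertise. The decision is
 * made once, for the whole device, before any frame is transmitted.
 */
static unsigned int model_netdev_features(bool raw_mode_flag, bool hw_csum_param)
{
	unsigned long dev_flags = 0;

	if (raw_mode_flag)
		dev_flags |= 1UL << MODEL_FLAG_RAW_MODE;
	if (!hw_csum_param) /* mirrors the new check in the firmware-features hunk */
		dev_flags |= 1UL << MODEL_FLAG_HW_CSUM_DISABLED;

	/* Mirrors the extended condition in the mac.c hunk. */
	if (!(dev_flags & (1UL << MODEL_FLAG_RAW_MODE)) &&
	    !(dev_flags & (1UL << MODEL_FLAG_HW_CSUM_DISABLED)))
		return MODEL_F_HW_CSUM;
	return 0;
}
```

Because the result depends only on load-time settings and not on the
interface type in use, the same answer applies to mesh and non-mesh traffic
alike, which is the limitation raised in the review comments.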