Message ID: 20230108103533.10104-1-darinzon@amazon.com (mailing list archive)
Series: Add devlink support to ena
On Sun, 8 Jan 2023 10:35:28 +0000 David Arinzon wrote:
> This patchset adds devlink support to the ena driver.
Wrong place, please take a look at
struct kernel_ethtool_ringparam::tx_push
and ETHTOOL_A_RINGS_TX_PUSH. I think you just want to configure
the max size of the TX push, right?
The reload is also an overkill, reload should re-register all driver
objects but the devlink instance, IIRC. You're not even unregistering
the netdev. You should handle this change the same way you handle any
ring size changes.
For future reference - if you ever _actually_ need devlink please use
the devl_* APIs and take the instance locks explicitly. There has not
been a single devlink reload implementation which would get locking
right using the devlink_* APIs
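For readers who are not familiar with the interface being pointed at, the
following is a minimal sketch of how a driver can expose a boolean "push
mode" through the ethtool ring API. kernel_ethtool_ringparam::tx_push,
ETHTOOL_RING_USE_TX_PUSH and the ethtool_ops hooks are real kernel
interfaces; the foo_* names and the ring-rebuild helper are hypothetical,
for illustration only, and are not ena code.

/* Illustrative only - hypothetical driver, not ena code. */
#include <linux/ethtool.h>
#include <linux/netdevice.h>

#define FOO_MAX_RING_SIZE	4096

struct foo_priv {
	u32 tx_ring_size;
	bool tx_push_enabled;
};

/* Stand-in for the driver's real "stop, resize, restart" path. */
static int foo_reconfigure_rings(struct foo_priv *priv, u32 tx_pending)
{
	priv->tx_ring_size = tx_pending;
	return 0;
}

static void foo_get_ringparam(struct net_device *dev,
			      struct ethtool_ringparam *ring,
			      struct kernel_ethtool_ringparam *kring,
			      struct netlink_ext_ack *extack)
{
	struct foo_priv *priv = netdev_priv(dev);

	ring->tx_max_pending = FOO_MAX_RING_SIZE;
	ring->tx_pending = priv->tx_ring_size;
	/* Reported to user space as ETHTOOL_A_RINGS_TX_PUSH. */
	kring->tx_push = priv->tx_push_enabled;
}

static int foo_set_ringparam(struct net_device *dev,
			     struct ethtool_ringparam *ring,
			     struct kernel_ethtool_ringparam *kring,
			     struct netlink_ext_ack *extack)
{
	struct foo_priv *priv = netdev_priv(dev);

	priv->tx_push_enabled = kring->tx_push;
	return foo_reconfigure_rings(priv, ring->tx_pending);
}

static const struct ethtool_ops foo_ethtool_ops = {
	/* Opting in: the ethtool core rejects ETHTOOL_A_RINGS_TX_PUSH for
	 * drivers that do not set this flag.
	 */
	.supported_ring_params	= ETHTOOL_RING_USE_TX_PUSH,
	.get_ringparam		= foo_get_ringparam,
	.set_ringparam		= foo_set_ringparam,
};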
> On Sun, 8 Jan 2023 10:35:28 +0000 David Arinzon wrote:
> > This patchset adds devlink support to the ena driver.
>
> Wrong place, please take a look at
>   struct kernel_ethtool_ringparam::tx_push
> and ETHTOOL_A_RINGS_TX_PUSH. I think you just want to configure
> the max size of the TX push, right?

Hi Jakub,

Thank you for the feedback. We're not configuring the max size of the
TX push, but effectively the maximal packet header size to be pushed to
the device. This is noted in the documentation on patch 5/5 in this
patchset. AFAIK, there's no relevant ethtool parameter for this
configuration.

> The reload is also an overkill, reload should re-register all driver
> objects but the devlink instance, IIRC. You're not even unregistering
> the netdev. You should handle this change the same way you handle any
> ring size changes.

The LLQ configuration is different from other configurations set via
ethtool (like queue size and number of queues). LLQ requires
re-negotiation with the device and requires a reset, which is not
performed in the ethtool configuration case. It may be possible to
unregister/register the netdev, but it is unnecessary in this case, as
most of the changes are reflected in the interface and structures
between the driver and the device.

> For future reference - if you ever _actually_ need devlink please use
> the devl_* APIs and take the instance locks explicitly. There has not
> been a single devlink reload implementation which would get locking
> right using the devlink_* APIs

On Tue, 10 Jan 2023 20:11:23 +0000 Arinzon, David wrote:
> > On Sun, 8 Jan 2023 10:35:28 +0000 David Arinzon wrote:
> > > This patchset adds devlink support to the ena driver.
> >
> > Wrong place, please take a look at
> >   struct kernel_ethtool_ringparam::tx_push
> > and ETHTOOL_A_RINGS_TX_PUSH. I think you just want to configure
> > the max size of the TX push, right?
>
> We're not configuring the max size of the TX push, but effectively the
> maximal packet header size to be pushed to the device.
> This is noted in the documentation on patch 5/5 in this patchset.
> AFAIK, there's no relevant ethtool parameter for this configuration.

Perhaps I should have complained about the low quality of that
documentation to make it clear that I have in fact read it :/

I read it again - and I still don't know what you're doing.
It sounds like inline header length configuration yet you also use LLQ
all over the place. And LLQ for ENA is documented as basically tx_push:

  - **Low Latency Queue (LLQ) mode or "push-mode":**

Please explain this in a way which assumes zero Amazon-specific
knowledge :(

> > The reload is also an overkill, reload should re-register all driver
> > objects but the devlink instance, IIRC. You're not even unregistering
> > the netdev. You should handle this change the same way you handle any
> > ring size changes.
>
> The LLQ configuration is different from other configurations set via
> ethtool (like queue size and number of queues). LLQ requires
> re-negotiation with the device and requires a reset, which is not
> performed in the ethtool configuration case.

What do you mean when you say that reset is not required in the ethtool
configuration case?

AFAIK ethtool config should not (although sadly very often it does)
cause any loss of unrelated configuration. But you can certainly reset
HW blocks or reneg features with FW or whatever else...

> It may be possible to unregister/register the netdev, but it is
> unnecessary in this case, as most of the changes are reflected in the
> interface and structures between the driver and the device.
>
> > For future reference - if you ever _actually_ need devlink please use
> > the devl_* APIs and take the instance locks explicitly. There has not
> > been a single devlink reload implementation which would get locking
> > right using the devlink_* APIs

> On Tue, 10 Jan 2023 20:11:23 +0000 Arinzon, David wrote:
> > > On Sun, 8 Jan 2023 10:35:28 +0000 David Arinzon wrote:
> > > > This patchset adds devlink support to the ena driver.
> > >
> > > Wrong place, please take a look at
> > >   struct kernel_ethtool_ringparam::tx_push
> > > and ETHTOOL_A_RINGS_TX_PUSH. I think you just want to configure
> > > the max size of the TX push, right?
> >
> > We're not configuring the max size of the TX push, but effectively the
> > maximal packet header size to be pushed to the device.
> > This is noted in the documentation on patch 5/5 in this patchset.
> > AFAIK, there's no relevant ethtool parameter for this configuration.
>
> Perhaps I should have complained about the low quality of that
> documentation to make it clear that I have in fact read it :/

Noted, we will see how to improve the documentation going forward.

> I read it again - and I still don't know what you're doing.
> It sounds like inline header length configuration yet you also use LLQ
> all over the place. And LLQ for ENA is documented as basically tx_push:
>
>   - **Low Latency Queue (LLQ) mode or "push-mode":**
>
> Please explain this in a way which assumes zero Amazon-specific
> knowledge :(

Low Latency Queue (LLQ) is a mode of operation in which the packet
headers (up to a defined length) are written directly to device memory.
Therefore, you are right, the description is similar to tx_push.
However, this is not a configurable option, while
ETHTOOL_A_RINGS_TX_PUSH configures whether to work in that mode or not.
If I'm understanding the intent behind ETHTOOL_A_RINGS_TX_PUSH and the
implementation in the driver that introduced the feature, it refers to a
push of the whole packet and not just the headers, which is not what the
ena driver does.

In this patchset, we allow the configuration of an extended size of the
Low Latency Queue, meaning we allow enabling another, larger,
pre-defined size to be used as the maximum size of the packet headers
pushed directly to device memory. The value itself is not configurable,
therefore it was defined as "large LLQ".

I hope this provides more clarification; if not, I'll be happy to
elaborate further.

> > > The reload is also an overkill, reload should re-register all driver
> > > objects but the devlink instance, IIRC. You're not even unregistering
> > > the netdev. You should handle this change the same way you handle any
> > > ring size changes.
> >
> > The LLQ configuration is different from other configurations set via
> > ethtool (like queue size and number of queues). LLQ requires
> > re-negotiation with the device and requires a reset, which is not
> > performed in the ethtool configuration case.
>
> What do you mean when you say that reset is not required in the ethtool
> configuration case?
>
> AFAIK ethtool config should not (although sadly very often it does)
> cause any loss of unrelated configuration. But you can certainly reset
> HW blocks or reneg features with FW or whatever else...

The ena driver currently supports various configurations via ethtool,
for example, changing the number of queues/channels and the
queue/channel size. These options require changes in the netdev and the
interfaces with the kernel, therefore we perform a reconfiguration at
that level. But these configurations do not require the interface
between the driver and the device to be reset. Changing the Low Latency
Queue mode from standard to large (what's being configured in this
patchset) is a more significant functionality change and requires this
reset.

> > It may be possible to unregister/register the netdev, but it is
> > unnecessary in this case, as most of the changes are reflected in the
> > interface and structures between the driver and the device.
> >
> > > For future reference - if you ever _actually_ need devlink please
> > > use the devl_* APIs and take the instance locks explicitly. There
> > > has not been a single devlink reload implementation which would get
> > > locking right using the devlink_* APIs

On Wed, 11 Jan 2023 08:58:46 +0000 Arinzon, David wrote:
> > I read it again - and I still don't know what you're doing.
> > It sounds like inline header length configuration yet you also use LLQ
> > all over the place. And LLQ for ENA is documented as basically tx_push:
> >
> >   - **Low Latency Queue (LLQ) mode or "push-mode":**
> >
> > Please explain this in a way which assumes zero Amazon-specific
> > knowledge :(
>
> Low Latency Queue (LLQ) is a mode of operation in which the packet
> headers (up to a defined length) are written directly to device memory.
> Therefore, you are right, the description is similar to tx_push.
> However, this is not a configurable option, while
> ETHTOOL_A_RINGS_TX_PUSH configures whether to work in that mode or not.
> If I'm understanding the intent behind ETHTOOL_A_RINGS_TX_PUSH and the
> implementation in the driver that introduced the feature, it refers to a
> push of the whole packet and not just the headers, which is not what the
> ena driver does.
>
> In this patchset, we allow the configuration of an extended size of the
> Low Latency Queue, meaning we allow enabling another, larger,
> pre-defined size to be used as the maximum size of the packet headers
> pushed directly to device memory. The value itself is not configurable,
> therefore it was defined as "large LLQ".
>
> I hope this provides more clarification; if not, I'll be happy to
> elaborate further.

Thanks, the large missing piece in my understanding is still what the
user-visible impact of this change is. Without increasing the LLQ entry
size, a user who sends packets with long headers will:

 a) see higher latency thru the NIC, but everything else is the same
 b) see higher latency and lower overall throughput in terms of PPS
 c) have limited access to offloads, because the device requires
    full access to headers via LLQ for some offloads

Which one of the three is the closest?

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, January 11, 2023 9:01 PM
> To: Arinzon, David <darinzon@amazon.com>
> Cc: David Miller <davem@davemloft.net>; netdev@vger.kernel.org;
> Machulsky, Zorik <zorik@amazon.com>; Matushevsky, Alexander
> <matua@amazon.com>; Bshara, Saeed <saeedb@amazon.com>; Bshara,
> Nafea <nafea@amazon.com>; Saidi, Ali <alisaidi@amazon.com>;
> Kiyanovski, Arthur <akiyano@amazon.com>; Dagan, Noam
> <ndagan@amazon.com>; Agroskin, Shay <shayagr@amazon.com>; Itzko,
> Shahar <itzko@amazon.com>; Abboud, Osama <osamaabb@amazon.com>
> Subject: RE: [EXTERNAL][PATCH V1 net-next 0/5] Add devlink support to ena
>
> On Wed, 11 Jan 2023 08:58:46 +0000 Arinzon, David wrote:
> > > I read it again - and I still don't know what you're doing.
> > > It sounds like inline header length configuration yet you also use
> > > LLQ all over the place. And LLQ for ENA is documented as basically
> > > tx_push:
> > >
> > >   - **Low Latency Queue (LLQ) mode or "push-mode":**
> > >
> > > Please explain this in a way which assumes zero Amazon-specific
> > > knowledge :(
> >
> > Low Latency Queue (LLQ) is a mode of operation in which the packet
> > headers (up to a defined length) are written directly to device memory.
> > Therefore, you are right, the description is similar to tx_push.
> > However, this is not a configurable option, while
> > ETHTOOL_A_RINGS_TX_PUSH configures whether to work in that mode or not.
> > If I'm understanding the intent behind ETHTOOL_A_RINGS_TX_PUSH and the
> > implementation in the driver that introduced the feature, it refers to
> > a push of the whole packet and not just the headers, which is not what
> > the ena driver does.
> >
> > In this patchset, we allow the configuration of an extended size of
> > the Low Latency Queue, meaning we allow enabling another, larger,
> > pre-defined size to be used as the maximum size of the packet headers
> > pushed directly to device memory. The value itself is not configurable,
> > therefore it was defined as "large LLQ".
> >
> > I hope this provides more clarification; if not, I'll be happy to
> > elaborate further.
>
> Thanks, the large missing piece in my understanding is still what the
> user-visible impact of this change is. Without increasing the LLQ entry
> size, a user who sends packets with long headers will:
>
>  a) see higher latency thru the NIC, but everything else is the same
>  b) see higher latency and lower overall throughput in terms of PPS
>  c) have limited access to offloads, because the device requires
>     full access to headers via LLQ for some offloads
>
> Which one of the three is the closest?

You're right, I went through the documentation again, and I see that the
implications of a case where packet headers are longer than the LLQ entry
size are not mentioned properly. We'll rework it to explain the motivation
for turning on this mode in the relevant use cases.

If the packet network headers do not fit within the LLQ entry, the packet
will be dropped. So I'd say that c) describes the impact best, given that
certain types of traffic will not succeed or will have disruptions due to
dropped TX packets.

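To make the failure mode above concrete, here is a rough sketch of the size
check being described, assuming a hypothetical driver (not ena code);
whether the drop happens in the driver or in the device is not spelled out
in the thread. The 128B/256B entry sizes come from the later discussion;
the exact header budget within an entry is an assumption.

/* Hypothetical illustration, not ena code. */
#include <linux/skbuff.h>

struct foo_llq_ring {
	/* Derived from one of the two fixed entry sizes discussed in this
	 * thread: 128B (standard LLQ) or 256B (large LLQ). Part of the
	 * entry is used by descriptor metadata, so the header budget is
	 * smaller than the entry itself.
	 */
	unsigned int max_push_hdr_len;
};

static bool foo_headers_fit_llq(const struct sk_buff *skb,
				const struct foo_llq_ring *ring)
{
	/* Approximate "packet network headers" with the skb linear part;
	 * headers that do not fit cannot be pushed, and per the reply
	 * above the packet ends up dropped rather than sent another way.
	 */
	return skb_headlen(skb) <= ring->max_push_hdr_len;
}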
On Wed, 11 Jan 2023 19:31:39 +0000 Arinzon, David wrote:
> If the packet network headers do not fit within the LLQ entry, the packet
> will be dropped. So I'd say that c) describes the impact best, given that
> certain types of traffic will not succeed or will have disruptions due to
> dropped TX packets.

I see. Sounds like it could be a fit for
DEVLINK_ATTR_ESWITCH_INLINE_MODE? But that one configures the depth of
the headers copied inline, rather than bytes. We could add a value for
"tunneled" and have that imply 256B LLQ in case of ena.

The other option is to introduce the concept of "max length of inline
data" to ethtool, and add a new netlink attribute to ethtool -g.

Either way - the length of "inline|push" data is not an ena-specific
concept, so using private knobs (devlink params or ethtool flags) is not
appropriate upstream. We should add a bona fide uAPI for it.

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, January 11, 2023 10:00 PM
> To: Arinzon, David <darinzon@amazon.com>
> Cc: David Miller <davem@davemloft.net>; netdev@vger.kernel.org;
> Machulsky, Zorik <zorik@amazon.com>; Matushevsky, Alexander
> <matua@amazon.com>; Bshara, Saeed <saeedb@amazon.com>; Bshara,
> Nafea <nafea@amazon.com>; Saidi, Ali <alisaidi@amazon.com>;
> Kiyanovski, Arthur <akiyano@amazon.com>; Dagan, Noam
> <ndagan@amazon.com>; Agroskin, Shay <shayagr@amazon.com>; Itzko,
> Shahar <itzko@amazon.com>; Abboud, Osama <osamaabb@amazon.com>
> Subject: RE: [EXTERNAL][PATCH V1 net-next 0/5] Add devlink support to ena
>
> On Wed, 11 Jan 2023 19:31:39 +0000 Arinzon, David wrote:
> > If the packet network headers do not fit within the LLQ entry, the
> > packet will be dropped. So I'd say that c) describes the impact best,
> > given that certain types of traffic will not succeed or will have
> > disruptions due to dropped TX packets.
>
> I see. Sounds like it could be a fit for
> DEVLINK_ATTR_ESWITCH_INLINE_MODE? But that one configures the depth of
> the headers copied inline, rather than bytes. We could add a value for
> "tunneled" and have that imply 256B LLQ in case of ena.
>
> The other option is to introduce the concept of "max length of inline
> data" to ethtool, and add a new netlink attribute to ethtool -g.
>
> Either way - the length of "inline|push" data is not an ena-specific
> concept, so using private knobs (devlink params or ethtool flags) is not
> appropriate upstream. We should add a bona fide uAPI for it.

I've looked into the original commit of DEVLINK_ATTR_ESWITCH_INLINE_MODE
and it seems to be more related to encapsulation; adding a tunneling
concept here would be misleading, as it doesn't cover all the use cases
affected here. It is true that in the case of tunneling, if there's a need
to access the encapsulated headers, the length of the headers might be
larger than the default LLQ value, but this may also be the case for
non-tunneled traffic, for example, IPv6 with the TCP SACK option.

I'll note again that this is not a configurable value, meaning the only
options are a 128B (standard) or a 256B (large) LLQ. There's no option to
set other values, only to choose between the modes. Therefore, I don't see
how having such an option through ethtool, as you suggested (setting the
max length), can be beneficial in our use case (it might be good overall,
as you noted, since it's a more generic concept). Will put more thought
into it.

On Wed, 11 Jan 2023 21:21:16 +0000 Arinzon, David wrote:
> I'll note again that this is not a configurable value, meaning the only
> options are a 128B (standard) or a 256B (large) LLQ. There's no option to
> set other values, only to choose between the modes. Therefore, I don't
> see how having such an option through ethtool, as you suggested (setting
> the max length), can be beneficial in our use case (it might be good
> overall, as you noted, since it's a more generic concept). Will put more
> thought into it.

FWIW you can just return an error via extack if the user tries to set an
unsupported value, like:

	NL_SET_ERR_MSG(extack, "only X and Y are supported");

and it will pop up as part of the error in the ethtool CLI. Or do some
rounding up - a lot of drivers do that for ring params already.

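Applied to the two LLQ sizes mentioned earlier, the extack pattern
described above would look roughly like this; the helper and constants are
hypothetical, only NL_SET_ERR_MSG is the real kernel macro.

/* Hypothetical validation helper, not ena code. */
#include <linux/netlink.h>
#include <linux/errno.h>
#include <linux/types.h>

static int foo_validate_llq_entry_size(u32 requested,
				       struct netlink_ext_ack *extack)
{
	/* Only the two entry sizes discussed in this thread exist;
	 * the message below is what the ethtool CLI would print.
	 */
	if (requested != 128 && requested != 256) {
		NL_SET_ERR_MSG(extack,
			       "only 128B and 256B LLQ entries are supported");
		return -EINVAL;
	}
	return 0;
}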
On 11/01/2023 22:00, Jakub Kicinski wrote:
> On Wed, 11 Jan 2023 19:31:39 +0000 Arinzon, David wrote:
>> If the packet network headers do not fit within the LLQ entry, the packet
>> will be dropped. So I'd say that c) describes the impact best, given that
>> certain types of traffic will not succeed or will have disruptions due to
>> dropped TX packets.
>
> I see. Sounds like it could be a fit for
> DEVLINK_ATTR_ESWITCH_INLINE_MODE? But that one configures the depth of
> the headers copied inline, rather than bytes. We could add a value for
> "tunneled" and have that imply 256B LLQ in case of ena.
>
> The other option is to introduce the concept of "max length of inline
> data" to ethtool, and add a new netlink attribute to ethtool -g.

TX copybreak? When the user sets it to > 96 bytes, use the large LLQ.

BTW, shouldn't ethtool's tx_push reflect the fact that LLQs are being
used? I don't see it used in ena.

Gal Pressman <gal@nvidia.com> writes:
> On 11/01/2023 22:00, Jakub Kicinski wrote:
>> On Wed, 11 Jan 2023 19:31:39 +0000 Arinzon, David wrote:
>>> If the packet network headers do not fit within the LLQ entry, the
>>> packet will be dropped. So I'd say that c) describes the impact best,
>>> given that certain types of traffic will not succeed or will have
>>> disruptions due to dropped TX packets.
>>
>> I see. Sounds like it could be a fit for
>> DEVLINK_ATTR_ESWITCH_INLINE_MODE? But that one configures the depth of
>> the headers copied inline, rather than bytes. We could add a value for
>> "tunneled" and have that imply 256B LLQ in case of ena.
>>
>> The other option is to introduce the concept of "max length of inline
>> data" to ethtool, and add a new netlink attribute to ethtool -g.
>
> TX copybreak? When the user sets it to > 96 bytes, use the large LLQ.
>
> BTW, shouldn't ethtool's tx_push reflect the fact that LLQs are being
> used? I don't see it used in ena.

Using tx copybreak does sound like it can work for our use case. Thanks
for the tip Gal (:

Jakub, do you see an issue with utilizing the tx_copybreak ethtool
parameter instead of the devlink param in this patchset?

Thanks,
Shay

On Thu, 12 Jan 2023 15:47:13 +0200 Shay Agroskin wrote:
> Gal Pressman <gal@nvidia.com> writes:
> > TX copybreak? When the user sets it to > 96 bytes, use the large LLQ.
> >
> > BTW, shouldn't ethtool's tx_push reflect the fact that LLQs are being
> > used? I don't see it used in ena.
>
> Using tx copybreak does sound like it can work for our use case. Thanks
> for the tip Gal (:
>
> Jakub, do you see an issue with utilizing the tx_copybreak ethtool
> parameter instead of the devlink param in this patchset?

IDK, the semantics don't feel close enough.

As a user I'd set tx_copybreak only on systems which have IOMMU enabled
(or otherwise have high cost of DMA mapping), to save CPU cycles.

The ena feature does not seem to be about CPU cycle saving (likely the
opposite, in fact), and does not operate on full segments AFAIU.

Hence my preference to expose it as a new tx_push_buf_len, combining the
semantics of tx_push and rx_buf_len.

On 12/01/2023 21:56, Jakub Kicinski wrote:
> On Thu, 12 Jan 2023 15:47:13 +0200 Shay Agroskin wrote:
>> Gal Pressman <gal@nvidia.com> writes:
>>> TX copybreak? When the user sets it to > 96 bytes, use the large LLQ.
>>>
>>> BTW, shouldn't ethtool's tx_push reflect the fact that LLQs are being
>>> used? I don't see it used in ena.
>>
>> Using tx copybreak does sound like it can work for our use case. Thanks
>> for the tip Gal (:
>>
>> Jakub, do you see an issue with utilizing the tx_copybreak ethtool
>> parameter instead of the devlink param in this patchset?
>
> IDK, the semantics don't feel close enough.
>
> As a user I'd set tx_copybreak only on systems which have IOMMU enabled
> (or otherwise have high cost of DMA mapping), to save CPU cycles.
>
> The ena feature does not seem to be about CPU cycle saving (likely
> the opposite, in fact), and does not operate on full segments AFAIU.

Segments?

> Hence my preference to expose it as a new tx_push_buf_len, combining
> the semantics of tx_push and rx_buf_len.

Sounds like a good idea.
To clarify, buf_len here refers to the size of the inline'd part, not
the WQE itself, correct? The driver will use whatever WQE size it needs
in order to accommodate the requested inline size?

Jakub Kicinski <kuba@kernel.org> writes:
> On Thu, 12 Jan 2023 15:47:13 +0200 Shay Agroskin wrote:
>> Gal Pressman <gal@nvidia.com> writes:
>>> TX copybreak? When the user sets it to > 96 bytes, use the large LLQ.
>>>
>>> BTW, shouldn't ethtool's tx_push reflect the fact that LLQs are being
>>> used? I don't see it used in ena.
>>
>> Using tx copybreak does sound like it can work for our use case. Thanks
>> for the tip Gal (:
>>
>> Jakub, do you see an issue with utilizing the tx_copybreak ethtool
>> parameter instead of the devlink param in this patchset?
>
> IDK, the semantics don't feel close enough.
>
> As a user I'd set tx_copybreak only on systems which have IOMMU enabled
> (or otherwise have high cost of DMA mapping), to save CPU cycles.
>
> The ena feature does not seem to be about CPU cycle saving (likely
> the opposite, in fact), and does not operate on full segments AFAIU.
>
> Hence my preference to expose it as a new tx_push_buf_len, combining
> the semantics of tx_push and rx_buf_len.

We'll proceed with working on an RFC that adds the new param to ethtool.

Going forward, can I ask what's the community's stand on adding sysfs or
procfs entries to the driver as a means to tweak custom device
attributes?

Thanks,
Shay

On Sun, 15 Jan 2023 12:05:33 +0200 Gal Pressman wrote:
> > IDK, the semantics don't feel close enough.
> >
> > As a user I'd set tx_copybreak only on systems which have IOMMU enabled
> > (or otherwise have high cost of DMA mapping), to save CPU cycles.
> >
> > The ena feature does not seem to be about CPU cycle saving (likely
> > the opposite, in fact), and does not operate on full segments AFAIU.
>
> Segments?

Complete DMA buffers. Basically whether the optimization only kicks in
if skb->len < configured_len or skb_headlen() < configured_len.

> > Hence my preference to expose it as a new tx_push_buf_len, combining
> > the semantics of tx_push and rx_buf_len.
>
> Sounds like a good idea.
> To clarify, buf_len here refers to the size of the inline'd part, not
> the WQE itself, correct? The driver will use whatever WQE size it needs
> in order to accommodate the requested inline size?

We can decide either way, but I _think_ rx_buf_len refers to the size as
allocated, not necessarily the usable size (in case the first buffer has
padding / headroom). But as long as we clearly document it - either way
is fine.

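The distinction drawn here can be summarized in a short sketch
(hypothetical names, not taken from any driver): tx_copybreak gates on the
length of the whole frame, while a push/inline buffer limit gates only on
the header (linear) part.

/* Hypothetical illustration of the two semantics being contrasted. */
#include <linux/skbuff.h>

struct foo_tx_limits {
	unsigned int tx_copybreak;	/* copy whole small frames, skip DMA mapping */
	unsigned int tx_push_buf_len;	/* header bytes pushed to device memory */
};

/* tx_copybreak: the optimization applies only if the whole frame is small. */
static bool foo_use_copybreak(const struct sk_buff *skb,
			      const struct foo_tx_limits *l)
{
	return skb->len <= l->tx_copybreak;
}

/* push/inline limit: only the linear (header) part has to fit;
 * the rest of the payload still goes out via regular DMA descriptors.
 */
static bool foo_headers_fit_push_buf(const struct sk_buff *skb,
				     const struct foo_tx_limits *l)
{
	return skb_headlen(skb) <= l->tx_push_buf_len;
}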
On Mon, 16 Jan 2023 16:23:56 +0200 Shay Agroskin wrote:
> Going forward, can I ask what's the community's stand on adding sysfs or
> procfs entries to the driver as a means to tweak custom device
> attributes?

We prefer to avoid both of those. If you're worried about the user having
a hard time changing the parameter because the user space ethtool command
is out of date - our solution to that will likely be the "Netlink protocol
descriptions in YAML". With that thing you don't really need a custom CLI
tool to issue Netlink commands. Until it gets merged you may need to
hand-craft the right netlink message in python or another scripting
language of choice :(