mbox series

[net,v2,0/3] net: lan966x: Fixes for when MTU is changed

Message ID 20221030213636.1031408-1-horatiu.vultur@microchip.com (mailing list archive)
Headers show
Series net: lan966x: Fixes for when MTU is changed | expand

Message

Horatiu Vultur Oct. 30, 2022, 9:36 p.m. UTC
There were multiple problems in different parts of the driver when
the MTU was changed.
The first problem was that the HW was missing to configure the correct
value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
when vlan filtering was enabled/disabled, the MRU was not adjusted
corretly. While the last issue was that the FDMA was calculated wrongly
the correct maximum MTU.

v1->v2:
- when calculating max frame possible to receive add also the vlan tags
  length

Horatiu Vultur (3):
  net: lan966x: Fix the MTU calculation
  net: lan966x: Adjust maximum frame size when vlan is enabled/disabled
  net: lan966x: Fix FDMA when MTU is changed

 .../net/ethernet/microchip/lan966x/lan966x_fdma.c |  8 ++++++--
 .../net/ethernet/microchip/lan966x/lan966x_main.c |  4 ++--
 .../net/ethernet/microchip/lan966x/lan966x_main.h |  2 ++
 .../net/ethernet/microchip/lan966x/lan966x_regs.h | 15 +++++++++++++++
 .../net/ethernet/microchip/lan966x/lan966x_vlan.c |  6 ++++++
 5 files changed, 31 insertions(+), 4 deletions(-)

Comments

David Laight Oct. 31, 2022, 10:43 a.m. UTC | #1
From: Horatiu Vultur
> Sent: 30 October 2022 21:37
> 
> There were multiple problems in different parts of the driver when
> the MTU was changed.
> The first problem was that the HW was missing to configure the correct
> value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> when vlan filtering was enabled/disabled, the MRU was not adjusted
> corretly. While the last issue was that the FDMA was calculated wrongly
> the correct maximum MTU.

IIRC all these lengths are 1514, 1518 and maybe 1522?
How long are the actual receive buffers?
I'd guess they have to be rounded up to a whole number of cache lines
(especially on non-coherent systems) so are probably 1536 bytes.

If driver does support 8k+ jumbo frames just set the hardware
frame length to match the receive buffer size.

There is no real need to exactly police the receive MTU.
There are definitely situations where the transmit MTU has
to be limited (eg to avoid ptmu blackholes) but where some
systems still send 'full sized' packets.

There is also the possibility of receiving PPPoE encapsulated
full sized ethernet packets.
I can remember how big that header is - something like 8 bytes.
There is no real reason to discard them if they'd fit in the buffer.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Horatiu Vultur Oct. 31, 2022, 3:01 p.m. UTC | #2
The 10/31/2022 10:43, David Laight wrote:
> 
> From: Horatiu Vultur
> > Sent: 30 October 2022 21:37

Hi David,

> >
> > There were multiple problems in different parts of the driver when
> > the MTU was changed.
> > The first problem was that the HW was missing to configure the correct
> > value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> > when vlan filtering was enabled/disabled, the MRU was not adjusted
> > corretly. While the last issue was that the FDMA was calculated wrongly
> > the correct maximum MTU.
> 
> IIRC all these lengths are 1514, 1518 and maybe 1522?

And also 1526, if the frame has 2 vlan tags.

> How long are the actual receive buffers?
> I'd guess they have to be rounded up to a whole number of cache lines
> (especially on non-coherent systems) so are probably 1536 bytes.

The receive buffers can be different sizes, it can be up to 65k.
They are currently allign to page size.

> 
> If driver does support 8k+ jumbo frames just set the hardware
> frame length to match the receive buffer size.

In that case I should always allocate maximum frame size(65k) for all
regardless of the MTU?

> 
> There is no real need to exactly police the receive MTU.
> There are definitely situations where the transmit MTU has
> to be limited (eg to avoid ptmu blackholes) but where some
> systems still send 'full sized' packets.
> 
> There is also the possibility of receiving PPPoE encapsulated
> full sized ethernet packets.
> I can remember how big that header is - something like 8 bytes.
> There is no real reason to discard them if they'd fit in the buffer.
> 
>         David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
David Laight Oct. 31, 2022, 3:27 p.m. UTC | #3
From: 'Horatiu Vultur'
> Sent: 31 October 2022 15:02
> 
> The 10/31/2022 10:43, David Laight wrote:
> >
> > From: Horatiu Vultur
> > > Sent: 30 October 2022 21:37
> 
> Hi David,
> 
> > >
> > > There were multiple problems in different parts of the driver when
> > > the MTU was changed.
> > > The first problem was that the HW was missing to configure the correct
> > > value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> > > when vlan filtering was enabled/disabled, the MRU was not adjusted
> > > corretly. While the last issue was that the FDMA was calculated wrongly
> > > the correct maximum MTU.
> >
> > IIRC all these lengths are 1514, 1518 and maybe 1522?
> 
> And also 1526, if the frame has 2 vlan tags.
> 
> > How long are the actual receive buffers?
> > I'd guess they have to be rounded up to a whole number of cache lines
> > (especially on non-coherent systems) so are probably 1536 bytes.
> 
> The receive buffers can be different sizes, it can be up to 65k.
> They are currently allign to page size.

Is that necessary?
I don't know where the buffers are allocated, but even 4k seems
a bit profligate for normal ethernet mtu.
If the page size if larger it is even sillier.

If the buffer is embedded in an skb you really want the skb
to be under 4k (I don't think a 1500 byte mtu can fit in 2k).

But you might as well tell the hardware the actual buffer length
(remember to allow for the crc and any alignment header).

> >
> > If driver does support 8k+ jumbo frames just set the hardware
> > frame length to match the receive buffer size.
> 
> In that case I should always allocate maximum frame size(65k) for all
> regardless of the MTU?

That would be very wasteful. I'd set the buffer large enough for
the mtu but let the hardware fill the entire buffer.

Allocating 64k buffers for big jumbo frames doesn't seem right.
If the mtu is 64k then kmalloc() will allocate 128k.
This is going to cause 'oddities' with small packets where
the 'true_size' is massively more than the data size.

Isn't there a scheme where you can create an skb from a page
list that contains fragments of the ethernet frame?
In which case I'd have thought you'd want to fill the ring
with page size buffers and then handle the hardware writing
a long frame to multiple buffers/descriptors.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Horatiu Vultur Nov. 1, 2022, 7:59 a.m. UTC | #4
The 10/31/2022 15:27, David Laight wrote:
> 
> From: 'Horatiu Vultur'
> > Sent: 31 October 2022 15:02
> >
> > The 10/31/2022 10:43, David Laight wrote:
> > >
> > > From: Horatiu Vultur
> > > > Sent: 30 October 2022 21:37
> >
> > Hi David,
> >
> > > >
> > > > There were multiple problems in different parts of the driver when
> > > > the MTU was changed.
> > > > The first problem was that the HW was missing to configure the correct
> > > > value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> > > > when vlan filtering was enabled/disabled, the MRU was not adjusted
> > > > corretly. While the last issue was that the FDMA was calculated wrongly
> > > > the correct maximum MTU.
> > >
> > > IIRC all these lengths are 1514, 1518 and maybe 1522?
> >
> > And also 1526, if the frame has 2 vlan tags.
> >
> > > How long are the actual receive buffers?
> > > I'd guess they have to be rounded up to a whole number of cache lines
> > > (especially on non-coherent systems) so are probably 1536 bytes.
> >
> > The receive buffers can be different sizes, it can be up to 65k.
> > They are currently allign to page size.
> 
> Is that necessary?

HW requires to have the start of frame alligned to 128 bytes.

> I don't know where the buffers are allocated, but even 4k seems
> a bit profligate for normal ethernet mtu.
> If the page size if larger it is even sillier.

For lan966x the pages are allocated here [1]

> 
> If the buffer is embedded in an skb you really want the skb
> to be under 4k (I don't think a 1500 byte mtu can fit in 2k).
> 
> But you might as well tell the hardware the actual buffer length
> (remember to allow for the crc and any alignment header).

I am already doing that here [2]
And I need to do it for each frame it can received.

> 
> > >
> > > If driver does support 8k+ jumbo frames just set the hardware
> > > frame length to match the receive buffer size.
> >
> > In that case I should always allocate maximum frame size(65k) for all
> > regardless of the MTU?
> 
> That would be very wasteful.

Yes, I agree.

> I'd set the buffer large enough for the mtu but let the hardware fill
> the entire buffer.

I am not 100% sure I follow it. Can you expend on this a little bit?

> 
> Allocating 64k buffers for big jumbo frames doesn't seem right.
> If the mtu is 64k then kmalloc() will allocate 128k.
> This is going to cause 'oddities' with small packets where
> the 'true_size' is massively more than the data size.
> 
> Isn't there a scheme where you can create an skb from a page
> list that contains fragments of the ethernet frame?

Can I use '__skb_fill_page_desc'?

> In which case I'd have thought you'd want to fill the ring
> with page size buffers and then handle the hardware writing
> a long frame to multiple buffers/descriptors.

It might be a good idea. I need to look in more details about this.
Because it would change a little bit the logic on how the frames are
received and see how this will impact for frames under a page.
Also I was thinking next to use page_pool API, for which I send a patch
[3] but is deffered by this patch series.
But all these possible changes will need to go through net-next.

> 
>         David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

[1] https://elixir.bootlin.com/linux/v6.1-rc3/source/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c#L17
[2] https://elixir.bootlin.com/linux/v6.1-rc3/source/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c#L70
[3] https://lore.kernel.org/bpf/20221019135008.3281743-6-horatiu.vultur@microchip.com/
David Laight Nov. 1, 2022, 9:03 a.m. UTC | #5
From: 'Horatiu Vultur'
> Sent: 01 November 2022 07:59
> 
> The 10/31/2022 15:27, David Laight wrote:
> >
> > From: 'Horatiu Vultur'
> > > Sent: 31 October 2022 15:02
> > >
> > > The 10/31/2022 10:43, David Laight wrote:
> > > >
> > > > From: Horatiu Vultur
> > > > > Sent: 30 October 2022 21:37
> > >
> > > Hi David,
> > >
> > > > >
> > > > > There were multiple problems in different parts of the driver when
> > > > > the MTU was changed.
> > > > > The first problem was that the HW was missing to configure the correct
> > > > > value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> > > > > when vlan filtering was enabled/disabled, the MRU was not adjusted
> > > > > corretly. While the last issue was that the FDMA was calculated wrongly
> > > > > the correct maximum MTU.
> > > >
> > > > IIRC all these lengths are 1514, 1518 and maybe 1522?
> > >
> > > And also 1526, if the frame has 2 vlan tags.
> > >
> > > > How long are the actual receive buffers?
> > > > I'd guess they have to be rounded up to a whole number of cache lines
> > > > (especially on non-coherent systems) so are probably 1536 bytes.
> > >
> > > The receive buffers can be different sizes, it can be up to 65k.
> > > They are currently allign to page size.
> >
> > Is that necessary?
> 
> HW requires to have the start of frame alligned to 128 bytes.

Not a real problem.
Even dma_alloc_coherent() guarantees alignment.

I'm not 100% sure of all the options of the Linux stack.
But for ~1500 byte mtu I'd have thought that receiving
directly into an skb would be best (1 page allocated for an skb).
For large mtu (and hardware receive coalescing) receiving
into pages is probably better - but not high order allocations.

...
> > If the buffer is embedded in an skb you really want the skb
> > to be under 4k (I don't think a 1500 byte mtu can fit in 2k).
> >
> > But you might as well tell the hardware the actual buffer length
> > (remember to allow for the crc and any alignment header).
> 
> I am already doing that here [2]
> And I need to do it for each frame it can received.

That is the length of the buffer.
Not the maximum frame length - which seems to be elsewhere.
I suspect that having the maximum frame length less than the
buffer size stops the driver having to handle long frames
that span multiple buffers.
(and very long frames that are longer than all the buffers!)

...
> > I'd set the buffer large enough for the mtu but let the hardware fill
> > the entire buffer.
> 
> I am not 100% sure I follow it. Can you expend on this a little bit?

At the moment I think the receive buffer descriptors have a length
of 4k. But you are setting another 'maximum frame length' register
to (eg) 1518 so that the hardware rejects long frames.
But you can set the 'maximum frame length' register to (just under)
4k so that longer frames are received ok but without the driver
having to worry about frames spanning multiple buffer fragments.

The network stack might choose to discard frames with an overlong mtu.
But that can be done after all the headers have been removed.

...
> But all these possible changes will need to go through net-next.

Indeed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Horatiu Vultur Nov. 1, 2022, 4:04 p.m. UTC | #6
The 11/01/2022 09:03, David Laight wrote:
> > HW requires to have the start of frame alligned to 128 bytes.
> 
> Not a real problem.
> Even dma_alloc_coherent() guarantees alignment.
> 
> I'm not 100% sure of all the options of the Linux stack.
> But for ~1500 byte mtu I'd have thought that receiving
> directly into an skb would be best (1 page allocated for an skb).
> For large mtu (and hardware receive coalescing) receiving
> into pages is probably better - but not high order allocations.

But am I not doing this already? I allocate the page here [1] and then create
the skb here[2].

> 
> ...
> > > If the buffer is embedded in an skb you really want the skb
> > > to be under 4k (I don't think a 1500 byte mtu can fit in 2k).
> > >
> > > But you might as well tell the hardware the actual buffer length
> > > (remember to allow for the crc and any alignment header).
> >
> > I am already doing that here [2]
> > And I need to do it for each frame it can received.
> 
> That is the length of the buffer.
> Not the maximum frame length - which seems to be elsewhere.
> I suspect that having the maximum frame length less than the
> buffer size stops the driver having to handle long frames
> that span multiple buffers.
> (and very long frames that are longer than all the buffers!)
> 
> ...
> > > I'd set the buffer large enough for the mtu but let the hardware fill
> > > the entire buffer.
> >
> > I am not 100% sure I follow it. Can you expend on this a little bit?
> 
> At the moment I think the receive buffer descriptors have a length
> of 4k. But you are setting another 'maximum frame length' register
> to (eg) 1518 so that the hardware rejects long frames.

That is correct.

> But you can set the 'maximum frame length' register to (just under)
> 4k so that longer frames are received ok but without the driver
> having to worry about frames spanning multiple buffer fragments.

But this will not just put more load on CPU? In the way that I accept
long frames but then they will be discard by the CPU.
And I can do this in HW, because I know what is the maximum frame length
accepted on that interface.

> 
> The network stack might choose to discard frames with an overlong mtu.
> But that can be done after all the headers have been removed.
> 
> ...
> > But all these possible changes will need to go through net-next.
> 
> Indeed.
> 
>         David

[1] https://elixir.bootlin.com/linux/v6.1-rc3/source/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c#L17
[2] https://elixir.bootlin.com/linux/v6.1-rc3/source/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c#L417

> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
patchwork-bot+netdevbpf@kernel.org Nov. 2, 2022, 4:30 a.m. UTC | #7
Hello:

This series was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Sun, 30 Oct 2022 22:36:33 +0100 you wrote:
> There were multiple problems in different parts of the driver when
> the MTU was changed.
> The first problem was that the HW was missing to configure the correct
> value, it was missing ETH_HLEN and ETH_FCS_LEN. The second problem was
> when vlan filtering was enabled/disabled, the MRU was not adjusted
> corretly. While the last issue was that the FDMA was calculated wrongly
> the correct maximum MTU.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/3] net: lan966x: Fix the MTU calculation
    https://git.kernel.org/netdev/net/c/486c29223016
  - [net,v2,2/3] net: lan966x: Adjust maximum frame size when vlan is enabled/disabled
    https://git.kernel.org/netdev/net/c/25f28bb1b4a7
  - [net,v2,3/3] net: lan966x: Fix FDMA when MTU is changed
    https://git.kernel.org/netdev/net/c/872ad758f9b7

You are awesome, thank you!