Message ID | 20201011090944.10607-1-w@1wt.eu (mailing list archive) |
---|---|
Headers | show |
Series | macb: support the 2-deep Tx queue on at91 | expand |
On Sun, 11 Oct 2020 11:09:41 +0200 Willy Tarreau wrote: > while running some tests on my Breadbee board, I noticed poor network > Tx performance. I had a look at the driver (macb, at91ether variant) > and noticed that at91ether_start_xmit() immediately stops the queue > after sending a frame and waits for the interrupt to restart the queue, > causing a dead time after each packet is sent. > > The AT91RM9200 datasheet states that the controller supports two frames, > one being sent and the other one being queued, so I performed minimal > changes to support this. The transmit performance on my board has > increased by 50% on medium-sized packets (HTTP traffic), and with large > packets I can now reach line rate. > > Since this driver is shared by various platforms, I tried my best to > isolate and limit the changes as much as possible and I think it's pretty > reasonable as-is. I've run extensive tests and couldn't meet any > unexpected situation (no stall, overflow nor lockup). > > There are 3 patches in this series. The first one adds the missing > interrupt flag for RM9200 (TBRE, indicating the tx buffer is willing > to take a new packet). The second one replaces the single skb with a > 2-array and uses only index 0. It does no other change, this is just > to prepare the code for the third one. The third one implements the > queue. Packets are added at the tail of the queue, the queue is > stopped at 2 packets and the interrupt releases 0, 1 or 2 depending > on what the transmit status register reports. LGTM. There's always a chance that this will make other designs explode, but short of someone from Cadence giving us a timely review we have only one way to find that out.. :) Applied, thanks!
On Tue, Oct 13, 2020 at 05:03:58PM -0700, Jakub Kicinski wrote: > On Sun, 11 Oct 2020 11:09:41 +0200 Willy Tarreau wrote: > > while running some tests on my Breadbee board, I noticed poor network > > Tx performance. I had a look at the driver (macb, at91ether variant) > > and noticed that at91ether_start_xmit() immediately stops the queue > > after sending a frame and waits for the interrupt to restart the queue, > > causing a dead time after each packet is sent. > > > > The AT91RM9200 datasheet states that the controller supports two frames, > > one being sent and the other one being queued, so I performed minimal > > changes to support this. The transmit performance on my board has > > increased by 50% on medium-sized packets (HTTP traffic), and with large > > packets I can now reach line rate. > > > > Since this driver is shared by various platforms, I tried my best to > > isolate and limit the changes as much as possible and I think it's pretty > > reasonable as-is. I've run extensive tests and couldn't meet any > > unexpected situation (no stall, overflow nor lockup). > > > > There are 3 patches in this series. The first one adds the missing > > interrupt flag for RM9200 (TBRE, indicating the tx buffer is willing > > to take a new packet). The second one replaces the single skb with a > > 2-array and uses only index 0. It does no other change, this is just > > to prepare the code for the third one. The third one implements the > > queue. Packets are added at the tail of the queue, the queue is > > stopped at 2 packets and the interrupt releases 0, 1 or 2 depending > > on what the transmit status register reports. > > LGTM. There's always a chance that this will make other > designs explode, but short of someone from Cadence giving > us a timely review we have only one way to find that out.. :) Not that much in fact, given that the at91ether_* functions are only used by AT91RM9200 (whose datasheet I used to do this) and Mstar which I used for the tests. I initially wanted to get my old SAM9G20 board to boot until I noticed that it doesn't even use the same set of functions, so the potential victims are extremely limited :-) > Applied, thanks! Thank you! Willy
On Wed, 14 Oct 2020 05:06:30 +0200 Willy Tarreau wrote: > > LGTM. There's always a chance that this will make other > > designs explode, but short of someone from Cadence giving > > us a timely review we have only one way to find that out.. :) > > Not that much in fact, given that the at91ether_* functions are only > used by AT91RM9200 (whose datasheet I used to do this) and Mstar which > I used for the tests. I initially wanted to get my old SAM9G20 board > to boot until I noticed that it doesn't even use the same set of > functions, so the potential victims are extremely limited :-) GTK :)
On 14/10/2020 05:06:30+0200, Willy Tarreau wrote: > On Tue, Oct 13, 2020 at 05:03:58PM -0700, Jakub Kicinski wrote: > > On Sun, 11 Oct 2020 11:09:41 +0200 Willy Tarreau wrote: > > > while running some tests on my Breadbee board, I noticed poor network > > > Tx performance. I had a look at the driver (macb, at91ether variant) > > > and noticed that at91ether_start_xmit() immediately stops the queue > > > after sending a frame and waits for the interrupt to restart the queue, > > > causing a dead time after each packet is sent. > > > > > > The AT91RM9200 datasheet states that the controller supports two frames, > > > one being sent and the other one being queued, so I performed minimal > > > changes to support this. The transmit performance on my board has > > > increased by 50% on medium-sized packets (HTTP traffic), and with large > > > packets I can now reach line rate. > > > > > > Since this driver is shared by various platforms, I tried my best to > > > isolate and limit the changes as much as possible and I think it's pretty > > > reasonable as-is. I've run extensive tests and couldn't meet any > > > unexpected situation (no stall, overflow nor lockup). > > > > > > There are 3 patches in this series. The first one adds the missing > > > interrupt flag for RM9200 (TBRE, indicating the tx buffer is willing > > > to take a new packet). The second one replaces the single skb with a > > > 2-array and uses only index 0. It does no other change, this is just > > > to prepare the code for the third one. The third one implements the > > > queue. Packets are added at the tail of the queue, the queue is > > > stopped at 2 packets and the interrupt releases 0, 1 or 2 depending > > > on what the transmit status register reports. > > > > LGTM. There's always a chance that this will make other > > designs explode, but short of someone from Cadence giving > > us a timely review we have only one way to find that out.. :) > > Not that much in fact, given that the at91ether_* functions are only > used by AT91RM9200 (whose datasheet I used to do this) and Mstar which > I used for the tests. I initially wanted to get my old SAM9G20 board > to boot until I noticed that it doesn't even use the same set of > functions, so the potential victims are extremely limited :-) > I think I'm the only one booting recent linux kernels on at91rm9200 and I'm currently stuck home while the board is at the office. I'll try to test as soon as possible, which may not be before 2021... At least I'll know who is the culprit ;)
Hi Alexandre! On Fri, Nov 13, 2020 at 11:03:59AM +0100, Alexandre Belloni wrote: > I think I'm the only one booting recent linux kernels on at91rm9200 and > I'm currently stuck home while the board is at the office. I'll try to > test as soon as possible, which may not be before 2021... At least I'll > know who is the culprit ;) Oh that's great. I have a SAMG20 based one, which uses slightly different registers and supports a tx ring, so initially I thought that I couldn't test it there. But a friend of mine who wrote the drivers for FreeBSD told me that the original driver still worked for the SAMG20, and I suspect that despite not being mentioned in the datasheet, the more recent chip still supports the old behavior, at least to ease the transistion for their customers. So eventually I'll try it too. In all transparency, I must tell you that I recently noticed an issue when facing intense bidirectional traffic, eventually the chip would report a TX overrun and would stop sending. I finally attributed this to the unmerged changes needed for the MStar chip, which uses two 16 bit I/O accesses instead of a single 32-bit one, and which apparently doesn't like being interrupted between the two when writing to the TAR or TLEN registers. It took me the whole week-end to figure the root cause so now I need to remove all my debugging code, address the issue and test again. If I can't manage to fix this and you can't find a moment to test on your board, I'll propose to revert my patch to stay on the safe side, as I want at least one implementation to be 100% reliable with it. Cheers, Willy