Message ID | 20231127104045.96722-1-ramon.nordin.rodriguez@ferroamp.se (mailing list archive) |
---|---|
Headers | show |
Series | net: microchip_t1s: additional phy support and collision detect handling | expand |
> Collision detection > This has been tested on a setup where one ARM system with a LAN8650 > mac-phy is daisy-chained to 8 mcus using lan8670 phys. Without the patch we > were limited to really short cables, about 1m per node, but we were > still getting a lot of connection drops. > With the patch we could increase the total cable length to at least 40M. Did you do any testing of collision detection enabled, PLCA disabled? You say you think this is noise related. But the noise should be the same with or without PLCA. I'm just thinking maybe collision detection is just plain broken and should always be disabled? I've not read much about T1S, but if we assume it is doing good old fashioned CSMA/CD, with short cables the CS bit works well and the CD is less important. CD was needed when you have 1000m cable, and you can fit 64 bytes on the 1000m cable. So always turning of CD might be appropriate. Andrew
On Mon, Nov 27, 2023 at 02:58:32PM +0100, Andrew Lunn wrote: > > Collision detection > > This has been tested on a setup where one ARM system with a LAN8650 > > mac-phy is daisy-chained to 8 mcus using lan8670 phys. Without the patch we > > were limited to really short cables, about 1m per node, but we were > > still getting a lot of connection drops. > > With the patch we could increase the total cable length to at least 40M. > > Did you do any testing of collision detection enabled, PLCA disabled? > In our dev system we've only tested with PLCA enabled, bit too tricky changing internals on the microcontrollers. But I have a lot of usb eval dongles that I can test with. > You say you think this is noise related. But the noise should be the > same with or without PLCA. I'm just thinking maybe collision detection > is just plain broken and should always be disabled? > I don't have access to the equipment to measure noise or reflections, I've looked at the link with an oscilloscope and it looked fine to me. The reason I'm mentioning noise is just me parroting the datasheet, for context I'll quote the footnote here "No physical collisions will occur when all nodes in a mixing segment are properly configured for PLCA operation. As a result, for improved performance in high noise environments where false collisions may be detected leading to dropped packets, it is recommended that the user write this bit to a ‘0’ to disable collision detection when PLCA is enabled. When collision detection is disabled, the PLCA reconciliation sublayer will still assert logical collisions to the MAC as part of normal operation." LAN8650 datasheet 11.5.51 > I've not read much about T1S, but if we assume it is doing good old > fashioned CSMA/CD, with short cables the CS bit works well and the CD > is less important. CD was needed when you have 1000m cable, and you > can fit 64 bytes on the 1000m cable. So always turning of CD might be > appropriate. > > Andrew As you assume when PLCA is disabled the phy runs in CSMA/CD mode. I'll do some tests with both PLCA and CD off/disabled. My thinking is that a adequate test bench would look like * 3-4 nodes (depending on how many usb ports and dongles I have) * run iperf with long cables and CSMA/CD * run iperf with long cables and CMSA/No CD I'll report back the results. Anything you'd like to add/focus on with evaluation? Ramón
> * 3-4 nodes (depending on how many usb ports and dongles I have) > * run iperf with long cables and CSMA/CD > * run iperf with long cables and CMSA/No CD > > I'll report back the results. Anything you'd like to add/focus on with > evaluation? Humm, thinking about how CSMA/CD works... Maybe look at what counters the MAC provides. Does it have collisions and bad FCS? A collision should result in a bad FCS, if you are not using CD. So if things are working correctly, the count for CD should move to FCS if you turn CD off. If CD is falsely triggering, FCS as a % should not really change, but you probably get more frames over the link? Andrew
On Mon, Nov 27, 2023 at 05:03:54PM +0100, Andrew Lunn wrote: > > * 3-4 nodes (depending on how many usb ports and dongles I have) > > * run iperf with long cables and CSMA/CD > > * run iperf with long cables and CMSA/No CD > > > > I'll report back the results. Anything you'd like to add/focus on with > > evaluation? > > Humm, thinking about how CSMA/CD works... > > Maybe look at what counters the MAC provides. Does it have collisions > and bad FCS? A collision should result in a bad FCS, if you are not > using CD. So if things are working correctly, the count for CD should > move to FCS if you turn CD off. If CD is falsely triggering, FCS as a > % should not really change, but you probably get more frames over the > link? > That is some really cool input, I have to do some datasheet digging and hacking. I'll try to set everything up today, tomorrow I can hang in the lab after hours and test things out! Partihban suggested that Microchips support might be able to help with testing, might give them a ping soon as I a solid plan. Really appreciate the insight! R
On Mon, Nov 27, 2023 at 05:03:54PM +0100, Andrew Lunn wrote: > > * 3-4 nodes (depending on how many usb ports and dongles I have) > > * run iperf with long cables and CSMA/CD > > * run iperf with long cables and CMSA/No CD > > > > I'll report back the results. Anything you'd like to add/focus on with > > evaluation? > > Humm, thinking about how CSMA/CD works... > > Maybe look at what counters the MAC provides. Does it have collisions > and bad FCS? A collision should result in a bad FCS, if you are not > using CD. So if things are working correctly, the count for CD should > move to FCS if you turn CD off. If CD is falsely triggering, FCS as a > % should not really change, but you probably get more frames over the > link? > # setup Andrew suggested that I try to get statistics from the MAC, I did some investigation here but could not figure it out. Using iperf3 Client: Arm based system running lan865x macphy Server: PC running lan867x revB usb dongle # test results The results below should be considered fairly represenative but far from perfect. There was some bounce up and down when rerunning, but these resutls are an eye-ball average. No meaningful difference was seen with short (2m) cables or long (12m). ## with collision detection enabled iperf3 normal - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 5.54 MBytes 4.65 Mbits/sec 0 sender [ 5] 0.00-10.01 sec 5.40 MBytes 4.53 Mbits/sec receiver iperf3 reverse - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 929 KBytes 761 Kbits/sec 293 sender [ 5] 0.00-10.00 sec 830 KBytes 680 Kbits/sec receiver ## with collision detection disabled iperf3 normal - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 6.39 MBytes 5.36 Mbits/sec 0 sender [ 5] 0.00-10.04 sec 6.19 MBytes 5.17 Mbits/sec receiver iperf3 reverse - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.27 sec 1.10 MBytes 897 Kbits/sec 268 sender [ 5] 0.00-10.00 sec 1.01 MBytes 843 Kbits/sec receiver # Conclusions The arm system running the lan865x macphy uses a not yet mainlined driver, see https://lore.kernel.org/all/20231023154649.45931-1-Parthiban.Veerasooran@microchip.com/ The lan865x driver crashed out every once in a while on reverse mode, there is definetly something biased in the driver for tx over rx. Then again it's not accepted yet. Disabling collision detection seemes to have an positive effect. Slightly higher speeds and slightly fewer retransmissions. I don't have a black and white result to present, but things seems to work slightly better with CD disabled, so I'm leaning towards just unconditionally disabling it for the lan865x and lan867x phys for the v2 patch. I'll wait with submitting v2 for a day so anyone interested gets a chance to weigh in on this. R
> ## with collision detection enabled > > iperf3 normal > - - - - - - - - - - - - - - - - - - - - - - - - - > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.00 sec 5.54 MBytes 4.65 Mbits/sec 0 sender > [ 5] 0.00-10.01 sec 5.40 MBytes 4.53 Mbits/sec receiver > > iperf3 reverse > - - - - - - - - - - - - - - - - - - - - - - - - - > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.00 sec 929 KBytes 761 Kbits/sec 293 sender > [ 5] 0.00-10.00 sec 830 KBytes 680 Kbits/sec receiver > > > ## with collision detection disabled > > iperf3 normal > - - - - - - - - - - - - - - - - - - - - - - - - - > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.00 sec 6.39 MBytes 5.36 Mbits/sec 0 sender > [ 5] 0.00-10.04 sec 6.19 MBytes 5.17 Mbits/sec receiver > > iperf3 reverse > - - - - - - - - - - - - - - - - - - - - - - - - - > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.27 sec 1.10 MBytes 897 Kbits/sec 268 sender > [ 5] 0.00-10.00 sec 1.01 MBytes 843 Kbits/sec receiver > > # Conclusions > > The arm system running the lan865x macphy uses a not yet mainlined driver, see > https://lore.kernel.org/all/20231023154649.45931-1-Parthiban.Veerasooran@microchip.com/ > > The lan865x driver crashed out every once in a while on reverse mode, there > is definetly something biased in the driver for tx over rx. > > Then again it's not accepted yet. > > Disabling collision detection seemes to have an positive effect. > Slightly higher speeds and slightly fewer retransmissions. I would want to first understand why there is such a big difference with the direction. Is it TCP backing off because of the packet loss? Or is there some other problem. Maybe try with UDP streams, say with a bandwidth of 5Mbps. Do you loose 4Mbps in one direction? Or a much smaller number of packets. Are there any usable statistics? FCS errors? Andrew