Message ID | 6f534b2b-f69a-bda5-dc5b-0281bf0df129@sigmadesigns.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Nov 15, 2017 at 11:53:23AM +0100, Marc Gonzalez wrote: > On 14/11/2017 14:22, Måns Rullgård wrote: > > > Marc Gonzalez wrote: > > > >> On 14/11/2017 13:38, Måns Rullgård wrote: > >> > >>> Marc Gonzalez writes: > >>> > >>>> The "flow control enable" bit can be tweaked, even if DMA is enabled. > >>> > >>> No, it can't. Maybe on some of your newer chips it can, but not on the > >>> older ones. > >> > >> Again, I suppose you are referring to your SMP8642-based board. > >> > >> Are you saying you tested this patch, and it doesn't work? > >> What are the symptoms? > > > > I didn't test that patch per se. I did initially try simply changing > > that bit, and this didn't work. Either it had no effect, or it locked > > up the controller, I forget which. The manual explicitly states that > > writes are forbidden while the DMA enabled bit is set. > > > > If shutting down the DMA really isn't possible, I would rather just > > allow changing the flow control setting only when the interface is > > stopped. The normal case of getting the initial setting from the > > auto-negotiation will still work properly. It just won't be possible to > > change it while the link is active. > > Hello Mans, > > With the default setup, > > priv->pause_aneg = true; > priv->pause_rx = true; > priv->pause_tx = true; Hi Marc So you are assuming the peer supports pause? Is that a safe assumption to make? Should you not have it disabled until auto-neg has completed and you then know what the peer can do? > > even the very first call to nb8800_pause_config() occurs with netif_running=1 > as well as the RX DMA bit enabled. > > buildroot login: root > # ip addr add 172.27.64.23/18 brd 172.27.127.255 dev eth0 > # ip link set eth0 up > [ 25.771000] ENTER nb8800_pause_config: netif_running=1 > [ 25.776195] CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.9.54-1-rc6 #18 > [ 25.783102] Hardware name: Sigma Tango DT > [ 25.787138] Workqueue: events_power_efficient phy_state_machine > [ 25.793107] [<c010f290>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14) > [ 25.800896] [<c010af34>] (show_stack) from [<c02d06e8>] (dump_stack+0x88/0x9c) > [ 25.808160] [<c02d06e8>] (dump_stack) from [<c03967e4>] (nb8800_pause_config+0x28/0xf0) > [ 25.816208] [<c03967e4>] (nb8800_pause_config) from [<c03969e0>] (nb8800_link_reconfigure+0x134/0x148) > [ 25.825565] [<c03969e0>] (nb8800_link_reconfigure) from [<c0391d84>] (phy_state_machine+0x348/0x468) > [ 25.834750] [<c0391d84>] (phy_state_machine) from [<c012e858>] (process_one_work+0x1d8/0x3f0) > [ 25.843323] [<c012e858>] (process_one_work) from [<c012f6a4>] (worker_thread+0x4c/0x560) > [ 25.851459] [<c012f6a4>] (worker_thread) from [<c0134354>] (kthread+0xec/0xf4) > [ 25.858721] [<c0134354>] (kthread) from [<c01078f8>] (ret_from_fork+0x14/0x3c) > [ 25.874597] nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx > > > This makes sense, since nb8800_open() calls nb8800_start_rx() > before phy_start(). > > Moving nb8800_start_rx() to after nb8800_pause_config() does > appear to work, but I'm not sure it is the correct action? nb8800_link_reconfigure() can be called whenever the link to the peer changes. auto-neg may happen later because the cable was not plugged in until later, etc. Andrew
On 15/11/2017 15:17, Andrew Lunn wrote: > nb8800_link_reconfigure() can be called whenever the link to the peer > changes. auto-neg may happen later because the cable was not plugged > in until later, etc. Hello Andrew, AFAICS, Mans was right: trying to toggle the flow control bit when the RX DMA bit is already set "breaks" the HW (ping times out). Thus, I will drop patch 2/4. In our local branch, I have completely disabled flow control support, so I don't have to worry about this problem. Regards.
On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote: > On 15/11/2017 15:17, Andrew Lunn wrote: > > > nb8800_link_reconfigure() can be called whenever the link to the peer > > changes. auto-neg may happen later because the cable was not plugged > > in until later, etc. > > Hello Andrew, > > AFAICS, Mans was right: trying to toggle the flow control bit when > the RX DMA bit is already set "breaks" the HW (ping times out). > > Thus, I will drop patch 2/4. O.K. Thanks for testing this. > In our local branch, I have completely disabled flow control support, > so I don't have to worry about this problem. That is an interesting statement. You now know there is an issue here, your solution is to fix your private branch and leave mainline as is. Andrew
On 15/11/2017 16:03, Andrew Lunn wrote: > On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote: > >> On 15/11/2017 15:17, Andrew Lunn wrote: >> >> In our local branch, I have completely disabled flow control support, >> so I don't have to worry about this problem. > > That is an interesting statement. You now know there is an issue here, > your solution is to fix your private branch and leave mainline as is. All my patches are NACKed, what would you have me do? Moreover, mainline still has the nb8800_dma_stop() work-around, which Mans has never seen hang.
Marc Gonzalez <marc_gonzalez@sigmadesigns.com> writes: > On 15/11/2017 16:03, Andrew Lunn wrote: > >> On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote: >> >>> On 15/11/2017 15:17, Andrew Lunn wrote: >>> >>> In our local branch, I have completely disabled flow control support, >>> so I don't have to worry about this problem. >> >> That is an interesting statement. You now know there is an issue here, >> your solution is to fix your private branch and leave mainline as is. > > All my patches are NACKed, what would you have me do? > > Moreover, mainline still has the nb8800_dma_stop() work-around, > which Mans has never seen hang. Here's the thing, if that trick doesn't work, then the dma queue filling up from real traffic will also hang the controller, which is a much bigger problem. Your test today suggests that this might be the case.
On Wed, Nov 15, 2017 at 04:19:56PM +0100, Marc Gonzalez wrote: > On 15/11/2017 16:03, Andrew Lunn wrote: > > > On Wed, Nov 15, 2017 at 03:33:47PM +0100, Marc Gonzalez wrote: > > > >> On 15/11/2017 15:17, Andrew Lunn wrote: > >> > >> In our local branch, I have completely disabled flow control support, > >> so I don't have to worry about this problem. > > > > That is an interesting statement. You now know there is an issue here, > > your solution is to fix your private branch and leave mainline as is. > > All my patches are NACKed, what would you have me do? Hi Marc You need to consider your own maintenance burden. You want your local branch to be as near to mainline as possible. Each change you have means additional maintenance work for you. It also possibly means additional work for your customers. You seem to think flow control in your hardware is too broken to be usable. So you probably want to submit a patch to mainline disabling it. If it is accepted, that is one less patch you need to maintain. Andrew
diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c index da6e8bdbb77a..86081632346e 100644 --- a/drivers/net/ethernet/aurora/nb8800.c +++ b/drivers/net/ethernet/aurora/nb8800.c @@ -667,6 +667,7 @@ static void nb8800_link_reconfigure(struct net_device *dev) nb8800_mac_config(dev); nb8800_pause_config(dev); + nb8800_start_rx(dev); } if (phydev->link != priv->link) { @@ -918,7 +919,6 @@ static int nb8800_open(struct net_device *dev) napi_enable(&priv->napi); netif_start_queue(dev); - nb8800_start_rx(dev); phy_start(phydev); return 0;